* [PATCH 00/46] gcc-LTO support for the kernel
@ 2022-11-14 11:42 Jiri Slaby (SUSE)
  2022-11-14 11:42 ` [PATCH 01/46] x86/boot: robustify calling startup_{32,64}() from the decompressor code Jiri Slaby (SUSE)
                   ` (47 more replies)
  0 siblings, 48 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Slaby (SUSE),
	Alexander Potapenko, Alexander Shishkin, Alexei Starovoitov,
	Alexey Makhalov, Andrew Morton, Andrey Konovalov,
	Andrey Ryabinin, Andrii Nakryiko, Andy Lutomirski,
	Ard Biesheuvel, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Peter Zijlstra, Petr Mladek,
	Rafael J. Wysocki, Richard Biener, Sedat Dilek, Song Liu,
	Stanislav Fomichev, Stefano Stabellini, Steven Rostedt,
	Thomas Gleixner, Valentin Schneider, Vincent Guittot,
	Vincenzo Frascino, Viresh Kumar, VMware PV-Drivers Reviewers,
	Yonghong Song

Hi,

this is the first call for comments (and kbuild complaints) on this
support for gcc (full) LTO in the kernel. Most of the patches come from
Andi. Martin and I rebased them onto new kernels and fixed the issues
known to us. I also updated most of the commit logs and reordered the
patches into groups with similar intent.

The very first patch comes from Alexander and is pending on some x86
queue already (I believe). I am attaching it only for completeness.
Without that, the kernel does not boot (LTO reorders a lot).

In our measurements, the performance differences are negligible.

The kernel is bigger with gcc LTO due to more inlining. The next step
might be to play with non-static functions as we export everything, so
the compiler cannot actually drop anything (esp. inlined and no longer
needed functions).

Cc: Alexander Potapenko <glider@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Makhalov <amakhalov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Hao Luo <haoluo@google.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Hubicka <jh@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Martin Liska <mliska@suse.cz>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Richard Biener <RGuenther@suse.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
Cc: Yonghong Song <yhs@fb.com>

Alexander Lobakin (1):
  x86/boot: robustify calling startup_{32,64}() from the decompressor
    code

Andi Kleen (36):
  Compiler Attributes, lto: introduce __noreorder
  tracepoint, lto: Mark static call functions as __visible
  static_call, lto: Mark static keys as __visible
  static_call, lto: Mark static_call_return0() as __visible
  static_call, lto: Mark func_a() as __visible_on_lto
  x86/alternative, lto: Mark int3_*() as global and __visible
  x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto
  x86/preempt, lto: Mark preempt_schedule_*thunk() as __visible
  x86/xen, lto: Mark xen_vcpu_stolen() as __visible
  x86, lto: Mark gdt_page and native_sched_clock() as __visible
  amd, lto: Mark amd pmu and pstate functions as __visible_on_lto
  entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  export, lto: Mark __kstrtab* in EXPORT_SYMBOL() as global and
    __visible
  softirq, lto: Mark irq_enter/exit_rcu() as __visible
  btf, lto: Make all BTF IDs global on LTO
  init.h, lto: mark initcalls as __noreorder
  bpf, lto: mark interpreter jump table as __noreorder
  sched, lto: mark sched classes as __noreorder
  linkage, lto: use C version for SYSCALL_ALIAS() / cond_syscall()
  scripts, lto: re-add gcc-ld
  scripts, lto: use CONFIG_LTO for many LTO specific actions
  Kbuild, lto: Add Link Time Optimization support
  x86/purgatory, lto: Disable gcc LTO for purgatory
  x86/realmode, lto: Disable gcc LTO for real mode code
  x86/vdso, lto: Disable gcc LTO for the vdso
  scripts, lto: disable gcc LTO for some mod sources
  Kbuild, lto: disable gcc LTO for bounds+asm-offsets
  lib/string, lto: disable gcc LTO for string.o
  Compiler attributes, lto: disable __flatten with LTO
  Kbuild, lto: don't include weak source file symbols in System.map
  x86, lto: Disable relative init pointers with gcc LTO
  x86/livepatch, lto: Disable live patching with gcc LTO
  x86/lib, lto: Mark 32bit mem{cpy,move,set} as __used
  scripts, lto: check C symbols for modversions
  scripts/bloat-o-meter, lto: handle gcc LTO
  x86, lto: Finally enable gcc LTO for x86

Jiri Slaby (5):
  kbuild: pass jobserver to cmd_ld_vmlinux.o
  compiler.h: introduce __visible_on_lto
  compiler.h: introduce __global_on_lto
  btf, lto: pass scope as strings
  x86/apic, lto: Mark apic_driver*() as __noreorder

Martin Liska (4):
  kbuild: lto: preserve MAKEFLAGS for module linking
  x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto
  mm/kasan, lto: Mark kasan mem{cpy,move,set} as __used
  kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned

 Documentation/kbuild/index.rst      |  2 +
 Documentation/kbuild/lto-build.rst  | 76 +++++++++++++++++++++++++++++
 Kbuild                              |  3 ++
 Makefile                            |  6 ++-
 arch/Kconfig                        | 52 ++++++++++++++++++++
 arch/x86/Kconfig                    |  5 +-
 arch/x86/boot/compressed/head_32.S  |  2 +-
 arch/x86/boot/compressed/head_64.S  |  2 +-
 arch/x86/boot/compressed/misc.c     | 16 +++---
 arch/x86/entry/vdso/Makefile        |  2 +
 arch/x86/events/amd/core.c          |  2 +-
 arch/x86/include/asm/apic.h         |  4 +-
 arch/x86/include/asm/preempt.h      |  4 +-
 arch/x86/kernel/alternative.c       |  5 +-
 arch/x86/kernel/cpu/common.c        |  2 +-
 arch/x86/kernel/paravirt.c          |  2 +-
 arch/x86/kernel/sev-shared.c        |  2 +-
 arch/x86/kernel/tsc.c               |  2 +-
 arch/x86/lib/memcpy_32.c            |  6 +--
 arch/x86/purgatory/Makefile         |  2 +
 arch/x86/realmode/Makefile          |  1 +
 drivers/cpufreq/amd-pstate.c        | 15 +++---
 drivers/xen/time.c                  |  2 +-
 include/asm-generic/vmlinux.lds.h   |  2 +-
 include/linux/btf_ids.h             | 24 ++++-----
 include/linux/compiler.h            |  8 +++
 include/linux/compiler_attributes.h | 15 ++++++
 include/linux/export.h              |  6 ++-
 include/linux/init.h                |  2 +-
 include/linux/linkage.h             | 16 +++---
 include/linux/static_call.h         | 12 ++---
 include/linux/tracepoint.h          |  4 +-
 kernel/bpf/core.c                   |  2 +-
 kernel/entry/common.c               |  2 +-
 kernel/kallsyms.c                   |  2 +-
 kernel/livepatch/Kconfig            |  1 +
 kernel/sched/sched.h                |  1 +
 kernel/softirq.c                    |  4 +-
 kernel/static_call.c                |  2 +-
 kernel/static_call_inline.c         |  6 +--
 kernel/time/posix-stubs.c           | 19 +++++++-
 lib/Makefile                        |  2 +
 mm/kasan/generic.c                  |  2 +-
 mm/kasan/shadow.c                   |  6 +--
 scripts/Makefile.build              | 17 ++++---
 scripts/Makefile.lib                |  2 +-
 scripts/Makefile.lto                | 43 ++++++++++++++++
 scripts/Makefile.modfinal           |  2 +-
 scripts/Makefile.vmlinux            |  3 +-
 scripts/Makefile.vmlinux_o          |  6 +--
 scripts/bloat-o-meter               |  2 +-
 scripts/gcc-ld                      | 40 +++++++++++++++
 scripts/link-vmlinux.sh             |  9 ++--
 scripts/mksysmap                    |  2 +
 scripts/mod/Makefile                |  3 ++
 scripts/module.lds.S                |  2 +-
 56 files changed, 384 insertions(+), 100 deletions(-)
 create mode 100644 Documentation/kbuild/lto-build.rst
 create mode 100644 scripts/Makefile.lto
 create mode 100755 scripts/gcc-ld

-- 
2.38.1



* [PATCH 01/46] x86/boot: robustify calling startup_{32,64}() from the decompressor code
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
@ 2022-11-14 11:42 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 02/46] kbuild: pass jobserver to cmd_ld_vmlinux.o Jiri Slaby (SUSE)
                   ` (46 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alexander Lobakin, Jiri Slaby

From: Alexander Lobakin <alexandr.lobakin@intel.com>

After commit ce697ccee1a8 ("kbuild: remove head-y syntax"), I
started looking into whether x86 is ready to drop this old cruft.
Removing its objects from the list makes the kernel unbootable.
This applies only to bzImage; vmlinux still works correctly.
The reason is that without a strict object order (which is determined
by the linker arguments, not the linker script), startup_64 may not be
placed right at the beginning of the kernel.
Here's vmlinux.map's beginning before removing:

ffffffff81000000         vmlinux.o:(.head.text)
ffffffff81000000                 startup_64
ffffffff81000070                 secondary_startup_64
ffffffff81000075                 secondary_startup_64_no_verify
ffffffff81000160                 verify_cpu

and after:

ffffffff81000000         vmlinux.o:(.head.text)
ffffffff81000000                 pvh_start_xen
ffffffff81000080                 startup_64
ffffffff810000f0                 secondary_startup_64
ffffffff810000f5                 secondary_startup_64_no_verify

Not a problem in itself, but the self-extractor code hardcodes the
address of that function to the beginning of the kernel instead of
looking at the ELF header, which always contains the address of
startup_{32,64}().

So, instead of doing an "act of blind faith", just take the address
from the ELF header and extract a relative offset to the entry
point. The decompressor function already returns a pointer to the
beginning of the kernel to the Asm code, which then jumps to it,
so add that offset to the return value.
This doesn't change anything for now, but it allows x86 to drop the
"head object list" and ensures that valid Kbuild or other
improvements won't break anything here in general.
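
The offset computation described above can be sketched in userspace C.
This is only an illustration, not the decompressor code: it assumes a
Linux <elf.h>, a 64-bit image, and made-up addresses, and the names
entry_offset() and demo_entry_offset() are hypothetical.

```c
#include <elf.h>	/* Elf64_Ehdr, Elf64_Phdr (assumes a Linux libc) */
#include <stddef.h>
#include <string.h>

/*
 * Return the distance between the physical address of the first
 * program header and the ELF entry point. `image` is assumed to hold a
 * valid ELF64 header followed by at least one program header.
 */
size_t entry_offset(const unsigned char *image)
{
	Elf64_Ehdr ehdr;
	Elf64_Phdr first;

	memcpy(&ehdr, image, sizeof(ehdr));
	memcpy(&first, image + ehdr.e_phoff, sizeof(first));

	return (size_t)(ehdr.e_entry - first.p_paddr);
}

/* Build a minimal fake image with hypothetical addresses and parse it. */
size_t demo_entry_offset(void)
{
	unsigned char image[sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)];
	Elf64_Ehdr ehdr;
	Elf64_Phdr phdr;

	memset(&ehdr, 0, sizeof(ehdr));
	memset(&phdr, 0, sizeof(phdr));

	ehdr.e_phoff = sizeof(ehdr);	/* phdr table right after the header */
	ehdr.e_phnum = 1;
	ehdr.e_entry = 0x1000080;	/* entry point is not at the start */
	phdr.p_type = PT_LOAD;
	phdr.p_paddr = 0x1000000;	/* load address of the first segment */

	memcpy(image, &ehdr, sizeof(ehdr));
	memcpy(image + sizeof(ehdr), &phdr, sizeof(phdr));

	return entry_offset(image);
}
```

With these made-up values the entry point sits 0x80 bytes past the load
address, which is the kind of offset that gets added to the output
pointer before jumping to the kernel.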

Tested-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
---
 arch/x86/boot/compressed/head_32.S |  2 +-
 arch/x86/boot/compressed/head_64.S |  2 +-
 arch/x86/boot/compressed/misc.c    | 16 ++++++++++------
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 3b354eb9516d..56f9847e208b 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -187,7 +187,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
 	leal	boot_heap@GOTOFF(%ebx), %eax
 	pushl	%eax			/* heap area */
 	pushl	%esi			/* real mode pointer */
-	call	extract_kernel		/* returns kernel location in %eax */
+	call	extract_kernel		/* returns kernel entry point in %eax */
 	addl	$24, %esp
 
 /*
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index d33f060900d2..aeba5aa3d26c 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -593,7 +593,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
 	movl	input_len(%rip), %ecx	/* input_len */
 	movq	%rbp, %r8		/* output target address */
 	movl	output_len(%rip), %r9d	/* decompressed length, end of relocs */
-	call	extract_kernel		/* returns kernel location in %rax */
+	call	extract_kernel		/* returns kernel entry point in %rax */
 	popq	%rsi
 
 /*
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index cf690d8712f4..2548d7fb243e 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -277,7 +277,7 @@ static inline void handle_relocations(void *output, unsigned long output_len,
 { }
 #endif
 
-static void parse_elf(void *output)
+static size_t parse_elf(void *output)
 {
 #ifdef CONFIG_X86_64
 	Elf64_Ehdr ehdr;
@@ -287,16 +287,15 @@ static void parse_elf(void *output)
 	Elf32_Phdr *phdrs, *phdr;
 #endif
 	void *dest;
+	size_t off;
 	int i;
 
 	memcpy(&ehdr, output, sizeof(ehdr));
 	if (ehdr.e_ident[EI_MAG0] != ELFMAG0 ||
 	   ehdr.e_ident[EI_MAG1] != ELFMAG1 ||
 	   ehdr.e_ident[EI_MAG2] != ELFMAG2 ||
-	   ehdr.e_ident[EI_MAG3] != ELFMAG3) {
+	   ehdr.e_ident[EI_MAG3] != ELFMAG3)
 		error("Kernel is not a valid ELF file");
-		return;
-	}
 
 	debug_putstr("Parsing ELF... ");
 
@@ -305,6 +304,7 @@ static void parse_elf(void *output)
 		error("Failed to allocate space for phdrs");
 
 	memcpy(phdrs, output + ehdr.e_phoff, sizeof(*phdrs) * ehdr.e_phnum);
+	off = ehdr.e_entry - phdrs->p_paddr;
 
 	for (i = 0; i < ehdr.e_phnum; i++) {
 		phdr = &phdrs[i];
@@ -328,6 +328,8 @@ static void parse_elf(void *output)
 	}
 
 	free(phdrs);
+
+	return off;
 }
 
 /*
@@ -356,6 +358,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	const unsigned long kernel_total_size = VO__end - VO__text;
 	unsigned long virt_addr = LOAD_PHYSICAL_ADDR;
 	unsigned long needed_size;
+	size_t off;
 
 	/* Retain x86 boot parameters pointer passed from startup_32/64. */
 	boot_params = rmode;
@@ -456,14 +459,15 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	debug_putstr("\nDecompressing Linux... ");
 	__decompress(input_data, input_len, NULL, NULL, output, output_len,
 			NULL, error);
-	parse_elf(output);
+	off = parse_elf(output);
+	debug_putaddr(off);
 	handle_relocations(output, output_len, virt_addr);
 	debug_putstr("done.\nBooting the kernel.\n");
 
 	/* Disable exception handling before booting the kernel */
 	cleanup_exception_handling();
 
-	return output;
+	return output + off;
 }
 
 void fortify_panic(const char *name)
-- 
2.38.1



* [PATCH 02/46] kbuild: pass jobserver to cmd_ld_vmlinux.o
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
  2022-11-14 11:42 ` [PATCH 01/46] x86/boot: robustify calling startup_{32,64}() from the decompressor code Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 17:57   ` Masahiro Yamada
  2022-11-14 11:43 ` [PATCH 03/46] kbuild: lto: preserve MAKEFLAGS for module linking Jiri Slaby (SUSE)
                   ` (45 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Slaby, Sedat Dilek, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Martin Liska

From: Jiri Slaby <jslaby@suse.cz>

Until the link-vmlinux.sh split (cf. the commit below), the linker was
run with jobserver set in MAKEFLAGS. After the split, the command in
Makefile.vmlinux_o is not prefixed by "+" anymore, so this information
is lost.

Restore it, as linkers working in parallel (namely gcc LTO) make use of
it. In fact, they complain if the jobserver is not set:
  lto-wrapper: warning: jobserver is not available: '--jobserver-auth=' is not present in 'MAKEFLAGS'
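
For illustration, the condition the warning refers to can be
approximated by a small C check. This is a rough sketch based only on
the warning text, not lto-wrapper's actual code, and the function name
makeflags_has_jobserver() is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if a MAKEFLAGS-style string advertises a jobserver. */
bool makeflags_has_jobserver(const char *makeflags)
{
	return makeflags != NULL &&
	       strstr(makeflags, "--jobserver-auth=") != NULL;
}
```

A tool would typically call this as
makeflags_has_jobserver(getenv("MAKEFLAGS")) and fall back to serial
operation when it returns false.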

Fixes: 5d45950dfbb1 (kbuild: move vmlinux.o link to scripts/Makefile.vmlinux_o)
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 scripts/Makefile.vmlinux_o | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/Makefile.vmlinux_o b/scripts/Makefile.vmlinux_o
index 0edfdb40364b..1c86895cfcf8 100644
--- a/scripts/Makefile.vmlinux_o
+++ b/scripts/Makefile.vmlinux_o
@@ -58,7 +58,7 @@ define rule_ld_vmlinux.o
 endef
 
 vmlinux.o: $(initcalls-lds) vmlinux.a $(KBUILD_VMLINUX_LIBS) FORCE
-	$(call if_changed_rule,ld_vmlinux.o)
+	+$(call if_changed_rule,ld_vmlinux.o)
 
 targets += vmlinux.o
 
-- 
2.38.1



* [PATCH 03/46] kbuild: lto: preserve MAKEFLAGS for module linking
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
  2022-11-14 11:42 ` [PATCH 01/46] x86/boot: robustify calling startup_{32,64}() from the decompressor code Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 02/46] kbuild: pass jobserver to cmd_ld_vmlinux.o Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 18:02   ` Masahiro Yamada
  2022-11-14 11:43 ` [PATCH 04/46] compiler.h: introduce __visible_on_lto Jiri Slaby (SUSE)
                   ` (44 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Martin Liska, Sedat Dilek, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Jiri Slaby

From: Martin Liska <mliska@suse.cz>

Prefix the cc_o_c and ld_multi_m commands in the makefile with "+" in
order to preserve access to the jobserver. This is needed at least for
gcc LTO (enabled by later patches in this series). Note that both
commands can invoke the linker (ld_single_m in the former case).

Fixes this warning:
lto-wrapper: warning: jobserver is not available: ‘--jobserver-auth=’ is not present in ‘MAKEFLAGS’

Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Fixes: 5d45950dfbb1 (kbuild: move vmlinux.o link to scripts/Makefile.vmlinux_o)
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 scripts/Makefile.build | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 41f3602fc8de..564a20ce2667 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -247,7 +247,7 @@ endef
 
 # Built-in and composite module parts
 $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
-	$(call if_changed_rule,cc_o_c)
+	+$(call if_changed_rule,cc_o_c)
 	$(call cmd,force_checksrc)
 
 # To make this rule robust against "Argument list too long" error,
@@ -457,7 +457,7 @@ endef
 $(multi-obj-m): objtool-enabled := $(delay-objtool)
 $(multi-obj-m): part-of-module := y
 $(multi-obj-m): %.o: %.mod FORCE
-	$(call if_changed_rule,ld_multi_m)
+	+$(call if_changed_rule,ld_multi_m)
 $(call multi_depend, $(multi-obj-m), .o, -objs -y -m)
 
 # Add intermediate targets:
-- 
2.38.1



* [PATCH 04/46] compiler.h: introduce __visible_on_lto
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (2 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 03/46] kbuild: lto: preserve MAKEFLAGS for module linking Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 05/46] compiler.h: introduce __global_on_lto Jiri Slaby (SUSE)
                   ` (43 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jiri Slaby, kernel test robot, Martin Liska

From: Jiri Slaby <jslaby@suse.cz>

__visible_on_lto is defined as "__visible" when gcc LTO is turned on
(see later patches), and "static" otherwise. It is needed for top-level
symbols which are referenced in assembly, because with gcc LTO the
assembly and the symbol can each end up in a different file, which
leads to linker errors.

So the symbols have to be visible when gcc LTO is in charge. Conversely,
they have to be static on non-gcc-LTO builds; otherwise a warning about
a missing declaration occurs.
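
A standalone sketch of how the macro is meant to be used follows. The
__visible stand-in mirrors the kernel's externally_visible wrapper, but
the fallback for compilers without the attribute and the lto_demo_*()
names are assumptions made for this example.

```c
/* Fallback so the sketch builds on compilers without __has_attribute. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Stand-in for the kernel's __visible (assumption for this sketch). */
#if __has_attribute(__externally_visible__)
# define __visible	__attribute__((__externally_visible__))
#else
# define __visible
#endif

#ifdef CONFIG_LTO_GCC
# define __visible_on_lto	__visible
#else
# define __visible_on_lto	static
#endif

/*
 * Hypothetical function that would be referenced from assembly in the
 * kernel: global and visible with gcc LTO, file-local otherwise.
 */
__visible_on_lto int lto_demo_target(void)
{
	return 42;
}

/* A C caller keeps the static variant from being flagged as unused. */
int lto_demo_call(void)
{
	return lto_demo_target();
}
```

Building with -DCONFIG_LTO_GCC switches the definition from static to
externally visible without touching the function body.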

Reported-by: kernel test robot <lkp@intel.com>
Cc: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/compiler.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 973a1bfd7ef5..2305a3cbe99c 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -133,6 +133,12 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #define __annotate_jump_table
 #endif /* CONFIG_OBJTOOL */
 
+#ifdef CONFIG_LTO_GCC
+# define __visible_on_lto		__visible
+#else
+# define __visible_on_lto		static
+#endif
+
 #ifndef unreachable
 # define unreachable() do {		\
 	annotate_unreachable();		\
-- 
2.38.1



* [PATCH 05/46] compiler.h: introduce __global_on_lto
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (3 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 04/46] compiler.h: introduce __visible_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 06/46] Compiler Attributes, lto: introduce __noreorder Jiri Slaby (SUSE)
                   ` (42 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jiri Slaby, Martin Liska

From: Jiri Slaby <jslaby@suse.cz>

__global_on_lto is defined as "globl" when gcc LTO is turned on (see
later patches), and "local" otherwise. It is needed for top-level
symbols which are referenced in assembly, because with gcc LTO the
assembly and the symbol can each end up in a different file, which
leads to linker errors.

So the symbols have to be global when gcc LTO is in charge. Conversely,
they can remain local on non-gcc-LTO builds.
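
Because the macro expands to a string literal, it can be pasted into an
assembler directive through literal concatenation. The sketch below
shows only that mechanism; SCOPE_DIRECTIVE() and demo_scope_directive()
are hypothetical helpers, not part of this series.

```c
#ifdef CONFIG_LTO_GCC
# define __global_on_lto	"globl"
#else
# define __global_on_lto	"local"
#endif

/* Hypothetical helper: build the scope directive for symbol `sym`. */
#define SCOPE_DIRECTIVE(sym)	"." __global_on_lto " " sym

const char *demo_scope_directive(void)
{
	/* Adjacent string literals are concatenated by the compiler. */
	return SCOPE_DIRECTIVE("demo_symbol");
}
```

In a build without CONFIG_LTO_GCC this yields ".local demo_symbol"; with
it defined, ".globl demo_symbol".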

Cc: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/compiler.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 2305a3cbe99c..16e4c1de14c4 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -135,8 +135,10 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 
 #ifdef CONFIG_LTO_GCC
 # define __visible_on_lto		__visible
+# define __global_on_lto		"globl"
 #else
 # define __visible_on_lto		static
+# define __global_on_lto		"local"
 #endif
 
 #ifndef unreachable
-- 
2.38.1



* [PATCH 06/46] Compiler Attributes, lto: introduce __noreorder
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (4 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 05/46] compiler.h: introduce __global_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 07/46] tracepoint, lto: Mark static call functions as __visible Jiri Slaby (SUSE)
                   ` (41 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

gcc 5 has a new no_reorder attribute that prevents top level reordering
only for that symbol. So add a new __noreorder wrapper for the
no_reorder attribute. This will be used in the next patches to support
gcc LTO.
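
As a rough usage sketch (not taken from the later patches), the wrapper
is simply placed on the definitions whose relative order must be kept.
The demo_* names are hypothetical, and on compilers without the
attribute the macro expands to nothing.

```c
/* Fallback so the sketch builds on compilers without __has_attribute. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute(__no_reorder__)
# define __noreorder	__attribute__((__no_reorder__))
#else
# define __noreorder
#endif

/* Hypothetical top-level objects that must keep their source order. */
__noreorder int demo_first = 1;
__noreorder int demo_second = 2;

int demo_sum(void)
{
	return demo_first + demo_second;
}
```

The attribute does not change how the symbols are used from C; it only
constrains how the compiler orders them relative to each other.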

[js] split this to introduction & use

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/compiler_attributes.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 898b3458b24a..be6c71fd5ebb 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -379,4 +379,14 @@
  */
 #define __fix_address noinline __noclone
 
+/*
+ * https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
+ */
+
+#if __has_attribute(__no_reorder__)
+#define __noreorder			__attribute__((no_reorder))
+#else
+#define __noreorder
+#endif
+
 #endif /* __LINUX_COMPILER_ATTRIBUTES_H */
-- 
2.38.1



* [PATCH 07/46] tracepoint, lto: Mark static call functions as __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (5 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 06/46] Compiler Attributes, lto: introduce __noreorder Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 08/46] static_call, lto: Mark static keys " Jiri Slaby (SUSE)
                   ` (40 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen, Steven Rostedt, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark static call functions as __visible, namely tracepoints here.

[js] unify the __visible placement

Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/tracepoint.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 4b33b95eb8be..1ce0655f0c9c 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -239,7 +239,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
  * poking RCU a bit.
  */
 #define __DECLARE_TRACE(name, proto, args, cond, data_proto)		\
-	extern int __traceiter_##name(data_proto);			\
+	extern __visible int __traceiter_##name(data_proto);		\
 	DECLARE_STATIC_CALL(tp_func_##name, __traceiter_##name);	\
 	extern struct tracepoint __tracepoint_##name;			\
 	static inline void trace_##name(proto)				\
@@ -306,7 +306,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		.unregfunc = _unreg,					\
 		.funcs = NULL };					\
 	__TRACEPOINT_ENTRY(_name);					\
-	int __traceiter_##_name(void *__data, proto)			\
+	__visible int __traceiter_##_name(void *__data, proto)		\
 	{								\
 		struct tracepoint_func *it_func_ptr;			\
 		void *it_func;						\
-- 
2.38.1



* [PATCH 08/46] static_call, lto: Mark static keys as __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (6 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 07/46] tracepoint, lto: Mark static call functions as __visible Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 15:51   ` Peter Zijlstra
  2022-11-14 18:57   ` Josh Poimboeuf
  2022-11-14 11:43 ` [PATCH 09/46] static_call, lto: Mark static_call_return0() " Jiri Slaby (SUSE)
                   ` (39 subsequent siblings)
  47 siblings, 2 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Peter Zijlstra, Josh Poimboeuf, Jason Baron,
	Steven Rostedt, Ard Biesheuvel, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark static call functions as __visible, namely static keys here.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/static_call.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index df53bed9d71f..e629ab0c4ca3 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -182,7 +182,7 @@ extern long __static_call_return0(void);
 
 #define DEFINE_STATIC_CALL(name, _func)					\
 	DECLARE_STATIC_CALL(name, _func);				\
-	struct static_call_key STATIC_CALL_KEY(name) = {		\
+	__visible struct static_call_key STATIC_CALL_KEY(name) = {		\
 		.func = _func,						\
 		.type = 1,						\
 	};								\
@@ -190,7 +190,7 @@ extern long __static_call_return0(void);
 
 #define DEFINE_STATIC_CALL_NULL(name, _func)				\
 	DECLARE_STATIC_CALL(name, _func);				\
-	struct static_call_key STATIC_CALL_KEY(name) = {		\
+	__visible struct static_call_key STATIC_CALL_KEY(name) = {	\
 		.func = NULL,						\
 		.type = 1,						\
 	};								\
@@ -198,7 +198,7 @@ extern long __static_call_return0(void);
 
 #define DEFINE_STATIC_CALL_RET0(name, _func)				\
 	DECLARE_STATIC_CALL(name, _func);				\
-	struct static_call_key STATIC_CALL_KEY(name) = {		\
+	__visible struct static_call_key STATIC_CALL_KEY(name) = {		\
 		.func = __static_call_return0,				\
 		.type = 1,						\
 	};								\
@@ -227,14 +227,14 @@ static inline int static_call_init(void) { return 0; }
 
 #define DEFINE_STATIC_CALL(name, _func)					\
 	DECLARE_STATIC_CALL(name, _func);				\
-	struct static_call_key STATIC_CALL_KEY(name) = {		\
+	__visible struct static_call_key STATIC_CALL_KEY(name) = {		\
 		.func = _func,						\
 	};								\
 	ARCH_DEFINE_STATIC_CALL_TRAMP(name, _func)
 
 #define DEFINE_STATIC_CALL_NULL(name, _func)				\
 	DECLARE_STATIC_CALL(name, _func);				\
-	struct static_call_key STATIC_CALL_KEY(name) = {		\
+	__visible struct static_call_key STATIC_CALL_KEY(name) = {	\
 		.func = NULL,						\
 	};								\
 	ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)
@@ -288,7 +288,7 @@ static inline long __static_call_return0(void)
 
 #define __DEFINE_STATIC_CALL(name, _func, _func_init)			\
 	DECLARE_STATIC_CALL(name, _func);				\
-	struct static_call_key STATIC_CALL_KEY(name) = {		\
+	__visible struct static_call_key STATIC_CALL_KEY(name) = {	\
 		.func = _func_init,					\
 	}
 
-- 
2.38.1



* [PATCH 09/46] static_call, lto: Mark static_call_return0() as __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (7 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 08/46] static_call, lto: Mark static keys " Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 10/46] static_call, lto: Mark func_a() as __visible_on_lto Jiri Slaby (SUSE)
                   ` (38 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Peter Zijlstra, Josh Poimboeuf, Jason Baron,
	Steven Rostedt, Ard Biesheuvel, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark static_call_return0() as __visible.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/static_call.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/static_call.c b/kernel/static_call.c
index e9c3e69f3837..9197fe86d8bd 100644
--- a/kernel/static_call.c
+++ b/kernel/static_call.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/static_call.h>
 
-long __static_call_return0(void)
+__visible long __static_call_return0(void)
 {
 	return 0;
 }
-- 
2.38.1



* [PATCH 10/46] static_call, lto: Mark func_a() as __visible_on_lto
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (8 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 09/46] static_call, lto: Mark static_call_return0() " Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 15:54   ` Peter Zijlstra
  2022-11-14 11:43 ` [PATCH 11/46] x86/alternative, lto: Mark int3_*() as global and __visible Jiri Slaby (SUSE)
                   ` (37 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Peter Zijlstra, Josh Poimboeuf, Jason Baron,
	Steven Rostedt, Ard Biesheuvel, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark func_a() as __visible_on_lto, as it was static, and rename it to
sc_func_a().

[js] use __visible_on_lto

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
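Note: a minimal userspace sketch of what __visible_on_lto is meant to do
here. It assumes the macro expands to __visible under gcc LTO and to
static otherwise; the real definition is introduced elsewhere in this
series and is not part of this mail. DEMO_LTO_GCC, demo_cases[] and
demo_run_cases() are hypothetical names standing in for the config
option and the selftest loop.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the kernel's __visible: keep the symbol even without a C user. */
#if defined(__has_attribute)
# if __has_attribute(__externally_visible__)
#  define __visible __attribute__((__externally_visible__))
# endif
#endif
#ifndef __visible
# define __visible
#endif

/* Assumption: global and visible with LTO, plain static otherwise. */
#ifdef DEMO_LTO_GCC	/* hypothetical stand-in for the LTO config option */
# define __visible_on_lto __visible
#else
# define __visible_on_lto static
#endif

__visible_on_lto int sc_func_a(int x)
{
	return x + 1;
}

static int func_b(int x)
{
	return x + 2;
}

/* Same shape as static_call_data[] in the selftest, minus the NULL entry. */
static const struct demo_case {
	int (*func)(int);
	int val;
	int expect;
} demo_cases[] = {
	{ func_b,    2, 4 },
	{ sc_func_a, 2, 3 },
};

/* Returns the number of table entries whose result differs from .expect. */
int demo_run_cases(void)
{
	size_t i;
	int failed = 0;

	for (i = 0; i < sizeof(demo_cases) / sizeof(demo_cases[0]); i++) {
		assert(demo_cases[i].func != NULL);
		if (demo_cases[i].func(demo_cases[i].val) != demo_cases[i].expect)
			failed++;
	}

	return failed;
}
```

Building with -DDEMO_LTO_GCC turns sc_func_a() into a global symbol,
which is why the generic name func_a() would no longer be a safe choice.
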
 kernel/static_call_inline.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/static_call_inline.c b/kernel/static_call_inline.c
index dc5665b62814..6933b4437597 100644
--- a/kernel/static_call_inline.c
+++ b/kernel/static_call_inline.c
@@ -501,7 +501,7 @@ early_initcall(static_call_init);
 
 #ifdef CONFIG_STATIC_CALL_SELFTEST
 
-static int func_a(int x)
+__visible_on_lto int sc_func_a(int x)
 {
 	return x+1;
 }
@@ -511,7 +511,7 @@ static int func_b(int x)
 	return x+2;
 }
 
-DEFINE_STATIC_CALL(sc_selftest, func_a);
+DEFINE_STATIC_CALL(sc_selftest, sc_func_a);
 
 static struct static_call_data {
       int (*func)(int);
@@ -520,7 +520,7 @@ static struct static_call_data {
 } static_call_data [] __initdata = {
       { NULL,   2, 3 },
       { func_b, 2, 4 },
-      { func_a, 2, 3 }
+      { sc_func_a, 2, 3 }
 };
 
 static int __init test_static_call_init(void)
-- 
2.38.1



* [PATCH 11/46] x86/alternative, lto: Mark int3_*() as global and __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (9 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 10/46] static_call, lto: Mark func_a() as __visible_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 12/46] x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto Jiri Slaby (SUSE)
                   ` (36 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark int3_magic() and int3_selftest_ip() as global and __visible.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/alternative.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 5cadcea035e0..05e5eb9cbd51 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -823,11 +823,12 @@ extern struct paravirt_patch_site __start_parainstructions[],
  * convention such that we can 'call' it from assembly.
  */
 
-extern void int3_magic(unsigned int *ptr); /* defined in asm */
+extern __visible void int3_magic(unsigned int *ptr); /* defined in asm */
 
 asm (
 "	.pushsection	.init.text, \"ax\", @progbits\n"
 "	.type		int3_magic, @function\n"
+"	.globl		int3_magic\n"
 "int3_magic:\n"
 	ANNOTATE_NOENDBR
 "	movl	$1, (%" _ASM_ARG1 ")\n"
@@ -836,7 +837,7 @@ asm (
 "	.popsection\n"
 );
 
-extern void int3_selftest_ip(void); /* defined in asm below */
+extern __visible void int3_selftest_ip(void); /* defined in asm below */
 
 static int __init
 int3_exception_notify(struct notifier_block *self, unsigned long val, void *data)
-- 
2.38.1



* [PATCH 12/46] x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (10 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 11/46] x86/alternative, lto: Mark int3_*() as global and __visible Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 15:58   ` Peter Zijlstra
  2022-11-14 11:43 ` [PATCH 13/46] x86/preempt, lto: Mark preempt_schedule_*thunk() as __visible Jiri Slaby (SUSE)
                   ` (35 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Juergen Gross, Alexey Makhalov,
	VMware PV-Drivers Reviewers, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86, Martin Liska,
	Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark native_steal_clock() as __visible_on_lto.

[js] use __visible_on_lto

Cc: Juergen Gross <jgross@suse.com>
Cc: "Srivatsa S. Bhat
Cc: Alexey Makhalov <amakhalov@vmware.com>
Cc: VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/paravirt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 7ca2d46c08cc..27a537cd4b0e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -120,7 +120,7 @@ unsigned int paravirt_patch(u8 type, void *insn_buff, unsigned long addr,
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
 
-static u64 native_steal_clock(int cpu)
+__visible_on_lto u64 native_steal_clock(int cpu)
 {
 	return 0;
 }
-- 
2.38.1



* [PATCH 13/46] x86/preempt, lto: Mark preempt_schedule_*thunk() as __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (11 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 12/46] x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 14/46] x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto Jiri Slaby (SUSE)
                   ` (34 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark preempt_schedule_thunk() and preempt_schedule_notrace_thunk() as
__visible.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/include/asm/preempt.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index 5f6daea1ee24..c76ec881b23c 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -106,13 +106,13 @@ static __always_inline bool should_resched(int preempt_offset)
 #ifdef CONFIG_PREEMPTION
 
 extern asmlinkage void preempt_schedule(void);
-extern asmlinkage void preempt_schedule_thunk(void);
+extern __visible asmlinkage void preempt_schedule_thunk(void);
 
 #define preempt_schedule_dynamic_enabled	preempt_schedule_thunk
 #define preempt_schedule_dynamic_disabled	NULL
 
 extern asmlinkage void preempt_schedule_notrace(void);
-extern asmlinkage void preempt_schedule_notrace_thunk(void);
+extern __visible asmlinkage void preempt_schedule_notrace_thunk(void);
 
 #define preempt_schedule_notrace_dynamic_enabled	preempt_schedule_notrace_thunk
 #define preempt_schedule_notrace_dynamic_disabled	NULL
-- 
2.38.1



* [PATCH 14/46] x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (12 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 13/46] x86/preempt, lto: Mark preempt_schedule_*thunk() as __visible Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 16:02   ` Peter Zijlstra
  2022-11-14 11:43 ` [PATCH 15/46] x86/xen, lto: Mark xen_vcpu_stolen() as __visible Jiri Slaby (SUSE)
                   ` (33 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Martin Liska, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Jiri Slaby

From: Martin Liska <mliska@suse.cz>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark cpuid_table_copy as __visible_on_lto.

[js] use __visible_on_lto

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/sev-shared.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 3a5b0c9c4fcc..554da8aabfc7 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -64,7 +64,7 @@ struct snp_cpuid_table {
 static u16 ghcb_version __ro_after_init;
 
 /* Copy of the SNP firmware's CPUID page. */
-static struct snp_cpuid_table cpuid_table_copy __ro_after_init;
+__visible_on_lto struct snp_cpuid_table cpuid_table_copy __ro_after_init;
 
 /*
  * These will be initialized based on CPUID table so that non-present
-- 
2.38.1



* [PATCH 15/46] x86/xen, lto: Mark xen_vcpu_stolen() as __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (13 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 14/46] x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 16/46] x86, lto: Mark gdt_page and native_sched_clock() " Jiri Slaby (SUSE)
                   ` (32 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, xen-devel, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark xen_vcpu_stolen() as __visible.

Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: <xen-devel@lists.xenproject.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/xen/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index 152dd33bb223..006a04592c8f 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -145,7 +145,7 @@ void xen_get_runstate_snapshot(struct vcpu_runstate_info *res)
 }
 
 /* return true when a vcpu could run but has no real cpu to run on */
-bool xen_vcpu_stolen(int vcpu)
+__visible bool xen_vcpu_stolen(int vcpu)
 {
 	return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable;
 }
-- 
2.38.1



* [PATCH 16/46] x86, lto: Mark gdt_page and native_sched_clock() as __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (14 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 15/46] x86/xen, lto: Mark xen_vcpu_stolen() as __visible Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 17/46] amd, lto: Mark amd pmu and pstate functions as __visible_on_lto Jiri Slaby (SUSE)
                   ` (31 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark gdt_page and native_sched_clock() as __visible.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <x86@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/cpu/common.c | 2 +-
 arch/x86/kernel/tsc.c        | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e508f239098..5417a8fd7a45 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -201,7 +201,7 @@ static const struct cpu_dev default_cpu = {
 
 static const struct cpu_dev *this_cpu = &default_cpu;
 
-DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
+__visible DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
 #ifdef CONFIG_X86_64
 	/*
 	 * We need valid kernel segments for data and code in long mode too
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index cafacb2e58cc..df1589482662 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -215,7 +215,7 @@ static void __init cyc2ns_init_secondary_cpus(void)
 /*
  * Scheduler clock - returns current time in nanosec units.
  */
-u64 native_sched_clock(void)
+__visible u64 native_sched_clock(void)
 {
 	if (static_branch_likely(&__use_tsc)) {
 		u64 tsc_now = rdtsc();
-- 
2.38.1



* [PATCH 17/46] amd, lto: Mark amd pmu and pstate functions as __visible_on_lto
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (15 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 16/46] x86, lto: Mark gdt_page and native_sched_clock() " Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible Jiri Slaby (SUSE)
                   ` (30 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Huang Rui, Rafael J. Wysocki,
	Viresh Kumar, x86, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark amd_pmu_test_overflow_topbit() and all amd pstate functions as
__visible_on_lto.

Also the pstate ones have to be renamed so that they are unique.

[ml] fix amd_pmu_test_overflow_topbit() too
[js] use __visible_on_lto

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/events/amd/core.c   |  2 +-
 drivers/cpufreq/amd-pstate.c | 15 ++++++++-------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 8b70237c33f7..9dfdfd85b493 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -643,7 +643,7 @@ static inline void amd_pmu_ack_global_status(u64 status)
 	wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, status);
 }
 
-static bool amd_pmu_test_overflow_topbit(int idx)
+__visible_on_lto bool amd_pmu_test_overflow_topbit(int idx)
 {
 	u64 counter;
 
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ace7d50cf2ac..d0b67a60191d 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -66,7 +66,7 @@ MODULE_PARM_DESC(shared_mem,
 
 static struct cpufreq_driver amd_pstate_driver;
 
-static inline int pstate_enable(bool enable)
+__visible_on_lto int do_amd_pstate_enable(bool enable)
 {
 	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
 }
@@ -84,14 +84,14 @@ static int cppc_enable(bool enable)
 	return ret;
 }
 
-DEFINE_STATIC_CALL(amd_pstate_enable, pstate_enable);
+DEFINE_STATIC_CALL(amd_pstate_enable, do_amd_pstate_enable);
 
 static inline int amd_pstate_enable(bool enable)
 {
 	return static_call(amd_pstate_enable)(enable);
 }
 
-static int pstate_init_perf(struct amd_cpudata *cpudata)
+__visible_on_lto int do_amd_pstate_init_perf(struct amd_cpudata *cpudata)
 {
 	u64 cap1;
 	u32 highest_perf;
@@ -142,15 +142,16 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
 	return 0;
 }
 
-DEFINE_STATIC_CALL(amd_pstate_init_perf, pstate_init_perf);
+DEFINE_STATIC_CALL(amd_pstate_init_perf, do_amd_pstate_init_perf);
 
 static inline int amd_pstate_init_perf(struct amd_cpudata *cpudata)
 {
 	return static_call(amd_pstate_init_perf)(cpudata);
 }
 
-static void pstate_update_perf(struct amd_cpudata *cpudata, u32 min_perf,
-			       u32 des_perf, u32 max_perf, bool fast_switch)
+__visible_on_lto void do_amd_pstate_update_perf(struct amd_cpudata *cpudata,
+			       u32 min_perf, u32 des_perf, u32 max_perf,
+			       bool fast_switch)
 {
 	if (fast_switch)
 		wrmsrl(MSR_AMD_CPPC_REQ, READ_ONCE(cpudata->cppc_req_cached));
@@ -172,7 +173,7 @@ static void cppc_update_perf(struct amd_cpudata *cpudata,
 	cppc_set_perf(cpudata->cpu, &perf_ctrls);
 }
 
-DEFINE_STATIC_CALL(amd_pstate_update_perf, pstate_update_perf);
+DEFINE_STATIC_CALL(amd_pstate_update_perf, do_amd_pstate_update_perf);
 
 static inline void amd_pstate_update_perf(struct amd_cpudata *cpudata,
 					  u32 min_perf, u32 des_perf,
-- 
2.38.1



* [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (16 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 17/46] amd, lto: Mark amd pmu and pstate functions as __visible_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-16 23:30   ` Thomas Gleixner
  2022-11-14 11:43 ` [PATCH 19/46] export, lto: Mark __kstrtab* in EXPORT_SYMBOL() as global and __visible Jiri Slaby (SUSE)
                   ` (29 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Peter Zijlstra, Andy Lutomirski,
	Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark raw_irqentry_exit_cond_resched() as __visible.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/entry/common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 846add8394c4..13c1a7a0e8ce 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -378,7 +378,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
 	return ret;
 }
 
-void raw_irqentry_exit_cond_resched(void)
+__visible void raw_irqentry_exit_cond_resched(void)
 {
 	if (!preempt_count()) {
 		/* Sanity check RCU and thread stack */
-- 
2.38.1



* [PATCH 19/46] export, lto: Mark __kstrtab* in EXPORT_SYMBOL() as global and __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (17 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 20/46] softirq, lto: Mark irq_enter/exit_rcu() as __visible Jiri Slaby (SUSE)
                   ` (28 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark __kstrtab_*[] and __kstrtabns_*[] symbols as global and
__visible.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
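Note: a minimal userspace sketch of the pattern this patch touches: a
string emitted by a top-level asm statement and referenced from C
through an extern declaration. It assumes an ELF target and a
GNU-compatible assembler. The demo_* names are hypothetical, .rodata
stands in for __ksymtab_strings, and the __visible annotation on the
declaration is left out to keep the sketch short.

```c
#include <assert.h>
#include <string.h>

/* C only sees a declaration; the definition lives in the asm below. */
extern const char demo_kstrtab_foo[];

__asm__(
"	.pushsection .rodata\n"
"	.globl demo_kstrtab_foo\n"	/* make the label a global symbol */
"demo_kstrtab_foo:\n"
"	.asciz \"foo\"\n"
"	.popsection\n");

/* Reads the asm-defined string through the C declaration. */
size_t demo_kstrtab_len(void)
{
	assert(demo_kstrtab_foo[0] != '\0');
	return strlen(demo_kstrtab_foo);
}
```

In a single non-LTO translation unit the reference resolves with or
without .globl, because the asm and its C user share one object file.
The directive starts to matter once the two can land in different
objects.
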
 include/linux/export.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/export.h b/include/linux/export.h
index 3f31ced0d977..3cb5f85327da 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -85,11 +85,13 @@ struct kernel_symbol {
  */
 #define ___EXPORT_SYMBOL(sym, sec, ns)						\
 	extern typeof(sym) sym;							\
-	extern const char __kstrtab_##sym[];					\
-	extern const char __kstrtabns_##sym[];					\
+	extern const char __visible __kstrtab_##sym[];				\
+	extern const char __visible __kstrtabns_##sym[];			\
 	asm("	.section \"__ksymtab_strings\",\"aMS\",%progbits,1	\n"	\
+	    "	.globl __kstrtab_" #sym "				\n"	\
 	    "__kstrtab_" #sym ":					\n"	\
 	    "	.asciz 	\"" #sym "\"					\n"	\
+	    "	.globl __kstrtabns_" #sym "				\n"	\
 	    "__kstrtabns_" #sym ":					\n"	\
 	    "	.asciz 	\"" ns "\"					\n"	\
 	    "	.previous						\n");	\
-- 
2.38.1



* [PATCH 20/46] softirq, lto: Mark irq_enter/exit_rcu() as __visible
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (18 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 19/46] export, lto: Mark __kstrtab* in EXPORT_SYMBOL() as global and __visible Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 21/46] btf, lto: pass scope as strings Jiri Slaby (SUSE)
                   ` (27 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Symbols referenced from assembler (either directly or e.g. from
DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
they could end up in a different object file than the assembler. This
can lead to linker errors without this patch.

So mark irq_enter_rcu() and irq_exit_rcu() as __visible.

Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/softirq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..9d62e09c9581 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -604,7 +604,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 /**
  * irq_enter_rcu - Enter an interrupt context with RCU watching
  */
-void irq_enter_rcu(void)
+__visible void irq_enter_rcu(void)
 {
 	__irq_enter_raw();
 
@@ -657,7 +657,7 @@ static inline void __irq_exit_rcu(void)
  *
  * Also processes softirqs if needed and possible.
  */
-void irq_exit_rcu(void)
+__visible void irq_exit_rcu(void)
 {
 	__irq_exit_rcu();
 	 /* must be last! */
-- 
2.38.1



* [PATCH 21/46] btf, lto: pass scope as strings
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (19 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 20/46] softirq, lto: Mark irq_enter/exit_rcu() as __visible Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 22/46] btf, lto: Make all BTF IDs global on LTO Jiri Slaby (SUSE)
                   ` (26 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Slaby, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Martin Liska

From: Jiri Slaby <jslaby@suse.cz>

gcc LTO can put assembler top level statements into other assembler
files. The BTF IDs assumed that they are in the same file. We need to
make all BTF IDs global to work around this.

This is a preparation for that, as we will pass __global_on_lto as
scope. That is a macro that expands to either "globl" or "local",
depending on whether LTO is enabled.

That wouldn't work without this patch, as we currently stringify scope
with the # operator, which does not expand a macro passed as argument.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: bpf@vger.kernel.org
Cc: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/btf_ids.h | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index 2aea877d644f..3011757a48ef 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -83,16 +83,16 @@ word							\
 #define __BTF_ID_LIST(name, scope)			\
 asm(							\
 ".pushsection " BTF_IDS_SECTION ",\"a\";       \n"	\
-"." #scope " " #name ";                        \n"	\
+"." scope " " #name ";                         \n"	\
 #name ":;                                      \n"	\
 ".popsection;                                  \n");
 
 #define BTF_ID_LIST(name)				\
-__BTF_ID_LIST(name, local)				\
+__BTF_ID_LIST(name, "local")				\
 extern u32 name[];
 
 #define BTF_ID_LIST_GLOBAL(name, n)			\
-__BTF_ID_LIST(name, globl)
+__BTF_ID_LIST(name, "globl")
 
 /* The BTF_ID_LIST_SINGLE macro defines a BTF_ID_LIST with
  * a single entry.
@@ -142,18 +142,18 @@ asm(							\
 #define __BTF_SET_START(name, scope)			\
 asm(							\
 ".pushsection " BTF_IDS_SECTION ",\"a\";       \n"	\
-"." #scope " __BTF_ID__set__" #name ";         \n"	\
+"." scope " __BTF_ID__set__" #name ";          \n"	\
 "__BTF_ID__set__" #name ":;                    \n"	\
 ".zero 4                                       \n"	\
 ".popsection;                                  \n");
 
 #define BTF_SET_START(name)				\
-__BTF_ID_LIST(name, local)				\
-__BTF_SET_START(name, local)
+__BTF_ID_LIST(name, "local")				\
+__BTF_SET_START(name, "local")
 
 #define BTF_SET_START_GLOBAL(name)			\
-__BTF_ID_LIST(name, globl)				\
-__BTF_SET_START(name, globl)
+__BTF_ID_LIST(name, "globl")				\
+__BTF_SET_START(name, "globl")
 
 #define BTF_SET_END(name)				\
 asm(							\
@@ -186,14 +186,14 @@ extern struct btf_id_set name;
 #define __BTF_SET8_START(name, scope)			\
 asm(							\
 ".pushsection " BTF_IDS_SECTION ",\"a\";       \n"	\
-"." #scope " __BTF_ID__set8__" #name ";        \n"	\
+"." scope " __BTF_ID__set8__" #name ";         \n"	\
 "__BTF_ID__set8__" #name ":;                   \n"	\
 ".zero 8                                       \n"	\
 ".popsection;                                  \n");
 
 #define BTF_SET8_START(name)				\
-__BTF_ID_LIST(name, local)				\
-__BTF_SET8_START(name, local)
+__BTF_ID_LIST(name, "local")				\
+__BTF_SET8_START(name, "local")
 
 #define BTF_SET8_END(name)				\
 asm(							\
-- 
2.38.1



* [PATCH 22/46] btf, lto: Make all BTF IDs global on LTO
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (20 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 21/46] btf, lto: pass scope as strings Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 23/46] init.h, lto: mark initcalls as __noreorder Jiri Slaby (SUSE)
                   ` (25 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

gcc LTO can put assembler top level statements into other assembler
files. The BTF IDs assumed that they are in the same file. So if we are
building with gcc LTO, make all BTF IDs global to work around this.

This is done by the new __global_on_lto macro.

[js] do that for 8B BTF set too (commit ab21d6063c01)
[js] do global only in LTO case

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/btf_ids.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index 3011757a48ef..a2bef302e42c 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -37,7 +37,7 @@ struct btf_id_set8 {
 #define ____BTF_ID(symbol, word)			\
 asm(							\
 ".pushsection " BTF_IDS_SECTION ",\"a\";       \n"	\
-".local " #symbol " ;                          \n"	\
+"." __global_on_lto " " #symbol " ;            \n"	\
 ".type  " #symbol ", STT_OBJECT;               \n"	\
 ".size  " #symbol ", 4;                        \n"	\
 #symbol ":                                     \n"	\
@@ -88,7 +88,7 @@ asm(							\
 ".popsection;                                  \n");
 
 #define BTF_ID_LIST(name)				\
-__BTF_ID_LIST(name, "local")				\
+__BTF_ID_LIST(name, __global_on_lto)			\
 extern u32 name[];
 
 #define BTF_ID_LIST_GLOBAL(name, n)			\
@@ -148,8 +148,8 @@ asm(							\
 ".popsection;                                  \n");
 
 #define BTF_SET_START(name)				\
-__BTF_ID_LIST(name, "local")				\
-__BTF_SET_START(name, "local")
+__BTF_ID_LIST(name, __global_on_lto)			\
+__BTF_SET_START(name, __global_on_lto)
 
 #define BTF_SET_START_GLOBAL(name)			\
 __BTF_ID_LIST(name, "globl")				\
@@ -192,8 +192,8 @@ asm(							\
 ".popsection;                                  \n");
 
 #define BTF_SET8_START(name)				\
-__BTF_ID_LIST(name, "local")				\
-__BTF_SET8_START(name, "local")
+__BTF_ID_LIST(name, __global_on_lto)			\
+__BTF_SET8_START(name, __global_on_lto)
 
 #define BTF_SET8_END(name)				\
 asm(							\
-- 
2.38.1



* [PATCH 23/46] init.h, lto: mark initcalls as __noreorder
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (21 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 22/46] btf, lto: Make all BTF IDs global on LTO Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 24/46] bpf, lto: mark interpreter jump table " Jiri Slaby (SUSE)
                   ` (24 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Kernels don't like any reordering of initcalls between files, as several
initcalls depend on each other. LTO is allowed to reorder as it wishes,
so LTO builds previously needed -fno-toplevel-reorder to prevent boot
failures. Now we can use __noreorder per symbol. So mark initcall
functions as such.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/init.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/init.h b/include/linux/init.h
index 077d7f93b402..ca827e2fb0da 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -246,7 +246,7 @@ extern bool initcall_debug;
 	static_assert(__same_type(initcall_t, &fn));
 #else
 #define ____define_initcall(fn, __unused, __name, __sec)	\
-	static initcall_t __name __used 			\
+	static initcall_t __name __used __noreorder 		\
 		__attribute__((__section__(__sec))) = fn;
 #endif
 
-- 
2.38.1



* [PATCH 24/46] bpf, lto: mark interpreter jump table as __noreorder
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (22 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 23/46] init.h, lto: mark initcalls as __noreorder Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 25/46] sched, lto: mark sched classes " Jiri Slaby (SUSE)
                   ` (23 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

gcc LTO has a problem that can cause static variables containing &&
labels to be put into a different LTO partition and then fail the build.
This can happen with the jump table in the BPF interpreter.

Mark the interpreter function and the jump table as __noreorder; this
guarantees they both end up in the first partition.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/bpf/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 25a54e04560e..d40ce00622f6 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1640,7 +1640,7 @@ u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
  *
  * Return: whatever value is in %BPF_R0 at program exit
  */
-static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
+static u64 __noreorder ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
 {
 #define BPF_INSN_2_LBL(x, y)    [BPF_##x | BPF_##y] = &&x##_##y
 #define BPF_INSN_3_LBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = &&x##_##y##_##z
-- 
2.38.1



* [PATCH 25/46] sched, lto: mark sched classes as __noreorder
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (23 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 24/46] bpf, lto: mark interpreter jump table " Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 26/46] x86/apic, lto: Mark apic_driver*() " Jiri Slaby (SUSE)
                   ` (22 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
	Andi Kleen, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

The scheduler code assumes that the scheduler classes are in a
particular order in memory. gcc LTO can violate this. Specify
__noreorder to avoid a boot BUG().

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/sched/sched.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index a4a20046e586..fe2703528972 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2230,6 +2230,7 @@ static inline void set_next_task(struct rq *rq, struct task_struct *next)
  */
 #define DEFINE_SCHED_CLASS(name) \
 const struct sched_class name##_sched_class \
+	__noreorder \
 	__aligned(__alignof__(struct sched_class)) \
 	__section("__" #name "_sched_class")
 
-- 
2.38.1



* [PATCH 26/46] x86/apic, lto: Mark apic_driver*() as __noreorder
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (24 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 25/46] sched, lto: mark sched classes " Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 27/46] linkage, lto: use C version for SYSCALL_ALIAS() / cond_syscall() Jiri Slaby (SUSE)
                   ` (21 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Slaby, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Martin Liska

From: Jiri Slaby <jslaby@suse.cz>

The apic code assumes that the apic drivers are in a particular order in
memory. gcc LTO can violate this. So add __noreorder to apic_driver()
and apic_drivers() to avoid a boot BUG().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/include/asm/apic.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 3415321c8240..9c5c69482ab0 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -363,12 +363,12 @@ extern struct apic *apic;
  * to enforce the order with in them.
  */
 #define apic_driver(sym)					\
-	static const struct apic *__apicdrivers_##sym __used		\
+	static const struct apic *__apicdrivers_##sym __used __noreorder \
 	__aligned(sizeof(struct apic *))			\
 	__section(".apicdrivers") = { &sym }
 
 #define apic_drivers(sym1, sym2)					\
-	static struct apic *__apicdrivers_##sym1##sym2[2] __used	\
+	static struct apic *__apicdrivers_##sym1##sym2[2] __used __noreorder \
 	__aligned(sizeof(struct apic *))				\
 	__section(".apicdrivers") = { &sym1, &sym2 }
 
-- 
2.38.1



* [PATCH 27/46] linkage, lto: use C version for SYSCALL_ALIAS() / cond_syscall()
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (25 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 26/46] x86/apic, lto: Mark apic_driver*() " Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 28/46] scripts, lto: re-add gcc-ld Jiri Slaby (SUSE)
                   ` (20 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

With LTO, aliases get largely resolved in the compiler, not in the
linker.

Implement cond_syscall() and SYSCALL_ALIAS() in C to let the compiler
understand the aliases so that it can resolve them properly.

Likely, the architecture specific versions are now not needed anymore,
but they are kept for now.

There is one subtlety here:
The assembler version didn't care whether there was a prototype or not.
This variant assumes there is no prototype because it uses a dummy
(void) signature. This works for sys_ni.c, but breaks for
kernel/time/posix-stubs.c. To avoid problems there, a second variant of
the macro (_PROTO) is added. That uses the previously declared type
(by typeof()).

I actually tried to avoid this by adding prototypes for SYS_NI() and
using only the _PROTO variant, but it resulted in very large patches and
lots of problems with all the different cases. Eventually, I gave up and
just use the _PROTO variant in posix-stubs.c.

[js] gcc >= 8 emits a -Wattribute-alias warning. Work around that with
     __diag_*(). This is ugly, but due to a gcc bug, I see no better
     option. Suggestions welcome.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/linkage.h   | 16 ++++++++--------
 kernel/time/posix-stubs.c | 19 +++++++++++++++++--
 2 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/include/linux/linkage.h b/include/linux/linkage.h
index 1feab6136b5b..688b9bb80e96 100644
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -23,17 +23,17 @@
 #endif
 
 #ifndef cond_syscall
-#define cond_syscall(x)	asm(				\
-	".weak " __stringify(x) "\n\t"			\
-	".set  " __stringify(x) ","			\
-		 __stringify(sys_ni_syscall))
+#define cond_syscall(x)	\
+	extern long x(void) __attribute__((alias("sys_ni_syscall"), weak));
 #endif
 
 #ifndef SYSCALL_ALIAS
-#define SYSCALL_ALIAS(alias, name) asm(			\
-	".globl " __stringify(alias) "\n\t"		\
-	".set   " __stringify(alias) ","		\
-		  __stringify(name))
+#define SYSCALL_ALIAS(a, name) \
+	long a(void) __attribute__((alias(__stringify(name))))
+#define SYSCALL_ALIAS_PROTO(a, name) \
+	typeof(a) a __attribute__((alias(__stringify(name))))
+#else
+#define SYSCALL_ALIAS_PROTO(a, name) SYSCALL_ALIAS(a, name)
 #endif
 
 #define __page_aligned_data	__section(".data..page_aligned") __aligned(PAGE_SIZE)
diff --git a/kernel/time/posix-stubs.c b/kernel/time/posix-stubs.c
index 90ea5f373e50..23e1a63adc2b 100644
--- a/kernel/time/posix-stubs.c
+++ b/kernel/time/posix-stubs.c
@@ -31,13 +31,21 @@ asmlinkage long sys_ni_posix_timers(void)
 }
 
 #ifndef SYS_NI
-#define SYS_NI(name)  SYSCALL_ALIAS(sys_##name, sys_ni_posix_timers)
+#define SYS_NI(name)  SYSCALL_ALIAS_PROTO(sys_##name, sys_ni_posix_timers)
 #endif
 
 #ifndef COMPAT_SYS_NI
-#define COMPAT_SYS_NI(name)  SYSCALL_ALIAS(compat_sys_##name, sys_ni_posix_timers)
+#define COMPAT_SYS_NI(name) \
+	SYSCALL_ALIAS_PROTO(compat_sys_##name, sys_ni_posix_timers)
 #endif
 
+/*
+ * This cannot go to SYS_NI() or SYSCALL_ALIAS_PROTO() due to gcc bug fixed in
+ * gcc >= 13 (cf. PR 97498). I wonder how is __SYSCALL_DEFINEx() able to work?
+ */
+__diag_push();
+__diag_ignore(GCC, 8, "-Wattribute-alias", "Alias to nonimplemented syscall");
+
 SYS_NI(timer_create);
 SYS_NI(timer_gettime);
 SYS_NI(timer_getoverrun);
@@ -51,6 +59,8 @@ SYS_NI(clock_adjtime32);
 SYS_NI(alarm);
 #endif
 
+__diag_pop();
+
 /*
  * We preserve minimal support for CLOCK_REALTIME and CLOCK_MONOTONIC
  * as it is easy to remain compatible with little code. CLOCK_BOOTTIME
@@ -157,6 +167,9 @@ SYSCALL_DEFINE4(clock_nanosleep, const clockid_t, which_clock, int, flags,
 				 which_clock);
 }
 
+__diag_push();
+__diag_ignore(GCC, 8, "-Wattribute-alias", "Alias to nonimplemented syscall");
+
 #ifdef CONFIG_COMPAT
 COMPAT_SYS_NI(timer_create);
 #endif
@@ -170,6 +183,8 @@ COMPAT_SYS_NI(setitimer);
 SYS_NI(timer_settime32);
 SYS_NI(timer_gettime32);
 
+__diag_pop();
+
 SYSCALL_DEFINE2(clock_settime32, const clockid_t, which_clock,
 		struct old_timespec32 __user *, tp)
 {
-- 
2.38.1



* [PATCH 28/46] scripts, lto: re-add gcc-ld
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (26 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 27/46] linkage, lto: use C version for SYSCALL_ALIAS() / cond_syscall() Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 29/46] scripts, lto: use CONFIG_LTO for many LTO specific actions Jiri Slaby (SUSE)
                   ` (19 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Masahiro Yamada, Michal Marek, Nick Desaulniers,
	linux-kbuild, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

The primary goal of the script is to mangle linker command line arguments
into something which gcc understands, such as converting "-z now" into
"-Wl,-z,now".

The script was removed by commit 86879fd277e8 (scripts: remove obsolete
gcc-ld script) as it had no users in the kernel. It had been added a long
time ago to support exactly these LTO patches, so we need to add it back
now.

Compared to the removed version, it has been improved a bit:
* some missing linker and gcc command line arguments were added, and
* when V=1 is specified, it prints the final gcc command line

[js] rebase + commit message massage

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 scripts/gcc-ld | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100755 scripts/gcc-ld

diff --git a/scripts/gcc-ld b/scripts/gcc-ld
new file mode 100755
index 000000000000..13e85ece8d04
--- /dev/null
+++ b/scripts/gcc-ld
@@ -0,0 +1,40 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# run gcc with ld options
+# used as a wrapper to execute link time optimizations
+# yes virginia, this is not pretty
+
+ARGS="-nostdlib"
+
+for j in "$@" ; do
+	if [ "$j" = -v ] ; then
+		exec `$CC -print-prog-name=ld` -v
+	fi
+done
+
+while [ "$1" != "" ] ; do
+	case "$1" in
+	-save-temps*|-m32|-m64) N="$1" ;;
+	-r) N="$1" ;;
+	-flinker-output*) N="$1" ;;
+	-[Wg]*) N="$1" ;;
+	-[olv]|-[Ofd]*|-nostdlib) N="$1" ;;
+	--end-group|--start-group|--whole-archive|--no-whole-archive|\
+--no-undefined|--hash-style*|--build-id*|--eh-frame-hdr|-Bsymbolic)
+		 N="-Wl,$1" ;;
+	-[RTFGhIezcbyYu]*|\
+--script|--defsym|-init|-Map|--oformat|-rpath|\
+-rpath-link|--sort-section|--section-start|-Tbss|-Tdata|-Ttext|-soname|\
+--version-script|--dynamic-list|--version-exports-symbol|--wrap|-m|-z)
+		A="$1" ; shift ; N="-Wl,$A,$1" ;;
+	-[m]*) N="$1" ;;
+	-*) N="-Wl,$1" ;;
+	*)  N="$1" ;;
+	esac
+	ARGS="$ARGS $N"
+	shift
+done
+
+[ -n "$V" ] && echo >&2 $CC $ARGS
+
+exec $CC $ARGS
-- 
2.38.1



* [PATCH 29/46] scripts, lto: use CONFIG_LTO for many LTO specific actions
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (27 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 28/46] scripts, lto: re-add gcc-ld Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 30/46] Kbuild, lto: Add Link Time Optimization support Jiri Slaby (SUSE)
                   ` (18 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Masahiro Yamada, Michal Marek, Nick Desaulniers,
	linux-kbuild, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

The clang LTO and the gcc LTO share some changes in Makefiles and build
scripts. Change the common ones to use CONFIG_LTO instead of
CONFIG_LTO_CLANG so that they can be used by gcc too.

[js] fix scripts/link-vmlinux.sh too

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 Makefile                          | 2 +-
 include/asm-generic/vmlinux.lds.h | 2 +-
 kernel/kallsyms.c                 | 2 +-
 scripts/Makefile.build            | 2 +-
 scripts/Makefile.lib              | 2 +-
 scripts/link-vmlinux.sh           | 2 +-
 scripts/module.lds.S              | 2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index 58cd4f5e1c3a..0b723c903819 100644
--- a/Makefile
+++ b/Makefile
@@ -992,7 +992,7 @@ endif
 endif
 endif
 
-ifdef CONFIG_LTO_CLANG
+ifdef CONFIG_LTO
 KBUILD_CFLAGS	+= -fno-lto $(CC_FLAGS_LTO)
 KBUILD_AFLAGS	+= -fno-lto
 export CC_FLAGS_LTO
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 3dc5824141cd..5e2179dd41d5 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -96,7 +96,7 @@
  * RODATA_MAIN is not used because existing code already defines .rodata.x
  * sections to be brought in with rodata.
  */
-#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
+#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO)
 #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
 #define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..L* .data..compoundliteral* .data.$__unnamed_* .data.$L*
 #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]*
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 60c20f301a6b..1d4557ae090f 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -167,7 +167,7 @@ static bool cleanup_symbol_name(char *s)
 {
 	char *res;
 
-	if (!IS_ENABLED(CONFIG_LTO_CLANG))
+	if (!IS_ENABLED(CONFIG_LTO))
 		return false;
 
 	/*
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 564a20ce2667..0a28e3884efe 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -153,7 +153,7 @@ is-single-obj-m = $(and $(part-of-module),$(filter $@, $(obj-m)),y)
 
 # When a module consists of a single object, there is no reason to keep LLVM IR.
 # Make $(LD) covert LLVM IR to ELF here.
-ifdef CONFIG_LTO_CLANG
+ifdef CONFIG_LTO
 cmd_ld_single_m = $(if $(is-single-obj-m), ; $(LD) $(ld_flags) -r -o $(tmp-target) $@; mv $(tmp-target) $@)
 endif
 
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 3aa384cec76b..ac918fd84d96 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -269,7 +269,7 @@ objtool-args = $(objtool-args-y)					\
 	$(if $(delay-objtool), --link)					\
 	$(if $(part-of-module), --module)
 
-delay-objtool := $(or $(CONFIG_LTO_CLANG),$(CONFIG_X86_KERNEL_IBT))
+delay-objtool := $(or $(CONFIG_LTO),$(CONFIG_X86_KERNEL_IBT))
 
 cmd_objtool = $(if $(objtool-enabled), ; $(objtool) $(objtool-args) $@)
 cmd_gen_objtooldep = $(if $(objtool-enabled), { echo ; echo '$@: $$(wildcard $(objtool))' ; } >> $(dot-target).cmd)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 918470d768e9..652f33be9549 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -60,7 +60,7 @@ vmlinux_link()
 	# skip output file argument
 	shift
 
-	if is_enabled CONFIG_LTO_CLANG || is_enabled CONFIG_X86_KERNEL_IBT; then
+	if is_enabled CONFIG_LTO || is_enabled CONFIG_X86_KERNEL_IBT; then
 		# Use vmlinux.o instead of performing the slow LTO link again.
 		objs=vmlinux.o
 		libs=
diff --git a/scripts/module.lds.S b/scripts/module.lds.S
index da4bddd26171..b36b0527b0a8 100644
--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -27,7 +27,7 @@ SECTIONS {
 	__kcfi_traps 		: { KEEP(*(.kcfi_traps)) }
 #endif
 
-#ifdef CONFIG_LTO_CLANG
+#ifdef CONFIG_LTO
 	/*
 	 * With CONFIG_LTO_CLANG, LLD always enables -fdata-sections and
 	 * -ffunction-sections, which increases the size of the final module.
-- 
2.38.1



* [PATCH 30/46] Kbuild, lto: Add Link Time Optimization support
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (28 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 29/46] scripts, lto: use CONFIG_LTO for many LTO specific actions Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 18:55   ` Josh Poimboeuf
  2022-11-14 11:43 ` [PATCH 31/46] x86/purgatory, lto: Disable gcc LTO for purgatory Jiri Slaby (SUSE)
                   ` (17 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Masahiro Yamada, Michal Marek, Nick Desaulniers,
	linux-kbuild, Richard Biener, Jan Hubicka, H . J . Lu,
	Don Zickus, Martin Liska, Bagas Sanjaya, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

This patch adds gcc LTO support. It leverages some of the existing
support for clang LTO.

With LTO, gcc will do whole program optimizations for the whole kernel
and each module. This increases compile time, but can generate faster
and smaller code and allows the compiler to do global checking. For
example, the compiler can now complain about type mismatches for symbols
between different files.

LTO allows gcc to inline functions between different files and do
various other optimizations across the whole binary.

The LTO patches have been used for many years by various users, mostly
to make their kernel smaller. The original versions date back to 2012.

This version has a lot of outdated cruft dropped and no longer needs any
special toolchain (just a new enough one).

This adds the basic Kbuild plumbing for LTO:
* Add a new LDFINAL variable that controls the final link for vmlinux or
  module. In this case we call gcc-ld instead of ld, to run the LTO
  step.

* Add Makefile support to enable LTO

For more information see Documentation/kbuild/lto-build.rst

Thanks to H.J. Lu, Joe Mario, Honza Hubicka, Richard Biener, Don Zickus,
Changlong Xie, Gleb Schukin, Martin Liska and various GitHub contributors
who helped with this project (and probably some more whom I forgot,
sorry).

[js] pass -flto only once (the one with jobserver)
[ml] "-m: command not found" and whitespace fix
[bs] fixed Documentation issues:
  * blank line padding before single requirement list
  * use bullet list for FAQ
  * use bullet lists for external link references list
  * add LTO documentation to toc index

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-kbuild@vger.kernel.org
Cc: Richard Biener <RGuenther@suse.com>
Cc: Jan Hubicka <jh@suse.de>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 Documentation/kbuild/index.rst     |  2 +
 Documentation/kbuild/lto-build.rst | 76 ++++++++++++++++++++++++++++++
 Makefile                           |  4 +-
 arch/Kconfig                       | 52 ++++++++++++++++++++
 scripts/Makefile.build             |  9 ++--
 scripts/Makefile.lto               | 43 +++++++++++++++++
 scripts/Makefile.modfinal          |  2 +-
 scripts/Makefile.vmlinux           |  3 +-
 scripts/Makefile.vmlinux_o         |  4 +-
 scripts/link-vmlinux.sh            |  7 +--
 10 files changed, 190 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/kbuild/lto-build.rst
 create mode 100644 scripts/Makefile.lto

diff --git a/Documentation/kbuild/index.rst b/Documentation/kbuild/index.rst
index cee2f99f734b..1937eee7c437 100644
--- a/Documentation/kbuild/index.rst
+++ b/Documentation/kbuild/index.rst
@@ -22,6 +22,8 @@ Kernel Build System
     gcc-plugins
     llvm
 
+    lto-build
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/kbuild/lto-build.rst b/Documentation/kbuild/lto-build.rst
new file mode 100644
index 000000000000..3fb17342e72f
--- /dev/null
+++ b/Documentation/kbuild/lto-build.rst
@@ -0,0 +1,76 @@
+=====================================================
+gcc link time optimization (LTO) for the Linux kernel
+=====================================================
+
+Link Time Optimization allows the compiler to optimize the complete program
+instead of just each file.
+
+The compiler can inline functions between files and do various other global
+optimizations, like specializing functions for common parameters,
+determining when global variables are clobbered, making functions pure/const,
+propagating constants globally, removing unneeded data and others.
+
+It will also drop unused functions which can make the kernel
+image smaller in some circumstances, in particular for small kernel
+configurations.
+
+For small monolithic kernels it can throw away unused code very effectively
+(especially when modules are disabled) and usually shrinks
+the code size.
+
+Build time and memory consumption at build time will increase, depending
+on the size of the largest binary. Modular kernels are less affected.
+With LTO, incremental builds are less incremental, as the whole
+binary always needs to be re-optimized (but not re-parsed).
+
+Oopses can be somewhat more difficult to read, due to the more aggressive
+inlining: it helps to use scripts/faddr2line.
+
+It is currently incompatible with live patching.
+
+Normal "reasonable" builds work with less than 4GB of RAM, but very large
+configurations like allyesconfig typically need more memory. The actual
+memory needed depends on the available memory (gcc sizes its garbage
+collector pools based on that or on the ulimit -m limits) and
+the compiler version.
+
+Requirements:
+-------------
+
+- Enough memory: 4GB for a standard build, more for allyesconfig.
+  The peak memory usage happens single threaded (when lto-wpa merges types),
+  so dialing back -j options will not help much.
+
+A 32bit hosted compiler is unlikely to work due to the memory requirements.
+You can however build a kernel targeted at 32bit on a 64bit host.
+
+FAQs:
+-----
+
+* I get a section type attribute conflict
+
+  Usually because of someone doing const __initdata (should be
+  const __initconst) or const __read_mostly (should be just const). Check
+  both symbols reported by gcc.
+
+References:
+-----------
+
+* Presentation on Kernel LTO
+  (note, performance numbers/details totally outdated.)
+
+  http://halobates.de/kernel-lto.pdf
+
+* Generic gcc LTO:
+
+  * http://www.ucw.cz/~hubicka/slides/labs2013.pdf
+  * http://www.hipeac.net/system/files/barcelona.pdf
+
+* Somewhat outdated too (from GCC site):
+
+  * http://gcc.gnu.org/projects/lto/lto.pdf
+  * http://gcc.gnu.org/projects/lto/whopr.pdf
+
+Happy Link-Time-Optimizing!
+
+Andi Kleen
diff --git a/Makefile b/Makefile
index 0b723c903819..d0dfb5ca2b21 100644
--- a/Makefile
+++ b/Makefile
@@ -482,6 +482,7 @@ KBUILD_HOSTLDLIBS   := $(HOST_LFS_LIBS) $(HOSTLDLIBS)
 
 # Make variables (CC, etc...)
 CPP		= $(CC) -E
+LDFINAL		= $(LD)
 ifneq ($(LLVM),)
 CC		= $(LLVM_PREFIX)clang$(LLVM_SUFFIX)
 LD		= $(LLVM_PREFIX)ld.lld$(LLVM_SUFFIX)
@@ -604,7 +605,7 @@ export RUSTC RUSTDOC RUSTFMT RUSTC_OR_CLIPPY_QUIET RUSTC_OR_CLIPPY BINDGEN CARGO
 export HOSTRUSTC KBUILD_HOSTRUSTFLAGS
 export CPP AR NM STRIP OBJCOPY OBJDUMP READELF PAHOLE RESOLVE_BTFIDS LEX YACC AWK INSTALLKERNEL
 export PERL PYTHON3 CHECK CHECKFLAGS MAKE UTS_MACHINE HOSTCXX
-export KGZIP KBZIP2 KLZOP LZMA LZ4 XZ ZSTD
+export KGZIP KBZIP2 KLZOP LZMA LZ4 XZ ZSTD LDFINAL
 export KBUILD_HOSTCXXFLAGS KBUILD_HOSTLDFLAGS KBUILD_HOSTLDLIBS LDFLAGS_MODULE
 export KBUILD_USERCFLAGS KBUILD_USERLDFLAGS
 
@@ -1085,6 +1086,7 @@ include-$(CONFIG_KMSAN)		+= scripts/Makefile.kmsan
 include-$(CONFIG_UBSAN)		+= scripts/Makefile.ubsan
 include-$(CONFIG_KCOV)		+= scripts/Makefile.kcov
 include-$(CONFIG_RANDSTRUCT)	+= scripts/Makefile.randstruct
+include-$(CONFIG_LTO_GCC)	+= scripts/Makefile.lto
 include-$(CONFIG_GCC_PLUGINS)	+= scripts/Makefile.gcc-plugins
 
 include $(addprefix $(srctree)/, $(include-y))
diff --git a/arch/Kconfig b/arch/Kconfig
index 8f138e580d1a..ad52c8fddfb4 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -689,6 +689,21 @@ config HAS_LTO_CLANG
 	  The compiler and Kconfig options support building with Clang's
 	  LTO.
 
+config ARCH_SUPPORTS_LTO_GCC
+	bool
+
+# Some ar versions leak file descriptors when using the LTO
+# plugin and cause strange errors when ulimit -n is too low.
+# Pick an arbitrary threshold, which should be enough for most
+# kernel configs. This was a regression that is only
+# in some transient binutils version, so either older or
+# new enough is ok.
+# This might not be the exact range with this bug.
+config BAD_AR
+	depends on LD_VERSION = 23000
+	depends on $(shell,ulimit -n) < 4000
+	def_bool y
+
 choice
 	prompt "Link Time Optimization (LTO)"
 	default LTO_NONE
@@ -736,8 +751,45 @@ config LTO_CLANG_THIN
 	    https://clang.llvm.org/docs/ThinLTO.html
 
 	  If unsure, say Y.
+
+config LTO_GCC
+	bool "gcc LTO"
+	depends on ARCH_SUPPORTS_LTO_GCC && CC_IS_GCC
+	depends on GCC_VERSION >= 100300
+	depends on LD_VERSION >= 22700
+	depends on !BAD_AR
+	select LTO
+	help
+	  Enable whole program (link time) optimizations (LTO) for the whole
+	  kernel and each module. This usually increases compile time,
+	  especially for incremental builds, but tends to generate better code
+	  and enables some global checks.
+
+	  It allows the compiler to inline functions between different files
+	  and do other global optimizations, like propagating constants between
+	  functions, determining side effects of functions, avoiding unnecessary
+	  register saving around functions, or optimizing unused function
+	  arguments. It also allows the compiler to drop unused functions.
+
+	  With this option the compiler will also do some global checking over
+	  different source files.
+
+	  This requires a gcc 10.3 or later compiler and binutils >= 2.27.
+
+	  On larger non modular configurations this may need more than 4GB of
+	  RAM for the link phase, as well as a 64bit host compiler.
+
+	  For more information see Documentation/kbuild/lto-build.rst
 endchoice
 
+config LTO_CP_CLONE
+	bool "Allow aggressive cloning for function specialization"
+	depends on LTO_GCC
+	help
+	  Allow the compiler to clone and specialize functions for specific
+	  arguments when it determines these arguments are commonly
+	  passed.  Experimental. Will increase text size.
+
 config ARCH_SUPPORTS_CFI_CLANG
 	bool
 	help
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 0a28e3884efe..9b522c9efcb6 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -154,7 +154,7 @@ is-single-obj-m = $(and $(part-of-module),$(filter $@, $(obj-m)),y)
 # When a module consists of a single object, there is no reason to keep LLVM IR.
 # Make $(LD) covert LLVM IR to ELF here.
 ifdef CONFIG_LTO
-cmd_ld_single_m = $(if $(is-single-obj-m), ; $(LD) $(ld_flags) -r -o $(tmp-target) $@; mv $(tmp-target) $@)
+cmd_ld_single_m = $(if $(is-single-obj-m), ; $(LDFINAL) $(ld_flags) -r -o $(tmp-target) $@; mv $(tmp-target) $@)
 endif
 
 quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
@@ -265,7 +265,8 @@ $(obj)/%.usyms: $(obj)/%.o FORCE
 	$(call if_changed,undefined_syms)
 
 quiet_cmd_cc_lst_c = MKLST   $@
-      cmd_cc_lst_c = $(CC) $(c_flags) -g -c -o $*.o $< && \
+      cmd_cc_lst_c = $(if $(CONFIG_LTO),$(warning Listing in LTO mode does not match final binary)) \
+		     $(CC) $(c_flags) -g -c -o $*.o $< && \
 		     $(CONFIG_SHELL) $(srctree)/scripts/makelst $*.o \
 				     System.map $(OBJDUMP) > $@
 
@@ -446,8 +447,8 @@ $(obj)/modules.order: $(obj-m) FORCE
 $(obj)/lib.a: $(lib-y) FORCE
 	$(call if_changed,ar)
 
-quiet_cmd_ld_multi_m = LD [M]  $@
-      cmd_ld_multi_m = $(LD) $(ld_flags) -r -o $@ @$(patsubst %.o,%.mod,$@) $(cmd_objtool)
+quiet_cmd_ld_multi_m = LDFINAL [M] $@
+      cmd_ld_multi_m = $(LDFINAL) $(ld_flags) -r -o $@ @$(patsubst %.o,%.mod,$@) $(cmd_objtool)
 
 define rule_ld_multi_m
 	$(call cmd_and_savecmd,ld_multi_m)
diff --git a/scripts/Makefile.lto b/scripts/Makefile.lto
new file mode 100644
index 000000000000..33ac0da2bb47
--- /dev/null
+++ b/scripts/Makefile.lto
@@ -0,0 +1,43 @@
+#
+# Support for gcc link time optimization
+#
+
+DISABLE_LTO_GCC :=
+export DISABLE_LTO_GCC
+
+ifdef CONFIG_LTO_GCC
+	CC_FLAGS_LTO_GCC := -flto
+	DISABLE_LTO_GCC := -fno-lto
+
+	KBUILD_CFLAGS += ${CC_FLAGS_LTO_GCC}
+
+	CC_FLAGS_LTO := -flto
+	export CC_FLAGS_LTO
+
+	lto-flags-y := -flinker-output=nolto-rel -flto=jobserver
+	lto-flags-y += -fwhole-program
+
+	lto-flags-$(CONFIG_LTO_CP_CLONE) += -fipa-cp-clone
+
+	# allow extra flags from command line
+	lto-flags-y += ${LTO_EXTRA_CFLAGS}
+
+	# For LTO we need to use gcc to do the linking, not ld
+	# directly. Use a wrapper to convert the ld command line
+	# to gcc
+	LDFINAL := ${CONFIG_SHELL} ${srctree}/scripts/gcc-ld \
+                  ${lto-flags-y}
+
+	# LTO gcc creates a lot of files in TMPDIR, and with /tmp as tmpfs
+	# it's easy to drive the machine OOM. Use the object directory
+	# instead for temporaries.
+	# This has the drawback that there might be some junk more visible
+	# after interrupted compilations, but you would have that junk
+	# there anyways in /tmp.
+	TMPDIR ?= $(objtree)
+	export TMPDIR
+
+	# use plugin aware tools
+	AR = $(CROSS_COMPILE)gcc-ar
+	NM = $(CROSS_COMPILE)gcc-nm
+endif # CONFIG_LTO_GCC
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 25bedd83644b..c52536c91c8c 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -32,7 +32,7 @@ ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
 
 quiet_cmd_ld_ko_o = LD [M]  $@
       cmd_ld_ko_o +=							\
-	$(LD) -r $(KBUILD_LDFLAGS)					\
+	$(LDFINAL) -r $(KBUILD_LDFLAGS)					\
 		$(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE)		\
 		-T scripts/module.lds -o $@ $(filter %.o, $^);		\
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
diff --git a/scripts/Makefile.vmlinux b/scripts/Makefile.vmlinux
index 49946cb96844..8871e55f881b 100644
--- a/scripts/Makefile.vmlinux
+++ b/scripts/Makefile.vmlinux
@@ -26,7 +26,8 @@ ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
 
 # Final link of vmlinux with optional arch pass after final link
 cmd_link_vmlinux =							\
-	$< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";		\
+	$< "$(LD)" "$(LDFINAL)" "$(KBUILD_LDFLAGS)"			\
+	"$(LDFLAGS_vmlinux)";						\
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
 targets += vmlinux
diff --git a/scripts/Makefile.vmlinux_o b/scripts/Makefile.vmlinux_o
index 1c86895cfcf8..1f646b16aa70 100644
--- a/scripts/Makefile.vmlinux_o
+++ b/scripts/Makefile.vmlinux_o
@@ -44,9 +44,9 @@ objtool-args = $(vmlinux-objtool-args-y) --link
 # Link of vmlinux.o used for section mismatch analysis
 # ---------------------------------------------------------------------------
 
-quiet_cmd_ld_vmlinux.o = LD      $@
+quiet_cmd_ld_vmlinux.o = LDFINAL $@
       cmd_ld_vmlinux.o = \
-	$(LD) ${KBUILD_LDFLAGS} -r -o $@ \
+	$(LDFINAL) ${KBUILD_LDFLAGS} -r -o $@ \
 	$(addprefix -T , $(initcalls-lds)) \
 	--whole-archive vmlinux.a --no-whole-archive \
 	--start-group $(KBUILD_VMLINUX_LIBS) --end-group \
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 652f33be9549..c89258bcf818 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -29,8 +29,9 @@
 set -e
 
 LD="$1"
-KBUILD_LDFLAGS="$2"
-LDFLAGS_vmlinux="$3"
+LDFINAL="$2"
+KBUILD_LDFLAGS="$3"
+LDFLAGS_vmlinux="$4"
 
 is_enabled() {
 	grep -q "^$1=y" include/config/auto.conf
@@ -82,7 +83,7 @@ vmlinux_link()
 		ldlibs="-lutil -lrt -lpthread"
 	else
 		wl=
-		ld="${LD}"
+		ld="${LDFINAL}"
 		ldflags="${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}"
 		ldlibs=
 	fi
-- 
2.38.1


* [PATCH 31/46] x86/purgatory, lto: Disable gcc LTO for purgatory
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (29 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 30/46] Kbuild, lto: Add Link Time Optimization support Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 32/46] x86/realmode, lto: Disable gcc LTO for real mode code Jiri Slaby (SUSE)
                   ` (16 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

There are various issues with gcc LTO in the purgatory code, so disable
LTO here for now.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/purgatory/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 17f09dc26381..c00dc09d6fe4 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -60,6 +60,8 @@ ifdef CONFIG_CFI_CLANG
 PURGATORY_CFLAGS_REMOVE		+= $(CC_FLAGS_CFI)
 endif
 
+PURGATORY_CFLAGS_REMOVE		+= $(CC_FLAGS_LTO)
+
 CFLAGS_REMOVE_purgatory.o	+= $(PURGATORY_CFLAGS_REMOVE)
 CFLAGS_purgatory.o		+= $(PURGATORY_CFLAGS)
 
-- 
2.38.1


* [PATCH 32/46] x86/realmode, lto: Disable gcc LTO for real mode code
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (30 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 31/46] x86/purgatory, lto: Disable gcc LTO for purgatory Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 33/46] x86/vdso, lto: Disable gcc LTO for the vdso Jiri Slaby (SUSE)
                   ` (15 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

The early real mode bootup code makes various assumptions that break
with LTO. For example it assumes that top level assembler statements
don't get reordered. Disable LTO for the real mode code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/realmode/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/realmode/Makefile b/arch/x86/realmode/Makefile
index a0b491ae2de8..47b8b500cf15 100644
--- a/arch/x86/realmode/Makefile
+++ b/arch/x86/realmode/Makefile
@@ -10,6 +10,7 @@
 # Sanitizer runtimes are unavailable and cannot be linked here.
 KASAN_SANITIZE			:= n
 KCSAN_SANITIZE			:= n
+KBUILD_CFLAGS			+= $(DISABLE_LTO_GCC)
 
 subdir- := rm
 
-- 
2.38.1


* [PATCH 33/46] x86/vdso, lto: Disable gcc LTO for the vdso
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (31 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 32/46] x86/realmode, lto: Disable gcc LTO for real mode code Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 34/46] scripts, lto: disable gcc LTO for some mod sources Jiri Slaby (SUSE)
                   ` (14 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86, Martin Liska,
	Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Disable gcc LTO for the vdso. It's not really useful here and causes
various strange problems.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/entry/vdso/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 3e88b9df8c8f..e8099ee163a0 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -3,6 +3,8 @@
 # Building vDSO images for x86.
 #
 
+KBUILD_CFLAGS +=		$(DISABLE_LTO_GCC)
+
 # Absolute relocation type $(ARCH_REL_TYPE_ABS) needs to be defined before
 # the inclusion of generic Makefile.
 ARCH_REL_TYPE_ABS := R_X86_64_JUMP_SLOT|R_X86_64_GLOB_DAT|R_X86_64_RELATIVE|
-- 
2.38.1


* [PATCH 34/46] scripts, lto: disable gcc LTO for some mod sources
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (32 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 33/46] x86/vdso, lto: Disable gcc LTO for the vdso Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 35/46] Kbuild, lto: disable gcc LTO for bounds+asm-offsets Jiri Slaby (SUSE)
                   ` (13 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Masahiro Yamada, Michal Marek, Nick Desaulniers,
	linux-kbuild, Andi Kleen, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

The mod tools scan an assembler file (devicetable-offsets.s) to generate
symbols for devicetable-offsets.h, and a binary (empty.o) to determine the
ELF setup. Neither works with LTO, so disable LTO for empty.o and
devicetable-offsets.s.

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 scripts/mod/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index c9e38ad937fd..aa3465d6bc4a 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 OBJECT_FILES_NON_STANDARD := y
 CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO)
+CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO_GCC)
 
 hostprogs-always-y	+= modpost mk_elfconfig
 always-y		+= empty.o
@@ -9,6 +10,8 @@ modpost-objs	:= modpost.o file2alias.o sumversion.o
 
 devicetable-offsets-file := devicetable-offsets.h
 
+$(obj)/devicetable-offsets.s: KBUILD_CFLAGS += $(DISABLE_LTO_GCC)
+
 $(obj)/$(devicetable-offsets-file): $(obj)/devicetable-offsets.s FORCE
 	$(call filechk,offsets,__DEVICETABLE_OFFSETS_H__)
 
-- 
2.38.1


* [PATCH 35/46] Kbuild, lto: disable gcc LTO for bounds+asm-offsets
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (33 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 34/46] scripts, lto: disable gcc LTO for some mod sources Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 36/46] lib/string, lto: disable gcc LTO for string.o Jiri Slaby (SUSE)
                   ` (12 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Masahiro Yamada, Michal Marek, Nick Desaulniers,
	linux-kbuild, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Disable LTO when generating the bounds+asm-offsets.s files which are
scanned for C constants. With an LTO build, the file would contain the
gcc IR in assembler form, which breaks the scanning scripts.

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
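For context, the scan that breaks is line-oriented: it picks marker
lines of the form "->NAME value comment" out of the generated .s file
and turns them into #defines. The sketch below is a simplified
approximation of that idea, not the kernel's actual sed script. It
assumes the markers already sit on lines of their own; scan_offsets and
the sample input are made up for illustration.

```shell
#!/bin/sh
# Hypothetical, simplified version of the offsets scan: print one
# "#define" per "->NAME value comment" marker, ignore everything else.
scan_offsets() {
	sed -n 's/^->\([^ ]*\) [$#]*\([^ ]*\) \(.*\)/#define \1 \2 \/* \3 *\//p'
}

# Hand-written sample of what a non-LTO .s file might contain.
sample='	.file	"bounds.c"
->NR_PAGEFLAGS $23 __NR_PAGEFLAGS
	.ident	"GCC"'

out=$(printf '%s\n' "$sample" | scan_offsets)
echo "$out"
```

When the .s file holds gcc IR rather than such marker lines, a scan of
this kind finds nothing to convert, which is why the two files are built
with LTO disabled.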
 Kbuild | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Kbuild b/Kbuild
index 464b34a08f51..40744d76d416 100644
--- a/Kbuild
+++ b/Kbuild
@@ -11,6 +11,8 @@ bounds-file := include/generated/bounds.h
 
 targets := kernel/bounds.s
 
+kernel/bounds.s: KBUILD_CFLAGS += $(DISABLE_LTO_GCC)
+
 $(bounds-file): kernel/bounds.s FORCE
 	$(call filechk,offsets,__LINUX_BOUNDS_H__)
 
@@ -30,6 +32,7 @@ offsets-file := include/generated/asm-offsets.h
 targets += arch/$(SRCARCH)/kernel/asm-offsets.s
 
 arch/$(SRCARCH)/kernel/asm-offsets.s: $(timeconst-file) $(bounds-file)
+arch/$(SRCARCH)/kernel/asm-offsets.s: KBUILD_CFLAGS += $(DISABLE_LTO_GCC)
 
 $(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
 	$(call filechk,offsets,__ASM_OFFSETS_H__)
-- 
2.38.1


* [PATCH 36/46] lib/string, lto: disable gcc LTO for string.o
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (34 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 35/46] Kbuild, lto: disable gcc LTO for bounds+asm-offsets Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 37/46] Compiler attributes, lto: disable __flatten with LTO Jiri Slaby (SUSE)
                   ` (11 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen, Andi Kleen, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

gcc can implicitly generate calls to string functions, and it assumes
that they exist in a non-LTOed copy. Mark string.o as LTO disabled to avoid
missing symbols at link time.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 lib/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/Makefile b/lib/Makefile
index 59bd7c2f793a..bf72b58de5c8 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -27,6 +27,8 @@ KASAN_SANITIZE_string.o := n
 CFLAGS_string.o += -fno-stack-protector
 endif
 
+CFLAGS_string.o += $(DISABLE_LTO_GCC)
+
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 rbtree.o radix-tree.o timerqueue.o xarray.o \
 	 maple_tree.o idr.o extable.o irq_regs.o argv_split.o \
-- 
2.38.1


* [PATCH 37/46] Compiler attributes, lto: disable __flatten with LTO
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (35 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 36/46] lib/string, lto: disable gcc LTO for string.o Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 17:01   ` Miguel Ojeda
  2022-11-14 11:43 ` [PATCH 38/46] Kbuild, lto: don't include weak source file symbols in System.map Jiri Slaby (SUSE)
                   ` (10 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Miguel Ojeda, Nick Desaulniers, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Using __flatten causes a simple gcc 12 LTO build to no longer fit into
16GB of RAM. With gcc 12, the build still has not finished linking after
10 minutes, by which point it is eating 40GB of RAM.

There is an upstream bug about this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

Until this is resolved, simply disable __flatten with LTO.

In the future, instead of this patch, we should likely drop __flatten
along with its only use (in pcpu_build_alloc_info()) and apply
always_inline to all functions which shall be inlined there.

Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/compiler_attributes.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index be6c71fd5ebb..09cf8eebcb0d 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -229,7 +229,12 @@
  * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
  * clang: https://clang.llvm.org/docs/AttributeReference.html#flatten
  */
+#ifndef CONFIG_LTO_GCC
 # define __flatten			__attribute__((flatten))
+#else
+/* Causes very large memory use with gcc in LTO mode */
+# define __flatten
+#endif
 
 /*
  * Note the missing underscores.
-- 
2.38.1


* [PATCH 38/46] Kbuild, lto: don't include weak source file symbols in System.map
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (36 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 37/46] Compiler attributes, lto: disable __flatten with LTO Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 39/46] x86, lto: Disable relative init pointers with gcc LTO Jiri Slaby (SUSE)
                   ` (9 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Masahiro Yamada, Michal Marek, Nick Desaulniers,
	linux-kbuild, Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

The gcc LTO build can generate some extra weak source code file name
symbols on the second kallsyms link like:
  0000000002fdf20a W head64.c.552cf5a6

This causes the "Inconsistent kallsyms data" error due to mismatches in
the stage1 vs stage2 kallsyms link. Filter those out when generating
the System.map.

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
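To illustrate the new filter: the added pattern drops lines whose type
is 'W' and whose name contains ".c.", and leaves other weak symbols
alone. A small sketch on hand-written nm-style input follows. The first
sample line is the example from the log above; the other two lines and
the helper name filter_lto_file_syms are invented for illustration.

```shell
#!/bin/sh
# Same grep expression as the one added to scripts/mksysmap, wrapped in
# a hypothetical helper so it can be tried on sample input.
filter_lto_file_syms() {
	grep -v -e ' W .*\.c\.'
}

sample='0000000002fdf20a W head64.c.552cf5a6
ffffffff81000000 T _stext
ffffffff81234560 W arch_cpu_idle'

out=$(printf '%s\n' "$sample" | filter_lto_file_syms)
echo "$out"
```

Only the head64.c.552cf5a6 line is removed; the ordinary weak symbol
stays in the map.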
 scripts/mksysmap | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/mksysmap b/scripts/mksysmap
index 16a08b8ef2f8..0f19a44ab136 100755
--- a/scripts/mksysmap
+++ b/scripts/mksysmap
@@ -34,6 +34,7 @@
 #   U - undefined global symbols
 #   N - debugging symbols
 #   w - local weak symbols
+#   W - weak symbols if they contain .c.
 
 # readprofile starts reading symbols when _stext is found, and
 # continue until it finds a symbol which is not either of 'T', 't',
@@ -57,4 +58,5 @@ $NM -n $1 | grep -v		\
 	-e ' __kstrtab_'	\
 	-e ' __kstrtabns_'	\
 	-e ' L0$'		\
+	-e ' W .*\.c\.'		\
 > $2
-- 
2.38.1


* [PATCH 39/46] x86, lto: Disable relative init pointers with gcc LTO
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (37 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 38/46] Kbuild, lto: don't include weak source file symbols in System.map Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 40/46] x86/livepatch, lto: Disable live patching " Jiri Slaby (SUSE)
                   ` (8 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Relative init pointers are implemented using custom top-level assembler
that references the init function. With LTO, the top-level assembler
statement can end up in a different assembler file than the init function,
which then causes linker errors if the init function is static.

This could be fixed by making all the init functions global, but that
would be a very intrusive change all over the tree.

Instead, disable relative init pointers for gcc LTO.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/Kconfig | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 67745ceab0db..6455d843d559 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -176,7 +176,9 @@ config X86
 	select HAVE_ARCH_MMAP_RND_BITS		if MMU
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if MMU && COMPAT
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
-	select HAVE_ARCH_PREL32_RELOCATIONS
+	# LTO can move assembler to different files, so all
+	# the init functions would need to be global for this to work
+	select HAVE_ARCH_PREL32_RELOCATIONS	if !LTO_GCC
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_STACKLEAK
-- 
2.38.1


* [PATCH 40/46] x86/livepatch, lto: Disable live patching with gcc LTO
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (38 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 39/46] x86, lto: Disable relative init pointers with gcc LTO Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 19:07   ` Josh Poimboeuf
  2022-11-17 20:00   ` Song Liu
  2022-11-14 11:43 ` [PATCH 41/46] x86/lib, lto: Mark 32bit mem{cpy,move,set} as __used Jiri Slaby (SUSE)
                   ` (7 subsequent siblings)
  47 siblings, 2 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Josh Poimboeuf, Jiri Kosina, Miroslav Benes,
	Petr Mladek, Joe Lawrence, live-patching, Andi Kleen,
	Martin Liska, Jiri Slaby

From: Andi Kleen <andi@firstfloor.org>

Live patching is not supported by gcc 12 in LTO mode so far, so enabling
it causes compiler "sorry" messages.

Other than the compiler support, there shouldn't be any barriers for
live patching LTOed kernels, although it might be more difficult to
create patches for larger functions.

Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: live-patching@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/livepatch/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/livepatch/Kconfig b/kernel/livepatch/Kconfig
index 53d51ed619a3..22699adc39a6 100644
--- a/kernel/livepatch/Kconfig
+++ b/kernel/livepatch/Kconfig
@@ -12,6 +12,7 @@ config LIVEPATCH
 	depends on KALLSYMS_ALL
 	depends on HAVE_LIVEPATCH
 	depends on !TRIM_UNUSED_KSYMS
+	depends on !LTO_GCC # not supported in gcc
 	help
 	  Say Y here if you want to support kernel live patching.
 	  This option has no runtime impact until a kernel "patch"
-- 
2.38.1


* [PATCH 41/46] x86/lib, lto: Mark 32bit mem{cpy,move,set} as __used
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (39 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 40/46] x86/livepatch, lto: Disable live patching " Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 42/46] mm/kasan, lto: Mark kasan " Jiri Slaby (SUSE)
                   ` (6 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

gcc doesn't always recognize that memcpy/set/move called through
__builtins are referenced because the reference happens too late in the
RTL expansion phase. This can make LTO drop them, leading to
undefined symbols. Mark them as __used to avoid that.

This is only needed on 32bit; on 64bit they are implemented in assembler anyway.
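
As a rough illustration of the annotation described above, here is a minimal
userspace sketch. It assumes the kernel's __used expands to gcc's "used"
attribute; demo_used, demo_memcpy() and demo_memcpy_impl() are hypothetical
names made up for this example and are not part of the patch.

```c
#include <stddef.h>

/* Assumption: userspace spelling of what the kernel calls __used. */
#define demo_used __attribute__((__used__))

/* Hypothetical stand-in for the arch-specific __memcpy(). */
static void *demo_memcpy_impl(void *to, const void *from, size_t n)
{
	unsigned char *d = to;
	const unsigned char *s = from;

	while (n--)
		*d++ = *s++;
	return to;
}

/*
 * The attribute asks the compiler to emit this function even if it
 * sees no reference to it, so a call generated late (for example from
 * a builtin expansion) can still be resolved at link time.
 */
demo_used void *demo_memcpy(void *to, const void *from, size_t n)
{
	return demo_memcpy_impl(to, from, n);
}
```

Calling demo_memcpy(dst, src, len) behaves like a plain byte copy; the
attribute only affects whether the symbol survives, not what it does.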

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/lib/memcpy_32.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/memcpy_32.c b/arch/x86/lib/memcpy_32.c
index ef3af7ff2c8a..53fa1cac79d1 100644
--- a/arch/x86/lib/memcpy_32.c
+++ b/arch/x86/lib/memcpy_32.c
@@ -6,19 +6,19 @@
 #undef memset
 #undef memmove
 
-__visible void *memcpy(void *to, const void *from, size_t n)
+__used __visible void *memcpy(void *to, const void *from, size_t n)
 {
 	return __memcpy(to, from, n);
 }
 EXPORT_SYMBOL(memcpy);
 
-__visible void *memset(void *s, int c, size_t count)
+__used __visible void *memset(void *s, int c, size_t count)
 {
 	return __memset(s, c, count);
 }
 EXPORT_SYMBOL(memset);
 
-__visible void *memmove(void *dest, const void *src, size_t n)
+__used __visible void *memmove(void *dest, const void *src, size_t n)
 {
 	int d0,d1,d2,d3,d4,d5;
 	char *ret = dest;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 42/46] mm/kasan, lto: Mark kasan mem{cpy,move,set} as __used
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (40 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 41/46] x86/lib, lto: Mark 32bit mem{cpy,move,set} as __used Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 43/46] scripts, lto: check C symbols for modversions Jiri Slaby (SUSE)
                   ` (5 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Martin Liska, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
	Andrew Morton, kasan-dev, linux-mm, Jiri Slaby

From: Martin Liska <mliska@suse.cz>

gcc doesn't always recognize that memcpy/set/move called through
__builtins are referenced because the reference happens too late in the
RTL expansion phase. This can make LTO drop them, leading to
undefined symbols. Mark them as __used to avoid that.

Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 mm/kasan/shadow.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 0e3648b603a6..94c98feea9c8 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -39,7 +39,7 @@ bool __kasan_check_write(const volatile void *p, unsigned int size)
 EXPORT_SYMBOL(__kasan_check_write);
 
 #undef memset
-void *memset(void *addr, int c, size_t len)
+__used void *memset(void *addr, int c, size_t len)
 {
 	if (!kasan_check_range((unsigned long)addr, len, true, _RET_IP_))
 		return NULL;
@@ -49,7 +49,7 @@ void *memset(void *addr, int c, size_t len)
 
 #ifdef __HAVE_ARCH_MEMMOVE
 #undef memmove
-void *memmove(void *dest, const void *src, size_t len)
+__used void *memmove(void *dest, const void *src, size_t len)
 {
 	if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) ||
 	    !kasan_check_range((unsigned long)dest, len, true, _RET_IP_))
@@ -60,7 +60,7 @@ void *memmove(void *dest, const void *src, size_t len)
 #endif
 
 #undef memcpy
-void *memcpy(void *dest, const void *src, size_t len)
+__used void *memcpy(void *dest, const void *src, size_t len)
 {
 	if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) ||
 	    !kasan_check_range((unsigned long)dest, len, true, _RET_IP_))
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 43/46] scripts, lto: check C symbols for modversions
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (41 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 42/46] mm/kasan, lto: Mark kasan " Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 44/46] scripts/bloat-o-meter, lto: handle gcc LTO Jiri Slaby (SUSE)
                   ` (4 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Masahiro Yamada, Michal Marek, Nick Desaulniers,
	linux-kbuild, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

The gcc LTO nm doesn't output assembler symbols, which makes the
symversions check fail because ksymtab is defined in assembler. Instead,
check for a C symbol that is generated too.

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 scripts/Makefile.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 9b522c9efcb6..dafa8aeed9c2 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -173,7 +173,7 @@ ifdef CONFIG_MODVERSIONS
 #   be compiled and linked to the kernel and/or modules.
 
 gen_symversions =								\
-	if $(NM) $@ 2>/dev/null | grep -q __ksymtab; then			\
+	if $(NM) $@ 2>/dev/null | grep -q __kstrtab; then			\
 		$(call cmd_gensymtypes_$(1),$(KBUILD_SYMTYPES),$(@:.o=.symtypes)) \
 			>> $(dot-target).cmd;					\
 	fi
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 44/46] scripts/bloat-o-meter, lto: handle gcc LTO
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (42 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 43/46] scripts, lto: check C symbols for modversions Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:43 ` [PATCH 45/46] kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned Jiri Slaby (SUSE)
                   ` (3 subsequent siblings)
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Masahiro Yamada, Michal Marek, Nick Desaulniers,
	linux-kbuild, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

gcc LTO can add .lto_priv suffixes to symbols. Ignore those in
bloat-o-meter to allow comparison of non-LTO with LTO kernels.
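
To make the intended name normalization concrete, here is a small sketch of
the same idea in C (the script itself is Python). It assumes that re_NUMBER
matches a dot followed by one or more digits; demo_normalize_symbol() is a
hypothetical helper invented for this illustration.

```c
#include <ctype.h>
#include <string.h>

/*
 * Strip every ".<digits>" run and then every ".lto_priv" substring from
 * a symbol name, in place, so "foo.lto_priv.0" and "foo" compare equal.
 */
static void demo_normalize_symbol(char *name)
{
	static const char lto[] = ".lto_priv";
	const size_t lto_len = sizeof(lto) - 1;
	char *src = name;
	char *dst = name;

	/* Pass 1: drop ".<digits>" (assumed behaviour of re_NUMBER). */
	while (*src) {
		if (src[0] == '.' && isdigit((unsigned char)src[1])) {
			src++;
			while (isdigit((unsigned char)*src))
				src++;
			continue;
		}
		*dst++ = *src++;
	}
	*dst = '\0';

	/* Pass 2: drop the ".lto_priv" suffix added by gcc LTO. */
	src = dst = name;
	while (*src) {
		if (strncmp(src, lto, lto_len) == 0) {
			src += lto_len;
			continue;
		}
		*dst++ = *src++;
	}
	*dst = '\0';
}
```

The order of the two passes follows the patched line: the numeric suffix is
removed first, then the ".lto_priv" part.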

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 scripts/bloat-o-meter | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/bloat-o-meter b/scripts/bloat-o-meter
index f9553f60a14a..ab994b3bf6e2 100755
--- a/scripts/bloat-o-meter
+++ b/scripts/bloat-o-meter
@@ -45,7 +45,7 @@ def getsizes(file, format):
                 if name == "linux_banner": continue
                 if name == "vermagic": continue
                 # statics and some other optimizations adds random .NUMBER
-                name = re_NUMBER.sub('', name)
+                name = re_NUMBER.sub('', name).replace(".lto_priv", "")
                 sym[name] = sym.get(name, 0) + int(size, 16)
     return sym
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 45/46] kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (43 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 44/46] scripts/bloat-o-meter, lto: handle gcc LTO Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-26 17:07   ` Andrey Konovalov
  2022-11-14 11:43 ` [PATCH 46/46] x86, lto: Finally enable gcc LTO for x86 Jiri Slaby (SUSE)
                   ` (2 subsequent siblings)
  47 siblings, 1 reply; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Martin Liska, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
	Andrew Morton, kasan-dev, linux-mm, Jiri Slaby

From: Martin Liska <mliska@suse.cz>

The function memory_is_poisoned() has to handle any size, and with LTO a
constant size can be propagated into it later on. So we can end up with a
constant that is not handled in the switch. Thus just break and call
memory_is_poisoned_n(), which handles an arbitrary size, to avoid build
errors with gcc LTO.
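
The control flow after this change can be sketched in userspace roughly as
follows. This is a simplified model, not the KASAN code: the demo_* helpers
are hypothetical and treat a byte value of 0xff as "poisoned" instead of
consulting shadow memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generic check: handles any size. */
static bool demo_poisoned_n(const unsigned char *addr, size_t size)
{
	size_t i;

	for (i = 0; i < size; i++)
		if (addr[i] == 0xff)
			return true;
	return false;
}

/* Hypothetical fast path for a few small, fixed sizes. */
static bool demo_poisoned_small(const unsigned char *addr, size_t size)
{
	return demo_poisoned_n(addr, size);
}

static inline bool demo_memory_is_poisoned(const unsigned char *addr,
					    size_t size)
{
	if (__builtin_constant_p(size)) {
		switch (size) {
		case 1:
		case 2:
		case 4:
		case 8:
			return demo_poisoned_small(addr, size);
		default:
			/*
			 * A constant with no dedicated case: do not fail
			 * the build, fall back to the generic path below.
			 */
			break;
		}
	}

	return demo_poisoned_n(addr, size);
}
```

With a size such as 3, which has no dedicated case, the call simply ends up
in the generic helper whether or not the compiler sees it as a constant.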

Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 mm/kasan/generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index d8b5590f9484..d261f83c6687 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -152,7 +152,7 @@ static __always_inline bool memory_is_poisoned(unsigned long addr, size_t size)
 		case 16:
 			return memory_is_poisoned_16(addr);
 		default:
-			BUILD_BUG();
+			break;
 		}
 	}
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 46/46] x86, lto: Finally enable gcc LTO for x86
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (44 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 45/46] kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned Jiri Slaby (SUSE)
@ 2022-11-14 11:43 ` Jiri Slaby (SUSE)
  2022-11-14 11:56 ` [PATCH 00/46] gcc-LTO support for the kernel Ard Biesheuvel
  2022-11-14 19:40 ` Ard Biesheuvel
  47 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby (SUSE) @ 2022-11-14 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, Martin Liska, Jiri Slaby

From: Andi Kleen <ak@linux.intel.com>

Now that everything is in place, allow gcc LTO for the x86 build.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6455d843d559..2c96facf4a42 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -112,6 +112,7 @@ config X86
 	select ARCH_USES_CFI_TRAPS		if X86_64 && CFI_CLANG
 	select ARCH_SUPPORTS_LTO_CLANG
 	select ARCH_SUPPORTS_LTO_CLANG_THIN
+	select ARCH_SUPPORTS_LTO_GCC
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_MEMTEST
 	select ARCH_USE_QUEUED_RWLOCKS
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (45 preceding siblings ...)
  2022-11-14 11:43 ` [PATCH 46/46] x86, lto: Finally enable gcc LTO for x86 Jiri Slaby (SUSE)
@ 2022-11-14 11:56 ` Ard Biesheuvel
  2022-11-14 12:04   ` Jiri Slaby
  2022-11-14 19:40 ` Ard Biesheuvel
  47 siblings, 1 reply; 87+ messages in thread
From: Ard Biesheuvel @ 2022-11-14 11:56 UTC (permalink / raw)
  To: Jiri Slaby (SUSE), Borislav Petkov
  Cc: linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Daniel Borkmann, Daniel Bristot de Oliveira, Dave Hansen,
	Dietmar Eggemann, Dmitry Vyukov, Don Zickus, Hao Luo, H . J . Lu,
	H. Peter Anvin, Huang Rui, Ingo Molnar, Jan Hubicka, Jason Baron,
	Jiri Kosina, Jiri Olsa, Joe Lawrence, John Fastabend,
	Josh Poimboeuf, Juergen Gross, Juri Lelli, KP Singh,
	Mark Rutland, Martin KaFai Lau, Martin Liska, Masahiro Yamada,
	Mel Gorman, Miguel Ojeda, Michal Marek, Miroslav Benes,
	Namhyung Kim, Nick Desaulniers, Oleksandr Tyshchenko,
	Peter Zijlstra, Petr Mladek, Rafael J. Wysocki, Richard Biener,
	Sedat Dilek, Song Liu, Stanislav Fomichev, Stefano Stabellini,
	Steven Rostedt, Thomas Gleixner, Valentin Schneider,
	Vincent Guittot, Vincenzo Frascino, Viresh Kumar,
	VMware PV-Drivers Reviewers, Yonghong Song

On Mon, 14 Nov 2022 at 12:44, Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>
> Hi,
>
> this is the first call for comments (and kbuild complaints) for this
> support of gcc (full) LTO in the kernel. Most of the patches come from
> Andi. Me and Martin rebased them to new kernels and fixed the to-use
> known issues. Also I updated most of the commit logs and reordered the
> patches to groups of patches with similar intent.
>
> The very first patch comes from Alexander and is pending on some x86
> queue already (I believe). I am attaching it only for completeness.
> Without that, the kernel does not boot (LTO reorders a lot).
>

You didn't cc me on that patch so I will reply here: I don't think
this is the right solution.
On x86, there is a lot of stuff injected into .head.text that simply
does not belong there, and getting rid of the __head annotation and
dropping __HEAD from the Xen pvh head.S file would be a much better
solution.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-14 11:56 ` [PATCH 00/46] gcc-LTO support for the kernel Ard Biesheuvel
@ 2022-11-14 12:04   ` Jiri Slaby
  0 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby @ 2022-11-14 12:04 UTC (permalink / raw)
  To: Ard Biesheuvel, Borislav Petkov
  Cc: linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Daniel Borkmann, Daniel Bristot de Oliveira, Dave Hansen,
	Dietmar Eggemann, Dmitry Vyukov, Don Zickus, Hao Luo, H . J . Lu,
	H. Peter Anvin, Huang Rui, Ingo Molnar, Jan Hubicka, Jason Baron,
	Jiri Kosina, Jiri Olsa, Joe Lawrence, John Fastabend,
	Josh Poimboeuf, Juergen Gross, Juri Lelli, KP Singh,
	Mark Rutland, Martin KaFai Lau, Martin Liska, Masahiro Yamada,
	Mel Gorman, Miguel Ojeda, Michal Marek, Miroslav Benes,
	Namhyung Kim, Nick Desaulniers, Oleksandr Tyshchenko,
	Peter Zijlstra, Petr Mladek, Rafael J. Wysocki, Richard Biener,
	Sedat Dilek, Song Liu, Stanislav Fomichev, Stefano Stabellini,
	Steven Rostedt, Thomas Gleixner, Valentin Schneider,
	Vincent Guittot, Vincenzo Frascino, Viresh Kumar,
	VMware PV-Drivers Reviewers, Yonghong Song

On 14. 11. 22, 12:56, Ard Biesheuvel wrote:
> On Mon, 14 Nov 2022 at 12:44, Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>>
>> Hi,
>>
>> this is the first call for comments (and kbuild complaints) for this
>> support of gcc (full) LTO in the kernel. Most of the patches come from
>> Andi. Me and Martin rebased them to new kernels and fixed the to-use
>> known issues. Also I updated most of the commit logs and reordered the
>> patches to groups of patches with similar intent.
>>
>> The very first patch comes from Alexander and is pending on some x86
>> queue already (I believe). I am attaching it only for completeness.
>> Without that, the kernel does not boot (LTO reorders a lot).
>>
> 
> You didn't cc me on that patch so I will reply here: I don't think
> this is the right solution.
> On x86, there is a lot of stuff injected into .head.text that simply
> does not belong there, and getting rid of the __head annotation and
> dropping __HEAD from the Xen pvh head.S file would be a much better
> solution.

I think Alexander was working on that too, but I'm not sure. Anyway, we
still have the other fix: putting startup_64() into a special section
and placing that section at the beginning of vmlinux using lds. (Until
.head.text is completely gone for good -- same as on arm, you wrote
somewhere.)

In any case, that patch was added only for reference, if anyone wants to 
give the series a try. Next time, I can attach the other workaround ;).

I don't expect anyone will take the series as is. There will be a lot of 
comments, I suppose. Hence many re-spins...

thanks,
-- 
js
suse labs


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 08/46] static_call, lto: Mark static keys as __visible
  2022-11-14 11:43 ` [PATCH 08/46] static_call, lto: Mark static keys " Jiri Slaby (SUSE)
@ 2022-11-14 15:51   ` Peter Zijlstra
  2022-11-14 18:52     ` Josh Poimboeuf
  2022-11-14 20:34     ` Andi Kleen
  2022-11-14 18:57   ` Josh Poimboeuf
  1 sibling, 2 replies; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-14 15:51 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Josh Poimboeuf, Jason Baron,
	Steven Rostedt, Ard Biesheuvel, Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 12:43:06PM +0100, Jiri Slaby (SUSE) wrote:
> From: Andi Kleen <andi@firstfloor.org>
> 
> Symbols referenced from assembler (either directly or e.g. from
> DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> they could end up in a different object file than the assembler. This
> can lead to linker errors without this patch.
> 
> So mark static call functions as __visible, namely static keys here.

Why doesn't llvm-lto need this?

Also, why am I getting a random selection of the patchset?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 10/46] static_call, lto: Mark func_a() as __visible_on_lto
  2022-11-14 11:43 ` [PATCH 10/46] static_call, lto: Mark func_a() as __visible_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 15:54   ` Peter Zijlstra
  2022-11-14 20:29     ` Andi Kleen
  0 siblings, 1 reply; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-14 15:54 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Josh Poimboeuf, Jason Baron,
	Steven Rostedt, Ard Biesheuvel, Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 12:43:08PM +0100, Jiri Slaby (SUSE) wrote:

> -static int func_a(int x)
> +__visible_on_lto int sc_func_a(int x)

>  } static_call_data [] __initdata = {
>        { NULL,   2, 3 },
>        { func_b, 2, 4 },
> -      { func_a, 2, 3 }
> +      { sc_func_a, 2, 3 }
>  };

I must say I really hate this. Also, with address taken, it still
eliminates it?

This whole GCC-LTO sounds sub-par.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 12/46] x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto
  2022-11-14 11:43 ` [PATCH 12/46] x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 15:58   ` Peter Zijlstra
  0 siblings, 0 replies; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-14 15:58 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Juergen Gross, Alexey Makhalov,
	VMware PV-Drivers Reviewers, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86, Martin Liska,
	Jiri Slaby

On Mon, Nov 14, 2022 at 12:43:10PM +0100, Jiri Slaby (SUSE) wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> Symbols referenced from assembler (either directly or e.g. from
> DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> they could end up in a different object file than the assembler. This
> can lead to linker errors without this patch.

> @@ -120,7 +120,7 @@ unsigned int paravirt_patch(u8 type, void *insn_buff, unsigned long addr,
>  struct static_key paravirt_steal_enabled;
>  struct static_key paravirt_steal_rq_enabled;
>  
> -static u64 native_steal_clock(int cpu)
> +__visible_on_lto u64 native_steal_clock(int cpu)

More hate; same reason, DEFINE_STATIC_CALL() takes the function address
and stuffs it in a variable, WTF is GCC-LTO eliminating it?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 14/46] x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto
  2022-11-14 11:43 ` [PATCH 14/46] x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto Jiri Slaby (SUSE)
@ 2022-11-14 16:02   ` Peter Zijlstra
  0 siblings, 0 replies; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-14 16:02 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Martin Liska, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86, Jiri Slaby

On Mon, Nov 14, 2022 at 12:43:12PM +0100, Jiri Slaby (SUSE) wrote:
> From: Martin Liska <mliska@suse.cz>
> 
> Symbols referenced from assembler (either directly or e.g. from
> DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> they could end up in a different object file than the assembler. This
> can lead to linker errors without this patch.
> 
> So mark cpuid_table_copy as __visible_on_lto.
> 
> [js] use __visible_on_lto
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Signed-off-by: Martin Liska <mliska@suse.cz>
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> ---
>  arch/x86/kernel/sev-shared.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index 3a5b0c9c4fcc..554da8aabfc7 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -64,7 +64,7 @@ struct snp_cpuid_table {
>  static u16 ghcb_version __ro_after_init;
>  
>  /* Copy of the SNP firmware's CPUID page. */
> -static struct snp_cpuid_table cpuid_table_copy __ro_after_init;
> +__visible_on_lto struct snp_cpuid_table cpuid_table_copy __ro_after_init;

Same again, address is taken (and passed into inline asm). Must not be
eliminated.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 37/46] Compiler attributes, lto: disable __flatten with LTO
  2022-11-14 11:43 ` [PATCH 37/46] Compiler attributes, lto: disable __flatten with LTO Jiri Slaby (SUSE)
@ 2022-11-14 17:01   ` Miguel Ojeda
  0 siblings, 0 replies; 87+ messages in thread
From: Miguel Ojeda @ 2022-11-14 17:01 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Miguel Ojeda, Nick Desaulniers,
	Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 12:45 PM Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>
> +#ifndef CONFIG_LTO_GCC
>  # define __flatten                     __attribute__((flatten))
> +#else
> +/* Causes very large memory use with gcc in LTO mode */
> +# define __flatten
> +#endif

Currently, this header avoids attributes that depend on configuration
options on purpose (see the comment at the top), so it would be best
to move it elsewhere, e.g. `compiler_types.h`.

Though I feel bad about having to move this attribute out since it is
just that config option compared to other more involved bits in
`compiler_types.h`... :(

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 02/46] kbuild: pass jobserver to cmd_ld_vmlinux.o
  2022-11-14 11:43 ` [PATCH 02/46] kbuild: pass jobserver to cmd_ld_vmlinux.o Jiri Slaby (SUSE)
@ 2022-11-14 17:57   ` Masahiro Yamada
  2022-11-15  6:36     ` Jiri Slaby
  0 siblings, 1 reply; 87+ messages in thread
From: Masahiro Yamada @ 2022-11-14 17:57 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Jiri Slaby, Sedat Dilek, Michal Marek,
	Nick Desaulniers, Martin Liska

On Mon, Nov 14, 2022 at 8:44 PM Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>
> From: Jiri Slaby <jslaby@suse.cz>
>
> Until the link-vmlinux.sh split (cf. the commit below), the linker was
> run with jobserver set in MAKEFLAGS. After the split, the command in
> Makefile.vmlinux_o is not prefixed by "+" anymore, so this information
> is lost.
>
> Restore it as linkers working in parallel (namely gcc LTO) make a use of
> it. Actually, they complain, if jobserver is not set:
>   lto-wrapper: warning: jobserver is not available: '--jobserver-auth=' is not present in 'MAKEFLAGS'
>
> Fixes: 5d45950dfbb1 (kbuild: move vmlinux.o link to scripts/Makefile.vmlinux_o)


This Fixes tag is wrong, since GCC LTO is not in the upstream code.

> Cc: Sedat Dilek <sedat.dilek@gmail.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Michal Marek <michal.lkml@markovi.net>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Martin Liska <mliska@suse.cz>
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> ---
>  scripts/Makefile.vmlinux_o | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/scripts/Makefile.vmlinux_o b/scripts/Makefile.vmlinux_o
> index 0edfdb40364b..1c86895cfcf8 100644
> --- a/scripts/Makefile.vmlinux_o
> +++ b/scripts/Makefile.vmlinux_o
> @@ -58,7 +58,7 @@ define rule_ld_vmlinux.o
>  endef
>
>  vmlinux.o: $(initcalls-lds) vmlinux.a $(KBUILD_VMLINUX_LIBS) FORCE
> -       $(call if_changed_rule,ld_vmlinux.o)
> +       +$(call if_changed_rule,ld_vmlinux.o)
>
>  targets += vmlinux.o
>
> --
> 2.38.1
>


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 03/46] kbuild: lto: preserve MAKEFLAGS for module linking
  2022-11-14 11:43 ` [PATCH 03/46] kbuild: lto: preserve MAKEFLAGS for module linking Jiri Slaby (SUSE)
@ 2022-11-14 18:02   ` Masahiro Yamada
  0 siblings, 0 replies; 87+ messages in thread
From: Masahiro Yamada @ 2022-11-14 18:02 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Martin Liska, Sedat Dilek, Michal Marek,
	Nick Desaulniers, Jiri Slaby

On Mon, Nov 14, 2022 at 8:44 PM Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>
> From: Martin Liska <mliska@suse.cz>
>
> Prefix cc_o_c and ld_multi_m commands in makefile in order to preserve
> access to jobserver. This is needed for gcc LTO at least (enabled in
> later patches in this series). Note that both commands can invoke the
> linker (ld_single_m in the former case).
>
> Fixes this warning:
> lto-wrapper: warning: jobserver is not available: ‘--jobserver-auth=’ is not present in ‘MAKEFLAGS’
>
> Cc: Sedat Dilek <sedat.dilek@gmail.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Michal Marek <michal.lkml@markovi.net>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Fixes: 5d45950dfbb1 (kbuild: move vmlinux.o link to scripts/Makefile.vmlinux_o)


Same as 02.

Also, 5d45950dfbb1 did not touch scripts/Makefile.build at all.
Please stop adding random, wrong Fixes tags.

Make already compiles many files in parallel.
It does not make sense to request a jobserver for
a single C file compilation.

Is there any way to turn off this annoyance?

> Signed-off-by: Martin Liska <mliska@suse.cz>
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> ---
>  scripts/Makefile.build | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 41f3602fc8de..564a20ce2667 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -247,7 +247,7 @@ endef
>
>  # Built-in and composite module parts
>  $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
> -       $(call if_changed_rule,cc_o_c)
> +       +$(call if_changed_rule,cc_o_c)
>         $(call cmd,force_checksrc)
>
>  # To make this rule robust against "Argument list too long" error,
> @@ -457,7 +457,7 @@ endef
>  $(multi-obj-m): objtool-enabled := $(delay-objtool)
>  $(multi-obj-m): part-of-module := y
>  $(multi-obj-m): %.o: %.mod FORCE
> -       $(call if_changed_rule,ld_multi_m)
> +       +$(call if_changed_rule,ld_multi_m)
>  $(call multi_depend, $(multi-obj-m), .o, -objs -y -m)
>
>  # Add intermediate targets:
> --
> 2.38.1
>


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 08/46] static_call, lto: Mark static keys as __visible
  2022-11-14 15:51   ` Peter Zijlstra
@ 2022-11-14 18:52     ` Josh Poimboeuf
  2022-11-14 20:34     ` Andi Kleen
  1 sibling, 0 replies; 87+ messages in thread
From: Josh Poimboeuf @ 2022-11-14 18:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andi Kleen, Jason Baron, Steven Rostedt,
	Ard Biesheuvel, Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 04:51:07PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 14, 2022 at 12:43:06PM +0100, Jiri Slaby (SUSE) wrote:
> > From: Andi Kleen <andi@firstfloor.org>
> > 
> > Symbols referenced from assembler (either directly or e.g. from
> > DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> > they could end up in a different object file than the assembler. This
> > can lead to linker errors without this patch.
> > 
> > So mark static call functions as __visible, namely static keys here.
> 
> Why doesn't llvm-lto need this?
> 
> Also, why am I getting a random selection of the patchset?

Same, please Cc me on the whole set next time.

-- 
Josh

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 30/46] Kbuild, lto: Add Link Time Optimization support
  2022-11-14 11:43 ` [PATCH 30/46] Kbuild, lto: Add Link Time Optimization support Jiri Slaby (SUSE)
@ 2022-11-14 18:55   ` Josh Poimboeuf
  2022-11-15 13:31     ` Martin Liška
  0 siblings, 1 reply; 87+ messages in thread
From: Josh Poimboeuf @ 2022-11-14 18:55 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, linux-kbuild, Richard Biener, Jan Hubicka,
	H . J . Lu, Don Zickus, Martin Liska, Bagas Sanjaya, Jiri Slaby

On Mon, Nov 14, 2022 at 12:43:28PM +0100, Jiri Slaby (SUSE) wrote:
> +++ b/Documentation/kbuild/lto-build.rst
> @@ -0,0 +1,76 @@
> +=====================================================
> +gcc link time optimization (LTO) for the Linux kernel
> +=====================================================
> +
> +Link Time Optimization allows the compiler to optimize the complete program
> +instead of just each file.
> +
> +The compiler can inline functions between files and do various other global
> +optimizations, like specializing functions for common parameters,
> +determining when global variables are clobbered, making functions pure/const,
> +propagating constants globally, removing unneeded data and others.
> +
> +It will also drop unused functions which can make the kernel
> +image smaller in some circumstances, in particular for small kernel
> +configurations.
> +
> +For small monolithic kernels it can throw away unused code very effectively
> +(especially when modules are disabled) and usually shrinks
> +the code size.
> +
> +Build time and memory consumption at build time will increase, depending
> +on the size of the largest binary. Modular kernels are less affected.
> +With LTO incremental builds are less incremental, as always the whole
> +binary needs to be re-optimized (but not re-parsed)
> +
> +Oopses can be somewhat more difficult to read, due to the more aggressive
> +inlining: it helps to use scripts/faddr2line.
> +
> +It is currently incompatible with live patching.

... because ?

-- 
Josh

* Re: [PATCH 08/46] static_call, lto: Mark static keys as __visible
  2022-11-14 11:43 ` [PATCH 08/46] static_call, lto: Mark static keys " Jiri Slaby (SUSE)
  2022-11-14 15:51   ` Peter Zijlstra
@ 2022-11-14 18:57   ` Josh Poimboeuf
  1 sibling, 0 replies; 87+ messages in thread
From: Josh Poimboeuf @ 2022-11-14 18:57 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Peter Zijlstra, Jason Baron,
	Steven Rostedt, Ard Biesheuvel, Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 12:43:06PM +0100, Jiri Slaby (SUSE) wrote:
> From: Andi Kleen <andi@firstfloor.org>
> 
> Symbols referenced from assembler (either directly or e.f. from
> DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> they could end up in a different object file than the assembler. This
> can lead to linker errors without this patch.
> 
> So mark static call functions as __visible, namely static keys here.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: Jason Baron <jbaron@akamai.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Signed-off-by: Andi Kleen <andi@firstfloor.org>
> Signed-off-by: Martin Liska <mliska@suse.cz>
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> ---
>  include/linux/static_call.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/static_call.h b/include/linux/static_call.h
> index df53bed9d71f..e629ab0c4ca3 100644
> --- a/include/linux/static_call.h
> +++ b/include/linux/static_call.h
> @@ -182,7 +182,7 @@ extern long __static_call_return0(void);
>  
>  #define DEFINE_STATIC_CALL(name, _func)					\
>  	DECLARE_STATIC_CALL(name, _func);				\
> -	struct static_call_key STATIC_CALL_KEY(name) = {		\
> +	__visible struct static_call_key STATIC_CALL_KEY(name) = {		\

Why not __visible_on_lto?

-- 
Josh

* Re: [PATCH 40/46] x86/livepatch, lto: Disable live patching with gcc LTO
  2022-11-14 11:43 ` [PATCH 40/46] x86/livepatch, lto: Disable live patching " Jiri Slaby (SUSE)
@ 2022-11-14 19:07   ` Josh Poimboeuf
  2022-11-14 20:28     ` Andi Kleen
  2022-11-17 20:00   ` Song Liu
  1 sibling, 1 reply; 87+ messages in thread
From: Josh Poimboeuf @ 2022-11-14 19:07 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Jiri Kosina, Miroslav Benes,
	Petr Mladek, Joe Lawrence, live-patching, Andi Kleen,
	Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 12:43:38PM +0100, Jiri Slaby (SUSE) wrote:
> From: Andi Kleen <andi@firstfloor.org>
> 
> It is not supported by gcc 12 so far, so it causes compiler "sorry"
> messages.

What specifically is not supported by GCC 12?  What are the "sorry"
messages?

> Other than the compiler support, there shouldn't be any barriers for
> live patching LTOed kernels, although it might be more difficult to
> create patches for larger functions.

This seems to conflict with the documentation.

> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: Jiri Kosina <jikos@kernel.org>
> Cc: Miroslav Benes <mbenes@suse.cz>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Joe Lawrence <joe.lawrence@redhat.com>
> Cc: live-patching@vger.kernel.org
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> Signed-off-by: Martin Liska <mliska@suse.cz>
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> ---
>  kernel/livepatch/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/livepatch/Kconfig b/kernel/livepatch/Kconfig
> index 53d51ed619a3..22699adc39a6 100644
> --- a/kernel/livepatch/Kconfig
> +++ b/kernel/livepatch/Kconfig
> @@ -12,6 +12,7 @@ config LIVEPATCH
>  	depends on KALLSYMS_ALL
>  	depends on HAVE_LIVEPATCH
>  	depends on !TRIM_UNUSED_KSYMS
> +	depends on !LTO_GCC # not supported in gcc

The comment doesn't help.

-- 
Josh

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
                   ` (46 preceding siblings ...)
  2022-11-14 11:56 ` [PATCH 00/46] gcc-LTO support for the kernel Ard Biesheuvel
@ 2022-11-14 19:40 ` Ard Biesheuvel
  2022-11-17  8:28   ` Peter Zijlstra
  47 siblings, 1 reply; 87+ messages in thread
From: Ard Biesheuvel @ 2022-11-14 19:40 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Peter Zijlstra, Petr Mladek,
	Rafael J. Wysocki, Richard Biener, Sedat Dilek, Song Liu,
	Stanislav Fomichev, Stefano Stabellini, Steven Rostedt,
	Thomas Gleixner, Valentin Schneider, Vincent Guittot,
	Vincenzo Frascino, Viresh Kumar, VMware PV-Drivers Reviewers,
	Yonghong Song

On Mon, 14 Nov 2022 at 12:44, Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>
> Hi,
>
> this is the first call for comments (and kbuild complaints) for this
> support of gcc (full) LTO in the kernel. Most of the patches come from
> Andi. Me and Martin rebased them to new kernels and fixed the to-use
> known issues. Also I updated most of the commit logs and reordered the
> patches to groups of patches with similar intent.
>
> The very first patch comes from Alexander and is pending on some x86
> queue already (I believe). I am attaching it only for completeness.
> Without that, the kernel does not boot (LTO reorders a lot).
>
> In our measurements, the performance differences are negligible.
>
> The kernel is bigger with gcc LTO due to more inlining.

OK, so if I understand this correctly:
- the performance is the same
- the resulting image is bigger
- we need a whole lot of ugly hacks to placate the linker.

Pardon my cynicism, but this cover letter does not mention any
advantages of LTO, so what is the point of all of this?

(On Clang, LTO was needed for CFI, but this is not even the case anymore)

* Re: [PATCH 40/46] x86/livepatch, lto: Disable live patching with gcc LTO
  2022-11-14 19:07   ` Josh Poimboeuf
@ 2022-11-14 20:28     ` Andi Kleen
  2022-11-14 22:00       ` Josh Poimboeuf
  0 siblings, 1 reply; 87+ messages in thread
From: Andi Kleen @ 2022-11-14 20:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andi Kleen, Jiri Kosina, Miroslav Benes,
	Petr Mladek, Joe Lawrence, live-patching, Andi Kleen,
	Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 11:07:42AM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 14, 2022 at 12:43:38PM +0100, Jiri Slaby (SUSE) wrote:
> > From: Andi Kleen <andi@firstfloor.org>
> > 
> > It is not supported by gcc 12 so far, so it causes compiler "sorry"
> > messages.
> 
> What specifically is not supported by GCC 12? 

-fwhole-program and the live patching options are mutually exclusive.
Okay, I suppose it could be handled by disabling -fwhole-program, although
that might limit some optimizations.

> What are the "sorry" messages?

It's an error message from the compiler telling you that something is
not implemented.


-Andi

* Re: [PATCH 10/46] static_call, lto: Mark func_a() as __visible_on_lto
  2022-11-14 15:54   ` Peter Zijlstra
@ 2022-11-14 20:29     ` Andi Kleen
  0 siblings, 0 replies; 87+ messages in thread
From: Andi Kleen @ 2022-11-14 20:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andi Kleen, Josh Poimboeuf, Jason Baron,
	Steven Rostedt, Ard Biesheuvel, Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 04:54:16PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 14, 2022 at 12:43:08PM +0100, Jiri Slaby (SUSE) wrote:
> 
> > -static int func_a(int x)
> > +__visible_on_lto int sc_func_a(int x)
> 
> >  } static_call_data [] __initdata = {
> >        { NULL,   2, 3 },
> >        { func_b, 2, 4 },
> > -      { func_a, 2, 3 }
> > +      { sc_func_a, 2, 3 }
> >  };
> 
> I must say I really hate this. Also, with address taken, it still
> eliminates it?

It doesn't eliminate it, but makes it static, which causes the label to
change, so the assembler reference breaks.
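
To illustrate the point above, here is a minimal user-space sketch, not the
kernel code itself. It assumes x86-64 ELF and GNU as; sc_func_a_ptr and
call_via_asm_ref() are hypothetical names, and __visible is defined locally
as GCC's externally_visible attribute, which is what the kernel macro
expands to. The top-level asm refers to sc_func_a only by name, so the
compiler never sees that reference; the attribute asks GCC to keep the
symbol global under its original name.

```c
/* Local stand-in for the kernel's __visible macro (a GCC attribute). */
#define __visible __attribute__((__externally_visible__))

/*
 * Without the attribute, GCC LTO may turn a function like this into a
 * local (static) one, because every reference it can see is inside the
 * LTO unit. The assembler-level label then changes and the by-name
 * reference in the asm below no longer resolves.
 */
__visible int sc_func_a(int x)
{
	return x + 1;
}

/*
 * Top-level asm: emit one pointer-sized data word holding the address of
 * sc_func_a. To the compiler this is only an opaque string.
 * Assumption: x86-64 ELF target, GNU as syntax.
 */
__asm__(".pushsection .data\n\t"
	".balign 8\n\t"
	".globl sc_func_a_ptr\n"
	"sc_func_a_ptr:\n\t"
	".quad sc_func_a\n\t"
	".popsection\n");

/* C-side declaration of the object defined by the asm above. */
extern int (*sc_func_a_ptr)(int);

/* Hypothetical helper: call sc_func_a through the asm-defined pointer. */
int call_via_asm_ref(int x)
{
	return sc_func_a_ptr(x);
}
```

In a non-LTO build the attribute makes no difference here, since a non-static
function already keeps its global label.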

-Andi

* Re: [PATCH 08/46] static_call, lto: Mark static keys as __visible
  2022-11-14 15:51   ` Peter Zijlstra
  2022-11-14 18:52     ` Josh Poimboeuf
@ 2022-11-14 20:34     ` Andi Kleen
  2022-11-17  8:24       ` Peter Zijlstra
  1 sibling, 1 reply; 87+ messages in thread
From: Andi Kleen @ 2022-11-14 20:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andi Kleen, Josh Poimboeuf, Jason Baron,
	Steven Rostedt, Ard Biesheuvel, Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 04:51:07PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 14, 2022 at 12:43:06PM +0100, Jiri Slaby (SUSE) wrote:
> > From: Andi Kleen <andi@firstfloor.org>
> > 
> > Symbols referenced from assembler (either directly or e.f. from
> > DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> > they could end up in a different object file than the assembler. This
> > can lead to linker errors without this patch.
> > 
> > So mark static call functions as __visible, namely static keys here.
> 
> Why doesn't llvm-lto need this?

It has an integrated assembler that can feed this information to the LTO
symbol table, while gas cannot do that.

There was some discussion to extend the gcc top level asm syntax to 
express external symbols, but so far it doesn't exist.

> 
> Also, why am I getting a random selection of the patchset?

Me too.

-Andi


* Re: [PATCH 40/46] x86/livepatch, lto: Disable live patching with gcc LTO
  2022-11-14 20:28     ` Andi Kleen
@ 2022-11-14 22:00       ` Josh Poimboeuf
  2022-11-15 13:32         ` Martin Liška
  0 siblings, 1 reply; 87+ messages in thread
From: Josh Poimboeuf @ 2022-11-14 22:00 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, Andi Kleen, Martin Liska,
	Jiri Slaby

On Mon, Nov 14, 2022 at 12:28:09PM -0800, Andi Kleen wrote:
> On Mon, Nov 14, 2022 at 11:07:42AM -0800, Josh Poimboeuf wrote:
> > On Mon, Nov 14, 2022 at 12:43:38PM +0100, Jiri Slaby (SUSE) wrote:
> > > From: Andi Kleen <andi@firstfloor.org>
> > > 
> > > It is not supported by gcc 12 so far, so it causes compiler "sorry"
> > > messages.
> > 
> > What specifically is not supported by GCC 12? 
> 
> -fwhole-program and the live patching options are mutually exclusive.

What live patching options are you referring to?

-- 
Josh

* Re: [PATCH 02/46] kbuild: pass jobserver to cmd_ld_vmlinux.o
  2022-11-14 17:57   ` Masahiro Yamada
@ 2022-11-15  6:36     ` Jiri Slaby
  0 siblings, 0 replies; 87+ messages in thread
From: Jiri Slaby @ 2022-11-15  6:36 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-kernel, Jiri Slaby, Sedat Dilek, Michal Marek,
	Nick Desaulniers, Martin Liska

On 14. 11. 22, 18:57, Masahiro Yamada wrote:
> On Mon, Nov 14, 2022 at 8:44 PM Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>>
>> From: Jiri Slaby <jslaby@suse.cz>
>>
>> Until the link-vmlinux.sh split (cf. the commit below), the linker was
>> run with jobserver set in MAKEFLAGS. After the split, the command in
>> Makefile.vmlinux_o is not prefixed by "+" anymore, so this information
>> is lost.
>>
>> Restore it as linkers working in parallel (namely gcc LTO) make use of
>> it. Actually, they complain if the jobserver is not set:
>>    lto-wrapper: warning: jobserver is not available: '--jobserver-auth=' is not present in 'MAKEFLAGS'
>>
>> Fixes: 5d45950dfbb1 (kbuild: move vmlinux.o link to scripts/Makefile.vmlinux_o)
> 
> 
> This Fixes is wrong since GCC LTO is not in upstream code.

Yup, this is a left-over. Now dropped from both.

thanks,
-- 
js


* Re: [PATCH 30/46] Kbuild, lto: Add Link Time Optimization support
  2022-11-14 18:55   ` Josh Poimboeuf
@ 2022-11-15 13:31     ` Martin Liška
  0 siblings, 0 replies; 87+ messages in thread
From: Martin Liška @ 2022-11-15 13:31 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, linux-kbuild, Richard Biener, Jan Hubicka,
	H . J . Lu, Don Zickus, Bagas Sanjaya, Jiri Slaby

On 11/14/22 19:55, Josh Poimboeuf wrote:
> On Mon, Nov 14, 2022 at 12:43:28PM +0100, Jiri Slaby (SUSE) wrote:
>> +++ b/Documentation/kbuild/lto-build.rst
>> @@ -0,0 +1,76 @@
>> +=====================================================
>> +gcc link time optimization (LTO) for the Linux kernel
>> +=====================================================
>> +
>> +Link Time Optimization allows the compiler to optimize the complete program
>> +instead of just each file.
>> +
>> +The compiler can inline functions between files and do various other global
>> +optimizations, like specializing functions for common parameters,
>> +determining when global variables are clobbered, making functions pure/const,
>> +propagating constants globally, removing unneeded data and others.
>> +
>> +It will also drop unused functions which can make the kernel
>> +image smaller in some circumstances, in particular for small kernel
>> +configurations.
>> +
>> +For small monolithic kernels it can throw away unused code very effectively
>> +(especially when modules are disabled) and usually shrinks
>> +the code size.
>> +
>> +Build time and memory consumption at build time will increase, depending
>> +on the size of the largest binary. Modular kernels are less affected.
>> +With LTO incremental builds are less incremental, as always the whole
>> +binary needs to be re-optimized (but not re-parsed)
>> +
>> +Oopses can be somewhat more difficult to read, due to the more aggressive
>> +inlining: it helps to use scripts/faddr2line.
>> +
>> +It is currently incompatible with live patching.
> 
> ... because ?

There's no fundamental reason why live patching can't coexist with -flto.

For the -flive-patching=inline-clone option, we removed the sorry message
in the GCC 13.1 release:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1a308905c1baf64d0ea4d09d7d92b55e79a2a339

But it seems Linux does not utilize the option (based on git grep):
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flive-patching

That said, I would remove this limitation: LTO can make the creation of live
patches more complicated, but fundamentally there's no barrier.

Thanks,
Martin

* Re: [PATCH 40/46] x86/livepatch, lto: Disable live patching with gcc LTO
  2022-11-14 22:00       ` Josh Poimboeuf
@ 2022-11-15 13:32         ` Martin Liška
  0 siblings, 0 replies; 87+ messages in thread
From: Martin Liška @ 2022-11-15 13:32 UTC (permalink / raw)
  To: Josh Poimboeuf, Andi Kleen
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, Andi Kleen, Jiri Slaby

On 11/14/22 23:00, Josh Poimboeuf wrote:
> On Mon, Nov 14, 2022 at 12:28:09PM -0800, Andi Kleen wrote:
>> On Mon, Nov 14, 2022 at 11:07:42AM -0800, Josh Poimboeuf wrote:
>>> On Mon, Nov 14, 2022 at 12:43:38PM +0100, Jiri Slaby (SUSE) wrote:
>>>> From: Andi Kleen <andi@firstfloor.org>
>>>>
>>>> It is not supported by gcc 12 so far, so it causes compiler "sorry"
>>>> messages.
>>>
>>> What specifically is not supported by GCC 12? 
>>
>> -fwhole-program and the live patching options are mutually exclusive.
> 
> What live patching options are you referring to?
> 

As mentioned in my other reply, we are talking about this option:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flive-patching

Combining it with -flto fails:

gcc -flto -flive-patching=inline-clone a.c
cc1: sorry, unimplemented: live patching is not supported with LTO

Cheers,
Martin

* Re: [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  2022-11-14 11:43 ` [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible Jiri Slaby (SUSE)
@ 2022-11-16 23:30   ` Thomas Gleixner
  2022-11-17  8:40     ` Peter Zijlstra
  2022-11-22  9:32     ` Jiri Slaby
  0 siblings, 2 replies; 87+ messages in thread
From: Thomas Gleixner @ 2022-11-16 23:30 UTC (permalink / raw)
  To: Jiri Slaby (SUSE), linux-kernel
  Cc: Andi Kleen, Peter Zijlstra, Andy Lutomirski, Martin Liska, Jiri Slaby

On Mon, Nov 14 2022 at 12:43, Jiri Slaby wrote:
> Symbols referenced from assembler (either directly or e.f. from

from assembler? I'm not aware that the assembler references anything.

Also what does e.f. mean? Did you want to write e.g.?

> DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> they could end up in a different object file than the assembler. This

than the assembler? Are we shipping the assembler in an object file?

> can lead to linker errors without this patch.

git grep -i 'this patch' Documentation/process/

> So mark raw_irqentry_exit_cond_resched() as __visible.

And all that tells me what? I know what you want to say, but it's not
there.

  Symbols in different compilation units which are referenced from
  assembly code either directly or indirectly, e.g. from
  DEFINE_STATIC_KEY(), must be marked visible for GCC based LTO builds.

  Add the missing __visible annotation to raw_irqentry_exit_cond_resched().

See?

There is no 'global' because it's obvious that a symbol in a different
compilation unit must be global to be resolvable. It's also obvious that
code in different compilation units ends up in different object files.

So stating that it's a 'must' to have such symbols marked visible is
good enough for an argument because that tells the reader that this is a
mandatory requirement for a GCC based LTO build.

No?

Thanks,

        tglx


* Re: [PATCH 08/46] static_call, lto: Mark static keys as __visible
  2022-11-14 20:34     ` Andi Kleen
@ 2022-11-17  8:24       ` Peter Zijlstra
  0 siblings, 0 replies; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-17  8:24 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Josh Poimboeuf, Jason Baron, Steven Rostedt,
	Ard Biesheuvel, Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 12:34:33PM -0800, Andi Kleen wrote:
> On Mon, Nov 14, 2022 at 04:51:07PM +0100, Peter Zijlstra wrote:
> > On Mon, Nov 14, 2022 at 12:43:06PM +0100, Jiri Slaby (SUSE) wrote:
> > > From: Andi Kleen <andi@firstfloor.org>
> > > 
> > > Symbols referenced from assembler (either directly or e.f. from
> > > DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> > > they could end up in a different object file than the assembler. This
> > > can lead to linker errors without this patch.
> > > 
> > > So mark static call functions as __visible, namely static keys here.
> > 
> > Why doesn't llvm-lto need this?
> 
> It has an integrated assembler that can feed this information to the LTO
> symbol table, while gas cannot do that.
> 
> There was some discussion to extend the gcc top level asm syntax to 
> express external symbols, but so far it doesn't exist.

Urgh, that's ugly too. Why does GCC insist on ugly solutions; clang has
shown it can be done sanely, follow.

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-14 19:40 ` Ard Biesheuvel
@ 2022-11-17  8:28   ` Peter Zijlstra
  2022-11-17  8:50     ` Richard Biener
  0 siblings, 1 reply; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-17  8:28 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Richard Biener, Sedat Dilek, Song Liu, Stanislav Fomichev,
	Stefano Stabellini, Steven Rostedt, Thomas Gleixner,
	Valentin Schneider, Vincent Guittot, Vincenzo Frascino,
	Viresh Kumar, VMware PV-Drivers Reviewers, Yonghong Song

On Mon, Nov 14, 2022 at 08:40:50PM +0100, Ard Biesheuvel wrote:
> On Mon, 14 Nov 2022 at 12:44, Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
> >
> > Hi,
> >
> > this is the first call for comments (and kbuild complaints) for this
> > support of gcc (full) LTO in the kernel. Most of the patches come from
> > Andi. Me and Martin rebased them to new kernels and fixed the to-use
> > known issues. Also I updated most of the commit logs and reordered the
> > patches to groups of patches with similar intent.
> >
> > The very first patch comes from Alexander and is pending on some x86
> > queue already (I believe). I am attaching it only for completeness.
> > Without that, the kernel does not boot (LTO reorders a lot).
> >
> > In our measurements, the performance differences are negligible.
> >
> > The kernel is bigger with gcc LTO due to more inlining.
> 
> OK, so if I understand this correctly:
> - the performance is the same
> - the resulting image is bigger
> - we need a whole lot of ugly hacks to placate the linker.
> 
> Pardon my cynicism, but this cover letter does not mention any
> advantages of LTO, so what is the point of all of this?

Seconded; I really hate all the ugly required for the GCC-LTO
'solution'. There not actually being any benefit just makes it a very
simple decision to drop all these patches on the floor.



* Re: [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  2022-11-16 23:30   ` Thomas Gleixner
@ 2022-11-17  8:40     ` Peter Zijlstra
  2022-11-17 22:07       ` Andi Kleen
  2022-11-22  9:32     ` Jiri Slaby
  1 sibling, 1 reply; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-17  8:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andi Kleen, Andy Lutomirski, Martin Liska,
	Jiri Slaby

On Thu, Nov 17, 2022 at 12:30:34AM +0100, Thomas Gleixner wrote:
> On Mon, Nov 14 2022 at 12:43, Jiri Slaby wrote:
> > Symbols referenced from assembler (either directly or e.f. from
> 
> from assembler? I'm not aware that the assembler references anything.
> 
> Also what does e.f. mean? Did you want to write e.g.?
> 
> > DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
> > they could end up in a different object file than the assembler. This
> 
> than the assembler? Are we shipping the assembler in an object file?
> 
> > can lead to linker errors without this patch.
> 
> git grep -i 'this patch' Documentation/process/
> 
> > So mark raw_irqentry_exit_cond_resched() as __visible.
> 
> And all that tells me what? I know what you want to say, but it's not
> there.
> 
>   Symbols in different compilation units which are referenced from
>   assembly code either directly or indirectly, e.g. from
>   DEFINE_STATIC_KEY(), must be marked visible for GCC based LTO builds.
> 
>   Add the missing __visible annotation to raw_irqentry_exit_cond_resched().
> 
> See?
> 
> There is no 'global' because it's obvious that a symbol in a different
> compilation unit must be global to be resolvable. It's also obvious that
> code in different compilation units ends up in different object files.
> 
> So stating that it's a 'must' to have such symbols marked visible is
> good enough for an argument because that tells the reader that this is a
> mandatory requirement for a GCC based LTO build.
> 
> No?

I still don't understand any of it -- this symbol is not static (and
thus lives in the global namespace and its name must not be mangled
lest it breaks ABI), this symbol has its address taken, so it must not
be eliminated.

WTF does this crazy LTO thing require __visible on it?

The original Changelog babbles something about multiple object files,
which doesn't make sense either, there is only a single object file with
LTO -- that's sort of the whole point. The translation unit output
becomes some intermediate gunk -- to be used as input for the LTO pass,
but it is not an ELF object file.

The linker takes all these intermediate files, does the global
optimization thing and then generates a real ELF object file.

Anyway; I think we can drop all this crazy on the floor again, since per
the 0/n (which I didn't get) there isn't any actual benefit from using
GCC-LTO, so why should we bother with all this ugly.

I would suggest GCC implement this integrated assembler and follow the
clang lead here -- or people who want LTO use clang. GCC is clearly
inferior here.

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-17  8:28   ` Peter Zijlstra
@ 2022-11-17  8:50     ` Richard Biener
  2022-11-17 11:42       ` Peter Zijlstra
  2022-11-17 11:48       ` Thomas Gleixner
  0 siblings, 2 replies; 87+ messages in thread
From: Richard Biener @ 2022-11-17  8:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ard Biesheuvel, Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Richard Biener, Sedat Dilek, Song Liu, Stanislav Fomichev,
	Stefano Stabellini, Steven Rostedt, Thomas Gleixner,
	Valentin Schneider, Vincent Guittot, Vincenzo Frascino,
	Viresh Kumar, VMware PV-Drivers Reviewers, Yonghong Song

On Thu, 17 Nov 2022, Peter Zijlstra wrote:

> On Mon, Nov 14, 2022 at 08:40:50PM +0100, Ard Biesheuvel wrote:
> > On Mon, 14 Nov 2022 at 12:44, Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
> > >
> > > Hi,
> > >
> > > this is the first call for comments (and kbuild complaints) for this
> > > support of gcc (full) LTO in the kernel. Most of the patches come from
> > > Andi. Me and Martin rebased them to new kernels and fixed the to-use
> > > known issues. Also I updated most of the commit logs and reordered the
> > > patches to groups of patches with similar intent.
> > >
> > > The very first patch comes from Alexander and is pending on some x86
> > > queue already (I believe). I am attaching it only for completeness.
> > > Without that, the kernel does not boot (LTO reorders a lot).
> > >
> > > In our measurements, the performance differences are negligible.
> > >
> > > The kernel is bigger with gcc LTO due to more inlining.
> > 
> > OK, so if I understand this correctly:
> > - the performance is the same
> > - the resulting image is bigger
> > - we need a whole lot of ugly hacks to placate the linker.
> > 
> > Pardon my cynicism, but this cover letter does not mention any
> > advantages of LTO, so what is the point of all of this?
> 
> Seconded; I really hate all the ugly required for the GCC-LTO
> 'solution'. There not actually being any benefit just makes it a very
> simple decision to drop all these patches on the floor.

I'd say that instead a prerequisite for the series would be to actually
enforce hidden visibility for everything not part of the kernel module
API so the compiler can throw away unused functions.  Currently it has
to keep everything because with a shared object there might be external
references to everything exported from individual TUs.
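
As a rough sketch of what such an annotation looks like at the source level
(the macro and function names below are hypothetical, not taken from the
series), GCC's visibility attribute can separate internal symbols from the
ones meant to be reachable from outside:

```c
/* Hypothetical annotation macros; the names are illustrative only. */
#define __hidden_sketch   __attribute__((__visibility__("hidden")))
#define __exported_sketch __attribute__((__visibility__("default")))

/*
 * Hidden: cannot be referenced from outside the final linked object, so
 * a whole-program optimizer is free to inline or drop it when unused.
 */
__hidden_sketch int internal_helper(int x)
{
	return x * 2;
}

/*
 * Default visibility: stays reachable from other linked objects, so the
 * compiler has to keep it even if nothing in this program calls it.
 */
__exported_sketch int module_api(int x)
{
	return internal_helper(x) + 1;
}
```

The same effect can be had wholesale by building with -fvisibility=hidden and
marking only the exported interface as default.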

There was a size benefit mentioned for module-less monolithic kernels
as likely used in embedded setups, not sure if that's enough motivation
to properly annotate symbols with visibility - and as far as I understand
all these 'required' are actually such fixes.

Richard.

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-17  8:50     ` Richard Biener
@ 2022-11-17 11:42       ` Peter Zijlstra
  2022-11-17 11:49         ` Ard Biesheuvel
  2022-11-17 11:48       ` Thomas Gleixner
  1 sibling, 1 reply; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-17 11:42 UTC (permalink / raw)
  To: Richard Biener
  Cc: Ard Biesheuvel, Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Richard Biener, Sedat Dilek, Song Liu, Stanislav Fomichev,
	Stefano Stabellini, Steven Rostedt, Thomas Gleixner,
	Valentin Schneider, Vincent Guittot, Vincenzo Frascino,
	Viresh Kumar, VMware PV-Drivers Reviewers, Yonghong Song

On Thu, Nov 17, 2022 at 08:50:59AM +0000, Richard Biener wrote:
> On Thu, 17 Nov 2022, Peter Zijlstra wrote:
> 
> > On Mon, Nov 14, 2022 at 08:40:50PM +0100, Ard Biesheuvel wrote:
> > > On Mon, 14 Nov 2022 at 12:44, Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > this is the first call for comments (and kbuild complaints) for this
> > > > support of gcc (full) LTO in the kernel. Most of the patches come from
> > > > Andi. Me and Martin rebased them to new kernels and fixed the to-use
> > > > known issues. Also I updated most of the commit logs and reordered the
> > > > patches to groups of patches with similar intent.
> > > >
> > > > The very first patch comes from Alexander and is pending on some x86
> > > > queue already (I believe). I am attaching it only for completeness.
> > > > Without that, the kernel does not boot (LTO reorders a lot).
> > > >
> > > > In our measurements, the performance differences are negligible.
> > > >
> > > > The kernel is bigger with gcc LTO due to more inlining.
> > > 
> > > OK, so if I understand this correctly:
> > > - the performance is the same
> > > - the resulting image is bigger
> > > - we need a whole lot of ugly hacks to placate the linker.
> > > 
> > > Pardon my cynicism, but this cover letter does not mention any
> > > advantages of LTO, so what is the point of all of this?
> > 
> > Seconded; I really hate all the ugly required for the GCC-LTO
> > 'solution'. There not actually being any benefit just makes it a very
> > simple decision to drop all these patches on the floor.
> 
> I'd say that instead a prerequisite for the series would be to actually
> enforce hidden visibility for everything not part of the kernel module
> API so the compiler can throw away unused functions.  Currently it has
> to keep everything because with a shared object there might be external
> references to everything exported from individual TUs.

I'm not sure what you're on about; only symbols annotated with
EXPORT_SYMBOL*() are accessible from modules (aka DSOs) and those will
have their address taken. You can freely eliminate any unused symbol.

> There was a size benefit mentioned for module-less monolithic kernels
> as likely used in embedded setups, not sure if that's enough motivation
> to properly annotate symbols with visibility - and as far as I understand
> all these 'required' are actually such fixes.

I'm not seeing how littering __visible is useful or desired, doubly so
for that static hack, that's just a crude work around for GCC LTO being
inferior for not being able to read inline asm.

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-17  8:50     ` Richard Biener
  2022-11-17 11:42       ` Peter Zijlstra
@ 2022-11-17 11:48       ` Thomas Gleixner
  1 sibling, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2022-11-17 11:48 UTC (permalink / raw)
  To: Richard Biener, Peter Zijlstra
  Cc: Ard Biesheuvel, Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Richard Biener, Sedat Dilek, Song Liu, Stanislav Fomichev,
	Stefano Stabellini, Steven Rostedt, Valentin Schneider,
	Vincent Guittot, Vincenzo Frascino, Viresh Kumar,
	VMware PV-Drivers Reviewers, Yonghong Song

On Thu, Nov 17 2022 at 08:50, Richard Biener wrote:
> On Thu, 17 Nov 2022, Peter Zijlstra wrote:
>> Seconded; I really hate all the ugly required for the GCC-LTO
>> 'solution'. There not actually being any benefit just makes it a very
>> simple decision to drop all these patches on the floor.
>
> I'd say that instead a prerequisite for the series would be to actually
> enforce hidden visibility for everything not part of the kernel module
> API so the compiler can throw away unused functions.  Currently it has
> to keep everything because with a shared object there might be external
> references to everything exported from individual TUs.
>
> There was a size benefit mentioned for module-less monolithic kernels
> as likely used in embedded setups, not sure if that's enough motivation
> to properly annotate symbols with visibility - and as far as I understand
> all these 'required' are actually such fixes.

To accommodate a broken tool which cannot figure out which functions are
referenced in the final lump and which are not, right?

Can we pretty please fix the tool instead of proliferating the
brokenness?

Thanks,

        tglx

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-17 11:42       ` Peter Zijlstra
@ 2022-11-17 11:49         ` Ard Biesheuvel
  2022-11-17 13:55           ` Richard Biener
  0 siblings, 1 reply; 87+ messages in thread
From: Ard Biesheuvel @ 2022-11-17 11:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Richard Biener, Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Richard Biener, Sedat Dilek, Song Liu, Stanislav Fomichev,
	Stefano Stabellini, Steven Rostedt, Thomas Gleixner,
	Valentin Schneider, Vincent Guittot, Vincenzo Frascino,
	Viresh Kumar, VMware PV-Drivers Reviewers, Yonghong Song

On Thu, 17 Nov 2022 at 12:43, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Nov 17, 2022 at 08:50:59AM +0000, Richard Biener wrote:
> > On Thu, 17 Nov 2022, Peter Zijlstra wrote:
> >
> > > On Mon, Nov 14, 2022 at 08:40:50PM +0100, Ard Biesheuvel wrote:
> > > > On Mon, 14 Nov 2022 at 12:44, Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > this is the first call for comments (and kbuild complaints) for this
> > > > > support of gcc (full) LTO in the kernel. Most of the patches come from
> > > > > Andi. Me and Martin rebased them to new kernels and fixed the to-use
> > > > > known issues. Also I updated most of the commit logs and reordered the
> > > > > patches to groups of patches with similar intent.
> > > > >
> > > > > The very first patch comes from Alexander and is pending on some x86
> > > > > queue already (I believe). I am attaching it only for completeness.
> > > > > Without that, the kernel does not boot (LTO reorders a lot).
> > > > >
> > > > > In our measurements, the performance differences are negligible.
> > > > >
> > > > > The kernel is bigger with gcc LTO due to more inlining.
> > > >
> > > > OK, so if I understand this correctly:
> > > > - the performance is the same
> > > > - the resulting image is bigger
> > > > - we need a whole lot of ugly hacks to placate the linker.
> > > >
> > > > Pardon my cynicism, but this cover letter does not mention any
> > > > advantages of LTO, so what is the point of all of this?
> > >
> > > Seconded; I really hate all the ugly required for the GCC-LTO
> > > 'solution'. There not actually being any benefit just makes it a very
> > > simple decision to drop all these patches on the floor.
> >
> > I'd say that instead a prerequisite for the series would be to actually
> > enforce hidden visibility for everything not part of the kernel module
> > API so the compiler can throw away unused functions.  Currently it has
> > to keep everything because with a shared object there might be external
> > references to everything exported from individual TUs.
>
> I'm not sure what you're on about; only symbols annotated with
> EXPORT_SYMBOL*() are accessible from modules (aka DSOs) and those will
> have their address taken. You can freely eliminate any unused symbol.
>
> > There was a size benefit mentioned for module-less monolithic kernels
> > as likely used in embedded setups, not sure if that's enough motivation
> > to properly annotate symbols with visibility - and as far as I understand
> > all these 'required' are actually such fixes.
>
> I'm not seeing how littering __visible is useful or desired, doubly so
> for that static hack, that's just a crude work around for GCC LTO being
> inferior for not being able to read inline asm.

We have an __ADDRESSABLE() macro and asmlinkage modifier to annotate
symbols that may appear to the compiler as though they are never
referenced.

Would it be possible to repurpose those so that the LTO code knows
which symbols it must not remove?

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-17 11:49         ` Ard Biesheuvel
@ 2022-11-17 13:55           ` Richard Biener
  2022-11-17 14:32             ` Peter Zijlstra
  2022-11-17 15:15             ` Ard Biesheuvel
  0 siblings, 2 replies; 87+ messages in thread
From: Richard Biener @ 2022-11-17 13:55 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Peter Zijlstra, Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Sedat Dilek, Song Liu, Stanislav Fomichev, Stefano Stabellini,
	Steven Rostedt, Thomas Gleixner, Valentin Schneider,
	Vincent Guittot, Vincenzo Frascino, Viresh Kumar,
	VMware PV-Drivers Reviewers, Yonghong Song

On Thu, 17 Nov 2022, Ard Biesheuvel wrote:

> On Thu, 17 Nov 2022 at 12:43, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Nov 17, 2022 at 08:50:59AM +0000, Richard Biener wrote:
> > > On Thu, 17 Nov 2022, Peter Zijlstra wrote:
> > >
> > > > On Mon, Nov 14, 2022 at 08:40:50PM +0100, Ard Biesheuvel wrote:
> > > > > On Mon, 14 Nov 2022 at 12:44, Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > this is the first call for comments (and kbuild complaints) for this
> > > > > > support of gcc (full) LTO in the kernel. Most of the patches come from
> > > > > > Andi. Me and Martin rebased them to new kernels and fixed the to-use
> > > > > > known issues. Also I updated most of the commit logs and reordered the
> > > > > > patches to groups of patches with similar intent.
> > > > > >
> > > > > > The very first patch comes from Alexander and is pending on some x86
> > > > > > queue already (I believe). I am attaching it only for completeness.
> > > > > > Without that, the kernel does not boot (LTO reorders a lot).
> > > > > >
> > > > > > In our measurements, the performance differences are negligible.
> > > > > >
> > > > > > The kernel is bigger with gcc LTO due to more inlining.
> > > > >
> > > > > OK, so if I understand this correctly:
> > > > > - the performance is the same
> > > > > - the resulting image is bigger
> > > > > - we need a whole lot of ugly hacks to placate the linker.
> > > > >
> > > > > Pardon my cynicism, but this cover letter does not mention any
> > > > > advantages of LTO, so what is the point of all of this?
> > > >
> > > > Seconded; I really hate all the ugly required for the GCC-LTO
> > > > 'solution'. There not actually being any benefit just makes it a very
> > > > simple decision to drop all these patches on the floor.
> > >
> > > I'd say that instead a prerequisite for the series would be to actually
> > > enforce hidden visibility for everything not part of the kernel module
> > > API so the compiler can throw away unused functions.  Currently it has
> > > to keep everything because with a shared object there might be external
> > > references to everything exported from individual TUs.
> >
> > I'm not sure what you're on about; only symbols annotated with
> > EXPORT_SYMBOL*() are accessible from modules (aka DSOs) and those will
> > have their address taken. You can freely eliminate any unused symbol.

But IIRC that's not reflected on the ELF level by making EXPORT_SYMBOL*()
symbols public and the rest hidden - instead all symbols global in the C TUs
will become public and the module dynamic loader details are hidden from
GCC's view of the kernel image as an ELF relocatable object.
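
As a rough illustration of the visibility annotation being discussed, the sketch below marks one function hidden and one default using the GCC/Clang `visibility` attribute. The macro and function names (`MY_HIDDEN`, `MY_EXPORTED`, `internal_helper`, `module_api_entry`) are hypothetical and do not come from the kernel or from this series; the point is only that a hidden symbol is promised not to be referenced by name from outside the linked object, which is the information an LTO link would need before dropping or localizing it.

```c
/* Hypothetical annotations, not kernel macros. */
#define MY_HIDDEN   __attribute__((visibility("hidden")))
#define MY_EXPORTED __attribute__((visibility("default")))

/*
 * Hidden: nothing outside the final linked object may bind to this
 * symbol by name, so the toolchain may inline it and drop the
 * out-of-line copy if no reference remains.
 */
MY_HIDDEN int internal_helper(int x)
{
	return x * 2;
}

/*
 * Default visibility: part of the (assumed) module-facing API, so it
 * has to stay reachable by name.
 */
MY_EXPORTED int module_api_entry(int x)
{
	return internal_helper(x) + 1;
}
```

Whether annotating the whole tree this way is worth it is exactly the open question in this thread; the sketch only shows what the annotation looks like.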

> > > There was a size benefit mentioned for module-less monolithic kernels
> > > as likely used in embedded setups, not sure if that's enough motivation
> > > to properly annotate symbols with visibility - and as far as I understand
> > > all these 'required' are actually such fixes.
> >
> > I'm not seeing how littering __visible is useful or desired, doubly so
> > for that static hack, that's just a crude work around for GCC LTO being
> > inferior for not being able to read inline asm.
> 
> We have an __ADDRESSABLE() macro and asmlinkage modifier to annotate
> symbols that may appear to the compiler as though they are never
> referenced.
> 
> Would it be possible to repurpose those so that the LTO code knows
> which symbols it must not remove?

I find

/*
 * Force the compiler to emit 'sym' as a symbol, so that we can reference
 * it from inline assembler. Necessary in case 'sym' could be inlined
 * otherwise, or eliminated entirely due to lack of references that are
 * visible to the compiler.
 */
#define ___ADDRESSABLE(sym, __attrs) \
	static void * __used __attrs \
		__UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)&sym;
#define __ADDRESSABLE(sym) \
	___ADDRESSABLE(sym, __section(".discard.addressable"))

that should be enough to force LTO keeping 'sym' - unless there's
a linker script that discards .discard.addressable which I fear LTO
will notice, losing the effect.  A more direct way would be to attach
__used to 'sym' directly.  __ADDRESSABLE doesn't seem to be used
directly but instead I see cases like

#define __define_initcall_stub(__stub, fn)                      \
        int __init __stub(void);                                \
        int __init __stub(void)                                 \
        {                                                       \
                return fn();                                    \
        }                                                       \
        __ADDRESSABLE(__stub)

where one could have added __used to the __stub prototypes instead?
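
To make the two options concrete, here is a minimal user-space sketch, assuming GCC or Clang. `MY_ADDRESSABLE` is a simplified stand-in for the kernel's `__ADDRESSABLE()` (no unique ID, no `.discard.addressable` section), and all `my_`/`helper_` names are hypothetical. The first variant keeps a function alive by storing its address in a `used` static pointer; the second attaches `used` to the function itself.

```c
/* Simplified stand-in for __ADDRESSABLE(); not the kernel definition. */
#define MY_PASTE_(a, b) a##b
#define MY_PASTE(a, b)  MY_PASTE_(a, b)
#define MY_ADDRESSABLE(sym) \
	static void *MY_PASTE(my_addressable_, sym) \
		__attribute__((used)) = (void *)&sym;

/*
 * Variant 1: the function has no C-level caller; the 'used' pointer
 * holding its address is what forces the compiler to emit it.
 */
static int helper_called_from_asm(void)
{
	return 42;
}
MY_ADDRESSABLE(helper_called_from_asm)

/*
 * Variant 2: mark the function itself as used, with no extra pointer
 * object.
 */
static int __attribute__((used)) helper_marked_used(void)
{
	return 7;
}
```

Either form only addresses elimination; it says nothing about which LTO partition the symbol ends up in.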

The folks who worked on LTO enablement of the kernel should know the
real issue better - I understand asm()s are a pain because GCC
refuses to parse the assembler string heuristically for used
symbols (but it can never be more than heuristics).  The issue with
asm()s is not so much elimination (__used solves that) but that
GCC can end up moving the asm() and the referred-to symbols to
different link-time units causing unresolved symbols for non-global
symbols.  -fno-toplevel-reorder should fix that at some cost.

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-17 13:55           ` Richard Biener
@ 2022-11-17 14:32             ` Peter Zijlstra
  2022-11-17 14:40               ` Richard Biener
  2022-11-17 15:15             ` Ard Biesheuvel
  1 sibling, 1 reply; 87+ messages in thread
From: Peter Zijlstra @ 2022-11-17 14:32 UTC (permalink / raw)
  To: Richard Biener
  Cc: Ard Biesheuvel, Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Sedat Dilek, Song Liu, Stanislav Fomichev, Stefano Stabellini,
	Steven Rostedt, Thomas Gleixner, Valentin Schneider,
	Vincent Guittot, Vincenzo Frascino, Viresh Kumar,
	VMware PV-Drivers Reviewers, Yonghong Song

On Thu, Nov 17, 2022 at 01:55:07PM +0000, Richard Biener wrote:

> > > I'm not sure what you're on about; only symbols annotated with
> > > EXPORT_SYMBOL*() are accessible from modules (aka DSOs) and those will
> > > have their address taken. You can freely eliminate any unused symbol.
> 
> But IIRC that's not reflected on the ELF level by making EXPORT_SYMBOL*()
> symbols public and the rest hidden - instead all symbols global in the C TUs
> will become public and the module dynamic loader details are hidden from
> GCC's view of the kernel image as an ELF relocatable object.

It is reflected by keeping their address in __ksymtab_$foo sections, as
such their address 'escapes'.
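
A simplified sketch of what "the address escapes" means, assuming GCC or Clang. `MY_EXPORT_SYMBOL` and `struct my_export_entry` are hypothetical and much simpler than the kernel's real EXPORT_SYMBOL*() machinery (no dedicated section, no CRCs, no namespaces): each exported function gets a table entry that stores its address, so the compiler sees a reference to it even when no C code calls it.

```c
/* Hypothetical export-table entry; not struct kernel_symbol. */
struct my_export_entry {
	const char *name;
	void *addr;
};

/*
 * Record the symbol's name and address in a 'used' object. Because
 * the address is stored, the function cannot be treated as dead.
 */
#define MY_EXPORT_SYMBOL(sym) \
	static const struct my_export_entry my_ksymtab_##sym \
		__attribute__((used)) = { #sym, (void *)&sym }

int exported_fn(void)
{
	return 1;
}
MY_EXPORT_SYMBOL(exported_fn);

/*
 * No table entry: nothing here tells the compiler whether an
 * external user exists, which is the gap Richard points at.
 */
int unexported_fn(void)
{
	return 2;
}
```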

> > We have an __ADDRESSABLE() macro and asmlinkage modifier to annotate
> > symbols that may appear to the compiler as though they are never
> > referenced.
> > 
> > Would it be possible to repurpose those so that the LTO code knows
> > which symbols it must not remove?
> 
> I find
> 
> /*
>  * Force the compiler to emit 'sym' as a symbol, so that we can reference
>  * it from inline assembler. Necessary in case 'sym' could be inlined
>  * otherwise, or eliminated entirely due to lack of references that are
>  * visible to the compiler.
>  */
> #define ___ADDRESSABLE(sym, __attrs) \
> 	static void * __used __attrs \
> 		__UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)&sym;
> #define __ADDRESSABLE(sym) \
> 	___ADDRESSABLE(sym, __section(".discard.addressable"))
> 
> that should be enough to force LTO keeping 'sym' - unless there's
> a linker script that discards .discard.addressable which I fear LTO
> will notice, losing the effect.

The initial LTO link pass will not discard .discard sections in order to
generate a regular ELF object file. This object file is then fed to
objtool and the kallsyms tool and eventually linked with the linker
script in a multi-stage link pass.

Also see scripts/link-vmlinux.sh for all the horrible details.

> The folks who worked on LTO enablement of the kernel should know the
> real issue better - I understand asm()s are a pain because GCC
> refuses to parse the assembler string heuristically for used
> symbols (but it can never be more than heuristics). 

I don't understand why it can't be more than heuristics; eventually the
asm() contents end up in a real assembler and it has to make sense.

Might as well parse it directly -- isn't that what clang-ias does?

> The issue with asm()s is not so much elimination (__used solves that)
> but that GCC can end up moving the asm() and the referred-to symbols to
> different link-time units causing unresolved symbols for non-global
> symbols.  -fno-toplevel-reorder should fix that at some cost.

I thought the whole point of LTO was that there was only a single link
time unit, translate all the TUs into intermediate gunk and then collect
the whole lot in one go.

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-17 14:32             ` Peter Zijlstra
@ 2022-11-17 14:40               ` Richard Biener
  0 siblings, 0 replies; 87+ messages in thread
From: Richard Biener @ 2022-11-17 14:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ard Biesheuvel, Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Sedat Dilek, Song Liu, Stanislav Fomichev, Stefano Stabellini,
	Steven Rostedt, Thomas Gleixner, Valentin Schneider,
	Vincent Guittot, Vincenzo Frascino, Viresh Kumar,
	VMware PV-Drivers Reviewers, Yonghong Song

On Thu, 17 Nov 2022, Peter Zijlstra wrote:

> On Thu, Nov 17, 2022 at 01:55:07PM +0000, Richard Biener wrote:
> 
> > > > I'm not sure what you're on about; only symbols annotated with
> > > > EXPORT_SYMBOL*() are accessible from modules (aka DSOs) and those will
> > > > have their address taken. You can freely eliminate any unused symbol.
> > 
> > But IIRC that's not reflected on the ELF level by making EXPORT_SYMBOL*()
> > symbols public and the rest hidden - instead all symbols global in the C TUs
> > will become public and the module dynamic loader details are hidden from
> > GCC's view of the kernel image as an ELF relocatable object.
> 
> It is reflected by keeping their address in __ksymtab_$foo sections, as
> such their address 'escapes'.

That's not enough to make symbols not appearing in __ksymtab_$foo
sections eliminatable.

> > > We have an __ADDRESSABLE() macro and asmlinkage modifier to annotate
> > > symbols that may appear to the compiler as though they are never
> > > referenced.
> > > 
> > > Would it be possible to repurpose those so that the LTO code knows
> > > which symbols it must not remove?
> > 
> > I find
> > 
> > /*
> >  * Force the compiler to emit 'sym' as a symbol, so that we can reference
> >  * it from inline assembler. Necessary in case 'sym' could be inlined
> >  * otherwise, or eliminated entirely due to lack of references that are
> >  * visible to the compiler.
> >  */
> > #define ___ADDRESSABLE(sym, __attrs) \
> > 	static void * __used __attrs \
> > 		__UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)&sym;
> > #define __ADDRESSABLE(sym) \
> > 	___ADDRESSABLE(sym, __section(".discard.addressable"))
> > 
> > that should be enough to force LTO keeping 'sym' - unless there's
> > a linker script that discards .discard.addressable which I fear LTO
> > will notice, losing the effect.
> 
> The initial LTO link pass will not discard .discard sections in order to
> generate a regular ELF object file. This object file is then fed to
> objtool and the kallsyms tool and eventually linked with the linker
> script in a multi-stage link pass.
> 
> Also see scripts/link-vmlinux.sh for all the horrible details.
> 
> > The folks who worked on LTO enablement of the kernel should know the
> > real issue better - I understand asm()s are a pain because GCC
> > refuses to parse the assembler string heuristically for used
> > symbols (but it can never be more than heuristics). 
> 
> I don't understand why it can't be more than heuristics; eventually the
> asm() contents end up in a real assembler and it has to make sense.
> 
> Might as well parse it directly -- isn't that what clang-ias does?

GCC doesn't have an integrated assembler and the actual assembler text
that's emitted is not known at the stage we need to know the symbol.
Which means for GCC it would be heuristics.

> > The issue with asm()s is not so much elimination (__used solves that)
> > but that GCC can end up moving the asm() and the referred-to symbols to
> > different link-time units causing unresolved symbols for non-global
> > symbols.  -fno-toplevel-reorder should fix that at some cost.
> 
> I thought the whole point of LTO was that there was only a single link
> time unit, translate all the TUs into intermediate gunk and then collect
> the whole lot in one go.

that's what it does, but it fans out to parallelize the final compile,
dividing the whole lot again which is where this problem can appear
if GCC doesn't see that asm() X uses symbol Y.

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

* Re: [PATCH 00/46] gcc-LTO support for the kernel
  2022-11-17 13:55           ` Richard Biener
  2022-11-17 14:32             ` Peter Zijlstra
@ 2022-11-17 15:15             ` Ard Biesheuvel
  1 sibling, 0 replies; 87+ messages in thread
From: Ard Biesheuvel @ 2022-11-17 15:15 UTC (permalink / raw)
  To: Richard Biener
  Cc: Peter Zijlstra, Jiri Slaby (SUSE),
	linux-kernel, Alexander Potapenko, Alexander Shishkin,
	Alexei Starovoitov, Alexey Makhalov, Andrew Morton,
	Andrey Konovalov, Andrey Ryabinin, Andrii Nakryiko,
	Andy Lutomirski, Arnaldo Carvalho de Melo, Ben Segall,
	Borislav Petkov, Daniel Borkmann, Daniel Bristot de Oliveira,
	Dave Hansen, Dietmar Eggemann, Dmitry Vyukov, Don Zickus,
	Hao Luo, H . J . Lu, H. Peter Anvin, Huang Rui, Ingo Molnar,
	Jan Hubicka, Jason Baron, Jiri Kosina, Jiri Olsa, Joe Lawrence,
	John Fastabend, Josh Poimboeuf, Juergen Gross, Juri Lelli,
	KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
	Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
	Miroslav Benes, Namhyung Kim, Nick Desaulniers,
	Oleksandr Tyshchenko, Petr Mladek, Rafael J. Wysocki,
	Sedat Dilek, Song Liu, Stanislav Fomichev, Stefano Stabellini,
	Steven Rostedt, Thomas Gleixner, Valentin Schneider,
	Vincent Guittot, Vincenzo Frascino, Viresh Kumar,
	VMware PV-Drivers Reviewers, Yonghong Song

On Thu, 17 Nov 2022 at 14:55, Richard Biener <rguenther@suse.de> wrote:
>
> On Thu, 17 Nov 2022, Ard Biesheuvel wrote:
>
...
> > We have an __ADDRESSABLE() macro and asmlinkage modifier to annotate
> > symbols that may appear to the compiler as though they are never
> > referenced.
> >
> > Would it be possible to repurpose those so that the LTO code knows
> > which symbols it must not remove?
>
> I find
>
> /*
>  * Force the compiler to emit 'sym' as a symbol, so that we can reference
>  * it from inline assembler. Necessary in case 'sym' could be inlined
>  * otherwise, or eliminated entirely due to lack of references that are
>  * visible to the compiler.
>  */
> #define ___ADDRESSABLE(sym, __attrs) \
>         static void * __used __attrs \
>                 __UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)&sym;
> #define __ADDRESSABLE(sym) \
>         ___ADDRESSABLE(sym, __section(".discard.addressable"))
>
> that should be enough to force LTO keeping 'sym' - unless there's
> a linker script that discards .discard.addressable which I fear LTO
> will notice, losing the effect.  A more direct way would be to attach
> __used to 'sym' directly.  __ADDRESSABLE doesn't seem to be used
> directly but instead I see cases like
>
> #define __define_initcall_stub(__stub, fn)                      \
>         int __init __stub(void);                                \
>         int __init __stub(void)                                 \
>         {                                                       \
>                 return fn();                                    \
>         }                                                       \
>         __ADDRESSABLE(__stub)
>
> where one could have added __used to the __stub prototypes instead?
>

Probably, yes.

But my point was not really about the implementation of those things,
more about whether we could redefine them to something else that would
help the compiler infer that this symbol needs to be retained.

asmlinkage in particular seems relevant, which is currently only used
for C++ inclusion or for setting regparm(0) on i386.
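
One hypothetical shape such a redefinition could take is sketched below. It is not what the kernel or this series does: `my_asmlinkage`, `MY_VISIBLE` and `my_entry_point` are made-up names, and the sketch assumes the compiler either supports GCC's `externally_visible` attribute or ignores the macro entirely.

```c
/*
 * Use externally_visible only where the compiler knows it; elsewhere
 * MY_VISIBLE expands to nothing.
 */
#if defined(__has_attribute)
# if __has_attribute(externally_visible)
#  define MY_VISIBLE __attribute__((externally_visible))
# endif
#endif
#ifndef MY_VISIBLE
# define MY_VISIBLE
#endif

/*
 * Hypothetical: an asmlinkage-like marker that also tells a
 * whole-program/LTO build that callers exist outside its view.
 */
#define my_asmlinkage MY_VISIBLE

/* Stand-in for a function that is only ever called from assembly. */
my_asmlinkage int my_entry_point(int nr)
{
	return nr + 1;
}
```

The design question raised above is whether existing annotations could carry this meaning, so that no additional per-symbol markers are needed.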

* Re: [PATCH 40/46] x86/livepatch, lto: Disable live patching with gcc LTO
  2022-11-14 11:43 ` [PATCH 40/46] x86/livepatch, lto: Disable live patching " Jiri Slaby (SUSE)
  2022-11-14 19:07   ` Josh Poimboeuf
@ 2022-11-17 20:00   ` Song Liu
  1 sibling, 0 replies; 87+ messages in thread
From: Song Liu @ 2022-11-17 20:00 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Andi Kleen, Josh Poimboeuf, Jiri Kosina,
	Miroslav Benes, Petr Mladek, Joe Lawrence, live-patching,
	Andi Kleen, Martin Liska, Jiri Slaby

On Mon, Nov 14, 2022 at 3:48 AM Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>
> From: Andi Kleen <andi@firstfloor.org>
>
> It is not supported by gcc 12 so far, so it causes compiler "sorry"
> messages.
>
> Other than the compiler support, there shouldn't be any barriers for
> live patching LTOed kernels, although it might be more difficult to
> create patches for larger functions.

A loosely related question: does livepatch work with CLANG LTO?
AFAICT, kpatch-build doesn't support it. But the kernel side should
work just fine?

Thanks,
Song

[...]

* Re: [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  2022-11-17  8:40     ` Peter Zijlstra
@ 2022-11-17 22:07       ` Andi Kleen
  2022-11-18  1:28         ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Andi Kleen @ 2022-11-17 22:07 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andy Lutomirski, Martin Liska, Jiri Slaby


> I still don't understand any of it -- this symbol is not static (and
> thus lives in the global namespace and its name must not be mangled
> lest it breaks ABI), this symbol has its address taken, so it must not
> be eliminated.

It's not eliminated, but is still mangled because gcc turns it into
static due to -fwhole-program. Maybe this could be avoided in gcc, but at
least that's what it does currently.

I believe disabling -fwhole-program would likely avoid it, but it would
also prevent some code transformations because gcc would need to assume
that every function can be called by someone it doesn't see.

> WTF does this crazy LTO thing require __visible on it?
>
> The original Changelog babbles something about multiple object files,
> which doesn't make sense either, there is only a single object file with
> LTO -- that's sort of the whole point. The translation unit output
> becomes some intermediate gunk -- to be used as input for the LTO pass,
> but it is not an ELF object file.
>
> The linker takes all these intermediate files, does the global
> optimization thing and then generates a real ELF object file.

That would be a single threaded very very slow global compilation.
Instead gcc WHOPR uses partitioning to generate smaller units that can be
compiled in parallel based on their call dependencies, and these use
different object files from the individual assembler invocations.

>
> Anyway; I think we can drop all this crazy on the floor again, since per
> the 0/n (which I didn't get) there isn't any actual benefit from using
> GCC-LTO, so why should we bother with all this ugly.

At least in the past it generated smaller kernels for small configurations.

One benefit that wasn't mentioned is doing type and other checks (e.g.
constant propagation through inlining) across files.

In general LTO gives the compiler a lot more freedom to optimize code,
so even if it's not quite there yet I think it's beneficial to let
users play around with it and see if they can get benefits.



-Andi


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  2022-11-17 22:07       ` Andi Kleen
@ 2022-11-18  1:28         ` Thomas Gleixner
  2022-11-19  0:50           ` Andi Kleen
  0 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2022-11-18  1:28 UTC (permalink / raw)
  To: Andi Kleen, Peter Zijlstra
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andy Lutomirski, Martin Liska, Jiri Slaby

On Thu, Nov 17 2022 at 14:07, Andi Kleen wrote:
>> Anyway; I think we can drop all this crazy on the floor again, since per
>> the 0/n (which I didn't get) there isn't any actual benefit from using
>> GCC-LTO, so why should we bother with all this ugly.
>
> At least in the past it generated smaller kernels for small configurations.
>
> One benefit that wasn't mentioned is doing type and other checks (e.g.
> constant propagation through inlining) across files.
>
> In general LTO gives the compiler a lot more freedom to optimize code,
> so even if it's not quite there yet I think it's beneficial to let
> users play around with it and see if they can get benefits.

Sure, they can play around with it but that does not require to merge
all this nonsensical ballast for a half thought out compiler.

If they want to do that they can apply the pile of patches as provided
and play around.

If anything useful comes out of that with sensible changelogs and a
sensible argumentation why supporting a half thought out compiler is
required then we can revisit that.

Up to that point this is all considered to be __invisible.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  2022-11-18  1:28         ` Thomas Gleixner
@ 2022-11-19  0:50           ` Andi Kleen
  2022-11-19  8:50             ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Andi Kleen @ 2022-11-19  0:50 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andy Lutomirski, Martin Liska, Jiri Slaby


> Sure, they can play around with it but that does not require to merge
> all this nonsensical ballast for a half thought out compiler.


You are referring to __visible?

TBH I don't understand the problem. In general __visible is useful
documentation, so you know something is used from assembler or other
strange contexts. Doing such things explicitly marked instead of
implicitly hidden and they just happen to work by accident seems
cleaner to me.


I can also see the __visible markings being useful for other purposes,
e.g. static analysis tools or dynamic instrumentation like the various
sanitizers. Everything that is referenced outside the normal code that
the compiler sees may need some special handling.


That said I don't see the point of __visible_on_lto either, it should
just all be __visible.


Similar argument applies to __noreorder, it's also useful documentation.


There are a few real workarounds in the patchkit that are a bit ugly,
but __visible isn't one of them.


>
> If they want to do that they can apply the pile of patches as provided
> and play around.


It's very difficult to maintain out of tree, while in tree it's much 
simpler.

I think Linux should support its primary compiler well and not give up 
due to relatively small obstacles.


-Andi

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  2022-11-19  0:50           ` Andi Kleen
@ 2022-11-19  8:50             ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2022-11-19  8:50 UTC (permalink / raw)
  To: Andi Kleen, Peter Zijlstra
  Cc: Jiri Slaby (SUSE),
	linux-kernel, Andy Lutomirski, Martin Liska, Jiri Slaby

On Fri, Nov 18 2022 at 16:50, Andi Kleen wrote:
>> Sure, they can play around with it but that does not require to merge
>> all this nonsensical ballast for a half thought out compiler.
>
> You are referring to __visible?
>
> TBH I don't understand the problem. In general __visible is useful
> documentation, so you know something is used from assembler or other
> strange contexts.  Doing such things explicitly marked instead of
> implicitly hidden and they just happen to work by accident
> seems cleaner to me.

Seems cleaner is really not a technical argument. Visible is completely
useless. Either a symbol is global and therefore reachable from any
point in the final "executable" or it's not. Whether that reference is
in assembly or from a pointer, static key or whatever does not matter at
all. There is no such thing as a 'strange context'.

Nothing works here by accident. A global symbol is a global symbol;
whether it's defined or referenced from C, from ASM or from any other
programming language does not matter at all.

> I can also see the __visible markings being useful for other purposes,
> e.g. static analysis tools or dynamic instrumentation like the various
> sanitizers. Everything that is referenced outside the normal code that
> the compiler sees may need some special handling.

All you have is 'may need' and 'I can see'. Where is the actual use case?

>> If they want to do that they can apply the pile of patches as provided
>> and play around.
>
> It's very difficult to maintain out of tree, while in tree it's much 
> simpler.

Sure. Lots of things are simpler to maintain in tree, but that's not an
argument for merging anything.

> I think Linux should support its primary compiler well and not give up 
> due to relatively small obstacles.

It's not an obstacle. It's a fundamentally broken model. clang has
proven that it can be done properly, so there is no reason to
proliferate the inferior.

While you might consider gcc to be the primary compiler, that might have
been true a decade ago. A lot of people prefer clang as their primary
compiler simply because it's saner and the maintainers behind it are
working with us and not trying to inflict their half-baked crap on us to
spare themselves the work to do it right.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  2022-11-16 23:30   ` Thomas Gleixner
  2022-11-17  8:40     ` Peter Zijlstra
@ 2022-11-22  9:32     ` Jiri Slaby
  1 sibling, 0 replies; 87+ messages in thread
From: Jiri Slaby @ 2022-11-22  9:32 UTC (permalink / raw)
  To: Thomas Gleixner, linux-kernel
  Cc: Andi Kleen, Peter Zijlstra, Andy Lutomirski, Martin Liska, Jiri Slaby

Hi,

On 17. 11. 22, 0:30, Thomas Gleixner wrote:
> On Mon, Nov 14 2022 at 12:43, Jiri Slaby wrote:
>> Symbols referenced from assembler (either directly or e.f. from
> 
> from assembler? I'm not aware that the assembler references anything.

"""
Noun assembler

assembler (countable and uncountable, plural assemblers)

1.  (programming, countable) A program that reads source code written in 
assembly language and produces executable machine code, possibly 
together with information needed by linkers, debuggers and other tools.

2. (computer languages, informal, chiefly uncountable) Assembly language.

     I wrote that program in assembler.
""" [1]

I refer in the above to 2. You refer to 1.

In some languages, incl. mine, we don't distinguish between the two. 
It's always assembler. Yet, that might confuse you, even though it's 
correct as you can see above. I can switch to mode 1 (assembler and 
assembly) for sure.

[1] https://en.wiktionary.org/wiki/assembler

> Also what does e.f. mean? Did you want to write e.g.?

Yes, my and my spellchecker's bad.

>> DEFINE_STATIC_KEY()) need to be global and visible in gcc LTO because
>> they could end up in a different object file than the assembler. This
> 
> than the assembler? Are we shipping the assembler in an object file?

Nope, see above.

>> can lead to linker errors without this patch.
> 
> git grep -i 'this patch' Documentation/process/

Sorry, I don't understand, care to elaborate? None of the lines from the 
output seems to match the case here.

>> So mark raw_irqentry_exit_cond_resched() as __visible.
> 
> And all that tells me what? I know what you want to say, but it's not
> there.
> 
>    Symbols in different compilation units which are referenced from
>    assembly code either directly or indirectly, e.g. from
>    DEFINE_STATIC_KEY(), must be marked visible for GCC based LTO builds.
> 
>    Add the missing __visible annotation to raw_irqentry_exit_cond_resched().
> 
> See?
> 
> There is no 'global' because it's obvious that a symbol in a different
> compilation unit must be global to be resolvable. It's also obvious that
> code in different compilation units ends up in different object files.

It's not about different compilation units. It's about different partitions.

> So stating that it's a 'must' to have such symbols marked visible is
> good enough for an argument because that tells the reader that this is a
> mandatory requirement for an GCC based LTO build.

My bad that I failed to explain this properly in the commit log. But we
are working on throwing this whole __visible thing away. Agreed that
it's ridiculous/absurd.

thanks,
-- 
js
suse labs


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 45/46] kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned
  2022-11-14 11:43 ` [PATCH 45/46] kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned Jiri Slaby (SUSE)
@ 2022-11-26 17:07   ` Andrey Konovalov
  0 siblings, 0 replies; 87+ messages in thread
From: Andrey Konovalov @ 2022-11-26 17:07 UTC (permalink / raw)
  To: Jiri Slaby (SUSE)
  Cc: linux-kernel, Martin Liska, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Vincenzo Frascino, Andrew Morton, kasan-dev,
	linux-mm, Jiri Slaby

On Mon, Nov 14, 2022 at 12:45 PM Jiri Slaby (SUSE) <jirislaby@kernel.org> wrote:
>
> From: Martin Liska <mliska@suse.cz>
>
> The function memory_is_poisoned() can handle any size which can be
> propagated by LTO later on. So we can end up with a constant that is not
> handled in the switch. Thus just break and call memory_is_poisoned_n()
> which handles arbitrary size to avoid build errors with gcc LTO.
>
> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Andrey Konovalov <andreyknvl@gmail.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: kasan-dev@googlegroups.com
> Cc: linux-mm@kvack.org
> Signed-off-by: Martin Liska <mliska@suse.cz>
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> ---
>  mm/kasan/generic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> index d8b5590f9484..d261f83c6687 100644
> --- a/mm/kasan/generic.c
> +++ b/mm/kasan/generic.c
> @@ -152,7 +152,7 @@ static __always_inline bool memory_is_poisoned(unsigned long addr, size_t size)
>                 case 16:
>                         return memory_is_poisoned_16(addr);
>                 default:
> -                       BUILD_BUG();
> +                       break;
>                 }
>         }
>
> --
> 2.38.1
>

Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
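
For illustration, the shape of the change in the quoted patch can be
sketched in standalone C as follows. This is a simplified model under
stated assumptions, not the code in mm/kasan/generic.c: the demo_*
names are hypothetical, and the "shadow" buffer is just a byte array
where any non-zero byte counts as poisoned. The idea is that a
compile-time-constant size with no dedicated case falls through to the
generic helper instead of failing the build.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generic checker: handles any size. */
static bool demo_check_n(const unsigned char *shadow, size_t size)
{
	for (size_t i = 0; i < size; i++)
		if (shadow[i])
			return true;
	return false;
}

/* Hypothetical fast path for a single byte. */
static bool demo_check_1(const unsigned char *shadow)
{
	return shadow[0] != 0;
}

static inline bool demo_is_poisoned(const unsigned char *shadow, size_t size)
{
	/* Fast paths only when the compiler can prove the size constant. */
	if (__builtin_constant_p(size)) {
		switch (size) {
		case 1:
			return demo_check_1(shadow);
		default:
			/*
			 * A constant size with no dedicated case: leave the
			 * switch and use the generic helper below rather
			 * than treating it as a build error.
			 */
			break;
		}
	}

	return demo_check_n(shadow, size);
}
```

Both paths return the same answer for a given input, so the fallback
only gives up the specialised fast path for unexpected constant sizes.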

^ permalink raw reply	[flat|nested] 87+ messages in thread

end of thread, other threads:[~2022-11-26 17:08 UTC | newest]

Thread overview: 87+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-14 11:42 [PATCH 00/46] gcc-LTO support for the kernel Jiri Slaby (SUSE)
2022-11-14 11:42 ` [PATCH 01/46] x86/boot: robustify calling startup_{32,64}() from the decompressor code Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 02/46] kbuild: pass jobserver to cmd_ld_vmlinux.o Jiri Slaby (SUSE)
2022-11-14 17:57   ` Masahiro Yamada
2022-11-15  6:36     ` Jiri Slaby
2022-11-14 11:43 ` [PATCH 03/46] kbuild: lto: preserve MAKEFLAGS for module linking Jiri Slaby (SUSE)
2022-11-14 18:02   ` Masahiro Yamada
2022-11-14 11:43 ` [PATCH 04/46] compiler.h: introduce __visible_on_lto Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 05/46] compiler.h: introduce __global_on_lto Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 06/46] Compiler Attributes, lto: introduce __noreorder Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 07/46] tracepoint, lto: Mark static call functions as __visible Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 08/46] static_call, lto: Mark static keys " Jiri Slaby (SUSE)
2022-11-14 15:51   ` Peter Zijlstra
2022-11-14 18:52     ` Josh Poimboeuf
2022-11-14 20:34     ` Andi Kleen
2022-11-17  8:24       ` Peter Zijlstra
2022-11-14 18:57   ` Josh Poimboeuf
2022-11-14 11:43 ` [PATCH 09/46] static_call, lto: Mark static_call_return0() " Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 10/46] static_call, lto: Mark func_a() as __visible_on_lto Jiri Slaby (SUSE)
2022-11-14 15:54   ` Peter Zijlstra
2022-11-14 20:29     ` Andi Kleen
2022-11-14 11:43 ` [PATCH 11/46] x86/alternative, lto: Mark int3_*() as global and __visible Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 12/46] x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto Jiri Slaby (SUSE)
2022-11-14 15:58   ` Peter Zijlstra
2022-11-14 11:43 ` [PATCH 13/46] x86/preempt, lto: Mark preempt_schedule_*thunk() as __visible Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 14/46] x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto Jiri Slaby (SUSE)
2022-11-14 16:02   ` Peter Zijlstra
2022-11-14 11:43 ` [PATCH 15/46] x86/xen, lto: Mark xen_vcpu_stolen() as __visible Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 16/46] x86, lto: Mark gdt_page and native_sched_clock() " Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 17/46] amd, lto: Mark amd pmu and pstate functions as __visible_on_lto Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 18/46] entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible Jiri Slaby (SUSE)
2022-11-16 23:30   ` Thomas Gleixner
2022-11-17  8:40     ` Peter Zijlstra
2022-11-17 22:07       ` Andi Kleen
2022-11-18  1:28         ` Thomas Gleixner
2022-11-19  0:50           ` Andi Kleen
2022-11-19  8:50             ` Thomas Gleixner
2022-11-22  9:32     ` Jiri Slaby
2022-11-14 11:43 ` [PATCH 19/46] export, lto: Mark __kstrtab* in EXPORT_SYMBOL() as global and __visible Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 20/46] softirq, lto: Mark irq_enter/exit_rcu() as __visible Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 21/46] btf, lto: pass scope as strings Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 22/46] btf, lto: Make all BTF IDs global on LTO Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 23/46] init.h, lto: mark initcalls as __noreorder Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 24/46] bpf, lto: mark interpreter jump table " Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 25/46] sched, lto: mark sched classes " Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 26/46] x86/apic, lto: Mark apic_driver*() " Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 27/46] linkage, lto: use C version for SYSCALL_ALIAS() / cond_syscall() Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 28/46] scripts, lto: re-add gcc-ld Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 29/46] scripts, lto: use CONFIG_LTO for many LTO specific actions Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 30/46] Kbuild, lto: Add Link Time Optimization support Jiri Slaby (SUSE)
2022-11-14 18:55   ` Josh Poimboeuf
2022-11-15 13:31     ` Martin Liška
2022-11-14 11:43 ` [PATCH 31/46] x86/purgatory, lto: Disable gcc LTO for purgatory Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 32/46] x86/realmode, lto: Disable gcc LTO for real mode code Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 33/46] x86/vdso, lto: Disable gcc LTO for the vdso Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 34/46] scripts, lto: disable gcc LTO for some mod sources Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 35/46] Kbuild, lto: disable gcc LTO for bounds+asm-offsets Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 36/46] lib/string, lto: disable gcc LTO for string.o Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 37/46] Compiler attributes, lto: disable __flatten with LTO Jiri Slaby (SUSE)
2022-11-14 17:01   ` Miguel Ojeda
2022-11-14 11:43 ` [PATCH 38/46] Kbuild, lto: don't include weak source file symbols in System.map Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 39/46] x86, lto: Disable relative init pointers with gcc LTO Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 40/46] x86/livepatch, lto: Disable live patching " Jiri Slaby (SUSE)
2022-11-14 19:07   ` Josh Poimboeuf
2022-11-14 20:28     ` Andi Kleen
2022-11-14 22:00       ` Josh Poimboeuf
2022-11-15 13:32         ` Martin Liška
2022-11-17 20:00   ` Song Liu
2022-11-14 11:43 ` [PATCH 41/46] x86/lib, lto: Mark 32bit mem{cpy,move,set} as __used Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 42/46] mm/kasan, lto: Mark kasan " Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 43/46] scripts, lto: check C symbols for modversions Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 44/46] scripts/bloat-o-meter, lto: handle gcc LTO Jiri Slaby (SUSE)
2022-11-14 11:43 ` [PATCH 45/46] kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned Jiri Slaby (SUSE)
2022-11-26 17:07   ` Andrey Konovalov
2022-11-14 11:43 ` [PATCH 46/46] x86, lto: Finally enable gcc LTO for x86 Jiri Slaby (SUSE)
2022-11-14 11:56 ` [PATCH 00/46] gcc-LTO support for the kernel Ard Biesheuvel
2022-11-14 12:04   ` Jiri Slaby
2022-11-14 19:40 ` Ard Biesheuvel
2022-11-17  8:28   ` Peter Zijlstra
2022-11-17  8:50     ` Richard Biener
2022-11-17 11:42       ` Peter Zijlstra
2022-11-17 11:49         ` Ard Biesheuvel
2022-11-17 13:55           ` Richard Biener
2022-11-17 14:32             ` Peter Zijlstra
2022-11-17 14:40               ` Richard Biener
2022-11-17 15:15             ` Ard Biesheuvel
2022-11-17 11:48       ` Thomas Gleixner
