All of lore.kernel.org
* [PATCH 0/4] Section alignment issues?
@ 2023-11-22 22:18 deller
  2023-11-22 22:18 ` [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries deller
                   ` (5 more replies)
  0 siblings, 6 replies; 31+ messages in thread
From: deller @ 2023-11-22 22:18 UTC (permalink / raw)
  To: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules,
	linux-arch, Luis Chamberlain

From: Helge Deller <deller@gmx.de>

While working on the 64-bit parisc kernel, I noticed that the __ksymtab[]
table was not correctly 64-bit aligned in many modules.
The following patches do fix some of those issues in the generic code.

But further investigation shows that multiple sections in the kernel and in
modules are possibly not correctly aligned, which may lead to performance
degradation at runtime (small on x86, huge on parisc, sparc and others which
need exception handlers). Sometimes a wrong alignment is simply hidden
because the linker or the kernel module loader happens to place the section
at a correctly aligned address (e.g. because the previous section was
already aligned).

An objdump of an x86-64 module shows e.g.:

./kernel/net/netfilter/nf_log_syslog.ko:     file format elf64-x86-64
Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00001fdf  0000000000000000  0000000000000000  00000040  2**4
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .init.text    000000f6  0000000000000000  0000000000000000  00002020  2**4
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 .exit.text    0000005c  0000000000000000  0000000000000000  00002120  2**4
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  3 .rodata.str1.8 000000dc  0000000000000000  0000000000000000  00002180  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .rodata.str1.1 0000030a  0000000000000000  0000000000000000  0000225c  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .rodata       000000b0  0000000000000000  0000000000000000  00002580  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .modinfo      0000019e  0000000000000000  0000000000000000  00002630  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .return_sites 00000034  0000000000000000  0000000000000000  000027ce  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  8 .call_sites   0000029c  0000000000000000  0000000000000000  00002802  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

In this example I believe the ".return_sites" and ".call_sites" sections
should have an alignment of at least 32 bits (4 bytes).

On other architectures or in other modules, sections such as
".altinstructions" or "__ex_table" may show up misaligned instead.

In general I think it would be beneficial to search for wrong alignments at
link time, and maybe at runtime.

The patch at the end of this cover letter
- adds compile-time checks to the "modpost" tool, and
- adds a runtime check to the kernel module loader.
It will possibly show false positives too (!!!), since I do understand
that some of those sections are not performance critical and thus any
alignment is OK for them.

At compile time, the modpost patch will emit warnings such as these (from an x86-64 kernel build):

WARNING: modpost: vmlinux: section .initcall7.init (type 1, flags 2) has alignment of 1, expected at least 4.
Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?
WARNING: modpost: vmlinux: section .altinstructions (type 1, flags 2) has alignment of 1, expected at least 2.
WARNING: modpost: vmlinux: section .initcall6.init (type 1, flags 2) has alignment of 1, expected at least 4.
WARNING: modpost: vmlinux: section .initcallearly.init (type 1, flags 2) has alignment of 1, expected at least 4.
WARNING: modpost: vmlinux: section .rodata.cst2 (type 1, flags 18) has alignment of 2, expected at least 64.
WARNING: modpost: vmlinux: section .static_call_tramp_key (type 1, flags 2) has alignment of 1, expected at least 8.
WARNING: modpost: vmlinux: section .con_initcall.init (type 1, flags 2) has alignment of 1, expected at least 8.
WARNING: modpost: vmlinux: section __bug_table (type 1, flags 3) has alignment of 1, expected at least 4.
...

while the kernel module loader may show at runtime:

** WARNING **:   module: efivarfs, dest=0xffffffffc02d08d2, section=.retpoline_sites possibly not correctly aligned.
** WARNING **:   module: efivarfs, dest=0xffffffffc02d0bae, section=.ibt_endbr_seal possibly not correctly aligned.
** WARNING **:   module: efivarfs, dest=0xffffffffc02d0be6, section=.orc_unwind possibly not correctly aligned.
** WARNING **:   module: efivarfs, dest=0xffffffffc02d12a6, section=.orc_unwind_ip possibly not correctly aligned.
** WARNING **:   module: efivarfs, dest=0xffffffffc02d178c, section=.note.Linux possibly not correctly aligned.
** WARNING **:   module: efivarfs, dest=0xffffffffc02d17bc, section=.orc_header possibly not correctly aligned.
** WARNING **:   module: xt_addrtype, dest=0xffffffffc01ef18a, size=0x00000020, section=.return_sites

My questions:
- Am I wrong with my analysis?
- What do people see on other architectures?
- Does it make sense to add a compile- and runtime-check, like the patch below, to the kernel?

Helge

---

From: Helge Deller <deller@gmx.de>
Subject: [PATCH] MODPOST: Detect and report possible section alignment issues

IMPORTANT: THIS PATCH IS NOT INTENDED TO BE APPLIED !!!!

The main idea here is to check, at compile time (during the modpost run) and
at runtime (when loading a module), whether the sections in kernel modules
(and vmlinux) are correctly aligned in memory, and to report when a wrong
alignment is suspected.

It WILL report false positives. Many section names still need to be added to
the various tables.
But even at this stage it gives some insight into the various possibly wrong
section alignments.

Signed-off-by: Helge Deller <deller@gmx.de>

diff --git a/kernel/module/main.c b/kernel/module/main.c
index 98fedfdb8db5..88201d6b0c17 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2277,6 +2277,10 @@ static int move_module(struct module *mod, struct load_info *info)
 				ret = -ENOEXEC;
 				goto out_enomem;
 			}
+			if (((uintptr_t)dest) & (BITS_PER_LONG/8 - 1)) {
+				printk("** WARNING **: \tmodule: %s, dest=0x%lx, section=%s possibly not correctly aligned.\n",
+					mod->name, (long)dest, info->secstrings + shdr->sh_name);
+			}
 			memcpy(dest, (void *)shdr->sh_addr, shdr->sh_size);
 		}
 		/*
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index 739402f45509..2add144a05e3 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -49,6 +49,8 @@ modpost-args =										\
 	$(if $(KBUILD_NSDEPS),-d $(MODULES_NSDEPS))					\
 	$(if $(CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS)$(KBUILD_NSDEPS),-N)	\
 	$(if $(findstring 1, $(KBUILD_EXTRA_WARN)),-W)					\
+	$(if $(CONFIG_64BIT),-6)							\
+	$(if $(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS),-R)					\
 	-o $@

 modpost-deps := $(MODPOST)
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index cb6406f485a9..70c69db6a17c 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -43,6 +43,10 @@ static bool ignore_missing_files;
 /* If set to 1, only warn (instead of error) about missing ns imports */
 static bool allow_missing_ns_imports;

+/* Is the target a 64-bit platform, and does it support prel32 relocations? */
+static bool target_64bit;
+static bool target_prel32_relocations;
+
 static bool error_occurred;

 static bool extra_warn;
@@ -1493,6 +1497,76 @@ static void check_sec_ref(struct module *mod, struct elf_info *elf)
 	}
 }

+/**
+ * Check alignment of sections in modules.
+ **/
+static void check_sec_alignment(struct module *mod, struct elf_info *elf)
+{
+	/* sections that may use PREL32 relocations and only need 4-byte alignment */
+	static const char *const prel32_sec_list[] = {
+		"__tracepoints_ptrs",
+		"__ksymtab",
+		"__bug_table",
+		".smp_locks",
+		NULL
+	};
+	/* sections that are fine with any/1-byte alignment */
+	static const char *const byte_sec_list[] = {
+		".modinfo",
+		".init.ramfs",
+		NULL
+	};
+	/* sections with special alignment */
+	static struct { int align; const char *name; } const special_list[] = {
+		{ 64,	".rodata.cst2" },
+		{ 0,	NULL }
+	};
+
+	int i;
+
+	/* ignore vmlinux for now? */
+	// if (mod->is_vmlinux) return;
+
+	/* Walk through all sections */
+	for (i = 0; i < elf->num_sections; i++) {
+		Elf_Shdr *sechdr = &elf->sechdrs[i];
+		const char *sec = sech_name(elf, sechdr);
+		const char *modname = mod->name;
+		const int is_shalign = sechdr->sh_addralign;
+		int should_shalign;
+		int k;
+
+		/* ignore some sections */
+		if ((sechdr->sh_type == SHT_NULL) ||
+		    !(sechdr->sh_flags & SHF_ALLOC))
+			continue;
+
+		/* default alignment is 8 for 64-bit and 4 for 32-bit targets */
+		should_shalign = target_64bit ? 8 : 4;
+		if (match(sec, prel32_sec_list))
+			should_shalign = target_prel32_relocations ? 4 : should_shalign;
+		else if (strstr(sec, "text")) /* assume text segments to be 4-byte aligned */
+			should_shalign = 4;
+		else if (match(sec, byte_sec_list)) /* any alignment is OK. */
+			continue;
+		else {
+			/* search in list with special alignment */
+			k = 0;
+			while (special_list[k].align && strstarts(sec, special_list[k].name)) {
+				should_shalign = special_list[k].align;
+				break;
+			}
+		}
+
+		if (is_shalign >= should_shalign)
+			continue;
+
+		warn("%s: section %s (type %d, flags %lu) has alignment of %d, expected at least %d.\n"
+		     "Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?\n",
+		     modname, sec, sechdr->sh_type, sechdr->sh_flags, is_shalign, should_shalign);
+	}
+}
+
 static char *remove_dot(char *s)
 {
 	size_t n = strcspn(s, ".");
@@ -1653,6 +1727,8 @@ static void read_symbols(const char *modname)
 		handle_moddevtable(mod, &info, sym, symname);
 	}

+	check_sec_alignment(mod, &info);
+
 	check_sec_ref(mod, &info);

 	if (!mod->is_vmlinux) {
@@ -2183,7 +2259,7 @@ int main(int argc, char **argv)
 	LIST_HEAD(dump_lists);
 	struct dump_list *dl, *dl2;

-	while ((opt = getopt(argc, argv, "ei:MmnT:to:au:WwENd:")) != -1) {
+	while ((opt = getopt(argc, argv, "ei:MmnT:to:au:WwENd:6R")) != -1) {
 		switch (opt) {
 		case 'e':
 			external_module = true;
@@ -2232,6 +2308,12 @@ int main(int argc, char **argv)
 		case 'd':
 			missing_namespace_deps = optarg;
 			break;
+		case '6':
+			target_64bit = true;
+			break;
+		case 'R':
+			target_prel32_relocations = true;
+			break;
 		default:
 			exit(1);
 		}
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 70c69db6a17c..b09ab081dc03 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1510,15 +1510,24 @@ static void check_sec_alignment(struct module *mod, struct elf_info *elf)
 		".smp_locks",
 		NULL
 	};
-	/* sections that are fine with any/1-byte alignment */
+	/* sections that are fine with any alignment */
 	static const char *const byte_sec_list[] = {
 		".modinfo",
 		".init.ramfs",
 		NULL
 	};
-	/* sections with special alignment */
+	/* sections with special given minimal alignment. Checked with
+	 * startswith(). */
 	static struct { int align; const char *name; } const special_list[] = {
 		{ 64,	".rodata.cst2" },
+		{ 2,	".altinstructions" }, /* at least on x86 */
+		{ 1,	".altinstr_replacement" },
+		{ 1,	".altinstr_aux" },
+		{ 4,	".call_sites" },
+		{ 4,	".return_sites" },
+		{ 1,	".note.Linux" },	/* correct? */
+		{ 4,	"__ex_table" },
+		{ 4,	".initcall" },		/* at least 4 ? */
 		{ 0,	NULL }
 	};

@@ -1545,16 +1554,17 @@ static void check_sec_alignment(struct module *mod, struct elf_info *elf)
 		should_shalign = target_64bit ? 8 : 4;
 		if (match(sec, prel32_sec_list))
 			should_shalign = target_prel32_relocations ? 4 : should_shalign;
-		else if (strstr(sec, "text")) /* assume text segments to be 4-byte aligned */
-			should_shalign = 4;
 		else if (match(sec, byte_sec_list)) /* any alignment is OK. */
 			continue;
+		else if (strstr(sec, "text")) /* assume text segments to be 4-byte aligned */
+			should_shalign = 4;
 		else {
 			/* search in list with special alignment */
-			k = 0;
-			while (special_list[k].align && strstarts(sec, special_list[k].name)) {
-				should_shalign = special_list[k].align;
-				break;
+			for (k = 0; special_list[k].align; k++) {
+				if (strstarts(sec, special_list[k].name)) {
+					should_shalign = special_list[k].align;
+					break;
+				}
 			}
 		}

Helge Deller (4):
  linux/export: Fix alignment for 64-bit ksymtab entries
  modules: Ensure 64-bit alignment on __ksymtab_* sections
  vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and
    .pci_fixup sections
  modules: Add missing entry for __ex_table

 include/asm-generic/vmlinux.lds.h | 5 +++++
 include/linux/export-internal.h   | 5 ++++-
 scripts/module.lds.S              | 9 +++++----
 3 files changed, 14 insertions(+), 5 deletions(-)

-- 
2.41.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries
  2023-11-22 22:18 [PATCH 0/4] Section alignment issues? deller
@ 2023-11-22 22:18 ` deller
  2023-12-21 10:22   ` Masahiro Yamada
  2023-11-22 22:18 ` [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections deller
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: deller @ 2023-11-22 22:18 UTC (permalink / raw)
  To: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules,
	linux-arch, Luis Chamberlain

From: Helge Deller <deller@gmx.de>

An alignment of 4 bytes is wrong for 64-bit platforms which don't define
CONFIG_HAVE_ARCH_PREL32_RELOCATIONS (which then store 64-bit pointers).
Fix their alignment to 8 bytes.

Signed-off-by: Helge Deller <deller@gmx.de>
---
 include/linux/export-internal.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/export-internal.h b/include/linux/export-internal.h
index 69501e0ec239..cd253eb51d6c 100644
--- a/include/linux/export-internal.h
+++ b/include/linux/export-internal.h
@@ -16,10 +16,13 @@
  * and eliminates the need for absolute relocations that require runtime
  * processing on relocatable kernels.
  */
+#define __KSYM_ALIGN		".balign 4"
 #define __KSYM_REF(sym)		".long " #sym "- ."
 #elif defined(CONFIG_64BIT)
+#define __KSYM_ALIGN		".balign 8"
 #define __KSYM_REF(sym)		".quad " #sym
 #else
+#define __KSYM_ALIGN		".balign 4"
 #define __KSYM_REF(sym)		".long " #sym
 #endif
 
@@ -42,7 +45,7 @@
 	    "	.asciz \"" ns "\""					"\n"	\
 	    "	.previous"						"\n"	\
 	    "	.section \"___ksymtab" sec "+" #name "\", \"a\""	"\n"	\
-	    "	.balign	4"						"\n"	\
+		__KSYM_ALIGN						"\n"	\
 	    "__ksymtab_" #name ":"					"\n"	\
 		__KSYM_REF(sym)						"\n"	\
 		__KSYM_REF(__kstrtab_ ##name)				"\n"	\
-- 
2.41.0



* [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
  2023-11-22 22:18 [PATCH 0/4] Section alignment issues? deller
  2023-11-22 22:18 ` [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries deller
@ 2023-11-22 22:18 ` deller
  2023-12-22  5:59   ` Luis Chamberlain
  2023-11-22 22:18 ` [PATCH 3/4] vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and .pci_fixup sections deller
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: deller @ 2023-11-22 22:18 UTC (permalink / raw)
  To: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules,
	linux-arch, Luis Chamberlain

From: Helge Deller <deller@gmx.de>

On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
(e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
64-bit pointers into the __ksymtab* sections.
Make sure that those sections will be correctly aligned at module link time,
otherwise unaligned memory accesses may happen at runtime.

The __kcrctab* sections store 32-bit entities, so use ALIGN(4) for those.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v6.0+
---
 scripts/module.lds.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/module.lds.S b/scripts/module.lds.S
index bf5bcf2836d8..b00415a9ff27 100644
--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -15,10 +15,10 @@ SECTIONS {
 		*(.discard.*)
 	}
 
-	__ksymtab		0 : { *(SORT(___ksymtab+*)) }
-	__ksymtab_gpl		0 : { *(SORT(___ksymtab_gpl+*)) }
-	__kcrctab		0 : { *(SORT(___kcrctab+*)) }
-	__kcrctab_gpl		0 : { *(SORT(___kcrctab_gpl+*)) }
+	__ksymtab		0 : ALIGN(8) { *(SORT(___ksymtab+*)) }
+	__ksymtab_gpl		0 : ALIGN(8) { *(SORT(___ksymtab_gpl+*)) }
+	__kcrctab		0 : ALIGN(4) { *(SORT(___kcrctab+*)) }
+	__kcrctab_gpl		0 : ALIGN(4) { *(SORT(___kcrctab_gpl+*)) }
 
 	.ctors			0 : ALIGN(8) { *(SORT(.ctors.*)) *(.ctors) }
 	.init_array		0 : ALIGN(8) { *(SORT(.init_array.*)) *(.init_array) }
-- 
2.41.0



* [PATCH 3/4] vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and .pci_fixup sections
  2023-11-22 22:18 [PATCH 0/4] Section alignment issues? deller
  2023-11-22 22:18 ` [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries deller
  2023-11-22 22:18 ` [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections deller
@ 2023-11-22 22:18 ` deller
  2023-12-21 13:07   ` Masahiro Yamada
  2023-11-22 22:18 ` [PATCH 4/4] modules: Add missing entry for __ex_table deller
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: deller @ 2023-11-22 22:18 UTC (permalink / raw)
  To: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules,
	linux-arch, Luis Chamberlain

From: Helge Deller <deller@gmx.de>

On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
(e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
64-bit pointers into the __ksymtab* sections.
Make sure that the start of those sections is 64-bit aligned in the vmlinux
executable, otherwise unaligned memory accesses may happen at runtime.

The __kcrctab* sections store 32-bit entities, so make those sections
32-bit aligned.

The pci fixup routines want to be 64-bit aligned on 64-bit platforms
which don't define CONFIG_HAVE_ARCH_PREL32_RELOCATIONS. An alignment
of 8 bytes is sufficient to guarantee aligned accesses at runtime.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v6.0+
---
 include/asm-generic/vmlinux.lds.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index bae0fe4d499b..fa4335346e7d 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -467,6 +467,7 @@
 	}								\
 									\
 	/* PCI quirks */						\
+	. = ALIGN(8);							\
 	.pci_fixup        : AT(ADDR(.pci_fixup) - LOAD_OFFSET) {	\
 		BOUNDED_SECTION_PRE_LABEL(.pci_fixup_early,  _pci_fixups_early,  __start, __end) \
 		BOUNDED_SECTION_PRE_LABEL(.pci_fixup_header, _pci_fixups_header, __start, __end) \
@@ -484,6 +485,7 @@
 	PRINTK_INDEX							\
 									\
 	/* Kernel symbol table: Normal symbols */			\
+	. = ALIGN(8);							\
 	__ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {		\
 		__start___ksymtab = .;					\
 		KEEP(*(SORT(___ksymtab+*)))				\
@@ -491,6 +493,7 @@
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
+	. = ALIGN(8);							\
 	__ksymtab_gpl     : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) {	\
 		__start___ksymtab_gpl = .;				\
 		KEEP(*(SORT(___ksymtab_gpl+*)))				\
@@ -498,6 +501,7 @@
 	}								\
 									\
 	/* Kernel symbol table: Normal symbols */			\
+	. = ALIGN(4);							\
 	__kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {		\
 		__start___kcrctab = .;					\
 		KEEP(*(SORT(___kcrctab+*)))				\
@@ -505,6 +509,7 @@
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
+	. = ALIGN(4);							\
 	__kcrctab_gpl     : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) {	\
 		__start___kcrctab_gpl = .;				\
 		KEEP(*(SORT(___kcrctab_gpl+*)))				\
-- 
2.41.0



* [PATCH 4/4] modules: Add missing entry for __ex_table
  2023-11-22 22:18 [PATCH 0/4] Section alignment issues? deller
                   ` (2 preceding siblings ...)
  2023-11-22 22:18 ` [PATCH 3/4] vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and .pci_fixup sections deller
@ 2023-11-22 22:18 ` deller
  2024-01-29 18:50   ` Luis Chamberlain
  2023-12-19 21:26 ` [PATCH 0/4] Section alignment issues? Luis Chamberlain
  2023-12-21 13:40 ` Masahiro Yamada
  5 siblings, 1 reply; 31+ messages in thread
From: deller @ 2023-11-22 22:18 UTC (permalink / raw)
  To: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules,
	linux-arch, Luis Chamberlain

From: Helge Deller <deller@gmx.de>

The entry for __ex_table was missing, which may make __ex_table
become 1- or 2-byte aligned in modules.
Add the entry to ensure it gets 32-bit aligned.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v6.0+
---
 scripts/module.lds.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/module.lds.S b/scripts/module.lds.S
index b00415a9ff27..488f61b156b2 100644
--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -26,6 +26,7 @@ SECTIONS {
 	.altinstructions	0 : ALIGN(8) { KEEP(*(.altinstructions)) }
 	__bug_table		0 : ALIGN(8) { KEEP(*(__bug_table)) }
 	__jump_table		0 : ALIGN(8) { KEEP(*(__jump_table)) }
+	__ex_table		0 : ALIGN(4) { KEEP(*(__ex_table)) }
 
 	__patchable_function_entries : { *(__patchable_function_entries) }
 
-- 
2.41.0



* Re: [PATCH 0/4] Section alignment issues?
  2023-11-22 22:18 [PATCH 0/4] Section alignment issues? deller
                   ` (3 preceding siblings ...)
  2023-11-22 22:18 ` [PATCH 4/4] modules: Add missing entry for __ex_table deller
@ 2023-12-19 21:26 ` Luis Chamberlain
  2023-12-20 19:40   ` Luis Chamberlain
  2023-12-21 13:40 ` Masahiro Yamada
  5 siblings, 1 reply; 31+ messages in thread
From: Luis Chamberlain @ 2023-12-19 21:26 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules, linux-arch

On Wed, Nov 22, 2023 at 11:18:10PM +0100, deller@kernel.org wrote:
> From: Helge Deller <deller@gmx.de>
> My questions:
> - Am I wrong with my analysis?

This would typically of course depend on the arch, but whether it helps
is what I would like to see with real numbers rather than speculation.
However, I don't expect some of these are hot paths, except maybe the
table lookups. So could you look at some perf stat differences
without / with alignment? Other than bootup, live patching would be
a good test case. We have selftests for modules; the script
tools/testing/selftests/kmod/kmod.sh is pretty aggressive, but the live
patching tests might be better suited.

> - What do people see on other architectures?
> - Does it make sense to add a compile- and runtime-check, like the patch below, to the kernel?

The chatty aspects really depend on the above results.

Aren't there some archs where an unaligned access would actually crash?
Why hasn't that happened?

  Luis


* Re: [PATCH 0/4] Section alignment issues?
  2023-12-19 21:26 ` [PATCH 0/4] Section alignment issues? Luis Chamberlain
@ 2023-12-20 19:40   ` Luis Chamberlain
  2023-12-22  9:13     ` Helge Deller
  0 siblings, 1 reply; 31+ messages in thread
From: Luis Chamberlain @ 2023-12-20 19:40 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules, linux-arch

On Tue, Dec 19, 2023 at 01:26:49PM -0800, Luis Chamberlain wrote:
> On Wed, Nov 22, 2023 at 11:18:10PM +0100, deller@kernel.org wrote:
> > From: Helge Deller <deller@gmx.de>
> > My questions:
> > - Am I wrong with my analysis?
> 
> This would typically of course depend on the arch, but whether it helps
> is what I would like to see with real numbers rather than speculation.
> However, I don't expect some of these are hot paths, except maybe the
> table lookups. So could you look at some perf stat differences
> without / with alignment? Other than bootup, live patching would be
> a good test case. We have selftests for modules; the script
> tools/testing/selftests/kmod/kmod.sh is pretty aggressive, but the live
> patching tests might be better suited.
> 
> > - What do people see on other architectures?
> > - Does it make sense to add a compile- and runtime-check, like the patch below, to the kernel?
> 
> The chatty aspects really depend on the above results.
> 
> Aren't there some archs where an unaligned access would actually crash?
> Why hasn't that happened?

I've gone down memory lane, and we have discussed this before.

It would seem this misalignment should not affect performance, and this
should not be an issue unless you have a buggy exception handler. We
actually ran into one before. Please refer to merge commit

e74acdf55da6649dd30be5b621a93b71cbe7f3f9

  Luis


* Re: [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries
  2023-11-22 22:18 ` [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries deller
@ 2023-12-21 10:22   ` Masahiro Yamada
  2023-12-21 16:01     ` Masahiro Yamada
  0 siblings, 1 reply; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-21 10:22 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch, Luis Chamberlain

On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
>
> From: Helge Deller <deller@gmx.de>
>
> An alignment of 4 bytes is wrong for 64-bit platforms which don't define
> CONFIG_HAVE_ARCH_PREL32_RELOCATIONS (which then store 64-bit pointers).
> Fix their alignment to 8 bytes.
>
> Signed-off-by: Helge Deller <deller@gmx.de>


This is correct.

Acked-by: Masahiro Yamada <masahiroy@kernel.org>

Please add

Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")

> ---
>  include/linux/export-internal.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/export-internal.h b/include/linux/export-internal.h
> index 69501e0ec239..cd253eb51d6c 100644
> --- a/include/linux/export-internal.h
> +++ b/include/linux/export-internal.h
> @@ -16,10 +16,13 @@
>   * and eliminates the need for absolute relocations that require runtime
>   * processing on relocatable kernels.
>   */
> +#define __KSYM_ALIGN           ".balign 4"
>  #define __KSYM_REF(sym)                ".long " #sym "- ."
>  #elif defined(CONFIG_64BIT)
> +#define __KSYM_ALIGN           ".balign 8"
>  #define __KSYM_REF(sym)                ".quad " #sym
>  #else
> +#define __KSYM_ALIGN           ".balign 4"
>  #define __KSYM_REF(sym)                ".long " #sym
>  #endif
>
> @@ -42,7 +45,7 @@
>             "   .asciz \"" ns "\""                                      "\n"    \
>             "   .previous"                                              "\n"    \
>             "   .section \"___ksymtab" sec "+" #name "\", \"a\""        "\n"    \
> -           "   .balign 4"                                              "\n"    \
> +               __KSYM_ALIGN                                            "\n"    \
>             "__ksymtab_" #name ":"                                      "\n"    \
>                 __KSYM_REF(sym)                                         "\n"    \
>                 __KSYM_REF(__kstrtab_ ##name)                           "\n"    \
> --
> 2.41.0
>


-- 
Best Regards
Masahiro Yamada


* Re: [PATCH 3/4] vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and .pci_fixup sections
  2023-11-22 22:18 ` [PATCH 3/4] vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and .pci_fixup sections deller
@ 2023-12-21 13:07   ` Masahiro Yamada
  2023-12-22  9:02     ` Helge Deller
  0 siblings, 1 reply; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-21 13:07 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch, Luis Chamberlain

On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
>
> From: Helge Deller <deller@gmx.de>
>
> On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> (e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
> 64-bit pointers into the __ksymtab* sections.
> Make sure that the start of those sections is 64-bit aligned in the vmlinux
> executable, otherwise unaligned memory accesses may happen at runtime.


Are you solving a real problem?


1/4 already ensures the proper alignment of __ksymtab*, doesn't it?



I applied the following hack to attempt to
break the alignment intentionally.


diff --git a/include/asm-generic/vmlinux.lds.h
b/include/asm-generic/vmlinux.lds.h
index bae0fe4d499b..e2b5c9acee97 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -482,7 +482,7 @@
        TRACEDATA                                                       \
                                                                        \
        PRINTK_INDEX                                                    \
-                                                                       \
+       . = . + 1;                                                      \
        /* Kernel symbol table: Normal symbols */                       \
        __ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {         \
                __start___ksymtab = .;                                  \




The __ksymtab section and __start___ksymtab symbol
are still properly aligned due to the '.balign'
in <linux/export-internal.h>



So, my understanding is this patch is unneeded.


Or, does the behaviour depend on toolchains?

> The __kcrctab* sections store 32-bit entities, so make those sections
> 32-bit aligned.
>
> The pci fixup routines want to be 64-bit aligned on 64-bit platforms
> which don't define CONFIG_HAVE_ARCH_PREL32_RELOCATIONS. An alignment
> of 8 bytes is sufficient to guarantee aligned accesses at runtime.
>
> Signed-off-by: Helge Deller <deller@gmx.de>
> Cc: <stable@vger.kernel.org> # v6.0+

> ---
>  include/asm-generic/vmlinux.lds.h | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index bae0fe4d499b..fa4335346e7d 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -467,6 +467,7 @@
>         }                                                               \
>                                                                         \
>         /* PCI quirks */                                                \
> +       . = ALIGN(8);                                                   \
>         .pci_fixup        : AT(ADDR(.pci_fixup) - LOAD_OFFSET) {        \
>                 BOUNDED_SECTION_PRE_LABEL(.pci_fixup_early,  _pci_fixups_early,  __start, __end) \
>                 BOUNDED_SECTION_PRE_LABEL(.pci_fixup_header, _pci_fixups_header, __start, __end) \
> @@ -484,6 +485,7 @@
>         PRINTK_INDEX                                                    \
>                                                                         \
>         /* Kernel symbol table: Normal symbols */                       \
> +       . = ALIGN(8);                                                   \
>         __ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {         \
>                 __start___ksymtab = .;                                  \
>                 KEEP(*(SORT(___ksymtab+*)))                             \
> @@ -491,6 +493,7 @@
>         }                                                               \
>                                                                         \
>         /* Kernel symbol table: GPL-only symbols */                     \
> +       . = ALIGN(8);                                                   \
>         __ksymtab_gpl     : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) {     \
>                 __start___ksymtab_gpl = .;                              \
>                 KEEP(*(SORT(___ksymtab_gpl+*)))                         \
> @@ -498,6 +501,7 @@
>         }                                                               \
>                                                                         \
>         /* Kernel symbol table: Normal symbols */                       \
> +       . = ALIGN(4);                                                   \
>         __kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {         \
>                 __start___kcrctab = .;                                  \
>                 KEEP(*(SORT(___kcrctab+*)))                             \
> @@ -505,6 +509,7 @@
>         }                                                               \
>                                                                         \
>         /* Kernel symbol table: GPL-only symbols */                     \
> +       . = ALIGN(4);                                                   \
>         __kcrctab_gpl     : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) {     \
>                 __start___kcrctab_gpl = .;                              \
>                 KEEP(*(SORT(___kcrctab_gpl+*)))                         \
> --
> 2.41.0
>


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/4] Section alignment issues?
  2023-11-22 22:18 [PATCH 0/4] Section alignment issues? deller
                   ` (4 preceding siblings ...)
  2023-12-19 21:26 ` [PATCH 0/4] Section alignment issues? Luis Chamberlain
@ 2023-12-21 13:40 ` Masahiro Yamada
  2023-12-21 15:42   ` Masahiro Yamada
  5 siblings, 1 reply; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-21 13:40 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch, Luis Chamberlain

On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
>
> From: Helge Deller <deller@gmx.de>
>
> While working on the 64-bit parisc kernel, I noticed that the __ksymtab[]
> table was not correctly 64-bit aligned in many modules.
> The following patches do fix some of those issues in the generic code.
>
> But further investigation shows that multiple sections in the kernel and in
> modules are possibly not correctly aligned, and thus may lead to performance
> degradations at runtime (small on x86, huge on parisc, sparc and others which
> need exception handlers). Sometimes wrong alignments may also be simply hidden
> by the linker or kernel module loader which pulls in the sections by luck with
> a correct alignment (e.g. because the previous section was aligned already).
>
> An objdump on a x86 module shows e.g.:
>
> ./kernel/net/netfilter/nf_log_syslog.ko:     file format elf64-x86-64
> Sections:
> Idx Name          Size      VMA               LMA               File off  Algn
>   0 .text         00001fdf  0000000000000000  0000000000000000  00000040  2**4
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>   1 .init.text    000000f6  0000000000000000  0000000000000000  00002020  2**4
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>   2 .exit.text    0000005c  0000000000000000  0000000000000000  00002120  2**4
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>   3 .rodata.str1.8 000000dc  0000000000000000  0000000000000000  00002180  2**3
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   4 .rodata.str1.1 0000030a  0000000000000000  0000000000000000  0000225c  2**0
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   5 .rodata       000000b0  0000000000000000  0000000000000000  00002580  2**5
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   6 .modinfo      0000019e  0000000000000000  0000000000000000  00002630  2**0
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   7 .return_sites 00000034  0000000000000000  0000000000000000  000027ce  2**0
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>   8 .call_sites   0000029c  0000000000000000  0000000000000000  00002802  2**0
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>
> In this example I believe the ".return_sites" and ".call_sites" should have
> an alignment of at least 32-bit (4 bytes).
>
> On other architectures or modules other sections like ".altinstructions" or
> "__ex_table" may show up wrongly instead.
>
> In general I think it would be beneficial to search for wrong alignments at
> link time, and maybe at runtime.
>
> The patch at the end of this cover letter
> - adds compile time checks to the "modpost" tool, and
> - adds a runtime check to the kernel module loader at runtime.
> And it will possibly show false positives too (!!!)
> I do understand that some of those sections are not performance critical
> and thus any alignment is OK.
>
> The modpost patch will emit at compile time such warnings (on x86-64 kernel build):
>
> WARNING: modpost: vmlinux: section .initcall7.init (type 1, flags 2) has alignment of 1, expected at least 4.
> Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?
> WARNING: modpost: vmlinux: section .altinstructions (type 1, flags 2) has alignment of 1, expected at least 2.
> WARNING: modpost: vmlinux: section .initcall6.init (type 1, flags 2) has alignment of 1, expected at least 4.
> WARNING: modpost: vmlinux: section .initcallearly.init (type 1, flags 2) has alignment of 1, expected at least 4.
> WARNING: modpost: vmlinux: section .rodata.cst2 (type 1, flags 18) has alignment of 2, expected at least 64.
> WARNING: modpost: vmlinux: section .static_call_tramp_key (type 1, flags 2) has alignment of 1, expected at least 8.
> WARNING: modpost: vmlinux: section .con_initcall.init (type 1, flags 2) has alignment of 1, expected at least 8.
> WARNING: modpost: vmlinux: section __bug_table (type 1, flags 3) has alignment of 1, expected at least 4.
> ...




modpost acts on vmlinux.o instead of vmlinux.


vmlinux.o is a relocatable ELF, which is not a real layout
because no linker script has been considered yet at this
point.


vmlinux is an executable ELF, produced by a linker,
with the linker script taken into consideration
to determine the final section/symbol layout.


So, checking this in modpost is meaningless.



I did not check the module checking code, but
modules are also relocatable ELF.


>
> while the kernel module loader may show at runtime:
>
> ** WARNING **:   module: efivarfs, dest=0xffffffffc02d08d2, section=.retpoline_sites possibly not correctly aligned.
> ** WARNING **:   module: efivarfs, dest=0xffffffffc02d0bae, section=.ibt_endbr_seal possibly not correctly aligned.
> ** WARNING **:   module: efivarfs, dest=0xffffffffc02d0be6, section=.orc_unwind possibly not correctly aligned.
> ** WARNING **:   module: efivarfs, dest=0xffffffffc02d12a6, section=.orc_unwind_ip possibly not correctly aligned.
> ** WARNING **:   module: efivarfs, dest=0xffffffffc02d178c, section=.note.Linux possibly not correctly aligned.
> ** WARNING **:   module: efivarfs, dest=0xffffffffc02d17bc, section=.orc_header possibly not correctly aligned.
> ** WARNING **:   module: xt_addrtype, dest=0xffffffffc01ef18a, size=0x00000020, section=.return_sites
>
> My questions:
> - Am I wrong with my analysis?
> - What does people see on other architectures?
> - Does it make sense to add a compile- and runtime-check, like the patch below, to the kernel?
>
> Helge
>
> ---
>
> From: Helge Deller <deller@gmx.de>
> Subject: [PATCH] MODPOST: Detect and report possible section alignment issues
>
> IMPORTANT: THIS PATCH IS NOT INTENDED TO BE APPLIED !!!!
>
> The main idea here is to check at compile time (during modpost run) and at
> runtime (when loading a module) if the sections in kernel modules (and
> vmlinux) are correctly aligned in memory and report if a wrong alignment is
> suspected.
>
> It WILL report false positives. Many section names still need to be added to
> the various tables.
> But even at this stage it gives some insight at the various possibly wrong
> section alignments.
>
> Signed-off-by: Helge Deller <deller@gmx.de>
>
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 98fedfdb8db5..88201d6b0c17 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -2277,6 +2277,10 @@ static int move_module(struct module *mod, struct load_info *info)
>                                 ret = -ENOEXEC;
>                                 goto out_enomem;
>                         }
> +                       if (((uintptr_t)dest) & (BITS_PER_LONG/8 - 1)) {
> +                               printk("** WARNING **: \tmodule: %s, dest=0x%lx, section=%s possibly not correctly aligned.\n",
> +                                       mod->name, (long)dest, info->secstrings + shdr->sh_name);
> +                       }
>                         memcpy(dest, (void *)shdr->sh_addr, shdr->sh_size);
>                 }
>                 /*
> diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
> index 739402f45509..2add144a05e3 100644
> --- a/scripts/Makefile.modpost
> +++ b/scripts/Makefile.modpost
> @@ -49,6 +49,8 @@ modpost-args =                                                                                \
>         $(if $(KBUILD_NSDEPS),-d $(MODULES_NSDEPS))                                     \
>         $(if $(CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS)$(KBUILD_NSDEPS),-N)       \
>         $(if $(findstring 1, $(KBUILD_EXTRA_WARN)),-W)                                  \
> +       $(if $(CONFIG_64BIT),-6)                                                        \
> +       $(if $(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS),-R)                                 \
>         -o $@
>
>  modpost-deps := $(MODPOST)
> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> index cb6406f485a9..70c69db6a17c 100644
> --- a/scripts/mod/modpost.c
> +++ b/scripts/mod/modpost.c
> @@ -43,6 +43,10 @@ static bool ignore_missing_files;
>  /* If set to 1, only warn (instead of error) about missing ns imports */
>  static bool allow_missing_ns_imports;
>
> +/* is target a 64-bit platform and has it prel32 relocation support? */
> +static bool target_64bit;
> +static bool target_prel32_relocations;
> +
>  static bool error_occurred;
>
>  static bool extra_warn;
> @@ -1493,6 +1497,76 @@ static void check_sec_ref(struct module *mod, struct elf_info *elf)
>         }
>  }
>
> +/**
> + * Check alignment of sections in modules.
> + **/
> +static void check_sec_alignment(struct module *mod, struct elf_info *elf)
> +{
> +       /* sections that may use PREL32 relocations and only need 4-byte alignment */
> +       static const char *const prel32_sec_list[] = {
> +               "__tracepoints_ptrs",
> +               "__ksymtab",
> +               "__bug_table",
> +               ".smp_locks",
> +               NULL
> +       };
> +       /* sections that are fine with any/1-byte alignment */
> +       static const char *const byte_sec_list[] = {
> +               ".modinfo",
> +               ".init.ramfs",
> +               NULL
> +       };
> +       /* sections with special alignment */
> +       static struct { int align; const char *name; } const special_list[] = {
> +               { 64,   ".rodata.cst2" },
> +               { 0,    NULL }
> +       };
> +
> +       int i;
> +
> +       /* ignore vmlinux for now? */
> +       // if (mod->is_vmlinux) return;
> +
> +       /* Walk through all sections */
> +       for (i = 0; i < elf->num_sections; i++) {
> +               Elf_Shdr *sechdr = &elf->sechdrs[i];
> +               const char *sec = sech_name(elf, sechdr);
> +               const char *modname = mod->name;
> +               const int is_shalign = sechdr->sh_addralign;
> +               int should_shalign;
> +               int k;
> +
> +               /* ignore some sections */
> +               if ((sechdr->sh_type == SHT_NULL) ||
> +                   !(sechdr->sh_flags & SHF_ALLOC))
> +                       continue;
> +
> +               /* default alignment is 8 for 64-bit and 4 for 32-bit targets */
> +               should_shalign = target_64bit ? 8 : 4;
> +               if (match(sec, prel32_sec_list))
> +                       should_shalign = target_prel32_relocations ? 4 : should_shalign;
> +               else if (strstr(sec, "text")) /* assume text segments to be 4-byte aligned */
> +                       should_shalign = 4;
> +               else if (match(sec, byte_sec_list)) /* any alignment is OK. */
> +                       continue;
> +               else {
> +                       /* search in list with special alignment */
> +                       k = 0;
> +                       while (special_list[k].align && strstarts(sec, special_list[k].name)) {
> +                               should_shalign = special_list[k].align;
> +                               break;
> +                       }
> +               }
> +
> +               if (is_shalign  >= should_shalign)
> +                       continue;
> +
> +               warn("%s: section %s (type %d, flags %lu) has alignment of %d, expected at least %d.\n"
> +                    "Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?\n",
> +                    modname, sec, sechdr->sh_type, sechdr->sh_flags, is_shalign, should_shalign);
> +       }
> +}
> +
>  static char *remove_dot(char *s)
>  {
>         size_t n = strcspn(s, ".");
> @@ -1653,6 +1727,8 @@ static void read_symbols(const char *modname)
>                 handle_moddevtable(mod, &info, sym, symname);
>         }
>
> +       check_sec_alignment(mod, &info);
> +
>         check_sec_ref(mod, &info);
>
>         if (!mod->is_vmlinux) {
> @@ -2183,7 +2259,7 @@ int main(int argc, char **argv)
>         LIST_HEAD(dump_lists);
>         struct dump_list *dl, *dl2;
>
> -       while ((opt = getopt(argc, argv, "ei:MmnT:to:au:WwENd:")) != -1) {
> +       while ((opt = getopt(argc, argv, "ei:MmnT:to:au:WwENd:6R")) != -1) {
>                 switch (opt) {
>                 case 'e':
>                         external_module = true;
> @@ -2232,6 +2308,12 @@ int main(int argc, char **argv)
>                 case 'd':
>                         missing_namespace_deps = optarg;
>                         break;
> +               case '6':
> +                       target_64bit = true;
> +                       break;
> +               case 'R':
> +                       target_prel32_relocations = true;
> +                       break;
>                 default:
>                         exit(1);
>                 }
> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> index 70c69db6a17c..b09ab081dc03 100644
> --- a/scripts/mod/modpost.c
> +++ b/scripts/mod/modpost.c
> @@ -1510,15 +1510,24 @@ static void check_sec_alignment(struct module *mod, struct elf_info *elf)
>                 ".smp_locks",
>                 NULL
>         };
> -       /* sections that are fine with any/1-byte alignment */
> +       /* sections that are fine with any alignment */
>         static const char *const byte_sec_list[] = {
>                 ".modinfo",
>                 ".init.ramfs",
>                 NULL
>         };
> -       /* sections with special alignment */
> +       /* sections with special given minimal alignment. Checked with
> +        * startswith(). */
>         static struct { int align; const char *name; } const special_list[] = {
>                 { 64,   ".rodata.cst2" },
> +               { 2,    ".altinstructions" }, /* at least on x86 */
> +               { 1,    ".altinstr_replacement" },
> +               { 1,    ".altinstr_aux" },
> +               { 4,    ".call_sites" },
> +               { 4,    ".return_sites" },
> +               { 1,    ".note.Linux" },        /* correct? */
> +               { 4,    "__ex_table" },
> +               { 4,    ".initcall" },          /* at least 4 ? */
>                 { 0,    NULL }
>         };
>
> @@ -1545,16 +1554,17 @@ static void check_sec_alignment(struct module *mod, struct elf_info *elf)
>                 should_shalign = target_64bit ? 8 : 4;
>                 if (match(sec, prel32_sec_list))
>                         should_shalign = target_prel32_relocations ? 4 : should_shalign;
> -               else if (strstr(sec, "text")) /* assume text segments to be 4-byte aligned */
> -                       should_shalign = 4;
>                 else if (match(sec, byte_sec_list)) /* any alignment is OK. */
>                         continue;
> +               else if (strstr(sec, "text")) /* assume text segments to be 4-byte aligned */
> +                       should_shalign = 4;
>                 else {
>                         /* search in list with special alignment */
> -                       k = 0;
> -                       while (special_list[k].align && strstarts(sec, special_list[k].name)) {
> -                               should_shalign = special_list[k].align;
> -                               break;
> +                       for (k = 0; special_list[k].align; k++) {
> +                               if (strstarts(sec, special_list[k].name)) {
> +                                       should_shalign = special_list[k].align;
> +                                       break;
> +                               }
>                         }
>                 }
>
> Helge Deller (4):
>   linux/export: Fix alignment for 64-bit ksymtab entries
>   modules: Ensure 64-bit alignment on __ksymtab_* sections
>   vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and
>     .pci_fixup sections
>   modules: Add missing entry for __ex_table
>
>  include/asm-generic/vmlinux.lds.h | 5 +++++
>  include/linux/export-internal.h   | 5 ++++-
>  scripts/module.lds.S              | 9 +++++----
>  3 files changed, 14 insertions(+), 5 deletions(-)
>
> --
> 2.41.0
>


--
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/4] Section alignment issues?
  2023-12-21 13:40 ` Masahiro Yamada
@ 2023-12-21 15:42   ` Masahiro Yamada
  2023-12-22  8:23     ` Helge Deller
  2023-12-22  9:48     ` David Laight
  0 siblings, 2 replies; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-21 15:42 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch, Luis Chamberlain

On Thu, Dec 21, 2023 at 10:40 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
> >
> > From: Helge Deller <deller@gmx.de>
> >
> > While working on the 64-bit parisc kernel, I noticed that the __ksymtab[]
> > table was not correctly 64-bit aligned in many modules.
> > The following patches do fix some of those issues in the generic code.
> >
> > But further investigation shows that multiple sections in the kernel and in
> > modules are possibly not correctly aligned, and thus may lead to performance
> > degradations at runtime (small on x86, huge on parisc, sparc and others which
> > need exception handlers). Sometimes wrong alignments may also be simply hidden
> > by the linker or kernel module loader which pulls in the sections by luck with
> > a correct alignment (e.g. because the previous section was aligned already).
> >
> > An objdump on a x86 module shows e.g.:
> >
> > ./kernel/net/netfilter/nf_log_syslog.ko:     file format elf64-x86-64
> > Sections:
> > Idx Name          Size      VMA               LMA               File off  Algn
> >   0 .text         00001fdf  0000000000000000  0000000000000000  00000040  2**4
> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >   1 .init.text    000000f6  0000000000000000  0000000000000000  00002020  2**4
> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >   2 .exit.text    0000005c  0000000000000000  0000000000000000  00002120  2**4
> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >   3 .rodata.str1.8 000000dc  0000000000000000  0000000000000000  00002180  2**3
> >                   CONTENTS, ALLOC, LOAD, READONLY, DATA
> >   4 .rodata.str1.1 0000030a  0000000000000000  0000000000000000  0000225c  2**0
> >                   CONTENTS, ALLOC, LOAD, READONLY, DATA
> >   5 .rodata       000000b0  0000000000000000  0000000000000000  00002580  2**5
> >                   CONTENTS, ALLOC, LOAD, READONLY, DATA
> >   6 .modinfo      0000019e  0000000000000000  0000000000000000  00002630  2**0
> >                   CONTENTS, ALLOC, LOAD, READONLY, DATA
> >   7 .return_sites 00000034  0000000000000000  0000000000000000  000027ce  2**0
> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
> >   8 .call_sites   0000029c  0000000000000000  0000000000000000  00002802  2**0
> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
> >
> > In this example I believe the ".return_sites" and ".call_sites" should have
> > an alignment of at least 32-bit (4 bytes).
> >
> > On other architectures or modules other sections like ".altinstructions" or
> > "__ex_table" may show up wrongly instead.
> >
> > In general I think it would be beneficial to search for wrong alignments at
> > link time, and maybe at runtime.
> >
> > The patch at the end of this cover letter
> > - adds compile time checks to the "modpost" tool, and
> > - adds a runtime check to the kernel module loader at runtime.
> > And it will possibly show false positives too (!!!)
> > I do understand that some of those sections are not performance critical
> > and thus any alignment is OK.
> >
> > The modpost patch will emit at compile time such warnings (on x86-64 kernel build):
> >
> > WARNING: modpost: vmlinux: section .initcall7.init (type 1, flags 2) has alignment of 1, expected at least 4.
> > Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?
> > WARNING: modpost: vmlinux: section .altinstructions (type 1, flags 2) has alignment of 1, expected at least 2.
> > WARNING: modpost: vmlinux: section .initcall6.init (type 1, flags 2) has alignment of 1, expected at least 4.
> > WARNING: modpost: vmlinux: section .initcallearly.init (type 1, flags 2) has alignment of 1, expected at least 4.
> > WARNING: modpost: vmlinux: section .rodata.cst2 (type 1, flags 18) has alignment of 2, expected at least 64.
> > WARNING: modpost: vmlinux: section .static_call_tramp_key (type 1, flags 2) has alignment of 1, expected at least 8.
> > WARNING: modpost: vmlinux: section .con_initcall.init (type 1, flags 2) has alignment of 1, expected at least 8.
> > WARNING: modpost: vmlinux: section __bug_table (type 1, flags 3) has alignment of 1, expected at least 4.
> > ...
>
>
>
>
> modpost acts on vmlinux.o instead of vmlinux.
>
>
> vmlinux.o is a relocatable ELF, which is not a real layout
> because no linker script has been considered yet at this
> point.
>
>
> vmlinux is an executable ELF, produced by a linker,
> with the linker script taken into consideration
> to determine the final section/symbol layout.
>
>
> So, checking this in modpost is meaningless.
>
>
>
> I did not check the module checking code, but
> modules are also relocatable ELF.



Sorry, I replied too early.
(Actually I replied without reading your modpost code).

Now, I understand what your checker is doing.


I did not test how many false positives are produced,
but it catches several suspicious mis-alignments.


However, I am not convinced with this warning.


+               warn("%s: section %s (type %d, flags %lu) has alignment of %d, expected at least %d.\n"
+                    "Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?\n",
+                    modname, sec, sechdr->sh_type, sechdr->sh_flags, is_shalign, should_shalign);
+       }


Adding ALIGN() hides the real problem.


I think the real problem is that not enough alignment was requested
in the code.


For example, the right fix for ".initcall7.init" should be this:


diff --git a/include/linux/init.h b/include/linux/init.h
index 3fa3f6241350..650311e4b215 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -264,6 +264,7 @@ extern struct module __this_module;
 #define ____define_initcall(fn, __stub, __name, __sec)         \
        __define_initcall_stub(__stub, fn)                      \
        asm(".section   \"" __sec "\", \"a\"            \n"     \
+           ".balign 4                                  \n"     \
            __stringify(__name) ":                      \n"     \
            ".long      " __stringify(__stub) " - .     \n"     \
            ".previous                                  \n");   \



Then, "this section requires at least 4 byte alignment"
is recorded in the sh_addralign field.

Then, the rest is the linker's job.

We should not tweak the linker script.






--
Best Regards
Masahiro Yamada

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries
  2023-12-21 10:22   ` Masahiro Yamada
@ 2023-12-21 16:01     ` Masahiro Yamada
  2023-12-22  6:07       ` Luis Chamberlain
  0 siblings, 1 reply; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-21 16:01 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch, Luis Chamberlain

On Thu, Dec 21, 2023 at 7:22 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
> >
> > From: Helge Deller <deller@gmx.de>
> >
> > An alignment of 4 bytes is wrong for 64-bit platforms which don't define
> > CONFIG_HAVE_ARCH_PREL32_RELOCATIONS (which then store 64-bit pointers).
> > Fix their alignment to 8 bytes.
> >
> > Signed-off-by: Helge Deller <deller@gmx.de>
>
>
> This is correct.
>
> Acked-by: Masahiro Yamada <masahiroy@kernel.org>
>
> Please add
>
>
> Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
>
>


If there is no objection, I will pick this up
to linux-kbuild/fixes.


Thanks.





-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
  2023-11-22 22:18 ` [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections deller
@ 2023-12-22  5:59   ` Luis Chamberlain
  2023-12-22 12:13     ` Helge Deller
  0 siblings, 1 reply; 31+ messages in thread
From: Luis Chamberlain @ 2023-12-22  5:59 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules, linux-arch

On Wed, Nov 22, 2023 at 11:18:12PM +0100, deller@kernel.org wrote:
> From: Helge Deller <deller@gmx.de>
> 
> On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> (e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
> 64-bit pointers into the __ksymtab* sections.
> Make sure that those sections will be correctly aligned at module link time,
> otherwise unaligned memory accesses may happen at runtime.

The ramifications are not explained there. You keep sending me patches
with this and we keep doing a nose dive on this. It means I have to do
more work. So as I had suggested with your patch which I merged in
commit 87c482bdfa79 ("modules: Ensure natural alignment for
.altinstructions and __bug_table sections") please clarify the
impact of not merging this patch. Also last time you noticed the
misalignment due to a faulty exception handler, please mention how
you found this out now.

And since this is not your first patch on the exact related subject
I'd appreciate if you can show me perf stat results differences between
having and not having this patch merged. Why? Because we talk about
a performance penalty, but we are not saying how much, and since this
is an ongoing thing, we might as well have a tool belt with ways to
measure such performance impact to bring clarity and value to this
and future related patches.

> The __kcrctab* sections store 32-bit entities, so use ALIGN(4) for those.

I've given some thought about how to test this. Sadly perf kallsyms
just opens the /proc/kallsyms file, but that's fine, we need our own
test.

I think a 3 new simple modules selftest would do it and running perf
stat on it. One module, let us call it module A which constructs its own
name space prefix for its exported symbols and has tons of silly symbols
for arbitrary data, whatever. We then have module B which refers to a
few arbitrary symbols from module A, hopefully spread out linearly, so
if module A had 10,000 symbols, we'd have module A refer to a symbol
every 1,000 symbols. Finally we want a module C which has, say, 50,000
symbols all of which will not be used at all by the first two modules,
but the selftest will load module C first, prior to calling modprobe B.

We'll stress test this way two calls which use find_symbol():

1) Upon load of B it will trigger simplify_symbols() to look for the
symbol it uses from the module A with tons of symbols. That's an
indirect way for us to call resolve_symbol_wait() from module A without
having to use symbol_get(), which we want to remove as an export as it is
just a hack which should go away. Our goal is for us to test
resolve_symbol() which will call find_symbol() and that will eventually
look for the symbol on module A with:

  find_exported_symbol_in_section()

That uses bsearch() so a binary search for the symbol and we'd end up
hitting the misalignments here. Binary search will at worst be O(log(n))
and so the only way to aggravate the search will be to add tons of
symbols to A, and have B use a few of them.

2) When you load B, userspace will at first load A as depmod will inform
userspace A goes before B. Upon B's load towards the end right before
we call module B's init routine we get complete_formation() called on
the module. That will first check for duplicate symbols with the call
to verify_exported_symbols(). That is when we'll force iteration on
module C's insane symbol list.

The selftests just runs

perf stat -e pick-your-poison-for-misalignments tools/testing/selftests/kmod/ksymtab.sh

Where ksymtab.sh is your new script which calls:

modprobe C
modprobe B

I say pick-your-poison-for-misalignments because I am not sure what is
best here.

Thoughts?

> Signed-off-by: Helge Deller <deller@gmx.de>
> Cc: <stable@vger.kernel.org> # v6.0+

That's a stretch without any data, don't you think?

 Luis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries
  2023-12-21 16:01     ` Masahiro Yamada
@ 2023-12-22  6:07       ` Luis Chamberlain
  2023-12-22  6:08         ` Luis Chamberlain
  0 siblings, 1 reply; 31+ messages in thread
From: Luis Chamberlain @ 2023-12-22  6:07 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: deller, linux-kernel, Arnd Bergmann, linux-modules, linux-arch

On Fri, Dec 22, 2023 at 01:01:23AM +0900, Masahiro Yamada wrote:
> On Thu, Dec 21, 2023 at 7:22 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
> > >
> > > From: Helge Deller <deller@gmx.de>
> > >
> > > An alignment of 4 bytes is wrong for 64-bit platforms which don't define
> > > CONFIG_HAVE_ARCH_PREL32_RELOCATIONS (which then store 64-bit pointers).
> > > Fix their alignment to 8 bytes.
> > >
> > > Signed-off-by: Helge Deller <deller@gmx.de>
> >
> >
> > This is correct.
> >
> > Acked-by: Masahiro Yamada <masahiroy@kernel.org>
> >
> > Please add
> >
> >
> > Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
> >
> >
> 
> 
> If there is no objection, I will pick this up
> to linux-kbuild/fixes.

The new selftests I've suggested should help get perf data to cover both
modules and built-in kernel symbols, given find_symbol() will hit
built-in symbols first before modules, with one caveat: we'd want to extend
the selftest with a part which builds a module into the kernel with tons
of other symbols as well.

So I'm all for you taking this but I don't think we need to rush for the
same reasons I mentioned in my reply to Helge.

I think it would be nice to get real perf data with perf stat as I
suggested, and include that in the commit logs. I think it would also be
useful to include a description of the fact that there is no real fix
and that a performance hit is all that happens, as the architecture
just emulates the alignment. In the worst case, if exception handlers
are broken we could crash, but that is rare although it does happen.

If we want to go bananas we could even get a graph of size of modules
vs cost of misalignment as a relationship over time. Without this, frankly
the cost on "performance" is artificial.

Thoughts?

  Luis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries
  2023-12-22  6:07       ` Luis Chamberlain
@ 2023-12-22  6:08         ` Luis Chamberlain
  2023-12-22  7:01           ` Masahiro Yamada
  0 siblings, 1 reply; 31+ messages in thread
From: Luis Chamberlain @ 2023-12-22  6:08 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: deller, linux-kernel, Arnd Bergmann, linux-modules, linux-arch

On Thu, Dec 21, 2023 at 10:07:13PM -0800, Luis Chamberlain wrote:
> 
> If we want to go bananas we could even get a graph of size of modules

Sorry, I meant number of symbols vs cost.

 Luis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries
  2023-12-22  6:08         ` Luis Chamberlain
@ 2023-12-22  7:01           ` Masahiro Yamada
  2023-12-22 20:11             ` Luis Chamberlain
  0 siblings, 1 reply; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-22  7:01 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: deller, linux-kernel, Arnd Bergmann, linux-modules, linux-arch

On Fri, Dec 22, 2023 at 3:08 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Thu, Dec 21, 2023 at 10:07:13PM -0800, Luis Chamberlain wrote:
> >
> > If we want to go bananas we could even get a graph of size of modules
>
> Sorry, I meant number of symbols vs cost.
>
>  Luis



But, 1/4 is really a bug-fix, isn't it?


ksymtab was previously 8-byte aligned for CONFIG_64BIT,
but now is only 4-byte aligned.


$ git show ddb5cdbafaaa^:include/linux/export.h |
                              head -n66 | tail -n5
struct kernel_symbol {
        unsigned long value;
        const char *name;
        const char *namespace;
};


$ git show ddb5cdbafaaa^:include/asm-generic/export.h |
                               head -23 | tail -8
#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
#define KSYM_ALIGN 4
#elif defined(CONFIG_64BIT)
#define KSYM_ALIGN 8
#else
#define KSYM_ALIGN 4
#endif




In the old behavior, <linux/export.h> used C code
to produce the ksymtab, hence it was naturally
aligned by the compiler. (unsigned long and pointers
require 8-byte alignment for CONFIG_64BIT)


<asm-generic/export.h> explicitly required
8-byte alignment for CONFIG_64BIT.





In the current behavior, <linux/export-internal.h>
produces all ksymtab entries using inline assembler,
but it hard-codes ".balign 4".











-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/4] Section alignment issues?
  2023-12-21 15:42   ` Masahiro Yamada
@ 2023-12-22  8:23     ` Helge Deller
  2023-12-23  1:32       ` Masahiro Yamada
  2023-12-22  9:48     ` David Laight
  1 sibling, 1 reply; 31+ messages in thread
From: Helge Deller @ 2023-12-22  8:23 UTC (permalink / raw)
  To: Masahiro Yamada, deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch, Luis Chamberlain

On 12/21/23 16:42, Masahiro Yamada wrote:
> On Thu, Dec 21, 2023 at 10:40 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
>>
>> On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
>>>
>>> From: Helge Deller <deller@gmx.de>
>>>
>>> While working on the 64-bit parisc kernel, I noticed that the __ksymtab[]
>>> table was not correctly 64-bit aligned in many modules.
>>> The following patches do fix some of those issues in the generic code.
>>>
>>> But further investigation shows that multiple sections in the kernel and in
>>> modules are possibly not correctly aligned, and thus may lead to performance
>>> degradations at runtime (small on x86, huge on parisc, sparc and others which
>>> need exception handlers). Sometimes wrong alignments may also be simply hidden
>>> by the linker or kernel module loader which pulls in the sections by luck with
>>> a correct alignment (e.g. because the previous section was aligned already).
>>>
>>> An objdump on a x86 module shows e.g.:
>>>
>>> ./kernel/net/netfilter/nf_log_syslog.ko:     file format elf64-x86-64
>>> Sections:
>>> Idx Name          Size      VMA               LMA               File off  Algn
>>>    0 .text         00001fdf  0000000000000000  0000000000000000  00000040  2**4
>>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>    1 .init.text    000000f6  0000000000000000  0000000000000000  00002020  2**4
>>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>    2 .exit.text    0000005c  0000000000000000  0000000000000000  00002120  2**4
>>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>>>    3 .rodata.str1.8 000000dc  0000000000000000  0000000000000000  00002180  2**3
>>>                    CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>    4 .rodata.str1.1 0000030a  0000000000000000  0000000000000000  0000225c  2**0
>>>                    CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>    5 .rodata       000000b0  0000000000000000  0000000000000000  00002580  2**5
>>>                    CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>    6 .modinfo      0000019e  0000000000000000  0000000000000000  00002630  2**0
>>>                    CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>    7 .return_sites 00000034  0000000000000000  0000000000000000  000027ce  2**0
>>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>>>    8 .call_sites   0000029c  0000000000000000  0000000000000000  00002802  2**0
>>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>>>
>>> In this example I believe the ".return_sites" and ".call_sites" should have
>>> an alignment of at least 32-bit (4 bytes).
>>>
>>> On other architectures or modules other sections like ".altinstructions" or
>>> "__ex_table" may show up wrongly instead.
>>>
>>> In general I think it would be beneficial to search for wrong alignments at
>>> link time, and maybe at runtime.
>>>
>>> The patch at the end of this cover letter
>>> - adds compile time checks to the "modpost" tool, and
>>> - adds a runtime check to the kernel module loader at runtime.
>>> And it will possibly show false positives too (!!!)
>>> I do understand that some of those sections are not performance critical
>>> and thus any alignment is OK.
>>>
>>> The modpost patch will emit at compile time such warnings (on x86-64 kernel build):
>>>
>>> WARNING: modpost: vmlinux: section .initcall7.init (type 1, flags 2) has alignment of 1, expected at least 4.
>>> Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?
>>> WARNING: modpost: vmlinux: section .altinstructions (type 1, flags 2) has alignment of 1, expected at least 2.
>>> WARNING: modpost: vmlinux: section .initcall6.init (type 1, flags 2) has alignment of 1, expected at least 4.
>>> WARNING: modpost: vmlinux: section .initcallearly.init (type 1, flags 2) has alignment of 1, expected at least 4.
>>> WARNING: modpost: vmlinux: section .rodata.cst2 (type 1, flags 18) has alignment of 2, expected at least 64.
>>> WARNING: modpost: vmlinux: section .static_call_tramp_key (type 1, flags 2) has alignment of 1, expected at least 8.
>>> WARNING: modpost: vmlinux: section .con_initcall.init (type 1, flags 2) has alignment of 1, expected at least 8.
>>> WARNING: modpost: vmlinux: section __bug_table (type 1, flags 3) has alignment of 1, expected at least 4.
>>> ...
>>
>>
>>
>>
>> modpost acts on vmlinux.o instead of vmlinux.
>>
>>
>> vmlinux.o is a relocatable ELF, which is not a real layout
>> because no linker script has been considered yet at this
>> point.
>>
>>
>> vmlinux is an executable ELF, produced by a linker,
>> with the linker script taken into consideration
>> to determine the final section/symbol layout.
>>
>>
>> So, checking this in modpost is meaningless.
>>
>>
>>
>> I did not check the module checking code, but
>> modules are also relocatable ELF.
>
>
>
> Sorry, I replied too early.
> (Actually I replied without reading your modpost code).
>
> Now, I understand what your checker is doing.
>
>
> I did not test how many false positives are produced,
> but it catches several suspicious mis-alignments.

Yes.

> However, I am not convinced with this warning.
>
>
> +               warn("%s: section %s (type %d, flags %lu) has alignment of %d, expected at least %d.\n"
> +                    "Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?\n",
> +                    modname, sec, sechdr->sh_type, sechdr->sh_flags,
> +                    is_shalign, should_shalign);
> +       }
>
>
> Adding ALIGN() hides the real problem.

Right.
It took me some time to understand the effects here too.
See below...

> I think the real problem is that not enough alignment was requested
> in the code.
>
> For example, the right fix for ".initcall7.init" should be this:
>
> diff --git a/include/linux/init.h b/include/linux/init.h
> index 3fa3f6241350..650311e4b215 100644
> --- a/include/linux/init.h
> +++ b/include/linux/init.h
> @@ -264,6 +264,7 @@ extern struct module __this_module;
>   #define ____define_initcall(fn, __stub, __name, __sec)         \
>          __define_initcall_stub(__stub, fn)                      \
>          asm(".section   \"" __sec "\", \"a\"            \n"     \
> +           ".balign 4                                  \n"     \
>              __stringify(__name) ":                      \n"     \
>              ".long      " __stringify(__stub) " - .     \n"     \
>              ".previous                                  \n");   \
>
> Then, "this section requires at least 4 byte alignment"
> is recorded in the sh_addralign field.

Yes, this is the important part.

> Then, the rest is the linker's job.
>
> We should not tweak the linker script.

That's right, but let's phrase it slightly differently...
There is *no need* to tweak the linker script, *if* the alignment
gets correctly assigned by the inline assembly (like your
initcall patch above).
But on some platforms (e.g. on parisc) I noticed that this .balign
was missing for some other sections, in which case the other (not preferred)
possible option is to tweak the linker script.

So I think we agree that fixing the inline assembly is the right
way to go?

Either way, a link-time check like the proposed modpost patch
may catch section issues for upcoming/newly added sections too.

Helge

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/4] vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and .pci_fixup sections
  2023-12-21 13:07   ` Masahiro Yamada
@ 2023-12-22  9:02     ` Helge Deller
  2023-12-23  4:10       ` Masahiro Yamada
  0 siblings, 1 reply; 31+ messages in thread
From: Helge Deller @ 2023-12-22  9:02 UTC (permalink / raw)
  To: Masahiro Yamada, deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch, Luis Chamberlain

On 12/21/23 14:07, Masahiro Yamada wrote:
> On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
>>
>> From: Helge Deller <deller@gmx.de>
>>
>> On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
>> (e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
>> 64-bit pointers into the __ksymtab* sections.
>> Make sure that the start of those sections is 64-bit aligned in the vmlinux
>> executable, otherwise unaligned memory accesses may happen at runtime.
>
>
> Are you solving a real problem?

Not any longer.
I faced a problem on parisc when neither #1 nor #3 was applied,
because of a buggy unalignment exception handler. But this is
not something which I would count as a "real generic problem".

> 1/4 already ensures the proper alignment of __ksymtab*, doesn't it?

Yes, it does.

>...
> So, my understanding is this patch is unneeded.

Yes, it's not required and I'm fine if we drop it.

But regarding __kcrctab:

>> @@ -498,6 +501,7 @@
>>          }                                                               \
>>                                                                          \
>>          /* Kernel symbol table: Normal symbols */                       \
>> +       . = ALIGN(4);                                                   \
>>          __kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {         \
>>                  __start___kcrctab = .;                                  \
>>                  KEEP(*(SORT(___kcrctab+*)))                             \

I think this patch would be beneficial to get proper alignment:

diff --git a/include/linux/export-internal.h b/include/linux/export-internal.h
index cd253eb51d6c..d445705ac13c 100644
--- a/include/linux/export-internal.h
+++ b/include/linux/export-internal.h
@@ -64,6 +64,7 @@

  #define SYMBOL_CRC(sym, crc, sec)   \
         asm(".section \"___kcrctab" sec "+" #sym "\",\"a\""     "\n" \
+           ".balign 4"                                         "\n" \
             "__crc_" #sym ":"                                   "\n" \
             ".long " #crc                                       "\n" \
             ".previous"                                         "\n")


Helge

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/4] Section alignment issues?
  2023-12-20 19:40   ` Luis Chamberlain
@ 2023-12-22  9:13     ` Helge Deller
  0 siblings, 0 replies; 31+ messages in thread
From: Helge Deller @ 2023-12-22  9:13 UTC (permalink / raw)
  To: Luis Chamberlain, deller
  Cc: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules, linux-arch

On 12/20/23 20:40, Luis Chamberlain wrote:
> On Tue, Dec 19, 2023 at 01:26:49PM -0800, Luis Chamberlain wrote:
>> On Wed, Nov 22, 2023 at 11:18:10PM +0100, deller@kernel.org wrote:
>>> From: Helge Deller <deller@gmx.de>
>>> My questions:
>>> - Am I wrong with my analysis?
>>
>> This would typically of course depend on the arch, but whether it helps
>> is what I would like to see with real numbers rather than speculation.
>> However, I don't expect some of these are hot paths except maybe the
>> table lookups. So could you look at some perf stat differences
>> without / with alignment ? Other than bootup live patching would be
>> a good test case. We have selftest for modules, the script in selftests
>> tools/testing/selftests/kmod/kmod.sh is pretty aggressive, but the live
>> patching tests might be better suited.
>>
>>> - What does people see on other architectures?
>>> - Does it make sense to add a compile- and runtime-check, like the patch below, to the kernel?
>>
>> The chatty aspects really depend on the above results.
>>
>> Aren't there some archs where an unaligned access would actually crash?
>> Why hasn't that happened?
>
> I've gone down through memory lane and we have discussed this before.
>
> It would seem this misalignment should not affect performance, and this
> should not be an issue unless you have a buggy exception hanlder. We
> actually ran into one before. Please refer to merge commit
>
> e74acdf55da6649dd30be5b621a93b71cbe7f3f9

Yes, this is the second time I stumbled over this issue.
But let's continue discussing in the other mail thread...

Helge

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH 0/4] Section alignment issues?
  2023-12-21 15:42   ` Masahiro Yamada
  2023-12-22  8:23     ` Helge Deller
@ 2023-12-22  9:48     ` David Laight
  1 sibling, 0 replies; 31+ messages in thread
From: David Laight @ 2023-12-22  9:48 UTC (permalink / raw)
  To: 'Masahiro Yamada', deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch, Luis Chamberlain

...
> diff --git a/include/linux/init.h b/include/linux/init.h
> index 3fa3f6241350..650311e4b215 100644
> --- a/include/linux/init.h
> +++ b/include/linux/init.h
> @@ -264,6 +264,7 @@ extern struct module __this_module;
>  #define ____define_initcall(fn, __stub, __name, __sec)         \
>         __define_initcall_stub(__stub, fn)                      \
>         asm(".section   \"" __sec "\", \"a\"            \n"     \
> +           ".balign 4                                  \n"     \
>             __stringify(__name) ":                      \n"     \
>             ".long      " __stringify(__stub) " - .     \n"     \
>             ".previous                                  \n");   \
> 
> 
> 
> Then, "this section requires at least 4 byte alignment"
> is recorded in the sh_addralign field.

Perhaps one of the headers should contain (something like):
#ifdef CONFIG_64
#define BALIGN_PTR ".balign 8\n"
#else
#define BALIGN_PTR ".balign 4\n"
#endif

to make it all easier (although that example doesn't need it).

	David


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
  2023-12-22  5:59   ` Luis Chamberlain
@ 2023-12-22 12:13     ` Helge Deller
  2023-12-22 20:10       ` Luis Chamberlain
  0 siblings, 1 reply; 31+ messages in thread
From: Helge Deller @ 2023-12-22 12:13 UTC (permalink / raw)
  To: Luis Chamberlain, deller
  Cc: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules, linux-arch

Hi Luis,

On 12/22/23 06:59, Luis Chamberlain wrote:
> On Wed, Nov 22, 2023 at 11:18:12PM +0100, deller@kernel.org wrote:
>> From: Helge Deller <deller@gmx.de>
>>
>> On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
>> (e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
>> 64-bit pointers into the __ksymtab* sections.
>> Make sure that those sections will be correctly aligned at module link time,
>> otherwise unaligned memory accesses may happen at runtime.
>
> The ramifications are not explained there.

What do you mean exactly by this?

> You keep sending me patches with this and we keep doing a nose dive
> on this. It means I have to do more work.
Sorry about that. Since you are mentioned as maintainer for modules I
thought you were the right contact.

Furthermore, this patch is pretty small and trivial.
And I had the impression that people understand that having unaligned
structures is generally a bad idea, as it usually impacts performance
(although the performance penalty in this case isn't critical at all,
as we are not on hot paths).

> So as I had suggested with your patch which I merged in
> commit 87c482bdfa79 ("modules: Ensure natural alignment for
> .altinstructions and __bug_table sections") please clarify the
> impact of not merging this patch. Also last time you noticed the
> misalignment due to a faulty exception handler, please mention how
> you found this out now.

Sure.

> And since this is not your first patch on the exact related subject
> I'd appreciate if you can show me perf stat results differences between
> having and not having this patch merged. Why? Because we talk about
> a performance penalthy, but we are not saying how much, and since this
> is an ongoing thing, we might as well have a tool belt with ways to
> measure such performance impact to bring clarity and value to this
> and future related patches.
>
>> The __kcrctab* sections store 32-bit entities, so use ALIGN(4) for those.
>
> I've given some thought about how to test this. Sadly perf kallsysms
> just opens the /proc/kallsysms file, but that's fine, we need our own
> test.
>
> I think a selftest with 3 new simple modules would do it, running perf
> stat on it. One module, let us call it module A which constructs its own
> name space prefix for its exported symbols and has tons of silly symbols
> for arbitrary data, whatever. We then have module B which refers to a
> few arbitrary symbols from module A, hopefully spread out linearly, so
> if module A had 10,000 symbols, we'd have module B refer to a symbol
> every 1,000 symbols. Finally we want a module C which has, say, 50,000
> symbols all of which will not be used at all by the first two modules,
> but the selftest will load module C first, prior to calling modprobe B.
>
> We'll stress test this way two calls which use find_symbol():
>
> 1) Upon load of B it will trigger simplify_symbols() to look for the
> symbol it uses from the module A with tons of symbols. That's an
> indirect way for us to call resolve_symbol_wait() from module A without
> having to use symbol_get() which we want to remove as exported as it is
> just a hack which should go away. Our goal is for us to test
> resolve_symbol() which will call find_symbol() and that will eventually
> look for the symbol on module A with:
>
>    find_exported_symbol_in_section()
>
> That uses bsearch() so a binary search for the symbol and we'd end up
> hitting the misalignments here. Binary search will at worst be O(log(n))
> and so the only way to aggravate the search will be to add tons of
> symbols to A, and have B use a few of them.
>
> 2) When you load B, userspace will at first load A as depmod will inform
> userspace A goes before B. Upon B's load towards the end right before
> we call module B's init routine we get complete_formation() called on
> the module. That will first check for duplicate symbols with the call
> to verify_exported_symbols(). That is when we'll force iteration on
> module C's insane symbol list.
>
> The selftest just runs
>
> perf stat -e pick-your-poison-for-misalignments tools/testing/selftests/kmod/ksymtab.sh
>
> Where ksymtab.sh is your new script which calls:
>
> modprobe C
> modprobe B
>
> I say pick-your-poison-for-misalignments because I am not sure what is
> best here.
>
> Thoughts?

I really appreciate your thoughts here!

But given the triviality of this patch, I personally think you are
proposing a huge academic investment in time and resources with no real benefit.
On which architecture would you suggest testing this?
What would be the effective (really new) outcome?
If the performance penalty is zero or close to zero, will that imply
that such a patch isn't useful?
Please also keep in mind that not everyone gets paid to work on the kernel,
so I have neither the time nor the various architectures to test on.

Then, as Masahiro already mentioned, the real solution is
probably to add ".balign" to the various inline assembler parts.
With this the alignment gets recorded in the sh_addralign field
in the object files, and the linker will correctly lay out the executable
even without this patch here.
And this is exactly what you would get if C initializers had been
used instead of inline assembly.

But independent of the correct way to fix it, I think the linker
script ideally should mention all sections it expects with the right
alignments. Just for completeness.

I'll leave it up to you if you will apply this patch.
Currently it's not absolutely needed any longer, but let's think about
the pros/cons of this patch:
Pro:
- It can prevent unnecessary unaligned memory accesses on all arches.
- It provides the linker with correct alignment for the various sections.
- It mimics what you would get if the structs were coded with C-initializers.
- A correct section alignment can address upcoming inline assembly patches
   which may have missed adding the .balign
Cons:
- For this specific patch the worst case scenario is that we add a total of
   up to 7 bytes to the kernel image (__ksymtab gets aligned to 8 bytes and the
   following sections are then already aligned).

So, honestly I don't see a real reason why it shouldn't be applied...

>> Signed-off-by: Helge Deller <deller@gmx.de>
>> Cc: <stable@vger.kernel.org> # v6.0+
>
> That's a stretch without any data, don't you think?

Yes. No need to push such a patch to stable series.

Helge

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
  2023-12-22 12:13     ` Helge Deller
@ 2023-12-22 20:10       ` Luis Chamberlain
  2023-12-30  7:33         ` Helge Deller
  0 siblings, 1 reply; 31+ messages in thread
From: Luis Chamberlain @ 2023-12-22 20:10 UTC (permalink / raw)
  To: Helge Deller
  Cc: deller, linux-kernel, Masahiro Yamada, Arnd Bergmann,
	linux-modules, linux-arch

On Fri, Dec 22, 2023 at 01:13:26PM +0100, Helge Deller wrote:
> Hi Luis,
> 
> On 12/22/23 06:59, Luis Chamberlain wrote:
> > On Wed, Nov 22, 2023 at 11:18:12PM +0100, deller@kernel.org wrote:
> > > From: Helge Deller <deller@gmx.de>
> > > 
> > > On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> > > (e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
> > > 64-bit pointers into the __ksymtab* sections.
> > > Make sure that those sections will be correctly aligned at module link time,
> > > otherwise unaligned memory accesses may happen at runtime.
> > 
> > The ramifications are not explained there.
> 
> What do you mean exactly by this?

You don't explain the impact of not applying this patch.

> > You keep sending me patches with this and we keep doing a nose dive
> > on this. It means I have to do more work.
> Sorry about that. Since you are mentioned as maintainer for modules I
> thought you were the right contact.

I am certainly the right person to send this patch to; however, I am
saying that given our previous dialog I would like the commit log to
describe a few things in a bit more detail:

 * how you found this
 * what are the impact of not applying it

Especially if you are going to be sending patches right before the
holidays with a Cc stable fix annotation. That puts me on high alert,
and I have to put things down to see how critical this really is so as
to decide whether to fast-track this to Linus or not.

> Furthermore, this patch is pretty small and trivial.

Many patches can be, but some can break things, and some can be small but
also improve things critically, for example if we are aware of broken
exception handlers.

> And I had the impression that people understand that having unaligned
> structures is generally a bad idea as it usually impacts performance
> (although the performance penalty in this case isn't critical at all,
> as we are not on hot paths).

There are two aspects to this. One is the critical nature which is
implied by your patch which pegs it as a stable fix; given your prior
patches about this it leaves me wondering if it is fixing some crashes
on some systems given faulty exception handlers.

The performance thing is also subjective, but it need not be subjective
if we do some very trivial tests as I suggested. Such a test would also
help as we lack specific selftests for this case and we can grow it later
with other things. I figured I'd ask you to try it, since *you* keep
sending patches about misalignments on module sections, so I figured
you must be seeing something others are not, and so you must care.

> > Thoughts?
> 
> I really appreciate your thoughts here!
> 
> But given the triviality of this patch, I personally think you are
> proposing a huge academic investment in time and resources with no real benefit.
> On which architecture would you suggest testing this?
> What would be the effective (really new) outcome?
> If the performance penalty is zero or close to zero, will that imply
> that such a patch isn't useful?
> Please also keep in mind that not everyone gets paid to work on the kernel,
> so I have neither the time nor the various architectures to test on.

I think you make this seem so difficult, but I understand it could seem
that way. I've attached a patch at the end of this email that does just
this, to help.

> So, honestly I don't see a real reason why it shouldn't be applied...

Like I said, you Cc'd stable as a fix; as a maintainer it is my job to
verify how critical this is, ask for more details about how you found
it, and evaluate the real impact. Even if it were not a stable fix I tend
to ask this for patches, even if they are trivial.

> > > Signed-off-by: Helge Deller <deller@gmx.de>
> > > Cc: <stable@vger.kernel.org> # v6.0+
> > 
> > That's a stretch without any data, don't you think?
> 
> Yes. No need to push such a patch to stable series.

OK, can you extend the patch below with something like:

perf stat --repeat 100 --pre 'modprobe -r b a b c' -- ./tools/testing/selftests/module/find_symbol.sh

And test before and after?

I ran a simple test as-is and the data I get is within noise, and so
I think we need the --repeat 100 thing.

-----------------------------------------------------------------------------------
before:
sudo ./tools/testing/selftests/module/find_symbol.sh 

 Performance counter stats for '/sbin/modprobe test_kallsyms_b':

        81,956,206 ns   duration_time                                                         
        81,883,000 ns   system_time                                                           
               210      page-faults                                                           

       0.081956206 seconds time elapsed

       0.000000000 seconds user
       0.081883000 seconds sys



 Performance counter stats for '/sbin/modprobe test_kallsyms_b':

        85,960,863 ns   duration_time                                                         
        84,679,000 ns   system_time                                                           
               212      page-faults                                                           

       0.085960863 seconds time elapsed

       0.000000000 seconds user
       0.084679000 seconds sys



 Performance counter stats for '/sbin/modprobe test_kallsyms_b':

        86,484,868 ns   duration_time                                                         
        86,541,000 ns   system_time                                                           
               213      page-faults                                                           

       0.086484868 seconds time elapsed

       0.000000000 seconds user
       0.086541000 seconds sys

-----------------------------------------------------------------------------------
After your modules alignment fix:
sudo ./tools/testing/selftests/module/find_symbol.sh 
 Performance counter stats for '/sbin/modprobe test_kallsyms_b':

        83,579,980 ns   duration_time                                                         
        83,530,000 ns   system_time                                                           
               212      page-faults                                                           

       0.083579980 seconds time elapsed

       0.000000000 seconds user
       0.083530000 seconds sys



 Performance counter stats for '/sbin/modprobe test_kallsyms_b':

        70,721,786 ns   duration_time                                                         
        69,289,000 ns   system_time                                                           
               211      page-faults                                                           

       0.070721786 seconds time elapsed

       0.000000000 seconds user
       0.069289000 seconds sys



 Performance counter stats for '/sbin/modprobe test_kallsyms_b':

        76,513,219 ns   duration_time                                                         
        76,381,000 ns   system_time                                                           
               214      page-faults                                                           

       0.076513219 seconds time elapsed

       0.000000000 seconds user
       0.076381000 seconds sys

-----------------------------------------------------------------------------------
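
As a quick sanity check on the numbers above, the three "before" and three
"after" duration_time samples from this thread can be averaged (a rough
sketch; averages hide the run-to-run noise discussed later in the thread):

```shell
# Average a list of nanosecond samples and compare before/after.
avg() { echo "$@" | tr ' ' '\n' | awk '{ s += $1 } END { printf "%.0f\n", s / NR }'; }

before=$(avg 81956206 85960863 86484868)   # "before" duration_time samples (ns)
after=$(avg 83579980 70721786 76513219)    # "after" duration_time samples (ns)
echo "avg before: $before ns"
echo "avg after:  $after ns"
awk -v b="$before" -v a="$after" 'BEGIN { printf "improvement: %.1f%%\n", (1 - a / b) * 100 }'
```

On these samples the average improvement is on the order of 9%, though the
individual runs vary enough that more repetitions (e.g. perf's --repeat)
would be needed for a solid number.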

From efe903f7a1e5917405f9555e9c00b1da2ac4b830 Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Fri, 22 Dec 2023 11:22:08 -0800
Subject: [PATCH] selftests: add new kallsyms selftests

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 lib/Kconfig.debug                             |  95 +++++++++++++
 lib/Makefile                                  |   1 +
 lib/tests/Makefile                            |   1 +
 lib/tests/module/.gitignore                   |   4 +
 lib/tests/module/Makefile                     |  15 ++
 lib/tests/module/gen_test_kallsyms.sh         | 128 ++++++++++++++++++
 tools/testing/selftests/module/Makefile       |  12 ++
 tools/testing/selftests/module/config         |   7 +
 tools/testing/selftests/module/find_symbol.sh |  65 +++++++++
 9 files changed, 328 insertions(+)
 create mode 100644 lib/tests/Makefile
 create mode 100644 lib/tests/module/.gitignore
 create mode 100644 lib/tests/module/Makefile
 create mode 100755 lib/tests/module/gen_test_kallsyms.sh
 create mode 100644 tools/testing/selftests/module/Makefile
 create mode 100644 tools/testing/selftests/module/config
 create mode 100755 tools/testing/selftests/module/find_symbol.sh

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 97ce28f4d154..4f5fbaceaf88 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2835,6 +2835,101 @@ config TEST_KMOD
 
 	  If unsure, say N.
 
+config TEST_RUNTIME
+	bool
+
+config TEST_RUNTIME_MODULE
+	bool
+
+config TEST_KALLSYMS
+	tristate "module kallsyms find_symbol() test"
+	depends on m
+	select TEST_RUNTIME
+	select TEST_RUNTIME_MODULE
+	select TEST_KALLSYMS_A
+	select TEST_KALLSYMS_B
+	select TEST_KALLSYMS_C
+	select TEST_KALLSYMS_D
+	help
+	  This allows us to stress test find_symbol() through kallsyms,
+	  which records kernel symbols, such as exported symbols, both in
+	  the kernel ELF image and in modules.
+
+	  We have four test modules:
+
+	  A: has KALLSYMS_NUMSYMS exported symbols
+	  B: uses one of A's symbols
+	  C: adds KALLSYMS_SCALE_FACTOR * KALLSYMS_NUMSYMS exported symbols
+	  D: has twice as many symbols as C
+
+	  We stress test find_symbol() through two means:
+
+	  1) Upon load of B it will trigger simplify_symbols() to look for the
+	  one symbol it uses from the module A with tons of symbols. This is an
+	  indirect way for us to have B call resolve_symbol_wait() upon module
+	  load. This will eventually call find_symbol() which will eventually
+	  try to find the symbols used with find_exported_symbol_in_section().
+	  find_exported_symbol_in_section() uses bsearch(), so a binary search
+	  is done for each symbol. Binary search is at worst O(log(n)), so the
+	  larger TEST_KALLSYMS_NUMSYMS the longer the search.
+
+	  2) The selftests should load C first, before B. Upon B's load,
+	  towards the end, right before module B's init routine is called,
+	  complete_formation() is called on the module. That will first check
+	  for duplicate symbols with the call to verify_exported_symbols().
+	  That is when we'll force iteration over module C's large symbol
+	  list. Since C has TEST_KALLSYMS_SCALE_FACTOR * TEST_KALLSYMS_NUMSYMS
+	  symbols, we can first test just loading B without C. The time it
+	  takes to load B with C vs without C gives us an idea of the impact
+	  of growing the symbol space and lets us make projections. Module B
+	  only uses one symbol from A, so the scaling in module C stays
+	  proportional; if B used more symbols, the first test would be doing
+	  more work and growing just the search space would measure something
+	  slightly different. The last module, module D, doubles the search
+	  space of C to allow for fuller projections.
+
+	  tools/testing/selftests/module/find_symbol.sh
+
+	  If unsure, say N.
+
+if TEST_KALLSYMS
+
+config TEST_KALLSYMS_A
+	tristate
+	depends on m
+
+config TEST_KALLSYMS_B
+	tristate
+	depends on m
+
+config TEST_KALLSYMS_C
+	tristate
+	depends on m
+
+config TEST_KALLSYMS_D
+	tristate
+	depends on m
+
+config TEST_KALLSYMS_NUMSYMS
+	int "test kallsyms number of symbols"
+	default 10000
+	help
+	  The number of symbols to create in TEST_KALLSYMS_A, only one of
+	  which module TEST_KALLSYMS_B will use. This is also used to
+	  determine how many symbols TEST_KALLSYMS_C will have, scaled up
+	  by TEST_KALLSYMS_SCALE_FACTOR.
+
+config TEST_KALLSYMS_SCALE_FACTOR
+	int "test kallsyms scale factor"
+	default 4
+	help
+	  How many more unused symbols TEST_KALLSYMS_C will have than
+	  TEST_KALLSYMS_A. If 4, then module C will have 4 times as many
+	  symbols as module A. TEST_KALLSYMS_D will then have twice as
+	  many symbols as C, to allow projections.
+
+endif # TEST_KALLSYMS
+
 config TEST_DEBUG_VIRTUAL
 	tristate "Test CONFIG_DEBUG_VIRTUAL feature"
 	depends on DEBUG_VIRTUAL
diff --git a/lib/Makefile b/lib/Makefile
index 6b09731d8e61..48e69a61cd85 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_TEST_XARRAY) += test_xarray.o
 obj-$(CONFIG_TEST_MAPLE_TREE) += test_maple_tree.o
 obj-$(CONFIG_TEST_PARMAN) += test_parman.o
 obj-$(CONFIG_TEST_KMOD) += test_kmod.o
+obj-$(CONFIG_TEST_RUNTIME) += tests/
 obj-$(CONFIG_TEST_DEBUG_VIRTUAL) += test_debug_virtual.o
 obj-$(CONFIG_TEST_MEMCAT_P) += test_memcat_p.o
 obj-$(CONFIG_TEST_OBJAGG) += test_objagg.o
diff --git a/lib/tests/Makefile b/lib/tests/Makefile
new file mode 100644
index 000000000000..8e4f42cb9c54
--- /dev/null
+++ b/lib/tests/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_TEST_RUNTIME_MODULE)		+= module/
diff --git a/lib/tests/module/.gitignore b/lib/tests/module/.gitignore
new file mode 100644
index 000000000000..8be7891b250f
--- /dev/null
+++ b/lib/tests/module/.gitignore
@@ -0,0 +1,4 @@
+test_kallsyms_a.c
+test_kallsyms_b.c
+test_kallsyms_c.c
+test_kallsyms_d.c
diff --git a/lib/tests/module/Makefile b/lib/tests/module/Makefile
new file mode 100644
index 000000000000..af5c27b996cb
--- /dev/null
+++ b/lib/tests/module/Makefile
@@ -0,0 +1,15 @@
+obj-$(CONFIG_TEST_KALLSYMS_A) += test_kallsyms_a.o
+obj-$(CONFIG_TEST_KALLSYMS_B) += test_kallsyms_b.o
+obj-$(CONFIG_TEST_KALLSYMS_C) += test_kallsyms_c.o
+obj-$(CONFIG_TEST_KALLSYMS_D) += test_kallsyms_d.o
+
+$(obj)/%.c: FORCE
+	@$(kecho) "  GEN     $@"
+	$(Q)$(srctree)/lib/tests/module/gen_test_kallsyms.sh $@\
+		$(CONFIG_TEST_KALLSYMS_NUMSYMS) \
+		$(CONFIG_TEST_KALLSYMS_SCALE_FACTOR)
+
+clean-files += test_kallsyms_a.c
+clean-files += test_kallsyms_b.c
+clean-files += test_kallsyms_c.c
+clean-files += test_kallsyms_d.c
diff --git a/lib/tests/module/gen_test_kallsyms.sh b/lib/tests/module/gen_test_kallsyms.sh
new file mode 100755
index 000000000000..e85f10dc11bd
--- /dev/null
+++ b/lib/tests/module/gen_test_kallsyms.sh
@@ -0,0 +1,128 @@
+#!/bin/bash
+
+TARGET=$(basename $1)
+DIR=lib/tests/module
+TARGET="$DIR/$TARGET"
+NUM_SYMS=$2
+SCALE_FACTOR=$3
+TEST_TYPE=$(echo $TARGET | sed -e 's|lib/tests/module/test_kallsyms_||g')
+TEST_TYPE=$(echo $TEST_TYPE | sed -e 's|.c||g')
+
+gen_template_module_header()
+{
+	cat <<____END_MODULE
+// SPDX-License-Identifier: GPL-2.0-or-later OR copyleft-next-0.3.1
+/*
+ * Copyright (C) 2023 Luis Chamberlain <mcgrof@kernel.org>
+ *
+ * Automatically generated code for testing, do not edit manually.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+
+____END_MODULE
+}
+
+gen_num_syms()
+{
+	PREFIX=$1
+	NUM=$2
+	for i in $(seq 1 $NUM); do
+		printf "int auto_test_%s_%010d = 0xff;\n" $PREFIX $i
+		printf "EXPORT_SYMBOL_GPL(auto_test_%s_%010d);\n" $PREFIX $i
+	done
+	echo
+}
+
+gen_template_module_data_a()
+{
+	gen_num_syms a $1
+	cat <<____END_MODULE
+static int auto_runtime_test(void)
+{
+	return 0;
+}
+
+____END_MODULE
+}
+
+gen_template_module_data_b()
+{
+	printf "\nextern int auto_test_a_%010d;\n\n" 28
+	echo "static int auto_runtime_test(void)"
+	echo "{"
+	printf "\nreturn auto_test_a_%010d;\n" 28
+	echo "}"
+}
+
+gen_template_module_data_c()
+{
+	gen_num_syms c $1
+	cat <<____END_MODULE
+static int auto_runtime_test(void)
+{
+	return 0;
+}
+
+____END_MODULE
+}
+
+gen_template_module_data_d()
+{
+	gen_num_syms d $1
+	cat <<____END_MODULE
+static int auto_runtime_test(void)
+{
+	return 0;
+}
+
+____END_MODULE
+}
+
+gen_template_module_exit()
+{
+	cat <<____END_MODULE
+static int __init auto_test_module_init(void)
+{
+	return auto_runtime_test();
+}
+module_init(auto_test_module_init);
+
+static void __exit auto_test_module_exit(void)
+{
+}
+module_exit(auto_test_module_exit);
+
+MODULE_AUTHOR("Luis Chamberlain <mcgrof@kernel.org>");
+MODULE_LICENSE("GPL");
+____END_MODULE
+}
+
+case $TEST_TYPE in
+	a)
+		gen_template_module_header > $TARGET
+		gen_template_module_data_a $NUM_SYMS >> $TARGET
+		gen_template_module_exit >> $TARGET
+		;;
+	b)
+		gen_template_module_header > $TARGET
+		gen_template_module_data_b >> $TARGET
+		gen_template_module_exit >> $TARGET
+		;;
+	c)
+		gen_template_module_header > $TARGET
+		gen_template_module_data_c $((NUM_SYMS * SCALE_FACTOR)) >> $TARGET
+		gen_template_module_exit >> $TARGET
+		;;
+	d)
+		gen_template_module_header > $TARGET
+		gen_template_module_data_d $((NUM_SYMS * SCALE_FACTOR * 2)) >> $TARGET
+		gen_template_module_exit >> $TARGET
+		;;
+	*)
+		;;
+esac
diff --git a/tools/testing/selftests/module/Makefile b/tools/testing/selftests/module/Makefile
new file mode 100644
index 000000000000..6132d7ddb08b
--- /dev/null
+++ b/tools/testing/selftests/module/Makefile
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# Makefile for module loading selftests
+
+# No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
+all:
+
+TEST_PROGS := find_symbol.sh
+
+include ../lib.mk
+
+# Nothing to clean up.
+clean:
diff --git a/tools/testing/selftests/module/config b/tools/testing/selftests/module/config
new file mode 100644
index 000000000000..259f4fd6b5e2
--- /dev/null
+++ b/tools/testing/selftests/module/config
@@ -0,0 +1,7 @@
+CONFIG_TEST_KMOD=m
+CONFIG_TEST_LKM=m
+CONFIG_XFS_FS=m
+
+# For these modules the force_init_test module parameter is used
+CONFIG_TUN=m
+CONFIG_BTRFS_FS=m
diff --git a/tools/testing/selftests/module/find_symbol.sh b/tools/testing/selftests/module/find_symbol.sh
new file mode 100755
index 000000000000..c206b42525f1
--- /dev/null
+++ b/tools/testing/selftests/module/find_symbol.sh
@@ -0,0 +1,65 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-or-later OR copyleft-next-0.3.1
+# Copyright (C) 2023 Luis Chamberlain <mcgrof@kernel.org>
+#
+# This is a stress test script for kallsyms through find_symbol()
+
+set -e
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+test_reqs()
+{
+	if ! which modprobe 2> /dev/null > /dev/null; then
+		echo "$0: You need modprobe installed" >&2
+		exit $ksft_skip
+	fi
+
+	if ! which kmod 2> /dev/null > /dev/null; then
+		echo "$0: You need kmod installed" >&2
+		exit $ksft_skip
+	fi
+
+	if ! which perf 2> /dev/null > /dev/null; then
+		echo "$0: You need perf installed" >&2
+		exit $ksft_skip
+	fi
+
+	uid=$(id -u)
+	if [ $uid -ne 0 ]; then
+		echo "$0: must be run as root" >&2
+		exit $ksft_skip
+	fi
+}
+
+remove_all()
+{
+	$MODPROBE -r test_kallsyms_b
+	for i in a b c d; do
+		$MODPROBE -r test_kallsyms_$i
+	done
+}
+test_reqs
+
+MODPROBE=$(</proc/sys/kernel/modprobe)
+STATS="-e duration_time"
+STATS="$STATS -e user_time"
+STATS="$STATS -e system_time"
+STATS="$STATS -e page-faults"
+
+remove_all
+perf stat $STATS $MODPROBE test_kallsyms_b
+remove_all
+
+# Now pollute the namespace
+$MODPROBE test_kallsyms_c
+perf stat $STATS $MODPROBE test_kallsyms_b
+
+# Now pollute the namespace with twice as many symbols as last time
+remove_all
+$MODPROBE test_kallsyms_c
+$MODPROBE test_kallsyms_d
+perf stat $STATS $MODPROBE test_kallsyms_b
+
+exit 0
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries
  2023-12-22  7:01           ` Masahiro Yamada
@ 2023-12-22 20:11             ` Luis Chamberlain
  2023-12-23 14:35               ` Masahiro Yamada
  0 siblings, 1 reply; 31+ messages in thread
From: Luis Chamberlain @ 2023-12-22 20:11 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: deller, linux-kernel, Arnd Bergmann, linux-modules, linux-arch

On Fri, Dec 22, 2023 at 04:01:30PM +0900, Masahiro Yamada wrote:
> On Fri, Dec 22, 2023 at 3:08 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Thu, Dec 21, 2023 at 10:07:13PM -0800, Luis Chamberlain wrote:
> > >
> > > If we want to go bananas we could even get a graph of size of modules
> >
> > Sorry I meant size of number of symbols Vs cost.
> >
> >  Luis
> 
> 
> 
> But, 1/4 is really a bug-fix, isn't it?

Ah you mean a regression fix, yeah sure, thanks I see !

 Luis


* Re: [PATCH 0/4] Section alignment issues?
  2023-12-22  8:23     ` Helge Deller
@ 2023-12-23  1:32       ` Masahiro Yamada
  0 siblings, 0 replies; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-23  1:32 UTC (permalink / raw)
  To: Helge Deller
  Cc: deller, linux-kernel, Arnd Bergmann, linux-modules, linux-arch,
	Luis Chamberlain

On Fri, Dec 22, 2023 at 5:23 PM Helge Deller <deller@gmx.de> wrote:
>
> On 12/21/23 16:42, Masahiro Yamada wrote:
> > On Thu, Dec 21, 2023 at 10:40 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >>
> >> On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
> >>>
> >>> From: Helge Deller <deller@gmx.de>
> >>>
> >>> While working on the 64-bit parisc kernel, I noticed that the __ksymtab[]
> >>> table was not correctly 64-bit aligned in many modules.
> >>> The following patches do fix some of those issues in the generic code.
> >>>
> >>> But further investigation shows that multiple sections in the kernel and in
> >>> modules are possibly not correctly aligned, and thus may lead to performance
> >>> degregations at runtime (small on x86, huge on parisc, sparc and others which
> >>> need exception handlers). Sometimes wrong alignments may also be simply hidden
> >>> by the linker or kernel module loader which pulls in the sections by luck with
> >>> a correct alignment (e.g. because the previous section was aligned already).
> >>>
> >>> An objdump on a x86 module shows e.g.:
> >>>
> >>> ./kernel/net/netfilter/nf_log_syslog.ko:     file format elf64-x86-64
> >>> Sections:
> >>> Idx Name          Size      VMA               LMA               File off  Algn
> >>>    0 .text         00001fdf  0000000000000000  0000000000000000  00000040  2**4
> >>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >>>    1 .init.text    000000f6  0000000000000000  0000000000000000  00002020  2**4
> >>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >>>    2 .exit.text    0000005c  0000000000000000  0000000000000000  00002120  2**4
> >>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >>>    3 .rodata.str1.8 000000dc  0000000000000000  0000000000000000  00002180  2**3
> >>>                    CONTENTS, ALLOC, LOAD, READONLY, DATA
> >>>    4 .rodata.str1.1 0000030a  0000000000000000  0000000000000000  0000225c  2**0
> >>>                    CONTENTS, ALLOC, LOAD, READONLY, DATA
> >>>    5 .rodata       000000b0  0000000000000000  0000000000000000  00002580  2**5
> >>>                    CONTENTS, ALLOC, LOAD, READONLY, DATA
> >>>    6 .modinfo      0000019e  0000000000000000  0000000000000000  00002630  2**0
> >>>                    CONTENTS, ALLOC, LOAD, READONLY, DATA
> >>>    7 .return_sites 00000034  0000000000000000  0000000000000000  000027ce  2**0
> >>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
> >>>    8 .call_sites   0000029c  0000000000000000  0000000000000000  00002802  2**0
> >>>                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
> >>>
> >>> In this example I believe the ".return_sites" and ".call_sites" should have
> >>> an alignment of at least 32-bit (4 bytes).
> >>>
> >>> On other architectures or modules other sections like ".altinstructions" or
> >>> "__ex_table" may show up wrongly instead.
> >>>
> >>> In general I think it would be beneficial to search for wrong alignments at
> >>> link time, and maybe at runtime.
> >>>
> >>> The patch at the end of this cover letter
> >>> - adds compile time checks to the "modpost" tool, and
> >>> - adds a runtime check to the kernel module loader at runtime.
> >>> And it will possibly show false positives too (!!!)
> >>> I do understand that some of those sections are not performance critical
> >>> and thus any alignment is OK.
> >>>
> >>> The modpost patch will emit at compile time such warnings (on x86-64 kernel build):
> >>>
> >>> WARNING: modpost: vmlinux: section .initcall7.init (type 1, flags 2) has alignment of 1, expected at least 4.
> >>> Maybe you need to add ALIGN() to the modules.lds file (or fix modpost) ?
> >>> WARNING: modpost: vmlinux: section .altinstructions (type 1, flags 2) has alignment of 1, expected at least 2.
> >>> WARNING: modpost: vmlinux: section .initcall6.init (type 1, flags 2) has alignment of 1, expected at least 4.
> >>> WARNING: modpost: vmlinux: section .initcallearly.init (type 1, flags 2) has alignment of 1, expected at least 4.
> >>> WARNING: modpost: vmlinux: section .rodata.cst2 (type 1, flags 18) has alignment of 2, expected at least 64.
> >>> WARNING: modpost: vmlinux: section .static_call_tramp_key (type 1, flags 2) has alignment of 1, expected at least 8.
> >>> WARNING: modpost: vmlinux: section .con_initcall.init (type 1, flags 2) has alignment of 1, expected at least 8.
> >>> WARNING: modpost: vmlinux: section __bug_table (type 1, flags 3) has alignment of 1, expected at least 4.
> >>> ...
> >>
> >>
> >>
> >>
> >> modpost acts on vmlinux.o instead of vmlinux.
> >>
> >>
> >> vmlinux.o is a relocatable ELF, which is not a real layout
> >> because no linker script has been considered yet at this
> >> point.
> >>
> >>
> >> vmlinux is an executable ELF, produced by a linker,
> >> with the linker script taken into consideration
> >> to determine the final section/symbol layout.
> >>
> >>
> >> So, checking this in modpost is meaningless.
> >>
> >>
> >>
> >> I did not check the module checking code, but
> >> modules are also relocatable ELF.
> >
> >
> >
> > Sorry, I replied too early.
> > (Actually I replied without reading your modpost code).
> >
> > Now, I understand what your checker is doing.
> >
> >
> > I did not test how many false positives are produced,
> > but it catches several suspicious mis-alignments.
>
> Yes.
>
> > However, I am not convinced with this warning.
> >
> >
> > +               warn("%s: section %s (type %d, flags %lu) has
> > alignment of %d, expected at least %d.\n"
> > +                    "Maybe you need to add ALIGN() to the modules.lds
> > file (or fix modpost) ?\n",
> > +                    modname, sec, sechdr->sh_type, sechdr->sh_flags,
> > is_shalign, should_shalign);
> > +       }
> >
> >
> > Adding ALGIN() hides the real problem.
>
> Right.
> It took me some time to understand the effects here too.
> See below...
>
> > I think the real problem is that not enough alignment was requested
> > in the code.
> >
> > For example, the right fix for ".initcall7.init" should be this:
> >
> > diff --git a/include/linux/init.h b/include/linux/init.h
> > index 3fa3f6241350..650311e4b215 100644
> > --- a/include/linux/init.h
> > +++ b/include/linux/init.h
> > @@ -264,6 +264,7 @@ extern struct module __this_module;
> >   #define ____define_initcall(fn, __stub, __name, __sec)         \
> >          __define_initcall_stub(__stub, fn)                      \
> >          asm(".section   \"" __sec "\", \"a\"            \n"     \
> > +           ".balign 4                                  \n"     \
> >              __stringify(__name) ":                      \n"     \
> >              ".long      " __stringify(__stub) " - .     \n"     \
> >              ".previous                                  \n");   \
> >
> > Then, "this section requires at least 4 byte alignment"
> > is recorded in the sh_addralign field.
>
> Yes, this is the important part.
>
> > Then, the rest is the linker's job.
> >
> > We should not tweak the linker script.
>
> That's right, but let's phrase it slightly different...
> There is *no need* to tweak the linker script, *if* the alignment
> gets correctly assigned by the inline assembly (like your
> initcall patch above).
> But on some platforms (e.g. on parisc) I noticed that this .balign
> was missing for some other sections, in which case the other (not preferred)
> possible option is to tweak the linker script.
>
> So I think we agree that fixing the inline assembly is the right
> way to go?


Yes, I think so.



> Either way, a link-time check like the proposed modpost patch
> may catch section issue for upcoming/newly added sections too.


Yes. This check seems to be useful.




-- 
Best Regards
Masahiro Yamada


* Re: [PATCH 3/4] vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and .pci_fixup sections
  2023-12-22  9:02     ` Helge Deller
@ 2023-12-23  4:10       ` Masahiro Yamada
  0 siblings, 0 replies; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-23  4:10 UTC (permalink / raw)
  To: Helge Deller
  Cc: deller, linux-kernel, Arnd Bergmann, linux-modules, linux-arch,
	Luis Chamberlain

On Fri, Dec 22, 2023 at 6:02 PM Helge Deller <deller@gmx.de> wrote:
>
> On 12/21/23 14:07, Masahiro Yamada wrote:
> > On Thu, Nov 23, 2023 at 7:18 AM <deller@kernel.org> wrote:
> >>
> >> From: Helge Deller <deller@gmx.de>
> >>
> >> On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> >> (e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
> >> 64-bit pointers into the __ksymtab* sections.
> >> Make sure that the start of those sections is 64-bit aligned in the vmlinux
> >> executable, otherwise unaligned memory accesses may happen at runtime.
> >
> >
> > Are you solving a real problem?
>
> Not any longer.
> I faced a problem on parisc when neither #1 and #3 were applied
> because of a buggy unalignment exception handler. But this is
not something which I would count as a "real generic problem".
>
> > 1/4 already ensures the proper alignment of __ksymtab*, doesn't it?
>
> Yes, it does.
>
> >...
> > So, my understanding is this patch is unneeded.
>
> Yes, it's not required and I'm fine if we drop it.
>
> But regarding __kcrctab:
>
> >> @@ -498,6 +501,7 @@
> >>          }                                                               \
> >>                                                                          \
> >>          /* Kernel symbol table: Normal symbols */                       \
> >> +       . = ALIGN(4);                                                   \
> >>          __kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {         \
> >>                  __start___kcrctab = .;                                  \
> >>                  KEEP(*(SORT(___kcrctab+*)))                             \
>
> I think this patch would be beneficial to get proper alignment:
>
> diff --git a/include/linux/export-internal.h b/include/linux/export-internal.h
> index cd253eb51d6c..d445705ac13c 100644
> --- a/include/linux/export-internal.h
> +++ b/include/linux/export-internal.h
> @@ -64,6 +64,7 @@
>
>   #define SYMBOL_CRC(sym, crc, sec)   \
>          asm(".section \"___kcrctab" sec "+" #sym "\",\"a\""     "\n" \
> +           ".balign 4"                                         "\n" \
>              "__crc_" #sym ":"                                   "\n" \
>              ".long " #crc                                       "\n" \
>              ".previous"                                         "\n")


Yes!


Please send a patch with this:


Fixes: f3304ecd7f06 ("linux/export: use inline assembler to populate
symbol CRCs")



-- 
Best Regards
Masahiro Yamada


* Re: [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries
  2023-12-22 20:11             ` Luis Chamberlain
@ 2023-12-23 14:35               ` Masahiro Yamada
  0 siblings, 0 replies; 31+ messages in thread
From: Masahiro Yamada @ 2023-12-23 14:35 UTC (permalink / raw)
  To: Luis Chamberlain, deller
  Cc: linux-kernel, Arnd Bergmann, linux-modules, linux-arch

On Sat, Dec 23, 2023 at 5:11 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Fri, Dec 22, 2023 at 04:01:30PM +0900, Masahiro Yamada wrote:
> > On Fri, Dec 22, 2023 at 3:08 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > >
> > > On Thu, Dec 21, 2023 at 10:07:13PM -0800, Luis Chamberlain wrote:
> > > >
> > > > If we want to go bananas we could even get a graph of size of modules
> > >
> > > Sorry I meant size of number of symbols Vs cost.
> > >
> > >  Luis
> >
> >
> >
> > But, 1/4 is really a bug-fix, isn't it?
>
> Ah you mean a regression fix, yeah sure, thanks I see !
>
>  Luis



Now, I applied 1/4 to linux-kbuild/fixes.

Thanks.



-- 
Best Regards
Masahiro Yamada


* Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
  2023-12-22 20:10       ` Luis Chamberlain
@ 2023-12-30  7:33         ` Helge Deller
  2024-01-22 16:10           ` Luis Chamberlain
  0 siblings, 1 reply; 31+ messages in thread
From: Helge Deller @ 2023-12-30  7:33 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: deller, linux-kernel, Masahiro Yamada, Arnd Bergmann,
	linux-modules, linux-arch

Hi Luis,

On 12/22/23 21:10, Luis Chamberlain wrote:
> On Fri, Dec 22, 2023 at 01:13:26PM +0100, Helge Deller wrote:
>> On 12/22/23 06:59, Luis Chamberlain wrote:
>>> On Wed, Nov 22, 2023 at 11:18:12PM +0100, deller@kernel.org wrote:
>>>> On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
>>>> (e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
>>>> 64-bit pointers into the __ksymtab* sections.
>>>> Make sure that those sections will be correctly aligned at module link time,
>>>> otherwise unaligned memory accesses may happen at runtime.
>>> ...
...
>> So, honestly I don't see a real reason why it shouldn't be applied...
>
> Like I said, you Cc'd stable as a fix,

I added "Cc: stable@vger.kernel.org" on the patch itself, so *if* you had
applied the patch, it would later have ended up in the stable kernel series too.
But I did not Cc the stable mailing list directly, so my patch was never
sent to that mailing list.

> as a maintainer it is my job to
> verify how critical this is and ask for more details about how you found
> it and evaluate the real impact. Even if it was not a stable fix I tend
> to ask this for patches, even if they are trivial.
> ...
> OK, can you extend the patch below with something like:
>
> perf stat --repeat 100 --pre 'modprobe -r b a b c' -- ./tools/testing/selftests/module/find_symbol.sh
>
> And test before and after?
>
> I ran a simple test as-is and the data I get is within noise, and so
> I think we need the --repeat 100 thing.

Your selftest code is based on perf.
AFAICS we don't have perf on parisc/hppa, so I can't test your selftest code
on that architecture.
I assume you tested on x86, where the CPU will transparently take care of
unaligned accesses. This is probably why the results are within
the noise.
But on some platforms the CPU raises an exception on unaligned accesses
and jumps into special exception handler assembler code inside the kernel.
This is much more expensive than on x86, which is why on parisc we track
counters in /proc/cpuinfo showing how often this exception handler is called:
IRQ:       CPU0       CPU1
   3:       1332          0         SuperIO  ttyS0
   7:    1270013          0         SuperIO  pata_ns87415
  64:  320023012  320021431             CPU  timer
  65:   17080507   20624423             CPU  IPI
UAH:   10948640      58104   Unaligned access handler traps

This "UAH" field could theoretically be used to extend your selftest.
But is it really worth it? The outcome is very much architecture and CPU
specific, maybe it's just within the noise as you measured.
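
For what it's worth, a rough sketch of how such a UAH-based measurement
could look (hypothetical; the "UAH:" line only exists on parisc, so the
helper prints 0 elsewhere, and the actual modprobe is left commented out):

```shell
# Sum all per-CPU values on the parisc "UAH:" line of a cpuinfo-style file.
uah_count() {
	awk '/^UAH:/ { for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i }
	     END { print s + 0 }' "$1"
}

before=$(uah_count /proc/cpuinfo)
# modprobe test_kallsyms_b    # the workload would go here (needs root)
after=$(uah_count /proc/cpuinfo)
echo "unaligned access traps during load: $((after - before))"
```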

IMHO we should always try to natively align structures, and if we see
we got it wrong in kernel code, we should fix it.
My patches just fix those memory sections where we use inline
assembly (instead of C) and thus failed to provide the correct alignments.

Helge

> -----------------------------------------------------------------------------------
> before:
> sudo ./tools/testing/selftests/module/find_symbol.sh
>
>   Performance counter stats for '/sbin/modprobe test_kallsyms_b':
>
>          81,956,206 ns   duration_time
>          81,883,000 ns   system_time
>                 210      page-faults
>
>         0.081956206 seconds time elapsed
>
>         0.000000000 seconds user
>         0.081883000 seconds sys
>
>
>
>   Performance counter stats for '/sbin/modprobe test_kallsyms_b':
>
>          85,960,863 ns   duration_time
>          84,679,000 ns   system_time
>                 212      page-faults
>
>         0.085960863 seconds time elapsed
>
>         0.000000000 seconds user
>         0.084679000 seconds sys
>
>
>
>   Performance counter stats for '/sbin/modprobe test_kallsyms_b':
>
>          86,484,868 ns   duration_time
>          86,541,000 ns   system_time
>                 213      page-faults
>
>         0.086484868 seconds time elapsed
>
>         0.000000000 seconds user
>         0.086541000 seconds sys
>
> -----------------------------------------------------------------------------------
> After your modules alignment fix:
> sudo ./tools/testing/selftests/module/find_symbol.sh
>   Performance counter stats for '/sbin/modprobe test_kallsyms_b':
>
>          83,579,980 ns   duration_time
>          83,530,000 ns   system_time
>                 212      page-faults
>
>         0.083579980 seconds time elapsed
>
>         0.000000000 seconds user
>         0.083530000 seconds sys
>
>
>
>   Performance counter stats for '/sbin/modprobe test_kallsyms_b':
>
>          70,721,786 ns   duration_time
>          69,289,000 ns   system_time
>                 211      page-faults
>
>         0.070721786 seconds time elapsed
>
>         0.000000000 seconds user
>         0.069289000 seconds sys
>
>
>
>   Performance counter stats for '/sbin/modprobe test_kallsyms_b':
>
>          76,513,219 ns   duration_time
>          76,381,000 ns   system_time
>                 214      page-faults
>
>         0.076513219 seconds time elapsed
>
>         0.000000000 seconds user
>         0.076381000 seconds sys
>
> -----------------------------------------------------------------------------------
>
> [perf-based selftest patch from Luis stripped]


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
  2023-12-30  7:33         ` Helge Deller
@ 2024-01-22 16:10           ` Luis Chamberlain
  2024-01-22 16:47             ` Helge Deller
  0 siblings, 1 reply; 31+ messages in thread
From: Luis Chamberlain @ 2024-01-22 16:10 UTC (permalink / raw)
  To: Helge Deller
  Cc: deller, linux-kernel, Masahiro Yamada, Arnd Bergmann,
	linux-modules, linux-arch

On Sat, Dec 30, 2023 at 08:33:24AM +0100, Helge Deller wrote:
> Your selftest code is based on perf.
> AFAICS we don't have perf on parisc/hppa, 

I see!

> so I can't test your selftest code
> on that architecture.
> I assume you tested on x86, where the CPU will transparently take care of
> unaligned accesses. This is probably why the results are within
> the noise.
> But on some platforms the CPU raises an exception on unaligned accesses
> and jumps into special exception handler assembler code inside the kernel.
> This is much more expensive than on x86, which is why on parisc we track
> counters in /proc/cpuinfo showing how often this exception handler is called:
> IRQ:       CPU0       CPU1
>   3:       1332          0         SuperIO  ttyS0
>   7:    1270013          0         SuperIO  pata_ns87415
>  64:  320023012  320021431             CPU  timer
>  65:   17080507   20624423             CPU  IPI
> UAH:   10948640      58104   Unaligned access handler traps
> 
> This "UAH" field could theoretically be used to extend your selftest.

Nice!

> But is it really worth it? The outcome is very much architecture and CPU
> specific, maybe it's just within the noise as you measured.

It's within the noise for x86_64, but given what you suggest
for parisc where it is much more expensive, we should see a non-noise
delta. Even just timing the module load should likely show a
considerably larger delta than on x86_64. You may just need to play a
bit with the default values at build time.

> IMHO we should always try to natively align structures, and if we see
> we got it wrong in kernel code, we should fix it.

This was all motivated by the first review criterion for these patches:
whether or not they are stable-worthy. Even if we don't consider them
stable material, given the test is now written and easily extended to
test on parisc with just timing information and UAH I think it would
be nice to have this data for a few larger default factor values so we
can compare against x86_64 while we're at it.

If you don't feel like doing that test that's fine too, we can just
ignore that. I'll still apply the patches, but I figured I'd ask so we
can collect information while the test is already written; it should now
be easy to compare and contrast the differences.

  Luis


* Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
  2024-01-22 16:10           ` Luis Chamberlain
@ 2024-01-22 16:47             ` Helge Deller
  2024-01-22 18:48               ` Luis Chamberlain
  0 siblings, 1 reply; 31+ messages in thread
From: Helge Deller @ 2024-01-22 16:47 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: deller, linux-kernel, Masahiro Yamada, Arnd Bergmann,
	linux-modules, linux-arch

On 1/22/24 17:10, Luis Chamberlain wrote:
> On Sat, Dec 30, 2023 at 08:33:24AM +0100, Helge Deller wrote:
>> Your selftest code is based on perf.
>> AFAICS we don't have perf on parisc/hppa,
>
> I see!
>
>> so I can't test your selftest code
>> on that architecture.
>> I assume you tested on x86, where the CPU will transparently take care of
>> unaligned accesses. This is probably why the results are within
>> the noise.
>> But on some platforms the CPU raises an exception on unaligned accesses
>> and jumps into special exception handler assembler code inside the kernel.
>> This is much more expensive than on x86, which is why on parisc we track
>> counters in /proc/cpuinfo showing how often this exception handler is called:
>> IRQ:       CPU0       CPU1
>>    3:       1332          0         SuperIO  ttyS0
>>    7:    1270013          0         SuperIO  pata_ns87415
>>   64:  320023012  320021431             CPU  timer
>>   65:   17080507   20624423             CPU  IPI
>> UAH:   10948640      58104   Unaligned access handler traps
>>
>> This "UAH" field could theoretically be used to extend your selftest.
>
> Nice!
>
>> But is it really worth it? The outcome is very much architecture and CPU
>> specific, maybe it's just within the noise as you measured.
>
> It's within the noise for x86_64, but given what you suggest
> for parisc where it is much more expensive, we should see a non-noise
> delta. Even just timing the module load should likely show a
> considerably larger delta than on x86_64. You may just need to play a
> bit with the default values at build time.

I don't know if it will be a "considerable" amount of time.

>> IMHO we should always try to natively align structures, and if we see
>> we got it wrong in kernel code, we should fix it.
>
> This was all motivated by the first review criterion for these patches:
> whether or not they are stable-worthy. Even if we don't consider them
> stable material, given the test is now written and easily extended to
> test on parisc with just timing information and UAH I think it would
> be nice to have this data for a few larger default factor values so we
> can compare against x86_64 while we're at it.
>
> If you don't feel like doing that test that's fine too, we can just
> ignore that.

I can do that test, but I won't have time for that in the next few weeks...

> I'll still apply the patches
Yes, please do!
Even if I don't test now, I (or others) will test at a later point.

> but I figured I'd ask so we can collect information while the test is
> already written; it should now be easy to compare and contrast the differences.
Ok.

Helge


* Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
  2024-01-22 16:47             ` Helge Deller
@ 2024-01-22 18:48               ` Luis Chamberlain
  0 siblings, 0 replies; 31+ messages in thread
From: Luis Chamberlain @ 2024-01-22 18:48 UTC (permalink / raw)
  To: Helge Deller
  Cc: deller, linux-kernel, Masahiro Yamada, Arnd Bergmann,
	linux-modules, linux-arch

On Mon, Jan 22, 2024 at 05:47:49PM +0100, Helge Deller wrote:
> On 1/22/24 17:10, Luis Chamberlain wrote:
> > 
> > It's within the noise for x86_64, but given what you suggest
> > for parisc where it is much more expensive, we should see a non-noise
> > delta. Even just timing the module load should likely show a
> > considerably larger delta than on x86_64. You may just need to play a
> > bit with the default values at build time.
> 
> I don't know if it will be a "considerable" amount of time.

There are variables you can tune, so given what you suggest it should
be fairly easy to reach that point with a few changes to the default
values, I think.

> > If you don't feel like doing that test that's fine too, we can just
> > ignore that.
> 
> I can do that test, but I won't have time for that in the next few weeks...

I would appreciate it!

  Luis


* Re: [PATCH 4/4] modules: Add missing entry for __ex_table
  2023-11-22 22:18 ` [PATCH 4/4] modules: Add missing entry for __ex_table deller
@ 2024-01-29 18:50   ` Luis Chamberlain
  0 siblings, 0 replies; 31+ messages in thread
From: Luis Chamberlain @ 2024-01-29 18:50 UTC (permalink / raw)
  To: deller
  Cc: linux-kernel, Masahiro Yamada, Arnd Bergmann, linux-modules, linux-arch

On Wed, Nov 22, 2023 at 11:18:14PM +0100, deller@kernel.org wrote:
> From: Helge Deller <deller@gmx.de>
> 
> The entry for __ex_table was missing, which may make __ex_table
> become 1- or 2-byte aligned in modules.
> Add the entry to ensure it gets 32-bit aligned.
> 
> Signed-off-by: Helge Deller <deller@gmx.de>
> Cc: <stable@vger.kernel.org> # v6.0+

Cc'ing stable was overkill, I'll remove it.

  Luis

> ---
>  scripts/module.lds.S | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/scripts/module.lds.S b/scripts/module.lds.S
> index b00415a9ff27..488f61b156b2 100644
> --- a/scripts/module.lds.S
> +++ b/scripts/module.lds.S
> @@ -26,6 +26,7 @@ SECTIONS {
>  	.altinstructions	0 : ALIGN(8) { KEEP(*(.altinstructions)) }
>  	__bug_table		0 : ALIGN(8) { KEEP(*(__bug_table)) }
>  	__jump_table		0 : ALIGN(8) { KEEP(*(__jump_table)) }
> +	__ex_table		0 : ALIGN(4) { KEEP(*(__ex_table)) }
>  
>  	__patchable_function_entries : { *(__patchable_function_entries) }
>  
> -- 
> 2.41.0
> 
> 
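
Whether such an ALIGN() entry takes effect can be verified from the
section table of a built module. The sketch below parses canned
`objdump -h`-style output (the same format as the dump in the cover
letter) and flags an under-aligned __ex_table; the sample lines and the
4-byte threshold are chosen for illustration, not taken from a real
module:

```shell
#!/bin/sh
# Flag an __ex_table section whose link-time alignment is below 2**2
# (4 bytes). Input mimics `objdump -h module.ko`; a canned sample is
# used so the check runs without a built module. On a real module you
# would pipe `objdump -h some_module.ko` in instead of $sample.
sample=' 10 __ex_table    00000040  0000000000000000  0000000000000000  00003000  2**1
 11 __ksymtab     00000060  0000000000000000  0000000000000000  00003100  2**3'

check=$(printf '%s\n' "$sample" | awk '
  # Field 2 is the section name, the last field is the alignment
  # exponent printed by objdump (2**n).
  $2 == "__ex_table" && ($NF == "2**0" || $NF == "2**1") {
    print $2 " is under-aligned (" $NF ")"
  }')
echo "$check"
```

With the ALIGN(4) entry in module.lds.S applied, the alignment column
for __ex_table should read 2**2 or higher and the check prints nothing.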


end of thread, other threads:[~2024-01-29 18:50 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-22 22:18 [PATCH 0/4] Section alignment issues? deller
2023-11-22 22:18 ` [PATCH 1/4] linux/export: Fix alignment for 64-bit ksymtab entries deller
2023-12-21 10:22   ` Masahiro Yamada
2023-12-21 16:01     ` Masahiro Yamada
2023-12-22  6:07       ` Luis Chamberlain
2023-12-22  6:08         ` Luis Chamberlain
2023-12-22  7:01           ` Masahiro Yamada
2023-12-22 20:11             ` Luis Chamberlain
2023-12-23 14:35               ` Masahiro Yamada
2023-11-22 22:18 ` [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections deller
2023-12-22  5:59   ` Luis Chamberlain
2023-12-22 12:13     ` Helge Deller
2023-12-22 20:10       ` Luis Chamberlain
2023-12-30  7:33         ` Helge Deller
2024-01-22 16:10           ` Luis Chamberlain
2024-01-22 16:47             ` Helge Deller
2024-01-22 18:48               ` Luis Chamberlain
2023-11-22 22:18 ` [PATCH 3/4] vmlinux.lds.h: Fix alignment for __ksymtab*, __kcrctab_* and .pci_fixup sections deller
2023-12-21 13:07   ` Masahiro Yamada
2023-12-22  9:02     ` Helge Deller
2023-12-23  4:10       ` Masahiro Yamada
2023-11-22 22:18 ` [PATCH 4/4] modules: Add missing entry for __ex_table deller
2024-01-29 18:50   ` Luis Chamberlain
2023-12-19 21:26 ` [PATCH 0/4] Section alignment issues? Luis Chamberlain
2023-12-20 19:40   ` Luis Chamberlain
2023-12-22  9:13     ` Helge Deller
2023-12-21 13:40 ` Masahiro Yamada
2023-12-21 15:42   ` Masahiro Yamada
2023-12-22  8:23     ` Helge Deller
2023-12-23  1:32       ` Masahiro Yamada
2023-12-22  9:48     ` David Laight
