linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
@ 2024-03-06 20:05 Calvin Owens
  2024-03-06 20:05 ` [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available Calvin Owens
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-06 20:05 UTC (permalink / raw)
  To: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner
  Cc: Calvin Owens, bpf, linux-modules, linux-kernel

Hello all,

This patchset makes it possible to use bpftrace with kprobes on kernels
built without loadable module support.

On a Raspberry Pi 4b, this saves about 700KB of memory where BPF is
needed but loadable module support is not. These two kernels had
identical configurations, except CONFIG_MODULE was off in the second:

   - Linux version 6.8.0-rc7
   - Memory: 3330672K/4050944K available (16576K kernel code, 2390K rwdata,
   - 12364K rodata, 5632K init, 675K bss, 195984K reserved, 524288K cma-reserved)
   + Linux version 6.8.0-rc7-00003-g2af01251ca21
   + Memory: 3331400K/4050944K available (16512K kernel code, 2384K rwdata,
   + 11728K rodata, 5632K init, 673K bss, 195256K reserved, 524288K cma-reserved)

I don't intend to present an exhaustive list of !MODULES usecases, since
I'm sure there are many I'm not aware of. Performance is a common one,
the primary justification being that static text is mapped on hugepages
and module text is not. Security is another, since rootkits are much
harder to implement without modules.

The first patch is the interesting one: it moves module_alloc() into its
own file with its own Kconfig option, so it can be utilized even when
loadable module support is disabled. I got the idea from an unmerged
patch from a few years ago I found on lkml (see [1/4] for details). I
think this also has value in its own right, since I suspect there are
potential users beyond bpf, hopefully we will hear from some.

Patches 2-3 are proofs of concept to demonstrate the first patch is
sufficient to achieve my goal (full ebpf functionality without modules).

Patch 4 adds a new "-n" argument to vmtest.sh to run the BPF selftests
without modules, so the prior three patches can be rigorously tested.

If something like the first patch were to eventually be merged, the rest
could go through the normal bpf-next process as I clean them up: I've
only based them on Linus' tree and combined them into a series here to
introduce the idea.

If you prefer to fetch the patches via git:

  [1/4] https://github.com/jcalvinowens/linux.git work/module-alloc
 +[2/4]+[3/4] https://github.com/jcalvinowens/linux.git work/nomodule-bpf
 +[4/4] https://github.com/jcalvinowens/linux.git testing/nomodule-bpf-ci

In addition to the automated BPF selftests, I've lightly tested this on
my laptop (x86_64), a Raspberry Pi 4b (arm64), and a Raspberry Pi Zero W
(arm). The other architectures have only been compile tested.

I didn't want to spam all the arch maintainers with what I expect will
be a discussion mostly about modules and bpf, so I've left them off this
first submission. I will be sure to add them on future submissions of
the first patch. Of course, feedback on the arch bits is welcome here.

In addition to feedback on the patches themselves, I'm interested in
hearing from anybody else who might find this functionality useful.

Thanks,
Calvin


Calvin Owens (4):
  module: mm: Make module_alloc() generally available
  bpf: Allow BPF_JIT with CONFIG_MODULES=n
  kprobes: Allow kprobes with CONFIG_MODULES=n
  selftests/bpf: Support testing the !MODULES case

 arch/Kconfig                                  |   4 +-
 arch/arm/kernel/module.c                      |  35 -----
 arch/arm/mm/Makefile                          |   2 +
 arch/arm/mm/module_alloc.c                    |  40 ++++++
 arch/arm64/kernel/module.c                    | 127 -----------------
 arch/arm64/mm/Makefile                        |   1 +
 arch/arm64/mm/module_alloc.c                  | 130 ++++++++++++++++++
 arch/loongarch/kernel/module.c                |   6 -
 arch/loongarch/mm/Makefile                    |   2 +
 arch/loongarch/mm/module_alloc.c              |  10 ++
 arch/mips/kernel/module.c                     |  10 --
 arch/mips/mm/Makefile                         |   2 +
 arch/mips/mm/module_alloc.c                   |  13 ++
 arch/nios2/kernel/module.c                    |  20 ---
 arch/nios2/mm/Makefile                        |   2 +
 arch/nios2/mm/module_alloc.c                  |  22 +++
 arch/parisc/kernel/module.c                   |  12 --
 arch/parisc/mm/Makefile                       |   1 +
 arch/parisc/mm/module_alloc.c                 |  15 ++
 arch/powerpc/kernel/module.c                  |  36 -----
 arch/powerpc/mm/Makefile                      |   1 +
 arch/powerpc/mm/module_alloc.c                |  41 ++++++
 arch/riscv/kernel/module.c                    |  11 --
 arch/riscv/mm/Makefile                        |   1 +
 arch/riscv/mm/module_alloc.c                  |  17 +++
 arch/s390/kernel/module.c                     |  37 -----
 arch/s390/mm/Makefile                         |   1 +
 arch/s390/mm/module_alloc.c                   |  42 ++++++
 arch/sparc/kernel/module.c                    |  31 -----
 arch/sparc/mm/Makefile                        |   2 +
 arch/sparc/mm/module_alloc.c                  |  31 +++++
 arch/x86/kernel/ftrace.c                      |   2 +-
 arch/x86/kernel/module.c                      |  56 --------
 arch/x86/mm/Makefile                          |   2 +
 arch/x86/mm/module_alloc.c                    |  59 ++++++++
 fs/proc/kcore.c                               |   2 +-
 include/trace/events/bpf_testmod.h            |   1 +
 kernel/bpf/Kconfig                            |  11 +-
 kernel/bpf/Makefile                           |   2 +
 kernel/bpf/bpf_struct_ops.c                   |  28 +++-
 kernel/bpf/bpf_testmod/Makefile               |   1 +
 kernel/bpf/bpf_testmod/bpf_testmod.c          |   1 +
 kernel/bpf/bpf_testmod/bpf_testmod.h          |   1 +
 kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h    |   1 +
 kernel/kprobes.c                              |  22 +++
 kernel/module/Kconfig                         |   1 +
 kernel/module/main.c                          |  17 ---
 kernel/trace/trace_kprobe.c                   |  11 ++
 mm/Kconfig                                    |   3 +
 mm/Makefile                                   |   1 +
 mm/module_alloc.c                             |  21 +++
 mm/vmalloc.c                                  |   2 +-
 net/bpf/test_run.c                            |   2 +
 tools/testing/selftests/bpf/Makefile          |  28 ++--
 .../selftests/bpf/bpf_testmod/Makefile        |   2 +-
 .../bpf/bpf_testmod/bpf_testmod-events.h      |   6 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |   4 +
 .../bpf/bpf_testmod/bpf_testmod_kfunc.h       |   2 +
 tools/testing/selftests/bpf/config            |   5 -
 tools/testing/selftests/bpf/config.mods       |   5 +
 tools/testing/selftests/bpf/config.nomods     |   1 +
 .../selftests/bpf/progs/btf_type_tag_percpu.c |   2 +
 .../selftests/bpf/progs/btf_type_tag_user.c   |   2 +
 tools/testing/selftests/bpf/progs/core_kern.c |   2 +
 .../selftests/bpf/progs/iters_testmod_seq.c   |   2 +
 .../bpf/progs/test_core_reloc_module.c        |   2 +
 .../selftests/bpf/progs/test_ldsx_insn.c      |   2 +
 .../selftests/bpf/progs/test_module_attach.c  |   3 +
 .../selftests/bpf/progs/tracing_struct.c      |   2 +
 tools/testing/selftests/bpf/testing_helpers.c |  14 ++
 tools/testing/selftests/bpf/vmtest.sh         |  24 +++-
 71 files changed, 636 insertions(+), 424 deletions(-)
 create mode 100644 arch/arm/mm/module_alloc.c
 create mode 100644 arch/arm64/mm/module_alloc.c
 create mode 100644 arch/loongarch/mm/module_alloc.c
 create mode 100644 arch/mips/mm/module_alloc.c
 create mode 100644 arch/nios2/mm/module_alloc.c
 create mode 100644 arch/parisc/mm/module_alloc.c
 create mode 100644 arch/powerpc/mm/module_alloc.c
 create mode 100644 arch/riscv/mm/module_alloc.c
 create mode 100644 arch/s390/mm/module_alloc.c
 create mode 100644 arch/sparc/mm/module_alloc.c
 create mode 100644 arch/x86/mm/module_alloc.c
 create mode 120000 include/trace/events/bpf_testmod.h
 create mode 100644 kernel/bpf/bpf_testmod/Makefile
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.c
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.h
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h
 create mode 100644 mm/module_alloc.c
 create mode 100644 tools/testing/selftests/bpf/config.mods
 create mode 100644 tools/testing/selftests/bpf/config.nomods

-- 
2.43.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available
  2024-03-06 20:05 [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Calvin Owens
@ 2024-03-06 20:05 ` Calvin Owens
  2024-03-07 14:43   ` Christophe Leroy
  2024-03-08  2:16   ` Masami Hiramatsu
  2024-03-06 20:05 ` [RFC][PATCH 2/4] bpf: Allow BPF_JIT with CONFIG_MODULES=n Calvin Owens
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-06 20:05 UTC (permalink / raw)
  To: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner
  Cc: Calvin Owens, bpf, linux-modules, linux-kernel

Both BPF_JIT and KPROBES depend on CONFIG_MODULES, but only require
module_alloc() itself, which can be easily separated into a standalone
allocator for executable kernel memory.

Thomas Gleixner sent a patch to do that for x86 as part of a larger
series a couple years ago:

    https://lore.kernel.org/all/20220716230953.442937066@linutronix.de/

I've simply extended that approach to the whole kernel.

Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
---
 arch/Kconfig                     |   2 +-
 arch/arm/kernel/module.c         |  35 ---------
 arch/arm/mm/Makefile             |   2 +
 arch/arm/mm/module_alloc.c       |  40 ++++++++++
 arch/arm64/kernel/module.c       | 127 ------------------------------
 arch/arm64/mm/Makefile           |   1 +
 arch/arm64/mm/module_alloc.c     | 130 +++++++++++++++++++++++++++++++
 arch/loongarch/kernel/module.c   |   6 --
 arch/loongarch/mm/Makefile       |   2 +
 arch/loongarch/mm/module_alloc.c |  10 +++
 arch/mips/kernel/module.c        |  10 ---
 arch/mips/mm/Makefile            |   2 +
 arch/mips/mm/module_alloc.c      |  13 ++++
 arch/nios2/kernel/module.c       |  20 -----
 arch/nios2/mm/Makefile           |   2 +
 arch/nios2/mm/module_alloc.c     |  22 ++++++
 arch/parisc/kernel/module.c      |  12 ---
 arch/parisc/mm/Makefile          |   1 +
 arch/parisc/mm/module_alloc.c    |  15 ++++
 arch/powerpc/kernel/module.c     |  36 ---------
 arch/powerpc/mm/Makefile         |   1 +
 arch/powerpc/mm/module_alloc.c   |  41 ++++++++++
 arch/riscv/kernel/module.c       |  11 ---
 arch/riscv/mm/Makefile           |   1 +
 arch/riscv/mm/module_alloc.c     |  17 ++++
 arch/s390/kernel/module.c        |  37 ---------
 arch/s390/mm/Makefile            |   1 +
 arch/s390/mm/module_alloc.c      |  42 ++++++++++
 arch/sparc/kernel/module.c       |  31 --------
 arch/sparc/mm/Makefile           |   2 +
 arch/sparc/mm/module_alloc.c     |  31 ++++++++
 arch/x86/kernel/ftrace.c         |   2 +-
 arch/x86/kernel/module.c         |  56 -------------
 arch/x86/mm/Makefile             |   2 +
 arch/x86/mm/module_alloc.c       |  59 ++++++++++++++
 fs/proc/kcore.c                  |   2 +-
 kernel/module/Kconfig            |   1 +
 kernel/module/main.c             |  17 ----
 mm/Kconfig                       |   3 +
 mm/Makefile                      |   1 +
 mm/module_alloc.c                |  21 +++++
 mm/vmalloc.c                     |   2 +-
 42 files changed, 467 insertions(+), 402 deletions(-)
 create mode 100644 arch/arm/mm/module_alloc.c
 create mode 100644 arch/arm64/mm/module_alloc.c
 create mode 100644 arch/loongarch/mm/module_alloc.c
 create mode 100644 arch/mips/mm/module_alloc.c
 create mode 100644 arch/nios2/mm/module_alloc.c
 create mode 100644 arch/parisc/mm/module_alloc.c
 create mode 100644 arch/powerpc/mm/module_alloc.c
 create mode 100644 arch/riscv/mm/module_alloc.c
 create mode 100644 arch/s390/mm/module_alloc.c
 create mode 100644 arch/sparc/mm/module_alloc.c
 create mode 100644 arch/x86/mm/module_alloc.c
 create mode 100644 mm/module_alloc.c

diff --git a/arch/Kconfig b/arch/Kconfig
index a5af0edd3eb8..cfc24ced16dd 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1305,7 +1305,7 @@ config ARCH_HAS_STRICT_MODULE_RWX
 
 config STRICT_MODULE_RWX
 	bool "Set loadable kernel module data as NX and text as RO" if ARCH_OPTIONAL_KERNEL_RWX
-	depends on ARCH_HAS_STRICT_MODULE_RWX && MODULES
+	depends on ARCH_HAS_STRICT_MODULE_RWX && MODULE_ALLOC
 	default !ARCH_OPTIONAL_KERNEL_RWX || ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
 	help
 	  If this is set, module text and rodata memory will be made read-only,
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index e74d84f58b77..1c8798732d12 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -4,15 +4,12 @@
  *
  *  Copyright (C) 2002 Russell King.
  *  Modified for nommu by Hyok S. Choi
- *
- * Module allocation method suggested by Andi Kleen.
  */
 #include <linux/module.h>
 #include <linux/moduleloader.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/elf.h>
-#include <linux/vmalloc.h>
 #include <linux/fs.h>
 #include <linux/string.h>
 #include <linux/gfp.h>
@@ -22,38 +19,6 @@
 #include <asm/unwind.h>
 #include <asm/opcodes.h>
 
-#ifdef CONFIG_XIP_KERNEL
-/*
- * The XIP kernel text is mapped in the module area for modules and
- * some other stuff to work without any indirect relocations.
- * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
- * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
- */
-#undef MODULES_VADDR
-#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
-#endif
-
-#ifdef CONFIG_MMU
-void *module_alloc(unsigned long size)
-{
-	gfp_t gfp_mask = GFP_KERNEL;
-	void *p;
-
-	/* Silence the initial allocation */
-	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS))
-		gfp_mask |= __GFP_NOWARN;
-
-	p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-				gfp_mask, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
-	if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)
-		return p;
-	return __vmalloc_node_range(size, 1,  VMALLOC_START, VMALLOC_END,
-				GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
-}
-#endif
-
 bool module_init_section(const char *name)
 {
 	return strstarts(name, ".init") ||
diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index 71b858c9b10c..a05a6701a884 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -100,3 +100,5 @@ obj-$(CONFIG_CACHE_UNIPHIER)	+= cache-uniphier.o
 
 KASAN_SANITIZE_kasan_init.o	:= n
 obj-$(CONFIG_KASAN)		+= kasan_init.o
+
+obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
diff --git a/arch/arm/mm/module_alloc.c b/arch/arm/mm/module_alloc.c
new file mode 100644
index 000000000000..e48be48b2b5f
--- /dev/null
+++ b/arch/arm/mm/module_alloc.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+
+#ifdef CONFIG_XIP_KERNEL
+/*
+ * The XIP kernel text is mapped in the module area for modules and
+ * some other stuff to work without any indirect relocations.
+ * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
+ * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
+ */
+#undef MODULES_VADDR
+#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
+#endif
+
+/*
+ * Module allocation method suggested by Andi Kleen.
+ */
+
+#ifdef CONFIG_MMU
+void *module_alloc(unsigned long size)
+{
+	gfp_t gfp_mask = GFP_KERNEL;
+	void *p;
+
+	/* Silence the initial allocation */
+	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS))
+		gfp_mask |= __GFP_NOWARN;
+
+	p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
+				gfp_mask, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
+				__builtin_return_address(0));
+	if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)
+		return p;
+	return __vmalloc_node_range(size, 1,  VMALLOC_START, VMALLOC_END,
+				GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
+				__builtin_return_address(0));
+}
+#endif
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index dd851297596e..78758ed818b0 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -13,143 +13,16 @@
 #include <linux/elf.h>
 #include <linux/ftrace.h>
 #include <linux/gfp.h>
-#include <linux/kasan.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/moduleloader.h>
-#include <linux/random.h>
 #include <linux/scs.h>
-#include <linux/vmalloc.h>
 
 #include <asm/alternative.h>
 #include <asm/insn.h>
 #include <asm/scs.h>
 #include <asm/sections.h>
 
-static u64 module_direct_base __ro_after_init = 0;
-static u64 module_plt_base __ro_after_init = 0;
-
-/*
- * Choose a random page-aligned base address for a window of 'size' bytes which
- * entirely contains the interval [start, end - 1].
- */
-static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
-{
-	u64 max_pgoff, pgoff;
-
-	if ((end - start) >= size)
-		return 0;
-
-	max_pgoff = (size - (end - start)) / PAGE_SIZE;
-	pgoff = get_random_u32_inclusive(0, max_pgoff);
-
-	return start - pgoff * PAGE_SIZE;
-}
-
-/*
- * Modules may directly reference data and text anywhere within the kernel
- * image and other modules. References using PREL32 relocations have a +/-2G
- * range, and so we need to ensure that the entire kernel image and all modules
- * fall within a 2G window such that these are always within range.
- *
- * Modules may directly branch to functions and code within the kernel text,
- * and to functions and code within other modules. These branches will use
- * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
- * that the entire kernel text and all module text falls within a 128M window
- * such that these are always within range. With PLTs, we can expand this to a
- * 2G window.
- *
- * We chose the 128M region to surround the entire kernel image (rather than
- * just the text) as using the same bounds for the 128M and 2G regions ensures
- * by construction that we never select a 128M region that is not a subset of
- * the 2G region. For very large and unusual kernel configurations this means
- * we may fall back to PLTs where they could have been avoided, but this keeps
- * the logic significantly simpler.
- */
-static int __init module_init_limits(void)
-{
-	u64 kernel_end = (u64)_end;
-	u64 kernel_start = (u64)_text;
-	u64 kernel_size = kernel_end - kernel_start;
-
-	/*
-	 * The default modules region is placed immediately below the kernel
-	 * image, and is large enough to use the full 2G relocation range.
-	 */
-	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
-	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
-
-	if (!kaslr_enabled()) {
-		if (kernel_size < SZ_128M)
-			module_direct_base = kernel_end - SZ_128M;
-		if (kernel_size < SZ_2G)
-			module_plt_base = kernel_end - SZ_2G;
-	} else {
-		u64 min = kernel_start;
-		u64 max = kernel_end;
-
-		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
-			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
-		} else {
-			module_direct_base = random_bounding_box(SZ_128M, min, max);
-			if (module_direct_base) {
-				min = module_direct_base;
-				max = module_direct_base + SZ_128M;
-			}
-		}
-
-		module_plt_base = random_bounding_box(SZ_2G, min, max);
-	}
-
-	pr_info("%llu pages in range for non-PLT usage",
-		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
-	pr_info("%llu pages in range for PLT usage",
-		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
-
-	return 0;
-}
-subsys_initcall(module_init_limits);
-
-void *module_alloc(unsigned long size)
-{
-	void *p = NULL;
-
-	/*
-	 * Where possible, prefer to allocate within direct branch range of the
-	 * kernel such that no PLTs are necessary.
-	 */
-	if (module_direct_base) {
-		p = __vmalloc_node_range(size, MODULE_ALIGN,
-					 module_direct_base,
-					 module_direct_base + SZ_128M,
-					 GFP_KERNEL | __GFP_NOWARN,
-					 PAGE_KERNEL, 0, NUMA_NO_NODE,
-					 __builtin_return_address(0));
-	}
-
-	if (!p && module_plt_base) {
-		p = __vmalloc_node_range(size, MODULE_ALIGN,
-					 module_plt_base,
-					 module_plt_base + SZ_2G,
-					 GFP_KERNEL | __GFP_NOWARN,
-					 PAGE_KERNEL, 0, NUMA_NO_NODE,
-					 __builtin_return_address(0));
-	}
-
-	if (!p) {
-		pr_warn_ratelimited("%s: unable to allocate memory\n",
-				    __func__);
-	}
-
-	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
-		vfree(p);
-		return NULL;
-	}
-
-	/* Memory is intended to be executable, reset the pointer tag. */
-	return kasan_reset_tag(p);
-}
-
 enum aarch64_reloc_op {
 	RELOC_OP_NONE,
 	RELOC_OP_ABS,
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index dbd1bc95967d..cf616635a80d 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_TRANS_TABLE)	+= trans_pgd.o
 obj-$(CONFIG_TRANS_TABLE)	+= trans_pgd-asm.o
 obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
 obj-$(CONFIG_ARM64_MTE)		+= mteswap.o
+obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
 KASAN_SANITIZE_physaddr.o	+= n
 
 obj-$(CONFIG_KASAN)		+= kasan_init.o
diff --git a/arch/arm64/mm/module_alloc.c b/arch/arm64/mm/module_alloc.c
new file mode 100644
index 000000000000..302642ea9e26
--- /dev/null
+++ b/arch/arm64/mm/module_alloc.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+#include <linux/kasan.h>
+#include <linux/random.h>
+
+static u64 module_direct_base __ro_after_init = 0;
+static u64 module_plt_base __ro_after_init = 0;
+
+/*
+ * Choose a random page-aligned base address for a window of 'size' bytes which
+ * entirely contains the interval [start, end - 1].
+ */
+static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
+{
+	u64 max_pgoff, pgoff;
+
+	if ((end - start) >= size)
+		return 0;
+
+	max_pgoff = (size - (end - start)) / PAGE_SIZE;
+	pgoff = get_random_u32_inclusive(0, max_pgoff);
+
+	return start - pgoff * PAGE_SIZE;
+}
+
+/*
+ * Modules may directly reference data and text anywhere within the kernel
+ * image and other modules. References using PREL32 relocations have a +/-2G
+ * range, and so we need to ensure that the entire kernel image and all modules
+ * fall within a 2G window such that these are always within range.
+ *
+ * Modules may directly branch to functions and code within the kernel text,
+ * and to functions and code within other modules. These branches will use
+ * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
+ * that the entire kernel text and all module text falls within a 128M window
+ * such that these are always within range. With PLTs, we can expand this to a
+ * 2G window.
+ *
+ * We chose the 128M region to surround the entire kernel image (rather than
+ * just the text) as using the same bounds for the 128M and 2G regions ensures
+ * by construction that we never select a 128M region that is not a subset of
+ * the 2G region. For very large and unusual kernel configurations this means
+ * we may fall back to PLTs where they could have been avoided, but this keeps
+ * the logic significantly simpler.
+ */
+static int __init module_init_limits(void)
+{
+	u64 kernel_end = (u64)_end;
+	u64 kernel_start = (u64)_text;
+	u64 kernel_size = kernel_end - kernel_start;
+
+	/*
+	 * The default modules region is placed immediately below the kernel
+	 * image, and is large enough to use the full 2G relocation range.
+	 */
+	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
+	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
+
+	if (!kaslr_enabled()) {
+		if (kernel_size < SZ_128M)
+			module_direct_base = kernel_end - SZ_128M;
+		if (kernel_size < SZ_2G)
+			module_plt_base = kernel_end - SZ_2G;
+	} else {
+		u64 min = kernel_start;
+		u64 max = kernel_end;
+
+		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
+			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
+		} else {
+			module_direct_base = random_bounding_box(SZ_128M, min, max);
+			if (module_direct_base) {
+				min = module_direct_base;
+				max = module_direct_base + SZ_128M;
+			}
+		}
+
+		module_plt_base = random_bounding_box(SZ_2G, min, max);
+	}
+
+	pr_info("%llu pages in range for non-PLT usage",
+		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
+	pr_info("%llu pages in range for PLT usage",
+		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
+
+	return 0;
+}
+subsys_initcall(module_init_limits);
+
+void *module_alloc(unsigned long size)
+{
+	void *p = NULL;
+
+	/*
+	 * Where possible, prefer to allocate within direct branch range of the
+	 * kernel such that no PLTs are necessary.
+	 */
+	if (module_direct_base) {
+		p = __vmalloc_node_range(size, MODULE_ALIGN,
+					 module_direct_base,
+					 module_direct_base + SZ_128M,
+					 GFP_KERNEL | __GFP_NOWARN,
+					 PAGE_KERNEL, 0, NUMA_NO_NODE,
+					 __builtin_return_address(0));
+	}
+
+	if (!p && module_plt_base) {
+		p = __vmalloc_node_range(size, MODULE_ALIGN,
+					 module_plt_base,
+					 module_plt_base + SZ_2G,
+					 GFP_KERNEL | __GFP_NOWARN,
+					 PAGE_KERNEL, 0, NUMA_NO_NODE,
+					 __builtin_return_address(0));
+	}
+
+	if (!p) {
+		pr_warn_ratelimited("%s: unable to allocate memory\n",
+				    __func__);
+	}
+
+	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
+		vfree(p);
+		return NULL;
+	}
+
+	/* Memory is intended to be executable, reset the pointer tag. */
+	return kasan_reset_tag(p);
+}
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index b13b2858fe39..7f03166513b3 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -489,12 +489,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }
 
-void *module_alloc(unsigned long size)
-{
-	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
-}
-
 static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
 				   const Elf_Shdr *sechdrs, struct module *mod)
 {
diff --git a/arch/loongarch/mm/Makefile b/arch/loongarch/mm/Makefile
index e4d1e581dbae..3966fc6118f1 100644
--- a/arch/loongarch/mm/Makefile
+++ b/arch/loongarch/mm/Makefile
@@ -10,3 +10,5 @@ obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_KASAN)		+= kasan_init.o
 
 KASAN_SANITIZE_kasan_init.o     := n
+
+obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
diff --git a/arch/loongarch/mm/module_alloc.c b/arch/loongarch/mm/module_alloc.c
new file mode 100644
index 000000000000..24b0cb3a2088
--- /dev/null
+++ b/arch/loongarch/mm/module_alloc.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0+
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+
+void *module_alloc(unsigned long size)
+{
+	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
+			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
+}
diff --git a/arch/mips/kernel/module.c b/arch/mips/kernel/module.c
index 7b2fbaa9cac5..ba0f62d8eff5 100644
--- a/arch/mips/kernel/module.c
+++ b/arch/mips/kernel/module.c
@@ -13,7 +13,6 @@
 #include <linux/elf.h>
 #include <linux/mm.h>
 #include <linux/numa.h>
-#include <linux/vmalloc.h>
 #include <linux/slab.h>
 #include <linux/fs.h>
 #include <linux/string.h>
@@ -31,15 +30,6 @@ struct mips_hi16 {
 static LIST_HEAD(dbe_list);
 static DEFINE_SPINLOCK(dbe_lock);
 
-#ifdef MODULE_START
-void *module_alloc(unsigned long size)
-{
-	return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
-				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
-}
-#endif
-
 static void apply_r_mips_32(u32 *location, u32 base, Elf_Addr v)
 {
 	*location = base + v;
diff --git a/arch/mips/mm/Makefile b/arch/mips/mm/Makefile
index 304692391519..b9cfe37e41e4 100644
--- a/arch/mips/mm/Makefile
+++ b/arch/mips/mm/Makefile
@@ -45,3 +45,5 @@ obj-$(CONFIG_MIPS_CPU_SCACHE)	+= sc-mips.o
 obj-$(CONFIG_SCACHE_DEBUGFS)	+= sc-debugfs.o
 
 obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
+
+obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
diff --git a/arch/mips/mm/module_alloc.c b/arch/mips/mm/module_alloc.c
new file mode 100644
index 000000000000..fcdbdece42f3
--- /dev/null
+++ b/arch/mips/mm/module_alloc.c
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+
+#ifdef MODULE_START
+void *module_alloc(unsigned long size)
+{
+	return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
+				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
+				__builtin_return_address(0));
+}
+#endif
diff --git a/arch/nios2/kernel/module.c b/arch/nios2/kernel/module.c
index 76e0a42d6e36..f4483243578d 100644
--- a/arch/nios2/kernel/module.c
+++ b/arch/nios2/kernel/module.c
@@ -13,7 +13,6 @@
 #include <linux/moduleloader.h>
 #include <linux/elf.h>
 #include <linux/mm.h>
-#include <linux/vmalloc.h>
 #include <linux/slab.h>
 #include <linux/fs.h>
 #include <linux/string.h>
@@ -21,25 +20,6 @@
 
 #include <asm/cacheflush.h>
 
-/*
- * Modules should NOT be allocated with kmalloc for (obvious) reasons.
- * But we do it for now to avoid relocation issues. CALL26/PCREL26 cannot reach
- * from 0x80000000 (vmalloc area) to 0xc00000000 (kernel) (kmalloc returns
- * addresses in 0xc0000000)
- */
-void *module_alloc(unsigned long size)
-{
-	if (size == 0)
-		return NULL;
-	return kmalloc(size, GFP_KERNEL);
-}
-
-/* Free memory returned from module_alloc */
-void module_memfree(void *module_region)
-{
-	kfree(module_region);
-}
-
 int apply_relocate_add(Elf32_Shdr *sechdrs, const char *strtab,
 			unsigned int symindex, unsigned int relsec,
 			struct module *mod)
diff --git a/arch/nios2/mm/Makefile b/arch/nios2/mm/Makefile
index 9d37fafd1dd1..facbb3e60013 100644
--- a/arch/nios2/mm/Makefile
+++ b/arch/nios2/mm/Makefile
@@ -13,3 +13,5 @@ obj-y	+= mmu_context.o
 obj-y	+= pgtable.o
 obj-y	+= tlb.o
 obj-y	+= uaccess.o
+
+obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
diff --git a/arch/nios2/mm/module_alloc.c b/arch/nios2/mm/module_alloc.c
new file mode 100644
index 000000000000..92c7c32ef8b3
--- /dev/null
+++ b/arch/nios2/mm/module_alloc.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/moduleloader.h>
+#include <linux/slab.h>
+
+/*
+ * Modules should NOT be allocated with kmalloc for (obvious) reasons.
+ * But we do it for now to avoid relocation issues. CALL26/PCREL26 cannot reach
+ * from 0x80000000 (vmalloc area) to 0xc00000000 (kernel) (kmalloc returns
+ * addresses in 0xc0000000)
+ */
+void *module_alloc(unsigned long size)
+{
+	if (size == 0)
+		return NULL;
+	return kmalloc(size, GFP_KERNEL);
+}
+
+/* Free memory returned from module_alloc */
+void module_memfree(void *module_region)
+{
+	kfree(module_region);
+}
diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
index d214bbe3c2af..4e5d991b2b65 100644
--- a/arch/parisc/kernel/module.c
+++ b/arch/parisc/kernel/module.c
@@ -41,7 +41,6 @@
 
 #include <linux/moduleloader.h>
 #include <linux/elf.h>
-#include <linux/vmalloc.h>
 #include <linux/fs.h>
 #include <linux/ftrace.h>
 #include <linux/string.h>
@@ -173,17 +172,6 @@ static inline int reassemble_22(int as22)
 		((as22 & 0x0003ff) << 3));
 }
 
-void *module_alloc(unsigned long size)
-{
-	/* using RWX means less protection for modules, but it's
-	 * easier than trying to map the text, data, init_text and
-	 * init_data correctly */
-	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
-				    GFP_KERNEL,
-				    PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
-}
-
 #ifndef CONFIG_64BIT
 static inline unsigned long count_gots(const Elf_Rela *rela, unsigned long n)
 {
diff --git a/arch/parisc/mm/Makefile b/arch/parisc/mm/Makefile
index ffdb5c0a8cc6..95a6d4469785 100644
--- a/arch/parisc/mm/Makefile
+++ b/arch/parisc/mm/Makefile
@@ -5,3 +5,4 @@
 
 obj-y	 := init.o fault.o ioremap.o fixmap.o
 obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
+obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
diff --git a/arch/parisc/mm/module_alloc.c b/arch/parisc/mm/module_alloc.c
new file mode 100644
index 000000000000..5ad9bfc3ffab
--- /dev/null
+++ b/arch/parisc/mm/module_alloc.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+
+void *module_alloc(unsigned long size)
+{
+	/* using RWX means less protection for modules, but it's
+	 * easier than trying to map the text, data, init_text and
+	 * init_data correctly */
+	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
+				    GFP_KERNEL,
+				    PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
+				    __builtin_return_address(0));
+}
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index f6d6ae0a1692..b5fe9c61e527 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -89,39 +89,3 @@ int module_finalize(const Elf_Ehdr *hdr,
 	return 0;
 }
 
-static __always_inline void *
-__module_alloc(unsigned long size, unsigned long start, unsigned long end, bool nowarn)
-{
-	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
-	gfp_t gfp = GFP_KERNEL | (nowarn ? __GFP_NOWARN : 0);
-
-	/*
-	 * Don't do huge page allocations for modules yet until more testing
-	 * is done. STRICT_MODULE_RWX may require extra work to support this
-	 * too.
-	 */
-	return __vmalloc_node_range(size, 1, start, end, gfp, prot,
-				    VM_FLUSH_RESET_PERMS,
-				    NUMA_NO_NODE, __builtin_return_address(0));
-}
-
-void *module_alloc(unsigned long size)
-{
-#ifdef MODULES_VADDR
-	unsigned long limit = (unsigned long)_etext - SZ_32M;
-	void *ptr = NULL;
-
-	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
-
-	/* First try within 32M limit from _etext to avoid branch trampolines */
-	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
-		ptr = __module_alloc(size, limit, MODULES_END, true);
-
-	if (!ptr)
-		ptr = __module_alloc(size, MODULES_VADDR, MODULES_END, false);
-
-	return ptr;
-#else
-	return __module_alloc(size, VMALLOC_START, VMALLOC_END, false);
-#endif
-}
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index 503a6e249940..4572273a838f 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -19,3 +19,4 @@ obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
 obj-$(CONFIG_PPC_COPRO_BASE)	+= copro_fault.o
 obj-$(CONFIG_PTDUMP_CORE)	+= ptdump/
 obj-$(CONFIG_KASAN)		+= kasan/
+obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
diff --git a/arch/powerpc/mm/module_alloc.c b/arch/powerpc/mm/module_alloc.c
new file mode 100644
index 000000000000..818e5cd8fbc6
--- /dev/null
+++ b/arch/powerpc/mm/module_alloc.c
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+
+static __always_inline void *
+__module_alloc(unsigned long size, unsigned long start, unsigned long end, bool nowarn)
+{
+	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
+	gfp_t gfp = GFP_KERNEL | (nowarn ? __GFP_NOWARN : 0);
+
+	/*
+	 * Don't do huge page allocations for modules yet until more testing
+	 * is done. STRICT_MODULE_RWX may require extra work to support this
+	 * too.
+	 */
+	return __vmalloc_node_range(size, 1, start, end, gfp, prot,
+				    VM_FLUSH_RESET_PERMS,
+				    NUMA_NO_NODE, __builtin_return_address(0));
+}
+
+void *module_alloc(unsigned long size)
+{
+#ifdef MODULES_VADDR
+	unsigned long limit = (unsigned long)_etext - SZ_32M;
+	void *ptr = NULL;
+
+	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
+
+	/* First try within 32M limit from _etext to avoid branch trampolines */
+	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
+		ptr = __module_alloc(size, limit, MODULES_END, true);
+
+	if (!ptr)
+		ptr = __module_alloc(size, MODULES_VADDR, MODULES_END, false);
+
+	return ptr;
+#else
+	return __module_alloc(size, VMALLOC_START, VMALLOC_END, false);
+#endif
+}
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 5e5a82644451..53d7005fdbdb 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -11,7 +11,6 @@
 #include <linux/kernel.h>
 #include <linux/log2.h>
 #include <linux/moduleloader.h>
-#include <linux/vmalloc.h>
 #include <linux/sizes.h>
 #include <linux/pgtable.h>
 #include <asm/alternative.h>
@@ -905,16 +904,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }
 
-#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
-void *module_alloc(unsigned long size)
-{
-	return __vmalloc_node_range(size, 1, MODULES_VADDR,
-				    MODULES_END, GFP_KERNEL,
-				    PAGE_KERNEL, VM_FLUSH_RESET_PERMS,
-				    NUMA_NO_NODE,
-				    __builtin_return_address(0));
-}
-#endif
 
 int module_finalize(const Elf_Ehdr *hdr,
 		    const Elf_Shdr *sechdrs,
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 2c869f8026a8..fba8e3595459 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -36,3 +36,4 @@ endif
 obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
 obj-$(CONFIG_RISCV_DMA_NONCOHERENT) += dma-noncoherent.o
 obj-$(CONFIG_RISCV_NONSTANDARD_CACHE_OPS) += cache-ops.o
+obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
diff --git a/arch/riscv/mm/module_alloc.c b/arch/riscv/mm/module_alloc.c
new file mode 100644
index 000000000000..2c1fb95a57e2
--- /dev/null
+++ b/arch/riscv/mm/module_alloc.c
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/pgtable.h>
+#include <asm/alternative.h>
+#include <asm/sections.h>
+
+#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
+void *module_alloc(unsigned long size)
+{
+	return __vmalloc_node_range(size, 1, MODULES_VADDR,
+				    MODULES_END, GFP_KERNEL,
+				    PAGE_KERNEL, VM_FLUSH_RESET_PERMS,
+				    NUMA_NO_NODE,
+				    __builtin_return_address(0));
+}
+#endif
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index 42215f9404af..ef8a7539bb0b 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -36,43 +36,6 @@
 
 #define PLT_ENTRY_SIZE 22
 
-static unsigned long get_module_load_offset(void)
-{
-	static DEFINE_MUTEX(module_kaslr_mutex);
-	static unsigned long module_load_offset;
-
-	if (!kaslr_enabled())
-		return 0;
-	/*
-	 * Calculate the module_load_offset the first time this code
-	 * is called. Once calculated it stays the same until reboot.
-	 */
-	mutex_lock(&module_kaslr_mutex);
-	if (!module_load_offset)
-		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
-	mutex_unlock(&module_kaslr_mutex);
-	return module_load_offset;
-}
-
-void *module_alloc(unsigned long size)
-{
-	gfp_t gfp_mask = GFP_KERNEL;
-	void *p;
-
-	if (PAGE_ALIGN(size) > MODULES_LEN)
-		return NULL;
-	p = __vmalloc_node_range(size, MODULE_ALIGN,
-				 MODULES_VADDR + get_module_load_offset(),
-				 MODULES_END, gfp_mask, PAGE_KERNEL,
-				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
-				 NUMA_NO_NODE, __builtin_return_address(0));
-	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
-		vfree(p);
-		return NULL;
-	}
-	return p;
-}
-
 #ifdef CONFIG_FUNCTION_TRACER
 void module_arch_cleanup(struct module *mod)
 {
diff --git a/arch/s390/mm/Makefile b/arch/s390/mm/Makefile
index 352ff520fd94..4f44c4096c6d 100644
--- a/arch/s390/mm/Makefile
+++ b/arch/s390/mm/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_PTDUMP_CORE)	+= dump_pagetables.o
 obj-$(CONFIG_PGSTE)		+= gmap.o
 obj-$(CONFIG_PFAULT)		+= pfault.o
+obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
diff --git a/arch/s390/mm/module_alloc.c b/arch/s390/mm/module_alloc.c
new file mode 100644
index 000000000000..88eadce4bc68
--- /dev/null
+++ b/arch/s390/mm/module_alloc.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0+
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+#include <linux/kasan.h>
+
+static unsigned long get_module_load_offset(void)
+{
+	static DEFINE_MUTEX(module_kaslr_mutex);
+	static unsigned long module_load_offset;
+
+	if (!kaslr_enabled())
+		return 0;
+	/*
+	 * Calculate the module_load_offset the first time this code
+	 * is called. Once calculated it stays the same until reboot.
+	 */
+	mutex_lock(&module_kaslr_mutex);
+	if (!module_load_offset)
+		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
+	mutex_unlock(&module_kaslr_mutex);
+	return module_load_offset;
+}
+
+void *module_alloc(unsigned long size)
+{
+	gfp_t gfp_mask = GFP_KERNEL;
+	void *p;
+
+	if (PAGE_ALIGN(size) > MODULES_LEN)
+		return NULL;
+	p = __vmalloc_node_range(size, MODULE_ALIGN,
+				 MODULES_VADDR + get_module_load_offset(),
+				 MODULES_END, gfp_mask, PAGE_KERNEL,
+				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
+				 NUMA_NO_NODE, __builtin_return_address(0));
+	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
+		vfree(p);
+		return NULL;
+	}
+	return p;
+}
diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
index 66c45a2764bc..0611a41cd586 100644
--- a/arch/sparc/kernel/module.c
+++ b/arch/sparc/kernel/module.c
@@ -8,7 +8,6 @@
 #include <linux/moduleloader.h>
 #include <linux/kernel.h>
 #include <linux/elf.h>
-#include <linux/vmalloc.h>
 #include <linux/fs.h>
 #include <linux/gfp.h>
 #include <linux/string.h>
@@ -21,36 +20,6 @@
 
 #include "entry.h"
 
-#ifdef CONFIG_SPARC64
-
-#include <linux/jump_label.h>
-
-static void *module_map(unsigned long size)
-{
-	if (PAGE_ALIGN(size) > MODULES_LEN)
-		return NULL;
-	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
-}
-#else
-static void *module_map(unsigned long size)
-{
-	return vmalloc(size);
-}
-#endif /* CONFIG_SPARC64 */
-
-void *module_alloc(unsigned long size)
-{
-	void *ret;
-
-	ret = module_map(size);
-	if (ret)
-		memset(ret, 0, size);
-
-	return ret;
-}
-
 /* Make generic code ignore STT_REGISTER dummy undefined symbols.  */
 int module_frob_arch_sections(Elf_Ehdr *hdr,
 			      Elf_Shdr *sechdrs,
diff --git a/arch/sparc/mm/Makefile b/arch/sparc/mm/Makefile
index 809d993f6d88..a8e9ba46679a 100644
--- a/arch/sparc/mm/Makefile
+++ b/arch/sparc/mm/Makefile
@@ -14,3 +14,5 @@ obj-$(CONFIG_SPARC32)   += leon_mm.o
 
 # Only used by sparc64
 obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
+
+obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
diff --git a/arch/sparc/mm/module_alloc.c b/arch/sparc/mm/module_alloc.c
new file mode 100644
index 000000000000..14aef0f75650
--- /dev/null
+++ b/arch/sparc/mm/module_alloc.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+
+#ifdef CONFIG_SPARC64
+static void *module_map(unsigned long size)
+{
+	if (PAGE_ALIGN(size) > MODULES_LEN)
+		return NULL;
+	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
+				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
+				__builtin_return_address(0));
+}
+#else
+static void *module_map(unsigned long size)
+{
+	return vmalloc(size);
+}
+#endif /* CONFIG_SPARC64 */
+
+void *module_alloc(unsigned long size)
+{
+	void *ret;
+
+	ret = module_map(size);
+	if (ret)
+		memset(ret, 0, size);
+
+	return ret;
+}
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 12df54ff0e81..99f242e11f88 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -260,7 +260,7 @@ void arch_ftrace_update_code(int command)
 /* Currently only x86_64 supports dynamic trampolines */
 #ifdef CONFIG_X86_64
 
-#ifdef CONFIG_MODULES
+#if IS_ENABLED(CONFIG_MODULE_ALLOC)
 #include <linux/moduleloader.h>
 /* Module allocation simplifies allocating memory for code */
 static inline void *alloc_tramp(unsigned long size)
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index e18914c0e38a..ad7e3968ee8f 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -8,21 +8,14 @@
 
 #include <linux/moduleloader.h>
 #include <linux/elf.h>
-#include <linux/vmalloc.h>
 #include <linux/fs.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
-#include <linux/kasan.h>
 #include <linux/bug.h>
-#include <linux/mm.h>
-#include <linux/gfp.h>
 #include <linux/jump_label.h>
-#include <linux/random.h>
 #include <linux/memory.h>
 
 #include <asm/text-patching.h>
-#include <asm/page.h>
-#include <asm/setup.h>
 #include <asm/unwind.h>
 
 #if 0
@@ -36,56 +29,7 @@ do {							\
 } while (0)
 #endif
 
-#ifdef CONFIG_RANDOMIZE_BASE
-static unsigned long module_load_offset;
 
-/* Mutex protects the module_load_offset. */
-static DEFINE_MUTEX(module_kaslr_mutex);
-
-static unsigned long int get_module_load_offset(void)
-{
-	if (kaslr_enabled()) {
-		mutex_lock(&module_kaslr_mutex);
-		/*
-		 * Calculate the module_load_offset the first time this
-		 * code is called. Once calculated it stays the same until
-		 * reboot.
-		 */
-		if (module_load_offset == 0)
-			module_load_offset =
-				get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
-		mutex_unlock(&module_kaslr_mutex);
-	}
-	return module_load_offset;
-}
-#else
-static unsigned long int get_module_load_offset(void)
-{
-	return 0;
-}
-#endif
-
-void *module_alloc(unsigned long size)
-{
-	gfp_t gfp_mask = GFP_KERNEL;
-	void *p;
-
-	if (PAGE_ALIGN(size) > MODULES_LEN)
-		return NULL;
-
-	p = __vmalloc_node_range(size, MODULE_ALIGN,
-				 MODULES_VADDR + get_module_load_offset(),
-				 MODULES_END, gfp_mask, PAGE_KERNEL,
-				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
-				 NUMA_NO_NODE, __builtin_return_address(0));
-
-	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
-		vfree(p);
-		return NULL;
-	}
-
-	return p;
-}
 
 #ifdef CONFIG_X86_32
 int apply_relocate(Elf32_Shdr *sechdrs,
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index c80febc44cd2..b9e42770a002 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -67,3 +67,5 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_amd.o
 
 obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_identity.o
 obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_boot.o
+
+obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
diff --git a/arch/x86/mm/module_alloc.c b/arch/x86/mm/module_alloc.c
new file mode 100644
index 000000000000..00391c15e1eb
--- /dev/null
+++ b/arch/x86/mm/module_alloc.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+#include <linux/kasan.h>
+#include <linux/random.h>
+#include <linux/mutex.h>
+#include <asm/setup.h>
+
+#ifdef CONFIG_RANDOMIZE_BASE
+static unsigned long module_load_offset;
+
+/* Mutex protects the module_load_offset. */
+static DEFINE_MUTEX(module_kaslr_mutex);
+
+static unsigned long int get_module_load_offset(void)
+{
+	if (kaslr_enabled()) {
+		mutex_lock(&module_kaslr_mutex);
+		/*
+		 * Calculate the module_load_offset the first time this
+		 * code is called. Once calculated it stays the same until
+		 * reboot.
+		 */
+		if (module_load_offset == 0)
+			module_load_offset =
+				get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
+		mutex_unlock(&module_kaslr_mutex);
+	}
+	return module_load_offset;
+}
+#else
+static unsigned long int get_module_load_offset(void)
+{
+	return 0;
+}
+#endif
+
+void *module_alloc(unsigned long size)
+{
+	gfp_t gfp_mask = GFP_KERNEL;
+	void *p;
+
+	if (PAGE_ALIGN(size) > MODULES_LEN)
+		return NULL;
+
+	p = __vmalloc_node_range(size, MODULE_ALIGN,
+				 MODULES_VADDR + get_module_load_offset(),
+				 MODULES_END, gfp_mask, PAGE_KERNEL,
+				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
+				 NUMA_NO_NODE, __builtin_return_address(0));
+
+	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
+		vfree(p);
+		return NULL;
+	}
+
+	return p;
+}
diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 6422e569b080..b8f4dcf92a89 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -668,7 +668,7 @@ static void __init proc_kcore_text_init(void)
 }
 #endif
 
-#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
+#if defined(CONFIG_MODULE_ALLOC) && defined(MODULES_VADDR)
 /*
  * MODULES_VADDR has no intersection with VMALLOC_ADDR.
  */
diff --git a/kernel/module/Kconfig b/kernel/module/Kconfig
index 0ea1b2970a23..a49460022350 100644
--- a/kernel/module/Kconfig
+++ b/kernel/module/Kconfig
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 menuconfig MODULES
 	bool "Enable loadable module support"
+	select MODULE_ALLOC
 	modules
 	help
 	  Kernel modules are small pieces of compiled code which can
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 36681911c05a..085bc6e75b3f 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -1179,16 +1179,6 @@ resolve_symbol_wait(struct module *mod,
 	return ksym;
 }
 
-void __weak module_memfree(void *module_region)
-{
-	/*
-	 * This memory may be RO, and freeing RO memory in an interrupt is not
-	 * supported by vmalloc.
-	 */
-	WARN_ON(in_interrupt());
-	vfree(module_region);
-}
-
 void __weak module_arch_cleanup(struct module *mod)
 {
 }
@@ -1610,13 +1600,6 @@ static void free_modinfo(struct module *mod)
 	}
 }
 
-void * __weak module_alloc(unsigned long size)
-{
-	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
-			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
-			NUMA_NO_NODE, __builtin_return_address(0));
-}
-
 bool __weak module_init_section(const char *name)
 {
 	return strstarts(name, ".init");
diff --git a/mm/Kconfig b/mm/Kconfig
index ffc3a2ba3a8c..92bfb5ae2e95 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1261,6 +1261,9 @@ config LOCK_MM_AND_FIND_VMA
 config IOMMU_MM_DATA
 	bool
 
+config MODULE_ALLOC
+	def_bool n
+
 source "mm/damon/Kconfig"
 
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index e4b5b75aaec9..731bd2c20ceb 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -134,3 +134,4 @@ obj-$(CONFIG_IO_MAPPING) += io-mapping.o
 obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
 obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
+obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
diff --git a/mm/module_alloc.c b/mm/module_alloc.c
new file mode 100644
index 000000000000..821af49e9a7c
--- /dev/null
+++ b/mm/module_alloc.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+
+void * __weak module_alloc(unsigned long size)
+{
+	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
+			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
+			NUMA_NO_NODE, __builtin_return_address(0));
+}
+
+void __weak module_memfree(void *module_region)
+{
+	/*
+	 * This memory may be RO, and freeing RO memory in an interrupt is not
+	 * supported by vmalloc.
+	 */
+	WARN_ON(in_interrupt());
+	vfree(module_region);
+}
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d12a17fc0c17..b7d963fe0707 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -642,7 +642,7 @@ int is_vmalloc_or_module_addr(const void *x)
 	 * and fall back on vmalloc() if that fails. Others
 	 * just put it in the vmalloc space.
 	 */
-#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
+#if defined(CONFIG_MODULE_ALLOC) && defined(MODULES_VADDR)
 	unsigned long addr = (unsigned long)kasan_reset_tag(x);
 	if (addr >= MODULES_VADDR && addr < MODULES_END)
 		return 1;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 2/4] bpf: Allow BPF_JIT with CONFIG_MODULES=n
  2024-03-06 20:05 [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Calvin Owens
  2024-03-06 20:05 ` [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available Calvin Owens
@ 2024-03-06 20:05 ` Calvin Owens
  2024-03-07 22:09   ` Christophe Leroy
  2024-03-06 20:05 ` [RFC][PATCH 3/4] kprobes: Allow kprobes " Calvin Owens
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Calvin Owens @ 2024-03-06 20:05 UTC (permalink / raw)
  To: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner
  Cc: Calvin Owens, bpf, linux-modules, linux-kernel

No BPF code has to change, except in struct_ops (for module refs).

This conflicts with bpf-next because of this (relevant) series:

    https://lore.kernel.org/all/20240119225005.668602-1-thinker.li@gmail.com/

If something like this is merged down the road, it can go through
bpf-next at leisure once the module_alloc change is in: it's a one-way
dependency.

Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
---
 kernel/bpf/Kconfig          |  2 +-
 kernel/bpf/bpf_struct_ops.c | 28 ++++++++++++++++++++++++----
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
index 6a906ff93006..77df483a8925 100644
--- a/kernel/bpf/Kconfig
+++ b/kernel/bpf/Kconfig
@@ -42,7 +42,7 @@ config BPF_JIT
 	bool "Enable BPF Just In Time compiler"
 	depends on BPF
 	depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
-	depends on MODULES
+	select MODULE_ALLOC
 	help
 	  BPF programs are normally handled by a BPF interpreter. This option
 	  allows the kernel to generate native code when a program is loaded
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 02068bd0e4d9..fbf08a1bb00c 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -108,11 +108,30 @@ const struct bpf_prog_ops bpf_struct_ops_prog_ops = {
 #endif
 };
 
+#if IS_ENABLED(CONFIG_MODULES)
 static const struct btf_type *module_type;
 
+static int bpf_struct_module_type_init(struct btf *btf)
+{
+	s32 module_id;
+
+	module_id = btf_find_by_name_kind(btf, "module", BTF_KIND_STRUCT);
+	if (module_id < 0)
+		return 1;
+
+	module_type = btf_type_by_id(btf, module_id);
+	return 0;
+}
+#else
+static int bpf_struct_module_type_init(struct btf *btf)
+{
+	return 0;
+}
+#endif
+
 void bpf_struct_ops_init(struct btf *btf, struct bpf_verifier_log *log)
 {
-	s32 type_id, value_id, module_id;
+	s32 type_id, value_id;
 	const struct btf_member *member;
 	struct bpf_struct_ops *st_ops;
 	const struct btf_type *t;
@@ -125,12 +144,10 @@ void bpf_struct_ops_init(struct btf *btf, struct bpf_verifier_log *log)
 #include "bpf_struct_ops_types.h"
 #undef BPF_STRUCT_OPS_TYPE
 
-	module_id = btf_find_by_name_kind(btf, "module", BTF_KIND_STRUCT);
-	if (module_id < 0) {
+	if (bpf_struct_module_type_init(btf)) {
 		pr_warn("Cannot find struct module in btf_vmlinux\n");
 		return;
 	}
-	module_type = btf_type_by_id(btf, module_id);
 
 	for (i = 0; i < ARRAY_SIZE(bpf_struct_ops); i++) {
 		st_ops = bpf_struct_ops[i];
@@ -433,12 +450,15 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 
 		moff = __btf_member_bit_offset(t, member) / 8;
 		ptype = btf_type_resolve_ptr(btf_vmlinux, member->type, NULL);
+
+#if IS_ENABLED(CONFIG_MODULES)
 		if (ptype == module_type) {
 			if (*(void **)(udata + moff))
 				goto reset_unlock;
 			*(void **)(kdata + moff) = BPF_MODULE_OWNER;
 			continue;
 		}
+#endif
 
 		err = st_ops->init_member(t, member, kdata, udata);
 		if (err < 0)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 3/4] kprobes: Allow kprobes with CONFIG_MODULES=n
  2024-03-06 20:05 [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Calvin Owens
  2024-03-06 20:05 ` [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available Calvin Owens
  2024-03-06 20:05 ` [RFC][PATCH 2/4] bpf: Allow BPF_JIT with CONFIG_MODULES=n Calvin Owens
@ 2024-03-06 20:05 ` Calvin Owens
  2024-03-07  7:22   ` Mike Rapoport
                     ` (2 more replies)
  2024-03-06 20:05 ` [RFC][PATCH 4/4] selftests/bpf: Support testing the !MODULES case Calvin Owens
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-06 20:05 UTC (permalink / raw)
  To: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner
  Cc: Calvin Owens, bpf, linux-modules, linux-kernel

If something like this is merged down the road, it can go in at leisure
once the module_alloc change is in: it's a one-way dependency.

Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
---
 arch/Kconfig                |  2 +-
 kernel/kprobes.c            | 22 ++++++++++++++++++++++
 kernel/trace/trace_kprobe.c | 11 +++++++++++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index cfc24ced16dd..e60ce984d095 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -52,8 +52,8 @@ config GENERIC_ENTRY
 
 config KPROBES
 	bool "Kprobes"
-	depends on MODULES
 	depends on HAVE_KPROBES
+	select MODULE_ALLOC
 	select KALLSYMS
 	select TASKS_RCU if PREEMPTION
 	help
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 9d9095e81792..194270e17d57 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1556,8 +1556,12 @@ static bool is_cfi_preamble_symbol(unsigned long addr)
 		str_has_prefix("__pfx_", symbuf);
 }
 
+#if IS_ENABLED(CONFIG_MODULES)
 static int check_kprobe_address_safe(struct kprobe *p,
 				     struct module **probed_mod)
+#else
+static int check_kprobe_address_safe(struct kprobe *p)
+#endif
 {
 	int ret;
 
@@ -1580,6 +1584,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
 		goto out;
 	}
 
+#if IS_ENABLED(CONFIG_MODULES)
 	/* Check if 'p' is probing a module. */
 	*probed_mod = __module_text_address((unsigned long) p->addr);
 	if (*probed_mod) {
@@ -1603,6 +1608,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
 			ret = -ENOENT;
 		}
 	}
+#endif
+
 out:
 	preempt_enable();
 	jump_label_unlock();
@@ -1614,7 +1621,9 @@ int register_kprobe(struct kprobe *p)
 {
 	int ret;
 	struct kprobe *old_p;
+#if IS_ENABLED(CONFIG_MODULES)
 	struct module *probed_mod;
+#endif
 	kprobe_opcode_t *addr;
 	bool on_func_entry;
 
@@ -1633,7 +1642,11 @@ int register_kprobe(struct kprobe *p)
 	p->nmissed = 0;
 	INIT_LIST_HEAD(&p->list);
 
+#if IS_ENABLED(CONFIG_MODULES)
 	ret = check_kprobe_address_safe(p, &probed_mod);
+#else
+	ret = check_kprobe_address_safe(p);
+#endif
 	if (ret)
 		return ret;
 
@@ -1676,8 +1689,10 @@ int register_kprobe(struct kprobe *p)
 out:
 	mutex_unlock(&kprobe_mutex);
 
+#if IS_ENABLED(CONFIG_MODULES)
 	if (probed_mod)
 		module_put(probed_mod);
+#endif
 
 	return ret;
 }
@@ -2482,6 +2497,7 @@ int kprobe_add_area_blacklist(unsigned long start, unsigned long end)
 	return 0;
 }
 
+#if IS_ENABLED(CONFIG_MODULES)
 /* Remove all symbols in given area from kprobe blacklist */
 static void kprobe_remove_area_blacklist(unsigned long start, unsigned long end)
 {
@@ -2499,6 +2515,7 @@ static void kprobe_remove_ksym_blacklist(unsigned long entry)
 {
 	kprobe_remove_area_blacklist(entry, entry + 1);
 }
+#endif
 
 int __weak arch_kprobe_get_kallsym(unsigned int *symnum, unsigned long *value,
 				   char *type, char *sym)
@@ -2564,6 +2581,7 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
 	return ret ? : arch_populate_kprobe_blacklist();
 }
 
+#if IS_ENABLED(CONFIG_MODULES)
 static void add_module_kprobe_blacklist(struct module *mod)
 {
 	unsigned long start, end;
@@ -2665,6 +2683,7 @@ static struct notifier_block kprobe_module_nb = {
 	.notifier_call = kprobes_module_callback,
 	.priority = 0
 };
+#endif /* IS_ENABLED(CONFIG_MODULES) */
 
 void kprobe_free_init_mem(void)
 {
@@ -2724,8 +2743,11 @@ static int __init init_kprobes(void)
 	err = arch_init_kprobes();
 	if (!err)
 		err = register_die_notifier(&kprobe_exceptions_nb);
+
+#if IS_ENABLED(CONFIG_MODULES)
 	if (!err)
 		err = register_module_notifier(&kprobe_module_nb);
+#endif
 
 	kprobes_initialized = (err == 0);
 	kprobe_sysctls_init();
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index c4c6e0e0068b..dd4598f775b9 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -102,6 +102,7 @@ static nokprobe_inline bool trace_kprobe_has_gone(struct trace_kprobe *tk)
 	return kprobe_gone(&tk->rp.kp);
 }
 
+#if IS_ENABLED(CONFIG_MODULES)
 static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
 						 struct module *mod)
 {
@@ -129,6 +130,12 @@ static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
 
 	return ret;
 }
+#else
+static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
+{
+	return true;
+}
+#endif
 
 static bool trace_kprobe_is_busy(struct dyn_event *ev)
 {
@@ -670,6 +677,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
 	return ret;
 }
 
+#if IS_ENABLED(CONFIG_MODULES)
 /* Module notifier call back, checking event on the module */
 static int trace_kprobe_module_callback(struct notifier_block *nb,
 				       unsigned long val, void *data)
@@ -704,6 +712,7 @@ static struct notifier_block trace_kprobe_module_nb = {
 	.notifier_call = trace_kprobe_module_callback,
 	.priority = 1	/* Invoked after kprobe module callback */
 };
+#endif /* IS_ENABLED(CONFIG_MODULES) */
 
 static int count_symbols(void *data, unsigned long unused)
 {
@@ -1897,8 +1906,10 @@ static __init int init_kprobe_trace_early(void)
 	if (ret)
 		return ret;
 
+#if IS_ENABLED(CONFIG_MODULES)
 	if (register_module_notifier(&trace_kprobe_module_nb))
 		return -EINVAL;
+#endif
 
 	return 0;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 4/4] selftests/bpf: Support testing the !MODULES case
  2024-03-06 20:05 [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Calvin Owens
                   ` (2 preceding siblings ...)
  2024-03-06 20:05 ` [RFC][PATCH 3/4] kprobes: Allow kprobes " Calvin Owens
@ 2024-03-06 20:05 ` Calvin Owens
  2024-03-06 21:34 ` [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Luis Chamberlain
  2024-03-25 22:46 ` Jarkko Sakkinen
  5 siblings, 0 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-06 20:05 UTC (permalink / raw)
  To: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner
  Cc: Calvin Owens, bpf, linux-modules, linux-kernel

This symlinks bpf_testmod into the main source, so it can be built-in
for running selftests in the new !MODULES case.

To be clear, no changes to the existing selftests are required: this
only exists to enable testing the new case which was not previously
possible. I'm sure somebody will be able to suggest a less ugly way I
can do this...

Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
---
 include/trace/events/bpf_testmod.h            |  1 +
 kernel/bpf/Kconfig                            |  9 ++++++
 kernel/bpf/Makefile                           |  2 ++
 kernel/bpf/bpf_testmod/Makefile               |  1 +
 kernel/bpf/bpf_testmod/bpf_testmod.c          |  1 +
 kernel/bpf/bpf_testmod/bpf_testmod.h          |  1 +
 kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h    |  1 +
 net/bpf/test_run.c                            |  2 ++
 tools/testing/selftests/bpf/Makefile          | 28 +++++++++++++------
 .../selftests/bpf/bpf_testmod/Makefile        |  2 +-
 .../bpf/bpf_testmod/bpf_testmod-events.h      |  6 ++++
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  4 +++
 .../bpf/bpf_testmod/bpf_testmod_kfunc.h       |  2 ++
 tools/testing/selftests/bpf/config            |  5 ----
 tools/testing/selftests/bpf/config.mods       |  5 ++++
 tools/testing/selftests/bpf/config.nomods     |  1 +
 .../selftests/bpf/progs/btf_type_tag_percpu.c |  2 ++
 .../selftests/bpf/progs/btf_type_tag_user.c   |  2 ++
 tools/testing/selftests/bpf/progs/core_kern.c |  2 ++
 .../selftests/bpf/progs/iters_testmod_seq.c   |  2 ++
 .../bpf/progs/test_core_reloc_module.c        |  2 ++
 .../selftests/bpf/progs/test_ldsx_insn.c      |  2 ++
 .../selftests/bpf/progs/test_module_attach.c  |  3 ++
 .../selftests/bpf/progs/tracing_struct.c      |  2 ++
 tools/testing/selftests/bpf/testing_helpers.c | 14 ++++++++++
 tools/testing/selftests/bpf/vmtest.sh         | 24 ++++++++++++++--
 26 files changed, 110 insertions(+), 16 deletions(-)
 create mode 120000 include/trace/events/bpf_testmod.h
 create mode 100644 kernel/bpf/bpf_testmod/Makefile
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.c
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.h
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h
 create mode 100644 tools/testing/selftests/bpf/config.mods
 create mode 100644 tools/testing/selftests/bpf/config.nomods

diff --git a/include/trace/events/bpf_testmod.h b/include/trace/events/bpf_testmod.h
new file mode 120000
index 000000000000..ae237a90d381
--- /dev/null
+++ b/include/trace/events/bpf_testmod.h
@@ -0,0 +1 @@
+../../../tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h
\ No newline at end of file
diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
index 77df483a8925..d5ba795182e5 100644
--- a/kernel/bpf/Kconfig
+++ b/kernel/bpf/Kconfig
@@ -100,4 +100,13 @@ config BPF_LSM
 
 	  If you are unsure how to answer this question, answer N.
 
+config BPF_TEST_MODULE
+	bool "Build the module for BPF selftests as a built-in"
+	depends on BPF_SYSCALL
+	depends on BPF_JIT
+	depends on !MODULES
+	default n
+	help
+	  This allows most of the bpf selftests to run without modules.
+
 endmenu # "BPF subsystem"
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index f526b7573e97..04b3e50ff940 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -46,3 +46,5 @@ obj-$(CONFIG_BPF_PRELOAD) += preload/
 obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 $(obj)/relo_core.o: $(srctree)/tools/lib/bpf/relo_core.c FORCE
 	$(call if_changed_rule,cc_o_c)
+
+obj-$(CONFIG_BPF_TEST_MODULE) += bpf_testmod/
diff --git a/kernel/bpf/bpf_testmod/Makefile b/kernel/bpf/bpf_testmod/Makefile
new file mode 100644
index 000000000000..55a73fd8443e
--- /dev/null
+++ b/kernel/bpf/bpf_testmod/Makefile
@@ -0,0 +1 @@
+obj-y += bpf_testmod.o
diff --git a/kernel/bpf/bpf_testmod/bpf_testmod.c b/kernel/bpf/bpf_testmod/bpf_testmod.c
new file mode 120000
index 000000000000..ca3baca5d9c4
--- /dev/null
+++ b/kernel/bpf/bpf_testmod/bpf_testmod.c
@@ -0,0 +1 @@
+../../../tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
\ No newline at end of file
diff --git a/kernel/bpf/bpf_testmod/bpf_testmod.h b/kernel/bpf/bpf_testmod/bpf_testmod.h
new file mode 120000
index 000000000000..f8d3df98b6a5
--- /dev/null
+++ b/kernel/bpf/bpf_testmod/bpf_testmod.h
@@ -0,0 +1 @@
+../../../tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
\ No newline at end of file
diff --git a/kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h b/kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h
new file mode 120000
index 000000000000..fdf42f5eaeb0
--- /dev/null
+++ b/kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h
@@ -0,0 +1 @@
+../../../tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
\ No newline at end of file
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index dfd919374017..33029c91bf92 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -573,10 +573,12 @@ __bpf_kfunc int bpf_modify_return_test2(int a, int *b, short c, int d,
 	return a + *b + c + d + (long)e + f + g;
 }
 
+#if !IS_ENABLED(CONFIG_BPF_TEST_MODULE)
 int noinline bpf_fentry_shadow_test(int a)
 {
 	return a + 1;
 }
+#endif
 
 struct prog_test_member1 {
 	int a;
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index fd15017ed3b1..12da018c9fc3 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -108,9 +108,16 @@ TEST_PROGS_EXTENDED := with_addr.sh \
 # Compile but not part of 'make run_tests'
 TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \
 	flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \
-	test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \
-	xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata \
-	xdp_features
+	test_lirc_mode2_user xdping test_cpp runqslower bench xskxceiver \
+	xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata xdp_features
+
+RUN_TESTS_WITHOUT_MODULES ?= 0
+TRUNNER_EXTRA_CFLAGS ?=
+
+ifeq ($(RUN_TESTS_WITHOUT_MODULES),0)
+TEST_GEN_PROGS_EXTENDED += bpf_testmod.ko
+TRUNNER_EXTRA_CFLAGS += -DBPF_TESTMOD_EXTERNAL
+endif
 
 TEST_GEN_FILES += liburandom_read.so urandom_read sign-file uprobe_multi
 
@@ -400,22 +407,22 @@ $(OUTPUT)/cgroup_getset_retval_hooks.o: cgroup_getset_retval_hooks.h
 # $3 - CFLAGS
 define CLANG_BPF_BUILD_RULE
 	$(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v3 -o $2
+	$(Q)$(CLANG) $3 $(TRUNNER_EXTRA_CFLAGS) -O2 --target=bpf -c $1 -mcpu=v3 -o $2
 endef
 # Similar to CLANG_BPF_BUILD_RULE, but with disabled alu32
 define CLANG_NOALU32_BPF_BUILD_RULE
 	$(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v2 -o $2
+	$(Q)$(CLANG) $3 $(TRUNNER_EXTRA_CFLAGS) -O2 --target=bpf -c $1 -mcpu=v2 -o $2
 endef
 # Similar to CLANG_BPF_BUILD_RULE, but with cpu-v4
 define CLANG_CPUV4_BPF_BUILD_RULE
 	$(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v4 -o $2
+	$(Q)$(CLANG) $3 $(TRUNNER_EXTRA_CFLAGS) -O2 --target=bpf -c $1 -mcpu=v4 -o $2
 endef
 # Build BPF object using GCC
 define GCC_BPF_BUILD_RULE
 	$(call msg,GCC-BPF,$(TRUNNER_BINARY),$2)
-	$(Q)$(BPF_GCC) $3 -O2 -c $1 -o $2
+	$(Q)$(BPF_GCC) $3 $(TRUNNER_EXTRA_CFLAGS) -O2 -c $1 -o $2
 endef
 
 SKEL_BLACKLIST := btf__% test_pinning_invalid.c test_sk_assign.c
@@ -605,7 +612,7 @@ TRUNNER_EXTRA_SOURCES := test_progs.c		\
 			 json_writer.c 		\
 			 flow_dissector_load.h	\
 			 ip_check_defrag_frags.h
-TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko	\
+TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read				\
 		       $(OUTPUT)/liburandom_read.so			\
 		       $(OUTPUT)/xdp_synproxy				\
 		       $(OUTPUT)/sign-file				\
@@ -614,6 +621,11 @@ TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko	\
 		       verify_sig_setup.sh				\
 		       $(wildcard progs/btf_dump_test_case_*.c)		\
 		       $(wildcard progs/*.bpf.o)
+
+ifeq ($(RUN_TESTS_WITHOUT_MODULES),0)
+TRUNNER_EXTRA_FILES += $(OUTPUT)/bpf_testmod.ko
+endif
+
 TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE
 TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS) -DENABLE_ATOMICS_TESTS
 $(eval $(call DEFINE_TEST_RUNNER,test_progs))
diff --git a/tools/testing/selftests/bpf/bpf_testmod/Makefile b/tools/testing/selftests/bpf/bpf_testmod/Makefile
index 15cb36c4483a..123f161339e4 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
+++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
@@ -10,7 +10,7 @@ endif
 MODULES = bpf_testmod.ko
 
 obj-m += bpf_testmod.o
-CFLAGS_bpf_testmod.o = -I$(src)
+CFLAGS_bpf_testmod.o = -I$(src) -DBPF_TESTMOD_EXTERNAL
 
 all:
 	+$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) modules
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h
index 11ee801e75e7..57a9795d814a 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /* Copyright (c) 2020 Facebook */
+
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM bpf_testmod
 
@@ -7,7 +8,10 @@
 #define _BPF_TESTMOD_EVENTS_H
 
 #include <linux/tracepoint.h>
+
+#ifdef BPF_TESTMOD_EXTERNAL
 #include "bpf_testmod.h"
+#endif
 
 TRACE_EVENT(bpf_testmod_test_read,
 	TP_PROTO(struct task_struct *task, struct bpf_testmod_test_read_ctx *ctx),
@@ -51,7 +55,9 @@ BPF_TESTMOD_DECLARE_TRACE(bpf_testmod_test_writable_bare,
 
 #endif /* _BPF_TESTMOD_EVENTS_H */
 
+#ifdef BPF_TESTMOD_EXTERNAL
 #undef TRACE_INCLUDE_PATH
 #define TRACE_INCLUDE_PATH .
 #define TRACE_INCLUDE_FILE bpf_testmod-events
+#endif
 #include <trace/define_trace.h>
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 91907b321f91..78769fe1c66b 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -12,7 +12,11 @@
 #include "bpf_testmod_kfunc.h"
 
 #define CREATE_TRACE_POINTS
+#ifdef BPF_TESTMOD_EXTERNAL
 #include "bpf_testmod-events.h"
+#else
+#include "trace/events/bpf_testmod.h"
+#endif
 
 typedef int (*func_proto_typedef)(long);
 typedef int (*func_proto_typedef_nested1)(func_proto_typedef);
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
index 7c664dd61059..fe4a67cf04cb 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h
@@ -26,6 +26,7 @@ struct prog_test_ref_kfunc {
 };
 #endif
 
+#if defined(BPF_TESTMOD_EXTERNAL) || defined(__KERNEL__)
 struct prog_test_pass1 {
 	int x0;
 	struct {
@@ -63,6 +64,7 @@ struct prog_test_fail3 {
 	char arr1[2];
 	char arr2[];
 };
+#endif
 
 struct prog_test_ref_kfunc *
 bpf_kfunc_call_test_acquire(unsigned long *scalar_ptr) __ksym;
diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index c125c441abc7..b26e79e42fb7 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -44,11 +44,6 @@ CONFIG_IPV6_TUNNEL=y
 CONFIG_KEYS=y
 CONFIG_LIRC=y
 CONFIG_LWTUNNEL=y
-CONFIG_MODULE_SIG=y
-CONFIG_MODULE_SRCVERSION_ALL=y
-CONFIG_MODULE_UNLOAD=y
-CONFIG_MODULES=y
-CONFIG_MODVERSIONS=y
 CONFIG_MPLS=y
 CONFIG_MPLS_IPTUNNEL=y
 CONFIG_MPLS_ROUTING=y
diff --git a/tools/testing/selftests/bpf/config.mods b/tools/testing/selftests/bpf/config.mods
new file mode 100644
index 000000000000..7fc4edb66b35
--- /dev/null
+++ b/tools/testing/selftests/bpf/config.mods
@@ -0,0 +1,5 @@
+CONFIG_MODULE_SIG=y
+CONFIG_MODULE_SRCVERSION_ALL=y
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULES=y
+CONFIG_MODVERSIONS=y
diff --git a/tools/testing/selftests/bpf/config.nomods b/tools/testing/selftests/bpf/config.nomods
new file mode 100644
index 000000000000..aea6afdc0a0b
--- /dev/null
+++ b/tools/testing/selftests/bpf/config.nomods
@@ -0,0 +1 @@
+CONFIG_BPF_TEST_MODULE=y
diff --git a/tools/testing/selftests/bpf/progs/btf_type_tag_percpu.c b/tools/testing/selftests/bpf/progs/btf_type_tag_percpu.c
index 38f78d9345de..b3b52934dd37 100644
--- a/tools/testing/selftests/bpf/progs/btf_type_tag_percpu.c
+++ b/tools/testing/selftests/bpf/progs/btf_type_tag_percpu.c
@@ -4,6 +4,7 @@
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 
+#ifdef BPF_TESTMOD_EXTERNAL
 struct bpf_testmod_btf_type_tag_1 {
 	int a;
 };
@@ -11,6 +12,7 @@ struct bpf_testmod_btf_type_tag_1 {
 struct bpf_testmod_btf_type_tag_2 {
 	struct bpf_testmod_btf_type_tag_1 *p;
 };
+#endif
 
 __u64 g;
 
diff --git a/tools/testing/selftests/bpf/progs/btf_type_tag_user.c b/tools/testing/selftests/bpf/progs/btf_type_tag_user.c
index 5523f77c5a44..a41cf28ef437 100644
--- a/tools/testing/selftests/bpf/progs/btf_type_tag_user.c
+++ b/tools/testing/selftests/bpf/progs/btf_type_tag_user.c
@@ -4,6 +4,7 @@
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 
+#ifdef BPF_TESTMOD_EXTERNAL
 struct bpf_testmod_btf_type_tag_1 {
 	int a;
 };
@@ -11,6 +12,7 @@ struct bpf_testmod_btf_type_tag_1 {
 struct bpf_testmod_btf_type_tag_2 {
 	struct bpf_testmod_btf_type_tag_1 *p;
 };
+#endif
 
 int g;
 
diff --git a/tools/testing/selftests/bpf/progs/core_kern.c b/tools/testing/selftests/bpf/progs/core_kern.c
index 004f2acef2eb..82deb60ef672 100644
--- a/tools/testing/selftests/bpf/progs/core_kern.c
+++ b/tools/testing/selftests/bpf/progs/core_kern.c
@@ -67,9 +67,11 @@ struct __sk_bUfF /* it will not exist in vmlinux */ {
 	int len;
 } __attribute__((preserve_access_index));
 
+#ifdef BPF_TESTMOD_EXTERNAL
 struct bpf_testmod_test_read_ctx /* it exists in bpf_testmod */ {
 	size_t len;
 } __attribute__((preserve_access_index));
+#endif
 
 SEC("tc")
 int balancer_ingress(struct __sk_buff *ctx)
diff --git a/tools/testing/selftests/bpf/progs/iters_testmod_seq.c b/tools/testing/selftests/bpf/progs/iters_testmod_seq.c
index 3873fb6c292a..39658b05ac1e 100644
--- a/tools/testing/selftests/bpf/progs/iters_testmod_seq.c
+++ b/tools/testing/selftests/bpf/progs/iters_testmod_seq.c
@@ -5,10 +5,12 @@
 #include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"
 
+#ifdef BPF_TESTMOD_EXTERNAL
 struct bpf_iter_testmod_seq {
 	u64 :64;
 	u64 :64;
 };
+#endif
 
 extern int bpf_iter_testmod_seq_new(struct bpf_iter_testmod_seq *it, s64 value, int cnt) __ksym;
 extern s64 *bpf_iter_testmod_seq_next(struct bpf_iter_testmod_seq *it) __ksym;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_module.c b/tools/testing/selftests/bpf/progs/test_core_reloc_module.c
index bcb31ff92dcc..77b2dae54dd5 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_module.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_module.c
@@ -8,12 +8,14 @@
 
 char _license[] SEC("license") = "GPL";
 
+#ifdef BPF_TESTMOD_EXTERNAL
 struct bpf_testmod_test_read_ctx {
 	/* field order is mixed up */
 	size_t len;
 	char *buf;
 	loff_t off;
 } __attribute__((preserve_access_index));
+#endif
 
 struct {
 	char in[256];
diff --git a/tools/testing/selftests/bpf/progs/test_ldsx_insn.c b/tools/testing/selftests/bpf/progs/test_ldsx_insn.c
index 2a2a942737d7..f1d7276c6629 100644
--- a/tools/testing/selftests/bpf/progs/test_ldsx_insn.c
+++ b/tools/testing/selftests/bpf/progs/test_ldsx_insn.c
@@ -48,9 +48,11 @@ int map_val_prog(const void *ctx)
 
 }
 
+#ifdef BPF_TESTMOD_EXTERNAL
 struct bpf_testmod_struct_arg_1 {
 	int a;
 };
+#endif
 
 long long int_member;
 
diff --git a/tools/testing/selftests/bpf/progs/test_module_attach.c b/tools/testing/selftests/bpf/progs/test_module_attach.c
index 8a1b50f3a002..772cff1190b1 100644
--- a/tools/testing/selftests/bpf/progs/test_module_attach.c
+++ b/tools/testing/selftests/bpf/progs/test_module_attach.c
@@ -5,7 +5,10 @@
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 #include <bpf/bpf_core_read.h>
+
+#ifdef BPF_TESTMOD_EXTERNAL
 #include "../bpf_testmod/bpf_testmod.h"
+#endif
 
 __u32 raw_tp_read_sz = 0;
 
diff --git a/tools/testing/selftests/bpf/progs/tracing_struct.c b/tools/testing/selftests/bpf/progs/tracing_struct.c
index 515daef3c84b..3b5c69858feb 100644
--- a/tools/testing/selftests/bpf/progs/tracing_struct.c
+++ b/tools/testing/selftests/bpf/progs/tracing_struct.c
@@ -5,6 +5,7 @@
 #include <bpf/bpf_tracing.h>
 #include <bpf/bpf_helpers.h>
 
+#ifdef BPF_TESTMOD_EXTERNAL
 struct bpf_testmod_struct_arg_1 {
 	int a;
 };
@@ -22,6 +23,7 @@ struct bpf_testmod_struct_arg_4 {
 	u64 a;
 	int b;
 };
+#endif
 
 long t1_a_a, t1_a_b, t1_b, t1_c, t1_ret, t1_nregs;
 __u64 t1_reg0, t1_reg1, t1_reg2, t1_reg3;
diff --git a/tools/testing/selftests/bpf/testing_helpers.c b/tools/testing/selftests/bpf/testing_helpers.c
index d2458c1b1671..331be87d74d5 100644
--- a/tools/testing/selftests/bpf/testing_helpers.c
+++ b/tools/testing/selftests/bpf/testing_helpers.c
@@ -342,6 +342,12 @@ int unload_bpf_testmod(bool verbose)
 {
 	if (kern_sync_rcu())
 		fprintf(stdout, "Failed to trigger kernel-side RCU sync!\n");
+
+	if (access("/proc/modules", F_OK)) {
+		fprintf(stdout, "Modules are disabled, fake unload success\n");
+		return 0;
+	}
+
 	if (delete_module("bpf_testmod", 0)) {
 		if (errno == ENOENT) {
 			if (verbose)
@@ -363,6 +369,14 @@ int load_bpf_testmod(bool verbose)
 	if (verbose)
 		fprintf(stdout, "Loading bpf_testmod.ko...\n");
 
+	if (access("/proc/modules", F_OK)) {
+		if (!access("/sys/kernel/debug/tracing/events/bpf_testmod", F_OK))
+			return 0;
+
+		fprintf(stdout, "Modules are disabled, testmod not built-in\n");
+		return -ENOENT;
+	}
+
 	fd = open("bpf_testmod.ko", O_RDONLY);
 	if (fd < 0) {
 		fprintf(stdout, "Can't find bpf_testmod.ko kernel module: %d\n", -errno);
diff --git a/tools/testing/selftests/bpf/vmtest.sh b/tools/testing/selftests/bpf/vmtest.sh
index 65d14f3bbe30..27e0b1241b16 100755
--- a/tools/testing/selftests/bpf/vmtest.sh
+++ b/tools/testing/selftests/bpf/vmtest.sh
@@ -44,11 +44,12 @@ NUM_COMPILE_JOBS="$(nproc)"
 LOG_FILE_BASE="$(date +"bpf_selftests.%Y-%m-%d_%H-%M-%S")"
 LOG_FILE="${LOG_FILE_BASE}.log"
 EXIT_STATUS_FILE="${LOG_FILE_BASE}.exit_status"
+MODULES="yes"
 
 usage()
 {
 	cat <<EOF
-Usage: $0 [-i] [-s] [-d <output_dir>] -- [<command>]
+Usage: $0 [-i] [-s] [-n] [-d <output_dir>] -- [<command>]
 
 <command> is the command you would normally run when you are in
 tools/testing/selftests/bpf. e.g:
@@ -76,6 +77,7 @@ Options:
 	-s)		Instead of powering off the VM, start an interactive
 			shell. If <command> is specified, the shell runs after
 			the command finishes executing
+	-n)		Run tests with CONFIG_MODULES=n
 EOF
 }
 
@@ -341,7 +343,7 @@ main()
 	local exit_command="poweroff -f"
 	local debug_shell="no"
 
-	while getopts ':hskid:j:' opt; do
+	while getopts ':hskid:j:n' opt; do
 		case ${opt} in
 		i)
 			update_image="yes"
@@ -357,6 +359,9 @@ main()
 			debug_shell="yes"
 			exit_command="bash"
 			;;
+		n)
+			MODULES="no"
+			;;
 		h)
 			usage
 			exit 0
@@ -409,12 +414,27 @@ main()
 
 	echo "Output directory: ${OUTPUT_DIR}"
 
+	if [[ "${MODULES}" == "yes" ]]; then
+		KCONFIG_REL_PATHS+=("tools/testing/selftests/bpf/config.mods")
+	else
+		make_command="${make_command} RUN_TESTS_WITHOUT_MODULES=1"
+		KCONFIG_REL_PATHS+=("tools/testing/selftests/bpf/config.nomods")
+	fi
+
 	mkdir -p "${OUTPUT_DIR}"
 	mkdir -p "${mount_dir}"
 	update_kconfig "${kernel_checkout}" "${kconfig_file}"
 
 	recompile_kernel "${kernel_checkout}" "${make_command}"
 
+	# Touch the opposite mods/nomods config we used to ensure the
+	# kernel is rebuilt when the user adds or drops the -n flag.
+	if [[ "${MODULES}" == "yes" ]]; then
+		touch -m "tools/testing/selftests/bpf/config.nomods"
+	else
+		touch -m "tools/testing/selftests/bpf/config.mods"
+	fi
+
 	if [[ "${update_image}" == "no" && ! -f "${rootfs_img}" ]]; then
 		echo "rootfs image not found in ${rootfs_img}"
 		update_image="yes"
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-06 20:05 [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Calvin Owens
                   ` (3 preceding siblings ...)
  2024-03-06 20:05 ` [RFC][PATCH 4/4] selftests/bpf: Support testing the !MODULES case Calvin Owens
@ 2024-03-06 21:34 ` Luis Chamberlain
  2024-03-06 23:23   ` Calvin Owens
  2024-03-08  2:45   ` Masami Hiramatsu
  2024-03-25 22:46 ` Jarkko Sakkinen
  5 siblings, 2 replies; 27+ messages in thread
From: Luis Chamberlain @ 2024-03-06 21:34 UTC (permalink / raw)
  To: Calvin Owens, Song Liu, Christophe Leroy, Mike Rapoport
  Cc: Andrew Morton, Alexei Starovoitov, Steven Rostedt,
	Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> Hello all,
> 
> This patchset makes it possible to use bpftrace with kprobes on kernels
> built without loadable module support.

This is a step in the right direction for another reason: clearly the
module_alloc() is not about modules, and we have special reasons for it
now beyond modules. The effort to share a generalize a huge page for
these things is also another reason for some of this but that is more
long term.

I'm all for minor changes here so to avoid regressions but it seems a
rename is in order -- if we're going to all this might as well do it
now. And for that I'd just like to ask you paint the bikeshed with
Song Liu as he's been the one slowly making way to help us get there
with the "module: replace module_layout with module_memory",
and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
the EXECMEM stuff would be what we use instead then. Mike kept the
module_alloc() and the execmem was just a wrapper but your move of the
arch stuff makes sense as well and I think would complement his series
nicely.

If you're gonna split code up to move to another place, it'd be nice
if you can add copyright headers as was done with the kernel/module.c
split into kernel/module/*.c

Can we start with some small basic stuff we can all agree on?

[0] https://lwn.net/Articles/944857/

  Luis

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-06 21:34 ` [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Luis Chamberlain
@ 2024-03-06 23:23   ` Calvin Owens
  2024-03-07  1:58     ` Song Liu
  2024-03-07  7:13     ` Mike Rapoport
  2024-03-08  2:45   ` Masami Hiramatsu
  1 sibling, 2 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-06 23:23 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Song Liu, Christophe Leroy, Mike Rapoport, Andrew Morton,
	Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > Hello all,
> > 
> > This patchset makes it possible to use bpftrace with kprobes on kernels
> > built without loadable module support.
> 
> This is a step in the right direction for another reason: clearly the
> module_alloc() is not about modules, and we have special reasons for it
> now beyond modules. The effort to share a generalize a huge page for
> these things is also another reason for some of this but that is more
> long term.
> 
> I'm all for minor changes here so to avoid regressions but it seems a
> rename is in order -- if we're going to all this might as well do it
> now. And for that I'd just like to ask you paint the bikeshed with
> Song Liu as he's been the one slowly making way to help us get there
> with the "module: replace module_layout with module_memory",
> and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> the EXECMEM stuff would be what we use instead then. Mike kept the
> module_alloc() and the execmem was just a wrapper but your move of the
> arch stuff makes sense as well and I think would complement his series
> nicely.

I apologize for missing that. I think these are the four most recent
versions of the different series referenced from that LWN link:

  a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
  b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
  c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
  d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/

Song and Mike, please correct me if I'm wrong, but I think what I've
done here (see [1], sorry for not adding you initially) is compatible
with everything both of you have recently proposed above. How do you
feel about this as a first step?

For naming, execmem_alloc() seems reasonable to me? I have no strong
feelings at all, I'll just use that going forward unless somebody else
expresses an opinion.

[1] https://lore.kernel.org/lkml/cover.1709676663.git.jcalvinowens@gmail.com/T/#m337096e158a5f771d0c7c2fb15a3b80a4443226a

> If you're gonna split code up to move to another place, it'd be nice
> if you can add copyright headers as was done with the kernel/module.c
> split into kernel/module/*.c

Silly question: should it be the same copyright header as the original
corresponding module.c, or a new one? I tried to preserve the license
header because I wasn't sure what to do about it.

Thanks,
Calvin

> Can we start with some small basic stuff we can all agree on?
> 
> [0] https://lwn.net/Articles/944857/
> 
>   Luis

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-06 23:23   ` Calvin Owens
@ 2024-03-07  1:58     ` Song Liu
  2024-03-08  2:50       ` Masami Hiramatsu
  2024-03-07  7:13     ` Mike Rapoport
  1 sibling, 1 reply; 27+ messages in thread
From: Song Liu @ 2024-03-07  1:58 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Luis Chamberlain, Christophe Leroy, Mike Rapoport, Andrew Morton,
	Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

Hi Calvin,

It is great to hear from you! :)

On Wed, Mar 6, 2024 at 3:23 PM Calvin Owens <jcalvinowens@gmail.com> wrote:
>
> On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > Hello all,
> > >
> > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > built without loadable module support.
> >
> > This is a step in the right direction for another reason: clearly the
> > module_alloc() is not about modules, and we have special reasons for it
> > now beyond modules. The effort to share a generalize a huge page for
> > these things is also another reason for some of this but that is more
> > long term.
> >
> > I'm all for minor changes here so to avoid regressions but it seems a
> > rename is in order -- if we're going to all this might as well do it
> > now. And for that I'd just like to ask you paint the bikeshed with
> > Song Liu as he's been the one slowly making way to help us get there
> > with the "module: replace module_layout with module_memory",
> > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > the EXECMEM stuff would be what we use instead then. Mike kept the
> > module_alloc() and the execmem was just a wrapper but your move of the
> > arch stuff makes sense as well and I think would complement his series
> > nicely.
>
> I apologize for missing that. I think these are the four most recent
> versions of the different series referenced from that LWN link:
>
>   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
>   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
>   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
>   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
>
> Song and Mike, please correct me if I'm wrong, but I think what I've
> done here (see [1], sorry for not adding you initially) is compatible
> with everything both of you have recently proposed above. How do you
> feel about this as a first step?

I agree that the work here is compatible with other efforts. I have no
objection to making this the first step.

>
> For naming, execmem_alloc() seems reasonable to me? I have no strong
> feelings at all, I'll just use that going forward unless somebody else
> expresses an opinion.

I am not good at naming things. No objection from me to "execmem_alloc".

Thanks,
Song

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-06 23:23   ` Calvin Owens
  2024-03-07  1:58     ` Song Liu
@ 2024-03-07  7:13     ` Mike Rapoport
  1 sibling, 0 replies; 27+ messages in thread
From: Mike Rapoport @ 2024-03-07  7:13 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Luis Chamberlain, Song Liu, Christophe Leroy, Andrew Morton,
	Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

Hi Calvin,

On Wed, Mar 06, 2024 at 03:23:22PM -0800, Calvin Owens wrote:
> On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > Hello all,
> > > 
> > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > built without loadable module support.
> > 
> > This is a step in the right direction for another reason: clearly the
> > module_alloc() is not about modules, and we have special reasons for it
> > now beyond modules. The effort to share a generalize a huge page for
> > these things is also another reason for some of this but that is more
> > long term.
> > 
> > I'm all for minor changes here so to avoid regressions but it seems a
> > rename is in order -- if we're going to all this might as well do it
> > now. And for that I'd just like to ask you paint the bikeshed with
> > Song Liu as he's been the one slowly making way to help us get there
> > with the "module: replace module_layout with module_memory",
> > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > the EXECMEM stuff would be what we use instead then. Mike kept the
> > module_alloc() and the execmem was just a wrapper but your move of the
> > arch stuff makes sense as well and I think would complement his series
> > nicely.

Actually I've dropped module_alloc() in favor of execmem_alloc() ;-)
 
> I apologize for missing that. I think these are the four most recent
> versions of the different series referenced from that LWN link:
> 
>   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
>   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
>   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
>   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
> 
> Song and Mike, please correct me if I'm wrong, but I think what I've
> done here (see [1], sorry for not adding you initially) is compatible
> with everything both of you have recently proposed above. How do you
> feel about this as a first step?

No objections from me.

> For naming, execmem_alloc() seems reasonable to me? I have no strong
> feelings at all, I'll just use that going forward unless somebody else
> expresses an opinion.

I like execmem_alloc() and CONFIG_EXECMEM.
 
-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/4] kprobes: Allow kprobes with CONFIG_MODULES=n
  2024-03-06 20:05 ` [RFC][PATCH 3/4] kprobes: Allow kprobes " Calvin Owens
@ 2024-03-07  7:22   ` Mike Rapoport
  2024-03-08  2:46     ` Masami Hiramatsu
  2024-03-08 20:36     ` Calvin Owens
  2024-03-07 22:16   ` Christophe Leroy
  2024-03-08  2:46   ` Masami Hiramatsu
  2 siblings, 2 replies; 27+ messages in thread
From: Mike Rapoport @ 2024-03-07  7:22 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner, bpf, linux-modules,
	linux-kernel

On Wed, Mar 06, 2024 at 12:05:10PM -0800, Calvin Owens wrote:
> If something like this is merged down the road, it can go in at leisure
> once the module_alloc change is in: it's a one-way dependency.
> 
> Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> ---
>  arch/Kconfig                |  2 +-
>  kernel/kprobes.c            | 22 ++++++++++++++++++++++
>  kernel/trace/trace_kprobe.c | 11 +++++++++++
>  3 files changed, 34 insertions(+), 1 deletion(-)

When I did this in my last execmem posting, I think I've got slightly less
ugly ifdery, you may want to take a look at that:

https://lore.kernel.org/all/20230918072955.2507221-13-rppt@kernel.org
 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index cfc24ced16dd..e60ce984d095 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -52,8 +52,8 @@ config GENERIC_ENTRY
>  
>  config KPROBES
>  	bool "Kprobes"
> -	depends on MODULES
>  	depends on HAVE_KPROBES
> +	select MODULE_ALLOC
>  	select KALLSYMS
>  	select TASKS_RCU if PREEMPTION
>  	help
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 9d9095e81792..194270e17d57 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1556,8 +1556,12 @@ static bool is_cfi_preamble_symbol(unsigned long addr)
>  		str_has_prefix("__pfx_", symbuf);
>  }
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  static int check_kprobe_address_safe(struct kprobe *p,
>  				     struct module **probed_mod)
> +#else
> +static int check_kprobe_address_safe(struct kprobe *p)
> +#endif
>  {
>  	int ret;
>  
> @@ -1580,6 +1584,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
>  		goto out;
>  	}
>  
> +#if IS_ENABLED(CONFIG_MODULES)

Plain #ifdef will do here and below. IS_ENABLED is for usage withing the
code, like

	if (IS_ENABLED(CONFIG_MODULES))
		;

>  	/* Check if 'p' is probing a module. */
>  	*probed_mod = __module_text_address((unsigned long) p->addr);
>  	if (*probed_mod) {

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available
  2024-03-06 20:05 ` [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available Calvin Owens
@ 2024-03-07 14:43   ` Christophe Leroy
  2024-03-08 20:53     ` Calvin Owens
  2024-03-08  2:16   ` Masami Hiramatsu
  1 sibling, 1 reply; 27+ messages in thread
From: Christophe Leroy @ 2024-03-07 14:43 UTC (permalink / raw)
  To: Calvin Owens, Luis Chamberlain, Andrew Morton,
	Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner
  Cc: bpf, linux-modules, linux-kernel

Hi Calvin,

Le 06/03/2024 à 21:05, Calvin Owens a écrit :
> [Vous ne recevez pas souvent de courriers de jcalvinowens@gmail.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> 
> Both BPF_JIT and KPROBES depend on CONFIG_MODULES, but only require
> module_alloc() itself, which can be easily separated into a standalone
> allocator for executable kernel memory.

Easily maybe, but not as easily as you think, see below.

> 
> Thomas Gleixner sent a patch to do that for x86 as part of a larger
> series a couple years ago:
> 
>      https://lore.kernel.org/all/20220716230953.442937066@linutronix.de/
> 
> I've simply extended that approach to the whole kernel.
> 
> Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> ---
>   arch/Kconfig                     |   2 +-
>   arch/arm/kernel/module.c         |  35 ---------
>   arch/arm/mm/Makefile             |   2 +
>   arch/arm/mm/module_alloc.c       |  40 ++++++++++
>   arch/arm64/kernel/module.c       | 127 ------------------------------
>   arch/arm64/mm/Makefile           |   1 +
>   arch/arm64/mm/module_alloc.c     | 130 +++++++++++++++++++++++++++++++
>   arch/loongarch/kernel/module.c   |   6 --
>   arch/loongarch/mm/Makefile       |   2 +
>   arch/loongarch/mm/module_alloc.c |  10 +++
>   arch/mips/kernel/module.c        |  10 ---
>   arch/mips/mm/Makefile            |   2 +
>   arch/mips/mm/module_alloc.c      |  13 ++++
>   arch/nios2/kernel/module.c       |  20 -----
>   arch/nios2/mm/Makefile           |   2 +
>   arch/nios2/mm/module_alloc.c     |  22 ++++++
>   arch/parisc/kernel/module.c      |  12 ---
>   arch/parisc/mm/Makefile          |   1 +
>   arch/parisc/mm/module_alloc.c    |  15 ++++
>   arch/powerpc/kernel/module.c     |  36 ---------
>   arch/powerpc/mm/Makefile         |   1 +
>   arch/powerpc/mm/module_alloc.c   |  41 ++++++++++

Missing several powerpc changes to make it work. You must audit every 
use of CONFIG_MODULES inside powerpc. Here are a few exemples:

Function get_patch_pfn() to enable text code patching.

arch/powerpc/Kconfig : 	select KASAN_VMALLOC			if KASAN && MODULES

arch/powerpc/include/asm/kasan.h:

#if defined(CONFIG_MODULES) && defined(CONFIG_PPC32)
#define KASAN_KERN_START	ALIGN_DOWN(PAGE_OFFSET - SZ_256M, SZ_256M)
#else
#define KASAN_KERN_START	PAGE_OFFSET
#endif

arch/powerpc/kernel/head_8xx.S and arch/powerpc/kernel/head_book3s_32.S: 
InstructionTLBMiss interrupt handler must know that there is executable 
kernel text outside kernel core.

Function is_module_segment() to identified segments used for module text 
and set NX (NoExec) MMU flag on non-module segments.



>   arch/riscv/kernel/module.c       |  11 ---
>   arch/riscv/mm/Makefile           |   1 +
>   arch/riscv/mm/module_alloc.c     |  17 ++++
>   arch/s390/kernel/module.c        |  37 ---------
>   arch/s390/mm/Makefile            |   1 +
>   arch/s390/mm/module_alloc.c      |  42 ++++++++++
>   arch/sparc/kernel/module.c       |  31 --------
>   arch/sparc/mm/Makefile           |   2 +
>   arch/sparc/mm/module_alloc.c     |  31 ++++++++
>   arch/x86/kernel/ftrace.c         |   2 +-
>   arch/x86/kernel/module.c         |  56 -------------
>   arch/x86/mm/Makefile             |   2 +
>   arch/x86/mm/module_alloc.c       |  59 ++++++++++++++
>   fs/proc/kcore.c                  |   2 +-
>   kernel/module/Kconfig            |   1 +
>   kernel/module/main.c             |  17 ----
>   mm/Kconfig                       |   3 +
>   mm/Makefile                      |   1 +
>   mm/module_alloc.c                |  21 +++++
>   mm/vmalloc.c                     |   2 +-
>   42 files changed, 467 insertions(+), 402 deletions(-)

...

> diff --git a/mm/Kconfig b/mm/Kconfig
> index ffc3a2ba3a8c..92bfb5ae2e95 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -1261,6 +1261,9 @@ config LOCK_MM_AND_FIND_VMA
>   config IOMMU_MM_DATA
>          bool
> 
> +config MODULE_ALLOC
> +       def_bool n
> +

I'd call it something else than CONFIG_MODULE_ALLOC as you want to use 
it when CONFIG_MODULE is not selected.

Something like CONFIG_EXECMEM_ALLOC or CONFIG_DYNAMIC_EXECMEM ?



Christophe

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 2/4] bpf: Allow BPF_JIT with CONFIG_MODULES=n
  2024-03-06 20:05 ` [RFC][PATCH 2/4] bpf: Allow BPF_JIT with CONFIG_MODULES=n Calvin Owens
@ 2024-03-07 22:09   ` Christophe Leroy
  2024-03-08 21:04     ` Calvin Owens
  0 siblings, 1 reply; 27+ messages in thread
From: Christophe Leroy @ 2024-03-07 22:09 UTC (permalink / raw)
  To: Calvin Owens, Luis Chamberlain, Andrew Morton,
	Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner
  Cc: bpf, linux-modules, linux-kernel



Le 06/03/2024 à 21:05, Calvin Owens a écrit :
> [Vous ne recevez pas souvent de courriers de jcalvinowens@gmail.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> 
> No BPF code has to change, except in struct_ops (for module refs).
> 
> This conflicts with bpf-next because of this (relevant) series:
> 
>      https://lore.kernel.org/all/20240119225005.668602-1-thinker.li@gmail.com/
> 
> If something like this is merged down the road, it can go through
> bpf-next at leisure once the module_alloc change is in: it's a one-way
> dependency.
> 
> Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> ---
>   kernel/bpf/Kconfig          |  2 +-
>   kernel/bpf/bpf_struct_ops.c | 28 ++++++++++++++++++++++++----
>   2 files changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
> index 6a906ff93006..77df483a8925 100644
> --- a/kernel/bpf/Kconfig
> +++ b/kernel/bpf/Kconfig
> @@ -42,7 +42,7 @@ config BPF_JIT
>          bool "Enable BPF Just In Time compiler"
>          depends on BPF
>          depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
> -       depends on MODULES
> +       select MODULE_ALLOC
>          help
>            BPF programs are normally handled by a BPF interpreter. This option
>            allows the kernel to generate native code when a program is loaded
> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> index 02068bd0e4d9..fbf08a1bb00c 100644
> --- a/kernel/bpf/bpf_struct_ops.c
> +++ b/kernel/bpf/bpf_struct_ops.c
> @@ -108,11 +108,30 @@ const struct bpf_prog_ops bpf_struct_ops_prog_ops = {
>   #endif
>   };
> 
> +#if IS_ENABLED(CONFIG_MODULES)

Can you avoid ifdefs as much as possible ?

>   static const struct btf_type *module_type;
> 
> +static int bpf_struct_module_type_init(struct btf *btf)
> +{
> +       s32 module_id;

Could be:

	if (!IS_ENABLED(CONFIG_MODULES))
		return 0;

> +
> +       module_id = btf_find_by_name_kind(btf, "module", BTF_KIND_STRUCT);
> +       if (module_id < 0)
> +               return 1;
> +
> +       module_type = btf_type_by_id(btf, module_id);
> +       return 0;
> +}
> +#else
> +static int bpf_struct_module_type_init(struct btf *btf)
> +{
> +       return 0;
> +}
> +#endif
> +
>   void bpf_struct_ops_init(struct btf *btf, struct bpf_verifier_log *log)
>   {
> -       s32 type_id, value_id, module_id;
> +       s32 type_id, value_id;
>          const struct btf_member *member;
>          struct bpf_struct_ops *st_ops;
>          const struct btf_type *t;
> @@ -125,12 +144,10 @@ void bpf_struct_ops_init(struct btf *btf, struct bpf_verifier_log *log)
>   #include "bpf_struct_ops_types.h"
>   #undef BPF_STRUCT_OPS_TYPE
> 
> -       module_id = btf_find_by_name_kind(btf, "module", BTF_KIND_STRUCT);
> -       if (module_id < 0) {
> +       if (bpf_struct_module_type_init(btf)) {
>                  pr_warn("Cannot find struct module in btf_vmlinux\n");
>                  return;
>          }
> -       module_type = btf_type_by_id(btf, module_id);
> 
>          for (i = 0; i < ARRAY_SIZE(bpf_struct_ops); i++) {
>                  st_ops = bpf_struct_ops[i];
> @@ -433,12 +450,15 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> 
>                  moff = __btf_member_bit_offset(t, member) / 8;
>                  ptype = btf_type_resolve_ptr(btf_vmlinux, member->type, NULL);
> +
> +#if IS_ENABLED(CONFIG_MODULES)

Can't see anything depending on CONFIG_MODULES here, can you instead do:

		if (IS_ENABLED(CONFIG_MODULES) && ptype == module_type) {

>                  if (ptype == module_type) {
>                          if (*(void **)(udata + moff))
>                                  goto reset_unlock;
>                          *(void **)(kdata + moff) = BPF_MODULE_OWNER;
>                          continue;
>                  }
> +#endif
> 
>                  err = st_ops->init_member(t, member, kdata, udata);
>                  if (err < 0)
> --
> 2.43.0
> 
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/4] kprobes: Allow kprobes with CONFIG_MODULES=n
  2024-03-06 20:05 ` [RFC][PATCH 3/4] kprobes: Allow kprobes " Calvin Owens
  2024-03-07  7:22   ` Mike Rapoport
@ 2024-03-07 22:16   ` Christophe Leroy
  2024-03-08 21:02     ` Calvin Owens
  2024-03-08  2:46   ` Masami Hiramatsu
  2 siblings, 1 reply; 27+ messages in thread
From: Christophe Leroy @ 2024-03-07 22:16 UTC (permalink / raw)
  To: Calvin Owens, Luis Chamberlain, Andrew Morton,
	Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner
  Cc: bpf, linux-modules, linux-kernel



Le 06/03/2024 à 21:05, Calvin Owens a écrit :
> [Vous ne recevez pas souvent de courriers de jcalvinowens@gmail.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> 
> If something like this is merged down the road, it can go in at leisure
> once the module_alloc change is in: it's a one-way dependency.

Too many #ifdef, please reorganise stuff to avoid that and avoid 
changing prototypes based of CONFIG_MODULES.

Other few comments below.

> 
> Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> ---
>   arch/Kconfig                |  2 +-
>   kernel/kprobes.c            | 22 ++++++++++++++++++++++
>   kernel/trace/trace_kprobe.c | 11 +++++++++++
>   3 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index cfc24ced16dd..e60ce984d095 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -52,8 +52,8 @@ config GENERIC_ENTRY
> 
>   config KPROBES
>          bool "Kprobes"
> -       depends on MODULES
>          depends on HAVE_KPROBES
> +       select MODULE_ALLOC
>          select KALLSYMS
>          select TASKS_RCU if PREEMPTION
>          help
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 9d9095e81792..194270e17d57 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1556,8 +1556,12 @@ static bool is_cfi_preamble_symbol(unsigned long addr)
>                  str_has_prefix("__pfx_", symbuf);
>   }
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>   static int check_kprobe_address_safe(struct kprobe *p,
>                                       struct module **probed_mod)
> +#else
> +static int check_kprobe_address_safe(struct kprobe *p)
> +#endif

A bit ugly to have to change the prototype, why not just keep probed_mod 
at all time ?

When CONFIG_MODULES is not selected, __module_text_address() returns 
NULL so it should work without that many #ifdefs.

>   {
>          int ret;
> 
> @@ -1580,6 +1584,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
>                  goto out;
>          }
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>          /* Check if 'p' is probing a module. */
>          *probed_mod = __module_text_address((unsigned long) p->addr);
>          if (*probed_mod) {
> @@ -1603,6 +1608,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
>                          ret = -ENOENT;
>                  }
>          }
> +#endif
> +
>   out:
>          preempt_enable();
>          jump_label_unlock();
> @@ -1614,7 +1621,9 @@ int register_kprobe(struct kprobe *p)
>   {
>          int ret;
>          struct kprobe *old_p;
> +#if IS_ENABLED(CONFIG_MODULES)
>          struct module *probed_mod;
> +#endif
>          kprobe_opcode_t *addr;
>          bool on_func_entry;
> 
> @@ -1633,7 +1642,11 @@ int register_kprobe(struct kprobe *p)
>          p->nmissed = 0;
>          INIT_LIST_HEAD(&p->list);
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>          ret = check_kprobe_address_safe(p, &probed_mod);
> +#else
> +       ret = check_kprobe_address_safe(p);
> +#endif
>          if (ret)
>                  return ret;
> 
> @@ -1676,8 +1689,10 @@ int register_kprobe(struct kprobe *p)
>   out:
>          mutex_unlock(&kprobe_mutex);
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>          if (probed_mod)
>                  module_put(probed_mod);
> +#endif
> 
>          return ret;
>   }
> @@ -2482,6 +2497,7 @@ int kprobe_add_area_blacklist(unsigned long start, unsigned long end)
>          return 0;
>   }
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>   /* Remove all symbols in given area from kprobe blacklist */
>   static void kprobe_remove_area_blacklist(unsigned long start, unsigned long end)
>   {
> @@ -2499,6 +2515,7 @@ static void kprobe_remove_ksym_blacklist(unsigned long entry)
>   {
>          kprobe_remove_area_blacklist(entry, entry + 1);
>   }
> +#endif
> 
>   int __weak arch_kprobe_get_kallsym(unsigned int *symnum, unsigned long *value,
>                                     char *type, char *sym)
> @@ -2564,6 +2581,7 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
>          return ret ? : arch_populate_kprobe_blacklist();
>   }
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>   static void add_module_kprobe_blacklist(struct module *mod)
>   {
>          unsigned long start, end;
> @@ -2665,6 +2683,7 @@ static struct notifier_block kprobe_module_nb = {
>          .notifier_call = kprobes_module_callback,
>          .priority = 0
>   };
> +#endif /* IS_ENABLED(CONFIG_MODULES) */
> 
>   void kprobe_free_init_mem(void)
>   {
> @@ -2724,8 +2743,11 @@ static int __init init_kprobes(void)
>          err = arch_init_kprobes();
>          if (!err)
>                  err = register_die_notifier(&kprobe_exceptions_nb);
> +
> +#if IS_ENABLED(CONFIG_MODULES)
>          if (!err)
>                  err = register_module_notifier(&kprobe_module_nb);
> +#endif
> 
>          kprobes_initialized = (err == 0);
>          kprobe_sysctls_init();
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index c4c6e0e0068b..dd4598f775b9 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -102,6 +102,7 @@ static nokprobe_inline bool trace_kprobe_has_gone(struct trace_kprobe *tk)
>          return kprobe_gone(&tk->rp.kp);
>   }
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>   static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
>                                                   struct module *mod)
>   {
> @@ -129,6 +130,12 @@ static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> 
>          return ret;
>   }
> +#else
> +static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> +{
> +       return true;
> +}
> +#endif
> 
>   static bool trace_kprobe_is_busy(struct dyn_event *ev)
>   {
> @@ -670,6 +677,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
>          return ret;
>   }
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>   /* Module notifier call back, checking event on the module */
>   static int trace_kprobe_module_callback(struct notifier_block *nb,
>                                         unsigned long val, void *data)
> @@ -704,6 +712,7 @@ static struct notifier_block trace_kprobe_module_nb = {
>          .notifier_call = trace_kprobe_module_callback,
>          .priority = 1   /* Invoked after kprobe module callback */
>   };
> +#endif /* IS_ENABLED(CONFIG_MODULES) */
> 
>   static int count_symbols(void *data, unsigned long unused)
>   {
> @@ -1897,8 +1906,10 @@ static __init int init_kprobe_trace_early(void)
>          if (ret)
>                  return ret;
> 
> +#if IS_ENABLED(CONFIG_MODULES)
>          if (register_module_notifier(&trace_kprobe_module_nb))
>                  return -EINVAL;
Why a #if here ?

If CONFIG_MODULES is not selected, register_module_notifier() return 
always 0.

> +#endif
> 
>          return 0;
>   }
> --
> 2.43.0
> 
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available
  2024-03-06 20:05 ` [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available Calvin Owens
  2024-03-07 14:43   ` Christophe Leroy
@ 2024-03-08  2:16   ` Masami Hiramatsu
  2024-03-08 20:43     ` Calvin Owens
  1 sibling, 1 reply; 27+ messages in thread
From: Masami Hiramatsu @ 2024-03-08  2:16 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

Hi Calvin,

On Wed,  6 Mar 2024 12:05:08 -0800
Calvin Owens <jcalvinowens@gmail.com> wrote:

> Both BPF_JIT and KPROBES depend on CONFIG_MODULES, but only require
> module_alloc() itself, which can be easily separated into a standalone
> allocator for executable kernel memory.

Thanks for your work!
As Luis pointed, it is better to use different name because this
is not only for modules and it does not depend on CONFIG_MODULES.

> 
> Thomas Gleixner sent a patch to do that for x86 as part of a larger
> series a couple years ago:
> 
>     https://lore.kernel.org/all/20220716230953.442937066@linutronix.de/
> 
> I've simply extended that approach to the whole kernel.

I would like to see a series of patches for each architecture so that
architecture maintainers carefully check and test this feature.

What about introducing CONFIG_HAVE_EXEC_ALLOC and enable it on
each architecture? Then you can start small set of major architectures
and expand it later. 

Thank you,

> 
> Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> ---
>  arch/Kconfig                     |   2 +-
>  arch/arm/kernel/module.c         |  35 ---------
>  arch/arm/mm/Makefile             |   2 +
>  arch/arm/mm/module_alloc.c       |  40 ++++++++++
>  arch/arm64/kernel/module.c       | 127 ------------------------------
>  arch/arm64/mm/Makefile           |   1 +
>  arch/arm64/mm/module_alloc.c     | 130 +++++++++++++++++++++++++++++++
>  arch/loongarch/kernel/module.c   |   6 --
>  arch/loongarch/mm/Makefile       |   2 +
>  arch/loongarch/mm/module_alloc.c |  10 +++
>  arch/mips/kernel/module.c        |  10 ---
>  arch/mips/mm/Makefile            |   2 +
>  arch/mips/mm/module_alloc.c      |  13 ++++
>  arch/nios2/kernel/module.c       |  20 -----
>  arch/nios2/mm/Makefile           |   2 +
>  arch/nios2/mm/module_alloc.c     |  22 ++++++
>  arch/parisc/kernel/module.c      |  12 ---
>  arch/parisc/mm/Makefile          |   1 +
>  arch/parisc/mm/module_alloc.c    |  15 ++++
>  arch/powerpc/kernel/module.c     |  36 ---------
>  arch/powerpc/mm/Makefile         |   1 +
>  arch/powerpc/mm/module_alloc.c   |  41 ++++++++++
>  arch/riscv/kernel/module.c       |  11 ---
>  arch/riscv/mm/Makefile           |   1 +
>  arch/riscv/mm/module_alloc.c     |  17 ++++
>  arch/s390/kernel/module.c        |  37 ---------
>  arch/s390/mm/Makefile            |   1 +
>  arch/s390/mm/module_alloc.c      |  42 ++++++++++
>  arch/sparc/kernel/module.c       |  31 --------
>  arch/sparc/mm/Makefile           |   2 +
>  arch/sparc/mm/module_alloc.c     |  31 ++++++++
>  arch/x86/kernel/ftrace.c         |   2 +-
>  arch/x86/kernel/module.c         |  56 -------------
>  arch/x86/mm/Makefile             |   2 +
>  arch/x86/mm/module_alloc.c       |  59 ++++++++++++++
>  fs/proc/kcore.c                  |   2 +-
>  kernel/module/Kconfig            |   1 +
>  kernel/module/main.c             |  17 ----
>  mm/Kconfig                       |   3 +
>  mm/Makefile                      |   1 +
>  mm/module_alloc.c                |  21 +++++
>  mm/vmalloc.c                     |   2 +-
>  42 files changed, 467 insertions(+), 402 deletions(-)
>  create mode 100644 arch/arm/mm/module_alloc.c
>  create mode 100644 arch/arm64/mm/module_alloc.c
>  create mode 100644 arch/loongarch/mm/module_alloc.c
>  create mode 100644 arch/mips/mm/module_alloc.c
>  create mode 100644 arch/nios2/mm/module_alloc.c
>  create mode 100644 arch/parisc/mm/module_alloc.c
>  create mode 100644 arch/powerpc/mm/module_alloc.c
>  create mode 100644 arch/riscv/mm/module_alloc.c
>  create mode 100644 arch/s390/mm/module_alloc.c
>  create mode 100644 arch/sparc/mm/module_alloc.c
>  create mode 100644 arch/x86/mm/module_alloc.c
>  create mode 100644 mm/module_alloc.c
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index a5af0edd3eb8..cfc24ced16dd 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1305,7 +1305,7 @@ config ARCH_HAS_STRICT_MODULE_RWX
>  
>  config STRICT_MODULE_RWX
>  	bool "Set loadable kernel module data as NX and text as RO" if ARCH_OPTIONAL_KERNEL_RWX
> -	depends on ARCH_HAS_STRICT_MODULE_RWX && MODULES
> +	depends on ARCH_HAS_STRICT_MODULE_RWX && MODULE_ALLOC
>  	default !ARCH_OPTIONAL_KERNEL_RWX || ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
>  	help
>  	  If this is set, module text and rodata memory will be made read-only,
> diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
> index e74d84f58b77..1c8798732d12 100644
> --- a/arch/arm/kernel/module.c
> +++ b/arch/arm/kernel/module.c
> @@ -4,15 +4,12 @@
>   *
>   *  Copyright (C) 2002 Russell King.
>   *  Modified for nommu by Hyok S. Choi
> - *
> - * Module allocation method suggested by Andi Kleen.
>   */
>  #include <linux/module.h>
>  #include <linux/moduleloader.h>
>  #include <linux/kernel.h>
>  #include <linux/mm.h>
>  #include <linux/elf.h>
> -#include <linux/vmalloc.h>
>  #include <linux/fs.h>
>  #include <linux/string.h>
>  #include <linux/gfp.h>
> @@ -22,38 +19,6 @@
>  #include <asm/unwind.h>
>  #include <asm/opcodes.h>
>  
> -#ifdef CONFIG_XIP_KERNEL
> -/*
> - * The XIP kernel text is mapped in the module area for modules and
> - * some other stuff to work without any indirect relocations.
> - * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
> - * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
> - */
> -#undef MODULES_VADDR
> -#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
> -#endif
> -
> -#ifdef CONFIG_MMU
> -void *module_alloc(unsigned long size)
> -{
> -	gfp_t gfp_mask = GFP_KERNEL;
> -	void *p;
> -
> -	/* Silence the initial allocation */
> -	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS))
> -		gfp_mask |= __GFP_NOWARN;
> -
> -	p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> -				gfp_mask, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> -				__builtin_return_address(0));
> -	if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)
> -		return p;
> -	return __vmalloc_node_range(size, 1,  VMALLOC_START, VMALLOC_END,
> -				GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> -				__builtin_return_address(0));
> -}
> -#endif
> -
>  bool module_init_section(const char *name)
>  {
>  	return strstarts(name, ".init") ||
> diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
> index 71b858c9b10c..a05a6701a884 100644
> --- a/arch/arm/mm/Makefile
> +++ b/arch/arm/mm/Makefile
> @@ -100,3 +100,5 @@ obj-$(CONFIG_CACHE_UNIPHIER)	+= cache-uniphier.o
>  
>  KASAN_SANITIZE_kasan_init.o	:= n
>  obj-$(CONFIG_KASAN)		+= kasan_init.o
> +
> +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> diff --git a/arch/arm/mm/module_alloc.c b/arch/arm/mm/module_alloc.c
> new file mode 100644
> index 000000000000..e48be48b2b5f
> --- /dev/null
> +++ b/arch/arm/mm/module_alloc.c
> @@ -0,0 +1,40 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +
> +#ifdef CONFIG_XIP_KERNEL
> +/*
> + * The XIP kernel text is mapped in the module area for modules and
> + * some other stuff to work without any indirect relocations.
> + * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
> + * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
> + */
> +#undef MODULES_VADDR
> +#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
> +#endif
> +
> +/*
> + * Module allocation method suggested by Andi Kleen.
> + */
> +
> +#ifdef CONFIG_MMU
> +void *module_alloc(unsigned long size)
> +{
> +	gfp_t gfp_mask = GFP_KERNEL;
> +	void *p;
> +
> +	/* Silence the initial allocation */
> +	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS))
> +		gfp_mask |= __GFP_NOWARN;
> +
> +	p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> +				gfp_mask, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> +				__builtin_return_address(0));
> +	if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)
> +		return p;
> +	return __vmalloc_node_range(size, 1,  VMALLOC_START, VMALLOC_END,
> +				GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> +				__builtin_return_address(0));
> +}
> +#endif
> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index dd851297596e..78758ed818b0 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -13,143 +13,16 @@
>  #include <linux/elf.h>
>  #include <linux/ftrace.h>
>  #include <linux/gfp.h>
> -#include <linux/kasan.h>
>  #include <linux/kernel.h>
>  #include <linux/mm.h>
>  #include <linux/moduleloader.h>
> -#include <linux/random.h>
>  #include <linux/scs.h>
> -#include <linux/vmalloc.h>
>  
>  #include <asm/alternative.h>
>  #include <asm/insn.h>
>  #include <asm/scs.h>
>  #include <asm/sections.h>
>  
> -static u64 module_direct_base __ro_after_init = 0;
> -static u64 module_plt_base __ro_after_init = 0;
> -
> -/*
> - * Choose a random page-aligned base address for a window of 'size' bytes which
> - * entirely contains the interval [start, end - 1].
> - */
> -static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
> -{
> -	u64 max_pgoff, pgoff;
> -
> -	if ((end - start) >= size)
> -		return 0;
> -
> -	max_pgoff = (size - (end - start)) / PAGE_SIZE;
> -	pgoff = get_random_u32_inclusive(0, max_pgoff);
> -
> -	return start - pgoff * PAGE_SIZE;
> -}
> -
> -/*
> - * Modules may directly reference data and text anywhere within the kernel
> - * image and other modules. References using PREL32 relocations have a +/-2G
> - * range, and so we need to ensure that the entire kernel image and all modules
> - * fall within a 2G window such that these are always within range.
> - *
> - * Modules may directly branch to functions and code within the kernel text,
> - * and to functions and code within other modules. These branches will use
> - * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
> - * that the entire kernel text and all module text falls within a 128M window
> - * such that these are always within range. With PLTs, we can expand this to a
> - * 2G window.
> - *
> - * We chose the 128M region to surround the entire kernel image (rather than
> - * just the text) as using the same bounds for the 128M and 2G regions ensures
> - * by construction that we never select a 128M region that is not a subset of
> - * the 2G region. For very large and unusual kernel configurations this means
> - * we may fall back to PLTs where they could have been avoided, but this keeps
> - * the logic significantly simpler.
> - */
> -static int __init module_init_limits(void)
> -{
> -	u64 kernel_end = (u64)_end;
> -	u64 kernel_start = (u64)_text;
> -	u64 kernel_size = kernel_end - kernel_start;
> -
> -	/*
> -	 * The default modules region is placed immediately below the kernel
> -	 * image, and is large enough to use the full 2G relocation range.
> -	 */
> -	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
> -	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
> -
> -	if (!kaslr_enabled()) {
> -		if (kernel_size < SZ_128M)
> -			module_direct_base = kernel_end - SZ_128M;
> -		if (kernel_size < SZ_2G)
> -			module_plt_base = kernel_end - SZ_2G;
> -	} else {
> -		u64 min = kernel_start;
> -		u64 max = kernel_end;
> -
> -		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
> -			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
> -		} else {
> -			module_direct_base = random_bounding_box(SZ_128M, min, max);
> -			if (module_direct_base) {
> -				min = module_direct_base;
> -				max = module_direct_base + SZ_128M;
> -			}
> -		}
> -
> -		module_plt_base = random_bounding_box(SZ_2G, min, max);
> -	}
> -
> -	pr_info("%llu pages in range for non-PLT usage",
> -		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
> -	pr_info("%llu pages in range for PLT usage",
> -		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
> -
> -	return 0;
> -}
> -subsys_initcall(module_init_limits);
> -
> -void *module_alloc(unsigned long size)
> -{
> -	void *p = NULL;
> -
> -	/*
> -	 * Where possible, prefer to allocate within direct branch range of the
> -	 * kernel such that no PLTs are necessary.
> -	 */
> -	if (module_direct_base) {
> -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> -					 module_direct_base,
> -					 module_direct_base + SZ_128M,
> -					 GFP_KERNEL | __GFP_NOWARN,
> -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> -					 __builtin_return_address(0));
> -	}
> -
> -	if (!p && module_plt_base) {
> -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> -					 module_plt_base,
> -					 module_plt_base + SZ_2G,
> -					 GFP_KERNEL | __GFP_NOWARN,
> -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> -					 __builtin_return_address(0));
> -	}
> -
> -	if (!p) {
> -		pr_warn_ratelimited("%s: unable to allocate memory\n",
> -				    __func__);
> -	}
> -
> -	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
> -		vfree(p);
> -		return NULL;
> -	}
> -
> -	/* Memory is intended to be executable, reset the pointer tag. */
> -	return kasan_reset_tag(p);
> -}
> -
>  enum aarch64_reloc_op {
>  	RELOC_OP_NONE,
>  	RELOC_OP_ABS,
> diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> index dbd1bc95967d..cf616635a80d 100644
> --- a/arch/arm64/mm/Makefile
> +++ b/arch/arm64/mm/Makefile
> @@ -10,6 +10,7 @@ obj-$(CONFIG_TRANS_TABLE)	+= trans_pgd.o
>  obj-$(CONFIG_TRANS_TABLE)	+= trans_pgd-asm.o
>  obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
>  obj-$(CONFIG_ARM64_MTE)		+= mteswap.o
> +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
>  KASAN_SANITIZE_physaddr.o	+= n
>  
>  obj-$(CONFIG_KASAN)		+= kasan_init.o
> diff --git a/arch/arm64/mm/module_alloc.c b/arch/arm64/mm/module_alloc.c
> new file mode 100644
> index 000000000000..302642ea9e26
> --- /dev/null
> +++ b/arch/arm64/mm/module_alloc.c
> @@ -0,0 +1,130 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +#include <linux/kasan.h>
> +#include <linux/random.h>
> +
> +static u64 module_direct_base __ro_after_init = 0;
> +static u64 module_plt_base __ro_after_init = 0;
> +
> +/*
> + * Choose a random page-aligned base address for a window of 'size' bytes which
> + * entirely contains the interval [start, end - 1].
> + */
> +static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
> +{
> +	u64 max_pgoff, pgoff;
> +
> +	if ((end - start) >= size)
> +		return 0;
> +
> +	max_pgoff = (size - (end - start)) / PAGE_SIZE;
> +	pgoff = get_random_u32_inclusive(0, max_pgoff);
> +
> +	return start - pgoff * PAGE_SIZE;
> +}
> +
> +/*
> + * Modules may directly reference data and text anywhere within the kernel
> + * image and other modules. References using PREL32 relocations have a +/-2G
> + * range, and so we need to ensure that the entire kernel image and all modules
> + * fall within a 2G window such that these are always within range.
> + *
> + * Modules may directly branch to functions and code within the kernel text,
> + * and to functions and code within other modules. These branches will use
> + * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
> + * that the entire kernel text and all module text falls within a 128M window
> + * such that these are always within range. With PLTs, we can expand this to a
> + * 2G window.
> + *
> + * We chose the 128M region to surround the entire kernel image (rather than
> + * just the text) as using the same bounds for the 128M and 2G regions ensures
> + * by construction that we never select a 128M region that is not a subset of
> + * the 2G region. For very large and unusual kernel configurations this means
> + * we may fall back to PLTs where they could have been avoided, but this keeps
> + * the logic significantly simpler.
> + */
> +static int __init module_init_limits(void)
> +{
> +	u64 kernel_end = (u64)_end;
> +	u64 kernel_start = (u64)_text;
> +	u64 kernel_size = kernel_end - kernel_start;
> +
> +	/*
> +	 * The default modules region is placed immediately below the kernel
> +	 * image, and is large enough to use the full 2G relocation range.
> +	 */
> +	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
> +	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
> +
> +	if (!kaslr_enabled()) {
> +		if (kernel_size < SZ_128M)
> +			module_direct_base = kernel_end - SZ_128M;
> +		if (kernel_size < SZ_2G)
> +			module_plt_base = kernel_end - SZ_2G;
> +	} else {
> +		u64 min = kernel_start;
> +		u64 max = kernel_end;
> +
> +		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
> +			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
> +		} else {
> +			module_direct_base = random_bounding_box(SZ_128M, min, max);
> +			if (module_direct_base) {
> +				min = module_direct_base;
> +				max = module_direct_base + SZ_128M;
> +			}
> +		}
> +
> +		module_plt_base = random_bounding_box(SZ_2G, min, max);
> +	}
> +
> +	pr_info("%llu pages in range for non-PLT usage",
> +		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
> +	pr_info("%llu pages in range for PLT usage",
> +		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
> +
> +	return 0;
> +}
> +subsys_initcall(module_init_limits);
> +
> +void *module_alloc(unsigned long size)
> +{
> +	void *p = NULL;
> +
> +	/*
> +	 * Where possible, prefer to allocate within direct branch range of the
> +	 * kernel such that no PLTs are necessary.
> +	 */
> +	if (module_direct_base) {
> +		p = __vmalloc_node_range(size, MODULE_ALIGN,
> +					 module_direct_base,
> +					 module_direct_base + SZ_128M,
> +					 GFP_KERNEL | __GFP_NOWARN,
> +					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> +					 __builtin_return_address(0));
> +	}
> +
> +	if (!p && module_plt_base) {
> +		p = __vmalloc_node_range(size, MODULE_ALIGN,
> +					 module_plt_base,
> +					 module_plt_base + SZ_2G,
> +					 GFP_KERNEL | __GFP_NOWARN,
> +					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> +					 __builtin_return_address(0));
> +	}
> +
> +	if (!p) {
> +		pr_warn_ratelimited("%s: unable to allocate memory\n",
> +				    __func__);
> +	}
> +
> +	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
> +		vfree(p);
> +		return NULL;
> +	}
> +
> +	/* Memory is intended to be executable, reset the pointer tag. */
> +	return kasan_reset_tag(p);
> +}
> diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
> index b13b2858fe39..7f03166513b3 100644
> --- a/arch/loongarch/kernel/module.c
> +++ b/arch/loongarch/kernel/module.c
> @@ -489,12 +489,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>  	return 0;
>  }
>  
> -void *module_alloc(unsigned long size)
> -{
> -	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> -			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
> -}
> -
>  static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
>  				   const Elf_Shdr *sechdrs, struct module *mod)
>  {
> diff --git a/arch/loongarch/mm/Makefile b/arch/loongarch/mm/Makefile
> index e4d1e581dbae..3966fc6118f1 100644
> --- a/arch/loongarch/mm/Makefile
> +++ b/arch/loongarch/mm/Makefile
> @@ -10,3 +10,5 @@ obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
>  obj-$(CONFIG_KASAN)		+= kasan_init.o
>  
>  KASAN_SANITIZE_kasan_init.o     := n
> +
> +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> diff --git a/arch/loongarch/mm/module_alloc.c b/arch/loongarch/mm/module_alloc.c
> new file mode 100644
> index 000000000000..24b0cb3a2088
> --- /dev/null
> +++ b/arch/loongarch/mm/module_alloc.c
> @@ -0,0 +1,10 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +
> +void *module_alloc(unsigned long size)
> +{
> +	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> +			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
> +}
> diff --git a/arch/mips/kernel/module.c b/arch/mips/kernel/module.c
> index 7b2fbaa9cac5..ba0f62d8eff5 100644
> --- a/arch/mips/kernel/module.c
> +++ b/arch/mips/kernel/module.c
> @@ -13,7 +13,6 @@
>  #include <linux/elf.h>
>  #include <linux/mm.h>
>  #include <linux/numa.h>
> -#include <linux/vmalloc.h>
>  #include <linux/slab.h>
>  #include <linux/fs.h>
>  #include <linux/string.h>
> @@ -31,15 +30,6 @@ struct mips_hi16 {
>  static LIST_HEAD(dbe_list);
>  static DEFINE_SPINLOCK(dbe_lock);
>  
> -#ifdef MODULE_START
> -void *module_alloc(unsigned long size)
> -{
> -	return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
> -				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> -				__builtin_return_address(0));
> -}
> -#endif
> -
>  static void apply_r_mips_32(u32 *location, u32 base, Elf_Addr v)
>  {
>  	*location = base + v;
> diff --git a/arch/mips/mm/Makefile b/arch/mips/mm/Makefile
> index 304692391519..b9cfe37e41e4 100644
> --- a/arch/mips/mm/Makefile
> +++ b/arch/mips/mm/Makefile
> @@ -45,3 +45,5 @@ obj-$(CONFIG_MIPS_CPU_SCACHE)	+= sc-mips.o
>  obj-$(CONFIG_SCACHE_DEBUGFS)	+= sc-debugfs.o
>  
>  obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
> +
> +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> diff --git a/arch/mips/mm/module_alloc.c b/arch/mips/mm/module_alloc.c
> new file mode 100644
> index 000000000000..fcdbdece42f3
> --- /dev/null
> +++ b/arch/mips/mm/module_alloc.c
> @@ -0,0 +1,13 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +
> +#ifdef MODULE_START
> +void *module_alloc(unsigned long size)
> +{
> +	return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
> +				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> +				__builtin_return_address(0));
> +}
> +#endif
> diff --git a/arch/nios2/kernel/module.c b/arch/nios2/kernel/module.c
> index 76e0a42d6e36..f4483243578d 100644
> --- a/arch/nios2/kernel/module.c
> +++ b/arch/nios2/kernel/module.c
> @@ -13,7 +13,6 @@
>  #include <linux/moduleloader.h>
>  #include <linux/elf.h>
>  #include <linux/mm.h>
> -#include <linux/vmalloc.h>
>  #include <linux/slab.h>
>  #include <linux/fs.h>
>  #include <linux/string.h>
> @@ -21,25 +20,6 @@
>  
>  #include <asm/cacheflush.h>
>  
> -/*
> - * Modules should NOT be allocated with kmalloc for (obvious) reasons.
> - * But we do it for now to avoid relocation issues. CALL26/PCREL26 cannot reach
> - * from 0x80000000 (vmalloc area) to 0xc00000000 (kernel) (kmalloc returns
> - * addresses in 0xc0000000)
> - */
> -void *module_alloc(unsigned long size)
> -{
> -	if (size == 0)
> -		return NULL;
> -	return kmalloc(size, GFP_KERNEL);
> -}
> -
> -/* Free memory returned from module_alloc */
> -void module_memfree(void *module_region)
> -{
> -	kfree(module_region);
> -}
> -
>  int apply_relocate_add(Elf32_Shdr *sechdrs, const char *strtab,
>  			unsigned int symindex, unsigned int relsec,
>  			struct module *mod)
> diff --git a/arch/nios2/mm/Makefile b/arch/nios2/mm/Makefile
> index 9d37fafd1dd1..facbb3e60013 100644
> --- a/arch/nios2/mm/Makefile
> +++ b/arch/nios2/mm/Makefile
> @@ -13,3 +13,5 @@ obj-y	+= mmu_context.o
>  obj-y	+= pgtable.o
>  obj-y	+= tlb.o
>  obj-y	+= uaccess.o
> +
> +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> diff --git a/arch/nios2/mm/module_alloc.c b/arch/nios2/mm/module_alloc.c
> new file mode 100644
> index 000000000000..92c7c32ef8b3
> --- /dev/null
> +++ b/arch/nios2/mm/module_alloc.c
> @@ -0,0 +1,22 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/moduleloader.h>
> +#include <linux/slab.h>
> +
> +/*
> + * Modules should NOT be allocated with kmalloc for (obvious) reasons.
> + * But we do it for now to avoid relocation issues. CALL26/PCREL26 cannot reach
> + * from 0x80000000 (vmalloc area) to 0xc00000000 (kernel) (kmalloc returns
> + * addresses in 0xc0000000)
> + */
> +void *module_alloc(unsigned long size)
> +{
> +	if (size == 0)
> +		return NULL;
> +	return kmalloc(size, GFP_KERNEL);
> +}
> +
> +/* Free memory returned from module_alloc */
> +void module_memfree(void *module_region)
> +{
> +	kfree(module_region);
> +}
> diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
> index d214bbe3c2af..4e5d991b2b65 100644
> --- a/arch/parisc/kernel/module.c
> +++ b/arch/parisc/kernel/module.c
> @@ -41,7 +41,6 @@
>  
>  #include <linux/moduleloader.h>
>  #include <linux/elf.h>
> -#include <linux/vmalloc.h>
>  #include <linux/fs.h>
>  #include <linux/ftrace.h>
>  #include <linux/string.h>
> @@ -173,17 +172,6 @@ static inline int reassemble_22(int as22)
>  		((as22 & 0x0003ff) << 3));
>  }
>  
> -void *module_alloc(unsigned long size)
> -{
> -	/* using RWX means less protection for modules, but it's
> -	 * easier than trying to map the text, data, init_text and
> -	 * init_data correctly */
> -	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> -				    GFP_KERNEL,
> -				    PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
> -				    __builtin_return_address(0));
> -}
> -
>  #ifndef CONFIG_64BIT
>  static inline unsigned long count_gots(const Elf_Rela *rela, unsigned long n)
>  {
> diff --git a/arch/parisc/mm/Makefile b/arch/parisc/mm/Makefile
> index ffdb5c0a8cc6..95a6d4469785 100644
> --- a/arch/parisc/mm/Makefile
> +++ b/arch/parisc/mm/Makefile
> @@ -5,3 +5,4 @@
>  
>  obj-y	 := init.o fault.o ioremap.o fixmap.o
>  obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
> +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
> diff --git a/arch/parisc/mm/module_alloc.c b/arch/parisc/mm/module_alloc.c
> new file mode 100644
> index 000000000000..5ad9bfc3ffab
> --- /dev/null
> +++ b/arch/parisc/mm/module_alloc.c
> @@ -0,0 +1,15 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +
> +void *module_alloc(unsigned long size)
> +{
> +	/* using RWX means less protection for modules, but it's
> +	 * easier than trying to map the text, data, init_text and
> +	 * init_data correctly */
> +	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> +				    GFP_KERNEL,
> +				    PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
> +				    __builtin_return_address(0));
> +}
> diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
> index f6d6ae0a1692..b5fe9c61e527 100644
> --- a/arch/powerpc/kernel/module.c
> +++ b/arch/powerpc/kernel/module.c
> @@ -89,39 +89,3 @@ int module_finalize(const Elf_Ehdr *hdr,
>  	return 0;
>  }
>  
> -static __always_inline void *
> -__module_alloc(unsigned long size, unsigned long start, unsigned long end, bool nowarn)
> -{
> -	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
> -	gfp_t gfp = GFP_KERNEL | (nowarn ? __GFP_NOWARN : 0);
> -
> -	/*
> -	 * Don't do huge page allocations for modules yet until more testing
> -	 * is done. STRICT_MODULE_RWX may require extra work to support this
> -	 * too.
> -	 */
> -	return __vmalloc_node_range(size, 1, start, end, gfp, prot,
> -				    VM_FLUSH_RESET_PERMS,
> -				    NUMA_NO_NODE, __builtin_return_address(0));
> -}
> -
> -void *module_alloc(unsigned long size)
> -{
> -#ifdef MODULES_VADDR
> -	unsigned long limit = (unsigned long)_etext - SZ_32M;
> -	void *ptr = NULL;
> -
> -	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
> -
> -	/* First try within 32M limit from _etext to avoid branch trampolines */
> -	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
> -		ptr = __module_alloc(size, limit, MODULES_END, true);
> -
> -	if (!ptr)
> -		ptr = __module_alloc(size, MODULES_VADDR, MODULES_END, false);
> -
> -	return ptr;
> -#else
> -	return __module_alloc(size, VMALLOC_START, VMALLOC_END, false);
> -#endif
> -}
> diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
> index 503a6e249940..4572273a838f 100644
> --- a/arch/powerpc/mm/Makefile
> +++ b/arch/powerpc/mm/Makefile
> @@ -19,3 +19,4 @@ obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
>  obj-$(CONFIG_PPC_COPRO_BASE)	+= copro_fault.o
>  obj-$(CONFIG_PTDUMP_CORE)	+= ptdump/
>  obj-$(CONFIG_KASAN)		+= kasan/
> +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> diff --git a/arch/powerpc/mm/module_alloc.c b/arch/powerpc/mm/module_alloc.c
> new file mode 100644
> index 000000000000..818e5cd8fbc6
> --- /dev/null
> +++ b/arch/powerpc/mm/module_alloc.c
> @@ -0,0 +1,41 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +
> +static __always_inline void *
> +__module_alloc(unsigned long size, unsigned long start, unsigned long end, bool nowarn)
> +{
> +	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
> +	gfp_t gfp = GFP_KERNEL | (nowarn ? __GFP_NOWARN : 0);
> +
> +	/*
> +	 * Don't do huge page allocations for modules yet until more testing
> +	 * is done. STRICT_MODULE_RWX may require extra work to support this
> +	 * too.
> +	 */
> +	return __vmalloc_node_range(size, 1, start, end, gfp, prot,
> +				    VM_FLUSH_RESET_PERMS,
> +				    NUMA_NO_NODE, __builtin_return_address(0));
> +}
> +
> +void *module_alloc(unsigned long size)
> +{
> +#ifdef MODULES_VADDR
> +	unsigned long limit = (unsigned long)_etext - SZ_32M;
> +	void *ptr = NULL;
> +
> +	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
> +
> +	/* First try within 32M limit from _etext to avoid branch trampolines */
> +	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
> +		ptr = __module_alloc(size, limit, MODULES_END, true);
> +
> +	if (!ptr)
> +		ptr = __module_alloc(size, MODULES_VADDR, MODULES_END, false);
> +
> +	return ptr;
> +#else
> +	return __module_alloc(size, VMALLOC_START, VMALLOC_END, false);
> +#endif
> +}
> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> index 5e5a82644451..53d7005fdbdb 100644
> --- a/arch/riscv/kernel/module.c
> +++ b/arch/riscv/kernel/module.c
> @@ -11,7 +11,6 @@
>  #include <linux/kernel.h>
>  #include <linux/log2.h>
>  #include <linux/moduleloader.h>
> -#include <linux/vmalloc.h>
>  #include <linux/sizes.h>
>  #include <linux/pgtable.h>
>  #include <asm/alternative.h>
> @@ -905,16 +904,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>  	return 0;
>  }
>  
> -#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> -void *module_alloc(unsigned long size)
> -{
> -	return __vmalloc_node_range(size, 1, MODULES_VADDR,
> -				    MODULES_END, GFP_KERNEL,
> -				    PAGE_KERNEL, VM_FLUSH_RESET_PERMS,
> -				    NUMA_NO_NODE,
> -				    __builtin_return_address(0));
> -}
> -#endif
>  
>  int module_finalize(const Elf_Ehdr *hdr,
>  		    const Elf_Shdr *sechdrs,
> diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> index 2c869f8026a8..fba8e3595459 100644
> --- a/arch/riscv/mm/Makefile
> +++ b/arch/riscv/mm/Makefile
> @@ -36,3 +36,4 @@ endif
>  obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
>  obj-$(CONFIG_RISCV_DMA_NONCOHERENT) += dma-noncoherent.o
>  obj-$(CONFIG_RISCV_NONSTANDARD_CACHE_OPS) += cache-ops.o
> +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
> diff --git a/arch/riscv/mm/module_alloc.c b/arch/riscv/mm/module_alloc.c
> new file mode 100644
> index 000000000000..2c1fb95a57e2
> --- /dev/null
> +++ b/arch/riscv/mm/module_alloc.c
> @@ -0,0 +1,17 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/pgtable.h>
> +#include <asm/alternative.h>
> +#include <asm/sections.h>
> +
> +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> +void *module_alloc(unsigned long size)
> +{
> +	return __vmalloc_node_range(size, 1, MODULES_VADDR,
> +				    MODULES_END, GFP_KERNEL,
> +				    PAGE_KERNEL, VM_FLUSH_RESET_PERMS,
> +				    NUMA_NO_NODE,
> +				    __builtin_return_address(0));
> +}
> +#endif
> diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
> index 42215f9404af..ef8a7539bb0b 100644
> --- a/arch/s390/kernel/module.c
> +++ b/arch/s390/kernel/module.c
> @@ -36,43 +36,6 @@
>  
>  #define PLT_ENTRY_SIZE 22
>  
> -static unsigned long get_module_load_offset(void)
> -{
> -	static DEFINE_MUTEX(module_kaslr_mutex);
> -	static unsigned long module_load_offset;
> -
> -	if (!kaslr_enabled())
> -		return 0;
> -	/*
> -	 * Calculate the module_load_offset the first time this code
> -	 * is called. Once calculated it stays the same until reboot.
> -	 */
> -	mutex_lock(&module_kaslr_mutex);
> -	if (!module_load_offset)
> -		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> -	mutex_unlock(&module_kaslr_mutex);
> -	return module_load_offset;
> -}
> -
> -void *module_alloc(unsigned long size)
> -{
> -	gfp_t gfp_mask = GFP_KERNEL;
> -	void *p;
> -
> -	if (PAGE_ALIGN(size) > MODULES_LEN)
> -		return NULL;
> -	p = __vmalloc_node_range(size, MODULE_ALIGN,
> -				 MODULES_VADDR + get_module_load_offset(),
> -				 MODULES_END, gfp_mask, PAGE_KERNEL,
> -				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> -				 NUMA_NO_NODE, __builtin_return_address(0));
> -	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> -		vfree(p);
> -		return NULL;
> -	}
> -	return p;
> -}
> -
>  #ifdef CONFIG_FUNCTION_TRACER
>  void module_arch_cleanup(struct module *mod)
>  {
> diff --git a/arch/s390/mm/Makefile b/arch/s390/mm/Makefile
> index 352ff520fd94..4f44c4096c6d 100644
> --- a/arch/s390/mm/Makefile
> +++ b/arch/s390/mm/Makefile
> @@ -11,3 +11,4 @@ obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
>  obj-$(CONFIG_PTDUMP_CORE)	+= dump_pagetables.o
>  obj-$(CONFIG_PGSTE)		+= gmap.o
>  obj-$(CONFIG_PFAULT)		+= pfault.o
> +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> diff --git a/arch/s390/mm/module_alloc.c b/arch/s390/mm/module_alloc.c
> new file mode 100644
> index 000000000000..88eadce4bc68
> --- /dev/null
> +++ b/arch/s390/mm/module_alloc.c
> @@ -0,0 +1,42 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +#include <linux/kasan.h>
> +
> +static unsigned long get_module_load_offset(void)
> +{
> +	static DEFINE_MUTEX(module_kaslr_mutex);
> +	static unsigned long module_load_offset;
> +
> +	if (!kaslr_enabled())
> +		return 0;
> +	/*
> +	 * Calculate the module_load_offset the first time this code
> +	 * is called. Once calculated it stays the same until reboot.
> +	 */
> +	mutex_lock(&module_kaslr_mutex);
> +	if (!module_load_offset)
> +		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> +	mutex_unlock(&module_kaslr_mutex);
> +	return module_load_offset;
> +}
> +
> +void *module_alloc(unsigned long size)
> +{
> +	gfp_t gfp_mask = GFP_KERNEL;
> +	void *p;
> +
> +	if (PAGE_ALIGN(size) > MODULES_LEN)
> +		return NULL;
> +	p = __vmalloc_node_range(size, MODULE_ALIGN,
> +				 MODULES_VADDR + get_module_load_offset(),
> +				 MODULES_END, gfp_mask, PAGE_KERNEL,
> +				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> +				 NUMA_NO_NODE, __builtin_return_address(0));
> +	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> +		vfree(p);
> +		return NULL;
> +	}
> +	return p;
> +}
> diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
> index 66c45a2764bc..0611a41cd586 100644
> --- a/arch/sparc/kernel/module.c
> +++ b/arch/sparc/kernel/module.c
> @@ -8,7 +8,6 @@
>  #include <linux/moduleloader.h>
>  #include <linux/kernel.h>
>  #include <linux/elf.h>
> -#include <linux/vmalloc.h>
>  #include <linux/fs.h>
>  #include <linux/gfp.h>
>  #include <linux/string.h>
> @@ -21,36 +20,6 @@
>  
>  #include "entry.h"
>  
> -#ifdef CONFIG_SPARC64
> -
> -#include <linux/jump_label.h>
> -
> -static void *module_map(unsigned long size)
> -{
> -	if (PAGE_ALIGN(size) > MODULES_LEN)
> -		return NULL;
> -	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> -				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> -				__builtin_return_address(0));
> -}
> -#else
> -static void *module_map(unsigned long size)
> -{
> -	return vmalloc(size);
> -}
> -#endif /* CONFIG_SPARC64 */
> -
> -void *module_alloc(unsigned long size)
> -{
> -	void *ret;
> -
> -	ret = module_map(size);
> -	if (ret)
> -		memset(ret, 0, size);
> -
> -	return ret;
> -}
> -
>  /* Make generic code ignore STT_REGISTER dummy undefined symbols.  */
>  int module_frob_arch_sections(Elf_Ehdr *hdr,
>  			      Elf_Shdr *sechdrs,
> diff --git a/arch/sparc/mm/Makefile b/arch/sparc/mm/Makefile
> index 809d993f6d88..a8e9ba46679a 100644
> --- a/arch/sparc/mm/Makefile
> +++ b/arch/sparc/mm/Makefile
> @@ -14,3 +14,5 @@ obj-$(CONFIG_SPARC32)   += leon_mm.o
>  
>  # Only used by sparc64
>  obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
> +
> +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
> diff --git a/arch/sparc/mm/module_alloc.c b/arch/sparc/mm/module_alloc.c
> new file mode 100644
> index 000000000000..14aef0f75650
> --- /dev/null
> +++ b/arch/sparc/mm/module_alloc.c
> @@ -0,0 +1,31 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +
> +#ifdef CONFIG_SPARC64
> +static void *module_map(unsigned long size)
> +{
> +	if (PAGE_ALIGN(size) > MODULES_LEN)
> +		return NULL;
> +	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> +				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> +				__builtin_return_address(0));
> +}
> +#else
> +static void *module_map(unsigned long size)
> +{
> +	return vmalloc(size);
> +}
> +#endif /* CONFIG_SPARC64 */
> +
> +void *module_alloc(unsigned long size)
> +{
> +	void *ret;
> +
> +	ret = module_map(size);
> +	if (ret)
> +		memset(ret, 0, size);
> +
> +	return ret;
> +}
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 12df54ff0e81..99f242e11f88 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -260,7 +260,7 @@ void arch_ftrace_update_code(int command)
>  /* Currently only x86_64 supports dynamic trampolines */
>  #ifdef CONFIG_X86_64
>  
> -#ifdef CONFIG_MODULES
> +#if IS_ENABLED(CONFIG_MODULE_ALLOC)
>  #include <linux/moduleloader.h>
>  /* Module allocation simplifies allocating memory for code */
>  static inline void *alloc_tramp(unsigned long size)
> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
> index e18914c0e38a..ad7e3968ee8f 100644
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -8,21 +8,14 @@
>  
>  #include <linux/moduleloader.h>
>  #include <linux/elf.h>
> -#include <linux/vmalloc.h>
>  #include <linux/fs.h>
>  #include <linux/string.h>
>  #include <linux/kernel.h>
> -#include <linux/kasan.h>
>  #include <linux/bug.h>
> -#include <linux/mm.h>
> -#include <linux/gfp.h>
>  #include <linux/jump_label.h>
> -#include <linux/random.h>
>  #include <linux/memory.h>
>  
>  #include <asm/text-patching.h>
> -#include <asm/page.h>
> -#include <asm/setup.h>
>  #include <asm/unwind.h>
>  
>  #if 0
> @@ -36,56 +29,7 @@ do {							\
>  } while (0)
>  #endif
>  
> -#ifdef CONFIG_RANDOMIZE_BASE
> -static unsigned long module_load_offset;
>  
> -/* Mutex protects the module_load_offset. */
> -static DEFINE_MUTEX(module_kaslr_mutex);
> -
> -static unsigned long int get_module_load_offset(void)
> -{
> -	if (kaslr_enabled()) {
> -		mutex_lock(&module_kaslr_mutex);
> -		/*
> -		 * Calculate the module_load_offset the first time this
> -		 * code is called. Once calculated it stays the same until
> -		 * reboot.
> -		 */
> -		if (module_load_offset == 0)
> -			module_load_offset =
> -				get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> -		mutex_unlock(&module_kaslr_mutex);
> -	}
> -	return module_load_offset;
> -}
> -#else
> -static unsigned long int get_module_load_offset(void)
> -{
> -	return 0;
> -}
> -#endif
> -
> -void *module_alloc(unsigned long size)
> -{
> -	gfp_t gfp_mask = GFP_KERNEL;
> -	void *p;
> -
> -	if (PAGE_ALIGN(size) > MODULES_LEN)
> -		return NULL;
> -
> -	p = __vmalloc_node_range(size, MODULE_ALIGN,
> -				 MODULES_VADDR + get_module_load_offset(),
> -				 MODULES_END, gfp_mask, PAGE_KERNEL,
> -				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> -				 NUMA_NO_NODE, __builtin_return_address(0));
> -
> -	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> -		vfree(p);
> -		return NULL;
> -	}
> -
> -	return p;
> -}
>  
>  #ifdef CONFIG_X86_32
>  int apply_relocate(Elf32_Shdr *sechdrs,
> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
> index c80febc44cd2..b9e42770a002 100644
> --- a/arch/x86/mm/Makefile
> +++ b/arch/x86/mm/Makefile
> @@ -67,3 +67,5 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_amd.o
>  
>  obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_identity.o
>  obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_boot.o
> +
> +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> diff --git a/arch/x86/mm/module_alloc.c b/arch/x86/mm/module_alloc.c
> new file mode 100644
> index 000000000000..00391c15e1eb
> --- /dev/null
> +++ b/arch/x86/mm/module_alloc.c
> @@ -0,0 +1,59 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +#include <linux/kasan.h>
> +#include <linux/random.h>
> +#include <linux/mutex.h>
> +#include <asm/setup.h>
> +
> +#ifdef CONFIG_RANDOMIZE_BASE
> +static unsigned long module_load_offset;
> +
> +/* Mutex protects the module_load_offset. */
> +static DEFINE_MUTEX(module_kaslr_mutex);
> +
> +static unsigned long int get_module_load_offset(void)
> +{
> +	if (kaslr_enabled()) {
> +		mutex_lock(&module_kaslr_mutex);
> +		/*
> +		 * Calculate the module_load_offset the first time this
> +		 * code is called. Once calculated it stays the same until
> +		 * reboot.
> +		 */
> +		if (module_load_offset == 0)
> +			module_load_offset =
> +				get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> +		mutex_unlock(&module_kaslr_mutex);
> +	}
> +	return module_load_offset;
> +}
> +#else
> +static unsigned long int get_module_load_offset(void)
> +{
> +	return 0;
> +}
> +#endif
> +
> +void *module_alloc(unsigned long size)
> +{
> +	gfp_t gfp_mask = GFP_KERNEL;
> +	void *p;
> +
> +	if (PAGE_ALIGN(size) > MODULES_LEN)
> +		return NULL;
> +
> +	p = __vmalloc_node_range(size, MODULE_ALIGN,
> +				 MODULES_VADDR + get_module_load_offset(),
> +				 MODULES_END, gfp_mask, PAGE_KERNEL,
> +				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> +				 NUMA_NO_NODE, __builtin_return_address(0));
> +
> +	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> +		vfree(p);
> +		return NULL;
> +	}
> +
> +	return p;
> +}
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index 6422e569b080..b8f4dcf92a89 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -668,7 +668,7 @@ static void __init proc_kcore_text_init(void)
>  }
>  #endif
>  
> -#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
> +#if defined(CONFIG_MODULE_ALLOC) && defined(MODULES_VADDR)
>  /*
>   * MODULES_VADDR has no intersection with VMALLOC_ADDR.
>   */
> diff --git a/kernel/module/Kconfig b/kernel/module/Kconfig
> index 0ea1b2970a23..a49460022350 100644
> --- a/kernel/module/Kconfig
> +++ b/kernel/module/Kconfig
> @@ -1,6 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  menuconfig MODULES
>  	bool "Enable loadable module support"
> +	select MODULE_ALLOC
>  	modules
>  	help
>  	  Kernel modules are small pieces of compiled code which can
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 36681911c05a..085bc6e75b3f 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -1179,16 +1179,6 @@ resolve_symbol_wait(struct module *mod,
>  	return ksym;
>  }
>  
> -void __weak module_memfree(void *module_region)
> -{
> -	/*
> -	 * This memory may be RO, and freeing RO memory in an interrupt is not
> -	 * supported by vmalloc.
> -	 */
> -	WARN_ON(in_interrupt());
> -	vfree(module_region);
> -}
> -
>  void __weak module_arch_cleanup(struct module *mod)
>  {
>  }
> @@ -1610,13 +1600,6 @@ static void free_modinfo(struct module *mod)
>  	}
>  }
>  
> -void * __weak module_alloc(unsigned long size)
> -{
> -	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> -			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
> -			NUMA_NO_NODE, __builtin_return_address(0));
> -}
> -
>  bool __weak module_init_section(const char *name)
>  {
>  	return strstarts(name, ".init");
> diff --git a/mm/Kconfig b/mm/Kconfig
> index ffc3a2ba3a8c..92bfb5ae2e95 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -1261,6 +1261,9 @@ config LOCK_MM_AND_FIND_VMA
>  config IOMMU_MM_DATA
>  	bool
>  
> +config MODULE_ALLOC
> +	def_bool n
> +
>  source "mm/damon/Kconfig"
>  
>  endmenu
> diff --git a/mm/Makefile b/mm/Makefile
> index e4b5b75aaec9..731bd2c20ceb 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -134,3 +134,4 @@ obj-$(CONFIG_IO_MAPPING) += io-mapping.o
>  obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
>  obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
>  obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
> +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
> diff --git a/mm/module_alloc.c b/mm/module_alloc.c
> new file mode 100644
> index 000000000000..821af49e9a7c
> --- /dev/null
> +++ b/mm/module_alloc.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +
> +void * __weak module_alloc(unsigned long size)
> +{
> +	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> +			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
> +			NUMA_NO_NODE, __builtin_return_address(0));
> +}
> +
> +void __weak module_memfree(void *module_region)
> +{
> +	/*
> +	 * This memory may be RO, and freeing RO memory in an interrupt is not
> +	 * supported by vmalloc.
> +	 */
> +	WARN_ON(in_interrupt());
> +	vfree(module_region);
> +}
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index d12a17fc0c17..b7d963fe0707 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -642,7 +642,7 @@ int is_vmalloc_or_module_addr(const void *x)
>  	 * and fall back on vmalloc() if that fails. Others
>  	 * just put it in the vmalloc space.
>  	 */
> -#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
> +#if defined(CONFIG_MODULE_ALLOC) && defined(MODULES_VADDR)
>  	unsigned long addr = (unsigned long)kasan_reset_tag(x);
>  	if (addr >= MODULES_VADDR && addr < MODULES_END)
>  		return 1;
> -- 
> 2.43.0
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-06 21:34 ` [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Luis Chamberlain
  2024-03-06 23:23   ` Calvin Owens
@ 2024-03-08  2:45   ` Masami Hiramatsu
  1 sibling, 0 replies; 27+ messages in thread
From: Masami Hiramatsu @ 2024-03-08  2:45 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Calvin Owens, Song Liu, Christophe Leroy, Mike Rapoport,
	Andrew Morton, Alexei Starovoitov, Steven Rostedt,
	Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

Hi,

On Wed, 6 Mar 2024 13:34:40 -0800
Luis Chamberlain <mcgrof@kernel.org> wrote:

> On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > Hello all,
> > 
> > This patchset makes it possible to use bpftrace with kprobes on kernels
> > built without loadable module support.
> 
> This is a step in the right direction for another reason: clearly the
> module_alloc() is not about modules, and we have special reasons for it
> now beyond modules. The effort to share a generalize a huge page for
> these things is also another reason for some of this but that is more
> long term.

Indeed. If it works without CONFIG_MODULES, it may be exec_alloc() or
something like that. Anyway, thanks for great job on this item!

> 
> I'm all for minor changes here so to avoid regressions but it seems a
> rename is in order -- if we're going to all this might as well do it
> now. And for that I'd just like to ask you paint the bikeshed with
> Song Liu as he's been the one slowly making way to help us get there
> with the "module: replace module_layout with module_memory",
> and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> the EXECMEM stuff would be what we use instead then. Mike kept the
> module_alloc() and the execmem was just a wrapper but your move of the
> arch stuff makes sense as well and I think would complement his series
> nicely.

yeah, it is better to work with Mike.

Thank you,

> 
> If you're gonna split code up to move to another place, it'd be nice
> if you can add copyright headers as was done with the kernel/module.c
> split into kernel/module/*.c
> 
> Can we start with some small basic stuff we can all agree on?
> 
> [0] https://lwn.net/Articles/944857/
> 
>   Luis


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/4] kprobes: Allow kprobes with CONFIG_MODULES=n
  2024-03-06 20:05 ` [RFC][PATCH 3/4] kprobes: Allow kprobes " Calvin Owens
  2024-03-07  7:22   ` Mike Rapoport
  2024-03-07 22:16   ` Christophe Leroy
@ 2024-03-08  2:46   ` Masami Hiramatsu
  2024-03-08 20:57     ` Calvin Owens
  2 siblings, 1 reply; 27+ messages in thread
From: Masami Hiramatsu @ 2024-03-08  2:46 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Wed,  6 Mar 2024 12:05:10 -0800
Calvin Owens <jcalvinowens@gmail.com> wrote:

> If something like this is merged down the road, it can go in at leisure
> once the module_alloc change is in: it's a one-way dependency.
> 
> Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> ---
>  arch/Kconfig                |  2 +-
>  kernel/kprobes.c            | 22 ++++++++++++++++++++++
>  kernel/trace/trace_kprobe.c | 11 +++++++++++
>  3 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index cfc24ced16dd..e60ce984d095 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -52,8 +52,8 @@ config GENERIC_ENTRY
>  
>  config KPROBES
>  	bool "Kprobes"
> -	depends on MODULES
>  	depends on HAVE_KPROBES
> +	select MODULE_ALLOC

OK, if we use EXEC_ALLOC,

config EXEC_ALLOC
	depends on HAVE_EXEC_ALLOC

And 

  config KPROBES
  	bool "Kprobes"
	depends on MODULES || EXEC_ALLOC
	select EXEC_ALLOC if HAVE_EXEC_ALLOC

then kprobes can be enabled either modules supported or exec_alloc is supported.
(new arch does not need to implement exec_alloc)

Maybe we also need something like

#ifdef CONFIG_EXEC_ALLOC
#define module_alloc(size) exec_alloc(size)
#endif

in kprobes.h, or just add `replacing module_alloc with exec_alloc` patch.

Thank you,

>  	select KALLSYMS
>  	select TASKS_RCU if PREEMPTION
>  	help
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 9d9095e81792..194270e17d57 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1556,8 +1556,12 @@ static bool is_cfi_preamble_symbol(unsigned long addr)
>  		str_has_prefix("__pfx_", symbuf);
>  }
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  static int check_kprobe_address_safe(struct kprobe *p,
>  				     struct module **probed_mod)
> +#else
> +static int check_kprobe_address_safe(struct kprobe *p)
> +#endif
>  {
>  	int ret;
>  
> @@ -1580,6 +1584,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
>  		goto out;
>  	}
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  	/* Check if 'p' is probing a module. */
>  	*probed_mod = __module_text_address((unsigned long) p->addr);
>  	if (*probed_mod) {
> @@ -1603,6 +1608,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
>  			ret = -ENOENT;
>  		}
>  	}
> +#endif
> +
>  out:
>  	preempt_enable();
>  	jump_label_unlock();
> @@ -1614,7 +1621,9 @@ int register_kprobe(struct kprobe *p)
>  {
>  	int ret;
>  	struct kprobe *old_p;
> +#if IS_ENABLED(CONFIG_MODULES)
>  	struct module *probed_mod;
> +#endif
>  	kprobe_opcode_t *addr;
>  	bool on_func_entry;
>  
> @@ -1633,7 +1642,11 @@ int register_kprobe(struct kprobe *p)
>  	p->nmissed = 0;
>  	INIT_LIST_HEAD(&p->list);
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  	ret = check_kprobe_address_safe(p, &probed_mod);
> +#else
> +	ret = check_kprobe_address_safe(p);
> +#endif
>  	if (ret)
>  		return ret;
>  
> @@ -1676,8 +1689,10 @@ int register_kprobe(struct kprobe *p)
>  out:
>  	mutex_unlock(&kprobe_mutex);
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  	if (probed_mod)
>  		module_put(probed_mod);
> +#endif
>  
>  	return ret;
>  }
> @@ -2482,6 +2497,7 @@ int kprobe_add_area_blacklist(unsigned long start, unsigned long end)
>  	return 0;
>  }
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  /* Remove all symbols in given area from kprobe blacklist */
>  static void kprobe_remove_area_blacklist(unsigned long start, unsigned long end)
>  {
> @@ -2499,6 +2515,7 @@ static void kprobe_remove_ksym_blacklist(unsigned long entry)
>  {
>  	kprobe_remove_area_blacklist(entry, entry + 1);
>  }
> +#endif
>  
>  int __weak arch_kprobe_get_kallsym(unsigned int *symnum, unsigned long *value,
>  				   char *type, char *sym)
> @@ -2564,6 +2581,7 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
>  	return ret ? : arch_populate_kprobe_blacklist();
>  }
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  static void add_module_kprobe_blacklist(struct module *mod)
>  {
>  	unsigned long start, end;
> @@ -2665,6 +2683,7 @@ static struct notifier_block kprobe_module_nb = {
>  	.notifier_call = kprobes_module_callback,
>  	.priority = 0
>  };
> +#endif /* IS_ENABLED(CONFIG_MODULES) */
>  
>  void kprobe_free_init_mem(void)
>  {
> @@ -2724,8 +2743,11 @@ static int __init init_kprobes(void)
>  	err = arch_init_kprobes();
>  	if (!err)
>  		err = register_die_notifier(&kprobe_exceptions_nb);
> +
> +#if IS_ENABLED(CONFIG_MODULES)
>  	if (!err)
>  		err = register_module_notifier(&kprobe_module_nb);
> +#endif
>  
>  	kprobes_initialized = (err == 0);
>  	kprobe_sysctls_init();
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index c4c6e0e0068b..dd4598f775b9 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -102,6 +102,7 @@ static nokprobe_inline bool trace_kprobe_has_gone(struct trace_kprobe *tk)
>  	return kprobe_gone(&tk->rp.kp);
>  }
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
>  						 struct module *mod)
>  {
> @@ -129,6 +130,12 @@ static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
>  
>  	return ret;
>  }
> +#else
> +static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> +{
> +	return true;
> +}
> +#endif
>  
>  static bool trace_kprobe_is_busy(struct dyn_event *ev)
>  {
> @@ -670,6 +677,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
>  	return ret;
>  }
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  /* Module notifier call back, checking event on the module */
>  static int trace_kprobe_module_callback(struct notifier_block *nb,
>  				       unsigned long val, void *data)
> @@ -704,6 +712,7 @@ static struct notifier_block trace_kprobe_module_nb = {
>  	.notifier_call = trace_kprobe_module_callback,
>  	.priority = 1	/* Invoked after kprobe module callback */
>  };
> +#endif /* IS_ENABLED(CONFIG_MODULES) */
>  
>  static int count_symbols(void *data, unsigned long unused)
>  {
> @@ -1897,8 +1906,10 @@ static __init int init_kprobe_trace_early(void)
>  	if (ret)
>  		return ret;
>  
> +#if IS_ENABLED(CONFIG_MODULES)
>  	if (register_module_notifier(&trace_kprobe_module_nb))
>  		return -EINVAL;
> +#endif
>  
>  	return 0;
>  }
> -- 
> 2.43.0
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/4] kprobes: Allow kprobes with CONFIG_MODULES=n
  2024-03-07  7:22   ` Mike Rapoport
@ 2024-03-08  2:46     ` Masami Hiramatsu
  2024-03-08 20:36     ` Calvin Owens
  1 sibling, 0 replies; 27+ messages in thread
From: Masami Hiramatsu @ 2024-03-08  2:46 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Calvin Owens, Luis Chamberlain, Andrew Morton,
	Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Thu, 7 Mar 2024 09:22:07 +0200
Mike Rapoport <rppt@kernel.org> wrote:

> On Wed, Mar 06, 2024 at 12:05:10PM -0800, Calvin Owens wrote:
> > If something like this is merged down the road, it can go in at leisure
> > once the module_alloc change is in: it's a one-way dependency.
> > 
> > Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> > ---
> >  arch/Kconfig                |  2 +-
> >  kernel/kprobes.c            | 22 ++++++++++++++++++++++
> >  kernel/trace/trace_kprobe.c | 11 +++++++++++
> >  3 files changed, 34 insertions(+), 1 deletion(-)
> 
> When I did this in my last execmem posting, I think I've got slightly less
> ugly ifdery, you may want to take a look at that:
> 
> https://lore.kernel.org/all/20230918072955.2507221-13-rppt@kernel.org

Good catch, and sorry I missed that series last year.
But it seems your patch seems less ugly.

Calvin, can you follow his patch?

Thank you,

>  
> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index cfc24ced16dd..e60ce984d095 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -52,8 +52,8 @@ config GENERIC_ENTRY
> >  
> >  config KPROBES
> >  	bool "Kprobes"
> > -	depends on MODULES
> >  	depends on HAVE_KPROBES
> > +	select MODULE_ALLOC
> >  	select KALLSYMS
> >  	select TASKS_RCU if PREEMPTION
> >  	help
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 9d9095e81792..194270e17d57 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -1556,8 +1556,12 @@ static bool is_cfi_preamble_symbol(unsigned long addr)
> >  		str_has_prefix("__pfx_", symbuf);
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  static int check_kprobe_address_safe(struct kprobe *p,
> >  				     struct module **probed_mod)
> > +#else
> > +static int check_kprobe_address_safe(struct kprobe *p)
> > +#endif
> >  {
> >  	int ret;
> >  
> > @@ -1580,6 +1584,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >  		goto out;
> >  	}
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> 
> Plain #ifdef will do here and below. IS_ENABLED is for usage withing the
> code, like
> 
> 	if (IS_ENABLED(CONFIG_MODULES))
> 		;
> 
> >  	/* Check if 'p' is probing a module. */
> >  	*probed_mod = __module_text_address((unsigned long) p->addr);
> >  	if (*probed_mod) {
> 
> -- 
> Sincerely yours,
> Mike.


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-07  1:58     ` Song Liu
@ 2024-03-08  2:50       ` Masami Hiramatsu
  2024-03-08  2:55         ` Luis Chamberlain
  0 siblings, 1 reply; 27+ messages in thread
From: Masami Hiramatsu @ 2024-03-08  2:50 UTC (permalink / raw)
  To: Song Liu
  Cc: Calvin Owens, Luis Chamberlain, Christophe Leroy, Mike Rapoport,
	Andrew Morton, Alexei Starovoitov, Steven Rostedt,
	Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Wed, 6 Mar 2024 17:58:14 -0800
Song Liu <song@kernel.org> wrote:

> Hi Calvin,
> 
> It is great to hear from you! :)
> 
> On Wed, Mar 6, 2024 at 3:23 PM Calvin Owens <jcalvinowens@gmail.com> wrote:
> >
> > On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > > Hello all,
> > > >
> > > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > > built without loadable module support.
> > >
> > > This is a step in the right direction for another reason: clearly the
> > > module_alloc() is not about modules, and we have special reasons for it
> > > now beyond modules. The effort to share a generalize a huge page for
> > > these things is also another reason for some of this but that is more
> > > long term.
> > >
> > > I'm all for minor changes here so to avoid regressions but it seems a
> > > rename is in order -- if we're going to all this might as well do it
> > > now. And for that I'd just like to ask you paint the bikeshed with
> > > Song Liu as he's been the one slowly making way to help us get there
> > > with the "module: replace module_layout with module_memory",
> > > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > > the EXECMEM stuff would be what we use instead then. Mike kept the
> > > module_alloc() and the execmem was just a wrapper but your move of the
> > > arch stuff makes sense as well and I think would complement his series
> > > nicely.
> >
> > I apologize for missing that. I think these are the four most recent
> > versions of the different series referenced from that LWN link:
> >
> >   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
> >   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
> >   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
> >   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
> >
> > Song and Mike, please correct me if I'm wrong, but I think what I've
> > done here (see [1], sorry for not adding you initially) is compatible
> > with everything both of you have recently proposed above. How do you
> > feel about this as a first step?
> 
> I agree that the work here is compatible with other efforts. I have no
> objection to making this the first step.
> 
> >
> > For naming, execmem_alloc() seems reasonable to me? I have no strong
> > feelings at all, I'll just use that going forward unless somebody else
> > expresses an opinion.
> 
> I am not good at naming things. No objection from me to "execmem_alloc".

Hm, it sounds good to me too. I think we should add a patch which just
rename the module_alloc/module_memfree with execmem_alloc/free first.

Thanks,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-08  2:50       ` Masami Hiramatsu
@ 2024-03-08  2:55         ` Luis Chamberlain
  2024-03-08 20:27           ` Calvin Owens
  0 siblings, 1 reply; 27+ messages in thread
From: Luis Chamberlain @ 2024-03-08  2:55 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Song Liu, Calvin Owens, Christophe Leroy, Mike Rapoport,
	Andrew Morton, Alexei Starovoitov, Steven Rostedt,
	Daniel Borkmann, Andrii Nakryiko, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Thu, Mar 7, 2024 at 6:50 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Wed, 6 Mar 2024 17:58:14 -0800
> Song Liu <song@kernel.org> wrote:
>
> > Hi Calvin,
> >
> > It is great to hear from you! :)
> >
> > On Wed, Mar 6, 2024 at 3:23 PM Calvin Owens <jcalvinowens@gmail.com> wrote:
> > >
> > > On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > > > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > > > Hello all,
> > > > >
> > > > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > > > built without loadable module support.
> > > >
> > > > This is a step in the right direction for another reason: clearly the
> > > > module_alloc() is not about modules, and we have special reasons for it
> > > > now beyond modules. The effort to share a generalize a huge page for
> > > > these things is also another reason for some of this but that is more
> > > > long term.
> > > >
> > > > I'm all for minor changes here so to avoid regressions but it seems a
> > > > rename is in order -- if we're going to all this might as well do it
> > > > now. And for that I'd just like to ask you paint the bikeshed with
> > > > Song Liu as he's been the one slowly making way to help us get there
> > > > with the "module: replace module_layout with module_memory",
> > > > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > > > the EXECMEM stuff would be what we use instead then. Mike kept the
> > > > module_alloc() and the execmem was just a wrapper but your move of the
> > > > arch stuff makes sense as well and I think would complement his series
> > > > nicely.
> > >
> > > I apologize for missing that. I think these are the four most recent
> > > versions of the different series referenced from that LWN link:
> > >
> > >   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
> > >   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
> > >   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
> > >   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
> > >
> > > Song and Mike, please correct me if I'm wrong, but I think what I've
> > > done here (see [1], sorry for not adding you initially) is compatible
> > > with everything both of you have recently proposed above. How do you
> > > feel about this as a first step?
> >
> > I agree that the work here is compatible with other efforts. I have no
> > objection to making this the first step.
> >
> > >
> > > For naming, execmem_alloc() seems reasonable to me? I have no strong
> > > feelings at all, I'll just use that going forward unless somebody else
> > > expresses an opinion.
> >
> > I am not good at naming things. No objection from me to "execmem_alloc".
>
> Hm, it sounds good to me too. I think we should add a patch which just
> rename the module_alloc/module_memfree with execmem_alloc/free first.

I think that would be cleaner, yes. Leaving the possible move to a
secondary patch and placing the testing more on the later part.

 Luis

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-08  2:55         ` Luis Chamberlain
@ 2024-03-08 20:27           ` Calvin Owens
  0 siblings, 0 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-08 20:27 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Masami Hiramatsu, Song Liu, Christophe Leroy, Mike Rapoport,
	Andrew Morton, Alexei Starovoitov, Steven Rostedt,
	Daniel Borkmann, Andrii Nakryiko, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Thursday 03/07 at 18:55 -0800, Luis Chamberlain wrote:
> On Thu, Mar 7, 2024 at 6:50 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Wed, 6 Mar 2024 17:58:14 -0800
> > Song Liu <song@kernel.org> wrote:
> >
> > > Hi Calvin,
> > >
> > > It is great to hear from you! :)
> > >
> > > On Wed, Mar 6, 2024 at 3:23 PM Calvin Owens <jcalvinowens@gmail.com> wrote:
> > > >
> > > > On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > > > > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > > > > Hello all,
> > > > > >
> > > > > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > > > > built without loadable module support.
> > > > >
> > > > > This is a step in the right direction for another reason: clearly the
> > > > > module_alloc() is not about modules, and we have special reasons for it
> > > > > now beyond modules. The effort to share a generalize a huge page for
> > > > > these things is also another reason for some of this but that is more
> > > > > long term.
> > > > >
> > > > > I'm all for minor changes here so to avoid regressions but it seems a
> > > > > rename is in order -- if we're going to all this might as well do it
> > > > > now. And for that I'd just like to ask you paint the bikeshed with
> > > > > Song Liu as he's been the one slowly making way to help us get there
> > > > > with the "module: replace module_layout with module_memory",
> > > > > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > > > > the EXECMEM stuff would be what we use instead then. Mike kept the
> > > > > module_alloc() and the execmem was just a wrapper but your move of the
> > > > > arch stuff makes sense as well and I think would complement his series
> > > > > nicely.
> > > >
> > > > I apologize for missing that. I think these are the four most recent
> > > > versions of the different series referenced from that LWN link:
> > > >
> > > >   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
> > > >   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
> > > >   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
> > > >   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
> > > >
> > > > Song and Mike, please correct me if I'm wrong, but I think what I've
> > > > done here (see [1], sorry for not adding you initially) is compatible
> > > > with everything both of you have recently proposed above. How do you
> > > > feel about this as a first step?
> > >
> > > I agree that the work here is compatible with other efforts. I have no
> > > objection to making this the first step.
> > >
> > > >
> > > > For naming, execmem_alloc() seems reasonable to me? I have no strong
> > > > feelings at all, I'll just use that going forward unless somebody else
> > > > expresses an opinion.
> > >
> > > I am not good at naming things. No objection from me to "execmem_alloc".
> >
> > Hm, it sounds good to me too. I think we should add a patch which just
> > rename the module_alloc/module_memfree with execmem_alloc/free first.
> 
> I think that would be cleaner, yes. Leaving the possible move to a
> secondary patch and placing the testing more on the later part.

Makes sense to me.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/4] kprobes: Allow kprobes with CONFIG_MODULES=n
  2024-03-07  7:22   ` Mike Rapoport
  2024-03-08  2:46     ` Masami Hiramatsu
@ 2024-03-08 20:36     ` Calvin Owens
  1 sibling, 0 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-08 20:36 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner, bpf, linux-modules,
	linux-kernel

On Thursday 03/07 at 09:22 +0200, Mike Rapoport wrote:
> On Wed, Mar 06, 2024 at 12:05:10PM -0800, Calvin Owens wrote:
> > If something like this is merged down the road, it can go in at leisure
> > once the module_alloc change is in: it's a one-way dependency.
> > 
> > Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> > ---
> >  arch/Kconfig                |  2 +-
> >  kernel/kprobes.c            | 22 ++++++++++++++++++++++
> >  kernel/trace/trace_kprobe.c | 11 +++++++++++
> >  3 files changed, 34 insertions(+), 1 deletion(-)
> 
> When I did this in my last execmem posting, I think I've got slightly less
> ugly ifdery, you may want to take a look at that:
> 
> https://lore.kernel.org/all/20230918072955.2507221-13-rppt@kernel.org

Thanks Mike, I definitely agree. I'm annoyed at myself for not finding
your patches, I spent some time looking for prior work and I really
don't know how I missed it...

> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index cfc24ced16dd..e60ce984d095 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -52,8 +52,8 @@ config GENERIC_ENTRY
> >  
> >  config KPROBES
> >  	bool "Kprobes"
> > -	depends on MODULES
> >  	depends on HAVE_KPROBES
> > +	select MODULE_ALLOC
> >  	select KALLSYMS
> >  	select TASKS_RCU if PREEMPTION
> >  	help
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 9d9095e81792..194270e17d57 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -1556,8 +1556,12 @@ static bool is_cfi_preamble_symbol(unsigned long addr)
> >  		str_has_prefix("__pfx_", symbuf);
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  static int check_kprobe_address_safe(struct kprobe *p,
> >  				     struct module **probed_mod)
> > +#else
> > +static int check_kprobe_address_safe(struct kprobe *p)
> > +#endif
> >  {
> >  	int ret;
> >  
> > @@ -1580,6 +1584,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >  		goto out;
> >  	}
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> 
> Plain #ifdef will do here and below. IS_ENABLED is for usage withing the
> code, like
> 
> 	if (IS_ENABLED(CONFIG_MODULES))
> 		;
> 
> >  	/* Check if 'p' is probing a module. */
> >  	*probed_mod = __module_text_address((unsigned long) p->addr);
> >  	if (*probed_mod) {
> 
> -- 
> Sincerely yours,
> Mike.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available
  2024-03-08  2:16   ` Masami Hiramatsu
@ 2024-03-08 20:43     ` Calvin Owens
  0 siblings, 0 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-08 20:43 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Friday 03/08 at 11:16 +0900, Masami Hiramatsu wrote:
> Hi Calvin,
> 
> On Wed,  6 Mar 2024 12:05:08 -0800
> Calvin Owens <jcalvinowens@gmail.com> wrote:
> 
> > Both BPF_JIT and KPROBES depend on CONFIG_MODULES, but only require
> > module_alloc() itself, which can be easily separated into a standalone
> > allocator for executable kernel memory.
> 
> Thanks for your work!
> As Luis pointed, it is better to use different name because this
> is not only for modules and it does not depend on CONFIG_MODULES.
> 
> > 
> > Thomas Gleixner sent a patch to do that for x86 as part of a larger
> > series a couple years ago:
> > 
> >     https://lore.kernel.org/all/20220716230953.442937066@linutronix.de/
> > 
> > I've simply extended that approach to the whole kernel.
> 
> I would like to see a series of patches for each architecture so that
> architecture maintainers carefully check and test this feature.
> 
> What about introducing CONFIG_HAVE_EXEC_ALLOC and enable it on
> each architecture? Then you can start small set of major architectures
> and expand it later. 

Thanks Masami. That makes sense to me, I'll do it.

I'm also working on getting the other architectures running in QEMU, so
hopefully I'll be able to iron out more of the arch problems on my own
before the next respin.

> Thank you,
> 
> > 
> > Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> > ---
> >  arch/Kconfig                     |   2 +-
> >  arch/arm/kernel/module.c         |  35 ---------
> >  arch/arm/mm/Makefile             |   2 +
> >  arch/arm/mm/module_alloc.c       |  40 ++++++++++
> >  arch/arm64/kernel/module.c       | 127 ------------------------------
> >  arch/arm64/mm/Makefile           |   1 +
> >  arch/arm64/mm/module_alloc.c     | 130 +++++++++++++++++++++++++++++++
> >  arch/loongarch/kernel/module.c   |   6 --
> >  arch/loongarch/mm/Makefile       |   2 +
> >  arch/loongarch/mm/module_alloc.c |  10 +++
> >  arch/mips/kernel/module.c        |  10 ---
> >  arch/mips/mm/Makefile            |   2 +
> >  arch/mips/mm/module_alloc.c      |  13 ++++
> >  arch/nios2/kernel/module.c       |  20 -----
> >  arch/nios2/mm/Makefile           |   2 +
> >  arch/nios2/mm/module_alloc.c     |  22 ++++++
> >  arch/parisc/kernel/module.c      |  12 ---
> >  arch/parisc/mm/Makefile          |   1 +
> >  arch/parisc/mm/module_alloc.c    |  15 ++++
> >  arch/powerpc/kernel/module.c     |  36 ---------
> >  arch/powerpc/mm/Makefile         |   1 +
> >  arch/powerpc/mm/module_alloc.c   |  41 ++++++++++
> >  arch/riscv/kernel/module.c       |  11 ---
> >  arch/riscv/mm/Makefile           |   1 +
> >  arch/riscv/mm/module_alloc.c     |  17 ++++
> >  arch/s390/kernel/module.c        |  37 ---------
> >  arch/s390/mm/Makefile            |   1 +
> >  arch/s390/mm/module_alloc.c      |  42 ++++++++++
> >  arch/sparc/kernel/module.c       |  31 --------
> >  arch/sparc/mm/Makefile           |   2 +
> >  arch/sparc/mm/module_alloc.c     |  31 ++++++++
> >  arch/x86/kernel/ftrace.c         |   2 +-
> >  arch/x86/kernel/module.c         |  56 -------------
> >  arch/x86/mm/Makefile             |   2 +
> >  arch/x86/mm/module_alloc.c       |  59 ++++++++++++++
> >  fs/proc/kcore.c                  |   2 +-
> >  kernel/module/Kconfig            |   1 +
> >  kernel/module/main.c             |  17 ----
> >  mm/Kconfig                       |   3 +
> >  mm/Makefile                      |   1 +
> >  mm/module_alloc.c                |  21 +++++
> >  mm/vmalloc.c                     |   2 +-
> >  42 files changed, 467 insertions(+), 402 deletions(-)
> >  create mode 100644 arch/arm/mm/module_alloc.c
> >  create mode 100644 arch/arm64/mm/module_alloc.c
> >  create mode 100644 arch/loongarch/mm/module_alloc.c
> >  create mode 100644 arch/mips/mm/module_alloc.c
> >  create mode 100644 arch/nios2/mm/module_alloc.c
> >  create mode 100644 arch/parisc/mm/module_alloc.c
> >  create mode 100644 arch/powerpc/mm/module_alloc.c
> >  create mode 100644 arch/riscv/mm/module_alloc.c
> >  create mode 100644 arch/s390/mm/module_alloc.c
> >  create mode 100644 arch/sparc/mm/module_alloc.c
> >  create mode 100644 arch/x86/mm/module_alloc.c
> >  create mode 100644 mm/module_alloc.c
> > 
> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index a5af0edd3eb8..cfc24ced16dd 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -1305,7 +1305,7 @@ config ARCH_HAS_STRICT_MODULE_RWX
> >  
> >  config STRICT_MODULE_RWX
> >  	bool "Set loadable kernel module data as NX and text as RO" if ARCH_OPTIONAL_KERNEL_RWX
> > -	depends on ARCH_HAS_STRICT_MODULE_RWX && MODULES
> > +	depends on ARCH_HAS_STRICT_MODULE_RWX && MODULE_ALLOC
> >  	default !ARCH_OPTIONAL_KERNEL_RWX || ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
> >  	help
> >  	  If this is set, module text and rodata memory will be made read-only,
> > diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
> > index e74d84f58b77..1c8798732d12 100644
> > --- a/arch/arm/kernel/module.c
> > +++ b/arch/arm/kernel/module.c
> > @@ -4,15 +4,12 @@
> >   *
> >   *  Copyright (C) 2002 Russell King.
> >   *  Modified for nommu by Hyok S. Choi
> > - *
> > - * Module allocation method suggested by Andi Kleen.
> >   */
> >  #include <linux/module.h>
> >  #include <linux/moduleloader.h>
> >  #include <linux/kernel.h>
> >  #include <linux/mm.h>
> >  #include <linux/elf.h>
> > -#include <linux/vmalloc.h>
> >  #include <linux/fs.h>
> >  #include <linux/string.h>
> >  #include <linux/gfp.h>
> > @@ -22,38 +19,6 @@
> >  #include <asm/unwind.h>
> >  #include <asm/opcodes.h>
> >  
> > -#ifdef CONFIG_XIP_KERNEL
> > -/*
> > - * The XIP kernel text is mapped in the module area for modules and
> > - * some other stuff to work without any indirect relocations.
> > - * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
> > - * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
> > - */
> > -#undef MODULES_VADDR
> > -#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
> > -#endif
> > -
> > -#ifdef CONFIG_MMU
> > -void *module_alloc(unsigned long size)
> > -{
> > -	gfp_t gfp_mask = GFP_KERNEL;
> > -	void *p;
> > -
> > -	/* Silence the initial allocation */
> > -	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS))
> > -		gfp_mask |= __GFP_NOWARN;
> > -
> > -	p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> > -				gfp_mask, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > -				__builtin_return_address(0));
> > -	if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)
> > -		return p;
> > -	return __vmalloc_node_range(size, 1,  VMALLOC_START, VMALLOC_END,
> > -				GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > -				__builtin_return_address(0));
> > -}
> > -#endif
> > -
> >  bool module_init_section(const char *name)
> >  {
> >  	return strstarts(name, ".init") ||
> > diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
> > index 71b858c9b10c..a05a6701a884 100644
> > --- a/arch/arm/mm/Makefile
> > +++ b/arch/arm/mm/Makefile
> > @@ -100,3 +100,5 @@ obj-$(CONFIG_CACHE_UNIPHIER)	+= cache-uniphier.o
> >  
> >  KASAN_SANITIZE_kasan_init.o	:= n
> >  obj-$(CONFIG_KASAN)		+= kasan_init.o
> > +
> > +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> > diff --git a/arch/arm/mm/module_alloc.c b/arch/arm/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..e48be48b2b5f
> > --- /dev/null
> > +++ b/arch/arm/mm/module_alloc.c
> > @@ -0,0 +1,40 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +
> > +#ifdef CONFIG_XIP_KERNEL
> > +/*
> > + * The XIP kernel text is mapped in the module area for modules and
> > + * some other stuff to work without any indirect relocations.
> > + * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
> > + * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
> > + */
> > +#undef MODULES_VADDR
> > +#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
> > +#endif
> > +
> > +/*
> > + * Module allocation method suggested by Andi Kleen.
> > + */
> > +
> > +#ifdef CONFIG_MMU
> > +void *module_alloc(unsigned long size)
> > +{
> > +	gfp_t gfp_mask = GFP_KERNEL;
> > +	void *p;
> > +
> > +	/* Silence the initial allocation */
> > +	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS))
> > +		gfp_mask |= __GFP_NOWARN;
> > +
> > +	p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> > +				gfp_mask, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > +				__builtin_return_address(0));
> > +	if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)
> > +		return p;
> > +	return __vmalloc_node_range(size, 1,  VMALLOC_START, VMALLOC_END,
> > +				GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > +				__builtin_return_address(0));
> > +}
> > +#endif
> > diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> > index dd851297596e..78758ed818b0 100644
> > --- a/arch/arm64/kernel/module.c
> > +++ b/arch/arm64/kernel/module.c
> > @@ -13,143 +13,16 @@
> >  #include <linux/elf.h>
> >  #include <linux/ftrace.h>
> >  #include <linux/gfp.h>
> > -#include <linux/kasan.h>
> >  #include <linux/kernel.h>
> >  #include <linux/mm.h>
> >  #include <linux/moduleloader.h>
> > -#include <linux/random.h>
> >  #include <linux/scs.h>
> > -#include <linux/vmalloc.h>
> >  
> >  #include <asm/alternative.h>
> >  #include <asm/insn.h>
> >  #include <asm/scs.h>
> >  #include <asm/sections.h>
> >  
> > -static u64 module_direct_base __ro_after_init = 0;
> > -static u64 module_plt_base __ro_after_init = 0;
> > -
> > -/*
> > - * Choose a random page-aligned base address for a window of 'size' bytes which
> > - * entirely contains the interval [start, end - 1].
> > - */
> > -static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
> > -{
> > -	u64 max_pgoff, pgoff;
> > -
> > -	if ((end - start) >= size)
> > -		return 0;
> > -
> > -	max_pgoff = (size - (end - start)) / PAGE_SIZE;
> > -	pgoff = get_random_u32_inclusive(0, max_pgoff);
> > -
> > -	return start - pgoff * PAGE_SIZE;
> > -}
> > -
> > -/*
> > - * Modules may directly reference data and text anywhere within the kernel
> > - * image and other modules. References using PREL32 relocations have a +/-2G
> > - * range, and so we need to ensure that the entire kernel image and all modules
> > - * fall within a 2G window such that these are always within range.
> > - *
> > - * Modules may directly branch to functions and code within the kernel text,
> > - * and to functions and code within other modules. These branches will use
> > - * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
> > - * that the entire kernel text and all module text falls within a 128M window
> > - * such that these are always within range. With PLTs, we can expand this to a
> > - * 2G window.
> > - *
> > - * We chose the 128M region to surround the entire kernel image (rather than
> > - * just the text) as using the same bounds for the 128M and 2G regions ensures
> > - * by construction that we never select a 128M region that is not a subset of
> > - * the 2G region. For very large and unusual kernel configurations this means
> > - * we may fall back to PLTs where they could have been avoided, but this keeps
> > - * the logic significantly simpler.
> > - */
> > -static int __init module_init_limits(void)
> > -{
> > -	u64 kernel_end = (u64)_end;
> > -	u64 kernel_start = (u64)_text;
> > -	u64 kernel_size = kernel_end - kernel_start;
> > -
> > -	/*
> > -	 * The default modules region is placed immediately below the kernel
> > -	 * image, and is large enough to use the full 2G relocation range.
> > -	 */
> > -	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
> > -	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
> > -
> > -	if (!kaslr_enabled()) {
> > -		if (kernel_size < SZ_128M)
> > -			module_direct_base = kernel_end - SZ_128M;
> > -		if (kernel_size < SZ_2G)
> > -			module_plt_base = kernel_end - SZ_2G;
> > -	} else {
> > -		u64 min = kernel_start;
> > -		u64 max = kernel_end;
> > -
> > -		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
> > -			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
> > -		} else {
> > -			module_direct_base = random_bounding_box(SZ_128M, min, max);
> > -			if (module_direct_base) {
> > -				min = module_direct_base;
> > -				max = module_direct_base + SZ_128M;
> > -			}
> > -		}
> > -
> > -		module_plt_base = random_bounding_box(SZ_2G, min, max);
> > -	}
> > -
> > -	pr_info("%llu pages in range for non-PLT usage",
> > -		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
> > -	pr_info("%llu pages in range for PLT usage",
> > -		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
> > -
> > -	return 0;
> > -}
> > -subsys_initcall(module_init_limits);
> > -
> > -void *module_alloc(unsigned long size)
> > -{
> > -	void *p = NULL;
> > -
> > -	/*
> > -	 * Where possible, prefer to allocate within direct branch range of the
> > -	 * kernel such that no PLTs are necessary.
> > -	 */
> > -	if (module_direct_base) {
> > -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> > -					 module_direct_base,
> > -					 module_direct_base + SZ_128M,
> > -					 GFP_KERNEL | __GFP_NOWARN,
> > -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> > -					 __builtin_return_address(0));
> > -	}
> > -
> > -	if (!p && module_plt_base) {
> > -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> > -					 module_plt_base,
> > -					 module_plt_base + SZ_2G,
> > -					 GFP_KERNEL | __GFP_NOWARN,
> > -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> > -					 __builtin_return_address(0));
> > -	}
> > -
> > -	if (!p) {
> > -		pr_warn_ratelimited("%s: unable to allocate memory\n",
> > -				    __func__);
> > -	}
> > -
> > -	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
> > -		vfree(p);
> > -		return NULL;
> > -	}
> > -
> > -	/* Memory is intended to be executable, reset the pointer tag. */
> > -	return kasan_reset_tag(p);
> > -}
> > -
> >  enum aarch64_reloc_op {
> >  	RELOC_OP_NONE,
> >  	RELOC_OP_ABS,
> > diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> > index dbd1bc95967d..cf616635a80d 100644
> > --- a/arch/arm64/mm/Makefile
> > +++ b/arch/arm64/mm/Makefile
> > @@ -10,6 +10,7 @@ obj-$(CONFIG_TRANS_TABLE)	+= trans_pgd.o
> >  obj-$(CONFIG_TRANS_TABLE)	+= trans_pgd-asm.o
> >  obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
> >  obj-$(CONFIG_ARM64_MTE)		+= mteswap.o
> > +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> >  KASAN_SANITIZE_physaddr.o	+= n
> >  
> >  obj-$(CONFIG_KASAN)		+= kasan_init.o
> > diff --git a/arch/arm64/mm/module_alloc.c b/arch/arm64/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..302642ea9e26
> > --- /dev/null
> > +++ b/arch/arm64/mm/module_alloc.c
> > @@ -0,0 +1,130 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +#include <linux/kasan.h>
> > +#include <linux/random.h>
> > +
> > +static u64 module_direct_base __ro_after_init = 0;
> > +static u64 module_plt_base __ro_after_init = 0;
> > +
> > +/*
> > + * Choose a random page-aligned base address for a window of 'size' bytes which
> > + * entirely contains the interval [start, end - 1].
> > + */
> > +static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
> > +{
> > +	u64 max_pgoff, pgoff;
> > +
> > +	if ((end - start) >= size)
> > +		return 0;
> > +
> > +	max_pgoff = (size - (end - start)) / PAGE_SIZE;
> > +	pgoff = get_random_u32_inclusive(0, max_pgoff);
> > +
> > +	return start - pgoff * PAGE_SIZE;
> > +}
> > +
> > +/*
> > + * Modules may directly reference data and text anywhere within the kernel
> > + * image and other modules. References using PREL32 relocations have a +/-2G
> > + * range, and so we need to ensure that the entire kernel image and all modules
> > + * fall within a 2G window such that these are always within range.
> > + *
> > + * Modules may directly branch to functions and code within the kernel text,
> > + * and to functions and code within other modules. These branches will use
> > + * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
> > + * that the entire kernel text and all module text falls within a 128M window
> > + * such that these are always within range. With PLTs, we can expand this to a
> > + * 2G window.
> > + *
> > + * We chose the 128M region to surround the entire kernel image (rather than
> > + * just the text) as using the same bounds for the 128M and 2G regions ensures
> > + * by construction that we never select a 128M region that is not a subset of
> > + * the 2G region. For very large and unusual kernel configurations this means
> > + * we may fall back to PLTs where they could have been avoided, but this keeps
> > + * the logic significantly simpler.
> > + */
> > +static int __init module_init_limits(void)
> > +{
> > +	u64 kernel_end = (u64)_end;
> > +	u64 kernel_start = (u64)_text;
> > +	u64 kernel_size = kernel_end - kernel_start;
> > +
> > +	/*
> > +	 * The default modules region is placed immediately below the kernel
> > +	 * image, and is large enough to use the full 2G relocation range.
> > +	 */
> > +	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
> > +	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
> > +
> > +	if (!kaslr_enabled()) {
> > +		if (kernel_size < SZ_128M)
> > +			module_direct_base = kernel_end - SZ_128M;
> > +		if (kernel_size < SZ_2G)
> > +			module_plt_base = kernel_end - SZ_2G;
> > +	} else {
> > +		u64 min = kernel_start;
> > +		u64 max = kernel_end;
> > +
> > +		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
> > +			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
> > +		} else {
> > +			module_direct_base = random_bounding_box(SZ_128M, min, max);
> > +			if (module_direct_base) {
> > +				min = module_direct_base;
> > +				max = module_direct_base + SZ_128M;
> > +			}
> > +		}
> > +
> > +		module_plt_base = random_bounding_box(SZ_2G, min, max);
> > +	}
> > +
> > +	pr_info("%llu pages in range for non-PLT usage",
> > +		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
> > +	pr_info("%llu pages in range for PLT usage",
> > +		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
> > +
> > +	return 0;
> > +}
> > +subsys_initcall(module_init_limits);
> > +
> > +void *module_alloc(unsigned long size)
> > +{
> > +	void *p = NULL;
> > +
> > +	/*
> > +	 * Where possible, prefer to allocate within direct branch range of the
> > +	 * kernel such that no PLTs are necessary.
> > +	 */
> > +	if (module_direct_base) {
> > +		p = __vmalloc_node_range(size, MODULE_ALIGN,
> > +					 module_direct_base,
> > +					 module_direct_base + SZ_128M,
> > +					 GFP_KERNEL | __GFP_NOWARN,
> > +					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> > +					 __builtin_return_address(0));
> > +	}
> > +
> > +	if (!p && module_plt_base) {
> > +		p = __vmalloc_node_range(size, MODULE_ALIGN,
> > +					 module_plt_base,
> > +					 module_plt_base + SZ_2G,
> > +					 GFP_KERNEL | __GFP_NOWARN,
> > +					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> > +					 __builtin_return_address(0));
> > +	}
> > +
> > +	if (!p) {
> > +		pr_warn_ratelimited("%s: unable to allocate memory\n",
> > +				    __func__);
> > +	}
> > +
> > +	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
> > +		vfree(p);
> > +		return NULL;
> > +	}
> > +
> > +	/* Memory is intended to be executable, reset the pointer tag. */
> > +	return kasan_reset_tag(p);
> > +}
> > diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
> > index b13b2858fe39..7f03166513b3 100644
> > --- a/arch/loongarch/kernel/module.c
> > +++ b/arch/loongarch/kernel/module.c
> > @@ -489,12 +489,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> >  	return 0;
> >  }
> >  
> > -void *module_alloc(unsigned long size)
> > -{
> > -	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> > -			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
> > -}
> > -
> >  static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
> >  				   const Elf_Shdr *sechdrs, struct module *mod)
> >  {
> > diff --git a/arch/loongarch/mm/Makefile b/arch/loongarch/mm/Makefile
> > index e4d1e581dbae..3966fc6118f1 100644
> > --- a/arch/loongarch/mm/Makefile
> > +++ b/arch/loongarch/mm/Makefile
> > @@ -10,3 +10,5 @@ obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
> >  obj-$(CONFIG_KASAN)		+= kasan_init.o
> >  
> >  KASAN_SANITIZE_kasan_init.o     := n
> > +
> > +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> > diff --git a/arch/loongarch/mm/module_alloc.c b/arch/loongarch/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..24b0cb3a2088
> > --- /dev/null
> > +++ b/arch/loongarch/mm/module_alloc.c
> > @@ -0,0 +1,10 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +
> > +void *module_alloc(unsigned long size)
> > +{
> > +	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> > +			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
> > +}
> > diff --git a/arch/mips/kernel/module.c b/arch/mips/kernel/module.c
> > index 7b2fbaa9cac5..ba0f62d8eff5 100644
> > --- a/arch/mips/kernel/module.c
> > +++ b/arch/mips/kernel/module.c
> > @@ -13,7 +13,6 @@
> >  #include <linux/elf.h>
> >  #include <linux/mm.h>
> >  #include <linux/numa.h>
> > -#include <linux/vmalloc.h>
> >  #include <linux/slab.h>
> >  #include <linux/fs.h>
> >  #include <linux/string.h>
> > @@ -31,15 +30,6 @@ struct mips_hi16 {
> >  static LIST_HEAD(dbe_list);
> >  static DEFINE_SPINLOCK(dbe_lock);
> >  
> > -#ifdef MODULE_START
> > -void *module_alloc(unsigned long size)
> > -{
> > -	return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
> > -				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> > -				__builtin_return_address(0));
> > -}
> > -#endif
> > -
> >  static void apply_r_mips_32(u32 *location, u32 base, Elf_Addr v)
> >  {
> >  	*location = base + v;
> > diff --git a/arch/mips/mm/Makefile b/arch/mips/mm/Makefile
> > index 304692391519..b9cfe37e41e4 100644
> > --- a/arch/mips/mm/Makefile
> > +++ b/arch/mips/mm/Makefile
> > @@ -45,3 +45,5 @@ obj-$(CONFIG_MIPS_CPU_SCACHE)	+= sc-mips.o
> >  obj-$(CONFIG_SCACHE_DEBUGFS)	+= sc-debugfs.o
> >  
> >  obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
> > +
> > +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> > diff --git a/arch/mips/mm/module_alloc.c b/arch/mips/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..fcdbdece42f3
> > --- /dev/null
> > +++ b/arch/mips/mm/module_alloc.c
> > @@ -0,0 +1,13 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +
> > +#ifdef MODULE_START
> > +void *module_alloc(unsigned long size)
> > +{
> > +	return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
> > +				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> > +				__builtin_return_address(0));
> > +}
> > +#endif
> > diff --git a/arch/nios2/kernel/module.c b/arch/nios2/kernel/module.c
> > index 76e0a42d6e36..f4483243578d 100644
> > --- a/arch/nios2/kernel/module.c
> > +++ b/arch/nios2/kernel/module.c
> > @@ -13,7 +13,6 @@
> >  #include <linux/moduleloader.h>
> >  #include <linux/elf.h>
> >  #include <linux/mm.h>
> > -#include <linux/vmalloc.h>
> >  #include <linux/slab.h>
> >  #include <linux/fs.h>
> >  #include <linux/string.h>
> > @@ -21,25 +20,6 @@
> >  
> >  #include <asm/cacheflush.h>
> >  
> > -/*
> > - * Modules should NOT be allocated with kmalloc for (obvious) reasons.
> > - * But we do it for now to avoid relocation issues. CALL26/PCREL26 cannot reach
> > - * from 0x80000000 (vmalloc area) to 0xc00000000 (kernel) (kmalloc returns
> > - * addresses in 0xc0000000)
> > - */
> > -void *module_alloc(unsigned long size)
> > -{
> > -	if (size == 0)
> > -		return NULL;
> > -	return kmalloc(size, GFP_KERNEL);
> > -}
> > -
> > -/* Free memory returned from module_alloc */
> > -void module_memfree(void *module_region)
> > -{
> > -	kfree(module_region);
> > -}
> > -
> >  int apply_relocate_add(Elf32_Shdr *sechdrs, const char *strtab,
> >  			unsigned int symindex, unsigned int relsec,
> >  			struct module *mod)
> > diff --git a/arch/nios2/mm/Makefile b/arch/nios2/mm/Makefile
> > index 9d37fafd1dd1..facbb3e60013 100644
> > --- a/arch/nios2/mm/Makefile
> > +++ b/arch/nios2/mm/Makefile
> > @@ -13,3 +13,5 @@ obj-y	+= mmu_context.o
> >  obj-y	+= pgtable.o
> >  obj-y	+= tlb.o
> >  obj-y	+= uaccess.o
> > +
> > +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> > diff --git a/arch/nios2/mm/module_alloc.c b/arch/nios2/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..92c7c32ef8b3
> > --- /dev/null
> > +++ b/arch/nios2/mm/module_alloc.c
> > @@ -0,0 +1,22 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <linux/moduleloader.h>
> > +#include <linux/slab.h>
> > +
> > +/*
> > + * Modules should NOT be allocated with kmalloc for (obvious) reasons.
> > + * But we do it for now to avoid relocation issues. CALL26/PCREL26 cannot reach
> > + * from 0x80000000 (vmalloc area) to 0xc00000000 (kernel) (kmalloc returns
> > + * addresses in 0xc0000000)
> > + */
> > +void *module_alloc(unsigned long size)
> > +{
> > +	if (size == 0)
> > +		return NULL;
> > +	return kmalloc(size, GFP_KERNEL);
> > +}
> > +
> > +/* Free memory returned from module_alloc */
> > +void module_memfree(void *module_region)
> > +{
> > +	kfree(module_region);
> > +}
> > diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
> > index d214bbe3c2af..4e5d991b2b65 100644
> > --- a/arch/parisc/kernel/module.c
> > +++ b/arch/parisc/kernel/module.c
> > @@ -41,7 +41,6 @@
> >  
> >  #include <linux/moduleloader.h>
> >  #include <linux/elf.h>
> > -#include <linux/vmalloc.h>
> >  #include <linux/fs.h>
> >  #include <linux/ftrace.h>
> >  #include <linux/string.h>
> > @@ -173,17 +172,6 @@ static inline int reassemble_22(int as22)
> >  		((as22 & 0x0003ff) << 3));
> >  }
> >  
> > -void *module_alloc(unsigned long size)
> > -{
> > -	/* using RWX means less protection for modules, but it's
> > -	 * easier than trying to map the text, data, init_text and
> > -	 * init_data correctly */
> > -	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> > -				    GFP_KERNEL,
> > -				    PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
> > -				    __builtin_return_address(0));
> > -}
> > -
> >  #ifndef CONFIG_64BIT
> >  static inline unsigned long count_gots(const Elf_Rela *rela, unsigned long n)
> >  {
> > diff --git a/arch/parisc/mm/Makefile b/arch/parisc/mm/Makefile
> > index ffdb5c0a8cc6..95a6d4469785 100644
> > --- a/arch/parisc/mm/Makefile
> > +++ b/arch/parisc/mm/Makefile
> > @@ -5,3 +5,4 @@
> >  
> >  obj-y	 := init.o fault.o ioremap.o fixmap.o
> >  obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
> > +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
> > diff --git a/arch/parisc/mm/module_alloc.c b/arch/parisc/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..5ad9bfc3ffab
> > --- /dev/null
> > +++ b/arch/parisc/mm/module_alloc.c
> > @@ -0,0 +1,15 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +
> > +void *module_alloc(unsigned long size)
> > +{
> > +	/* using RWX means less protection for modules, but it's
> > +	 * easier than trying to map the text, data, init_text and
> > +	 * init_data correctly */
> > +	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> > +				    GFP_KERNEL,
> > +				    PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
> > +				    __builtin_return_address(0));
> > +}
> > diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
> > index f6d6ae0a1692..b5fe9c61e527 100644
> > --- a/arch/powerpc/kernel/module.c
> > +++ b/arch/powerpc/kernel/module.c
> > @@ -89,39 +89,3 @@ int module_finalize(const Elf_Ehdr *hdr,
> >  	return 0;
> >  }
> >  
> > -static __always_inline void *
> > -__module_alloc(unsigned long size, unsigned long start, unsigned long end, bool nowarn)
> > -{
> > -	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
> > -	gfp_t gfp = GFP_KERNEL | (nowarn ? __GFP_NOWARN : 0);
> > -
> > -	/*
> > -	 * Don't do huge page allocations for modules yet until more testing
> > -	 * is done. STRICT_MODULE_RWX may require extra work to support this
> > -	 * too.
> > -	 */
> > -	return __vmalloc_node_range(size, 1, start, end, gfp, prot,
> > -				    VM_FLUSH_RESET_PERMS,
> > -				    NUMA_NO_NODE, __builtin_return_address(0));
> > -}
> > -
> > -void *module_alloc(unsigned long size)
> > -{
> > -#ifdef MODULES_VADDR
> > -	unsigned long limit = (unsigned long)_etext - SZ_32M;
> > -	void *ptr = NULL;
> > -
> > -	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
> > -
> > -	/* First try within 32M limit from _etext to avoid branch trampolines */
> > -	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
> > -		ptr = __module_alloc(size, limit, MODULES_END, true);
> > -
> > -	if (!ptr)
> > -		ptr = __module_alloc(size, MODULES_VADDR, MODULES_END, false);
> > -
> > -	return ptr;
> > -#else
> > -	return __module_alloc(size, VMALLOC_START, VMALLOC_END, false);
> > -#endif
> > -}
> > diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
> > index 503a6e249940..4572273a838f 100644
> > --- a/arch/powerpc/mm/Makefile
> > +++ b/arch/powerpc/mm/Makefile
> > @@ -19,3 +19,4 @@ obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
> >  obj-$(CONFIG_PPC_COPRO_BASE)	+= copro_fault.o
> >  obj-$(CONFIG_PTDUMP_CORE)	+= ptdump/
> >  obj-$(CONFIG_KASAN)		+= kasan/
> > +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> > diff --git a/arch/powerpc/mm/module_alloc.c b/arch/powerpc/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..818e5cd8fbc6
> > --- /dev/null
> > +++ b/arch/powerpc/mm/module_alloc.c
> > @@ -0,0 +1,41 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +
> > +static __always_inline void *
> > +__module_alloc(unsigned long size, unsigned long start, unsigned long end, bool nowarn)
> > +{
> > +	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
> > +	gfp_t gfp = GFP_KERNEL | (nowarn ? __GFP_NOWARN : 0);
> > +
> > +	/*
> > +	 * Don't do huge page allocations for modules yet until more testing
> > +	 * is done. STRICT_MODULE_RWX may require extra work to support this
> > +	 * too.
> > +	 */
> > +	return __vmalloc_node_range(size, 1, start, end, gfp, prot,
> > +				    VM_FLUSH_RESET_PERMS,
> > +				    NUMA_NO_NODE, __builtin_return_address(0));
> > +}
> > +
> > +void *module_alloc(unsigned long size)
> > +{
> > +#ifdef MODULES_VADDR
> > +	unsigned long limit = (unsigned long)_etext - SZ_32M;
> > +	void *ptr = NULL;
> > +
> > +	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
> > +
> > +	/* First try within 32M limit from _etext to avoid branch trampolines */
> > +	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
> > +		ptr = __module_alloc(size, limit, MODULES_END, true);
> > +
> > +	if (!ptr)
> > +		ptr = __module_alloc(size, MODULES_VADDR, MODULES_END, false);
> > +
> > +	return ptr;
> > +#else
> > +	return __module_alloc(size, VMALLOC_START, VMALLOC_END, false);
> > +#endif
> > +}
> > diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> > index 5e5a82644451..53d7005fdbdb 100644
> > --- a/arch/riscv/kernel/module.c
> > +++ b/arch/riscv/kernel/module.c
> > @@ -11,7 +11,6 @@
> >  #include <linux/kernel.h>
> >  #include <linux/log2.h>
> >  #include <linux/moduleloader.h>
> > -#include <linux/vmalloc.h>
> >  #include <linux/sizes.h>
> >  #include <linux/pgtable.h>
> >  #include <asm/alternative.h>
> > @@ -905,16 +904,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> >  	return 0;
> >  }
> >  
> > -#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> > -void *module_alloc(unsigned long size)
> > -{
> > -	return __vmalloc_node_range(size, 1, MODULES_VADDR,
> > -				    MODULES_END, GFP_KERNEL,
> > -				    PAGE_KERNEL, VM_FLUSH_RESET_PERMS,
> > -				    NUMA_NO_NODE,
> > -				    __builtin_return_address(0));
> > -}
> > -#endif
> >  
> >  int module_finalize(const Elf_Ehdr *hdr,
> >  		    const Elf_Shdr *sechdrs,
> > diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> > index 2c869f8026a8..fba8e3595459 100644
> > --- a/arch/riscv/mm/Makefile
> > +++ b/arch/riscv/mm/Makefile
> > @@ -36,3 +36,4 @@ endif
> >  obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
> >  obj-$(CONFIG_RISCV_DMA_NONCOHERENT) += dma-noncoherent.o
> >  obj-$(CONFIG_RISCV_NONSTANDARD_CACHE_OPS) += cache-ops.o
> > +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
> > diff --git a/arch/riscv/mm/module_alloc.c b/arch/riscv/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..2c1fb95a57e2
> > --- /dev/null
> > +++ b/arch/riscv/mm/module_alloc.c
> > @@ -0,0 +1,17 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/pgtable.h>
> > +#include <asm/alternative.h>
> > +#include <asm/sections.h>
> > +
> > +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> > +void *module_alloc(unsigned long size)
> > +{
> > +	return __vmalloc_node_range(size, 1, MODULES_VADDR,
> > +				    MODULES_END, GFP_KERNEL,
> > +				    PAGE_KERNEL, VM_FLUSH_RESET_PERMS,
> > +				    NUMA_NO_NODE,
> > +				    __builtin_return_address(0));
> > +}
> > +#endif
> > diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
> > index 42215f9404af..ef8a7539bb0b 100644
> > --- a/arch/s390/kernel/module.c
> > +++ b/arch/s390/kernel/module.c
> > @@ -36,43 +36,6 @@
> >  
> >  #define PLT_ENTRY_SIZE 22
> >  
> > -static unsigned long get_module_load_offset(void)
> > -{
> > -	static DEFINE_MUTEX(module_kaslr_mutex);
> > -	static unsigned long module_load_offset;
> > -
> > -	if (!kaslr_enabled())
> > -		return 0;
> > -	/*
> > -	 * Calculate the module_load_offset the first time this code
> > -	 * is called. Once calculated it stays the same until reboot.
> > -	 */
> > -	mutex_lock(&module_kaslr_mutex);
> > -	if (!module_load_offset)
> > -		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> > -	mutex_unlock(&module_kaslr_mutex);
> > -	return module_load_offset;
> > -}
> > -
> > -void *module_alloc(unsigned long size)
> > -{
> > -	gfp_t gfp_mask = GFP_KERNEL;
> > -	void *p;
> > -
> > -	if (PAGE_ALIGN(size) > MODULES_LEN)
> > -		return NULL;
> > -	p = __vmalloc_node_range(size, MODULE_ALIGN,
> > -				 MODULES_VADDR + get_module_load_offset(),
> > -				 MODULES_END, gfp_mask, PAGE_KERNEL,
> > -				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> > -				 NUMA_NO_NODE, __builtin_return_address(0));
> > -	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> > -		vfree(p);
> > -		return NULL;
> > -	}
> > -	return p;
> > -}
> > -
> >  #ifdef CONFIG_FUNCTION_TRACER
> >  void module_arch_cleanup(struct module *mod)
> >  {
> > diff --git a/arch/s390/mm/Makefile b/arch/s390/mm/Makefile
> > index 352ff520fd94..4f44c4096c6d 100644
> > --- a/arch/s390/mm/Makefile
> > +++ b/arch/s390/mm/Makefile
> > @@ -11,3 +11,4 @@ obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
> >  obj-$(CONFIG_PTDUMP_CORE)	+= dump_pagetables.o
> >  obj-$(CONFIG_PGSTE)		+= gmap.o
> >  obj-$(CONFIG_PFAULT)		+= pfault.o
> > +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> > diff --git a/arch/s390/mm/module_alloc.c b/arch/s390/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..88eadce4bc68
> > --- /dev/null
> > +++ b/arch/s390/mm/module_alloc.c
> > @@ -0,0 +1,42 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +#include <linux/kasan.h>
> > +
> > +static unsigned long get_module_load_offset(void)
> > +{
> > +	static DEFINE_MUTEX(module_kaslr_mutex);
> > +	static unsigned long module_load_offset;
> > +
> > +	if (!kaslr_enabled())
> > +		return 0;
> > +	/*
> > +	 * Calculate the module_load_offset the first time this code
> > +	 * is called. Once calculated it stays the same until reboot.
> > +	 */
> > +	mutex_lock(&module_kaslr_mutex);
> > +	if (!module_load_offset)
> > +		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> > +	mutex_unlock(&module_kaslr_mutex);
> > +	return module_load_offset;
> > +}
> > +
> > +void *module_alloc(unsigned long size)
> > +{
> > +	gfp_t gfp_mask = GFP_KERNEL;
> > +	void *p;
> > +
> > +	if (PAGE_ALIGN(size) > MODULES_LEN)
> > +		return NULL;
> > +	p = __vmalloc_node_range(size, MODULE_ALIGN,
> > +				 MODULES_VADDR + get_module_load_offset(),
> > +				 MODULES_END, gfp_mask, PAGE_KERNEL,
> > +				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> > +				 NUMA_NO_NODE, __builtin_return_address(0));
> > +	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> > +		vfree(p);
> > +		return NULL;
> > +	}
> > +	return p;
> > +}
> > diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
> > index 66c45a2764bc..0611a41cd586 100644
> > --- a/arch/sparc/kernel/module.c
> > +++ b/arch/sparc/kernel/module.c
> > @@ -8,7 +8,6 @@
> >  #include <linux/moduleloader.h>
> >  #include <linux/kernel.h>
> >  #include <linux/elf.h>
> > -#include <linux/vmalloc.h>
> >  #include <linux/fs.h>
> >  #include <linux/gfp.h>
> >  #include <linux/string.h>
> > @@ -21,36 +20,6 @@
> >  
> >  #include "entry.h"
> >  
> > -#ifdef CONFIG_SPARC64
> > -
> > -#include <linux/jump_label.h>
> > -
> > -static void *module_map(unsigned long size)
> > -{
> > -	if (PAGE_ALIGN(size) > MODULES_LEN)
> > -		return NULL;
> > -	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> > -				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> > -				__builtin_return_address(0));
> > -}
> > -#else
> > -static void *module_map(unsigned long size)
> > -{
> > -	return vmalloc(size);
> > -}
> > -#endif /* CONFIG_SPARC64 */
> > -
> > -void *module_alloc(unsigned long size)
> > -{
> > -	void *ret;
> > -
> > -	ret = module_map(size);
> > -	if (ret)
> > -		memset(ret, 0, size);
> > -
> > -	return ret;
> > -}
> > -
> >  /* Make generic code ignore STT_REGISTER dummy undefined symbols.  */
> >  int module_frob_arch_sections(Elf_Ehdr *hdr,
> >  			      Elf_Shdr *sechdrs,
> > diff --git a/arch/sparc/mm/Makefile b/arch/sparc/mm/Makefile
> > index 809d993f6d88..a8e9ba46679a 100644
> > --- a/arch/sparc/mm/Makefile
> > +++ b/arch/sparc/mm/Makefile
> > @@ -14,3 +14,5 @@ obj-$(CONFIG_SPARC32)   += leon_mm.o
> >  
> >  # Only used by sparc64
> >  obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
> > +
> > +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
> > diff --git a/arch/sparc/mm/module_alloc.c b/arch/sparc/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..14aef0f75650
> > --- /dev/null
> > +++ b/arch/sparc/mm/module_alloc.c
> > @@ -0,0 +1,31 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +
> > +#ifdef CONFIG_SPARC64
> > +static void *module_map(unsigned long size)
> > +{
> > +	if (PAGE_ALIGN(size) > MODULES_LEN)
> > +		return NULL;
> > +	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> > +				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> > +				__builtin_return_address(0));
> > +}
> > +#else
> > +static void *module_map(unsigned long size)
> > +{
> > +	return vmalloc(size);
> > +}
> > +#endif /* CONFIG_SPARC64 */
> > +
> > +void *module_alloc(unsigned long size)
> > +{
> > +	void *ret;
> > +
> > +	ret = module_map(size);
> > +	if (ret)
> > +		memset(ret, 0, size);
> > +
> > +	return ret;
> > +}
> > diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> > index 12df54ff0e81..99f242e11f88 100644
> > --- a/arch/x86/kernel/ftrace.c
> > +++ b/arch/x86/kernel/ftrace.c
> > @@ -260,7 +260,7 @@ void arch_ftrace_update_code(int command)
> >  /* Currently only x86_64 supports dynamic trampolines */
> >  #ifdef CONFIG_X86_64
> >  
> > -#ifdef CONFIG_MODULES
> > +#if IS_ENABLED(CONFIG_MODULE_ALLOC)
> >  #include <linux/moduleloader.h>
> >  /* Module allocation simplifies allocating memory for code */
> >  static inline void *alloc_tramp(unsigned long size)
> > diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
> > index e18914c0e38a..ad7e3968ee8f 100644
> > --- a/arch/x86/kernel/module.c
> > +++ b/arch/x86/kernel/module.c
> > @@ -8,21 +8,14 @@
> >  
> >  #include <linux/moduleloader.h>
> >  #include <linux/elf.h>
> > -#include <linux/vmalloc.h>
> >  #include <linux/fs.h>
> >  #include <linux/string.h>
> >  #include <linux/kernel.h>
> > -#include <linux/kasan.h>
> >  #include <linux/bug.h>
> > -#include <linux/mm.h>
> > -#include <linux/gfp.h>
> >  #include <linux/jump_label.h>
> > -#include <linux/random.h>
> >  #include <linux/memory.h>
> >  
> >  #include <asm/text-patching.h>
> > -#include <asm/page.h>
> > -#include <asm/setup.h>
> >  #include <asm/unwind.h>
> >  
> >  #if 0
> > @@ -36,56 +29,7 @@ do {							\
> >  } while (0)
> >  #endif
> >  
> > -#ifdef CONFIG_RANDOMIZE_BASE
> > -static unsigned long module_load_offset;
> >  
> > -/* Mutex protects the module_load_offset. */
> > -static DEFINE_MUTEX(module_kaslr_mutex);
> > -
> > -static unsigned long int get_module_load_offset(void)
> > -{
> > -	if (kaslr_enabled()) {
> > -		mutex_lock(&module_kaslr_mutex);
> > -		/*
> > -		 * Calculate the module_load_offset the first time this
> > -		 * code is called. Once calculated it stays the same until
> > -		 * reboot.
> > -		 */
> > -		if (module_load_offset == 0)
> > -			module_load_offset =
> > -				get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> > -		mutex_unlock(&module_kaslr_mutex);
> > -	}
> > -	return module_load_offset;
> > -}
> > -#else
> > -static unsigned long int get_module_load_offset(void)
> > -{
> > -	return 0;
> > -}
> > -#endif
> > -
> > -void *module_alloc(unsigned long size)
> > -{
> > -	gfp_t gfp_mask = GFP_KERNEL;
> > -	void *p;
> > -
> > -	if (PAGE_ALIGN(size) > MODULES_LEN)
> > -		return NULL;
> > -
> > -	p = __vmalloc_node_range(size, MODULE_ALIGN,
> > -				 MODULES_VADDR + get_module_load_offset(),
> > -				 MODULES_END, gfp_mask, PAGE_KERNEL,
> > -				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> > -				 NUMA_NO_NODE, __builtin_return_address(0));
> > -
> > -	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> > -		vfree(p);
> > -		return NULL;
> > -	}
> > -
> > -	return p;
> > -}
> >  
> >  #ifdef CONFIG_X86_32
> >  int apply_relocate(Elf32_Shdr *sechdrs,
> > diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
> > index c80febc44cd2..b9e42770a002 100644
> > --- a/arch/x86/mm/Makefile
> > +++ b/arch/x86/mm/Makefile
> > @@ -67,3 +67,5 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_amd.o
> >  
> >  obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_identity.o
> >  obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_boot.o
> > +
> > +obj-$(CONFIG_MODULE_ALLOC)	+= module_alloc.o
> > diff --git a/arch/x86/mm/module_alloc.c b/arch/x86/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..00391c15e1eb
> > --- /dev/null
> > +++ b/arch/x86/mm/module_alloc.c
> > @@ -0,0 +1,59 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +#include <linux/kasan.h>
> > +#include <linux/random.h>
> > +#include <linux/mutex.h>
> > +#include <asm/setup.h>
> > +
> > +#ifdef CONFIG_RANDOMIZE_BASE
> > +static unsigned long module_load_offset;
> > +
> > +/* Mutex protects the module_load_offset. */
> > +static DEFINE_MUTEX(module_kaslr_mutex);
> > +
> > +static unsigned long int get_module_load_offset(void)
> > +{
> > +	if (kaslr_enabled()) {
> > +		mutex_lock(&module_kaslr_mutex);
> > +		/*
> > +		 * Calculate the module_load_offset the first time this
> > +		 * code is called. Once calculated it stays the same until
> > +		 * reboot.
> > +		 */
> > +		if (module_load_offset == 0)
> > +			module_load_offset =
> > +				get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> > +		mutex_unlock(&module_kaslr_mutex);
> > +	}
> > +	return module_load_offset;
> > +}
> > +#else
> > +static unsigned long int get_module_load_offset(void)
> > +{
> > +	return 0;
> > +}
> > +#endif
> > +
> > +void *module_alloc(unsigned long size)
> > +{
> > +	gfp_t gfp_mask = GFP_KERNEL;
> > +	void *p;
> > +
> > +	if (PAGE_ALIGN(size) > MODULES_LEN)
> > +		return NULL;
> > +
> > +	p = __vmalloc_node_range(size, MODULE_ALIGN,
> > +				 MODULES_VADDR + get_module_load_offset(),
> > +				 MODULES_END, gfp_mask, PAGE_KERNEL,
> > +				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> > +				 NUMA_NO_NODE, __builtin_return_address(0));
> > +
> > +	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> > +		vfree(p);
> > +		return NULL;
> > +	}
> > +
> > +	return p;
> > +}
> > diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> > index 6422e569b080..b8f4dcf92a89 100644
> > --- a/fs/proc/kcore.c
> > +++ b/fs/proc/kcore.c
> > @@ -668,7 +668,7 @@ static void __init proc_kcore_text_init(void)
> >  }
> >  #endif
> >  
> > -#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
> > +#if defined(CONFIG_MODULE_ALLOC) && defined(MODULES_VADDR)
> >  /*
> >   * MODULES_VADDR has no intersection with VMALLOC_ADDR.
> >   */
> > diff --git a/kernel/module/Kconfig b/kernel/module/Kconfig
> > index 0ea1b2970a23..a49460022350 100644
> > --- a/kernel/module/Kconfig
> > +++ b/kernel/module/Kconfig
> > @@ -1,6 +1,7 @@
> >  # SPDX-License-Identifier: GPL-2.0-only
> >  menuconfig MODULES
> >  	bool "Enable loadable module support"
> > +	select MODULE_ALLOC
> >  	modules
> >  	help
> >  	  Kernel modules are small pieces of compiled code which can
> > diff --git a/kernel/module/main.c b/kernel/module/main.c
> > index 36681911c05a..085bc6e75b3f 100644
> > --- a/kernel/module/main.c
> > +++ b/kernel/module/main.c
> > @@ -1179,16 +1179,6 @@ resolve_symbol_wait(struct module *mod,
> >  	return ksym;
> >  }
> >  
> > -void __weak module_memfree(void *module_region)
> > -{
> > -	/*
> > -	 * This memory may be RO, and freeing RO memory in an interrupt is not
> > -	 * supported by vmalloc.
> > -	 */
> > -	WARN_ON(in_interrupt());
> > -	vfree(module_region);
> > -}
> > -
> >  void __weak module_arch_cleanup(struct module *mod)
> >  {
> >  }
> > @@ -1610,13 +1600,6 @@ static void free_modinfo(struct module *mod)
> >  	}
> >  }
> >  
> > -void * __weak module_alloc(unsigned long size)
> > -{
> > -	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> > -			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
> > -			NUMA_NO_NODE, __builtin_return_address(0));
> > -}
> > -
> >  bool __weak module_init_section(const char *name)
> >  {
> >  	return strstarts(name, ".init");
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index ffc3a2ba3a8c..92bfb5ae2e95 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -1261,6 +1261,9 @@ config LOCK_MM_AND_FIND_VMA
> >  config IOMMU_MM_DATA
> >  	bool
> >  
> > +config MODULE_ALLOC
> > +	def_bool n
> > +
> >  source "mm/damon/Kconfig"
> >  
> >  endmenu
> > diff --git a/mm/Makefile b/mm/Makefile
> > index e4b5b75aaec9..731bd2c20ceb 100644
> > --- a/mm/Makefile
> > +++ b/mm/Makefile
> > @@ -134,3 +134,4 @@ obj-$(CONFIG_IO_MAPPING) += io-mapping.o
> >  obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
> >  obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
> >  obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
> > +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o
> > diff --git a/mm/module_alloc.c b/mm/module_alloc.c
> > new file mode 100644
> > index 000000000000..821af49e9a7c
> > --- /dev/null
> > +++ b/mm/module_alloc.c
> > @@ -0,0 +1,21 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/mm.h>
> > +
> > +void * __weak module_alloc(unsigned long size)
> > +{
> > +	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> > +			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
> > +			NUMA_NO_NODE, __builtin_return_address(0));
> > +}
> > +
> > +void __weak module_memfree(void *module_region)
> > +{
> > +	/*
> > +	 * This memory may be RO, and freeing RO memory in an interrupt is not
> > +	 * supported by vmalloc.
> > +	 */
> > +	WARN_ON(in_interrupt());
> > +	vfree(module_region);
> > +}
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index d12a17fc0c17..b7d963fe0707 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -642,7 +642,7 @@ int is_vmalloc_or_module_addr(const void *x)
> >  	 * and fall back on vmalloc() if that fails. Others
> >  	 * just put it in the vmalloc space.
> >  	 */
> > -#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
> > +#if defined(CONFIG_MODULE_ALLOC) && defined(MODULES_VADDR)
> >  	unsigned long addr = (unsigned long)kasan_reset_tag(x);
> >  	if (addr >= MODULES_VADDR && addr < MODULES_END)
> >  		return 1;
> > -- 
> > 2.43.0
> > 
> 
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available
  2024-03-07 14:43   ` Christophe Leroy
@ 2024-03-08 20:53     ` Calvin Owens
  0 siblings, 0 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-08 20:53 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner, bpf, linux-modules,
	linux-kernel

On Thursday 03/07 at 14:43 +0000, Christophe Leroy wrote:
> Hi Calvin,
> 
> Le 06/03/2024 à 21:05, Calvin Owens a écrit :
> > [Vous ne recevez pas souvent de courriers de jcalvinowens@gmail.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> > 
> > Both BPF_JIT and KPROBES depend on CONFIG_MODULES, but only require
> > module_alloc() itself, which can be easily separated into a standalone
> > allocator for executable kernel memory.
> 
> Easily maybe, but not as easily as you think, see below.
> 
> > 
> > Thomas Gleixner sent a patch to do that for x86 as part of a larger
> > series a couple years ago:
> > 
> >      https://lore.kernel.org/all/20220716230953.442937066@linutronix.de/
> > 
> > I've simply extended that approach to the whole kernel.
> > 
> > Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> > ---
> >   arch/Kconfig                     |   2 +-
> >   arch/arm/kernel/module.c         |  35 ---------
> >   arch/arm/mm/Makefile             |   2 +
> >   arch/arm/mm/module_alloc.c       |  40 ++++++++++
> >   arch/arm64/kernel/module.c       | 127 ------------------------------
> >   arch/arm64/mm/Makefile           |   1 +
> >   arch/arm64/mm/module_alloc.c     | 130 +++++++++++++++++++++++++++++++
> >   arch/loongarch/kernel/module.c   |   6 --
> >   arch/loongarch/mm/Makefile       |   2 +
> >   arch/loongarch/mm/module_alloc.c |  10 +++
> >   arch/mips/kernel/module.c        |  10 ---
> >   arch/mips/mm/Makefile            |   2 +
> >   arch/mips/mm/module_alloc.c      |  13 ++++
> >   arch/nios2/kernel/module.c       |  20 -----
> >   arch/nios2/mm/Makefile           |   2 +
> >   arch/nios2/mm/module_alloc.c     |  22 ++++++
> >   arch/parisc/kernel/module.c      |  12 ---
> >   arch/parisc/mm/Makefile          |   1 +
> >   arch/parisc/mm/module_alloc.c    |  15 ++++
> >   arch/powerpc/kernel/module.c     |  36 ---------
> >   arch/powerpc/mm/Makefile         |   1 +
> >   arch/powerpc/mm/module_alloc.c   |  41 ++++++++++
> 
> Missing several powerpc changes to make it work. You must audit every 
> use of CONFIG_MODULES inside powerpc. Here are a few exemples:
> 
> Function get_patch_pfn() to enable text code patching.
> 
> arch/powerpc/Kconfig : 	select KASAN_VMALLOC			if KASAN && MODULES
> 
> arch/powerpc/include/asm/kasan.h:
> 
> #if defined(CONFIG_MODULES) && defined(CONFIG_PPC32)
> #define KASAN_KERN_START	ALIGN_DOWN(PAGE_OFFSET - SZ_256M, SZ_256M)
> #else
> #define KASAN_KERN_START	PAGE_OFFSET
> #endif
> 
> arch/powerpc/kernel/head_8xx.S and arch/powerpc/kernel/head_book3s_32.S: 
> InstructionTLBMiss interrupt handler must know that there is executable 
> kernel text outside kernel core.
> 
> Function is_module_segment() to identified segments used for module text 
> and set NX (NoExec) MMU flag on non-module segments.

Thanks Christophe, I'll fix that up.

I'm sure there are many other issues like this in the arch stuff here,
I'm going to run them all through QEMU to catch everything I can before
the next respin.

> >   arch/riscv/kernel/module.c       |  11 ---
> >   arch/riscv/mm/Makefile           |   1 +
> >   arch/riscv/mm/module_alloc.c     |  17 ++++
> >   arch/s390/kernel/module.c        |  37 ---------
> >   arch/s390/mm/Makefile            |   1 +
> >   arch/s390/mm/module_alloc.c      |  42 ++++++++++
> >   arch/sparc/kernel/module.c       |  31 --------
> >   arch/sparc/mm/Makefile           |   2 +
> >   arch/sparc/mm/module_alloc.c     |  31 ++++++++
> >   arch/x86/kernel/ftrace.c         |   2 +-
> >   arch/x86/kernel/module.c         |  56 -------------
> >   arch/x86/mm/Makefile             |   2 +
> >   arch/x86/mm/module_alloc.c       |  59 ++++++++++++++
> >   fs/proc/kcore.c                  |   2 +-
> >   kernel/module/Kconfig            |   1 +
> >   kernel/module/main.c             |  17 ----
> >   mm/Kconfig                       |   3 +
> >   mm/Makefile                      |   1 +
> >   mm/module_alloc.c                |  21 +++++
> >   mm/vmalloc.c                     |   2 +-
> >   42 files changed, 467 insertions(+), 402 deletions(-)
> 
> ...
> 
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index ffc3a2ba3a8c..92bfb5ae2e95 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -1261,6 +1261,9 @@ config LOCK_MM_AND_FIND_VMA
> >   config IOMMU_MM_DATA
> >          bool
> > 
> > +config MODULE_ALLOC
> > +       def_bool n
> > +
> 
> I'd call it something else than CONFIG_MODULE_ALLOC as you want to use 
> it when CONFIG_MODULE is not selected.
> 
> Something like CONFIG_EXECMEM_ALLOC or CONFIG_DYNAMIC_EXECMEM ?
> 
> 
> 
> Christophe

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/4] kprobes: Allow kprobes with CONFIG_MODULES=n
  2024-03-08  2:46   ` Masami Hiramatsu
@ 2024-03-08 20:57     ` Calvin Owens
  0 siblings, 0 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-08 20:57 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner, bpf,
	linux-modules, linux-kernel

On Friday 03/08 at 11:46 +0900, Masami Hiramatsu wrote:
> On Wed,  6 Mar 2024 12:05:10 -0800
> Calvin Owens <jcalvinowens@gmail.com> wrote:
> 
> > If something like this is merged down the road, it can go in at leisure
> > once the module_alloc change is in: it's a one-way dependency.
> > 
> > Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> > ---
> >  arch/Kconfig                |  2 +-
> >  kernel/kprobes.c            | 22 ++++++++++++++++++++++
> >  kernel/trace/trace_kprobe.c | 11 +++++++++++
> >  3 files changed, 34 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index cfc24ced16dd..e60ce984d095 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -52,8 +52,8 @@ config GENERIC_ENTRY
> >  
> >  config KPROBES
> >  	bool "Kprobes"
> > -	depends on MODULES
> >  	depends on HAVE_KPROBES
> > +	select MODULE_ALLOC
> 
> OK, if we use EXEC_ALLOC,
> 
> config EXEC_ALLOC
> 	depends on HAVE_EXEC_ALLOC
> 
> And 
> 
>   config KPROBES
>   	bool "Kprobes"
> 	depends on MODULES || EXEC_ALLOC
> 	select EXEC_ALLOC if HAVE_EXEC_ALLOC
> 
> then kprobes can be enabled either modules supported or exec_alloc is supported.
> (new arch does not need to implement exec_alloc)
> 
> Maybe we also need something like
> 
> #ifdef CONFIG_EXEC_ALLOC
> #define module_alloc(size) exec_alloc(size)
> #endif
> 
> in kprobes.h, or just add `replacing module_alloc with exec_alloc` patch.
> 
> Thank you,

The example was helpful, thanks. I see what you mean with
HAVE_EXEC_ALLOC, I'll implement it like that in the next verison.

> >  	select KALLSYMS
> >  	select TASKS_RCU if PREEMPTION
> >  	help
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 9d9095e81792..194270e17d57 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -1556,8 +1556,12 @@ static bool is_cfi_preamble_symbol(unsigned long addr)
> >  		str_has_prefix("__pfx_", symbuf);
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  static int check_kprobe_address_safe(struct kprobe *p,
> >  				     struct module **probed_mod)
> > +#else
> > +static int check_kprobe_address_safe(struct kprobe *p)
> > +#endif
> >  {
> >  	int ret;
> >  
> > @@ -1580,6 +1584,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >  		goto out;
> >  	}
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  	/* Check if 'p' is probing a module. */
> >  	*probed_mod = __module_text_address((unsigned long) p->addr);
> >  	if (*probed_mod) {
> > @@ -1603,6 +1608,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >  			ret = -ENOENT;
> >  		}
> >  	}
> > +#endif
> > +
> >  out:
> >  	preempt_enable();
> >  	jump_label_unlock();
> > @@ -1614,7 +1621,9 @@ int register_kprobe(struct kprobe *p)
> >  {
> >  	int ret;
> >  	struct kprobe *old_p;
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  	struct module *probed_mod;
> > +#endif
> >  	kprobe_opcode_t *addr;
> >  	bool on_func_entry;
> >  
> > @@ -1633,7 +1642,11 @@ int register_kprobe(struct kprobe *p)
> >  	p->nmissed = 0;
> >  	INIT_LIST_HEAD(&p->list);
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  	ret = check_kprobe_address_safe(p, &probed_mod);
> > +#else
> > +	ret = check_kprobe_address_safe(p);
> > +#endif
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -1676,8 +1689,10 @@ int register_kprobe(struct kprobe *p)
> >  out:
> >  	mutex_unlock(&kprobe_mutex);
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  	if (probed_mod)
> >  		module_put(probed_mod);
> > +#endif
> >  
> >  	return ret;
> >  }
> > @@ -2482,6 +2497,7 @@ int kprobe_add_area_blacklist(unsigned long start, unsigned long end)
> >  	return 0;
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  /* Remove all symbols in given area from kprobe blacklist */
> >  static void kprobe_remove_area_blacklist(unsigned long start, unsigned long end)
> >  {
> > @@ -2499,6 +2515,7 @@ static void kprobe_remove_ksym_blacklist(unsigned long entry)
> >  {
> >  	kprobe_remove_area_blacklist(entry, entry + 1);
> >  }
> > +#endif
> >  
> >  int __weak arch_kprobe_get_kallsym(unsigned int *symnum, unsigned long *value,
> >  				   char *type, char *sym)
> > @@ -2564,6 +2581,7 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
> >  	return ret ? : arch_populate_kprobe_blacklist();
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  static void add_module_kprobe_blacklist(struct module *mod)
> >  {
> >  	unsigned long start, end;
> > @@ -2665,6 +2683,7 @@ static struct notifier_block kprobe_module_nb = {
> >  	.notifier_call = kprobes_module_callback,
> >  	.priority = 0
> >  };
> > +#endif /* IS_ENABLED(CONFIG_MODULES) */
> >  
> >  void kprobe_free_init_mem(void)
> >  {
> > @@ -2724,8 +2743,11 @@ static int __init init_kprobes(void)
> >  	err = arch_init_kprobes();
> >  	if (!err)
> >  		err = register_die_notifier(&kprobe_exceptions_nb);
> > +
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  	if (!err)
> >  		err = register_module_notifier(&kprobe_module_nb);
> > +#endif
> >  
> >  	kprobes_initialized = (err == 0);
> >  	kprobe_sysctls_init();
> > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> > index c4c6e0e0068b..dd4598f775b9 100644
> > --- a/kernel/trace/trace_kprobe.c
> > +++ b/kernel/trace/trace_kprobe.c
> > @@ -102,6 +102,7 @@ static nokprobe_inline bool trace_kprobe_has_gone(struct trace_kprobe *tk)
> >  	return kprobe_gone(&tk->rp.kp);
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
> >  						 struct module *mod)
> >  {
> > @@ -129,6 +130,12 @@ static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> >  
> >  	return ret;
> >  }
> > +#else
> > +static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> > +{
> > +	return true;
> > +}
> > +#endif
> >  
> >  static bool trace_kprobe_is_busy(struct dyn_event *ev)
> >  {
> > @@ -670,6 +677,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
> >  	return ret;
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  /* Module notifier call back, checking event on the module */
> >  static int trace_kprobe_module_callback(struct notifier_block *nb,
> >  				       unsigned long val, void *data)
> > @@ -704,6 +712,7 @@ static struct notifier_block trace_kprobe_module_nb = {
> >  	.notifier_call = trace_kprobe_module_callback,
> >  	.priority = 1	/* Invoked after kprobe module callback */
> >  };
> > +#endif /* IS_ENABLED(CONFIG_MODULES) */
> >  
> >  static int count_symbols(void *data, unsigned long unused)
> >  {
> > @@ -1897,8 +1906,10 @@ static __init int init_kprobe_trace_early(void)
> >  	if (ret)
> >  		return ret;
> >  
> > +#if IS_ENABLED(CONFIG_MODULES)
> >  	if (register_module_notifier(&trace_kprobe_module_nb))
> >  		return -EINVAL;
> > +#endif
> >  
> >  	return 0;
> >  }
> > -- 
> > 2.43.0
> > 
> 
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/4] kprobes: Allow kprobes with CONFIG_MODULES=n
  2024-03-07 22:16   ` Christophe Leroy
@ 2024-03-08 21:02     ` Calvin Owens
  0 siblings, 0 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-08 21:02 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner, bpf, linux-modules,
	linux-kernel

On Thursday 03/07 at 22:16 +0000, Christophe Leroy wrote:
> 
> 
> Le 06/03/2024 à 21:05, Calvin Owens a écrit :
> > [Vous ne recevez pas souvent de courriers de jcalvinowens@gmail.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> > 
> > If something like this is merged down the road, it can go in at leisure
> > once the module_alloc change is in: it's a one-way dependency.
> 
> Too many #ifdef, please reorganise stuff to avoid that and avoid 
> changing prototypes based of CONFIG_MODULES.
> 
> Other few comments below.

TBH the ugliness here was just me trying not to trigger -Wunused, but
that was silly: as you point out below, it's unncessary. I'll clean it
up.

> > 
> > Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> > ---
> >   arch/Kconfig                |  2 +-
> >   kernel/kprobes.c            | 22 ++++++++++++++++++++++
> >   kernel/trace/trace_kprobe.c | 11 +++++++++++
> >   3 files changed, 34 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index cfc24ced16dd..e60ce984d095 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -52,8 +52,8 @@ config GENERIC_ENTRY
> > 
> >   config KPROBES
> >          bool "Kprobes"
> > -       depends on MODULES
> >          depends on HAVE_KPROBES
> > +       select MODULE_ALLOC
> >          select KALLSYMS
> >          select TASKS_RCU if PREEMPTION
> >          help
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 9d9095e81792..194270e17d57 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -1556,8 +1556,12 @@ static bool is_cfi_preamble_symbol(unsigned long addr)
> >                  str_has_prefix("__pfx_", symbuf);
> >   }
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >   static int check_kprobe_address_safe(struct kprobe *p,
> >                                       struct module **probed_mod)
> > +#else
> > +static int check_kprobe_address_safe(struct kprobe *p)
> > +#endif
> 
> A bit ugly to have to change the prototype, why not just keep probed_mod 
> at all time ?
> 
> When CONFIG_MODULES is not selected, __module_text_address() returns 
> NULL so it should work without that many #ifdefs.
> 
> >   {
> >          int ret;
> > 
> > @@ -1580,6 +1584,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >                  goto out;
> >          }
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >          /* Check if 'p' is probing a module. */
> >          *probed_mod = __module_text_address((unsigned long) p->addr);
> >          if (*probed_mod) {
> > @@ -1603,6 +1608,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >                          ret = -ENOENT;
> >                  }
> >          }
> > +#endif
> > +
> >   out:
> >          preempt_enable();
> >          jump_label_unlock();
> > @@ -1614,7 +1621,9 @@ int register_kprobe(struct kprobe *p)
> >   {
> >          int ret;
> >          struct kprobe *old_p;
> > +#if IS_ENABLED(CONFIG_MODULES)
> >          struct module *probed_mod;
> > +#endif
> >          kprobe_opcode_t *addr;
> >          bool on_func_entry;
> > 
> > @@ -1633,7 +1642,11 @@ int register_kprobe(struct kprobe *p)
> >          p->nmissed = 0;
> >          INIT_LIST_HEAD(&p->list);
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >          ret = check_kprobe_address_safe(p, &probed_mod);
> > +#else
> > +       ret = check_kprobe_address_safe(p);
> > +#endif
> >          if (ret)
> >                  return ret;
> > 
> > @@ -1676,8 +1689,10 @@ int register_kprobe(struct kprobe *p)
> >   out:
> >          mutex_unlock(&kprobe_mutex);
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >          if (probed_mod)
> >                  module_put(probed_mod);
> > +#endif
> > 
> >          return ret;
> >   }
> > @@ -2482,6 +2497,7 @@ int kprobe_add_area_blacklist(unsigned long start, unsigned long end)
> >          return 0;
> >   }
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >   /* Remove all symbols in given area from kprobe blacklist */
> >   static void kprobe_remove_area_blacklist(unsigned long start, unsigned long end)
> >   {
> > @@ -2499,6 +2515,7 @@ static void kprobe_remove_ksym_blacklist(unsigned long entry)
> >   {
> >          kprobe_remove_area_blacklist(entry, entry + 1);
> >   }
> > +#endif
> > 
> >   int __weak arch_kprobe_get_kallsym(unsigned int *symnum, unsigned long *value,
> >                                     char *type, char *sym)
> > @@ -2564,6 +2581,7 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
> >          return ret ? : arch_populate_kprobe_blacklist();
> >   }
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >   static void add_module_kprobe_blacklist(struct module *mod)
> >   {
> >          unsigned long start, end;
> > @@ -2665,6 +2683,7 @@ static struct notifier_block kprobe_module_nb = {
> >          .notifier_call = kprobes_module_callback,
> >          .priority = 0
> >   };
> > +#endif /* IS_ENABLED(CONFIG_MODULES) */
> > 
> >   void kprobe_free_init_mem(void)
> >   {
> > @@ -2724,8 +2743,11 @@ static int __init init_kprobes(void)
> >          err = arch_init_kprobes();
> >          if (!err)
> >                  err = register_die_notifier(&kprobe_exceptions_nb);
> > +
> > +#if IS_ENABLED(CONFIG_MODULES)
> >          if (!err)
> >                  err = register_module_notifier(&kprobe_module_nb);
> > +#endif
> > 
> >          kprobes_initialized = (err == 0);
> >          kprobe_sysctls_init();
> > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> > index c4c6e0e0068b..dd4598f775b9 100644
> > --- a/kernel/trace/trace_kprobe.c
> > +++ b/kernel/trace/trace_kprobe.c
> > @@ -102,6 +102,7 @@ static nokprobe_inline bool trace_kprobe_has_gone(struct trace_kprobe *tk)
> >          return kprobe_gone(&tk->rp.kp);
> >   }
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >   static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
> >                                                   struct module *mod)
> >   {
> > @@ -129,6 +130,12 @@ static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> > 
> >          return ret;
> >   }
> > +#else
> > +static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> > +{
> > +       return true;
> > +}
> > +#endif
> > 
> >   static bool trace_kprobe_is_busy(struct dyn_event *ev)
> >   {
> > @@ -670,6 +677,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
> >          return ret;
> >   }
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >   /* Module notifier call back, checking event on the module */
> >   static int trace_kprobe_module_callback(struct notifier_block *nb,
> >                                         unsigned long val, void *data)
> > @@ -704,6 +712,7 @@ static struct notifier_block trace_kprobe_module_nb = {
> >          .notifier_call = trace_kprobe_module_callback,
> >          .priority = 1   /* Invoked after kprobe module callback */
> >   };
> > +#endif /* IS_ENABLED(CONFIG_MODULES) */
> > 
> >   static int count_symbols(void *data, unsigned long unused)
> >   {
> > @@ -1897,8 +1906,10 @@ static __init int init_kprobe_trace_early(void)
> >          if (ret)
> >                  return ret;
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> >          if (register_module_notifier(&trace_kprobe_module_nb))
> >                  return -EINVAL;
> Why a #if here ?
> 
> If CONFIG_MODULES is not selected, register_module_notifier() return 
> always 0.
> 
> > +#endif
> > 
> >          return 0;
> >   }
> > --
> > 2.43.0
> > 
> > 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 2/4] bpf: Allow BPF_JIT with CONFIG_MODULES=n
  2024-03-07 22:09   ` Christophe Leroy
@ 2024-03-08 21:04     ` Calvin Owens
  0 siblings, 0 replies; 27+ messages in thread
From: Calvin Owens @ 2024-03-08 21:04 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Luis Chamberlain, Andrew Morton, Alexei Starovoitov,
	Steven Rostedt, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Naveen N Rao, Anil S Keshavamurthy,
	David S Miller, Thomas Gleixner, bpf, linux-modules,
	linux-kernel

On Thursday 03/07 at 22:09 +0000, Christophe Leroy wrote:
> 
> 
> Le 06/03/2024 à 21:05, Calvin Owens a écrit :
> > [Vous ne recevez pas souvent de courriers de jcalvinowens@gmail.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> > 
> > No BPF code has to change, except in struct_ops (for module refs).
> > 
> > This conflicts with bpf-next because of this (relevant) series:
> > 
> >      https://lore.kernel.org/all/20240119225005.668602-1-thinker.li@gmail.com/
> > 
> > If something like this is merged down the road, it can go through
> > bpf-next at leisure once the module_alloc change is in: it's a one-way
> > dependency.
> > 
> > Signed-off-by: Calvin Owens <jcalvinowens@gmail.com>
> > ---
> >   kernel/bpf/Kconfig          |  2 +-
> >   kernel/bpf/bpf_struct_ops.c | 28 ++++++++++++++++++++++++----
> >   2 files changed, 25 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
> > index 6a906ff93006..77df483a8925 100644
> > --- a/kernel/bpf/Kconfig
> > +++ b/kernel/bpf/Kconfig
> > @@ -42,7 +42,7 @@ config BPF_JIT
> >          bool "Enable BPF Just In Time compiler"
> >          depends on BPF
> >          depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
> > -       depends on MODULES
> > +       select MODULE_ALLOC
> >          help
> >            BPF programs are normally handled by a BPF interpreter. This option
> >            allows the kernel to generate native code when a program is loaded
> > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> > index 02068bd0e4d9..fbf08a1bb00c 100644
> > --- a/kernel/bpf/bpf_struct_ops.c
> > +++ b/kernel/bpf/bpf_struct_ops.c
> > @@ -108,11 +108,30 @@ const struct bpf_prog_ops bpf_struct_ops_prog_ops = {
> >   #endif
> >   };
> > 
> > +#if IS_ENABLED(CONFIG_MODULES)
> 
> Can you avoid ifdefs as much as possible ?

Similar to the other one, this was just a misguided attempt to avoid
triggering -Wunused, I'll clean it up.

This particular patch will look very different when rebased on bpf-next.

> >   static const struct btf_type *module_type;
> > 
> > +static int bpf_struct_module_type_init(struct btf *btf)
> > +{
> > +       s32 module_id;
> 
> Could be:
> 
> 	if (!IS_ENABLED(CONFIG_MODULES))
> 		return 0;
> 
> > +
> > +       module_id = btf_find_by_name_kind(btf, "module", BTF_KIND_STRUCT);
> > +       if (module_id < 0)
> > +               return 1;
> > +
> > +       module_type = btf_type_by_id(btf, module_id);
> > +       return 0;
> > +}
> > +#else
> > +static int bpf_struct_module_type_init(struct btf *btf)
> > +{
> > +       return 0;
> > +}
> > +#endif
> > +
> >   void bpf_struct_ops_init(struct btf *btf, struct bpf_verifier_log *log)
> >   {
> > -       s32 type_id, value_id, module_id;
> > +       s32 type_id, value_id;
> >          const struct btf_member *member;
> >          struct bpf_struct_ops *st_ops;
> >          const struct btf_type *t;
> > @@ -125,12 +144,10 @@ void bpf_struct_ops_init(struct btf *btf, struct bpf_verifier_log *log)
> >   #include "bpf_struct_ops_types.h"
> >   #undef BPF_STRUCT_OPS_TYPE
> > 
> > -       module_id = btf_find_by_name_kind(btf, "module", BTF_KIND_STRUCT);
> > -       if (module_id < 0) {
> > +       if (bpf_struct_module_type_init(btf)) {
> >                  pr_warn("Cannot find struct module in btf_vmlinux\n");
> >                  return;
> >          }
> > -       module_type = btf_type_by_id(btf, module_id);
> > 
> >          for (i = 0; i < ARRAY_SIZE(bpf_struct_ops); i++) {
> >                  st_ops = bpf_struct_ops[i];
> > @@ -433,12 +450,15 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > 
> >                  moff = __btf_member_bit_offset(t, member) / 8;
> >                  ptype = btf_type_resolve_ptr(btf_vmlinux, member->type, NULL);
> > +
> > +#if IS_ENABLED(CONFIG_MODULES)
> 
> Can't see anything depending on CONFIG_MODULES here, can you instead do:
> 
> 		if (IS_ENABLED(CONFIG_MODULES) && ptype == module_type) {
> 
> >                  if (ptype == module_type) {
> >                          if (*(void **)(udata + moff))
> >                                  goto reset_unlock;
> >                          *(void **)(kdata + moff) = BPF_MODULE_OWNER;
> >                          continue;
> >                  }
> > +#endif
> > 
> >                  err = st_ops->init_member(t, member, kdata, udata);
> >                  if (err < 0)
> > --
> > 2.43.0
> > 
> > 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n
  2024-03-06 20:05 [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Calvin Owens
                   ` (4 preceding siblings ...)
  2024-03-06 21:34 ` [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Luis Chamberlain
@ 2024-03-25 22:46 ` Jarkko Sakkinen
  5 siblings, 0 replies; 27+ messages in thread
From: Jarkko Sakkinen @ 2024-03-25 22:46 UTC (permalink / raw)
  To: Calvin Owens, Luis Chamberlain, Andrew Morton,
	Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Naveen N Rao,
	Anil S Keshavamurthy, David S Miller, Thomas Gleixner
  Cc: bpf, linux-modules, linux-kernel

On Wed Mar 6, 2024 at 10:05 PM EET, Calvin Owens wrote:
> Hello all,
>
> This patchset makes it possible to use bpftrace with kprobes on kernels
> built without loadable module support.
>
> On a Raspberry Pi 4b, this saves about 700KB of memory where BPF is
> needed but loadable module support is not. These two kernels had
> identical configurations, except CONFIG_MODULE was off in the second:
>
>    - Linux version 6.8.0-rc7
>    - Memory: 3330672K/4050944K available (16576K kernel code, 2390K rwdata,
>    - 12364K rodata, 5632K init, 675K bss, 195984K reserved, 524288K cma-reserved)
>    + Linux version 6.8.0-rc7-00003-g2af01251ca21
>    + Memory: 3331400K/4050944K available (16512K kernel code, 2384K rwdata,
>    + 11728K rodata, 5632K init, 673K bss, 195256K reserved, 524288K cma-reserved)
>
> I don't intend to present an exhaustive list of !MODULES usecases, since
> I'm sure there are many I'm not aware of. Performance is a common one,
> the primary justification being that static text is mapped on hugepages
> and module text is not. Security is another, since rootkits are much
> harder to implement without modules.
>
> The first patch is the interesting one: it moves module_alloc() into its
> own file with its own Kconfig option, so it can be utilized even when
> loadable module support is disabled. I got the idea from an unmerged
> patch from a few years ago I found on lkml (see [1/4] for details). I
> think this also has value in its own right, since I suspect there are
> potential users beyond bpf, hopefully we will hear from some.
>
> Patches 2-3 are proofs of concept to demonstrate the first patch is
> sufficient to achieve my goal (full ebpf functionality without modules).
>
> Patch 4 adds a new "-n" argument to vmtest.sh to run the BPF selftests
> without modules, so the prior three patches can be rigorously tested.
>
> If something like the first patch were to eventually be merged, the rest
> could go through the normal bpf-next process as I clean them up: I've
> only based them on Linus' tree and combined them into a series here to
> introduce the idea.
>
> If you prefer to fetch the patches via git:
>
>   [1/4] https://github.com/jcalvinowens/linux.git work/module-alloc
>  +[2/4]+[3/4] https://github.com/jcalvinowens/linux.git work/nomodule-bpf
>  +[4/4] https://github.com/jcalvinowens/linux.git testing/nomodule-bpf-ci
>
> In addition to the automated BPF selftests, I've lightly tested this on
> my laptop (x86_64), a Raspberry Pi 4b (arm64), and a Raspberry Pi Zero W
> (arm). The other architectures have only been compile tested.
>
> I didn't want to spam all the arch maintainers with what I expect will
> be a discussion mostly about modules and bpf, so I've left them off this
> first submission. I will be sure to add them on future submissions of
> the first patch. Of course, feedback on the arch bits is welcome here.
>
> In addition to feedback on the patches themselves, I'm interested in
> hearing from anybody else who might find this functionality useful.
>
> Thanks,
> Calvin
>
>
> Calvin Owens (4):
>   module: mm: Make module_alloc() generally available
>   bpf: Allow BPF_JIT with CONFIG_MODULES=n
>   kprobes: Allow kprobes with CONFIG_MODULES=n
>   selftests/bpf: Support testing the !MODULES case
>
>  arch/Kconfig                                  |   4 +-
>  arch/arm/kernel/module.c                      |  35 -----
>  arch/arm/mm/Makefile                          |   2 +
>  arch/arm/mm/module_alloc.c                    |  40 ++++++
>  arch/arm64/kernel/module.c                    | 127 -----------------
>  arch/arm64/mm/Makefile                        |   1 +
>  arch/arm64/mm/module_alloc.c                  | 130 ++++++++++++++++++
>  arch/loongarch/kernel/module.c                |   6 -
>  arch/loongarch/mm/Makefile                    |   2 +
>  arch/loongarch/mm/module_alloc.c              |  10 ++
>  arch/mips/kernel/module.c                     |  10 --
>  arch/mips/mm/Makefile                         |   2 +
>  arch/mips/mm/module_alloc.c                   |  13 ++
>  arch/nios2/kernel/module.c                    |  20 ---
>  arch/nios2/mm/Makefile                        |   2 +
>  arch/nios2/mm/module_alloc.c                  |  22 +++
>  arch/parisc/kernel/module.c                   |  12 --
>  arch/parisc/mm/Makefile                       |   1 +
>  arch/parisc/mm/module_alloc.c                 |  15 ++
>  arch/powerpc/kernel/module.c                  |  36 -----
>  arch/powerpc/mm/Makefile                      |   1 +
>  arch/powerpc/mm/module_alloc.c                |  41 ++++++
>  arch/riscv/kernel/module.c                    |  11 --
>  arch/riscv/mm/Makefile                        |   1 +
>  arch/riscv/mm/module_alloc.c                  |  17 +++
>  arch/s390/kernel/module.c                     |  37 -----
>  arch/s390/mm/Makefile                         |   1 +
>  arch/s390/mm/module_alloc.c                   |  42 ++++++
>  arch/sparc/kernel/module.c                    |  31 -----
>  arch/sparc/mm/Makefile                        |   2 +
>  arch/sparc/mm/module_alloc.c                  |  31 +++++
>  arch/x86/kernel/ftrace.c                      |   2 +-
>  arch/x86/kernel/module.c                      |  56 --------
>  arch/x86/mm/Makefile                          |   2 +
>  arch/x86/mm/module_alloc.c                    |  59 ++++++++
>  fs/proc/kcore.c                               |   2 +-
>  include/trace/events/bpf_testmod.h            |   1 +
>  kernel/bpf/Kconfig                            |  11 +-
>  kernel/bpf/Makefile                           |   2 +
>  kernel/bpf/bpf_struct_ops.c                   |  28 +++-
>  kernel/bpf/bpf_testmod/Makefile               |   1 +
>  kernel/bpf/bpf_testmod/bpf_testmod.c          |   1 +
>  kernel/bpf/bpf_testmod/bpf_testmod.h          |   1 +
>  kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h    |   1 +
>  kernel/kprobes.c                              |  22 +++
>  kernel/module/Kconfig                         |   1 +
>  kernel/module/main.c                          |  17 ---
>  kernel/trace/trace_kprobe.c                   |  11 ++
>  mm/Kconfig                                    |   3 +
>  mm/Makefile                                   |   1 +
>  mm/module_alloc.c                             |  21 +++
>  mm/vmalloc.c                                  |   2 +-
>  net/bpf/test_run.c                            |   2 +
>  tools/testing/selftests/bpf/Makefile          |  28 ++--
>  .../selftests/bpf/bpf_testmod/Makefile        |   2 +-
>  .../bpf/bpf_testmod/bpf_testmod-events.h      |   6 +
>  .../selftests/bpf/bpf_testmod/bpf_testmod.c   |   4 +
>  .../bpf/bpf_testmod/bpf_testmod_kfunc.h       |   2 +
>  tools/testing/selftests/bpf/config            |   5 -
>  tools/testing/selftests/bpf/config.mods       |   5 +
>  tools/testing/selftests/bpf/config.nomods     |   1 +
>  .../selftests/bpf/progs/btf_type_tag_percpu.c |   2 +
>  .../selftests/bpf/progs/btf_type_tag_user.c   |   2 +
>  tools/testing/selftests/bpf/progs/core_kern.c |   2 +
>  .../selftests/bpf/progs/iters_testmod_seq.c   |   2 +
>  .../bpf/progs/test_core_reloc_module.c        |   2 +
>  .../selftests/bpf/progs/test_ldsx_insn.c      |   2 +
>  .../selftests/bpf/progs/test_module_attach.c  |   3 +
>  .../selftests/bpf/progs/tracing_struct.c      |   2 +
>  tools/testing/selftests/bpf/testing_helpers.c |  14 ++
>  tools/testing/selftests/bpf/vmtest.sh         |  24 +++-
>  71 files changed, 636 insertions(+), 424 deletions(-)
>  create mode 100644 arch/arm/mm/module_alloc.c
>  create mode 100644 arch/arm64/mm/module_alloc.c
>  create mode 100644 arch/loongarch/mm/module_alloc.c
>  create mode 100644 arch/mips/mm/module_alloc.c
>  create mode 100644 arch/nios2/mm/module_alloc.c
>  create mode 100644 arch/parisc/mm/module_alloc.c
>  create mode 100644 arch/powerpc/mm/module_alloc.c
>  create mode 100644 arch/riscv/mm/module_alloc.c
>  create mode 100644 arch/s390/mm/module_alloc.c
>  create mode 100644 arch/sparc/mm/module_alloc.c
>  create mode 100644 arch/x86/mm/module_alloc.c
>  create mode 120000 include/trace/events/bpf_testmod.h
>  create mode 100644 kernel/bpf/bpf_testmod/Makefile
>  create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.c
>  create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.h
>  create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h
>  create mode 100644 mm/module_alloc.c
>  create mode 100644 tools/testing/selftests/bpf/config.mods
>  create mode 100644 tools/testing/selftests/bpf/config.nomods

I think with eBPF focus in the patch set should be only on arch's
that you use regulary, i.e. repeating same mistake I did couple of
years ago:

https://lore.kernel.org/all/20220608000014.3054333-1-jarkko@profian.com/

I don't see my patch set conflict with this work, and it adds needed
shenanigans to realize eBPF patches. I refined the shenanigans to match
Masami's suggestions:

https://lore.kernel.org/all/20240325215502.660-1-jarkko@kernel.org/

And it this requires properly working kprobes anyway.

BR, Jarkko

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-03-25 22:46 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-06 20:05 [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Calvin Owens
2024-03-06 20:05 ` [RFC][PATCH 1/4] module: mm: Make module_alloc() generally available Calvin Owens
2024-03-07 14:43   ` Christophe Leroy
2024-03-08 20:53     ` Calvin Owens
2024-03-08  2:16   ` Masami Hiramatsu
2024-03-08 20:43     ` Calvin Owens
2024-03-06 20:05 ` [RFC][PATCH 2/4] bpf: Allow BPF_JIT with CONFIG_MODULES=n Calvin Owens
2024-03-07 22:09   ` Christophe Leroy
2024-03-08 21:04     ` Calvin Owens
2024-03-06 20:05 ` [RFC][PATCH 3/4] kprobes: Allow kprobes " Calvin Owens
2024-03-07  7:22   ` Mike Rapoport
2024-03-08  2:46     ` Masami Hiramatsu
2024-03-08 20:36     ` Calvin Owens
2024-03-07 22:16   ` Christophe Leroy
2024-03-08 21:02     ` Calvin Owens
2024-03-08  2:46   ` Masami Hiramatsu
2024-03-08 20:57     ` Calvin Owens
2024-03-06 20:05 ` [RFC][PATCH 4/4] selftests/bpf: Support testing the !MODULES case Calvin Owens
2024-03-06 21:34 ` [RFC][PATCH 0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n Luis Chamberlain
2024-03-06 23:23   ` Calvin Owens
2024-03-07  1:58     ` Song Liu
2024-03-08  2:50       ` Masami Hiramatsu
2024-03-08  2:55         ` Luis Chamberlain
2024-03-08 20:27           ` Calvin Owens
2024-03-07  7:13     ` Mike Rapoport
2024-03-08  2:45   ` Masami Hiramatsu
2024-03-25 22:46 ` Jarkko Sakkinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).