linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v3 00/13]  mm: jit/text allocator
@ 2023-09-18  7:29 Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 01/13] nios2: define virtual address space for modules Mike Rapoport
                   ` (12 more replies)
  0 siblings, 13 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

Hi,

module_alloc() is used everywhere as a means to allocate memory for code.

Besides being semantically wrong, this unnecessarily ties all subsystems
that need to allocate code, such as ftrace, kprobes and BPF, to modules and
puts the burden of code allocation on the modules code.

Several architectures override module_alloc() because of various
constraints on where executable memory can be located, and this creates
additional obstacles for improvements of code allocation.

A centralized infrastructure for code allocation allows allocations of
executable memory as ROX, and future optimizations such as caching large
pages for better iTLB performance and providing sub-page allocations for
users that only need small jit code snippets.

Rick Edgecombe proposed a perm_alloc extension to vmalloc [1] and Song Liu
proposed execmem_alloc [2], but both these approaches were targeting BPF
allocations and lacked the groundwork to abstract executable allocations
and split them from the modules core.

Thomas Gleixner suggested expressing module allocation restrictions and
requirements as struct mod_alloc_type_params [3] that would define ranges,
protections and other parameters for the different types of allocations
used by modules. Following that suggestion, Song separated allocations of
different types in modules (commit ac3b43283923 ("module: replace
module_layout with module_memory")) and posted the "Type aware module
allocator" set [4].

I liked the idea of parametrising code allocation requirements as a
structure, but I believe the original proposal and Song's module allocator
were too module-centric, so I came up with these patches.

This set splits code allocation from modules by introducing the
execmem_text_alloc(), execmem_data_alloc() and execmem_free() APIs,
replaces call sites of module_alloc() and module_memfree() with the new
APIs and implements core text and related allocations in a central place.

Instead of architecture specific overrides for module_alloc(), the
architectures that require non-default behaviour for text allocation must
fill an execmem_params structure and implement execmem_arch_params()
that returns a pointer to that structure. If an architecture does not
implement execmem_arch_params(), defaults compatible with the current
module_alloc() are used.

Since architectures define different restrictions on placement,
permissions, alignment and other parameters for memory that can be used by
different subsystems that allocate executable memory, the execmem APIs
take a type argument that identifies the calling subsystem and allows
architectures to define parameters for ranges suitable for that
subsystem.
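
As a rough userspace illustration of the type argument (not the code from
this series), the sketch below indexes a per-type table and falls back to
the default entry when a type has no range of its own. The enum mirrors the
one added in patch 02; struct demo_range, struct demo_params,
demo_arch_params, demo_range_for() and all addresses are hypothetical names
and values chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors enum execmem_type as introduced in patch 02 of this series. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_MODULE_TEXT = EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_FTRACE,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

/* Hypothetical, simplified per-type range description. */
struct demo_range {
	uintptr_t start;
	uintptr_t end;
	unsigned int alignment;
};

struct demo_params {
	struct demo_range ranges[EXECMEM_TYPE_MAX];
};

/* Hypothetical architecture: only the default and BPF ranges are set. */
static struct demo_params demo_arch_params = {
	.ranges = {
		[EXECMEM_DEFAULT] = {
			.start = 0xa0000000u, .end = 0xa1ffffffu, .alignment = 1,
		},
		[EXECMEM_BPF] = {
			.start = 0xb0000000u, .end = 0xb0ffffffu, .alignment = 1,
		},
	},
};

/*
 * Pick the range for @type. An all-zero entry is treated as "not
 * explicitly defined" and resolves to the default range (assumption).
 */
static const struct demo_range *
demo_range_for(const struct demo_params *p, enum execmem_type type)
{
	assert(type < EXECMEM_TYPE_MAX);

	const struct demo_range *r = &p->ranges[type];

	if (!r->start && !r->end)
		return &p->ranges[EXECMEM_DEFAULT];
	return r;
}
```

With this table, a kprobes request resolves to the default range while a
BPF request gets its own window; callers only ever name their subsystem.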

The new infrastructure allows decoupling of BPF, kprobes and ftrace from
modules, and most importantly it paves the way for ROX allocations for
executable memory.

[1] https://lore.kernel.org/lkml/20201120202426.18009-1-rick.p.edgecombe@intel.com/
[2] https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
[3] https://lore.kernel.org/all/87v8mndy3y.ffs@tglx/
[4] https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org

v3 changes:
* add type parameter to execmem allocation APIs
* remove BPF dependency on modules

v2: https://lore.kernel.org/all/20230616085038.4121892-1-rppt@kernel.org
* Separate "module" and "others" allocations with execmem_text_alloc()
and jit_text_alloc()
* Drop ROX enablement on x86
* Add ack for nios2 changes, thanks Dinh Nguyen

v1: https://lore.kernel.org/all/20230601101257.530867-1-rppt@kernel.org

Mike Rapoport (IBM) (13):
  nios2: define virtual address space for modules
  mm: introduce execmem_text_alloc() and execmem_free()
  mm/execmem, arch: convert simple overrides of module_alloc to execmem
  mm/execmem, arch: convert remaining overrides of module_alloc to
    execmem
  modules, execmem: drop module_alloc
  mm/execmem: introduce execmem_data_alloc()
  arm64, execmem: extend execmem_params for generated code allocations
  riscv: extend execmem_params for generated code allocations
  powerpc: extend execmem_params for kprobes allocations
  arch: make execmem setup available regardless of CONFIG_MODULES
  x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
  kprobes: remove dependency on CONFIG_MODULES
  bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of

 arch/Kconfig                       |   2 +-
 arch/arm/kernel/module.c           |  32 -------
 arch/arm/mm/init.c                 |  38 ++++++++
 arch/arm64/kernel/module.c         | 124 -------------------------
 arch/arm64/kernel/probes/kprobes.c |   7 --
 arch/arm64/mm/init.c               | 132 +++++++++++++++++++++++++++
 arch/arm64/net/bpf_jit_comp.c      |  11 ---
 arch/loongarch/kernel/module.c     |   6 --
 arch/loongarch/mm/init.c           |  20 ++++
 arch/mips/kernel/module.c          |  10 +-
 arch/mips/mm/init.c                |  20 ++++
 arch/nios2/include/asm/pgtable.h   |   5 +-
 arch/nios2/kernel/module.c         |  28 +++---
 arch/parisc/kernel/module.c        |  12 +--
 arch/parisc/mm/init.c              |  22 ++++-
 arch/powerpc/kernel/kprobes.c      |  16 +---
 arch/powerpc/kernel/module.c       |  37 --------
 arch/powerpc/mm/mem.c              |  62 +++++++++++++
 arch/riscv/kernel/module.c         |  10 --
 arch/riscv/kernel/probes/kprobes.c |  10 --
 arch/riscv/mm/init.c               |  39 ++++++++
 arch/riscv/net/bpf_jit_core.c      |  13 ---
 arch/s390/kernel/ftrace.c          |   4 +-
 arch/s390/kernel/kprobes.c         |   4 +-
 arch/s390/kernel/module.c          |  42 +--------
 arch/s390/mm/init.c                |  28 ++++++
 arch/sparc/kernel/module.c         |  33 +------
 arch/sparc/mm/Makefile             |   2 +
 arch/sparc/mm/execmem.c            |  25 +++++
 arch/sparc/net/bpf_jit_comp_32.c   |   8 +-
 arch/x86/Kconfig                   |   1 +
 arch/x86/kernel/ftrace.c           |  16 +---
 arch/x86/kernel/kprobes/core.c     |   4 +-
 arch/x86/kernel/module.c           |  51 -----------
 arch/x86/mm/init.c                 |  29 ++++++
 include/linux/execmem.h            | 141 ++++++++++++++++++++++++++++
 include/linux/moduleloader.h       |  15 ---
 kernel/bpf/Kconfig                 |   2 +-
 kernel/bpf/core.c                  |   6 +-
 kernel/kprobes.c                   |  51 ++++++-----
 kernel/module/Kconfig              |   1 +
 kernel/module/main.c               |  45 ++-------
 kernel/trace/trace_kprobe.c        |  11 +++
 mm/Kconfig                         |   3 +
 mm/Makefile                        |   1 +
 mm/execmem.c                       | 142 +++++++++++++++++++++++++++++
 mm/mm_init.c                       |   2 +
 47 files changed, 801 insertions(+), 522 deletions(-)
 create mode 100644 arch/sparc/mm/execmem.c
 create mode 100644 include/linux/execmem.h
 create mode 100644 mm/execmem.c


base-commit: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
-- 
2.39.2



* [PATCH v3 01/13] nios2: define virtual address space for modules
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free() Mike Rapoport
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

nios2 uses kmalloc() to implement module_alloc() because CALL26/PCREL26
cannot reach all of vmalloc address space.

Define module space as 32MiB below the kernel base and switch nios2 to
use vmalloc for module allocations.
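
To make the resulting layout concrete, here is a small worked example of
the address arithmetic. It assumes a kernel region base of 0xc0000000 (the
value mentioned in the comment this patch removes); the DEMO_* names and
demo_in_module_space() are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_32M				0x02000000u

/* Assumption: stands in for CONFIG_NIOS2_KERNEL_REGION_BASE. */
#define DEMO_KERNEL_REGION_BASE		0xc0000000u

/* Same expressions as the pgtable.h hunk in this patch. */
#define DEMO_VMALLOC_END	(DEMO_KERNEL_REGION_BASE - SZ_32M - 1)
#define DEMO_MODULES_VADDR	(DEMO_KERNEL_REGION_BASE - SZ_32M)
#define DEMO_MODULES_END	(DEMO_KERNEL_REGION_BASE - 1)

/* Non-zero if @addr falls inside the 32MiB module window. */
static inline int demo_in_module_space(uint32_t addr)
{
	return addr >= DEMO_MODULES_VADDR && addr <= DEMO_MODULES_END;
}
```

Under that assumption the module window is 0xbe000000..0xbfffffff and the
vmalloc area now ends at 0xbdffffff, directly below it.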

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/nios2/include/asm/pgtable.h |  5 ++++-
 arch/nios2/kernel/module.c       | 19 ++++---------------
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index 5144506dfa69..d2fb42fb6db8 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -25,7 +25,10 @@
 #include <asm-generic/pgtable-nopmd.h>
 
 #define VMALLOC_START		CONFIG_NIOS2_KERNEL_MMU_REGION_BASE
-#define VMALLOC_END		(CONFIG_NIOS2_KERNEL_REGION_BASE - 1)
+#define VMALLOC_END		(CONFIG_NIOS2_KERNEL_REGION_BASE - SZ_32M - 1)
+
+#define MODULES_VADDR		(CONFIG_NIOS2_KERNEL_REGION_BASE - SZ_32M)
+#define MODULES_END		(CONFIG_NIOS2_KERNEL_REGION_BASE - 1)
 
 struct mm_struct;
 
diff --git a/arch/nios2/kernel/module.c b/arch/nios2/kernel/module.c
index 76e0a42d6e36..9c97b7513853 100644
--- a/arch/nios2/kernel/module.c
+++ b/arch/nios2/kernel/module.c
@@ -21,23 +21,12 @@
 
 #include <asm/cacheflush.h>
 
-/*
- * Modules should NOT be allocated with kmalloc for (obvious) reasons.
- * But we do it for now to avoid relocation issues. CALL26/PCREL26 cannot reach
- * from 0x80000000 (vmalloc area) to 0xc00000000 (kernel) (kmalloc returns
- * addresses in 0xc0000000)
- */
 void *module_alloc(unsigned long size)
 {
-	if (size == 0)
-		return NULL;
-	return kmalloc(size, GFP_KERNEL);
-}
-
-/* Free memory returned from module_alloc */
-void module_memfree(void *module_region)
-{
-	kfree(module_region);
+	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
+				    GFP_KERNEL, PAGE_KERNEL_EXEC,
+				    VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
+				    __builtin_return_address(0));
 }
 
 int apply_relocate_add(Elf32_Shdr *sechdrs, const char *strtab,
-- 
2.39.2



* [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 01/13] nios2: define virtual address space for modules Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-21 22:10   ` Song Liu
                     ` (2 more replies)
  2023-09-18  7:29 ` [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem Mike Rapoport
                   ` (10 subsequent siblings)
  12 siblings, 3 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

module_alloc() is used everywhere as a means to allocate memory for code.

Besides being semantically wrong, this unnecessarily ties all subsystems
that need to allocate code, such as ftrace, kprobes and BPF, to modules
and puts the burden of code allocation on the modules code.

Several architectures override module_alloc() because of various
constraints on where executable memory can be located, and this creates
additional obstacles for improvements of code allocation.

Start splitting code allocation from modules by introducing
execmem_text_alloc() and execmem_free() APIs.

Initially, execmem_text_alloc() is a wrapper for module_alloc() and
execmem_free() is a replacement of module_memfree() to allow updating all
call sites to use the new APIs.

Since architectures define different restrictions on placement,
permissions, alignment and other parameters for memory that can be used by
different subsystems that allocate executable memory, execmem_text_alloc()
takes a type argument that identifies the calling subsystem and allows
architectures to define parameters for ranges suitable for that
subsystem.

The name execmem_text_alloc() emphasizes that the allocated memory is for
executable code. Allocations of the associated data, such as the data
sections of a module, will use the execmem_data_alloc() interface that
will be added later.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/powerpc/kernel/kprobes.c    |  4 +--
 arch/s390/kernel/ftrace.c        |  4 +--
 arch/s390/kernel/kprobes.c       |  4 +--
 arch/s390/kernel/module.c        |  5 +--
 arch/sparc/net/bpf_jit_comp_32.c |  8 ++---
 arch/x86/kernel/ftrace.c         |  6 ++--
 arch/x86/kernel/kprobes/core.c   |  4 +--
 include/linux/execmem.h          | 56 ++++++++++++++++++++++++++++++++
 include/linux/moduleloader.h     |  3 --
 kernel/bpf/core.c                |  6 ++--
 kernel/kprobes.c                 |  8 ++---
 kernel/module/Kconfig            |  1 +
 kernel/module/main.c             | 25 +++++---------
 mm/Kconfig                       |  3 ++
 mm/Makefile                      |  1 +
 mm/execmem.c                     | 26 +++++++++++++++
 16 files changed, 120 insertions(+), 44 deletions(-)
 create mode 100644 include/linux/execmem.h
 create mode 100644 mm/execmem.c

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index b20ee72e873a..62228c7072a2 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -19,8 +19,8 @@
 #include <linux/extable.h>
 #include <linux/kdebug.h>
 #include <linux/slab.h>
-#include <linux/moduleloader.h>
 #include <linux/set_memory.h>
+#include <linux/execmem.h>
 #include <asm/code-patching.h>
 #include <asm/cacheflush.h>
 #include <asm/sstep.h>
@@ -130,7 +130,7 @@ void *alloc_insn_page(void)
 {
 	void *page;
 
-	page = module_alloc(PAGE_SIZE);
+	page = execmem_text_alloc(EXECMEM_KPROBES, PAGE_SIZE);
 	if (!page)
 		return NULL;
 
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index c46381ea04ec..4052e10eb6a4 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -7,13 +7,13 @@
  *   Author(s): Martin Schwidefsky <schwidefsky@de.ibm.com>
  */
 
-#include <linux/moduleloader.h>
 #include <linux/hardirq.h>
 #include <linux/uaccess.h>
 #include <linux/ftrace.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/kprobes.h>
+#include <linux/execmem.h>
 #include <trace/syscall.h>
 #include <asm/asm-offsets.h>
 #include <asm/text-patching.h>
@@ -220,7 +220,7 @@ static int __init ftrace_plt_init(void)
 {
 	const char *start, *end;
 
-	ftrace_plt = module_alloc(PAGE_SIZE);
+	ftrace_plt = execmem_text_alloc(EXECMEM_FTRACE, PAGE_SIZE);
 	if (!ftrace_plt)
 		panic("cannot allocate ftrace plt\n");
 
diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c
index d4b863ed0aa7..48928460dcb9 100644
--- a/arch/s390/kernel/kprobes.c
+++ b/arch/s390/kernel/kprobes.c
@@ -9,7 +9,6 @@
 
 #define pr_fmt(fmt) "kprobes: " fmt
 
-#include <linux/moduleloader.h>
 #include <linux/kprobes.h>
 #include <linux/ptrace.h>
 #include <linux/preempt.h>
@@ -21,6 +20,7 @@
 #include <linux/slab.h>
 #include <linux/hardirq.h>
 #include <linux/ftrace.h>
+#include <linux/execmem.h>
 #include <asm/set_memory.h>
 #include <asm/sections.h>
 #include <asm/dis.h>
@@ -38,7 +38,7 @@ void *alloc_insn_page(void)
 {
 	void *page;
 
-	page = module_alloc(PAGE_SIZE);
+	page = execmem_text_alloc(EXECMEM_KPROBES, PAGE_SIZE);
 	if (!page)
 		return NULL;
 	set_memory_rox((unsigned long)page, 1);
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index 42215f9404af..db5561d0c233 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -21,6 +21,7 @@
 #include <linux/moduleloader.h>
 #include <linux/bug.h>
 #include <linux/memory.h>
+#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/nospec-branch.h>
 #include <asm/facility.h>
@@ -76,7 +77,7 @@ void *module_alloc(unsigned long size)
 #ifdef CONFIG_FUNCTION_TRACER
 void module_arch_cleanup(struct module *mod)
 {
-	module_memfree(mod->arch.trampolines_start);
+	execmem_free(mod->arch.trampolines_start);
 }
 #endif
 
@@ -510,7 +511,7 @@ static int module_alloc_ftrace_hotpatch_trampolines(struct module *me,
 
 	size = FTRACE_HOTPATCH_TRAMPOLINES_SIZE(s->sh_size);
 	numpages = DIV_ROUND_UP(size, PAGE_SIZE);
-	start = module_alloc(numpages * PAGE_SIZE);
+	start = execmem_text_alloc(EXECMEM_FTRACE, numpages * PAGE_SIZE);
 	if (!start)
 		return -ENOMEM;
 	set_memory_rox((unsigned long)start, numpages);
diff --git a/arch/sparc/net/bpf_jit_comp_32.c b/arch/sparc/net/bpf_jit_comp_32.c
index a74e5004c6c8..5fa9c45fba0a 100644
--- a/arch/sparc/net/bpf_jit_comp_32.c
+++ b/arch/sparc/net/bpf_jit_comp_32.c
@@ -1,10 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
-#include <linux/moduleloader.h>
 #include <linux/workqueue.h>
 #include <linux/netdevice.h>
 #include <linux/filter.h>
 #include <linux/cache.h>
 #include <linux/if_vlan.h>
+#include <linux/execmem.h>
 
 #include <asm/cacheflush.h>
 #include <asm/ptrace.h>
@@ -713,7 +713,7 @@ cond_branch:			f_offset = addrs[i + filter[i].jf];
 				if (unlikely(proglen + ilen > oldproglen)) {
 					pr_err("bpb_jit_compile fatal error\n");
 					kfree(addrs);
-					module_memfree(image);
+					execmem_free(image);
 					return;
 				}
 				memcpy(image + proglen, temp, ilen);
@@ -736,7 +736,7 @@ cond_branch:			f_offset = addrs[i + filter[i].jf];
 			break;
 		}
 		if (proglen == oldproglen) {
-			image = module_alloc(proglen);
+			image = execmem_text_alloc(EXECMEM_BPF, proglen);
 			if (!image)
 				goto out;
 		}
@@ -758,7 +758,7 @@ cond_branch:			f_offset = addrs[i + filter[i].jf];
 void bpf_jit_free(struct bpf_prog *fp)
 {
 	if (fp->jited)
-		module_memfree(fp->bpf_func);
+		execmem_free(fp->bpf_func);
 
 	bpf_prog_unlock_free(fp);
 }
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 12df54ff0e81..ae56d79a6a74 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -25,6 +25,7 @@
 #include <linux/memory.h>
 #include <linux/vmalloc.h>
 #include <linux/set_memory.h>
+#include <linux/execmem.h>
 
 #include <trace/syscall.h>
 
@@ -261,15 +262,14 @@ void arch_ftrace_update_code(int command)
 #ifdef CONFIG_X86_64
 
 #ifdef CONFIG_MODULES
-#include <linux/moduleloader.h>
 /* Module allocation simplifies allocating memory for code */
 static inline void *alloc_tramp(unsigned long size)
 {
-	return module_alloc(size);
+	return execmem_text_alloc(EXECMEM_FTRACE, size);
 }
 static inline void tramp_free(void *tramp)
 {
-	module_memfree(tramp);
+	execmem_free(tramp);
 }
 #else
 /* Trampolines can only be created if modules are supported */
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index e8babebad7b8..c4f58e893efd 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -40,12 +40,12 @@
 #include <linux/kgdb.h>
 #include <linux/ftrace.h>
 #include <linux/kasan.h>
-#include <linux/moduleloader.h>
 #include <linux/objtool.h>
 #include <linux/vmalloc.h>
 #include <linux/pgtable.h>
 #include <linux/set_memory.h>
 #include <linux/cfi.h>
+#include <linux/execmem.h>
 
 #include <asm/text-patching.h>
 #include <asm/cacheflush.h>
@@ -448,7 +448,7 @@ void *alloc_insn_page(void)
 {
 	void *page;
 
-	page = module_alloc(PAGE_SIZE);
+	page = execmem_text_alloc(EXECMEM_KPROBES, PAGE_SIZE);
 	if (!page)
 		return NULL;
 
diff --git a/include/linux/execmem.h b/include/linux/execmem.h
new file mode 100644
index 000000000000..3491bf7e9714
--- /dev/null
+++ b/include/linux/execmem.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_EXECMEM_ALLOC_H
+#define _LINUX_EXECMEM_ALLOC_H
+
+#include <linux/types.h>
+
+/**
+ * enum execmem_type - types of executable memory ranges
+ *
+ * There are several subsystems that allocate executable memory.
+ * Architectures define different restrictions on placement,
+ * permissions, alignment and other parameters for memory that can be used
+ * by these subsystems.
+ * Types in this enum identify subsystems that allocate executable memory
+ * and let architectures define parameters for ranges suitable for
+ * allocations by each subsystem.
+ *
+ * @EXECMEM_DEFAULT: default parameters that would be used for types that
+ * are not explicitly defined.
+ * @EXECMEM_MODULE_TEXT: parameters for module text sections
+ * @EXECMEM_KPROBES: parameters for kprobes
+ * @EXECMEM_FTRACE: parameters for ftrace
+ * @EXECMEM_BPF: parameters for BPF
+ * @EXECMEM_TYPE_MAX:
+ */
+enum execmem_type {
+	EXECMEM_DEFAULT,
+	EXECMEM_MODULE_TEXT = EXECMEM_DEFAULT,
+	EXECMEM_KPROBES,
+	EXECMEM_FTRACE,
+	EXECMEM_BPF,
+	EXECMEM_TYPE_MAX,
+};
+
+/**
+ * execmem_text_alloc - allocate executable memory
+ * @type: type of the allocation
+ * @size: how many bytes of memory are required
+ *
+ * Allocates memory that will contain executable code, either generated or
+ * loaded from kernel modules.
+ *
+ * The memory will have protections defined by architecture for executable
+ * region of the @type.
+ *
+ * Return: a pointer to the allocated memory or %NULL
+ */
+void *execmem_text_alloc(enum execmem_type type, size_t size);
+
+/**
+ * execmem_free - free executable memory
+ * @ptr: pointer to the memory that should be freed
+ */
+void execmem_free(void *ptr);
+
+#endif /* _LINUX_EXECMEM_ALLOC_H */
diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
index 001b2ce83832..a23718aa2f4d 100644
--- a/include/linux/moduleloader.h
+++ b/include/linux/moduleloader.h
@@ -29,9 +29,6 @@ unsigned int arch_mod_section_prepend(struct module *mod, unsigned int section);
    sections.  Returns NULL on failure. */
 void *module_alloc(unsigned long size);
 
-/* Free memory returned from module_alloc. */
-void module_memfree(void *module_region);
-
 /* Determines if the section name is an init section (that is only used during
  * module loading).
  */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 4e3ce0542e31..75249f2d9f77 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -22,7 +22,6 @@
 #include <linux/skbuff.h>
 #include <linux/vmalloc.h>
 #include <linux/random.h>
-#include <linux/moduleloader.h>
 #include <linux/bpf.h>
 #include <linux/btf.h>
 #include <linux/objtool.h>
@@ -37,6 +36,7 @@
 #include <linux/nospec.h>
 #include <linux/bpf_mem_alloc.h>
 #include <linux/memcontrol.h>
+#include <linux/execmem.h>
 
 #include <asm/barrier.h>
 #include <asm/unaligned.h>
@@ -1007,12 +1007,12 @@ void bpf_jit_uncharge_modmem(u32 size)
 
 void *__weak bpf_jit_alloc_exec(unsigned long size)
 {
-	return module_alloc(size);
+	return execmem_text_alloc(EXECMEM_BPF, size);
 }
 
 void __weak bpf_jit_free_exec(void *addr)
 {
-	module_memfree(addr);
+	execmem_free(addr);
 }
 
 struct bpf_binary_header *
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 0c6185aefaef..0ccb4d2ec9a2 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -26,7 +26,6 @@
 #include <linux/slab.h>
 #include <linux/stddef.h>
 #include <linux/export.h>
-#include <linux/moduleloader.h>
 #include <linux/kallsyms.h>
 #include <linux/freezer.h>
 #include <linux/seq_file.h>
@@ -39,6 +38,7 @@
 #include <linux/jump_label.h>
 #include <linux/static_call.h>
 #include <linux/perf_event.h>
+#include <linux/execmem.h>
 
 #include <asm/sections.h>
 #include <asm/cacheflush.h>
@@ -113,17 +113,17 @@ enum kprobe_slot_state {
 void __weak *alloc_insn_page(void)
 {
 	/*
-	 * Use module_alloc() so this page is within +/- 2GB of where the
+	 * Use execmem_text_alloc() so this page is within +/- 2GB of where the
 	 * kernel image and loaded module images reside. This is required
 	 * for most of the architectures.
 	 * (e.g. x86-64 needs this to handle the %rip-relative fixups.)
 	 */
-	return module_alloc(PAGE_SIZE);
+	return execmem_text_alloc(EXECMEM_KPROBES, PAGE_SIZE);
 }
 
 static void free_insn_page(void *page)
 {
-	module_memfree(page);
+	execmem_free(page);
 }
 
 struct kprobe_insn_cache kprobe_insn_slots = {
diff --git a/kernel/module/Kconfig b/kernel/module/Kconfig
index 33a2e991f608..813e116bdee6 100644
--- a/kernel/module/Kconfig
+++ b/kernel/module/Kconfig
@@ -2,6 +2,7 @@
 menuconfig MODULES
 	bool "Enable loadable module support"
 	modules
+	select EXECMEM
 	help
 	  Kernel modules are small pieces of compiled code which can
 	  be inserted in the running kernel, rather than being
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 98fedfdb8db5..4ec982cc943c 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -57,6 +57,7 @@
 #include <linux/audit.h>
 #include <linux/cfi.h>
 #include <linux/debugfs.h>
+#include <linux/execmem.h>
 #include <uapi/linux/module.h>
 #include "internal.h"
 
@@ -1179,16 +1180,6 @@ resolve_symbol_wait(struct module *mod,
 	return ksym;
 }
 
-void __weak module_memfree(void *module_region)
-{
-	/*
-	 * This memory may be RO, and freeing RO memory in an interrupt is not
-	 * supported by vmalloc.
-	 */
-	WARN_ON(in_interrupt());
-	vfree(module_region);
-}
-
 void __weak module_arch_cleanup(struct module *mod)
 {
 }
@@ -1207,7 +1198,7 @@ static void *module_memory_alloc(unsigned int size, enum mod_mem_type type)
 {
 	if (mod_mem_use_vmalloc(type))
 		return vzalloc(size);
-	return module_alloc(size);
+	return execmem_text_alloc(EXECMEM_MODULE_TEXT, size);
 }
 
 static void module_memory_free(void *ptr, enum mod_mem_type type)
@@ -1215,7 +1206,7 @@ static void module_memory_free(void *ptr, enum mod_mem_type type)
 	if (mod_mem_use_vmalloc(type))
 		vfree(ptr);
 	else
-		module_memfree(ptr);
+		execmem_free(ptr);
 }
 
 static void free_mod_mem(struct module *mod)
@@ -2479,9 +2470,9 @@ static void do_free_init(struct work_struct *w)
 
 	llist_for_each_safe(pos, n, list) {
 		initfree = container_of(pos, struct mod_initfree, node);
-		module_memfree(initfree->init_text);
-		module_memfree(initfree->init_data);
-		module_memfree(initfree->init_rodata);
+		execmem_free(initfree->init_text);
+		execmem_free(initfree->init_data);
+		execmem_free(initfree->init_rodata);
 		kfree(initfree);
 	}
 }
@@ -2584,10 +2575,10 @@ static noinline int do_init_module(struct module *mod)
 	 * We want to free module_init, but be aware that kallsyms may be
 	 * walking this with preempt disabled.  In all the failure paths, we
 	 * call synchronize_rcu(), but we don't want to slow down the success
-	 * path. module_memfree() cannot be called in an interrupt, so do the
+	 * path. execmem_free() cannot be called in an interrupt, so do the
 	 * work and call synchronize_rcu() in a work queue.
 	 *
-	 * Note that module_alloc() on most architectures creates W+X page
+	 * Note that execmem_text_alloc() on most architectures creates W+X page
 	 * mappings which won't be cleaned up until do_free_init() runs.  Any
 	 * code such as mark_rodata_ro() which depends on those mappings to
 	 * be cleaned up needs to sync with the queued work - ie
diff --git a/mm/Kconfig b/mm/Kconfig
index 264a2df5ecf5..fb12931238e8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1258,6 +1258,9 @@ config LOCK_MM_AND_FIND_VMA
 	bool
 	depends on !STACK_GROWSUP
 
+config EXECMEM
+	bool
+
 source "mm/damon/Kconfig"
 
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index ec65984e2ade..2e5fec94f09c 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -138,3 +138,4 @@ obj-$(CONFIG_IO_MAPPING) += io-mapping.o
 obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
 obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
+obj-$(CONFIG_EXECMEM) += execmem.o
diff --git a/mm/execmem.c b/mm/execmem.c
new file mode 100644
index 000000000000..638dc2b26a81
--- /dev/null
+++ b/mm/execmem.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/mm.h>
+#include <linux/vmalloc.h>
+#include <linux/execmem.h>
+#include <linux/moduleloader.h>
+
+static void *execmem_alloc(size_t size)
+{
+	return module_alloc(size);
+}
+
+void *execmem_text_alloc(enum execmem_type type, size_t size)
+{
+	return execmem_alloc(size);
+}
+
+void execmem_free(void *ptr)
+{
+	/*
+	 * This memory may be RO, and freeing RO memory in an interrupt is not
+	 * supported by vmalloc.
+	 */
+	WARN_ON(in_interrupt());
+	vfree(ptr);
+}
-- 
2.39.2



* [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 01/13] nios2: define virtual address space for modules Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free() Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-10-04  0:29   ` Edgecombe, Rick P
  2023-10-05 18:11   ` Edgecombe, Rick P
  2023-09-18  7:29 ` [PATCH v3 04/13] mm/execmem, arch: convert remaining " Mike Rapoport
                   ` (9 subsequent siblings)
  12 siblings, 2 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

Several architectures override module_alloc() only to define an address
range for code allocations that differs from the VMALLOC address space.

Provide a generic implementation in execmem that uses the parameters
for address space ranges, required alignment and page protections
provided by architectures.

The architectures must fill an execmem_params structure and implement
execmem_arch_params() that returns a pointer to that structure. This
way the execmem initialization won't be called from every architecture,
but rather from a central place, namely initialization of the core
memory management.

execmem provides the execmem_text_alloc() API, which wraps
__vmalloc_node_range() with the parameters defined by the architectures.
If an architecture does not implement execmem_arch_params(),
execmem_text_alloc() will fall back to module_alloc().

The name execmem_text_alloc() emphasizes that the allocated memory is
for executable code. Allocations of the associated data, such as the
data sections of a module, will use the execmem_data_alloc() interface
that will be added later.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
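Reader's note, not part of the patch: the flow described above (the
architecture fills a parameter table, the core validates the default
range, fills in missing ranges from it and keeps a private copy) can be
modelled in userspace roughly as follows. This is only a sketch under
stated assumptions: every demo_* name, the extra range types and the
integer stand-in for pgprot_t are hypothetical and do not exist in the
kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
enum demo_type { DEMO_DEFAULT, DEMO_KPROBES, DEMO_BPF, DEMO_TYPE_MAX };

struct demo_range {
	unsigned long start;
	unsigned long end;
	unsigned int prot;	/* stand-in for pgprot_t */
	unsigned int alignment;
};

struct demo_params {
	struct demo_range ranges[DEMO_TYPE_MAX];
};

/* Mirrors the validation step: the default range must be fully specified. */
static bool demo_validate(const struct demo_params *p)
{
	const struct demo_range *r = &p->ranges[DEMO_DEFAULT];

	return r->alignment && r->start && r->end && r->prot;
}

/* Mirrors the fill-in step: ranges without a start inherit the default. */
static void demo_init_missing(struct demo_params *p)
{
	const struct demo_range *def = &p->ranges[DEMO_DEFAULT];

	for (int i = DEMO_DEFAULT + 1; i < DEMO_TYPE_MAX; i++) {
		if (!p->ranges[i].start)
			p->ranges[i] = *def;
	}
}

/* Returns true when the parameters were accepted and copied into *dst. */
static bool demo_init(struct demo_params *dst, struct demo_params *arch)
{
	assert(dst != NULL);

	if (!arch || !demo_validate(arch))
		return false;

	demo_init_missing(arch);
	*dst = *arch;
	return true;
}
```

In this model a NULL or incomplete table leaves the core copy
untouched, which corresponds to the case where the patch keeps using
module_alloc().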
 arch/loongarch/kernel/module.c | 18 ++++++++--
 arch/mips/kernel/module.c      | 19 +++++++---
 arch/nios2/kernel/module.c     | 19 +++++++---
 arch/parisc/kernel/module.c    | 23 +++++++-----
 arch/riscv/kernel/module.c     | 20 ++++++++---
 arch/sparc/kernel/module.c     | 44 +++++++++++------------
 include/linux/execmem.h        | 44 +++++++++++++++++++++++
 mm/execmem.c                   | 66 ++++++++++++++++++++++++++++++++--
 mm/mm_init.c                   |  2 ++
 9 files changed, 203 insertions(+), 52 deletions(-)

diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index b8b86088b2dd..a1d8fe9796fa 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -18,6 +18,7 @@
 #include <linux/ftrace.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
+#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/inst.h>
 
@@ -469,10 +470,21 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }
 
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
+	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
 }
 
 static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
diff --git a/arch/mips/kernel/module.c b/arch/mips/kernel/module.c
index 0c936cbf20c5..1c959074b35f 100644
--- a/arch/mips/kernel/module.c
+++ b/arch/mips/kernel/module.c
@@ -20,6 +20,7 @@
 #include <linux/kernel.h>
 #include <linux/spinlock.h>
 #include <linux/jump_label.h>
+#include <linux/execmem.h>
 
 extern void jump_label_apply_nops(struct module *mod);
 
@@ -33,11 +34,21 @@ static LIST_HEAD(dbe_list);
 static DEFINE_SPINLOCK(dbe_lock);
 
 #ifdef MODULE_START
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.start = MODULE_START,
+			.end = MODULE_END,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
-				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
 }
 #endif
 
diff --git a/arch/nios2/kernel/module.c b/arch/nios2/kernel/module.c
index 9c97b7513853..5a8df4f9c04e 100644
--- a/arch/nios2/kernel/module.c
+++ b/arch/nios2/kernel/module.c
@@ -18,15 +18,24 @@
 #include <linux/fs.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
+#include <linux/execmem.h>
 
 #include <asm/cacheflush.h>
 
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.start = MODULES_VADDR,
+			.end = MODULES_END,
+			.pgprot = PAGE_KERNEL_EXEC,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-				    GFP_KERNEL, PAGE_KERNEL_EXEC,
-				    VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	return &execmem_params;
 }
 
 int apply_relocate_add(Elf32_Shdr *sechdrs, const char *strtab,
diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
index d214bbe3c2af..0c6dfd1daef3 100644
--- a/arch/parisc/kernel/module.c
+++ b/arch/parisc/kernel/module.c
@@ -49,6 +49,7 @@
 #include <linux/bug.h>
 #include <linux/mm.h>
 #include <linux/slab.h>
+#include <linux/execmem.h>
 
 #include <asm/unwind.h>
 #include <asm/sections.h>
@@ -173,15 +174,21 @@ static inline int reassemble_22(int as22)
 		((as22 & 0x0003ff) << 3));
 }
 
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL_RWX,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	/* using RWX means less protection for modules, but it's
-	 * easier than trying to map the text, data, init_text and
-	 * init_data correctly */
-	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
-				    GFP_KERNEL,
-				    PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+
+	return &execmem_params;
 }
 
 #ifndef CONFIG_64BIT
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 7c651d55fcbd..343a0edfb6dd 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -11,6 +11,7 @@
 #include <linux/vmalloc.h>
 #include <linux/sizes.h>
 #include <linux/pgtable.h>
+#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/sections.h>
 
@@ -436,12 +437,21 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 }
 
 #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	return __vmalloc_node_range(size, 1, MODULES_VADDR,
-				    MODULES_END, GFP_KERNEL,
-				    PAGE_KERNEL, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
 }
 #endif
 
diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
index 66c45a2764bc..1d8d1fba95b9 100644
--- a/arch/sparc/kernel/module.c
+++ b/arch/sparc/kernel/module.c
@@ -14,6 +14,10 @@
 #include <linux/string.h>
 #include <linux/ctype.h>
 #include <linux/mm.h>
+#include <linux/execmem.h>
+#ifdef CONFIG_SPARC64
+#include <linux/jump_label.h>
+#endif
 
 #include <asm/processor.h>
 #include <asm/spitfire.h>
@@ -21,34 +25,26 @@
 
 #include "entry.h"
 
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
 #ifdef CONFIG_SPARC64
-
-#include <linux/jump_label.h>
-
-static void *module_map(unsigned long size)
-{
-	if (PAGE_ALIGN(size) > MODULES_LEN)
-		return NULL;
-	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
-}
+			.start = MODULES_VADDR,
+			.end = MODULES_END,
 #else
-static void *module_map(unsigned long size)
-{
-	return vmalloc(size);
-}
-#endif /* CONFIG_SPARC64 */
-
-void *module_alloc(unsigned long size)
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+#endif
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	void *ret;
-
-	ret = module_map(size);
-	if (ret)
-		memset(ret, 0, size);
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
 
-	return ret;
+	return &execmem_params;
 }
 
 /* Make generic code ignore STT_REGISTER dummy undefined symbols.  */
diff --git a/include/linux/execmem.h b/include/linux/execmem.h
index 3491bf7e9714..44e213625053 100644
--- a/include/linux/execmem.h
+++ b/include/linux/execmem.h
@@ -32,6 +32,44 @@ enum execmem_type {
 	EXECMEM_TYPE_MAX,
 };
 
+/**
+ * struct execmem_range - definition of a memory range suitable for code and
+ *			  related data allocations
+ * @start:	address space start
+ * @end:	address space end (inclusive)
+ * @pgprot:	permissions for memory in this address space
+ * @alignment:	alignment required for text allocations
+ */
+struct execmem_range {
+	unsigned long   start;
+	unsigned long   end;
+	pgprot_t        pgprot;
+	unsigned int	alignment;
+};
+
+/**
+ * struct execmem_params - architecture parameters for code allocations
+ * @ranges: array of ranges defining architecture specific parameters for
+ * each type of executable memory allocations
+ */
+struct execmem_params {
+	struct execmem_range	ranges[EXECMEM_TYPE_MAX];
+};
+
+/**
+ * execmem_arch_params - supply parameters for allocations of executable memory
+ *
+ * A hook for architectures to define parameters for allocations of
+ * executable memory described by struct execmem_params
+ *
+ * For architectures that do not implement this method a default set of
+ * parameters will be used
+ *
+ * Return: a structure defining architecture parameters and restrictions
+ * for allocations of executable memory
+ */
+struct execmem_params *execmem_arch_params(void);
+
 /**
  * execmem_text_alloc - allocate executable memory
  * @type: type of the allocation
@@ -53,4 +91,10 @@ void *execmem_text_alloc(enum execmem_type type, size_t size);
  */
 void execmem_free(void *ptr);
 
+#ifdef CONFIG_EXECMEM
+void execmem_init(void);
+#else
+static inline void execmem_init(void) {}
+#endif
+
 #endif /* _LINUX_EXECMEM_ALLOC_H */
diff --git a/mm/execmem.c b/mm/execmem.c
index 638dc2b26a81..f25a5e064886 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -5,14 +5,26 @@
 #include <linux/execmem.h>
 #include <linux/moduleloader.h>
 
-static void *execmem_alloc(size_t size)
+static struct execmem_params execmem_params;
+
+static void *execmem_alloc(size_t size, struct execmem_range *range)
 {
-	return module_alloc(size);
+	unsigned long start = range->start;
+	unsigned long end = range->end;
+	unsigned int align = range->alignment;
+	pgprot_t pgprot = range->pgprot;
+
+	return __vmalloc_node_range(size, align, start, end,
+				   GFP_KERNEL, pgprot, VM_FLUSH_RESET_PERMS,
+				   NUMA_NO_NODE, __builtin_return_address(0));
 }
 
 void *execmem_text_alloc(enum execmem_type type, size_t size)
 {
-	return execmem_alloc(size);
+	if (!execmem_params.ranges[type].start)
+		return module_alloc(size);
+
+	return execmem_alloc(size, &execmem_params.ranges[type]);
 }
 
 void execmem_free(void *ptr)
@@ -24,3 +36,51 @@ void execmem_free(void *ptr)
 	WARN_ON(in_interrupt());
 	vfree(ptr);
 }
+
+struct execmem_params * __weak execmem_arch_params(void)
+{
+	return NULL;
+}
+
+static bool execmem_validate_params(struct execmem_params *p)
+{
+	struct execmem_range *r = &p->ranges[EXECMEM_DEFAULT];
+
+	if (!r->alignment || !r->start || !r->end || !pgprot_val(r->pgprot)) {
+		pr_crit("Invalid parameters for execmem allocator, module loading will fail\n");
+		return false;
+	}
+
+	return true;
+}
+
+static void execmem_init_missing(struct execmem_params *p)
+{
+	struct execmem_range *default_range = &p->ranges[EXECMEM_DEFAULT];
+
+	for (int i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
+		struct execmem_range *r = &p->ranges[i];
+
+		if (!r->start) {
+			r->pgprot = default_range->pgprot;
+			r->alignment = default_range->alignment;
+			r->start = default_range->start;
+			r->end = default_range->end;
+		}
+	}
+}
+
+void __init execmem_init(void)
+{
+	struct execmem_params *p = execmem_arch_params();
+
+	if (!p)
+		return;
+
+	if (!execmem_validate_params(p))
+		return;
+
+	execmem_init_missing(p);
+
+	execmem_params = *p;
+}
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 50f2f34745af..7c002b36da21 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -26,6 +26,7 @@
 #include <linux/pgtable.h>
 #include <linux/swap.h>
 #include <linux/cma.h>
+#include <linux/execmem.h>
 #include "internal.h"
 #include "slab.h"
 #include "shuffle.h"
@@ -2797,4 +2798,5 @@ void __init mm_core_init(void)
 	pti_init();
 	kmsan_init_runtime();
 	mm_cache_init();
+	execmem_init();
 }
-- 
2.39.2



* [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (2 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-10-04  0:29   ` Edgecombe, Rick P
  2023-10-23 17:14   ` Will Deacon
  2023-09-18  7:29 ` [PATCH v3 05/13] modules, execmem: drop module_alloc Mike Rapoport
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

Extend execmem parameters to accommodate more complex overrides of
module_alloc() by architectures.

This includes specifying a fallback range, required by arm, arm64 and
powerpc, and supporting allocation of the KASAN shadow, required by
arm64, s390 and x86.

The core implementation of execmem_alloc() takes care of suppressing
warnings when the initial allocation fails but there is a fallback range
defined.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
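Reader's note, not part of the patch: the fallback behaviour described
above (try the primary range quietly, then retry in the fallback range
with warnings enabled) can be modelled in userspace roughly as follows.
This is a sketch under stated assumptions: the demo_* names and the toy
backend are hypothetical, and the backend merely pretends that all
addresses below a threshold are in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the range description; not a kernel type. */
struct demo_range {
	unsigned long start, end;
	unsigned long fallback_start, fallback_end;
};

/* Toy backend state: every address below this value counts as in use. */
static unsigned long demo_exhausted_below;
/* Number of warnings the toy backend would have printed. */
static int demo_warnings;

/* Toy backend: returns the lowest free address in the range, 0 on failure. */
static unsigned long demo_backend(unsigned long start, unsigned long end,
				  size_t size, bool nowarn)
{
	unsigned long base = start > demo_exhausted_below ?
			     start : demo_exhausted_below;

	if (base >= end || size > end - base) {
		if (!nowarn)
			demo_warnings++;
		return 0;
	}
	return base;
}

static unsigned long demo_alloc(const struct demo_range *r, size_t size)
{
	bool fallback;
	unsigned long p;

	assert(r != NULL);
	fallback = r->fallback_start != 0;

	/* As in the patch, the size is checked against the primary range only. */
	if (size > r->end - r->start)
		return 0;

	/* Stay quiet on the first attempt only if a second one is available. */
	p = demo_backend(r->start, r->end, size, fallback);
	if (!p && fallback)
		p = demo_backend(r->fallback_start, r->fallback_end, size, false);

	return p;
}
```

With a fallback range defined, a failure in the primary range produces
no warning in this model; a warning is counted only when the last
available attempt fails.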
 arch/arm/kernel/module.c     | 38 ++++++++++++---------
 arch/arm64/kernel/module.c   | 57 ++++++++++++++------------------
 arch/powerpc/kernel/module.c | 52 ++++++++++++++---------------
 arch/s390/kernel/module.c    | 52 +++++++++++------------------
 arch/x86/kernel/module.c     | 64 +++++++++++-------------------------
 include/linux/execmem.h      | 14 ++++++++
 mm/execmem.c                 | 43 ++++++++++++++++++++++--
 7 files changed, 167 insertions(+), 153 deletions(-)

diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index e74d84f58b77..2c7651a2d84c 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -16,6 +16,7 @@
 #include <linux/fs.h>
 #include <linux/string.h>
 #include <linux/gfp.h>
+#include <linux/execmem.h>
 
 #include <asm/sections.h>
 #include <asm/smp_plat.h>
@@ -34,23 +35,28 @@
 #endif
 
 #ifdef CONFIG_MMU
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.start = MODULES_VADDR,
+			.end = MODULES_END,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	gfp_t gfp_mask = GFP_KERNEL;
-	void *p;
-
-	/* Silence the initial allocation */
-	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS))
-		gfp_mask |= __GFP_NOWARN;
-
-	p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-				gfp_mask, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
-	if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)
-		return p;
-	return __vmalloc_node_range(size, 1,  VMALLOC_START, VMALLOC_END,
-				GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
+	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
+
+	r->pgprot = PAGE_KERNEL_EXEC;
+
+	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS)) {
+		r->fallback_start = VMALLOC_START;
+		r->fallback_end = VMALLOC_END;
+	}
+
+	return &execmem_params;
 }
 #endif
 
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index dd851297596e..cd6320de1c54 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -20,6 +20,7 @@
 #include <linux/random.h>
 #include <linux/scs.h>
 #include <linux/vmalloc.h>
+#include <linux/execmem.h>
 
 #include <asm/alternative.h>
 #include <asm/insn.h>
@@ -108,46 +109,38 @@ static int __init module_init_limits(void)
 
 	return 0;
 }
-subsys_initcall(module_init_limits);
 
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	void *p = NULL;
+	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
 
-	/*
-	 * Where possible, prefer to allocate within direct branch range of the
-	 * kernel such that no PLTs are necessary.
-	 */
-	if (module_direct_base) {
-		p = __vmalloc_node_range(size, MODULE_ALIGN,
-					 module_direct_base,
-					 module_direct_base + SZ_128M,
-					 GFP_KERNEL | __GFP_NOWARN,
-					 PAGE_KERNEL, 0, NUMA_NO_NODE,
-					 __builtin_return_address(0));
-	}
+	module_init_limits();
 
-	if (!p && module_plt_base) {
-		p = __vmalloc_node_range(size, MODULE_ALIGN,
-					 module_plt_base,
-					 module_plt_base + SZ_2G,
-					 GFP_KERNEL | __GFP_NOWARN,
-					 PAGE_KERNEL, 0, NUMA_NO_NODE,
-					 __builtin_return_address(0));
-	}
+	r->pgprot = PAGE_KERNEL;
 
-	if (!p) {
-		pr_warn_ratelimited("%s: unable to allocate memory\n",
-				    __func__);
-	}
+	if (module_direct_base) {
+		r->start = module_direct_base;
+		r->end = module_direct_base + SZ_128M;
 
-	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
-		vfree(p);
-		return NULL;
+		if (module_plt_base) {
+			r->fallback_start = module_plt_base;
+			r->fallback_end = module_plt_base + SZ_2G;
+		}
+	} else if (module_plt_base) {
+		r->start = module_plt_base;
+		r->end = module_plt_base + SZ_2G;
 	}
 
-	/* Memory is intended to be executable, reset the pointer tag. */
-	return kasan_reset_tag(p);
+	return &execmem_params;
 }
 
 enum aarch64_reloc_op {
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index f6d6ae0a1692..f4dd26f693a3 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -10,6 +10,7 @@
 #include <linux/vmalloc.h>
 #include <linux/mm.h>
 #include <linux/bug.h>
+#include <linux/execmem.h>
 #include <asm/module.h>
 #include <linux/uaccess.h>
 #include <asm/firmware.h>
@@ -89,39 +90,38 @@ int module_finalize(const Elf_Ehdr *hdr,
 	return 0;
 }
 
-static __always_inline void *
-__module_alloc(unsigned long size, unsigned long start, unsigned long end, bool nowarn)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
 	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
-	gfp_t gfp = GFP_KERNEL | (nowarn ? __GFP_NOWARN : 0);
-
-	/*
-	 * Don't do huge page allocations for modules yet until more testing
-	 * is done. STRICT_MODULE_RWX may require extra work to support this
-	 * too.
-	 */
-	return __vmalloc_node_range(size, 1, start, end, gfp, prot,
-				    VM_FLUSH_RESET_PERMS,
-				    NUMA_NO_NODE, __builtin_return_address(0));
-}
+	struct execmem_range *range = &execmem_params.ranges[EXECMEM_DEFAULT];
 
-void *module_alloc(unsigned long size)
-{
 #ifdef MODULES_VADDR
 	unsigned long limit = (unsigned long)_etext - SZ_32M;
-	void *ptr = NULL;
-
-	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
 
 	/* First try within 32M limit from _etext to avoid branch trampolines */
-	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
-		ptr = __module_alloc(size, limit, MODULES_END, true);
-
-	if (!ptr)
-		ptr = __module_alloc(size, MODULES_VADDR, MODULES_END, false);
-
-	return ptr;
+	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit) {
+		range->start = limit;
+		range->end = MODULES_END;
+		range->fallback_start = MODULES_VADDR;
+		range->fallback_end = MODULES_END;
+	} else {
+		range->start = MODULES_VADDR;
+		range->end = MODULES_END;
+	}
 #else
-	return __module_alloc(size, VMALLOC_START, VMALLOC_END, false);
+	range->start = VMALLOC_START;
+	range->end = VMALLOC_END;
 #endif
+
+	range->pgprot = prot;
+
+	return &execmem_params;
 }
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index db5561d0c233..538d5f24af66 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -37,41 +37,29 @@
 
 #define PLT_ENTRY_SIZE 22
 
-static unsigned long get_module_load_offset(void)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+			.pgprot = PAGE_KERNEL,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	static DEFINE_MUTEX(module_kaslr_mutex);
-	static unsigned long module_load_offset;
-
-	if (!kaslr_enabled())
-		return 0;
-	/*
-	 * Calculate the module_load_offset the first time this code
-	 * is called. Once calculated it stays the same until reboot.
-	 */
-	mutex_lock(&module_kaslr_mutex);
-	if (!module_load_offset)
+	unsigned long module_load_offset = 0;
+	unsigned long start;
+
+	if (kaslr_enabled())
 		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
-	mutex_unlock(&module_kaslr_mutex);
-	return module_load_offset;
-}
 
-void *module_alloc(unsigned long size)
-{
-	gfp_t gfp_mask = GFP_KERNEL;
-	void *p;
-
-	if (PAGE_ALIGN(size) > MODULES_LEN)
-		return NULL;
-	p = __vmalloc_node_range(size, MODULE_ALIGN,
-				 MODULES_VADDR + get_module_load_offset(),
-				 MODULES_END, gfp_mask, PAGE_KERNEL,
-				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
-				 NUMA_NO_NODE, __builtin_return_address(0));
-	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
-		vfree(p);
-		return NULL;
-	}
-	return p;
+	start = MODULES_VADDR + module_load_offset;
+	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
 }
 
 #ifdef CONFIG_FUNCTION_TRACER
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 5f71a0cf4399..9d37375e2f05 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -19,6 +19,7 @@
 #include <linux/jump_label.h>
 #include <linux/random.h>
 #include <linux/memory.h>
+#include <linux/execmem.h>
 
 #include <asm/text-patching.h>
 #include <asm/page.h>
@@ -36,55 +37,30 @@ do {							\
 } while (0)
 #endif
 
-#ifdef CONFIG_RANDOMIZE_BASE
-static unsigned long module_load_offset;
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+		},
+	},
+};
 
-/* Mutex protects the module_load_offset. */
-static DEFINE_MUTEX(module_kaslr_mutex);
-
-static unsigned long int get_module_load_offset(void)
-{
-	if (kaslr_enabled()) {
-		mutex_lock(&module_kaslr_mutex);
-		/*
-		 * Calculate the module_load_offset the first time this
-		 * code is called. Once calculated it stays the same until
-		 * reboot.
-		 */
-		if (module_load_offset == 0)
-			module_load_offset =
-				get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
-		mutex_unlock(&module_kaslr_mutex);
-	}
-	return module_load_offset;
-}
-#else
-static unsigned long int get_module_load_offset(void)
-{
-	return 0;
-}
-#endif
-
-void *module_alloc(unsigned long size)
+struct execmem_params __init *execmem_arch_params(void)
 {
-	gfp_t gfp_mask = GFP_KERNEL;
-	void *p;
-
-	if (PAGE_ALIGN(size) > MODULES_LEN)
-		return NULL;
+	unsigned long module_load_offset = 0;
+	unsigned long start;
 
-	p = __vmalloc_node_range(size, MODULE_ALIGN,
-				 MODULES_VADDR + get_module_load_offset(),
-				 MODULES_END, gfp_mask, PAGE_KERNEL,
-				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
-				 NUMA_NO_NODE, __builtin_return_address(0));
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled())
+		module_load_offset =
+			get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
 
-	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
-		vfree(p);
-		return NULL;
-	}
+	start = MODULES_VADDR + module_load_offset;
+	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
 
-	return p;
+	return &execmem_params;
 }
 
 #ifdef CONFIG_X86_32
diff --git a/include/linux/execmem.h b/include/linux/execmem.h
index 44e213625053..806ad1a0088d 100644
--- a/include/linux/execmem.h
+++ b/include/linux/execmem.h
@@ -32,19 +32,33 @@ enum execmem_type {
 	EXECMEM_TYPE_MAX,
 };
 
+/**
+ * enum execmem_range_flags - options for executable memory allocations
+ * @EXECMEM_KASAN_SHADOW:	allocate kasan shadow
+ */
+enum execmem_range_flags {
+	EXECMEM_KASAN_SHADOW	= (1 << 0),
+};
+
 /**
  * struct execmem_range - definition of a memory range suitable for code and
  *			  related data allocations
  * @start:	address space start
  * @end:	address space end (inclusive)
+ * @fallback_start:	start of the range for fallback allocations
+ * @fallback_end:	end of the range for fallback allocations (inclusive)
  * @pgprot:	permissions for memory in this address space
  * @alignment:	alignment required for text allocations
+ * @flags:	options for memory allocations for this range
  */
 struct execmem_range {
 	unsigned long   start;
 	unsigned long   end;
+	unsigned long   fallback_start;
+	unsigned long   fallback_end;
 	pgprot_t        pgprot;
 	unsigned int	alignment;
+	enum execmem_range_flags flags;
 };
 
 /**
diff --git a/mm/execmem.c b/mm/execmem.c
index f25a5e064886..a8c2f44d0133 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -11,12 +11,46 @@ static void *execmem_alloc(size_t size, struct execmem_range *range)
 {
 	unsigned long start = range->start;
 	unsigned long end = range->end;
+	unsigned long fallback_start = range->fallback_start;
+	unsigned long fallback_end = range->fallback_end;
 	unsigned int align = range->alignment;
 	pgprot_t pgprot = range->pgprot;
+	bool kasan = range->flags & EXECMEM_KASAN_SHADOW;
+	unsigned long vm_flags  = VM_FLUSH_RESET_PERMS;
+	bool fallback  = !!fallback_start;
+	gfp_t gfp_flags = GFP_KERNEL;
+	void *p;
 
-	return __vmalloc_node_range(size, align, start, end,
-				   GFP_KERNEL, pgprot, VM_FLUSH_RESET_PERMS,
-				   NUMA_NO_NODE, __builtin_return_address(0));
+	if (PAGE_ALIGN(size) > (end - start))
+		return NULL;
+
+	if (kasan)
+		vm_flags |= VM_DEFER_KMEMLEAK;
+
+	if (fallback)
+		gfp_flags |= __GFP_NOWARN;
+
+	p = __vmalloc_node_range(size, align, start, end, gfp_flags,
+				 pgprot, vm_flags, NUMA_NO_NODE,
+				 __builtin_return_address(0));
+
+	if (!p && fallback) {
+		start = fallback_start;
+		end = fallback_end;
+		gfp_flags = GFP_KERNEL;
+
+		p = __vmalloc_node_range(size, align, start, end, gfp_flags,
+					 pgprot, vm_flags, NUMA_NO_NODE,
+					 __builtin_return_address(0));
+	}
+
+	if (p && kasan &&
+	    (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
+		vfree(p);
+		return NULL;
+	}
+
+	return kasan_reset_tag(p);
 }
 
 void *execmem_text_alloc(enum execmem_type type, size_t size)
@@ -66,6 +100,9 @@ static void execmem_init_missing(struct execmem_params *p)
 			r->alignment = default_range->alignment;
 			r->start = default_range->start;
 			r->end = default_range->end;
+			r->flags = default_range->flags;
+			r->fallback_start = default_range->fallback_start;
+			r->fallback_end = default_range->fallback_end;
 		}
 	}
 }
-- 
2.39.2



* [PATCH v3 05/13] modules, execmem: drop module_alloc
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (3 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 04/13] mm/execmem, arch: convert remaining " Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc() Mike Rapoport
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

Define default parameters for the address range of code allocations
using the current values in module_alloc(), and make execmem_text_alloc()
use these defaults when an architecture does not supply its own
parameters.

With this, execmem_text_alloc() implements memory allocation in a way
compatible with module_alloc() and can be used as a replacement for
module_alloc().

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
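Reader's note, not part of the patch: the default selection described
above can be modelled in userspace roughly as follows. This is a sketch
under stated assumptions: the demo_* names are hypothetical, and the
DEMO_* constants are placeholders for VMALLOC_START, VMALLOC_END and
PAGE_KERNEL_EXEC rather than real values.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values; a real kernel uses VMALLOC_START and VMALLOC_END. */
#define DEMO_VMALLOC_START	0x10000000UL
#define DEMO_VMALLOC_END	0x20000000UL
#define DEMO_PROT_EXEC		0x5u	/* stand-in for PAGE_KERNEL_EXEC */

/* Hypothetical stand-in for the range description; not a kernel type. */
struct demo_range {
	unsigned long start;
	unsigned long end;
	unsigned int prot;
	unsigned int alignment;
};

/* The range the toy allocator would use after initialization. */
static struct demo_range demo_active;

/* arch == NULL models an architecture that supplies no parameters. */
static void demo_init(const struct demo_range *arch)
{
	if (!arch) {
		/* Same values the weak module_alloc() used to hard-code. */
		demo_active.start = DEMO_VMALLOC_START;
		demo_active.end = DEMO_VMALLOC_END;
		demo_active.prot = DEMO_PROT_EXEC;
		demo_active.alignment = 1;
		return;
	}

	assert(arch->alignment != 0);
	demo_active = *arch;
}
```

Because the defaults live in the table itself, the allocation path in
this model needs no special case for architectures without their own
parameters.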
 include/linux/execmem.h      |  8 ++++++++
 include/linux/moduleloader.h | 12 ------------
 kernel/module/main.c         |  7 -------
 mm/execmem.c                 | 12 ++++++++----
 4 files changed, 16 insertions(+), 23 deletions(-)

diff --git a/include/linux/execmem.h b/include/linux/execmem.h
index 806ad1a0088d..519bdfdca595 100644
--- a/include/linux/execmem.h
+++ b/include/linux/execmem.h
@@ -4,6 +4,14 @@
 
 #include <linux/types.h>
 
+#if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \
+		!defined(CONFIG_KASAN_VMALLOC)
+#include <linux/kasan.h>
+#define MODULE_ALIGN (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)
+#else
+#define MODULE_ALIGN PAGE_SIZE
+#endif
+
 /**
  * enum execmem_type - types of executable memory ranges
  *
diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
index a23718aa2f4d..8c81f389117d 100644
--- a/include/linux/moduleloader.h
+++ b/include/linux/moduleloader.h
@@ -25,10 +25,6 @@ int module_frob_arch_sections(Elf_Ehdr *hdr,
 /* Additional bytes needed by arch in front of individual sections */
 unsigned int arch_mod_section_prepend(struct module *mod, unsigned int section);
 
-/* Allocator used for allocating struct module, core sections and init
-   sections.  Returns NULL on failure. */
-void *module_alloc(unsigned long size);
-
 /* Determines if the section name is an init section (that is only used during
  * module loading).
  */
@@ -118,12 +114,4 @@ void module_arch_cleanup(struct module *mod);
 /* Any cleanup before freeing mod->module_init */
 void module_arch_freeing_init(struct module *mod);
 
-#if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \
-		!defined(CONFIG_KASAN_VMALLOC)
-#include <linux/kasan.h>
-#define MODULE_ALIGN (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)
-#else
-#define MODULE_ALIGN PAGE_SIZE
-#endif
-
 #endif
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 4ec982cc943c..c4146bfcd0a7 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -1601,13 +1601,6 @@ static void free_modinfo(struct module *mod)
 	}
 }
 
-void * __weak module_alloc(unsigned long size)
-{
-	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
-			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
-			NUMA_NO_NODE, __builtin_return_address(0));
-}
-
 bool __weak module_init_section(const char *name)
 {
 	return strstarts(name, ".init");
diff --git a/mm/execmem.c b/mm/execmem.c
index a8c2f44d0133..abcbd07e05ac 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -55,9 +55,6 @@ static void *execmem_alloc(size_t size, struct execmem_range *range)
 
 void *execmem_text_alloc(enum execmem_type type, size_t size)
 {
-	if (!execmem_params.ranges[type].start)
-		return module_alloc(size);
-
 	return execmem_alloc(size, &execmem_params.ranges[type]);
 }
 
@@ -111,8 +108,15 @@ void __init execmem_init(void)
 {
 	struct execmem_params *p = execmem_arch_params();
 
-	if (!p)
+	if (!p) {
+		p = &execmem_params;
+		p->ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+		p->ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+		p->ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL_EXEC;
+		p->ranges[EXECMEM_DEFAULT].alignment = 1;
+
 		return;
+	}
 
 	if (!execmem_validate_params(p))
 		return;
-- 
2.39.2



* [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc()
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (4 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 05/13] modules, execmem: drop module_alloc Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-21 22:52   ` Song Liu
  2023-09-18  7:29 ` [PATCH v3 07/13] arm64, execmem: extend execmem_params for generated code allocations Mike Rapoport
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

Data related to code allocations, such as module data sections, needs to
comply with architecture constraints on its placement, and until now it
has been allocated using execmem_text_alloc().

Create a dedicated API for allocating data related to code allocations
and allow architectures to define address ranges for data allocations.

Since this is currently relevant only for powerpc variants that use the
VMALLOC address space for module data allocations, automatically reuse
the address ranges defined for text unless an architecture explicitly
defines an address range for data.

With the separation of code and data allocations, data sections of the
modules are now mapped as PAGE_KERNEL rather than PAGE_KERNEL_EXEC, which
was the default on many architectures.
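
For illustration only, here is a userspace sketch of the defaulting rule
described above: a range the architecture leaves undefined inherits its
bounds and alignment from the default range, and a data range gets
non-executable protections. The enum prot values and the helper names
range_is_data() and init_missing() are stand-ins invented for this
sketch, not the kernel definitions, and the enum is a reduced copy of
the one in the patch.

```c
#include <assert.h>

/* Userspace stand-ins for pgprot_t values; not the kernel definitions. */
enum prot { PROT_UNSET, PROT_KERNEL, PROT_KERNEL_EXEC };

/* Reduced copy of the enum from the patch, enough for the illustration. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_MODULE_DATA,
	EXECMEM_TYPE_MAX,
};

struct execmem_range {
	unsigned long start;
	unsigned long end;
	enum prot pgprot;
	unsigned int alignment;
};

static int range_is_data(int type)
{
	return type == EXECMEM_MODULE_DATA;
}

/* Fill every range the architecture left undefined (start == 0). */
static void init_missing(struct execmem_range *ranges)
{
	const struct execmem_range *def = &ranges[EXECMEM_DEFAULT];
	int i;

	/* The default range must be valid before it can be inherited. */
	assert(def->start && def->start < def->end);

	for (i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
		struct execmem_range *r = &ranges[i];

		if (r->start)
			continue;

		/* Data never inherits executable permissions. */
		r->pgprot = range_is_data(i) ? PROT_KERNEL : def->pgprot;
		r->alignment = def->alignment;
		r->start = def->start;
		r->end = def->end;
	}
}
```

With only the default range populated, a call to init_missing() leaves
the data range sharing the default bounds but mapped non-executable,
while the kprobes range inherits the default protections unchanged.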

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/powerpc/kernel/module.c | 12 ++++++++++++
 include/linux/execmem.h      | 19 +++++++++++++++++++
 kernel/module/main.c         | 15 +++------------
 mm/execmem.c                 | 17 ++++++++++++++++-
 4 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index f4dd26f693a3..824d9541a310 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -95,6 +95,9 @@ static struct execmem_params execmem_params __ro_after_init = {
 		[EXECMEM_DEFAULT] = {
 			.alignment = 1,
 		},
+		[EXECMEM_MODULE_DATA] = {
+			.alignment = 1,
+		},
 	},
 };
 
@@ -103,7 +106,12 @@ struct execmem_params __init *execmem_arch_params(void)
 	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
 	struct execmem_range *range = &execmem_params.ranges[EXECMEM_DEFAULT];
 
+	/*
+	 * BOOK3S_32 and 8xx define MODULES_VADDR for text allocations and
+	 * allow allocating data in the entire vmalloc space
+	 */
 #ifdef MODULES_VADDR
+	struct execmem_range *data = &execmem_params.ranges[EXECMEM_MODULE_DATA];
 	unsigned long limit = (unsigned long)_etext - SZ_32M;
 
 	/* First try within 32M limit from _etext to avoid branch trampolines */
@@ -116,6 +124,10 @@ struct execmem_params __init *execmem_arch_params(void)
 		range->start = MODULES_VADDR;
 		range->end = MODULES_END;
 	}
+	data->start = VMALLOC_START;
+	data->end = VMALLOC_END;
+	data->pgprot = PAGE_KERNEL;
+	data->alignment = 1;
 #else
 	range->start = VMALLOC_START;
 	range->end = VMALLOC_END;
diff --git a/include/linux/execmem.h b/include/linux/execmem.h
index 519bdfdca595..09d45ac786e9 100644
--- a/include/linux/execmem.h
+++ b/include/linux/execmem.h
@@ -29,6 +29,7 @@
  * @EXECMEM_KPROBES: parameters for kprobes
  * @EXECMEM_FTRACE: parameters for ftrace
  * @EXECMEM_BPF: parameters for BPF
+ * @EXECMEM_MODULE_DATA: parameters for module data sections
  * @EXECMEM_TYPE_MAX:
  */
 enum execmem_type {
@@ -37,6 +38,7 @@ enum execmem_type {
 	EXECMEM_KPROBES,
 	EXECMEM_FTRACE,
 	EXECMEM_BPF,
+	EXECMEM_MODULE_DATA,
 	EXECMEM_TYPE_MAX,
 };
 
@@ -107,6 +109,23 @@ struct execmem_params *execmem_arch_params(void);
  */
 void *execmem_text_alloc(enum execmem_type type, size_t size);
 
+/**
+ * execmem_data_alloc - allocate memory for data coupled to code
+ * @type: type of the allocation
+ * @size: how many bytes of memory are required
+ *
+ * Allocates memory that will contain data coupled with executable code,
+ * like data sections in kernel modules.
+ *
+ * The memory will have protections defined by architecture.
+ *
+ * The allocated memory will reside in an area that does not impose
+ * restrictions on the addressing modes.
+ *
+ * Return: a pointer to the allocated memory or %NULL
+ */
+void *execmem_data_alloc(enum execmem_type type, size_t size);
+
 /**
  * execmem_free - free executable memory
  * @ptr: pointer to the memory that should be freed
diff --git a/kernel/module/main.c b/kernel/module/main.c
index c4146bfcd0a7..2ae83a6abf66 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -1188,25 +1188,16 @@ void __weak module_arch_freeing_init(struct module *mod)
 {
 }
 
-static bool mod_mem_use_vmalloc(enum mod_mem_type type)
-{
-	return IS_ENABLED(CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC) &&
-		mod_mem_type_is_core_data(type);
-}
-
 static void *module_memory_alloc(unsigned int size, enum mod_mem_type type)
 {
-	if (mod_mem_use_vmalloc(type))
-		return vzalloc(size);
+	if (mod_mem_type_is_data(type))
+		return execmem_data_alloc(EXECMEM_MODULE_DATA, size);
 	return execmem_text_alloc(EXECMEM_MODULE_TEXT, size);
 }
 
 static void module_memory_free(void *ptr, enum mod_mem_type type)
 {
-	if (mod_mem_use_vmalloc(type))
-		vfree(ptr);
-	else
-		execmem_free(ptr);
+	execmem_free(ptr);
 }
 
 static void free_mod_mem(struct module *mod)
diff --git a/mm/execmem.c b/mm/execmem.c
index abcbd07e05ac..aeff85261360 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -53,11 +53,23 @@ static void *execmem_alloc(size_t size, struct execmem_range *range)
 	return kasan_reset_tag(p);
 }
 
+static inline bool execmem_range_is_data(enum execmem_type type)
+{
+	return type == EXECMEM_MODULE_DATA;
+}
+
 void *execmem_text_alloc(enum execmem_type type, size_t size)
 {
 	return execmem_alloc(size, &execmem_params.ranges[type]);
 }
 
+void *execmem_data_alloc(enum execmem_type type, size_t size)
+{
+	WARN_ON_ONCE(!execmem_range_is_data(type));
+
+	return execmem_alloc(size, &execmem_params.ranges[type]);
+}
+
 void execmem_free(void *ptr)
 {
 	/*
@@ -93,7 +105,10 @@ static void execmem_init_missing(struct execmem_params *p)
 		struct execmem_range *r = &p->ranges[i];
 
 		if (!r->start) {
-			r->pgprot = default_range->pgprot;
+			if (execmem_range_is_data(i))
+				r->pgprot = PAGE_KERNEL;
+			else
+				r->pgprot = default_range->pgprot;
 			r->alignment = default_range->alignment;
 			r->start = default_range->start;
 			r->end = default_range->end;
-- 
2.39.2



* [PATCH v3 07/13] arm64, execmem: extend execmem_params for generated code allocations
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (5 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc() Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-10-23 17:21   ` Will Deacon
  2023-09-18  7:29 ` [PATCH v3 08/13] riscv: " Mike Rapoport
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

The memory allocations for kprobes and BPF on arm64 can be placed
anywhere in vmalloc address space and currently this is implemented with
overrides of alloc_insn_page() and bpf_jit_alloc_exec() in arm64.

Define EXECMEM_KPROBES and EXECMEM_BPF ranges in arm64::execmem_params and
drop overrides of alloc_insn_page() and bpf_jit_alloc_exec().
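
As a rough userspace model of the idea (not the kernel code), the
per-architecture overrides collapse into one table indexed by allocation
type, and a single generic lookup replaces each arch-specific allocator.
The FAKE_* addresses, the enum prot values and range_for() are
hypothetical names used only for this sketch.

```c
#include <assert.h>

/* Stand-ins for pgprot_t values; not the kernel definitions. */
enum prot { PROT_KERNEL, PROT_KERNEL_ROX };

/* Reduced copy of the enum from the series. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

struct execmem_range {
	unsigned long start;
	unsigned long end;
	enum prot pgprot;
	unsigned int alignment;
};

/* Made-up address space layout, for illustration only. */
#define FAKE_MODULES_START	0x08000000UL
#define FAKE_MODULES_END	0x10000000UL
#define FAKE_VMALLOC_START	0x10000000UL
#define FAKE_VMALLOC_END	0x20000000UL

/* One table describes where each kind of allocation may live. */
static const struct execmem_range ranges[EXECMEM_TYPE_MAX] = {
	[EXECMEM_DEFAULT] = {
		.start = FAKE_MODULES_START,
		.end = FAKE_MODULES_END,
		.pgprot = PROT_KERNEL,
		.alignment = 1,
	},
	[EXECMEM_KPROBES] = {
		.start = FAKE_VMALLOC_START,
		.end = FAKE_VMALLOC_END,
		.pgprot = PROT_KERNEL_ROX,
		.alignment = 1,
	},
	[EXECMEM_BPF] = {
		.start = FAKE_VMALLOC_START,
		.end = FAKE_VMALLOC_END,
		.pgprot = PROT_KERNEL,
		.alignment = 1,
	},
};

/* Generic lookup that takes the place of per-subsystem overrides. */
static const struct execmem_range *range_for(enum execmem_type type)
{
	assert(type < EXECMEM_TYPE_MAX);
	return &ranges[type];
}
```

A generic caller would then pass the bounds and protections returned by
range_for() to the underlying allocator instead of hard-coding them in
each subsystem.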

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/arm64/kernel/module.c         | 13 +++++++++++++
 arch/arm64/kernel/probes/kprobes.c |  7 -------
 arch/arm64/net/bpf_jit_comp.c      | 11 -----------
 3 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index cd6320de1c54..d27db168d2a2 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -116,6 +116,16 @@ static struct execmem_params execmem_params __ro_after_init = {
 			.flags = EXECMEM_KASAN_SHADOW,
 			.alignment = MODULE_ALIGN,
 		},
+		[EXECMEM_KPROBES] = {
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+			.alignment = 1,
+		},
+		[EXECMEM_BPF] = {
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+			.alignment = 1,
+		},
 	},
 };
 
@@ -140,6 +150,9 @@ struct execmem_params __init *execmem_arch_params(void)
 		r->end = module_plt_base + SZ_2G;
 	}
 
+	execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
+	execmem_params.ranges[EXECMEM_BPF].pgprot = PAGE_KERNEL;
+
 	return &execmem_params;
 }
 
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index 70b91a8c6bb3..6fccedd02b2a 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -129,13 +129,6 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
 	return 0;
 }
 
-void *alloc_insn_page(void)
-{
-	return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
-			GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS,
-			NUMA_NO_NODE, __builtin_return_address(0));
-}
-
 /* arm kprobe: install breakpoint in text */
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 150d1c6543f7..3a7590f828d1 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1687,17 +1687,6 @@ u64 bpf_jit_alloc_exec_limit(void)
 	return VMALLOC_END - VMALLOC_START;
 }
 
-void *bpf_jit_alloc_exec(unsigned long size)
-{
-	/* Memory is intended to be executable, reset the pointer tag. */
-	return kasan_reset_tag(vmalloc(size));
-}
-
-void bpf_jit_free_exec(void *addr)
-{
-	return vfree(addr);
-}
-
 /* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
 bool bpf_jit_supports_subprog_tailcalls(void)
 {
-- 
2.39.2



* [PATCH v3 08/13] riscv: extend execmem_params for generated code allocations
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (6 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 07/13] arm64, execmem: extend execmem_params for generated code allocations Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-22 10:37   ` Alexandre Ghiti
  2023-09-18  7:29 ` [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations Mike Rapoport
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

The memory allocations for kprobes and BPF on RISC-V are not placed in
the modules area, and these custom allocations are implemented with
overrides of alloc_insn_page() and bpf_jit_alloc_exec().

Slightly reorder execmem_params initialization to support both 32-bit and
64-bit variants, define EXECMEM_KPROBES and EXECMEM_BPF ranges in
riscv::execmem_params and drop overrides of alloc_insn_page() and
bpf_jit_alloc_exec().

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/riscv/kernel/module.c         | 21 ++++++++++++++++++++-
 arch/riscv/kernel/probes/kprobes.c | 10 ----------
 arch/riscv/net/bpf_jit_core.c      | 13 -------------
 3 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 343a0edfb6dd..31505ecb5c72 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -436,20 +436,39 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }
 
-#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
+#ifdef CONFIG_MMU
 static struct execmem_params execmem_params __ro_after_init = {
 	.ranges = {
 		[EXECMEM_DEFAULT] = {
 			.pgprot = PAGE_KERNEL,
 			.alignment = 1,
 		},
+		[EXECMEM_KPROBES] = {
+			.pgprot = PAGE_KERNEL_READ_EXEC,
+			.alignment = 1,
+		},
+		[EXECMEM_BPF] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
 	},
 };
 
 struct execmem_params __init *execmem_arch_params(void)
 {
+#ifdef CONFIG_64BIT
 	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
 	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+#else
+	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+#endif
+
+	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
+
+	execmem_params.ranges[EXECMEM_BPF].start = BPF_JIT_REGION_START;
+	execmem_params.ranges[EXECMEM_BPF].end = BPF_JIT_REGION_END;
 
 	return &execmem_params;
 }
diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index 2f08c14a933d..e64f2f3064eb 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -104,16 +104,6 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
 	return 0;
 }
 
-#ifdef CONFIG_MMU
-void *alloc_insn_page(void)
-{
-	return  __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
-				     GFP_KERNEL, PAGE_KERNEL_READ_EXEC,
-				     VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
-				     __builtin_return_address(0));
-}
-#endif
-
 /* install breakpoint in text */
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
index 7b70ccb7fec3..c8a758f0882b 100644
--- a/arch/riscv/net/bpf_jit_core.c
+++ b/arch/riscv/net/bpf_jit_core.c
@@ -218,19 +218,6 @@ u64 bpf_jit_alloc_exec_limit(void)
 	return BPF_JIT_REGION_SIZE;
 }
 
-void *bpf_jit_alloc_exec(unsigned long size)
-{
-	return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
-				    BPF_JIT_REGION_END, GFP_KERNEL,
-				    PAGE_KERNEL, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
-}
-
-void bpf_jit_free_exec(void *addr)
-{
-	return vfree(addr);
-}
-
 void *bpf_arch_text_copy(void *dst, void *src, size_t len)
 {
 	int ret;
-- 
2.39.2



* [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (7 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 08/13] riscv: " Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-21 22:30   ` Song Liu
  2023-09-22 10:32   ` Christophe Leroy
  2023-09-18  7:29 ` [PATCH v3 10/13] arch: make execmem setup available regardless of CONFIG_MODULES Mike Rapoport
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

powerpc overrides kprobes::alloc_insn_page() to remove writable
permissions when STRICT_MODULE_RWX is on.

Add a definition of EXECMEM_KPROBES to execmem_params to allow using the
generic kprobes::alloc_insn_page() with the desired permissions.

As powerpc uses breakpoint instructions to inject kprobes, it does not
need to constrain kprobe allocations to the modules area and can use the
entire vmalloc address space.
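
A minimal userspace sketch of the intended setup, assuming stand-in
types: the kprobes range spans the whole vmalloc space, so both bounds
are set, and the protections depend only on whether strict module RWX is
enabled. The function init_kprobes_range(), its parameters and the enum
prot values are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for pgprot_t values; not the kernel definitions. */
enum prot { PROT_KERNEL_EXEC, PROT_KERNEL_ROX };

struct execmem_range {
	unsigned long start;
	unsigned long end;
	enum prot pgprot;
	unsigned int alignment;
};

/*
 * Describe the kprobes range: the full [vstart, vend) span, read-only
 * and executable when strict RWX is on, writable and executable otherwise.
 */
static void init_kprobes_range(struct execmem_range *r, bool strict_rwx,
			       unsigned long vstart, unsigned long vend)
{
	/* An empty or inverted range would make every allocation fail. */
	assert(vstart < vend);

	r->start = vstart;
	r->end = vend;
	r->pgprot = strict_rwx ? PROT_KERNEL_ROX : PROT_KERNEL_EXEC;
	r->alignment = 1;
}
```

Choosing the protections once at initialization is what lets the generic
alloc_insn_page() replace the powerpc override, which changed the page
permissions after each allocation.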

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/powerpc/kernel/kprobes.c | 14 --------------
 arch/powerpc/kernel/module.c  | 11 +++++++++++
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 62228c7072a2..14c5ddec3056 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -126,20 +126,6 @@ kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offse
 	return (kprobe_opcode_t *)(addr + offset);
 }
 
-void *alloc_insn_page(void)
-{
-	void *page;
-
-	page = execmem_text_alloc(EXECMEM_KPROBES, PAGE_SIZE);
-	if (!page)
-		return NULL;
-
-	if (strict_module_rwx_enabled())
-		set_memory_rox((unsigned long)page, 1);
-
-	return page;
-}
-
 int arch_prepare_kprobe(struct kprobe *p)
 {
 	int ret = 0;
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index 824d9541a310..bf2c62aef628 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -95,6 +95,9 @@ static struct execmem_params execmem_params __ro_after_init = {
 		[EXECMEM_DEFAULT] = {
 			.alignment = 1,
 		},
+		[EXECMEM_KPROBES] = {
+			.alignment = 1,
+		},
 		[EXECMEM_MODULE_DATA] = {
 			.alignment = 1,
 		},
@@ -135,5 +138,13 @@ struct execmem_params __init *execmem_arch_params(void)
 
 	range->pgprot = prot;
 
+	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
+
+	if (strict_module_rwx_enabled())
+		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
+	else
+		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
+
 	return &execmem_params;
 }
-- 
2.39.2



* [PATCH v3 10/13] arch: make execmem setup available regardless of CONFIG_MODULES
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (8 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-26  7:33   ` Arnd Bergmann
  2023-09-18  7:29 ` [PATCH v3 11/13] x86/ftrace: enable dynamic ftrace without CONFIG_MODULES Mike Rapoport
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

execmem does not depend on modules; on the contrary, modules use
execmem.

To make execmem available when CONFIG_MODULES=n, for instance for
kprobes, split execmem_params initialization out from
arch/*/kernel/module.c and compile it when CONFIG_EXECMEM=y.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/arm/kernel/module.c       |  38 ----------
 arch/arm/mm/init.c             |  38 ++++++++++
 arch/arm64/kernel/module.c     | 130 --------------------------------
 arch/arm64/mm/init.c           | 132 +++++++++++++++++++++++++++++++++
 arch/loongarch/kernel/module.c |  18 -----
 arch/loongarch/mm/init.c       |  20 +++++
 arch/mips/kernel/module.c      |  19 -----
 arch/mips/mm/init.c            |  20 +++++
 arch/parisc/kernel/module.c    |  17 -----
 arch/parisc/mm/init.c          |  22 +++++-
 arch/powerpc/kernel/module.c   |  60 ---------------
 arch/powerpc/mm/mem.c          |  62 ++++++++++++++++
 arch/riscv/kernel/module.c     |  39 ----------
 arch/riscv/mm/init.c           |  39 ++++++++++
 arch/s390/kernel/module.c      |  25 -------
 arch/s390/mm/init.c            |  28 +++++++
 arch/sparc/kernel/module.c     |  23 ------
 arch/sparc/mm/Makefile         |   2 +
 arch/sparc/mm/execmem.c        |  25 +++++++
 arch/x86/kernel/module.c       |  27 -------
 arch/x86/mm/init.c             |  29 ++++++++
 21 files changed, 416 insertions(+), 397 deletions(-)
 create mode 100644 arch/sparc/mm/execmem.c

diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 2c7651a2d84c..3282f304f6b1 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -16,50 +16,12 @@
 #include <linux/fs.h>
 #include <linux/string.h>
 #include <linux/gfp.h>
-#include <linux/execmem.h>
 
 #include <asm/sections.h>
 #include <asm/smp_plat.h>
 #include <asm/unwind.h>
 #include <asm/opcodes.h>
 
-#ifdef CONFIG_XIP_KERNEL
-/*
- * The XIP kernel text is mapped in the module area for modules and
- * some other stuff to work without any indirect relocations.
- * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
- * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
- */
-#undef MODULES_VADDR
-#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
-#endif
-
-#ifdef CONFIG_MMU
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.start = MODULES_VADDR,
-			.end = MODULES_END,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
-
-	r->pgprot = PAGE_KERNEL_EXEC;
-
-	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS)) {
-		r->fallback_start = VMALLOC_START;
-		r->fallback_end = VMALLOC_END;
-	}
-
-	return &execmem_params;
-}
-#endif
-
 bool module_init_section(const char *name)
 {
 	return strstarts(name, ".init") ||
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index a42e4cd11db2..c0b536e398b4 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -22,6 +22,7 @@
 #include <linux/sizes.h>
 #include <linux/stop_machine.h>
 #include <linux/swiotlb.h>
+#include <linux/execmem.h>
 
 #include <asm/cp15.h>
 #include <asm/mach-types.h>
@@ -486,3 +487,40 @@ void free_initrd_mem(unsigned long start, unsigned long end)
 	free_reserved_area((void *)start, (void *)end, -1, "initrd");
 }
 #endif
+
+#ifdef CONFIG_XIP_KERNEL
+/*
+ * The XIP kernel text is mapped in the module area for modules and
+ * some other stuff to work without any indirect relocations.
+ * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
+ * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
+ */
+#undef MODULES_VADDR
+#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
+#endif
+
+#if defined(CONFIG_MMU) && defined(CONFIG_EXECMEM)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.start = MODULES_VADDR,
+			.end = MODULES_END,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
+
+	r->pgprot = PAGE_KERNEL_EXEC;
+
+	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS)) {
+		r->fallback_start = VMALLOC_START;
+		r->fallback_end = VMALLOC_END;
+	}
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index d27db168d2a2..eb1505128b75 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -20,142 +20,12 @@
 #include <linux/random.h>
 #include <linux/scs.h>
 #include <linux/vmalloc.h>
-#include <linux/execmem.h>
 
 #include <asm/alternative.h>
 #include <asm/insn.h>
 #include <asm/scs.h>
 #include <asm/sections.h>
 
-static u64 module_direct_base __ro_after_init = 0;
-static u64 module_plt_base __ro_after_init = 0;
-
-/*
- * Choose a random page-aligned base address for a window of 'size' bytes which
- * entirely contains the interval [start, end - 1].
- */
-static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
-{
-	u64 max_pgoff, pgoff;
-
-	if ((end - start) >= size)
-		return 0;
-
-	max_pgoff = (size - (end - start)) / PAGE_SIZE;
-	pgoff = get_random_u32_inclusive(0, max_pgoff);
-
-	return start - pgoff * PAGE_SIZE;
-}
-
-/*
- * Modules may directly reference data and text anywhere within the kernel
- * image and other modules. References using PREL32 relocations have a +/-2G
- * range, and so we need to ensure that the entire kernel image and all modules
- * fall within a 2G window such that these are always within range.
- *
- * Modules may directly branch to functions and code within the kernel text,
- * and to functions and code within other modules. These branches will use
- * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
- * that the entire kernel text and all module text falls within a 128M window
- * such that these are always within range. With PLTs, we can expand this to a
- * 2G window.
- *
- * We chose the 128M region to surround the entire kernel image (rather than
- * just the text) as using the same bounds for the 128M and 2G regions ensures
- * by construction that we never select a 128M region that is not a subset of
- * the 2G region. For very large and unusual kernel configurations this means
- * we may fall back to PLTs where they could have been avoided, but this keeps
- * the logic significantly simpler.
- */
-static int __init module_init_limits(void)
-{
-	u64 kernel_end = (u64)_end;
-	u64 kernel_start = (u64)_text;
-	u64 kernel_size = kernel_end - kernel_start;
-
-	/*
-	 * The default modules region is placed immediately below the kernel
-	 * image, and is large enough to use the full 2G relocation range.
-	 */
-	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
-	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
-
-	if (!kaslr_enabled()) {
-		if (kernel_size < SZ_128M)
-			module_direct_base = kernel_end - SZ_128M;
-		if (kernel_size < SZ_2G)
-			module_plt_base = kernel_end - SZ_2G;
-	} else {
-		u64 min = kernel_start;
-		u64 max = kernel_end;
-
-		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
-			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
-		} else {
-			module_direct_base = random_bounding_box(SZ_128M, min, max);
-			if (module_direct_base) {
-				min = module_direct_base;
-				max = module_direct_base + SZ_128M;
-			}
-		}
-
-		module_plt_base = random_bounding_box(SZ_2G, min, max);
-	}
-
-	pr_info("%llu pages in range for non-PLT usage",
-		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
-	pr_info("%llu pages in range for PLT usage",
-		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
-
-	return 0;
-}
-
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-		},
-		[EXECMEM_KPROBES] = {
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-			.alignment = 1,
-		},
-		[EXECMEM_BPF] = {
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
-
-	module_init_limits();
-
-	r->pgprot = PAGE_KERNEL;
-
-	if (module_direct_base) {
-		r->start = module_direct_base;
-		r->end = module_direct_base + SZ_128M;
-
-		if (module_plt_base) {
-			r->fallback_start = module_plt_base;
-			r->fallback_end = module_plt_base + SZ_2G;
-		}
-	} else if (module_plt_base) {
-		r->start = module_plt_base;
-		r->end = module_plt_base + SZ_2G;
-	}
-
-	execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
-	execmem_params.ranges[EXECMEM_BPF].pgprot = PAGE_KERNEL;
-
-	return &execmem_params;
-}
-
 enum aarch64_reloc_op {
 	RELOC_OP_NONE,
 	RELOC_OP_ABS,
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8a0f8604348b..9b7716b4d84c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -31,6 +31,7 @@
 #include <linux/hugetlb.h>
 #include <linux/acpi_iort.h>
 #include <linux/kmemleak.h>
+#include <linux/execmem.h>
 
 #include <asm/boot.h>
 #include <asm/fixmap.h>
@@ -547,3 +548,134 @@ void dump_mem_limit(void)
 		pr_emerg("Memory Limit: none\n");
 	}
 }
+
+#ifdef CONFIG_EXECMEM
+static u64 module_direct_base __ro_after_init = 0;
+static u64 module_plt_base __ro_after_init = 0;
+
+/*
+ * Choose a random page-aligned base address for a window of 'size' bytes which
+ * entirely contains the interval [start, end - 1].
+ */
+static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
+{
+	u64 max_pgoff, pgoff;
+
+	if ((end - start) >= size)
+		return 0;
+
+	max_pgoff = (size - (end - start)) / PAGE_SIZE;
+	pgoff = get_random_u32_inclusive(0, max_pgoff);
+
+	return start - pgoff * PAGE_SIZE;
+}
+
+/*
+ * Modules may directly reference data and text anywhere within the kernel
+ * image and other modules. References using PREL32 relocations have a +/-2G
+ * range, and so we need to ensure that the entire kernel image and all modules
+ * fall within a 2G window such that these are always within range.
+ *
+ * Modules may directly branch to functions and code within the kernel text,
+ * and to functions and code within other modules. These branches will use
+ * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
+ * that the entire kernel text and all module text falls within a 128M window
+ * such that these are always within range. With PLTs, we can expand this to a
+ * 2G window.
+ *
+ * We chose the 128M region to surround the entire kernel image (rather than
+ * just the text) as using the same bounds for the 128M and 2G regions ensures
+ * by construction that we never select a 128M region that is not a subset of
+ * the 2G region. For very large and unusual kernel configurations this means
+ * we may fall back to PLTs where they could have been avoided, but this keeps
+ * the logic significantly simpler.
+ */
+static int __init module_init_limits(void)
+{
+	u64 kernel_end = (u64)_end;
+	u64 kernel_start = (u64)_text;
+	u64 kernel_size = kernel_end - kernel_start;
+
+	/*
+	 * The default modules region is placed immediately below the kernel
+	 * image, and is large enough to use the full 2G relocation range.
+	 */
+	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
+	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
+
+	if (!kaslr_enabled()) {
+		if (kernel_size < SZ_128M)
+			module_direct_base = kernel_end - SZ_128M;
+		if (kernel_size < SZ_2G)
+			module_plt_base = kernel_end - SZ_2G;
+	} else {
+		u64 min = kernel_start;
+		u64 max = kernel_end;
+
+		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
+			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
+		} else {
+			module_direct_base = random_bounding_box(SZ_128M, min, max);
+			if (module_direct_base) {
+				min = module_direct_base;
+				max = module_direct_base + SZ_128M;
+			}
+		}
+
+		module_plt_base = random_bounding_box(SZ_2G, min, max);
+	}
+
+	pr_info("%llu pages in range for non-PLT usage",
+		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
+	pr_info("%llu pages in range for PLT usage",
+		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
+
+	return 0;
+}
+
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+		},
+		[EXECMEM_KPROBES] = {
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+			.alignment = 1,
+		},
+		[EXECMEM_BPF] = {
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
+
+	module_init_limits();
+
+	r->pgprot = PAGE_KERNEL;
+
+	if (module_direct_base) {
+		r->start = module_direct_base;
+		r->end = module_direct_base + SZ_128M;
+
+		if (module_plt_base) {
+			r->fallback_start = module_plt_base;
+			r->fallback_end = module_plt_base + SZ_2G;
+		}
+	} else if (module_plt_base) {
+		r->start = module_plt_base;
+		r->end = module_plt_base + SZ_2G;
+	}
+
+	execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
+	execmem_params.ranges[EXECMEM_BPF].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index a1d8fe9796fa..181b5f8b09f1 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -18,7 +18,6 @@
 #include <linux/ftrace.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
-#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/inst.h>
 
@@ -470,23 +469,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.pgprot = PAGE_KERNEL,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-
-	return &execmem_params;
-}
-
 static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
 				   const Elf_Shdr *sechdrs, struct module *mod)
 {
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index f3fe8c06ba4d..26b10a51309c 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -24,6 +24,7 @@
 #include <linux/gfp.h>
 #include <linux/hugetlb.h>
 #include <linux/mmzone.h>
+#include <linux/execmem.h>
 
 #include <asm/asm-offsets.h>
 #include <asm/bootinfo.h>
@@ -247,3 +248,22 @@ EXPORT_SYMBOL(invalid_pmd_table);
 #endif
 pte_t invalid_pte_table[PTRS_PER_PTE] __page_aligned_bss;
 EXPORT_SYMBOL(invalid_pte_table);
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/mips/kernel/module.c b/arch/mips/kernel/module.c
index 1c959074b35f..ebf9496f5db0 100644
--- a/arch/mips/kernel/module.c
+++ b/arch/mips/kernel/module.c
@@ -33,25 +33,6 @@ struct mips_hi16 {
 static LIST_HEAD(dbe_list);
 static DEFINE_SPINLOCK(dbe_lock);
 
-#ifdef MODULE_START
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.start = MODULE_START,
-			.end = MODULE_END,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
-
-	return &execmem_params;
-}
-#endif
-
 static void apply_r_mips_32(u32 *location, u32 base, Elf_Addr v)
 {
 	*location = base + v;
diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 5dcb525a8995..55e7869d03f2 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -31,6 +31,7 @@
 #include <linux/gfp.h>
 #include <linux/kcore.h>
 #include <linux/initrd.h>
+#include <linux/execmem.h>
 
 #include <asm/bootinfo.h>
 #include <asm/cachectl.h>
@@ -573,3 +574,22 @@ EXPORT_SYMBOL_GPL(invalid_pmd_table);
 #endif
 pte_t invalid_pte_table[PTRS_PER_PTE] __page_aligned_bss;
 EXPORT_SYMBOL(invalid_pte_table);
+
+#if defined(CONFIG_EXECMEM) && defined(MODULE_START)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.start = MODULE_START,
+			.end = MODULE_END,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
index 0c6dfd1daef3..fecd2760b7a6 100644
--- a/arch/parisc/kernel/module.c
+++ b/arch/parisc/kernel/module.c
@@ -174,23 +174,6 @@ static inline int reassemble_22(int as22)
 		((as22 & 0x0003ff) << 3));
 }
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.pgprot = PAGE_KERNEL_RWX,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
-
-	return &execmem_params;
-}
-
 #ifndef CONFIG_64BIT
 static inline unsigned long count_gots(const Elf_Rela *rela, unsigned long n)
 {
diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index a088c243edea..c87fed38e38e 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -24,6 +24,7 @@
 #include <linux/nodemask.h>	/* for node_online_map */
 #include <linux/pagemap.h>	/* for release_pages */
 #include <linux/compat.h>
+#include <linux/execmem.h>
 
 #include <asm/pgalloc.h>
 #include <asm/tlb.h>
@@ -479,7 +480,7 @@ void free_initmem(void)
 	/* finally dump all the instructions which were cached, since the
 	 * pages are no-longer executable */
 	flush_icache_range(init_begin, init_end);
-	
+
 	free_initmem_default(POISON_FREE_INITMEM);
 
 	/* set up a new led state on systems shipped LED State panel */
@@ -919,3 +920,22 @@ static const pgprot_t protection_map[16] = {
 	[VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]	= PAGE_RWX
 };
 DECLARE_VM_GET_PAGE_PROT
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL_RWX,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index bf2c62aef628..b30e00964a60 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -10,7 +10,6 @@
 #include <linux/vmalloc.h>
 #include <linux/mm.h>
 #include <linux/bug.h>
-#include <linux/execmem.h>
 #include <asm/module.h>
 #include <linux/uaccess.h>
 #include <asm/firmware.h>
@@ -89,62 +88,3 @@ int module_finalize(const Elf_Ehdr *hdr,
 
 	return 0;
 }
-
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.alignment = 1,
-		},
-		[EXECMEM_KPROBES] = {
-			.alignment = 1,
-		},
-		[EXECMEM_MODULE_DATA] = {
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
-	struct execmem_range *range = &execmem_params.ranges[EXECMEM_DEFAULT];
-
-	/*
-	 * BOOK3S_32 and 8xx define MODULES_VADDR for text allocations and
-	 * allow allocating data in the entire vmalloc space
-	 */
-#ifdef MODULES_VADDR
-	struct execmem_range *data = &execmem_params.ranges[EXECMEM_MODULE_DATA];
-	unsigned long limit = (unsigned long)_etext - SZ_32M;
-
-	/* First try within 32M limit from _etext to avoid branch trampolines */
-	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit) {
-		range->start = limit;
-		range->end = MODULES_END;
-		range->fallback_start = MODULES_VADDR;
-		range->fallback_end = MODULES_END;
-	} else {
-		range->start = MODULES_VADDR;
-		range->end = MODULES_END;
-	}
-	data->start = VMALLOC_START;
-	data->end = VMALLOC_END;
-	data->pgprot = PAGE_KERNEL;
-	data->alignment = 1;
-#else
-	range->start = VMALLOC_START;
-	range->end = VMALLOC_END;
-#endif
-
-	range->pgprot = prot;
-
-	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
-	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_END;
-
-	if (strict_module_rwx_enabled())
-		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
-	else
-		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
-
-	return &execmem_params;
-}
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 8b121df7b08f..06f4bb6fb780 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -16,6 +16,7 @@
 #include <linux/highmem.h>
 #include <linux/suspend.h>
 #include <linux/dma-direct.h>
+#include <linux/execmem.h>
 
 #include <asm/swiotlb.h>
 #include <asm/machdep.h>
@@ -406,3 +407,64 @@ int devmem_is_allowed(unsigned long pfn)
  * the EHEA driver. Drop this when drivers/net/ethernet/ibm/ehea is removed.
  */
 EXPORT_SYMBOL_GPL(walk_system_ram_range);
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.alignment = 1,
+		},
+		[EXECMEM_KPROBES] = {
+			.alignment = 1,
+		},
+		[EXECMEM_MODULE_DATA] = {
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
+	struct execmem_range *range = &execmem_params.ranges[EXECMEM_DEFAULT];
+
+	/*
+	 * BOOK3S_32 and 8xx define MODULES_VADDR for text allocations and
+	 * allow allocating data in the entire vmalloc space
+	 */
+#ifdef MODULES_VADDR
+	struct execmem_range *data = &execmem_params.ranges[EXECMEM_MODULE_DATA];
+	unsigned long limit = (unsigned long)_etext - SZ_32M;
+
+	/* First try within 32M limit from _etext to avoid branch trampolines */
+	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit) {
+		range->start = limit;
+		range->end = MODULES_END;
+		range->fallback_start = MODULES_VADDR;
+		range->fallback_end = MODULES_END;
+	} else {
+		range->start = MODULES_VADDR;
+		range->end = MODULES_END;
+	}
+	data->start = VMALLOC_START;
+	data->end = VMALLOC_END;
+	data->pgprot = PAGE_KERNEL;
+	data->alignment = 1;
+#else
+	range->start = VMALLOC_START;
+	range->end = VMALLOC_END;
+#endif
+
+	range->pgprot = prot;
+
+	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
+
+	if (strict_module_rwx_enabled())
+		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
+	else
+		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 31505ecb5c72..8af08d5449bf 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -11,7 +11,6 @@
 #include <linux/vmalloc.h>
 #include <linux/sizes.h>
 #include <linux/pgtable.h>
-#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/sections.h>
 
@@ -436,44 +435,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }
 
-#ifdef CONFIG_MMU
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.pgprot = PAGE_KERNEL,
-			.alignment = 1,
-		},
-		[EXECMEM_KPROBES] = {
-			.pgprot = PAGE_KERNEL_READ_EXEC,
-			.alignment = 1,
-		},
-		[EXECMEM_BPF] = {
-			.pgprot = PAGE_KERNEL,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-#ifdef CONFIG_64BIT
-	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-#else
-	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
-#endif
-
-	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
-	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
-
-	execmem_params.ranges[EXECMEM_BPF].start = BPF_JIT_REGION_START;
-	execmem_params.ranges[EXECMEM_BPF].end = BPF_JIT_REGION_END;
-
-	return &execmem_params;
-}
-#endif
-
 int module_finalize(const Elf_Ehdr *hdr,
 		    const Elf_Shdr *sechdrs,
 		    struct module *me)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 0798bd861dcb..b0f7848f39e3 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -24,6 +24,7 @@
 #include <linux/elf.h>
 #endif
 #include <linux/kfence.h>
+#include <linux/execmem.h>
 
 #include <asm/fixmap.h>
 #include <asm/io.h>
@@ -1564,3 +1565,41 @@ void __init pgtable_cache_init(void)
 		preallocate_pgd_pages_range(MODULES_VADDR, MODULES_END, "bpf/modules");
 }
 #endif
+
+#if defined(CONFIG_MMU) && defined(CONFIG_EXECMEM)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+		[EXECMEM_KPROBES] = {
+			.pgprot = PAGE_KERNEL_READ_EXEC,
+			.alignment = 1,
+		},
+		[EXECMEM_BPF] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+#ifdef CONFIG_64BIT
+	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+#else
+	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+#endif
+
+	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
+
+	execmem_params.ranges[EXECMEM_BPF].start = BPF_JIT_REGION_START;
+	execmem_params.ranges[EXECMEM_BPF].end = BPF_JIT_REGION_END;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index 538d5f24af66..81a8d92ca092 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -37,31 +37,6 @@
 
 #define PLT_ENTRY_SIZE 22
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-			.pgprot = PAGE_KERNEL,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	unsigned long module_load_offset = 0;
-	unsigned long start;
-
-	if (kaslr_enabled())
-		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
-
-	start = MODULES_VADDR + module_load_offset;
-	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-
-	return &execmem_params;
-}
-
 #ifdef CONFIG_FUNCTION_TRACER
 void module_arch_cleanup(struct module *mod)
 {
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 8b94d2212d33..2e6d6512fc5f 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -34,6 +34,7 @@
 #include <linux/percpu.h>
 #include <asm/processor.h>
 #include <linux/uaccess.h>
+#include <linux/execmem.h>
 #include <asm/pgalloc.h>
 #include <asm/kfence.h>
 #include <asm/ptdump.h>
@@ -311,3 +312,30 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 	vmem_remove_mapping(start, size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+			.pgprot = PAGE_KERNEL,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	unsigned long module_load_offset = 0;
+	unsigned long start;
+
+	if (kaslr_enabled())
+		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
+
+	start = MODULES_VADDR + module_load_offset;
+	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
index 1d8d1fba95b9..dff1d85ba202 100644
--- a/arch/sparc/kernel/module.c
+++ b/arch/sparc/kernel/module.c
@@ -14,7 +14,6 @@
 #include <linux/string.h>
 #include <linux/ctype.h>
 #include <linux/mm.h>
-#include <linux/execmem.h>
 #ifdef CONFIG_SPARC64
 #include <linux/jump_label.h>
 #endif
@@ -25,28 +24,6 @@
 
 #include "entry.h"
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-#ifdef CONFIG_SPARC64
-			.start = MODULES_VADDR,
-			.end = MODULES_END,
-#else
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-#endif
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
-
-	return &execmem_params;
-}
-
 /* Make generic code ignore STT_REGISTER dummy undefined symbols.  */
 int module_frob_arch_sections(Elf_Ehdr *hdr,
 			      Elf_Shdr *sechdrs,
diff --git a/arch/sparc/mm/Makefile b/arch/sparc/mm/Makefile
index 871354aa3c00..87e2cf7efb5b 100644
--- a/arch/sparc/mm/Makefile
+++ b/arch/sparc/mm/Makefile
@@ -15,3 +15,5 @@ obj-$(CONFIG_SPARC32)   += leon_mm.o
 
 # Only used by sparc64
 obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
+
+obj-$(CONFIG_EXECMEM) += execmem.o
diff --git a/arch/sparc/mm/execmem.c b/arch/sparc/mm/execmem.c
new file mode 100644
index 000000000000..fb53a859869a
--- /dev/null
+++ b/arch/sparc/mm/execmem.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/mm.h>
+#include <linux/execmem.h>
+
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+#ifdef CONFIG_SPARC64
+			.start = MODULES_VADDR,
+			.end = MODULES_END,
+#else
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+#endif
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
+}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 9d37375e2f05..c52d591c0f3f 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -19,7 +19,6 @@
 #include <linux/jump_label.h>
 #include <linux/random.h>
 #include <linux/memory.h>
-#include <linux/execmem.h>
 
 #include <asm/text-patching.h>
 #include <asm/page.h>
@@ -37,32 +36,6 @@ do {							\
 } while (0)
 #endif
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	unsigned long module_load_offset = 0;
-	unsigned long start;
-
-	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled())
-		module_load_offset =
-			get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
-
-	start = MODULES_VADDR + module_load_offset;
-	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
-
-	return &execmem_params;
-}
-
 #ifdef CONFIG_X86_32
 int apply_relocate(Elf32_Shdr *sechdrs,
 		   const char *strtab,
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 679893ea5e68..022af7ab50f9 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -7,6 +7,7 @@
 #include <linux/swapops.h>
 #include <linux/kmemleak.h>
 #include <linux/sched/task.h>
+#include <linux/execmem.h>
 
 #include <asm/set_memory.h>
 #include <asm/cpu_device_id.h>
@@ -1099,3 +1100,31 @@ unsigned long arch_max_swapfile_size(void)
 	return pages;
 }
 #endif
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	unsigned long module_load_offset = 0;
+	unsigned long start;
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled())
+		module_load_offset =
+			get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
+
+	start = MODULES_VADDR + module_load_offset;
+	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
+}
+#endif /* CONFIG_EXECMEM */
-- 
2.39.2
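
For reference, the window arithmetic of random_bounding_box() in the arm64
hunk of this patch can be modelled in userspace. This is only a sketch, not
kernel code: SKETCH_PAGE_SIZE assumes 4K pages, bounding_box_base() and
window_contains() are hypothetical names, and the 'pgoff' argument stands in
for the value the kernel draws with get_random_u32_inclusive(0, max_pgoff).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4K pages */

/*
 * Model of random_bounding_box(): return the base of a window of 'size'
 * bytes that still covers [start, end), shifted down by 'pgoff' pages.
 * Returns 0 when the interval does not fit into the window at all.
 */
static uint64_t bounding_box_base(uint64_t size, uint64_t start, uint64_t end,
				  uint64_t pgoff)
{
	uint64_t max_pgoff;

	if ((end - start) >= size)
		return 0;

	/* Slack left in the window, expressed in pages. */
	max_pgoff = (size - (end - start)) / SKETCH_PAGE_SIZE;
	assert(pgoff <= max_pgoff);

	return start - pgoff * SKETCH_PAGE_SIZE;
}

/* True when the window [base, base + size) covers [start, end). */
static int window_contains(uint64_t base, uint64_t size,
			   uint64_t start, uint64_t end)
{
	return base <= start && end <= base + size;
}
```

With a 32M image and a 128M window there are 96M of slack, so any page
offset from 0 to 24576 keeps the whole image inside the window; the largest
offset places the image at the very top of the window.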



* [PATCH v3 11/13] x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (9 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 10/13] arch: make execmem setup available regardless of CONFIG_MODULES Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 12/13] kprobes: remove dependency on CONFIG_MODULES Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 13/13] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of Mike Rapoport
  12 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

Dynamic ftrace must allocate memory for code and this was impossible
without CONFIG_MODULES.

With execmem separated from the modules code, execmem_text_alloc() is
available regardless of CONFIG_MODULES.

Remove the dependency of dynamic ftrace on CONFIG_MODULES and make X86_64
select CONFIG_EXECMEM when CONFIG_DYNAMIC_FTRACE is enabled.
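
To illustrate the effect on the trampoline allocator, here is a minimal
userspace sketch, not kernel code. The sketch_* helpers are hypothetical
stand-ins (malloc()/free() instead of real executable memory) and the
'have_modules' argument models whether CONFIG_MODULES was set:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel's execmem helpers. */
enum sketch_execmem_type { SKETCH_EXECMEM_FTRACE };

static void *sketch_execmem_text_alloc(enum sketch_execmem_type type,
				       size_t size)
{
	(void)type;
	assert(size > 0);
	return malloc(size);	/* the kernel would hand out executable memory */
}

static void sketch_execmem_free(void *ptr)
{
	free(ptr);
}

/* Old shape: without module support the allocator was a stub. */
static void *alloc_tramp_before(int have_modules, size_t size)
{
	if (!have_modules)
		return NULL;
	return sketch_execmem_text_alloc(SKETCH_EXECMEM_FTRACE, size);
}

/* New shape: the allocation no longer depends on module support. */
static void *alloc_tramp_after(size_t size)
{
	return sketch_execmem_text_alloc(SKETCH_EXECMEM_FTRACE, size);
}
```

In this model alloc_tramp_before(0, size) always yields NULL, which is the
case that made dynamic trampolines unavailable in non-modular kernels.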

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/x86/Kconfig         |  1 +
 arch/x86/kernel/ftrace.c | 10 ----------
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 982b777eadc7..cc7c4a0a8c16 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -35,6 +35,7 @@ config X86_64
 	select SWIOTLB
 	select ARCH_HAS_ELFCORE_COMPAT
 	select ZONE_DMA32
+	select EXECMEM if DYNAMIC_FTRACE
 
 config FORCE_DYNAMIC_FTRACE
 	def_bool y
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index ae56d79a6a74..7ed7e8297ba3 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -261,8 +261,6 @@ void arch_ftrace_update_code(int command)
 /* Currently only x86_64 supports dynamic trampolines */
 #ifdef CONFIG_X86_64
 
-#ifdef CONFIG_MODULES
-/* Module allocation simplifies allocating memory for code */
 static inline void *alloc_tramp(unsigned long size)
 {
 	return execmem_text_alloc(EXECMEM_FTRACE, size);
@@ -271,14 +269,6 @@ static inline void tramp_free(void *tramp)
 {
 	execmem_free(tramp);
 }
-#else
-/* Trampolines can only be created if modules are supported */
-static inline void *alloc_tramp(unsigned long size)
-{
-	return NULL;
-}
-static inline void tramp_free(void *tramp) { }
-#endif
 
 /* Defined as markers to the end of the ftrace default trampolines */
 extern void ftrace_regs_caller_end(void);
-- 
2.39.2



* [PATCH v3 12/13] kprobes: remove dependency on CONFIG_MODULES
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (10 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 11/13] x86/ftrace: enable dynamic ftrace without CONFIG_MODULES Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  2023-09-18  7:29 ` [PATCH v3 13/13] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of Mike Rapoport
  12 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

kprobes depended on CONFIG_MODULES because it has to allocate memory for
code.

Since code allocations are now implemented with execmem, kprobes can be
enabled in non-modular kernels.

Add #ifdef CONFIG_MODULES guards for the code dealing with kprobes inside
modules, make CONFIG_KPROBES select CONFIG_EXECMEM and drop the
dependency of CONFIG_KPROBES on CONFIG_MODULES.
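
The guard pattern used for the module-specific helpers can be sketched in
userspace as follows. This is an illustration only: SKETCH_CONFIG_MODULES,
the module names and all sketch_* functions are hypothetical and merely
mirror the shape of the #ifdef/#else stubs in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Build with -DSKETCH_CONFIG_MODULES to model a modular kernel. */
#ifdef SKETCH_CONFIG_MODULES
static const char *const sketch_loaded_modules[] = { "ext4", "nfs" };

static bool sketch_module_exists(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_loaded_modules) /
			sizeof(sketch_loaded_modules[0]); i++) {
		if (strcmp(sketch_loaded_modules[i], name) == 0)
			return true;
	}
	return false;
}
#else
/* Without module support no module can ever be loaded. */
static bool sketch_module_exists(const char *name)
{
	(void)name;
	return false;
}
#endif

/* Callers stay free of #ifdefs; a NULL name means the core kernel image. */
static bool sketch_probe_target_available(const char *module_name)
{
	assert(module_name == NULL || module_name[0] != '\0');

	if (!module_name)
		return true;
	return sketch_module_exists(module_name);
}
```

Keeping the stub behind the same function name means the callers compile
unchanged in both configurations; only the helper body differs.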

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/Kconfig                |  2 +-
 kernel/kprobes.c            | 43 +++++++++++++++++++++----------------
 kernel/trace/trace_kprobe.c | 11 ++++++++++
 3 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 12d51495caec..c52a600b63ca 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -52,9 +52,9 @@ config GENERIC_ENTRY
 
 config KPROBES
 	bool "Kprobes"
-	depends on MODULES
 	depends on HAVE_KPROBES
 	select KALLSYMS
+	select EXECMEM
 	select TASKS_RCU if PREEMPTION
 	help
 	  Kprobes allows you to trap at almost any kernel address and
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 0ccb4d2ec9a2..c95d0088f966 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1580,6 +1580,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
 		goto out;
 	}
 
+#ifdef CONFIG_MODULES
 	/* Check if 'p' is probing a module. */
 	*probed_mod = __module_text_address((unsigned long) p->addr);
 	if (*probed_mod) {
@@ -1603,6 +1604,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
 			ret = -ENOENT;
 		}
 	}
+#endif
+
 out:
 	preempt_enable();
 	jump_label_unlock();
@@ -2495,24 +2498,6 @@ int kprobe_add_area_blacklist(unsigned long start, unsigned long end)
 	return 0;
 }
 
-/* Remove all symbols in given area from kprobe blacklist */
-static void kprobe_remove_area_blacklist(unsigned long start, unsigned long end)
-{
-	struct kprobe_blacklist_entry *ent, *n;
-
-	list_for_each_entry_safe(ent, n, &kprobe_blacklist, list) {
-		if (ent->start_addr < start || ent->start_addr >= end)
-			continue;
-		list_del(&ent->list);
-		kfree(ent);
-	}
-}
-
-static void kprobe_remove_ksym_blacklist(unsigned long entry)
-{
-	kprobe_remove_area_blacklist(entry, entry + 1);
-}
-
 int __weak arch_kprobe_get_kallsym(unsigned int *symnum, unsigned long *value,
 				   char *type, char *sym)
 {
@@ -2577,6 +2562,25 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
 	return ret ? : arch_populate_kprobe_blacklist();
 }
 
+#ifdef CONFIG_MODULES
+/* Remove all symbols in given area from kprobe blacklist */
+static void kprobe_remove_area_blacklist(unsigned long start, unsigned long end)
+{
+	struct kprobe_blacklist_entry *ent, *n;
+
+	list_for_each_entry_safe(ent, n, &kprobe_blacklist, list) {
+		if (ent->start_addr < start || ent->start_addr >= end)
+			continue;
+		list_del(&ent->list);
+		kfree(ent);
+	}
+}
+
+static void kprobe_remove_ksym_blacklist(unsigned long entry)
+{
+	kprobe_remove_area_blacklist(entry, entry + 1);
+}
+
 static void add_module_kprobe_blacklist(struct module *mod)
 {
 	unsigned long start, end;
@@ -2678,6 +2682,7 @@ static struct notifier_block kprobe_module_nb = {
 	.notifier_call = kprobes_module_callback,
 	.priority = 0
 };
+#endif
 
 void kprobe_free_init_mem(void)
 {
@@ -2737,8 +2742,10 @@ static int __init init_kprobes(void)
 	err = arch_init_kprobes();
 	if (!err)
 		err = register_die_notifier(&kprobe_exceptions_nb);
+#ifdef CONFIG_MODULES
 	if (!err)
 		err = register_module_notifier(&kprobe_module_nb);
+#endif
 
 	kprobes_initialized = (err == 0);
 	kprobe_sysctls_init();
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 3d7a180a8427..25a5293a80c0 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -111,6 +111,7 @@ static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
 	return strncmp(module_name(mod), name, len) == 0 && name[len] == ':';
 }
 
+#ifdef CONFIG_MODULES
 static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
 {
 	char *p;
@@ -129,6 +130,12 @@ static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
 
 	return ret;
 }
+#else
+static inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
+{
+	return false;
+}
+#endif
 
 static bool trace_kprobe_is_busy(struct dyn_event *ev)
 {
@@ -670,6 +677,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
 	return ret;
 }
 
+#ifdef CONFIG_MODULES
 /* Module notifier call back, checking event on the module */
 static int trace_kprobe_module_callback(struct notifier_block *nb,
 				       unsigned long val, void *data)
@@ -704,6 +712,7 @@ static struct notifier_block trace_kprobe_module_nb = {
 	.notifier_call = trace_kprobe_module_callback,
 	.priority = 1	/* Invoked after kprobe module callback */
 };
+#endif
 
 static int __trace_kprobe_create(int argc, const char *argv[])
 {
@@ -1810,8 +1819,10 @@ static __init int init_kprobe_trace_early(void)
 	if (ret)
 		return ret;
 
+#ifdef CONFIG_MODULES
 	if (register_module_notifier(&trace_kprobe_module_nb))
 		return -EINVAL;
+#endif
 
 	return 0;
 }
-- 
2.39.2



* [PATCH v3 13/13] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of
  2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
                   ` (11 preceding siblings ...)
  2023-09-18  7:29 ` [PATCH v3 12/13] kprobes: remove dependency on CONFIG_MODULES Mike Rapoport
@ 2023-09-18  7:29 ` Mike Rapoport
  12 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-18  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	linux-modules, Andrew Morton, Rick Edgecombe, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

The BPF just-in-time compiler depended on CONFIG_MODULES because it used
module_alloc() to allocate memory for the generated code.

Since code allocations are now implemented with execmem, drop the dependency
of CONFIG_BPF_JIT on CONFIG_MODULES and make it select CONFIG_EXECMEM.

Suggested-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 kernel/bpf/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
index 6a906ff93006..5be11c906f93 100644
--- a/kernel/bpf/Kconfig
+++ b/kernel/bpf/Kconfig
@@ -42,7 +42,7 @@ config BPF_JIT
 	bool "Enable BPF Just In Time compiler"
 	depends on BPF
 	depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
-	depends on MODULES
+	select EXECMEM
 	help
 	  BPF programs are normally handled by a BPF interpreter. This option
 	  allows the kernel to generate native code when a program is loaded
-- 
2.39.2



* Re: [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-18  7:29 ` [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free() Mike Rapoport
@ 2023-09-21 22:10   ` Song Liu
  2023-09-23 15:42     ` Mike Rapoport
  2023-09-21 22:14   ` Song Liu
  2023-09-21 22:34   ` Song Liu
  2 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2023-09-21 22:10 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Mon, Sep 18, 2023 at 12:30 AM Mike Rapoport <rppt@kernel.org> wrote:
>
[...]
> +
> +#include <linux/mm.h>
> +#include <linux/vmalloc.h>
> +#include <linux/execmem.h>
> +#include <linux/moduleloader.h>
> +
> +static void *execmem_alloc(size_t size)
> +{
> +       return module_alloc(size);
> +}
> +
> +void *execmem_text_alloc(enum execmem_type type, size_t size)
> +{
> +       return execmem_alloc(size);
> +}

execmem_text_alloc (and later execmem_data_alloc) both take "type" as
input. I guess we can just use execmem_alloc(type, size) for everything?
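
To make the suggestion concrete, here is a minimal userspace sketch of a
single entry point keyed only by type. It is not the kernel code: the
names execmem_alloc_model(), execmem_type_is_executable() and struct
execmem_range_model are hypothetical, malloc() stands in for the
range-constrained vmalloc call, and EXECMEM_MODULE_DATA is taken from the
later patch in this series.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: enum names mirror the series, bodies are stand-ins. */
enum execmem_type {
	EXECMEM_MODULE_TEXT,
	EXECMEM_KPROBES,
	EXECMEM_FTRACE,
	EXECMEM_BPF,
	EXECMEM_MODULE_DATA,
	EXECMEM_TYPE_MAX,
};

/* Hypothetical per-type parameters; the real struct also carries ranges and pgprot. */
struct execmem_range_model {
	size_t alignment;
	int executable;
};

static const struct execmem_range_model ranges[EXECMEM_TYPE_MAX] = {
	[EXECMEM_MODULE_TEXT] = { .alignment = 1, .executable = 1 },
	[EXECMEM_KPROBES]     = { .alignment = 1, .executable = 1 },
	[EXECMEM_FTRACE]      = { .alignment = 1, .executable = 1 },
	[EXECMEM_BPF]         = { .alignment = 1, .executable = 1 },
	[EXECMEM_MODULE_DATA] = { .alignment = 1, .executable = 0 },
};

/* Text versus data is a property of the type, not of the function called. */
int execmem_type_is_executable(enum execmem_type type)
{
	return (unsigned int)type < EXECMEM_TYPE_MAX && ranges[type].executable;
}

/* One allocator for every caller: the type alone selects the parameters. */
void *execmem_alloc_model(enum execmem_type type, size_t size)
{
	if ((unsigned int)type >= EXECMEM_TYPE_MAX || size == 0)
		return NULL;

	/* malloc() is a stand-in for the range-constrained allocation. */
	return malloc(size);
}
```

With this shape, a caller passes EXECMEM_BPF or EXECMEM_MODULE_DATA to the
same function and the table decides the rest.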

Thanks,
Song

> +
> +void execmem_free(void *ptr)
> +{
> +       /*
> +        * This memory may be RO, and freeing RO memory in an interrupt is not
> +        * supported by vmalloc.
> +        */
> +       WARN_ON(in_interrupt());
> +       vfree(ptr);
> +}
> --
> 2.39.2
>


* Re: [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-18  7:29 ` [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free() Mike Rapoport
  2023-09-21 22:10   ` Song Liu
@ 2023-09-21 22:14   ` Song Liu
  2023-09-23 15:40     ` Mike Rapoport
  2023-09-21 22:34   ` Song Liu
  2 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2023-09-21 22:14 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Mon, Sep 18, 2023 at 12:30 AM Mike Rapoport <rppt@kernel.org> wrote:
>
[...]
> +
> +/**
> + * enum execmem_type - types of executable memory ranges
> + *
> + * There are several subsystems that allocate executable memory.
> + * Architectures define different restrictions on placement,
> + * permissions, alignment and other parameters for memory that can be used
> + * by these subsystems.
> + * Types in this enum identify subsystems that allocate executable memory
> + * and let architectures define parameters for ranges suitable for
> + * allocations by each subsystem.
> + *
> + * @EXECMEM_DEFAULT: default parameters that would be used for types that
> + * are not explicitly defined.
> + * @EXECMEM_MODULE_TEXT: parameters for module text sections
> + * @EXECMEM_KPROBES: parameters for kprobes
> + * @EXECMEM_FTRACE: parameters for ftrace
> + * @EXECMEM_BPF: parameters for BPF
> + * @EXECMEM_TYPE_MAX:
> + */
> +enum execmem_type {
> +       EXECMEM_DEFAULT,

I found EXECMEM_DEFAULT more confusing than helpful.
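
As a small illustration of where the confusion can come from: in the enum
as posted, EXECMEM_MODULE_TEXT is an alias of EXECMEM_DEFAULT, so both
names index the same slot of any per-type array. The sketch below is
plain userspace C; range_owner[] and execmem_owner() are hypothetical
names used only for this demonstration.

```c
#include <stddef.h>

/* Same enumerators as in the quoted patch. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_MODULE_TEXT = EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_FTRACE,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

/* Hypothetical labels, indexed the same way a per-type range table would be. */
static const char *const range_owner[EXECMEM_TYPE_MAX] = {
	[EXECMEM_DEFAULT] = "default and module text (one shared slot)",
	[EXECMEM_KPROBES] = "kprobes",
	[EXECMEM_FTRACE]  = "ftrace",
	[EXECMEM_BPF]     = "bpf",
};

/* Returns the label for a type, or NULL if the type is out of range. */
const char *execmem_owner(enum execmem_type type)
{
	if ((unsigned int)type >= EXECMEM_TYPE_MAX)
		return NULL;
	return range_owner[type];
}
```

Because of the alias, changing the "default" parameters also changes the
module text parameters, and the table has four slots rather than five.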

Song

> +       EXECMEM_MODULE_TEXT = EXECMEM_DEFAULT,
> +       EXECMEM_KPROBES,
> +       EXECMEM_FTRACE,
> +       EXECMEM_BPF,
> +       EXECMEM_TYPE_MAX,
> +};
> +
[...]


* Re: [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations
  2023-09-18  7:29 ` [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations Mike Rapoport
@ 2023-09-21 22:30   ` Song Liu
  2023-09-23 16:25     ` Mike Rapoport
  2023-09-22 10:32   ` Christophe Leroy
  1 sibling, 1 reply; 49+ messages in thread
From: Song Liu @ 2023-09-21 22:30 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Mon, Sep 18, 2023 at 12:31 AM Mike Rapoport <rppt@kernel.org> wrote:
>
[...]
> @@ -135,5 +138,13 @@ struct execmem_params __init *execmem_arch_params(void)
>
>         range->pgprot = prot;
>
> +       execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
> +       execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_END;

.end = VMALLOC_END.
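
A minimal userspace sketch of why this matters: with .start assigned
twice, .end is never set. The struct, the helper names and the two
address constants below are placeholders for illustration only; the real
VMALLOC_START and VMALLOC_END are architecture-defined.

```c
#include <stdbool.h>

/* Placeholder addresses, not the real powerpc values. */
#define DEMO_VMALLOC_START 0xc1000000UL
#define DEMO_VMALLOC_END   0xff000000UL

/* Hypothetical stand-in for the start/end pair of struct execmem_range. */
struct execmem_range_model {
	unsigned long start;
	unsigned long end;
};

/* A range is usable only if it is non-empty and start precedes end. */
bool execmem_range_valid(const struct execmem_range_model *r)
{
	return r->start != 0 && r->start < r->end;
}

/* Mirrors the hunk as posted: the second store overwrites .start. */
struct execmem_range_model kprobes_range_as_posted(void)
{
	struct execmem_range_model r = { 0, 0 };

	r.start = DEMO_VMALLOC_START;
	r.start = DEMO_VMALLOC_END;	/* .end is left at 0 */
	return r;
}

/* Mirrors the suggested fix: the second store goes to .end. */
struct execmem_range_model kprobes_range_fixed(void)
{
	struct execmem_range_model r = { 0, 0 };

	r.start = DEMO_VMALLOC_START;
	r.end = DEMO_VMALLOC_END;
	return r;
}
```

In this model the posted version yields start == DEMO_VMALLOC_END with
end == 0, which fails the validity check, while the fixed version passes.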

Thanks,
Song

> +
> +       if (strict_module_rwx_enabled())
> +               execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
> +       else
> +               execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
> +
>         return &execmem_params;
>  }
> --
> 2.39.2
>
>


* Re: [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-18  7:29 ` [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free() Mike Rapoport
  2023-09-21 22:10   ` Song Liu
  2023-09-21 22:14   ` Song Liu
@ 2023-09-21 22:34   ` Song Liu
  2023-09-23 15:38     ` Mike Rapoport
  2 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2023-09-21 22:34 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Mon, Sep 18, 2023 at 12:30 AM Mike Rapoport <rppt@kernel.org> wrote:
>

[...]

> diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
> index 42215f9404af..db5561d0c233 100644
> --- a/arch/s390/kernel/module.c
> +++ b/arch/s390/kernel/module.c
> @@ -21,6 +21,7 @@
>  #include <linux/moduleloader.h>
>  #include <linux/bug.h>
>  #include <linux/memory.h>
> +#include <linux/execmem.h>
>  #include <asm/alternative.h>
>  #include <asm/nospec-branch.h>
>  #include <asm/facility.h>
> @@ -76,7 +77,7 @@ void *module_alloc(unsigned long size)
>  #ifdef CONFIG_FUNCTION_TRACER
>  void module_arch_cleanup(struct module *mod)
>  {
> -       module_memfree(mod->arch.trampolines_start);
> +       execmem_free(mod->arch.trampolines_start);
>  }
>  #endif
>
> @@ -510,7 +511,7 @@ static int module_alloc_ftrace_hotpatch_trampolines(struct module *me,
>
>         size = FTRACE_HOTPATCH_TRAMPOLINES_SIZE(s->sh_size);
>         numpages = DIV_ROUND_UP(size, PAGE_SIZE);
> -       start = module_alloc(numpages * PAGE_SIZE);
> +       start = execmem_text_alloc(EXECMEM_FTRACE, numpages * PAGE_SIZE);

This should be EXECMEM_MODULE_TEXT?

Thanks,
Song

>         if (!start)
>                 return -ENOMEM;
>         set_memory_rox((unsigned long)start, numpages);
[...]


* Re: [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc()
  2023-09-18  7:29 ` [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc() Mike Rapoport
@ 2023-09-21 22:52   ` Song Liu
  2023-09-22  7:16     ` Christophe Leroy
  2023-09-23 16:20     ` Mike Rapoport
  0 siblings, 2 replies; 49+ messages in thread
From: Song Liu @ 2023-09-21 22:52 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Mon, Sep 18, 2023 at 12:31 AM Mike Rapoport <rppt@kernel.org> wrote:
>
[...]
> diff --git a/include/linux/execmem.h b/include/linux/execmem.h
> index 519bdfdca595..09d45ac786e9 100644
> --- a/include/linux/execmem.h
> +++ b/include/linux/execmem.h
> @@ -29,6 +29,7 @@
>   * @EXECMEM_KPROBES: parameters for kprobes
>   * @EXECMEM_FTRACE: parameters for ftrace
>   * @EXECMEM_BPF: parameters for BPF
> + * @EXECMEM_MODULE_DATA: parameters for module data sections
>   * @EXECMEM_TYPE_MAX:
>   */
>  enum execmem_type {
> @@ -37,6 +38,7 @@ enum execmem_type {
>         EXECMEM_KPROBES,
>         EXECMEM_FTRACE,

In the longer term, I think we can improve the JITed code and merge
kprobe/ftrace/bpf to use the same ranges. Also, do we need a special
setting for FTRACE? If not, let's just remove it.

>         EXECMEM_BPF,
> +       EXECMEM_MODULE_DATA,
>         EXECMEM_TYPE_MAX,
>  };

Overall, it is great that kprobe/ftrace/bpf no longer depend on modules.

OTOH, I think we should merge execmem_type and the existing mod_mem_type.
Otherwise, we still need to handle page permissions in multiple places.
What is our plan for that?

Thanks,
Song


>
> @@ -107,6 +109,23 @@ struct execmem_params *execmem_arch_params(void);
>   */
>  void *execmem_text_alloc(enum execmem_type type, size_t size);
>
> +/**
> + * execmem_data_alloc - allocate memory for data coupled to code
> + * @type: type of the allocation
> + * @size: how many bytes of memory are required
> + *
> + * Allocates memory that will contain data coupled with executable code,
> + * like data sections in kernel modules.
> + *
> + * The memory will have protections defined by architecture.
> + *
> + * The allocated memory will reside in an area that does not impose
> + * restrictions on the addressing modes.
> + *
> + * Return: a pointer to the allocated memory or %NULL
> + */
> +void *execmem_data_alloc(enum execmem_type type, size_t size);
> +
>  /**
>   * execmem_free - free executable memory
>   * @ptr: pointer to the memory that should be freed
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index c4146bfcd0a7..2ae83a6abf66 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -1188,25 +1188,16 @@ void __weak module_arch_freeing_init(struct module *mod)
>  {
>  }
>
> -static bool mod_mem_use_vmalloc(enum mod_mem_type type)
> -{
> -       return IS_ENABLED(CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC) &&
> -               mod_mem_type_is_core_data(type);
> -}
> -
>  static void *module_memory_alloc(unsigned int size, enum mod_mem_type type)
>  {
> -       if (mod_mem_use_vmalloc(type))
> -               return vzalloc(size);
> +       if (mod_mem_type_is_data(type))
> +               return execmem_data_alloc(EXECMEM_MODULE_DATA, size);
>         return execmem_text_alloc(EXECMEM_MODULE_TEXT, size);
>  }
>
>  static void module_memory_free(void *ptr, enum mod_mem_type type)
>  {
> -       if (mod_mem_use_vmalloc(type))
> -               vfree(ptr);
> -       else
> -               execmem_free(ptr);
> +       execmem_free(ptr);
>  }
>
>  static void free_mod_mem(struct module *mod)
> diff --git a/mm/execmem.c b/mm/execmem.c
> index abcbd07e05ac..aeff85261360 100644
> --- a/mm/execmem.c
> +++ b/mm/execmem.c
> @@ -53,11 +53,23 @@ static void *execmem_alloc(size_t size, struct execmem_range *range)
>         return kasan_reset_tag(p);
>  }
>
> +static inline bool execmem_range_is_data(enum execmem_type type)
> +{
> +       return type == EXECMEM_MODULE_DATA;
> +}
> +
>  void *execmem_text_alloc(enum execmem_type type, size_t size)
>  {
>         return execmem_alloc(size, &execmem_params.ranges[type]);
>  }
>
> +void *execmem_data_alloc(enum execmem_type type, size_t size)
> +{
> +       WARN_ON_ONCE(!execmem_range_is_data(type));
> +
> +       return execmem_alloc(size, &execmem_params.ranges[type]);
> +}
> +
>  void execmem_free(void *ptr)
>  {
>         /*
> @@ -93,7 +105,10 @@ static void execmem_init_missing(struct execmem_params *p)
>                 struct execmem_range *r = &p->ranges[i];
>
>                 if (!r->start) {
> -                       r->pgprot = default_range->pgprot;
> +                       if (execmem_range_is_data(i))
> +                               r->pgprot = PAGE_KERNEL;
> +                       else
> +                               r->pgprot = default_range->pgprot;
>                         r->alignment = default_range->alignment;
>                         r->start = default_range->start;
>                         r->end = default_range->end;
> --
> 2.39.2
>


* Re: [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc()
  2023-09-21 22:52   ` Song Liu
@ 2023-09-22  7:16     ` Christophe Leroy
  2023-09-22  8:55       ` Song Liu
  2023-09-23 16:20     ` Mike Rapoport
  1 sibling, 1 reply; 49+ messages in thread
From: Christophe Leroy @ 2023-09-22  7:16 UTC (permalink / raw)
  To: Song Liu, Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe



On 22/09/2023 at 00:52, Song Liu wrote:
> On Mon, Sep 18, 2023 at 12:31 AM Mike Rapoport <rppt@kernel.org> wrote:
>>
> [...]
>> diff --git a/include/linux/execmem.h b/include/linux/execmem.h
>> index 519bdfdca595..09d45ac786e9 100644
>> --- a/include/linux/execmem.h
>> +++ b/include/linux/execmem.h
>> @@ -29,6 +29,7 @@
>>    * @EXECMEM_KPROBES: parameters for kprobes
>>    * @EXECMEM_FTRACE: parameters for ftrace
>>    * @EXECMEM_BPF: parameters for BPF
>> + * @EXECMEM_MODULE_DATA: parameters for module data sections
>>    * @EXECMEM_TYPE_MAX:
>>    */
>>   enum execmem_type {
>> @@ -37,6 +38,7 @@ enum execmem_type {
>>          EXECMEM_KPROBES,
>>          EXECMEM_FTRACE,
> 
> In longer term, I think we can improve the JITed code and merge
> kprobe/ftrace/bpf. to use the same ranges. Also, do we need special
> setting for FTRACE? If not, let's just remove it.

How can we do that? Some platforms like powerpc require executable
memory for BPF and non-exec mem for KPROBE, so they can't be in the same
area/ranges.

> 
>>          EXECMEM_BPF,
>> +       EXECMEM_MODULE_DATA,
>>          EXECMEM_TYPE_MAX,
>>   };
> 
> Overall, it is great that kprobe/ftrace/bpf no longer depend on modules.
> 
> OTOH, I think we should merge execmem_type and existing mod_mem_type.
> Otherwise, we still need to handle page permissions in multiple places.
> What is our plan for that?
> 

Christophe


* Re: [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc()
  2023-09-22  7:16     ` Christophe Leroy
@ 2023-09-22  8:55       ` Song Liu
  2023-09-22 10:13         ` Christophe Leroy
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2023-09-22  8:55 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf

On Fri, Sep 22, 2023 at 12:17 AM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> On 22/09/2023 at 00:52, Song Liu wrote:
> > On Mon, Sep 18, 2023 at 12:31 AM Mike Rapoport <rppt@kernel.org> wrote:
> >>
> > [...]
> >> diff --git a/include/linux/execmem.h b/include/linux/execmem.h
> >> index 519bdfdca595..09d45ac786e9 100644
> >> --- a/include/linux/execmem.h
> >> +++ b/include/linux/execmem.h
> >> @@ -29,6 +29,7 @@
> >>    * @EXECMEM_KPROBES: parameters for kprobes
> >>    * @EXECMEM_FTRACE: parameters for ftrace
> >>    * @EXECMEM_BPF: parameters for BPF
> >> + * @EXECMEM_MODULE_DATA: parameters for module data sections
> >>    * @EXECMEM_TYPE_MAX:
> >>    */
> >>   enum execmem_type {
> >> @@ -37,6 +38,7 @@ enum execmem_type {
> >>          EXECMEM_KPROBES,
> >>          EXECMEM_FTRACE,
> >
> > In longer term, I think we can improve the JITed code and merge
> > kprobe/ftrace/bpf. to use the same ranges. Also, do we need special
> > setting for FTRACE? If not, let's just remove it.
>
> How can we do that ? Some platforms like powerpc require executable
> memory for BPF and non-exec mem for KPROBE so it can't be in the same
> area/ranges.

Hmm... non-exec mem for kprobes?

       if (strict_module_rwx_enabled())
               execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
       else
               execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;

Do you mean the latter case?

Thanks,
Song


* Re: [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc()
  2023-09-22  8:55       ` Song Liu
@ 2023-09-22 10:13         ` Christophe Leroy
  0 siblings, 0 replies; 49+ messages in thread
From: Christophe Leroy @ 2023-09-22 10:13 UTC (permalink / raw)
  To: Song Liu
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf



On 22/09/2023 at 10:55, Song Liu wrote:
> On Fri, Sep 22, 2023 at 12:17 AM Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
>>
>>
>>
>> On 22/09/2023 at 00:52, Song Liu wrote:
>>> On Mon, Sep 18, 2023 at 12:31 AM Mike Rapoport <rppt@kernel.org> wrote:
>>>>
>>> [...]
>>>> diff --git a/include/linux/execmem.h b/include/linux/execmem.h
>>>> index 519bdfdca595..09d45ac786e9 100644
>>>> --- a/include/linux/execmem.h
>>>> +++ b/include/linux/execmem.h
>>>> @@ -29,6 +29,7 @@
>>>>     * @EXECMEM_KPROBES: parameters for kprobes
>>>>     * @EXECMEM_FTRACE: parameters for ftrace
>>>>     * @EXECMEM_BPF: parameters for BPF
>>>> + * @EXECMEM_MODULE_DATA: parameters for module data sections
>>>>     * @EXECMEM_TYPE_MAX:
>>>>     */
>>>>    enum execmem_type {
>>>> @@ -37,6 +38,7 @@ enum execmem_type {
>>>>           EXECMEM_KPROBES,
>>>>           EXECMEM_FTRACE,
>>>
>>> In longer term, I think we can improve the JITed code and merge
>>> kprobe/ftrace/bpf. to use the same ranges. Also, do we need special
>>> setting for FTRACE? If not, let's just remove it.
>>
>> How can we do that ? Some platforms like powerpc require executable
>> memory for BPF and non-exec mem for KPROBE so it can't be in the same
>> area/ranges.
> 
> Hmm... non-exec mem for kprobes?
> 
>         if (strict_module_rwx_enabled())
>                 execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
>         else
>                 execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
> 
> Do you mean the latter case?
> 

In fact I may have misunderstood patch 9. I'll provide a response there.

Christophe


* Re: [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations
  2023-09-18  7:29 ` [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations Mike Rapoport
  2023-09-21 22:30   ` Song Liu
@ 2023-09-22 10:32   ` Christophe Leroy
  2023-09-23 16:27     ` Mike Rapoport
  1 sibling, 1 reply; 49+ messages in thread
From: Christophe Leroy @ 2023-09-22 10:32 UTC (permalink / raw)
  To: Mike Rapoport, linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel

Hi Mike,

On 18/09/2023 at 09:29, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> 
> powerpc overrides kprobes::alloc_insn_page() to remove writable
> permissions when STRICT_MODULE_RWX is on.
> 
> Add definition of EXECMEM_KPROBES to execmem_params to allow using the
> generic kprobes::alloc_insn_page() with the desired permissions.
> 
> As powerpc uses breakpoint instructions to inject kprobes, it does not
> need to constrain kprobe allocations to the modules area and can use the
> entire vmalloc address space.

I don't understand what you mean here. Does it mean the kprobe allocation
doesn't need to be executable? I don't think so, based on the pgprot you
set.

On powerpc book3s/32, vmalloc space is not executable. Only modules
space is executable. X/NX cannot be set on a per-page basis; it can only
be set per 256 MB segment.
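
A worked example of the segment arithmetic, as a userspace sketch: 256 MB
is 1 << 28, so a 32-bit address space holds 16 segments and the segment
index is the top four bits of the address. The helper names are
hypothetical, and the addresses used with them are arbitrary examples,
not the actual powerpc memory layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEGMENT_SHIFT 28	/* 256 MB == 1 << 28 */
#define NR_SEGMENTS   16	/* 4 GB / 256 MB */

/* Index of the 256 MB segment that contains addr (0..15). */
unsigned int segment_of(uint32_t addr)
{
	return addr >> SEGMENT_SHIFT;
}

/*
 * With per-segment X/NX, two addresses can differ in executability
 * only if they fall into different segments.
 */
bool share_exec_permission(uint32_t a, uint32_t b)
{
	return segment_of(a) == segment_of(b);
}
```

For instance, 0xb0000000 and 0xbfffffff both land in segment 11 and so
necessarily share one X/NX setting, while 0xc0000000 is in segment 12 and
can be configured differently.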

See commit c49643319715 ("powerpc/32s: Only leave NX unset on segments 
used for modules") and 6ca055322da8 ("powerpc/32s: Use dedicated segment 
for modules with STRICT_KERNEL_RWX") and 7bee31ad8e2f ("powerpc/32s: Fix 
is_module_segment() when MODULES_VADDR is defined").

So if your intention is still to have executable kprobes memory, then you
can't use the vmalloc address space.

Christophe

> 
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> ---
>   arch/powerpc/kernel/kprobes.c | 14 --------------
>   arch/powerpc/kernel/module.c  | 11 +++++++++++
>   2 files changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index 62228c7072a2..14c5ddec3056 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -126,20 +126,6 @@ kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offse
>   	return (kprobe_opcode_t *)(addr + offset);
>   }
>   
> -void *alloc_insn_page(void)
> -{
> -	void *page;
> -
> -	page = execmem_text_alloc(EXECMEM_KPROBES, PAGE_SIZE);
> -	if (!page)
> -		return NULL;
> -
> -	if (strict_module_rwx_enabled())
> -		set_memory_rox((unsigned long)page, 1);
> -
> -	return page;
> -}
> -
>   int arch_prepare_kprobe(struct kprobe *p)
>   {
>   	int ret = 0;
> diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
> index 824d9541a310..bf2c62aef628 100644
> --- a/arch/powerpc/kernel/module.c
> +++ b/arch/powerpc/kernel/module.c
> @@ -95,6 +95,9 @@ static struct execmem_params execmem_params __ro_after_init = {
>   		[EXECMEM_DEFAULT] = {
>   			.alignment = 1,
>   		},
> +		[EXECMEM_KPROBES] = {
> +			.alignment = 1,
> +		},
>   		[EXECMEM_MODULE_DATA] = {
>   			.alignment = 1,
>   		},
> @@ -135,5 +138,13 @@ struct execmem_params __init *execmem_arch_params(void)
>   
>   	range->pgprot = prot;
>   
> +	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
> +	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_END;
> +
> +	if (strict_module_rwx_enabled())
> +		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
> +	else
> +		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
> +
>   	return &execmem_params;
>   }


* Re: [PATCH v3 08/13] riscv: extend execmem_params for generated code allocations
  2023-09-18  7:29 ` [PATCH v3 08/13] riscv: " Mike Rapoport
@ 2023-09-22 10:37   ` Alexandre Ghiti
  2023-09-23 16:23     ` Mike Rapoport
  0 siblings, 1 reply; 49+ messages in thread
From: Alexandre Ghiti @ 2023-09-22 10:37 UTC (permalink / raw)
  To: Mike Rapoport, linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

Hi Mike,

On 18/09/2023 09:29, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
>
> The memory allocations for kprobes and BPF on RISC-V are not placed in
> the modules area and these custom allocations are implemented with
> overrides of alloc_insn_page() and  bpf_jit_alloc_exec().
>
> Slightly reorder execmem_params initialization to support both 32 and 64
> bit variants, define EXECMEM_KPROBES and EXECMEM_BPF ranges in
> riscv::execmem_params and drop overrides of alloc_insn_page() and
> bpf_jit_alloc_exec().
>
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> ---
>   arch/riscv/kernel/module.c         | 21 ++++++++++++++++++++-
>   arch/riscv/kernel/probes/kprobes.c | 10 ----------
>   arch/riscv/net/bpf_jit_core.c      | 13 -------------
>   3 files changed, 20 insertions(+), 24 deletions(-)
>
> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> index 343a0edfb6dd..31505ecb5c72 100644
> --- a/arch/riscv/kernel/module.c
> +++ b/arch/riscv/kernel/module.c
> @@ -436,20 +436,39 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>   	return 0;
>   }
>   
> -#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> +#ifdef CONFIG_MMU
>   static struct execmem_params execmem_params __ro_after_init = {
>   	.ranges = {
>   		[EXECMEM_DEFAULT] = {
>   			.pgprot = PAGE_KERNEL,
>   			.alignment = 1,
>   		},
> +		[EXECMEM_KPROBES] = {
> +			.pgprot = PAGE_KERNEL_READ_EXEC,
> +			.alignment = 1,
> +		},
> +		[EXECMEM_BPF] = {
> +			.pgprot = PAGE_KERNEL,
> +			.alignment = 1,


Not entirely sure it is the same alignment (sorry, I did not go through the
entire series), but if it is, the alignment above ^ is not the same as the
one requested by our current bpf_jit_alloc_exec() implementation, which is
PAGE_SIZE.


> +		},
>   	},
>   };
>   
>   struct execmem_params __init *execmem_arch_params(void)
>   {
> +#ifdef CONFIG_64BIT
>   	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
>   	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
> +#else
> +	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
> +	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
> +#endif
> +
> +	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
> +	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
> +
> +	execmem_params.ranges[EXECMEM_BPF].start = BPF_JIT_REGION_START;
> +	execmem_params.ranges[EXECMEM_BPF].end = BPF_JIT_REGION_END;
>   
>   	return &execmem_params;
>   }
> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> index 2f08c14a933d..e64f2f3064eb 100644
> --- a/arch/riscv/kernel/probes/kprobes.c
> +++ b/arch/riscv/kernel/probes/kprobes.c
> @@ -104,16 +104,6 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
>   	return 0;
>   }
>   
> -#ifdef CONFIG_MMU
> -void *alloc_insn_page(void)
> -{
> -	return  __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
> -				     GFP_KERNEL, PAGE_KERNEL_READ_EXEC,
> -				     VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
> -				     __builtin_return_address(0));
> -}
> -#endif
> -
>   /* install breakpoint in text */
>   void __kprobes arch_arm_kprobe(struct kprobe *p)
>   {
> diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
> index 7b70ccb7fec3..c8a758f0882b 100644
> --- a/arch/riscv/net/bpf_jit_core.c
> +++ b/arch/riscv/net/bpf_jit_core.c
> @@ -218,19 +218,6 @@ u64 bpf_jit_alloc_exec_limit(void)
>   	return BPF_JIT_REGION_SIZE;
>   }
>   
> -void *bpf_jit_alloc_exec(unsigned long size)
> -{
> -	return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
> -				    BPF_JIT_REGION_END, GFP_KERNEL,
> -				    PAGE_KERNEL, 0, NUMA_NO_NODE,
> -				    __builtin_return_address(0));
> -}
> -
> -void bpf_jit_free_exec(void *addr)
> -{
> -	return vfree(addr);
> -}
> -
>   void *bpf_arch_text_copy(void *dst, void *src, size_t len)
>   {
>   	int ret;


Otherwise, you can add:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex



* Re: [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-21 22:34   ` Song Liu
@ 2023-09-23 15:38     ` Mike Rapoport
  2023-09-23 22:36       ` Song Liu
  0 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2023-09-23 15:38 UTC (permalink / raw)
  To: Song Liu
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Thu, Sep 21, 2023 at 03:34:18PM -0700, Song Liu wrote:
> On Mon, Sep 18, 2023 at 12:30 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> 
> [...]
> 
> > diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
> > index 42215f9404af..db5561d0c233 100644
> > --- a/arch/s390/kernel/module.c
> > +++ b/arch/s390/kernel/module.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/moduleloader.h>
> >  #include <linux/bug.h>
> >  #include <linux/memory.h>
> > +#include <linux/execmem.h>
> >  #include <asm/alternative.h>
> >  #include <asm/nospec-branch.h>
> >  #include <asm/facility.h>
> > @@ -76,7 +77,7 @@ void *module_alloc(unsigned long size)
> >  #ifdef CONFIG_FUNCTION_TRACER
> >  void module_arch_cleanup(struct module *mod)
> >  {
> > -       module_memfree(mod->arch.trampolines_start);
> > +       execmem_free(mod->arch.trampolines_start);
> >  }
> >  #endif
> >
> > @@ -510,7 +511,7 @@ static int module_alloc_ftrace_hotpatch_trampolines(struct module *me,
> >
> >         size = FTRACE_HOTPATCH_TRAMPOLINES_SIZE(s->sh_size);
> >         numpages = DIV_ROUND_UP(size, PAGE_SIZE);
> > -       start = module_alloc(numpages * PAGE_SIZE);
> > +       start = execmem_text_alloc(EXECMEM_FTRACE, numpages * PAGE_SIZE);
> 
> This should be EXECMEM_MODULE_TEXT?

This is an ftrace trampoline, so I think it should be FTRACE type of
allocation.
 
> Thanks,
> Song
> 
> >         if (!start)
> >                 return -ENOMEM;
> >         set_memory_rox((unsigned long)start, numpages);
> [...]

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-21 22:14   ` Song Liu
@ 2023-09-23 15:40     ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-23 15:40 UTC (permalink / raw)
  To: Song Liu
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Thu, Sep 21, 2023 at 03:14:54PM -0700, Song Liu wrote:
> On Mon, Sep 18, 2023 at 12:30 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> [...]
> > +
> > +/**
> > + * enum execmem_type - types of executable memory ranges
> > + *
> > + * There are several subsystems that allocate executable memory.
> > + * Architectures define different restrictions on placement,
> > + * permissions, alignment and other parameters for memory that can be used
> > + * by these subsystems.
> > + * Types in this enum identify subsystems that allocate executable memory
> > + * and let architectures define parameters for ranges suitable for
> > + * allocations by each subsystem.
> > + *
> > + * @EXECMEM_DEFAULT: default parameters that would be used for types that
> > + * are not explicitly defined.
> > + * @EXECMEM_MODULE_TEXT: parameters for module text sections
> > + * @EXECMEM_KPROBES: parameters for kprobes
> > + * @EXECMEM_FTRACE: parameters for ftrace
> > + * @EXECMEM_BPF: parameters for BPF
> > + * @EXECMEM_TYPE_MAX:
> > + */
> > +enum execmem_type {
> > +       EXECMEM_DEFAULT,
> 
> I found EXECMEM_DEFAULT more confusing than helpful.

I hesitated a lot about that, but in the end decided to have
EXECMEM_DEFAULT and alias EXECMEM_MODULE_TEXT to it because this is what we
essentially have now for most architectures.

If you'll take a look at arch-specific patches, in many cases there is only
EXECMEM_DEFAULT that an architecture defines and that default is used by
all the subsystems.
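
To illustrate the aliasing and the fallback to the default range, here is a
minimal userspace sketch. It is a simplified model, not the kernel code:
struct range_model, ranges[] and fill_missing() are hypothetical names, the
addresses are made up, and the fallback loop only mirrors the "if (!r->start)"
check from execmem_init_missing() in patch 06.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the enum in the patch: MODULE_TEXT aliases DEFAULT. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_MODULE_TEXT = EXECMEM_DEFAULT,	/* alias: same array slot */
	EXECMEM_KPROBES,
	EXECMEM_FTRACE,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

/* Hypothetical, reduced stand-in for struct execmem_range. */
struct range_model {
	unsigned long start;
	unsigned long end;
};

/* An architecture that only describes the default range. */
static struct range_model ranges[EXECMEM_TYPE_MAX] = {
	[EXECMEM_DEFAULT] = { .start = 0x1000, .end = 0x2000 },
};

/* Copy the default range into every slot the architecture left empty. */
static void fill_missing(void)
{
	for (int i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
		if (!ranges[i].start)
			ranges[i] = ranges[EXECMEM_DEFAULT];
	}
}
```

After fill_missing() runs, every type resolves to the default range, which is
the situation described above for architectures that define only
EXECMEM_DEFAULT.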
 
> Song
> 
> > +       EXECMEM_MODULE_TEXT = EXECMEM_DEFAULT,
> > +       EXECMEM_KPROBES,
> > +       EXECMEM_FTRACE,
> > +       EXECMEM_BPF,
> > +       EXECMEM_TYPE_MAX,
> > +};
> > +
> [...]

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-21 22:10   ` Song Liu
@ 2023-09-23 15:42     ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-23 15:42 UTC (permalink / raw)
  To: Song Liu
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Thu, Sep 21, 2023 at 03:10:26PM -0700, Song Liu wrote:
> On Mon, Sep 18, 2023 at 12:30 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> [...]
> > +
> > +#include <linux/mm.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/execmem.h>
> > +#include <linux/moduleloader.h>
> > +
> > +static void *execmem_alloc(size_t size)
> > +{
> > +       return module_alloc(size);
> > +}
> > +
> > +void *execmem_text_alloc(enum execmem_type type, size_t size)
> > +{
> > +       return execmem_alloc(size);
> > +}
> 
> execmem_text_alloc (and later execmem_data_alloc) both take "type" as
> input. I guess we can just use execmem_alloc(type, size) for everything?

We could, but I still prefer to keep this distinction.
 
> Thanks,
> Song
> 
> > +
> > +void execmem_free(void *ptr)
> > +{
> > +       /*
> > +        * This memory may be RO, and freeing RO memory in an interrupt is not
> > +        * supported by vmalloc.
> > +        */
> > +       WARN_ON(in_interrupt());
> > +       vfree(ptr);
> > +}
> > --
> > 2.39.2
> >

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc()
  2023-09-21 22:52   ` Song Liu
  2023-09-22  7:16     ` Christophe Leroy
@ 2023-09-23 16:20     ` Mike Rapoport
  1 sibling, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-23 16:20 UTC (permalink / raw)
  To: Song Liu
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Thu, Sep 21, 2023 at 03:52:21PM -0700, Song Liu wrote:
> On Mon, Sep 18, 2023 at 12:31 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> [...]
> > diff --git a/include/linux/execmem.h b/include/linux/execmem.h
> > index 519bdfdca595..09d45ac786e9 100644
> > --- a/include/linux/execmem.h
> > +++ b/include/linux/execmem.h
> > @@ -29,6 +29,7 @@
> >   * @EXECMEM_KPROBES: parameters for kprobes
> >   * @EXECMEM_FTRACE: parameters for ftrace
> >   * @EXECMEM_BPF: parameters for BPF
> > + * @EXECMEM_MODULE_DATA: parameters for module data sections
> >   * @EXECMEM_TYPE_MAX:
> >   */
> >  enum execmem_type {
> > @@ -37,6 +38,7 @@ enum execmem_type {
> >         EXECMEM_KPROBES,
> >         EXECMEM_FTRACE,
> 
> In longer term, I think we can improve the JITed code and merge
> kprobe/ftrace/bpf. to use the same ranges. Also, do we need special
> setting for FTRACE? If not, let's just remove it.

I don't think we should limit how the JITed code is generated just because
we want to support fewer address space ranges for it.

As for FTRACE, now it's only needed on x86 and s390 and there it happens
to use the same ranges as MODULES and the rest, but it still gives some
notion of potential semantic differences and the overhead of keeping it is
really negligible.
 
> >         EXECMEM_BPF,
> > +       EXECMEM_MODULE_DATA,
> >         EXECMEM_TYPE_MAX,
> >  };
> 
> Overall, it is great that kprobe/ftrace/bpf no longer depend on modules.
> 
> OTOH, I think we should merge execmem_type and existing mod_mem_type.
> Otherwise, we still need to handle page permissions in multiple places.
> What is our plan for that?

Maybe, but I think this is too early. There are several things missing
before we could remove set_memory usage from modules. E.g. to use ROX
allocations on x86 we should at least update alternatives handling and
reach a consensus about the synchronization Andy mentioned in his comments
on v2.
 
> Thanks,
> Song
> 
> 
> >
> > @@ -107,6 +109,23 @@ struct execmem_params *execmem_arch_params(void);
> >   */
> >  void *execmem_text_alloc(enum execmem_type type, size_t size);
> >
> > +/**
> > + * execmem_data_alloc - allocate memory for data coupled to code
> > + * @type: type of the allocation
> > + * @size: how many bytes of memory are required
> > + *
> > + * Allocates memory that will contain data coupled with executable code,
> > + * like data sections in kernel modules.
> > + *
> > + * The memory will have protections defined by architecture.
> > + *
> > + * The allocated memory will reside in an area that does not impose
> > + * restrictions on the addressing modes.
> > + *
> > + * Return: a pointer to the allocated memory or %NULL
> > + */
> > +void *execmem_data_alloc(enum execmem_type type, size_t size);
> > +
> >  /**
> >   * execmem_free - free executable memory
> >   * @ptr: pointer to the memory that should be freed
> > diff --git a/kernel/module/main.c b/kernel/module/main.c
> > index c4146bfcd0a7..2ae83a6abf66 100644
> > --- a/kernel/module/main.c
> > +++ b/kernel/module/main.c
> > @@ -1188,25 +1188,16 @@ void __weak module_arch_freeing_init(struct module *mod)
> >  {
> >  }
> >
> > -static bool mod_mem_use_vmalloc(enum mod_mem_type type)
> > -{
> > -       return IS_ENABLED(CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC) &&
> > -               mod_mem_type_is_core_data(type);
> > -}
> > -
> >  static void *module_memory_alloc(unsigned int size, enum mod_mem_type type)
> >  {
> > -       if (mod_mem_use_vmalloc(type))
> > -               return vzalloc(size);
> > +       if (mod_mem_type_is_data(type))
> > +               return execmem_data_alloc(EXECMEM_MODULE_DATA, size);
> >         return execmem_text_alloc(EXECMEM_MODULE_TEXT, size);
> >  }
> >
> >  static void module_memory_free(void *ptr, enum mod_mem_type type)
> >  {
> > -       if (mod_mem_use_vmalloc(type))
> > -               vfree(ptr);
> > -       else
> > -               execmem_free(ptr);
> > +       execmem_free(ptr);
> >  }
> >
> >  static void free_mod_mem(struct module *mod)
> > diff --git a/mm/execmem.c b/mm/execmem.c
> > index abcbd07e05ac..aeff85261360 100644
> > --- a/mm/execmem.c
> > +++ b/mm/execmem.c
> > @@ -53,11 +53,23 @@ static void *execmem_alloc(size_t size, struct execmem_range *range)
> >         return kasan_reset_tag(p);
> >  }
> >
> > +static inline bool execmem_range_is_data(enum execmem_type type)
> > +{
> > +       return type == EXECMEM_MODULE_DATA;
> > +}
> > +
> >  void *execmem_text_alloc(enum execmem_type type, size_t size)
> >  {
> >         return execmem_alloc(size, &execmem_params.ranges[type]);
> >  }
> >
> > +void *execmem_data_alloc(enum execmem_type type, size_t size)
> > +{
> > +       WARN_ON_ONCE(!execmem_range_is_data(type));
> > +
> > +       return execmem_alloc(size, &execmem_params.ranges[type]);
> > +}
> > +
> >  void execmem_free(void *ptr)
> >  {
> >         /*
> > @@ -93,7 +105,10 @@ static void execmem_init_missing(struct execmem_params *p)
> >                 struct execmem_range *r = &p->ranges[i];
> >
> >                 if (!r->start) {
> > -                       r->pgprot = default_range->pgprot;
> > +                       if (execmem_range_is_data(i))
> > +                               r->pgprot = PAGE_KERNEL;
> > +                       else
> > +                               r->pgprot = default_range->pgprot;
> >                         r->alignment = default_range->alignment;
> >                         r->start = default_range->start;
> >                         r->end = default_range->end;
> > --
> > 2.39.2
> >

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 08/13] riscv: extend execmem_params for generated code allocations
  2023-09-22 10:37   ` Alexandre Ghiti
@ 2023-09-23 16:23     ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-23 16:23 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, linux-mm, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen

On Fri, Sep 22, 2023 at 12:37:07PM +0200, Alexandre Ghiti wrote:
> Hi Mike,
> 
> On 18/09/2023 09:29, Mike Rapoport wrote:
> > From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> > 
> > The memory allocations for kprobes and BPF on RISC-V are not placed in
> > the modules area and these custom allocations are implemented with
> > overrides of alloc_insn_page() and  bpf_jit_alloc_exec().
> > 
> > Slightly reorder execmem_params initialization to support both 32 and 64
> > bit variants, define EXECMEM_KPROBES and EXECMEM_BPF ranges in
> > riscv::execmem_params and drop overrides of alloc_insn_page() and
> > bpf_jit_alloc_exec().
> > 
> > Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> > ---
> >   arch/riscv/kernel/module.c         | 21 ++++++++++++++++++++-
> >   arch/riscv/kernel/probes/kprobes.c | 10 ----------
> >   arch/riscv/net/bpf_jit_core.c      | 13 -------------
> >   3 files changed, 20 insertions(+), 24 deletions(-)
> > 
> > diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> > index 343a0edfb6dd..31505ecb5c72 100644
> > --- a/arch/riscv/kernel/module.c
> > +++ b/arch/riscv/kernel/module.c
> > @@ -436,20 +436,39 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> >   	return 0;
> >   }
> > -#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> > +#ifdef CONFIG_MMU
> >   static struct execmem_params execmem_params __ro_after_init = {
> >   	.ranges = {
> >   		[EXECMEM_DEFAULT] = {
> >   			.pgprot = PAGE_KERNEL,
> >   			.alignment = 1,
> >   		},
> > +		[EXECMEM_KPROBES] = {
> > +			.pgprot = PAGE_KERNEL_READ_EXEC,
> > +			.alignment = 1,
> > +		},
> > +		[EXECMEM_BPF] = {
> > +			.pgprot = PAGE_KERNEL,
> > +			.alignment = 1,
> 
> 
> Not entirely sure it is the same alignment (sorry did not go through the
> entire series), but if it is, the alignment above ^ is not the same that is
> requested by our current bpf_jit_alloc_exec() implementation which is
> PAGE_SIZE.
 
This literally translates vmalloc() in alloc_insn_page() to a set of
parameters, so "1" comes from there. And using alignment of 1 with
vmalloc() implicitly sets it to PAGE_SIZE.
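
The effect can be modelled in userspace. This is only a sketch of the claim
above, assuming the allocator hands out page-granular areas; MODEL_PAGE_SIZE,
align_up() and model_place() are hypothetical names and do not exist in the
kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size */

/* Round addr up to a multiple of align (align must be a power of two). */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical placement helper: areas always start on a page boundary,
 * and the requested alignment is applied on top of that.
 */
static unsigned long model_place(unsigned long free_addr, unsigned long align)
{
	unsigned long addr = align_up(free_addr, MODEL_PAGE_SIZE);

	return align_up(addr, align);
}
```

In this model an alignment of 1 and an alignment of MODEL_PAGE_SIZE place the
area at the same address; only alignments larger than a page change the
result.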

> > +		},
> >   	},
> >   };
> >   struct execmem_params __init *execmem_arch_params(void)
> >   {
> > +#ifdef CONFIG_64BIT
> >   	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
> >   	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
> > +#else
> > +	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
> > +	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
> > +#endif
> > +
> > +	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
> > +	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
> > +
> > +	execmem_params.ranges[EXECMEM_BPF].start = BPF_JIT_REGION_START;
> > +	execmem_params.ranges[EXECMEM_BPF].end = BPF_JIT_REGION_END;
> >   	return &execmem_params;
> >   }
> > diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> > index 2f08c14a933d..e64f2f3064eb 100644
> > --- a/arch/riscv/kernel/probes/kprobes.c
> > +++ b/arch/riscv/kernel/probes/kprobes.c
> > @@ -104,16 +104,6 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
> >   	return 0;
> >   }
> > -#ifdef CONFIG_MMU
> > -void *alloc_insn_page(void)
> > -{
> > -	return  __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
> > -				     GFP_KERNEL, PAGE_KERNEL_READ_EXEC,
> > -				     VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
> > -				     __builtin_return_address(0));
> > -}
> > -#endif
> > -
> >   /* install breakpoint in text */
> >   void __kprobes arch_arm_kprobe(struct kprobe *p)
> >   {
> > diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
> > index 7b70ccb7fec3..c8a758f0882b 100644
> > --- a/arch/riscv/net/bpf_jit_core.c
> > +++ b/arch/riscv/net/bpf_jit_core.c
> > @@ -218,19 +218,6 @@ u64 bpf_jit_alloc_exec_limit(void)
> >   	return BPF_JIT_REGION_SIZE;
> >   }
> > -void *bpf_jit_alloc_exec(unsigned long size)
> > -{
> > -	return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
> > -				    BPF_JIT_REGION_END, GFP_KERNEL,
> > -				    PAGE_KERNEL, 0, NUMA_NO_NODE,
> > -				    __builtin_return_address(0));
> > -}
> > -
> > -void bpf_jit_free_exec(void *addr)
> > -{
> > -	return vfree(addr);
> > -}
> > -
> >   void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> >   {
> >   	int ret;
> 
> 
> Otherwise, you can add:
> 
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> 
> Thanks,
> 
> Alex
> 
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations
  2023-09-21 22:30   ` Song Liu
@ 2023-09-23 16:25     ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-23 16:25 UTC (permalink / raw)
  To: Song Liu
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Thu, Sep 21, 2023 at 03:30:46PM -0700, Song Liu wrote:
> On Mon, Sep 18, 2023 at 12:31 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> [...]
> > @@ -135,5 +138,13 @@ struct execmem_params __init *execmem_arch_params(void)
> >
> >         range->pgprot = prot;
> >
> > +       execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
> > +       execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_END;
> 
> .end = VMALLOC_END.

Thanks, this should have been

	execmem_params.ranges[EXECMEM_KPROBES].start = range->start;
	execmem_params.ranges[EXECMEM_KPROBES].end = range->end;

where range points to the same range as EXECMEM_MODULE_TEXT.
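
A small userspace sketch of why the duplicated .start assignment matters. The
names (struct range_model, buggy_setup(), fixed_setup(), range_is_valid()) and
the addresses are hypothetical; only the assignment pattern is taken from the
hunk quoted above.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for struct execmem_range. */
struct range_model {
	unsigned long start;
	unsigned long end;
};

#define MODEL_VMALLOC_START	0x1000UL	/* made-up addresses */
#define MODEL_VMALLOC_END	0x9000UL

/* Pattern from the posted patch: .start is assigned twice. */
static struct range_model buggy_setup(void)
{
	struct range_model r = { 0 };

	r.start = MODEL_VMALLOC_START;
	r.start = MODEL_VMALLOC_END;	/* typo: .end is never set and stays 0 */
	return r;
}

/* Corrected pattern: reuse the limits of the module text range. */
static struct range_model fixed_setup(const struct range_model *module_text)
{
	struct range_model r = { 0 };

	r.start = module_text->start;
	r.end = module_text->end;
	return r;
}

/* A range is usable only if it is non-empty. */
static int range_is_valid(const struct range_model *r)
{
	return r->start < r->end;
}
```

With the typo the range ends up as [VMALLOC_END, 0), which is empty, so no
allocation could be placed in it.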

 
> Thanks,
> Song
> 
> > +
> > +       if (strict_module_rwx_enabled())
> > +               execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
> > +       else
> > +               execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
> > +
> >         return &execmem_params;
> >  }
> > --
> > 2.39.2
> >
> >

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations
  2023-09-22 10:32   ` Christophe Leroy
@ 2023-09-23 16:27     ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-23 16:27 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel@lists.infradead.org

Hi Christophe,

On Fri, Sep 22, 2023 at 10:32:46AM +0000, Christophe Leroy wrote:
> Hi Mike,
> 
> Le 18/09/2023 à 09:29, Mike Rapoport a écrit :
> > From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> > 
> > powerpc overrides kprobes::alloc_insn_page() to remove writable
> > permissions when STRICT_MODULE_RWX is on.
> > 
> > Add definition of EXECMEM_KPROBES to execmem_params to allow using the
> > generic kprobes::alloc_insn_page() with the desired permissions.
> > 
> > As powerpc uses breakpoint instructions to inject kprobes, it does not
> > need to constrain kprobe allocations to the modules area and can use the
> > entire vmalloc address space.
> 
> I don't understand what you mean here. Does it mean kprobe allocation 
> doesn't need to be executable ? I don't think so based on the pgprot you 
> set.
> 
> On powerpc book3s/32, vmalloc space is not executable. Only modules 
> space is executable. X/NX cannot be set on a per page basis, it can only 
> be set on a 256 Mbytes segment basis.
> 
> See commit c49643319715 ("powerpc/32s: Only leave NX unset on segments 
> used for modules") and 6ca055322da8 ("powerpc/32s: Use dedicated segment 
> for modules with STRICT_KERNEL_RWX") and 7bee31ad8e2f ("powerpc/32s: Fix 
> is_module_segment() when MODULES_VADDR is defined").
> 
> So if your intention is still to have an executable kprobes, then you 
> can't use vmalloc address space.

Right, and I've fixed the KPROBES range to use the same range as MODULES.
The commit message is stale and I need to update it.
 
> Christophe
> 
> > 
> > Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> > ---
> >   arch/powerpc/kernel/kprobes.c | 14 --------------
> >   arch/powerpc/kernel/module.c  | 11 +++++++++++
> >   2 files changed, 11 insertions(+), 14 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> > index 62228c7072a2..14c5ddec3056 100644
> > --- a/arch/powerpc/kernel/kprobes.c
> > +++ b/arch/powerpc/kernel/kprobes.c
> > @@ -126,20 +126,6 @@ kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offse
> >   	return (kprobe_opcode_t *)(addr + offset);
> >   }
> >   
> > -void *alloc_insn_page(void)
> > -{
> > -	void *page;
> > -
> > -	page = execmem_text_alloc(EXECMEM_KPROBES, PAGE_SIZE);
> > -	if (!page)
> > -		return NULL;
> > -
> > -	if (strict_module_rwx_enabled())
> > -		set_memory_rox((unsigned long)page, 1);
> > -
> > -	return page;
> > -}
> > -
> >   int arch_prepare_kprobe(struct kprobe *p)
> >   {
> >   	int ret = 0;
> > diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
> > index 824d9541a310..bf2c62aef628 100644
> > --- a/arch/powerpc/kernel/module.c
> > +++ b/arch/powerpc/kernel/module.c
> > @@ -95,6 +95,9 @@ static struct execmem_params execmem_params __ro_after_init = {
> >   		[EXECMEM_DEFAULT] = {
> >   			.alignment = 1,
> >   		},
> > +		[EXECMEM_KPROBES] = {
> > +			.alignment = 1,
> > +		},
> >   		[EXECMEM_MODULE_DATA] = {
> >   			.alignment = 1,
> >   		},
> > @@ -135,5 +138,13 @@ struct execmem_params __init *execmem_arch_params(void)
> >   
> >   	range->pgprot = prot;
> >   
> > +	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
> > +	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_END;
> > +
> > +	if (strict_module_rwx_enabled())
> > +		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
> > +	else
> > +		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
> > +
> >   	return &execmem_params;
> >   }

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-23 15:38     ` Mike Rapoport
@ 2023-09-23 22:36       ` Song Liu
  2023-09-26  8:04         ` Mike Rapoport
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2023-09-23 22:36 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Sat, Sep 23, 2023 at 8:39 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Thu, Sep 21, 2023 at 03:34:18PM -0700, Song Liu wrote:
> > On Mon, Sep 18, 2023 at 12:30 AM Mike Rapoport <rppt@kernel.org> wrote:
> > >
> >
> > [...]
> >
> > > diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
> > > index 42215f9404af..db5561d0c233 100644
> > > --- a/arch/s390/kernel/module.c
> > > +++ b/arch/s390/kernel/module.c
> > > @@ -21,6 +21,7 @@
> > >  #include <linux/moduleloader.h>
> > >  #include <linux/bug.h>
> > >  #include <linux/memory.h>
> > > +#include <linux/execmem.h>
> > >  #include <asm/alternative.h>
> > >  #include <asm/nospec-branch.h>
> > >  #include <asm/facility.h>
> > > @@ -76,7 +77,7 @@ void *module_alloc(unsigned long size)
> > >  #ifdef CONFIG_FUNCTION_TRACER
> > >  void module_arch_cleanup(struct module *mod)
> > >  {
> > > -       module_memfree(mod->arch.trampolines_start);
> > > +       execmem_free(mod->arch.trampolines_start);
> > >  }
> > >  #endif
> > >
> > > @@ -510,7 +511,7 @@ static int module_alloc_ftrace_hotpatch_trampolines(struct module *me,
> > >
> > >         size = FTRACE_HOTPATCH_TRAMPOLINES_SIZE(s->sh_size);
> > >         numpages = DIV_ROUND_UP(size, PAGE_SIZE);
> > > -       start = module_alloc(numpages * PAGE_SIZE);
> > > +       start = execmem_text_alloc(EXECMEM_FTRACE, numpages * PAGE_SIZE);
> >
> > This should be EXECMEM_MODULE_TEXT?
>
> This is an ftrace trampoline, so I think it should be FTRACE type of
> allocation.

Yeah, I was aware of the ftrace trampoline. My point was, ftrace trampoline
doesn't seem to have any special requirements. Therefore, it is probably not
necessary to have a separate type just for it.

AFAICT, kprobe, ftrace, and BPF (JIT and trampoline) can share the same
execmem_type. We may need some work for some archs, but nothing is
fundamentally different among these.

Thanks,
Song

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 10/13] arch: make execmem setup available regardless of CONFIG_MODULES
  2023-09-18  7:29 ` [PATCH v3 10/13] arch: make execmem setup available regardless of CONFIG_MODULES Mike Rapoport
@ 2023-09-26  7:33   ` Arnd Bergmann
  2023-09-26  8:32     ` Mike Rapoport
  0 siblings, 1 reply; 49+ messages in thread
From: Arnd Bergmann @ 2023-09-26  7:33 UTC (permalink / raw)
  To: Mike Rapoport, linux-kernel
  Cc: Mark Rutland, x86, Catalin Marinas, Song Liu, sparclinux,
	linux-riscv, Nadav Amit, linux-s390, Helge Deller, Huacai Chen,
	Russell King, Naveen N. Rao, linux-trace-kernel, Will Deacon,
	Heiko Carstens, Steven Rostedt, loongarch, Björn Töpel,
	Thomas Gleixner, bpf, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Puranjay Mohan, linux-mm, Netdev, Kent Overstreet,
	linux-mips, Dinh Nguyen, Luis Chamberlain, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S . Miller,
	linux-modules

On Mon, Sep 18, 2023, at 09:29, Mike Rapoport wrote:
> index a42e4cd11db2..c0b536e398b4 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> +#ifdef CONFIG_XIP_KERNEL
> +/*
> + * The XIP kernel text is mapped in the module area for modules and
> + * some other stuff to work without any indirect relocations.
> + * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
> + * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
> + */
> +#undef MODULES_VADDR
> +#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
> +#endif
> +
> +#if defined(CONFIG_MMU) && defined(CONFIG_EXECMEM)
> +static struct execmem_params execmem_params __ro_after_init = {
> +	.ranges = {
> +		[EXECMEM_DEFAULT] = {
> +			.start = MODULES_VADDR,
> +			.end = MODULES_END,
> +			.alignment = 1,
> +		},

This causes a randconfig build failure for me on linux-next now:

arch/arm/mm/init.c:499:25: error: initializer element is not constant
  499 | #define MODULES_VADDR   (((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
      |                         ^
arch/arm/mm/init.c:506:34: note: in expansion of macro 'MODULES_VADDR'
  506 |                         .start = MODULES_VADDR,
      |                                  ^~~~~~~~~~~~~
arch/arm/mm/init.c:499:25: note: (near initialization for 'execmem_params.ranges[0].start')
  499 | #define MODULES_VADDR   (((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
      |                         ^
arch/arm/mm/init.c:506:34: note: in expansion of macro 'MODULES_VADDR'
  506 |                         .start = MODULES_VADDR,
      |                                  ^~~~~~~~~~~~~

I have not done any analysis on the issue so far, I hope
you can see the problem directly. See
https://pastebin.com/raw/xVqAyakH for a .config that runs into
this problem with gcc-13.2.0.
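
For readers following along, this class of error can be shown in a small
userspace sketch. C requires initializers of static objects to be constant
expressions, and a symbol address that is cast to an integer and then masked
is not one. The sketch below assumes a 2 MiB PMD and uses hypothetical names
(model_exiprom, MODEL_PMD_MASK, model_arch_params()); it shows one common way
to avoid the error, by assigning the value at run time, and is not necessarily
the fix adopted for this series.

```c
#include <assert.h>
#include <stdint.h>

static char model_exiprom[1];	/* stand-in for the linker symbol _exiprom */

#define MODEL_PMD_SIZE	((uintptr_t)0x200000)	/* assumed 2 MiB PMD */
#define MODEL_PMD_MASK	(~(MODEL_PMD_SIZE - 1))
#define MODEL_MODULES_VADDR \
	(((uintptr_t)model_exiprom + ~MODEL_PMD_MASK) & MODEL_PMD_MASK)

/* Hypothetical, reduced stand-in for struct execmem_range. */
struct range_model {
	uintptr_t start;
	uintptr_t end;
};

/*
 * A static initializer like the following is rejected with "initializer
 * element is not constant", because the value depends on arithmetic on a
 * symbol address:
 *
 *	static struct range_model r = { .start = MODEL_MODULES_VADDR };
 */
static struct range_model model_range;

/* Fill in the address-dependent fields at run time instead. */
static struct range_model *model_arch_params(void)
{
	model_range.start = MODEL_MODULES_VADDR;
	model_range.end = model_range.start + MODEL_PMD_SIZE;
	return &model_range;
}
```

The static object keeps only compile-time constants, and the init function
supplies everything derived from a symbol address.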

      Arnd

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free()
  2023-09-23 22:36       ` Song Liu
@ 2023-09-26  8:04         ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-26  8:04 UTC (permalink / raw)
  To: Song Liu
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, linux-mm,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Sat, Sep 23, 2023 at 03:36:01PM -0700, Song Liu wrote:
> On Sat, Sep 23, 2023 at 8:39 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Thu, Sep 21, 2023 at 03:34:18PM -0700, Song Liu wrote:
> > > On Mon, Sep 18, 2023 at 12:30 AM Mike Rapoport <rppt@kernel.org> wrote:
> > > >
> > >
> > > [...]
> > >
> > > > diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
> > > > index 42215f9404af..db5561d0c233 100644
> > > > --- a/arch/s390/kernel/module.c
> > > > +++ b/arch/s390/kernel/module.c
> > > > @@ -21,6 +21,7 @@
> > > >  #include <linux/moduleloader.h>
> > > >  #include <linux/bug.h>
> > > >  #include <linux/memory.h>
> > > > +#include <linux/execmem.h>
> > > >  #include <asm/alternative.h>
> > > >  #include <asm/nospec-branch.h>
> > > >  #include <asm/facility.h>
> > > > @@ -76,7 +77,7 @@ void *module_alloc(unsigned long size)
> > > >  #ifdef CONFIG_FUNCTION_TRACER
> > > >  void module_arch_cleanup(struct module *mod)
> > > >  {
> > > > -       module_memfree(mod->arch.trampolines_start);
> > > > +       execmem_free(mod->arch.trampolines_start);
> > > >  }
> > > >  #endif
> > > >
> > > > @@ -510,7 +511,7 @@ static int module_alloc_ftrace_hotpatch_trampolines(struct module *me,
> > > >
> > > >         size = FTRACE_HOTPATCH_TRAMPOLINES_SIZE(s->sh_size);
> > > >         numpages = DIV_ROUND_UP(size, PAGE_SIZE);
> > > > -       start = module_alloc(numpages * PAGE_SIZE);
> > > > +       start = execmem_text_alloc(EXECMEM_FTRACE, numpages * PAGE_SIZE);
> > >
> > > This should be EXECMEM_MODULE_TEXT?
> >
> > This is an ftrace trampoline, so I think it should be FTRACE type of
> > allocation.
> 
> Yeah, I was aware of the ftrace trampoline. My point was, ftrace trampoline
> doesn't seem to have any special requirements. Therefore, it is probably not
> necessary to have a separate type just for it.

Since ftrace trampolines are currently used only on s390 and x86, which
enforce the same range for all executable allocations, there are indeed no
special requirements. But I think that explicitly marking these allocations
as FTRACE makes it clearer what they are used for, and I don't see any
downsides to having a type for FTRACE.
 
> AFAICT, kprobe, ftrace, and BPF (JIT and trampoline) can share the same
> execmem_type. We may need some work for some archs, but nothing is
> fundamentally different among these.

Using the same type for all generated code implies that all types of the
generated code must live in the same range and I don't think we want to
impose this limitation on architectures.

For example, RISC-V deliberately added a range for BPF code to allow
relative addressing, see commit 7f3631e88ee6 ("riscv, bpf: Provide RISC-V
specific JIT image alloc/free").
 
> Thanks,
> Song

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 10/13] arch: make execmem setup available regardless of CONFIG_MODULES
  2023-09-26  7:33   ` Arnd Bergmann
@ 2023-09-26  8:32     ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-09-26  8:32 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Will Deacon, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, linux-mm, Netdev, Kent Overstreet, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3152 bytes --]

Hi Arnd,

On Tue, Sep 26, 2023 at 09:33:48AM +0200, Arnd Bergmann wrote:
> On Mon, Sep 18, 2023, at 09:29, Mike Rapoport wrote:
> > index a42e4cd11db2..c0b536e398b4 100644
> > --- a/arch/arm/mm/init.c
> > +++ b/arch/arm/mm/init.c
> > +#ifdef CONFIG_XIP_KERNEL
> > +/*
> > + * The XIP kernel text is mapped in the module area for modules and
> > + * some other stuff to work without any indirect relocations.
> > + * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
> > + * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
> > + */
> > +#undef MODULES_VADDR
> > +#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
> > +#endif
> > +
> > +#if defined(CONFIG_MMU) && defined(CONFIG_EXECMEM)
> > +static struct execmem_params execmem_params __ro_after_init = {
> > +	.ranges = {
> > +		[EXECMEM_DEFAULT] = {
> > +			.start = MODULES_VADDR,
> > +			.end = MODULES_END,
> > +			.alignment = 1,
> > +		},
> 
> This causes a randconfig build failure for me on linux-next now:
> 
> arch/arm/mm/init.c:499:25: error: initializer element is not constant
>   499 | #define MODULES_VADDR   (((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
>       |                         ^
> arch/arm/mm/init.c:506:34: note: in expansion of macro 'MODULES_VADDR'
>   506 |                         .start = MODULES_VADDR,
>       |                                  ^~~~~~~~~~~~~
> arch/arm/mm/init.c:499:25: note: (near initialization for 'execmem_params.ranges[0].start')
>   499 | #define MODULES_VADDR   (((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
>       |                         ^
> arch/arm/mm/init.c:506:34: note: in expansion of macro 'MODULES_VADDR'
>   506 |                         .start = MODULES_VADDR,
>       |                                  ^~~~~~~~~~~~~
>
> I have not done any analysis on the issue so far, I hope
> you can see the problem directly. See
> https://pastebin.com/raw/xVqAyakH for a .config that runs into
> this problem with gcc-13.2.0.

The first patch that breaks the XIP build is actually patch 04/13, currently
commit 52a34d45419f ("mm/execmem, arch: convert remaining overrides of
module_alloc to execmem") in mm.git/mm-unstable.

The hunk below is a fix for that, and the attached patch is the updated
version of 835bc9685f45 ("arch: make execmem setup available regardless of
CONFIG_MODULES").

Andrew, please let me know if you'd like me to resend these differently.

diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 2c7651a2d84c..096cc1ead635 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -38,8 +38,6 @@
 static struct execmem_params execmem_params __ro_after_init = {
 	.ranges = {
 		[EXECMEM_DEFAULT] = {
-			.start = MODULES_VADDR,
-			.end = MODULES_END,
 			.alignment = 1,
 		},
 	},
@@ -49,6 +47,8 @@ struct execmem_params __init *execmem_arch_params(void)
 {
 	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
 
+	r->start = MODULES_VADDR;
+	r->end = MODULES_END;
 	r->pgprot = PAGE_KERNEL_EXEC;
 
 	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS)) {
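
The reason for moving the assignments can be shown with a small standalone
sketch. This is plain C with made-up demo_* names and sizes, not the kernel
code: a value derived by masking the address of a symbol is not a constant
expression, so compilers such as gcc reject it in a static initializer, while
assigning it at run time in an init function is fine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2 MiB alignment unit standing in for PMD_SIZE/PMD_MASK. */
#define DEMO_PMD_SIZE	((uintptr_t)1 << 21)
#define DEMO_PMD_MASK	(~(DEMO_PMD_SIZE - 1))
#define DEMO_AREA_SIZE	((uintptr_t)16 << 20)

/* Stand-in for a linker-provided symbol such as _exiprom. */
char demo_exiprom[1];

/* Symbol address rounded up to the alignment unit: known only at run time. */
#define DEMO_MODULES_VADDR \
	(((uintptr_t)demo_exiprom + ~DEMO_PMD_MASK) & DEMO_PMD_MASK)

struct demo_range {
	uintptr_t start;
	uintptr_t end;
	unsigned int alignment;
};

static struct demo_range demo_params = {
	/*
	 * .start = DEMO_MODULES_VADDR here would fail to build with
	 * "initializer element is not constant".
	 */
	.alignment = 1,
};

/* Fill in the address-dependent fields at run time instead. */
struct demo_range *demo_arch_params(void)
{
	struct demo_range *r = &demo_params;

	r->start = DEMO_MODULES_VADDR;
	r->end = r->start + DEMO_AREA_SIZE;
	assert((r->start & ~DEMO_PMD_MASK) == 0);

	return r;
}
```

Only the compile-time constant (.alignment) stays in the initializer, which
mirrors the shape of the hunk above.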

 
> 
>       Arnd

-- 
Sincerely yours,
Mike.

[-- Attachment #2: 0001-arch-make-execmem-setup-available-regardless-of-CONF.patch --]
[-- Type: text/x-diff, Size: 33023 bytes --]

From a2dae5a88d172d54e7f074799a286faedd2cdb6a Mon Sep 17 00:00:00 2001
From: "Mike Rapoport (IBM)" <rppt@kernel.org>
Date: Wed, 31 May 2023 14:58:24 +0300
Subject: [PATCH] arch: make execmem setup available regardless of
 CONFIG_MODULES

execmem does not depend on modules; on the contrary, modules use
execmem.

To make execmem available when CONFIG_MODULES=n, for instance for
kprobes, split execmem_params initialization out from
arch/*/kernel/module.c and compile it when CONFIG_EXECMEM=y.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 arch/arm/kernel/module.c       |  38 ----------
 arch/arm/mm/init.c             |  38 ++++++++++
 arch/arm64/kernel/module.c     | 130 --------------------------------
 arch/arm64/mm/init.c           | 132 +++++++++++++++++++++++++++++++++
 arch/loongarch/kernel/module.c |  18 -----
 arch/loongarch/mm/init.c       |  20 +++++
 arch/mips/kernel/module.c      |  19 -----
 arch/mips/mm/init.c            |  20 +++++
 arch/parisc/kernel/module.c    |  17 -----
 arch/parisc/mm/init.c          |  22 +++++-
 arch/powerpc/kernel/module.c   |  60 ---------------
 arch/powerpc/mm/mem.c          |  62 ++++++++++++++++
 arch/riscv/kernel/module.c     |  39 ----------
 arch/riscv/mm/init.c           |  39 ++++++++++
 arch/s390/kernel/module.c      |  25 -------
 arch/s390/mm/init.c            |  28 +++++++
 arch/sparc/kernel/module.c     |  23 ------
 arch/sparc/mm/Makefile         |   2 +
 arch/sparc/mm/execmem.c        |  25 +++++++
 arch/x86/kernel/module.c       |  27 -------
 arch/x86/mm/init.c             |  29 ++++++++
 21 files changed, 416 insertions(+), 397 deletions(-)
 create mode 100644 arch/sparc/mm/execmem.c

diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 096cc1ead635..3282f304f6b1 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -16,50 +16,12 @@
 #include <linux/fs.h>
 #include <linux/string.h>
 #include <linux/gfp.h>
-#include <linux/execmem.h>
 
 #include <asm/sections.h>
 #include <asm/smp_plat.h>
 #include <asm/unwind.h>
 #include <asm/opcodes.h>
 
-#ifdef CONFIG_XIP_KERNEL
-/*
- * The XIP kernel text is mapped in the module area for modules and
- * some other stuff to work without any indirect relocations.
- * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
- * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
- */
-#undef MODULES_VADDR
-#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
-#endif
-
-#ifdef CONFIG_MMU
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
-
-	r->start = MODULES_VADDR;
-	r->end = MODULES_END;
-	r->pgprot = PAGE_KERNEL_EXEC;
-
-	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS)) {
-		r->fallback_start = VMALLOC_START;
-		r->fallback_end = VMALLOC_END;
-	}
-
-	return &execmem_params;
-}
-#endif
-
 bool module_init_section(const char *name)
 {
 	return strstarts(name, ".init") ||
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index a42e4cd11db2..2df78b9345e8 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -22,6 +22,7 @@
 #include <linux/sizes.h>
 #include <linux/stop_machine.h>
 #include <linux/swiotlb.h>
+#include <linux/execmem.h>
 
 #include <asm/cp15.h>
 #include <asm/mach-types.h>
@@ -486,3 +487,40 @@ void free_initrd_mem(unsigned long start, unsigned long end)
 	free_reserved_area((void *)start, (void *)end, -1, "initrd");
 }
 #endif
+
+#ifdef CONFIG_XIP_KERNEL
+/*
+ * The XIP kernel text is mapped in the module area for modules and
+ * some other stuff to work without any indirect relocations.
+ * MODULES_VADDR is redefined here and not in asm/memory.h to avoid
+ * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
+ */
+#undef MODULES_VADDR
+#define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
+#endif
+
+#if defined(CONFIG_MMU) && defined(CONFIG_EXECMEM)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
+
+	r->start = MODULES_VADDR;
+	r->end = MODULES_END;
+	r->pgprot = PAGE_KERNEL_EXEC;
+
+	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS)) {
+		r->fallback_start = VMALLOC_START;
+		r->fallback_end = VMALLOC_END;
+	}
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index d27db168d2a2..eb1505128b75 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -20,142 +20,12 @@
 #include <linux/random.h>
 #include <linux/scs.h>
 #include <linux/vmalloc.h>
-#include <linux/execmem.h>
 
 #include <asm/alternative.h>
 #include <asm/insn.h>
 #include <asm/scs.h>
 #include <asm/sections.h>
 
-static u64 module_direct_base __ro_after_init = 0;
-static u64 module_plt_base __ro_after_init = 0;
-
-/*
- * Choose a random page-aligned base address for a window of 'size' bytes which
- * entirely contains the interval [start, end - 1].
- */
-static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
-{
-	u64 max_pgoff, pgoff;
-
-	if ((end - start) >= size)
-		return 0;
-
-	max_pgoff = (size - (end - start)) / PAGE_SIZE;
-	pgoff = get_random_u32_inclusive(0, max_pgoff);
-
-	return start - pgoff * PAGE_SIZE;
-}
-
-/*
- * Modules may directly reference data and text anywhere within the kernel
- * image and other modules. References using PREL32 relocations have a +/-2G
- * range, and so we need to ensure that the entire kernel image and all modules
- * fall within a 2G window such that these are always within range.
- *
- * Modules may directly branch to functions and code within the kernel text,
- * and to functions and code within other modules. These branches will use
- * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
- * that the entire kernel text and all module text falls within a 128M window
- * such that these are always within range. With PLTs, we can expand this to a
- * 2G window.
- *
- * We chose the 128M region to surround the entire kernel image (rather than
- * just the text) as using the same bounds for the 128M and 2G regions ensures
- * by construction that we never select a 128M region that is not a subset of
- * the 2G region. For very large and unusual kernel configurations this means
- * we may fall back to PLTs where they could have been avoided, but this keeps
- * the logic significantly simpler.
- */
-static int __init module_init_limits(void)
-{
-	u64 kernel_end = (u64)_end;
-	u64 kernel_start = (u64)_text;
-	u64 kernel_size = kernel_end - kernel_start;
-
-	/*
-	 * The default modules region is placed immediately below the kernel
-	 * image, and is large enough to use the full 2G relocation range.
-	 */
-	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
-	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
-
-	if (!kaslr_enabled()) {
-		if (kernel_size < SZ_128M)
-			module_direct_base = kernel_end - SZ_128M;
-		if (kernel_size < SZ_2G)
-			module_plt_base = kernel_end - SZ_2G;
-	} else {
-		u64 min = kernel_start;
-		u64 max = kernel_end;
-
-		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
-			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
-		} else {
-			module_direct_base = random_bounding_box(SZ_128M, min, max);
-			if (module_direct_base) {
-				min = module_direct_base;
-				max = module_direct_base + SZ_128M;
-			}
-		}
-
-		module_plt_base = random_bounding_box(SZ_2G, min, max);
-	}
-
-	pr_info("%llu pages in range for non-PLT usage",
-		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
-	pr_info("%llu pages in range for PLT usage",
-		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
-
-	return 0;
-}
-
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-		},
-		[EXECMEM_KPROBES] = {
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-			.alignment = 1,
-		},
-		[EXECMEM_BPF] = {
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
-
-	module_init_limits();
-
-	r->pgprot = PAGE_KERNEL;
-
-	if (module_direct_base) {
-		r->start = module_direct_base;
-		r->end = module_direct_base + SZ_128M;
-
-		if (module_plt_base) {
-			r->fallback_start = module_plt_base;
-			r->fallback_end = module_plt_base + SZ_2G;
-		}
-	} else if (module_plt_base) {
-		r->start = module_plt_base;
-		r->end = module_plt_base + SZ_2G;
-	}
-
-	execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
-	execmem_params.ranges[EXECMEM_BPF].pgprot = PAGE_KERNEL;
-
-	return &execmem_params;
-}
-
 enum aarch64_reloc_op {
 	RELOC_OP_NONE,
 	RELOC_OP_ABS,
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8a0f8604348b..9b7716b4d84c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -31,6 +31,7 @@
 #include <linux/hugetlb.h>
 #include <linux/acpi_iort.h>
 #include <linux/kmemleak.h>
+#include <linux/execmem.h>
 
 #include <asm/boot.h>
 #include <asm/fixmap.h>
@@ -547,3 +548,134 @@ void dump_mem_limit(void)
 		pr_emerg("Memory Limit: none\n");
 	}
 }
+
+#ifdef CONFIG_EXECMEM
+static u64 module_direct_base __ro_after_init = 0;
+static u64 module_plt_base __ro_after_init = 0;
+
+/*
+ * Choose a random page-aligned base address for a window of 'size' bytes which
+ * entirely contains the interval [start, end - 1].
+ */
+static u64 __init random_bounding_box(u64 size, u64 start, u64 end)
+{
+	u64 max_pgoff, pgoff;
+
+	if ((end - start) >= size)
+		return 0;
+
+	max_pgoff = (size - (end - start)) / PAGE_SIZE;
+	pgoff = get_random_u32_inclusive(0, max_pgoff);
+
+	return start - pgoff * PAGE_SIZE;
+}
+
+/*
+ * Modules may directly reference data and text anywhere within the kernel
+ * image and other modules. References using PREL32 relocations have a +/-2G
+ * range, and so we need to ensure that the entire kernel image and all modules
+ * fall within a 2G window such that these are always within range.
+ *
+ * Modules may directly branch to functions and code within the kernel text,
+ * and to functions and code within other modules. These branches will use
+ * CALL26/JUMP26 relocations with a +/-128M range. Without PLTs, we must ensure
+ * that the entire kernel text and all module text falls within a 128M window
+ * such that these are always within range. With PLTs, we can expand this to a
+ * 2G window.
+ *
+ * We chose the 128M region to surround the entire kernel image (rather than
+ * just the text) as using the same bounds for the 128M and 2G regions ensures
+ * by construction that we never select a 128M region that is not a subset of
+ * the 2G region. For very large and unusual kernel configurations this means
+ * we may fall back to PLTs where they could have been avoided, but this keeps
+ * the logic significantly simpler.
+ */
+static int __init module_init_limits(void)
+{
+	u64 kernel_end = (u64)_end;
+	u64 kernel_start = (u64)_text;
+	u64 kernel_size = kernel_end - kernel_start;
+
+	/*
+	 * The default modules region is placed immediately below the kernel
+	 * image, and is large enough to use the full 2G relocation range.
+	 */
+	BUILD_BUG_ON(KIMAGE_VADDR != MODULES_END);
+	BUILD_BUG_ON(MODULES_VSIZE < SZ_2G);
+
+	if (!kaslr_enabled()) {
+		if (kernel_size < SZ_128M)
+			module_direct_base = kernel_end - SZ_128M;
+		if (kernel_size < SZ_2G)
+			module_plt_base = kernel_end - SZ_2G;
+	} else {
+		u64 min = kernel_start;
+		u64 max = kernel_end;
+
+		if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
+			pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
+		} else {
+			module_direct_base = random_bounding_box(SZ_128M, min, max);
+			if (module_direct_base) {
+				min = module_direct_base;
+				max = module_direct_base + SZ_128M;
+			}
+		}
+
+		module_plt_base = random_bounding_box(SZ_2G, min, max);
+	}
+
+	pr_info("%llu pages in range for non-PLT usage",
+		module_direct_base ? (SZ_128M - kernel_size) / PAGE_SIZE : 0);
+	pr_info("%llu pages in range for PLT usage",
+		module_plt_base ? (SZ_2G - kernel_size) / PAGE_SIZE : 0);
+
+	return 0;
+}
+
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+		},
+		[EXECMEM_KPROBES] = {
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+			.alignment = 1,
+		},
+		[EXECMEM_BPF] = {
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
+
+	module_init_limits();
+
+	r->pgprot = PAGE_KERNEL;
+
+	if (module_direct_base) {
+		r->start = module_direct_base;
+		r->end = module_direct_base + SZ_128M;
+
+		if (module_plt_base) {
+			r->fallback_start = module_plt_base;
+			r->fallback_end = module_plt_base + SZ_2G;
+		}
+	} else if (module_plt_base) {
+		r->start = module_plt_base;
+		r->end = module_plt_base + SZ_2G;
+	}
+
+	execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
+	execmem_params.ranges[EXECMEM_BPF].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index a1d8fe9796fa..181b5f8b09f1 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -18,7 +18,6 @@
 #include <linux/ftrace.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
-#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/inst.h>
 
@@ -470,23 +469,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.pgprot = PAGE_KERNEL,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-
-	return &execmem_params;
-}
-
 static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
 				   const Elf_Shdr *sechdrs, struct module *mod)
 {
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index f3fe8c06ba4d..26b10a51309c 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -24,6 +24,7 @@
 #include <linux/gfp.h>
 #include <linux/hugetlb.h>
 #include <linux/mmzone.h>
+#include <linux/execmem.h>
 
 #include <asm/asm-offsets.h>
 #include <asm/bootinfo.h>
@@ -247,3 +248,22 @@ EXPORT_SYMBOL(invalid_pmd_table);
 #endif
 pte_t invalid_pte_table[PTRS_PER_PTE] __page_aligned_bss;
 EXPORT_SYMBOL(invalid_pte_table);
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/mips/kernel/module.c b/arch/mips/kernel/module.c
index 1c959074b35f..ebf9496f5db0 100644
--- a/arch/mips/kernel/module.c
+++ b/arch/mips/kernel/module.c
@@ -33,25 +33,6 @@ struct mips_hi16 {
 static LIST_HEAD(dbe_list);
 static DEFINE_SPINLOCK(dbe_lock);
 
-#ifdef MODULE_START
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.start = MODULE_START,
-			.end = MODULE_END,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
-
-	return &execmem_params;
-}
-#endif
-
 static void apply_r_mips_32(u32 *location, u32 base, Elf_Addr v)
 {
 	*location = base + v;
diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 5dcb525a8995..55e7869d03f2 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -31,6 +31,7 @@
 #include <linux/gfp.h>
 #include <linux/kcore.h>
 #include <linux/initrd.h>
+#include <linux/execmem.h>
 
 #include <asm/bootinfo.h>
 #include <asm/cachectl.h>
@@ -573,3 +574,22 @@ EXPORT_SYMBOL_GPL(invalid_pmd_table);
 #endif
 pte_t invalid_pte_table[PTRS_PER_PTE] __page_aligned_bss;
 EXPORT_SYMBOL(invalid_pte_table);
+
+#if defined(CONFIG_EXECMEM) && defined(MODULE_START)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.start = MODULE_START,
+			.end = MODULE_END,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
index 0c6dfd1daef3..fecd2760b7a6 100644
--- a/arch/parisc/kernel/module.c
+++ b/arch/parisc/kernel/module.c
@@ -174,23 +174,6 @@ static inline int reassemble_22(int as22)
 		((as22 & 0x0003ff) << 3));
 }
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.pgprot = PAGE_KERNEL_RWX,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
-
-	return &execmem_params;
-}
-
 #ifndef CONFIG_64BIT
 static inline unsigned long count_gots(const Elf_Rela *rela, unsigned long n)
 {
diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index a088c243edea..c87fed38e38e 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -24,6 +24,7 @@
 #include <linux/nodemask.h>	/* for node_online_map */
 #include <linux/pagemap.h>	/* for release_pages */
 #include <linux/compat.h>
+#include <linux/execmem.h>
 
 #include <asm/pgalloc.h>
 #include <asm/tlb.h>
@@ -479,7 +480,7 @@ void free_initmem(void)
 	/* finally dump all the instructions which were cached, since the
 	 * pages are no-longer executable */
 	flush_icache_range(init_begin, init_end);
-	
+
 	free_initmem_default(POISON_FREE_INITMEM);
 
 	/* set up a new led state on systems shipped LED State panel */
@@ -919,3 +920,22 @@ static const pgprot_t protection_map[16] = {
 	[VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]	= PAGE_RWX
 };
 DECLARE_VM_GET_PAGE_PROT
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL_RWX,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index e4ecee1c87ef..b30e00964a60 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -10,7 +10,6 @@
 #include <linux/vmalloc.h>
 #include <linux/mm.h>
 #include <linux/bug.h>
-#include <linux/execmem.h>
 #include <asm/module.h>
 #include <linux/uaccess.h>
 #include <asm/firmware.h>
@@ -89,62 +88,3 @@ int module_finalize(const Elf_Ehdr *hdr,
 
 	return 0;
 }
-
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.alignment = 1,
-		},
-		[EXECMEM_KPROBES] = {
-			.alignment = 1,
-		},
-		[EXECMEM_MODULE_DATA] = {
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
-	struct execmem_range *range = &execmem_params.ranges[EXECMEM_DEFAULT];
-
-	/*
-	 * BOOK3S_32 and 8xx define MODULES_VADDR for text allocations and
-	 * allow allocating data in the entire vmalloc space
-	 */
-#ifdef MODULES_VADDR
-	struct execmem_range *data = &execmem_params.ranges[EXECMEM_MODULE_DATA];
-	unsigned long limit = (unsigned long)_etext - SZ_32M;
-
-	/* First try within 32M limit from _etext to avoid branch trampolines */
-	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit) {
-		range->start = limit;
-		range->end = MODULES_END;
-		range->fallback_start = MODULES_VADDR;
-		range->fallback_end = MODULES_END;
-	} else {
-		range->start = MODULES_VADDR;
-		range->end = MODULES_END;
-	}
-	data->start = VMALLOC_START;
-	data->end = VMALLOC_END;
-	data->pgprot = PAGE_KERNEL;
-	data->alignment = 1;
-#else
-	range->start = VMALLOC_START;
-	range->end = VMALLOC_END;
-#endif
-
-	range->pgprot = prot;
-
-	execmem_params.ranges[EXECMEM_KPROBES].start = range->start;
-	execmem_params.ranges[EXECMEM_KPROBES].end = range->end;
-
-	if (strict_module_rwx_enabled())
-		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
-	else
-		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
-
-	return &execmem_params;
-}
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 8b121df7b08f..466e912181af 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -16,6 +16,7 @@
 #include <linux/highmem.h>
 #include <linux/suspend.h>
 #include <linux/dma-direct.h>
+#include <linux/execmem.h>
 
 #include <asm/swiotlb.h>
 #include <asm/machdep.h>
@@ -406,3 +407,64 @@ int devmem_is_allowed(unsigned long pfn)
  * the EHEA driver. Drop this when drivers/net/ethernet/ibm/ehea is removed.
  */
 EXPORT_SYMBOL_GPL(walk_system_ram_range);
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.alignment = 1,
+		},
+		[EXECMEM_KPROBES] = {
+			.alignment = 1,
+		},
+		[EXECMEM_MODULE_DATA] = {
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
+	struct execmem_range *range = &execmem_params.ranges[EXECMEM_DEFAULT];
+
+	/*
+	 * BOOK3S_32 and 8xx define MODULES_VADDR for text allocations and
+	 * allow allocating data in the entire vmalloc space
+	 */
+#ifdef MODULES_VADDR
+	struct execmem_range *data = &execmem_params.ranges[EXECMEM_MODULE_DATA];
+	unsigned long limit = (unsigned long)_etext - SZ_32M;
+
+	/* First try within 32M limit from _etext to avoid branch trampolines */
+	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit) {
+		range->start = limit;
+		range->end = MODULES_END;
+		range->fallback_start = MODULES_VADDR;
+		range->fallback_end = MODULES_END;
+	} else {
+		range->start = MODULES_VADDR;
+		range->end = MODULES_END;
+	}
+	data->start = VMALLOC_START;
+	data->end = VMALLOC_END;
+	data->pgprot = PAGE_KERNEL;
+	data->alignment = 1;
+#else
+	range->start = VMALLOC_START;
+	range->end = VMALLOC_END;
+#endif
+
+	range->pgprot = prot;
+
+	execmem_params.ranges[EXECMEM_KPROBES].start = range->start;
+	execmem_params.ranges[EXECMEM_KPROBES].end = range->end;
+
+	if (strict_module_rwx_enabled())
+		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
+	else
+		execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_EXEC;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 31505ecb5c72..8af08d5449bf 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -11,7 +11,6 @@
 #include <linux/vmalloc.h>
 #include <linux/sizes.h>
 #include <linux/pgtable.h>
-#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/sections.h>
 
@@ -436,44 +435,6 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }
 
-#ifdef CONFIG_MMU
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.pgprot = PAGE_KERNEL,
-			.alignment = 1,
-		},
-		[EXECMEM_KPROBES] = {
-			.pgprot = PAGE_KERNEL_READ_EXEC,
-			.alignment = 1,
-		},
-		[EXECMEM_BPF] = {
-			.pgprot = PAGE_KERNEL,
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-#ifdef CONFIG_64BIT
-	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-#else
-	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
-#endif
-
-	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
-	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
-
-	execmem_params.ranges[EXECMEM_BPF].start = BPF_JIT_REGION_START;
-	execmem_params.ranges[EXECMEM_BPF].end = BPF_JIT_REGION_END;
-
-	return &execmem_params;
-}
-#endif
-
 int module_finalize(const Elf_Ehdr *hdr,
 		    const Elf_Shdr *sechdrs,
 		    struct module *me)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 0798bd861dcb..b0f7848f39e3 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -24,6 +24,7 @@
 #include <linux/elf.h>
 #endif
 #include <linux/kfence.h>
+#include <linux/execmem.h>
 
 #include <asm/fixmap.h>
 #include <asm/io.h>
@@ -1564,3 +1565,41 @@ void __init pgtable_cache_init(void)
 		preallocate_pgd_pages_range(MODULES_VADDR, MODULES_END, "bpf/modules");
 }
 #endif
+
+#if defined(CONFIG_MMU) && defined(CONFIG_EXECMEM)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+		[EXECMEM_KPROBES] = {
+			.pgprot = PAGE_KERNEL_READ_EXEC,
+			.alignment = 1,
+		},
+		[EXECMEM_BPF] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+#ifdef CONFIG_64BIT
+	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+#else
+	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+#endif
+
+	execmem_params.ranges[EXECMEM_KPROBES].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_KPROBES].end = VMALLOC_END;
+
+	execmem_params.ranges[EXECMEM_BPF].start = BPF_JIT_REGION_START;
+	execmem_params.ranges[EXECMEM_BPF].end = BPF_JIT_REGION_END;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index 538d5f24af66..81a8d92ca092 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -37,31 +37,6 @@
 
 #define PLT_ENTRY_SIZE 22
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-			.pgprot = PAGE_KERNEL,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	unsigned long module_load_offset = 0;
-	unsigned long start;
-
-	if (kaslr_enabled())
-		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
-
-	start = MODULES_VADDR + module_load_offset;
-	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-
-	return &execmem_params;
-}
-
 #ifdef CONFIG_FUNCTION_TRACER
 void module_arch_cleanup(struct module *mod)
 {
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 8b94d2212d33..2e6d6512fc5f 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -34,6 +34,7 @@
 #include <linux/percpu.h>
 #include <asm/processor.h>
 #include <linux/uaccess.h>
+#include <linux/execmem.h>
 #include <asm/pgalloc.h>
 #include <asm/kfence.h>
 #include <asm/ptdump.h>
@@ -311,3 +312,30 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 	vmem_remove_mapping(start, size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+			.pgprot = PAGE_KERNEL,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	unsigned long module_load_offset = 0;
+	unsigned long start;
+
+	if (kaslr_enabled())
+		module_load_offset = get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
+
+	start = MODULES_VADDR + module_load_offset;
+	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
+}
+#endif
diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
index 1d8d1fba95b9..dff1d85ba202 100644
--- a/arch/sparc/kernel/module.c
+++ b/arch/sparc/kernel/module.c
@@ -14,7 +14,6 @@
 #include <linux/string.h>
 #include <linux/ctype.h>
 #include <linux/mm.h>
-#include <linux/execmem.h>
 #ifdef CONFIG_SPARC64
 #include <linux/jump_label.h>
 #endif
@@ -25,28 +24,6 @@
 
 #include "entry.h"
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-#ifdef CONFIG_SPARC64
-			.start = MODULES_VADDR,
-			.end = MODULES_END,
-#else
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-#endif
-			.alignment = 1,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
-
-	return &execmem_params;
-}
-
 /* Make generic code ignore STT_REGISTER dummy undefined symbols.  */
 int module_frob_arch_sections(Elf_Ehdr *hdr,
 			      Elf_Shdr *sechdrs,
diff --git a/arch/sparc/mm/Makefile b/arch/sparc/mm/Makefile
index 871354aa3c00..87e2cf7efb5b 100644
--- a/arch/sparc/mm/Makefile
+++ b/arch/sparc/mm/Makefile
@@ -15,3 +15,5 @@ obj-$(CONFIG_SPARC32)   += leon_mm.o
 
 # Only used by sparc64
 obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
+
+obj-$(CONFIG_EXECMEM) += execmem.o
diff --git a/arch/sparc/mm/execmem.c b/arch/sparc/mm/execmem.c
new file mode 100644
index 000000000000..fb53a859869a
--- /dev/null
+++ b/arch/sparc/mm/execmem.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/mm.h>
+#include <linux/execmem.h>
+
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+#ifdef CONFIG_SPARC64
+			.start = MODULES_VADDR,
+			.end = MODULES_END,
+#else
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+#endif
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
+}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 9d37375e2f05..c52d591c0f3f 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -19,7 +19,6 @@
 #include <linux/jump_label.h>
 #include <linux/random.h>
 #include <linux/memory.h>
-#include <linux/execmem.h>
 
 #include <asm/text-patching.h>
 #include <asm/page.h>
@@ -37,32 +36,6 @@ do {							\
 } while (0)
 #endif
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-		},
-	},
-};
-
-struct execmem_params __init *execmem_arch_params(void)
-{
-	unsigned long module_load_offset = 0;
-	unsigned long start;
-
-	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled())
-		module_load_offset =
-			get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
-
-	start = MODULES_VADDR + module_load_offset;
-	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
-
-	return &execmem_params;
-}
-
 #ifdef CONFIG_X86_32
 int apply_relocate(Elf32_Shdr *sechdrs,
 		   const char *strtab,
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 679893ea5e68..022af7ab50f9 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -7,6 +7,7 @@
 #include <linux/swapops.h>
 #include <linux/kmemleak.h>
 #include <linux/sched/task.h>
+#include <linux/execmem.h>
 
 #include <asm/set_memory.h>
 #include <asm/cpu_device_id.h>
@@ -1099,3 +1100,31 @@ unsigned long arch_max_swapfile_size(void)
 	return pages;
 }
 #endif
+
+#ifdef CONFIG_EXECMEM
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.flags = EXECMEM_KASAN_SHADOW,
+			.alignment = MODULE_ALIGN,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
+{
+	unsigned long module_load_offset = 0;
+	unsigned long start;
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled())
+		module_load_offset =
+			get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
+
+	start = MODULES_VADDR + module_load_offset;
+	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
+}
+#endif /* CONFIG_EXECMEM */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  2023-09-18  7:29 ` [PATCH v3 04/13] mm/execmem, arch: convert remaining " Mike Rapoport
@ 2023-10-04  0:29   ` Edgecombe, Rick P
  2023-10-05  5:28     ` Mike Rapoport
  2023-10-23 17:14   ` Will Deacon
  1 sibling, 1 reply; 49+ messages in thread
From: Edgecombe, Rick P @ 2023-10-04  0:29 UTC (permalink / raw)
  To: linux-kernel, rppt
  Cc: mark.rutland, x86, catalin.marinas, song, sparclinux,
	linux-riscv, nadav.amit, linux-s390, deller, chenhuacai, linux,
	naveen.n.rao, linux-trace-kernel, will, hca, rostedt, loongarch,
	bjorn, tglx, akpm, linux-arm-kernel@lists.infradead.org,
	tsbogend, puranjay12, linux-parisc, linux-mm, netdev,
	kent.overstreet, linux-mips, dinguyen, mcgrof, palmer, bpf,
	linuxppc-dev, davem, linux-modules

On Mon, 2023-09-18 at 10:29 +0300, Mike Rapoport wrote:
> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
> index 5f71a0cf4399..9d37375e2f05 100644
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -19,6 +19,7 @@
>  #include <linux/jump_label.h>
>  #include <linux/random.h>
>  #include <linux/memory.h>
> +#include <linux/execmem.h>
>  
>  #include <asm/text-patching.h>
>  #include <asm/page.h>
> @@ -36,55 +37,30 @@ do {                                                        \
>  } while (0)
>  #endif
>  
> -#ifdef CONFIG_RANDOMIZE_BASE
> -static unsigned long module_load_offset;
> +static struct execmem_params execmem_params __ro_after_init = {
> +       .ranges = {
> +               [EXECMEM_DEFAULT] = {
> +                       .flags = EXECMEM_KASAN_SHADOW,
> +                       .alignment = MODULE_ALIGN,
> +               },
> +       },
> +};
>  
> -/* Mutex protects the module_load_offset. */
> -static DEFINE_MUTEX(module_kaslr_mutex);
> -
> -static unsigned long int get_module_load_offset(void)
> -{
> -       if (kaslr_enabled()) {
> -               mutex_lock(&module_kaslr_mutex);
> -               /*
> -                * Calculate the module_load_offset the first time this
> -                * code is called. Once calculated it stays the same until
> -                * reboot.
> -                */
> -               if (module_load_offset == 0)
> -                       module_load_offset =
> -                               get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> -               mutex_unlock(&module_kaslr_mutex);
> -       }
> -       return module_load_offset;
> -}
> -#else
> -static unsigned long int get_module_load_offset(void)
> -{
> -       return 0;
> -}
> -#endif
> -
> -void *module_alloc(unsigned long size)
> +struct execmem_params __init *execmem_arch_params(void)
>  {
> -       gfp_t gfp_mask = GFP_KERNEL;
> -       void *p;
> -
> -       if (PAGE_ALIGN(size) > MODULES_LEN)
> -               return NULL;
> +       unsigned long module_load_offset = 0;
> +       unsigned long start;
>  
> -       p = __vmalloc_node_range(size, MODULE_ALIGN,
> -                                MODULES_VADDR + get_module_load_offset(),
> -                                MODULES_END, gfp_mask, PAGE_KERNEL,
> -                                VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> -                                NUMA_NO_NODE, __builtin_return_address(0));
> +       if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled())
> +               module_load_offset =
> +                       get_random_u32_inclusive(1, 1024) * PAGE_SIZE;

Minor:
I think you can skip the IS_ENABLED(CONFIG_RANDOMIZE_BASE) part because
CONFIG_RANDOMIZE_MEMORY depends on CONFIG_RANDOMIZE_BASE (which is
checked in kaslr_enabled()).
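
To illustrate the point, here is a minimal userspace sketch (not kernel
code). It assumes, as stated above, that kaslr_enabled() already tests a
config option that can only be set when CONFIG_RANDOMIZE_BASE is set. The
MOCK_* macros and the mock_*/cond_* helpers are hypothetical stand-ins
invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Kconfig symbols (assumption). */
#define MOCK_RANDOMIZE_BASE   1
#define MOCK_RANDOMIZE_MEMORY 1

/* Models the Kconfig dependency: MEMORY cannot be set without BASE. */
#if MOCK_RANDOMIZE_MEMORY && !MOCK_RANDOMIZE_BASE
#error "RANDOMIZE_MEMORY depends on RANDOMIZE_BASE"
#endif

/* Models kaslr_enabled(): a config check combined with a boot-time flag. */
static bool mock_kaslr_enabled(bool boot_flag)
{
	return MOCK_RANDOMIZE_MEMORY && boot_flag;
}

/* Condition as written in the patch. */
static bool cond_with_is_enabled(bool boot_flag)
{
	return MOCK_RANDOMIZE_BASE && mock_kaslr_enabled(boot_flag);
}

/* Condition with the extra IS_ENABLED() check dropped. */
static bool cond_simplified(bool boot_flag)
{
	return mock_kaslr_enabled(boot_flag);
}
```

Under that dependency both conditions agree for every value of the boot
flag, so the extra check only adds noise.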

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
  2023-09-18  7:29 ` [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem Mike Rapoport
@ 2023-10-04  0:29   ` Edgecombe, Rick P
  2023-10-04 15:39     ` Edgecombe, Rick P
  2023-10-05 18:11   ` Edgecombe, Rick P
  1 sibling, 1 reply; 49+ messages in thread
From: Edgecombe, Rick P @ 2023-10-04  0:29 UTC (permalink / raw)
  To: linux-kernel, rppt
  Cc: mark.rutland, x86, catalin.marinas, song, sparclinux,
	linux-riscv, nadav.amit, linux-s390, deller, chenhuacai, linux,
	naveen.n.rao, linux-trace-kernel, will, hca, rostedt, loongarch,
	bjorn, tglx, akpm, linux-arm-kernel@lists.infradead.org,
	tsbogend, puranjay12, linux-parisc, linux-mm, netdev,
	kent.overstreet, linux-mips, dinguyen, mcgrof, palmer, bpf,
	linuxppc-dev, davem, linux-modules

On Mon, 2023-09-18 at 10:29 +0300, Mike Rapoport wrote:
> +
> +static void execmem_init_missing(struct execmem_params *p)
> +{
> +       struct execmem_range *default_range = &p->ranges[EXECMEM_DEFAULT];
> +
> +       for (int i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
> +               struct execmem_range *r = &p->ranges[i];
> +
> +               if (!r->start) {
> +                       r->pgprot = default_range->pgprot;
> +                       r->alignment = default_range->alignment;
> +                       r->start = default_range->start;
> +                       r->end = default_range->end;
> +               }
> +       }
> +}
> +

It seems a bit weird to copy all of this. Is it trying to be faster or
something?

Couldn't it just check r->start in execmem_text/data_alloc() path and
switch to EXECMEM_DEFAULT if needed then? The execmem_range_is_data()
part that comes later could be added to the logic there too. So this
seems like unnecessary complexity to me or I don't see the reason.
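
A rough userspace sketch of the lookup-time fallback I have in mind. The
types are simplified, hypothetical mirrors of the ones in the series, and
execmem_pick_range() is a name made up for this example:

```c
#include <assert.h>

/* Simplified mirror of the series' enum (assumption: fewer types). */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

struct execmem_range {
	unsigned long start;
	unsigned long end;
	unsigned int alignment;
};

struct execmem_params {
	struct execmem_range ranges[EXECMEM_TYPE_MAX];
};

/*
 * Pick the range for @type at allocation time. An entry whose start is
 * zero is treated as "not set by the architecture" and falls back to
 * EXECMEM_DEFAULT, so nothing has to be copied at init time.
 */
static const struct execmem_range *
execmem_pick_range(const struct execmem_params *p, enum execmem_type type)
{
	assert(type < EXECMEM_TYPE_MAX);

	if (!p->ranges[type].start)
		return &p->ranges[EXECMEM_DEFAULT];

	return &p->ranges[type];
}
```

The cost is one extra branch per allocation; the benefit is that unset
entries never need to be filled in.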

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
  2023-10-04  0:29   ` Edgecombe, Rick P
@ 2023-10-04 15:39     ` Edgecombe, Rick P
  2023-10-05  5:26       ` Mike Rapoport
  0 siblings, 1 reply; 49+ messages in thread
From: Edgecombe, Rick P @ 2023-10-04 15:39 UTC (permalink / raw)
  To: linux-kernel, rppt
  Cc: mark.rutland, x86, catalin.marinas, song, sparclinux,
	linux-riscv, nadav.amit, linux-s390, deller, chenhuacai, linux,
	naveen.n.rao, linux-trace-kernel, will, hca, rostedt, loongarch,
	bjorn, tglx, akpm, linux-arm-kernel@lists.infradead.org,
	tsbogend, puranjay12, linux-parisc, linux-mm, netdev,
	kent.overstreet, linux-mips, dinguyen, mcgrof, palmer, bpf,
	linuxppc-dev, davem, linux-modules

On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote:
> It seems a bit weird to copy all of this. Is it trying to be faster or
> something?
> 
> Couldn't it just check r->start in execmem_text/data_alloc() path and
> switch to EXECMEM_DEFAULT if needed then? The execmem_range_is_data()
> part that comes later could be added to the logic there too. So this
> seems like unnecessary complexity to me or I don't see the reason.

I guess this is a bad idea because if you have the full size array
sitting around anyway you might as well use it and reduce the
execmem_alloc() logic. Just looking at it from the x86 side (and
similar) though, where there is actually only one execmem_range,
building this whole array with identical data seems weird.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
  2023-10-04 15:39     ` Edgecombe, Rick P
@ 2023-10-05  5:26       ` Mike Rapoport
  2023-10-05 18:09         ` Edgecombe, Rick P
  0 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2023-10-05  5:26 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: mark.rutland, x86, catalin.marinas, linux, song, sparclinux,
	linux-riscv, nadav.amit, linux-s390, deller, chenhuacai, mcgrof,
	naveen.n.rao, linux-mips, linux-trace-kernel, will, hca, rostedt,
	loongarch, tglx, akpm, linux-arm-kernel, tsbogend, puranjay12,
	linux-parisc, linux-mm, netdev, kent.overstreet, linux-kernel,
	dinguyen, bjorn, palmer, bpf, linuxppc-dev, davem, linux-modules

On Wed, Oct 04, 2023 at 03:39:26PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote:
> > It seems a bit weird to copy all of this. Is it trying to be faster or
> > something?
> > 
> > Couldn't it just check r->start in execmem_text/data_alloc() path and
> > switch to EXECMEM_DEFAULT if needed then? The execmem_range_is_data()
> > part that comes later could be added to the logic there too. So this
> > seems like unnecessary complexity to me or I don't see the reason.
> 
> I guess this is a bad idea because if you have the full size array
> sitting around anyway you might as well use it and reduce the
> execmem_alloc() logic.

That was the idea, indeed. :)

> Just looking at it from the x86 side (and
> similar) though, where there is actually only one execmem_range,
> building this whole array with identical data seems weird.

Right, most architectures have only one range, but to support all variants
that we have, execmem has to maintain the whole array.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  2023-10-04  0:29   ` Edgecombe, Rick P
@ 2023-10-05  5:28     ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-10-05  5:28 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: mark.rutland, x86, catalin.marinas, linux, song, sparclinux,
	linux-riscv, nadav.amit, linux-s390, deller, chenhuacai, mcgrof,
	naveen.n.rao, linux-mips, linux-trace-kernel, will, hca, rostedt,
	loongarch, tglx, akpm, linux-arm-kernel, tsbogend, puranjay12,
	linux-parisc, linux-mm, netdev, kent.overstreet, linux-kernel,
	dinguyen, bjorn, palmer, bpf, linuxppc-dev, davem, linux-modules

On Wed, Oct 04, 2023 at 12:29:36AM +0000, Edgecombe, Rick P wrote:
> On Mon, 2023-09-18 at 10:29 +0300, Mike Rapoport wrote:
> > diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
> > index 5f71a0cf4399..9d37375e2f05 100644
> > --- a/arch/x86/kernel/module.c
> > +++ b/arch/x86/kernel/module.c
> >
> > -void *module_alloc(unsigned long size)
> > +struct execmem_params __init *execmem_arch_params(void)
> >  {
> > -       gfp_t gfp_mask = GFP_KERNEL;
> > -       void *p;
> > -
> > -       if (PAGE_ALIGN(size) > MODULES_LEN)
> > -               return NULL;
> > +       unsigned long module_load_offset = 0;
> > +       unsigned long start;
> >  
> > -       p = __vmalloc_node_range(size, MODULE_ALIGN,
> > -                                MODULES_VADDR + get_module_load_offset(),
> > -                                MODULES_END, gfp_mask, PAGE_KERNEL,
> > -                                VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
> > -                                NUMA_NO_NODE, __builtin_return_address(0));
> > +       if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled())
> > +               module_load_offset =
> > +                       get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
> 
> Minor:
> I think you can skip the IS_ENABLED(CONFIG_RANDOMIZE_BASE) part because
> CONFIG_RANDOMIZE_MEMORY depends on CONFIG_RANDOMIZE_BASE (which is
> checked in kaslr_enabled()).

Thanks, I'll look into it.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
  2023-10-05  5:26       ` Mike Rapoport
@ 2023-10-05 18:09         ` Edgecombe, Rick P
  2023-10-26  8:40           ` Mike Rapoport
  0 siblings, 1 reply; 49+ messages in thread
From: Edgecombe, Rick P @ 2023-10-05 18:09 UTC (permalink / raw)
  To: rppt
  Cc: mark.rutland, chenhuacai, catalin.marinas, linux-kernel, song,
	sparclinux, linux-riscv, nadav.amit, linux-s390, deller, x86,
	linux, naveen.n.rao, linux-trace-kernel, will, hca, rostedt,
	loongarch, bjorn, tglx, akpm, linux-arm-kernel, tsbogend,
	puranjay12, linux-parisc, linux-mm, netdev, kent.overstreet,
	linux-mips, dinguyen, mcgrof, palmer, bpf, linuxppc-dev, davem,
	linux-modules

On Thu, 2023-10-05 at 08:26 +0300, Mike Rapoport wrote:
> On Wed, Oct 04, 2023 at 03:39:26PM +0000, Edgecombe, Rick P wrote:
> > On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote:
> > > It seems a bit weird to copy all of this. Is it trying to be faster
> > > or something?
> > > 
> > > Couldn't it just check r->start in execmem_text/data_alloc() path
> > > and switch to EXECMEM_DEFAULT if needed then? The
> > > execmem_range_is_data() part that comes later could be added to the
> > > logic there too. So this seems like unnecessary complexity to me or
> > > I don't see the reason.
> > 
> > I guess this is a bad idea because if you have the full size array
> > sitting around anyway you might as well use it and reduce the
> > execmem_alloc() logic.
> 
> That was the idea, indeed. :)
> 
> > Just looking at it from the x86 side (and
> > similar) though, where there is actually only one execmem_range,
> > building this whole array with identical data seems weird.
> 
> Right, most architectures have only one range, but to support all
> variants that we have, execmem has to maintain the whole array.

What about just having an index into a smaller set of ranges: the
module area and the extra JIT areas? So ->ranges can be size 3
(statically allocated in the arch code) for three areas and then the
index array can be size EXECMEM_TYPE_MAX. The default 0 value of the
index array will point to the default area, and any special areas can
be set in the index array to point to the desired range.
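
As a standalone illustration of the indexing idea, here is a minimal
userspace sketch. The types are simplified, hypothetical mirrors of the
ones in the series, and execmem_range_for() is a name made up for this
example:

```c
#include <assert.h>

/* Simplified mirror of the series' enum (assumption: fewer types). */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

struct execmem_range {
	unsigned long start;
	unsigned long end;
	unsigned int alignment;
};

struct execmem_params {
	/* Per-type index into ranges; zero-initialized means "default". */
	int areas[EXECMEM_TYPE_MAX];
	/* Small array owned by the arch code, one entry per distinct area. */
	struct execmem_range *ranges;
};

/* Resolve an allocation type to its area through the index array. */
static struct execmem_range *
execmem_range_for(struct execmem_params *p, enum execmem_type type)
{
	assert(type < EXECMEM_TYPE_MAX);

	return &p->ranges[p->areas[type]];
}
```

An architecture with a single area leaves areas[] all zero and provides a
one-element ranges array; only architectures with special areas need to
set an index.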

Looking at how it would do for x86 and arm64, it looks maybe a bit
better to me. A little bit less code and memory usage, and a bit easier
to trace the configuration through to the final state (IMO). What do
you think? Very rough, on top of this series, below.

As I was playing around with this, I was also wondering why it needs
two copies of struct execmem_params: one returned from the arch code
and one in execmem. And why is the temporary arch copy ro_after_init,
but the final execmem.c copy is not?

 arch/arm64/mm/init.c    | 67 ++++++++++++++++++++++++++++++++++++++-----------------------------
 arch/x86/mm/init.c      | 24 +++++++++++++-----------
 include/linux/execmem.h |  5 +++--
 mm/execmem.c            | 61 ++++++++++++++++---------------------------------------------
 4 files changed, 70 insertions(+), 87 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 9b7716b4d84c..7df119101f20 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -633,49 +633,58 @@ static int __init module_init_limits(void)
 	return 0;
 }
 
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-		},
-		[EXECMEM_KPROBES] = {
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-			.alignment = 1,
-		},
-		[EXECMEM_BPF] = {
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-			.alignment = 1,
-		},
+static struct execmem_range ranges[3] __ro_after_init = {
+	/* Module area */
+	[0] = {
+		.flags = EXECMEM_KASAN_SHADOW,
+		.alignment = MODULE_ALIGN,
+	},
+	/* Kprobes area */
+	[1] = {
+		.start = VMALLOC_START,
+		.end = VMALLOC_END,
+		.alignment = 1,
+	},
+	/* BPF area */
+	[2] = {
+		.start = VMALLOC_START,
+		.end = VMALLOC_END,
+		.alignment = 1,
 	},
 };
 
-struct execmem_params __init *execmem_arch_params(void)
+void __init execmem_arch_params(struct execmem_params *p)
 {
-	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
+	struct execmem_range *def;
+	struct execmem_range *jit;
+
+	p->ranges = ranges;
 
 	module_init_limits();
 
-	r->pgprot = PAGE_KERNEL;
-
+	/* Default area */
+	def = &ranges[0];
+	def->pgprot = PAGE_KERNEL;
 	if (module_direct_base) {
-		r->start = module_direct_base;
-		r->end = module_direct_base + SZ_128M;
+		def->start = module_direct_base;
+		def->end = module_direct_base + SZ_128M;
 
 		if (module_plt_base) {
-			r->fallback_start = module_plt_base;
-			r->fallback_end = module_plt_base + SZ_2G;
+			def->fallback_start = module_plt_base;
+			def->fallback_end = module_plt_base + SZ_2G;
 		}
 	} else if (module_plt_base) {
-		r->start = module_plt_base;
-		r->end = module_plt_base + SZ_2G;
+		def->start = module_plt_base;
+		def->end = module_plt_base + SZ_2G;
 	}
 
-	execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
-	execmem_params.ranges[EXECMEM_BPF].pgprot = PAGE_KERNEL;
+	/* Jit area */
+	ranges[1].pgprot = PAGE_KERNEL_ROX;
+	p->areas[EXECMEM_KPROBES] = 1;
+
 
-	return &execmem_params;
+	/* BPF Area */
+	ranges[2].pgprot = PAGE_KERNEL;
+	p->areas[EXECMEM_BPF] = 2;
 }
 #endif
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 022af7ab50f9..7397472ffc39 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -1102,16 +1102,15 @@ unsigned long arch_max_swapfile_size(void)
 #endif
 
 #ifdef CONFIG_EXECMEM
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-		},
+static struct execmem_range ranges[1] __ro_after_init = {
+	/* Module area */
+	[0] = {
+		.flags = EXECMEM_KASAN_SHADOW,
+		.alignment = MODULE_ALIGN,
 	},
 };
 
-struct execmem_params __init *execmem_arch_params(void)
+void __init execmem_arch_params(struct execmem_params *p)
 {
 	unsigned long module_load_offset = 0;
 	unsigned long start;
@@ -1121,10 +1120,13 @@ struct execmem_params __init *execmem_arch_params(void)
 			get_random_u32_inclusive(1, 1024) * PAGE_SIZE;
 
 	start = MODULES_VADDR + module_load_offset;
-	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+	p->ranges = ranges;
 
-	return &execmem_params;
+	/* Module area */
+	p->ranges[0].start = start;
+	p->ranges[0].end = MODULES_END;
+	p->ranges[0].pgprot = PAGE_KERNEL;
+	p->ranges[0].flags = EXECMEM_KASAN_SHADOW;
+	p->ranges[0].alignment = MODULE_ALIGN;
 }
 #endif /* CONFIG_EXECMEM */
diff --git a/include/linux/execmem.h b/include/linux/execmem.h
index 09d45ac786e9..702435443d87 100644
--- a/include/linux/execmem.h
+++ b/include/linux/execmem.h
@@ -77,7 +77,8 @@ struct execmem_range {
  * each type of executable memory allocations
  */
 struct execmem_params {
-	struct execmem_range	ranges[EXECMEM_TYPE_MAX];
+	int areas[EXECMEM_TYPE_MAX];
+	struct execmem_range	*ranges;
 };
 
 /**
@@ -92,7 +93,7 @@ struct execmem_params {
  * Return: a structure defining architecture parameters and restrictions
  * for allocations of executable memory
  */
-struct execmem_params *execmem_arch_params(void);
+void execmem_arch_params(struct execmem_params *p);
 
 /**
  * execmem_text_alloc - allocate executable memory
diff --git a/mm/execmem.c b/mm/execmem.c
index aeff85261360..dfdec8c2b074 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -6,15 +6,15 @@
 #include <linux/moduleloader.h>
 
 static struct execmem_params execmem_params;
+static struct execmem_range default_range;
 
-static void *execmem_alloc(size_t size, struct execmem_range *range)
+static void *execmem_alloc(size_t size, struct execmem_range *range, pgprot_t pgprot)
 {
 	unsigned long start = range->start;
 	unsigned long end = range->end;
 	unsigned long fallback_start = range->fallback_start;
 	unsigned long fallback_end = range->fallback_end;
 	unsigned int align = range->alignment;
-	pgprot_t pgprot = range->pgprot;
 	bool kasan = range->flags & EXECMEM_KASAN_SHADOW;
 	unsigned long vm_flags  = VM_FLUSH_RESET_PERMS;
 	bool fallback  = !!fallback_start;
@@ -60,14 +60,18 @@ static inline bool execmem_range_is_data(enum execmem_type type)
 
 void *execmem_text_alloc(enum execmem_type type, size_t size)
 {
-	return execmem_alloc(size, &execmem_params.ranges[type]);
+	struct execmem_range *range = &execmem_params.ranges[execmem_params.areas[type]];
+
+	return execmem_alloc(size, range, range->pgprot);
 }
 
 void *execmem_data_alloc(enum execmem_type type, size_t size)
 {
+	struct execmem_range *range = &execmem_params.ranges[execmem_params.areas[type]];
+
 	WARN_ON_ONCE(!execmem_range_is_data(type));
 
-	return execmem_alloc(size, &execmem_params.ranges[type]);
+	return execmem_alloc(size, range, PAGE_KERNEL);
 }
 
 void execmem_free(void *ptr)
@@ -80,9 +84,13 @@ void execmem_free(void *ptr)
 	vfree(ptr);
 }
 
-struct execmem_params * __weak execmem_arch_params(void)
+void __weak execmem_arch_params(struct execmem_params *p)
 {
-	return NULL;
+	p->ranges = &default_range;
+	p->ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	p->ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+	p->ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL_EXEC;
+	p->ranges[EXECMEM_DEFAULT].alignment = 1;
 }
 
 static bool execmem_validate_params(struct execmem_params *p)
@@ -97,46 +105,9 @@ static bool execmem_validate_params(struct execmem_params *p)
 	return true;
 }
 
-static void execmem_init_missing(struct execmem_params *p)
-{
-	struct execmem_range *default_range = &p->ranges[EXECMEM_DEFAULT];
-
-	for (int i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
-		struct execmem_range *r = &p->ranges[i];
-
-		if (!r->start) {
-			if (execmem_range_is_data(i))
-				r->pgprot = PAGE_KERNEL;
-			else
-				r->pgprot = default_range->pgprot;
-			r->alignment = default_range->alignment;
-			r->start = default_range->start;
-			r->end = default_range->end;
-			r->flags = default_range->flags;
-			r->fallback_start = default_range->fallback_start;
-			r->fallback_end = default_range->fallback_end;
-		}
-	}
-}
-
 void __init execmem_init(void)
 {
-	struct execmem_params *p = execmem_arch_params();
+	execmem_arch_params(&execmem_params);
 
-	if (!p) {
-		p = &execmem_params;
-		p->ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
-		p->ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
-		p->ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL_EXEC;
-		p->ranges[EXECMEM_DEFAULT].alignment = 1;
-
-		return;
-	}
-
-	if (!execmem_validate_params(p))
-		return;
-
-	execmem_init_missing(p);
-
-	execmem_params = *p;
+	execmem_validate_params(&execmem_params);
 }


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
  2023-09-18  7:29 ` [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem Mike Rapoport
  2023-10-04  0:29   ` Edgecombe, Rick P
@ 2023-10-05 18:11   ` Edgecombe, Rick P
  1 sibling, 0 replies; 49+ messages in thread
From: Edgecombe, Rick P @ 2023-10-05 18:11 UTC (permalink / raw)
  To: linux-kernel, rppt
  Cc: mark.rutland, x86, catalin.marinas, song, sparclinux,
	linux-riscv, nadav.amit, linux-s390, deller, chenhuacai, linux,
	naveen.n.rao, linux-trace-kernel, will, hca, rostedt, loongarch,
	bjorn, tglx, akpm, linux-arm-kernel@lists.infradead.org,
	tsbogend, puranjay12, linux-parisc, linux-mm, netdev,
	kent.overstreet, linux-mips, dinguyen, mcgrof, palmer, bpf,
	linuxppc-dev, davem, linux-modules

On Mon, 2023-09-18 at 10:29 +0300, Mike Rapoport wrote:
> +/**
> + * struct execmem_range - definition of a memory range suitable for code and
> + *                       related data allocations
> + * @start:     address space start
> + * @end:       address space end (inclusive)
> + * @pgprot:    permissions for memory in this address space
> + * @alignment: alignment required for text allocations
> + */
> +struct execmem_range {
> +       unsigned long   start;
> +       unsigned long   end;
> +       pgprot_t        pgprot;
> +       unsigned int    alignment;
> +};

Not a strong opinion, but range doesn't seem an appropriate name. It
*has* a range, but also other allocation configuration. It gets
especially confusing when multiple "ranges" have the same range. Maybe
execmem_alloc_params?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  2023-09-18  7:29 ` [PATCH v3 04/13] mm/execmem, arch: convert remaining " Mike Rapoport
  2023-10-04  0:29   ` Edgecombe, Rick P
@ 2023-10-23 17:14   ` Will Deacon
  2023-10-26  8:58     ` Mike Rapoport
  1 sibling, 1 reply; 49+ messages in thread
From: Will Deacon @ 2023-10-23 17:14 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, linux-mm, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn  Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

Hi Mike,

On Mon, Sep 18, 2023 at 10:29:46AM +0300, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> 
> Extend execmem parameters to accommodate more complex overrides of
> module_alloc() by architectures.
> 
> This includes specification of a fallback range required by arm, arm64
> and powerpc and support for allocation of KASAN shadow required by
> arm64, s390 and x86.
> 
> The core implementation of execmem_alloc() takes care of suppressing
> warnings when the initial allocation fails but there is a fallback range
> defined.
> 
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> ---
>  arch/arm/kernel/module.c     | 38 ++++++++++++---------
>  arch/arm64/kernel/module.c   | 57 ++++++++++++++------------------
>  arch/powerpc/kernel/module.c | 52 ++++++++++++++---------------
>  arch/s390/kernel/module.c    | 52 +++++++++++------------------
>  arch/x86/kernel/module.c     | 64 +++++++++++-------------------------
>  include/linux/execmem.h      | 14 ++++++++
>  mm/execmem.c                 | 43 ++++++++++++++++++++++--
>  7 files changed, 167 insertions(+), 153 deletions(-)

[...]

> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index dd851297596e..cd6320de1c54 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -20,6 +20,7 @@
>  #include <linux/random.h>
>  #include <linux/scs.h>
>  #include <linux/vmalloc.h>
> +#include <linux/execmem.h>
>  
>  #include <asm/alternative.h>
>  #include <asm/insn.h>
> @@ -108,46 +109,38 @@ static int __init module_init_limits(void)
>  
>  	return 0;
>  }
> -subsys_initcall(module_init_limits);
>  
> -void *module_alloc(unsigned long size)
> +static struct execmem_params execmem_params __ro_after_init = {
> +	.ranges = {
> +		[EXECMEM_DEFAULT] = {
> +			.flags = EXECMEM_KASAN_SHADOW,
> +			.alignment = MODULE_ALIGN,
> +		},
> +	},
> +};
> +
> +struct execmem_params __init *execmem_arch_params(void)
>  {
> -	void *p = NULL;
> +	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
>  
> -	/*
> -	 * Where possible, prefer to allocate within direct branch range of the
> -	 * kernel such that no PLTs are necessary.
> -	 */

Why are you removing this comment? I think you could just move it next
to the part where we set a 128MiB range.

> -	if (module_direct_base) {
> -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> -					 module_direct_base,
> -					 module_direct_base + SZ_128M,
> -					 GFP_KERNEL | __GFP_NOWARN,
> -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> -					 __builtin_return_address(0));
> -	}
> +	module_init_limits();

Hmm, this used to be run from subsys_initcall(), but now you're running
it _really_ early, before random_init(), so randomization of the module
space is no longer going to be very random if we don't have early entropy
from the firmware or the CPU, which is likely to be the case on most SoCs.

>  
> -	if (!p && module_plt_base) {
> -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> -					 module_plt_base,
> -					 module_plt_base + SZ_2G,
> -					 GFP_KERNEL | __GFP_NOWARN,
> -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> -					 __builtin_return_address(0));
> -	}
> +	r->pgprot = PAGE_KERNEL;
>  
> -	if (!p) {
> -		pr_warn_ratelimited("%s: unable to allocate memory\n",
> -				    __func__);
> -	}
> +	if (module_direct_base) {
> +		r->start = module_direct_base;
> +		r->end = module_direct_base + SZ_128M;
>  
> -	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
> -		vfree(p);
> -		return NULL;
> +		if (module_plt_base) {
> +			r->fallback_start = module_plt_base;
> +			r->fallback_end = module_plt_base + SZ_2G;
> +		}
> +	} else if (module_plt_base) {
> +		r->start = module_plt_base;
> +		r->end = module_plt_base + SZ_2G;
>  	}
>  
> -	/* Memory is intended to be executable, reset the pointer tag. */
> -	return kasan_reset_tag(p);
> +	return &execmem_params;
>  }
>  
>  enum aarch64_reloc_op {

[...]

> diff --git a/include/linux/execmem.h b/include/linux/execmem.h
> index 44e213625053..806ad1a0088d 100644
> --- a/include/linux/execmem.h
> +++ b/include/linux/execmem.h
> @@ -32,19 +32,33 @@ enum execmem_type {
>  	EXECMEM_TYPE_MAX,
>  };
>  
> +/**
> + * enum execmem_module_flags - options for executable memory allocations
> + * @EXECMEM_KASAN_SHADOW:	allocate kasan shadow
> + */
> +enum execmem_range_flags {
> +	EXECMEM_KASAN_SHADOW	= (1 << 0),
> +};
> +
>  /**
>   * struct execmem_range - definition of a memory range suitable for code and
>   *			  related data allocations
>   * @start:	address space start
>   * @end:	address space end (inclusive)
> + * @fallback_start:	start of the range for fallback allocations
> + * @fallback_end:	end of the range for fallback allocations (inclusive)
>   * @pgprot:	permissions for memory in this address space
>   * @alignment:	alignment required for text allocations
> + * @flags:	options for memory allocations for this range
>   */
>  struct execmem_range {
>  	unsigned long   start;
>  	unsigned long   end;
> +	unsigned long   fallback_start;
> +	unsigned long   fallback_end;
>  	pgprot_t        pgprot;
>  	unsigned int	alignment;
> +	enum execmem_range_flags flags;
>  };
>  
>  /**
> diff --git a/mm/execmem.c b/mm/execmem.c
> index f25a5e064886..a8c2f44d0133 100644
> --- a/mm/execmem.c
> +++ b/mm/execmem.c
> @@ -11,12 +11,46 @@ static void *execmem_alloc(size_t size, struct execmem_range *range)
>  {
>  	unsigned long start = range->start;
>  	unsigned long end = range->end;
> +	unsigned long fallback_start = range->fallback_start;
> +	unsigned long fallback_end = range->fallback_end;
>  	unsigned int align = range->alignment;
>  	pgprot_t pgprot = range->pgprot;
> +	bool kasan = range->flags & EXECMEM_KASAN_SHADOW;
> +	unsigned long vm_flags  = VM_FLUSH_RESET_PERMS;
> +	bool fallback  = !!fallback_start;
> +	gfp_t gfp_flags = GFP_KERNEL;
> +	void *p;
>  
> -	return __vmalloc_node_range(size, align, start, end,
> -				   GFP_KERNEL, pgprot, VM_FLUSH_RESET_PERMS,
> -				   NUMA_NO_NODE, __builtin_return_address(0));
> +	if (PAGE_ALIGN(size) > (end - start))
> +		return NULL;
> +
> +	if (kasan)
> +		vm_flags |= VM_DEFER_KMEMLEAK;

Hmm, I don't think we passed this before on arm64, should we have done?

Will
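
A minimal user-space sketch of the fallback behaviour the commit message
describes: try the primary range quietly when a fallback range exists, then
retry in the fallback range. The names `range_sketch`, `alloc_fn`,
`toy_alloc` and `alloc_with_fallback` are hypothetical stand-ins, not the
code from mm/execmem.c, and the `nowarn` argument only models what
__GFP_NOWARN requests in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct execmem_range; only the fields used here. */
struct range_sketch {
	unsigned long start;
	unsigned long end;
	unsigned long fallback_start;
	unsigned long fallback_end;
};

/*
 * Hypothetical allocator callback: returns an address inside [start, end)
 * or 0 on failure. 'nowarn' asks the allocator not to report the failure.
 */
typedef unsigned long (*alloc_fn)(size_t size, unsigned long start,
				  unsigned long end, bool nowarn);

/* Number of failures the toy allocator reported. */
static int toy_warnings;

/* Toy allocator: succeeds only if the range is large enough. */
static unsigned long toy_alloc(size_t size, unsigned long start,
			       unsigned long end, bool nowarn)
{
	if (end - start >= size)
		return start;
	if (!nowarn)
		toy_warnings++;
	return 0;
}

static unsigned long alloc_with_fallback(size_t size,
					 const struct range_sketch *r,
					 alloc_fn alloc)
{
	/* As in the quoted patch, a non-zero fallback_start enables fallback. */
	bool fallback = r->fallback_start != 0;
	unsigned long p;

	assert(r->start <= r->end);

	/* Stay quiet on the first attempt only if a second one will follow. */
	p = alloc(size, r->start, r->end, fallback);
	if (!p && fallback)
		p = alloc(size, r->fallback_start, r->fallback_end, false);

	return p;
}
```

With a fallback range defined, a failed first attempt is not reported and
only a failure in the fallback range would be; without one, the single
attempt reports its failure as usual.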

* Re: [PATCH v3 07/13] arm64, execmem: extend execmem_params for generated code allocations
  2023-09-18  7:29 ` [PATCH v3 07/13] arm64, execmem: extend execmem_params for generated code allocations Mike Rapoport
@ 2023-10-23 17:21   ` Will Deacon
  0 siblings, 0 replies; 49+ messages in thread
From: Will Deacon @ 2023-10-23 17:21 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, linux-mm, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn  Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Mon, Sep 18, 2023 at 10:29:49AM +0300, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> 
> The memory allocations for kprobes and BPF on arm64 can be placed
> anywhere in vmalloc address space and currently this is implemented with
> overrides of alloc_insn_page() and bpf_jit_alloc_exec() in arm64.
> 
> Define EXECMEM_KPROBES and EXECMEM_BPF ranges in arm64::execmem_params and
> drop overrides of alloc_insn_page() and bpf_jit_alloc_exec().
> 
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> ---
>  arch/arm64/kernel/module.c         | 13 +++++++++++++
>  arch/arm64/kernel/probes/kprobes.c |  7 -------
>  arch/arm64/net/bpf_jit_comp.c      | 11 -----------
>  3 files changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index cd6320de1c54..d27db168d2a2 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -116,6 +116,16 @@ static struct execmem_params execmem_params __ro_after_init = {
>  			.flags = EXECMEM_KASAN_SHADOW,
>  			.alignment = MODULE_ALIGN,
>  		},
> +		[EXECMEM_KPROBES] = {
> +			.start = VMALLOC_START,
> +			.end = VMALLOC_END,
> +			.alignment = 1,
> +		},
> +		[EXECMEM_BPF] = {
> +			.start = VMALLOC_START,
> +			.end = VMALLOC_END,
> +			.alignment = 1,
> +		},
>  	},
>  };
>  
> @@ -140,6 +150,9 @@ struct execmem_params __init *execmem_arch_params(void)
>  		r->end = module_plt_base + SZ_2G;
>  	}
>  
> +	execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
> +	execmem_params.ranges[EXECMEM_BPF].pgprot = PAGE_KERNEL;
> +
>  	return &execmem_params;
>  }
>  
> diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> index 70b91a8c6bb3..6fccedd02b2a 100644
> --- a/arch/arm64/kernel/probes/kprobes.c
> +++ b/arch/arm64/kernel/probes/kprobes.c
> @@ -129,13 +129,6 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
>  	return 0;
>  }
>  
> -void *alloc_insn_page(void)
> -{
> -	return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
> -			GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS,
> -			NUMA_NO_NODE, __builtin_return_address(0));
> -}

It's slightly curious that we didn't clear the tag here, so it's nice that
it all happens magically with your series:

Acked-by: Will Deacon <will@kernel.org>

Will

* Re: [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
  2023-10-05 18:09         ` Edgecombe, Rick P
@ 2023-10-26  8:40           ` Mike Rapoport
  0 siblings, 0 replies; 49+ messages in thread
From: Mike Rapoport @ 2023-10-26  8:40 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: mark.rutland, chenhuacai, catalin.marinas, linux-kernel, song,
	sparclinux, linux-riscv, nadav.amit, linux-s390, deller, x86,
	linux, naveen.n.rao, linux-trace-kernel, will, hca, rostedt,
	loongarch, bjorn, tglx, akpm, linux-arm-kernel, tsbogend,
	puranjay12, linux-parisc, linux-mm, netdev, kent.overstreet,
	linux-mips, dinguyen, mcgrof, palmer, bpf, linuxppc-dev, davem,
	linux-modules

Hi Rick,

Sorry for the delay, I was a bit preoccupied with $stuff.

On Thu, Oct 05, 2023 at 06:09:07PM +0000, Edgecombe, Rick P wrote:
> On Thu, 2023-10-05 at 08:26 +0300, Mike Rapoport wrote:
> > On Wed, Oct 04, 2023 at 03:39:26PM +0000, Edgecombe, Rick P wrote:
> > > On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote:
> > > > > It seems a bit weird to copy all of this. Is it trying to be
> > > > > faster or something?
> > > > > 
> > > > > Couldn't it just check r->start in execmem_text/data_alloc() path
> > > > > and switch to EXECMEM_DEFAULT if needed then? The
> > > > > execmem_range_is_data() part that comes later could be added to
> > > > > the logic there too. So this seems like unnecessary complexity to
> > > > > me or I don't see the reason.
> > > 
> > > I guess this is a bad idea because if you have the full size array
> > > sitting around anyway you might as well use it and reduce the
> > > exec_mem_alloc() logic.
> > 
> > > That was the idea, indeed. :)
> > > 
> > > > Just looking at it from the x86 side (and
> > > > similar) though, where there is actually only one execmem_range,
> > > > building this whole array with identical data seems weird.
> > > 
> > > Right, most architectures have only one range, but to support all
> > > variants that we have, execmem has to maintain the whole array.
> 
> What about just having an index into a smaller set of ranges? The
> module area and the extra JIT area. So ->ranges can be size 3
> (statically allocated in the arch code) for three areas and then the
> index array can be size EXECMEM_TYPE_MAX. The default 0 value of the
> indexing array will point to the default area and any special areas can
> be set in the index to point to the desired range.
> 
> Looking at how it would do for x86 and arm64, it looks maybe a bit
> better to me. A little bit less code and memory usage, and a bit easier
> to trace the configuration through to the final state (IMO). What do
> you think? Very rough, on top of this series, below.

I like your suggestion to only have definitions of actual ranges in arch
code and an index array to redirect allocation requests to the right range.
I'll make the next version along the lines of your patch.

> As I was playing around with this, I was also wondering why it needs
> two copies of struct execmem_params: one returned from the arch code
> and one in exec mem. 

No actual reason, one copy is enough, thanks for catching this.

> And why the temporary arch copy is ro_after_init,
> but the final execmem.c copy is not ro_after_init?

I just missed it, thanks for pointing it out.
 
-- 
Sincerely yours,
Mike.
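
A minimal sketch of the layout Rick suggests above, assuming a small
statically sized array of real ranges plus a per-type index array whose
zero default selects the first (default) range. `params_sketch`,
`range_sketch`, `SKETCH_NR_RANGES` and `range_for` are hypothetical names,
and the `execmem_type` list is abbreviated to the entries mentioned in this
thread.

```c
#include <assert.h>
#include <stddef.h>

/* Abbreviated for the sketch; the series defines more allocation types. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

/* "size 3" from the suggestion: only ranges that really differ are stored. */
#define SKETCH_NR_RANGES 3

struct range_sketch {
	unsigned long start;
	unsigned long end;
};

struct params_sketch {
	struct range_sketch ranges[SKETCH_NR_RANGES];
	/* Zero-initialized, so every type maps to ranges[0] unless overridden. */
	unsigned char index[EXECMEM_TYPE_MAX];
};

/* Redirect an allocation type to the range it should allocate from. */
static const struct range_sketch *range_for(const struct params_sketch *p,
					    enum execmem_type type)
{
	assert(type < EXECMEM_TYPE_MAX);
	assert(p->index[type] < SKETCH_NR_RANGES);
	return &p->ranges[p->index[type]];
}
```

An architecture with a single range would fill in only ranges[0] and leave
the index array untouched; one with a separate JIT area would set, for
example, index[EXECMEM_BPF] = 1.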

* Re: [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  2023-10-23 17:14   ` Will Deacon
@ 2023-10-26  8:58     ` Mike Rapoport
  2023-10-26 10:24       ` Will Deacon
  0 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2023-10-26  8:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, linux-mm, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn  Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

Hi Will,

On Mon, Oct 23, 2023 at 06:14:20PM +0100, Will Deacon wrote:
> Hi Mike,
> 
> On Mon, Sep 18, 2023 at 10:29:46AM +0300, Mike Rapoport wrote:
> > From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> > 
> > Extend execmem parameters to accommodate more complex overrides of
> > module_alloc() by architectures.
> > 
> > This includes specification of a fallback range required by arm, arm64
> > and powerpc and support for allocation of KASAN shadow required by
> > arm64, s390 and x86.
> > 
> > The core implementation of execmem_alloc() takes care of suppressing
> > warnings when the initial allocation fails but there is a fallback range
> > defined.
> > 
> > Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> > ---
> >  arch/arm/kernel/module.c     | 38 ++++++++++++---------
> >  arch/arm64/kernel/module.c   | 57 ++++++++++++++------------------
> >  arch/powerpc/kernel/module.c | 52 ++++++++++++++---------------
> >  arch/s390/kernel/module.c    | 52 +++++++++++------------------
> >  arch/x86/kernel/module.c     | 64 +++++++++++-------------------------
> >  include/linux/execmem.h      | 14 ++++++++
> >  mm/execmem.c                 | 43 ++++++++++++++++++++++--
> >  7 files changed, 167 insertions(+), 153 deletions(-)
> 
> [...]
> 
> > diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> > index dd851297596e..cd6320de1c54 100644
> > --- a/arch/arm64/kernel/module.c
> > +++ b/arch/arm64/kernel/module.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/random.h>
> >  #include <linux/scs.h>
> >  #include <linux/vmalloc.h>
> > +#include <linux/execmem.h>
> >  
> >  #include <asm/alternative.h>
> >  #include <asm/insn.h>
> > @@ -108,46 +109,38 @@ static int __init module_init_limits(void)
> >  
> >  	return 0;
> >  }
> > -subsys_initcall(module_init_limits);
> >  
> > -void *module_alloc(unsigned long size)
> > +static struct execmem_params execmem_params __ro_after_init = {
> > +	.ranges = {
> > +		[EXECMEM_DEFAULT] = {
> > +			.flags = EXECMEM_KASAN_SHADOW,
> > +			.alignment = MODULE_ALIGN,
> > +		},
> > +	},
> > +};
> > +
> > +struct execmem_params __init *execmem_arch_params(void)
> >  {
> > -	void *p = NULL;
> > +	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
> >  
> > -	/*
> > -	 * Where possible, prefer to allocate within direct branch range of the
> > -	 * kernel such that no PLTs are necessary.
> > -	 */
> 
> Why are you removing this comment? I think you could just move it next
> to the part where we set a 128MiB range.
 
Oops, my bad. Will add it back.

> > -	if (module_direct_base) {
> > -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> > -					 module_direct_base,
> > -					 module_direct_base + SZ_128M,
> > -					 GFP_KERNEL | __GFP_NOWARN,
> > -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> > -					 __builtin_return_address(0));
> > -	}
> > +	module_init_limits();
> 
> Hmm, this used to be run from subsys_initcall(), but now you're running
> it _really_ early, before random_init(), so randomization of the module
> space is no longer going to be very random if we don't have early entropy
> from the firmware or the CPU, which is likely to be the case on most SoCs.

Well, it will be as random as KASLR. Won't that be enough?
 
> > diff --git a/mm/execmem.c b/mm/execmem.c
> > index f25a5e064886..a8c2f44d0133 100644
> > --- a/mm/execmem.c
> > +++ b/mm/execmem.c
> > @@ -11,12 +11,46 @@ static void *execmem_alloc(size_t size, struct execmem_range *range)
> >  {
> >  	unsigned long start = range->start;
> >  	unsigned long end = range->end;
> > +	unsigned long fallback_start = range->fallback_start;
> > +	unsigned long fallback_end = range->fallback_end;
> >  	unsigned int align = range->alignment;
> >  	pgprot_t pgprot = range->pgprot;
> > +	bool kasan = range->flags & EXECMEM_KASAN_SHADOW;
> > +	unsigned long vm_flags  = VM_FLUSH_RESET_PERMS;
> > +	bool fallback  = !!fallback_start;
> > +	gfp_t gfp_flags = GFP_KERNEL;
> > +	void *p;
> >  
> > -	return __vmalloc_node_range(size, align, start, end,
> > -				   GFP_KERNEL, pgprot, VM_FLUSH_RESET_PERMS,
> > -				   NUMA_NO_NODE, __builtin_return_address(0));
> > +	if (PAGE_ALIGN(size) > (end - start))
> > +		return NULL;
> > +
> > +	if (kasan)
> > +		vm_flags |= VM_DEFER_KMEMLEAK;
> 
> Hmm, I don't think we passed this before on arm64, should we have done?

It was there on arm64 before commit 8339f7d8e178 ("arm64: module: remove
old !KASAN_VMALLOC logic").
There's no need to pass VM_DEFER_KMEMLEAK when KASAN_VMALLOC is enabled and
arm64 always selects KASAN_VMALLOC with KASAN.

And for the generic case, I should have made the condition check for
KASAN_VMALLOC as well.
 
> Will

-- 
Sincerely yours,
Mike.
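
A minimal sketch of the condition Mike describes for the generic case:
request deferred kmemleak tracking only when the range asks for a KASAN
shadow and KASAN_VMALLOC is not enabled. The flag values and the
`pick_vm_flags` helper are hypothetical; in the kernel the second argument
would presumably come from a config check rather than a parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for the sketch; the kernel's real values differ. */
#define SK_VM_FLUSH_RESET_PERMS	(1UL << 0)
#define SK_VM_DEFER_KMEMLEAK	(1UL << 1)

static unsigned long pick_vm_flags(bool range_wants_kasan_shadow,
				   bool kasan_vmalloc_enabled)
{
	unsigned long vm_flags = SK_VM_FLUSH_RESET_PERMS;

	/* The deferral is only needed when the shadow is allocated separately. */
	if (range_wants_kasan_shadow && !kasan_vmalloc_enabled)
		vm_flags |= SK_VM_DEFER_KMEMLEAK;

	return vm_flags;
}

/* Small self-check covering the arm64-like case and the generic case. */
static void pick_vm_flags_selftest(void)
{
	/* KASAN with KASAN_VMALLOC (always the case on arm64): no deferral. */
	assert(!(pick_vm_flags(true, true) & SK_VM_DEFER_KMEMLEAK));
	/* KASAN without KASAN_VMALLOC: deferral requested. */
	assert(pick_vm_flags(true, false) & SK_VM_DEFER_KMEMLEAK);
}
```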

* Re: [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  2023-10-26  8:58     ` Mike Rapoport
@ 2023-10-26 10:24       ` Will Deacon
  2023-10-30  7:00         ` Mike Rapoport
  0 siblings, 1 reply; 49+ messages in thread
From: Will Deacon @ 2023-10-26 10:24 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, linux-mm, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn  Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Thu, Oct 26, 2023 at 11:58:00AM +0300, Mike Rapoport wrote:
> On Mon, Oct 23, 2023 at 06:14:20PM +0100, Will Deacon wrote:
> > On Mon, Sep 18, 2023 at 10:29:46AM +0300, Mike Rapoport wrote:
> > > diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> > > index dd851297596e..cd6320de1c54 100644
> > > --- a/arch/arm64/kernel/module.c
> > > +++ b/arch/arm64/kernel/module.c
> > > @@ -20,6 +20,7 @@
> > >  #include <linux/random.h>
> > >  #include <linux/scs.h>
> > >  #include <linux/vmalloc.h>
> > > +#include <linux/execmem.h>
> > >  
> > >  #include <asm/alternative.h>
> > >  #include <asm/insn.h>
> > > @@ -108,46 +109,38 @@ static int __init module_init_limits(void)
> > >  
> > >  	return 0;
> > >  }
> > > -subsys_initcall(module_init_limits);
> > >  
> > > -void *module_alloc(unsigned long size)
> > > +static struct execmem_params execmem_params __ro_after_init = {
> > > +	.ranges = {
> > > +		[EXECMEM_DEFAULT] = {
> > > +			.flags = EXECMEM_KASAN_SHADOW,
> > > +			.alignment = MODULE_ALIGN,
> > > +		},
> > > +	},
> > > +};
> > > +
> > > +struct execmem_params __init *execmem_arch_params(void)
> > >  {
> > > -	void *p = NULL;
> > > +	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
> > >  
> > > -	/*
> > > -	 * Where possible, prefer to allocate within direct branch range of the
> > > -	 * kernel such that no PLTs are necessary.
> > > -	 */
> > 
> > Why are you removing this comment? I think you could just move it next
> > to the part where we set a 128MiB range.
>  
> Oops, my bad. Will add it back.

Thanks.

> > > -	if (module_direct_base) {
> > > -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> > > -					 module_direct_base,
> > > -					 module_direct_base + SZ_128M,
> > > -					 GFP_KERNEL | __GFP_NOWARN,
> > > -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> > > -					 __builtin_return_address(0));
> > > -	}
> > > +	module_init_limits();
> > 
> > Hmm, this used to be run from subsys_initcall(), but now you're running
> > it _really_ early, before random_init(), so randomization of the module
> > space is no longer going to be very random if we don't have early entropy
> > from the firmware or the CPU, which is likely to be the case on most SoCs.
> 
> Well, it will be as random as KASLR. Won't that be enough?

I don't think that's true -- we have the 'kaslr-seed' property for KASLR,
but I'm not seeing anything like that for the module randomisation and I
also don't see why we need to set these limits so early.

Will

* Re: [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  2023-10-26 10:24       ` Will Deacon
@ 2023-10-30  7:00         ` Mike Rapoport
  2023-11-07 10:44           ` Will Deacon
  0 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2023-10-30  7:00 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, linux-mm, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn  Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Thu, Oct 26, 2023 at 11:24:39AM +0100, Will Deacon wrote:
> On Thu, Oct 26, 2023 at 11:58:00AM +0300, Mike Rapoport wrote:
> > On Mon, Oct 23, 2023 at 06:14:20PM +0100, Will Deacon wrote:
> > > On Mon, Sep 18, 2023 at 10:29:46AM +0300, Mike Rapoport wrote:
> > > > diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> > > > index dd851297596e..cd6320de1c54 100644
> > > > --- a/arch/arm64/kernel/module.c
> > > > +++ b/arch/arm64/kernel/module.c

...

> > > > -	if (module_direct_base) {
> > > > -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> > > > -					 module_direct_base,
> > > > -					 module_direct_base + SZ_128M,
> > > > -					 GFP_KERNEL | __GFP_NOWARN,
> > > > -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> > > > -					 __builtin_return_address(0));
> > > > -	}
> > > > +	module_init_limits();
> > > 
> > > Hmm, this used to be run from subsys_initcall(), but now you're running
> > > it _really_ early, before random_init(), so randomization of the module
> > > space is no longer going to be very random if we don't have early entropy
> > > from the firmware or the CPU, which is likely to be the case on most SoCs.
> > 
> > Well, it will be as random as KASLR. Won't that be enough?
> 
> I don't think that's true -- we have the 'kaslr-seed' property for KASLR,
> but I'm not seeing anything like that for the module randomisation and I
> also don't see why we need to set these limits so early.

x86 needs execmem initialized before ftrace_init() so I thought it would be
best to set up execmem along with most of MM in mm_core_init().

I'll move execmem initialization for !x86 to a later point, say
core_initcall.
 
> Will

-- 
Sincerely yours,
Mike.

* Re: [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  2023-10-30  7:00         ` Mike Rapoport
@ 2023-11-07 10:44           ` Will Deacon
  0 siblings, 0 replies; 49+ messages in thread
From: Will Deacon @ 2023-11-07 10:44 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Mark Rutland, x86, Catalin Marinas, linux-mips, Song Liu,
	Luis Chamberlain, sparclinux, linux-riscv, Nadav Amit,
	linux-s390, Helge Deller, Huacai Chen, Russell King,
	Naveen N. Rao, linux-trace-kernel, Heiko Carstens,
	Steven Rostedt, loongarch, Thomas Gleixner, bpf,
	linux-arm-kernel, Thomas Bogendoerfer, linux-parisc,
	Puranjay Mohan, linux-mm, netdev, Kent Overstreet, linux-kernel,
	Dinh Nguyen, Björn  Töpel, Palmer Dabbelt,
	Andrew Morton, Rick Edgecombe, linuxppc-dev, David S. Miller,
	linux-modules

On Mon, Oct 30, 2023 at 09:00:53AM +0200, Mike Rapoport wrote:
> On Thu, Oct 26, 2023 at 11:24:39AM +0100, Will Deacon wrote:
> > On Thu, Oct 26, 2023 at 11:58:00AM +0300, Mike Rapoport wrote:
> > > On Mon, Oct 23, 2023 at 06:14:20PM +0100, Will Deacon wrote:
> > > > On Mon, Sep 18, 2023 at 10:29:46AM +0300, Mike Rapoport wrote:
> > > > > diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> > > > > index dd851297596e..cd6320de1c54 100644
> > > > > --- a/arch/arm64/kernel/module.c
> > > > > +++ b/arch/arm64/kernel/module.c
> 
> ...
> 
> > > > > -	if (module_direct_base) {
> > > > > -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> > > > > -					 module_direct_base,
> > > > > -					 module_direct_base + SZ_128M,
> > > > > -					 GFP_KERNEL | __GFP_NOWARN,
> > > > > -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> > > > > -					 __builtin_return_address(0));
> > > > > -	}
> > > > > +	module_init_limits();
> > > > 
> > > > Hmm, this used to be run from subsys_initcall(), but now you're running
> > > > it _really_ early, before random_init(), so randomization of the module
> > > > space is no longer going to be very random if we don't have early entropy
> > > > from the firmware or the CPU, which is likely to be the case on most SoCs.
> > > 
> > > Well, it will be as random as KASLR. Won't that be enough?
> > 
> > I don't think that's true -- we have the 'kaslr-seed' property for KASLR,
> > but I'm not seeing anything like that for the module randomisation and I
> > also don't see why we need to set these limits so early.
> 
> x86 needs execmem initialized before ftrace_init() so I thought it would be
> best to set up execmem along with most of MM in mm_core_init().
> 
> I'll move execmem initialization for !x86 to a later point, say
> core_initcall.

Thanks, Mike.

Will

Thread overview: 49+ messages
2023-09-18  7:29 [PATCH v3 00/13] mm: jit/text allocator Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 01/13] nios2: define virtual address space for modules Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 02/13] mm: introduce execmem_text_alloc() and execmem_free() Mike Rapoport
2023-09-21 22:10   ` Song Liu
2023-09-23 15:42     ` Mike Rapoport
2023-09-21 22:14   ` Song Liu
2023-09-23 15:40     ` Mike Rapoport
2023-09-21 22:34   ` Song Liu
2023-09-23 15:38     ` Mike Rapoport
2023-09-23 22:36       ` Song Liu
2023-09-26  8:04         ` Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem Mike Rapoport
2023-10-04  0:29   ` Edgecombe, Rick P
2023-10-04 15:39     ` Edgecombe, Rick P
2023-10-05  5:26       ` Mike Rapoport
2023-10-05 18:09         ` Edgecombe, Rick P
2023-10-26  8:40           ` Mike Rapoport
2023-10-05 18:11   ` Edgecombe, Rick P
2023-09-18  7:29 ` [PATCH v3 04/13] mm/execmem, arch: convert remaining " Mike Rapoport
2023-10-04  0:29   ` Edgecombe, Rick P
2023-10-05  5:28     ` Mike Rapoport
2023-10-23 17:14   ` Will Deacon
2023-10-26  8:58     ` Mike Rapoport
2023-10-26 10:24       ` Will Deacon
2023-10-30  7:00         ` Mike Rapoport
2023-11-07 10:44           ` Will Deacon
2023-09-18  7:29 ` [PATCH v3 05/13] modules, execmem: drop module_alloc Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 06/13] mm/execmem: introduce execmem_data_alloc() Mike Rapoport
2023-09-21 22:52   ` Song Liu
2023-09-22  7:16     ` Christophe Leroy
2023-09-22  8:55       ` Song Liu
2023-09-22 10:13         ` Christophe Leroy
2023-09-23 16:20     ` Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 07/13] arm64, execmem: extend execmem_params for generated code allocations Mike Rapoport
2023-10-23 17:21   ` Will Deacon
2023-09-18  7:29 ` [PATCH v3 08/13] riscv: " Mike Rapoport
2023-09-22 10:37   ` Alexandre Ghiti
2023-09-23 16:23     ` Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 09/13] powerpc: extend execmem_params for kprobes allocations Mike Rapoport
2023-09-21 22:30   ` Song Liu
2023-09-23 16:25     ` Mike Rapoport
2023-09-22 10:32   ` Christophe Leroy
2023-09-23 16:27     ` Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 10/13] arch: make execmem setup available regardless of CONFIG_MODULES Mike Rapoport
2023-09-26  7:33   ` Arnd Bergmann
2023-09-26  8:32     ` Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 11/13] x86/ftrace: enable dynamic ftrace without CONFIG_MODULES Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 12/13] kprobes: remove dependency on CONFIG_MODULES Mike Rapoport
2023-09-18  7:29 ` [PATCH v3 13/13] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of Mike Rapoport
