* [PATCH v11 0/9] powerpc: Further Strict RWX support
@ 2021-04-29  3:15 Jordan Niethe
  2021-04-29  3:15 ` [PATCH v11 1/9] powerpc/mm: Implement set_memory() routines Jordan Niethe
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:15 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

Adding more Strict RWX support on powerpc, in particular Strict Module RWX.
Thanks for all of the feedback, everyone.
It is now rebased on linux-next.

For reference, the previous revision is available here:
https://lore.kernel.org/linuxppc-dev/20210330045132.722243-1-jniethe5@gmail.com/

The changes in v11 for each patch:

Christophe Leroy (2):
  powerpc/mm: implement set_memory_attr()
  powerpc/32: use set_memory_attr()

Jordan Niethe (4):
  powerpc/lib/code-patching: Set up Strict RWX patching earlier
  powerpc: Always define MODULES_{VADDR,END}
    v11: - Consider more places MODULES_VADDR was being used
  powerpc/bpf: Remove bpf_jit_free()
    v11: - New to series
  powerpc/bpf: Write protect JIT code
    v11: - Remove CONFIG_STRICT_MODULE_RWX conditional

Russell Currey (3):
  powerpc/mm: Implement set_memory() routines
    v11: - Update copyright dates
         - Allow set memory functions to be used without Strict RW
         - Hash: Disallow certain regions and add comment explaining why
         - Have change_page_attr() take function pointers to manipulate ptes
         - Clarify change_page_attr()'s comment
         - Radix: Add ptesync after set_pte_at()
  powerpc/kprobes: Mark newly allocated probes as ROX
    v11: Neaten up
  powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
    v11: Neaten up

Some patches were dropped from this revision:
  powerpc/mm/ptdump: debugfs handler for W+X checks at runtime
    - Will use Christophe's generic ptdump series
  powerpc/configs: Enable STRICT_MODULE_RWX in skiroot_defconfig
    - Will enable STRICT_MODULE_RWX by default later


 arch/powerpc/Kconfig                  |   2 +
 arch/powerpc/include/asm/pgtable.h    |  11 ++
 arch/powerpc/include/asm/set_memory.h |  12 +++
 arch/powerpc/kernel/kprobes.c         |  11 ++
 arch/powerpc/kernel/module.c          |  14 +--
 arch/powerpc/lib/code-patching.c      |  12 +--
 arch/powerpc/mm/Makefile              |   2 +-
 arch/powerpc/mm/kasan/kasan_init_32.c |  10 +-
 arch/powerpc/mm/pageattr.c            | 138 ++++++++++++++++++++++++++
 arch/powerpc/mm/pgtable_32.c          |  60 ++---------
 arch/powerpc/mm/ptdump/ptdump.c       |   4 +-
 arch/powerpc/net/bpf_jit_comp.c       |  13 +--
 12 files changed, 204 insertions(+), 85 deletions(-)
 create mode 100644 arch/powerpc/include/asm/set_memory.h
 create mode 100644 arch/powerpc/mm/pageattr.c

-- 
2.25.1



* [PATCH v11 1/9] powerpc/mm: Implement set_memory() routines
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
@ 2021-04-29  3:15 ` Jordan Niethe
  2021-04-29  7:32   ` Christophe Leroy
  2021-04-29  3:15 ` [PATCH v11 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier Jordan Niethe
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:15 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

From: Russell Currey <ruscur@russell.cc>

The set_memory_{ro/rw/nx/x}() functions are required for
STRICT_MODULE_RWX, and are generally useful primitives to have.  This
implementation is designed to be generic across powerpc's many MMUs.
It's possible that this could be optimised to be faster for specific
MMUs.

This implementation does not handle cases where the caller is attempting
to change the mapping of the page it is executing from, or if another
CPU is concurrently using the page being altered.  These cases likely
shouldn't happen, but a more complex implementation with MMU-specific code
could safely handle them.

On hash, the linear mapping is not kept in the linux pagetable, so this
will not change the protection if used on that range. Currently these
functions are not used on the linear map so just WARN for now.
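
A minimal usage sketch (illustration only, not part of the patch; the
helper name is made up):

#include <linux/vmalloc.h>
#include <linux/set_memory.h>

/* Hypothetical example: lock one vmalloc'd page down to RO+X, then
 * restore the default protection before freeing it. */
static int example_lock_page(void)
{
	void *buf = vmalloc(PAGE_SIZE);

	if (!buf)
		return -ENOMEM;

	/* ... write code/data into buf ... */

	set_memory_ro((unsigned long)buf, 1);	/* numpages == 1 */
	set_memory_x((unsigned long)buf, 1);

	/* restore the default permissions before freeing */
	set_memory_nx((unsigned long)buf, 1);
	set_memory_rw((unsigned long)buf, 1);
	vfree(buf);

	return 0;
}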

Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[jpn: - Allow set memory functions to be used without Strict RWX
      - Hash: Disallow certain regions
      - Have change_page_attr() take function pointers to manipulate ptes
      - Radix: Add ptesync after set_pte_at()]
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v10: WARN if trying to change the hash linear map
v11: - Update copyright dates
     - Allow set memory functions to be used without Strict RWX
     - Hash: Disallow certain regions and add comment explaining why
     - Have change_page_attr() take function pointers to manipulate ptes
     - Clarify change_page_attr()'s comment
     - Radix: Add ptesync after set_pte_at()
---
 arch/powerpc/Kconfig                  |   1 +
 arch/powerpc/include/asm/set_memory.h |  10 +++
 arch/powerpc/mm/Makefile              |   2 +-
 arch/powerpc/mm/pageattr.c            | 105 ++++++++++++++++++++++++++
 4 files changed, 117 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/include/asm/set_memory.h
 create mode 100644 arch/powerpc/mm/pageattr.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index cb2d44ee4e38..94c34932a74b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -138,6 +138,7 @@ config PPC
 	select ARCH_HAS_MEMBARRIER_CALLBACKS
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
 	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
+	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_STRICT_KERNEL_RWX	if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
 	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UACCESS_FLUSHCACHE
diff --git a/arch/powerpc/include/asm/set_memory.h b/arch/powerpc/include/asm/set_memory.h
new file mode 100644
index 000000000000..d1cd69b1a43a
--- /dev/null
+++ b/arch/powerpc/include/asm/set_memory.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_SET_MEMORY_H
+#define _ASM_POWERPC_SET_MEMORY_H
+
+int set_memory_ro(unsigned long addr, int numpages);
+int set_memory_rw(unsigned long addr, int numpages);
+int set_memory_nx(unsigned long addr, int numpages);
+int set_memory_x(unsigned long addr, int numpages);
+
+#endif
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index c3df3a8501d4..9142cf1fb0d5 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -5,7 +5,7 @@
 
 ccflags-$(CONFIG_PPC64)	:= $(NO_MINIMAL_TOC)
 
-obj-y				:= fault.o mem.o pgtable.o mmap.o maccess.o \
+obj-y				:= fault.o mem.o pgtable.o mmap.o maccess.o pageattr.o \
 				   init_$(BITS).o pgtable_$(BITS).o \
 				   pgtable-frag.o ioremap.o ioremap_$(BITS).o \
 				   init-common.o mmu_context.o drmem.o \
diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
new file mode 100644
index 000000000000..3b4aa72e555e
--- /dev/null
+++ b/arch/powerpc/mm/pageattr.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * MMU-generic set_memory implementation for powerpc
+ *
+ * Copyright 2019-2021, IBM Corporation.
+ */
+
+#include <linux/mm.h>
+#include <linux/set_memory.h>
+
+#include <asm/mmu.h>
+#include <asm/page.h>
+#include <asm/pgtable.h>
+
+
+/*
+ * Updates the attributes of a page in three steps:
+ *
+ * 1. invalidate the page table entry
+ * 2. flush the TLB
+ * 3. install the new entry with the updated attributes
+ *
+ * Invalidating the pte means there are situations where this will not work
+ * when in theory it should.
+ * For example:
+ * - removing write from page whilst it is being executed
+ * - setting a page read-only whilst it is being read by another CPU
+ *
+ */
+static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
+{
+	pte_t (*fn)(pte_t) = data;
+	pte_t pte;
+
+	spin_lock(&init_mm.page_table_lock);
+
+	/* invalidate the PTE so it's safe to modify */
+	pte = ptep_get_and_clear(&init_mm, addr, ptep);
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
+	/* modify the PTE bits as desired, then apply */
+	pte = fn(pte);
+
+	set_pte_at(&init_mm, addr, ptep, pte);
+
+	/* See ptesync comment in radix__set_pte_at() */
+	if (radix_enabled())
+		asm volatile("ptesync": : :"memory");
+	spin_unlock(&init_mm.page_table_lock);
+
+	return 0;
+}
+
+static int change_memory_attr(unsigned long addr, int numpages, pte_t (*fn)(pte_t))
+{
+	unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
+	unsigned long size = numpages * PAGE_SIZE;
+
+	if (!numpages)
+		return 0;
+
+#ifdef CONFIG_PPC_BOOK3S_64
+	/*
+	 * On hash, the linear mapping is not in the Linux page table so
+	 * apply_to_existing_page_range() will have no effect. If in the future
+	 * the set_memory_* functions are used on the linear map this will need
+	 * to be updated.
+	 */
+	if (!radix_enabled()) {
+		int region = get_region_id(addr);
+
+		if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
+			return -EINVAL;
+	}
+#endif
+
+	return apply_to_existing_page_range(&init_mm, start, size,
+					    change_page_attr, fn);
+}
+
+int set_memory_ro(unsigned long addr, int numpages)
+{
+	return change_memory_attr(addr, numpages, pte_wrprotect);
+}
+
+static pte_t pte_mkdirtywrite(pte_t pte)
+{
+	return pte_mkwrite(pte_mkdirty(pte));
+}
+
+int set_memory_rw(unsigned long addr, int numpages)
+{
+	return change_memory_attr(addr, numpages, pte_mkdirtywrite);
+}
+
+int set_memory_nx(unsigned long addr, int numpages)
+{
+	return change_memory_attr(addr, numpages, pte_exprotect);
+}
+
+int set_memory_x(unsigned long addr, int numpages)
+{
+	return change_memory_attr(addr, numpages, pte_mkexec);
+}
-- 
2.25.1



* [PATCH v11 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
  2021-04-29  3:15 ` [PATCH v11 1/9] powerpc/mm: Implement set_memory() routines Jordan Niethe
@ 2021-04-29  3:15 ` Jordan Niethe
  2021-04-29  4:53   ` Christophe Leroy
  2021-04-29  3:15 ` [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END} Jordan Niethe
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:15 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

setup_text_poke_area() is a late init call so it runs before
mark_rodata_ro() and after the init calls. This lets all the init code
patching simply write to their locations. In the future, kprobes is
going to allocate its instruction pages RO which means they will need
setup_text_poke_area() to have been already called for their code
patching. However, init_kprobes() (which allocates and patches some
instruction pages) is an early init call so it happens before
setup_text_poke_area().

start_kernel() calls poking_init() before any of the init calls. On
powerpc, poking_init() is currently a nop. setup_text_poke_area() relies
on kernel virtual memory, cpu hotplug and per_cpu_areas being setup.
setup_per_cpu_areas(), boot_cpu_hotplug_init() and mm_init() are called
before poking_init().

Turn setup_text_poke_area() into poking_init().

Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v9: New to series
---
 arch/powerpc/lib/code-patching.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 870b30d9be2f..15296207e1ba 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -70,14 +70,11 @@ static int text_area_cpu_down(unsigned int cpu)
 }
 
 /*
- * Run as a late init call. This allows all the boot time patching to be done
- * simply by patching the code, and then we're called here prior to
- * mark_rodata_ro(), which happens after all init calls are run. Although
- * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
- * it as being preferable to a kernel that will crash later when someone tries
- * to use patch_instruction().
+ * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
+ * we judge it as being preferable to a kernel that will crash later when
+ * someone tries to use patch_instruction().
  */
-static int __init setup_text_poke_area(void)
+int __init poking_init(void)
 {
 	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 		"powerpc/text_poke:online", text_area_cpu_up,
@@ -85,7 +82,6 @@ static int __init setup_text_poke_area(void)
 
 	return 0;
 }
-late_initcall(setup_text_poke_area);
 
 /*
  * This can be called for kernel text or a module.
-- 
2.25.1



* [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
  2021-04-29  3:15 ` [PATCH v11 1/9] powerpc/mm: Implement set_memory() routines Jordan Niethe
  2021-04-29  3:15 ` [PATCH v11 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier Jordan Niethe
@ 2021-04-29  3:15 ` Jordan Niethe
  2021-04-29  5:04   ` Christophe Leroy
  2021-04-29  3:15 ` [PATCH v11 4/9] powerpc/kprobes: Mark newly allocated probes as ROX Jordan Niethe
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:15 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

If MODULES_{VADDR,END} are not defined, set them to VMALLOC_START and
VMALLOC_END respectively. This reduces the need for special cases. For
example, powerpc's module_alloc() was previously only defined when
MODULES_VADDR was defined, but is now defined unconditionally.

This will be useful for reducing conditional code in other places that
need to allocate from the module region (i.e., kprobes).
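
As an illustration (hypothetical helper, not part of this patch), code
that previously had to be guarded by #ifdef MODULES_VADDR can now
allocate from the module region unconditionally:

#include <linux/vmalloc.h>

/* Hypothetical helper, for illustration only. MODULES_VADDR/MODULES_END
 * now always exist; they simply fall back to VMALLOC_START/VMALLOC_END
 * when the MMU does not define a separate module region. */
static void *example_module_region_alloc(unsigned long size)
{
	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
				    GFP_KERNEL, PAGE_KERNEL_EXEC,
				    VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
				    __builtin_return_address(0));
}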

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v10: New to series
v11: - Consider more places MODULES_VADDR was being used
---
 arch/powerpc/include/asm/pgtable.h    | 11 +++++++++++
 arch/powerpc/kernel/module.c          |  5 +----
 arch/powerpc/mm/kasan/kasan_init_32.c | 10 +++++-----
 arch/powerpc/mm/ptdump/ptdump.c       |  4 ++--
 4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index c6a676714f04..882fda779648 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -39,6 +39,17 @@ struct mm_struct;
 #define __S110	PAGE_SHARED_X
 #define __S111	PAGE_SHARED_X
 
+#ifndef MODULES_VADDR
+#define MODULES_VADDR VMALLOC_START
+#define MODULES_END VMALLOC_END
+#endif
+
+#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
+#if TASK_SIZE > MODULES_VADDR
+#error TASK_SIZE > MODULES_VADDR
+#endif
+#endif
+
 #ifndef __ASSEMBLY__
 
 /* Keep these as a macros to avoid include dependency mess */
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index fab84024650c..c60c7457ff47 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -15,6 +15,7 @@
 #include <linux/sort.h>
 #include <asm/setup.h>
 #include <asm/sections.h>
+#include <linux/mm.h>
 
 static LIST_HEAD(module_bug_list);
 
@@ -88,7 +89,6 @@ int module_finalize(const Elf_Ehdr *hdr,
 	return 0;
 }
 
-#ifdef MODULES_VADDR
 static __always_inline void *
 __module_alloc(unsigned long size, unsigned long start, unsigned long end)
 {
@@ -102,8 +102,6 @@ void *module_alloc(unsigned long size)
 	unsigned long limit = (unsigned long)_etext - SZ_32M;
 	void *ptr = NULL;
 
-	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
-
 	/* First try within 32M limit from _etext to avoid branch trampolines */
 	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
 		ptr = __module_alloc(size, limit, MODULES_END);
@@ -113,4 +111,3 @@ void *module_alloc(unsigned long size)
 
 	return ptr;
 }
-#endif
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index cf8770b1a692..42c057366ac7 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -116,11 +116,11 @@ static void __init kasan_unmap_early_shadow_vmalloc(void)
 
 	kasan_update_early_region(k_start, k_end, __pte(0));
 
-#ifdef MODULES_VADDR
-	k_start = (unsigned long)kasan_mem_to_shadow((void *)MODULES_VADDR);
-	k_end = (unsigned long)kasan_mem_to_shadow((void *)MODULES_END);
-	kasan_update_early_region(k_start, k_end, __pte(0));
-#endif
+	if (MODULES_VADDR != VMALLOC_START && MODULES_END != VMALLOC_END) {
+		k_start = (unsigned long)kasan_mem_to_shadow((void *)MODULES_VADDR);
+		k_end = (unsigned long)kasan_mem_to_shadow((void *)MODULES_END);
+		kasan_update_early_region(k_start, k_end, __pte(0));
+	}
 }
 
 void __init kasan_mmu_init(void)
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index aca354fb670b..0431457f668f 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -73,7 +73,7 @@ struct addr_marker {
 
 static struct addr_marker address_markers[] = {
 	{ 0,	"Start of kernel VM" },
-#ifdef MODULES_VADDR
+#if defined(CONFIG_BOOK32_32) && defined(CONFIG_STRICT_KERNEL_RWX)
 	{ 0,	"modules start" },
 	{ 0,	"modules end" },
 #endif
@@ -359,7 +359,7 @@ static void populate_markers(void)
 #else
 	address_markers[i++].start_address = TASK_SIZE;
 #endif
-#ifdef MODULES_VADDR
+#if defined(CONFIG_BOOK32_32) && defined(CONFIG_STRICT_KERNEL_RWX)
 	address_markers[i++].start_address = MODULES_VADDR;
 	address_markers[i++].start_address = MODULES_END;
 #endif
-- 
2.25.1



* [PATCH v11 4/9] powerpc/kprobes: Mark newly allocated probes as ROX
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (2 preceding siblings ...)
  2021-04-29  3:15 ` [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END} Jordan Niethe
@ 2021-04-29  3:15 ` Jordan Niethe
  2021-04-29  3:15 ` [PATCH v11 5/9] powerpc/bpf: Remove bpf_jit_free() Jordan Niethe
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:15 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

From: Russell Currey <ruscur@russell.cc>

Add the arch-specific insn page allocator for powerpc. This allocates
ROX pages if STRICT_KERNEL_RWX is enabled. These pages are only written
to with patch_instruction(), which is able to write to RO pages.
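
For reference, the generic weak hook being overridden here is roughly
(kernel/kprobes.c at the time, quoted from memory):

void __weak *alloc_insn_page(void)
{
	return module_alloc(PAGE_SIZE);
}

i.e. without this patch, kprobe instruction slots land in ordinary
writable and executable module memory.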

Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[jpn: Reword commit message, switch to __vmalloc_node_range()]
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v9: - vmalloc_exec() no longer exists
    - Set the page to RW before freeing it
v10: - use __vmalloc_node_range()
v11: - Neaten up
---
 arch/powerpc/kernel/kprobes.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 01ab2163659e..5670b43e254f 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -25,6 +25,7 @@
 #include <asm/sections.h>
 #include <asm/inst.h>
 #include <linux/uaccess.h>
+#include <linux/vmalloc.h>
 
 DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
 DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
@@ -103,6 +104,16 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
 	return addr;
 }
 
+void *alloc_insn_page(void)
+{
+	pgprot_t prot = IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) ? PAGE_KERNEL_ROX :
+							       PAGE_KERNEL_EXEC;
+
+	return __vmalloc_node_range(PAGE_SIZE, 1, MODULES_VADDR, MODULES_END,
+			GFP_KERNEL, prot, VM_FLUSH_RESET_PERMS,
+			NUMA_NO_NODE, __builtin_return_address(0));
+}
+
 int arch_prepare_kprobe(struct kprobe *p)
 {
 	int ret = 0;
-- 
2.25.1



* [PATCH v11 5/9] powerpc/bpf: Remove bpf_jit_free()
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (3 preceding siblings ...)
  2021-04-29  3:15 ` [PATCH v11 4/9] powerpc/kprobes: Mark newly allocated probes as ROX Jordan Niethe
@ 2021-04-29  3:15 ` Jordan Niethe
  2021-04-29  3:15 ` [PATCH v11 6/9] powerpc/bpf: Write protect JIT code Jordan Niethe
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:15 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

Commit 74451e66d516 ("bpf: make jited programs visible in traces") added
a default bpf_jit_free() implementation. Powerpc did not use the default
bpf_jit_free() because powerpc did not set the images read-only: the
default bpf_jit_free() called bpf_jit_binary_unlock_ro(), which is why
it could not be used for powerpc.

Commit d53d2f78cead ("bpf: Use vmalloc special flag") moved keeping
track of read-only memory to vmalloc. This included removing
bpf_jit_binary_unlock_ro(). Therefore there is no reason powerpc needs
its own bpf_jit_free(). Remove it.
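
For reference, the generic implementation that powerpc now falls back to
has roughly this shape (kernel/bpf/core.c at the time, from memory, with
the kallsyms bookkeeping omitted):

void __weak bpf_jit_free(struct bpf_prog *fp)
{
	if (fp->jited) {
		struct bpf_binary_header *hdr = bpf_jit_binary_hdr(fp);

		bpf_jit_binary_free(hdr);
	}

	bpf_prog_unlock_free(fp);
}

Nothing in it touches read-only protection anymore, so it works for
powerpc as-is.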

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v11: New to series
---
 arch/powerpc/net/bpf_jit_comp.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 798ac4350a82..6c8c268e4fe8 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -257,15 +257,3 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 
 	return fp;
 }
-
-/* Overriding bpf_jit_free() as we don't set images read-only. */
-void bpf_jit_free(struct bpf_prog *fp)
-{
-	unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
-	struct bpf_binary_header *bpf_hdr = (void *)addr;
-
-	if (fp->jited)
-		bpf_jit_binary_free(bpf_hdr);
-
-	bpf_prog_unlock_free(fp);
-}
-- 
2.25.1



* [PATCH v11 6/9] powerpc/bpf: Write protect JIT code
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (4 preceding siblings ...)
  2021-04-29  3:15 ` [PATCH v11 5/9] powerpc/bpf: Remove bpf_jit_free() Jordan Niethe
@ 2021-04-29  3:15 ` Jordan Niethe
  2021-04-29  3:16 ` [PATCH v11 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX Jordan Niethe
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:15 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

Add the necessary call to bpf_jit_binary_lock_ro() to remove write and
add exec permissions to the JIT image after it has finished being
written.

Without CONFIG_STRICT_MODULE_RWX the image will be writable and
executable until the call to bpf_jit_binary_lock_ro().
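
For context, the generic bpf_jit_binary_lock_ro() (include/linux/filter.h
at the time, from memory) is built on the set_memory_* helpers added
earlier in this series, roughly:

static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr)
{
	set_vm_flush_reset_perms(hdr);
	set_memory_ro((unsigned long)hdr, hdr->pages);
	set_memory_x((unsigned long)hdr, hdr->pages);
}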

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v10: New to series
v11: Remove CONFIG_STRICT_MODULE_RWX conditional
---
 arch/powerpc/net/bpf_jit_comp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 6c8c268e4fe8..53aefee3fe70 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -237,6 +237,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	fp->jited_len = alloclen;
 
 	bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + (bpf_hdr->pages * PAGE_SIZE));
+	bpf_jit_binary_lock_ro(bpf_hdr);
 	if (!fp->is_func || extra_pass) {
 		bpf_prog_fill_jited_linfo(fp, addrs);
 out_addrs:
-- 
2.25.1



* [PATCH v11 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (5 preceding siblings ...)
  2021-04-29  3:15 ` [PATCH v11 6/9] powerpc/bpf: Write protect JIT code Jordan Niethe
@ 2021-04-29  3:16 ` Jordan Niethe
  2021-04-29  3:16 ` [PATCH v11 8/9] powerpc/mm: implement set_memory_attr() Jordan Niethe
  2021-04-29  3:16 ` [PATCH v11 9/9] powerpc/32: use set_memory_attr() Jordan Niethe
  8 siblings, 0 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:16 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

From: Russell Currey <ruscur@russell.cc>

To enable strict module RWX on powerpc, set:

    CONFIG_STRICT_MODULE_RWX=y

You should also have CONFIG_STRICT_KERNEL_RWX=y set to have any real
security benefit.

ARCH_HAS_STRICT_MODULE_RWX is set to require ARCH_HAS_STRICT_KERNEL_RWX.
This is due to a quirk in arch/Kconfig and arch/powerpc/Kconfig that
makes STRICT_MODULE_RWX *on by default* in configurations where
STRICT_KERNEL_RWX is *unavailable*.

Since this doesn't make much sense, and module RWX without kernel RWX
doesn't make much sense, having the same dependencies as kernel RWX
works around this problem.

With STRICT_MODULE_RWX, module_alloc() now allocates pages with
PAGE_KERNEL protection rather than PAGE_KERNEL_EXEC.

Book3s/32 processors with a hash MMU (i.e. the 604 core) cannot set memory
protection on a page-by-page basis, so do not enable it for them.

Signed-off-by: Russell Currey <ruscur@russell.cc>
[jpn: - predicate on !PPC_BOOK3S_604
      - make module_alloc() use PAGE_KERNEL protection]
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v10: - Predicate on !PPC_BOOK3S_604
     - Make module_alloc() use PAGE_KERNEL protection
v11: - Neaten up
---
 arch/powerpc/Kconfig         | 1 +
 arch/powerpc/kernel/module.c | 9 ++++++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 94c34932a74b..9e264309fd99 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -140,6 +140,7 @@ config PPC
 	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_STRICT_KERNEL_RWX	if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
+	select ARCH_HAS_STRICT_MODULE_RWX	if ARCH_HAS_STRICT_KERNEL_RWX && !PPC_BOOK3S_604
 	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UACCESS_FLUSHCACHE
 	select ARCH_HAS_COPY_MC			if PPC64
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index c60c7457ff47..9eabaf1fadbe 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -92,9 +92,12 @@ int module_finalize(const Elf_Ehdr *hdr,
 static __always_inline void *
 __module_alloc(unsigned long size, unsigned long start, unsigned long end)
 {
-	return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL,
-				    PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	pgprot_t prot = IS_ENABLED(CONFIG_STRICT_MODULE_RWX) ? PAGE_KERNEL :
+							       PAGE_KERNEL_EXEC;
+
+	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
+			GFP_KERNEL, prot, VM_FLUSH_RESET_PERMS,
+			NUMA_NO_NODE, __builtin_return_address(0));
 }
 
 void *module_alloc(unsigned long size)
-- 
2.25.1



* [PATCH v11 8/9] powerpc/mm: implement set_memory_attr()
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (6 preceding siblings ...)
  2021-04-29  3:16 ` [PATCH v11 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX Jordan Niethe
@ 2021-04-29  3:16 ` Jordan Niethe
  2021-04-29  3:16 ` [PATCH v11 9/9] powerpc/32: use set_memory_attr() Jordan Niethe
  8 siblings, 0 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:16 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, kbuild test robot, npiggin, aneesh.kumar, naveen.n.rao,
	Jordan Niethe, dja

From: Christophe Leroy <christophe.leroy@csgroup.eu>

In addition to the set_memory_xx() functions, which allow changing the
memory attributes of not (yet) used memory regions, implement a
set_memory_attr() function to:
- set the final memory protection after init on currently used
kernel regions.
- enable/disable kernel memory regions in the scope of DEBUG_PAGEALLOC.

Unlike the set_memory_xx() functions, which can act in three steps
because the regions are unused, this function must modify the pages
'on the fly', as the kernel is executing from them. At the moment only
PPC32 will use it, and changing page attributes on the fly is not an
issue there.
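
As a usage sketch, mirroring what patch 9 of this series does for kernel
text at the end of init:

	unsigned long numpages = PFN_UP((unsigned long)_etext) -
				 PFN_DOWN((unsigned long)_stext);

	/* change a live, currently-executing mapping in place */
	set_memory_attr((unsigned long)_stext, numpages, PAGE_KERNEL_ROX);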

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reported-by: kbuild test robot <lkp@intel.com>
[ruscur: cast "data" to unsigned long instead of int]
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/set_memory.h |  2 ++
 arch/powerpc/mm/pageattr.c            | 33 +++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/arch/powerpc/include/asm/set_memory.h b/arch/powerpc/include/asm/set_memory.h
index d1cd69b1a43a..a738e86f34bd 100644
--- a/arch/powerpc/include/asm/set_memory.h
+++ b/arch/powerpc/include/asm/set_memory.h
@@ -7,4 +7,6 @@ int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
 int set_memory_x(unsigned long addr, int numpages);
 
+int set_memory_attr(unsigned long addr, int numpages, pgprot_t prot);
+
 #endif
diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index 3b4aa72e555e..9098066bc113 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -103,3 +103,36 @@ int set_memory_x(unsigned long addr, int numpages)
 {
 	return change_memory_attr(addr, numpages, pte_mkexec);
 }
+
+/*
+ * Set the attributes of a page:
+ *
+ * This function is used by PPC32 at the end of init to set final kernel memory
+ * protection. It includes changing the mapping of the page it is executing from
+ * and data pages it is using.
+ */
+static int set_page_attr(pte_t *ptep, unsigned long addr, void *data)
+{
+	pgprot_t prot = __pgprot((unsigned long)data);
+
+	spin_lock(&init_mm.page_table_lock);
+
+	set_pte_at(&init_mm, addr, ptep, pte_modify(*ptep, prot));
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
+	spin_unlock(&init_mm.page_table_lock);
+
+	return 0;
+}
+
+int set_memory_attr(unsigned long addr, int numpages, pgprot_t prot)
+{
+	unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
+	unsigned long sz = numpages * PAGE_SIZE;
+
+	if (numpages <= 0)
+		return 0;
+
+	return apply_to_existing_page_range(&init_mm, start, sz, set_page_attr,
+					    (void *)pgprot_val(prot));
+}
-- 
2.25.1



* [PATCH v11 9/9] powerpc/32: use set_memory_attr()
  2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (7 preceding siblings ...)
  2021-04-29  3:16 ` [PATCH v11 8/9] powerpc/mm: implement set_memory_attr() Jordan Niethe
@ 2021-04-29  3:16 ` Jordan Niethe
  8 siblings, 0 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-04-29  3:16 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

From: Christophe Leroy <christophe.leroy@csgroup.eu>

Use set_memory_attr() instead of the PPC32-specific change_page_attr().

change_page_attr() was checking that the address was not mapped by
blocks and was handling highmem, but that's unneeded because the
affected pages can't be in highmem and block mapping verification
is already done by the callers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[ruscur: rebase on powerpc/merge with Christophe's new patches]
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/mm/pgtable_32.c | 60 ++++++------------------------------
 1 file changed, 10 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index e0ec67a16887..dcf5ecca19d9 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -23,6 +23,7 @@
 #include <linux/highmem.h>
 #include <linux/memblock.h>
 #include <linux/slab.h>
+#include <linux/set_memory.h>
 
 #include <asm/pgalloc.h>
 #include <asm/fixmap.h>
@@ -132,64 +133,20 @@ void __init mapin_ram(void)
 	}
 }
 
-static int __change_page_attr_noflush(struct page *page, pgprot_t prot)
-{
-	pte_t *kpte;
-	unsigned long address;
-
-	BUG_ON(PageHighMem(page));
-	address = (unsigned long)page_address(page);
-
-	if (v_block_mapped(address))
-		return 0;
-	kpte = virt_to_kpte(address);
-	if (!kpte)
-		return -EINVAL;
-	__set_pte_at(&init_mm, address, kpte, mk_pte(page, prot), 0);
-
-	return 0;
-}
-
-/*
- * Change the page attributes of an page in the linear mapping.
- *
- * THIS DOES NOTHING WITH BAT MAPPINGS, DEBUG USE ONLY
- */
-static int change_page_attr(struct page *page, int numpages, pgprot_t prot)
-{
-	int i, err = 0;
-	unsigned long flags;
-	struct page *start = page;
-
-	local_irq_save(flags);
-	for (i = 0; i < numpages; i++, page++) {
-		err = __change_page_attr_noflush(page, prot);
-		if (err)
-			break;
-	}
-	wmb();
-	local_irq_restore(flags);
-	flush_tlb_kernel_range((unsigned long)page_address(start),
-			       (unsigned long)page_address(page));
-	return err;
-}
-
 void mark_initmem_nx(void)
 {
-	struct page *page = virt_to_page(_sinittext);
 	unsigned long numpages = PFN_UP((unsigned long)_einittext) -
 				 PFN_DOWN((unsigned long)_sinittext);
 
 	if (v_block_mapped((unsigned long)_sinittext))
 		mmu_mark_initmem_nx();
 	else
-		change_page_attr(page, numpages, PAGE_KERNEL);
+		set_memory_attr((unsigned long)_sinittext, numpages, PAGE_KERNEL);
 }
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
 void mark_rodata_ro(void)
 {
-	struct page *page;
 	unsigned long numpages;
 
 	if (v_block_mapped((unsigned long)_stext + 1)) {
@@ -198,20 +155,18 @@ void mark_rodata_ro(void)
 		return;
 	}
 
-	page = virt_to_page(_stext);
 	numpages = PFN_UP((unsigned long)_etext) -
 		   PFN_DOWN((unsigned long)_stext);
 
-	change_page_attr(page, numpages, PAGE_KERNEL_ROX);
+	set_memory_attr((unsigned long)_stext, numpages, PAGE_KERNEL_ROX);
 	/*
 	 * mark .rodata as read only. Use __init_begin rather than __end_rodata
 	 * to cover NOTES and EXCEPTION_TABLE.
 	 */
-	page = virt_to_page(__start_rodata);
 	numpages = PFN_UP((unsigned long)__init_begin) -
 		   PFN_DOWN((unsigned long)__start_rodata);
 
-	change_page_attr(page, numpages, PAGE_KERNEL_RO);
+	set_memory_attr((unsigned long)__start_rodata, numpages, PAGE_KERNEL_RO);
 
 	// mark_initmem_nx() should have already run by now
 	ptdump_check_wx();
@@ -221,9 +176,14 @@ void mark_rodata_ro(void)
 #ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
+	unsigned long addr = (unsigned long)page_address(page);
+
 	if (PageHighMem(page))
 		return;
 
-	change_page_attr(page, numpages, enable ? PAGE_KERNEL : __pgprot(0));
+	if (enable)
+		set_memory_attr(addr, numpages, PAGE_KERNEL);
+	else
+		set_memory_attr(addr, numpages, __pgprot(0));
 }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
-- 
2.25.1



* Re: [PATCH v11 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier
  2021-04-29  3:15 ` [PATCH v11 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier Jordan Niethe
@ 2021-04-29  4:53   ` Christophe Leroy
  2021-05-05  5:22     ` Jordan Niethe
  0 siblings, 1 reply; 21+ messages in thread
From: Christophe Leroy @ 2021-04-29  4:53 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev
  Cc: ajd, npiggin, cmr, aneesh.kumar, naveen.n.rao, dja



Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
> setup_text_poke_area() is a late init call so it runs before
> mark_rodata_ro() and after the init calls. This lets all the init code
> patching simply write to their locations. In the future, kprobes is
> going to allocate its instruction pages RO which means they will need
> setup_text_poke_area() to have been already called for their code
> patching. However, init_kprobes() (which allocates and patches some
> instruction pages) is an early init call so it happens before
> setup_text_poke_area().
> 
> start_kernel() calls poking_init() before any of the init calls. On
> powerpc, poking_init() is currently a nop. setup_text_poke_area() relies
> on kernel virtual memory, cpu hotplug and per_cpu_areas being setup.
> setup_per_cpu_areas(), boot_cpu_hotplug_init() and mm_init() are called
> before poking_init().
> 
> Turn setup_text_poke_area() into poking_init().

I can't remember, maybe I already asked the question:
Have you done some performance measurement or at least some performance analysis ?

Christophe

> 
> Reviewed-by: Russell Currey <ruscur@russell.cc>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
> v9: New to series
> ---
>   arch/powerpc/lib/code-patching.c | 12 ++++--------
>   1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 870b30d9be2f..15296207e1ba 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -70,14 +70,11 @@ static int text_area_cpu_down(unsigned int cpu)
>   }
>   
>   /*
> - * Run as a late init call. This allows all the boot time patching to be done
> - * simply by patching the code, and then we're called here prior to
> - * mark_rodata_ro(), which happens after all init calls are run. Although
> - * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
> - * it as being preferable to a kernel that will crash later when someone tries
> - * to use patch_instruction().
> + * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
> + * we judge it as being preferable to a kernel that will crash later when
> + * someone tries to use patch_instruction().
>    */
> -static int __init setup_text_poke_area(void)
> +int __init poking_init(void)
>   {
>   	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>   		"powerpc/text_poke:online", text_area_cpu_up,
> @@ -85,7 +82,6 @@ static int __init setup_text_poke_area(void)
>   
>   	return 0;
>   }
> -late_initcall(setup_text_poke_area);
>   
>   /*
>    * This can be called for kernel text or a module.
> 


* Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}
  2021-04-29  3:15 ` [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END} Jordan Niethe
@ 2021-04-29  5:04   ` Christophe Leroy
  2021-05-03  5:39     ` Jordan Niethe
  0 siblings, 1 reply; 21+ messages in thread
From: Christophe Leroy @ 2021-04-29  5:04 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev
  Cc: ajd, npiggin, cmr, aneesh.kumar, naveen.n.rao, dja



Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
> If MODULES_{VADDR,END} are not defined, set them to VMALLOC_START and
> VMALLOC_END respectively. This reduces the need for special cases. For
> example, powerpc's module_alloc() was previously only defined when
> MODULES_VADDR was defined, but is now defined unconditionally.
> 
> This will be useful for reducing conditional code in other places that
> need to allocate from the module region (i.e., kprobes).
> 
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
> v10: New to series
> v11: - Consider more places MODULES_VADDR was being used
> ---
>   arch/powerpc/include/asm/pgtable.h    | 11 +++++++++++
>   arch/powerpc/kernel/module.c          |  5 +----
>   arch/powerpc/mm/kasan/kasan_init_32.c | 10 +++++-----
>   arch/powerpc/mm/ptdump/ptdump.c       |  4 ++--
>   4 files changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index c6a676714f04..882fda779648 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -39,6 +39,17 @@ struct mm_struct;
>   #define __S110	PAGE_SHARED_X
>   #define __S111	PAGE_SHARED_X
>   
> +#ifndef MODULES_VADDR
> +#define MODULES_VADDR VMALLOC_START
> +#define MODULES_END VMALLOC_END
> +#endif
> +
> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)

No no.

TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.

Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?

> +#if TASK_SIZE > MODULES_VADDR
> +#error TASK_SIZE > MODULES_VADDR
> +#endif
> +#endif
> +
>   #ifndef __ASSEMBLY__
>   
>   /* Keep these as a macros to avoid include dependency mess */
> diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
> index fab84024650c..c60c7457ff47 100644
> --- a/arch/powerpc/kernel/module.c
> +++ b/arch/powerpc/kernel/module.c
> @@ -15,6 +15,7 @@
>   #include <linux/sort.h>
>   #include <asm/setup.h>
>   #include <asm/sections.h>
> +#include <linux/mm.h>
>   
>   static LIST_HEAD(module_bug_list);
>   
> @@ -88,7 +89,6 @@ int module_finalize(const Elf_Ehdr *hdr,
>   	return 0;
>   }
>   
> -#ifdef MODULES_VADDR
>   static __always_inline void *
>   __module_alloc(unsigned long size, unsigned long start, unsigned long end)
>   {
> @@ -102,8 +102,6 @@ void *module_alloc(unsigned long size)
>   	unsigned long limit = (unsigned long)_etext - SZ_32M;
>   	void *ptr = NULL;
>   
> -	BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
> -
>   	/* First try within 32M limit from _etext to avoid branch trampolines */
>   	if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
>   		ptr = __module_alloc(size, limit, MODULES_END);
> @@ -113,4 +111,3 @@ void *module_alloc(unsigned long size)
>   
>   	return ptr;
>   }
> -#endif
> diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
> index cf8770b1a692..42c057366ac7 100644
> --- a/arch/powerpc/mm/kasan/kasan_init_32.c
> +++ b/arch/powerpc/mm/kasan/kasan_init_32.c
> @@ -116,11 +116,11 @@ static void __init kasan_unmap_early_shadow_vmalloc(void)
>   
>   	kasan_update_early_region(k_start, k_end, __pte(0));
>   
> -#ifdef MODULES_VADDR
> -	k_start = (unsigned long)kasan_mem_to_shadow((void *)MODULES_VADDR);
> -	k_end = (unsigned long)kasan_mem_to_shadow((void *)MODULES_END);
> -	kasan_update_early_region(k_start, k_end, __pte(0));
> -#endif
> +	if (MODULES_VADDR != VMALLOC_START && MODULES_END != VMALLOC_END) {

Shouldn't it be an || ?

As soon as either MODULES_VADDR or MODULES_END differs from the vmalloc boundaries, it needs to be 
done I think.

> +		k_start = (unsigned long)kasan_mem_to_shadow((void *)MODULES_VADDR);
> +		k_end = (unsigned long)kasan_mem_to_shadow((void *)MODULES_END);
> +		kasan_update_early_region(k_start, k_end, __pte(0));
> +	}
>   }
>   
>   void __init kasan_mmu_init(void)
> diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
> index aca354fb670b..0431457f668f 100644
> --- a/arch/powerpc/mm/ptdump/ptdump.c
> +++ b/arch/powerpc/mm/ptdump/ptdump.c
> @@ -73,7 +73,7 @@ struct addr_marker {
>   
>   static struct addr_marker address_markers[] = {
>   	{ 0,	"Start of kernel VM" },
> -#ifdef MODULES_VADDR
> +#if defined(CONFIG_BOOK32_32) && defined(CONFIG_STRICT_KERNEL_RWX)

Not valid anymore, see https://github.com/linuxppc/linux/commit/80edc68e0479 and 
https://github.com/linuxppc/linux/commit/9132a2e82adc

The best would be to be able to do something like:

#if MODULES_VADDR != VMALLOC_START

If it doesn't work, then it has to be

#if defined(CONFIG_BOOK32_32) || defined(CONFIG_PPC_8xx)

>   	{ 0,	"modules start" },
>   	{ 0,	"modules end" },
>   #endif
> @@ -359,7 +359,7 @@ static void populate_markers(void)
>   #else
>   	address_markers[i++].start_address = TASK_SIZE;
>   #endif
> -#ifdef MODULES_VADDR
> +#if defined(CONFIG_BOOK32_32) && defined(CONFIG_STRICT_KERNEL_RWX)

Same.

>   	address_markers[i++].start_address = MODULES_VADDR;
>   	address_markers[i++].start_address = MODULES_END;
>   #endif
> 


* Re: [PATCH v11 1/9] powerpc/mm: Implement set_memory() routines
  2021-04-29  3:15 ` [PATCH v11 1/9] powerpc/mm: Implement set_memory() routines Jordan Niethe
@ 2021-04-29  7:32   ` Christophe Leroy
  2021-05-03  5:02     ` Jordan Niethe
  0 siblings, 1 reply; 21+ messages in thread
From: Christophe Leroy @ 2021-04-29  7:32 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev
  Cc: ajd, npiggin, cmr, aneesh.kumar, naveen.n.rao, dja



Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
> From: Russell Currey <ruscur@russell.cc>
> 
> The set_memory_{ro/rw/nx/x}() functions are required for
> STRICT_MODULE_RWX, and are generally useful primitives to have.  This
> implementation is designed to be generic across powerpc's many MMUs.
> It's possible that this could be optimised to be faster for specific
> MMUs.
> 
> This implementation does not handle cases where the caller is attempting
> to change the mapping of the page it is executing from, or if another
> CPU is concurrently using the page being altered.  These cases likely
> shouldn't happen, but a more complex implementation with MMU-specific code
> could safely handle them.
> 
> On hash, the linear mapping is not kept in the linux pagetable, so this
> will not change the protection if used on that range. Currently these
> functions are not used on the linear map so just WARN for now.
> 
> Reviewed-by: Daniel Axtens <dja@axtens.net>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> [jpn: - Allow set memory functions to be used without Strict RWX
>        - Hash: Disallow certain regions
>        - Have change_page_attr() take function pointers to manipulate ptes

Did you look at the resulting generated code ? I find it awful.

pte manipulation helpers are meant to be inlined. Here you force the compiler to outline them. This 
also means that the input and output go through memory.

And now set_memory_xx are not tiny inlined functions anymore.

What is the reason you abandoned the way it was done up to now, through the use of an 'action' 
value ? With the previous approach the generated code was a lot lighter.
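
The 'action' style looks roughly like this (locking and TLB flush elided,
names approximate):

#define SET_MEMORY_RO	0
#define SET_MEMORY_RW	1
#define SET_MEMORY_NX	2
#define SET_MEMORY_X	3

static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
{
	long action = (long)data;
	pte_t pte = ptep_get_and_clear(&init_mm, addr, ptep);

	switch (action) {
	case SET_MEMORY_RO:
		pte = pte_wrprotect(pte);
		break;
	case SET_MEMORY_RW:
		pte = pte_mkwrite(pte_mkdirty(pte));
		break;
	case SET_MEMORY_NX:
		pte = pte_exprotect(pte);
		break;
	case SET_MEMORY_X:
		pte = pte_mkexec(pte);
		break;
	}

	set_pte_at(&init_mm, addr, ptep, pte);
	return 0;
}

The constant can be passed straight through change_memory_attr(), and the
pte helpers stay inlined inside the switch.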

>        - Radix: Add ptesync after set_pte_at()]
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
> v10: WARN if trying to change the hash linear map
> v11: - Update copyright dates
>       - Allow set memory functions to be used without Strict RWX
>       - Hash: Disallow certain regions and add comment explaining why
>       - Have change_page_attr() take function pointers to manipulate ptes
>       - Clarify change_page_attr()'s comment
>       - Radix: Add ptesync after set_pte_at()
> ---
>   arch/powerpc/Kconfig                  |   1 +
>   arch/powerpc/include/asm/set_memory.h |  10 +++
>   arch/powerpc/mm/Makefile              |   2 +-
>   arch/powerpc/mm/pageattr.c            | 105 ++++++++++++++++++++++++++
>   4 files changed, 117 insertions(+), 1 deletion(-)
>   create mode 100644 arch/powerpc/include/asm/set_memory.h
>   create mode 100644 arch/powerpc/mm/pageattr.c
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index cb2d44ee4e38..94c34932a74b 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -138,6 +138,7 @@ config PPC
>   	select ARCH_HAS_MEMBARRIER_CALLBACKS
>   	select ARCH_HAS_MEMBARRIER_SYNC_CORE
>   	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
> +	select ARCH_HAS_SET_MEMORY
>   	select ARCH_HAS_STRICT_KERNEL_RWX	if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
>   	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
>   	select ARCH_HAS_UACCESS_FLUSHCACHE
> diff --git a/arch/powerpc/include/asm/set_memory.h b/arch/powerpc/include/asm/set_memory.h
> new file mode 100644
> index 000000000000..d1cd69b1a43a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/set_memory.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_POWERPC_SET_MEMORY_H
> +#define _ASM_POWERPC_SET_MEMORY_H
> +
> +int set_memory_ro(unsigned long addr, int numpages);
> +int set_memory_rw(unsigned long addr, int numpages);
> +int set_memory_nx(unsigned long addr, int numpages);
> +int set_memory_x(unsigned long addr, int numpages);
> +
> +#endif
> diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
> index c3df3a8501d4..9142cf1fb0d5 100644
> --- a/arch/powerpc/mm/Makefile
> +++ b/arch/powerpc/mm/Makefile
> @@ -5,7 +5,7 @@
>   
>   ccflags-$(CONFIG_PPC64)	:= $(NO_MINIMAL_TOC)
>   
> -obj-y				:= fault.o mem.o pgtable.o mmap.o maccess.o \
> +obj-y				:= fault.o mem.o pgtable.o mmap.o maccess.o pageattr.o \
>   				   init_$(BITS).o pgtable_$(BITS).o \
>   				   pgtable-frag.o ioremap.o ioremap_$(BITS).o \
>   				   init-common.o mmu_context.o drmem.o \
> diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> new file mode 100644
> index 000000000000..3b4aa72e555e
> --- /dev/null
> +++ b/arch/powerpc/mm/pageattr.c
> @@ -0,0 +1,105 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * MMU-generic set_memory implementation for powerpc
> + *
> + * Copyright 2019-2021, IBM Corporation.
> + */
> +
> +#include <linux/mm.h>
> +#include <linux/set_memory.h>
> +
> +#include <asm/mmu.h>
> +#include <asm/page.h>
> +#include <asm/pgtable.h>
> +
> +
> +/*
> + * Updates the attributes of a page in three steps:
> + *
> + * 1. invalidate the page table entry
> + * 2. flush the TLB
> + * 3. install the new entry with the updated attributes
> + *
> + * Invalidating the pte means there are situations where this will not work
> + * when in theory it should.
> + * For example:
> + * - removing write from page whilst it is being executed
> + * - setting a page read-only whilst it is being read by another CPU
> + *
> + */
> +static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
> +{
> +	pte_t (*fn)(pte_t) = data;
> +	pte_t pte;
> +
> +	spin_lock(&init_mm.page_table_lock);
> +
> +	/* invalidate the PTE so it's safe to modify */
> +	pte = ptep_get_and_clear(&init_mm, addr, ptep);
> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +
> +	/* modify the PTE bits as desired, then apply */
> +	pte = fn(pte);
> +
> +	set_pte_at(&init_mm, addr, ptep, pte);
> +
> +	/* See ptesync comment in radix__set_pte_at() */
> +	if (radix_enabled())
> +		asm volatile("ptesync": : :"memory");
> +	spin_unlock(&init_mm.page_table_lock);
> +
> +	return 0;
> +}
> +
> +static int change_memory_attr(unsigned long addr, int numpages, pte_t (*fn)(pte_t))
> +{
> +	unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
> +	unsigned long size = numpages * PAGE_SIZE;
> +
> +	if (!numpages)
> +		return 0;
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	/*
> +	 * On hash, the linear mapping is not in the Linux page table so
> +	 * apply_to_existing_page_range() will have no effect. If in the future
> +	 * the set_memory_* functions are used on the linear map this will need
> +	 * to be updated.
> +	 */
> +	if (!radix_enabled()) {
> +		int region = get_region_id(addr);
> +
> +		if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> +			return -EINVAL;
> +	}
> +#endif
> +
> +	return apply_to_existing_page_range(&init_mm, start, size,
> +					    change_page_attr, fn);
> +}
> +
> +int set_memory_ro(unsigned long addr, int numpages)
> +{
> +	return change_memory_attr(addr, numpages, pte_wrprotect);
> +}
> +
> +static pte_t pte_mkdirtywrite(pte_t pte)
> +{
> +	return pte_mkwrite(pte_mkdirty(pte));
> +}
> +
> +int set_memory_rw(unsigned long addr, int numpages)
> +{
> +	return change_memory_attr(addr, numpages, pte_mkdirtywrite);
> +}
> +
> +int set_memory_nx(unsigned long addr, int numpages)
> +{
> +	return change_memory_attr(addr, numpages, pte_exprotect);
> +}
> +
> +int set_memory_x(unsigned long addr, int numpages)
> +{
> +	return change_memory_attr(addr, numpages, pte_mkexec);
> +}
> 


* Re: [PATCH v11 1/9] powerpc/mm: Implement set_memory() routines
  2021-04-29  7:32   ` Christophe Leroy
@ 2021-05-03  5:02     ` Jordan Niethe
  0 siblings, 0 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-05-03  5:02 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: ajd, Nicholas Piggin, cmr, Aneesh Kumar K.V, naveen.n.rao,
	linuxppc-dev, Daniel Axtens

On Thu, Apr 29, 2021 at 5:32 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
> > From: Russell Currey <ruscur@russell.cc>
> >
> > The set_memory_{ro/rw/nx/x}() functions are required for
> > STRICT_MODULE_RWX, and are generally useful primitives to have.  This
> > implementation is designed to be generic across powerpc's many MMUs.
> > It's possible that this could be optimised to be faster for specific
> > MMUs.
> >
> > This implementation does not handle cases where the caller is attempting
> > to change the mapping of the page it is executing from, or if another
> > CPU is concurrently using the page being altered.  These cases likely
> > shouldn't happen, but a more complex implementation with MMU-specific code
> > could safely handle them.
> >
> > On hash, the linear mapping is not kept in the linux pagetable, so this
> > will not change the protection if used on that range. Currently these
> > functions are not used on the linear map so just WARN for now.
> >
> > Reviewed-by: Daniel Axtens <dja@axtens.net>
> > Signed-off-by: Russell Currey <ruscur@russell.cc>
> > Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> > [jpn: - Allow set memory functions to be used without Strict RWX
> >        - Hash: Disallow certain regions
> >        - Have change_page_attr() take function pointers to manipulate ptes
>
> Did you look at the resulting generated code ? I find it awful.
>
> pte manipulation helpers are meant to be inlined. Here you force the compiler to outline them. This
> also means that the input and output go through memory.
>
> And now set_memory_xx are not tiny inlined functions anymore.
>
> What is the reason you abandoned the way it was done up to now, through the use of an 'action'
> value ? With the previous approach the generated code was a lot lighter.
When I was looking at the patch again, it started to look to me like
the action values were an unneeded abstraction. But yeah doing it like
this makes the generated code much worse. I'll change back in the next
version.
>
> >        - Radix: Add ptesync after set_pte_at()]
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> > v10: WARN if trying to change the hash linear map
> > v11: - Update copyright dates
> >       - Allow set memory functions to be used without Strict RWX
> >       - Hash: Disallow certain regions and add comment explaining why
> >       - Have change_page_attr() take function pointers to manipulate ptes
> >       - Clarify change_page_attr()'s comment
> >       - Radix: Add ptesync after set_pte_at()
> > ---
> >   arch/powerpc/Kconfig                  |   1 +
> >   arch/powerpc/include/asm/set_memory.h |  10 +++
> >   arch/powerpc/mm/Makefile              |   2 +-
> >   arch/powerpc/mm/pageattr.c            | 105 ++++++++++++++++++++++++++
> >   4 files changed, 117 insertions(+), 1 deletion(-)
> >   create mode 100644 arch/powerpc/include/asm/set_memory.h
> >   create mode 100644 arch/powerpc/mm/pageattr.c
> >
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index cb2d44ee4e38..94c34932a74b 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -138,6 +138,7 @@ config PPC
> >       select ARCH_HAS_MEMBARRIER_CALLBACKS
> >       select ARCH_HAS_MEMBARRIER_SYNC_CORE
> >       select ARCH_HAS_SCALED_CPUTIME          if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
> > +     select ARCH_HAS_SET_MEMORY
> >       select ARCH_HAS_STRICT_KERNEL_RWX       if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
> >       select ARCH_HAS_TICK_BROADCAST          if GENERIC_CLOCKEVENTS_BROADCAST
> >       select ARCH_HAS_UACCESS_FLUSHCACHE
> > diff --git a/arch/powerpc/include/asm/set_memory.h b/arch/powerpc/include/asm/set_memory.h
> > new file mode 100644
> > index 000000000000..d1cd69b1a43a
> > --- /dev/null
> > +++ b/arch/powerpc/include/asm/set_memory.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_POWERPC_SET_MEMORY_H
> > +#define _ASM_POWERPC_SET_MEMORY_H
> > +
> > +int set_memory_ro(unsigned long addr, int numpages);
> > +int set_memory_rw(unsigned long addr, int numpages);
> > +int set_memory_nx(unsigned long addr, int numpages);
> > +int set_memory_x(unsigned long addr, int numpages);
> > +
> > +#endif
> > diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
> > index c3df3a8501d4..9142cf1fb0d5 100644
> > --- a/arch/powerpc/mm/Makefile
> > +++ b/arch/powerpc/mm/Makefile
> > @@ -5,7 +5,7 @@
> >
> >   ccflags-$(CONFIG_PPC64)     := $(NO_MINIMAL_TOC)
> >
> > -obj-y                                := fault.o mem.o pgtable.o mmap.o maccess.o \
> > +obj-y                                := fault.o mem.o pgtable.o mmap.o maccess.o pageattr.o \
> >                                  init_$(BITS).o pgtable_$(BITS).o \
> >                                  pgtable-frag.o ioremap.o ioremap_$(BITS).o \
> >                                  init-common.o mmu_context.o drmem.o \
> > diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> > new file mode 100644
> > index 000000000000..3b4aa72e555e
> > --- /dev/null
> > +++ b/arch/powerpc/mm/pageattr.c
> > @@ -0,0 +1,105 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +/*
> > + * MMU-generic set_memory implementation for powerpc
> > + *
> > + * Copyright 2019-2021, IBM Corporation.
> > + */
> > +
> > +#include <linux/mm.h>
> > +#include <linux/set_memory.h>
> > +
> > +#include <asm/mmu.h>
> > +#include <asm/page.h>
> > +#include <asm/pgtable.h>
> > +
> > +
> > +/*
> > + * Updates the attributes of a page in three steps:
> > + *
> > + * 1. invalidate the page table entry
> > + * 2. flush the TLB
> > + * 3. install the new entry with the updated attributes
> > + *
> > + * Invalidating the pte means there are situations where this will not work
> > + * when in theory it should.
> > + * For example:
> > + * - removing write from page whilst it is being executed
> > + * - setting a page read-only whilst it is being read by another CPU
> > + *
> > + */
> > +static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
> > +{
> > +     pte_t (*fn)(pte_t) = data;
> > +     pte_t pte;
> > +
> > +     spin_lock(&init_mm.page_table_lock);
> > +
> > +     /* invalidate the PTE so it's safe to modify */
> > +     pte = ptep_get_and_clear(&init_mm, addr, ptep);
> > +     flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> > +
> > +     /* modify the PTE bits as desired, then apply */
> > +     pte = fn(pte);
> > +
> > +     set_pte_at(&init_mm, addr, ptep, pte);
> > +
> > +     /* See ptesync comment in radix__set_pte_at() */
> > +     if (radix_enabled())
> > +             asm volatile("ptesync": : :"memory");
> > +     spin_unlock(&init_mm.page_table_lock);
> > +
> > +     return 0;
> > +}
> > +
> > +static int change_memory_attr(unsigned long addr, int numpages, pte_t (*fn)(pte_t))
> > +{
> > +     unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
> > +     unsigned long size = numpages * PAGE_SIZE;
> > +
> > +     if (!numpages)
> > +             return 0;
> > +
> > +#ifdef CONFIG_PPC_BOOK3S_64
> > +     /*
> > +      * On hash, the linear mapping is not in the Linux page table so
> > +      * apply_to_existing_page_range() will have no effect. If in the future
> > +      * the set_memory_* functions are used on the linear map this will need
> > +      * to be updated.
> > +      */
> > +     if (!radix_enabled()) {
> > +             int region = get_region_id(addr);
> > +
> > +             if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> > +                     return -EINVAL;
> > +     }
> > +#endif
> > +
> > +     return apply_to_existing_page_range(&init_mm, start, size,
> > +                                         change_page_attr, fn);
> > +}
> > +
> > +int set_memory_ro(unsigned long addr, int numpages)
> > +{
> > +     return change_memory_attr(addr, numpages, pte_wrprotect);
> > +}
> > +
> > +static pte_t pte_mkdirtywrite(pte_t pte)
> > +{
> > +     return pte_mkwrite(pte_mkdirty(pte));
> > +}
> > +
> > +int set_memory_rw(unsigned long addr, int numpages)
> > +{
> > +     return change_memory_attr(addr, numpages, pte_mkdirtywrite);
> > +}
> > +
> > +int set_memory_nx(unsigned long addr, int numpages)
> > +{
> > +     return change_memory_attr(addr, numpages, pte_exprotect);
> > +}
> > +
> > +int set_memory_x(unsigned long addr, int numpages)
> > +{
> > +     return change_memory_attr(addr, numpages, pte_mkexec);
> > +}
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}
  2021-04-29  5:04   ` Christophe Leroy
@ 2021-05-03  5:39     ` Jordan Niethe
  2021-05-03  5:57       ` Christophe Leroy
  0 siblings, 1 reply; 21+ messages in thread
From: Jordan Niethe @ 2021-05-03  5:39 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: ajd, Nicholas Piggin, cmr, Aneesh Kumar K.V, naveen.n.rao,
	linuxppc-dev, Daniel Axtens

On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
> > If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
> > VMALLOC_END respectively. This reduces the need for special cases. For
> > example, powerpc's module_alloc() was previously predicated on
> > MODULES_VADDR being defined but now is unconditionally defined.
> >
> > This will be useful reducing conditional code in other places that need
> > to allocate from the module region (i.e., kprobes).
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> > v10: New to series
> > v11: - Consider more places MODULES_VADDR was being used
> > ---
> >   arch/powerpc/include/asm/pgtable.h    | 11 +++++++++++
> >   arch/powerpc/kernel/module.c          |  5 +----
> >   arch/powerpc/mm/kasan/kasan_init_32.c | 10 +++++-----
> >   arch/powerpc/mm/ptdump/ptdump.c       |  4 ++--
> >   4 files changed, 19 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> > index c6a676714f04..882fda779648 100644
> > --- a/arch/powerpc/include/asm/pgtable.h
> > +++ b/arch/powerpc/include/asm/pgtable.h
> > @@ -39,6 +39,17 @@ struct mm_struct;
> >   #define __S110      PAGE_SHARED_X
> >   #define __S111      PAGE_SHARED_X
> >
> > +#ifndef MODULES_VADDR
> > +#define MODULES_VADDR VMALLOC_START
> > +#define MODULES_END VMALLOC_END
> > +#endif
> > +
> > +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
>
> No no.
>
> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.
>
> Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?
On ppc64s, MODULES_VADDR is __vmalloc_start (a variable) and
TASK_SIZE depends on current.
Also, for nohash like 44x, MODULES_VADDR is defined based on high_memory.
If I put it back in module_alloc() and wrap it with #ifdef
CONFIG_PPC_BOOK3S_32, will that be fine?

>
> > +#if TASK_SIZE > MODULES_VADDR
> > +#error TASK_SIZE > MODULES_VADDR
> > +#endif
> > +#endif
> > +
> >   #ifndef __ASSEMBLY__
> >
> >   /* Keep these as a macros to avoid include dependency mess */
> > diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
> > index fab84024650c..c60c7457ff47 100644
> > --- a/arch/powerpc/kernel/module.c
> > +++ b/arch/powerpc/kernel/module.c
> > @@ -15,6 +15,7 @@
> >   #include <linux/sort.h>
> >   #include <asm/setup.h>
> >   #include <asm/sections.h>
> > +#include <linux/mm.h>
> >
> >   static LIST_HEAD(module_bug_list);
> >
> > @@ -88,7 +89,6 @@ int module_finalize(const Elf_Ehdr *hdr,
> >       return 0;
> >   }
> >
> > -#ifdef MODULES_VADDR
> >   static __always_inline void *
> >   __module_alloc(unsigned long size, unsigned long start, unsigned long end)
> >   {
> > @@ -102,8 +102,6 @@ void *module_alloc(unsigned long size)
> >       unsigned long limit = (unsigned long)_etext - SZ_32M;
> >       void *ptr = NULL;
> >
> > -     BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
> > -
> >       /* First try within 32M limit from _etext to avoid branch trampolines */
> >       if (MODULES_VADDR < PAGE_OFFSET && MODULES_END > limit)
> >               ptr = __module_alloc(size, limit, MODULES_END);
> > @@ -113,4 +111,3 @@ void *module_alloc(unsigned long size)
> >
> >       return ptr;
> >   }
> > -#endif
> > diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
> > index cf8770b1a692..42c057366ac7 100644
> > --- a/arch/powerpc/mm/kasan/kasan_init_32.c
> > +++ b/arch/powerpc/mm/kasan/kasan_init_32.c
> > @@ -116,11 +116,11 @@ static void __init kasan_unmap_early_shadow_vmalloc(void)
> >
> >       kasan_update_early_region(k_start, k_end, __pte(0));
> >
> > -#ifdef MODULES_VADDR
> > -     k_start = (unsigned long)kasan_mem_to_shadow((void *)MODULES_VADDR);
> > -     k_end = (unsigned long)kasan_mem_to_shadow((void *)MODULES_END);
> > -     kasan_update_early_region(k_start, k_end, __pte(0));
> > -#endif
> > +     if (MODULES_VADDR != VMALLOC_START && MODULES_END != VMALLOC_END) {
>
> Shouldn't it be an || ?
Yeah
>
> As soon as either MODULES_VADDR or MODULES_END differs from the vmalloc boundaries, it needs to be
> done I think.
>
> > +             k_start = (unsigned long)kasan_mem_to_shadow((void *)MODULES_VADDR);
> > +             k_end = (unsigned long)kasan_mem_to_shadow((void *)MODULES_END);
> > +             kasan_update_early_region(k_start, k_end, __pte(0));
> > +     }
> >   }
> >
> >   void __init kasan_mmu_init(void)
> > diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
> > index aca354fb670b..0431457f668f 100644
> > --- a/arch/powerpc/mm/ptdump/ptdump.c
> > +++ b/arch/powerpc/mm/ptdump/ptdump.c
> > @@ -73,7 +73,7 @@ struct addr_marker {
> >
> >   static struct addr_marker address_markers[] = {
> >       { 0,    "Start of kernel VM" },
> > -#ifdef MODULES_VADDR
> > +#if defined(CONFIG_BOOK32_32) && defined(CONFIG_STRICT_KERNEL_RWX)
>
> Not valid anymore, see https://github.com/linuxppc/linux/commit/80edc68e0479 and
> https://github.com/linuxppc/linux/commit/9132a2e82adc
>
> The best would be to be able to do something like:
>
> #if MODULES_VADDR != VMALLOC_START
I tried to do it like that originally, but with stuff like
#define VMALLOC_START ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
it doesn't work, since the preprocessor can't compare values that are only
known at runtime.
>
> If it doesn't work, then it has to be
>
> #if defined(CONFIG_BOOK32_32) || defined(CONFIG_PPC_8xx)
Ok, thanks.
>
> >       { 0,    "modules start" },
> >       { 0,    "modules end" },
> >   #endif
> > @@ -359,7 +359,7 @@ static void populate_markers(void)
> >   #else
> >       address_markers[i++].start_address = TASK_SIZE;
> >   #endif
> > -#ifdef MODULES_VADDR
> > +#if defined(CONFIG_BOOK32_32) && defined(CONFIG_STRICT_KERNEL_RWX)
>
> Same.
>
> >       address_markers[i++].start_address = MODULES_VADDR;
> >       address_markers[i++].start_address = MODULES_END;
> >   #endif
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}
  2021-05-03  5:39     ` Jordan Niethe
@ 2021-05-03  5:57       ` Christophe Leroy
  2021-05-03  6:16         ` Jordan Niethe
  0 siblings, 1 reply; 21+ messages in thread
From: Christophe Leroy @ 2021-05-03  5:57 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: ajd, Nicholas Piggin, cmr, Aneesh Kumar K.V, naveen.n.rao,
	linuxppc-dev, Daniel Axtens



Le 03/05/2021 à 07:39, Jordan Niethe a écrit :
> On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
>>
>>
>>
>> Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
>>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
>>> VMALLOC_END respectively. This reduces the need for special cases. For
>>> example, powerpc's module_alloc() was previously predicated on
>>> MODULES_VADDR being defined but now is unconditionally defined.
>>>
>>> This will be useful reducing conditional code in other places that need
>>> to allocate from the module region (i.e., kprobes).
>>>
>>> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
>>> ---
>>> v10: New to series
>>> v11: - Consider more places MODULES_VADDR was being used
>>> ---
>>>    arch/powerpc/include/asm/pgtable.h    | 11 +++++++++++
>>>    arch/powerpc/kernel/module.c          |  5 +----
>>>    arch/powerpc/mm/kasan/kasan_init_32.c | 10 +++++-----
>>>    arch/powerpc/mm/ptdump/ptdump.c       |  4 ++--
>>>    4 files changed, 19 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
>>> index c6a676714f04..882fda779648 100644
>>> --- a/arch/powerpc/include/asm/pgtable.h
>>> +++ b/arch/powerpc/include/asm/pgtable.h
>>> @@ -39,6 +39,17 @@ struct mm_struct;
>>>    #define __S110      PAGE_SHARED_X
>>>    #define __S111      PAGE_SHARED_X
>>>
>>> +#ifndef MODULES_VADDR
>>> +#define MODULES_VADDR VMALLOC_START
>>> +#define MODULES_END VMALLOC_END
>>> +#endif
>>> +
>>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
>>
>> No no.
>>
>> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.
>>
>> Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?
> On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
> TASK_SIZE depends on current.
> Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
> If I put it back in module_alloc() and wrap it with #ifdef
> CONFIG_PPC_BOOK3S_32 will that be fine?

Thinking about it once more, I think the best approach is the one taken by Nick in 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npiggin@gmail.com/

Use MODULES_VADDR/MODULES_END when it exists, use VMALLOC_START/VMALLOC_END otherwise.
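
In practice that amounts to roughly the sketch below in module_alloc() -- an
illustration of the idea only, not Nick's patch verbatim, and the allocation
flags here are assumptions:

  void *module_alloc(unsigned long size)
  {
  #ifdef MODULES_VADDR
          unsigned long start = MODULES_VADDR;
          unsigned long end = MODULES_END;
  #else
          unsigned long start = VMALLOC_START;
          unsigned long end = VMALLOC_END;
  #endif

          return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL,
                                      PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
                                      NUMA_NO_NODE,
                                      __builtin_return_address(0));
  }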

I know I suggested always defining MODULES_VADDR, but maybe that's not the best solution in the end.

For kprobes, is there a way to re-use functions from modules.c in alloc_insn_page() ?

Christophe

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}
  2021-05-03  5:57       ` Christophe Leroy
@ 2021-05-03  6:16         ` Jordan Niethe
  2021-05-03  6:22           ` Christophe Leroy
  0 siblings, 1 reply; 21+ messages in thread
From: Jordan Niethe @ 2021-05-03  6:16 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: ajd, Nicholas Piggin, cmr, Aneesh Kumar K.V, naveen.n.rao,
	linuxppc-dev, Daniel Axtens

On Mon, May 3, 2021 at 3:57 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 03/05/2021 à 07:39, Jordan Niethe a écrit :
> > On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
> > <christophe.leroy@csgroup.eu> wrote:
> >>
> >>
> >>
> >> Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
> >>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
> >>> VMALLOC_END respectively. This reduces the need for special cases. For
> >>> example, powerpc's module_alloc() was previously predicated on
> >>> MODULES_VADDR being defined but now is unconditionally defined.
> >>>
> >>> This will be useful reducing conditional code in other places that need
> >>> to allocate from the module region (i.e., kprobes).
> >>>
> >>> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> >>> ---
> >>> v10: New to series
> >>> v11: - Consider more places MODULES_VADDR was being used
> >>> ---
> >>>    arch/powerpc/include/asm/pgtable.h    | 11 +++++++++++
> >>>    arch/powerpc/kernel/module.c          |  5 +----
> >>>    arch/powerpc/mm/kasan/kasan_init_32.c | 10 +++++-----
> >>>    arch/powerpc/mm/ptdump/ptdump.c       |  4 ++--
> >>>    4 files changed, 19 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> >>> index c6a676714f04..882fda779648 100644
> >>> --- a/arch/powerpc/include/asm/pgtable.h
> >>> +++ b/arch/powerpc/include/asm/pgtable.h
> >>> @@ -39,6 +39,17 @@ struct mm_struct;
> >>>    #define __S110      PAGE_SHARED_X
> >>>    #define __S111      PAGE_SHARED_X
> >>>
> >>> +#ifndef MODULES_VADDR
> >>> +#define MODULES_VADDR VMALLOC_START
> >>> +#define MODULES_END VMALLOC_END
> >>> +#endif
> >>> +
> >>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
> >>
> >> No no.
> >>
> >> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.
> >>
> >> Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?
> > On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
> > TASK_SIZE depends on current.
> > Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
> > If I put it back in module_alloc() and wrap it with #ifdef
> > CONFIG_PPC_BOOK3S_32 will that be fine?
>
> Thinking about it once more, I think the best approach is the one taken by Nick in
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npiggin@gmail.com/
>
> Use MODULES_VADDR/MODULES_END when it exists, use VMALLOC_START/VMALLOC_END otherwise.
>
> I know I suggested to always define MODULES_VADDR, but maybe that's not the best solution at the end.
Sure, let's do it like that.
>
> For kprobes, is there a way to re-use functions from modules.c in alloc_insn_page() ?
Probably we can use module_alloc() and then the set_memory_*() functions to
get the permissions right.
Something like what we had in v9:
https://lore.kernel.org/linuxppc-dev/20210316031741.1004850-3-jniethe5@gmail.com/
>
> Christophe

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}
  2021-05-03  6:16         ` Jordan Niethe
@ 2021-05-03  6:22           ` Christophe Leroy
  2021-05-03  6:26             ` Jordan Niethe
  0 siblings, 1 reply; 21+ messages in thread
From: Christophe Leroy @ 2021-05-03  6:22 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: ajd, Nicholas Piggin, cmr, Aneesh Kumar K.V, naveen.n.rao,
	linuxppc-dev, Daniel Axtens



Le 03/05/2021 à 08:16, Jordan Niethe a écrit :
> On Mon, May 3, 2021 at 3:57 PM Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
>>
>>
>>
>> Le 03/05/2021 à 07:39, Jordan Niethe a écrit :
>>> On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
>>> <christophe.leroy@csgroup.eu> wrote:
>>>>
>>>>
>>>>
>>>> Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
>>>>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
>>>>> VMALLOC_END respectively. This reduces the need for special cases. For
>>>>> example, powerpc's module_alloc() was previously predicated on
>>>>> MODULES_VADDR being defined but now is unconditionally defined.
>>>>>
>>>>> This will be useful reducing conditional code in other places that need
>>>>> to allocate from the module region (i.e., kprobes).
>>>>>
>>>>> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
>>>>> ---
>>>>> v10: New to series
>>>>> v11: - Consider more places MODULES_VADDR was being used
>>>>> ---
>>>>>     arch/powerpc/include/asm/pgtable.h    | 11 +++++++++++
>>>>>     arch/powerpc/kernel/module.c          |  5 +----
>>>>>     arch/powerpc/mm/kasan/kasan_init_32.c | 10 +++++-----
>>>>>     arch/powerpc/mm/ptdump/ptdump.c       |  4 ++--
>>>>>     4 files changed, 19 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
>>>>> index c6a676714f04..882fda779648 100644
>>>>> --- a/arch/powerpc/include/asm/pgtable.h
>>>>> +++ b/arch/powerpc/include/asm/pgtable.h
>>>>> @@ -39,6 +39,17 @@ struct mm_struct;
>>>>>     #define __S110      PAGE_SHARED_X
>>>>>     #define __S111      PAGE_SHARED_X
>>>>>
>>>>> +#ifndef MODULES_VADDR
>>>>> +#define MODULES_VADDR VMALLOC_START
>>>>> +#define MODULES_END VMALLOC_END
>>>>> +#endif
>>>>> +
>>>>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
>>>>
>>>> No no.
>>>>
>>>> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.
>>>>
>>>> Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?
>>> On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
>>> TASK_SIZE depends on current.
>>> Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
>>> If I put it back in module_alloc() and wrap it with #ifdef
>>> CONFIG_PPC_BOOK3S_32 will that be fine?
>>
>> Thinking about it once more, I think the best approach is the one taken by Nick in
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npiggin@gmail.com/
>>
>> Use MODULES_VADDR/MODULES_END when it exists, use VMALLOC_START/VMALLOC_END otherwise.
>>
>> I know I suggested to always define MODULES_VADDR, but maybe that's not the best solution at the end.
> Sure, let's do it like that.
>>
>> For kprobes, is there a way to re-use functions from modules.c in alloc_insn_page() ?
> Probably we can use module_alloc() then the set_memory_ functions to
> get the permissions right.
> Something like we had in v9:
> https://lore.kernel.org/linuxppc-dev/20210316031741.1004850-3-jniethe5@gmail.com/

Yes, more or less, but using module_alloc() instead of vmalloc().
And module_alloc() implies EXEC, so only set_memory_ro() will be required.

I see no point in doing any set_memory_xxx() in free_insn_page(), because as soon as you do a 
vfree() the page is not mapped anymore so any access will lead to a fault.
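
Put together, alloc_insn_page() would then look roughly like the sketch
below. This only reflects the direction of the discussion, not the patch
that ends up being posted, and it assumes module_alloc() maps the page
executable as noted above:

  void *alloc_insn_page(void)
  {
          void *page = module_alloc(PAGE_SIZE);

          if (!page)
                  return NULL;

          /* module_alloc() already maps the page executable */
          set_memory_ro((unsigned long)page, 1);

          return page;
  }

free_insn_page() can then stay a plain module_memfree(), since the mapping
is torn down on free and, assuming module_alloc() sets VM_FLUSH_RESET_PERMS,
the permissions are reset as well.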

Christophe

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}
  2021-05-03  6:22           ` Christophe Leroy
@ 2021-05-03  6:26             ` Jordan Niethe
  2021-05-03  6:32               ` Christophe Leroy
  0 siblings, 1 reply; 21+ messages in thread
From: Jordan Niethe @ 2021-05-03  6:26 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: ajd, Nicholas Piggin, cmr, Aneesh Kumar K.V, naveen.n.rao,
	linuxppc-dev, Daniel Axtens

On Mon, May 3, 2021 at 4:22 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 03/05/2021 à 08:16, Jordan Niethe a écrit :
> > On Mon, May 3, 2021 at 3:57 PM Christophe Leroy
> > <christophe.leroy@csgroup.eu> wrote:
> >>
> >>
> >>
> >> Le 03/05/2021 à 07:39, Jordan Niethe a écrit :
> >>> On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
> >>> <christophe.leroy@csgroup.eu> wrote:
> >>>>
> >>>>
> >>>>
> >>>> Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
> >>>>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
> >>>>> VMALLOC_END respectively. This reduces the need for special cases. For
> >>>>> example, powerpc's module_alloc() was previously predicated on
> >>>>> MODULES_VADDR being defined but now is unconditionally defined.
> >>>>>
> >>>>> This will be useful reducing conditional code in other places that need
> >>>>> to allocate from the module region (i.e., kprobes).
> >>>>>
> >>>>> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> >>>>> ---
> >>>>> v10: New to series
> >>>>> v11: - Consider more places MODULES_VADDR was being used
> >>>>> ---
> >>>>>     arch/powerpc/include/asm/pgtable.h    | 11 +++++++++++
> >>>>>     arch/powerpc/kernel/module.c          |  5 +----
> >>>>>     arch/powerpc/mm/kasan/kasan_init_32.c | 10 +++++-----
> >>>>>     arch/powerpc/mm/ptdump/ptdump.c       |  4 ++--
> >>>>>     4 files changed, 19 insertions(+), 11 deletions(-)
> >>>>>
> >>>>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> >>>>> index c6a676714f04..882fda779648 100644
> >>>>> --- a/arch/powerpc/include/asm/pgtable.h
> >>>>> +++ b/arch/powerpc/include/asm/pgtable.h
> >>>>> @@ -39,6 +39,17 @@ struct mm_struct;
> >>>>>     #define __S110      PAGE_SHARED_X
> >>>>>     #define __S111      PAGE_SHARED_X
> >>>>>
> >>>>> +#ifndef MODULES_VADDR
> >>>>> +#define MODULES_VADDR VMALLOC_START
> >>>>> +#define MODULES_END VMALLOC_END
> >>>>> +#endif
> >>>>> +
> >>>>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
> >>>>
> >>>> No no.
> >>>>
> >>>> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.
> >>>>
> >>>> Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?
> >>> On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
> >>> TASK_SIZE depends on current.
> >>> Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
> >>> If I put it back in module_alloc() and wrap it with #ifdef
> >>> CONFIG_PPC_BOOK3S_32 will that be fine?
> >>
> >> Thinking about it once more, I think the best approach is the one taken by Nick in
> >> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npiggin@gmail.com/
> >>
> >> Use MODULES_VADDR/MODULES_END when it exists, use VMALLOC_START/VMALLOC_END otherwise.
> >>
> >> I know I suggested to always define MODULES_VADDR, but maybe that's not the best solution at the end.
> > Sure, let's do it like that.
> >>
> >> For kprobes, is there a way to re-use functions from modules.c in alloc_insn_page() ?
> > Probably we can use module_alloc() then the set_memory_ functions to
> > get the permissions right.
> > Something like we had in v9:
> > https://lore.kernel.org/linuxppc-dev/20210316031741.1004850-3-jniethe5@gmail.com/
>
> Yes, more or less, but using module_alloc() instead of vmalloc().
> And module_alloc() implies EXEC, so only the set_memory_ro() will be required.
Yep.
>
> I see no point in doing any set_memory_xxx() in free_insn_page(), because as soon as you do a
> vfree() the page is not mapped anymore so any access will lead to a fault.
Yeah, I'd not realised we had VM_FLUSH_RESET_PERMS when I added that.
I agree it's pointless.
>
> Christophe

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}
  2021-05-03  6:26             ` Jordan Niethe
@ 2021-05-03  6:32               ` Christophe Leroy
  0 siblings, 0 replies; 21+ messages in thread
From: Christophe Leroy @ 2021-05-03  6:32 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: ajd, Nicholas Piggin, cmr, Aneesh Kumar K.V, naveen.n.rao,
	linuxppc-dev, Daniel Axtens



Le 03/05/2021 à 08:26, Jordan Niethe a écrit :
> On Mon, May 3, 2021 at 4:22 PM Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
>>
>>
>>
>> Le 03/05/2021 à 08:16, Jordan Niethe a écrit :
>>> On Mon, May 3, 2021 at 3:57 PM Christophe Leroy
>>> <christophe.leroy@csgroup.eu> wrote:
>>>>
>>>>
>>>>
>>>> Le 03/05/2021 à 07:39, Jordan Niethe a écrit :
>>>>> On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
>>>>> <christophe.leroy@csgroup.eu> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
>>>>>>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
>>>>>>> VMALLOC_END respectively. This reduces the need for special cases. For
>>>>>>> example, powerpc's module_alloc() was previously predicated on
>>>>>>> MODULES_VADDR being defined but now is unconditionally defined.
>>>>>>>
>>>>>>> This will be useful reducing conditional code in other places that need
>>>>>>> to allocate from the module region (i.e., kprobes).
>>>>>>>
>>>>>>> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
>>>>>>> ---
>>>>>>> v10: New to series
>>>>>>> v11: - Consider more places MODULES_VADDR was being used
>>>>>>> ---
>>>>>>>      arch/powerpc/include/asm/pgtable.h    | 11 +++++++++++
>>>>>>>      arch/powerpc/kernel/module.c          |  5 +----
>>>>>>>      arch/powerpc/mm/kasan/kasan_init_32.c | 10 +++++-----
>>>>>>>      arch/powerpc/mm/ptdump/ptdump.c       |  4 ++--
>>>>>>>      4 files changed, 19 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
>>>>>>> index c6a676714f04..882fda779648 100644
>>>>>>> --- a/arch/powerpc/include/asm/pgtable.h
>>>>>>> +++ b/arch/powerpc/include/asm/pgtable.h
>>>>>>> @@ -39,6 +39,17 @@ struct mm_struct;
>>>>>>>      #define __S110      PAGE_SHARED_X
>>>>>>>      #define __S111      PAGE_SHARED_X
>>>>>>>
>>>>>>> +#ifndef MODULES_VADDR
>>>>>>> +#define MODULES_VADDR VMALLOC_START
>>>>>>> +#define MODULES_END VMALLOC_END
>>>>>>> +#endif
>>>>>>> +
>>>>>>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
>>>>>>
>>>>>> No no.
>>>>>>
>>>>>> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.
>>>>>>
>>>>>> Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?
>>>>> On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
>>>>> TASK_SIZE depends on current.
>>>>> Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
>>>>> If I put it back in module_alloc() and wrap it with #ifdef
>>>>> CONFIG_PPC_BOOK3S_32 will that be fine?
>>>>
>>>> Thinking about it once more, I think the best approach is the one taken by Nick in
>>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npiggin@gmail.com/
>>>>
>>>> Use MODULES_VADDR/MODULES_END when it exists, use VMALLOC_START/VMALLOC_END otherwise.
>>>>
>>>> I know I suggested to always define MODULES_VADDR, but maybe that's not the best solution at the end.
>>> Sure, let's do it like that.
>>>>
>>>> For kprobes, is there a way to re-use functions from modules.c in alloc_insn_page() ?
>>> Probably we can use module_alloc() then the set_memory_ functions to
>>> get the permissions right.
>>> Something like we had in v9:
>>> https://lore.kernel.org/linuxppc-dev/20210316031741.1004850-3-jniethe5@gmail.com/
>>
>> Yes, more or less, but using module_alloc() instead of vmalloc().
>> And module_alloc() implies EXEC, so only the set_memory_ro() will be required.
> Yep.
>>
>> I see no point in doing any set_memory_xxx() in free_insn_page(), because as soon as you do a
>> vfree() the page is not mapped anymore so any access will lead to a fault.
> Yeah, I'd not realised we had VM_FLUSH_RESET_PERMS when I added that.
> I agree it's pointless.

In the end it should be quite similar to what the S390 architecture does.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v11 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier
  2021-04-29  4:53   ` Christophe Leroy
@ 2021-05-05  5:22     ` Jordan Niethe
  0 siblings, 0 replies; 21+ messages in thread
From: Jordan Niethe @ 2021-05-05  5:22 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: ajd, Nicholas Piggin, cmr, Aneesh Kumar K.V, naveen.n.rao,
	linuxppc-dev, Daniel Axtens

On Thu, Apr 29, 2021 at 2:53 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 29/04/2021 à 05:15, Jordan Niethe a écrit :
> > setup_text_poke_area() is a late init call so it runs before
> > mark_rodata_ro() and after the init calls. This lets all the init code
> > patching simply write to their locations. In the future, kprobes is
> > going to allocate its instruction pages RO which means they will need
> > setup_text_poke_area() to have been already called for their code
> > patching. However, init_kprobes() (which allocates and patches some
> > instruction pages) is an early init call so it happens before
> > setup_text_poke_area().
> >
> > start_kernel() calls poking_init() before any of the init calls. On
> > powerpc, poking_init() is currently a nop. setup_text_poke_area() relies
> > on kernel virtual memory, cpu hotplug and per_cpu_areas being setup.
> > setup_per_cpu_areas(), boot_cpu_hotplug_init() and mm_init() are called
> > before poking_init().
> >
> > Turn setup_text_poke_area() into poking_init().
>
> I can't remember, maybe I already asked the question:
> Have you done some performance measurement or at least some performance analysis ?
No, I don't think you have asked, and it is a good question.

Here are some results on a Power9 (T2P9D01 REV 1.01) running powernv_defconfig
Timestamp at "Run /init as init process"
Before: ~1.059326
After: ~1.273105

Turning on more testing, the difference is greater:
For example, turning on CONFIG_FTRACE_STARTUP_TEST
Timestamp at "Run /init as init process"
Before: ~7.176759
After: ~15.967576

Running with initcall_debug:
Before: initcall init_trace_selftests+0x0/0x1b4 returned 0 after 2880859 usecs
After: initcall init_trace_selftests+0x0/0x1b4 returned 0 after 10048828 usecs

So it does slow it down.
But it also might be a good thing for testing: tests that use code
patching will now go through the same patching code path that would
be used on a fully booted system.

>
> Christophe
>
> >
> > Reviewed-by: Russell Currey <ruscur@russell.cc>
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> > v9: New to series
> > ---
> >   arch/powerpc/lib/code-patching.c | 12 ++++--------
> >   1 file changed, 4 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index 870b30d9be2f..15296207e1ba 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -70,14 +70,11 @@ static int text_area_cpu_down(unsigned int cpu)
> >   }
> >
> >   /*
> > - * Run as a late init call. This allows all the boot time patching to be done
> > - * simply by patching the code, and then we're called here prior to
> > - * mark_rodata_ro(), which happens after all init calls are run. Although
> > - * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
> > - * it as being preferable to a kernel that will crash later when someone tries
> > - * to use patch_instruction().
> > + * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
> > + * we judge it as being preferable to a kernel that will crash later when
> > + * someone tries to use patch_instruction().
> >    */
> > -static int __init setup_text_poke_area(void)
> > +int __init poking_init(void)
> >   {
> >       BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> >               "powerpc/text_poke:online", text_area_cpu_up,
> > @@ -85,7 +82,6 @@ static int __init setup_text_poke_area(void)
> >
> >       return 0;
> >   }
> > -late_initcall(setup_text_poke_area);
> >
> >   /*
> >    * This can be called for kernel text or a module.
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-05-05  5:23 UTC | newest]

Thread overview: 21+ messages
2021-04-29  3:15 [PATCH v11 0/9] powerpc: Further Strict RWX support Jordan Niethe
2021-04-29  3:15 ` [PATCH v11 1/9] powerpc/mm: Implement set_memory() routines Jordan Niethe
2021-04-29  7:32   ` Christophe Leroy
2021-05-03  5:02     ` Jordan Niethe
2021-04-29  3:15 ` [PATCH v11 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier Jordan Niethe
2021-04-29  4:53   ` Christophe Leroy
2021-05-05  5:22     ` Jordan Niethe
2021-04-29  3:15 ` [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END} Jordan Niethe
2021-04-29  5:04   ` Christophe Leroy
2021-05-03  5:39     ` Jordan Niethe
2021-05-03  5:57       ` Christophe Leroy
2021-05-03  6:16         ` Jordan Niethe
2021-05-03  6:22           ` Christophe Leroy
2021-05-03  6:26             ` Jordan Niethe
2021-05-03  6:32               ` Christophe Leroy
2021-04-29  3:15 ` [PATCH v11 4/9] powerpc/kprobes: Mark newly allocated probes as ROX Jordan Niethe
2021-04-29  3:15 ` [PATCH v11 5/9] powerpc/bpf: Remove bpf_jit_free() Jordan Niethe
2021-04-29  3:15 ` [PATCH v11 6/9] powerpc/bpf: Write protect JIT code Jordan Niethe
2021-04-29  3:16 ` [PATCH v11 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX Jordan Niethe
2021-04-29  3:16 ` [PATCH v11 8/9] powerpc/mm: implement set_memory_attr() Jordan Niethe
2021-04-29  3:16 ` [PATCH v11 9/9] powerpc/32: use set_memory_attr() Jordan Niethe
