* [PATCH v15 0/9] powerpc: Further Strict RWX support
@ 2021-06-09  1:34 Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 1/9] powerpc/mm: Implement set_memory() routines Jordan Niethe
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

Adding more Strict RWX support on powerpc, in particular Strict Module RWX.
It is now rebased on ppc next.

For reference the previous revision is available here: 
https://lore.kernel.org/linuxppc-dev/20210517032810.129949-1-jniethe5@gmail.com/

Changes for v15:

Christophe Leroy (2):
  powerpc/mm: implement set_memory_attr()
  powerpc/32: use set_memory_attr()

Jordan Niethe (4):
  powerpc/lib/code-patching: Set up Strict RWX patching earlier
  powerpc/modules: Make module_alloc() Strict Module RWX aware
  powerpc/bpf: Remove bpf_jit_free()
  powerpc/bpf: Write protect JIT code

Russell Currey (3):
  powerpc/mm: Implement set_memory() routines
  powerpc/kprobes: Mark newly allocated probes as ROX
  powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
    - Force STRICT_KERNEL_RWX if STRICT_MODULE_RWX is selected
    - Predicate on !PPC_BOOK3S_32 instead


 arch/powerpc/Kconfig                  |   3 +
 arch/powerpc/include/asm/mmu.h        |   5 +
 arch/powerpc/include/asm/set_memory.h |  34 +++++++
 arch/powerpc/kernel/kprobes.c         |  17 ++++
 arch/powerpc/kernel/module.c          |   4 +-
 arch/powerpc/lib/code-patching.c      |  12 +--
 arch/powerpc/mm/Makefile              |   2 +-
 arch/powerpc/mm/pageattr.c            | 134 ++++++++++++++++++++++++++
 arch/powerpc/mm/pgtable_32.c          |  60 ++----------
 arch/powerpc/net/bpf_jit_comp.c       |  13 +--
 10 files changed, 212 insertions(+), 72 deletions(-)
 create mode 100644 arch/powerpc/include/asm/set_memory.h
 create mode 100644 arch/powerpc/mm/pageattr.c

-- 
2.25.1



* [PATCH v15 1/9] powerpc/mm: Implement set_memory() routines
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier Jordan Niethe
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

From: Russell Currey <ruscur@russell.cc>

The set_memory_{ro/rw/nx/x}() functions are required for
STRICT_MODULE_RWX, and are generally useful primitives to have.  This
implementation is designed to be generic across powerpc's many MMUs.
It's possible that this could be optimised to be faster for specific
MMUs.

This implementation does not handle cases where the caller is attempting
to change the mapping of the page it is executing from, or if another
CPU is concurrently using the page being altered.  These cases likely
shouldn't happen, but a more complex implementation with MMU-specific code
could safely handle them.

On hash, the linear mapping is not kept in the Linux page table, so these
functions will not change the protection if used on that range. Currently
they are not used on the linear map, so just WARN and return an error for now.

apply_to_existing_page_range() does not work on huge pages so for now
disallow changing the protection of huge pages.
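
As a usage illustration only (a minimal sketch, not code from this patch;
make_rox_copy() is a made-up name), a caller that wants a read-only,
executable copy of some code could do roughly:

#include <linux/mm.h>
#include <linux/string.h>
#include <linux/vmalloc.h>
#include <linux/set_memory.h>

/* Copy some code into a fresh vmalloc page, then make it ROX. */
static void *make_rox_copy(const void *src, size_t len)
{
	void *page;

	if (len > PAGE_SIZE)
		return NULL;

	page = vmalloc(PAGE_SIZE);	/* page aligned, in the vmalloc region */
	if (!page)
		return NULL;

	memcpy(page, src, len);

	/* one page: drop write first, then add execute */
	set_memory_ro((unsigned long)page, 1);
	set_memory_x((unsigned long)page, 1);

	return page;
}

Before vfree()ing such a page the caller would either flip it back with
set_memory_rw()/set_memory_nx() or have allocated it through an interface
that sets VM_FLUSH_RESET_PERMS, as the kprobes patch later in this series
does via module_alloc().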

Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[jpn: - Allow set memory functions to be used without Strict RWX
      - Hash: Disallow certain regions
      - Have change_page_attr() take function pointers to manipulate ptes
      - Radix: Add ptesync after set_pte_at()]
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v10: WARN if trying to change the hash linear map
v11: - Update copyright dates
     - Allow set memory functions to be used without Strict RWX
     - Hash: Disallow certain regions and add comment explaining why
     - Have change_page_attr() take function pointers to manipulate ptes
     - Clarify change_page_attr()'s comment
     - Radix: Add ptesync after set_pte_at()
v12: - change_page_attr() back to taking an action value
     - disallow operating on huge pages
v14: - only check is_vm_area_hugepages() for virtual memory
---
 arch/powerpc/Kconfig                  |   1 +
 arch/powerpc/include/asm/set_memory.h |  32 ++++++++
 arch/powerpc/mm/Makefile              |   2 +-
 arch/powerpc/mm/pageattr.c            | 101 ++++++++++++++++++++++++++
 4 files changed, 135 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/include/asm/set_memory.h
 create mode 100644 arch/powerpc/mm/pageattr.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 36ba413f49d3..abfe2e9225fa 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -140,6 +140,7 @@ config PPC
 	select ARCH_HAS_PTE_DEVMAP		if PPC_BOOK3S_64
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
+	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_STRICT_KERNEL_RWX	if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
 	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UACCESS_FLUSHCACHE
diff --git a/arch/powerpc/include/asm/set_memory.h b/arch/powerpc/include/asm/set_memory.h
new file mode 100644
index 000000000000..64011ea444b4
--- /dev/null
+++ b/arch/powerpc/include/asm/set_memory.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_SET_MEMORY_H
+#define _ASM_POWERPC_SET_MEMORY_H
+
+#define SET_MEMORY_RO	0
+#define SET_MEMORY_RW	1
+#define SET_MEMORY_NX	2
+#define SET_MEMORY_X	3
+
+int change_memory_attr(unsigned long addr, int numpages, long action);
+
+static inline int set_memory_ro(unsigned long addr, int numpages)
+{
+	return change_memory_attr(addr, numpages, SET_MEMORY_RO);
+}
+
+static inline int set_memory_rw(unsigned long addr, int numpages)
+{
+	return change_memory_attr(addr, numpages, SET_MEMORY_RW);
+}
+
+static inline int set_memory_nx(unsigned long addr, int numpages)
+{
+	return change_memory_attr(addr, numpages, SET_MEMORY_NX);
+}
+
+static inline int set_memory_x(unsigned long addr, int numpages)
+{
+	return change_memory_attr(addr, numpages, SET_MEMORY_X);
+}
+
+#endif
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index c3df3a8501d4..9142cf1fb0d5 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -5,7 +5,7 @@
 
 ccflags-$(CONFIG_PPC64)	:= $(NO_MINIMAL_TOC)
 
-obj-y				:= fault.o mem.o pgtable.o mmap.o maccess.o \
+obj-y				:= fault.o mem.o pgtable.o mmap.o maccess.o pageattr.o \
 				   init_$(BITS).o pgtable_$(BITS).o \
 				   pgtable-frag.o ioremap.o ioremap_$(BITS).o \
 				   init-common.o mmu_context.o drmem.o \
diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
new file mode 100644
index 000000000000..5e5ae50a7f23
--- /dev/null
+++ b/arch/powerpc/mm/pageattr.c
@@ -0,0 +1,101 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * MMU-generic set_memory implementation for powerpc
+ *
+ * Copyright 2019-2021, IBM Corporation.
+ */
+
+#include <linux/mm.h>
+#include <linux/vmalloc.h>
+#include <linux/set_memory.h>
+
+#include <asm/mmu.h>
+#include <asm/page.h>
+#include <asm/pgtable.h>
+
+
+/*
+ * Updates the attributes of a page in three steps:
+ *
+ * 1. invalidate the page table entry
+ * 2. flush the TLB
+ * 3. install the new entry with the updated attributes
+ *
+ * Invalidating the pte means there are situations where this will not work
+ * when in theory it should.
+ * For example:
+ * - removing write from page whilst it is being executed
+ * - setting a page read-only whilst it is being read by another CPU
+ *
+ */
+static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
+{
+	long action = (long)data;
+	pte_t pte;
+
+	spin_lock(&init_mm.page_table_lock);
+
+	/* invalidate the PTE so it's safe to modify */
+	pte = ptep_get_and_clear(&init_mm, addr, ptep);
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
+	/* modify the PTE bits as desired, then apply */
+	switch (action) {
+	case SET_MEMORY_RO:
+		pte = pte_wrprotect(pte);
+		break;
+	case SET_MEMORY_RW:
+		pte = pte_mkwrite(pte_mkdirty(pte));
+		break;
+	case SET_MEMORY_NX:
+		pte = pte_exprotect(pte);
+		break;
+	case SET_MEMORY_X:
+		pte = pte_mkexec(pte);
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		break;
+	}
+
+	set_pte_at(&init_mm, addr, ptep, pte);
+
+	/* See ptesync comment in radix__set_pte_at() */
+	if (radix_enabled())
+		asm volatile("ptesync": : :"memory");
+	spin_unlock(&init_mm.page_table_lock);
+
+	return 0;
+}
+
+int change_memory_attr(unsigned long addr, int numpages, long action)
+{
+	unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
+	unsigned long size = numpages * PAGE_SIZE;
+
+	if (!numpages)
+		return 0;
+
+	if (WARN_ON_ONCE(is_vmalloc_or_module_addr((void *)addr) &&
+			 is_vm_area_hugepages((void *)addr)))
+		return -EINVAL;
+
+#ifdef CONFIG_PPC_BOOK3S_64
+	/*
+	 * On hash, the linear mapping is not in the Linux page table so
+	 * apply_to_existing_page_range() will have no effect. If in the future
+	 * the set_memory_* functions are used on the linear map this will need
+	 * to be updated.
+	 */
+	if (!radix_enabled()) {
+		int region = get_region_id(addr);
+
+		if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
+			return -EINVAL;
+	}
+#endif
+
+	return apply_to_existing_page_range(&init_mm, start, size,
+					    change_page_attr, (void *)action);
+}
-- 
2.25.1



* [PATCH v15 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 1/9] powerpc/mm: Implement set_memory() routines Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware Jordan Niethe
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

setup_text_poke_area() is a late init call so it runs before
mark_rodata_ro() and after the init calls. This lets all the init code
patching simply write to the target locations. In the future, kprobes is
going to allocate its instruction pages RO, which means it will need
setup_text_poke_area() to have already been called before it does its code
patching. However, init_kprobes() (which allocates and patches some
instruction pages) is an early init call so it happens before
setup_text_poke_area().

start_kernel() calls poking_init() before any of the init calls. On
powerpc, poking_init() is currently a nop. setup_text_poke_area() relies
on kernel virtual memory, cpu hotplug and per_cpu_areas being setup.
setup_per_cpu_areas(), boot_cpu_hotplug_init() and mm_init() are called
before poking_init().

Turn setup_text_poke_area() into poking_init().

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v9: New to series
---
 arch/powerpc/lib/code-patching.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 870b30d9be2f..15296207e1ba 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -70,14 +70,11 @@ static int text_area_cpu_down(unsigned int cpu)
 }
 
 /*
- * Run as a late init call. This allows all the boot time patching to be done
- * simply by patching the code, and then we're called here prior to
- * mark_rodata_ro(), which happens after all init calls are run. Although
- * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
- * it as being preferable to a kernel that will crash later when someone tries
- * to use patch_instruction().
+ * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
+ * we judge it as being preferable to a kernel that will crash later when
+ * someone tries to use patch_instruction().
  */
-static int __init setup_text_poke_area(void)
+int __init poking_init(void)
 {
 	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 		"powerpc/text_poke:online", text_area_cpu_up,
@@ -85,7 +82,6 @@ static int __init setup_text_poke_area(void)
 
 	return 0;
 }
-late_initcall(setup_text_poke_area);
 
 /*
  * This can be called for kernel text or a module.
-- 
2.25.1



* [PATCH v15 3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 1/9] powerpc/mm: Implement set_memory() routines Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 4/9] powerpc/kprobes: Mark newly allocated probes as ROX Jordan Niethe
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

Make module_alloc() use PAGE_KERNEL protection instead of
PAGE_KERNEL_EXEC if Strict Module RWX is enabled.
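
To sketch why this is safe (illustrative only; load_text() below is a
made-up helper, not code from this patch): under Strict Module RWX the
loader writes the module text while the mapping is still PAGE_KERNEL and
only then tightens permissions using the set_memory_*() helpers from
patch 1, assuming <linux/moduleloader.h> and <linux/set_memory.h>:

static void *load_text(const void *code, unsigned long size)
{
	unsigned long npages = PAGE_ALIGN(size) >> PAGE_SHIFT;
	void *text;

	text = module_alloc(size);	/* PAGE_KERNEL when strict module RWX is on */
	if (!text)
		return NULL;

	memcpy(text, code, size);	/* lay out sections, apply relocations, ... */

	/* roughly what the generic module loader then does */
	set_memory_ro((unsigned long)text, npages);
	set_memory_x((unsigned long)text, npages);

	return text;
}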

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v14: - Split out from powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
     - Add and use strict_module_rwx_enabled() helper
---
 arch/powerpc/include/asm/mmu.h | 5 +++++
 arch/powerpc/kernel/module.c   | 4 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 998fe01dd1a8..27016b98ecb2 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -345,6 +345,11 @@ static inline bool strict_kernel_rwx_enabled(void)
 	return false;
 }
 #endif
+
+static inline bool strict_module_rwx_enabled(void)
+{
+	return IS_ENABLED(CONFIG_STRICT_MODULE_RWX) && strict_kernel_rwx_enabled();
+}
 #endif /* !__ASSEMBLY__ */
 
 /* The kernel use the constants below to index in the page sizes array.
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index 3f35c8d20be7..ed04a3ba66fe 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -92,12 +92,14 @@ int module_finalize(const Elf_Ehdr *hdr,
 static __always_inline void *
 __module_alloc(unsigned long size, unsigned long start, unsigned long end)
 {
+	pgprot_t prot = strict_module_rwx_enabled() ? PAGE_KERNEL : PAGE_KERNEL_EXEC;
+
 	/*
 	 * Don't do huge page allocations for modules yet until more testing
 	 * is done. STRICT_MODULE_RWX may require extra work to support this
 	 * too.
 	 */
-	return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, PAGE_KERNEL_EXEC,
+	return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, prot,
 				    VM_FLUSH_RESET_PERMS | VM_NO_HUGE_VMAP,
 				    NUMA_NO_NODE, __builtin_return_address(0));
 }
-- 
2.25.1



* [PATCH v15 4/9] powerpc/kprobes: Mark newly allocated probes as ROX
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (2 preceding siblings ...)
  2021-06-09  1:34 ` [PATCH v15 3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 5/9] powerpc/bpf: Remove bpf_jit_free() Jordan Niethe
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

From: Russell Currey <ruscur@russell.cc>

Add the arch-specific insn page allocator for powerpc. This allocates
ROX pages if STRICT_KERNEL_RWX is enabled. These pages are only written
to with patch_instruction(), which is able to write to RO pages.

Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[jpn: Reword commit message, switch to __vmalloc_node_range()]
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v9: - vmalloc_exec() no longer exists
    - Set the page to RW before freeing it
v10: - use __vmalloc_node_range()
v11: - Neaten up
v12: - Switch from __vmalloc_node_range() to module_alloc()
v13: Use strict_kernel_rwx_enabled()
v14: Use strict_module_rwx_enabled()
---
 arch/powerpc/kernel/kprobes.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 01ab2163659e..937e338053ff 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -19,11 +19,13 @@
 #include <linux/extable.h>
 #include <linux/kdebug.h>
 #include <linux/slab.h>
+#include <linux/moduleloader.h>
 #include <asm/code-patching.h>
 #include <asm/cacheflush.h>
 #include <asm/sstep.h>
 #include <asm/sections.h>
 #include <asm/inst.h>
+#include <asm/set_memory.h>
 #include <linux/uaccess.h>
 
 DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
@@ -103,6 +105,21 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
 	return addr;
 }
 
+void *alloc_insn_page(void)
+{
+	void *page;
+
+	page = module_alloc(PAGE_SIZE);
+	if (!page)
+		return NULL;
+
+	if (strict_module_rwx_enabled()) {
+		set_memory_ro((unsigned long)page, 1);
+		set_memory_x((unsigned long)page, 1);
+	}
+	return page;
+}
+
 int arch_prepare_kprobe(struct kprobe *p)
 {
 	int ret = 0;
-- 
2.25.1



* [PATCH v15 5/9] powerpc/bpf: Remove bpf_jit_free()
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (3 preceding siblings ...)
  2021-06-09  1:34 ` [PATCH v15 4/9] powerpc/kprobes: Mark newly allocated probes as ROX Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 6/9] powerpc/bpf: Write protect JIT code Jordan Niethe
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

Commit 74451e66d516 ("bpf: make jited programs visible in traces") added
a default bpf_jit_free() implementation. Powerpc did not use the default
bpf_jit_free() because powerpc did not set the images read-only. The
default bpf_jit_free() called bpf_jit_binary_unlock_ro(), which is why it
could not be used for powerpc.

Commit d53d2f78cead ("bpf: Use vmalloc special flag") moved keeping
track of read-only memory to vmalloc. This included removing
bpf_jit_binary_unlock_ro(). Therefore there is no reason powerpc needs
its own bpf_jit_free(). Remove it.

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v11: New to series
---
 arch/powerpc/net/bpf_jit_comp.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 798ac4350a82..6c8c268e4fe8 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -257,15 +257,3 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 
 	return fp;
 }
-
-/* Overriding bpf_jit_free() as we don't set images read-only. */
-void bpf_jit_free(struct bpf_prog *fp)
-{
-	unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
-	struct bpf_binary_header *bpf_hdr = (void *)addr;
-
-	if (fp->jited)
-		bpf_jit_binary_free(bpf_hdr);
-
-	bpf_prog_unlock_free(fp);
-}
-- 
2.25.1



* [PATCH v15 6/9] powerpc/bpf: Write protect JIT code
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (4 preceding siblings ...)
  2021-06-09  1:34 ` [PATCH v15 5/9] powerpc/bpf: Remove bpf_jit_free() Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX Jordan Niethe
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

Add the necessary call to bpf_jit_binary_lock_ro() to remove write and
add exec permissions to the JIT image after it has finished being
written.

Without CONFIG_STRICT_MODULE_RWX the image will be writable and
executable until the call to bpf_jit_binary_lock_ro().
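
For reference, at the time of this series the generic helper is roughly a
thin wrapper over the set_memory_*() routines added in patch 1 (sketch of
include/linux/filter.h, quoted from memory rather than verbatim):

static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr)
{
	set_vm_flush_reset_perms(hdr);
	set_memory_ro((unsigned long)hdr, hdr->pages);
	set_memory_x((unsigned long)hdr, hdr->pages);
}

so selecting ARCH_HAS_SET_MEMORY is what makes this call actually change
permissions on powerpc rather than fall back to the generic no-op stubs.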

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v10: New to series
v11: Remove CONFIG_STRICT_MODULE_RWX conditional
---
 arch/powerpc/net/bpf_jit_comp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 6c8c268e4fe8..53aefee3fe70 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -237,6 +237,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	fp->jited_len = alloclen;
 
 	bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + (bpf_hdr->pages * PAGE_SIZE));
+	bpf_jit_binary_lock_ro(bpf_hdr);
 	if (!fp->is_func || extra_pass) {
 		bpf_prog_fill_jited_linfo(fp, addrs);
 out_addrs:
-- 
2.25.1



* [PATCH v15 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (5 preceding siblings ...)
  2021-06-09  1:34 ` [PATCH v15 6/9] powerpc/bpf: Write protect JIT code Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-08-05  9:34   ` Laurent Vivier
  2021-06-09  1:34 ` [PATCH v15 8/9] powerpc/mm: implement set_memory_attr() Jordan Niethe
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

From: Russell Currey <ruscur@russell.cc>

To enable strict module RWX on powerpc, set:

    CONFIG_STRICT_MODULE_RWX=y

You should also have CONFIG_STRICT_KERNEL_RWX=y set to have any real
security benefit.

ARCH_HAS_STRICT_MODULE_RWX is set to require ARCH_HAS_STRICT_KERNEL_RWX.
This is due to a quirk in arch/Kconfig and arch/powerpc/Kconfig that
makes STRICT_MODULE_RWX *on by default* in configurations where
STRICT_KERNEL_RWX is *unavailable*.

Since this doesn't make much sense, and module RWX without kernel RWX
doesn't make much sense, having the same dependencies as kernel RWX
works around this problem.

Book3s/32 603 and 604 core processors are not able to write protect
kernel pages so do not set ARCH_HAS_STRICT_MODULE_RWX for Book3s/32.

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Russell Currey <ruscur@russell.cc>
[jpn: - predicate on !PPC_BOOK3S_604
      - make module_alloc() use PAGE_KERNEL protection]
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v10: - Predicate on !PPC_BOOK3S_604
     - Make module_alloc() use PAGE_KERNEL protection
v11: - Neaten up
v13: Use strict_kernel_rwx_enabled()
v14: Make changes to module_alloc() its own commit
v15: - Force STRICT_KERNEL_RWX if STRICT_MODULE_RWX is selected
     - Predicate on !PPC_BOOK3S_32 instead
---
 arch/powerpc/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index abfe2e9225fa..72f307f1796b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -142,6 +142,7 @@ config PPC
 	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_STRICT_KERNEL_RWX	if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
+	select ARCH_HAS_STRICT_MODULE_RWX	if ARCH_HAS_STRICT_KERNEL_RWX && !PPC_BOOK3S_32
 	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UACCESS_FLUSHCACHE
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
@@ -267,6 +268,7 @@ config PPC
 	select PPC_DAWR				if PPC64
 	select RTC_LIB
 	select SPARSE_IRQ
+	select STRICT_KERNEL_RWX if STRICT_MODULE_RWX
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
 	select VIRT_TO_BUS			if !PPC64
-- 
2.25.1



* [PATCH v15 8/9] powerpc/mm: implement set_memory_attr()
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (6 preceding siblings ...)
  2021-06-09  1:34 ` [PATCH v15 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-06-09  1:34 ` [PATCH v15 9/9] powerpc/32: use set_memory_attr() Jordan Niethe
  2021-06-24 14:03 ` [PATCH v15 0/9] powerpc: Further Strict RWX support Michael Ellerman
  9 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, kbuild test robot, npiggin, aneesh.kumar, naveen.n.rao,
	Jordan Niethe, dja

From: Christophe Leroy <christophe.leroy@csgroup.eu>

In addition to the set_memory_xx() functions, which allow changing the
memory attributes of not (yet) used memory regions, implement a
set_memory_attr() function to:
- set the final memory protection after init on currently used
kernel regions.
- enable/disable kernel memory regions in the scope of DEBUG_PAGEALLOC.

Unlike the set_memory_xx() functions, which can act in three steps because
the regions are unused, this function must modify pages 'on the fly' as the
kernel is executing from them. At the moment only PPC32 will use it, and
changing page attributes on the fly is not an issue there.
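
As an illustration of the intended use (a sketch based on how the next
patch in the series uses it), marking the kernel text read-only and
executable at the end of init looks like:

	unsigned long numpages = PFN_UP((unsigned long)_etext) -
				 PFN_DOWN((unsigned long)_stext);

	/* change protection in place; the kernel keeps executing from _stext */
	set_memory_attr((unsigned long)_stext, numpages, PAGE_KERNEL_ROX);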

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reported-by: kbuild test robot <lkp@intel.com>
[ruscur: cast "data" to unsigned long instead of int]
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/set_memory.h |  2 ++
 arch/powerpc/mm/pageattr.c            | 33 +++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/arch/powerpc/include/asm/set_memory.h b/arch/powerpc/include/asm/set_memory.h
index 64011ea444b4..b040094f7920 100644
--- a/arch/powerpc/include/asm/set_memory.h
+++ b/arch/powerpc/include/asm/set_memory.h
@@ -29,4 +29,6 @@ static inline int set_memory_x(unsigned long addr, int numpages)
 	return change_memory_attr(addr, numpages, SET_MEMORY_X);
 }
 
+int set_memory_attr(unsigned long addr, int numpages, pgprot_t prot);
+
 #endif
diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index 5e5ae50a7f23..0876216ceee6 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -99,3 +99,36 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
 	return apply_to_existing_page_range(&init_mm, start, size,
 					    change_page_attr, (void *)action);
 }
+
+/*
+ * Set the attributes of a page:
+ *
+ * This function is used by PPC32 at the end of init to set final kernel memory
+ * protection. It includes changing the mapping of the page it is executing from
+ * and data pages it is using.
+ */
+static int set_page_attr(pte_t *ptep, unsigned long addr, void *data)
+{
+	pgprot_t prot = __pgprot((unsigned long)data);
+
+	spin_lock(&init_mm.page_table_lock);
+
+	set_pte_at(&init_mm, addr, ptep, pte_modify(*ptep, prot));
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
+	spin_unlock(&init_mm.page_table_lock);
+
+	return 0;
+}
+
+int set_memory_attr(unsigned long addr, int numpages, pgprot_t prot)
+{
+	unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
+	unsigned long sz = numpages * PAGE_SIZE;
+
+	if (numpages <= 0)
+		return 0;
+
+	return apply_to_existing_page_range(&init_mm, start, sz, set_page_attr,
+					    (void *)pgprot_val(prot));
+}
-- 
2.25.1



* [PATCH v15 9/9] powerpc/32: use set_memory_attr()
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (7 preceding siblings ...)
  2021-06-09  1:34 ` [PATCH v15 8/9] powerpc/mm: implement set_memory_attr() Jordan Niethe
@ 2021-06-09  1:34 ` Jordan Niethe
  2021-06-24 14:03 ` [PATCH v15 0/9] powerpc: Further Strict RWX support Michael Ellerman
  9 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-06-09  1:34 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ajd, cmr, npiggin, aneesh.kumar, naveen.n.rao, Jordan Niethe, dja

From: Christophe Leroy <christophe.leroy@csgroup.eu>

Use set_memory_attr() instead of the PPC32-specific change_page_attr().

change_page_attr() was checking that the address was not mapped by
blocks and was handling highmem, but that's unneeded because the
affected pages can't be in highmem and block mapping verification
is already done by the callers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[ruscur: rebase on powerpc/merge with Christophe's new patches]
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/mm/pgtable_32.c | 60 ++++++------------------------------
 1 file changed, 10 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index e0ec67a16887..dcf5ecca19d9 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -23,6 +23,7 @@
 #include <linux/highmem.h>
 #include <linux/memblock.h>
 #include <linux/slab.h>
+#include <linux/set_memory.h>
 
 #include <asm/pgalloc.h>
 #include <asm/fixmap.h>
@@ -132,64 +133,20 @@ void __init mapin_ram(void)
 	}
 }
 
-static int __change_page_attr_noflush(struct page *page, pgprot_t prot)
-{
-	pte_t *kpte;
-	unsigned long address;
-
-	BUG_ON(PageHighMem(page));
-	address = (unsigned long)page_address(page);
-
-	if (v_block_mapped(address))
-		return 0;
-	kpte = virt_to_kpte(address);
-	if (!kpte)
-		return -EINVAL;
-	__set_pte_at(&init_mm, address, kpte, mk_pte(page, prot), 0);
-
-	return 0;
-}
-
-/*
- * Change the page attributes of an page in the linear mapping.
- *
- * THIS DOES NOTHING WITH BAT MAPPINGS, DEBUG USE ONLY
- */
-static int change_page_attr(struct page *page, int numpages, pgprot_t prot)
-{
-	int i, err = 0;
-	unsigned long flags;
-	struct page *start = page;
-
-	local_irq_save(flags);
-	for (i = 0; i < numpages; i++, page++) {
-		err = __change_page_attr_noflush(page, prot);
-		if (err)
-			break;
-	}
-	wmb();
-	local_irq_restore(flags);
-	flush_tlb_kernel_range((unsigned long)page_address(start),
-			       (unsigned long)page_address(page));
-	return err;
-}
-
 void mark_initmem_nx(void)
 {
-	struct page *page = virt_to_page(_sinittext);
 	unsigned long numpages = PFN_UP((unsigned long)_einittext) -
 				 PFN_DOWN((unsigned long)_sinittext);
 
 	if (v_block_mapped((unsigned long)_sinittext))
 		mmu_mark_initmem_nx();
 	else
-		change_page_attr(page, numpages, PAGE_KERNEL);
+		set_memory_attr((unsigned long)_sinittext, numpages, PAGE_KERNEL);
 }
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
 void mark_rodata_ro(void)
 {
-	struct page *page;
 	unsigned long numpages;
 
 	if (v_block_mapped((unsigned long)_stext + 1)) {
@@ -198,20 +155,18 @@ void mark_rodata_ro(void)
 		return;
 	}
 
-	page = virt_to_page(_stext);
 	numpages = PFN_UP((unsigned long)_etext) -
 		   PFN_DOWN((unsigned long)_stext);
 
-	change_page_attr(page, numpages, PAGE_KERNEL_ROX);
+	set_memory_attr((unsigned long)_stext, numpages, PAGE_KERNEL_ROX);
 	/*
 	 * mark .rodata as read only. Use __init_begin rather than __end_rodata
 	 * to cover NOTES and EXCEPTION_TABLE.
 	 */
-	page = virt_to_page(__start_rodata);
 	numpages = PFN_UP((unsigned long)__init_begin) -
 		   PFN_DOWN((unsigned long)__start_rodata);
 
-	change_page_attr(page, numpages, PAGE_KERNEL_RO);
+	set_memory_attr((unsigned long)__start_rodata, numpages, PAGE_KERNEL_RO);
 
 	// mark_initmem_nx() should have already run by now
 	ptdump_check_wx();
@@ -221,9 +176,14 @@ void mark_rodata_ro(void)
 #ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
+	unsigned long addr = (unsigned long)page_address(page);
+
 	if (PageHighMem(page))
 		return;
 
-	change_page_attr(page, numpages, enable ? PAGE_KERNEL : __pgprot(0));
+	if (enable)
+		set_memory_attr(addr, numpages, PAGE_KERNEL);
+	else
+		set_memory_attr(addr, numpages, __pgprot(0));
 }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
-- 
2.25.1



* Re: [PATCH v15 0/9] powerpc: Further Strict RWX support
  2021-06-09  1:34 [PATCH v15 0/9] powerpc: Further Strict RWX support Jordan Niethe
                   ` (8 preceding siblings ...)
  2021-06-09  1:34 ` [PATCH v15 9/9] powerpc/32: use set_memory_attr() Jordan Niethe
@ 2021-06-24 14:03 ` Michael Ellerman
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2021-06-24 14:03 UTC (permalink / raw)
  To: linuxppc-dev, Jordan Niethe
  Cc: ajd, aneesh.kumar, npiggin, cmr, naveen.n.rao, dja

On Wed, 9 Jun 2021 11:34:22 +1000, Jordan Niethe wrote:
> Adding more Strict RWX support on powerpc, in particular Strict Module RWX.
> It is now rebased on ppc next.
> 
> For reference the previous revision is available here:
> https://lore.kernel.org/linuxppc-dev/20210517032810.129949-1-jniethe5@gmail.com/
> 
> Changes for v15:
> 
> [...]

Applied to powerpc/next.

[1/9] powerpc/mm: Implement set_memory() routines
      https://git.kernel.org/powerpc/c/1f9ad21c3b384a8f16d8c46845a48a01d281a603
[2/9] powerpc/lib/code-patching: Set up Strict RWX patching earlier
      https://git.kernel.org/powerpc/c/71a5b3db9f209ea5d1e07371017e65398d3c6fbc
[3/9] powerpc/modules: Make module_alloc() Strict Module RWX aware
      https://git.kernel.org/powerpc/c/4fcc636615b1a309b39cab101a2b433cbf1f63f1
[4/9] powerpc/kprobes: Mark newly allocated probes as ROX
      https://git.kernel.org/powerpc/c/6a3a58e6230dc5b646ce3511436d7e74fc7f764b
[5/9] powerpc/bpf: Remove bpf_jit_free()
      https://git.kernel.org/powerpc/c/bc33cfdb0bb84d9e4b125a617a437c29ddcac4d9
[6/9] powerpc/bpf: Write protect JIT code
      https://git.kernel.org/powerpc/c/62e3d4210ac9c35783d0e8fc306df4239c540a79
[7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
      https://git.kernel.org/powerpc/c/c35717c71e983ed55d61e523cbd11a798429bc82
[8/9] powerpc/mm: implement set_memory_attr()
      https://git.kernel.org/powerpc/c/4d1755b6a762149ae022a32fb2bbeefb6680baa6
[9/9] powerpc/32: use set_memory_attr()
      https://git.kernel.org/powerpc/c/c988cfd38e489d9390d253d4392590daf451d87a

cheers


* Re: [PATCH v15 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
  2021-06-09  1:34 ` [PATCH v15 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX Jordan Niethe
@ 2021-08-05  9:34   ` Laurent Vivier
  2021-08-13 22:58     ` Fabiano Rosas
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Vivier @ 2021-08-05  9:34 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev
  Cc: ajd, aneesh.kumar, Greg Kurz, npiggin, cmr, kvm-ppc,
	naveen.n.rao, David Gibson, dja

Hi,

On 09/06/2021 03:34, Jordan Niethe wrote:
> From: Russell Currey <ruscur@russell.cc>
> 
> To enable strict module RWX on powerpc, set:
> 
>     CONFIG_STRICT_MODULE_RWX=y
> 
> You should also have CONFIG_STRICT_KERNEL_RWX=y set to have any real
> security benefit.
> 
> ARCH_HAS_STRICT_MODULE_RWX is set to require ARCH_HAS_STRICT_KERNEL_RWX.
> This is due to a quirk in arch/Kconfig and arch/powerpc/Kconfig that
> makes STRICT_MODULE_RWX *on by default* in configurations where
> STRICT_KERNEL_RWX is *unavailable*.
> 
> Since this doesn't make much sense, and module RWX without kernel RWX
> doesn't make much sense, having the same dependencies as kernel RWX
> works around this problem.
> 
> Book3s/32 603 and 604 core processors are not able to write protect
> kernel pages so do not set ARCH_HAS_STRICT_MODULE_RWX for Book3s/32.
> 
> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> [jpn: - predicate on !PPC_BOOK3S_604
>       - make module_alloc() use PAGE_KERNEL protection]
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
> v10: - Predicate on !PPC_BOOK3S_604
>      - Make module_alloc() use PAGE_KERNEL protection
> v11: - Neaten up
> v13: Use strict_kernel_rwx_enabled()
> v14: Make changes to module_alloc() its own commit
> v15: - Force STRICT_KERNEL_RWX if STRICT_MODULE_RWX is selected
>      - Predicate on !PPC_BOOK3S_32 instead
> ---
>  arch/powerpc/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index abfe2e9225fa..72f307f1796b 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -142,6 +142,7 @@ config PPC
>  	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
>  	select ARCH_HAS_SET_MEMORY
>  	select ARCH_HAS_STRICT_KERNEL_RWX	if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
> +	select ARCH_HAS_STRICT_MODULE_RWX	if ARCH_HAS_STRICT_KERNEL_RWX && !PPC_BOOK3S_32
>  	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
>  	select ARCH_HAS_UACCESS_FLUSHCACHE
>  	select ARCH_HAS_UBSAN_SANITIZE_ALL
> @@ -267,6 +268,7 @@ config PPC
>  	select PPC_DAWR				if PPC64
>  	select RTC_LIB
>  	select SPARSE_IRQ
> +	select STRICT_KERNEL_RWX if STRICT_MODULE_RWX
>  	select SYSCTL_EXCEPTION_TRACE
>  	select THREAD_INFO_IN_TASK
>  	select VIRT_TO_BUS			if !PPC64
> 

Since this patch was merged, my VM has been experiencing a crash at boot (20% of the time):

[    8.496850] kernel tried to execute exec-protected page (c008000004073278) - exploit
attempt? (uid: 0)
[    8.496921] BUG: Unable to handle kernel instruction fetch
[    8.496954] Faulting instruction address: 0xc008000004073278
[    8.496994] Oops: Kernel access of bad area, sig: 11 [#1]
[    8.497028] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[    8.497071] Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks xfs
libcrc32c virtio_net net_failover virtio_blk vmx_crypto failover dm_mirror dm_region_hash
dm_log dm_mod
[    8.497186] CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
[    8.497228] Workqueue: events control_work_handler [virtio_console]
[    8.497272] NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
[    8.497320] REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
[    8.497361] MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822
XER: 200400cf
[    8.497426] CFAR: c0000000001e9e44 IRQMASK: 1
[    8.497426] GPR00: c008000004073278 c00000002e4efa80 c000000002a26b00 c000000042c39520
[    8.497426] GPR04: 0000000000000001 0000000000000000 0000000000000000 00000000000000ff
[    8.497426] GPR08: 0000000000000001 c000000042c39520 0000000000000001 c008000004076008
[    8.497426] GPR12: c0000000001e9de0 c0000001fffccb00 c00000000018ba88 c00000002c91d400
[    8.497426] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    8.497426] GPR20: 0000000000000000 0000000000000000 0000000000000000 c008000004080340
[    8.497426] GPR24: c0080000040a01e8 0000000000000000 0000000000000000 c00000002e0975c0
[    8.497426] GPR28: c00000002ce72940 c000000042c39520 0000000000000048 0000000000000038
[    8.497891] NIP [c008000004073278] fill_queue+0xf0/0x210 [virtio_console]
[    8.497934] LR [c008000004073278] fill_queue+0xf0/0x210 [virtio_console]
[    8.497976] Call Trace:
[    8.497993] [c00000002e4efa80] [c00800000407323c] fill_queue+0xb4/0x210
[virtio_console] (unreliable)
[    8.498052] [c00000002e4efae0] [c008000004073a90] add_port+0x1a8/0x470 [virtio_console]
[    8.498102] [c00000002e4efbb0] [c0080000040750f4] control_work_handler+0xbc/0x1e8
[virtio_console]
[    8.498160] [c00000002e4efc60] [c00000000017f4f0] process_one_work+0x290/0x590
[    8.498212] [c00000002e4efd00] [c00000000017f878] worker_thread+0x88/0x620
[    8.498256] [c00000002e4efda0] [c00000000018bc14] kthread+0x194/0x1a0
[    8.498299] [c00000002e4efe10] [c00000000000cf54] ret_from_kernel_thread+0x5c/0x64
[    8.498349] Instruction dump:
[    8.498374] 7da96b78 a14d0c8a 419c00b0 2f8a0000 419eff88 b32d0c8a 7c0004ac 4bffff7c
[    8.498430] 60000000 60000000 7fa3eb78 48002d95 <e8410018> 38600000 480025e1 e8410018
[    8.498485] ---[ end trace 16ee10903290b647 ]---
[    8.501433]
[    9.502601] Kernel panic - not syncing: Fatal exception

add_port+0x1a8/0x470 :

  1420	
  1421		/* We can safely ignore ENOSPC because it means
  1422		 * the queue already has buffers. Buffers are removed
  1423		 * only by virtcons_remove(), not by unplug_port()
  1424		 */
->1425		err = fill_queue(port->in_vq, &port->inbuf_lock);
  1426		if (err < 0 && err != -ENOSPC) {
  1427			dev_err(port->dev, "Error allocating inbufs\n");
  1428			goto free_device;
  1429		}

fill_queue+0x90/0x210 :

  1326	static int fill_queue(struct virtqueue *vq, spinlock_t *lock)
  1327	{
  1328		struct port_buffer *buf;
  1329		int nr_added_bufs;
  1330		int ret;
  1331	
  1332		nr_added_bufs = 0;
  1333		do {
  1334			buf = alloc_buf(vq->vdev, PAGE_SIZE, 0);
  1335			if (!buf)
  1336				return -ENOMEM;
  1337	
->1338			spin_lock_irq(lock);

I'm using an upstream kernel (5.14-rc4, 251a1524293d) in the VM.

My host is a RHEL 8.5/POWER9: qemu-kvm-6.0.0-21 and kernel-4.18.0-325

My qemu command line is:

/usr/libexec/qemu-kvm \
-M pseries,accel=kvm \
-nographic -nodefaults \
-device virtio-serial-pci \
-device virtconsole \
-device virtio-net-pci,mac=9a:2b:2c:2d:2e:2f,netdev=hostnet0  \
-blockdev
node-name=disk1,file.driver=file,driver=qcow2,file.driver=file,file.filename=disk.qcow2 \
-netdev bridge,id=hostnet0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
-device virtio-blk-pci,id=image1,drive=disk1 \
-m 8192  \
-smp 4 \
-serial mon:stdio


Do we need something in qemu/kvm to support STRICT_MODULE_RWX?

Thanks,
Laurent



* Re: [PATCH v15 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
  2021-08-05  9:34   ` Laurent Vivier
@ 2021-08-13 22:58     ` Fabiano Rosas
  2021-08-14  1:23       ` Jordan Niethe
  0 siblings, 1 reply; 14+ messages in thread
From: Fabiano Rosas @ 2021-08-13 22:58 UTC (permalink / raw)
  To: Laurent Vivier, Jordan Niethe, linuxppc-dev
  Cc: ajd, aneesh.kumar, muriloo, Greg Kurz, npiggin, cmr, kvm-ppc,
	naveen.n.rao, David Gibson, dja

Laurent Vivier <lvivier@redhat.com> writes:

>
> since this patch is merged my VM is experiencing a crash at boot (20% of the time):
>
> [    8.496850] kernel tried to execute exec-protected page (c008000004073278) - exploit
> attempt? (uid: 0)
> [    8.496921] BUG: Unable to handle kernel instruction fetch
> [    8.496954] Faulting instruction address: 0xc008000004073278
> [    8.496994] Oops: Kernel access of bad area, sig: 11 [#1]
> [    8.497028] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [    8.497071] Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks xfs
> libcrc32c virtio_net net_failover virtio_blk vmx_crypto failover dm_mirror dm_region_hash
> dm_log dm_mod
> [    8.497186] CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
> [    8.497228] Workqueue: events control_work_handler [virtio_console]
> [    8.497272] NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
> [    8.497320] REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
> [    8.497361] MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822
> XER: 200400cf
> [    8.497426] CFAR: c0000000001e9e44 IRQMASK: 1
> [    8.497426] GPR00: c008000004073278 c00000002e4efa80 c000000002a26b00 c000000042c39520
> [    8.497426] GPR04: 0000000000000001 0000000000000000 0000000000000000 00000000000000ff
> [    8.497426] GPR08: 0000000000000001 c000000042c39520 0000000000000001 c008000004076008
> [    8.497426] GPR12: c0000000001e9de0 c0000001fffccb00 c00000000018ba88 c00000002c91d400
> [    8.497426] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [    8.497426] GPR20: 0000000000000000 0000000000000000 0000000000000000 c008000004080340
> [    8.497426] GPR24: c0080000040a01e8 0000000000000000 0000000000000000 c00000002e0975c0
> [    8.497426] GPR28: c00000002ce72940 c000000042c39520 0000000000000048 0000000000000038
> [    8.497891] NIP [c008000004073278] fill_queue+0xf0/0x210 [virtio_console]
> [    8.497934] LR [c008000004073278] fill_queue+0xf0/0x210 [virtio_console]
> [    8.497976] Call Trace:
> [    8.497993] [c00000002e4efa80] [c00800000407323c] fill_queue+0xb4/0x210
> [virtio_console] (unreliable)
> [    8.498052] [c00000002e4efae0] [c008000004073a90] add_port+0x1a8/0x470 [virtio_console]
> [    8.498102] [c00000002e4efbb0] [c0080000040750f4] control_work_handler+0xbc/0x1e8
> [virtio_console]
> [    8.498160] [c00000002e4efc60] [c00000000017f4f0] process_one_work+0x290/0x590
> [    8.498212] [c00000002e4efd00] [c00000000017f878] worker_thread+0x88/0x620
> [    8.498256] [c00000002e4efda0] [c00000000018bc14] kthread+0x194/0x1a0
> [    8.498299] [c00000002e4efe10] [c00000000000cf54] ret_from_kernel_thread+0x5c/0x64
> [    8.498349] Instruction dump:
> [    8.498374] 7da96b78 a14d0c8a 419c00b0 2f8a0000 419eff88 b32d0c8a 7c0004ac 4bffff7c
> [    8.498430] 60000000 60000000 7fa3eb78 48002d95 <e8410018> 38600000 480025e1 e8410018
> [    8.498485] ---[ end trace 16ee10903290b647 ]---
> [    8.501433]
> [    9.502601] Kernel panic - not syncing: Fatal exception
>
> add_port+0x1a8/0x470 :
>
>   1420	
>   1421		/* We can safely ignore ENOSPC because it means
>   1422		 * the queue already has buffers. Buffers are removed
>   1423		 * only by virtcons_remove(), not by unplug_port()
>   1424		 */
> ->1425		err = fill_queue(port->in_vq, &port->inbuf_lock);
>   1426		if (err < 0 && err != -ENOSPC) {
>   1427			dev_err(port->dev, "Error allocating inbufs\n");
>   1428			goto free_device;
>   1429		}
>
> fill_queue+0x90/0x210 :
>
>   1326	static int fill_queue(struct virtqueue *vq, spinlock_t *lock)
>   1327	{
>   1328		struct port_buffer *buf;
>   1329		int nr_added_bufs;
>   1330		int ret;
>   1331	
>   1332		nr_added_bufs = 0;
>   1333		do {
>   1334			buf = alloc_buf(vq->vdev, PAGE_SIZE, 0);
>   1335			if (!buf)
>   1336				return -ENOMEM;
>   1337	
> ->1338			spin_lock_irq(lock);
>
> I'm using an upstream kernel (5.14-rc4, 251a1524293d) in the VM.
>
> My host is a RHEL 8.5/POWER9: qemu-kvm-6.0.0-21 and kernel-4.18.0-325
>
> My qemu command line is:
>
> /usr/libexec/qemu-kvm \
> -M pseries,accel=kvm \
> -nographic -nodefaults \
> -device virtio-serial-pci \
> -device virtconsole \
> -device virtio-net-pci,mac=9a:2b:2c:2d:2e:2f,netdev=hostnet0  \
> -blockdev
> node-name=disk1,file.driver=file,driver=qcow2,file.driver=file,file.filename=disk.qcow2 \
> -netdev bridge,id=hostnet0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
> -device virtio-blk-pci,id=image1,drive=disk1 \
> -m 8192  \
> -smp 4 \
> -serial mon:stdio
>
>
> Do we need something in qemu/kvm to support STRICT_MODULE_RWX ?
>
> Thanks,
> Laurent

This patch survived 300 consecutive runs without the issue:

diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index 0876216ceee6..7aeb2b62e5dc 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -35,10 +35,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
        pte_t pte;
 
        spin_lock(&init_mm.page_table_lock);
-
-       /* invalidate the PTE so it's safe to modify */
-       pte = ptep_get_and_clear(&init_mm, addr, ptep);
-       flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+       pte = *ptep;
 
        /* modify the PTE bits as desired, then apply */
        switch (action) {
@@ -60,10 +57,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
        }
 
        set_pte_at(&init_mm, addr, ptep, pte);
-
-       /* See ptesync comment in radix__set_pte_at() */
-       if (radix_enabled())
-               asm volatile("ptesync": : :"memory");
+       flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
        spin_unlock(&init_mm.page_table_lock);
 
        return 0;
---

What I think is happening is that the virtio_console code is running
at the same time we are doing the `module_enable_ro` at the end of
`do_init_module` due to the async nature of the work handler. There is
a window after the TLB flush when the PTE has its permission bits
cleared, so any translations of the module code page attempted during
that window will fault.
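
In other words, an illustrative interleaving (reconstructed from the
explanation above and the traces, not something instrumented directly):

  CPU A (module_enable_ro -> change_page_attr)    CPU B (module code)
    ptep_get_and_clear()      /* PTE now empty */
    flush_tlb_kernel_range()  /* TLB entry gone */
        ... window with no valid PTE ...           fetch/load from the module
                                                   page finds no PTE
                                                   -> trap 0x400 / 0x300 oops
    set_pte_at()              /* too late */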

I'm ignorant of strict rwx in general so I don't see why we need to
clear the bits before setting them to their final value, but as I
understand it, the set_pte_at + flush_tlb_kernel_range satisfy the ISA
requirement of [ptesync; tlbie; eieio; tlbsync; ptesync;] so it seems
like the patch should work.

Now, I cannot explain why the crash always happens around the code
that does the module's symbols relocation (the NIP in Laurent's trace
is the TOC reload from module_64.c:restore_r2). Maybe because
instructions are already in icache until the first branch into the
stub?

Anyway, this is what Murilo and I found out over our debugging session
in the past couple of days. I hope it helps. =)



* Re: [PATCH v15 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
  2021-08-13 22:58     ` Fabiano Rosas
@ 2021-08-14  1:23       ` Jordan Niethe
  0 siblings, 0 replies; 14+ messages in thread
From: Jordan Niethe @ 2021-08-14  1:23 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Laurent Vivier, ajd, Aneesh Kumar K.V, muriloo, Greg Kurz,
	Nicholas Piggin, cmr, kvm-ppc, naveen.n.rao, David Gibson,
	linuxppc-dev, Daniel Axtens

On Sat, Aug 14, 2021 at 8:59 AM Fabiano Rosas <farosas@linux.ibm.com> wrote:
>
> Laurent Vivier <lvivier@redhat.com> writes:
>
> >
> > since this patch is merged my VM is experiencing a crash at boot (20% of the time):
> >
> > [    8.496850] kernel tried to execute exec-protected page (c008000004073278) - exploit
> > attempt? (uid: 0)
> > [    8.496921] BUG: Unable to handle kernel instruction fetch
> > [    8.496954] Faulting instruction address: 0xc008000004073278
> > [    8.496994] Oops: Kernel access of bad area, sig: 11 [#1]
> > [    8.497028] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> > [    8.497071] Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks xfs
> > libcrc32c virtio_net net_failover virtio_blk vmx_crypto failover dm_mirror dm_region_hash
> > dm_log dm_mod
> > [    8.497186] CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
> > [    8.497228] Workqueue: events control_work_handler [virtio_console]
> > [    8.497272] NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
> > [    8.497320] REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
> > [    8.497361] MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822
> > XER: 200400cf
> > [    8.497426] CFAR: c0000000001e9e44 IRQMASK: 1
> > [    8.497426] GPR00: c008000004073278 c00000002e4efa80 c000000002a26b00 c000000042c39520
> > [    8.497426] GPR04: 0000000000000001 0000000000000000 0000000000000000 00000000000000ff
> > [    8.497426] GPR08: 0000000000000001 c000000042c39520 0000000000000001 c008000004076008
> > [    8.497426] GPR12: c0000000001e9de0 c0000001fffccb00 c00000000018ba88 c00000002c91d400
> > [    8.497426] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [    8.497426] GPR20: 0000000000000000 0000000000000000 0000000000000000 c008000004080340
> > [    8.497426] GPR24: c0080000040a01e8 0000000000000000 0000000000000000 c00000002e0975c0
> > [    8.497426] GPR28: c00000002ce72940 c000000042c39520 0000000000000048 0000000000000038
> > [    8.497891] NIP [c008000004073278] fill_queue+0xf0/0x210 [virtio_console]
> > [    8.497934] LR [c008000004073278] fill_queue+0xf0/0x210 [virtio_console]
> > [    8.497976] Call Trace:
> > [    8.497993] [c00000002e4efa80] [c00800000407323c] fill_queue+0xb4/0x210
> > [virtio_console] (unreliable)
> > [    8.498052] [c00000002e4efae0] [c008000004073a90] add_port+0x1a8/0x470 [virtio_console]
> > [    8.498102] [c00000002e4efbb0] [c0080000040750f4] control_work_handler+0xbc/0x1e8
> > [virtio_console]
> > [    8.498160] [c00000002e4efc60] [c00000000017f4f0] process_one_work+0x290/0x590
> > [    8.498212] [c00000002e4efd00] [c00000000017f878] worker_thread+0x88/0x620
> > [    8.498256] [c00000002e4efda0] [c00000000018bc14] kthread+0x194/0x1a0
> > [    8.498299] [c00000002e4efe10] [c00000000000cf54] ret_from_kernel_thread+0x5c/0x64
> > [    8.498349] Instruction dump:
> > [    8.498374] 7da96b78 a14d0c8a 419c00b0 2f8a0000 419eff88 b32d0c8a 7c0004ac 4bffff7c
> > [    8.498430] 60000000 60000000 7fa3eb78 48002d95 <e8410018> 38600000 480025e1 e8410018
> > [    8.498485] ---[ end trace 16ee10903290b647 ]---
> > [    8.501433]
> > [    9.502601] Kernel panic - not syncing: Fatal exception
> >
> > add_port+0x1a8/0x470 :
> >
> >   1420
> >   1421                /* We can safely ignore ENOSPC because it means
> >   1422                 * the queue already has buffers. Buffers are removed
> >   1423                 * only by virtcons_remove(), not by unplug_port()
> >   1424                 */
> > ->1425                err = fill_queue(port->in_vq, &port->inbuf_lock);
> >   1426                if (err < 0 && err != -ENOSPC) {
> >   1427                        dev_err(port->dev, "Error allocating inbufs\n");
> >   1428                        goto free_device;
> >   1429                }
> >
> > fill_queue+0x90/0x210 :
> >
> >   1326        static int fill_queue(struct virtqueue *vq, spinlock_t *lock)
> >   1327        {
> >   1328                struct port_buffer *buf;
> >   1329                int nr_added_bufs;
> >   1330                int ret;
> >   1331
> >   1332                nr_added_bufs = 0;
> >   1333                do {
> >   1334                        buf = alloc_buf(vq->vdev, PAGE_SIZE, 0);
> >   1335                        if (!buf)
> >   1336                                return -ENOMEM;
> >   1337
> > ->1338                        spin_lock_irq(lock);
> >
> > I'm using an upstream kernel (5.14-rc4, 251a1524293d) in the VM.
> >
> > My host is a RHEL 8.5/POWER9: qemu-kvm-6.0.0-21 and kernel-4.18.0-325
> >
> > My qemu command line is:
> >
> > /usr/libexec/qemu-kvm \
> > -M pseries,accel=kvm \
> > -nographic -nodefaults \
> > -device virtio-serial-pci \
> > -device virtconsole \
> > -device virtio-net-pci,mac=9a:2b:2c:2d:2e:2f,netdev=hostnet0  \
> > -blockdev
> > node-name=disk1,file.driver=file,driver=qcow2,file.driver=file,file.filename=disk.qcow2 \
> > -netdev bridge,id=hostnet0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
> > -device virtio-blk-pci,id=image1,drive=disk1 \
> > -m 8192  \
> > -smp 4 \
> > -serial mon:stdio
> >
> >
> > Do we need something in qemu/kvm to support STRICT_MODULE_RWX ?
> >
> > Thanks,
> > Laurent
>
> This patch survived 300 consecutive runs without the issue:
>
> diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> index 0876216ceee6..7aeb2b62e5dc 100644
> --- a/arch/powerpc/mm/pageattr.c
> +++ b/arch/powerpc/mm/pageattr.c
> @@ -35,10 +35,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>         pte_t pte;
>
>         spin_lock(&init_mm.page_table_lock);
> -
> -       /* invalidate the PTE so it's safe to modify */
> -       pte = ptep_get_and_clear(&init_mm, addr, ptep);
> -       flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +       pte = *ptep;
>
>         /* modify the PTE bits as desired, then apply */
>         switch (action) {
> @@ -60,10 +57,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>         }
>
>         set_pte_at(&init_mm, addr, ptep, pte);
> -
> -       /* See ptesync comment in radix__set_pte_at() */
> -       if (radix_enabled())
> -               asm volatile("ptesync": : :"memory");
> +       flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>         spin_unlock(&init_mm.page_table_lock);
>
>         return 0;
> ---
>
> What I think is happening is that the virtio_console code is running
> at the same time as we are doing the `module_enable_ro` at the end of
> `do_init_module`, due to the async nature of the work handler. There
> is a window after the PTE is cleared and the TLB is flushed, before
> the new PTE is written, so any translation of the module code page
> attempted during that window will fault.

Thanks for looking at this. I agree that this is the problem.
This avoids the crash (not a proper solution):

--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2011,13 +2011,15 @@ static void module_enable_ro(const struct module *mod, bool after_init)
        if (!rodata_enabled)
                return;

-       set_vm_flush_reset_perms(mod->core_layout.base);
-       set_vm_flush_reset_perms(mod->init_layout.base);
-       frob_text(&mod->core_layout, set_memory_ro);
-
-       frob_rodata(&mod->core_layout, set_memory_ro);
-       frob_text(&mod->init_layout, set_memory_ro);
-       frob_rodata(&mod->init_layout, set_memory_ro);
+       if (!after_init) {
+               set_vm_flush_reset_perms(mod->core_layout.base);
+               set_vm_flush_reset_perms(mod->init_layout.base);
+               frob_text(&mod->core_layout, set_memory_ro);
+
+               frob_rodata(&mod->core_layout, set_memory_ro);
+               frob_text(&mod->init_layout, set_memory_ro);
+               frob_rodata(&mod->init_layout, set_memory_ro);
+       }
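
To make the window described above concrete, here is a minimal annotated
sketch of the pre-patch sequence inside change_page_attr() (addr, ptep and
pte are its existing locals, and pte_wrprotect() just stands in for
whichever attribute change was requested):

	spin_lock(&init_mm.page_table_lock);

	/* invalidate the PTE: the page now has no valid kernel mapping */
	pte = ptep_get_and_clear(&init_mm, addr, ptep);
	/* remove any stale translations from the other CPUs */
	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

	/*
	 * <-- window: another CPU executing from this module page
	 *     (e.g. virtio_console's control_work_handler) faults here
	 */
	pte = pte_wrprotect(pte);	/* stand-in for the requested change */

	/* the mapping becomes valid again, now with the new bits */
	set_pte_at(&init_mm, addr, ptep, pte);
	spin_unlock(&init_mm.page_table_lock);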

>
> I'm ignorant of strict rwx in general so I don't see why we need to
> clear the bits before setting them to their final value, but as I
> understand it, the set_pte_at + flush_tlb_kernel_range satisfy the ISA
> requirement of [ptesync; tlbie; eieio; tlbsync; ptesync;] so it seems
> like the patch should work.
I believe it's done like that because of ISA 6.10.1.2, "Modifying a
Translation Table Entry":
"The sequence is equivalent to deleting the PTE and then adding a new one".
In that sequence, I think it is [set pte to 0; ptesync; tlbie; eieio;
tlbsync; ptesync;]

But it does seem like we need to do something like your patch for
change_page_attr() to work as expected.
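
For comparison, a minimal sketch of the in-place update your patch ends
up doing inside change_page_attr() (same locals, and pte_wrprotect() is
again only a stand-in for the requested change):

	spin_lock(&init_mm.page_table_lock);

	pte = *ptep;			/* the mapping stays valid throughout */
	pte = pte_wrprotect(pte);
	set_pte_at(&init_mm, addr, ptep, pte);

	/*
	 * per the discussion above, this provides the ptesync; tlbie;
	 * eieio; tlbsync; ptesync for this page
	 */
	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

	spin_unlock(&init_mm.page_table_lock);

A CPU racing with this should only ever see the old or the new
permission bits, never a cleared PTE, which fits with the crash going
away in your 300 runs.
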
>
> Now, I cannot explain why the crash always happens around the code
> that does the module's symbol relocations (the NIP in Laurent's trace
> is the TOC reload from module_64.c:restore_r2). Maybe because the
> instructions are already in the icache until the first branch into
> the stub?
Yeah, mpe thought it might be some cache effect of a branch instruction.
From time to time I did see Oopses like this (not "kernel tried to
execute exec-protected page"):

[  124.986964][   T52] Oops: Kernel access of bad area, sig: 11 [#1]
[  124.987043][   T52] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[  124.987095][   T52] Modules linked in: virtio_console binfmt_misc virtiofs fuse virtio_net virtio_blk virtio_pci virtio_pci_modern_dev virtio_ring virtio crc32c_vpmsum [last unloaded: virtio_console]
[  124.987209][   T52] CPU: 2 PID: 52 Comm: kworker/2:1 Not tainted 5.14.0-rc4 #4
[  124.987259][   T52] Workqueue: events control_work_handler [virtio_console]
[  124.987307][   T52] NIP:  c008000001a86044 LR: c008000001a82cf8 CTR: c00000000042a6c0
[  124.987358][   T52] REGS: c00000000b273770 TRAP: 0300   Not tainted  (5.14.0-rc4)
[  124.987406][   T52] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 82002882  XER: 00000004
[  124.987475][   T52] CFAR: c008000001a82cf4 DAR: c008000001a86058 DSISR: 40000000 IRQMASK: 0
[  124.987475][   T52] GPR00: c008000001a82ce8 c00000000b273a10 c008000001ab8000 c000000010731320
[  124.987475][   T52] GPR04: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
[  124.987475][   T52] GPR08: 0000000000000000 0000000000000000 0000000000000000 c008000001a86038
[  124.987475][   T52] GPR12: 0000000022002884 c0000001ffffdf00 c000000000169ae8 c0000000037700c0
[  124.987475][   T52] GPR16: 0000000000000000 0000000000000000 c000000021af0000 c000000003313900
[  124.987475][   T52] GPR20: 0000000000000001 0000000000000000 0000000000000000 c000000003316780
[  124.987475][   T52] GPR24: c008000001a90288 0000000000000000 0000000000000000 c000000010731320
[  124.987475][   T52] GPR28: c000000003316900 0000000000000018 0000000000000068 c00000000379c5a0
[  124.988028][   T52] NIP [c008000001a86044] fini+0x824/0xa7e0 [virtio_console]
[  124.988085][   T52] LR [c008000001a82cf8] fill_queue+0xb0/0x230 [virtio_console]
[  124.988141][   T52] Call Trace:
[  124.988171][   T52] [c00000000b273a10] [c008000001a82ce8] fill_queue+0xa0/0x230 [virtio_console] (unreliable)
[  124.988246][   T52] [c00000000b273aa0] [c008000001a83208] add_port.isra.0+0x1a0/0x4b0 [virtio_console]
[  124.988312][   T52] [c00000000b273b70] [c008000001a85474] control_work_handler+0x46c/0x654 [virtio_console]
[  124.988403][   T52] [c00000000b273c70] [c00000000015ce60] process_one_work+0x2a0/0x570
[  124.988586][   T52] [c00000000b273d10] [c00000000015d1d8] worker_thread+0xa8/0x660
[  124.988684][   T52] [c00000000b273da0] [c000000000169c5c] kthread+0x17c/0x190
[  124.988762][   T52] [c00000000b273e10] [c00000000000cf54] ret_from_kernel_thread+0x5c/0x64
[  124.988916][   T52] Instruction dump:
[  124.988965][   T52] 396be010 f8410018 e98b0020 7d8903a6 4e800420 00000000 73747562 003a0300
[  124.989152][   T52] c0000000 3d62fffd 396be038 f8410018 <e98b0020> 7d8903a6 4e800420 00000000
[  124.989298][   T52] ---[ end trace ab9046c024eb3154 ]---

I guess it depends on the timing of which pte is invalidated.
>
> Anyway, this is what Murilo and I found out over our debugging session
> in the past couple of days. I hope it helps. =)
Yes it does, thanks heaps.
>

^ permalink raw reply	[flat|nested] 14+ messages in thread
