linux-hardening.vger.kernel.org archive mirror
* [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU
@ 2021-07-13  5:31 Christopher M. Riedl
  2021-07-13  5:31 ` [PATCH v5 1/8] powerpc: Add LKDTM accessor for patching addr Christopher M. Riedl
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

When compiled with CONFIG_STRICT_KERNEL_RWX, the kernel must create
temporary mappings when patching itself. These mappings temporarily
override the strict RWX text protections to permit a write. Currently,
powerpc allocates a per-CPU VM area for patching. Patching occurs as
follows:

	1. Map page in per-CPU VM area w/ PAGE_KERNEL protection
	2. Patch text
	3. Remove the temporary mapping
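
In rough C, the existing approach looks like this (a simplified sketch of
arch/powerpc/lib/code-patching.c, not the verbatim code; instruction-cache
flushing and most error handling are elided):

	static int sketch_patch(u32 *addr, u32 instr)
	{
		/* this CPU's reserved patching page, set up at CPU-online time */
		unsigned long poke = (unsigned long)__this_cpu_read(text_poke_area)->addr;
		unsigned long pfn = is_vmalloc_or_module_addr(addr) ?
					vmalloc_to_pfn(addr) :
					__pa_symbol(addr) >> PAGE_SHIFT;

		/* 1. Map the page containing addr at 'poke' w/ PAGE_KERNEL */
		if (map_kernel_page(poke, pfn << PAGE_SHIFT, PAGE_KERNEL))
			return -1;

		/* 2. Patch text through the writable alias */
		*(u32 *)(poke + offset_in_page(addr)) = instr;

		/* 3. Remove the temporary mapping (clears the PTE and flushes
		 *    the TLB for this kernel address)
		 */
		return unmap_patch_area(poke);
	}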

While the VM area is per-CPU, the mapping is actually inserted into the
kernel page tables. Presumably, this could allow another CPU to access
the normally write-protected text - either maliciously or accidentally -
via this same mapping if the address of the VM area is known. Ideally,
the mapping should be kept local to the CPU doing the patching [0].

x86 introduced "temporary mm" structs which allow the creation of mappings
local to a particular CPU [1]. This series intends to bring the notion of a
temporary mm to powerpc's Book3s64 Radix MMU and harden it by using such a
mapping for patching a kernel with strict RWX permissions.

The first four patches implement an LKDTM test "proof-of-concept" which
exploits the potential vulnerability (i.e. the temporary mapping during patching
is exposed in the kernel page tables and accessible by other CPUs) using a
simple brute-force approach. This test is implemented for both powerpc and
x86_64. The test passes on powerpc Radix with this new series, fails on
upstream powerpc, passes on upstream x86_64, and fails on an older (ancient)
x86_64 tree without the x86_64 temporary mm patches. The remaining patches add
support for and use a temporary mm for code patching on powerpc with the Radix
MMU.

Tested boot, ftrace, and repeated LKDTM "hijack":
	- QEMU+KVM (host: POWER9 Blackbird): Radix MMU w/ KUAP
	- QEMU+KVM (host: POWER9 Blackbird): Hash MMU

Tested repeated LKDTM "hijack":
	- QEMU+KVM (host: AMD desktop): x86_64 upstream
	- QEMU+KVM (host: AMD desktop): x86_64 w/o percpu temp mm to
	  verify the LKDTM "hijack" test fails

Tested boot and ftrace:
	- QEMU+TCG: ppc44x (bamboo)
	- QEMU+TCG: g5 (mac99)

I also tested with various extra config options enabled as suggested in
section 12) in Documentation/process/submit-checklist.rst.

v5:	* Only support Book3s64 Radix MMU for now. There are some issues with
	  the previous implementation on the Hash MMU as pointed out by Nick
	  Piggin. Fixing these is not trivial so we only support the Radix MMU
	  for now. I tried using realmode (no data translation) to patch with
	  Hash to at least avoid exposing the patch mapping to other CPUs but
	  this doesn't reliably work either since we cannot access vmalloc'd
	  space in realmode.
	* Use percpu variables for the patching_mm and patching_addr. This
	  avoids the need for synchronization mechanisms entirely. Thanks to
	  Peter Zijlstra for pointing out text_mutex which unfortunately didn't
	  work out without larger code changes in powerpc. Also thanks to Daniel
	  Axtens for comments about using percpu variables for the *percpu* temp
	  mm things off list.

v4:	* It's time to revisit this series again since @jpn and @mpe fixed
	  our known STRICT_*_RWX bugs on powerpc/64s.
	* Rebase on linuxppc/next:
	  commit ee1bc694fbaec ("powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n")
	* Completely rework how map_patch() works on book3s64 Hash MMU
	* Split the LKDTM x86_64 and powerpc bits into separate patches
	* Annotate commit messages with changes from v3 instead of
	  listing them here completely out-of-context...

v3:	* Rebase on linuxppc/next: commit 9123e3a74ec7 ("Linux 5.9-rc1")
	* Move temporary mm implementation into code-patching.c where it
	  belongs
	* Implement LKDTM hijacker test on x86_64 (on IBM time oof)
	* Do not use address zero for the patching address in the
	  temporary mm (thanks @dja for pointing this out!)
	* Wrap the LKDTM test w/ CONFIG_SMP as suggested by Christophe
	  Leroy
	* Comments to clarify PTE pre-allocation and patching addr
	  selection

v2:	* Rebase on linuxppc/next:
	  commit 105fb38124a4 ("powerpc/8xx: Modify ptep_get()")
	* Always dirty pte when mapping patch
	* Use `ppc_inst_len` instead of `sizeof` on instructions
	* Declare LKDTM patching addr accessor in header where it belongs	

v1:	* Rebase on linuxppc/next (4336b9337824)
	* Save and restore second hw watchpoint
	* Use new ppc_inst_* functions for patching check and in LKDTM test

rfc-v2:	* Many fixes and improvements mostly based on extensive feedback
	  and testing by Christophe Leroy (thanks!).
	* Make patching_mm and patching_addr static and move
	  '__ro_after_init' to after the variable name (more common in
	  other parts of the kernel)
	* Use 'asm/debug.h' header instead of 'asm/hw_breakpoint.h' to
	  fix PPC64e compile
	* Add comment explaining why we use BUG_ON() during the init
	  call to setup for patching later
	* Move ptep into patch_mapping to avoid walking page tables a
	  second time when unmapping the temporary mapping
	* Use KUAP under non-radix, also manually dirty the PTE for patch
	  mapping on non-BOOK3S_64 platforms
	* Properly return any error from __patch_instruction
	* Do not use 'memcmp' where a simple comparison is appropriate
	* Simplify expression for patch address by removing pointer maths
	* Add LKDTM test

[0]: https://github.com/linuxppc/issues/issues/224
[1]: https://lore.kernel.org/kernel-hardening/20190426232303.28381-1-nadav.amit@gmail.com/

Christopher M. Riedl (8):
  powerpc: Add LKDTM accessor for patching addr
  lkdtm/powerpc: Add test to hijack a patch mapping
  x86_64: Add LKDTM accessor for patching addr
  lkdtm/x86_64: Add test to hijack a patch mapping
  powerpc/64s: Introduce temporary mm for Radix MMU
  powerpc: Rework and improve STRICT_KERNEL_RWX patching
  powerpc/64s: Initialize and use a temporary mm for patching on Radix
  lkdtm/powerpc: Fix code patching hijack test

 arch/powerpc/include/asm/code-patching.h |   4 +
 arch/powerpc/include/asm/debug.h         |   1 +
 arch/powerpc/kernel/process.c            |   5 +
 arch/powerpc/lib/code-patching.c         | 240 ++++++++++++++++++++---
 arch/x86/include/asm/text-patching.h     |   4 +
 arch/x86/kernel/alternative.c            |   7 +
 drivers/misc/lkdtm/core.c                |   1 +
 drivers/misc/lkdtm/lkdtm.h               |   1 +
 drivers/misc/lkdtm/perms.c               | 143 ++++++++++++++
 9 files changed, 378 insertions(+), 28 deletions(-)

-- 
2.26.1



* [PATCH v5 1/8] powerpc: Add LKDTM accessor for patching addr
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
@ 2021-07-13  5:31 ` Christopher M. Riedl
  2021-07-13  5:31 ` [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping Christopher M. Riedl
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

When live patching with STRICT_KERNEL_RWX a mapping is installed at a
"patching address" with temporary write permissions. Provide a
LKDTM-only accessor function for this address in preparation for a LKDTM
test which attempts to "hijack" this mapping by writing to it from
another CPU.

Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
---
 arch/powerpc/include/asm/code-patching.h | 4 ++++
 arch/powerpc/lib/code-patching.c         | 7 +++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index a95f63788c6b1..16fbc58a4932f 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -184,4 +184,8 @@ static inline unsigned long ppc_kallsyms_lookup_name(const char *name)
 #define PPC_INST_STD_LR		PPC_RAW_STD(_R0, _R1, PPC_LR_STKOFF)
 #endif /* CONFIG_PPC64 */
 
+#if IS_BUILTIN(CONFIG_LKDTM) && IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)
+unsigned long read_cpu_patching_addr(unsigned int cpu);
+#endif
+
 #endif /* _ASM_POWERPC_CODE_PATCHING_H */
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index f9a3019e37b43..54b6157d44e95 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -47,6 +47,13 @@ int raw_patch_instruction(u32 *addr, struct ppc_inst instr)
 #ifdef CONFIG_STRICT_KERNEL_RWX
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
 
+#if IS_BUILTIN(CONFIG_LKDTM)
+unsigned long read_cpu_patching_addr(unsigned int cpu)
+{
+	return (unsigned long)(per_cpu(text_poke_area, cpu))->addr;
+}
+#endif
+
 static int text_area_cpu_up(unsigned int cpu)
 {
 	struct vm_struct *area;
-- 
2.26.1



* [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
  2021-07-13  5:31 ` [PATCH v5 1/8] powerpc: Add LKDTM accessor for patching addr Christopher M. Riedl
@ 2021-07-13  5:31 ` Christopher M. Riedl
  2021-08-05  9:13   ` Christophe Leroy
  2021-07-13  5:31 ` [PATCH v5 3/8] x86_64: Add LKDTM accessor for patching addr Christopher M. Riedl
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

When live patching with STRICT_KERNEL_RWX the CPU doing the patching
must temporarily remap the page(s) containing the patch site with +W
permissions. While this temporary mapping is in use, another CPU could
write to the same mapping and maliciously alter kernel text. Implement a
LKDTM test to attempt to exploit such an opening during code patching.
The test is implemented on powerpc and requires LKDTM built into the
kernel (building LKDTM as a module is insufficient).

The LKDTM "hijack" test works as follows:

  1. A CPU executes an infinite loop to patch an instruction. This is
     the "patching" CPU.
  2. Another CPU attempts to write to the address of the temporary
     mapping used by the "patching" CPU. This other CPU is the
     "hijacker" CPU. The hijack either fails with a fault/error or
     succeeds, in which case some kernel text is now overwritten.

The virtual address of the temporary patch mapping is provided via an
LKDTM-specific accessor to the hijacker CPU. This test assumes a
hypothetical situation where this address was leaked previously.
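
Schematically, the hijacker side of the race looks like this (a sketch only;
the complete implementation is in the diff below):

	/* hijacker CPU: race against the patching CPU's patch loop */
	unsigned long addr = read_cpu_patching_addr(patching_cpu) |
			     offset_in_page(patch_site);
	int attempts;

	for (attempts = 0; attempts < 100000; ++attempts) {
		if (lkdtm_try_write(0xbad00bad, (u32 *)addr))
			break;	/* wrote through the other CPU's temp mapping */
	}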

How to run the test:

	mount -t debugfs none /sys/kernel/debug
	(echo HIJACK_PATCH > /sys/kernel/debug/provoke-crash/DIRECT)

A passing test indicates that it is not possible to overwrite kernel
text from another CPU by using the temporary mapping established by
a CPU for patching.

Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>

---

v5:  * Use `u32*` instead of `struct ppc_inst*` based on new series in
       upstream.

v4:  * Separate the powerpc and x86_64 bits into individual patches.
     * Use __put_kernel_nofault() when attempting to hijack the mapping
     * Use raw_smp_processor_id() to avoid triggering the BUG() when
       calling smp_processor_id() in preemptible code - the only thing
       that matters is that one of the threads is bound to a different
       CPU - we are not using smp_processor_id() to access any per-cpu
       data or similar where preemption should be disabled.
     * Rework the patching_cpu() kthread stop condition to avoid:
       https://lwn.net/Articles/628628/
---
 drivers/misc/lkdtm/core.c  |   1 +
 drivers/misc/lkdtm/lkdtm.h |   1 +
 drivers/misc/lkdtm/perms.c | 134 +++++++++++++++++++++++++++++++++++++
 3 files changed, 136 insertions(+)

diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index 8024b6a5cc7fc..fbcb95eda337b 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -147,6 +147,7 @@ static const struct crashtype crashtypes[] = {
 	CRASHTYPE(WRITE_RO),
 	CRASHTYPE(WRITE_RO_AFTER_INIT),
 	CRASHTYPE(WRITE_KERN),
+	CRASHTYPE(HIJACK_PATCH),
 	CRASHTYPE(REFCOUNT_INC_OVERFLOW),
 	CRASHTYPE(REFCOUNT_ADD_OVERFLOW),
 	CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW),
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index 99f90d3e5e9cb..87e7e6136d962 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -62,6 +62,7 @@ void lkdtm_EXEC_USERSPACE(void);
 void lkdtm_EXEC_NULL(void);
 void lkdtm_ACCESS_USERSPACE(void);
 void lkdtm_ACCESS_NULL(void);
+void lkdtm_HIJACK_PATCH(void);
 
 /* refcount.c */
 void lkdtm_REFCOUNT_INC_OVERFLOW(void);
diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
index 2dede2ef658f3..39e7456852229 100644
--- a/drivers/misc/lkdtm/perms.c
+++ b/drivers/misc/lkdtm/perms.c
@@ -9,6 +9,7 @@
 #include <linux/vmalloc.h>
 #include <linux/mman.h>
 #include <linux/uaccess.h>
+#include <linux/kthread.h>
 #include <asm/cacheflush.h>
 
 /* Whether or not to fill the target memory area with do_nothing(). */
@@ -222,6 +223,139 @@ void lkdtm_ACCESS_NULL(void)
 	pr_err("FAIL: survived bad write\n");
 }
 
+#if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
+	defined(CONFIG_PPC))
+/*
+ * This is just a dummy location to patch-over.
+ */
+static void patching_target(void)
+{
+	return;
+}
+
+#include <asm/code-patching.h>
+const u32 *patch_site = (const u32 *)&patching_target;
+
+static inline int lkdtm_do_patch(u32 data)
+{
+	return patch_instruction((u32 *)patch_site, ppc_inst(data));
+}
+
+static inline u32 lkdtm_read_patch_site(void)
+{
+	return READ_ONCE(*patch_site);
+}
+
+/* Returns True if the write succeeds */
+static inline bool lkdtm_try_write(u32 data, u32 *addr)
+{
+	__put_kernel_nofault(addr, &data, u32, err);
+	return true;
+
+err:
+	return false;
+}
+
+static int lkdtm_patching_cpu(void *data)
+{
+	int err = 0;
+	u32 val = 0xdeadbeef;
+
+	pr_info("starting patching_cpu=%d\n", raw_smp_processor_id());
+
+	do {
+		err = lkdtm_do_patch(val);
+	} while (lkdtm_read_patch_site() == val && !err && !kthread_should_stop());
+
+	if (err)
+		pr_warn("XFAIL: patch_instruction returned error: %d\n", err);
+
+	while (!kthread_should_stop()) {
+		set_current_state(TASK_INTERRUPTIBLE);
+		schedule();
+	}
+
+	return err;
+}
+
+void lkdtm_HIJACK_PATCH(void)
+{
+	struct task_struct *patching_kthrd;
+	int patching_cpu, hijacker_cpu, attempts;
+	unsigned long addr;
+	bool hijacked;
+	const u32 bad_data = 0xbad00bad;
+	const u32 original_insn = lkdtm_read_patch_site();
+
+	if (!IS_ENABLED(CONFIG_SMP)) {
+		pr_err("XFAIL: this test requires CONFIG_SMP\n");
+		return;
+	}
+
+	if (num_online_cpus() < 2) {
+		pr_warn("XFAIL: this test requires at least two cpus\n");
+		return;
+	}
+
+	hijacker_cpu = raw_smp_processor_id();
+	patching_cpu = cpumask_any_but(cpu_online_mask, hijacker_cpu);
+
+	patching_kthrd = kthread_create_on_node(&lkdtm_patching_cpu, NULL,
+						cpu_to_node(patching_cpu),
+						"lkdtm_patching_cpu");
+	kthread_bind(patching_kthrd, patching_cpu);
+	wake_up_process(patching_kthrd);
+
+	addr = offset_in_page(patch_site) | read_cpu_patching_addr(patching_cpu);
+
+	pr_info("starting hijacker_cpu=%d\n", hijacker_cpu);
+	for (attempts = 0; attempts < 100000; ++attempts) {
+		/* Try to write to the other CPU's temp patch mapping */
+		hijacked = lkdtm_try_write(bad_data, (u32 *)addr);
+
+		if (hijacked) {
+			if (kthread_stop(patching_kthrd)) {
+				pr_info("hijack attempts: %d\n", attempts);
+				pr_err("XFAIL: error stopping patching cpu\n");
+				return;
+			}
+			break;
+		}
+	}
+	pr_info("hijack attempts: %d\n", attempts);
+
+	if (hijacked) {
+		if (lkdtm_read_patch_site() == bad_data)
+			pr_err("overwrote kernel text\n");
+		/*
+		 * There are window conditions where the hijacker cpu manages to
+		 * write to the patch site but the site gets overwritten again by
+		 * the patching cpu. We still consider that a "successful" hijack
+		 * since the hijacker cpu did not fault on the write.
+		 */
+		pr_err("FAIL: wrote to another cpu's patching area\n");
+	} else {
+		kthread_stop(patching_kthrd);
+	}
+
+	/* Restore the original data to be able to run the test again */
+	lkdtm_do_patch(original_insn);
+}
+
+#else
+
+void lkdtm_HIJACK_PATCH(void)
+{
+	if (!IS_ENABLED(CONFIG_PPC))
+		pr_err("XFAIL: this test only runs on powerpc\n");
+	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
+		pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
+	if (!IS_BUILTIN(CONFIG_LKDTM))
+		pr_err("XFAIL: this test requires CONFIG_LKDTM=y (not =m!)\n");
+}
+
+#endif
+
 void __init lkdtm_perms_init(void)
 {
 	/* Make sure we can write to __ro_after_init values during __init */
-- 
2.26.1



* [PATCH v5 3/8] x86_64: Add LKDTM accessor for patching addr
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
  2021-07-13  5:31 ` [PATCH v5 1/8] powerpc: Add LKDTM accessor for patching addr Christopher M. Riedl
  2021-07-13  5:31 ` [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping Christopher M. Riedl
@ 2021-07-13  5:31 ` Christopher M. Riedl
  2021-07-13  5:31 ` [PATCH v5 4/8] lkdtm/x86_64: Add test to hijack a patch mapping Christopher M. Riedl
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

When live patching with STRICT_KERNEL_RWX a mapping is installed at a
"patching address" with temporary write permissions. Provide a
LKDTM-only accessor function for this address in preparation for a LKDTM
test which attempts to "hijack" this mapping by writing to it from
another CPU.

Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
---
 arch/x86/include/asm/text-patching.h | 4 ++++
 arch/x86/kernel/alternative.c        | 7 +++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index b7421780e4e92..f0caf9ee13bd8 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -167,4 +167,8 @@ void int3_emulate_ret(struct pt_regs *regs)
 }
 #endif /* !CONFIG_UML_X86 */
 
+#if IS_BUILTIN(CONFIG_LKDTM)
+unsigned long read_cpu_patching_addr(unsigned int cpu);
+#endif
+
 #endif /* _ASM_X86_TEXT_PATCHING_H */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index e9da3dc712541..28bb92b695639 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -773,6 +773,13 @@ static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
 __ro_after_init struct mm_struct *poking_mm;
 __ro_after_init unsigned long poking_addr;
 
+#if IS_BUILTIN(CONFIG_LKDTM)
+unsigned long read_cpu_patching_addr(unsigned int cpu)
+{
+	return poking_addr;
+}
+#endif
+
 static void *__text_poke(void *addr, const void *opcode, size_t len)
 {
 	bool cross_page_boundary = offset_in_page(addr) + len > PAGE_SIZE;
-- 
2.26.1



* [PATCH v5 4/8] lkdtm/x86_64: Add test to hijack a patch mapping
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
                   ` (2 preceding siblings ...)
  2021-07-13  5:31 ` [PATCH v5 3/8] x86_64: Add LKDTM accessor for patching addr Christopher M. Riedl
@ 2021-07-13  5:31 ` Christopher M. Riedl
  2021-08-05  9:09   ` Christophe Leroy
  2021-07-13  5:31 ` [PATCH v5 5/8] powerpc/64s: Introduce temporary mm for Radix MMU Christopher M. Riedl
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

A previous commit implemented an LKDTM test on powerpc to exploit the
temporary mapping established when patching code with STRICT_KERNEL_RWX
enabled. Extend the test to work on x86_64 as well.

Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
---
 drivers/misc/lkdtm/perms.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
index 39e7456852229..41e87e5f9cc86 100644
--- a/drivers/misc/lkdtm/perms.c
+++ b/drivers/misc/lkdtm/perms.c
@@ -224,7 +224,7 @@ void lkdtm_ACCESS_NULL(void)
 }
 
 #if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
-	defined(CONFIG_PPC))
+	(defined(CONFIG_PPC) || defined(CONFIG_X86_64)))
 /*
  * This is just a dummy location to patch-over.
  */
@@ -233,12 +233,25 @@ static void patching_target(void)
 	return;
 }
 
-#include <asm/code-patching.h>
 const u32 *patch_site = (const u32 *)&patching_target;
 
+#ifdef CONFIG_PPC
+#include <asm/code-patching.h>
+#endif
+
+#ifdef CONFIG_X86_64
+#include <asm/text-patching.h>
+#endif
+
 static inline int lkdtm_do_patch(u32 data)
 {
+#ifdef CONFIG_PPC
 	return patch_instruction((u32 *)patch_site, ppc_inst(data));
+#endif
+#ifdef CONFIG_X86_64
+	text_poke((void *)patch_site, &data, sizeof(u32));
+	return 0;
+#endif
 }
 
 static inline u32 lkdtm_read_patch_site(void)
@@ -249,11 +262,16 @@ static inline u32 lkdtm_read_patch_site(void)
 /* Returns True if the write succeeds */
 static inline bool lkdtm_try_write(u32 data, u32 *addr)
 {
+#ifdef CONFIG_PPC
 	__put_kernel_nofault(addr, &data, u32, err);
 	return true;
 
 err:
 	return false;
+#endif
+#ifdef CONFIG_X86_64
+	return !__put_user(data, addr);
+#endif
 }
 
 static int lkdtm_patching_cpu(void *data)
@@ -346,8 +364,8 @@ void lkdtm_HIJACK_PATCH(void)
 
 void lkdtm_HIJACK_PATCH(void)
 {
-	if (!IS_ENABLED(CONFIG_PPC))
-		pr_err("XFAIL: this test only runs on powerpc\n");
+	if (!IS_ENABLED(CONFIG_PPC) && !IS_ENABLED(CONFIG_X86_64))
+		pr_err("XFAIL: this test only runs on powerpc and x86_64\n");
 	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
 		pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
 	if (!IS_BUILTIN(CONFIG_LKDTM))
-- 
2.26.1



* [PATCH v5 5/8] powerpc/64s: Introduce temporary mm for Radix MMU
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
                   ` (3 preceding siblings ...)
  2021-07-13  5:31 ` [PATCH v5 4/8] lkdtm/x86_64: Add test to hijack a patch mapping Christopher M. Riedl
@ 2021-07-13  5:31 ` Christopher M. Riedl
  2021-08-05  9:27   ` Christophe Leroy
  2021-07-13  5:31 ` [PATCH v5 6/8] powerpc: Rework and improve STRICT_KERNEL_RWX patching Christopher M. Riedl
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

x86 supports the notion of a temporary mm which restricts access to
temporary PTEs to a single CPU. A temporary mm is useful for situations
where a CPU needs to perform sensitive operations (such as patching a
STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
said mappings to other CPUs. Another benefit is that other CPU TLBs do
not need to be flushed when the temporary mm is torn down.

Mappings in the temporary mm can be set in the userspace portion of the
address-space.

Interrupts must be disabled while the temporary mm is in use. HW
breakpoints, which may have been set by userspace as watchpoints on
addresses now within the temporary mm, are saved and disabled when
loading the temporary mm. The HW breakpoints are restored when unloading
the temporary mm. All HW breakpoints are indiscriminately disabled while
the temporary mm is in use.
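
A hedged sketch of the intended usage (the temp_mm API below is what this
patch introduces; `patching_mm` stands in for the mm prepared at init time
by a later patch):

	struct temp_mm temp_mm;
	unsigned long flags;

	init_temp_mm(&temp_mm, patching_mm);

	local_irq_save(flags);
	use_temporary_mm(&temp_mm);	/* switch mm; stash and disable HW breakpoints */

	/* ... perform the privileged write via the CPU-local mapping ... */

	unuse_temporary_mm(&temp_mm);	/* restore previous mm and HW breakpoints */
	local_irq_restore(flags);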

Based on x86 implementation:

commit cefa929c034e
("x86/mm: Introduce temporary mm structs")

Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>

---

v5:  * Drop support for using a temporary mm on Book3s64 Hash MMU.

v4:  * Pass the prev mm instead of NULL to switch_mm_irqs_off() when
       using/unusing the temp mm as suggested by Jann Horn to keep
       the context.active counter in-sync on mm/nohash.
     * Disable SLB preload in the temporary mm when initializing the
       temp_mm struct.
     * Include asm/debug.h header to fix build issue with
       ppc44x_defconfig.
---
 arch/powerpc/include/asm/debug.h |  1 +
 arch/powerpc/kernel/process.c    |  5 +++
 arch/powerpc/lib/code-patching.c | 56 ++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+)

diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
index 86a14736c76c3..dfd82635ea8b3 100644
--- a/arch/powerpc/include/asm/debug.h
+++ b/arch/powerpc/include/asm/debug.h
@@ -46,6 +46,7 @@ static inline int debugger_fault_handler(struct pt_regs *regs) { return 0; }
 #endif
 
 void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk);
+void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk);
 bool ppc_breakpoint_available(void);
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
 extern void do_send_trap(struct pt_regs *regs, unsigned long address,
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 185beb2905801..a0776200772e8 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -865,6 +865,11 @@ static inline int set_breakpoint_8xx(struct arch_hw_breakpoint *brk)
 	return 0;
 }
 
+void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk)
+{
+	memcpy(brk, this_cpu_ptr(&current_brk[nr]), sizeof(*brk));
+}
+
 void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk)
 {
 	memcpy(this_cpu_ptr(&current_brk[nr]), brk, sizeof(*brk));
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 54b6157d44e95..3122d8e4cc013 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -17,6 +17,9 @@
 #include <asm/code-patching.h>
 #include <asm/setup.h>
 #include <asm/inst.h>
+#include <asm/mmu_context.h>
+#include <asm/debug.h>
+#include <asm/tlb.h>
 
 static int __patch_instruction(u32 *exec_addr, struct ppc_inst instr, u32 *patch_addr)
 {
@@ -45,6 +48,59 @@ int raw_patch_instruction(u32 *addr, struct ppc_inst instr)
 }
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
+
+struct temp_mm {
+	struct mm_struct *temp;
+	struct mm_struct *prev;
+	struct arch_hw_breakpoint brk[HBP_NUM_MAX];
+};
+
+static inline void init_temp_mm(struct temp_mm *temp_mm, struct mm_struct *mm)
+{
+	/* We currently only support temporary mm on the Book3s64 Radix MMU */
+	WARN_ON(!radix_enabled());
+
+	temp_mm->temp = mm;
+	temp_mm->prev = NULL;
+	memset(&temp_mm->brk, 0, sizeof(temp_mm->brk));
+}
+
+static inline void use_temporary_mm(struct temp_mm *temp_mm)
+{
+	lockdep_assert_irqs_disabled();
+
+	temp_mm->prev = current->active_mm;
+	switch_mm_irqs_off(temp_mm->prev, temp_mm->temp, current);
+
+	WARN_ON(!mm_is_thread_local(temp_mm->temp));
+
+	if (ppc_breakpoint_available()) {
+		struct arch_hw_breakpoint null_brk = {0};
+		int i = 0;
+
+		for (; i < nr_wp_slots(); ++i) {
+			__get_breakpoint(i, &temp_mm->brk[i]);
+			if (temp_mm->brk[i].type != 0)
+				__set_breakpoint(i, &null_brk);
+		}
+	}
+}
+
+static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
+{
+	lockdep_assert_irqs_disabled();
+
+	switch_mm_irqs_off(temp_mm->temp, temp_mm->prev, current);
+
+	if (ppc_breakpoint_available()) {
+		int i = 0;
+
+		for (; i < nr_wp_slots(); ++i)
+			if (temp_mm->brk[i].type != 0)
+				__set_breakpoint(i, &temp_mm->brk[i]);
+	}
+}
+
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
 
 #if IS_BUILTIN(CONFIG_LKDTM)
-- 
2.26.1



* [PATCH v5 6/8] powerpc: Rework and improve STRICT_KERNEL_RWX patching
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
                   ` (4 preceding siblings ...)
  2021-07-13  5:31 ` [PATCH v5 5/8] powerpc/64s: Introduce temporary mm for Radix MMU Christopher M. Riedl
@ 2021-07-13  5:31 ` Christopher M. Riedl
  2021-08-05  9:34   ` Christophe Leroy
  2021-07-13  5:31 ` [PATCH v5 7/8] powerpc/64s: Initialize and use a temporary mm for patching on Radix Christopher M. Riedl
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

Rework code-patching with STRICT_KERNEL_RWX to prepare for the next
patch which uses a temporary mm for patching under the Book3s64 Radix
MMU. Also improve the code by adding a WARN_ON() when the patch site does
not match the expected instruction after patching, and by properly
returning any error from __patch_instruction().

Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>

---

v5:  * New to series.
---
 arch/powerpc/lib/code-patching.c | 51 +++++++++++++++++---------------
 1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 3122d8e4cc013..9f2eba9b70ee4 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -102,11 +102,12 @@ static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
 }
 
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
+static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
 
 #if IS_BUILTIN(CONFIG_LKDTM)
 unsigned long read_cpu_patching_addr(unsigned int cpu)
 {
-	return (unsigned long)(per_cpu(text_poke_area, cpu))->addr;
+	return per_cpu(cpu_patching_addr, cpu);
 }
 #endif
 
@@ -121,6 +122,7 @@ static int text_area_cpu_up(unsigned int cpu)
 		return -1;
 	}
 	this_cpu_write(text_poke_area, area);
+	this_cpu_write(cpu_patching_addr, (unsigned long)area->addr);
 
 	return 0;
 }
@@ -146,7 +148,7 @@ void __init poking_init(void)
 /*
  * This can be called for kernel text or a module.
  */
-static int map_patch_area(void *addr, unsigned long text_poke_addr)
+static int map_patch_area(void *addr)
 {
 	unsigned long pfn;
 	int err;
@@ -156,17 +158,20 @@ static int map_patch_area(void *addr, unsigned long text_poke_addr)
 	else
 		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
 
-	err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
+	err = map_kernel_page(__this_cpu_read(cpu_patching_addr),
+			      (pfn << PAGE_SHIFT), PAGE_KERNEL);
 
-	pr_devel("Mapped addr %lx with pfn %lx:%d\n", text_poke_addr, pfn, err);
+	pr_devel("Mapped addr %lx with pfn %lx:%d\n",
+		 __this_cpu_read(cpu_patching_addr), pfn, err);
 	if (err)
 		return -1;
 
 	return 0;
 }
 
-static inline int unmap_patch_area(unsigned long addr)
+static inline int unmap_patch_area(void)
 {
+	unsigned long addr = __this_cpu_read(cpu_patching_addr);
 	pte_t *ptep;
 	pmd_t *pmdp;
 	pud_t *pudp;
@@ -175,23 +180,23 @@ static inline int unmap_patch_area(unsigned long addr)
 
 	pgdp = pgd_offset_k(addr);
 	if (unlikely(!pgdp))
-		return -EINVAL;
+		goto out_err;
 
 	p4dp = p4d_offset(pgdp, addr);
 	if (unlikely(!p4dp))
-		return -EINVAL;
+		goto out_err;
 
 	pudp = pud_offset(p4dp, addr);
 	if (unlikely(!pudp))
-		return -EINVAL;
+		goto out_err;
 
 	pmdp = pmd_offset(pudp, addr);
 	if (unlikely(!pmdp))
-		return -EINVAL;
+		goto out_err;
 
 	ptep = pte_offset_kernel(pmdp, addr);
 	if (unlikely(!ptep))
-		return -EINVAL;
+		goto out_err;
 
 	pr_devel("clearing mm %p, pte %p, addr %lx\n", &init_mm, ptep, addr);
 
@@ -202,15 +207,17 @@ static inline int unmap_patch_area(unsigned long addr)
 	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 
 	return 0;
+
+out_err:
+	pr_warn("failed to unmap %lx\n", addr);
+	return -EINVAL;
 }
 
 static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 {
-	int err;
+	int err, rc = 0;
 	u32 *patch_addr = NULL;
 	unsigned long flags;
-	unsigned long text_poke_addr;
-	unsigned long kaddr = (unsigned long)addr;
 
 	/*
 	 * During early early boot patch_instruction is called
@@ -222,24 +229,20 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 
 	local_irq_save(flags);
 
-	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
-	if (map_patch_area(addr, text_poke_addr)) {
-		err = -1;
+	err = map_patch_area(addr);
+	if (err)
 		goto out;
-	}
-
-	patch_addr = (u32 *)(text_poke_addr + (kaddr & ~PAGE_MASK));
 
-	__patch_instruction(addr, instr, patch_addr);
+	patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
+	rc = __patch_instruction(addr, instr, patch_addr);
 
-	err = unmap_patch_area(text_poke_addr);
-	if (err)
-		pr_warn("failed to unmap %lx\n", text_poke_addr);
+	err = unmap_patch_area();
 
 out:
 	local_irq_restore(flags);
+	WARN_ON(!ppc_inst_equal(ppc_inst_read(addr), instr));
 
-	return err;
+	return rc ? rc : err;
 }
 #else /* !CONFIG_STRICT_KERNEL_RWX */
 
-- 
2.26.1



* [PATCH v5 7/8] powerpc/64s: Initialize and use a temporary mm for patching on Radix
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
                   ` (5 preceding siblings ...)
  2021-07-13  5:31 ` [PATCH v5 6/8] powerpc: Rework and improve STRICT_KERNEL_RWX patching Christopher M. Riedl
@ 2021-07-13  5:31 ` Christopher M. Riedl
  2021-08-05  9:48   ` Christophe Leroy
  2021-07-13  5:31 ` [PATCH v5 8/8] lkdtm/powerpc: Fix code patching hijack test Christopher M. Riedl
  2021-08-05  9:03 ` [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christophe Leroy
  8 siblings, 1 reply; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

When code patching a STRICT_KERNEL_RWX kernel the page containing the
address to be patched is temporarily mapped as writeable. Currently, a
per-cpu vmalloc patch area is used for this purpose. While the patch
area is per-cpu, the temporary page mapping is inserted into the kernel
page tables for the duration of patching. The mapping is exposed to CPUs
other than the patching CPU - this is undesirable from a hardening
perspective. Use a temporary mm instead which keeps the mapping local to
the CPU doing the patching.

Use the `poking_init` init hook to prepare a temporary mm and patching
address. Initialize the temporary mm by copying the init mm. Choose a
randomized patching address inside the temporary mm userspace address
space. The patching address is randomized between PAGE_SIZE and
DEFAULT_MAP_WINDOW-PAGE_SIZE.

Bits of entropy with 64K page size on BOOK3S_64:

        bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)

        PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
        bits of entropy = log2(128TB / 64K)
        bits of entropy = 31
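
        For comparison, the same formula with PAGE_SIZE=4K gives
        log2(128TB / 4K) = 35 bits of entropy.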

The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
operates - by default the space above DEFAULT_MAP_WINDOW is not
available. Currently the Hash MMU does not use a temporary mm so
technically this upper limit isn't necessary; however, a larger
randomization range does not further "harden" this overall approach and
future work may introduce patching with a temporary mm on Hash as well.

Randomization occurs only once during initialization at boot for each
possible CPU in the system.

Introduce two new functions, map_patch() and unmap_patch(), to
respectively create and remove the temporary mapping with write
permissions at patching_addr. Map the page with PAGE_KERNEL to set
EAA[0] for the PTE which ignores the AMR (so no need to unlock/lock
KUAP) according to PowerISA v3.0b Figure 35 on Radix.

Based on x86 implementation:

commit 4fc19708b165
("x86/alternatives: Initialize temporary mm for patching")

and:

commit b3fd8e83ada0
("x86/alternatives: Use temporary mm for text poking")

Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>

---

v5:  * Only support Book3s64 Radix MMU for now.
     * Use a per-cpu datastructure to hold the patching_addr and
       patching_mm to avoid the need for a synchronization lock/mutex.

v4:  * In the previous series this was two separate patches: one to init
       the temporary mm in poking_init() (unused in powerpc at the time)
       and the other to use it for patching (which removed all the
       per-cpu vmalloc code). Now that we use poking_init() in the
       existing per-cpu vmalloc approach, that separation doesn't work
       as nicely anymore so I just merged the two patches into one.
     * Preload the SLB entry and hash the page for the patching_addr
       when using Hash on book3s64 to avoid taking an SLB and Hash fault
       during patching. The previous implementation was a hack which
       changed current->mm to allow the SLB and Hash fault handlers to
       work with the temporary mm since both of those code-paths always
       assume mm == current->mm.
     * Also (hmm - seeing a trend here) with the book3s64 Hash MMU we
       have to manage the mm->context.active_cpus counter and mm cpumask
       since they determine (via mm_is_thread_local()) if the TLB flush
       in pte_clear() is local or not - it should always be local when
       we're using the temporary mm. On book3s64's Radix MMU we can
       just call local_flush_tlb_mm().
     * Use HPTE_USE_KERNEL_KEY on Hash to avoid costly lock/unlock of
       KUAP.
---
 arch/powerpc/lib/code-patching.c | 132 +++++++++++++++++++++++++++++--
 1 file changed, 125 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 9f2eba9b70ee4..027dabd42b8dd 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -11,6 +11,7 @@
 #include <linux/cpuhotplug.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
+#include <linux/random.h>
 
 #include <asm/tlbflush.h>
 #include <asm/page.h>
@@ -103,6 +104,7 @@ static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
 
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
 static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
+static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
 
 #if IS_BUILTIN(CONFIG_LKDTM)
 unsigned long read_cpu_patching_addr(unsigned int cpu)
@@ -133,6 +135,51 @@ static int text_area_cpu_down(unsigned int cpu)
 	return 0;
 }
 
+static __always_inline void __poking_init_temp_mm(void)
+{
+	int cpu;
+	spinlock_t *ptl; /* for protecting pte table */
+	pte_t *ptep;
+	struct mm_struct *patching_mm;
+	unsigned long patching_addr;
+
+	for_each_possible_cpu(cpu) {
+		/*
+		 * Some parts of the kernel (static keys for example) depend on
+		 * successful code patching. Code patching under
+		 * STRICT_KERNEL_RWX requires this setup - otherwise we cannot
+		 * patch at all. We use BUG_ON() here and later since an early
+		 * failure is preferred to buggy behavior and/or strange
+		 * crashes later.
+		 */
+		patching_mm = copy_init_mm();
+		BUG_ON(!patching_mm);
+
+		per_cpu(cpu_patching_mm, cpu) = patching_mm;
+
+		/*
+		 * Choose a randomized, page-aligned address from the range:
+		 * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE] The lower
+		 * address bound is PAGE_SIZE to avoid the zero-page.  The
+		 * upper address bound is DEFAULT_MAP_WINDOW - PAGE_SIZE to
+		 * stay under DEFAULT_MAP_WINDOW with the Book3s64 Hash MMU.
+		 */
+		patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK)
+				% (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
+
+		per_cpu(cpu_patching_addr, cpu) = patching_addr;
+
+		/*
+		 * PTE allocation uses GFP_KERNEL which means we need to
+		 * pre-allocate the PTE here because we cannot do the
+		 * allocation during patching when IRQs are disabled.
+		 */
+		ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
+		BUG_ON(!ptep);
+		pte_unmap_unlock(ptep, ptl);
+	}
+}
+
 /*
  * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
  * we judge it as being preferable to a kernel that will crash later when
@@ -140,6 +187,11 @@ static int text_area_cpu_down(unsigned int cpu)
  */
 void __init poking_init(void)
 {
+	if (radix_enabled()) {
+		__poking_init_temp_mm();
+		return;
+	}
+
 	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 		"powerpc/text_poke:online", text_area_cpu_up,
 		text_area_cpu_down));
@@ -213,30 +265,96 @@ static inline int unmap_patch_area(void)
 	return -EINVAL;
 }
 
+struct patch_mapping {
+	spinlock_t *ptl; /* for protecting pte table */
+	pte_t *ptep;
+	struct temp_mm temp_mm;
+};
+
+/*
+ * This can be called for kernel text or a module.
+ */
+static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
+{
+	struct page *page;
+	pte_t pte;
+	pgprot_t pgprot;
+	struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
+	unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
+
+	if (is_vmalloc_or_module_addr(addr))
+		page = vmalloc_to_page(addr);
+	else
+		page = virt_to_page(addr);
+
+	patch_mapping->ptep = get_locked_pte(patching_mm, patching_addr,
+					     &patch_mapping->ptl);
+	if (unlikely(!patch_mapping->ptep)) {
+		pr_warn("map patch: failed to allocate pte for patching\n");
+		return -1;
+	}
+
+	pgprot = PAGE_KERNEL;
+	pte = mk_pte(page, pgprot);
+	pte = pte_mkdirty(pte);
+	set_pte_at(patching_mm, patching_addr, patch_mapping->ptep, pte);
+
+	init_temp_mm(&patch_mapping->temp_mm, patching_mm);
+	use_temporary_mm(&patch_mapping->temp_mm);
+
+	return 0;
+}
+
+static void unmap_patch(struct patch_mapping *patch_mapping)
+{
+	struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
+	unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
+
+	pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
+
+	local_flush_tlb_mm(patching_mm);
+
+	pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
+
+	unuse_temporary_mm(&patch_mapping->temp_mm);
+}
+
 static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 {
 	int err, rc = 0;
 	u32 *patch_addr = NULL;
 	unsigned long flags;
+	struct patch_mapping patch_mapping;
 
 	/*
-	 * During early early boot patch_instruction is called
-	 * when text_poke_area is not ready, but we still need
-	 * to allow patching. We just do the plain old patching
+	 * During early early boot patch_instruction is called when the
+	 * patching_mm/text_poke_area is not ready, but we still need to allow
+	 * patching. We just do the plain old patching.
 	 */
-	if (!this_cpu_read(text_poke_area))
-		return raw_patch_instruction(addr, instr);
+	if (radix_enabled()) {
+		if (!this_cpu_read(cpu_patching_mm))
+			return raw_patch_instruction(addr, instr);
+	} else {
+		if (!this_cpu_read(text_poke_area))
+			return raw_patch_instruction(addr, instr);
+	}
 
 	local_irq_save(flags);
 
-	err = map_patch_area(addr);
+	if (radix_enabled())
+		err = map_patch(addr, &patch_mapping);
+	else
+		err = map_patch_area(addr);
 	if (err)
 		goto out;
 
 	patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
 	rc = __patch_instruction(addr, instr, patch_addr);
 
-	err = unmap_patch_area();
+	if (radix_enabled())
+		unmap_patch(&patch_mapping);
+	else
+		err = unmap_patch_area();
 
 out:
 	local_irq_restore(flags);
-- 
2.26.1



* [PATCH v5 8/8] lkdtm/powerpc: Fix code patching hijack test
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
                   ` (6 preceding siblings ...)
  2021-07-13  5:31 ` [PATCH v5 7/8] powerpc/64s: Initialize and use a temporary mm for patching on Radix Christopher M. Riedl
@ 2021-07-13  5:31 ` Christopher M. Riedl
  2021-08-05  9:18   ` Christophe Leroy
  2021-08-05  9:03 ` [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christophe Leroy
  8 siblings, 1 reply; 24+ messages in thread
From: Christopher M. Riedl @ 2021-07-13  5:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook, npiggin, dja, peterz

Code patching on powerpc with STRICT_KERNEL_RWX now uses a userspace
address in a temporary mm on Radix. Use __put_user() to avoid write
failures due to KUAP when attempting a "hijack" on the patching address.
__put_user() also works with the non-userspace, vmalloc-based patching
address on non-Radix MMUs.

Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
---
 drivers/misc/lkdtm/perms.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
index 41e87e5f9cc86..da6a34a0a49fb 100644
--- a/drivers/misc/lkdtm/perms.c
+++ b/drivers/misc/lkdtm/perms.c
@@ -262,16 +262,7 @@ static inline u32 lkdtm_read_patch_site(void)
 /* Returns True if the write succeeds */
 static inline bool lkdtm_try_write(u32 data, u32 *addr)
 {
-#ifdef CONFIG_PPC
-	__put_kernel_nofault(addr, &data, u32, err);
-	return true;
-
-err:
-	return false;
-#endif
-#ifdef CONFIG_X86_64
 	return !__put_user(data, addr);
-#endif
 }
 
 static int lkdtm_patching_cpu(void *data)
-- 
2.26.1



* Re: [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU
  2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
                   ` (7 preceding siblings ...)
  2021-07-13  5:31 ` [PATCH v5 8/8] lkdtm/powerpc: Fix code patching hijack test Christopher M. Riedl
@ 2021-08-05  9:03 ` Christophe Leroy
  2021-08-11 17:49   ` Christopher M. Riedl
  8 siblings, 1 reply; 24+ messages in thread
From: Christophe Leroy @ 2021-08-05  9:03 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja



On 13/07/2021 07:31, Christopher M. Riedl wrote:
> When compiled with CONFIG_STRICT_KERNEL_RWX, the kernel must create
> temporary mappings when patching itself. These mappings temporarily
> override the strict RWX text protections to permit a write. Currently,
> powerpc allocates a per-CPU VM area for patching. Patching occurs as
> follows:
> 
> 	1. Map page in per-CPU VM area w/ PAGE_KERNEL protection
> 	2. Patch text
> 	3. Remove the temporary mapping
> 
> While the VM area is per-CPU, the mapping is actually inserted into the
> kernel page tables. Presumably, this could allow another CPU to access
> the normally write-protected text - either maliciously or accidentally -
> via this same mapping if the address of the VM area is known. Ideally,
> the mapping should be kept local to the CPU doing the patching [0].
> 
> x86 introduced "temporary mm" structs which allow the creation of mappings
> local to a particular CPU [1]. This series intends to bring the notion of a
> temporary mm to powerpc's Book3s64 Radix MMU and harden it by using such a
> mapping for patching a kernel with strict RWX permissions.
> 
> The first four patches implement an LKDTM test "proof-of-concept" which
> exploits the potential vulnerability (i.e. the temporary mapping during patching
> is exposed in the kernel page tables and accessible by other CPUs) using a
> simple brute-force approach. This test is implemented for both powerpc and
> x86_64. The test passes on powerpc Radix with this new series, fails on
> upstream powerpc, passes on upstream x86_64, and fails on an older (ancient)
> x86_64 tree without the x86_64 temporary mm patches. The remaining patches add
> support for and use a temporary mm for code patching on powerpc with the Radix
> MMU.

I think the first four patches (together with the last one) are quite independent from the heart
of the series itself, which is patches 5, 6, 7. Maybe you should split this into two series?
After all, those selftests are nice to have but are not absolutely necessary; splitting would
help the series move forward, I think.

> 
> Tested boot, ftrace, and repeated LKDTM "hijack":
> 	- QEMU+KVM (host: POWER9 Blackbird): Radix MMU w/ KUAP
> 	- QEMU+KVM (host: POWER9 Blackbird): Hash MMU
> 
> Tested repeated LKDTM "hijack":
> 	- QEMU+KVM (host: AMD desktop): x86_64 upstream
> 	- QEMU+KVM (host: AMD desktop): x86_64 w/o percpu temp mm to
> 	  verify the LKDTM "hijack" test fails
> 
> Tested boot and ftrace:
> 	- QEMU+TCG: ppc44x (bamboo)
> 	- QEMU+TCG: g5 (mac99)
> 
> I also tested with various extra config options enabled as suggested in
> section 12) in Documentation/process/submit-checklist.rst.
> 
> v5:	* Only support Book3s64 Radix MMU for now. There are some issues with
> 	  the previous implementation on the Hash MMU as pointed out by Nick
> 	  Piggin. Fixing these is not trivial so we only support the Radix MMU
> 	  for now. I tried using realmode (no data translation) to patch with
> 	  Hash to at least avoid exposing the patch mapping to other CPUs but
> 	  this doesn't reliably work either since we cannot access vmalloc'd
> 	  space in realmode.

So you now accept having two different modes depending on the platform?
As far as I remember, I commented some time ago that non-SMP didn't need that feature and you were
reluctant to have two different implementations. What made you change your mind? (Just curious.)


> [...]


* Re: [PATCH v5 4/8] lkdtm/x86_64: Add test to hijack a patch mapping
  2021-07-13  5:31 ` [PATCH v5 4/8] lkdtm/x86_64: Add test to hijack a patch mapping Christopher M. Riedl
@ 2021-08-05  9:09   ` Christophe Leroy
  2021-08-11 17:53     ` Christopher M. Riedl
  0 siblings, 1 reply; 24+ messages in thread
From: Christophe Leroy @ 2021-08-05  9:09 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja



On 13/07/2021 07:31, Christopher M. Riedl wrote:
> A previous commit implemented an LKDTM test on powerpc to exploit the
> temporary mapping established when patching code with STRICT_KERNEL_RWX
> enabled. Extend the test to work on x86_64 as well.
> 
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> ---
>   drivers/misc/lkdtm/perms.c | 26 ++++++++++++++++++++++----
>   1 file changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 39e7456852229..41e87e5f9cc86 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -224,7 +224,7 @@ void lkdtm_ACCESS_NULL(void)
>   }
>   
>   #if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> -	defined(CONFIG_PPC))
> +	(defined(CONFIG_PPC) || defined(CONFIG_X86_64)))
>   /*
>    * This is just a dummy location to patch-over.
>    */
> @@ -233,12 +233,25 @@ static void patching_target(void)
>   	return;
>   }
>   
> -#include <asm/code-patching.h>
>   const u32 *patch_site = (const u32 *)&patching_target;
>   
> +#ifdef CONFIG_PPC
> +#include <asm/code-patching.h>
> +#endif
> +
> +#ifdef CONFIG_X86_64
> +#include <asm/text-patching.h>
> +#endif
> +
>   static inline int lkdtm_do_patch(u32 data)
>   {
> +#ifdef CONFIG_PPC
>   	return patch_instruction((u32 *)patch_site, ppc_inst(data));
> +#endif
> +#ifdef CONFIG_X86_64
> +	text_poke((void *)patch_site, &data, sizeof(u32));
> +	return 0;
> +#endif
>   }
>   
>   static inline u32 lkdtm_read_patch_site(void)
> @@ -249,11 +262,16 @@ static inline u32 lkdtm_read_patch_site(void)
>   /* Returns True if the write succeeds */
>   static inline bool lkdtm_try_write(u32 data, u32 *addr)
>   {
> +#ifdef CONFIG_PPC
>   	__put_kernel_nofault(addr, &data, u32, err);
>   	return true;
>   
>   err:
>   	return false;
> +#endif
> +#ifdef CONFIG_X86_64
> +	return !__put_user(data, addr);
> +#endif
>   }
>   
>   static int lkdtm_patching_cpu(void *data)
> @@ -346,8 +364,8 @@ void lkdtm_HIJACK_PATCH(void)
>   
>   void lkdtm_HIJACK_PATCH(void)
>   {
> -	if (!IS_ENABLED(CONFIG_PPC))
> -		pr_err("XFAIL: this test only runs on powerpc\n");
> +	if (!IS_ENABLED(CONFIG_PPC) && !IS_ENABLED(CONFIG_X86_64))
> +		pr_err("XFAIL: this test only runs on powerpc and x86_64\n");
>   	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
>   		pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
>   	if (!IS_BUILTIN(CONFIG_LKDTM))
> 

Instead of spreading arch-specific stuff into LKDTM, wouldn't it make sense to define a common
API? Otherwise, the day another arch like arm64 implements its own approach, do we keep adding
specific functions into LKDTM again and again?

Also, I find it odd to define tests only when they can succeed. Other tests like 
ACCESS_USERSPACE are there all the time, regardless of whether we have selected 
CONFIG_PPC_KUAP or not. I think it should be the same here: have it there all the time; if 
CONFIG_STRICT_KERNEL_RWX is selected the test succeeds, otherwise it fails, but it is always there.

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping
  2021-07-13  5:31 ` [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping Christopher M. Riedl
@ 2021-08-05  9:13   ` Christophe Leroy
  2021-08-11 17:57     ` Christopher M. Riedl
  0 siblings, 1 reply; 24+ messages in thread
From: Christophe Leroy @ 2021-08-05  9:13 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja



On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> When live patching with STRICT_KERNEL_RWX the CPU doing the patching
> must temporarily remap the page(s) containing the patch site with +W
> permissions. While this temporary mapping is in use, another CPU could
> write to the same mapping and maliciously alter kernel text. Implement a
> LKDTM test to attempt to exploit such an opening during code patching.
> The test is implemented on powerpc and requires LKDTM built into the
> kernel (building LKDTM as a module is insufficient).
> 
> The LKDTM "hijack" test works as follows:
> 
>    1. A CPU executes an infinite loop to patch an instruction. This is
>       the "patching" CPU.
>    2. Another CPU attempts to write to the address of the temporary
>       mapping used by the "patching" CPU. This other CPU is the
>       "hijacker" CPU. The hijack either fails with a fault/error or
>       succeeds, in which case some kernel text is now overwritten.
> 
> The virtual address of the temporary patch mapping is provided via an
> LKDTM-specific accessor to the hijacker CPU. This test assumes a
> hypothetical situation where this address was leaked previously.
> 
> How to run the test:
> 
> 	mount -t debugfs none /sys/kernel/debug
> 	(echo HIJACK_PATCH > /sys/kernel/debug/provoke-crash/DIRECT)
> 
> A passing test indicates that it is not possible to overwrite kernel
> text from another CPU by using the temporary mapping established by
> a CPU for patching.
> 
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> 
> ---
> 
> v5:  * Use `u32*` instead of `struct ppc_inst*` based on new series in
>         upstream.
> 
> v4:  * Separate the powerpc and x86_64 bits into individual patches.
>       * Use __put_kernel_nofault() when attempting to hijack the mapping
>       * Use raw_smp_processor_id() to avoid triggering the BUG() when
>         calling smp_processor_id() in preemptible code - the only thing
>         that matters is that one of the threads is bound to a different
>         CPU - we are not using smp_processor_id() to access any per-cpu
>         data or similar where preemption should be disabled.
>       * Rework the patching_cpu() kthread stop condition to avoid:
>         https://lwn.net/Articles/628628/
> ---
>   drivers/misc/lkdtm/core.c  |   1 +
>   drivers/misc/lkdtm/lkdtm.h |   1 +
>   drivers/misc/lkdtm/perms.c | 134 +++++++++++++++++++++++++++++++++++++
>   3 files changed, 136 insertions(+)
> 
> diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
> index 8024b6a5cc7fc..fbcb95eda337b 100644
> --- a/drivers/misc/lkdtm/core.c
> +++ b/drivers/misc/lkdtm/core.c
> @@ -147,6 +147,7 @@ static const struct crashtype crashtypes[] = {
>   	CRASHTYPE(WRITE_RO),
>   	CRASHTYPE(WRITE_RO_AFTER_INIT),
>   	CRASHTYPE(WRITE_KERN),
> +	CRASHTYPE(HIJACK_PATCH),
>   	CRASHTYPE(REFCOUNT_INC_OVERFLOW),
>   	CRASHTYPE(REFCOUNT_ADD_OVERFLOW),
>   	CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW),
> diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
> index 99f90d3e5e9cb..87e7e6136d962 100644
> --- a/drivers/misc/lkdtm/lkdtm.h
> +++ b/drivers/misc/lkdtm/lkdtm.h
> @@ -62,6 +62,7 @@ void lkdtm_EXEC_USERSPACE(void);
>   void lkdtm_EXEC_NULL(void);
>   void lkdtm_ACCESS_USERSPACE(void);
>   void lkdtm_ACCESS_NULL(void);
> +void lkdtm_HIJACK_PATCH(void);
>   
>   /* refcount.c */
>   void lkdtm_REFCOUNT_INC_OVERFLOW(void);
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 2dede2ef658f3..39e7456852229 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -9,6 +9,7 @@
>   #include <linux/vmalloc.h>
>   #include <linux/mman.h>
>   #include <linux/uaccess.h>
> +#include <linux/kthread.h>
>   #include <asm/cacheflush.h>
>   
>   /* Whether or not to fill the target memory area with do_nothing(). */
> @@ -222,6 +223,139 @@ void lkdtm_ACCESS_NULL(void)
>   	pr_err("FAIL: survived bad write\n");
>   }
>   
> +#if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> +	defined(CONFIG_PPC))


I think this test shouldn't be limited to CONFIG_PPC and shouldn't be limited to 
CONFIG_STRICT_KERNEL_RWX. It should be there all the time.

Also, why limit it to IS_BUILTIN(CONFIG_LKDTM)?

> +/*
> + * This is just a dummy location to patch-over.
> + */
> +static void patching_target(void)
> +{
> +	return;
> +}
> +
> +#include <asm/code-patching.h>
> +const u32 *patch_site = (const u32 *)&patching_target;
> +
> +static inline int lkdtm_do_patch(u32 data)
> +{
> +	return patch_instruction((u32 *)patch_site, ppc_inst(data));
> +}
> +
> +static inline u32 lkdtm_read_patch_site(void)
> +{
> +	return READ_ONCE(*patch_site);
> +}
> +
> +/* Returns True if the write succeeds */
> +static inline bool lkdtm_try_write(u32 data, u32 *addr)
> +{
> +	__put_kernel_nofault(addr, &data, u32, err);
> +	return true;
> +
> +err:
> +	return false;
> +}
> +
> +static int lkdtm_patching_cpu(void *data)
> +{
> +	int err = 0;
> +	u32 val = 0xdeadbeef;
> +
> +	pr_info("starting patching_cpu=%d\n", raw_smp_processor_id());
> +
> +	do {
> +		err = lkdtm_do_patch(val);
> +	} while (lkdtm_read_patch_site() == val && !err && !kthread_should_stop());
> +
> +	if (err)
> +		pr_warn("XFAIL: patch_instruction returned error: %d\n", err);
> +
> +	while (!kthread_should_stop()) {
> +		set_current_state(TASK_INTERRUPTIBLE);
> +		schedule();
> +	}
> +
> +	return err;
> +}
> +
> +void lkdtm_HIJACK_PATCH(void)
> +{
> +	struct task_struct *patching_kthrd;
> +	int patching_cpu, hijacker_cpu, attempts;
> +	unsigned long addr;
> +	bool hijacked;
> +	const u32 bad_data = 0xbad00bad;
> +	const u32 original_insn = lkdtm_read_patch_site();
> +
> +	if (!IS_ENABLED(CONFIG_SMP)) {
> +		pr_err("XFAIL: this test requires CONFIG_SMP\n");
> +		return;
> +	}
> +
> +	if (num_online_cpus() < 2) {
> +		pr_warn("XFAIL: this test requires at least two cpus\n");
> +		return;
> +	}
> +
> +	hijacker_cpu = raw_smp_processor_id();
> +	patching_cpu = cpumask_any_but(cpu_online_mask, hijacker_cpu);
> +
> +	patching_kthrd = kthread_create_on_node(&lkdtm_patching_cpu, NULL,
> +						cpu_to_node(patching_cpu),
> +						"lkdtm_patching_cpu");
> +	kthread_bind(patching_kthrd, patching_cpu);
> +	wake_up_process(patching_kthrd);
> +
> +	addr = offset_in_page(patch_site) | read_cpu_patching_addr(patching_cpu);
> +
> +	pr_info("starting hijacker_cpu=%d\n", hijacker_cpu);
> +	for (attempts = 0; attempts < 100000; ++attempts) {
> +		/* Try to write to the other CPU's temp patch mapping */
> +		hijacked = lkdtm_try_write(bad_data, (u32 *)addr);
> +
> +		if (hijacked) {
> +			if (kthread_stop(patching_kthrd)) {
> +				pr_info("hijack attempts: %d\n", attempts);
> +				pr_err("XFAIL: error stopping patching cpu\n");
> +				return;
> +			}
> +			break;
> +		}
> +	}
> +	pr_info("hijack attempts: %d\n", attempts);
> +
> +	if (hijacked) {
> +		if (lkdtm_read_patch_site() == bad_data)
> +			pr_err("overwrote kernel text\n");
> +		/*
> +		 * There are window conditions where the hijacker cpu manages to
> +		 * write to the patch site but the site gets overwritten again by
> +		 * the patching cpu. We still consider that a "successful" hijack
> +		 * since the hijacker cpu did not fault on the write.
> +		 */
> +		pr_err("FAIL: wrote to another cpu's patching area\n");
> +	} else {
> +		kthread_stop(patching_kthrd);
> +	}
> +
> +	/* Restore the original data to be able to run the test again */
> +	lkdtm_do_patch(original_insn);
> +}
> +
> +#else
> +
> +void lkdtm_HIJACK_PATCH(void)
> +{
> +	if (!IS_ENABLED(CONFIG_PPC))
> +		pr_err("XFAIL: this test only runs on powerpc\n");
> +	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> +		pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
> +	if (!IS_BUILTIN(CONFIG_LKDTM))
> +		pr_err("XFAIL: this test requires CONFIG_LKDTM=y (not =m!)\n");
> +}
> +
> +#endif
> +
>   void __init lkdtm_perms_init(void)
>   {
>   	/* Make sure we can write to __ro_after_init values during __init */
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 8/8] lkdtm/powerpc: Fix code patching hijack test
  2021-07-13  5:31 ` [PATCH v5 8/8] lkdtm/powerpc: Fix code patching hijack test Christopher M. Riedl
@ 2021-08-05  9:18   ` Christophe Leroy
  2021-08-11 17:57     ` Christopher M. Riedl
  0 siblings, 1 reply; 24+ messages in thread
From: Christophe Leroy @ 2021-08-05  9:18 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja



On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> Code patching on powerpc with STRICT_KERNEL_RWX uses a userspace
> address in a temporary mm on Radix now. Use __put_user() to avoid write
> failures due to KUAP when attempting a "hijack" on the patching address.
> __put_user() also works with the non-userspace, vmalloc-based patching
> address on non-Radix MMUs.

It is not really clean to use __put_user() on a non-user address, although it works by chance.

I think it would be better to do something like

	if (is_kernel_addr(addr))
		copy_to_kernel_nofault(...);
	else
		copy_to_user_nofault(...);
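
For illustration, lkdtm_try_write() could then look something like this (just a sketch of the 
suggestion above, untested; note that is_kernel_addr() is a powerpc helper, so a generic check 
would be needed if this stays in common LKDTM code):

	/* Returns true if the write succeeds */
	static inline bool lkdtm_try_write(u32 data, u32 *addr)
	{
		if (is_kernel_addr((unsigned long)addr))
			return !copy_to_kernel_nofault(addr, &data, sizeof(data));

		return !copy_to_user_nofault((void __user *)addr, &data, sizeof(data));
	}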



> 
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> ---
>   drivers/misc/lkdtm/perms.c | 9 ---------
>   1 file changed, 9 deletions(-)
> 
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 41e87e5f9cc86..da6a34a0a49fb 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -262,16 +262,7 @@ static inline u32 lkdtm_read_patch_site(void)
>   /* Returns True if the write succeeds */
>   static inline bool lkdtm_try_write(u32 data, u32 *addr)
>   {
> -#ifdef CONFIG_PPC
> -	__put_kernel_nofault(addr, &data, u32, err);
> -	return true;
> -
> -err:
> -	return false;
> -#endif
> -#ifdef CONFIG_X86_64
>   	return !__put_user(data, addr);
> -#endif
>   }
>   
>   static int lkdtm_patching_cpu(void *data)
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 5/8] powerpc/64s: Introduce temporary mm for Radix MMU
  2021-07-13  5:31 ` [PATCH v5 5/8] powerpc/64s: Introduce temporary mm for Radix MMU Christopher M. Riedl
@ 2021-08-05  9:27   ` Christophe Leroy
  2021-08-11 18:02     ` Christopher M. Riedl
  0 siblings, 1 reply; 24+ messages in thread
From: Christophe Leroy @ 2021-08-05  9:27 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja



On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> x86 supports the notion of a temporary mm which restricts access to
> temporary PTEs to a single CPU. A temporary mm is useful for situations
> where a CPU needs to perform sensitive operations (such as patching a
> STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
> said mappings to other CPUs. Another benefit is that other CPU TLBs do
> not need to be flushed when the temporary mm is torn down.
> 
> Mappings in the temporary mm can be set in the userspace portion of the
> address-space.
> 
> Interrupts must be disabled while the temporary mm is in use. HW
> breakpoints, which may have been set by userspace as watchpoints on
> addresses now within the temporary mm, are saved and disabled when
> loading the temporary mm. The HW breakpoints are restored when unloading
> the temporary mm. All HW breakpoints are indiscriminately disabled while
> the temporary mm is in use.

Can you explain more about that breakpoint stuff? Why is it a special case here at all? Isn't it 
the same when you switch from one user task to another one? The x86 commit doesn't say anything 
about breakpoints.

> 
> Based on x86 implementation:
> 
> commit cefa929c034e
> ("x86/mm: Introduce temporary mm structs")
> 
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> 
> ---
> 
> v5:  * Drop support for using a temporary mm on Book3s64 Hash MMU.
> 
> v4:  * Pass the prev mm instead of NULL to switch_mm_irqs_off() when
>         using/unusing the temp mm as suggested by Jann Horn to keep
>         the context.active counter in-sync on mm/nohash.
>       * Disable SLB preload in the temporary mm when initializing the
>         temp_mm struct.
>       * Include asm/debug.h header to fix build issue with
>         ppc44x_defconfig.
> ---
>   arch/powerpc/include/asm/debug.h |  1 +
>   arch/powerpc/kernel/process.c    |  5 +++
>   arch/powerpc/lib/code-patching.c | 56 ++++++++++++++++++++++++++++++++
>   3 files changed, 62 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
> index 86a14736c76c3..dfd82635ea8b3 100644
> --- a/arch/powerpc/include/asm/debug.h
> +++ b/arch/powerpc/include/asm/debug.h
> @@ -46,6 +46,7 @@ static inline int debugger_fault_handler(struct pt_regs *regs) { return 0; }
>   #endif
>   
>   void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk);
> +void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk);
>   bool ppc_breakpoint_available(void);
>   #ifdef CONFIG_PPC_ADV_DEBUG_REGS
>   extern void do_send_trap(struct pt_regs *regs, unsigned long address,
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 185beb2905801..a0776200772e8 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -865,6 +865,11 @@ static inline int set_breakpoint_8xx(struct arch_hw_breakpoint *brk)
>   	return 0;
>   }
>   
> +void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> +{
> +	memcpy(brk, this_cpu_ptr(&current_brk[nr]), sizeof(*brk));
> +}
> +
>   void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk)
>   {
>   	memcpy(this_cpu_ptr(&current_brk[nr]), brk, sizeof(*brk));
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 54b6157d44e95..3122d8e4cc013 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -17,6 +17,9 @@
>   #include <asm/code-patching.h>
>   #include <asm/setup.h>
>   #include <asm/inst.h>
> +#include <asm/mmu_context.h>
> +#include <asm/debug.h>
> +#include <asm/tlb.h>
>   
>   static int __patch_instruction(u32 *exec_addr, struct ppc_inst instr, u32 *patch_addr)
>   {
> @@ -45,6 +48,59 @@ int raw_patch_instruction(u32 *addr, struct ppc_inst instr)
>   }
>   
>   #ifdef CONFIG_STRICT_KERNEL_RWX
> +
> +struct temp_mm {
> +	struct mm_struct *temp;
> +	struct mm_struct *prev;
> +	struct arch_hw_breakpoint brk[HBP_NUM_MAX];
> +};
> +
> +static inline void init_temp_mm(struct temp_mm *temp_mm, struct mm_struct *mm)
> +{
> +	/* We currently only support temporary mm on the Book3s64 Radix MMU */
> +	WARN_ON(!radix_enabled());
> +
> +	temp_mm->temp = mm;
> +	temp_mm->prev = NULL;
> +	memset(&temp_mm->brk, 0, sizeof(temp_mm->brk));
> +}
> +
> +static inline void use_temporary_mm(struct temp_mm *temp_mm)
> +{
> +	lockdep_assert_irqs_disabled();
> +
> +	temp_mm->prev = current->active_mm;
> +	switch_mm_irqs_off(temp_mm->prev, temp_mm->temp, current);
> +
> +	WARN_ON(!mm_is_thread_local(temp_mm->temp));
> +
> +	if (ppc_breakpoint_available()) {
> +		struct arch_hw_breakpoint null_brk = {0};
> +		int i = 0;
> +
> +		for (; i < nr_wp_slots(); ++i) {
> +			__get_breakpoint(i, &temp_mm->brk[i]);
> +			if (temp_mm->brk[i].type != 0)
> +				__set_breakpoint(i, &null_brk);
> +		}
> +	}
> +}
> +
> +static inline void unuse_temporary_mm(struct temp_mm *temp_mm)

Not sure about the naming.

Maybe start_using_temp_mm() and stop_using_temp_mm() would be more explicit.


> +{
> +	lockdep_assert_irqs_disabled();
> +
> +	switch_mm_irqs_off(temp_mm->temp, temp_mm->prev, current);
> +
> +	if (ppc_breakpoint_available()) {
> +		int i = 0;
> +
> +		for (; i < nr_wp_slots(); ++i)
> +			if (temp_mm->brk[i].type != 0)
> +				__set_breakpoint(i, &temp_mm->brk[i]);
> +	}
> +}
> +
>   static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>   
>   #if IS_BUILTIN(CONFIG_LKDTM)
> 

You'll probably get a bisection hazard with those unused 'static inline' functions in a .c file 
because that patch alone probably fails to build.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/8] powerpc: Rework and improve STRICT_KERNEL_RWX patching
  2021-07-13  5:31 ` [PATCH v5 6/8] powerpc: Rework and improve STRICT_KERNEL_RWX patching Christopher M. Riedl
@ 2021-08-05  9:34   ` Christophe Leroy
  2021-08-11 18:10     ` Christopher M. Riedl
  0 siblings, 1 reply; 24+ messages in thread
From: Christophe Leroy @ 2021-08-05  9:34 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja



On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> Rework code-patching with STRICT_KERNEL_RWX to prepare for the next
> patch which uses a temporary mm for patching under the Book3s64 Radix
> MMU. Make improvements by adding a WARN_ON when the patchsite doesn't
> match after patching and return the error from __patch_instruction()
> properly.
> 
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> 
> ---
> 
> v5:  * New to series.
> ---
>   arch/powerpc/lib/code-patching.c | 51 +++++++++++++++++---------------
>   1 file changed, 27 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 3122d8e4cc013..9f2eba9b70ee4 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -102,11 +102,12 @@ static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
>   }
>   
>   static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> +static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
>   
>   #if IS_BUILTIN(CONFIG_LKDTM)
>   unsigned long read_cpu_patching_addr(unsigned int cpu)
>   {
> -	return (unsigned long)(per_cpu(text_poke_area, cpu))->addr;
> +	return per_cpu(cpu_patching_addr, cpu);
>   }
>   #endif
>   
> @@ -121,6 +122,7 @@ static int text_area_cpu_up(unsigned int cpu)
>   		return -1;
>   	}
>   	this_cpu_write(text_poke_area, area);
> +	this_cpu_write(cpu_patching_addr, (unsigned long)area->addr);
>   
>   	return 0;
>   }
> @@ -146,7 +148,7 @@ void __init poking_init(void)
>   /*
>    * This can be called for kernel text or a module.
>    */
> -static int map_patch_area(void *addr, unsigned long text_poke_addr)
> +static int map_patch_area(void *addr)
>   {
>   	unsigned long pfn;
>   	int err;
> @@ -156,17 +158,20 @@ static int map_patch_area(void *addr, unsigned long text_poke_addr)
>   	else
>   		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
>   
> -	err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
> +	err = map_kernel_page(__this_cpu_read(cpu_patching_addr),
> +			      (pfn << PAGE_SHIFT), PAGE_KERNEL);
>   
> -	pr_devel("Mapped addr %lx with pfn %lx:%d\n", text_poke_addr, pfn, err);
> +	pr_devel("Mapped addr %lx with pfn %lx:%d\n",
> +		 __this_cpu_read(cpu_patching_addr), pfn, err);
>   	if (err)
>   		return -1;
>   
>   	return 0;
>   }
>   
> -static inline int unmap_patch_area(unsigned long addr)
> +static inline int unmap_patch_area(void)
>   {
> +	unsigned long addr = __this_cpu_read(cpu_patching_addr);
>   	pte_t *ptep;
>   	pmd_t *pmdp;
>   	pud_t *pudp;
> @@ -175,23 +180,23 @@ static inline int unmap_patch_area(unsigned long addr)
>   
>   	pgdp = pgd_offset_k(addr);
>   	if (unlikely(!pgdp))
> -		return -EINVAL;
> +		goto out_err;
>   
>   	p4dp = p4d_offset(pgdp, addr);
>   	if (unlikely(!p4dp))
> -		return -EINVAL;
> +		goto out_err;
>   
>   	pudp = pud_offset(p4dp, addr);
>   	if (unlikely(!pudp))
> -		return -EINVAL;
> +		goto out_err;
>   
>   	pmdp = pmd_offset(pudp, addr);
>   	if (unlikely(!pmdp))
> -		return -EINVAL;
> +		goto out_err;
>   
>   	ptep = pte_offset_kernel(pmdp, addr);
>   	if (unlikely(!ptep))
> -		return -EINVAL;
> +		goto out_err;
>   
>   	pr_devel("clearing mm %p, pte %p, addr %lx\n", &init_mm, ptep, addr);
>   
> @@ -202,15 +207,17 @@ static inline int unmap_patch_area(unsigned long addr)
>   	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>   
>   	return 0;
> +
> +out_err:
> +	pr_warn("failed to unmap %lx\n", addr);
> +	return -EINVAL;

Can you keep that in the caller of unmap_patch_area() instead of all this goto stuff?
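
I.e. keep unmap_patch_area() returning -EINVAL directly and leave the printing to 
do_patch_instruction(), something like (a sketch):

	err = unmap_patch_area();
	if (err)
		pr_warn("failed to unmap %lx\n", __this_cpu_read(cpu_patching_addr));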

>   }
>   
>   static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
>   {
> -	int err;
> +	int err, rc = 0;
>   	u32 *patch_addr = NULL;
>   	unsigned long flags;
> -	unsigned long text_poke_addr;
> -	unsigned long kaddr = (unsigned long)addr;
>   
>   	/*
>   	 * During early early boot patch_instruction is called
> @@ -222,24 +229,20 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
>   
>   	local_irq_save(flags);
>   
> -	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
> -	if (map_patch_area(addr, text_poke_addr)) {
> -		err = -1;
> +	err = map_patch_area(addr);
> +	if (err)
>   		goto out;
> -	}
> -
> -	patch_addr = (u32 *)(text_poke_addr + (kaddr & ~PAGE_MASK));
>   
> -	__patch_instruction(addr, instr, patch_addr);
> +	patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
> +	rc = __patch_instruction(addr, instr, patch_addr);
>   
> -	err = unmap_patch_area(text_poke_addr);
> -	if (err)
> -		pr_warn("failed to unmap %lx\n", text_poke_addr);
> +	err = unmap_patch_area();
>   
>   out:
>   	local_irq_restore(flags);
> +	WARN_ON(!ppc_inst_equal(ppc_inst_read(addr), instr));

Why add that WARN_ON()? What could make that happen that is worth a WARN_ON()?

Patching is quite a critical fast path; I'm not sure we can afford too many checks during 
patching, we want it quick first and foremost.

>   
> -	return err;
> +	return rc ? rc : err;
>   }
>   #else /* !CONFIG_STRICT_KERNEL_RWX */
>   
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 7/8] powerpc/64s: Initialize and use a temporary mm for patching on Radix
  2021-07-13  5:31 ` [PATCH v5 7/8] powerpc/64s: Initialize and use a temporary mm for patching on Radix Christopher M. Riedl
@ 2021-08-05  9:48   ` Christophe Leroy
  2021-08-11 18:28     ` Christopher M. Riedl
  0 siblings, 1 reply; 24+ messages in thread
From: Christophe Leroy @ 2021-08-05  9:48 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja



On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> When code patching a STRICT_KERNEL_RWX kernel the page containing the
> address to be patched is temporarily mapped as writeable. Currently, a
> per-cpu vmalloc patch area is used for this purpose. While the patch
> area is per-cpu, the temporary page mapping is inserted into the kernel
> page tables for the duration of patching. The mapping is exposed to CPUs
> other than the patching CPU - this is undesirable from a hardening
> perspective. Use a temporary mm instead which keeps the mapping local to
> the CPU doing the patching.
> 
> Use the `poking_init` init hook to prepare a temporary mm and patching
> address. Initialize the temporary mm by copying the init mm. Choose a
> randomized patching address inside the temporary mm userspace address
> space. The patching address is randomized between PAGE_SIZE and
> DEFAULT_MAP_WINDOW-PAGE_SIZE.
> 
> Bits of entropy with 64K page size on BOOK3S_64:
> 
>          bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
> 
>          PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
>          bits of entropy = log2(128TB / 64K)
> 	bits of entropy = 31
> 
> The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
> operates - by default the space above DEFAULT_MAP_WINDOW is not
> available. Currently the Hash MMU does not use a temporary mm so
> technically this upper limit isn't necessary; however, a larger
> randomization range does not further "harden" this overall approach and
> future work may introduce patching with a temporary mm on Hash as well.
> 
> Randomization occurs only once during initialization at boot for each
> possible CPU in the system.
> 
> Introduce two new functions, map_patch() and unmap_patch(), to
> respectively create and remove the temporary mapping with write
> permissions at patching_addr. Map the page with PAGE_KERNEL to set
> EAA[0] for the PTE which ignores the AMR (so no need to unlock/lock
> KUAP) according to PowerISA v3.0b Figure 35 on Radix.
> 
> Based on x86 implementation:
> 
> commit 4fc19708b165
> ("x86/alternatives: Initialize temporary mm for patching")
> 
> and:
> 
> commit b3fd8e83ada0
> ("x86/alternatives: Use temporary mm for text poking")
> 
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> 
> ---
> 
> v5:  * Only support Book3s64 Radix MMU for now.
>       * Use a per-cpu datastructure to hold the patching_addr and
>         patching_mm to avoid the need for a synchronization lock/mutex.
> 
> v4:  * In the previous series this was two separate patches: one to init
>         the temporary mm in poking_init() (unused in powerpc at the time)
>         and the other to use it for patching (which removed all the
>         per-cpu vmalloc code). Now that we use poking_init() in the
>         existing per-cpu vmalloc approach, that separation doesn't work
>         as nicely anymore so I just merged the two patches into one.
>       * Preload the SLB entry and hash the page for the patching_addr
>         when using Hash on book3s64 to avoid taking an SLB and Hash fault
>         during patching. The previous implementation was a hack which
>         changed current->mm to allow the SLB and Hash fault handlers to
>         work with the temporary mm since both of those code-paths always
>         assume mm == current->mm.
>       * Also (hmm - seeing a trend here) with the book3s64 Hash MMU we
>         have to manage the mm->context.active_cpus counter and mm cpumask
>         since they determine (via mm_is_thread_local()) if the TLB flush
>         in pte_clear() is local or not - it should always be local when
>         we're using the temporary mm. On book3s64's Radix MMU we can
>         just call local_flush_tlb_mm().
>       * Use HPTE_USE_KERNEL_KEY on Hash to avoid costly lock/unlock of
>         KUAP.
> ---
>   arch/powerpc/lib/code-patching.c | 132 +++++++++++++++++++++++++++++--
>   1 file changed, 125 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 9f2eba9b70ee4..027dabd42b8dd 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -11,6 +11,7 @@
>   #include <linux/cpuhotplug.h>
>   #include <linux/slab.h>
>   #include <linux/uaccess.h>
> +#include <linux/random.h>
>   
>   #include <asm/tlbflush.h>
>   #include <asm/page.h>
> @@ -103,6 +104,7 @@ static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
>   
>   static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>   static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
> +static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
>   
>   #if IS_BUILTIN(CONFIG_LKDTM)
>   unsigned long read_cpu_patching_addr(unsigned int cpu)
> @@ -133,6 +135,51 @@ static int text_area_cpu_down(unsigned int cpu)
>   	return 0;
>   }
>   
> +static __always_inline void __poking_init_temp_mm(void)
> +{
> +	int cpu;
> +	spinlock_t *ptl; /* for protecting pte table */
> +	pte_t *ptep;
> +	struct mm_struct *patching_mm;
> +	unsigned long patching_addr;
> +
> +	for_each_possible_cpu(cpu) {
> +		/*
> +		 * Some parts of the kernel (static keys for example) depend on
> +		 * successful code patching. Code patching under
> +		 * STRICT_KERNEL_RWX requires this setup - otherwise we cannot
> +		 * patch at all. We use BUG_ON() here and later since an early
> +		 * failure is preferred to buggy behavior and/or strange
> +		 * crashes later.
> +		 */
> +		patching_mm = copy_init_mm();
> +		BUG_ON(!patching_mm);

Read https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on and 
https://github.com/linuxppc/issues/issues/88

Avoid BUG_ON()s, thanks.
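
For example, something like this (a sketch; the exact fallback behaviour is up to you) at least 
warns and bails out instead of killing the machine:

	patching_mm = copy_init_mm();
	if (WARN_ON(!patching_mm))
		return;	/* cpu_patching_mm stays NULL, so do_patch_instruction()
			 * will fall back to raw_patch_instruction() */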

> +
> +		per_cpu(cpu_patching_mm, cpu) = patching_mm;
> +
> +		/*
> +		 * Choose a randomized, page-aligned address from the range:
> +		 * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE] The lower
> +		 * address bound is PAGE_SIZE to avoid the zero-page.  The
> +		 * upper address bound is DEFAULT_MAP_WINDOW - PAGE_SIZE to
> +		 * stay under DEFAULT_MAP_WINDOW with the Book3s64 Hash MMU.
> +		 */
> +		patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK)
> +				% (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));

The % should be at the end of the first line, and the second line's alignment should match the 
open parenthesis on the first line.
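
I.e. something like (whitespace change only):

	patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK) %
				     (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));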

> +
> +		per_cpu(cpu_patching_addr, cpu) = patching_addr;
> +
> +		/*
> +		 * PTE allocation uses GFP_KERNEL which means we need to
> +		 * pre-allocate the PTE here because we cannot do the
> +		 * allocation during patching when IRQs are disabled.
> +		 */
> +		ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> +		BUG_ON(!ptep);

Avoid BUG_ON() please


> +		pte_unmap_unlock(ptep, ptl);
> +	}
> +}
> +
>   /*
>    * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
>    * we judge it as being preferable to a kernel that will crash later when
> @@ -140,6 +187,11 @@ static int text_area_cpu_down(unsigned int cpu)
>    */
>   void __init poking_init(void)
>   {
> +	if (radix_enabled()) {
> +		__poking_init_temp_mm();
> +		return;
> +	}
> +
>   	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>   		"powerpc/text_poke:online", text_area_cpu_up,
>   		text_area_cpu_down));
> @@ -213,30 +265,96 @@ static inline int unmap_patch_area(void)
>   	return -EINVAL;
>   }
>   
> +struct patch_mapping {
> +	spinlock_t *ptl; /* for protecting pte table */
> +	pte_t *ptep;
> +	struct temp_mm temp_mm;
> +};
> +
> +/*
> + * This can be called for kernel text or a module.
> + */
> +static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
> +{
> +	struct page *page;
> +	pte_t pte;
> +	pgprot_t pgprot;
> +	struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> +	unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> +
> +	if (is_vmalloc_or_module_addr(addr))
> +		page = vmalloc_to_page(addr);
> +	else
> +		page = virt_to_page(addr);
> +
> +	patch_mapping->ptep = get_locked_pte(patching_mm, patching_addr,
> +					     &patch_mapping->ptl);

Not sure you need to split this line; checkpatch now allows 100 chars per line.


> +	if (unlikely(!patch_mapping->ptep)) {
> +		pr_warn("map patch: failed to allocate pte for patching\n");

That's a lot better than all the BUG_ON()s above.


> +		return -1;
> +	}
> +
> +	pgprot = PAGE_KERNEL;
> +	pte = mk_pte(page, pgprot);
> +	pte = pte_mkdirty(pte);

I'm sure you can do

	pte = pte_mkdirty(mk_pte(page, PAGE_KERNEL));

And indeed PAGE_KERNEL already includes _PAGE_DIRTY, so all you should need is

	pte = mk_pte(page, PAGE_KERNEL);

Or even just

	set_pte_at(patching_mm, patching_addr, patch_mapping->ptep, mk_pte(page, PAGE_KERNEL));


> +	set_pte_at(patching_mm, patching_addr, patch_mapping->ptep, pte);
> +
> +	init_temp_mm(&patch_mapping->temp_mm, patching_mm);
> +	use_temporary_mm(&patch_mapping->temp_mm);
> +
> +	return 0;
> +}
> +
> +static void unmap_patch(struct patch_mapping *patch_mapping)
> +{
> +	struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> +	unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> +
> +	pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
> +
> +	local_flush_tlb_mm(patching_mm);
> +
> +	pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
> +
> +	unuse_temporary_mm(&patch_mapping->temp_mm);

Shouldn't you stop using it before unmapping/unlocking it?
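
I.e. something like (a sketch, untested):

	static void unmap_patch(struct patch_mapping *patch_mapping)
	{
		struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
		unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);

		/* Switch back to the previous mm first... */
		unuse_temporary_mm(&patch_mapping->temp_mm);

		/* ...then tear down the temporary mapping. */
		pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
		local_flush_tlb_mm(patching_mm);
		pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
	}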


> +}
> +
>   static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
>   {
>   	int err, rc = 0;
>   	u32 *patch_addr = NULL;
>   	unsigned long flags;
> +	struct patch_mapping patch_mapping;
>   
>   	/*
> -	 * During early early boot patch_instruction is called
> -	 * when text_poke_area is not ready, but we still need
> -	 * to allow patching. We just do the plain old patching
> +	 * During early early boot patch_instruction is called when the
> +	 * patching_mm/text_poke_area is not ready, but we still need to allow
> +	 * patching. We just do the plain old patching.
>   	 */
> -	if (!this_cpu_read(text_poke_area))
> -		return raw_patch_instruction(addr, instr);
> +	if (radix_enabled()) {
> +		if (!this_cpu_read(cpu_patching_mm))
> +			return raw_patch_instruction(addr, instr);
> +	} else {
> +		if (!this_cpu_read(text_poke_area))
> +			return raw_patch_instruction(addr, instr);
> +	}
>   
>   	local_irq_save(flags);
>   
> -	err = map_patch_area(addr);
> +	if (radix_enabled())
> +		err = map_patch(addr, &patch_mapping);

Maybe call it map_patch_mm() or map_patch_mapping()?

> +	else
> +		err = map_patch_area(addr);
>   	if (err)
>   		goto out;
>   
>   	patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
>   	rc = __patch_instruction(addr, instr, patch_addr);
>   
> -	err = unmap_patch_area();
> +	if (radix_enabled())
> +		unmap_patch(&patch_mapping);

No err? It would be better if it could return something, although always 0.

And same comment about naming.
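
For the err part, e.g. (a sketch, with unmap_patch() changed to return an int, always 0 for now):

	if (radix_enabled())
		err = unmap_patch(&patch_mapping);
	else
		err = unmap_patch_area();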

> +	else
> +		err = unmap_patch_area();
>   
>   out:
>   	local_irq_restore(flags);
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU
  2021-08-05  9:03 ` [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christophe Leroy
@ 2021-08-11 17:49   ` Christopher M. Riedl
  0 siblings, 0 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-08-11 17:49 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja

On Thu Aug 5, 2021 at 4:03 AM CDT, Christophe Leroy wrote:
>
>
> On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> > When compiled with CONFIG_STRICT_KERNEL_RWX, the kernel must create
> > temporary mappings when patching itself. These mappings temporarily
> > override the strict RWX text protections to permit a write. Currently,
> > powerpc allocates a per-CPU VM area for patching. Patching occurs as
> > follows:
> > 
> > 	1. Map page in per-CPU VM area w/ PAGE_KERNEL protection
> > 	2. Patch text
> > 	3. Remove the temporary mapping
> > 
> > While the VM area is per-CPU, the mapping is actually inserted into the
> > kernel page tables. Presumably, this could allow another CPU to access
> > the normally write-protected text - either malicously or accidentally -
> > via this same mapping if the address of the VM area is known. Ideally,
> > the mapping should be kept local to the CPU doing the patching [0].
> > 
> > x86 introduced "temporary mm" structs which allow the creation of mappings
> > local to a particular CPU [1]. This series intends to bring the notion of a
> > temporary mm to powerpc's Book3s64 Radix MMU and harden it by using such a
> > mapping for patching a kernel with strict RWX permissions.
> > 
> > The first four patches implement an LKDTM test "proof-of-concept" which
> > exploits the potential vulnerability (ie. the temporary mapping during patching
> > is exposed in the kernel page tables and accessible by other CPUs) using a
> > simple brute-force approach. This test is implemented for both powerpc and
> > x86_64. The test passes on powerpc Radix with this new series, fails on
> > upstream powerpc, passes on upstream x86_64, and fails on an older (ancient)
> > x86_64 tree without the x86_64 temporary mm patches. The remaining patches add
> > support for and use a temporary mm for code patching on powerpc with the Radix
> > MMU.
>
> I think the first four patches (together with the last one) are quite
> independent from the heart of the series itself, which is patches 5, 6
> and 7. Maybe you should split that series in two? After all, those
> selftests are nice to have but are not absolutely necessary; that would
> help getting this moving forward, I think.
>

Hmm, you're probably right. The selftest at least proves there is a
potential attack, which I think is necessary for any hardening-related
series/patch. I'll split the series into separate powerpc temp-mm and
LKDTM series for the next spin.

> > 
> > Tested boot, ftrace, and repeated LKDTM "hijack":
> > 	- QEMU+KVM (host: POWER9 Blackbird): Radix MMU w/ KUAP
> > 	- QEMU+KVM (host: POWER9 Blackbird): Hash MMU
> > 
> > Tested repeated LKDTM "hijack":
> > 	- QEMU+KVM (host: AMD desktop): x86_64 upstream
> > 	- QEMU+KVM (host: AMD desktop): x86_64 w/o percpu temp mm to
> > 	  verify the LKDTM "hijack" test fails
> > 
> > Tested boot and ftrace:
> > 	- QEMU+TCG: ppc44x (bamboo)
> > 	- QEMU+TCG: g5 (mac99)
> > 
> > I also tested with various extra config options enabled as suggested in
> > section 12) in Documentation/process/submit-checklist.rst.
> > 
> > v5:	* Only support Book3s64 Radix MMU for now. There are some issues with
> > 	  the previous implementation on the Hash MMU as pointed out by Nick
> > 	  Piggin. Fixing these is not trivial so we only support the Radix MMU
> > 	  for now. I tried using realmode (no data translation) to patch with
> > 	  Hash to at least avoid exposing the patch mapping to other CPUs but
> > 	  this doesn't reliably work either since we cannot access vmalloc'd
> > 	  space in realmode.
>
> So you now accept having two different modes depending on the platform?

By necessity yes.

> As far as I remember, I commented some time ago that non-SMP didn't
> need that feature and you were reluctant to have two different
> implementations. What made you change your mind? (Just curious.)
>

The book3s64 hash mmu support is a pain ;) Supporting both the temp-mm
and vmalloc implementations turned out to be relatively simple - I
initially thought this would be messier. For now we will support both;
however, in the future I'd still like to implement the percpu temp-mm
support for the Hash MMU as well. I suppose we could re-evaluate then if
we want/need both implementations (I know you're in favor of keeping the
vmalloc-based approach for performance reasons on non-SMP).

>
> > 	* Use percpu variables for the patching_mm and patching_addr. This
> > 	  avoids the need for synchronization mechanisms entirely. Thanks to
> > 	  Peter Zijlstra for pointing out text_mutex which unfortunately didn't
> > 	  work out without larger code changes in powerpc. Also thanks to Daniel
> > 	  Axtens for comments about using percpu variables for the *percpu* temp
> > 	  mm things off list.
> > 
> > v4:	* It's time to revisit this series again since @jpn and @mpe fixed
> > 	  our known STRICT_*_RWX bugs on powerpc/64s.
> > 	* Rebase on linuxppc/next:
> >            commit ee1bc694fbaec ("powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n")
> > 	* Completely rework how map_patch() works on book3s64 Hash MMU
> > 	* Split the LKDTM x86_64 and powerpc bits into separate patches
> > 	* Annotate commit messages with changes from v3 instead of
> > 	  listing them here completely out-of context...
> > 
> > v3:	* Rebase on linuxppc/next: commit 9123e3a74ec7 ("Linux 5.9-rc1")
> > 	* Move temporary mm implementation into code-patching.c where it
> > 	  belongs
> > 	* Implement LKDTM hijacker test on x86_64 (on IBM time oof) Do
> > 	* not use address zero for the patching address in the
> > 	  temporary mm (thanks @dja for pointing this out!)
> > 	* Wrap the LKDTM test w/ CONFIG_SMP as suggested by Christophe
> > 	  Leroy
> > 	* Comments to clarify PTE pre-allocation and patching addr
> > 	  selection
> > 
> > v2:	* Rebase on linuxppc/next:
> > 	  commit 105fb38124a4 ("powerpc/8xx: Modify ptep_get()")
> > 	* Always dirty pte when mapping patch
> > 	* Use `ppc_inst_len` instead of `sizeof` on instructions
> > 	* Declare LKDTM patching addr accessor in header where it belongs	
> > 
> > v1:	* Rebase on linuxppc/next (4336b9337824)
> > 	* Save and restore second hw watchpoint
> > 	* Use new ppc_inst_* functions for patching check and in LKDTM test
> > 
> > rfc-v2:	* Many fixes and improvements mostly based on extensive feedback
> >            and testing by Christophe Leroy (thanks!).
> > 	* Make patching_mm and patching_addr static and move
> > 	  '__ro_after_init' to after the variable name (more common in
> > 	  other parts of the kernel)
> > 	* Use 'asm/debug.h' header instead of 'asm/hw_breakpoint.h' to
> > 	  fix PPC64e compile
> > 	* Add comment explaining why we use BUG_ON() during the init
> > 	  call to setup for patching later
> > 	* Move ptep into patch_mapping to avoid walking page tables a
> > 	  second time when unmapping the temporary mapping
> > 	* Use KUAP under non-radix, also manually dirty the PTE for patch
> > 	  mapping on non-BOOK3S_64 platforms
> > 	* Properly return any error from __patch_instruction
> >          * Do not use 'memcmp' where a simple comparison is appropriate
> > 	* Simplify expression for patch address by removing pointer maths
> > 	* Add LKDTM test
> > 
> > [0]: https://github.com/linuxppc/issues/issues/224
> > [1]: https://lore.kernel.org/kernel-hardening/20190426232303.28381-1-nadav.amit@gmail.com/
> > 
> > Christopher M. Riedl (8):
> >    powerpc: Add LKDTM accessor for patching addr
> >    lkdtm/powerpc: Add test to hijack a patch mapping
> >    x86_64: Add LKDTM accessor for patching addr
> >    lkdtm/x86_64: Add test to hijack a patch mapping
> >    powerpc/64s: Introduce temporary mm for Radix MMU
> >    powerpc: Rework and improve STRICT_KERNEL_RWX patching
> >    powerpc/64s: Initialize and use a temporary mm for patching on Radix
> >    lkdtm/powerpc: Fix code patching hijack test
> > 
> >   arch/powerpc/include/asm/code-patching.h |   4 +
> >   arch/powerpc/include/asm/debug.h         |   1 +
> >   arch/powerpc/kernel/process.c            |   5 +
> >   arch/powerpc/lib/code-patching.c         | 240 ++++++++++++++++++++---
> >   arch/x86/include/asm/text-patching.h     |   4 +
> >   arch/x86/kernel/alternative.c            |   7 +
> >   drivers/misc/lkdtm/core.c                |   1 +
> >   drivers/misc/lkdtm/lkdtm.h               |   1 +
> >   drivers/misc/lkdtm/perms.c               | 143 ++++++++++++++
> >   9 files changed, 378 insertions(+), 28 deletions(-)
> > 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/8] lkdtm/x86_64: Add test to hijack a patch mapping
  2021-08-05  9:09   ` Christophe Leroy
@ 2021-08-11 17:53     ` Christopher M. Riedl
  0 siblings, 0 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-08-11 17:53 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja

On Thu Aug 5, 2021 at 4:09 AM CDT, Christophe Leroy wrote:
>
>
> On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> > A previous commit implemented an LKDTM test on powerpc to exploit the
> > temporary mapping established when patching code with STRICT_KERNEL_RWX
> > enabled. Extend the test to work on x86_64 as well.
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> > ---
> >   drivers/misc/lkdtm/perms.c | 26 ++++++++++++++++++++++----
> >   1 file changed, 22 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> > index 39e7456852229..41e87e5f9cc86 100644
> > --- a/drivers/misc/lkdtm/perms.c
> > +++ b/drivers/misc/lkdtm/perms.c
> > @@ -224,7 +224,7 @@ void lkdtm_ACCESS_NULL(void)
> >   }
> >   
> >   #if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> > -	defined(CONFIG_PPC))
> > +	(defined(CONFIG_PPC) || defined(CONFIG_X86_64)))
> >   /*
> >    * This is just a dummy location to patch-over.
> >    */
> > @@ -233,12 +233,25 @@ static void patching_target(void)
> >   	return;
> >   }
> >   
> > -#include <asm/code-patching.h>
> >   const u32 *patch_site = (const u32 *)&patching_target;
> >   
> > +#ifdef CONFIG_PPC
> > +#include <asm/code-patching.h>
> > +#endif
> > +
> > +#ifdef CONFIG_X86_64
> > +#include <asm/text-patching.h>
> > +#endif
> > +
> >   static inline int lkdtm_do_patch(u32 data)
> >   {
> > +#ifdef CONFIG_PPC
> >   	return patch_instruction((u32 *)patch_site, ppc_inst(data));
> > +#endif
> > +#ifdef CONFIG_X86_64
> > +	text_poke((void *)patch_site, &data, sizeof(u32));
> > +	return 0;
> > +#endif
> >   }
> >   
> >   static inline u32 lkdtm_read_patch_site(void)
> > @@ -249,11 +262,16 @@ static inline u32 lkdtm_read_patch_site(void)
> >   /* Returns True if the write succeeds */
> >   static inline bool lkdtm_try_write(u32 data, u32 *addr)
> >   {
> > +#ifdef CONFIG_PPC
> >   	__put_kernel_nofault(addr, &data, u32, err);
> >   	return true;
> >   
> >   err:
> >   	return false;
> > +#endif
> > +#ifdef CONFIG_X86_64
> > +	return !__put_user(data, addr);
> > +#endif
> >   }
> >   
> >   static int lkdtm_patching_cpu(void *data)
> > @@ -346,8 +364,8 @@ void lkdtm_HIJACK_PATCH(void)
> >   
> >   void lkdtm_HIJACK_PATCH(void)
> >   {
> > -	if (!IS_ENABLED(CONFIG_PPC))
> > -		pr_err("XFAIL: this test only runs on powerpc\n");
> > +	if (!IS_ENABLED(CONFIG_PPC) && !IS_ENABLED(CONFIG_X86_64))
> > +		pr_err("XFAIL: this test only runs on powerpc and x86_64\n");
> >   	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> >   		pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
> >   	if (!IS_BUILTIN(CONFIG_LKDTM))
> > 
>
> Instead of spreading arch-specific stuff into LKDTM, wouldn't it make
> sense to define a common API? Because the day another arch like arm64
> implements its own approach, do we add specific functions again and
> again into LKDTM?

Hmm, a common patch/poke kernel API is probably out of scope for this
series? I do agree though; since you suggested splitting the series,
maybe that's something I can add along with the LKDTM patches.
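
For example, something like this (the names here are made up, just to
illustrate the idea):

	/* Hypothetical arch-provided hooks for the LKDTM patching test: */
	int lkdtm_arch_patch_insn(u32 *site, u32 data);
	unsigned long lkdtm_arch_patching_addr(unsigned int cpu);

Each arch that supports the test would implement both, and the LKDTM
code itself would stay arch-agnostic.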

>
> Also, I find it odd to define tests only when they can succeed. Other
> tests like ACCESS_USERSPACE are there all the time, regardless of
> whether we have selected CONFIG_PPC_KUAP or not. I think it should be
> the same here: have it there all the time; if CONFIG_STRICT_KERNEL_RWX
> is selected the test succeeds, otherwise it fails, but it is always
> there.

I followed the approach in lkdtm_DOUBLE_FAULT and others in
drivers/misc/lkdtm/bugs.c. I suppose it doesn't hurt to always build the
test irrespective of CONFIG_STRICT_KERNEL_RWX.

>
> Christophe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping
  2021-08-05  9:13   ` Christophe Leroy
@ 2021-08-11 17:57     ` Christopher M. Riedl
  2021-08-11 18:07       ` Kees Cook
  0 siblings, 1 reply; 24+ messages in thread
From: Christopher M. Riedl @ 2021-08-11 17:57 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja

On Thu Aug 5, 2021 at 4:13 AM CDT, Christophe Leroy wrote:
>
>
> On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> > When live patching with STRICT_KERNEL_RWX the CPU doing the patching
> > must temporarily remap the page(s) containing the patch site with +W
> > permissions. While this temporary mapping is in use, another CPU could
> > write to the same mapping and maliciously alter kernel text. Implement a
> > LKDTM test to attempt to exploit such an opening during code patching.
> > The test is implemented on powerpc and requires LKDTM built into the
> > kernel (building LKDTM as a module is insufficient).
> > 
> > The LKDTM "hijack" test works as follows:
> > 
> >    1. A CPU executes an infinite loop to patch an instruction. This is
> >       the "patching" CPU.
> >    2. Another CPU attempts to write to the address of the temporary
> >       mapping used by the "patching" CPU. This other CPU is the
> >       "hijacker" CPU. The hijack either fails with a fault/error or
> >       succeeds, in which case some kernel text is now overwritten.
> > 
> > The virtual address of the temporary patch mapping is provided via an
> > LKDTM-specific accessor to the hijacker CPU. This test assumes a
> > hypothetical situation where this address was leaked previously.
> > 
> > How to run the test:
> > 
> > 	mount -t debugfs none /sys/kernel/debug
> > 	(echo HIJACK_PATCH > /sys/kernel/debug/provoke-crash/DIRECT)
> > 
> > A passing test indicates that it is not possible to overwrite kernel
> > text from another CPU by using the temporary mapping established by
> > a CPU for patching.
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> > 
> > ---
> > 
> > v5:  * Use `u32*` instead of `struct ppc_inst*` based on new series in
> >         upstream.
> > 
> > v4:  * Separate the powerpc and x86_64 bits into individual patches.
> >       * Use __put_kernel_nofault() when attempting to hijack the mapping
> >       * Use raw_smp_processor_id() to avoid triggering the BUG() when
> >         calling smp_processor_id() in preemptible code - the only thing
> >         that matters is that one of the threads is bound to a different
> >         CPU - we are not using smp_processor_id() to access any per-cpu
> >         data or similar where preemption should be disabled.
> >       * Rework the patching_cpu() kthread stop condition to avoid:
> >         https://lwn.net/Articles/628628/
> > ---
> >   drivers/misc/lkdtm/core.c  |   1 +
> >   drivers/misc/lkdtm/lkdtm.h |   1 +
> >   drivers/misc/lkdtm/perms.c | 134 +++++++++++++++++++++++++++++++++++++
> >   3 files changed, 136 insertions(+)
> > 
> > diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
> > index 8024b6a5cc7fc..fbcb95eda337b 100644
> > --- a/drivers/misc/lkdtm/core.c
> > +++ b/drivers/misc/lkdtm/core.c
> > @@ -147,6 +147,7 @@ static const struct crashtype crashtypes[] = {
> >   	CRASHTYPE(WRITE_RO),
> >   	CRASHTYPE(WRITE_RO_AFTER_INIT),
> >   	CRASHTYPE(WRITE_KERN),
> > +	CRASHTYPE(HIJACK_PATCH),
> >   	CRASHTYPE(REFCOUNT_INC_OVERFLOW),
> >   	CRASHTYPE(REFCOUNT_ADD_OVERFLOW),
> >   	CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW),
> > diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
> > index 99f90d3e5e9cb..87e7e6136d962 100644
> > --- a/drivers/misc/lkdtm/lkdtm.h
> > +++ b/drivers/misc/lkdtm/lkdtm.h
> > @@ -62,6 +62,7 @@ void lkdtm_EXEC_USERSPACE(void);
> >   void lkdtm_EXEC_NULL(void);
> >   void lkdtm_ACCESS_USERSPACE(void);
> >   void lkdtm_ACCESS_NULL(void);
> > +void lkdtm_HIJACK_PATCH(void);
> >   
> >   /* refcount.c */
> >   void lkdtm_REFCOUNT_INC_OVERFLOW(void);
> > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> > index 2dede2ef658f3..39e7456852229 100644
> > --- a/drivers/misc/lkdtm/perms.c
> > +++ b/drivers/misc/lkdtm/perms.c
> > @@ -9,6 +9,7 @@
> >   #include <linux/vmalloc.h>
> >   #include <linux/mman.h>
> >   #include <linux/uaccess.h>
> > +#include <linux/kthread.h>
> >   #include <asm/cacheflush.h>
> >   
> >   /* Whether or not to fill the target memory area with do_nothing(). */
> > @@ -222,6 +223,139 @@ void lkdtm_ACCESS_NULL(void)
> >   	pr_err("FAIL: survived bad write\n");
> >   }
> >   
> > +#if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> > +	defined(CONFIG_PPC))
>
>
> I think this test shouldn't be limited to CONFIG_PPC and shouldn't be
> limited to
> CONFIG_STRICT_KERNEL_RWX. It should be there all the time.
>
> Also, why limit it to IS_BUILTIN(CONFIG_LKDTM)?
>

The test needs read_cpu_patching_addr(), which definitely cannot be
exposed outside of the kernel itself (i.e. LKDTM must be built-in, not
a module).

> > +/*
> > + * This is just a dummy location to patch-over.
> > + */
> > +static void patching_target(void)
> > +{
> > +	return;
> > +}
> > +
> > +#include <asm/code-patching.h>
> > +const u32 *patch_site = (const u32 *)&patching_target;
> > +
> > +static inline int lkdtm_do_patch(u32 data)
> > +{
> > +	return patch_instruction((u32 *)patch_site, ppc_inst(data));
> > +}
> > +
> > +static inline u32 lkdtm_read_patch_site(void)
> > +{
> > +	return READ_ONCE(*patch_site);
> > +}
> > +
> > +/* Returns True if the write succeeds */
> > +static inline bool lkdtm_try_write(u32 data, u32 *addr)
> > +{
> > +	__put_kernel_nofault(addr, &data, u32, err);
> > +	return true;
> > +
> > +err:
> > +	return false;
> > +}
> > +
> > +static int lkdtm_patching_cpu(void *data)
> > +{
> > +	int err = 0;
> > +	u32 val = 0xdeadbeef;
> > +
> > +	pr_info("starting patching_cpu=%d\n", raw_smp_processor_id());
> > +
> > +	do {
> > +		err = lkdtm_do_patch(val);
> > +	} while (lkdtm_read_patch_site() == val && !err && !kthread_should_stop());
> > +
> > +	if (err)
> > +		pr_warn("XFAIL: patch_instruction returned error: %d\n", err);
> > +
> > +	while (!kthread_should_stop()) {
> > +		set_current_state(TASK_INTERRUPTIBLE);
> > +		schedule();
> > +	}
> > +
> > +	return err;
> > +}
> > +
> > +void lkdtm_HIJACK_PATCH(void)
> > +{
> > +	struct task_struct *patching_kthrd;
> > +	int patching_cpu, hijacker_cpu, attempts;
> > +	unsigned long addr;
> > +	bool hijacked;
> > +	const u32 bad_data = 0xbad00bad;
> > +	const u32 original_insn = lkdtm_read_patch_site();
> > +
> > +	if (!IS_ENABLED(CONFIG_SMP)) {
> > +		pr_err("XFAIL: this test requires CONFIG_SMP\n");
> > +		return;
> > +	}
> > +
> > +	if (num_online_cpus() < 2) {
> > +		pr_warn("XFAIL: this test requires at least two cpus\n");
> > +		return;
> > +	}
> > +
> > +	hijacker_cpu = raw_smp_processor_id();
> > +	patching_cpu = cpumask_any_but(cpu_online_mask, hijacker_cpu);
> > +
> > +	patching_kthrd = kthread_create_on_node(&lkdtm_patching_cpu, NULL,
> > +						cpu_to_node(patching_cpu),
> > +						"lkdtm_patching_cpu");
> > +	kthread_bind(patching_kthrd, patching_cpu);
> > +	wake_up_process(patching_kthrd);
> > +
> > +	addr = offset_in_page(patch_site) | read_cpu_patching_addr(patching_cpu);
> > +
> > +	pr_info("starting hijacker_cpu=%d\n", hijacker_cpu);
> > +	for (attempts = 0; attempts < 100000; ++attempts) {
> > +		/* Try to write to the other CPU's temp patch mapping */
> > +		hijacked = lkdtm_try_write(bad_data, (u32 *)addr);
> > +
> > +		if (hijacked) {
> > +			if (kthread_stop(patching_kthrd)) {
> > +				pr_info("hijack attempts: %d\n", attempts);
> > +				pr_err("XFAIL: error stopping patching cpu\n");
> > +				return;
> > +			}
> > +			break;
> > +		}
> > +	}
> > +	pr_info("hijack attempts: %d\n", attempts);
> > +
> > +	if (hijacked) {
> > +		if (lkdtm_read_patch_site() == bad_data)
> > +			pr_err("overwrote kernel text\n");
> > +		/*
> > +		 * There are window conditions where the hijacker cpu manages to
> > +		 * write to the patch site but the site gets overwritten again by
> > +		 * the patching cpu. We still consider that a "successful" hijack
> > +		 * since the hijacker cpu did not fault on the write.
> > +		 */
> > +		pr_err("FAIL: wrote to another cpu's patching area\n");
> > +	} else {
> > +		kthread_stop(patching_kthrd);
> > +	}
> > +
> > +	/* Restore the original data to be able to run the test again */
> > +	lkdtm_do_patch(original_insn);
> > +}
> > +
> > +#else
> > +
> > +void lkdtm_HIJACK_PATCH(void)
> > +{
> > +	if (!IS_ENABLED(CONFIG_PPC))
> > +		pr_err("XFAIL: this test only runs on powerpc\n");
> > +	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> > +		pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
> > +	if (!IS_BUILTIN(CONFIG_LKDTM))
> > +		pr_err("XFAIL: this test requires CONFIG_LKDTM=y (not =m!)\n");
> > +}
> > +
> > +#endif
> > +
> >   void __init lkdtm_perms_init(void)
> >   {
> >   	/* Make sure we can write to __ro_after_init values during __init */
> > 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 8/8] lkdtm/powerpc: Fix code patching hijack test
  2021-08-05  9:18   ` Christophe Leroy
@ 2021-08-11 17:57     ` Christopher M. Riedl
  0 siblings, 0 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-08-11 17:57 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja

On Thu Aug 5, 2021 at 4:18 AM CDT, Christophe Leroy wrote:
>
>
> On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> > Code patching on powerpc with STRICT_KERNEL_RWX uses a userspace
> > address in a temporary mm on Radix now. Use __put_user() to avoid write
> > failures due to KUAP when attempting a "hijack" on the patching address.
> > __put_user() also works with the non-userspace, vmalloc-based patching
> > address on non-Radix MMUs.
>
> It is not really clean to use __put_user() on a non-user address,
> although it works by chance.
>
> I think it would be better to do something like
>
> if (is_kernel_addr(addr))
> copy_to_kernel_nofault(...);
> else
> copy_to_user_nofault(...);
>

Yes that looks much better. I'll pick this up and try it for the next
spin. Thanks!
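
Untested, but I imagine the reworked helper ends up roughly like this
(sketch only; is_kernel_addr() is the powerpc check you suggested, other
arches would need an equivalent test):

	/* Returns true if the write succeeds */
	static inline bool lkdtm_try_write(u32 data, u32 *addr)
	{
		if (is_kernel_addr((unsigned long)addr))
			return !copy_to_kernel_nofault(addr, &data, sizeof(data));

		return !copy_to_user_nofault((void __user *)addr, &data, sizeof(data));
	}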

>
>
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> > ---
> >   drivers/misc/lkdtm/perms.c | 9 ---------
> >   1 file changed, 9 deletions(-)
> > 
> > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> > index 41e87e5f9cc86..da6a34a0a49fb 100644
> > --- a/drivers/misc/lkdtm/perms.c
> > +++ b/drivers/misc/lkdtm/perms.c
> > @@ -262,16 +262,7 @@ static inline u32 lkdtm_read_patch_site(void)
> >   /* Returns True if the write succeeds */
> >   static inline bool lkdtm_try_write(u32 data, u32 *addr)
> >   {
> > -#ifdef CONFIG_PPC
> > -	__put_kernel_nofault(addr, &data, u32, err);
> > -	return true;
> > -
> > -err:
> > -	return false;
> > -#endif
> > -#ifdef CONFIG_X86_64
> >   	return !__put_user(data, addr);
> > -#endif
> >   }
> >   
> >   static int lkdtm_patching_cpu(void *data)
> > 



* Re: [PATCH v5 5/8] powerpc/64s: Introduce temporary mm for Radix MMU
  2021-08-05  9:27   ` Christophe Leroy
@ 2021-08-11 18:02     ` Christopher M. Riedl
  0 siblings, 0 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-08-11 18:02 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja

On Thu Aug 5, 2021 at 4:27 AM CDT, Christophe Leroy wrote:
>
>
> On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> > x86 supports the notion of a temporary mm which restricts access to
> > temporary PTEs to a single CPU. A temporary mm is useful for situations
> > where a CPU needs to perform sensitive operations (such as patching a
> > STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
> > said mappings to other CPUs. Another benefit is that other CPU TLBs do
> > not need to be flushed when the temporary mm is torn down.
> > 
> > Mappings in the temporary mm can be set in the userspace portion of the
> > address-space.
> > 
> > Interrupts must be disabled while the temporary mm is in use. HW
> > breakpoints, which may have been set by userspace as watchpoints on
> > addresses now within the temporary mm, are saved and disabled when
> > loading the temporary mm. The HW breakpoints are restored when unloading
> > the temporary mm. All HW breakpoints are indiscriminately disabled while
> > the temporary mm is in use.
>
> Can you explain more about that breakpoint stuff? Why is it a special
> case here at all? Isn't it the same when you switch from one user task
> to another one? The x86 commit doesn't say anything about breakpoints.
>

We do not check if the breakpoint is on a kernel address (perf can do
this IIUC) and just disable all of them. I had to dig, but x86 has a
comment with their implementation at arch/x86/kernel/alternative.c:743.

I can reword that part of the commit message if it's unclear.

> > 
> > Based on x86 implementation:
> > 
> > commit cefa929c034e
> > ("x86/mm: Introduce temporary mm structs")
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> > 
> > ---
> > 
> > v5:  * Drop support for using a temporary mm on Book3s64 Hash MMU.
> > 
> > v4:  * Pass the prev mm instead of NULL to switch_mm_irqs_off() when
> >         using/unusing the temp mm as suggested by Jann Horn to keep
> >         the context.active counter in-sync on mm/nohash.
> >       * Disable SLB preload in the temporary mm when initializing the
> >         temp_mm struct.
> >       * Include asm/debug.h header to fix build issue with
> >         ppc44x_defconfig.
> > ---
> >   arch/powerpc/include/asm/debug.h |  1 +
> >   arch/powerpc/kernel/process.c    |  5 +++
> >   arch/powerpc/lib/code-patching.c | 56 ++++++++++++++++++++++++++++++++
> >   3 files changed, 62 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
> > index 86a14736c76c3..dfd82635ea8b3 100644
> > --- a/arch/powerpc/include/asm/debug.h
> > +++ b/arch/powerpc/include/asm/debug.h
> > @@ -46,6 +46,7 @@ static inline int debugger_fault_handler(struct pt_regs *regs) { return 0; }
> >   #endif
> >   
> >   void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk);
> > +void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk);
> >   bool ppc_breakpoint_available(void);
> >   #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> >   extern void do_send_trap(struct pt_regs *regs, unsigned long address,
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 185beb2905801..a0776200772e8 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -865,6 +865,11 @@ static inline int set_breakpoint_8xx(struct arch_hw_breakpoint *brk)
> >   	return 0;
> >   }
> >   
> > +void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> > +{
> > +	memcpy(brk, this_cpu_ptr(&current_brk[nr]), sizeof(*brk));
> > +}
> > +
> >   void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> >   {
> >   	memcpy(this_cpu_ptr(&current_brk[nr]), brk, sizeof(*brk));
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index 54b6157d44e95..3122d8e4cc013 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -17,6 +17,9 @@
> >   #include <asm/code-patching.h>
> >   #include <asm/setup.h>
> >   #include <asm/inst.h>
> > +#include <asm/mmu_context.h>
> > +#include <asm/debug.h>
> > +#include <asm/tlb.h>
> >   
> >   static int __patch_instruction(u32 *exec_addr, struct ppc_inst instr, u32 *patch_addr)
> >   {
> > @@ -45,6 +48,59 @@ int raw_patch_instruction(u32 *addr, struct ppc_inst instr)
> >   }
> >   
> >   #ifdef CONFIG_STRICT_KERNEL_RWX
> > +
> > +struct temp_mm {
> > +	struct mm_struct *temp;
> > +	struct mm_struct *prev;
> > +	struct arch_hw_breakpoint brk[HBP_NUM_MAX];
> > +};
> > +
> > +static inline void init_temp_mm(struct temp_mm *temp_mm, struct mm_struct *mm)
> > +{
> > +	/* We currently only support temporary mm on the Book3s64 Radix MMU */
> > +	WARN_ON(!radix_enabled());
> > +
> > +	temp_mm->temp = mm;
> > +	temp_mm->prev = NULL;
> > +	memset(&temp_mm->brk, 0, sizeof(temp_mm->brk));
> > +}
> > +
> > +static inline void use_temporary_mm(struct temp_mm *temp_mm)
> > +{
> > +	lockdep_assert_irqs_disabled();
> > +
> > +	temp_mm->prev = current->active_mm;
> > +	switch_mm_irqs_off(temp_mm->prev, temp_mm->temp, current);
> > +
> > +	WARN_ON(!mm_is_thread_local(temp_mm->temp));
> > +
> > +	if (ppc_breakpoint_available()) {
> > +		struct arch_hw_breakpoint null_brk = {0};
> > +		int i = 0;
> > +
> > +		for (; i < nr_wp_slots(); ++i) {
> > +			__get_breakpoint(i, &temp_mm->brk[i]);
> > +			if (temp_mm->brk[i].type != 0)
> > +				__set_breakpoint(i, &null_brk);
> > +		}
> > +	}
> > +}
> > +
> > +static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
>
> Not sure about the naming.
>
> Maybe start_using_temp_mm() and stop_using_temp_mm() would be more
> explicit.
>

Hehe I think we've discussed this before - naming things is hard :) I'll
take your suggestions for the next spin.

>
> > +{
> > +	lockdep_assert_irqs_disabled();
> > +
> > +	switch_mm_irqs_off(temp_mm->temp, temp_mm->prev, current);
> > +
> > +	if (ppc_breakpoint_available()) {
> > +		int i = 0;
> > +
> > +		for (; i < nr_wp_slots(); ++i)
> > +			if (temp_mm->brk[i].type != 0)
> > +				__set_breakpoint(i, &temp_mm->brk[i]);
> > +	}
> > +}
> > +
> >   static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> >   
> >   #if IS_BUILTIN(CONFIG_LKDTM)
> > 
>
> You'll probably get a bisecting hazard with those unused 'static inline'
> functions in a .c file, because that patch alone probably fails to build.

I just built the patch without any issue. The compiler only complains
about unused 'static' (non-inline) functions, right?
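
For example (hypothetical snippet; with -Wunused-function, which -Wall
enables, gcc warns about f() but not g()):

	/* warning: 'f' defined but not used */
	static int f(void) { return 0; }

	/* no warning for the unused static inline */
	static inline int g(void) { return 0; }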


* Re: [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping
  2021-08-11 17:57     ` Christopher M. Riedl
@ 2021-08-11 18:07       ` Kees Cook
  0 siblings, 0 replies; 24+ messages in thread
From: Kees Cook @ 2021-08-11 18:07 UTC (permalink / raw)
  To: Christopher M. Riedl
  Cc: Christophe Leroy, linuxppc-dev, peterz, x86, npiggin,
	linux-hardening, tglx, dja

On Wed, Aug 11, 2021 at 12:57:00PM -0500, Christopher M. Riedl wrote:
> On Thu Aug 5, 2021 at 4:13 AM CDT, Christophe Leroy wrote:
> >
> >
> > On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> > > When live patching with STRICT_KERNEL_RWX the CPU doing the patching
> > > must temporarily remap the page(s) containing the patch site with +W
> > > permissions. While this temporary mapping is in use, another CPU could
> > > write to the same mapping and maliciously alter kernel text. Implement a
> > > LKDTM test to attempt to exploit such an opening during code patching.
> > > The test is implemented on powerpc and requires LKDTM built into the
> > > kernel (building LKDTM as a module is insufficient).
> > > 
> > > The LKDTM "hijack" test works as follows:
> > > 
> > >    1. A CPU executes an infinite loop to patch an instruction. This is
> > >       the "patching" CPU.
> > >    2. Another CPU attempts to write to the address of the temporary
> > >       mapping used by the "patching" CPU. This other CPU is the
> > >       "hijacker" CPU. The hijack either fails with a fault/error or
> > >       succeeds, in which case some kernel text is now overwritten.
> > > [...]
> > > +#if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> > > +	defined(CONFIG_PPC))
> >
> > I think this test shouldn't be limited to CONFIG_PPC and shouldn't be
> > limited to CONFIG_STRICT_KERNEL_RWX. It should be there all the time.

Agreed: if the machinery exists to provide this defense on even one
arch/config/whatever combo, I'd like LKDTM to test for it. This lets us
compare defenses across different combinations more easily, and means
folks must answer questions like "why doesn't $combination provide
$defense?"

> > Also, why limit it to IS_BUILTIN(CONFIG_LKDTM)?
> 
> The test needs read_cpu_patching_addr() which definitely cannot be
> exposed outside of the kernel (i.e. it must be built-in).

FWIW, I'm okay with this. There isn't a solution that feels entirely
"right", so it's either a build-time requirement like this, or an
exception for modules like this:

arch/x86/kernel/cpu/common.c:#if IS_MODULE(CONFIG_LKDTM)
arch/x86/kernel/cpu/common.c-EXPORT_SYMBOL_GPL(native_write_cr4);
arch/x86/kernel/cpu/common.c-#endif

I think neither is great. Another idea is maybe using a name-spaced
export, like:

EXPORT_SYMBOL_NS_GPL(native_write_cr4, LKDTM);

But that still means it gets exposed to malicious discovery, so probably
not.
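
(For completeness: a module wanting that symbol would also need to
declare the import explicitly, something like:

	MODULE_IMPORT_NS(LKDTM);

so the namespace only guards against accidental use, not deliberate
discovery.)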

I suspect the best is to just do the BUILTIN check, since building LKDTM
as a module on a _production_ kernel is rare if it exists at all. The
only downside is needing to completely reboot to perform updated tests,
but then, I frequently find myself breaking the kernel badly on bad
tests, so I have to reboot anyway. ;)

-Kees

-- 
Kees Cook


* Re: [PATCH v5 6/8] powerpc: Rework and improve STRICT_KERNEL_RWX patching
  2021-08-05  9:34   ` Christophe Leroy
@ 2021-08-11 18:10     ` Christopher M. Riedl
  0 siblings, 0 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-08-11 18:10 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja

On Thu Aug 5, 2021 at 4:34 AM CDT, Christophe Leroy wrote:
>
>
> On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> > Rework code-patching with STRICT_KERNEL_RWX to prepare for the next
> > patch which uses a temporary mm for patching under the Book3s64 Radix
> > MMU. Make improvements by adding a WARN_ON when the patch site doesn't
> > match after patching and by returning the error from __patch_instruction()
> > properly.
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> > 
> > ---
> > 
> > v5:  * New to series.
> > ---
> >   arch/powerpc/lib/code-patching.c | 51 +++++++++++++++++---------------
> >   1 file changed, 27 insertions(+), 24 deletions(-)
> > 
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index 3122d8e4cc013..9f2eba9b70ee4 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -102,11 +102,12 @@ static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
> >   }
> >   
> >   static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> > +static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
> >   
> >   #if IS_BUILTIN(CONFIG_LKDTM)
> >   unsigned long read_cpu_patching_addr(unsigned int cpu)
> >   {
> > -	return (unsigned long)(per_cpu(text_poke_area, cpu))->addr;
> > +	return per_cpu(cpu_patching_addr, cpu);
> >   }
> >   #endif
> >   
> > @@ -121,6 +122,7 @@ static int text_area_cpu_up(unsigned int cpu)
> >   		return -1;
> >   	}
> >   	this_cpu_write(text_poke_area, area);
> > +	this_cpu_write(cpu_patching_addr, (unsigned long)area->addr);
> >   
> >   	return 0;
> >   }
> > @@ -146,7 +148,7 @@ void __init poking_init(void)
> >   /*
> >    * This can be called for kernel text or a module.
> >    */
> > -static int map_patch_area(void *addr, unsigned long text_poke_addr)
> > +static int map_patch_area(void *addr)
> >   {
> >   	unsigned long pfn;
> >   	int err;
> > @@ -156,17 +158,20 @@ static int map_patch_area(void *addr, unsigned long text_poke_addr)
> >   	else
> >   		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
> >   
> > -	err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
> > +	err = map_kernel_page(__this_cpu_read(cpu_patching_addr),
> > +			      (pfn << PAGE_SHIFT), PAGE_KERNEL);
> >   
> > -	pr_devel("Mapped addr %lx with pfn %lx:%d\n", text_poke_addr, pfn, err);
> > +	pr_devel("Mapped addr %lx with pfn %lx:%d\n",
> > +		 __this_cpu_read(cpu_patching_addr), pfn, err);
> >   	if (err)
> >   		return -1;
> >   
> >   	return 0;
> >   }
> >   
> > -static inline int unmap_patch_area(unsigned long addr)
> > +static inline int unmap_patch_area(void)
> >   {
> > +	unsigned long addr = __this_cpu_read(cpu_patching_addr);
> >   	pte_t *ptep;
> >   	pmd_t *pmdp;
> >   	pud_t *pudp;
> > @@ -175,23 +180,23 @@ static inline int unmap_patch_area(unsigned long addr)
> >   
> >   	pgdp = pgd_offset_k(addr);
> >   	if (unlikely(!pgdp))
> > -		return -EINVAL;
> > +		goto out_err;
> >   
> >   	p4dp = p4d_offset(pgdp, addr);
> >   	if (unlikely(!p4dp))
> > -		return -EINVAL;
> > +		goto out_err;
> >   
> >   	pudp = pud_offset(p4dp, addr);
> >   	if (unlikely(!pudp))
> > -		return -EINVAL;
> > +		goto out_err;
> >   
> >   	pmdp = pmd_offset(pudp, addr);
> >   	if (unlikely(!pmdp))
> > -		return -EINVAL;
> > +		goto out_err;
> >   
> >   	ptep = pte_offset_kernel(pmdp, addr);
> >   	if (unlikely(!ptep))
> > -		return -EINVAL;
> > +		goto out_err;
> >   
> >   	pr_devel("clearing mm %p, pte %p, addr %lx\n", &init_mm, ptep, addr);
> >   
> > @@ -202,15 +207,17 @@ static inline int unmap_patch_area(unsigned long addr)
> >   	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> >   
> >   	return 0;
> > +
> > +out_err:
> > +	pr_warn("failed to unmap %lx\n", addr);
> > +	return -EINVAL;
>
> Can you keep that in the caller of unmap_patch_area() instead of all
> those goto stuff ?
>

Yeah I think that's fair. I'll do this in the next spin.
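
I.e. the caller would do something like (untested sketch):

	err = unmap_patch_area();
	if (err)
		pr_warn("failed to unmap %lx\n", __this_cpu_read(cpu_patching_addr));

and unmap_patch_area() just returns -EINVAL directly from each check.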

> >   }
> >   
> >   static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
> >   {
> > -	int err;
> > +	int err, rc = 0;
> >   	u32 *patch_addr = NULL;
> >   	unsigned long flags;
> > -	unsigned long text_poke_addr;
> > -	unsigned long kaddr = (unsigned long)addr;
> >   
> >   	/*
> >   	 * During early early boot patch_instruction is called
> > @@ -222,24 +229,20 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
> >   
> >   	local_irq_save(flags);
> >   
> > -	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
> > -	if (map_patch_area(addr, text_poke_addr)) {
> > -		err = -1;
> > +	err = map_patch_area(addr);
> > +	if (err)
> >   		goto out;
> > -	}
> > -
> > -	patch_addr = (u32 *)(text_poke_addr + (kaddr & ~PAGE_MASK));
> >   
> > -	__patch_instruction(addr, instr, patch_addr);
> > +	patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
> > +	rc = __patch_instruction(addr, instr, patch_addr);
> >   
> > -	err = unmap_patch_area(text_poke_addr);
> > -	if (err)
> > -		pr_warn("failed to unmap %lx\n", text_poke_addr);
> > +	err = unmap_patch_area();
> >   
> >   out:
> >   	local_irq_restore(flags);
> > +	WARN_ON(!ppc_inst_equal(ppc_inst_read(addr), instr));
>
> Why add that WARN_ON()? What could make that happen that is worth a
> WARN_ON()?

Failing to patch something could cause very strange issues later, so
explicitly calling out a failure when it happens is warranted IMO.

>
> Patching is quite a critical fast path; I'm not sure we want to afford
> too many checks during patching. We want it to be quick first and
> foremost.

Hmm, I'd prefer to measure the impact first - if it's a huge degradation
then sure we can drop the WARN_ON()... I'll add some data with the next
spin.
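
For the measurement I'm thinking of something simple, roughly (untested
sketch; re-patch the same instruction in a loop and compare the numbers
with and without the WARN_ON() compiled in):

	struct ppc_inst org = ppc_inst_read(addr);
	u64 start, elapsed;
	int i;

	start = ktime_get_ns();
	for (i = 0; i < 100000; i++)
		patch_instruction(addr, org);
	elapsed = ktime_get_ns() - start;
	pr_info("100000 patch_instruction() calls: %llu ns\n", elapsed);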

>
> >   
> > -	return err;
> > +	return rc ? rc : err;
> >   }
> >   #else /* !CONFIG_STRICT_KERNEL_RWX */
> >   
> > 



* Re: [PATCH v5 7/8] powerpc/64s: Initialize and use a temporary mm for patching on Radix
  2021-08-05  9:48   ` Christophe Leroy
@ 2021-08-11 18:28     ` Christopher M. Riedl
  0 siblings, 0 replies; 24+ messages in thread
From: Christopher M. Riedl @ 2021-08-11 18:28 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev
  Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja

On Thu Aug 5, 2021 at 4:48 AM CDT, Christophe Leroy wrote:
>
>
> On 13/07/2021 at 07:31, Christopher M. Riedl wrote:
> > When code patching a STRICT_KERNEL_RWX kernel the page containing the
> > address to be patched is temporarily mapped as writeable. Currently, a
> > per-cpu vmalloc patch area is used for this purpose. While the patch
> > area is per-cpu, the temporary page mapping is inserted into the kernel
> > page tables for the duration of patching. The mapping is exposed to CPUs
> > other than the patching CPU - this is undesirable from a hardening
> > perspective. Use a temporary mm instead which keeps the mapping local to
> > the CPU doing the patching.
> > 
> > Use the `poking_init` init hook to prepare a temporary mm and patching
> > address. Initialize the temporary mm by copying the init mm. Choose a
> > randomized patching address inside the temporary mm userspace address
> > space. The patching address is randomized between PAGE_SIZE and
> > DEFAULT_MAP_WINDOW-PAGE_SIZE.
> > 
> > Bits of entropy with 64K page size on BOOK3S_64:
> > 
> >          bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
> > 
> >          PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
> >          bits of entropy = log2(128TB / 64K)
> >          bits of entropy = 31
> > 
> > The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
> > operates - by default the space above DEFAULT_MAP_WINDOW is not
> > available. Currently the Hash MMU does not use a temporary mm so
> > technically this upper limit isn't necessary; however, a larger
> > randomization range does not further "harden" this overall approach and
> > future work may introduce patching with a temporary mm on Hash as well.
> > 
> > Randomization occurs only once during initialization at boot for each
> > possible CPU in the system.
> > 
> > Introduce two new functions, map_patch() and unmap_patch(), to
> > respectively create and remove the temporary mapping with write
> > permissions at patching_addr. Map the page with PAGE_KERNEL to set
> > EAA[0] for the PTE which ignores the AMR (so no need to unlock/lock
> > KUAP) according to PowerISA v3.0b Figure 35 on Radix.
> > 
> > Based on x86 implementation:
> > 
> > commit 4fc19708b165
> > ("x86/alternatives: Initialize temporary mm for patching")
> > 
> > and:
> > 
> > commit b3fd8e83ada0
> > ("x86/alternatives: Use temporary mm for text poking")
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> > 
> > ---
> > 
> > v5:  * Only support Book3s64 Radix MMU for now.
> >       * Use a per-cpu datastructure to hold the patching_addr and
> >         patching_mm to avoid the need for a synchronization lock/mutex.
> > 
> > v4:  * In the previous series this was two separate patches: one to init
> >         the temporary mm in poking_init() (unused in powerpc at the time)
> >         and the other to use it for patching (which removed all the
> >         per-cpu vmalloc code). Now that we use poking_init() in the
> >         existing per-cpu vmalloc approach, that separation doesn't work
> >         as nicely anymore so I just merged the two patches into one.
> >       * Preload the SLB entry and hash the page for the patching_addr
> >         when using Hash on book3s64 to avoid taking an SLB and Hash fault
> >         during patching. The previous implementation was a hack which
> >         changed current->mm to allow the SLB and Hash fault handlers to
> >         work with the temporary mm since both of those code-paths always
> >         assume mm == current->mm.
> >       * Also (hmm - seeing a trend here) with the book3s64 Hash MMU we
> >         have to manage the mm->context.active_cpus counter and mm cpumask
> >         since they determine (via mm_is_thread_local()) if the TLB flush
> >         in pte_clear() is local or not - it should always be local when
> >         we're using the temporary mm. On book3s64's Radix MMU we can
> >         just call local_flush_tlb_mm().
> >       * Use HPTE_USE_KERNEL_KEY on Hash to avoid costly lock/unlock of
> >         KUAP.
> > ---
> >   arch/powerpc/lib/code-patching.c | 132 +++++++++++++++++++++++++++++--
> >   1 file changed, 125 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index 9f2eba9b70ee4..027dabd42b8dd 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -11,6 +11,7 @@
> >   #include <linux/cpuhotplug.h>
> >   #include <linux/slab.h>
> >   #include <linux/uaccess.h>
> > +#include <linux/random.h>
> >   
> >   #include <asm/tlbflush.h>
> >   #include <asm/page.h>
> > @@ -103,6 +104,7 @@ static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
> >   
> >   static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> >   static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
> > +static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
> >   
> >   #if IS_BUILTIN(CONFIG_LKDTM)
> >   unsigned long read_cpu_patching_addr(unsigned int cpu)
> > @@ -133,6 +135,51 @@ static int text_area_cpu_down(unsigned int cpu)
> >   	return 0;
> >   }
> >   
> > +static __always_inline void __poking_init_temp_mm(void)
> > +{
> > +	int cpu;
> > +	spinlock_t *ptl; /* for protecting pte table */
> > +	pte_t *ptep;
> > +	struct mm_struct *patching_mm;
> > +	unsigned long patching_addr;
> > +
> > +	for_each_possible_cpu(cpu) {
> > +		/*
> > +		 * Some parts of the kernel (static keys for example) depend on
> > +		 * successful code patching. Code patching under
> > +		 * STRICT_KERNEL_RWX requires this setup - otherwise we cannot
> > +		 * patch at all. We use BUG_ON() here and later since an early
> > +		 * failure is preferred to buggy behavior and/or strange
> > +		 * crashes later.
> > +		 */
> > +		patching_mm = copy_init_mm();
> > +		BUG_ON(!patching_mm);
>
> Read
> https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> and
> https://github.com/linuxppc/issues/issues/88
>
> Avoid BUG_ON()s thanks.
>

Fine, @mpe's reply on the GH issue says the check is probably redundant:

"In general we don't need to BUG_ON(!ptr), the MMU will catch NULL
pointer dereferences for us."

> > +
> > +		per_cpu(cpu_patching_mm, cpu) = patching_mm;
> > +
> > +		/*
> > +		 * Choose a randomized, page-aligned address from the range:
> > +		 * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE] The lower
> > +		 * address bound is PAGE_SIZE to avoid the zero-page.  The
> > +		 * upper address bound is DEFAULT_MAP_WINDOW - PAGE_SIZE to
> > +		 * stay under DEFAULT_MAP_WINDOW with the Book3s64 Hash MMU.
> > +		 */
> > +		patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK)
> > +				% (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
>
> The % should be at the end of the first line, and the second line's
> alignment should match the open parenthesis on the first line.

Ok - thanks!
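
I.e.:

	patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK) %
				     (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));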

>
> > +
> > +		per_cpu(cpu_patching_addr, cpu) = patching_addr;
> > +
> > +		/*
> > +		 * PTE allocation uses GFP_KERNEL which means we need to
> > +		 * pre-allocate the PTE here because we cannot do the
> > +		 * allocation during patching when IRQs are disabled.
> > +		 */
> > +		ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> > +		BUG_ON(!ptep);
>
> Avoid BUG_ON() please
>

Yup, I'll remove these in the next spin.

>
> > +		pte_unmap_unlock(ptep, ptl);
> > +	}
> > +}
> > +
> >   /*
> >    * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
> >    * we judge it as being preferable to a kernel that will crash later when
> > @@ -140,6 +187,11 @@ static int text_area_cpu_down(unsigned int cpu)
> >    */
> >   void __init poking_init(void)
> >   {
> > +	if (radix_enabled()) {
> > +		__poking_init_temp_mm();
> > +		return;
> > +	}
> > +
> >   	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> >   		"powerpc/text_poke:online", text_area_cpu_up,
> >   		text_area_cpu_down));
> > @@ -213,30 +265,96 @@ static inline int unmap_patch_area(void)
> >   	return -EINVAL;
> >   }
> >   
> > +struct patch_mapping {
> > +	spinlock_t *ptl; /* for protecting pte table */
> > +	pte_t *ptep;
> > +	struct temp_mm temp_mm;
> > +};
> > +
> > +/*
> > + * This can be called for kernel text or a module.
> > + */
> > +static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
> > +{
> > +	struct page *page;
> > +	pte_t pte;
> > +	pgprot_t pgprot;
> > +	struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> > +	unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> > +
> > +	if (is_vmalloc_or_module_addr(addr))
> > +		page = vmalloc_to_page(addr);
> > +	else
> > +		page = virt_to_page(addr);
> > +
> > +	patch_mapping->ptep = get_locked_pte(patching_mm, patching_addr,
> > +					     &patch_mapping->ptl);
>
> Not sure you need to split this line; checkpatch now allows 100 chars
> per line.
>

I prefer sticking to 80 columns unless readability *really* improves by
going over that limit.

>
> > +	if (unlikely(!patch_mapping->ptep)) {
> > +		pr_warn("map patch: failed to allocate pte for patching\n");
>
> That's a lot better than all the BUG_ON()s above.
>
>
> > +		return -1;
> > +	}
> > +
> > +	pgprot = PAGE_KERNEL;
> > +	pte = mk_pte(page, pgprot);
> > +	pte = pte_mkdirty(pte);
>
> I'm sure you can do
>
> pte = pte_mkdirty(mk_pte(page, PAGE_KERNEL));
>
> And indeed PAGE_KERNEL already includes _PAGE_DIRTY, so all you should
> need is
>
> pte = mk_pte(page, PAGE_KERNEL);
>
> Or even just
>
> set_pte_at(patching_mm, patching_addr, patch_mapping->ptep, mk_pte(page,
> PAGE_KERNEL));
>

Ok, I'll consolidate this in the next spin. Thanks!

>
> > +	set_pte_at(patching_mm, patching_addr, patch_mapping->ptep, pte);
> > +
> > +	init_temp_mm(&patch_mapping->temp_mm, patching_mm);
> > +	use_temporary_mm(&patch_mapping->temp_mm);
> > +
> > +	return 0;
> > +}
> > +
> > +static void unmap_patch(struct patch_mapping *patch_mapping)
> > +{
> > +	struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> > +	unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> > +
> > +	pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
> > +
> > +	local_flush_tlb_mm(patching_mm);
> > +
> > +	pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
> > +
> > +	unuse_temporary_mm(&patch_mapping->temp_mm);
>
> Shouldn't you stop using it before unmapping/unlocking it?
>

Yes, I think you're right - IIRC I had to do this for the Hash MMU
(which we don't support with this version of the series anymore anyway).
I'll revisit this for the next spin.
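
Probably just moving the switch-back up, something like (untested
sketch):

	static void unmap_patch(struct patch_mapping *patch_mapping)
	{
		struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
		unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);

		/* Stop using the temporary mm before tearing down its mapping */
		unuse_temporary_mm(&patch_mapping->temp_mm);

		pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
		local_flush_tlb_mm(patching_mm);
		pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
	}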

>
> > +}
> > +
> >   static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
> >   {
> >   	int err, rc = 0;
> >   	u32 *patch_addr = NULL;
> >   	unsigned long flags;
> > +	struct patch_mapping patch_mapping;
> >   
> >   	/*
> > -	 * During early early boot patch_instruction is called
> > -	 * when text_poke_area is not ready, but we still need
> > -	 * to allow patching. We just do the plain old patching
> > +	 * During early early boot patch_instruction is called when the
> > +	 * patching_mm/text_poke_area is not ready, but we still need to allow
> > +	 * patching. We just do the plain old patching.
> >   	 */
> > -	if (!this_cpu_read(text_poke_area))
> > -		return raw_patch_instruction(addr, instr);
> > +	if (radix_enabled()) {
> > +		if (!this_cpu_read(cpu_patching_mm))
> > +			return raw_patch_instruction(addr, instr);
> > +	} else {
> > +		if (!this_cpu_read(text_poke_area))
> > +			return raw_patch_instruction(addr, instr);
> > +	}
> >   
> >   	local_irq_save(flags);
> >   
> > -	err = map_patch_area(addr);
> > +	if (radix_enabled())
> > +		err = map_patch(addr, &patch_mapping);
>
> Maybe call it map_patch_mm() or map_patch_mapping()?

Yes that does sound better.

>
> > +	else
> > +		err = map_patch_area(addr);
> >   	if (err)
> >   		goto out;
> >   
> >   	patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
> >   	rc = __patch_instruction(addr, instr, patch_addr);
> >   
> > -	err = unmap_patch_area();
> > +	if (radix_enabled())
> > +		unmap_patch(&patch_mapping);
>
> No err? It would be better if it could return something, although
> always 0.

Ok I'll do that.

>
> And same comment about naming.
>

Yes I'll use your suggested names.

> > +	else
> > +		err = unmap_patch_area();
> >   
> >   out:
> >   	local_irq_restore(flags);
> > 



Thread overview: 24+ messages
2021-07-13  5:31 [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christopher M. Riedl
2021-07-13  5:31 ` [PATCH v5 1/8] powerpc: Add LKDTM accessor for patching addr Christopher M. Riedl
2021-07-13  5:31 ` [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping Christopher M. Riedl
2021-08-05  9:13   ` Christophe Leroy
2021-08-11 17:57     ` Christopher M. Riedl
2021-08-11 18:07       ` Kees Cook
2021-07-13  5:31 ` [PATCH v5 3/8] x86_64: Add LKDTM accessor for patching addr Christopher M. Riedl
2021-07-13  5:31 ` [PATCH v5 4/8] lkdtm/x86_64: Add test to hijack a patch mapping Christopher M. Riedl
2021-08-05  9:09   ` Christophe Leroy
2021-08-11 17:53     ` Christopher M. Riedl
2021-07-13  5:31 ` [PATCH v5 5/8] powerpc/64s: Introduce temporary mm for Radix MMU Christopher M. Riedl
2021-08-05  9:27   ` Christophe Leroy
2021-08-11 18:02     ` Christopher M. Riedl
2021-07-13  5:31 ` [PATCH v5 6/8] powerpc: Rework and improve STRICT_KERNEL_RWX patching Christopher M. Riedl
2021-08-05  9:34   ` Christophe Leroy
2021-08-11 18:10     ` Christopher M. Riedl
2021-07-13  5:31 ` [PATCH v5 7/8] powerpc/64s: Initialize and use a temporary mm for patching on Radix Christopher M. Riedl
2021-08-05  9:48   ` Christophe Leroy
2021-08-11 18:28     ` Christopher M. Riedl
2021-07-13  5:31 ` [PATCH v5 8/8] lkdtm/powerpc: Fix code patching hijack test Christopher M. Riedl
2021-08-05  9:18   ` Christophe Leroy
2021-08-11 17:57     ` Christopher M. Riedl
2021-08-05  9:03 ` [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU Christophe Leroy
2021-08-11 17:49   ` Christopher M. Riedl
