* [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp()
@ 2017-07-03 13:01 Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 02/10] powerpc/mm/radix: Fix execute permissions for interrupt_vectors Michael Ellerman
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

Once upon a time there were only two PP (page protection) bits. In ISA
2.03 an additional PP bit was added, but because of the layout of the
HPTE it could not be made contiguous with the existing PP bits.

The result is that we now have three PP bits, named pp0, pp1, pp2,
where pp0 occupies bit 63 of dword 1 of the HPTE and pp1 and pp2
occupy bits 1 and 0 respectively. Until recently Linux hasn't used
pp0, however with the addition of _PAGE_KERNEL_RO we started using it.

The problem arises in the LPAR code, where we need to translate the PP
bits into the argument for the H_PROTECT hypercall. Currently the code
only passes bits 0-2 of newpp, which covers pp1, pp2 and N (no
execute), meaning pp0 is not passed to the hypervisor at all.

We can't simply pass it through in bit 63, as that would collide with a
different field in the flags argument, as defined in PAPR. Instead we
have to shift it down to bit 8 (IBM bit 55).
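
As an aside, the bit movement can be checked in a stand-alone sketch; the
HPTE_R_PP0 value below is written out by hand (the most-significant bit of
dword 1, matching the "bit 63" description above) rather than pulled from
the kernel headers, so treat it as illustrative only:

#include <stdio.h>
#include <stdint.h>

/* pp0 lives in the most-significant bit of HPTE dword 1 (assumed value) */
#define HPTE_R_PP0	0x8000000000000000ULL

int main(void)
{
	uint64_t newpp = HPTE_R_PP0;			/* pretend pp0 is set */
	uint64_t flags = (newpp & HPTE_R_PP0) >> 55;	/* same shift as the patch */

	/* prints 0x100, i.e. bit 8, which is IBM bit 63 - 8 = 55 */
	printf("flags = 0x%llx\n", (unsigned long long)flags);
	return 0;
}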

Fixes: e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Simplify the test, rework change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/platforms/pseries/lpar.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 6541d0b03e4c..495ba4e7336d 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -301,7 +301,7 @@ static long pSeries_lpar_hpte_updatepp(unsigned long slot,
 				       int ssize, unsigned long inv_flags)
 {
 	unsigned long lpar_rc;
-	unsigned long flags = (newpp & 7) | H_AVPN;
+	unsigned long flags;
 	unsigned long want_v;
 
 	want_v = hpte_encode_avpn(vpn, psize, ssize);
@@ -309,6 +309,11 @@ static long pSeries_lpar_hpte_updatepp(unsigned long slot,
 	pr_devel("    update: avpnv=%016lx, hash=%016lx, f=%lx, psize: %d ...",
 		 want_v, slot, flags, psize);
 
+	flags = (newpp & 7) | H_AVPN;
+	if (mmu_has_feature(MMU_FTR_KERNEL_RO))
+		/* Move pp0 into bit 8 (IBM 55) */
+		flags |= (newpp & HPTE_R_PP0) >> 55;
+
 	lpar_rc = plpar_pte_protect(flags, slot, want_v);
 
 	if (lpar_rc == H_NOT_FOUND) {
@@ -380,6 +385,10 @@ static void pSeries_lpar_hpte_updateboltedpp(unsigned long newpp,
 	BUG_ON(slot == -1);
 
 	flags = newpp & 7;
+	if (mmu_has_feature(MMU_FTR_KERNEL_RO))
+		/* Move pp0 into bit 8 (IBM 55) */
+		flags |= (newpp & HPTE_R_PP0) >> 55;
+
 	lpar_rc = plpar_pte_protect(flags, slot, 0);
 
 	BUG_ON(lpar_rc != H_SUCCESS);
-- 
2.7.4


* [PATCH v6 02/10] powerpc/mm/radix: Fix execute permissions for interrupt_vectors
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 03/10] powerpc/kprobes: Move kprobes over to patch_instruction() Michael Ellerman
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

Commit 9abcc981de97 ("powerpc/mm/radix: Only add X for pages
overlapping kernel text") changed the linear mapping on Radix to only
mark the kernel text executable.

However if the kernel is run relocated, for example as a kdump kernel,
then the exception vectors are split from the kernel text, i.e. they
remain at real address 0.

We tend to get away with it, because the kernel itself will usually be
below 1G, which means the 1G page at 0-1G is marked executable and
everything works OK. However if the kernel is loaded above 1G, or the
system has less than 1G in total (meaning we can't use a 1G page),
then the exception vectors will not be marked executable and the
kernel will fail to boot.

Fix it by also checking if the address range overlaps the exception
vectors when deciding if we should add PAGE_KERNEL_X.
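
For reference, the overlap test is the usual half-open range check; the
helper below is only a generic illustration, not the kernel's
overlaps_interrupt_vector_text() implementation:

#include <stdbool.h>
#include <stdio.h>

/* [s1, e1) and [s2, e2) overlap iff each range starts before the other ends */
static bool ranges_overlap(unsigned long s1, unsigned long e1,
			   unsigned long s2, unsigned long e2)
{
	return s1 < e2 && s2 < e1;
}

int main(void)
{
	/* e.g. a 1G mapping at 0 vs. exception vectors in the first 32K */
	printf("%d\n", ranges_overlap(0x0, 0x40000000, 0x0, 0x8000));        /* 1 */
	printf("%d\n", ranges_overlap(0x40000000, 0x80000000, 0x0, 0x8000)); /* 0 */
	return 0;
}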

Fixes: 9abcc981de97 ("powerpc/mm/radix: Only add X for pages overlapping kernel text")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Combine with the existing check, rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/pgtable-radix.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index f6af90371b1e..1342859552b1 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -150,7 +150,8 @@ static int __meminit create_physical_mapping(unsigned long start,
 
 		vaddr = (unsigned long)__va(addr);
 
-		if (overlaps_kernel_text(vaddr, vaddr + mapping_size))
+		if (overlaps_kernel_text(vaddr, vaddr + mapping_size) ||
+		    overlaps_interrupt_vector_text(vaddr, vaddr + mapping_size))
 			prot = PAGE_KERNEL_X;
 		else
 			prot = PAGE_KERNEL;
-- 
2.7.4


* [PATCH v6 03/10] powerpc/kprobes: Move kprobes over to patch_instruction()
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 02/10] powerpc/mm/radix: Fix execute permissions for interrupt_vectors Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 04/10] powerpc/kprobes/optprobes: Use patch_instruction() Michael Ellerman
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

arch_arm_kprobe() and arch_disarm_kprobe() use direct assignment to copy
instructions; replace that with patch_instruction(). We don't need to
call flush_icache_range() because patch_instruction() does it for us.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/kprobes.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index fc4343514bed..697d1fbb0f77 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -158,17 +158,13 @@ NOKPROBE_SYMBOL(arch_prepare_kprobe);
 
 void arch_arm_kprobe(struct kprobe *p)
 {
-	*p->addr = BREAKPOINT_INSTRUCTION;
-	flush_icache_range((unsigned long) p->addr,
-			   (unsigned long) p->addr + sizeof(kprobe_opcode_t));
+	patch_instruction(p->addr, BREAKPOINT_INSTRUCTION);
 }
 NOKPROBE_SYMBOL(arch_arm_kprobe);
 
 void arch_disarm_kprobe(struct kprobe *p)
 {
-	*p->addr = p->opcode;
-	flush_icache_range((unsigned long) p->addr,
-			   (unsigned long) p->addr + sizeof(kprobe_opcode_t));
+	patch_instruction(p->addr, p->opcode);
 }
 NOKPROBE_SYMBOL(arch_disarm_kprobe);
 
-- 
2.7.4


* [PATCH v6 04/10] powerpc/kprobes/optprobes: Use patch_instruction()
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 02/10] powerpc/mm/radix: Fix execute permissions for interrupt_vectors Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 03/10] powerpc/kprobes: Move kprobes over to patch_instruction() Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 05/10] powerpc/xmon: Add patch_instruction() support for xmon Michael Ellerman
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

So that we can implement STRICT_RWX, use patch_instruction() in
optprobes.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/optprobes.c | 53 +++++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
index ec60ed0d4aad..6f8273f5e988 100644
--- a/arch/powerpc/kernel/optprobes.c
+++ b/arch/powerpc/kernel/optprobes.c
@@ -158,12 +158,13 @@ void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
 void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr)
 {
 	/* addis r4,0,(insn)@h */
-	*addr++ = PPC_INST_ADDIS | ___PPC_RT(4) |
-		  ((val >> 16) & 0xffff);
+	patch_instruction(addr, PPC_INST_ADDIS | ___PPC_RT(4) |
+			  ((val >> 16) & 0xffff));
+	addr++;
 
 	/* ori r4,r4,(insn)@l */
-	*addr = PPC_INST_ORI | ___PPC_RA(4) | ___PPC_RS(4) |
-		(val & 0xffff);
+	patch_instruction(addr, PPC_INST_ORI | ___PPC_RA(4) |
+			  ___PPC_RS(4) | (val & 0xffff));
 }
 
 /*
@@ -173,24 +174,28 @@ void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr)
 void patch_imm64_load_insns(unsigned long val, kprobe_opcode_t *addr)
 {
 	/* lis r3,(op)@highest */
-	*addr++ = PPC_INST_ADDIS | ___PPC_RT(3) |
-		  ((val >> 48) & 0xffff);
+	patch_instruction(addr, PPC_INST_ADDIS | ___PPC_RT(3) |
+			  ((val >> 48) & 0xffff));
+	addr++;
 
 	/* ori r3,r3,(op)@higher */
-	*addr++ = PPC_INST_ORI | ___PPC_RA(3) | ___PPC_RS(3) |
-		  ((val >> 32) & 0xffff);
+	patch_instruction(addr, PPC_INST_ORI | ___PPC_RA(3) |
+			  ___PPC_RS(3) | ((val >> 32) & 0xffff));
+	addr++;
 
 	/* rldicr r3,r3,32,31 */
-	*addr++ = PPC_INST_RLDICR | ___PPC_RA(3) | ___PPC_RS(3) |
-		  __PPC_SH64(32) | __PPC_ME64(31);
+	patch_instruction(addr, PPC_INST_RLDICR | ___PPC_RA(3) |
+			  ___PPC_RS(3) | __PPC_SH64(32) | __PPC_ME64(31));
+	addr++;
 
 	/* oris r3,r3,(op)@h */
-	*addr++ = PPC_INST_ORIS | ___PPC_RA(3) | ___PPC_RS(3) |
-		  ((val >> 16) & 0xffff);
+	patch_instruction(addr, PPC_INST_ORIS | ___PPC_RA(3) |
+			  ___PPC_RS(3) | ((val >> 16) & 0xffff));
+	addr++;
 
 	/* ori r3,r3,(op)@l */
-	*addr = PPC_INST_ORI | ___PPC_RA(3) | ___PPC_RS(3) |
-		(val & 0xffff);
+	patch_instruction(addr, PPC_INST_ORI | ___PPC_RA(3) |
+			  ___PPC_RS(3) | (val & 0xffff));
 }
 
 int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
@@ -198,7 +203,8 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
 	kprobe_opcode_t *buff, branch_op_callback, branch_emulate_step;
 	kprobe_opcode_t *op_callback_addr, *emulate_step_addr;
 	long b_offset;
-	unsigned long nip;
+	unsigned long nip, size;
+	int rc, i;
 
 	kprobe_ppc_optinsn_slots.insn_size = MAX_OPTINSN_SIZE;
 
@@ -231,8 +237,14 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
 		goto error;
 
 	/* Setup template */
-	memcpy(buff, optprobe_template_entry,
-			TMPL_END_IDX * sizeof(kprobe_opcode_t));
+	/* We can optimize this via patch_instruction_window later */
+	size = (TMPL_END_IDX * sizeof(kprobe_opcode_t)) / sizeof(int);
+	pr_devel("Copying template to %p, size %lu\n", buff, size);
+	for (i = 0; i < size; i++) {
+		rc = patch_instruction(buff + i, *(optprobe_template_entry + i));
+		if (rc < 0)
+			goto error;
+	}
 
 	/*
 	 * Fixup the template with instructions to:
@@ -261,8 +273,8 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
 	if (!branch_op_callback || !branch_emulate_step)
 		goto error;
 
-	buff[TMPL_CALL_HDLR_IDX] = branch_op_callback;
-	buff[TMPL_EMULATE_IDX] = branch_emulate_step;
+	patch_instruction(buff + TMPL_CALL_HDLR_IDX, branch_op_callback);
+	patch_instruction(buff + TMPL_EMULATE_IDX, branch_emulate_step);
 
 	/*
 	 * 3. load instruction to be emulated into relevant register, and
@@ -272,8 +284,7 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
 	/*
 	 * 4. branch back from trampoline
 	 */
-	buff[TMPL_RET_IDX] = create_branch((unsigned int *)buff + TMPL_RET_IDX,
-				(unsigned long)nip, 0);
+	patch_branch(buff + TMPL_RET_IDX, (unsigned long)nip, 0);
 
 	flush_icache_range((unsigned long)buff,
 			   (unsigned long)(&buff[TMPL_END_IDX]));
-- 
2.7.4


* [PATCH v6 05/10] powerpc/xmon: Add patch_instruction() support for xmon
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
                   ` (2 preceding siblings ...)
  2017-07-03 13:01 ` [PATCH v6 04/10] powerpc/kprobes/optprobes: Use patch_instruction() Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 06/10] powerpc/lib/code-patching: Use alternate map for patch_instruction() Michael Ellerman
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

Move from mwrite() to patch_instruction() for xmon for
breakpoint addition and removal.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/xmon/xmon.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index a728e1919613..08e367e3e8c3 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -53,6 +53,7 @@
 #include <asm/xive.h>
 #include <asm/opal.h>
 #include <asm/firmware.h>
+#include <asm/code-patching.h>
 
 #ifdef CONFIG_PPC64
 #include <asm/hvcall.h>
@@ -837,7 +838,8 @@ static void insert_bpts(void)
 		store_inst(&bp->instr[0]);
 		if (bp->enabled & BP_CIABR)
 			continue;
-		if (mwrite(bp->address, &bpinstr, 4) != 4) {
+		if (patch_instruction((unsigned int *)bp->address,
+							bpinstr) != 0) {
 			printf("Couldn't write instruction at %lx, "
 			       "disabling breakpoint there\n", bp->address);
 			bp->enabled &= ~BP_TRAP;
@@ -874,7 +876,8 @@ static void remove_bpts(void)
 			continue;
 		if (mread(bp->address, &instr, 4) == 4
 		    && instr == bpinstr
-		    && mwrite(bp->address, &bp->instr, 4) != 4)
+		    && patch_instruction(
+			(unsigned int *)bp->address, bp->instr[0]) != 0)
 			printf("Couldn't remove breakpoint at %lx\n",
 			       bp->address);
 		else
-- 
2.7.4


* [PATCH v6 06/10] powerpc/lib/code-patching: Use alternate map for patch_instruction()
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
                   ` (3 preceding siblings ...)
  2017-07-03 13:01 ` [PATCH v6 05/10] powerpc/xmon: Add patch_instruction() support for xmon Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-11-23  7:12   ` Christophe LEROY
  2017-07-03 13:01 ` [PATCH v6 07/10] powerpc/vmlinux.lds: Align __init_begin to 16M Michael Ellerman
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

This patch creates the window using text_poke_area, allocated via
get_vm_area(). text_poke_area is per CPU to avoid locking. The
text_poke_area for each CPU is set up via a late_initcall; prior to the
setup of these alternate mapping areas we continue to use direct writes
to change/modify kernel text. The ability to use alternate mappings to
write to kernel text gives us the freedom to then turn text read-only
and implement CONFIG_STRICT_KERNEL_RWX.

This code is CPU hotplug aware to ensure that we have mappings for any
new CPUs as they come online, and to tear down mappings for any CPUs
that go offline.
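
As a side note, the address arithmetic the patch uses to find the
instruction word inside the temporarily mapped page can be checked
stand-alone; the page size and addresses below are made up purely for
illustration:

#include <stdio.h>

#define PAGE_SIZE	0x10000UL		/* 64K, illustrative only */
#define PAGE_MASK	(~(PAGE_SIZE - 1))

int main(void)
{
	unsigned long kaddr = 0xc000000001234568UL;	      /* instruction to patch */
	unsigned long text_poke_addr = 0xd00000000aa00000UL; /* per-CPU poke page */
	unsigned int *dest;

	/* same formula as patch_instruction(): keep only the offset within
	 * the page, expressed in 4-byte instruction words */
	dest = (unsigned int *)text_poke_addr +
		((kaddr & ~PAGE_MASK) / sizeof(unsigned int));

	printf("write through %p to patch %lx\n", (void *)dest, kaddr);
	return 0;
}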

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/lib/code-patching.c | 171 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 167 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 500b0f6a0b64..c9de03e0c1f1 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -12,23 +12,186 @@
 #include <linux/vmalloc.h>
 #include <linux/init.h>
 #include <linux/mm.h>
-#include <asm/page.h>
-#include <asm/code-patching.h>
+#include <linux/cpuhotplug.h>
+#include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/kprobes.h>
 
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
+#include <asm/page.h>
+#include <asm/code-patching.h>
 
-int patch_instruction(unsigned int *addr, unsigned int instr)
+static int __patch_instruction(unsigned int *addr, unsigned int instr)
 {
 	int err;
 
 	__put_user_size(instr, addr, 4, err);
 	if (err)
 		return err;
-	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" : : "r" (addr));
+
+	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" :: "r" (addr));
+
+	return 0;
+}
+
+#ifdef CONFIG_STRICT_KERNEL_RWX
+static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
+
+static int text_area_cpu_up(unsigned int cpu)
+{
+	struct vm_struct *area;
+
+	area = get_vm_area(PAGE_SIZE, VM_ALLOC);
+	if (!area) {
+		WARN_ONCE(1, "Failed to create text area for cpu %d\n",
+			cpu);
+		return -1;
+	}
+	this_cpu_write(text_poke_area, area);
+
+	return 0;
+}
+
+static int text_area_cpu_down(unsigned int cpu)
+{
+	free_vm_area(this_cpu_read(text_poke_area));
+	return 0;
+}
+
+/*
+ * Run as a late init call. This allows all the boot time patching to be done
+ * simply by patching the code, and then we're called here prior to
+ * mark_rodata_ro(), which happens after all init calls are run. Although
+ * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
+ * it as being preferable to a kernel that will crash later when someone tries
+ * to use patch_instruction().
+ */
+static int __init setup_text_poke_area(void)
+{
+	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+		"powerpc/text_poke:online", text_area_cpu_up,
+		text_area_cpu_down));
+
+	return 0;
+}
+late_initcall(setup_text_poke_area);
+
+/*
+ * This can be called for kernel text or a module.
+ */
+static int map_patch_area(void *addr, unsigned long text_poke_addr)
+{
+	unsigned long pfn;
+	int err;
+
+	if (is_vmalloc_addr(addr))
+		pfn = vmalloc_to_pfn(addr);
+	else
+		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
+
+	err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT),
+				pgprot_val(PAGE_KERNEL));
+
+	pr_devel("Mapped addr %lx with pfn %lx:%d\n", text_poke_addr, pfn, err);
+	if (err)
+		return -1;
+
 	return 0;
 }
 
+static inline int unmap_patch_area(unsigned long addr)
+{
+	pte_t *ptep;
+	pmd_t *pmdp;
+	pud_t *pudp;
+	pgd_t *pgdp;
+
+	pgdp = pgd_offset_k(addr);
+	if (unlikely(!pgdp))
+		return -EINVAL;
+
+	pudp = pud_offset(pgdp, addr);
+	if (unlikely(!pudp))
+		return -EINVAL;
+
+	pmdp = pmd_offset(pudp, addr);
+	if (unlikely(!pmdp))
+		return -EINVAL;
+
+	ptep = pte_offset_kernel(pmdp, addr);
+	if (unlikely(!ptep))
+		return -EINVAL;
+
+	pr_devel("clearing mm %p, pte %p, addr %lx\n", &init_mm, ptep, addr);
+
+	/*
+	 * In hash, pte_clear flushes the tlb, in radix, we have to
+	 */
+	pte_clear(&init_mm, addr, ptep);
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
+	return 0;
+}
+
+int patch_instruction(unsigned int *addr, unsigned int instr)
+{
+	int err;
+	unsigned int *dest = NULL;
+	unsigned long flags;
+	unsigned long text_poke_addr;
+	unsigned long kaddr = (unsigned long)addr;
+
+	/*
+	 * During early early boot patch_instruction is called
+	 * when text_poke_area is not ready, but we still need
+	 * to allow patching. We just do the plain old patching
+	 * We use slab_is_available and per cpu read * via this_cpu_read
+	 * of text_poke_area. Per-CPU areas might not be up early
+	 * this can create problems with just using this_cpu_read()
+	 */
+	if (!slab_is_available() || !this_cpu_read(text_poke_area))
+		return __patch_instruction(addr, instr);
+
+	local_irq_save(flags);
+
+	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
+	if (map_patch_area(addr, text_poke_addr)) {
+		err = -1;
+		goto out;
+	}
+
+	dest = (unsigned int *)(text_poke_addr) +
+			((kaddr & ~PAGE_MASK) / sizeof(unsigned int));
+
+	/*
+	 * We use __put_user_size so that we can handle faults while
+	 * writing to dest and return err to handle faults gracefully
+	 */
+	__put_user_size(instr, dest, 4, err);
+	if (!err)
+		asm ("dcbst 0, %0; sync; icbi 0,%0; icbi 0,%1; sync; isync"
+			::"r" (dest), "r"(addr));
+
+	err = unmap_patch_area(text_poke_addr);
+	if (err)
+		pr_warn("failed to unmap %lx\n", text_poke_addr);
+
+out:
+	local_irq_restore(flags);
+
+	return err;
+}
+#else /* !CONFIG_STRICT_KERNEL_RWX */
+
+int patch_instruction(unsigned int *addr, unsigned int instr)
+{
+	return __patch_instruction(addr, instr);
+}
+
+#endif /* CONFIG_STRICT_KERNEL_RWX */
+NOKPROBE_SYMBOL(patch_instruction);
+
 int patch_branch(unsigned int *addr, unsigned long target, int flags)
 {
 	return patch_instruction(addr, create_branch(addr, target, flags));
-- 
2.7.4


* [PATCH v6 07/10] powerpc/vmlinux.lds: Align __init_begin to 16M
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
                   ` (4 preceding siblings ...)
  2017-07-03 13:01 ` [PATCH v6 06/10] powerpc/lib/code-patching: Use alternate map for patch_instruction() Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 08/10] powerpc/mm/hash: Implement mark_rodata_ro() for hash Michael Ellerman
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

For CONFIG_STRICT_KERNEL_RWX, align __init_begin to 16M. We use 16M
since it's the larger of the two linear mapping sizes: 2M on radix and
16M on hash. The plan is to have .text, .rodata and everything up to
__init_begin marked as RX. Note this means we still have executable
read-only data; we could further align rodata to another 16M boundary.
Keeping text plus rodata read-only-executable is a trade-off against
making text read-only-executable and rodata read-only separately.

We don't use multiple PT_LOAD entries in PHDRS because we are not sure
all bootloaders support them. This patch keeps the PHDRS in
vmlinux.lds.S the same as they are today, with just one PT_LOAD for all
of the kernel, marked RWX (7).

mpe: What this means is the added alignment bloats the resulting
binary on disk, a powernv kernel goes from 17M to 22M.
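
That growth is just padding: in the worst case __init_begin moves up by
almost 16M. A trivial stand-alone check of the arithmetic (the address is
made up):

#include <stdio.h>

#define STRICT_ALIGN_SIZE	(1UL << 24)	/* 16M, as in the patch */
#define ALIGN_UP(x, a)		(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	unsigned long init_begin = 0x1234567;	/* hypothetical unaligned address */
	unsigned long aligned = ALIGN_UP(init_begin, STRICT_ALIGN_SIZE);

	printf("aligned to 0x%lx, padding %lu bytes\n",
	       aligned, aligned - init_begin);
	return 0;
}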

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/vmlinux.lds.S | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index ace6b6579961..b1a250560198 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -8,6 +8,12 @@
 #include <asm/cache.h>
 #include <asm/thread_info.h>
 
+#ifdef CONFIG_STRICT_KERNEL_RWX
+#define STRICT_ALIGN_SIZE	(1 << 24)
+#else
+#define STRICT_ALIGN_SIZE	PAGE_SIZE
+#endif
+
 ENTRY(_stext)
 
 PHDRS {
@@ -123,7 +129,7 @@ SECTIONS
 	PROVIDE32 (etext = .);
 
 	/* Read-only data */
-	RODATA
+	RO_DATA(PAGE_SIZE)
 
 	EXCEPTION_TABLE(0)
 
@@ -140,7 +146,7 @@ SECTIONS
 /*
  * Init sections discarded at runtime
  */
-	. = ALIGN(PAGE_SIZE);
+	. = ALIGN(STRICT_ALIGN_SIZE);
 	__init_begin = .;
 	INIT_TEXT_SECTION(PAGE_SIZE) :kernel
 
-- 
2.7.4


* [PATCH v6 08/10] powerpc/mm/hash: Implement mark_rodata_ro() for hash
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
                   ` (5 preceding siblings ...)
  2017-07-03 13:01 ` [PATCH v6 07/10] powerpc/vmlinux.lds: Align __init_begin to 16M Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 09/10] powerpc/mm/radix: Implement STRICT_RWX/mark_rodata_ro() for Radix Michael Ellerman
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

With hash we update the bolted PTEs to mark them read-only. We rely on
MMU_FTR_KERNEL_RO to generate the correct permissions for read-only
text. The radix implementation just prints a warning for now; a later
patch in this series implements it for radix.
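
The rounding in hash__mark_rodata_ro() is worth noting: the start is
rounded up and the end rounded down to the linear mapping page size, so
only bolted pages that lie entirely within [_stext, __init_begin) are made
read-only. A stand-alone sketch of that arithmetic (the shift and addresses
are example values, not the real layout):

#include <stdio.h>

int main(void)
{
	unsigned int shift = 24;		/* e.g. 16M linear mapping pages */
	unsigned long step = 1UL << shift;
	unsigned long start = 0x1000000;	/* stand-in for _stext */
	unsigned long end = 0x3400000;		/* stand-in for __init_begin */

	/* same rounding as the patch below: start up, end down */
	start = ((start + step - 1) >> shift) << shift;
	end = (end >> shift) << shift;

	printf("marking ro start %lx, end %lx, step %lx\n", start, end, step);
	return 0;
}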

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/book3s/64/hash.h  |  3 +++
 arch/powerpc/include/asm/book3s/64/radix.h |  4 +++
 arch/powerpc/mm/pgtable-hash64.c           | 41 ++++++++++++++++++++++++++++++
 arch/powerpc/mm/pgtable-radix.c            |  7 +++++
 arch/powerpc/mm/pgtable_64.c               |  9 +++++++
 5 files changed, 64 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 4e957b027fe0..0ce513f2926f 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -89,6 +89,9 @@ static inline int hash__pgd_bad(pgd_t pgd)
 {
 	return (pgd_val(pgd) == 0);
 }
+#ifdef CONFIG_STRICT_KERNEL_RWX
+extern void hash__mark_rodata_ro(void);
+#endif
 
 extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
 			    pte_t *ptep, unsigned long pte, int huge);
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index ba43754e96d2..487709ff6875 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -116,6 +116,10 @@
 #define RADIX_PUD_TABLE_SIZE	(sizeof(pud_t) << RADIX_PUD_INDEX_SIZE)
 #define RADIX_PGD_TABLE_SIZE	(sizeof(pgd_t) << RADIX_PGD_INDEX_SIZE)
 
+#ifdef CONFIG_STRICT_KERNEL_RWX
+extern void radix__mark_rodata_ro(void);
+#endif
+
 static inline unsigned long __radix_pte_update(pte_t *ptep, unsigned long clr,
 					       unsigned long set)
 {
diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
index a0facee58811..0809102b66bd 100644
--- a/arch/powerpc/mm/pgtable-hash64.c
+++ b/arch/powerpc/mm/pgtable-hash64.c
@@ -11,8 +11,12 @@
 
 #include <linux/sched.h>
 #include <linux/mm_types.h>
+#include <linux/mm.h>
 
 #include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/sections.h>
+#include <asm/mmu.h>
 #include <asm/tlb.h>
 
 #include "mmu_decl.h"
@@ -419,3 +423,40 @@ int hash__has_transparent_hugepage(void)
 	return 1;
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+#ifdef CONFIG_STRICT_KERNEL_RWX
+void hash__mark_rodata_ro(void)
+{
+	unsigned long start = (unsigned long)_stext;
+	unsigned long end = (unsigned long)__init_begin;
+	unsigned long idx;
+	unsigned int step, shift;
+	unsigned long newpp = PP_RXXX;
+
+	if (!mmu_has_feature(MMU_FTR_KERNEL_RO)) {
+		pr_info("R/O rodata not supported\n");
+		return;
+	}
+
+	shift = mmu_psize_defs[mmu_linear_psize].shift;
+	step = 1 << shift;
+
+	start = ((start + step - 1) >> shift) << shift;
+	end = (end >> shift) << shift;
+
+	pr_devel("marking ro start %lx, end %lx, step %x\n",
+			start, end, step);
+
+	if (start == end) {
+		pr_warn("could not set rodata ro, relocate the start"
+			" of the kernel to a 0x%x boundary\n", step);
+		return;
+	}
+
+	for (idx = start; idx < end; idx += step)
+		/* Not sure if we can do much with the return value */
+		mmu_hash_ops.hpte_updateboltedpp(newpp, idx, mmu_linear_psize,
+							mmu_kernel_ssize);
+
+}
+#endif
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 1342859552b1..b07e0008d02a 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -110,6 +110,13 @@ int radix__map_kernel_page(unsigned long ea, unsigned long pa,
 	return 0;
 }
 
+#ifdef CONFIG_STRICT_KERNEL_RWX
+void radix__mark_rodata_ro(void)
+{
+	pr_warn("Not yet implemented for radix\n");
+}
+#endif
+
 static inline void __meminit print_mapping(unsigned long start,
 					   unsigned long end,
 					   unsigned long size)
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index bce0ed50789c..5b0c2d63d645 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -491,3 +491,12 @@ void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
 }
 EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry);
 #endif /* CONFIG_PPC_BOOK3S_64 */
+
+#ifdef CONFIG_STRICT_KERNEL_RWX
+void mark_rodata_ro(void)
+{
+	if (radix_enabled())
+		return radix__mark_rodata_ro();
+	return hash__mark_rodata_ro();
+}
+#endif
-- 
2.7.4


* [PATCH v6 09/10] powerpc/mm/radix: Implement STRICT_RWX/mark_rodata_ro() for Radix
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
                   ` (6 preceding siblings ...)
  2017-07-03 13:01 ` [PATCH v6 08/10] powerpc/mm/hash: Implement mark_rodata_ro() for hash Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-07-03 13:01 ` [PATCH v6 10/10] powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs Michael Ellerman
  2017-07-04 10:48 ` [v6, 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

The Radix linear mapping code (create_physical_mapping()) tries to use
the largest page size it can at each step. Currently the only reason
it steps down to a smaller page size is if the start addr is
unaligned (never happens in practice), or the end of memory is not
aligned to a huge page boundary.

To support STRICT_RWX we need to break the mapping at __init_begin,
so that the text and rodata prior to that can be marked R_X and the
regular pages after can be marked RW.

Having done that we can now implement mark_rodata_ro() for Radix,
knowing that we won't need to split any mappings.
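
A simplified stand-alone model of the page-size selection with the
text-boundary splitting applied is sketched below; the layout is made up
(text at 0, __init_begin at 26M, 2G of RAM), and the real code additionally
consults mmu_psize_defs and uses __pa_symbol():

#include <stdio.h>

#define PAGE_SIZE	0x1000UL
#define PMD_SIZE	0x200000UL	/* 2M */
#define PUD_SIZE	0x40000000UL	/* 1G */
#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

int main(void)
{
	unsigned long text_start = 0, init_begin = 0x1a00000, end = 0x80000000UL;
	unsigned long addr, mapping_size = 0, previous_size = 0, max_mapping_size;

	for (addr = 0; addr < end; addr += mapping_size) {
		unsigned long gap = end - addr;

		max_mapping_size = PUD_SIZE;
retry:
		if (IS_ALIGNED(addr, PUD_SIZE) && gap >= PUD_SIZE &&
		    PUD_SIZE <= max_mapping_size)
			mapping_size = PUD_SIZE;
		else if (IS_ALIGNED(addr, PMD_SIZE) && gap >= PMD_SIZE)
			mapping_size = PMD_SIZE;
		else
			mapping_size = PAGE_SIZE;

		/* step down when the block would straddle the text/rodata region */
		if (mapping_size == PUD_SIZE && addr <= init_begin &&
		    addr + mapping_size >= text_start) {
			max_mapping_size = PMD_SIZE;
			goto retry;
		}
		if (mapping_size == PMD_SIZE && addr <= init_begin &&
		    addr + mapping_size >= text_start)
			mapping_size = PAGE_SIZE;

		if (mapping_size != previous_size) {
			printf("from %08lx: 0x%lx byte pages\n", addr, mapping_size);
			previous_size = mapping_size;
		}
	}
	return 0;
}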

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Split down to PAGE_SIZE, not 2MB, rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/pgtable-radix.c | 67 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 65 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index b07e0008d02a..d2fd34a6f542 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -11,6 +11,7 @@
 #include <linux/sched/mm.h>
 #include <linux/memblock.h>
 #include <linux/of_fdt.h>
+#include <linux/mm.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -113,7 +114,48 @@ int radix__map_kernel_page(unsigned long ea, unsigned long pa,
 #ifdef CONFIG_STRICT_KERNEL_RWX
 void radix__mark_rodata_ro(void)
 {
-	pr_warn("Not yet implemented for radix\n");
+	unsigned long start = (unsigned long)_stext;
+	unsigned long end = (unsigned long)__init_begin;
+	unsigned long idx;
+	pgd_t *pgdp;
+	pud_t *pudp;
+	pmd_t *pmdp;
+	pte_t *ptep;
+
+	if (!mmu_has_feature(MMU_FTR_KERNEL_RO)) {
+		pr_info("R/O rodata not supported\n");
+		return;
+	}
+
+	start = ALIGN_DOWN(start, PAGE_SIZE);
+	end = PAGE_ALIGN(end); // aligns up
+
+	pr_devel("marking ro start %lx, end %lx\n", start, end);
+
+	for (idx = start; idx < end; idx += PAGE_SIZE) {
+		pgdp = pgd_offset_k(idx);
+		pudp = pud_alloc(&init_mm, pgdp, idx);
+		if (!pudp)
+			continue;
+		if (pud_huge(*pudp)) {
+			ptep = (pte_t *)pudp;
+			goto update_the_pte;
+		}
+		pmdp = pmd_alloc(&init_mm, pudp, idx);
+		if (!pmdp)
+			continue;
+		if (pmd_huge(*pmdp)) {
+			ptep = pmdp_ptep(pmdp);
+			goto update_the_pte;
+		}
+		ptep = pte_alloc_kernel(pmdp, idx);
+		if (!ptep)
+			continue;
+update_the_pte:
+		radix__pte_update(&init_mm, idx, ptep, _PAGE_WRITE, 0, 0);
+	}
+	radix__flush_tlb_kernel_range(start, end);
+
 }
 #endif
 
@@ -132,6 +174,12 @@ static int __meminit create_physical_mapping(unsigned long start,
 {
 	unsigned long vaddr, addr, mapping_size = 0;
 	pgprot_t prot;
+	unsigned long max_mapping_size;
+#ifdef CONFIG_STRICT_KERNEL_RWX
+	int split_text_mapping = 1;
+#else
+	int split_text_mapping = 0;
+#endif
 
 	start = _ALIGN_UP(start, PAGE_SIZE);
 	for (addr = start; addr < end; addr += mapping_size) {
@@ -140,9 +188,12 @@ static int __meminit create_physical_mapping(unsigned long start,
 
 		gap = end - addr;
 		previous_size = mapping_size;
+		max_mapping_size = PUD_SIZE;
 
+retry:
 		if (IS_ALIGNED(addr, PUD_SIZE) && gap >= PUD_SIZE &&
-		    mmu_psize_defs[MMU_PAGE_1G].shift)
+		    mmu_psize_defs[MMU_PAGE_1G].shift &&
+		    PUD_SIZE <= max_mapping_size)
 			mapping_size = PUD_SIZE;
 		else if (IS_ALIGNED(addr, PMD_SIZE) && gap >= PMD_SIZE &&
 			 mmu_psize_defs[MMU_PAGE_2M].shift)
@@ -150,6 +201,18 @@ static int __meminit create_physical_mapping(unsigned long start,
 		else
 			mapping_size = PAGE_SIZE;
 
+		if (split_text_mapping && (mapping_size == PUD_SIZE) &&
+			(addr <= __pa_symbol(__init_begin)) &&
+			(addr + mapping_size) >= __pa_symbol(_stext)) {
+			max_mapping_size = PMD_SIZE;
+			goto retry;
+		}
+
+		if (split_text_mapping && (mapping_size == PMD_SIZE) &&
+		    (addr <= __pa_symbol(__init_begin)) &&
+		    (addr + mapping_size) >= __pa_symbol(_stext))
+			mapping_size = PAGE_SIZE;
+
 		if (mapping_size != previous_size) {
 			print_mapping(start, addr, previous_size);
 			start = addr;
-- 
2.7.4


* [PATCH v6 10/10] powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
                   ` (7 preceding siblings ...)
  2017-07-03 13:01 ` [PATCH v6 09/10] powerpc/mm/radix: Implement STRICT_RWX/mark_rodata_ro() for Radix Michael Ellerman
@ 2017-07-03 13:01 ` Michael Ellerman
  2017-07-04 10:48 ` [v6, 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-03 13:01 UTC (permalink / raw)
  To: linuxppc-dev

From: Balbir Singh <bsingharora@gmail.com>

All code that patches kernel text has been moved over to using
patch_instruction() and patch_instruction() is able to cope with the
kernel text being read only.

The linker script has been updated to ensure the read only data ends
on a large page boundary, so it and the preceding kernel text can be
marked R_X. We also have implementations of mark_rodata_ro() for Hash
and Radix MMU modes.

There are some corner-cases missing when the kernel is built
relocatable, so for now make it depend on !RELOCATABLE.

There's also a temporary workaround to depend on !HIBERNATION to avoid
a build failure, that will be removed once we've merged with the PM
tree.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Make it depend on !RELOCATABLE, munge change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index eb0e591c867d..0190340f8ca9 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -164,6 +164,8 @@ config PPC
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if COMPAT
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
+	select ARCH_HAS_STRICT_KERNEL_RWX	if (PPC_BOOK3S_64 && !RELOCATABLE && !HIBERNATION)
+	select ARCH_OPTIONAL_KERNEL_RWX		if ARCH_HAS_STRICT_KERNEL_RWX
 	select HAVE_CBPF_JIT			if !PPC64
 	select HAVE_CONTEXT_TRACKING		if PPC64
 	select HAVE_DEBUG_KMEMLEAK
-- 
2.7.4


* Re: [v6, 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp()
  2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
                   ` (8 preceding siblings ...)
  2017-07-03 13:01 ` [PATCH v6 10/10] powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs Michael Ellerman
@ 2017-07-04 10:48 ` Michael Ellerman
  9 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-07-04 10:48 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev

On Mon, 2017-07-03 at 13:01:45 UTC, Michael Ellerman wrote:
> From: Balbir Singh <bsingharora@gmail.com>
> 
> Once upon a time there were only two PP (page protection) bits. In ISA
> 2.03 an additional PP bit was added, but because of the layout of the
> HPTE it could not be made contiguous with the existing PP bits.
> 
> The result is that we now have three PP bits, named pp0, pp1, pp2,
> where pp0 occupies bit 63 of dword 1 of the HPTE and pp1 and pp2
> occupy bits 1 and 0 respectively. Until recently Linux hasn't used
> pp0, however with the addition of _PAGE_KERNEL_RO we started using it.
> 
> The problem arises in the LPAR code, where we need to translate the PP
> bits into the argument for the H_PROTECT hypercall. Currently the code
> only passes bits 0-2 of newpp, which covers pp1, pp2 and N (no
> execute), meaning pp0 is not passed to the hypervisor at all.
> 
> We can't simply pass it through in bit 63, as that would collide with a
> different field in the flags argument, as defined in PAPR. Instead we
> have to shift it down to bit 8 (IBM bit 55).
> 
> Fixes: e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO")
> Cc: stable@vger.kernel.org # v4.7+
> Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> [mpe: Simplify the test, rework change log]
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Series applied to powerpc next.

https://git.kernel.org/powerpc/c/e71ff982ae4c17d176e9f0132157d5

cheers


* Re: [PATCH v6 06/10] powerpc/lib/code-patching: Use alternate map for patch_instruction()
  2017-07-03 13:01 ` [PATCH v6 06/10] powerpc/lib/code-patching: Use alternate map for patch_instruction() Michael Ellerman
@ 2017-11-23  7:12   ` Christophe LEROY
  2017-11-23 11:04     ` Michael Ellerman
  0 siblings, 1 reply; 13+ messages in thread
From: Christophe LEROY @ 2017-11-23  7:12 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev, Balbir Singh



On 03/07/2017 at 15:01, Michael Ellerman wrote:
> From: Balbir Singh <bsingharora@gmail.com>
> 
> This patch creates the window using text_poke_area, allocated via
> get_vm_area(). text_poke_area is per CPU to avoid locking.
> text_poke_area for each cpu is setup using late_initcall, prior to
> setup of these alternate mapping areas, we continue to use direct
> write to change/modify kernel text. With the ability to use alternate
> mappings to write to kernel text, it provides us the freedom to then
> turn text read-only and implement CONFIG_STRICT_KERNEL_RWX.
> 
> This code is CPU hotplug aware to ensure that the we have mappings for
> any new cpus as they come online and tear down mappings for any CPUs
> that go offline.
> 
> Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>   arch/powerpc/lib/code-patching.c | 171 ++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 167 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 500b0f6a0b64..c9de03e0c1f1 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -12,23 +12,186 @@
>   #include <linux/vmalloc.h>
>   #include <linux/init.h>
>   #include <linux/mm.h>
> -#include <asm/page.h>
> -#include <asm/code-patching.h>
> +#include <linux/cpuhotplug.h>
> +#include <linux/slab.h>
>   #include <linux/uaccess.h>
>   #include <linux/kprobes.h>
>   
> +#include <asm/pgtable.h>
> +#include <asm/tlbflush.h>
> +#include <asm/page.h>
> +#include <asm/code-patching.h>
>   
> -int patch_instruction(unsigned int *addr, unsigned int instr)
> +static int __patch_instruction(unsigned int *addr, unsigned int instr)
>   {
>   	int err;
>   
>   	__put_user_size(instr, addr, 4, err);
>   	if (err)
>   		return err;
> -	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" : : "r" (addr));
> +
> +	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" :: "r" (addr));
> +
> +	return 0;
> +}
> +

[...]

> +int patch_instruction(unsigned int *addr, unsigned int instr)
> +{
> +	int err;
> +	unsigned int *dest = NULL;
> +	unsigned long flags;
> +	unsigned long text_poke_addr;
> +	unsigned long kaddr = (unsigned long)addr;
> +
> +	/*
> +	 * During early early boot patch_instruction is called
> +	 * when text_poke_area is not ready, but we still need
> +	 * to allow patching. We just do the plain old patching
> +	 * We use slab_is_available and per cpu read * via this_cpu_read
> +	 * of text_poke_area. Per-CPU areas might not be up early
> +	 * this can create problems with just using this_cpu_read()
> +	 */
> +	if (!slab_is_available() || !this_cpu_read(text_poke_area))
> +		return __patch_instruction(addr, instr);
> +
> +	local_irq_save(flags);
> +
> +	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
> +	if (map_patch_area(addr, text_poke_addr)) {
> +		err = -1;
> +		goto out;
> +	}
> +
> +	dest = (unsigned int *)(text_poke_addr) +
> +			((kaddr & ~PAGE_MASK) / sizeof(unsigned int));
> +
> +	/*
> +	 * We use __put_user_size so that we can handle faults while
> +	 * writing to dest and return err to handle faults gracefully
> +	 */
> +	__put_user_size(instr, dest, 4, err);
> +	if (!err)
> +		asm ("dcbst 0, %0; sync; icbi 0,%0; icbi 0,%1; sync; isync"
> +			::"r" (dest), "r"(addr));

Is the second icbi really needed since the alternative area is mapped 
with PAGE_KERNEL which is not executable ?
If we could avoid that, we could refactor this part as follows:

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index d469224c4ada..85031de43bb9 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -23,15 +23,17 @@
  #include <asm/code-patching.h>
  #include <asm/setup.h>

-static int __patch_instruction(unsigned int *addr, unsigned int instr)
+static int __patch_instruction(unsigned int *addr, unsigned int instr,
+			       unsigned int *dest)
  {
  	int err;

-	__put_user_size(instr, addr, 4, err);
+	__put_user_size(instr, dest, 4, err);
  	if (err)
  		return err;

-	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" :: "r" (addr));
+	asm ("dcbst 0, %0; sync; icbi 0,%1; sync; isync" :: "r" (dest),
+							    "r" (addr));

  	return 0;
  }
@@ -149,7 +151,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
  	 * to allow patching. We just do the plain old patching
  	 */
  	if (!this_cpu_read(*PTRRELOC(&text_poke_area)))
-		return __patch_instruction(addr, instr);
+		return __patch_instruction(addr, instr, addr);

  	local_irq_save(flags);

@@ -162,14 +164,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
  	dest = (unsigned int *)(text_poke_addr) +
  			((kaddr & ~PAGE_MASK) / sizeof(unsigned int));

-	/*
-	 * We use __put_user_size so that we can handle faults while
-	 * writing to dest and return err to handle faults gracefully
-	 */
-	__put_user_size(instr, dest, 4, err);
-	if (!err)
-		asm ("dcbst 0, %0; sync; icbi 0,%0; icbi 0,%1; sync; isync"
-			::"r" (dest), "r"(addr));
+	__patch_instruction(addr, instr, dest);

  	err = unmap_patch_area(text_poke_addr);
  	if (err)
@@ -184,7 +179,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)

  int patch_instruction(unsigned int *addr, unsigned int instr)
  {
-	return __patch_instruction(addr, instr);
+	return __patch_instruction(addr, instr, addr);
  }

  #endif /* CONFIG_STRICT_KERNEL_RWX */


> +
> +	err = unmap_patch_area(text_poke_addr);
> +	if (err)
> +		pr_warn("failed to unmap %lx\n", text_poke_addr);
> +
> +out:
> +	local_irq_restore(flags);
> +
> +	return err;
> +}
> +#else /* !CONFIG_STRICT_KERNEL_RWX */
> +
> +int patch_instruction(unsigned int *addr, unsigned int instr)
> +{
> +	return __patch_instruction(addr, instr);
> +}
> +
> +#endif /* CONFIG_STRICT_KERNEL_RWX */
> +NOKPROBE_SYMBOL(patch_instruction);
> +
>   int patch_branch(unsigned int *addr, unsigned long target, int flags)
>   {
>   	return patch_instruction(addr, create_branch(addr, target, flags));
> 


* Re: [PATCH v6 06/10] powerpc/lib/code-patching: Use alternate map for patch_instruction()
  2017-11-23  7:12   ` Christophe LEROY
@ 2017-11-23 11:04     ` Michael Ellerman
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-11-23 11:04 UTC (permalink / raw)
  To: Christophe LEROY, linuxppc-dev, Balbir Singh

Christophe LEROY <christophe.leroy@c-s.fr> writes:

> On 03/07/2017 at 15:01, Michael Ellerman wrote:
>> From: Balbir Singh <bsingharora@gmail.com>
>>
>> This patch creates the window using text_poke_area, allocated via
>> get_vm_area(). text_poke_area is per CPU to avoid locking.
>> text_poke_area for each cpu is setup using late_initcall, prior to
>> setup of these alternate mapping areas, we continue to use direct
>> write to change/modify kernel text. With the ability to use alternate
>> mappings to write to kernel text, it provides us the freedom to then
>> turn text read-only and implement CONFIG_STRICT_KERNEL_RWX.
>>
>> This code is CPU hotplug aware to ensure that the we have mappings for
>> any new cpus as they come online and tear down mappings for any CPUs
>> that go offline.
>>
>> Signed-off-by: Balbir Singh <bsingharora@gmail.com>
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>> ---
>>   arch/powerpc/lib/code-patching.c | 171 ++++++++++++++++++++++++++++++++-
>>   1 file changed, 167 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
>> index 500b0f6a0b64..c9de03e0c1f1 100644
>> --- a/arch/powerpc/lib/code-patching.c
>> +++ b/arch/powerpc/lib/code-patching.c
>> @@ -12,23 +12,186 @@
>>   #include <linux/vmalloc.h>
>>   #include <linux/init.h>
>>   #include <linux/mm.h>
>> -#include <asm/page.h>
>> -#include <asm/code-patching.h>
>> +#include <linux/cpuhotplug.h>
>> +#include <linux/slab.h>
>>   #include <linux/uaccess.h>
>>   #include <linux/kprobes.h>
>>
>> +#include <asm/pgtable.h>
>> +#include <asm/tlbflush.h>
>> +#include <asm/page.h>
>> +#include <asm/code-patching.h>
>>
>> -int patch_instruction(unsigned int *addr, unsigned int instr)
>> +static int __patch_instruction(unsigned int *addr, unsigned int instr)
>>   {
>>   	int err;
>>
>>   	__put_user_size(instr, addr, 4, err);
>>   	if (err)
>>   		return err;
>> -	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" : : "r" (addr));
>> +
>> +	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" :: "r" (addr));
>> +
>> +	return 0;
>> +}
>> +
>
> [...]
>
>> +int patch_instruction(unsigned int *addr, unsigned int instr)
>> +{
>> +	int err;
>> +	unsigned int *dest = NULL;
>> +	unsigned long flags;
>> +	unsigned long text_poke_addr;
>> +	unsigned long kaddr = (unsigned long)addr;
>> +
>> +	/*
>> +	 * During early early boot patch_instruction is called
>> +	 * when text_poke_area is not ready, but we still need
>> +	 * to allow patching. We just do the plain old patching
>> +	 * We use slab_is_available and per cpu read * via this_cpu_read
>> +	 * of text_poke_area. Per-CPU areas might not be up early
>> +	 * this can create problems with just using this_cpu_read()
>> +	 */
>> +	if (!slab_is_available() || !this_cpu_read(text_poke_area))
>> +		return __patch_instruction(addr, instr);
>> +
>> +	local_irq_save(flags);
>> +
>> +	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
>> +	if (map_patch_area(addr, text_poke_addr)) {
>> +		err = -1;
>> +		goto out;
>> +	}
>> +
>> +	dest = (unsigned int *)(text_poke_addr) +
>> +			((kaddr & ~PAGE_MASK) / sizeof(unsigned int));
>> +
>> +	/*
>> +	 * We use __put_user_size so that we can handle faults while
>> +	 * writing to dest and return err to handle faults gracefully
>> +	 */
>> +	__put_user_size(instr, dest, 4, err);
>> +	if (!err)
>> +		asm ("dcbst 0, %0; sync; icbi 0,%0; icbi 0,%1; sync; isync"
>> +			::"r" (dest), "r"(addr));
>
> Is the second icbi really needed since the alternative area is mapped
> with PAGE_KERNEL which is not executable ?

The use of dest vs addr is hard to follow. The second icbi is of addr,
which is the executable mapping, so yes we need that icbi.

But the first icbi could be skipped, it's to dest which is not
executable. Is that what you meant? Or am I missing the point entirely.

> If we could avoid that, we could refactor this part as follows:

Yeah that would be a good clean up I think.

> diff --git a/arch/powerpc/lib/code-patching.c=20
> b/arch/powerpc/lib/code-patching.c
> index d469224c4ada..85031de43bb9 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -23,15 +23,17 @@
>   #include <asm/code-patching.h>
>   #include <asm/setup.h>
>
> -static int __patch_instruction(unsigned int *addr, unsigned int instr)
> +static int __patch_instruction(unsigned int *addr, unsigned int instr,
> +			       unsigned int *dest)

Renaming addr vs dest would help I think.

ie. maybe:

static int __patch_instruction(unsigned int *exec_addr, unsigned int instr,
			       unsigned int *patch_addr)

Or something.

cheers

>   {
>   	int err;
>
> -	__put_user_size(instr, addr, 4, err);
> +	__put_user_size(instr, dest, 4, err);
>   	if (err)
>   		return err;
>
> -	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" :: "r" (addr));
> +	asm ("dcbst 0, %0; sync; icbi 0,%1; sync; isync" :: "r" (dest),
> +							    "r" (addr));
>
>   	return 0;
>   }
> @@ -149,7 +151,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>   	 * to allow patching. We just do the plain old patching
>   	 */
>   	if (!this_cpu_read(*PTRRELOC(&text_poke_area)))
> -		return __patch_instruction(addr, instr);
> +		return __patch_instruction(addr, instr, addr);
>
>   	local_irq_save(flags);
>
> @@ -162,14 +164,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>   	dest = (unsigned int *)(text_poke_addr) +
>   			((kaddr & ~PAGE_MASK) / sizeof(unsigned int));
>
> -	/*
> -	 * We use __put_user_size so that we can handle faults while
> -	 * writing to dest and return err to handle faults gracefully
> -	 */
> -	__put_user_size(instr, dest, 4, err);
> -	if (!err)
> -		asm ("dcbst 0, %0; sync; icbi 0,%0; icbi 0,%1; sync; isync"
> -			::"r" (dest), "r"(addr));
> +	__patch_instruction(addr, instr, dest);
>
>   	err = unmap_patch_area(text_poke_addr);
>   	if (err)
> @@ -184,7 +179,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>
>   int patch_instruction(unsigned int *addr, unsigned int instr)
>   {
> -	return __patch_instruction(addr, instr);
> +	return __patch_instruction(addr, instr, addr);
>   }
>
>   #endif /* CONFIG_STRICT_KERNEL_RWX */
>
>
>> +
>> +	err = unmap_patch_area(text_poke_addr);
>> +	if (err)
>> +		pr_warn("failed to unmap %lx\n", text_poke_addr);
>> +
>> +out:
>> +	local_irq_restore(flags);
>> +
>> +	return err;
>> +}
>> +#else /* !CONFIG_STRICT_KERNEL_RWX */
>> +
>> +int patch_instruction(unsigned int *addr, unsigned int instr)
>> +{
>> +	return __patch_instruction(addr, instr);
>> +}
>> +
>> +#endif /* CONFIG_STRICT_KERNEL_RWX */
>> +NOKPROBE_SYMBOL(patch_instruction);
>> +
>>   int patch_branch(unsigned int *addr, unsigned long target, int flags)
>>   {
>>   	return patch_instruction(addr, create_branch(addr, target, flags));
>>


end of thread

Thread overview: 13+ messages
2017-07-03 13:01 [PATCH v6 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 02/10] powerpc/mm/radix: Fix execute permissions for interrupt_vectors Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 03/10] powerpc/kprobes: Move kprobes over to patch_instruction() Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 04/10] powerpc/kprobes/optprobes: Use patch_instruction() Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 05/10] powerpc/xmon: Add patch_instruction() support for xmon Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 06/10] powerpc/lib/code-patching: Use alternate map for patch_instruction() Michael Ellerman
2017-11-23  7:12   ` Christophe LEROY
2017-11-23 11:04     ` Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 07/10] powerpc/vmlinux.lds: Align __init_begin to 16M Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 08/10] powerpc/mm/hash: Implement mark_rodata_ro() for hash Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 09/10] powerpc/mm/radix: Implement STRICT_RWX/mark_rodata_ro() for Radix Michael Ellerman
2017-07-03 13:01 ` [PATCH v6 10/10] powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs Michael Ellerman
2017-07-04 10:48 ` [v6, 01/10] powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() Michael Ellerman
