* [PATCHv3 0/7] Better page protections for arm64
@ 2014-08-21  1:20 Laura Abbott
  2014-08-21  1:20 ` [PATCHv3 1/7] arm64: Treat handle_arch_irq as a function pointer Laura Abbott
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Laura Abbott @ 2014-08-21  1:20 UTC
  To: linux-arm-kernel

Hi,

This is v3 of the series to add stricter page protections for arm64
(text should be read-only, data/heap should be no-execute, etc.). There
are quite a few more patches in this series than in the previous version.

Patches 1-4 address issues where code or data was ending up in areas
that needed to be read-only or non-executable. Two of these patches were
sent out before[1] and two address new issues found with more testing
(hotplug/suspend test cases).

Patches 5-6 add support for patching kernel text via the fixmap. This
matches the approach taken on arm.

Patch 7 is the patch to map memory with appropriate permissions.

This has been tested with 3-level page tables; 2- and 4-level
configurations could use testing.

Thanks,
Laura


[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/261458.html

Laura Abbott (7):
  arm64: Treat handle_arch_irq as a function pointer
  arm64: Switch to ldr for loading the stub vectors
  arm64: Move cpu_resume into the text section
  arm64: Move some head.text functions to executable section
  arm64: Factor out fixmap initialization from ioremap
  arm64: use fixmap for text patching when text is RO
  arm64: add better page protections to arm64

 arch/arm64/Kconfig                  |   8 +
 arch/arm64/Kconfig.debug            |  23 ++
 arch/arm64/include/asm/cacheflush.h |   3 +
 arch/arm64/include/asm/fixmap.h     |   8 +-
 arch/arm64/include/asm/insn.h       |   2 +
 arch/arm64/include/asm/irq.h        |   1 -
 arch/arm64/kernel/entry.S           |   6 +-
 arch/arm64/kernel/head.S            | 427 +++++++++++++++++++-----------------
 arch/arm64/kernel/insn.c            |  74 ++++++-
 arch/arm64/kernel/irq.c             |   2 +
 arch/arm64/kernel/jump_label.c      |   2 +-
 arch/arm64/kernel/setup.c           |   3 +-
 arch/arm64/kernel/sleep.S           |  12 +-
 arch/arm64/kernel/vmlinux.lds.S     |  18 ++
 arch/arm64/mm/init.c                |   1 +
 arch/arm64/mm/ioremap.c             |  93 +-------
 arch/arm64/mm/mm.h                  |   2 +
 arch/arm64/mm/mmu.c                 | 397 ++++++++++++++++++++++++++++++---
 18 files changed, 740 insertions(+), 342 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv3 1/7] arm64: Treat handle_arch_irq as a function pointer
  2014-08-21  1:20 [PATCHv3 0/7] Better page protections for arm64 Laura Abbott
@ 2014-08-21  1:20 ` Laura Abbott
       [not found]   ` <CAGXu5jLur_gdXs2X5BCmxB6L5HwgyP12jkrufK7bpS0Cxhp_+Q@mail.gmail.com>
  2014-08-28 17:02   ` Catalin Marinas
  2014-08-21  1:20 ` [PATCHv3 2/7] arm64: Switch to ldr for loading the stub vectors Laura Abbott
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 26+ messages in thread
From: Laura Abbott @ 2014-08-21  1:20 UTC
  To: linux-arm-kernel

handle_arch_irq isn't actually text; it's just a function pointer.
It doesn't need to be stored in the text section, and doing so
causes problems if we ever want to make the kernel text read only.
Declare handle_arch_irq as a proper function pointer stored in
the data section.
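
For illustration only (and not part of the patch), this is roughly how an
irqchip driver such as the GIC hands the kernel its top-level handler; the
names below are hypothetical. After this change the registered pointer
simply lives in .data instead of being a .quad patched into .text:

	/* hypothetical irqchip driver, shown only as an illustration */
	static void my_chip_handle_irq(struct pt_regs *regs)
	{
		/* acknowledge the interrupt, dispatch it, then EOI */
	}

	static int __init my_chip_init(void)
	{
		set_handle_irq(my_chip_handle_irq);
		return 0;
	}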

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/include/asm/irq.h | 1 -
 arch/arm64/kernel/entry.S    | 6 ++----
 arch/arm64/kernel/irq.c      | 2 ++
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index e1f7ecd..1eebf5b 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -3,7 +3,6 @@
 
 #include <asm-generic/irq.h>
 
-extern void (*handle_arch_irq)(struct pt_regs *);
 extern void migrate_irqs(void);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f0b5e51..854379a 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -168,7 +168,8 @@ tsk	.req	x28		// current thread_info
  * Interrupt handling.
  */
 	.macro	irq_handler
-	ldr	x1, handle_arch_irq
+	adrp x1, handle_arch_irq
+	ldr x1, [x1, #:lo12:handle_arch_irq]
 	mov	x0, sp
 	blr	x1
 	.endm
@@ -696,6 +697,3 @@ ENTRY(sys_rt_sigreturn_wrapper)
 	mov	x0, sp
 	b	sys_rt_sigreturn
 ENDPROC(sys_rt_sigreturn_wrapper)
-
-ENTRY(handle_arch_irq)
-	.quad	0
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 0f08dfd..e4fedbc 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -67,6 +67,8 @@ void handle_IRQ(unsigned int irq, struct pt_regs *regs)
 	set_irq_regs(old_regs);
 }
 
+void (*handle_arch_irq)(struct pt_regs *) = NULL;
+
 void __init set_handle_irq(void (*handle_irq)(struct pt_regs *))
 {
 	if (handle_arch_irq)
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv3 2/7] arm64: Switch to ldr for loading the stub vectors
  2014-08-21  1:20 [PATCHv3 0/7] Better page protections for arm64 Laura Abbott
  2014-08-21  1:20 ` [PATCHv3 1/7] arm64: Treat handle_arch_irq as a function pointer Laura Abbott
@ 2014-08-21  1:20 ` Laura Abbott
  2014-08-21  9:30   ` Mark Rutland
  2014-08-21  1:20 ` [PATCHv3 3/7] arm64: Move cpu_resume into the text section Laura Abbott
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-21  1:20 UTC
  To: linux-arm-kernel

The hyp stub vectors are currently loaded using adr. This
instruction has a +/- 1MB range for the loading address. If
the alignment for sections is changed. the address may be more
than 1MB away, resulting in relocation errors. Switch to using
ldr for getting the address to ensure we aren't affected by the
location of __hyp_stub_vectors.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/kernel/head.S | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 144f105..61bc210 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -331,7 +331,8 @@ CPU_LE(	movk	x0, #0x30d0, lsl #16	)	// Clear EE and E0E on LE systems
 	msr	vttbr_el2, xzr
 
 	/* Hypervisor stub */
-	adr	x0, __hyp_stub_vectors
+	adrp	x0, __hyp_stub_vectors
+	add	x0, x0, #:lo12:__hyp_stub_vectors
 	msr	vbar_el2, x0
 
 	/* spsr */
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv3 3/7] arm64: Move cpu_resume into the text section
  2014-08-21  1:20 [PATCHv3 0/7] Better page protections for arm64 Laura Abbott
  2014-08-21  1:20 ` [PATCHv3 1/7] arm64: Treat handle_arch_irq as a function pointer Laura Abbott
  2014-08-21  1:20 ` [PATCHv3 2/7] arm64: Switch to ldr for loading the stub vectors Laura Abbott
@ 2014-08-21  1:20 ` Laura Abbott
  2014-08-25 20:34   ` Stephen Boyd
  2014-08-21  1:20 ` [PATCHv3 4/7] arm64: Move some head.text functions to executable section Laura Abbott
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-21  1:20 UTC
  To: linux-arm-kernel

The function cpu_resume currently lives in the .data
section. This breaks the ability to make .data
non-executable. Move cpu_resume into .text, which is
where it actually belongs.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/kernel/sleep.S | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index b192572..6ee13db 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -126,12 +126,12 @@ cpu_resume_after_mmu:
 	ret
 ENDPROC(cpu_resume_after_mmu)
 
-	.data
 ENTRY(cpu_resume)
 	bl	el2_setup		// if in EL2 drop to EL1 cleanly
 #ifdef CONFIG_SMP
 	mrs	x1, mpidr_el1
-	adr	x4, mpidr_hash_ptr
+	adrp	x4, mpidr_hash_ptr
+	add	x4, x4, #:lo12:mpidr_hash_ptr
 	ldr	x5, [x4]
 	add	x8, x4, x5		// x8 = struct mpidr_hash phys address
         /* retrieve mpidr_hash members to compute the hash */
@@ -143,14 +143,15 @@ ENTRY(cpu_resume)
 #else
 	mov	x7, xzr
 #endif
-	adr	x0, sleep_save_sp
+	adrp	x0, sleep_save_sp
+	add	x0, x0, #:lo12:sleep_save_sp
 	ldr	x0, [x0, #SLEEP_SAVE_SP_PHYS]
 	ldr	x0, [x0, x7, lsl #3]
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
-	adr	x1, sleep_idmap_phys
+	adrp	x1, sleep_idmap_phys
 	/* load physical address of identity map page table in x1 */
-	ldr	x1, [x1]
+	ldr	x1, [x1, #:lo12:sleep_idmap_phys]
 	mov	sp, x2
 	/*
 	 * cpu_do_resume expects x0 to contain context physical address
@@ -160,6 +161,7 @@ ENTRY(cpu_resume)
 	b	cpu_resume_mmu		// Resume MMU, never returns
 ENDPROC(cpu_resume)
 
+	.data
 	.align 3
 mpidr_hash_ptr:
 	/*
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv3 4/7] arm64: Move some head.text functions to executable section
  2014-08-21  1:20 [PATCHv3 0/7] Better page protections for arm64 Laura Abbott
                   ` (2 preceding siblings ...)
  2014-08-21  1:20 ` [PATCHv3 3/7] arm64: Move cpu_resume into the text section Laura Abbott
@ 2014-08-21  1:20 ` Laura Abbott
  2014-08-21 10:34   ` Mark Rutland
  2014-08-21  1:20 ` [PATCHv3 5/7] arm64: Factor out fixmap initialization from ioremap Laura Abbott
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-21  1:20 UTC
  To: linux-arm-kernel

The code in the head.text section of the kernel exists in the
same section as the swapper_pg_dir, which means it needs the
same page table permissions. The swapper_pg_dir needs to be
writable but shouldn't be executable. The head.text section
is intended to run at early bootup, before any of the regular
kernel mappings have been set up, so there is no issue at bootup.
However, the suspend/resume/hotplug code paths require some of
these head.S functions to run later, which means they need to be
executable. We can't easily move all of head.text to an
executable section, so split it into two parts: the code that
is used only at early head.S bootup and the code that is used
after bootup. There is a small bit of code duplication because
of some relocation issues related to accessing code more than
1MB away.
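
As a purely illustrative sketch (the patch itself only moves assembly code
around, and the helper below is hypothetical), C code could be placed into
the new post-boot executable section with a section attribute, mirroring
what the .section ".latehead.text","ax" directive does on the assembly side:

	/* hypothetical example, not part of the patch */
	static void __attribute__((__section__(".latehead.text")))
	late_boot_helper(void)
	{
		/* code that must stay executable on the resume/hotplug path */
	}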

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/kernel/head.S        | 424 +++++++++++++++++++++-------------------
 arch/arm64/kernel/vmlinux.lds.S |   1 +
 2 files changed, 228 insertions(+), 197 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 61bc210..dbdb378 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -238,7 +238,7 @@ ENTRY(stext)
 	mov	x0, x22
 	bl	lookup_processor_type
 	mov	x23, x0				// x23=current cpu_table
-	cbz	x23, __error_p			// invalid processor (x23=0)?
+	cbz	x23, __h_error_p		// invalid processor (x23=0)?
 	bl	__vet_fdt
 	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
 	/*
@@ -250,12 +250,236 @@ ENTRY(stext)
 	 */
 	ldr	x27, __switch_data		// address to jump to after
 						// MMU has been enabled
-	adr	lr, __enable_mmu		// return (PIC) address
+	adr	lr, __h_enable_mmu		// return (PIC) address
 	ldr	x12, [x23, #CPU_INFO_SETUP]
 	add	x12, x12, x28			// __virt_to_phys
 	br	x12				// initialise processor
 ENDPROC(stext)
 
+__h_error_p:
+ENDPROC(__h_error_p)
+
+__h_error:
+1:
+	nop
+	b	1b
+ENDPROC(__h_error)
+
+__h_enable_mmu:
+	ldr	x5, =vectors
+	msr	vbar_el1, x5
+	msr	ttbr0_el1, x25			// load TTBR0
+	msr	ttbr1_el1, x26			// load TTBR1
+	isb
+	b	__h_turn_mmu_on
+ENDPROC(__h_enable_mmu)
+
+	.align	4
+__h_turn_mmu_on:
+	msr	sctlr_el1, x0
+	isb
+	br	x27
+ENDPROC(__h_turn_mmu_on)
+
+/*
+ * Determine validity of the x21 FDT pointer.
+ * The dtb must be 8-byte aligned and live in the first 512M of memory.
+ */
+__vet_fdt:
+	tst	x21, #0x7
+	b.ne	1f
+	cmp	x21, x24
+	b.lt	1f
+	mov	x0, #(1 << 29)
+	add	x0, x0, x24
+	cmp	x21, x0
+	b.ge	1f
+	ret
+1:
+	mov	x21, #0
+	ret
+ENDPROC(__vet_fdt)
+/*
+ * Macro to create a table entry to the next page.
+ *
+ *	tbl:	page table address
+ *	virt:	virtual address
+ *	shift:	#imm page table shift
+ *	ptrs:	#imm pointers per table page
+ *
+ * Preserves:	virt
+ * Corrupts:	tmp1, tmp2
+ * Returns:	tbl -> next level table page address
+ */
+	.macro	create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
+	lsr	\tmp1, \virt, #\shift
+	and	\tmp1, \tmp1, #\ptrs - 1	// table index
+	add	\tmp2, \tbl, #PAGE_SIZE
+	orr	\tmp2, \tmp2, #PMD_TYPE_TABLE	// address of next table and entry type
+	str	\tmp2, [\tbl, \tmp1, lsl #3]
+	add	\tbl, \tbl, #PAGE_SIZE		// next level table page
+	.endm
+
+/*
+ * Macro to populate the PGD (and possibily PUD) for the corresponding
+ * block entry in the next level (tbl) for the given virtual address.
+ *
+ * Preserves:	tbl, next, virt
+ * Corrupts:	tmp1, tmp2
+ */
+	.macro	create_pgd_entry, tbl, virt, tmp1, tmp2
+	create_table_entry \tbl, \virt, PGDIR_SHIFT, PTRS_PER_PGD, \tmp1, \tmp2
+#if SWAPPER_PGTABLE_LEVELS == 3
+	create_table_entry \tbl, \virt, TABLE_SHIFT, PTRS_PER_PTE, \tmp1, \tmp2
+#endif
+	.endm
+
+/*
+ * Macro to populate block entries in the page table for the start..end
+ * virtual range (inclusive).
+ *
+ * Preserves:	tbl, flags
+ * Corrupts:	phys, start, end, pstate
+ */
+	.macro	create_block_map, tbl, flags, phys, start, end
+	lsr	\phys, \phys, #BLOCK_SHIFT
+	lsr	\start, \start, #BLOCK_SHIFT
+	and	\start, \start, #PTRS_PER_PTE - 1	// table index
+	orr	\phys, \flags, \phys, lsl #BLOCK_SHIFT	// table entry
+	lsr	\end, \end, #BLOCK_SHIFT
+	and	\end, \end, #PTRS_PER_PTE - 1		// table end index
+9999:	str	\phys, [\tbl, \start, lsl #3]		// store the entry
+	add	\start, \start, #1			// next entry
+	add	\phys, \phys, #BLOCK_SIZE		// next block
+	cmp	\start, \end
+	b.ls	9999b
+	.endm
+
+/*
+ * Setup the initial page tables. We only setup the barest amount which is
+ * required to get the kernel running. The following sections are required:
+ *   - identity mapping to enable the MMU (low address, TTBR0)
+ *   - first few MB of the kernel linear mapping to jump to once the MMU has
+ *     been enabled, including the FDT blob (TTBR1)
+ *   - pgd entry for fixed mappings (TTBR1)
+ */
+__create_page_tables:
+	pgtbl	x25, x26, x28			// idmap_pg_dir and swapper_pg_dir addresses
+	mov	x27, lr
+
+	/*
+	 * Invalidate the idmap and swapper page tables to avoid potential
+	 * dirty cache lines being evicted.
+	 */
+	mov	x0, x25
+	add	x1, x26, #SWAPPER_DIR_SIZE
+	bl	__inval_cache_range
+
+	/*
+	 * Clear the idmap and swapper page tables.
+	 */
+	mov	x0, x25
+	add	x6, x26, #SWAPPER_DIR_SIZE
+1:	stp	xzr, xzr, [x0], #16
+	stp	xzr, xzr, [x0], #16
+	stp	xzr, xzr, [x0], #16
+	stp	xzr, xzr, [x0], #16
+	cmp	x0, x6
+	b.lo	1b
+
+	ldr	x7, =MM_MMUFLAGS
+
+	/*
+	 * Create the identity mapping.
+	 */
+	mov	x0, x25				// idmap_pg_dir
+	ldr	x3, =KERNEL_START
+	add	x3, x3, x28			// __pa(KERNEL_START)
+	create_pgd_entry x0, x3, x5, x6
+	ldr	x6, =KERNEL_END
+	mov	x5, x3				// __pa(KERNEL_START)
+	add	x6, x6, x28			// __pa(KERNEL_END)
+	create_block_map x0, x7, x3, x5, x6
+
+	/*
+	 * Map the kernel image (starting with PHYS_OFFSET).
+	 */
+	mov	x0, x26				// swapper_pg_dir
+	mov	x5, #PAGE_OFFSET
+	create_pgd_entry x0, x5, x3, x6
+	ldr	x6, =KERNEL_END
+	mov	x3, x24				// phys offset
+	create_block_map x0, x7, x3, x5, x6
+
+	/*
+	 * Map the FDT blob (maximum 2MB; must be within 512MB of
+	 * PHYS_OFFSET).
+	 */
+	mov	x3, x21				// FDT phys address
+	and	x3, x3, #~((1 << 21) - 1)	// 2MB aligned
+	mov	x6, #PAGE_OFFSET
+	sub	x5, x3, x24			// subtract PHYS_OFFSET
+	tst	x5, #~((1 << 29) - 1)		// within 512MB?
+	csel	x21, xzr, x21, ne		// zero the FDT pointer
+	b.ne	1f
+	add	x5, x5, x6			// __va(FDT blob)
+	add	x6, x5, #1 << 21		// 2MB for the FDT blob
+	sub	x6, x6, #1			// inclusive range
+	create_block_map x0, x7, x3, x5, x6
+1:
+	/*
+	 * Since the page tables have been populated with non-cacheable
+	 * accesses (MMU disabled), invalidate the idmap and swapper page
+	 * tables again to remove any speculatively loaded cache lines.
+	 */
+	mov	x0, x25
+	add	x1, x26, #SWAPPER_DIR_SIZE
+	bl	__inval_cache_range
+
+	mov	lr, x27
+	ret
+ENDPROC(__create_page_tables)
+	.ltorg
+
+	.align	3
+	.type	__switch_data, %object
+__switch_data:
+	.quad	__mmap_switched
+	.quad	__bss_start			// x6
+	.quad	__bss_stop			// x7
+	.quad	processor_id			// x4
+	.quad	__fdt_pointer			// x5
+	.quad	memstart_addr			// x6
+	.quad	init_thread_union + THREAD_START_SP // sp
+
+/*
+ * The following fragment of code is executed with the MMU on in MMU mode, and
+ * uses absolute addresses; this is not position independent.
+ */
+__mmap_switched:
+	adr	x3, __switch_data + 8
+
+	ldp	x6, x7, [x3], #16
+1:	cmp	x6, x7
+	b.hs	2f
+	str	xzr, [x6], #8			// Clear BSS
+	b	1b
+2:
+	ldp	x4, x5, [x3], #16
+	ldr	x6, [x3], #8
+	ldr	x16, [x3]
+	mov	sp, x16
+	str	x22, [x4]			// Save processor ID
+	str	x21, [x5]			// Save FDT pointer
+	str	x24, [x6]			// Save PHYS_OFFSET
+	mov	x29, #0
+	b	start_kernel
+ENDPROC(__mmap_switched)
+
+/*
+ * end 'true' head section, begin head section that can be read only
+ */
+	.section ".latehead.text","ax"
 /*
  * If we're fortunate enough to boot@EL2, ensure that the world is
  * sane before dropping to EL1.
@@ -497,183 +721,6 @@ ENDPROC(__calc_phys_offset)
 	.quad	PAGE_OFFSET
 
 /*
- * Macro to create a table entry to the next page.
- *
- *	tbl:	page table address
- *	virt:	virtual address
- *	shift:	#imm page table shift
- *	ptrs:	#imm pointers per table page
- *
- * Preserves:	virt
- * Corrupts:	tmp1, tmp2
- * Returns:	tbl -> next level table page address
- */
-	.macro	create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
-	lsr	\tmp1, \virt, #\shift
-	and	\tmp1, \tmp1, #\ptrs - 1	// table index
-	add	\tmp2, \tbl, #PAGE_SIZE
-	orr	\tmp2, \tmp2, #PMD_TYPE_TABLE	// address of next table and entry type
-	str	\tmp2, [\tbl, \tmp1, lsl #3]
-	add	\tbl, \tbl, #PAGE_SIZE		// next level table page
-	.endm
-
-/*
- * Macro to populate the PGD (and possibily PUD) for the corresponding
- * block entry in the next level (tbl) for the given virtual address.
- *
- * Preserves:	tbl, next, virt
- * Corrupts:	tmp1, tmp2
- */
-	.macro	create_pgd_entry, tbl, virt, tmp1, tmp2
-	create_table_entry \tbl, \virt, PGDIR_SHIFT, PTRS_PER_PGD, \tmp1, \tmp2
-#if SWAPPER_PGTABLE_LEVELS == 3
-	create_table_entry \tbl, \virt, TABLE_SHIFT, PTRS_PER_PTE, \tmp1, \tmp2
-#endif
-	.endm
-
-/*
- * Macro to populate block entries in the page table for the start..end
- * virtual range (inclusive).
- *
- * Preserves:	tbl, flags
- * Corrupts:	phys, start, end, pstate
- */
-	.macro	create_block_map, tbl, flags, phys, start, end
-	lsr	\phys, \phys, #BLOCK_SHIFT
-	lsr	\start, \start, #BLOCK_SHIFT
-	and	\start, \start, #PTRS_PER_PTE - 1	// table index
-	orr	\phys, \flags, \phys, lsl #BLOCK_SHIFT	// table entry
-	lsr	\end, \end, #BLOCK_SHIFT
-	and	\end, \end, #PTRS_PER_PTE - 1		// table end index
-9999:	str	\phys, [\tbl, \start, lsl #3]		// store the entry
-	add	\start, \start, #1			// next entry
-	add	\phys, \phys, #BLOCK_SIZE		// next block
-	cmp	\start, \end
-	b.ls	9999b
-	.endm
-
-/*
- * Setup the initial page tables. We only setup the barest amount which is
- * required to get the kernel running. The following sections are required:
- *   - identity mapping to enable the MMU (low address, TTBR0)
- *   - first few MB of the kernel linear mapping to jump to once the MMU has
- *     been enabled, including the FDT blob (TTBR1)
- *   - pgd entry for fixed mappings (TTBR1)
- */
-__create_page_tables:
-	pgtbl	x25, x26, x28			// idmap_pg_dir and swapper_pg_dir addresses
-	mov	x27, lr
-
-	/*
-	 * Invalidate the idmap and swapper page tables to avoid potential
-	 * dirty cache lines being evicted.
-	 */
-	mov	x0, x25
-	add	x1, x26, #SWAPPER_DIR_SIZE
-	bl	__inval_cache_range
-
-	/*
-	 * Clear the idmap and swapper page tables.
-	 */
-	mov	x0, x25
-	add	x6, x26, #SWAPPER_DIR_SIZE
-1:	stp	xzr, xzr, [x0], #16
-	stp	xzr, xzr, [x0], #16
-	stp	xzr, xzr, [x0], #16
-	stp	xzr, xzr, [x0], #16
-	cmp	x0, x6
-	b.lo	1b
-
-	ldr	x7, =MM_MMUFLAGS
-
-	/*
-	 * Create the identity mapping.
-	 */
-	mov	x0, x25				// idmap_pg_dir
-	ldr	x3, =KERNEL_START
-	add	x3, x3, x28			// __pa(KERNEL_START)
-	create_pgd_entry x0, x3, x5, x6
-	ldr	x6, =KERNEL_END
-	mov	x5, x3				// __pa(KERNEL_START)
-	add	x6, x6, x28			// __pa(KERNEL_END)
-	create_block_map x0, x7, x3, x5, x6
-
-	/*
-	 * Map the kernel image (starting with PHYS_OFFSET).
-	 */
-	mov	x0, x26				// swapper_pg_dir
-	mov	x5, #PAGE_OFFSET
-	create_pgd_entry x0, x5, x3, x6
-	ldr	x6, =KERNEL_END
-	mov	x3, x24				// phys offset
-	create_block_map x0, x7, x3, x5, x6
-
-	/*
-	 * Map the FDT blob (maximum 2MB; must be within 512MB of
-	 * PHYS_OFFSET).
-	 */
-	mov	x3, x21				// FDT phys address
-	and	x3, x3, #~((1 << 21) - 1)	// 2MB aligned
-	mov	x6, #PAGE_OFFSET
-	sub	x5, x3, x24			// subtract PHYS_OFFSET
-	tst	x5, #~((1 << 29) - 1)		// within 512MB?
-	csel	x21, xzr, x21, ne		// zero the FDT pointer
-	b.ne	1f
-	add	x5, x5, x6			// __va(FDT blob)
-	add	x6, x5, #1 << 21		// 2MB for the FDT blob
-	sub	x6, x6, #1			// inclusive range
-	create_block_map x0, x7, x3, x5, x6
-1:
-	/*
-	 * Since the page tables have been populated with non-cacheable
-	 * accesses (MMU disabled), invalidate the idmap and swapper page
-	 * tables again to remove any speculatively loaded cache lines.
-	 */
-	mov	x0, x25
-	add	x1, x26, #SWAPPER_DIR_SIZE
-	bl	__inval_cache_range
-
-	mov	lr, x27
-	ret
-ENDPROC(__create_page_tables)
-	.ltorg
-
-	.align	3
-	.type	__switch_data, %object
-__switch_data:
-	.quad	__mmap_switched
-	.quad	__bss_start			// x6
-	.quad	__bss_stop			// x7
-	.quad	processor_id			// x4
-	.quad	__fdt_pointer			// x5
-	.quad	memstart_addr			// x6
-	.quad	init_thread_union + THREAD_START_SP // sp
-
-/*
- * The following fragment of code is executed with the MMU on in MMU mode, and
- * uses absolute addresses; this is not position independent.
- */
-__mmap_switched:
-	adr	x3, __switch_data + 8
-
-	ldp	x6, x7, [x3], #16
-1:	cmp	x6, x7
-	b.hs	2f
-	str	xzr, [x6], #8			// Clear BSS
-	b	1b
-2:
-	ldp	x4, x5, [x3], #16
-	ldr	x6, [x3], #8
-	ldr	x16, [x3]
-	mov	sp, x16
-	str	x22, [x4]			// Save processor ID
-	str	x21, [x5]			// Save FDT pointer
-	str	x24, [x6]			// Save PHYS_OFFSET
-	mov	x29, #0
-	b	start_kernel
-ENDPROC(__mmap_switched)
-
-/*
  * Exception handling. Something went wrong and we can't proceed. We ought to
  * tell the user, but since we don't have any guarantee that we're even
  * running on the right architecture, we do virtually nothing.
@@ -721,21 +768,4 @@ __lookup_processor_type_data:
 	.quad	cpu_table
 	.size	__lookup_processor_type_data, . - __lookup_processor_type_data
 
-/*
- * Determine validity of the x21 FDT pointer.
- * The dtb must be 8-byte aligned and live in the first 512M of memory.
- */
-__vet_fdt:
-	tst	x21, #0x7
-	b.ne	1f
-	cmp	x21, x24
-	b.lt	1f
-	mov	x0, #(1 << 29)
-	add	x0, x0, x24
-	cmp	x21, x0
-	b.ge	1f
-	ret
-1:
-	mov	x21, #0
-	ret
-ENDPROC(__vet_fdt)
+
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 97f0c04..2b674c5 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -56,6 +56,7 @@ SECTIONS
 	}
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
+			*(.latehead.text)
 			__exception_text_start = .;
 			*(.exception.text)
 			__exception_text_end = .;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv3 5/7] arm64: Factor out fixmap initialization from ioremap
  2014-08-21  1:20 [PATCHv3 0/7] Better page protections for arm64 Laura Abbott
                   ` (3 preceding siblings ...)
  2014-08-21  1:20 ` [PATCHv3 4/7] arm64: Move some head.text functions to executable section Laura Abbott
@ 2014-08-21  1:20 ` Laura Abbott
  2014-08-23  5:45   ` Kees Cook
  2014-08-21  1:20 ` [PATCHv3 6/7] arm64: use fixmap for text patching when text is RO Laura Abbott
  2014-08-21  1:20 ` [PATCHv3 7/7] arm64: add better page protections to arm64 Laura Abbott
  6 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-21  1:20 UTC
  To: linux-arm-kernel

The fixmap API was originally added to arm64 for
early_ioremap purposes. It can be used for other purposes too,
so move the initialization out of ioremap to somewhere more
generic. This makes it obvious where the fixmap is being set
up and allows for a cleaner implementation of __set_fixmap.
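
As an illustration (not part of the patch): once early_fixmap_init() has
run, the generic helpers pulled in from asm-generic/fixmap.h can be used to
temporarily map an arbitrary physical page at a fixed virtual slot.
FIX_EXAMPLE below is a hypothetical slot invented for the sketch, not one
defined by this series:

	/* hypothetical fixmap user, for illustration only */
	static void *map_phys_page(phys_addr_t phys)
	{
		set_fixmap(FIX_EXAMPLE, phys & PAGE_MASK);
		return (void *)fix_to_virt(FIX_EXAMPLE) + (phys & ~PAGE_MASK);
	}

	static void unmap_phys_page(void)
	{
		clear_fixmap(FIX_EXAMPLE);
	}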

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/include/asm/fixmap.h |  7 +--
 arch/arm64/kernel/setup.c       |  3 +-
 arch/arm64/mm/ioremap.c         | 93 ++--------------------------------------
 arch/arm64/mm/mmu.c             | 94 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 103 insertions(+), 94 deletions(-)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 5f7bfe6..db26a2f2 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -56,10 +56,11 @@ enum fixed_addresses {
 
 #define FIXMAP_PAGE_IO     __pgprot(PROT_DEVICE_nGnRE)
 
-extern void __early_set_fixmap(enum fixed_addresses idx,
-			       phys_addr_t phys, pgprot_t flags);
+void __init early_fixmap_init(void);
 
-#define __set_fixmap __early_set_fixmap
+#define __early_set_fixmap __set_fixmap
+
+extern void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
 
 #include <asm-generic/fixmap.h>
 
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index f6f0ccf..9244e20 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -378,7 +378,8 @@ void __init setup_arch(char **cmdline_p)
 
 	*cmdline_p = boot_command_line;
 
-	early_ioremap_init();
+	early_fixmap_init();
+	early_ioremap_setup();
 
 	parse_early_param();
 
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index fa324bd..cbb99c8 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -103,97 +103,10 @@ void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
 }
 EXPORT_SYMBOL(ioremap_cache);
 
-static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
-#if CONFIG_ARM64_PGTABLE_LEVELS > 2
-static pte_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
-#endif
-#if CONFIG_ARM64_PGTABLE_LEVELS > 3
-static pte_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
-#endif
-
-static inline pud_t * __init early_ioremap_pud(unsigned long addr)
-{
-	pgd_t *pgd;
-
-	pgd = pgd_offset_k(addr);
-	BUG_ON(pgd_none(*pgd) || pgd_bad(*pgd));
-
-	return pud_offset(pgd, addr);
-}
-
-static inline pmd_t * __init early_ioremap_pmd(unsigned long addr)
-{
-	pud_t *pud = early_ioremap_pud(addr);
-
-	BUG_ON(pud_none(*pud) || pud_bad(*pud));
-
-	return pmd_offset(pud, addr);
-}
-
-static inline pte_t * __init early_ioremap_pte(unsigned long addr)
-{
-	pmd_t *pmd = early_ioremap_pmd(addr);
-
-	BUG_ON(pmd_none(*pmd) || pmd_bad(*pmd));
-
-	return pte_offset_kernel(pmd, addr);
-}
-
+/*
+ * Must be called after early_fixmap_init
+ */
 void __init early_ioremap_init(void)
 {
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
-	unsigned long addr = fix_to_virt(FIX_BTMAP_BEGIN);
-
-	pgd = pgd_offset_k(addr);
-	pgd_populate(&init_mm, pgd, bm_pud);
-	pud = pud_offset(pgd, addr);
-	pud_populate(&init_mm, pud, bm_pmd);
-	pmd = pmd_offset(pud, addr);
-	pmd_populate_kernel(&init_mm, pmd, bm_pte);
-
-	/*
-	 * The boot-ioremap range spans multiple pmds, for which
-	 * we are not prepared:
-	 */
-	BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
-		     != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
-
-	if (pmd != early_ioremap_pmd(fix_to_virt(FIX_BTMAP_END))) {
-		WARN_ON(1);
-		pr_warn("pmd %p != %p\n",
-			pmd, early_ioremap_pmd(fix_to_virt(FIX_BTMAP_END)));
-		pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
-			fix_to_virt(FIX_BTMAP_BEGIN));
-		pr_warn("fix_to_virt(FIX_BTMAP_END):   %08lx\n",
-			fix_to_virt(FIX_BTMAP_END));
-
-		pr_warn("FIX_BTMAP_END:       %d\n", FIX_BTMAP_END);
-		pr_warn("FIX_BTMAP_BEGIN:     %d\n",
-			FIX_BTMAP_BEGIN);
-	}
-
 	early_ioremap_setup();
 }
-
-void __init __early_set_fixmap(enum fixed_addresses idx,
-			       phys_addr_t phys, pgprot_t flags)
-{
-	unsigned long addr = __fix_to_virt(idx);
-	pte_t *pte;
-
-	if (idx >= __end_of_fixed_addresses) {
-		BUG();
-		return;
-	}
-
-	pte = early_ioremap_pte(addr);
-
-	if (pgprot_val(flags))
-		set_pte(pte, pfn_pte(phys >> PAGE_SHIFT, flags));
-	else {
-		pte_clear(&init_mm, addr, pte);
-		flush_tlb_kernel_range(addr, addr+PAGE_SIZE);
-	}
-}
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c555672..bd549a3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -28,6 +28,7 @@
 #include <linux/io.h>
 
 #include <asm/cputype.h>
+#include <asm/fixmap.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/sizes.h>
@@ -459,3 +460,96 @@ void vmemmap_free(unsigned long start, unsigned long end)
 {
 }
 #endif	/* CONFIG_SPARSEMEM_VMEMMAP */
+
+static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
+#if CONFIG_ARM64_PGTABLE_LEVELS > 2
+static pte_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
+#endif
+#if CONFIG_ARM64_PGTABLE_LEVELS > 3
+static pte_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
+#endif
+
+static inline pud_t * fixmap_pud(unsigned long addr)
+{
+	pgd_t *pgd = pgd_offset_k(addr);
+
+	BUG_ON(pgd_none(*pgd) || pgd_bad(*pgd));
+
+	return pud_offset(pgd, addr);
+}
+
+static inline pmd_t * fixmap_pmd(unsigned long addr)
+{
+	pud_t *pud = fixmap_pud(addr);
+
+	BUG_ON(pud_none(*pud) || pud_bad(*pud));
+
+	return pmd_offset(pud, addr);
+}
+
+static inline pte_t * fixmap_pte(unsigned long addr)
+{
+	pmd_t *pmd = fixmap_pmd(addr);
+
+	BUG_ON(pmd_none(*pmd) || pmd_bad(*pmd));
+
+	return pte_offset_kernel(pmd, addr);
+}
+
+void __init early_fixmap_init(void)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	unsigned long addr = FIXADDR_START;
+
+	pgd = pgd_offset_k(addr);
+	pgd_populate(&init_mm, pgd, bm_pud);
+	pud = pud_offset(pgd, addr);
+	pud_populate(&init_mm, pud, bm_pmd);
+	pmd = pmd_offset(pud, addr);
+	pmd_populate_kernel(&init_mm, pmd, bm_pte);
+
+	/*
+	 * The boot-ioremap range spans multiple pmds, for which
+	 * we are not preparted:
+	 */
+	BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
+		     != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
+
+	if ((pmd != fixmap_pmd(fix_to_virt(FIX_BTMAP_BEGIN)))
+	     || pmd != fixmap_pmd(fix_to_virt(FIX_BTMAP_END))) {
+		WARN_ON(1);
+		pr_warn("pmd %p != %p, %p\n",
+			pmd, fixmap_pmd(fix_to_virt(FIX_BTMAP_BEGIN)),
+			fixmap_pmd(fix_to_virt(FIX_BTMAP_END)));
+		pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
+			fix_to_virt(FIX_BTMAP_BEGIN));
+		pr_warn("fix_to_virt(FIX_BTMAP_END):   %08lx\n",
+			fix_to_virt(FIX_BTMAP_END));
+
+		pr_warn("FIX_BTMAP_END:       %d\n", FIX_BTMAP_END);
+		pr_warn("FIX_BTMAP_BEGIN:     %d\n", FIX_BTMAP_BEGIN);
+	}
+}
+
+void __set_fixmap(enum fixed_addresses idx,
+			       phys_addr_t phys, pgprot_t flags)
+{
+	unsigned long addr = __fix_to_virt(idx);
+	pte_t *pte;
+
+	if (idx >= __end_of_fixed_addresses) {
+		BUG();
+		return;
+	}
+
+	pte = fixmap_pte(addr);
+
+	if (pgprot_val(flags)) {
+		set_pte(pte, pfn_pte(phys >> PAGE_SHIFT, flags));
+	} else {
+		pte_clear(&init_mm, addr, pte);
+		flush_tlb_kernel_range(addr, addr+PAGE_SIZE);
+	}
+}
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv3 6/7] arm64: use fixmap for text patching when text is RO
  2014-08-21  1:20 [PATCHv3 0/7] Better page protections for arm64 Laura Abbott
                   ` (4 preceding siblings ...)
  2014-08-21  1:20 ` [PATCHv3 5/7] arm64: Factor out fixmap initialization from ioremap Laura Abbott
@ 2014-08-21  1:20 ` Laura Abbott
  2014-08-23  5:51   ` Kees Cook
  2014-08-21  1:20 ` [PATCHv3 7/7] arm64: add better page protections to arm64 Laura Abbott
  6 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-21  1:20 UTC
  To: linux-arm-kernel

When kernel text is marked as read only, it cannot be modified directly.
Instead, use a fixmap to modify the text, in a similar manner to
x86 and arm.
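
As a simplified sketch of the write path this patch introduces (condensed
from the diff below, with error handling omitted and the read-only case
assumed so that the fixmap alias is always used):

	static int poke_insn(void *addr, u32 insn)
	{
		unsigned long flags;
		void *waddr;
		int ret;

		insn = cpu_to_le32(insn);
		/* alias the target page at FIX_TEXT_POKE0 */
		waddr = patch_map(addr, FIX_TEXT_POKE0, &flags);
		ret = probe_kernel_write(waddr, &insn, AARCH64_INSN_SIZE);
		__flush_dcache_area(waddr, AARCH64_INSN_SIZE);
		patch_unmap(FIX_TEXT_POKE0, &flags);
		/* the icache is maintained against the real text address */
		flush_icache_range((uintptr_t)addr,
				   (uintptr_t)addr + AARCH64_INSN_SIZE);
		return ret;
	}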

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
I think there were some questions about spinlocks for the arm version;
I'm not sure whether similar concerns apply here.
---
 arch/arm64/include/asm/fixmap.h |  1 +
 arch/arm64/include/asm/insn.h   |  2 ++
 arch/arm64/kernel/insn.c        | 74 +++++++++++++++++++++++++++++++++++++++--
 arch/arm64/kernel/jump_label.c  |  2 +-
 4 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index db26a2f2..2cd4b0d 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -48,6 +48,7 @@ enum fixed_addresses {
 
 	FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
 	FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
+	FIX_TEXT_POKE0,
 	__end_of_fixed_addresses
 };
 
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index dc1f73b..07ac29b 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -92,6 +92,7 @@ bool aarch64_insn_is_nop(u32 insn);
 
 int aarch64_insn_read(void *addr, u32 *insnp);
 int aarch64_insn_write(void *addr, u32 insn);
+int aarch64_insn_write_early(void *addr, u32 insn);
 enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
 u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
 				  u32 insn, u64 imm);
@@ -103,6 +104,7 @@ u32 aarch64_insn_gen_nop(void);
 bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
 
 int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
+int __aarch64_insn_patch_text_nosync(void *addr, u32 insn, bool early);
 int aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt);
 int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
 #endif /* __ASSEMBLY__ */
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 92f3683..b25f8db 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -17,10 +17,13 @@
 #include <linux/bitops.h>
 #include <linux/compiler.h>
 #include <linux/kernel.h>
+#include <linux/mm.h>
 #include <linux/smp.h>
+#include <linux/spinlock.h>
 #include <linux/stop_machine.h>
 #include <linux/uaccess.h>
 #include <asm/cacheflush.h>
+#include <asm/fixmap.h>
 #include <asm/insn.h>
 
 static int aarch64_insn_encoding_class[] = {
@@ -65,6 +68,36 @@ bool __kprobes aarch64_insn_is_nop(u32 insn)
 	}
 }
 
+static DEFINE_SPINLOCK(patch_lock);
+
+static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
+{
+	unsigned long uintaddr = (uintptr_t) addr;
+	bool module = !core_kernel_text(uintaddr);
+	struct page *page;
+
+	if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
+		page = vmalloc_to_page(addr);
+	else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
+		page = virt_to_page(addr);
+	else
+		return addr;
+
+	if (flags)
+		spin_lock_irqsave(&patch_lock, *flags);
+
+	set_fixmap(fixmap, page_to_phys(page));
+
+	return (void *) (__fix_to_virt(fixmap) + (uintaddr & ~PAGE_MASK));
+}
+
+static void __kprobes patch_unmap(int fixmap, unsigned long *flags)
+{
+	clear_fixmap(fixmap);
+
+	if (flags)
+		spin_unlock_irqrestore(&patch_lock, *flags);
+}
 /*
  * In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always
  * little-endian.
@@ -81,10 +114,36 @@ int __kprobes aarch64_insn_read(void *addr, u32 *insnp)
 	return ret;
 }
 
+static int __kprobes __aarch64_insn_write(void *addr, u32 insn, bool patch)
+{
+	void *waddr = addr;
+	unsigned long flags;
+	int ret;
+
+	if (patch)
+		waddr = patch_map(addr, FIX_TEXT_POKE0, &flags);
+
+	ret = probe_kernel_write(waddr, &insn, AARCH64_INSN_SIZE);
+
+	if (waddr != addr) {
+		__flush_dcache_area(waddr, AARCH64_INSN_SIZE);
+		patch_unmap(FIX_TEXT_POKE0, &flags);
+	}
+
+	return ret;
+}
+
 int __kprobes aarch64_insn_write(void *addr, u32 insn)
 {
 	insn = cpu_to_le32(insn);
-	return probe_kernel_write(addr, &insn, AARCH64_INSN_SIZE);
+	return __aarch64_insn_write(addr, insn, true);
+}
+
+int __kprobes aarch64_insn_write_early(void *addr, u32 insn)
+{
+	insn = cpu_to_le32(insn);
+	return __aarch64_insn_write(addr, insn, false);
+
 }
 
 static bool __kprobes __aarch64_insn_hotpatch_safe(u32 insn)
@@ -117,7 +176,7 @@ bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
 	       __aarch64_insn_hotpatch_safe(new_insn);
 }
 
-int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
+int __kprobes __aarch64_insn_patch_text_nosync(void *addr, u32 insn, bool early)
 {
 	u32 *tp = addr;
 	int ret;
@@ -126,7 +185,11 @@ int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
 	if ((uintptr_t)tp & 0x3)
 		return -EINVAL;
 
-	ret = aarch64_insn_write(tp, insn);
+	if (early)
+		ret = aarch64_insn_write_early(tp, insn);
+	else
+		ret = aarch64_insn_write(tp, insn);
+
 	if (ret == 0)
 		flush_icache_range((uintptr_t)tp,
 				   (uintptr_t)tp + AARCH64_INSN_SIZE);
@@ -134,6 +197,11 @@ int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
 	return ret;
 }
 
+int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
+{
+	return __aarch64_insn_patch_text_nosync(addr, insn, false);
+}
+
 struct aarch64_insn_patch {
 	void		**text_addrs;
 	u32		*new_insns;
diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c
index 263a166..9ac30bb 100644
--- a/arch/arm64/kernel/jump_label.c
+++ b/arch/arm64/kernel/jump_label.c
@@ -38,7 +38,7 @@ static void __arch_jump_label_transform(struct jump_entry *entry,
 	}
 
 	if (is_static)
-		aarch64_insn_patch_text_nosync(addr, insn);
+		__aarch64_insn_patch_text_nosync(addr, insn, true);
 	else
 		aarch64_insn_patch_text(&addr, &insn, 1);
 }
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv3 7/7] arm64: add better page protections to arm64
  2014-08-21  1:20 [PATCHv3 0/7] Better page protections for arm64 Laura Abbott
                   ` (5 preceding siblings ...)
  2014-08-21  1:20 ` [PATCHv3 6/7] arm64: use fixmap for text patching when text is RO Laura Abbott
@ 2014-08-21  1:20 ` Laura Abbott
  2014-08-23  5:59   ` Kees Cook
  6 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-21  1:20 UTC
  To: linux-arm-kernel

Add page protections for arm64 similar to those in arm.
This is for security reasons, to prevent certain classes
of exploits. The current method:

- Map all memory as either RWX or RW. We round to the nearest
  section to avoid creating page tables before everything is mapped.
- Once everything is mapped, if either end of the RWX section should
  not be X, we split the PMD and remap as necessary.
- When initmem is to be freed, we change the permissions back to
  RW (using stop_machine if necessary to flush the TLB).
- If CONFIG_DEBUG_RODATA is set, the kernel text and rodata are
  marked read-only.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/Kconfig                  |   8 +
 arch/arm64/Kconfig.debug            |  23 +++
 arch/arm64/include/asm/cacheflush.h |   3 +
 arch/arm64/kernel/vmlinux.lds.S     |  17 ++
 arch/arm64/mm/init.c                |   1 +
 arch/arm64/mm/mm.h                  |   2 +
 arch/arm64/mm/mmu.c                 | 303 +++++++++++++++++++++++++++++++-----
 7 files changed, 321 insertions(+), 36 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fd4e81a..b718974 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -328,6 +328,14 @@ config FORCE_MAX_ZONEORDER
 	default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
 	default "11"
 
+config NX_KERNEL_AREAS
+	bool "Allow no execute kernel areas"
+	help
+	  Mark areas of the kernel that should never be executed (data, heap
+	  etc.) as no execute. This helps with a certain set of security issues.
+	  You should say Y here unless you like malicious code doing unexpected
+	  things on your hardware.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index 4ee8e90..032ccc1 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -43,4 +43,27 @@ config ARM64_RANDOMIZE_TEXT_OFFSET
 	  of TEXT_OFFSET and platforms must not require a specific
 	  value.
 
+config DEBUG_RODATA
+	bool "Make kernel text and rodata read-only"
+	help
+	  If this is set, kernel text and rodata will be made read-only. This
+	  is to help catch accidental or malicious attempts to change the
+	  kernel's executable code. Additionally splits rodata from kernel
+	  text so it can be made explicitly non-executable.
+
+          If in doubt, say Y
+
+config DEBUG_ALIGN_RODATA
+	depends on DEBUG_RODATA
+	bool "Align linker sections up to SECTION_SIZE"
+	help
+	  If this option is enabled, sections that may potentially be marked as
+	  read only or non-executable will be aligned up to the section size of
+	  the kernel. This prevents sections from being split into pages and
+	  avoids a potential TLB penalty. The downside is an increase in
+	  alignment and potentially wasted space. Turn on this option if
+	  performance is more important than memory pressure.
+
+	  If in doubt, say N
+
 endmenu
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index f2defe1..6e2f910 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -148,4 +148,7 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
 {
 }
 
+#ifdef CONFIG_DEBUG_RODATA
+void mark_rodata_ro(void);
+#endif
 #endif
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 2b674c5..6dea60d 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -8,6 +8,7 @@
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/page.h>
+#include <asm/pgtable.h>
 
 #include "image.h"
 
@@ -54,6 +55,9 @@ SECTIONS
 		_text = .;
 		HEAD_TEXT
 	}
+#ifdef DEBUG_ALIGN_RODATA
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
 			*(.latehead.text)
@@ -71,19 +75,32 @@ SECTIONS
 		*(.got)			/* Global offset table		*/
 	}
 
+#ifdef DEBUG_ALIGN_RODATA
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
 	RO_DATA(PAGE_SIZE)
 	EXCEPTION_TABLE(8)
 	NOTES
 	_etext = .;			/* End of text and rodata section */
 
+#ifdef DEBUG_ALIGN_RODATA
+	. = ALIGN(1<<SECTION_SHIFT);
+#else
 	. = ALIGN(PAGE_SIZE);
+#endif
 	__init_begin = .;
 
 	INIT_TEXT_SECTION(8)
 	.exit.text : {
 		ARM_EXIT_KEEP(EXIT_TEXT)
 	}
+
+#ifdef DEBUG_ALIGN_RODATA
+	. = ALIGN(1<<SECTION_SHIFT);
+	__init_data_begin = .;
+#else
 	. = ALIGN(16);
+#endif
 	.init.data : {
 		INIT_DATA
 		INIT_SETUP(16)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 5b4526e..d796b76 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -323,6 +323,7 @@ void __init mem_init(void)
 
 void free_initmem(void)
 {
+	fixup_init();
 	free_initmem_default(0);
 }
 
diff --git a/arch/arm64/mm/mm.h b/arch/arm64/mm/mm.h
index d519f4f..82347d8 100644
--- a/arch/arm64/mm/mm.h
+++ b/arch/arm64/mm/mm.h
@@ -1,2 +1,4 @@
 extern void __init bootmem_init(void);
 extern void __init arm64_swiotlb_init(void);
+
+void fixup_init(void);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index bd549a3..7118152 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -26,6 +26,7 @@
 #include <linux/memblock.h>
 #include <linux/fs.h>
 #include <linux/io.h>
+#include <linux/stop_machine.h>
 
 #include <asm/cputype.h>
 #include <asm/fixmap.h>
@@ -137,17 +138,55 @@ static void __init *early_alloc(unsigned long sz)
 	return ptr;
 }
 
-static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
+/*
+ * remap a PMD into pages
+ */
+static noinline void __ref split_pmd(pmd_t *pmd, bool early)
+{
+	pte_t *pte, *start_pte;
+	unsigned long pfn;
+	int i = 0;
+
+	if (early)
+		start_pte = pte = early_alloc(PTRS_PER_PTE*sizeof(pte_t));
+	else
+		start_pte = pte = (pte_t *)__get_free_page(PGALLOC_GFP);
+
+	BUG_ON(!pte);
+
+	pfn = pmd_pfn(*pmd);
+
+	do {
+		/*
+		 * Need to have the least restrictive permissions available
+		 * permissions will be fixed up later
+		 */
+		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
+		pfn++;
+	} while (pte++, i++, i < PTRS_PER_PTE);
+
+
+	__pmd_populate(pmd, __pa(start_pte), PMD_TYPE_TABLE);
+	flush_tlb_all();
+}
+
+static void __ref alloc_init_pte(pmd_t *pmd, unsigned long addr,
 				  unsigned long end, unsigned long pfn,
-				  pgprot_t prot)
+				  pgprot_t prot, bool early)
 {
 	pte_t *pte;
 
 	if (pmd_none(*pmd)) {
-		pte = early_alloc(PTRS_PER_PTE * sizeof(pte_t));
+		if (early)
+			pte = early_alloc(PTRS_PER_PTE * sizeof(pte_t));
+		else
+			pte = (pte_t *)__get_free_page(PGALLOC_GFP);
+		BUG_ON(!pte);
 		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
 	}
-	BUG_ON(pmd_bad(*pmd));
+
+	if (pmd_bad(*pmd))
+		split_pmd(pmd, early);
 
 	pte = pte_offset_kernel(pmd, addr);
 	do {
@@ -156,29 +195,52 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 }
 
-static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
+void __ref split_pud(pud_t *old_pud, pmd_t *pmd)
+{
+	unsigned long pfn = pud_pfn(*old_pud);
+        const pmdval_t mask = PMD_SECT_USER | PMD_SECT_PXN | PMD_SECT_UXN |
+                              PMD_SECT_RDONLY | PMD_SECT_PROT_NONE |
+                              PMD_SECT_VALID;
+	pgprot_t prot = __pgprot(pud_val(*old_pud) & mask);
+	int i = 0;
+
+	do {
+		set_pmd(pmd, pfn_pmd(pfn, prot));
+		pfn++;
+	} while (pmd++, i++, i < PTRS_PER_PMD);
+}
+
+static void __ref alloc_init_pmd(pud_t *pud, unsigned long addr,
 				  unsigned long end, phys_addr_t phys,
-				  int map_io)
+				  pgprot_t sect_prot, pgprot_t pte_prot,
+				  bool early, int map_io)
 {
 	pmd_t *pmd;
 	unsigned long next;
-	pmdval_t prot_sect;
-	pgprot_t prot_pte;
 
 	if (map_io) {
-		prot_sect = PROT_SECT_DEVICE_nGnRE;
-		prot_pte = __pgprot(PROT_DEVICE_nGnRE);
-	} else {
-		prot_sect = PROT_SECT_NORMAL_EXEC;
-		prot_pte = PAGE_KERNEL_EXEC;
+		sect_prot = PROT_SECT_DEVICE_nGnRE;
+		pte_prot = __pgprot(PROT_DEVICE_nGnRE);
 	}
 
 	/*
 	 * Check for initial section mappings in the pgd/pud and remove them.
 	 */
 	if (pud_none(*pud) || pud_bad(*pud)) {
-		pmd = early_alloc(PTRS_PER_PMD * sizeof(pmd_t));
+		if (early)
+			pmd = early_alloc(PTRS_PER_PMD * sizeof(pmd_t));
+		else
+			pmd = pmd_alloc_one(&init_mm, addr);
+		BUG_ON(!pmd);
+		if (pud_sect(*pud)) {
+			/*
+			 * need to have the 1G of mappings continue to be
+			 * present
+			 */
+			split_pud(pud, pmd);
+		}
 		pud_populate(&init_mm, pud, pmd);
+		flush_tlb_all();
 	}
 
 	pmd = pmd_offset(pud, addr);
@@ -186,8 +248,8 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 		next = pmd_addr_end(addr, end);
 		/* try section mapping first */
 		if (((addr | next | phys) & ~SECTION_MASK) == 0) {
-			pmd_t old_pmd =*pmd;
-			set_pmd(pmd, __pmd(phys | prot_sect));
+			pmd_t old_pmd = *pmd;
+			set_pmd(pmd, __pmd(phys | sect_prot));
 			/*
 			 * Check for previous table entries created during
 			 * boot (__create_page_tables) and flush them.
@@ -196,15 +258,16 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 				flush_tlb_all();
 		} else {
 			alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
-				       prot_pte);
+					pte_prot, early);
 		}
 		phys += next - addr;
 	} while (pmd++, addr = next, addr != end);
 }
 
-static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
+static void __ref alloc_init_pud(pgd_t *pgd, unsigned long addr,
 				  unsigned long end, unsigned long phys,
-				  int map_io)
+				  pgprot_t sect_prot, pgprot_t pte_prot,
+				  bool early, int map_io)
 {
 	pud_t *pud;
 	unsigned long next;
@@ -222,10 +285,10 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
 		/*
 		 * For 4K granule only, attempt to put down a 1GB block
 		 */
-		if (!map_io && (PAGE_SHIFT == 12) &&
+		if (!map_io && early && (PAGE_SHIFT == 12) &&
 		    ((addr | next | phys) & ~PUD_MASK) == 0) {
 			pud_t old_pud = *pud;
-			set_pud(pud, __pud(phys | PROT_SECT_NORMAL_EXEC));
+			set_pud(pud, __pud(phys | sect_prot));
 
 			/*
 			 * If we have an old value for a pud, it will
@@ -240,7 +303,8 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
 				flush_tlb_all();
 			}
 		} else {
-			alloc_init_pmd(pud, addr, next, phys, map_io);
+			alloc_init_pmd(pud, addr, next, phys, sect_prot,
+					pte_prot, early, map_io);
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
@@ -250,9 +314,11 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
  * Create the page directory entries and any necessary page tables for the
  * mapping specified by 'md'.
  */
-static void __init __create_mapping(pgd_t *pgd, phys_addr_t phys,
-				    unsigned long virt, phys_addr_t size,
-				    int map_io)
+static void __ref __create_mapping(pgd_t *pgd, phys_addr_t phys,
+				  unsigned long virt,
+				  phys_addr_t size,
+				  pgprot_t sect_prot, pgprot_t pte_prot,
+				  int map_io, bool early)
 {
 	unsigned long addr, length, end, next;
 
@@ -262,31 +328,140 @@ static void __init __create_mapping(pgd_t *pgd, phys_addr_t phys,
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(pgd, addr, next, phys, map_io);
+		alloc_init_pud(pgd, addr, next, phys, sect_prot, pte_prot,
+				early, map_io);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }
 
-static void __init create_mapping(phys_addr_t phys, unsigned long virt,
-				  phys_addr_t size)
+void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
+{
+	if ((addr >> PGDIR_SHIFT) >= ARRAY_SIZE(idmap_pg_dir)) {
+		pr_warn("BUG: not creating id mapping for %pa\n", &addr);
+		return;
+	}
+	__create_mapping(&idmap_pg_dir[pgd_index(addr)],
+			 addr, addr, size, PROT_SECT_NORMAL_EXEC,
+			 PAGE_KERNEL_EXEC, map_io, false);
+}
+
+static inline pmd_t *pmd_off_k(unsigned long virt)
+{
+	return pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
+}
+
+#ifdef CONFIG_EARLY_PRINTK
+/*
+ * Create an early I/O mapping using the pgd/pmd entries already populated
+ * in head.S as this function is called too early to allocated any memory. The
+ * mapping size is 2MB with 4KB pages or 64KB or 64KB pages.
+ */
+void __iomem * __init early_io_map(phys_addr_t phys, unsigned long virt)
+{
+	unsigned long size, mask;
+	bool page64k = IS_ENABLED(CONFIG_ARM64_64K_PAGES);
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	/*
+	 * No early pte entries with !ARM64_64K_PAGES configuration, so using
+	 * sections (pmd).
+	 */
+	size = page64k ? PAGE_SIZE : SECTION_SIZE;
+	mask = ~(size - 1);
+
+	pgd = pgd_offset_k(virt);
+	pud = pud_offset(pgd, virt);
+	if (pud_none(*pud))
+		return NULL;
+	pmd = pmd_offset(pud, virt);
+
+	if (page64k) {
+		if (pmd_none(*pmd))
+			return NULL;
+		pte = pte_offset_kernel(pmd, virt);
+		set_pte(pte, __pte((phys & mask) | PROT_DEVICE_nGnRE));
+	} else {
+		set_pmd(pmd, __pmd((phys & mask) | PROT_SECT_DEVICE_nGnRE));
+	}
+
+	return (void __iomem *)((virt & mask) + (phys & ~mask));
+}
+#endif
+
+static void __ref create_mapping(phys_addr_t phys, unsigned long virt,
+				  phys_addr_t size,
+				  pgprot_t sect_prot, pgprot_t pte_prot)
 {
 	if (virt < VMALLOC_START) {
 		pr_warn("BUG: not creating mapping for %pa@0x%016lx - outside kernel range\n",
 			&phys, virt);
 		return;
 	}
-	__create_mapping(pgd_offset_k(virt & PAGE_MASK), phys, virt, size, 0);
+
+	return __create_mapping(pgd_offset_k(virt & PAGE_MASK), phys, virt,
+				size, sect_prot, pte_prot, 0, true);
 }
 
-void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
+static void __ref create_mapping_late(phys_addr_t phys, unsigned long virt,
+				  phys_addr_t size,
+				  pgprot_t sect_prot, pgprot_t pte_prot)
 {
-	if ((addr >> PGDIR_SHIFT) >= ARRAY_SIZE(idmap_pg_dir)) {
-		pr_warn("BUG: not creating id mapping for %pa\n", &addr);
+	if (virt < VMALLOC_START) {
+		pr_warn("BUG: not creating mapping for %pa@0x%016lx - outside kernel range\n",
+			&phys, virt);
 		return;
 	}
-	__create_mapping(&idmap_pg_dir[pgd_index(addr)],
-			 addr, addr, size, map_io);
+
+	return __create_mapping(pgd_offset_k(virt & PAGE_MASK), phys, virt,
+				size, sect_prot, pte_prot, 0, false);
+}
+
+#ifdef CONFIG_NX_KERNEL_AREAS
+static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
+{
+	/*
+	 * Set up the executable regions using the existing section mappings
+	 * for now. This will get more fine grained later once all memory
+	 * is mapped
+	 */
+	unsigned long kernel_x_start = round_down(__pa(_stext), SECTION_SIZE);
+	unsigned long kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE);
+
+	if (end < kernel_x_start) {
+		create_mapping(start, __phys_to_virt(start),
+			end - start, PROT_SECT_NORMAL, PAGE_KERNEL);
+	} else if (start >= kernel_x_end) {
+		create_mapping(start, __phys_to_virt(start),
+			end - start, PROT_SECT_NORMAL, PAGE_KERNEL);
+	} else {
+		if (start < kernel_x_start)
+			create_mapping(start, __phys_to_virt(start),
+				kernel_x_start - start,
+				PROT_SECT_NORMAL,
+				PAGE_KERNEL);
+		create_mapping(kernel_x_start,
+				__phys_to_virt(kernel_x_start),
+				kernel_x_end - kernel_x_start,
+				PROT_SECT_NORMAL_EXEC, PAGE_KERNEL_EXEC);
+		if (kernel_x_end < end)
+			create_mapping(kernel_x_end,
+				__phys_to_virt(kernel_x_end),
+				end - kernel_x_end,
+				PROT_SECT_NORMAL,
+				PAGE_KERNEL);
+	}
+
+}
+#else
+static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
+{
+	create_mapping(start, __phys_to_virt(start), end - start,
+			PROT_SECT_NORMAL_EXEC, PAGE_KERNEL_EXEC);
 }
+#endif
 
 static void __init map_mem(void)
 {
@@ -328,14 +503,69 @@ static void __init map_mem(void)
 			memblock_set_current_limit(limit);
 		}
 #endif
-
-		create_mapping(start, __phys_to_virt(start), end - start);
+		__map_memblock(start, end);
 	}
 
 	/* Limit no longer required. */
 	memblock_set_current_limit(MEMBLOCK_ALLOC_ANYWHERE);
 }
 
+void __init fixup_executable(void)
+{
+#ifdef CONFIG_NX_KERNEL_AREAS
+	/* now that we are actually fully mapped, make the start/end more fine grained */
+	if (!IS_ALIGNED((unsigned long)_stext, SECTION_SIZE)) {
+		unsigned long aligned_start = round_down(__pa(_stext),
+							SECTION_SIZE);
+
+		create_mapping(aligned_start, __phys_to_virt(aligned_start),
+				__pa(_stext) - aligned_start,
+				PROT_SECT_NORMAL,
+				PAGE_KERNEL);
+	}
+
+	if (!IS_ALIGNED((unsigned long)__init_end, SECTION_SIZE)) {
+		unsigned long aligned_end = round_up(__pa(__init_end),
+							SECTION_SIZE);
+		create_mapping(__pa(__init_end), (unsigned long)__init_end,
+				aligned_end - __pa(__init_end),
+				PROT_SECT_NORMAL,
+				PAGE_KERNEL);
+	}
+#endif
+}
+
+#ifdef CONFIG_DEBUG_RODATA
+void mark_rodata_ro(void)
+{
+	create_mapping_late(__pa(_stext), (unsigned long)_stext,
+				(unsigned long)_etext - (unsigned long)_stext,
+				PROT_SECT_NORMAL_EXEC | PMD_SECT_RDONLY,
+				PAGE_KERNEL_EXEC | PTE_RDONLY);
+
+}
+#endif
+
+static int __flush_mappings(void *unused)
+{
+	flush_tlb_kernel_range((unsigned long)__init_begin,
+				(unsigned long)__init_end);
+	return 0;
+}
+
+void __ref fixup_init(void)
+{
+	phys_addr_t start = __pa(__init_begin);
+	phys_addr_t end = __pa(__init_end);
+
+	create_mapping_late(start, (unsigned long)__init_begin,
+			end - start,
+			PROT_SECT_NORMAL,
+			PAGE_KERNEL);
+	if (!IS_ALIGNED(start, SECTION_SIZE) || !IS_ALIGNED(end, SECTION_SIZE))
+		stop_machine(__flush_mappings, NULL, NULL);
+}
+
 /*
  * paging_init() sets up the page tables, initialises the zone memory
  * maps and sets up the zero page.
@@ -345,6 +575,7 @@ void __init paging_init(void)
 	void *zero_page;
 
 	map_mem();
+	fixup_executable();
 
 	/*
 	 * Finally flush the caches and tlb to ensure that we're in a
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv3 2/7] arm64: Switch to ldr for loading the stub vectors
  2014-08-21  1:20 ` [PATCHv3 2/7] arm64: Switch to ldr for loading the stub vectors Laura Abbott
@ 2014-08-21  9:30   ` Mark Rutland
  0 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2014-08-21  9:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Laura,

On Thu, Aug 21, 2014 at 02:20:34AM +0100, Laura Abbott wrote:
> The hyp stub vectors are currently loaded using adr. This
> instruction has a +/- 1MB range for the loading address. If
> the alignment for sections is changed. the address may be more

Nit: that '.' looks out of place.

> than 1MB away, resulting in relocation errors. Switch to using
> ldr for getting the address to ensure we aren't affected by the
> location of the __hyp_stub_vectors.

The subject and commit message don't seem to have been updated to
reflect that the patch uses adrp + add (with a range of ~4GB) rather
than ldr.

Otherwise, the patch itself seems fine to me.

Mark.

> 
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---
>  arch/arm64/kernel/head.S | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 144f105..61bc210 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -331,7 +331,8 @@ CPU_LE(	movk	x0, #0x30d0, lsl #16	)	// Clear EE and E0E on LE systems
>  	msr	vttbr_el2, xzr
>  
>  	/* Hypervisor stub */
> -	adr	x0, __hyp_stub_vectors
> +	adrp	x0, __hyp_stub_vectors
> +	add	x0, x0, #:lo12:__hyp_stub_vectors
>  	msr	vbar_el2, x0
>  
>  	/* spsr */
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 4/7] arm64: Move some head.text functions to executable section
  2014-08-21  1:20 ` [PATCHv3 4/7] arm64: Move some head.text functions to executable section Laura Abbott
@ 2014-08-21 10:34   ` Mark Rutland
  2014-08-21 21:42     ` Laura Abbott
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Rutland @ 2014-08-21 10:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Laura,

On Thu, Aug 21, 2014 at 02:20:36AM +0100, Laura Abbott wrote:
> The code in the head.text section of the kernel exists in the
> same section as the swapper_pg_dir which means it needs the
> same page table permissions. The swapper_pg_dir needs to be
> writeable but shouldn't be executable.

I think we can drop the above. As far as I can tell as of commit
bd00cd5f8c8c (arm64: place initial page tables above the kernel) it's no
longer relevant.

> The head.text section
> is intended to be run at early bootup before any of the regular
> kernel mappings have been setup so there is no issue at bootup.
> The suspend/resume/hotplug code path requires some of these
> head.S functions to run however which means they need to be
> executable. We can't easily move all of the head.text to
> an executable section, so split it into two parts: that which
> is used only at early head.S bootup and that which is used
> after bootup. There is a small bit of code duplication because
> of some relocation issues related to accessing code more than
> 1MB away.

From a cursory glance it looks like the only things we need write access
to in .head.text are __cpu_boot_mode and __switch_data. Can't we instead
place those in .data and make .head.text executable?

We currently find them with adr, which should be easy to replace with
adrp + add to get around relocation issues.
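
For illustration only (a sketch, not taken from the patch), this is the
same pattern patch 2/7 already uses for __hyp_stub_vectors, applied to
the existing __switch_data load as an example:

	adr	x3, __switch_data + 8		// limited to +/- 1MB

could become:

	adrp	x3, __switch_data		// +/- 4GB, page-granular
	add	x3, x3, #:lo12:__switch_data	// low 12 bits of the address
	add	x3, x3, #8			// keep the original offset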

Thanks,
Mark.

> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---
>  arch/arm64/kernel/head.S        | 424 +++++++++++++++++++++-------------------
>  arch/arm64/kernel/vmlinux.lds.S |   1 +
>  2 files changed, 228 insertions(+), 197 deletions(-)
> 
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 61bc210..dbdb378 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -238,7 +238,7 @@ ENTRY(stext)
>         mov     x0, x22
>         bl      lookup_processor_type
>         mov     x23, x0                         // x23=current cpu_table
> -       cbz     x23, __error_p                  // invalid processor (x23=0)?
> +       cbz     x23, __h_error_p                // invalid processor (x23=0)?
>         bl      __vet_fdt
>         bl      __create_page_tables            // x25=TTBR0, x26=TTBR1
>         /*
> @@ -250,12 +250,236 @@ ENTRY(stext)
>          */
>         ldr     x27, __switch_data              // address to jump to after
>                                                 // MMU has been enabled
> -       adr     lr, __enable_mmu                // return (PIC) address
> +       adr     lr, __h_enable_mmu              // return (PIC) address
>         ldr     x12, [x23, #CPU_INFO_SETUP]
>         add     x12, x12, x28                   // __virt_to_phys
>         br      x12                             // initialise processor
>  ENDPROC(stext)
> 
> +__h_error_p:
> +ENDPROC(__h_error_p)
> +
> +__h_error:
> +1:
> +       nop
> +       b       1b
> +ENDPROC(__h_error)
> +
> +__h_enable_mmu:
> +       ldr     x5, =vectors
> +       msr     vbar_el1, x5
> +       msr     ttbr0_el1, x25                  // load TTBR0
> +       msr     ttbr1_el1, x26                  // load TTBR1
> +       isb
> +       b       __h_turn_mmu_on
> +ENDPROC(__h_enable_mmu)
> +
> +       .align  4
> +__h_turn_mmu_on:
> +       msr     sctlr_el1, x0
> +       isb
> +       br      x27
> +ENDPROC(__h_turn_mmu_on)
> +
> +/*
> + * Determine validity of the x21 FDT pointer.
> + * The dtb must be 8-byte aligned and live in the first 512M of memory.
> + */
> +__vet_fdt:
> +       tst     x21, #0x7
> +       b.ne    1f
> +       cmp     x21, x24
> +       b.lt    1f
> +       mov     x0, #(1 << 29)
> +       add     x0, x0, x24
> +       cmp     x21, x0
> +       b.ge    1f
> +       ret
> +1:
> +       mov     x21, #0
> +       ret
> +ENDPROC(__vet_fdt)
> +/*
> + * Macro to create a table entry to the next page.
> + *
> + *     tbl:    page table address
> + *     virt:   virtual address
> + *     shift:  #imm page table shift
> + *     ptrs:   #imm pointers per table page
> + *
> + * Preserves:  virt
> + * Corrupts:   tmp1, tmp2
> + * Returns:    tbl -> next level table page address
> + */
> +       .macro  create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
> +       lsr     \tmp1, \virt, #\shift
> +       and     \tmp1, \tmp1, #\ptrs - 1        // table index
> +       add     \tmp2, \tbl, #PAGE_SIZE
> +       orr     \tmp2, \tmp2, #PMD_TYPE_TABLE   // address of next table and entry type
> +       str     \tmp2, [\tbl, \tmp1, lsl #3]
> +       add     \tbl, \tbl, #PAGE_SIZE          // next level table page
> +       .endm
> +
> +/*
> + * Macro to populate the PGD (and possibily PUD) for the corresponding
> + * block entry in the next level (tbl) for the given virtual address.
> + *
> + * Preserves:  tbl, next, virt
> + * Corrupts:   tmp1, tmp2
> + */
> +       .macro  create_pgd_entry, tbl, virt, tmp1, tmp2
> +       create_table_entry \tbl, \virt, PGDIR_SHIFT, PTRS_PER_PGD, \tmp1, \tmp2
> +#if SWAPPER_PGTABLE_LEVELS == 3
> +       create_table_entry \tbl, \virt, TABLE_SHIFT, PTRS_PER_PTE, \tmp1, \tmp2
> +#endif
> +       .endm
> +
> +/*
> + * Macro to populate block entries in the page table for the start..end
> + * virtual range (inclusive).
> + *
> + * Preserves:  tbl, flags
> + * Corrupts:   phys, start, end, pstate
> + */
> +       .macro  create_block_map, tbl, flags, phys, start, end
> +       lsr     \phys, \phys, #BLOCK_SHIFT
> +       lsr     \start, \start, #BLOCK_SHIFT
> +       and     \start, \start, #PTRS_PER_PTE - 1       // table index
> +       orr     \phys, \flags, \phys, lsl #BLOCK_SHIFT  // table entry
> +       lsr     \end, \end, #BLOCK_SHIFT
> +       and     \end, \end, #PTRS_PER_PTE - 1           // table end index
> +9999:  str     \phys, [\tbl, \start, lsl #3]           // store the entry
> +       add     \start, \start, #1                      // next entry
> +       add     \phys, \phys, #BLOCK_SIZE               // next block
> +       cmp     \start, \end
> +       b.ls    9999b
> +       .endm
> +
> +/*
> + * Setup the initial page tables. We only setup the barest amount which is
> + * required to get the kernel running. The following sections are required:
> + *   - identity mapping to enable the MMU (low address, TTBR0)
> + *   - first few MB of the kernel linear mapping to jump to once the MMU has
> + *     been enabled, including the FDT blob (TTBR1)
> + *   - pgd entry for fixed mappings (TTBR1)
> + */
> +__create_page_tables:
> +       pgtbl   x25, x26, x28                   // idmap_pg_dir and swapper_pg_dir addresses
> +       mov     x27, lr
> +
> +       /*
> +        * Invalidate the idmap and swapper page tables to avoid potential
> +        * dirty cache lines being evicted.
> +        */
> +       mov     x0, x25
> +       add     x1, x26, #SWAPPER_DIR_SIZE
> +       bl      __inval_cache_range
> +
> +       /*
> +        * Clear the idmap and swapper page tables.
> +        */
> +       mov     x0, x25
> +       add     x6, x26, #SWAPPER_DIR_SIZE
> +1:     stp     xzr, xzr, [x0], #16
> +       stp     xzr, xzr, [x0], #16
> +       stp     xzr, xzr, [x0], #16
> +       stp     xzr, xzr, [x0], #16
> +       cmp     x0, x6
> +       b.lo    1b
> +
> +       ldr     x7, =MM_MMUFLAGS
> +
> +       /*
> +        * Create the identity mapping.
> +        */
> +       mov     x0, x25                         // idmap_pg_dir
> +       ldr     x3, =KERNEL_START
> +       add     x3, x3, x28                     // __pa(KERNEL_START)
> +       create_pgd_entry x0, x3, x5, x6
> +       ldr     x6, =KERNEL_END
> +       mov     x5, x3                          // __pa(KERNEL_START)
> +       add     x6, x6, x28                     // __pa(KERNEL_END)
> +       create_block_map x0, x7, x3, x5, x6
> +
> +       /*
> +        * Map the kernel image (starting with PHYS_OFFSET).
> +        */
> +       mov     x0, x26                         // swapper_pg_dir
> +       mov     x5, #PAGE_OFFSET
> +       create_pgd_entry x0, x5, x3, x6
> +       ldr     x6, =KERNEL_END
> +       mov     x3, x24                         // phys offset
> +       create_block_map x0, x7, x3, x5, x6
> +
> +       /*
> +        * Map the FDT blob (maximum 2MB; must be within 512MB of
> +        * PHYS_OFFSET).
> +        */
> +       mov     x3, x21                         // FDT phys address
> +       and     x3, x3, #~((1 << 21) - 1)       // 2MB aligned
> +       mov     x6, #PAGE_OFFSET
> +       sub     x5, x3, x24                     // subtract PHYS_OFFSET
> +       tst     x5, #~((1 << 29) - 1)           // within 512MB?
> +       csel    x21, xzr, x21, ne               // zero the FDT pointer
> +       b.ne    1f
> +       add     x5, x5, x6                      // __va(FDT blob)
> +       add     x6, x5, #1 << 21                // 2MB for the FDT blob
> +       sub     x6, x6, #1                      // inclusive range
> +       create_block_map x0, x7, x3, x5, x6
> +1:
> +       /*
> +        * Since the page tables have been populated with non-cacheable
> +        * accesses (MMU disabled), invalidate the idmap and swapper page
> +        * tables again to remove any speculatively loaded cache lines.
> +        */
> +       mov     x0, x25
> +       add     x1, x26, #SWAPPER_DIR_SIZE
> +       bl      __inval_cache_range
> +
> +       mov     lr, x27
> +       ret
> +ENDPROC(__create_page_tables)
> +       .ltorg
> +
> +       .align  3
> +       .type   __switch_data, %object
> +__switch_data:
> +       .quad   __mmap_switched
> +       .quad   __bss_start                     // x6
> +       .quad   __bss_stop                      // x7
> +       .quad   processor_id                    // x4
> +       .quad   __fdt_pointer                   // x5
> +       .quad   memstart_addr                   // x6
> +       .quad   init_thread_union + THREAD_START_SP // sp
> +
> +/*
> + * The following fragment of code is executed with the MMU on in MMU mode, and
> + * uses absolute addresses; this is not position independent.
> + */
> +__mmap_switched:
> +       adr     x3, __switch_data + 8
> +
> +       ldp     x6, x7, [x3], #16
> +1:     cmp     x6, x7
> +       b.hs    2f
> +       str     xzr, [x6], #8                   // Clear BSS
> +       b       1b
> +2:
> +       ldp     x4, x5, [x3], #16
> +       ldr     x6, [x3], #8
> +       ldr     x16, [x3]
> +       mov     sp, x16
> +       str     x22, [x4]                       // Save processor ID
> +       str     x21, [x5]                       // Save FDT pointer
> +       str     x24, [x6]                       // Save PHYS_OFFSET
> +       mov     x29, #0
> +       b       start_kernel
> +ENDPROC(__mmap_switched)
> +
> +/*
> + * end 'true' head section, begin head section that can be read only
> + */
> +       .section ".latehead.text","ax"
>  /*
>   * If we're fortunate enough to boot at EL2, ensure that the world is
>   * sane before dropping to EL1.
> @@ -497,183 +721,6 @@ ENDPROC(__calc_phys_offset)
>         .quad   PAGE_OFFSET
> 
>  /*
> - * Macro to create a table entry to the next page.
> - *
> - *     tbl:    page table address
> - *     virt:   virtual address
> - *     shift:  #imm page table shift
> - *     ptrs:   #imm pointers per table page
> - *
> - * Preserves:  virt
> - * Corrupts:   tmp1, tmp2
> - * Returns:    tbl -> next level table page address
> - */
> -       .macro  create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
> -       lsr     \tmp1, \virt, #\shift
> -       and     \tmp1, \tmp1, #\ptrs - 1        // table index
> -       add     \tmp2, \tbl, #PAGE_SIZE
> -       orr     \tmp2, \tmp2, #PMD_TYPE_TABLE   // address of next table and entry type
> -       str     \tmp2, [\tbl, \tmp1, lsl #3]
> -       add     \tbl, \tbl, #PAGE_SIZE          // next level table page
> -       .endm
> -
> -/*
> - * Macro to populate the PGD (and possibily PUD) for the corresponding
> - * block entry in the next level (tbl) for the given virtual address.
> - *
> - * Preserves:  tbl, next, virt
> - * Corrupts:   tmp1, tmp2
> - */
> -       .macro  create_pgd_entry, tbl, virt, tmp1, tmp2
> -       create_table_entry \tbl, \virt, PGDIR_SHIFT, PTRS_PER_PGD, \tmp1, \tmp2
> -#if SWAPPER_PGTABLE_LEVELS == 3
> -       create_table_entry \tbl, \virt, TABLE_SHIFT, PTRS_PER_PTE, \tmp1, \tmp2
> -#endif
> -       .endm
> -
> -/*
> - * Macro to populate block entries in the page table for the start..end
> - * virtual range (inclusive).
> - *
> - * Preserves:  tbl, flags
> - * Corrupts:   phys, start, end, pstate
> - */
> -       .macro  create_block_map, tbl, flags, phys, start, end
> -       lsr     \phys, \phys, #BLOCK_SHIFT
> -       lsr     \start, \start, #BLOCK_SHIFT
> -       and     \start, \start, #PTRS_PER_PTE - 1       // table index
> -       orr     \phys, \flags, \phys, lsl #BLOCK_SHIFT  // table entry
> -       lsr     \end, \end, #BLOCK_SHIFT
> -       and     \end, \end, #PTRS_PER_PTE - 1           // table end index
> -9999:  str     \phys, [\tbl, \start, lsl #3]           // store the entry
> -       add     \start, \start, #1                      // next entry
> -       add     \phys, \phys, #BLOCK_SIZE               // next block
> -       cmp     \start, \end
> -       b.ls    9999b
> -       .endm
> -
> -/*
> - * Setup the initial page tables. We only setup the barest amount which is
> - * required to get the kernel running. The following sections are required:
> - *   - identity mapping to enable the MMU (low address, TTBR0)
> - *   - first few MB of the kernel linear mapping to jump to once the MMU has
> - *     been enabled, including the FDT blob (TTBR1)
> - *   - pgd entry for fixed mappings (TTBR1)
> - */
> -__create_page_tables:
> -       pgtbl   x25, x26, x28                   // idmap_pg_dir and swapper_pg_dir addresses
> -       mov     x27, lr
> -
> -       /*
> -        * Invalidate the idmap and swapper page tables to avoid potential
> -        * dirty cache lines being evicted.
> -        */
> -       mov     x0, x25
> -       add     x1, x26, #SWAPPER_DIR_SIZE
> -       bl      __inval_cache_range
> -
> -       /*
> -        * Clear the idmap and swapper page tables.
> -        */
> -       mov     x0, x25
> -       add     x6, x26, #SWAPPER_DIR_SIZE
> -1:     stp     xzr, xzr, [x0], #16
> -       stp     xzr, xzr, [x0], #16
> -       stp     xzr, xzr, [x0], #16
> -       stp     xzr, xzr, [x0], #16
> -       cmp     x0, x6
> -       b.lo    1b
> -
> -       ldr     x7, =MM_MMUFLAGS
> -
> -       /*
> -        * Create the identity mapping.
> -        */
> -       mov     x0, x25                         // idmap_pg_dir
> -       ldr     x3, =KERNEL_START
> -       add     x3, x3, x28                     // __pa(KERNEL_START)
> -       create_pgd_entry x0, x3, x5, x6
> -       ldr     x6, =KERNEL_END
> -       mov     x5, x3                          // __pa(KERNEL_START)
> -       add     x6, x6, x28                     // __pa(KERNEL_END)
> -       create_block_map x0, x7, x3, x5, x6
> -
> -       /*
> -        * Map the kernel image (starting with PHYS_OFFSET).
> -        */
> -       mov     x0, x26                         // swapper_pg_dir
> -       mov     x5, #PAGE_OFFSET
> -       create_pgd_entry x0, x5, x3, x6
> -       ldr     x6, =KERNEL_END
> -       mov     x3, x24                         // phys offset
> -       create_block_map x0, x7, x3, x5, x6
> -
> -       /*
> -        * Map the FDT blob (maximum 2MB; must be within 512MB of
> -        * PHYS_OFFSET).
> -        */
> -       mov     x3, x21                         // FDT phys address
> -       and     x3, x3, #~((1 << 21) - 1)       // 2MB aligned
> -       mov     x6, #PAGE_OFFSET
> -       sub     x5, x3, x24                     // subtract PHYS_OFFSET
> -       tst     x5, #~((1 << 29) - 1)           // within 512MB?
> -       csel    x21, xzr, x21, ne               // zero the FDT pointer
> -       b.ne    1f
> -       add     x5, x5, x6                      // __va(FDT blob)
> -       add     x6, x5, #1 << 21                // 2MB for the FDT blob
> -       sub     x6, x6, #1                      // inclusive range
> -       create_block_map x0, x7, x3, x5, x6
> -1:
> -       /*
> -        * Since the page tables have been populated with non-cacheable
> -        * accesses (MMU disabled), invalidate the idmap and swapper page
> -        * tables again to remove any speculatively loaded cache lines.
> -        */
> -       mov     x0, x25
> -       add     x1, x26, #SWAPPER_DIR_SIZE
> -       bl      __inval_cache_range
> -
> -       mov     lr, x27
> -       ret
> -ENDPROC(__create_page_tables)
> -       .ltorg
> -
> -       .align  3
> -       .type   __switch_data, %object
> -__switch_data:
> -       .quad   __mmap_switched
> -       .quad   __bss_start                     // x6
> -       .quad   __bss_stop                      // x7
> -       .quad   processor_id                    // x4
> -       .quad   __fdt_pointer                   // x5
> -       .quad   memstart_addr                   // x6
> -       .quad   init_thread_union + THREAD_START_SP // sp
> -
> -/*
> - * The following fragment of code is executed with the MMU on in MMU mode, and
> - * uses absolute addresses; this is not position independent.
> - */
> -__mmap_switched:
> -       adr     x3, __switch_data + 8
> -
> -       ldp     x6, x7, [x3], #16
> -1:     cmp     x6, x7
> -       b.hs    2f
> -       str     xzr, [x6], #8                   // Clear BSS
> -       b       1b
> -2:
> -       ldp     x4, x5, [x3], #16
> -       ldr     x6, [x3], #8
> -       ldr     x16, [x3]
> -       mov     sp, x16
> -       str     x22, [x4]                       // Save processor ID
> -       str     x21, [x5]                       // Save FDT pointer
> -       str     x24, [x6]                       // Save PHYS_OFFSET
> -       mov     x29, #0
> -       b       start_kernel
> -ENDPROC(__mmap_switched)
> -
> -/*
>   * Exception handling. Something went wrong and we can't proceed. We ought to
>   * tell the user, but since we don't have any guarantee that we're even
>   * running on the right architecture, we do virtually nothing.
> @@ -721,21 +768,4 @@ __lookup_processor_type_data:
>         .quad   cpu_table
>         .size   __lookup_processor_type_data, . - __lookup_processor_type_data
> 
> -/*
> - * Determine validity of the x21 FDT pointer.
> - * The dtb must be 8-byte aligned and live in the first 512M of memory.
> - */
> -__vet_fdt:
> -       tst     x21, #0x7
> -       b.ne    1f
> -       cmp     x21, x24
> -       b.lt    1f
> -       mov     x0, #(1 << 29)
> -       add     x0, x0, x24
> -       cmp     x21, x0
> -       b.ge    1f
> -       ret
> -1:
> -       mov     x21, #0
> -       ret
> -ENDPROC(__vet_fdt)
> +
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 97f0c04..2b674c5 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -56,6 +56,7 @@ SECTIONS
>         }
>         .text : {                       /* Real text segment            */
>                 _stext = .;             /* Text and read-only data      */
> +                       *(.latehead.text)
>                         __exception_text_start = .;
>                         *(.exception.text)
>                         __exception_text_end = .;
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 4/7] arm64: Move some head.text functions to executable section
  2014-08-21 10:34   ` Mark Rutland
@ 2014-08-21 21:42     ` Laura Abbott
  2014-08-22  9:48       ` Mark Rutland
  0 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-21 21:42 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/21/2014 3:34 AM, Mark Rutland wrote:
> Hi Laura,
> 
> On Thu, Aug 21, 2014 at 02:20:36AM +0100, Laura Abbott wrote:
>> The code in the head.text section of the kernel exists in the
>> same section as the swapper_pg_dir which means it needs the
>> same page table permissions. The swapper_pg_dir needs to be
>> writeable but shouldn't be executable.
> 
> I think we can drop the above. As far as I can tell as of commit
> bd00cd5f8c8c (arm64: place initial page tables above the kernel) it's no
> longer relevant.
>

Yes, this should be changed. Instead of citing swapper_pg_dir, I need
to cite the fact that there may still be memory outside of stext which
will get freed to the buddy allocator and therefore should be RW/NX.
 
>> The head.text section
>> is intended to be run at early bootup before any of the regular
>> kernel mappings have been setup so there is no issue at bootup.
>> The suspend/resume/hotplug code path requires some of these
>> head.S functions to run however which means they need to be
>> executable. We can't easily move all of the head.text to
>> an executable section, so split it into two parts: that which
>> is used only at early head.S bootup and that which is used
>> after bootup. There is a small bit of code duplication because
>> of some relocation issues related to accessing code more than
>> 1MB away.
> 
> From a cursory glance it looks like the only things we need write access
> to in .head.text are __cpu_boot_mode and __switch_data. Can't we instead
> place those in .data and make .head.text executable?
> 
> We currently find them with adr, which should be easy to replace with
> adrp + add to get around relocation issues.
> 

__boot_cpu_mode should be placed in data with a push section and
__switch_data is only modified before the permissions are set up.
I took a closer look at the code and the only thing that actually
needs to be executable from head.S is __secondary_switched so
the following patch should be sufficient to cover it:

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index caa9557..5c17599 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -414,12 +414,14 @@ ENTRY(secondary_startup)
        b       __enable_mmu
 ENDPROC(secondary_startup)
 
+       .pushsection    .text, "ax"
 ENTRY(__secondary_switched)
        ldr     x0, [x21]                       // get secondary_data.stack
        mov     sp, x0
        mov     x29, #0
        b       secondary_start_kernel
 ENDPROC(__secondary_switched)
+       .popsection
 #endif /* CONFIG_SMP */
 
 /*

I think I was a bit over zealous in determining that everything needed
to be placed in .text vs. not.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCHv3 4/7] arm64: Move some head.text functions to executable section
  2014-08-21 21:42     ` Laura Abbott
@ 2014-08-22  9:48       ` Mark Rutland
  2014-08-26  0:32         ` Laura Abbott
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Rutland @ 2014-08-22  9:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 21, 2014 at 10:42:13PM +0100, Laura Abbott wrote:
> On 8/21/2014 3:34 AM, Mark Rutland wrote:
> > Hi Laura,
> > 
> > On Thu, Aug 21, 2014 at 02:20:36AM +0100, Laura Abbott wrote:
> >> The code in the head.text section of the kernel exists in the
> >> same section as the swapper_pg_dir which means it needs the
> >> same page table permissions. The swapper_pg_dir needs to be
> >> writeable but shouldn't be executable.
> > 
> > I think we can drop the above. As far as I can tell as of commit
> > bd00cd5f8c8c (arm64: place initial page tables above the kernel) it's no
> > longer relevant.
> >
> 
> Yes, this should be changed. Instead of citing swapper_pg_dir, I need
> to cite the fact that there may still be memory outside of stext which
> will get freed to the buddy allocator and therefore should be RW/NX.

Ok. I must be missing something here; we don't seem to free any of
.head.text to the buddy allocator at the moment. Are we just trying to
have the bare minimum remain executable regardless?

> >> The head.text section
> >> is intended to be run at early bootup before any of the regular
> >> kernel mappings have been setup so there is no issue at bootup.
> >> The suspend/resume/hotplug code path requires some of these
> >> head.S functions to run however which means they need to be
> >> executable. We can't easily move all of the head.text to
> >> an executable section, so split it into two parts: that which
> >> is used only at early head.S bootup and that which is used
> >> after bootup. There is a small bit of code duplication because
> >> of some relocation issues related to accessing code more than
> >> 1MB away.
> > 
> > From a cursory glance it looks like the only things we need write access
> > to in .head.text are __cpu_boot_mode and __switch_data. Can't we instead
> > place those in .data and make .head.text executable?
> > 
> > We currently find them with adr, which should be easy to replace with
> > adrp + add to get around relocation issues.
> > 
> 
> __boot_cpu_mode should be placed in data with a push section and
> __switch_data is only modified before the permissions are set up.

Ok, so those are fine then.

> I took a closer look at the code and the only thing that actually
> needs to be executable from head.S is __secondary_switched so
> the following patch should be sufficient to cover it:
>
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index caa9557..5c17599 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -414,12 +414,14 @@ ENTRY(secondary_startup)
>         b       __enable_mmu
>  ENDPROC(secondary_startup)
>  
> +       .pushsection    .text, "ax"
>  ENTRY(__secondary_switched)
>         ldr     x0, [x21]                       // get secondary_data.stack
>         mov     sp, x0
>         mov     x29, #0
>         b       secondary_start_kernel
>  ENDPROC(__secondary_switched)
> +       .popsection
>  #endif /* CONFIG_SMP */

If we need to cover __secondary_switched, don't we need to cover
__turn_mmu_on as well, given that will turn the MMU on and branch to
__secondary_switched?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 5/7] arm64: Factor out fixmap initialiation from ioremap
  2014-08-21  1:20 ` [PATCHv3 5/7] arm64: Factor out fixmap initialiation from ioremap Laura Abbott
@ 2014-08-23  5:45   ` Kees Cook
  2014-08-25 18:34     ` Laura Abbott
  0 siblings, 1 reply; 26+ messages in thread
From: Kees Cook @ 2014-08-23  5:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 20, 2014 at 6:20 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>
> The fixmap API was originally added for arm64 for
> early_ioremap purposes. It can be used for other purposes too
> so move the initialization from ioremap to somewhere more
> generic. This makes it obvious where the fixmap is being set
> up and allows for a cleaner implementation of __set_fixmap.
>
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---
>  arch/arm64/include/asm/fixmap.h |  7 +--
>  arch/arm64/kernel/setup.c       |  3 +-
>  arch/arm64/mm/ioremap.c         | 93 ++--------------------------------------
>  arch/arm64/mm/mmu.c             | 94 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 103 insertions(+), 94 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
> index 5f7bfe6..db26a2f2 100644
> --- a/arch/arm64/include/asm/fixmap.h
> +++ b/arch/arm64/include/asm/fixmap.h
> @@ -56,10 +56,11 @@ enum fixed_addresses {
>
>  #define FIXMAP_PAGE_IO     __pgprot(PROT_DEVICE_nGnRE)
>
> -extern void __early_set_fixmap(enum fixed_addresses idx,
> -                              phys_addr_t phys, pgprot_t flags);
> +void __init early_fixmap_init(void);
>
> -#define __set_fixmap __early_set_fixmap
> +#define __early_set_fixmap __set_fixmap
> +
> +extern void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
>
>  #include <asm-generic/fixmap.h>
>
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index f6f0ccf..9244e20 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -378,7 +378,8 @@ void __init setup_arch(char **cmdline_p)
>
>         *cmdline_p = boot_command_line;
>
> -       early_ioremap_init();
> +       early_fixmap_init();
> +       early_ioremap_setup();
>
>         parse_early_param();
>
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index fa324bd..cbb99c8 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -103,97 +103,10 @@ void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
>  }
>  EXPORT_SYMBOL(ioremap_cache);
>
> -static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
> -#if CONFIG_ARM64_PGTABLE_LEVELS > 2
> -static pte_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
> -#endif
> -#if CONFIG_ARM64_PGTABLE_LEVELS > 3
> -static pte_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
> -#endif
> -
> -static inline pud_t * __init early_ioremap_pud(unsigned long addr)
> -{
> -       pgd_t *pgd;
> -
> -       pgd = pgd_offset_k(addr);
> -       BUG_ON(pgd_none(*pgd) || pgd_bad(*pgd));
> -
> -       return pud_offset(pgd, addr);
> -}
> -
> -static inline pmd_t * __init early_ioremap_pmd(unsigned long addr)
> -{
> -       pud_t *pud = early_ioremap_pud(addr);
> -
> -       BUG_ON(pud_none(*pud) || pud_bad(*pud));
> -
> -       return pmd_offset(pud, addr);
> -}
> -
> -static inline pte_t * __init early_ioremap_pte(unsigned long addr)
> -{
> -       pmd_t *pmd = early_ioremap_pmd(addr);
> -
> -       BUG_ON(pmd_none(*pmd) || pmd_bad(*pmd));
> -
> -       return pte_offset_kernel(pmd, addr);
> -}
> -
> +/*
> + * Must be called after early_fixmap_init
> + */
>  void __init early_ioremap_init(void)
>  {
> -       pgd_t *pgd;
> -       pud_t *pud;
> -       pmd_t *pmd;
> -       unsigned long addr = fix_to_virt(FIX_BTMAP_BEGIN);
> -
> -       pgd = pgd_offset_k(addr);
> -       pgd_populate(&init_mm, pgd, bm_pud);
> -       pud = pud_offset(pgd, addr);
> -       pud_populate(&init_mm, pud, bm_pmd);
> -       pmd = pmd_offset(pud, addr);
> -       pmd_populate_kernel(&init_mm, pmd, bm_pte);
> -
> -       /*
> -        * The boot-ioremap range spans multiple pmds, for which
> -        * we are not prepared:
> -        */
> -       BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
> -                    != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
> -
> -       if (pmd != early_ioremap_pmd(fix_to_virt(FIX_BTMAP_END))) {
> -               WARN_ON(1);
> -               pr_warn("pmd %p != %p\n",
> -                       pmd, early_ioremap_pmd(fix_to_virt(FIX_BTMAP_END)));
> -               pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
> -                       fix_to_virt(FIX_BTMAP_BEGIN));
> -               pr_warn("fix_to_virt(FIX_BTMAP_END):   %08lx\n",
> -                       fix_to_virt(FIX_BTMAP_END));
> -
> -               pr_warn("FIX_BTMAP_END:       %d\n", FIX_BTMAP_END);
> -               pr_warn("FIX_BTMAP_BEGIN:     %d\n",
> -                       FIX_BTMAP_BEGIN);
> -       }
> -
>         early_ioremap_setup();
>  }
> -
> -void __init __early_set_fixmap(enum fixed_addresses idx,
> -                              phys_addr_t phys, pgprot_t flags)
> -{
> -       unsigned long addr = __fix_to_virt(idx);
> -       pte_t *pte;
> -
> -       if (idx >= __end_of_fixed_addresses) {
> -               BUG();
> -               return;
> -       }
> -
> -       pte = early_ioremap_pte(addr);
> -
> -       if (pgprot_val(flags))
> -               set_pte(pte, pfn_pte(phys >> PAGE_SHIFT, flags));
> -       else {
> -               pte_clear(&init_mm, addr, pte);
> -               flush_tlb_kernel_range(addr, addr+PAGE_SIZE);
> -       }
> -}
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index c555672..bd549a3 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -28,6 +28,7 @@
>  #include <linux/io.h>
>
>  #include <asm/cputype.h>
> +#include <asm/fixmap.h>
>  #include <asm/sections.h>
>  #include <asm/setup.h>
>  #include <asm/sizes.h>
> @@ -459,3 +460,96 @@ void vmemmap_free(unsigned long start, unsigned long end)
>  {
>  }
>  #endif /* CONFIG_SPARSEMEM_VMEMMAP */
> +
> +static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
> +#if CONFIG_ARM64_PGTABLE_LEVELS > 2
> +static pte_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
> +#endif
> +#if CONFIG_ARM64_PGTABLE_LEVELS > 3
> +static pte_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
> +#endif
> +
> +static inline pud_t * fixmap_pud(unsigned long addr)
> +{
> +       pgd_t *pgd = pgd_offset_k(addr);
> +
> +       BUG_ON(pgd_none(*pgd) || pgd_bad(*pgd));
> +
> +       return pud_offset(pgd, addr);
> +}
> +
> +static inline pmd_t * fixmap_pmd(unsigned long addr)
> +{
> +       pud_t *pud = fixmap_pud(addr);
> +
> +       BUG_ON(pud_none(*pud) || pud_bad(*pud));
> +
> +       return pmd_offset(pud, addr);
> +}
> +
> +static inline pte_t * fixmap_pte(unsigned long addr)
> +{
> +       pmd_t *pmd = fixmap_pmd(addr);
> +
> +       BUG_ON(pmd_none(*pmd) || pmd_bad(*pmd));
> +
> +       return pte_offset_kernel(pmd, addr);
> +}
> +
> +void __init early_fixmap_init(void)
> +{
> +       pgd_t *pgd;
> +       pud_t *pud;
> +       pmd_t *pmd;
> +       unsigned long addr = FIXADDR_START;
> +
> +       pgd = pgd_offset_k(addr);
> +       pgd_populate(&init_mm, pgd, bm_pud);
> +       pud = pud_offset(pgd, addr);
> +       pud_populate(&init_mm, pud, bm_pmd);
> +       pmd = pmd_offset(pud, addr);
> +       pmd_populate_kernel(&init_mm, pmd, bm_pte);
> +
> +       /*
> +        * The boot-ioremap range spans multiple pmds, for which
> +        * we are not prepared:
> +        */
> +       BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
> +                    != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
> +
> +       if ((pmd != fixmap_pmd(fix_to_virt(FIX_BTMAP_BEGIN)))
> +            || pmd != fixmap_pmd(fix_to_virt(FIX_BTMAP_END))) {
> +               WARN_ON(1);
> +               pr_warn("pmd %p != %p, %p\n",
> +                       pmd, fixmap_pmd(fix_to_virt(FIX_BTMAP_BEGIN)),
> +                       fixmap_pmd(fix_to_virt(FIX_BTMAP_END)));
> +               pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
> +                       fix_to_virt(FIX_BTMAP_BEGIN));
> +               pr_warn("fix_to_virt(FIX_BTMAP_END):   %08lx\n",
> +                       fix_to_virt(FIX_BTMAP_END));
> +
> +               pr_warn("FIX_BTMAP_END:       %d\n", FIX_BTMAP_END);
> +               pr_warn("FIX_BTMAP_BEGIN:     %d\n", FIX_BTMAP_BEGIN);
> +       }
> +}
> +
> +void __set_fixmap(enum fixed_addresses idx,
> +                              phys_addr_t phys, pgprot_t flags)
> +{
> +       unsigned long addr = __fix_to_virt(idx);
> +       pte_t *pte;
> +
> +       if (idx >= __end_of_fixed_addresses) {
> +               BUG();
> +               return;
> +       }

Is it worth cleaning this up into BUG_ON instead of BUG; return; ?
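
i.e. something like the following, otherwise identical to the hunk
above (untested sketch, just to show the shape):

	void __set_fixmap(enum fixed_addresses idx,
			  phys_addr_t phys, pgprot_t flags)
	{
		unsigned long addr = __fix_to_virt(idx);
		pte_t *pte;

		/* an out-of-range index is a caller bug */
		BUG_ON(idx >= __end_of_fixed_addresses);

		pte = fixmap_pte(addr);

		if (pgprot_val(flags)) {
			set_pte(pte, pfn_pte(phys >> PAGE_SHIFT, flags));
		} else {
			pte_clear(&init_mm, addr, pte);
			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
		}
	}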

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> +
> +       pte = fixmap_pte(addr);
> +
> +       if (pgprot_val(flags)) {
> +               set_pte(pte, pfn_pte(phys >> PAGE_SHIFT, flags));
> +       } else {
> +               pte_clear(&init_mm, addr, pte);
> +               flush_tlb_kernel_range(addr, addr+PAGE_SIZE);
> +       }
> +}
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 6/7] arm64: use fixmap for text patching when text is RO
  2014-08-21  1:20 ` [PATCHv3 6/7] arm64: use fixmap for text patching when text is RO Laura Abbott
@ 2014-08-23  5:51   ` Kees Cook
  2014-08-25 18:38     ` Laura Abbott
  0 siblings, 1 reply; 26+ messages in thread
From: Kees Cook @ 2014-08-23  5:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 20, 2014 at 6:20 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> When kernel text is marked as read only, it cannot be modified directly.
> Use a fixmap to modify the text instead in a similar manner to
> x86 and arm.
>
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---
> I think there were some questions on spinlocks for the arm version, not
> sure if similar concerns apply here.
> ---
>  arch/arm64/include/asm/fixmap.h |  1 +
>  arch/arm64/include/asm/insn.h   |  2 ++
>  arch/arm64/kernel/insn.c        | 74 +++++++++++++++++++++++++++++++++++++++--
>  arch/arm64/kernel/jump_label.c  |  2 +-
>  4 files changed, 75 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
> index db26a2f2..2cd4b0d 100644
> --- a/arch/arm64/include/asm/fixmap.h
> +++ b/arch/arm64/include/asm/fixmap.h
> @@ -48,6 +48,7 @@ enum fixed_addresses {
>
>         FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
>         FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
> +       FIX_TEXT_POKE0,
>         __end_of_fixed_addresses
>  };
>
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index dc1f73b..07ac29b 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -92,6 +92,7 @@ bool aarch64_insn_is_nop(u32 insn);
>
>  int aarch64_insn_read(void *addr, u32 *insnp);
>  int aarch64_insn_write(void *addr, u32 insn);
> +int aarch64_insn_write_early(void *addr, u32 insn);
>  enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>  u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
>                                   u32 insn, u64 imm);
> @@ -103,6 +104,7 @@ u32 aarch64_insn_gen_nop(void);
>  bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
>
>  int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
> +int __aarch64_insn_patch_text_nosync(void *addr, u32 insn, bool early);
>  int aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt);
>  int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
>  #endif /* __ASSEMBLY__ */
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 92f3683..b25f8db 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -17,10 +17,13 @@
>  #include <linux/bitops.h>
>  #include <linux/compiler.h>
>  #include <linux/kernel.h>
> +#include <linux/mm.h>
>  #include <linux/smp.h>
> +#include <linux/spinlock.h>
>  #include <linux/stop_machine.h>
>  #include <linux/uaccess.h>
>  #include <asm/cacheflush.h>
> +#include <asm/fixmap.h>
>  #include <asm/insn.h>
>
>  static int aarch64_insn_encoding_class[] = {
> @@ -65,6 +68,36 @@ bool __kprobes aarch64_insn_is_nop(u32 insn)
>         }
>  }
>
> +static DEFINE_SPINLOCK(patch_lock);
> +
> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
> +{
> +       unsigned long uintaddr = (uintptr_t) addr;
> +       bool module = !core_kernel_text(uintaddr);
> +       struct page *page;
> +
> +       if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
> +               page = vmalloc_to_page(addr);
> +       else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
> +               page = virt_to_page(addr);
> +       else
> +               return addr;
> +
> +       if (flags)
> +               spin_lock_irqsave(&patch_lock, *flags);
> +
> +       set_fixmap(fixmap, page_to_phys(page));
> +
> +       return (void *) (__fix_to_virt(fixmap) + (uintaddr & ~PAGE_MASK));
> +}
> +
> +static void __kprobes patch_unmap(int fixmap, unsigned long *flags)
> +{
> +       clear_fixmap(fixmap);
> +
> +       if (flags)
> +               spin_unlock_irqrestore(&patch_lock, *flags);
> +}
>  /*
>   * In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always
>   * little-endian.
> @@ -81,10 +114,36 @@ int __kprobes aarch64_insn_read(void *addr, u32 *insnp)
>         return ret;
>  }
>
> +static int __kprobes __aarch64_insn_write(void *addr, u32 insn, bool patch)
> +{
> +       void *waddr = addr;
> +       unsigned long flags;
> +       int ret;
> +
> +       if (patch)
> +               waddr = patch_map(addr, FIX_TEXT_POKE0, &flags);
> +
> +       ret = probe_kernel_write(waddr, &insn, AARCH64_INSN_SIZE);
> +
> +       if (waddr != addr) {
> +               __flush_dcache_area(waddr, AARCH64_INSN_SIZE);

Is this flush to make sure the waddr change has actually made it to
physical memory?

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> +               patch_unmap(FIX_TEXT_POKE0, &flags);
> +       }
> +
> +       return ret;
> +}
> +
>  int __kprobes aarch64_insn_write(void *addr, u32 insn)
>  {
>         insn = cpu_to_le32(insn);
> -       return probe_kernel_write(addr, &insn, AARCH64_INSN_SIZE);
> +       return __aarch64_insn_write(addr, insn, true);
> +}
> +
> +int __kprobes aarch64_insn_write_early(void *addr, u32 insn)
> +{
> +       insn = cpu_to_le32(insn);
> +       return __aarch64_insn_write(addr, insn, false);
> +
>  }
>
>  static bool __kprobes __aarch64_insn_hotpatch_safe(u32 insn)
> @@ -117,7 +176,7 @@ bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
>                __aarch64_insn_hotpatch_safe(new_insn);
>  }
>
> -int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
> +int __kprobes __aarch64_insn_patch_text_nosync(void *addr, u32 insn, bool early)
>  {
>         u32 *tp = addr;
>         int ret;
> @@ -126,7 +185,11 @@ int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
>         if ((uintptr_t)tp & 0x3)
>                 return -EINVAL;
>
> -       ret = aarch64_insn_write(tp, insn);
> +       if (early)
> +               ret = aarch64_insn_write_early(tp, insn);
> +       else
> +               ret = aarch64_insn_write(tp, insn);
> +
>         if (ret == 0)
>                 flush_icache_range((uintptr_t)tp,
>                                    (uintptr_t)tp + AARCH64_INSN_SIZE);
> @@ -134,6 +197,11 @@ int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
>         return ret;
>  }
>
> +int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
> +{
> +       return __aarch64_insn_patch_text_nosync(addr, insn, false);
> +}
> +
>  struct aarch64_insn_patch {
>         void            **text_addrs;
>         u32             *new_insns;
> diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c
> index 263a166..9ac30bb 100644
> --- a/arch/arm64/kernel/jump_label.c
> +++ b/arch/arm64/kernel/jump_label.c
> @@ -38,7 +38,7 @@ static void __arch_jump_label_transform(struct jump_entry *entry,
>         }
>
>         if (is_static)
> -               aarch64_insn_patch_text_nosync(addr, insn);
> +               __aarch64_insn_patch_text_nosync(addr, insn, true);
>         else
>                 aarch64_insn_patch_text(&addr, &insn, 1);
>  }
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 7/7] arm64: add better page protections to arm64
  2014-08-21  1:20 ` [PATCHv3 7/7] arm64: add better page protections to arm64 Laura Abbott
@ 2014-08-23  5:59   ` Kees Cook
  2014-08-25 19:04     ` Laura Abbott
  0 siblings, 1 reply; 26+ messages in thread
From: Kees Cook @ 2014-08-23  5:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 20, 2014 at 6:20 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> Add page protections for arm64 similar to those in arm.
> This is for security reasons to prevent certain classes
> of exploits. The current method:
>
> - Map all memory as either RWX or RW. We round to the nearest
>   section to avoid creating page tables before everything is mapped
> - Once everything is mapped, if either end of the RWX section should
>   not be X, we split the PMD and remap as necessary
> - When initmem is to be freed, we change the permissions back to
>   RW (using stop machine if necessary to flush the TLB)
> - If CONFIG_DEBUG_RODATA is set, the read only sections are set
>   read only.
>
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---
>  arch/arm64/Kconfig                  |   8 +
>  arch/arm64/Kconfig.debug            |  23 +++
>  arch/arm64/include/asm/cacheflush.h |   3 +
>  arch/arm64/kernel/vmlinux.lds.S     |  17 ++
>  arch/arm64/mm/init.c                |   1 +
>  arch/arm64/mm/mm.h                  |   2 +
>  arch/arm64/mm/mmu.c                 | 303 +++++++++++++++++++++++++++++++-----
>  7 files changed, 321 insertions(+), 36 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index fd4e81a..b718974 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -328,6 +328,14 @@ config FORCE_MAX_ZONEORDER
>         default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
>         default "11"
>
> +config NX_KERNEL_AREAS
> +       bool "Allow no execute kernel areas"
> +       help
> +         Mark areas of the kernel that should never be executed (data, heap
> +         etc.) as no execute. This helps with a certain set of security issues.
> +         You should say Y here unless you like malicious code doing unexpected
> +         things on your hardware.
> +
>  endmenu
>
>  menu "Boot options"
> diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
> index 4ee8e90..032ccc1 100644
> --- a/arch/arm64/Kconfig.debug
> +++ b/arch/arm64/Kconfig.debug
> @@ -43,4 +43,27 @@ config ARM64_RANDOMIZE_TEXT_OFFSET
>           of TEXT_OFFSET and platforms must not require a specific
>           value.
>
> +config DEBUG_RODATA
> +       bool "Make kernel text and rodata read-only"
> +       help
> +         If this is set, kernel text and rodata will be made read-only. This
> +         is to help catch accidental or malicious attempts to change the
> +         kernel's executable code. Additionally splits rodata from kernel
> +         text so it can be made explicitly non-executable.
> +
> +          If in doubt, say Y

I wonder if this option should explicitly select NX_KERNEL_AREAS? It
seems crazy to me that someone would only want one of these. And this
config is the cross-arch name for "make kernel page tables protected",
so it might be nice to turn everything on via this one switch?
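
Roughly this on top of the hunk above (just a sketch of the idea):

 config DEBUG_RODATA
 	bool "Make kernel text and rodata read-only"
+	select NX_KERNEL_AREAS
 	help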

> +
> +config DEBUG_ALIGN_RODATA
> +       depends on DEBUG_RODATA
> +       bool "Align linker sections up to SECTION_SIZE"
> +       help
> +         If this option is enabled, sections that may potentially be marked as
> +         read only or non-executable will be aligned up to the section size of
> +         the kernel. This prevents sections from being split into pages and
> +         avoids a potential TLB penalty. The downside is an increase in
> +         alignment and potentially wasted space. Turn on this option if
> +         performance is more important than memory pressure.
> +
> +         If in doubt, say N

Making this configurable is pretty neat. What's the overhead on memory
pressure for arm64? Is the correct default "N"?

-Kees

> +
>  endmenu
> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
> index f2defe1..6e2f910 100644
> --- a/arch/arm64/include/asm/cacheflush.h
> +++ b/arch/arm64/include/asm/cacheflush.h
> @@ -148,4 +148,7 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
>  {
>  }
>
> +#ifdef CONFIG_DEBUG_RODATA
> +void mark_rodata_ro(void);
> +#endif
>  #endif
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 2b674c5..6dea60d 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -8,6 +8,7 @@
>  #include <asm/thread_info.h>
>  #include <asm/memory.h>
>  #include <asm/page.h>
> +#include <asm/pgtable.h>
>
>  #include "image.h"
>
> @@ -54,6 +55,9 @@ SECTIONS
>                 _text = .;
>                 HEAD_TEXT
>         }
> +#ifdef CONFIG_DEBUG_ALIGN_RODATA
> +       . = ALIGN(1<<SECTION_SHIFT);
> +#endif
>         .text : {                       /* Real text segment            */
>                 _stext = .;             /* Text and read-only data      */
>                         *(.latehead.text)
> @@ -71,19 +75,32 @@ SECTIONS
>                 *(.got)                 /* Global offset table          */
>         }
>
> +#ifdef CONFIG_DEBUG_ALIGN_RODATA
> +       . = ALIGN(1<<SECTION_SHIFT);
> +#endif
>         RO_DATA(PAGE_SIZE)
>         EXCEPTION_TABLE(8)
>         NOTES
>         _etext = .;                     /* End of text and rodata section */
>
> +#ifdef CONFIG_DEBUG_ALIGN_RODATA
> +       . = ALIGN(1<<SECTION_SHIFT);
> +#else
>         . = ALIGN(PAGE_SIZE);
> +#endif
>         __init_begin = .;
>
>         INIT_TEXT_SECTION(8)
>         .exit.text : {
>                 ARM_EXIT_KEEP(EXIT_TEXT)
>         }
> +
> +#ifdef CONFIG_DEBUG_ALIGN_RODATA
> +       . = ALIGN(1<<SECTION_SHIFT);
> +       __init_data_begin = .;
> +#else
>         . = ALIGN(16);
> +#endif
>         .init.data : {
>                 INIT_DATA
>                 INIT_SETUP(16)
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 5b4526e..d796b76 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -323,6 +323,7 @@ void __init mem_init(void)
>
>  void free_initmem(void)
>  {
> +       fixup_init();
>         free_initmem_default(0);
>  }
>
> diff --git a/arch/arm64/mm/mm.h b/arch/arm64/mm/mm.h
> index d519f4f..82347d8 100644
> --- a/arch/arm64/mm/mm.h
> +++ b/arch/arm64/mm/mm.h
> @@ -1,2 +1,4 @@
>  extern void __init bootmem_init(void);
>  extern void __init arm64_swiotlb_init(void);
> +
> +void fixup_init(void);
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index bd549a3..7118152 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -26,6 +26,7 @@
>  #include <linux/memblock.h>
>  #include <linux/fs.h>
>  #include <linux/io.h>
> +#include <linux/stop_machine.h>
>
>  #include <asm/cputype.h>
>  #include <asm/fixmap.h>
> @@ -137,17 +138,55 @@ static void __init *early_alloc(unsigned long sz)
>         return ptr;
>  }
>
> -static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
> +/*
> + * remap a PMD into pages
> + */
> +static noinline void __ref split_pmd(pmd_t *pmd, bool early)
> +{
> +       pte_t *pte, *start_pte;
> +       unsigned long pfn;
> +       int i = 0;
> +
> +       if (early)
> +               start_pte = pte = early_alloc(PTRS_PER_PTE*sizeof(pte_t));
> +       else
> +               start_pte = pte = (pte_t *)__get_free_page(PGALLOC_GFP);
> +
> +       BUG_ON(!pte);
> +
> +       pfn = pmd_pfn(*pmd);
> +
> +       do {
> +               /*
> +                * Need to have the least restrictive permissions available
> +                * permissions will be fixed up later
> +                */
> +               set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
> +               pfn++;
> +       } while (pte++, i++, i < PTRS_PER_PTE);
> +
> +
> +       __pmd_populate(pmd, __pa(start_pte), PMD_TYPE_TABLE);
> +       flush_tlb_all();
> +}
> +
> +static void __ref alloc_init_pte(pmd_t *pmd, unsigned long addr,
>                                   unsigned long end, unsigned long pfn,
> -                                 pgprot_t prot)
> +                                 pgprot_t prot, bool early)
>  {
>         pte_t *pte;
>
>         if (pmd_none(*pmd)) {
> -               pte = early_alloc(PTRS_PER_PTE * sizeof(pte_t));
> +               if (early)
> +                       pte = early_alloc(PTRS_PER_PTE * sizeof(pte_t));
> +               else
> +                       pte = (pte_t *)__get_free_page(PGALLOC_GFP);
> +               BUG_ON(!pte);
>                 __pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
>         }
> -       BUG_ON(pmd_bad(*pmd));
> +
> +       if (pmd_bad(*pmd))
> +               split_pmd(pmd, early);
>
>         pte = pte_offset_kernel(pmd, addr);
>         do {
> @@ -156,29 +195,52 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
>         } while (pte++, addr += PAGE_SIZE, addr != end);
>  }
>
> -static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
> +void __ref split_pud(pud_t *old_pud, pmd_t *pmd)
> +{
> +       unsigned long pfn = pud_pfn(*old_pud);
> +        const pmdval_t mask = PMD_SECT_USER | PMD_SECT_PXN | PMD_SECT_UXN |
> +                              PMD_SECT_RDONLY | PMD_SECT_PROT_NONE |
> +                              PMD_SECT_VALID;
> +       pgprot_t prot = __pgprot(pud_val(*old_pud) & mask);
> +       int i = 0;
> +
> +       do {
> +               set_pmd(pmd, pfn_pmd(pfn, prot));
> +               pfn++;
> +       } while (pmd++, i++, i < PTRS_PER_PMD);
> +}
> +
> +static void __ref alloc_init_pmd(pud_t *pud, unsigned long addr,
>                                   unsigned long end, phys_addr_t phys,
> -                                 int map_io)
> +                                 pgprot_t sect_prot, pgprot_t pte_prot,
> +                                 bool early, int map_io)
>  {
>         pmd_t *pmd;
>         unsigned long next;
> -       pmdval_t prot_sect;
> -       pgprot_t prot_pte;
>
>         if (map_io) {
> -               prot_sect = PROT_SECT_DEVICE_nGnRE;
> -               prot_pte = __pgprot(PROT_DEVICE_nGnRE);
> -       } else {
> -               prot_sect = PROT_SECT_NORMAL_EXEC;
> -               prot_pte = PAGE_KERNEL_EXEC;
> +               sect_prot = PROT_SECT_DEVICE_nGnRE;
> +               pte_prot = __pgprot(PROT_DEVICE_nGnRE);
>         }
>
>         /*
>          * Check for initial section mappings in the pgd/pud and remove them.
>          */
>         if (pud_none(*pud) || pud_bad(*pud)) {
> -               pmd = early_alloc(PTRS_PER_PMD * sizeof(pmd_t));
> +               if (early)
> +                       pmd = early_alloc(PTRS_PER_PMD * sizeof(pmd_t));
> +               else
> +                       pmd = pmd_alloc_one(&init_mm, addr);
> +               BUG_ON(!pmd);
> +               if (pud_sect(*pud)) {
> +                       /*
> +                        * need to have the 1G of mappings continue to be
> +                        * present
> +                        */
> +                       split_pud(pud, pmd);
> +               }
>                 pud_populate(&init_mm, pud, pmd);
> +               flush_tlb_all();
>         }
>
>         pmd = pmd_offset(pud, addr);
> @@ -186,8 +248,8 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
>                 next = pmd_addr_end(addr, end);
>                 /* try section mapping first */
>                 if (((addr | next | phys) & ~SECTION_MASK) == 0) {
> -                       pmd_t old_pmd =*pmd;
> -                       set_pmd(pmd, __pmd(phys | prot_sect));
> +                       pmd_t old_pmd = *pmd;
> +                       set_pmd(pmd, __pmd(phys | sect_prot));
>                         /*
>                          * Check for previous table entries created during
>                          * boot (__create_page_tables) and flush them.
> @@ -196,15 +258,16 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
>                                 flush_tlb_all();
>                 } else {
>                         alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
> -                                      prot_pte);
> +                                       pte_prot, early);
>                 }
>                 phys += next - addr;
>         } while (pmd++, addr = next, addr != end);
>  }
>
> -static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
> +static void __ref alloc_init_pud(pgd_t *pgd, unsigned long addr,
>                                   unsigned long end, unsigned long phys,
> -                                 int map_io)
> +                                 pgprot_t sect_prot, pgprot_t pte_prot,
> +                                 bool early, int map_io)
>  {
>         pud_t *pud;
>         unsigned long next;
> @@ -222,10 +285,10 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
>                 /*
>                  * For 4K granule only, attempt to put down a 1GB block
>                  */
> -               if (!map_io && (PAGE_SHIFT == 12) &&
> +               if (!map_io && early && (PAGE_SHIFT == 12) &&
>                     ((addr | next | phys) & ~PUD_MASK) == 0) {
>                         pud_t old_pud = *pud;
> -                       set_pud(pud, __pud(phys | PROT_SECT_NORMAL_EXEC));
> +                       set_pud(pud, __pud(phys | sect_prot));
>
>                         /*
>                          * If we have an old value for a pud, it will
> @@ -240,7 +303,8 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
>                                 flush_tlb_all();
>                         }
>                 } else {
> -                       alloc_init_pmd(pud, addr, next, phys, map_io);
> +                       alloc_init_pmd(pud, addr, next, phys, sect_prot,
> +                                       pte_prot, early, map_io);
>                 }
>                 phys += next - addr;
>         } while (pud++, addr = next, addr != end);
> @@ -250,9 +314,11 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
>   * Create the page directory entries and any necessary page tables for the
>   * mapping specified by 'md'.
>   */
> -static void __init __create_mapping(pgd_t *pgd, phys_addr_t phys,
> -                                   unsigned long virt, phys_addr_t size,
> -                                   int map_io)
> +static void __ref __create_mapping(pgd_t *pgd, phys_addr_t phys,
> +                                 unsigned long virt,
> +                                 phys_addr_t size,
> +                                 pgprot_t sect_prot, pgprot_t pte_prot,
> +                                 int map_io, bool early)
>  {
>         unsigned long addr, length, end, next;
>
> @@ -262,31 +328,140 @@ static void __init __create_mapping(pgd_t *pgd, phys_addr_t phys,
>         end = addr + length;
>         do {
>                 next = pgd_addr_end(addr, end);
> -               alloc_init_pud(pgd, addr, next, phys, map_io);
> +               alloc_init_pud(pgd, addr, next, phys, sect_prot, pte_prot,
> +                               early, map_io);
>                 phys += next - addr;
>         } while (pgd++, addr = next, addr != end);
>  }
>
> -static void __init create_mapping(phys_addr_t phys, unsigned long virt,
> -                                 phys_addr_t size)
> +void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
> +{
> +       if ((addr >> PGDIR_SHIFT) >= ARRAY_SIZE(idmap_pg_dir)) {
> +               pr_warn("BUG: not creating id mapping for %pa\n", &addr);
> +               return;
> +       }
> +       __create_mapping(&idmap_pg_dir[pgd_index(addr)],
> +                        addr, addr, size, PROT_SECT_NORMAL_EXEC,
> +                        PAGE_KERNEL_EXEC, map_io, false);
> +}
> +
> +static inline pmd_t *pmd_off_k(unsigned long virt)
> +{
> +       return pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
> +}
> +
> +#ifdef CONFIG_EARLY_PRINTK
> +/*
> + * Create an early I/O mapping using the pgd/pmd entries already populated
> + * in head.S as this function is called too early to allocate any memory. The
> + * mapping size is 2MB with 4KB pages or 64KB with 64KB pages.
> + */
> +void __iomem * __init early_io_map(phys_addr_t phys, unsigned long virt)
> +{
> +       unsigned long size, mask;
> +       bool page64k = IS_ENABLED(CONFIG_ARM64_64K_PAGES);
> +       pgd_t *pgd;
> +       pud_t *pud;
> +       pmd_t *pmd;
> +       pte_t *pte;
> +
> +       /*
> +        * No early pte entries with !ARM64_64K_PAGES configuration, so using
> +        * sections (pmd).
> +        */
> +       size = page64k ? PAGE_SIZE : SECTION_SIZE;
> +       mask = ~(size - 1);
> +
> +       pgd = pgd_offset_k(virt);
> +       pud = pud_offset(pgd, virt);
> +       if (pud_none(*pud))
> +               return NULL;
> +       pmd = pmd_offset(pud, virt);
> +
> +       if (page64k) {
> +               if (pmd_none(*pmd))
> +                       return NULL;
> +               pte = pte_offset_kernel(pmd, virt);
> +               set_pte(pte, __pte((phys & mask) | PROT_DEVICE_nGnRE));
> +       } else {
> +               set_pmd(pmd, __pmd((phys & mask) | PROT_SECT_DEVICE_nGnRE));
> +       }
> +
> +       return (void __iomem *)((virt & mask) + (phys & ~mask));
> +}
> +#endif
> +
> +static void __ref create_mapping(phys_addr_t phys, unsigned long virt,
> +                                 phys_addr_t size,
> +                                 pgprot_t sect_prot, pgprot_t pte_prot)
>  {
>         if (virt < VMALLOC_START) {
>                 pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n",
>                         &phys, virt);
>                 return;
>         }
> -       __create_mapping(pgd_offset_k(virt & PAGE_MASK), phys, virt, size, 0);
> +
> +       return __create_mapping(pgd_offset_k(virt & PAGE_MASK), phys, virt,
> +                               size, sect_prot, pte_prot, 0, true);
>  }
>
> -void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
> +static void __ref create_mapping_late(phys_addr_t phys, unsigned long virt,
> +                                 phys_addr_t size,
> +                                 pgprot_t sect_prot, pgprot_t pte_prot)
>  {
> -       if ((addr >> PGDIR_SHIFT) >= ARRAY_SIZE(idmap_pg_dir)) {
> -               pr_warn("BUG: not creating id mapping for %pa\n", &addr);
> +       if (virt < VMALLOC_START) {
> +               pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n",
> +                       &phys, virt);
>                 return;
>         }
> -       __create_mapping(&idmap_pg_dir[pgd_index(addr)],
> -                        addr, addr, size, map_io);
> +
> +       return __create_mapping(pgd_offset_k(virt & PAGE_MASK), phys, virt,
> +                               size, sect_prot, pte_prot, 0, false);
> +}
> +
> +#ifdef CONFIG_NX_KERNEL_AREAS
> +static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
> +{
> +       /*
> +        * Set up the executable regions using the existing section mappings
> +        * for now. This will get more fine grained later once all memory
> +        * is mapped
> +        */
> +       unsigned long kernel_x_start = round_down(__pa(_stext), SECTION_SIZE);
> +       unsigned long kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE);
> +
> +       if (end < kernel_x_start) {
> +               create_mapping(start, __phys_to_virt(start),
> +                       end - start, PROT_SECT_NORMAL, PAGE_KERNEL);
> +       } else if (start >= kernel_x_end) {
> +               create_mapping(start, __phys_to_virt(start),
> +                       end - start, PROT_SECT_NORMAL, PAGE_KERNEL);
> +       } else {
> +               if (start < kernel_x_start)
> +                       create_mapping(start, __phys_to_virt(start),
> +                               kernel_x_start - start,
> +                               PROT_SECT_NORMAL,
> +                               PAGE_KERNEL);
> +               create_mapping(kernel_x_start,
> +                               __phys_to_virt(kernel_x_start),
> +                               kernel_x_end - kernel_x_start,
> +                               PROT_SECT_NORMAL_EXEC, PAGE_KERNEL_EXEC);
> +               if (kernel_x_end < end)
> +                       create_mapping(kernel_x_end,
> +                               __phys_to_virt(kernel_x_end),
> +                               end - kernel_x_end,
> +                               PROT_SECT_NORMAL,
> +                               PAGE_KERNEL);
> +       }
> +
> +}
> +#else
> +static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
> +{
> +       create_mapping(start, __phys_to_virt(start), end - start,
> +                       PROT_SECT_NORMAL_EXEC, PAGE_KERNEL_EXEC);
>  }
> +#endif
>
>  static void __init map_mem(void)
>  {
> @@ -328,14 +503,69 @@ static void __init map_mem(void)
>                         memblock_set_current_limit(limit);
>                 }
>  #endif
> -
> -               create_mapping(start, __phys_to_virt(start), end - start);
> +               __map_memblock(start, end);
>         }
>
>         /* Limit no longer required. */
>         memblock_set_current_limit(MEMBLOCK_ALLOC_ANYWHERE);
>  }
>
> +void __init fixup_executable(void)
> +{
> +#ifdef CONFIG_NX_KERNEL_AREAS
> +       /* now that we are actually fully mapped, make the start/end more fine grained */
> +       if (!IS_ALIGNED((unsigned long)_stext, SECTION_SIZE)) {
> +               unsigned long aligned_start = round_down(__pa(_stext),
> +                                                       SECTION_SIZE);
> +
> +               create_mapping(aligned_start, __phys_to_virt(aligned_start),
> +                               __pa(_stext) - aligned_start,
> +                               PROT_SECT_NORMAL,
> +                               PAGE_KERNEL);
> +       }
> +
> +       if (!IS_ALIGNED((unsigned long)__init_end, SECTION_SIZE)) {
> +               unsigned long aligned_end = round_up(__pa(__init_end),
> +                                                       SECTION_SIZE);
> +               create_mapping(__pa(__init_end), (unsigned long)__init_end,
> +                               aligned_end - __pa(__init_end),
> +                               PROT_SECT_NORMAL,
> +                               PAGE_KERNEL);
> +       }
> +#endif
> +}
> +
> +#ifdef CONFIG_DEBUG_RODATA
> +void mark_rodata_ro(void)
> +{
> +       create_mapping_late(__pa(_stext), (unsigned long)_stext,
> +                               (unsigned long)_etext - (unsigned long)_stext,
> +                               PROT_SECT_NORMAL_EXEC | PMD_SECT_RDONLY,
> +                               PAGE_KERNEL_EXEC | PTE_RDONLY);
> +
> +}
> +#endif
> +
> +static int __flush_mappings(void *unused)
> +{
> +       flush_tlb_kernel_range((unsigned long)__init_begin,
> +                               (unsigned long)__init_end);
> +       return 0;
> +}
> +
> +void __ref fixup_init(void)
> +{
> +       phys_addr_t start = __pa(__init_begin);
> +       phys_addr_t end = __pa(__init_end);
> +
> +       create_mapping_late(start, (unsigned long)__init_begin,
> +                       end - start,
> +                       PROT_SECT_NORMAL,
> +                       PAGE_KERNEL);
> +       if (!IS_ALIGNED(start, SECTION_SIZE) || !IS_ALIGNED(end, SECTION_SIZE))
> +               stop_machine(__flush_mappings, NULL, NULL);
> +}
> +
>  /*
>   * paging_init() sets up the page tables, initialises the zone memory
>   * maps and sets up the zero page.
> @@ -345,6 +575,7 @@ void __init paging_init(void)
>         void *zero_page;
>
>         map_mem();
> +       fixup_executable();
>
>         /*
>          * Finally flush the caches and tlb to ensure that we're in a
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 1/7] arm64: Treat handle_arch_irq as a function pointer
       [not found]   ` <CAGXu5jLur_gdXs2X5BCmxB6L5HwgyP12jkrufK7bpS0Cxhp_+Q@mail.gmail.com>
@ 2014-08-25 18:23     ` Laura Abbott
  0 siblings, 0 replies; 26+ messages in thread
From: Laura Abbott @ 2014-08-25 18:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/22/2014 10:41 PM, Kees Cook wrote:
>
>
>
> On Wed, Aug 20, 2014 at 6:20 PM, Laura Abbott <lauraa@codeaurora.org
> <mailto:lauraa@codeaurora.org>> wrote:
>
>     handle_arch_irq isn't actually text, it's just a function pointer.
>     causes problems if we ever want to make the kernel text read only.
>     causes problesm if we ever want to make the kernel text read only.
>     Declare handle_arch_irq as a proper function pointer stored in
>     the data section.
>
>     Signed-off-by: Laura Abbott <lauraa@codeaurora.org
>     <mailto:lauraa@codeaurora.org>>
>
>
> Out of curiosity, did you find these cases via inspection, or were you
> debugging faults after making stuff RO/NX?
>


These were all faults after making things RO/NX. It was a good way to know
that the code was working as expected.

Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 5/7] arm64: Factor out fixmap initialiation from ioremap
  2014-08-23  5:45   ` Kees Cook
@ 2014-08-25 18:34     ` Laura Abbott
  0 siblings, 0 replies; 26+ messages in thread
From: Laura Abbott @ 2014-08-25 18:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/22/2014 10:45 PM, Kees Cook wrote:
> On Wed, Aug 20, 2014 at 6:20 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>>
>> The fixmap API was originally added for arm64 for
>> early_ioremap purposes. It can be used for other purposes too
>> so move the initialization from ioremap to somewhere more
>> generic. This makes it obvious where the fixmap is being set
>> up and allows for a cleaner implementation of __set_fixmap.
>>
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> ---
...
>> +void __init early_fixmap_init(void)
>> +{
>> +       pgd_t *pgd;
>> +       pud_t *pud;
>> +       pmd_t *pmd;
>> +       unsigned long addr = FIXADDR_START;
>> +
>> +       pgd = pgd_offset_k(addr);
>> +       pgd_populate(&init_mm, pgd, bm_pud);
>> +       pud = pud_offset(pgd, addr);
>> +       pud_populate(&init_mm, pud, bm_pmd);
>> +       pmd = pmd_offset(pud, addr);
>> +       pmd_populate_kernel(&init_mm, pmd, bm_pte);
>> +
>> +       /*
>> +        * The boot-ioremap range spans multiple pmds, for which
>> +        * we are not prepared:
>> +        */
>> +       BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
>> +                    != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
>> +
>> +       if ((pmd != fixmap_pmd(fix_to_virt(FIX_BTMAP_BEGIN)))
>> +            || pmd != fixmap_pmd(fix_to_virt(FIX_BTMAP_END))) {
>> +               WARN_ON(1);
>> +               pr_warn("pmd %p != %p, %p\n",
>> +                       pmd, fixmap_pmd(fix_to_virt(FIX_BTMAP_BEGIN)),
>> +                       fixmap_pmd(fix_to_virt(FIX_BTMAP_END)));
>> +               pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
>> +                       fix_to_virt(FIX_BTMAP_BEGIN));
>> +               pr_warn("fix_to_virt(FIX_BTMAP_END):   %08lx\n",
>> +                       fix_to_virt(FIX_BTMAP_END));
>> +
>> +               pr_warn("FIX_BTMAP_END:       %d\n", FIX_BTMAP_END);
>> +               pr_warn("FIX_BTMAP_BEGIN:     %d\n", FIX_BTMAP_BEGIN);
>> +       }
>> +}
>> +
>> +void __set_fixmap(enum fixed_addresses idx,
>> +                              phys_addr_t phys, pgprot_t flags)
>> +{
>> +       unsigned long addr = __fix_to_virt(idx);
>> +       pte_t *pte;
>> +
>> +       if (idx >= __end_of_fixed_addresses) {
>> +               BUG();
>> +               return;
>> +       }
> 
> Is it worth cleaning this up into BUG_ON instead of BUG; return; ?
> 
> Reviewed-by: Kees Cook <keescook@chromium.org>
> 
> -Kees

I'm guessing this was set up so the error path still works even if
CONFIG_BUG is turned off.
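
As a rough standalone illustration (plain userspace C, not kernel code; the
BUG()/BUG_ON() definitions below just mimic the asm-generic !CONFIG_BUG
fallbacks of this era), the explicit return is what keeps a bad index from
being used when BUG() compiles away to nothing:

#include <stdio.h>

#define BUG()		do { } while (0)		/* no-op, as with CONFIG_BUG=n */
#define BUG_ON(x)	do { if (x) BUG(); } while (0)

#define NR_FIXMAPS	8	/* stand-in for __end_of_fixed_addresses */

static void set_fixmap_checked(int idx)
{
	if (idx >= NR_FIXMAPS) {
		BUG();
		return;		/* still bails out even though BUG() did nothing */
	}
	printf("mapping fixmap slot %d\n", idx);
}

static void set_fixmap_bugon(int idx)
{
	BUG_ON(idx >= NR_FIXMAPS);
	printf("mapping fixmap slot %d\n", idx);	/* reached even for a bad idx */
}

int main(void)
{
	set_fixmap_checked(42);		/* silently refused */
	set_fixmap_bugon(42);		/* falls through and "maps" out of range */
	return 0;
}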

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 6/7] arm64: use fixmap for text patching when text is RO
  2014-08-23  5:51   ` Kees Cook
@ 2014-08-25 18:38     ` Laura Abbott
  2014-08-26 18:36       ` Mark Rutland
  0 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-25 18:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/22/2014 10:51 PM, Kees Cook wrote:
> On Wed, Aug 20, 2014 at 6:20 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>> When kernel text is marked as read only, it cannot be modified directly.
>> Use a fixmap to modify the text instead in a similar manner to
>> x86 and arm.
>>
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> ---
>> I think there were some questions on spinlocks for the arm version, not
>> sure if similar concerns apply here.
>> ---
>>  arch/arm64/include/asm/fixmap.h |  1 +
>>  arch/arm64/include/asm/insn.h   |  2 ++
>>  arch/arm64/kernel/insn.c        | 74 +++++++++++++++++++++++++++++++++++++++--
>>  arch/arm64/kernel/jump_label.c  |  2 +-
>>  4 files changed, 75 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
>> index db26a2f2..2cd4b0d 100644
>> --- a/arch/arm64/include/asm/fixmap.h
>> +++ b/arch/arm64/include/asm/fixmap.h
>> @@ -48,6 +48,7 @@ enum fixed_addresses {
>>
>>         FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
>>         FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
>> +       FIX_TEXT_POKE0,
>>         __end_of_fixed_addresses
>>  };
>>
>> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
>> index dc1f73b..07ac29b 100644
>> --- a/arch/arm64/include/asm/insn.h
>> +++ b/arch/arm64/include/asm/insn.h
>> @@ -92,6 +92,7 @@ bool aarch64_insn_is_nop(u32 insn);
>>
>>  int aarch64_insn_read(void *addr, u32 *insnp);
>>  int aarch64_insn_write(void *addr, u32 insn);
>> +int aarch64_insn_write_early(void *addr, u32 insn);
>>  enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
>>  u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
>>                                   u32 insn, u64 imm);
>> @@ -103,6 +104,7 @@ u32 aarch64_insn_gen_nop(void);
>>  bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
>>
>>  int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
>> +int __aarch64_insn_patch_text_nosync(void *addr, u32 insn, bool early);
>>  int aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt);
>>  int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
>>  #endif /* __ASSEMBLY__ */
>> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
>> index 92f3683..b25f8db 100644
>> --- a/arch/arm64/kernel/insn.c
>> +++ b/arch/arm64/kernel/insn.c
>> @@ -17,10 +17,13 @@
>>  #include <linux/bitops.h>
>>  #include <linux/compiler.h>
>>  #include <linux/kernel.h>
>> +#include <linux/mm.h>
>>  #include <linux/smp.h>
>> +#include <linux/spinlock.h>
>>  #include <linux/stop_machine.h>
>>  #include <linux/uaccess.h>
>>  #include <asm/cacheflush.h>
>> +#include <asm/fixmap.h>
>>  #include <asm/insn.h>
>>
>>  static int aarch64_insn_encoding_class[] = {
>> @@ -65,6 +68,36 @@ bool __kprobes aarch64_insn_is_nop(u32 insn)
>>         }
>>  }
>>
>> +static DEFINE_SPINLOCK(patch_lock);
>> +
>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
>> +{
>> +       unsigned long uintaddr = (uintptr_t) addr;
>> +       bool module = !core_kernel_text(uintaddr);
>> +       struct page *page;
>> +
>> +       if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
>> +               page = vmalloc_to_page(addr);
>> +       else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
>> +               page = virt_to_page(addr);
>> +       else
>> +               return addr;
>> +
>> +       if (flags)
>> +               spin_lock_irqsave(&patch_lock, *flags);
>> +
>> +       set_fixmap(fixmap, page_to_phys(page));
>> +
>> +       return (void *) (__fix_to_virt(fixmap) + (uintaddr & ~PAGE_MASK));
>> +}
>> +
>> +static void __kprobes patch_unmap(int fixmap, unsigned long *flags)
>> +{
>> +       clear_fixmap(fixmap);
>> +
>> +       if (flags)
>> +               spin_unlock_irqrestore(&patch_lock, *flags);
>> +}
>>  /*
>>   * In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always
>>   * little-endian.
>> @@ -81,10 +114,36 @@ int __kprobes aarch64_insn_read(void *addr, u32 *insnp)
>>         return ret;
>>  }
>>
>> +static int __kprobes __aarch64_insn_write(void *addr, u32 insn, bool patch)
>> +{
>> +       void *waddr = addr;
>> +       unsigned long flags;
>> +       int ret;
>> +
>> +       if (patch)
>> +               waddr = patch_map(addr, FIX_TEXT_POKE0, &flags);
>> +
>> +       ret = probe_kernel_write(waddr, &insn, AARCH64_INSN_SIZE);
>> +
>> +       if (waddr != addr) {
>> +               __flush_dcache_area(waddr, AARCH64_INSN_SIZE);
> 
> Is this flush to make sure the waddr change has actually made it to
> physical memory?
> 
> Reviewed-by: Kees Cook <keescook@chromium.org>
> 
> -Kees
> 

It's more for the alias flushing to match what arm was doing. This was
one of the parts that I wasn't sure if it was necessary or not.

Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 7/7] arm64: add better page protections to arm64
  2014-08-23  5:59   ` Kees Cook
@ 2014-08-25 19:04     ` Laura Abbott
  0 siblings, 0 replies; 26+ messages in thread
From: Laura Abbott @ 2014-08-25 19:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/22/2014 10:59 PM, Kees Cook wrote:
> On Wed, Aug 20, 2014 at 6:20 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>> Add page protections for arm64 similar to those in arm.
>> This is for security reasons to prevent certain classes
>> of exploits. The current method:
>>
>> - Map all memory as either RWX or RW. We round to the nearest
>>   section to avoid creating page tables before everything is mapped
>> - Once everything is mapped, if either end of the RWX section should
>>   not be X, we split the PMD and remap as necessary
>> - When initmem is to be freed, we change the permissions back to
>>   RW (using stop machine if necessary to flush the TLB)
>> - If CONFIG_DEBUG_RODATA is set, the read only sections are set
>>   read only.
>>
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> ---
>>  arch/arm64/Kconfig                  |   8 +
>>  arch/arm64/Kconfig.debug            |  23 +++
>>  arch/arm64/include/asm/cacheflush.h |   3 +
>>  arch/arm64/kernel/vmlinux.lds.S     |  17 ++
>>  arch/arm64/mm/init.c                |   1 +
>>  arch/arm64/mm/mm.h                  |   2 +
>>  arch/arm64/mm/mmu.c                 | 303 +++++++++++++++++++++++++++++++-----
>>  7 files changed, 321 insertions(+), 36 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index fd4e81a..b718974 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -328,6 +328,14 @@ config FORCE_MAX_ZONEORDER
>>         default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
>>         default "11"
>>
>> +config NX_KERNEL_AREAS
>> +       bool "Allow no execute kernel areas"
>> +       help
>> +         Mark areas of the kernel that should never be executed (data, heap
>> +         etc.) as no execute. This helps with a certain set of security issues.
>> +         You should say Y here unless you like malicious code doing unexpected
>> +         things on your hardware.
>> +
>>  endmenu
>>
>>  menu "Boot options"
>> diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
>> index 4ee8e90..032ccc1 100644
>> --- a/arch/arm64/Kconfig.debug
>> +++ b/arch/arm64/Kconfig.debug
>> @@ -43,4 +43,27 @@ config ARM64_RANDOMIZE_TEXT_OFFSET
>>           of TEXT_OFFSET and platforms must not require a specific
>>           value.
>>
>> +config DEBUG_RODATA
>> +       bool "Make kernel text and rodata read-only"
>> +       help
>> +         If this is set, kernel text and rodata will be made read-only. This
>> +         is to help catch accidental or malicious attempts to change the
>> +         kernel's executable code. Additionally splits rodata from kernel
>> +         text so it can be made explicitly non-executable.
>> +
>> +          If in doubt, say Y
> 
> I wonder if this option should explicitly select NX_KERNEL_AREAS? It
> seems crazy to me that someone would only want one of these. And this
> config is the cross-arch name for "make kernel page tables protected",
> so it might be nice to turn everything on via this one switch?
> 

I think I will just merge the two Kconfigs. It just seems a little odd to have
CONFIG_DEBUG_RODATA also setting nx permissions. Perhaps this needs
some clarification elsewhere. (CONFIG_DEBUG_RODATA ->
CONFIG_STRICT_KERNEL_PERMS ?)

>> +
>> +config DEBUG_ALIGN_RODATA
>> +       depends on DEBUG_RODATA
>> +       bool "Align linker sections up to SECTION_SIZE"
>> +       help
>> +         If this option is enabled, sections that may potentially be marked as
>> +         read only or non-executable will be aligned up to the section size of
>> +         the kernel. This prevents sections from being split into pages and
>> +         avoids a potential TLB penalty. The downside is an increase in
>> +         alignment and potentially wasted space. Turn on this option if
>> +         performance is more important than memory pressure.
>> +
>> +         If in doubt, say N
> 
> Making this configurable is pretty neat. What's the overhead on memory
> pressure for arm64? Is the correct default "N"?
> 

Last time I measured this, it was a loss of about 5.5MB when we align up
to section size. Depending on the memory footprint, this can either be
negligible or pretty drastic. From other experimental patches, breaking
down sections can result in ~10% performance drop on some benchmarks.
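
For a rough sense of scale, here is a back-of-the-envelope sketch (standalone
C with made-up counts; it only assumes 4KB pages, 2MB sections and three
rounded-up boundaries, e.g. _stext, _etext and __init_end):

#include <stdio.h>

int main(void)
{
	unsigned long section_size = 2UL << 20;	/* 2MB sections with 4KB pages */
	unsigned long page_size = 4096;
	unsigned long boundaries = 3;		/* illustrative count of rounded-up boundaries */

	/* each boundary rounded up to a section can waste up to a section minus a page */
	printf("worst-case padding: %lu KB\n",
	       boundaries * (section_size - page_size) / 1024);
	return 0;
}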

Given the expected use cases for arm64, I suspect this should actually
be Y. My plan was to get the infrastructure in and then let everyone decide
what a reasonable default should be.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 3/7] arm64: Move cpu_resume into the text section
  2014-08-21  1:20 ` [PATCHv3 3/7] arm64: Move cpu_resume into the text section Laura Abbott
@ 2014-08-25 20:34   ` Stephen Boyd
  2014-08-26  0:43     ` Laura Abbott
  0 siblings, 1 reply; 26+ messages in thread
From: Stephen Boyd @ 2014-08-25 20:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/20/14 18:20, Laura Abbott wrote:
> The function cpu_resume currently lives in the .data
> section. This breaks functionality to make .data no
> execute. Move cpu_resume to .text which is where it
> actually belongs.
>
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---

This is similar to arm (where I don't see a similar patch to this in
Kees' patchset for RO). In arm we have this comment:

/*
 * Note: Yes, part of the following code is located into the .data section.
 *       This is to allow sleep_save_sp to be accessed with a relative load
 *       while we can't rely on any MMU translation.  We could have put
 *       sleep_save_sp in the .text section as well, but some setups might
 *       insist on it to be truly read-only.
 */

Doesn't the same situation apply here? How is this problem overcome?

But I'm a little confused, shouldn't this code run with the MMU disabled?

>  ENDPROC(cpu_resume_after_mmu)
>  
> -	.data
>  ENTRY(cpu_resume)
>  	bl	el2_setup		// if in EL2 drop to EL1 cleanly
>  #ifdef CONFIG_SMP
>  	mrs	x1, mpidr_el1
> -	adr	x4, mpidr_hash_ptr
> +	adrp	x4, mpidr_hash_ptr
> +	add	x4, x4, #:lo12:mpidr_hash_ptr
>  	ldr	x5, [x4]
>  	add	x8, x4, x5		// x8 = struct mpidr_hash phys address
>          /* retrieve mpidr_hash members to compute the hash */
> @@ -143,14 +143,15 @@ ENTRY(cpu_resume)
>  #else
>  	mov	x7, xzr
>  #endif
> -	adr	x0, sleep_save_sp
> +	adrp	x0, sleep_save_sp
> +	add	x0, x0, #:lo12:sleep_save_sp
>  	ldr	x0, [x0, #SLEEP_SAVE_SP_PHYS]
>  	ldr	x0, [x0, x7, lsl #3]
>  	/* load sp from context */
>  	ldr	x2, [x0, #CPU_CTX_SP]
> -	adr	x1, sleep_idmap_phys
> +	adrp	x1, sleep_idmap_phys
>  	/* load physical address of identity map page table in x1 */
> -	ldr	x1, [x1]
> +	ldr	x1, [x1, #:lo12:sleep_idmap_phys]
>  	mov	sp, x2
>  	/*
>  	 * cpu_do_resume expects x0 to contain context physical address
> @@ -160,6 +161,7 @@ ENTRY(cpu_resume)
>  	b	cpu_resume_mmu		// Resume MMU, never returns
>  ENDPROC(cpu_resume)
>  
> +	.data
>  	.align 3
>  mpidr_hash_ptr:
>  	/*


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 4/7] arm64: Move some head.text functions to executable section
  2014-08-22  9:48       ` Mark Rutland
@ 2014-08-26  0:32         ` Laura Abbott
  2014-08-26 17:45           ` Mark Rutland
  0 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-26  0:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/22/2014 2:48 AM, Mark Rutland wrote:
> On Thu, Aug 21, 2014 at 10:42:13PM +0100, Laura Abbott wrote:
>> On 8/21/2014 3:34 AM, Mark Rutland wrote:
>>> Hi Laura,
>>>
>>> On Thu, Aug 21, 2014 at 02:20:36AM +0100, Laura Abbott wrote:
>>>> The code in the head.text section of the kernel exists in the
>>>> same section as the swapper_pg_dir which means it needs the
>>>> same page table permissions. The swapper_pg_dir needs to be
>>>> writeable but shouldn't be executable.
>>>
>>> I think we can drop the above. As far as I can tell as of commit
>>> bd00cd5f8c8c (arm64: place initial page tables above the kernel) it's no
>>> longer relevant.
>>>
>>
>> Yes, this should be changed. Instead of citing swapper_pg_dir, I need
>> to cite the fact that there may still be memory outside of stext which
>> will get freed to the buddy allocator and therefore should be RW/NX.
> 
> Ok. I must be missing something here; we don't seem to free any of
> .head.text to the buddy allocator at the moment. Are we just trying to
> have the bare minimum remain executable regardless?
> 

It's not head.text per se, but we do 

memblock_reserve(__pa(_text), _end - _text);

to reserve the text area. On the system I was testing on, _text did not
start at PAGE_OFFSET, so there was still some unreserved memory between
PAGE_OFFSET and _text that was freed to the buddy allocator.
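
A minimal standalone sketch of that situation (the numbers are illustrative
assumptions, not measurements from that system: a 39-bit VA PAGE_OFFSET and
a 512KB TEXT_OFFSET):

#include <stdio.h>

int main(void)
{
	unsigned long page_offset = 0xffffffc000000000UL;	/* start of the linear map */
	unsigned long text_offset = 0x80000;			/* assumed TEXT_OFFSET */
	unsigned long text_start  = page_offset + text_offset;	/* where _text lands */

	/*
	 * memblock_reserve(__pa(_text), _end - _text) only covers [_text, _end),
	 * so the span below _text stays unreserved, gets freed to the buddy
	 * allocator later, and therefore needs to end up mapped RW/NX.
	 */
	printf("unreserved gap below _text: %lu KB\n",
	       (text_start - page_offset) / 1024);
	return 0;
}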

>>>> The head.text section
>>>> is intended to be run at early bootup before any of the regular
>>>> kernel mappings have been setup so there is no issue at bootup.
>>>> The suspend/resume/hotplug code path requires some of these
>>>> head.S functions to run however which means they need to be
>>>> executable. We can't easily move all of the head.text to
>>>> an executable section, so split it into two parts: that which
>>>> is used only at early head.S bootup and that which is used
>>>> after bootup. There is a small bit of code duplication because
>>>> of some relocation issues related to accessing code more than
>>>> 1MB away.
>>>
>>> From a cursory glance it looks like the only things we need write access
>>> to in .head.text are __cpu_boot_mode and __switch_data. Can't we instead
>>> place those in .data and make .head.text executable?
>>>
>>> We currently find them with adr, which should be easy to replace with
>>> adrp + add to get around relocation issues.
>>>
>>
>> __boot_cpu_mode should be placed in data with a push section and
>> __switch_data is only modified before the permissions are set up.
> 
> Ok, so those are fine then.
> 
>> I took a closer look at the code and the only thing that actually
>> needs to be executable from head.S is __secondary_switched so
>> the following patch should be sufficient to cover it:
>>
>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>> index caa9557..5c17599 100644
>> --- a/arch/arm64/kernel/head.S
>> +++ b/arch/arm64/kernel/head.S
>> @@ -414,12 +414,14 @@ ENTRY(secondary_startup)
>>         b       __enable_mmu
>>  ENDPROC(secondary_startup)
>>  
>> +       .pushsection    .text, "ax"
>>  ENTRY(__secondary_switched)
>>         ldr     x0, [x21]                       // get secondary_data.stack
>>         mov     sp, x0
>>         mov     x29, #0
>>         b       secondary_start_kernel
>>  ENDPROC(__secondary_switched)
>> +       .popsection
>>  #endif /* CONFIG_SMP */
> 
> If we need to cover __secondary_switched, don't we need to cover
> __turn_mmu_on as well, given that will turn the MMU on and branch to
> __secondary_switched?
> 

You're right in principle, but when I was testing this I seemed to get away
without needing it. It's probably just luck, and I should move
__turn_mmu_on to .text as well, just to be on the safe side.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 3/7] arm64: Move cpu_resume into the text section
  2014-08-25 20:34   ` Stephen Boyd
@ 2014-08-26  0:43     ` Laura Abbott
  2014-08-26  1:08       ` Stephen Boyd
  0 siblings, 1 reply; 26+ messages in thread
From: Laura Abbott @ 2014-08-26  0:43 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/25/2014 1:34 PM, Stephen Boyd wrote:
> On 08/20/14 18:20, Laura Abbott wrote:
>> The function cpu_resume currently lives in the .data
>> section. This breaks functionality to make .data no
>> execute. Move cpu_resume to .text which is where it
>> actually belongs.
>>
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> ---
> 
> This is similar to arm (where I don't see a similar patch to this in
> Kees' patchset for RO). In arm we have this comment:
> 
> /*
>  * Note: Yes, part of the following code is located into the .data section.
>  *       This is to allow sleep_save_sp to be accessed with a relative load
>  *       while we can't rely on any MMU translation.  We could have put
>  *       sleep_save_sp in the .text section as well, but some setups might
>  *       insist on it to be truly read-only.
>  */
> 
> Doesn't the same situation apply here? How is this problem overcome?
> 
> But I'm a little confused, shouldn't this code run with the MMU disabled?
> 
>>  ENDPROC(cpu_resume_after_mmu)
>>  
>> -	.data
>>  ENTRY(cpu_resume)
>>  	bl	el2_setup		// if in EL2 drop to EL1 cleanly
>>  #ifdef CONFIG_SMP
>>  	mrs	x1, mpidr_el1
>> -	adr	x4, mpidr_hash_ptr
>> +	adrp	x4, mpidr_hash_ptr
>> +	add	x4, x4, #:lo12:mpidr_hash_ptr
>>  	ldr	x5, [x4]
>>  	add	x8, x4, x5		// x8 = struct mpidr_hash phys address
>>          /* retrieve mpidr_hash members to compute the hash */
>> @@ -143,14 +143,15 @@ ENTRY(cpu_resume)
>>  #else
>>  	mov	x7, xzr
>>  #endif
>> -	adr	x0, sleep_save_sp
>> +	adrp	x0, sleep_save_sp
>> +	add	x0, x0, #:lo12:sleep_save_sp
>>  	ldr	x0, [x0, #SLEEP_SAVE_SP_PHYS]
>>  	ldr	x0, [x0, x7, lsl #3]
>>  	/* load sp from context */
>>  	ldr	x2, [x0, #CPU_CTX_SP]
>> -	adr	x1, sleep_idmap_phys
>> +	adrp	x1, sleep_idmap_phys
>>  	/* load physical address of identity map page table in x1 */
>> -	ldr	x1, [x1]
>> +	ldr	x1, [x1, #:lo12:sleep_idmap_phys]
>>  	mov	sp, x2
>>  	/*
>>  	 * cpu_do_resume expects x0 to contain context physical address
>> @@ -160,6 +161,7 @@ ENTRY(cpu_resume)
>>  	b	cpu_resume_mmu		// Resume MMU, never returns
>>  ENDPROC(cpu_resume)
>>  
>> +	.data
>>  	.align 3
>>  mpidr_hash_ptr:
>>  	/*
> 
> 

Good point. I think this was a patch I added when I was debugging other
issues and assumed it would be needed (code in .data segment, seems
naturally a problem, right?). When I revert the patch though it seems
to work just fine. I suspect the comment about pc relative load is no
longer relevant since I use the relocation trick to properly access
sleep_save_sp in the data section.

Since it's not technically needed, I could drop the patch and add one
adding the comment back saying this was done on purpose. On the other
hand, I wonder if I could do something 'interesting' by modifying
the cpu_resume code since it's writable if I was a malicious
program.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 3/7] arm64: Move cpu_resume into the text section
  2014-08-26  0:43     ` Laura Abbott
@ 2014-08-26  1:08       ` Stephen Boyd
  0 siblings, 0 replies; 26+ messages in thread
From: Stephen Boyd @ 2014-08-26  1:08 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/25/14 17:43, Laura Abbott wrote:

> Good point. I think this was a patch I added when I was debugging other
> issues and assumed it would be needed (code in .data segment, seems
> naturally a problem, right?). When I revert the patch though it seems
> to work just fine. I suspect the comment about pc relative load is no
> longer relevant since I use the relocation trick to properly access
> sleep_save_sp in the data section.

Ah good. Can we move this code to the text section on arm32 as well
please? Probably update the commit text too.

>
> Since it's not technically needed, I could drop the patch and add one
> adding the comment back saying this was done on purpose. On the other
> hand, I wonder if I could do something 'interesting' by modifying
> the cpu_resume code since it's writable if I was a malicious
> program.

Even moving the cpu_resume function into the text section doesn't prevent
a malicious program that can write to the sleep_save_sp area from using a
different resume function. I suppose that is a bit harder to do, though.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 4/7] arm64: Move some head.text functions to executable section
  2014-08-26  0:32         ` Laura Abbott
@ 2014-08-26 17:45           ` Mark Rutland
  0 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2014-08-26 17:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 26, 2014 at 01:32:27AM +0100, Laura Abbott wrote:
> On 8/22/2014 2:48 AM, Mark Rutland wrote:
> > On Thu, Aug 21, 2014 at 10:42:13PM +0100, Laura Abbott wrote:
> >> On 8/21/2014 3:34 AM, Mark Rutland wrote:
> >>> Hi Laura,
> >>>
> >>> On Thu, Aug 21, 2014 at 02:20:36AM +0100, Laura Abbott wrote:
> >>>> The code in the head.text section of the kernel exists in the
> >>>> same section as the swapper_pg_dir which means it needs the
> >>>> same page table permissions. The swapper_pg_dir needs to be
> >>>> writeable but shouldn't be executable.
> >>>
> >>> I think we can drop the above. As far as I can tell as of commit
> >>> bd00cd5f8c8c (arm64: place initial page tables above the kernel) it's no
> >>> longer relevant.
> >>>
> >>
> >> Yes, this should be changed. Instead of citing swapper_pg_dir, I need
> >> to cite the fact that there may still be memory outside of stext which
> >> will get freed to the buddy allocator and therefore should be RW/NX.
> > 
> > Ok. I must be missing something here; we don't seem to free any of
> > .head.text to the buddy allocator at the moment. Are we just trying to
> > have the bare minimum remain executable regardless?
> > 
> 
> It's not head.text per se, but we do 
> 
> memblock_reserve(__pa(_text), _end - _text);
> 
> to reserve the text area. On the system I was testing on, _text did not
> start at PAGE_OFFSET, so there was still some unreserved memory between
> PAGE_OFFSET and _text that was freed to the buddy allocator.

Ah, I see. I'd forgotten about the TEXT_OFFSET area.

Thanks for the clarification.

> >>>> The head.text section
> >>>> is intended to be run at early bootup before any of the regular
> >>>> kernel mappings have been setup so there is no issue at bootup.
> >>>> The suspend/resume/hotplug code path requires some of these
> >>>> head.S functions to run however which means they need to be
> >>>> executable. We can't easily move all of the head.text to
> >>>> an executable section, so split it into two parts: that which
> >>>> is used only at early head.S bootup and that which is used
> >>>> after bootup. There is a small bit of code duplication because
> >>>> of some relocation issues related to accessing code more than
> >>>> 1MB away.
> >>>
> >>> From a cursory glance it looks like the only things we need write access
> >>> to in .head.text are __cpu_boot_mode and __switch_data. Can't we instead
> >>> place those in .data and make .head.text executable?
> >>>
> >>> We currently find them with adr, which should be easy to replace with
> >>> adrp + add to get around relocation issues.
> >>>
> >>
> >> __boot_cpu_mode should be placed in data with a push section and
> >> __switch_data is only modified before the permissions are set up.
> > 
> > Ok, so those are fine then.
> > 
> >> I took a closer look at the code and the only thing that actually
> >> needs to be executable from head.S is __secondary_switched so
> >> the following patch should be sufficient to cover it:
> >>
> >> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> >> index caa9557..5c17599 100644
> >> --- a/arch/arm64/kernel/head.S
> >> +++ b/arch/arm64/kernel/head.S
> >> @@ -414,12 +414,14 @@ ENTRY(secondary_startup)
> >>         b       __enable_mmu
> >>  ENDPROC(secondary_startup)
> >>  
> >> +       .pushsection    .text, "ax"
> >>  ENTRY(__secondary_switched)
> >>         ldr     x0, [x21]                       // get secondary_data.stack
> >>         mov     sp, x0
> >>         mov     x29, #0
> >>         b       secondary_start_kernel
> >>  ENDPROC(__secondary_switched)
> >> +       .popsection
> >>  #endif /* CONFIG_SMP */
> > 
> > If we need to cover __secondary_switched, don't we need to cover
> > __turn_mmu_on as well, given that will turn the MMU on and branch to
> > __secondary_switched?
> > 
> 
> You're right in principle, but when I was testing this I seemed to get away
> without needing it. It's probably just luck, and I should move
> __turn_mmu_on to .text as well, just to be on the safe side.

Interesting. I guess the I-cache was on already, then?

Moving __turn_mmu_on to the text section sounds like the right thing to
do.

Cheers,
Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 6/7] arm64: use fixmap for text patching when text is RO
  2014-08-25 18:38     ` Laura Abbott
@ 2014-08-26 18:36       ` Mark Rutland
  0 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2014-08-26 18:36 UTC (permalink / raw)
  To: linux-arm-kernel

[...]

> >> +static int __kprobes __aarch64_insn_write(void *addr, u32 insn, bool patch)
> >> +{
> >> +       void *waddr = addr;
> >> +       unsigned long flags;
> >> +       int ret;
> >> +
> >> +       if (patch)
> >> +               waddr = patch_map(addr, FIX_TEXT_POKE0, &flags);
> >> +
> >> +       ret = probe_kernel_write(waddr, &insn, AARCH64_INSN_SIZE);
> >> +
> >> +       if (waddr != addr) {
> >> +               __flush_dcache_area(waddr, AARCH64_INSN_SIZE);
> > 
> > Is this flush to make sure the waddr change has actually made it to
> > physical memory?
> > 
> > Reviewed-by: Kees Cook <keescook@chromium.org>
> > 
> > -Kees
> > 
> 
> It's more for the alias flushing to match what arm was doing. This was
> one of the parts that I wasn't sure if it was necessary or not.

ARMv8 doesn't allow for aliases in the D-cache, so I think we can drop
the __flush_dcache_area call:

  - D-cache maintenance instructions execute in program-order relative
    to loads & stores that access an address in Normal memory with Inner
    Write {Through,Back} attributes within the same cache line. (per
    ARMv8 ARM, D3-1615).

  - D-cache maintenance for an address is visible at all aliases. (per
    ARMv8 ARM, D4-1750)

So we shouldn't need a barrier between the write and the D-cache
maintenance, and we don't care which virtual alias we perform the
maintenance on. As flush_icache_range flushes the VA matching the
I-cache, that should be sufficient.
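
In other words, the write path could presumably be trimmed to something like
the sketch below (hand-waved from the hunk quoted above, not the final code;
it assumes the caller still performs flush_icache_range() on the original
address afterwards, as the reasoning above relies on):

static int __aarch64_insn_write(void *addr, u32 insn, bool patch)
{
	void *waddr = addr;
	unsigned long flags;
	int ret;

	if (patch)
		waddr = patch_map(addr, FIX_TEXT_POKE0, &flags);

	/* plain store through the fixmap alias; per the above, no extra
	 * D-cache maintenance is needed for the alias itself
	 */
	ret = probe_kernel_write(waddr, &insn, AARCH64_INSN_SIZE);

	if (waddr != addr)
		patch_unmap(FIX_TEXT_POKE0, &flags);

	return ret;
}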

Cheers,
Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCHv3 1/7] arm64: Treat handle_arch_irq as a function pointer
  2014-08-21  1:20 ` [PATCHv3 1/7] arm64: Treat handle_arch_irq as a function pointer Laura Abbott
       [not found]   ` <CAGXu5jLur_gdXs2X5BCmxB6L5HwgyP12jkrufK7bpS0Cxhp_+Q@mail.gmail.com>
@ 2014-08-28 17:02   ` Catalin Marinas
  1 sibling, 0 replies; 26+ messages in thread
From: Catalin Marinas @ 2014-08-28 17:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 21, 2014 at 02:20:33AM +0100, Laura Abbott wrote:
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -168,7 +168,8 @@ tsk	.req	x28		// current thread_info
>   * Interrupt handling.
>   */
>  	.macro	irq_handler
> -	ldr	x1, handle_arch_irq
> +	adrp x1, handle_arch_irq
> +	ldr x1, [x1, #:lo12:handle_arch_irq]

Nitpick: please keep the tabs between instructions and registers; it
looks nicer with the rest of the code.

-- 
Catalin

^ permalink raw reply	[flat|nested] 26+ messages in thread
