All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/8] arm: support CONFIG_RODATA
@ 2014-08-13 17:06 ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, Mark Salter, Rabin Vincent, Liu hua,
	Nikolay Borisov, Nicolas Pitre, Tomasz Figa, Doug Anderson,
	Jason Wessel, Will Deacon, Catalin Marinas,
	Russell King - ARM Linux, linux-arm-kernel, linux-doc

This is a series of patches to support CONFIG_RODATA on ARM, so that
the kernel text is RO, and non-text sections default to NX. To support
on-the-fly kernel text patching (via ftrace, kprobes, etc), fixmap
support has been finalized based on several versions of various patches
that are floating around on the mailing list. This series attempts to
include the least intrusive version, so that others can build on it for
future fixmap work.

The series has been heavily tested, and appears to be working correctly:

With CONFIG_ARM_PTDUMP, expected page table permissions are seen in
/sys/kernel/debug/kernel_page_tables.

Using CONFIG_LKDTM, the kernel now correctly detects bad accesses for
for the following lkdtm tests via /sys/kernel/debug/provoke-crash/DIRECT:
        EXEC_DATA
        WRITE_RO
        WRITE_KERN

ftrace works:
        CONFIG_FTRACE_STARTUP_TEST passes
        Enabling tracing works:
                echo function > /sys/kernel/debug/tracing/current_tracer

kprobes works:
        CONFIG_ARM_KPROBES_TEST passes

kexec works:
        kexec will load and start a new kernel

Built with and without CONFIG_HIGHMEM, CONFIG_HIGHMEM_DEBUG, and
CONFIG_NR_CPUS=32.

Thanks to everyone who has been testing this series and working on its
various pieces!

-Kees

v4:
- expanded fixmap to 3MB to support 32 CPUs (robh)
- corrected pmd-finding via vaddr instead of FIXMAP_START (robh)
- switched structure size test to BUILD_BUG_ON (sboyd)
- added locking annotations to keep sparse happy (sboyd)
- adding missing "static" declarations noticed by sparse
- reorganized fixmap portion of patches

v3:
- more cleanups in switch to generic fixmap (lauraa, robh)
- fixed kexec merge hunk glitch (will.deacon)
- added tested-by tags where appropriate from v2 testing

v2:
- fix typo in kexec merge (buildbot)
- flip index order for highmem pte access (lauraa)
- added kgdb updates (dianders)


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 0/8] arm: support CONFIG_RODATA
@ 2014-08-13 17:06 ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

This is a series of patches to support CONFIG_RODATA on ARM, so that
the kernel text is RO, and non-text sections default to NX. To support
on-the-fly kernel text patching (via ftrace, kprobes, etc), fixmap
support has been finalized based on several versions of various patches
that are floating around on the mailing list. This series attempts to
include the least intrusive version, so that others can build on it for
future fixmap work.

The series has been heavily tested, and appears to be working correctly:

With CONFIG_ARM_PTDUMP, expected page table permissions are seen in
/sys/kernel/debug/kernel_page_tables.

Using CONFIG_LKDTM, the kernel now correctly detects bad accesses for
for the following lkdtm tests via /sys/kernel/debug/provoke-crash/DIRECT:
        EXEC_DATA
        WRITE_RO
        WRITE_KERN

ftrace works:
        CONFIG_FTRACE_STARTUP_TEST passes
        Enabling tracing works:
                echo function > /sys/kernel/debug/tracing/current_tracer

kprobes works:
        CONFIG_ARM_KPROBES_TEST passes

kexec works:
        kexec will load and start a new kernel

Built with and without CONFIG_HIGHMEM, CONFIG_HIGHMEM_DEBUG, and
CONFIG_NR_CPUS=32.

Thanks to everyone who has been testing this series and working on its
various pieces!

-Kees

v4:
- expanded fixmap to 3MB to support 32 CPUs (robh)
- corrected pmd-finding via vaddr instead of FIXMAP_START (robh)
- switched structure size test to BUILD_BUG_ON (sboyd)
- added locking annotations to keep sparse happy (sboyd)
- adding missing "static" declarations noticed by sparse
- reorganized fixmap portion of patches

v3:
- more cleanups in switch to generic fixmap (lauraa, robh)
- fixed kexec merge hunk glitch (will.deacon)
- added tested-by tags where appropriate from v2 testing

v2:
- fix typo in kexec merge (buildbot)
- flip index order for highmem pte access (lauraa)
- added kgdb updates (dianders)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 1/8] arm: use generic fixmap.h
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:06   ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Mark Salter, Laura Abbott, Rob Herring, Leif Lindholm,
	Stephen Boyd, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Will Deacon, Catalin Marinas, Russell King - ARM Linux,
	linux-arm-kernel, linux-doc

From: Mark Salter <msalter@redhat.com>

ARM is different from other architectures in that fixmap pages are indexed
with a positive offset from FIXADDR_START.  Other architectures index with
a negative offset from FIXADDR_TOP.  In order to use the generic fixmap.h
definitions, this patch redefines FIXADDR_TOP to be inclusive of the
useable range.  That is, FIXADDR_TOP is the virtual address of the topmost
fixed page.  The newly defined FIXADDR_END is the first virtual address
past the fixed mappings.

Signed-off-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Doug Anderson <dianders@chromium.org>
[kees: update for a05e54c103b0 "ARM: 8031/2: change fixmap ..."]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Rob Herring <robh@kernel.org>
---
 arch/arm/include/asm/fixmap.h | 27 +++++++++------------------
 arch/arm/mm/init.c            |  2 +-
 2 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index 74124b0d0d79..a7add6f9315d 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -2,27 +2,18 @@
 #define _ASM_FIXMAP_H
 
 #define FIXADDR_START		0xffc00000UL
-#define FIXADDR_TOP		0xffe00000UL
-#define FIXADDR_SIZE		(FIXADDR_TOP - FIXADDR_START)
+#define FIXADDR_END		0xffe00000UL
+#define FIXADDR_TOP		(FIXADDR_END - PAGE_SIZE)
 
-#define FIX_KMAP_NR_PTES	(FIXADDR_SIZE >> PAGE_SHIFT)
+#include <asm/kmap_types.h>
 
-#define __fix_to_virt(x)	(FIXADDR_START + ((x) << PAGE_SHIFT))
-#define __virt_to_fix(x)	(((x) - FIXADDR_START) >> PAGE_SHIFT)
+enum fixed_addresses {
+	FIX_KMAP_BEGIN,
+	FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1,
 
-extern void __this_fixmap_does_not_exist(void);
+	__end_of_fixed_addresses
+};
 
-static inline unsigned long fix_to_virt(const unsigned int idx)
-{
-	if (idx >= FIX_KMAP_NR_PTES)
-		__this_fixmap_does_not_exist();
-	return __fix_to_virt(idx);
-}
-
-static inline unsigned int virt_to_fix(const unsigned long vaddr)
-{
-	BUG_ON(vaddr >= FIXADDR_TOP || vaddr < FIXADDR_START);
-	return __virt_to_fix(vaddr);
-}
+#include <asm-generic/fixmap.h>
 
 #endif
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 659c75d808dc..ad82c05bfc3a 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -570,7 +570,7 @@ void __init mem_init(void)
 			MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
 			MLK(ITCM_OFFSET, (unsigned long) itcm_end),
 #endif
-			MLK(FIXADDR_START, FIXADDR_TOP),
+			MLK(FIXADDR_START, FIXADDR_END),
 			MLM(VMALLOC_START, VMALLOC_END),
 			MLM(PAGE_OFFSET, (unsigned long)high_memory),
 #ifdef CONFIG_HIGHMEM
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 1/8] arm: use generic fixmap.h
@ 2014-08-13 17:06   ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

From: Mark Salter <msalter@redhat.com>

ARM is different from other architectures in that fixmap pages are indexed
with a positive offset from FIXADDR_START.  Other architectures index with
a negative offset from FIXADDR_TOP.  In order to use the generic fixmap.h
definitions, this patch redefines FIXADDR_TOP to be inclusive of the
useable range.  That is, FIXADDR_TOP is the virtual address of the topmost
fixed page.  The newly defined FIXADDR_END is the first virtual address
past the fixed mappings.

Signed-off-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Doug Anderson <dianders@chromium.org>
[kees: update for a05e54c103b0 "ARM: 8031/2: change fixmap ..."]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Rob Herring <robh@kernel.org>
---
 arch/arm/include/asm/fixmap.h | 27 +++++++++------------------
 arch/arm/mm/init.c            |  2 +-
 2 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index 74124b0d0d79..a7add6f9315d 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -2,27 +2,18 @@
 #define _ASM_FIXMAP_H
 
 #define FIXADDR_START		0xffc00000UL
-#define FIXADDR_TOP		0xffe00000UL
-#define FIXADDR_SIZE		(FIXADDR_TOP - FIXADDR_START)
+#define FIXADDR_END		0xffe00000UL
+#define FIXADDR_TOP		(FIXADDR_END - PAGE_SIZE)
 
-#define FIX_KMAP_NR_PTES	(FIXADDR_SIZE >> PAGE_SHIFT)
+#include <asm/kmap_types.h>
 
-#define __fix_to_virt(x)	(FIXADDR_START + ((x) << PAGE_SHIFT))
-#define __virt_to_fix(x)	(((x) - FIXADDR_START) >> PAGE_SHIFT)
+enum fixed_addresses {
+	FIX_KMAP_BEGIN,
+	FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1,
 
-extern void __this_fixmap_does_not_exist(void);
+	__end_of_fixed_addresses
+};
 
-static inline unsigned long fix_to_virt(const unsigned int idx)
-{
-	if (idx >= FIX_KMAP_NR_PTES)
-		__this_fixmap_does_not_exist();
-	return __fix_to_virt(idx);
-}
-
-static inline unsigned int virt_to_fix(const unsigned long vaddr)
-{
-	BUG_ON(vaddr >= FIXADDR_TOP || vaddr < FIXADDR_START);
-	return __virt_to_fix(vaddr);
-}
+#include <asm-generic/fixmap.h>
 
 #endif
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 659c75d808dc..ad82c05bfc3a 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -570,7 +570,7 @@ void __init mem_init(void)
 			MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
 			MLK(ITCM_OFFSET, (unsigned long) itcm_end),
 #endif
-			MLK(FIXADDR_START, FIXADDR_TOP),
+			MLK(FIXADDR_START, FIXADDR_END),
 			MLM(VMALLOC_START, VMALLOC_END),
 			MLM(PAGE_OFFSET, (unsigned long)high_memory),
 #ifdef CONFIG_HIGHMEM
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 2/8] ARM: expand fixmap region to 3MB
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:06   ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, Mark Salter, Rabin Vincent, Liu hua,
	Nikolay Borisov, Nicolas Pitre, Tomasz Figa, Doug Anderson,
	Jason Wessel, Will Deacon, Catalin Marinas,
	Russell King - ARM Linux, linux-arm-kernel, linux-doc

From: Rob Herring <robh@kernel.org>

With commit a05e54c103b0 ("ARM: 8031/2: change fixmap mapping region to
support 32 CPUs"), the fixmap region was expanded to 2MB, but it
precluded any other uses of the fixmap region. In order to support other
uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
adjacent 1MB range 0xffe00000-0xfff00000 is availabe.

Remove fixmap_page_table ptr and lookup the page table via the virtual
address so that the fixmap region can span more that one pmd. The 2nd
pmd is already created since it is shared with the vector page.

Signed-off-by: Rob Herring <robh@kernel.org>
[kees: fixed CONFIG_DEBUG_HIGHMEM get_fixmap() calls]
[kees: moved pte allocation outside of CONFIG_HIGHMEM]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 Documentation/arm/memory.txt  |  2 +-
 arch/arm/include/asm/fixmap.h |  2 +-
 arch/arm/mm/highmem.c         | 15 ++++++++-------
 arch/arm/mm/mmu.c             |  6 +++---
 4 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
index 38dc06d0a791..4178ebda6e66 100644
--- a/Documentation/arm/memory.txt
+++ b/Documentation/arm/memory.txt
@@ -41,7 +41,7 @@ fffe8000	fffeffff	DTCM mapping area for platforms with
 fffe0000	fffe7fff	ITCM mapping area for platforms with
 				ITCM mounted inside the CPU.
 
-ffc00000	ffdfffff	Fixmap mapping region.  Addresses provided
+ffc00000	ffefffff	Fixmap mapping region.  Addresses provided
 				by fix_to_virt() will be located here.
 
 fee00000	feffffff	Mapping of PCI I/O space. This is a static
diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index a7add6f9315d..d984ca69ce19 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -2,7 +2,7 @@
 #define _ASM_FIXMAP_H
 
 #define FIXADDR_START		0xffc00000UL
-#define FIXADDR_END		0xffe00000UL
+#define FIXADDR_END		0xfff00000UL
 #define FIXADDR_TOP		(FIXADDR_END - PAGE_SIZE)
 
 #include <asm/kmap_types.h>
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 45aeaaca9052..81061987ac45 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -18,19 +18,20 @@
 #include <asm/tlbflush.h>
 #include "mm.h"
 
-pte_t *fixmap_page_table;
-
 static inline void set_fixmap_pte(int idx, pte_t pte)
 {
 	unsigned long vaddr = __fix_to_virt(idx);
-	set_pte_ext(fixmap_page_table + idx, pte, 0);
+	pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
+
+	set_pte_ext(ptep, pte, 0);
 	local_flush_tlb_kernel_page(vaddr);
 }
 
 static inline pte_t get_fixmap_pte(unsigned long vaddr)
 {
-	unsigned long idx = __virt_to_fix(vaddr);
-	return *(fixmap_page_table + idx);
+	pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
+
+	return *ptep;
 }
 
 void *kmap(struct page *page)
@@ -84,7 +85,7 @@ void *kmap_atomic(struct page *page)
 	 * With debugging enabled, kunmap_atomic forces that entry to 0.
 	 * Make sure it was indeed properly unmapped.
 	 */
-	BUG_ON(!pte_none(*(fixmap_page_table + idx)));
+	BUG_ON(!pte_none(get_fixmap_pte(vaddr)));
 #endif
 	/*
 	 * When debugging is off, kunmap_atomic leaves the previous mapping
@@ -134,7 +135,7 @@ void *kmap_atomic_pfn(unsigned long pfn)
 	idx = type + KM_TYPE_NR * smp_processor_id();
 	vaddr = __fix_to_virt(idx);
 #ifdef CONFIG_DEBUG_HIGHMEM
-	BUG_ON(!pte_none(*(fixmap_page_table + idx)));
+	BUG_ON(!pte_none(get_fixmap_pte(vaddr)));
 #endif
 	set_fixmap_pte(idx, pfn_pte(pfn, kmap_prot));
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 8348ed6b2efe..7fa0966cd15f 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1326,10 +1326,10 @@ static void __init kmap_init(void)
 #ifdef CONFIG_HIGHMEM
 	pkmap_page_table = early_pte_alloc(pmd_off_k(PKMAP_BASE),
 		PKMAP_BASE, _PAGE_KERNEL_TABLE);
-
-	fixmap_page_table = early_pte_alloc(pmd_off_k(FIXADDR_START),
-		FIXADDR_START, _PAGE_KERNEL_TABLE);
 #endif
+
+	early_pte_alloc(pmd_off_k(FIXADDR_START), FIXADDR_START,
+			_PAGE_KERNEL_TABLE);
 }
 
 static void __init map_lowmem(void)
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 2/8] ARM: expand fixmap region to 3MB
@ 2014-08-13 17:06   ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

From: Rob Herring <robh@kernel.org>

With commit a05e54c103b0 ("ARM: 8031/2: change fixmap mapping region to
support 32 CPUs"), the fixmap region was expanded to 2MB, but it
precluded any other uses of the fixmap region. In order to support other
uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
adjacent 1MB range 0xffe00000-0xfff00000 is availabe.

Remove fixmap_page_table ptr and lookup the page table via the virtual
address so that the fixmap region can span more that one pmd. The 2nd
pmd is already created since it is shared with the vector page.

Signed-off-by: Rob Herring <robh@kernel.org>
[kees: fixed CONFIG_DEBUG_HIGHMEM get_fixmap() calls]
[kees: moved pte allocation outside of CONFIG_HIGHMEM]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 Documentation/arm/memory.txt  |  2 +-
 arch/arm/include/asm/fixmap.h |  2 +-
 arch/arm/mm/highmem.c         | 15 ++++++++-------
 arch/arm/mm/mmu.c             |  6 +++---
 4 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
index 38dc06d0a791..4178ebda6e66 100644
--- a/Documentation/arm/memory.txt
+++ b/Documentation/arm/memory.txt
@@ -41,7 +41,7 @@ fffe8000	fffeffff	DTCM mapping area for platforms with
 fffe0000	fffe7fff	ITCM mapping area for platforms with
 				ITCM mounted inside the CPU.
 
-ffc00000	ffdfffff	Fixmap mapping region.  Addresses provided
+ffc00000	ffefffff	Fixmap mapping region.  Addresses provided
 				by fix_to_virt() will be located here.
 
 fee00000	feffffff	Mapping of PCI I/O space. This is a static
diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index a7add6f9315d..d984ca69ce19 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -2,7 +2,7 @@
 #define _ASM_FIXMAP_H
 
 #define FIXADDR_START		0xffc00000UL
-#define FIXADDR_END		0xffe00000UL
+#define FIXADDR_END		0xfff00000UL
 #define FIXADDR_TOP		(FIXADDR_END - PAGE_SIZE)
 
 #include <asm/kmap_types.h>
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 45aeaaca9052..81061987ac45 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -18,19 +18,20 @@
 #include <asm/tlbflush.h>
 #include "mm.h"
 
-pte_t *fixmap_page_table;
-
 static inline void set_fixmap_pte(int idx, pte_t pte)
 {
 	unsigned long vaddr = __fix_to_virt(idx);
-	set_pte_ext(fixmap_page_table + idx, pte, 0);
+	pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
+
+	set_pte_ext(ptep, pte, 0);
 	local_flush_tlb_kernel_page(vaddr);
 }
 
 static inline pte_t get_fixmap_pte(unsigned long vaddr)
 {
-	unsigned long idx = __virt_to_fix(vaddr);
-	return *(fixmap_page_table + idx);
+	pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
+
+	return *ptep;
 }
 
 void *kmap(struct page *page)
@@ -84,7 +85,7 @@ void *kmap_atomic(struct page *page)
 	 * With debugging enabled, kunmap_atomic forces that entry to 0.
 	 * Make sure it was indeed properly unmapped.
 	 */
-	BUG_ON(!pte_none(*(fixmap_page_table + idx)));
+	BUG_ON(!pte_none(get_fixmap_pte(vaddr)));
 #endif
 	/*
 	 * When debugging is off, kunmap_atomic leaves the previous mapping
@@ -134,7 +135,7 @@ void *kmap_atomic_pfn(unsigned long pfn)
 	idx = type + KM_TYPE_NR * smp_processor_id();
 	vaddr = __fix_to_virt(idx);
 #ifdef CONFIG_DEBUG_HIGHMEM
-	BUG_ON(!pte_none(*(fixmap_page_table + idx)));
+	BUG_ON(!pte_none(get_fixmap_pte(vaddr)));
 #endif
 	set_fixmap_pte(idx, pfn_pte(pfn, kmap_prot));
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 8348ed6b2efe..7fa0966cd15f 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1326,10 +1326,10 @@ static void __init kmap_init(void)
 #ifdef CONFIG_HIGHMEM
 	pkmap_page_table = early_pte_alloc(pmd_off_k(PKMAP_BASE),
 		PKMAP_BASE, _PAGE_KERNEL_TABLE);
-
-	fixmap_page_table = early_pte_alloc(pmd_off_k(FIXADDR_START),
-		FIXADDR_START, _PAGE_KERNEL_TABLE);
 #endif
+
+	early_pte_alloc(pmd_off_k(FIXADDR_START), FIXADDR_START,
+			_PAGE_KERNEL_TABLE);
 }
 
 static void __init map_lowmem(void)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 3/8] arm: fixmap: implement __set_fixmap()
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:06   ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Rabin Vincent, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, Mark Salter, Liu hua,
	Nikolay Borisov, Nicolas Pitre, Tomasz Figa, Doug Anderson,
	Jason Wessel, Will Deacon, Catalin Marinas,
	Russell King - ARM Linux, linux-arm-kernel, linux-doc

This is used from set_fixmap() and clear_fixmap() via asm-generic/fixmap.h.
Also makes sure that the fixmap allocation fits into the expected range.

Based on patch by Rabin Vincent.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Rabin Vincent <rabin@rab.in>
---
 arch/arm/include/asm/fixmap.h |  2 ++
 arch/arm/mm/mmu.c             | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index d984ca69ce19..714606f70425 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -14,6 +14,8 @@ enum fixed_addresses {
 	__end_of_fixed_addresses
 };
 
+void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
+
 #include <asm-generic/fixmap.h>
 
 #endif
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 7fa0966cd15f..94eecb913f15 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -22,6 +22,7 @@
 #include <asm/cputype.h>
 #include <asm/sections.h>
 #include <asm/cachetype.h>
+#include <asm/fixmap.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/smp_plat.h>
@@ -392,6 +393,24 @@ SET_MEMORY_FN(rw, pte_set_rw)
 SET_MEMORY_FN(x, pte_set_x)
 SET_MEMORY_FN(nx, pte_set_nx)
 
+void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)
+{
+	unsigned long vaddr = __fix_to_virt(idx);
+	pte_t *pte = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
+
+	/* Make sure fixmap region does not exceed available allocation. */
+	BUILD_BUG_ON(FIXADDR_START + (__end_of_fixed_addresses * PAGE_SIZE) >
+		     FIXADDR_END);
+	BUG_ON(idx >= __end_of_fixed_addresses);
+
+	if (pgprot_val(prot))
+		set_pte_at(NULL, vaddr, pte,
+			pfn_pte(phys >> PAGE_SHIFT, prot));
+	else
+		pte_clear(NULL, vaddr, pte);
+	flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE);
+}
+
 /*
  * Adjust the PMD section entries according to the CPU in use.
  */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 3/8] arm: fixmap: implement __set_fixmap()
@ 2014-08-13 17:06   ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

This is used from set_fixmap() and clear_fixmap() via asm-generic/fixmap.h.
Also makes sure that the fixmap allocation fits into the expected range.

Based on patch by Rabin Vincent.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Rabin Vincent <rabin@rab.in>
---
 arch/arm/include/asm/fixmap.h |  2 ++
 arch/arm/mm/mmu.c             | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index d984ca69ce19..714606f70425 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -14,6 +14,8 @@ enum fixed_addresses {
 	__end_of_fixed_addresses
 };
 
+void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
+
 #include <asm-generic/fixmap.h>
 
 #endif
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 7fa0966cd15f..94eecb913f15 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -22,6 +22,7 @@
 #include <asm/cputype.h>
 #include <asm/sections.h>
 #include <asm/cachetype.h>
+#include <asm/fixmap.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/smp_plat.h>
@@ -392,6 +393,24 @@ SET_MEMORY_FN(rw, pte_set_rw)
 SET_MEMORY_FN(x, pte_set_x)
 SET_MEMORY_FN(nx, pte_set_nx)
 
+void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)
+{
+	unsigned long vaddr = __fix_to_virt(idx);
+	pte_t *pte = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
+
+	/* Make sure fixmap region does not exceed available allocation. */
+	BUILD_BUG_ON(FIXADDR_START + (__end_of_fixed_addresses * PAGE_SIZE) >
+		     FIXADDR_END);
+	BUG_ON(idx >= __end_of_fixed_addresses);
+
+	if (pgprot_val(prot))
+		set_pte_at(NULL, vaddr, pte,
+			pfn_pte(phys >> PAGE_SHIFT, prot));
+	else
+		pte_clear(NULL, vaddr, pte);
+	flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE);
+}
+
 /*
  * Adjust the PMD section entries according to the CPU in use.
  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:06   ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Rabin Vincent, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, Mark Salter, Liu hua,
	Nikolay Borisov, Nicolas Pitre, Tomasz Figa, Doug Anderson,
	Jason Wessel, Will Deacon, Catalin Marinas,
	Russell King - ARM Linux, linux-arm-kernel, linux-doc

From: Rabin Vincent <rabin@rab.in>

Use fixmaps for text patching when the kernel text is read-only,
inspired by x86.  This makes jump labels and kprobes work with the
currently available CONFIG_DEBUG_SET_MODULE_RONX and the upcoming
CONFIG_DEBUG_RODATA options.

Signed-off-by: Rabin Vincent <rabin@rab.in>
[kees: fixed up for merge with "arm: use generic fixmap.h"]
[kees: added parse acquire/release annotations to pass C=1 builds]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/include/asm/fixmap.h |  4 +++
 arch/arm/kernel/jump_label.c  |  2 +-
 arch/arm/kernel/patch.c       | 79 +++++++++++++++++++++++++++++++++++++++----
 arch/arm/kernel/patch.h       | 12 ++++++-
 4 files changed, 88 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index 714606f70425..0415eae1df27 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -11,6 +11,10 @@ enum fixed_addresses {
 	FIX_KMAP_BEGIN,
 	FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1,
 
+	/* Support writing RO kernel text via kprobes, jump labels, etc. */
+	FIX_TEXT_POKE0,
+	FIX_TEXT_POKE1,
+
 	__end_of_fixed_addresses
 };
 
diff --git a/arch/arm/kernel/jump_label.c b/arch/arm/kernel/jump_label.c
index 4ce4f789446d..afeeb9ea6f43 100644
--- a/arch/arm/kernel/jump_label.c
+++ b/arch/arm/kernel/jump_label.c
@@ -19,7 +19,7 @@ static void __arch_jump_label_transform(struct jump_entry *entry,
 		insn = arm_gen_nop();
 
 	if (is_static)
-		__patch_text(addr, insn);
+		__patch_text_early(addr, insn);
 	else
 		patch_text(addr, insn);
 }
diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
index 07314af47733..a1dce690446a 100644
--- a/arch/arm/kernel/patch.c
+++ b/arch/arm/kernel/patch.c
@@ -1,8 +1,11 @@
 #include <linux/kernel.h>
+#include <linux/spinlock.h>
 #include <linux/kprobes.h>
+#include <linux/mm.h>
 #include <linux/stop_machine.h>
 
 #include <asm/cacheflush.h>
+#include <asm/fixmap.h>
 #include <asm/smp_plat.h>
 #include <asm/opcodes.h>
 
@@ -13,21 +16,77 @@ struct patch {
 	unsigned int insn;
 };
 
-void __kprobes __patch_text(void *addr, unsigned int insn)
+static DEFINE_SPINLOCK(patch_lock);
+
+static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
+	__acquires(&patch_lock)
+{
+	unsigned int uintaddr = (uintptr_t) addr;
+	bool module = !core_kernel_text(uintaddr);
+	struct page *page;
+
+	if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
+		page = vmalloc_to_page(addr);
+	else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
+		page = virt_to_page(addr);
+	else
+		return addr;
+
+	if (flags)
+		spin_lock_irqsave(&patch_lock, *flags);
+	else
+		__acquire(&patch_lock);
+
+	set_fixmap(fixmap, page_to_phys(page));
+
+	return (void *) (__fix_to_virt(fixmap) + (uintaddr & ~PAGE_MASK));
+}
+
+static void __kprobes patch_unmap(int fixmap, unsigned long *flags)
+	__releases(&patch_lock)
+{
+	clear_fixmap(fixmap);
+
+	if (flags)
+		spin_unlock_irqrestore(&patch_lock, *flags);
+	else
+		__release(&patch_lock);
+}
+
+void __kprobes __patch_text_real(void *addr, unsigned int insn, bool remap)
 {
 	bool thumb2 = IS_ENABLED(CONFIG_THUMB2_KERNEL);
+	unsigned int uintaddr = (uintptr_t) addr;
+	bool twopage = false;
+	unsigned long flags;
+	void *waddr = addr;
 	int size;
 
+	if (remap)
+		waddr = patch_map(addr, FIX_TEXT_POKE0, &flags);
+	else
+		__acquire(&patch_lock);
+
 	if (thumb2 && __opcode_is_thumb16(insn)) {
-		*(u16 *)addr = __opcode_to_mem_thumb16(insn);
+		*(u16 *)waddr = __opcode_to_mem_thumb16(insn);
 		size = sizeof(u16);
-	} else if (thumb2 && ((uintptr_t)addr & 2)) {
+	} else if (thumb2 && (uintaddr & 2)) {
 		u16 first = __opcode_thumb32_first(insn);
 		u16 second = __opcode_thumb32_second(insn);
-		u16 *addrh = addr;
+		u16 *addrh0 = waddr;
+		u16 *addrh1 = waddr + 2;
+
+		twopage = (uintaddr & ~PAGE_MASK) == PAGE_SIZE - 2;
+		if (twopage && remap)
+			addrh1 = patch_map(addr + 2, FIX_TEXT_POKE1, NULL);
 
-		addrh[0] = __opcode_to_mem_thumb16(first);
-		addrh[1] = __opcode_to_mem_thumb16(second);
+		*addrh0 = __opcode_to_mem_thumb16(first);
+		*addrh1 = __opcode_to_mem_thumb16(second);
+
+		if (twopage && addrh1 != addr + 2) {
+			flush_kernel_vmap_range(addrh1, 2);
+			patch_unmap(FIX_TEXT_POKE1, NULL);
+		}
 
 		size = sizeof(u32);
 	} else {
@@ -36,10 +95,16 @@ void __kprobes __patch_text(void *addr, unsigned int insn)
 		else
 			insn = __opcode_to_mem_arm(insn);
 
-		*(u32 *)addr = insn;
+		*(u32 *)waddr = insn;
 		size = sizeof(u32);
 	}
 
+	if (waddr != addr) {
+		flush_kernel_vmap_range(waddr, twopage ? size / 2 : size);
+		patch_unmap(FIX_TEXT_POKE0, &flags);
+	} else
+		__release(&patch_lock);
+
 	flush_icache_range((uintptr_t)(addr),
 			   (uintptr_t)(addr) + size);
 }
diff --git a/arch/arm/kernel/patch.h b/arch/arm/kernel/patch.h
index b4731f2dac38..77e054c2f6cd 100644
--- a/arch/arm/kernel/patch.h
+++ b/arch/arm/kernel/patch.h
@@ -2,6 +2,16 @@
 #define _ARM_KERNEL_PATCH_H
 
 void patch_text(void *addr, unsigned int insn);
-void __patch_text(void *addr, unsigned int insn);
+void __patch_text_real(void *addr, unsigned int insn, bool remap);
+
+static inline void __patch_text(void *addr, unsigned int insn)
+{
+	__patch_text_real(addr, insn, true);
+}
+
+static inline void __patch_text_early(void *addr, unsigned int insn)
+{
+	__patch_text_real(addr, insn, false);
+}
 
 #endif
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
@ 2014-08-13 17:06   ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

From: Rabin Vincent <rabin@rab.in>

Use fixmaps for text patching when the kernel text is read-only,
inspired by x86.  This makes jump labels and kprobes work with the
currently available CONFIG_DEBUG_SET_MODULE_RONX and the upcoming
CONFIG_DEBUG_RODATA options.

Signed-off-by: Rabin Vincent <rabin@rab.in>
[kees: fixed up for merge with "arm: use generic fixmap.h"]
[kees: added parse acquire/release annotations to pass C=1 builds]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/include/asm/fixmap.h |  4 +++
 arch/arm/kernel/jump_label.c  |  2 +-
 arch/arm/kernel/patch.c       | 79 +++++++++++++++++++++++++++++++++++++++----
 arch/arm/kernel/patch.h       | 12 ++++++-
 4 files changed, 88 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index 714606f70425..0415eae1df27 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -11,6 +11,10 @@ enum fixed_addresses {
 	FIX_KMAP_BEGIN,
 	FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1,
 
+	/* Support writing RO kernel text via kprobes, jump labels, etc. */
+	FIX_TEXT_POKE0,
+	FIX_TEXT_POKE1,
+
 	__end_of_fixed_addresses
 };
 
diff --git a/arch/arm/kernel/jump_label.c b/arch/arm/kernel/jump_label.c
index 4ce4f789446d..afeeb9ea6f43 100644
--- a/arch/arm/kernel/jump_label.c
+++ b/arch/arm/kernel/jump_label.c
@@ -19,7 +19,7 @@ static void __arch_jump_label_transform(struct jump_entry *entry,
 		insn = arm_gen_nop();
 
 	if (is_static)
-		__patch_text(addr, insn);
+		__patch_text_early(addr, insn);
 	else
 		patch_text(addr, insn);
 }
diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
index 07314af47733..a1dce690446a 100644
--- a/arch/arm/kernel/patch.c
+++ b/arch/arm/kernel/patch.c
@@ -1,8 +1,11 @@
 #include <linux/kernel.h>
+#include <linux/spinlock.h>
 #include <linux/kprobes.h>
+#include <linux/mm.h>
 #include <linux/stop_machine.h>
 
 #include <asm/cacheflush.h>
+#include <asm/fixmap.h>
 #include <asm/smp_plat.h>
 #include <asm/opcodes.h>
 
@@ -13,21 +16,77 @@ struct patch {
 	unsigned int insn;
 };
 
-void __kprobes __patch_text(void *addr, unsigned int insn)
+static DEFINE_SPINLOCK(patch_lock);
+
+static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
+	__acquires(&patch_lock)
+{
+	unsigned int uintaddr = (uintptr_t) addr;
+	bool module = !core_kernel_text(uintaddr);
+	struct page *page;
+
+	if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
+		page = vmalloc_to_page(addr);
+	else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
+		page = virt_to_page(addr);
+	else
+		return addr;
+
+	if (flags)
+		spin_lock_irqsave(&patch_lock, *flags);
+	else
+		__acquire(&patch_lock);
+
+	set_fixmap(fixmap, page_to_phys(page));
+
+	return (void *) (__fix_to_virt(fixmap) + (uintaddr & ~PAGE_MASK));
+}
+
+static void __kprobes patch_unmap(int fixmap, unsigned long *flags)
+	__releases(&patch_lock)
+{
+	clear_fixmap(fixmap);
+
+	if (flags)
+		spin_unlock_irqrestore(&patch_lock, *flags);
+	else
+		__release(&patch_lock);
+}
+
+void __kprobes __patch_text_real(void *addr, unsigned int insn, bool remap)
 {
 	bool thumb2 = IS_ENABLED(CONFIG_THUMB2_KERNEL);
+	unsigned int uintaddr = (uintptr_t) addr;
+	bool twopage = false;
+	unsigned long flags;
+	void *waddr = addr;
 	int size;
 
+	if (remap)
+		waddr = patch_map(addr, FIX_TEXT_POKE0, &flags);
+	else
+		__acquire(&patch_lock);
+
 	if (thumb2 && __opcode_is_thumb16(insn)) {
-		*(u16 *)addr = __opcode_to_mem_thumb16(insn);
+		*(u16 *)waddr = __opcode_to_mem_thumb16(insn);
 		size = sizeof(u16);
-	} else if (thumb2 && ((uintptr_t)addr & 2)) {
+	} else if (thumb2 && (uintaddr & 2)) {
 		u16 first = __opcode_thumb32_first(insn);
 		u16 second = __opcode_thumb32_second(insn);
-		u16 *addrh = addr;
+		u16 *addrh0 = waddr;
+		u16 *addrh1 = waddr + 2;
+
+		twopage = (uintaddr & ~PAGE_MASK) == PAGE_SIZE - 2;
+		if (twopage && remap)
+			addrh1 = patch_map(addr + 2, FIX_TEXT_POKE1, NULL);
 
-		addrh[0] = __opcode_to_mem_thumb16(first);
-		addrh[1] = __opcode_to_mem_thumb16(second);
+		*addrh0 = __opcode_to_mem_thumb16(first);
+		*addrh1 = __opcode_to_mem_thumb16(second);
+
+		if (twopage && addrh1 != addr + 2) {
+			flush_kernel_vmap_range(addrh1, 2);
+			patch_unmap(FIX_TEXT_POKE1, NULL);
+		}
 
 		size = sizeof(u32);
 	} else {
@@ -36,10 +95,16 @@ void __kprobes __patch_text(void *addr, unsigned int insn)
 		else
 			insn = __opcode_to_mem_arm(insn);
 
-		*(u32 *)addr = insn;
+		*(u32 *)waddr = insn;
 		size = sizeof(u32);
 	}
 
+	if (waddr != addr) {
+		flush_kernel_vmap_range(waddr, twopage ? size / 2 : size);
+		patch_unmap(FIX_TEXT_POKE0, &flags);
+	} else
+		__release(&patch_lock);
+
 	flush_icache_range((uintptr_t)(addr),
 			   (uintptr_t)(addr) + size);
 }
diff --git a/arch/arm/kernel/patch.h b/arch/arm/kernel/patch.h
index b4731f2dac38..77e054c2f6cd 100644
--- a/arch/arm/kernel/patch.h
+++ b/arch/arm/kernel/patch.h
@@ -2,6 +2,16 @@
 #define _ARM_KERNEL_PATCH_H
 
 void patch_text(void *addr, unsigned int insn);
-void __patch_text(void *addr, unsigned int insn);
+void __patch_text_real(void *addr, unsigned int insn, bool remap);
+
+static inline void __patch_text(void *addr, unsigned int insn)
+{
+	__patch_text_real(addr, insn, true);
+}
+
+static inline void __patch_text_early(void *addr, unsigned int insn)
+{
+	__patch_text_real(addr, insn, false);
+}
 
 #endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 5/8] ARM: kexec: Make .text R/W in machine_kexec
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:06   ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Nikolay Borisov, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, Mark Salter, Rabin Vincent, Liu hua,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Will Deacon, Catalin Marinas, Russell King - ARM Linux,
	linux-arm-kernel, linux-doc

From: Nikolay Borisov <Nikolay.Borisov@arm.com>

With the introduction of Kees Cook's patch to make the kernel .text
read-only the existing method by which kexec works got broken since it
directly pokes some values in the template code, which resides in the
.text section.

The current patch changes the way those values are inserted so that poking
.text section occurs only in machine_kexec (e.g when we are about to nuke
the old kernel and are beyond the point of return). This allows to use
set_kernel_text_rw() to directly patch the values in the .text section.

I had already sent a patch which achieved this but it was significantly
more complicated, so this is a cleaner/straight-forward approach.

Signed-off-by: Nikolay Borisov <Nikolay.Borisov@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
[kees: collapsed kexec_boot_atags (will.daecon)]
[kees: for bisectability, moved set_kernel_text_rw() to RODATA patch]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/machine_kexec.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 8cf0996aa1a8..8f75250cbe30 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -29,6 +29,7 @@ extern unsigned long kexec_boot_atags;
 
 static atomic_t waiting_for_crash_ipi;
 
+static unsigned long dt_mem;
 /*
  * Provide a dummy crash_notes definition while crash dump arrives to arm.
  * This prevents breakage of crash_notes attribute in kernel/ksysfs.c.
@@ -64,7 +65,7 @@ int machine_kexec_prepare(struct kimage *image)
 			return err;
 
 		if (be32_to_cpu(header) == OF_DT_HEADER)
-			kexec_boot_atags = current_segment->mem;
+			dt_mem = current_segment->mem;
 	}
 	return 0;
 }
@@ -166,9 +167,8 @@ void machine_kexec(struct kimage *image)
 	kexec_start_address = image->start;
 	kexec_indirection_page = page_list;
 	kexec_mach_type = machine_arch_type;
-	if (!kexec_boot_atags)
-		kexec_boot_atags = image->start - KEXEC_ARM_ZIMAGE_OFFSET + KEXEC_ARM_ATAGS_OFFSET;
-
+	kexec_boot_atags = dt_mem ?: image->start - KEXEC_ARM_ZIMAGE_OFFSET
+				     + KEXEC_ARM_ATAGS_OFFSET;
 
 	/* copy our kernel relocation code to the control code page */
 	reboot_entry = fncpy(reboot_code_buffer,
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 5/8] ARM: kexec: Make .text R/W in machine_kexec
@ 2014-08-13 17:06   ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

From: Nikolay Borisov <Nikolay.Borisov@arm.com>

With the introduction of Kees Cook's patch to make the kernel .text
read-only the existing method by which kexec works got broken since it
directly pokes some values in the template code, which resides in the
.text section.

The current patch changes the way those values are inserted so that poking
.text section occurs only in machine_kexec (e.g when we are about to nuke
the old kernel and are beyond the point of return). This allows to use
set_kernel_text_rw() to directly patch the values in the .text section.

I had already sent a patch which achieved this but it was significantly
more complicated, so this is a cleaner/straight-forward approach.

Signed-off-by: Nikolay Borisov <Nikolay.Borisov@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
[kees: collapsed kexec_boot_atags (will.daecon)]
[kees: for bisectability, moved set_kernel_text_rw() to RODATA patch]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/machine_kexec.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 8cf0996aa1a8..8f75250cbe30 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -29,6 +29,7 @@ extern unsigned long kexec_boot_atags;
 
 static atomic_t waiting_for_crash_ipi;
 
+static unsigned long dt_mem;
 /*
  * Provide a dummy crash_notes definition while crash dump arrives to arm.
  * This prevents breakage of crash_notes attribute in kernel/ksysfs.c.
@@ -64,7 +65,7 @@ int machine_kexec_prepare(struct kimage *image)
 			return err;
 
 		if (be32_to_cpu(header) == OF_DT_HEADER)
-			kexec_boot_atags = current_segment->mem;
+			dt_mem = current_segment->mem;
 	}
 	return 0;
 }
@@ -166,9 +167,8 @@ void machine_kexec(struct kimage *image)
 	kexec_start_address = image->start;
 	kexec_indirection_page = page_list;
 	kexec_mach_type = machine_arch_type;
-	if (!kexec_boot_atags)
-		kexec_boot_atags = image->start - KEXEC_ARM_ZIMAGE_OFFSET + KEXEC_ARM_ATAGS_OFFSET;
-
+	kexec_boot_atags = dt_mem ?: image->start - KEXEC_ARM_ZIMAGE_OFFSET
+				     + KEXEC_ARM_ATAGS_OFFSET;
 
 	/* copy our kernel relocation code to the control code page */
 	reboot_entry = fncpy(reboot_code_buffer,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 6/8] arm: kgdb: Handle read-only text / modules
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:06   ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Doug Anderson, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, Mark Salter, Rabin Vincent, Liu hua,
	Nikolay Borisov, Nicolas Pitre, Tomasz Figa, Doug Anderson,
	Jason Wessel, Will Deacon, Catalin Marinas,
	Russell King - ARM Linux, linux-arm-kernel, linux-doc

From: Doug Anderson <dianders@chromium.org>

Handle the case where someone has set the text segment of the kernel
as read-only by using the newly introduced "patch" mechanism.

Signed-off-by: Doug Anderson <dianders@chromium.org>
[kees: switched structure size check to BUILD_BUG_ON (sboyd)]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/Makefile |  2 +-
 arch/arm/kernel/kgdb.c   | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 38ddd9f83d0e..70b730766330 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -67,7 +67,7 @@ test-kprobes-objs		+= kprobes-test-arm.o
 endif
 obj-$(CONFIG_OABI_COMPAT)	+= sys_oabi-compat.o
 obj-$(CONFIG_ARM_THUMBEE)	+= thumbee.o
-obj-$(CONFIG_KGDB)		+= kgdb.o
+obj-$(CONFIG_KGDB)		+= kgdb.o patch.o
 obj-$(CONFIG_ARM_UNWIND)	+= unwind.o
 obj-$(CONFIG_HAVE_TCM)		+= tcm.o
 obj-$(CONFIG_OF)		+= devtree.o
diff --git a/arch/arm/kernel/kgdb.c b/arch/arm/kernel/kgdb.c
index a74b53c1b7df..07db2f8a1b45 100644
--- a/arch/arm/kernel/kgdb.c
+++ b/arch/arm/kernel/kgdb.c
@@ -12,8 +12,12 @@
 #include <linux/irq.h>
 #include <linux/kdebug.h>
 #include <linux/kgdb.h>
+#include <linux/uaccess.h>
+
 #include <asm/traps.h>
 
+#include "patch.h"
+
 struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] =
 {
 	{ "r0", 4, offsetof(struct pt_regs, ARM_r0)},
@@ -244,6 +248,31 @@ void kgdb_arch_exit(void)
 	unregister_die_notifier(&kgdb_notifier);
 }
 
+int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt)
+{
+	int err;
+
+	/* patch_text() only supports int-sized breakpoints */
+	BUILD_BUG_ON(sizeof(int) != BREAK_INSTR_SIZE);
+
+	err = probe_kernel_read(bpt->saved_instr, (char *)bpt->bpt_addr,
+				BREAK_INSTR_SIZE);
+	if (err)
+		return err;
+
+	patch_text((void *)bpt->bpt_addr,
+		   *(unsigned int *)arch_kgdb_ops.gdb_bpt_instr);
+
+	return err;
+}
+
+int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
+{
+	patch_text((void *)bpt->bpt_addr, *(unsigned int *)bpt->saved_instr);
+
+	return 0;
+}
+
 /*
  * Register our undef instruction hooks with ARM undef core.
  * We regsiter a hook specifically looking for the KGB break inst
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 6/8] arm: kgdb: Handle read-only text / modules
@ 2014-08-13 17:06   ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

From: Doug Anderson <dianders@chromium.org>

Handle the case where someone has set the text segment of the kernel
as read-only by using the newly introduced "patch" mechanism.

Signed-off-by: Doug Anderson <dianders@chromium.org>
[kees: switched structure size check to BUILD_BUG_ON (sboyd)]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/Makefile |  2 +-
 arch/arm/kernel/kgdb.c   | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 38ddd9f83d0e..70b730766330 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -67,7 +67,7 @@ test-kprobes-objs		+= kprobes-test-arm.o
 endif
 obj-$(CONFIG_OABI_COMPAT)	+= sys_oabi-compat.o
 obj-$(CONFIG_ARM_THUMBEE)	+= thumbee.o
-obj-$(CONFIG_KGDB)		+= kgdb.o
+obj-$(CONFIG_KGDB)		+= kgdb.o patch.o
 obj-$(CONFIG_ARM_UNWIND)	+= unwind.o
 obj-$(CONFIG_HAVE_TCM)		+= tcm.o
 obj-$(CONFIG_OF)		+= devtree.o
diff --git a/arch/arm/kernel/kgdb.c b/arch/arm/kernel/kgdb.c
index a74b53c1b7df..07db2f8a1b45 100644
--- a/arch/arm/kernel/kgdb.c
+++ b/arch/arm/kernel/kgdb.c
@@ -12,8 +12,12 @@
 #include <linux/irq.h>
 #include <linux/kdebug.h>
 #include <linux/kgdb.h>
+#include <linux/uaccess.h>
+
 #include <asm/traps.h>
 
+#include "patch.h"
+
 struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] =
 {
 	{ "r0", 4, offsetof(struct pt_regs, ARM_r0)},
@@ -244,6 +248,31 @@ void kgdb_arch_exit(void)
 	unregister_die_notifier(&kgdb_notifier);
 }
 
+int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt)
+{
+	int err;
+
+	/* patch_text() only supports int-sized breakpoints */
+	BUILD_BUG_ON(sizeof(int) != BREAK_INSTR_SIZE);
+
+	err = probe_kernel_read(bpt->saved_instr, (char *)bpt->bpt_addr,
+				BREAK_INSTR_SIZE);
+	if (err)
+		return err;
+
+	patch_text((void *)bpt->bpt_addr,
+		   *(unsigned int *)arch_kgdb_ops.gdb_bpt_instr);
+
+	return err;
+}
+
+int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
+{
+	patch_text((void *)bpt->bpt_addr, *(unsigned int *)bpt->saved_instr);
+
+	return 0;
+}
+
 /*
  * Register our undef instruction hooks with ARM undef core.
  * We regsiter a hook specifically looking for the KGB break inst
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:06   ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, Mark Salter, Rabin Vincent, Liu hua,
	Nikolay Borisov, Nicolas Pitre, Tomasz Figa, Doug Anderson,
	Jason Wessel, Will Deacon, Catalin Marinas,
	Russell King - ARM Linux, linux-arm-kernel, linux-doc

Adds CONFIG_ARM_KERNMEM_PERMS to separate the kernel memory regions
into section-sized areas that can have different permisions. Performs
the NX permission changes during free_initmem, so that init memory can be
reclaimed.

This uses section size instead of PMD size to reduce memory lost to
padding on non-LPAE systems.

Based on work by Brad Spengler, Larry Bassel, and Laura Abbott.

Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm/kernel/vmlinux.lds.S |  17 +++++++
 arch/arm/mm/Kconfig           |   9 ++++
 arch/arm/mm/init.c            | 106 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/mm/mmu.c             |  13 +++++-
 4 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 6f57cb94367f..a3d07ca2bbb4 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -8,6 +8,9 @@
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/page.h>
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+#include <asm/pgtable.h>
+#endif
 	
 #define PROC_INFO							\
 	. = ALIGN(4);							\
@@ -90,6 +93,11 @@ SECTIONS
 		_text = .;
 		HEAD_TEXT
 	}
+
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
+
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
 			__exception_text_start = .;
@@ -145,7 +153,11 @@ SECTIONS
 	_etext = .;			/* End of text and rodata section */
 
 #ifndef CONFIG_XIP_KERNEL
+# ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+# else
 	. = ALIGN(PAGE_SIZE);
+# endif
 	__init_begin = .;
 #endif
 	/*
@@ -220,7 +232,12 @@ SECTIONS
 	. = PAGE_OFFSET + TEXT_OFFSET;
 #else
 	__init_end = .;
+
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+#else
 	. = ALIGN(THREAD_SIZE);
+#endif
 	__data_loc = .;
 #endif
 
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index ae69809a9e47..7a0756df91a2 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -1008,3 +1008,12 @@ config ARCH_SUPPORTS_BIG_ENDIAN
 	help
 	  This option specifies the architecture can support big endian
 	  operation.
+
+config ARM_KERNMEM_PERMS
+	bool "Restrict kernel memory permissions"
+	help
+	  If this is set, kernel memory other than kernel text (and rodata)
+	  will be made non-executable. The tradeoff is that each region is
+	  padded to section-size (1MiB) boundaries (because their permissions
+	  are different and splitting the 1M pages into 4K ones causes TLB
+	  performance problems), wasting memory.
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index ad82c05bfc3a..ccf392ef40d4 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -32,6 +32,11 @@
 #include <asm/tlb.h>
 #include <asm/fixmap.h>
 
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+#include <asm/system_info.h>
+#include <asm/cp15.h>
+#endif
+
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
 
@@ -615,11 +620,112 @@ void __init mem_init(void)
 	}
 }
 
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+struct section_perm {
+	unsigned long start;
+	unsigned long end;
+	pmdval_t mask;
+	pmdval_t prot;
+};
+
+struct section_perm nx_perms[] = {
+	/* Make pages tables, etc before _stext RW (set NX). */
+	{
+		.start	= PAGE_OFFSET,
+		.end	= (unsigned long)_stext,
+		.mask	= ~PMD_SECT_XN,
+		.prot	= PMD_SECT_XN,
+	},
+	/* Make init RW (set NX). */
+	{
+		.start	= (unsigned long)__init_begin,
+		.end	= (unsigned long)_sdata,
+		.mask	= ~PMD_SECT_XN,
+		.prot	= PMD_SECT_XN,
+	},
+};
+
+/*
+ * Updates section permissions only for the current mm (sections are
+ * copied into each mm). During startup, this is the init_mm.
+ */
+static inline void section_update(unsigned long addr, pmdval_t mask,
+				  pmdval_t prot)
+{
+	struct mm_struct *mm;
+	pmd_t *pmd;
+
+	mm = current->active_mm;
+	pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
+
+#ifdef CONFIG_ARM_LPAE
+	pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
+#else
+	if (addr & SECTION_SIZE)
+		pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
+	else
+		pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
+#endif
+	flush_pmd_entry(pmd);
+	local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);
+}
+
+/* Make sure extended page tables are in use. */
+static inline bool arch_has_strict_perms(void)
+{
+	unsigned int cr;
+
+	if (cpu_architecture() < CPU_ARCH_ARMv6)
+		return false;
+
+	cr = get_cr();
+	if (!(cr & CR_XP))
+		return false;
+
+	return true;
+}
+
+#define set_section_perms(perms, field)	{				\
+	size_t i;							\
+	unsigned long addr;						\
+									\
+	if (!arch_has_strict_perms())					\
+		return;							\
+									\
+	for (i = 0; i < ARRAY_SIZE(perms); i++) {			\
+		if (!IS_ALIGNED(perms[i].start, SECTION_SIZE) ||	\
+		    !IS_ALIGNED(perms[i].end, SECTION_SIZE)) {		\
+			pr_err("BUG: section %lx-%lx not aligned to %lx\n", \
+				perms[i].start, perms[i].end,		\
+				SECTION_SIZE);				\
+			continue;					\
+		}							\
+									\
+		for (addr = perms[i].start;				\
+		     addr < perms[i].end;				\
+		     addr += SECTION_SIZE)				\
+			section_update(addr, perms[i].mask,		\
+				       perms[i].field);			\
+	}								\
+}
+
+static inline void fix_kernmem_perms(void)
+{
+	set_section_perms(nx_perms, prot);
+}
+#else
+static inline void fix_kernmem_perms(void) { }
+#endif /* CONFIG_ARM_KERNMEM_PERMS */
+
 void free_initmem(void)
 {
 #ifdef CONFIG_HAVE_TCM
 	extern char __tcm_start, __tcm_end;
+#endif
+
+	fix_kernmem_perms();
 
+#ifdef CONFIG_HAVE_TCM
 	poison_init_mem(&__tcm_start, &__tcm_end - &__tcm_start);
 	free_reserved_area(&__tcm_start, &__tcm_end, -1, "TCM link");
 #endif
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 94eecb913f15..c9227bf59b45 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1368,13 +1368,24 @@ static void __init map_lowmem(void)
 		if (start >= end)
 			break;
 
-		if (end < kernel_x_start || start >= kernel_x_end) {
+		if (end < kernel_x_start) {
 			map.pfn = __phys_to_pfn(start);
 			map.virtual = __phys_to_virt(start);
 			map.length = end - start;
 			map.type = MT_MEMORY_RWX;
 
 			create_mapping(&map);
+		} else if (start >= kernel_x_end) {
+			map.pfn = __phys_to_pfn(start);
+			map.virtual = __phys_to_virt(start);
+			map.length = end - start;
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+			map.type = MT_MEMORY_RW;
+#else
+			map.type = MT_MEMORY_RWX;
+#endif
+
+			create_mapping(&map);
 		} else {
 			/* This better cover the entire kernel */
 			if (start < kernel_x_start) {
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
@ 2014-08-13 17:06   ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

Adds CONFIG_ARM_KERNMEM_PERMS to separate the kernel memory regions
into section-sized areas that can have different permisions. Performs
the NX permission changes during free_initmem, so that init memory can be
reclaimed.

This uses section size instead of PMD size to reduce memory lost to
padding on non-LPAE systems.

Based on work by Brad Spengler, Larry Bassel, and Laura Abbott.

Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm/kernel/vmlinux.lds.S |  17 +++++++
 arch/arm/mm/Kconfig           |   9 ++++
 arch/arm/mm/init.c            | 106 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/mm/mmu.c             |  13 +++++-
 4 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 6f57cb94367f..a3d07ca2bbb4 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -8,6 +8,9 @@
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/page.h>
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+#include <asm/pgtable.h>
+#endif
 	
 #define PROC_INFO							\
 	. = ALIGN(4);							\
@@ -90,6 +93,11 @@ SECTIONS
 		_text = .;
 		HEAD_TEXT
 	}
+
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
+
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
 			__exception_text_start = .;
@@ -145,7 +153,11 @@ SECTIONS
 	_etext = .;			/* End of text and rodata section */
 
 #ifndef CONFIG_XIP_KERNEL
+# ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+# else
 	. = ALIGN(PAGE_SIZE);
+# endif
 	__init_begin = .;
 #endif
 	/*
@@ -220,7 +232,12 @@ SECTIONS
 	. = PAGE_OFFSET + TEXT_OFFSET;
 #else
 	__init_end = .;
+
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+#else
 	. = ALIGN(THREAD_SIZE);
+#endif
 	__data_loc = .;
 #endif
 
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index ae69809a9e47..7a0756df91a2 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -1008,3 +1008,12 @@ config ARCH_SUPPORTS_BIG_ENDIAN
 	help
 	  This option specifies the architecture can support big endian
 	  operation.
+
+config ARM_KERNMEM_PERMS
+	bool "Restrict kernel memory permissions"
+	help
+	  If this is set, kernel memory other than kernel text (and rodata)
+	  will be made non-executable. The tradeoff is that each region is
+	  padded to section-size (1MiB) boundaries (because their permissions
+	  are different and splitting the 1M pages into 4K ones causes TLB
+	  performance problems), wasting memory.
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index ad82c05bfc3a..ccf392ef40d4 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -32,6 +32,11 @@
 #include <asm/tlb.h>
 #include <asm/fixmap.h>
 
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+#include <asm/system_info.h>
+#include <asm/cp15.h>
+#endif
+
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
 
@@ -615,11 +620,112 @@ void __init mem_init(void)
 	}
 }
 
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+struct section_perm {
+	unsigned long start;
+	unsigned long end;
+	pmdval_t mask;
+	pmdval_t prot;
+};
+
+struct section_perm nx_perms[] = {
+	/* Make pages tables, etc before _stext RW (set NX). */
+	{
+		.start	= PAGE_OFFSET,
+		.end	= (unsigned long)_stext,
+		.mask	= ~PMD_SECT_XN,
+		.prot	= PMD_SECT_XN,
+	},
+	/* Make init RW (set NX). */
+	{
+		.start	= (unsigned long)__init_begin,
+		.end	= (unsigned long)_sdata,
+		.mask	= ~PMD_SECT_XN,
+		.prot	= PMD_SECT_XN,
+	},
+};
+
+/*
+ * Updates section permissions only for the current mm (sections are
+ * copied into each mm). During startup, this is the init_mm.
+ */
+static inline void section_update(unsigned long addr, pmdval_t mask,
+				  pmdval_t prot)
+{
+	struct mm_struct *mm;
+	pmd_t *pmd;
+
+	mm = current->active_mm;
+	pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
+
+#ifdef CONFIG_ARM_LPAE
+	pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
+#else
+	if (addr & SECTION_SIZE)
+		pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
+	else
+		pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
+#endif
+	flush_pmd_entry(pmd);
+	local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);
+}
+
+/* Make sure extended page tables are in use. */
+static inline bool arch_has_strict_perms(void)
+{
+	unsigned int cr;
+
+	if (cpu_architecture() < CPU_ARCH_ARMv6)
+		return false;
+
+	cr = get_cr();
+	if (!(cr & CR_XP))
+		return false;
+
+	return true;
+}
+
+#define set_section_perms(perms, field)	{				\
+	size_t i;							\
+	unsigned long addr;						\
+									\
+	if (!arch_has_strict_perms())					\
+		return;							\
+									\
+	for (i = 0; i < ARRAY_SIZE(perms); i++) {			\
+		if (!IS_ALIGNED(perms[i].start, SECTION_SIZE) ||	\
+		    !IS_ALIGNED(perms[i].end, SECTION_SIZE)) {		\
+			pr_err("BUG: section %lx-%lx not aligned to %lx\n", \
+				perms[i].start, perms[i].end,		\
+				SECTION_SIZE);				\
+			continue;					\
+		}							\
+									\
+		for (addr = perms[i].start;				\
+		     addr < perms[i].end;				\
+		     addr += SECTION_SIZE)				\
+			section_update(addr, perms[i].mask,		\
+				       perms[i].field);			\
+	}								\
+}
+
+static inline void fix_kernmem_perms(void)
+{
+	set_section_perms(nx_perms, prot);
+}
+#else
+static inline void fix_kernmem_perms(void) { }
+#endif /* CONFIG_ARM_KERNMEM_PERMS */
+
 void free_initmem(void)
 {
 #ifdef CONFIG_HAVE_TCM
 	extern char __tcm_start, __tcm_end;
+#endif
+
+	fix_kernmem_perms();
 
+#ifdef CONFIG_HAVE_TCM
 	poison_init_mem(&__tcm_start, &__tcm_end - &__tcm_start);
 	free_reserved_area(&__tcm_start, &__tcm_end, -1, "TCM link");
 #endif
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 94eecb913f15..c9227bf59b45 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1368,13 +1368,24 @@ static void __init map_lowmem(void)
 		if (start >= end)
 			break;
 
-		if (end < kernel_x_start || start >= kernel_x_end) {
+		if (end < kernel_x_start) {
 			map.pfn = __phys_to_pfn(start);
 			map.virtual = __phys_to_virt(start);
 			map.length = end - start;
 			map.type = MT_MEMORY_RWX;
 
 			create_mapping(&map);
+		} else if (start >= kernel_x_end) {
+			map.pfn = __phys_to_pfn(start);
+			map.virtual = __phys_to_virt(start);
+			map.length = end - start;
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+			map.type = MT_MEMORY_RW;
+#else
+			map.type = MT_MEMORY_RWX;
+#endif
+
+			create_mapping(&map);
 		} else {
 			/* This better cover the entire kernel */
 			if (start < kernel_x_start) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 8/8] ARM: mm: allow text and rodata sections to be read-only
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:06   ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, Mark Salter, Rabin Vincent, Liu hua,
	Nikolay Borisov, Nicolas Pitre, Tomasz Figa, Doug Anderson,
	Jason Wessel, Will Deacon, Catalin Marinas,
	Russell King - ARM Linux, linux-arm-kernel, linux-doc

This introduces CONFIG_DEBUG_RODATA, making kernel text and rodata
read-only. Additionally, this splits rodata from text so that rodata can
also be NX, which may lead to wasted memory when aligning to SECTION_SIZE.
The read-only areas are made writable during ftrace updates and kexec.

Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm/include/asm/cacheflush.h | 10 ++++++++
 arch/arm/kernel/ftrace.c          | 19 ++++++++++++++++
 arch/arm/kernel/machine_kexec.c   |  1 +
 arch/arm/kernel/vmlinux.lds.S     |  3 +++
 arch/arm/mm/Kconfig               | 12 ++++++++++
 arch/arm/mm/init.c                | 48 ++++++++++++++++++++++++++++++++++++++-
 6 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index fd43f7f55b70..0cdf1e31df86 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -487,6 +487,16 @@ int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_x(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
 
+#ifdef CONFIG_DEBUG_RODATA
+void mark_rodata_ro(void);
+void set_kernel_text_rw(void);
+void set_kernel_text_ro(void);
+#else
+static inline void set_kernel_text_rw(void) { }
+static inline void set_kernel_text_ro(void) { }
+#endif
+
 void flush_uprobe_xol_access(struct page *page, unsigned long uaddr,
 			     void *kaddr, unsigned long len);
+
 #endif
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index af9a8a927a4e..b8c75e45a950 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -15,6 +15,7 @@
 #include <linux/ftrace.h>
 #include <linux/uaccess.h>
 #include <linux/module.h>
+#include <linux/stop_machine.h>
 
 #include <asm/cacheflush.h>
 #include <asm/opcodes.h>
@@ -35,6 +36,22 @@
 
 #define	OLD_NOP		0xe1a00000	/* mov r0, r0 */
 
+static int __ftrace_modify_code(void *data)
+{
+	int *command = data;
+
+	set_kernel_text_rw();
+	ftrace_modify_all_code(*command);
+	set_kernel_text_ro();
+
+	return 0;
+}
+
+void arch_ftrace_update_code(int command)
+{
+	stop_machine(__ftrace_modify_code, &command, NULL);
+}
+
 static unsigned long ftrace_nop_replace(struct dyn_ftrace *rec)
 {
 	return rec->arch.old_mcount ? OLD_NOP : NOP;
@@ -73,6 +90,8 @@ int ftrace_arch_code_modify_prepare(void)
 int ftrace_arch_code_modify_post_process(void)
 {
 	set_all_modules_text_ro();
+	/* Make sure any TLB misses during machine stop are cleared. */
+	flush_tlb_all();
 	return 0;
 }
 
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 8f75250cbe30..4423a565ef6f 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -164,6 +164,7 @@ void machine_kexec(struct kimage *image)
 	reboot_code_buffer = page_address(image->control_code_page);
 
 	/* Prepare parameters for reboot_code_buffer*/
+	set_kernel_text_rw();
 	kexec_start_address = image->start;
 	kexec_indirection_page = page_list;
 	kexec_mach_type = machine_arch_type;
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index a3d07ca2bbb4..542e58919bd9 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -120,6 +120,9 @@ SECTIONS
 			ARM_CPU_KEEP(PROC_INFO)
 	}
 
+#ifdef CONFIG_DEBUG_RODATA
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
 	RO_DATA(PAGE_SIZE)
 
 	. = ALIGN(4);
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 7a0756df91a2..c9cd9c5bf1e1 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -1017,3 +1017,15 @@ config ARM_KERNMEM_PERMS
 	  padded to section-size (1MiB) boundaries (because their permissions
 	  are different and splitting the 1M pages into 4K ones causes TLB
 	  performance problems), wasting memory.
+
+config DEBUG_RODATA
+	bool "Make kernel text and rodata read-only"
+	depends on ARM_KERNMEM_PERMS
+	default y
+	help
+	  If this is set, kernel text and rodata will be made read-only. This
+	  is to help catch accidental or malicious attempts to change the
+	  kernel's executable code. Additionally splits rodata from kernel
+	  text so it can be made explicitly non-executable. This creates
+	  another section-size padded region, so it can waste more memory
+	  space while gaining the read-only protections.
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index ccf392ef40d4..35c838da90d5 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -626,9 +626,10 @@ struct section_perm {
 	unsigned long end;
 	pmdval_t mask;
 	pmdval_t prot;
+	pmdval_t clear;
 };
 
-struct section_perm nx_perms[] = {
+static struct section_perm nx_perms[] = {
 	/* Make pages tables, etc before _stext RW (set NX). */
 	{
 		.start	= PAGE_OFFSET,
@@ -643,8 +644,35 @@ struct section_perm nx_perms[] = {
 		.mask	= ~PMD_SECT_XN,
 		.prot	= PMD_SECT_XN,
 	},
+#ifdef CONFIG_DEBUG_RODATA
+	/* Make rodata NX (set RO in ro_perms below). */
+	{
+		.start  = (unsigned long)__start_rodata,
+		.end    = (unsigned long)__init_begin,
+		.mask   = ~PMD_SECT_XN,
+		.prot   = PMD_SECT_XN,
+	},
+#endif
 };
 
+#ifdef CONFIG_DEBUG_RODATA
+static struct section_perm ro_perms[] = {
+	/* Make kernel code and rodata RX (set RO). */
+	{
+		.start  = (unsigned long)_stext,
+		.end    = (unsigned long)__init_begin,
+#ifdef CONFIG_ARM_LPAE
+		.mask   = ~PMD_SECT_RDONLY,
+		.prot   = PMD_SECT_RDONLY,
+#else
+		.mask   = ~(PMD_SECT_APX | PMD_SECT_AP_WRITE),
+		.prot   = PMD_SECT_APX | PMD_SECT_AP_WRITE,
+		.clear  = PMD_SECT_AP_WRITE,
+#endif
+	},
+};
+#endif
+
 /*
  * Updates section permissions only for the current mm (sections are
  * copied into each mm). During startup, this is the init_mm.
@@ -713,6 +741,24 @@ static inline void fix_kernmem_perms(void)
 {
 	set_section_perms(nx_perms, prot);
 }
+
+#ifdef CONFIG_DEBUG_RODATA
+void mark_rodata_ro(void)
+{
+	set_section_perms(ro_perms, prot);
+}
+
+void set_kernel_text_rw(void)
+{
+	set_section_perms(ro_perms, clear);
+}
+
+void set_kernel_text_ro(void)
+{
+	set_section_perms(ro_perms, prot);
+}
+#endif /* CONFIG_DEBUG_RODATA */
+
 #else
 static inline void fix_kernmem_perms(void) { }
 #endif /* CONFIG_ARM_KERNMEM_PERMS */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v4 8/8] ARM: mm: allow text and rodata sections to be read-only
@ 2014-08-13 17:06   ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-13 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

This introduces CONFIG_DEBUG_RODATA, making kernel text and rodata
read-only. Additionally, this splits rodata from text so that rodata can
also be NX, which may lead to wasted memory when aligning to SECTION_SIZE.
The read-only areas are made writable during ftrace updates and kexec.

Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm/include/asm/cacheflush.h | 10 ++++++++
 arch/arm/kernel/ftrace.c          | 19 ++++++++++++++++
 arch/arm/kernel/machine_kexec.c   |  1 +
 arch/arm/kernel/vmlinux.lds.S     |  3 +++
 arch/arm/mm/Kconfig               | 12 ++++++++++
 arch/arm/mm/init.c                | 48 ++++++++++++++++++++++++++++++++++++++-
 6 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index fd43f7f55b70..0cdf1e31df86 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -487,6 +487,16 @@ int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_x(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
 
+#ifdef CONFIG_DEBUG_RODATA
+void mark_rodata_ro(void);
+void set_kernel_text_rw(void);
+void set_kernel_text_ro(void);
+#else
+static inline void set_kernel_text_rw(void) { }
+static inline void set_kernel_text_ro(void) { }
+#endif
+
 void flush_uprobe_xol_access(struct page *page, unsigned long uaddr,
 			     void *kaddr, unsigned long len);
+
 #endif
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index af9a8a927a4e..b8c75e45a950 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -15,6 +15,7 @@
 #include <linux/ftrace.h>
 #include <linux/uaccess.h>
 #include <linux/module.h>
+#include <linux/stop_machine.h>
 
 #include <asm/cacheflush.h>
 #include <asm/opcodes.h>
@@ -35,6 +36,22 @@
 
 #define	OLD_NOP		0xe1a00000	/* mov r0, r0 */
 
+static int __ftrace_modify_code(void *data)
+{
+	int *command = data;
+
+	set_kernel_text_rw();
+	ftrace_modify_all_code(*command);
+	set_kernel_text_ro();
+
+	return 0;
+}
+
+void arch_ftrace_update_code(int command)
+{
+	stop_machine(__ftrace_modify_code, &command, NULL);
+}
+
 static unsigned long ftrace_nop_replace(struct dyn_ftrace *rec)
 {
 	return rec->arch.old_mcount ? OLD_NOP : NOP;
@@ -73,6 +90,8 @@ int ftrace_arch_code_modify_prepare(void)
 int ftrace_arch_code_modify_post_process(void)
 {
 	set_all_modules_text_ro();
+	/* Make sure any TLB misses during machine stop are cleared. */
+	flush_tlb_all();
 	return 0;
 }
 
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 8f75250cbe30..4423a565ef6f 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -164,6 +164,7 @@ void machine_kexec(struct kimage *image)
 	reboot_code_buffer = page_address(image->control_code_page);
 
 	/* Prepare parameters for reboot_code_buffer*/
+	set_kernel_text_rw();
 	kexec_start_address = image->start;
 	kexec_indirection_page = page_list;
 	kexec_mach_type = machine_arch_type;
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index a3d07ca2bbb4..542e58919bd9 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -120,6 +120,9 @@ SECTIONS
 			ARM_CPU_KEEP(PROC_INFO)
 	}
 
+#ifdef CONFIG_DEBUG_RODATA
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
 	RO_DATA(PAGE_SIZE)
 
 	. = ALIGN(4);
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 7a0756df91a2..c9cd9c5bf1e1 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -1017,3 +1017,15 @@ config ARM_KERNMEM_PERMS
 	  padded to section-size (1MiB) boundaries (because their permissions
 	  are different and splitting the 1M pages into 4K ones causes TLB
 	  performance problems), wasting memory.
+
+config DEBUG_RODATA
+	bool "Make kernel text and rodata read-only"
+	depends on ARM_KERNMEM_PERMS
+	default y
+	help
+	  If this is set, kernel text and rodata will be made read-only. This
+	  is to help catch accidental or malicious attempts to change the
+	  kernel's executable code. Additionally splits rodata from kernel
+	  text so it can be made explicitly non-executable. This creates
+	  another section-size padded region, so it can waste more memory
+	  space while gaining the read-only protections.
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index ccf392ef40d4..35c838da90d5 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -626,9 +626,10 @@ struct section_perm {
 	unsigned long end;
 	pmdval_t mask;
 	pmdval_t prot;
+	pmdval_t clear;
 };
 
-struct section_perm nx_perms[] = {
+static struct section_perm nx_perms[] = {
 	/* Make pages tables, etc before _stext RW (set NX). */
 	{
 		.start	= PAGE_OFFSET,
@@ -643,8 +644,35 @@ struct section_perm nx_perms[] = {
 		.mask	= ~PMD_SECT_XN,
 		.prot	= PMD_SECT_XN,
 	},
+#ifdef CONFIG_DEBUG_RODATA
+	/* Make rodata NX (set RO in ro_perms below). */
+	{
+		.start  = (unsigned long)__start_rodata,
+		.end    = (unsigned long)__init_begin,
+		.mask   = ~PMD_SECT_XN,
+		.prot   = PMD_SECT_XN,
+	},
+#endif
 };
 
+#ifdef CONFIG_DEBUG_RODATA
+static struct section_perm ro_perms[] = {
+	/* Make kernel code and rodata RX (set RO). */
+	{
+		.start  = (unsigned long)_stext,
+		.end    = (unsigned long)__init_begin,
+#ifdef CONFIG_ARM_LPAE
+		.mask   = ~PMD_SECT_RDONLY,
+		.prot   = PMD_SECT_RDONLY,
+#else
+		.mask   = ~(PMD_SECT_APX | PMD_SECT_AP_WRITE),
+		.prot   = PMD_SECT_APX | PMD_SECT_AP_WRITE,
+		.clear  = PMD_SECT_AP_WRITE,
+#endif
+	},
+};
+#endif
+
 /*
  * Updates section permissions only for the current mm (sections are
  * copied into each mm). During startup, this is the init_mm.
@@ -713,6 +741,24 @@ static inline void fix_kernmem_perms(void)
 {
 	set_section_perms(nx_perms, prot);
 }
+
+#ifdef CONFIG_DEBUG_RODATA
+void mark_rodata_ro(void)
+{
+	set_section_perms(ro_perms, prot);
+}
+
+void set_kernel_text_rw(void)
+{
+	set_section_perms(ro_perms, clear);
+}
+
+void set_kernel_text_ro(void)
+{
+	set_section_perms(ro_perms, prot);
+}
+#endif /* CONFIG_DEBUG_RODATA */
+
 #else
 static inline void fix_kernmem_perms(void) { }
 #endif /* CONFIG_ARM_KERNMEM_PERMS */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 0/8] arm: support CONFIG_RODATA
  2014-08-13 17:06 ` Kees Cook
@ 2014-08-13 17:38   ` Nicolas Pitre
  -1 siblings, 0 replies; 52+ messages in thread
From: Nicolas Pitre @ 2014-08-13 17:38 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, Mark Salter, Rabin Vincent, Liu hua,
	Nikolay Borisov, Tomasz Figa, Doug Anderson, Jason Wessel,
	Will Deacon, Catalin Marinas, Russell King - ARM Linux,
	linux-arm-kernel, linux-doc

On Wed, 13 Aug 2014, Kees Cook wrote:

> This is a series of patches to support CONFIG_RODATA on ARM, so that
> the kernel text is RO, and non-text sections default to NX. To support
> on-the-fly kernel text patching (via ftrace, kprobes, etc), fixmap
> support has been finalized based on several versions of various patches
> that are floating around on the mailing list. This series attempts to
> include the least intrusive version, so that others can build on it for
> future fixmap work.
> 
> The series has been heavily tested, and appears to be working correctly:
> 
> With CONFIG_ARM_PTDUMP, expected page table permissions are seen in
> /sys/kernel/debug/kernel_page_tables.
> 
> Using CONFIG_LKDTM, the kernel now correctly detects bad accesses for
> for the following lkdtm tests via /sys/kernel/debug/provoke-crash/DIRECT:
>         EXEC_DATA
>         WRITE_RO
>         WRITE_KERN
> 
> ftrace works:
>         CONFIG_FTRACE_STARTUP_TEST passes
>         Enabling tracing works:
>                 echo function > /sys/kernel/debug/tracing/current_tracer
> 
> kprobes works:
>         CONFIG_ARM_KPROBES_TEST passes
> 
> kexec works:
>         kexec will load and start a new kernel
> 
> Built with and without CONFIG_HIGHMEM, CONFIG_HIGHMEM_DEBUG, and
> CONFIG_NR_CPUS=32.
> 
> Thanks to everyone who has been testing this series and working on its
> various pieces!

For the whole series:

Acked-by: Nicolas Pitre <nico@linaro.org>



> 
> -Kees
> 
> v4:
> - expanded fixmap to 3MB to support 32 CPUs (robh)
> - corrected pmd-finding via vaddr instead of FIXMAP_START (robh)
> - switched structure size test to BUILD_BUG_ON (sboyd)
> - added locking annotations to keep sparse happy (sboyd)
> - adding missing "static" declarations noticed by sparse
> - reorganized fixmap portion of patches
> 
> v3:
> - more cleanups in switch to generic fixmap (lauraa, robh)
> - fixed kexec merge hunk glitch (will.deacon)
> - added tested-by tags where appropriate from v2 testing
> 
> v2:
> - fix typo in kexec merge (buildbot)
> - flip index order for highmem pte access (lauraa)
> - added kgdb updates (dianders)
> 
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 0/8] arm: support CONFIG_RODATA
@ 2014-08-13 17:38   ` Nicolas Pitre
  0 siblings, 0 replies; 52+ messages in thread
From: Nicolas Pitre @ 2014-08-13 17:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 13 Aug 2014, Kees Cook wrote:

> This is a series of patches to support CONFIG_RODATA on ARM, so that
> the kernel text is RO, and non-text sections default to NX. To support
> on-the-fly kernel text patching (via ftrace, kprobes, etc), fixmap
> support has been finalized based on several versions of various patches
> that are floating around on the mailing list. This series attempts to
> include the least intrusive version, so that others can build on it for
> future fixmap work.
> 
> The series has been heavily tested, and appears to be working correctly:
> 
> With CONFIG_ARM_PTDUMP, expected page table permissions are seen in
> /sys/kernel/debug/kernel_page_tables.
> 
> Using CONFIG_LKDTM, the kernel now correctly detects bad accesses for
> for the following lkdtm tests via /sys/kernel/debug/provoke-crash/DIRECT:
>         EXEC_DATA
>         WRITE_RO
>         WRITE_KERN
> 
> ftrace works:
>         CONFIG_FTRACE_STARTUP_TEST passes
>         Enabling tracing works:
>                 echo function > /sys/kernel/debug/tracing/current_tracer
> 
> kprobes works:
>         CONFIG_ARM_KPROBES_TEST passes
> 
> kexec works:
>         kexec will load and start a new kernel
> 
> Built with and without CONFIG_HIGHMEM, CONFIG_HIGHMEM_DEBUG, and
> CONFIG_NR_CPUS=32.
> 
> Thanks to everyone who has been testing this series and working on its
> various pieces!

For the whole series:

Acked-by: Nicolas Pitre <nico@linaro.org>



> 
> -Kees
> 
> v4:
> - expanded fixmap to 3MB to support 32 CPUs (robh)
> - corrected pmd-finding via vaddr instead of FIXMAP_START (robh)
> - switched structure size test to BUILD_BUG_ON (sboyd)
> - added locking annotations to keep sparse happy (sboyd)
> - adding missing "static" declarations noticed by sparse
> - reorganized fixmap portion of patches
> 
> v3:
> - more cleanups in switch to generic fixmap (lauraa, robh)
> - fixed kexec merge hunk glitch (will.deacon)
> - added tested-by tags where appropriate from v2 testing
> 
> v2:
> - fix typo in kexec merge (buildbot)
> - flip index order for highmem pte access (lauraa)
> - added kgdb updates (dianders)
> 
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 2/8] ARM: expand fixmap region to 3MB
  2014-08-13 17:06   ` Kees Cook
@ 2014-08-19 12:26     ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-19 12:26 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

Hi Kees, Rob,

On Wed, Aug 13, 2014 at 06:06:27PM +0100, Kees Cook wrote:
> From: Rob Herring <robh@kernel.org>
> 
> With commit a05e54c103b0 ("ARM: 8031/2: change fixmap mapping region to
> support 32 CPUs"), the fixmap region was expanded to 2MB, but it
> precluded any other uses of the fixmap region. In order to support other
> uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
> adjacent 1MB range 0xffe00000-0xfff00000 is availabe.
> 
> Remove fixmap_page_table ptr and lookup the page table via the virtual
> address so that the fixmap region can span more that one pmd. The 2nd
> pmd is already created since it is shared with the vector page.

[...]

> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
> index 45aeaaca9052..81061987ac45 100644
> --- a/arch/arm/mm/highmem.c
> +++ b/arch/arm/mm/highmem.c
> @@ -18,19 +18,20 @@
>  #include <asm/tlbflush.h>
>  #include "mm.h"
>  
> -pte_t *fixmap_page_table;
> -
>  static inline void set_fixmap_pte(int idx, pte_t pte)
>  {
>  	unsigned long vaddr = __fix_to_virt(idx);
> -	set_pte_ext(fixmap_page_table + idx, pte, 0);
> +	pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
> +
> +	set_pte_ext(ptep, pte, 0);
>  	local_flush_tlb_kernel_page(vaddr);
>  }

Minor comment, but we can drop the ifx parameter here now, right? It looks a
bit weird having a setter/getter pair of functions where the type signatures
aren't the inverse.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 2/8] ARM: expand fixmap region to 3MB
@ 2014-08-19 12:26     ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-19 12:26 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Kees, Rob,

On Wed, Aug 13, 2014 at 06:06:27PM +0100, Kees Cook wrote:
> From: Rob Herring <robh@kernel.org>
> 
> With commit a05e54c103b0 ("ARM: 8031/2: change fixmap mapping region to
> support 32 CPUs"), the fixmap region was expanded to 2MB, but it
> precluded any other uses of the fixmap region. In order to support other
> uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
> adjacent 1MB range 0xffe00000-0xfff00000 is availabe.
> 
> Remove fixmap_page_table ptr and lookup the page table via the virtual
> address so that the fixmap region can span more that one pmd. The 2nd
> pmd is already created since it is shared with the vector page.

[...]

> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
> index 45aeaaca9052..81061987ac45 100644
> --- a/arch/arm/mm/highmem.c
> +++ b/arch/arm/mm/highmem.c
> @@ -18,19 +18,20 @@
>  #include <asm/tlbflush.h>
>  #include "mm.h"
>  
> -pte_t *fixmap_page_table;
> -
>  static inline void set_fixmap_pte(int idx, pte_t pte)
>  {
>  	unsigned long vaddr = __fix_to_virt(idx);
> -	set_pte_ext(fixmap_page_table + idx, pte, 0);
> +	pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
> +
> +	set_pte_ext(ptep, pte, 0);
>  	local_flush_tlb_kernel_page(vaddr);
>  }

Minor comment, but we can drop the ifx parameter here now, right? It looks a
bit weird having a setter/getter pair of functions where the type signatures
aren't the inverse.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
  2014-08-13 17:06   ` Kees Cook
@ 2014-08-19 12:29     ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-19 12:29 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rabin Vincent, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, msalter, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

Hello,

On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
> From: Rabin Vincent <rabin@rab.in>
> 
> Use fixmaps for text patching when the kernel text is read-only,
> inspired by x86.  This makes jump labels and kprobes work with the
> currently available CONFIG_DEBUG_SET_MODULE_RONX and the upcoming
> CONFIG_DEBUG_RODATA options.
> 
> Signed-off-by: Rabin Vincent <rabin@rab.in>
> [kees: fixed up for merge with "arm: use generic fixmap.h"]
> [kees: added parse acquire/release annotations to pass C=1 builds]
> Signed-off-by: Kees Cook <keescook@chromium.org>

[...]

> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
> index 07314af47733..a1dce690446a 100644
> --- a/arch/arm/kernel/patch.c
> +++ b/arch/arm/kernel/patch.c
> @@ -1,8 +1,11 @@
>  #include <linux/kernel.h>
> +#include <linux/spinlock.h>
>  #include <linux/kprobes.h>
> +#include <linux/mm.h>
>  #include <linux/stop_machine.h>
>  
>  #include <asm/cacheflush.h>
> +#include <asm/fixmap.h>
>  #include <asm/smp_plat.h>
>  #include <asm/opcodes.h>
>  
> @@ -13,21 +16,77 @@ struct patch {
>  	unsigned int insn;
>  };
>  
> -void __kprobes __patch_text(void *addr, unsigned int insn)
> +static DEFINE_SPINLOCK(patch_lock);
> +
> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
> +	__acquires(&patch_lock)
> +{
> +	unsigned int uintaddr = (uintptr_t) addr;
> +	bool module = !core_kernel_text(uintaddr);
> +	struct page *page;
> +
> +	if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
> +		page = vmalloc_to_page(addr);
> +	else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
> +		page = virt_to_page(addr);
> +	else
> +		return addr;
> +
> +	if (flags)
> +		spin_lock_irqsave(&patch_lock, *flags);
> +	else
> +		__acquire(&patch_lock);

I don't understand the locking here. Why is it conditional, why do we need
to disable interrupts, and are you just racing against yourself?

> +	set_fixmap(fixmap, page_to_phys(page));

set_fixmap does TLB invalidation, right? I think that means it can block on
11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
interrupts disabled anyway.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
@ 2014-08-19 12:29     ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-19 12:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
> From: Rabin Vincent <rabin@rab.in>
> 
> Use fixmaps for text patching when the kernel text is read-only,
> inspired by x86.  This makes jump labels and kprobes work with the
> currently available CONFIG_DEBUG_SET_MODULE_RONX and the upcoming
> CONFIG_DEBUG_RODATA options.
> 
> Signed-off-by: Rabin Vincent <rabin@rab.in>
> [kees: fixed up for merge with "arm: use generic fixmap.h"]
> [kees: added parse acquire/release annotations to pass C=1 builds]
> Signed-off-by: Kees Cook <keescook@chromium.org>

[...]

> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
> index 07314af47733..a1dce690446a 100644
> --- a/arch/arm/kernel/patch.c
> +++ b/arch/arm/kernel/patch.c
> @@ -1,8 +1,11 @@
>  #include <linux/kernel.h>
> +#include <linux/spinlock.h>
>  #include <linux/kprobes.h>
> +#include <linux/mm.h>
>  #include <linux/stop_machine.h>
>  
>  #include <asm/cacheflush.h>
> +#include <asm/fixmap.h>
>  #include <asm/smp_plat.h>
>  #include <asm/opcodes.h>
>  
> @@ -13,21 +16,77 @@ struct patch {
>  	unsigned int insn;
>  };
>  
> -void __kprobes __patch_text(void *addr, unsigned int insn)
> +static DEFINE_SPINLOCK(patch_lock);
> +
> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
> +	__acquires(&patch_lock)
> +{
> +	unsigned int uintaddr = (uintptr_t) addr;
> +	bool module = !core_kernel_text(uintaddr);
> +	struct page *page;
> +
> +	if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
> +		page = vmalloc_to_page(addr);
> +	else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
> +		page = virt_to_page(addr);
> +	else
> +		return addr;
> +
> +	if (flags)
> +		spin_lock_irqsave(&patch_lock, *flags);
> +	else
> +		__acquire(&patch_lock);

I don't understand the locking here. Why is it conditional, why do we need
to disable interrupts, and are you just racing against yourself?

> +	set_fixmap(fixmap, page_to_phys(page));

set_fixmap does TLB invalidation, right? I think that means it can block on
11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
interrupts disabled anyway.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
  2014-08-13 17:06   ` Kees Cook
@ 2014-08-19 12:33     ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-19 12:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Wed, Aug 13, 2014 at 06:06:32PM +0100, Kees Cook wrote:
> Adds CONFIG_ARM_KERNMEM_PERMS to separate the kernel memory regions
> into section-sized areas that can have different permisions. Performs
> the NX permission changes during free_initmem, so that init memory can be
> reclaimed.
> 
> This uses section size instead of PMD size to reduce memory lost to
> padding on non-LPAE systems.
> 
> Based on work by Brad Spengler, Larry Bassel, and Laura Abbott.

[...]

> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index 6f57cb94367f..a3d07ca2bbb4 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -8,6 +8,9 @@
>  #include <asm/thread_info.h>
>  #include <asm/memory.h>
>  #include <asm/page.h>
> +#ifdef CONFIG_ARM_KERNMEM_PERMS
> +#include <asm/pgtable.h>
> +#endif
>  	
>  #define PROC_INFO							\
>  	. = ALIGN(4);							\
> @@ -90,6 +93,11 @@ SECTIONS
>  		_text = .;
>  		HEAD_TEXT
>  	}
> +
> +#ifdef CONFIG_ARM_KERNMEM_PERMS
> +	. = ALIGN(1<<SECTION_SHIFT);
> +#endif
> +
>  	.text : {			/* Real text segment		*/
>  		_stext = .;		/* Text and read-only data	*/
>  			__exception_text_start = .;
> @@ -145,7 +153,11 @@ SECTIONS
>  	_etext = .;			/* End of text and rodata section */
>  
>  #ifndef CONFIG_XIP_KERNEL
> +# ifdef CONFIG_ARM_KERNMEM_PERMS
> +	. = ALIGN(1<<SECTION_SHIFT);
> +# else
>  	. = ALIGN(PAGE_SIZE);
> +# endif

This might be cleaner if we had a single macro (ALIGN_MIN?) that expanded to
ALIGN(1 << SECTION_SHIFT) #ifdef  CONFIG_ARM_KERNMEM_PERMS.

> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index ad82c05bfc3a..ccf392ef40d4 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -32,6 +32,11 @@
>  #include <asm/tlb.h>
>  #include <asm/fixmap.h>
>  
> +#ifdef CONFIG_ARM_KERNMEM_PERMS
> +#include <asm/system_info.h>
> +#include <asm/cp15.h>
> +#endif

We already #include cp15.h in this file. If you need system_info.h, I don't
think you should bother making the include conditional.

> +/*
> + * Updates section permissions only for the current mm (sections are
> + * copied into each mm). During startup, this is the init_mm.
> + */
> +static inline void section_update(unsigned long addr, pmdval_t mask,
> +				  pmdval_t prot)
> +{
> +	struct mm_struct *mm;
> +	pmd_t *pmd;
> +
> +	mm = current->active_mm;
> +	pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
> +
> +#ifdef CONFIG_ARM_LPAE
> +	pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> +#else
> +	if (addr & SECTION_SIZE)
> +		pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
> +	else
> +		pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> +#endif
> +	flush_pmd_entry(pmd);
> +	local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);

Why only a local flush? You're changing global mappings here, right?

> +}
> +
> +/* Make sure extended page tables are in use. */
> +static inline bool arch_has_strict_perms(void)
> +{
> +	unsigned int cr;
> +
> +	if (cpu_architecture() < CPU_ARCH_ARMv6)
> +		return false;
> +
> +	cr = get_cr();
> +	if (!(cr & CR_XP))
> +		return false;
> +
> +	return true;

return !!(get_cr() & CR_XP)?

> +}
> +
> +#define set_section_perms(perms, field)	{				\
> +	size_t i;							\
> +	unsigned long addr;						\
> +									\
> +	if (!arch_has_strict_perms())					\
> +		return;							\
> +									\
> +	for (i = 0; i < ARRAY_SIZE(perms); i++) {			\
> +		if (!IS_ALIGNED(perms[i].start, SECTION_SIZE) ||	\
> +		    !IS_ALIGNED(perms[i].end, SECTION_SIZE)) {		\
> +			pr_err("BUG: section %lx-%lx not aligned to %lx\n", \
> +				perms[i].start, perms[i].end,		\
> +				SECTION_SIZE);				\
> +			continue;					\
> +		}							\
> +									\
> +		for (addr = perms[i].start;				\
> +		     addr < perms[i].end;				\
> +		     addr += SECTION_SIZE)				\
> +			section_update(addr, perms[i].mask,		\
> +				       perms[i].field);			\
> +	}								\
> +}
> +
> +static inline void fix_kernmem_perms(void)
> +{
> +	set_section_perms(nx_perms, prot);
> +}
> +#else
> +static inline void fix_kernmem_perms(void) { }
> +#endif /* CONFIG_ARM_KERNMEM_PERMS */
> +
>  void free_initmem(void)
>  {
>  #ifdef CONFIG_HAVE_TCM
>  	extern char __tcm_start, __tcm_end;
> +#endif
> +
> +	fix_kernmem_perms();
>  
> +#ifdef CONFIG_HAVE_TCM

You could avoid the double #ifdef by moving the tcm stuff into another
function (free_tcmmem?)

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
@ 2014-08-19 12:33     ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-19 12:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 13, 2014 at 06:06:32PM +0100, Kees Cook wrote:
> Adds CONFIG_ARM_KERNMEM_PERMS to separate the kernel memory regions
> into section-sized areas that can have different permisions. Performs
> the NX permission changes during free_initmem, so that init memory can be
> reclaimed.
> 
> This uses section size instead of PMD size to reduce memory lost to
> padding on non-LPAE systems.
> 
> Based on work by Brad Spengler, Larry Bassel, and Laura Abbott.

[...]

> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index 6f57cb94367f..a3d07ca2bbb4 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -8,6 +8,9 @@
>  #include <asm/thread_info.h>
>  #include <asm/memory.h>
>  #include <asm/page.h>
> +#ifdef CONFIG_ARM_KERNMEM_PERMS
> +#include <asm/pgtable.h>
> +#endif
>  	
>  #define PROC_INFO							\
>  	. = ALIGN(4);							\
> @@ -90,6 +93,11 @@ SECTIONS
>  		_text = .;
>  		HEAD_TEXT
>  	}
> +
> +#ifdef CONFIG_ARM_KERNMEM_PERMS
> +	. = ALIGN(1<<SECTION_SHIFT);
> +#endif
> +
>  	.text : {			/* Real text segment		*/
>  		_stext = .;		/* Text and read-only data	*/
>  			__exception_text_start = .;
> @@ -145,7 +153,11 @@ SECTIONS
>  	_etext = .;			/* End of text and rodata section */
>  
>  #ifndef CONFIG_XIP_KERNEL
> +# ifdef CONFIG_ARM_KERNMEM_PERMS
> +	. = ALIGN(1<<SECTION_SHIFT);
> +# else
>  	. = ALIGN(PAGE_SIZE);
> +# endif

This might be cleaner if we had a single macro (ALIGN_MIN?) that expanded to
ALIGN(1 << SECTION_SHIFT) #ifdef  CONFIG_ARM_KERNMEM_PERMS.

> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index ad82c05bfc3a..ccf392ef40d4 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -32,6 +32,11 @@
>  #include <asm/tlb.h>
>  #include <asm/fixmap.h>
>  
> +#ifdef CONFIG_ARM_KERNMEM_PERMS
> +#include <asm/system_info.h>
> +#include <asm/cp15.h>
> +#endif

We already #include cp15.h in this file. If you need system_info.h, I don't
think you should bother making the include conditional.

> +/*
> + * Updates section permissions only for the current mm (sections are
> + * copied into each mm). During startup, this is the init_mm.
> + */
> +static inline void section_update(unsigned long addr, pmdval_t mask,
> +				  pmdval_t prot)
> +{
> +	struct mm_struct *mm;
> +	pmd_t *pmd;
> +
> +	mm = current->active_mm;
> +	pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
> +
> +#ifdef CONFIG_ARM_LPAE
> +	pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> +#else
> +	if (addr & SECTION_SIZE)
> +		pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
> +	else
> +		pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> +#endif
> +	flush_pmd_entry(pmd);
> +	local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);

Why only a local flush? You're changing global mappings here, right?

> +}
> +
> +/* Make sure extended page tables are in use. */
> +static inline bool arch_has_strict_perms(void)
> +{
> +	unsigned int cr;
> +
> +	if (cpu_architecture() < CPU_ARCH_ARMv6)
> +		return false;
> +
> +	cr = get_cr();
> +	if (!(cr & CR_XP))
> +		return false;
> +
> +	return true;

return !!(get_cr() & CR_XP)?

> +}
> +
> +#define set_section_perms(perms, field)	{				\
> +	size_t i;							\
> +	unsigned long addr;						\
> +									\
> +	if (!arch_has_strict_perms())					\
> +		return;							\
> +									\
> +	for (i = 0; i < ARRAY_SIZE(perms); i++) {			\
> +		if (!IS_ALIGNED(perms[i].start, SECTION_SIZE) ||	\
> +		    !IS_ALIGNED(perms[i].end, SECTION_SIZE)) {		\
> +			pr_err("BUG: section %lx-%lx not aligned to %lx\n", \
> +				perms[i].start, perms[i].end,		\
> +				SECTION_SIZE);				\
> +			continue;					\
> +		}							\
> +									\
> +		for (addr = perms[i].start;				\
> +		     addr < perms[i].end;				\
> +		     addr += SECTION_SIZE)				\
> +			section_update(addr, perms[i].mask,		\
> +				       perms[i].field);			\
> +	}								\
> +}
> +
> +static inline void fix_kernmem_perms(void)
> +{
> +	set_section_perms(nx_perms, prot);
> +}
> +#else
> +static inline void fix_kernmem_perms(void) { }
> +#endif /* CONFIG_ARM_KERNMEM_PERMS */
> +
>  void free_initmem(void)
>  {
>  #ifdef CONFIG_HAVE_TCM
>  	extern char __tcm_start, __tcm_end;
> +#endif
> +
> +	fix_kernmem_perms();
>  
> +#ifdef CONFIG_HAVE_TCM

You could avoid the double #ifdef by moving the tcm stuff into another
function (free_tcmmem?)

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 8/8] ARM: mm: allow text and rodata sections to be read-only
  2014-08-13 17:06   ` Kees Cook
@ 2014-08-19 12:36     ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-19 12:36 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Wed, Aug 13, 2014 at 06:06:33PM +0100, Kees Cook wrote:
> This introduces CONFIG_DEBUG_RODATA, making kernel text and rodata
> read-only. Additionally, this splits rodata from text so that rodata can
> also be NX, which may lead to wasted memory when aligning to SECTION_SIZE.
> The read-only areas are made writable during ftrace updates and kexec.

[...]

> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index af9a8a927a4e..b8c75e45a950 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -15,6 +15,7 @@
>  #include <linux/ftrace.h>
>  #include <linux/uaccess.h>
>  #include <linux/module.h>
> +#include <linux/stop_machine.h>
>  
>  #include <asm/cacheflush.h>
>  #include <asm/opcodes.h>
> @@ -35,6 +36,22 @@
>  
>  #define	OLD_NOP		0xe1a00000	/* mov r0, r0 */
>  
> +static int __ftrace_modify_code(void *data)
> +{
> +	int *command = data;
> +
> +	set_kernel_text_rw();
> +	ftrace_modify_all_code(*command);
> +	set_kernel_text_ro();
> +
> +	return 0;
> +}
> +
> +void arch_ftrace_update_code(int command)
> +{
> +	stop_machine(__ftrace_modify_code, &command, NULL);
> +}
> +
>  static unsigned long ftrace_nop_replace(struct dyn_ftrace *rec)
>  {
>  	return rec->arch.old_mcount ? OLD_NOP : NOP;
> @@ -73,6 +90,8 @@ int ftrace_arch_code_modify_prepare(void)
>  int ftrace_arch_code_modify_post_process(void)
>  {
>  	set_all_modules_text_ro();
> +	/* Make sure any TLB misses during machine stop are cleared. */
> +	flush_tlb_all();

I'm afraid I don't understand what you're trying to achieve here. What do
you mean by `clearing a TLB miss'?

[...]

> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index ccf392ef40d4..35c838da90d5 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -626,9 +626,10 @@ struct section_perm {
>  	unsigned long end;
>  	pmdval_t mask;
>  	pmdval_t prot;
> +	pmdval_t clear;
>  };
>  
> -struct section_perm nx_perms[] = {
> +static struct section_perm nx_perms[] = {
>  	/* Make pages tables, etc before _stext RW (set NX). */
>  	{
>  		.start	= PAGE_OFFSET,
> @@ -643,8 +644,35 @@ struct section_perm nx_perms[] = {
>  		.mask	= ~PMD_SECT_XN,
>  		.prot	= PMD_SECT_XN,
>  	},
> +#ifdef CONFIG_DEBUG_RODATA
> +	/* Make rodata NX (set RO in ro_perms below). */
> +	{
> +		.start  = (unsigned long)__start_rodata,
> +		.end    = (unsigned long)__init_begin,
> +		.mask   = ~PMD_SECT_XN,
> +		.prot   = PMD_SECT_XN,
> +	},
> +#endif
>  };
>  
> +#ifdef CONFIG_DEBUG_RODATA
> +static struct section_perm ro_perms[] = {
> +	/* Make kernel code and rodata RX (set RO). */
> +	{
> +		.start  = (unsigned long)_stext,
> +		.end    = (unsigned long)__init_begin,
> +#ifdef CONFIG_ARM_LPAE
> +		.mask   = ~PMD_SECT_RDONLY,
> +		.prot   = PMD_SECT_RDONLY,
> +#else
> +		.mask   = ~(PMD_SECT_APX | PMD_SECT_AP_WRITE),
> +		.prot   = PMD_SECT_APX | PMD_SECT_AP_WRITE,
> +		.clear  = PMD_SECT_AP_WRITE,
> +#endif
> +	},
> +};
> +#endif
> +
>  /*
>   * Updates section permissions only for the current mm (sections are
>   * copied into each mm). During startup, this is the init_mm.
> @@ -713,6 +741,24 @@ static inline void fix_kernmem_perms(void)
>  {
>  	set_section_perms(nx_perms, prot);
>  }
> +
> +#ifdef CONFIG_DEBUG_RODATA
> +void mark_rodata_ro(void)
> +{
> +	set_section_perms(ro_perms, prot);
> +}
> +
> +void set_kernel_text_rw(void)
> +{
> +	set_section_perms(ro_perms, clear);
> +}

How does this work with LPAE? I don't see a populated clear field there.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 8/8] ARM: mm: allow text and rodata sections to be read-only
@ 2014-08-19 12:36     ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-19 12:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 13, 2014 at 06:06:33PM +0100, Kees Cook wrote:
> This introduces CONFIG_DEBUG_RODATA, making kernel text and rodata
> read-only. Additionally, this splits rodata from text so that rodata can
> also be NX, which may lead to wasted memory when aligning to SECTION_SIZE.
> The read-only areas are made writable during ftrace updates and kexec.

[...]

> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index af9a8a927a4e..b8c75e45a950 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -15,6 +15,7 @@
>  #include <linux/ftrace.h>
>  #include <linux/uaccess.h>
>  #include <linux/module.h>
> +#include <linux/stop_machine.h>
>  
>  #include <asm/cacheflush.h>
>  #include <asm/opcodes.h>
> @@ -35,6 +36,22 @@
>  
>  #define	OLD_NOP		0xe1a00000	/* mov r0, r0 */
>  
> +static int __ftrace_modify_code(void *data)
> +{
> +	int *command = data;
> +
> +	set_kernel_text_rw();
> +	ftrace_modify_all_code(*command);
> +	set_kernel_text_ro();
> +
> +	return 0;
> +}
> +
> +void arch_ftrace_update_code(int command)
> +{
> +	stop_machine(__ftrace_modify_code, &command, NULL);
> +}
> +
>  static unsigned long ftrace_nop_replace(struct dyn_ftrace *rec)
>  {
>  	return rec->arch.old_mcount ? OLD_NOP : NOP;
> @@ -73,6 +90,8 @@ int ftrace_arch_code_modify_prepare(void)
>  int ftrace_arch_code_modify_post_process(void)
>  {
>  	set_all_modules_text_ro();
> +	/* Make sure any TLB misses during machine stop are cleared. */
> +	flush_tlb_all();

I'm afraid I don't understand what you're trying to achieve here. What do
you mean by `clearing a TLB miss'?

[...]

> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index ccf392ef40d4..35c838da90d5 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -626,9 +626,10 @@ struct section_perm {
>  	unsigned long end;
>  	pmdval_t mask;
>  	pmdval_t prot;
> +	pmdval_t clear;
>  };
>  
> -struct section_perm nx_perms[] = {
> +static struct section_perm nx_perms[] = {
>  	/* Make pages tables, etc before _stext RW (set NX). */
>  	{
>  		.start	= PAGE_OFFSET,
> @@ -643,8 +644,35 @@ struct section_perm nx_perms[] = {
>  		.mask	= ~PMD_SECT_XN,
>  		.prot	= PMD_SECT_XN,
>  	},
> +#ifdef CONFIG_DEBUG_RODATA
> +	/* Make rodata NX (set RO in ro_perms below). */
> +	{
> +		.start  = (unsigned long)__start_rodata,
> +		.end    = (unsigned long)__init_begin,
> +		.mask   = ~PMD_SECT_XN,
> +		.prot   = PMD_SECT_XN,
> +	},
> +#endif
>  };
>  
> +#ifdef CONFIG_DEBUG_RODATA
> +static struct section_perm ro_perms[] = {
> +	/* Make kernel code and rodata RX (set RO). */
> +	{
> +		.start  = (unsigned long)_stext,
> +		.end    = (unsigned long)__init_begin,
> +#ifdef CONFIG_ARM_LPAE
> +		.mask   = ~PMD_SECT_RDONLY,
> +		.prot   = PMD_SECT_RDONLY,
> +#else
> +		.mask   = ~(PMD_SECT_APX | PMD_SECT_AP_WRITE),
> +		.prot   = PMD_SECT_APX | PMD_SECT_AP_WRITE,
> +		.clear  = PMD_SECT_AP_WRITE,
> +#endif
> +	},
> +};
> +#endif
> +
>  /*
>   * Updates section permissions only for the current mm (sections are
>   * copied into each mm). During startup, this is the init_mm.
> @@ -713,6 +741,24 @@ static inline void fix_kernmem_perms(void)
>  {
>  	set_section_perms(nx_perms, prot);
>  }
> +
> +#ifdef CONFIG_DEBUG_RODATA
> +void mark_rodata_ro(void)
> +{
> +	set_section_perms(ro_perms, prot);
> +}
> +
> +void set_kernel_text_rw(void)
> +{
> +	set_section_perms(ro_perms, clear);
> +}

How does this work with LPAE? I don't see a populated clear field there.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 2/8] ARM: expand fixmap region to 3MB
  2014-08-19 12:26     ` Will Deacon
@ 2014-08-20 12:16       ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-20 12:16 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Tue, Aug 19, 2014 at 7:26 AM, Will Deacon <will.deacon@arm.com> wrote:
> Hi Kees, Rob,
>
> On Wed, Aug 13, 2014 at 06:06:27PM +0100, Kees Cook wrote:
>> From: Rob Herring <robh@kernel.org>
>>
>> With commit a05e54c103b0 ("ARM: 8031/2: change fixmap mapping region to
>> support 32 CPUs"), the fixmap region was expanded to 2MB, but it
>> precluded any other uses of the fixmap region. In order to support other
>> uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
>> adjacent 1MB range 0xffe00000-0xfff00000 is availabe.
>>
>> Remove fixmap_page_table ptr and lookup the page table via the virtual
>> address so that the fixmap region can span more that one pmd. The 2nd
>> pmd is already created since it is shared with the vector page.
>
> [...]
>
>> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
>> index 45aeaaca9052..81061987ac45 100644
>> --- a/arch/arm/mm/highmem.c
>> +++ b/arch/arm/mm/highmem.c
>> @@ -18,19 +18,20 @@
>>  #include <asm/tlbflush.h>
>>  #include "mm.h"
>>
>> -pte_t *fixmap_page_table;
>> -
>>  static inline void set_fixmap_pte(int idx, pte_t pte)
>>  {
>>       unsigned long vaddr = __fix_to_virt(idx);
>> -     set_pte_ext(fixmap_page_table + idx, pte, 0);
>> +     pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
>> +
>> +     set_pte_ext(ptep, pte, 0);
>>       local_flush_tlb_kernel_page(vaddr);
>>  }
>
> Minor comment, but we can drop the ifx parameter here now, right? It looks a
> bit weird having a setter/getter pair of functions where the type signatures
> aren't the inverse.

idx is used to get the vaddr here. I suppose the various users of
these functions in highmem could be reworked, but that doesn't look
sensible either. Maybe it's just the names of the functions that are
confusing?

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 2/8] ARM: expand fixmap region to 3MB
@ 2014-08-20 12:16       ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-20 12:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 19, 2014 at 7:26 AM, Will Deacon <will.deacon@arm.com> wrote:
> Hi Kees, Rob,
>
> On Wed, Aug 13, 2014 at 06:06:27PM +0100, Kees Cook wrote:
>> From: Rob Herring <robh@kernel.org>
>>
>> With commit a05e54c103b0 ("ARM: 8031/2: change fixmap mapping region to
>> support 32 CPUs"), the fixmap region was expanded to 2MB, but it
>> precluded any other uses of the fixmap region. In order to support other
>> uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
>> adjacent 1MB range 0xffe00000-0xfff00000 is availabe.
>>
>> Remove fixmap_page_table ptr and lookup the page table via the virtual
>> address so that the fixmap region can span more that one pmd. The 2nd
>> pmd is already created since it is shared with the vector page.
>
> [...]
>
>> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
>> index 45aeaaca9052..81061987ac45 100644
>> --- a/arch/arm/mm/highmem.c
>> +++ b/arch/arm/mm/highmem.c
>> @@ -18,19 +18,20 @@
>>  #include <asm/tlbflush.h>
>>  #include "mm.h"
>>
>> -pte_t *fixmap_page_table;
>> -
>>  static inline void set_fixmap_pte(int idx, pte_t pte)
>>  {
>>       unsigned long vaddr = __fix_to_virt(idx);
>> -     set_pte_ext(fixmap_page_table + idx, pte, 0);
>> +     pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
>> +
>> +     set_pte_ext(ptep, pte, 0);
>>       local_flush_tlb_kernel_page(vaddr);
>>  }
>
> Minor comment, but we can drop the ifx parameter here now, right? It looks a
> bit weird having a setter/getter pair of functions where the type signatures
> aren't the inverse.

idx is used to get the vaddr here. I suppose the various users of
these functions in highmem could be reworked, but that doesn't look
sensible either. Maybe it's just the names of the functions that are
confusing?

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
  2014-08-19 12:29     ` Will Deacon
@ 2014-08-20 12:28       ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-20 12:28 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Rabin Vincent, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, msalter, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
> Hello,
>
> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
>> From: Rabin Vincent <rabin@rab.in>
>>
>> Use fixmaps for text patching when the kernel text is read-only,
>> inspired by x86.  This makes jump labels and kprobes work with the
>> currently available CONFIG_DEBUG_SET_MODULE_RONX and the upcoming
>> CONFIG_DEBUG_RODATA options.
>>
>> Signed-off-by: Rabin Vincent <rabin@rab.in>
>> [kees: fixed up for merge with "arm: use generic fixmap.h"]
>> [kees: added parse acquire/release annotations to pass C=1 builds]
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>
> [...]
>
>> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
>> index 07314af47733..a1dce690446a 100644
>> --- a/arch/arm/kernel/patch.c
>> +++ b/arch/arm/kernel/patch.c
>> @@ -1,8 +1,11 @@
>>  #include <linux/kernel.h>
>> +#include <linux/spinlock.h>
>>  #include <linux/kprobes.h>
>> +#include <linux/mm.h>
>>  #include <linux/stop_machine.h>
>>
>>  #include <asm/cacheflush.h>
>> +#include <asm/fixmap.h>
>>  #include <asm/smp_plat.h>
>>  #include <asm/opcodes.h>
>>
>> @@ -13,21 +16,77 @@ struct patch {
>>       unsigned int insn;
>>  };
>>
>> -void __kprobes __patch_text(void *addr, unsigned int insn)
>> +static DEFINE_SPINLOCK(patch_lock);
>> +
>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
>> +     __acquires(&patch_lock)
>> +{
>> +     unsigned int uintaddr = (uintptr_t) addr;
>> +     bool module = !core_kernel_text(uintaddr);
>> +     struct page *page;
>> +
>> +     if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
>> +             page = vmalloc_to_page(addr);
>> +     else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
>> +             page = virt_to_page(addr);
>> +     else
>> +             return addr;
>> +
>> +     if (flags)
>> +             spin_lock_irqsave(&patch_lock, *flags);
>> +     else
>> +             __acquire(&patch_lock);
>
> I don't understand the locking here. Why is it conditional, why do we need
> to disable interrupts, and are you just racing against yourself?

AIUI, the locking is here to avoid multiple users of the text poking
fixmaps. It's conditional because there are two fixmaps
(FIX_TEXT_POKE0 and FIX_TEXT_POKE1). Locking happens around 0 so
locking around 1 is not needed since it is only ever used when 0 is in
use. (__patch_text_real locks patch_lock before setting 0 when it uses
remapping, and if it also needs 1, it doesn't have to lock since the
lock is already held.)

>> +     set_fixmap(fixmap, page_to_phys(page));
>
> set_fixmap does TLB invalidation, right? I think that means it can block on
> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
> interrupts disabled anyway.

Oh right. Hrm.

In an earlier version of this series set_fixmap did not perform TLB
invalidation. I wonder if this is not needed at all? (Wouldn't that be
nice...)

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
@ 2014-08-20 12:28       ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-20 12:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
> Hello,
>
> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
>> From: Rabin Vincent <rabin@rab.in>
>>
>> Use fixmaps for text patching when the kernel text is read-only,
>> inspired by x86.  This makes jump labels and kprobes work with the
>> currently available CONFIG_DEBUG_SET_MODULE_RONX and the upcoming
>> CONFIG_DEBUG_RODATA options.
>>
>> Signed-off-by: Rabin Vincent <rabin@rab.in>
>> [kees: fixed up for merge with "arm: use generic fixmap.h"]
>> [kees: added parse acquire/release annotations to pass C=1 builds]
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>
> [...]
>
>> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
>> index 07314af47733..a1dce690446a 100644
>> --- a/arch/arm/kernel/patch.c
>> +++ b/arch/arm/kernel/patch.c
>> @@ -1,8 +1,11 @@
>>  #include <linux/kernel.h>
>> +#include <linux/spinlock.h>
>>  #include <linux/kprobes.h>
>> +#include <linux/mm.h>
>>  #include <linux/stop_machine.h>
>>
>>  #include <asm/cacheflush.h>
>> +#include <asm/fixmap.h>
>>  #include <asm/smp_plat.h>
>>  #include <asm/opcodes.h>
>>
>> @@ -13,21 +16,77 @@ struct patch {
>>       unsigned int insn;
>>  };
>>
>> -void __kprobes __patch_text(void *addr, unsigned int insn)
>> +static DEFINE_SPINLOCK(patch_lock);
>> +
>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
>> +     __acquires(&patch_lock)
>> +{
>> +     unsigned int uintaddr = (uintptr_t) addr;
>> +     bool module = !core_kernel_text(uintaddr);
>> +     struct page *page;
>> +
>> +     if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
>> +             page = vmalloc_to_page(addr);
>> +     else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
>> +             page = virt_to_page(addr);
>> +     else
>> +             return addr;
>> +
>> +     if (flags)
>> +             spin_lock_irqsave(&patch_lock, *flags);
>> +     else
>> +             __acquire(&patch_lock);
>
> I don't understand the locking here. Why is it conditional, why do we need
> to disable interrupts, and are you just racing against yourself?

AIUI, the locking is here to avoid multiple users of the text poking
fixmaps. It's conditional because there are two fixmaps
(FIX_TEXT_POKE0 and FIX_TEXT_POKE1). Locking happens around 0 so
locking around 1 is not needed since it is only ever used when 0 is in
use. (__patch_text_real locks patch_lock before setting 0 when it uses
remapping, and if it also needs 1, it doesn't have to lock since the
lock is already held.)

>> +     set_fixmap(fixmap, page_to_phys(page));
>
> set_fixmap does TLB invalidation, right? I think that means it can block on
> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
> interrupts disabled anyway.

Oh right. Hrm.

In an earlier version of this series set_fixmap did not perform TLB
invalidation. I wonder if this is not needed at all? (Wouldn't that be
nice...)

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
  2014-08-19 12:33     ` Will Deacon
@ 2014-08-20 12:37       ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-20 12:37 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Tue, Aug 19, 2014 at 7:33 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Aug 13, 2014 at 06:06:32PM +0100, Kees Cook wrote:
>> Adds CONFIG_ARM_KERNMEM_PERMS to separate the kernel memory regions
>> into section-sized areas that can have different permisions. Performs
>> the NX permission changes during free_initmem, so that init memory can be
>> reclaimed.
>>
>> This uses section size instead of PMD size to reduce memory lost to
>> padding on non-LPAE systems.
>>
>> Based on work by Brad Spengler, Larry Bassel, and Laura Abbott.
>
> [...]
>
>> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
>> index 6f57cb94367f..a3d07ca2bbb4 100644
>> --- a/arch/arm/kernel/vmlinux.lds.S
>> +++ b/arch/arm/kernel/vmlinux.lds.S
>> @@ -8,6 +8,9 @@
>>  #include <asm/thread_info.h>
>>  #include <asm/memory.h>
>>  #include <asm/page.h>
>> +#ifdef CONFIG_ARM_KERNMEM_PERMS
>> +#include <asm/pgtable.h>
>> +#endif
>>
>>  #define PROC_INFO                                                    \
>>       . = ALIGN(4);                                                   \
>> @@ -90,6 +93,11 @@ SECTIONS
>>               _text = .;
>>               HEAD_TEXT
>>       }
>> +
>> +#ifdef CONFIG_ARM_KERNMEM_PERMS
>> +     . = ALIGN(1<<SECTION_SHIFT);
>> +#endif
>> +
>>       .text : {                       /* Real text segment            */
>>               _stext = .;             /* Text and read-only data      */
>>                       __exception_text_start = .;
>> @@ -145,7 +153,11 @@ SECTIONS
>>       _etext = .;                     /* End of text and rodata section */
>>
>>  #ifndef CONFIG_XIP_KERNEL
>> +# ifdef CONFIG_ARM_KERNMEM_PERMS
>> +     . = ALIGN(1<<SECTION_SHIFT);
>> +# else
>>       . = ALIGN(PAGE_SIZE);
>> +# endif
>
> This might be cleaner if we had a single macro (ALIGN_MIN?) that expanded to
> ALIGN(1 << SECTION_SHIFT) #ifdef  CONFIG_ARM_KERNMEM_PERMS.

I didn't see a cleaner way to do this, since some times it's optional
(no alignment instead of different alignment), and sometimes it's not
controlled by _KERNMEM_PERMS (i.e. sometimes it's _DEBUG_RODATA).

>
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index ad82c05bfc3a..ccf392ef40d4 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -32,6 +32,11 @@
>>  #include <asm/tlb.h>
>>  #include <asm/fixmap.h>
>>
>> +#ifdef CONFIG_ARM_KERNMEM_PERMS
>> +#include <asm/system_info.h>
>> +#include <asm/cp15.h>
>> +#endif
>
> We already #include cp15.h in this file. If you need system_info.h, I don't
> think you should bother making the include conditional.

Ah, yes. I will fix this.

>
>> +/*
>> + * Updates section permissions only for the current mm (sections are
>> + * copied into each mm). During startup, this is the init_mm.
>> + */
>> +static inline void section_update(unsigned long addr, pmdval_t mask,
>> +                               pmdval_t prot)
>> +{
>> +     struct mm_struct *mm;
>> +     pmd_t *pmd;
>> +
>> +     mm = current->active_mm;
>> +     pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
>> +
>> +#ifdef CONFIG_ARM_LPAE
>> +     pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
>> +#else
>> +     if (addr & SECTION_SIZE)
>> +             pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
>> +     else
>> +             pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
>> +#endif
>> +     flush_pmd_entry(pmd);
>> +     local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);
>
> Why only a local flush? You're changing global mappings here, right?

Yes, but with the a15 errata, it cannot use a global flush. As a
result, section_update can only be used by a single CPU which is how
the usage is managed. Perhaps I should add some comments to that
effect? (There was a thread a few months ago on this problem and this
shook out as a solution.)

>
>> +}
>> +
>> +/* Make sure extended page tables are in use. */
>> +static inline bool arch_has_strict_perms(void)
>> +{
>> +     unsigned int cr;
>> +
>> +     if (cpu_architecture() < CPU_ARCH_ARMv6)
>> +             return false;
>> +
>> +     cr = get_cr();
>> +     if (!(cr & CR_XP))
>> +             return false;
>> +
>> +     return true;
>
> return !!(get_cr() & CR_XP)?

Sure, I can reduce this.

>
>> +}
>> +
>> +#define set_section_perms(perms, field)      {                               \
>> +     size_t i;                                                       \
>> +     unsigned long addr;                                             \
>> +                                                                     \
>> +     if (!arch_has_strict_perms())                                   \
>> +             return;                                                 \
>> +                                                                     \
>> +     for (i = 0; i < ARRAY_SIZE(perms); i++) {                       \
>> +             if (!IS_ALIGNED(perms[i].start, SECTION_SIZE) ||        \
>> +                 !IS_ALIGNED(perms[i].end, SECTION_SIZE)) {          \
>> +                     pr_err("BUG: section %lx-%lx not aligned to %lx\n", \
>> +                             perms[i].start, perms[i].end,           \
>> +                             SECTION_SIZE);                          \
>> +                     continue;                                       \
>> +             }                                                       \
>> +                                                                     \
>> +             for (addr = perms[i].start;                             \
>> +                  addr < perms[i].end;                               \
>> +                  addr += SECTION_SIZE)                              \
>> +                     section_update(addr, perms[i].mask,             \
>> +                                    perms[i].field);                 \
>> +     }                                                               \
>> +}
>> +
>> +static inline void fix_kernmem_perms(void)
>> +{
>> +     set_section_perms(nx_perms, prot);
>> +}
>> +#else
>> +static inline void fix_kernmem_perms(void) { }
>> +#endif /* CONFIG_ARM_KERNMEM_PERMS */
>> +
>>  void free_initmem(void)
>>  {
>>  #ifdef CONFIG_HAVE_TCM
>>       extern char __tcm_start, __tcm_end;
>> +#endif
>> +
>> +     fix_kernmem_perms();
>>
>> +#ifdef CONFIG_HAVE_TCM
>
> You could avoid the double #ifdef by moving the tcm stuff into another
> function (free_tcmmem?)

Sure, I can do that.

Thanks!

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
@ 2014-08-20 12:37       ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-20 12:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 19, 2014 at 7:33 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Aug 13, 2014 at 06:06:32PM +0100, Kees Cook wrote:
>> Adds CONFIG_ARM_KERNMEM_PERMS to separate the kernel memory regions
>> into section-sized areas that can have different permisions. Performs
>> the NX permission changes during free_initmem, so that init memory can be
>> reclaimed.
>>
>> This uses section size instead of PMD size to reduce memory lost to
>> padding on non-LPAE systems.
>>
>> Based on work by Brad Spengler, Larry Bassel, and Laura Abbott.
>
> [...]
>
>> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
>> index 6f57cb94367f..a3d07ca2bbb4 100644
>> --- a/arch/arm/kernel/vmlinux.lds.S
>> +++ b/arch/arm/kernel/vmlinux.lds.S
>> @@ -8,6 +8,9 @@
>>  #include <asm/thread_info.h>
>>  #include <asm/memory.h>
>>  #include <asm/page.h>
>> +#ifdef CONFIG_ARM_KERNMEM_PERMS
>> +#include <asm/pgtable.h>
>> +#endif
>>
>>  #define PROC_INFO                                                    \
>>       . = ALIGN(4);                                                   \
>> @@ -90,6 +93,11 @@ SECTIONS
>>               _text = .;
>>               HEAD_TEXT
>>       }
>> +
>> +#ifdef CONFIG_ARM_KERNMEM_PERMS
>> +     . = ALIGN(1<<SECTION_SHIFT);
>> +#endif
>> +
>>       .text : {                       /* Real text segment            */
>>               _stext = .;             /* Text and read-only data      */
>>                       __exception_text_start = .;
>> @@ -145,7 +153,11 @@ SECTIONS
>>       _etext = .;                     /* End of text and rodata section */
>>
>>  #ifndef CONFIG_XIP_KERNEL
>> +# ifdef CONFIG_ARM_KERNMEM_PERMS
>> +     . = ALIGN(1<<SECTION_SHIFT);
>> +# else
>>       . = ALIGN(PAGE_SIZE);
>> +# endif
>
> This might be cleaner if we had a single macro (ALIGN_MIN?) that expanded to
> ALIGN(1 << SECTION_SHIFT) #ifdef  CONFIG_ARM_KERNMEM_PERMS.

I didn't see a cleaner way to do this, since some times it's optional
(no alignment instead of different alignment), and sometimes it's not
controlled by _KERNMEM_PERMS (i.e. sometimes it's _DEBUG_RODATA).

>
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index ad82c05bfc3a..ccf392ef40d4 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -32,6 +32,11 @@
>>  #include <asm/tlb.h>
>>  #include <asm/fixmap.h>
>>
>> +#ifdef CONFIG_ARM_KERNMEM_PERMS
>> +#include <asm/system_info.h>
>> +#include <asm/cp15.h>
>> +#endif
>
> We already #include cp15.h in this file. If you need system_info.h, I don't
> think you should bother making the include conditional.

Ah, yes. I will fix this.

>
>> +/*
>> + * Updates section permissions only for the current mm (sections are
>> + * copied into each mm). During startup, this is the init_mm.
>> + */
>> +static inline void section_update(unsigned long addr, pmdval_t mask,
>> +                               pmdval_t prot)
>> +{
>> +     struct mm_struct *mm;
>> +     pmd_t *pmd;
>> +
>> +     mm = current->active_mm;
>> +     pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
>> +
>> +#ifdef CONFIG_ARM_LPAE
>> +     pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
>> +#else
>> +     if (addr & SECTION_SIZE)
>> +             pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
>> +     else
>> +             pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
>> +#endif
>> +     flush_pmd_entry(pmd);
>> +     local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);
>
> Why only a local flush? You're changing global mappings here, right?

Yes, but with the a15 errata, it cannot use a global flush. As a
result, section_update can only be used by a single CPU which is how
the usage is managed. Perhaps I should add some comments to that
effect? (There was a thread a few months ago on this problem and this
shook out as a solution.)

>
>> +}
>> +
>> +/* Make sure extended page tables are in use. */
>> +static inline bool arch_has_strict_perms(void)
>> +{
>> +     unsigned int cr;
>> +
>> +     if (cpu_architecture() < CPU_ARCH_ARMv6)
>> +             return false;
>> +
>> +     cr = get_cr();
>> +     if (!(cr & CR_XP))
>> +             return false;
>> +
>> +     return true;
>
> return !!(get_cr() & CR_XP)?

Sure, I can reduce this.

>
>> +}
>> +
>> +#define set_section_perms(perms, field)      {                               \
>> +     size_t i;                                                       \
>> +     unsigned long addr;                                             \
>> +                                                                     \
>> +     if (!arch_has_strict_perms())                                   \
>> +             return;                                                 \
>> +                                                                     \
>> +     for (i = 0; i < ARRAY_SIZE(perms); i++) {                       \
>> +             if (!IS_ALIGNED(perms[i].start, SECTION_SIZE) ||        \
>> +                 !IS_ALIGNED(perms[i].end, SECTION_SIZE)) {          \
>> +                     pr_err("BUG: section %lx-%lx not aligned to %lx\n", \
>> +                             perms[i].start, perms[i].end,           \
>> +                             SECTION_SIZE);                          \
>> +                     continue;                                       \
>> +             }                                                       \
>> +                                                                     \
>> +             for (addr = perms[i].start;                             \
>> +                  addr < perms[i].end;                               \
>> +                  addr += SECTION_SIZE)                              \
>> +                     section_update(addr, perms[i].mask,             \
>> +                                    perms[i].field);                 \
>> +     }                                                               \
>> +}
>> +
>> +static inline void fix_kernmem_perms(void)
>> +{
>> +     set_section_perms(nx_perms, prot);
>> +}
>> +#else
>> +static inline void fix_kernmem_perms(void) { }
>> +#endif /* CONFIG_ARM_KERNMEM_PERMS */
>> +
>>  void free_initmem(void)
>>  {
>>  #ifdef CONFIG_HAVE_TCM
>>       extern char __tcm_start, __tcm_end;
>> +#endif
>> +
>> +     fix_kernmem_perms();
>>
>> +#ifdef CONFIG_HAVE_TCM
>
> You could avoid the double #ifdef by moving the tcm stuff into another
> function (free_tcmmem?)

Sure, I can do that.

Thanks!

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 8/8] ARM: mm: allow text and rodata sections to be read-only
  2014-08-19 12:36     ` Will Deacon
@ 2014-08-20 12:52       ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-20 12:52 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Tue, Aug 19, 2014 at 7:36 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Aug 13, 2014 at 06:06:33PM +0100, Kees Cook wrote:
>> This introduces CONFIG_DEBUG_RODATA, making kernel text and rodata
>> read-only. Additionally, this splits rodata from text so that rodata can
>> also be NX, which may lead to wasted memory when aligning to SECTION_SIZE.
>> The read-only areas are made writable during ftrace updates and kexec.
>
> [...]
>
>> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
>> index af9a8a927a4e..b8c75e45a950 100644
>> --- a/arch/arm/kernel/ftrace.c
>> +++ b/arch/arm/kernel/ftrace.c
>> @@ -15,6 +15,7 @@
>>  #include <linux/ftrace.h>
>>  #include <linux/uaccess.h>
>>  #include <linux/module.h>
>> +#include <linux/stop_machine.h>
>>
>>  #include <asm/cacheflush.h>
>>  #include <asm/opcodes.h>
>> @@ -35,6 +36,22 @@
>>
>>  #define      OLD_NOP         0xe1a00000      /* mov r0, r0 */
>>
>> +static int __ftrace_modify_code(void *data)
>> +{
>> +     int *command = data;
>> +
>> +     set_kernel_text_rw();
>> +     ftrace_modify_all_code(*command);
>> +     set_kernel_text_ro();
>> +
>> +     return 0;
>> +}
>> +
>> +void arch_ftrace_update_code(int command)
>> +{
>> +     stop_machine(__ftrace_modify_code, &command, NULL);
>> +}
>> +
>>  static unsigned long ftrace_nop_replace(struct dyn_ftrace *rec)
>>  {
>>       return rec->arch.old_mcount ? OLD_NOP : NOP;
>> @@ -73,6 +90,8 @@ int ftrace_arch_code_modify_prepare(void)
>>  int ftrace_arch_code_modify_post_process(void)
>>  {
>>       set_all_modules_text_ro();
>> +     /* Make sure any TLB misses during machine stop are cleared. */
>> +     flush_tlb_all();
>
> I'm afraid I don't understand what you're trying to achieve here. What do
> you mean by `clearing a TLB miss'?

The concern with the local TLB flush when using section_update is that
another CPU might come along and load the temporarily-writable page
permissions during the time the first CPU has called
set_kernel_text_rw() and set_kernel_text_ro(). The call here to
flush_tlb_all() is to make sure all CPUs have the correct page
permissions visible again.

(This is all to work around the a15 errata, and also part of the
output from the thread I mentioned in my 7/8 comment reply.)

>
> [...]
>
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index ccf392ef40d4..35c838da90d5 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -626,9 +626,10 @@ struct section_perm {
>>       unsigned long end;
>>       pmdval_t mask;
>>       pmdval_t prot;
>> +     pmdval_t clear;
>>  };
>>
>> -struct section_perm nx_perms[] = {
>> +static struct section_perm nx_perms[] = {
>>       /* Make pages tables, etc before _stext RW (set NX). */
>>       {
>>               .start  = PAGE_OFFSET,
>> @@ -643,8 +644,35 @@ struct section_perm nx_perms[] = {
>>               .mask   = ~PMD_SECT_XN,
>>               .prot   = PMD_SECT_XN,
>>       },
>> +#ifdef CONFIG_DEBUG_RODATA
>> +     /* Make rodata NX (set RO in ro_perms below). */
>> +     {
>> +             .start  = (unsigned long)__start_rodata,
>> +             .end    = (unsigned long)__init_begin,
>> +             .mask   = ~PMD_SECT_XN,
>> +             .prot   = PMD_SECT_XN,
>> +     },
>> +#endif
>>  };
>>
>> +#ifdef CONFIG_DEBUG_RODATA
>> +static struct section_perm ro_perms[] = {
>> +     /* Make kernel code and rodata RX (set RO). */
>> +     {
>> +             .start  = (unsigned long)_stext,
>> +             .end    = (unsigned long)__init_begin,
>> +#ifdef CONFIG_ARM_LPAE
>> +             .mask   = ~PMD_SECT_RDONLY,
>> +             .prot   = PMD_SECT_RDONLY,
>> +#else
>> +             .mask   = ~(PMD_SECT_APX | PMD_SECT_AP_WRITE),
>> +             .prot   = PMD_SECT_APX | PMD_SECT_AP_WRITE,
>> +             .clear  = PMD_SECT_AP_WRITE,
>> +#endif
>> +     },
>> +};
>> +#endif
>> +
>>  /*
>>   * Updates section permissions only for the current mm (sections are
>>   * copied into each mm). During startup, this is the init_mm.
>> @@ -713,6 +741,24 @@ static inline void fix_kernmem_perms(void)
>>  {
>>       set_section_perms(nx_perms, prot);
>>  }
>> +
>> +#ifdef CONFIG_DEBUG_RODATA
>> +void mark_rodata_ro(void)
>> +{
>> +     set_section_perms(ro_perms, prot);
>> +}
>> +
>> +void set_kernel_text_rw(void)
>> +{
>> +     set_section_perms(ro_perms, clear);
>> +}
>
> How does this work with LPAE? I don't see a populated clear field there.

LPAE's case has .clear=0 since it only needs the mask -- it has no
bits from the mask to set when clearing. Maybe I need better field
names. It was "'mask' used to unset bits" with "bits to set when
'prot'ecting" and "bits to set when 'clear'ing".

The non-LPAE case masks out "~(PMD_SECT_APX | PMD_SECT_AP_WRITE)" and
then sets either "PMD_SECT_APX | PMD_SECT_AP_WRITE" to set the ro
state, or sets "PMD_SECT_AP_WRITE" to clear the ro state.

The LPAE case masks out "~PMD_SECT_RDONLY" and then sets either
"PMD_SECT_RDONLY" to set the ro state, or sets nothing to clear the ro
state (the mask did everything needed to clear the ro state).

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 8/8] ARM: mm: allow text and rodata sections to be read-only
@ 2014-08-20 12:52       ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-20 12:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 19, 2014 at 7:36 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Aug 13, 2014 at 06:06:33PM +0100, Kees Cook wrote:
>> This introduces CONFIG_DEBUG_RODATA, making kernel text and rodata
>> read-only. Additionally, this splits rodata from text so that rodata can
>> also be NX, which may lead to wasted memory when aligning to SECTION_SIZE.
>> The read-only areas are made writable during ftrace updates and kexec.
>
> [...]
>
>> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
>> index af9a8a927a4e..b8c75e45a950 100644
>> --- a/arch/arm/kernel/ftrace.c
>> +++ b/arch/arm/kernel/ftrace.c
>> @@ -15,6 +15,7 @@
>>  #include <linux/ftrace.h>
>>  #include <linux/uaccess.h>
>>  #include <linux/module.h>
>> +#include <linux/stop_machine.h>
>>
>>  #include <asm/cacheflush.h>
>>  #include <asm/opcodes.h>
>> @@ -35,6 +36,22 @@
>>
>>  #define      OLD_NOP         0xe1a00000      /* mov r0, r0 */
>>
>> +static int __ftrace_modify_code(void *data)
>> +{
>> +     int *command = data;
>> +
>> +     set_kernel_text_rw();
>> +     ftrace_modify_all_code(*command);
>> +     set_kernel_text_ro();
>> +
>> +     return 0;
>> +}
>> +
>> +void arch_ftrace_update_code(int command)
>> +{
>> +     stop_machine(__ftrace_modify_code, &command, NULL);
>> +}
>> +
>>  static unsigned long ftrace_nop_replace(struct dyn_ftrace *rec)
>>  {
>>       return rec->arch.old_mcount ? OLD_NOP : NOP;
>> @@ -73,6 +90,8 @@ int ftrace_arch_code_modify_prepare(void)
>>  int ftrace_arch_code_modify_post_process(void)
>>  {
>>       set_all_modules_text_ro();
>> +     /* Make sure any TLB misses during machine stop are cleared. */
>> +     flush_tlb_all();
>
> I'm afraid I don't understand what you're trying to achieve here. What do
> you mean by `clearing a TLB miss'?

The concern with the local TLB flush when using section_update is that
another CPU might come along and load the temporarily-writable page
permissions during the time the first CPU has called
set_kernel_text_rw() and set_kernel_text_ro(). The call here to
flush_tlb_all() is to make sure all CPUs have the correct page
permissions visible again.

(This is all to work around the a15 errata, and also part of the
output from the thread I mentioned in my 7/8 comment reply.)

>
> [...]
>
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index ccf392ef40d4..35c838da90d5 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -626,9 +626,10 @@ struct section_perm {
>>       unsigned long end;
>>       pmdval_t mask;
>>       pmdval_t prot;
>> +     pmdval_t clear;
>>  };
>>
>> -struct section_perm nx_perms[] = {
>> +static struct section_perm nx_perms[] = {
>>       /* Make pages tables, etc before _stext RW (set NX). */
>>       {
>>               .start  = PAGE_OFFSET,
>> @@ -643,8 +644,35 @@ struct section_perm nx_perms[] = {
>>               .mask   = ~PMD_SECT_XN,
>>               .prot   = PMD_SECT_XN,
>>       },
>> +#ifdef CONFIG_DEBUG_RODATA
>> +     /* Make rodata NX (set RO in ro_perms below). */
>> +     {
>> +             .start  = (unsigned long)__start_rodata,
>> +             .end    = (unsigned long)__init_begin,
>> +             .mask   = ~PMD_SECT_XN,
>> +             .prot   = PMD_SECT_XN,
>> +     },
>> +#endif
>>  };
>>
>> +#ifdef CONFIG_DEBUG_RODATA
>> +static struct section_perm ro_perms[] = {
>> +     /* Make kernel code and rodata RX (set RO). */
>> +     {
>> +             .start  = (unsigned long)_stext,
>> +             .end    = (unsigned long)__init_begin,
>> +#ifdef CONFIG_ARM_LPAE
>> +             .mask   = ~PMD_SECT_RDONLY,
>> +             .prot   = PMD_SECT_RDONLY,
>> +#else
>> +             .mask   = ~(PMD_SECT_APX | PMD_SECT_AP_WRITE),
>> +             .prot   = PMD_SECT_APX | PMD_SECT_AP_WRITE,
>> +             .clear  = PMD_SECT_AP_WRITE,
>> +#endif
>> +     },
>> +};
>> +#endif
>> +
>>  /*
>>   * Updates section permissions only for the current mm (sections are
>>   * copied into each mm). During startup, this is the init_mm.
>> @@ -713,6 +741,24 @@ static inline void fix_kernmem_perms(void)
>>  {
>>       set_section_perms(nx_perms, prot);
>>  }
>> +
>> +#ifdef CONFIG_DEBUG_RODATA
>> +void mark_rodata_ro(void)
>> +{
>> +     set_section_perms(ro_perms, prot);
>> +}
>> +
>> +void set_kernel_text_rw(void)
>> +{
>> +     set_section_perms(ro_perms, clear);
>> +}
>
> How does this work with LPAE? I don't see a populated clear field there.

LPAE's case has .clear=0 since it only needs the mask -- it has no
bits from the mask to set when clearing. Maybe I need better field
names. It was "'mask' used to unset bits" with "bits to set when
'prot'ecting" and "bits to set when 'clear'ing".

The non-LPAE case masks out "~(PMD_SECT_APX | PMD_SECT_AP_WRITE)" and
then sets either "PMD_SECT_APX | PMD_SECT_AP_WRITE" to set the ro
state, or sets "PMD_SECT_AP_WRITE" to clear the ro state.

The LPAE case masks out "~PMD_SECT_RDONLY" and then sets either
"PMD_SECT_RDONLY" to set the ro state, or sets nothing to clear the ro
state (the mask did everything needed to clear the ro state).

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 2/8] ARM: expand fixmap region to 3MB
  2014-08-20 12:16       ` Kees Cook
@ 2014-08-26 14:37         ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-26 14:37 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Wed, Aug 20, 2014 at 01:16:34PM +0100, Kees Cook wrote:
> On Tue, Aug 19, 2014 at 7:26 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Aug 13, 2014 at 06:06:27PM +0100, Kees Cook wrote:
> >> From: Rob Herring <robh@kernel.org>
> >>
> >> With commit a05e54c103b0 ("ARM: 8031/2: change fixmap mapping region to
> >> support 32 CPUs"), the fixmap region was expanded to 2MB, but it
> >> precluded any other uses of the fixmap region. In order to support other
> >> uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
> >> adjacent 1MB range 0xffe00000-0xfff00000 is availabe.
> >>
> >> Remove fixmap_page_table ptr and lookup the page table via the virtual
> >> address so that the fixmap region can span more that one pmd. The 2nd
> >> pmd is already created since it is shared with the vector page.
> >
> > [...]
> >
> >> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
> >> index 45aeaaca9052..81061987ac45 100644
> >> --- a/arch/arm/mm/highmem.c
> >> +++ b/arch/arm/mm/highmem.c
> >> @@ -18,19 +18,20 @@
> >>  #include <asm/tlbflush.h>
> >>  #include "mm.h"
> >>
> >> -pte_t *fixmap_page_table;
> >> -
> >>  static inline void set_fixmap_pte(int idx, pte_t pte)
> >>  {
> >>       unsigned long vaddr = __fix_to_virt(idx);
> >> -     set_pte_ext(fixmap_page_table + idx, pte, 0);
> >> +     pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
> >> +
> >> +     set_pte_ext(ptep, pte, 0);
> >>       local_flush_tlb_kernel_page(vaddr);
> >>  }
> >
> > Minor comment, but we can drop the ifx parameter here now, right? It looks a
> > bit weird having a setter/getter pair of functions where the type signatures
> > aren't the inverse.
> 
> idx is used to get the vaddr here. I suppose the various users of
> these functions in highmem could be reworked, but that doesn't look
> sensible either. Maybe it's just the names of the functions that are
> confusing?

Sorry, I misread the context line as a deletion. Leave it like it is.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 2/8] ARM: expand fixmap region to 3MB
@ 2014-08-26 14:37         ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-26 14:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 20, 2014 at 01:16:34PM +0100, Kees Cook wrote:
> On Tue, Aug 19, 2014 at 7:26 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Aug 13, 2014 at 06:06:27PM +0100, Kees Cook wrote:
> >> From: Rob Herring <robh@kernel.org>
> >>
> >> With commit a05e54c103b0 ("ARM: 8031/2: change fixmap mapping region to
> >> support 32 CPUs"), the fixmap region was expanded to 2MB, but it
> >> precluded any other uses of the fixmap region. In order to support other
> >> uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
> >> adjacent 1MB range 0xffe00000-0xfff00000 is availabe.
> >>
> >> Remove fixmap_page_table ptr and lookup the page table via the virtual
> >> address so that the fixmap region can span more that one pmd. The 2nd
> >> pmd is already created since it is shared with the vector page.
> >
> > [...]
> >
> >> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
> >> index 45aeaaca9052..81061987ac45 100644
> >> --- a/arch/arm/mm/highmem.c
> >> +++ b/arch/arm/mm/highmem.c
> >> @@ -18,19 +18,20 @@
> >>  #include <asm/tlbflush.h>
> >>  #include "mm.h"
> >>
> >> -pte_t *fixmap_page_table;
> >> -
> >>  static inline void set_fixmap_pte(int idx, pte_t pte)
> >>  {
> >>       unsigned long vaddr = __fix_to_virt(idx);
> >> -     set_pte_ext(fixmap_page_table + idx, pte, 0);
> >> +     pte_t *ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
> >> +
> >> +     set_pte_ext(ptep, pte, 0);
> >>       local_flush_tlb_kernel_page(vaddr);
> >>  }
> >
> > Minor comment, but we can drop the ifx parameter here now, right? It looks a
> > bit weird having a setter/getter pair of functions where the type signatures
> > aren't the inverse.
> 
> idx is used to get the vaddr here. I suppose the various users of
> these functions in highmem could be reworked, but that doesn't look
> sensible either. Maybe it's just the names of the functions that are
> confusing?

Sorry, I misread the context line as a deletion. Leave it like it is.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
  2014-08-20 12:37       ` Kees Cook
@ 2014-08-26 14:43         ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-26 14:43 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Wed, Aug 20, 2014 at 01:37:14PM +0100, Kees Cook wrote:
> On Tue, Aug 19, 2014 at 7:33 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Aug 13, 2014 at 06:06:32PM +0100, Kees Cook wrote:
> >> +/*
> >> + * Updates section permissions only for the current mm (sections are
> >> + * copied into each mm). During startup, this is the init_mm.
> >> + */
> >> +static inline void section_update(unsigned long addr, pmdval_t mask,
> >> +                               pmdval_t prot)
> >> +{
> >> +     struct mm_struct *mm;
> >> +     pmd_t *pmd;
> >> +
> >> +     mm = current->active_mm;
> >> +     pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
> >> +
> >> +#ifdef CONFIG_ARM_LPAE
> >> +     pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> >> +#else
> >> +     if (addr & SECTION_SIZE)
> >> +             pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
> >> +     else
> >> +             pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> >> +#endif
> >> +     flush_pmd_entry(pmd);
> >> +     local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);
> >
> > Why only a local flush? You're changing global mappings here, right?
> 
> Yes, but with the a15 errata, it cannot use a global flush. As a
> result, section_update can only be used by a single CPU which is how
> the usage is managed. Perhaps I should add some comments to that
> effect? (There was a thread a few months ago on this problem and this
> shook out as a solution.)

Hmm, so do you mandate that preemption is disabled during sections of code
where the permissions must be changed after boot? (e.g. ftrace patching)

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
@ 2014-08-26 14:43         ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-08-26 14:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 20, 2014 at 01:37:14PM +0100, Kees Cook wrote:
> On Tue, Aug 19, 2014 at 7:33 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Aug 13, 2014 at 06:06:32PM +0100, Kees Cook wrote:
> >> +/*
> >> + * Updates section permissions only for the current mm (sections are
> >> + * copied into each mm). During startup, this is the init_mm.
> >> + */
> >> +static inline void section_update(unsigned long addr, pmdval_t mask,
> >> +                               pmdval_t prot)
> >> +{
> >> +     struct mm_struct *mm;
> >> +     pmd_t *pmd;
> >> +
> >> +     mm = current->active_mm;
> >> +     pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
> >> +
> >> +#ifdef CONFIG_ARM_LPAE
> >> +     pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> >> +#else
> >> +     if (addr & SECTION_SIZE)
> >> +             pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
> >> +     else
> >> +             pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> >> +#endif
> >> +     flush_pmd_entry(pmd);
> >> +     local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);
> >
> > Why only a local flush? You're changing global mappings here, right?
> 
> Yes, but with the a15 errata, it cannot use a global flush. As a
> result, section_update can only be used by a single CPU which is how
> the usage is managed. Perhaps I should add some comments to that
> effect? (There was a thread a few months ago on this problem and this
> shook out as a solution.)

Hmm, so do you mandate that preemption is disabled during sections of code
where the permissions must be changed after boot? (e.g. ftrace patching)

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
  2014-08-26 14:43         ` Will Deacon
@ 2014-08-29 16:04           ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-29 16:04 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Rob Herring, Laura Abbott, Leif Lindholm,
	Stephen Boyd, msalter, Rabin Vincent, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Tue, Aug 26, 2014 at 7:43 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Aug 20, 2014 at 01:37:14PM +0100, Kees Cook wrote:
>> On Tue, Aug 19, 2014 at 7:33 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > On Wed, Aug 13, 2014 at 06:06:32PM +0100, Kees Cook wrote:
>> >> +/*
>> >> + * Updates section permissions only for the current mm (sections are
>> >> + * copied into each mm). During startup, this is the init_mm.
>> >> + */
>> >> +static inline void section_update(unsigned long addr, pmdval_t mask,
>> >> +                               pmdval_t prot)
>> >> +{
>> >> +     struct mm_struct *mm;
>> >> +     pmd_t *pmd;
>> >> +
>> >> +     mm = current->active_mm;
>> >> +     pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
>> >> +
>> >> +#ifdef CONFIG_ARM_LPAE
>> >> +     pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
>> >> +#else
>> >> +     if (addr & SECTION_SIZE)
>> >> +             pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
>> >> +     else
>> >> +             pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
>> >> +#endif
>> >> +     flush_pmd_entry(pmd);
>> >> +     local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);
>> >
>> > Why only a local flush? You're changing global mappings here, right?
>>
>> Yes, but with the a15 errata, it cannot use a global flush. As a
>> result, section_update can only be used by a single CPU which is how
>> the usage is managed. Perhaps I should add some comments to that
>> effect? (There was a thread a few months ago on this problem and this
>> shook out as a solution.)
>
> Hmm, so do you mandate that preemption is disabled during sections of code
> where the permissions must be changed after boot? (e.g. ftrace patching)

I guess it's not enforced (how do I make sure preempt is off?), but
it's only used during the ftrace patching. I'm open to suggestions! (I
had avoided going the fixmap route since I would have expected that to
be much slower.)

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
@ 2014-08-29 16:04           ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-08-29 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 26, 2014 at 7:43 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Aug 20, 2014 at 01:37:14PM +0100, Kees Cook wrote:
>> On Tue, Aug 19, 2014 at 7:33 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > On Wed, Aug 13, 2014 at 06:06:32PM +0100, Kees Cook wrote:
>> >> +/*
>> >> + * Updates section permissions only for the current mm (sections are
>> >> + * copied into each mm). During startup, this is the init_mm.
>> >> + */
>> >> +static inline void section_update(unsigned long addr, pmdval_t mask,
>> >> +                               pmdval_t prot)
>> >> +{
>> >> +     struct mm_struct *mm;
>> >> +     pmd_t *pmd;
>> >> +
>> >> +     mm = current->active_mm;
>> >> +     pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
>> >> +
>> >> +#ifdef CONFIG_ARM_LPAE
>> >> +     pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
>> >> +#else
>> >> +     if (addr & SECTION_SIZE)
>> >> +             pmd[1] = __pmd((pmd_val(pmd[1]) & mask) | prot);
>> >> +     else
>> >> +             pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
>> >> +#endif
>> >> +     flush_pmd_entry(pmd);
>> >> +     local_flush_tlb_kernel_range(addr, addr + SECTION_SIZE);
>> >
>> > Why only a local flush? You're changing global mappings here, right?
>>
>> Yes, but with the a15 errata, it cannot use a global flush. As a
>> result, section_update can only be used by a single CPU which is how
>> the usage is managed. Perhaps I should add some comments to that
>> effect? (There was a thread a few months ago on this problem and this
>> shook out as a solution.)
>
> Hmm, so do you mandate that preemption is disabled during sections of code
> where the permissions must be changed after boot? (e.g. ftrace patching)

I guess it's not enforced (how do I make sure preempt is off?), but
it's only used during the ftrace patching. I'm open to suggestions! (I
had avoided going the fixmap route since I would have expected that to
be much slower.)

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
  2014-08-29 16:04           ` Kees Cook
@ 2014-08-31 14:59             ` Rabin Vincent
  -1 siblings, 0 replies; 52+ messages in thread
From: Rabin Vincent @ 2014-08-31 14:59 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Deacon, linux-kernel, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, msalter, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Fri, Aug 29, 2014 at 09:04:41AM -0700, Kees Cook wrote:
> On Tue, Aug 26, 2014 at 7:43 AM, Will Deacon <will.deacon@arm.com> wrote:
> > Hmm, so do you mandate that preemption is disabled during sections of code
> > where the permissions must be changed after boot? (e.g. ftrace patching)
> 
> I guess it's not enforced (how do I make sure preempt is off?), but
> it's only used during the ftrace patching. I'm open to suggestions! (I
> had avoided going the fixmap route since I would have expected that to
> be much slower.)

For ftrace patching this is called under stop_machine() so preemption is
already off there.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable
@ 2014-08-31 14:59             ` Rabin Vincent
  0 siblings, 0 replies; 52+ messages in thread
From: Rabin Vincent @ 2014-08-31 14:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 29, 2014 at 09:04:41AM -0700, Kees Cook wrote:
> On Tue, Aug 26, 2014 at 7:43 AM, Will Deacon <will.deacon@arm.com> wrote:
> > Hmm, so do you mandate that preemption is disabled during sections of code
> > where the permissions must be changed after boot? (e.g. ftrace patching)
> 
> I guess it's not enforced (how do I make sure preempt is off?), but
> it's only used during the ftrace patching. I'm open to suggestions! (I
> had avoided going the fixmap route since I would have expected that to
> be much slower.)

For ftrace patching this is called under stop_machine() so preemption is
already off there.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
  2014-08-20 12:28       ` Kees Cook
@ 2014-09-03 21:43         ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-09-03 21:43 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Rabin Vincent, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, msalter, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Wed, Aug 20, 2014 at 5:28 AM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
>> Hello,
>>
>> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
>>> From: Rabin Vincent <rabin@rab.in>
>>>
>>> Use fixmaps for text patching when the kernel text is read-only,
>>> inspired by x86.  This makes jump labels and kprobes work with the
>>> currently available CONFIG_DEBUG_SET_MODULE_RONX and the upcoming
>>> CONFIG_DEBUG_RODATA options.
>>>
>>> Signed-off-by: Rabin Vincent <rabin@rab.in>
>>> [kees: fixed up for merge with "arm: use generic fixmap.h"]
>>> [kees: added parse acquire/release annotations to pass C=1 builds]
>>> Signed-off-by: Kees Cook <keescook@chromium.org>
>>
>> [...]
>>
>>> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
>>> index 07314af47733..a1dce690446a 100644
>>> --- a/arch/arm/kernel/patch.c
>>> +++ b/arch/arm/kernel/patch.c
>>> @@ -1,8 +1,11 @@
>>>  #include <linux/kernel.h>
>>> +#include <linux/spinlock.h>
>>>  #include <linux/kprobes.h>
>>> +#include <linux/mm.h>
>>>  #include <linux/stop_machine.h>
>>>
>>>  #include <asm/cacheflush.h>
>>> +#include <asm/fixmap.h>
>>>  #include <asm/smp_plat.h>
>>>  #include <asm/opcodes.h>
>>>
>>> @@ -13,21 +16,77 @@ struct patch {
>>>       unsigned int insn;
>>>  };
>>>
>>> -void __kprobes __patch_text(void *addr, unsigned int insn)
>>> +static DEFINE_SPINLOCK(patch_lock);
>>> +
>>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
>>> +     __acquires(&patch_lock)
>>> +{
>>> +     unsigned int uintaddr = (uintptr_t) addr;
>>> +     bool module = !core_kernel_text(uintaddr);
>>> +     struct page *page;
>>> +
>>> +     if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
>>> +             page = vmalloc_to_page(addr);
>>> +     else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
>>> +             page = virt_to_page(addr);
>>> +     else
>>> +             return addr;
>>> +
>>> +     if (flags)
>>> +             spin_lock_irqsave(&patch_lock, *flags);
>>> +     else
>>> +             __acquire(&patch_lock);
>>
>> I don't understand the locking here. Why is it conditional, why do we need
>> to disable interrupts, and are you just racing against yourself?
>
> AIUI, the locking is here to avoid multiple users of the text poking
> fixmaps. It's conditional because there are two fixmaps
> (FIX_TEXT_POKE0 and FIX_TEXT_POKE1). Locking happens around 0 so
> locking around 1 is not needed since it is only ever used when 0 is in
> use. (__patch_text_real locks patch_lock before setting 0 when it uses
> remapping, and if it also needs 1, it doesn't have to lock since the
> lock is already held.)
>
>>> +     set_fixmap(fixmap, page_to_phys(page));
>>
>> set_fixmap does TLB invalidation, right? I think that means it can block on
>> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
>> interrupts disabled anyway.
>
> Oh right. Hrm.
>
> In an earlier version of this series set_fixmap did not perform TLB
> invalidation. I wonder if this is not needed at all? (Wouldn't that be
> nice...)

As suspected, my tests fail spectacularly without the TLB flush.
Adding WARN_ON(!irqs_disabled()) doesn't warn, so I think we're safe
here. Should I leave the WARN_ON in place for clarity, or some other
comments?

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
@ 2014-09-03 21:43         ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-09-03 21:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 20, 2014 at 5:28 AM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
>> Hello,
>>
>> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
>>> From: Rabin Vincent <rabin@rab.in>
>>>
>>> Use fixmaps for text patching when the kernel text is read-only,
>>> inspired by x86.  This makes jump labels and kprobes work with the
>>> currently available CONFIG_DEBUG_SET_MODULE_RONX and the upcoming
>>> CONFIG_DEBUG_RODATA options.
>>>
>>> Signed-off-by: Rabin Vincent <rabin@rab.in>
>>> [kees: fixed up for merge with "arm: use generic fixmap.h"]
>>> [kees: added parse acquire/release annotations to pass C=1 builds]
>>> Signed-off-by: Kees Cook <keescook@chromium.org>
>>
>> [...]
>>
>>> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
>>> index 07314af47733..a1dce690446a 100644
>>> --- a/arch/arm/kernel/patch.c
>>> +++ b/arch/arm/kernel/patch.c
>>> @@ -1,8 +1,11 @@
>>>  #include <linux/kernel.h>
>>> +#include <linux/spinlock.h>
>>>  #include <linux/kprobes.h>
>>> +#include <linux/mm.h>
>>>  #include <linux/stop_machine.h>
>>>
>>>  #include <asm/cacheflush.h>
>>> +#include <asm/fixmap.h>
>>>  #include <asm/smp_plat.h>
>>>  #include <asm/opcodes.h>
>>>
>>> @@ -13,21 +16,77 @@ struct patch {
>>>       unsigned int insn;
>>>  };
>>>
>>> -void __kprobes __patch_text(void *addr, unsigned int insn)
>>> +static DEFINE_SPINLOCK(patch_lock);
>>> +
>>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
>>> +     __acquires(&patch_lock)
>>> +{
>>> +     unsigned int uintaddr = (uintptr_t) addr;
>>> +     bool module = !core_kernel_text(uintaddr);
>>> +     struct page *page;
>>> +
>>> +     if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
>>> +             page = vmalloc_to_page(addr);
>>> +     else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
>>> +             page = virt_to_page(addr);
>>> +     else
>>> +             return addr;
>>> +
>>> +     if (flags)
>>> +             spin_lock_irqsave(&patch_lock, *flags);
>>> +     else
>>> +             __acquire(&patch_lock);
>>
>> I don't understand the locking here. Why is it conditional, why do we need
>> to disable interrupts, and are you just racing against yourself?
>
> AIUI, the locking is here to avoid multiple users of the text poking
> fixmaps. It's conditional because there are two fixmaps
> (FIX_TEXT_POKE0 and FIX_TEXT_POKE1). Locking happens around 0 so
> locking around 1 is not needed since it is only ever used when 0 is in
> use. (__patch_text_real locks patch_lock before setting 0 when it uses
> remapping, and if it also needs 1, it doesn't have to lock since the
> lock is already held.)
>
>>> +     set_fixmap(fixmap, page_to_phys(page));
>>
>> set_fixmap does TLB invalidation, right? I think that means it can block on
>> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
>> interrupts disabled anyway.
>
> Oh right. Hrm.
>
> In an earlier version of this series set_fixmap did not perform TLB
> invalidation. I wonder if this is not needed at all? (Wouldn't that be
> nice...)

As suspected, my tests fail spectacularly without the TLB flush.
Adding WARN_ON(!irqs_disabled()) doesn't warn, so I think we're safe
here. Should I leave the WARN_ON in place for clarity, or some other
comments?

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
  2014-09-03 21:43         ` Kees Cook
@ 2014-09-04  9:27           ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-09-04  9:27 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rabin Vincent, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, msalter, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Wed, Sep 03, 2014 at 10:43:58PM +0100, Kees Cook wrote:
> On Wed, Aug 20, 2014 at 5:28 AM, Kees Cook <keescook@chromium.org> wrote:
> > On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
> >>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
> >>> +     __acquires(&patch_lock)
> >>> +{
> >>> +     unsigned int uintaddr = (uintptr_t) addr;
> >>> +     bool module = !core_kernel_text(uintaddr);
> >>> +     struct page *page;
> >>> +
> >>> +     if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
> >>> +             page = vmalloc_to_page(addr);
> >>> +     else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
> >>> +             page = virt_to_page(addr);
> >>> +     else
> >>> +             return addr;
> >>> +
> >>> +     if (flags)
> >>> +             spin_lock_irqsave(&patch_lock, *flags);
> >>> +     else
> >>> +             __acquire(&patch_lock);
> >>
> >> I don't understand the locking here. Why is it conditional, why do we need
> >> to disable interrupts, and are you just racing against yourself?
> >
> > AIUI, the locking is here to avoid multiple users of the text poking
> > fixmaps. It's conditional because there are two fixmaps
> > (FIX_TEXT_POKE0 and FIX_TEXT_POKE1). Locking happens around 0 so
> > locking around 1 is not needed since it is only ever used when 0 is in
> > use. (__patch_text_real locks patch_lock before setting 0 when it uses
> > remapping, and if it also needs 1, it doesn't have to lock since the
> > lock is already held.)
> >
> >>> +     set_fixmap(fixmap, page_to_phys(page));
> >>
> >> set_fixmap does TLB invalidation, right? I think that means it can block on
> >> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
> >> interrupts disabled anyway.
> >
> > Oh right. Hrm.
> >
> > In an earlier version of this series set_fixmap did not perform TLB
> > invalidation. I wonder if this is not needed at all? (Wouldn't that be
> > nice...)
> 
> As suspected, my tests fail spectacularly without the TLB flush.
> Adding WARN_ON(!irqs_disabled()) doesn't warn, so I think we're safe
> here. Should I leave the WARN_ON in place for clarity, or some other
> comments?

I thought there was a potential call to spin_lock_irqsave right before
this TLB flush?

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
@ 2014-09-04  9:27           ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-09-04  9:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 03, 2014 at 10:43:58PM +0100, Kees Cook wrote:
> On Wed, Aug 20, 2014 at 5:28 AM, Kees Cook <keescook@chromium.org> wrote:
> > On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
> >>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
> >>> +     __acquires(&patch_lock)
> >>> +{
> >>> +     unsigned int uintaddr = (uintptr_t) addr;
> >>> +     bool module = !core_kernel_text(uintaddr);
> >>> +     struct page *page;
> >>> +
> >>> +     if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
> >>> +             page = vmalloc_to_page(addr);
> >>> +     else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
> >>> +             page = virt_to_page(addr);
> >>> +     else
> >>> +             return addr;
> >>> +
> >>> +     if (flags)
> >>> +             spin_lock_irqsave(&patch_lock, *flags);
> >>> +     else
> >>> +             __acquire(&patch_lock);
> >>
> >> I don't understand the locking here. Why is it conditional, why do we need
> >> to disable interrupts, and are you just racing against yourself?
> >
> > AIUI, the locking is here to avoid multiple users of the text poking
> > fixmaps. It's conditional because there are two fixmaps
> > (FIX_TEXT_POKE0 and FIX_TEXT_POKE1). Locking happens around 0 so
> > locking around 1 is not needed since it is only ever used when 0 is in
> > use. (__patch_text_real locks patch_lock before setting 0 when it uses
> > remapping, and if it also needs 1, it doesn't have to lock since the
> > lock is already held.)
> >
> >>> +     set_fixmap(fixmap, page_to_phys(page));
> >>
> >> set_fixmap does TLB invalidation, right? I think that means it can block on
> >> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
> >> interrupts disabled anyway.
> >
> > Oh right. Hrm.
> >
> > In an earlier version of this series set_fixmap did not perform TLB
> > invalidation. I wonder if this is not needed at all? (Wouldn't that be
> > nice...)
> 
> As suspected, my tests fail spectacularly without the TLB flush.
> Adding WARN_ON(!irqs_disabled()) doesn't warn, so I think we're safe
> here. Should I leave the WARN_ON in place for clarity, or some other
> comments?

I thought there was a potential call to spin_lock_irqsave right before
this TLB flush?

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
  2014-09-04  9:27           ` Will Deacon
@ 2014-09-04 14:00             ` Kees Cook
  -1 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-09-04 14:00 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Rabin Vincent, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, msalter, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Thu, Sep 4, 2014 at 2:27 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Sep 03, 2014 at 10:43:58PM +0100, Kees Cook wrote:
>> On Wed, Aug 20, 2014 at 5:28 AM, Kees Cook <keescook@chromium.org> wrote:
>> > On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
>> >> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
>> >>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
>> >>> +     __acquires(&patch_lock)
>> >>> +{
>> >>> +     unsigned int uintaddr = (uintptr_t) addr;
>> >>> +     bool module = !core_kernel_text(uintaddr);
>> >>> +     struct page *page;
>> >>> +
>> >>> +     if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
>> >>> +             page = vmalloc_to_page(addr);
>> >>> +     else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
>> >>> +             page = virt_to_page(addr);
>> >>> +     else
>> >>> +             return addr;
>> >>> +
>> >>> +     if (flags)
>> >>> +             spin_lock_irqsave(&patch_lock, *flags);
>> >>> +     else
>> >>> +             __acquire(&patch_lock);
>> >>
>> >> I don't understand the locking here. Why is it conditional, why do we need
>> >> to disable interrupts, and are you just racing against yourself?
>> >
>> > AIUI, the locking is here to avoid multiple users of the text poking
>> > fixmaps. It's conditional because there are two fixmaps
>> > (FIX_TEXT_POKE0 and FIX_TEXT_POKE1). Locking happens around 0 so
>> > locking around 1 is not needed since it is only ever used when 0 is in
>> > use. (__patch_text_real locks patch_lock before setting 0 when it uses
>> > remapping, and if it also needs 1, it doesn't have to lock since the
>> > lock is already held.)
>> >
>> >>> +     set_fixmap(fixmap, page_to_phys(page));
>> >>
>> >> set_fixmap does TLB invalidation, right? I think that means it can block on
>> >> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
>> >> interrupts disabled anyway.
>> >
>> > Oh right. Hrm.
>> >
>> > In an earlier version of this series set_fixmap did not perform TLB
>> > invalidation. I wonder if this is not needed at all? (Wouldn't that be
>> > nice...)
>>
>> As suspected, my tests fail spectacularly without the TLB flush.
>> Adding WARN_ON(!irqs_disabled()) doesn't warn, so I think we're safe
>> here. Should I leave the WARN_ON in place for clarity, or some other
>> comments?
>
> I thought there was a potential call to spin_lock_irqsave right before
> this TLB flush?

I'm not sure I understand what you mean. Should I change something
here? It looks like irqs are disabled, so isn't this a safe code path?

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
@ 2014-09-04 14:00             ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2014-09-04 14:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 4, 2014 at 2:27 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Sep 03, 2014 at 10:43:58PM +0100, Kees Cook wrote:
>> On Wed, Aug 20, 2014 at 5:28 AM, Kees Cook <keescook@chromium.org> wrote:
>> > On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
>> >> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
>> >>> +static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
>> >>> +     __acquires(&patch_lock)
>> >>> +{
>> >>> +     unsigned int uintaddr = (uintptr_t) addr;
>> >>> +     bool module = !core_kernel_text(uintaddr);
>> >>> +     struct page *page;
>> >>> +
>> >>> +     if (module && IS_ENABLED(CONFIG_DEBUG_SET_MODULE_RONX))
>> >>> +             page = vmalloc_to_page(addr);
>> >>> +     else if (!module && IS_ENABLED(CONFIG_DEBUG_RODATA))
>> >>> +             page = virt_to_page(addr);
>> >>> +     else
>> >>> +             return addr;
>> >>> +
>> >>> +     if (flags)
>> >>> +             spin_lock_irqsave(&patch_lock, *flags);
>> >>> +     else
>> >>> +             __acquire(&patch_lock);
>> >>
>> >> I don't understand the locking here. Why is it conditional, why do we need
>> >> to disable interrupts, and are you just racing against yourself?
>> >
>> > AIUI, the locking is here to avoid multiple users of the text poking
>> > fixmaps. It's conditional because there are two fixmaps
>> > (FIX_TEXT_POKE0 and FIX_TEXT_POKE1). Locking happens around 0 so
>> > locking around 1 is not needed since it is only ever used when 0 is in
>> > use. (__patch_text_real locks patch_lock before setting 0 when it uses
>> > remapping, and if it also needs 1, it doesn't have to lock since the
>> > lock is already held.)
>> >
>> >>> +     set_fixmap(fixmap, page_to_phys(page));
>> >>
>> >> set_fixmap does TLB invalidation, right? I think that means it can block on
>> >> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
>> >> interrupts disabled anyway.
>> >
>> > Oh right. Hrm.
>> >
>> > In an earlier version of this series set_fixmap did not perform TLB
>> > invalidation. I wonder if this is not needed at all? (Wouldn't that be
>> > nice...)
>>
>> As suspected, my tests fail spectacularly without the TLB flush.
>> Adding WARN_ON(!irqs_disabled()) doesn't warn, so I think we're safe
>> here. Should I leave the WARN_ON in place for clarity, or some other
>> comments?
>
> I thought there was a potential call to spin_lock_irqsave right before
> this TLB flush?

I'm not sure I understand what you mean. Should I change something
here? It looks like irqs are disabled, so isn't this a safe code path?

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
  2014-09-04 14:00             ` Kees Cook
@ 2014-09-04 14:06               ` Will Deacon
  -1 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-09-04 14:06 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Rabin Vincent, Rob Herring, Laura Abbott,
	Leif Lindholm, Stephen Boyd, msalter, Liu hua, Nikolay Borisov,
	Nicolas Pitre, Tomasz Figa, Doug Anderson, Jason Wessel,
	Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-doc

On Thu, Sep 04, 2014 at 03:00:31PM +0100, Kees Cook wrote:
> On Thu, Sep 4, 2014 at 2:27 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Sep 03, 2014 at 10:43:58PM +0100, Kees Cook wrote:
> >> On Wed, Aug 20, 2014 at 5:28 AM, Kees Cook <keescook@chromium.org> wrote:
> >> > On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> >> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
> >> >>> +     set_fixmap(fixmap, page_to_phys(page));
> >> >>
> >> >> set_fixmap does TLB invalidation, right? I think that means it can block on
> >> >> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
> >> >> interrupts disabled anyway.
> >> >
> >> > Oh right. Hrm.
> >> >
> >> > In an earlier version of this series set_fixmap did not perform TLB
> >> > invalidation. I wonder if this is not needed at all? (Wouldn't that be
> >> > nice...)
> >>
> >> As suspected, my tests fail spectacularly without the TLB flush.
> >> Adding WARN_ON(!irqs_disabled()) doesn't warn, so I think we're safe
> >> here. Should I leave the WARN_ON in place for clarity, or some other
> >> comments?
> >
> > I thought there was a potential call to spin_lock_irqsave right before
> > this TLB flush?
> 
> I'm not sure I understand what you mean. Should I change something
> here? It looks like irqs are disabled, so isn't this a safe code path?

My point was that it's not safe to call set_fixmap when interrupts are
disabled, because it will try to flush the TLB, and this can lock-up on
some CPUs where it needs to do something like smp_call_function.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 4/8] arm: use fixmap for text patching when text is RO
@ 2014-09-04 14:06               ` Will Deacon
  0 siblings, 0 replies; 52+ messages in thread
From: Will Deacon @ 2014-09-04 14:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 04, 2014 at 03:00:31PM +0100, Kees Cook wrote:
> On Thu, Sep 4, 2014 at 2:27 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Sep 03, 2014 at 10:43:58PM +0100, Kees Cook wrote:
> >> On Wed, Aug 20, 2014 at 5:28 AM, Kees Cook <keescook@chromium.org> wrote:
> >> > On Tue, Aug 19, 2014 at 7:29 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> >> On Wed, Aug 13, 2014 at 06:06:29PM +0100, Kees Cook wrote:
> >> >>> +     set_fixmap(fixmap, page_to_phys(page));
> >> >>
> >> >> set_fixmap does TLB invalidation, right? I think that means it can block on
> >> >> 11MPCore and A15 w/ the TLBI erratum, so it's not safe to call this with
> >> >> interrupts disabled anyway.
> >> >
> >> > Oh right. Hrm.
> >> >
> >> > In an earlier version of this series set_fixmap did not perform TLB
> >> > invalidation. I wonder if this is not needed at all? (Wouldn't that be
> >> > nice...)
> >>
> >> As suspected, my tests fail spectacularly without the TLB flush.
> >> Adding WARN_ON(!irqs_disabled()) doesn't warn, so I think we're safe
> >> here. Should I leave the WARN_ON in place for clarity, or some other
> >> comments?
> >
> > I thought there was a potential call to spin_lock_irqsave right before
> > this TLB flush?
> 
> I'm not sure I understand what you mean. Should I change something
> here? It looks like irqs are disabled, so isn't this a safe code path?

My point was that it's not safe to call set_fixmap when interrupts are
disabled, because it will try to flush the TLB, and this can lock-up on
some CPUs where it needs to do something like smp_call_function.

Will

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2014-09-04 14:06 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-13 17:06 [PATCH v4 0/8] arm: support CONFIG_RODATA Kees Cook
2014-08-13 17:06 ` Kees Cook
2014-08-13 17:06 ` [PATCH v4 1/8] arm: use generic fixmap.h Kees Cook
2014-08-13 17:06   ` Kees Cook
2014-08-13 17:06 ` [PATCH v4 2/8] ARM: expand fixmap region to 3MB Kees Cook
2014-08-13 17:06   ` Kees Cook
2014-08-19 12:26   ` Will Deacon
2014-08-19 12:26     ` Will Deacon
2014-08-20 12:16     ` Kees Cook
2014-08-20 12:16       ` Kees Cook
2014-08-26 14:37       ` Will Deacon
2014-08-26 14:37         ` Will Deacon
2014-08-13 17:06 ` [PATCH v4 3/8] arm: fixmap: implement __set_fixmap() Kees Cook
2014-08-13 17:06   ` Kees Cook
2014-08-13 17:06 ` [PATCH v4 4/8] arm: use fixmap for text patching when text is RO Kees Cook
2014-08-13 17:06   ` Kees Cook
2014-08-19 12:29   ` Will Deacon
2014-08-19 12:29     ` Will Deacon
2014-08-20 12:28     ` Kees Cook
2014-08-20 12:28       ` Kees Cook
2014-09-03 21:43       ` Kees Cook
2014-09-03 21:43         ` Kees Cook
2014-09-04  9:27         ` Will Deacon
2014-09-04  9:27           ` Will Deacon
2014-09-04 14:00           ` Kees Cook
2014-09-04 14:00             ` Kees Cook
2014-09-04 14:06             ` Will Deacon
2014-09-04 14:06               ` Will Deacon
2014-08-13 17:06 ` [PATCH v4 5/8] ARM: kexec: Make .text R/W in machine_kexec Kees Cook
2014-08-13 17:06   ` Kees Cook
2014-08-13 17:06 ` [PATCH v4 6/8] arm: kgdb: Handle read-only text / modules Kees Cook
2014-08-13 17:06   ` Kees Cook
2014-08-13 17:06 ` [PATCH v4 7/8] ARM: mm: allow non-text sections to be non-executable Kees Cook
2014-08-13 17:06   ` Kees Cook
2014-08-19 12:33   ` Will Deacon
2014-08-19 12:33     ` Will Deacon
2014-08-20 12:37     ` Kees Cook
2014-08-20 12:37       ` Kees Cook
2014-08-26 14:43       ` Will Deacon
2014-08-26 14:43         ` Will Deacon
2014-08-29 16:04         ` Kees Cook
2014-08-29 16:04           ` Kees Cook
2014-08-31 14:59           ` Rabin Vincent
2014-08-31 14:59             ` Rabin Vincent
2014-08-13 17:06 ` [PATCH v4 8/8] ARM: mm: allow text and rodata sections to be read-only Kees Cook
2014-08-13 17:06   ` Kees Cook
2014-08-19 12:36   ` Will Deacon
2014-08-19 12:36     ` Will Deacon
2014-08-20 12:52     ` Kees Cook
2014-08-20 12:52       ` Kees Cook
2014-08-13 17:38 ` [PATCH v4 0/8] arm: support CONFIG_RODATA Nicolas Pitre
2014-08-13 17:38   ` Nicolas Pitre

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.