All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] ARM: mm: allow for stricter kernel memory perms
@ 2014-02-14  1:04 ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-14  1:04 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: keescook, Russell King, Laura Abbott, Larry Bassel,
	Catalin Marinas, Stephen Rothwell, Greg Kroah-Hartman,
	Christoffer Dall, Marc Zyngier, Jonathan Austin, Simon Baatz,
	Nicolas Pitre, Dave Martin, Will Deacon, Uwe Kleine-König,
	Ben Dooks, Andrew Morton, Santosh Shilimkar, Jiang Liu,
	Grant Likely, Rob Herring, Vitaly Andrianov, linux-kernel

This series of patches allows the ARM kernel page tables to gain better
permission separation. With a fixed[1] CONFIG_ARM_PTDUMP enabled, you
can see the before and after in /sys/kernel/debug/kernel_page_tables.

Before:
---[ Kernel Mapping ]---
0xc0000000-0xc0800000           8M     RW x  SHD
0xc0800000-0xc1e00000          22M     RW NX SHD
0xc2000000-0xc3000000          16M     RW x  SHD
0xc3800000-0xd1000000         216M     RW x  SHD
0xd1800000-0xef800000         480M     RW x  SHD

After:
---[ Kernel Mapping ]---
0xc0000000-0xc0100000           1M     RW NX SHD
0xc0100000-0xc0700000           6M     ro x  SHD
0xc0700000-0xc0a00000           3M     ro NX SHD
0xc0a00000-0xc1e00000          20M     RW NX SHD
0xc2000000-0xc3000000          16M     RW NX SHD
0xc3800000-0xd1000000         216M     RW NX SHD
0xd1800000-0xef800000         480M     RW NX SHD

This is available via CONFIG_ARM_KERNMEM_PERMS and CONFIG_DEBUG_RODATA.
The latter exists to match the x86 option of the same name, and is
left as a configurable since each additional region adds more potential
memory padding.

The series is based on earlier work from Brad Spengler, Larry Bassel,
and Laura Abbott.

Thanks,

-Kees

[1] these patches are needed to get the correct output:
    https://lkml.org/lkml/2014/2/12/662
    https://lkml.org/lkml/2014/2/12/663


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 0/2] ARM: mm: allow for stricter kernel memory perms
@ 2014-02-14  1:04 ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-14  1:04 UTC (permalink / raw)
  To: linux-arm-kernel

This series of patches allows the ARM kernel page tables to gain better
permission separation. With a fixed[1] CONFIG_ARM_PTDUMP enabled, you
can see the before and after in /sys/kernel/debug/kernel_page_tables.

Before:
---[ Kernel Mapping ]---
0xc0000000-0xc0800000           8M     RW x  SHD
0xc0800000-0xc1e00000          22M     RW NX SHD
0xc2000000-0xc3000000          16M     RW x  SHD
0xc3800000-0xd1000000         216M     RW x  SHD
0xd1800000-0xef800000         480M     RW x  SHD

After:
---[ Kernel Mapping ]---
0xc0000000-0xc0100000           1M     RW NX SHD
0xc0100000-0xc0700000           6M     ro x  SHD
0xc0700000-0xc0a00000           3M     ro NX SHD
0xc0a00000-0xc1e00000          20M     RW NX SHD
0xc2000000-0xc3000000          16M     RW NX SHD
0xc3800000-0xd1000000         216M     RW NX SHD
0xd1800000-0xef800000         480M     RW NX SHD

This is available via CONFIG_ARM_KERNMEM_PERMS and CONFIG_DEBUG_RODATA.
The latter exists to match the x86 option of the same name, and is
left as a configurable since each additional region adds more potential
memory padding.

The series is based on earlier work from Brad Spengler, Larry Bassel,
and Laura Abbott.

Thanks,

-Kees

[1] these patches are needed to get the correct output:
    https://lkml.org/lkml/2014/2/12/662
    https://lkml.org/lkml/2014/2/12/663

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/2] ARM: mm: allow for stricter kernel memory perms
  2014-02-14  1:04 ` Kees Cook
@ 2014-02-14  1:04   ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-14  1:04 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: keescook, Russell King, Laura Abbott, Larry Bassel,
	Catalin Marinas, Stephen Rothwell, Greg Kroah-Hartman,
	Christoffer Dall, Marc Zyngier, Jonathan Austin, Simon Baatz,
	Nicolas Pitre, Dave Martin, Will Deacon, Uwe Kleine-König,
	Ben Dooks, Andrew Morton, Santosh Shilimkar, Jiang Liu,
	Grant Likely, Rob Herring, Vitaly Andrianov, linux-kernel

Adds CONFIG_ARM_KERNMEM_PERMS to separate the kernel memory regions
into section-sized areas that can have different permisions. Performs
the permission changes during free_initmem.

This uses section size instead of PMD size to reduce memory caps on
non-LPAE systems.

Based on work by Brad Spengler, Larry Bassel, and Laura Abbott.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/vmlinux.lds.S |   17 +++++++++
 arch/arm/mm/Kconfig           |   10 +++++
 arch/arm/mm/init.c            |   84 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 111 insertions(+)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 7bcee5c9b604..08fa667ef2f1 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -8,6 +8,9 @@
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/page.h>
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+#include <asm/pgtable.h>
+#endif
 	
 #define PROC_INFO							\
 	. = ALIGN(4);							\
@@ -90,6 +93,11 @@ SECTIONS
 		_text = .;
 		HEAD_TEXT
 	}
+
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
+
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
 			__exception_text_start = .;
@@ -145,7 +153,11 @@ SECTIONS
 	_etext = .;			/* End of text and rodata section */
 
 #ifndef CONFIG_XIP_KERNEL
+# ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+# else
 	. = ALIGN(PAGE_SIZE);
+# endif
 	__init_begin = .;
 #endif
 	/*
@@ -220,7 +232,12 @@ SECTIONS
 	. = PAGE_OFFSET + TEXT_OFFSET;
 #else
 	__init_end = .;
+
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+#else
 	. = ALIGN(THREAD_SIZE);
+#endif
 	__data_loc = .;
 #endif
 
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 1f8fed94c2a4..999eb505faee 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -958,3 +958,13 @@ config ARCH_SUPPORTS_BIG_ENDIAN
 	help
 	  This option specifies the architecture can support big endian
 	  operation.
+
+config ARM_KERNMEM_PERMS
+	bool "Restrict kernel memory permissions"
+	help
+	  If this is set, kernel text will be made RX, kernel data and stack
+	  RW (otherwise all of the regions of the kernel 1-to-1 mapping
+	  outside section boundaries remains RWX). The tradeoff is that each
+	  region is padded to section-size (1MiB) boundaries (because their
+	  permissions are different and splitting the 1M pages into 4K ones
+	  causes TLB performance problems), wasting memory.
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 804d61566a53..f0b1df53f436 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -31,6 +31,11 @@
 #include <asm/tlb.h>
 #include <asm/fixmap.h>
 
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+#include <asm/system_info.h>
+#include <asm/cp15.h>
+#endif
+
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
 
@@ -621,11 +626,90 @@ void __init mem_init(void)
 	}
 }
 
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+struct section_perm {
+	unsigned long start;
+	unsigned long end;
+	pmdval_t prot;
+};
+
+struct section_perm __initdata section_perms[] = {
+	/* Make pages tables, etc before _stext RW (set NX). */
+	{
+		.start	= PAGE_OFFSET,
+		.end	= (unsigned long)_stext,
+		.prot	= PMD_SECT_XN,
+	},
+	/* Make init RW (set NX). */
+	{
+		.start	= (unsigned long)__init_begin,
+		.end	= (unsigned long)_sdata,
+		.prot	= PMD_SECT_XN,
+	},
+	/* Make kernel code and rodata RX (set RO). */
+	{
+		.start	= (unsigned long)_stext,
+		.end	= (unsigned long)__init_begin,
+#ifdef CONFIG_ARM_LPAE
+		.prot	= PMD_SECT_RDONLY,
+#else
+		.prot	= PMD_SECT_APX | PMD_SECT_AP_WRITE,
+#endif
+	},
+};
+
+static inline void section_update(unsigned long addr, pmdval_t prot)
+{
+	pmd_t *pmd = pmd_off_k(addr);
+
+#ifdef CONFIG_ARM_LPAE
+	pmd[0] = __pmd(pmd_val(pmd[0]) | prot);
+#else
+	if (addr & SECTION_SIZE)
+		pmd[1] = __pmd(pmd_val(pmd[1]) | prot);
+	else
+		pmd[0] = __pmd(pmd_val(pmd[0]) | prot);
+#endif
+	flush_pmd_entry(pmd);
+}
+
+static inline void fix_kernmem_perms(void)
+{
+	unsigned long addr;
+	int cpu_arch = cpu_architecture();
+	unsigned int i, cr = get_cr();
+
+	if (cpu_arch < CPU_ARCH_ARMv6 || !(cr & CR_XP))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(section_perms); i++) {
+		if (!IS_ALIGNED(section_perms[i].start, SECTION_SIZE) ||
+		    !IS_ALIGNED(section_perms[i].end, SECTION_SIZE)) {
+			pr_err("BUG: section %lx-%lx not aligned to %lx\n",
+				section_perms[i].start, section_perms[i].end,
+				SECTION_SIZE);
+			continue;
+		}
+
+		for (addr = section_perms[i].start;
+		     addr < section_perms[i].end;
+		     addr += SECTION_SIZE)
+			section_update(addr, section_perms[i].prot);
+	}
+}
+#else
+static inline void fix_kernmem_perms(void) { }
+#endif /* CONFIG_ARM_KERNMEM_PERMS */
+
 void free_initmem(void)
 {
 #ifdef CONFIG_HAVE_TCM
 	extern char __tcm_start, __tcm_end;
+#endif
+
+	fix_kernmem_perms();
 
+#ifdef CONFIG_HAVE_TCM
 	poison_init_mem(&__tcm_start, &__tcm_end - &__tcm_start);
 	free_reserved_area(&__tcm_start, &__tcm_end, -1, "TCM link");
 #endif
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 1/2] ARM: mm: allow for stricter kernel memory perms
@ 2014-02-14  1:04   ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-14  1:04 UTC (permalink / raw)
  To: linux-arm-kernel

Adds CONFIG_ARM_KERNMEM_PERMS to separate the kernel memory regions
into section-sized areas that can have different permisions. Performs
the permission changes during free_initmem.

This uses section size instead of PMD size to reduce memory caps on
non-LPAE systems.

Based on work by Brad Spengler, Larry Bassel, and Laura Abbott.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/vmlinux.lds.S |   17 +++++++++
 arch/arm/mm/Kconfig           |   10 +++++
 arch/arm/mm/init.c            |   84 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 111 insertions(+)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 7bcee5c9b604..08fa667ef2f1 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -8,6 +8,9 @@
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/page.h>
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+#include <asm/pgtable.h>
+#endif
 	
 #define PROC_INFO							\
 	. = ALIGN(4);							\
@@ -90,6 +93,11 @@ SECTIONS
 		_text = .;
 		HEAD_TEXT
 	}
+
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
+
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
 			__exception_text_start = .;
@@ -145,7 +153,11 @@ SECTIONS
 	_etext = .;			/* End of text and rodata section */
 
 #ifndef CONFIG_XIP_KERNEL
+# ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+# else
 	. = ALIGN(PAGE_SIZE);
+# endif
 	__init_begin = .;
 #endif
 	/*
@@ -220,7 +232,12 @@ SECTIONS
 	. = PAGE_OFFSET + TEXT_OFFSET;
 #else
 	__init_end = .;
+
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+	. = ALIGN(1<<SECTION_SHIFT);
+#else
 	. = ALIGN(THREAD_SIZE);
+#endif
 	__data_loc = .;
 #endif
 
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 1f8fed94c2a4..999eb505faee 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -958,3 +958,13 @@ config ARCH_SUPPORTS_BIG_ENDIAN
 	help
 	  This option specifies the architecture can support big endian
 	  operation.
+
+config ARM_KERNMEM_PERMS
+	bool "Restrict kernel memory permissions"
+	help
+	  If this is set, kernel text will be made RX, kernel data and stack
+	  RW (otherwise all of the regions of the kernel 1-to-1 mapping
+	  outside section boundaries remains RWX). The tradeoff is that each
+	  region is padded to section-size (1MiB) boundaries (because their
+	  permissions are different and splitting the 1M pages into 4K ones
+	  causes TLB performance problems), wasting memory.
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 804d61566a53..f0b1df53f436 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -31,6 +31,11 @@
 #include <asm/tlb.h>
 #include <asm/fixmap.h>
 
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+#include <asm/system_info.h>
+#include <asm/cp15.h>
+#endif
+
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
 
@@ -621,11 +626,90 @@ void __init mem_init(void)
 	}
 }
 
+#ifdef CONFIG_ARM_KERNMEM_PERMS
+struct section_perm {
+	unsigned long start;
+	unsigned long end;
+	pmdval_t prot;
+};
+
+struct section_perm __initdata section_perms[] = {
+	/* Make pages tables, etc before _stext RW (set NX). */
+	{
+		.start	= PAGE_OFFSET,
+		.end	= (unsigned long)_stext,
+		.prot	= PMD_SECT_XN,
+	},
+	/* Make init RW (set NX). */
+	{
+		.start	= (unsigned long)__init_begin,
+		.end	= (unsigned long)_sdata,
+		.prot	= PMD_SECT_XN,
+	},
+	/* Make kernel code and rodata RX (set RO). */
+	{
+		.start	= (unsigned long)_stext,
+		.end	= (unsigned long)__init_begin,
+#ifdef CONFIG_ARM_LPAE
+		.prot	= PMD_SECT_RDONLY,
+#else
+		.prot	= PMD_SECT_APX | PMD_SECT_AP_WRITE,
+#endif
+	},
+};
+
+static inline void section_update(unsigned long addr, pmdval_t prot)
+{
+	pmd_t *pmd = pmd_off_k(addr);
+
+#ifdef CONFIG_ARM_LPAE
+	pmd[0] = __pmd(pmd_val(pmd[0]) | prot);
+#else
+	if (addr & SECTION_SIZE)
+		pmd[1] = __pmd(pmd_val(pmd[1]) | prot);
+	else
+		pmd[0] = __pmd(pmd_val(pmd[0]) | prot);
+#endif
+	flush_pmd_entry(pmd);
+}
+
+static inline void fix_kernmem_perms(void)
+{
+	unsigned long addr;
+	int cpu_arch = cpu_architecture();
+	unsigned int i, cr = get_cr();
+
+	if (cpu_arch < CPU_ARCH_ARMv6 || !(cr & CR_XP))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(section_perms); i++) {
+		if (!IS_ALIGNED(section_perms[i].start, SECTION_SIZE) ||
+		    !IS_ALIGNED(section_perms[i].end, SECTION_SIZE)) {
+			pr_err("BUG: section %lx-%lx not aligned to %lx\n",
+				section_perms[i].start, section_perms[i].end,
+				SECTION_SIZE);
+			continue;
+		}
+
+		for (addr = section_perms[i].start;
+		     addr < section_perms[i].end;
+		     addr += SECTION_SIZE)
+			section_update(addr, section_perms[i].prot);
+	}
+}
+#else
+static inline void fix_kernmem_perms(void) { }
+#endif /* CONFIG_ARM_KERNMEM_PERMS */
+
 void free_initmem(void)
 {
 #ifdef CONFIG_HAVE_TCM
 	extern char __tcm_start, __tcm_end;
+#endif
+
+	fix_kernmem_perms();
 
+#ifdef CONFIG_HAVE_TCM
 	poison_init_mem(&__tcm_start, &__tcm_end - &__tcm_start);
 	free_reserved_area(&__tcm_start, &__tcm_end, -1, "TCM link");
 #endif
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-14  1:04 ` Kees Cook
@ 2014-02-14  1:04   ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-14  1:04 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: keescook, Russell King, Laura Abbott, Larry Bassel,
	Catalin Marinas, Stephen Rothwell, Greg Kroah-Hartman,
	Christoffer Dall, Marc Zyngier, Jonathan Austin, Simon Baatz,
	Nicolas Pitre, Dave Martin, Will Deacon, Uwe Kleine-König,
	Ben Dooks, Andrew Morton, Santosh Shilimkar, Jiang Liu,
	Grant Likely, Rob Herring, Vitaly Andrianov, linux-kernel

Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
sets rodata read-only (but executable), where as this option additionally
splits rodata from the kernel text (resulting in potentially more memory
lost to padding) and sets it non-executable as well. The end result is
that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
marked purely read-only.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/include/asm/cacheflush.h |    5 +++++
 arch/arm/kernel/vmlinux.lds.S     |    3 +++
 arch/arm/mm/Kconfig               |   12 ++++++++++++
 arch/arm/mm/init.c                |    8 ++++++++
 4 files changed, 28 insertions(+)

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index e9a49fe0284e..2b058fc7a188 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -486,4 +486,9 @@ int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_x(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
 
+#ifdef CONFIG_DEBUG_RODATA
+/* This has already happened during free_initmem. */
+static inline void mark_rodata_ro(void) { }
+#endif
+
 #endif
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 08fa667ef2f1..ec79e7268e09 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -120,6 +120,9 @@ SECTIONS
 			ARM_CPU_KEEP(PROC_INFO)
 	}
 
+#ifdef CONFIG_DEBUG_RODATA
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
 	RO_DATA(PAGE_SIZE)
 
 	. = ALIGN(4);
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 999eb505faee..7c8bbe7e2769 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -968,3 +968,15 @@ config ARM_KERNMEM_PERMS
 	  region is padded to section-size (1MiB) boundaries (because their
 	  permissions are different and splitting the 1M pages into 4K ones
 	  causes TLB performance problems), wasting memory.
+
+config DEBUG_RODATA
+	bool "Split rodata from text and set it read-only/non-executable"
+	depends on ARM_KERNMEM_PERMS
+	default y
+	help
+	  If this is set, rodata will be split from kernel text and made
+	  non-executable. (This option already depends on the option
+	  CONFIG_STRICT_KERNMEM_PERMS which makes rodata read-only, though
+	  still executable.) This creates another section-size padded
+	  region, so it can waste more memory space while gaining a pure
+	  read-only rodata region.
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index f0b1df53f436..5b1b049501b9 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -656,6 +656,14 @@ struct section_perm __initdata section_perms[] = {
 		.prot	= PMD_SECT_APX | PMD_SECT_AP_WRITE,
 #endif
 	},
+#ifdef CONFIG_DEBUG_RODATA
+	/* Make rodata RO (set NX). */
+	{
+		.start	= (unsigned long)__start_rodata,
+		.end	= (unsigned long)__init_begin,
+		.prot	= PMD_SECT_XN,
+	}
+#endif
 };
 
 static inline void section_update(unsigned long addr, pmdval_t prot)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-02-14  1:04   ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-14  1:04 UTC (permalink / raw)
  To: linux-arm-kernel

Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
sets rodata read-only (but executable), where as this option additionally
splits rodata from the kernel text (resulting in potentially more memory
lost to padding) and sets it non-executable as well. The end result is
that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
marked purely read-only.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/include/asm/cacheflush.h |    5 +++++
 arch/arm/kernel/vmlinux.lds.S     |    3 +++
 arch/arm/mm/Kconfig               |   12 ++++++++++++
 arch/arm/mm/init.c                |    8 ++++++++
 4 files changed, 28 insertions(+)

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index e9a49fe0284e..2b058fc7a188 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -486,4 +486,9 @@ int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_x(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
 
+#ifdef CONFIG_DEBUG_RODATA
+/* This has already happened during free_initmem. */
+static inline void mark_rodata_ro(void) { }
+#endif
+
 #endif
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 08fa667ef2f1..ec79e7268e09 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -120,6 +120,9 @@ SECTIONS
 			ARM_CPU_KEEP(PROC_INFO)
 	}
 
+#ifdef CONFIG_DEBUG_RODATA
+	. = ALIGN(1<<SECTION_SHIFT);
+#endif
 	RO_DATA(PAGE_SIZE)
 
 	. = ALIGN(4);
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 999eb505faee..7c8bbe7e2769 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -968,3 +968,15 @@ config ARM_KERNMEM_PERMS
 	  region is padded to section-size (1MiB) boundaries (because their
 	  permissions are different and splitting the 1M pages into 4K ones
 	  causes TLB performance problems), wasting memory.
+
+config DEBUG_RODATA
+	bool "Split rodata from text and set it read-only/non-executable"
+	depends on ARM_KERNMEM_PERMS
+	default y
+	help
+	  If this is set, rodata will be split from kernel text and made
+	  non-executable. (This option already depends on the option
+	  CONFIG_STRICT_KERNMEM_PERMS which makes rodata read-only, though
+	  still executable.) This creates another section-size padded
+	  region, so it can waste more memory space while gaining a pure
+	  read-only rodata region.
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index f0b1df53f436..5b1b049501b9 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -656,6 +656,14 @@ struct section_perm __initdata section_perms[] = {
 		.prot	= PMD_SECT_APX | PMD_SECT_AP_WRITE,
 #endif
 	},
+#ifdef CONFIG_DEBUG_RODATA
+	/* Make rodata RO (set NX). */
+	{
+		.start	= (unsigned long)__start_rodata,
+		.end	= (unsigned long)__init_begin,
+		.prot	= PMD_SECT_XN,
+	}
+#endif
 };
 
 static inline void section_update(unsigned long addr, pmdval_t prot)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-14  1:04   ` Kees Cook
@ 2014-02-14 16:22     ` Dave Martin
  -1 siblings, 0 replies; 45+ messages in thread
From: Dave Martin @ 2014-02-14 16:22 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Russell King, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Laura Abbott, Marc Zyngier, Rob Herring, Vitaly Andrianov,
	Jonathan Austin, Simon Baatz, Greg Kroah-Hartman, linux-kernel,
	Santosh Shilimkar, Andrew Morton

On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> sets rodata read-only (but executable), where as this option additionally
> splits rodata from the kernel text (resulting in potentially more memory
> lost to padding) and sets it non-executable as well. The end result is
> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> marked purely read-only.

This triggers an Oops in kexec, because we have a block of code in .text
which is a template for generating baremetal code to relocate the new
kernel, and some literal words are written into it before copying.

Possibly this should be in .rodata, not .text.

There may be a few other instances of this kind of thing.

Are you aware of similar situations on other arches?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-02-14 16:22     ` Dave Martin
  0 siblings, 0 replies; 45+ messages in thread
From: Dave Martin @ 2014-02-14 16:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> sets rodata read-only (but executable), where as this option additionally
> splits rodata from the kernel text (resulting in potentially more memory
> lost to padding) and sets it non-executable as well. The end result is
> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> marked purely read-only.

This triggers an Oops in kexec, because we have a block of code in .text
which is a template for generating baremetal code to relocate the new
kernel, and some literal words are written into it before copying.

Possibly this should be in .rodata, not .text.

There may be a few other instances of this kind of thing.

Are you aware of similar situations on other arches?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-14 16:22     ` Dave Martin
@ 2014-02-14 19:11       ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-14 19:11 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Russell King, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Laura Abbott, Marc Zyngier, Rob Herring, Vitaly Andrianov,
	Jonathan Austin, Simon Baatz, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>> sets rodata read-only (but executable), where as this option additionally
>> splits rodata from the kernel text (resulting in potentially more memory
>> lost to padding) and sets it non-executable as well. The end result is
>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>> marked purely read-only.
>
> This triggers an Oops in kexec, because we have a block of code in .text
> which is a template for generating baremetal code to relocate the new
> kernel, and some literal words are written into it before copying.

You're writing into the text area? I would imagine that
CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
right place to be building code -- shouldn't the module area be used
for that?

> Possibly this should be in .rodata, not .text.

Well, rodata should be neither writable nor executable.

> There may be a few other instances of this kind of thing.

This config will certainly find them! :) But, that's why it's behind a config.

> Are you aware of similar situations on other arches?

I think there were some problems a long time ago on x86 for rodata too.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-02-14 19:11       ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-14 19:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>> sets rodata read-only (but executable), where as this option additionally
>> splits rodata from the kernel text (resulting in potentially more memory
>> lost to padding) and sets it non-executable as well. The end result is
>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>> marked purely read-only.
>
> This triggers an Oops in kexec, because we have a block of code in .text
> which is a template for generating baremetal code to relocate the new
> kernel, and some literal words are written into it before copying.

You're writing into the text area? I would imagine that
CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
right place to be building code -- shouldn't the module area be used
for that?

> Possibly this should be in .rodata, not .text.

Well, rodata should be neither writable nor executable.

> There may be a few other instances of this kind of thing.

This config will certainly find them! :) But, that's why it's behind a config.

> Are you aware of similar situations on other arches?

I think there were some problems a long time ago on x86 for rodata too.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-14 19:11       ` Kees Cook
@ 2014-02-17 12:34         ` Dave Martin
  -1 siblings, 0 replies; 45+ messages in thread
From: Dave Martin @ 2014-02-17 12:34 UTC (permalink / raw)
  To: Kees Cook
  Cc: Catalin Marinas, Will Deacon, Larry Bassel, Stephen Rothwell,
	Russell King, Nicolas Pitre, Ben Dooks, Uwe Kleine-König,
	Grant Likely, Jiang Liu, Christoffer Dall, Laura Abbott,
	Marc Zyngier, Rob Herring, Vitaly Andrianov, linux-arm-kernel,
	Simon Baatz, Jonathan Austin, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> > On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >> sets rodata read-only (but executable), where as this option additionally
> >> splits rodata from the kernel text (resulting in potentially more memory
> >> lost to padding) and sets it non-executable as well. The end result is
> >> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >> marked purely read-only.
> >
> > This triggers an Oops in kexec, because we have a block of code in .text
> > which is a template for generating baremetal code to relocate the new
> > kernel, and some literal words are written into it before copying.
> 
> You're writing into the text area? I would imagine that
> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> right place to be building code -- shouldn't the module area be used
> for that?
> 
> > Possibly this should be in .rodata, not .text.
> 
> Well, rodata should be neither writable nor executable.

We're not writing into code exactly.

This code is never executed in-place in vmlinux.  It gets copied, and
only copies are ever executed.

Some pointers and offsets get poked into the code to configure it.

I think it would be better simply to put the code in .rodata, and
poke paramaters into the copy, not the original -- but that's a bit
more awkward to code up, since the values can't be poked simply by
writing global variables.

> 
> > There may be a few other instances of this kind of thing.
> 
> This config will certainly find them! :) But, that's why it's behind a config.

I haven't tested exhaustively, but it think this is sufficient for a
Tested-by.  The patch does seem to be doing what it is intended to
do, and doesn't seem to be triggering false positives all over the
place.

> 
> > Are you aware of similar situations on other arches?
> 
> I think there were some problems a long time ago on x86 for rodata too.

It would be good to get this kexec case fixed -- I'll try to hack up
a separate patch.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-02-17 12:34         ` Dave Martin
  0 siblings, 0 replies; 45+ messages in thread
From: Dave Martin @ 2014-02-17 12:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> > On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >> sets rodata read-only (but executable), where as this option additionally
> >> splits rodata from the kernel text (resulting in potentially more memory
> >> lost to padding) and sets it non-executable as well. The end result is
> >> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >> marked purely read-only.
> >
> > This triggers an Oops in kexec, because we have a block of code in .text
> > which is a template for generating baremetal code to relocate the new
> > kernel, and some literal words are written into it before copying.
> 
> You're writing into the text area? I would imagine that
> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> right place to be building code -- shouldn't the module area be used
> for that?
> 
> > Possibly this should be in .rodata, not .text.
> 
> Well, rodata should be neither writable nor executable.

We're not writing into code exactly.

This code is never executed in-place in vmlinux.  It gets copied, and
only copies are ever executed.

Some pointers and offsets get poked into the code to configure it.

I think it would be better simply to put the code in .rodata, and
poke paramaters into the copy, not the original -- but that's a bit
more awkward to code up, since the values can't be poked simply by
writing global variables.

> 
> > There may be a few other instances of this kind of thing.
> 
> This config will certainly find them! :) But, that's why it's behind a config.

I haven't tested exhaustively, but it think this is sufficient for a
Tested-by.  The patch does seem to be doing what it is intended to
do, and doesn't seem to be triggering false positives all over the
place.

> 
> > Are you aware of similar situations on other arches?
> 
> I think there were some problems a long time ago on x86 for rodata too.

It would be good to get this kexec case fixed -- I'll try to hack up
a separate patch.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-17 12:34         ` Dave Martin
@ 2014-02-18 18:10           ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-18 18:10 UTC (permalink / raw)
  To: Dave Martin
  Cc: Catalin Marinas, Will Deacon, Larry Bassel, Stephen Rothwell,
	Russell King, Nicolas Pitre, Ben Dooks, Uwe Kleine-König,
	Grant Likely, Jiang Liu, Christoffer Dall, Laura Abbott,
	Marc Zyngier, Rob Herring, Vitaly Andrianov, linux-arm-kernel,
	Simon Baatz, Jonathan Austin, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Mon, Feb 17, 2014 at 4:34 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
>> > On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>> >> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>> >> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>> >> sets rodata read-only (but executable), where as this option additionally
>> >> splits rodata from the kernel text (resulting in potentially more memory
>> >> lost to padding) and sets it non-executable as well. The end result is
>> >> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>> >> marked purely read-only.
>> >
>> > This triggers an Oops in kexec, because we have a block of code in .text
>> > which is a template for generating baremetal code to relocate the new
>> > kernel, and some literal words are written into it before copying.
>>
>> You're writing into the text area? I would imagine that
>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
>> right place to be building code -- shouldn't the module area be used
>> for that?
>>
>> > Possibly this should be in .rodata, not .text.
>>
>> Well, rodata should be neither writable nor executable.
>
> We're not writing into code exactly.
>
> This code is never executed in-place in vmlinux.  It gets copied, and
> only copies are ever executed.
>
> Some pointers and offsets get poked into the code to configure it.
>
> I think it would be better simply to put the code in .rodata, and
> poke paramaters into the copy, not the original -- but that's a bit
> more awkward to code up, since the values can't be poked simply by
> writing global variables.

Okay, interesting. I'll be curious to see what the patch for this looks like.

>> > There may be a few other instances of this kind of thing.
>>
>> This config will certainly find them! :) But, that's why it's behind a config.
>
> I haven't tested exhaustively, but it think this is sufficient for a
> Tested-by.  The patch does seem to be doing what it is intended to
> do, and doesn't seem to be triggering false positives all over the
> place.

Great, thanks for taking the time to check on it!

Should I send this to the patch tracker, or wait for more feedback?

>> > Are you aware of similar situations on other arches?
>>
>> I think there were some problems a long time ago on x86 for rodata too.
>
> It would be good to get this kexec case fixed -- I'll try to hack up
> a separate patch.

Thanks!

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-02-18 18:10           ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-18 18:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 17, 2014 at 4:34 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
>> > On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>> >> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>> >> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>> >> sets rodata read-only (but executable), where as this option additionally
>> >> splits rodata from the kernel text (resulting in potentially more memory
>> >> lost to padding) and sets it non-executable as well. The end result is
>> >> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>> >> marked purely read-only.
>> >
>> > This triggers an Oops in kexec, because we have a block of code in .text
>> > which is a template for generating baremetal code to relocate the new
>> > kernel, and some literal words are written into it before copying.
>>
>> You're writing into the text area? I would imagine that
>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
>> right place to be building code -- shouldn't the module area be used
>> for that?
>>
>> > Possibly this should be in .rodata, not .text.
>>
>> Well, rodata should be neither writable nor executable.
>
> We're not writing into code exactly.
>
> This code is never executed in-place in vmlinux.  It gets copied, and
> only copies are ever executed.
>
> Some pointers and offsets get poked into the code to configure it.
>
> I think it would be better simply to put the code in .rodata, and
> poke paramaters into the copy, not the original -- but that's a bit
> more awkward to code up, since the values can't be poked simply by
> writing global variables.

Okay, interesting. I'll be curious to see what the patch for this looks like.

>> > There may be a few other instances of this kind of thing.
>>
>> This config will certainly find them! :) But, that's why it's behind a config.
>
> I haven't tested exhaustively, but it think this is sufficient for a
> Tested-by.  The patch does seem to be doing what it is intended to
> do, and doesn't seem to be triggering false positives all over the
> place.

Great, thanks for taking the time to check on it!

Should I send this to the patch tracker, or wait for more feedback?

>> > Are you aware of similar situations on other arches?
>>
>> I think there were some problems a long time ago on x86 for rodata too.
>
> It would be good to get this kexec case fixed -- I'll try to hack up
> a separate patch.

Thanks!

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-18 18:10           ` Kees Cook
@ 2014-02-21 12:37             ` Dave Martin
  -1 siblings, 0 replies; 45+ messages in thread
From: Dave Martin @ 2014-02-21 12:37 UTC (permalink / raw)
  To: Kees Cook
  Cc: Catalin Marinas, Will Deacon, Larry Bassel, Stephen Rothwell,
	Russell King, Nicolas Pitre, Ben Dooks, Uwe Kleine-König,
	Grant Likely, Jiang Liu, Christoffer Dall, Laura Abbott,
	Marc Zyngier, Rob Herring, Vitaly Andrianov, linux-arm-kernel,
	Jonathan Austin, Simon Baatz, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Tue, Feb 18, 2014 at 10:10:03AM -0800, Kees Cook wrote:
> On Mon, Feb 17, 2014 at 4:34 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> > On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> >> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> >> > On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >> >> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >> >> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >> >> sets rodata read-only (but executable), where as this option additionally
> >> >> splits rodata from the kernel text (resulting in potentially more memory
> >> >> lost to padding) and sets it non-executable as well. The end result is
> >> >> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >> >> marked purely read-only.
> >> >
> >> > This triggers an Oops in kexec, because we have a block of code in .text
> >> > which is a template for generating baremetal code to relocate the new
> >> > kernel, and some literal words are written into it before copying.
> >>
> >> You're writing into the text area? I would imagine that
> >> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> >> right place to be building code -- shouldn't the module area be used
> >> for that?
> >>
> >> > Possibly this should be in .rodata, not .text.
> >>
> >> Well, rodata should be neither writable nor executable.
> >
> > We're not writing into code exactly.
> >
> > This code is never executed in-place in vmlinux.  It gets copied, and
> > only copies are ever executed.
> >
> > Some pointers and offsets get poked into the code to configure it.
> >
> > I think it would be better simply to put the code in .rodata, and
> > poke paramaters into the copy, not the original -- but that's a bit
> > more awkward to code up, since the values can't be poked simply by
> > writing global variables.
> 
> Okay, interesting. I'll be curious to see what the patch for this looks like.
> 
> >> > There may be a few other instances of this kind of thing.
> >>
> >> This config will certainly find them! :) But, that's why it's behind a config.
> >
> > I haven't tested exhaustively, but it think this is sufficient for a
> > Tested-by.  The patch does seem to be doing what it is intended to
> > do, and doesn't seem to be triggering false positives all over the
> > place.
> 
> Great, thanks for taking the time to check on it!
> 
> Should I send this to the patch tracker, or wait for more feedback?

It would be good if someone who's more familiar with the parms and
vmlinux.lds stuff could take a look at it, though I don't see any
obvious problem yet.

If you don't receive further comments, you could try reposting once
to alert people to the fact that you're still waiting.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-02-21 12:37             ` Dave Martin
  0 siblings, 0 replies; 45+ messages in thread
From: Dave Martin @ 2014-02-21 12:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 18, 2014 at 10:10:03AM -0800, Kees Cook wrote:
> On Mon, Feb 17, 2014 at 4:34 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> > On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> >> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> >> > On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >> >> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >> >> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >> >> sets rodata read-only (but executable), where as this option additionally
> >> >> splits rodata from the kernel text (resulting in potentially more memory
> >> >> lost to padding) and sets it non-executable as well. The end result is
> >> >> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >> >> marked purely read-only.
> >> >
> >> > This triggers an Oops in kexec, because we have a block of code in .text
> >> > which is a template for generating baremetal code to relocate the new
> >> > kernel, and some literal words are written into it before copying.
> >>
> >> You're writing into the text area? I would imagine that
> >> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> >> right place to be building code -- shouldn't the module area be used
> >> for that?
> >>
> >> > Possibly this should be in .rodata, not .text.
> >>
> >> Well, rodata should be neither writable nor executable.
> >
> > We're not writing into code exactly.
> >
> > This code is never executed in-place in vmlinux.  It gets copied, and
> > only copies are ever executed.
> >
> > Some pointers and offsets get poked into the code to configure it.
> >
> > I think it would be better simply to put the code in .rodata, and
> > poke paramaters into the copy, not the original -- but that's a bit
> > more awkward to code up, since the values can't be poked simply by
> > writing global variables.
> 
> Okay, interesting. I'll be curious to see what the patch for this looks like.
> 
> >> > There may be a few other instances of this kind of thing.
> >>
> >> This config will certainly find them! :) But, that's why it's behind a config.
> >
> > I haven't tested exhaustively, but it think this is sufficient for a
> > Tested-by.  The patch does seem to be doing what it is intended to
> > do, and doesn't seem to be triggering false positives all over the
> > place.
> 
> Great, thanks for taking the time to check on it!
> 
> Should I send this to the patch tracker, or wait for more feedback?

It would be good if someone who's more familiar with the parms and
vmlinux.lds stuff could take a look at it, though I don't see any
obvious problem yet.

If you don't receive further comments, you could try reposting once
to alert people to the fact that you're still waiting.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-21 12:37             ` Dave Martin
@ 2014-02-21 13:20               ` Russell King - ARM Linux
  -1 siblings, 0 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2014-02-21 13:20 UTC (permalink / raw)
  To: Dave Martin
  Cc: Kees Cook, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Laura Abbott, Marc Zyngier, Rob Herring, Vitaly Andrianov,
	linux-arm-kernel, Jonathan Austin, Simon Baatz,
	Greg Kroah-Hartman, LKML, Santosh Shilimkar, Andrew Morton

On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
> It would be good if someone who's more familiar with the parms and
> vmlinux.lds stuff could take a look at it, though I don't see any
> obvious problem yet.

The biggest issue with it is that we end up with:

- the .text section rounded up to 1MB
- the .rodata section rounded up to 1MB

That means we can end up wasting up to 1MB of memory for each (in the
worst case where we encroach into the next 1MB aligned region by a few
bytes) and this memory can't be re-used.

The alternative is to adjust the maps such that we end up mapping the
.text / .rodata overlap 1MB using 4K pages, taking the additional TLB
hit by doing so.

The .text is aligned to 1MB, so the majority of the first 0x8000 to
0x100000 is unused.  The end of the .text section is aligned to 1MB,
and the start of the .data section is also aligned to 1MB.

So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
- 0x8000

So, looking at this kernel I've recently built:

Idx Name          Size      VMA       LMA       File off  Algn
  0 .head.text    00000204  c0008000  c0008000  00008000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE

--- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000

  1 .text         006c4530  c0008240  c0008240  00008240  2**6
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE

--- sizeof(.text) + sizeof(.text.head) becomes 0x700000
--- .rodata starts at 0xc0800000 instead of 0xc06cd000

  3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 11 __param       00000c70  c093d564  c093d564  0093d564  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 15 .vectors      00000020  00000000  c0940000  00940000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 16 .stubs        00000240  00001000  c0940020  00941000  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 17 .init.text    00051760  c0940260  c0940260  00948260  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
                  CONTENTS, ALLOC, LOAD, DATA

--- sizeof previous sections is 0x2db000, which becomes 0x300000
--- start of .data becomes 0xc0b00000 instead of 0xc09a8000

 25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
                  CONTENTS, ALLOC, LOAD, DATA
 26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
                  ALLOC

--- which means the kernel image finishes at 0xC12B6FB0 whereas it used
    to finish at 0xC115EFB0.

 27 .comment      00000011  00000000  00000000  00a12728  2**0
                  CONTENTS, READONLY
 28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
                  CONTENTS, READONLY

That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
that sounds small, but the thing to realise is that growth is independent
of the image size, so a smaller image sees a larger %age wise growth in
its size.

People have already complained bitterly when I've said that stealing
memory and taking out out of memblock should always be 1MB aligned, so
/no one/ has the right to say "it's only 1.5MB, it doesn't matter"
because quite frankly they should've been saying that and supporting me
with the memblock issue.  So, I really don't want to hear that argument!

However, if you look at where these boundaries are placed, they're not
quite in the right place.  For example, the .init.data section is writable,
and so should be grouped with the .data section.  So should .data..percpu.

Now, a few other things stand out from the above:

(a) .text.head - imx, sunxi and tegra need to fix that.  There is no
    specific meaning to it.

(b) .init.text is executable, and can't be in a NX region when it's
    set as non-executable.

(c) we can't free the .init sections (sections 15 through up to and
    including 23) anymore with this feature enabled because it's setup
    as read-only memory.

-- 
FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up.  Estimation
in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad.
Estimate before purchase was "up to 13.2Mbit".

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-02-21 13:20               ` Russell King - ARM Linux
  0 siblings, 0 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2014-02-21 13:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
> It would be good if someone who's more familiar with the parms and
> vmlinux.lds stuff could take a look at it, though I don't see any
> obvious problem yet.

The biggest issue with it is that we end up with:

- the .text section rounded up to 1MB
- the .rodata section rounded up to 1MB

That means we can end up wasting up to 1MB of memory for each (in the
worst case where we encroach into the next 1MB aligned region by a few
bytes) and this memory can't be re-used.

The alternative is to adjust the maps such that we end up mapping the
.text / .rodata overlap 1MB using 4K pages, taking the additional TLB
hit by doing so.

The .text is aligned to 1MB, so the majority of the first 0x8000 to
0x100000 is unused.  The end of the .text section is aligned to 1MB,
and the start of the .data section is also aligned to 1MB.

So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
- 0x8000

So, looking at this kernel I've recently built:

Idx Name          Size      VMA       LMA       File off  Algn
  0 .head.text    00000204  c0008000  c0008000  00008000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE

--- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000

  1 .text         006c4530  c0008240  c0008240  00008240  2**6
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE

--- sizeof(.text) + sizeof(.text.head) becomes 0x700000
--- .rodata starts at 0xc0800000 instead of 0xc06cd000

  3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 11 __param       00000c70  c093d564  c093d564  0093d564  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 15 .vectors      00000020  00000000  c0940000  00940000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 16 .stubs        00000240  00001000  c0940020  00941000  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 17 .init.text    00051760  c0940260  c0940260  00948260  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
                  CONTENTS, ALLOC, LOAD, DATA

--- sizeof previous sections is 0x2db000, which becomes 0x300000
--- start of .data becomes 0xc0b00000 instead of 0xc09a8000

 25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
                  CONTENTS, ALLOC, LOAD, DATA
 26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
                  ALLOC

--- which means the kernel image finishes at 0xC12B6FB0 whereas it used
    to finish at 0xC115EFB0.

 27 .comment      00000011  00000000  00000000  00a12728  2**0
                  CONTENTS, READONLY
 28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
                  CONTENTS, READONLY

That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
that sounds small, but the thing to realise is that growth is independent
of the image size, so a smaller image sees a larger %age wise growth in
its size.

People have already complained bitterly when I've said that stealing
memory and taking out out of memblock should always be 1MB aligned, so
/no one/ has the right to say "it's only 1.5MB, it doesn't matter"
because quite frankly they should've been saying that and supporting me
with the memblock issue.  So, I really don't want to hear that argument!

However, if you look at where these boundaries are placed, they're not
quite in the right place.  For example, the .init.data section is writable,
and so should be grouped with the .data section.  So should .data..percpu.

Now, a few other things stand out from the above:

(a) .text.head - imx, sunxi and tegra need to fix that.  There is no
    specific meaning to it.

(b) .init.text is executable, and can't be in a NX region when it's
    set as non-executable.

(c) we can't free the .init sections (sections 15 through up to and
    including 23) anymore with this feature enabled because it's setup
    as read-only memory.

-- 
FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up.  Estimation
in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad.
Estimate before purchase was "up to 13.2Mbit".

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-21 13:20               ` Russell King - ARM Linux
@ 2014-02-21 22:09                 ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-21 22:09 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Dave Martin, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Laura Abbott, Marc Zyngier, Rob Herring, Vitaly Andrianov,
	linux-arm-kernel, Jonathan Austin, Simon Baatz,
	Greg Kroah-Hartman, LKML, Santosh Shilimkar, Andrew Morton

On Fri, Feb 21, 2014 at 5:20 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
>> It would be good if someone who's more familiar with the parms and
>> vmlinux.lds stuff could take a look at it, though I don't see any
>> obvious problem yet.
>
> The biggest issue with it is that we end up with:
>
> - the .text section rounded up to 1MB
> - the .rodata section rounded up to 1MB
>
> That means we can end up wasting up to 1MB of memory for each (in the
> worst case where we encroach into the next 1MB aligned region by a few
> bytes) and this memory can't be re-used.
>
> The alternative is to adjust the maps such that we end up mapping the
> .text / .rodata overlap 1MB using 4K pages, taking the additional TLB
> hit by doing so.
>
> The .text is aligned to 1MB, so the majority of the first 0x8000 to
> 0x100000 is unused.  The end of the .text section is aligned to 1MB,
> and the start of the .data section is also aligned to 1MB.
>
> So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
> MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
> - 0x8000
>
> So, looking at this kernel I've recently built:
>
> Idx Name          Size      VMA       LMA       File off  Algn
>   0 .head.text    00000204  c0008000  c0008000  00008000  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>
> --- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000
>
>   1 .text         006c4530  c0008240  c0008240  00008240  2**6
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>   2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>
> --- sizeof(.text) + sizeof(.text.head) becomes 0x700000
> --- .rodata starts at 0xc0800000 instead of 0xc06cd000
>
>   3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  11 __param       00000c70  c093d564  c093d564  0093d564  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  15 .vectors      00000020  00000000  c0940000  00940000  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  16 .stubs        00000240  00001000  c0940020  00941000  2**5
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  17 .init.text    00051760  c0940260  c0940260  00948260  2**5
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
>                   CONTENTS, ALLOC, LOAD, DATA
>  24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
>                   CONTENTS, ALLOC, LOAD, DATA
>
> --- sizeof previous sections is 0x2db000, which becomes 0x300000
> --- start of .data becomes 0xc0b00000 instead of 0xc09a8000
>
>  25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
>                   CONTENTS, ALLOC, LOAD, DATA
>  26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
>                   ALLOC
>
> --- which means the kernel image finishes at 0xC12B6FB0 whereas it used
>     to finish at 0xC115EFB0.
>
>  27 .comment      00000011  00000000  00000000  00a12728  2**0
>                   CONTENTS, READONLY
>  28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
>                   CONTENTS, READONLY
>
> That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
> that sounds small, but the thing to realise is that growth is independent
> of the image size, so a smaller image sees a larger %age wise growth in
> its size.
>
> People have already complained bitterly when I've said that stealing
> memory and taking out out of memblock should always be 1MB aligned, so
> /no one/ has the right to say "it's only 1.5MB, it doesn't matter"
> because quite frankly they should've been saying that and supporting me
> with the memblock issue.  So, I really don't want to hear that argument!

Absolutely, the memory waste is a problem. This is why I made sure to
leave the rodata as a separate config item. It seems like if people
want this feature, the cost is either going to be this kind of
alignment loss, or TLB hit switching to 4K pages. It seems like the
latter is significantly more "costly". Regardless, that's why it's all
behind config options.

> However, if you look at where these boundaries are placed, they're not
> quite in the right place.  For example, the .init.data section is writable,
> and so should be grouped with the .data section.  So should .data..percpu.
>
> Now, a few other things stand out from the above:
>
> (a) .text.head - imx, sunxi and tegra need to fix that.  There is no
>     specific meaning to it.
>
> (b) .init.text is executable, and can't be in a NX region when it's
>     set as non-executable.
>
> (c) we can't free the .init sections (sections 15 through up to and
>     including 23) anymore with this feature enabled because it's setup
>     as read-only memory.

Perhaps I've misunderstood something, but I don't think b) and c) or
the comments about .init.data and .data..percpu are problems for this
series because of when the mapping happens. Until init finishes, these
sections are fully RXW. Only after init is done is the init section
made NX.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-02-21 22:09                 ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-02-21 22:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 21, 2014 at 5:20 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
>> It would be good if someone who's more familiar with the parms and
>> vmlinux.lds stuff could take a look at it, though I don't see any
>> obvious problem yet.
>
> The biggest issue with it is that we end up with:
>
> - the .text section rounded up to 1MB
> - the .rodata section rounded up to 1MB
>
> That means we can end up wasting up to 1MB of memory for each (in the
> worst case where we encroach into the next 1MB aligned region by a few
> bytes) and this memory can't be re-used.
>
> The alternative is to adjust the maps such that we end up mapping the
> .text / .rodata overlap 1MB using 4K pages, taking the additional TLB
> hit by doing so.
>
> The .text is aligned to 1MB, so the majority of the first 0x8000 to
> 0x100000 is unused.  The end of the .text section is aligned to 1MB,
> and the start of the .data section is also aligned to 1MB.
>
> So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
> MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
> - 0x8000
>
> So, looking at this kernel I've recently built:
>
> Idx Name          Size      VMA       LMA       File off  Algn
>   0 .head.text    00000204  c0008000  c0008000  00008000  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>
> --- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000
>
>   1 .text         006c4530  c0008240  c0008240  00008240  2**6
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>   2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>
> --- sizeof(.text) + sizeof(.text.head) becomes 0x700000
> --- .rodata starts at 0xc0800000 instead of 0xc06cd000
>
>   3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>   9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  11 __param       00000c70  c093d564  c093d564  0093d564  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  15 .vectors      00000020  00000000  c0940000  00940000  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  16 .stubs        00000240  00001000  c0940020  00941000  2**5
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  17 .init.text    00051760  c0940260  c0940260  00948260  2**5
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>  23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
>                   CONTENTS, ALLOC, LOAD, DATA
>  24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
>                   CONTENTS, ALLOC, LOAD, DATA
>
> --- sizeof previous sections is 0x2db000, which becomes 0x300000
> --- start of .data becomes 0xc0b00000 instead of 0xc09a8000
>
>  25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
>                   CONTENTS, ALLOC, LOAD, DATA
>  26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
>                   ALLOC
>
> --- which means the kernel image finishes at 0xC12B6FB0 whereas it used
>     to finish at 0xC115EFB0.
>
>  27 .comment      00000011  00000000  00000000  00a12728  2**0
>                   CONTENTS, READONLY
>  28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
>                   CONTENTS, READONLY
>
> That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
> that sounds small, but the thing to realise is that growth is independent
> of the image size, so a smaller image sees a larger %age wise growth in
> its size.
>
> People have already complained bitterly when I've said that stealing
> memory and taking out out of memblock should always be 1MB aligned, so
> /no one/ has the right to say "it's only 1.5MB, it doesn't matter"
> because quite frankly they should've been saying that and supporting me
> with the memblock issue.  So, I really don't want to hear that argument!

Absolutely, the memory waste is a problem. This is why I made sure to
leave the rodata as a separate config item. It seems like if people
want this feature, the cost is either going to be this kind of
alignment loss, or TLB hit switching to 4K pages. It seems like the
latter is significantly more "costly". Regardless, that's why it's all
behind config options.

> However, if you look at where these boundaries are placed, they're not
> quite in the right place.  For example, the .init.data section is writable,
> and so should be grouped with the .data section.  So should .data..percpu.
>
> Now, a few other things stand out from the above:
>
> (a) .text.head - imx, sunxi and tegra need to fix that.  There is no
>     specific meaning to it.
>
> (b) .init.text is executable, and can't be in a NX region when it's
>     set as non-executable.
>
> (c) we can't free the .init sections (sections 15 through up to and
>     including 23) anymore with this feature enabled because it's setup
>     as read-only memory.

Perhaps I've misunderstood something, but I don't think b) and c) or
the comments about .init.data and .data..percpu are problems for this
series because of when the mapping happens. Until init finishes, these
sections are fully RXW. Only after init is done is the init section
made NX.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-21 22:09                 ` Kees Cook
@ 2014-03-13 19:07                   ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-03-13 19:07 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Dave Martin, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Laura Abbott, Marc Zyngier, Vitaly Andrianov, linux-arm-kernel,
	Jonathan Austin, Simon Baatz, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Fri, Feb 21, 2014 at 2:09 PM, Kees Cook <keescook@chromium.org> wrote:
> On Fri, Feb 21, 2014 at 5:20 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
>>> It would be good if someone who's more familiar with the parms and
>>> vmlinux.lds stuff could take a look at it, though I don't see any
>>> obvious problem yet.
>>
>> The biggest issue with it is that we end up with:
>>
>> - the .text section rounded up to 1MB
>> - the .rodata section rounded up to 1MB
>>
>> That means we can end up wasting up to 1MB of memory for each (in the
>> worst case where we encroach into the next 1MB aligned region by a few
>> bytes) and this memory can't be re-used.
>>
>> The alternative is to adjust the maps such that we end up mapping the
>> .text / .rodata overlap 1MB using 4K pages, taking the additional TLB
>> hit by doing so.
>>
>> The .text is aligned to 1MB, so the majority of the first 0x8000 to
>> 0x100000 is unused.  The end of the .text section is aligned to 1MB,
>> and the start of the .data section is also aligned to 1MB.
>>
>> So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
>> MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
>> - 0x8000
>>
>> So, looking at this kernel I've recently built:
>>
>> Idx Name          Size      VMA       LMA       File off  Algn
>>   0 .head.text    00000204  c0008000  c0008000  00008000  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>
>> --- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000
>>
>>   1 .text         006c4530  c0008240  c0008240  00008240  2**6
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>   2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>
>> --- sizeof(.text) + sizeof(.text.head) becomes 0x700000
>> --- .rodata starts at 0xc0800000 instead of 0xc06cd000
>>
>>   3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  11 __param       00000c70  c093d564  c093d564  0093d564  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  15 .vectors      00000020  00000000  c0940000  00940000  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  16 .stubs        00000240  00001000  c0940020  00941000  2**5
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  17 .init.text    00051760  c0940260  c0940260  00948260  2**5
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
>>                   CONTENTS, ALLOC, LOAD, DATA
>>  24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
>>                   CONTENTS, ALLOC, LOAD, DATA
>>
>> --- sizeof previous sections is 0x2db000, which becomes 0x300000
>> --- start of .data becomes 0xc0b00000 instead of 0xc09a8000
>>
>>  25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
>>                   CONTENTS, ALLOC, LOAD, DATA
>>  26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
>>                   ALLOC
>>
>> --- which means the kernel image finishes at 0xC12B6FB0 whereas it used
>>     to finish at 0xC115EFB0.
>>
>>  27 .comment      00000011  00000000  00000000  00a12728  2**0
>>                   CONTENTS, READONLY
>>  28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
>>                   CONTENTS, READONLY
>>
>> That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
>> that sounds small, but the thing to realise is that growth is independent
>> of the image size, so a smaller image sees a larger %age wise growth in
>> its size.
>>
>> People have already complained bitterly when I've said that stealing
>> memory and taking out out of memblock should always be 1MB aligned, so
>> /no one/ has the right to say "it's only 1.5MB, it doesn't matter"
>> because quite frankly they should've been saying that and supporting me
>> with the memblock issue.  So, I really don't want to hear that argument!
>
> Absolutely, the memory waste is a problem. This is why I made sure to
> leave the rodata as a separate config item. It seems like if people
> want this feature, the cost is either going to be this kind of
> alignment loss, or TLB hit switching to 4K pages. It seems like the
> latter is significantly more "costly". Regardless, that's why it's all
> behind config options.
>
>> However, if you look at where these boundaries are placed, they're not
>> quite in the right place.  For example, the .init.data section is writable,
>> and so should be grouped with the .data section.  So should .data..percpu.
>>
>> Now, a few other things stand out from the above:
>>
>> (a) .text.head - imx, sunxi and tegra need to fix that.  There is no
>>     specific meaning to it.
>>
>> (b) .init.text is executable, and can't be in a NX region when it's
>>     set as non-executable.
>>
>> (c) we can't free the .init sections (sections 15 through up to and
>>     including 23) anymore with this feature enabled because it's setup
>>     as read-only memory.
>
> Perhaps I've misunderstood something, but I don't think b) and c) or
> the comments about .init.data and .data..percpu are problems for this
> series because of when the mapping happens. Until init finishes, these
> sections are fully RXW. Only after init is done is the init section
> made NX.

I haven't heard anything back, and I think I've answered the questions raised.

Since these changes are behind CONFIG items, other folks have
successfully tested the series, and at least Google, Linaro, and the
codeaurora folks are interested in these features, can I send these to
the patch tracker?

Thanks,

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-13 19:07                   ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-03-13 19:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 21, 2014 at 2:09 PM, Kees Cook <keescook@chromium.org> wrote:
> On Fri, Feb 21, 2014 at 5:20 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
>>> It would be good if someone who's more familiar with the parms and
>>> vmlinux.lds stuff could take a look at it, though I don't see any
>>> obvious problem yet.
>>
>> The biggest issue with it is that we end up with:
>>
>> - the .text section rounded up to 1MB
>> - the .rodata section rounded up to 1MB
>>
>> That means we can end up wasting up to 1MB of memory for each (in the
>> worst case where we encroach into the next 1MB aligned region by a few
>> bytes) and this memory can't be re-used.
>>
>> The alternative is to adjust the maps such that we end up mapping the
>> .text / .rodata overlap 1MB using 4K pages, taking the additional TLB
>> hit by doing so.
>>
>> The .text is aligned to 1MB, so the majority of the first 0x8000 to
>> 0x100000 is unused.  The end of the .text section is aligned to 1MB,
>> and the start of the .data section is also aligned to 1MB.
>>
>> So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
>> MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
>> - 0x8000
>>
>> So, looking at this kernel I've recently built:
>>
>> Idx Name          Size      VMA       LMA       File off  Algn
>>   0 .head.text    00000204  c0008000  c0008000  00008000  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>
>> --- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000
>>
>>   1 .text         006c4530  c0008240  c0008240  00008240  2**6
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>   2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>
>> --- sizeof(.text) + sizeof(.text.head) becomes 0x700000
>> --- .rodata starts at 0xc0800000 instead of 0xc06cd000
>>
>>   3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>   9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  11 __param       00000c70  c093d564  c093d564  0093d564  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  15 .vectors      00000020  00000000  c0940000  00940000  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  16 .stubs        00000240  00001000  c0940020  00941000  2**5
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  17 .init.text    00051760  c0940260  c0940260  00948260  2**5
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>  19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>  23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
>>                   CONTENTS, ALLOC, LOAD, DATA
>>  24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
>>                   CONTENTS, ALLOC, LOAD, DATA
>>
>> --- sizeof previous sections is 0x2db000, which becomes 0x300000
>> --- start of .data becomes 0xc0b00000 instead of 0xc09a8000
>>
>>  25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
>>                   CONTENTS, ALLOC, LOAD, DATA
>>  26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
>>                   ALLOC
>>
>> --- which means the kernel image finishes at 0xC12B6FB0 whereas it used
>>     to finish at 0xC115EFB0.
>>
>>  27 .comment      00000011  00000000  00000000  00a12728  2**0
>>                   CONTENTS, READONLY
>>  28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
>>                   CONTENTS, READONLY
>>
>> That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
>> that sounds small, but the thing to realise is that growth is independent
>> of the image size, so a smaller image sees a larger %age wise growth in
>> its size.
>>
>> People have already complained bitterly when I've said that stealing
>> memory and taking out out of memblock should always be 1MB aligned, so
>> /no one/ has the right to say "it's only 1.5MB, it doesn't matter"
>> because quite frankly they should've been saying that and supporting me
>> with the memblock issue.  So, I really don't want to hear that argument!
>
> Absolutely, the memory waste is a problem. This is why I made sure to
> leave the rodata as a separate config item. It seems like if people
> want this feature, the cost is either going to be this kind of
> alignment loss, or TLB hit switching to 4K pages. It seems like the
> latter is significantly more "costly". Regardless, that's why it's all
> behind config options.
>
>> However, if you look at where these boundaries are placed, they're not
>> quite in the right place.  For example, the .init.data section is writable,
>> and so should be grouped with the .data section.  So should .data..percpu.
>>
>> Now, a few other things stand out from the above:
>>
>> (a) .text.head - imx, sunxi and tegra need to fix that.  There is no
>>     specific meaning to it.
>>
>> (b) .init.text is executable, and can't be in a NX region when it's
>>     set as non-executable.
>>
>> (c) we can't free the .init sections (sections 15 through up to and
>>     including 23) anymore with this feature enabled because it's setup
>>     as read-only memory.
>
> Perhaps I've misunderstood something, but I don't think b) and c) or
> the comments about .init.data and .data..percpu are problems for this
> series because of when the mapping happens. Until init finishes, these
> sections are fully RXW. Only after init is done is the init section
> made NX.

I haven't heard anything back, and I think I've answered the questions raised.

Since these changes are behind CONFIG items, other folks have
successfully tested the series, and at least Google, Linaro, and the
codeaurora folks are interested in these features, can I send these to
the patch tracker?

Thanks,

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-13 19:07                   ` Kees Cook
@ 2014-03-23 18:32                     ` Laura Abbott
  -1 siblings, 0 replies; 45+ messages in thread
From: Laura Abbott @ 2014-03-23 18:32 UTC (permalink / raw)
  To: Kees Cook, Russell King - ARM Linux
  Cc: Dave Martin, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Marc Zyngier, Vitaly Andrianov, linux-arm-kernel,
	Jonathan Austin, Simon Baatz, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On 3/13/2014 12:07 PM, Kees Cook wrote:
> On Fri, Feb 21, 2014 at 2:09 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Fri, Feb 21, 2014 at 5:20 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
>>>> It would be good if someone who's more familiar with the parms and
>>>> vmlinux.lds stuff could take a look at it, though I don't see any
>>>> obvious problem yet.
>>>
>>> The biggest issue with it is that we end up with:
>>>
>>> - the .text section rounded up to 1MB
>>> - the .rodata section rounded up to 1MB
>>>
>>> That means we can end up wasting up to 1MB of memory for each (in the
>>> worst case where we encroach into the next 1MB aligned region by a few
>>> bytes) and this memory can't be re-used.
>>>
>>> The alternative is to adjust the maps such that we end up mapping the
>>> .text / .rodata overlap 1MB using 4K pages, taking the additional TLB
>>> hit by doing so.
>>>
>>> The .text is aligned to 1MB, so the majority of the first 0x8000 to
>>> 0x100000 is unused.  The end of the .text section is aligned to 1MB,
>>> and the start of the .data section is also aligned to 1MB.
>>>
>>> So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
>>> MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
>>> - 0x8000
>>>
>>> So, looking at this kernel I've recently built:
>>>
>>> Idx Name          Size      VMA       LMA       File off  Algn
>>>   0 .head.text    00000204  c0008000  c0008000  00008000  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>
>>> --- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000
>>>
>>>   1 .text         006c4530  c0008240  c0008240  00008240  2**6
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>   2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>
>>> --- sizeof(.text) + sizeof(.text.head) becomes 0x700000
>>> --- .rodata starts at 0xc0800000 instead of 0xc06cd000
>>>
>>>   3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  11 __param       00000c70  c093d564  c093d564  0093d564  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  15 .vectors      00000020  00000000  c0940000  00940000  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  16 .stubs        00000240  00001000  c0940020  00941000  2**5
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  17 .init.text    00051760  c0940260  c0940260  00948260  2**5
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>  24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>
>>> --- sizeof previous sections is 0x2db000, which becomes 0x300000
>>> --- start of .data becomes 0xc0b00000 instead of 0xc09a8000
>>>
>>>  25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>  26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
>>>                   ALLOC
>>>
>>> --- which means the kernel image finishes at 0xC12B6FB0 whereas it used
>>>     to finish at 0xC115EFB0.
>>>
>>>  27 .comment      00000011  00000000  00000000  00a12728  2**0
>>>                   CONTENTS, READONLY
>>>  28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
>>>                   CONTENTS, READONLY
>>>
>>> That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
>>> that sounds small, but the thing to realise is that growth is independent
>>> of the image size, so a smaller image sees a larger %age wise growth in
>>> its size.
>>>
>>> People have already complained bitterly when I've said that stealing
>>> memory and taking out out of memblock should always be 1MB aligned, so
>>> /no one/ has the right to say "it's only 1.5MB, it doesn't matter"
>>> because quite frankly they should've been saying that and supporting me
>>> with the memblock issue.  So, I really don't want to hear that argument!
>>
>> Absolutely, the memory waste is a problem. This is why I made sure to
>> leave the rodata as a separate config item. It seems like if people
>> want this feature, the cost is either going to be this kind of
>> alignment loss, or TLB hit switching to 4K pages. It seems like the
>> latter is significantly more "costly". Regardless, that's why it's all
>> behind config options.
>>
>>> However, if you look at where these boundaries are placed, they're not
>>> quite in the right place.  For example, the .init.data section is writable,
>>> and so should be grouped with the .data section.  So should .data..percpu.
>>>
>>> Now, a few other things stand out from the above:
>>>
>>> (a) .text.head - imx, sunxi and tegra need to fix that.  There is no
>>>     specific meaning to it.
>>>
>>> (b) .init.text is executable, and can't be in a NX region when it's
>>>     set as non-executable.
>>>
>>> (c) we can't free the .init sections (sections 15 through up to and
>>>     including 23) anymore with this feature enabled because it's setup
>>>     as read-only memory.
>>
>> Perhaps I've misunderstood something, but I don't think b) and c) or
>> the comments about .init.data and .data..percpu are problems for this
>> series because of when the mapping happens. Until init finishes, these
>> sections are fully RXW. Only after init is done is the init section
>> made NX.
> 
> I haven't heard anything back, and I think I've answered the questions raised.
> 
> Since these changes are behind CONFIG items, other folks have
> successfully tested the series, and at least Google, Linaro, and the
> codeaurora folks are interested in these features, can I send these to
> the patch tracker?

I'm getting a section mismatch warning when compiling

WARNING: vmlinux.o(.text+0x1257c): Section mismatch in reference from the
function free_initmem() to the (unknown reference) .init.data:(unknown)
The function free_initmem() references
the (unknown reference) __initdata (unknown).
This is often because free_initmem lacks a __initdata 
annotation or the annotation of (unknown) is wrong.

free_initmem -> fix_kernmem_perms aren't marked as __init but there is a
reference to section_perms which is marked as __initdata.

Kernel output looks as expected on a v7 non LPAE system

/ # cat /sys/kernel/debug/kernel_page_tables
---[ Modules ]---
0xbfe01000-0xbfe0d000          48K     RW NX SHD MEM/CACHED/WBWA
---[ Kernel Mapping ]---
0xc0000000-0xc0100000           1M     RW NX SHD
0xc0100000-0xc0600000           5M     ro x  SHD
0xc0600000-0xc0900000           3M     ro NX SHD
0xc0900000-0xef800000         751M     RW NX SHD
---[ vmalloc() Area ]---
0xf0000000-0xf0001000           4K     RW NX SHD DEV/SHARED
0xf0002000-0xf0003000           4K     RW NX SHD DEV/SHARED
0xf0006000-0xf0007000           4K     RW NX SHD DEV/SHARED
0xf0018000-0xf0058000         256K     RW NX SHD MEM/BUFFERABLE/WC
0xf005a000-0xf005b000           4K     RW NX SHD DEV/SHARED
0xf005c000-0xf005f000          12K     RW NX SHD MEM/CACHED/WBWA
0xf0060000-0xf0064000          16K     RW NX SHD DEV/SHARED
0xf0068000-0xf006e000          24K     RW NX SHD DEV/SHARED
---[ vmalloc() End ]---
---[ Fixmap Area ]---
0xfffae000-0xfffb0000           8K     RW NX SHD MEM/CACHED/WBWA
0xfffbe000-0xfffc0000           8K     RW NX SHD MEM/CACHED/WBWA
0xfffce000-0xfffd0000           8K     RW NX SHD MEM/CACHED/WBWA
0xfffde000-0xfffe0000           8K     RW NX SHD MEM/CACHED/WBWA
---[ Vectors ]---
0xffff0000-0xffff1000           4K USR ro x  SHD MEM/CACHED/WBWA
0xffff1000-0xffff2000           4K     ro x  SHD MEM/CACHED/WBWA
---[ Vectors End ]---

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-23 18:32                     ` Laura Abbott
  0 siblings, 0 replies; 45+ messages in thread
From: Laura Abbott @ 2014-03-23 18:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 3/13/2014 12:07 PM, Kees Cook wrote:
> On Fri, Feb 21, 2014 at 2:09 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Fri, Feb 21, 2014 at 5:20 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
>>>> It would be good if someone who's more familiar with the parms and
>>>> vmlinux.lds stuff could take a look at it, though I don't see any
>>>> obvious problem yet.
>>>
>>> The biggest issue with it is that we end up with:
>>>
>>> - the .text section rounded up to 1MB
>>> - the .rodata section rounded up to 1MB
>>>
>>> That means we can end up wasting up to 1MB of memory for each (in the
>>> worst case where we encroach into the next 1MB aligned region by a few
>>> bytes) and this memory can't be re-used.
>>>
>>> The alternative is to adjust the maps such that we end up mapping the
>>> .text / .rodata overlap 1MB using 4K pages, taking the additional TLB
>>> hit by doing so.
>>>
>>> The .text is aligned to 1MB, so the majority of the first 0x8000 to
>>> 0x100000 is unused.  The end of the .text section is aligned to 1MB,
>>> and the start of the .data section is also aligned to 1MB.
>>>
>>> So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
>>> MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
>>> - 0x8000
>>>
>>> So, looking at this kernel I've recently built:
>>>
>>> Idx Name          Size      VMA       LMA       File off  Algn
>>>   0 .head.text    00000204  c0008000  c0008000  00008000  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>
>>> --- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000
>>>
>>>   1 .text         006c4530  c0008240  c0008240  00008240  2**6
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>   2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>
>>> --- sizeof(.text) + sizeof(.text.head) becomes 0x700000
>>> --- .rodata starts at 0xc0800000 instead of 0xc06cd000
>>>
>>>   3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>   9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  11 __param       00000c70  c093d564  c093d564  0093d564  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  15 .vectors      00000020  00000000  c0940000  00940000  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  16 .stubs        00000240  00001000  c0940020  00941000  2**5
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  17 .init.text    00051760  c0940260  c0940260  00948260  2**5
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>  19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>  23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>  24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>
>>> --- sizeof previous sections is 0x2db000, which becomes 0x300000
>>> --- start of .data becomes 0xc0b00000 instead of 0xc09a8000
>>>
>>>  25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>  26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
>>>                   ALLOC
>>>
>>> --- which means the kernel image finishes at 0xC12B6FB0 whereas it used
>>>     to finish at 0xC115EFB0.
>>>
>>>  27 .comment      00000011  00000000  00000000  00a12728  2**0
>>>                   CONTENTS, READONLY
>>>  28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
>>>                   CONTENTS, READONLY
>>>
>>> That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
>>> that sounds small, but the thing to realise is that growth is independent
>>> of the image size, so a smaller image sees a larger %age wise growth in
>>> its size.
>>>
>>> People have already complained bitterly when I've said that stealing
>>> memory and taking out out of memblock should always be 1MB aligned, so
>>> /no one/ has the right to say "it's only 1.5MB, it doesn't matter"
>>> because quite frankly they should've been saying that and supporting me
>>> with the memblock issue.  So, I really don't want to hear that argument!
>>
>> Absolutely, the memory waste is a problem. This is why I made sure to
>> leave the rodata as a separate config item. It seems like if people
>> want this feature, the cost is either going to be this kind of
>> alignment loss, or TLB hit switching to 4K pages. It seems like the
>> latter is significantly more "costly". Regardless, that's why it's all
>> behind config options.
>>
>>> However, if you look at where these boundaries are placed, they're not
>>> quite in the right place.  For example, the .init.data section is writable,
>>> and so should be grouped with the .data section.  So should .data..percpu.
>>>
>>> Now, a few other things stand out from the above:
>>>
>>> (a) .text.head - imx, sunxi and tegra need to fix that.  There is no
>>>     specific meaning to it.
>>>
>>> (b) .init.text is executable, and can't be in a NX region when it's
>>>     set as non-executable.
>>>
>>> (c) we can't free the .init sections (sections 15 through up to and
>>>     including 23) anymore with this feature enabled because it's setup
>>>     as read-only memory.
>>
>> Perhaps I've misunderstood something, but I don't think b) and c) or
>> the comments about .init.data and .data..percpu are problems for this
>> series because of when the mapping happens. Until init finishes, these
>> sections are fully RXW. Only after init is done is the init section
>> made NX.
> 
> I haven't heard anything back, and I think I've answered the questions raised.
> 
> Since these changes are behind CONFIG items, other folks have
> successfully tested the series, and at least Google, Linaro, and the
> codeaurora folks are interested in these features, can I send these to
> the patch tracker?

I'm getting a section mismatch warning when compiling

WARNING: vmlinux.o(.text+0x1257c): Section mismatch in reference from the
function free_initmem() to the (unknown reference) .init.data:(unknown)
The function free_initmem() references
the (unknown reference) __initdata (unknown).
This is often because free_initmem lacks a __initdata 
annotation or the annotation of (unknown) is wrong.

free_initmem -> fix_kernmem_perms aren't marked as __init but there is a
reference to section_perms which is marked as __initdata.

Kernel output looks as expected on a v7 non LPAE system

/ # cat /sys/kernel/debug/kernel_page_tables
---[ Modules ]---
0xbfe01000-0xbfe0d000          48K     RW NX SHD MEM/CACHED/WBWA
---[ Kernel Mapping ]---
0xc0000000-0xc0100000           1M     RW NX SHD
0xc0100000-0xc0600000           5M     ro x  SHD
0xc0600000-0xc0900000           3M     ro NX SHD
0xc0900000-0xef800000         751M     RW NX SHD
---[ vmalloc() Area ]---
0xf0000000-0xf0001000           4K     RW NX SHD DEV/SHARED
0xf0002000-0xf0003000           4K     RW NX SHD DEV/SHARED
0xf0006000-0xf0007000           4K     RW NX SHD DEV/SHARED
0xf0018000-0xf0058000         256K     RW NX SHD MEM/BUFFERABLE/WC
0xf005a000-0xf005b000           4K     RW NX SHD DEV/SHARED
0xf005c000-0xf005f000          12K     RW NX SHD MEM/CACHED/WBWA
0xf0060000-0xf0064000          16K     RW NX SHD DEV/SHARED
0xf0068000-0xf006e000          24K     RW NX SHD DEV/SHARED
---[ vmalloc() End ]---
---[ Fixmap Area ]---
0xfffae000-0xfffb0000           8K     RW NX SHD MEM/CACHED/WBWA
0xfffbe000-0xfffc0000           8K     RW NX SHD MEM/CACHED/WBWA
0xfffce000-0xfffd0000           8K     RW NX SHD MEM/CACHED/WBWA
0xfffde000-0xfffe0000           8K     RW NX SHD MEM/CACHED/WBWA
---[ Vectors ]---
0xffff0000-0xffff1000           4K USR ro x  SHD MEM/CACHED/WBWA
0xffff1000-0xffff2000           4K     ro x  SHD MEM/CACHED/WBWA
---[ Vectors End ]---

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-02-17 12:34         ` Dave Martin
@ 2014-03-23 18:47           ` Laura Abbott
  -1 siblings, 0 replies; 45+ messages in thread
From: Laura Abbott @ 2014-03-23 18:47 UTC (permalink / raw)
  To: Dave Martin, Kees Cook
  Cc: Catalin Marinas, Will Deacon, Larry Bassel, Stephen Rothwell,
	Russell King, Nicolas Pitre, Ben Dooks, Uwe Kleine-König,
	Grant Likely, Jiang Liu, Christoffer Dall, Marc Zyngier,
	Rob Herring, Vitaly Andrianov, linux-arm-kernel, Simon Baatz,
	Jonathan Austin, Greg Kroah-Hartman, LKML, Santosh Shilimkar,
	Andrew Morton

On 2/17/2014 4:34 AM, Dave Martin wrote:
> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>>>> sets rodata read-only (but executable), where as this option additionally
>>>> splits rodata from the kernel text (resulting in potentially more memory
>>>> lost to padding) and sets it non-executable as well. The end result is
>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>>>> marked purely read-only.
>>>
>>> This triggers an Oops in kexec, because we have a block of code in .text
>>> which is a template for generating baremetal code to relocate the new
>>> kernel, and some literal words are written into it before copying.
>>
>> You're writing into the text area? I would imagine that
>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
>> right place to be building code -- shouldn't the module area be used
>> for that?
>>
>>> Possibly this should be in .rodata, not .text.
>>
>> Well, rodata should be neither writable nor executable.
> 
> We're not writing into code exactly.
> 
> This code is never executed in-place in vmlinux.  It gets copied, and
> only copies are ever executed.
> 
> Some pointers and offsets get poked into the code to configure it.
> 
> I think it would be better simply to put the code in .rodata, and
> poke paramaters into the copy, not the original -- but that's a bit
> more awkward to code up, since the values can't be poked simply by
> writing global variables.
> 
>>
>>> There may be a few other instances of this kind of thing.
>>
>> This config will certainly find them! :) But, that's why it's behind a config.
> 
> I haven't tested exhaustively, but it think this is sufficient for a
> Tested-by.  The patch does seem to be doing what it is intended to
> do, and doesn't seem to be triggering false positives all over the
> place.
> 
>>
>>> Are you aware of similar situations on other arches?
>>
>> I think there were some problems a long time ago on x86 for rodata too.
> 
> It would be good to get this kexec case fixed -- I'll try to hack up
> a separate patch.
>

FWIW, we've hit issues not just with kexec but kprobes as well. The same
problems exist with this series:

/ # echo p:nl 0xc01d5c00 >> /sys/kernel/debug/tracing/kprobe_events
/ # echo 1 > /sys/kernel/debug/tracing/events/kprobes/nl/enable
[ 1639.739629] Unable to handle kernel paging request at virtual address c01d5c00
[ 1639.739655] pgd = edbc4000
[ 1639.745730] [c01d5c00] *pgd=0001141e(bad)
[ 1639.752413] Internal error: Oops: 80d [#1] PREEMPT SMP ARM
[ 1639.752503] Modules linked in:
[ 1639.760920] CPU: 0 PID: 58 Comm: sh Not tainted 3.14.0-rc7-next-20140318-00004-ga0191b7-dirty #170
[ 1639.761015] task: edb90d80 ti: ed018000 task.ti: ed018000
[ 1639.769870] PC is at patch_text+0x4/0x10
[ 1639.775333] LR is at arm_kprobe+0x28/0x38
[ 1639.779327] pc : [<c058acec>]    lr : [<c058bcc4>]    psr: 20000013
[ 1639.779327] sp : ed019f10  ip : a0000013  fp : 01e7fb34
[ 1639.783241] r10: 00000000  r9 : 01e80ab8  r8 : 00000002
[ 1639.794517] r7 : ed019f80  r6 : ed900bc4  r5 : edb19fa0  r4 : edb19f08
[ 1639.799727] r3 : c01d5c00  r2 : ed019f08  r1 : e7f001f8  r0 : c01d5c00
[ 1639.806326] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 1639.812837] Control: 10c5787d  Table: 2dbc406a  DAC: 00000015
[ 1639.820040] Process sh (pid: 58, stack limit = 0xed018240)
[ 1639.825768] Stack: (0xed019f10 to 0xed01a000)
[ 1639.831150] 9f00:                                     00000000 c058bd5c ed900ba0 c018b724
[ 1639.835584] 9f20: ed6fab00 00000000 edb2d240 00000002 ed019f80 c018bdc4 ffffffff 00000001
[ 1639.843743] 9f40: edb2d240 01e80ab8 ed019f80 00000002 00000002 c01e1b1c 01e7fb34 c011c3c0
[ 1639.851902] 9f60: 00000003 00000000 00000000 edb2d240 edb2d240 00000002 01e80ab8 c01e2108
[ 1639.860063] 9f80: 00000000 00000000 00200200 00157ecc 00000001 01e80ab8 00000004 c0106e64
[ 1639.868222] 9fa0: ed018000 c0106ce0 00157ecc 00000001 00000001 01e80ab8 00000002 00000000
[ 1639.876381] 9fc0: 00157ecc 00000001 01e80ab8 00000004 00000020 01e7fb48 01e7fb14 01e7fb34
[ 1639.884542] 9fe0: 00000000 bef4562c 0001ee5d 0000a8cc 60000010 00000001 00000000 00000000
[ 1639.892709] [<c058acec>] (patch_text) from [<c058bcc4>] (arm_kprobe+0x28/0x38)
[ 1639.900862] [<c058bcc4>] (arm_kprobe) from [<c058bd5c>] (enable_kprobe+0x88/0x94)
[ 1639.907983] [<c058bd5c>] (enable_kprobe) from [<c018b724>] (__ftrace_event_enable_disable+0x13c/0x200)
[ 1639.915537] [<c018b724>] (__ftrace_event_enable_disable) from [<c018bdc4>] (event_enable_write+0x78/0xd4)
[ 1639.924741] [<c018bdc4>] (event_enable_write) from [<c01e1b1c>] (vfs_write+0xac/0x188)
[ 1639.934372] [<c01e1b1c>] (vfs_write) from [<c01e2108>] (SyS_write+0x40/0x94)
[ 1639.942187] [<c01e2108>] (SyS_write) from [<c0106ce0>] (ret_fast_syscall+0x0/0x30)
[ 1639.949386] Code: e4831004 e1a01003 eaee298d e1a03000 (e4831004)
[ 1639.956766] ---[ end trace b548269e2c7a3190 ]---


We had some functions that allowed the text to be temporarily made writable but something
uniform for kexec would be useful as well (our kexec solution has been 'turn it off')


> Cheers
> ---Dave
> 

Thanks,
Laura
 
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-23 18:47           ` Laura Abbott
  0 siblings, 0 replies; 45+ messages in thread
From: Laura Abbott @ 2014-03-23 18:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 2/17/2014 4:34 AM, Dave Martin wrote:
> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>>>> sets rodata read-only (but executable), where as this option additionally
>>>> splits rodata from the kernel text (resulting in potentially more memory
>>>> lost to padding) and sets it non-executable as well. The end result is
>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>>>> marked purely read-only.
>>>
>>> This triggers an Oops in kexec, because we have a block of code in .text
>>> which is a template for generating baremetal code to relocate the new
>>> kernel, and some literal words are written into it before copying.
>>
>> You're writing into the text area? I would imagine that
>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
>> right place to be building code -- shouldn't the module area be used
>> for that?
>>
>>> Possibly this should be in .rodata, not .text.
>>
>> Well, rodata should be neither writable nor executable.
> 
> We're not writing into code exactly.
> 
> This code is never executed in-place in vmlinux.  It gets copied, and
> only copies are ever executed.
> 
> Some pointers and offsets get poked into the code to configure it.
> 
> I think it would be better simply to put the code in .rodata, and
> poke paramaters into the copy, not the original -- but that's a bit
> more awkward to code up, since the values can't be poked simply by
> writing global variables.
> 
>>
>>> There may be a few other instances of this kind of thing.
>>
>> This config will certainly find them! :) But, that's why it's behind a config.
> 
> I haven't tested exhaustively, but it think this is sufficient for a
> Tested-by.  The patch does seem to be doing what it is intended to
> do, and doesn't seem to be triggering false positives all over the
> place.
> 
>>
>>> Are you aware of similar situations on other arches?
>>
>> I think there were some problems a long time ago on x86 for rodata too.
> 
> It would be good to get this kexec case fixed -- I'll try to hack up
> a separate patch.
>

FWIW, we've hit issues not just with kexec but kprobes as well. The same
problems exist with this series:

/ # echo p:nl 0xc01d5c00 >> /sys/kernel/debug/tracing/kprobe_events
/ # echo 1 > /sys/kernel/debug/tracing/events/kprobes/nl/enable
[ 1639.739629] Unable to handle kernel paging request at virtual address c01d5c00
[ 1639.739655] pgd = edbc4000
[ 1639.745730] [c01d5c00] *pgd=0001141e(bad)
[ 1639.752413] Internal error: Oops: 80d [#1] PREEMPT SMP ARM
[ 1639.752503] Modules linked in:
[ 1639.760920] CPU: 0 PID: 58 Comm: sh Not tainted 3.14.0-rc7-next-20140318-00004-ga0191b7-dirty #170
[ 1639.761015] task: edb90d80 ti: ed018000 task.ti: ed018000
[ 1639.769870] PC is at patch_text+0x4/0x10
[ 1639.775333] LR is at arm_kprobe+0x28/0x38
[ 1639.779327] pc : [<c058acec>]    lr : [<c058bcc4>]    psr: 20000013
[ 1639.779327] sp : ed019f10  ip : a0000013  fp : 01e7fb34
[ 1639.783241] r10: 00000000  r9 : 01e80ab8  r8 : 00000002
[ 1639.794517] r7 : ed019f80  r6 : ed900bc4  r5 : edb19fa0  r4 : edb19f08
[ 1639.799727] r3 : c01d5c00  r2 : ed019f08  r1 : e7f001f8  r0 : c01d5c00
[ 1639.806326] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 1639.812837] Control: 10c5787d  Table: 2dbc406a  DAC: 00000015
[ 1639.820040] Process sh (pid: 58, stack limit = 0xed018240)
[ 1639.825768] Stack: (0xed019f10 to 0xed01a000)
[ 1639.831150] 9f00:                                     00000000 c058bd5c ed900ba0 c018b724
[ 1639.835584] 9f20: ed6fab00 00000000 edb2d240 00000002 ed019f80 c018bdc4 ffffffff 00000001
[ 1639.843743] 9f40: edb2d240 01e80ab8 ed019f80 00000002 00000002 c01e1b1c 01e7fb34 c011c3c0
[ 1639.851902] 9f60: 00000003 00000000 00000000 edb2d240 edb2d240 00000002 01e80ab8 c01e2108
[ 1639.860063] 9f80: 00000000 00000000 00200200 00157ecc 00000001 01e80ab8 00000004 c0106e64
[ 1639.868222] 9fa0: ed018000 c0106ce0 00157ecc 00000001 00000001 01e80ab8 00000002 00000000
[ 1639.876381] 9fc0: 00157ecc 00000001 01e80ab8 00000004 00000020 01e7fb48 01e7fb14 01e7fb34
[ 1639.884542] 9fe0: 00000000 bef4562c 0001ee5d 0000a8cc 60000010 00000001 00000000 00000000
[ 1639.892709] [<c058acec>] (patch_text) from [<c058bcc4>] (arm_kprobe+0x28/0x38)
[ 1639.900862] [<c058bcc4>] (arm_kprobe) from [<c058bd5c>] (enable_kprobe+0x88/0x94)
[ 1639.907983] [<c058bd5c>] (enable_kprobe) from [<c018b724>] (__ftrace_event_enable_disable+0x13c/0x200)
[ 1639.915537] [<c018b724>] (__ftrace_event_enable_disable) from [<c018bdc4>] (event_enable_write+0x78/0xd4)
[ 1639.924741] [<c018bdc4>] (event_enable_write) from [<c01e1b1c>] (vfs_write+0xac/0x188)
[ 1639.934372] [<c01e1b1c>] (vfs_write) from [<c01e2108>] (SyS_write+0x40/0x94)
[ 1639.942187] [<c01e2108>] (SyS_write) from [<c0106ce0>] (ret_fast_syscall+0x0/0x30)
[ 1639.949386] Code: e4831004 e1a01003 eaee298d e1a03000 (e4831004)
[ 1639.956766] ---[ end trace b548269e2c7a3190 ]---


We had some functions that allowed the text to be temporarily made writable but something
uniform for kexec would be useful as well (our kexec solution has been 'turn it off')


> Cheers
> ---Dave
> 

Thanks,
Laura
 
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-23 18:32                     ` Laura Abbott
@ 2014-03-23 22:20                       ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-03-23 22:20 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Russell King - ARM Linux, Dave Martin, Catalin Marinas,
	Will Deacon, Larry Bassel, Stephen Rothwell, Nicolas Pitre,
	Ben Dooks, Uwe Kleine-König, Grant Likely, Jiang Liu,
	Christoffer Dall, Marc Zyngier, Vitaly Andrianov,
	linux-arm-kernel, Jonathan Austin, Simon Baatz,
	Greg Kroah-Hartman, LKML, Santosh Shilimkar, Andrew Morton

On Sun, Mar 23, 2014 at 12:32 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> On 3/13/2014 12:07 PM, Kees Cook wrote:
>> On Fri, Feb 21, 2014 at 2:09 PM, Kees Cook <keescook@chromium.org> wrote:
>>> On Fri, Feb 21, 2014 at 5:20 AM, Russell King - ARM Linux
>>> <linux@arm.linux.org.uk> wrote:
>>>> On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
>>>>> It would be good if someone who's more familiar with the parms and
>>>>> vmlinux.lds stuff could take a look at it, though I don't see any
>>>>> obvious problem yet.
>>>>
>>>> The biggest issue with it is that we end up with:
>>>>
>>>> - the .text section rounded up to 1MB
>>>> - the .rodata section rounded up to 1MB
>>>>
>>>> That means we can end up wasting up to 1MB of memory for each (in the
>>>> worst case where we encroach into the next 1MB aligned region by a few
>>>> bytes) and this memory can't be re-used.
>>>>
>>>> The alternative is to adjust the maps such that we end up mapping the
>>>> .text / .rodata overlap 1MB using 4K pages, taking the additional TLB
>>>> hit by doing so.
>>>>
>>>> The .text is aligned to 1MB, so the majority of the first 0x8000 to
>>>> 0x100000 is unused.  The end of the .text section is aligned to 1MB,
>>>> and the start of the .data section is also aligned to 1MB.
>>>>
>>>> So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
>>>> MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
>>>> - 0x8000
>>>>
>>>> So, looking at this kernel I've recently built:
>>>>
>>>> Idx Name          Size      VMA       LMA       File off  Algn
>>>>   0 .head.text    00000204  c0008000  c0008000  00008000  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>
>>>> --- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000
>>>>
>>>>   1 .text         006c4530  c0008240  c0008240  00008240  2**6
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>   2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>
>>>> --- sizeof(.text) + sizeof(.text.head) becomes 0x700000
>>>> --- .rodata starts at 0xc0800000 instead of 0xc06cd000
>>>>
>>>>   3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  11 __param       00000c70  c093d564  c093d564  0093d564  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  15 .vectors      00000020  00000000  c0940000  00940000  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  16 .stubs        00000240  00001000  c0940020  00941000  2**5
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  17 .init.text    00051760  c0940260  c0940260  00948260  2**5
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
>>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>>  24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
>>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>>
>>>> --- sizeof previous sections is 0x2db000, which becomes 0x300000
>>>> --- start of .data becomes 0xc0b00000 instead of 0xc09a8000
>>>>
>>>>  25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
>>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>>  26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
>>>>                   ALLOC
>>>>
>>>> --- which means the kernel image finishes at 0xC12B6FB0 whereas it used
>>>>     to finish at 0xC115EFB0.
>>>>
>>>>  27 .comment      00000011  00000000  00000000  00a12728  2**0
>>>>                   CONTENTS, READONLY
>>>>  28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
>>>>                   CONTENTS, READONLY
>>>>
>>>> That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
>>>> that sounds small, but the thing to realise is that growth is independent
>>>> of the image size, so a smaller image sees a larger %age wise growth in
>>>> its size.
>>>>
>>>> People have already complained bitterly when I've said that stealing
>>>> memory and taking out out of memblock should always be 1MB aligned, so
>>>> /no one/ has the right to say "it's only 1.5MB, it doesn't matter"
>>>> because quite frankly they should've been saying that and supporting me
>>>> with the memblock issue.  So, I really don't want to hear that argument!
>>>
>>> Absolutely, the memory waste is a problem. This is why I made sure to
>>> leave the rodata as a separate config item. It seems like if people
>>> want this feature, the cost is either going to be this kind of
>>> alignment loss, or TLB hit switching to 4K pages. It seems like the
>>> latter is significantly more "costly". Regardless, that's why it's all
>>> behind config options.
>>>
>>>> However, if you look at where these boundaries are placed, they're not
>>>> quite in the right place.  For example, the .init.data section is writable,
>>>> and so should be grouped with the .data section.  So should .data..percpu.
>>>>
>>>> Now, a few other things stand out from the above:
>>>>
>>>> (a) .text.head - imx, sunxi and tegra need to fix that.  There is no
>>>>     specific meaning to it.
>>>>
>>>> (b) .init.text is executable, and can't be in a NX region when it's
>>>>     set as non-executable.
>>>>
>>>> (c) we can't free the .init sections (sections 15 through up to and
>>>>     including 23) anymore with this feature enabled because it's setup
>>>>     as read-only memory.
>>>
>>> Perhaps I've misunderstood something, but I don't think b) and c) or
>>> the comments about .init.data and .data..percpu are problems for this
>>> series because of when the mapping happens. Until init finishes, these
>>> sections are fully RXW. Only after init is done is the init section
>>> made NX.
>>
>> I haven't heard anything back, and I think I've answered the questions raised.
>>
>> Since these changes are behind CONFIG items, other folks have
>> successfully tested the series, and at least Google, Linaro, and the
>> codeaurora folks are interested in these features, can I send these to
>> the patch tracker?
>
> I'm getting a section mismatch warning when compiling

Ah! Good catch, thanks. That config got turned off for me somewhere
along the lines. I've turned it back on and I see the same thing.

> WARNING: vmlinux.o(.text+0x1257c): Section mismatch in reference from the
> function free_initmem() to the (unknown reference) .init.data:(unknown)
> The function free_initmem() references
> the (unknown reference) __initdata (unknown).
> This is often because free_initmem lacks a __initdata
> annotation or the annotation of (unknown) is wrong.
>
> free_initmem -> fix_kernmem_perms aren't marked as __init but there is a
> reference to section_perms which is marked as __initdata.

Yeah, I don't see a way to have fix_kernmem_perms marked as __init
without continuing to trigger the warnings. Instead, I've now dropped
__initdata from section_perms instead. That cleaned up the warning for
me; I will send an updated version.

> Kernel output looks as expected on a v7 non LPAE system

Great, thanks for the confirmation! :)

-Kees

>
> / # cat /sys/kernel/debug/kernel_page_tables
> ---[ Modules ]---
> 0xbfe01000-0xbfe0d000          48K     RW NX SHD MEM/CACHED/WBWA
> ---[ Kernel Mapping ]---
> 0xc0000000-0xc0100000           1M     RW NX SHD
> 0xc0100000-0xc0600000           5M     ro x  SHD
> 0xc0600000-0xc0900000           3M     ro NX SHD
> 0xc0900000-0xef800000         751M     RW NX SHD
> ---[ vmalloc() Area ]---
> 0xf0000000-0xf0001000           4K     RW NX SHD DEV/SHARED
> 0xf0002000-0xf0003000           4K     RW NX SHD DEV/SHARED
> 0xf0006000-0xf0007000           4K     RW NX SHD DEV/SHARED
> 0xf0018000-0xf0058000         256K     RW NX SHD MEM/BUFFERABLE/WC
> 0xf005a000-0xf005b000           4K     RW NX SHD DEV/SHARED
> 0xf005c000-0xf005f000          12K     RW NX SHD MEM/CACHED/WBWA
> 0xf0060000-0xf0064000          16K     RW NX SHD DEV/SHARED
> 0xf0068000-0xf006e000          24K     RW NX SHD DEV/SHARED
> ---[ vmalloc() End ]---
> ---[ Fixmap Area ]---
> 0xfffae000-0xfffb0000           8K     RW NX SHD MEM/CACHED/WBWA
> 0xfffbe000-0xfffc0000           8K     RW NX SHD MEM/CACHED/WBWA
> 0xfffce000-0xfffd0000           8K     RW NX SHD MEM/CACHED/WBWA
> 0xfffde000-0xfffe0000           8K     RW NX SHD MEM/CACHED/WBWA
> ---[ Vectors ]---
> 0xffff0000-0xffff1000           4K USR ro x  SHD MEM/CACHED/WBWA
> 0xffff1000-0xffff2000           4K     ro x  SHD MEM/CACHED/WBWA
> ---[ Vectors End ]---
>
> Thanks,
> Laura
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-23 22:20                       ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-03-23 22:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Mar 23, 2014 at 12:32 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> On 3/13/2014 12:07 PM, Kees Cook wrote:
>> On Fri, Feb 21, 2014 at 2:09 PM, Kees Cook <keescook@chromium.org> wrote:
>>> On Fri, Feb 21, 2014 at 5:20 AM, Russell King - ARM Linux
>>> <linux@arm.linux.org.uk> wrote:
>>>> On Fri, Feb 21, 2014 at 12:37:04PM +0000, Dave Martin wrote:
>>>>> It would be good if someone who's more familiar with the parms and
>>>>> vmlinux.lds stuff could take a look at it, though I don't see any
>>>>> obvious problem yet.
>>>>
>>>> The biggest issue with it is that we end up with:
>>>>
>>>> - the .text section rounded up to 1MB
>>>> - the .rodata section rounded up to 1MB
>>>>
>>>> That means we can end up wasting up to 1MB of memory for each (in the
>>>> worst case where we encroach into the next 1MB aligned region by a few
>>>> bytes) and this memory can't be re-used.
>>>>
>>>> The alternative is to adjust the maps such that we end up mapping the
>>>> .text / .rodata overlap 1MB using 4K pages, taking the additional TLB
>>>> hit by doing so.
>>>>
>>>> The .text is aligned to 1MB, so the majority of the first 0x8000 to
>>>> 0x100000 is unused.  The end of the .text section is aligned to 1MB,
>>>> and the start of the .data section is also aligned to 1MB.
>>>>
>>>> So, the minimum kernel size is: 0x100000 + MB_ALIGN(sizeof(.text)) +
>>>> MB_ALIGN(sizeof(.rodata)) + MB_ALIGN(sizeof(init sections)) + sizeof(.data)
>>>> - 0x8000
>>>>
>>>> So, looking at this kernel I've recently built:
>>>>
>>>> Idx Name          Size      VMA       LMA       File off  Algn
>>>>   0 .head.text    00000204  c0008000  c0008000  00008000  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>
>>>> --- .text this gets set to 0xc0100000, we lose 0xc0008240 to 0xc0100000
>>>>
>>>>   1 .text         006c4530  c0008240  c0008240  00008240  2**6
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>   2 .text.head    0000004c  c06cc770  c06cc770  006cc770  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>
>>>> --- sizeof(.text) + sizeof(.text.head) becomes 0x700000
>>>> --- .rodata starts at 0xc0800000 instead of 0xc06cd000
>>>>
>>>>   3 .rodata       0022f568  c06cd000  c06cd000  006cd000  2**6
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   4 __bug_table   0000873c  c08fc568  c08fc568  008fc568  2**0
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   5 .pci_fixup    00000030  c0904ca4  c0904ca4  00904ca4  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   6 __ksymtab     00008158  c0904cd4  c0904cd4  00904cd4  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   7 __ksymtab_gpl 00006858  c090ce2c  c090ce2c  0090ce2c  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   8 __kcrctab     000040ac  c0913684  c0913684  00913684  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>   9 __kcrctab_gpl 0000342c  c0917730  c0917730  00917730  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  10 __ksymtab_strings 00022a08  c091ab5c  c091ab5c  0091ab5c  2**0
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  11 __param       00000c70  c093d564  c093d564  0093d564  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  12 __modver      00000e2c  c093e1d4  c093e1d4  0093e1d4  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  13 __ex_table    00000f18  c093f000  c093f000  0093f000  2**3
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  14 .notes        00000024  c093ff18  c093ff18  0093ff18  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  15 .vectors      00000020  00000000  c0940000  00940000  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  16 .stubs        00000240  00001000  c0940020  00941000  2**5
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  17 .init.text    00051760  c0940260  c0940260  00948260  2**5
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  18 .exit.text    00002130  c09919c0  c09919c0  009999c0  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>>>>  19 .init.arch.info 00000108  c0993af0  c0993af0  0099baf0  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  20 .init.tagtable 00000048  c0993bf8  c0993bf8  0099bbf8  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  21 .init.smpalt  000032f8  c0993c40  c0993c40  0099bc40  2**2
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  22 .init.pv_table 00000314  c0996f38  c0996f38  0099ef38  2**0
>>>>                   CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>>  23 .init.data    0000c19c  c0997250  c0997250  0099f250  2**3
>>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>>  24 .data..percpu 000035c0  c09a4000  c09a4000  009ac000  2**6
>>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>>
>>>> --- sizeof previous sections is 0x2db000, which becomes 0x300000
>>>> --- start of .data becomes 0xc0b00000 instead of 0xc09a8000
>>>>
>>>>  25 .data         00062728  c09a8000  c09a8000  009b0000  2**6
>>>>                   CONTENTS, ALLOC, LOAD, DATA
>>>>  26 .bss          00754870  c0a0a740  c0a0a740  00a12728  2**6
>>>>                   ALLOC
>>>>
>>>> --- which means the kernel image finishes at 0xC12B6FB0 whereas it used
>>>>     to finish at 0xC115EFB0.
>>>>
>>>>  27 .comment      00000011  00000000  00000000  00a12728  2**0
>>>>                   CONTENTS, READONLY
>>>>  28 .ARM.attributes 00000010  00000000  00000000  00a12739  2**0
>>>>                   CONTENTS, READONLY
>>>>
>>>> That's almost 1.5MB larger on an image size of 18MB.  Percentage wise,
>>>> that sounds small, but the thing to realise is that growth is independent
>>>> of the image size, so a smaller image sees a larger %age wise growth in
>>>> its size.
>>>>
>>>> People have already complained bitterly when I've said that stealing
>>>> memory and taking out out of memblock should always be 1MB aligned, so
>>>> /no one/ has the right to say "it's only 1.5MB, it doesn't matter"
>>>> because quite frankly they should've been saying that and supporting me
>>>> with the memblock issue.  So, I really don't want to hear that argument!
>>>
>>> Absolutely, the memory waste is a problem. This is why I made sure to
>>> leave the rodata as a separate config item. It seems like if people
>>> want this feature, the cost is either going to be this kind of
>>> alignment loss, or TLB hit switching to 4K pages. It seems like the
>>> latter is significantly more "costly". Regardless, that's why it's all
>>> behind config options.
>>>
>>>> However, if you look at where these boundaries are placed, they're not
>>>> quite in the right place.  For example, the .init.data section is writable,
>>>> and so should be grouped with the .data section.  So should .data..percpu.
>>>>
>>>> Now, a few other things stand out from the above:
>>>>
>>>> (a) .text.head - imx, sunxi and tegra need to fix that.  There is no
>>>>     specific meaning to it.
>>>>
>>>> (b) .init.text is executable, and can't be in a NX region when it's
>>>>     set as non-executable.
>>>>
>>>> (c) we can't free the .init sections (sections 15 through up to and
>>>>     including 23) anymore with this feature enabled because it's setup
>>>>     as read-only memory.
>>>
>>> Perhaps I've misunderstood something, but I don't think b) and c) or
>>> the comments about .init.data and .data..percpu are problems for this
>>> series because of when the mapping happens. Until init finishes, these
>>> sections are fully RXW. Only after init is done is the init section
>>> made NX.
>>
>> I haven't heard anything back, and I think I've answered the questions raised.
>>
>> Since these changes are behind CONFIG items, other folks have
>> successfully tested the series, and at least Google, Linaro, and the
>> codeaurora folks are interested in these features, can I send these to
>> the patch tracker?
>
> I'm getting a section mismatch warning when compiling

Ah! Good catch, thanks. That config got turned off for me somewhere
along the lines. I've turned it back on and I see the same thing.

> WARNING: vmlinux.o(.text+0x1257c): Section mismatch in reference from the
> function free_initmem() to the (unknown reference) .init.data:(unknown)
> The function free_initmem() references
> the (unknown reference) __initdata (unknown).
> This is often because free_initmem lacks a __initdata
> annotation or the annotation of (unknown) is wrong.
>
> free_initmem -> fix_kernmem_perms aren't marked as __init but there is a
> reference to section_perms which is marked as __initdata.

Yeah, I don't see a way to have fix_kernmem_perms marked as __init
without continuing to trigger the warnings. Instead, I've now dropped
__initdata from section_perms instead. That cleaned up the warning for
me; I will send an updated version.

> Kernel output looks as expected on a v7 non LPAE system

Great, thanks for the confirmation! :)

-Kees

>
> / # cat /sys/kernel/debug/kernel_page_tables
> ---[ Modules ]---
> 0xbfe01000-0xbfe0d000          48K     RW NX SHD MEM/CACHED/WBWA
> ---[ Kernel Mapping ]---
> 0xc0000000-0xc0100000           1M     RW NX SHD
> 0xc0100000-0xc0600000           5M     ro x  SHD
> 0xc0600000-0xc0900000           3M     ro NX SHD
> 0xc0900000-0xef800000         751M     RW NX SHD
> ---[ vmalloc() Area ]---
> 0xf0000000-0xf0001000           4K     RW NX SHD DEV/SHARED
> 0xf0002000-0xf0003000           4K     RW NX SHD DEV/SHARED
> 0xf0006000-0xf0007000           4K     RW NX SHD DEV/SHARED
> 0xf0018000-0xf0058000         256K     RW NX SHD MEM/BUFFERABLE/WC
> 0xf005a000-0xf005b000           4K     RW NX SHD DEV/SHARED
> 0xf005c000-0xf005f000          12K     RW NX SHD MEM/CACHED/WBWA
> 0xf0060000-0xf0064000          16K     RW NX SHD DEV/SHARED
> 0xf0068000-0xf006e000          24K     RW NX SHD DEV/SHARED
> ---[ vmalloc() End ]---
> ---[ Fixmap Area ]---
> 0xfffae000-0xfffb0000           8K     RW NX SHD MEM/CACHED/WBWA
> 0xfffbe000-0xfffc0000           8K     RW NX SHD MEM/CACHED/WBWA
> 0xfffce000-0xfffd0000           8K     RW NX SHD MEM/CACHED/WBWA
> 0xfffde000-0xfffe0000           8K     RW NX SHD MEM/CACHED/WBWA
> ---[ Vectors ]---
> 0xffff0000-0xffff1000           4K USR ro x  SHD MEM/CACHED/WBWA
> 0xffff1000-0xffff2000           4K     ro x  SHD MEM/CACHED/WBWA
> ---[ Vectors End ]---
>
> Thanks,
> Laura
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-23 18:47           ` Laura Abbott
@ 2014-03-23 22:21             ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-03-23 22:21 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Dave Martin, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Russell King, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Marc Zyngier, Rob Herring, Vitaly Andrianov, linux-arm-kernel,
	Simon Baatz, Jonathan Austin, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Sun, Mar 23, 2014 at 12:47 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> On 2/17/2014 4:34 AM, Dave Martin wrote:
>> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
>>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
>>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>>>>> sets rodata read-only (but executable), where as this option additionally
>>>>> splits rodata from the kernel text (resulting in potentially more memory
>>>>> lost to padding) and sets it non-executable as well. The end result is
>>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>>>>> marked purely read-only.
>>>>
>>>> This triggers an Oops in kexec, because we have a block of code in .text
>>>> which is a template for generating baremetal code to relocate the new
>>>> kernel, and some literal words are written into it before copying.
>>>
>>> You're writing into the text area? I would imagine that
>>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
>>> right place to be building code -- shouldn't the module area be used
>>> for that?
>>>
>>>> Possibly this should be in .rodata, not .text.
>>>
>>> Well, rodata should be neither writable nor executable.
>>
>> We're not writing into code exactly.
>>
>> This code is never executed in-place in vmlinux.  It gets copied, and
>> only copies are ever executed.
>>
>> Some pointers and offsets get poked into the code to configure it.
>>
>> I think it would be better simply to put the code in .rodata, and
>> poke paramaters into the copy, not the original -- but that's a bit
>> more awkward to code up, since the values can't be poked simply by
>> writing global variables.
>>
>>>
>>>> There may be a few other instances of this kind of thing.
>>>
>>> This config will certainly find them! :) But, that's why it's behind a config.
>>
>> I haven't tested exhaustively, but it think this is sufficient for a
>> Tested-by.  The patch does seem to be doing what it is intended to
>> do, and doesn't seem to be triggering false positives all over the
>> place.
>>
>>>
>>>> Are you aware of similar situations on other arches?
>>>
>>> I think there were some problems a long time ago on x86 for rodata too.
>>
>> It would be good to get this kexec case fixed -- I'll try to hack up
>> a separate patch.
>>
>
> FWIW, we've hit issues not just with kexec but kprobes as well. The same
> problems exist with this series:

For this stage, how about I make this "depends on KEXEC=n &&
KPROBES=n"? Then as these areas get fixed, we can drop those
requirements.

-Kees

>
> / # echo p:nl 0xc01d5c00 >> /sys/kernel/debug/tracing/kprobe_events
> / # echo 1 > /sys/kernel/debug/tracing/events/kprobes/nl/enable
> [ 1639.739629] Unable to handle kernel paging request at virtual address c01d5c00
> [ 1639.739655] pgd = edbc4000
> [ 1639.745730] [c01d5c00] *pgd=0001141e(bad)
> [ 1639.752413] Internal error: Oops: 80d [#1] PREEMPT SMP ARM
> [ 1639.752503] Modules linked in:
> [ 1639.760920] CPU: 0 PID: 58 Comm: sh Not tainted 3.14.0-rc7-next-20140318-00004-ga0191b7-dirty #170
> [ 1639.761015] task: edb90d80 ti: ed018000 task.ti: ed018000
> [ 1639.769870] PC is at patch_text+0x4/0x10
> [ 1639.775333] LR is at arm_kprobe+0x28/0x38
> [ 1639.779327] pc : [<c058acec>]    lr : [<c058bcc4>]    psr: 20000013
> [ 1639.779327] sp : ed019f10  ip : a0000013  fp : 01e7fb34
> [ 1639.783241] r10: 00000000  r9 : 01e80ab8  r8 : 00000002
> [ 1639.794517] r7 : ed019f80  r6 : ed900bc4  r5 : edb19fa0  r4 : edb19f08
> [ 1639.799727] r3 : c01d5c00  r2 : ed019f08  r1 : e7f001f8  r0 : c01d5c00
> [ 1639.806326] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [ 1639.812837] Control: 10c5787d  Table: 2dbc406a  DAC: 00000015
> [ 1639.820040] Process sh (pid: 58, stack limit = 0xed018240)
> [ 1639.825768] Stack: (0xed019f10 to 0xed01a000)
> [ 1639.831150] 9f00:                                     00000000 c058bd5c ed900ba0 c018b724
> [ 1639.835584] 9f20: ed6fab00 00000000 edb2d240 00000002 ed019f80 c018bdc4 ffffffff 00000001
> [ 1639.843743] 9f40: edb2d240 01e80ab8 ed019f80 00000002 00000002 c01e1b1c 01e7fb34 c011c3c0
> [ 1639.851902] 9f60: 00000003 00000000 00000000 edb2d240 edb2d240 00000002 01e80ab8 c01e2108
> [ 1639.860063] 9f80: 00000000 00000000 00200200 00157ecc 00000001 01e80ab8 00000004 c0106e64
> [ 1639.868222] 9fa0: ed018000 c0106ce0 00157ecc 00000001 00000001 01e80ab8 00000002 00000000
> [ 1639.876381] 9fc0: 00157ecc 00000001 01e80ab8 00000004 00000020 01e7fb48 01e7fb14 01e7fb34
> [ 1639.884542] 9fe0: 00000000 bef4562c 0001ee5d 0000a8cc 60000010 00000001 00000000 00000000
> [ 1639.892709] [<c058acec>] (patch_text) from [<c058bcc4>] (arm_kprobe+0x28/0x38)
> [ 1639.900862] [<c058bcc4>] (arm_kprobe) from [<c058bd5c>] (enable_kprobe+0x88/0x94)
> [ 1639.907983] [<c058bd5c>] (enable_kprobe) from [<c018b724>] (__ftrace_event_enable_disable+0x13c/0x200)
> [ 1639.915537] [<c018b724>] (__ftrace_event_enable_disable) from [<c018bdc4>] (event_enable_write+0x78/0xd4)
> [ 1639.924741] [<c018bdc4>] (event_enable_write) from [<c01e1b1c>] (vfs_write+0xac/0x188)
> [ 1639.934372] [<c01e1b1c>] (vfs_write) from [<c01e2108>] (SyS_write+0x40/0x94)
> [ 1639.942187] [<c01e2108>] (SyS_write) from [<c0106ce0>] (ret_fast_syscall+0x0/0x30)
> [ 1639.949386] Code: e4831004 e1a01003 eaee298d e1a03000 (e4831004)
> [ 1639.956766] ---[ end trace b548269e2c7a3190 ]---
>
>
> We had some functions that allowed the text to be temporarily made writable but something
> uniform for kexec would be useful as well (our kexec solution has been 'turn it off')
>
>
>> Cheers
>> ---Dave
>>
>
> Thanks,
> Laura
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-23 22:21             ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-03-23 22:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Mar 23, 2014 at 12:47 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> On 2/17/2014 4:34 AM, Dave Martin wrote:
>> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
>>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
>>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>>>>> sets rodata read-only (but executable), where as this option additionally
>>>>> splits rodata from the kernel text (resulting in potentially more memory
>>>>> lost to padding) and sets it non-executable as well. The end result is
>>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>>>>> marked purely read-only.
>>>>
>>>> This triggers an Oops in kexec, because we have a block of code in .text
>>>> which is a template for generating baremetal code to relocate the new
>>>> kernel, and some literal words are written into it before copying.
>>>
>>> You're writing into the text area? I would imagine that
>>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
>>> right place to be building code -- shouldn't the module area be used
>>> for that?
>>>
>>>> Possibly this should be in .rodata, not .text.
>>>
>>> Well, rodata should be neither writable nor executable.
>>
>> We're not writing into code exactly.
>>
>> This code is never executed in-place in vmlinux.  It gets copied, and
>> only copies are ever executed.
>>
>> Some pointers and offsets get poked into the code to configure it.
>>
>> I think it would be better simply to put the code in .rodata, and
>> poke paramaters into the copy, not the original -- but that's a bit
>> more awkward to code up, since the values can't be poked simply by
>> writing global variables.
>>
>>>
>>>> There may be a few other instances of this kind of thing.
>>>
>>> This config will certainly find them! :) But, that's why it's behind a config.
>>
>> I haven't tested exhaustively, but it think this is sufficient for a
>> Tested-by.  The patch does seem to be doing what it is intended to
>> do, and doesn't seem to be triggering false positives all over the
>> place.
>>
>>>
>>>> Are you aware of similar situations on other arches?
>>>
>>> I think there were some problems a long time ago on x86 for rodata too.
>>
>> It would be good to get this kexec case fixed -- I'll try to hack up
>> a separate patch.
>>
>
> FWIW, we've hit issues not just with kexec but kprobes as well. The same
> problems exist with this series:

For this stage, how about I make this "depends on KEXEC=n &&
KPROBES=n"? Then as these areas get fixed, we can drop those
requirements.

-Kees

>
> / # echo p:nl 0xc01d5c00 >> /sys/kernel/debug/tracing/kprobe_events
> / # echo 1 > /sys/kernel/debug/tracing/events/kprobes/nl/enable
> [ 1639.739629] Unable to handle kernel paging request at virtual address c01d5c00
> [ 1639.739655] pgd = edbc4000
> [ 1639.745730] [c01d5c00] *pgd=0001141e(bad)
> [ 1639.752413] Internal error: Oops: 80d [#1] PREEMPT SMP ARM
> [ 1639.752503] Modules linked in:
> [ 1639.760920] CPU: 0 PID: 58 Comm: sh Not tainted 3.14.0-rc7-next-20140318-00004-ga0191b7-dirty #170
> [ 1639.761015] task: edb90d80 ti: ed018000 task.ti: ed018000
> [ 1639.769870] PC is at patch_text+0x4/0x10
> [ 1639.775333] LR is at arm_kprobe+0x28/0x38
> [ 1639.779327] pc : [<c058acec>]    lr : [<c058bcc4>]    psr: 20000013
> [ 1639.779327] sp : ed019f10  ip : a0000013  fp : 01e7fb34
> [ 1639.783241] r10: 00000000  r9 : 01e80ab8  r8 : 00000002
> [ 1639.794517] r7 : ed019f80  r6 : ed900bc4  r5 : edb19fa0  r4 : edb19f08
> [ 1639.799727] r3 : c01d5c00  r2 : ed019f08  r1 : e7f001f8  r0 : c01d5c00
> [ 1639.806326] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [ 1639.812837] Control: 10c5787d  Table: 2dbc406a  DAC: 00000015
> [ 1639.820040] Process sh (pid: 58, stack limit = 0xed018240)
> [ 1639.825768] Stack: (0xed019f10 to 0xed01a000)
> [ 1639.831150] 9f00:                                     00000000 c058bd5c ed900ba0 c018b724
> [ 1639.835584] 9f20: ed6fab00 00000000 edb2d240 00000002 ed019f80 c018bdc4 ffffffff 00000001
> [ 1639.843743] 9f40: edb2d240 01e80ab8 ed019f80 00000002 00000002 c01e1b1c 01e7fb34 c011c3c0
> [ 1639.851902] 9f60: 00000003 00000000 00000000 edb2d240 edb2d240 00000002 01e80ab8 c01e2108
> [ 1639.860063] 9f80: 00000000 00000000 00200200 00157ecc 00000001 01e80ab8 00000004 c0106e64
> [ 1639.868222] 9fa0: ed018000 c0106ce0 00157ecc 00000001 00000001 01e80ab8 00000002 00000000
> [ 1639.876381] 9fc0: 00157ecc 00000001 01e80ab8 00000004 00000020 01e7fb48 01e7fb14 01e7fb34
> [ 1639.884542] 9fe0: 00000000 bef4562c 0001ee5d 0000a8cc 60000010 00000001 00000000 00000000
> [ 1639.892709] [<c058acec>] (patch_text) from [<c058bcc4>] (arm_kprobe+0x28/0x38)
> [ 1639.900862] [<c058bcc4>] (arm_kprobe) from [<c058bd5c>] (enable_kprobe+0x88/0x94)
> [ 1639.907983] [<c058bd5c>] (enable_kprobe) from [<c018b724>] (__ftrace_event_enable_disable+0x13c/0x200)
> [ 1639.915537] [<c018b724>] (__ftrace_event_enable_disable) from [<c018bdc4>] (event_enable_write+0x78/0xd4)
> [ 1639.924741] [<c018bdc4>] (event_enable_write) from [<c01e1b1c>] (vfs_write+0xac/0x188)
> [ 1639.934372] [<c01e1b1c>] (vfs_write) from [<c01e2108>] (SyS_write+0x40/0x94)
> [ 1639.942187] [<c01e2108>] (SyS_write) from [<c0106ce0>] (ret_fast_syscall+0x0/0x30)
> [ 1639.949386] Code: e4831004 e1a01003 eaee298d e1a03000 (e4831004)
> [ 1639.956766] ---[ end trace b548269e2c7a3190 ]---
>
>
> We had some functions that allowed the text to be temporarily made writable but something
> uniform for kexec would be useful as well (our kexec solution has been 'turn it off')
>
>
>> Cheers
>> ---Dave
>>
>
> Thanks,
> Laura
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-23 22:21             ` Kees Cook
@ 2014-03-23 22:37               ` Nicolas Pitre
  -1 siblings, 0 replies; 45+ messages in thread
From: Nicolas Pitre @ 2014-03-23 22:37 UTC (permalink / raw)
  To: Kees Cook
  Cc: Laura Abbott, Dave Martin, Catalin Marinas, Will Deacon,
	Larry Bassel, Stephen Rothwell, Russell King, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Marc Zyngier, Rob Herring, Vitaly Andrianov, linux-arm-kernel,
	Simon Baatz, Jonathan Austin, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Sun, 23 Mar 2014, Kees Cook wrote:

> On Sun, Mar 23, 2014 at 12:47 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> > On 2/17/2014 4:34 AM, Dave Martin wrote:
> >> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> >>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> >>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >>>>> sets rodata read-only (but executable), where as this option additionally
> >>>>> splits rodata from the kernel text (resulting in potentially more memory
> >>>>> lost to padding) and sets it non-executable as well. The end result is
> >>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >>>>> marked purely read-only.
> >>>>
> >>>> This triggers an Oops in kexec, because we have a block of code in .text
> >>>> which is a template for generating baremetal code to relocate the new
> >>>> kernel, and some literal words are written into it before copying.
> >>>
> >>> You're writing into the text area? I would imagine that
> >>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> >>> right place to be building code -- shouldn't the module area be used
> >>> for that?
> >>>
> >>>> Possibly this should be in .rodata, not .text.
> >>>
> >>> Well, rodata should be neither writable nor executable.
> >>
> >> We're not writing into code exactly.
> >>
> >> This code is never executed in-place in vmlinux.  It gets copied, and
> >> only copies are ever executed.
> >>
> >> Some pointers and offsets get poked into the code to configure it.
> >>
> >> I think it would be better simply to put the code in .rodata, and
> >> poke paramaters into the copy, not the original -- but that's a bit
> >> more awkward to code up, since the values can't be poked simply by
> >> writing global variables.
> >>
> >>>
> >>>> There may be a few other instances of this kind of thing.
> >>>
> >>> This config will certainly find them! :) But, that's why it's behind a config.
> >>
> >> I haven't tested exhaustively, but it think this is sufficient for a
> >> Tested-by.  The patch does seem to be doing what it is intended to
> >> do, and doesn't seem to be triggering false positives all over the
> >> place.
> >>
> >>>
> >>>> Are you aware of similar situations on other arches?
> >>>
> >>> I think there were some problems a long time ago on x86 for rodata too.
> >>
> >> It would be good to get this kexec case fixed -- I'll try to hack up
> >> a separate patch.
> >>
> >
> > FWIW, we've hit issues not just with kexec but kprobes as well. The same
> > problems exist with this series:
> 
> For this stage, how about I make this "depends on KEXEC=n &&
> KPROBES=n"? Then as these areas get fixed, we can drop those
> requirements.

Do they really need "fixing"?

The goal here is to increase security by preventing kernel code to be 
modified.  And now it would require hole punching in order to support 
kprobes.

If security is important enough for this option to be attractive to you, 
then wouldn't you want to keep kprobes firmly turned off as well?


Nicolas

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-23 22:37               ` Nicolas Pitre
  0 siblings, 0 replies; 45+ messages in thread
From: Nicolas Pitre @ 2014-03-23 22:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 23 Mar 2014, Kees Cook wrote:

> On Sun, Mar 23, 2014 at 12:47 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> > On 2/17/2014 4:34 AM, Dave Martin wrote:
> >> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> >>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> >>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >>>>> sets rodata read-only (but executable), where as this option additionally
> >>>>> splits rodata from the kernel text (resulting in potentially more memory
> >>>>> lost to padding) and sets it non-executable as well. The end result is
> >>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >>>>> marked purely read-only.
> >>>>
> >>>> This triggers an Oops in kexec, because we have a block of code in .text
> >>>> which is a template for generating baremetal code to relocate the new
> >>>> kernel, and some literal words are written into it before copying.
> >>>
> >>> You're writing into the text area? I would imagine that
> >>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> >>> right place to be building code -- shouldn't the module area be used
> >>> for that?
> >>>
> >>>> Possibly this should be in .rodata, not .text.
> >>>
> >>> Well, rodata should be neither writable nor executable.
> >>
> >> We're not writing into code exactly.
> >>
> >> This code is never executed in-place in vmlinux.  It gets copied, and
> >> only copies are ever executed.
> >>
> >> Some pointers and offsets get poked into the code to configure it.
> >>
> >> I think it would be better simply to put the code in .rodata, and
> >> poke paramaters into the copy, not the original -- but that's a bit
> >> more awkward to code up, since the values can't be poked simply by
> >> writing global variables.
> >>
> >>>
> >>>> There may be a few other instances of this kind of thing.
> >>>
> >>> This config will certainly find them! :) But, that's why it's behind a config.
> >>
> >> I haven't tested exhaustively, but it think this is sufficient for a
> >> Tested-by.  The patch does seem to be doing what it is intended to
> >> do, and doesn't seem to be triggering false positives all over the
> >> place.
> >>
> >>>
> >>>> Are you aware of similar situations on other arches?
> >>>
> >>> I think there were some problems a long time ago on x86 for rodata too.
> >>
> >> It would be good to get this kexec case fixed -- I'll try to hack up
> >> a separate patch.
> >>
> >
> > FWIW, we've hit issues not just with kexec but kprobes as well. The same
> > problems exist with this series:
> 
> For this stage, how about I make this "depends on KEXEC=n &&
> KPROBES=n"? Then as these areas get fixed, we can drop those
> requirements.

Do they really need "fixing"?

The goal here is to increase security by preventing kernel code to be 
modified.  And now it would require hole punching in order to support 
kprobes.

If security is important enough for this option to be attractive to you, 
then wouldn't you want to keep kprobes firmly turned off as well?


Nicolas

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-23 22:37               ` Nicolas Pitre
@ 2014-03-23 22:56                 ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-03-23 22:56 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Laura Abbott, Dave Martin, Catalin Marinas, Will Deacon,
	Larry Bassel, Stephen Rothwell, Russell King, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Jiang Liu, Christoffer Dall,
	Marc Zyngier, Rob Herring, Vitaly Andrianov, linux-arm-kernel,
	Simon Baatz, Jonathan Austin, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Sun, Mar 23, 2014 at 4:37 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Sun, 23 Mar 2014, Kees Cook wrote:
>
>> On Sun, Mar 23, 2014 at 12:47 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>> > On 2/17/2014 4:34 AM, Dave Martin wrote:
>> >> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
>> >>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
>> >>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>> >>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>> >>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>> >>>>> sets rodata read-only (but executable), where as this option additionally
>> >>>>> splits rodata from the kernel text (resulting in potentially more memory
>> >>>>> lost to padding) and sets it non-executable as well. The end result is
>> >>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>> >>>>> marked purely read-only.
>> >>>>
>> >>>> This triggers an Oops in kexec, because we have a block of code in .text
>> >>>> which is a template for generating baremetal code to relocate the new
>> >>>> kernel, and some literal words are written into it before copying.
>> >>>
>> >>> You're writing into the text area? I would imagine that
>> >>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
>> >>> right place to be building code -- shouldn't the module area be used
>> >>> for that?
>> >>>
>> >>>> Possibly this should be in .rodata, not .text.
>> >>>
>> >>> Well, rodata should be neither writable nor executable.
>> >>
>> >> We're not writing into code exactly.
>> >>
>> >> This code is never executed in-place in vmlinux.  It gets copied, and
>> >> only copies are ever executed.
>> >>
>> >> Some pointers and offsets get poked into the code to configure it.
>> >>
>> >> I think it would be better simply to put the code in .rodata, and
>> >> poke paramaters into the copy, not the original -- but that's a bit
>> >> more awkward to code up, since the values can't be poked simply by
>> >> writing global variables.
>> >>
>> >>>
>> >>>> There may be a few other instances of this kind of thing.
>> >>>
>> >>> This config will certainly find them! :) But, that's why it's behind a config.
>> >>
>> >> I haven't tested exhaustively, but it think this is sufficient for a
>> >> Tested-by.  The patch does seem to be doing what it is intended to
>> >> do, and doesn't seem to be triggering false positives all over the
>> >> place.
>> >>
>> >>>
>> >>>> Are you aware of similar situations on other arches?
>> >>>
>> >>> I think there were some problems a long time ago on x86 for rodata too.
>> >>
>> >> It would be good to get this kexec case fixed -- I'll try to hack up
>> >> a separate patch.
>> >>
>> >
>> > FWIW, we've hit issues not just with kexec but kprobes as well. The same
>> > problems exist with this series:
>>
>> For this stage, how about I make this "depends on KEXEC=n &&
>> KPROBES=n"? Then as these areas get fixed, we can drop those
>> requirements.
>
> Do they really need "fixing"?
>
> The goal here is to increase security by preventing kernel code to be
> modified.  And now it would require hole punching in order to support
> kprobes.
>
> If security is important enough for this option to be attractive to you,
> then wouldn't you want to keep kprobes firmly turned off as well?

I couldn't agree more. :) I don't build with these options, but if
someone wants both of these, we'll have to deal with that. Until then,
we should keep it disabled with the negative "depends on".

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-23 22:56                 ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-03-23 22:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Mar 23, 2014 at 4:37 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Sun, 23 Mar 2014, Kees Cook wrote:
>
>> On Sun, Mar 23, 2014 at 12:47 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>> > On 2/17/2014 4:34 AM, Dave Martin wrote:
>> >> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
>> >>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
>> >>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
>> >>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
>> >>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
>> >>>>> sets rodata read-only (but executable), where as this option additionally
>> >>>>> splits rodata from the kernel text (resulting in potentially more memory
>> >>>>> lost to padding) and sets it non-executable as well. The end result is
>> >>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
>> >>>>> marked purely read-only.
>> >>>>
>> >>>> This triggers an Oops in kexec, because we have a block of code in .text
>> >>>> which is a template for generating baremetal code to relocate the new
>> >>>> kernel, and some literal words are written into it before copying.
>> >>>
>> >>> You're writing into the text area? I would imagine that
>> >>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
>> >>> right place to be building code -- shouldn't the module area be used
>> >>> for that?
>> >>>
>> >>>> Possibly this should be in .rodata, not .text.
>> >>>
>> >>> Well, rodata should be neither writable nor executable.
>> >>
>> >> We're not writing into code exactly.
>> >>
>> >> This code is never executed in-place in vmlinux.  It gets copied, and
>> >> only copies are ever executed.
>> >>
>> >> Some pointers and offsets get poked into the code to configure it.
>> >>
>> >> I think it would be better simply to put the code in .rodata, and
>> >> poke paramaters into the copy, not the original -- but that's a bit
>> >> more awkward to code up, since the values can't be poked simply by
>> >> writing global variables.
>> >>
>> >>>
>> >>>> There may be a few other instances of this kind of thing.
>> >>>
>> >>> This config will certainly find them! :) But, that's why it's behind a config.
>> >>
>> >> I haven't tested exhaustively, but it think this is sufficient for a
>> >> Tested-by.  The patch does seem to be doing what it is intended to
>> >> do, and doesn't seem to be triggering false positives all over the
>> >> place.
>> >>
>> >>>
>> >>>> Are you aware of similar situations on other arches?
>> >>>
>> >>> I think there were some problems a long time ago on x86 for rodata too.
>> >>
>> >> It would be good to get this kexec case fixed -- I'll try to hack up
>> >> a separate patch.
>> >>
>> >
>> > FWIW, we've hit issues not just with kexec but kprobes as well. The same
>> > problems exist with this series:
>>
>> For this stage, how about I make this "depends on KEXEC=n &&
>> KPROBES=n"? Then as these areas get fixed, we can drop those
>> requirements.
>
> Do they really need "fixing"?
>
> The goal here is to increase security by preventing kernel code to be
> modified.  And now it would require hole punching in order to support
> kprobes.
>
> If security is important enough for this option to be attractive to you,
> then wouldn't you want to keep kprobes firmly turned off as well?

I couldn't agree more. :) I don't build with these options, but if
someone wants both of these, we'll have to deal with that. Until then,
we should keep it disabled with the negative "depends on".

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-23 22:21             ` Kees Cook
@ 2014-03-24 10:47               ` Jon Medhurst (Tixy)
  -1 siblings, 0 replies; 45+ messages in thread
From: Jon Medhurst (Tixy) @ 2014-03-24 10:47 UTC (permalink / raw)
  To: Kees Cook
  Cc: Laura Abbott, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Russell King, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Dave Martin, Jiang Liu,
	Christoffer Dall, Marc Zyngier, Rob Herring, Vitaly Andrianov,
	linux-arm-kernel, Jonathan Austin, Simon Baatz,
	Greg Kroah-Hartman, LKML, Santosh Shilimkar, Andrew Morton

On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
> On Sun, Mar 23, 2014 at 12:47 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> > On 2/17/2014 4:34 AM, Dave Martin wrote:
> >> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> >>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> >>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >>>>> sets rodata read-only (but executable), where as this option additionally
> >>>>> splits rodata from the kernel text (resulting in potentially more memory
> >>>>> lost to padding) and sets it non-executable as well. The end result is
> >>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >>>>> marked purely read-only.
> >>>>
> >>>> This triggers an Oops in kexec, because we have a block of code in .text
> >>>> which is a template for generating baremetal code to relocate the new
> >>>> kernel, and some literal words are written into it before copying.
> >>>
> >>> You're writing into the text area? I would imagine that
> >>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> >>> right place to be building code -- shouldn't the module area be used
> >>> for that?
> >>>
> >>>> Possibly this should be in .rodata, not .text.
> >>>
> >>> Well, rodata should be neither writable nor executable.
> >>
> >> We're not writing into code exactly.
> >>
> >> This code is never executed in-place in vmlinux.  It gets copied, and
> >> only copies are ever executed.
> >>
> >> Some pointers and offsets get poked into the code to configure it.
> >>
> >> I think it would be better simply to put the code in .rodata, and
> >> poke paramaters into the copy, not the original -- but that's a bit
> >> more awkward to code up, since the values can't be poked simply by
> >> writing global variables.
> >>
> >>>
> >>>> There may be a few other instances of this kind of thing.
> >>>
> >>> This config will certainly find them! :) But, that's why it's behind a config.
> >>
> >> I haven't tested exhaustively, but it think this is sufficient for a
> >> Tested-by.  The patch does seem to be doing what it is intended to
> >> do, and doesn't seem to be triggering false positives all over the
> >> place.
> >>
> >>>
> >>>> Are you aware of similar situations on other arches?
> >>>
> >>> I think there were some problems a long time ago on x86 for rodata too.
> >>
> >> It would be good to get this kexec case fixed -- I'll try to hack up
> >> a separate patch.
> >>
> >
> > FWIW, we've hit issues not just with kexec but kprobes as well. The same
> > problems exist with this series:
> 
> For this stage, how about I make this "depends on KEXEC=n &&
> KPROBES=n"? 

There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
kernel code with a call to probe_kernel_write(), which GDB uses as well.
 
And grepping for the patch_text() function also shows
__arch_jump_label_transform() modifies kernel code. Not sure how and
when that gets used.

-- 
Tixy


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-24 10:47               ` Jon Medhurst (Tixy)
  0 siblings, 0 replies; 45+ messages in thread
From: Jon Medhurst (Tixy) @ 2014-03-24 10:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
> On Sun, Mar 23, 2014 at 12:47 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> > On 2/17/2014 4:34 AM, Dave Martin wrote:
> >> On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> >>> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> >>>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >>>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >>>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >>>>> sets rodata read-only (but executable), where as this option additionally
> >>>>> splits rodata from the kernel text (resulting in potentially more memory
> >>>>> lost to padding) and sets it non-executable as well. The end result is
> >>>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >>>>> marked purely read-only.
> >>>>
> >>>> This triggers an Oops in kexec, because we have a block of code in .text
> >>>> which is a template for generating baremetal code to relocate the new
> >>>> kernel, and some literal words are written into it before copying.
> >>>
> >>> You're writing into the text area? I would imagine that
> >>> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> >>> right place to be building code -- shouldn't the module area be used
> >>> for that?
> >>>
> >>>> Possibly this should be in .rodata, not .text.
> >>>
> >>> Well, rodata should be neither writable nor executable.
> >>
> >> We're not writing into code exactly.
> >>
> >> This code is never executed in-place in vmlinux.  It gets copied, and
> >> only copies are ever executed.
> >>
> >> Some pointers and offsets get poked into the code to configure it.
> >>
> >> I think it would be better simply to put the code in .rodata, and
> >> poke paramaters into the copy, not the original -- but that's a bit
> >> more awkward to code up, since the values can't be poked simply by
> >> writing global variables.
> >>
> >>>
> >>>> There may be a few other instances of this kind of thing.
> >>>
> >>> This config will certainly find them! :) But, that's why it's behind a config.
> >>
> >> I haven't tested exhaustively, but it think this is sufficient for a
> >> Tested-by.  The patch does seem to be doing what it is intended to
> >> do, and doesn't seem to be triggering false positives all over the
> >> place.
> >>
> >>>
> >>>> Are you aware of similar situations on other arches?
> >>>
> >>> I think there were some problems a long time ago on x86 for rodata too.
> >>
> >> It would be good to get this kexec case fixed -- I'll try to hack up
> >> a separate patch.
> >>
> >
> > FWIW, we've hit issues not just with kexec but kprobes as well. The same
> > problems exist with this series:
> 
> For this stage, how about I make this "depends on KEXEC=n &&
> KPROBES=n"? 

There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
kernel code with a call to probe_kernel_write(), which GDB uses as well.
 
And grepping for the patch_text() function also shows
__arch_jump_label_transform() modifies kernel code. Not sure how and
when that gets used.

-- 
Tixy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-23 18:47           ` Laura Abbott
@ 2014-03-24 12:30             ` Dave Martin
  -1 siblings, 0 replies; 45+ messages in thread
From: Dave Martin @ 2014-03-24 12:30 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Kees Cook, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Russell King, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, grant.likely, Jiang Liu, Christoffer Dall,
	Marc Zyngier, rob.herring, Vitaly Andrianov, linux-arm-kernel,
	Simon Baatz, Jonathan Austin, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Sun, Mar 23, 2014 at 06:47:36PM +0000, Laura Abbott wrote:
> On 2/17/2014 4:34 AM, Dave Martin wrote:
> > On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> >> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> >>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >>>> sets rodata read-only (but executable), where as this option additionally
> >>>> splits rodata from the kernel text (resulting in potentially more memory
> >>>> lost to padding) and sets it non-executable as well. The end result is
> >>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >>>> marked purely read-only.
> >>>
> >>> This triggers an Oops in kexec, because we have a block of code in .text
> >>> which is a template for generating baremetal code to relocate the new
> >>> kernel, and some literal words are written into it before copying.
> >>
> >> You're writing into the text area? I would imagine that
> >> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> >> right place to be building code -- shouldn't the module area be used
> >> for that?
> >>

[...]
	
> FWIW, we've hit issues not just with kexec but kprobes as well. The same
> problems exist with this series:

[...]

> We had some functions that allowed the text to be temporarily made writable but something
> uniform for kexec would be useful as well (our kexec solution has been 'turn it off')

kexec doesn't rely on poking the kernel text: the fact that it does this
is just a side-effect of the way it is currently implemented.

I would like to fix it -- it's currently on my todo list, but I
consider it non-urgent.  depending on !KEXEC seems reasonable for
now.

People building a hardened system may choose do disable kexec for other
reasons, but that's a separate issue entirely.


kprobes is a different matter: getting it to work with strict permissions
is likely to be complex and costly.  kprobes is already unavoidably
complex and costly, so that is probably better viewed as a kprobes
problem to be solved later, rather than a strict permissions problem.


Cheers
---Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-03-24 12:30             ` Dave Martin
  0 siblings, 0 replies; 45+ messages in thread
From: Dave Martin @ 2014-03-24 12:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Mar 23, 2014 at 06:47:36PM +0000, Laura Abbott wrote:
> On 2/17/2014 4:34 AM, Dave Martin wrote:
> > On Fri, Feb 14, 2014 at 11:11:07AM -0800, Kees Cook wrote:
> >> On Fri, Feb 14, 2014 at 8:22 AM, Dave Martin <Dave.Martin@arm.com> wrote:
> >>> On Thu, Feb 13, 2014 at 05:04:10PM -0800, Kees Cook wrote:
> >>>> Introduce "CONFIG_DEBUG_RODATA" to mostly match the x86 config, though
> >>>> the behavior is different: it depends on STRICT_KERNMEM_PERMS, which
> >>>> sets rodata read-only (but executable), where as this option additionally
> >>>> splits rodata from the kernel text (resulting in potentially more memory
> >>>> lost to padding) and sets it non-executable as well. The end result is
> >>>> that on builds with CONFIG_DEBUG_RODATA=y (like x86) the rodata with be
> >>>> marked purely read-only.
> >>>
> >>> This triggers an Oops in kexec, because we have a block of code in .text
> >>> which is a template for generating baremetal code to relocate the new
> >>> kernel, and some literal words are written into it before copying.
> >>
> >> You're writing into the text area? I would imagine that
> >> CONFIG_ARM_KERNMEM_PERMS would break that. However, that's not the
> >> right place to be building code -- shouldn't the module area be used
> >> for that?
> >>

[...]
	
> FWIW, we've hit issues not just with kexec but kprobes as well. The same
> problems exist with this series:

[...]

> We had some functions that allowed the text to be temporarily made writable but something
> uniform for kexec would be useful as well (our kexec solution has been 'turn it off')

kexec doesn't rely on poking the kernel text: the fact that it does this
is just a side-effect of the way it is currently implemented.

I would like to fix it -- it's currently on my todo list, but I
consider it non-urgent.  depending on !KEXEC seems reasonable for
now.

People building a hardened system may choose do disable kexec for other
reasons, but that's a separate issue entirely.


kprobes is a different matter: getting it to work with strict permissions
is likely to be complex and costly.  kprobes is already unavoidably
complex and costly, so that is probably better viewed as a kprobes
problem to be solved later, rather than a strict permissions problem.


Cheers
---Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-24 10:47               ` Jon Medhurst (Tixy)
  (?)
@ 2014-03-25 22:11               ` Rabin Vincent
  -1 siblings, 0 replies; 45+ messages in thread
From: Rabin Vincent @ 2014-03-25 22:11 UTC (permalink / raw)
  To: linux-arm-kernel

2014-03-24 11:47 GMT+01:00 Jon Medhurst (Tixy) <tixy@linaro.org>:
> On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
>> For this stage, how about I make this "depends on KEXEC=n &&
>> KPROBES=n"?
>
> There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
> kernel code with a call to probe_kernel_write(), which GDB uses as well.

x86 handles this by making all kernel text R/W around the place where
ftrace does the modifications.  This is called under stop_machine().
See 16239630974516a887 ("ftrace, x86: make kernel text writable only
for conversions").

> And grepping for the patch_text() function also shows
> __arch_jump_label_transform() modifies kernel code. Not sure how and
> when that gets used.

It gets used when "Optimize very unlikely/likely branches"
(CONFIG_JUMP_LABEL) is enabled.  These "very unlikely/likely branches"
are used, among other things, for controlling tracepoints and some
scheduler/networking features.

x86 handles jump labels (and kprobes) by mapping the page being
modified read-write around the modification.  See text_poke().

See also this ARM patch from Kyle Martin which I don't think has been merged:
http://marc.info/?l=linux-arm-kernel&m=139291862727863&w=2

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-03-24 10:47               ` Jon Medhurst (Tixy)
@ 2014-04-01 22:34                 ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-04-01 22:34 UTC (permalink / raw)
  To: Jon Medhurst (Tixy)
  Cc: Laura Abbott, Catalin Marinas, Will Deacon, Larry Bassel,
	Stephen Rothwell, Russell King, Nicolas Pitre, Ben Dooks,
	Uwe Kleine-König, Grant Likely, Dave Martin, Jiang Liu,
	Christoffer Dall, Marc Zyngier, Rob Herring, Vitaly Andrianov,
	linux-arm-kernel, Jonathan Austin, Simon Baatz,
	Greg Kroah-Hartman, LKML, Santosh Shilimkar, Andrew Morton

On Mon, Mar 24, 2014 at 3:47 AM, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
> On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
>> For this stage, how about I make this "depends on KEXEC=n &&
>> KPROBES=n"?
>
> There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
> kernel code with a call to probe_kernel_write(), which GDB uses as well.
>
> And grepping for the patch_text() function also shows
> __arch_jump_label_transform() modifies kernel code. Not sure how and
> when that gets used.

Right, so, I'm trying to fix ftrace now, and I've hit a wall. It is as
if changes to the kernel text PMD aren't being noticed after the
kernel is running. Does anyone know why this might be happening?

Code and details here:
https://lkml.org/lkml/2014/4/1/674

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-04-01 22:34                 ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-04-01 22:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Mar 24, 2014 at 3:47 AM, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
> On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
>> For this stage, how about I make this "depends on KEXEC=n &&
>> KPROBES=n"?
>
> There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
> kernel code with a call to probe_kernel_write(), which GDB uses as well.
>
> And grepping for the patch_text() function also shows
> __arch_jump_label_transform() modifies kernel code. Not sure how and
> when that gets used.

Right, so, I'm trying to fix ftrace now, and I've hit a wall. It is as
if changes to the kernel text PMD aren't being noticed after the
kernel is running. Does anyone know why this might be happening?

Code and details here:
https://lkml.org/lkml/2014/4/1/674

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-04-01 22:34                 ` Kees Cook
@ 2014-04-01 22:54                   ` Laura Abbott
  -1 siblings, 0 replies; 45+ messages in thread
From: Laura Abbott @ 2014-04-01 22:54 UTC (permalink / raw)
  To: Kees Cook, Jon Medhurst (Tixy)
  Cc: Catalin Marinas, Will Deacon, Larry Bassel, Stephen Rothwell,
	Russell King, Nicolas Pitre, Ben Dooks, Uwe Kleine-König,
	Grant Likely, Dave Martin, Jiang Liu, Christoffer Dall,
	Marc Zyngier, Rob Herring, Vitaly Andrianov, linux-arm-kernel,
	Simon Baatz, Jonathan Austin, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On 4/1/2014 3:34 PM, Kees Cook wrote:
> On Mon, Mar 24, 2014 at 3:47 AM, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
>> On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
>>> For this stage, how about I make this "depends on KEXEC=n &&
>>> KPROBES=n"?
>>
>> There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
>> kernel code with a call to probe_kernel_write(), which GDB uses as well.
>>
>> And grepping for the patch_text() function also shows
>> __arch_jump_label_transform() modifies kernel code. Not sure how and
>> when that gets used.
> 
> Right, so, I'm trying to fix ftrace now, and I've hit a wall. It is as
> if changes to the kernel text PMD aren't being noticed after the
> kernel is running. Does anyone know why this might be happening?
> 
> Code and details here:
> https://lkml.org/lkml/2014/4/1/674
> 
> -Kees
> 

We had a flush_tlb_kernel_page after the pmd_flush in our out of tree code
which makes the text writeable in __patch_text.

Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-04-01 22:54                   ` Laura Abbott
  0 siblings, 0 replies; 45+ messages in thread
From: Laura Abbott @ 2014-04-01 22:54 UTC (permalink / raw)
  To: linux-arm-kernel

On 4/1/2014 3:34 PM, Kees Cook wrote:
> On Mon, Mar 24, 2014 at 3:47 AM, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
>> On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
>>> For this stage, how about I make this "depends on KEXEC=n &&
>>> KPROBES=n"?
>>
>> There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
>> kernel code with a call to probe_kernel_write(), which GDB uses as well.
>>
>> And grepping for the patch_text() function also shows
>> __arch_jump_label_transform() modifies kernel code. Not sure how and
>> when that gets used.
> 
> Right, so, I'm trying to fix ftrace now, and I've hit a wall. It is as
> if changes to the kernel text PMD aren't being noticed after the
> kernel is running. Does anyone know why this might be happening?
> 
> Code and details here:
> https://lkml.org/lkml/2014/4/1/674
> 
> -Kees
> 

We had a flush_tlb_kernel_page after the pmd_flush in our out of tree code
which makes the text writeable in __patch_text.

Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] ARM: mm: keep rodata non-executable
  2014-04-01 22:54                   ` Laura Abbott
@ 2014-04-01 22:59                     ` Kees Cook
  -1 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-04-01 22:59 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Jon Medhurst (Tixy),
	Catalin Marinas, Will Deacon, Larry Bassel, Stephen Rothwell,
	Russell King, Nicolas Pitre, Ben Dooks, Uwe Kleine-König,
	Grant Likely, Dave Martin, Jiang Liu, Christoffer Dall,
	Marc Zyngier, Rob Herring, Vitaly Andrianov, linux-arm-kernel,
	Simon Baatz, Jonathan Austin, Greg Kroah-Hartman, LKML,
	Santosh Shilimkar, Andrew Morton

On Tue, Apr 1, 2014 at 3:54 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> On 4/1/2014 3:34 PM, Kees Cook wrote:
>> On Mon, Mar 24, 2014 at 3:47 AM, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
>>> On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
>>>> For this stage, how about I make this "depends on KEXEC=n &&
>>>> KPROBES=n"?
>>>
>>> There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
>>> kernel code with a call to probe_kernel_write(), which GDB uses as well.
>>>
>>> And grepping for the patch_text() function also shows
>>> __arch_jump_label_transform() modifies kernel code. Not sure how and
>>> when that gets used.
>>
>> Right, so, I'm trying to fix ftrace now, and I've hit a wall. It is as
>> if changes to the kernel text PMD aren't being noticed after the
>> kernel is running. Does anyone know why this might be happening?
>>
>> Code and details here:
>> https://lkml.org/lkml/2014/4/1/674
>>
>> -Kees
>>
>
> We had a flush_tlb_kernel_page after the pmd_flush in our out of tree code
> which makes the text writeable in __patch_text.

I tried flush_tlb_kernel_range(), which I'd expect to do the same
thing. I can try with _page() too.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 2/2] ARM: mm: keep rodata non-executable
@ 2014-04-01 22:59                     ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2014-04-01 22:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Apr 1, 2014 at 3:54 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> On 4/1/2014 3:34 PM, Kees Cook wrote:
>> On Mon, Mar 24, 2014 at 3:47 AM, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
>>> On Sun, 2014-03-23 at 16:21 -0600, Kees Cook wrote:
>>>> For this stage, how about I make this "depends on KEXEC=n &&
>>>> KPROBES=n"?
>>>
>>> There's also ftrace (CONFIG_DYNAMIC_FTRACE I believe) which modifies
>>> kernel code with a call to probe_kernel_write(), which GDB uses as well.
>>>
>>> And grepping for the patch_text() function also shows
>>> __arch_jump_label_transform() modifies kernel code. Not sure how and
>>> when that gets used.
>>
>> Right, so, I'm trying to fix ftrace now, and I've hit a wall. It is as
>> if changes to the kernel text PMD aren't being noticed after the
>> kernel is running. Does anyone know why this might be happening?
>>
>> Code and details here:
>> https://lkml.org/lkml/2014/4/1/674
>>
>> -Kees
>>
>
> We had a flush_tlb_kernel_page after the pmd_flush in our out of tree code
> which makes the text writeable in __patch_text.

I tried flush_tlb_kernel_range(), which I'd expect to do the same
thing. I can try with _page() too.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2014-04-01 23:00 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-14  1:04 [PATCH 0/2] ARM: mm: allow for stricter kernel memory perms Kees Cook
2014-02-14  1:04 ` Kees Cook
2014-02-14  1:04 ` [PATCH 1/2] " Kees Cook
2014-02-14  1:04   ` Kees Cook
2014-02-14  1:04 ` [PATCH 2/2] ARM: mm: keep rodata non-executable Kees Cook
2014-02-14  1:04   ` Kees Cook
2014-02-14 16:22   ` Dave Martin
2014-02-14 16:22     ` Dave Martin
2014-02-14 19:11     ` Kees Cook
2014-02-14 19:11       ` Kees Cook
2014-02-17 12:34       ` Dave Martin
2014-02-17 12:34         ` Dave Martin
2014-02-18 18:10         ` Kees Cook
2014-02-18 18:10           ` Kees Cook
2014-02-21 12:37           ` Dave Martin
2014-02-21 12:37             ` Dave Martin
2014-02-21 13:20             ` Russell King - ARM Linux
2014-02-21 13:20               ` Russell King - ARM Linux
2014-02-21 22:09               ` Kees Cook
2014-02-21 22:09                 ` Kees Cook
2014-03-13 19:07                 ` Kees Cook
2014-03-13 19:07                   ` Kees Cook
2014-03-23 18:32                   ` Laura Abbott
2014-03-23 18:32                     ` Laura Abbott
2014-03-23 22:20                     ` Kees Cook
2014-03-23 22:20                       ` Kees Cook
2014-03-23 18:47         ` Laura Abbott
2014-03-23 18:47           ` Laura Abbott
2014-03-23 22:21           ` Kees Cook
2014-03-23 22:21             ` Kees Cook
2014-03-23 22:37             ` Nicolas Pitre
2014-03-23 22:37               ` Nicolas Pitre
2014-03-23 22:56               ` Kees Cook
2014-03-23 22:56                 ` Kees Cook
2014-03-24 10:47             ` Jon Medhurst (Tixy)
2014-03-24 10:47               ` Jon Medhurst (Tixy)
2014-03-25 22:11               ` Rabin Vincent
2014-04-01 22:34               ` Kees Cook
2014-04-01 22:34                 ` Kees Cook
2014-04-01 22:54                 ` Laura Abbott
2014-04-01 22:54                   ` Laura Abbott
2014-04-01 22:59                   ` Kees Cook
2014-04-01 22:59                     ` Kees Cook
2014-03-24 12:30           ` Dave Martin
2014-03-24 12:30             ` Dave Martin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.