* Prevent list poison values from being mapped by userspace processes
@ 2015-08-18 21:42 Jeffrey Vander Stoep
  2015-08-21 13:30 ` Russell King - ARM Linux
  0 siblings, 1 reply; 45+ messages in thread
From: Jeffrey Vander Stoep @ 2015-08-18 21:42 UTC (permalink / raw)
  To: linux-arm-kernel

List poison pointer values point to memory that is mappable by
userspace, i.e. LIST_POISON1 = 0x00100100 and LIST_POISON2 =
0x00200200. This means poison values can be valid pointers controlled
by userspace and can be used to exploit the kernel, as demonstrated in
a recent Black Hat talk:
https://www.blackhat.com/docs/us-15/materials/us-15-Xu-Ah-Universal-Android-Rooting-Is-Back-wp.pdf

Can these poison values be moved to an area not mappable by userspace
on 32-bit ARM?

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 45+ messages in thread
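
[Editor's illustration] The concern above can be sketched in plain C: both
poison values fall inside the range a process may map. The two limits used
here (ASSUMED_TASK_SIZE and ASSUMED_MMAP_MIN_ADDR) and the helper name are
assumptions chosen for illustration; they are not taken from the thread and
vary with kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the message above. */
#define LIST_POISON1 UINT32_C(0x00100100)
#define LIST_POISON2 UINT32_C(0x00200200)

/* Assumed limits for a 3G/1G split; illustrative only. */
#define ASSUMED_TASK_SIZE     UINT32_C(0xbf000000)
#define ASSUMED_MMAP_MIN_ADDR UINT32_C(0x00008000)

/* Hypothetical helper: could an unprivileged process map this address? */
static inline bool user_mappable(uint32_t addr)
{
	return addr >= ASSUMED_MMAP_MIN_ADDR && addr < ASSUMED_TASK_SIZE;
}

/* Both poison values land in the user-mappable window. */
static inline void poison_demo(void)
{
	assert(user_mappable(LIST_POISON1));
	assert(user_mappable(LIST_POISON2));
	assert(!user_mappable(UINT32_C(0xc0000000)));
}
```

Under these assumptions, a stale list pointer that has been poisoned still
resolves to an address userspace can populate, which is the situation the
question is about.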

* Prevent list poison values from being mapped by userspace processes
  2015-08-18 21:42 Prevent list poison values from being mapped by userspace processes Jeffrey Vander Stoep
@ 2015-08-21 13:30 ` Russell King - ARM Linux
  2015-08-21 13:31   ` [PATCH 1/9] ARM: domains: switch to keeping domain value in register Russell King
                     ` (12 more replies)
  0 siblings, 13 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-21 13:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 18, 2015 at 02:42:44PM -0700, Jeffrey Vander Stoep wrote:
> List poison pointer values point to memory that is mappable by
> userspace, i.e. LIST_POISON1 = 0x00100100 and LIST_POISON2 =
> 0x00200200. This means poison values can be valid pointers controlled
> by userspace and can be used to exploit the kernel, as demonstrated in
> a recent Black Hat talk:
> https://www.blackhat.com/docs/us-15/materials/us-15-Xu-Ah-Universal-Android-Rooting-Is-Back-wp.pdf
> 
> Can these poison values be moved to an area not mappable by userspace
> on 32-bit ARM?

As was discussed privately before your message, Catalin and I both
agreed that this is not possible, and I proposed alternatives which
are feasible.

I have now implemented the domain access alternative which I mentioned
during that discussion.  It is suitable for all non-LPAE setups, and
has the effect of blocking almost all implicit kernel accesses to
userspace, thereby substantially reducing the possibility of an attack
similar to that given in the above paper.

It should be said that the following patches won't stop the original
bug from being used to crash the system (that bug has already been
fixed), but they will prevent userspace from masking the crash, and
therefore prevent such use-after-free bugs from being used to gain
privileges.

This approach covers low-vector CPUs as well, with one caveat: the
lower 1MB of userspace will remain accessible to the kernel due to the
need for the vectors to remain visible.  Doing otherwise crashes the
machine on the first exception event.  So here, we offer a "best efforts"
implementation rather than something which completely blocks userspace
access from kernel space.

This is not a simple fix - it's quite involved, and it changes a fair
number of places in the kernel.  It needs time to be proven before any
thought can be given to backporting these changes to stable kernels.
It would be good to get some testing of these changes.

 arch/arm/Kconfig                            | 15 +++++
 arch/arm/include/asm/domain.h               | 45 +++++++++++----
 arch/arm/include/asm/futex.h                | 19 ++++++-
 arch/arm/include/asm/pgtable-2level-hwdef.h |  1 +
 arch/arm/include/asm/thread_info.h          |  3 -
 arch/arm/include/asm/uaccess.h              | 85 +++++++++++++++++++++++++++--
 arch/arm/kernel/armksyms.c                  |  6 +-
 arch/arm/kernel/entry-armv.S                | 27 ++++++---
 arch/arm/kernel/entry-common.S              |  2 +
 arch/arm/kernel/entry-header.S              | 42 ++++++++++++++
 arch/arm/kernel/head.S                      |  5 +-
 arch/arm/kernel/process.c                   | 37 ++++++++++---
 arch/arm/kernel/traps.c                     |  1 -
 arch/arm/lib/clear_user.S                   |  6 +-
 arch/arm/lib/copy_from_user.S               |  6 +-
 arch/arm/lib/copy_to_user.S                 |  6 +-
 arch/arm/lib/uaccess_with_memcpy.c          |  4 +-
 arch/arm/mm/mmu.c                           |  4 +-
 arch/arm/mm/pgd.c                           | 10 ++++
 19 files changed, 267 insertions(+), 57 deletions(-)

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread
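
[Editor's illustration] A rough model of how a domain access control value
gates an access, to make the "domain access alternative" above concrete.
The domain numbers and the domain_val() macro follow the diffs later in this
thread. The two-bit type encoding (0 = no access, 1 = client, 3 = manager)
is the usual ARM short-descriptor one. The access_permitted() helper and the
idea of setting the user domain to "no access" are assumptions about how the
later patches use this, not code from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DOMAIN_KERNEL   0
#define DOMAIN_USER     1

#define DOMAIN_NOACCESS 0
#define DOMAIN_CLIENT   1
#define DOMAIN_MANAGER  3

#define domain_val(dom, type) ((uint32_t)(type) << (2 * (dom)))

/* Extract the two-bit access type for one domain. */
static inline unsigned int domain_type_of(uint32_t dacr, unsigned int dom)
{
	return (dacr >> (2 * dom)) & 3;
}

/*
 * Hypothetical model of the MMU decision.  perms_ok stands for the
 * result of the ordinary page-table permission check.
 */
static inline bool access_permitted(uint32_t dacr, unsigned int dom,
				    bool perms_ok)
{
	switch (domain_type_of(dacr, dom)) {
	case DOMAIN_MANAGER:
		return true;		/* permissions are not checked */
	case DOMAIN_CLIENT:
		return perms_ok;	/* permissions are checked */
	default:
		return false;		/* no access (reserved treated alike) */
	}
}

static inline void dacr_demo(void)
{
	uint32_t dacr_open = domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) |
			     domain_val(DOMAIN_USER, DOMAIN_CLIENT);
	uint32_t dacr_closed = domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) |
			       domain_val(DOMAIN_USER, DOMAIN_NOACCESS);

	assert(access_permitted(dacr_open, DOMAIN_USER, true));
	assert(!access_permitted(dacr_closed, DOMAIN_USER, true));
}
```

In this model, a kernel dereference of a user-controlled poison address
faults whenever the user domain is closed, regardless of what userspace has
mapped there.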

* [PATCH 1/9] ARM: domains: switch to keeping domain value in register
  2015-08-21 13:30 ` Russell King - ARM Linux
@ 2015-08-21 13:31   ` Russell King
  2015-08-21 13:31   ` [PATCH 2/9] ARM: domains: provide domain_mask() Russell King
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Rather than modifying both the domain access control register and our
per-thread copy, modify only the domain access control register, and
use the per-thread copy to save and restore the register over context
switches.  We can also avoid the explicit initialisation of the
init thread_info structure.

This allows us to avoid needing to gain access to the thread information
at the uaccess control sites.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h      | 20 +++++++++++++++-----
 arch/arm/include/asm/thread_info.h |  3 ---
 arch/arm/kernel/entry-armv.S       |  2 ++
 arch/arm/kernel/process.c          | 13 ++++++++++---
 4 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 6ddbe446425e..7f2941905714 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -59,6 +59,17 @@
 
 #ifndef __ASSEMBLY__
 
+static inline unsigned int get_domain(void)
+{
+	unsigned int domain;
+
+	asm(
+	"mrc	p15, 0, %0, c3, c0	@ get domain"
+	 : "=r" (domain));
+
+	return domain;
+}
+
 #ifdef CONFIG_CPU_USE_DOMAINS
 static inline void set_domain(unsigned val)
 {
@@ -70,11 +81,10 @@ static inline void set_domain(unsigned val)
 
 #define modify_domain(dom,type)					\
 	do {							\
-	struct thread_info *thread = current_thread_info();	\
-	unsigned int domain = thread->cpu_domain;		\
-	domain &= ~domain_val(dom, DOMAIN_MANAGER);		\
-	thread->cpu_domain = domain | domain_val(dom, type);	\
-	set_domain(thread->cpu_domain);				\
+		unsigned int domain = get_domain();		\
+		domain &= ~domain_val(dom, DOMAIN_MANAGER);	\
+		domain = domain | domain_val(dom, type);	\
+		set_domain(domain);				\
 	} while (0)
 
 #else
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index bd32eded3e50..0a0aec410d8c 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -74,9 +74,6 @@ struct thread_info {
 	.flags		= 0,						\
 	.preempt_count	= INIT_PREEMPT_COUNT,				\
 	.addr_limit	= KERNEL_DS,					\
-	.cpu_domain	= domain_val(DOMAIN_USER, DOMAIN_MANAGER) |	\
-			  domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) |	\
-			  domain_val(DOMAIN_IO, DOMAIN_CLIENT),		\
 }
 
 #define init_thread_info	(init_thread_union.thread_info)
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 7dac3086e361..d19adcf6c580 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -770,6 +770,8 @@ ENTRY(__switch_to)
 	ldr	r4, [r2, #TI_TP_VALUE]
 	ldr	r5, [r2, #TI_TP_VALUE + 4]
 #ifdef CONFIG_CPU_USE_DOMAINS
+	mrc	p15, 0, r6, c3, c0, 0		@ Get domain register
+	str	r6, [r1, #TI_CPU_DOMAIN]	@ Save old domain register
 	ldr	r6, [r2, #TI_CPU_DOMAIN]
 #endif
 	switch_tls r1, r4, r5, r3, r7
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index f192a2a41719..e722f9b3c9b1 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -146,10 +146,9 @@ void __show_regs(struct pt_regs *regs)
 		buf[0] = '\0';
 #ifdef CONFIG_CPU_CP15_MMU
 		{
-			unsigned int transbase, dac;
+			unsigned int transbase, dac = get_domain();
 			asm("mrc p15, 0, %0, c2, c0\n\t"
-			    "mrc p15, 0, %1, c3, c0\n"
-			    : "=r" (transbase), "=r" (dac));
+			    : "=r" (transbase));
 			snprintf(buf, sizeof(buf), "  Table: %08x  DAC: %08x",
 			  	transbase, dac);
 		}
@@ -210,6 +209,14 @@ copy_thread(unsigned long clone_flags, unsigned long stack_start,
 
 	memset(&thread->cpu_context, 0, sizeof(struct cpu_context_save));
 
+	/*
+	 * Copy the initial value of the domain access control register
+	 * from the current thread: thread->addr_limit will have been
+	 * copied from the current thread via setup_thread_stack() in
+	 * kernel/fork.c
+	 */
+	thread->cpu_domain = get_domain();
+
 	if (likely(!(p->flags & PF_KTHREAD))) {
 		*childregs = *current_pt_regs();
 		childregs->ARM_r0 = 0;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread
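
[Editor's illustration] A userspace simulation of the scheme in patch 1:
the register is the only live copy, and the per-thread field is touched
only at fork and at context switch. The sim_* names and the sim_dacr
variable are hypothetical stand-ins (the real code uses CP15 c3 and
assembly in __switch_to); this is a sketch, not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Stands in for the hardware domain access control register. */
static uint32_t sim_dacr;

struct sim_thread {
	uint32_t cpu_domain;	/* saved copy, valid only while switched out */
};

static inline uint32_t sim_get_domain(void)
{
	return sim_dacr;
}

static inline void sim_set_domain(uint32_t val)
{
	sim_dacr = val;
}

/* At fork: the child starts from the parent's current register value. */
static inline void sim_copy_thread(struct sim_thread *child)
{
	child->cpu_domain = sim_get_domain();
}

/* At context switch: save the outgoing value, load the incoming one. */
static inline void sim_switch_to(struct sim_thread *prev,
				 struct sim_thread *next)
{
	prev->cpu_domain = sim_get_domain();
	sim_set_domain(next->cpu_domain);
}

static inline void switch_demo(void)
{
	struct sim_thread a = { 0 }, b = { 0 };

	sim_set_domain(0x55);
	sim_copy_thread(&b);		/* b inherits 0x55 */
	sim_set_domain(0x51);		/* a changes only the register */
	sim_switch_to(&a, &b);
	assert(a.cpu_domain == 0x51);	/* a's change was saved */
	assert(sim_get_domain() == 0x55);
}
```

Because nothing between switches reads the per-thread copy, code that
changes the register does not need access to the thread information, which
is the point made in the commit message.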

* [PATCH 2/9] ARM: domains: provide domain_mask()
  2015-08-21 13:30 ` Russell King - ARM Linux
  2015-08-21 13:31   ` [PATCH 1/9] ARM: domains: switch to keeping domain value in register Russell King
@ 2015-08-21 13:31   ` Russell King
  2015-08-21 13:31   ` [PATCH 3/9] ARM: domains: move initial domain setting value to asm/domains.h Russell King
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Provide a macro to generate the mask for a domain, rather than using
domain_val(dom, DOMAIN_MANAGER), which won't work when CPU_USE_DOMAINS
is turned off.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 7f2941905714..045b9b453bcd 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -55,7 +55,8 @@
 #define DOMAIN_MANAGER	1
 #endif
 
-#define domain_val(dom,type)	((type) << (2*(dom)))
+#define domain_mask(dom)	((3) << (2 * (dom)))
+#define domain_val(dom,type)	((type) << (2 * (dom)))
 
 #ifndef __ASSEMBLY__
 
@@ -82,7 +83,7 @@ static inline void set_domain(unsigned val)
 #define modify_domain(dom,type)					\
 	do {							\
 		unsigned int domain = get_domain();		\
-		domain &= ~domain_val(dom, DOMAIN_MANAGER);	\
+		domain &= ~domain_mask(dom);			\
 		domain = domain | domain_val(dom, type);	\
 		set_domain(domain);				\
 	} while (0)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread
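
[Editor's illustration] Why a dedicated mask helps, sketched in plain C.
The hunk context above shows DOMAIN_MANAGER defined as 1 in one
configuration branch; this example assumes that is the branch taken when
CPU_USE_DOMAINS is off. The clear_old() and clear_new() helpers are
hypothetical names for the two ways of clearing a domain field.

```c
#include <assert.h>
#include <stdint.h>

#define domain_mask(dom)      (UINT32_C(3) << (2 * (dom)))
#define domain_val(dom, type) ((uint32_t)(type) << (2 * (dom)))

/* Old approach: clear using whatever DOMAIN_MANAGER happens to be. */
static inline uint32_t clear_old(uint32_t dacr, unsigned int dom,
				 unsigned int manager)
{
	return dacr & ~domain_val(dom, manager);
}

/* New approach: always clear both bits of the field. */
static inline uint32_t clear_new(uint32_t dacr, unsigned int dom)
{
	return dacr & ~domain_mask(dom);
}

static inline void mask_demo(void)
{
	uint32_t dacr = domain_val(1, 3);	/* field for domain 1 is 0b11 */

	assert(clear_old(dacr, 1, 3) == 0);	/* works when manager == 3 */
	assert(clear_old(dacr, 1, 1) != 0);	/* leaves a bit set when 1 */
	assert(clear_new(dacr, 1) == 0);	/* independent of the config */
}
```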

* [PATCH 3/9] ARM: domains: move initial domain setting value to asm/domains.h
  2015-08-21 13:30 ` Russell King - ARM Linux
  2015-08-21 13:31   ` [PATCH 1/9] ARM: domains: switch to keeping domain value in register Russell King
  2015-08-21 13:31   ` [PATCH 2/9] ARM: domains: provide domain_mask() Russell King
@ 2015-08-21 13:31   ` Russell King
  2015-08-21 13:31   ` [PATCH 4/9] ARM: domains: get rid of manager mode for user domain Russell King
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h | 6 ++++++
 arch/arm/kernel/head.S        | 5 +----
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 045b9b453bcd..4218f88e8f7e 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -58,6 +58,12 @@
 #define domain_mask(dom)	((3) << (2 * (dom)))
 #define domain_val(dom,type)	((type) << (2 * (dom)))
 
+#define DACR_INIT \
+	(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
+	 domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
+	 domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
+	 domain_val(DOMAIN_IO, DOMAIN_CLIENT))
+
 #ifndef __ASSEMBLY__
 
 static inline unsigned int get_domain(void)
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index bd755d97e459..d56e5e9a9e1e 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -461,10 +461,7 @@ __enable_mmu:
 #ifdef CONFIG_ARM_LPAE
 	mcrr	p15, 0, r4, r5, c2		@ load TTBR0
 #else
-	mov	r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
-		      domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
-		      domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
-		      domain_val(DOMAIN_IO, DOMAIN_CLIENT))
+	mov	r5, #DACR_INIT
 	mcr	p15, 0, r5, c3, c0, 0		@ load domain access register
 	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
 #endif
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 4/9] ARM: domains: get rid of manager mode for user domain
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (2 preceding siblings ...)
  2015-08-21 13:31   ` [PATCH 3/9] ARM: domains: move initial domain setting value to asm/domains.h Russell King
@ 2015-08-21 13:31   ` Russell King
  2015-08-21 13:31   ` [PATCH 5/9] ARM: domains: keep vectors in separate domain Russell King
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Since we switched to early trap initialisation in 94e5a85b3be0
("ARM: earlier initialization of vectors page") we haven't been writing
directly to the vectors page, and so there's no need for this domain
to be in manager mode.  Switch it to client mode.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h | 2 +-
 arch/arm/kernel/traps.c       | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 4218f88e8f7e..08b601e69ddc 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -59,7 +59,7 @@
 #define domain_val(dom,type)	((type) << (2 * (dom)))
 
 #define DACR_INIT \
-	(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
+	(domain_val(DOMAIN_USER, DOMAIN_CLIENT) | \
 	 domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
 	 domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
 	 domain_val(DOMAIN_IO, DOMAIN_CLIENT))
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index d358226236f2..969f9d9e665f 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -870,7 +870,6 @@ void __init early_trap_init(void *vectors_base)
 	kuser_init(vectors_base);
 
 	flush_icache_range(vectors, vectors + PAGE_SIZE * 2);
-	modify_domain(DOMAIN_USER, DOMAIN_CLIENT);
 #else /* ifndef CONFIG_CPU_V7M */
 	/*
 	 * on V7-M there is no need to copy the vector table to a dedicated
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 5/9] ARM: domains: keep vectors in separate domain
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (3 preceding siblings ...)
  2015-08-21 13:31   ` [PATCH 4/9] ARM: domains: get rid of manager mode for user domain Russell King
@ 2015-08-21 13:31   ` Russell King
  2015-08-21 13:31   ` [PATCH 6/9] ARM: domains: remove DOMAIN_TABLE Russell King
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Keep the machine vectors in their own domain to prevent software-based
user access control from making the vector code inaccessible, and
thereby deadlocking the machine.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h               |  4 +++-
 arch/arm/include/asm/pgtable-2level-hwdef.h |  1 +
 arch/arm/mm/mmu.c                           |  4 ++--
 arch/arm/mm/pgd.c                           | 10 ++++++++++
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 08b601e69ddc..396a12e486fe 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -43,6 +43,7 @@
 #define DOMAIN_USER	1
 #define DOMAIN_IO	0
 #endif
+#define DOMAIN_VECTORS	3
 
 /*
  * Domain types
@@ -62,7 +63,8 @@
 	(domain_val(DOMAIN_USER, DOMAIN_CLIENT) | \
 	 domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
 	 domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
-	 domain_val(DOMAIN_IO, DOMAIN_CLIENT))
+	 domain_val(DOMAIN_IO, DOMAIN_CLIENT) | \
+	 domain_val(DOMAIN_VECTORS, DOMAIN_CLIENT))
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm/include/asm/pgtable-2level-hwdef.h b/arch/arm/include/asm/pgtable-2level-hwdef.h
index 5e68278e953e..d0131ee6f6af 100644
--- a/arch/arm/include/asm/pgtable-2level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-2level-hwdef.h
@@ -23,6 +23,7 @@
 #define PMD_PXNTABLE		(_AT(pmdval_t, 1) << 2)     /* v7 */
 #define PMD_BIT4		(_AT(pmdval_t, 1) << 4)
 #define PMD_DOMAIN(x)		(_AT(pmdval_t, (x)) << 5)
+#define PMD_DOMAIN_MASK		PMD_DOMAIN(0x0f)
 #define PMD_PROTECTION		(_AT(pmdval_t, 1) << 9)		/* v5 */
 /*
  *   - section
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 6ca7d9aa896f..a016de248034 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -291,13 +291,13 @@ static struct mem_type mem_types[] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
 				L_PTE_RDONLY,
 		.prot_l1   = PMD_TYPE_TABLE,
-		.domain    = DOMAIN_USER,
+		.domain    = DOMAIN_VECTORS,
 	},
 	[MT_HIGH_VECTORS] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
 				L_PTE_USER | L_PTE_RDONLY,
 		.prot_l1   = PMD_TYPE_TABLE,
-		.domain    = DOMAIN_USER,
+		.domain    = DOMAIN_VECTORS,
 	},
 	[MT_MEMORY_RWX] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY,
diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
index a3681f11dd9f..e683db1b90a3 100644
--- a/arch/arm/mm/pgd.c
+++ b/arch/arm/mm/pgd.c
@@ -84,6 +84,16 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 		if (!new_pte)
 			goto no_pte;
 
+#ifndef CONFIG_ARM_LPAE
+		/*
+		 * Modify the PTE pointer to have the correct domain.  This
+		 * needs to be the vectors domain to avoid the low vectors
+		 * being unmapped.
+		 */
+		pmd_val(*new_pmd) &= ~PMD_DOMAIN_MASK;
+		pmd_val(*new_pmd) |= PMD_DOMAIN(DOMAIN_VECTORS);
+#endif
+
 		init_pud = pud_offset(init_pgd, 0);
 		init_pmd = pmd_offset(init_pud, 0);
 		init_pte = pte_offset_map(init_pmd, 0);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread
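
[Editor's illustration] The pgd.c hunk in patch 5 rewrites the domain field
of a first-level descriptor. A standalone sketch of that bit manipulation
follows. PMD_DOMAIN(), PMD_DOMAIN_MASK and DOMAIN_VECTORS mirror the diff;
the pmdval_t typedef as a 32-bit integer and the two helper functions are
assumptions made so the example compiles outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t pmdval_t;	/* assumed width for the 2-level format */

#define PMD_DOMAIN(x)	((pmdval_t)(x) << 5)
#define PMD_DOMAIN_MASK	PMD_DOMAIN(0x0f)
#define DOMAIN_VECTORS	3

/* Hypothetical helper: read the domain field of a descriptor. */
static inline unsigned int pmd_domain(pmdval_t pmd)
{
	return (pmd & PMD_DOMAIN_MASK) >> 5;
}

/* Hypothetical helper: replace the domain field, keep all other bits. */
static inline pmdval_t pmd_set_domain(pmdval_t pmd, unsigned int dom)
{
	pmd &= ~PMD_DOMAIN_MASK;
	pmd |= PMD_DOMAIN(dom);
	return pmd;
}

static inline void pmd_demo(void)
{
	pmdval_t before = UINT32_C(0x80000001) | PMD_DOMAIN(1);
	pmdval_t after = pmd_set_domain(before, DOMAIN_VECTORS);

	assert(pmd_domain(before) == 1);
	assert(pmd_domain(after) == DOMAIN_VECTORS);
	/* Bits outside the domain field are untouched. */
	assert((after & ~PMD_DOMAIN_MASK) == UINT32_C(0x80000001));
}
```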

* [PATCH 6/9] ARM: domains: remove DOMAIN_TABLE
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (4 preceding siblings ...)
  2015-08-21 13:31   ` [PATCH 5/9] ARM: domains: keep vectors in separate domain Russell King
@ 2015-08-21 13:31   ` Russell King
  2015-08-21 13:31   ` [PATCH 7/9] ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore() Russell King
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

DOMAIN_TABLE is not used; in any case, it aliases to the kernel domain.
Remove this definition.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 396a12e486fe..2be929549938 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -34,12 +34,10 @@
  */
 #ifndef CONFIG_IO_36
 #define DOMAIN_KERNEL	0
-#define DOMAIN_TABLE	0
 #define DOMAIN_USER	1
 #define DOMAIN_IO	2
 #else
 #define DOMAIN_KERNEL	2
-#define DOMAIN_TABLE	2
 #define DOMAIN_USER	1
 #define DOMAIN_IO	0
 #endif
@@ -62,7 +60,6 @@
 #define DACR_INIT \
 	(domain_val(DOMAIN_USER, DOMAIN_CLIENT) | \
 	 domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
-	 domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
 	 domain_val(DOMAIN_IO, DOMAIN_CLIENT) | \
 	 domain_val(DOMAIN_VECTORS, DOMAIN_CLIENT))
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 7/9] ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore()
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (5 preceding siblings ...)
  2015-08-21 13:31   ` [PATCH 6/9] ARM: domains: remove DOMAIN_TABLE Russell King
@ 2015-08-21 13:31   ` Russell King
  2015-08-21 13:31   ` [PATCH 8/9] ARM: entry: provide uaccess assembly macro hooks Russell King
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Provide uaccess_save_and_enable() and uaccess_restore() to permit
control of userspace visibility to the kernel, and hook these into
the appropriate places in the kernel where we need to access
userspace.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/futex.h       | 19 ++++++++--
 arch/arm/include/asm/uaccess.h     | 71 +++++++++++++++++++++++++++++++++++---
 arch/arm/kernel/armksyms.c         |  6 ++--
 arch/arm/lib/clear_user.S          |  6 ++--
 arch/arm/lib/copy_from_user.S      |  6 ++--
 arch/arm/lib/copy_to_user.S        |  6 ++--
 arch/arm/lib/uaccess_with_memcpy.c |  4 +--
 7 files changed, 97 insertions(+), 21 deletions(-)

diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index 5eed82809d82..6795368ad023 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -22,8 +22,11 @@
 #ifdef CONFIG_SMP
 
 #define __futex_atomic_op(insn, ret, oldval, tmp, uaddr, oparg)	\
+({								\
+	unsigned int __ua_flags;				\
 	smp_mb();						\
 	prefetchw(uaddr);					\
+	__ua_flags = uaccess_save_and_enable();			\
 	__asm__ __volatile__(					\
 	"1:	ldrex	%1, [%3]\n"				\
 	"	" insn "\n"					\
@@ -34,12 +37,15 @@
 	__futex_atomic_ex_table("%5")				\
 	: "=&r" (ret), "=&r" (oldval), "=&r" (tmp)		\
 	: "r" (uaddr), "r" (oparg), "Ir" (-EFAULT)		\
-	: "cc", "memory")
+	: "cc", "memory");					\
+	uaccess_restore(__ua_flags);				\
+})
 
 static inline int
 futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 			      u32 oldval, u32 newval)
 {
+	unsigned int __ua_flags;
 	int ret;
 	u32 val;
 
@@ -49,6 +55,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	smp_mb();
 	/* Prefetching cannot fault */
 	prefetchw(uaddr);
+	__ua_flags = uaccess_save_and_enable();
 	__asm__ __volatile__("@futex_atomic_cmpxchg_inatomic\n"
 	"1:	ldrex	%1, [%4]\n"
 	"	teq	%1, %2\n"
@@ -61,6 +68,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	: "=&r" (ret), "=&r" (val)
 	: "r" (oldval), "r" (newval), "r" (uaddr), "Ir" (-EFAULT)
 	: "cc", "memory");
+	uaccess_restore(__ua_flags);
 	smp_mb();
 
 	*uval = val;
@@ -73,6 +81,8 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 #include <asm/domain.h>
 
 #define __futex_atomic_op(insn, ret, oldval, tmp, uaddr, oparg)	\
+({								\
+	unsigned int __ua_flags = uaccess_save_and_enable();	\
 	__asm__ __volatile__(					\
 	"1:	" TUSER(ldr) "	%1, [%3]\n"			\
 	"	" insn "\n"					\
@@ -81,12 +91,15 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	__futex_atomic_ex_table("%5")				\
 	: "=&r" (ret), "=&r" (oldval), "=&r" (tmp)		\
 	: "r" (uaddr), "r" (oparg), "Ir" (-EFAULT)		\
-	: "cc", "memory")
+	: "cc", "memory");					\
+	uaccess_restore(__ua_flags);				\
+})
 
 static inline int
 futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 			      u32 oldval, u32 newval)
 {
+	unsigned int __ua_flags;
 	int ret = 0;
 	u32 val;
 
@@ -94,6 +107,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 		return -EFAULT;
 
 	preempt_disable();
+	__ua_flags = uaccess_save_and_enable();
 	__asm__ __volatile__("@futex_atomic_cmpxchg_inatomic\n"
 	"1:	" TUSER(ldr) "	%1, [%4]\n"
 	"	teq	%1, %2\n"
@@ -103,6 +117,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	: "+r" (ret), "=&r" (val)
 	: "r" (oldval), "r" (newval), "r" (uaddr), "Ir" (-EFAULT)
 	: "cc", "memory");
+	uaccess_restore(__ua_flags);
 
 	*uval = val;
 	preempt_enable();
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 74b17d09ef7a..4ae10967a8ba 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -94,6 +94,21 @@ static inline void set_fs(mm_segment_t fs)
 	flag; })
 
 /*
+ * These two functions allow hooking accesses to userspace to increase
+ * system integrity by ensuring that the kernel cannot inadvertently
+ * perform such accesses (e.g. via list poison values) which could then
+ * be exploited for privilege escalation.
+ */
+static inline unsigned int uaccess_save_and_enable(void)
+{
+	return 0;
+}
+
+static inline void uaccess_restore(unsigned int flags)
+{
+}
+
+/*
  * Single-value transfer routines.  They automatically use the right
  * size if we just have the right pointer type.  Note that the functions
  * which read from user space (*get_*) need to take care not to leak
@@ -165,6 +180,7 @@ extern int __get_user_64t_4(void *);
 		register typeof(x) __r2 asm("r2");			\
 		register unsigned long __l asm("r1") = __limit;		\
 		register int __e asm("r0");				\
+		unsigned int __ua_flags = uaccess_save_and_enable();	\
 		switch (sizeof(*(__p))) {				\
 		case 1:							\
 			if (sizeof((x)) >= 8)				\
@@ -192,6 +208,7 @@ extern int __get_user_64t_4(void *);
 			break;						\
 		default: __e = __get_user_bad(); break;			\
 		}							\
+		uaccess_restore(__ua_flags);				\
 		x = (typeof(*(p))) __r2;				\
 		__e;							\
 	})
@@ -224,6 +241,7 @@ extern int __put_user_8(void *, unsigned long long);
 		register const typeof(*(p)) __user *__p asm("r0") = __tmp_p; \
 		register unsigned long __l asm("r1") = __limit;		\
 		register int __e asm("r0");				\
+		unsigned int __ua_flags = uaccess_save_and_enable();	\
 		switch (sizeof(*(__p))) {				\
 		case 1:							\
 			__put_user_x(__r2, __p, __e, __l, 1);		\
@@ -239,6 +257,7 @@ extern int __put_user_8(void *, unsigned long long);
 			break;						\
 		default: __e = __put_user_bad(); break;			\
 		}							\
+		uaccess_restore(__ua_flags);				\
 		__e;							\
 	})
 
@@ -300,14 +319,17 @@ static inline void set_fs(mm_segment_t fs)
 do {									\
 	unsigned long __gu_addr = (unsigned long)(ptr);			\
 	unsigned long __gu_val;						\
+	unsigned int __ua_flags;					\
 	__chk_user_ptr(ptr);						\
 	might_fault();							\
+	__ua_flags = uaccess_save_and_enable();				\
 	switch (sizeof(*(ptr))) {					\
 	case 1:	__get_user_asm_byte(__gu_val, __gu_addr, err);	break;	\
 	case 2:	__get_user_asm_half(__gu_val, __gu_addr, err);	break;	\
 	case 4:	__get_user_asm_word(__gu_val, __gu_addr, err);	break;	\
 	default: (__gu_val) = __get_user_bad();				\
 	}								\
+	uaccess_restore(__ua_flags);					\
 	(x) = (__typeof__(*(ptr)))__gu_val;				\
 } while (0)
 
@@ -381,9 +403,11 @@ do {									\
 #define __put_user_err(x, ptr, err)					\
 do {									\
 	unsigned long __pu_addr = (unsigned long)(ptr);			\
+	unsigned int __ua_flags;					\
 	__typeof__(*(ptr)) __pu_val = (x);				\
 	__chk_user_ptr(ptr);						\
 	might_fault();							\
+	__ua_flags = uaccess_save_and_enable();				\
 	switch (sizeof(*(ptr))) {					\
 	case 1: __put_user_asm_byte(__pu_val, __pu_addr, err);	break;	\
 	case 2: __put_user_asm_half(__pu_val, __pu_addr, err);	break;	\
@@ -391,6 +415,7 @@ do {									\
 	case 8:	__put_user_asm_dword(__pu_val, __pu_addr, err);	break;	\
 	default: __put_user_bad();					\
 	}								\
+	uaccess_restore(__ua_flags);					\
 } while (0)
 
 #define __put_user_asm_byte(x, __pu_addr, err)			\
@@ -474,11 +499,46 @@ do {									\
 
 
 #ifdef CONFIG_MMU
-extern unsigned long __must_check __copy_from_user(void *to, const void __user *from, unsigned long n);
-extern unsigned long __must_check __copy_to_user(void __user *to, const void *from, unsigned long n);
-extern unsigned long __must_check __copy_to_user_std(void __user *to, const void *from, unsigned long n);
-extern unsigned long __must_check __clear_user(void __user *addr, unsigned long n);
-extern unsigned long __must_check __clear_user_std(void __user *addr, unsigned long n);
+extern unsigned long __must_check
+arm_copy_from_user(void *to, const void __user *from, unsigned long n);
+
+static inline unsigned long __must_check
+__copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	unsigned int __ua_flags = uaccess_save_and_enable();
+	n = arm_copy_from_user(to, from, n);
+	uaccess_restore(__ua_flags);
+	return n;
+}
+
+extern unsigned long __must_check
+arm_copy_to_user(void __user *to, const void *from, unsigned long n);
+extern unsigned long __must_check
+__copy_to_user_std(void __user *to, const void *from, unsigned long n);
+
+static inline unsigned long __must_check
+__copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	unsigned int __ua_flags = uaccess_save_and_enable();
+	n = arm_copy_to_user(to, from, n);
+	uaccess_restore(__ua_flags);
+	return n;
+}
+
+extern unsigned long __must_check
+arm_clear_user(void __user *addr, unsigned long n);
+extern unsigned long __must_check
+__clear_user_std(void __user *addr, unsigned long n);
+
+static inline unsigned long __must_check
+__clear_user(void __user *addr, unsigned long n)
+{
+	unsigned int __ua_flags = uaccess_save_and_enable();
+	n = arm_clear_user(addr, n);
+	uaccess_restore(__ua_flags);
+	return n;
+}
+
 #else
 #define __copy_from_user(to, from, n)	(memcpy(to, (void __force *)from, n), 0)
 #define __copy_to_user(to, from, n)	(memcpy((void __force *)to, from, n), 0)
@@ -511,6 +571,7 @@ static inline unsigned long __must_check clear_user(void __user *to, unsigned lo
 	return n;
 }
 
+/* These are from lib/ code, and use __get_user() and friends */
 extern long strncpy_from_user(char *dest, const char __user *src, long count);
 
 extern __must_check long strlen_user(const char __user *str);
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index a88671cfe1ff..a35d72d30b56 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -91,9 +91,9 @@ EXPORT_SYMBOL(__memzero);
 #ifdef CONFIG_MMU
 EXPORT_SYMBOL(copy_page);
 
-EXPORT_SYMBOL(__copy_from_user);
-EXPORT_SYMBOL(__copy_to_user);
-EXPORT_SYMBOL(__clear_user);
+EXPORT_SYMBOL(arm_copy_from_user);
+EXPORT_SYMBOL(arm_copy_to_user);
+EXPORT_SYMBOL(arm_clear_user);
 
 EXPORT_SYMBOL(__get_user_1);
 EXPORT_SYMBOL(__get_user_2);
diff --git a/arch/arm/lib/clear_user.S b/arch/arm/lib/clear_user.S
index 1710fd7db2d5..970d6c043774 100644
--- a/arch/arm/lib/clear_user.S
+++ b/arch/arm/lib/clear_user.S
@@ -12,14 +12,14 @@
 
 		.text
 
-/* Prototype: int __clear_user(void *addr, size_t sz)
+/* Prototype: unsigned long arm_clear_user(void *addr, size_t sz)
  * Purpose  : clear some user memory
  * Params   : addr - user memory address to clear
  *          : sz   - number of bytes to clear
  * Returns  : number of bytes NOT cleared
  */
 ENTRY(__clear_user_std)
-WEAK(__clear_user)
+WEAK(arm_clear_user)
 		stmfd	sp!, {r1, lr}
 		mov	r2, #0
 		cmp	r1, #4
@@ -44,7 +44,7 @@ WEAK(__clear_user)
 USER(		strnebt	r2, [r0])
 		mov	r0, #0
 		ldmfd	sp!, {r1, pc}
-ENDPROC(__clear_user)
+ENDPROC(arm_clear_user)
 ENDPROC(__clear_user_std)
 
 		.pushsection .text.fixup,"ax"
diff --git a/arch/arm/lib/copy_from_user.S b/arch/arm/lib/copy_from_user.S
index 7a235b9952be..1512bebfbf1b 100644
--- a/arch/arm/lib/copy_from_user.S
+++ b/arch/arm/lib/copy_from_user.S
@@ -17,7 +17,7 @@
 /*
  * Prototype:
  *
- *	size_t __copy_from_user(void *to, const void *from, size_t n)
+ *	size_t arm_copy_from_user(void *to, const void *from, size_t n)
  *
  * Purpose:
  *
@@ -89,11 +89,11 @@
 
 	.text
 
-ENTRY(__copy_from_user)
+ENTRY(arm_copy_from_user)
 
 #include "copy_template.S"
 
-ENDPROC(__copy_from_user)
+ENDPROC(arm_copy_from_user)
 
 	.pushsection .fixup,"ax"
 	.align 0
diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S
index 9648b0675a3e..caf5019d8161 100644
--- a/arch/arm/lib/copy_to_user.S
+++ b/arch/arm/lib/copy_to_user.S
@@ -17,7 +17,7 @@
 /*
  * Prototype:
  *
- *	size_t __copy_to_user(void *to, const void *from, size_t n)
+ *	size_t arm_copy_to_user(void *to, const void *from, size_t n)
  *
  * Purpose:
  *
@@ -93,11 +93,11 @@
 	.text
 
 ENTRY(__copy_to_user_std)
-WEAK(__copy_to_user)
+WEAK(arm_copy_to_user)
 
 #include "copy_template.S"
 
-ENDPROC(__copy_to_user)
+ENDPROC(arm_copy_to_user)
 ENDPROC(__copy_to_user_std)
 
 	.pushsection .text.fixup,"ax"
diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c
index 3e58d710013c..77f020e75ccd 100644
--- a/arch/arm/lib/uaccess_with_memcpy.c
+++ b/arch/arm/lib/uaccess_with_memcpy.c
@@ -136,7 +136,7 @@ __copy_to_user_memcpy(void __user *to, const void *from, unsigned long n)
 }
 
 unsigned long
-__copy_to_user(void __user *to, const void *from, unsigned long n)
+arm_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
 	/*
 	 * This test is stubbed out of the main function above to keep
@@ -190,7 +190,7 @@ __clear_user_memset(void __user *addr, unsigned long n)
 	return n;
 }
 
-unsigned long __clear_user(void __user *addr, unsigned long n)
+unsigned long arm_clear_user(void __user *addr, unsigned long n)
 {
 	/* See rational for this in __copy_to_user() above. */
 	if (n < 64)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 8/9] ARM: entry: provide uaccess assembly macro hooks
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (6 preceding siblings ...)
  2015-08-21 13:31   ` [PATCH 7/9] ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore() Russell King
@ 2015-08-21 13:31   ` Russell King
  2015-08-27 21:40     ` Stephen Boyd
  2015-08-21 13:31   ` [PATCH 9/9] ARM: software-based priviledged-no-access support Russell King
                     ` (4 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Provide hooks into the kernel entry and exit paths to permit control
of userspace visibility to the kernel.  The intended use is:

- on entry to kernel from user, uaccess_disable will be called to
  disable userspace visibility
- on exit from kernel to user, uaccess_enable will be called to
  enable userspace visibility
- on entry from a kernel exception, uaccess_save_and_disable will be
  called to save the current userspace visibility setting, and disable
  access
- on exit from a kernel exception, uaccess_restore will be called to
  restore the userspace visibility as it was before the exception
  occurred.

These hooks allow us to keep userspace visibility disabled for the
vast majority of the kernel, except for localised regions where we
want to explicitly access userspace.
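
The four hook points listed above can be modelled in plain C. This is a
minimal sketch, not the kernel implementation: the real hooks are
assembly macros, and here a single hypothetical flag (user_visible)
stands in for whatever hardware state controls access. The struct and
function names other than the four hook names are invented for
illustration.

```c
#include <stdbool.h>

/* Hypothetical model state: may kernel code touch user mappings? */
static bool user_visible = true;

/* Stand-in for the slot an exception frame would use to save state. */
struct exc_frame {
	bool saved_user_visible;
};

static void uaccess_disable(void) { user_visible = false; }
static void uaccess_enable(void)  { user_visible = true; }

static void uaccess_save_and_disable(struct exc_frame *f)
{
	f->saved_user_visible = user_visible;	/* remember interrupted state */
	user_visible = false;
}

static void uaccess_restore(const struct exc_frame *f)
{
	user_visible = f->saved_user_visible;	/* back to interrupted state */
}

/* What the model observes at each stage of one syscall. */
struct trace {
	bool in_syscall, in_copy, in_irq, after_irq, after_copy, back_in_user;
};

/* A syscall that opens a user-access window and takes an IRQ inside it. */
static struct trace simulate_syscall_with_irq(void)
{
	struct trace t;
	struct exc_frame irq_frame;
	bool old;

	user_visible = true;			/* running in userspace */
	uaccess_disable();			/* entry to kernel from user */
	t.in_syscall = user_visible;

	old = user_visible;			/* localised region begins */
	user_visible = true;
	t.in_copy = user_visible;

	uaccess_save_and_disable(&irq_frame);	/* entry from kernel exception */
	t.in_irq = user_visible;
	uaccess_restore(&irq_frame);		/* exit from kernel exception */
	t.after_irq = user_visible;

	user_visible = old;			/* localised region ends */
	t.after_copy = user_visible;

	uaccess_enable();			/* exit from kernel to user */
	t.back_in_user = user_visible;
	return t;
}
```

The point of the save/restore pair is visible in the IRQ step: the
handler runs with access disabled, yet the interrupted user-access
window is intact when it returns.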

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/kernel/entry-armv.S   | 25 ++++++++++++++++++-------
 arch/arm/kernel/entry-common.S |  2 ++
 arch/arm/kernel/entry-header.S | 17 +++++++++++++++++
 3 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index d19adcf6c580..f6102b39517c 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -152,7 +152,7 @@ ENDPROC(__und_invalid)
 	.macro	svc_entry, stack_hole=0, trace=1
  UNWIND(.fnstart		)
  UNWIND(.save {r0 - pc}		)
-	sub	sp, sp, #(S_FRAME_SIZE + \stack_hole - 4)
+	sub	sp, sp, #(S_FRAME_SIZE + 8 + \stack_hole - 4)
 #ifdef CONFIG_THUMB2_KERNEL
  SPFIX(	str	r0, [sp]	)	@ temporarily saved
  SPFIX(	mov	r0, sp		)
@@ -167,7 +167,7 @@ ENDPROC(__und_invalid)
 	ldmia	r0, {r3 - r5}
 	add	r7, sp, #S_SP - 4	@ here for interlock avoidance
 	mov	r6, #-1			@  ""  ""      ""       ""
-	add	r2, sp, #(S_FRAME_SIZE + \stack_hole - 4)
+	add	r2, sp, #(S_FRAME_SIZE + 8 + \stack_hole - 4)
  SPFIX(	addeq	r2, r2, #4	)
 	str	r3, [sp, #-4]!		@ save the "real" r0 copied
 					@ from the exception stack
@@ -185,6 +185,8 @@ ENDPROC(__und_invalid)
 	@
 	stmia	r7, {r2 - r6}
 
+	uaccess_save_and_disable r0
+
 	.if \trace
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
@@ -368,7 +370,7 @@ ENDPROC(__fiq_abt)
 #error "sizeof(struct pt_regs) must be a multiple of 8"
 #endif
 
-	.macro	usr_entry, trace=1
+	.macro	usr_entry, trace=1, uaccess=1
  UNWIND(.fnstart	)
  UNWIND(.cantunwind	)	@ don't unwind the user space
 	sub	sp, sp, #S_FRAME_SIZE
@@ -400,6 +402,10 @@ ENDPROC(__fiq_abt)
  ARM(	stmdb	r0, {sp, lr}^			)
  THUMB(	store_user_sp_lr r0, r1, S_SP - S_PC	)
 
+	.if \uaccess
+	uaccess_disable ip
+	.endif
+
 	@ Enable the alignment trap while in kernel mode
  ATRAP(	teq	r8, r7)
  ATRAP( mcrne	p15, 0, r8, c1, c0, 0)
@@ -458,7 +464,7 @@ ENDPROC(__irq_usr)
 
 	.align	5
 __und_usr:
-	usr_entry
+	usr_entry uaccess=0
 
 	mov	r2, r4
 	mov	r3, r5
@@ -484,6 +490,8 @@ __und_usr:
 1:	ldrt	r0, [r4]
  ARM_BE8(rev	r0, r0)				@ little endian instruction
 
+	uaccess_disable ip
+
 	@ r0 = 32-bit ARM instruction which caused the exception
 	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
 	@ r4 = PC value for the faulting instruction
@@ -518,9 +526,10 @@ __und_usr_thumb:
 2:	ldrht	r5, [r4]
 ARM_BE8(rev16	r5, r5)				@ little endian instruction
 	cmp	r5, #0xe800			@ 32bit instruction if xx != 0
-	blo	__und_usr_fault_16		@ 16bit undefined instruction
+	blo	__und_usr_fault_16_pan		@ 16bit undefined instruction
 3:	ldrht	r0, [r2]
 ARM_BE8(rev16	r0, r0)				@ little endian instruction
+	uaccess_disable ip
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	str	r2, [sp, #S_PC]			@ it's a 2x16bit instr, update
 	orr	r0, r0, r5, lsl #16
@@ -715,6 +724,8 @@ ENDPROC(no_fp)
 __und_usr_fault_32:
 	mov	r1, #4
 	b	1f
+__und_usr_fault_16_pan:
+	uaccess_disable ip
 __und_usr_fault_16:
 	mov	r1, #2
 1:	mov	r0, sp
@@ -769,7 +780,7 @@ ENTRY(__switch_to)
  THUMB(	str	lr, [ip], #4		   )
 	ldr	r4, [r2, #TI_TP_VALUE]
 	ldr	r5, [r2, #TI_TP_VALUE + 4]
-#ifdef CONFIG_CPU_USE_DOMAINS
+#if defined(CONFIG_CPU_USE_DOMAINS) || defined(CONFIG_CPU_SW_DOMAIN_PAN)
 	mrc	p15, 0, r6, c3, c0, 0		@ Get domain register
 	str	r6, [r1, #TI_CPU_DOMAIN]	@ Save old domain register
 	ldr	r6, [r2, #TI_CPU_DOMAIN]
@@ -780,7 +791,7 @@ ENTRY(__switch_to)
 	ldr	r8, =__stack_chk_guard
 	ldr	r7, [r7, #TSK_STACK_CANARY]
 #endif
-#ifdef CONFIG_CPU_USE_DOMAINS
+#if defined(CONFIG_CPU_USE_DOMAINS) || defined(CONFIG_CPU_SW_DOMAIN_PAN)
 	mcr	p15, 0, r6, c3, c0, 0		@ Set domain register
 #endif
 	mov	r5, r0
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 92828a1dec80..189154980703 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -173,6 +173,8 @@ ENTRY(vector_swi)
  USER(	ldr	scno, [lr, #-4]		)	@ get SWI instruction
 #endif
 
+	uaccess_disable tbl
+
 	adr	tbl, sys_call_table		@ load syscall table pointer
 
 #if defined(CONFIG_OABI_COMPAT)
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index 1a0045abead7..3aa6c3742182 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -53,6 +53,18 @@
 #endif
 	.endm
 
+	.macro	uaccess_disable, tmp
+	.endm
+
+	.macro	uaccess_enable, tmp
+	.endm
+
+	.macro	uaccess_save_and_disable, tmp
+	.endm
+
+	.macro	uaccess_restore
+	.endm
+
 #ifdef CONFIG_CPU_V7M
 /*
  * ARMv7-M exception entry/exit macros.
@@ -215,6 +227,7 @@
 	blne	trace_hardirqs_off
 #endif
 	.endif
+	uaccess_restore
 	msr	spsr_cxsf, \rpsr
 #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
 	@ We must avoid clrex due to Cortex-A15 erratum #830321
@@ -241,6 +254,7 @@
 	@ on the stack remains correct).
 	@
 	.macro  svc_exit_via_fiq
+	uaccess_restore
 	mov	r0, sp
 	ldmib	r0, {r1 - r14}	@ abort is deadly from here onward (it will
 				@ clobber state restored below)
@@ -253,6 +267,7 @@
 	.endm
 
 	.macro	restore_user_regs, fast = 0, offset = 0
+	uaccess_enable r1
 	mov	r2, sp
 	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
 	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
@@ -329,6 +344,7 @@
 	 * part of each exception entry and exit sequence.
 	 */
 	.macro	restore_user_regs, fast = 0, offset = 0
+	uaccess_enable r1
 	.if	\offset
 	add	sp, #\offset
 	.endif
@@ -336,6 +352,7 @@
 	.endm
 #else	/* ifdef CONFIG_CPU_V7M */
 	.macro	restore_user_regs, fast = 0, offset = 0
+	uaccess_enable r1
 	mov	r2, sp
 	load_user_sp_lr r2, r3, \offset + S_SP	@ calling sp, lr
 	ldr	r1, [sp, #\offset + S_PSR]	@ get calling cpsr
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 9/9] ARM: software-based priviledged-no-access support
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (7 preceding siblings ...)
  2015-08-21 13:31   ` [PATCH 8/9] ARM: entry: provide uaccess assembly macro hooks Russell King
@ 2015-08-21 13:31   ` Russell King
  2015-08-25 10:32       ` Geert Uytterhoeven
  2015-08-25 14:05     ` Will Deacon
  2015-08-21 13:46   ` [PATCH 0/4] Efficiency cleanups Russell King - ARM Linux
                     ` (3 subsequent siblings)
  12 siblings, 2 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Provide a software-based implementation of the privileged no-access
support found in ARMv8.1.

Userspace pages are mapped using a different domain number from the
kernel and IO mappings.  If we switch the user domain to "no access"
when we enter the kernel, we can prevent the kernel from touching
userspace.

However, the kernel needs to be able to access userspace via the
various user accessor functions.  With the wrapping in the previous
patch, we can temporarily enable access when the kernel needs user
access, and re-disable it afterwards.

This allows us to trap unintended accesses to userspace, e.g. those
caused by an inadvertent dereference of the LIST_POISON* values, which,
with appropriate user mappings set up, can be made to succeed.  This in
turn can allow use-after-free bugs to be exploited further than would
otherwise be possible.
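
The domain switching described above can be sketched in userspace C.
This is an illustration under stated assumptions, not the kernel code:
sim_dacr is a hypothetical variable standing in for the CP15 domain
access control register, the domain numbers (kernel 0, user 1, IO 2,
vectors 3) and access types (no access 0, client 1) are assumed values,
and user_access_allowed() is a helper invented for the sketch.

```c
#include <stdint.h>

/* Assumed domain numbers and access types; each domain owns 2 bits. */
#define DOMAIN_KERNEL	0
#define DOMAIN_USER	1
#define DOMAIN_IO	2
#define DOMAIN_VECTORS	3

#define DOMAIN_NOACCESS	0u
#define DOMAIN_CLIENT	1u

#define domain_val(dom, type)	((uint32_t)(type) << (2 * (dom)))
#define domain_mask(dom)	(UINT32_C(3) << (2 * (dom)))

#define DACR_DEFAULT \
	(domain_val(DOMAIN_KERNEL, DOMAIN_CLIENT) | \
	 domain_val(DOMAIN_IO, DOMAIN_CLIENT) | \
	 domain_val(DOMAIN_VECTORS, DOMAIN_CLIENT))

#define DACR_UACCESS_DISABLE \
	(DACR_DEFAULT | domain_val(DOMAIN_USER, DOMAIN_NOACCESS))
#define DACR_UACCESS_ENABLE \
	(DACR_DEFAULT | domain_val(DOMAIN_USER, DOMAIN_CLIENT))

/* Hypothetical stand-in for the hardware register. */
static uint32_t sim_dacr = DACR_UACCESS_DISABLE;

static uint32_t get_domain(void)	{ return sim_dacr; }
static void set_domain(uint32_t val)	{ sim_dacr = val; }

/* Open a user-access window; the old register value is the cookie. */
static uint32_t uaccess_save_and_enable(void)
{
	uint32_t old_domain = get_domain();

	set_domain((old_domain & ~domain_mask(DOMAIN_USER)) |
		   domain_val(DOMAIN_USER, DOMAIN_CLIENT));
	return old_domain;
}

/* Close the window by restoring whatever was there before. */
static void uaccess_restore(uint32_t flags)
{
	set_domain(flags);
}

/* Sketch-only helper: would an access to a user mapping be permitted? */
static int user_access_allowed(void)
{
	return (get_domain() & domain_mask(DOMAIN_USER)) !=
	       domain_val(DOMAIN_USER, DOMAIN_NOACCESS);
}
```

Because the accessor restores the saved value rather than blindly
disabling, nested windows behave sensibly: an inner restore leaves the
outer window open. With the assumed numbering the two constants work
out to 0x51 and 0x55.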

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/Kconfig               | 15 +++++++++++++++
 arch/arm/include/asm/domain.h  | 15 ++++++++++++---
 arch/arm/include/asm/uaccess.h | 14 ++++++++++++++
 arch/arm/kernel/entry-header.S | 25 +++++++++++++++++++++++++
 arch/arm/kernel/process.c      | 24 ++++++++++++++++++------
 5 files changed, 84 insertions(+), 9 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a750c1425c3a..a898eb72da51 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1694,6 +1694,21 @@ config HIGHPTE
 	bool "Allocate 2nd-level pagetables from highmem"
 	depends on HIGHMEM
 
+config CPU_SW_DOMAIN_PAN
+	bool "Enable use of CPU domains to implement priviledged no-access"
+	depends on MMU && !ARM_LPAE
+	default y
+	help
+	  Increase kernel security by ensuring that normal kernel accesses
+	  are unable to access userspace addresses.  This can help prevent
+	  use-after-free bugs becoming an exploitable privilege escalation
+	  by ensuring that magic values (such as LIST_POISON) will always
+	  fault when dereferenced.
+
+	  CPUs with low-vector mappings use a best-efforts implementation.
+	  Their lower 1MB needs to remain accessible for the vectors, but
+	  the remainder of userspace will become appropriately inaccessible.
+
 config HW_PERF_EVENTS
 	bool "Enable hardware performance counter support for perf events"
 	depends on PERF_EVENTS
diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 2be929549938..0c373979af00 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -58,11 +58,21 @@
 #define domain_val(dom,type)	((type) << (2 * (dom)))
 
 #define DACR_INIT \
-	(domain_val(DOMAIN_USER, DOMAIN_CLIENT) | \
+	(domain_val(DOMAIN_USER, DOMAIN_NOACCESS) | \
 	 domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
 	 domain_val(DOMAIN_IO, DOMAIN_CLIENT) | \
 	 domain_val(DOMAIN_VECTORS, DOMAIN_CLIENT))
 
+#define __DACR_DEFAULT \
+	domain_val(DOMAIN_KERNEL, DOMAIN_CLIENT) | \
+	domain_val(DOMAIN_IO, DOMAIN_CLIENT) | \
+	domain_val(DOMAIN_VECTORS, DOMAIN_CLIENT)
+
+#define DACR_UACCESS_DISABLE	\
+	(__DACR_DEFAULT | domain_val(DOMAIN_USER, DOMAIN_NOACCESS))
+#define DACR_UACCESS_ENABLE	\
+	(__DACR_DEFAULT | domain_val(DOMAIN_USER, DOMAIN_CLIENT))
+
 #ifndef __ASSEMBLY__
 
 static inline unsigned int get_domain(void)
@@ -76,7 +86,6 @@ static inline unsigned int get_domain(void)
 	return domain;
 }
 
-#ifdef CONFIG_CPU_USE_DOMAINS
 static inline void set_domain(unsigned val)
 {
 	asm volatile(
@@ -85,6 +94,7 @@ static inline void set_domain(unsigned val)
 	isb();
 }
 
+#ifdef CONFIG_CPU_USE_DOMAINS
 #define modify_domain(dom,type)					\
 	do {							\
 		unsigned int domain = get_domain();		\
@@ -94,7 +104,6 @@ static inline void set_domain(unsigned val)
 	} while (0)
 
 #else
-static inline void set_domain(unsigned val) { }
 static inline void modify_domain(unsigned dom, unsigned type)	{ }
 #endif
 
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 4ae10967a8ba..cb802870ffb9 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -101,11 +101,25 @@ static inline void set_fs(mm_segment_t fs)
  */
 static inline unsigned int uaccess_save_and_enable(void)
 {
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+	unsigned int old_domain = get_domain();
+
+	/* Set the current domain access to permit user accesses */
+	set_domain((old_domain & ~domain_mask(DOMAIN_USER)) |
+		   domain_val(DOMAIN_USER, DOMAIN_CLIENT));
+
+	return old_domain;
+#else
 	return 0;
+#endif
 }
 
 static inline void uaccess_restore(unsigned int flags)
 {
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+	/* Restore the user access mask */
+	set_domain(flags);
+#endif
 }
 
 /*
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index 3aa6c3742182..bec7ee0764e1 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -54,15 +54,40 @@
 	.endm
 
 	.macro	uaccess_disable, tmp
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+	/*
+	 * Whenever we re-enter userspace, the domains should always be
+	 * set appropriately.
+	 */
+	mov	\tmp, #DACR_UACCESS_DISABLE
+	mcr	p15, 0, \tmp, c3, c0, 0		@ Set domain register
+#endif
 	.endm
 
 	.macro	uaccess_enable, tmp
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+	/*
+	 * Whenever we re-enter userspace, the domains should always be
+	 * set appropriately.
+	 */
+	mov	\tmp, #DACR_UACCESS_ENABLE
+	mcr	p15, 0, \tmp, c3, c0, 0
+#endif
 	.endm
 
 	.macro	uaccess_save_and_disable, tmp
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+	mrc	p15, 0, \tmp, c3, c0, 0
+	str	\tmp, [sp, #S_FRAME_SIZE]
+#endif
+	uaccess_disable \tmp
 	.endm
 
 	.macro	uaccess_restore
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+	ldr	r0, [sp, #S_FRAME_SIZE]
+	mcr	p15, 0, r0, c3, c0, 0
+#endif
 	.endm
 
 #ifdef CONFIG_CPU_V7M
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index e722f9b3c9b1..b407cc7a7b55 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -129,12 +129,24 @@ void __show_regs(struct pt_regs *regs)
 	buf[4] = '\0';
 
 #ifndef CONFIG_CPU_V7M
-	printk("Flags: %s  IRQs o%s  FIQs o%s  Mode %s  ISA %s  Segment %s\n",
-		buf, interrupts_enabled(regs) ? "n" : "ff",
-		fast_interrupts_enabled(regs) ? "n" : "ff",
-		processor_modes[processor_mode(regs)],
-		isa_modes[isa_mode(regs)],
-		get_fs() == get_ds() ? "kernel" : "user");
+	{
+		unsigned int domain = get_domain();
+		const char *segment;
+
+		if ((domain & domain_mask(DOMAIN_USER)) ==
+		    domain_val(DOMAIN_USER, DOMAIN_NOACCESS))
+			segment = "none";
+		else if (get_fs() == get_ds())
+			segment = "kernel";
+		else
+			segment = "user";
+
+		printk("Flags: %s  IRQs o%s  FIQs o%s  Mode %s  ISA %s  Segment %s\n",
+			buf, interrupts_enabled(regs) ? "n" : "ff",
+			fast_interrupts_enabled(regs) ? "n" : "ff",
+			processor_modes[processor_mode(regs)],
+			isa_modes[isa_mode(regs)], segment);
+	}
 #else
 	printk("xPSR: %08lx\n", regs->ARM_cpsr);
 #endif
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 0/4] Efficiency cleanups
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (8 preceding siblings ...)
  2015-08-21 13:31   ` [PATCH 9/9] ARM: software-based priviledged-no-access support Russell King
@ 2015-08-21 13:46   ` Russell King - ARM Linux
  2015-08-21 13:48     ` [PATCH 1/4] ARM: uaccess: simplify user access assembly Russell King
                       ` (4 more replies)
  2015-08-21 17:32   ` Prevent list poison values from being mapped by userspace processes Catalin Marinas
                     ` (2 subsequent siblings)
  12 siblings, 5 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-21 13:46 UTC (permalink / raw)
  To: linux-arm-kernel

While developing the previous patch set, I noticed that the kernel's
"fast" exit path efficiency was not what it's supposed to be due to the
addition of trace and context tracking.

These add several instances of register stacking and unstacking around
various function calls, several of which we can avoid.  Many of these
instances don't need to stack any register other than r0.

Moreover, we can avoid stacking and unstacking r0 if these features are
enabled by storing r0 in the pt_regs as we would do in our slow return
path.

Various other cleanups are included in this set as well.  Acks welcome.

 arch/arm/include/asm/assembler.h   | 22 ++++++++------
 arch/arm/include/asm/thread_info.h | 20 +++++--------
 arch/arm/include/asm/uaccess.h     | 47 ++++++++---------------------
 arch/arm/kernel/entry-common.S     | 61 ++++++++++++++++++++++++++++----------
 arch/arm/kernel/signal.c           |  6 ++++
 5 files changed, 84 insertions(+), 72 deletions(-)

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/4] ARM: uaccess: simplify user access assembly
  2015-08-21 13:46   ` [PATCH 0/4] Efficiency cleanups Russell King - ARM Linux
@ 2015-08-21 13:48     ` Russell King
  2015-08-21 13:48     ` [PATCH 2/4] ARM: entry: get rid of asm_trace_hardirqs_on_cond Russell King
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

The user assembly for byte and word accesses was virtually identical.
Rather than duplicating this, use a macro instead.
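
The deduplication relies on passing the instruction mnemonic as a macro
argument and letting the preprocessor stringify it. A minimal sketch of
that mechanism in portable C follows. All names here are hypothetical:
TUSER_SKETCH only imitates a variant of TUSER that appends a "t" suffix,
which is an assumption about one configuration, and the template covers
only the first line of the real assembly.

```c
#include <string.h>

/* Stringify the mnemonic and append "t" by literal concatenation. */
#define TUSER_SKETCH(instr)	#instr "t"

/* One shared template, parameterised by the instruction mnemonic. */
#define USER_ACCESS_TEMPLATE(instr) \
	"1:\t" TUSER_SKETCH(instr) "\t%1, [%2], #0\n"

/* The byte and word variants shrink to one-line wrappers. */
#define GET_USER_BYTE_TEMPLATE	USER_ACCESS_TEMPLATE(ldrb)
#define GET_USER_WORD_TEMPLATE	USER_ACCESS_TEMPLATE(ldr)

static const char *get_user_byte_template(void)
{
	return GET_USER_BYTE_TEMPLATE;
}

static const char *get_user_word_template(void)
{
	return GET_USER_WORD_TEMPLATE;
}

/* Returns 1 if the two templates differ only in the mnemonic. */
static int templates_share_tail(void)
{
	const char *b = strchr(get_user_byte_template() + 3, '\t');
	const char *w = strchr(get_user_word_template() + 3, '\t');

	return b && w && strcmp(b, w) == 0;
}
```

The mnemonic is substituted into the outer macro before the inner one
stringifies it, so the wrapper can pass ldrb or ldr as a bare token.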

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/uaccess.h | 47 +++++++++++-------------------------------
 1 file changed, 12 insertions(+), 35 deletions(-)

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 74b17d09ef7a..4cf54ebe408a 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -311,9 +311,9 @@ do {									\
 	(x) = (__typeof__(*(ptr)))__gu_val;				\
 } while (0)
 
-#define __get_user_asm_byte(x, addr, err)			\
+#define __get_user_asm(x, addr, err, instr)			\
 	__asm__ __volatile__(					\
-	"1:	" TUSER(ldrb) "	%1,[%2],#0\n"			\
+	"1:	" TUSER(instr) " %1, [%2], #0\n"		\
 	"2:\n"							\
 	"	.pushsection .text.fixup,\"ax\"\n"		\
 	"	.align	2\n"					\
@@ -329,6 +329,9 @@ do {									\
 	: "r" (addr), "i" (-EFAULT)				\
 	: "cc")
 
+#define __get_user_asm_byte(x, addr, err)			\
+	__get_user_asm(x, addr, err, ldrb)
+
 #ifndef __ARMEB__
 #define __get_user_asm_half(x, __gu_addr, err)			\
 ({								\
@@ -348,22 +351,7 @@ do {									\
 #endif
 
 #define __get_user_asm_word(x, addr, err)			\
-	__asm__ __volatile__(					\
-	"1:	" TUSER(ldr) "	%1,[%2],#0\n"			\
-	"2:\n"							\
-	"	.pushsection .text.fixup,\"ax\"\n"		\
-	"	.align	2\n"					\
-	"3:	mov	%0, %3\n"				\
-	"	mov	%1, #0\n"				\
-	"	b	2b\n"					\
-	"	.popsection\n"					\
-	"	.pushsection __ex_table,\"a\"\n"		\
-	"	.align	3\n"					\
-	"	.long	1b, 3b\n"				\
-	"	.popsection"					\
-	: "+r" (err), "=&r" (x)					\
-	: "r" (addr), "i" (-EFAULT)				\
-	: "cc")
+	__get_user_asm(x, addr, err, ldr)
 
 #define __put_user(x, ptr)						\
 ({									\
@@ -393,9 +381,9 @@ do {									\
 	}								\
 } while (0)
 
-#define __put_user_asm_byte(x, __pu_addr, err)			\
+#define __put_user_asm(x, __pu_addr, err, instr)		\
 	__asm__ __volatile__(					\
-	"1:	" TUSER(strb) "	%1,[%2],#0\n"			\
+	"1:	" TUSER(instr) " %1, [%2], #0\n"		\
 	"2:\n"							\
 	"	.pushsection .text.fixup,\"ax\"\n"		\
 	"	.align	2\n"					\
@@ -410,6 +398,9 @@ do {									\
 	: "r" (x), "r" (__pu_addr), "i" (-EFAULT)		\
 	: "cc")
 
+#define __put_user_asm_byte(x, __pu_addr, err)			\
+	__put_user_asm(x, __pu_addr, err, strb)
+
 #ifndef __ARMEB__
 #define __put_user_asm_half(x, __pu_addr, err)			\
 ({								\
@@ -427,21 +418,7 @@ do {									\
 #endif
 
 #define __put_user_asm_word(x, __pu_addr, err)			\
-	__asm__ __volatile__(					\
-	"1:	" TUSER(str) "	%1,[%2],#0\n"			\
-	"2:\n"							\
-	"	.pushsection .text.fixup,\"ax\"\n"		\
-	"	.align	2\n"					\
-	"3:	mov	%0, %3\n"				\
-	"	b	2b\n"					\
-	"	.popsection\n"					\
-	"	.pushsection __ex_table,\"a\"\n"		\
-	"	.align	3\n"					\
-	"	.long	1b, 3b\n"				\
-	"	.popsection"					\
-	: "+r" (err)						\
-	: "r" (x), "r" (__pu_addr), "i" (-EFAULT)		\
-	: "cc")
+	__put_user_asm(x, __pu_addr, err, str)
 
 #ifndef __ARMEB__
 #define	__reg_oper0	"%R2"
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/4] ARM: entry: get rid of asm_trace_hardirqs_on_cond
  2015-08-21 13:46   ` [PATCH 0/4] Efficiency cleanups Russell King - ARM Linux
  2015-08-21 13:48     ` [PATCH 1/4] ARM: uaccess: simplify user access assembly Russell King
@ 2015-08-21 13:48     ` Russell King
  2015-08-21 13:48     ` [PATCH 3/4] ARM: entry: efficiency cleanups Russell King
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

There's no need for this macro: asm_trace_hardirqs_on can take a
default for the condition argument instead.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/assembler.h | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 4abe57279c66..742495eb5526 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -116,7 +116,7 @@
 #endif
 	.endm
 
-	.macro asm_trace_hardirqs_on_cond, cond
+	.macro asm_trace_hardirqs_on, cond=al
 #if defined(CONFIG_TRACE_IRQFLAGS)
 	/*
 	 * actually the registers should be pushed and pop'd conditionally, but
@@ -128,10 +128,6 @@
 #endif
 	.endm
 
-	.macro asm_trace_hardirqs_on
-	asm_trace_hardirqs_on_cond al
-	.endm
-
 	.macro disable_irq
 	disable_irq_notrace
 	asm_trace_hardirqs_off
@@ -173,7 +169,7 @@
 
 	.macro restore_irqs, oldcpsr
 	tst	\oldcpsr, #PSR_I_BIT
-	asm_trace_hardirqs_on_cond eq
+	asm_trace_hardirqs_on cond=eq
 	restore_irqs_notrace \oldcpsr
 	.endm
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 3/4] ARM: entry: efficiency cleanups
  2015-08-21 13:46   ` [PATCH 0/4] Efficiency cleanups Russell King - ARM Linux
  2015-08-21 13:48     ` [PATCH 1/4] ARM: uaccess: simplify user access assembly Russell King
  2015-08-21 13:48     ` [PATCH 2/4] ARM: entry: get rid of asm_trace_hardirqs_on_cond Russell King
@ 2015-08-21 13:48     ` Russell King
  2015-08-21 13:48     ` [PATCH 4/4] ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit() Russell King
  2015-08-24 14:36     ` [PATCH 0/4] Efficiency cleanups Will Deacon
  4 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

Make the "fast" syscall return path fast again.  The addition of IRQ
tracing and context tracking has made this path grossly inefficient.
When these options are enabled, we can do much better by saving the
syscall return code on the stack - we then don't need to save a bunch
of registers around every single callout to C code.
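
The diff in this patch also renumbers several TIF_* flags and tests the
syscall and work masks with a single tst. The message does not say why
the flags move; one plausible reading is that an ARM-mode immediate must
be an 8-bit value rotated right by an even amount, so the combined mask
has to fit that form. The sketch below checks this under assumptions:
the work mask is taken to be SIGPENDING, NEED_RESCHED, NOTIFY_RESUME and
UPROBE, and the syscall mask SYSCALL_TRACE, SYSCALL_AUDIT,
SYSCALL_TRACEPOINT and SECCOMP. The function and array names are
hypothetical.

```c
#include <stdint.h>

/*
 * Returns 1 if v is an 8-bit value rotated right by an even amount,
 * i.e. usable as an ARM-mode data-processing immediate.
 */
static int arm_imm_encodable(uint32_t v)
{
	unsigned int rot;

	for (rot = 0; rot < 32; rot += 2) {
		/* Rotating left by rot undoes a rotate-right by rot. */
		uint32_t r = rot ? ((v << rot) | (v >> (32 - rot))) : v;

		if (r <= 0xff)
			return 1;
	}
	return 0;
}

/* OR together one bit per flag number. */
static uint32_t combined_mask(const unsigned int bits[8])
{
	uint32_t mask = 0;
	int i;

	for (i = 0; i < 8; i++)
		mask |= UINT32_C(1) << bits[i];
	return mask;
}

/* Bit numbers of the eight assumed flags before and after the patch. */
static const unsigned int old_bits[8] = { 0, 1, 2, 7, 8, 9, 10, 11 };
static const unsigned int new_bits[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
```

Under these assumptions the old layout gives 0xf87, which spans twelve
bits and cannot be encoded, while the new layout gives 0xff, which can.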

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/assembler.h   | 16 +++++++---
 arch/arm/include/asm/thread_info.h | 20 +++++--------
 arch/arm/kernel/entry-common.S     | 61 ++++++++++++++++++++++++++++----------
 arch/arm/kernel/signal.c           |  6 ++++
 4 files changed, 71 insertions(+), 32 deletions(-)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 742495eb5526..5a5504f90d5f 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -108,29 +108,37 @@
 	.endm
 #endif
 
-	.macro asm_trace_hardirqs_off
+	.macro asm_trace_hardirqs_off, save=1
 #if defined(CONFIG_TRACE_IRQFLAGS)
+	.if \save
 	stmdb   sp!, {r0-r3, ip, lr}
+	.endif
 	bl	trace_hardirqs_off
+	.if \save
 	ldmia	sp!, {r0-r3, ip, lr}
+	.endif
 #endif
 	.endm
 
-	.macro asm_trace_hardirqs_on, cond=al
+	.macro asm_trace_hardirqs_on, cond=al, save=1
 #if defined(CONFIG_TRACE_IRQFLAGS)
 	/*
 	 * actually the registers should be pushed and pop'd conditionally, but
 	 * after bl the flags are certainly clobbered
 	 */
+	.if \save
 	stmdb   sp!, {r0-r3, ip, lr}
+	.endif
 	bl\cond	trace_hardirqs_on
+	.if \save
 	ldmia	sp!, {r0-r3, ip, lr}
+	.endif
 #endif
 	.endm
 
-	.macro disable_irq
+	.macro disable_irq, save=1
 	disable_irq_notrace
-	asm_trace_hardirqs_off
+	asm_trace_hardirqs_off \save
 	.endm
 
 	.macro enable_irq
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index bd32eded3e50..71e0ffcedf8e 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -136,22 +136,18 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *,
 
 /*
  * thread information flags:
- *  TIF_SYSCALL_TRACE	- syscall trace active
- *  TIF_SYSCAL_AUDIT	- syscall auditing active
- *  TIF_SIGPENDING	- signal pending
- *  TIF_NEED_RESCHED	- rescheduling necessary
- *  TIF_NOTIFY_RESUME	- callback before returning to user
  *  TIF_USEDFPU		- FPU was used by this task this quantum (SMP)
  *  TIF_POLLING_NRFLAG	- true if poll_idle() is polling TIF_NEED_RESCHED
  */
-#define TIF_SIGPENDING		0
-#define TIF_NEED_RESCHED	1
+#define TIF_SIGPENDING		0	/* signal pending */
+#define TIF_NEED_RESCHED	1	/* rescheduling necessary */
 #define TIF_NOTIFY_RESUME	2	/* callback before returning to user */
-#define TIF_UPROBE		7
-#define TIF_SYSCALL_TRACE	8
-#define TIF_SYSCALL_AUDIT	9
-#define TIF_SYSCALL_TRACEPOINT	10
-#define TIF_SECCOMP		11	/* seccomp syscall filtering active */
+#define TIF_UPROBE		3	/* breakpointed or singlestepping */
+#define TIF_SYSCALL_TRACE	4	/* syscall trace active */
+#define TIF_SYSCALL_AUDIT	5	/* syscall auditing active */
+#define TIF_SYSCALL_TRACEPOINT	6	/* syscall tracepoint instrumentation */
+#define TIF_SECCOMP		7	/* seccomp syscall filtering active */
+
 #define TIF_NOHZ		12	/* in adaptive nohz mode */
 #define TIF_USING_IWMMXT	17
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 92828a1dec80..dd3721d1185e 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -24,35 +24,55 @@
 
 
 	.align	5
+#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING))
 /*
- * This is the fast syscall return path.  We do as little as
- * possible here, and this includes saving r0 back into the SVC
- * stack.
+ * This is the fast syscall return path.  We do as little as possible here,
+ * such as avoiding writing r0 to the stack.  We only use this path if we
+ * have tracing and context tracking disabled - the overheads from those
+ * features make this path too inefficient.
  */
 ret_fast_syscall:
  UNWIND(.fnstart	)
  UNWIND(.cantunwind	)
-	disable_irq				@ disable interrupts
+	disable_irq_notrace			@ disable interrupts
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
-	tst	r1, #_TIF_SYSCALL_WORK
-	bne	__sys_trace_return
-	tst	r1, #_TIF_WORK_MASK
+	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
 	bne	fast_work_pending
-	asm_trace_hardirqs_on
 
 	/* perform architecture specific actions before user return */
 	arch_ret_to_user r1, lr
-	ct_user_enter
 
 	restore_user_regs fast = 1, offset = S_OFF
  UNWIND(.fnend		)
+ENDPROC(ret_fast_syscall)
 
-/*
- * Ok, we need to do extra processing, enter the slow path.
- */
+	/* Ok, we need to do extra processing, enter the slow path. */
 fast_work_pending:
 	str	r0, [sp, #S_R0+S_OFF]!		@ returned r0
-work_pending:
+	/* fall through to work_pending */
+#else
+/*
+ * The "replacement" ret_fast_syscall for when tracing or context tracking
+ * is enabled.  As we will need to call out to some C functions, we save
+ * r0 first to avoid needing to save registers around each C function call.
+ */
+ret_fast_syscall:
+ UNWIND(.fnstart	)
+ UNWIND(.cantunwind	)
+	str	r0, [sp, #S_R0 + S_OFF]!	@ save returned r0
+	disable_irq_notrace			@ disable interrupts
+	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
+	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	beq	no_work_pending
+ UNWIND(.fnend		)
+ENDPROC(ret_fast_syscall)
+
+	/* Slower path - fall through to work_pending */
+#endif
+
+	tst	r1, #_TIF_SYSCALL_WORK
+	bne	__sys_trace_return_nosave
+slow_work_pending:
 	mov	r0, sp				@ 'regs'
 	mov	r2, why				@ 'syscall'
 	bl	do_work_pending
@@ -64,16 +84,19 @@ work_pending:
 
 /*
  * "slow" syscall return path.  "why" tells us if this was a real syscall.
+ * IRQs may be enabled here, so always disable them.  Note that we use the
+ * "notrace" version to avoid calling into the tracing code unnecessarily.
+ * do_work_pending() will update this state if necessary.
  */
 ENTRY(ret_to_user)
 ret_slow_syscall:
-	disable_irq				@ disable interrupts
+	disable_irq_notrace			@ disable interrupts
 ENTRY(ret_to_user_from_irq)
 	ldr	r1, [tsk, #TI_FLAGS]
 	tst	r1, #_TIF_WORK_MASK
-	bne	work_pending
+	bne	slow_work_pending
 no_work_pending:
-	asm_trace_hardirqs_on
+	asm_trace_hardirqs_on save = 0
 
 	/* perform architecture specific actions before user return */
 	arch_ret_to_user r1, lr
@@ -251,6 +274,12 @@ __sys_trace_return:
 	bl	syscall_trace_exit
 	b	ret_slow_syscall
 
+__sys_trace_return_nosave:
+	asm_trace_hardirqs_off save=0
+	mov	r0, sp
+	bl	syscall_trace_exit
+	b	ret_slow_syscall
+
 	.align	5
 #ifdef CONFIG_ALIGNMENT_TRAP
 	.type	__cr_alignment, #object
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 423663e23791..b6cda06b455f 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -562,6 +562,12 @@ static int do_signal(struct pt_regs *regs, int syscall)
 asmlinkage int
 do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 {
+	/*
+	 * The assembly code enters us with IRQs off, but it hasn't
+	 * informed the tracing code of that for efficiency reasons.
+	 * Update the trace code with the current status.
+	 */
+	trace_hardirqs_off();
 	do {
 		if (likely(thread_flags & _TIF_NEED_RESCHED)) {
 			schedule();
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 4/4] ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit()
  2015-08-21 13:46   ` [PATCH 0/4] Efficiency cleanups Russell King - ARM Linux
                       ` (2 preceding siblings ...)
  2015-08-21 13:48     ` [PATCH 3/4] ARM: entry: efficiency cleanups Russell King
@ 2015-08-21 13:48     ` Russell King
  2015-08-24 14:36     ` [PATCH 0/4] Efficiency cleanups Will Deacon
  4 siblings, 0 replies; 45+ messages in thread
From: Russell King @ 2015-08-21 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

The audit code looks like it's been written to cope with being called
with IRQs enabled.  However, it's unclear whether IRQs should be
enabled or disabled when calling the syscall tracing infrastructure.

Right now, sometimes we call this with IRQs enabled, and other times
with IRQs disabled.  Opt for IRQs being enabled for consistency.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/kernel/entry-common.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index dd3721d1185e..d83a40d8e055 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -275,7 +275,7 @@ __sys_trace_return:
 	b	ret_slow_syscall
 
 __sys_trace_return_nosave:
-	asm_trace_hardirqs_off save=0
+	enable_irq_notrace
 	mov	r0, sp
 	bl	syscall_trace_exit
 	b	ret_slow_syscall
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (9 preceding siblings ...)
  2015-08-21 13:46   ` [PATCH 0/4] Efficiency cleanups Russell King - ARM Linux
@ 2015-08-21 17:32   ` Catalin Marinas
  2015-08-24 12:06     ` Russell King - ARM Linux
  2015-08-24 13:05   ` Nicolas Schichan
  2015-08-24 18:06   ` Kees Cook
  12 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2015-08-21 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 21, 2015 at 02:30:43PM +0100, Russell King - ARM Linux wrote:
> On Tue, Aug 18, 2015 at 02:42:44PM -0700, Jeffrey Vander Stoep wrote:
> > List poison pointer values point to memory that is mappable by
> > userspace. i.e. LIST_POISON1 = 0x00100100 and LIST_POISON2 =
> > 0x00200200. This means poison values can be valid pointers controlled
> > by userspace and can be used to exploit the kernel as demonstrated in
> > a recent blackhat talk:
> > https://www.blackhat.com/docs/us-15/materials/us-15-Xu-Ah-Universal-Android-Rooting-Is-Back-wp.pdf
> > 
> > Can these poison values be moved to an area not mappable by userspace
> > on 32 bit ARM?
> 
> As was discussed privately before your message, both Catalin and myself
> agreed that this is not possible, and I proposed alternatives which were
> feasible.

Nice to see these patches so quickly. However, I'll be away for the next
~12 days, so I won't be able to review them.

-- 
Catalin

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-21 17:32   ` Prevent list poison values from being mapped by userspace processes Catalin Marinas
@ 2015-08-24 12:06     ` Russell King - ARM Linux
  0 siblings, 0 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-24 12:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 21, 2015 at 06:32:42PM +0100, Catalin Marinas wrote:
> On Fri, Aug 21, 2015 at 02:30:43PM +0100, Russell King - ARM Linux wrote:
> > On Tue, Aug 18, 2015 at 02:42:44PM -0700, Jeffrey Vander Stoep wrote:
> > > List poison pointer values point to memory that is mappable by
> > > userspace. i.e. LIST_POISON1 = 0x00100100 and LIST_POISON2 =
> > > 0x00200200. This means poison values can be valid pointers controlled
> > > by userspace and can be used to exploit the kernel as demonstrated in
> > > a recent blackhat talk:
> > > https://www.blackhat.com/docs/us-15/materials/us-15-Xu-Ah-Universal-Android-Rooting-Is-Back-wp.pdf
> > > 
> > > Can these poison values be moved to an area not mappable by userspace
> > > on 32 bit ARM?
> > 
> > As was discussed privately before your message, both Catalin and myself
> > agreed that this is not possible, and I proposed alternatives which were
> > feasible.
> 
> Nice to see these patches so quickly. However, I'll be away for the next
> ~12 days, so I won't be able to review them.

That'll be after the merge window opens, so I'll drop them into
linux-next now - they've already been through Olof's builder.  I've
already tweaked one patch to ensure that it doesn't cause build errors
with !MMU platforms.

I see that there's some boot failures reported via the mail interface
to kernelci.org, but I've no idea what they are as kernelci.org is not
elinks-compatible (kernelci.org makes use of javascript.)  So I'm not
going to care about them until someone reports the failures in a form
I can read.

In any case, kernelci.org seems to be random in terms of which platforms
get a boot test - two builds one after each other show initially 373
boots tested, and then only 195 despite the second build not failing
any ARM targets.

kernelci.org is not useful to me in this state, so I shall go purely
on Olof's build system which gave me a successful build and mostly
successful boot.  (Even with Olof's boot system, it's difficult to
tell what is a real failure and what is a transient failure - provoking
the build system to run the same build twice often results in different
boot results...)

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (10 preceding siblings ...)
  2015-08-21 17:32   ` Prevent list poison values from being mapped by userspace processes Catalin Marinas
@ 2015-08-24 13:05   ` Nicolas Schichan
  2015-08-25  8:15     ` Russell King - ARM Linux
  2015-08-24 18:06   ` Kees Cook
  12 siblings, 1 reply; 45+ messages in thread
From: Nicolas Schichan @ 2015-08-24 13:05 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/21/2015 03:30 PM, Russell King - ARM Linux wrote:
> On Tue, Aug 18, 2015 at 02:42:44PM -0700, Jeffrey Vander Stoep wrote:
>> List poison pointer values point to memory that is mappable by
>> userspace. i.e. LIST_POISON1 = 0x00100100 and LIST_POISON2 =
>> 0x00200200. This means poison values can be valid pointers controlled
>> by userspace and can be used to exploit the kernel as demonstrated in
>> a recent blackhat talk:
>> https://www.blackhat.com/docs/us-15/materials/us-15-Xu-Ah-Universal-Android-Rooting-Is-Back-wp.pdf
>>
>> Can these poison values be moved to an area not mappable by userspace
>> on 32 bit ARM?
> 
> As was discussed privately before your message, both Catalin and myself
> agreed that this is not possible, and I proposed alternatives which were
> feasible.
> 
> I have now implemented the domain access alternative which I mentioned
> during that discussion, which is suitable for all non-LPAE setups, which
> has the effect of blocking almost all implicit kernel accesses to
> userspace, thereby substantially reducing the possibility for an attack
> similar to that given in the above paper.
> 
> It should be said that with the following patches applied, it won't stop
> the original bug being used to crash the system (that's already been
> fixed) but it will prevent userspace being able to mask the crash, and
> therefore prevent such use-after-free bugs being used to gain privileges.
> 
> This approach also covers low-vector CPUs as well, with one caveat: the
> lower 1MB of userspace will remain accessible to the kernel due to the
> need for the vectors to remain visible.  Doing otherwise crashes the
> machine on the first exception event.  So here, we offer a "best efforts"
> implementation rather than something which completely blocks userspace
> access from kernel space.
> 
> This is not a simple fix - it's quite involved, and it changes a fair
> number of places in the kernel.  It needs time to be proven before any
> thought can be given to backporting these changes to stable kernels.
> It would be good to get some testing of these changes.

Hello Russell,

I gave your patch series a try on ARMv5/kirkwood (backported on a v4.1 kernel)
and at first I got the following panic just after the kernel transitioned
to userland (with CONFIG_CPU_SW_DOMAIN_PAN=y):

[   13.505419] Freeing unused kernel memory: 180K (c0813000 - c0840000)
[   13.535671] Unhandled fault: page domain fault (0x01b) at 0xb6ec605c
[   13.542056] pgd = df7c0000
[   13.544768] [b6ec605c] *pgd=1ec1f831, *pte=008b618f, *ppte=008b6aae
[   13.551085] Internal error: : 1b [#1] ARM
[   13.555109] Modules linked in:
[   13.558184] CPU: 0 PID: 1 Comm: init Not tainted 4.1.0-00341-gde5a42f-dirty #19
[   13.565522] Hardware name: FBXGW1R
[   13.572596] task: df442000 ti: df450000 task.ti: df450000
[   13.578026] PC is at not_thumb+0x0/0x1c
[   13.581879] LR is at __dabt_usr+0x4c/0x60
[   13.585904] pc : [<c0013f4c>]    lr : [<c000dbcc>]    psr: 40000093
[   13.585904] sp : df451fb0  ip : 00000051  fp : be860f18
[   13.597429] r10: b6f00920  r9 : b6ed83b3  r8 : 00053977
[   13.602671] r7 : 0005397f  r6 : ffffffff  r5 : 20000010  r4 : b6ec605c
[   13.609228] r3 : be860fdf  r2 : df451fb0  r1 : 00000017  r0 : b6ed83a2
[   13.615786] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   13.623040] Control: 0005397f  Table: 1f7c0000  DAC: 00000051
[   13.628811] Process init (pid: 1, stack limit = 0xdf450190)
[   13.634402] Stack: (0xdf451fb0 to 0xdf452000)
[   13.638773] 1fa0:                                     be860fdf b6ed83a2 00000010 00000048
[   13.646991] 1fc0: be860f14 be860e50 00000000 b6ed83a2 00000003 b6ed83b3 b6f00920 be860f18
[   13.655205] 1fe0: be860ee8 be860e38 b6ea1ee4 b6ec605c 20000010 ffffffff 29211822 602a0511
[   13.663425] [<c0013f4c>] (not_thumb) from [<00000048>] (0x48)
[   13.669200] Code: 03833b02 e3130b02 03811b02 eaffd4a2 (05943000)
[   13.675317] ---[ end trace f412d29360772faf ]---
[   13.684219] Kernel panic - not syncing: Fatal exception
[   13.696487] Rebooting in 10 seconds..

I have tracked this to the attempt made by the code in
arch/arm/mm/abort-ev5t.S to read the faulting instruction, which in this
case is in userspace:

	ldreq	r3, [r4]			@ read aborted ARM instruction

The following (crude) changes seemed to fix it:

diff --git a/arch/arm/mm/abort-ev5t.S b/arch/arm/mm/abort-ev5t.S
index a0908d4..fc3a219 100644
--- a/arch/arm/mm/abort-ev5t.S
+++ b/arch/arm/mm/abort-ev5t.S
@@ -15,12 +15,33 @@
  * abort here if the I-TLB and D-TLB aren't seeing the same
  * picture.  Unfortunately, this does happen.  We live with it.
  */
+
+        .macro  uaccess_disable, tmp
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+       /*
+        * Whenever we re-enter userspace, the domains should always be
+        * set appropriately.
+        */
+       mov     \tmp, #DACR_UACCESS_DISABLE
+       mcr     p15, 0, \tmp, c3, c0, 0         @ Set domain register
+#endif
+        .endm
+
+       .macro uaccess_enable, tmp
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+       mov     \tmp, #DACR_UACCESS_ENABLE
+       mcr     p15, 0, \tmp, c3, c0, 0         @ set domain register
+#endif
+       .endm
+
        .align  5
 ENTRY(v5t_early_abort)
        mrc     p15, 0, r1, c5, c0, 0           @ get FSR
        mrc     p15, 0, r0, c6, c0, 0           @ get FAR
        do_thumb_abort fsr=r1, pc=r4, psr=r5, tmp=r3
+       uaccess_enable ip
        ldreq   r3, [r4]                        @ read aborted ARM instruction
+       uaccess_disable ip
        bic     r1, r1, #1 << 11                @ clear bits 11 of FSR
        do_ldrd_abort tmp=ip, insn=r3
        tst     r3, #1 << 20                    @ check write

It looks like ARMv6 with CONFIG_ARM_ERRATA_326103 enabled will suffer
from the same issue as it reads the faulty ARM instruction, possibly from
userland (see arch/arm/mm/abort-ev6.S).

With the changes above, userland boots fine and attempts to
dereference LIST_POISON1 from the kernel result in the expected "page
domain fault".

To test that I mapped LIST_POISON1 from user space via mmap() and
triggered the fault by reading from /proc/cpu/alignment. I modified the
code showing /proc/cpu/alignment to access LIST_POISON1. Without your
patch series, the access to LIST_POISON1 goes through without a hitch.

Also, when CONFIG_CPU_SW_DOMAIN_PAN is not set, the DACR_INIT constant is
set up with domain_val(DOMAIN_USER, DOMAIN_NOACCESS), which will cause the
kernel to die with a "page domain fault" when running init.

The following (crude) patch works around that:

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 0c37397..e878129 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -57,11 +57,19 @@
 #define domain_mask(dom)       ((3) << (2 * (dom)))
 #define domain_val(dom,type)   ((type) << (2 * (dom)))

+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
 #define DACR_INIT \
        (domain_val(DOMAIN_USER, DOMAIN_NOACCESS) | \
         domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
         domain_val(DOMAIN_IO, DOMAIN_CLIENT) | \
         domain_val(DOMAIN_VECTORS, DOMAIN_CLIENT))
+#else
+#define DACR_INIT \
+       (domain_val(DOMAIN_USER, DOMAIN_CLIENT) | \
+        domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
+        domain_val(DOMAIN_IO, DOMAIN_CLIENT) | \
+        domain_val(DOMAIN_VECTORS, DOMAIN_CLIENT))
+#endif

 #define __DACR_DEFAULT \
        domain_val(DOMAIN_KERNEL, DOMAIN_CLIENT) | \


Thanks,

-- 
Nicolas Schichan

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 0/4] Efficiency cleanups
  2015-08-21 13:46   ` [PATCH 0/4] Efficiency cleanups Russell King - ARM Linux
                       ` (3 preceding siblings ...)
  2015-08-21 13:48     ` [PATCH 4/4] ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit() Russell King
@ 2015-08-24 14:36     ` Will Deacon
  2015-08-24 15:00       ` Russell King - ARM Linux
  4 siblings, 1 reply; 45+ messages in thread
From: Will Deacon @ 2015-08-24 14:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 21, 2015 at 02:46:30PM +0100, Russell King - ARM Linux wrote:
> While developing the previous patch set, I noticed that the kernel's
> "fast" exit path efficiency was not what it's supposed to be due to the
> addition of trace and context tracking.
> 
> These add several instances of register stacking and unstacking around
> various function calls, several of which we can avoid.  Many of these
> instances don't need to stack any register other than r0.
> 
> Moreover, we can avoid stacking and unstacking r0 if these features are
> enabled by storing r0 in the pt_regs as we would do in our slow return
> path.
> 
> Various other cleanups are included in this set as well.  Acks welcome.

All four patches look fine to me:

  Acked-by: Will Deacon <will.deacon@arm.com>

For some reason, I thought the numbering of the TIF_ flags was important
for some immediate construction in asm code, but either I'm mistaken or
it's no longer the case.

Will

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 0/4] Efficiency cleanups
  2015-08-24 14:36     ` [PATCH 0/4] Efficiency cleanups Will Deacon
@ 2015-08-24 15:00       ` Russell King - ARM Linux
  0 siblings, 0 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-24 15:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 03:36:20PM +0100, Will Deacon wrote:
> On Fri, Aug 21, 2015 at 02:46:30PM +0100, Russell King - ARM Linux wrote:
> > While developing the previous patch set, I noticed that the kernel's
> > "fast" exit path efficiency was not what it's supposed to be due to the
> > addition of trace and context tracking.
> > 
> > These add several instances of register stacking and unstacking around
> > various function calls, several of which we can avoid.  Many of these
> > instances don't need to stack any register other than r0.
> > 
> > Moreover, we can avoid stacking and unstacking r0 if these features are
> > enabled by storing r0 in the pt_regs as we would do in our slow return
> > path.
> > 
> > Various other cleanups are included in this set as well.  Acks welcome.
> 
> All four patches look fine to me:
> 
>   Acked-by: Will Deacon <will.deacon@arm.com>
> 
> For some reason, I thought the numbering of the TIF_ flags was important
> for some immediate construction in asm code, but either I'm mistaken or
> it's no longer the case.

The only reason they're important is to ensure that the bits are close
enough together - we use the definitions directly in the asm code.

Over the years, I think we've deleted a number of flags, and grown some
others, and we've left holes in the bit allocation.  By compressing them,
we can change the assembly code to check for "is there _any_ work at all
that needs to be done" as opposed to the two-step approach we've been
doing up to now.
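
One plausible reading of "close enough together" is the A32 immediate
encoding: a data-processing immediate is an 8-bit value rotated right by an
even number of bits, so a mask can be tested in a single instruction only
if all of its set bits fit in such a window. The sketch below checks that
property. The two masks are hypothetical layouts chosen for illustration
and are not the kernel's actual TIF_ definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by r bits (r in 0..31). */
static uint32_t rotl32(uint32_t v, unsigned int r)
{
	assert(r < 32);
	return r ? (v << r) | (v >> (32 - r)) : v;
}

/*
 * True if v can be written as an 8-bit value rotated right by an even
 * amount, i.e. if some even left-rotation of v fits in the low 8 bits.
 */
static bool arm_imm_encodable(uint32_t v)
{
	for (unsigned int rot = 0; rot < 32; rot += 2)
		if (rotl32(v, rot) <= 0xff)
			return true;
	return false;
}

/* Hypothetical flag layouts, for illustration only. */
#define WORK_MASK_PACKED	((1u << 0) | (1u << 1) | (1u << 2) | (1u << 3))
#define WORK_MASK_SPARSE	((1u << 0) | (1u << 1) | (1u << 9) | (1u << 10))
```

With the packed layout a single "tst rX, #mask" can answer whether any work
is pending. With the sparse layout the mask is not encodable, so the check
has to be split or the mask loaded into a register first.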

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-21 13:30 ` Russell King - ARM Linux
                     ` (11 preceding siblings ...)
  2015-08-24 13:05   ` Nicolas Schichan
@ 2015-08-24 18:06   ` Kees Cook
  2015-08-24 18:47     ` Russell King - ARM Linux
  12 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2015-08-24 18:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 21, 2015 at 6:30 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Aug 18, 2015 at 02:42:44PM -0700, Jeffrey Vander Stoep wrote:
>> List poison pointer values point to memory that is mappable by
>> userspace. i.e. LIST_POISON1 = 0x00100100 and LIST_POISON2 =
>> 0x00200200. This means poison values can be valid pointers controlled
>> by userspace and can be used to exploit the kernel as demonstrated in
>> a recent blackhat talk:
>> https://www.blackhat.com/docs/us-15/materials/us-15-Xu-Ah-Universal-Android-Rooting-Is-Back-wp.pdf
>>
>> Can these poison values be moved to an area not mappable by userspace
>> on 32 bit ARM?
>
> As was discussed privately before your message, both Catalin and myself
> agreed that this is not possible, and I proposed alternatives which were
> feasible.
>
> I have now implemented the domain access alternative which I mentioned
> during that discussion, which is suitable for all non-LPAE setups, which
> has the effect of blocking almost all implicit kernel accesses to
> userspace, thereby substantially reducing the possibility for an attack
> similar to that given in the above paper.

What's the right solution for LPAE setups?

-Kees

>
> It should be said that with the following patches applied, it won't stop
> the original bug being used to crash the system (that's already been
> fixed) but it will prevent userspace being able to mask the crash, and
> therefore prevent such use-after-free bugs being used to gain privileges.
>
> This approach also covers low-vector CPUs as well, with one caveat: the
> lower 1MB of userspace will remain accessible to the kernel due to the
> need for the vectors to remain visible.  Doing otherwise crashes the
> machine on the first exception event.  So here, we offer a "best efforts"
> implementation rather than something which completely blocks userspace
> access from kernel space.
>
> This is not a simple fix - it's quite involved, and it changes a fair
> number of places in the kernel.  It needs time to be proven before any
> thought can be given to backporting these changes to stable kernels.
> It would be good to get some testing of these changes.
>
>  arch/arm/Kconfig                            | 15 +++++
>  arch/arm/include/asm/domain.h               | 45 +++++++++++----
>  arch/arm/include/asm/futex.h                | 19 ++++++-
>  arch/arm/include/asm/pgtable-2level-hwdef.h |  1 +
>  arch/arm/include/asm/thread_info.h          |  3 -
>  arch/arm/include/asm/uaccess.h              | 85 +++++++++++++++++++++++++++--
>  arch/arm/kernel/armksyms.c                  |  6 +-
>  arch/arm/kernel/entry-armv.S                | 27 ++++++---
>  arch/arm/kernel/entry-common.S              |  2 +
>  arch/arm/kernel/entry-header.S              | 42 ++++++++++++++
>  arch/arm/kernel/head.S                      |  5 +-
>  arch/arm/kernel/process.c                   | 37 ++++++++++---
>  arch/arm/kernel/traps.c                     |  1 -
>  arch/arm/lib/clear_user.S                   |  6 +-
>  arch/arm/lib/copy_from_user.S               |  6 +-
>  arch/arm/lib/copy_to_user.S                 |  6 +-
>  arch/arm/lib/uaccess_with_memcpy.c          |  4 +-
>  arch/arm/mm/mmu.c                           |  4 +-
>  arch/arm/mm/pgd.c                           | 10 ++++
>  19 files changed, 267 insertions(+), 57 deletions(-)
>
> --
> FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
> according to speedtest.net.
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-24 18:06   ` Kees Cook
@ 2015-08-24 18:47     ` Russell King - ARM Linux
  2015-08-24 18:51       ` Kees Cook
  0 siblings, 1 reply; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-24 18:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 11:06:52AM -0700, Kees Cook wrote:
> On Fri, Aug 21, 2015 at 6:30 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Tue, Aug 18, 2015 at 02:42:44PM -0700, Jeffrey Vander Stoep wrote:
> >> List poison pointer values point to memory that is mappable by
> >> userspace. i.e. LIST_POISON1 = 0x00100100 and LIST_POISON2 =
> >> 0x00200200. This means poison values can be valid pointers controlled
> >> by userspace and can be used to exploit the kernel as demonstrated in
> >> a recent blackhat talk:
> >> https://www.blackhat.com/docs/us-15/materials/us-15-Xu-Ah-Universal-Android-Rooting-Is-Back-wp.pdf
> >>
> >> Can these poison values be moved to an area not mappable by userspace
> >> on 32 bit ARM?
> >
> > As was discussed privately before your message, both Catalin and myself
> > agreed that this is not possible, and I proposed alternatives which were
> > feasible.
> >
> > I have now implemented the domain access alternative which I mentioned
> > during that discussion, which is suitable for all non-LPAE setups, which
> > has the effect of blocking almost all implicit kernel accesses to
> > userspace, thereby substantially reducing the possibility for an attack
> > similar to that given in the above paper.
> 
> What's the right solution for LPAE setups?

That's something which Catalin indicated that he'll work on.  However,
he said in a public email last week that he won't be around for a while.

So, I have no immediate solution for LPAE - it looks like LPAE will
require switching the page tables on kernel entry or exit, and again
each and every time we want to perform a userspace access.  How this
is done is not something that has been discussed, and neither do we
yet know how expensive this will be.  There are a number of places in
the kernel where a large number of get_user()s or put_user()s follow
one after each other, which necessitates switching back and forth
multiple times.  We may need to address some of those areas by
converting them to copy_(to|from)_user().
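
The cost argument can be illustrated with a small userspace model. This is
not kernel code: window_open() and window_close() stand in for whatever
operation makes userspace accessible and inaccessible again (a page table
switch in the LPAE case discussed here), and the model_* and count_*
helpers are invented names that only count how often that operation runs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of times the (modelled) user access window was toggled. */
static unsigned long window_switches;

static void window_open(void)  { window_switches++; }
static void window_close(void) { window_switches++; }

/* Model of get_user(): one open/close pair per value fetched. */
static int model_get_user(int *dst, const int *src)
{
	assert(dst && src);
	window_open();
	*dst = *src;
	window_close();
	return 0;
}

/* Model of copy_from_user(): one open/close pair for the whole buffer. */
static size_t model_copy_from_user(void *dst, const void *src, size_t n)
{
	assert(dst && src);
	window_open();
	memcpy(dst, src, n);
	window_close();
	return 0;	/* number of bytes not copied */
}

/* Fetch n values one at a time and report the number of toggles. */
static unsigned long count_individual(const int *src, int *dst, size_t n)
{
	window_switches = 0;
	for (size_t i = 0; i < n; i++)
		model_get_user(&dst[i], &src[i]);
	return window_switches;
}

/* Fetch the same n values in one bulk copy and report the toggles. */
static unsigned long count_bulk(const int *src, int *dst, size_t n)
{
	window_switches = 0;
	model_copy_from_user(dst, src, n * sizeof(*src));
	return window_switches;
}
```

In this model the individual fetches toggle the window 2n times while the
bulk copy toggles it twice regardless of n, which is the motivation for
converting runs of get_user()/put_user() to copy_(to|from)_user().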

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-24 18:47     ` Russell King - ARM Linux
@ 2015-08-24 18:51       ` Kees Cook
  2015-08-24 19:14         ` Russell King - ARM Linux
  0 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2015-08-24 18:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 11:47 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Aug 24, 2015 at 11:06:52AM -0700, Kees Cook wrote:
>> On Fri, Aug 21, 2015 at 6:30 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Tue, Aug 18, 2015 at 02:42:44PM -0700, Jeffrey Vander Stoep wrote:
>> >> List poison pointer values point to memory that is mappable by
>> >> userspace. i.e. LIST_POISON1 = 0x00100100 and LIST_POISON2 =
>> >> 0x00200200. This means poison values can be valid pointers controlled
>> >> by userspace and can be used to exploit the kernel as demonstrated in
>> >> a recent blackhat talk:
>> >> https://www.blackhat.com/docs/us-15/materials/us-15-Xu-Ah-Universal-Android-Rooting-Is-Back-wp.pdf
>> >>
>> >> Can these poison values be moved to an area not mappable by userspace
>> >> on 32 bit ARM?
>> >
>> > As was discussed privately before your message, both Catalin and myself
>> > agreed that this is not possible, and I proposed alternatives which were
>> > feasible.
>> >
>> > I have now implemented the domain access alternative which I mentioned
>> > during that discussion, which is suitable for all non-LPAE setups, which
>> > has the effect of blocking almost all implicit kernel accesses to
>> > userspace, thereby substantially reducing the possibility for an attack
>> > similar to that given in the above paper.
>>
>> What's the right solution for LPAE setups?
>
> That's something which Catalin indicated that he'll work on.  However,
> he said in a public email last week that he won't be around for a while.
>
> So, I have no immediate solution for LPAE - it looks like LPAE will
> require switching the page tables on kernel entry or exit, and again
> each and every time we want to perform a userspace access.  How this
> is done is not something that has been discussed, and neither do we
> yet know how expensive this will be.  There are a number of places in
> the kernel where a large number of get_user()s or put_user()s follow
> one after each other, which necessitates switching back and forth
> multiple times.  We may need to address some of those areas by
> converting them to copy_(to|from)_user().

By the way, have you looked at grsecurity's implementation of these
protections? They've been using domains for a while now, and I think
have an LPAE solution as well.

The original description of the work was here:
https://forums.grsecurity.net/viewtopic.php?f=7&t=3292

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-24 18:51       ` Kees Cook
@ 2015-08-24 19:14         ` Russell King - ARM Linux
  2015-08-24 19:22           ` Kees Cook
  0 siblings, 1 reply; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-24 19:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 11:51:04AM -0700, Kees Cook wrote:
> On Mon, Aug 24, 2015 at 11:47 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > That's something which Catalin indicated that he'll work on.  However,
> > he said in a public email last week that he won't be around for a while.
> >
> > So, I have no immediate solution for LPAE - it looks like LPAE will
> > require switching the page tables on kernel entry or exit, and again
> > each and every time we want to perform a userspace access.  How this
> > is done is not something that has been discussed, and neither do we
> > yet know how expensive this will be.  There are a number of places in
> > the kernel where a large number of get_user()s or put_user()s follow
> > one after each other, which necessitates switching back and forth
> > multiple times.  We may need to address some of those areas by
> > converting them to copy_(to|from)_user().
> 
> By the way, have you looked at grsecurity's implementation of these
> protections? They've been using domains for a while now, and I think
> have an LPAE solution as well.

*Sigh*.

No, and I really don't care - if people want to do development work out
of the mainline kernel and not bother to talk about getting it upstream,
it's their loss.  As far as I'm concerned, such external work doesn't
exist.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-24 19:14         ` Russell King - ARM Linux
@ 2015-08-24 19:22           ` Kees Cook
  2015-08-24 19:32             ` Russell King - ARM Linux
  0 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2015-08-24 19:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 12:14 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Aug 24, 2015 at 11:51:04AM -0700, Kees Cook wrote:
>> On Mon, Aug 24, 2015 at 11:47 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > That's something which Catalin indicated that he'll work on.  However,
>> > he said in a public email last week that he won't be around for a while.
>> >
>> > So, I have no immediate solution for LPAE - it looks like LPAE will
>> > require switching the page tables on kernel entry or exit, and again
>> > each and every time we want to perform a userspace access.  How this
>> > is done is not something that has been discussed, and neither do we
>> > yet know how expensive this will be.  There are a number of places in
>> > the kernel where a large number of get_user()s or put_user()s follow
>> > one after each other, which necessitates switching back and forth
>> > multiple times.  We may need to address some of those areas by
>> > converting them to copy_(to|from)_user().
>>
>> By the way, have you looked at grsecurity's implementation of these
>> protections? They've been using domains for a while now, and I think
>> have an LPAE solution as well.
>
> *Sigh*.
>
> No, and I really don't care - if people want to do development work out
> of the mainline kernel and not bother to talk about getting it upstream,
> it's their loss.  As far as I'm concerned, such external work doesn't
> exist.

Sure, I understand, but it's worth at least looking at to compare
feature sets. For example, when doing the W^X kernel memory work, I
looked at both qcom and spender's work, trying to get the best of both
into upstreamable shape.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-24 19:22           ` Kees Cook
@ 2015-08-24 19:32             ` Russell King - ARM Linux
  2015-08-24 22:01               ` Kees Cook
  0 siblings, 1 reply; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-24 19:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 12:22:33PM -0700, Kees Cook wrote:
> On Mon, Aug 24, 2015 at 12:14 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Mon, Aug 24, 2015 at 11:51:04AM -0700, Kees Cook wrote:
> >> On Mon, Aug 24, 2015 at 11:47 AM, Russell King - ARM Linux
> >> <linux@arm.linux.org.uk> wrote:
> >> > That's something which Catalin indicated that he'll work on.  However,
> >> > he said in a public email last week that he won't be around for a while.
> >> >
> >> > So, I have no immediate solution for LPAE - it looks like LPAE will
> >> > require switching the page tables on kernel entry or exit, and again
> >> > each and every time we want to perform a userspace access.  How this
> >> > is done is not something that has been discussed, and neither do we
> >> > yet know how expensive this will be.  There are a number of places in
> >> > the kernel where a large number of get_user()s or put_user()s follow
> >> > one after each other, which necessitates switching back and forth
> >> > multiple times.  We may need to address some of those areas by
> >> > converting them to copy_(to|from)_user().
> >>
> >> By the way, have you looked at grsecurity's implementation of these
> >> protections? They've been using domains for a while now, and I think
> >> have an LPAE solution as well.
> >
> > *Sigh*.
> >
> > No, and I really don't care - if people want to do development work out
> > of the mainline kernel and not bother to talk about getting it upstream,
> > it's their loss.  As far as I'm concerned, such external work doesn't
> > exist.
> 
> Sure, I understand, but it's worth at least looking at to compare
> feature sets. For example, when doing the W^X kernel memory work, I
> looked at both qcom and spender's work, trying to get the best of both
> into upstreamable shape.

That's one way of looking at it.

Another way of looking at it is that by looking at their work, and
merging their ideas into your own, it becomes an encouragement for
working outside of mainline - not only do they get the kernel itself
free, but they get their feature merged without themselves doing any
work - while some other bugger has to sort out making their code
mergeable.

Therefore, my standard point of view is that if people can't be
bothered to talk about their ARM specific kernel features here with
a view to having them merged, they are leeching off the efforts of
the upstream kernel community, and their code just isn't worth
looking at.

I hold the same view on "community" kernel trees which don't bother
pushing their code upstream as well.

Sorry, I'm *not* supporting leeches.

I've already been accused this year by one very mistaken individual
of not pushing _my_ iMX6 work into community kernel trees - when
the work that I do is solely targeted at mainline kernels.  The
leeches are going mad, and I'm saying no more to this crap.  If it's
not talked about on a recognised mainline kernel mailing list, it
doesn't exist, and deserves to be rewritten.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-24 19:32             ` Russell King - ARM Linux
@ 2015-08-24 22:01               ` Kees Cook
  2015-08-26 20:34                 ` Russell King - ARM Linux
  0 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2015-08-24 22:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 12:32 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Aug 24, 2015 at 12:22:33PM -0700, Kees Cook wrote:
>> On Mon, Aug 24, 2015 at 12:14 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Mon, Aug 24, 2015 at 11:51:04AM -0700, Kees Cook wrote:
>> >> On Mon, Aug 24, 2015 at 11:47 AM, Russell King - ARM Linux
>> >> <linux@arm.linux.org.uk> wrote:
>> >> > That's something which Catalin indicated that he'll work on.  However,
>> >> > he said in a public email last week that he won't be around for a while.
>> >> >
>> >> > So, I have no immediate solution for LPAE - it looks like LPAE will
>> >> > require switching the page tables on kernel entry or exit, and again
>> >> > each and every time we want to perform a userspace access.  How this
>> >> > is done is not something that has been discussed, and neither do we
>> >> > yet know how expensive this will be.  There are a number of places in
>> >> > the kernel where a large number of get_user()s or put_user()s follow
>> >> > one after each other, which necessitates switching back and forth
>> >> > multiple times.  We may need to address some of those areas by
>> >> > converting them to copy_(to|from)_user().
>> >>
>> >> By the way, have you looked at grsecurity's implementation of these
>> >> protections? They've been using domains for a while now, and I think
>> >> have an LPAE solution as well.
>> >
>> > *Sigh*.
>> >
>> > No, and I really don't care - if people want to do development work out
>> > of the mainline kernel and not bother to talk about getting it upstream,
>> > it's their loss.  As far as I'm concerned, such external work doesn't
>> > exist.
>>
>> Sure, I understand, but it's worth at least looking at to compare
>> feature sets. For example, when doing the W^X kernel memory work, I
>> looked at both qcom and spender's work, trying to get the best of both
>> into upstreamable shape.
>
> That's one way of looking at it.
>
> Another way of looking at it is that by looking at their work, and
> merging their ideas into your own, it becomes an encouragement for
> working outside of mainline - not only do they get the kernel itself
> free, but they get their feature merged without themselves doing any
> work - while some other bugger has to sort out making their code
> mergeable.
>
> Therefore, my standard point of view is that if people can't be
> bothered to talk about their ARM specific kernel features here with
> a view to having them merged, they are leeching off the efforts of
> the upstream kernel community, and their code just isn't worth
> looking at.
>
> I hold the same view on "community" kernel trees which don't bother
> pushing their code upstream as well.
>
> Sorry, I'm *not* supporting leeches.
>
> I've already been accused this year by one very mistaken individual
> of not pushing _my_ iMX6 work into community kernel trees - when
> the work that I do is solely targeted at mainline kernels.  The
> leeches are going mad, and I'm saying no more to this crap.  If it's
> not talked about on a recognised mainline kernel mailing list, it
> doesn't exist, and deserves to be rewritten.

I certainly see your point, but I'm not sure it serves end users best
to ignore proven technologies. I am trying to bring up the discussion
on a mainline list, and it seemed redundant to paste the entire grsec
forum post here as a starting point. :)

Anyway, I just want to see stuff as secure as possible. We're all
working toward the same goal.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-24 13:05   ` Nicolas Schichan
@ 2015-08-25  8:15     ` Russell King - ARM Linux
  2015-08-25 13:17       ` Nicolas Schichan
  0 siblings, 1 reply; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-25  8:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 03:05:51PM +0200, Nicolas Schichan wrote:
> I gave your patch series a try on ARMv5/kirkwood (backported on a v4.1 kernel)
> and at first I got the following panic just after the kernel transitioned
> to userland (with CONFIG_CPU_SW_DOMAIN_PAN=y):

Ah, damn.  Thanks for testing.  I really need to add some non-ARMv7
platforms to my nightly test rig, but I'm out of physical space to
do that. :p

> I have tracked this to the attempt made by the code in
> arch/arm/mm/abort-ev5t.S to read the faulting instruction, which in this
> case is in userspace:
> 
> 	ldreq	r3, [r4]			@ read aborted ARM instruction

There's going to be many more of these... it may be better if I left
the domain enabled when calling into these handlers, and had every
handler do the turn-off itself when it's ready to do so - there's
no point turning off userspace access only to then immediately
re-enable it.

> With the changes above, userland boots fine and attempts to
> dereference LIST_POISON1 from the kernel result in the expected "page
> domain fault".
> 
> To test that I mapped LIST_POISON1 from user space via mmap() and
> triggered the fault by reading from /proc/cpu/alignment. I modified the
> code showing /proc/cpu/alignment to access LIST_POISON1. Without your
> patch series the access to LIST_POISON1 goes through without a hitch.

Great, thanks for the independent testing of its effectiveness.
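
For readers following along, the userspace half of the test described
above can be sketched roughly as below.  This is only an illustration,
not the program that was actually used: the helper names are made up,
the address is passed to mmap() as a hint rather than with MAP_FIXED so
that no existing mapping is replaced, and the kernel-side change to the
/proc/cpu/alignment code is not shown.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Poison values quoted at the start of this thread (32-bit ARM). */
#define LIST_POISON1_ADDR 0x00100100UL
#define LIST_POISON2_ADDR 0x00200200UL

/* Round an address down to the start of the page containing it. */
static uintptr_t page_base(uintptr_t addr, uintptr_t page_size)
{
	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/*
 * Hypothetical helper: ask for one anonymous page at the page that
 * contains 'addr'.  The address is only a hint (no MAP_FIXED), so the
 * kernel is free to place the mapping elsewhere.  Returns 1 if the page
 * ended up exactly where requested, 0 otherwise.
 */
static int try_map_page_containing(uintptr_t addr)
{
	long ps = sysconf(_SC_PAGESIZE);
	uintptr_t base;
	void *p;

	if (ps <= 0)
		return 0;

	base = page_base(addr, (uintptr_t)ps);
	p = mmap((void *)base, (size_t)ps, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return 0;

	if ((uintptr_t)p != base) {
		/* Mapped somewhere else: not useful for this test. */
		munmap(p, (size_t)ps);
		return 0;
	}
	return 1;
}
```

If try_map_page_containing(LIST_POISON1_ADDR) returns 1, the poison
value is a valid user pointer in that process, which is the situation
the domain switching is meant to make harmless for kernel-mode
dereferences.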

> Also, when CONFIG_CPU_SW_DOMAIN_PAN is not set, the DACR_INIT constant is
> set up with domain_val(DOMAIN_USER, DOMAIN_NOACCESS), which will cause the
> kernel to die with a "page domain fault" when running init.

If you don't mind, I'll merge that into the patch adding this so it
doesn't introduce a regression there.

Once I've fixed the abort handler issue, would you mind re-testing
and giving a tested-by attribution please?

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 9/9] ARM: software-based priviledged-no-access support
  2015-08-21 13:31   ` [PATCH 9/9] ARM: software-based priviledged-no-access support Russell King
@ 2015-08-25 10:32       ` Geert Uytterhoeven
  2015-08-25 14:05     ` Will Deacon
  1 sibling, 0 replies; 45+ messages in thread
From: Geert Uytterhoeven @ 2015-08-25 10:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Fri, Aug 21, 2015 at 3:31 PM, Russell King
<rmk+kernel@arm.linux.org.uk> wrote:
> Provide a software-based implementation of the priviledged no access
> support found in ARMv8.1.
>
> Userspace pages are mapped using a different domain number from the
> kernel and IO mappings.  If we switch the user domain to "no access"
> when we enter the kernel, we can prevent the kernel from touching
> userspace.
>
> However, the kernel needs to be able to access userspace via the
> various user accessor functions.  With the wrapping in the previous
> patch, we can temporarily enable access when the kernel needs user
> access, and re-disable it afterwards.
>
> This allows us to trap non-intended accesses to userspace, eg, caused
> by an inadvertent dereference of the LIST_POISON* values, which, with
> appropriate user mappings setup, can be made to succeed.  This in turn
> can allow use-after-free bugs to be further exploited than would
> otherwise be possible.
>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

This patch, which is now in arm-soc/for-next, breaks shmobile_defconfig
on r8a7791/koelsch, which has a dual core CA15:

    [ ok ] Configuring network interfaces...done.
    Unhandled fault: page domain fault (0x01b) at 0xbe8e6120
    pgd = edbb0000
    [be8e6120] *pgd=6da77831, *pte=bf4d075f, *ppte=bf4d0c7f
    Internal error: : 1b [#1] SMP ARM
    CPU: 1 PID: 1629 Comm: ntpdate Not tainted 4.2.0-rc8-06444-g3c24fd89c9421db1 #319
    Hardware name: Generic R8A7791 (Flattened Device Tree)
    task: ed883a80 ti: ed41c000 task.ti: ed41c000
    PC is at csum_partial_copy_from_user+0x28/0x3d8
    LR is at csum_and_copy_from_iter+0x334/0x4c0
    pc : [<c04ba510>]    lr : [<c01c82e8>]    psr: 000f0013
    sp : ed41db00  ip : 00000020  fp : ed41db6c
    r10: ed41ddc0  r9 : 00000027  r8 : ed41dc20
    r7 : 00000027  r6 : eda52653  r5 : ed41dec8  r4 : 00000000
    r3 : 00000000  r2 : 00000027  r1 : eda5262c  r0 : be8e6120
    Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
    Control: 10c5307d  Table: 6dbb006a  DAC: 00000051
    Process ntpdate (pid: 1629, stack limit = 0xed41c210)
    Stack: (0xed41db00 to 0xed41e000)
    db00: eda5262c 00000027 00000000 ed41dec8 eda52653 00000027
ed41dc20 c01c82e8
    db20: ed41db3c c03d7d44 000000d0 c00a85a0 ed41db74 00000000
ed41dba4 00000000
    db40: 00000000 00000027 edb36940 ed9b9380 00000000 ed41dc20
0000002f ed41dc30
    db60: ed41db8c ed41db70 c040dd5c c01c7fc0 00000000 00000000
00000027 edb36940
    db80: ed41dc04 ed41db90 c040c454 c040dd04 00000000 edb36940
ed41dbc4 00000043
    dba0: 000005c8 000005c8 0000002f 00000000 00000000 00000010
000005dc ee3c7280
    dba0: 000005c8 000005c8 0000002f 00000000 00000000 00000010
000005dc ee3c7280
    dbc0: 00000000 000005dc 00000000 00000014 ed41dc04 ffffff97
c040bde4 00004040
    dbe0: ed41dc20 ed9b95a8 ed9b9380 ed41dec0 c040dcf8 00003500
ed41dc74 ed41dc08
    dc00: c040e7f4 c040be8c ed883e5c c040dcf8 ed41dec0 0000002f
00000008 00004040
    dc20: ed41dc20 ed41dc20 00000000 c067bc40 00000000 00000000
00000000 000005dc
    dc40: 0000002f ee3c7280 ffff0000 ed41dc00 ed9b9380 ed9b95a8
ed41dec0 fe61a8c0
    dc60: 00000000 fe61a8c0 ed41dd64 ed41dc78 c0432118 c040e758
0000002f 00000008
    dc80: ed41dcb4 ed41dcb0 00004040 ffffffff 00000000 00000000
ed9b95a8 00000000
    dca0: c040dcf8 1c61a8c0 00000000 00000027 00000000 fe61a8c0
00000000 00000000
    dcc0: ffff0000 00000000 01ffffff b6d21000 edbb2db0 edb81580
ed41dd74 ed41dce8
    dce0: c0098d60 c00985d0 c04c27f8 ed41ddc0 00000001 be8e6068
00000051 ed41ddc0
    dd00: 00000008 00000000 00000008 c00cc668 00000008 ed41dec8
ed41dd9c 00000001
    dd20: 00000001 00000001 ed41dd64 ed41dd38 c01c8c7c c01c62f0
00000027 ed9b9380
    dd40: ed41dec0 00000027 ed41dda0 edc78c80 ed41deec 00004040
ed41dd84 ed41dd68
    dd60: c043b224 c0431c30 c043b198 ed41dec0 be8e6078 00000000
ed41dd94 ed41dd88
    dd80: c03cbaf0 c043b1a4 ed41deac ed41dd98 c03cbd3c c03cbae0
6f7f979f 00000000
    dda0: eedaf25c b6d21000 edb12484 edbb2db0 ed41de24 ed41ddc0
c00b1898 c00b02d8
    ddc0: be8e6120 00000027 00000001 000000fe 00000001 ee36d740
ed41ddf4 ed9b95a8
    dde0: c06a5b80 00000000 00000000 ed9b95a8 ed9b95a8 ee25f580
ed41de64 ed41de08
    de00: c0407274 00000000 c06a5b80 00000000 ee3c7280 00000006
c06a5b80 ee3c7280
    de20: c06a5b80 c06a5b80 ed9b9380 ed8736f0 ed41de4c ed41de40
ed41de94 ed41de48
    de40: c042e1c8 c04049b8 c0432688 c04c5a44 ed9b9380 ed9b944c
ee3c7280 ed41df08
    de60: ed9b95a8 00000000 ed41de8c ed41de78 ed9b9380 00000000
ed41de94 ed41de88
    de80: c00e5c08 00000000 be8e6078 edc78c80 00000002 00004000
ed41c030 00000000
    dea0: ed41df94 ed41deb0 c03ccfe8 c03cbbc0 ed41deec ed41dec0
00000000 00000000
    dec0: 00000000 00000000 00000001 00000000 00000027 ed41ddc0
00000001 00000000
    dee0: 00000000 00004040 00000000 c037ff04 ed41df44 ed41df00
c007181c c03801b0
    df00: 08cc6da6 00000000 00000000 002aea54 ffffffff 00ffffff
ed41df44 ed41df80
    df20: be8e5f88 00000005 0000004e c000fea4 ed41c000 00000000
ed41df54 ed41df48
    df40: c0071918 c00717dc ed41df7c ed41df58 c0071f04 00000000
00000001 be8e6060
    df60: 00000000 c000fea4 ed41c000 ffffffff 00000000 00004000
00000002 00000176
    df80: c000fea4 ed41c000 ed41dfa4 ed41df98 c03cd080 c03ccf80
00000000 ed41dfa8
    dfa0: c000fce0 c03cd07c 00000000 00004000 00000003 be8e6078
00000002 00004000
    dfc0: 00000000 00004000 00000002 00000176 00000003 00000005
b6e4ec14 2af73cb0
    dfe0: 00000176 be8e5f70 b6df6191 b6d798e6 800f0030 00000003
00000000 00000000
    Backtrace:
    [<c01c7fb4>] (csum_and_copy_from_iter) from [<c040dd5c>]
(ip_generic_getfrag+0x64/0xb4)
     r10:ed41dc30 r9:0000002f r8:ed41dc20 r7:00000000 r6:ed9b9380 r5:edb36940
     r4:00000027
    [<c040dcf8>] (ip_generic_getfrag) from [<c040c454>]
(__ip_append_data.isra.37+0x5d4/0x9b0)
     r5:edb36940 r4:00000027
    [<c040be80>] (__ip_append_data.isra.37) from [<c040e7f4>]
(ip_make_skb+0xa8/0xe0)
     r10:00003500 r9:c040dcf8 r8:ed41dec0 r7:ed9b9380 r6:ed9b95a8 r5:ed41dc20
     r4:00004040
    [<c040e74c>] (ip_make_skb) from [<c0432118>] (udp_sendmsg+0x4f4/0x6d8)
     r9:fe61a8c0 r8:00000000 r7:fe61a8c0 r6:ed41dec0 r5:ed9b95a8 r4:ed9b9380
    [<c0431c24>] (udp_sendmsg) from [<c043b224>] (inet_sendmsg+0x8c/0xc0)
     r10:00004040 r9:ed41deec r8:edc78c80 r7:ed41dda0 r6:00000027 r5:ed41dec0
     r4:ed9b9380
    [<c043b198>] (inet_sendmsg) from [<c03cbaf0>] (sock_sendmsg+0x1c/0x2c)
     r6:00000000 r5:be8e6078 r4:ed41dec0 r3:c043b198
    [<c03cbad4>] (sock_sendmsg) from [<c03cbd3c>] (___sys_sendmsg+0x188/0x1f8)
    [<c03cbbb4>] (___sys_sendmsg) from [<c03ccfe8>] (__sys_sendmmsg+0x74/0xfc)
     r10:00000000 r9:ed41c030 r8:00004000 r7:00000002 r6:edc78c80 r5:be8e6078
     r4:00000000
    [<c03ccf74>] (__sys_sendmmsg) from [<c03cd080>] (SyS_sendmmsg+0x10/0x14)
     r9:ed41c000 r8:c000fea4 r7:00000176 r6:00000002 r5:00004000 r4:00000000
    [<c03cd070>] (SyS_sendmmsg) from [<c000fce0>] (ret_fast_syscall+0x0/0x3c)
    Code: e3100003 1a00002f e3d2c00f 0a00000b (e4904004)
    ---[ end trace 21df281cc5d080da ]---

There are a few more networking-related backtraces during further booting
of userspace.

After disabling CONFIG_CPU_SW_DOMAIN_PAN it fails differently:

    VFS: Mounted root (nfs filesystem) readonly on device 0:13.
    devtmpfs: mounted
    Freeing unused kernel memory: 300K (c0629000 - c0674000)
    Unhandled fault: page domain fault (0x81b) at 0x000263e0
    pgd = ed908000
    [000263e0] *pgd=6e299831, *pte=bf81d75f, *ppte=bf81dc7f
    Internal error: : 81b [#1] SMP ARM
    CPU: 1 PID: 1 Comm: init Not tainted 4.2.0-rc8-06444-g3c24fd89c9421db1 #332
    Hardware name: Generic R8A7791 (Flattened Device Tree)
    task: ee0319c0 ti: ee04e000 task.ti: ee04e000
    PC is at __clear_user_std+0x34/0x68
    LR is at padzero+0x4c/0x60
    pc : [<c01b2bd8>]    lr : [<c010a470>]    psr: 20000113
    sp : ee04fe40  ip : 00000000  fp : ee04fe54
    r10: ee0f5300  r9 : ee316120  r8 : 00000000
    r7 : 000265fc  r6 : 000263e0  r5 : ee314400  r4 : ee290e00
    r3 : 00000000  r2 : 00000000  r1 : 00000c18  r0 : 000263e0
    Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
    Control: 10c5307d  Table: 6d90806a  DAC: 00000051
    Process init (pid: 1, stack limit = 0xee04e210)
    Stack: (0xee04fe40 to 0xee050000)
    fe40: 00000c20 c010a470 ee04fed4 ee04fe58 c010ae78 c010a430
00001812 00000000
    fe60: ee04fe94 ee04fe58 ee04e018 00025ef4 00015ad8 00010000
00000009 00010000
    fe80: 00000001 ee316000 ee31b300 000263e0 ee3d3600 00000000
ef7e93c0 00000000
    fea0: ee04febc ee04feb0 c001dde4 fffffff8 ee0f5300 c06c3ccc
c06c3ccc c067ff0c
    fec0: c0680374 c06c3ccc ee04ff04 ee04fed8 c00cf0b8 c010a7cc
c067c8b8 ee0f5300
    fee0: 00000000 ee13a000 00000001 00000000 ed9d5040 c0679318
ee04ff4c ee04ff08
    ff00: c00cf5a4 c00cf038 c05d6ab9 ed9d5078 c0679290 00000000
00000000 ee031c18
    ff20: ee04ff44 c0679318 c0679290 00000000 00000000 00000000
00000000 00000000
    ff40: ee04ff64 ee04ff50 c00cf784 c00cf198 00000000 00000000
ee04ff7c ee04ff68
    ff60: c000a5c8 c00cf75c c06a6000 c05ca7cd ee04ff94 ee04ff80
c000a5e4 c000a5ac
    ff80: c06a6000 c04b54c4 ee04ffac ee04ff98 c04b5544 c000a5dc
ee04e000 00000000
    ffa0: 00000000 ee04ffb0 c000fc88 c04b54d0 00000000 00000000
00000000 00000000
    ffc0: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
    ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
00000000 00000000
    Backtrace:
    [<c010a424>] (padzero) from [<c010ae78>] (load_elf_binary+0x6b8/0xfbc)
    [<c010a7c0>] (load_elf_binary) from [<c00cf0b8>]
(search_binary_handler+0x8c/0x160)
     r10:c06c3ccc r9:c0680374 r8:c067ff0c r7:c06c3ccc r6:c06c3ccc r5:ee0f5300
     r4:fffffff8
    [<c00cf02c>] (search_binary_handler) from [<c00cf5a4>]
(do_execveat_common+0x418/0x5c4)
     r10:c0679318 r9:ed9d5040 r8:00000000 r7:00000001 r6:ee13a000 r5:00000000
     r4:ee0f5300 r3:c067c8b8
    [<c00cf18c>] (do_execveat_common) from [<c00cf784>] (do_execve+0x34/0x3c)
     r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0679290
     r4:c0679318
    [<c00cf750>] (do_execve) from [<c000a5c8>] (run_init_process+0x28/0x30)
    [<c000a5a0>] (run_init_process) from [<c000a5e4>]
(try_to_run_init_process+0x14/0x40)
     r5:c05ca7cd r4:c06a6000
    [<c000a5d0>] (try_to_run_init_process) from [<c04b5544>]
(kernel_init+0x80/0xec)
     r5:c04b54c4 r4:c06a6000
    [<c04b54c4>] (kernel_init) from [<c000fc88>] (ret_from_fork+0x14/0x2c)
     r4:00000000 r3:ee04e000
    Code: b4c02001 e26cc004 e041100c e2511008 (54802004)
    ---[ end trace 807fed3702987ba4 ]---
    Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Reverting commit 0db805aa8c96f0ea ("ARM: software-based priviledged-no-access
support") fixes it.

Another board-specific config that has CONFIG_ARM_LPAE=y runs fine on the
same hardware. Disabling CONFIG_ARM_LPAE breaks it.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 9/9] ARM: software-based priviledged-no-access support
  2015-08-25 10:32       ` Geert Uytterhoeven
@ 2015-08-25 10:44         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-25 10:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 25, 2015 at 12:32:51PM +0200, Geert Uytterhoeven wrote:
> This patch, which is now in arm-soc/for-next, breaks shmobile_defconfig
> on r8a7791/koelsch, which has a dual core CA15:
> 
>     [ ok ] Configuring network interfaces...done.
>     Unhandled fault: page domain fault (0x01b) at 0xbe8e6120
>     pgd = edbb0000
>     [be8e6120] *pgd=6da77831, *pte=bf4d075f, *ppte=bf4d0c7f
>     Internal error: : 1b [#1] SMP ARM
>     CPU: 1 PID: 1629 Comm: ntpdate Not tainted 4.2.0-rc8-06444-g3c24fd89c9421db1 #319
>     Hardware name: Generic R8A7791 (Flattened Device Tree)
>     task: ed883a80 ti: ed41c000 task.ti: ed41c000
>     PC is at csum_partial_copy_from_user+0x28/0x3d8
>     LR is at csum_and_copy_from_iter+0x334/0x4c0
>     pc : [<c04ba510>]    lr : [<c01c82e8>]    psr: 000f0013
>     sp : ed41db00  ip : 00000020  fp : ed41db6c
>     r10: ed41ddc0  r9 : 00000027  r8 : ed41dc20
>     r7 : 00000027  r6 : eda52653  r5 : ed41dec8  r4 : 00000000
>     r3 : 00000000  r2 : 00000027  r1 : eda5262c  r0 : be8e6120
>     Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>     Control: 10c5307d  Table: 6dbb006a  DAC: 00000051
>     Process ntpdate (pid: 1629, stack limit = 0xed41c210)

Thanks.  I wonder what's different about your ntpdate that triggers
this, and why all my iMX6 behave fine, which have desktop-like ubuntu
installs on (of two different versions.)

What it's basically showing is that (unsurprisingly)
csum_partial_copy_from_user is trying to access userspace.  I'll see
about fixing that today, or pulling the patch from -next if I can't.
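
As a reading aid for the dump above: the "DAC: 00000051" line can be
decoded by hand.  What follows is only a sketch, assuming the ARM
short-descriptor DACR layout (one 2-bit field per domain: 0 = no access,
1 = client, 3 = manager) and assuming, as in arch/arm/include/asm/domain.h,
that domain 1 is the user domain.  The helper names are made up for
illustration.

```c
/* Hypothetical helper: extract the 2-bit access field for one domain. */
static unsigned dacr_field(unsigned dacr, unsigned domain)
{
	return (dacr >> (2 * domain)) & 3;
}

/* Hypothetical helper: name the access type held in a 2-bit field. */
static const char *dacr_name(unsigned field)
{
	switch (field) {
	case 0:
		return "no access";
	case 1:
		return "client";
	case 3:
		return "manager";
	default:
		return "reserved";
	}
}
```

Under those assumptions dacr_field(0x51, 1) is 0, i.e. the user domain
was set to "no access" when the fault hit, while domains 0, 2 and 3 were
all "client".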

I've also noticed that on rpc_defconfig, the 0-day builder shows that
this triggers an ICE as the compiler appears to think it's run out of
registers.

> After disabling CONFIG_CPU_SW_DOMAIN_PAN it  fails differently:
> 
>     VFS: Mounted root (nfs filesystem) readonly on device 0:13.
>     devtmpfs: mounted
>     Freeing unused kernel memory: 300K (c0629000 - c0674000)
>     Unhandled fault: page domain fault (0x81b) at 0x000263e0
>     pgd = ed908000
>     [000263e0] *pgd=6e299831, *pte=bf81d75f, *ppte=bf81dc7f

Yes, this one is because I forgot to provide the non-protected default
for bootup, which I've already merged a fix for.
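
To illustrate the difference between the protected and non-protected
defaults, here is a rough sketch of how the two DACR values could be
composed with a domain_val()-style macro.  The domain numbers are an
assumption (they are not stated in this thread), every non-user domain is
simplified to "client", and dacr_init() is a hypothetical helper, not the
kernel's DACR_INIT definition.

```c
/* Access types in a 2-bit DACR field (ARM short-descriptor format). */
#define DOMAIN_NOACCESS	0u
#define DOMAIN_CLIENT	1u

/* Assumed domain numbers; not taken from this thread. */
#define DOMAIN_KERNEL	0u
#define DOMAIN_USER	1u
#define DOMAIN_IO	2u
#define DOMAIN_VECTORS	3u

#define domain_val(dom, type)	((type) << (2 * (dom)))

/*
 * Hypothetical helper: boot-time DACR value.  With software PAN the
 * user domain starts as "no access"; without it, nothing ever re-enables
 * user access, so the default has to leave the user domain as "client".
 */
static unsigned dacr_init(int sw_domain_pan)
{
	unsigned user = sw_domain_pan ? DOMAIN_NOACCESS : DOMAIN_CLIENT;

	return domain_val(DOMAIN_USER, user) |
	       domain_val(DOMAIN_KERNEL, DOMAIN_CLIENT) |
	       domain_val(DOMAIN_IO, DOMAIN_CLIENT) |
	       domain_val(DOMAIN_VECTORS, DOMAIN_CLIENT);
}
```

With these assumptions dacr_init(1) gives 0x51, the DAC value in the
first dump, and dacr_init(0) gives 0x55, with every domain accessible.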

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 9/9] ARM: software-based priviledged-no-access support
  2015-08-25 10:44         ` Russell King - ARM Linux
@ 2015-08-25 11:21           ` Geert Uytterhoeven
  -1 siblings, 0 replies; 45+ messages in thread
From: Geert Uytterhoeven @ 2015-08-25 11:21 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Tue, Aug 25, 2015 at 12:44 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Aug 25, 2015 at 12:32:51PM +0200, Geert Uytterhoeven wrote:
>> This patch, which is now in arm-soc/for-next, breaks shmobile_defconfig
>> on r8a7791/koelsch, which has a dual core CA15:
>>
>>     [ ok ] Configuring network interfaces...done.
>>     Unhandled fault: page domain fault (0x01b) at 0xbe8e6120
>>     pgd = edbb0000
>>     [be8e6120] *pgd=6da77831, *pte=bf4d075f, *ppte=bf4d0c7f
>>     Internal error: : 1b [#1] SMP ARM
>>     CPU: 1 PID: 1629 Comm: ntpdate Not tainted 4.2.0-rc8-06444-g3c24fd89c9421db1 #319
>>     Hardware name: Generic R8A7791 (Flattened Device Tree)
>>     task: ed883a80 ti: ed41c000 task.ti: ed41c000
>>     PC is at csum_partial_copy_from_user+0x28/0x3d8
>>     LR is at csum_and_copy_from_iter+0x334/0x4c0
>>     pc : [<c04ba510>]    lr : [<c01c82e8>]    psr: 000f0013
>>     sp : ed41db00  ip : 00000020  fp : ed41db6c
>>     r10: ed41ddc0  r9 : 00000027  r8 : ed41dc20
>>     r7 : 00000027  r6 : eda52653  r5 : ed41dec8  r4 : 00000000
>>     r3 : 00000000  r2 : 00000027  r1 : eda5262c  r0 : be8e6120
>>     Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>>     Control: 10c5307d  Table: 6dbb006a  DAC: 00000051
>>     Process ntpdate (pid: 1629, stack limit = 0xed41c210)
>
> Thanks.  I wonder what's different about your ntpdate that triggers
> this, and why all my iMX6 behave fine, which have desktop-like ubuntu
> installs on (of two different versions.)

It's ntpdate 1:4.2.6.p5+dfsg-7 from desktop-like Debian jessie.

But I get similar dumps during boot up from rpc.idmapd (SyS_send),
rsyslogd (SyS_send), and from sshd (SyS_write) when trying to log in.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 9/9] ARM: software-based priviledged-no-access support
  2015-08-25 11:21           ` Geert Uytterhoeven
@ 2015-08-25 12:38             ` Russell King - ARM Linux
  -1 siblings, 0 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-25 12:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 25, 2015 at 01:21:04PM +0200, Geert Uytterhoeven wrote:
> Hi Russell,
> 
> On Tue, Aug 25, 2015 at 12:44 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Tue, Aug 25, 2015 at 12:32:51PM +0200, Geert Uytterhoeven wrote:
> >> This patch, which is now in arm-soc/for-next, breaks shmobile_defconfig
> >> on r8a7791/koelsch, which has a dual core CA15:
> >>
> >>     [ ok ] Configuring network interfaces...done.
> >>     Unhandled fault: page domain fault (0x01b) at 0xbe8e6120
> >>     pgd = edbb0000
> > >>     [be8e6120] *pgd=6da77831, *pte=bf4d075f, *ppte=bf4d0c7f
> >>     Internal error: : 1b [#1] SMP ARM
> > >>     CPU: 1 PID: 1629 Comm: ntpdate Not tainted 4.2.0-rc8-06444-g3c24fd89c9421db1 #319
> >>     Hardware name: Generic R8A7791 (Flattened Device Tree)
> >>     task: ed883a80 ti: ed41c000 task.ti: ed41c000
> >>     PC is at csum_partial_copy_from_user+0x28/0x3d8
> >>     LR is at csum_and_copy_from_iter+0x334/0x4c0
> >>     pc : [<c04ba510>]    lr : [<c01c82e8>]    psr: 000f0013
> >>     sp : ed41db00  ip : 00000020  fp : ed41db6c
> >>     r10: ed41ddc0  r9 : 00000027  r8 : ed41dc20
> >>     r7 : 00000027  r6 : eda52653  r5 : ed41dec8  r4 : 00000000
> >>     r3 : 00000000  r2 : 00000027  r1 : eda5262c  r0 : be8e6120
> >>     Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> >>     Control: 10c5307d  Table: 6dbb006a  DAC: 00000051
> >>     Process ntpdate (pid: 1629, stack limit = 0xed41c210)
> >
> > Thanks.  I wonder what's different about your ntpdate that triggers
> > this, and why all my iMX6 behave fine, which have desktop-like ubuntu
> > installs on (of two different versions.)
> 
> It's ntpdate 1:4.2.6.p5+dfsg-7 from desktop-like Debian jessie.

Hmm, I think I tried at one time to install Debian on an iMX6 platform
and gave up with it after spending 50 minutes with the installer getting
so far, and then killing the network - it was very repeatable, and always
happened at the same point in the installation.  I gave up with Debian
at that point, as I didn't have lots of 50 minutes to babysit the silly
installer (which can't ask the questions up-front) nor did I want to
waste my monthly internet allowance on multiple failed install attempts.

The reports I was getting from other iMX6 users was that Debian Jessie
had lots of problems at that time.

> But I get similar dumps during boot up from rpc.idmapd (SyS_send),
> rsyslogd (SyS_send), and from sshd (SyS_write) when trying to log in.

Hmm.

root       693  0.0  0.1   4944  3196 ?        Ss   01:22   0:00 /usr/sbin/sshd -D
syslog     720  0.2  0.0  30404  2032 ?        Sl   01:23   1:19 rsyslogd -c5
root       722  0.0  0.0   2392  1340 ?        Ss   01:23   0:00 rpc.idmapd

So, the question I need to find an answer to is... why hasn't this path
been exercised on my platforms during my testing.  It's certainly
compiled into the kernel...

Anyway, I've now (hopefully) fixed the bug, but I've nobbled
csum_partial_copy_from_user to ensure that it will always oops the kernel
if called:

000000b4 <csum_partial_copy_from_user>:
  b4:   ee133f10        mrc     15, 0, r3, cr3, cr0, {0}
  b8:   e92d41fe        push    {r1, r2, r3, r4, r5, r6, r7, r8, lr}
  bc:   e3a03055        mov     r3, #85 ; 0x55
  c0:   ee033f10        mcr     15, 0, r3, cr3, cr0, {0}
  c4:   e7033003        str     r3, [r3, -r3]

and... it doesn't trigger.  I can only assume that this is because the
iMX6 ethernet interface uses TSO (which implies checksum offload), there's
no need to use these csum functions - and that would explain why it never
came up in my local testing.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 9/9] ARM: software-based priviledged-no-access support
  2015-08-25 12:38             ` Russell King - ARM Linux
@ 2015-08-25 12:47               ` Geert Uytterhoeven
  -1 siblings, 0 replies; 45+ messages in thread
From: Geert Uytterhoeven @ 2015-08-25 12:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 25, 2015 at 2:38 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
>> It's ntpdate 1:4.2.6.p5+dfsg-7 from desktop-like Debian jessie.
>
> Hmm, I think I tried at one time to install Debian on an iMX6 platform
> and gave up with it after spending 50 minutes with the installer getting
> so far, and then killing the network - it was very repeatable, and always
> happened at the same point in the installation.  I gave up with Debian
> at that point, as I didn't have lots of 50 minutes to babysit the silly
> installer (which can't ask the questions up-front) nor did I want to
> waste my monthly internet allowance on multiple failed install attempts.

debootstrap is your friend.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Prevent list poison values from being mapped by userspace processes
  2015-08-25  8:15     ` Russell King - ARM Linux
@ 2015-08-25 13:17       ` Nicolas Schichan
  0 siblings, 0 replies; 45+ messages in thread
From: Nicolas Schichan @ 2015-08-25 13:17 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/25/2015 10:15 AM, Russell King - ARM Linux wrote:
>> Also, when CONFIG_CPU_SW_DOMAIN_PAN is not set, the DACR_INIT constant is
>> setup with (domain_val(DOMAIN_USER, DOMAIN_NOACCESS) which will cause the
>> kernel to die with a "page domain fault" when running init.
> 
> If you don't mind, I'll merge that into the patch adding this so it
> doesn't introduce a regression there.
> 
> Once I've fixed the abort handler issue, would you mind re-testing
> and giving a tested-by attribution please?

Sure, I've got no problem with that and I'll be happy to re-test.

-- 
Nicolas Schichan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 9/9] ARM: software-based priviledged-no-access support
  2015-08-25 12:38             ` Russell King - ARM Linux
@ 2015-08-25 13:55               ` Nicolas Schichan
  -1 siblings, 0 replies; 45+ messages in thread
From: Nicolas Schichan @ 2015-08-25 13:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/25/2015 02:38 PM, Russell King - ARM Linux wrote:
> On Tue, Aug 25, 2015 at 01:21:04PM +0200, Geert Uytterhoeven wrote:
>> Hi Russell,
>>
>> On Tue, Aug 25, 2015 at 12:44 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Tue, Aug 25, 2015 at 12:32:51PM +0200, Geert Uytterhoeven wrote:
>>>> This patch, which is now in arm-soc/for-next, breaks shmobile_defconfig
>>>> on r8a7791/koelsch, which has a dual core CA15:
>>>>
>>>>     [ ok ] Configuring network interfaces...done.
>>>>     Unhandled fault: page domain fault (0x01b) at 0xbe8e6120
>>>>     pgd = edbb0000
>>>>     [be8e6120] *pgd=6da77831, *pte=bf4d075f, *ppte=bf4d0c7f
>>>>     Internal error: : 1b [#1] SMP ARM
>>>>     CPU: 1 PID: 1629 Comm: ntpdate Not tainted 4.2.0-rc8-06444-g3c24fd89c9421db1 #319
>>>>     Hardware name: Generic R8A7791 (Flattened Device Tree)
>>>>     task: ed883a80 ti: ed41c000 task.ti: ed41c000
>>>>     PC is at csum_partial_copy_from_user+0x28/0x3d8
>>>>     LR is at csum_and_copy_from_iter+0x334/0x4c0
>>>>     pc : [<c04ba510>]    lr : [<c01c82e8>]    psr: 000f0013
>>>>     sp : ed41db00  ip : 00000020  fp : ed41db6c
>>>>     r10: ed41ddc0  r9 : 00000027  r8 : ed41dc20
>>>>     r7 : 00000027  r6 : eda52653  r5 : ed41dec8  r4 : 00000000
>>>>     r3 : 00000000  r2 : 00000027  r1 : eda5262c  r0 : be8e6120
>>>>     Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>>>>     Control: 10c5307d  Table: 6dbb006a  DAC: 00000051
>>>>     Process ntpdate (pid: 1629, stack limit = 0xed41c210)
>>>
>>> Thanks.  I wonder what's different about your ntpdate that triggers
>>> this, and why all my iMX6 behave fine, which have desktop-like ubuntu
>>> installs on (of two different versions.)
>>
>> It's ntpdate 1:4.2.6.p5+dfsg-7 from desktop-like Debian jessie.
> 
> Hmm, I think I tried at one time to install Debian on an iMX6 platform
> and gave up with it after spending 50 minutes with the installer getting
> so far, and then killing the network - it was very repeatable, and always
> happened at the same point in the installation.  I gave up with Debian
> at that point, as I didn't have lots of 50 minutes to babysit the silly
> installer (which can't ask the questions up-front) nor did I want to
> waste my monthly internet allowance on multiple failed install attempts.
> 
> The reports I was getting from other iMX6 users was that Debian Jessie
> had lots of problems at that time.
> 
>> But I get similar dumps during boot up from rpc.idmapd (SyS_send),
>> rsyslogd (SyS_send), and from sshd (SyS_write) when trying to log in.
> 
> Hmm.
> 
> root       693  0.0  0.1   4944  3196 ?        Ss   01:22   0:00 /usr/sbin/sshd -D
> syslog     720  0.2  0.0  30404  2032 ?        Sl   01:23   1:19 rsyslogd -c5
> root       722  0.0  0.0   2392  1340 ?        Ss   01:23   0:00 rpc.idmapd
> 
> So, the question I need to find an answer to is... why hasn't this path
> been exercised on my platforms during my testing.  It's certainly
> compiled into the kernel...
> 
> Anyway, I've now (hopefully) fixed the bug, but I've nobbled
> csum_partial_copy_from_user to ensure that it will always oops the kernel
> if called:
> 
> 000000b4 <csum_partial_copy_from_user>:
>   b4:   ee133f10        mrc     15, 0, r3, cr3, cr0, {0}
>   b8:   e92d41fe        push    {r1, r2, r3, r4, r5, r6, r7, r8, lr}
>   bc:   e3a03055        mov     r3, #85 ; 0x55
>   c0:   ee033f10        mcr     15, 0, r3, cr3, cr0, {0}
>   c4:   e7033003        str     r3, [r3, -r3]
> 
> and... it doesn't trigger.  I can only assume that this is because the
> iMX6 ethernet interface uses TSO (which implies checksum offload), there's
> no need to use these csum functions - and that would explain why it never
> came up in my local testing.

[resent with the list and other original recipients this time]

I have the csum_partial_copy_from_user issue too, but with radvd (which sends
ipv6 packets). ipv4 networking is fine on the other hand. The kirkwood
platform I use does have checksum offload for ipv4 only and not ipv6 so the
csum functions will get called in the ipv6 case.
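
To make it concrete why these helpers touch user memory at all: a
copy-and-checksum routine reads the source buffer itself while summing
it, so when the source is a user pointer the access happens inside the
routine rather than in a separate copy_from_user() call.  The sketch
below is a plain C model of that idea (an RFC 1071 style 16-bit
ones'-complement sum over big-endian words).  copy_and_csum() is a
hypothetical name, and this is not the kernel's ARM assembly
implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of a copy-and-checksum helper: copy len bytes from
 * src to dst and return the folded 16-bit ones'-complement sum of the
 * data, treating it as a sequence of big-endian 16-bit words.
 */
static uint16_t copy_and_csum(void *dst, const void *src, size_t len)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	uint64_t sum = 0;
	size_t i;

	/* Each source byte is loaded once and used for both copy and sum. */
	for (i = 0; i + 1 < len; i += 2) {
		d[i] = s[i];
		d[i + 1] = s[i + 1];
		sum += ((uint32_t)s[i] << 8) | s[i + 1];
	}
	if (i < len) {
		/* Odd trailing byte is summed as if padded with a zero. */
		d[i] = s[i];
		sum += (uint32_t)s[i] << 8;
	}
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

In the kernel case src is a user pointer, so with
CONFIG_CPU_SW_DOMAIN_PAN the loads in such a routine would fault unless
user access is enabled around them.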


-- 
Nicolas Schichan
Freebox SAS

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 9/9] ARM: software-based priviledged-no-access support
  2015-08-21 13:31   ` [PATCH 9/9] ARM: software-based priviledged-no-access support Russell King
  2015-08-25 10:32       ` Geert Uytterhoeven
@ 2015-08-25 14:05     ` Will Deacon
  1 sibling, 0 replies; 45+ messages in thread
From: Will Deacon @ 2015-08-25 14:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 21, 2015 at 02:31:56PM +0100, Russell King wrote:
> Provide a software-based implementation of the priviledged no access
> support found in ARMv8.1.
> 
> Userspace pages are mapped using a different domain number from the
> kernel and IO mappings.  If we switch the user domain to "no access"
> when we enter the kernel, we can prevent the kernel from touching
> userspace.
> 
> However, the kernel needs to be able to access userspace via the
> various user accessor functions.  With the wrapping in the previous
> patch, we can temporarily enable access when the kernel needs user
> access, and re-disable it afterwards.
> 
> This allows us to trap non-intended accesses to userspace, eg, caused
> by an inadvertent dereference of the LIST_POISON* values, which, with
> appropriate user mappings setup, can be made to succeed.  This in turn
> can allow use-after-free bugs to be further exploited than would
> otherwise be possible.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  arch/arm/Kconfig               | 15 +++++++++++++++
>  arch/arm/include/asm/domain.h  | 15 ++++++++++++---
>  arch/arm/include/asm/uaccess.h | 14 ++++++++++++++
>  arch/arm/kernel/entry-header.S | 25 +++++++++++++++++++++++++
>  arch/arm/kernel/process.c      | 24 ++++++++++++++++++------
>  5 files changed, 84 insertions(+), 9 deletions(-)

[...]

> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> index 3aa6c3742182..bec7ee0764e1 100644
> --- a/arch/arm/kernel/entry-header.S
> +++ b/arch/arm/kernel/entry-header.S
> @@ -54,15 +54,40 @@
>  	.endm
>  
>  	.macro	uaccess_disable, tmp
> +#ifdef CONFIG_CPU_SW_DOMAIN_PAN
> +	/*
> +	 * Whenever we re-enter userspace, the domains should always be
> +	 * set appropriately.
> +	 */
> +	mov	\tmp, #DACR_UACCESS_DISABLE
> +	mcr	p15, 0, \tmp, c3, c0, 0		@ Set domain register
> +#endif

Missing ISB?

>  	.endm
>  
>  	.macro	uaccess_enable, tmp
> +#ifdef CONFIG_CPU_SW_DOMAIN_PAN
> +	/*
> +	 * Whenever we re-enter userspace, the domains should always be
> +	 * set appropriately.
> +	 */
> +	mov	\tmp, #DACR_UACCESS_ENABLE
> +	mcr	p15, 0, \tmp, c3, c0, 0
> +#endif
>  	.endm
>  
>  	.macro	uaccess_save_and_disable, tmp
> +#ifdef CONFIG_CPU_SW_DOMAIN_PAN
> +	mrc	p15, 0, \tmp, c3, c0, 0
> +	str	\tmp, [sp, #S_FRAME_SIZE]
> +#endif
> +	uaccess_disable \tmp
>  	.endm

Same here. For the enable/restore cases, the exception return will
synchronise the DACR for us, but I think we need the ISB to be sure that
the change has taken effect on the exception entry paths.
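
To make the DACR values under discussion concrete, here is a minimal
sketch in C of how a 2-bit-per-domain register value could be composed.
The access-type encodings (0 = no access, 1 = client, 3 = manager) are
the architectural ones; the domain numbering and the helper names
dacr_set(), dacr_uaccess_enable() and dacr_uaccess_disable() are
assumptions made for illustration and are not taken from the patch.
The model only computes the value; it cannot show the ISB, which on
real hardware is what guarantees that an MCR write to the DACR has
taken effect before later instructions execute.

```c
#include <assert.h>
#include <stdint.h>

/* Access types for one 2-bit DACR field (architectural encodings). */
enum { DOMAIN_NOACCESS = 0, DOMAIN_CLIENT = 1, DOMAIN_MANAGER = 3 };

/* Hypothetical domain numbering, chosen only for this illustration. */
enum { DOM_KERNEL = 0, DOM_USER = 1, DOM_IO = 2, DOM_VECTORS = 3 };

/* Shift an access type into the field that belongs to domain `dom`. */
static uint32_t domain_val(unsigned dom, uint32_t type)
{
    assert(dom < 16 && type <= 3);
    return type << (2 * dom);
}

/* Mask covering the 2-bit field of domain `dom`. */
static uint32_t domain_mask(unsigned dom)
{
    assert(dom < 16);
    return UINT32_C(3) << (2 * dom);
}

/* Return `dacr` with only the field of domain `dom` replaced. */
static uint32_t dacr_set(uint32_t dacr, unsigned dom, uint32_t type)
{
    return (dacr & ~domain_mask(dom)) | domain_val(dom, type);
}

/* Assumed "user access enabled" value: every domain is a client. */
static uint32_t dacr_uaccess_enable(void)
{
    return domain_val(DOM_KERNEL, DOMAIN_CLIENT) |
           domain_val(DOM_USER, DOMAIN_CLIENT) |
           domain_val(DOM_IO, DOMAIN_CLIENT) |
           domain_val(DOM_VECTORS, DOMAIN_CLIENT);
}

/* Same value, but with the user domain switched to "no access". */
static uint32_t dacr_uaccess_disable(void)
{
    return dacr_set(dacr_uaccess_enable(), DOM_USER, DOMAIN_NOACCESS);
}
```

With this numbering the two values differ only in bits [3:2], so a
kernel access to a page in the user domain would fault while the
"disable" value is in force, once the write has been synchronised.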

Will


* Prevent list poison values from being mapped by userspace processes
  2015-08-24 22:01               ` Kees Cook
@ 2015-08-26 20:34                 ` Russell King - ARM Linux
  0 siblings, 0 replies; 45+ messages in thread
From: Russell King - ARM Linux @ 2015-08-26 20:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 24, 2015 at 03:01:06PM -0700, Kees Cook wrote:
> On Mon, Aug 24, 2015 at 12:32 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > That's one way of looking at it.
> >
> > Another way of looking at it is that by looking at their work, and
> > merging their ideas into your own, it becomes an encouragement for
> > working outside of mainline - not only do they get the kernel itself
> > free, but they get their feature merged without themselves doing any
> > work - while some other bugger has to sort out making their code
> > mergable.
> >
> > Therefore, my standard point of view is that if people can't be
> > bothered to talk about their ARM specific kernel features here with
> > a view to having them merged, they are leeching off the efforts of
> > the upstream kernel community, and their code just isn't worth
> > looking at.
> >
> > I hold the same view on "community" kernel trees which don't bother
> > pushing their code upstream as well.
> >
> > Sorry, I'm *not* supporting leeches.
> >
> > I've already been accused this year by one very mistaken individual
> > for not pushing _my_ iMX6 work into community kernel trees - when
> > the work that I do is solely targetted at mainline kernels.  The
> > leeches are going mad, and I'm saying no more to this crap.  If it's
> > not talked about on a recognised mainline kernel mailing list, it
> > doesn't exist, and deserves to be rewritten.
> 
> I certainly see your point, but I'm not sure it serves end users best
> to ignore proven technologies. I am trying to bring up the discussion
> on a mainline list, and it seemed redundant to paste the entire grsec
> forum post here as a starting point. :)

As kernel developers feeding code into mainline, we can't go around
picking "proven technologies" from other people's trees.  Doing that
carries with it risk.

Consider the difference between:

1) Individual A sends their code onto mailing lists for review, asking
   it to be merged.  It gets reviewed by multiple people, gains acks
   and tested-bys, and the proper well established and proven kernel
   submission process is followed.  Individual B merges the code and
   sends it upstream.

2) Individual A merges code into their own tree and publishes it on
   the Internet.  Individual B searches the Internet, finds this code,
   downloads, modifies it a bit to get it to apply, and sends it out
   for review.  It gains acks and tested-bys and eventually B merges
   the code and sends it upstream.

Sometime later, patent and/or copyright lawyers start creating a ruckus
over the code which has been merged.  In each case, where does the bulk
of the blame lie - would it be with individual A or individual B?

My _personal_ view is that in (2), much more of the blame lies on
individual B who _took_ code from individual A, potentially without
their knowledge and merged it into another "project".  Individual A
may share some of the blame too, but they have the ability to say
"I never submitted it to that project."  If lawyers target a project,
then the way that would work is they'd target the project, and leave
the project to fight it out with the next stage along.  (You see this
happening between companies: Company A (re)publishes work which
Company B did.  Company A gets sued.  Company A then has to sue
company B to recover costs and damages.)

As part of the process in (1) above, we require that the _sender_ of
the code signs-off that _submission_, and by doing so, they're basically
asserting that the code conforms with the statements in the DCO and
therefore they are taking the bulk of the responsibility.

To re-iterate the point, if I were to take someone else's code, then
_I_ would have to assert DCO (b) - I'm making that assertion, therefore
I'm the one at the end of the submission path for the project.

So, my view is that DCO (b) carries with it more risk than the other
sub-clauses, irrespective of what the GPL says.


As for my comments about leeching, those are a view that I've held
for well over a decade.  Trees which take from mainline but never
contribute back are trees which leech off the efforts of the many
who do contribute to mainline.  Such trees benefit from mainline, but
do not supply any benefit back to mainline.  I find such trees totally
abhorrent and disgusting, but I _can_ understand that it's human
nature.  Sure, the GPL does not require that code gets contributed
back, but there is a clear moral, ethical and fairness issue here.

In this case, it's more than that though.  Kees, in your comment
above, you talk about "proven technologies" and serving the best
interests of our users.  I think you're totally missing the point
here.  Are users' best interests served by having these trees keep
their "proven technologies" to themselves, or are they best served
by having these proven technologies submitted into mainline in a
timely manner?

Consider the recent networking use-after-free bug which is the
subject of these domain patches, and think about this.  If this
"proven technology" was already in mainline and enabled, then this
particular use-after-free would not be a privilege escalation because
with this additional protection against dereference of LIST_POISON*,
there is no way userspace could exploit it.  Sure, they can oops the
kernel, but that's _far_ better than a privilege escalation and
potential undetected information leak.
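
For readers unfamiliar with the mechanism, the following is a rough,
self-contained sketch in C of how list deletion leaves poison pointers
behind.  The two poison values are the ones quoted at the start of this
thread (0x00100100 and 0x00200200); the simplified list helpers and the
list_entry_is_poisoned() function are illustrative assumptions, not the
kernel's actual implementation.  The sketch never dereferences a poison
pointer; it only shows that a stale entry ends up holding addresses
which, on 32-bit ARM, fall in the range userspace can map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Poison values as quoted in the original report. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x00100100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x00200200)

/* Make `head` an empty circular list. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert `entry` immediately after `head`. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink `entry`, then poison its pointers so stale use is detectable. */
static void list_del(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

/* Hypothetical helper: true if `entry` carries the poison values. */
static int list_entry_is_poisoned(const struct list_head *entry)
{
    return entry->next == LIST_POISON1 && entry->prev == LIST_POISON2;
}
```

A use-after-free that walks entry->next after list_del() would touch
address 0x00100100.  If the kernel is denied access to user mappings,
that access faults instead of reading attacker-controlled memory.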

So, in my mind, it's the leeches which are doing a disservice to
users, not those of us who spend our time trying to fix the
problems which the leeches have already fixed but failed to contribute
back.

It intensely annoys me when I _do_ find out that some feature
(especially bug fixes) has already been created but not fed back,
and I've spent time re-creating it.  It's really a waste of my
time.  That's what leads me to decide that no, I'm not going to
waste even more of my time trying to rip a feature out of a
leeching tree and allow them to have some of the credit for it.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


* [PATCH 8/9] ARM: entry: provide uaccess assembly macro hooks
  2015-08-21 13:31   ` [PATCH 8/9] ARM: entry: provide uaccess assembly macro hooks Russell King
@ 2015-08-27 21:40     ` Stephen Boyd
  0 siblings, 0 replies; 45+ messages in thread
From: Stephen Boyd @ 2015-08-27 21:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/21, Russell King wrote:
> @@ -400,6 +402,10 @@ ENDPROC(__fiq_abt)
>   ARM(	stmdb	r0, {sp, lr}^			)
>   THUMB(	store_user_sp_lr r0, r1, S_SP - S_PC	)
>  
> +	.if \uaccess

This \u seems to trip up my clang build. It seems that the
assembler thinks \u is escaping for unicode or something?

 arch/arm/kernel/entry-armv.S:202:
 Error: non-constant expression in ".if" statement

Looking at the intermediate assembly file I see:

 .if ??ss
 uaccess_disable ip
 .endif

Changing 'uaccess' to 'access' seems to make the problem go away.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


