* [PATCH 00/14] arm64: VMAP_STACK support
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

Hi,

Ard and I have worked together to implement vmap stack support for
arm64. This supersedes our earlier vmap stack RFCs [0,1]. The git author
stats are a little misleading, as I've teased parts out into smaller
patches for review.

The series is based on our stack dump rework [2,3], which can be found
in the arm64/exception-stack branch [4] of my kernel.org repo. This
series can be found in the arm64/vmap-stack branch [5] of the same repo.

On arm64, there is no double-fault exception, as software saves
exception context to the stack. An erroneous memory access taken during
exception handling results in a data abort, as with any other erroneous
memory access. To avoid taking these recursively, we must detect
overflow by checking the SP before we attempt to store any context to
the stack. Doing this efficiently requires a couple of tricks.

For a naturally aligned stack, bits THREAD_SHIFT-1:0 of a valid SP may
take any value:

	0bXX .. 11111111111111
	0bXX .. 11011001011100
	0bXX .. 00000000000000

By aligning stacks to double their natural alignment, we know that the
THREAD_SHIFT bit of any valid SP must be zero:

	0bXX .. 0 11111111111111
	0bXX .. 0 11011001011100
	0bXX .. 0 00000000000000

... while an overflow will result in this bit flipping, along with
(some) other high-order bits:

	0bXX .. 0 00000000000000
	< SP -= 1 >
	0bXX .. 1 11111111111111

... and thus, we can detect overflows of up to THREAD_SIZE by testing
the THREAD_SHIFT bit of the SP value.

Provided we can get the SP into a general purpose register, we can
perform this test with a single TBNZ instruction. We don't have scratch
space to store a GPR, but we can (partially) swap the SP with a GPR
using arithmetic to perform the test:

	add	sp, sp, x0		// sp' = sp + x0
	sub	x0, sp, x0		// x0' = sp' - x0 = (sp + x0) - x0 = sp
	tbnz	x0, #THREAD_SHIFT, overflow_handler
	sub	x0, sp, x0		// sp' - x0' = (sp + x0) - sp = x0
	sub	sp, sp, x0		// sp' - x0 = (sp + x0) - x0 = sp
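
As a rough C model of the invariant being exploited (illustrative only
-- the real check lives in the entry assembly, and the names below are
just for exposition):

	#define THREAD_SHIFT	14
	#define THREAD_SIZE	(1UL << THREAD_SHIFT)
	#define THREAD_ALIGN	(2 * THREAD_SIZE)	/* double natural alignment */

	/*
	 * With stacks aligned to THREAD_ALIGN, bit THREAD_SHIFT of any
	 * SP within [base, base + THREAD_SIZE) is zero, and becomes one
	 * once the SP strays up to THREAD_SIZE below the base.
	 */
	static inline int sp_overflowed(unsigned long sp)
	{
		return !!(sp & (1UL << THREAD_SHIFT));
	}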

This series implements this approach, along with the other requisite
changes required to make this work.

The SP test is performed for all exceptions, after compensating for the
size of the exception registers, allowing the original exception context
to be preserved in its entirety. The tests themselves are folded into the
exception vectors, minimizing their impact.

To ensure that IRQ stack overflows are detected and handled, IRQ stacks
are now dynamically allocated, with guard pages.

I've given the series some light testing with LKDTM, Syzkaller, Vince
Weaver's perf fuzzer, and a few combinations of debug options. I haven't
compared performance of the entire series to a baseline kernel, but from
testing so far the cost of the SP test falls in the noise for a kernel
build workload on Cortex-A57.

Many thanks to Ard for putting up with my meddling, and also to Laura
and James for their testing and comments on prior patches.

Thanks,
Mark.

[0] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/518368.html
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/518434.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520705.html
[3] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/521435.html
[4] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/exception-stack
[5] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/vmap-stack

Ard Biesheuvel (2):
  arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
  arm64: assembler: allow adr_this_cpu to use the stack pointer

Mark Rutland (12):
  arm64: remove __die()'s stack dump
  fork: allow arch-override of VMAP stack alignment
  arm64: factor out PAGE_* and CONT_* definitions
  arm64: clean up THREAD_* definitions
  arm64: clean up irq stack definitions
  arm64: move SEGMENT_ALIGN to <asm/memory.h>
  efi/arm64: add EFI_KIMG_ALIGN
  arm64: factor out entry stack manipulation
  arm64: use an irq stack pointer
  arm64: add basic VMAP_STACK support
  arm64: add on_accessible_stack()
  arm64: add VMAP_STACK overflow detection

 arch/arm64/Kconfig                        |   1 +
 arch/arm64/include/asm/assembler.h        |   3 +-
 arch/arm64/include/asm/efi.h              |   8 +++
 arch/arm64/include/asm/irq.h              |  25 -------
 arch/arm64/include/asm/memory.h           |  53 ++++++++++++++
 arch/arm64/include/asm/page-def.h         |  34 +++++++++
 arch/arm64/include/asm/page.h             |  12 +---
 arch/arm64/include/asm/processor.h        |   2 +-
 arch/arm64/include/asm/stacktrace.h       |  62 ++++++++++++++++-
 arch/arm64/include/asm/thread_info.h      |  10 +--
 arch/arm64/kernel/entry.S                 | 110 +++++++++++++++++++++++-------
 arch/arm64/kernel/irq.c                   |  40 ++++++++++-
 arch/arm64/kernel/ptrace.c                |   1 +
 arch/arm64/kernel/smp.c                   |   2 +-
 arch/arm64/kernel/stacktrace.c            |   7 +-
 arch/arm64/kernel/traps.c                 |  44 ++++++++++--
 arch/arm64/kernel/vmlinux.lds.S           |  18 +----
 drivers/firmware/efi/libstub/arm64-stub.c |   6 +-
 kernel/fork.c                             |   5 +-
 19 files changed, 339 insertions(+), 104 deletions(-)
 create mode 100644 arch/arm64/include/asm/page-def.h

-- 
1.9.1

* [PATCH 01/14] arm64: remove __die()'s stack dump
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

Our __die() implementation tries to dump the stack memory, in addition
to a backtrace, which is problematic.

For contemporary 16K stacks, this can be a lot of data, which can take a
long time to dump, and can push other useful context out of the kernel's
printk ringbuffer (and/or a user's scrollback buffer on an attached
console).

Additionally, the code implicitly assumes that the SP is on the task's
stack, and tries to dump everything between the SP and the highest task
stack address. When the SP points at an IRQ stack (or is corrupted),
this makes the kernel attempt to dump vast amounts of VA space. With
vmap'd stacks, this may result in erroneous accesses to peripherals.

This patch removes the memory dump, leaving us to rely on the backtrace,
and other means of dumping stack memory such as kdump.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/traps.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index c2a81bf..9633773 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -237,8 +237,6 @@ static int __die(const char *str, int err, struct pt_regs *regs)
 		 end_of_stack(tsk));
 
 	if (!user_mode(regs)) {
-		dump_mem(KERN_EMERG, "Stack: ", regs->sp,
-			 THREAD_SIZE + (unsigned long)task_stack_page(tsk));
 		dump_backtrace(regs, tsk);
 		dump_instr(KERN_EMERG, regs);
 	}
-- 
1.9.1

* [PATCH 02/14] fork: allow arch-override of VMAP stack alignment
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

In some cases, an architecture might wish its stacks to be aligned to a
boundary larger than THREAD_SIZE. For example, using an alignment of
double THREAD_SIZE can allow for stack overflows smaller than
THREAD_SIZE to be detected by checking a single bit of the stack
pointer.

This patch allows architectures to override the alignment of VMAP'd
stacks, by defining THREAD_ALIGN. Where not defined, this defaults to
THREAD_SIZE, as is the case today.
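
As an example of the intended use, arm64's definition (added later in
this series) takes the following shape:

	/* in <asm/memory.h> */
	#ifdef CONFIG_VMAP_STACK
	#define THREAD_ALIGN	(2 * THREAD_SIZE)
	#else
	#define THREAD_ALIGN	THREAD_SIZE
	#endif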

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
---
 kernel/fork.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 17921b0..696d692 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -217,7 +217,10 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 		return s->addr;
 	}
 
-	stack = __vmalloc_node_range(THREAD_SIZE, THREAD_SIZE,
+#ifndef THREAD_ALIGN
+#define THREAD_ALIGN	THREAD_SIZE
+#endif
+	stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
 				     VMALLOC_START, VMALLOC_END,
 				     THREADINFO_GFP,
 				     PAGE_KERNEL,
-- 
1.9.1

* [PATCH 03/14] arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

For historical reasons, we leave the top 16 bytes of our task and IRQ
stacks unused, in order to ensure that the SP can always be masked to
find the base of the current stack (historically, where thread_info
could be found).
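
As a sketch of that retired scheme (not the removed code itself):

	/*
	 * With THREAD_SIZE-aligned stacks, the base of the current
	 * stack could be recovered by masking the SP:
	 */
	unsigned long base = sp & ~(THREAD_SIZE - 1);

	/*
	 * An empty stack would have sp == base + THREAD_SIZE, which
	 * masks to the adjacent region; starting the SP 16 bytes below
	 * the top kept every legitimate SP value inside the stack.
	 */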

However, this is not necessary, as:

* When an exception is taken from a task stack, we decrement the SP by
  S_FRAME_SIZE and stash the exception registers before we compare the
  SP against the task stack. In such cases, the SP must be at least
  S_FRAME_SIZE below the limit, and can be safely masked to determine
  whether the task stack is in use.

* When transitioning to an IRQ stack, we'll place a dummy frame onto the
  IRQ stack before enabling asynchronous exceptions, or executing code
  we expect to trigger faults. Thus, if an exception is taken from the
  IRQ stack, the SP must be at least 16 bytes below the limit.

* We no longer mask the SP to find the thread_info, which is now found
  via sp_el0. Note that historically, the offset was critical to ensure
  that cpu_switch_to() found the correct stack for new threads that
  hadn't yet executed ret_from_fork().

Given that, this initial offset serves no purpose, and can be removed.
This brings us in line with other architectures (e.g. x86) which do not
rely on this masking.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: rebase, kill THREAD_START_SP, commit msg additions]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/irq.h         | 5 ++---
 arch/arm64/include/asm/processor.h   | 2 +-
 arch/arm64/include/asm/thread_info.h | 1 -
 arch/arm64/kernel/entry.S            | 2 +-
 arch/arm64/kernel/smp.c              | 2 +-
 5 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index 8ba89c4..1ebe202 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -2,7 +2,6 @@
 #define __ASM_IRQ_H
 
 #define IRQ_STACK_SIZE			THREAD_SIZE
-#define IRQ_STACK_START_SP		THREAD_START_SP
 
 #ifndef __ASSEMBLER__
 
@@ -26,9 +25,9 @@ static inline int nr_legacy_irqs(void)
 static inline bool on_irq_stack(unsigned long sp)
 {
 	unsigned long low = (unsigned long)raw_cpu_ptr(irq_stack);
-	unsigned long high = low + IRQ_STACK_START_SP;
+	unsigned long high = low + IRQ_STACK_SIZE;
 
-	return (low <= sp && sp <= high);
+	return (low <= sp && sp < high);
 }
 
 static inline bool on_task_stack(struct task_struct *tsk, unsigned long sp)
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 64c9e78..6687dd2 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -159,7 +159,7 @@ extern struct task_struct *cpu_switch_to(struct task_struct *prev,
 					 struct task_struct *next);
 
 #define task_pt_regs(p) \
-	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
+	((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)
 
 #define KSTK_EIP(tsk)	((unsigned long)task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)	user_stack_pointer(task_pt_regs(tsk))
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 46c3b93..b29ab0e 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -30,7 +30,6 @@
 #endif
 
 #define THREAD_SIZE		16384
-#define THREAD_START_SP		(THREAD_SIZE - 16)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 4ddb8d7..1c0f787 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -268,7 +268,7 @@ alternative_else_nop_endif
 	cbnz	x25, 9998f
 
 	adr_this_cpu x25, irq_stack, x26
-	mov	x26, #IRQ_STACK_START_SP
+	mov	x26, #IRQ_STACK_SIZE
 	add	x26, x25, x26
 
 	/* switch to the irq stack */
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index dc66e6e..f13ddb2 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -154,7 +154,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	 * page tables.
 	 */
 	secondary_data.task = idle;
-	secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
+	secondary_data.stack = task_stack_page(idle) + THREAD_SIZE;
 	update_cpu_boot_status(CPU_MMU_OFF);
 	__flush_dcache_area(&secondary_data, sizeof(secondary_data));
 
-- 
1.9.1

* [PATCH 04/14] arm64: factor out PAGE_* and CONT_* definitions
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

Some headers rely on PAGE_* definitions from <asm/page.h>, but cannot
include this due to potential circular includes. For example, a number
of definitions in <asm/memory.h> rely on PAGE_SHIFT, and <asm/page.h>
includes <asm/memory.h>.

This requires users of these definitions to include both headers, which
is fragile and error-prone.

This patch ameliorates matters by moving the basic definitions out to a
new header, <asm/page-def.h>. Both <asm/page.h> and <asm/memory.h> are
updated to include this, avoiding this fragility, and avoiding the
possibility of circular include dependencies.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/memory.h   |  1 +
 arch/arm64/include/asm/page-def.h | 34 ++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/page.h     | 12 +-----------
 3 files changed, 36 insertions(+), 11 deletions(-)
 create mode 100644 arch/arm64/include/asm/page-def.h

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 32f827233..77d55dc 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -25,6 +25,7 @@
 #include <linux/const.h>
 #include <linux/types.h>
 #include <asm/bug.h>
+#include <asm/page-def.h>
 #include <asm/sizes.h>
 
 /*
diff --git a/arch/arm64/include/asm/page-def.h b/arch/arm64/include/asm/page-def.h
new file mode 100644
index 0000000..01591a2
--- /dev/null
+++ b/arch/arm64/include/asm/page-def.h
@@ -0,0 +1,34 @@
+/*
+ * Based on arch/arm/include/asm/page.h
+ *
+ * Copyright (C) 1995-2003 Russell King
+ * Copyright (C) 2017 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PAGE_DEF_H
+#define __ASM_PAGE_DEF_H
+
+#include <linux/const.h>
+
+/* PAGE_SHIFT determines the page size */
+/* CONT_SHIFT determines the number of pages which can be tracked together  */
+#define PAGE_SHIFT		CONFIG_ARM64_PAGE_SHIFT
+#define CONT_SHIFT		CONFIG_ARM64_CONT_SHIFT
+#define PAGE_SIZE		(_AC(1, UL) << PAGE_SHIFT)
+#define PAGE_MASK		(~(PAGE_SIZE-1))
+
+#define CONT_SIZE		(_AC(1, UL) << (CONT_SHIFT + PAGE_SHIFT))
+#define CONT_MASK		(~(CONT_SIZE-1))
+
+#endif /* __ASM_PAGE_DEF_H */
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 8472c6d..60d02c8 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -19,17 +19,7 @@
 #ifndef __ASM_PAGE_H
 #define __ASM_PAGE_H
 
-#include <linux/const.h>
-
-/* PAGE_SHIFT determines the page size */
-/* CONT_SHIFT determines the number of pages which can be tracked together  */
-#define PAGE_SHIFT		CONFIG_ARM64_PAGE_SHIFT
-#define CONT_SHIFT		CONFIG_ARM64_CONT_SHIFT
-#define PAGE_SIZE		(_AC(1, UL) << PAGE_SHIFT)
-#define PAGE_MASK		(~(PAGE_SIZE-1))
-
-#define CONT_SIZE		(_AC(1, UL) << (CONT_SHIFT + PAGE_SHIFT))
-#define CONT_MASK		(~(CONT_SIZE-1))
+#include <asm/page-def.h>
 
 #ifndef __ASSEMBLY__
 
-- 
1.9.1

* [PATCH 05/14] arm64: clean up THREAD_* definitions
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

Currently we define THREAD_SIZE and THREAD_SIZE_ORDER separately, with
the latter dependent on particular CONFIG_ARM64_*K_PAGES definitions.
This is somewhat opaque, and will get in the way of future modifications
to THREAD_SIZE.

This patch cleans this up, defining both in terms of a common
THREAD_SHIFT, and using PAGE_SHIFT to calculate THREAD_SIZE_ORDER,
rather than using a number of definitions dependent on config symbols.
Subsequent patches will make use of this to alter the stack size used in
some configurations.
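
For example, with 4K pages (PAGE_SHIFT == 12), THREAD_SHIFT == 14
yields THREAD_SIZE_ORDER == 2 (i.e. four pages), and with 16K pages it
yields 0 (a single page), matching the values previously hard-coded for
CONFIG_ARM64_4K_PAGES and CONFIG_ARM64_16K_PAGES respectively.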

At the same time, these are moved into <asm/memory.h>, which will avoid
circular include issues in subsequent patches. To ensure that existing
code isn't adversely affected, <asm/thread_info.h> is updated to
transitively include these definitions.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/memory.h      | 8 ++++++++
 arch/arm64/include/asm/thread_info.h | 9 +--------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 77d55dc..8ab4774 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -102,6 +102,14 @@
 #define KASAN_SHADOW_SIZE	(0)
 #endif
 
+#define THREAD_SHIFT		14
+
+#if THREAD_SHIFT >= PAGE_SHIFT
+#define THREAD_SIZE_ORDER	(THREAD_SHIFT - PAGE_SHIFT)
+#endif
+
+#define THREAD_SIZE		(UL(1) << THREAD_SHIFT)
+
 /*
  * Memory types available.
  */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index b29ab0e..aa04b73 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -23,18 +23,11 @@
 
 #include <linux/compiler.h>
 
-#ifdef CONFIG_ARM64_4K_PAGES
-#define THREAD_SIZE_ORDER	2
-#elif defined(CONFIG_ARM64_16K_PAGES)
-#define THREAD_SIZE_ORDER	0
-#endif
-
-#define THREAD_SIZE		16384
-
 #ifndef __ASSEMBLY__
 
 struct task_struct;
 
+#include <asm/memory.h>
 #include <asm/stack_pointer.h>
 #include <asm/types.h>
 
-- 
1.9.1

* [PATCH 06/14] arm64: clean up irq stack definitions
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

Before we add yet another stack to the kernel, it would be nice to
ensure that we consistently organise stack definitions and related
helper functions.

This patch moves the basic IRQ stack definitions to <asm/memory.h> to
live with their task stack counterparts. Helpers used for unwinding are
moved into <asm/stacktrace.h>, where subsequent patches will add helpers
for other stacks. Includes are fixed up accordingly.

This patch is a pure refactoring -- there should be no functional
changes as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/irq.h        | 24 ------------------------
 arch/arm64/include/asm/memory.h     |  2 ++
 arch/arm64/include/asm/stacktrace.h | 25 ++++++++++++++++++++++++-
 arch/arm64/kernel/ptrace.c          |  1 +
 4 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index 1ebe202..5e6f772 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -1,20 +1,12 @@
 #ifndef __ASM_IRQ_H
 #define __ASM_IRQ_H
 
-#define IRQ_STACK_SIZE			THREAD_SIZE
-
 #ifndef __ASSEMBLER__
 
-#include <linux/percpu.h>
-#include <linux/sched/task_stack.h>
-
 #include <asm-generic/irq.h>
-#include <asm/thread_info.h>
 
 struct pt_regs;
 
-DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
-
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 
 static inline int nr_legacy_irqs(void)
@@ -22,21 +14,5 @@ static inline int nr_legacy_irqs(void)
 	return 0;
 }
 
-static inline bool on_irq_stack(unsigned long sp)
-{
-	unsigned long low = (unsigned long)raw_cpu_ptr(irq_stack);
-	unsigned long high = low + IRQ_STACK_SIZE;
-
-	return (low <= sp && sp < high);
-}
-
-static inline bool on_task_stack(struct task_struct *tsk, unsigned long sp)
-{
-	unsigned long low = (unsigned long)task_stack_page(tsk);
-	unsigned long high = low + THREAD_SIZE;
-
-	return (low <= sp && sp < high);
-}
-
 #endif /* !__ASSEMBLER__ */
 #endif
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 8ab4774..1fc2453 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -110,6 +110,8 @@
 
 #define THREAD_SIZE		(UL(1) << THREAD_SHIFT)
 
+#define IRQ_STACK_SIZE		THREAD_SIZE
+
 /*
  * Memory types available.
  */
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 3bebab3..000e2418 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -16,7 +16,12 @@
 #ifndef __ASM_STACKTRACE_H
 #define __ASM_STACKTRACE_H
 
-struct task_struct;
+#include <linux/percpu.h>
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+
+#include <asm/memory.h>
+#include <asm/ptrace.h>
 
 struct stackframe {
 	unsigned long fp;
@@ -31,4 +36,22 @@ extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
 			    int (*fn)(struct stackframe *, void *), void *data);
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk);
 
+DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
+
+static inline bool on_irq_stack(unsigned long sp)
+{
+	unsigned long low = (unsigned long)raw_cpu_ptr(irq_stack);
+	unsigned long high = low + IRQ_STACK_SIZE;
+
+	return (low <= sp && sp < high);
+}
+
+static inline bool on_task_stack(struct task_struct *tsk, unsigned long sp)
+{
+	unsigned long low = (unsigned long)task_stack_page(tsk);
+	unsigned long high = low + THREAD_SIZE;
+
+	return (low <= sp && sp < high);
+}
+
 #endif	/* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index baf0838..a9f8715 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -42,6 +42,7 @@
 #include <asm/compat.h>
 #include <asm/debug-monitors.h>
 #include <asm/pgtable.h>
+#include <asm/stacktrace.h>
 #include <asm/syscall.h>
 #include <asm/traps.h>
 #include <asm/system_misc.h>
-- 
1.9.1

* [PATCH 07/14] arm64: move SEGMENT_ALIGN to <asm/memory.h>
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

Currently we define SEGMENT_ALIGN directly in our vmlinux.lds.S.

This is unfortunate, as the EFI stub currently open-codes the same
number, and in future we'll want to fiddle with this.

This patch moves the definition to our <asm/memory.h>, where it can be
used by both vmlinux.lds.S and the EFI stub code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/memory.h | 19 +++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S | 16 ----------------
 2 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 1fc2453..7fa6ad4 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -113,6 +113,25 @@
 #define IRQ_STACK_SIZE		THREAD_SIZE
 
 /*
+ * Alignment of kernel segments (e.g. .text, .data).
+ */
+#if defined(CONFIG_DEBUG_ALIGN_RODATA)
+/*
+ *  4 KB granule:   1 level 2 entry
+ * 16 KB granule: 128 level 3 entries, with contiguous bit
+ * 64 KB granule:  32 level 3 entries, with contiguous bit
+ */
+#define SEGMENT_ALIGN			SZ_2M
+#else
+/*
+ *  4 KB granule:  16 level 3 entries, with contiguous bit
+ * 16 KB granule:   4 level 3 entries, without contiguous bit
+ * 64 KB granule:   1 level 3 entry
+ */
+#define SEGMENT_ALIGN			SZ_64K
+#endif
+
+/*
  * Memory types available.
  */
 #define MT_DEVICE_nGnRnE	0
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 987a00e..7156538 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -72,22 +72,6 @@ PECOFF_FILE_ALIGNMENT = 0x200;
 #define PECOFF_EDATA_PADDING
 #endif
 
-#if defined(CONFIG_DEBUG_ALIGN_RODATA)
-/*
- *  4 KB granule:   1 level 2 entry
- * 16 KB granule: 128 level 3 entries, with contiguous bit
- * 64 KB granule:  32 level 3 entries, with contiguous bit
- */
-#define SEGMENT_ALIGN			SZ_2M
-#else
-/*
- *  4 KB granule:  16 level 3 entries, with contiguous bit
- * 16 KB granule:   4 level 3 entries, without contiguous bit
- * 64 KB granule:   1 level 3 entry
- */
-#define SEGMENT_ALIGN			SZ_64K
-#endif
-
 SECTIONS
 {
 	/*
-- 
1.9.1

* [PATCH 08/14] efi/arm64: add EFI_KIMG_ALIGN
From: Mark Rutland @ 2017-08-07 18:35 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

The EFI stub is intimately coupled with the kernel, and takes advantage
of this by relocating the kernel at a weaker alignment than the
documented boot protocol mandates.

However, it does so by assuming that it can align the kernel to the
segment alignment, which it takes to be 64K. In subsequent patches,
we'll have to consider other details to determine this de-facto
alignment constraint.

This patch adds a new EFI_KIMG_ALIGN definition that will track the
kernel's de-facto alignment requirements. Subsequent patches will modify
this as required.
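
For example, assuming MIN_KIMG_ALIGN is SZ_2M, and with EFI_KIMG_ALIGN
initially SZ_64K, the stub's randomization mask works out as
(SZ_2M - 1) & ~(SZ_64K - 1) == 0x1f0000, i.e. any 64K-aligned
displacement below 2M.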

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/efi.h              | 3 +++
 drivers/firmware/efi/libstub/arm64-stub.c | 6 ++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 8f3043a..0e8cc3b 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -4,6 +4,7 @@
 #include <asm/boot.h>
 #include <asm/cpufeature.h>
 #include <asm/io.h>
+#include <asm/memory.h>
 #include <asm/mmu_context.h>
 #include <asm/neon.h>
 #include <asm/ptrace.h>
@@ -48,6 +49,8 @@
  */
 #define EFI_FDT_ALIGN	SZ_2M   /* used by allocate_new_fdt_and_exit_boot() */
 
+#define EFI_KIMG_ALIGN	SEGMENT_ALIGN
+
 /* on arm64, the FDT may be located anywhere in system RAM */
 static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base)
 {
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index b4c2589..af6ae95 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -11,6 +11,7 @@
  */
 #include <linux/efi.h>
 #include <asm/efi.h>
+#include <asm/memory.h>
 #include <asm/sections.h>
 #include <asm/sysreg.h>
 
@@ -81,9 +82,10 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table_arg,
 		/*
 		 * If CONFIG_DEBUG_ALIGN_RODATA is not set, produce a
 		 * displacement in the interval [0, MIN_KIMG_ALIGN) that
-		 * is a multiple of the minimal segment alignment (SZ_64K)
+		 * doesn't violate this kernel's de-facto alignment
+		 * constraints.
 		 */
-		u32 mask = (MIN_KIMG_ALIGN - 1) & ~(SZ_64K - 1);
+		u32 mask = (MIN_KIMG_ALIGN - 1) & ~(EFI_KIMG_ALIGN - 1);
 		u32 offset = !IS_ENABLED(CONFIG_DEBUG_ALIGN_RODATA) ?
 			     (phys_seed >> 32) & mask : TEXT_OFFSET;
 
-- 
1.9.1

* [PATCH 09/14] arm64: factor out entry stack manipulation
From: Mark Rutland @ 2017-08-07 18:36 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

In subsequent patches, we will detect stack overflow in our exception
entry code, by verifying the SP after it has been decremented to make
space for the exception regs.

This verification code is small, and we can minimize its impact by
placing it directly in the vectors. To avoid redundant modification of
the SP, we also need to move the initial decrement of the SP into the
vectors.

As a preparatory step, this patch introduces kernel_ventry, which
performs this decrement, and updates the entry code accordingly.
Subsequent patches will fold SP verification into kernel_ventry.

There should be no functional change as a result of this patch.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: turn into prep patch, expand commit msg]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/entry.S | 47 ++++++++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 1c0f787..bd3b6de 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -69,8 +69,13 @@
 #define BAD_FIQ		2
 #define BAD_ERROR	3
 
-	.macro	kernel_entry, el, regsize = 64
+	.macro kernel_ventry	label
+	.align 7
 	sub	sp, sp, #S_FRAME_SIZE
+	b	\label
+	.endm
+
+	.macro	kernel_entry, el, regsize = 64
 	.if	\regsize == 32
 	mov	w0, w0				// zero upper 32 bits of x0
 	.endif
@@ -315,31 +320,31 @@ tsk	.req	x28		// current thread_info
 
 	.align	11
 ENTRY(vectors)
-	ventry	el1_sync_invalid		// Synchronous EL1t
-	ventry	el1_irq_invalid			// IRQ EL1t
-	ventry	el1_fiq_invalid			// FIQ EL1t
-	ventry	el1_error_invalid		// Error EL1t
+	kernel_ventry	el1_sync_invalid		// Synchronous EL1t
+	kernel_ventry	el1_irq_invalid			// IRQ EL1t
+	kernel_ventry	el1_fiq_invalid			// FIQ EL1t
+	kernel_ventry	el1_error_invalid		// Error EL1t
 
-	ventry	el1_sync			// Synchronous EL1h
-	ventry	el1_irq				// IRQ EL1h
-	ventry	el1_fiq_invalid			// FIQ EL1h
-	ventry	el1_error_invalid		// Error EL1h
+	kernel_ventry	el1_sync			// Synchronous EL1h
+	kernel_ventry	el1_irq				// IRQ EL1h
+	kernel_ventry	el1_fiq_invalid			// FIQ EL1h
+	kernel_ventry	el1_error_invalid		// Error EL1h
 
-	ventry	el0_sync			// Synchronous 64-bit EL0
-	ventry	el0_irq				// IRQ 64-bit EL0
-	ventry	el0_fiq_invalid			// FIQ 64-bit EL0
-	ventry	el0_error_invalid		// Error 64-bit EL0
+	kernel_ventry	el0_sync			// Synchronous 64-bit EL0
+	kernel_ventry	el0_irq				// IRQ 64-bit EL0
+	kernel_ventry	el0_fiq_invalid			// FIQ 64-bit EL0
+	kernel_ventry	el0_error_invalid		// Error 64-bit EL0
 
 #ifdef CONFIG_COMPAT
-	ventry	el0_sync_compat			// Synchronous 32-bit EL0
-	ventry	el0_irq_compat			// IRQ 32-bit EL0
-	ventry	el0_fiq_invalid_compat		// FIQ 32-bit EL0
-	ventry	el0_error_invalid_compat	// Error 32-bit EL0
+	kernel_ventry	el0_sync_compat			// Synchronous 32-bit EL0
+	kernel_ventry	el0_irq_compat			// IRQ 32-bit EL0
+	kernel_ventry	el0_fiq_invalid_compat		// FIQ 32-bit EL0
+	kernel_ventry	el0_error_invalid_compat	// Error 32-bit EL0
 #else
-	ventry	el0_sync_invalid		// Synchronous 32-bit EL0
-	ventry	el0_irq_invalid			// IRQ 32-bit EL0
-	ventry	el0_fiq_invalid			// FIQ 32-bit EL0
-	ventry	el0_error_invalid		// Error 32-bit EL0
+	kernel_ventry	el0_sync_invalid		// Synchronous 32-bit EL0
+	kernel_ventry	el0_irq_invalid			// IRQ 32-bit EL0
+	kernel_ventry	el0_fiq_invalid			// FIQ 32-bit EL0
+	kernel_ventry	el0_error_invalid		// Error 32-bit EL0
 #endif
 END(vectors)
 
-- 
1.9.1

* [PATCH 10/14] arm64: assembler: allow adr_this_cpu to use the stack pointer
From: Mark Rutland @ 2017-08-07 18:36 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Given that adr_this_cpu already requires a temp register in addition
to the destination register, tweak the instruction sequence so that sp
may be used as well: adrp cannot write to sp, so the page address is
computed into the temp register, and the subsequent add instructions
(which can target sp) produce the final address.

This will simplify switching to per-cpu stacks in subsequent patches.
While this limits the range of adr_this_cpu to +/-4GiB, we don't
currently use adr_this_cpu in modules, and this is not problematic for
the main kernel image.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: add more commit text]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/assembler.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 610a420..4775af5 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -235,7 +235,8 @@
 	 * @tmp: scratch register
 	 */
 	.macro adr_this_cpu, dst, sym, tmp
-	adr_l	\dst, \sym
+	adrp	\tmp, \sym
+	add	\dst, \tmp, #:lo12:\sym
 	mrs	\tmp, tpidr_el1
 	add	\dst, \dst, \tmp
 	.endm
-- 
1.9.1

* [PATCH 11/14] arm64: use an irq stack pointer
From: Mark Rutland @ 2017-08-07 18:36 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

We allocate our IRQ stacks using a percpu array. This allows us to generate our
IRQ stack pointers with adr_this_cpu, but bloats the kernel Image with the boot
CPU's IRQ stack. Additionally, these are packed with other percpu variables,
and aren't guaranteed to have guard pages.

When we enable VMAP_STACK we'll want to vmap our IRQ stacks also, in order to
provide guard pages and to permit more stringent alignment requirements. Doing
so will require that we use a percpu pointer to each IRQ stack, rather than
allocating a percpu IRQ stack in the kernel image.

This patch updates our IRQ stack code to use a percpu pointer to the base of
each IRQ stack. This will allow us to change the way the stack is allocated
with minimal changes elsewhere. In some cases we may try to backtrace before
the IRQ stack pointers are initialised, so on_irq_stack() is updated to account
for this.

In testing with cyclictest, there was no measurable difference between using
adr_this_cpu (for irq_stack) and ldr_this_cpu (for irq_stack_ptr) in the IRQ
entry path.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/stacktrace.h |  7 +++++--
 arch/arm64/kernel/entry.S           |  2 +-
 arch/arm64/kernel/irq.c             | 10 ++++++++++
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 000e2418..4c68d8a 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -36,13 +36,16 @@ extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
 			    int (*fn)(struct stackframe *, void *), void *data);
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk);
 
-DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
+DECLARE_PER_CPU(unsigned long *, irq_stack_ptr);
 
 static inline bool on_irq_stack(unsigned long sp)
 {
-	unsigned long low = (unsigned long)raw_cpu_ptr(irq_stack);
+	unsigned long low = (unsigned long)raw_cpu_read(irq_stack_ptr);
 	unsigned long high = low + IRQ_STACK_SIZE;
 
+	if (!low)
+		return false;
+
 	return (low <= sp && sp < high);
 }
 
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index bd3b6de..e5aa866 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -272,7 +272,7 @@ alternative_else_nop_endif
 	and	x25, x25, #~(THREAD_SIZE - 1)
 	cbnz	x25, 9998f
 
-	adr_this_cpu x25, irq_stack, x26
+	ldr_this_cpu x25, irq_stack_ptr, x26
 	mov	x26, #IRQ_STACK_SIZE
 	add	x26, x25, x26
 
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 2386b26..5141282 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -32,6 +32,7 @@
 
 /* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
 DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack) __aligned(16);
+DEFINE_PER_CPU(unsigned long *, irq_stack_ptr);
 
 int arch_show_interrupts(struct seq_file *p, int prec)
 {
@@ -50,8 +51,17 @@ void __init set_handle_irq(void (*handle_irq)(struct pt_regs *))
 	handle_arch_irq = handle_irq;
 }
 
+static void init_irq_stacks(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		per_cpu(irq_stack_ptr, cpu) = per_cpu(irq_stack, cpu);
+}
+
 void __init init_IRQ(void)
 {
+	init_irq_stacks();
 	irqchip_init();
 	if (!handle_arch_irq)
 		panic("No interrupt controller found.");
-- 
1.9.1

* [PATCH 12/14] arm64: add basic VMAP_STACK support
From: Mark Rutland @ 2017-08-07 18:36 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

This patch enables arm64 to be built with vmap'd task and IRQ stacks.

As vmap'd stacks are mapped at page granularity, stacks must be a multiple of
PAGE_SIZE. This means that a 64K page kernel must use stacks of at least 64K in
size.

To minimize the increase in Image size, IRQ stacks are dynamically allocated at
boot time, rather than embedding the boot CPU's IRQ stack in the kernel image.

This patch was co-authored by Ard Biesheuvel and Mark Rutland.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/Kconfig              |  1 +
 arch/arm64/include/asm/efi.h    |  7 ++++++-
 arch/arm64/include/asm/memory.h | 23 ++++++++++++++++++++++-
 arch/arm64/kernel/irq.c         | 30 ++++++++++++++++++++++++++++--
 arch/arm64/kernel/vmlinux.lds.S |  2 +-
 5 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index dfd9086..d66f9db 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -75,6 +75,7 @@ config ARM64
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
+	select HAVE_ARCH_VMAP_STACK
 	select HAVE_ARM_SMCCC
 	select HAVE_EBPF_JIT
 	select HAVE_C_RECORDMCOUNT
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 0e8cc3b..2b1e5de 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -49,7 +49,12 @@
  */
 #define EFI_FDT_ALIGN	SZ_2M   /* used by allocate_new_fdt_and_exit_boot() */
 
-#define EFI_KIMG_ALIGN	SEGMENT_ALIGN
+/*
+ * In some configurations (e.g. VMAP_STACK && 64K pages), stacks built into the
+ * kernel need greater alignment than we require the segments to be padded to.
+ */
+#define EFI_KIMG_ALIGN	\
+	(SEGMENT_ALIGN > THREAD_ALIGN ? SEGMENT_ALIGN : THREAD_ALIGN)
 
 /* on arm64, the FDT may be located anywhere in system RAM */
 static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 7fa6ad4..c5cd2c5 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -102,7 +102,17 @@
 #define KASAN_SHADOW_SIZE	(0)
 #endif
 
-#define THREAD_SHIFT		14
+#define MIN_THREAD_SHIFT	14
+
+/*
+ * VMAP'd stacks are allocated at page granularity, so we must ensure that such
+ * stacks are a multiple of page size.
+ */
+#if defined(CONFIG_VMAP_STACK) && (MIN_THREAD_SHIFT < PAGE_SHIFT)
+#define THREAD_SHIFT		PAGE_SHIFT
+#else
+#define THREAD_SHIFT		MIN_THREAD_SHIFT
+#endif
 
 #if THREAD_SHIFT >= PAGE_SHIFT
 #define THREAD_SIZE_ORDER	(THREAD_SHIFT - PAGE_SHIFT)
@@ -110,6 +120,17 @@
 
 #define THREAD_SIZE		(UL(1) << THREAD_SHIFT)
 
+/*
+ * By aligning VMAP'd stacks to 2 * THREAD_SIZE, we can detect overflow by
+ * checking sp & (1 << THREAD_SHIFT), which we can do cheaply in the entry
+ * assembly.
+ */
+#ifdef CONFIG_VMAP_STACK
+#define THREAD_ALIGN		(2 * THREAD_SIZE)
+#else
+#define THREAD_ALIGN		THREAD_SIZE
+#endif
+
 #define IRQ_STACK_SIZE		THREAD_SIZE
 
 /*
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 5141282..713561e 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -23,15 +23,15 @@
 
 #include <linux/kernel_stat.h>
 #include <linux/irq.h>
+#include <linux/memory.h>
 #include <linux/smp.h>
 #include <linux/init.h>
 #include <linux/irqchip.h>
 #include <linux/seq_file.h>
+#include <linux/vmalloc.h>
 
 unsigned long irq_err_count;
 
-/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
-DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack) __aligned(16);
 DEFINE_PER_CPU(unsigned long *, irq_stack_ptr);
 
 int arch_show_interrupts(struct seq_file *p, int prec)
@@ -51,6 +51,31 @@ void __init set_handle_irq(void (*handle_irq)(struct pt_regs *))
 	handle_arch_irq = handle_irq;
 }
 
+#ifdef CONFIG_VMAP_STACK
+static void init_irq_stacks(void)
+{
+	int cpu;
+	unsigned long *p;
+
+	for_each_possible_cpu(cpu) {
+		/*
+		* To ensure that VMAP'd stack overflow detection works
+		* correctly, the IRQ stacks need to have the same
+		* alignment as other stacks.
+		*/
+		p = __vmalloc_node_range(IRQ_STACK_SIZE, THREAD_ALIGN,
+					 VMALLOC_START, VMALLOC_END,
+					 THREADINFO_GFP, PAGE_KERNEL,
+					 0, cpu_to_node(cpu),
+					 __builtin_return_address(0));
+
+		per_cpu(irq_stack_ptr, cpu) = p;
+	}
+}
+#else
+/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
+DEFINE_PER_CPU_ALIGNED(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
+
 static void init_irq_stacks(void)
 {
 	int cpu;
@@ -58,6 +83,7 @@ static void init_irq_stacks(void)
 	for_each_possible_cpu(cpu)
 		per_cpu(irq_stack_ptr, cpu) = per_cpu(irq_stack, cpu);
 }
+#endif
 
 void __init init_IRQ(void)
 {
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7156538..fe56c26 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -176,7 +176,7 @@ SECTIONS
 
 	_data = .;
 	_sdata = .;
-	RW_DATA_SECTION(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
+	RW_DATA_SECTION(L1_CACHE_BYTES, PAGE_SIZE, THREAD_ALIGN)
 
 	/*
 	 * Data written with the MMU off but read with the MMU on requires
-- 
1.9.1

* [PATCH 13/14] arm64: add on_accessible_stack()
From: Mark Rutland @ 2017-08-07 18:36 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

Both unwind_frame() and dump_backtrace() try to check whether a stack
address is sane to access, with very similar logic. Both will need
updating in order to handle overflow stacks.

Factor out this logic into a helper, so that we can avoid further
duplication when we add overflow stacks.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/stacktrace.h | 16 ++++++++++++++++
 arch/arm64/kernel/stacktrace.c      |  7 +------
 arch/arm64/kernel/traps.c           |  3 +--
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 4c68d8a..92ddb6d 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -57,4 +57,20 @@ static inline bool on_task_stack(struct task_struct *tsk, unsigned long sp)
 	return (low <= sp && sp < high);
 }
 
+/*
+ * We can only safely access per-cpu stacks from current in a non-preemptible
+ * context.
+ */
+static inline bool on_accessible_stack(struct task_struct *tsk, unsigned long sp)
+{
+	if (on_task_stack(tsk, sp))
+		return true;
+	if (tsk != current || preemptible())
+		return false;
+	if (on_irq_stack(sp))
+		return true;
+
+	return false;
+}
+
 #endif	/* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 54f3463..d9b80eb 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -50,12 +50,7 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 	if (!tsk)
 		tsk = current;
 
-	/*
-	 * Switching between stacks is valid when tracing current and in
-	 * non-preemptible context.
-	 */
-	if (!(tsk == current && !preemptible() && on_irq_stack(fp)) &&
-	    !on_task_stack(tsk, fp))
+	if (!on_accessible_stack(tsk, fp))
 		return -EINVAL;
 
 	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 9633773..d01c598 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -193,8 +193,7 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
 		if (in_entry_text(frame.pc)) {
 			stack = frame.fp - offsetof(struct pt_regs, stackframe);
 
-			if (on_task_stack(tsk, stack) ||
-			    (tsk == current && !preemptible() && on_irq_stack(stack)))
+			if (on_accessible_stack(tsk, stack))
 				dump_mem("", "Exception stack", stack,
 					 stack + sizeof(struct pt_regs));
 		}
-- 
1.9.1

* [PATCH 14/14] arm64: add VMAP_STACK overflow detection
From: Mark Rutland @ 2017-08-07 18:36 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: ard.biesheuvel, catalin.marinas, james.morse, labbott,
	linux-kernel, luto, mark.rutland, matt, will.deacon,
	kernel-hardening, keescook

This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.

Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).

Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
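
This can be seen in the example below: the 16K task stack spans
[0xffff00000d540000..0xffff00000d544000), within which bit 14 of any
valid SP is zero, while the faulting SP of 0xffff00000d53ff30 sits
just below the stack base and has bit 14 set.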

The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
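
For reference, the swap sequence is the one outlined in the cover
letter:

	add	sp, sp, x0		// sp' = sp + x0
	sub	x0, sp, x0		// x0' = sp' - x0 = (sp + x0) - x0 = sp
	tbnz	x0, #THREAD_SHIFT, overflow_handler
	sub	x0, sp, x0		// sp' - x0' = (sp + x0) - sp = x0
	sub	sp, sp, x0		// sp' - x0 = (sp + x0) - x0 = sp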

This gives us a useful error message on stack overflow, as can be triggered
with the LKDTM overflow test:

root@ribbensteg:/sys/kernel/debug/provoke-crash# echo OVERFLOW > DIRECT
[  116.249161] lkdtm: Performing direct entry OVERFLOW
[  116.254048] Insufficient stack space to handle exception!
[  116.254059] CPU: 4 PID: 2269 Comm: bash Not tainted 4.13.0-rc3-00020-g307fec7 #197
[  116.266913] Hardware name: ARM Juno development board (r1) (DT)
[  116.272783] task: ffff800976bf0e00 task.stack: ffff00000d540000
[  116.278660] PC is at recursive_loop+0x10/0x50
[  116.282981] LR is at recursive_loop+0x34/0x50
[  116.287300] pc : [<ffff000008597778>] lr : [<ffff00000859779c>] pstate: 40000145
[  116.294633] sp : ffff00000d53ff30
[  116.297916] x29: ffff00000d540350 x28: ffff800976bf0e00
[  116.303188] x27: ffff000008981000 x26: ffff000008f701f8
[  116.308458] x25: ffff00000d543eb8 x24: ffff00000d543eb8
[  116.313729] x23: ffff000008f6ff30 x22: 0000000000000009
[  116.318999] x21: ffff800975c43000 x20: ffff000008f6ff80
[  116.324269] x19: 0000000000000013 x18: 0000000000000010
[  116.329539] x17: 0000ffffb24cf6a4 x16: ffff0000081fbc40
[  116.334820] x15: 0000000000000006 x14: ffff000088fc637f
[  116.340099] x13: ffff000008fc638d x12: ffff000008ec2460
[  116.345379] x11: ffff00000d543a30 x10: 0000000005f5e0ff
[  116.350659] x9 : 00000000ffffffd0 x8 : ffff00000d540770
[  116.355939] x7 : 1313131313131313 x6 : 000000000000019c
[  116.361218] x5 : 0000000000000000 x4 : 0000000000000000
[  116.366497] x3 : 0000000000000000 x2 : 0000000000000400
[  116.371777] x1 : 0000000000000013 x0 : 0000000000000012
[  116.377058] Task stack:     [0xffff00000d540000..0xffff00000d544000]
[  116.383366] IRQ stack:      [0xffff000008020000..0xffff000008024000]
[  116.389675] Overflow stack: [0xffff80097ffa54e0..0xffff80097ffa64e0]
[  116.395984] ESR: 0x96000047 -- DABT (current EL)
[  116.400569] FAR: 0xffff00000d53ff30
[  116.404036] Kernel panic - not syncing: kernel stack overflow
[  116.409744] CPU: 4 PID: 2269 Comm: bash Not tainted 4.13.0-rc3-00020-g307fec7 #197
[  116.417268] Hardware name: ARM Juno development board (r1) (DT)
[  116.423146] Call trace:
[  116.425587] [<ffff0000080883a0>] dump_backtrace+0x0/0x268
[  116.430955] [<ffff0000080886cc>] show_stack+0x14/0x20
[  116.435976] [<ffff00000894e138>] dump_stack+0x98/0xb8
[  116.440997] [<ffff0000080c1e44>] panic+0x118/0x28c
[  116.445758] [<ffff0000080c1a84>] nmi_panic+0x6c/0x70
[  116.450693] [<ffff000008088f88>] handle_bad_stack+0x118/0x128
[  116.456401] Exception stack(0xffff80097ffa63a0 to 0xffff80097ffa64e0)
[  116.462799] 63a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[  116.470585] 63c0: 0000000000000000 0000000000000000 000000000000019c 1313131313131313
[  116.478372] 63e0: ffff00000d540770 00000000ffffffd0 0000000005f5e0ff ffff00000d543a30
[  116.486157] 6400: ffff000008ec2460 ffff000008fc638d ffff000088fc637f 0000000000000006
[  116.493943] 6420: ffff0000081fbc40 0000ffffb24cf6a4 0000000000000010 0000000000000013
[  116.501730] 6440: ffff000008f6ff80 ffff800975c43000 0000000000000009 ffff000008f6ff30
[  116.509516] 6460: ffff00000d543eb8 ffff00000d543eb8 ffff000008f701f8 ffff000008981000
[  116.517302] 6480: ffff800976bf0e00 ffff00000d540350 ffff00000859779c ffff00000d53ff30
[  116.525087] 64a0: ffff000008597778 0000000040000145 0000000000000000 0000000000000000
[  116.532874] 64c0: 0001000000000000 0000000000000000 ffff00000d540350 ffff000008597778
[  116.540660] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[  116.545767] [<ffff000008597778>] recursive_loop+0x10/0x50
[  116.551132] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.556497] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.561862] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.567228] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.572592] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.577957] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.583322] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.588687] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.594051] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.599416] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.604781] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.610146] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.615511] [<ffff00000859779c>] recursive_loop+0x34/0x50
[  116.620876] [<ffff00000859782c>] lkdtm_OVERFLOW+0x14/0x20
[  116.626241] [<ffff000008597760>] lkdtm_do_action+0x1c/0x24
[  116.631693] [<ffff0000085975d0>] direct_entry+0xe0/0x168
[  116.636974] [<ffff000008340f98>] full_proxy_write+0x60/0xa8
[  116.642511] [<ffff0000081f93dc>] __vfs_write+0x1c/0x118
[  116.647704] [<ffff0000081fa824>] vfs_write+0x9c/0x1a8
[  116.652723] [<ffff0000081fbc84>] SyS_write+0x44/0xa0
[  116.657655] Exception stack(0xffff00000d543ec0 to 0xffff00000d544000)
[  116.664053] 3ec0: 0000000000000001 000000001952d808 0000000000000009 0000000000000000
[  116.671838] 3ee0: 0000000000000000 0000000000000000 0000ffffb24d6c6c 0dfefefefeff07ff
[  116.679624] 3f00: 0000000000000040 fefefefefefefeff 0000000019555b28 0000000000000008
[  116.687411] 3f20: 0000000000000000 0000000000000018 ffffffffffffffff 00000ca9b8000000
[  116.695196] 3f40: 0000000000000000 0000ffffb24cf6a4 0000ffffd8d00e40 0000000000000009
[  116.702983] 3f60: 000000001952d808 0000ffffb25ad178 0000000000000009 0000000000000000
[  116.710768] 3f80: 0000000000000001 00000000004c9c98 00000000004ca628 00000000004ed000
[  116.718554] 3fa0: 00000000004ea8e0 0000ffffd8d00fe0 0000ffffb24d674c 0000ffffd8d00fe0
[  116.726340] 3fc0: 0000ffffb2524fec 0000000060000000 0000000000000001 0000000000000040
[  116.734125] 3fe0: 0000000000000000 0000000000000000 0000000000000000 0000ffffb2524fec
[  116.741912] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[  116.747189] [<0000ffffb2524fec>] 0xffffb2524fec
[  116.751695] SMP: stopping secondary CPUs
[  116.755909] Kernel Offset: disabled
[  116.759375] CPU features: 0x002086
[  116.762753] Memory Limit: none
[  116.765795] ---[ end Kernel panic - not syncing: kernel stack overflow

This patch was co-authored by Ard Biesheuvel and Mark Rutland.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/memory.h     |  2 ++
 arch/arm64/include/asm/stacktrace.h | 18 +++++++++++
 arch/arm64/kernel/entry.S           | 59 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/traps.c           | 39 ++++++++++++++++++++++++
 4 files changed, 118 insertions(+)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index c5cd2c5..1a025b7 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -133,6 +133,8 @@
 
 #define IRQ_STACK_SIZE		THREAD_SIZE
 
+#define OVERFLOW_STACK_SIZE	SZ_4K
+
 /*
  * Alignment of kernel segments (e.g. .text, .data).
  */
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 92ddb6d..ee19563 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -57,6 +57,22 @@ static inline bool on_task_stack(struct task_struct *tsk, unsigned long sp)
 	return (low <= sp && sp < high);
 }
 
+#ifdef CONFIG_VMAP_STACK
+DECLARE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack);
+
+#define OVERFLOW_STACK_PTR() ((unsigned long)this_cpu_ptr(overflow_stack) + OVERFLOW_STACK_SIZE)
+
+static inline bool on_overflow_stack(unsigned long sp)
+{
+	unsigned long low = (unsigned long)this_cpu_ptr(overflow_stack);
+	unsigned long high = low + OVERFLOW_STACK_SIZE;
+
+	return (low <= sp && sp < high);
+}
+#else
+static inline bool on_overflow_stack(unsigned long sp) { return false; }
+#endif
+
 /*
  * We can only safely access per-cpu stacks from current in a non-preemptible
  * context.
@@ -69,6 +85,8 @@ static inline bool on_accessible_stack(struct task_struct *tsk, unsigned long sp
 		return false;
 	if (on_irq_stack(sp))
 		return true;
+	if (on_overflow_stack(sp))
+		return true;
 
 	return false;
 }
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e5aa866..44a27c3 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -72,6 +72,37 @@
 	.macro kernel_ventry	label
 	.align 7
 	sub	sp, sp, #S_FRAME_SIZE
+#ifdef CONFIG_VMAP_STACK
+	add	sp, sp, x0			// sp' = sp + x0
+	sub	x0, sp, x0			// x0' = sp' - x0 = (sp + x0) - x0 = sp
+	tbnz	x0, #THREAD_SHIFT, 0f
+	sub	x0, sp, x0			// sp' - x0' = (sp + x0) - sp = x0
+	sub	sp, sp, x0			// sp' - x0 = (sp + x0) - x0 = sp
+	b	\label
+
+	/* Stash the original SP value in tpidr_el0 */
+0:	msr	tpidr_el0, x0
+
+	/* Recover the original x0 value and stash it in tpidrro_el0 */
+	sub	x0, sp, x0
+	msr	tpidrro_el0, x0
+
+	/* Switch to the overflow stack */
+	adr_this_cpu sp, overflow_stack + OVERFLOW_STACK_SIZE, x0
+
+	/*
+	 * Check whether we were already on the overflow stack. This may happen
+	 * after panic() re-enables interrupts.
+	 */
+	mrs	x0, tpidr_el0			// sp of interrupted context
+	sub	x0, sp, x0			// delta with top of overflow stack
+	tst	x0, #~(OVERFLOW_STACK_SIZE - 1)	// within range?
+	b.ne	__bad_stack			// no? -> bad stack pointer
+
+	/* We were already on the overflow stack. Restore sp/x0 and carry on. */
+	sub	sp, sp, x0
+	mrs	x0, tpidrro_el0
+#endif
 	b	\label
 	.endm
 
@@ -348,6 +379,34 @@ ENTRY(vectors)
 #endif
 END(vectors)
 
+#ifdef CONFIG_VMAP_STACK
+	/*
+	 * We detected an overflow in kernel_ventry, which switched to the
+	 * overflow stack. Stash the exception regs, and head to our overflow
+	 * handler.
+	 */
+__bad_stack:
+	/* Restore the original x0 value */
+	mrs	x0, tpidrro_el0
+
+	/*
+	 * Store the original GPRs to the new stack. The orginial SP (minus
+	 * S_FRAME_SIZE) was stashed in tpidr_el0 by kernel_ventry.
+	 */
+	sub	sp, sp, #S_FRAME_SIZE
+	kernel_entry 1
+	mrs	x0, tpidr_el0
+	add	x0, x0, #S_FRAME_SIZE
+	str	x0, [sp, #S_SP]
+
+	/* Stash the regs for handle_bad_stack */
+	mov	x0, sp
+
+	/* Time to die */
+	bl	handle_bad_stack
+	ASM_BUG()
+#endif /* CONFIG_VMAP_STACK */
+
 /*
  * Invalid mode handlers
  */
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index d01c598..2c80a11 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -32,6 +32,7 @@
 #include <linux/sched/signal.h>
 #include <linux/sched/debug.h>
 #include <linux/sched/task_stack.h>
+#include <linux/sizes.h>
 #include <linux/syscalls.h>
 #include <linux/mm_types.h>
 
@@ -41,6 +42,7 @@
 #include <asm/esr.h>
 #include <asm/insn.h>
 #include <asm/traps.h>
+#include <asm/smp.h>
 #include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 #include <asm/exception.h>
@@ -666,6 +668,43 @@ asmlinkage void bad_el0_sync(struct pt_regs *regs, int reason, unsigned int esr)
 	force_sig_info(info.si_signo, &info, current);
 }
 
+#ifdef CONFIG_VMAP_STACK
+
+DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
+	__aligned(16);
+
+asmlinkage void handle_bad_stack(struct pt_regs *regs)
+{
+	unsigned long tsk_stk = (unsigned long)current->stack;
+	unsigned long irq_stk = (unsigned long)this_cpu_read(irq_stack_ptr);
+	unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack);
+	unsigned int esr = read_sysreg(esr_el1);
+	unsigned long far = read_sysreg(far_el1);
+
+	console_verbose();
+	pr_emerg("Insufficient stack space to handle exception!");
+
+	__show_regs(regs);
+
+	pr_emerg("Task stack:     [0x%016lx..0x%016lx]\n",
+		 tsk_stk, tsk_stk + THREAD_SIZE);
+	pr_emerg("IRQ stack:      [0x%016lx..0x%016lx]\n",
+		 irq_stk, irq_stk + THREAD_SIZE);
+	pr_emerg("Overflow stack: [0x%016lx..0x%016lx]\n",
+		 ovf_stk, ovf_stk + OVERFLOW_STACK_SIZE);
+
+	pr_emerg("ESR: 0x%08x -- %s\n", esr, esr_get_class_string(esr));
+	pr_emerg("FAR: 0x%016lx\n", far);
+
+	/*
+	 * We use nmi_panic to limit the potential for recursive overflows, and
+	 * to get a better stack trace.
+	 */
+	nmi_panic(NULL, "kernel stack overflow");
+	cpu_park_loop();
+}
+#endif
+
 void __pte_error(const char *file, int line, unsigned long val)
 {
 	pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
-- 
1.9.1

* Re: [PATCH 05/14] arm64: clean up THREAD_* definitions
  2017-08-07 18:35 ` [PATCH 05/14] arm64: clean up THREAD_* definitions Mark Rutland
@ 2017-08-14 11:59   ` Catalin Marinas
  2017-08-14 13:10     ` Mark Rutland
  0 siblings, 1 reply; 23+ messages in thread
From: Catalin Marinas @ 2017-08-14 11:59 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, keescook, ard.biesheuvel, matt,
	kernel-hardening, will.deacon, linux-kernel, luto, james.morse,
	labbott

On Mon, Aug 07, 2017 at 07:35:56PM +0100, Mark Rutland wrote:
> Currently we define THREAD_SIZE and THREAD_SIZE order separately, with

s/THREAD_SIZE order/THREAD_SIZE_ORDER/

> the latter dependent on particular CONFIG_ARM64_*K_PAGES definitions.
> This is somewhat opaque, and will get in the way of future modifications
> to THREAD_SIZE.
> 
> This patch cleans this up, defining both in terms of a common
> THREAD_SHIFT, and using PAGE_SHIFT to calculate THREAD_SIZE_ORDER,
> rather than using a number of definitions dependent on config symbols.
> Subsequent patches will make use of this to alter the stack size used in
> some configurations.
> 
> At the same time, these are moved into <asm/memory.h>, which will avoid
> circular include issues in subsequent patches. To ensure that existing
> code isn't adversely affected, <asm/thread_info.h> is updated to
> transitively include these definitions.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Laura Abbott <labbott@redhat.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm64/include/asm/memory.h      | 8 ++++++++
>  arch/arm64/include/asm/thread_info.h | 9 +--------
>  2 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 77d55dc..8ab4774 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -102,6 +102,14 @@
>  #define KASAN_SHADOW_SIZE	(0)
>  #endif
>  
> +#define THREAD_SHIFT		14
> +
> +#if THREAD_SHIFT >= PAGE_SHIFT
> +#define THREAD_SIZE_ORDER	(THREAD_SHIFT - PAGE_SHIFT)
> +#endif
> +
> +#define THREAD_SIZE		(UL(1) << THREAD_SHIFT)

I haven't tried the series up to this patch but it seems to me that
THREAD_SIZE_ORDER is undefined for a PAGE_SHIFT of 16. Without
VMAP_STACK, it may fail to build.

-- 
Catalin

* Re: [PATCH 05/14] arm64: clean up THREAD_* definitions
  2017-08-14 11:59   ` Catalin Marinas
@ 2017-08-14 13:10     ` Mark Rutland
  0 siblings, 0 replies; 23+ messages in thread
From: Mark Rutland @ 2017-08-14 13:10 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, keescook, ard.biesheuvel, matt,
	kernel-hardening, will.deacon, linux-kernel, luto, james.morse,
	labbott

On Mon, Aug 14, 2017 at 12:59:59PM +0100, Catalin Marinas wrote:
> On Mon, Aug 07, 2017 at 07:35:56PM +0100, Mark Rutland wrote:
> > Currently we define THREAD_SIZE and THREAD_SIZE order separately, with
> 
> s/THREAD_SIZE order/THREAD_SIZE_ORDER/

Whoops; I'll fix that up now.

> > +#define THREAD_SHIFT		14
> > +
> > +#if THREAD_SHIFT >= PAGE_SHIFT
> > +#define THREAD_SIZE_ORDER	(THREAD_SHIFT - PAGE_SHIFT)
> > +#endif
> > +
> > +#define THREAD_SIZE		(UL(1) << THREAD_SHIFT)
> 
> I haven't tried the series up to this patch but it seems to me that
> THREAD_SIZE_ORDER is undefined for a PAGE_SHIFT of 16. 

That is already the case without these patches, as we have:

#ifdef CONFIG_ARM64_4K_PAGES
#define THREAD_SIZE_ORDER       2
#elif defined(CONFIG_ARM64_16K_PAGES)
#define THREAD_SIZE_ORDER       0
#endif

... this is also deliberate, as we'd need an order of -2 with 16K stacks and
64K pages, and this doesn't make sense.

THREAD_SIZE_ORDER only matters if THREAD_SIZE >= PAGE_SIZE (or if we're using a
VMAP'd stack). For 64K pages && VMAP_STACK, we'll use a 64K stack, avoiding the
problem.
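
Spelling the arithmetic out (illustrative comments, not patch text):

	/* 4K pages:  THREAD_SIZE_ORDER = 14 - 12 = 2 (16K stack = 4 pages) */
	/* 16K pages: THREAD_SIZE_ORDER = 14 - 14 = 0 (16K stack = 1 page)  */
	/* 64K pages: 14 - 16 = -2 is not a valid order, so it is left undefined */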

Thanks,
Mark.

* Re: [PATCH 14/14] arm64: add VMAP_STACK overflow detection
  2017-08-07 18:36 ` [PATCH 14/14] arm64: add VMAP_STACK overflow detection Mark Rutland
@ 2017-08-14 15:32   ` Will Deacon
  2017-08-14 17:25     ` Mark Rutland
  2017-08-15 11:10   ` Catalin Marinas
  1 sibling, 1 reply; 23+ messages in thread
From: Will Deacon @ 2017-08-14 15:32 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, ard.biesheuvel, catalin.marinas, james.morse,
	labbott, linux-kernel, luto, matt, kernel-hardening, keescook

Just some minor comments on this (after taking ages to realise you were
using tpidr_el0 as a temporary rather than tpidr_el1 and getting totally
confused!).

On Mon, Aug 07, 2017 at 07:36:05PM +0100, Mark Rutland wrote:
> This patch adds stack overflow detection to arm64, usable when vmap'd stacks
> are in use.
> 
> Overflow is detected in a small preamble executed for each exception entry,
> which checks whether there is enough space on the current stack for the general
> purpose registers to be saved. If there is not enough space, the overflow
> handler is invoked on a per-cpu overflow stack. This approach preserves the
> original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
> 
> Task and IRQ stacks are aligned to double their size, enabling overflow to be
> detected with a single bit test. For example, a 16K stack is aligned to 32K,
> ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
> this bit is flipped. Thus, overflow (of less than the size of the stack) can be
> detected by testing whether this bit is set.
> 
> The overflow check is performed before any attempt is made to access the
> stack, avoiding recursive faults (and the loss of exception information
> these would entail). As logical operations cannot be performed on the SP
> directly, the SP is temporarily swapped with a general purpose register
> using arithmetic operations to enable the test to be performed.

[...]

> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index c5cd2c5..1a025b7 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -133,6 +133,8 @@
>  
>  #define IRQ_STACK_SIZE		THREAD_SIZE
>  
> +#define OVERFLOW_STACK_SIZE	SZ_4K
> +
>  /*
>   * Alignment of kernel segments (e.g. .text, .data).
>   */
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index 92ddb6d..ee19563 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -57,6 +57,22 @@ static inline bool on_task_stack(struct task_struct *tsk, unsigned long sp)
>  	return (low <= sp && sp < high);
>  }
>  
> +#ifdef CONFIG_VMAP_STACK
> +DECLARE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack);
> +
> +#define OVERFLOW_STACK_PTR() ((unsigned long)this_cpu_ptr(overflow_stack) + OVERFLOW_STACK_SIZE)
> +
> +static inline bool on_overflow_stack(unsigned long sp)
> +{
> +	unsigned long low = (unsigned long)this_cpu_ptr(overflow_stack);

Can you use raw_cpu_ptr here, like you do for the irq stack?

> +	unsigned long high = low + OVERFLOW_STACK_SIZE;
> +
> +	return (low <= sp && sp < high);
> +}
> +#else
> +static inline bool on_overflow_stack(unsigned long sp) { return false; }
> +#endif
> +
>  /*
>   * We can only safely access per-cpu stacks from current in a non-preemptible
>   * context.
> @@ -69,6 +85,8 @@ static inline bool on_accessible_stack(struct task_struct *tsk, unsigned long sp
>  		return false;
>  	if (on_irq_stack(sp))
>  		return true;
> +	if (on_overflow_stack(sp))
> +		return true;

I find the "return false" clause in this function makes it fiddly to
read because it's really predicating all following conditionals on current
&& !preemptible, but I haven't got any better ideas :(

>  	return false;
>  }
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index e5aa866..44a27c3 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -72,6 +72,37 @@
>  	.macro kernel_ventry	label
>  	.align 7
>  	sub	sp, sp, #S_FRAME_SIZE
> +#ifdef CONFIG_VMAP_STACK
> +	add	sp, sp, x0			// sp' = sp + x0
> +	sub	x0, sp, x0			// x0' = sp' - x0 = (sp + x0) - x0 = sp
> +	tbnz	x0, #THREAD_SHIFT, 0f
> +	sub	x0, sp, x0			// sp' - x0' = (sp + x0) - sp = x0
> +	sub	sp, sp, x0			// sp' - x0 = (sp + x0) - x0 = sp
> +	b	\label
> +
> +	/* Stash the original SP value in tpidr_el0 */
> +0:	msr	tpidr_el0, x0

The comment here is a bit confusing, since the sp has already been
decremented for the frame, as mentioned in a later comment.

> +
> +	/* Recover the original x0 value and stash it in tpidrro_el0 */
> +	sub	x0, sp, x0
> +	msr	tpidrro_el0, x0
> +
> +	/* Switch to the overflow stack */
> +	adr_this_cpu sp, overflow_stack + OVERFLOW_STACK_SIZE, x0
> +
> +	/*
> +	 * Check whether we were already on the overflow stack. This may happen
> +	 * after panic() re-enables interrupts.
> +	 */
> +	mrs	x0, tpidr_el0			// sp of interrupted context
> +	sub	x0, sp, x0			// delta with top of overflow stack
> +	tst	x0, #~(OVERFLOW_STACK_SIZE - 1)	// within range?
> +	b.ne	__bad_stack			// no? -> bad stack pointer
> +
> +	/* We were already on the overflow stack. Restore sp/x0 and carry on. */
> +	sub	sp, sp, x0
> +	mrs	x0, tpidrro_el0
> +#endif
>  	b	\label
>  	.endm
>  
> @@ -348,6 +379,34 @@ ENTRY(vectors)
>  #endif
>  END(vectors)
>  
> +#ifdef CONFIG_VMAP_STACK
> +	/*
> +	 * We detected an overflow in kernel_ventry, which switched to the
> +	 * overflow stack. Stash the exception regs, and head to our overflow
> +	 * handler.
> +	 */
> +__bad_stack:
> +	/* Restore the original x0 value */
> +	mrs	x0, tpidrro_el0
> +
> +	/*
> +	 * Store the original GPRs to the new stack. The orginial SP (minus

original

> +	 * S_FRAME_SIZE) was stashed in tpidr_el0 by kernel_ventry.
> +	 */
> +	sub	sp, sp, #S_FRAME_SIZE
> +	kernel_entry 1
> +	mrs	x0, tpidr_el0
> +	add	x0, x0, #S_FRAME_SIZE
> +	str	x0, [sp, #S_SP]
> +
> +	/* Stash the regs for handle_bad_stack */
> +	mov	x0, sp
> +
> +	/* Time to die */
> +	bl	handle_bad_stack
> +	ASM_BUG()

Why not just a b without the ASM_BUG?

Will

* Re: [PATCH 10/14] arm64: assembler: allow adr_this_cpu to use the stack pointer
  2017-08-07 18:36 ` [PATCH 10/14] arm64: assembler: allow adr_this_cpu to use the stack pointer Mark Rutland
@ 2017-08-14 17:13   ` Catalin Marinas
  2017-08-14 17:42     ` Mark Rutland
  0 siblings, 1 reply; 23+ messages in thread
From: Catalin Marinas @ 2017-08-14 17:13 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, keescook, ard.biesheuvel, matt,
	kernel-hardening, will.deacon, linux-kernel, luto, james.morse,
	labbott

On Mon, Aug 07, 2017 at 07:36:01PM +0100, Mark Rutland wrote:
> From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> 
> Given that adr_this_cpu already requires a temp register in addition
> to the destination register, tweak the instruction sequence so that sp
> may be used as well.
> 
> This will simplify switching to per-cpu stacks in subsequent patches. While
> this limits the range of adr_this_cpu to +/-4GiB, we don't currently use
> adr_this_cpu in modules, and this is not problematic for the main kernel image.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> [Mark: add more commit text]
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Laura Abbott <labbott@redhat.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm64/include/asm/assembler.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index 610a420..4775af5 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -235,7 +235,8 @@
>  	 * @tmp: scratch register
>  	 */
>  	.macro adr_this_cpu, dst, sym, tmp
> -	adr_l	\dst, \sym
> +	adrp	\tmp, \sym
> +	add	\dst, \tmp, #:lo12:\sym
>  	mrs	\tmp, tpidr_el1
>  	add	\dst, \dst, \tmp
>  	.endm

Nitpick: it may be worth adding an #ifndef MODULE around these macros,
together with a comment, just in case. There are other macros in this
file like adr_l which are used in modules (crypto).

-- 
Catalin

* Re: [PATCH 14/14] arm64: add VMAP_STACK overflow detection
  2017-08-14 15:32   ` Will Deacon
@ 2017-08-14 17:25     ` Mark Rutland
  0 siblings, 0 replies; 23+ messages in thread
From: Mark Rutland @ 2017-08-14 17:25 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, ard.biesheuvel, catalin.marinas, james.morse,
	labbott, linux-kernel, luto, matt, kernel-hardening, keescook

On Mon, Aug 14, 2017 at 04:32:53PM +0100, Will Deacon wrote:
> Just some minor comments on this (after taking ages to realise you were
> using tpidr_el0 as a temporary rather than tpidr_el1 and getting totally
> confused!).
> 
> On Mon, Aug 07, 2017 at 07:36:05PM +0100, Mark Rutland wrote:

> > +static inline bool on_overflow_stack(unsigned long sp)
> > +{
> > +	unsigned long low = (unsigned long)this_cpu_ptr(overflow_stack);
> 
> Can you use raw_cpu_ptr here, like you do for the irq stack?

Sure; done.
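
That is, sketching the one-line change:

	unsigned long low = (unsigned long)raw_cpu_ptr(overflow_stack);

raw_cpu_ptr() skips the preemption debug check, matching what
on_irq_stack() already does.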

> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index e5aa866..44a27c3 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -72,6 +72,37 @@
> >  	.macro kernel_ventry	label
> >  	.align 7
> >  	sub	sp, sp, #S_FRAME_SIZE
> > +#ifdef CONFIG_VMAP_STACK
> > +	add	sp, sp, x0			// sp' = sp + x0
> > +	sub	x0, sp, x0			// x0' = sp' - x0 = (sp + x0) - x0 = sp
> > +	tbnz	x0, #THREAD_SHIFT, 0f
> > +	sub	x0, sp, x0			// sp' - x0' = (sp + x0) - sp = x0
> > +	sub	sp, sp, x0			// sp' - x0 = (sp + x0) - x0 = sp
> > +	b	\label
> > +
> > +	/* Stash the original SP value in tpidr_el0 */
> > +0:	msr	tpidr_el0, x0
> 
> The comment here is a bit confusing, since the sp has already been
> decremented for the frame, as mentioned in a later comment.

True. I've updated the comment to say:

	/*
	 * Stash the SP (minus S_FRAME_SIZE) in tpidr_el0. We can recover the
	 * original SP value later if we need it.
	 */  

[...]

> > +	 * Store the original GPRs to the new stack. The orginial SP (minus
> 
> original

Took me a moment to spot the second instance. Fixed now.

[...]

> > +	/* Time to die */
> > +	bl	handle_bad_stack
> > +	ASM_BUG()
> 
> Why not just a b without the ASM_BUG?

We need the BL to ensure that the LR is valid for unwinding. That's
necessary for the backtrace to identify the exception regs based on the
LR falling into .entry.text.

The ASM_BUG() ensures that the LR value definitely falls in .entry.text,
and makes the backtrace resolve the symbol correctly regardless of
what's next.

I didn't add a comment for the other cases, so I hadn't bothered here.
I'm happy to add those, so long as we're consistent.
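
As an annotated rendering of the sequence (the comments are a sketch, not
patch text):

	bl	handle_bad_stack	// LR := the address of the next
					// instruction, so the unwinder sees
					// a PC in .entry.text and can find
					// the exception regs
	ASM_BUG()			// guarantees that address is a real
					// instruction in .entry.text,
					// regardless of what follows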

Thanks,
Mark.

* Re: [PATCH 10/14] arm64: assembler: allow adr_this_cpu to use the stack pointer
  2017-08-14 17:13   ` Catalin Marinas
@ 2017-08-14 17:42     ` Mark Rutland
  0 siblings, 0 replies; 23+ messages in thread
From: Mark Rutland @ 2017-08-14 17:42 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, keescook, ard.biesheuvel, matt,
	kernel-hardening, will.deacon, linux-kernel, luto, james.morse,
	labbott

On Mon, Aug 14, 2017 at 06:13:39PM +0100, Catalin Marinas wrote:
> On Mon, Aug 07, 2017 at 07:36:01PM +0100, Mark Rutland wrote:

> >  	.macro adr_this_cpu, dst, sym, tmp
> > -	adr_l	\dst, \sym
> > +	adrp	\tmp, \sym
> > +	add	\dst, \tmp, #:lo12:\sym
> >  	mrs	\tmp, tpidr_el1
> >  	add	\dst, \dst, \tmp
> >  	.endm
> 
> Nitpick: it may be worth adding an #ifndef MODULE around these macros,
> together with a comment, just in case. There are other macros in this
> file like adr_l which are used in modules (crypto).

I've folded in the below, in keeping with the other MODULE fallbacks.

Thanks,
Mark.

---->8----
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 4775af5..50c5592 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -230,13 +230,18 @@
        .endm
 
        /*
-        * @dst: Result of per_cpu(sym, smp_processor_id())
+        * @dst: Result of per_cpu(sym, smp_processor_id()), can be SP for
+        *       non-module code
         * @sym: The name of the per-cpu variable
         * @tmp: scratch register
         */
        .macro adr_this_cpu, dst, sym, tmp
+#ifndef MODULE
        adrp    \tmp, \sym
        add     \dst, \tmp, #:lo12:\sym
+#else
+       adr_l   \dst, \sym
+#endif
        mrs     \tmp, tpidr_el1
        add     \dst, \dst, \tmp
        .endm

* Re: [PATCH 14/14] arm64: add VMAP_STACK overflow detection
  2017-08-07 18:36 ` [PATCH 14/14] arm64: add VMAP_STACK overflow detection Mark Rutland
  2017-08-14 15:32   ` Will Deacon
@ 2017-08-15 11:10   ` Catalin Marinas
  2017-08-15 11:19     ` Mark Rutland
  1 sibling, 1 reply; 23+ messages in thread
From: Catalin Marinas @ 2017-08-15 11:10 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, keescook, ard.biesheuvel, matt,
	kernel-hardening, will.deacon, linux-kernel, luto, james.morse,
	labbott

On Mon, Aug 07, 2017 at 07:36:05PM +0100, Mark Rutland wrote:
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index e5aa866..44a27c3 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -72,6 +72,37 @@
>  	.macro kernel_ventry	label
>  	.align 7
>  	sub	sp, sp, #S_FRAME_SIZE
> +#ifdef CONFIG_VMAP_STACK
> +	add	sp, sp, x0			// sp' = sp + x0
> +	sub	x0, sp, x0			// x0' = sp' - x0 = (sp + x0) - x0 = sp
> +	tbnz	x0, #THREAD_SHIFT, 0f
> +	sub	x0, sp, x0			// sp' - x0' = (sp + x0) - sp = x0
> +	sub	sp, sp, x0			// sp' - x0 = (sp + x0) - x0 = sp
> +	b	\label

Maybe a small comment before this hunk just to tell the user that it's
trying to test a bit in SP without corrupting a GPR. It's obvious once
you read it but not when you see it for the first time ;).

> +
> +	/* Stash the original SP value in tpidr_el0 */
> +0:	msr	tpidr_el0, x0

And a comment here that on this path we no longer care about the user
tpidr_el0 as we are never returning there.

Otherwise I'm fine with the series (I'm not a fan of the complexity it
adds but I don't have a better suggestion).

-- 
Catalin

* Re: [PATCH 14/14] arm64: add VMAP_STACK overflow detection
  2017-08-15 11:10   ` Catalin Marinas
@ 2017-08-15 11:19     ` Mark Rutland
  0 siblings, 0 replies; 23+ messages in thread
From: Mark Rutland @ 2017-08-15 11:19 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, keescook, ard.biesheuvel, matt,
	kernel-hardening, will.deacon, linux-kernel, luto, james.morse,
	labbott

On Tue, Aug 15, 2017 at 12:10:32PM +0100, Catalin Marinas wrote:
> On Mon, Aug 07, 2017 at 07:36:05PM +0100, Mark Rutland wrote:
> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index e5aa866..44a27c3 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -72,6 +72,37 @@
> >  	.macro kernel_ventry	label
> >  	.align 7
> >  	sub	sp, sp, #S_FRAME_SIZE
> > +#ifdef CONFIG_VMAP_STACK
> > +	add	sp, sp, x0			// sp' = sp + x0
> > +	sub	x0, sp, x0			// x0' = sp' - x0 = (sp + x0) - x0 = sp
> > +	tbnz	x0, #THREAD_SHIFT, 0f
> > +	sub	x0, sp, x0			// sp' - x0' = (sp + x0) - sp = x0
> > +	sub	sp, sp, x0			// sp' - x0 = (sp + x0) - x0 = sp
> > +	b	\label
> 
> Maybe a small comment before this hunk just to tell the user that it's
> trying to test a bit in SP without corrupting a GPR. It's obvious once
> you read it but not when you see it for the first time ;).
> 
> > +
> > +	/* Stash the original SP value in tpidr_el0 */
> > +0:	msr	tpidr_el0, x0
> 
> And a comment here that on this path we no longer care about the user
> tpidr_el0 as we are never returning there.

Ok.

I've updated comments in both cases.
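
Something along these lines (the wording here is a sketch, not the final
v2 text):

	/*
	 * Test whether the SP has overflowed, without corrupting a GPR.
	 */
	add	sp, sp, x0			// sp' = sp + x0

... and:

	/*
	 * We're heading for the overflow handler, so we will never return
	 * to userspace from here; the user's tpidr_el0 is fair game as a
	 * scratch register.
	 */
0:	msr	tpidr_el0, x0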

> Otherwise I'm fine with the series (I'm not a fan of the complexity it
> adds but I don't have a better suggestion).

Thanks!

I'll send out a v2 shortly with the changes you requested.

Thanks,
Mark.

Thread overview: 23+ messages
2017-08-07 18:35 [PATCH 00/14] arm64: VMAP_STACK support Mark Rutland
2017-08-07 18:35 ` [PATCH 01/14] arm64: remove __die()'s stack dump Mark Rutland
2017-08-07 18:35 ` [PATCH 02/14] fork: allow arch-override of VMAP stack alignment Mark Rutland
2017-08-07 18:35 ` [PATCH 03/14] arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP Mark Rutland
2017-08-07 18:35 ` [PATCH 04/14] arm64: factor out PAGE_* and CONT_* definitions Mark Rutland
2017-08-07 18:35 ` [PATCH 05/14] arm64: clean up THREAD_* definitions Mark Rutland
2017-08-14 11:59   ` Catalin Marinas
2017-08-14 13:10     ` Mark Rutland
2017-08-07 18:35 ` [PATCH 06/14] arm64: clean up irq stack definitions Mark Rutland
2017-08-07 18:35 ` [PATCH 07/14] arm64: move SEGMENT_ALIGN to <asm/memory.h> Mark Rutland
2017-08-07 18:35 ` [PATCH 08/14] efi/arm64: add EFI_KIMG_ALIGN Mark Rutland
2017-08-07 18:36 ` [PATCH 09/14] arm64: factor out entry stack manipulation Mark Rutland
2017-08-07 18:36 ` [PATCH 10/14] arm64: assembler: allow adr_this_cpu to use the stack pointer Mark Rutland
2017-08-14 17:13   ` Catalin Marinas
2017-08-14 17:42     ` Mark Rutland
2017-08-07 18:36 ` [PATCH 11/14] arm64: use an irq " Mark Rutland
2017-08-07 18:36 ` [PATCH 12/14] arm64: add basic VMAP_STACK support Mark Rutland
2017-08-07 18:36 ` [PATCH 13/14] arm64: add on_accessible_stack() Mark Rutland
2017-08-07 18:36 ` [PATCH 14/14] arm64: add VMAP_STACK overflow detection Mark Rutland
2017-08-14 15:32   ` Will Deacon
2017-08-14 17:25     ` Mark Rutland
2017-08-15 11:10   ` Catalin Marinas
2017-08-15 11:19     ` Mark Rutland
