* [PATCH 00/10] arm64: move thread_info off of the task stack
@ 2016-10-19 19:10 Mark Rutland
  2016-10-19 19:10 ` [PATCH 01/10] arm64: thread_info remove stale items Mark Rutland
                   ` (10 more replies)
  0 siblings, 11 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

Building atop of Andy's work on x86 and generic code, these patches move
arm64's thread_info off of the stack and into task_struct. This protects
thread_info from corruption in the face of stack overflow, and serves as
a step towards fully robust stack overflow handling, which will be
addressed by subsequent patches.
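To illustrate the idea behind the series (a hypothetical, portable userspace mock; THREAD_SIZE and the struct layouts here are illustrative, not arm64's actual definitions): with thread_info on the stack it is found by masking the stack pointer, so an overflow runs straight into it, while with THREAD_INFO_IN_TASK it is simply embedded at the start of task_struct, away from the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define THREAD_SIZE 0x1000UL	/* illustrative value */

/* Old scheme: thread_info lives at the lowest address of the task's
 * stack, and is found by masking the stack pointer. A stack overflow
 * tramples it. */
static uintptr_t on_stack_thread_info(uintptr_t sp)
{
	return sp & ~(THREAD_SIZE - 1);
}

struct thread_info { unsigned long flags; };

/* New scheme (THREAD_INFO_IN_TASK): thread_info is embedded at the
 * start of task_struct, off the stack entirely. */
struct task_struct {
	struct thread_info thread_info;	/* must remain first */
	void *stack;			/* base of the task's stack */
};
```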

These patches are based atop of a preparatory series [1] (itself based
on v4.9-rc1) that's also necessary for s390. I've placed those patches
in a branch [2] on my kernel.org repo, along with this series [3]. I'm
hoping that the prep work will be able to become a stable branch/tag
soon.

I've given the series some light testing on a couple of SMP arm64
platforms, but this has yet to see a thorough beating; please do try to
make this fall over!

Since RFC [4]:
* Rely on prior patches to make thread_info arch-specific
* Make smp_processor_id() use a per-cpu variable
* Split out current_stack_pointer
* Make SMP actually work

[1] http://lkml.kernel.org/r/1476901693-8492-1-git-send-email-mark.rutland@arm.com
[2] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=core/ti-stack-split
[3] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=arm64/ti-stack-split
[4] http://lkml.kernel.org/r/1473947349-14521-1-git-send-email-mark.rutland@arm.com

Thanks,
Mark.

Mark Rutland (10):
  arm64: thread_info remove stale items
  arm64: asm-offsets: remove unused definitions
  arm64: factor out current_stack_pointer
  arm64: traps: simplify die() and __die()
  arm64: prep stack walkers for THREAD_INFO_IN_TASK
  arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx
  arm64: smp: prepare for smp_processor_id() rework
  arm64: make cpu number a percpu variable
  arm64: assembler: introduce ldr_this_cpu
  arm64: split thread_info from task stack

 arch/arm64/Kconfig                     |  1 +
 arch/arm64/include/asm/Kbuild          |  1 -
 arch/arm64/include/asm/assembler.h     | 19 +++++++++++++++----
 arch/arm64/include/asm/current.h       | 22 ++++++++++++++++++++++
 arch/arm64/include/asm/percpu.h        |  2 ++
 arch/arm64/include/asm/perf_event.h    |  2 ++
 arch/arm64/include/asm/smp.h           |  7 ++++++-
 arch/arm64/include/asm/stack_pointer.h |  9 +++++++++
 arch/arm64/include/asm/suspend.h       |  2 +-
 arch/arm64/include/asm/thread_info.h   | 32 +-------------------------------
 arch/arm64/kernel/asm-offsets.c        |  3 +--
 arch/arm64/kernel/entry.S              |  6 +++---
 arch/arm64/kernel/head.S               | 11 ++++++-----
 arch/arm64/kernel/process.c            | 33 +++++++++++++++++++++++++++------
 arch/arm64/kernel/return_address.c     |  1 +
 arch/arm64/kernel/sleep.S              |  3 ---
 arch/arm64/kernel/smp.c                | 14 +++++++++++---
 arch/arm64/kernel/stacktrace.c         |  6 ++++++
 arch/arm64/kernel/suspend.c            |  6 ------
 arch/arm64/kernel/traps.c              | 14 +++++++-------
 arch/arm64/mm/proc.S                   |  6 ++++++
 21 files changed, 127 insertions(+), 73 deletions(-)
 create mode 100644 arch/arm64/include/asm/current.h
 create mode 100644 arch/arm64/include/asm/stack_pointer.h

-- 
1.9.1

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 01/10] arm64: thread_info remove stale items
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 02/10] arm64: asm-offsets: remove unused definitions Mark Rutland
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

We have a comment claiming __switch_to() cares about where cpu_context
is located relative to cpu_domain in thread_info. However arm64 has
never had a thread_info::cpu_domain field, and neither __switch_to nor
cpu_switch_to care where the cpu_context field is relative to others.

Additionally, the init_thread_info alias is never used anywhere in the
kernel, and will shortly become problematic when thread_info is moved
into task_struct.

This patch removes both.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/thread_info.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e9ea5a6..76a9559 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -42,7 +42,6 @@
 
 /*
  * low level task data that entry.S needs immediate access to.
- * __switch_to() assumes cpu_context follows immediately after cpu_domain.
  */
 struct thread_info {
 	unsigned long		flags;		/* low level flags */
@@ -60,7 +59,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,					\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
 #define init_stack		(init_thread_union.stack)
 
 /*
-- 
1.9.1

* [PATCH 02/10] arm64: asm-offsets: remove unused definitions
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
  2016-10-19 19:10 ` [PATCH 01/10] arm64: thread_info remove stale items Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 03/10] arm64: factor out current_stack_pointer Mark Rutland
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

Subsequent patches will move the thread_info::{task,cpu} fields into
task_struct, and the TI_{TASK,CPU} offset definitions are not currently
used anywhere.

This patch removes the redundant definitions.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/asm-offsets.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 4a2f0f0..d30b232 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -39,8 +39,6 @@ int main(void)
   DEFINE(TI_FLAGS,		offsetof(struct thread_info, flags));
   DEFINE(TI_PREEMPT,		offsetof(struct thread_info, preempt_count));
   DEFINE(TI_ADDR_LIMIT,		offsetof(struct thread_info, addr_limit));
-  DEFINE(TI_TASK,		offsetof(struct thread_info, task));
-  DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
   BLANK();
   DEFINE(THREAD_CPU_CONTEXT,	offsetof(struct task_struct, thread.cpu_context));
   BLANK();
-- 
1.9.1

* [PATCH 03/10] arm64: factor out current_stack_pointer
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
  2016-10-19 19:10 ` [PATCH 01/10] arm64: thread_info remove stale items Mark Rutland
  2016-10-19 19:10 ` [PATCH 02/10] arm64: asm-offsets: remove unused definitions Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 04/10] arm64: traps: simplify die() and __die() Mark Rutland
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

We define current_stack_pointer in <asm/thread_info.h>, though other
files and headers relying upon it do not have this necessary include,
and are thus fragile to changes in the header soup.

Subsequent patches will affect the header soup such that directly
including <asm/thread_info.h> may result in a circular include in some
of these cases, so we can't simply add the missing include.

Instead, factor current_stack_pointer out into its own header, and have
all existing users include it explicitly.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/percpu.h        | 2 ++
 arch/arm64/include/asm/perf_event.h    | 2 ++
 arch/arm64/include/asm/stack_pointer.h | 9 +++++++++
 arch/arm64/include/asm/thread_info.h   | 6 +-----
 arch/arm64/kernel/return_address.c     | 1 +
 arch/arm64/kernel/stacktrace.c         | 1 +
 arch/arm64/kernel/traps.c              | 1 +
 7 files changed, 17 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/include/asm/stack_pointer.h

diff --git a/arch/arm64/include/asm/percpu.h b/arch/arm64/include/asm/percpu.h
index 2fee2f5..2127a41 100644
--- a/arch/arm64/include/asm/percpu.h
+++ b/arch/arm64/include/asm/percpu.h
@@ -16,6 +16,8 @@
 #ifndef __ASM_PERCPU_H
 #define __ASM_PERCPU_H
 
+#include <asm/stack_pointer.h>
+
 static inline void set_my_cpu_offset(unsigned long off)
 {
 	asm volatile("msr tpidr_el1, %0" :: "r" (off) : "memory");
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2065f46..9eee2be 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -17,6 +17,8 @@
 #ifndef __ASM_PERF_EVENT_H
 #define __ASM_PERF_EVENT_H
 
+#include <asm/stack_pointer.h>
+
 #define	ARMV8_PMU_MAX_COUNTERS	32
 #define	ARMV8_PMU_COUNTER_MASK	(ARMV8_PMU_MAX_COUNTERS - 1)
 
diff --git a/arch/arm64/include/asm/stack_pointer.h b/arch/arm64/include/asm/stack_pointer.h
new file mode 100644
index 0000000..ffcdf74
--- /dev/null
+++ b/arch/arm64/include/asm/stack_pointer.h
@@ -0,0 +1,9 @@
+#ifndef __ASM_STACK_POINTER_H
+#define __ASM_STACK_POINTER_H
+
+/*
+ * how to get the current stack pointer from C
+ */
+register unsigned long current_stack_pointer asm ("sp");
+
+#endif /* __ASM_STACK_POINTER_H */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 76a9559..3a4f85d 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -36,6 +36,7 @@
 
 struct task_struct;
 
+#include <asm/stack_pointer.h>
 #include <asm/types.h>
 
 typedef unsigned long mm_segment_t;
@@ -62,11 +63,6 @@ struct thread_info {
 #define init_stack		(init_thread_union.stack)
 
 /*
- * how to get the current stack pointer from C
- */
-register unsigned long current_stack_pointer asm ("sp");
-
-/*
  * how to get the thread information struct from C
  */
 static inline struct thread_info *current_thread_info(void) __attribute_const__;
diff --git a/arch/arm64/kernel/return_address.c b/arch/arm64/kernel/return_address.c
index 1718706..12a87f2 100644
--- a/arch/arm64/kernel/return_address.c
+++ b/arch/arm64/kernel/return_address.c
@@ -12,6 +12,7 @@
 #include <linux/export.h>
 #include <linux/ftrace.h>
 
+#include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 
 struct return_address_data {
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index c2efddf..5b80068 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -22,6 +22,7 @@
 #include <linux/stacktrace.h>
 
 #include <asm/irq.h>
+#include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 
 /*
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 5ff020f..32d4bdb 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -38,6 +38,7 @@
 #include <asm/esr.h>
 #include <asm/insn.h>
 #include <asm/traps.h>
+#include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 #include <asm/exception.h>
 #include <asm/system_misc.h>
-- 
1.9.1

* [PATCH 04/10] arm64: traps: simplify die() and __die()
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
                   ` (2 preceding siblings ...)
  2016-10-19 19:10 ` [PATCH 03/10] arm64: factor out current_stack_pointer Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 05/10] arm64: prep stack walkers for THREAD_INFO_IN_TASK Mark Rutland
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

In arm64's die and __die routines we pass around a thread_info, and
subsequently use this to determine the relevant task_struct, and the end
of the thread's stack. Subsequent patches will decouple thread_info from
the stack, and this approach will no longer work.

To figure out the end of the stack, we can use the new generic
end_of_stack() helper. As we only call __die() from die(), and die()
always deals with the current task, we can remove the parameter and have
both acquire current directly, which also makes it clear that __die
can't be called for arbitrary tasks.
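For reference, the generic end_of_stack() helper used here boils down to something like the following simplified sketch (the real definition lives in core code and depends on CONFIG_THREAD_INFO_IN_TASK and stack-growth direction; the names below are illustrative):

```c
#include <assert.h>

#define THREAD_SIZE 0x1000UL	/* illustrative value */

struct task_struct { void *stack; };

/* Sketch: with THREAD_INFO_IN_TASK on a downward-growing stack, the
 * overflow limit ("end") of the stack is simply its lowest address,
 * which task_struct::stack points at. */
static unsigned long *end_of_stack_sketch(struct task_struct *tsk)
{
	return (unsigned long *)tsk->stack;
}

static int demo(void)
{
	static char stack[THREAD_SIZE];
	struct task_struct t = { .stack = stack };

	return end_of_stack_sketch(&t) == (unsigned long *)stack;
}
```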

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/traps.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 32d4bdb..1cfbe10 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -228,10 +228,9 @@ void show_stack(struct task_struct *tsk, unsigned long *sp)
 #endif
 #define S_SMP " SMP"
 
-static int __die(const char *str, int err, struct thread_info *thread,
-		 struct pt_regs *regs)
+static int __die(const char *str, int err, struct pt_regs *regs)
 {
-	struct task_struct *tsk = thread->task;
+	struct task_struct *tsk = current;
 	static int die_counter;
 	int ret;
 
@@ -246,7 +245,8 @@ static int __die(const char *str, int err, struct thread_info *thread,
 	print_modules();
 	__show_regs(regs);
 	pr_emerg("Process %.*s (pid: %d, stack limit = 0x%p)\n",
-		 TASK_COMM_LEN, tsk->comm, task_pid_nr(tsk), thread + 1);
+		 TASK_COMM_LEN, tsk->comm, task_pid_nr(tsk),
+		 end_of_stack(tsk));
 
 	if (!user_mode(regs)) {
 		dump_mem(KERN_EMERG, "Stack: ", regs->sp,
@@ -265,7 +265,6 @@ static int __die(const char *str, int err, struct thread_info *thread,
  */
 void die(const char *str, struct pt_regs *regs, int err)
 {
-	struct thread_info *thread = current_thread_info();
 	int ret;
 
 	oops_enter();
@@ -273,9 +272,9 @@ void die(const char *str, struct pt_regs *regs, int err)
 	raw_spin_lock_irq(&die_lock);
 	console_verbose();
 	bust_spinlocks(1);
-	ret = __die(str, err, thread, regs);
+	ret = __die(str, err, regs);
 
-	if (regs && kexec_should_crash(thread->task))
+	if (regs && kexec_should_crash(current))
 		crash_kexec(regs);
 
 	bust_spinlocks(0);
-- 
1.9.1

* [PATCH 05/10] arm64: prep stack walkers for THREAD_INFO_IN_TASK
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
                   ` (3 preceding siblings ...)
  2016-10-19 19:10 ` [PATCH 04/10] arm64: traps: simplify die() and __die() Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 06/10] arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx Mark Rutland
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed
before a task is destroyed. To account for this, the stacks are
refcounted, and when manipulating the stack of another task, it is
necessary to get/put the stack to ensure it isn't freed and/or re-used
while we do so.

This patch reworks the arm64 stack walking code to account for this.
When CONFIG_THREAD_INFO_IN_TASK is not selected, the get/put operations
are no-ops, so this should be a purely structural change that does not
affect behaviour.
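The pattern being introduced is the usual get/walk/put shape. A hypothetical userspace mock of the refcounting contract (the names and refcount mechanics are illustrative stand-ins, not the kernel's implementation):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the stack lifetime rules: try_get fails once the stack
 * has been freed, and every successful get must be balanced by a put. */
struct mock_task {
	void *stack;
	int stack_refcount;
};

static void *try_get_stack(struct mock_task *t)
{
	if (!t->stack)
		return NULL;	/* stack already freed: walker must bail out */
	t->stack_refcount++;
	return t->stack;
}

static void put_stack(struct mock_task *t)
{
	if (--t->stack_refcount == 0)
		t->stack = NULL;	/* last reference: stack goes away */
}

/* The shape used by the reworked walkers: bail out early if the stack
 * is gone, and always balance the get with a put. */
static int walk_stack(struct mock_task *t)
{
	if (!try_get_stack(t))
		return 0;
	/* ... walk the frames here ... */
	put_stack(t);
	return 1;
}

static int demo(void)
{
	static char stack_mem[64];
	struct mock_task t = { .stack = stack_mem, .stack_refcount = 1 };

	if (walk_stack(&t) != 1)	/* stack alive: walk succeeds */
		return 0;
	put_stack(&t);			/* task exit drops the last reference */
	return walk_stack(&t) == 0;	/* later walkers must bail out */
}
```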

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/process.c    | 20 ++++++++++++++------
 arch/arm64/kernel/stacktrace.c |  5 +++++
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 27b2f13..2f39036 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -340,27 +340,35 @@ struct task_struct *__switch_to(struct task_struct *prev,
 unsigned long get_wchan(struct task_struct *p)
 {
 	struct stackframe frame;
-	unsigned long stack_page;
+	unsigned long stack_page, ret = 0;
 	int count = 0;
 	if (!p || p == current || p->state == TASK_RUNNING)
 		return 0;
 
+	stack_page = (unsigned long)try_get_task_stack(p);
+	if (!stack_page)
+		return 0;
+
 	frame.fp = thread_saved_fp(p);
 	frame.sp = thread_saved_sp(p);
 	frame.pc = thread_saved_pc(p);
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	frame.graph = p->curr_ret_stack;
 #endif
-	stack_page = (unsigned long)task_stack_page(p);
 	do {
 		if (frame.sp < stack_page ||
 		    frame.sp >= stack_page + THREAD_SIZE ||
 		    unwind_frame(p, &frame))
-			return 0;
-		if (!in_sched_functions(frame.pc))
-			return frame.pc;
+			goto out;
+		if (!in_sched_functions(frame.pc)) {
+			ret = frame.pc;
+			goto out;
+		}
 	} while (count ++ < 16);
-	return 0;
+
+out:
+	put_task_stack(p);
+	return ret;
 }
 
 unsigned long arch_align_stack(unsigned long sp)
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 5b80068..c77e5b6 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -182,6 +182,9 @@ void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
 	struct stack_trace_data data;
 	struct stackframe frame;
 
+	if (!try_get_task_stack(tsk))
+		return;
+
 	data.trace = trace;
 	data.skip = trace->skip;
 
@@ -203,6 +206,8 @@ void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
 	walk_stackframe(tsk, &frame, save_trace, &data);
 	if (trace->nr_entries < trace->max_entries)
 		trace->entries[trace->nr_entries++] = ULONG_MAX;
+
+	put_task_stack(tsk);
 }
 
 void save_stack_trace(struct stack_trace *trace)
-- 
1.9.1

* [PATCH 06/10] arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
                   ` (4 preceding siblings ...)
  2016-10-19 19:10 ` [PATCH 05/10] arm64: prep stack walkers for THREAD_INFO_IN_TASK Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 07/10] arm64: smp: prepare for smp_processor_id() rework Mark Rutland
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

When returning from idle, we rely on the fact that thread_info lives at
the end of the kernel stack, and restore this by masking the saved stack
pointer. Subsequent patches will sever the relationship between the
stack and thread_info, and to cater for this we must save/restore sp_el0
explicitly, storing it in cpu_suspend_ctx.

As cpu_suspend_ctx must be doubleword aligned, this leaves us with an
extra slot in cpu_suspend_ctx. We can use this to save/restore tpidr_el1
in the same way, which simplifies the code, avoiding pointer chasing on
the restore path (as we no longer need to load thread_info::cpu followed
by the relevant slot in __per_cpu_offset based on this).

This patch stashes both registers in cpu_suspend_ctx.
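The alignment arithmetic above can be checked directly: rounding the 11 slots that strictly need saving (10 original registers plus sp_el0) up to the doubleword (16-byte, i.e. two-slot) alignment of cpu_suspend_ctx yields 12 slots, so tpidr_el1 rides along for free. A worked check (ALIGN_UP and REGS_NEEDED are illustrative names, not kernel macros):

```c
#include <assert.h>

/* Round n up to a multiple of a (a must be a power of two). */
#define ALIGN_UP(n, a)	(((n) + (a) - 1) & ~((a) - 1))

/* Doubleword (16-byte) alignment of the 8-byte slots forces an even
 * slot count, leaving one slot spare for tpidr_el1. */
enum {
	REGS_NEEDED	= 10 + 1,		/* original regs + sp_el0 */
	NR_CTX_REGS	= ALIGN_UP(REGS_NEEDED, 2),
};
```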

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/suspend.h | 2 +-
 arch/arm64/kernel/sleep.S        | 3 ---
 arch/arm64/kernel/suspend.c      | 6 ------
 arch/arm64/mm/proc.S             | 6 ++++++
 4 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index b8a313f..de5600f 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,7 +1,7 @@
 #ifndef __ASM_SUSPEND_H
 #define __ASM_SUSPEND_H
 
-#define NR_CTX_REGS 10
+#define NR_CTX_REGS 12
 #define NR_CALLEE_SAVED_REGS 12
 
 /*
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index b8799e7..5062fd9 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -125,9 +125,6 @@ ENTRY(_cpu_resume)
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
 	mov	sp, x2
-	/* save thread_info */
-	and	x2, x2, #~(THREAD_SIZE - 1)
-	msr	sp_el0, x2
 	/*
 	 * cpu_do_resume expects x0 to contain context address pointer
 	 */
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index ad73414..7db5620 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -44,12 +44,6 @@ void notrace __cpu_suspend_exit(void)
 	cpu_uninstall_idmap();
 
 	/*
-	 * Restore per-cpu offset before any kernel
-	 * subsystem relying on it has a chance to run.
-	 */
-	set_my_cpu_offset(per_cpu_offset(cpu));
-
-	/*
 	 * Restore HW breakpoint registers to sane values
 	 * before debug exceptions are possibly reenabled
 	 * through local_dbg_restore.
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 352c73b..6a853a8 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -70,11 +70,14 @@ ENTRY(cpu_do_suspend)
 	mrs	x8, mdscr_el1
 	mrs	x9, oslsr_el1
 	mrs	x10, sctlr_el1
+	mrs	x11, tpidr_el1
+	mrs	x12, sp_el0
 	stp	x2, x3, [x0]
 	stp	x4, xzr, [x0, #16]
 	stp	x5, x6, [x0, #32]
 	stp	x7, x8, [x0, #48]
 	stp	x9, x10, [x0, #64]
+	stp	x11, x12, [x0, #80]
 	ret
 ENDPROC(cpu_do_suspend)
 
@@ -90,6 +93,7 @@ ENTRY(cpu_do_resume)
 	ldp	x6, x8, [x0, #32]
 	ldp	x9, x10, [x0, #48]
 	ldp	x11, x12, [x0, #64]
+	ldp	x13, x14, [x0, #80]
 	msr	tpidr_el0, x2
 	msr	tpidrro_el0, x3
 	msr	contextidr_el1, x4
@@ -112,6 +116,8 @@ ENTRY(cpu_do_resume)
 	msr	mdscr_el1, x10
 
 	msr	sctlr_el1, x12
+	msr	tpidr_el1, x13
+	msr	sp_el0, x14
 	/*
 	 * Restore oslsr_el1 by writing oslar_el1
 	 */
-- 
1.9.1

* [PATCH 07/10] arm64: smp: prepare for smp_processor_id() rework
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
                   ` (5 preceding siblings ...)
  2016-10-19 19:10 ` [PATCH 06/10] arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 08/10] arm64: make cpu number a percpu variable Mark Rutland
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

Subsequent patches will make smp_processor_id() use a percpu variable.
This will make smp_processor_id() dependent on the percpu offset, and
thus we cannot use smp_processor_id() to figure out what to initialise
the offset to.

Prepare for this by initialising the percpu offset based on
current::cpu, which will work regardless of how smp_processor_id() is
implemented. Also, make this relationship obvious by placing this code
together at the start of secondary_start_kernel().

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/smp.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d3f151c..7af46bf 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -208,7 +208,10 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 asmlinkage void secondary_start_kernel(void)
 {
 	struct mm_struct *mm = &init_mm;
-	unsigned int cpu = smp_processor_id();
+	unsigned int cpu;
+
+	cpu = task_cpu(current);
+	set_my_cpu_offset(per_cpu_offset(cpu));
 
 	/*
 	 * All kernel threads share the same mm context; grab a
@@ -217,8 +220,6 @@ asmlinkage void secondary_start_kernel(void)
 	atomic_inc(&mm->mm_count);
 	current->active_mm = mm;
 
-	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
-
 	/*
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
 	 * point to zero page to avoid speculatively fetching new entries.
-- 
1.9.1

* [PATCH 08/10] arm64: make cpu number a percpu variable
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
                   ` (6 preceding siblings ...)
  2016-10-19 19:10 ` [PATCH 07/10] arm64: smp: prepare for smp_processor_id() rework Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 09/10] arm64: assembler: introduce ldr_this_cpu Mark Rutland
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

In the absence of CONFIG_THREAD_INFO_IN_TASK, core code maintains
thread_info::cpu, and low-level architecture code can access this to
build raw_smp_processor_id(). With CONFIG_THREAD_INFO_IN_TASK, core code
maintains task_struct::cpu, which for reasons of the header soup is not
accessible to low-level arch code.

Instead, we can maintain a percpu variable containing the cpu number.
Non-preemptible code will receive the current CPU's number, while
preemptible code may race (as was previously the case), and see a stale
value. Thus, there shouldn't be a change in observable behaviour.

For both the old and new implementation of raw_smp_processor_id(), we
read a sysreg into a GPR, add an offset, and load the result. As the
offset is now large, it probably won't be folded into the load, but
otherwise the assembly shouldn't change much.
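A hypothetical userspace mock of the new scheme, with the percpu machinery and the tpidr_el1 sysreg replaced by plain variables (none of these names are the kernel's; this only sketches the "base + offset, then load" shape):

```c
#include <assert.h>

#define NR_CPUS 4

/* Stand-ins: on arm64 the per-cpu base lives in tpidr_el1; here a
 * plain pointer plays that role. */
static int cpu_number[NR_CPUS];		/* the new percpu variable */
static int *percpu_base;		/* mock of tpidr_el1 */

static void set_my_cpu_offset_mock(int cpu)
{
	percpu_base = &cpu_number[cpu];	/* point at this CPU's slot */
}

/* "read a sysreg into a GPR, add an offset, and load the result" */
static int raw_smp_processor_id_mock(void)
{
	return *percpu_base;
}

static void init_mock(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_number[cpu] = cpu;	/* smp_prepare_cpus() equivalent */
}
```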

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/smp.h | 6 +++++-
 arch/arm64/kernel/smp.c      | 5 +++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 0226447..f77ac0d 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -29,11 +29,15 @@
 
 #ifndef __ASSEMBLY__
 
+#include <asm/percpu.h>
+
 #include <linux/threads.h>
 #include <linux/cpumask.h>
 #include <linux/thread_info.h>
 
-#define raw_smp_processor_id() (current_thread_info()->cpu)
+DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);
+
+#define raw_smp_processor_id() (*this_cpu_ptr(&cpu_number))
 
 struct seq_file;
 
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 7af46bf..2679722 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -58,6 +58,9 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
 
+DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number);
+EXPORT_PER_CPU_SYMBOL(cpu_number);
+
 /*
  * as from 2.5, kernels no longer have an init_tasks structure
  * so we need some other way of telling a new secondary core
@@ -718,6 +721,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	 */
 	for_each_possible_cpu(cpu) {
 
+		per_cpu(cpu_number, cpu) = cpu;
+
 		if (cpu == smp_processor_id())
 			continue;
 
-- 
1.9.1

* [PATCH 09/10] arm64: assembler: introduce ldr_this_cpu
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
                   ` (7 preceding siblings ...)
  2016-10-19 19:10 ` [PATCH 08/10] arm64: make cpu number a percpu variable Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-19 19:10 ` [PATCH 10/10] arm64: split thread_info from task stack Mark Rutland
  2016-10-24 17:38 ` [PATCH 00/10] arm64: move thread_info off of the " Laura Abbott
  10 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

Shortly we will want to load a percpu variable in the return from
userspace path. We can save an instruction by folding the addition of
the percpu offset into the load instruction, and this patch adds a new
helper to do so.

At the same time, we clean up this_cpu_ptr for consistency. First, as
with {adr,ldr,str}_l, we change the template to take the destination
register first, and name this dst. Second, we rename the macro to
adr_this_cpu, following the scheme of adr_l and matching the newly
added ldr_this_cpu.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/assembler.h | 19 +++++++++++++++----
 arch/arm64/kernel/entry.S          |  2 +-
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 28bfe61..128a9ca 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -202,14 +202,25 @@
 	.endm
 
 	/*
+	 * @dst: Result of per_cpu(sym, smp_processor_id())
 	 * @sym: The name of the per-cpu variable
-	 * @reg: Result of per_cpu(sym, smp_processor_id())
 	 * @tmp: scratch register
 	 */
-	.macro this_cpu_ptr, sym, reg, tmp
-	adr_l	\reg, \sym
+	.macro adr_this_cpu, dst, sym, tmp
+	adr_l	\dst, \sym
 	mrs	\tmp, tpidr_el1
-	add	\reg, \reg, \tmp
+	add	\dst, \dst, \tmp
+	.endm
+
+	/*
+	 * @dst: Result of READ_ONCE(per_cpu(sym, smp_processor_id()))
+	 * @sym: The name of the per-cpu variable
+	 * @tmp: scratch register
+	 */
+	.macro ldr_this_cpu dst, sym, tmp
+	adr_l	\dst, \sym
+	mrs	\tmp, tpidr_el1
+	ldr	\dst, [\dst, \tmp]
 	.endm
 
 /*
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 223d54a..2d4c83b 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -200,7 +200,7 @@ alternative_else_nop_endif
 	cmp	x25, tsk
 	b.ne	9998f
 
-	this_cpu_ptr irq_stack, x25, x26
+	adr_this_cpu x25, irq_stack, x26
 	mov	x26, #IRQ_STACK_START_SP
 	add	x26, x25, x26
 
-- 
1.9.1

* [PATCH 10/10] arm64: split thread_info from task stack
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
                   ` (8 preceding siblings ...)
  2016-10-19 19:10 ` [PATCH 09/10] arm64: assembler: introduce ldr_this_cpu Mark Rutland
@ 2016-10-19 19:10 ` Mark Rutland
  2016-10-21 14:50   ` James Morse
  2016-10-24 17:38 ` [PATCH 00/10] arm64: move thread_info off of the " Laura Abbott
  10 siblings, 1 reply; 24+ messages in thread
From: Mark Rutland @ 2016-10-19 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

This patch moves arm64's struct thread_info from the task stack into
task_struct. This protects thread_info from corruption in the case of
stack overflows, and makes its address harder to determine if stack
addresses are leaked, which in turn makes a number of attacks more
difficult. Precise detection and handling of overflow is left for
subsequent patches.

Largely, this involves changing code to store the task_struct in sp_el0,
and acquire the thread_info from the task struct (which is the opposite
way around to the current code). Both secondary entry and idle are
updated to stash the sp and task pointer separately.

Userspace clobbers sp_el0, and we can no longer restore this from the
stack. Instead, the current task is cached in a per-cpu variable that we
can safely access from early assembly as interrupts are disabled (and we
are thus not preemptible).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/Kbuild        |  1 -
 arch/arm64/include/asm/current.h     | 22 ++++++++++++++++++++++
 arch/arm64/include/asm/smp.h         |  1 +
 arch/arm64/include/asm/thread_info.h | 24 ------------------------
 arch/arm64/kernel/asm-offsets.c      |  1 +
 arch/arm64/kernel/entry.S            |  4 ++--
 arch/arm64/kernel/head.S             | 11 ++++++-----
 arch/arm64/kernel/process.c          | 13 +++++++++++++
 arch/arm64/kernel/smp.c              |  2 ++
 10 files changed, 48 insertions(+), 32 deletions(-)
 create mode 100644 arch/arm64/include/asm/current.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 30398db..1874b61 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -109,6 +109,7 @@ config ARM64
 	select POWER_SUPPLY
 	select SPARSE_IRQ
 	select SYSCTL_EXCEPTION_TRACE
+	select THREAD_INFO_IN_TASK
 	help
 	  ARM 64-bit (AArch64) Linux support.
 
diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
index 44e1d7f1..28196b1 100644
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@ -1,7 +1,6 @@
 generic-y += bugs.h
 generic-y += clkdev.h
 generic-y += cputime.h
-generic-y += current.h
 generic-y += delay.h
 generic-y += div64.h
 generic-y += dma.h
diff --git a/arch/arm64/include/asm/current.h b/arch/arm64/include/asm/current.h
new file mode 100644
index 0000000..f2bcbe2
--- /dev/null
+++ b/arch/arm64/include/asm/current.h
@@ -0,0 +1,22 @@
+#ifndef __ASM_CURRENT_H
+#define __ASM_CURRENT_H
+
+#include <linux/compiler.h>
+
+#include <asm/sysreg.h>
+
+#ifndef __ASSEMBLY__
+
+struct task_struct;
+
+static __always_inline struct task_struct *get_current(void)
+{
+	return (struct task_struct *)read_sysreg(sp_el0);
+}
+
+#define current get_current()
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __ASM_CURRENT_H */
+
diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index f77ac0d..4400719 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -77,6 +77,7 @@
  */
 struct secondary_data {
 	void *stack;
+	struct task_struct *task;
 	long status;
 };
 
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 3a4f85d..7a4e0e4 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -47,41 +47,17 @@
 struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	mm_segment_t		addr_limit;	/* address limit */
-	struct task_struct	*task;		/* main task structure */
 	int			preempt_count;	/* 0 => preemptable, <0 => bug */
-	int			cpu;		/* cpu */
 };
 
 #define INIT_THREAD_INFO(tsk)						\
 {									\
-	.task		= &tsk,						\
-	.flags		= 0,						\
 	.preempt_count	= INIT_PREEMPT_COUNT,				\
 	.addr_limit	= KERNEL_DS,					\
 }
 
 #define init_stack		(init_thread_union.stack)
 
-/*
- * how to get the thread information struct from C
- */
-static inline struct thread_info *current_thread_info(void) __attribute_const__;
-
-/*
- * struct thread_info can be accessed directly via sp_el0.
- *
- * We don't use read_sysreg() as we want the compiler to cache the value where
- * possible.
- */
-static inline struct thread_info *current_thread_info(void)
-{
-	unsigned long sp_el0;
-
-	asm ("mrs %0, sp_el0" : "=r" (sp_el0));
-
-	return (struct thread_info *)sp_el0;
-}
-
 #define thread_saved_pc(tsk)	\
 	((unsigned long)(tsk->thread.cpu_context.pc))
 #define thread_saved_sp(tsk)	\
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index d30b232..9c2c770 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -121,6 +121,7 @@ int main(void)
   DEFINE(TZ_DSTTIME,		offsetof(struct timezone, tz_dsttime));
   BLANK();
   DEFINE(CPU_BOOT_STACK,	offsetof(struct secondary_data, stack));
+  DEFINE(CPU_BOOT_TASK,		offsetof(struct secondary_data, task));
   BLANK();
 #ifdef CONFIG_KVM_ARM_HOST
   DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 2d4c83b..e781391 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -123,6 +123,7 @@
 	 * Set sp_el0 to current thread_info.
 	 */
 	.if	\el == 0
+	ldr_this_cpu	tsk, __entry_task, x21
 	msr	sp_el0, tsk
 	.endif
 
@@ -674,8 +675,7 @@ ENTRY(cpu_switch_to)
 	ldp	x29, x9, [x8], #16
 	ldr	lr, [x8]
 	mov	sp, x9
-	and	x9, x9, #~(THREAD_SIZE - 1)
-	msr	sp_el0, x9
+	msr	sp_el0, x1
 	ret
 ENDPROC(cpu_switch_to)
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 427f6d3..12cf383 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -428,7 +428,8 @@ ENDPROC(__create_page_tables)
 __primary_switched:
 	adrp	x4, init_thread_union
 	add	sp, x4, #THREAD_SIZE
-	msr	sp_el0, x4			// Save thread_info
+	adr_l	x5, init_task
+	msr	sp_el0, x5			// Save thread_info
 
 	adr_l	x8, vectors			// load VBAR_EL1 with virtual
 	msr	vbar_el1, x8			// vector table address
@@ -698,10 +699,10 @@ __secondary_switched:
 	isb
 
 	adr_l	x0, secondary_data
-	ldr	x0, [x0, #CPU_BOOT_STACK]	// get secondary_data.stack
-	mov	sp, x0
-	and	x0, x0, #~(THREAD_SIZE - 1)
-	msr	sp_el0, x0			// save thread_info
+	ldr	x1, [x0, #CPU_BOOT_STACK]	// get secondary_data.stack
+	mov	sp, x1
+	ldr	x2, [x0, #CPU_BOOT_TASK]
+	msr	sp_el0, x2
 	mov	x29, #0
 	b	secondary_start_kernel
 ENDPROC(__secondary_switched)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 2f39036..ddce61b 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -45,6 +45,7 @@
 #include <linux/personality.h>
 #include <linux/notifier.h>
 #include <trace/events/power.h>
+#include <linux/percpu.h>
 
 #include <asm/alternative.h>
 #include <asm/compat.h>
@@ -312,6 +313,17 @@ static void uao_thread_switch(struct task_struct *next)
 }
 
 /*
+ * We store our current task in sp_el0, which is clobbered by userspace. Keep a
+ * shadow copy so that we can restore this upon entry from userspace.
+ */
+DEFINE_PER_CPU(struct task_struct *, __entry_task) = &init_task;
+
+static void entry_task_switch(struct task_struct *next)
+{
+	__this_cpu_write(__entry_task, next);
+}
+
+/*
  * Thread switching.
  */
 struct task_struct *__switch_to(struct task_struct *prev,
@@ -323,6 +335,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	tls_thread_switch(next);
 	hw_breakpoint_thread_switch(next);
 	contextidr_thread_switch(next);
+	entry_task_switch(next);
 	uao_thread_switch(next);
 
 	/*
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 2679722..cde25f4 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -149,6 +149,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	 * We need to tell the secondary core where to find its stack and the
 	 * page tables.
 	 */
+	secondary_data.task = idle;
 	secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
 	update_cpu_boot_status(CPU_MMU_OFF);
 	__flush_dcache_area(&secondary_data, sizeof(secondary_data));
@@ -173,6 +174,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
 	}
 
+	secondary_data.task = NULL;
 	secondary_data.stack = NULL;
 	status = READ_ONCE(secondary_data.status);
 	if (ret && status) {
-- 
1.9.1


* [PATCH 10/10] arm64: split thread_info from task stack
  2016-10-19 19:10 ` [PATCH 10/10] arm64: split thread_info from task stack Mark Rutland
@ 2016-10-21 14:50   ` James Morse
  2016-10-21 15:59     ` Mark Rutland
  2016-10-21 16:20     ` Mark Rutland
  0 siblings, 2 replies; 24+ messages in thread
From: James Morse @ 2016-10-21 14:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

This looks great, we should definitely do this.
There are a few things missing from entry.S below:

On 19/10/16 20:10, Mark Rutland wrote:
> This patch moves arm64's struct thread_info from the task stack into
> task_struct. This protects thread_info from corruption in the case of
> stack overflows, and makes its address harder to determine if stack
> addresses are leaked, making a number of attacks more difficult. Precise
> detection and handling of overflow is left for subsequent patches.
> 
> Largely, this involves changing code to store the task_struct in sp_el0,
> and acquire the thread_info from the task struct (which is the opposite
> way around to the current code). Both secondary entry and idle are
> updated to stash the sp and task pointer separately.
> 
> Userspace clobbers sp_el0, and we can no longer restore this from the
> stack. Instead, the current task is cached in a per-cpu variable that we
> can safely access from early assembly as interrupts are disabled (and we

>  arch/arm64/Kconfig                   |  1 +
>  arch/arm64/include/asm/Kbuild        |  1 -
>  arch/arm64/include/asm/current.h     | 22 ++++++++++++++++++++++
>  arch/arm64/include/asm/smp.h         |  1 +
>  arch/arm64/include/asm/thread_info.h | 24 ------------------------
>  arch/arm64/kernel/asm-offsets.c      |  1 +

>  arch/arm64/kernel/entry.S            |  4 ++--

4? That was too easy...


>  arch/arm64/kernel/head.S             | 11 ++++++-----
>  arch/arm64/kernel/process.c          | 13 +++++++++++++
>  arch/arm64/kernel/smp.c              |  2 ++
>  10 files changed, 48 insertions(+), 32 deletions(-)


> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 2d4c83b..e781391 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -123,6 +123,7 @@
>  	 * Set sp_el0 to current thread_info.
>  	 */
>  	.if	\el == 0
> +	ldr_this_cpu	tsk, __entry_task, x21
>  	msr	sp_el0, tsk
>  	.endif
>  
> @@ -674,8 +675,7 @@ ENTRY(cpu_switch_to)
>  	ldp	x29, x9, [x8], #16
>  	ldr	lr, [x8]
>  	mov	sp, x9
> -	and	x9, x9, #~(THREAD_SIZE - 1)
> -	msr	sp_el0, x9
> +	msr	sp_el0, x1
>  	ret
>  ENDPROC(cpu_switch_to)
>  

So now tsk is current instead of current_thread_info(), but we still access it
with TI_* offsets:
entry.S:102
> 	/* Save the task's original addr_limit and set USER_DS (TASK_SIZE_64) */
> 	ldr	x20, [tsk, #TI_ADDR_LIMIT]
> 	str	x20, [sp, #S_ORIG_ADDR_LIMIT]
> 	mov	x20, #TASK_SIZE_64
> 	str	x20, [tsk, #TI_ADDR_LIMIT]

entry.S:143
> 	/* Restore the task's original addr_limit. */
> 	ldr	x20, [sp, #S_ORIG_ADDR_LIMIT]
> 	str	x20, [tsk, #TI_ADDR_LIMIT]


The 're-entered irq stack' check is going to need re-thinking:
entry.S:195
> 	/*
> 	 * Compare sp with the current thread_info, if the top
> 	 * ~(THREAD_SIZE - 1) bits match, we are on a task stack, and
> 	 * should switch to the irq stack.
> 	 */
> 	and	x25, x19, #~(THREAD_SIZE - 1)
> 	cmp	x25, tsk
> 	b.ne	9998f

It was done like this as the irq stack isn't naturally aligned, so this check
implicitly assumes tsk is on the stack. I will try and come up with an alternative.


And there are a few other things like this:
entry.S:431
> 	ldr	w24, [tsk, #TI_PREEMPT]		// get preempt count
> 	cbnz	w24, 1f				// preempt count != 0
> 	ldr	x0, [tsk, #TI_FLAGS]		// get flags
> 	tbz	x0, #TIF_NEED_RESCHED, 1f	// needs rescheduling?
> 	bl	el1_preempt


(It may be worth renaming the register alias 'tsk' as it isn't really a
 task_struct. This would catch any missed users at build time, including
 any patches in flight...)


> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 2f39036..ddce61b 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -45,6 +45,7 @@
>  #include <linux/personality.h>
>  #include <linux/notifier.h>
>  #include <trace/events/power.h>
> +#include <linux/percpu.h>
>  
>  #include <asm/alternative.h>
>  #include <asm/compat.h>
> @@ -312,6 +313,17 @@ static void uao_thread_switch(struct task_struct *next)
>  }
>  
>  /*
> + * We store our current task in sp_el0, which is clobbered by userspace. Keep a
> + * shadow copy so that we can restore this upon entry from userspace.
> + */
> +DEFINE_PER_CPU(struct task_struct *, __entry_task) = &init_task;
> +
> +static void entry_task_switch(struct task_struct *next)
> +{
> +	__this_cpu_write(__entry_task, next);
> +}
> +
> +/*
>   * Thread switching.
>   */
>  struct task_struct *__switch_to(struct task_struct *prev,
> @@ -323,6 +335,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
>  	tls_thread_switch(next);
>  	hw_breakpoint_thread_switch(next);
>  	contextidr_thread_switch(next);
> +	entry_task_switch(next);
>  	uao_thread_switch(next);
>  
>  	/*
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 2679722..cde25f4 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -149,6 +149,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  	 * We need to tell the secondary core where to find its stack and the
>  	 * page tables.
>  	 */
> +	secondary_data.task = idle;
>  	secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
>  	update_cpu_boot_status(CPU_MMU_OFF);
>  	__flush_dcache_area(&secondary_data, sizeof(secondary_data));
> @@ -173,6 +174,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
>  	}
>  
> +	secondary_data.task = NULL;
>  	secondary_data.stack = NULL;
>  	status = READ_ONCE(secondary_data.status);
>  	if (ret && status) {
> 

Nit-territory: Something we should remember is that __entry_task isn't written
on secondary startup, so it's stale (CPU0's idle task) until the first
__switch_to(). This isn't a problem as it's only read on entry from EL0.


Thanks,

James


* [PATCH 10/10] arm64: split thread_info from task stack
  2016-10-21 14:50   ` James Morse
@ 2016-10-21 15:59     ` Mark Rutland
  2016-10-21 17:27       ` Mark Rutland
  2016-10-21 16:20     ` Mark Rutland
  1 sibling, 1 reply; 24+ messages in thread
From: Mark Rutland @ 2016-10-21 15:59 UTC (permalink / raw)
  To: linux-arm-kernel

Hi James,

On Fri, Oct 21, 2016 at 03:50:34PM +0100, James Morse wrote:
> >  arch/arm64/kernel/entry.S            |  4 ++--
> 
> 4? That was too easy...

Indeed... ;)

> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index 2d4c83b..e781391 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -123,6 +123,7 @@
> >  	 * Set sp_el0 to current thread_info.
> >  	 */
> >  	.if	\el == 0
> > +	ldr_this_cpu	tsk, __entry_task, x21
> >  	msr	sp_el0, tsk
> >  	.endif
> >  
> > @@ -674,8 +675,7 @@ ENTRY(cpu_switch_to)
> >  	ldp	x29, x9, [x8], #16
> >  	ldr	lr, [x8]
> >  	mov	sp, x9
> > -	and	x9, x9, #~(THREAD_SIZE - 1)
> > -	msr	sp_el0, x9
> > +	msr	sp_el0, x1
> >  	ret
> >  ENDPROC(cpu_switch_to)
> >  
> 
> So now tsk is current instead of current_thread_info(), but we still access it
> with TI_* offsets:

Yes; luckily thread_info is the first member of task_struct, so this
works (as offsetof(struct task_struct, thread_info) == 0). The core code
also relies on this, e.g. in <linux/thread_info.h>:

	#ifdef CONFIG_THREAD_INFO_IN_TASK
	#define current_thread_info() ((struct thread_info *)current)
	#endif

... regardless, I should have commented that, mentioned it in the commit
message, and perhaps put a BUILD_BUG_ON()-style assert somewhere. I'll
need to double-check, but IIRC I can't update asm-offsets to base the
TI_* offsets on task_struct without introducing a potential circular
include dependency.

I could at least s/TI_/TSK_/, with a comment.

> The 're-entered irq stack' check is going to need re-thinking:
> entry.S:195
> > 	/*
> > 	 * Compare sp with the current thread_info, if the top
> > 	 * ~(THREAD_SIZE - 1) bits match, we are on a task stack, and
> > 	 * should switch to the irq stack.
> > 	 */
> > 	and	x25, x19, #~(THREAD_SIZE - 1)
> > 	cmp	x25, tsk
> > 	b.ne	9998f
> 
> It was done like this as the irq stack isn't naturally aligned, so this check
> implicitly assumes tsk is on the stack. I will try and come up with an alternative.

Ouch; I clearly didn't vet this thoroughly enough.

If we can corrupt another register here, we can load task_struct::stack
to compare against instead.

> And there are a few other things like this:
> entry.S:431
> > 	ldr	w24, [tsk, #TI_PREEMPT]		// get preempt count
> > 	cbnz	w24, 1f				// preempt count != 0
> > 	ldr	x0, [tsk, #TI_FLAGS]		// get flags
> > 	tbz	x0, #TIF_NEED_RESCHED, 1f	// needs rescheduling?
> > 	bl	el1_preempt
> 
> (It may be worth renaming the register alias 'tsk' as it isn't really a
>  struct_task. This would catch any missed users at build time, including
>  any patches in flight...)

Entertainingly, with these patches 'tsk' *is* task_struct, whereas
before it wasn't. 

> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index 2679722..cde25f4 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -149,6 +149,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> >  	 * We need to tell the secondary core where to find its stack and the
> >  	 * page tables.
> >  	 */
> > +	secondary_data.task = idle;
> >  	secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
> >  	update_cpu_boot_status(CPU_MMU_OFF);
> >  	__flush_dcache_area(&secondary_data, sizeof(secondary_data));
> > @@ -173,6 +174,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> >  		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
> >  	}
> >  
> > +	secondary_data.task = NULL;
> >  	secondary_data.stack = NULL;
> >  	status = READ_ONCE(secondary_data.status);
> >  	if (ret && status) {
> > 
> 
> Nit-territory: Something we should remember is that __entry_task isn't written
> on secondary startup, so its stale (CPU0s idle task) until the first
> __switch_to(). This isn't a problem as its only read on entry from EL0.

Good point. I think I can initialise this in the hotplug path, if only to
save us surprises when debugging.

Thanks,
Mark.


* [PATCH 10/10] arm64: split thread_info from task stack
  2016-10-21 14:50   ` James Morse
  2016-10-21 15:59     ` Mark Rutland
@ 2016-10-21 16:20     ` Mark Rutland
  1 sibling, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-21 16:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 21, 2016 at 03:50:34PM +0100, James Morse wrote:
> Hi Mark,
> 
> This looks great, we should definitely do this.
> There are a few things missing from entry.S below:
> 
> On 19/10/16 20:10, Mark Rutland wrote:
> > This patch moves arm64's struct thread_info from the task stack into
> > task_struct. This protects thread_info from corruption in the case of
> > stack overflows, and makes its address harder to determine if stack
> > addresses are leaked, making a number of attacks more difficult. Precise
> > detection and handling of overflow is left for subsequent patches.
> > 
> > Largely, this involves changing code to store the task_struct in sp_el0,
> > and acquire the thread_info from the task struct (which is the opposite
> > way around to the current code). Both secondary entry and idle are
> > updated to stash the sp and task pointer separately.
> > 
> > Userspace clobbers sp_el0, and we can no longer restore this from the
> > stack. Instead, the current task is cached in a per-cpu variable that we
> > can safely access from early assembly as interrupts are disabled (and we
> 
> >  arch/arm64/Kconfig                   |  1 +
> >  arch/arm64/include/asm/Kbuild        |  1 -
> >  arch/arm64/include/asm/current.h     | 22 ++++++++++++++++++++++
> >  arch/arm64/include/asm/smp.h         |  1 +
> >  arch/arm64/include/asm/thread_info.h | 24 ------------------------
> >  arch/arm64/kernel/asm-offsets.c      |  1 +
> 
> >  arch/arm64/kernel/entry.S            |  4 ++--
> 
> 4? That was too easy...

Far too easy; just looking at kernel_entry there's a glaring error:

	.if     \el == 0
	mrs     x21, sp_el0
	mov     tsk, sp
	and     tsk, tsk, #~(THREAD_SIZE - 1)   // Ensure MDSCR_EL1.SS is clear,
	ldr     x19, [tsk, #TI_FLAGS]           // since we can unmask debug
	disable_step_tsk x19, x20               // exceptions when scheduling.

...it's amazing that a kernel this broken will boot quite happily.

I've fixed that up locally.

Thanks,
Mark.


* [PATCH 10/10] arm64: split thread_info from task stack
  2016-10-21 15:59     ` Mark Rutland
@ 2016-10-21 17:27       ` Mark Rutland
  0 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-21 17:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 21, 2016 at 04:59:02PM +0100, Mark Rutland wrote:
> On Fri, Oct 21, 2016 at 03:50:34PM +0100, James Morse wrote:
 
> > So now tsk is current instead of current_thread_info(), but we still access it
> > with TI_* offsets:

> I'll need to double-check, but IIRC I can't update asm-offsets to base
> the TI_* offsets on task_struct without introducing a potential
> circular include dependency.

I was mistaken; asm-offsets.c already includes <linux/sched.h>. I've
given TSK_ prefixes to everything using tsk as a base.

> > The 're-entered irq stack' check is going to need re-thinking:
> > entry.S:195
> > > 	/*
> > > 	 * Compare sp with the current thread_info, if the top
> > > 	 * ~(THREAD_SIZE - 1) bits match, we are on a task stack, and
> > > 	 * should switch to the irq stack.
> > > 	 */
> > > 	and	x25, x19, #~(THREAD_SIZE - 1)
> > > 	cmp	x25, tsk
> > > 	b.ne	9998f
> > 
> > It was done like this as the irq stack isn't naturally aligned, so this check
> > implicitly assumes tsk is on the stack. I will try and come up with an alternative.

I've changed this to:

	/*
	 * Compare sp with the task stack base. If the top ~(THREAD_SIZE - 1)
	 * bits match, we are on a task stack, and should switch to the irq
	 * stack.
	 */
	ldr	x25, [tsk, TSK_STACK]
	eor	x25, x25, x19
	and	x25, x25, #~(THREAD_SIZE - 1)
	cbnz	x25, 9998f

Where TSK_STACK is generated with asm-offsets.c:

	DEFINE(TSK_STACK,	offsetof(struct task_struct, stack));

[...]

> > Nit-territory: Something we should remember is that __entry_task isn't written
> > on secondary startup, so its stale (CPU0s idle task) until the first
> > __switch_to(). This isn't a problem as its only read on entry from EL0.
> 
> Good point. I think I can initialise this in the hotplug path, if
> nothing else but to save us any surprises when debugging.

... thinking about it some more, that requires defining __entry_task in
a header, and spreading it around. Instead, I've made the comment more
explicit regarding the __switch_to() requirement.

Thanks,
Mark.


* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
                   ` (9 preceding siblings ...)
  2016-10-19 19:10 ` [PATCH 10/10] arm64: split thread_info from task stack Mark Rutland
@ 2016-10-24 17:38 ` Laura Abbott
  2016-10-24 17:48   ` Mark Rutland
  10 siblings, 1 reply; 24+ messages in thread
From: Laura Abbott @ 2016-10-24 17:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/19/2016 12:10 PM, Mark Rutland wrote:
> Hi all,
>
> Building atop of Andy's work on x86 and generic code, these patches move
> arm64's thread_info off of the stack and into task_struct. This protects
> thread_info from corruption in the face of stack overflow, and serves as
> a step towards fully robust stack overflow handling, which will be
> addressed by subsequent patches.
>
> These patches are based atop of a preparatory series [1] (itself based
> on v4.9-rc1) that's also necessary for s390. I've placed those patches
> in a branch [2] on my kernel.org repo, along with this series [3]. I'm
> hoping that the prep work will be able to become a stable branch/tag
> soon.
>
> I've given the series some light testing on a couple of SMP arm64
> platforms, but this has yet to see a thorough beating; please do try to
> make this fall over!
>
> Since RFC [4]:
> * Rely on prior patches to make thread_info arch-specific
> * Make smp_processor_id() use a per-cpu variable
> * Split out current_stack_pointer
> * Make SMP actually work
>
> [1] http://lkml.kernel.org/r/1476901693-8492-1-git-send-email-mark.rutland at arm.com
> [2] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=core/ti-stack-split
> [3] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=arm64/ti-stack-split
> [4] http://lkml.kernel.org/r/1473947349-14521-1-git-send-email-mark.rutland at arm.com
>
> Thanks,
> Mark.
>
> Mark Rutland (10):
>   arm64: thread_info remove stale items
>   arm64: asm-offsets: remove unused definitions
>   arm64: factor out current_stack_pointer
>   arm64: traps: simplify die() and __die()
>   arm64: prep stack walkers for THREAD_INFO_IN_TASK
>   arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx
>   arm64: smp: prepare for smp_processor_id() rework
>   arm64: make cpu number a percpu variable
>   arm64: assembler: introduce ldr_this_cpu
>   arm64: split thread_info from task stack
>
>  arch/arm64/Kconfig                     |  1 +
>  arch/arm64/include/asm/Kbuild          |  1 -
>  arch/arm64/include/asm/assembler.h     | 19 +++++++++++++++----
>  arch/arm64/include/asm/current.h       | 22 ++++++++++++++++++++++
>  arch/arm64/include/asm/percpu.h        |  2 ++
>  arch/arm64/include/asm/perf_event.h    |  2 ++
>  arch/arm64/include/asm/smp.h           |  7 ++++++-
>  arch/arm64/include/asm/stack_pointer.h |  9 +++++++++
>  arch/arm64/include/asm/suspend.h       |  2 +-
>  arch/arm64/include/asm/thread_info.h   | 32 +-------------------------------
>  arch/arm64/kernel/asm-offsets.c        |  3 +--
>  arch/arm64/kernel/entry.S              |  6 +++---
>  arch/arm64/kernel/head.S               | 11 ++++++-----
>  arch/arm64/kernel/process.c            | 33 +++++++++++++++++++++++++++------
>  arch/arm64/kernel/return_address.c     |  1 +
>  arch/arm64/kernel/sleep.S              |  3 ---
>  arch/arm64/kernel/smp.c                | 14 +++++++++++---
>  arch/arm64/kernel/stacktrace.c         |  6 ++++++
>  arch/arm64/kernel/suspend.c            |  6 ------
>  arch/arm64/kernel/traps.c              | 14 +++++++-------
>  arch/arm64/mm/proc.S                   |  6 ++++++
>  21 files changed, 127 insertions(+), 73 deletions(-)
>  create mode 100644 arch/arm64/include/asm/current.h
>  create mode 100644 arch/arm64/include/asm/stack_pointer.h
>

I pulled the arm64/ti-stack-split branch on top of a Fedora
tree and ran back-to-back kernel RPM builds for a long weekend.
It's still going as of this morning so you can take that as a

Tested-by: Laura Abbott <labbott@redhat.com>

Thanks,
Laura


* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-24 17:38 ` [PATCH 00/10] arm64: move thread_info off of the " Laura Abbott
@ 2016-10-24 17:48   ` Mark Rutland
  2016-10-24 17:58     ` Laura Abbott
  0 siblings, 1 reply; 24+ messages in thread
From: Mark Rutland @ 2016-10-24 17:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 24, 2016 at 10:38:59AM -0700, Laura Abbott wrote:
> On 10/19/2016 12:10 PM, Mark Rutland wrote:
> >Hi all,
> >
> >Building atop of Andy's work on x86 and generic code, these patches move
> >arm64's thread_info off of the stack and into task_struct. This protects
> >thread_info from corruption in the face of stack overflow, and serves as
> >a step towards fully robust stack overflow handling, which will be
> >addressed by subsequent patches.
> >
> >These patches are based atop of a preparatory series [1] (itself based
> >on v4.9-rc1) that's also necessary for s390. I've placed those patches
> >in a branch [2] on my kernel.org repo, along with this series [3]. I'm
> >hoping that the prep work will be able to become a stable branch/tag
> >soon.
> >
> >I've given the series some light testing on a couple of SMP arm64
> >platforms, but this has yet to see a thorough beating; please do try to
> >make this fall over!
> >
> >Since RFC [4]:
> >* Rely on prior patches to make thread_info arch-specific
> >* Make smp_processor_id() use a per-cpu variable
> >* Split out current_stack_pointer
> >* Make SMP actually work
> >
> >[1] http://lkml.kernel.org/r/1476901693-8492-1-git-send-email-mark.rutland at arm.com
> >[2] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=core/ti-stack-split
> >[3] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=arm64/ti-stack-split
> >[4] http://lkml.kernel.org/r/1473947349-14521-1-git-send-email-mark.rutland at arm.com

> I pulled the arm64/ti-stack-split branch on top of a Fedora
> tree and ran back-to-back kernel RPM builds for a long weekend.
> It's still going as of this morning so you can take that as a
> 
> Tested-by: Laura Abbott <labbott@redhat.com>

Thanks! That's much appreciated!

Just to check, did you grab the version with entry.S fixes rolled in
(where the head is 657f54256c427fec)?

Thanks,
Mark.


* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-24 17:48   ` Mark Rutland
@ 2016-10-24 17:58     ` Laura Abbott
  2016-10-24 18:09       ` Mark Rutland
  0 siblings, 1 reply; 24+ messages in thread
From: Laura Abbott @ 2016-10-24 17:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/24/2016 10:48 AM, Mark Rutland wrote:
> On Mon, Oct 24, 2016 at 10:38:59AM -0700, Laura Abbott wrote:
>> On 10/19/2016 12:10 PM, Mark Rutland wrote:
>>> Hi all,
>>>
>>> Building atop of Andy's work on x86 and generic code, these patches move
>>> arm64's thread_info off of the stack and into task_struct. This protects
>>> thread_info from corruption in the face of stack overflow, and serves as
>>> a step towards fully robust stack overflow handling, which will be
>>> addressed by subsequent patches.
>>>
>>> These patches are based atop of a preparatory series [1] (itself based
>>> on v4.9-rc1) that's also necessary for s390. I've placed those patches
>>> in a branch [2] on my kernel.org repo, along with this series [3]. I'm
>>> hoping that the prep work will be able to become a stable branch/tag
>>> soon.
>>>
>>> I've given the series some light testing on a couple of SMP arm64
>>> platforms, but this has yet to see a thorough beating; please do try to
>>> make this fall over!
>>>
>>> Since RFC [4]:
>>> * Rely on prior patches to make thread_info arch-specific
>>> * Make smp_processor_id() use a per-cpu variable
>>> * Split out current_stack_pointer
>>> * Make SMP actually work
>>>
>>> [1] http://lkml.kernel.org/r/1476901693-8492-1-git-send-email-mark.rutland at arm.com
>>> [2] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=core/ti-stack-split
>>> [3] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=arm64/ti-stack-split
>>> [4] http://lkml.kernel.org/r/1473947349-14521-1-git-send-email-mark.rutland at arm.com
>
>> I pulled the arm64/ti-stack-split branch on top of a Fedora
>> tree and ran back-to-back kernel RPM builds for a long weekend.
>> It's still going as of this morning so you can take that as a
>>
>> Tested-by: Laura Abbott <labbott@redhat.com>
>
> Thanks! That's much appreciated!
>
> Just to check, did you grab the version with entry.S fixes rolled in
> (where the head is 657f54256c427fec)?

Ah I did not. That came in after I started the test. I'll start
another run with the new version.

>
> Thanks,
> Mark.
>

Thanks,
Laura

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-24 17:58     ` Laura Abbott
@ 2016-10-24 18:09       ` Mark Rutland
  2016-10-24 18:15         ` Mark Rutland
  2016-10-26  0:46         ` Laura Abbott
  0 siblings, 2 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-24 18:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 24, 2016 at 10:58:10AM -0700, Laura Abbott wrote:
> On 10/24/2016 10:48 AM, Mark Rutland wrote:
> >On Mon, Oct 24, 2016 at 10:38:59AM -0700, Laura Abbott wrote:
> >>On 10/19/2016 12:10 PM, Mark Rutland wrote:
> >>>[3] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=arm64/ti-stack-split
> >
> >>I pulled the arm64/ti-stack-split branch on top of a Fedora
> >>tree and ran back-to-back kernel RPM builds for a long weekend.
> >>It's still going as of this morning so you can take that as a
> >>
> >>Tested-by: Laura Abbott <labbott@redhat.com>
> >
> >Thanks! That's much appreciated!
> >
> >Just to check, did you grab the version with entry.S fixes rolled in
> >(where the head is 657f54256c427fec)?
> 
> Ah I did not. That came in after I started the test. I'll start
> another run with the new version.

Sorry about that; thanks for respinning!

It's really crazy how broken a kernel can be yet still "work"; clearly
we better tests are needed. :/

Thanks,
Mark.


* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-24 18:09       ` Mark Rutland
@ 2016-10-24 18:15         ` Mark Rutland
  2016-10-24 18:18           ` Kees Cook
  2016-10-26  0:46         ` Laura Abbott
  1 sibling, 1 reply; 24+ messages in thread
From: Mark Rutland @ 2016-10-24 18:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 24, 2016 at 07:09:42PM +0100, Mark Rutland wrote:
> It's really crazy how broken a kernel can be yet still "work"; clearly
> we better tests are needed. :/

Clearly we better grammar need too. :(

Mark.


* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-24 18:15         ` Mark Rutland
@ 2016-10-24 18:18           ` Kees Cook
  2016-10-25 10:05             ` Mark Rutland
  0 siblings, 1 reply; 24+ messages in thread
From: Kees Cook @ 2016-10-24 18:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 24, 2016 at 11:15 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Mon, Oct 24, 2016 at 07:09:42PM +0100, Mark Rutland wrote:
>> It's really crazy how broken a kernel can be yet still "work"; clearly
>> we better tests are needed. :/
>
> Clearly we better grammar need too. :(

Out of curiosity, what workflow would have tripped over the entry.S bug?

-Kees

-- 
Kees Cook
Nexus Security


* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-24 18:18           ` Kees Cook
@ 2016-10-25 10:05             ` Mark Rutland
  0 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-25 10:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 24, 2016 at 11:18:35AM -0700, Kees Cook wrote:
> On Mon, Oct 24, 2016 at 11:15 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Mon, Oct 24, 2016 at 07:09:42PM +0100, Mark Rutland wrote:
> >> It's really crazy how broken a kernel can be yet still "work"; clearly
> >> we better tests are needed. :/
> >
> > Clearly we better grammar need too. :(
> 
> Out of curiosity, what workflow would have tripped over the entry.S bug?

There are two bugs:

The issues in [1] would show up if you were attempting to use
breakpoints or watchpoints -- we'd never disable the single step.

The broken 're-entered irq stack' check [2] would be an issue if we were
close to exhausting the stack -- we'd never switch to the IRQ stack when
we take an IRQ in a kernel context. I'm not sure of a particular
workload.

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-October/462932.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-October/462891.html


* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-24 18:09       ` Mark Rutland
  2016-10-24 18:15         ` Mark Rutland
@ 2016-10-26  0:46         ` Laura Abbott
  2016-10-26  9:55           ` Mark Rutland
  1 sibling, 1 reply; 24+ messages in thread
From: Laura Abbott @ 2016-10-26  0:46 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/24/2016 11:09 AM, Mark Rutland wrote:
> On Mon, Oct 24, 2016 at 10:58:10AM -0700, Laura Abbott wrote:
>> On 10/24/2016 10:48 AM, Mark Rutland wrote:
>>> On Mon, Oct 24, 2016 at 10:38:59AM -0700, Laura Abbott wrote:
>>>> On 10/19/2016 12:10 PM, Mark Rutland wrote:
>>>>> [3] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=arm64/ti-stack-split
>>>
>>>> I pulled the arm64/ti-stack-split branch on top of a Fedora
>>>> tree and ran back-to-back kernel RPM builds for a long weekend.
>>>> It's still going as of this morning so you can take that as a
>>>>
>>>> Tested-by: Laura Abbott <labbott@redhat.com>
>>>
>>> Thanks! That's much appreciated!
>>>
>>> Just to check, did you grab the version with entry.S fixes rolled in
>>> (where the head is 657f54256c427fec)?
>>
>> Ah I did not. That came in after I started the test. I'll start
>> another run with the new version.
>
> Sorry about that; thanks for respinning!
>
> It's really crazy how broken a kernel can be yet still "work"; clearly
> we better tests are needed. :/
>
> Thanks,
> Mark.
>

While not as long running, I gave it another spin and it seemed to
work fine as well.

Thanks,
Laura


* [PATCH 00/10] arm64: move thread_info off of the task stack
  2016-10-26  0:46         ` Laura Abbott
@ 2016-10-26  9:55           ` Mark Rutland
  0 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2016-10-26  9:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Oct 25, 2016 at 05:46:39PM -0700, Laura Abbott wrote:
> On 10/24/2016 11:09 AM, Mark Rutland wrote:
> >On Mon, Oct 24, 2016 at 10:58:10AM -0700, Laura Abbott wrote:
> >>On 10/24/2016 10:48 AM, Mark Rutland wrote:
> >>>On Mon, Oct 24, 2016 at 10:38:59AM -0700, Laura Abbott wrote:
> >>>>On 10/19/2016 12:10 PM, Mark Rutland wrote:
> >>>>>[3] https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/log/?h=arm64/ti-stack-split
> >>>
> >>>>I pulled the arm64/ti-stack-split branch on top of a Fedora
> >>>>tree and ran back-to-back kernel RPM builds for a long weekend.
> >>>>It's still going as of this morning so you can take that as a
> >>>>
> >>>>Tested-by: Laura Abbott <labbott@redhat.com>
> >>>
> >>>Thanks! That's much appreciated!
> >>>
> >>>Just to check, did you grab the version with entry.S fixes rolled in
> >>>(where the head is 657f54256c427fec)?
> >>
> >>Ah I did not. That came in after I started the test. I'll start
> >>another run with the new version.
> >
> >Sorry about that; thanks for respinning!
 
> While not as long running, I gave it another spin and it seemed to
> work fine as well.

Thanks, that's much appreciated!

I've applied your Tested-by to the series.

Thanks,
Mark.


end of thread, other threads:[~2016-10-26  9:55 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-19 19:10 [PATCH 00/10] arm64: move thread_info off of the task stack Mark Rutland
2016-10-19 19:10 ` [PATCH 01/10] arm64: thread_info remove stale items Mark Rutland
2016-10-19 19:10 ` [PATCH 02/10] arm64: asm-offsets: remove unused definitions Mark Rutland
2016-10-19 19:10 ` [PATCH 03/10] arm64: factor out current_stack_pointer Mark Rutland
2016-10-19 19:10 ` [PATCH 04/10] arm64: traps: simplify die() and __die() Mark Rutland
2016-10-19 19:10 ` [PATCH 05/10] arm64: prep stack walkers for THREAD_INFO_IN_TASK Mark Rutland
2016-10-19 19:10 ` [PATCH 06/10] arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx Mark Rutland
2016-10-19 19:10 ` [PATCH 07/10] arm64: smp: prepare for smp_processor_id() rework Mark Rutland
2016-10-19 19:10 ` [PATCH 08/10] arm64: make cpu number a percpu variable Mark Rutland
2016-10-19 19:10 ` [PATCH 09/10] arm64: assembler: introduce ldr_this_cpu Mark Rutland
2016-10-19 19:10 ` [PATCH 10/10] arm64: split thread_info from task stack Mark Rutland
2016-10-21 14:50   ` James Morse
2016-10-21 15:59     ` Mark Rutland
2016-10-21 17:27       ` Mark Rutland
2016-10-21 16:20     ` Mark Rutland
2016-10-24 17:38 ` [PATCH 00/10] arm64: move thread_info off of the " Laura Abbott
2016-10-24 17:48   ` Mark Rutland
2016-10-24 17:58     ` Laura Abbott
2016-10-24 18:09       ` Mark Rutland
2016-10-24 18:15         ` Mark Rutland
2016-10-24 18:18           ` Kees Cook
2016-10-25 10:05             ` Mark Rutland
2016-10-26  0:46         ` Laura Abbott
2016-10-26  9:55           ` Mark Rutland
