linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/3] arm64: stacktrace: improve robustness
@ 2019-06-06 12:53 Mark Rutland
  2019-06-06 12:54 ` [PATCH 1/3] arm64: stacktrace: Constify stacktrace.h functions Mark Rutland
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Mark Rutland @ 2019-06-06 12:53 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, tengfeif, catalin.marinas, will.deacon,
	james.morse, dave.martin

The arm64 stacktrace code is careful to only access valid stack
locations, but in the presence of a corrupted stack where frame records
form a loop, it will never terminate.

This series updates the stacktrace code to terminate in finite time even
when a stack is corrupted. A stacktrace will be terminated if the next
record is at a lower (or equal) address on the current stack, or when
the next record is on a stack we've already completed unwinding.
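The two termination rules can be sketched in plain user-space C (hypothetical, simplified types — not the kernel's struct stackframe or stack_info):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stack identifiers standing in for the kernel's enum stack_type. */
enum stack_id { STACK_A, STACK_B, NR_STACKS };

struct frame {
	uint64_t fp;          /* address of the current frame record */
	enum stack_id stack;  /* stack the current record lives on */
	bool done[NR_STACKS]; /* stacks we have finished unwinding */
};

/*
 * One unwind step, applying the two rules from the cover letter:
 *  - a record on a stack we already finished terminates the trace;
 *  - a record on the same stack must be at a strictly higher address.
 * next_fp/next_stack stand in for the values read from the frame record.
 */
static bool unwind_step(struct frame *f, uint64_t next_fp, enum stack_id next_stack)
{
	if (f->done[next_stack])
		return false;             /* already unwound that stack: a loop */
	if (next_stack == f->stack && next_fp <= f->fp)
		return false;             /* same stack, not moving up: a loop */
	if (next_stack != f->stack)
		f->done[f->stack] = true; /* record the stack we are leaving */
	f->fp = next_fp;
	f->stack = next_stack;
	return true;
}
```

Since each step either moves to a strictly higher address on the same stack or permanently retires a stack, the walk is bounded and must terminate.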

The first couple of patches come from Dave's prior attempt to fix this
[1], with the final patch relying on infrastructure which has been
introduced in the meantime.

I've given this a quick spin with magic-sysrq L in a KVM guest, and
things look fine, but further testing would be appreciated.

This series (based on v5.2-rc1) can also be found in my
arm64/robust-stacktrace branch on kernel.org [2].

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-April/572685.html
[2] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/robust-stacktrace

Dave Martin (2):
  arm64: stacktrace: Constify stacktrace.h functions
  arm64: stacktrace: Factor out backtrace initialisation

Mark Rutland (1):
  arm64: stacktrace: better handle corrupted stacks

 arch/arm64/include/asm/stacktrace.h | 55 ++++++++++++++++++++++++++++---------
 arch/arm64/kernel/process.c         |  6 +---
 arch/arm64/kernel/stacktrace.c      | 16 ++++++++++-
 arch/arm64/kernel/time.c            |  6 +---
 arch/arm64/kernel/traps.c           | 13 ++++-----
 5 files changed, 65 insertions(+), 31 deletions(-)

-- 
2.11.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] arm64: stacktrace: Constify stacktrace.h functions
  2019-06-06 12:53 [PATCH 0/3] arm64: stacktrace: improve robustness Mark Rutland
@ 2019-06-06 12:54 ` Mark Rutland
  2019-06-06 12:54 ` [PATCH 2/3] arm64: stacktrace: Factor out backtrace initialisation Mark Rutland
  2019-06-06 12:54 ` [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks Mark Rutland
  2 siblings, 0 replies; 16+ messages in thread
From: Mark Rutland @ 2019-06-06 12:54 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, tengfeif, catalin.marinas, will.deacon,
	james.morse, Dave Martin

From: Dave Martin <Dave.Martin@arm.com>

on_accessible_stack() and on_task_stack() shouldn't (and don't)
modify their task argument, so it can be const.

This patch adds the appropriate modifiers. Whitespace violations in the
parameter lists are fixed at the same time.

No functional change.

Signed-off-by: Dave Martin <dave.martin@arm.com>
[Mark: fixup const location, whitespace]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/include/asm/stacktrace.h | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index e86737b7c924..4dd569592e65 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -75,8 +75,9 @@ static inline bool on_irq_stack(unsigned long sp,
 	return true;
 }
 
-static inline bool on_task_stack(struct task_struct *tsk, unsigned long sp,
-				struct stack_info *info)
+static inline bool on_task_stack(const struct task_struct *tsk,
+				 unsigned long sp,
+				 struct stack_info *info)
 {
 	unsigned long low = (unsigned long)task_stack_page(tsk);
 	unsigned long high = low + THREAD_SIZE;
@@ -123,9 +124,9 @@ static inline bool on_overflow_stack(unsigned long sp,
  * We can only safely access per-cpu stacks from current in a non-preemptible
  * context.
  */
-static inline bool on_accessible_stack(struct task_struct *tsk,
-					unsigned long sp,
-					struct stack_info *info)
+static inline bool on_accessible_stack(const struct task_struct *tsk,
+				       unsigned long sp,
+				       struct stack_info *info)
 {
 	if (on_task_stack(tsk, sp, info))
 		return true;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] arm64: stacktrace: Factor out backtrace initialisation
  2019-06-06 12:53 [PATCH 0/3] arm64: stacktrace: improve robustness Mark Rutland
  2019-06-06 12:54 ` [PATCH 1/3] arm64: stacktrace: Constify stacktrace.h functions Mark Rutland
@ 2019-06-06 12:54 ` Mark Rutland
  2019-06-21 15:50   ` Dave Martin
  2019-06-06 12:54 ` [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks Mark Rutland
  2 siblings, 1 reply; 16+ messages in thread
From: Mark Rutland @ 2019-06-06 12:54 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, tengfeif, catalin.marinas, will.deacon,
	james.morse, Dave Martin

From: Dave Martin <Dave.Martin@arm.com>

Some common code is required by each stacktrace user to initialise
struct stackframe before the first call to unwind_frame().

In preparation for adding to the common code, this patch factors it
out into a separate function start_backtrace(), and modifies the
stacktrace callers appropriately.

No functional change.

Signed-off-by: Dave Martin <dave.martin@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/include/asm/stacktrace.h | 10 ++++++++++
 arch/arm64/kernel/process.c         |  6 +-----
 arch/arm64/kernel/time.c            |  6 +-----
 arch/arm64/kernel/traps.c           | 13 ++++++-------
 4 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 4dd569592e65..18f90bf1385c 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -142,4 +142,14 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
 	return false;
 }
 
+static inline void start_backtrace(struct stackframe *frame,
+				   unsigned long fp, unsigned long pc)
+{
+	frame->fp = fp;
+	frame->pc = pc;
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	frame->graph = 0;
+#endif
+}
+
 #endif	/* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 3767fb21a5b8..122d88fccd13 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -509,11 +509,7 @@ unsigned long get_wchan(struct task_struct *p)
 	if (!stack_page)
 		return 0;
 
-	frame.fp = thread_saved_fp(p);
-	frame.pc = thread_saved_pc(p);
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	frame.graph = 0;
-#endif
+	start_backtrace(&frame, thread_saved_fp(p), thread_saved_pc(p));
 	do {
 		if (unwind_frame(p, &frame))
 			goto out;
diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
index a777ae90044d..aa3489f3a452 100644
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -49,11 +49,7 @@ unsigned long profile_pc(struct pt_regs *regs)
 	if (!in_lock_functions(regs->pc))
 		return regs->pc;
 
-	frame.fp = regs->regs[29];
-	frame.pc = regs->pc;
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	frame.graph = 0;
-#endif
+	start_backtrace(&frame, regs->regs[29], regs->pc);
 	do {
 		int ret = unwind_frame(NULL, &frame);
 		if (ret < 0)
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index ade32046f3fe..8053bbed8776 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -119,18 +119,17 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
 		return;
 
 	if (tsk == current) {
-		frame.fp = (unsigned long)__builtin_frame_address(0);
-		frame.pc = (unsigned long)dump_backtrace;
+		start_backtrace(&frame,
+				(unsigned long)__builtin_frame_address(0),
+				(unsigned long)dump_backtrace);
 	} else {
 		/*
 		 * task blocked in __switch_to
 		 */
-		frame.fp = thread_saved_fp(tsk);
-		frame.pc = thread_saved_pc(tsk);
+		start_backtrace(&frame,
+				thread_saved_fp(tsk),
+				thread_saved_pc(tsk));
 	}
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	frame.graph = 0;
-#endif
 
 	printk("Call trace:\n");
 	do {
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-06 12:53 [PATCH 0/3] arm64: stacktrace: improve robustness Mark Rutland
  2019-06-06 12:54 ` [PATCH 1/3] arm64: stacktrace: Constify stacktrace.h functions Mark Rutland
  2019-06-06 12:54 ` [PATCH 2/3] arm64: stacktrace: Factor out backtrace initialisation Mark Rutland
@ 2019-06-06 12:54 ` Mark Rutland
  2019-06-21 16:37   ` Dave Martin
                     ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Mark Rutland @ 2019-06-06 12:54 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, tengfeif, catalin.marinas, will.deacon,
	james.morse, dave.martin

The arm64 stacktrace code is careful to only dereference frame records
in valid stack ranges, ensuring that a corrupted frame record won't
result in a faulting access.

However, it's still possible for corrupt frame records to result in
infinite loops in the stacktrace code, which is also undesirable.

This patch ensures that we complete a stacktrace in finite time, by
keeping track of which stacks we have already completed unwinding, and
verifying that if the next frame record is on the same stack, it is at a
higher address.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Tengfei Fan <tengfeif@codeaurora.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/stacktrace.h | 34 ++++++++++++++++++++++++++--------
 arch/arm64/kernel/process.c         |  2 +-
 arch/arm64/kernel/stacktrace.c      | 16 +++++++++++++++-
 arch/arm64/kernel/time.c            |  2 +-
 arch/arm64/kernel/traps.c           |  4 ++--
 5 files changed, 45 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 18f90bf1385c..4ebf8a8997b0 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -19,19 +19,12 @@
 #include <linux/percpu.h>
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
+#include <linux/types.h>
 
 #include <asm/memory.h>
 #include <asm/ptrace.h>
 #include <asm/sdei.h>
 
-struct stackframe {
-	unsigned long fp;
-	unsigned long pc;
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	int graph;
-#endif
-};
-
 enum stack_type {
 	STACK_TYPE_UNKNOWN,
 	STACK_TYPE_TASK,
@@ -39,6 +32,7 @@ enum stack_type {
 	STACK_TYPE_OVERFLOW,
 	STACK_TYPE_SDEI_NORMAL,
 	STACK_TYPE_SDEI_CRITICAL,
+	__NR_STACK_TYPES
 };
 
 struct stack_info {
@@ -47,6 +41,16 @@ struct stack_info {
 	enum stack_type type;
 };
 
+struct stackframe {
+	unsigned long fp;
+	unsigned long pc;
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	int graph;
+#endif
+	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
+	enum stack_type stack_current;
+};
+
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
 extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
 			    int (*fn)(struct stackframe *, void *), void *data);
@@ -128,6 +132,9 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
 				       unsigned long sp,
 				       struct stack_info *info)
 {
+	if (info)
+		info->type = STACK_TYPE_UNKNOWN;
+
 	if (on_task_stack(tsk, sp, info))
 		return true;
 	if (tsk != current || preemptible())
@@ -143,13 +150,24 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
 }
 
 static inline void start_backtrace(struct stackframe *frame,
+				   struct task_struct *tsk,
 				   unsigned long fp, unsigned long pc)
 {
+	struct stack_info info;
+
 	frame->fp = fp;
 	frame->pc = pc;
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	frame->graph = 0;
 #endif
+	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
+
+	/*
+	 * We need to prime stack_current for the first unwind, but we can
+	 * ignore the accessibility until the unwind occurs.
+	 */
+	on_accessible_stack(tsk, fp, &info);
+	frame->stack_current = info.type;
 }
 
 #endif	/* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 122d88fccd13..ba9441982573 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -509,7 +509,7 @@ unsigned long get_wchan(struct task_struct *p)
 	if (!stack_page)
 		return 0;
 
-	start_backtrace(&frame, thread_saved_fp(p), thread_saved_pc(p));
+	start_backtrace(&frame, p, thread_saved_fp(p), thread_saved_pc(p));
 	do {
 		if (unwind_frame(p, &frame))
 			goto out;
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index b00ec7d483d1..1c45b33c7474 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -43,6 +43,8 @@
 int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 {
 	unsigned long fp = frame->fp;
+	bool changed_stack = false;
+	struct stack_info info;
 
 	if (fp & 0xf)
 		return -EINVAL;
@@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 	if (!tsk)
 		tsk = current;
 
-	if (!on_accessible_stack(tsk, fp, NULL))
+	if (!on_accessible_stack(tsk, fp, &info))
 		return -EINVAL;
 
+	if (test_bit(info.type, frame->stacks_done))
+		return -EINVAL;
+
+	if (frame->stack_current != info.type) {
+		set_bit(frame->stack_current, frame->stacks_done);
+		frame->stack_current = info.type;
+		changed_stack = true;
+	}
+
 	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
 	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
 
+	if (!changed_stack && frame->fp <= fp)
+		return -EINVAL;
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	if (tsk->ret_stack &&
 			(frame->pc == (unsigned long)return_to_handler)) {
diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
index aa3489f3a452..83f08c7e9464 100644
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -49,7 +49,7 @@ unsigned long profile_pc(struct pt_regs *regs)
 	if (!in_lock_functions(regs->pc))
 		return regs->pc;
 
-	start_backtrace(&frame, regs->regs[29], regs->pc);
+	start_backtrace(&frame, current, regs->regs[29], regs->pc);
 	do {
 		int ret = unwind_frame(NULL, &frame);
 		if (ret < 0)
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 8053bbed8776..3bbe1992259e 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -119,14 +119,14 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
 		return;
 
 	if (tsk == current) {
-		start_backtrace(&frame,
+		start_backtrace(&frame, tsk,
 				(unsigned long)__builtin_frame_address(0),
 				(unsigned long)dump_backtrace);
 	} else {
 		/*
 		 * task blocked in __switch_to
 		 */
-		start_backtrace(&frame,
+		start_backtrace(&frame, tsk,
 				thread_saved_fp(tsk),
 				thread_saved_pc(tsk));
 	}
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] arm64: stacktrace: Factor out backtrace initialisation
  2019-06-06 12:54 ` [PATCH 2/3] arm64: stacktrace: Factor out backtrace initialisation Mark Rutland
@ 2019-06-21 15:50   ` Dave Martin
  2019-06-28 11:27     ` Mark Rutland
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Martin @ 2019-06-21 15:50 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, tengfeif, will.deacon, james.morse, linux-arm-kernel

On Thu, Jun 06, 2019 at 01:54:01PM +0100, Mark Rutland wrote:
> From: Dave Martin <Dave.Martin@arm.com>
> 
> Some common code is required by each stacktrace user to initialise
> struct stackframe before the first call to unwind_frame().
> 
> In preparation for adding to the common code, this patch factors it
> out into a separate function start_backtrace(), and modifies the
> stacktrace callers appropriately.
> 
> No functional change.
> 
> Signed-off-by: Dave Martin <dave.martin@arm.com>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>

Ack (from memory you just added the CONFIG_FUNCTION_GRAPH_TRACER stuff,
the lack of which was causing build failures in my original version).

Have you added any new calls to start_backtrace() here?

Cheers
---Dave

> ---
>  arch/arm64/include/asm/stacktrace.h | 10 ++++++++++
>  arch/arm64/kernel/process.c         |  6 +-----
>  arch/arm64/kernel/time.c            |  6 +-----
>  arch/arm64/kernel/traps.c           | 13 ++++++-------
>  4 files changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index 4dd569592e65..18f90bf1385c 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -142,4 +142,14 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
>  	return false;
>  }
>  
> +static inline void start_backtrace(struct stackframe *frame,
> +				   unsigned long fp, unsigned long pc)
> +{
> +	frame->fp = fp;
> +	frame->pc = pc;
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +	frame->graph = 0;
> +#endif
> +}
> +
>  #endif	/* __ASM_STACKTRACE_H */
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 3767fb21a5b8..122d88fccd13 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -509,11 +509,7 @@ unsigned long get_wchan(struct task_struct *p)
>  	if (!stack_page)
>  		return 0;
>  
> -	frame.fp = thread_saved_fp(p);
> -	frame.pc = thread_saved_pc(p);
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -	frame.graph = 0;
> -#endif
> +	start_backtrace(&frame, thread_saved_fp(p), thread_saved_pc(p));
>  	do {
>  		if (unwind_frame(p, &frame))
>  			goto out;
> diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
> index a777ae90044d..aa3489f3a452 100644
> --- a/arch/arm64/kernel/time.c
> +++ b/arch/arm64/kernel/time.c
> @@ -49,11 +49,7 @@ unsigned long profile_pc(struct pt_regs *regs)
>  	if (!in_lock_functions(regs->pc))
>  		return regs->pc;
>  
> -	frame.fp = regs->regs[29];
> -	frame.pc = regs->pc;
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -	frame.graph = 0;
> -#endif
> +	start_backtrace(&frame, regs->regs[29], regs->pc);
>  	do {
>  		int ret = unwind_frame(NULL, &frame);
>  		if (ret < 0)
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index ade32046f3fe..8053bbed8776 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -119,18 +119,17 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
>  		return;
>  
>  	if (tsk == current) {
> -		frame.fp = (unsigned long)__builtin_frame_address(0);
> -		frame.pc = (unsigned long)dump_backtrace;
> +		start_backtrace(&frame,
> +				(unsigned long)__builtin_frame_address(0),
> +				(unsigned long)dump_backtrace);
>  	} else {
>  		/*
>  		 * task blocked in __switch_to
>  		 */
> -		frame.fp = thread_saved_fp(tsk);
> -		frame.pc = thread_saved_pc(tsk);
> +		start_backtrace(&frame,
> +				thread_saved_fp(tsk),
> +				thread_saved_pc(tsk));
>  	}
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -	frame.graph = 0;
> -#endif
>  
>  	printk("Call trace:\n");
>  	do {
> -- 
> 2.11.0
> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-06 12:54 ` [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks Mark Rutland
@ 2019-06-21 16:37   ` Dave Martin
  2019-06-28 11:32     ` Mark Rutland
  2019-06-24 11:34   ` James Morse
  2019-06-27 16:24   ` James Morse
  2 siblings, 1 reply; 16+ messages in thread
From: Dave Martin @ 2019-06-21 16:37 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, tengfeif, will.deacon, james.morse, linux-arm-kernel

On Thu, Jun 06, 2019 at 01:54:02PM +0100, Mark Rutland wrote:
> The arm64 stacktrace code is careful to only dereference frame records
> in valid stack ranges, ensuring that a corrupted frame record won't
> result in a faulting access.
> 
> However, it's still possible for corrupt frame records to result in
> infinite loops in the stacktrace code, which is also undesirable.
> 
> This patch ensures that we complete a stacktrace in finite time, by
> keeping track of which stacks we have already completed unwinding, and
> verifying that if the next frame record is on the same stack, it is at a
> higher address.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Dave Martin <dave.martin@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Tengfei Fan <tengfeif@codeaurora.org>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm64/include/asm/stacktrace.h | 34 ++++++++++++++++++++++++++--------
>  arch/arm64/kernel/process.c         |  2 +-
>  arch/arm64/kernel/stacktrace.c      | 16 +++++++++++++++-
>  arch/arm64/kernel/time.c            |  2 +-
>  arch/arm64/kernel/traps.c           |  4 ++--
>  5 files changed, 45 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index 18f90bf1385c..4ebf8a8997b0 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -19,19 +19,12 @@
>  #include <linux/percpu.h>
>  #include <linux/sched.h>
>  #include <linux/sched/task_stack.h>
> +#include <linux/types.h>
>  
>  #include <asm/memory.h>
>  #include <asm/ptrace.h>
>  #include <asm/sdei.h>
>  
> -struct stackframe {
> -	unsigned long fp;
> -	unsigned long pc;
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -	int graph;
> -#endif
> -};
> -
>  enum stack_type {
>  	STACK_TYPE_UNKNOWN,
>  	STACK_TYPE_TASK,
> @@ -39,6 +32,7 @@ enum stack_type {
>  	STACK_TYPE_OVERFLOW,
>  	STACK_TYPE_SDEI_NORMAL,
>  	STACK_TYPE_SDEI_CRITICAL,
> +	__NR_STACK_TYPES

The number of stack types is actually 1 less than this, and the zeroth
bit in stacks_done doesn't get used if we use this enum as an index.

Would STACK_TYPE_UNKNOWN = 0 fix this, or would that break something
elsewhere?

>  };
>  
>  struct stack_info {
> @@ -47,6 +41,16 @@ struct stack_info {
>  	enum stack_type type;
>  };
>  
> +struct stackframe {
> +	unsigned long fp;
> +	unsigned long pc;
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +	int graph;
> +#endif
> +	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
> +	enum stack_type stack_current;
> +};
> +
>  extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
>  extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
>  			    int (*fn)(struct stackframe *, void *), void *data);
> @@ -128,6 +132,9 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
>  				       unsigned long sp,
>  				       struct stack_info *info)
>  {
> +	if (info)
> +		info->type = STACK_TYPE_UNKNOWN;
> +
>  	if (on_task_stack(tsk, sp, info))
>  		return true;
>  	if (tsk != current || preemptible())
> @@ -143,13 +150,24 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
>  }
>  
>  static inline void start_backtrace(struct stackframe *frame,
> +				   struct task_struct *tsk,
>  				   unsigned long fp, unsigned long pc)
>  {
> +	struct stack_info info;
> +
>  	frame->fp = fp;
>  	frame->pc = pc;
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	frame->graph = 0;
>  #endif
> +	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
> +
> +	/*
> +	 * We need to prime stack_current for the first unwind, but we can
> +	 * ignore the accessibility until the unwind occurs.
> +	 */
> +	on_accessible_stack(tsk, fp, &info);
> +	frame->stack_current = info.type;
>  }
>  
>  #endif	/* __ASM_STACKTRACE_H */
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 122d88fccd13..ba9441982573 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -509,7 +509,7 @@ unsigned long get_wchan(struct task_struct *p)
>  	if (!stack_page)
>  		return 0;
>  
> -	start_backtrace(&frame, thread_saved_fp(p), thread_saved_pc(p));
> +	start_backtrace(&frame, p, thread_saved_fp(p), thread_saved_pc(p));
>  	do {
>  		if (unwind_frame(p, &frame))
>  			goto out;
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index b00ec7d483d1..1c45b33c7474 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -43,6 +43,8 @@
>  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  {
>  	unsigned long fp = frame->fp;
> +	bool changed_stack = false;
> +	struct stack_info info;
>  
>  	if (fp & 0xf)
>  		return -EINVAL;
> @@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	if (!tsk)
>  		tsk = current;
>  
> -	if (!on_accessible_stack(tsk, fp, NULL))
> +	if (!on_accessible_stack(tsk, fp, &info))
>  		return -EINVAL;
>  
> +	if (test_bit(info.type, frame->stacks_done))
> +		return -EINVAL;

Doesn't this fire when we unwind a sequence of frames on the same stack
(i.e., the common case)?

I may be missing something obvious here.

> +
> +	if (frame->stack_current != info.type) {
> +		set_bit(frame->stack_current, frame->stacks_done);

Oh, right, stacks_done is the set of stacks we have been on, excluding
the current one?  If so, a comment somewhere explaining that, or some
more explicit name, like "past_stacks" might make sense.

> +		frame->stack_current = info.type;
> +		changed_stack = true;
> +	}
> +
>  	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
>  	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
>  
> +	if (!changed_stack && frame->fp <= fp)
> +		return -EINVAL;
> +

[...]

Otherwise, seems to make sense.

Cheers
---Dave
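
Dave's counting point can be made concrete with a user-space mirror of the enum and the bitmap-sizing arithmetic (a sketch: BITS_TO_LONGS is re-derived here, not the kernel macro). STACK_TYPE_UNKNOWN is already the zeroth enumerator, so __NR_STACK_TYPES counts it even though it never names a real stack, leaving bit 0 of stacks_done unused if only real stack types are ever marked done:

```c
#include <assert.h>

/* Mirror of the enum from the patch (values implied by declaration order). */
enum stack_type {
	STACK_TYPE_UNKNOWN,	/* = 0: not a real stack */
	STACK_TYPE_TASK,
	STACK_TYPE_IRQ,
	STACK_TYPE_OVERFLOW,
	STACK_TYPE_SDEI_NORMAL,
	STACK_TYPE_SDEI_CRITICAL,
	__NR_STACK_TYPES	/* = 6, though only 5 entries name real stacks */
};

/* The arithmetic behind DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES). */
#define BITS_PER_LONG		(8 * (int)sizeof(unsigned long))
#define BITS_TO_LONGS(n)	(((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)
```

Either way the bitmap still fits in a single unsigned long, so the slack is one bit rather than any extra storage.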


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-06 12:54 ` [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks Mark Rutland
  2019-06-21 16:37   ` Dave Martin
@ 2019-06-24 11:34   ` James Morse
  2019-06-25 10:28     ` James Morse
  2019-06-27 16:24   ` James Morse
  2 siblings, 1 reply; 16+ messages in thread
From: James Morse @ 2019-06-24 11:34 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, tengfeif, will.deacon, dave.martin, linux-arm-kernel

Hi Mark,

On 06/06/2019 13:54, Mark Rutland wrote:
> The arm64 stacktrace code is careful to only dereference frame records
> in valid stack ranges, ensuring that a corrupted frame record won't
> result in a faulting access.
> 
> However, it's still possible for corrupt frame records to result in
> infinite loops in the stacktrace code, which is also undesirable.
> 
> This patch ensures that we complete a stacktrace in finite time, by
> keeping track of which stacks we have already completed unwinding, and
> verifying that if the next frame record is on the same stack, it is at a
> higher address.

This looks good. I tried to take it for a spin to test SDEI stack tracing ... but it
wouldn't boot; it panic()s before earlycon.

defconfig doesn't do this, defconfig+CONFIG_PROVE_LOCKING does.
Toggling CONFIG_DEBUG_LOCK_ALLOC is the smallest config change to make this show up.

It's taking a translation fault:
| <__ll_sc_arch_atomic64_or>:
|        f9800031        prfm    pstl1strm, [x1]
|        c85f7c31        ldxr    x17, [x1]		(faulting instruction)
|        aa000231        orr     x17, x17, x0
|        c8107c31        stxr    w16, x17, [x1]
|        35ffffb0        cbnz    w16, ffff000010c7d19c <__ll_sc_a
|        d65f03c0        ret

x0: 0x0000000000000100
x1: 0xffff0000137399e8			(far_el2)

If x1 were part of 'frame' in __save_stack_trace it should be on the stack, but at
fault-time sp is 0xffff0000114a3a50. This happens before the linear map has been set up....

The lr points just after the set_bit() call in unwind_frame().


> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index b00ec7d483d1..1c45b33c7474 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -43,6 +43,8 @@
>  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  {
>  	unsigned long fp = frame->fp;
> +	bool changed_stack = false;
> +	struct stack_info info;
>  
>  	if (fp & 0xf)
>  		return -EINVAL;
> @@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	if (!tsk)
>  		tsk = current;
>  
> -	if (!on_accessible_stack(tsk, fp, NULL))
> +	if (!on_accessible_stack(tsk, fp, &info))
>  		return -EINVAL;
>  
> +	if (test_bit(info.type, frame->stacks_done))
> +		return -EINVAL;
> +
> +	if (frame->stack_current != info.type) {

> +		set_bit(frame->stack_current, frame->stacks_done);


Changing this line:
| -               set_bit(frame->stack_current, frame->stacks_done);
| +               *frame->stacks_done |= (1 << frame->stack_current);
works fine.

But it doesn't cause a stacktrace to be printed, so I can't work out how
CONFIG_DEBUG_LOCK_ALLOC is relevant.


... this makes no sense, can anyone else reproduce it?


Thanks,

James


> +		frame->stack_current = info.type;
> +		changed_stack = true;
> +	}
> +
>  	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
>  	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
>  
> +	if (!changed_stack && frame->fp <= fp)
> +		return -EINVAL;
> +
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	if (tsk->ret_stack &&
>  			(frame->pc == (unsigned long)return_to_handler)) {


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-24 11:34   ` James Morse
@ 2019-06-25 10:28     ` James Morse
  0 siblings, 0 replies; 16+ messages in thread
From: James Morse @ 2019-06-25 10:28 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, tengfeif, will.deacon, dave.martin, linux-arm-kernel

Hi Mark,

On 24/06/2019 12:34, James Morse wrote:
> On 06/06/2019 13:54, Mark Rutland wrote:
>> The arm64 stacktrace code is careful to only dereference frame records
>> in valid stack ranges, ensuring that a corrupted frame record won't
>> result in a faulting access.
>>
>> However, it's still possible for corrupt frame records to result in
>> infinite loops in the stacktrace code, which is also undesirable.
>>
>> This patch ensures that we complete a stacktrace in finite time, by
>> keeping track of which stacks we have already completed unwinding, and
>> verifying that if the next frame record is on the same stack, it is at a
>> higher address.
> 
> This looks good. I tried to take it for a spin to test SDEI stack tracing ... but it
> wouldn't boot; it panic()s before earlycon.
> 
> defconfig doesn't do this, defconfig+CONFIG_PROVE_LOCKING does.
> Toggling CONFIG_DEBUG_LOCK_ALLOC is the smallest config change to make this show up.
> 
> It's taking a translation fault:
> | <__ll_sc_arch_atomic64_or>:
> |        f9800031        prfm    pstl1strm, [x1]
> |        c85f7c31        ldxr    x17, [x1]		(faulting instruction)
> |        aa000231        orr     x17, x17, x0
> |        c8107c31        stxr    w16, x17, [x1]
> |        35ffffb0        cbnz    w16, ffff000010c7d19c <__ll_sc_a
> |        d65f03c0        ret
> 
> x0: 0x0000000000000100
> x1: 0xffff0000137399e8			(far_el2)
> 
> If x1 were part of 'frame' in __save_stack_trace it should be on the stack, but at
> fault-time sp is 0xffff0000114a3a50. This happens before the linear map has been set up....
> 
> The lr points just after the set_bit() call in unwind_frame().

frame->stack_current is uninitialized/junk, so the calculated bit to set is outside of
mapped memory.

Lockdep is relevant because it uses save_stack_trace() which doesn't use the
start_backtrace() helper that initialises the new metadata.
DEBUG_LOCK_ALLOC was a red herring: it was perturbing the stack layout, so this code ate a
pointer instead of a much more believable 0.

Patch below, this needs to come before patch 3.
I'll give this a spin with the SDEI firmware.


Thanks,

James

----------------------------------%<----------------------------------
From: James Morse <james.morse@arm.com>
Date:   Tue Jun 25 11:05:33 2019 +0100

arm64: stacktrace: use start_backtrace() consistently

unwind_frame() is about to get smart when it comes to validating
stack frames and stepping between stacks without going round in
circles.

All this relies on new parameters in struct stackframe being
initialised. Before we can do this, we need all users of
struct stackframe to use start_backtrace(), instead of packing
the values manually.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/perf_callchain.c |  7 +------
 arch/arm64/kernel/return_address.c |  8 +++-----
 arch/arm64/kernel/stacktrace.c     | 20 ++++++--------------
 3 files changed, 10 insertions(+), 25 deletions(-)

---
diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
index 61d983f5756f..1510ccbd7cb2 100644
--- a/arch/arm64/kernel/perf_callchain.c
+++ b/arch/arm64/kernel/perf_callchain.c
@@ -165,12 +165,7 @@ void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
                return;
        }

-       frame.fp = regs->regs[29];
-       frame.pc = regs->pc;
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-       frame.graph = 0;
-#endif
-
+       start_backtrace(&frame, current, regs->regs[29], regs->pc);
        walk_stackframe(current, &frame, callchain_trace, entry);
 }

diff --git a/arch/arm64/kernel/return_address.c b/arch/arm64/kernel/return_address.c
index 53c40196b607..36f8888a5e76 100644
--- a/arch/arm64/kernel/return_address.c
+++ b/arch/arm64/kernel/return_address.c
@@ -41,11 +41,9 @@ void *return_address(unsigned int level)
        data.level = level + 2;
        data.addr = NULL;

-       frame.fp = (unsigned long)__builtin_frame_address(0);
-       frame.pc = (unsigned long)return_address; /* dummy */
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-       frame.graph = 0;
-#endif
+       start_backtrace(&frame, current,
+                       (unsigned long)__builtin_frame_address(0),
+                       (unsigned long)return_address);

        walk_stackframe(current, &frame, save_return_addr, &data);

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 1c45b33c7474..8a1fa90c784d 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -147,12 +147,7 @@ void save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
        data.skip = trace->skip;
        data.no_sched_functions = 0;

-       frame.fp = regs->regs[29];
-       frame.pc = regs->pc;
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-       frame.graph = 0;
-#endif
-
+       start_backtrace(&frame, current, regs->regs[29], regs->pc);
        walk_stackframe(current, &frame, save_trace, &data);
 }
 EXPORT_SYMBOL_GPL(save_stack_trace_regs);
@@ -171,18 +166,15 @@ static noinline void __save_stack_trace(struct task_struct *tsk,
        data.no_sched_functions = nosched;

        if (tsk != current) {
-               frame.fp = thread_saved_fp(tsk);
-               frame.pc = thread_saved_pc(tsk);
+               start_backtrace(&frame, tsk, thread_saved_fp(tsk),
+                               thread_saved_pc(tsk));
        } else {
                /* We don't want this function nor the caller */
                data.skip += 2;
-               frame.fp = (unsigned long)__builtin_frame_address(0);
-               frame.pc = (unsigned long)__save_stack_trace;
+               start_backtrace(&frame, tsk,
+                               (unsigned long)__builtin_frame_address(0),
+                               (unsigned long)__save_stack_trace);
        }
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-       frame.graph = 0;
-#endif
-
        walk_stackframe(tsk, &frame, save_trace, &data);

        put_task_stack(tsk);

----------------------------------%<----------------------------------


* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-06 12:54 ` [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks Mark Rutland
  2019-06-21 16:37   ` Dave Martin
  2019-06-24 11:34   ` James Morse
@ 2019-06-27 16:24   ` James Morse
  2019-06-28 11:15     ` Dave Martin
  2019-06-28 15:35     ` Mark Rutland
  2 siblings, 2 replies; 16+ messages in thread
From: James Morse @ 2019-06-27 16:24 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, tengfeif, will.deacon, dave.martin, linux-arm-kernel

Hi Mark,

On 06/06/2019 13:54, Mark Rutland wrote:
> The arm64 stacktrace code is careful to only dereference frame records
> in valid stack ranges, ensuring that a corrupted frame record won't
> result in a faulting access.
> 
> However, it's still possible for corrupt frame records to result in
> infinite loops in the stacktrace code, which is also undesirable.
> 
> This patch ensures that we complete a stacktrace in finite time, by
> keeping track of which stacks we have already completed unwinding, and
> verifying that if the next frame record is on the same stack, it is at a
> higher address.

I see this truncate stacks when walking from the SDEI handler...


> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index b00ec7d483d1..1c45b33c7474 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -43,6 +43,8 @@
>  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  {
>  	unsigned long fp = frame->fp;
> +	bool changed_stack = false;
> +	struct stack_info info;
>  
>  	if (fp & 0xf)
>  		return -EINVAL;
> @@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	if (!tsk)
>  		tsk = current;
>  
> -	if (!on_accessible_stack(tsk, fp, NULL))
> +	if (!on_accessible_stack(tsk, fp, &info))
>  		return -EINVAL;
>  
> +	if (test_bit(info.type, frame->stacks_done))
> +		return -EINVAL;
> +
> +	if (frame->stack_current != info.type) {
> +		set_bit(frame->stack_current, frame->stacks_done);
> +		frame->stack_current = info.type;
> +		changed_stack = true;
> +	}
> +
>  	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
>  	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));


> +	if (!changed_stack && frame->fp <= fp)
> +		return -EINVAL;

This is where it goes wrong. changed_stack is only true for the first frame on a newly
visited stack. This means for the last frame of the previous stack (which must point to
the next stack), we still require 'frame->fp <= fp'.

I think this just happens to be true for the irq stacks as they are allocated early
(whereas the SDEI stacks are allocated late).


This hunk fixes it for me:
------------------------------------%<------------------------------------
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 8a1fa90c784d..cb5dee233ede 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -43,7 +43,6 @@
 int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 {
        unsigned long fp = frame->fp;
-       bool changed_stack = false;
        struct stack_info info;

        if (fp & 0xf)
@@ -61,14 +60,16 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
        if (frame->stack_current != info.type) {
                set_bit(frame->stack_current, frame->stacks_done);
                frame->stack_current = info.type;
-               changed_stack = true;
        }

        frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
        frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));

-       if (!changed_stack && frame->fp <= fp)
-               return -EINVAL;
+       if (info.low <= frame->fp && frame->fp <= info.high) {
+               /* within stack bounds: the next frame must be older */
+               if (frame->fp <= fp)
+                       return -EINVAL;
+       }

 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
        if (tsk->ret_stack &&
------------------------------------%<------------------------------------

I've given this a kick with SDEI, perf and ftrace.
Let me know if it's easier for me to post a v2 with all these bits assembled.

(If this is picked onto arm64's for-next/core: the function_graph tracer works with
this series on v5.2-rc6, but it locks up on v5.2-rc4 even without this series.)


Thanks,

James


* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-27 16:24   ` James Morse
@ 2019-06-28 11:15     ` Dave Martin
  2019-06-28 13:02       ` Mark Rutland
  2019-06-28 15:35     ` Mark Rutland
  1 sibling, 1 reply; 16+ messages in thread
From: Dave Martin @ 2019-06-28 11:15 UTC (permalink / raw)
  To: James Morse
  Cc: Mark Rutland, catalin.marinas, tengfeif, will.deacon, linux-arm-kernel

On Thu, Jun 27, 2019 at 05:24:06PM +0100, James Morse wrote:
> Hi Mark,
> 
> On 06/06/2019 13:54, Mark Rutland wrote:
> > The arm64 stacktrace code is careful to only dereference frame records
> > in valid stack ranges, ensuring that a corrupted frame record won't
> > result in a faulting access.
> > 
> > However, it's still possible for corrupt frame records to result in
> > infinite loops in the stacktrace code, which is also undesirable.
> > 
> > This patch ensures that we complete a stacktrace in finite time, by
> > keeping track of which stacks we have already completed unwinding, and
> > verifying that if the next frame record is on the same stack, it is at a
> > higher address.
> 
> I see this truncate stacks when walking from the SDEI handler...
> 
> 
> > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > index b00ec7d483d1..1c45b33c7474 100644
> > --- a/arch/arm64/kernel/stacktrace.c
> > +++ b/arch/arm64/kernel/stacktrace.c
> > @@ -43,6 +43,8 @@
> >  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >  {
> >  	unsigned long fp = frame->fp;
> > +	bool changed_stack = false;
> > +	struct stack_info info;
> >  
> >  	if (fp & 0xf)
> >  		return -EINVAL;
> > @@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >  	if (!tsk)
> >  		tsk = current;
> >  
> > -	if (!on_accessible_stack(tsk, fp, NULL))
> > +	if (!on_accessible_stack(tsk, fp, &info))
> >  		return -EINVAL;
> >  
> > +	if (test_bit(info.type, frame->stacks_done))
> > +		return -EINVAL;
> > +
> > +	if (frame->stack_current != info.type) {
> > +		set_bit(frame->stack_current, frame->stacks_done);
> > +		frame->stack_current = info.type;
> > +		changed_stack = true;
> > +	}
> > +
> >  	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
> >  	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
> 
> 
> > +	if (!changed_stack && frame->fp <= fp)
> > +		return -EINVAL;
> 
> This is where it goes wrong. changed_stack is only true for the first
> frame on a newly visited stack. This means for the last frame of the
> previous stack (which must point to the next stack), we still
> require 'frame->fp <= fp'.
> 
> I think this just happens to be true for the irq stacks as they
> are allocated early (whereas the SDEI stacks are allocated late).

I don't understand this.

Either we are on an edge frame (changed_stack is true) or not (false).

If not, the two FPs are on the same stack and we should compare them.
Otherwise they're on different stacks and such a comparison is nonsense.

I don't see any third situation.

So, what's wrong here?

> This hunk fixes it for me:
> ------------------------------------%<------------------------------------
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c

[...]

> @@ -61,14 +60,16 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>         if (frame->stack_current != info.type) {
>                 set_bit(frame->stack_current, frame->stacks_done);
>                 frame->stack_current = info.type;
> -               changed_stack = true;
>         }
> 
>         frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
>         frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
> 
> -       if (!changed_stack && frame->fp <= fp)
> -               return -EINVAL;
> +       if (info.low <= frame->fp && frame->fp <= info.high) {
> +               /* within stack bounds: the next frame must be older */

Doesn't on_accessible_stack() already do this check?  This is how we
determined what stack we transitioned to in the first place.

Not saying there isn't a problem here, but I don't yet understand what
goes wrong...

[...]

Cheers
---Dave


* Re: [PATCH 2/3] arm64: stacktrace: Factor out backtrace initialisation
  2019-06-21 15:50   ` Dave Martin
@ 2019-06-28 11:27     ` Mark Rutland
  0 siblings, 0 replies; 16+ messages in thread
From: Mark Rutland @ 2019-06-28 11:27 UTC (permalink / raw)
  To: Dave Martin
  Cc: catalin.marinas, tengfeif, will.deacon, james.morse, linux-arm-kernel

On Fri, Jun 21, 2019 at 04:50:48PM +0100, Dave Martin wrote:
> On Thu, Jun 06, 2019 at 01:54:01PM +0100, Mark Rutland wrote:
> > From: Dave Martin <Dave.Martin@arm.com>
> > 
> > Some common code is required by each stacktrace user to initialise
> > struct stackframe before the first call to unwind_frame().
> > 
> > In preparation for adding to the common code, this patch factors it
> > out into a separate function start_backtrace(), and modifies the
> > stacktrace callers appropriately.
> > 
> > No functional change.
> > 
> > Signed-off-by: Dave Martin <dave.martin@arm.com>
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> 
> Ack (from memory you just added the CONFIG_FUNCTION_GRAPH_TRACER stuff,
> the lack of which was causing build failures in my original version).

You handled that in the version of the series this was picked from [1].

I removed the tsk parameter, which now gets added in patch 3. I can
re-add that to this patch, or leave this as-is -- I have no strong
feelings either way.

> Have you added any new calls to start_backtrace() here?

No; I have not added any new calls.

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-April/572687.html

> 
> Cheers
> ---Dave
> 
> > ---
> >  arch/arm64/include/asm/stacktrace.h | 10 ++++++++++
> >  arch/arm64/kernel/process.c         |  6 +-----
> >  arch/arm64/kernel/time.c            |  6 +-----
> >  arch/arm64/kernel/traps.c           | 13 ++++++-------
> >  4 files changed, 18 insertions(+), 17 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> > index 4dd569592e65..18f90bf1385c 100644
> > --- a/arch/arm64/include/asm/stacktrace.h
> > +++ b/arch/arm64/include/asm/stacktrace.h
> > @@ -142,4 +142,14 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
> >  	return false;
> >  }
> >  
> > +static inline void start_backtrace(struct stackframe *frame,
> > +				   unsigned long fp, unsigned long pc)
> > +{
> > +	frame->fp = fp;
> > +	frame->pc = pc;
> > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > +	frame->graph = 0;
> > +#endif
> > +}
> > +
> >  #endif	/* __ASM_STACKTRACE_H */
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 3767fb21a5b8..122d88fccd13 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -509,11 +509,7 @@ unsigned long get_wchan(struct task_struct *p)
> >  	if (!stack_page)
> >  		return 0;
> >  
> > -	frame.fp = thread_saved_fp(p);
> > -	frame.pc = thread_saved_pc(p);
> > -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > -	frame.graph = 0;
> > -#endif
> > +	start_backtrace(&frame, thread_saved_fp(p), thread_saved_pc(p));
> >  	do {
> >  		if (unwind_frame(p, &frame))
> >  			goto out;
> > diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
> > index a777ae90044d..aa3489f3a452 100644
> > --- a/arch/arm64/kernel/time.c
> > +++ b/arch/arm64/kernel/time.c
> > @@ -49,11 +49,7 @@ unsigned long profile_pc(struct pt_regs *regs)
> >  	if (!in_lock_functions(regs->pc))
> >  		return regs->pc;
> >  
> > -	frame.fp = regs->regs[29];
> > -	frame.pc = regs->pc;
> > -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > -	frame.graph = 0;
> > -#endif
> > +	start_backtrace(&frame, regs->regs[29], regs->pc);
> >  	do {
> >  		int ret = unwind_frame(NULL, &frame);
> >  		if (ret < 0)
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index ade32046f3fe..8053bbed8776 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -119,18 +119,17 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
> >  		return;
> >  
> >  	if (tsk == current) {
> > -		frame.fp = (unsigned long)__builtin_frame_address(0);
> > -		frame.pc = (unsigned long)dump_backtrace;
> > +		start_backtrace(&frame,
> > +				(unsigned long)__builtin_frame_address(0),
> > +				(unsigned long)dump_backtrace);
> >  	} else {
> >  		/*
> >  		 * task blocked in __switch_to
> >  		 */
> > -		frame.fp = thread_saved_fp(tsk);
> > -		frame.pc = thread_saved_pc(tsk);
> > +		start_backtrace(&frame,
> > +				thread_saved_fp(tsk),
> > +				thread_saved_pc(tsk));
> >  	}
> > -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > -	frame.graph = 0;
> > -#endif
> >  
> >  	printk("Call trace:\n");
> >  	do {
> > -- 
> > 2.11.0
> > 
> > 


* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-21 16:37   ` Dave Martin
@ 2019-06-28 11:32     ` Mark Rutland
  0 siblings, 0 replies; 16+ messages in thread
From: Mark Rutland @ 2019-06-28 11:32 UTC (permalink / raw)
  To: Dave Martin
  Cc: catalin.marinas, tengfeif, will.deacon, james.morse, linux-arm-kernel

On Fri, Jun 21, 2019 at 05:37:21PM +0100, Dave Martin wrote:
> On Thu, Jun 06, 2019 at 01:54:02PM +0100, Mark Rutland wrote:
> > The arm64 stacktrace code is careful to only dereference frame records
> > in valid stack ranges, ensuring that a corrupted frame record won't
> > result in a faulting access.
> > 
> > However, it's still possible for corrupt frame records to result in
> > infinite loops in the stacktrace code, which is also undesirable.
> > 
> > This patch ensures that we complete a stacktrace in finite time, by
> > keeping track of which stacks we have already completed unwinding, and
> > verifying that if the next frame record is on the same stack, it is at a
> > higher address.
> > 
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Dave Martin <dave.martin@arm.com>
> > Cc: James Morse <james.morse@arm.com>
> > Cc: Tengfei Fan <tengfeif@codeaurora.org>
> > Cc: Will Deacon <will.deacon@arm.com>
> > ---
> >  arch/arm64/include/asm/stacktrace.h | 34 ++++++++++++++++++++++++++--------
> >  arch/arm64/kernel/process.c         |  2 +-
> >  arch/arm64/kernel/stacktrace.c      | 16 +++++++++++++++-
> >  arch/arm64/kernel/time.c            |  2 +-
> >  arch/arm64/kernel/traps.c           |  4 ++--
> >  5 files changed, 45 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> > index 18f90bf1385c..4ebf8a8997b0 100644
> > --- a/arch/arm64/include/asm/stacktrace.h
> > +++ b/arch/arm64/include/asm/stacktrace.h
> > @@ -19,19 +19,12 @@
> >  #include <linux/percpu.h>
> >  #include <linux/sched.h>
> >  #include <linux/sched/task_stack.h>
> > +#include <linux/types.h>
> >  
> >  #include <asm/memory.h>
> >  #include <asm/ptrace.h>
> >  #include <asm/sdei.h>
> >  
> > -struct stackframe {
> > -	unsigned long fp;
> > -	unsigned long pc;
> > -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > -	int graph;
> > -#endif
> > -};
> > -
> >  enum stack_type {
> >  	STACK_TYPE_UNKNOWN,
> >  	STACK_TYPE_TASK,
> > @@ -39,6 +32,7 @@ enum stack_type {
> >  	STACK_TYPE_OVERFLOW,
> >  	STACK_TYPE_SDEI_NORMAL,
> >  	STACK_TYPE_SDEI_CRITICAL,
> > +	__NR_STACK_TYPES
> 
> The number of stack types is actually 1 less than this, and the zeroth
> bit in stacks_done doesn't get used if we use this enum as an index.
> 
> Would STACK_TYPE_UNKNOWN = 0 fix this, or would that break something
> elsewhere?

Huh? STACK_TYPE_UNKNOWN is 0, as it's the first entry in the enum.

__NR_STACK_TYPES is used for the bitmap, where I rely on being able to
set the STACK_TYPE_UNKNOWN bit. I appreciate it's one more than the
number of real stack types, but I wasn't able to come up with an
obviously better name.

> > @@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >  	if (!tsk)
> >  		tsk = current;
> >  
> > -	if (!on_accessible_stack(tsk, fp, NULL))
> > +	if (!on_accessible_stack(tsk, fp, &info))
> >  		return -EINVAL;
> >  
> > +	if (test_bit(info.type, frame->stacks_done))
> > +		return -EINVAL;
> 
> Doesn't this fire when we unwind a sequence of frames on the same stack
> (i.e., the common case)?
> 
> I may be missing something obvious here.
> 
> > +
> > +	if (frame->stack_current != info.type) {
> > +		set_bit(frame->stack_current, frame->stacks_done);
> 
> Oh, right, stacks_done is the set of stacks we have been on, excluding
> the current one?  If so, a comment somewhere explaining that, or some
> more explicit name, like "past_stacks" might make sense.

I thought that stacks_done was sufficient, but I guess I could
rename that to stacks_prev, to match the stack_current naming.

Thanks,
Mark.


* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-28 11:15     ` Dave Martin
@ 2019-06-28 13:02       ` Mark Rutland
  2019-07-01 10:48         ` Dave Martin
  0 siblings, 1 reply; 16+ messages in thread
From: Mark Rutland @ 2019-06-28 13:02 UTC (permalink / raw)
  To: Dave Martin
  Cc: catalin.marinas, tengfeif, James Morse, linux-arm-kernel, will.deacon

On Fri, Jun 28, 2019 at 12:15:03PM +0100, Dave Martin wrote:
> On Thu, Jun 27, 2019 at 05:24:06PM +0100, James Morse wrote:
> > Hi Mark,
> > 
> > On 06/06/2019 13:54, Mark Rutland wrote:
> > > The arm64 stacktrace code is careful to only dereference frame records
> > > in valid stack ranges, ensuring that a corrupted frame record won't
> > > result in a faulting access.
> > > 
> > > However, it's still possible for corrupt frame records to result in
> > > infinite loops in the stacktrace code, which is also undesirable.
> > > 
> > > This patch ensures that we complete a stacktrace in finite time, by
> > > keeping track of which stacks we have already completed unwinding, and
> > > verifying that if the next frame record is on the same stack, it is at a
> > > higher address.
> > 
> > I see this truncate stacks when walking from the SDEI handler...
> > 
> > 
> > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > > index b00ec7d483d1..1c45b33c7474 100644
> > > --- a/arch/arm64/kernel/stacktrace.c
> > > +++ b/arch/arm64/kernel/stacktrace.c
> > > @@ -43,6 +43,8 @@
> > >  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> > >  {
> > >  	unsigned long fp = frame->fp;
> > > +	bool changed_stack = false;
> > > +	struct stack_info info;
> > >  
> > >  	if (fp & 0xf)
> > >  		return -EINVAL;
> > > @@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> > >  	if (!tsk)
> > >  		tsk = current;
> > >  
> > > -	if (!on_accessible_stack(tsk, fp, NULL))
> > > +	if (!on_accessible_stack(tsk, fp, &info))
> > >  		return -EINVAL;
> > >  
> > > +	if (test_bit(info.type, frame->stacks_done))
> > > +		return -EINVAL;
> > > +
> > > +	if (frame->stack_current != info.type) {
> > > +		set_bit(frame->stack_current, frame->stacks_done);
> > > +		frame->stack_current = info.type;
> > > +		changed_stack = true;
> > > +	}
> > > +
> > >  	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
> > >  	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
> > 
> > 
> > > +	if (!changed_stack && frame->fp <= fp)
> > > +		return -EINVAL;
> > 
> > This is where it goes wrong. changed_stack is only true for the first
> > frame on a newly visited stack. This means for the last frame of the
> > previous stack (which must point to the next stack), we still
> > require 'frame->fp <= fp'.
> > 
> > I think this just happens to be true for the irq stacks as they
> > are allocated early (whereas the SDEI stacks are allocated late).
> 
> I don't understand this.

I ended up drawing a diagram to figure this out, and realised James is
right.

> Either we are on an edge frame (changed_stack is true) or not (false).
> 
> If not, the two FPs are on the same stack and we should compare them.
> Otherwise they're on different stacks and such a comparison is nonsense.
> 
> I don't see any third situation.
> 
> So, what's wrong here?

The problem is that we unwind one frame, then check the fp of that
frame.

Say we have three stack frames, A->B->C, where A and B are on the IRQ
stack, and C is on the task stack.

At entry to unwind_frame(), frame describes A, and frame->fp points at
B. Thus frame->stack_current == info.type, and changed_stack == false.

Then we sample B:

	frame->fp = READ_ONCE(fp); // points at C on the task stack

Then we do:

	if (!changed_stack && frame->fp <= fp)

... where changed_stack describes the A->B transition (false), but
frame->fp <= fp is the B->C transition, and B and C are on different
stacks!

Thanks,
Mark.


* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-27 16:24   ` James Morse
  2019-06-28 11:15     ` Dave Martin
@ 2019-06-28 15:35     ` Mark Rutland
  1 sibling, 0 replies; 16+ messages in thread
From: Mark Rutland @ 2019-06-28 15:35 UTC (permalink / raw)
  To: James Morse
  Cc: catalin.marinas, tengfeif, will.deacon, dave.martin, linux-arm-kernel

On Thu, Jun 27, 2019 at 05:24:06PM +0100, James Morse wrote:
> Hi Mark,
> 
> On 06/06/2019 13:54, Mark Rutland wrote:
> > The arm64 stacktrace code is careful to only dereference frame records
> > in valid stack ranges, ensuring that a corrupted frame record won't
> > result in a faulting access.
> > 
> > However, it's still possible for corrupt frame records to result in
> > infinite loops in the stacktrace code, which is also undesirable.
> > 
> > This patch ensures that we complete a stacktrace in finite time, by
> > keeping track of which stacks we have already completed unwinding, and
> > verifying that if the next frame record is on the same stack, it is at a
> > higher address.
> 
> I see this truncate stacks when walking from the SDEI handler...
> 
> 
> > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > index b00ec7d483d1..1c45b33c7474 100644
> > --- a/arch/arm64/kernel/stacktrace.c
> > +++ b/arch/arm64/kernel/stacktrace.c
> > @@ -43,6 +43,8 @@
> >  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >  {
> >  	unsigned long fp = frame->fp;
> > +	bool changed_stack = false;
> > +	struct stack_info info;
> >  
> >  	if (fp & 0xf)
> >  		return -EINVAL;
> > @@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >  	if (!tsk)
> >  		tsk = current;
> >  
> > -	if (!on_accessible_stack(tsk, fp, NULL))
> > +	if (!on_accessible_stack(tsk, fp, &info))
> >  		return -EINVAL;
> >  
> > +	if (test_bit(info.type, frame->stacks_done))
> > +		return -EINVAL;
> > +
> > +	if (frame->stack_current != info.type) {
> > +		set_bit(frame->stack_current, frame->stacks_done);
> > +		frame->stack_current = info.type;
> > +		changed_stack = true;
> > +	}
> > +
> >  	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
> >  	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
> 
> 
> > +	if (!changed_stack && frame->fp <= fp)
> > +		return -EINVAL;
> 
> This is where it goes wrong. changed_stack is only true for the first frame on a newly
> visited stack. This means for the last frame of the previous stack (which must point to
> the next stack), we still require 'frame->fp <= fp'.
> 
> I think this just happens to be true for the irq stacks as they are allocated early
> (whereas the SDEI stacks are allocated late).
> 
> 
> This hunk fixes it for me:
> ------------------------------------%<------------------------------------
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 8a1fa90c784d..cb5dee233ede 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -43,7 +43,6 @@
>  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  {
>         unsigned long fp = frame->fp;
> -       bool changed_stack = false;
>         struct stack_info info;
> 
>         if (fp & 0xf)
> @@ -61,14 +60,16 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>         if (frame->stack_current != info.type) {
>                 set_bit(frame->stack_current, frame->stacks_done);
>                 frame->stack_current = info.type;
> -               changed_stack = true;
>         }
> 
>         frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
>         frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
> 
> -       if (!changed_stack && frame->fp <= fp)
> -               return -EINVAL;
> +       if (info.low <= frame->fp && frame->fp <= info.high) {
> +               /* within stack bounds: the next frame must be older */
> +               if (frame->fp <= fp)
> +                       return -EINVAL;
> +       }

This fixes the cross-stack case, but it retains the check on the unwound
frame's fp, which may or may not be problematic, but it highlights that
we do something weird.

For frames A->B->C, we unwind A->B, then if C is on the same stack as B
we check whether B->C is sane.

I think that falls apart for cases which are already bad, e.g for:

	+---+
	| B |
	+---+
	| C |
	+---+
	| A |
	+---+

... we'd refuse to unwind A->B, whereas I think we should unwind A->B but
refuse to unwind B->C.

I think we need to keep track of the previous fp, and check that as part
of unwinding A->B. For the first unwind we can prime the frame with
STACK_TYPE_UNKNOWN, since any valid FP will be treated as a
transition from that stack.

That seems to work in local testing, so I'll have a v2 shortly...

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-06-28 13:02       ` Mark Rutland
@ 2019-07-01 10:48         ` Dave Martin
  2019-07-01 11:22           ` Mark Rutland
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Martin @ 2019-07-01 10:48 UTC (permalink / raw)
  To: Mark Rutland
  Cc: will.deacon, catalin.marinas, tengfeif, James Morse, linux-arm-kernel

On Fri, Jun 28, 2019 at 02:02:55PM +0100, Mark Rutland wrote:
> On Fri, Jun 28, 2019 at 12:15:03PM +0100, Dave Martin wrote:
> > On Thu, Jun 27, 2019 at 05:24:06PM +0100, James Morse wrote:
> > > Hi Mark,
> > > 
> > > On 06/06/2019 13:54, Mark Rutland wrote:
> > > > The arm64 stacktrace code is careful to only dereference frame records
> > > > in valid stack ranges, ensuring that a corrupted frame record won't
> > > > result in a faulting access.
> > > > 
> > > > However, it's still possible for corrupt frame records to result in
> > > > infinite loops in the stacktrace code, which is also undesirable.
> > > > 
> > > > This patch ensures that we complete a stacktrace in finite time, by
> > > > keeping track of which stacks we have already completed unwinding, and
> > > > verifying that if the next frame record is on the same stack, it is at a
> > > > higher address.
> > > 
> > > I see this truncate stacks when walking from the SDEI handler...
> > > 
> > > 
> > > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > > > index b00ec7d483d1..1c45b33c7474 100644
> > > > --- a/arch/arm64/kernel/stacktrace.c
> > > > +++ b/arch/arm64/kernel/stacktrace.c
> > > > @@ -43,6 +43,8 @@
> > > >  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> > > >  {
> > > >  	unsigned long fp = frame->fp;
> > > > +	bool changed_stack = false;
> > > > +	struct stack_info info;
> > > >  
> > > >  	if (fp & 0xf)
> > > >  		return -EINVAL;
> > > > @@ -50,12 +52,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> > > >  	if (!tsk)
> > > >  		tsk = current;
> > > >  
> > > > -	if (!on_accessible_stack(tsk, fp, NULL))
> > > > +	if (!on_accessible_stack(tsk, fp, &info))
> > > >  		return -EINVAL;
> > > >  
> > > > +	if (test_bit(info.type, frame->stacks_done))
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (frame->stack_current != info.type) {
> > > > +		set_bit(frame->stack_current, frame->stacks_done);
> > > > +		frame->stack_current = info.type;
> > > > +		changed_stack = true;
> > > > +	}
> > > > +
> > > >  	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
> > > >  	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
> > > 
> > > 
> > > > +	if (!changed_stack && frame->fp <= fp)
> > > > +		return -EINVAL;
> > > 
> > > This is where it goes wrong. changed_stack is only true for the first
> > > frame on a newly visited stack. This means for the last frame of the
> > > previous stack, (which must point to the next stack), we still
> > > require 'frame->fp <= fp'.
> > > 
> > > I think this just happens to be true for the irq stacks as they
> > > are allocated early.  (whereas the SDEI stacks are allocated late).
> > 
> > I don't understand this.
> 
> I ended up drawing a diagram to figure this out, and realised James is
> right.
> 
> > Either we are on an edge frame (changed_stack is true) or not (false).
> > 
> > If not, the two FPs are on the same stack and we should compare them.
> > Otherwise they're on different stacks and such a comparison is nonsense.
> > 
> > I don't see any third situation.
> > 
> > So, what's wrong here?
> 
> The problem is that we unwind one frame, then check the fp of that
> frame.
> 
> Say we have three stack frames, A->B->C, where A and B are on the IRQ
> stack, and C is on the task stack.
> 
> At entry to unwind_frame(), frame describes A, and frame->fp points at
> B. Thus frame->stack_current == info.type, and changed_stack == false.
> 
> Then we sample B:
> 
> > 	frame->fp = READ_ONCE(fp); // points at C on the task stack
> 
> Then we do:
> 
> 	if (!changed_stack && frame->fp <= fp)
> 
> ... where changed_stack describes the A->B transition (false), but
> frame->fp <= fp is the B->C transition, and B and C are on different
> stacks!

OK, if I've understood that right, it looks like frame->stack_current
describes where the contents of frame were fetched from, not the frame
at frame->fp.

This is actually pretty confusing: the frame that stack_current refers to
is already history; we have no pointer to it any more anyway.

I wonder whether this can be refactored so that stack_current doesn't
lag behind: say, call it fp_stack (the stack frame->fp points at).

That's just one option though.  I'll take a look at the repost.

Cheers
---Dave



* Re: [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks
  2019-07-01 10:48         ` Dave Martin
@ 2019-07-01 11:22           ` Mark Rutland
  0 siblings, 0 replies; 16+ messages in thread
From: Mark Rutland @ 2019-07-01 11:22 UTC (permalink / raw)
  To: Dave Martin
  Cc: will.deacon, catalin.marinas, tengfeif, James Morse, linux-arm-kernel

On Mon, Jul 01, 2019 at 11:48:21AM +0100, Dave Martin wrote:
> On Fri, Jun 28, 2019 at 02:02:55PM +0100, Mark Rutland wrote:
> > The problem is that we unwind one frame, then check the fp of that
> > frame.
> > 
> > Say we have three stack frames, A->B->C, where A and B are on the IRQ
> > stack, and C is on the task stack.
> > 
> > At entry to unwind_frame(), frame describes A, and frame->fp points at
> > B. Thus frame->stack_current == info.type, and changed_stack == false.
> > 
> > Then we sample B:
> > 
> > 	frame->fp = READ_ONCE(fp); // points at C on the task stack
> > 
> > Then we do:
> > 
> > 	if (!changed_stack && frame->fp <= fp)
> > 
> > ... where changed_stack describes the A->B transition (false), but
> > frame->fp <= fp is the B->C transition, and B and C are on different
> > stacks!
> 
> OK, if I've understood that right, it looks like frame->stack_current
> describes where the contents of frame were fetched from, not the frame
> at frame->fp.
> 
> This is actually pretty confusing: the frame that stack_current refers to
> is already history; we have no pointer to it any more anyway.
> 
> I wonder whether this can be refactored so that stack_current doesn't
> lag behind: say, call it fp_stack (the stack frame->fp points at).
> 
> That's just one option though.  I'll take a look at the repost.

For v2 I added prev_fp, and renamed stack_current to prev_type.

We need the prev_fp so that we can do the intra-stack monotonicity
check. We can derive prev_type from prev_fp, but it was simpler to store
prev_type than to recalculate it.

Thanks,
Mark.



Thread overview: 16+ messages
2019-06-06 12:53 [PATCH 0/3] arm64: stacktrace: improve robustness Mark Rutland
2019-06-06 12:54 ` [PATCH 1/3] arm64: stacktrace: Constify stacktrace.h functions Mark Rutland
2019-06-06 12:54 ` [PATCH 2/3] arm64: stacktrace: Factor out backtrace initialisation Mark Rutland
2019-06-21 15:50   ` Dave Martin
2019-06-28 11:27     ` Mark Rutland
2019-06-06 12:54 ` [PATCH 3/3] arm64: stacktrace: better handle corrupted stacks Mark Rutland
2019-06-21 16:37   ` Dave Martin
2019-06-28 11:32     ` Mark Rutland
2019-06-24 11:34   ` James Morse
2019-06-25 10:28     ` James Morse
2019-06-27 16:24   ` James Morse
2019-06-28 11:15     ` Dave Martin
2019-06-28 13:02       ` Mark Rutland
2019-07-01 10:48         ` Dave Martin
2019-07-01 11:22           ` Mark Rutland
2019-06-28 15:35     ` Mark Rutland
