linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-next v2 0/4] riscv: Improvments for stacktrace
@ 2022-09-21 12:51 Chen Zhongjin
  2022-09-21 12:51 ` [PATCH for-next v2 1/4] riscv: stacktrace: Replace walk_stackframe with arch_stack_walk Chen Zhongjin
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Chen Zhongjin @ 2022-09-21 12:51 UTC (permalink / raw)
  To: linux-kernel, linux-riscv, linux-perf-users
  Cc: paul.walmsley, palmer, aou, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung, jolsa, guoren, frederic,
	vincent.chen, ardb, mhiramat, rostedt, keescook, catalin.marinas,
	chenzhongjin

Currently, the stacktrace with FRAME_POINTER on riscv has some problem:

1. stacktrace will stop at irq so it can't get the stack frames before
irq entry.
2. stacktrace can't unwind all the real stack frames when there is
k{ret}probes or ftrace.

These are mainly becase when there is a pt_regs on stack, we can't unwind
the stack frame as normal function.

Some architectures (e.g. arm64) create a extra stackframe inside pt_regs.
However this doesn't work for riscv because the ra is not ensured to be
pushed to stack. As explained in:
commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")

So, I choosed the method of x86 that, if there is a pt_regs on stack,
we encoded the frame pointer and save it. When unwinding stack frame,
we can get pt_regs and registers required for unwinding stacks.

In addition, the patch set contains some refactoring of stacktrace.c to
keep the stacktrace code on riscv consistent with other architectures.

Stacktrace before for kretprobes:

  Call Trace:
  ...
  [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
  [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
  [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
  [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
  [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
  ...

Stacktrace after:

  Call Trace:
  ...
  [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
  [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
  [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
+ [<ffffffff01633076>] the_caller+0x2c/0x38 [kprobe_unwind]
  [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
  [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
  ...

Stacktrace before for ftrace:

  Call Trace:
  ...
  [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
  [<ffffffff80008e7e>] ftrace_regs_call+0x8/0x10
  [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
  [<ffffffff8008a4e6>] do_init_module+0x56/0x210
  ...

  Stacktrace after:

  Call Trace:
  ...
  [<ffffffff016150e0>] handler_pre+0x30/0x4a [kprobe_unwind]
  [<ffffffff800bce96>] aggr_pre_handler+0x60/0x94
  [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
  [<ffffffff80008e82>] ftrace_regs_call+0x8/0x10
+ [<ffffffff01615000>] traced_func+0x0/0x1e [kprobe_unwind]
  [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
  [<ffffffff8008a4ea>] do_init_module+0x56/0x210
  ...

Noticed that the caller of ftrace and probed func of kretprobe
cannot be unwind because they are inside function pro/epilogue.

---
v1 -> v2:
- Merge three patches which add ENCODE_FRAME_POINTER together
- Update commit message
- Delete the KRETPORBES stuff added in unwind_state, we don't need them
to recover the kretporbes ret_addr because we can get it in pt_regs
---
Chen Zhongjin (4):
  riscv: stacktrace: Replace walk_stackframe with arch_stack_walk
  riscv: stacktrace: Introduce unwind functions
  riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER
  riscv: stacktrace: Implement stacktrace for irq

 arch/riscv/include/asm/frame.h                |  45 ++++++
 arch/riscv/include/asm/stacktrace.h           |   9 +-
 arch/riscv/kernel/entry.S                     |   3 +
 arch/riscv/kernel/mcount-dyn.S                |   7 +
 arch/riscv/kernel/perf_callchain.c            |   2 +-
 arch/riscv/kernel/probes/kprobes_trampoline.S |   7 +
 arch/riscv/kernel/stacktrace.c                | 150 ++++++++++++------
 7 files changed, 174 insertions(+), 49 deletions(-)
 create mode 100644 arch/riscv/include/asm/frame.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH for-next v2 1/4] riscv: stacktrace: Replace walk_stackframe with arch_stack_walk
  2022-09-21 12:51 [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Chen Zhongjin
@ 2022-09-21 12:51 ` Chen Zhongjin
  2022-09-21 12:51 ` [PATCH for-next v2 2/4] riscv: stacktrace: Introduce unwind functions Chen Zhongjin
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Chen Zhongjin @ 2022-09-21 12:51 UTC (permalink / raw)
  To: linux-kernel, linux-riscv, linux-perf-users
  Cc: paul.walmsley, palmer, aou, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung, jolsa, guoren, frederic,
	vincent.chen, ardb, mhiramat, rostedt, keescook, catalin.marinas,
	chenzhongjin

walk_stackframe can be all replaced by arch_stack_walk.

Since walk_stackframe is only called by arch_stack_walk and their only
difference is the argument sequence.

Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
---
 arch/riscv/include/asm/stacktrace.h |  2 --
 arch/riscv/kernel/perf_callchain.c  |  2 +-
 arch/riscv/kernel/stacktrace.c      | 29 +++++++++++++----------------
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/riscv/include/asm/stacktrace.h b/arch/riscv/include/asm/stacktrace.h
index 3450c1912afd..b6cd3eddfd38 100644
--- a/arch/riscv/include/asm/stacktrace.h
+++ b/arch/riscv/include/asm/stacktrace.h
@@ -11,8 +11,6 @@ struct stackframe {
 	unsigned long ra;
 };
 
-extern void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
-				    bool (*fn)(void *, unsigned long), void *arg);
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *task,
 			   const char *loglvl);
 
diff --git a/arch/riscv/kernel/perf_callchain.c b/arch/riscv/kernel/perf_callchain.c
index 3348a61de7d9..c023e0b1eb81 100644
--- a/arch/riscv/kernel/perf_callchain.c
+++ b/arch/riscv/kernel/perf_callchain.c
@@ -74,5 +74,5 @@ static bool fill_callchain(void *entry, unsigned long pc)
 void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
 			   struct pt_regs *regs)
 {
-	walk_stackframe(NULL, regs, fill_callchain, entry);
+	arch_stack_walk(fill_callchain, entry, NULL, regs);
 }
diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index 08d11a53f39e..b51e32d50a0e 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -16,8 +16,9 @@
 
 #ifdef CONFIG_FRAME_POINTER
 
-void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
-			     bool (*fn)(void *, unsigned long), void *arg)
+noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
+		  void *cookie, struct task_struct *task,
+		  struct pt_regs *regs)
 {
 	unsigned long fp, sp, pc;
 	int level = 0;
@@ -29,7 +30,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
 	} else if (task == NULL || task == current) {
 		fp = (unsigned long)__builtin_frame_address(0);
 		sp = current_stack_pointer;
-		pc = (unsigned long)walk_stackframe;
+		pc = (unsigned long)arch_stack_walk;
 	} else {
 		/* task blocked in __switch_to */
 		fp = task->thread.s[0];
@@ -41,7 +42,8 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
 		unsigned long low, high;
 		struct stackframe *frame;
 
-		if (unlikely(!__kernel_text_address(pc) || (level++ >= 1 && !fn(arg, pc))))
+		if (unlikely(!__kernel_text_address(pc) ||
+		   (level++ >= 1 && !consume_entry(cookie, pc))))
 			break;
 
 		/* Validate frame pointer */
@@ -66,8 +68,9 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
 
 #else /* !CONFIG_FRAME_POINTER */
 
-void notrace walk_stackframe(struct task_struct *task,
-	struct pt_regs *regs, bool (*fn)(void *, unsigned long), void *arg)
+noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
+		  void *cookie, struct task_struct *task,
+		  struct pt_regs *regs)
 {
 	unsigned long sp, pc;
 	unsigned long *ksp;
@@ -77,7 +80,7 @@ void notrace walk_stackframe(struct task_struct *task,
 		pc = instruction_pointer(regs);
 	} else if (task == NULL || task == current) {
 		sp = current_stack_pointer;
-		pc = (unsigned long)walk_stackframe;
+		pc = (unsigned long)arch_stack_walk;
 	} else {
 		/* task blocked in __switch_to */
 		sp = task->thread.sp;
@@ -89,7 +92,7 @@ void notrace walk_stackframe(struct task_struct *task,
 
 	ksp = (unsigned long *)sp;
 	while (!kstack_end(ksp)) {
-		if (__kernel_text_address(pc) && unlikely(!fn(arg, pc)))
+		if (__kernel_text_address(pc) && unlikely(!consume_entry(cookie, pc)))
 			break;
 		pc = (*ksp++) - 0x4;
 	}
@@ -108,7 +111,7 @@ static bool print_trace_address(void *arg, unsigned long pc)
 noinline void dump_backtrace(struct pt_regs *regs, struct task_struct *task,
 		    const char *loglvl)
 {
-	walk_stackframe(task, regs, print_trace_address, (void *)loglvl);
+	arch_stack_walk(print_trace_address, (void *)loglvl, task, regs);
 }
 
 void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl)
@@ -133,13 +136,7 @@ unsigned long __get_wchan(struct task_struct *task)
 
 	if (!try_get_task_stack(task))
 		return 0;
-	walk_stackframe(task, NULL, save_wchan, &pc);
+	arch_stack_walk(save_wchan, &pc, task, NULL);
 	put_task_stack(task);
 	return pc;
 }
-
-noinline void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
-		     struct task_struct *task, struct pt_regs *regs)
-{
-	walk_stackframe(task, regs, consume_entry, cookie);
-}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH for-next v2 2/4] riscv: stacktrace: Introduce unwind functions
  2022-09-21 12:51 [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Chen Zhongjin
  2022-09-21 12:51 ` [PATCH for-next v2 1/4] riscv: stacktrace: Replace walk_stackframe with arch_stack_walk Chen Zhongjin
@ 2022-09-21 12:51 ` Chen Zhongjin
  2022-09-21 12:51 ` [PATCH for-next v2 3/4] riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER Chen Zhongjin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Chen Zhongjin @ 2022-09-21 12:51 UTC (permalink / raw)
  To: linux-kernel, linux-riscv, linux-perf-users
  Cc: paul.walmsley, palmer, aou, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung, jolsa, guoren, frederic,
	vincent.chen, ardb, mhiramat, rostedt, keescook, catalin.marinas,
	chenzhongjin

Now all riscv unwinding code is inside arch_stack_walk. It's
not same as other architectures.

Make some refactoring, to move unwinding code into unwind() and
unwind_next() functions, which walks through all stack frames
or single frame.

This patch only moves code but doesn't make any logical change.

Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
---
 arch/riscv/include/asm/stacktrace.h |   7 ++
 arch/riscv/kernel/stacktrace.c      | 104 ++++++++++++++++++----------
 2 files changed, 74 insertions(+), 37 deletions(-)

diff --git a/arch/riscv/include/asm/stacktrace.h b/arch/riscv/include/asm/stacktrace.h
index b6cd3eddfd38..a39e4ef1dbd5 100644
--- a/arch/riscv/include/asm/stacktrace.h
+++ b/arch/riscv/include/asm/stacktrace.h
@@ -11,6 +11,13 @@ struct stackframe {
 	unsigned long ra;
 };
 
+struct unwind_state {
+	unsigned long fp;
+	unsigned long sp;
+	unsigned long pc;
+	struct pt_regs *regs;
+};
+
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *task,
 			   const char *loglvl);
 
diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index b51e32d50a0e..e84e21868a3e 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -16,54 +16,84 @@
 
 #ifdef CONFIG_FRAME_POINTER
 
-noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
-		  void *cookie, struct task_struct *task,
-		  struct pt_regs *regs)
+static int notrace unwind_next(struct unwind_state *state)
 {
-	unsigned long fp, sp, pc;
-	int level = 0;
+	unsigned long low, high, fp;
+	struct stackframe *frame;
 
-	if (regs) {
-		fp = frame_pointer(regs);
-		sp = user_stack_pointer(regs);
-		pc = instruction_pointer(regs);
-	} else if (task == NULL || task == current) {
-		fp = (unsigned long)__builtin_frame_address(0);
-		sp = current_stack_pointer;
-		pc = (unsigned long)arch_stack_walk;
+	fp = state->fp;
+
+	/* Validate frame pointer */
+	low = state->sp + sizeof(struct stackframe);
+	high = ALIGN(low, THREAD_SIZE);
+
+	if (fp < low || fp > high || fp & 0x7)
+		return -EINVAL;
+
+	/* Unwind stack frame */
+	frame = (struct stackframe *)fp - 1;
+	state->sp = fp;
+
+	if (state->regs && state->regs->epc == state->pc &&
+		fp & 0x7) {
+		state->fp = frame->ra;
+		state->pc = state->regs->ra;
 	} else {
-		/* task blocked in __switch_to */
-		fp = task->thread.s[0];
-		sp = task->thread.sp;
-		pc = task->thread.ra;
+		state->fp = frame->fp;
+		state->pc = ftrace_graph_ret_addr(current, NULL, frame->ra,
+							(unsigned long *)fp - 1);
 	}
 
-	for (;;) {
-		unsigned long low, high;
-		struct stackframe *frame;
+	return 0;
+}
 
-		if (unlikely(!__kernel_text_address(pc) ||
-		   (level++ >= 1 && !consume_entry(cookie, pc))))
+static void notrace unwind(struct unwind_state *state,
+				stack_trace_consume_fn consume_entry, void *cookie)
+{
+	while (1) {
+		int ret;
+
+		if (!__kernel_text_address(state->pc))
+			break;
+
+		if (!consume_entry(cookie, state->pc))
 			break;
 
-		/* Validate frame pointer */
-		low = sp + sizeof(struct stackframe);
-		high = ALIGN(sp, THREAD_SIZE);
-		if (unlikely(fp < low || fp > high || fp & 0x7))
+		ret = unwind_next(state);
+		if (ret < 0)
 			break;
-		/* Unwind stack frame */
-		frame = (struct stackframe *)fp - 1;
-		sp = fp;
-		if (regs && (regs->epc == pc) && (frame->fp & 0x7)) {
-			fp = frame->ra;
-			pc = regs->ra;
-		} else {
-			fp = frame->fp;
-			pc = ftrace_graph_ret_addr(current, NULL, frame->ra,
-						   (unsigned long *)(fp - 8));
-		}
+	}
+}
+
+noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
+		  void *cookie, struct task_struct *task,
+		  struct pt_regs *regs)
+{
+	struct unwind_state state;
+
+	if (task == NULL)
+		task = current;
 
+	if (regs) {
+		state.fp = frame_pointer(regs);
+		state.sp = user_stack_pointer(regs);
+		state.pc = instruction_pointer(regs);
+		state.regs = regs;
+	} else if (task == current) {
+		state.fp = (unsigned long)__builtin_frame_address(0);
+		state.sp = current_stack_pointer;
+		state.pc = (unsigned long)arch_stack_walk;
+
+		/* skip frame of arch_stack_walk */
+		unwind_next(&state);
+	} else {
+		/* task blocked in __switch_to */
+		state.fp = task->thread.s[0];
+		state.sp = task->thread.sp;
+		state.pc = task->thread.ra;
 	}
+
+	unwind(&state, consume_entry, cookie);
 }
 
 #else /* !CONFIG_FRAME_POINTER */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH for-next v2 3/4] riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER
  2022-09-21 12:51 [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Chen Zhongjin
  2022-09-21 12:51 ` [PATCH for-next v2 1/4] riscv: stacktrace: Replace walk_stackframe with arch_stack_walk Chen Zhongjin
  2022-09-21 12:51 ` [PATCH for-next v2 2/4] riscv: stacktrace: Introduce unwind functions Chen Zhongjin
@ 2022-09-21 12:51 ` Chen Zhongjin
  2022-09-21 12:51 ` [PATCH for-next v2 4/4] riscv: stacktrace: Implement stacktrace for irq Chen Zhongjin
  2022-09-21 14:02 ` [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Mark Rutland
  4 siblings, 0 replies; 8+ messages in thread
From: Chen Zhongjin @ 2022-09-21 12:51 UTC (permalink / raw)
  To: linux-kernel, linux-riscv, linux-perf-users
  Cc: paul.walmsley, palmer, aou, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung, jolsa, guoren, frederic,
	vincent.chen, ardb, mhiramat, rostedt, keescook, catalin.marinas,
	chenzhongjin

To support stack unwinding when there is a pt_regs on stack, the position
of pt_regs is nessesary. Because for some functions, compiler only push
s0/fp on stack without ra.

As the situation described in
commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")

When irq happens there, the function frame looks like:

prev function	|       ...       |
		|                 |
normal function +-----------------+
		| ra to prev      |
		| s0 of prev      |
		|       ...       |<-+
leaf function	+-----------------+  |
		| s0 of normal    |  |
		| empty slot      |  |
irq pt_regs	+-----------------+  |
		| epc (ra to leaf)|  |
		| ra  (ra to norm)|  |
		| ...             |  |
                | s0 of leaf      |--+
		| ...             |
		+-----------------+

When unwinding from unwinding from leaf to normal, beacause (ra to norm)
is saved in pt_regs, but not stackframe of leaf, we have to get pt_regs
for that.

To get pt_regs position on stack, we can save the encoded *pt_regs
in s0, as x86 architecture did. Then we can get s0, epc and ra
easily.

Add ENCODE_FRAME_POINTER in irq, ftrace and kretprobe trampoline entries.
For ftrace and kretprobes we should also set pt_regs as kernel mode
so that they are not mistaken as user_mode pt_regs.

Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/frame.h                | 45 +++++++++++++++++++
 arch/riscv/kernel/entry.S                     |  3 ++
 arch/riscv/kernel/mcount-dyn.S                |  7 +++
 arch/riscv/kernel/probes/kprobes_trampoline.S |  7 +++
 4 files changed, 62 insertions(+)
 create mode 100644 arch/riscv/include/asm/frame.h

diff --git a/arch/riscv/include/asm/frame.h b/arch/riscv/include/asm/frame.h
new file mode 100644
index 000000000000..2a1f45cf3a4e
--- /dev/null
+++ b/arch/riscv/include/asm/frame.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_FRAME_H
+#define _ASM_RISCV_FRAME_H
+
+#include <asm/asm.h>
+
+#ifdef CONFIG_FRAME_POINTER
+
+#ifdef __ASSEMBLY__
+
+/*
+ * This is a sneaky trick to help the unwinder find pt_regs on the stack.  The
+ * frame pointer is replaced with an encoded pointer to pt_regs.  The encoding
+ * is just setting the LSB, which makes it an invalid stack address and is also
+ * a signal to the unwinder that it's a pt_regs pointer in disguise.
+ *
+ * This macro must be used when sp point to pt_regs
+ */
+.macro ENCODE_FRAME_POINTER
+	add s0, sp, 0x1
+.endm
+
+#else /* !__ASSEMBLY__ */
+
+#define ENCODE_FRAME_POINTER			\
+	"add s0, sp, 0x1\n\t"
+
+#endif /* __ASSEMBLY__ */
+
+#else /* !CONFIG_FRAME_POINTER */
+
+#ifdef __ASSEMBLY__
+
+.macro ENCODE_FRAME_POINTER ptregs_offset=0
+.endm
+
+#else /* !__ASSEMBLY */
+
+#define ENCODE_FRAME_POINTER
+
+#endif /* !__ASSEMBLY */
+
+#endif /* CONFIG_FRAME_POINTER */
+
+#endif /* _ASM_RISCV_FRAME_H */
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index b9eda3fcbd6d..ecb15c7430b4 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -13,6 +13,7 @@
 #include <asm/thread_info.h>
 #include <asm/asm-offsets.h>
 #include <asm/errata_list.h>
+#include <asm/frame.h>
 
 #if !IS_ENABLED(CONFIG_PREEMPTION)
 .set resume_kernel, restore_all
@@ -95,6 +96,8 @@ _save_context:
 	REG_S s4, PT_CAUSE(sp)
 	REG_S s5, PT_TP(sp)
 
+	ENCODE_FRAME_POINTER
+
 	/*
 	 * Set the scratch register to 0, so that if a recursive exception
 	 * occurs, the exception vector knows it came from the kernel
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index d171eca623b6..a362521e030e 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -10,6 +10,8 @@
 #include <asm/asm-offsets.h>
 #include <asm-generic/export.h>
 #include <asm/ftrace.h>
+#include <asm/frame.h>
+#include <asm/csr.h>
 
 	.text
 
@@ -172,6 +174,11 @@ ENDPROC(ftrace_caller)
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
 ENTRY(ftrace_regs_caller)
 	SAVE_ALL
+#ifdef CONFIG_FRAME_POINTER
+	li s0, SR_PP
+	REG_S s0, PT_STATUS(sp)
+	ENCODE_FRAME_POINTER
+#endif
 
 	addi	a0, ra, -FENTRY_RA_OFFSET
 	la	a1, function_trace_op
diff --git a/arch/riscv/kernel/probes/kprobes_trampoline.S b/arch/riscv/kernel/probes/kprobes_trampoline.S
index 7bdb09ded39b..70760a16784f 100644
--- a/arch/riscv/kernel/probes/kprobes_trampoline.S
+++ b/arch/riscv/kernel/probes/kprobes_trampoline.S
@@ -6,6 +6,8 @@
 
 #include <asm/asm.h>
 #include <asm/asm-offsets.h>
+#include <asm/frame.h>
+#include <asm/csr.h>
 
 	.text
 	.altmacro
@@ -78,6 +80,11 @@
 ENTRY(__kretprobe_trampoline)
 	addi sp, sp, -(PT_SIZE_ON_STACK)
 	save_all_base_regs
+#ifdef CONFIG_FRAME_POINTER
+	li s0, SR_PP
+	REG_S s0, PT_STATUS(sp)
+	ENCODE_FRAME_POINTER
+#endif
 
 	move a0, sp /* pt_regs */
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH for-next v2 4/4] riscv: stacktrace: Implement stacktrace for irq
  2022-09-21 12:51 [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Chen Zhongjin
                   ` (2 preceding siblings ...)
  2022-09-21 12:51 ` [PATCH for-next v2 3/4] riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER Chen Zhongjin
@ 2022-09-21 12:51 ` Chen Zhongjin
  2022-09-21 14:02 ` [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Mark Rutland
  4 siblings, 0 replies; 8+ messages in thread
From: Chen Zhongjin @ 2022-09-21 12:51 UTC (permalink / raw)
  To: linux-kernel, linux-riscv, linux-perf-users
  Cc: paul.walmsley, palmer, aou, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung, jolsa, guoren, frederic,
	vincent.chen, ardb, mhiramat, rostedt, keescook, catalin.marinas,
	chenzhongjin

After adding encoded fp onto stack to record pt_regs, now the
unwinder have ability to unwind frame through irq.

There is two steps to unwind irq frame and the interrupted frame:

1. When there is an encoded fp on stack, we can get the pt_regs
and unwind frame by (regs->epc) and (regs->s0).

2. To unwind the interrupted frame, there is two possibilities,
we can determine the situation by checking whether the value in
frame->ra position is a fp value.

If there is a fp in ra position:
We are inside a leaf frame and there is only fp on ra position.
Get fp from ra position and get next pc from pt_regs.
Else:
Just get fp and next pc from stack frame.

Stacktrace before this patch:

 Call Trace:
  ...
  [<ffffffff800aa692>] __flush_smp_call_function_queue+0xde/0x1fa
  [<ffffffff800ab404>] generic_smp_call_function_single_interrupt+0x22/0x2a
  [<ffffffff800077b2>] handle_IPI+0xaa/0x108
  [<ffffffff803f827e>] riscv_intc_irq+0x56/0x6e
  [<ffffffff808d94b6>] generic_handle_arch_irq+0x4c/0x76
  [<ffffffff80003ad0>] ret_from_exception+0x0/0xc

Stacktrace after this patch:

 Call Trace:
  ...
  [<ffffffff800aa6da>] __flush_smp_call_function_queue+0xde/0x1fa
  [<ffffffff800ab44c>] generic_smp_call_function_single_interrupt+0x22/0x2a
  [<ffffffff800077fa>] handle_IPI+0xaa/0x108
  [<ffffffff803f82c6>] riscv_intc_irq+0x56/0x6e
  [<ffffffff808d94fe>] generic_handle_arch_irq+0x4c/0x76
  [<ffffffff80003ad0>] ret_from_exception+0x0/0xc
+ [<ffffffff80003d52>] arch_cpu_idle+0x22/0x28
+ [<ffffffff808e23a8>] default_idle_call+0x44/0xee
+ [<ffffffff80056ece>] do_idle+0x116/0x126
+ [<ffffffff8005706e>] cpu_startup_entry+0x36/0x38
+ [<ffffffff808d99ae>] kernel_init+0x0/0x15a
+ [<ffffffff80a007a0>] arch_post_acpi_subsys_init+0x0/0x38
+ [<ffffffff80a0100c>] start_kernel+0x7c4/0x7f2

Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
---
 arch/riscv/kernel/stacktrace.c | 45 ++++++++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index e84e21868a3e..976dc298ab3b 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -16,29 +16,60 @@
 
 #ifdef CONFIG_FRAME_POINTER
 
+static struct pt_regs *decode_frame_pointer(unsigned long fp)
+{
+	if (!(fp & 0x1))
+		return NULL;
+
+	return (struct pt_regs *)(fp & ~0x1);
+}
+
 static int notrace unwind_next(struct unwind_state *state)
 {
 	unsigned long low, high, fp;
 	struct stackframe *frame;
+	struct pt_regs *regs;
 
-	fp = state->fp;
+	regs = decode_frame_pointer(state->fp);
 
 	/* Validate frame pointer */
-	low = state->sp + sizeof(struct stackframe);
+	if (regs) {
+		if user_mode(regs)
+			return -1;
+
+		fp = (unsigned long)regs;
+		low = state->sp;
+	} else {
+		fp = state->fp;
+		low = state->sp + sizeof(struct stackframe);
+	}
 	high = ALIGN(low, THREAD_SIZE);
 
 	if (fp < low || fp > high || fp & 0x7)
 		return -EINVAL;
 
-	/* Unwind stack frame */
 	frame = (struct stackframe *)fp - 1;
 	state->sp = fp;
 
-	if (state->regs && state->regs->epc == state->pc &&
-		fp & 0x7) {
-		state->fp = frame->ra;
-		state->pc = state->regs->ra;
+	if (regs) {
+	/* Unwind from irq to interrupted function */
+		state->fp = regs->s0;
+		state->pc = regs->epc;
+		state->regs = regs;
+	} else if (state->regs && state->regs->epc == state->pc) {
+	/* Unwind from interrupted function to caller*/
+		if (frame->ra < low || frame->ra > high) {
+		/* normal function */
+			state->fp = frame->fp;
+			state->pc = frame->ra;
+		} else {
+		/* leaf function */
+			state->fp = frame->ra;
+			state->pc = state->regs->ra;
+		}
+		state->regs = NULL;
 	} else {
+	/* Unwind from normal stack frame */
 		state->fp = frame->fp;
 		state->pc = ftrace_graph_ret_addr(current, NULL, frame->ra,
 							(unsigned long *)fp - 1);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH for-next v2 0/4] riscv: Improvments for stacktrace
  2022-09-21 12:51 [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Chen Zhongjin
                   ` (3 preceding siblings ...)
  2022-09-21 12:51 ` [PATCH for-next v2 4/4] riscv: stacktrace: Implement stacktrace for irq Chen Zhongjin
@ 2022-09-21 14:02 ` Mark Rutland
  2022-09-22  8:22   ` Chen Zhongjin
  4 siblings, 1 reply; 8+ messages in thread
From: Mark Rutland @ 2022-09-21 14:02 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: linux-kernel, linux-riscv, linux-perf-users, paul.walmsley,
	palmer, aou, peterz, mingo, acme, alexander.shishkin, namhyung,
	jolsa, guoren, frederic, vincent.chen, ardb, mhiramat, rostedt,
	keescook, catalin.marinas

On Wed, Sep 21, 2022 at 08:51:23PM +0800, Chen Zhongjin wrote:
> Currently, the stacktrace with FRAME_POINTER on riscv has some problem:
> 
> 1. stacktrace will stop at irq so it can't get the stack frames before
> irq entry.
> 2. stacktrace can't unwind all the real stack frames when there is
> k{ret}probes or ftrace.
> 
> These are mainly becase when there is a pt_regs on stack, we can't unwind
> the stack frame as normal function.
> 
> Some architectures (e.g. arm64) create a extra stackframe inside pt_regs.
> However this doesn't work for riscv because the ra is not ensured to be
> pushed to stack. As explained in:
> commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")

FWIW, this is also a latent problem on arm64, since we don't know whether the
LR is live at an exception boundary (and we currently always ignore it).

My plan to fix that on arm64 is to push an empty frame record to the stack upon
an exception, and use that to find the pt_regs, from which we can get the PC
and LR (and then we can report the later as unreliable). That should be roughly
equivalent to what you do in this series (where use use the LSB to identify
that the pointer is actually a pt_regs).

One important thing to note is that when crossing an exception boundary you
won't know whether RA is live, and so you might try to consume an address twice
(if it has also been pushed to the stack). That could be problematic for
unwinding ftrace or kretprobes. On arm64 we had to implement
HAVE_FUNCTION_GRAPH_RET_ADDR_PTR so that we could reliably unwind ftrace. See
commit:

  c6d3cd32fd0064af ("arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR")

... and we haven't yet come up with something similar for kretprobes (though I
suspect we'll need to).

Thanks,
Mark.

> So, I choosed the method of x86 that, if there is a pt_regs on stack,
> we encoded the frame pointer and save it. When unwinding stack frame,
> we can get pt_regs and registers required for unwinding stacks.
> 
> In addition, the patch set contains some refactoring of stacktrace.c to
> keep the stacktrace code on riscv consistent with other architectures.
> 
> Stacktrace before for kretprobes:
> 
>   Call Trace:
>   ...
>   [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
>   [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
>   [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
>   [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
>   [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
>   ...
> 
> Stacktrace after:
> 
>   Call Trace:
>   ...
>   [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
>   [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
>   [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
> + [<ffffffff01633076>] the_caller+0x2c/0x38 [kprobe_unwind]
>   [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
>   [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
>   ...
> 
> Stacktrace before for ftrace:
> 
>   Call Trace:
>   ...
>   [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
>   [<ffffffff80008e7e>] ftrace_regs_call+0x8/0x10
>   [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
>   [<ffffffff8008a4e6>] do_init_module+0x56/0x210
>   ...
> 
>   Stacktrace after:
> 
>   Call Trace:
>   ...
>   [<ffffffff016150e0>] handler_pre+0x30/0x4a [kprobe_unwind]
>   [<ffffffff800bce96>] aggr_pre_handler+0x60/0x94
>   [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
>   [<ffffffff80008e82>] ftrace_regs_call+0x8/0x10
> + [<ffffffff01615000>] traced_func+0x0/0x1e [kprobe_unwind]
>   [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
>   [<ffffffff8008a4ea>] do_init_module+0x56/0x210
>   ...
> 
> Noticed that the caller of ftrace and probed func of kretprobe
> cannot be unwind because they are inside function pro/epilogue.
> 
> ---
> v1 -> v2:
> - Merge three patches which add ENCODE_FRAME_POINTER together
> - Update commit message
> - Delete the KRETPORBES stuff added in unwind_state, we don't need them
> to recover the kretporbes ret_addr because we can get it in pt_regs
> ---
> Chen Zhongjin (4):
>   riscv: stacktrace: Replace walk_stackframe with arch_stack_walk
>   riscv: stacktrace: Introduce unwind functions
>   riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER
>   riscv: stacktrace: Implement stacktrace for irq
> 
>  arch/riscv/include/asm/frame.h                |  45 ++++++
>  arch/riscv/include/asm/stacktrace.h           |   9 +-
>  arch/riscv/kernel/entry.S                     |   3 +
>  arch/riscv/kernel/mcount-dyn.S                |   7 +
>  arch/riscv/kernel/perf_callchain.c            |   2 +-
>  arch/riscv/kernel/probes/kprobes_trampoline.S |   7 +
>  arch/riscv/kernel/stacktrace.c                | 150 ++++++++++++------
>  7 files changed, 174 insertions(+), 49 deletions(-)
>  create mode 100644 arch/riscv/include/asm/frame.h
> 
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH for-next v2 0/4] riscv: Improvments for stacktrace
  2022-09-21 14:02 ` [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Mark Rutland
@ 2022-09-22  8:22   ` Chen Zhongjin
  2022-09-23 13:34     ` Mark Rutland
  0 siblings, 1 reply; 8+ messages in thread
From: Chen Zhongjin @ 2022-09-22  8:22 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-riscv, linux-perf-users, paul.walmsley,
	palmer, aou, peterz, mingo, acme, alexander.shishkin, namhyung,
	jolsa, guoren, frederic, vincent.chen, ardb, mhiramat, rostedt,
	keescook, catalin.marinas, Josh Poimboeuf

Hi,

On 2022/9/21 22:02, Mark Rutland wrote:
> On Wed, Sep 21, 2022 at 08:51:23PM +0800, Chen Zhongjin wrote:
>> Currently, the stacktrace with FRAME_POINTER on riscv has some problem:
>>
>> 1. stacktrace will stop at irq so it can't get the stack frames before
>> irq entry.
>> 2. stacktrace can't unwind all the real stack frames when there is
>> k{ret}probes or ftrace.
>>
>> These are mainly becase when there is a pt_regs on stack, we can't unwind
>> the stack frame as normal function.
>>
>> Some architectures (e.g. arm64) create a extra stackframe inside pt_regs.
>> However this doesn't work for riscv because the ra is not ensured to be
>> pushed to stack. As explained in:
>> commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")
> FWIW, this is also a latent problem on arm64, since we don't know whether the
> LR is live at an exception boundary (and we currently always ignore it).

In fact this is still a problem for riscv, I didn't unwind the caller of 
ftrace or probed func

of kretprobe because they are inside function pro/epilogue.

The problem on riscv is a little more complex, for leaf function, ra 
will always not be pushed

to stack and fp will be pushed to ra slot. This patch solved this problem.

> My plan to fix that on arm64 is to push an empty frame record to the stack upon
> an exception, and use that to find the pt_regs, from which we can get the PC
> and LR (and then we can report the later as unreliable). That should be roughly
> equivalent to what you do in this series (where use use the LSB to identify
> that the pointer is actually a pt_regs).

IIRC, this solution looks like:

=====

Func1     { lr, fp } or { nothing }

irq entry { pt_regs & empty stackframe[2] }

handler   { lr, fp }

======

handler->fp points to stackframe, and when we find stackframe is empty, 
we know we are inside

an pt_regs and can find registers by offset.

I think it's available, there no other scenario will cause the frame 
list contains zero (unless stack is clobbered).

And it seems only fp slot should be zero, lr will be clobbered after 
function call, we cannot use lr inside pt_regs.

> One important thing to note is that when crossing an exception boundary you
> won't know whether RA is live, and so you might try to consume an address twice
> (if it has also been pushed to the stack). That could be problematic for
> unwinding ftrace or kretprobes. On arm64 we had to implement
> HAVE_FUNCTION_GRAPH_RET_ADDR_PTR so that we could reliably unwind ftrace. See
> commit:
>
>    c6d3cd32fd0064af ("arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR")
>
> ... and we haven't yet come up with something similar for kretprobes (though I
> suspect we'll need to).

Can we test the sp and fp inside pt_regs? Because luckily arm64 saves 
lr, fp and moves sp together.

Before sp is moved we will have fp == sp from last frame 'mov x29, sp'. 
So if they are same, we uses the lr in pt_regs,

conversely we use lr on stack.

That's what I planned to use for orc unwinder (or we need to track lr as 
sp/fp). Seems it also works for this

solution. Haven't thought it in detail through.


Best,

Chen

> Thanks,
> Mark.
>
>> So, I choosed the method of x86 that, if there is a pt_regs on stack,
>> we encoded the frame pointer and save it. When unwinding stack frame,
>> we can get pt_regs and registers required for unwinding stacks.
>>
>> In addition, the patch set contains some refactoring of stacktrace.c to
>> keep the stacktrace code on riscv consistent with other architectures.
>>
>> Stacktrace before for kretprobes:
>>
>>    Call Trace:
>>    ...
>>    [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
>>    [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
>>    [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
>>    [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
>>    [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
>>    ...
>>
>> Stacktrace after:
>>
>>    Call Trace:
>>    ...
>>    [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
>>    [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
>>    [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
>> + [<ffffffff01633076>] the_caller+0x2c/0x38 [kprobe_unwind]
>>    [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
>>    [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
>>    ...
>>
>> Stacktrace before for ftrace:
>>
>>    Call Trace:
>>    ...
>>    [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
>>    [<ffffffff80008e7e>] ftrace_regs_call+0x8/0x10
>>    [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
>>    [<ffffffff8008a4e6>] do_init_module+0x56/0x210
>>    ...
>>
>>    Stacktrace after:
>>
>>    Call Trace:
>>    ...
>>    [<ffffffff016150e0>] handler_pre+0x30/0x4a [kprobe_unwind]
>>    [<ffffffff800bce96>] aggr_pre_handler+0x60/0x94
>>    [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
>>    [<ffffffff80008e82>] ftrace_regs_call+0x8/0x10
>> + [<ffffffff01615000>] traced_func+0x0/0x1e [kprobe_unwind]
>>    [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
>>    [<ffffffff8008a4ea>] do_init_module+0x56/0x210
>>    ...
>>
>> Noticed that the caller of ftrace and probed func of kretprobe
>> cannot be unwind because they are inside function pro/epilogue.
>>
>> ---
>> v1 -> v2:
>> - Merge three patches which add ENCODE_FRAME_POINTER together
>> - Update commit message
>> - Delete the KRETPORBES stuff added in unwind_state, we don't need them
>> to recover the kretporbes ret_addr because we can get it in pt_regs
>> ---
>> Chen Zhongjin (4):
>>    riscv: stacktrace: Replace walk_stackframe with arch_stack_walk
>>    riscv: stacktrace: Introduce unwind functions
>>    riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER
>>    riscv: stacktrace: Implement stacktrace for irq
>>
>>   arch/riscv/include/asm/frame.h                |  45 ++++++
>>   arch/riscv/include/asm/stacktrace.h           |   9 +-
>>   arch/riscv/kernel/entry.S                     |   3 +
>>   arch/riscv/kernel/mcount-dyn.S                |   7 +
>>   arch/riscv/kernel/perf_callchain.c            |   2 +-
>>   arch/riscv/kernel/probes/kprobes_trampoline.S |   7 +
>>   arch/riscv/kernel/stacktrace.c                | 150 ++++++++++++------
>>   7 files changed, 174 insertions(+), 49 deletions(-)
>>   create mode 100644 arch/riscv/include/asm/frame.h
>>
>> -- 
>> 2.17.1
>>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH for-next v2 0/4] riscv: Improvments for stacktrace
  2022-09-22  8:22   ` Chen Zhongjin
@ 2022-09-23 13:34     ` Mark Rutland
  0 siblings, 0 replies; 8+ messages in thread
From: Mark Rutland @ 2022-09-23 13:34 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: linux-kernel, linux-riscv, linux-perf-users, paul.walmsley,
	palmer, aou, peterz, mingo, acme, alexander.shishkin, namhyung,
	jolsa, guoren, frederic, vincent.chen, ardb, mhiramat, rostedt,
	keescook, catalin.marinas, Josh Poimboeuf

On Thu, Sep 22, 2022 at 04:22:23PM +0800, Chen Zhongjin wrote:
> Hi,
> 
> On 2022/9/21 22:02, Mark Rutland wrote:
> > On Wed, Sep 21, 2022 at 08:51:23PM +0800, Chen Zhongjin wrote:
> > > Currently, the stacktrace with FRAME_POINTER on riscv has some problem:
> > > 
> > > 1. stacktrace will stop at irq so it can't get the stack frames before
> > > irq entry.
> > > 2. stacktrace can't unwind all the real stack frames when there is
> > > k{ret}probes or ftrace.
> > > 
> > > These are mainly becase when there is a pt_regs on stack, we can't unwind
> > > the stack frame as normal function.
> > > 
> > > Some architectures (e.g. arm64) create a extra stackframe inside pt_regs.
> > > However this doesn't work for riscv because the ra is not ensured to be
> > > pushed to stack. As explained in:
> > > commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")
> > FWIW, this is also a latent problem on arm64, since we don't know whether the
> > LR is live at an exception boundary (and we currently always ignore it).
> 
> In fact this is still a problem for riscv, I didn't unwind the caller of
> ftrace or probed func
> 
> of kretprobe because they are inside function pro/epilogue.
> 
> The problem on riscv is a little more complex, for leaf function, ra will
> always not be pushed to stack and fp will be pushed to ra slot. This patch
> solved this problem.

That's the same on arm64, our `LR` is equivalent to your `RA`.

I have a series at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/stacktrace/metadata

... which explciitly indicates pt_regs::pc and pt_regs::lr in the stack trace
output (and for RELIABLE_STACKTRACE we'd mark the LR entries as unreliable).

> > My plan to fix that on arm64 is to push an empty frame record to the stack upon
> > an exception, and use that to find the pt_regs, from which we can get the PC
> > and LR (and then we can report the later as unreliable). That should be roughly
> > equivalent to what you do in this series (where use use the LSB to identify
> > that the pointer is actually a pt_regs).
> 
> IIRC, this solution looks like:
> 
> =====
> 
> Func1     { lr, fp } or { nothing }
> 
> irq entry { pt_regs & empty stackframe[2] }
> 
> handler   { lr, fp }
> 
> ======
> 
> handler->fp points to stackframe, and when we find stackframe is empty, we
> know we are inside
> 
> an pt_regs and can find registers by offset.
> 
> I think it's available, there no other scenario will cause the frame list
> contains zero (unless stack is clobbered).

That's basically what I do in my series; see:

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=arm64/stacktrace/metadata&id=7394a825bc4f8243b28261517999793acb6f11cd

... but I haven't posted that to the list yet as I *also* want to add code to
check that we were at an exception boundary when we saw this (based on the
prior PC/LR being in the entry text, using some metadata I have yet to add).

> And it seems only fp slot should be zero, lr will be clobbered after
> function call, we cannot use lr inside pt_regs.

I'm not sure what you mean here.

I agree that the LR in the pt_regs isn't *reliable*, and won't *always* be
valid, but it will sometimes be valid.

> > One important thing to note is that when crossing an exception boundary you
> > won't know whether RA is live, and so you might try to consume an address twice
> > (if it has also been pushed to the stack). That could be problematic for
> > unwinding ftrace or kretprobes. On arm64 we had to implement
> > HAVE_FUNCTION_GRAPH_RET_ADDR_PTR so that we could reliably unwind ftrace. See
> > commit:
> > 
> >    c6d3cd32fd0064af ("arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR")
> > 
> > ... and we haven't yet come up with something similar for kretprobes (though I
> > suspect we'll need to).
> 
> Can we test the sp and fp inside pt_regs? Because luckily arm64 saves lr, fp
> and moves sp together.
> 
> Before sp is moved we will have fp == sp from last frame 'mov x29, sp'. So
> if they are same, we uses the lr in pt_regs, conversely we use lr on stack.

The frame record (which FP points to) can live anywhere in a stack frame, so
comparing aginst SP doesn't tell us anything. Regardless of whether FP == SP,
the LTR might be valid or invalid.

There is no way of knowing if LR is live without information from the compiler
that we don't have today (though from LPC 2022, it seems like CTF *might* be
sufficient -- that's on my list of things to look at).

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-09-23 13:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-21 12:51 [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Chen Zhongjin
2022-09-21 12:51 ` [PATCH for-next v2 1/4] riscv: stacktrace: Replace walk_stackframe with arch_stack_walk Chen Zhongjin
2022-09-21 12:51 ` [PATCH for-next v2 2/4] riscv: stacktrace: Introduce unwind functions Chen Zhongjin
2022-09-21 12:51 ` [PATCH for-next v2 3/4] riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER Chen Zhongjin
2022-09-21 12:51 ` [PATCH for-next v2 4/4] riscv: stacktrace: Implement stacktrace for irq Chen Zhongjin
2022-09-21 14:02 ` [PATCH for-next v2 0/4] riscv: Improvments for stacktrace Mark Rutland
2022-09-22  8:22   ` Chen Zhongjin
2022-09-23 13:34     ` Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).