live-patching.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
       [not found] <705993ccb34a611c75cdae0a8cb1b40f9b218ebd>
@ 2021-04-05 20:43 ` madvenka
  2021-04-05 20:43   ` [RFC PATCH v2 1/4] arm64: Implement infrastructure for " madvenka
                     ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: madvenka @ 2021-04-05 20:43 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

There are a number of places in kernel code where the stack trace is not
reliable. Enhance the unwinder to check for those cases and mark the
stack trace as unreliable. Once all of the checks are in place, the unwinder
can provide a reliable stack trace. But before this can be used for livepatch,
some other entity needs to guarantee that the frame pointers are all set up
correctly in kernel functions. objtool is currently being worked on to
fill that gap.

Except for the return address check, all the other checks involve checking
the return PC of every frame against certain kernel functions. To do this,
implement some infrastructure code:

	- Define a special_functions[] array and populate the array with
	  the special functions

	- Using kallsyms_lookup(), lookup the symbol table entries for the
	  functions and record their address ranges

	- Define an is_reliable_function(pc) to match a return PC against
	  the special functions.

The unwinder calls is_reliable_function(pc) for every return PC and marks
the stack trace as reliable or unreliable accordingly.

Return address check
====================

Check the return PC of every stack frame to make sure that it is a valid
kernel text address (and not some generated code, for example).

Detect EL1 exception frame
==========================

EL1 exceptions can happen on any instruction including instructions in
the frame pointer prolog or epilog. Depending on where exactly they happen,
they could render the stack trace unreliable.

Add all of the EL1 exception handlers to special_functions[].

	- el1_sync()
	- el1_irq()
	- el1_error()
	- el1_sync_invalid()
	- el1_irq_invalid()
	- el1_fiq_invalid()
	- el1_error_invalid()

Detect ftrace frame
===================

When FTRACE executes at the beginning of a traced function, it creates two
frames and calls the tracer function:

	- One frame for the traced function

	- One frame for the caller of the traced function

That gives a sensible stack trace while executing in the tracer function.
When FTRACE returns to the traced function, the frames are popped and
everything is back to normal.

However, in cases like live patch, the tracer function redirects execution
to a different function. When FTRACE returns, control will go to that target
function. A stack trace taken in the tracer function will not show the target
function. The target function is the real function that we want to track.
So, the stack trace is unreliable.

To detect stack traces from a tracer function, add the following to
special_functions[]:

	- ftrace_call + 4

ftrace_call is the label at which the tracer function is patched in. So,
ftrace_call + 4 is its return address. This is what will show up in a
stack trace taken from the tracer function.

When Function Graph Tracing is on, ftrace_graph_caller is patched in
at the label ftrace_graph_call. If a tracer function called before it has
redirected execution as mentioned above, the stack traces taken from within
ftrace_graph_caller will also be unreliable for the same reason as mentioned
above. So, add ftrace_graph_caller to special_functions[] as well.

Also, the Function Graph Tracer modifies the return address of a traced
function to a return trampoline (return_to_handler()) to gather tracing
data on function return. Stack traces taken from the traced function and
functions it calls will not show the original caller of the traced function.
The unwinder handles this case by getting the original caller from FTRACE.

However, stack traces taken from the trampoline itself and functions it calls
are unreliable as the original return address may not be available in
that context. This is because the trampoline calls FTRACE to gather trace
data as well as to obtain the actual return address and FTRACE discards the
record of the original return address along the way.

Add return_to_handler() to special_functions[].

Check for kretprobe
===================

For functions with a kretprobe set up, probe code executes on entry
to the function and replaces the return address in the stack frame with a
kretprobe trampoline. Whenever the function returns, control is
transferred to the trampoline. The trampoline eventually returns to the
original return address.

A stack trace taken while executing in the function (or in functions that
get called from the function) will not show the original return address.
Similarly, a stack trace taken while executing in the trampoline itself
(and functions that get called from the trampoline) will not show the
original return address. This means that the caller of the probed function
will not show. This makes the stack trace unreliable.

Add the kretprobe trampoline to special_functions[].

Optprobes
=========

Optprobes may be implemented in the future for arm64. For optprobes,
the relevant trampoline(s) can be added to special_functions[].
---
Changelog:

v1
	- Define a bool field in struct stackframe. This will indicate if
	  a stack trace is reliable.

	- Implement a special_functions[] array that will be populated
	  with special functions in which the stack trace is considered
	  unreliable.
	
	- Using kallsyms_lookup(), get the address ranges for the special
	  functions and record them.

	- Implement an is_reliable_function(pc). This function will check
	  if a given return PC falls in any of the special functions. If
	  it does, the stack trace is unreliable.

	- Implement check_reliability() function that will check if a
	  stack frame is reliable. Call is_reliable_function() from
	  check_reliability().

	- Before a return PC is checked against special_funtions[], it
	  must be validates as a proper kernel text address. Call
	  __kernel_text_address() from check_reliability().

	- Finally, call check_reliability() from unwind_frame() for
	  each stack frame.

	- Add EL1 exception handlers to special_functions[].

		el1_sync();
		el1_irq();
		el1_error();
		el1_sync_invalid();
		el1_irq_invalid();
		el1_fiq_invalid();
		el1_error_invalid();

	- The above functions are currently defined as LOCAL symbols.
	  Make them global so that they can be referenced from the
	  unwinder code.

	- Add FTRACE trampolines to special_functions[]:

		ftrace_graph_call()
		ftrace_graph_caller()
		return_to_handler()

	- Add the kretprobe trampoline to special functions[]:

		kretprobe_trampoline()

v2
	- Removed the terminating entry { 0, 0 } in special_functions[]
	  and replaced it with the idiom { /* sentinel */ }.

	- Change the ftrace trampoline entry ftrace_graph_call in
	  special_functions[] to ftrace_call + 4 and added explanatory
	  comments.

	- Unnested #ifdefs in special_functions[] for FTRACE.

Madhavan T. Venkataraman (4):
  arm64: Implement infrastructure for stack trace reliability checks
  arm64: Mark a stack trace unreliable if an EL1 exception frame is
    detected
  arm64: Detect FTRACE cases that make the stack trace unreliable
  arm64: Mark stack trace as unreliable if kretprobed functions are
    present

 arch/arm64/include/asm/exception.h  |   8 ++
 arch/arm64/include/asm/stacktrace.h |   2 +
 arch/arm64/kernel/entry-ftrace.S    |  12 ++
 arch/arm64/kernel/entry.S           |  14 +-
 arch/arm64/kernel/stacktrace.c      | 215 ++++++++++++++++++++++++++++
 5 files changed, 244 insertions(+), 7 deletions(-)


base-commit: 0d02ec6b3136c73c09e7859f0d0e4e2c4c07b49b
-- 
2.25.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks
  2021-04-05 20:43 ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks madvenka
@ 2021-04-05 20:43   ` madvenka
  2021-04-08 15:15     ` Mark Brown
  2021-04-08 17:17     ` Mark Brown
  2021-04-05 20:43   ` [RFC PATCH v2 2/4] arm64: Mark a stack trace unreliable if an EL1 exception frame is detected madvenka
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 33+ messages in thread
From: madvenka @ 2021-04-05 20:43 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Implement a check_reliability() function that will contain checks for the
presence of various features and conditions that can render the stack trace
unreliable.

Introduce the first reliability check - If a return PC encountered in a
stack trace is not a valid kernel text address, the stack trace is
considered unreliable. It could be some generated code.

Other reliability checks will be added in the future.

These checks will involve checking the return PC to see if it falls inside
any special functions where the stack trace is considered unreliable.
Implement the infrastructure needed for this.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/include/asm/stacktrace.h |  2 +
 arch/arm64/kernel/stacktrace.c      | 80 +++++++++++++++++++++++++++++
 2 files changed, 82 insertions(+)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index eb29b1fe8255..684f65808394 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -59,6 +59,7 @@ struct stackframe {
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	int graph;
 #endif
+	bool reliable;
 };
 
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
@@ -169,6 +170,7 @@ static inline void start_backtrace(struct stackframe *frame,
 	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
 	frame->prev_fp = 0;
 	frame->prev_type = STACK_TYPE_UNKNOWN;
+	frame->reliable = true;
 }
 
 #endif	/* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index ad20981dfda4..557657d6e6bd 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -18,6 +18,84 @@
 #include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 
+struct function_range {
+	unsigned long	start;
+	unsigned long	end;
+};
+
+/*
+ * Special functions where the stack trace is unreliable.
+ */
+static struct function_range	special_functions[] = {
+	{ /* sentinel */ }
+};
+
+static bool is_reliable_function(unsigned long pc)
+{
+	static bool inited = false;
+	struct function_range *func;
+
+	if (!inited) {
+		static char sym[KSYM_NAME_LEN];
+		unsigned long size, offset;
+
+		for (func = special_functions; func->start; func++) {
+			if (kallsyms_lookup(func->start, &size, &offset,
+					    NULL, sym)) {
+				func->start -= offset;
+				func->end = func->start + size;
+			} else {
+				/*
+				 * This is just a label. So, we only need to
+				 * consider that particular location. So, size
+				 * is the size of one Aarch64 instruction.
+				 */
+				func->end = func->start + 4;
+			}
+		}
+		inited = true;
+	}
+
+	for (func = special_functions; func->start; func++) {
+		if (pc >= func->start && pc < func->end)
+			return false;
+	}
+	return true;
+}
+
+/*
+ * Check for the presence of features and conditions that render the stack
+ * trace unreliable.
+ *
+ * Once all such cases have been addressed, this function can aid live
+ * patching (and this comment can be removed).
+ */
+static void check_reliability(struct stackframe *frame)
+{
+	/*
+	 * If the stack trace has already been marked unreliable, just return.
+	 */
+	if (!frame->reliable)
+		return;
+
+	/*
+	 * First, make sure that the return address is a proper kernel text
+	 * address. A NULL or invalid return address probably means there's
+	 * some generated code which __kernel_text_address() doesn't know
+	 * about. Mark the stack trace as not reliable.
+	 */
+	if (!__kernel_text_address(frame->pc)) {
+		frame->reliable = false;
+		return;
+	}
+
+	/*
+	 * Check the reliability of the return PC's function.
+	 */
+	if (!is_reliable_function(frame->pc))
+		frame->reliable = false;
+}
+
 /*
  * AArch64 PCS assigns the frame pointer to x29.
  *
@@ -108,6 +186,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 
 	frame->pc = ptrauth_strip_insn_pac(frame->pc);
 
+	check_reliability(frame);
+
 	return 0;
 }
 NOKPROBE_SYMBOL(unwind_frame);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC PATCH v2 2/4] arm64: Mark a stack trace unreliable if an EL1 exception frame is detected
  2021-04-05 20:43 ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks madvenka
  2021-04-05 20:43   ` [RFC PATCH v2 1/4] arm64: Implement infrastructure for " madvenka
@ 2021-04-05 20:43   ` madvenka
  2021-04-05 20:43   ` [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable madvenka
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: madvenka @ 2021-04-05 20:43 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

EL1 exceptions can happen on any instruction including instructions in
the frame pointer prolog or epilog. Depending on where exactly they happen,
they could render the stack trace unreliable.

If an EL1 exception frame is found on the stack, mark the stack trace as
unreliable. Now, the EL1 exception frame is not at any well-known offset
on the stack. It can be anywhere on the stack. In order to properly detect
an EL1 exception frame, the return address must be checked against all of
the possible EL1 exception handlers.

Preemption
==========

Interrupts encountered in kernel code are also EL1 exceptions. At the end
of an interrupt, the interrupt handler checks if the current task must be
preempted for any reason. If so, it calls the preemption code which takes
the task off the CPU. A stack trace taken on the task after the preemption
will show the EL1 frame and will be considered unreliable. This is correct
behavior as preemption can happen practically at any point in code.

Probing
=======

Breakpoints encountered in kernel code are also EL1 exceptions. The probing
infrastructure uses breakpoints for executing probe code. While in the probe
code, the stack trace will show an EL1 frame and will be considered
unreliable. This is also correct behavior.

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/include/asm/exception.h |  8 +++++++
 arch/arm64/kernel/entry.S          | 14 +++++------
 arch/arm64/kernel/stacktrace.c     | 37 ++++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index 6546158d2f2d..4ebd2390ef54 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -35,6 +35,14 @@ asmlinkage void el1_sync_handler(struct pt_regs *regs);
 asmlinkage void el0_sync_handler(struct pt_regs *regs);
 asmlinkage void el0_sync_compat_handler(struct pt_regs *regs);
 
+asmlinkage void el1_sync(void);
+asmlinkage void el1_irq(void);
+asmlinkage void el1_error(void);
+asmlinkage void el1_sync_invalid(void);
+asmlinkage void el1_irq_invalid(void);
+asmlinkage void el1_fiq_invalid(void);
+asmlinkage void el1_error_invalid(void);
+
 asmlinkage void noinstr enter_el1_irq_or_nmi(struct pt_regs *regs);
 asmlinkage void noinstr exit_el1_irq_or_nmi(struct pt_regs *regs);
 asmlinkage void enter_from_user_mode(void);
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index a31a0a713c85..9fe3aaeff019 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -630,19 +630,19 @@ SYM_CODE_START_LOCAL(el0_fiq_invalid_compat)
 SYM_CODE_END(el0_fiq_invalid_compat)
 #endif
 
-SYM_CODE_START_LOCAL(el1_sync_invalid)
+SYM_CODE_START(el1_sync_invalid)
 	inv_entry 1, BAD_SYNC
 SYM_CODE_END(el1_sync_invalid)
 
-SYM_CODE_START_LOCAL(el1_irq_invalid)
+SYM_CODE_START(el1_irq_invalid)
 	inv_entry 1, BAD_IRQ
 SYM_CODE_END(el1_irq_invalid)
 
-SYM_CODE_START_LOCAL(el1_fiq_invalid)
+SYM_CODE_START(el1_fiq_invalid)
 	inv_entry 1, BAD_FIQ
 SYM_CODE_END(el1_fiq_invalid)
 
-SYM_CODE_START_LOCAL(el1_error_invalid)
+SYM_CODE_START(el1_error_invalid)
 	inv_entry 1, BAD_ERROR
 SYM_CODE_END(el1_error_invalid)
 
@@ -650,7 +650,7 @@ SYM_CODE_END(el1_error_invalid)
  * EL1 mode handlers.
  */
 	.align	6
-SYM_CODE_START_LOCAL_NOALIGN(el1_sync)
+SYM_CODE_START_NOALIGN(el1_sync)
 	kernel_entry 1
 	mov	x0, sp
 	bl	el1_sync_handler
@@ -658,7 +658,7 @@ SYM_CODE_START_LOCAL_NOALIGN(el1_sync)
 SYM_CODE_END(el1_sync)
 
 	.align	6
-SYM_CODE_START_LOCAL_NOALIGN(el1_irq)
+SYM_CODE_START_NOALIGN(el1_irq)
 	kernel_entry 1
 	gic_prio_irq_setup pmr=x20, tmp=x1
 	enable_da_f
@@ -737,7 +737,7 @@ el0_irq_naked:
 	b	ret_to_user
 SYM_CODE_END(el0_irq)
 
-SYM_CODE_START_LOCAL(el1_error)
+SYM_CODE_START(el1_error)
 	kernel_entry 1
 	mrs	x1, esr_el1
 	gic_prio_kentry_setup tmp=x2
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 557657d6e6bd..fb11e4372891 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -14,6 +14,7 @@
 #include <linux/stacktrace.h>
 
 #include <asm/irq.h>
+#include <asm/exception.h>
 #include <asm/pointer_auth.h>
 #include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
@@ -25,8 +26,44 @@ struct function_range {
 
 /*
  * Special functions where the stack trace is unreliable.
+ *
+ * EL1 exceptions
+ * ==============
+ *
+ * EL1 exceptions can happen on any instruction including instructions in
+ * the frame pointer prolog or epilog. Depending on where exactly they happen,
+ * they could render the stack trace unreliable.
+ *
+ * If an EL1 exception frame is found on the stack, mark the stack trace as
+ * unreliable. Now, the EL1 exception frame is not at any well-known offset
+ * on the stack. It can be anywhere on the stack. In order to properly detect
+ * an EL1 exception frame, the return address must be checked against all of
+ * the possible EL1 exception handlers.
+ *
+ * Interrupts encountered in kernel code are also EL1 exceptions. At the end
+ * of an interrupt, the current task can get preempted. A stack trace taken
+ * on the task after the preemption will show the EL1 frame and will be
+ * considered unreliable. This is correct behavior as preemption can happen
+ * practically at any point in code.
+ *
+ * Breakpoints encountered in kernel code are also EL1 exceptions. Breakpoints
+ * can happen practically on any instruction. Mark the stack trace as
+ * unreliable. Breakpoints are used for executing probe code. Stack traces
+ * taken while in the probe code will show an EL1 frame and will be considered
+ * unreliable. This is correct behavior.
  */
 static struct function_range	special_functions[] = {
+	/*
+	 * EL1 exception handlers.
+	 */
+	{ (unsigned long) el1_sync, 0 },
+	{ (unsigned long) el1_irq, 0 },
+	{ (unsigned long) el1_error, 0 },
+	{ (unsigned long) el1_sync_invalid, 0 },
+	{ (unsigned long) el1_irq_invalid, 0 },
+	{ (unsigned long) el1_fiq_invalid, 0 },
+	{ (unsigned long) el1_error_invalid, 0 },
+
 	{ /* sentinel */ }
 };
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable
  2021-04-05 20:43 ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks madvenka
  2021-04-05 20:43   ` [RFC PATCH v2 1/4] arm64: Implement infrastructure for " madvenka
  2021-04-05 20:43   ` [RFC PATCH v2 2/4] arm64: Mark a stack trace unreliable if an EL1 exception frame is detected madvenka
@ 2021-04-05 20:43   ` madvenka
  2021-04-08 16:58     ` Mark Brown
  2021-04-09 12:27     ` Mark Rutland
  2021-04-05 20:43   ` [RFC PATCH v2 4/4] arm64: Mark stack trace as unreliable if kretprobed functions are present madvenka
  2021-04-09 12:09   ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks Mark Rutland
  4 siblings, 2 replies; 33+ messages in thread
From: madvenka @ 2021-04-05 20:43 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

When CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled and tracing is activated
for a function, the ftrace infrastructure is called for the function at
the very beginning. Ftrace creates two frames:

	- One for the traced function

	- One for the caller of the traced function

That gives a reliable stack trace while executing in the ftrace code. When
ftrace returns to the traced function, the frames are popped and everything
is back to normal.

However, in cases like live patch, a tracer function may redirect execution
to a different function when it returns. A stack trace taken while still in
the tracer function will not show the target function. The target function
is the real function that we want to track.

So, if an FTRACE frame is detected on the stack, just mark the stack trace
as unreliable. The detection is done by checking the return PC against
FTRACE trampolines.

Also, the Function Graph Tracer modifies the return address of a traced
function to a return trampoline to gather tracing data on function return.
Stack traces taken from that trampoline and functions it calls are
unreliable as the original return address may not be available in
that context. Mark the stack trace unreliable accordingly.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/kernel/entry-ftrace.S | 12 +++++++
 arch/arm64/kernel/stacktrace.c   | 61 ++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+)

diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index b3e4f9a088b1..1f0714a50c71 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -86,6 +86,18 @@ SYM_CODE_START(ftrace_caller)
 	b	ftrace_common
 SYM_CODE_END(ftrace_caller)
 
+/*
+ * A stack trace taken from anywhere in the FTRACE trampoline code should be
+ * considered unreliable as a tracer function (patched at ftrace_call) could
+ * potentially set pt_regs->pc and redirect execution to a function different
+ * than the traced function. E.g., livepatch.
+ *
+ * No stack traces are taken in this FTRACE trampoline assembly code. But
+ * they can be taken from C functions that get called from here. The unwinder
+ * checks if a return address falls in this FTRACE trampoline code. See
+ * stacktrace.c. If function calls in this code are changed, please keep the
+ * special_functions[] array in stacktrace.c in sync.
+ */
 SYM_CODE_START(ftrace_common)
 	sub	x0, x30, #AARCH64_INSN_SIZE	// ip (callsite's BL insn)
 	mov	x1, x9				// parent_ip (callsite's LR)
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index fb11e4372891..7a3c638d4aeb 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -51,6 +51,52 @@ struct function_range {
  * unreliable. Breakpoints are used for executing probe code. Stack traces
  * taken while in the probe code will show an EL1 frame and will be considered
  * unreliable. This is correct behavior.
+ *
+ * FTRACE
+ * ======
+ *
+ * When CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled, the FTRACE trampoline code
+ * is called from a traced function even before the frame pointer prolog.
+ * FTRACE sets up two stack frames (one for the traced function and one for
+ * its caller) so that the unwinder can provide a sensible stack trace for
+ * any tracer function called from the FTRACE trampoline code.
+ *
+ * There are two cases where the stack trace is not reliable.
+ *
+ * (1) The task gets preempted before the two frames are set up. Preemption
+ *     involves an interrupt which is an EL1 exception. The unwinder already
+ *     handles EL1 exceptions.
+ *
+ * (2) The tracer function that gets called by the FTRACE trampoline code
+ *     changes the return PC (e.g., livepatch).
+ *
+ *     Not all tracer functions do that. But to err on the side of safety,
+ *     consider the stack trace as unreliable in all cases.
+ *
+ * When Function Graph Tracer is used, FTRACE modifies the return address of
+ * the traced function in its stack frame to an FTRACE return trampoline
+ * (return_to_handler). When the traced function returns, control goes to
+ * return_to_handler. return_to_handler calls FTRACE to gather tracing data
+ * and to obtain the original return address. Then, return_to_handler returns
+ * to the original return address.
+ *
+ * There are two cases to consider from a stack trace reliability point of
+ * view:
+ *
+ * (1) Stack traces taken within the traced function (and functions that get
+ *     called from there) will show return_to_handler instead of the original
+ *     return address. The original return address can be obtained from FTRACE.
+ *     The unwinder already obtains it and modifies the return PC in its copy
+ *     of the stack frame to the original return address. So, this is handled.
+ *
+ * (2) return_to_handler calls FTRACE as mentioned before. FTRACE discards
+ *     the record of the original return address along the way as it does not
+ *     need to maintain it anymore. This means that the unwinder cannot get
+ *     the original return address beyond that point while the task is still
+ *     executing in return_to_handler. So, consider the stack trace unreliable
+ *     if return_to_handler is detected on the stack.
+ *
+ * NOTE: The unwinder must do (1) before (2).
  */
 static struct function_range	special_functions[] = {
 	/*
@@ -64,6 +110,21 @@ static struct function_range	special_functions[] = {
 	{ (unsigned long) el1_fiq_invalid, 0 },
 	{ (unsigned long) el1_error_invalid, 0 },
 
+	/*
+	 * FTRACE trampolines.
+	 *
+	 * Tracer function gets patched at the label ftrace_call. Its return
+	 * address is the next instruction address.
+	 */
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+	{ (unsigned long) ftrace_call + 4, 0 },
+#endif
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	{ (unsigned long) ftrace_graph_caller, 0 },
+	{ (unsigned long) return_to_handler, 0 },
+#endif
+
 	{ /* sentinel */ }
 };
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC PATCH v2 4/4] arm64: Mark stack trace as unreliable if kretprobed functions are present
  2021-04-05 20:43 ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks madvenka
                     ` (2 preceding siblings ...)
  2021-04-05 20:43   ` [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable madvenka
@ 2021-04-05 20:43   ` madvenka
  2021-04-09 12:09   ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks Mark Rutland
  4 siblings, 0 replies; 33+ messages in thread
From: madvenka @ 2021-04-05 20:43 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

When a kretprobe is active for a function, the function's return address
in its stack frame is modified to point to the kretprobe trampoline. When
the function returns, the frame is popped and control is transferred
to the trampoline. The trampoline eventually returns to the original return
address.

If a stack walk is done within the function (or any functions that get
called from there), the stack trace will only show the trampoline and the
not the original caller.

Also, if the trampoline itself and the functions it calls do a stack trace,
that stack trace will also have the same problem. Detect this as well.

If the trampoline is detected in the stack trace, mark the stack trace
as unreliable.

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/kernel/stacktrace.c | 37 ++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 7a3c638d4aeb..926bd91ffb3f 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -97,6 +97,36 @@ struct function_range {
  *     if return_to_handler is detected on the stack.
  *
  * NOTE: The unwinder must do (1) before (2).
+ *
+ * KPROBES
+ * =======
+ *
+ * There are two types of kprobes:
+ *
+ * (1) Regular kprobes that are placed anywhere in a probed function.
+ *     This is implemented by replacing the probed instruction with a
+ *     breakpoint. When the breakpoint is hit, the kprobe code emulates
+ *     the original instruction in-situ and returns to the next
+ *     instruction.
+ *
+ *     Breakpoints are EL1 exceptions. When the unwinder detects them,
+ *     the stack trace is marked as unreliable as it does not know where
+ *     exactly the exception happened. Detection of EL1 exceptions in
+ *     a stack trace will be done separately.
+ *
+ * (2) Return kprobes that are placed on the return of a probed function.
+ *     In this case, Kprobes sets up an initial breakpoint at the
+ *     beginning of the probed function. When the breakpoint is hit,
+ *     Kprobes replaces the return address in the stack frame with the
+ *     kretprobe_trampoline and records the original return address.
+ *     When the probed function returns, control goes to the trampoline
+ *     which eventually returns to the original return address.
+ *
+ *     Stack traces taken while in the probed function or while in the
+ *     trampoline will show kretprobe_trampoline instead of the original
+ *     return address. Detect this and mark the stack trace unreliable.
+ *     The detection is done by checking if the return PC falls anywhere
+ *     in kretprobe_trampoline.
  */
 static struct function_range	special_functions[] = {
 	/*
@@ -125,6 +155,13 @@ static struct function_range	special_functions[] = {
 	{ (unsigned long) return_to_handler, 0 },
 #endif
 
+	/*
+	 * Kprobe trampolines.
+	 */
+#ifdef CONFIG_KRETPROBES
+	{ (unsigned long) kretprobe_trampoline, 0 },
+#endif
+
 	{ /* sentinel */ }
 };
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks
  2021-04-05 20:43   ` [RFC PATCH v2 1/4] arm64: Implement infrastructure for " madvenka
@ 2021-04-08 15:15     ` Mark Brown
  2021-04-08 17:17     ` Mark Brown
  1 sibling, 0 replies; 33+ messages in thread
From: Mark Brown @ 2021-04-08 15:15 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 369 bytes --]

On Mon, Apr 05, 2021 at 03:43:10PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Implement a check_reliability() function that will contain checks for the
> presence of various features and conditions that can render the stack trace
> unreliable.

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable
  2021-04-05 20:43   ` [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable madvenka
@ 2021-04-08 16:58     ` Mark Brown
  2021-04-08 19:23       ` Madhavan T. Venkataraman
  2021-04-09 12:27     ` Mark Rutland
  1 sibling, 1 reply; 33+ messages in thread
From: Mark Brown @ 2021-04-08 16:58 UTC (permalink / raw)
  To: madvenka, mark.rutland
  Cc: jpoimboe, jthierry, catalin.marinas, will, linux-arm-kernel,
	live-patching, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 691 bytes --]

On Mon, Apr 05, 2021 at 03:43:12PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> When CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled and tracing is activated
> for a function, the ftrace infrastructure is called for the function at
> the very beginning. Ftrace creates two frames:

This looks good to me however I'd really like someone who has a firmer
understanding of what ftrace is doing to double check as it is entirely
likely that I am missing cases here, it seems likely that if I am
missing stuff it's extra stuff that needs to be added and we're not
actually making use of the reliability information yet.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks
  2021-04-05 20:43   ` [RFC PATCH v2 1/4] arm64: Implement infrastructure for " madvenka
  2021-04-08 15:15     ` Mark Brown
@ 2021-04-08 17:17     ` Mark Brown
  2021-04-08 19:30       ` Madhavan T. Venkataraman
  1 sibling, 1 reply; 33+ messages in thread
From: Mark Brown @ 2021-04-08 17:17 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1332 bytes --]

On Mon, Apr 05, 2021 at 03:43:10PM -0500, madvenka@linux.microsoft.com wrote:

> These checks will involve checking the return PC to see if it falls inside
> any special functions where the stack trace is considered unreliable.
> Implement the infrastructure needed for this.

Following up again based on an off-list discussion with Mark Rutland:
while I think this is a reasonable implementation for specifically
listing functions that cause problems we could make life easier for
ourselves by instead using annotations at the call sites to put things
into sections which indicate that they're unsafe for unwinding, we can
then check for any address in one of those sections (or possibly do the
reverse and check for any address in a section we specifically know is
safe) rather than having to enumerate problematic functions in the
unwinder.  This also has the advantage of not having a list that's
separate to the functions themselves so it's less likely that the
unwinder will get out of sync with the rest of the code as things evolve.

We already have SYM_CODE_START() annotations in the code for assembly
functions that aren't using the standard calling convention which should
help a lot here, we could add a variant of that for things that we know
are safe on stacks (like those we expect to find at the bottom of
stacks).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable
  2021-04-08 16:58     ` Mark Brown
@ 2021-04-08 19:23       ` Madhavan T. Venkataraman
  2021-04-09 11:31         ` Mark Brown
  0 siblings, 1 reply; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-08 19:23 UTC (permalink / raw)
  To: Mark Brown, mark.rutland
  Cc: jpoimboe, jthierry, catalin.marinas, will, linux-arm-kernel,
	live-patching, linux-kernel



On 4/8/21 11:58 AM, Mark Brown wrote:
> On Mon, Apr 05, 2021 at 03:43:12PM -0500, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> When CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled and tracing is activated
>> for a function, the ftrace infrastructure is called for the function at
>> the very beginning. Ftrace creates two frames:
> 
> This looks good to me however I'd really like someone who has a firmer
> understanding of what ftrace is doing to double check as it is entirely
> likely that I am missing cases here, it seems likely that if I am
> missing stuff it's extra stuff that needs to be added and we're not
> actually making use of the reliability information yet.
> 

OK. So, do you have some specific reviewer(s) in mind? Apart from yourself, Mark Rutland and
Josh Poimboeuf, these are some reviewers I can think of (in alphabetical order):

AKASHI Takahiro <takahiro.akashi@linaro.org>
Ard Biesheuvel <ard.biesheuvel@linaro.org>
Catalin Marinas <catalin.marinas@arm.com>
Josh Poimboeuf <jpoimboe@redhat.com>
Steven Rostedt (VMware) <rostedt@goodmis.org>
Torsten Duwe <duwe@suse.de>
Will Deacon <will@kernel.org>

Sorry if I missed out any of the other experts.

Thanks.

Madhavan                                                 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks
  2021-04-08 17:17     ` Mark Brown
@ 2021-04-08 19:30       ` Madhavan T. Venkataraman
  2021-04-08 23:30         ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-08 19:30 UTC (permalink / raw)
  To: Mark Brown
  Cc: mark.rutland, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel



On 4/8/21 12:17 PM, Mark Brown wrote:
> On Mon, Apr 05, 2021 at 03:43:10PM -0500, madvenka@linux.microsoft.com wrote:
> 
>> These checks will involve checking the return PC to see if it falls inside
>> any special functions where the stack trace is considered unreliable.
>> Implement the infrastructure needed for this.
> 
> Following up again based on an off-list discussion with Mark Rutland:
> while I think this is a reasonable implementation for specifically
> listing functions that cause problems we could make life easier for
> ourselves by instead using annotations at the call sites to put things
> into sections which indicate that they're unsafe for unwinding, we can
> then check for any address in one of those sections (or possibly do the
> reverse and check for any address in a section we specifically know is
> safe) rather than having to enumerate problematic functions in the
> unwinder.  This also has the advantage of not having a list that's
> separate to the functions themselves so it's less likely that the
> unwinder will get out of sync with the rest of the code as things evolve.
> 
> We already have SYM_CODE_START() annotations in the code for assembly
> functions that aren't using the standard calling convention which should
> help a lot here, we could add a variant of that for things that we know
> are safe on stacks (like those we expect to find at the bottom of
> stacks).
> 

As I already mentioned before, I like the idea of sections. The only reason that I did
not try it was that I have to address FTRACE trampolines and the kretprobe_trampoline
(and optprobes in the future).

I have the following options:

1. Create a common section (I will have to come up with an appropriate name) and put
   all such functions in that one section.

2. Create one section for each logical type (exception section, ftrace section and
   kprobe section) or some such.

3. Use the section idea only for the el1 exceptions. For the others use the current
   special_functions[] approach.

Which one do you and Mark Rutland prefer? Or, is there another choice?

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks
  2021-04-08 19:30       ` Madhavan T. Venkataraman
@ 2021-04-08 23:30         ` Madhavan T. Venkataraman
  2021-04-09 11:57           ` Mark Brown
  0 siblings, 1 reply; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-08 23:30 UTC (permalink / raw)
  To: Mark Brown
  Cc: mark.rutland, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel



On 4/8/21 2:30 PM, Madhavan T. Venkataraman wrote:
> 
> 
> On 4/8/21 12:17 PM, Mark Brown wrote:
>> On Mon, Apr 05, 2021 at 03:43:10PM -0500, madvenka@linux.microsoft.com wrote:
>>
>>> These checks will involve checking the return PC to see if it falls inside
>>> any special functions where the stack trace is considered unreliable.
>>> Implement the infrastructure needed for this.
>>
>> Following up again based on an off-list discussion with Mark Rutland:
>> while I think this is a reasonable implementation for specifically
>> listing functions that cause problems we could make life easier for
>> ourselves by instead using annotations at the call sites to put things
>> into sections which indicate that they're unsafe for unwinding, we can
>> then check for any address in one of those sections (or possibly do the
>> reverse and check for any address in a section we specifically know is
>> safe) rather than having to enumerate problematic functions in the
>> unwinder.  This also has the advantage of not having a list that's
>> separate to the functions themselves so it's less likely that the
>> unwinder will get out of sync with the rest of the code as things evolve.
>>
>> We already have SYM_CODE_START() annotations in the code for assembly
>> functions that aren't using the standard calling convention which should
>> help a lot here, we could add a variant of that for things that we know
>> are safe on stacks (like those we expect to find at the bottom of
>> stacks).
>>
> 
> As I already mentioned before, I like the idea of sections. The only reason that I did
> not try it was that I have to address FTRACE trampolines and the kretprobe_trampoline
> (and optprobes in the future).
> 
> I have the following options:
> 
> 1. Create a common section (I will have to come up with an appropriate name) and put
>    all such functions in that one section.
> 
> 2. Create one section for each logical type (exception section, ftrace section and
>    kprobe section) or some such.
> 

For now, I will start with idea 2. I will create a special section for each class of
functions (EL1 exception handlers, FTRACE trampolines, KPROBE trampolines). Instead of a
special functions array, I will implement a special_sections array. The rest of the code
should just fall into place.

Let me know if you prefer something different.

Thanks.

Madhavan

> 3. Use the section idea only for the el1 exceptions. For the others use the current
>    special_functions[] approach.
> 
> Which one do you and Mark Rutland prefer? Or, is there another choice?
> 
> Madhavan
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable
  2021-04-08 19:23       ` Madhavan T. Venkataraman
@ 2021-04-09 11:31         ` Mark Brown
  2021-04-09 14:02           ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 33+ messages in thread
From: Mark Brown @ 2021-04-09 11:31 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: mark.rutland, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 854 bytes --]

On Thu, Apr 08, 2021 at 02:23:39PM -0500, Madhavan T. Venkataraman wrote:
> On 4/8/21 11:58 AM, Mark Brown wrote:

> > This looks good to me however I'd really like someone who has a firmer
> > understanding of what ftrace is doing to double check as it is entirely
> > likely that I am missing cases here, it seems likely that if I am
> > missing stuff it's extra stuff that needs to be added and we're not
> > actually making use of the reliability information yet.

> OK. So, do you have some specific reviewer(s) in mind? Apart from yourself, Mark Rutland and
> Josh Poimboeuf, these are some reviewers I can think of (in alphabetical order):

Mainly Mark Rutland, but generally someone else who has looked at ftrace
on arm64 in detail.  It was mainly a comment to say I wasn't going to do
any kind of Reviewed-by but also hadn't spotted any issues.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 1/4] arm64: Implement infrastructure for stack trace reliability checks
  2021-04-08 23:30         ` Madhavan T. Venkataraman
@ 2021-04-09 11:57           ` Mark Brown
  0 siblings, 0 replies; 33+ messages in thread
From: Mark Brown @ 2021-04-09 11:57 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: mark.rutland, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1138 bytes --]

On Thu, Apr 08, 2021 at 06:30:22PM -0500, Madhavan T. Venkataraman wrote:
> On 4/8/21 2:30 PM, Madhavan T. Venkataraman wrote:

> > 1. Create a common section (I will have to come up with an appropriate name) and put
> >    all such functions in that one section.

> > 2. Create one section for each logical type (exception section, ftrace section and
> >    kprobe section) or some such.

> For now, I will start with idea 2. I will create a special section for each class of
> functions (EL1 exception handlers, FTRACE trampolines, KPROBE trampolines). Instead of a
> special functions array, I will implement a special_sections array. The rest of the code
> should just fall into place.

> Let me know if you prefer something different.

It might be safer to start off by just putting all SYM_CODE into a
section then pulling bits we know to be safe out of the section as
needed - we know that anything that's SYM_CODE is doing something
non-standard and needs checking to verify that the unwinder will be
happy with it and I that should cover most if not all of the cases above
as well as anything else we didn't explicitly think of.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-05 20:43 ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks madvenka
                     ` (3 preceding siblings ...)
  2021-04-05 20:43   ` [RFC PATCH v2 4/4] arm64: Mark stack trace as unreliable if kretprobed functions are present madvenka
@ 2021-04-09 12:09   ` Mark Rutland
  2021-04-09 17:16     ` Madhavan T. Venkataraman
  2021-04-09 21:37     ` Josh Poimboeuf
  4 siblings, 2 replies; 33+ messages in thread
From: Mark Rutland @ 2021-04-09 12:09 UTC (permalink / raw)
  To: madvenka
  Cc: broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel

Hi Madhavan,

I've noted some concerns below. At a high-level, I'm not keen on the
blacklisting approach, and I think there's some other preparatory work
that would be more valuable in the short term.

On Mon, Apr 05, 2021 at 03:43:09PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> There are a number of places in kernel code where the stack trace is not
> reliable. Enhance the unwinder to check for those cases and mark the
> stack trace as unreliable. Once all of the checks are in place, the unwinder
> can provide a reliable stack trace. But before this can be used for livepatch,
> some other entity needs to guarantee that the frame pointers are all set up
> correctly in kernel functions. objtool is currently being worked on to
> fill that gap.
> 
> Except for the return address check, all the other checks involve checking
> the return PC of every frame against certain kernel functions. To do this,
> implement some infrastructure code:
> 
> 	- Define a special_functions[] array and populate the array with
> 	  the special functions

I'm not too keen on having to manually collate this within the unwinder,
as it's very painful from a maintenance perspective. I'd much rather we
could associate this information with the implementations of these
functions, so that they're more likely to stay in sync.

Further, I believe all the special cases are assembly functions, and
most of those are already in special sections to begin with. I reckon
it'd be simpler and more robust to reject unwinding based on the
section. If we need to unwind across specific functions in those
sections, we could opt-in with some metadata. So e.g. we could reject
all functions in ".entry.text", special casing the EL0 entry functions
if necessary.

As I mentioned before, I'm currently reworking the entry assembly to
make this simpler to do. I'd prefer to not make invasive changes in that
area until that's sorted.

I think there's a lot more code that we cannot unwind, e.g. KVM
exception code, or almost anything marked with SYM_CODE_END().

> 	- Using kallsyms_lookup(), lookup the symbol table entries for the
> 	  functions and record their address ranges
> 
> 	- Define an is_reliable_function(pc) to match a return PC against
> 	  the special functions.
> 
> The unwinder calls is_reliable_function(pc) for every return PC and marks
> the stack trace as reliable or unreliable accordingly.
> 
> Return address check
> ====================
> 
> Check the return PC of every stack frame to make sure that it is a valid
> kernel text address (and not some generated code, for example).
> 
> Detect EL1 exception frame
> ==========================
> 
> EL1 exceptions can happen on any instruction including instructions in
> the frame pointer prolog or epilog. Depending on where exactly they happen,
> they could render the stack trace unreliable.
> 
> Add all of the EL1 exception handlers to special_functions[].
> 
> 	- el1_sync()
> 	- el1_irq()
> 	- el1_error()
> 	- el1_sync_invalid()
> 	- el1_irq_invalid()
> 	- el1_fiq_invalid()
> 	- el1_error_invalid()
> 
> Detect ftrace frame
> ===================
> 
> When FTRACE executes at the beginning of a traced function, it creates two
> frames and calls the tracer function:
> 
> 	- One frame for the traced function
> 
> 	- One frame for the caller of the traced function
> 
> That gives a sensible stack trace while executing in the tracer function.
> When FTRACE returns to the traced function, the frames are popped and
> everything is back to normal.
> 
> However, in cases like live patch, the tracer function redirects execution
> to a different function. When FTRACE returns, control will go to that target
> function. A stack trace taken in the tracer function will not show the target
> function. The target function is the real function that we want to track.
> So, the stack trace is unreliable.

This doesn't match my understanding of the reliable stacktrace
requirements, but I might have misunderstood what you're saying here.

IIUC what you're describing here is:

1) A calls B
2) B is traced
3) tracer replaces B with TARGET
4) tracer returns to TARGET

... and if a stacktrace is taken at step 3 (before the return address is
patched), the trace will show B rather than TARGET.

My understanding is that this is legitimate behaviour.

> To detect stack traces from a tracer function, add the following to
> special_functions[]:
> 
> 	- ftrace_call + 4
> 
> ftrace_call is the label at which the tracer function is patched in. So,
> ftrace_call + 4 is its return address. This is what will show up in a
> stack trace taken from the tracer function.
> 
> When Function Graph Tracing is on, ftrace_graph_caller is patched in
> at the label ftrace_graph_call. If a tracer function called before it has
> redirected execution as mentioned above, the stack traces taken from within
> ftrace_graph_caller will also be unreliable for the same reason as mentioned
> above. So, add ftrace_graph_caller to special_functions[] as well.
> 
> Also, the Function Graph Tracer modifies the return address of a traced
> function to a return trampoline (return_to_handler()) to gather tracing
> data on function return. Stack traces taken from the traced function and
> functions it calls will not show the original caller of the traced function.
> The unwinder handles this case by getting the original caller from FTRACE.
> 
> However, stack traces taken from the trampoline itself and functions it calls
> are unreliable as the original return address may not be available in
> that context. This is because the trampoline calls FTRACE to gather trace
> data as well as to obtain the actual return address and FTRACE discards the
> record of the original return address along the way.

The reason we cannot unwind the trampolines in the usual way is because
they are not AAPCS compliant functions. We don't discard the original
return address, but it's not in the usual location.  With care, we could
write a special case unwinder for them. Note that we also cannot unwind
from any PLT on the way to the trampolines, so we'd also need to
identify those.  Luckily we're in charge of creating those, and (for
now) we only need to care about the module PLTs.

The bigger problem is return_to_handler, since there's a transient
period when C code removes the return address from the graph return
stack before passing this to assembly in a register, and so we can't
reliably find the correct return address during this period. With care
we could special case unwinding immediately before/after this.

If we could find a way to restructure return_to_handler such that we can
reliably find the correct return address, that would be a useful
improvement today, and would mean that we don't have to blacklist it for
reliable stacktrace.

Thanks,
Mark.

> Add return_to_handler() to special_functions[].
> 
> Check for kretprobe
> ===================
> 
> For functions with a kretprobe set up, probe code executes on entry
> to the function and replaces the return address in the stack frame with a
> kretprobe trampoline. Whenever the function returns, control is
> transferred to the trampoline. The trampoline eventually returns to the
> original return address.
> 
> A stack trace taken while executing in the function (or in functions that
> get called from the function) will not show the original return address.
> Similarly, a stack trace taken while executing in the trampoline itself
> (and functions that get called from the trampoline) will not show the
> original return address. This means that the caller of the probed function
> will not show. This makes the stack trace unreliable.
> 
> Add the kretprobe trampoline to special_functions[].
> 
> Optprobes
> =========
> 
> Optprobes may be implemented in the future for arm64. For optprobes,
> the relevant trampoline(s) can be added to special_functions[].
> ---
> Changelog:
> 
> v1
> 	- Define a bool field in struct stackframe. This will indicate if
> 	  a stack trace is reliable.
> 
> 	- Implement a special_functions[] array that will be populated
> 	  with special functions in which the stack trace is considered
> 	  unreliable.
> 	
> 	- Using kallsyms_lookup(), get the address ranges for the special
> 	  functions and record them.
> 
> 	- Implement an is_reliable_function(pc). This function will check
> 	  if a given return PC falls in any of the special functions. If
> 	  it does, the stack trace is unreliable.
> 
> 	- Implement check_reliability() function that will check if a
> 	  stack frame is reliable. Call is_reliable_function() from
> 	  check_reliability().
> 
> 	- Before a return PC is checked against special_funtions[], it
> 	  must be validates as a proper kernel text address. Call
> 	  __kernel_text_address() from check_reliability().
> 
> 	- Finally, call check_reliability() from unwind_frame() for
> 	  each stack frame.
> 
> 	- Add EL1 exception handlers to special_functions[].
> 
> 		el1_sync();
> 		el1_irq();
> 		el1_error();
> 		el1_sync_invalid();
> 		el1_irq_invalid();
> 		el1_fiq_invalid();
> 		el1_error_invalid();
> 
> 	- The above functions are currently defined as LOCAL symbols.
> 	  Make them global so that they can be referenced from the
> 	  unwinder code.
> 
> 	- Add FTRACE trampolines to special_functions[]:
> 
> 		ftrace_graph_call()
> 		ftrace_graph_caller()
> 		return_to_handler()
> 
> 	- Add the kretprobe trampoline to special functions[]:
> 
> 		kretprobe_trampoline()
> 
> v2
> 	- Removed the terminating entry { 0, 0 } in special_functions[]
> 	  and replaced it with the idiom { /* sentinel */ }.
> 
> 	- Change the ftrace trampoline entry ftrace_graph_call in
> 	  special_functions[] to ftrace_call + 4 and added explanatory
> 	  comments.
> 
> 	- Unnested #ifdefs in special_functions[] for FTRACE.
> 
> Madhavan T. Venkataraman (4):
>   arm64: Implement infrastructure for stack trace reliability checks
>   arm64: Mark a stack trace unreliable if an EL1 exception frame is
>     detected
>   arm64: Detect FTRACE cases that make the stack trace unreliable
>   arm64: Mark stack trace as unreliable if kretprobed functions are
>     present
> 
>  arch/arm64/include/asm/exception.h  |   8 ++
>  arch/arm64/include/asm/stacktrace.h |   2 +
>  arch/arm64/kernel/entry-ftrace.S    |  12 ++
>  arch/arm64/kernel/entry.S           |  14 +-
>  arch/arm64/kernel/stacktrace.c      | 215 ++++++++++++++++++++++++++++
>  5 files changed, 244 insertions(+), 7 deletions(-)
> 
> 
> base-commit: 0d02ec6b3136c73c09e7859f0d0e4e2c4c07b49b
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable
  2021-04-05 20:43   ` [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable madvenka
  2021-04-08 16:58     ` Mark Brown
@ 2021-04-09 12:27     ` Mark Rutland
  2021-04-09 17:23       ` Madhavan T. Venkataraman
  1 sibling, 1 reply; 33+ messages in thread
From: Mark Rutland @ 2021-04-09 12:27 UTC (permalink / raw)
  To: madvenka
  Cc: broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel

On Mon, Apr 05, 2021 at 03:43:12PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> When CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled and tracing is activated
> for a function, the ftrace infrastructure is called for the function at
> the very beginning. Ftrace creates two frames:
> 
> 	- One for the traced function
> 
> 	- One for the caller of the traced function
> 
> That gives a reliable stack trace while executing in the ftrace code. When
> ftrace returns to the traced function, the frames are popped and everything
> is back to normal.
> 
> However, in cases like live patch, a tracer function may redirect execution
> to a different function when it returns. A stack trace taken while still in
> the tracer function will not show the target function. The target function
> is the real function that we want to track.
> 
> So, if an FTRACE frame is detected on the stack, just mark the stack trace
> as unreliable. The detection is done by checking the return PC against
> FTRACE trampolines.
> 
> Also, the Function Graph Tracer modifies the return address of a traced
> function to a return trampoline to gather tracing data on function return.
> Stack traces taken from that trampoline and functions it calls are
> unreliable as the original return address may not be available in
> that context. Mark the stack trace unreliable accordingly.
> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> ---
>  arch/arm64/kernel/entry-ftrace.S | 12 +++++++
>  arch/arm64/kernel/stacktrace.c   | 61 ++++++++++++++++++++++++++++++++
>  2 files changed, 73 insertions(+)
> 
> diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
> index b3e4f9a088b1..1f0714a50c71 100644
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -86,6 +86,18 @@ SYM_CODE_START(ftrace_caller)
>  	b	ftrace_common
>  SYM_CODE_END(ftrace_caller)
>  
> +/*
> + * A stack trace taken from anywhere in the FTRACE trampoline code should be
> + * considered unreliable as a tracer function (patched at ftrace_call) could
> + * potentially set pt_regs->pc and redirect execution to a function different
> + * than the traced function. E.g., livepatch.

IIUC the issue here that we have two copies of the pc: one in the regs,
and one in a frame record, and so after the update to the regs, the
frame record is stale.

This is something that we could fix by having
ftrace_instruction_pointer_set() set both.

However, as noted elsewhere there are other issues which mean we'd still
need special unwinding code for this.

Thanks,
Mark.

> + *
> + * No stack traces are taken in this FTRACE trampoline assembly code. But
> + * they can be taken from C functions that get called from here. The unwinder
> + * checks if a return address falls in this FTRACE trampoline code. See
> + * stacktrace.c. If function calls in this code are changed, please keep the
> + * special_functions[] array in stacktrace.c in sync.
> + */
>  SYM_CODE_START(ftrace_common)
>  	sub	x0, x30, #AARCH64_INSN_SIZE	// ip (callsite's BL insn)
>  	mov	x1, x9				// parent_ip (callsite's LR)
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index fb11e4372891..7a3c638d4aeb 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -51,6 +51,52 @@ struct function_range {
>   * unreliable. Breakpoints are used for executing probe code. Stack traces
>   * taken while in the probe code will show an EL1 frame and will be considered
>   * unreliable. This is correct behavior.
> + *
> + * FTRACE
> + * ======
> + *
> + * When CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled, the FTRACE trampoline code
> + * is called from a traced function even before the frame pointer prolog.
> + * FTRACE sets up two stack frames (one for the traced function and one for
> + * its caller) so that the unwinder can provide a sensible stack trace for
> + * any tracer function called from the FTRACE trampoline code.
> + *
> + * There are two cases where the stack trace is not reliable.
> + *
> + * (1) The task gets preempted before the two frames are set up. Preemption
> + *     involves an interrupt which is an EL1 exception. The unwinder already
> + *     handles EL1 exceptions.
> + *
> + * (2) The tracer function that gets called by the FTRACE trampoline code
> + *     changes the return PC (e.g., livepatch).
> + *
> + *     Not all tracer functions do that. But to err on the side of safety,
> + *     consider the stack trace as unreliable in all cases.
> + *
> + * When Function Graph Tracer is used, FTRACE modifies the return address of
> + * the traced function in its stack frame to an FTRACE return trampoline
> + * (return_to_handler). When the traced function returns, control goes to
> + * return_to_handler. return_to_handler calls FTRACE to gather tracing data
> + * and to obtain the original return address. Then, return_to_handler returns
> + * to the original return address.
> + *
> + * There are two cases to consider from a stack trace reliability point of
> + * view:
> + *
> + * (1) Stack traces taken within the traced function (and functions that get
> + *     called from there) will show return_to_handler instead of the original
> + *     return address. The original return address can be obtained from FTRACE.
> + *     The unwinder already obtains it and modifies the return PC in its copy
> + *     of the stack frame to the original return address. So, this is handled.
> + *
> + * (2) return_to_handler calls FTRACE as mentioned before. FTRACE discards
> + *     the record of the original return address along the way as it does not
> + *     need to maintain it anymore. This means that the unwinder cannot get
> + *     the original return address beyond that point while the task is still
> + *     executing in return_to_handler. So, consider the stack trace unreliable
> + *     if return_to_handler is detected on the stack.
> + *
> + * NOTE: The unwinder must do (1) before (2).
>   */
>  static struct function_range	special_functions[] = {
>  	/*
> @@ -64,6 +110,21 @@ static struct function_range	special_functions[] = {
>  	{ (unsigned long) el1_fiq_invalid, 0 },
>  	{ (unsigned long) el1_error_invalid, 0 },
>  
> +	/*
> +	 * FTRACE trampolines.
> +	 *
> +	 * Tracer function gets patched at the label ftrace_call. Its return
> +	 * address is the next instruction address.
> +	 */
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +	{ (unsigned long) ftrace_call + 4, 0 },
> +#endif
> +
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +	{ (unsigned long) ftrace_graph_caller, 0 },
> +	{ (unsigned long) return_to_handler, 0 },
> +#endif
> +
>  	{ /* sentinel */ }
>  };
>  
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable
  2021-04-09 11:31         ` Mark Brown
@ 2021-04-09 14:02           ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-09 14:02 UTC (permalink / raw)
  To: Mark Brown
  Cc: mark.rutland, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel



On 4/9/21 6:31 AM, Mark Brown wrote:
> On Thu, Apr 08, 2021 at 02:23:39PM -0500, Madhavan T. Venkataraman wrote:
>> On 4/8/21 11:58 AM, Mark Brown wrote:
> 
>>> This looks good to me however I'd really like someone who has a firmer
>>> understanding of what ftrace is doing to double check as it is entirely
>>> likely that I am missing cases here, it seems likely that if I am
>>> missing stuff it's extra stuff that needs to be added and we're not
>>> actually making use of the reliability information yet.
> 
>> OK. So, do you have some specific reviewer(s) in mind? Apart from yourself, Mark Rutland and
>> Josh Poimboeuf, these are some reviewers I can think of (in alphabetical order):
> 
> Mainly Mark Rutland, but generally someone else who has looked at ftrace
> on arm64 in detail.  It was mainly a comment to say I wasn't going to do
> any kind of Reviewed-by but also hadn't spotted any issues.
> 

Understood.

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-09 12:09   ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks Mark Rutland
@ 2021-04-09 17:16     ` Madhavan T. Venkataraman
  2021-04-09 21:37     ` Josh Poimboeuf
  1 sibling, 0 replies; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-09 17:16 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel



On 4/9/21 7:09 AM, Mark Rutland wrote:
> Hi Madhavan,
> 
> I've noted some concerns below. At a high-level, I'm not keen on the
> blacklisting approach, and I think there's some other preparatory work
> that would be more valuable in the short term.
> 

Some kind of blacklisting has to be done whichever way you do it.

> On Mon, Apr 05, 2021 at 03:43:09PM -0500, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> There are a number of places in kernel code where the stack trace is not
>> reliable. Enhance the unwinder to check for those cases and mark the
>> stack trace as unreliable. Once all of the checks are in place, the unwinder
>> can provide a reliable stack trace. But before this can be used for livepatch,
>> some other entity needs to guarantee that the frame pointers are all set up
>> correctly in kernel functions. objtool is currently being worked on to
>> fill that gap.
>>
>> Except for the return address check, all the other checks involve checking
>> the return PC of every frame against certain kernel functions. To do this,
>> implement some infrastructure code:
>>
>> 	- Define a special_functions[] array and populate the array with
>> 	  the special functions
> 
> I'm not too keen on having to manually collate this within the unwinder,
> as it's very painful from a maintenance perspective. I'd much rather we
> could associate this information with the implementations of these
> functions, so that they're more likely to stay in sync.
> 
> Further, I believe all the special cases are assembly functions, and
> most of those are already in special sections to begin with. I reckon
> it'd be simpler and more robust to reject unwinding based on the
> section. If we need to unwind across specific functions in those
> sections, we could opt-in with some metadata. So e.g. we could reject
> all functions in ".entry.text", special casing the EL0 entry functions
> if necessary.
> 

Yes. I have already agreed that using sections is the way to go. I am working
on that now.

> As I mentioned before, I'm currently reworking the entry assembly to
> make this simpler to do. I'd prefer to not make invasive changes in that
> area until that's sorted.
> 

I don't plan to make any invasive changes. But a couple of cosmetic changes may be
necessary. I don't know yet. But I will keep in mind that you don't want
any invasive changes there.

> I think there's a lot more code that we cannot unwind, e.g. KVM
> exception code, or almost anything marked with SYM_CODE_END().
> 

As Mark Brown suggested, I will take a look at all code that is marked as
SYM_CODE. His idea of placing all SYM_CODE in a separate section and blacklisting
that to begin with and refining things as we go along appears to me to be
a reasonable approach.

>> 	- Using kallsyms_lookup(), lookup the symbol table entries for the
>> 	  functions and record their address ranges
>>
>> 	- Define an is_reliable_function(pc) to match a return PC against
>> 	  the special functions.
>>
>> The unwinder calls is_reliable_function(pc) for every return PC and marks
>> the stack trace as reliable or unreliable accordingly.
>>
>> Return address check
>> ====================
>>
>> Check the return PC of every stack frame to make sure that it is a valid
>> kernel text address (and not some generated code, for example).
>>
>> Detect EL1 exception frame
>> ==========================
>>
>> EL1 exceptions can happen on any instruction including instructions in
>> the frame pointer prolog or epilog. Depending on where exactly they happen,
>> they could render the stack trace unreliable.
>>
>> Add all of the EL1 exception handlers to special_functions[].
>>
>> 	- el1_sync()
>> 	- el1_irq()
>> 	- el1_error()
>> 	- el1_sync_invalid()
>> 	- el1_irq_invalid()
>> 	- el1_fiq_invalid()
>> 	- el1_error_invalid()
>>
>> Detect ftrace frame
>> ===================
>>
>> When FTRACE executes at the beginning of a traced function, it creates two
>> frames and calls the tracer function:
>>
>> 	- One frame for the traced function
>>
>> 	- One frame for the caller of the traced function
>>
>> That gives a sensible stack trace while executing in the tracer function.
>> When FTRACE returns to the traced function, the frames are popped and
>> everything is back to normal.
>>
>> However, in cases like live patch, the tracer function redirects execution
>> to a different function. When FTRACE returns, control will go to that target
>> function. A stack trace taken in the tracer function will not show the target
>> function. The target function is the real function that we want to track.
>> So, the stack trace is unreliable.
> 
> This doesn't match my understanding of the reliable stacktrace
> requirements, but I might have misunderstood what you're saying here.
> 
> IIUC what you're describing here is:
> 
> 1) A calls B
> 2) B is traced
> 3) tracer replaces B with TARGET
> 4) tracer returns to TARGET
> 
> ... and if a stacktrace is taken at step 3 (before the return address is
> patched), the trace will show B rather than TARGET.
> 
> My understanding is that this is legitimate behaviour.
> 

My understanding is as follows (correct me if I am wrong):

- Before B is traced, the situation is "A calls B".

- A trace is placed on B to redirect execution to TARGET. Semantically, it
  becomes "A calls TARGET" beyond that point and B is irrelevant.

- But temporarily, the stack trace will show A -> B.

>> To detect stack traces from a tracer function, add the following to
>> special_functions[]:
>>
>> 	- ftrace_call + 4
>>
>> ftrace_call is the label at which the tracer function is patched in. So,
>> ftrace_call + 4 is its return address. This is what will show up in a
>> stack trace taken from the tracer function.
>>
>> When Function Graph Tracing is on, ftrace_graph_caller is patched in
>> at the label ftrace_graph_call. If a tracer function called before it has
>> redirected execution as mentioned above, the stack traces taken from within
>> ftrace_graph_caller will also be unreliable for the same reason as mentioned
>> above. So, add ftrace_graph_caller to special_functions[] as well.
>>
>> Also, the Function Graph Tracer modifies the return address of a traced
>> function to a return trampoline (return_to_handler()) to gather tracing
>> data on function return. Stack traces taken from the traced function and
>> functions it calls will not show the original caller of the traced function.
>> The unwinder handles this case by getting the original caller from FTRACE.
>>
>> However, stack traces taken from the trampoline itself and functions it calls
>> are unreliable as the original return address may not be available in
>> that context. This is because the trampoline calls FTRACE to gather trace
>> data as well as to obtain the actual return address and FTRACE discards the
>> record of the original return address along the way.
> 
> The reason we cannot unwind the trampolines in the usual way is because
> they are not AAPCS compliant functions. We don't discard the original
> return address, but it's not in the usual location.  With care, we could
> write a special case unwinder for them. Note that we also cannot unwind
> from any PLT on the way to the trampolines, so we'd also need to
> identify those.  Luckily we're in charge of creating those, and (for
> now) we only need to care about the module PLTs.
> 
> The bigger problem is return_to_handler, since there's a transient
> period when C code removes the return address from the graph return
> stack before passing this to assembly in a register, and so we can't
> reliably find the correct return address during this period. With care
> we could special case unwinding immediately before/after this.
> 

This is what I meant when I said "as the original return address may not be available in
that context" because the original address is popped off the return address stack
by the ftrace code called from the trampoline.

> If we could find a way to restructure return_to_handler such that we can
> reliably find the correct return address, that would be a useful
> improvement today, and would mean that we don't have to blacklist it for
> reliable stacktrace.
> 

Agreed. But until then it needs to be blacklisted. Rather than wait for that restructuring
to be done, we could initially blacklist it and remove the blacklist if and when the
restructuring is done.

Thanks.

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable
  2021-04-09 12:27     ` Mark Rutland
@ 2021-04-09 17:23       ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-09 17:23 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel


>> Also, the Function Graph Tracer modifies the return address of a traced
>> function to a return trampoline to gather tracing data on function return.
>> Stack traces taken from that trampoline and functions it calls are
>> unreliable as the original return address may not be available in
>> that context. Mark the stack trace unreliable accordingly.
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
>> ---
>>  arch/arm64/kernel/entry-ftrace.S | 12 +++++++
>>  arch/arm64/kernel/stacktrace.c   | 61 ++++++++++++++++++++++++++++++++
>>  2 files changed, 73 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
>> index b3e4f9a088b1..1f0714a50c71 100644
>> --- a/arch/arm64/kernel/entry-ftrace.S
>> +++ b/arch/arm64/kernel/entry-ftrace.S
>> @@ -86,6 +86,18 @@ SYM_CODE_START(ftrace_caller)
>>  	b	ftrace_common
>>  SYM_CODE_END(ftrace_caller)
>>  
>> +/*
>> + * A stack trace taken from anywhere in the FTRACE trampoline code should be
>> + * considered unreliable as a tracer function (patched at ftrace_call) could
>> + * potentially set pt_regs->pc and redirect execution to a function different
>> + * than the traced function. E.g., livepatch.
> 
> IIUC the issue here that we have two copies of the pc: one in the regs,
> and one in a frame record, and so after the update to the regs, the
> frame record is stale.
> 
> This is something that we could fix by having
> ftrace_instruction_pointer_set() set both.
> 

Yes. I will look at this.

> However, as noted elsewhere there are other issues which mean we'd still
> need special unwinding code for this.
> 

The only other cases we have discussed are EL1 exceptions in the ftrace code
and the return trampoline for function graph tracing. Is there any other case?

Thanks.

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-09 12:09   ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks Mark Rutland
  2021-04-09 17:16     ` Madhavan T. Venkataraman
@ 2021-04-09 21:37     ` Josh Poimboeuf
  2021-04-09 22:05       ` Madhavan T. Venkataraman
  2021-04-12 17:36       ` Mark Brown
  1 sibling, 2 replies; 33+ messages in thread
From: Josh Poimboeuf @ 2021-04-09 21:37 UTC (permalink / raw)
  To: Mark Rutland
  Cc: madvenka, broonie, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra

On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote:
> On Mon, Apr 05, 2021 at 03:43:09PM -0500, madvenka@linux.microsoft.com wrote:
> > From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> > 
> > There are a number of places in kernel code where the stack trace is not
> > reliable. Enhance the unwinder to check for those cases and mark the
> > stack trace as unreliable. Once all of the checks are in place, the unwinder
> > can provide a reliable stack trace. But before this can be used for livepatch,
> > some other entity needs to guarantee that the frame pointers are all set up
> > correctly in kernel functions. objtool is currently being worked on to
> > fill that gap.
> > 
> > Except for the return address check, all the other checks involve checking
> > the return PC of every frame against certain kernel functions. To do this,
> > implement some infrastructure code:
> > 
> > 	- Define a special_functions[] array and populate the array with
> > 	  the special functions
> 
> I'm not too keen on having to manually collate this within the unwinder,
> as it's very painful from a maintenance perspective.

Agreed.

> I'd much rather we could associate this information with the
> implementations of these functions, so that they're more likely to
> stay in sync.
> 
> Further, I believe all the special cases are assembly functions, and
> most of those are already in special sections to begin with. I reckon
> it'd be simpler and more robust to reject unwinding based on the
> section. If we need to unwind across specific functions in those
> sections, we could opt-in with some metadata. So e.g. we could reject
> all functions in ".entry.text", special casing the EL0 entry functions
> if necessary.

Couldn't this also end up being somewhat fragile?  Saying "certain
sections are deemed unreliable" isn't necessarily obvious to somebody
who doesn't already know about it, and it could be overlooked or
forgotten over time.  And there's no way to enforce it stays that way.

FWIW, over the years we've had zero issues with encoding the frame
pointer on x86.  After you save pt_regs, you encode the frame pointer to
point to it.  Ideally in the same macro so it's hard to overlook.

If you're concerned about debuggers getting confused by the encoding -
which debuggers specifically?  In my experience, if vmlinux has
debuginfo, gdb and most other debuggers will use DWARF (which is already
broken in asm code) and completely ignore frame pointers.

> I think there's a lot more code that we cannot unwind, e.g. KVM
> exception code, or almost anything marked with SYM_CODE_END().

Just a reminder that livepatch only unwinds blocked tasks (plus the
'current' task which calls into livepatch).  So practically speaking, it
doesn't matter whether the 'unreliable' detection has full coverage.
The only exceptions which really matter are those which end up calling
schedule(), e.g. preemption or page faults.

Being able to consistently detect *all* possible unreliable paths would
be nice in theory, but it's unnecessary and may not be worth the extra
complexity.

-- 
Josh


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-09 21:37     ` Josh Poimboeuf
@ 2021-04-09 22:05       ` Madhavan T. Venkataraman
  2021-04-09 22:32         ` Josh Poimboeuf
  2021-04-12 17:36       ` Mark Brown
  1 sibling, 1 reply; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-09 22:05 UTC (permalink / raw)
  To: Josh Poimboeuf, Mark Rutland
  Cc: broonie, jthierry, catalin.marinas, will, linux-arm-kernel,
	live-patching, linux-kernel, Peter Zijlstra



On 4/9/21 4:37 PM, Josh Poimboeuf wrote:
> On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote:
>> On Mon, Apr 05, 2021 at 03:43:09PM -0500, madvenka@linux.microsoft.com wrote:
>>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>>
>>> There are a number of places in kernel code where the stack trace is not
>>> reliable. Enhance the unwinder to check for those cases and mark the
>>> stack trace as unreliable. Once all of the checks are in place, the unwinder
>>> can provide a reliable stack trace. But before this can be used for livepatch,
>>> some other entity needs to guarantee that the frame pointers are all set up
>>> correctly in kernel functions. objtool is currently being worked on to
>>> fill that gap.
>>>
>>> Except for the return address check, all the other checks involve checking
>>> the return PC of every frame against certain kernel functions. To do this,
>>> implement some infrastructure code:
>>>
>>> 	- Define a special_functions[] array and populate the array with
>>> 	  the special functions
>>
>> I'm not too keen on having to manually collate this within the unwinder,
>> as it's very painful from a maintenance perspective.
> 
> Agreed.
> 
>> I'd much rather we could associate this information with the
>> implementations of these functions, so that they're more likely to
>> stay in sync.
>>
>> Further, I believe all the special cases are assembly functions, and
>> most of those are already in special sections to begin with. I reckon
>> it'd be simpler and more robust to reject unwinding based on the
>> section. If we need to unwind across specific functions in those
>> sections, we could opt-in with some metadata. So e.g. we could reject
>> all functions in ".entry.text", special casing the EL0 entry functions
>> if necessary.
> 
> Couldn't this also end up being somewhat fragile?  Saying "certain
> sections are deemed unreliable" isn't necessarily obvious to somebody
> who doesn't already know about it, and it could be overlooked or
> forgotten over time.  And there's no way to enforce it stays that way.
> 

Good point!

> FWIW, over the years we've had zero issues with encoding the frame
> pointer on x86.  After you save pt_regs, you encode the frame pointer to
> point to it.  Ideally in the same macro so it's hard to overlook.
> 

I had the same opinion. In fact, in my encoding scheme, I have additional
checks to make absolutely sure that it is a true encoding and not stack
corruption. The chances of all of those values accidentally matching are,
well, null.

> If you're concerned about debuggers getting confused by the encoding -
> which debuggers specifically?  In my experience, if vmlinux has
> debuginfo, gdb and most other debuggers will use DWARF (which is already
> broken in asm code) and completely ignore frame pointers.
> 

Yes. I checked gdb actually. It did not show a problem.

>> I think there's a lot more code that we cannot unwind, e.g. KVM
>> exception code, or almost anything marked with SYM_CODE_END().
> 
> Just a reminder that livepatch only unwinds blocked tasks (plus the
> 'current' task which calls into livepatch).  So practically speaking, it
> doesn't matter whether the 'unreliable' detection has full coverage.
> The only exceptions which really matter are those which end up calling
> schedule(), e.g. preemption or page faults.
> 
> Being able to consistently detect *all* possible unreliable paths would
> be nice in theory, but it's unnecessary and may not be worth the extra
> complexity.
> 

You do have a point. I tried to think of arch_stack_walk_reliable() as
something that should be implemented independent of livepatching. But
I could not really come up with a single example of where else it would
really be useful.

So, if we assume that the reliable stack trace is solely for the purpose
of livepatching, I agree with your earlier comments as well.

Thanks!

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-09 22:05       ` Madhavan T. Venkataraman
@ 2021-04-09 22:32         ` Josh Poimboeuf
  2021-04-09 22:53           ` Josh Poimboeuf
  2021-04-12 16:59           ` Mark Brown
  0 siblings, 2 replies; 33+ messages in thread
From: Josh Poimboeuf @ 2021-04-09 22:32 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: Mark Rutland, broonie, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra

On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote:
> > FWIW, over the years we've had zero issues with encoding the frame
> > pointer on x86.  After you save pt_regs, you encode the frame pointer to
> > point to it.  Ideally in the same macro so it's hard to overlook.
> > 
> 
> I had the same opinion. In fact, in my encoding scheme, I have additional
> checks to make absolutely sure that it is a true encoding and not stack
> corruption. The chances of all of those values accidentally matching are,
> well, null.

Right, stack corruption -- which is already exceedingly rare -- would
have to be combined with a miracle or two in order to come out of the
whole thing marked as 'reliable' :-)

And really, we already take a similar risk today by "trusting" the frame
pointer value on the stack to a certain extent.

> >> I think there's a lot more code that we cannot unwind, e.g. KVM
> >> exception code, or almost anything marked with SYM_CODE_END().
> > 
> > Just a reminder that livepatch only unwinds blocked tasks (plus the
> > 'current' task which calls into livepatch).  So practically speaking, it
> > doesn't matter whether the 'unreliable' detection has full coverage.
> > The only exceptions which really matter are those which end up calling
> > schedule(), e.g. preemption or page faults.
> > 
> > Being able to consistently detect *all* possible unreliable paths would
> > be nice in theory, but it's unnecessary and may not be worth the extra
> > complexity.
> > 
> 
> You do have a point. I tried to think of arch_stack_walk_reliable() as
> something that should be implemented independent of livepatching. But
> I could not really come up with a single example of where else it would
> really be useful.
> 
> So, if we assume that the reliable stack trace is solely for the purpose
> of livepatching, I agree with your earlier comments as well.

One thought: if folks really view this as a problem, it might help to
just rename things to reduce confusion.

For example, instead of calling it 'reliable', we could call it
something more precise, like 'klp_reliable', to indicate that its
reliable enough for live patching.

Then have a comment above 'klp_reliable' and/or
stack_trace_save_tsk_klp_reliable() which describes what that means.

Hm, for that matter, even without renaming things, a comment above
stack_trace_save_tsk_reliable() describing the meaning of "reliable"
would be a good idea.

-- 
Josh


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-09 22:32         ` Josh Poimboeuf
@ 2021-04-09 22:53           ` Josh Poimboeuf
  2021-04-11 17:54             ` Madhavan T. Venkataraman
  2021-04-12 16:59           ` Mark Brown
  1 sibling, 1 reply; 33+ messages in thread
From: Josh Poimboeuf @ 2021-04-09 22:53 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: Mark Rutland, broonie, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra

On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote:
> On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote:
> > > FWIW, over the years we've had zero issues with encoding the frame
> > > pointer on x86.  After you save pt_regs, you encode the frame pointer to
> > > point to it.  Ideally in the same macro so it's hard to overlook.
> > > 
> > 
> > I had the same opinion. In fact, in my encoding scheme, I have additional
> > checks to make absolutely sure that it is a true encoding and not stack
> > corruption. The chances of all of those values accidentally matching are,
> > well, null.
> 
> Right, stack corruption -- which is already exceedingly rare -- would
> have to be combined with a miracle or two in order to come out of the
> whole thing marked as 'reliable' :-)
> 
> And really, we already take a similar risk today by "trusting" the frame
> pointer value on the stack to a certain extent.

Oh yeah, I forgot to mention some more benefits of encoding the frame
pointer (or marking pt_regs in some other way):

a) Stack addresses can be printed properly: '%pS' for printing regs->pc
   and '%pB' for printing call returns.

   Using '%pS' for call returns (as arm64 seems to do today) will result
   in printing the wrong function when you have tail calls to noreturn
   functions on the stack (which is actually quite common for calls to
   panic(), die(), etc).

   More details:

   https://lkml.kernel.org/r/20210403155948.ubbgtwmlsdyar7yp@treble

b) Stack dumps to the console can dump the exception registers they find
   along the way.  This is actually quite nice for debugging.


-- 
Josh


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-09 22:53           ` Josh Poimboeuf
@ 2021-04-11 17:54             ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-11 17:54 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Mark Rutland, broonie, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra



On 4/9/21 5:53 PM, Josh Poimboeuf wrote:
> On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote:
>> On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote:
>>>> FWIW, over the years we've had zero issues with encoding the frame
>>>> pointer on x86.  After you save pt_regs, you encode the frame pointer to
>>>> point to it.  Ideally in the same macro so it's hard to overlook.
>>>>
>>>
>>> I had the same opinion. In fact, in my encoding scheme, I have additional
>>> checks to make absolutely sure that it is a true encoding and not stack
>>> corruption. The chances of all of those values accidentally matching are,
>>> well, null.
>>
>> Right, stack corruption -- which is already exceedingly rare -- would
>> have to be combined with a miracle or two in order to come out of the
>> whole thing marked as 'reliable' :-)
>>
>> And really, we already take a similar risk today by "trusting" the frame
>> pointer value on the stack to a certain extent.
> 
> Oh yeah, I forgot to mention some more benefits of encoding the frame
> pointer (or marking pt_regs in some other way):
> 
> a) Stack addresses can be printed properly: '%pS' for printing regs->pc
>    and '%pB' for printing call returns.
> 
>    Using '%pS' for call returns (as arm64 seems to do today) will result
>    in printing the wrong function when you have tail calls to noreturn
>    functions on the stack (which is actually quite common for calls to
>    panic(), die(), etc).
> 
>    More details:
> 
>    https://lkml.kernel.org/r/20210403155948.ubbgtwmlsdyar7yp@treble
> 
> b) Stack dumps to the console can dump the exception registers they find
>    along the way.  This is actually quite nice for debugging.
> 
> 

Great.

I am preparing version 3 taking into account comments from yourself,
Mark Rutland and Mark Brown.

Stay tuned.

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-09 22:32         ` Josh Poimboeuf
  2021-04-09 22:53           ` Josh Poimboeuf
@ 2021-04-12 16:59           ` Mark Brown
  2021-04-13 22:53             ` Josh Poimboeuf
  1 sibling, 1 reply; 33+ messages in thread
From: Mark Brown @ 2021-04-12 16:59 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Madhavan T. Venkataraman, Mark Rutland, jthierry,
	catalin.marinas, will, linux-arm-kernel, live-patching,
	linux-kernel, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 1061 bytes --]

On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote:

> Hm, for that matter, even without renaming things, a comment above
> stack_trace_save_tsk_reliable() describing the meaning of "reliable"
> would be a good idea.

Might be better to place something at the prototype for
arch_stack_walk_reliable() or cross link the two since that's where any
new architectures should be starting, or perhaps even better to extend
the document that Mark wrote further and point to that from both places.  

Some more explict pointer to live patching as the only user would
definitely be good but I think the more important thing would be writing
down any assumptions in the API that aren't already written down and
we're supposed to be relying on.  Mark's document captured a lot of it
but it sounds like there's more here, and even with knowing that this
interface is only used by live patch and digging into what it does it's
not always clear what happens to work with the code right now and what's
something that's suitable to be relied on.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-09 21:37     ` Josh Poimboeuf
  2021-04-09 22:05       ` Madhavan T. Venkataraman
@ 2021-04-12 17:36       ` Mark Brown
  2021-04-12 19:55         ` Madhavan T. Venkataraman
  1 sibling, 1 reply; 33+ messages in thread
From: Mark Brown @ 2021-04-12 17:36 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Mark Rutland, madvenka, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 1596 bytes --]

On Fri, Apr 09, 2021 at 04:37:41PM -0500, Josh Poimboeuf wrote:
> On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote:

> > Further, I believe all the special cases are assembly functions, and
> > most of those are already in special sections to begin with. I reckon
> > it'd be simpler and more robust to reject unwinding based on the
> > section. If we need to unwind across specific functions in those
> > sections, we could opt-in with some metadata. So e.g. we could reject
> > all functions in ".entry.text", special casing the EL0 entry functions
> > if necessary.

> Couldn't this also end up being somewhat fragile?  Saying "certain
> sections are deemed unreliable" isn't necessarily obvious to somebody
> who doesn't already know about it, and it could be overlooked or
> forgotten over time.  And there's no way to enforce it stays that way.

Anything in this area is going to have some opportunity for fragility
and missed assumptions somewhere.  I do find the idea of using the
SYM_CODE annotations that we already have and use for other purposes to
flag code that we don't expect to be suitable for reliable unwinding
appealing from that point of view.  It's pretty clear at the points
where they're used that they're needed, even with a pretty surface level
review, and the bit actually pushing things into a section is going to
be in a single place where the macro is defined.  That seems relatively
robust as these things go, it seems no worse than our reliance on
SYM_FUNC to create BTI annotations.  Missing those causes oopses when we
try to branch to the function.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-12 17:36       ` Mark Brown
@ 2021-04-12 19:55         ` Madhavan T. Venkataraman
  2021-04-13 11:02           ` Mark Brown
  0 siblings, 1 reply; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-12 19:55 UTC (permalink / raw)
  To: Mark Brown, Josh Poimboeuf
  Cc: Mark Rutland, jthierry, catalin.marinas, will, linux-arm-kernel,
	live-patching, linux-kernel, Peter Zijlstra



On 4/12/21 12:36 PM, Mark Brown wrote:
> On Fri, Apr 09, 2021 at 04:37:41PM -0500, Josh Poimboeuf wrote:
>> On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote:
> 
>>> Further, I believe all the special cases are assembly functions, and
>>> most of those are already in special sections to begin with. I reckon
>>> it'd be simpler and more robust to reject unwinding based on the
>>> section. If we need to unwind across specific functions in those
>>> sections, we could opt-in with some metadata. So e.g. we could reject
>>> all functions in ".entry.text", special casing the EL0 entry functions
>>> if necessary.
> 
>> Couldn't this also end up being somewhat fragile?  Saying "certain
>> sections are deemed unreliable" isn't necessarily obvious to somebody
>> who doesn't already know about it, and it could be overlooked or
>> forgotten over time.  And there's no way to enforce it stays that way.
> 
> Anything in this area is going to have some opportunity for fragility
> and missed assumptions somewhere.  I do find the idea of using the
> SYM_CODE annotations that we already have and use for other purposes to
> flag code that we don't expect to be suitable for reliable unwinding
> appealing from that point of view.  It's pretty clear at the points
> where they're used that they're needed, even with a pretty surface level
> review, and the bit actually pushing things into a section is going to
> be in a single place where the macro is defined.  That seems relatively
> robust as these things go, it seems no worse than our reliance on
> SYM_FUNC to create BTI annotations.  Missing those causes oopses when we
> try to branch to the function.
> 

OK. Just so I am clear on the whole picture, let me state my understanding so far.
Correct me if I am wrong.

1. We are hoping that we can convert a significant number of SYM_CODE functions to
   SYM_FUNC functions by providing them with a proper FP prolog and epilog so that
   we can get objtool coverage for them. These don't need any blacklisting.

2. If we can locate the pt_regs structures created on the stack cleanly for EL1
   exceptions, etc, then we can handle those cases in the unwinder without needing
   any black listing.

   I have a solution for this in version 3 that does it without encoding the FP or
   matching values on the stack. I have addressed all of the objections so far on
   that count. I will send the patch series out soon.

3. We are going to assume that the reliable unwinder is only for livepatch purposes
   and will only be invoked on a task that is not currently running. The task either
   voluntarily gave up the CPU or was pre-empted. We can safely ignore all SYM_CODE
   functions that will never voluntarily give up the CPU. They can only be pre-empted
   and pre-emption is already handled in (2). We don't need to blacklist any of these
   functions.

4. So, the only functions that will need blacklisting are the remaining SYM_CODE functions
   that might give up the CPU voluntarily. At this point, I am not even sure how
   many of these will exist. One hopes that all of these would have ended up as
   SYM_FUNC functions in (1).

So, IMHO, placing code in a black listed section should be the last step and not the first
one. This also satisfies Mark Rutland's requirement that no one muck with the entry text
while he is sorting out that code.

I suggest we do (3) first. Then, review the assembly functions to do (1). Then, review the
remaining ones to see which ones must be blacklisted, if any.

Do you agree?

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-12 19:55         ` Madhavan T. Venkataraman
@ 2021-04-13 11:02           ` Mark Brown
  2021-04-14 10:23             ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 33+ messages in thread
From: Mark Brown @ 2021-04-13 11:02 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: Josh Poimboeuf, Mark Rutland, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 1828 bytes --]

On Mon, Apr 12, 2021 at 02:55:35PM -0500, Madhavan T. Venkataraman wrote:

> 
> OK. Just so I am clear on the whole picture, let me state my understanding so far.
> Correct me if I am wrong.

> 1. We are hoping that we can convert a significant number of SYM_CODE functions to
>    SYM_FUNC functions by providing them with a proper FP prolog and epilog so that
>    we can get objtool coverage for them. These don't need any blacklisting.

I wouldn't expect to be converting lots of SYM_CODE to SYM_FUNC.  I'd
expect the overwhelming majority of SYM_CODE to be SYM_CODE because it's
required to be non standard due to some external interface - things like
the exception vectors, ftrace, and stuff around suspend/hibernate.  A
quick grep seems to confirm this.

> 3. We are going to assume that the reliable unwinder is only for livepatch purposes
>    and will only be invoked on a task that is not currently running. The task either

The reliable unwinder can also be invoked on itself.

> 4. So, the only functions that will need blacklisting are the remaining SYM_CODE functions
>    that might give up the CPU voluntarily. At this point, I am not even sure how
>    many of these will exist. One hopes that all of these would have ended up as
>    SYM_FUNC functions in (1).

There's stuff like ret_from_fork there.

> I suggest we do (3) first. Then, review the assembly functions to do (1). Then, review the
> remaining ones to see which ones must be blacklisted, if any.

I'm not clear what the concrete steps you're planning to do first are
there - your 3 seems like a statement of assumptions.  For flagging
functions I do think it'd be safer to default to assuming that all
SYM_CODE functions can't be unwound reliably rather than only explicitly
listing ones that cause problems.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-12 16:59           ` Mark Brown
@ 2021-04-13 22:53             ` Josh Poimboeuf
  2021-04-14 12:24               ` Mark Brown
  0 siblings, 1 reply; 33+ messages in thread
From: Josh Poimboeuf @ 2021-04-13 22:53 UTC (permalink / raw)
  To: Mark Brown
  Cc: Madhavan T. Venkataraman, Mark Rutland, jthierry,
	catalin.marinas, will, linux-arm-kernel, live-patching,
	linux-kernel, Peter Zijlstra

On Mon, Apr 12, 2021 at 05:59:33PM +0100, Mark Brown wrote:
> On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote:
> 
> > Hm, for that matter, even without renaming things, a comment above
> > stack_trace_save_tsk_reliable() describing the meaning of "reliable"
> > would be a good idea.
> 
> Might be better to place something at the prototype for
> arch_stack_walk_reliable() or cross link the two since that's where any
> new architectures should be starting, or perhaps even better to extend
> the document that Mark wrote further and point to that from both places.  
> 
> Some more explict pointer to live patching as the only user would
> definitely be good but I think the more important thing would be writing
> down any assumptions in the API that aren't already written down and
> we're supposed to be relying on.  Mark's document captured a lot of it
> but it sounds like there's more here, and even with knowing that this
> interface is only used by live patch and digging into what it does it's
> not always clear what happens to work with the code right now and what's
> something that's suitable to be relied on.

Something like so?

From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH] livepatch: Clarify the meaning of 'reliable'

Update the comments and documentation to reflect what 'reliable'
unwinding actually means, in the context of live patching.

Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 .../livepatch/reliable-stacktrace.rst         | 26 +++++++++++++----
 arch/x86/kernel/stacktrace.c                  |  6 ----
 include/linux/stacktrace.h                    | 29 +++++++++++++++++--
 kernel/stacktrace.c                           |  7 ++++-
 4 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/Documentation/livepatch/reliable-stacktrace.rst b/Documentation/livepatch/reliable-stacktrace.rst
index 67459d2ca2af..e325efc7e952 100644
--- a/Documentation/livepatch/reliable-stacktrace.rst
+++ b/Documentation/livepatch/reliable-stacktrace.rst
@@ -72,7 +72,21 @@ The unwinding process varies across architectures, their respective procedure
 call standards, and kernel configurations. This section describes common
 details that architectures should consider.
 
-4.1 Identifying successful termination
+4.1 Only preemptible code needs reliability detection
+-----------------------------------------------------
+
+The only current user of reliable stacktracing is livepatch, which only
+calls it for a) inactive tasks; or b) the current task in task context.
+
+Therefore, the unwinder only needs to detect the reliability of stacks
+involving *preemptible* code.
+
+Practically speaking, reliability of stacks involving *non-preemptible*
+code is a "don't-care".  It may help to return a wrong reliability
+result for such cases, if it results in reduced complexity, since such
+cases will not happen in practice.
+
+4.2 Identifying successful termination
 --------------------------------------
 
 Unwinding may terminate early for a number of reasons, including:
@@ -95,7 +109,7 @@ architectures verify that a stacktrace ends at an expected location, e.g.
 * On a specific stack expected for a kernel entry point (e.g. if the
   architecture has separate task and IRQ stacks).
 
-4.2 Identifying unwindable code
+4.3 Identifying unwindable code
 -------------------------------
 
 Unwinding typically relies on code following specific conventions (e.g.
@@ -129,7 +143,7 @@ unreliable to unwind from, e.g.
 
 * Identifying specific portions of code using bounds information.
 
-4.3 Unwinding across interrupts and exceptions
+4.4 Unwinding across interrupts and exceptions
 ----------------------------------------------
 
 At function call boundaries the stack and other unwind state is expected to be
@@ -156,7 +170,7 @@ have no such cases) should attempt to unwind across exception boundaries, as
 doing so can prevent unnecessarily stalling livepatch consistency checks and
 permits livepatch transitions to complete more quickly.
 
-4.4 Rewriting of return addresses
+4.5 Rewriting of return addresses
 ---------------------------------
 
 Some trampolines temporarily modify the return address of a function in order
@@ -222,7 +236,7 @@ middle of return_to_handler and can report this as unreliable. Architectures
 are not required to unwind from other trampolines which modify the return
 address.
 
-4.5 Obscuring of return addresses
+4.6 Obscuring of return addresses
 ---------------------------------
 
 Some trampolines do not rewrite the return address in order to intercept
@@ -249,7 +263,7 @@ than the link register as would usually be the case.
 Architectures must either ensure that unwinders either reliably unwind
 such cases, or report the unwinding as unreliable.
 
-4.6 Link register unreliability
+4.7 Link register unreliability
 -------------------------------
 
 On some other architectures, 'call' instructions place the return address into a
diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 8627fda8d993..15b058eefc4e 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -29,12 +29,6 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
 	}
 }
 
-/*
- * This function returns an error if it detects any unreliable features of the
- * stack.  Otherwise it guarantees that the stack trace is reliable.
- *
- * If the task is not 'current', the caller *must* ensure the task is inactive.
- */
 int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 			     void *cookie, struct task_struct *task)
 {
diff --git a/include/linux/stacktrace.h b/include/linux/stacktrace.h
index 50e2df30b0aa..1b6a65a0ad22 100644
--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -26,7 +26,7 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size);
 #ifdef CONFIG_ARCH_STACKWALK
 
 /**
- * stack_trace_consume_fn - Callback for arch_stack_walk()
+ * stack_trace_consume_fn() - Callback for arch_stack_walk()
  * @cookie:	Caller supplied pointer handed back by arch_stack_walk()
  * @addr:	The stack entry address to consume
  *
@@ -35,7 +35,7 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size);
  */
 typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr);
 /**
- * arch_stack_walk - Architecture specific function to walk the stack
+ * arch_stack_walk() - Architecture specific function to walk the stack
  * @consume_entry:	Callback which is invoked by the architecture code for
  *			each entry.
  * @cookie:		Caller supplied pointer which is handed back to
@@ -52,8 +52,33 @@ typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr);
  */
 void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
 		     struct task_struct *task, struct pt_regs *regs);
+
+/**
+ * arch_stack_walk_reliable() - Architecture specific function to walk the
+ *				stack, with stack reliability check
+ * @consume_entry:	Callback which is invoked by the architecture code for
+ *			each entry.
+ * @cookie:		Caller supplied pointer which is handed back to
+ *			@consume_entry
+ * @task:		Pointer to a task struct, can be NULL for current
+ *
+ * Return: 0 if the stack trace is considered reliable for livepatch; else < 0.
+ *
+ * NOTE: This interface is only used by livepatch.  The caller must ensure that
+ *	 it's only called in one of the following two scenarios:
+ *
+ *	 a) the task is inactive (and guaranteed to remain so); or
+ *
+ *	 b) the task is 'current', running in task context.
+ *
+ * Effectively, this means the arch unwinder doesn't need to detect the
+ * reliability of stacks involving non-preemptible code.
+ *
+ * For more details, see Documentation/livepatch/reliable-stacktrace.rst.
+ */
 int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry, void *cookie,
 			     struct task_struct *task);
+
 void arch_stack_walk_user(stack_trace_consume_fn consume_entry, void *cookie,
 			  const struct pt_regs *regs);
 
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index 9f8117c7cfdd..a198fd194fed 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -185,7 +185,12 @@ unsigned int stack_trace_save_regs(struct pt_regs *regs, unsigned long *store,
  *		stack. Otherwise it guarantees that the stack trace is
  *		reliable and returns the number of entries stored.
  *
- * If the task is not 'current', the caller *must* ensure the task is inactive.
+ * NOTE: This interface is only used by livepatch.  The caller must ensure that
+ *	 it's only called in one of the following two scenarios:
+ *
+ *	 a) the task is inactive (and guaranteed to remain so); or
+ *
+ *	 b) the task is 'current', running in task context.
  */
 int stack_trace_save_tsk_reliable(struct task_struct *tsk, unsigned long *store,
 				  unsigned int size)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-13 11:02           ` Mark Brown
@ 2021-04-14 10:23             ` Madhavan T. Venkataraman
  2021-04-14 12:35               ` Mark Brown
  2021-04-16 14:43               ` Madhavan T. Venkataraman
  0 siblings, 2 replies; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-14 10:23 UTC (permalink / raw)
  To: Mark Brown
  Cc: Josh Poimboeuf, Mark Rutland, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra



On 4/13/21 6:02 AM, Mark Brown wrote:
> On Mon, Apr 12, 2021 at 02:55:35PM -0500, Madhavan T. Venkataraman wrote:
> 
>>
>> OK. Just so I am clear on the whole picture, let me state my understanding so far.
>> Correct me if I am wrong.
> 
>> 1. We are hoping that we can convert a significant number of SYM_CODE functions to
>>    SYM_FUNC functions by providing them with a proper FP prolog and epilog so that
>>    we can get objtool coverage for them. These don't need any blacklisting.
> 
> I wouldn't expect to be converting lots of SYM_CODE to SYM_FUNC.  I'd
> expect the overwhelming majority of SYM_CODE to be SYM_CODE because it's
> required to be non standard due to some external interface - things like
> the exception vectors, ftrace, and stuff around suspend/hibernate.  A
> quick grep seems to confirm this.
> 

OK. Fair enough.

>> 3. We are going to assume that the reliable unwinder is only for livepatch purposes
>>    and will only be invoked on a task that is not currently running. The task either
> 
> The reliable unwinder can also be invoked on itself.
> 

I have not called out the self-directed case because I am assuming that the reliable unwinder
is only used for livepatch. So, AFAICT, this is applicable to the task that performs the
livepatch operation itself. In this case, there should be no unreliable functions on the
self-directed stack trace (otherwise, livepatching would always fail).

>> 4. So, the only functions that will need blacklisting are the remaining SYM_CODE functions
>>    that might give up the CPU voluntarily. At this point, I am not even sure how
>>    many of these will exist. One hopes that all of these would have ended up as
>>    SYM_FUNC functions in (1).
> 
> There's stuff like ret_from_fork there.
> 

OK. There would be a few functions that fit this category. I agree.

>> I suggest we do (3) first. Then, review the assembly functions to do (1). Then, review the
>> remaining ones to see which ones must be blacklisted, if any.
> 
> I'm not clear what the concrete steps you're planning to do first are
> there - your 3 seems like a statement of assumptions.  For flagging
> functions I do think it'd be safer to default to assuming that all
> SYM_CODE functions can't be unwound reliably rather than only explicitly
> listing ones that cause problems.
> 

They are not assumptions. They are true statements. But I probably did not do a good
job of explaining. But Josh sent out a patch that updates the documentation that
explains what I said a lot better.

In any case, I have absolutely no problems in implementing your section idea. I will
make an attempt to do that in version 3 of my patch series.

Stay tuned.

And, thanks for all the input. It is very helpful.

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-13 22:53             ` Josh Poimboeuf
@ 2021-04-14 12:24               ` Mark Brown
  0 siblings, 0 replies; 33+ messages in thread
From: Mark Brown @ 2021-04-14 12:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Madhavan T. Venkataraman, Mark Rutland, jthierry,
	catalin.marinas, will, linux-arm-kernel, live-patching,
	linux-kernel, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 1428 bytes --]

On Tue, Apr 13, 2021 at 05:53:10PM -0500, Josh Poimboeuf wrote:
> On Mon, Apr 12, 2021 at 05:59:33PM +0100, Mark Brown wrote:

> > Some more explict pointer to live patching as the only user would
> > definitely be good but I think the more important thing would be writing
> > down any assumptions in the API that aren't already written down and

> Something like so?

Yeah, looks reasonable - it'll need rebasing against current code as I
moved the docs in the source out of the arch code into the header this
cycle (they were copied verbatim in a couple of places).

>  #ifdef CONFIG_ARCH_STACKWALK
>  
>  /**
> - * stack_trace_consume_fn - Callback for arch_stack_walk()
> + * stack_trace_consume_fn() - Callback for arch_stack_walk()
>   * @cookie:	Caller supplied pointer handed back by arch_stack_walk()
>   * @addr:	The stack entry address to consume
>   *

> @@ -35,7 +35,7 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size);
>   */
>  typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr);
>  /**
> - * arch_stack_walk - Architecture specific function to walk the stack
> + * arch_stack_walk() - Architecture specific function to walk the stack
>   * @consume_entry:	Callback which is invoked by the architecture code for
>   *			each entry.
>   * @cookie:		Caller supplied pointer which is handed back to

These two should be separated.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-14 10:23             ` Madhavan T. Venkataraman
@ 2021-04-14 12:35               ` Mark Brown
  2021-04-16 14:43               ` Madhavan T. Venkataraman
  1 sibling, 0 replies; 33+ messages in thread
From: Mark Brown @ 2021-04-14 12:35 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: Josh Poimboeuf, Mark Rutland, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 1813 bytes --]

On Wed, Apr 14, 2021 at 05:23:38AM -0500, Madhavan T. Venkataraman wrote:
> On 4/13/21 6:02 AM, Mark Brown wrote:
> > On Mon, Apr 12, 2021 at 02:55:35PM -0500, Madhavan T. Venkataraman wrote:

> >> 3. We are going to assume that the reliable unwinder is only for livepatch purposes
> >>    and will only be invoked on a task that is not currently running. The task either
> > 
> > The reliable unwinder can also be invoked on itself.

> I have not called out the self-directed case because I am assuming that the reliable unwinder
> is only used for livepatch. So, AFAICT, this is applicable to the task that performs the
> livepatch operation itself. In this case, there should be no unreliable functions on the
> self-directed stack trace (otherwise, livepatching would always fail).

Someone might've added a probe of some kind which upsets things so
there's a possibility things might fail.  Like you say there's no way a
system in such a state can succesfully apply a live patch but we might
still run into that situation.

> >> I suggest we do (3) first. Then, review the assembly functions to do (1). Then, review the
> >> remaining ones to see which ones must be blacklisted, if any.

> > I'm not clear what the concrete steps you're planning to do first are
> > there - your 3 seems like a statement of assumptions.  For flagging
> > functions I do think it'd be safer to default to assuming that all
> > SYM_CODE functions can't be unwound reliably rather than only explicitly
> > listing ones that cause problems.

> They are not assumptions. They are true statements. But I probably did not do a good
> job of explaining. But Josh sent out a patch that updates the documentation that
> explains what I said a lot better.

You say true statements, I say assumptions :)

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-14 10:23             ` Madhavan T. Venkataraman
  2021-04-14 12:35               ` Mark Brown
@ 2021-04-16 14:43               ` Madhavan T. Venkataraman
  2021-04-16 15:36                 ` Mark Brown
  1 sibling, 1 reply; 33+ messages in thread
From: Madhavan T. Venkataraman @ 2021-04-16 14:43 UTC (permalink / raw)
  To: Mark Brown
  Cc: Josh Poimboeuf, Mark Rutland, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra



On 4/14/21 5:23 AM, Madhavan T. Venkataraman wrote:
> In any case, I have absolutely no problems in implementing your section idea. I will
> make an attempt to do that in version 3 of my patch series.

So, I attempted a patch with just declaring all .entry.text functions as unreliable
by checking just the section bounds. It does work for EL1 exceptions. But there
are other functions that are actually reliable that show up as unreliable.
The example in my test is el0_sync() which is at the base of all system call stacks.

How would you prefer I handle this? Should I place all SYM_CODE functions that
are actually safe for the unwinder in a separate section? I could just take
some approach and solve this. But I would like to get your opinion and Mark
Rutland's opinion so we are all on the same page.

Please let me know.

Madhavan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
  2021-04-16 14:43               ` Madhavan T. Venkataraman
@ 2021-04-16 15:36                 ` Mark Brown
  0 siblings, 0 replies; 33+ messages in thread
From: Mark Brown @ 2021-04-16 15:36 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: Josh Poimboeuf, Mark Rutland, jthierry, catalin.marinas, will,
	linux-arm-kernel, live-patching, linux-kernel, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 795 bytes --]

On Fri, Apr 16, 2021 at 09:43:48AM -0500, Madhavan T. Venkataraman wrote:

> How would you prefer I handle this? Should I place all SYM_CODE functions that
> are actually safe for the unwinder in a separate section? I could just take
> some approach and solve this. But I would like to get your opinion and Mark
> Rutland's opinion so we are all on the same page.

That sounds reasonable to me, obviously we'd have to look at how
exactly the annotation ends up getting done and general bikeshed colour
discussions.  I'm not sure if we want a specific "safe for unwinder
section" or to split things up into sections per reason things are safe
for the unwinder (kind of like what you were proposing for flagging
things as a problem), that might end up being useful for other things at
some point.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2021-04-16 15:36 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <705993ccb34a611c75cdae0a8cb1b40f9b218ebd>
2021-04-05 20:43 ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks madvenka
2021-04-05 20:43   ` [RFC PATCH v2 1/4] arm64: Implement infrastructure for " madvenka
2021-04-08 15:15     ` Mark Brown
2021-04-08 17:17     ` Mark Brown
2021-04-08 19:30       ` Madhavan T. Venkataraman
2021-04-08 23:30         ` Madhavan T. Venkataraman
2021-04-09 11:57           ` Mark Brown
2021-04-05 20:43   ` [RFC PATCH v2 2/4] arm64: Mark a stack trace unreliable if an EL1 exception frame is detected madvenka
2021-04-05 20:43   ` [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable madvenka
2021-04-08 16:58     ` Mark Brown
2021-04-08 19:23       ` Madhavan T. Venkataraman
2021-04-09 11:31         ` Mark Brown
2021-04-09 14:02           ` Madhavan T. Venkataraman
2021-04-09 12:27     ` Mark Rutland
2021-04-09 17:23       ` Madhavan T. Venkataraman
2021-04-05 20:43   ` [RFC PATCH v2 4/4] arm64: Mark stack trace as unreliable if kretprobed functions are present madvenka
2021-04-09 12:09   ` [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks Mark Rutland
2021-04-09 17:16     ` Madhavan T. Venkataraman
2021-04-09 21:37     ` Josh Poimboeuf
2021-04-09 22:05       ` Madhavan T. Venkataraman
2021-04-09 22:32         ` Josh Poimboeuf
2021-04-09 22:53           ` Josh Poimboeuf
2021-04-11 17:54             ` Madhavan T. Venkataraman
2021-04-12 16:59           ` Mark Brown
2021-04-13 22:53             ` Josh Poimboeuf
2021-04-14 12:24               ` Mark Brown
2021-04-12 17:36       ` Mark Brown
2021-04-12 19:55         ` Madhavan T. Venkataraman
2021-04-13 11:02           ` Mark Brown
2021-04-14 10:23             ` Madhavan T. Venkataraman
2021-04-14 12:35               ` Mark Brown
2021-04-16 14:43               ` Madhavan T. Venkataraman
2021-04-16 15:36                 ` Mark Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).