linux-kernel.vger.kernel.org archive mirror
* [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks
       [not found] <95691cae4f4504f33d0fc9075541b1e7deefe96f>
@ 2022-01-17 14:55 ` madvenka
  2022-01-17 14:55   ` [PATCH v13 01/11] arm64: Remove NULL task check from unwind_frame() madvenka
                     ` (10 more replies)
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
  1 sibling, 11 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:55 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

I have rebased this patch series on top of Linus' tree as of Jan 15, 2022.
The base commit is provided at the end of this email.

Remove NULL task check
======================

Currently, there is a check for a NULL task in unwind_frame(). It is not
needed since all current consumers pass a non-NULL task.

Rename unwinder functions
=========================

Rename unwinder functions to unwind*() similar to other architectures
for naming consistency.

	start_backtrace() ==> unwind_init()
	unwind_frame()    ==> unwind_next()
	walk_stackframe() ==> unwind()

Rename struct stackframe
========================

Rename "struct stackframe" to "struct unwind_state" for better naming
and consistency.

Split unwind_init()
===================

Unwind initialization has 3 cases. Accordingly, define 3 separate init
functions as follows:

	- unwind_init_from_regs()
	- unwind_init_from_current()
	- unwind_init_from_task()

This makes it easier to understand and add specialized code to each case
in the future.
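
For reference, a condensed excerpt of the three initializers as they appear in
patch 4 below (before the task argument is added in patch 5);
unwind_init_common() holds the setup shared by all three cases:

	/* Case 1: unwinding from exception/interrupt registers. */
	static inline void unwind_init_from_regs(struct unwind_state *state,
						 struct pt_regs *regs)
	{
		unwind_init_common(state);

		state->fp = regs->regs[29];
		state->pc = regs->pc;
	}

	/* Case 2: unwinding the current task from the caller of arch_stack_walk(). */
	static __always_inline void unwind_init_from_current(struct unwind_state *state)
	{
		unwind_init_common(state);

		state->fp = (unsigned long)__builtin_frame_address(1);
		state->pc = (unsigned long)__builtin_return_address(0);
	}

	/* Case 3: unwinding a blocked (not running) task from its saved state. */
	static inline void unwind_init_from_task(struct unwind_state *state,
						 struct task_struct *task)
	{
		unwind_init_common(state);

		state->fp = thread_saved_fp(task);
		state->pc = thread_saved_pc(task);
	}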

Copy task argument
==================

Copy the task argument passed to arch_stack_walk() to unwind_state so that
it can be passed to unwind functions via unwind_state rather than as a
separate argument. The task is a fundamental part of the unwind state.
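
A minimal sketch of this change, condensed from patch 5 below (struct
unwind_state gains a "task" field; the elided comments stand in for the rest
of the existing code):

	static void unwind_init_common(struct unwind_state *state,
				       struct task_struct *task)
	{
		state->task = task;
		/* ... rest of the common initialization ... */
	}

	static int notrace unwind_next(struct unwind_state *state)
	{
		unsigned long fp = state->fp;
		struct task_struct *tsk = state->task;	/* no more tsk argument */
		/* ... */
	}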

Use stack_trace_consume_fn
==========================

stack_trace_consume_fn is a typedef that is already defined in
linux/stacktrace.h, but unwind() does not use it for its consume_entry()
argument. Fix this. Also, rename the arguments to unwind() for better
naming consistency.
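
For reference, the typedef (already present in linux/stacktrace.h) and the
resulting unwind() prototype from patch 6 below:

	typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr);

	static void notrace unwind(struct unwind_state *state,
				   stack_trace_consume_fn consume_entry, void *cookie);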

Redefine the unwinder loop
==========================

Redefine the unwinder loop and make it simple and somewhat similar to other
architectures. Define the following:

	while (unwind_continue(&state, consume_entry, cookie))
		unwind_next(&state);

unwind_continue()
	This new function implements checks to determine whether the
	unwind should continue or terminate.
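
Condensed from patch 7 below, unwind_continue() looks like this:

	static bool notrace unwind_continue(struct unwind_state *state,
					    stack_trace_consume_fn consume_entry,
					    void *cookie)
	{
		if (state->failed) {
			/* PC is suspect. Cannot consume it. */
			return false;
		}

		if (!consume_entry(cookie, state->pc)) {
			/* Caller terminated the unwind. */
			state->failed = true;
			return false;
		}

		/* Terminate once the final frame has been consumed. */
		return state->fp != state->final_fp;
	}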

Reliability checks
==================

There are some kernel features and conditions that make a stack trace
unreliable. Callers may require the unwinder to detect these cases.
E.g., livepatch.

Introduce a new function called unwind_check_reliability() that will detect
these cases and set a boolean "reliable" in the unwind state. Call
unwind_check_reliability() for every frame.

Introduce the first reliability check in unwind_check_reliability() - If
a return PC is not a valid kernel text address, consider the stack
trace unreliable. It could be some generated code.

Other reliability checks will be added in the future.

Make unwind() return a boolean to indicate reliability of the stack trace.
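
The resulting loop, condensed from patch 8 below (the starting frame is
checked before the loop and every subsequent frame is checked after
unwind_next()):

	static bool notrace unwind(struct unwind_state *state,
				   stack_trace_consume_fn consume_entry, void *cookie)
	{
		unwind_check_reliability(state);
		while (unwind_continue(state, consume_entry, cookie)) {
			unwind_next(state);
			unwind_check_reliability(state);
		}
		return !state->failed && state->reliable;
	}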

SYM_CODE check
==============

This is the second reliability check implemented.

SYM_CODE functions do not follow normal calling conventions. They cannot
be unwound reliably using the frame pointer. Collect the address ranges
of these functions in a special section called "sym_code_functions".

In unwind_check_reliability(), check the return PC against these ranges. If
a match is found, then mark the stack trace unreliable.
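
The check added to unwind_check_reliability() in patch 9 below is a linear
scan of the collected ranges (sym_code_functions[] is populated from the
"sym_code_functions" section by an early_initcall()):

	pc = state->pc;
	for (i = 0; i < num_sym_code_functions; i++) {
		range = &sym_code_functions[i];
		if (pc >= range->start && pc < range->end) {
			/* PC lies inside a SYM_CODE function. */
			state->reliable = false;
			return;
		}
	}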

Last stack frame
================

If a SYM_CODE function occurs in the very last frame in the stack trace,
then the stack trace is not considered unreliable. This is because there
is no more unwinding to do. Examples:

	- EL0 exception stack traces end in the top level EL0 exception
	  handlers.

	- All kernel thread stack traces end in ret_from_fork().
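
The final-frame exception is handled at the top of unwind_check_reliability()
(see patch 8 below):

	if (state->fp == state->final_fp) {
		/* Final frame; no more unwind, no need to check reliability */
		return;
	}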

arch_stack_walk_reliable()
==========================

Introduce arch_stack_walk_reliable() for ARM64. This works like
arch_stack_walk() except that it returns an error if the stack trace is
found to be unreliable.

Until all of the reliability checks are in place in
unwind_check_reliability(), arch_stack_walk_reliable() may not be used by
livepatch. But it may be used by debug and test code.
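
Patch 10 is not quoted in this message, so the following is only a rough,
non-authoritative sketch of the intended shape, assuming the generic
arch_stack_walk_reliable() prototype from linux/stacktrace.h and the boolean
return value that unwind() gains in patch 8:

	noinline int notrace arch_stack_walk_reliable(stack_trace_consume_fn consume_fn,
						      void *cookie,
						      struct task_struct *task)
	{
		struct unwind_state state;

		if (task == current)
			unwind_init_from_current(&state, task);
		else
			unwind_init_from_task(&state, task);

		/* unwind() returns false if any frame was found to be unreliable. */
		if (!unwind(&state, consume_fn, cookie))
			return -EINVAL;

		return 0;
	}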

HAVE_RELIABLE_STACKTRACE
========================

Select this config for arm64. However, make it conditional on
STACK_VALIDATION. When objtool is enhanced to implement stack
validation for arm64, STACK_VALIDATION will be defined.

---
Changelog:
v13:
	From Mark Brown:

	- Reviewed-by for the following:

	[PATCH v12 03/10] arm64: Rename stackframe to unwind_state
	[PATCH v11 05/10] arm64: Copy unwind arguments to unwind_state
	[PATCH v11 07/10] arm64: Introduce stack trace reliability checks
	                  in the unwinder
	[PATCH v11 5/5] arm64: Create a list of SYM_CODE functions, check
	                return PC against list

	From Mark Rutland:

	- Reviewed-by for the following:

	[PATCH v12 01/10] arm64: Remove NULL task check from unwind_frame()
	[PATCH v12 02/10] arm64: Rename unwinder functions
	[PATCH v12 03/10] arm64: Rename stackframe to unwind_state

	- For each of the 3 cases of unwind initialization, have a separate
	  init function. Call the common init from each of these init
	  functions rather than call it separately.

	- Only copy the task argument to arch_stack_walk() into
	  unwind state. Pass the rest of the arguments as arguments to
	  unwind functions.

v12:
	From Mark Brown:

	- Reviewed-by for the following:

	[PATCH v11 1/5] arm64: Call stack_backtrace() only from within
	                walk_stackframe()
	[PATCH v11 2/5] arm64: Rename unwinder functions
	[PATCH v11 3/5] arm64: Make the unwind loop in unwind() similar to
	                other architectures
	[PATCH v11 5/5] arm64: Create a list of SYM_CODE functions, check
	                return PC against list

	- Add an extra patch at the end to select HAVE_RELIABLE_STACKTRACE
	  just as a placeholder for the review. I have added it and made
	  it conditional on STACK_VALIDATION, which has not yet been
	  implemented.

	- Mark had a concern about the code for the check for the final
	  frame being repeated in two places. I have now added a new
	  field called "final_fp" in struct stackframe which I compute
	  once in stacktrace initialization. I have added an explicit
	  comment that the stacktrace must terminate at the final_fp.

	- Place the implementation of arch_stack_walk_reliable() in a
	  separate patch after all the reliability checks have been
	  implemented.

	From Mark Rutland:

	- Place the removal of the NULL task check in unwind_frame() in
	  a separate patch.

	- Add a task field to struct stackframe so the task pointer can be
	  passed around via the frame instead of as a separate argument. I have
	  taken this a step further by copying all of the arguments to
	  arch_stack_walk() into struct stackframe so that only that
	  struct needs to be passed to unwind functions.

	- Rename start_backtrace() to unwind_init() instead of unwind_start().

	- Acked-by for the following:

	[PATCH v11 2/5] arm64: Rename unwinder functions

	- Rename "struct stackframe" to "struct unwind_state".

	- Define separate inline functions for initializing the starting
	  FP and PC from regs, or caller, or blocked task. Don't merge
	  unwind_init() into unwind().

v11:
	From Mark Rutland:

	- Peter Zijlstra has submitted patches that make ARCH_STACKWALK
	  independent of STACKTRACE. Mark Rutland extracted some of the
	  patches from my v10 series and added his own patches and comments,
	  rebased it on top of Peter's changes and submitted the series.
	  
	  So, I have rebased the rest of the patches from v10 on top of
	  Mark Rutland's changes.

	- Split the renaming of the unwinder functions and annotating them
	  with notrace and NOKPROBE_SYMBOL(). Also, there is currently no
	  need to annotate unwind_start() as its caller is already annotated
	  properly. So, I am removing the annotation patch from the series.
	  This can be done separately later if deemed necessary. Similarly,
	  I have removed the annotations from unwind_check_reliability() and
	  unwind_continue().

	From Nobuta Keiya:

	- unwind_start() should check for final frame and not mark the
	  final frame unreliable.

v9, v10:
	- v9 had a threading problem. So, I resent it as v10.

	From me:

	- Removed the word "RFC" from the subject line as I believe this
	  is mature enough to be a regular patch.

	From Mark Brown, Mark Rutland:

	- Split the patches into smaller, self-contained ones.

	- Always enable STACKTRACE so that arch_stack_walk() is always
	  defined.

	From Mark Rutland:

	- Update callchain_trace() to take the return value of
	  perf_callchain_store() into account.

	- Restore get_wchan() behavior to the original code.

	- Simplify an if statement in dump_backtrace().

	From Mark Brown:

	- Do not abort the stack trace on the first unreliable frame.

	
v8:
	- Synced to v5.14-rc5.

	From Mark Rutland:

	- Make the unwinder loop similar to other architectures.

	- Keep details to within the unwinder functions and return a simple
	  boolean to the caller.

	- Convert some of the current code that contains unwinder logic to
	  simply use arch_stack_walk(). I have converted all of them.

	- Do not copy sym_code_functions[]. Just place it in rodata for now.

	- Have the main loop check for termination conditions rather than
	  having unwind_frame() check for them. In other words, let
	  unwind_frame() assume that the fp is valid.

	- Replace the big comment for SYM_CODE functions with a shorter
	  comment.

		/*
		 * As SYM_CODE functions don't follow the usual calling
		 * conventions, we assume by default that any SYM_CODE function
		 * cannot be unwound reliably.
		 *
		 * Note that this includes:
		 *
		 * - Exception handlers and entry assembly
		 * - Trampoline assembly (e.g., ftrace, kprobes)
		 * - Hypervisor-related assembly
		 * - Hibernation-related assembly
		 * - CPU start-stop, suspend-resume assembly
		 * - Kernel relocation assembly
		 */

v7:
	The Mailer screwed up the threading on this. So, I have resent this
	same series as version 8 with proper threading to avoid confusion.
v6:
	From Mark Rutland:

	- The per-frame reliability concept and flag are acceptable. But more
	  work is needed to make the per-frame checks more accurate and more
	  complete. E.g., some code reorg is being worked on that will help.

	  I have now removed the frame->reliable flag and deleted the whole
	  concept of per-frame status. This is orthogonal to this patch series.
	  Instead, I have improved the unwinder to return proper return codes
	  so a caller can take appropriate action without needing per-frame
	  status.

	- Remove the mention of PLTs and update the comment.

	  I have replaced the comment above the call to __kernel_text_address()
	  with the comment suggested by Mark Rutland.

	Other comments:

	- Other comments on the per-frame stuff are not relevant because
	  that approach is not there anymore.

v5:
	From Keiya Nobuta:
	
	- The term blacklist(ed) is not to be used anymore. I have changed it
	  to unreliable. So, the function unwinder_blacklisted() has been
	  changed to unwinder_is_unreliable().

	From Mark Brown:

	- Add a comment for the "reliable" flag in struct stackframe. The
	  reliability attribute is not complete until all the checks are
	  in place. Added a comment above struct stackframe.

	- Include some of the comments in the cover letter in the actual
	  code so that we can compare it with the reliable stack trace
	  requirements document for completeness. I have added a comment:

	  	- above unwinder_is_unreliable() that lists the requirements
		  that are addressed by the function.

		- above the __kernel_text_address() call about all the cases
		  the call covers.

v4:
	From Mark Brown:

	- I was checking the return PC with __kernel_text_address() before
	  the Function Graph trace handling. Mark Brown felt that all the
	  reliability checks should be performed on the original return PC
	  once that is obtained. So, I have moved all the reliability checks
	  to after the Function Graph Trace handling code in the unwinder.
	  Basically, the unwinder should perform PC translations first (for
	  the return trampoline for Function Graph Tracing, Kretprobes, etc.).
	  Then, the reliability checks should be applied to the resulting
	  PC.

	- Mark said to improve the naming of the new functions so they don't
	  collide with existing ones. I have used a prefix "unwinder_" for
	  all the new functions.

	From Josh Poimboeuf:

	- In the error scenarios in the unwinder, the reliable flag in the
	  stack frame should be set. Implemented this.

	- Some of the other comments are not relevant to the new code as
	  I have taken a different approach in the new code. That is why
	  I have not made those changes. E.g., Ard wanted me to add the
	  "const" keyword to the global section array. That array does not
	  exist in v4. Similarly, Mark Brown said to use ARRAY_SIZE() for
	  the same array in a for loop.

	Other changes:

	- Add a new definition for SYM_CODE_END() that adds the address
	  range of the function to a special section called
	  "sym_code_functions".

	- Include the new section under initdata in vmlinux.lds.S.

	- Define an early_initcall() to copy the contents of the
	  "sym_code_functions" section to an array by the same name.

	- Define a function unwinder_blacklisted() that compares a return
	  PC against sym_code_sections[]. If there is a match, mark the
	  stack trace unreliable. Call this from unwind_frame().

v3:
	- Implemented a sym_code_ranges[] array to contain section bounds
	  for text sections that contain SYM_CODE_*() functions. The unwinder
	  checks each return PC against the sections. If it falls in any of
	  the sections, the stack trace is marked unreliable.

	- Moved SYM_CODE functions from .text and .init.text into a new
	  text section called ".code.text". Added this section to
	  vmlinux.lds.S and sym_code_ranges[].

	- Fixed the logic in the unwinder that handles Function Graph
	  Tracer return trampoline.

	- Removed all the previous code that handles:
		- ftrace entry code for traced function
		- special_functions[] array that lists individual functions
		- kretprobe_trampoline() special case

v2
	- Removed the terminating entry { 0, 0 } in special_functions[]
	  and replaced it with the idiom { /* sentinel */ }.

	- Changed the ftrace trampoline entry ftrace_graph_call in
	  special_functions[] to ftrace_call + 4 and added explanatory
	  comments.

	- Unnested #ifdefs in special_functions[] for FTRACE.

v1
	- Define a bool field in struct stackframe. This will indicate if
	  a stack trace is reliable.

	- Implement a special_functions[] array that will be populated
	  with special functions in which the stack trace is considered
	  unreliable.
	
	- Using kallsyms_lookup(), get the address ranges for the special
	  functions and record them.

	- Implement an is_reliable_function(pc). This function will check
	  if a given return PC falls in any of the special functions. If
	  it does, the stack trace is unreliable.

	- Implement check_reliability() function that will check if a
	  stack frame is reliable. Call is_reliable_function() from
	  check_reliability().

	- Before a return PC is checked against special_functions[], it
	  must be validated as a proper kernel text address. Call
	  __kernel_text_address() from check_reliability().

	- Finally, call check_reliability() from unwind_frame() for
	  each stack frame.

	- Add EL1 exception handlers to special_functions[].

		el1_sync();
		el1_irq();
		el1_error();
		el1_sync_invalid();
		el1_irq_invalid();
		el1_fiq_invalid();
		el1_error_invalid();

	- The above functions are currently defined as LOCAL symbols.
	  Make them global so that they can be referenced from the
	  unwinder code.

	- Add FTRACE trampolines to special_functions[]:

		ftrace_graph_call()
		ftrace_graph_caller()
		return_to_handler()

	- Add the kretprobe trampoline to special_functions[]:

		kretprobe_trampoline()

Previous versions and discussion
================================

v12: https://lore.kernel.org/linux-arm-kernel/20220103165212.9303-1-madvenka@linux.microsoft.com/T/#m21e86eecb9b8f0831196568f0bf62c3b56f65bf0
v11: https://lore.kernel.org/linux-arm-kernel/20211123193723.12112-1-madvenka@linux.microsoft.com/T/#t
v10: https://lore.kernel.org/linux-arm-kernel/4b3d5552-590c-e6a0-866b-9bc51da7bebf@linux.microsoft.com/T/#t
v9: Mailer screwed up the threading. Sent the same as v10 with proper threading.
v8: https://lore.kernel.org/linux-arm-kernel/20210812190603.25326-1-madvenka@linux.microsoft.com/
v7: Mailer screwed up the threading. Sent the same as v8 with proper threading.
v6: https://lore.kernel.org/linux-arm-kernel/20210630223356.58714-1-madvenka@linux.microsoft.com/
v5: https://lore.kernel.org/linux-arm-kernel/20210526214917.20099-1-madvenka@linux.microsoft.com/
v4: https://lore.kernel.org/linux-arm-kernel/20210516040018.128105-1-madvenka@linux.microsoft.com/
v3: https://lore.kernel.org/linux-arm-kernel/20210503173615.21576-1-madvenka@linux.microsoft.com/
v2: https://lore.kernel.org/linux-arm-kernel/20210405204313.21346-1-madvenka@linux.microsoft.com/
v1: https://lore.kernel.org/linux-arm-kernel/20210330190955.13707-1-madvenka@linux.microsoft.com/

Madhavan T. Venkataraman (11):
  arm64: Remove NULL task check from unwind_frame()
  arm64: Rename unwinder functions
  arm64: Rename stackframe to unwind_state
  arm64: Split unwind_init()
  arm64: Copy the task argument to unwind_state
  arm64: Use stack_trace_consume_fn and rename args to unwind()
  arm64: Make the unwind loop in unwind() similar to other architectures
  arm64: Introduce stack trace reliability checks in the unwinder
  arm64: Create a list of SYM_CODE functions, check return PC against
    list
  arm64: Introduce arch_stack_walk_reliable()
  arm64: Select HAVE_RELIABLE_STACKTRACE

 arch/arm64/Kconfig                  |   1 +
 arch/arm64/include/asm/linkage.h    |  12 ++
 arch/arm64/include/asm/sections.h   |   1 +
 arch/arm64/include/asm/stacktrace.h |  14 +-
 arch/arm64/kernel/stacktrace.c      | 287 +++++++++++++++++++++-------
 arch/arm64/kernel/vmlinux.lds.S     |  10 +
 6 files changed, 258 insertions(+), 67 deletions(-)


base-commit: a33f5c380c4bd3fa5278d690421b72052456d9fe
-- 
2.25.1



* [PATCH v13 01/11] arm64: Remove NULL task check from unwind_frame()
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
@ 2022-01-17 14:55   ` madvenka
  2022-01-17 14:55   ` [PATCH v13 02/11] arm64: Rename unwinder functions madvenka
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:55 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Currently, there is a check for a NULL task in unwind_frame(). It is not
needed since all current consumers pass a non-NULL task.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/kernel/stacktrace.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 0fb58fed54cb..5f5bb35b7b41 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -69,9 +69,6 @@ static int notrace unwind_frame(struct task_struct *tsk,
 	unsigned long fp = frame->fp;
 	struct stack_info info;
 
-	if (!tsk)
-		tsk = current;
-
 	/* Final frame; nothing to unwind */
 	if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
 		return -ENOENT;
-- 
2.25.1



* [PATCH v13 02/11] arm64: Rename unwinder functions
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
  2022-01-17 14:55   ` [PATCH v13 01/11] arm64: Remove NULL task check from unwind_frame() madvenka
@ 2022-01-17 14:55   ` madvenka
  2022-01-17 14:56   ` [PATCH v13 03/11] arm64: Rename stackframe to unwind_state madvenka
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:55 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Rename unwinder functions for consistency and better naming.

	- Rename start_backtrace() to unwind_init().
	- Rename unwind_frame() to unwind_next().
	- Rename walk_stackframe() to unwind().

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/kernel/stacktrace.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 5f5bb35b7b41..b980d96dccfc 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -33,8 +33,8 @@
  */
 
 
-static void start_backtrace(struct stackframe *frame, unsigned long fp,
-			    unsigned long pc)
+static void unwind_init(struct stackframe *frame, unsigned long fp,
+			unsigned long pc)
 {
 	frame->fp = fp;
 	frame->pc = pc;
@@ -45,7 +45,7 @@ static void start_backtrace(struct stackframe *frame, unsigned long fp,
 	/*
 	 * Prime the first unwind.
 	 *
-	 * In unwind_frame() we'll check that the FP points to a valid stack,
+	 * In unwind_next() we'll check that the FP points to a valid stack,
 	 * which can't be STACK_TYPE_UNKNOWN, and the first unwind will be
 	 * treated as a transition to whichever stack that happens to be. The
 	 * prev_fp value won't be used, but we set it to 0 such that it is
@@ -63,8 +63,8 @@ static void start_backtrace(struct stackframe *frame, unsigned long fp,
  * records (e.g. a cycle), determined based on the location and fp value of A
  * and the location (but not the fp value) of B.
  */
-static int notrace unwind_frame(struct task_struct *tsk,
-				struct stackframe *frame)
+static int notrace unwind_next(struct task_struct *tsk,
+			       struct stackframe *frame)
 {
 	unsigned long fp = frame->fp;
 	struct stack_info info;
@@ -104,7 +104,7 @@ static int notrace unwind_frame(struct task_struct *tsk,
 
 	/*
 	 * Record this frame record's values and location. The prev_fp and
-	 * prev_type are only meaningful to the next unwind_frame() invocation.
+	 * prev_type are only meaningful to the next unwind_next() invocation.
 	 */
 	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
 	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
@@ -137,23 +137,23 @@ static int notrace unwind_frame(struct task_struct *tsk,
 
 	return 0;
 }
-NOKPROBE_SYMBOL(unwind_frame);
+NOKPROBE_SYMBOL(unwind_next);
 
-static void notrace walk_stackframe(struct task_struct *tsk,
-				    struct stackframe *frame,
-				    bool (*fn)(void *, unsigned long), void *data)
+static void notrace unwind(struct task_struct *tsk,
+			   struct stackframe *frame,
+			   bool (*fn)(void *, unsigned long), void *data)
 {
 	while (1) {
 		int ret;
 
 		if (!fn(data, frame->pc))
 			break;
-		ret = unwind_frame(tsk, frame);
+		ret = unwind_next(tsk, frame);
 		if (ret < 0)
 			break;
 	}
 }
-NOKPROBE_SYMBOL(walk_stackframe);
+NOKPROBE_SYMBOL(unwind);
 
 static bool dump_backtrace_entry(void *arg, unsigned long where)
 {
@@ -195,14 +195,14 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
 	struct stackframe frame;
 
 	if (regs)
-		start_backtrace(&frame, regs->regs[29], regs->pc);
+		unwind_init(&frame, regs->regs[29], regs->pc);
 	else if (task == current)
-		start_backtrace(&frame,
+		unwind_init(&frame,
 				(unsigned long)__builtin_frame_address(1),
 				(unsigned long)__builtin_return_address(0));
 	else
-		start_backtrace(&frame, thread_saved_fp(task),
+		unwind_init(&frame, thread_saved_fp(task),
 				thread_saved_pc(task));
 
-	walk_stackframe(task, &frame, consume_entry, cookie);
+	unwind(task, &frame, consume_entry, cookie);
 }
-- 
2.25.1



* [PATCH v13 03/11] arm64: Rename stackframe to unwind_state
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
  2022-01-17 14:55   ` [PATCH v13 01/11] arm64: Remove NULL task check from unwind_frame() madvenka
  2022-01-17 14:55   ` [PATCH v13 02/11] arm64: Rename unwinder functions madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-01-17 14:56   ` [PATCH v13 04/11] arm64: Split unwind_init() madvenka
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Rename "struct stackframe" to "struct unwind_state" for consistency and
better naming. Accordingly, rename variable/argument "frame" to "state".

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/include/asm/stacktrace.h |  2 +-
 arch/arm64/kernel/stacktrace.c      | 66 ++++++++++++++---------------
 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index e77cdef9ca29..41ec360515f6 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -52,7 +52,7 @@ struct stack_info {
  *               associated with the most recently encountered replacement lr
  *               value.
  */
-struct stackframe {
+struct unwind_state {
 	unsigned long fp;
 	unsigned long pc;
 	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index b980d96dccfc..a1a7ff93b84f 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -33,13 +33,13 @@
  */
 
 
-static void unwind_init(struct stackframe *frame, unsigned long fp,
+static void unwind_init(struct unwind_state *state, unsigned long fp,
 			unsigned long pc)
 {
-	frame->fp = fp;
-	frame->pc = pc;
+	state->fp = fp;
+	state->pc = pc;
 #ifdef CONFIG_KRETPROBES
-	frame->kr_cur = NULL;
+	state->kr_cur = NULL;
 #endif
 
 	/*
@@ -51,9 +51,9 @@ static void unwind_init(struct stackframe *frame, unsigned long fp,
 	 * prev_fp value won't be used, but we set it to 0 such that it is
 	 * definitely not an accessible stack address.
 	 */
-	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
-	frame->prev_fp = 0;
-	frame->prev_type = STACK_TYPE_UNKNOWN;
+	bitmap_zero(state->stacks_done, __NR_STACK_TYPES);
+	state->prev_fp = 0;
+	state->prev_type = STACK_TYPE_UNKNOWN;
 }
 
 /*
@@ -64,9 +64,9 @@ static void unwind_init(struct stackframe *frame, unsigned long fp,
  * and the location (but not the fp value) of B.
  */
 static int notrace unwind_next(struct task_struct *tsk,
-			       struct stackframe *frame)
+			       struct unwind_state *state)
 {
-	unsigned long fp = frame->fp;
+	unsigned long fp = state->fp;
 	struct stack_info info;
 
 	/* Final frame; nothing to unwind */
@@ -79,7 +79,7 @@ static int notrace unwind_next(struct task_struct *tsk,
 	if (!on_accessible_stack(tsk, fp, 16, &info))
 		return -EINVAL;
 
-	if (test_bit(info.type, frame->stacks_done))
+	if (test_bit(info.type, state->stacks_done))
 		return -EINVAL;
 
 	/*
@@ -95,27 +95,27 @@ static int notrace unwind_next(struct task_struct *tsk,
 	 * stack to another, it's never valid to unwind back to that first
 	 * stack.
 	 */
-	if (info.type == frame->prev_type) {
-		if (fp <= frame->prev_fp)
+	if (info.type == state->prev_type) {
+		if (fp <= state->prev_fp)
 			return -EINVAL;
 	} else {
-		set_bit(frame->prev_type, frame->stacks_done);
+		set_bit(state->prev_type, state->stacks_done);
 	}
 
 	/*
 	 * Record this frame record's values and location. The prev_fp and
 	 * prev_type are only meaningful to the next unwind_next() invocation.
 	 */
-	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
-	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
-	frame->prev_fp = fp;
-	frame->prev_type = info.type;
+	state->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
+	state->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
+	state->prev_fp = fp;
+	state->prev_type = info.type;
 
-	frame->pc = ptrauth_strip_insn_pac(frame->pc);
+	state->pc = ptrauth_strip_insn_pac(state->pc);
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	if (tsk->ret_stack &&
-		(frame->pc == (unsigned long)return_to_handler)) {
+		(state->pc == (unsigned long)return_to_handler)) {
 		unsigned long orig_pc;
 		/*
 		 * This is a case where function graph tracer has
@@ -123,16 +123,16 @@ static int notrace unwind_next(struct task_struct *tsk,
 		 * to hook a function return.
 		 * So replace it to an original value.
 		 */
-		orig_pc = ftrace_graph_ret_addr(tsk, NULL, frame->pc,
-						(void *)frame->fp);
-		if (WARN_ON_ONCE(frame->pc == orig_pc))
+		orig_pc = ftrace_graph_ret_addr(tsk, NULL, state->pc,
+						(void *)state->fp);
+		if (WARN_ON_ONCE(state->pc == orig_pc))
 			return -EINVAL;
-		frame->pc = orig_pc;
+		state->pc = orig_pc;
 	}
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
 #ifdef CONFIG_KRETPROBES
-	if (is_kretprobe_trampoline(frame->pc))
-		frame->pc = kretprobe_find_ret_addr(tsk, (void *)frame->fp, &frame->kr_cur);
+	if (is_kretprobe_trampoline(state->pc))
+		state->pc = kretprobe_find_ret_addr(tsk, (void *)state->fp, &state->kr_cur);
 #endif
 
 	return 0;
@@ -140,15 +140,15 @@ static int notrace unwind_next(struct task_struct *tsk,
 NOKPROBE_SYMBOL(unwind_next);
 
 static void notrace unwind(struct task_struct *tsk,
-			   struct stackframe *frame,
+			   struct unwind_state *state,
 			   bool (*fn)(void *, unsigned long), void *data)
 {
 	while (1) {
 		int ret;
 
-		if (!fn(data, frame->pc))
+		if (!fn(data, state->pc))
 			break;
-		ret = unwind_next(tsk, frame);
+		ret = unwind_next(tsk, state);
 		if (ret < 0)
 			break;
 	}
@@ -192,17 +192,17 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
 			      void *cookie, struct task_struct *task,
 			      struct pt_regs *regs)
 {
-	struct stackframe frame;
+	struct unwind_state state;
 
 	if (regs)
-		unwind_init(&frame, regs->regs[29], regs->pc);
+		unwind_init(&state, regs->regs[29], regs->pc);
 	else if (task == current)
-		unwind_init(&frame,
+		unwind_init(&state,
 				(unsigned long)__builtin_frame_address(1),
 				(unsigned long)__builtin_return_address(0));
 	else
-		unwind_init(&frame, thread_saved_fp(task),
+		unwind_init(&state, thread_saved_fp(task),
 				thread_saved_pc(task));
 
-	unwind(task, &frame, consume_entry, cookie);
+	unwind(task, &state, consume_entry, cookie);
 }
-- 
2.25.1



* [PATCH v13 04/11] arm64: Split unwind_init()
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
                     ` (2 preceding siblings ...)
  2022-01-17 14:56   ` [PATCH v13 03/11] arm64: Rename stackframe to unwind_state madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-02-02 18:44     ` Mark Brown
  2022-02-15 13:07     ` Mark Rutland
  2022-01-17 14:56   ` [PATCH v13 05/11] arm64: Copy the task argument to unwind_state madvenka
                     ` (6 subsequent siblings)
  10 siblings, 2 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

unwind_init() is currently a single function that initializes all of the
unwind state. Split it into the following functions and call them
appropriately:

	- unwind_init_from_regs() - initialize from regs passed by caller.

	- unwind_init_from_current() - initialize for the current task
	  from the caller of arch_stack_walk().

	- unwind_init_from_task() - initialize from the saved state of a
	  task other than the current task. In this case, the other
	  task must not be running.

This is done for two reasons:

	- the different ways of initializing are clear

	- specialized code can be added to each initializer in the future.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/kernel/stacktrace.c | 54 +++++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index a1a7ff93b84f..b2b568e5deba 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -33,11 +33,8 @@
  */
 
 
-static void unwind_init(struct unwind_state *state, unsigned long fp,
-			unsigned long pc)
+static void unwind_init_common(struct unwind_state *state)
 {
-	state->fp = fp;
-	state->pc = pc;
 #ifdef CONFIG_KRETPROBES
 	state->kr_cur = NULL;
 #endif
@@ -56,6 +53,46 @@ static void unwind_init(struct unwind_state *state, unsigned long fp,
 	state->prev_type = STACK_TYPE_UNKNOWN;
 }
 
+/*
+ * TODO: document requirements here.
+ */
+static inline void unwind_init_from_regs(struct unwind_state *state,
+					 struct pt_regs *regs)
+{
+	unwind_init_common(state);
+
+	state->fp = regs->regs[29];
+	state->pc = regs->pc;
+}
+
+/*
+ * TODO: document requirements here.
+ *
+ * Note: this is always inlined, and we expect our caller to be a noinline
+ * function, such that this starts from our caller's caller.
+ */
+static __always_inline void unwind_init_from_current(struct unwind_state *state)
+{
+	unwind_init_common(state);
+
+	state->fp = (unsigned long)__builtin_frame_address(1);
+	state->pc = (unsigned long)__builtin_return_address(0);
+}
+
+/*
+ * TODO: document requirements here.
+ *
+ * The caller guarantees that the task is not running.
+ */
+static inline void unwind_init_from_task(struct unwind_state *state,
+					 struct task_struct *task)
+{
+	unwind_init_common(state);
+
+	state->fp = thread_saved_fp(task);
+	state->pc = thread_saved_pc(task);
+}
+
 /*
  * Unwind from one frame record (A) to the next frame record (B).
  *
@@ -195,14 +232,11 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
 	struct unwind_state state;
 
 	if (regs)
-		unwind_init(&state, regs->regs[29], regs->pc);
+		unwind_init_from_regs(&state, regs);
 	else if (task == current)
-		unwind_init(&state,
-				(unsigned long)__builtin_frame_address(1),
-				(unsigned long)__builtin_return_address(0));
+		unwind_init_from_current(&state);
 	else
-		unwind_init(&state, thread_saved_fp(task),
-				thread_saved_pc(task));
+		unwind_init_from_task(&state, task);
 
 	unwind(task, &state, consume_entry, cookie);
 }
-- 
2.25.1



* [PATCH v13 05/11] arm64: Copy the task argument to unwind_state
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
                     ` (3 preceding siblings ...)
  2022-01-17 14:56   ` [PATCH v13 04/11] arm64: Split unwind_init() madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-02-02 18:45     ` Mark Brown
  2022-02-15 13:22     ` Mark Rutland
  2022-01-17 14:56   ` [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind() madvenka
                     ` (5 subsequent siblings)
  10 siblings, 2 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Copy the task argument passed to arch_stack_walk() to unwind_state so that
it can be passed to unwind functions via unwind_state rather than as a
separate argument. The task is a fundamental part of the unwind state.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/include/asm/stacktrace.h |  3 +++
 arch/arm64/kernel/stacktrace.c      | 29 ++++++++++++++++-------------
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 41ec360515f6..af423f5d7ad8 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -51,6 +51,8 @@ struct stack_info {
  * @kr_cur:      When KRETPROBES is selected, holds the kretprobe instance
  *               associated with the most recently encountered replacement lr
  *               value.
+ *
+ * @task:        Pointer to the task structure.
  */
 struct unwind_state {
 	unsigned long fp;
@@ -61,6 +63,7 @@ struct unwind_state {
 #ifdef CONFIG_KRETPROBES
 	struct llist_node *kr_cur;
 #endif
+	struct task_struct *task;
 };
 
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index b2b568e5deba..1b32e55735aa 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -33,8 +33,10 @@
  */
 
 
-static void unwind_init_common(struct unwind_state *state)
+static void unwind_init_common(struct unwind_state *state,
+			       struct task_struct *task)
 {
+	state->task = task;
 #ifdef CONFIG_KRETPROBES
 	state->kr_cur = NULL;
 #endif
@@ -57,9 +59,10 @@ static void unwind_init_common(struct unwind_state *state)
  * TODO: document requirements here.
  */
 static inline void unwind_init_from_regs(struct unwind_state *state,
+					 struct task_struct *task,
 					 struct pt_regs *regs)
 {
-	unwind_init_common(state);
+	unwind_init_common(state, task);
 
 	state->fp = regs->regs[29];
 	state->pc = regs->pc;
@@ -71,9 +74,10 @@ static inline void unwind_init_from_regs(struct unwind_state *state,
  * Note: this is always inlined, and we expect our caller to be a noinline
  * function, such that this starts from our caller's caller.
  */
-static __always_inline void unwind_init_from_current(struct unwind_state *state)
+static __always_inline void unwind_init_from_current(struct unwind_state *state,
+						     struct task_struct *task)
 {
-	unwind_init_common(state);
+	unwind_init_common(state, task);
 
 	state->fp = (unsigned long)__builtin_frame_address(1);
 	state->pc = (unsigned long)__builtin_return_address(0);
@@ -87,7 +91,7 @@ static __always_inline void unwind_init_from_current(struct unwind_state *state)
 static inline void unwind_init_from_task(struct unwind_state *state,
 					 struct task_struct *task)
 {
-	unwind_init_common(state);
+	unwind_init_common(state, task);
 
 	state->fp = thread_saved_fp(task);
 	state->pc = thread_saved_pc(task);
@@ -100,11 +104,11 @@ static inline void unwind_init_from_task(struct unwind_state *state,
  * records (e.g. a cycle), determined based on the location and fp value of A
  * and the location (but not the fp value) of B.
  */
-static int notrace unwind_next(struct task_struct *tsk,
-			       struct unwind_state *state)
+static int notrace unwind_next(struct unwind_state *state)
 {
 	unsigned long fp = state->fp;
 	struct stack_info info;
+	struct task_struct *tsk = state->task;
 
 	/* Final frame; nothing to unwind */
 	if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
@@ -176,8 +180,7 @@ static int notrace unwind_next(struct task_struct *tsk,
 }
 NOKPROBE_SYMBOL(unwind_next);
 
-static void notrace unwind(struct task_struct *tsk,
-			   struct unwind_state *state,
+static void notrace unwind(struct unwind_state *state,
 			   bool (*fn)(void *, unsigned long), void *data)
 {
 	while (1) {
@@ -185,7 +188,7 @@ static void notrace unwind(struct task_struct *tsk,
 
 		if (!fn(data, state->pc))
 			break;
-		ret = unwind_next(tsk, state);
+		ret = unwind_next(state);
 		if (ret < 0)
 			break;
 	}
@@ -232,11 +235,11 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
 	struct unwind_state state;
 
 	if (regs)
-		unwind_init_from_regs(&state, regs);
+		unwind_init_from_regs(&state, task, regs);
 	else if (task == current)
-		unwind_init_from_current(&state);
+		unwind_init_from_current(&state, task);
 	else
 		unwind_init_from_task(&state, task);
 
-	unwind(task, &state, consume_entry, cookie);
+	unwind(&state, consume_entry, cookie);
 }
-- 
2.25.1



* [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
                     ` (4 preceding siblings ...)
  2022-01-17 14:56   ` [PATCH v13 05/11] arm64: Copy the task argument to unwind_state madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-02-02 18:46     ` Mark Brown
  2022-02-15 13:39     ` Mark Rutland
  2022-01-17 14:56   ` [PATCH v13 07/11] arm64: Make the unwind loop in unwind() similar to other architectures madvenka
                     ` (4 subsequent siblings)
  10 siblings, 2 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Rename the arguments to unwind() for better consistency. Also, use the
typedef stack_trace_consume_fn for the consume_entry function as it is
already defined in linux/stacktrace.h.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/kernel/stacktrace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 1b32e55735aa..f772dac78b11 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -181,12 +181,12 @@ static int notrace unwind_next(struct unwind_state *state)
 NOKPROBE_SYMBOL(unwind_next);
 
 static void notrace unwind(struct unwind_state *state,
-			   bool (*fn)(void *, unsigned long), void *data)
+			   stack_trace_consume_fn consume_entry, void *cookie)
 {
 	while (1) {
 		int ret;
 
-		if (!fn(data, state->pc))
+		if (!consume_entry(cookie, state->pc))
 			break;
 		ret = unwind_next(state);
 		if (ret < 0)
-- 
2.25.1



* [PATCH v13 07/11] arm64: Make the unwind loop in unwind() similar to other architectures
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
                     ` (5 preceding siblings ...)
  2022-01-17 14:56   ` [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind() madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-01-17 14:56   ` [PATCH v13 08/11] arm64: Introduce stack trace reliability checks in the unwinder madvenka
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Change the loop in unwind()
===========================

Change the unwind loop in unwind() to:

	while (unwind_continue(state, consume_entry, cookie))
		unwind_next(state);

This is easy to understand and maintain.

New function unwind_continue()
==============================

Define a new function unwind_continue() that is used in the unwind loop
to check for conditions that terminate a stack trace.

The conditions checked are:

	- If the bottom of the stack (final frame) has been reached,
	  terminate.

	- If the consume_entry() function returns false, the caller of
	  unwind has asked to terminate the stack trace. So, terminate.

	- If unwind_next() failed for some reason (like stack corruption),
	  terminate.

Do not return an error value from unwind_next()
===============================================

We want to check for terminating conditions only in unwind_continue() from
the unwinder loop. So, do not return an error value from unwind_next().
Simply set a flag in unwind_state and check the flag in unwind_continue().

Final FP
========

Introduce a new field "final_fp" in "struct unwind_state". Initialize this
to the final frame of the stack trace:

	task_pt_regs(task)->stackframe

This is where the stacktrace must terminate if it is successful. Add an
explicit comment to that effect.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/stacktrace.h |  6 +++
 arch/arm64/kernel/stacktrace.c      | 72 ++++++++++++++++++-----------
 2 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index af423f5d7ad8..c11b048ffd0e 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -53,6 +53,10 @@ struct stack_info {
  *               value.
  *
  * @task:        Pointer to the task structure.
+ *
+ * @final_fp	 Pointer to the final frame.
+ *
+ * @failed:      Unwind failed.
  */
 struct unwind_state {
 	unsigned long fp;
@@ -64,6 +68,8 @@ struct unwind_state {
 	struct llist_node *kr_cur;
 #endif
 	struct task_struct *task;
+	unsigned long final_fp;
+	bool failed;
 };
 
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index f772dac78b11..73fc6b5ee6fd 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -53,6 +53,10 @@ static void unwind_init_common(struct unwind_state *state,
 	bitmap_zero(state->stacks_done, __NR_STACK_TYPES);
 	state->prev_fp = 0;
 	state->prev_type = STACK_TYPE_UNKNOWN;
+	state->failed = false;
+
+	/* Stack trace terminates here. */
+	state->final_fp = (unsigned long)task_pt_regs(task)->stackframe;
 }
 
 /*
@@ -97,6 +101,25 @@ static inline void unwind_init_from_task(struct unwind_state *state,
 	state->pc = thread_saved_pc(task);
 }
 
+static bool notrace unwind_continue(struct unwind_state *state,
+				    stack_trace_consume_fn consume_entry,
+				    void *cookie)
+{
+	if (state->failed) {
+		/* PC is suspect. Cannot consume it. */
+		return false;
+	}
+
+	if (!consume_entry(cookie, state->pc)) {
+		/* Caller terminated the unwind. */
+		state->failed = true;
+		return false;
+	}
+
+	return state->fp != state->final_fp;
+}
+NOKPROBE_SYMBOL(unwind_continue);
+
 /*
  * Unwind from one frame record (A) to the next frame record (B).
  *
@@ -104,24 +127,26 @@ static inline void unwind_init_from_task(struct unwind_state *state,
  * records (e.g. a cycle), determined based on the location and fp value of A
  * and the location (but not the fp value) of B.
  */
-static int notrace unwind_next(struct unwind_state *state)
+static void notrace unwind_next(struct unwind_state *state)
 {
 	unsigned long fp = state->fp;
 	struct stack_info info;
 	struct task_struct *tsk = state->task;
 
-	/* Final frame; nothing to unwind */
-	if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
-		return -ENOENT;
-
-	if (fp & 0x7)
-		return -EINVAL;
+	if (fp & 0x7) {
+		state->failed = true;
+		return;
+	}
 
-	if (!on_accessible_stack(tsk, fp, 16, &info))
-		return -EINVAL;
+	if (!on_accessible_stack(tsk, fp, 16, &info)) {
+		state->failed = true;
+		return;
+	}
 
-	if (test_bit(info.type, state->stacks_done))
-		return -EINVAL;
+	if (test_bit(info.type, state->stacks_done)) {
+		state->failed = true;
+		return;
+	}
 
 	/*
 	 * As stacks grow downward, any valid record on the same stack must be
@@ -137,8 +162,10 @@ static int notrace unwind_next(struct unwind_state *state)
 	 * stack.
 	 */
 	if (info.type == state->prev_type) {
-		if (fp <= state->prev_fp)
-			return -EINVAL;
+		if (fp <= state->prev_fp) {
+			state->failed = true;
+			return;
+		}
 	} else {
 		set_bit(state->prev_type, state->stacks_done);
 	}
@@ -166,8 +193,10 @@ static int notrace unwind_next(struct unwind_state *state)
 		 */
 		orig_pc = ftrace_graph_ret_addr(tsk, NULL, state->pc,
 						(void *)state->fp);
-		if (WARN_ON_ONCE(state->pc == orig_pc))
-			return -EINVAL;
+		if (WARN_ON_ONCE(state->pc == orig_pc)) {
+			state->failed = true;
+			return;
+		}
 		state->pc = orig_pc;
 	}
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
@@ -175,23 +204,14 @@ static int notrace unwind_next(struct unwind_state *state)
 	if (is_kretprobe_trampoline(state->pc))
 		state->pc = kretprobe_find_ret_addr(tsk, (void *)state->fp, &state->kr_cur);
 #endif
-
-	return 0;
 }
 NOKPROBE_SYMBOL(unwind_next);
 
 static void notrace unwind(struct unwind_state *state,
 			   stack_trace_consume_fn consume_entry, void *cookie)
 {
-	while (1) {
-		int ret;
-
-		if (!consume_entry(cookie, state->pc))
-			break;
-		ret = unwind_next(state);
-		if (ret < 0)
-			break;
-	}
+	while (unwind_continue(state, consume_entry, cookie))
+		unwind_next(state);
 }
 NOKPROBE_SYMBOL(unwind);
 
-- 
2.25.1



* [PATCH v13 08/11] arm64: Introduce stack trace reliability checks in the unwinder
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
                     ` (6 preceding siblings ...)
  2022-01-17 14:56   ` [PATCH v13 07/11] arm64: Make the unwind loop in unwind() similar to other architectures madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-01-17 14:56   ` [PATCH v13 09/11] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

There are some kernel features and conditions that make a stack trace
unreliable. Callers may require the unwinder to detect these cases.
E.g., livepatch.

Introduce a new function called unwind_check_reliability() that will
detect these cases and set a flag in the stack frame. Call
unwind_check_reliability() for every frame in unwind().

Introduce the first reliability check in unwind_check_reliability() - If
a return PC is not a valid kernel text address, consider the stack
trace unreliable. It could be some generated code. Other reliability checks
will be added in the future.

Let unwind() return a boolean to indicate if the stack trace is
reliable.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/stacktrace.h |  3 +++
 arch/arm64/kernel/stacktrace.c      | 28 ++++++++++++++++++++++++++--
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index c11b048ffd0e..26eba9d7e5c7 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -57,6 +57,8 @@ struct stack_info {
  * @final_fp	 Pointer to the final frame.
  *
  * @failed:      Unwind failed.
+ *
+ * @reliable:    Stack trace is reliable.
  */
 struct unwind_state {
 	unsigned long fp;
@@ -70,6 +72,7 @@ struct unwind_state {
 	struct task_struct *task;
 	unsigned long final_fp;
 	bool failed;
+	bool reliable;
 };
 
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 73fc6b5ee6fd..3dc0374e83f7 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -18,6 +18,25 @@
 #include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 
+/*
+ * Check the stack frame for conditions that make further unwinding unreliable.
+ */
+static void unwind_check_reliability(struct unwind_state *state)
+{
+	if (state->fp == state->final_fp) {
+		/* Final frame; no more unwind, no need to check reliability */
+		return;
+	}
+
+	/*
+	 * If the PC is not a known kernel text address, then we cannot
+	 * be sure that a subsequent unwind will be reliable, as we
+	 * don't know that the code follows our unwind requirements.
+	 */
+	if (!__kernel_text_address(state->pc))
+		state->reliable = false;
+}
+
 /*
  * AArch64 PCS assigns the frame pointer to x29.
  *
@@ -54,6 +73,7 @@ static void unwind_init_common(struct unwind_state *state,
 	state->prev_fp = 0;
 	state->prev_type = STACK_TYPE_UNKNOWN;
 	state->failed = false;
+	state->reliable = true;
 
 	/* Stack trace terminates here. */
 	state->final_fp = (unsigned long)task_pt_regs(task)->stackframe;
@@ -207,11 +227,15 @@ static void notrace unwind_next(struct unwind_state *state)
 }
 NOKPROBE_SYMBOL(unwind_next);
 
-static void notrace unwind(struct unwind_state *state,
+static bool notrace unwind(struct unwind_state *state,
 			   stack_trace_consume_fn consume_entry, void *cookie)
 {
-	while (unwind_continue(state, consume_entry, cookie))
+	unwind_check_reliability(state);
+	while (unwind_continue(state, consume_entry, cookie)) {
 		unwind_next(state);
+		unwind_check_reliability(state);
+	}
+	return !state->failed && state->reliable;
 }
 NOKPROBE_SYMBOL(unwind);
 
-- 
2.25.1



* [PATCH v13 09/11] arm64: Create a list of SYM_CODE functions, check return PC against list
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
                     ` (7 preceding siblings ...)
  2022-01-17 14:56   ` [PATCH v13 08/11] arm64: Introduce stack trace reliability checks in the unwinder madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-01-17 14:56   ` [PATCH v13 10/11] arm64: Introduce arch_stack_walk_reliable() madvenka
  2022-01-17 14:56   ` [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE madvenka
  10 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

SYM_CODE functions don't follow the usual calling conventions. Check if the
return PC in a stack frame falls in any of these. If it does, consider the
stack trace unreliable.

Define a special section for unreliable functions
=================================================

Define a SYM_CODE_END() macro for arm64 that adds the function address
range to a new section called "sym_code_functions".

Linker file
===========

Include the "sym_code_functions" section under read-only data in
vmlinux.lds.S.

Initialization
==============

Define an early_initcall() to create a sym_code_functions[] array from
the linker data.

Unwinder check
==============

Add a reliability check in unwind_check_reliability() that compares a
return PC with sym_code_functions[]. If there is a match, then return
failure.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/linkage.h  | 12 +++++++
 arch/arm64/include/asm/sections.h |  1 +
 arch/arm64/kernel/stacktrace.c    | 55 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S   | 10 ++++++
 4 files changed, 78 insertions(+)

diff --git a/arch/arm64/include/asm/linkage.h b/arch/arm64/include/asm/linkage.h
index b77e9b3f5371..a47e7914b289 100644
--- a/arch/arm64/include/asm/linkage.h
+++ b/arch/arm64/include/asm/linkage.h
@@ -63,4 +63,16 @@
 		SYM_FUNC_END_ALIAS(x);		\
 		SYM_FUNC_END_ALIAS(__pi_##x)
 
+/*
+ * Record the address range of each SYM_CODE function in a struct code_range
+ * in a special section.
+ */
+#define SYM_CODE_END(name)				\
+	SYM_END(name, SYM_T_NONE)			;\
+	99:						;\
+	.pushsection "sym_code_functions", "aw"		;\
+	.quad	name					;\
+	.quad	99b					;\
+	.popsection
+
 #endif
diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h
index 152cb35bf9df..ac01189668c5 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -22,5 +22,6 @@ extern char __irqentry_text_start[], __irqentry_text_end[];
 extern char __mmuoff_data_start[], __mmuoff_data_end[];
 extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
 extern char __relocate_new_kernel_start[], __relocate_new_kernel_end[];
+extern char __sym_code_functions_start[], __sym_code_functions_end[];
 
 #endif /* __ASM_SECTIONS_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 3dc0374e83f7..8bfe31cbee46 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -18,11 +18,40 @@
 #include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 
+struct code_range {
+	unsigned long	start;
+	unsigned long	end;
+};
+
+static struct code_range	*sym_code_functions;
+static int			num_sym_code_functions;
+
+int __init init_sym_code_functions(void)
+{
+	size_t size = (unsigned long)__sym_code_functions_end -
+		      (unsigned long)__sym_code_functions_start;
+
+	sym_code_functions = (struct code_range *)__sym_code_functions_start;
+	/*
+	 * Order it so that num_sym_code_functions is not visible before
+	 * sym_code_functions.
+	 */
+	smp_mb();
+	num_sym_code_functions = size / sizeof(struct code_range);
+
+	return 0;
+}
+early_initcall(init_sym_code_functions);
+
 /*
  * Check the stack frame for conditions that make further unwinding unreliable.
  */
 static void unwind_check_reliability(struct unwind_state *state)
 {
+	const struct code_range *range;
+	unsigned long pc;
+	int i;
+
 	if (state->fp == state->final_fp) {
 		/* Final frame; no more unwind, no need to check reliability */
 		return;
@@ -35,6 +64,32 @@ static void unwind_check_reliability(struct unwind_state *state)
 	 */
 	if (!__kernel_text_address(state->pc))
 		state->reliable = false;
+
+	/*
+	 * Check the return PC against sym_code_functions[]. If there is a
+	 * match, then consider the stack frame unreliable.
+	 *
+	 * As SYM_CODE functions don't follow the usual calling conventions,
+	 * we assume by default that any SYM_CODE function cannot be unwound
+	 * reliably.
+	 *
+	 * Note that this includes:
+	 *
+	 * - Exception handlers and entry assembly
+	 * - Trampoline assembly (e.g., ftrace, kprobes)
+	 * - Hypervisor-related assembly
+	 * - Hibernation-related assembly
+	 * - CPU start-stop, suspend-resume assembly
+	 * - Kernel relocation assembly
+	 */
+	pc = state->pc;
+	for (i = 0; i < num_sym_code_functions; i++) {
+		range = &sym_code_functions[i];
+		if (pc >= range->start && pc < range->end) {
+			state->reliable = false;
+			return;
+		}
+	}
 }
 
 /*
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 50bab186c49b..6381e75e566e 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -122,6 +122,14 @@ jiffies = jiffies_64;
 #define TRAMP_TEXT
 #endif
 
+#define SYM_CODE_FUNCTIONS				\
+	. = ALIGN(16);					\
+	.symcode : AT(ADDR(.symcode) - LOAD_OFFSET) {	\
+		__sym_code_functions_start = .;		\
+		KEEP(*(sym_code_functions))		\
+		__sym_code_functions_end = .;		\
+	}
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from _stext to _edata, must be a round multiple of the PE/COFF
@@ -209,6 +217,8 @@ SECTIONS
 	swapper_pg_dir = .;
 	. += PAGE_SIZE;
 
+	SYM_CODE_FUNCTIONS
+
 	. = ALIGN(SEGMENT_ALIGN);
 	__init_begin = .;
 	__inittext_begin = .;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v13 10/11] arm64: Introduce arch_stack_walk_reliable()
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
                     ` (8 preceding siblings ...)
  2022-01-17 14:56   ` [PATCH v13 09/11] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-01-17 14:56   ` [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE madvenka
  10 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Introduce arch_stack_walk_reliable() for ARM64. This works like
arch_stack_walk() except that it returns -EINVAL if the stack trace is not
reliable.

Until all the reliability checks are in place, arch_stack_walk_reliable()
may not be used by livepatch. But it may be used by debug and test code.
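
As a rough usage sketch (the callback name and messages below are only
illustrative, not part of this patch), debug or test code could do:

	static bool print_entry(void *cookie, unsigned long pc)
	{
		pr_info("%pS\n", (void *)pc);
		return true;
	}

	static void check_task_stack(struct task_struct *tsk)
	{
		if (arch_stack_walk_reliable(print_entry, NULL, tsk))
			pr_info("stack trace of pid %d is not reliable\n",
				tsk->pid);
	}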

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/stacktrace.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 8bfe31cbee46..4902fac5745f 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -342,3 +342,25 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
 
 	unwind(&state, consume_entry, cookie);
 }
+
+/*
+ * arch_stack_walk_reliable() may not be used for livepatch until all of
+ * the reliability checks are in place in unwind_check_reliability(). However,
+ * debug and test code can choose to use it even if all the checks are not
+ * in place.
+ */
+noinline int notrace arch_stack_walk_reliable(stack_trace_consume_fn consume_fn,
+					      void *cookie,
+					      struct task_struct *task)
+{
+	struct unwind_state state;
+	bool reliable;
+
+	if (task == current)
+		unwind_init_from_current(&state, task);
+	else
+		unwind_init_from_task(&state, task);
+
+	reliable = unwind(&state, consume_fn, cookie);
+	return reliable ? 0 : -EINVAL;
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
                     ` (9 preceding siblings ...)
  2022-01-17 14:56   ` [PATCH v13 10/11] arm64: Introduce arch_stack_walk_reliable() madvenka
@ 2022-01-17 14:56   ` madvenka
  2022-01-25  5:21     ` nobuta.keiya
  10 siblings, 1 reply; 75+ messages in thread
From: madvenka @ 2022-01-17 14:56 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Select HAVE_RELIABLE_STACKTRACE in arm64/Kconfig to allow
arch_stack_walk_reliable() to be used.

Note that this is conditional upon STACK_VALIDATION which will be added
when frame pointer validation is implemented (say via objtool).
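
For context, here is a rough sketch of the generic-code consumer (nothing
below is changed by this patch; the buffer size is illustrative): once
HAVE_RELIABLE_STACKTRACE is selected, the generic helper
stack_trace_save_tsk_reliable() becomes available, and livepatch uses it
along these lines:

	static bool task_stack_is_reliable(struct task_struct *task)
	{
		unsigned long entries[64];	/* illustrative size */
		int nr_entries;

		nr_entries = stack_trace_save_tsk_reliable(task, entries,
							   ARRAY_SIZE(entries));
		return nr_entries >= 0;
	}

A negative return means the trace could not be produced reliably, so the
livepatch transition defers patching that task.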

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f6e333b59314..bc7b3514b563 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -223,6 +223,7 @@ config ARM64
 	select THREAD_INFO_IN_TASK
 	select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
 	select TRACE_IRQFLAGS_SUPPORT
+	select HAVE_RELIABLE_STACKTRACE if FRAME_POINTER && STACK_VALIDATION
 	help
 	  ARM 64-bit (AArch64) Linux support.
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* RE: [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE
  2022-01-17 14:56   ` [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE madvenka
@ 2022-01-25  5:21     ` nobuta.keiya
  2022-01-25 13:43       ` Madhavan T. Venkataraman
  2022-01-26 17:16       ` Mark Brown
  0 siblings, 2 replies; 75+ messages in thread
From: nobuta.keiya @ 2022-01-25  5:21 UTC (permalink / raw)
  To: 'madvenka@linux.microsoft.com',
	mark.rutland, broonie, jpoimboe, ardb, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

Hi Madhavan,

> Select HAVE_RELIABLE_STACKTRACE in arm64/Kconfig to allow
> arch_stack_walk_reliable() to be used.
> 
> Note that this is conditional upon STACK_VALIDATION which will be added when frame pointer validation is implemented (say
> via objtool).

I know that Julien Thierry published objtool support for arm64 [1], but I'm not
sure if it has been updated. Could you point me to any other threads if you know of them?

[1] https://lore.kernel.org/linux-arm-kernel/20210303170932.1838634-1-jthierry@redhat.com/


Thanks,
Keiya

> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> ---
>  arch/arm64/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f6e333b59314..bc7b3514b563 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -223,6 +223,7 @@ config ARM64
>  	select THREAD_INFO_IN_TASK
>  	select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
>  	select TRACE_IRQFLAGS_SUPPORT
> +	select HAVE_RELIABLE_STACKTRACE if FRAME_POINTER && STACK_VALIDATION
>  	help
>  	  ARM 64-bit (AArch64) Linux support.
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE
  2022-01-25  5:21     ` nobuta.keiya
@ 2022-01-25 13:43       ` Madhavan T. Venkataraman
  2022-01-26 10:20         ` nobuta.keiya
  2022-01-26 17:16       ` Mark Brown
  1 sibling, 1 reply; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-01-25 13:43 UTC (permalink / raw)
  To: nobuta.keiya, mark.rutland, broonie, jpoimboe, ardb,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel

I have not seen any activity on that in a long time. IIRC, Julien quit RedHat.
I don't know if anyone else has taken over this work in RedHat.

Sorry, I don't have any more information.

Madhavan

On 1/24/22 23:21, nobuta.keiya@fujitsu.com wrote:
> Hi Madhavan,
> 
>> Select HAVE_RELIABLE_STACKTRACE in arm64/Kconfig to allow
>> arch_stack_walk_reliable() to be used.
>>
>> Note that this is conditional upon STACK_VALIDATION which will be added when frame pointer validation is implemented (say
>> via objtool).
> 
> I know that Julien Thierry published objtool support for arm64 [1], but I'm not
> sure if it has been updated. Could you point me to any other threads if you know of them?
> 
> [1] https://lore.kernel.org/linux-arm-kernel/20210303170932.1838634-1-jthierry@redhat.com/
> 
> 
> Thanks,
> Keiya
> 
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
>> ---
>>  arch/arm64/Kconfig | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f6e333b59314..bc7b3514b563 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -223,6 +223,7 @@ config ARM64
>>  	select THREAD_INFO_IN_TASK
>>  	select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
>>  	select TRACE_IRQFLAGS_SUPPORT
>> +	select HAVE_RELIABLE_STACKTRACE if FRAME_POINTER && STACK_VALIDATION
>>  	help
>>  	  ARM 64-bit (AArch64) Linux support.
>>
>> --
>> 2.25.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* RE: [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE
  2022-01-25 13:43       ` Madhavan T. Venkataraman
@ 2022-01-26 10:20         ` nobuta.keiya
  2022-01-26 17:14           ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 75+ messages in thread
From: nobuta.keiya @ 2022-01-26 10:20 UTC (permalink / raw)
  To: 'Madhavan T. Venkataraman',
	mark.rutland, broonie, jpoimboe, ardb, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

> I have not seen any activity on that in a long time. IIRC, Julien quit RedHat.
> I don't know if anyone else has taken over this work in RedHat.
> 
> Sorry, I don't have any more information.
> 
> Madhavan

Thanks for your information.

By the way, I'm considering test code for arch_stack_walk_reliable().
Specifically, I applied Suraj's patch to enable livepatch and added a function
that sleeps between SYM_CODE_START and SYM_CODE_END; livepatch then
checks whether the task has an unreliable stack.
For now my internal test code is working correctly, but my Kconfig excludes
the STACK_VALIDATION dependency.
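
Roughly, the sleeping helper looks something like this (a simplified sketch,
not the exact code; the name and sleep time are arbitrary):

	SYM_CODE_START(stacktrace_sleep_test)
		stp	x29, x30, [sp, #-16]!	// keep a frame record so the
		mov	x29, sp			// unwinder can walk through
		mov	w0, #1000		// msecs
		bl	msleep
		ldp	x29, x30, [sp], #16
		ret
	SYM_CODE_END(stacktrace_sleep_test)

While the task is blocked in msleep(), one of the return PCs in its frame
records lies inside this SYM_CODE range, so the livepatch transition should
see its stack as unreliable.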

It seems that objtool will not be enabled yet, so I would like to test this more easily.
If you are already testing with this patch, could you tell me how to do it?


Thanks,
Keiya

> 
> On 1/24/22 23:21, nobuta.keiya@fujitsu.com wrote:
> > Hi Madhavan,
> >
> >> Select HAVE_RELIABLE_STACKTRACE in arm64/Kconfig to allow
> >> arch_stack_walk_reliable() to be used.
> >>
> >> Note that this is conditional upon STACK_VALIDATION which will be
> >> added when frame pointer validation is implemented (say via objtool).
> >
> > I know that Julien Thierry published objtool support for arm64 [1],
> > but I'm not sure if it has been updated. Could you point me to any other threads if you know of them?
> >
> > [1]
> > https://lore.kernel.org/linux-arm-kernel/20210303170932.1838634-1-jthi
> > erry@redhat.com/
> >
> >
> > Thanks,
> > Keiya
> >
> >>
> >> Signed-off-by: Madhavan T. Venkataraman
> >> <madvenka@linux.microsoft.com>
> >> ---
> >>  arch/arm64/Kconfig | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index
> >> f6e333b59314..bc7b3514b563 100644
> >> --- a/arch/arm64/Kconfig
> >> +++ b/arch/arm64/Kconfig
> >> @@ -223,6 +223,7 @@ config ARM64
> >>  	select THREAD_INFO_IN_TASK
> >>  	select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
> >>  	select TRACE_IRQFLAGS_SUPPORT
> >> +	select HAVE_RELIABLE_STACKTRACE if FRAME_POINTER &&
> >> +STACK_VALIDATION
> >>  	help
> >>  	  ARM 64-bit (AArch64) Linux support.
> >>
> >> --
> >> 2.25.1
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE
  2022-01-26 10:20         ` nobuta.keiya
@ 2022-01-26 17:14           ` Madhavan T. Venkataraman
  2022-01-27  1:13             ` nobuta.keiya
  0 siblings, 1 reply; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-01-26 17:14 UTC (permalink / raw)
  To: nobuta.keiya, mark.rutland, broonie, jpoimboe, ardb,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel



On 1/26/22 04:20, nobuta.keiya@fujitsu.com wrote:
>> I have not seen any activity on that in a long time. IIRC, Julien quit RedHat.
>> I don't know if anyone else has taken over this work in RedHat.
>>
>> Sorry, I don't have any more information.
>>
>> Madhavan
> 
> Thanks for your information.
> 
> By the way, I'm considering test code for arch_stack_walk_reliable().
> Specifically, I applied Suraj's patch to enable livepatch and added a function
> that sleeps between SYM_CODE_START and SYM_CODE_END; livepatch then
> checks whether the task has an unreliable stack.
> For now my internal test code is working correctly, but my Kconfig excludes
> the STACK_VALIDATION dependency.
> 
> It seems that objtool will not be enabled yet, so I would like to test this more easily.
> If you are already testing with this patch, could you tell me how to do it?
> 
> 

For now, I have an instrumented kernel that directly invokes arch_stack_walk_reliable()
from various places in the kernel (interrupt handlers, exception handlers, ftrace entry,
kprobe handler, etc). I also have a test driver to induce conditions like null pointer
dereference. I use this to test different cases where arch_stack_walk_reliable() should
return an error.

As for livepatch testing, I have enhanced objtool and the kernel so the frame pointer can
be validated dynamically rather than statically. I have tested various different livepatch
selftests successfully. I have also written my own livepatch tests to add to the selftests.
I am currently working on preparing an RFC patch series for review. Basically, this series
implements STACK_VALIDATION in a different way.

I plan to publish my work soon (hopefully Feb 2022). I was going to do it in December. However,
my workload in Microsoft did not permit me to do that. I am also planning to set up a github
repo so people can try out my changes, if they are interested.

So, stay tuned.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE
  2022-01-25  5:21     ` nobuta.keiya
  2022-01-25 13:43       ` Madhavan T. Venkataraman
@ 2022-01-26 17:16       ` Mark Brown
  1 sibling, 0 replies; 75+ messages in thread
From: Mark Brown @ 2022-01-26 17:16 UTC (permalink / raw)
  To: nobuta.keiya
  Cc: 'madvenka@linux.microsoft.com',
	mark.rutland, jpoimboe, ardb, sjitindarsingh, catalin.marinas,
	will, jmorris, linux-arm-kernel, live-patching, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 281 bytes --]

On Tue, Jan 25, 2022 at 05:21:27AM +0000, nobuta.keiya@fujitsu.com wrote:

> I know that Julien Thierry published objtool support for arm64 [1], but I'm not
> sure if it has been updated. Could you point me to any other threads if you know of them?

I've not heard of anyone else picking that up.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* RE: [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE
  2022-01-26 17:14           ` Madhavan T. Venkataraman
@ 2022-01-27  1:13             ` nobuta.keiya
  0 siblings, 0 replies; 75+ messages in thread
From: nobuta.keiya @ 2022-01-27  1:13 UTC (permalink / raw)
  To: 'Madhavan T. Venkataraman',
	mark.rutland, broonie, jpoimboe, ardb, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

> >> I have not seen any activity on that in a long time. IIRC, Julien quit RedHat.
> >> I don't know if anyone else has taken over this work in RedHat.
> >>
> >> Sorry, I don't have any more information.
> >>
> >> Madhavan
> >
> > Thanks for your information.
> >
> > By the way, I'm considering test code for arch_stack_walk_reliable().
> > Specifically, I applied Suraj's patch to enable livepatch and added a function
> > that sleeps between SYM_CODE_START and SYM_CODE_END; livepatch then
> > checks whether the task has an unreliable stack.
> > For now my internal test code is working correctly, but my Kconfig excludes
> > the STACK_VALIDATION dependency.
> >
> > It seems that objtool will not be enabled yet, so I would like to test this more easily.
> > If you are already testing with this patch, could you tell me how to do it?
> >
> >
> 
> For now, I have an instrumented kernel that directly invokes arch_stack_walk_reliable()
> from various places in the kernel (interrupt handlers, exception handlers, ftrace entry,
> kprobe handler, etc). I also have a test driver to induce conditions like null pointer
> dereference. I use this to test different cases where arch_stack_walk_reliable() should
> return an error.

That's good to know, thanks.

> 
> As for livepatch testing, I have enhanced objtool and the kernel so the frame pointer can
> be validated dynamically rather than statically. I have tested various different livepatch
> selftests successfully. I have also written my own livepatch tests to add to the selftests.
> I am currently working on preparing an RFC patch series for review. Basically, this series
> implements STACK_VALIDATION in a different way.
> 
> I plan to publish my work soon (hopefully Feb 2022). I was going to do it in December. However,
> my workload in Microsoft did not permit me to do that. I am also planning to set up a github
> repo so people can try out my changes, if they are interested.
> 
> So, stay tuned.
> 
> Madhavan

I'm very interested, so I would be happy if you could tell me when the github repo is set up.


Thanks again,

Keiya

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 04/11] arm64: Split unwind_init()
  2022-01-17 14:56   ` [PATCH v13 04/11] arm64: Split unwind_init() madvenka
@ 2022-02-02 18:44     ` Mark Brown
  2022-02-03  0:26       ` Madhavan T. Venkataraman
  2022-02-15 13:07     ` Mark Rutland
  1 sibling, 1 reply; 75+ messages in thread
From: Mark Brown @ 2022-02-02 18:44 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1229 bytes --]

On Mon, Jan 17, 2022 at 08:56:01AM -0600, madvenka@linux.microsoft.com wrote:

> +/*
> + * TODO: document requirements here.
> + */
> +static inline void unwind_init_from_regs(struct unwind_state *state,
> +					 struct pt_regs *regs)

> +/*
> + * TODO: document requirements here.
> + *
> + * Note: this is always inlined, and we expect our caller to be a noinline
> + * function, such that this starts from our caller's caller.
> + */
> +static __always_inline void unwind_init_from_current(struct unwind_state *state)

> +/*
> + * TODO: document requirements here.
> + *
> + * The caller guarantees that the task is not running.
> + */
> +static inline void unwind_init_from_task(struct unwind_state *state,
> +					 struct task_struct *task)

Other than the obvious gap this looks good to me.  For _current() I
don't think we've got any particular requirements other than what's
documented.  For the others I think the main thing is that trying to
walk the stack of a task that is actively executing is going to be a bad
idea so we should say that the task shouldn't be running, but in general
given that one of the main use cases is printing diagnostics on error
we shouldn't have too many *requirements* for calling these.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 05/11] arm64: Copy the task argument to unwind_state
  2022-01-17 14:56   ` [PATCH v13 05/11] arm64: Copy the task argument to unwind_state madvenka
@ 2022-02-02 18:45     ` Mark Brown
  2022-02-15 13:22     ` Mark Rutland
  1 sibling, 0 replies; 75+ messages in thread
From: Mark Brown @ 2022-02-02 18:45 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 424 bytes --]

On Mon, Jan 17, 2022 at 08:56:02AM -0600, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Copy the task argument passed to arch_stack_walk() to unwind_state so that
> it can be passed to unwind functions via unwind_state rather than as a
> separate argument. The task is a fundamental part of the unwind state.

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-01-17 14:56   ` [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind() madvenka
@ 2022-02-02 18:46     ` Mark Brown
  2022-02-03  0:34       ` Madhavan T. Venkataraman
  2022-02-15 13:39     ` Mark Rutland
  1 sibling, 1 reply; 75+ messages in thread
From: Mark Brown @ 2022-02-02 18:46 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 428 bytes --]

On Mon, Jan 17, 2022 at 08:56:03AM -0600, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Rename the arguments to unwind() for better consistency. Also, use the
> typedef stack_trace_consume_fn for the consume_entry function as it is
> already defined in linux/stacktrace.h.

Consistency with...?  But otherwise:

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 04/11] arm64: Split unwind_init()
  2022-02-02 18:44     ` Mark Brown
@ 2022-02-03  0:26       ` Madhavan T. Venkataraman
  2022-02-03  0:39         ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-02-03  0:26 UTC (permalink / raw)
  To: Mark Brown
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 2/2/22 12:44, Mark Brown wrote:
> On Mon, Jan 17, 2022 at 08:56:01AM -0600, madvenka@linux.microsoft.com wrote:
> 
>> +/*
>> + * TODO: document requirements here.
>> + */
>> +static inline void unwind_init_from_regs(struct unwind_state *state,
>> +					 struct pt_regs *regs)
> 
>> +/*
>> + * TODO: document requirements here.
>> + *
>> + * Note: this is always inlined, and we expect our caller to be a noinline
>> + * function, such that this starts from our caller's caller.
>> + */
>> +static __always_inline void unwind_init_from_current(struct unwind_state *state)
> 
>> +/*
>> + * TODO: document requirements here.
>> + *
>> + * The caller guarantees that the task is not running.
>> + */
>> +static inline void unwind_init_from_task(struct unwind_state *state,
>> +					 struct task_struct *task)
> 
> Other than the obvious gap this looks good to me.  For _current() I
> don't think we've got any particular requirements other than what's
> documented.  For the others I think the main thing is that trying to
> walk the stack of a task that is actively executing is going to be a bad
> idea so we should say that the task shouldn't be running, but in general
> given that one of the main use cases is printing diagnostics on error
> we shouldn't have too many *requirements* for calling these.

OK. For now, I will remove the TODO comment from individual functions.
I will add only a common general comment above all 3 helpers that
additional requirements may be documented as seen fit. And, I will
add that the task must not be running in other-directed cases.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-02-02 18:46     ` Mark Brown
@ 2022-02-03  0:34       ` Madhavan T. Venkataraman
  2022-02-03 11:30         ` Mark Brown
  0 siblings, 1 reply; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-02-03  0:34 UTC (permalink / raw)
  To: Mark Brown
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 2/2/22 12:46, Mark Brown wrote:
> On Mon, Jan 17, 2022 at 08:56:03AM -0600, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Rename the arguments to unwind() for better consistency. Also, use the
>> typedef stack_trace_consume_fn for the consume_entry function as it is
>> already defined in linux/stacktrace.h.
> 
> Consistency with...?  But otherwise:

Naming consistency. E.g., the name consume_entry is used in a lot of places.
This code used to use fn() instead of consume_entry(). arch_stack_walk()
names the argument to consume_entry as cookie. This code calls it data
instead of cookie. That is all. It is minor in nature. But I thought I might
as well make it conform while I am at it.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 04/11] arm64: Split unwind_init()
  2022-02-03  0:26       ` Madhavan T. Venkataraman
@ 2022-02-03  0:39         ` Madhavan T. Venkataraman
  2022-02-03 11:29           ` Mark Brown
  0 siblings, 1 reply; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-02-03  0:39 UTC (permalink / raw)
  To: Mark Brown
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 2/2/22 18:26, Madhavan T. Venkataraman wrote:
> 
> 
> On 2/2/22 12:44, Mark Brown wrote:
>> On Mon, Jan 17, 2022 at 08:56:01AM -0600, madvenka@linux.microsoft.com wrote:
>>
>>> +/*
>>> + * TODO: document requirements here.
>>> + */
>>> +static inline void unwind_init_from_regs(struct unwind_state *state,
>>> +					 struct pt_regs *regs)
>>
>>> +/*
>>> + * TODO: document requirements here.
>>> + *
>>> + * Note: this is always inlined, and we expect our caller to be a noinline
>>> + * function, such that this starts from our caller's caller.
>>> + */
>>> +static __always_inline void unwind_init_from_current(struct unwind_state *state)
>>
>>> +/*
>>> + * TODO: document requirements here.
>>> + *
>>> + * The caller guarantees that the task is not running.
>>> + */
>>> +static inline void unwind_init_from_task(struct unwind_state *state,
>>> +					 struct task_struct *task)
>>
>> Other than the obvious gap this looks good to me.  For _current() I
>> don't think we've got any particular requirements other than what's
>> documented.  For the others I think the main thing is that trying to
>> walk the stack of a task that is actively executing is going to be a bad
>> idea so we should say that the task shouldn't be running, but in general
>> given that one of the main use cases is printing diagnostics on error
>> we shouldn't have too many *requirements* for calling these.
> 
> OK. For now, I will remove the TODO comment from individual functions.
> I will add only a common general comment above all 3 helpers that
> additional requirements may be documented as seen fit. And, I will
> add that the task must not be running in other-directed cases.
> 

If what I have suggested above for comments is good enough, can I get a
Reviewed-by for this? I will fix the comments on the next send.

> Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 04/11] arm64: Split unwind_init()
  2022-02-03  0:39         ` Madhavan T. Venkataraman
@ 2022-02-03 11:29           ` Mark Brown
  0 siblings, 0 replies; 75+ messages in thread
From: Mark Brown @ 2022-02-03 11:29 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 271 bytes --]

On Wed, Feb 02, 2022 at 06:39:29PM -0600, Madhavan T. Venkataraman wrote:

> If what I have suggested above for comments is good enough, can I get a
> Reviewed-by for this? I will fix the comments on the next send.

I would think so, I'll take a look at the new version.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-02-03  0:34       ` Madhavan T. Venkataraman
@ 2022-02-03 11:30         ` Mark Brown
  2022-02-03 14:45           ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 75+ messages in thread
From: Mark Brown @ 2022-02-03 11:30 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 842 bytes --]

On Wed, Feb 02, 2022 at 06:34:43PM -0600, Madhavan T. Venkataraman wrote:
> On 2/2/22 12:46, Mark Brown wrote:
> > On Mon, Jan 17, 2022 at 08:56:03AM -0600, madvenka@linux.microsoft.com wrote:

> >> Rename the arguments to unwind() for better consistency. Also, use the
> >> typedef stack_trace_consume_fn for the consume_entry function as it is
> >> already defined in linux/stacktrace.h.

> > Consistency with...?  But otherwise:

> Naming consistency. E.g., the name consume_entry is used in a lot of places.
> This code used to use fn() instead of consume_entry(). arch_stack_walk()
> names the argument to consume_entry as cookie. This code calls it data
> instead of cookie. That is all. It is minor in nature. But I thought I might
> as well make it conform while I am at it.

The commit message should probably say some of that then.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-02-03 11:30         ` Mark Brown
@ 2022-02-03 14:45           ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-02-03 14:45 UTC (permalink / raw)
  To: Mark Brown
  Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 2/3/22 05:30, Mark Brown wrote:
> On Wed, Feb 02, 2022 at 06:34:43PM -0600, Madhavan T. Venkataraman wrote:
>> On 2/2/22 12:46, Mark Brown wrote:
>>> On Mon, Jan 17, 2022 at 08:56:03AM -0600, madvenka@linux.microsoft.com wrote:
> 
>>>> Rename the arguments to unwind() for better consistency. Also, use the
>>>> typedef stack_trace_consume_fn for the consume_entry function as it is
>>>> already defined in linux/stacktrace.h.
> 
>>> Consistency with...?  But otherwise:
> 
>> Naming consistency. E.g., the name consume_entry is used in a lot of places.
>> This code used to use fn() instead of consume_entry(). arch_stack_walk()
>> names the argument to consume_entry as cookie. This code calls it data
>> instead of cookie. That is all. It is minor in nature. But I thought I might
>> as well make it conform while I am at it.
> 
> The commit message should probably say some of that then.

OK. Will add that to the commit message in the next version.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 04/11] arm64: Split unwind_init()
  2022-01-17 14:56   ` [PATCH v13 04/11] arm64: Split unwind_init() madvenka
  2022-02-02 18:44     ` Mark Brown
@ 2022-02-15 13:07     ` Mark Rutland
  2022-02-15 18:04       ` Madhavan T. Venkataraman
  1 sibling, 1 reply; 75+ messages in thread
From: Mark Rutland @ 2022-02-15 13:07 UTC (permalink / raw)
  To: madvenka
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

Hi Madhavan,

The diff itself largely looks good, but we need to actually write the comments.
Can you please pick up the wording I've written below for those?

That and renaming `unwind_init_from_current` to `unwind_init_from_caller`.

With those I think this is good, but I'd like to see the updated version before
I provide Acked-by or Reviewed-by tags -- hopefully that's just a formality! :)

On Mon, Jan 17, 2022 at 08:56:01AM -0600, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> unwind_init() is currently a single function that initializes all of the
> unwind state. Split it into the following functions and call them
> appropriately:
> 
> 	- unwind_init_from_regs() - initialize from regs passed by caller.
> 
> 	- unwind_init_from_current() - initialize for the current task
> 	  from the caller of arch_stack_walk().
> 
> 	- unwind_init_from_task() - initialize from the saved state of a
> 	  task other than the current task. In this case, the other
> 	  task must not be running.
> 
> This is done for two reasons:
> 
> 	- the different ways of initializing are clear
> 
> 	- specialized code can be added to each initializer in the future.
> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> ---
>  arch/arm64/kernel/stacktrace.c | 54 +++++++++++++++++++++++++++-------
>  1 file changed, 44 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index a1a7ff93b84f..b2b568e5deba 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -33,11 +33,8 @@
>   */
>  
>  
> -static void unwind_init(struct unwind_state *state, unsigned long fp,
> -			unsigned long pc)
> +static void unwind_init_common(struct unwind_state *state)
>  {
> -	state->fp = fp;
> -	state->pc = pc;
>  #ifdef CONFIG_KRETPROBES
>  	state->kr_cur = NULL;
>  #endif
> @@ -56,6 +53,46 @@ static void unwind_init(struct unwind_state *state, unsigned long fp,
>  	state->prev_type = STACK_TYPE_UNKNOWN;
>  }
>  
> +/*
> + * TODO: document requirements here.
> + */

Please make this:

/*
 * Start an unwind from a pt_regs.
 *
 * The unwind will begin at the PC within the regs.
 *
 * The regs must be on a stack currently owned by the calling task.
 */

> +static inline void unwind_init_from_regs(struct unwind_state *state,
> +					 struct pt_regs *regs)
> +{

In future we could add:

	WARN_ON_ONCE(!on_accessible_stack(current, regs, sizeof(*regs), NULL));

... to validate the requirements, but I'm happy to leave that for a future patch
so this patch can be a pure refactoring.

> +	unwind_init_common(state);
> +
> +	state->fp = regs->regs[29];
> +	state->pc = regs->pc;
> +}
> +
> +/*
> + * TODO: document requirements here.
> + *
> + * Note: this is always inlined, and we expect our caller to be a noinline
> + * function, such that this starts from our caller's caller.
> + */

Please make this:

/*
 * Start an unwind from a caller.
 *
 * The unwind will begin at the caller of whichever function this is inlined
 * into.
 *
 * The function which invokes this must be noinline.
 */

> +static __always_inline void unwind_init_from_current(struct unwind_state *state)

Can we please rename s/current/caller/ here? That way it's clear *where* in
current we're unwinding from, and the fact that it's current is implicit but
obvious.

> +{

Similarly to unwind_init_from_regs(), in a future patch we could add:

	WARN_ON_ONCE(task == current);

... but for now we can omit that so this patch can be a pure refactoring.

> +	unwind_init_common(state);
> +
> +	state->fp = (unsigned long)__builtin_frame_address(1);
> +	state->pc = (unsigned long)__builtin_return_address(0);
> +}
> +
> +/*
> + * TODO: document requirements here.
> + *
> + * The caller guarantees that the task is not running.
> + */

Please make this:

/*
 * Start an unwind from a blocked task.
 *
 * The unwind will begin at the blocked task's saved PC (i.e. the caller of
 * cpu_switch_to()).
 *
 * The caller should ensure the task is blocked in cpu_switch_to() for the
 * duration of the unwind, or the unwind will be bogus. It is never valid to
 * call this for the current task.
 */

Thanks,
Mark.

> +static inline void unwind_init_from_task(struct unwind_state *state,
> +					 struct task_struct *task)
> +{
> +	unwind_init_common(state);
> +
> +	state->fp = thread_saved_fp(task);
> +	state->pc = thread_saved_pc(task);
> +}
> +
>  /*
>   * Unwind from one frame record (A) to the next frame record (B).
>   *
> @@ -195,14 +232,11 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
>  	struct unwind_state state;
>  
>  	if (regs)
> -		unwind_init(&state, regs->regs[29], regs->pc);
> +		unwind_init_from_regs(&state, regs);
>  	else if (task == current)
> -		unwind_init(&state,
> -				(unsigned long)__builtin_frame_address(1),
> -				(unsigned long)__builtin_return_address(0));
> +		unwind_init_from_current(&state);
>  	else
> -		unwind_init(&state, thread_saved_fp(task),
> -				thread_saved_pc(task));
> +		unwind_init_from_task(&state, task);
>  
>  	unwind(task, &state, consume_entry, cookie);
>  }
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 05/11] arm64: Copy the task argument to unwind_state
  2022-01-17 14:56   ` [PATCH v13 05/11] arm64: Copy the task argument to unwind_state madvenka
  2022-02-02 18:45     ` Mark Brown
@ 2022-02-15 13:22     ` Mark Rutland
  2022-02-22 16:53       ` Madhavan T. Venkataraman
  1 sibling, 1 reply; 75+ messages in thread
From: Mark Rutland @ 2022-02-15 13:22 UTC (permalink / raw)
  To: madvenka
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

On Mon, Jan 17, 2022 at 08:56:02AM -0600, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Copy the task argument passed to arch_stack_walk() to unwind_state so that
> it can be passed to unwind functions via unwind_state rather than as a
> separate argument. The task is a fundamental part of the unwind state.
> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> ---
>  arch/arm64/include/asm/stacktrace.h |  3 +++
>  arch/arm64/kernel/stacktrace.c      | 29 ++++++++++++++++-------------
>  2 files changed, 19 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index 41ec360515f6..af423f5d7ad8 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -51,6 +51,8 @@ struct stack_info {
>   * @kr_cur:      When KRETPROBES is selected, holds the kretprobe instance
>   *               associated with the most recently encountered replacement lr
>   *               value.
> + *
> + * @task:        Pointer to the task structure.

Can we please say:

	@task:	The task being unwound.

>   */
>  struct unwind_state {
>  	unsigned long fp;
> @@ -61,6 +63,7 @@ struct unwind_state {
>  #ifdef CONFIG_KRETPROBES
>  	struct llist_node *kr_cur;
>  #endif
> +	struct task_struct *task;
>  };
>  
>  extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index b2b568e5deba..1b32e55735aa 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -33,8 +33,10 @@
>   */
>  
>  
> -static void unwind_init_common(struct unwind_state *state)
> +static void unwind_init_common(struct unwind_state *state,
> +			       struct task_struct *task)
>  {
> +	state->task = task;
>  #ifdef CONFIG_KRETPROBES
>  	state->kr_cur = NULL;
>  #endif
> @@ -57,9 +59,10 @@ static void unwind_init_common(struct unwind_state *state)
>   * TODO: document requirements here.
>   */
>  static inline void unwind_init_from_regs(struct unwind_state *state,
> +					 struct task_struct *task,

Please drop the `task` parameter here ...

>  					 struct pt_regs *regs)
>  {
> -	unwind_init_common(state);
> +	unwind_init_common(state, task);

... and make this:

	unwind_init_common(state, current);

... since that way it's *impossible* to have mismatched parameters, which is one
of the reasons for having separate functions in the first place.

>  	state->fp = regs->regs[29];
>  	state->pc = regs->pc;
> @@ -71,9 +74,10 @@ static inline void unwind_init_from_regs(struct unwind_state *state,
>   * Note: this is always inlined, and we expect our caller to be a noinline
>   * function, such that this starts from our caller's caller.
>   */
> -static __always_inline void unwind_init_from_current(struct unwind_state *state)
> +static __always_inline void unwind_init_from_current(struct unwind_state *state,
> +						     struct task_struct *task)
>  {
> -	unwind_init_common(state);
> +	unwind_init_common(state, task);

Same comments as for unwind_init_from_regs(): please drop the `task` parameter
and hard-code `current` in the call to unwind_init_common().

>  	state->fp = (unsigned long)__builtin_frame_address(1);
>  	state->pc = (unsigned long)__builtin_return_address(0);
> @@ -87,7 +91,7 @@ static __always_inline void unwind_init_from_current(struct unwind_state *state)
>  static inline void unwind_init_from_task(struct unwind_state *state,
>  					 struct task_struct *task)
>  {
> -	unwind_init_common(state);
> +	unwind_init_common(state, task);
>  
>  	state->fp = thread_saved_fp(task);
>  	state->pc = thread_saved_pc(task);
> @@ -100,11 +104,11 @@ static inline void unwind_init_from_task(struct unwind_state *state,
>   * records (e.g. a cycle), determined based on the location and fp value of A
>   * and the location (but not the fp value) of B.
>   */
> -static int notrace unwind_next(struct task_struct *tsk,
> -			       struct unwind_state *state)
> +static int notrace unwind_next(struct unwind_state *state)
>  {
>  	unsigned long fp = state->fp;
>  	struct stack_info info;
> +	struct task_struct *tsk = state->task;
>  
>  	/* Final frame; nothing to unwind */
>  	if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
> @@ -176,8 +180,7 @@ static int notrace unwind_next(struct task_struct *tsk,
>  }
>  NOKPROBE_SYMBOL(unwind_next);
>  
> -static void notrace unwind(struct task_struct *tsk,
> -			   struct unwind_state *state,
> +static void notrace unwind(struct unwind_state *state,
>  			   bool (*fn)(void *, unsigned long), void *data)
>  {
>  	while (1) {
> @@ -185,7 +188,7 @@ static void notrace unwind(struct task_struct *tsk,
>  
>  		if (!fn(data, state->pc))
>  			break;
> -		ret = unwind_next(tsk, state);
> +		ret = unwind_next(state);
>  		if (ret < 0)
>  			break;
>  	}
> @@ -232,11 +235,11 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
>  	struct unwind_state state;
>  
>  	if (regs)
> -		unwind_init_from_regs(&state, regs);
> +		unwind_init_from_regs(&state, task, regs);
>  	else if (task == current)
> -		unwind_init_from_current(&state);
> +		unwind_init_from_current(&state, task);
>  	else
>  		unwind_init_from_task(&state, task);

As above we shouldn't need these two changes.

For the regs case we might want to sanity-check that task == current.

> -	unwind(task, &state, consume_entry, cookie);
> +	unwind(&state, consume_entry, cookie);

Otherwise, this looks good to me.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-01-17 14:56   ` [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind() madvenka
  2022-02-02 18:46     ` Mark Brown
@ 2022-02-15 13:39     ` Mark Rutland
  2022-02-15 18:12       ` Madhavan T. Venkataraman
  2022-03-07 16:51       ` Madhavan T. Venkataraman
  1 sibling, 2 replies; 75+ messages in thread
From: Mark Rutland @ 2022-02-15 13:39 UTC (permalink / raw)
  To: madvenka
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

On Mon, Jan 17, 2022 at 08:56:03AM -0600, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Rename the arguments to unwind() for better consistency. Also, use the
> typedef stack_trace_consume_fn for the consume_entry function as it is
> already defined in linux/stacktrace.h.
>
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

How about: 

| arm64: align with common stacktrace naming
|
| For historical reasons, the naming of parameters and their types in the arm64
| stacktrace code differs from that used in generic code and other
| architectures, even though the types are equivalent.
|
| For consistency and clarity, use the generic names.

Either way:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/kernel/stacktrace.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 1b32e55735aa..f772dac78b11 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -181,12 +181,12 @@ static int notrace unwind_next(struct unwind_state *state)
>  NOKPROBE_SYMBOL(unwind_next);
>  
>  static void notrace unwind(struct unwind_state *state,
> -			   bool (*fn)(void *, unsigned long), void *data)
> +			   stack_trace_consume_fn consume_entry, void *cookie)
>  {
>  	while (1) {
>  		int ret;
>  
> -		if (!fn(data, state->pc))
> +		if (!consume_entry(cookie, state->pc))
>  			break;
>  		ret = unwind_next(state);
>  		if (ret < 0)
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 04/11] arm64: Split unwind_init()
  2022-02-15 13:07     ` Mark Rutland
@ 2022-02-15 18:04       ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-02-15 18:04 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 2/15/22 07:07, Mark Rutland wrote:
> Hi Madhavan,
> 
> The diff itself largely looks good, but we need to actually write the comments.
> Can you please pick up the wording I've written below for those?
> 
> That and renaming `unwind_init_from_current` to `unwind_init_from_caller`.
> 
> With those I think this is good, but I'd like to see the updated version before
> I provide Acked-by or Reviewed-by tags -- hopefully that's just a formality! :)
> 

Will do.

Madhavan

> On Mon, Jan 17, 2022 at 08:56:01AM -0600, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> unwind_init() is currently a single function that initializes all of the
>> unwind state. Split it into the following functions and call them
>> appropriately:
>>
>> 	- unwind_init_from_regs() - initialize from regs passed by caller.
>>
>> 	- unwind_init_from_current() - initialize for the current task
>> 	  from the caller of arch_stack_walk().
>>
>> 	- unwind_init_from_task() - initialize from the saved state of a
>> 	  task other than the current task. In this case, the other
>> 	  task must not be running.
>>
>> This is done for two reasons:
>>
>> 	- the different ways of initializing are clear
>>
>> 	- specialized code can be added to each initializer in the future.
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
>> ---
>>  arch/arm64/kernel/stacktrace.c | 54 +++++++++++++++++++++++++++-------
>>  1 file changed, 44 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
>> index a1a7ff93b84f..b2b568e5deba 100644
>> --- a/arch/arm64/kernel/stacktrace.c
>> +++ b/arch/arm64/kernel/stacktrace.c
>> @@ -33,11 +33,8 @@
>>   */
>>  
>>  
>> -static void unwind_init(struct unwind_state *state, unsigned long fp,
>> -			unsigned long pc)
>> +static void unwind_init_common(struct unwind_state *state)
>>  {
>> -	state->fp = fp;
>> -	state->pc = pc;
>>  #ifdef CONFIG_KRETPROBES
>>  	state->kr_cur = NULL;
>>  #endif
>> @@ -56,6 +53,46 @@ static void unwind_init(struct unwind_state *state, unsigned long fp,
>>  	state->prev_type = STACK_TYPE_UNKNOWN;
>>  }
>>  
>> +/*
>> + * TODO: document requirements here.
>> + */
> 
> Please make this:
> 
> /*
>  * Start an unwind from a pt_regs.
>  *
>  * The unwind will begin at the PC within the regs.
>  *
>  * The regs must be on a stack currently owned by the calling task.
>  */
> 
>> +static inline void unwind_init_from_regs(struct unwind_state *state,
>> +					 struct pt_regs *regs)
>> +{
> 
> In future we could add:
> 
> 	WARN_ON_ONCE(!on_accessible_stack(current, regs, sizeof(*regs), NULL));
> 
> ... to validate the requirements, but I'm happy to leave that for a future patch
> so this patch can be a pure refactoring.
> 
>> +	unwind_init_common(state);
>> +
>> +	state->fp = regs->regs[29];
>> +	state->pc = regs->pc;
>> +}
>> +
>> +/*
>> + * TODO: document requirements here.
>> + *
>> + * Note: this is always inlined, and we expect our caller to be a noinline
>> + * function, such that this starts from our caller's caller.
>> + */
> 
> Please make this:
> 
> /*
>  * Start an unwind from a caller.
>  *
>  * The unwind will begin at the caller of whichever function this is inlined
>  * into.
>  *
>  * The function which invokes this must be noinline.
>  */
> 
>> +static __always_inline void unwind_init_from_current(struct unwind_state *state)
> 
> Can we please rename s/current/caller/ here? That way it's clear *where* in
> current we're unwinding from, and the fact that it's current is implicit but
> obvious.
> 
>> +{
> 
> Similarly to unwind_init_from_regs(), in a future patch we could add:
> 
> 	WARN_ON_ONCE(task == current);
> 
> ... but for now we can omit that so this patch can be a pure refactoring.
> 
>> +	unwind_init_common(state);
>> +
>> +	state->fp = (unsigned long)__builtin_frame_address(1);
>> +	state->pc = (unsigned long)__builtin_return_address(0);
>> +}
>> +
>> +/*
>> + * TODO: document requirements here.
>> + *
>> + * The caller guarantees that the task is not running.
>> + */
> 
> Please make this:
> 
> /*
>  * Start an unwind from a blocked task.
>  *
>  * The unwind will begin at the blocked task's saved PC (i.e. the caller of
>  * cpu_switch_to()).
>  *
>  * The caller should ensure the task is blocked in cpu_switch_to() for the
>  * duration of the unwind, or the unwind will be bogus. It is never valid to
>  * call this for the current task.
>  */
> 
> Thanks,
> Mark.
> 
>> +static inline void unwind_init_from_task(struct unwind_state *state,
>> +					 struct task_struct *task)
>> +{
>> +	unwind_init_common(state);
>> +
>> +	state->fp = thread_saved_fp(task);
>> +	state->pc = thread_saved_pc(task);
>> +}
>> +
>>  /*
>>   * Unwind from one frame record (A) to the next frame record (B).
>>   *
>> @@ -195,14 +232,11 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
>>  	struct unwind_state state;
>>  
>>  	if (regs)
>> -		unwind_init(&state, regs->regs[29], regs->pc);
>> +		unwind_init_from_regs(&state, regs);
>>  	else if (task == current)
>> -		unwind_init(&state,
>> -				(unsigned long)__builtin_frame_address(1),
>> -				(unsigned long)__builtin_return_address(0));
>> +		unwind_init_from_current(&state);
>>  	else
>> -		unwind_init(&state, thread_saved_fp(task),
>> -				thread_saved_pc(task));
>> +		unwind_init_from_task(&state, task);
>>  
>>  	unwind(task, &state, consume_entry, cookie);
>>  }
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-02-15 13:39     ` Mark Rutland
@ 2022-02-15 18:12       ` Madhavan T. Venkataraman
  2022-03-07 16:51       ` Madhavan T. Venkataraman
  1 sibling, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-02-15 18:12 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 2/15/22 07:39, Mark Rutland wrote:
> On Mon, Jan 17, 2022 at 08:56:03AM -0600, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Rename the arguments to unwind() for better consistency. Also, use the
>> typedef stack_trace_consume_fn for the consume_entry function as it is
>> already defined in linux/stacktrace.h.
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> 
> How about: 
> 
> | arm64: align with common stacktrace naming
> |
> | For historical reasons, the naming of parameters and their types in the arm64
> | stacktrace code differs from that used in generic code and other
> | architectures, even though the types are equivalent.
> |
> | For consistency and clarity, use the generic names.
> 

Will add this.

Madhavan

> Either way:
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> 
> Mark.
> 
>> ---
>>  arch/arm64/kernel/stacktrace.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
>> index 1b32e55735aa..f772dac78b11 100644
>> --- a/arch/arm64/kernel/stacktrace.c
>> +++ b/arch/arm64/kernel/stacktrace.c
>> @@ -181,12 +181,12 @@ static int notrace unwind_next(struct unwind_state *state)
>>  NOKPROBE_SYMBOL(unwind_next);
>>  
>>  static void notrace unwind(struct unwind_state *state,
>> -			   bool (*fn)(void *, unsigned long), void *data)
>> +			   stack_trace_consume_fn consume_entry, void *cookie)
>>  {
>>  	while (1) {
>>  		int ret;
>>  
>> -		if (!fn(data, state->pc))
>> +		if (!consume_entry(cookie, state->pc))
>>  			break;
>>  		ret = unwind_next(state);
>>  		if (ret < 0)
>> -- 
>> 2.25.1
>>
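
For reference, a typical consumer of this interface (not part of this patch,
just an illustration using the generic stacktrace API) looks something like:

	/* Print every return address; returning false stops the walk. */
	static bool dump_entry(void *cookie, unsigned long pc)
	{
		pr_info("  %pS\n", (void *)pc);
		return true;
	}

	...
	arch_stack_walk(dump_entry, NULL, current, NULL);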

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 05/11] arm64: Copy the task argument to unwind_state
  2022-02-15 13:22     ` Mark Rutland
@ 2022-02-22 16:53       ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-02-22 16:53 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

It looks like I forgot to reply to this. Sorry about that.

On 2/15/22 07:22, Mark Rutland wrote:
> On Mon, Jan 17, 2022 at 08:56:02AM -0600, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Copy the task argument passed to arch_stack_walk() to unwind_state so that
>> it can be passed to unwind functions via unwind_state rather than as a
>> separate argument. The task is a fundamental part of the unwind state.
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
>> ---
>>  arch/arm64/include/asm/stacktrace.h |  3 +++
>>  arch/arm64/kernel/stacktrace.c      | 29 ++++++++++++++++-------------
>>  2 files changed, 19 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
>> index 41ec360515f6..af423f5d7ad8 100644
>> --- a/arch/arm64/include/asm/stacktrace.h
>> +++ b/arch/arm64/include/asm/stacktrace.h
>> @@ -51,6 +51,8 @@ struct stack_info {
>>   * @kr_cur:      When KRETPROBES is selected, holds the kretprobe instance
>>   *               associated with the most recently encountered replacement lr
>>   *               value.
>> + *
>> + * @task:        Pointer to the task structure.
> 
> Can we please say:
> 
> 	@task:	The task being unwound.
> 

Will do.

>>   */
>>  struct unwind_state {
>>  	unsigned long fp;
>> @@ -61,6 +63,7 @@ struct unwind_state {
>>  #ifdef CONFIG_KRETPROBES
>>  	struct llist_node *kr_cur;
>>  #endif
>> +	struct task_struct *task;
>>  };
>>  
>>  extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
>> index b2b568e5deba..1b32e55735aa 100644
>> --- a/arch/arm64/kernel/stacktrace.c
>> +++ b/arch/arm64/kernel/stacktrace.c
>> @@ -33,8 +33,10 @@
>>   */
>>  
>>  
>> -static void unwind_init_common(struct unwind_state *state)
>> +static void unwind_init_common(struct unwind_state *state,
>> +			       struct task_struct *task)
>>  {
>> +	state->task = task;
>>  #ifdef CONFIG_KRETPROBES
>>  	state->kr_cur = NULL;
>>  #endif
>> @@ -57,9 +59,10 @@ static void unwind_init_common(struct unwind_state *state)
>>   * TODO: document requirements here.
>>   */
>>  static inline void unwind_init_from_regs(struct unwind_state *state,
>> +					 struct task_struct *task,
> 
> Please drop the `task` parameter here ...

OK.

> 
>>  					 struct pt_regs *regs)
>>  {
>> -	unwind_init_common(state);
>> +	unwind_init_common(state, task);
> 
> ... and make this:
> 
> 	unwind_init_common(state, current);

OK.

> 
> ... since that way it's *impossible* to have mismatched parameters, which is one
> of the reasons for having separate functions in the first place.
> 
>>  	state->fp = regs->regs[29];
>>  	state->pc = regs->pc;
>> @@ -71,9 +74,10 @@ static inline void unwind_init_from_regs(struct unwind_state *state,
>>   * Note: this is always inlined, and we expect our caller to be a noinline
>>   * function, such that this starts from our caller's caller.
>>   */
>> -static __always_inline void unwind_init_from_current(struct unwind_state *state)
>> +static __always_inline void unwind_init_from_current(struct unwind_state *state,
>> +						     struct task_struct *task)
>>  {
>> -	unwind_init_common(state);
>> +	unwind_init_common(state, task);
> 
> Same comments as for unwind_init_from_regs(): please drop the `task` parameter
> and hard-code `current` in the call to unwind_init_common().
> 

OK.

>>  	state->fp = (unsigned long)__builtin_frame_address(1);
>>  	state->pc = (unsigned long)__builtin_return_address(0);
>> @@ -87,7 +91,7 @@ static __always_inline void unwind_init_from_current(struct unwind_state *state)
>>  static inline void unwind_init_from_task(struct unwind_state *state,
>>  					 struct task_struct *task)
>>  {
>> -	unwind_init_common(state);
>> +	unwind_init_common(state, task);
>>  
>>  	state->fp = thread_saved_fp(task);
>>  	state->pc = thread_saved_pc(task);
>> @@ -100,11 +104,11 @@ static inline void unwind_init_from_task(struct unwind_state *state,
>>   * records (e.g. a cycle), determined based on the location and fp value of A
>>   * and the location (but not the fp value) of B.
>>   */
>> -static int notrace unwind_next(struct task_struct *tsk,
>> -			       struct unwind_state *state)
>> +static int notrace unwind_next(struct unwind_state *state)
>>  {
>>  	unsigned long fp = state->fp;
>>  	struct stack_info info;
>> +	struct task_struct *tsk = state->task;
>>  
>>  	/* Final frame; nothing to unwind */
>>  	if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
>> @@ -176,8 +180,7 @@ static int notrace unwind_next(struct task_struct *tsk,
>>  }
>>  NOKPROBE_SYMBOL(unwind_next);
>>  
>> -static void notrace unwind(struct task_struct *tsk,
>> -			   struct unwind_state *state,
>> +static void notrace unwind(struct unwind_state *state,
>>  			   bool (*fn)(void *, unsigned long), void *data)
>>  {
>>  	while (1) {
>> @@ -185,7 +188,7 @@ static void notrace unwind(struct task_struct *tsk,
>>  
>>  		if (!fn(data, state->pc))
>>  			break;
>> -		ret = unwind_next(tsk, state);
>> +		ret = unwind_next(state);
>>  		if (ret < 0)
>>  			break;
>>  	}
>> @@ -232,11 +235,11 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
>>  	struct unwind_state state;
>>  
>>  	if (regs)
>> -		unwind_init_from_regs(&state, regs);
>> +		unwind_init_from_regs(&state, task, regs);
>>  	else if (task == current)
>> -		unwind_init_from_current(&state);
>> +		unwind_init_from_current(&state, task);
>>  	else
>>  		unwind_init_from_task(&state, task);
> 
> As above we shouldn't need these two changes.
> 
> For the regs case we might want to sanity-check that task == current.
> 

Will do.

>> -	unwind(task, &state, consume_entry, cookie);
>> +	unwind(&state, consume_entry, cookie);
> 
> Otherwise, this looks good to me.

Thanks.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-02-15 13:39     ` Mark Rutland
  2022-02-15 18:12       ` Madhavan T. Venkataraman
@ 2022-03-07 16:51       ` Madhavan T. Venkataraman
  2022-03-07 17:01         ` Mark Brown
  2022-04-08 14:44         ` Mark Rutland
  1 sibling, 2 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-03-07 16:51 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

Hey Mark Rutland, Mark Brown,

Could you please review the rest of the patches in the series when you can?

Also, many of the patches have received a Reviewed-By from you both. So, after I send the next version out, can we upstream those ones?

Thanks.

Madhavan

On 2/15/22 07:39, Mark Rutland wrote:
> On Mon, Jan 17, 2022 at 08:56:03AM -0600, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Rename the arguments to unwind() for better consistency. Also, use the
>> typedef stack_trace_consume_fn for the consume_entry function as it is
>> already defined in linux/stacktrace.h.
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> 
> How about: 
> 
> | arm64: align with common stacktrace naming
> |
> | For historical reasons, the naming of parameters and their types in the arm64
> | stacktrace code differs from that used in generic code and other
> | architectures, even though the types are equivalent.
> |
> | For consistency and clarity, use the generic names.
> 
> Either way:
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> 
> Mark.
> 
>> ---
>>  arch/arm64/kernel/stacktrace.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
>> index 1b32e55735aa..f772dac78b11 100644
>> --- a/arch/arm64/kernel/stacktrace.c
>> +++ b/arch/arm64/kernel/stacktrace.c
>> @@ -181,12 +181,12 @@ static int notrace unwind_next(struct unwind_state *state)
>>  NOKPROBE_SYMBOL(unwind_next);
>>  
>>  static void notrace unwind(struct unwind_state *state,
>> -			   bool (*fn)(void *, unsigned long), void *data)
>> +			   stack_trace_consume_fn consume_entry, void *cookie)
>>  {
>>  	while (1) {
>>  		int ret;
>>  
>> -		if (!fn(data, state->pc))
>> +		if (!consume_entry(cookie, state->pc))
>>  			break;
>>  		ret = unwind_next(state);
>>  		if (ret < 0)
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-03-07 16:51       ` Madhavan T. Venkataraman
@ 2022-03-07 17:01         ` Mark Brown
  2022-03-08 22:00           ` Madhavan T. Venkataraman
  2022-04-08 14:44         ` Mark Rutland
  1 sibling, 1 reply; 75+ messages in thread
From: Mark Brown @ 2022-03-07 17:01 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: Mark Rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

On Mon, Mar 07, 2022 at 10:51:38AM -0600, Madhavan T. Venkataraman wrote:
> Hey Mark Rutland, Mark Brown,
> 
> Could you please review the rest of the patches in the series when you can?
> 

Please don't send content free pings.  As far as I remember I'd reviewed
or was expecting changes based on review or dependent patches for
everything that you'd sent.

> Also, many of the patches have received a Reviewed-By from you both. So, after I send the next version out, can we upstream those ones?

That's more a question for Catalin and Will.  If myself and Mark have
reviewed patches then we're saying we think those patches are good to
go.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-03-07 17:01         ` Mark Brown
@ 2022-03-08 22:00           ` Madhavan T. Venkataraman
  2022-03-09 11:47             ` Mark Brown
  0 siblings, 1 reply; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-03-08 22:00 UTC (permalink / raw)
  To: Mark Brown
  Cc: Mark Rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 3/7/22 11:01, Mark Brown wrote:
> On Mon, Mar 07, 2022 at 10:51:38AM -0600, Madhavan T. Venkataraman wrote:
>> Hey Mark Rutland, Mark Brown,
>>
>> Could you please review the rest of the patches in the series when you can?
>>
> 
> Please don't send content free pings.  As far as I remember I'd reviewed
> or was expecting changes based on review or dependent patches for
> everything that you'd sent.
> 

Indeed you did! Many thanks!

It is just that patch 11 that defines "select HAVE_RELIABLE_STACKTRACE" did not receive any comments from you (unless I missed a comment that came from you. That is entirely possible. If I missed it, my bad). Since you suggested that change, I just wanted to make sure that that patch looks OK to you.

>> Also, many of the patches have received a Reviewed-By from you both. So, after I send the next version out, can we upstream those ones?
> 
> That's more a question for Catalin and Will.  If myself and Mark have
> reviewed patches then we're saying we think those patches are good to
> go.

Got it!

Thanks!

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-03-08 22:00           ` Madhavan T. Venkataraman
@ 2022-03-09 11:47             ` Mark Brown
  2022-03-09 15:34               ` Madhavan T. Venkataraman
                                 ` (2 more replies)
  0 siblings, 3 replies; 75+ messages in thread
From: Mark Brown @ 2022-03-09 11:47 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: Mark Rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

On Tue, Mar 08, 2022 at 04:00:35PM -0600, Madhavan T. Venkataraman wrote:

> It is just that patch 11 that defines "select
> HAVE_RELIABLE_STACKTRACE" did not receive any comments from you
> (unless I missed a comment that came from you. That is entirely
> possible. If I missed it, my bad). Since you suggested that change, I
> just wanted to make sure that that patch looks OK to you.

I think that's more a question for the livepatch people to be honest -
it's not entirely a technical one, there's a bunch of confidence level
stuff going on.  For example there was some suggestion that people might
insist on having objtool support, though there's also substantial
pushback on making objtool a requirement for anything from other
quarters.  I was hoping that posting that patch would provoke some
discussion about what exactly is needed but that's not happened thus
far.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-03-09 11:47             ` Mark Brown
@ 2022-03-09 15:34               ` Madhavan T. Venkataraman
  2022-03-10  8:33               ` Miroslav Benes
  2022-03-16  3:43               ` Josh Poimboeuf
  2 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-03-09 15:34 UTC (permalink / raw)
  To: Mark Brown
  Cc: Mark Rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 3/9/22 05:47, Mark Brown wrote:
> On Tue, Mar 08, 2022 at 04:00:35PM -0600, Madhavan T. Venkataraman wrote:
> 
>> It is just that patch 11 that defines "select
>> HAVE_RELIABLE_STACKTRACE" did not receive any comments from you
>> (unless I missed a comment that came from you. That is entirely
>> possible. If I missed it, my bad). Since you suggested that change, I
>> just wanted to make sure that that patch looks OK to you.
> 
> I think that's more a question for the livepatch people to be honest -
> it's not entirely a technical one, there's a bunch of confidence level
> stuff going on.  For example there was some suggestion that people might
> insist on having objtool support, though there's also substantial
> pushback on making objtool a requirement for anything from other
> quarters.  I was hoping that posting that patch would provoke some
> discussion about what exactly is needed but that's not happened thus
> far.

Understood. In that case, I will remove that patch because it is not really required for my current work on the unwinder. I will bring this up later in a different patch series where it will trigger a discussion.

Thanks.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-03-09 11:47             ` Mark Brown
  2022-03-09 15:34               ` Madhavan T. Venkataraman
@ 2022-03-10  8:33               ` Miroslav Benes
  2022-03-10 12:36                 ` Madhavan T. Venkataraman
  2022-03-16  3:43               ` Josh Poimboeuf
  2 siblings, 1 reply; 75+ messages in thread
From: Miroslav Benes @ 2022-03-10  8:33 UTC (permalink / raw)
  To: Mark Brown
  Cc: Madhavan T. Venkataraman, Mark Rutland, jpoimboe, ardb,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jmorris,
	linux-arm-kernel, live-patching, linux-kernel

On Wed, 9 Mar 2022, Mark Brown wrote:

> On Tue, Mar 08, 2022 at 04:00:35PM -0600, Madhavan T. Venkataraman wrote:
> 
> > It is just that patch 11 that defines "select
> > HAVE_RELIABLE_STACKTRACE" did not receive any comments from you
> > (unless I missed a comment that came from you. That is entirely
> > possible. If I missed it, my bad). Since you suggested that change, I
> > just wanted to make sure that that patch looks OK to you.
> 
> I think that's more a question for the livepatch people to be honest -
> it's not entirely a technical one, there's a bunch of confidence level
> stuff going on.  For example there was some suggestion that people might
> insist on having objtool support, though there's also substantial
> pushback on making objtool a requirement for anything from other
> quarters.  I was hoping that posting that patch would provoke some
> discussion about what exactly is needed but that's not happened thus
> far.

I think everyone will be happy with HAVE_RELIABLE_STACKTRACE on arm64 as 
long as there is a guarantee that stack traces are really reliable. My 
understanding is that there is still some work to be done on arm64 arch 
side (but I may have misunderstood what Mark R. said elsewhere). And yes, 
then there is a question of objtool. It is one option but not the only 
one. There have been proposals of implementing guarantees on a compiler 
side and leaving objtool for x86_64 only (albeit objtool may bring more 
features to the table... ORC, arch features checking).

Madhavan also mentioned that he enhanced objtool and he planned to submit 
it eventually 
(https://lore.kernel.org/all/1a0e19db-a7f8-4c8e-0163-398fcd364d54@linux.microsoft.com/T/#u), 
so maybe arm64 maintainers could decide on a future direction based on 
that?

Regards
Miroslav


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-03-10  8:33               ` Miroslav Benes
@ 2022-03-10 12:36                 ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-03-10 12:36 UTC (permalink / raw)
  To: Miroslav Benes, Mark Brown
  Cc: Mark Rutland, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 3/10/22 02:33, Miroslav Benes wrote:
> On Wed, 9 Mar 2022, Mark Brown wrote:
> 
>> On Tue, Mar 08, 2022 at 04:00:35PM -0600, Madhavan T. Venkataraman wrote:
>>
>>> It is just that patch 11 that defines "select
>>> HAVE_RELIABLE_STACKTRACE" did not receive any comments from you
>>> (unless I missed a comment that came from you. That is entirely
>>> possible. If I missed it, my bad). Since you suggested that change, I
>>> just wanted to make sure that that patch looks OK to you.
>>
>> I think that's more a question for the livepatch people to be honest -
>> it's not entirely a technical one, there's a bunch of confidence level
>> stuff going on.  For example there was some suggestion that people might
>> insist on having objtool support, though there's also substantial
>> pushback on making objtool a requirement for anything from other
>> quarters.  I was hoping that posting that patch would provoke some
>> discussion about what exactly is needed but that's not happened thus
>> far.
> 
> I think everyone will be happy with HAVE_RELIABLE_STACKTRACE on arm64 as 
> long as there is a guarantee that stack traces are really reliable. My 
> understanding is that there is still some work to be done on arm64 arch 
> side (but I may have misunderstood what Mark R. said elsewhere). And yes, 
> then there is a question of objtool. It is one option but not the only 
> one. There have been proposals of implementing guarantees on a compiler 
> side and leaving objtool for x86_64 only (albeit objtool may bring more 
> features to the table... ORC, arch features checking).
> 
> Madhavan also mentioned that he enhanced objtool and he planned to submit 
> it eventually 
> (https://lore.kernel.org/all/1a0e19db-a7f8-4c8e-0163-398fcd364d54@linux.microsoft.com/T/#u), 
> so maybe arm64 maintainers could decide on a future direction based on 
> that?
> 

Yes. I am working on that right now. Hope to send it out soon.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-03-09 11:47             ` Mark Brown
  2022-03-09 15:34               ` Madhavan T. Venkataraman
  2022-03-10  8:33               ` Miroslav Benes
@ 2022-03-16  3:43               ` Josh Poimboeuf
  2 siblings, 0 replies; 75+ messages in thread
From: Josh Poimboeuf @ 2022-03-16  3:43 UTC (permalink / raw)
  To: Mark Brown
  Cc: Madhavan T. Venkataraman, Mark Rutland, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel

On Wed, Mar 09, 2022 at 11:47:38AM +0000, Mark Brown wrote:
> On Tue, Mar 08, 2022 at 04:00:35PM -0600, Madhavan T. Venkataraman wrote:
> 
> > It is just that patch 11 that defines "select
> > HAVE_RELIABLE_STACKTRACE" did not receive any comments from you
> > (unless I missed a comment that came from you. That is entirely
> > possible. If I missed it, my bad). Since you suggested that change, I
> > just wanted to make sure that that patch looks OK to you.
> 
> I think that's more a question for the livepatch people to be honest -
> it's not entirely a technical one, there's a bunch of confidence level
> stuff going on.  For example there was some suggestion that people might
> insist on having objtool support, though there's also substantial
> pushback on making objtool a requirement for anything from other
> quarters.  I was hoping that posting that patch would provoke some
> discussion about what exactly is needed but that's not happened thus
> far.

That patch has HAVE_RELIABLE_STACKTRACE depending on STACK_VALIDATION,
which doesn't exist on arm64.

So it didn't seem controversial enough to warrant discussion ;-)

-- 
Josh


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
       [not found] <95691cae4f4504f33d0fc9075541b1e7deefe96f>
  2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
@ 2022-04-07 20:25 ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 1/9] objtool: Parse DWARF Call Frame Information in object files madvenka
                     ` (11 more replies)
  1 sibling, 12 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Introduction
============

The livepatch feature requires an unwinder that can provide a reliable stack
trace. General requirements for a reliable unwinder are described in this
document from Mark Rutland:

	Documentation/livepatch/reliable-stacktrace.rst

The requirements have two parts:

1. The unwinder must be enhanced with certain features. E.g.,

	- Identifying successful termination of stack trace
	- Identifying unwindable and non-unwindable code
	- Identifying interrupts and exceptions occurring in the frame pointer
	  prolog and epilog
	- Identifying features such as kretprobe and ftrace graph tracing
	  that can modify the return address stored on the stack
	- Identifying corrupted/unreliable stack contents
	- Architecture-specific items that can render a stack trace unreliable
	  at certain points in code

	Some of these features are already in the arm64 unwinder. I am pursuing
	the rest in another patch series. This is work in progress. The latest
	submission as of this writing is here:

https://lore.kernel.org/linux-arm-kernel/20220117145608.6781-1-madvenka@linux.microsoft.com/T/#t

2. Validation of the frame pointer

	This assumes that the unwinder is based on the frame pointer (FP).
	The actual frame pointer that the unwinder uses cannot just be
	assumed to be correct. It needs to be validated somehow.

	This patch series is to address this requirement.

Validation of the FP (aka STACK_VALIDATION)
====================

The current approach in Linux is to use objtool, a build time tool, for this
purpose. When configured, objtool is invoked on every relocatable object file
during kernel build. It performs static analysis of the code in each file. It
walks the instructions in every function and notes the changes to the stack
pointer (SP) and the frame pointer (FP). It makes sure that the changes are in
accordance with the ABI rules. Once objtool completes successfully, the kernel
can then be used for livepatch purposes.

Objtool can have uses other than just FP validation. For instance, it can check
control flow integrity during its analysis.

Problem
=======

Objtool is complex and highly architecture-dependent, which makes supporting
livepatch on a new architecture a significant challenge. We need an
alternative solution for livepatch, preferably one that is largely
architecture-independent.

A different approach
====================

I would like to propose a different approach for FP validation - one that is
simpler as well as architecture-independent for the most part. This initial
work is for arm64. But it can easily be extended to other architectures.

In this approach, objtool is used to generate data for the unwinder. The
unwinder uses the data during a stack trace to validate the FP in each
frame. In other words, the FP is validated dynamically and not statically
at build time.
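
To make the idea concrete, here is a rough sketch of what the check in the
unwinder amounts to. The helper name dwarf_compute_fp() below is hypothetical
and does not match the actual interfaces added by this series:

	/*
	 * Illustrative sketch only. dwarf_compute_fp() stands in for
	 * "look up the objtool-generated rule for this PC and compute
	 * the expected frame pointer from it".
	 */
	static bool unwind_fp_is_reliable(unsigned long pc, unsigned long sp,
					  unsigned long actual_fp)
	{
		unsigned long computed_fp;

		/*
		 * No rule for this PC (assembly, generated code, FP
		 * prolog/epilog, ...): treat the stack trace as unreliable.
		 */
		if (!dwarf_compute_fp(pc, sp, &computed_fp))
			return false;

		/* The trace is reliable only if the frame record agrees. */
		return computed_fp == actual_fp;
	}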

Background for the solution
===========================

DWARF is a debugging file format used by many compilers and debuggers to
support source level debugging. One of the components of DWARF is the
DWARF Call Frame Information (CFI). A special section called .debug_frame
is generated by the compiler to contain CFI. CFI supplies all the rules
required to compute the contents of every register at every instruction
address. A complete unwinder can be built from CFI.

However, DWARF is complex and building a complete unwinder from DWARF CFI
is a ship that has already sailed. That is not the purpose of this patch
series.

The solution
============

The goal here is to use the absolute minimum CFI needed to compute the FP at
every instruction address. The unwinder can compute the FP in each frame,
compare the actual FP with the computed one and validate the actual FP.

Objtool is enhanced to parse the CFI, extract just the rules required,
encode them in compact data structures and create special sections for
the rules. The unwinder uses the special sections to find the rules for
a given instruction address and compute the FP.

Objtool can be invoked as follows:

	objtool dwarf generate <object-file>

The version of the DWARF standard supported in this work is version 4. The
official documentation for this version is here:

	https://dwarfstd.org/doc/DWARF4.pdf

Section 6.4 contains the description of the CFI.

Register rules in CFI
=====================

CFI defines the Canonical Frame Address (CFA) as the value of the stack
pointer (SP) when a call instruction is executed. For the called function,
register values are expressed relative to the CFA.

DWARF CFI defines the following rules to obtain the value of a register
in the previous frame, given a current frame (a worked example follows the list):

1. Same_Value:

	The current and previous values of the register are the same.

2. Val_Offset(N):

	The previous value is (CFA + N) where N is a signed offset.

3. Offset(N):

	The previous value is saved at (CFA + N).

4. register(R):

	The previous value is saved in register R.

5. Val_Expression(E):

	The previous value is the value produced by evaluating a given
	DWARF expression. DWARF expressions are evaluated on a stack. That
	is, operands are pushed and popped on a stack, DWARF operators are
	applied on them and the result is obtained.

6. Expression(E):

	The previous value is stored at the address computed from a DWARF
	expression.

7. Architectural:

	The previous value is obtained in an architecture-specific way via
	an architecture-specific "augmentor". Augmentors are vendor specific
	and are not part of the DWARF standard.
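
As a concrete illustration of rules (1), (2) and (3): for a typical arm64
prologue that saves the frame record with "stp x29, x30, [sp, #-32]!" followed
by "mov x29, sp", the compiler usually emits rules equivalent to:

	CFA       = SP + 32		the SP at the call site
	SP (prev) = CFA			i.e., Val_Offset(0)
	FP (prev) : Offset(-32)		saved at CFA - 32
	RA (prev) : Offset(-24)		saved at CFA - 24

(The exact offsets depend on the frame size. This example is mine and is not
taken from the series.)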

The minimum CFI needed for this work
====================================

Fortunately, gcc and clang only generate rules (1), (2) and (3) for the
SP, FP and return address (RA). So, this implementation only supports these
3 rules. These are very simple rules. At the time of this writing, they have
been found to be sufficient for ARM, ARM64 and RISC-V.

As an exercise, I also ran my CFI parser on X64. For a very small percentage
of the functions, DWARF expressions are indeed used. Of course, X64 already
has a complete objtool-based static stack validation scheme. So, X64 does
not need this.

I have not checked other architectures so far.

Compact encoding of CFI
=======================

The CFI is defined in a very generic format to allow all of the above rules
to be defined. Since this work uses only a minimal subset of the rules, the
supported CFI rules can be encoded in a more compact format. Also, this
subset of the rules can be statically evaluated at build time by objtool.
The kernel does not have to do any CFI parsing.
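
For illustration, a compact record could look roughly like the struct below.
This layout is hypothetical; the real encoding is defined by the patches in
this series (see include/linux/dwarf.h) and may differ:

	/* Hypothetical layout, for illustration only. */
	struct dwarf_rule {
		short	cfa_offset;	/* CFA = SP + cfa_offset  (rule 2)      */
		short	fp_offset;	/* prev FP at CFA + fp_offset  (rule 3) */
		short	ra_offset;	/* prev RA at CFA + ra_offset  (rule 3) */
	};

Objtool also emits the instruction addresses that each rule covers (the
.dwarf_rules and .dwarf_offsets sections added by patch 1), kept sorted (see
the TBD note below) so that the unwinder can binary-search by PC.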

Unsupported rules
=================

There are three main reasons why I chose not to support rules (4) through (7).

	- the compiler does not generate these for the SP, FP and RA for
	  arm64. So, arm64 does not need them.

	- They add significant complexity.

	- Objtool may not be able to do all the work for rules (4) thru (7).
	  The kernel may be required to evaluate expressions that involve
	  dereferencing an address, getting the value stored in a register,
	  etc.

That does not mean that they cannot be supported. But supporting them would
increase the complexity. I strongly suggest that this work be used only for
architectures where all of the parsing and record generation can be done in
objtool at build time. The kernel part of the implementation should be kept
simple.

How to deal with unsupported rules, if they are present?
========================================================

objtool does not generate any rule data for the code locations at which
unsupported rules exist. When the unwinder tries to find a rule for any of
these locations, it will not find any. Then, it will simply consider the
code locations unreliable from an unwind perspective. The requirement for
the unwinder is really that it must be able to identify reliable and
unreliable code. It can still do this.

So, livepatch can be supported even on architectures where unsupported rules
are generated by the compiler. It only means that the code ranges that contain
those rules will be considered unreliable by the unwinder. If they occur in
frequently used functions, then it is definitely a problem. If not, they may
result in some retries during the livepatch process. But livepatch can still
be done.

FP prolog, epilog and leaf functions
====================================

DWARF CFI rules allow objtool to recognize these cases. Objtool does not
generate any rule data for a function unless the frame is completely set up. If
an interrupt or an exception happens in code where the frame is not set up
or not set up completely, the unwinder will not find the rules for such code.
Automatically, the stack trace is considered unreliable as it should be.

Assembly functions
==================

DWARF CFI is generated by the compiler only for C functions. This means that
the unwinder will not find any rules for assembly code. So, assembly functions
are automatically considered unreliable from an unwind perspective.

For assembly functions, DWARF annotations are defined that can be placed in
assembly code. In that case, DWARF CFI can be generated for assembly functions
as well. However, DWARF annotations are a PITA to maintain. So, this is not
a good path to go down.

Now, there are certain points in assembly code that we would like to unwind
through reliably. Like interrupt and exception handlers. This is mainly for
getting reliable stack traces in these cases and reducing the number of
retries during the livepatch process. For these, unwind hints can be placed
at strategic points in assembly code. Only a small number of these hints
should be needed.

In this work, I have defined the following unwind hints so stack traces that
contain these can be done reliably:

	- Exception handlers
	- Interrupt handlers
	- FTrace tracer functions
	- FTrace graph return prepare code
	- FTrace callsites
	- Kretprobe Trampoline

Unwind hints are collected in a special section. Objtool converts unwind hints
to rule data just like the CFI based ones. The kernel does not need special
code to process unwind hints.

Generated code
==============

Generated code will not have any DWARF rules. Such code will be considered
unreliable by the kernel.

Size of the memory consumed within the kernel for this feature
==============================================================

This depends on the amount of code in the kernel which, in turn, depends on
the number of configs turned on. E.g., on the kernel on my arm64 system, the
.debug_frame section generated by the compiler in vmlinux is about 3.42 MB.
But the rule data generated by objtool for vmlinux is only about 1.06 MB.

Architecture-dependent part
===========================

The following architecture-dependent items must be supplied to support
an architecture:

	- Mapping from DWARF register numbers to actual registers. This is
	  required only for the SP and FP (and RA, if the architecture
	  defines an RA register).

	- Relocation information for the special section created by objtool.
	  Relocation types are processor-specific.

	- Architecture-specific rule checking. For instance, the return
	  address and the frame pointer are saved on adjacent locations
	  on the stack for arm64. This is checked by an arm64-specific
	  rule checker during CFI parsing.

The architecture dependent portion is very small.

Items like endianness and address size are already handled in generic code.

GitHub repository
=================

I have created a github repo to share my work. For each version I will create
a branch. For version 1, it is here:

https://github.com/madvenka786/linux/tree/dwarftool_v1

Please feel free to clone and check it out. And, please let me know if you
find any issues.

Testing
=======

I have run all of the livepatch selftests successfully. I have written a
couple of extra selftests myself which I will be posting separately.

There is an open source utility called dwarfdump. It parses the CFI and
produces ASCII output of the same. I have written a tool to extract that
information and compare it with what my parser generates. The comparison
is successful. So, the parser has been well tested.

I have extracted the same instruction addresses from vmlinux and fed
them to the lookup function in the kernel that the unwinder uses. I have
verified that the correct CFI rules are looked up for every single
input address. So, the lookup function has been well tested.

TBD
===

- Objtool generates a table of instruction addresses or PCs for the kernel.
  These need to be sorted for doing an efficient binary search. Currently,
  the sorting is done in the kernel during boot. I will add support to the
  sorttable script so that the sorting can be done at build time.

- I need to perform more rigorous testing with different scenarios. This
  is work in progress. Any ideas or suggestions are welcome.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

Madhavan T. Venkataraman (9):
  objtool: Parse DWARF Call Frame Information in object files
  objtool: Generate DWARF rules and place them in a special section
  dwarf: Build the kernel with DWARF information
  dwarf: Implement DWARF rule processing in the kernel
  dwarf: Implement DWARF support for modules
  arm64: unwinder: Add a reliability check in the unwinder based on
    DWARF CFI
  arm64: dwarf: Implement unwind hints
  dwarf: Enable livepatch for ARM64

Suraj Jitindar Singh (1):
  dwarf: Miscellaneous changes required for enabling livepatch

 arch/Kconfig                                  |   4 +-
 arch/arm64/Kconfig                            |   7 +
 arch/arm64/Kconfig.debug                      |   5 +
 arch/arm64/configs/defconfig                  |   1 +
 arch/arm64/include/asm/livepatch.h            |  42 ++
 arch/arm64/include/asm/sections.h             |   4 +
 arch/arm64/include/asm/stacktrace.h           |   9 +
 arch/arm64/include/asm/thread_info.h          |   4 +-
 arch/arm64/include/asm/unwind_hints.h         |  28 +
 arch/arm64/kernel/entry-ftrace.S              |  23 +
 arch/arm64/kernel/entry.S                     |   3 +
 arch/arm64/kernel/ftrace.c                    |  16 +
 arch/arm64/kernel/probes/kprobes_trampoline.S |   2 +
 arch/arm64/kernel/signal.c                    |   4 +
 arch/arm64/kernel/stacktrace.c                | 131 ++++
 arch/arm64/kernel/vmlinux.lds.S               |  22 +
 include/linux/dwarf.h                         |  90 +++
 include/linux/ftrace.h                        |   4 +
 include/linux/module.h                        |   3 +
 kernel/Makefile                               |   1 +
 kernel/dwarf_fp.c                             | 305 ++++++++++
 kernel/module.c                               |  31 +
 scripts/Makefile.build                        |   4 +
 scripts/link-vmlinux.sh                       |   6 +
 tools/include/linux/dwarf.h                   |  90 +++
 tools/objtool/Build                           |   5 +
 tools/objtool/Makefile                        |  10 +-
 tools/objtool/arch/arm64/Build                |   2 +
 tools/objtool/arch/arm64/dwarf_arch.c         | 114 ++++
 tools/objtool/arch/arm64/dwarf_clang.c        |  53 ++
 .../arch/arm64/include/arch/dwarf_reg.h       |  17 +
 tools/objtool/builtin-dwarf.c                 |  75 +++
 tools/objtool/dwarf_op.c                      | 560 ++++++++++++++++++
 tools/objtool/dwarf_parse.c                   | 351 +++++++++++
 tools/objtool/dwarf_rules.c                   | 265 +++++++++
 tools/objtool/dwarf_util.c                    | 280 +++++++++
 tools/objtool/elf.c                           |   2 +-
 tools/objtool/include/objtool/builtin.h       |   1 +
 tools/objtool/include/objtool/dwarf_def.h     | 460 ++++++++++++++
 tools/objtool/include/objtool/elf.h           |   1 +
 tools/objtool/include/objtool/objtool.h       |   3 +
 tools/objtool/objtool.c                       |   1 +
 tools/objtool/sync-check.sh                   |   6 +
 tools/objtool/weak.c                          |  38 ++
 44 files changed, 3079 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/include/asm/livepatch.h
 create mode 100644 arch/arm64/include/asm/unwind_hints.h
 create mode 100644 include/linux/dwarf.h
 create mode 100644 kernel/dwarf_fp.c
 create mode 100644 tools/include/linux/dwarf.h
 create mode 100644 tools/objtool/arch/arm64/Build
 create mode 100644 tools/objtool/arch/arm64/dwarf_arch.c
 create mode 100644 tools/objtool/arch/arm64/dwarf_clang.c
 create mode 100644 tools/objtool/arch/arm64/include/arch/dwarf_reg.h
 create mode 100644 tools/objtool/builtin-dwarf.c
 create mode 100644 tools/objtool/dwarf_op.c
 create mode 100644 tools/objtool/dwarf_parse.c
 create mode 100644 tools/objtool/dwarf_rules.c
 create mode 100644 tools/objtool/dwarf_util.c
 create mode 100644 tools/objtool/include/objtool/dwarf_def.h


base-commit: fc74e0a40e4f9fd0468e34045b0c45bba11dcbb2
-- 
2.25.1


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 1/9] objtool: Parse DWARF Call Frame Information in object files
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 2/9] objtool: Generate DWARF rules and place them in a special section madvenka
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

If CONFIG_DEBUG_INFO_DWARF* is enabled, the compiler generates DWARF
Call Frame Information (CFI) for every object file and places it in a
special section named ".debug_frame". The CFI information can be used
for frame pointer validation.

Implement a CFI parser in objtool. This can be invoked as follows:

	objtool dwarf generate <object-file>

The version of the DWARF standard supported in this work is version 4. The
official documentation for this version is here:

	https://dwarfstd.org/doc/DWARF4.pdf

Section 6.4 contains the description of the CFI.

Initial implementation
======================

This initial work is for supporting frame pointer validation for ARM64.
That said, it is generic enough to support other architectures in the
future.

CFI defines 7 different register rules to compute register values at every
instruction address. These are described below. Of these, only the first
3 rules are generated by gcc and clang for the stack pointer (SP), the
frame pointer (FP) and the return address (RA) for ARM64. As of this
writing, the same is true for RISCV as well.

So, in objtool, provide support for the first three rules and mark the
rest as unsupported.

Register rules
==============

CFI defines the Canonical Frame Address (CFA) as the value of the SP when
a call instruction is executed. For the called function, other register
values are expressed relative to the CFA.

CFI defines the following rules to obtain the value of a register in the
previous frame, given a current frame:

1. Same_Value:

	The current and previous values of the register are the same.

2. Val_Offset(N):

	The previous value is (CFA + N) where N is a signed offset.

3. Offset(N):

	The previous value is saved at (CFA + N).

4. register(R):

	The previous value is saved in register R.

5. Val_Expression(E):

	The previous value is the value produced by evaluating a given
	DWARF expression. DWARF expressions are evaluated on a stack. That
	is, operands are pushed and popped on a stack, DWARF operators are
	applied on them and the result is obtained.

6. Expression(E):

	The previous value is stored at the address computed from a DWARF
	expression.

7. Architectural:

	The previous value is obtained in an architecture-specific way via
	an architecture-specific "augmentor". Augmentors are vendor specific
	and are not part of the DWARF standard.

DWARF rule encoding
===================

The CFI is defined in a very generic format to allow all of the above rules
to be defined for different architectures. Since this work uses only a
minimal subset of the rules, the supported CFI rules can be encoded in a
more compact format for the kernel.

Provide stubs for generating compact rule data from the CFI rules. In a
future patch, the stubs will be filled with actual code and the rules will
be written to a special section in the object file. And, in a future patch,
the kernel will use the rule data.

Unsupported rules
=================

For arm64, this is not an issue. If the compiler generates unsupported
rules for the SP, FP and RA for some other architecture, objtool will
not generate any rule data for those code locations. These will be treated
as unreliable PCs from an unwinder perspective.

FP prolog, epilog and leaf functions
====================================

This implementation recognizes these cases in the DWARF CFI. It does
not generate any rule data unless the frame is completely set up. So, if an
interrupt or an exception were to happen in the prolog, epilog or a function
where the frame has not been set up for whatever reason, the kernel will
recognize that and consider the code unreliable from an unwinder
perspective.

Assembly functions
==================

DWARF CFI is generated by the compiler only for C functions. So, objtool
will not generate any rule data for assembly functions. By default, the
kernel will consider all assembly functions as unreliable from an unwinder
perspective.

DWARF annotations for assembly code can be used so CFI can be generated for
assembly functions as well. However, DWARF annotations are a pain to
maintain. So, we should never go down that path.

Now, there are certain points in assembly code that we would like to unwind
through reliably. Like interrupt and exception handlers. For these, unwind
hints can be defined and placed at strategic points in assembly code. This
will be done in a future patch. Unwind hints are a lot simpler and a lot
easier to maintain than DWARF annotations.

Generated code
==============

Generated code will not have any DWARF rules. The kernel will consider such
code unreliable from an unwinder perspective.

Architecture-specific part
===========================

The following pieces in this implementation are architecture-specific. Code
must be provided for each architecture separately.

	- DWARF register number to architecture register mapping

	- Relocation handling for rule data (relocation types are
	  processor-specific)

	- ABI-specific checking of rules parsed by objtool.

Only a small amount of architecture-specific code is required. Other aspects
such as endianness and address size are handled in the generic code.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 tools/objtool/Build                           |   5 +
 tools/objtool/Makefile                        |  10 +-
 tools/objtool/arch/arm64/Build                |   2 +
 tools/objtool/arch/arm64/dwarf_arch.c         | 114 ++++
 tools/objtool/arch/arm64/dwarf_clang.c        |  53 ++
 .../arch/arm64/include/arch/dwarf_reg.h       |  17 +
 tools/objtool/builtin-dwarf.c                 |  57 ++
 tools/objtool/dwarf_op.c                      | 560 ++++++++++++++++++
 tools/objtool/dwarf_parse.c                   | 294 +++++++++
 tools/objtool/dwarf_rules.c                   |  37 ++
 tools/objtool/dwarf_util.c                    | 280 +++++++++
 tools/objtool/elf.c                           |   2 +-
 tools/objtool/include/objtool/builtin.h       |   1 +
 tools/objtool/include/objtool/dwarf_def.h     | 438 ++++++++++++++
 tools/objtool/include/objtool/elf.h           |   1 +
 tools/objtool/include/objtool/objtool.h       |   1 +
 tools/objtool/objtool.c                       |   1 +
 tools/objtool/weak.c                          |  27 +
 18 files changed, 1898 insertions(+), 2 deletions(-)
 create mode 100644 tools/objtool/arch/arm64/Build
 create mode 100644 tools/objtool/arch/arm64/dwarf_arch.c
 create mode 100644 tools/objtool/arch/arm64/dwarf_clang.c
 create mode 100644 tools/objtool/arch/arm64/include/arch/dwarf_reg.h
 create mode 100644 tools/objtool/builtin-dwarf.c
 create mode 100644 tools/objtool/dwarf_op.c
 create mode 100644 tools/objtool/dwarf_parse.c
 create mode 100644 tools/objtool/dwarf_rules.c
 create mode 100644 tools/objtool/dwarf_util.c
 create mode 100644 tools/objtool/include/objtool/dwarf_def.h

diff --git a/tools/objtool/Build b/tools/objtool/Build
index b7222d5cc7bc..2ab5885398c1 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -7,9 +7,14 @@ objtool-$(SUBCMD_CHECK) += special.o
 objtool-$(SUBCMD_ORC) += check.o
 objtool-$(SUBCMD_ORC) += orc_gen.o
 objtool-$(SUBCMD_ORC) += orc_dump.o
+objtool-$(SUBCMD_DWARF) += dwarf_parse.o
+objtool-$(SUBCMD_DWARF) += dwarf_op.o
+objtool-$(SUBCMD_DWARF) += dwarf_rules.o
+objtool-$(SUBCMD_DWARF) += dwarf_util.o
 
 objtool-y += builtin-check.o
 objtool-y += builtin-orc.o
+objtool-y += builtin-dwarf.o
 objtool-y += elf.o
 objtool-y += objtool.o
 
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 92ce4fce7bc7..2bc84ac5515f 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -41,13 +41,21 @@ AWK = awk
 
 SUBCMD_CHECK := n
 SUBCMD_ORC := n
+SUBCMD_DWARF := n
 
 ifeq ($(SRCARCH),x86)
 	SUBCMD_CHECK := y
 	SUBCMD_ORC := y
 endif
 
-export SUBCMD_CHECK SUBCMD_ORC
+ifeq ($(SRCARCH),arm64)
+	SUBCMD_DWARF := y
+ifneq ($(LLVM),)
+	SUBCMD_DWARF_CLANG := y
+endif
+endif
+
+export SUBCMD_CHECK SUBCMD_ORC SUBCMD_DWARF SUBCMD_DWARF_CLANG
 export srctree OUTPUT CFLAGS SRCARCH AWK
 include $(srctree)/tools/build/Makefile.include
 
diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build
new file mode 100644
index 000000000000..e5710de8060f
--- /dev/null
+++ b/tools/objtool/arch/arm64/Build
@@ -0,0 +1,2 @@
+objtool-$(SUBCMD_DWARF) += dwarf_arch.o
+objtool-$(SUBCMD_DWARF_CLANG) += dwarf_clang.o
diff --git a/tools/objtool/arch/arm64/dwarf_arch.c b/tools/objtool/arch/arm64/dwarf_arch.c
new file mode 100644
index 000000000000..2607ec94a12e
--- /dev/null
+++ b/tools/objtool/arch/arm64/dwarf_arch.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * dwarf_arch.c - Architecture-specific support functions.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+#include <stdio.h>
+#include <errno.h>
+
+#include <objtool/objtool.h>
+#include <objtool/warn.h>
+#include <objtool/dwarf_def.h>
+#include <arch/dwarf_reg.h>
+#include <linux/kconfig.h>
+#include <linux/compiler.h>
+
+int arch_dwarf_fde_reloc(struct fde *fde)
+{
+	GElf_Rela		rela;
+	struct symbol		*rela_symbol;
+	int			index;
+
+	/*
+	 * Find the code section, offset within the section and the symbol of
+	 * the function for this FDE. We need the symbol for debugging purposes.
+	 * We need the section and offset to set up relocation for the DWARF
+	 * rules that we will create in a separate section.
+	 */
+	if (debug_frame->reloc) {
+		/*
+		 * In this case, debug frame entries are relocatable. For every
+		 * FDE, there is a pair of relocation entries - one for the FDE
+		 * itself and one for the function it represents. So, we use
+		 * the second entry in each pair to find the section, section
+		 * offset and the symbol for the function.
+		 */
+		index = (fde->index * 2) + 1;
+		if (!gelf_getrela(debug_frame->reloc->data, index, &rela)) {
+			WARN_ELF("gelf_getrela");
+			return -ENOENT;
+		}
+		rela_symbol = find_symbol_by_index(dwarf_file->elf,
+						   GELF_R_SYM(rela.r_info));
+		fde->section = find_section_by_name(dwarf_file->elf,
+						    rela_symbol->name);
+		if (!fde->section) {
+			WARN("No section for FDE");
+			return -ENOENT;
+		}
+		fde->offset = rela.r_addend;
+		fde->symbol = find_symbol_containing(fde->section, fde->offset);
+	} else {
+		/*
+		 * In this case, the debug frame entries are not relocatable.
+		 * In the normal build of the kernel, this code is not required
+		 * because objtool will be run on relocatable objects. But
+		 * this code can handle it if objtool is run on the vmlinux
+		 * binary itself. This is for debugging purposes.
+		 */
+		struct section *sec;
+		GElf_Shdr *sh;
+		unsigned long addr = fde->start_pc;
+		unsigned long start_addr, end_addr;
+
+		fde->section = NULL;
+		for_each_sec(dwarf_file, sec) {
+			sh = &sec->sh;
+			start_addr = sh->sh_addr;
+			end_addr = start_addr + sh->sh_size;
+			if (addr >= start_addr && addr < end_addr) {
+				fde->section = sec;
+				break;
+			}
+		}
+		if (!fde->section) {
+			WARN("No section for FDE");
+			return -ENOENT;
+		}
+		fde->offset = 0;
+		fde->symbol = find_symbol_containing(fde->section,
+			fde->start_pc);
+	}
+	fde->start_pc += fde->offset;
+	fde->end_pc += fde->offset;
+	return 0;
+}
+
+/*
+ * If the offsets are 0, it means that the frame is not fully set up at
+ * this point in the object code. E.g., in the frame pointer prolog or epilog
+ * or in a leaf function. Check for this.
+ *
+ * If the frame is properly set up, then the frame pointer and return
+ * address are saved adjacent to each other. Check for this.
+ */
+int arch_dwarf_check_rules(struct fde *fde, unsigned long pc,
+			   struct rule *sp_rule, struct rule *fp_rule,
+			   struct rule *ra_rule)
+{
+	if (!sp_rule->offset || sp_rule->saved ||
+	    !fp_rule->offset || !fp_rule->saved)
+		return -EINVAL;
+
+	if (!ra_rule->offset || !ra_rule->saved ||
+	    (fp_rule->offset + 8) != ra_rule->offset)
+		return -EINVAL;
+
+	if (fde)
+		arch_dwarf_clang_hack(fde, pc, sp_rule, fp_rule);
+
+	return 0;
+}
diff --git a/tools/objtool/arch/arm64/dwarf_clang.c b/tools/objtool/arch/arm64/dwarf_clang.c
new file mode 100644
index 000000000000..e07ccf9484cf
--- /dev/null
+++ b/tools/objtool/arch/arm64/dwarf_clang.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * dwarf_clang.c - Clang-specific support functions.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+#include <stdio.h>
+#include <errno.h>
+
+#include <objtool/objtool.h>
+#include <objtool/warn.h>
+#include <objtool/dwarf_def.h>
+#include <arch/dwarf_reg.h>
+#include <linux/kconfig.h>
+#include <linux/compiler.h>
+
+void arch_dwarf_clang_hack(struct fde *fde, unsigned long pc,
+			   struct rule *sp_rule, struct rule *fp_rule)
+{
+	struct section		*sec;
+	unsigned long		start_pc = 0;
+	unsigned int		instruction, opcode, rd, rn, imm;
+
+	if (!fde)
+		return;
+	sec = fde->section;
+
+	if (!sec->reloc)
+		start_pc = sec->sh.sh_addr;
+
+	instruction = *(unsigned int *)(sec->data->d_buf + pc - start_pc - 4);
+	rd = instruction & 0x1F;
+	rn = (instruction >> 5) & 0x1F;
+	imm = (instruction >> 10) & 0x3FFF;
+	opcode = instruction >> 24;
+
+	if (opcode == 0x91 && rn == SP_REG && rd == FP_REG && imm &&
+	    sp_rule->offset == -fp_rule->offset) {
+		fde->sp_offset = imm;
+	}
+
+	imm = (instruction >> 10) & 0xFFF;
+	opcode = instruction >> 22;
+
+	if (opcode == 0x344 && rn == SP_REG && rd == SP_REG && imm &&
+	    sp_rule->offset == -fp_rule->offset) {
+		fde->sp_offset = imm;
+	}
+
+	sp_rule->offset += fde->sp_offset;
+}
diff --git a/tools/objtool/arch/arm64/include/arch/dwarf_reg.h b/tools/objtool/arch/arm64/include/arch/dwarf_reg.h
new file mode 100644
index 000000000000..b3c61060bb67
--- /dev/null
+++ b/tools/objtool/arch/arm64/include/arch/dwarf_reg.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * dwarf_reg.h - DWARF register numbers for the stack pointer, frame pointer
+ *		 and return address.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (c) 2022 Microsoft Corporation
+ */
+#ifndef _ARM64_ARCH_DWARF_REG_H
+#define _ARM64_ARCH_DWARF_REG_H
+
+#define FP_REG		29
+#define RA_REG		30
+#define SP_REG		31
+
+#endif /* _ARM64_ARCH_DWARF_REG_H */
diff --git a/tools/objtool/builtin-dwarf.c b/tools/objtool/builtin-dwarf.c
new file mode 100644
index 000000000000..f44b35eb3f55
--- /dev/null
+++ b/tools/objtool/builtin-dwarf.c
@@ -0,0 +1,57 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * builtin-dwarf.c - DWARF command invoked by "objtool dwarf ...".
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+
+/*
+ * objtool dwarf:
+ *
+ * This command analyzes a .o file and adds .dwarf_rules and .dwarf_offsets
+ * sections to it, which are used by the in-kernel reliable unwinder.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <objtool/builtin.h>
+#include <objtool/objtool.h>
+
+static const char * const dwarf_usage[] = {
+	/*
+	 * Generate DWARF rules for the kernel from DWARF Call Frame
+	 * information.
+	 */
+	"objtool dwarf generate file",
+
+	NULL,
+};
+
+const struct option dwarf_options[] = {
+	OPT_END(),
+};
+
+int cmd_dwarf(int argc, const char **argv)
+{
+	const char		*object;
+	struct objtool_file	*file;
+
+	argc--; argv++;
+	if (argc != 2)
+		usage_with_options(dwarf_usage, dwarf_options);
+
+	object = argv[1];
+
+	file = objtool_open_read(object);
+	if (!file)
+		return 1;
+
+	if (!strncmp(argv[0], "gen", 3))
+		return dwarf_parse(file);
+
+	usage_with_options(dwarf_usage, dwarf_options);
+
+	return 0;
+}
diff --git a/tools/objtool/dwarf_op.c b/tools/objtool/dwarf_op.c
new file mode 100644
index 000000000000..31f9e0b4fd4b
--- /dev/null
+++ b/tools/objtool/dwarf_op.c
@@ -0,0 +1,560 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * dwarf_op.c - Code to parse DWARF operations in object files.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+
+#include <stdio.h>
+
+#include <objtool/objtool.h>
+#include <objtool/warn.h>
+#include <objtool/dwarf_def.h>
+#include <arch/dwarf_reg.h>
+#include <linux/compiler.h>
+
+unsigned char			op, operand;
+
+static unsigned long		cur_address;
+static struct rule		sp_rule;
+static struct rule		fp_rule;
+static struct rule		ra_rule;
+static bool			unsupported;
+
+static inline bool dwarf_ignore_reg(unsigned int reg)
+{
+	/*
+	 * We don't care about registers other than the stack pointer, frame
+	 * pointer and the return address (if defined).
+	 */
+	return reg != SP_REG && reg != FP_REG && reg != RA_REG;
+}
+
+static inline void dwarf_offset_rule(unsigned int reg, long offset, bool saved)
+{
+	if (dwarf_ignore_reg(reg))
+		return;
+
+	if (reg == FP_REG && offset > 0) {
+		/*
+		 * Clang may define the CFA relative to the frame pointer
+		 * rather than the stack pointer. Treat such a rule as the
+		 * SP/CFA rule.
+		 */
+		reg = SP_REG;
+	}
+
+	if (reg == SP_REG) {
+		sp_rule.offset = offset;
+		sp_rule.saved = saved;
+	} else if (reg == FP_REG) {
+		fp_rule.offset = offset;
+		fp_rule.saved = saved;
+	} else {
+		ra_rule.offset = offset;
+		ra_rule.saved = saved;
+	}
+}
+
+static inline void dwarf_reg_rule(unsigned int reg, unsigned int other_reg)
+{
+	if (dwarf_ignore_reg(reg) && dwarf_ignore_reg(other_reg))
+		return;
+
+	/*
+	 * We don't support the SP, FP and RA being saved in other registers
+	 * and vice-versa.
+	 */
+	WARN("op=%d register rule for %d is not supported", op, reg);
+	unsupported = true;
+}
+
+static inline void dwarf_restore_rule(unsigned int reg)
+{
+	if (dwarf_ignore_reg(reg))
+		return;
+
+	if (reg == SP_REG)
+		sp_rule = cur_cie->sp_rule;
+	else if (reg == FP_REG)
+		fp_rule = cur_cie->fp_rule;
+	else
+		ra_rule = cur_cie->ra_rule;
+}
+
+static inline void dwarf_expression_rule(unsigned int reg)
+{
+	if (dwarf_ignore_reg(reg))
+		return;
+
+	/*
+	 * We don't support expressions to compute the values for the SP, FP
+	 * and RA.
+	 */
+	WARN("op=%d is not supported!", op);
+	unsupported = true;
+}
+
+static inline void dwarf_undefined_rule(unsigned int reg)
+{
+	if (dwarf_ignore_reg(reg))
+		return;
+
+	/*
+	 * We don't support the values for the SP, FP and RA being undefined.
+	 */
+	WARN("op=%d %d register undefined?", op, reg);
+	unsupported = true;
+}
+
+/*
+ * Whenever the PC is changed, the SP and FP rules for the range (old PC to
+ * new PC) have to be written out in the form of a DWARF rule before
+ * changing the PC.
+ */
+static void dwarf_set_address(unsigned long address, bool *fail)
+{
+	bool	skip = false;
+
+	if (unsupported) {
+		/*
+		 * We were not able to compute the rules. Reset the rules.
+		 * Do not generate a DWARF rule for this code range. The
+		 * kernel will therefore not be able to find the rules for
+		 * the code range and will consider the range to be unreliable
+		 * from an unwind perspective.
+		 */
+		unsupported = false;
+		dwarf_offset_rule(SP_REG, 0, false);
+		dwarf_offset_rule(FP_REG, 0, false);
+		dwarf_offset_rule(RA_REG, 0, false);
+		skip = true;
+	}
+
+	if (arch_dwarf_check_rules(cur_fde, cur_address,
+				   &sp_rule, &fp_rule, &ra_rule)) {
+		/*
+		 * Ignore the rules for now. Do not generate a DWARF rule
+		 * for this code range. The kernel will therefore not be able
+		 * to find the rules for the code range and will consider the
+		 * range to be unreliable from an unwind perspective. We do
+		 * not reset the rules as they can get modified by further
+		 * DWARF instruction processing to the point where they are
+		 * not ignored anymore.
+		 */
+		skip = true;
+	}
+
+	if (skip)
+		dwarf_rule_next(cur_fde, cur_address);
+	else if (dwarf_rule_add(cur_fde, cur_address, &sp_rule, &fp_rule))
+		*fail = true;
+	cur_address = address;
+}
+
+static unsigned char *dwarf_one_op(unsigned char *start, unsigned char *end)
+{
+	unsigned long		length;
+	unsigned char		byte;
+	unsigned int		delta;
+	u64			pc, reg, other_reg;
+	s64			offset;
+	bool			fail = false;
+
+	byte = *start++;
+
+	/* Primary op code in the high 2 bits */
+	op = byte & 0xc0;
+	operand = byte & 0x3F;
+	if (op == DW_CFA_extended_op) {
+		/* Extended op is in the low 6 bits. */
+		op = operand;
+	}
+
+	switch (op) {
+	/*
+	 * No-op instruction, used for padding.
+	 */
+	case DW_CFA_nop:
+		break;
+
+	/*
+	 * Instructions that advance the current PC.
+	 */
+	case DW_CFA_advance_loc:
+		/*
+		 * The factored delta is the operand itself. Add the delta to
+		 * the current instruction address to obtain the new
+		 * instruction address.
+		 */
+		delta = (unsigned int) operand;
+		delta *= cur_cie->code_factor;
+		dwarf_set_address(cur_address + delta, &fail);
+		break;
+	case DW_CFA_set_loc:
+		/*
+		 * Extract the target PC. Set the current instruction address
+		 * to it.
+		 */
+		GET_VALUE(pc, start, end, cur_cie->address_size);
+		dwarf_set_address(pc, &fail);
+		break;
+	case DW_CFA_advance_loc1:
+		/*
+		 * Extract the factored delta (unsigned byte) and add it to the
+		 * current instruction address to obtain the new instruction
+		 * address.
+		 */
+		GET_VALUE(delta, start, end, 1);
+		delta *= cur_cie->code_factor;
+		dwarf_set_address(cur_address + delta, &fail);
+		break;
+	case DW_CFA_advance_loc2:
+		/*
+		 * Extract the factored delta (unsigned short) and add it to
+		 * the current instruction address to obtain the new
+		 * instruction address.
+		 */
+		GET_VALUE(delta, start, end, 2);
+		delta *= cur_cie->code_factor;
+		dwarf_set_address(cur_address + delta, &fail);
+		break;
+	case DW_CFA_advance_loc4:
+		/*
+		 * Extract the factored delta (unsigned int) and add it to
+		 * the current instruction address to obtain the new
+		 * instruction address.
+		 */
+		GET_VALUE(delta, start, end, 4);
+		delta *= cur_cie->code_factor;
+		dwarf_set_address(cur_address + delta, &fail);
+		break;
+
+	/*
+	 * CFA definition instructions.
+	 */
+	case DW_CFA_def_cfa:
+		/*
+		 * Extract the CFA register and the unfactored CFA offset.
+		 * Define a val_offset(N) rule.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		READ_ULEB_128(offset, start, end, fail);
+		cur_cie->cfa_reg = reg;
+		cur_cie->cfa_offset = offset;
+		dwarf_offset_rule(cur_cie->cfa_reg, cur_cie->cfa_offset, false);
+		break;
+	case DW_CFA_def_cfa_sf:
+		/*
+		 * Same as DW_CFA_def_cfa except that the offset is signed
+		 * and factored.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		READ_SLEB_128(offset, start, end, fail);
+		offset *= cur_cie->data_factor;
+		cur_cie->cfa_reg = reg;
+		cur_cie->cfa_offset = offset;
+		dwarf_offset_rule(cur_cie->cfa_reg, cur_cie->cfa_offset, false);
+		break;
+	case DW_CFA_def_cfa_register:
+		/*
+		 * Same as DW_CFA_def_cfa except that only the register is
+		 * given; the previously saved offset is reused.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		cur_cie->cfa_reg = reg;
+		dwarf_offset_rule(cur_cie->cfa_reg, cur_cie->cfa_offset, false);
+		break;
+	case DW_CFA_def_cfa_offset:
+		/*
+		 * Same as DW_CFA_def_cfa except that only the offset is
+		 * given; the previously saved register is reused.
+		 */
+		READ_ULEB_128(offset, start, end, fail);
+		cur_cie->cfa_offset = offset;
+		dwarf_offset_rule(cur_cie->cfa_reg, cur_cie->cfa_offset, false);
+		break;
+	case DW_CFA_def_cfa_offset_sf:
+		/*
+		 * Same as DW_CFA_def_cfa_offset except that the offset is
+		 * signed and factored.
+		 */
+		READ_SLEB_128(offset, start, end, fail);
+		offset *= cur_cie->data_factor;
+		cur_cie->cfa_offset = offset;
+		dwarf_offset_rule(cur_cie->cfa_reg, cur_cie->cfa_offset, false);
+		break;
+	case DW_CFA_def_cfa_expression:
+		READ_ULEB_128(length, start, end, fail);
+		/*
+		 * Skip the expression bytes.
+		 */
+		start += length;
+		if (start > end)
+			fail = true;
+		dwarf_expression_rule(SP_REG);
+		break;
+
+	/*
+	 * Register rule instructions.
+	 */
+	case DW_CFA_undefined:
+		READ_ULEB_128(reg, start, end, fail);
+		dwarf_undefined_rule(reg);
+		break;
+	case DW_CFA_same_value:
+		/*
+		 * Set the register offset to be "same value". That is, it has
+		 * not been modified by the callee.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		dwarf_offset_rule(reg, 0, false);
+		break;
+	case DW_CFA_offset:
+		/*
+		 * The register number is encoded in the operand itself.
+		 * Extract the factored offset. Define an offset(N) rule.
+		 */
+		reg = operand;
+		READ_ULEB_128(offset, start, end, fail);
+		offset *= cur_cie->data_factor;
+		dwarf_offset_rule(reg, offset, true);
+		break;
+	case DW_CFA_offset_extended:
+		/*
+		 * Same as DW_CFA_offset except for the encoding and size of
+		 * the register operand.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		READ_ULEB_128(offset, start, end, fail);
+		offset *= cur_cie->data_factor;
+		dwarf_offset_rule(reg, offset, true);
+		break;
+	case DW_CFA_offset_extended_sf:
+		/*
+		 * Same as DW_CFA_offset_extended except that the offset is
+		 * signed and factored.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		READ_SLEB_128(offset, start, end, fail);
+		offset *= cur_cie->data_factor;
+		dwarf_offset_rule(reg, offset, true);
+		break;
+	case DW_CFA_val_offset:
+		/*
+		 * Extract the register number and the factored offset. Define
+		 * a val_offset(N) rule.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		READ_ULEB_128(offset, start, end, fail);
+		offset *= cur_cie->data_factor;
+		dwarf_offset_rule(reg, offset, false);
+		break;
+	case DW_CFA_val_offset_sf:
+		/*
+		 * Same as DW_CFA_val_offset except that the offset is signed.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		READ_SLEB_128(offset, start, end, fail);
+		offset *= cur_cie->data_factor;
+		dwarf_offset_rule(reg, offset, false);
+		break;
+	case DW_CFA_register:
+		READ_ULEB_128(reg, start, end, fail);
+		READ_ULEB_128(other_reg, start, end, fail);
+		dwarf_reg_rule(reg, other_reg);
+		break;
+	case DW_CFA_expression:
+	case DW_CFA_val_expression:
+		READ_ULEB_128(reg, start, end, fail);
+		READ_ULEB_128(length, start, end, fail);
+		/*
+		 * Skip the expression bytes.
+		 */
+		start += length;
+		if (start > end)
+			fail = true;
+		dwarf_expression_rule(reg);
+		break;
+	case DW_CFA_restore:
+		/*
+		 * Restore the rule for the register to the one specified in
+		 * the CIE.
+		 */
+		reg = operand;
+		dwarf_restore_rule(reg);
+		break;
+	case DW_CFA_restore_extended:
+		/*
+		 * Same as DW_CFA_restore except for the encoding and size of
+		 * the register operand.
+		 */
+		READ_ULEB_128(reg, start, end, fail);
+		dwarf_restore_rule(reg);
+		break;
+
+	/*
+	 * Rule state instructions.
+	 */
+	case DW_CFA_remember_state:
+		cur_cie->saved_sp_rule = sp_rule;
+		cur_cie->saved_fp_rule = fp_rule;
+		cur_cie->saved_ra_rule = ra_rule;
+		break;
+	case DW_CFA_restore_state:
+		sp_rule = cur_cie->saved_sp_rule;
+		fp_rule = cur_cie->saved_fp_rule;
+		ra_rule = cur_cie->saved_ra_rule;
+		break;
+	default:
+		if (op >= DW_CFA_lo_user && op <= DW_CFA_hi_user) {
+			/*
+			 * Ignore arch-specific or vendor-specific ops as they
+			 * are irrelevant to the stack and frame pointers.
+			 */
+		} else {
+			WARN("Illegal CFA op %d", (int) op);
+			fail = true;
+		}
+		break;
+	}
+	return fail ? NULL : start;
+}
+
+/*
+ * Run the DWARF instructions in a CIE or an FDE.
+ */
+static unsigned char *dwarf_op(unsigned char *start, unsigned char *end)
+{
+	bool	fail = false;
+
+	/* cur_fde is set if this is an FDE. */
+	if (cur_fde) {
+		/*
+		 * For an FDE, the rules are initialized from the rules
+		 * computed for its CIE.
+		 */
+		sp_rule = cur_cie->sp_rule;
+		fp_rule = cur_cie->fp_rule;
+		ra_rule = cur_cie->ra_rule;
+	} else {
+		/*
+		 * For a CIE, the rules are initialized to all zeroes.
+		 */
+		dwarf_offset_rule(SP_REG, 0, false);
+		dwarf_offset_rule(FP_REG, 0, false);
+		dwarf_offset_rule(RA_REG, 0, false);
+	}
+
+	/*
+	 * If an unsupported DWARF rule is discovered in dwarf_one_op(),
+	 * this will be set to true.
+	 */
+	unsupported = false;
+
+	while (start < end) {
+		start = dwarf_one_op(start, end);
+		if (!start)
+			return NULL;
+	}
+
+	if (cur_fde) {
+		/* Generate the final rule for the FDE. */
+		dwarf_set_address(cur_fde->end_pc, &fail);
+	}
+	return start;
+}
+
+/*
+ * Parse DWARF instructions in CIEs and FDEs.
+ */
+void dwarf_parse_instructions(void)
+{
+	struct cie		*cie;
+	struct fde		*fde;
+	unsigned char		*start, *end;
+
+	cur_fde = NULL;
+
+	for (cie = cies; cie != NULL; cie = cie->next) {
+		cur_cie = cie;
+		start = cie->instructions;
+		end = start + cie->instructions_size;
+
+		/*
+		 * Run the DWARF instructions in the CIE to compute the
+		 * initial SP, FP and RA rules.
+		 */
+		if (!dwarf_op(start, end)) {
+			cur_cie->unusable = true;
+			continue;
+		}
+
+		cie->sp_rule = sp_rule;
+		cie->fp_rule = fp_rule;
+		cie->ra_rule = ra_rule;
+	}
+
+	for (fde = fdes; fde != NULL; fde = fde->next) {
+		/*
+		 * If any problems are encountered below, simply skip the FDE.
+		 * This means that no DWARF rules from this FDE will be
+		 * included. So, the kernel will consider the FDE's code range
+		 * to be unreliable from an unwinding perspective.
+		 */
+
+		/*
+		 * Find the CIE for this FDE using the section offset of the
+		 * CIE. The CIE list is already in increasing section offset
+		 * order.
+		 */
+		for (cie = cies; cie != NULL; cie = cie->next) {
+			if (cie->offset < fde->cie_offset)
+				continue;
+			if (cie->offset == fde->cie_offset)
+				fde->cie = cie;
+			break;
+		}
+
+		cur_cie = fde->cie;
+		if (cur_cie == NULL || cur_cie->unusable) {
+			WARN("No CIE: Could not process FDE");
+			continue;
+		}
+
+		if (!fde->section) {
+			/*
+			 * The section is needed to create relocation entries
+			 * for the DWARF rules in the FDE.
+			 */
+			WARN("No section: Could not process FDE");
+			continue;
+		}
+		cur_fde = fde;
+
+		/*
+		 * Run the DWARF instructions in the FDE to derive the rules
+		 * for computing the SP and the FP within the FDE code range.
+		 * Encode the rules in the form of DWARF rules for the benefit
+		 * of the kernel. dwarf_op() will generate the rules as it runs
+		 * the instructions.
+		 */
+		dwarf_rule_start(fde);
+
+		cur_address = fde->start_pc;
+		start = fde->instructions;
+		end = start + fde->instructions_size;
+
+		if (!dwarf_op(start, end)) {
+			/*
+			 * Rollback the DWARF rules created in the above
+			 * call.
+			 */
+			WARN("FDE instructions failed. Rolling back FDE.");
+			dwarf_rule_reset(fde);
+			continue;
+		}
+
+		dwarf_rule_next(fde, fde->end_pc);
+	}
+}
diff --git a/tools/objtool/dwarf_parse.c b/tools/objtool/dwarf_parse.c
new file mode 100644
index 000000000000..d5ac5630fbba
--- /dev/null
+++ b/tools/objtool/dwarf_parse.c
@@ -0,0 +1,294 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * dwarf_parse.c - Code to parse DWARF information in object files.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+
+#include <objtool/objtool.h>
+#include <objtool/warn.h>
+#include <objtool/dwarf_def.h>
+#include <linux/compiler.h>
+
+struct objtool_file		*dwarf_file;
+struct section			*debug_frame;
+
+struct cie			*cies, *cur_cie;
+struct fde			*fdes, *cur_fde;
+
+static struct cie		*cies_tail;
+static struct fde		*fdes_tail;
+
+static u64			cie_id;
+static int			fde_index;
+static int			address_size;
+static unsigned int		offset_size;
+static u64			entry_length;
+static unsigned char		*saved_start;
+
+/*
+ * Parse and create a new CIE.
+ */
+static unsigned char *dwarf_parse_cie(unsigned char *start, unsigned char *end)
+{
+	struct cie		*cie;
+	bool			fail = false;
+
+	cie = dwarf_alloc(sizeof(*cie));
+	if (!cie) {
+		WARN("%s: dwarf_alloc(cie) failed", __func__);
+		return NULL;
+	}
+	memset(cie, 0, sizeof(*cie));
+
+	/* Add CIE to global list. */
+	if (cies_tail == NULL)
+		cies = cie;
+	else
+		cies_tail->next = cie;
+	cies_tail = cie;
+	cie->next = NULL;
+
+	/* Section offset where this CIE resides. */
+	cie->offset = saved_start - (unsigned char *) debug_frame->data->d_buf;
+	cie->length = entry_length;
+	cie->id = cie_id;
+	cie->unusable = false;
+
+	/*
+	 * Extract the DWARF CFI version. This is different from the DWARF
+	 * version.
+	 */
+	cie->version = *start++;
+	if (cie->version > 4) {
+		/*
+		 * This implementation does not support these versions. For
+		 * instance, segment selectors are not supported.
+		 */
+		WARN("CIE version %d is not supported", cie->version);
+		return NULL;
+	}
+
+	/*
+	 * Store the size of an address in this architecture.
+	 */
+	if (cie->version == 4) {
+		GET_VALUE(cie->address_size, start, end, 1);
+		GET_VALUE(cie->segment_size, start, end, 1);
+		cie->address_size += cie->segment_size;
+	} else {
+		cie->address_size = address_size;
+	}
+
+	/*
+	 * This implementation does not support augmentations. For instance,
+	 * an augmentation can modify the address_size. Make sure that the
+	 * augmentation string is empty.
+	 */
+	cie->augmentation = (char *) start;
+	if (*start++ != '\0') {
+		WARN("Augmentor is not supported");
+		return NULL;
+	}
+	cie->segment_size = 0;
+
+	/* Extract code alignment factor. */
+	READ_ULEB_128(cie->code_factor, start, end, fail);
+
+	/* Extract data alignment factor. */
+	READ_SLEB_128(cie->data_factor, start, end, fail);
+
+	if (cie->version == 1)
+		GET_VALUE(cie->return_address_reg, start, end, 1);
+	else
+		READ_ULEB_128(cie->return_address_reg, start, end, fail);
+
+	/* The remaining bytes are DWARF instructions. */
+	cie->instructions = start;
+	cie->instructions_size = end - start;
+	start = end;
+
+	return fail ? NULL : start;
+}
+
+/*
+ * Parse and create a new FDE.
+ */
+static unsigned char *dwarf_parse_fde(unsigned char *start, unsigned char *end)
+{
+	struct fde		*fde;
+	unsigned long		length;
+
+	fde = dwarf_alloc(sizeof(*fde));
+	if (!fde) {
+		WARN("%s: dwarf_alloc(fde) failed", __func__);
+		return NULL;
+	}
+	memset(fde, 0, sizeof(*fde));
+
+	/* Add FDE to global list. */
+	if (fdes_tail == NULL)
+		fdes = fde;
+	else
+		fdes_tail->next = fde;
+	fdes_tail = fde;
+	fde->next = NULL;
+
+	/*
+	 * This is the index of the FDE record in the .debug_frame section.
+	 * This is used to locate the symbol for the FDE in a relocatable
+	 * object file.
+	 */
+	fde->index = fde_index++;
+	fde->length = entry_length;
+
+	/*
+	 * For an FDE, the CIE ID field actually contains the section offset
+	 * of the CIE for the FDE.
+	 */
+	fde->cie_offset = cie_id;
+
+	/*
+	 * Set the CIE for this FDE to NULL for now. We will set this to the
+	 * correct CIE later.
+	 */
+	fde->cie = NULL;
+	fde->segment_selector = 0;
+
+	/*
+	 * Extract the starting address of the code range to which this FDE
+	 * applies.
+	 */
+	GET_VALUE(fde->start_pc, start, end, address_size);
+
+	/* Extract the size of the code range to which this FDE applies. */
+	GET_VALUE(length, start, end, address_size);
+	fde->end_pc = fde->start_pc + length;
+
+	/* The remaining bytes are DWARF instructions. */
+	fde->instructions = start;
+	fde->instructions_size = end - start;
+	start = end;
+
+	/* Relocation is arch-specific. */
+	if (arch_dwarf_fde_reloc(fde) == -EOPNOTSUPP)
+		return NULL;
+
+	return start;
+}
+
+/*
+ * Parse one entry for an FDE - either a CIE or an FDE.
+ */
+static unsigned char *dwarf_parse_one(unsigned char *start, unsigned char *end)
+{
+	bool			is_cie;
+
+	saved_start = start;
+
+	/*
+	 * The first value in an entry is the length field.
+	 */
+	GET_VALUE(entry_length, start, end, 4);
+	if (entry_length == 0) {
+		WARN("Illegal length in DWARF entry");
+		return NULL;
+	}
+
+	/*
+	 * For 64-bit entries, the initial 32-bit length field is all 1's. The
+	 * actual 64-bit length follows it.
+	 */
+	if (entry_length == 0xffffffff) {
+		GET_VALUE(entry_length, start, end, 8);
+		offset_size = 8;
+	} else {
+		offset_size = 4;
+	}
+
+	if (entry_length > (size_t) (end - start)) {
+		WARN("DWARF entry is too big");
+		return NULL;
+	}
+	end = start + entry_length;
+
+	/*
+	 * The CIE identifier field distinguishes between a CIE and an FDE. For
+	 * a CIE, this field contains a specific ID value. For an FDE, it
+	 * contains the section offset of the CIE used by the FDE.
+	 */
+	GET_VALUE(cie_id, start, end, offset_size);
+
+	is_cie = (offset_size == 4 && cie_id == CIE_ID_32) ||
+		 (offset_size == 8 && cie_id == CIE_ID_64);
+	if (is_cie)
+		start = dwarf_parse_cie(start, end);
+	else
+		start = dwarf_parse_fde(start, end);
+	return start;
+}
+
+/*
+ * Parse DWARF Call Frame Information.
+ */
+int dwarf_parse(struct objtool_file *file)
+{
+	unsigned char		*start, *end;
+
+	dwarf_file = file;
+
+	/*
+	 * Initialize the helper function based on endianness. This function
+	 * is used to extract values from the DWARF section.
+	 */
+	switch (file->elf->ehdr.e_ident[EI_DATA]) {
+	default:
+		__fallthrough;
+	case ELFDATANONE:
+		__fallthrough;
+	case ELFDATA2LSB:
+		get_value = get_value_le;	/* Little endian */
+		break;
+	case ELFDATA2MSB:
+		get_value = get_value_be;	/* Big endian */
+		break;
+	}
+
+	/* Target address size in bytes. */
+	if (file->elf->ehdr.e_ident[EI_CLASS] == ELFCLASS64)
+		address_size = 8;
+	else
+		address_size = 4;
+
+	/*
+	 * DWARF Call Frame Information is contained in .debug_frame.
+	 * NOTE: This implementation does not support .eh_frame.
+	 */
+	debug_frame = find_section_by_name(file->elf, ".debug_frame");
+	if (!debug_frame)
+		return 0;
+
+	dwarf_alloc_init();
+
+	/*
+	 * Parse all the entries in .debug_frame and create CIEs and FDEs.
+	 */
+	start = debug_frame->data->d_buf;
+	end = start + debug_frame->data->d_size;
+
+	while (start < end) {
+		start = dwarf_parse_one(start, end);
+		if (!start)
+			return -1;
+	}
+
+	/*
+	 * Run all the DWARF instructions in the CIEs and FDEs.
+	 */
+	dwarf_parse_instructions();
+	return 0;
+}
diff --git a/tools/objtool/dwarf_rules.c b/tools/objtool/dwarf_rules.c
new file mode 100644
index 000000000000..9cf201de392a
--- /dev/null
+++ b/tools/objtool/dwarf_rules.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * dwarf_rules.c - Allocation and management of DWARF rules.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+#include <stdio.h>
+
+#include <objtool/objtool.h>
+#include <objtool/warn.h>
+#include <objtool/dwarf_def.h>
+#include <linux/compiler.h>
+
+/*
+ * The following are stubs for now. Later, they will be filled to create
+ * DWARF rules that the kernel can use to compute the frame pointer at
+ * a given instruction address.
+ */
+void dwarf_rule_start(struct fde *fde)
+{
+}
+
+int dwarf_rule_add(struct fde *fde, unsigned long addr,
+	     struct rule *sp_rule, struct rule *fp_rule)
+{
+	return 0;
+}
+
+void dwarf_rule_next(struct fde *fde, unsigned long addr)
+{
+}
+
+void dwarf_rule_reset(struct fde *fde)
+{
+}
diff --git a/tools/objtool/dwarf_util.c b/tools/objtool/dwarf_util.c
new file mode 100644
index 000000000000..77c70c54d26f
--- /dev/null
+++ b/tools/objtool/dwarf_util.c
@@ -0,0 +1,280 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * dwarf_util.c - Support functions.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <objtool/warn.h>
+#include <objtool/dwarf_def.h>
+#include <linux/types.h>
+
+void			*free_space;
+size_t			free_size;
+
+/* Function to extract embedded values in the DWARF instruction stream. */
+u64			(*get_value)(unsigned char *field, unsigned int size);
+
+/*
+ * Little endian helper. Adapted from binutils.
+ */
+u64 get_value_le(unsigned char *field, unsigned int size)
+{
+	switch (size) {
+	case 1:
+		return *field;
+
+	case 2:
+		return ((unsigned int) (field[0])) |
+		       (((unsigned int) (field[1])) << 8);
+
+	case 3:
+		return ((unsigned long) (field[0])) |
+		       (((unsigned long) (field[1])) << 8) |
+		       (((unsigned long) (field[2])) << 16);
+
+	case 4:
+		return ((unsigned long) (field[0])) |
+		       (((unsigned long) (field[1])) << 8) |
+		       (((unsigned long) (field[2])) << 16) |
+		       (((unsigned long) (field[3])) << 24);
+
+	case 5:
+		if (sizeof(u64) >= 8) {
+			return ((u64) (field[0])) |
+			       (((u64) (field[1])) << 8) |
+			       (((u64) (field[2])) << 16) |
+			       (((u64) (field[3])) << 24) |
+			       (((u64) (field[4])) << 32);
+		}
+		__fallthrough;
+
+	case 6:
+		if (sizeof(u64) >= 8) {
+			return ((u64) (field[0])) |
+			       (((u64) (field[1])) << 8) |
+			       (((u64) (field[2])) << 16) |
+			       (((u64) (field[3])) << 24) |
+			       (((u64) (field[4])) << 32) |
+			       (((u64) (field[5])) << 40);
+		}
+		__fallthrough;
+
+	case 7:
+		if (sizeof(u64) >= 8) {
+			return ((u64) (field[0])) |
+			       (((u64) (field[1])) << 8) |
+			       (((u64) (field[2])) << 16) |
+			       (((u64) (field[3])) << 24) |
+			       (((u64) (field[4])) << 32) |
+			       (((u64) (field[5])) << 40) |
+			       (((u64) (field[6])) << 48);
+		}
+		__fallthrough;
+
+	case 8:
+		if (sizeof(u64) >= 8) {
+			return ((u64) (field[0])) |
+			       (((u64) (field[1])) << 8) |
+			       (((u64) (field[2])) << 16) |
+			       (((u64) (field[3])) << 24) |
+			       (((u64) (field[4])) << 32) |
+			       (((u64) (field[5])) << 40) |
+			       (((u64) (field[6])) << 48) |
+			       (((u64) (field[7])) << 56);
+		}
+		__fallthrough;
+
+	default:
+		WARN("%s: Unhandled data length: %d\n", __func__, size);
+	}
+	return 0;
+}
+
+/*
+ * Big endian helper. Adapted from binutils.
+ */
+u64 get_value_be(unsigned char *field, unsigned int size)
+{
+	switch (size) {
+	case 1:
+		return *field;
+
+	case 2:
+		return ((unsigned int) (field[1])) |
+		       (((unsigned int) (field[0])) << 8);
+
+	case 3:
+		return ((unsigned long) (field[2])) |
+		       (((unsigned long) (field[1])) << 8) |
+		       (((unsigned long) (field[0])) << 16);
+
+	case 4:
+		return ((unsigned long) (field[3])) |
+		       (((unsigned long) (field[2])) << 8) |
+		       (((unsigned long) (field[1])) << 16) |
+		       (((unsigned long) (field[0])) << 24);
+
+	case 5:
+		if (sizeof(u64) >= 8) {
+			return ((u64) (field[4])) |
+			       (((u64) (field[3])) << 8) |
+			       (((u64) (field[2])) << 16) |
+			       (((u64) (field[1])) << 24) |
+			       (((u64) (field[0])) << 32);
+		}
+		__fallthrough;
+
+	case 6:
+		if (sizeof(u64) >= 8) {
+			return ((u64) (field[5])) |
+			       (((u64) (field[4])) << 8) |
+			       (((u64) (field[3])) << 16) |
+			       (((u64) (field[2])) << 24) |
+			       (((u64) (field[1])) << 32) |
+			       (((u64) (field[0])) << 40);
+		}
+		__fallthrough;
+
+	case 7:
+		if (sizeof(u64) >= 8) {
+			return ((u64) (field[6])) |
+			       (((u64) (field[5])) << 8) |
+			       (((u64) (field[4])) << 16) |
+			       (((u64) (field[3])) << 24) |
+			       (((u64) (field[2])) << 32) |
+			       (((u64) (field[1])) << 40) |
+			       (((u64) (field[0])) << 48);
+		}
+		__fallthrough;
+
+	case 8:
+		if (sizeof(u64) >= 8) {
+			return ((u64) (field[7])) |
+			       (((u64) (field[6])) << 8) |
+			       (((u64) (field[5])) << 16) |
+			       (((u64) (field[4])) << 24) |
+			       (((u64) (field[3])) << 32) |
+			       (((u64) (field[2])) << 40) |
+			       (((u64) (field[1])) << 48) |
+			       (((u64) (field[0])) << 56);
+		}
+		__fallthrough;
+
+	default:
+		WARN("%s: Unhandled data length: %d\n", __func__, size);
+	}
+	return 0;
+}
+
+/*
+ * LEB 128 read functions adapted from LLVM code.
+ */
+u64 read_uleb_128(unsigned char *start, unsigned char *end,
+		  unsigned int *num_read, bool *fail)
+{
+	unsigned char	*cur = start, byte;
+	unsigned int	shift = 0;
+	u64		value = 0;
+	u64		slice;
+
+	do {
+		if (cur == end) {
+			WARN("%s: op=%d end of data", __func__, op);
+			*num_read = (unsigned int) (cur - start);
+			*fail = true;
+			return 0;
+		}
+
+		byte = *cur++;
+		slice = byte & 0x7f;
+
+		if ((shift >= 64 && slice != 0) ||
+		    (slice << shift >> shift) != slice) {
+			WARN("%s: op=%d value too large", __func__, op);
+			*num_read = (unsigned int) (cur - start);
+			*fail = true;
+			return 0;
+		}
+
+		value += slice << shift;
+		shift += 7;
+	} while (byte >= 128);
+
+	*num_read = (unsigned int) (cur - start);
+
+	return value;
+}
+
+s64 read_sleb_128(unsigned char *start, unsigned char *end,
+		  unsigned int *num_read, bool *fail)
+{
+	unsigned char	*cur = start, byte;
+	unsigned int	shift = 0;
+	s64		value = 0;
+	u64		slice;
+
+	do {
+		if (cur == end) {
+			WARN("%s: op=%d end of data", __func__, op);
+			*num_read = (unsigned int) (cur - start);
+			*fail = true;
+			return 0;
+		}
+
+		byte = *cur++;
+		slice = byte & 0x7f;
+
+		if ((shift >= 64 && slice != (value < 0 ? 0x7f : 0x00)) ||
+		    (shift == 63 && slice != 0 && slice != 0x7f)) {
+			WARN("%s: op=%d value too large", __func__, op);
+			*num_read = (unsigned int) (cur - start);
+			*fail = true;
+			return 0;
+		}
+
+		value |= slice << shift;
+		shift += 7;
+	} while (byte >= 128);
+
+	if (shift < 64 && (byte & 0x40)) {
+		/* Sign extend negative numbers if needed. */
+		value |= (-1ULL) << shift;
+	}
+
+	*num_read = (unsigned int) (cur - start);
+
+	return value;
+}
+
+void dwarf_alloc_init(void)
+{
+	/*
+	 * Use the size of the .debug_frame section as an estimate of the
+	 * memory we need.
+	 */
+	free_size = debug_frame->data->d_size * 4;
+	free_space = malloc(free_size);
+	if (!free_space) {
+		WARN("%s: Could not optimize allocations", __func__);
+		free_size = 0;
+	}
+}
+
+void *dwarf_alloc(size_t size)
+{
+	void	*buf;
+
+	/* Round to 8 bytes. */
+	size = (size + 7) & ~7UL;
+
+	if (free_size >= size) {
+		buf = free_space;
+		free_space += size;
+		free_size -= size;
+		return buf;
+	}
+	return malloc(size);
+}
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 4b384c907027..85606a19f633 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -108,7 +108,7 @@ static struct section *find_section_by_index(struct elf *elf,
 	return NULL;
 }
 
-static struct symbol *find_symbol_by_index(struct elf *elf, unsigned int idx)
+struct symbol *find_symbol_by_index(struct elf *elf, unsigned int idx)
 {
 	struct symbol *sym;
 
diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h
index 15ac0b7d3d6a..02b18149cdd7 100644
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -15,5 +15,6 @@ extern int cmd_parse_options(int argc, const char **argv, const char * const usa
 
 extern int cmd_check(int argc, const char **argv);
 extern int cmd_orc(int argc, const char **argv);
+extern int cmd_dwarf(int argc, const char **argv);
 
 #endif /* _BUILTIN_H */
diff --git a/tools/objtool/include/objtool/dwarf_def.h b/tools/objtool/include/objtool/dwarf_def.h
new file mode 100644
index 000000000000..7a0a18480d2b
--- /dev/null
+++ b/tools/objtool/include/objtool/dwarf_def.h
@@ -0,0 +1,438 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * dwarf_def.h - DWARF definitions for parsing DWARF information.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+
+#ifndef _OBJTOOL_DWARF_DEF_H
+#define _OBJTOOL_DWARF_DEF_H
+
+/*
+ * The DWARF Call Frame Information (CFI) is encoded in a self-contained
+ * section called .debug_frame.
+ *
+ * DWARF CFI defines the Canonical Frame Address (CFA) as the value of the
+ * stack pointer (SP) when a call instruction is executed. For the called
+ * function, other register values are expressed relative to the CFA. For
+ * a given code location within the function, one can compute things like:
+ *
+ *	- what offset must be added to the current stack pointer to obtain
+ *	  the CFA.
+ *
+ *	- what offset must be subtracted from the CFA to obtain the current
+ *	  frame pointer (FP).
+ *
+ *	- what offset must be subtracted from the CFA to obtain the location
+ *	  on the stack where a register value is saved. E.g., the return
+ *	  address (RA).
+ *
+ *	- which register is saved in which other register.
+ *
+ *	- etc.
+ *
+ * This allows the unwinding of the stack. The unwinder starts at the topmost
+ * frame and gets the value of the SP, FP and RA from the current register
+ * state. Using the DWARF CFI at the RA, the unwinder computes the values of
+ * the SP, FP and RA in the previous frame. This process continues until the
+ * SP hits the bottom of the stack and the unwinding terminates.
+ *
+ * In this work, the DWARF CFI is not used to build an unwinder. The existing
+ * frame pointer based unwinder is retained. But the CFI is used to compute
+ * an FP at every frame to validate the actual FP. If the computed and actual
+ * FPs match, then the stack frame is considered reliable. Otherwise, it is
+ * considered unreliable. If all of the frames in a stack trace are reliable,
+ * then the stack trace is reliable.
+ *
+ * Entries in a .debug_frame section are aligned on a multiple of the address
+ * size relative to the start of the section and come in two forms:
+ *
+ *	- a Common Information Entry (CIE) and
+ *
+ *	- a Frame Description Entry (FDE).
+ *
+ * A Common Information Entry holds information that is shared among many
+ * Frame Description Entries. So, it is like a header. There is at least one
+ * CIE in every non-empty .debug_frame section.
+ *
+ * The CIE contains information such as the size of an address in the
+ * architecture, the number of the return address register, etc. It also
+ * contains DWARF instructions to initialize unwind state at the start of
+ * an FDE.
+ *
+ * The FDE contains a code range to which it applies. It also contains DWARF
+ * instructions. These instructions are used to obtain the register rules
+ * at each code location. Using these rules, the SP, FP and RA are computed
+ * as mentioned above.
+ */
+
+/*
+ * DWARF CFI defines the following rules to obtain the value of a register
+ * in the previous frame, given a current frame:
+ *
+ * 1. Same_Value:
+ *
+ *	The current and previous values of the register are the same.
+ *
+ * 2. Val_Offset(N):
+ *
+ *	The previous value is (CFA + N) where N is a signed offset.
+ *
+ * 3. Offset(N):
+ *
+ *	The previous value is saved at (CFA + N).
+ *
+ * 4. register(R):
+ *
+ *	The previous value is saved in register R.
+ *
+ * 5. Val_Expression(E):
+ *
+ *	The previous value is the value produced by evaluating a given
+ *	DWARF expression. DWARF expressions are evaluated on a stack. That
+ *	is, operands are pushed and popped on a stack, operators are
+ *	applied on them and the result is obtained.
+ *
+ * 6. Expression(E):
+ *
+ *	The previous value is stored at the address computed from a DWARF
+ *	expression.
+ *
+ * 7. Architectural:
+ *
+ *	The previous value is obtained in an architecture-specific way via
+ *	an architecture-specific "augmentor". The augmentors are vendor
+ *	specific and are not part of the DWARF standard.
+ *
+ * Now, all of this is quite complicated. In this work, only (1), (2) and (3)
+ * will be supported. At the time of this writing, these are found to be
+ * sufficient for arm64 and RISC-V. Other architectures have not been checked.
+ *
+ * The code locations at which unsupported rules exist will be treated as
+ * unreliable from an unwinder perspective. In other words, if the RA of a
+ * stack frame is such a code location, then that frame is unreliable.
+ */
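+
+/*
+ * As a concrete illustration, a typical arm64 prologue with a 32-byte frame:
+ *
+ *	stp	x29, x30, [sp, #-32]!
+ *	mov	x29, sp
+ *
+ * is normally described, after the stp, as:
+ *
+ *	CFA = SP + 32			(tracked as Val_Offset(32) for the SP)
+ *	x29 (FP) saved at CFA - 32	(Offset(-32))
+ *	x30 (RA) saved at CFA - 24	(Offset(-24))
+ *
+ * These are exactly the rules that the structure below can represent.
+ */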
+
+/*
+ * The following structure is used to encode the Same_Value, Offset(N) and
+ * the Val_Offset(N) rules.
+ *
+ * offset = 0, saved = true	Never happens
+ * offset = 0, saved = false	Same_Value
+ * offset = N, saved = true	Offset(N)
+ * offset = N, saved = false	Val_Offset(N)
+ */
+struct rule {
+	long		offset;
+	bool		saved;
+};
+
+/*
+ * Common Information Entry (CIE):
+ *
+ * next
+ *	Next CIE in list.
+ *
+ * offset
+ *	Section offset at which this CIE is found.
+ *
+ * length
+ *	A constant that gives the number of bytes of the CIE structure, not
+ *	including the length field itself. The size of the length field plus
+ *	the value of length must be an integral multiple of the address size.
+ *
+ * id
+ *	A constant that is used to distinguish CIEs from FDEs.
+ *
+ * version
+ *	A version number. This number is specific to the call frame information
+ *	and is independent of the DWARF version number. Versions above 4
+ *	are not supported.
+ *
+ * address_size
+ *	The size of a target address in this CIE and any FDEs that use it,
+ *	in bytes.
+ *
+ * segment_size
+ *	The size of a segment selector in this CIE and any FDEs that use it,
+ *	in bytes.
+ *
+ * augmentation
+ *	A null-terminated UTF-8 string that identifies the augmentation to
+ *	this CIE or to the FDEs that use it. An augmentation is specified by
+ *	an architecture to compute values in a way that is specific to that
+ *	architecture. No augmentation is supported.
+ *
+ * code_factor
+ *	A constant that is factored out of all advance location instructions.
+ *
+ * data_factor
+ *	A constant that is factored out of certain offset instructions.
+ *
+ * return_address_reg
+ *	The number of the return address register in the architecture.
+ *
+ * instructions
+ *	DWARF instructions to initialize the unwind state at the start of
+ *	an FDE.
+ *
+ * instructions_size
+ *	Number of bytes of instructions.
+ *
+ * sp_rule
+ *	Initial rule to compute the CFA derived from the above instructions.
+ *
+ * fp_rule
+ *	Initial rule to compute the frame pointer derived from the above
+ *	instructions.
+ *
+ * ra_rule
+ *	Initial rule to compute the return address derived from the above
+ *	instructions.
+ *
+ * saved_sp_rule, saved_fp_rule, saved_ra_rule
+ *	Temporary storage used by DW_CFA_remember_state and DW_CFA_restore_state
+ *	to save and restore the SP, FP and RA rules while running instructions.
+ *
+ * cfa_reg, cfa_offset
+ *	Temporary storage to remember the default CFA register and offset
+ *	while running instructions.
+ *
+ * unusable
+ *	Some error happened while processing CIE instructions.
+ */
+struct cie {
+	struct cie		*next;
+	unsigned long		offset;
+	unsigned long		length;
+	unsigned long		id;
+	unsigned char		version;
+	unsigned char		address_size;
+	unsigned char		segment_size;
+	char			*augmentation;
+	unsigned int		code_factor;
+	int			data_factor;
+	unsigned int		return_address_reg;
+	unsigned char		*instructions;
+	size_t			instructions_size;
+	struct rule		sp_rule;
+	struct rule		fp_rule;
+	struct rule		ra_rule;
+	struct rule		saved_sp_rule;
+	struct rule		saved_fp_rule;
+	struct rule		saved_ra_rule;
+	unsigned int		cfa_reg;
+	long			cfa_offset;
+	bool			unusable;
+};
+
+/*
+ * Frame Description Entry (FDE):
+ *
+ * next
+ *	Next FDE in list.
+ *
+ * length
+ *	A constant that gives the number of bytes of the header and instruction
+ *	stream for this function, not including the length field itself. The
+ *	size of the length field plus the value of length must be an integral
+ *	multiple of the address size.
+ *
+ * cie_offset
+ *	A constant offset into the .debug_frame section that denotes the CIE
+ *	that is associated with this FDE.
+ *
+ * cie
+ *	CIE for this FDE.
+ *
+ * segment_selector
+ *	Segment selectors are not supported.
+ *
+ * start_pc, end_pc
+ *	Range of code to which this FDE applies.
+ *
+ * instructions
+ *	DWARF instructions to compute the register rules.
+ *
+ * instructions_size
+ *	Number of bytes of instructions.
+ *
+ * index
+ *	Index of the FDE record in the .debug_frame section. Used to obtain
+ *	relocation information for the FDE.
+ *
+ * symbol
+ *	Symbol information for the function for the code range.
+ *
+ * section
+ *	Section information for the function.
+ *
+ * offset
+ *	Offset within the section for the function.
+ *
+ * sp_offset
+ *	Extra SP adjustment recorded by arch_dwarf_clang_hack() and applied
+ *	to the SP rule for Clang-generated code.
+ */
+struct fde {
+	struct fde		*next;
+	unsigned long		length;
+	unsigned long		cie_offset;
+	struct cie		*cie;
+	unsigned long		segment_selector;
+	unsigned long		start_pc;
+	unsigned long		end_pc;
+	unsigned char		*instructions;
+	size_t			instructions_size;
+	unsigned long		index;
+	struct symbol		*symbol;
+	struct section		*section;
+	unsigned long		offset;
+	unsigned long		sp_offset;
+};
+
+/*
+ * These are identifiers for 32-bit and 64-bit CIE entries respectively.
+ * These identifiers distinguish a CIE from an FDE in .debug_frame.
+ */
+#define CIE_ID_32		0xFFFFFFFF
+#define CIE_ID_64		0xFFFFFFFFFFFFFFFFUL
+
+/*
+ * DWARF instruction op codes.
+ */
+
+/* Primary op codes in the high 2 bits */
+
+#define DW_CFA_extended_op		0x00
+#define DW_CFA_advance_loc		0x40
+#define DW_CFA_offset			0x80
+#define DW_CFA_restore			0xc0
+
+/* Extended op codes in the low 6 bits */
+
+#define DW_CFA_nop			0x00
+#define DW_CFA_set_loc			0x01
+#define DW_CFA_advance_loc1		0x02
+#define DW_CFA_advance_loc2		0x03
+#define DW_CFA_advance_loc4		0x04
+#define DW_CFA_offset_extended		0x05
+#define DW_CFA_restore_extended		0x06
+#define DW_CFA_undefined		0x07
+#define DW_CFA_same_value		0x08
+#define DW_CFA_register			0x09
+#define DW_CFA_remember_state		0x0a
+#define DW_CFA_restore_state		0x0b
+#define DW_CFA_def_cfa			0x0c
+#define DW_CFA_def_cfa_register		0x0d
+#define DW_CFA_def_cfa_offset		0x0e
+/* DWARF 3.  */
+#define DW_CFA_def_cfa_expression	0x0f
+#define DW_CFA_expression		0x10
+#define DW_CFA_offset_extended_sf	0x11
+#define DW_CFA_def_cfa_sf		0x12
+#define DW_CFA_def_cfa_offset_sf	0x13
+#define DW_CFA_val_offset		0x14
+#define DW_CFA_val_offset_sf		0x15
+#define DW_CFA_val_expression		0x16
+#define DW_CFA_lo_user			0x1c
+#define DW_CFA_hi_user			0x3f
+
+/*
+ * Extract a value directly from the DWARF instruction stream. Must take into
+ * account endianness.
+ */
+#define GET_VALUE(value, start, end, count)				\
+	do {								\
+		size_t	_size = (count);				\
+		size_t	_avail = (end) - (start);			\
+									\
+		if (sizeof(value) < _size)				\
+			_size = sizeof(value);				\
+		if ((start) > (end))					\
+			_avail = 0;					\
+		if (_size > _avail)					\
+			_size = _avail;					\
+		if (_size == 0)						\
+			(value) = 0;					\
+		else							\
+			(value) = get_value((start), _size);		\
+		(start) += _size;					\
+	} while (0)
+
+/*
+ * Extract an unsigned integer expressed in LEB 128 format from the DWARF
+ * instruction stream.
+ */
+#define READ_ULEB_128(value, start, end, fail)				\
+	do {								\
+		u64		_val;					\
+		unsigned int	_len;					\
+		bool		_fail = false;				\
+									\
+		_val = read_uleb_128(start, end, &_len, &_fail);	\
+									\
+		start += _len;						\
+		(value) = _val;						\
+		if ((value) != _val) {					\
+			WARN("READ_ULEB_128: op=%d value mismatch", op);\
+			_fail = true;					\
+		}							\
+		if (_fail)						\
+			(fail) = true;					\
+	} while (0)
+
+/*
+ * Extract a signed integer expressed in LEB 128 format from the DWARF
+ * instruction stream.
+ */
+#define READ_SLEB_128(value, start, end, fail)				\
+	do {								\
+		s64		_val;					\
+		unsigned int	_len;					\
+		bool		_fail = false;				\
+									\
+		_val = read_sleb_128(start, end, &_len, &_fail);	\
+									\
+		start += _len;						\
+		(value) = _val;						\
+		if ((value) != _val) {					\
+			WARN("READ_SLEB_128: op=%d value mismatch", op);\
+			_fail = true;					\
+		}							\
+		if (_fail)						\
+			(fail) = true;					\
+	} while (0)
+
+extern struct objtool_file	*dwarf_file;
+extern struct section		*debug_frame;
+extern struct cie		*cies, *cur_cie;
+extern struct fde		*fdes, *cur_fde;
+extern unsigned char		op, operand;
+extern void			*free_space;
+extern size_t			free_size;
+extern u64		(*get_value)(unsigned char *field, unsigned int size);
+
+void dwarf_rule_start(struct fde *fde);
+int dwarf_rule_add(struct fde *fde, unsigned long addr,
+		   struct rule *sp_rule, struct rule *fp_rule);
+void dwarf_rule_next(struct fde *fde, unsigned long addr);
+void dwarf_rule_reset(struct fde *fde);
+int arch_dwarf_fde_reloc(struct fde *fde);
+void arch_dwarf_clang_hack(struct fde *fde, unsigned long pc,
+			   struct rule *sp_rule, struct rule *fp_rule);
+int arch_dwarf_check_rules(struct fde *fde, unsigned long pc,
+			   struct rule *sp_rule, struct rule *fp_rule,
+			   struct rule *ra_rule);
+u64 get_value_le(unsigned char *field, unsigned int size);
+u64 get_value_be(unsigned char *field, unsigned int size);
+u64 read_uleb_128(unsigned char *start, unsigned char *end,
+		  unsigned int *num_read, bool *fail);
+s64 read_sleb_128(unsigned char *start, unsigned char *end,
+		  unsigned int *num_read, bool *fail);
+void dwarf_parse_instructions(void);
+void dwarf_alloc_init(void);
+void *dwarf_alloc(size_t size);
+
+#endif /* _OBJTOOL_DWARF_DEF_H */
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index cdc739fa9a6f..c133ece2cdcf 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -151,6 +151,7 @@ struct section *find_section_by_name(const struct elf *elf, const char *name);
 struct symbol *find_func_by_offset(struct section *sec, unsigned long offset);
 struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset);
 struct symbol *find_symbol_by_name(const struct elf *elf, const char *name);
+struct symbol *find_symbol_by_index(struct elf *elf, unsigned int idx);
 struct symbol *find_symbol_containing(const struct section *sec, unsigned long offset);
 struct reloc *find_reloc_by_dest(const struct elf *elf, struct section *sec, unsigned long offset);
 struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *sec,
diff --git a/tools/objtool/include/objtool/objtool.h b/tools/objtool/include/objtool/objtool.h
index f99fbc6078d5..0344e89a10e8 100644
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -41,5 +41,6 @@ void objtool_pv_add(struct objtool_file *file, int idx, struct symbol *func);
 int check(struct objtool_file *file);
 int orc_dump(const char *objname);
 int orc_create(struct objtool_file *file);
+int dwarf_parse(struct objtool_file *file);
 
 #endif /* _OBJTOOL_H */
diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c
index bdf699f6552b..bfb9c5607cfc 100644
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -38,6 +38,7 @@ static const char objtool_usage_string[] =
 static struct cmd_struct objtool_cmds[] = {
 	{"check",	cmd_check,	"Perform stack metadata validation on an object file" },
 	{"orc",		cmd_orc,	"Generate in-place ORC unwind tables for an object file" },
+	{"dwarf",	cmd_dwarf,	"Generate DWARF rules for object file"},
 };
 
 bool help;
diff --git a/tools/objtool/weak.c b/tools/objtool/weak.c
index 8314e824db4a..67b5016a8327 100644
--- a/tools/objtool/weak.c
+++ b/tools/objtool/weak.c
@@ -8,6 +8,7 @@
 #include <stdbool.h>
 #include <errno.h>
 #include <objtool/objtool.h>
+#include <objtool/dwarf_def.h>
 
 #define UNSUPPORTED(name)						\
 ({									\
@@ -29,3 +30,29 @@ int __weak orc_create(struct objtool_file *file)
 {
 	UNSUPPORTED("orc");
 }
+
+int __weak dwarf_parse(struct objtool_file *file)
+{
+	fprintf(stderr, "error: objtool: %s not implemented\n", __func__);
+	return -EOPNOTSUPP;
+}
+
+int __weak arch_dwarf_fde_reloc(struct fde *fde)
+{
+	fprintf(stderr, "error: objtool: %s not implemented\n", __func__);
+	return -EOPNOTSUPP;
+}
+
+void __weak arch_dwarf_clang_hack(struct fde *fde, unsigned long pc,
+				  struct rule *sp_rule, struct rule *fp_rule)
+{
+}
+
+int __weak arch_dwarf_check_rules(struct fde *fde, unsigned long pc,
+				  struct rule *sp_rule, struct rule *fp_rule,
+				  struct rule *ra_rule)
+{
+	fprintf(stderr, "error: objtool: %s not implemented\n", __func__);
+	return -EOPNOTSUPP;
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 2/9] objtool: Generate DWARF rules and place them in a special section
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 1/9] objtool: Parse DWARF Call Frame Information in object files madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 3/9] dwarf: Build the kernel with DWARF information madvenka
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Convert the DWARF Call Frame Information parsed by dwarf_parse() into
compact DWARF rules that are usable by the kernel. Place the rules in a
special section called .dwarf_rules. Also, place the PCs for the rules in
a special section called .dwarf_pcs. In addition, define relocation
entries for the PCs as they will change during linking.

An entry in .dwarf_rules and its corresponding entry in .dwarf_pcs together
describe a code range and DWARF rules for the code range. In the future,
the kernel will use the rules to compute the frame pointer at a given
instruction address. The unwinder can use the computed frame pointer to
validate the actual frame pointer for a reliable stack trace.

During rule generation, eliminate null offset rules and merge adjacent rules
that are identical to minimize the number of rules.

Also add an objtool option to dump the DWARF rules for debugging purposes.
It is invoked as follows:

	objtool dwarf dump <object-file>
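
As a rough sketch of the intended consumer side (not part of this patch, and
only for illustration): the kernel could look up the rule covering a given PC
and apply the computation documented in dwarf.h. The symbol names
__dwarf_rules, __dwarf_pcs and __dwarf_nr_rules below are placeholders, and
the sp_saved/fp_saved flags are ignored for brevity:

	#include <linux/types.h>
	#include <linux/dwarf.h>

	extern const struct dwarf_rule __dwarf_rules[];
	extern const unsigned long __dwarf_pcs[];
	extern const unsigned int __dwarf_nr_rules;

	/* Compute the FP for pc from the rules and compare with actual_fp. */
	static bool dwarf_fp_matches(unsigned long pc, unsigned long sp,
				     unsigned long actual_fp)
	{
		unsigned int i;

		for (i = 0; i < __dwarf_nr_rules; i++) {
			const struct dwarf_rule *rule = &__dwarf_rules[i];
			unsigned long start = __dwarf_pcs[i];
			unsigned long cfa;

			if (pc < start || pc - start >= rule->size)
				continue;

			cfa = sp + rule->sp_offset;	/* CFA = SP + sp_offset */
			return actual_fp == cfa + rule->fp_offset;
		}

		/* No rule covers this PC: the range is unreliable. */
		return false;
	}

A linear search is shown only to keep the sketch short; a real implementation
would likely sort the table and binary-search it.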

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 include/linux/dwarf.h                     |  43 +++++
 tools/include/linux/dwarf.h               |  43 +++++
 tools/objtool/builtin-dwarf.c             |  22 ++-
 tools/objtool/dwarf_rules.c               | 181 +++++++++++++++++++++-
 tools/objtool/include/objtool/dwarf_def.h |  12 ++
 tools/objtool/include/objtool/objtool.h   |   2 +
 tools/objtool/sync-check.sh               |   6 +
 tools/objtool/weak.c                      |  11 ++
 8 files changed, 311 insertions(+), 9 deletions(-)
 create mode 100644 include/linux/dwarf.h
 create mode 100644 tools/include/linux/dwarf.h

diff --git a/include/linux/dwarf.h b/include/linux/dwarf.h
new file mode 100644
index 000000000000..16e9dd8c60c8
--- /dev/null
+++ b/include/linux/dwarf.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * dwarf.h - DWARF data structures used by the unwinder.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (c) 2022 Microsoft Corporation
+ */
+
+#ifndef _LINUX_DWARF_H
+#define _LINUX_DWARF_H
+
+#include <linux/types.h>
+
+/*
+ * objtool generates two special sections that contain DWARF information that
+ * will be used by the reliable unwinder to validate the frame pointer in every
+ * frame:
+ *
+ * .dwarf_rules:
+ *	This contains an array of struct dwarf_rule. Each rule contains the
+ *	size of a code range. In addition, a rule contains the offsets that
+ *	must be used to compute the frame pointer at any of the instructions
+ *	within the code range. The computation is:
+ *
+ *		CFA = %sp + sp_offset
+ *		FP = CFA + fp_offset
+ *
+ *	where %sp is the stack pointer at the instruction address and FP is
+ *	the frame pointer.
+ *
+ * .dwarf_pcs:
+ *	This contains an array of starting PCs, one for each rule.
+ */
+struct dwarf_rule {
+	unsigned int	size:30;
+	unsigned int	sp_saved:1;
+	unsigned int	fp_saved:1;
+	short		sp_offset;
+	short		fp_offset;
+};
+
+#endif /* _LINUX_DWARF_H */
diff --git a/tools/include/linux/dwarf.h b/tools/include/linux/dwarf.h
new file mode 100644
index 000000000000..16e9dd8c60c8
--- /dev/null
+++ b/tools/include/linux/dwarf.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * dwarf.h - DWARF data structures used by the unwinder.
+ *
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ *
+ * Copyright (c) 2022 Microsoft Corporation
+ */
+
+#ifndef _LINUX_DWARF_H
+#define _LINUX_DWARF_H
+
+#include <linux/types.h>
+
+/*
+ * objtool generates two special sections that contain DWARF information that
+ * will be used by the reliable unwinder to validate the frame pointer in every
+ * frame:
+ *
+ * .dwarf_rules:
+ *	This contains an array of struct dwarf_rule. Each rule contains the
+ *	size of a code range. In addition, a rule contains the offsets that
+ *	must be used to compute the frame pointer at any of the instructions
+ *	within the code range. The computation is:
+ *
+ *		CFA = %sp + sp_offset
+ *		FP = CFA + fp_offset
+ *
+ *	where %sp is the stack pointer at the instruction address and FP is
+ *	the frame pointer.
+ *
+ * .dwarf_pcs:
+ *	This contains an array of starting PCs, one for each rule.
+ */
+struct dwarf_rule {
+	unsigned int	size:30;
+	unsigned int	sp_saved:1;
+	unsigned int	fp_saved:1;
+	short		sp_offset;
+	short		fp_offset;
+};
+
+#endif /* _LINUX_DWARF_H */
diff --git a/tools/objtool/builtin-dwarf.c b/tools/objtool/builtin-dwarf.c
index f44b35eb3f55..1b451e830140 100644
--- a/tools/objtool/builtin-dwarf.c
+++ b/tools/objtool/builtin-dwarf.c
@@ -25,6 +25,10 @@ static const char * const dwarf_usage[] = {
 	 * information.
 	 */
 	"objtool dwarf generate file",
+	/*
+	 * Dump DWARF rules for debugging purposes.
+	 */
+	"objtool dwarf dump file",
 
 	NULL,
 };
@@ -37,6 +41,7 @@ int cmd_dwarf(int argc, const char **argv)
 {
 	const char		*object;
 	struct objtool_file	*file;
+	int			ret;
 
 	argc--; argv++;
 	if (argc != 2)
@@ -48,8 +53,21 @@ int cmd_dwarf(int argc, const char **argv)
 	if (!file)
 		return 1;
 
-	if (!strncmp(argv[0], "gen", 3))
-		return dwarf_parse(file);
+	if (!strncmp(argv[0], "gen", 3)) {
+		ret = dwarf_parse(file);
+		if (!ret)
+			ret = dwarf_write(file);
+		if (!ret && file->elf->changed)
+			ret = elf_write(file->elf);
+		return ret;
+	}
+
+	if (!strcmp(argv[0], "dump")) {
+		ret = dwarf_parse(file);
+		if (!ret)
+			dwarf_dump();
+		return ret;
+	}
 
 	usage_with_options(dwarf_usage, dwarf_options);
 
diff --git a/tools/objtool/dwarf_rules.c b/tools/objtool/dwarf_rules.c
index 9cf201de392a..a118b392aac8 100644
--- a/tools/objtool/dwarf_rules.c
+++ b/tools/objtool/dwarf_rules.c
@@ -13,25 +13,192 @@
 #include <objtool/dwarf_def.h>
 #include <linux/compiler.h>
 
-/*
- * The following are stubs for now. Later, they will be filled to create
- * DWARF rules that the kernel can use to compute the frame pointer at
- * a given instruction address.
- */
+struct section			*dwarf_rules_sec;
+struct section			*dwarf_pcs_sec;
+
+static struct fde_entry		*cur_entry;
+static int			nentries;
+
+static int dwarf_rule_insert(struct fde *fde, unsigned long addr,
+			     struct rule *sp_rule, struct rule *fp_rule);
+
 void dwarf_rule_start(struct fde *fde)
 {
+	fde->head = NULL;
+	fde->tail = NULL;
+	cur_entry = NULL;
 }
 
 int dwarf_rule_add(struct fde *fde, unsigned long addr,
-	     struct rule *sp_rule, struct rule *fp_rule)
+		   struct rule *sp_rule, struct rule *fp_rule)
 {
-	return 0;
+	if (cur_entry) {
+		struct rule		*esp_rule = &cur_entry->sp_rule;
+		struct rule		*efp_rule = &cur_entry->fp_rule;
+
+		/*
+		 * If the rules have not changed, there is nothing to do.
+		 */
+		if (esp_rule->offset == sp_rule->offset &&
+		    efp_rule->offset == fp_rule->offset &&
+		    esp_rule->saved == sp_rule->saved &&
+		    efp_rule->saved == fp_rule->saved) {
+			return 0;
+		}
+		/* Close out the current range. */
+		cur_entry->size = addr - cur_entry->addr;
+	}
+	return dwarf_rule_insert(fde, addr, sp_rule, fp_rule);
 }
 
 void dwarf_rule_next(struct fde *fde, unsigned long addr)
 {
+	if (cur_entry) {
+		/* Close out the current range. */
+		cur_entry->size = addr - cur_entry->addr;
+		cur_entry = NULL;
+	}
 }
 
 void dwarf_rule_reset(struct fde *fde)
 {
+	struct fde_entry	*entry;
+
+	while (fde->head) {
+		entry = fde->head;
+		fde->head = entry->next;
+		/*
+		 * Entries come from dwarf_alloc(), which may hand out memory
+		 * from its bump allocator, so they cannot be passed to
+		 * free(). Just unlink them.
+		 */
+		nentries--;
+	}
+	fde->tail = NULL;
+	cur_entry = NULL;
+}
+
+static int dwarf_rule_insert(struct fde *fde, unsigned long addr,
+			     struct rule *sp_rule, struct rule *fp_rule)
+{
+	struct fde_entry	*entry;
+
+	entry = dwarf_alloc(sizeof(*entry));
+	if (!entry)
+		return -1;
+
+	/* Add the entry to the FDE list. */
+	if (fde->tail)
+		fde->tail->next = entry;
+	else
+		fde->head = entry;
+	fde->tail = entry;
+	entry->next = NULL;
+
+	/*
+	 * Record the starting address of the code range here. The size of
+	 * the range will be known only when the next rule comes in. At that
+	 * time, we will close out this range.
+	 */
+	entry->addr = addr;
+
+	/* Copy the rules. */
+	entry->sp_rule = *sp_rule;
+	entry->fp_rule = *fp_rule;
+
+	cur_entry = entry;
+	nentries++;
+	return 0;
+}
+
+static int dwarf_rule_write(struct elf *elf, struct fde *fde,
+			    struct fde_entry *entry, unsigned int index)
+{
+	struct dwarf_rule	rule, *drule;
+
+	/*
+	 * Encode the SP and FP rules from the entry into a single dwarf_rule
+	 * for the kernel's benefit. Copy it into .dwarf_rules.
+	 */
+	rule.size = entry->size;
+	rule.sp_saved = entry->sp_rule.saved;
+	rule.fp_saved = entry->fp_rule.saved;
+	rule.sp_offset = entry->sp_rule.offset;
+	rule.fp_offset = entry->fp_rule.offset;
+
+	drule = (struct dwarf_rule *) dwarf_rules_sec->data->d_buf + index;
+	memcpy(drule, &rule, sizeof(rule));
+
+	/* Add relocation information for the code range. */
+	if (elf_add_reloc_to_insn(elf, dwarf_pcs_sec,
+				  index * sizeof(unsigned long),
+				  R_AARCH64_ABS64,
+				  fde->section, entry->addr)) {
+		return -1;
+	}
+	return 0;
+}
+
+int dwarf_write(struct objtool_file *file)
+{
+	struct elf		*elf = file->elf;
+	struct fde		*fde;
+	struct fde_entry	*entry;
+	int			index;
+
+	/*
+	 * Check if .dwarf_rules already exists. If it doesn't, we will
+	 * assume that .dwarf_pcs doesn't exist either.
+	 */
+	if (find_section_by_name(elf, ".dwarf_rules")) {
+		WARN("file already has .dwarf_rules section");
+		return -1;
+	}
+
+	/* Create .dwarf_rules. */
+	dwarf_rules_sec = elf_create_section(elf, ".dwarf_rules", 0,
+					     sizeof(struct dwarf_rule),
+					     nentries);
+	if (!dwarf_rules_sec) {
+		WARN("Unable to create .dwarf_rules");
+		return -1;
+	}
+
+	/* Create .dwarf_pcs. */
+	dwarf_pcs_sec = elf_create_section(elf, ".dwarf_pcs", 0,
+					   sizeof(unsigned long), nentries);
+	if (!dwarf_pcs_sec) {
+		WARN("Unable to create .dwarf_pcs");
+		return -1;
+	}
+
+	/* Write DWARF rules to sections. */
+	index = 0;
+	for (fde = fdes; fde != NULL; fde = fde->next) {
+		for (entry = fde->head; entry != NULL; entry = entry->next) {
+			if (dwarf_rule_write(elf, fde, entry, index))
+				return -1;
+			index++;
+		}
+	}
+
+	return 0;
+}
+
+void dwarf_dump(void)
+{
+	struct fde		*fde;
+	struct fde_entry	*entry;
+	struct rule		*sp_rule, *fp_rule;
+	int			index = 0;
+
+	for (fde = fdes; fde != NULL; fde = fde->next) {
+		for (entry = fde->head; entry != NULL; entry = entry->next) {
+			sp_rule = &entry->sp_rule;
+			fp_rule = &entry->fp_rule;
+
+			printf("addr=%lx size=%lx:",
+			       entry->addr, entry->size);
+			printf("\tsp=%ld sp_saved=%d fp=%ld fp_saved=%d\n",
+			       sp_rule->offset, sp_rule->saved,
+			       fp_rule->offset, fp_rule->saved);
+			index++;
+		}
+	}
 }
diff --git a/tools/objtool/include/objtool/dwarf_def.h b/tools/objtool/include/objtool/dwarf_def.h
index 7a0a18480d2b..af56ccb52fff 100644
--- a/tools/objtool/include/objtool/dwarf_def.h
+++ b/tools/objtool/include/objtool/dwarf_def.h
@@ -10,6 +10,8 @@
 #ifndef _OBJTOOL_DWARF_DEF_H
 #define _OBJTOOL_DWARF_DEF_H
 
+#include <linux/dwarf.h>
+
 /*
  * The DWARF Call Frame Information (CFI) is encoded in a self-contained
  * section called .debug_frame.
@@ -228,6 +230,14 @@ struct cie {
 	bool			unusable;
 };
 
+struct fde_entry {
+	struct fde_entry	*next;
+	unsigned long		addr;
+	size_t			size;
+	struct rule		sp_rule;
+	struct rule		fp_rule;
+};
+
 /*
  * Frame Description Entry (FDE):
  *
@@ -290,6 +300,8 @@ struct fde {
 	struct section		*section;
 	unsigned long		offset;
 	unsigned long		sp_offset;
+	struct fde_entry	*head;
+	struct fde_entry	*tail;
 };
 
 /*
diff --git a/tools/objtool/include/objtool/objtool.h b/tools/objtool/include/objtool/objtool.h
index 0344e89a10e8..93e62639ab01 100644
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -42,5 +42,7 @@ int check(struct objtool_file *file);
 int orc_dump(const char *objname);
 int orc_create(struct objtool_file *file);
 int dwarf_parse(struct objtool_file *file);
+void dwarf_dump(void);
+int dwarf_write(struct objtool_file *file);
 
 #endif /* _OBJTOOL_H */
diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-check.sh
index 105a291ff8e7..345c259a115c 100755
--- a/tools/objtool/sync-check.sh
+++ b/tools/objtool/sync-check.sh
@@ -27,6 +27,12 @@ arch/x86/lib/insn.c
 '
 fi
 
+if [ "$SRCARCH" = "arm64" ]; then
+FILES="$FILES
+include/linux/dwarf.h
+"
+fi
+
 check_2 () {
   file1=$1
   file2=$2
diff --git a/tools/objtool/weak.c b/tools/objtool/weak.c
index 67b5016a8327..9d89d4fad8a1 100644
--- a/tools/objtool/weak.c
+++ b/tools/objtool/weak.c
@@ -38,6 +38,17 @@ int __weak dwarf_parse(struct objtool_file *file)
 	return -EOPNOTSUPP;
 }
 
+int __weak dwarf_write(struct objtool_file *file)
+{
+	fprintf(stderr, "error: objtool: %s not implemented\n", __func__);
+	return -1;
+}
+
+void __weak dwarf_dump(void)
+{
+	fprintf(stderr, "error: objtool: %s not implemented\n", __func__);
+}
+
 int __weak arch_dwarf_fde_reloc(struct fde *fde)
 {
 	fprintf(stderr, "error: objtool: %s not implemented\n", __func__);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 3/9] dwarf: Build the kernel with DWARF information
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 1/9] objtool: Parse DWARF Call Frame Information in object files madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 2/9] objtool: Generate DWARF rules and place them in a special section madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 4/9] dwarf: Implement DWARF rule processing in the kernel madvenka
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Define CONFIG_DWARF_FP to include the DWARF-based FP validation code.
Define CONFIG_STACK_VALIDATION to enable DWARF-based FP validation.

When these configs are enabled, invoke objtool on relocatable files during
the kernel build with the following command:

	objtool dwarf generate <object-file>

Objtool creates the following sections in each object file:

.dwarf_rules	Array of DWARF rules
.dwarf_pcs	Array of PCs, one-to-one with rules

In the future, the kernel can use these sections to find the rules for a
given instruction address. The unwinder can then compute the FP at an
instruction address and validate the actual FP against it.
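
For illustration, the boundary symbols added to vmlinux.lds.S by this patch
let the kernel size the two arrays. A minimal sketch only (assumes
<linux/dwarf.h> and <linux/errno.h>; the real consumer is added in a later
patch):

	extern char __dwarf_rules_start[], __dwarf_rules_end[];
	extern char __dwarf_pcs_start[], __dwarf_pcs_end[];

	static int __init dwarf_count_entries(void)
	{
		int nrules = (__dwarf_rules_end - __dwarf_rules_start) /
			     sizeof(struct dwarf_rule);
		int npcs = (__dwarf_pcs_end - __dwarf_pcs_start) /
			   sizeof(unsigned long);

		/* The two sections must map one-to-one. */
		return (nrules && nrules == npcs) ? nrules : -EINVAL;
	}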

NOTE: CONFIG_STACK_VALIDATION needs to be turned on here. Otherwise, objtool
will not be invoked during the kernel build process. The actual stack
validation code will be added separately; enabling the config before that
code lands is harmless.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/Kconfig                    |  4 +++-
 arch/arm64/Kconfig              |  2 ++
 arch/arm64/Kconfig.debug        |  5 +++++
 arch/arm64/configs/defconfig    |  1 +
 arch/arm64/kernel/vmlinux.lds.S | 22 ++++++++++++++++++++++
 scripts/Makefile.build          |  4 ++++
 scripts/link-vmlinux.sh         |  6 ++++++
 7 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index d3c4ab249e9c..3b0d0db322b9 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1016,7 +1016,9 @@ config HAVE_STACK_VALIDATION
 	bool
 	help
 	  Architecture supports the 'objtool check' host tool command, which
-	  performs compile-time stack metadata validation.
+	  performs compile-time stack metadata validation. On architectures
+	  that use DWARF-validated frame pointers, it instead supports the
+	  'objtool dwarf generate' host tool command.
 
 config HAVE_RELIABLE_STACKTRACE
 	bool
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c4207cf9bb17..c82a3a93297f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -220,6 +220,8 @@ config ARM64
 	select SWIOTLB
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
+	select HAVE_STACK_VALIDATION	if DWARF_FP
+	select STACK_VALIDATION		if HAVE_STACK_VALIDATION
 	select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
 	select TRACE_IRQFLAGS_SUPPORT
 	help
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index 265c4461031f..585967062a1c 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -20,4 +20,9 @@ config ARM64_RELOC_TEST
 	depends on m
 	tristate "Relocation testing module"
 
+config DWARF_FP
+	def_bool y
+	depends on FRAME_POINTER
+	depends on DEBUG_INFO_DWARF4
+
 source "drivers/hwtracing/coresight/Kconfig"
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index f2e2b9bdd702..a59c448f442a 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -1233,3 +1233,4 @@ CONFIG_DEBUG_KERNEL=y
 # CONFIG_DEBUG_PREEMPT is not set
 # CONFIG_FTRACE is not set
 CONFIG_MEMTEST=y
+CONFIG_DEBUG_INFO_DWARF4=y
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 50bab186c49b..fb3b9970453b 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -122,6 +122,25 @@ jiffies = jiffies_64;
 #define TRAMP_TEXT
 #endif
 
+#ifdef CONFIG_DWARF_FP
+#define DWARF_RULES					\
+	. = ALIGN(8);					\
+	.dwarf_rules : {				\
+		__dwarf_rules_start = .;		\
+		KEEP(*(.dwarf_rules))			\
+		__dwarf_rules_end = .;			\
+	}
+
+#define DWARF_PCS					\
+	. = ALIGN(8);					\
+	__dwarf_pcs_start = .;				\
+	KEEP(*(.dwarf_pcs))				\
+	__dwarf_pcs_end = .;
+#else
+#define DWARF_RULES
+#define DWARF_PCS
+#endif
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from _stext to _edata, must be a round multiple of the PE/COFF
@@ -239,6 +258,7 @@ SECTIONS
 		CON_INITCALL
 		INIT_RAM_FS
 		*(.init.altinstructions .init.bss)	/* from the EFI stub */
+		DWARF_PCS
 	}
 	.exit.data : {
 		EXIT_DATA
@@ -291,6 +311,8 @@ SECTIONS
 		__mmuoff_data_end = .;
 	}
 
+	DWARF_RULES
+
 	PECOFF_EDATA_PADDING
 	__pecoff_data_rawsize = ABSOLUTE(. - __initdata_begin);
 	_edata = .;
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 78656b527fe5..5e8d89c64572 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -227,6 +227,9 @@ ifdef CONFIG_STACK_VALIDATION
 
 objtool := $(objtree)/tools/objtool/objtool
 
+ifdef CONFIG_DWARF_FP
+objtool_args = dwarf generate
+else
 objtool_args =								\
 	$(if $(CONFIG_UNWINDER_ORC),orc generate,check)			\
 	$(if $(part-of-module), --module)				\
@@ -235,6 +238,7 @@ objtool_args =								\
 	$(if $(CONFIG_RETPOLINE), --retpoline)				\
 	$(if $(CONFIG_X86_SMAP), --uaccess)				\
 	$(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount)
+endif
 
 cmd_objtool = $(if $(objtool-enabled), ; $(objtool) $(objtool_args) $@)
 cmd_gen_objtooldep = $(if $(objtool-enabled), { echo ; echo '$@: $$(wildcard $(objtool))' ; } >> $(dot-target).cmd)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 5cdd9bc5c385..433e395f977b 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -104,6 +104,12 @@ objtool_link()
 	local objtoolcmd;
 	local objtoolopt;
 
+	if [ "${CONFIG_LTO_CLANG} ${CONFIG_DWARF_FP}" = "y y" ]
+	then
+		tools/objtool/objtool dwarf generate ${1}
+		return
+	fi
+
 	if [ "${CONFIG_LTO_CLANG} ${CONFIG_STACK_VALIDATION}" = "y y" ]; then
 		# Don't perform vmlinux validation unless explicitly requested,
 		# but run objtool on vmlinux.o now that we have an object file.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 4/9] dwarf: Implement DWARF rule processing in the kernel
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (2 preceding siblings ...)
  2022-04-07 20:25   ` [RFC PATCH v1 3/9] dwarf: Build the kernel with DWARF information madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 5/9] dwarf: Implement DWARF support for modules madvenka
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Define a struct dwarf_info to store all of the DWARF information needed to
lookup the DWARF rules for an instruction address. There is one dwarf_info
for vmlinux and one for every module.

Implement a lookup function dwarf_lookup(). Given an instruction address,
the function looks up the corresponding DWARF rules. The unwinder will use
the lookup function in the future.

Sort the rules based on instruction address. This allows a binary search.

Divide the text range into fixed-size blocks and map the rules to their
respective blocks. Given an instruction address, first locate its block.
Then, perform a binary search over just the rules in that block. This
minimizes the number of rules to consider in the binary search.

dwarf_info contains an array of PCs to search. In order to save space, store
the PCs array as an array of offsets from the base PC of the text range.
This way, we only need 32 bits to store the PC.
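
As a rough sketch of the lookup path implemented in kernel/dwarf_fp.c below
(the 4 KB block size is an implementation choice of this patch):

	unsigned int off = pc - info->base_pc;	/* 32-bit offset from base_pc */
	struct dwarf_block *blk = &info->blocks[off >> OFFSET_BLOCK_SHIFT];
	int start = blk->first_rule;
	int end = blk->last_rule + 1;

	/*
	 * Binary search info->offsets[start..end) for the largest offset
	 * that is <= off, then range-check the matching info->rules[] entry.
	 */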

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/include/asm/sections.h |   4 +
 include/linux/dwarf.h             |  21 +++
 kernel/Makefile                   |   1 +
 kernel/dwarf_fp.c                 | 244 ++++++++++++++++++++++++++++++
 tools/include/linux/dwarf.h       |  21 +++
 5 files changed, 291 insertions(+)
 create mode 100644 kernel/dwarf_fp.c

diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h
index 152cb35bf9df..d9095a9094b7 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -22,5 +22,9 @@ extern char __irqentry_text_start[], __irqentry_text_end[];
 extern char __mmuoff_data_start[], __mmuoff_data_end[];
 extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
 extern char __relocate_new_kernel_start[], __relocate_new_kernel_end[];
+#ifdef CONFIG_DWARF_FP
+extern char __dwarf_rules_start[], __dwarf_rules_end[];
+extern char __dwarf_pcs_start[], __dwarf_pcs_end[];
+#endif
 
 #endif /* __ASM_SECTIONS_H */
diff --git a/include/linux/dwarf.h b/include/linux/dwarf.h
index 16e9dd8c60c8..3df15e79003c 100644
--- a/include/linux/dwarf.h
+++ b/include/linux/dwarf.h
@@ -40,4 +40,25 @@ struct dwarf_rule {
 	short		fp_offset;
 };
 
+/*
+ * The whole text area is divided into fixed-size blocks, and each rule is
+ * mapped to the block that contains its starting PC. To look up an
+ * instruction address, the block containing the address is located first,
+ * and a binary search is then performed over just the rules in that block.
+ * This minimizes the number of rules considered in the search.
+ */
+struct dwarf_block {
+	int		first_rule;
+	int		last_rule;
+};
+
+#ifdef CONFIG_DWARF_FP
+extern struct dwarf_rule	*dwarf_lookup(unsigned long pc);
+#else
+static inline struct dwarf_rule *dwarf_lookup(unsigned long pc)
+{
+	return NULL;
+}
+#endif
+
 #endif /* _LINUX_DWARF_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index 186c49582f45..7582a6323446 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -130,6 +130,7 @@ obj-$(CONFIG_WATCH_QUEUE) += watch_queue.o
 
 obj-$(CONFIG_RESOURCE_KUNIT_TEST) += resource_kunit.o
 obj-$(CONFIG_SYSCTL_KUNIT_TEST) += sysctl-test.o
+obj-$(CONFIG_DWARF_FP) += dwarf_fp.o
 
 CFLAGS_stackleak.o += $(DISABLE_STACKLEAK_PLUGIN)
 obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
diff --git a/kernel/dwarf_fp.c b/kernel/dwarf_fp.c
new file mode 100644
index 000000000000..bb14fbe3f3e1
--- /dev/null
+++ b/kernel/dwarf_fp.c
@@ -0,0 +1,244 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * dwarf_fp.c - Allocate DWARF info. There will be one info for vmlinux
+ *		and one for every module. Implement a lookup function that
+ *		can locate the rule for a given instruction address.
+ *
+ * Copyright (C) 2021 Microsoft, Inc.
+ * Author: Madhavan T. Venkataraman <madvenka@microsoft.com>
+ */
+#include <linux/dwarf.h>
+#include <linux/slab.h>
+#include <linux/sort.h>
+#include <linux/types.h>
+#include <asm/sections.h>
+#include <asm/memory.h>
+
+#define OFFSET_BLOCK_SHIFT		12
+#define OFFSET_BLOCK(pc)		((pc) >> OFFSET_BLOCK_SHIFT)
+
+/*
+ * There is one struct dwarf_info for vmlinux and one for each module.
+ */
+struct dwarf_info {
+	struct dwarf_rule	*rules;
+	int			nrules;
+	unsigned int		*offsets;
+
+	struct dwarf_block	*blocks;
+	int			nblocks;
+
+	unsigned long		*pcs;
+	unsigned long		base_pc;
+	unsigned long		end_pc;
+};
+
+static DEFINE_MUTEX(dwarf_mutex);
+
+static struct dwarf_info	*vmlinux_dwarf_info;
+static struct dwarf_info	*cur_info;
+
+static int dwarf_compare(const void *arg1, const void *arg2)
+{
+	const unsigned long		*pc1 = arg1;
+	const unsigned long		*pc2 = arg2;
+
+	if (*pc1 > *pc2)
+		return 1;
+	if (*pc1 < *pc2)
+		return -1;
+	return 0;
+}
+
+static void dwarf_swap(void *arg1, void *arg2, int size)
+{
+	struct dwarf_rule	*rules = cur_info->rules;
+	unsigned long		*pc1 = arg1;
+	unsigned long		*pc2 = arg2;
+	int			i = (int) (pc1 - cur_info->pcs);
+	int			j = (int) (pc2 - cur_info->pcs);
+	unsigned long		tmp_pc;
+	struct dwarf_rule	tmp_rule;
+
+	tmp_pc = *pc1;
+	*pc1 = *pc2;
+	*pc2 = tmp_pc;
+
+	tmp_rule = rules[i];
+	rules[i] = rules[j];
+	rules[j] = tmp_rule;
+}
+
+/*
+ * Sort DWARF Records based on instruction addresses.
+ */
+static void dwarf_sort(struct dwarf_info *info)
+{
+	mutex_lock(&dwarf_mutex);
+
+	/*
+	 * cur_info is a global that allows us to sort both arrays in one go.
+	 */
+	cur_info = info;
+	sort(info->pcs, info->nrules, sizeof(*info->pcs),
+	     dwarf_compare, dwarf_swap);
+
+	mutex_unlock(&dwarf_mutex);
+}
+
+#define INVALID_RULE		-1
+
+static struct dwarf_info *dwarf_alloc(struct dwarf_rule *rules, int nrules,
+				      unsigned long *pcs)
+{
+	struct dwarf_info	*info;
+	unsigned int		*offsets, last_offset;
+	struct dwarf_block	*blocks;
+	int			r, b, nblocks;
+
+	info = kmalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return NULL;
+
+	info->rules = rules;
+	info->nrules = nrules;
+	info->pcs = pcs;
+
+	/* Sort pcs[] and rules[] in the increasing order of PC. */
+	dwarf_sort(info);
+
+	/* Compute the boundaries for the rules. */
+	info->base_pc = pcs[0];
+	info->end_pc = pcs[nrules - 1] + rules[nrules - 1].size;
+
+	offsets = kmalloc_array(nrules, sizeof(*offsets), GFP_KERNEL);
+	if (!offsets)
+		goto free_info;
+
+	/* Store the PCs as offsets from the base PC. This is to save memory. */
+	for (r = 0; r < nrules; r++)
+		offsets[r] = pcs[r] - info->base_pc;
+
+	/* Compute the number of blocks. */
+	last_offset = offsets[nrules - 1];
+	nblocks = OFFSET_BLOCK(last_offset) + 1;
+
+	blocks = kmalloc_array(nblocks, sizeof(*blocks), GFP_KERNEL);
+	if (!blocks)
+		goto free_offsets;
+
+	/* Initialize blocks. */
+	for (b = 0; b < nblocks; b++) {
+		blocks[b].first_rule = INVALID_RULE;
+		blocks[b].last_rule = INVALID_RULE;
+	}
+
+	/* Map rules to blocks. */
+	for (r = 0; r < nrules; r++) {
+		b = OFFSET_BLOCK(offsets[r]);
+		if (blocks[b].first_rule == INVALID_RULE)
+			blocks[b].first_rule = r;
+		blocks[b].last_rule = r;
+	}
+
+	/* Initialize empty blocks. */
+	for (b = 0; b < nblocks; b++) {
+		if (blocks[b].first_rule == INVALID_RULE) {
+			blocks[b].first_rule = blocks[b - 1].last_rule;
+			blocks[b].last_rule = blocks[b - 1].last_rule;
+		}
+	}
+
+	info->blocks = blocks;
+	info->nblocks = nblocks;
+	info->offsets = offsets;
+
+	/* The PCs array for vmlinux is in init data. It will be discarded. */
+	info->pcs = NULL;
+
+	return info;
+free_offsets:
+	kfree(offsets);
+free_info:
+	kfree(info);
+	return NULL;
+}
+
+static struct dwarf_rule *dwarf_lookup_rule(struct dwarf_info *info,
+					    unsigned long pc)
+{
+	struct dwarf_block	*blocks = info->blocks;
+	unsigned int		*offsets = info->offsets, off;
+	struct dwarf_rule	*rule;
+	int			start, mid, end, n, b;
+
+	if (pc < info->base_pc || pc >= info->end_pc)
+		return NULL;
+
+	/* Make PC relative to the base for binary search. */
+	off = pc - info->base_pc;
+
+	/*
+	 * Locate the block for the offset. Do a binary search between the
+	 * start and end rules in the block.
+	 */
+	b = OFFSET_BLOCK(off);
+	start = blocks[b].first_rule;
+	end = blocks[b].last_rule + 1;
+
+	if (off < offsets[start])
+		start--;
+
+	/*
+	 * Binary search. For cache performance, we search in offsets[]
+	 * first and locate a candidate rule. Then, we perform a range check
+	 * for the candidate rule at the end. This is so that rules[]
+	 * is only accessed at the end of the search.
+	 */
+	for (n = end - start; n > 1; n = end - start) {
+		mid = start + (n >> 1);
+
+		if (off >= offsets[mid])
+			start = mid;
+		else
+			end = mid;
+	}
+
+	/* Do a final range check. */
+	rule = &info->rules[start];
+	if (off >= offsets[start] && off < (offsets[start] + rule->size))
+		return rule;
+
+	return NULL;
+}
+
+struct dwarf_rule *dwarf_lookup(unsigned long pc)
+{
+	/*
+	 * Currently, only looks up vmlinux. Support for modules will be
+	 * added later.
+	 */
+	return dwarf_lookup_rule(vmlinux_dwarf_info, pc);
+}
+
+static int __init dwarf_init_feature(void)
+{
+	struct dwarf_rule	*rules;
+	unsigned long		*pcs;
+	int			nrules, npcs;
+
+	rules = (struct dwarf_rule *) __dwarf_rules_start;
+	nrules = (__dwarf_rules_end - __dwarf_rules_start) / sizeof(*rules);
+	if (!nrules)
+		return -EINVAL;
+
+	pcs = (unsigned long *) __dwarf_pcs_start;
+	npcs = (__dwarf_pcs_end - __dwarf_pcs_start) / sizeof(*pcs);
+	if (npcs != nrules)
+		return -EINVAL;
+
+	vmlinux_dwarf_info = dwarf_alloc(rules, nrules, pcs);
+
+	return vmlinux_dwarf_info ? 0 : -EINVAL;
+}
+early_initcall(dwarf_init_feature);
diff --git a/tools/include/linux/dwarf.h b/tools/include/linux/dwarf.h
index 16e9dd8c60c8..3df15e79003c 100644
--- a/tools/include/linux/dwarf.h
+++ b/tools/include/linux/dwarf.h
@@ -40,4 +40,25 @@ struct dwarf_rule {
 	short		fp_offset;
 };
 
+/*
+ * The whole text area is divided into fixed-size blocks, and each rule is
+ * mapped to the block that contains its starting PC. To look up an
+ * instruction address, the block containing the address is located first,
+ * and a binary search is then performed over just the rules in that block.
+ * This minimizes the number of rules considered in the search.
+ */
+struct dwarf_block {
+	int		first_rule;
+	int		last_rule;
+};
+
+#ifdef CONFIG_DWARF_FP
+extern struct dwarf_rule	*dwarf_lookup(unsigned long pc);
+#else
+static inline struct dwarf_rule *dwarf_lookup(unsigned long pc)
+{
+	return NULL;
+}
+#endif
+
 #endif /* _LINUX_DWARF_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 5/9] dwarf: Implement DWARF support for modules
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (3 preceding siblings ...)
  2022-04-07 20:25   ` [RFC PATCH v1 4/9] dwarf: Implement DWARF rule processing in the kernel madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 6/9] arm64: unwinder: Add a reliability check in the unwinder based on DWARF CFI madvenka
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

When a module is loaded, allocate and initialize its struct dwarf_info. When
a module is unloaded, free the same.

Add code in dwarf_lookup() to look up a given address in modules, if vmlinux
does not contain the address.
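
The resulting lookup order in dwarf_lookup() is roughly:

	rule = dwarf_lookup_rule(vmlinux_dwarf_info, pc);
	if (!rule)
		rule = dwarf_module_lookup_rule(pc);	/* via __module_address() */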

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 include/linux/dwarf.h       | 18 ++++++++++
 include/linux/module.h      |  3 ++
 kernel/dwarf_fp.c           | 71 ++++++++++++++++++++++++++++++++++---
 kernel/module.c             | 31 ++++++++++++++++
 tools/include/linux/dwarf.h | 18 ++++++++++
 5 files changed, 136 insertions(+), 5 deletions(-)

diff --git a/include/linux/dwarf.h b/include/linux/dwarf.h
index 3df15e79003c..aa44a414b0b6 100644
--- a/include/linux/dwarf.h
+++ b/include/linux/dwarf.h
@@ -11,6 +11,7 @@
 #define _LINUX_DWARF_H
 
 #include <linux/types.h>
+#include <linux/module.h>
 
 /*
  * objtool generates two special sections that contain DWARF information that
@@ -54,11 +55,28 @@ struct dwarf_block {
 
 #ifdef CONFIG_DWARF_FP
 extern struct dwarf_rule	*dwarf_lookup(unsigned long pc);
+#ifdef CONFIG_MODULES
+extern void dwarf_module_alloc(struct module *mod,
+			       struct dwarf_rule *rules, size_t rules_size,
+			       unsigned long *pcs, size_t pcs_size);
+extern void dwarf_module_free(struct module *mod);
+#endif
 #else
 static inline struct dwarf_rule *dwarf_lookup(unsigned long pc)
 {
 	return NULL;
 }
+#ifdef CONFIG_MODULES
+static inline void dwarf_module_alloc(struct module *mod,
+					  struct dwarf_rule *rules,
+					  size_t rules_size,
+					  unsigned long *pcs, size_t pcs_size)
+{
+}
+static inline void dwarf_module_free(struct module *mod)
+{
+}
+#endif
 #endif
 
 #endif /* _LINUX_DWARF_H */
diff --git a/include/linux/module.h b/include/linux/module.h
index c9f1200b2312..bd7c69b82808 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -538,6 +538,9 @@ struct module {
 	struct error_injection_entry *ei_funcs;
 	unsigned int num_ei_funcs;
 #endif
+#ifdef CONFIG_DWARF_FP
+	void *dwarf_info;
+#endif
 } ____cacheline_aligned __randomize_layout;
 #ifndef MODULE_ARCH_INIT
 #define MODULE_ARCH_INIT {}
diff --git a/kernel/dwarf_fp.c b/kernel/dwarf_fp.c
index bb14fbe3f3e1..07d647e828cd 100644
--- a/kernel/dwarf_fp.c
+++ b/kernel/dwarf_fp.c
@@ -164,6 +164,44 @@ static struct dwarf_info *dwarf_alloc(struct dwarf_rule *rules, int nrules,
 	return NULL;
 }
 
+#ifdef CONFIG_MODULES
+
+/*
+ * Errors encountered in this function are deliberately not fatal. They only
+ * mean that stack traces through the module will be considered unreliable.
+ */
+void dwarf_module_alloc(struct module *mod,
+			struct dwarf_rule *rules, size_t rules_size,
+			unsigned long *pcs, size_t pcs_size)
+{
+	int		nrules, npcs;
+
+	mod->dwarf_info = NULL;
+
+	nrules = rules_size / sizeof(*rules);
+	npcs = pcs_size / sizeof(*pcs);
+	if (!nrules || npcs != nrules)
+		return;
+
+	mod->dwarf_info = dwarf_alloc(rules, nrules, pcs);
+}
+
+void dwarf_module_free(struct module *mod)
+{
+	struct dwarf_info	*info;
+
+	info = mod->dwarf_info;
+	mod->dwarf_info = NULL;
+
+	if (info) {
+		kfree(info->blocks);
+		kfree(info->offsets);
+		kfree(info);
+	}
+}
+
+#endif
+
 static struct dwarf_rule *dwarf_lookup_rule(struct dwarf_info *info,
 					    unsigned long pc)
 {
@@ -212,13 +250,36 @@ static struct dwarf_rule *dwarf_lookup_rule(struct dwarf_info *info,
 	return NULL;
 }
 
+#ifdef CONFIG_MODULES
+
+static struct dwarf_rule *dwarf_module_lookup_rule(unsigned long pc)
+{
+	struct module	*mod;
+
+	mod = __module_address(pc);
+	if (!mod || !mod->dwarf_info)
+		return NULL;
+
+	return dwarf_lookup_rule(mod->dwarf_info, pc);
+}
+
+#else
+
+static struct dwarf_rule *dwarf_module_lookup_rule(unsigned long pc)
+{
+	return NULL;
+}
+
+#endif
+
 struct dwarf_rule *dwarf_lookup(unsigned long pc)
 {
-	/*
-	 * Currently, only looks up vmlinux. Support for modules will be
-	 * added later.
-	 */
-	return dwarf_lookup_rule(vmlinux_dwarf_info, pc);
+	struct dwarf_rule	*rule;
+
+	rule = dwarf_lookup_rule(vmlinux_dwarf_info, pc);
+	if (!rule)
+		rule = dwarf_module_lookup_rule(pc);
+	return rule;
 }
 
 static int __init dwarf_init_feature(void)
diff --git a/kernel/module.c b/kernel/module.c
index 84a9141a5e15..d9b73995b70a 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -59,6 +59,7 @@
 #include <linux/audit.h>
 #include <uapi/linux/module.h>
 #include "module-internal.h"
+#include <linux/dwarf.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/module.h>
@@ -2153,6 +2154,7 @@ void __weak module_arch_freeing_init(struct module *mod)
 }
 
 static void cfi_cleanup(struct module *mod);
+static void module_dwarf_free(struct module *mod);
 
 /* Free a module, remove from lists, etc. */
 static void free_module(struct module *mod)
@@ -2175,6 +2177,9 @@ static void free_module(struct module *mod)
 	/* Arch-specific cleanup. */
 	module_arch_cleanup(mod);
 
+	/* Dwarf cleanup. */
+	module_dwarf_free(mod);
+
 	/* Module unload stuff */
 	module_unload_free(mod);
 
@@ -3946,6 +3951,7 @@ static int unknown_module_param_cb(char *param, char *val, const char *modname,
 }
 
 static void cfi_init(struct module *mod);
+static void module_dwarf_init(struct module *mod, struct load_info *info);
 
 /*
  * Allocate and load the module: note that size of section 0 is always
@@ -4074,6 +4080,8 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	if (err < 0)
 		goto free_modinfo;
 
+	module_dwarf_init(mod, info);
+
 	flush_module_icache(mod);
 
 	/* Setup CFI for the module. */
@@ -4154,6 +4162,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	kfree(mod->args);
  free_arch_cleanup:
 	cfi_cleanup(mod);
+	module_dwarf_free(mod);
 	module_arch_cleanup(mod);
  free_modinfo:
 	free_modinfo(mod);
@@ -4542,6 +4551,28 @@ static void cfi_cleanup(struct module *mod)
 #endif
 }
 
+static void module_dwarf_init(struct module *mod, struct load_info *info)
+{
+	unsigned int rules_idx = find_sec(info, ".dwarf_rules");
+	unsigned int pcs_idx = find_sec(info, ".dwarf_pcs");
+	Elf_Shdr *dwarf_rules, *dwarf_pcs;
+
+	/* find_sec() returns 0 when a section is not present. */
+	if (!rules_idx || !pcs_idx)
+		return;
+
+	dwarf_rules = &info->sechdrs[rules_idx];
+	dwarf_pcs = &info->sechdrs[pcs_idx];
+	dwarf_module_alloc(mod,
+			   (void *) dwarf_rules->sh_addr, dwarf_rules->sh_size,
+			   (void *) dwarf_pcs->sh_addr, dwarf_pcs->sh_size);
+}
+
+static void module_dwarf_free(struct module *mod)
+{
+	dwarf_module_free(mod);
+}
+
 /* Maximum number of characters written by module_flags() */
 #define MODULE_FLAGS_BUF_SIZE (TAINT_FLAGS_COUNT + 4)
 
diff --git a/tools/include/linux/dwarf.h b/tools/include/linux/dwarf.h
index 3df15e79003c..aa44a414b0b6 100644
--- a/tools/include/linux/dwarf.h
+++ b/tools/include/linux/dwarf.h
@@ -11,6 +11,7 @@
 #define _LINUX_DWARF_H
 
 #include <linux/types.h>
+#include <linux/module.h>
 
 /*
  * objtool generates two special sections that contain DWARF information that
@@ -54,11 +55,28 @@ struct dwarf_block {
 
 #ifdef CONFIG_DWARF_FP
 extern struct dwarf_rule	*dwarf_lookup(unsigned long pc);
+#ifdef CONFIG_MODULES
+extern void dwarf_module_alloc(struct module *mod,
+			       struct dwarf_rule *rules, size_t rules_size,
+			       unsigned long *pcs, size_t pcs_size);
+extern void dwarf_module_free(struct module *mod);
+#endif
 #else
 static inline struct dwarf_rule *dwarf_lookup(unsigned long pc)
 {
 	return NULL;
 }
+#ifdef CONFIG_MODULES
+static inline void dwarf_module_alloc(struct module *mod,
+					  struct dwarf_rule *rules,
+					  size_t rules_size,
+					  unsigned long *pcs, size_t pcs_size)
+{
+}
+static inline void dwarf_module_free(struct module *mod)
+{
+}
+#endif
 #endif
 
 #endif /* _LINUX_DWARF_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 6/9] arm64: unwinder: Add a reliability check in the unwinder based on DWARF CFI
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (4 preceding siblings ...)
  2022-04-07 20:25   ` [RFC PATCH v1 5/9] dwarf: Implement DWARF support for modules madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 7/9] arm64: dwarf: Implement unwind hints madvenka
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Introduce a reliability flag in struct stackframe. This will be set to false
if the PC does not have valid DWARF rules or if the frame pointer computed
from the DWARF rules does not match the actual frame pointer.

Now that the unwinder can validate the frame pointer, introduce
arch_stack_walk_reliable().
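
A minimal sketch of a hypothetical caller (the callback, cookie and helper
below are placeholders, not part of this patch):

	static bool count_frame(void *cookie, unsigned long pc)
	{
		(*(unsigned int *)cookie)++;
		return true;
	}

	static int check_task_stack(struct task_struct *task)
	{
		unsigned int depth = 0;

		/* A non-zero return means the trace must not be trusted. */
		if (arch_stack_walk_reliable(count_frame, &depth, task))
			return -EINVAL;

		return depth;
	}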

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/include/asm/stacktrace.h |  6 +++
 arch/arm64/kernel/stacktrace.c      | 69 +++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 6564a01cc085..93adee4219ed 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -5,6 +5,7 @@
 #ifndef __ASM_STACKTRACE_H
 #define __ASM_STACKTRACE_H
 
+#include <linux/dwarf.h>
 #include <linux/percpu.h>
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
@@ -35,6 +36,7 @@ struct stack_info {
  * A snapshot of a frame record or fp/lr register values, along with some
  * accounting information necessary for robust unwinding.
  *
+ * @sp:          The sp value (CFA) at the call site of the current function.
  * @fp:          The fp value in the frame record (or the real fp)
  * @pc:          The lr value in the frame record (or the real lr)
  *
@@ -47,8 +49,11 @@ struct stack_info {
  * @prev_type:   The type of stack this frame record was on, or a synthetic
  *               value of STACK_TYPE_UNKNOWN. This is used to detect a
  *               transition from one stack to another.
+ *
+ * @reliable     Stack trace is reliable.
  */
 struct stackframe {
+	unsigned long sp;
 	unsigned long fp;
 	unsigned long pc;
 	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
@@ -57,6 +62,7 @@ struct stackframe {
 #ifdef CONFIG_KRETPROBES
 	struct llist_node *kr_cur;
 #endif
+	bool reliable;
 };
 
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 94f83cd44e50..f9ef7a3e7296 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2012 ARM Ltd.
  */
 #include <linux/kernel.h>
+#include <linux/dwarf.h>
 #include <linux/export.h>
 #include <linux/ftrace.h>
 #include <linux/kprobes.h>
@@ -36,8 +37,22 @@
 void start_backtrace(struct stackframe *frame, unsigned long fp,
 		     unsigned long pc)
 {
+	struct dwarf_rule *rule;
+
+	frame->reliable = true;
 	frame->fp = fp;
 	frame->pc = pc;
+	frame->sp = 0;
+	/*
+	 * Lookup the dwarf rule for PC. If it exists, initialize the SP
+	 * based on the frame pointer passed in.
+	 */
+	rule = dwarf_lookup(pc);
+	if (rule)
+		frame->sp = fp - rule->fp_offset;
+	else
+		frame->reliable = false;
+
 #ifdef CONFIG_KRETPROBES
 	frame->kr_cur = NULL;
 #endif
@@ -67,6 +82,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 {
 	unsigned long fp = frame->fp;
 	struct stack_info info;
+	struct dwarf_rule *rule;
+	unsigned long lookup_pc;
 
 	if (!tsk)
 		tsk = current;
@@ -137,6 +154,32 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 		frame->pc = kretprobe_find_ret_addr(tsk, (void *)frame->fp, &frame->kr_cur);
 #endif
 
+	/*
+	 * If it is the last frame, no need to check dwarf.
+	 */
+	if (frame->fp == (unsigned long)task_pt_regs(tsk)->stackframe)
+		return 0;
+
+	if (!frame->reliable) {
+		/*
+		 * The sp value cannot be reliably computed anymore because a
+		 * previous frame was unreliable.
+		 */
+		return 0;
+	}
+	lookup_pc = frame->pc;
+
+	rule = dwarf_lookup(lookup_pc);
+	if (!rule) {
+		frame->reliable = false;
+		return 0;
+	}
+
+	frame->sp += rule->sp_offset;
+	if (frame->fp != (frame->sp + rule->fp_offset)) {
+		frame->reliable = false;
+		return 0;
+	}
 	return 0;
 }
 NOKPROBE_SYMBOL(unwind_frame);
@@ -242,4 +285,30 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
 	walk_stackframe(task, &frame, consume_entry, cookie);
 }
 
+noinline int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
+			      void *cookie, struct task_struct *task)
+{
+	struct stackframe frame;
+	int ret = 0;
+
+	if (task == current) {
+		start_backtrace(&frame,
+				(unsigned long)__builtin_frame_address(1),
+				(unsigned long)__builtin_return_address(0));
+	} else {
+		start_backtrace(&frame, thread_saved_fp(task),
+				thread_saved_pc(task));
+	}
+
+	while (!ret) {
+		if (!frame.reliable)
+			return -EINVAL;
+		if (!consume_entry(cookie, frame.pc))
+			return -EINVAL;
+		ret = unwind_frame(task, &frame);
+	}
+
+	return ret == -ENOENT ? 0 : -EINVAL;
+}
+
 #endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 7/9] arm64: dwarf: Implement unwind hints
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (5 preceding siblings ...)
  2022-04-07 20:25   ` [RFC PATCH v1 6/9] arm64: unwinder: Add a reliability check in the unwinder based on DWARF CFI madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 8/9] dwarf: Miscellaneous changes required for enabling livepatch madvenka
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

DWARF information is not generated for assembly functions. However, we
would like to unwind through some of these functions as they may occur
frequently in a stacktrace. Implement unwind hints for this purpose.

An unwind hint placed after a desired call instruction creates a
DWARF rule for that instruction address in a special section called
".discard.unwind_hints". objtool merges these DWARF rules along with
the ones generated by the compiler for C functions.
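
For reference, each hint emits one record with the layout below (added to
include/linux/dwarf.h by this patch); objtool converts these records into
regular dwarf_rule entries with the new "hint" bit set:

	struct dwarf_unwind_hint {
		unsigned long	pc;		/* hint location, resolved via relocation */
		unsigned int	size;		/* size of the covered code range */
		short		sp_offset;
		short		fp_offset;
	};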

Add unwind hints for the following cases:

Exception handlers
==================

When an exception is taken in the kernel, an exception frame gets
pushed on the stack. To be able to unwind through the special frame,
place an unwind hint in the exception handler macro.

Interrupt handlers
==================

When an interrupt happens, the exception handler switches to an IRQ
stack and calls the interrupt handler. Place an unwind hint after
the call to the interrupt handler so the unwinder can unwind through
the switched stacks.

FTrace stub
===========

ftrace_common() calls ftrace_stub(). The stub is patched with tracer
functions when tracing is enabled. Place an unwind hint right after the
call to the stub function so that stack traces taken from within a
tracer function can correctly unwind to the caller.

FTrace Graph Caller
===================

ftrace_graph_caller() calls a function, prepare_ftrace_return(), to prepare
for graph tracing. Place an unwind hint after the call so a stack trace
taken from within the prepare function can correctly unwind to the caller.

FTrace callsite
===============

ftrace_regs_entry() sets up two stackframes - one for the callsite and
one for the ftrace entry code. Unwind hints have been placed for the
ftrace entry code above. We need an unwind hint for the callsite. Callsites
are numerous. But the unwind hint required for all the callsites is the
same. Define a dummy function with the callsite unwind hint like this:

SYM_CODE_START(ftrace_callsite)
        dwarf_unwind_hint 4, 16, -16            // for the callsite
        ret
SYM_CODE_END(ftrace_callsite)

When the unwinder comes across an ftrace entry, it will change the PC
that it needs to lookup to ftrace_callsite() to obtain the unwind
hint for the callsite like this:

#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
	if (is_ftrace_entry(frame->prev_pc))
		lookup_pc = (unsigned long)&ftrace_callsite;
#endif

This way, the unwinder can unwind through an ftrace callsite.

Kretprobe Trampoline
====================

This trampoline sets up pt_regs on the stack and sets up a synthetic
frame in the pt_regs. Place an unwind hint where trampoline_probe_handler()
returns to __kretprobe_trampoline to unwind through the synthetic frame.

is_ftrace_entry()
=================

The code for this function has been borrowed from Suraj Jitindar Singh.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/arm64/include/asm/stacktrace.h           |  3 +
 arch/arm64/include/asm/unwind_hints.h         | 28 ++++++++
 arch/arm64/kernel/entry-ftrace.S              | 23 ++++++
 arch/arm64/kernel/entry.S                     |  3 +
 arch/arm64/kernel/ftrace.c                    | 16 +++++
 arch/arm64/kernel/probes/kprobes_trampoline.S |  2 +
 arch/arm64/kernel/stacktrace.c                | 70 +++++++++++++++++--
 include/linux/dwarf.h                         | 10 ++-
 include/linux/ftrace.h                        |  4 ++
 tools/include/linux/dwarf.h                   | 10 ++-
 tools/objtool/dwarf_parse.c                   | 59 +++++++++++++++-
 tools/objtool/dwarf_rules.c                   | 65 ++++++++++++++++-
 tools/objtool/include/objtool/dwarf_def.h     | 10 +++
 13 files changed, 294 insertions(+), 9 deletions(-)
 create mode 100644 arch/arm64/include/asm/unwind_hints.h

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 93adee4219ed..90392097a768 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -46,6 +46,7 @@ struct stack_info {
  * @prev_fp:     The fp that pointed to this frame record, or a synthetic value
  *               of 0. This is used to ensure that within a stack, each
  *               subsequent frame record is at an increasing address.
+ * @prev_pc:     The pc in the previous frame.
  * @prev_type:   The type of stack this frame record was on, or a synthetic
  *               value of STACK_TYPE_UNKNOWN. This is used to detect a
  *               transition from one stack to another.
@@ -58,11 +59,13 @@ struct stackframe {
 	unsigned long pc;
 	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
 	unsigned long prev_fp;
+	unsigned long prev_pc;
 	enum stack_type prev_type;
 #ifdef CONFIG_KRETPROBES
 	struct llist_node *kr_cur;
 #endif
 	bool reliable;
+	bool synthetic_frame;
 };
 
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
diff --git a/arch/arm64/include/asm/unwind_hints.h b/arch/arm64/include/asm/unwind_hints.h
new file mode 100644
index 000000000000..b2312bfaf201
--- /dev/null
+++ b/arch/arm64/include/asm/unwind_hints.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARM64_DWARF_UNWIND_HINTS_H
+#define _ASM_ARM64_DWARF_UNWIND_HINTS_H
+
+#ifdef __ASSEMBLY__
+
+#ifdef CONFIG_DWARF_FP
+
+.macro dwarf_unwind_hint, size, sp_offset, fp_offset
+.Ldwarf_hint_pc_\@ :	.pushsection .discard.unwind_hints
+			/* struct dwarf_unwind_hint */
+			.long	0, (.Ldwarf_hint_pc_\@ - .)
+			.int	\size
+			.short	\sp_offset
+			.short	\fp_offset
+			.popsection
+.endm
+
+#else
+
+.macro dwarf_unwind_hint, size, sp_offset, fp_offset
+.endm
+
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_ARM64_DWARF_UNWIND_HINTS_H */
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 8cf970d219f5..f82dad9260ec 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -11,6 +11,7 @@
 #include <asm/assembler.h>
 #include <asm/ftrace.h>
 #include <asm/insn.h>
+#include <asm/unwind_hints.h>
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
 /*
@@ -99,7 +100,14 @@ SYM_CODE_START(ftrace_common)
 	mov	x3, sp				// regs
 
 SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
+	/*
+	 * Tracer functions are patched at ftrace_stub. Stack traces
+	 * taken from tracer functions will end up here. Place an
+	 * unwind hint based on the stackframe setup in ftrace_regs_entry.
+	 */
 	bl	ftrace_stub
+SYM_INNER_LABEL(ftrace_call_entry, SYM_L_GLOBAL)
+	dwarf_unwind_hint 4, PT_REGS_SIZE, (S_STACKFRAME - PT_REGS_SIZE)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) // ftrace_graph_caller();
@@ -138,10 +146,25 @@ SYM_CODE_START(ftrace_graph_caller)
 	add	x1, sp, #S_LR			// parent_ip (callsite's LR)
 	ldr	x2, [sp, #PT_REGS_SIZE]	   	// parent fp (callsite's FP)
 	bl	prepare_ftrace_return
+SYM_INNER_LABEL(ftrace_graph_caller_entry, SYM_L_GLOBAL)
+	dwarf_unwind_hint 4, PT_REGS_SIZE, (S_STACKFRAME - PT_REGS_SIZE)
 	b	ftrace_common_return
 SYM_CODE_END(ftrace_graph_caller)
 #endif
 
+/*
+ * ftrace_regs_entry() sets up two stackframes - one for the callsite and
+ * one for the ftrace entry code. Unwind hints have been placed for the
+ * ftrace entry code above. We need an unwind hint for the callsite. Callsites
+ * are numerous. But the unwind hint required for all the callsites is the
+ * same. Define a dummy function here with the callsite unwind hint for the
+ * benefit of the unwinder.
+ */
+SYM_CODE_START(ftrace_callsite)
+	dwarf_unwind_hint 4, 16, -16			// for the callsite
+	ret
+SYM_CODE_END(ftrace_callsite)
+
 #else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 
 /*
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 2f69ae43941d..a188e3a3068d 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -28,6 +28,7 @@
 #include <asm/thread_info.h>
 #include <asm/asm-uaccess.h>
 #include <asm/unistd.h>
+#include <asm/unwind_hints.h>
 
 	.macro	clear_gp_regs
 	.irp	n,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
@@ -551,6 +552,7 @@ SYM_CODE_START_LOCAL(el\el\ht\()_\regsize\()_\label)
 	.if \el == 0
 	b	ret_to_user
 	.else
+	dwarf_unwind_hint 4, PT_REGS_SIZE, (S_STACKFRAME - PT_REGS_SIZE)
 	b	ret_to_kernel
 	.endif
 SYM_CODE_END(el\el\ht\()_\regsize\()_\label)
@@ -783,6 +785,7 @@ SYM_FUNC_START(call_on_irq_stack)
 	/* Move to the new stack and call the function there */
 	mov	sp, x16
 	blr	x1
+	dwarf_unwind_hint 4, 0, -16
 
 	/*
 	 * Restore the SP from the FP, and restore the FP and LR from the frame
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 4506c4a90ac1..ec9a00d714e5 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -299,3 +299,19 @@ int ftrace_disable_ftrace_graph_caller(void)
 }
 #endif /* CONFIG_DYNAMIC_FTRACE */
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+
+bool is_ftrace_entry(unsigned long pc)
+{
+	if (pc == (unsigned long)&ftrace_call_entry)
+		return true;
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	if (pc == (unsigned long)&ftrace_graph_caller_entry)
+		return true;
+#endif
+	return false;
+}
+
+#endif
diff --git a/arch/arm64/kernel/probes/kprobes_trampoline.S b/arch/arm64/kernel/probes/kprobes_trampoline.S
index 9a6499bed58b..325932b49857 100644
--- a/arch/arm64/kernel/probes/kprobes_trampoline.S
+++ b/arch/arm64/kernel/probes/kprobes_trampoline.S
@@ -6,6 +6,7 @@
 #include <linux/linkage.h>
 #include <asm/asm-offsets.h>
 #include <asm/assembler.h>
+#include <asm/unwind_hints.h>
 
 	.text
 
@@ -71,6 +72,7 @@ SYM_CODE_START(__kretprobe_trampoline)
 
 	mov x0, sp
 	bl trampoline_probe_handler
+	dwarf_unwind_hint 4, PT_REGS_SIZE, (S_FP - PT_REGS_SIZE)
 	/*
 	 * Replace trampoline address in lr with actual orig_ret_addr return
 	 * address.
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index f9ef7a3e7296..e1a9b695b6ae 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -19,6 +19,16 @@
 #include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 
+static inline bool is_synthetic_frame(struct dwarf_rule *rule)
+{
+	short regs_size = sizeof(struct pt_regs);
+	short frame_offset = offsetof(struct pt_regs, stackframe);
+
+	return rule->hint &&
+	       rule->sp_offset == regs_size &&
+	       rule->fp_offset == (frame_offset - regs_size);
+}
+
 /*
  * AArch64 PCS assigns the frame pointer to x29.
  *
@@ -40,6 +50,7 @@ void start_backtrace(struct stackframe *frame, unsigned long fp,
 	struct dwarf_rule *rule;
 
 	frame->reliable = true;
+	frame->synthetic_frame = false;
 	frame->fp = fp;
 	frame->pc = pc;
 	frame->sp = 0;
@@ -48,10 +59,12 @@ void start_backtrace(struct stackframe *frame, unsigned long fp,
 	 * based on the frame pointer passed in.
 	 */
 	rule = dwarf_lookup(pc);
-	if (rule)
+	if (rule) {
 		frame->sp = fp - rule->fp_offset;
-	else
+		frame->synthetic_frame = is_synthetic_frame(rule);
+	} else {
 		frame->reliable = false;
+	}
 
 #ifdef CONFIG_KRETPROBES
 	frame->kr_cur = NULL;
@@ -125,6 +138,7 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 	 * Record this frame record's values and location. The prev_fp and
 	 * prev_type are only meaningful to the next unwind_frame() invocation.
 	 */
+	frame->prev_pc = frame->pc;
 	frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
 	frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
 	frame->prev_fp = fp;
@@ -168,11 +182,59 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 		return 0;
 	}
 	lookup_pc = frame->pc;
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+	if (is_ftrace_entry(frame->prev_pc))
+		lookup_pc = (unsigned long)&ftrace_callsite;
+#endif
 
 	rule = dwarf_lookup(lookup_pc);
 	if (!rule) {
-		frame->reliable = false;
-		return 0;
+		if (!frame->synthetic_frame) {
+			/*
+			 * If the last instruction in a function happens to be
+			 * a call instruction, the return address would fall
+			 * outside of the function. This could be such a case.
+			 * This can happen, for instance, if the called function
+			 * is a "noreturn" function. The compiler can optimize
+			 * away the instructions after the call. So, adjust the
+			 * PC so it falls inside the function and retry.
+			 *
+			 * However, if the previous frame was a synthetic frame
+			 * (e.g., interrupt/exception), the return PC in the
+			 * synthetic frame may not be the location after a call
+			 * instruction at all. In such cases, we don't want to
+			 * adjust the PC and retry.
+			 *
+			 * If this succeeds, adjust the frame pc below.
+			 */
+			lookup_pc -= 4;
+			rule = dwarf_lookup(lookup_pc);
+		}
+		if (!rule) {
+			frame->reliable = false;
+			return 0;
+		}
+		frame->pc = lookup_pc;
+	}
+	frame->synthetic_frame = false;
+
+	if (rule->hint) {
+		if (!rule->sp_offset && !rule->fp_offset) {
+			frame->reliable = false;
+			return 0;
+		}
+		frame->synthetic_frame = is_synthetic_frame(rule);
+		if (!rule->sp_offset) {
+			/*
+			 * This is the unwind hint in call_on_irq_stack(). The
+			 * SP at this point is in the IRQ stack. The CFA and
+			 * the FP are on the normal stack. To compute the CFA,
+			 * rely on the unwind hint, assume that the FP is good
+			 * and just compute the CFA from it.
+			 */
+			frame->sp = frame->fp - rule->fp_offset;
+			return 0;
+		}
 	}
 
 	frame->sp += rule->sp_offset;
diff --git a/include/linux/dwarf.h b/include/linux/dwarf.h
index aa44a414b0b6..fea42feb48a4 100644
--- a/include/linux/dwarf.h
+++ b/include/linux/dwarf.h
@@ -34,9 +34,17 @@
  *	This contains an array of starting PCs, one for each rule.
  */
 struct dwarf_rule {
-	unsigned int	size:30;
+	unsigned int	size:29;
 	unsigned int	sp_saved:1;
 	unsigned int	fp_saved:1;
+	unsigned int	hint:1;
+	short		sp_offset;
+	short		fp_offset;
+};
+
+struct dwarf_unwind_hint {
+	unsigned long	pc;
+	unsigned int	size;
 	short		sp_offset;
 	short		fp_offset;
 };
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 9999e29187de..dbcd95053425 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -604,6 +604,10 @@ extern int ftrace_update_ftrace_func(ftrace_func_t func);
 extern void ftrace_caller(void);
 extern void ftrace_regs_caller(void);
 extern void ftrace_call(void);
+extern void ftrace_call_entry(void);
+extern void ftrace_graph_caller_entry(void);
+extern void ftrace_callsite(void);
+extern bool is_ftrace_entry(unsigned long pc);
 extern void ftrace_regs_call(void);
 extern void mcount_call(void);
 
diff --git a/tools/include/linux/dwarf.h b/tools/include/linux/dwarf.h
index aa44a414b0b6..fea42feb48a4 100644
--- a/tools/include/linux/dwarf.h
+++ b/tools/include/linux/dwarf.h
@@ -34,9 +34,17 @@
  *	This contains an array of starting PCs, one for each rule.
  */
 struct dwarf_rule {
-	unsigned int	size:30;
+	unsigned int	size:29;
 	unsigned int	sp_saved:1;
 	unsigned int	fp_saved:1;
+	unsigned int	hint:1;
+	short		sp_offset;
+	short		fp_offset;
+};
+
+struct dwarf_unwind_hint {
+	unsigned long	pc;
+	unsigned int	size;
 	short		sp_offset;
 	short		fp_offset;
 };
diff --git a/tools/objtool/dwarf_parse.c b/tools/objtool/dwarf_parse.c
index d5ac5630fbba..7a5af7d5c2be 100644
--- a/tools/objtool/dwarf_parse.c
+++ b/tools/objtool/dwarf_parse.c
@@ -18,6 +18,10 @@
 struct objtool_file		*dwarf_file;
 struct section			*debug_frame;
 
+struct section			*unwind_hints;
+struct section			*unwind_hints_reloc;
+static int			nhints;
+
 struct cie			*cies, *cur_cie;
 struct fde			*fdes, *cur_fde;
 
@@ -31,6 +35,8 @@ static unsigned int		offset_size;
 static u64			entry_length;
 static unsigned char		*saved_start;
 
+static int	dwarf_add_hints(void);
+
 /*
  * Parse and create a new CIE.
  */
@@ -270,7 +276,7 @@ int dwarf_parse(struct objtool_file *file)
 	 */
 	debug_frame = find_section_by_name(file->elf, ".debug_frame");
 	if (!debug_frame)
-		return 0;
+		goto hints;
 
 	dwarf_alloc_init();
 
@@ -290,5 +296,56 @@ int dwarf_parse(struct objtool_file *file)
 	 * Run all the DWARF instructions in the CIEs and FDEs.
 	 */
 	dwarf_parse_instructions();
+hints:
+	unwind_hints = find_section_by_name(file->elf, ".discard.unwind_hints");
+	if (!unwind_hints)
+		return 0;
+
+	unwind_hints_reloc = unwind_hints->reloc;
+
+	if (unwind_hints->sh.sh_size % sizeof(struct dwarf_unwind_hint)) {
+		WARN("struct dwarf_unwind_hint size mismatch");
+		return -1;
+	}
+	nhints = unwind_hints->sh.sh_size / sizeof(struct dwarf_unwind_hint);
+
+	return dwarf_add_hints();
+}
+
+/*
+ * Errors in this function are non-fatal. The worst case is that the
+ * unwind hints will not be part of the dwarf rules. That is all.
+ */
+static int dwarf_add_hints(void)
+{
+	struct dwarf_unwind_hint	*hints, *hint;
+	struct reloc			*reloc;
+	int				i;
+	struct section			*sec;
+
+	if (!nhints)
+		return 0;
+
+	hints = (struct dwarf_unwind_hint *) unwind_hints->data->d_buf;
+	for (i = 0; i < nhints; i++) {
+		hint = &hints[i];
+		if (unwind_hints_reloc) {
+			reloc = find_reloc_by_dest_range(dwarf_file->elf,
+							 unwind_hints,
+							 i * sizeof(*hint),
+							 sizeof(*hint));
+			if (!reloc) {
+				WARN("can't find reloc for hint %d", i);
+				return 0;
+			}
+			hint->pc = reloc->addend;
+			sec = reloc->sym->sec;
+		} else {
+			sec = find_section_by_name(dwarf_file->elf, ".text");
+			if (!sec)
+				return 0;
+		}
+		dwarf_hint_add(sec, hint);
+	}
 	return 0;
 }
diff --git a/tools/objtool/dwarf_rules.c b/tools/objtool/dwarf_rules.c
index a118b392aac8..632776463c23 100644
--- a/tools/objtool/dwarf_rules.c
+++ b/tools/objtool/dwarf_rules.c
@@ -19,6 +19,9 @@ struct section			*dwarf_pcs_sec;
 static struct fde_entry		*cur_entry;
 static int			nentries;
 
+static struct hint_entry	*hint_list;
+static int			nhints;
+
 static int dwarf_rule_insert(struct fde *fde, unsigned long addr,
 			     struct rule *sp_rule, struct rule *fp_rule);
 
@@ -51,6 +54,25 @@ int dwarf_rule_add(struct fde *fde, unsigned long addr,
 	return dwarf_rule_insert(fde, addr, sp_rule, fp_rule);
 }
 
+int dwarf_hint_add(struct section *section, struct dwarf_unwind_hint *hint)
+{
+	struct hint_entry	*entry;
+
+	entry = dwarf_alloc(sizeof(*entry));
+	if (!entry)
+		return -1;
+
+	/* Add the entry to the hints list. */
+	entry->next = hint_list;
+	hint_list = entry;
+
+	entry->section = section;
+	entry->hint = hint;
+	nhints++;
+
+	return 0;
+}
+
 void dwarf_rule_next(struct fde *fde, unsigned long addr)
 {
 	if (cur_entry) {
@@ -119,6 +141,7 @@ static int dwarf_rule_write(struct elf *elf, struct fde *fde,
 	rule.size = entry->size;
 	rule.sp_saved = entry->sp_rule.saved;
 	rule.fp_saved = entry->fp_rule.saved;
+	rule.hint = 0;
 	rule.sp_offset = entry->sp_rule.offset;
 	rule.fp_offset = entry->fp_rule.offset;
 
@@ -135,11 +158,42 @@ static int dwarf_rule_write(struct elf *elf, struct fde *fde,
 	return 0;
 }
 
+static int dwarf_hint_write(struct elf *elf, struct hint_entry *hentry,
+			    unsigned int index)
+{
+	struct dwarf_rule		rule, *drule;
+	struct dwarf_unwind_hint	*hint = hentry->hint;
+
+	/*
+	 * Encode the SP and FP rules from the entry into a single dwarf_rule
+	 * for the kernel's benefit. Copy it into .dwarf_rules.
+	 */
+	rule.size = hint->size;
+	rule.sp_saved = 0;
+	rule.fp_saved = 1;
+	rule.hint = 1;
+	rule.sp_offset = hint->sp_offset;
+	rule.fp_offset = hint->fp_offset;
+
+	drule = (struct dwarf_rule *) dwarf_rules_sec->data->d_buf + index;
+	memcpy(drule, &rule, sizeof(rule));
+
+	/* Add relocation information for the code range. */
+	if (elf_add_reloc_to_insn(elf, dwarf_pcs_sec,
+				  index * sizeof(unsigned long),
+				  R_AARCH64_ABS64,
+				  hentry->section, hint->pc)) {
+		return -1;
+	}
+	return 0;
+}
+
 int dwarf_write(struct objtool_file *file)
 {
 	struct elf		*elf = file->elf;
 	struct fde		*fde;
 	struct fde_entry	*entry;
+	struct hint_entry	*hentry;
 	int			index;
 
 	/*
@@ -154,7 +208,7 @@ int dwarf_write(struct objtool_file *file)
 	/* Create .dwarf_rules. */
 	dwarf_rules_sec = elf_create_section(elf, ".dwarf_rules", 0,
 					     sizeof(struct dwarf_rule),
-					     nentries);
+					     nentries + nhints);
 	if (!dwarf_rules_sec) {
 		WARN("Unable to create .dwarf_rules");
 		return -1;
@@ -162,7 +216,8 @@ int dwarf_write(struct objtool_file *file)
 
 	/* Create .dwarf_pcs. */
 	dwarf_pcs_sec = elf_create_section(elf, ".dwarf_pcs", 0,
-					   sizeof(unsigned long), nentries);
+					   sizeof(unsigned long),
+					   nentries + nhints);
 	if (!dwarf_pcs_sec) {
 		WARN("Unable to create .dwarf_pcs");
 		return -1;
@@ -178,6 +233,12 @@ int dwarf_write(struct objtool_file *file)
 		}
 	}
 
+	for (hentry = hint_list; hentry; hentry = hentry->next) {
+		if (dwarf_hint_write(elf, hentry, index))
+			return -1;
+		index++;
+	}
+
 	return 0;
 }
 
diff --git a/tools/objtool/include/objtool/dwarf_def.h b/tools/objtool/include/objtool/dwarf_def.h
index af56ccb52fff..afa6db6c3828 100644
--- a/tools/objtool/include/objtool/dwarf_def.h
+++ b/tools/objtool/include/objtool/dwarf_def.h
@@ -304,6 +304,15 @@ struct fde {
 	struct fde_entry	*tail;
 };
 
+/*
+ * Entry for an unwind hint.
+ */
+struct hint_entry {
+	struct hint_entry		*next;
+	struct section			*section;
+	struct dwarf_unwind_hint	*hint;
+};
+
 /*
  * These are identifiers for 32-bit and 64-bit CIE entries respectively.
  * These identifiers distinguish a CIE from an FDE in .debug_frame.
@@ -429,6 +438,7 @@ extern u64		(*get_value)(unsigned char *field, unsigned int size);
 void dwarf_rule_start(struct fde *fde);
 int dwarf_rule_add(struct fde *fde, unsigned long addr,
 		   struct rule *sp_rule, struct rule *fp_rule);
+int dwarf_hint_add(struct section *section, struct dwarf_unwind_hint *hint);
 void dwarf_rule_next(struct fde *fde, unsigned long addr);
 void dwarf_rule_reset(struct fde *fde);
 int arch_dwarf_fde_reloc(struct fde *fde);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 8/9] dwarf: Miscellaneous changes required for enabling livepatch
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (6 preceding siblings ...)
  2022-04-07 20:25   ` [RFC PATCH v1 7/9] arm64: dwarf: Implement unwind hints madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-07 20:25   ` [RFC PATCH v1 9/9] dwarf: Enable livepatch for ARM64 madvenka
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

	- Create arch/arm64/include/asm/livepatch.h and define
	  klp_arch_set_pc() and klp_get_ftrace_location() which
	  are required for livepatch.

	- Define TIF_PATCH_PENDING in arch/arm64/include/asm/thread_info.h
	  for livepatch.

	- Check TIF_PATCH_PENDING in do_notify_resume() to patch the
	  current task for livepatch.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/arm64/include/asm/livepatch.h   | 42 ++++++++++++++++++++++++++++
 arch/arm64/include/asm/thread_info.h |  4 ++-
 arch/arm64/kernel/signal.c           |  4 +++
 3 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/livepatch.h

diff --git a/arch/arm64/include/asm/livepatch.h b/arch/arm64/include/asm/livepatch.h
new file mode 100644
index 000000000000..72d7cd86f158
--- /dev/null
+++ b/arch/arm64/include/asm/livepatch.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * livepatch.h - arm64-specific Kernel Live Patching Core
+ */
+#ifndef _ASM_ARM64_LIVEPATCH_H
+#define _ASM_ARM64_LIVEPATCH_H
+
+#include <linux/ftrace.h>
+
+static inline void klp_arch_set_pc(struct ftrace_regs *fregs, unsigned long ip)
+{
+	struct pt_regs *regs = ftrace_get_regs(fregs);
+
+	regs->pc = ip;
+}
+
+/*
+ * klp_get_ftrace_location is expected to return the address of the BL to the
+ * relevant ftrace handler in the callsite. The location of this can vary based
+ * on several compilation options.
+ * CONFIG_DYNAMIC_FTRACE_WITH_REGS
+ *	- Inserts 2 nops on function entry the second of which is the BL
+ *	  referenced above. (See ftrace_init_nop() for the callsite sequence)
+ *	  (this is required by livepatch and must be selected)
+ * CONFIG_ARM64_BTI_KERNEL:
+ *	- Inserts a hint #0x22 on function entry if the function is called
+ *	  indirectly (to satisfy BTI requirements), which is inserted before
+ *	  the two nops from above.
+ */
+#define klp_get_ftrace_location klp_get_ftrace_location
+static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
+{
+	unsigned long addr = faddr + AARCH64_INSN_SIZE;
+
+#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)
+	addr = ftrace_location_range(addr, addr + AARCH64_INSN_SIZE);
+#endif
+
+	return addr;
+}
+
+#endif /* _ASM_ARM64_LIVEPATCH_H */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e1317b7c4525..a1d8999dbdcc 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -68,6 +68,7 @@ int arch_dup_task_struct(struct task_struct *dst,
 #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
 #define TIF_MTE_ASYNC_FAULT	5	/* MTE Asynchronous Tag Check Fault */
 #define TIF_NOTIFY_SIGNAL	6	/* signal notifications exist */
+#define TIF_PATCH_PENDING	7	/* pending live patching update */
 #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
 #define TIF_SYSCALL_TRACEPOINT	10	/* syscall tracepoint for ftrace */
@@ -98,11 +99,12 @@ int arch_dup_task_struct(struct task_struct *dst,
 #define _TIF_SVE		(1 << TIF_SVE)
 #define _TIF_MTE_ASYNC_FAULT	(1 << TIF_MTE_ASYNC_FAULT)
 #define _TIF_NOTIFY_SIGNAL	(1 << TIF_NOTIFY_SIGNAL)
+#define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
 				 _TIF_UPROBE | _TIF_MTE_ASYNC_FAULT | \
-				 _TIF_NOTIFY_SIGNAL)
+				 _TIF_NOTIFY_SIGNAL | _TIF_PATCH_PENDING)
 
 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 8f6372b44b65..b42dffc71fb0 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -18,6 +18,7 @@
 #include <linux/sizes.h>
 #include <linux/string.h>
 #include <linux/tracehook.h>
+#include <linux/livepatch.h>
 #include <linux/ratelimit.h>
 #include <linux/syscalls.h>
 
@@ -937,6 +938,9 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags)
 					       (void __user *)NULL, current);
 			}
 
+			if (thread_flags & _TIF_PATCH_PENDING)
+				klp_update_patch_state(current);
+
 			if (thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
 				do_signal(regs);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [RFC PATCH v1 9/9] dwarf: Enable livepatch for ARM64
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (7 preceding siblings ...)
  2022-04-07 20:25   ` [RFC PATCH v1 8/9] dwarf: Miscellaneous changes required for enabling livepatch madvenka
@ 2022-04-07 20:25   ` madvenka
  2022-04-08  0:21   ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation Josh Poimboeuf
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 75+ messages in thread
From: madvenka @ 2022-04-07 20:25 UTC (permalink / raw)
  To: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Enable livepatch in arch/arm64/Kconfig.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/Kconfig | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c82a3a93297f..6cb00b3770cf 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -222,6 +222,9 @@ config ARM64
 	select THREAD_INFO_IN_TASK
 	select HAVE_STACK_VALIDATION	if DWARF_FP
 	select STACK_VALIDATION		if HAVE_STACK_VALIDATION
+	select HAVE_RELIABLE_STACKTRACE	if STACK_VALIDATION
+	select HAVE_LIVEPATCH		if HAVE_DYNAMIC_FTRACE_WITH_REGS && HAVE_RELIABLE_STACKTRACE
+
 	select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
 	select TRACE_IRQFLAGS_SUPPORT
 	help
@@ -2066,3 +2069,5 @@ source "arch/arm64/kvm/Kconfig"
 if CRYPTO
 source "arch/arm64/crypto/Kconfig"
 endif
+
+source "kernel/livepatch/Kconfig"
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (8 preceding siblings ...)
  2022-04-07 20:25   ` [RFC PATCH v1 9/9] dwarf: Enable livepatch for ARM64 madvenka
@ 2022-04-08  0:21   ` Josh Poimboeuf
  2022-04-08 11:41     ` Peter Zijlstra
                       ` (2 more replies)
  2022-04-08 10:55   ` Peter Zijlstra
  2022-04-08 12:06   ` Peter Zijlstra
  11 siblings, 3 replies; 75+ messages in thread
From: Josh Poimboeuf @ 2022-04-08  0:21 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, broonie, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
> The solution
> ============
> 
> The goal here is to use the absolute minimum CFI needed to compute the FP at
> every instruction address. The unwinder can compute the FP in each frame,
> compare the actual FP with the computed one and validate the actual FP.
> 
> Objtool is enhanced to parse the CFI, extract just the rules required,
> encode them in compact data structures and create special sections for
> the rules. The unwinder uses the special sections to find the rules for
> a given instruction address and compute the FP.
> 
> Objtool can be invoked as follows:
> 
> 	objtool dwarf generate <object-file>

Hi Madhavan,

This is quite interesting.  And it's exactly the kind of crazy idea I
can appreciate ;-)

Some initial thoughts:


1)

I have some concerns about DWARF's reliability, especially considering
a) inline asm, b) regular asm, and c) the kernel's tendency to push
compilers to their limits.

BUT, supplementing the frame pointer unwinding with DWARF, rather than
just relying on DWARF alone, does help a LOT.

I guess the hope is that cross-checking two "mostly reliable" things
against each other (frame pointers and DWARF) will give a reliable
result ;-)

In a general sense, I've never looked at DWARF's reliability, even for
just normal C code.  It would be good to have some way of knowing that
DWARF looks mostly sane for both GCC and Clang.  For example, maybe
somehow cross-checking it with objtool's knowledge.  And then of course
we'd have to hope that it stays bug-free in future compilers.

I'd also be somewhat concerned about assembly.  Since there's nothing
ensuring the unwind hints are valid, and will stay valid over time, I
wonder how likely it would be for that to break, and what the
implications would be.  Most likely I guess it would break silently, but
then get caught by the frame pointer cross-checking.  So a broken hint
might not get noticed for a long time, but at least it (hopefully)
wouldn't break reliable unwinding.

Also, inline asm can sometimes do stack hacks like
"push;do_something;pop" which isn't visible to the toolchain.  But
again, hopefully the frame pointer checking would fail and mark it
unreliable.

So I do have some worries about DWARF, but the fact that it's getting
"fact checked" by frame pointers might be sufficient.


2)

If I understand correctly, objtool is converting parts of DWARF to a new
format which can then be read by the kernel.  In that case, please don't
call it DWARF as that will cause a lot of confusion.

There are actually several similarities between your new format and ORC,
which is also an objtool-created DWARF alternative.  It would be
interesting to see if they could be combined somehow.


3)

Objtool has become an integral part of x86-64, due to security and
performance features and toolchain workarounds.

Not *all* of its features require the full "branch validation" which
follows all code paths -- and was the hardest part to get right -- but
several features *do* need that: stack validation, ORC, uaccess
validation, noinstr validation.

Objtool has been picking up a lot of steam (and features) lately, with
more features currently in active development.  And lately there have
been renewed patches for porting it to powerpc and arm64 (and rumors of
s390).

If arm64 ever wants one of those features -- particularly a "branch
validation" based feature -- I think it would make more sense to just do
the stack validation in objtool, rather than the DWARF supplementation
approach.

Just to give an idea of what objtool already supports and how useful it
has become for x86, here's an excerpt from some documentation I've been
working on, since I'm in the middle of rewriting the interface to make
it more modular.  This is a list of all its current features:


Features
--------

Objtool has the following features:


- Stack unwinding metadata validation -- useful for helping to ensure
  stack traces are reliable for live patching

- ORC unwinder metadata generation -- a faster and more precise
  alternative to frame pointer based unwinding

- Retpoline validation -- ensures that all indirect calls go through
  retpoline thunks, for Spectre v2 mitigations

- Retpoline call site annotation -- annotates all retpoline thunk call
  sites, enabling the kernel to patch them inline, to prevent "thunk
  funneling" for both security and performance reasons

- Non-instrumentation validation -- validates non-instrumentable
  ("noinstr") code rules, preventing unexpected instrumentation in
  low-level C entry code

- Static call annotation -- annotates static call sites, enabling the
  kernel to implement inline static calls, a faster alternative to some
  indirect branches

- Uaccess validation -- validates uaccess rules for a proper safe
  implementation of Supervisor Mode Access Protection (SMAP)

- Straight Line Speculation validation -- validates certain SLS
  mitigations

- Indirect Branch Tracking validation -- validates Intel CET IBT rules
  to ensure that all functions referenced by function pointers have
  corresponding ENDBR instructions

- Indirect Branch Tracking annotation -- annotates unused ENDBR
  instruction sites, enabling the kernel to "seal" them (replace them
  with NOPs) to further harden IBT

- Function entry annotation -- annotates function entries, enabling
  kernel function tracing

- Other toolchain hacks which will go unmentioned at this time...

-- 
Josh


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (9 preceding siblings ...)
  2022-04-08  0:21   ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation Josh Poimboeuf
@ 2022-04-08 10:55   ` Peter Zijlstra
  2022-04-08 11:54     ` Peter Zijlstra
  2022-04-10 17:47     ` Madhavan T. Venkataraman
  2022-04-08 12:06   ` Peter Zijlstra
  11 siblings, 2 replies; 75+ messages in thread
From: Peter Zijlstra @ 2022-04-08 10:55 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel

On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:

> [-- application/octet-stream is unsupported (use 'v' to view this part) --]

Your emails are unreadable :-(

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-08  0:21   ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation Josh Poimboeuf
@ 2022-04-08 11:41     ` Peter Zijlstra
  2022-04-11 17:26       ` Madhavan T. Venkataraman
  2022-04-11 17:18     ` Madhavan T. Venkataraman
  2022-04-14 14:11     ` Madhavan T. Venkataraman
  2 siblings, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2022-04-08 11:41 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: madvenka, mark.rutland, broonie, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel, chenzhongjin


Right; so not having seen the patches due to Madhavan's email being
broken, I can perhaps less appreciate the crazy involved.

On Thu, Apr 07, 2022 at 05:21:51PM -0700, Josh Poimboeuf wrote:
> 2)
> 
> If I understand correctly, objtool is converting parts of DWARF to a new
> format which can then be read by the kernel.  In that case, please don't
> call it DWARF as that will cause a lot of confusion.
> 
> There are actually several similarities between your new format and ORC,
> which is also an objtool-created DWARF alternative.  It would be
> interesting to see if they could be combined somehow.

What Josh said; please use/extend ORC.

I really don't understand where all this crazy is coming from; why does
objtool need to do something radically weird for ARM64?

There are existing ARM64 patches for objtool; in fact they have recently
been re-posted:

 https://lkml.kernel.org/r/20220407120141.43801-1-chenzhongjin@huawei.com

The only tricky bit seems to be the whole jump-table issue. Using DWARF
as input to deal with jump-tables should be possible -- exceedingly
overkill, but possible I suppose. Mandating DWARF sucks though, compile
times are so much worse with DWARVES on :/

Once objtool can properly follow/validate ARM64 code, it should be
fairly straight forward to have it generate ORC data just like it does
on x86_64.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-08 10:55   ` Peter Zijlstra
@ 2022-04-08 11:54     ` Peter Zijlstra
  2022-04-08 14:34       ` Josh Poimboeuf
  2022-04-10 17:47     ` Madhavan T. Venkataraman
  1 sibling, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2022-04-08 11:54 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel

On Fri, Apr 08, 2022 at 12:55:11PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
> 
> > [-- application/octet-stream is unsupported (use 'v' to view this part) --]
> 
> Your emails are unreadable :-(

List copy is OK, so perhaps it's due to how Josh bounced them..

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
                     ` (10 preceding siblings ...)
  2022-04-08 10:55   ` Peter Zijlstra
@ 2022-04-08 12:06   ` Peter Zijlstra
  2022-04-11 17:35     ` Madhavan T. Venkataraman
  11 siblings, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2022-04-08 12:06 UTC (permalink / raw)
  To: madvenka
  Cc: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel

On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
> The solution
> ============
> 
> The goal here is to use the absolute minimum CFI needed to compute the FP at
> every instruction address. The unwinder can compute the FP in each frame,
> compare the actual FP with the computed one and validate the actual FP.
> 
> Objtool is enhanced to parse the CFI, extract just the rules required,
> encode them in compact data structures and create special sections for
> the rules. The unwinder uses the special sections to find the rules for
> a given instruction address and compute the FP.
> 
> Objtool can be invoked as follows:
> 
> 	objtool dwarf generate <object-file>
> 
> The version of the DWARF standard supported in this work is version 4. The
> official documentation for this version is here:
> 
> 	https://dwarfstd.org/doc/DWARF4.pdf
> 
> Section 6.4 contains the description of the CFI.

The problem is of course that DWARF is only available for compiler-generated
code and doesn't cover assembly code, of which there is always lots.

I suppose you can go add DWARF annotations to all the assembly, but IIRC
those are pretty terrible. We were *REALLY* happy to delete all that
nasty from the x86 code.

On top of that, AFAIK compilers don't generally consider DWARF
generation to be a correctness issue. For them it's debug info and
having it be correct is nice but not required. So using it as input for
something that's required to be correct, seems unfortunate.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-08 11:54     ` Peter Zijlstra
@ 2022-04-08 14:34       ` Josh Poimboeuf
  0 siblings, 0 replies; 75+ messages in thread
From: Josh Poimboeuf @ 2022-04-08 14:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: madvenka, mark.rutland, broonie, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel

On Fri, Apr 08, 2022 at 01:54:28PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 08, 2022 at 12:55:11PM +0200, Peter Zijlstra wrote:
> > On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
> > 
> > > [-- application/octet-stream is unsupported (use 'v' to view this part) --]
> > 
> > Your emails are unreadable :-(
> 
> List copy is OK, so perhaps it's due to how Josh bounced them..

Corporate email Mimecast fail when I bounced them, sorry :-/

-- 
Josh


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-03-07 16:51       ` Madhavan T. Venkataraman
  2022-03-07 17:01         ` Mark Brown
@ 2022-04-08 14:44         ` Mark Rutland
  2022-04-08 17:58           ` Mark Rutland
                             ` (2 more replies)
  1 sibling, 3 replies; 75+ messages in thread
From: Mark Rutland @ 2022-04-08 14:44 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

On Mon, Mar 07, 2022 at 10:51:38AM -0600, Madhavan T. Venkataraman wrote:
> Hey Mark Rutland, Mark Brown,
> 
> Could you please review the rest of the patches in the series when you can?

Sorry, I was expecting a new version with some of my comments
addressed, in case that had effects on subsequent patches.

> Also, many of the patches have received a Reviewed-By from you both.
> So, after I send the next version out, can we upstream those ones?

I would very much like to upstream the ones I have given a Reviewed-by.

Given those were conditional on some adjustments (e.g. actually filling
out comments), do you mind if I pick those into a series now?

Then, once that's picked, you can rebase the rest atop, and we can
review that.

Thanks,
Mark.

> On 2/15/22 07:39, Mark Rutland wrote:
> > On Mon, Jan 17, 2022 at 08:56:03AM -0600, madvenka@linux.microsoft.com wrote:
> >> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> >>
> >> Rename the arguments to unwind() for better consistency. Also, use the
> >> typedef stack_trace_consume_fn for the consume_entry function as it is
> >> already defined in linux/stacktrace.h.
> >>
> >> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> > 
> > How about: 
> > 
> > | arm64: align with common stacktrace naming
> > |
> > | For historical reasons, the naming of parameters and their types in the arm64
> > | stacktrace code differs from that used in generic code and other
> > | architectures, even though the types are equivalent.
> > |
> > | For consistency and clarity, use the generic names.
> > 
> > Either way:
> > 
> > Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> > 
> > Mark.
> > 
> >> ---
> >>  arch/arm64/kernel/stacktrace.c | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> >> index 1b32e55735aa..f772dac78b11 100644
> >> --- a/arch/arm64/kernel/stacktrace.c
> >> +++ b/arch/arm64/kernel/stacktrace.c
> >> @@ -181,12 +181,12 @@ static int notrace unwind_next(struct unwind_state *state)
> >>  NOKPROBE_SYMBOL(unwind_next);
> >>  
> >>  static void notrace unwind(struct unwind_state *state,
> >> -			   bool (*fn)(void *, unsigned long), void *data)
> >> +			   stack_trace_consume_fn consume_entry, void *cookie)
> >>  {
> >>  	while (1) {
> >>  		int ret;
> >>  
> >> -		if (!fn(data, state->pc))
> >> +		if (!consume_entry(cookie, state->pc))
> >>  			break;
> >>  		ret = unwind_next(state);
> >>  		if (ret < 0)
> >> -- 
> >> 2.25.1
> >>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-04-08 14:44         ` Mark Rutland
@ 2022-04-08 17:58           ` Mark Rutland
  2022-04-10 17:42             ` Madhavan T. Venkataraman
  2022-04-10 17:33           ` Madhavan T. Venkataraman
  2022-04-10 17:45           ` Madhavan T. Venkataraman
  2 siblings, 1 reply; 75+ messages in thread
From: Mark Rutland @ 2022-04-08 17:58 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

On Fri, Apr 08, 2022 at 03:44:34PM +0100, Mark Rutland wrote:
> On Mon, Mar 07, 2022 at 10:51:38AM -0600, Madhavan T. Venkataraman wrote:
> > Hey Mark Rutland, Mark Brown,
> > 
> > Could you please review the rest of the patches in the series when you can?
> 
> Sorry, I was expecting a new version with some of my comments
> addressed, in case that had effects on subsequent patches.
> 
> > Also, many of the patches have received a Reviewed-By from you both.
> > So, after I send the next version out, can we upstream those ones?
> 
> I would very much like to upstream the ones I have given a Reviewed-by.
> 
> Given those were conditional on some adjustments (e.g. actually filling
> out comments), do you mind if I pick those into a series now?

FWIW, I've picked up the set I'm trivially happy with, rebased that on
v5.18-rc1, and put that on a branch with a couple of other cleanups:

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/stacktrace/cleanups

I'll send that out on Monday if there are no objections.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-04-08 14:44         ` Mark Rutland
  2022-04-08 17:58           ` Mark Rutland
@ 2022-04-10 17:33           ` Madhavan T. Venkataraman
  2022-04-10 17:45           ` Madhavan T. Venkataraman
  2 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-10 17:33 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 4/8/22 09:44, Mark Rutland wrote:
> On Mon, Mar 07, 2022 at 10:51:38AM -0600, Madhavan T. Venkataraman wrote:
>> Hey Mark Rutland, Mark Brown,
>>
>> Could you please review the rest of the patches in the series when you can?
> 
> Sorry, I was expecting a new version with some of my comments
> addressed, in case that had effects on subsequent patches.
> 

Yes. I realized that. I am actually working on the next version addressing the
comments I have received.

>> Also, many of the patches have received a Reviewed-By from you both.
>> So, after I send the next version out, can we upstream those ones?
> 
> I would very much like to upstream the ones I have given a Reviewed-by.
> 
> Given those were conditional on some adjustments (e.g. actually filling
> out comments), do you mind if I pick those into a series now?
> 
> Then, once that's picked, you can rebase the rest atop, and we can
> review that.
> 

That would be great! Thanks!

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-04-08 17:58           ` Mark Rutland
@ 2022-04-10 17:42             ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-10 17:42 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 4/8/22 12:58, Mark Rutland wrote:
> On Fri, Apr 08, 2022 at 03:44:34PM +0100, Mark Rutland wrote:
>> On Mon, Mar 07, 2022 at 10:51:38AM -0600, Madhavan T. Venkataraman wrote:
>>> Hey Mark Rutland, Mark Brown,
>>>
>>> Could you please review the rest of the patches in the series when you can?
>>
>> Sorry, I was expecting a new version with some of my comments
>> addressed, in case that had effects on subsequent patches.
>>
>>> Also, many of the patches have received a Reviewed-By from you both.
>>> So, after I send the next version out, can we upstream those ones?
>>
>> I would very much like to upstream the ones I have given a Reviewed-by.
>>
>> Given those were conditional on some adjustments (e.g. actually filling
>> out comments), do you mind if I pick those into a series now?
> 
> FWIW, I've picked up the set I'm trivially happy with, rebased that on
> v5.18-rc1, and put that on a branch with a couple of other cleanups:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/stacktrace/cleanups
> 
> I'll send that out on Monday if there are no objections.
> 
> Thanks,
> Mark.

LGTM.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind()
  2022-04-08 14:44         ` Mark Rutland
  2022-04-08 17:58           ` Mark Rutland
  2022-04-10 17:33           ` Madhavan T. Venkataraman
@ 2022-04-10 17:45           ` Madhavan T. Venkataraman
  2 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-10 17:45 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 4/8/22 09:44, Mark Rutland wrote:
> On Mon, Mar 07, 2022 at 10:51:38AM -0600, Madhavan T. Venkataraman wrote:
>> Hey Mark Rutland, Mark Brown,
>>
>> Could you please review the rest of the patches in the series when you can?
> 
> Sorry, I was expecting a new version with some of my comments
> addressed, in case that had effects on subsequent patches.
> 
>> Also, many of the patches have received a Reviewed-By from you both.
>> So, after I send the next version out, can we upstream those ones?
> 
> I would very much like to upstream the ones I have given a Reviewed-by.
> 
> Given those were conditional on some adjustments (e.g. actually filling
> out comments), do you mind if I pick those into a series now?
> 
> Then, once that's picked, you can rebase the rest atop, and we can
> review that.
> 

So, do you want me to address the comments so far and send the next version?
I can do it ASAP.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-08 10:55   ` Peter Zijlstra
  2022-04-08 11:54     ` Peter Zijlstra
@ 2022-04-10 17:47     ` Madhavan T. Venkataraman
  2022-04-11 16:34       ` Josh Poimboeuf
  1 sibling, 1 reply; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-10 17:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel



On 4/8/22 05:55, Peter Zijlstra wrote:
> On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
> 
>> [-- application/octet-stream is unsupported (use 'v' to view this part) --]
> 
> Your emails are unreadable :-(

I am not sure why the emails are unreadable. Any suggestions? Should I resend? Please let me know.
Sorry about this.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-10 17:47     ` Madhavan T. Venkataraman
@ 2022-04-11 16:34       ` Josh Poimboeuf
  0 siblings, 0 replies; 75+ messages in thread
From: Josh Poimboeuf @ 2022-04-11 16:34 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: Peter Zijlstra, mark.rutland, broonie, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel

On Sun, Apr 10, 2022 at 12:47:46PM -0500, Madhavan T. Venkataraman wrote:
> 
> 
> On 4/8/22 05:55, Peter Zijlstra wrote:
> > On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
> > 
> >> [-- application/octet-stream is unsupported (use 'v' to view this part) --]
> > 
> > Your emails are unreadable :-(
> 
> I am not sure why the emails are unreadable. Any suggestions? Should I resend? Please let me know.
> Sorry about this.

That was actually my (company's) fault when I bounced the patches to Peter.

-- 
Josh


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-08  0:21   ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation Josh Poimboeuf
  2022-04-08 11:41     ` Peter Zijlstra
@ 2022-04-11 17:18     ` Madhavan T. Venkataraman
  2022-04-12  8:32       ` Chen Zhongjin
                         ` (2 more replies)
  2022-04-14 14:11     ` Madhavan T. Venkataraman
  2 siblings, 3 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-11 17:18 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: mark.rutland, broonie, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

Hi Josh,

On 4/7/22 19:21, Josh Poimboeuf wrote:
> On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
>> The solution
>> ============
>>
>> The goal here is to use the absolute minimum CFI needed to compute the FP at
>> every instruction address. The unwinder can compute the FP in each frame,
>> compare the actual FP with the computed one and validate the actual FP.
>>
>> Objtool is enhanced to parse the CFI, extract just the rules required,
>> encode them in compact data structures and create special sections for
>> the rules. The unwinder uses the special sections to find the rules for
>> a given instruction address and compute the FP.
>>
>> Objtool can be invoked as follows:
>>
>> 	objtool dwarf generate <object-file>
> 
> Hi Madhaven,
> 
> This is quite interesting.  And it's exactly the kind of crazy idea I
> can appreciate ;-)
> 

A little crazy is a good thing sometimes.

> Some initial thoughts:
> 
> 
> 1)
> 
> I have some concerns about DWARF's reliability, especially considering
> a) inline asm, b) regular asm, and c) the kernel's tendency to push
> compilers to their limits.
> 

I am thinking of implementing a DWARF verifier to make sure that the DWARF information
in .debug_frame is correct. I am still in the process of designing this. I will keep
you posted on that. This should address (a) and (c).

As for (b), the compiler does not generate any DWARF rules for ASM code. DWARF
annotations are a PITA to maintain. So, in my current design, regular ASM functions are
considered unreliable from an unwind perspective except at places that have unwind
hints. Unwind hints are only needed for places that tend to occur frequently in stack
traces. So, it would be just a handful of unwind hints which can be maintained.

As you know, ASM functions come in two flavors - SYM_CODE() functions and SYM_FUNC()
functions. SYM_CODE functions are, by definition, unreliable from an unwind perspective
because they don't follow ABI rules and they don't set up any frame pointer. In my
reliable stack trace patch series, I have a patch based on the opinion of the reviewers
to mark these functions so the unwinder can recognize them and declare the stack trace
unreliable. So, the only ASM functions that matter in (b) are the SYM_FUNC functions.
For now, I have considered them to be unreliable. But I will analyze those functions to
see if any of them can occur frequently in stack traces. If some of these functions
can occur frequently in stack traces, I need to address them. I will see if unwind hints
are a good fit. I will get back to you on this.
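
For reference, the SYM_CODE check mentioned above boils down to something
like the following sketch (the table and symbol names are illustrative, not
necessarily what the posted patch uses): every return PC in the trace is
compared against a list of address ranges collected for SYM_CODE functions,
and any hit makes the whole trace unreliable.

	/* Illustrative only: [start, end) ranges of SYM_CODE functions. */
	struct code_range {
		unsigned long	start;
		unsigned long	end;
	};

	extern struct code_range	sym_code_ranges[];
	extern int			nr_sym_code_ranges;

	static bool pc_in_sym_code(unsigned long pc)
	{
		int i;

		for (i = 0; i < nr_sym_code_ranges; i++) {
			if (pc >= sym_code_ranges[i].start &&
			    pc < sym_code_ranges[i].end)
				return true;	/* trace is unreliable */
		}
		return false;
	}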

> BUT, supplementing the frame pointer unwinding with DWARF, rather than
> just relying on DWARF alone, does help a LOT.
> 

Yes.

> I guess the hope is that cross-checking two "mostly reliable" things
> against each other (frame pointers and DWARF) will give a reliable
> result ;-)
> 

Yes!

> In a general sense, I've never looked at DWARF's reliability, even for
> just normal C code.  It would be good to have some way of knowing that
> DWARF looks mostly sane for both GCC and Clang.  For example, maybe
> somehow cross-checking it with objtool's knowledge.  And then of course
> we'd have to hope that it stays bug-free in future compilers.
> 

This is a valid point. So far, I find that gcc generates reliable DWARF information.
But there are two bugs in what Clang generates. I have added workarounds in my
parser to compensate.

So, I think a DWARF verifier is an option that architectures can use. At this point,
I don't want to mandate a verifier on every architecture. But that is a discussion
that we can have once I have a verifier ready.

> I'd also be somewhat concerned about assembly.  Since there's nothing
> ensuring the unwind hints are valid, and will stay valid over time, I
> wonder how likely it would be for that to break, and what the
> implications would be.  Most likely I guess it would break silently, but
> then get caught by the frame pointer cross-checking.  So a broken hint
> might not get noticed for a long time, but at least it (hopefully)
> wouldn't break reliable unwinding.
> 

Yes. That is my thinking as well. When the unwinder checks the actual FP against the
computed FP, any mismatch will be treated as unreliable code for unwinding. So,
apart from some retries during the livepatch process, this is most probably not
a problem.

Now,  I set a flag for an unwind hint so that the unwinder knows that it is
processing an unwind hint. I could generate a warning if an unwind hint does not
result in a reliable unwind of the frame. This would bring the broken hint
to people's attention.
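
Roughly, the per-frame handling I have in mind looks like the sketch below
(the flag and field names are illustrative, not necessarily the final ones):

	/* Compare the on-stack FP with the one computed from the rule. */
	static void check_frame(struct unwind_state *state,
				const struct dwarf_rule *rule,
				unsigned long computed_fp)
	{
		if (state->fp == computed_fp)
			return;

		/* Mismatch: stop trusting the trace for livepatch purposes. */
		state->reliable = false;

		/* A mismatch on a hinted PC means the hint itself is broken. */
		if (rule->hint)
			WARN_ONCE(1, "broken unwind hint at %pS",
				  (void *)state->pc);
	}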


> Also, inline asm can sometimes do stack hacks like
> "push;do_something;pop" which isn't visible to the toolchain.  But
> again, hopefully the frame pointer checking would fail and mark it
> unreliable.
> 
> So I do have some worries about DWARF, but the fact that it's getting
> "fact checked" by frame pointers might be sufficient.
> 

Exactly.

> 
> 2)
> 
> If I understand correctly, objtool is converting parts of DWARF to a new
> format which can then be read by the kernel.  In that case, please don't
> call it DWARF as that will cause a lot of confusion.
> 

OK. I will rename it.

> There are actually several similarities between your new format and ORC,
> which is also an objtool-created DWARF alternative.  It would be
> interesting to see if they could be combined somehow.
> 

I will certainly look into it. So, if I decide to merge the two, I might want
to make a minor change to the ORC structure. Would that be OK with you?
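
For context, the per-PC record my objtool patches emit into .dwarf_rules
carries roughly the following (a simplified reconstruction from the
write-out code in the patches earlier in this thread; the exact field types
and layout may differ), which is indeed close in spirit to ORC's small
per-address records:

	/* Simplified sketch -- see dwarf_rule_write()/dwarf_hint_write(). */
	struct dwarf_rule {
		unsigned int	size;		/* length of the code range */
		unsigned short	sp_saved;	/* SP rule present */
		unsigned short	fp_saved;	/* FP rule present */
		unsigned short	hint;		/* rule came from an unwind hint */
		short		sp_offset;
		short		fp_offset;
	};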

> 
> 3)
> 
> Objtool has become an integral part of x86-64, due to security and
> performance features and toolchain workarounds.
> 
> Not *all* of its features require the full "branch validation" which
> follows all code paths -- and was the hardest part to get right -- but
> several features *do* need that: stack validation, ORC, uaccess
> validation, noinstr validation.
> 
> Objtool has been picking up a lot of steam (and features) lately, with
> more features currently in active development.  And lately there have
> been renewed patches for porting it to powerpc and arm64 (and rumors of
> s390).
> 
> If arm64 ever wants one of those features -- particularly a "branch
> validation" based feature -- I think it would make more sense to just do
> the stack validation in objtool, rather than the DWARF supplementation
> approach.
> 

First off, I think that objtool does a great job for X64. I only want to implement
frame pointer validation in a different way. All the other features of objtool
(listed below) are great. I have admired the amount of work you guys have put into
the X64 part.

These are the reasons why I tried the DWARF based method:

- My implementation is largely architecture independent. There are a couple of
  architecture-specific pieces, but they are minor.
  So, if an architecture wanted to support the livepatch feature but did not
  want to do a heavy weight objtool implementation, then it has an option.
  There has been some debate about whether static analysis should be mandated
  for livepatch. My patch series is an attempt to provide an option.

- Getting an objtool static analysis implementation working for an architecture
  as reliably as X64, and getting it reviewed and upstreamed, can take years. It took
  years for X64, am I right? I mean, it has been quite a while since the original
  patch series for arm64 was posted. There have been only one or two minor comments
  so far. I am sure arm64 linux users would very much want to have livepatch available
  ASAP to be able to install security fixes without downtime. This is an immediate need.

- No software is bug free. So, even if static analysis is implemented for an architecture,
  it would be good to have another method of verifying the unwind rules generated from
  the static analysis. DWARF can provide that additional verification.

> Just to give an idea of what objtool already supports and how useful it
> has become for x86, here's an excerpt from some documentation I've been
> working on, since I'm in the middle of rewriting the interface to make
> it more modular.  This is a list of all its current features:
> 
> 
> Features
> --------
> 
> Objtool has the following features:
> 
> 
> - Stack unwinding metadata validation -- useful for helping to ensure
>   stack traces are reliable for live patching
> 
> - ORC unwinder metadata generation -- a faster and more precise
>   alternative to frame pointer based unwinding
> 
> - Retpoline validation -- ensures that all indirect calls go through
>   retpoline thunks, for Spectre v2 mitigations
> 
> - Retpoline call site annotation -- annotates all retpoline thunk call
>   sites, enabling the kernel to patch them inline, to prevent "thunk
>   funneling" for both security and performance reasons
> 
> - Non-instrumentation validation -- validates non-instrumentable
>   ("noinstr") code rules, preventing unexpected instrumentation in
>   low-level C entry code
> 
> - Static call annotation -- annotates static call sites, enabling the
>   kernel to implement inline static calls, a faster alternative to some
>   indirect branches
> 
> - Uaccess validation -- validates uaccess rules for a proper safe
>   implementation of Supervisor Mode Access Protection (SMAP)
> 
> - Straight Line Speculation validation -- validates certain SLS
>   mitigations
> 
> - Indirect Branch Tracking validation -- validates Intel CET IBT rules
>   to ensure that all functions referenced by function pointers have
>   corresponding ENDBR instructions
> 
> - Indirect Branch Tracking annotation -- annotates unused ENDBR
>   instruction sites, enabling the kernel to "seal" them (replace them
>   with NOPs) to further harden IBT
> 
> - Function entry annotation -- annotates function entries, enabling
>   kernel function tracing
> 
> - Other toolchain hacks which will go unmentioned at this time...
> 

I completely agree.

So, it is just frame pointer validation for livepatch I am trying to look at.

Thanks!

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-08 11:41     ` Peter Zijlstra
@ 2022-04-11 17:26       ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-11 17:26 UTC (permalink / raw)
  To: Peter Zijlstra, Josh Poimboeuf
  Cc: mark.rutland, broonie, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel, chenzhongjin



On 4/8/22 06:41, Peter Zijlstra wrote:
> 
> Right; so not having seen the patches due to Madhavan's email being
> broken, I can perhaps less appreciate the crazy involved.
> 

Crazy like a fox.

> On Thu, Apr 07, 2022 at 05:21:51PM -0700, Josh Poimboeuf wrote:
>> 2)
>>
>> If I understand correctly, objtool is converting parts of DWARF to a new
>> format which can then be read by the kernel.  In that case, please don't
>> call it DWARF as that will cause a lot of confusion.
>>
>> There are actually several similarities between your new format and ORC,
>> which is also an objtool-created DWARF alternative.  It would be
>> interesting to see if they could be combined somehow.
> 
> What Josh said; please use/extend ORC.
> 

Yes. I am looking into it.

> I really don't understand where all this crazy is coming from; why does
> objtool need to do something radically weird for ARM64?
> 
> There are existing ARM64 patches for objtool; in fact they have recently
> been re-posted:
> 
>  https://lkml.kernel.org/r/20220407120141.43801-1-chenzhongjin@huawei.com
> 
> The only tricky bit seems to be the whole jump-table issue. Using DWARF
> as input to deal with jump-tables should be possible -- exceedingly
> overkill, but possible I suppose. Mandating DWARF sucks though, compile
> times are so much worse with DWARVES on :/
> 
> Once objtool can properly follow/validate ARM64 code, it should be
> fairly straight forward to have it generate ORC data just like it does
> on x86_64.
> 

My reasons for attempting the DWARF based implementation:

- My implementation is largely architecture independent. There are a couple of
  architecture-specific pieces, but they are minor.
  So, if an architecture wanted to support the livepatch feature but did not
  want to do a heavy weight objtool implementation, then it has an option.
  There has been some debate about whether static analysis should be mandated
  for livepatch. My patch series is an attempt to provide an option.

- Getting an objtool static analysis implementation working for an architecture
  as reliably as X64, and getting it reviewed and upstreamed, can take years. It took
  years for X64, am I right? I mean, it has been quite a while since the original
  patch series for arm64 was posted. There have been only one or two minor comments
  so far. I am sure arm64 linux users would very much want to have livepatch available
  ASAP to be able to install security fixes without downtime. This is an immediate need.

- No software is bug free. So, even if static analysis is implemented for an architecture,
  it would be good to have another method of verifying the unwind rules generated from
  the static analysis. DWARF can provide that additional verification.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-08 12:06   ` Peter Zijlstra
@ 2022-04-11 17:35     ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-11 17:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mark.rutland, broonie, jpoimboe, ardb, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jmorris, linux-arm-kernel,
	live-patching, linux-kernel



On 4/8/22 07:06, Peter Zijlstra wrote:
> On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
>> The solution
>> ============
>>
>> The goal here is to use the absolute minimum CFI needed to compute the FP at
>> every instruction address. The unwinder can compute the FP in each frame,
>> compare the actual FP with the computed one and validate the actual FP.
>>
>> Objtool is enhanced to parse the CFI, extract just the rules required,
>> encode them in compact data structures and create special sections for
>> the rules. The unwinder uses the special sections to find the rules for
>> a given instruction address and compute the FP.
>>
>> Objtool can be invoked as follows:
>>
>> 	objtool dwarf generate <object-file>
>>
>> The version of the DWARF standard supported in this work is version 4. The
>> official documentation for this version is here:
>>
>> 	https://dwarfstd.org/doc/DWARF4.pdf
>>
>> Section 6.4 contains the description of the CFI.
> 
> The problem is of course that DWARF is only available for compiler
> generated code and doesn't cover assembly code, of which is there is
> always lots.
> 

Yes. But assembly functions are of two types:

SYM_CODE_*() functions
SYM_FUNC_*() functions

SYM_CODE functions are, by definition, special functions that don't follow any ABI rules.
They don't set up a frame. Based on the opinion of ARM64 experts, these need to be
recognized by the unwinder and, if they are present in a stack trace, the stack trace
must be considered unreliable. I have, in fact, submitted a patch to implement that.

So, only SYM_FUNC*() functions are relevant for this part. I will look into these for arm64
and check if any of them can occur frequently in stack traces. If any of them is likely
to occur frequently in stack traces, I must address them. If there are only a few such
functions, unwind hints may be sufficient. I will get back to you on this.

> I suppose you can go add DWARF annotations to all the assembly, but IIRC
> those are pretty terrible. We were *REALLY* happy to delete all that
> nasty from the x86 code.
> 

DWARF annotations are a PITA to maintain. I will never recommend that!

> On top of that, AFAIK compilers don't generally consider DWARF
> generation to be a correctness issue. For them it's debug info and
> having it be correct is nice but not required. So using it as input for
> something that's required to be correct, seems unfortunate.

It is only debug info. But if that info can be verified, then it is usable for livepatch
purposes. I am thinking of implementing a verifier since DWARF reliability is a valid
concern. I will keep you posted.

Thanks!

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-11 17:18     ` Madhavan T. Venkataraman
@ 2022-04-12  8:32       ` Chen Zhongjin
  2022-04-16  0:56         ` Josh Poimboeuf
       [not found]       ` <844b3ede-eddb-cbe6-80e0-3529e2da2eb6@huawei.com>
  2022-04-16  1:07       ` Josh Poimboeuf
  2 siblings, 1 reply; 75+ messages in thread
From: Chen Zhongjin @ 2022-04-12  8:32 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: mark.rutland, broonie, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel, Josh Poimboeuf

Hi Madhavan,

Sorry I sent the last email as HTML. This is a plain text resend.

On 2022/4/12 1:18, Madhavan T. Venkataraman wrote:

>> In a general sense, I've never looked at DWARF's reliability, even for
>> just normal C code.  It would be good to have some way of knowing that
>> DWARF looks mostly sane for both GCC and Clang.  For example, maybe
>> somehow cross-checking it with objtool's knowledge.  And then of course
>> we'd have to hope that it stays bug-free in future compilers.
>>
> 
> This is a valid point. So far, I find that gcc generates reliable DWARF information.
> But there are two bugs in what Clang generates. I have added workarounds in my
> parser to compensate.
> 
> So, I think a DWARF verifier is an option that architectures can use. At this point,
> I don't want to mandate a verifier on every architecture. But that is a discussion
> that we can have once I have a verifier ready.
>
I'm concerned that depending on compilers to generate correct
information can become a problem, because the Linux kernel side can rarely
fix what compilers emit. That's also why the gcc plugin idea was
rejected during the objtool migration.

If your parser can solve this, it sounds more doable.

>> I'd also be somewhat concerned about assembly.  Since there's nothing
>> ensuring the unwind hints are valid, and will stay valid over time, I
>> wonder how likely it would be for that to break, and what the
>> implications would be.  Most likely I guess it would break silently, but
>> then get caught by the frame pointer cross-checking.  So a broken hint
>> might not get noticed for a long time, but at least it (hopefully)
>> wouldn't break reliable unwinding.
>>
> 
> Yes. That is my thinking as well. When the unwinder checks the actual FP with the
> computed FP, any mismatch will be treated as unreliable code for unwind. So,
> apart from some retries during the livepatch process, this is most probably not
> a problem.
> 
> Now,  I set a flag for an unwind hint so that the unwinder knows that it is
> processing an unwind hint. I could generate a warning if an unwind hint does not
> result in a reliable unwind of the frame. This would bring the broken hint
> to people's attention.
> 
> 
>> Also, inline asm can sometimes do stack hacks like
>> "push;do_something;pop" which isn't visible to the toolchain.  But
>> again, hopefully the frame pointer checking would fail and mark it
>> unreliable.
>>
>> So I do have some worries about DWARF, but the fact that it's getting
>> "fact checked" by frame pointers might be sufficient.
>>
> 
> Exactly.
> 
I'm wondering how many functions will give an unreliable result, because
any unreliable function that shows up in a stack trace will cause livepatch
to fail/retry. IIUC, all unmarked assembly functions will be considered
unreliable and cause problems. It can be a burden to mark all of them.

> - No software is bug free. So, even if static analysis is implemented for an architecture,
>    it would be good to have another method of verifying the unwind rules generated from
>    the static analysis. DWARF can provide that additional verification.
> 

> 
> So, it is just frame pointer validation for livepatch I am trying to look at.
> 
The reason I support FP with validation is that it provides a guarantee
for the FP unwinder. FP and ORC unwind the stack using absolute and
relative information respectively, but FP has been considered unreliable.
Is there any feature that depends on FP? If so, that would be more
persuasive.


Also, this patch set is much more complete than the objtool migration. It
would be nice if it could be put into use quickly. The objtool-arm64 work
is less than half done, but I'm going to rely as much as possible on the
current objtool components, so no further feasibility validation is
required.

By the way, I was thinking about a corner case: the arm64 call
instruction (BL) won't push LR onto the stack atomically the way x86
does. Before LR and FP are pushed to save the frame record, there can
still be some instructions such as bti and paciasp. If an irq happens
there, the stack frame is not constructed yet, so the FP unwinder will
omit this function and provide a wrong stack trace to livepatch.

It's just a guess and I have not built a test case. But I think it's a
defect on arm64 that the FP unwinder can't work properly in a prologue or
epilogue. Do you have any idea about this?

Thanks for your time,
Chen


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
       [not found]       ` <844b3ede-eddb-cbe6-80e0-3529e2da2eb6@huawei.com>
@ 2022-04-12 17:27         ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-12 17:27 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: mark.rutland, broonie, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel, Josh Poimboeuf



On 4/12/22 02:30, Chen Zhongjin wrote:
> Hi Madhaven,
> 
> On 2022/4/12 1:18, Madhavan T. Venkataraman wrote:
>>
>> This is a valid point. So far, I find that gcc generates reliable DWARF information.
>> But there are two bugs in what Clang generates. I have added workarounds in my
>> parser to compensate.
>>
>> So, I think a DWARF verifier is an option that architectures can use. At this point,
>> I don't want to mandate a verifier on every architecture. But that is a discussion
>> that we can have once I have a verifier ready.
> I'm concerning that depending on compilers to generate correct information can become
> a trouble because we linux kernel side can rarely fix what compilers make. That's also why
> the gcc plugin idea was objected in the objtool migration.
> 
> If your parser can solve this it sounds more doable.
> 

So far, I find that gcc generates reliable DWARF information. Clang did have two bugs for
which the parser compensates.

So, what is needed is a DWARF verifier which can find buggy DWARF information. I am working on
it.

Having said that, I think the DWARF information for the stack and frame pointer is a *lot*
simpler than the other debug stuff. So, there may be a couple of existing bugs that need to be
discovered and fixed. But I think the likelihood of bugs appearing in this area in the future is
low, although it can happen. I agree that there needs to be a way to discover and flag the bugs
when they do appear.

But compiler bugs can also affect objtool and cause problems in it. If I am not mistaken,
they ran into this multiple times on the X86 side and had to get fixes done. Josh knows better.
Compiler bugs and optimizations have always been problematic from this perspective and they
will potentially continue to be.

>>
>> Yes. That is my thinking as well. When the unwinder checks the actual FP with the
>> computed FP, any mismatch will be treated as unreliable code for unwind. So,
>> apart from some retries during the livepatch process, this is most probably not
>> a problem.
>>
>> Now,  I set a flag for an unwind hint so that the unwinder knows that it is
>> processing an unwind hint. I could generate a warning if an unwind hint does not
>> result in a reliable unwind of the frame. This would bring the broken hint
>> to people's attention.
>>
>>
>> Also, inline asm can sometimes do stack hacks like
>> "push;do_something;pop" which isn't visible to the toolchain.  But
>> again, hopefully the frame pointer checking would fail and mark it
>> unreliable.
>>
> I'm wondering how much functions will give a unreliable result because any unreliable
> function shows in stack trace will cause livepatch fail/retry. IIUC all unmarked assembly
> functions will considered unreliable and cause problem. It can be a burden to mark all
> of them.

It is not a burden to mark all of them. For instance, I have submitted a patch where I mark
all the SYM_CODE*() functions by overriding the SYM_CODE_START()/END() macros in arm64. So,
the changes are very small and self-contained.
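
To make this concrete, here is a rough sketch of the consumer side (the
section and symbol names are illustrative, not necessarily the exact ones
in the patch): the overridden macros record the start/end address of each
SYM_CODE function in a special section, and the unwinder checks every
return PC against that list.

	/* bounds of the special section, provided by the linker script */
	extern unsigned long __sym_code_functions_start[];
	extern unsigned long __sym_code_functions_end[];

	/* entries are (start, end) address pairs, one per SYM_CODE function */
	static bool pc_is_sym_code(unsigned long pc)
	{
		unsigned long *p;

		for (p = __sym_code_functions_start; p < __sym_code_functions_end; p += 2) {
			if (pc >= p[0] && pc < p[1])
				return true;	/* SYM_CODE: mark the trace unreliable */
		}
		return false;
	}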

A good part of the assembly functions are defined with SYM_CODE_*(). These, by definition,
are low-level functions that do not follow ABI rules. IIRC, Objtool does not perform static
analysis on these today. These need to be recognized by the unwinder in the kernel and handled.
Josh, please correct me if I am wrong. So, this is a problem even if we had static analysis
in Objtool for arm64.

As for functions defined with SYM_FUNC_*(), they are supposed to have proper frame setup and
teardown. But most of them do not appear to have a proper frame pointer prolog and epilog today
in arm64. Some of these will probably never have an FP prolog or epilog because they are high
performance functions or specialized functions and the extra overhead may be unacceptable. Some of
the SYM_FUNC*() functions are leaf functions that don't need an FP prolog or epilog.

With static analysis, Objtool will flag all such functions. Either a proper FP prolog and epilog
have to be introduced in the code, or the functions need to be flagged as unreliable from an unwind
perspective. If any of these functions occurs frequently in stack traces, then either a proper
FP prolog and epilog have to be introduced in the function code, or unwind hints have to be placed
at strategic points. In either case, there is a maintenance burden, although developers may prefer
one over the other on a case-by-case basis.

The DWARF situation is the same. For the frequently occurring assembly functions, unwind hints
need to be defined. Currently, I have undertaken to study the SYM_FUNC*() functions in arm64
to see if I can determine which ones belong to this category. Also, I am going to be doing targeted
livepatch testing to see if any of these functions will cause many retries during the livepatch
process. If they don't, then this is not a problem.


>> - No software is bug free. So, even if static analysis is implemented for an architecture,
>>   it would be good to have another method of verifying the unwind rules generated from
>>   the static analysis. DWARF can provide that additional verification.
> To me verifying ORC with DWARF a little odd, cuz they are running with different unwind
> mechanism. For normal scenario which calling convention is obeyed, ORC can give a
> promised reliable stack trace, while when it easily involve bug in assembly codes,
> DWARF also can't work.
> 

I am not sure I follow. With both DWARF and ORC, stack pointer and frame pointer offsets are recorded
for every instruction address. These offsets have to be the same regardless of which one you use.

The only difference is that for an assembly function that has a proper FP prolog and epilog, Objtool
static analysis is able to generate those offsets. With DWARF, the compiler does not generate any
offsets for assembly functions. I have to rely on unwind hints. BTW, I only need to do this for functions
that occur frequently in stack traces.
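
To show what I mean by offsets recorded per instruction address, here is
a purely illustrative rule layout (not the actual DWARF or ORC encoding):

	/* one rule per address range, looked up by the return PC */
	struct unwind_rule {
		unsigned long	pc_start;	/* first PC the rule covers */
		unsigned long	pc_end;		/* first PC past the range */
		int		cfa_offset;	/* CFA = SP + cfa_offset */
		int		fp_offset;	/* saved FP lives at CFA + fp_offset */
		int		ra_offset;	/* saved LR lives at CFA + ra_offset */
	};

Whether these numbers come from the compiler's CFI or from Objtool's
static analysis, they describe the same frame layout, which is why the
offsets have to match.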


> My support for FP with validation is that it provides a guarantee for FP unwinder. FP and ORC
> use absolute and relative for stack unwind to unwind stack respectively, however FP has been
> considered unreliable. Is there any feature depends on FP? If so it can be more persuasive.
> 

Yes. Static analysis makes sure that functions are following ABI rules. So, it provides a static
guarantee. And that is great.

However, with DWARF, even if some functions don't follow ABI rules, a reliable unwinder can still
be provided as long as the DWARF information generated by the compiler is correct. For instance,
let us say that the compiler generates code for a function with a call to another function before
the FP has been set up properly. If the DWARF information is correctly generated, the unwinder can
see that a stack trace involving the called function is unreliable.

Also, hypothetically, if a buggy kernel function corrupts the frame pointer or the stack
pointer, dynamic validation can catch it.
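
As a rough sketch of that dynamic validation, using the illustrative rule
layout above (again, not the actual patch code):

	static bool unwind_frame_is_reliable(const struct unwind_rule *rule,
					     unsigned long sp, unsigned long fp)
	{
		unsigned long cfa, saved_fp;

		if (!rule)
			return false;		/* no rule, e.g. unannotated assembly */

		cfa = sp + rule->cfa_offset;
		saved_fp = *(unsigned long *)(cfa + rule->fp_offset);

		/* a corrupted or not-yet-established FP shows up as a mismatch */
		return saved_fp == fp;
	}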

> 
> Also this patch is much more completed than migration for objtool. It would be
> nice if this could be put into use quickly. The objtool-arm64 is less than half done, but I'm going
> to relies as much as possible on current objtool components, so no more feasibility validation
> is required.
> 

The approach in your patch series is certainly feasible. I don't deny that at all. And, believe me,
I would like the community to take interest in it and review it. If I get a chance, I will also
participate in that review.

As I mentioned in a previous email, my attempt is to come up with a largely architecture independent
solution to the FP validation problem with a quicker time to market. That is all.

> By the way, I was thinking about a corner case, because arm64 CALL instruction won't push LR
> onto stack atomically as x86. Before push LR, FP to save frame there still can be some instructions
> such as bti, paciasp. If an irq happens here, the stack frame is not constructed
> so the FP unwinder will omit this function and provides a wrong stack trace to livepatch.
> 
> 

With DWARF, the unwinder will see that there are no DWARF rules associated with those PCs that occur
before the FP is completely setup. It will mark the stack trace as unreliable. So, these cases are
already handled as I have explained in my cover letter.

> It's just a guess and I have not built the test case. But I think it's a defect on arm64 that FP
> unwinder can't work properly on prologue and epilogue. Do you have any idea about this?
> 

There is no defect. The frame pointer prolog can have multiple instructions before the frame is set up.
Any interrupt or exception happening on those instructions will have an unreliable stack trace by
definition. A reliable unwinder must be able to recognize that case and mark the stack trace as unreliable.
That is all.

Thanks for your comments.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-08  0:21   ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation Josh Poimboeuf
  2022-04-08 11:41     ` Peter Zijlstra
  2022-04-11 17:18     ` Madhavan T. Venkataraman
@ 2022-04-14 14:11     ` Madhavan T. Venkataraman
  2 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-14 14:11 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: mark.rutland, broonie, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

Hi Josh, Peter,

I have decided to accept your comment that using the compiler-generated DWARF info
is probably not reliable. I can write a DWARF verifier. But I decided that if I can
write a DWARF verifier, I can use that same code to generate the same information as
DWARF CFI.

So, I am going to implement this in the traditional way as you wanted. Fortunately,
I only need to decode a small subset of the instruction formats (just the ones that
affect the SP and FP) and branch and return instructions to be able to compute the
SP and FP offsets at every instruction address. I only need a small part of the
static analysis code.
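
Roughly, the tracking loop only has to recognize a handful of patterns.
The helper names below are hypothetical, just to show the shape of it:

	struct frame_track {
		int	sp_offset;	/* SP relative to function entry */
		int	fp_offset;	/* where (FP, LR) were stored, or -1 if not saved */
	};

	static void track_insn(u32 insn, struct frame_track *t)
	{
		if (insn_is_stp_fp_lr_pre(insn)) {		/* stp x29, x30, [sp, #-N]! */
			t->sp_offset -= stp_pre_imm(insn);
			t->fp_offset = t->sp_offset;
		} else if (insn_is_mov_fp_sp(insn)) {		/* mov x29, sp */
			/* frame record established at the current SP */
		} else if (insn_is_add_sub_sp_imm(insn)) {	/* add/sub sp, sp, #imm */
			t->sp_offset += sp_imm_delta(insn);
		} else if (insn_is_ldp_fp_lr_post(insn)) {	/* ldp x29, x30, [sp], #N */
			t->sp_offset += ldp_post_imm(insn);
			t->fp_offset = -1;
		}
		/* branches and returns terminate or propagate the state (not shown) */
	}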

In other words, I am removing the crazy from my patch series. I am still retaining the
idea of dynamic FP validation rather than static validation. IMO, that will be able to
handle even non-ABI compliant functions and FP corruptions from buggy kernel functions.

Huawei has resubmitted the objtool patch set. When the full blown static analysis
is implemented by them and accepted by the community, my changes can be easily merged
with theirs. So, my solution is not a competing solution. In fact, it will provide an
additional level of robustness.

I will make these changes, test and send out version 2.

Thanks for your comments.

Madhavan

On 4/7/22 19:21, Josh Poimboeuf wrote:
> On Thu, Apr 07, 2022 at 03:25:09PM -0500, madvenka@linux.microsoft.com wrote:
>> The solution
>> ============
>>
>> The goal here is to use the absolute minimum CFI needed to compute the FP at
>> every instruction address. The unwinder can compute the FP in each frame,
>> compare the actual FP with the computed one and validate the actual FP.
>>
>> Objtool is enhanced to parse the CFI, extract just the rules required,
>> encode them in compact data structures and create special sections for
>> the rules. The unwinder uses the special sections to find the rules for
>> a given instruction address and compute the FP.
>>
>> Objtool can be invoked as follows:
>>
>> 	objtool dwarf generate <object-file>
> 
> Hi Madhaven,
> 
> This is quite interesting.  And it's exactly the kind of crazy idea I
> can appreciate ;-)
> 
> Some initial thoughts:
> 
> 
> 1)
> 
> I have some concerns about DWARF's reliability, especially considering
> a) inline asm, b) regular asm, and c) the kernel's tendency to push
> compilers to their limits.
> 
> BUT, supplementing the frame pointer unwinding with DWARF, rather than
> just relying on DWARF alone, does help a LOT.
> 
> I guess the hope is that cross-checking two "mostly reliable" things
> against each other (frame pointers and DWARF) will give a reliable
> result ;-)
> 
> In a general sense, I've never looked at DWARF's reliability, even for
> just normal C code.  It would be good to have some way of knowing that
> DWARF looks mostly sane for both GCC and Clang.  For example, maybe
> somehow cross-checking it with objtool's knowledge.  And then of course
> we'd have to hope that it stays bug-free in future compilers.
> 
> I'd also be somewhat concerned about assembly.  Since there's nothing
> ensuring the unwind hints are valid, and will stay valid over time, I
> wonder how likely it would be for that to break, and what the
> implications would be.  Most likely I guess it would break silently, but
> then get caught by the frame pointer cross-checking.  So a broken hint
> might not get noticed for a long time, but at least it (hopefully)
> wouldn't break reliable unwinding.
> 
> Also, inline asm can sometimes do stack hacks like
> "push;do_something;pop" which isn't visible to the toolchain.  But
> again, hopefully the frame pointer checking would fail and mark it
> unreliable.
> 
> So I do have some worries about DWARF, but the fact that it's getting
> "fact checked" by frame pointers might be sufficient.
> 
> 
> 2)
> 
> If I understand correctly, objtool is converting parts of DWARF to a new
> format which can then be read by the kernel.  In that case, please don't
> call it DWARF as that will cause a lot of confusion.
> 
> There are actually several similarities between your new format and ORC,
> which is also an objtool-created DWARF alternative.  It would be
> interesting to see if they could be combined somehow.
> 
> 
> 3)
> 
> Objtool has become an integral part of x86-64, due to security and
> performance features and toolchain workarounds.
> 
> Not *all* of its features require the full "branch validation" which
> follows all code paths -- and was the hardest part to get right -- but
> several features *do* need that: stack validation, ORC, uaccess
> validation, noinstr validation.
> 
> Objtool has been picking up a lot of steam (and features) lately, with
> more features currently in active development.  And lately there have
> been renewed patches for porting it to powerpc and arm64 (and rumors of
> s390).
> 
> If arm64 ever wants one of those features -- particularly a "branch
> validation" based feature -- I think it would make more sense to just do
> the stack validation in objtool, rather than the DWARF supplementation
> approach.
> 
> Just to give an idea of what objtool already supports and how useful it
> has become for x86, here's an excerpt from some documentation I've been
> working on, since I'm in the middle of rewriting the interface to make
> it more modular.  This is a list of all its current features:
> 
> 
> Features
> --------
> 
> Objtool has the following features:
> 
> 
> - Stack unwinding metadata validation -- useful for helping to ensure
>   stack traces are reliable for live patching
> 
> - ORC unwinder metadata generation -- a faster and more precise
>   alternative to frame pointer based unwinding
> 
> - Retpoline validation -- ensures that all indirect calls go through
>   retpoline thunks, for Spectre v2 mitigations
> 
> - Retpoline call site annotation -- annotates all retpoline thunk call
>   sites, enabling the kernel to patch them inline, to prevent "thunk
>   funneling" for both security and performance reasons
> 
> - Non-instrumentation validation -- validates non-instrumentable
>   ("noinstr") code rules, preventing unexpected instrumentation in
>   low-level C entry code
> 
> - Static call annotation -- annotates static call sites, enabling the
>   kernel to implement inline static calls, a faster alternative to some
>   indirect branches
> 
> - Uaccess validation -- validates uaccess rules for a proper safe
>   implementation of Supervisor Mode Access Protection (SMAP)
> 
> - Straight Line Speculation validation -- validates certain SLS
>   mitigations
> 
> - Indirect Branch Tracking validation -- validates Intel CET IBT rules
>   to ensure that all functions referenced by function pointers have
>   corresponding ENDBR instructions
> 
> - Indirect Branch Tracking annotation -- annotates unused ENDBR
>   instruction sites, enabling the kernel to "seal" them (replace them
>   with NOPs) to further harden IBT
> 
> - Function entry annotation -- annotates function entries, enabling
>   kernel function tracing
> 
> - Other toolchain hacks which will go unmentioned at this time...
> 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-12  8:32       ` Chen Zhongjin
@ 2022-04-16  0:56         ` Josh Poimboeuf
  2022-04-18 12:28           ` Chen Zhongjin
  0 siblings, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2022-04-16  0:56 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: Madhavan T. Venkataraman, mark.rutland, broonie, ardb,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jmorris,
	linux-arm-kernel, live-patching, linux-kernel

On Tue, Apr 12, 2022 at 04:32:22PM +0800, Chen Zhongjin wrote:
> By the way, I was thinking about a corner case, because arm64 CALL
> instruction won't push LR onto stack atomically as x86. Before push LR, FP
> to save frame there still can be some instructions such as bti, paciasp. If
> an irq happens here, the stack frame is not constructed so the FP unwinder
> will omit this function and provides a wrong stack trace to livepatch.
> 
> It's just a guess and I have not built the test case. But I think it's a
> defect on arm64 that FP unwinder can't work properly on prologue and
> epilogue. Do you have any idea about this?

x86 has similar issues with frame pointers, if for example preemption or
page fault exception occurs in a leaf function, or in a function
prologue or epilogue, before or after the frame pointer setup.

This issue is solved by the "reliable" unwinder which detects
irqs/exceptions on the stack and reports the stack as unreliable.

-- 
Josh


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-11 17:18     ` Madhavan T. Venkataraman
  2022-04-12  8:32       ` Chen Zhongjin
       [not found]       ` <844b3ede-eddb-cbe6-80e0-3529e2da2eb6@huawei.com>
@ 2022-04-16  1:07       ` Josh Poimboeuf
  2 siblings, 0 replies; 75+ messages in thread
From: Josh Poimboeuf @ 2022-04-16  1:07 UTC (permalink / raw)
  To: Madhavan T. Venkataraman
  Cc: mark.rutland, broonie, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel

On Mon, Apr 11, 2022 at 12:18:13PM -0500, Madhavan T. Venkataraman wrote:
> > There are actually several similarities between your new format and ORC,
> > which is also an objtool-created DWARF alternative.  It would be
> > interesting to see if they could be combined somehow.
> > 
> 
> I will certainly look into it. So, if I decide to merge the two, I might want
> to make a minor change to the ORC structure. Would that be OK with you?

Yes, in fact I would expect it, since ORC is quite x86-specific at the
moment.  So it would need some abstractions to make it more multi-arch
friendly.

-- 
Josh


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-16  0:56         ` Josh Poimboeuf
@ 2022-04-18 12:28           ` Chen Zhongjin
  2022-04-18 16:11             ` Josh Poimboeuf
  0 siblings, 1 reply; 75+ messages in thread
From: Chen Zhongjin @ 2022-04-18 12:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Madhavan T. Venkataraman, mark.rutland, broonie, ardb,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jmorris,
	linux-arm-kernel, live-patching, linux-kernel

Hi Josh,

IIUC, ORC on x86 can do a reliable stack unwind for this scenario
because objtool validates the BP state.

I'm thinking that on arm64 there's no guarantee that LR will have been
pushed onto the stack. When we meet a similar scenario on arm64, we
should recover (LR, FP) from pt_regs and continue to unwind the stack,
and this is reliable only after we validate (LR, FP).

So should we additionally track LR on arm64, the way BP is tracked on
x86? Or can we just treat (LR, FP) as a pair, since as far as I know
they are always set up together?

On 2022/4/16 8:56, Josh Poimboeuf wrote:
> On Tue, Apr 12, 2022 at 04:32:22PM +0800, Chen Zhongjin wrote:
>> By the way, I was thinking about a corner case, because arm64 CALL
>> instruction won't push LR onto stack atomically as x86. Before push LR, FP
>> to save frame there still can be some instructions such as bti, paciasp. If
>> an irq happens here, the stack frame is not constructed so the FP unwinder
>> will omit this function and provides a wrong stack trace to livepatch.
>>
>> It's just a guess and I have not built the test case. But I think it's a
>> defect on arm64 that FP unwinder can't work properly on prologue and
>> epilogue. Do you have any idea about this?
> 
> x86 has similar issues with frame pointers, if for example preemption or
> page fault exception occurs in a leaf function, or in a function
> prologue or epilogue, before or after the frame pointer setup.
> 
> This issue is solved by the "reliable" unwinder which detects
> irqs/exceptions on the stack and reports the stack as unreliable.
> 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-18 12:28           ` Chen Zhongjin
@ 2022-04-18 16:11             ` Josh Poimboeuf
  2022-04-18 18:38               ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2022-04-18 16:11 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: Madhavan T. Venkataraman, mark.rutland, broonie, ardb,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jmorris,
	linux-arm-kernel, live-patching, linux-kernel

On Mon, Apr 18, 2022 at 08:28:33PM +0800, Chen Zhongjin wrote:
> Hi Josh,
> 
> IIUC, ORC on x86 can make reliable stack unwind for this scenario
> because objtool validates BP state.
> 
> I'm thinking that on arm64 there's no guarantee that LR will be pushed
> onto stack. When we meet similar scenario on arm64, we should recover
> (LR, FP) on pt_regs and continue to unwind the stack. And this is
> reliable only after we validate (LR, FP).
> 
> So should we track LR on arm64 additionally as track BP on x86? Or can
> we just treat (LR, FP) as a pair? because as I know they are always set
> up together.

Does the arm64 unwinder have a way to detect kernel pt_regs on the
stack?  If so, the simplest solution is to mark all stacks with kernel
regs as unreliable.  That's what the x86 FP unwinder does.
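
For reference, the way the x86 FP unwinder detects a regs frame is
roughly the ENCODE_FRAME_POINTER trick (paraphrased from memory, not the
exact code): the entry code saves a frame pointer with the low bit set,
which a real, aligned frame pointer can never have.

	/* a tagged frame pointer means kernel pt_regs are on the stack */
	static bool fp_encodes_pt_regs(unsigned long fp)
	{
		return fp & 1;
	}

	static struct pt_regs *fp_to_pt_regs(unsigned long fp)
	{
		return (struct pt_regs *)(fp & ~1UL);
	}

The exact encoding doesn't matter; the point is only to be able to tell
that a pt_regs frame is present so the trace can be marked unreliable.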

-- 
Josh


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation
  2022-04-18 16:11             ` Josh Poimboeuf
@ 2022-04-18 18:38               ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 75+ messages in thread
From: Madhavan T. Venkataraman @ 2022-04-18 18:38 UTC (permalink / raw)
  To: Josh Poimboeuf, Chen Zhongjin
  Cc: mark.rutland, broonie, ardb, nobuta.keiya, sjitindarsingh,
	catalin.marinas, will, jmorris, linux-arm-kernel, live-patching,
	linux-kernel



On 4/18/22 11:11, Josh Poimboeuf wrote:
> On Mon, Apr 18, 2022 at 08:28:33PM +0800, Chen Zhongjin wrote:
>> Hi Josh,
>>
>> IIUC, ORC on x86 can make reliable stack unwind for this scenario
>> because objtool validates BP state.
>>
>> I'm thinking that on arm64 there's no guarantee that LR will be pushed
>> onto stack. When we meet similar scenario on arm64, we should recover
>> (LR, FP) on pt_regs and continue to unwind the stack. And this is
>> reliable only after we validate (LR, FP).
>>
>> So should we track LR on arm64 additionally as track BP on x86? Or can
>> we just treat (LR, FP) as a pair? because as I know they are always set
>> up together.
> 
> Does the arm64 unwinder have a way to detect kernel pt_regs on the
> stack?  If so, the simplest solution is to mark all stacks with kernel
> regs as unreliable.  That's what the x86 FP unwinder does.
> 

AFAICT, only the task pt_regs can be detected. For detecting the other pt_regs,
we would have to set a bit in the FP. IIRC, I had a proposal where I set the LSB in
the FP stored on the stack. The arm64 folks did not like that approach as it
would be indistinguishable from a corrupted FP, however unlikely the corruption
may be.

Unwind hints can be used for these cases to unwind reliably through them. That is
probably the current thinking. Mark Rutland can confirm.

Madhavan

^ permalink raw reply	[flat|nested] 75+ messages in thread

end of thread, other threads:[~2022-04-18 18:39 UTC | newest]

Thread overview: 75+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <95691cae4f4504f33d0fc9075541b1e7deefe96f>
2022-01-17 14:55 ` [PATCH v13 00/11] arm64: Reorganize the unwinder and implement stack trace reliability checks madvenka
2022-01-17 14:55   ` [PATCH v13 01/11] arm64: Remove NULL task check from unwind_frame() madvenka
2022-01-17 14:55   ` [PATCH v13 02/11] arm64: Rename unwinder functions madvenka
2022-01-17 14:56   ` [PATCH v13 03/11] arm64: Rename stackframe to unwind_state madvenka
2022-01-17 14:56   ` [PATCH v13 04/11] arm64: Split unwind_init() madvenka
2022-02-02 18:44     ` Mark Brown
2022-02-03  0:26       ` Madhavan T. Venkataraman
2022-02-03  0:39         ` Madhavan T. Venkataraman
2022-02-03 11:29           ` Mark Brown
2022-02-15 13:07     ` Mark Rutland
2022-02-15 18:04       ` Madhavan T. Venkataraman
2022-01-17 14:56   ` [PATCH v13 05/11] arm64: Copy the task argument to unwind_state madvenka
2022-02-02 18:45     ` Mark Brown
2022-02-15 13:22     ` Mark Rutland
2022-02-22 16:53       ` Madhavan T. Venkataraman
2022-01-17 14:56   ` [PATCH v13 06/11] arm64: Use stack_trace_consume_fn and rename args to unwind() madvenka
2022-02-02 18:46     ` Mark Brown
2022-02-03  0:34       ` Madhavan T. Venkataraman
2022-02-03 11:30         ` Mark Brown
2022-02-03 14:45           ` Madhavan T. Venkataraman
2022-02-15 13:39     ` Mark Rutland
2022-02-15 18:12       ` Madhavan T. Venkataraman
2022-03-07 16:51       ` Madhavan T. Venkataraman
2022-03-07 17:01         ` Mark Brown
2022-03-08 22:00           ` Madhavan T. Venkataraman
2022-03-09 11:47             ` Mark Brown
2022-03-09 15:34               ` Madhavan T. Venkataraman
2022-03-10  8:33               ` Miroslav Benes
2022-03-10 12:36                 ` Madhavan T. Venkataraman
2022-03-16  3:43               ` Josh Poimboeuf
2022-04-08 14:44         ` Mark Rutland
2022-04-08 17:58           ` Mark Rutland
2022-04-10 17:42             ` Madhavan T. Venkataraman
2022-04-10 17:33           ` Madhavan T. Venkataraman
2022-04-10 17:45           ` Madhavan T. Venkataraman
2022-01-17 14:56   ` [PATCH v13 07/11] arm64: Make the unwind loop in unwind() similar to other architectures madvenka
2022-01-17 14:56   ` [PATCH v13 08/11] arm64: Introduce stack trace reliability checks in the unwinder madvenka
2022-01-17 14:56   ` [PATCH v13 09/11] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka
2022-01-17 14:56   ` [PATCH v13 10/11] arm64: Introduce arch_stack_walk_reliable() madvenka
2022-01-17 14:56   ` [PATCH v13 11/11] arm64: Select HAVE_RELIABLE_STACKTRACE madvenka
2022-01-25  5:21     ` nobuta.keiya
2022-01-25 13:43       ` Madhavan T. Venkataraman
2022-01-26 10:20         ` nobuta.keiya
2022-01-26 17:14           ` Madhavan T. Venkataraman
2022-01-27  1:13             ` nobuta.keiya
2022-01-26 17:16       ` Mark Brown
2022-04-07 20:25 ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation madvenka
2022-04-07 20:25   ` [RFC PATCH v1 1/9] objtool: Parse DWARF Call Frame Information in object files madvenka
2022-04-07 20:25   ` [RFC PATCH v1 2/9] objtool: Generate DWARF rules and place them in a special section madvenka
2022-04-07 20:25   ` [RFC PATCH v1 3/9] dwarf: Build the kernel with DWARF information madvenka
2022-04-07 20:25   ` [RFC PATCH v1 4/9] dwarf: Implement DWARF rule processing in the kernel madvenka
2022-04-07 20:25   ` [RFC PATCH v1 5/9] dwarf: Implement DWARF support for modules madvenka
2022-04-07 20:25   ` [RFC PATCH v1 6/9] arm64: unwinder: Add a reliability check in the unwinder based on DWARF CFI madvenka
2022-04-07 20:25   ` [RFC PATCH v1 7/9] arm64: dwarf: Implement unwind hints madvenka
2022-04-07 20:25   ` [RFC PATCH v1 8/9] dwarf: Miscellaneous changes required for enabling livepatch madvenka
2022-04-07 20:25   ` [RFC PATCH v1 9/9] dwarf: Enable livepatch for ARM64 madvenka
2022-04-08  0:21   ` [RFC PATCH v1 0/9] arm64: livepatch: Use DWARF Call Frame Information for frame pointer validation Josh Poimboeuf
2022-04-08 11:41     ` Peter Zijlstra
2022-04-11 17:26       ` Madhavan T. Venkataraman
2022-04-11 17:18     ` Madhavan T. Venkataraman
2022-04-12  8:32       ` Chen Zhongjin
2022-04-16  0:56         ` Josh Poimboeuf
2022-04-18 12:28           ` Chen Zhongjin
2022-04-18 16:11             ` Josh Poimboeuf
2022-04-18 18:38               ` Madhavan T. Venkataraman
     [not found]       ` <844b3ede-eddb-cbe6-80e0-3529e2da2eb6@huawei.com>
2022-04-12 17:27         ` Madhavan T. Venkataraman
2022-04-16  1:07       ` Josh Poimboeuf
2022-04-14 14:11     ` Madhavan T. Venkataraman
2022-04-08 10:55   ` Peter Zijlstra
2022-04-08 11:54     ` Peter Zijlstra
2022-04-08 14:34       ` Josh Poimboeuf
2022-04-10 17:47     ` Madhavan T. Venkataraman
2022-04-11 16:34       ` Josh Poimboeuf
2022-04-08 12:06   ` Peter Zijlstra
2022-04-11 17:35     ` Madhavan T. Venkataraman
