* [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks
       [not found] <ea0ef9ed6eb34618bcf468fbbf8bdba99e15df7d>
@ 2021-05-26 21:49 ` madvenka
  2021-05-26 21:49 ` [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder madvenka
  ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: madvenka @ 2021-05-26 21:49 UTC (permalink / raw)
  To: broonie, mark.rutland, jpoimboe, ardb, nobuta.keiya,
	catalin.marinas, will, jmorris, pasha.tatashin, jthierry,
	linux-arm-kernel, live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

There are a number of places in kernel code where the stack trace is not
reliable. Enhance the unwinder to check for those cases and mark the
stack trace as unreliable. Once all of the checks are in place, the
unwinder can provide a reliable stack trace. But before this can be used
for livepatch, some other entity needs to guarantee that the frame
pointers are all set up correctly in kernel functions. objtool is
currently being worked on to address that need.

Return address check
====================

Check the return PC of every stack frame to make sure that it is a valid
kernel text address (and not some generated code, for example). If it is
not a valid kernel text address, mark the stack trace as unreliable.

Assembly functions
==================

There are a number of assembly functions in arm64. Except for a couple
of them, these functions do not have a frame pointer prologue or
epilogue. Also, many of them manipulate low-level state such as
registers. These functions are, by definition, unreliable from a stack
unwinding perspective. That is, when these functions occur in a stack
trace, the unwinder would not be able to unwind through them reliably.

Assembly functions are defined as SYM_FUNC_*() functions or
SYM_CODE_*() functions. objtool performs static analysis of SYM_FUNC
functions.
It ignores SYM_CODE functions because they have low-level code that is
difficult to analyze. When objtool eventually becomes ready, SYM_FUNC
functions will be analyzed and "fixed" as necessary. So, they are not
"interesting" for the reliable unwinder.

That leaves SYM_CODE functions. It is up to the unwinder to deal with
these for a reliable stack trace. The unwinder needs to do the
following:

	- Recognize SYM_CODE functions in a stack trace.

	- If a particular SYM_CODE function can be unwound through using
	  some special logic, then do it. E.g., the return trampoline for
	  Function Graph Tracing.

	- Otherwise, mark the stack trace unreliable.

Previous approach
=================

Previously, I considered a per-section approach. The following sections
mostly contain SYM_CODE functions:

	.entry.text
	.idmap.text
	.hyp.idmap.text
	.hyp.text
	.hibernate_exit.text
	.entry.tramp.text

So, consider the whole sections unreliable. I created an array of code
ranges, one for each section. The unwinder then checks each return PC
against these sections. If there is a match, the stack trace is
unreliable.

Although this approach is reasonable, there are two problems:

	1. There are SYM_FUNC functions in these sections. They also
	   "become" unreliable in this approach. I think that we should
	   be able to specify which functions are reliable and which
	   ones are not in any section.

	2. There are a few SYM_CODE functions in .text and .init.text.
	   These sections contain tons of functions that are reliable.
	   So, I considered moving the SYM_CODE functions into a
	   separate section called ".code.text". I am not convinced that
	   this approach is fine:

		- We should be able to place functions in .text and
		  .init.text, if we wanted to.

		- Moving the hypervisor related functions caused reloc
		  issues.

		- Moving things around just to satisfy the unwinder did
		  not seem right.

Current approach
================

In this version, I have taken a simpler approach to solve the issues
mentioned before.
Define an ARM64 version of SYM_CODE_END() like this:

#define SYM_CODE_END(name)				\
	SYM_END(name, SYM_T_NONE)			;\
	99:						;\
	.pushsection "sym_code_functions", "aw"		;\
	.quad	name					;\
	.quad	99b					;\
	.popsection

NOTE: I use the numeric label 99 as the end address of a function. It
does not conflict with anything right now. Please let me know if this is
OK. I can add a comment that the label 99 is reserved and should not be
used by assembly code.

The above macro does the usual SYM_END(). In addition, it records the
function's address range in a special section called
"sym_code_functions". This way, all SYM_CODE functions get recorded in
the section automatically.

Implement an early_initcall() called init_sym_code_functions() that
allocates an array called sym_code_functions[] and copies the function
ranges from the section to the array.

Implement an unwinder_is_unreliable() function that compares a return PC
to the ranges. If there is a match, return true. Else, return false.

Call unwinder_is_unreliable() on every return PC from unwind_frame(). If
there is a match, then mark the stack trace as unreliable.

Last stack frame
================

If a SYM_CODE function occurs in the very last frame in the stack trace,
then the stack trace is not considered unreliable. This is because there
is no more unwinding to do. Examples:

	- EL0 exception stack traces end in the top level EL0 exception
	  handlers.

	- All kernel thread stack traces end in ret_from_fork().

Special SYM_CODE functions
==========================

The return trampolines of the Function Graph Tracer and Kretprobe can be
recognized by the unwinder. If the return PCs can be translated to the
original PCs, then the unwinder should perform that translation before
checking for reliability. The final PC that we end up with after all the
translations is the one we need to check for reliability.

Accordingly, I have moved the reliability checks to after the logic that
handles the Function Graph Tracer.
So, the approach is - all SYM_CODE functions are unreliable. If a
SYM_CODE function is "fixed" to make it reliable, then it should become
a SYM_FUNC function. Or, if the unwinder has special logic to unwind
through a SYM_CODE function, then that can be done.

Previously, I had some extra code to handle the Function Graph Tracer
return trampoline case. I have removed it. Special casing is orthogonal
to this work and can be done separately.

Documentation
=============

Documentation/livepatch/reliable-stacktrace.rst written by Mark Rutland
gives a list of requirements for a reliable unwinder. I have added
sufficient comments in the code so that it is easy to map the
requirements to the code in the unwinder. I have also moved some
comments from the previous cover letter to the code.

Performance
===========

Currently, unwinder_is_unreliable() does a linear search through
sym_code_functions[]. If reviewers prefer, I could sort the
sym_code_functions[] array and perform a binary search for better
performance. There are about 80 entries in the array.

Checkpatch
==========

I am getting the following errors on the SYM_CODE_END() that I have
defined for arm64. These macros are only used in assembly. I am not sure
how to deal with these. Can someone help?

ERROR: Macros with multiple statements should be enclosed in a do - while loop
#13: FILE: arch/arm64/include/asm/linkage.h:67:
+#define SYM_CODE_END(name)				\
+	SYM_END(name, SYM_T_NONE)			;\
+	99:						;\
+	.pushsection "sym_code_functions", "aw"		;\
+	.quad	name					;\
+	.quad	99b					;\
+	.popsection

CHECK: Macro argument reuse 'name' - possible side-effects?
#13: FILE: arch/arm64/include/asm/linkage.h:67:
+#define SYM_CODE_END(name)				\
+	SYM_END(name, SYM_T_NONE)			;\
+	99:						;\
+	.pushsection "sym_code_functions", "aw"		;\
+	.quad	name					;\
+	.quad	99b					;\
+	.popsection

ERROR: spaces required around that ':' (ctx:VxW)
#15: FILE: arch/arm64/include/asm/linkage.h:69:
+	99:						;\
	  ^

WARNING: labels should not be indented
#15: FILE: arch/arm64/include/asm/linkage.h:69:
+	99:						;\

total: 2 errors, 1 warnings, 1 checks, 171 lines checked
---
Changelog:

v5:
	From Keiya Nobuta:

	- The term blacklist(ed) is not to be used anymore. I have
	  changed it to unreliable. So, the function
	  unwinder_blacklisted() has been changed to
	  unwinder_is_unreliable().

	From Mark Brown:

	- Add a comment for the "reliable" flag in struct stackframe.
	  The reliability attribute is not complete until all the checks
	  are in place. Added a comment above struct stackframe.

	- Include some of the comments in the cover letter in the actual
	  code so that we can compare it with the reliable stack trace
	  requirements document for completeness. I have added a comment:

		- above unwinder_is_unreliable() that lists the
		  requirements that are addressed by the function.

		- above the __kernel_text_address() call about all the
		  cases the call covers.

v4:
	From Mark Brown:

	- I was checking the return PC with __kernel_text_address()
	  before the Function Graph trace handling. Mark Brown felt that
	  all the reliability checks should be performed on the original
	  return PC once that is obtained. So, I have moved all the
	  reliability checks to after the Function Graph Trace handling
	  code in the unwinder. Basically, the unwinder should perform
	  PC translations first (for the return trampoline for Function
	  Graph Tracing, Kretprobes, etc). Then, the reliability checks
	  should be applied to the resulting PC.

	- Mark said to improve the naming of the new functions so they
	  don't collide with existing ones. I have used a prefix
	  "unwinder_" for all the new functions.
	From Josh Poimboeuf:

	- In the error scenarios in the unwinder, the reliable flag in
	  the stack frame should be set. Implemented this.

	- Some of the other comments are not relevant to the new code as
	  I have taken a different approach in the new code. That is why
	  I have not made those changes. E.g., Ard wanted me to add the
	  "const" keyword to the global section array. That array does
	  not exist in v4. Similarly, Mark Brown said to use
	  ARRAY_SIZE() for the same array in a for loop.

	Other changes:

	- Add a new definition for SYM_CODE_END() that adds the address
	  range of the function to a special section called
	  "sym_code_functions".

	- Include the new section under initdata in vmlinux.lds.S.

	- Define an early_initcall() to copy the contents of the
	  "sym_code_functions" section to an array by the same name.

	- Define a function unwinder_blacklisted() that compares a
	  return PC against sym_code_sections[]. If there is a match,
	  mark the stack trace unreliable. Call this from
	  unwind_frame().

v3:
	- Implemented a sym_code_ranges[] array to contain section
	  bounds for text sections that contain SYM_CODE_*() functions.
	  The unwinder checks each return PC against the sections. If it
	  falls in any of the sections, the stack trace is marked
	  unreliable.

	- Moved SYM_CODE functions from .text and .init.text into a new
	  text section called ".code.text". Added this section to
	  vmlinux.lds.S and sym_code_ranges[].

	- Fixed the logic in the unwinder that handles the Function
	  Graph Tracer return trampoline.

	- Removed all the previous code that handles:

		- ftrace entry code for traced function

		- special_functions[] array that lists individual
		  functions

		- kretprobe_trampoline() special case

v2:
	- Removed the terminating entry { 0, 0 } in special_functions[]
	  and replaced it with the idiom { /* sentinel */ }.

	- Changed the ftrace trampoline entry ftrace_graph_call in
	  special_functions[] to ftrace_call + 4 and added explanatory
	  comments.

	- Unnested #ifdefs in special_functions[] for FTRACE.
v1:
	- Define a bool field in struct stackframe. This will indicate
	  if a stack trace is reliable.

	- Implement a special_functions[] array that will be populated
	  with special functions in which the stack trace is considered
	  unreliable.

	- Using kallsyms_lookup(), get the address ranges for the
	  special functions and record them.

	- Implement an is_reliable_function(pc). This function will
	  check if a given return PC falls in any of the special
	  functions. If it does, the stack trace is unreliable.

	- Implement a check_reliability() function that will check if a
	  stack frame is reliable. Call is_reliable_function() from
	  check_reliability().

	- Before a return PC is checked against special_functions[], it
	  must be validated as a proper kernel text address. Call
	  __kernel_text_address() from check_reliability().

	- Finally, call check_reliability() from unwind_frame() for each
	  stack frame.

	- Add EL1 exception handlers to special_functions[]:

		el1_sync();
		el1_irq();
		el1_error();
		el1_sync_invalid();
		el1_irq_invalid();
		el1_fiq_invalid();
		el1_error_invalid();

	- The above functions are currently defined as LOCAL symbols.
	  Make them global so that they can be referenced from the
	  unwinder code.

	- Add FTRACE trampolines to special_functions[]:

		ftrace_graph_call()
		ftrace_graph_caller()
		return_to_handler()

	- Add the kretprobe trampoline to special_functions[]:

		kretprobe_trampoline()

Previous versions and discussion
================================

v4: https://lore.kernel.org/linux-arm-kernel/20210516040018.128105-1-madvenka@linux.microsoft.com/
v3: https://lore.kernel.org/linux-arm-kernel/20210503173615.21576-1-madvenka@linux.microsoft.com/
v2: https://lore.kernel.org/linux-arm-kernel/20210405204313.21346-1-madvenka@linux.microsoft.com/
v1: https://lore.kernel.org/linux-arm-kernel/20210330190955.13707-1-madvenka@linux.microsoft.com/

Madhavan T.
Venkataraman (2):
  arm64: Introduce stack trace reliability checks in the unwinder
  arm64: Create a list of SYM_CODE functions, check return PC against
    list

 arch/arm64/include/asm/linkage.h    |  12 +++
 arch/arm64/include/asm/sections.h   |   1 +
 arch/arm64/include/asm/stacktrace.h |   9 ++
 arch/arm64/kernel/stacktrace.c      | 154 +++++++++++++++++++++++++++-
 arch/arm64/kernel/vmlinux.lds.S     |   7 ++
 5 files changed, 178 insertions(+), 5 deletions(-)


base-commit: bf05bf16c76bb44ab5156223e1e58e26dfe30a88
-- 
2.25.1

^ permalink raw reply	[flat|nested] 20+ messages in thread
* [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder
  2021-05-26 21:49 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks madvenka
@ 2021-05-26 21:49 ` madvenka
  2021-06-24 14:40   ` Mark Rutland
  2021-05-26 21:49 ` [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka
  2021-06-04 15:29 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks Mark Brown
  2 siblings, 1 reply; 20+ messages in thread
From: madvenka @ 2021-05-26 21:49 UTC (permalink / raw)
  To: broonie, mark.rutland, jpoimboe, ardb, nobuta.keiya,
	catalin.marinas, will, jmorris, pasha.tatashin, jthierry,
	linux-arm-kernel, live-patching, linux-kernel, madvenka

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

The unwinder should check for the presence of various features and
conditions that can render the stack trace unreliable and mark the
stack trace as unreliable for the benefit of the caller.

Introduce the first reliability check - if a return PC is not a valid
kernel text address, consider the stack trace unreliable. It could be
some generated code.

Other reliability checks will be added in the future.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/include/asm/stacktrace.h |  9 +++++++
 arch/arm64/kernel/stacktrace.c      | 38 +++++++++++++++++++++++++----
 2 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index eb29b1fe8255..4c822ef7f588 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -49,6 +49,13 @@ struct stack_info {
  *
  * @graph: When FUNCTION_GRAPH_TRACER is selected, holds the index of a
  *         replacement lr value in the ftrace graph stack.
+ *
+ * @reliable: Is this stack frame reliable? There are several checks that
+ *            need to be performed in unwind_frame() before a stack frame
+ *            is truly reliable. Until all the checks are present, this flag
+ *            is just a place holder. Once all the checks are implemented,
+ *            this comment will be updated and the flag can be used by the
+ *            caller of unwind_frame().
  */
 struct stackframe {
 	unsigned long fp;
@@ -59,6 +66,7 @@ struct stackframe {
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	int graph;
 #endif
+	bool reliable;
 };
 
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
@@ -169,6 +177,7 @@ static inline void start_backtrace(struct stackframe *frame,
 	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
 	frame->prev_fp = 0;
 	frame->prev_type = STACK_TYPE_UNKNOWN;
+	frame->reliable = true;
 }
 
 #endif /* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index d55bdfb7789c..9061375c8785 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -44,21 +44,29 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 	unsigned long fp = frame->fp;
 	struct stack_info info;
 
+	frame->reliable = true;
+
 	/* Terminal record; nothing to unwind */
 	if (!fp)
 		return -ENOENT;
 
-	if (fp & 0xf)
+	if (fp & 0xf) {
+		frame->reliable = false;
 		return -EINVAL;
+	}
 
 	if (!tsk)
 		tsk = current;
 
-	if (!on_accessible_stack(tsk, fp, &info))
+	if (!on_accessible_stack(tsk, fp, &info)) {
+		frame->reliable = false;
 		return -EINVAL;
+	}
 
-	if (test_bit(info.type, frame->stacks_done))
+	if (test_bit(info.type, frame->stacks_done)) {
+		frame->reliable = false;
 		return -EINVAL;
+	}
 
 	/*
 	 * As stacks grow downward, any valid record on the same stack must be
@@ -74,8 +82,10 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 	 * stack.
 	 */
 	if (info.type == frame->prev_type) {
-		if (fp <= frame->prev_fp)
+		if (fp <= frame->prev_fp) {
+			frame->reliable = false;
 			return -EINVAL;
+		}
 	} else {
 		set_bit(frame->prev_type, frame->stacks_done);
 	}
@@ -100,14 +110,32 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 	 * So replace it to an original value.
 	 */
 	ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
-		if (WARN_ON_ONCE(!ret_stack))
+		if (WARN_ON_ONCE(!ret_stack)) {
+			frame->reliable = false;
 			return -EINVAL;
+		}
 		frame->pc = ret_stack->ret;
 	}
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
 
 	frame->pc = ptrauth_strip_insn_pac(frame->pc);
 
+	/*
+	 * Check the return PC for conditions that make unwinding unreliable.
+	 * In each case, mark the stack trace as such.
+	 */
+
+	/*
+	 * Make sure that the return address is a proper kernel text address.
+	 * A NULL or invalid return address could mean:
+	 *
+	 *	- generated code such as eBPF and optprobe trampolines
+	 *	- Foreign code (e.g. EFI runtime services)
+	 *	- Procedure Linkage Table (PLT) entries and veneer functions
+	 */
+	if (!__kernel_text_address(frame->pc))
+		frame->reliable = false;
+
 	return 0;
 }
 NOKPROBE_SYMBOL(unwind_frame);
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder
  2021-05-26 21:49 ` [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder madvenka
@ 2021-06-24 14:40   ` Mark Rutland
  2021-06-24 16:03     ` Mark Brown
  ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Mark Rutland @ 2021-06-24 14:40 UTC (permalink / raw)
  To: madvenka
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will,
	jmorris, pasha.tatashin, jthierry, linux-arm-kernel,
	live-patching, linux-kernel

Hi Madhavan,

On Wed, May 26, 2021 at 04:49:16PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> The unwinder should check for the presence of various features and
> conditions that can render the stack trace unreliable and mark the
> stack trace as unreliable for the benefit of the caller.
> 
> Introduce the first reliability check - if a return PC is not a valid
> kernel text address, consider the stack trace unreliable. It could be
> some generated code.
> 
> Other reliability checks will be added in the future.
> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

At a high level, I'm on board with keeping track of this per unwind
step, but if we do that then I want to be able to use this during
regular unwinds (e.g. so that we can have a backtrace indicate when a
step is not reliable, like x86 does with '?'), and to do that we need to
be a little more accurate.

I think we first need to do some more preparatory work for that, but
regardless, I have some comments below.
> ---
>  arch/arm64/include/asm/stacktrace.h |  9 +++++++
>  arch/arm64/kernel/stacktrace.c      | 38 +++++++++++++++++++++++++----
>  2 files changed, 42 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index eb29b1fe8255..4c822ef7f588 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -49,6 +49,13 @@ struct stack_info {
>   *
>   * @graph: When FUNCTION_GRAPH_TRACER is selected, holds the index of a
>   *         replacement lr value in the ftrace graph stack.
> + *
> + * @reliable: Is this stack frame reliable? There are several checks that
> + *            need to be performed in unwind_frame() before a stack frame
> + *            is truly reliable. Until all the checks are present, this flag
> + *            is just a place holder. Once all the checks are implemented,
> + *            this comment will be updated and the flag can be used by the
> + *            caller of unwind_frame().

I'd prefer that we state the high-level semantic first, then drill down
into detail, e.g.

| @reliable: Indicates whether this frame is believed to be a reliable
|            unwinding from the parent stackframe. This may be set
|            regardless of whether the parent stackframe was reliable.
|
|            This is set only if all the following are true:
|
|            * @pc is a valid text address.
|
|            Note: this is currently incomplete.

>   */
>  struct stackframe {
>  	unsigned long fp;
> @@ -59,6 +66,7 @@ struct stackframe {
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	int graph;
>  #endif
> +	bool reliable;
>  };
> 
>  extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
> @@ -169,6 +177,7 @@ static inline void start_backtrace(struct stackframe *frame,
>  	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
>  	frame->prev_fp = 0;
>  	frame->prev_type = STACK_TYPE_UNKNOWN;
> +	frame->reliable = true;
>  }

I think we need more data than this to be accurate.
Consider arch_stack_walk() starting from a pt_regs -- the initial state
(the PC from the regs) is accurate, but the first unwind from that will
not be, and we don't account for that at all.

I think we need to capture an unwind type in struct stackframe, which we
can pass into start_backtrace(), e.g.

| enum unwind_type {
|	/*
|	 * The next frame is indicated by the frame pointer.
|	 * The next unwind may or may not be reliable.
|	 */
|	UNWIND_TYPE_FP,
|
|	/*
|	 * The next frame is indicated by the LR in pt_regs.
|	 * The next unwind is not reliable.
|	 */
|	UNWIND_TYPE_REGS_LR,
|
|	/*
|	 * We do not know how to unwind to the next frame.
|	 * The next unwind is not reliable.
|	 */
|	UNWIND_TYPE_UNKNOWN
| };

That should be simple enough to set up around start_backtrace(), but
we'll need further rework to make that simple at exception boundaries.
With the entry rework I have queued for v5.14, we're *almost* down to a
single asm<->c transition point for all vectors, and I'm hoping to
factor the remainder out to C for v5.15, whereupon we can annotate that
BL with some metadata for unwinding (with something similar to x86's
UNWIND_HINT, but retained for runtime).

> 
>  #endif /* __ASM_STACKTRACE_H */
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index d55bdfb7789c..9061375c8785 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -44,21 +44,29 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	unsigned long fp = frame->fp;
>  	struct stack_info info;
> 
> +	frame->reliable = true;

I'd prefer to do this the other way around, e.g. here do:

| /*
|  * Assume that an unwind step is unreliable until it has passed
|  * all relevant checks.
|  */
| frame->reliable = false;

... then only set this to true once we're certain the step is reliable.
That requires fewer changes below, and would also be more robust: if we
forget to update this, we'd accidentally mark an entry as unreliable
rather than accidentally marking it as reliable.

> +
>  	/* Terminal record; nothing to unwind */
>  	if (!fp)
>  		return -ENOENT;
> 
> -	if (fp & 0xf)
> +	if (fp & 0xf) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
> 
>  	if (!tsk)
>  		tsk = current;
> 
> -	if (!on_accessible_stack(tsk, fp, &info))
> +	if (!on_accessible_stack(tsk, fp, &info)) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
> 
> -	if (test_bit(info.type, frame->stacks_done))
> +	if (test_bit(info.type, frame->stacks_done)) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
> 
>  	/*
>  	 * As stacks grow downward, any valid record on the same stack must be
> @@ -74,8 +82,10 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	 * stack.
>  	 */
>  	if (info.type == frame->prev_type) {
> -		if (fp <= frame->prev_fp)
> +		if (fp <= frame->prev_fp) {
> +			frame->reliable = false;
>  			return -EINVAL;
> +		}
>  	} else {
>  		set_bit(frame->prev_type, frame->stacks_done);
>  	}
> @@ -100,14 +110,32 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	 * So replace it to an original value.
>  	 */
>  	ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
> -	if (WARN_ON_ONCE(!ret_stack))
> +	if (WARN_ON_ONCE(!ret_stack)) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
>  	frame->pc = ret_stack->ret;
>  }
> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> 
>  	frame->pc = ptrauth_strip_insn_pac(frame->pc);
> 
> +	/*
> +	 * Check the return PC for conditions that make unwinding unreliable.
> +	 * In each case, mark the stack trace as such.
> +	 */
> +
> +	/*
> +	 * Make sure that the return address is a proper kernel text address.
> +	 * A NULL or invalid return address could mean:
> +	 *
> +	 *	- generated code such as eBPF and optprobe trampolines
> +	 *	- Foreign code (e.g. EFI runtime services)
> +	 *	- Procedure Linkage Table (PLT) entries and veneer functions
> +	 */
> +	if (!__kernel_text_address(frame->pc))
> +		frame->reliable = false;

I don't think we should mention PLTs here. They appear in regular
kernel text, and on arm64 they are generally not problematic for
unwinding. The cases in which they are problematic are where they
interpose a trampoline call that isn't following the AAPCS (e.g. ftrace
calls from a module, or calls to __hwasan_tag_mismatch generally), and
we'll have to catch those explicitly (or forbid RELIABLE_STACKTRACE
with HWASAN).

From a backtrace perspective, the PC itself *is* reliable, but the next
unwind from this frame will not be, so I'd like to mark this as
reliable and the next unwind as unreliable. We can do that with the
UNWIND_TYPE_UNKNOWN suggestion above.

For the comment here, how about:

| /*
|  * If the PC is not a known kernel text address, then we cannot
|  * be sure that a subsequent unwind will be reliable, as we
|  * don't know that the code follows our unwind requirements.
|  */
| if (!__kernel_text_address(frame->pc))
|	frame->unwind = UNWIND_TYPE_UNKNOWN;

Thanks,
Mark.

> 	return 0;
> }
> NOKPROBE_SYMBOL(unwind_frame);
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder
  2021-06-24 14:40   ` Mark Rutland
@ 2021-06-24 16:03     ` Mark Brown
  2021-06-25 15:39     ` Madhavan T. Venkataraman
  2021-06-29 16:47     ` Josh Poimboeuf
  2 siblings, 0 replies; 20+ messages in thread
From: Mark Brown @ 2021-06-24 16:03 UTC (permalink / raw)
  To: Mark Rutland
  Cc: madvenka, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will,
	jmorris, pasha.tatashin, jthierry, linux-arm-kernel,
	live-patching, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 819 bytes --]

On Thu, Jun 24, 2021 at 03:40:21PM +0100, Mark Rutland wrote:

> regular unwinds (e.g. so that we can have a backtrace indicate when a
> step is not reliable, like x86 does with '?'), and to do that we need to
> be a little more accurate.

There was an idea, discussed a bit when I was more actively working on
this, of just refactoring our unwinder infrastructure to be a lot more
like the x86 and (IIRC) S/390 ones in form. Part of the thinking there
was that even where we're not able to actually share code we'd have
more of a common baseline for how things work and what works. It'd make
review, especially cross-architecture review, of what's going on a bit
easier too - see some of the concerns Josh had about the differences
here, for example. It'd be a relatively big bit of refactoring though.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder
  2021-06-24 14:40   ` Mark Rutland
  2021-06-24 16:03     ` Mark Brown
@ 2021-06-25 15:39     ` Madhavan T. Venkataraman
  2021-06-25 15:51       ` Mark Brown
  2021-06-26 15:35       ` Madhavan T. Venkataraman
  2021-06-29 16:47     ` Josh Poimboeuf
  2 siblings, 2 replies; 20+ messages in thread
From: Madhavan T. Venkataraman @ 2021-06-25 15:39 UTC (permalink / raw)
  To: Mark Rutland
  Cc: broonie, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will,
	jmorris, pasha.tatashin, jthierry, linux-arm-kernel,
	live-patching, linux-kernel

On 6/24/21 9:40 AM, Mark Rutland wrote:
> Hi Madhavan,
> 
> On Wed, May 26, 2021 at 04:49:16PM -0500, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> The unwinder should check for the presence of various features and
>> conditions that can render the stack trace unreliable and mark the
>> stack trace as unreliable for the benefit of the caller.
>>
>> Introduce the first reliability check - if a return PC is not a valid
>> kernel text address, consider the stack trace unreliable. It could be
>> some generated code.
>>
>> Other reliability checks will be added in the future.
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> 
> At a high level, I'm on board with keeping track of this per unwind
> step, but if we do that then I want to be able to use this during
> regular unwinds (e.g. so that we can have a backtrace indicate when a
> step is not reliable, like x86 does with '?'), and to do that we need to
> be a little more accurate.
> 

The only consumer of frame->reliable is livepatch. So, in retrospect,
my original per-frame reliability flag was overkill. I was just trying
to provide extra per-frame debug information which is not really a
requirement for livepatch.

So, let us separate the two. I will rename frame->reliable to
frame->livepatch_safe.
This will apply to the whole stacktrace and not to every frame. Pass a livepatch_safe flag to start_backtrace(). This will be the initial value of frame->livepatch_safe. So, if the caller knows that the starting frame is unreliable, he can pass "false" to start_backtrace(). Whenever a reliability check fails, frame->livepatch_safe = false. After that point, it will remain false till the end of the stacktrace. This keeps it simple. Also, once livepatch_safe is set to false, further reliability checks will not be performed (what would be the point?). Finally, it might be a good idea to perform reliability checks even in start_backtrace() so we don't assume that the starting frame is reliable even if the caller passes livepatch_safe=true. What do you think? > I think we first need to do some more preparatory work for that, but > regardless, I have some comments below. > I agree that some more work is required to provide per-frame debug information and tracking. That can be done later. It is not a requirement for livepatch. >> --- >> arch/arm64/include/asm/stacktrace.h | 9 +++++++ >> arch/arm64/kernel/stacktrace.c | 38 +++++++++++++++++++++++++---- >> 2 files changed, 42 insertions(+), 5 deletions(-) >> >> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h >> index eb29b1fe8255..4c822ef7f588 100644 >> --- a/arch/arm64/include/asm/stacktrace.h >> +++ b/arch/arm64/include/asm/stacktrace.h >> @@ -49,6 +49,13 @@ struct stack_info { >> * >> * @graph: When FUNCTION_GRAPH_TRACER is selected, holds the index of a >> * replacement lr value in the ftrace graph stack. >> + * >> + * @reliable: Is this stack frame reliable? There are several checks that >> + * need to be performed in unwind_frame() before a stack frame >> + * is truly reliable. Until all the checks are present, this flag >> + * is just a place holder. 
Once all the checks are implemented, >> + * this comment will be updated and the flag can be used by the >> + * caller of unwind_frame(). > > I'd prefer that we state the high-level semantic first, then drill down > into detail, e.g. > > | @reliable: Indicates whether this frame is believed to be a reliable > | unwinding from the parent stackframe. This may be set > | regardless of whether the parent stackframe was reliable. > | > | This is set only if all the following are true: > | > | * @pc is a valid text address. > | > | Note: this is currently incomplete. > I will change the name of the flag. I will change the comment accordingly. >> */ >> struct stackframe { >> unsigned long fp; >> @@ -59,6 +66,7 @@ struct stackframe { >> #ifdef CONFIG_FUNCTION_GRAPH_TRACER >> int graph; >> #endif >> + bool reliable; >> }; >> >> extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame); >> @@ -169,6 +177,7 @@ static inline void start_backtrace(struct stackframe *frame, >> bitmap_zero(frame->stacks_done, __NR_STACK_TYPES); >> frame->prev_fp = 0; >> frame->prev_type = STACK_TYPE_UNKNOWN; >> + frame->reliable = true; >> } > > I think we need more data than this to be accurate. > > Consider arch_stack_walk() starting from a pt_regs -- the initial state > (the PC from the regs) is accurate, but the first unwind from that will > not be, and we don't account for that at all. > > I think we need to capture an unwind type in struct stackframe, which we > can pass into start_backtrace(), e.g. > > | enum unwind_type { > | /* > | * The next frame is indicated by the frame pointer. > | * The next unwind may or may not be reliable. > | */ > | UNWIND_TYPE_FP, > | > | /* > | * The next frame is indicated by the LR in pt_regs. > | * The next unwind is not reliable. > | */ > | UNWIND_TYPE_REGS_LR, > | > | /* > | * We do not know how to unwind to the next frame. > | * The next unwind is not reliable.
> | */ > | UNWIND_TYPE_UNKNOWN > | }; > > That should be simple enough to set up around start_backtrace(), but > we'll need further rework to make that simple at exception boundaries. > With the entry rework I have queued for v5.14, we're *almost* down to a > single asm<->c transition point for all vectors, and I'm hoping to > factor the remainder out to C for v5.15, whereupon we can annotate that > BL with some metadata for unwinding (with something similar to x86's > UNWIND_HINT, but retained for runtime). > I understood UNWIND_TYPE_FP and UNWIND_TYPE_REGS_LR. When would UNWIND_TYPE_UNKNOWN be passed to start_backtrace? Could you elaborate? Regardless, the above comment applies only to per-frame tracking when it is eventually implemented. For livepatch, it is not needed. At exception boundaries, if stack metadata is available, then use that to unwind safely. Else, livepatch_safe = false. The latter is what is being done in my patch series. So, we can go with that until stack metadata becomes available. For the UNWIND_TYPE_REGS_LR and UNWIND_TYPE_UNKNOWN cases, the caller will pass livepatch_safe=false to start_backtrace(). For UNWIND_TYPE_FP, the caller will pass livepatch_safe=true. So, only UNWIND_TYPE_FP matters for livepatch. >> >> #endif /* __ASM_STACKTRACE_H */ >> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c >> index d55bdfb7789c..9061375c8785 100644 >> --- a/arch/arm64/kernel/stacktrace.c >> +++ b/arch/arm64/kernel/stacktrace.c >> @@ -44,21 +44,29 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) >> unsigned long fp = frame->fp; >> struct stack_info info; >> >> + frame->reliable = true; > > I'd prefer to do this the other way around, e.g. here do: > > | /* > | * Assume that an unwind step is unreliable until it has passed > | * all relevant checks. > | */ > | frame->reliable = false; > > ... then only set this to true once we're certain the step is reliable. 
> > That requires fewer changes below, and would also be more robust as if > we forget to update this we'd accidentally mark an entry as unreliable > rather than accidentally marking it as reliable. > For livepatch_safe, the initial statement setting it to true at the beginning of unwind_frame() goes away. But whenever a reliability check fails, livepatch_safe has to be set to false. >> + >> /* Terminal record; nothing to unwind */ >> if (!fp) >> return -ENOENT; >> >> - if (fp & 0xf) >> + if (fp & 0xf) { >> + frame->reliable = false; >> return -EINVAL; >> + } >> >> if (!tsk) >> tsk = current; >> >> - if (!on_accessible_stack(tsk, fp, &info)) >> + if (!on_accessible_stack(tsk, fp, &info)) { >> + frame->reliable = false; >> return -EINVAL; >> + } >> >> - if (test_bit(info.type, frame->stacks_done)) >> + if (test_bit(info.type, frame->stacks_done)) { >> + frame->reliable = false; >> return -EINVAL; >> + } >> >> /* >> * As stacks grow downward, any valid record on the same stack must be >> @@ -74,8 +82,10 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) >> * stack. >> */ >> if (info.type == frame->prev_type) { >> - if (fp <= frame->prev_fp) >> + if (fp <= frame->prev_fp) { >> + frame->reliable = false; >> return -EINVAL; >> + } >> } else { >> set_bit(frame->prev_type, frame->stacks_done); >> } >> @@ -100,14 +110,32 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) >> * So replace it to an original value. >> */ >> ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++); >> - if (WARN_ON_ONCE(!ret_stack)) >> + if (WARN_ON_ONCE(!ret_stack)) { >> + frame->reliable = false; >> return -EINVAL; >> + } >> frame->pc = ret_stack->ret; >> } >> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ >> >> frame->pc = ptrauth_strip_insn_pac(frame->pc); >> >> + /* >> + * Check the return PC for conditions that make unwinding unreliable. >> + * In each case, mark the stack trace as such. 
>> + */ >> + >> + /* >> + * Make sure that the return address is a proper kernel text address. >> + * A NULL or invalid return address could mean: >> + * >> + * - generated code such as eBPF and optprobe trampolines >> + * - Foreign code (e.g. EFI runtime services) >> + * - Procedure Linkage Table (PLT) entries and veneer functions >> + */ >> + if (!__kernel_text_address(frame->pc)) >> + frame->reliable = false; > > I don't think we should mention PLTs here. They appear in regular kernel > text, and on arm64 they are generally not problematic for unwinding. The > case in which they are problematic is where they interpose a > trampoline call that isn't following the AAPCS (e.g. ftrace calls from a > module, or calls to __hwasan_tag_mismatch generally), and we'll have to > catch those explicitly (or forbid RELIABLE_STACKTRACE with HWASAN). > I will remove the mention of PLTs. > From a backtrace perspective, the PC itself *is* reliable, but the next > unwind from this frame will not be, so I'd like to mark this as > reliable and the next unwind as unreliable. We can do that with the > UNWIND_TYPE_UNKNOWN suggestion above. > In the livepatch_safe approach, it can be set to false as soon as the unwinder realizes that there is unreliability, even if the unreliability is in the next frame. Actually, this would avoid one extra unwind step for livepatch. > For the comment here, how about: > > | /* > | * If the PC is not a known kernel text address, then we cannot > | * be sure that a subsequent unwind will be reliable, as we > | * don't know that the code follows our unwind requirements. > | */ > | if (!__kernel_text_address(frame->pc)) > | frame->unwind = UNWIND_TYPE_UNKNOWN; > OK. I can change the comment. Thanks! Madhavan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder 2021-06-25 15:39 ` Madhavan T. Venkataraman @ 2021-06-25 15:51 ` Mark Brown 2021-06-25 17:05 ` Madhavan T. Venkataraman 2021-06-26 15:35 ` Madhavan T. Venkataraman 1 sibling, 1 reply; 20+ messages in thread From: Mark Brown @ 2021-06-25 15:51 UTC (permalink / raw) To: Madhavan T. Venkataraman Cc: Mark Rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1615 bytes --] On Fri, Jun 25, 2021 at 10:39:57AM -0500, Madhavan T. Venkataraman wrote: > On 6/24/21 9:40 AM, Mark Rutland wrote: > > At a high-level, I'm on-board with keeping track of this per unwind > > step, but if we do that then I want to be abel to use this during > > regular unwinds (e.g. so that we can have a backtrace idicate when a > > step is not reliable, like x86 does with '?'), and to do that we need to > > be a little more accurate. > The only consumer of frame->reliable is livepatch. So, in retrospect, my > original per-frame reliability flag was an overkill. I was just trying to > provide extra per-frame debug information which is not really a requirement > for livepatch. It's not a requirement for livepatch but if it's there a per frame reliability flag would have other uses - for example Mark has mentioned the way x86 prints a ? next to unreliable entries in oops output for example, that'd be handy for people debugging issues and would have the added bonus of ensuring that there's more constant and widespread exercising of the reliability stuff than if it's just used for livepatch which is a bit niche. > So, let us separate the two. I will rename frame->reliable to frame->livepatch_safe. > This will apply to the whole stacktrace and not to every frame. I'd rather keep it as reliable, even with only the livepatch usage I think it's clearer. 
> Finally, it might be a good idea to perform reliability checks even in > start_backtrace() so we don't assume that the starting frame is reliable even > if the caller passes livepatch_safe=true. What do you think? That makes sense to me. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder 2021-06-25 15:51 ` Mark Brown @ 2021-06-25 17:05 ` Madhavan T. Venkataraman 2021-06-25 17:18 ` Madhavan T. Venkataraman 0 siblings, 1 reply; 20+ messages in thread From: Madhavan T. Venkataraman @ 2021-06-25 17:05 UTC (permalink / raw) To: Mark Brown Cc: Mark Rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel On 6/25/21 10:51 AM, Mark Brown wrote: > On Fri, Jun 25, 2021 at 10:39:57AM -0500, Madhavan T. Venkataraman wrote: >> On 6/24/21 9:40 AM, Mark Rutland wrote: > >>> At a high-level, I'm on-board with keeping track of this per unwind >>> step, but if we do that then I want to be abel to use this during >>> regular unwinds (e.g. so that we can have a backtrace idicate when a >>> step is not reliable, like x86 does with '?'), and to do that we need to >>> be a little more accurate. > >> The only consumer of frame->reliable is livepatch. So, in retrospect, my >> original per-frame reliability flag was an overkill. I was just trying to >> provide extra per-frame debug information which is not really a requirement >> for livepatch. > > It's not a requirement for livepatch but if it's there a per frame > reliability flag would have other uses - for example Mark has mentioned > the way x86 prints a ? next to unreliable entries in oops output for > example, that'd be handy for people debugging issues and would have the > added bonus of ensuring that there's more constant and widespread > exercising of the reliability stuff than if it's just used for livepatch > which is a bit niche. > I agree. That is why I introduced the per-frame flag. So, let us try a different approach. First, let us get rid of the frame->reliable flag from this patch series. That flag can be implemented when all of the pieces are in place for per-frame debug and tracking. 
For consumers such as livepatch that don't really care about per-frame stuff, let us solve it more cleanly via the return value of unwind_frame(). Currently, the return value from unwind_frame() is a tri-state return value which is somewhat confusing:

    0       means continue unwinding.
    -error  means stop unwinding. However, -ENOENT means successful
            termination, while other values mean an error has happened.

Instead, let unwind_frame() return one of 3 values:

    enum {
            UNWIND_CONTINUE,
            UNWIND_CONTINUE_WITH_ERRORS,
            UNWIND_STOP,
    };

All consumers will stop unwinding upon seeing UNWIND_STOP. Livepatch type consumers will stop unwinding upon seeing anything other than UNWIND_CONTINUE. Debug type consumers can choose to continue upon seeing UNWIND_CONTINUE_WITH_ERRORS. When we eventually implement per-frame stuff, debug consumers can examine the frame for more information when they see UNWIND_CONTINUE_WITH_ERRORS. This way, my patch series does not have a dependency on the per-frame enhancements. >> So, let us separate the two. I will rename frame->reliable to frame->livepatch_safe. >> This will apply to the whole stacktrace and not to every frame. > > I'd rather keep it as reliable, even with only the livepatch usage I > think it's clearer. > See suggestion above. >> Finally, it might be a good idea to perform reliability checks even in >> start_backtrace() so we don't assume that the starting frame is reliable even >> if the caller passes livepatch_safe=true. What do you think? > > That makes sense to me. > Thanks. Madhavan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder 2021-06-25 17:05 ` Madhavan T. Venkataraman @ 2021-06-25 17:18 ` Madhavan T. Venkataraman 0 siblings, 0 replies; 20+ messages in thread From: Madhavan T. Venkataraman @ 2021-06-25 17:18 UTC (permalink / raw) To: Mark Brown Cc: Mark Rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel On 6/25/21 12:05 PM, Madhavan T. Venkataraman wrote: > > > On 6/25/21 10:51 AM, Mark Brown wrote: >> On Fri, Jun 25, 2021 at 10:39:57AM -0500, Madhavan T. Venkataraman wrote: >>> On 6/24/21 9:40 AM, Mark Rutland wrote: >> >>>> At a high-level, I'm on-board with keeping track of this per unwind >>>> step, but if we do that then I want to be abel to use this during >>>> regular unwinds (e.g. so that we can have a backtrace idicate when a >>>> step is not reliable, like x86 does with '?'), and to do that we need to >>>> be a little more accurate. >> >>> The only consumer of frame->reliable is livepatch. So, in retrospect, my >>> original per-frame reliability flag was an overkill. I was just trying to >>> provide extra per-frame debug information which is not really a requirement >>> for livepatch. >> >> It's not a requirement for livepatch but if it's there a per frame >> reliability flag would have other uses - for example Mark has mentioned >> the way x86 prints a ? next to unreliable entries in oops output for >> example, that'd be handy for people debugging issues and would have the >> added bonus of ensuring that there's more constant and widespread >> exercising of the reliability stuff than if it's just used for livepatch >> which is a bit niche. >> > > I agree. That is why I introduced the per-frame flag. > > So, let us try a different approach. > > First, let us get rid of the frame->reliable flag from this patch series. 
That flag > can be implemented when all of the pieces are in place for per-frame debug and tracking. > > For consumers such as livepatch that don't really care about per-frame stuff, let us > solve it more cleanly via the return value of unwind_frame(). > > Currently, the return value from unwind_frame() is a tri-state return value which is > somewhat confusing. > > 0 means continue unwinding > -error means stop unwinding. However, > -ENOENT means successful termination > Other values mean an error has happened. > > Instead, let unwind_frame() return one of 3 values: > > enum { > UNWIND_CONTINUE, > UNWIND_CONTINUE_WITH_ERRORS, > UNWIND_STOP, > }; > Sorry. I need to add one more value to this. So, the enum will be:

    enum {
            UNWIND_CONTINUE,
            UNWIND_CONTINUE_WITH_ERRORS,
            UNWIND_STOP,
            UNWIND_STOP_WITH_ERRORS,
    };

    UNWIND_CONTINUE (what used to be a return value of 0)
            Continue with the unwind.

    UNWIND_CONTINUE_WITH_ERRORS (new return value)
            Errors encountered. But the errors are not fatal errors like
            stack corruption.

    UNWIND_STOP (what used to be -ENOENT)
            Successful termination of the unwind.

    UNWIND_STOP_WITH_ERRORS (what used to be -EINVAL, etc.)
            Unsuccessful termination.

Sorry I missed this the last time. So, to reiterate: All consumers will stop unwinding when they see UNWIND_STOP and UNWIND_STOP_WITH_ERRORS. Debug type consumers can choose to continue when they see UNWIND_CONTINUE_WITH_ERRORS. Livepatch type consumers will only continue on UNWIND_CONTINUE. This way, my patch series does not have a dependency on the per-frame enhancements. Thanks! Madhavan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder 2021-06-25 15:39 ` Madhavan T. Venkataraman 2021-06-25 15:51 ` Mark Brown @ 2021-06-26 15:35 ` Madhavan T. Venkataraman 1 sibling, 0 replies; 20+ messages in thread From: Madhavan T. Venkataraman @ 2021-06-26 15:35 UTC (permalink / raw) To: Mark Rutland Cc: broonie, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel I will send out the next version without frame->reliable as implementing a per-frame reliability thing obviously needs other changes and needs Mark Rutland's code reorg. Thanks! Madhavan On 6/25/21 10:39 AM, Madhavan T. Venkataraman wrote: > > > On 6/24/21 9:40 AM, Mark Rutland wrote: >> Hi Madhavan, >> >> On Wed, May 26, 2021 at 04:49:16PM -0500, madvenka@linux.microsoft.com wrote: >>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> >>> >>> The unwinder should check for the presence of various features and >>> conditions that can render the stack trace unreliable and mark the >>> the stack trace as unreliable for the benefit of the caller. >>> >>> Introduce the first reliability check - If a return PC is not a valid >>> kernel text address, consider the stack trace unreliable. It could be >>> some generated code. >>> >>> Other reliability checks will be added in the future. >>> >>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> >> >> At a high-level, I'm on-board with keeping track of this per unwind >> step, but if we do that then I want to be abel to use this during >> regular unwinds (e.g. so that we can have a backtrace idicate when a >> step is not reliable, like x86 does with '?'), and to do that we need to >> be a little more accurate. >> > > The only consumer of frame->reliable is livepatch. So, in retrospect, my > original per-frame reliability flag was an overkill. 
> [ remainder of quoted message trimmed; the full text appears earlier in this thread ]

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder 2021-06-24 14:40 ` Mark Rutland 2021-06-24 16:03 ` Mark Brown 2021-06-25 15:39 ` Madhavan T. Venkataraman @ 2021-06-29 16:47 ` Josh Poimboeuf 2 siblings, 0 replies; 20+ messages in thread From: Josh Poimboeuf @ 2021-06-29 16:47 UTC (permalink / raw) To: Mark Rutland Cc: madvenka, broonie, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel On Thu, Jun 24, 2021 at 03:40:21PM +0100, Mark Rutland wrote: > Hi Madhavan, > > On Wed, May 26, 2021 at 04:49:16PM -0500, madvenka@linux.microsoft.com wrote: > > From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> > > > > The unwinder should check for the presence of various features and > > conditions that can render the stack trace unreliable and mark > > the stack trace as unreliable for the benefit of the caller. > > > > Introduce the first reliability check - If a return PC is not a valid > > kernel text address, consider the stack trace unreliable. It could be > > some generated code. > > > > Other reliability checks will be added in the future. > > > > Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> > > At a high-level, I'm on-board with keeping track of this per unwind > step, but if we do that then I want to be able to use this during > regular unwinds (e.g. so that we can have a backtrace indicate when a > step is not reliable, like x86 does with '?'), and to do that we need to > be a little more accurate. On x86, the '?' entries don't come from the unwinder's determination of whether a frame is reliable. (And the x86 unwinder doesn't track reliable-ness on a per-frame basis anyway; it keeps a per-unwind global error state.) The stack dumping code blindly scans the stack for kernel text addresses, in lockstep with calls to the unwinder. Any text addresses which aren't also reported by the unwinder are prepended with '?'.
The point is two-fold: a) failsafe in case the unwinder fails or skips a frame; b) showing of breadcrumbs from previous execution contexts which can help the debugging of more difficult scenarios. -- Josh ^ permalink raw reply [flat|nested] 20+ messages in thread
* [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list 2021-05-26 21:49 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks madvenka 2021-05-26 21:49 ` [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder madvenka @ 2021-05-26 21:49 ` madvenka 2021-06-04 16:24 ` Mark Brown ` (2 more replies) 2021-06-04 15:29 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks Mark Brown 2 siblings, 3 replies; 20+ messages in thread From: madvenka @ 2021-05-26 21:49 UTC (permalink / raw) To: broonie, mark.rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel, madvenka From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> The unwinder should check if the return PC falls in any function that is considered unreliable from an unwinding perspective. If it does, mark the stack trace unreliable. Function types ============== The compiler generates code for C functions and assigns the type STT_FUNC to them. Assembly functions are manually assigned a type: - STT_FUNC for functions defined with SYM_FUNC*() macros - STT_NONE for functions defined with SYM_CODE*() macros In the future, STT_FUNC functions will be analyzed by objtool and "fixed" as necessary. So, they are not "interesting" to the reliable unwinder in the kernel. That leaves SYM_CODE*() functions. These contain low-level code that is difficult or impossible for objtool to analyze. So, objtool ignores them leaving them to the reliable unwinder. These functions must be considered unreliable from an unwinding perspective. Define a special section for unreliable functions ================================================= Define a SYM_CODE_END() macro for arm64 that adds the function address range to a new section called "sym_code_functions". 
Linker file =========== Include the "sym_code_functions" section under initdata in vmlinux.lds.S. Initialization ============== Define an early_initcall() to copy the function address ranges from the "sym_code_functions" section to an array by the same name. Unwinder check ============== Define a function called unwinder_is_unreliable() that compares a return PC with sym_code_functions[]. If there is a match, then mark the stack trace as unreliable. Call unwinder_is_unreliable() from unwind_frame(). Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> --- arch/arm64/include/asm/linkage.h | 12 +++ arch/arm64/include/asm/sections.h | 1 + arch/arm64/kernel/stacktrace.c | 118 +++++++++++++++++++++++++++++- arch/arm64/kernel/vmlinux.lds.S | 7 ++ 4 files changed, 137 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/linkage.h b/arch/arm64/include/asm/linkage.h index ba89a9af820a..3b5f1fd332b0 100644 --- a/arch/arm64/include/asm/linkage.h +++ b/arch/arm64/include/asm/linkage.h @@ -60,4 +60,16 @@ SYM_FUNC_END(x); \ SYM_FUNC_END_ALIAS(__pi_##x) +/* + * Record the address range of each SYM_CODE function in a struct code_range + * in a special section. 
+ */ +#define SYM_CODE_END(name) \ + SYM_END(name, SYM_T_NONE) ;\ + 99: ;\ + .pushsection "sym_code_functions", "aw" ;\ + .quad name ;\ + .quad 99b ;\ + .popsection + #endif diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h index 2f36b16a5b5d..29cb566f65ec 100644 --- a/arch/arm64/include/asm/sections.h +++ b/arch/arm64/include/asm/sections.h @@ -20,5 +20,6 @@ extern char __exittext_begin[], __exittext_end[]; extern char __irqentry_text_start[], __irqentry_text_end[]; extern char __mmuoff_data_start[], __mmuoff_data_end[]; extern char __entry_tramp_text_start[], __entry_tramp_text_end[]; +extern char __sym_code_functions_start[], __sym_code_functions_end[]; #endif /* __ASM_SECTIONS_H */ diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c index 9061375c8785..5477a9d39b12 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -18,6 +18,109 @@ #include <asm/stack_pointer.h> #include <asm/stacktrace.h> +struct code_range { + unsigned long start; + unsigned long end; +}; + +static struct code_range *sym_code_functions; +static int num_sym_code_functions; + +int __init init_sym_code_functions(void) +{ + size_t size; + + size = (unsigned long)__sym_code_functions_end - + (unsigned long)__sym_code_functions_start; + + sym_code_functions = kmalloc(size, GFP_KERNEL); + if (!sym_code_functions) + return -ENOMEM; + + memcpy(sym_code_functions, __sym_code_functions_start, size); + /* Update num_sym_code_functions after copying sym_code_functions. */ + smp_mb(); + num_sym_code_functions = size / sizeof(struct code_range); + + return 0; +} +early_initcall(init_sym_code_functions); + +/* + * Check the return PC against sym_code_functions[]. If there is a match, then + * consider the stack frame unreliable. These functions contain low-level + * code where the frame pointer and/or the return address register cannot be + * relied upon. 
This addresses the following situations: + * + * - Exception handlers and entry assembly + * - Trampoline assembly (e.g., ftrace, kprobes) + * - Hypervisor-related assembly + * - Hibernation-related assembly + * - CPU start-stop, suspend-resume assembly + * - Kernel relocation assembly + * + * Some special cases covered by sym_code_functions[] deserve a mention here: + * + * - All EL1 interrupt and exception stack traces will be considered + * unreliable. This is the correct behavior as interrupts and exceptions + * can happen on any instruction including ones in the frame pointer + * prolog and epilog. Unless stack metadata is available so the unwinder + * can unwind through these special cases, such stack traces will be + * considered unreliable. + * + * - A task can get preempted at the end of an interrupt. Stack traces + * of preempted tasks will show the interrupt frame in the stack trace + * and will be considered unreliable. + * + * - Breakpoints are exceptions. So, all stack traces in the breakpoint + * handler (including probes) will be considered unreliable. + * + * - All of the ftrace entry trampolines are considered unreliable. So, + * all stack traces taken from tracer functions will be considered + * unreliable. + * + * - The Function Graph Tracer return trampoline (return_to_handler) + * and the Kretprobe return trampoline (kretprobe_trampoline) are + * also considered unreliable. + * + * Some of the special cases above can be unwound through using special logic + * in unwind_frame(). + * + * - return_to_handler() is handled by the unwinder by attempting to + * retrieve the original return address from the per-task return + * address stack. + * + * - kretprobe_trampoline() can be handled in a similar fashion by + * attempting to retrieve the original return address from the per-task + * kretprobe instance list. + * + * - I reckon optprobes can be handled in a similar fashion in the future? 
+ * + * - Stack traces taken from the FTrace tracer functions can be handled + * as well. ftrace_call is an inner label defined in the Ftrace entry + * trampoline. This is the location where the call to a tracer function + * is patched. So, if the return PC equals ftrace_call+4, it is + * reliable. At that point, proper stack frames have already been set + * up for the traced function and its caller. + */ +static bool unwinder_is_unreliable(unsigned long pc) +{ + const struct code_range *range; + int i; + + /* + * If sym_code_functions[] were sorted, a binary search could be + * done to make this more performant. + */ + for (i = 0; i < num_sym_code_functions; i++) { + range = &sym_code_functions[i]; + if (pc >= range->start && pc < range->end) + return true; + } + + return false; +} + /* * AArch64 PCS assigns the frame pointer to x29. * @@ -133,7 +236,20 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) * - Foreign code (e.g. EFI runtime services) * - Procedure Linkage Table (PLT) entries and veneer functions */ - if (!__kernel_text_address(frame->pc)) + if (!__kernel_text_address(frame->pc)) { + frame->reliable = false; + return 0; + } + + /* + * If the final frame has been reached, there is no more unwinding + * to do. There is no need to check if the return PC is considered + * unreliable by the unwinder. + */ + if (!frame->fp) + return 0; + + if (unwinder_is_unreliable(frame->pc)) frame->reliable = false; return 0; diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index 7eea7888bb02..32e8d57397a1 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -103,6 +103,12 @@ jiffies = jiffies_64; #define TRAMP_TEXT #endif +#define SYM_CODE_FUNCTIONS \ + . 
= ALIGN(16); \ + __sym_code_functions_start = .; \ + KEEP(*(sym_code_functions)) \ + __sym_code_functions_end = .; + /* * The size of the PE/COFF section that covers the kernel image, which * runs from _stext to _edata, must be a round multiple of the PE/COFF @@ -218,6 +224,7 @@ SECTIONS CON_INITCALL INIT_RAM_FS *(.init.altinstructions .init.bss) /* from the EFI stub */ + SYM_CODE_FUNCTIONS } .exit.data : { EXIT_DATA -- 2.25.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list 2021-05-26 21:49 ` [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka @ 2021-06-04 16:24 ` Mark Brown 2021-06-04 20:38 ` Madhavan T. Venkataraman 2021-06-04 16:59 ` Mark Brown 2021-06-16 1:52 ` Suraj Jitindar Singh 2 siblings, 1 reply; 20+ messages in thread From: Mark Brown @ 2021-06-04 16:24 UTC (permalink / raw) To: madvenka Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1455 bytes --] On Wed, May 26, 2021 at 04:49:17PM -0500, madvenka@linux.microsoft.com wrote: > The unwinder should check if the return PC falls in any function that > is considered unreliable from an unwinding perspective. If it does, > mark the stack trace unreliable. Reviewed-by: Mark Brown <broonie@kernel.org> However it'd be good for someone else to double check this as it's entirely possible that I've missed some case here. > + * Some special cases covered by sym_code_functions[] deserve a mention here: > + * - All EL1 interrupt and exception stack traces will be considered > + * unreliable. This is the correct behavior as interrupts and exceptions > + * can happen on any instruction including ones in the frame pointer > + * prolog and epilog. Unless stack metadata is available so the unwinder > + * can unwind through these special cases, such stack traces will be > + * considered unreliable. > + * If you're respinning this it's probably also worth noting that we only ever perform reliable stack trace on either blocked tasks or the current task which should if my reasoning is correct mean that the fact that the exclusions here mean that we avoid having to worry about so many race conditions when entering and leaving functions. 
If we got preempted at the wrong moment for one of them then we should observe the preemption and mark the trace as unreliable due to that which means that any confusion the race causes is a non-issue. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list 2021-06-04 16:24 ` Mark Brown @ 2021-06-04 20:38 ` Madhavan T. Venkataraman 0 siblings, 0 replies; 20+ messages in thread From: Madhavan T. Venkataraman @ 2021-06-04 20:38 UTC (permalink / raw) To: Mark Brown Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel On 6/4/21 11:24 AM, Mark Brown wrote: > On Wed, May 26, 2021 at 04:49:17PM -0500, madvenka@linux.microsoft.com wrote: > >> The unwinder should check if the return PC falls in any function that >> is considered unreliable from an unwinding perspective. If it does, >> mark the stack trace unreliable. > > Reviewed-by: Mark Brown <broonie@kernel.org> > Thanks. > However it'd be good for someone else to double check this as it's > entirely possible that I've missed some case here. > I will request Mark Rutland to review this as well. >> + * Some special cases covered by sym_code_functions[] deserve a mention here: > >> + * - All EL1 interrupt and exception stack traces will be considered >> + * unreliable. This is the correct behavior as interrupts and exceptions >> + * can happen on any instruction including ones in the frame pointer >> + * prolog and epilog. Unless stack metadata is available so the unwinder >> + * can unwind through these special cases, such stack traces will be >> + * considered unreliable. >> + * > > If you're respinning this it's probably also worth noting that we only > ever perform reliable stack trace on either blocked tasks or the current > task which should if my reasoning is correct mean that the fact that > the exclusions here mean that we avoid having to worry about so many > race conditions when entering and leaving functions. 
If we got > preempted at the wrong moment for one of them then we should observe the > preemption and mark the trace as unreliable due to that which means that > any confusion the race causes is a non-issue. > I will add a comment that "livepatch only looks at tasks that are currently not on any CPU (except for the current task). Such tasks either blocked on something and gave up the CPU voluntarily. Or, they were preempted. The above comment applies to the latter case". Madhavan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list 2021-05-26 21:49 ` [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka 2021-06-04 16:24 ` Mark Brown @ 2021-06-04 16:59 ` Mark Brown 2021-06-04 20:40 ` Madhavan T. Venkataraman 2021-06-16 1:52 ` Suraj Jitindar Singh 2 siblings, 1 reply; 20+ messages in thread From: Mark Brown @ 2021-06-04 16:59 UTC (permalink / raw) To: madvenka Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel [-- Attachment #1: Type: text/plain, Size: 656 bytes --] On Wed, May 26, 2021 at 04:49:17PM -0500, madvenka@linux.microsoft.com wrote: > + * - return_to_handler() is handled by the unwinder by attempting to > + * retrieve the original return address from the per-task return > + * address stack. > + * > + * - kretprobe_trampoline() can be handled in a similar fashion by > + * attempting to retrieve the original return address from the per-task > + * kretprobe instance list. > + * > + * - I reckon optprobes can be handled in a similar fashion in the future? Note that there's a patch for optprobes on the list now: https://lore.kernel.org/r/1622803839-27354-1-git-send-email-liuqi115@huawei.com [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list 2021-06-04 16:59 ` Mark Brown @ 2021-06-04 20:40 ` Madhavan T. Venkataraman 0 siblings, 0 replies; 20+ messages in thread From: Madhavan T. Venkataraman @ 2021-06-04 20:40 UTC (permalink / raw) To: Mark Brown Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel On 6/4/21 11:59 AM, Mark Brown wrote: > On Wed, May 26, 2021 at 04:49:17PM -0500, madvenka@linux.microsoft.com wrote: > >> + * - return_to_handler() is handled by the unwinder by attempting to >> + * retrieve the original return address from the per-task return >> + * address stack. >> + * >> + * - kretprobe_trampoline() can be handled in a similar fashion by >> + * attempting to retrieve the original return address from the per-task >> + * kretprobe instance list. >> + * >> + * - I reckon optprobes can be handled in a similar fashion in the future? > > Note that there's a patch for optprobes on the list now: > > https://lore.kernel.org/r/1622803839-27354-1-git-send-email-liuqi115@huawei.com Yes. I saw that. Madhavan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list 2021-05-26 21:49 ` [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka 2021-06-04 16:24 ` Mark Brown 2021-06-04 16:59 ` Mark Brown @ 2021-06-16 1:52 ` Suraj Jitindar Singh 2021-06-16 9:15 ` nobuta.keiya 2021-06-16 11:10 ` Madhavan T. Venkataraman 2 siblings, 2 replies; 20+ messages in thread From: Suraj Jitindar Singh @ 2021-06-16 1:52 UTC (permalink / raw) To: madvenka, broonie, mark.rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel On Wed, 2021-05-26 at 16:49 -0500, madvenka@linux.microsoft.com wrote: > From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> > > The unwinder should check if the return PC falls in any function that > is considered unreliable from an unwinding perspective. If it does, > mark the stack trace unreliable. > [snip] Correct me if I'm wrong, but do you not need to move the final frame check to before the unwinder_is_unreliable() call? Userland threads which have ret_from_fork as the last entry on the stack will always be marked unreliable as they will always have a SYM_CODE entry on their stack (the ret_from_fork). Also given that this means the last frame has been reached and as such there's no more unwinding to do, I don't think we care if the last pc is a code address. - Suraj > * > @@ -133,7 +236,20 @@ int notrace unwind_frame(struct task_struct > *tsk, struct stackframe *frame) > * - Foreign code (e.g. EFI runtime services) > * - Procedure Linkage Table (PLT) entries and veneer > functions > */ > - if (!__kernel_text_address(frame->pc)) > + if (!__kernel_text_address(frame->pc)) { > + frame->reliable = false; > + return 0; > + } > + > + /* > + * If the final frame has been reached, there is no more > unwinding > + * to do. 
There is no need to check if the return PC is > considered > + * unreliable by the unwinder. > + */ > + if (!frame->fp) > + return 0; if (frame->fp == (unsigned long)task_pt_regs(tsk)->stackframe) return -ENOENT; > + > + if (unwinder_is_unreliable(frame->pc)) > frame->reliable = false; > > return 0; > diff --git a/arch/arm64/kernel/vmlinux.lds.S > b/arch/arm64/kernel/vmlinux.lds.S > index 7eea7888bb02..32e8d57397a1 100644 > --- a/arch/arm64/kernel/vmlinux.lds.S > +++ b/arch/arm64/kernel/vmlinux.lds.S > @@ -103,6 +103,12 @@ jiffies = jiffies_64; > #define TRAMP_TEXT > #endif > > +#define SYM_CODE_FUNCTIONS \ > + . = ALIGN(16); \ > + __sym_code_functions_start = .; \ > + KEEP(*(sym_code_functions)) \ > + __sym_code_functions_end = .; > + > /* > * The size of the PE/COFF section that covers the kernel image, > which > * runs from _stext to _edata, must be a round multiple of the > PE/COFF > @@ -218,6 +224,7 @@ SECTIONS > CON_INITCALL > INIT_RAM_FS > *(.init.altinstructions .init.bss) /* from the > EFI stub */ > + SYM_CODE_FUNCTIONS > } > .exit.data : { > EXIT_DATA ^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list 2021-06-16 1:52 ` Suraj Jitindar Singh @ 2021-06-16 9:15 ` nobuta.keiya 2021-06-16 11:10 ` Madhavan T. Venkataraman 1 sibling, 0 replies; 20+ messages in thread From: nobuta.keiya @ 2021-06-16 9:15 UTC (permalink / raw) To: 'Suraj Jitindar Singh', madvenka, broonie, mark.rutland, jpoimboe, ardb, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel Hi Suraj, > > if (frame->fp == (unsigned long)task_pt_regs(tsk)->stackframe) > return -ENOENT; If I understand correctly, a similar final frame check is introduced in this patch: arm64: Implement stack trace termination record https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?id=7d7b720a4b80 It is currently merged into for-next/stacktrace branch. Thanks & Best Regards, Keiya Nobuta <nobuta.keiya@fujitsu.com> --------------------------------------------------------- Solution Development Dept. Software Div. FUJITSU COMPUTER TECHNOLOGIES Ltd. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list 2021-06-16 1:52 ` Suraj Jitindar Singh 2021-06-16 9:15 ` nobuta.keiya @ 2021-06-16 11:10 ` Madhavan T. Venkataraman 1 sibling, 0 replies; 20+ messages in thread From: Madhavan T. Venkataraman @ 2021-06-16 11:10 UTC (permalink / raw) To: Suraj Jitindar Singh, broonie, mark.rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel On 6/15/21 8:52 PM, Suraj Jitindar Singh wrote: > On Wed, 2021-05-26 at 16:49 -0500, madvenka@linux.microsoft.com wrote: >> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> >> >> The unwinder should check if the return PC falls in any function that >> is considered unreliable from an unwinding perspective. If it does, >> mark the stack trace unreliable. >> > > [snip] > > Correct me if I'm wrong, but do you not need to move the final frame > check to before the unwinder_is_unreliable() call? > That is done in a patch series that has been merged into for-next/stacktrace branch. When I merge this patch series with that, the final frame check will be done prior. I have mentioned this in the cover letter: Last stack frame ================ If a SYM_CODE function occurs in the very last frame in the stack trace, then the stack trace is not considered unreliable. This is because there is no more unwinding to do. Examples: - EL0 exception stack traces end in the top level EL0 exception handlers. - All kernel thread stack traces end in ret_from_fork(). Madhavan > Userland threads which have ret_from_fork as the last entry on the > stack will always be marked unreliable as they will always have a > SYM_CODE entry on their stack (the ret_from_fork). > > Also given that this means the last frame has been reached and as such > there's no more unwinding to do, I don't think we care if the last pc > is a code address. 
> > - Suraj > >> * >> @@ -133,7 +236,20 @@ int notrace unwind_frame(struct task_struct >> *tsk, struct stackframe *frame) >> * - Foreign code (e.g. EFI runtime services) >> * - Procedure Linkage Table (PLT) entries and veneer >> functions >> */ >> - if (!__kernel_text_address(frame->pc)) >> + if (!__kernel_text_address(frame->pc)) { >> + frame->reliable = false; >> + return 0; >> + } >> + >> + /* >> + * If the final frame has been reached, there is no more >> unwinding >> + * to do. There is no need to check if the return PC is >> considered >> + * unreliable by the unwinder. >> + */ >> + if (!frame->fp) >> + return 0; > > if (frame->fp == (unsigned long)task_pt_regs(tsk)->stackframe) > return -ENOENT; > >> + >> + if (unwinder_is_unreliable(frame->pc)) >> frame->reliable = false; >> >> return 0; >> diff --git a/arch/arm64/kernel/vmlinux.lds.S >> b/arch/arm64/kernel/vmlinux.lds.S >> index 7eea7888bb02..32e8d57397a1 100644 >> --- a/arch/arm64/kernel/vmlinux.lds.S >> +++ b/arch/arm64/kernel/vmlinux.lds.S >> @@ -103,6 +103,12 @@ jiffies = jiffies_64; >> #define TRAMP_TEXT >> #endif >> >> +#define SYM_CODE_FUNCTIONS \ >> + . = ALIGN(16); \ >> + __sym_code_functions_start = .; \ >> + KEEP(*(sym_code_functions)) \ >> + __sym_code_functions_end = .; >> + >> /* >> * The size of the PE/COFF section that covers the kernel image, >> which >> * runs from _stext to _edata, must be a round multiple of the >> PE/COFF >> @@ -218,6 +224,7 @@ SECTIONS >> CON_INITCALL >> INIT_RAM_FS >> *(.init.altinstructions .init.bss) /* from the >> EFI stub */ >> + SYM_CODE_FUNCTIONS >> } >> .exit.data : { >> EXIT_DATA ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks 2021-05-26 21:49 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks madvenka 2021-05-26 21:49 ` [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder madvenka 2021-05-26 21:49 ` [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka @ 2021-06-04 15:29 ` Mark Brown 2021-06-04 20:44 ` Madhavan T. Venkataraman 2 siblings, 1 reply; 20+ messages in thread From: Mark Brown @ 2021-06-04 15:29 UTC (permalink / raw) To: madvenka Cc: mark.rutland, jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel [-- Attachment #1: Type: text/plain, Size: 426 bytes --] On Wed, May 26, 2021 at 04:49:15PM -0500, madvenka@linux.microsoft.com wrote: > From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> > > There are a number of places in kernel code where the stack trace is not > reliable. Enhance the unwinder to check for those cases and mark the > stack trace as unreliable. Once all of the checks are in place, the unwinder Reviewed-by: Mark Brown <broonie@kernel.org> [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks 2021-06-04 15:29 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks Mark Brown @ 2021-06-04 20:44 ` Madhavan T. Venkataraman 0 siblings, 0 replies; 20+ messages in thread From: Madhavan T. Venkataraman @ 2021-06-04 20:44 UTC (permalink / raw) To: Mark Brown, Mark Rutland Cc: jpoimboe, ardb, nobuta.keiya, catalin.marinas, will, jmorris, pasha.tatashin, jthierry, linux-arm-kernel, live-patching, linux-kernel Hi Mark Rutland, Could you please review this patch series when you get a chance? I would really like to get a confirmation from you that there are no gaps in this. Thanks in advance! Madhavan On 6/4/21 10:29 AM, Mark Brown wrote: > On Wed, May 26, 2021 at 04:49:15PM -0500, madvenka@linux.microsoft.com wrote: >> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> >> >> There are a number of places in kernel code where the stack trace is not >> reliable. Enhance the unwinder to check for those cases and mark the >> stack trace as unreliable. Once all of the checks are in place, the unwinder > > Reviewed-by: Mark Brown <broonie@kernel.org> > ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2021-06-29 16:47 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <ea0ef9ed6eb34618bcf468fbbf8bdba99e15df7d> 2021-05-26 21:49 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks madvenka 2021-05-26 21:49 ` [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder madvenka 2021-06-24 14:40 ` Mark Rutland 2021-06-24 16:03 ` Mark Brown 2021-06-25 15:39 ` Madhavan T. Venkataraman 2021-06-25 15:51 ` Mark Brown 2021-06-25 17:05 ` Madhavan T. Venkataraman 2021-06-25 17:18 ` Madhavan T. Venkataraman 2021-06-26 15:35 ` Madhavan T. Venkataraman 2021-06-29 16:47 ` Josh Poimboeuf 2021-05-26 21:49 ` [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka 2021-06-04 16:24 ` Mark Brown 2021-06-04 20:38 ` Madhavan T. Venkataraman 2021-06-04 16:59 ` Mark Brown 2021-06-04 20:40 ` Madhavan T. Venkataraman 2021-06-16 1:52 ` Suraj Jitindar Singh 2021-06-16 9:15 ` nobuta.keiya 2021-06-16 11:10 ` Madhavan T. Venkataraman 2021-06-04 15:29 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks Mark Brown 2021-06-04 20:44 ` Madhavan T. Venkataraman