All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: madvenka@linux.microsoft.com
Cc: broonie@kernel.org, jpoimboe@redhat.com, ardb@kernel.org,
	nobuta.keiya@fujitsu.com, catalin.marinas@arm.com,
	will@kernel.org, jmorris@namei.org, pasha.tatashin@soleen.com,
	jthierry@redhat.com, linux-arm-kernel@lists.infradead.org,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder
Date: Thu, 24 Jun 2021 15:40:21 +0100	[thread overview]
Message-ID: <20210624144021.GA17937@C02TD0UTHF1T.local> (raw)
In-Reply-To: <20210526214917.20099-2-madvenka@linux.microsoft.com>

Hi Madhavan,

On Wed, May 26, 2021 at 04:49:16PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> The unwinder should check for the presence of various features and
> conditions that can render the stack trace unreliable and mark the
> the stack trace as unreliable for the benefit of the caller.
> 
> Introduce the first reliability check - If a return PC is not a valid
> kernel text address, consider the stack trace unreliable. It could be
> some generated code.
> 
> Other reliability checks will be added in the future.
> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

At a high-level, I'm on-board with keeping track of this per unwind
step, but if we do that then I want to be abel to use this during
regular unwinds (e.g. so that we can have a backtrace idicate when a
step is not reliable, like x86 does with '?'), and to do that we need to
be a little more accurate.

I think we first need to do some more preparatory work for that, but
regardless, I have some comments below.

> ---
>  arch/arm64/include/asm/stacktrace.h |  9 +++++++
>  arch/arm64/kernel/stacktrace.c      | 38 +++++++++++++++++++++++++----
>  2 files changed, 42 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index eb29b1fe8255..4c822ef7f588 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -49,6 +49,13 @@ struct stack_info {
>   *
>   * @graph:       When FUNCTION_GRAPH_TRACER is selected, holds the index of a
>   *               replacement lr value in the ftrace graph stack.
> + *
> + * @reliable:	Is this stack frame reliable? There are several checks that
> + *              need to be performed in unwind_frame() before a stack frame
> + *              is truly reliable. Until all the checks are present, this flag
> + *              is just a place holder. Once all the checks are implemented,
> + *              this comment will be updated and the flag can be used by the
> + *              caller of unwind_frame().

I'd prefer that we state the high-level semantic first, then drill down
into detail, e.g.

| @reliable: Indicates whether this frame is beleived to be a reliable
|            unwinding from the parent stackframe. This may be set
|            regardless of whether the parent stackframe was reliable.
|            
|            This is set only if all the following are true:
| 
|            * @pc is a valid text address.
| 
|            Note: this is currently incomplete.

>   */
>  struct stackframe {
>  	unsigned long fp;
> @@ -59,6 +66,7 @@ struct stackframe {
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	int graph;
>  #endif
> +	bool reliable;
>  };
>  
>  extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
> @@ -169,6 +177,7 @@ static inline void start_backtrace(struct stackframe *frame,
>  	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
>  	frame->prev_fp = 0;
>  	frame->prev_type = STACK_TYPE_UNKNOWN;
> +	frame->reliable = true;
>  }

I think we need more data than this to be accurate.

Consider arch_stack_walk() starting from a pt_regs -- the initial state
(the PC from the regs) is accurate, but the first unwind from that will
not be, and we don't account for that at all.

I think we need to capture an unwind type in struct stackframe, which we
can pass into start_backtrace(), e.g.

| enum unwind_type {
|         /*
|          * The next frame is indicated by the frame pointer.
|          * The next unwind may or may not be reliable.
|          */
|         UNWIND_TYPE_FP,
| 
|         /*
|          * The next frame is indicated by the LR in pt_regs.
|          * The next unwind is not reliable.
|          */
|         UNWIND_TYPE_REGS_LR,
| 
|         /*
|          * We do not know how to unwind to the next frame.
|          * The next unwind is not reliable.
|          */
|         UNWIND_TYPE_UNKNOWN
| };

That should be simple enough to set up around start_backtrace(), but
we'll need further rework to make that simple at exception boundaries.
With the entry rework I have queued for v5.14, we're *almost* down to a
single asm<->c transition point for all vectors, and I'm hoping to
factor the remainder out to C for v5.15, whereupon we can annotate that
BL with some metadata for unwinding (with something similar to x86's
UNWIND_HINT, but retained for runtime).

>  
>  #endif	/* __ASM_STACKTRACE_H */
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index d55bdfb7789c..9061375c8785 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -44,21 +44,29 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	unsigned long fp = frame->fp;
>  	struct stack_info info;
>  
> +	frame->reliable = true;

I'd prefer to do this the other way around, e.g. here do:

|        /*
|         * Assume that an unwind step is unreliable until it has passed
|         * all relevant checks.
|         */
|        frame->reliable = false;

... then only set this to true once we're certain the step is reliable.

That requires fewer changes below, and would also be more robust as if
we forget to update this we'd accidentally mark an entry as unreliable
rather than accidentally marking it as reliable.

> +
>  	/* Terminal record; nothing to unwind */
>  	if (!fp)
>  		return -ENOENT;
>  
> -	if (fp & 0xf)
> +	if (fp & 0xf) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
>  
>  	if (!tsk)
>  		tsk = current;
>  
> -	if (!on_accessible_stack(tsk, fp, &info))
> +	if (!on_accessible_stack(tsk, fp, &info)) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
>  
> -	if (test_bit(info.type, frame->stacks_done))
> +	if (test_bit(info.type, frame->stacks_done)) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
>  
>  	/*
>  	 * As stacks grow downward, any valid record on the same stack must be
> @@ -74,8 +82,10 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	 * stack.
>  	 */
>  	if (info.type == frame->prev_type) {
> -		if (fp <= frame->prev_fp)
> +		if (fp <= frame->prev_fp) {
> +			frame->reliable = false;
>  			return -EINVAL;
> +		}
>  	} else {
>  		set_bit(frame->prev_type, frame->stacks_done);
>  	}
> @@ -100,14 +110,32 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  		 * So replace it to an original value.
>  		 */
>  		ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
> -		if (WARN_ON_ONCE(!ret_stack))
> +		if (WARN_ON_ONCE(!ret_stack)) {
> +			frame->reliable = false;
>  			return -EINVAL;
> +		}
>  		frame->pc = ret_stack->ret;
>  	}
>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>  
>  	frame->pc = ptrauth_strip_insn_pac(frame->pc);
>  
> +	/*
> +	 * Check the return PC for conditions that make unwinding unreliable.
> +	 * In each case, mark the stack trace as such.
> +	 */
> +
> +	/*
> +	 * Make sure that the return address is a proper kernel text address.
> +	 * A NULL or invalid return address could mean:
> +	 *
> +	 *	- generated code such as eBPF and optprobe trampolines
> +	 *	- Foreign code (e.g. EFI runtime services)
> +	 *	- Procedure Linkage Table (PLT) entries and veneer functions
> +	 */
> +	if (!__kernel_text_address(frame->pc))
> +		frame->reliable = false;

I don't think we should mention PLTs here. They appear in regular kernel
text, and on arm64 they are generally not problematic for unwinding. The
case in which they are problematic are where they interpose an
trampoline call that isn't following the AAPCS (e.g. ftrace calls from a
module, or calls to __hwasan_tag_mismatch generally), and we'll have to
catch those explciitly (or forbid RELIABLE_STACKTRACE with HWASAN).

From a backtrace perspective, the PC itself *is* reliable, but the next
unwind from this frame will not be, so I'd like to mark this as
reliable and the next unwind as unreliable. We can do that with the
UNWIND_TYPE_UNKNOWN suggestion above.

For the comment here, how about:

|	/*
|	 * If the PC is not a known kernel text address, then we cannot
|	 * be sure that a subsequent unwind will be reliable, as we
|	 * don't know that the code follows our unwind requirements.
|	 */
|	if (!__kernel_text_address(frame-pc))
|		frame->unwind = UNWIND_TYPE_UNKNOWN;

Thanks,
Mark.

>  	return 0;
>  }
>  NOKPROBE_SYMBOL(unwind_frame);
> -- 
> 2.25.1
> 

WARNING: multiple messages have this Message-ID (diff)
From: Mark Rutland <mark.rutland@arm.com>
To: madvenka@linux.microsoft.com
Cc: broonie@kernel.org, jpoimboe@redhat.com, ardb@kernel.org,
	nobuta.keiya@fujitsu.com, catalin.marinas@arm.com,
	will@kernel.org, jmorris@namei.org, pasha.tatashin@soleen.com,
	jthierry@redhat.com, linux-arm-kernel@lists.infradead.org,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder
Date: Thu, 24 Jun 2021 15:40:21 +0100	[thread overview]
Message-ID: <20210624144021.GA17937@C02TD0UTHF1T.local> (raw)
In-Reply-To: <20210526214917.20099-2-madvenka@linux.microsoft.com>

Hi Madhavan,

On Wed, May 26, 2021 at 04:49:16PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> The unwinder should check for the presence of various features and
> conditions that can render the stack trace unreliable and mark the
> the stack trace as unreliable for the benefit of the caller.
> 
> Introduce the first reliability check - If a return PC is not a valid
> kernel text address, consider the stack trace unreliable. It could be
> some generated code.
> 
> Other reliability checks will be added in the future.
> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

At a high-level, I'm on-board with keeping track of this per unwind
step, but if we do that then I want to be abel to use this during
regular unwinds (e.g. so that we can have a backtrace idicate when a
step is not reliable, like x86 does with '?'), and to do that we need to
be a little more accurate.

I think we first need to do some more preparatory work for that, but
regardless, I have some comments below.

> ---
>  arch/arm64/include/asm/stacktrace.h |  9 +++++++
>  arch/arm64/kernel/stacktrace.c      | 38 +++++++++++++++++++++++++----
>  2 files changed, 42 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index eb29b1fe8255..4c822ef7f588 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -49,6 +49,13 @@ struct stack_info {
>   *
>   * @graph:       When FUNCTION_GRAPH_TRACER is selected, holds the index of a
>   *               replacement lr value in the ftrace graph stack.
> + *
> + * @reliable:	Is this stack frame reliable? There are several checks that
> + *              need to be performed in unwind_frame() before a stack frame
> + *              is truly reliable. Until all the checks are present, this flag
> + *              is just a place holder. Once all the checks are implemented,
> + *              this comment will be updated and the flag can be used by the
> + *              caller of unwind_frame().

I'd prefer that we state the high-level semantic first, then drill down
into detail, e.g.

| @reliable: Indicates whether this frame is beleived to be a reliable
|            unwinding from the parent stackframe. This may be set
|            regardless of whether the parent stackframe was reliable.
|            
|            This is set only if all the following are true:
| 
|            * @pc is a valid text address.
| 
|            Note: this is currently incomplete.

>   */
>  struct stackframe {
>  	unsigned long fp;
> @@ -59,6 +66,7 @@ struct stackframe {
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	int graph;
>  #endif
> +	bool reliable;
>  };
>  
>  extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
> @@ -169,6 +177,7 @@ static inline void start_backtrace(struct stackframe *frame,
>  	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
>  	frame->prev_fp = 0;
>  	frame->prev_type = STACK_TYPE_UNKNOWN;
> +	frame->reliable = true;
>  }

I think we need more data than this to be accurate.

Consider arch_stack_walk() starting from a pt_regs -- the initial state
(the PC from the regs) is accurate, but the first unwind from that will
not be, and we don't account for that at all.

I think we need to capture an unwind type in struct stackframe, which we
can pass into start_backtrace(), e.g.

| enum unwind_type {
|         /*
|          * The next frame is indicated by the frame pointer.
|          * The next unwind may or may not be reliable.
|          */
|         UNWIND_TYPE_FP,
| 
|         /*
|          * The next frame is indicated by the LR in pt_regs.
|          * The next unwind is not reliable.
|          */
|         UNWIND_TYPE_REGS_LR,
| 
|         /*
|          * We do not know how to unwind to the next frame.
|          * The next unwind is not reliable.
|          */
|         UNWIND_TYPE_UNKNOWN
| };

That should be simple enough to set up around start_backtrace(), but
we'll need further rework to make that simple at exception boundaries.
With the entry rework I have queued for v5.14, we're *almost* down to a
single asm<->c transition point for all vectors, and I'm hoping to
factor the remainder out to C for v5.15, whereupon we can annotate that
BL with some metadata for unwinding (with something similar to x86's
UNWIND_HINT, but retained for runtime).

>  
>  #endif	/* __ASM_STACKTRACE_H */
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index d55bdfb7789c..9061375c8785 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -44,21 +44,29 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	unsigned long fp = frame->fp;
>  	struct stack_info info;
>  
> +	frame->reliable = true;

I'd prefer to do this the other way around, e.g. here do:

|        /*
|         * Assume that an unwind step is unreliable until it has passed
|         * all relevant checks.
|         */
|        frame->reliable = false;

... then only set this to true once we're certain the step is reliable.

That requires fewer changes below, and would also be more robust as if
we forget to update this we'd accidentally mark an entry as unreliable
rather than accidentally marking it as reliable.

> +
>  	/* Terminal record; nothing to unwind */
>  	if (!fp)
>  		return -ENOENT;
>  
> -	if (fp & 0xf)
> +	if (fp & 0xf) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
>  
>  	if (!tsk)
>  		tsk = current;
>  
> -	if (!on_accessible_stack(tsk, fp, &info))
> +	if (!on_accessible_stack(tsk, fp, &info)) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
>  
> -	if (test_bit(info.type, frame->stacks_done))
> +	if (test_bit(info.type, frame->stacks_done)) {
> +		frame->reliable = false;
>  		return -EINVAL;
> +	}
>  
>  	/*
>  	 * As stacks grow downward, any valid record on the same stack must be
> @@ -74,8 +82,10 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  	 * stack.
>  	 */
>  	if (info.type == frame->prev_type) {
> -		if (fp <= frame->prev_fp)
> +		if (fp <= frame->prev_fp) {
> +			frame->reliable = false;
>  			return -EINVAL;
> +		}
>  	} else {
>  		set_bit(frame->prev_type, frame->stacks_done);
>  	}
> @@ -100,14 +110,32 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  		 * So replace it to an original value.
>  		 */
>  		ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
> -		if (WARN_ON_ONCE(!ret_stack))
> +		if (WARN_ON_ONCE(!ret_stack)) {
> +			frame->reliable = false;
>  			return -EINVAL;
> +		}
>  		frame->pc = ret_stack->ret;
>  	}
>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>  
>  	frame->pc = ptrauth_strip_insn_pac(frame->pc);
>  
> +	/*
> +	 * Check the return PC for conditions that make unwinding unreliable.
> +	 * In each case, mark the stack trace as such.
> +	 */
> +
> +	/*
> +	 * Make sure that the return address is a proper kernel text address.
> +	 * A NULL or invalid return address could mean:
> +	 *
> +	 *	- generated code such as eBPF and optprobe trampolines
> +	 *	- Foreign code (e.g. EFI runtime services)
> +	 *	- Procedure Linkage Table (PLT) entries and veneer functions
> +	 */
> +	if (!__kernel_text_address(frame->pc))
> +		frame->reliable = false;

I don't think we should mention PLTs here. They appear in regular kernel
text, and on arm64 they are generally not problematic for unwinding. The
case in which they are problematic are where they interpose an
trampoline call that isn't following the AAPCS (e.g. ftrace calls from a
module, or calls to __hwasan_tag_mismatch generally), and we'll have to
catch those explciitly (or forbid RELIABLE_STACKTRACE with HWASAN).

From a backtrace perspective, the PC itself *is* reliable, but the next
unwind from this frame will not be, so I'd like to mark this as
reliable and the next unwind as unreliable. We can do that with the
UNWIND_TYPE_UNKNOWN suggestion above.

For the comment here, how about:

|	/*
|	 * If the PC is not a known kernel text address, then we cannot
|	 * be sure that a subsequent unwind will be reliable, as we
|	 * don't know that the code follows our unwind requirements.
|	 */
|	if (!__kernel_text_address(frame-pc))
|		frame->unwind = UNWIND_TYPE_UNKNOWN;

Thanks,
Mark.

>  	return 0;
>  }
>  NOKPROBE_SYMBOL(unwind_frame);
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-06-24 14:40 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <ea0ef9ed6eb34618bcf468fbbf8bdba99e15df7d>
2021-05-26 21:49 ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks madvenka
2021-05-26 21:49   ` madvenka
2021-05-26 21:49   ` [RFC PATCH v5 1/2] arm64: Introduce stack trace reliability checks in the unwinder madvenka
2021-05-26 21:49     ` madvenka
2021-06-24 14:40     ` Mark Rutland [this message]
2021-06-24 14:40       ` Mark Rutland
2021-06-24 16:03       ` Mark Brown
2021-06-24 16:03         ` Mark Brown
2021-06-25 15:39       ` Madhavan T. Venkataraman
2021-06-25 15:39         ` Madhavan T. Venkataraman
2021-06-25 15:51         ` Mark Brown
2021-06-25 15:51           ` Mark Brown
2021-06-25 17:05           ` Madhavan T. Venkataraman
2021-06-25 17:05             ` Madhavan T. Venkataraman
2021-06-25 17:18             ` Madhavan T. Venkataraman
2021-06-25 17:18               ` Madhavan T. Venkataraman
2021-06-26 15:35         ` Madhavan T. Venkataraman
2021-06-26 15:35           ` Madhavan T. Venkataraman
2021-06-29 16:47       ` Josh Poimboeuf
2021-06-29 16:47         ` Josh Poimboeuf
2021-05-26 21:49   ` [RFC PATCH v5 2/2] arm64: Create a list of SYM_CODE functions, check return PC against list madvenka
2021-05-26 21:49     ` madvenka
2021-06-04 16:24     ` Mark Brown
2021-06-04 16:24       ` Mark Brown
2021-06-04 20:38       ` Madhavan T. Venkataraman
2021-06-04 20:38         ` Madhavan T. Venkataraman
2021-06-04 16:59     ` Mark Brown
2021-06-04 16:59       ` Mark Brown
2021-06-04 20:40       ` Madhavan T. Venkataraman
2021-06-04 20:40         ` Madhavan T. Venkataraman
2021-06-16  1:52     ` Suraj Jitindar Singh
2021-06-16  1:52       ` Suraj Jitindar Singh
2021-06-16  9:15       ` nobuta.keiya
2021-06-16  9:15         ` nobuta.keiya
2021-06-16 11:10       ` Madhavan T. Venkataraman
2021-06-16 11:10         ` Madhavan T. Venkataraman
2021-06-04 15:29   ` [RFC PATCH v5 0/2] arm64: Implement stack trace reliability checks Mark Brown
2021-06-04 15:29     ` Mark Brown
2021-06-04 20:44     ` Madhavan T. Venkataraman
2021-06-04 20:44       ` Madhavan T. Venkataraman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210624144021.GA17937@C02TD0UTHF1T.local \
    --to=mark.rutland@arm.com \
    --cc=ardb@kernel.org \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=jmorris@namei.org \
    --cc=jpoimboe@redhat.com \
    --cc=jthierry@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=madvenka@linux.microsoft.com \
    --cc=nobuta.keiya@fujitsu.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.