linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: madvenka@linux.microsoft.com
Cc: broonie@kernel.org, jpoimboe@redhat.com, jthierry@redhat.com,
	catalin.marinas@arm.com, will@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 4/8] arm64: Detect an EL1 exception frame and mark a stack trace unreliable
Date: Tue, 23 Mar 2021 10:42:51 +0000	[thread overview]
Message-ID: <20210323104251.GD95840@C02TD0UTHF1T.local> (raw)
In-Reply-To: <20210315165800.5948-5-madvenka@linux.microsoft.com>

On Mon, Mar 15, 2021 at 11:57:56AM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> EL1 exceptions can happen on any instruction including instructions in
> the frame pointer prolog or epilog. Depending on where exactly they happen,
> they could render the stack trace unreliable.
> 
> If an EL1 exception frame is found on the stack, mark the stack trace as
> unreliable.
> 
> Now, the EL1 exception frame is not at any well-known offset on the stack.
> It can be anywhere on the stack. In order to properly detect an EL1
> exception frame the following checks must be done:
> 
> 	- The frame type must be EL1_FRAME.
> 
> 	- When the register state is saved in the EL1 pt_regs, the frame
> 	  pointer x29 is saved in pt_regs->regs[29] and the return PC
> 	  is saved in pt_regs->pc. These must match with the current
> 	  frame.

Before you can do this, you need to reliably identify that you have a
pt_regs on the stack, but this patch uses a heuristic, which is not
reliable.

However, instead you can identify whether you're trying to unwind
through one of the EL1 entry functions, which tells you the same thing
without even having to look at the pt_regs.

We can do that based on the entry functions all being in .entry.text,
which we could further sub-divide to split the EL0 and EL1 entry
functions.

> 
> Interrupts encountered in kernel code are also EL1 exceptions. At the end
> of an interrupt, the interrupt handler checks if the current task must be
> preempted for any reason. If so, it calls the preemption code which takes
> the task off the CPU. A stack trace taken on the task after the preemption
> will show the EL1 frame and will be considered unreliable. This is correct
> behavior as preemption can happen practically at any point in code
> including the frame pointer prolog and epilog.
> 
> Breakpoints encountered in kernel code are also EL1 exceptions. The probing
> infrastructure uses breakpoints for executing probe code. While in the probe
> code, the stack trace will show an EL1 frame and will be considered
> unreliable. This is also correct behavior.
> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> ---
>  arch/arm64/include/asm/stacktrace.h |  2 +
>  arch/arm64/kernel/stacktrace.c      | 57 +++++++++++++++++++++++++++++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index eb29b1fe8255..684f65808394 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -59,6 +59,7 @@ struct stackframe {
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	int graph;
>  #endif
> +	bool reliable;
>  };
>  
>  extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
> @@ -169,6 +170,7 @@ static inline void start_backtrace(struct stackframe *frame,
>  	bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
>  	frame->prev_fp = 0;
>  	frame->prev_type = STACK_TYPE_UNKNOWN;
> +	frame->reliable = true;
>  }
>  
>  #endif	/* __ASM_STACKTRACE_H */
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 504cd161339d..6ae103326f7b 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -18,6 +18,58 @@
>  #include <asm/stack_pointer.h>
>  #include <asm/stacktrace.h>
>  
> +static void check_if_reliable(unsigned long fp, struct stackframe *frame,
> +			      struct stack_info *info)
> +{
> +	struct pt_regs *regs;
> +	unsigned long regs_start, regs_end;
> +
> +	/*
> +	 * If the stack trace has already been marked unreliable, just
> +	 * return.
> +	 */
> +	if (!frame->reliable)
> +		return;
> +
> +	/*
> +	 * Assume that this is an intermediate marker frame inside a pt_regs
> +	 * structure created on the stack and get the pt_regs pointer. Other
> +	 * checks will be done below to make sure that this is a marker
> +	 * frame.
> +	 */

Sorry, but NAK to this approach specifically. This isn't reliable (since
it can be influenced by arbitrary data on the stack), and it's far more
complicated than identifying the entry functions specifically.

Thanks,
Mark.

> +	regs_start = fp - offsetof(struct pt_regs, stackframe);
> +	if (regs_start < info->low)
> +		return;
> +	regs_end = regs_start + sizeof(*regs);
> +	if (regs_end > info->high)
> +		return;
> +	regs = (struct pt_regs *) regs_start;
> +
> +	/*
> +	 * When an EL1 exception happens, a pt_regs structure is created
> +	 * on the stack and the register state is recorded. Part of the
> +	 * state is the FP and PC at the time of the exception.
> +	 *
> +	 * In addition, the FP and PC are also stored in pt_regs->stackframe
> +	 * and pt_regs->stackframe is chained with other frames on the stack.
> +	 * This is so that the interrupted function shows up in the stack
> +	 * trace.
> +	 *
> +	 * The exception could have happened during the frame pointer
> +	 * prolog or epilog. This could result in a missing frame in
> +	 * the stack trace so that the caller of the interrupted
> +	 * function does not show up in the stack trace.
> +	 *
> +	 * So, mark the stack trace as unreliable if an EL1 frame is
> +	 * detected.
> +	 */
> +	if (regs->frame_type == EL1_FRAME && regs->pc == frame->pc &&
> +	    regs->regs[29] == frame->fp) {
> +		frame->reliable = false;
> +		return;
> +	}
> +}
> +
>  /*
>   * AArch64 PCS assigns the frame pointer to x29.
>   *
> @@ -114,6 +166,11 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  
>  	frame->pc = ptrauth_strip_insn_pac(frame->pc);
>  
> +	/*
> +	 * Check for features that render the stack trace unreliable.
> +	 */
> +	check_if_reliable(fp, frame, &info);
> +
>  	return 0;
>  }
>  NOKPROBE_SYMBOL(unwind_frame);
> -- 
> 2.25.1
> 

  reply	other threads:[~2021-03-23 10:44 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <5997dfe8d261a3a543667b83c902883c1e4bd270>
2021-03-15 16:57 ` [RFC PATCH v2 0/8] arm64: Implement reliable stack trace madvenka
2021-03-15 16:57   ` [RFC PATCH v2 1/8] arm64: Implement stack trace termination record madvenka
2021-03-18 15:09     ` Mark Brown
2021-03-18 20:26       ` Madhavan T. Venkataraman
2021-03-19 12:30         ` Mark Brown
2021-03-19 14:29           ` Madhavan T. Venkataraman
2021-03-19 18:19             ` Madhavan T. Venkataraman
2021-03-19 22:03               ` Madhavan T. Venkataraman
2021-03-23 10:24                 ` Mark Rutland
2021-03-23 12:39                   ` Madhavan T. Venkataraman
2021-03-15 16:57   ` [RFC PATCH v2 2/8] arm64: Implement frame types madvenka
2021-03-18 17:40     ` Mark Brown
2021-03-18 22:22       ` Madhavan T. Venkataraman
2021-03-19 13:22         ` Mark Brown
2021-03-19 14:40           ` Madhavan T. Venkataraman
2021-03-19 15:02             ` Madhavan T. Venkataraman
2021-03-19 16:20               ` Mark Brown
2021-03-19 16:27                 ` Madhavan T. Venkataraman
2021-03-23 10:34     ` Mark Rutland
2021-03-15 16:57   ` [RFC PATCH v2 3/8] arm64: Terminate the stack trace at TASK_FRAME and EL0_FRAME madvenka
2021-03-18 18:26     ` Mark Brown
2021-03-18 20:29       ` Madhavan T. Venkataraman
2021-03-23 10:36         ` Mark Rutland
2021-03-23 12:40           ` Madhavan T. Venkataraman
2021-03-15 16:57   ` [RFC PATCH v2 4/8] arm64: Detect an EL1 exception frame and mark a stack trace unreliable madvenka
2021-03-23 10:42     ` Mark Rutland [this message]
2021-03-23 12:46       ` Madhavan T. Venkataraman
2021-03-23 13:04         ` Mark Rutland
2021-03-23 13:31           ` Madhavan T. Venkataraman
2021-03-23 14:33             ` Mark Rutland
2021-03-23 15:22               ` Madhavan T. Venkataraman
2021-03-15 16:57   ` [RFC PATCH v2 5/8] arm64: Detect an FTRACE " madvenka
2021-03-23 10:51     ` Mark Rutland
2021-03-23 12:56       ` Madhavan T. Venkataraman
2021-03-23 13:36         ` Mark Rutland
2021-03-23 13:38           ` Madhavan T. Venkataraman
2021-03-23 14:15             ` Madhavan T. Venkataraman
2021-03-23 14:57               ` Mark Rutland
2021-03-23 15:26                 ` Madhavan T. Venkataraman
2021-03-23 16:20                   ` Madhavan T. Venkataraman
2021-03-23 17:02                     ` Mark Rutland
2021-03-23 17:23                       ` Madhavan T. Venkataraman
2021-03-23 17:27                         ` Madhavan T. Venkataraman
2021-03-23 18:27                         ` Mark Brown
2021-03-23 20:23                           ` Madhavan T. Venkataraman
2021-03-23 18:30                         ` Mark Rutland
2021-03-23 20:24                           ` Madhavan T. Venkataraman
2021-03-23 21:04                             ` Madhavan T. Venkataraman
2021-03-23 16:48                   ` Mark Rutland
2021-03-23 16:53                     ` Madhavan T. Venkataraman
2021-03-23 17:09                       ` Mark Rutland
2021-03-15 16:57   ` [RFC PATCH v2 6/8] arm64: Check the return PC of every stack frame madvenka
2021-03-15 16:57   ` [RFC PATCH v2 7/8] arm64: Detect kretprobed functions in stack trace madvenka
2021-03-15 16:58   ` [RFC PATCH v2 8/8] arm64: Implement arch_stack_walk_reliable() madvenka
2021-03-15 19:01   ` [RFC PATCH v2 0/8] arm64: Implement reliable stack trace Madhavan T. Venkataraman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210323104251.GD95840@C02TD0UTHF1T.local \
    --to=mark.rutland@arm.com \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=jpoimboe@redhat.com \
    --cc=jthierry@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=madvenka@linux.microsoft.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).