From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: broonie@kernel.org, jpoimboe@redhat.com, ardb@kernel.org,
	nobuta.keiya@fujitsu.com, sjitindarsingh@gmail.com,
	catalin.marinas@arm.com, will@kernel.org, jmorris@namei.org,
	pasha.tatashin@soleen.com, jthierry@redhat.com,
	linux-arm-kernel@lists.infradead.org,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v6 3/3] arm64: Create a list of SYM_CODE functions, check return PC against list
Date: Thu, 29 Jul 2021 09:06:26 -0500	[thread overview]
Message-ID: <f9931a57-7a81-867b-fa2a-499d441c5acd@linux.microsoft.com> (raw)
In-Reply-To: <20210728172523.GB47345@C02TD0UTHF1T.local>

Responses inline...

On 7/28/21 12:25 PM, Mark Rutland wrote:
> On Wed, Jun 30, 2021 at 05:33:56PM -0500, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>> ... <snip> ...
>> +static struct code_range	*sym_code_functions;
>> +static int			num_sym_code_functions;
>> +
>> +int __init init_sym_code_functions(void)
>> +{
>> +	size_t size;
>> +
>> +	size = (unsigned long)__sym_code_functions_end -
>> +	       (unsigned long)__sym_code_functions_start;
>> +
>> +	sym_code_functions = kmalloc(size, GFP_KERNEL);
>> +	if (!sym_code_functions)
>> +		return -ENOMEM;
>> +
>> +	memcpy(sym_code_functions, __sym_code_functions_start, size);
>> +	/* Update num_sym_code_functions after copying sym_code_functions. */
>> +	smp_mb();
>> +	num_sym_code_functions = size / sizeof(struct code_range);
>> +
>> +	return 0;
>> +}
>> +early_initcall(init_sym_code_functions);
> 
> What's the point of copying this, given we don't even sort it?
> 
> If we need to keep it around, it would be nicer to leave it where the
> linker put it, but make it rodata or ro_after_init.
> 

I was planning to sort it for performance, and I have a comment to that effect.
But I can remove the copy and leave the table in the linker section instead.
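For what it's worth, the sorted-table idea can be sketched in userspace C. The struct layout and names below are stand-ins, not the patch's actual code; the point is only that once the ranges are sorted and non-overlapping, the lookup becomes a libc bsearch() instead of a linear scan:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct code_range. */
struct code_range {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

/* bsearch comparator: key is a PC, elem is a range. */
static int range_cmp(const void *key, const void *elem)
{
	unsigned long pc = *(const unsigned long *)key;
	const struct code_range *r = elem;

	if (pc < r->start)
		return -1;
	if (pc >= r->end)
		return 1;
	return 0;	/* pc falls inside this range */
}

/* Return 1 if pc falls inside any of the sorted ranges, 0 otherwise. */
static int pc_in_sym_code(unsigned long pc,
			  const struct code_range *ranges, size_t n)
{
	return bsearch(&pc, ranges, n, sizeof(*ranges), range_cmp) != NULL;
}
```

In the kernel the equivalent one-time ordering could be done with sort() at init, which is what the "sort it for performance" comment is getting at.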

>> +
>>  /*
>>   * Check the stack frame for conditions that make unwinding unreliable.
>>   */
>>  enum unwind_rc unwind_check_frame(struct stackframe *frame)
>>  {
>> +	const struct code_range *range;
>> +	unsigned long pc;
>> +	int i;
>> +
>>  	/*
>>  	 * If the PC is not a known kernel text address, then we cannot
>>  	 * be sure that a subsequent unwind will be reliable, as we
>> @@ -30,6 +62,86 @@ enum unwind_rc unwind_check_frame(struct stackframe *frame)
>>  	 */
>>  	if (!__kernel_text_address(frame->pc))
>>  		return UNWIND_CONTINUE_WITH_RISK;
> 
> As per patch 1, I'd prefer we had something like an
> unwind_is_unreliable() helper, which can return a boolean in this case.
> 

I will look into this.
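Roughly, the shape Mark is suggesting might look like the sketch below — the helper name comes from his review, but the structs and the linear scan here are simplified stand-ins, not the eventual kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the structures used in this patch. */
struct code_range {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

struct stackframe {
	unsigned long fp;
	unsigned long pc;
};

/*
 * Return true if the frame's return PC lands inside any SYM_CODE
 * range, i.e. the unwind cannot be trusted past this frame.
 */
static bool unwind_is_unreliable(const struct stackframe *frame,
				 const struct code_range *ranges, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (frame->pc >= ranges[i].start && frame->pc < ranges[i].end)
			return true;
	}
	return false;
}
```

A boolean return keeps the caller's logic flat: the main loop can simply OR this into its reliability flag rather than interpreting an enum.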

>> +
>> +	/*
>> +	 * If the final frame has been reached, there is no more unwinding
>> +	 * to do. There is no need to check if the return PC is considered
>> +	 * unreliable by the unwinder.
>> +	 */
>> +	if (!frame->fp)
>> +		return UNWIND_CONTINUE;
> 
> As mentioned on patch 1, I'd rather the main unwind loop checked for the
> final frame specifically before trying to unwind. With that in mind, we
> should never try to unwind to a NULL fp.
> 

OK.
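A minimal sketch of that loop shape — heavily simplified, with placeholder stepping logic; the real unwinder also validates fp against the stack bounds — where the final-frame check happens before any attempt to step:

```c
#include <assert.h>
#include <stdbool.h>

struct stackframe {
	unsigned long fp;
	unsigned long pc;
};

/* Hypothetical single-step unwind: returns false if it cannot step. */
static bool unwind_frame_step(struct stackframe *frame)
{
	/* ... would read the next (fp, pc) record from the stack ... */
	frame->fp = 0;	/* placeholder: pretend the next frame is final */
	frame->pc = 0;
	return true;
}

static void walk_stack(struct stackframe *frame)
{
	/*
	 * Check for the final frame *before* stepping, so the unwinder
	 * never tries to unwind through a NULL fp.
	 */
	while (frame->fp) {
		/* consume frame->pc here (e.g. record it in the trace) */
		if (!unwind_frame_step(frame))
			break;
	}
}
```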

>> +
>> +	/*
>> +	 * Check the return PC against sym_code_functions[]. If there is a
>> +	 * match, then consider the stack frame unreliable. These functions
>> +	 * contain low-level code where the frame pointer and/or the return
>> +	 * address register cannot be relied upon. This addresses the following
>> +	 * situations:
>> +	 *
>> +	 *  - Exception handlers and entry assembly
>> +	 *  - Trampoline assembly (e.g., ftrace, kprobes)
>> +	 *  - Hypervisor-related assembly
>> +	 *  - Hibernation-related assembly
>> +	 *  - CPU start-stop, suspend-resume assembly
>> +	 *  - Kernel relocation assembly
>> +	 *
>> +	 * Some special cases covered by sym_code_functions[] deserve a mention
>> +	 * here:
>> +	 *
>> +	 *  - All EL1 interrupt and exception stack traces will be considered
>> +	 *    unreliable. This is the correct behavior as interrupts and
>> +	 *    exceptions can happen on any instruction including ones in the
>> +	 *    frame pointer prolog and epilog. Unless stack metadata is
>> +	 *    available so the unwinder can unwind through these special
>> +	 *    cases, such stack traces will be considered unreliable.
> 
> As mentioned previously, we *can* reliably unwind precisely one step
> across an exception boundary, as we can be certain of the PC value at
> the time the exception was taken, but past this we can't be certain
> whether the LR is legitimate.
> 
> I'd like that we capture that precisely in the unwinder, and I'm
> currently reworking the entry assembly to make that possible.
> 
>> +	 *
>> +	 *  - A task can get preempted at the end of an interrupt. Stack
>> +	 *    traces of preempted tasks will show the interrupt frame in the
>> +	 *    stack trace and will be considered unreliable.
>> +	 *
>> +	 *  - Breakpoints are exceptions. So, all stack traces in the
>> +	 *    breakpoint handler (including probes) will be considered unreliable.
>> +	 *
>> +	 *  - All of the ftrace entry trampolines are considered unreliable.
>> +	 *    So, all stack traces taken from tracer functions will be
>> +	 *    considered unreliable.
>> +	 *
>> +	 *  - The Function Graph Tracer return trampoline (return_to_handler)
>> +	 *    and the Kretprobe return trampoline (kretprobe_trampoline) are
>> +	 *    also considered unreliable.
> 
> We should be able to unwind these reliably if we specifically identify
> them. I think we need a two-step check here; we should assume that
> SYM_CODE() is unreliable by default, but in specific cases we should
> unwind that reliably.
> 
>> +	 * Some of the special cases above can be unwound using
>> +	 * special logic in unwind_frame().
>> +	 *
>> +	 *  - return_to_handler() is handled by the unwinder by attempting
>> +	 *    to retrieve the original return address from the per-task
>> +	 *    return address stack.
>> +	 *
>> +	 *  - kretprobe_trampoline() can be handled in a similar fashion by
>> +	 *    attempting to retrieve the original return address from the
>> +	 *    per-task kretprobe instance list.
> 
> We don't do this today,
> 
>> +	 *
>> +	 *  - I reckon optprobes can be handled in a similar fashion in the
>> +	 *    future?
>> +	 *
>> +	 *  - Stack traces taken from the ftrace tracer functions can be
>> +	 *    handled as well. ftrace_call is an inner label defined in the
>> +	 *    ftrace entry trampoline. This is the location where the call
>> +	 *    to a tracer function is patched. So, if the return PC equals
>> +	 *    ftrace_call+4, it is reliable. At that point, proper stack
>> +	 *    frames have already been set up for the traced function and
>> +	 *    its caller.
>> +	 *
>> +	 * NOTE:
>> +	 *   If sym_code_functions[] were sorted, a binary search could be
>> +	 *   done to make this more performant.
>> +	 */
> 
> Since some of the above is speculative (e.g. the bit about optprobes),
> and as code will change over time, I think we should have a much terser
> comment, e.g.
> 
> 	/*
> 	 * As SYM_CODE functions don't follow the usual calling
> 	 * conventions, we assume by default that any SYM_CODE function
> 	 * cannot be unwound reliably.
> 	 *
> 	 * Note that this includes exception entry/return sequences and
> 	 * trampoline for ftrace and kprobes.
> 	 */
> 
> ... and then if/when we try to unwind a specific SYM_CODE function
> reliably, we add the comment for that specifically.
> 

Just to confirm, are you suggesting that I remove the entire large comment
detailing the various cases and replace the whole thing with the terse comment?
I wrote the large comment based on Mark Brown's input that we should be verbose
about all the cases so that it is clear in the future what the different
cases are and how they are handled in this code. As the code evolves, the
comments would evolve with it.

I can replace the comment if you want. Please confirm.

Thanks.

Madhavan
