All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: broonie@kernel.org, jpoimboe@redhat.com, jthierry@redhat.com,
	catalin.marinas@arm.com, will@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 4/8] arm64: Detect an EL1 exception frame and mark a stack trace unreliable
Date: Tue, 23 Mar 2021 08:31:50 -0500	[thread overview]
Message-ID: <f5dd48d3-c0ea-a719-c10d-83e62db3e7c0@linux.microsoft.com> (raw)
In-Reply-To: <20210323130425.GA98545@C02TD0UTHF1T.local>



On 3/23/21 8:04 AM, Mark Rutland wrote:
> On Tue, Mar 23, 2021 at 07:46:10AM -0500, Madhavan T. Venkataraman wrote:
>> On 3/23/21 5:42 AM, Mark Rutland wrote:
>>> On Mon, Mar 15, 2021 at 11:57:56AM -0500, madvenka@linux.microsoft.com wrote:
>>>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>>>
>>>> EL1 exceptions can happen on any instruction including instructions in
>>>> the frame pointer prolog or epilog. Depending on where exactly they happen,
>>>> they could render the stack trace unreliable.
>>>>
>>>> If an EL1 exception frame is found on the stack, mark the stack trace as
>>>> unreliable.
>>>>
>>>> Now, the EL1 exception frame is not at any well-known offset on the stack.
>>>> It can be anywhere on the stack. In order to properly detect an EL1
>>>> exception frame the following checks must be done:
>>>>
>>>> 	- The frame type must be EL1_FRAME.
>>>>
>>>> 	- When the register state is saved in the EL1 pt_regs, the frame
>>>> 	  pointer x29 is saved in pt_regs->regs[29] and the return PC
>>>> 	  is saved in pt_regs->pc. These must match with the current
>>>> 	  frame.
>>>
>>> Before you can do this, you need to reliably identify that you have a
>>> pt_regs on the stack, but this patch uses a heuristic, which is not
>>> reliable.
>>>
>>> However, instead you can identify whether you're trying to unwind
>>> through one of the EL1 entry functions, which tells you the same thing
>>> without even having to look at the pt_regs.
>>>
>>> We can do that based on the entry functions all being in .entry.text,
>>> which we could further sub-divide to split the EL0 and EL1 entry
>>> functions.
>>
>> Yes. I will check the entry functions. But I still think that we should
>> not rely on just one check. The additional checks will make it robust.
>> So, I suggest that the return address be checked first. If that passes,
>> then we can be reasonably sure that there are pt_regs. Then, check
>> the other things in pt_regs.
> 
> What do you think this will catch?
> 

I am not sure that I have an exact example to mention here. But I will attempt
one. Let us say that a task has called arch_stack_walk() in the recent past.
The unwinder may have copied a stack frame onto some location in the stack
with one of the return addresses we check. Let us assume that there is some
stack corruption that makes a frame pointer point to that exact record. Now,
we will get a match on the return address on the next unwind.

Pardon me if the example is somewhat crude. My point is that it is highly unlikely
but not impossible for the return address to be on the stack and for the unwinder to
get an unfortunate match.

> The only way to correctly identify whether or not we have a pt_regs is
> to check whether we're in specific portions of the EL1 entry assembly
> where the regs exist. However, as any EL1<->EL1 transition cannot be
> safely unwound, we'd mark any trace going through the EL1 entry assembly
> as unreliable.
> 
> Given that, I don't think it's useful to check the regs, and I'd prefer
> to avoid the subtlteties involved in attempting to do so.
> 

I agree that the return address check is a good check. I would like to add
extra checks to be absolutely sure.

> [...]
> 
>>>> +static void check_if_reliable(unsigned long fp, struct stackframe *frame,
>>>> +			      struct stack_info *info)
>>>> +{
>>>> +	struct pt_regs *regs;
>>>> +	unsigned long regs_start, regs_end;
>>>> +
>>>> +	/*
>>>> +	 * If the stack trace has already been marked unreliable, just
>>>> +	 * return.
>>>> +	 */
>>>> +	if (!frame->reliable)
>>>> +		return;
>>>> +
>>>> +	/*
>>>> +	 * Assume that this is an intermediate marker frame inside a pt_regs
>>>> +	 * structure created on the stack and get the pt_regs pointer. Other
>>>> +	 * checks will be done below to make sure that this is a marker
>>>> +	 * frame.
>>>> +	 */
>>>
>>> Sorry, but NAK to this approach specifically. This isn't reliable (since
>>> it can be influenced by arbitrary data on the stack), and it's far more
>>> complicated than identifying the entry functions specifically.
>>
>> As I mentioned above, I agree that we should check the return address. But
>> just as a precaution, I think we should double check the pt_regs.
>>
>> Is that OK with you? It does not take away anything or increase the risk in
>> anyway. I think it makes it more robust.
> 
> As above, I think that the work necessary to correctly access the regs
> means that it's not helpful to check the regs themselves. If you have
> something in mind where checking the regs is helpful I'm happy to
> consider that, but my general preference would be to stay away from the
> regs for now.
> 

I have mentioned a possibility above. Please take a look and let me know.

Thanks.

Madhavan

WARNING: multiple messages have this Message-ID (diff)
From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: broonie@kernel.org, jpoimboe@redhat.com, jthierry@redhat.com,
	catalin.marinas@arm.com, will@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 4/8] arm64: Detect an EL1 exception frame and mark a stack trace unreliable
Date: Tue, 23 Mar 2021 08:31:50 -0500	[thread overview]
Message-ID: <f5dd48d3-c0ea-a719-c10d-83e62db3e7c0@linux.microsoft.com> (raw)
In-Reply-To: <20210323130425.GA98545@C02TD0UTHF1T.local>



On 3/23/21 8:04 AM, Mark Rutland wrote:
> On Tue, Mar 23, 2021 at 07:46:10AM -0500, Madhavan T. Venkataraman wrote:
>> On 3/23/21 5:42 AM, Mark Rutland wrote:
>>> On Mon, Mar 15, 2021 at 11:57:56AM -0500, madvenka@linux.microsoft.com wrote:
>>>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>>>
>>>> EL1 exceptions can happen on any instruction including instructions in
>>>> the frame pointer prolog or epilog. Depending on where exactly they happen,
>>>> they could render the stack trace unreliable.
>>>>
>>>> If an EL1 exception frame is found on the stack, mark the stack trace as
>>>> unreliable.
>>>>
>>>> Now, the EL1 exception frame is not at any well-known offset on the stack.
>>>> It can be anywhere on the stack. In order to properly detect an EL1
>>>> exception frame the following checks must be done:
>>>>
>>>> 	- The frame type must be EL1_FRAME.
>>>>
>>>> 	- When the register state is saved in the EL1 pt_regs, the frame
>>>> 	  pointer x29 is saved in pt_regs->regs[29] and the return PC
>>>> 	  is saved in pt_regs->pc. These must match with the current
>>>> 	  frame.
>>>
>>> Before you can do this, you need to reliably identify that you have a
>>> pt_regs on the stack, but this patch uses a heuristic, which is not
>>> reliable.
>>>
>>> However, instead you can identify whether you're trying to unwind
>>> through one of the EL1 entry functions, which tells you the same thing
>>> without even having to look at the pt_regs.
>>>
>>> We can do that based on the entry functions all being in .entry.text,
>>> which we could further sub-divide to split the EL0 and EL1 entry
>>> functions.
>>
>> Yes. I will check the entry functions. But I still think that we should
>> not rely on just one check. The additional checks will make it robust.
>> So, I suggest that the return address be checked first. If that passes,
>> then we can be reasonably sure that there are pt_regs. Then, check
>> the other things in pt_regs.
> 
> What do you think this will catch?
> 

I am not sure that I have an exact example to mention here. But I will attempt
one. Let us say that a task has called arch_stack_walk() in the recent past.
The unwinder may have copied a stack frame onto some location in the stack
with one of the return addresses we check. Let us assume that there is some
stack corruption that makes a frame pointer point to that exact record. Now,
we will get a match on the return address on the next unwind.

Pardon me if the example is somewhat crude. My point is that it is highly unlikely
but not impossible for the return address to be on the stack and for the unwinder to
get an unfortunate match.

> The only way to correctly identify whether or not we have a pt_regs is
> to check whether we're in specific portions of the EL1 entry assembly
> where the regs exist. However, as any EL1<->EL1 transition cannot be
> safely unwound, we'd mark any trace going through the EL1 entry assembly
> as unreliable.
> 
> Given that, I don't think it's useful to check the regs, and I'd prefer
> to avoid the subtlteties involved in attempting to do so.
> 

I agree that the return address check is a good check. I would like to add
extra checks to be absolutely sure.

> [...]
> 
>>>> +static void check_if_reliable(unsigned long fp, struct stackframe *frame,
>>>> +			      struct stack_info *info)
>>>> +{
>>>> +	struct pt_regs *regs;
>>>> +	unsigned long regs_start, regs_end;
>>>> +
>>>> +	/*
>>>> +	 * If the stack trace has already been marked unreliable, just
>>>> +	 * return.
>>>> +	 */
>>>> +	if (!frame->reliable)
>>>> +		return;
>>>> +
>>>> +	/*
>>>> +	 * Assume that this is an intermediate marker frame inside a pt_regs
>>>> +	 * structure created on the stack and get the pt_regs pointer. Other
>>>> +	 * checks will be done below to make sure that this is a marker
>>>> +	 * frame.
>>>> +	 */
>>>
>>> Sorry, but NAK to this approach specifically. This isn't reliable (since
>>> it can be influenced by arbitrary data on the stack), and it's far more
>>> complicated than identifying the entry functions specifically.
>>
>> As I mentioned above, I agree that we should check the return address. But
>> just as a precaution, I think we should double check the pt_regs.
>>
>> Is that OK with you? It does not take away anything or increase the risk in
>> anyway. I think it makes it more robust.
> 
> As above, I think that the work necessary to correctly access the regs
> means that it's not helpful to check the regs themselves. If you have
> something in mind where checking the regs is helpful I'm happy to
> consider that, but my general preference would be to stay away from the
> regs for now.
> 

I have mentioned a possibility above. Please take a look and let me know.

Thanks.

Madhavan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-03-23 13:32 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <5997dfe8d261a3a543667b83c902883c1e4bd270>
2021-03-15 16:57 ` [RFC PATCH v2 0/8] arm64: Implement reliable stack trace madvenka
2021-03-15 16:57   ` madvenka
2021-03-15 16:57   ` [RFC PATCH v2 1/8] arm64: Implement stack trace termination record madvenka
2021-03-15 16:57     ` madvenka
2021-03-18 15:09     ` Mark Brown
2021-03-18 15:09       ` Mark Brown
2021-03-18 20:26       ` Madhavan T. Venkataraman
2021-03-18 20:26         ` Madhavan T. Venkataraman
2021-03-19 12:30         ` Mark Brown
2021-03-19 12:30           ` Mark Brown
2021-03-19 14:29           ` Madhavan T. Venkataraman
2021-03-19 14:29             ` Madhavan T. Venkataraman
2021-03-19 18:19             ` Madhavan T. Venkataraman
2021-03-19 18:19               ` Madhavan T. Venkataraman
2021-03-19 22:03               ` Madhavan T. Venkataraman
2021-03-19 22:03                 ` Madhavan T. Venkataraman
2021-03-23 10:24                 ` Mark Rutland
2021-03-23 10:24                   ` Mark Rutland
2021-03-23 12:39                   ` Madhavan T. Venkataraman
2021-03-23 12:39                     ` Madhavan T. Venkataraman
2021-03-15 16:57   ` [RFC PATCH v2 2/8] arm64: Implement frame types madvenka
2021-03-15 16:57     ` madvenka
2021-03-18 17:40     ` Mark Brown
2021-03-18 17:40       ` Mark Brown
2021-03-18 22:22       ` Madhavan T. Venkataraman
2021-03-18 22:22         ` Madhavan T. Venkataraman
2021-03-19 13:22         ` Mark Brown
2021-03-19 13:22           ` Mark Brown
2021-03-19 14:40           ` Madhavan T. Venkataraman
2021-03-19 14:40             ` Madhavan T. Venkataraman
2021-03-19 15:02             ` Madhavan T. Venkataraman
2021-03-19 15:02               ` Madhavan T. Venkataraman
2021-03-19 16:20               ` Mark Brown
2021-03-19 16:20                 ` Mark Brown
2021-03-19 16:27                 ` Madhavan T. Venkataraman
2021-03-19 16:27                   ` Madhavan T. Venkataraman
2021-03-23 10:34     ` Mark Rutland
2021-03-23 10:34       ` Mark Rutland
2021-03-15 16:57   ` [RFC PATCH v2 3/8] arm64: Terminate the stack trace at TASK_FRAME and EL0_FRAME madvenka
2021-03-15 16:57     ` madvenka
2021-03-18 18:26     ` Mark Brown
2021-03-18 18:26       ` Mark Brown
2021-03-18 20:29       ` Madhavan T. Venkataraman
2021-03-18 20:29         ` Madhavan T. Venkataraman
2021-03-23 10:36         ` Mark Rutland
2021-03-23 10:36           ` Mark Rutland
2021-03-23 12:40           ` Madhavan T. Venkataraman
2021-03-23 12:40             ` Madhavan T. Venkataraman
2021-03-15 16:57   ` [RFC PATCH v2 4/8] arm64: Detect an EL1 exception frame and mark a stack trace unreliable madvenka
2021-03-15 16:57     ` madvenka
2021-03-23 10:42     ` Mark Rutland
2021-03-23 10:42       ` Mark Rutland
2021-03-23 12:46       ` Madhavan T. Venkataraman
2021-03-23 12:46         ` Madhavan T. Venkataraman
2021-03-23 13:04         ` Mark Rutland
2021-03-23 13:04           ` Mark Rutland
2021-03-23 13:31           ` Madhavan T. Venkataraman [this message]
2021-03-23 13:31             ` Madhavan T. Venkataraman
2021-03-23 14:33             ` Mark Rutland
2021-03-23 14:33               ` Mark Rutland
2021-03-23 15:22               ` Madhavan T. Venkataraman
2021-03-23 15:22                 ` Madhavan T. Venkataraman
2021-03-15 16:57   ` [RFC PATCH v2 5/8] arm64: Detect an FTRACE " madvenka
2021-03-15 16:57     ` madvenka
2021-03-23 10:51     ` Mark Rutland
2021-03-23 10:51       ` Mark Rutland
2021-03-23 12:56       ` Madhavan T. Venkataraman
2021-03-23 12:56         ` Madhavan T. Venkataraman
2021-03-23 13:36         ` Mark Rutland
2021-03-23 13:36           ` Mark Rutland
2021-03-23 13:38           ` Madhavan T. Venkataraman
2021-03-23 13:38             ` Madhavan T. Venkataraman
2021-03-23 14:15             ` Madhavan T. Venkataraman
2021-03-23 14:15               ` Madhavan T. Venkataraman
2021-03-23 14:57               ` Mark Rutland
2021-03-23 14:57                 ` Mark Rutland
2021-03-23 15:26                 ` Madhavan T. Venkataraman
2021-03-23 15:26                   ` Madhavan T. Venkataraman
2021-03-23 16:20                   ` Madhavan T. Venkataraman
2021-03-23 16:20                     ` Madhavan T. Venkataraman
2021-03-23 17:02                     ` Mark Rutland
2021-03-23 17:02                       ` Mark Rutland
2021-03-23 17:23                       ` Madhavan T. Venkataraman
2021-03-23 17:23                         ` Madhavan T. Venkataraman
2021-03-23 17:27                         ` Madhavan T. Venkataraman
2021-03-23 17:27                           ` Madhavan T. Venkataraman
2021-03-23 18:27                         ` Mark Brown
2021-03-23 18:27                           ` Mark Brown
2021-03-23 20:23                           ` Madhavan T. Venkataraman
2021-03-23 20:23                             ` Madhavan T. Venkataraman
2021-03-23 18:30                         ` Mark Rutland
2021-03-23 18:30                           ` Mark Rutland
2021-03-23 20:24                           ` Madhavan T. Venkataraman
2021-03-23 20:24                             ` Madhavan T. Venkataraman
2021-03-23 21:04                             ` Madhavan T. Venkataraman
2021-03-23 21:04                               ` Madhavan T. Venkataraman
2021-03-23 16:48                   ` Mark Rutland
2021-03-23 16:48                     ` Mark Rutland
2021-03-23 16:53                     ` Madhavan T. Venkataraman
2021-03-23 16:53                       ` Madhavan T. Venkataraman
2021-03-23 17:09                       ` Mark Rutland
2021-03-23 17:09                         ` Mark Rutland
2021-03-15 16:57   ` [RFC PATCH v2 6/8] arm64: Check the return PC of every stack frame madvenka
2021-03-15 16:57     ` madvenka
2021-03-15 16:57   ` [RFC PATCH v2 7/8] arm64: Detect kretprobed functions in stack trace madvenka
2021-03-15 16:57     ` madvenka
2021-03-15 16:58   ` [RFC PATCH v2 8/8] arm64: Implement arch_stack_walk_reliable() madvenka
2021-03-15 16:58     ` madvenka
2021-03-15 19:01   ` [RFC PATCH v2 0/8] arm64: Implement reliable stack trace Madhavan T. Venkataraman
2021-03-15 19:01     ` Madhavan T. Venkataraman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f5dd48d3-c0ea-a719-c10d-83e62db3e7c0@linux.microsoft.com \
    --to=madvenka@linux.microsoft.com \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=jpoimboe@redhat.com \
    --cc=jthierry@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.