live-patching.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: broonie@kernel.org, jpoimboe@redhat.com, jthierry@redhat.com,
	catalin.marinas@arm.com, will@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 4/8] arm64: Detect an EL1 exception frame and mark a stack trace unreliable
Date: Tue, 23 Mar 2021 10:22:39 -0500	[thread overview]
Message-ID: <efaf82d9-8f66-7e0e-0cac-cee38450a224@linux.microsoft.com> (raw)
In-Reply-To: <20210323143345.GC98545@C02TD0UTHF1T.local>



On 3/23/21 9:33 AM, Mark Rutland wrote:
> On Tue, Mar 23, 2021 at 08:31:50AM -0500, Madhavan T. Venkataraman wrote:
>> On 3/23/21 8:04 AM, Mark Rutland wrote:
>>> On Tue, Mar 23, 2021 at 07:46:10AM -0500, Madhavan T. Venkataraman wrote:
>>>> On 3/23/21 5:42 AM, Mark Rutland wrote:
>>>>> On Mon, Mar 15, 2021 at 11:57:56AM -0500, madvenka@linux.microsoft.com wrote:
>>>>>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>>>>>
>>>>>> EL1 exceptions can happen on any instruction including instructions in
>>>>>> the frame pointer prolog or epilog. Depending on where exactly they happen,
>>>>>> they could render the stack trace unreliable.
>>>>>>
>>>>>> If an EL1 exception frame is found on the stack, mark the stack trace as
>>>>>> unreliable.
>>>>>>
>>>>>> Now, the EL1 exception frame is not at any well-known offset on the stack.
>>>>>> It can be anywhere on the stack. In order to properly detect an EL1
>>>>>> exception frame the following checks must be done:
>>>>>>
>>>>>> 	- The frame type must be EL1_FRAME.
>>>>>>
>>>>>> 	- When the register state is saved in the EL1 pt_regs, the frame
>>>>>> 	  pointer x29 is saved in pt_regs->regs[29] and the return PC
>>>>>> 	  is saved in pt_regs->pc. These must match with the current
>>>>>> 	  frame.
>>>>>
>>>>> Before you can do this, you need to reliably identify that you have a
>>>>> pt_regs on the stack, but this patch uses a heuristic, which is not
>>>>> reliable.
>>>>>
>>>>> However, instead you can identify whether you're trying to unwind
>>>>> through one of the EL1 entry functions, which tells you the same thing
>>>>> without even having to look at the pt_regs.
>>>>>
>>>>> We can do that based on the entry functions all being in .entry.text,
>>>>> which we could further sub-divide to split the EL0 and EL1 entry
>>>>> functions.
>>>>
>>>> Yes. I will check the entry functions. But I still think that we should
>>>> not rely on just one check. The additional checks will make it robust.
>>>> So, I suggest that the return address be checked first. If that passes,
>>>> then we can be reasonably sure that there are pt_regs. Then, check
>>>> the other things in pt_regs.
>>>
>>> What do you think this will catch?
>>
>> I am not sure that I have an exact example to mention here. But I will attempt
>> one. Let us say that a task has called arch_stack_walk() in the recent past.
>> The unwinder may have copied a stack frame onto some location in the stack
>> with one of the return addresses we check. Let us assume that there is some
>> stack corruption that makes a frame pointer point to that exact record. Now,
>> we will get a match on the return address on the next unwind.
> 
> I don't see how this is material to the pt_regs case, as either:
> 
> * When the unwinder considers this frame, it appears to be in the middle
>   of an EL1 entry function, and the unwinder must mark the unwinding as
>   unreliable regardless of the contents of any regs (so there's no need
>   to look at the regs).
> 
> * When the unwinder considers this frame, it does not appear to be in
>   the middle of an EL1 entry function, so the unwinder does not think
>   there are any regs to consider, and so we cannot detect this case.
> 
> ... unless I've misunderstood the example?
> 
> There's a general problem that it's possible to corrupt any portion of
> the chain to skip records, e.g.
> 
>   A -> B -> C -> D -> E -> F -> G -> H -> [final]
> 
> ... could get corrupted to:
> 
>   A -> B -> D -> H -> [final]
> 
> ... regardless of whether C/E/F/G had associated pt_regs. AFAICT there's
> no good way to catch this generally unless we have additional metadata
> to check the unwinding against.
> 
> The likelihood of this happening without triggering other checks is
> vanishingly low, and as we don't have a reliable mechanism for detecting
> this, I don't think it's worthwhile attempting to do so.
> 
> If and when we try to unwind across EL1 exception boundaries, the
> potential mismatch between the frame record and regs will be more
> significant, and I agree at that point thisd will need more thought.
> 
>> Pardon me if the example is somewhat crude. My point is that it is
>> highly unlikely but not impossible for the return address to be on the
>> stack and for the unwinder to get an unfortunate match.
> 
> I agree that this is possible in theory, but as above I don't think this
> is a practical concern.
> 

OK. What you say makes sense.

Thanks.

Madhavan

  reply	other threads:[~2021-03-23 15:23 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <5997dfe8d261a3a543667b83c902883c1e4bd270>
2021-03-15 16:57 ` [RFC PATCH v2 0/8] arm64: Implement reliable stack trace madvenka
2021-03-15 16:57   ` [RFC PATCH v2 1/8] arm64: Implement stack trace termination record madvenka
2021-03-18 15:09     ` Mark Brown
2021-03-18 20:26       ` Madhavan T. Venkataraman
2021-03-19 12:30         ` Mark Brown
2021-03-19 14:29           ` Madhavan T. Venkataraman
2021-03-19 18:19             ` Madhavan T. Venkataraman
2021-03-19 22:03               ` Madhavan T. Venkataraman
2021-03-23 10:24                 ` Mark Rutland
2021-03-23 12:39                   ` Madhavan T. Venkataraman
2021-03-15 16:57   ` [RFC PATCH v2 2/8] arm64: Implement frame types madvenka
2021-03-18 17:40     ` Mark Brown
2021-03-18 22:22       ` Madhavan T. Venkataraman
2021-03-19 13:22         ` Mark Brown
2021-03-19 14:40           ` Madhavan T. Venkataraman
2021-03-19 15:02             ` Madhavan T. Venkataraman
2021-03-19 16:20               ` Mark Brown
2021-03-19 16:27                 ` Madhavan T. Venkataraman
2021-03-23 10:34     ` Mark Rutland
2021-03-15 16:57   ` [RFC PATCH v2 3/8] arm64: Terminate the stack trace at TASK_FRAME and EL0_FRAME madvenka
2021-03-18 18:26     ` Mark Brown
2021-03-18 20:29       ` Madhavan T. Venkataraman
2021-03-23 10:36         ` Mark Rutland
2021-03-23 12:40           ` Madhavan T. Venkataraman
2021-03-15 16:57   ` [RFC PATCH v2 4/8] arm64: Detect an EL1 exception frame and mark a stack trace unreliable madvenka
2021-03-23 10:42     ` Mark Rutland
2021-03-23 12:46       ` Madhavan T. Venkataraman
2021-03-23 13:04         ` Mark Rutland
2021-03-23 13:31           ` Madhavan T. Venkataraman
2021-03-23 14:33             ` Mark Rutland
2021-03-23 15:22               ` Madhavan T. Venkataraman [this message]
2021-03-15 16:57   ` [RFC PATCH v2 5/8] arm64: Detect an FTRACE " madvenka
2021-03-23 10:51     ` Mark Rutland
2021-03-23 12:56       ` Madhavan T. Venkataraman
2021-03-23 13:36         ` Mark Rutland
2021-03-23 13:38           ` Madhavan T. Venkataraman
2021-03-23 14:15             ` Madhavan T. Venkataraman
2021-03-23 14:57               ` Mark Rutland
2021-03-23 15:26                 ` Madhavan T. Venkataraman
2021-03-23 16:20                   ` Madhavan T. Venkataraman
2021-03-23 17:02                     ` Mark Rutland
2021-03-23 17:23                       ` Madhavan T. Venkataraman
2021-03-23 17:27                         ` Madhavan T. Venkataraman
2021-03-23 18:27                         ` Mark Brown
2021-03-23 20:23                           ` Madhavan T. Venkataraman
2021-03-23 18:30                         ` Mark Rutland
2021-03-23 20:24                           ` Madhavan T. Venkataraman
2021-03-23 21:04                             ` Madhavan T. Venkataraman
2021-03-23 16:48                   ` Mark Rutland
2021-03-23 16:53                     ` Madhavan T. Venkataraman
2021-03-23 17:09                       ` Mark Rutland
2021-03-15 16:57   ` [RFC PATCH v2 6/8] arm64: Check the return PC of every stack frame madvenka
2021-03-15 16:57   ` [RFC PATCH v2 7/8] arm64: Detect kretprobed functions in stack trace madvenka
2021-03-15 16:58   ` [RFC PATCH v2 8/8] arm64: Implement arch_stack_walk_reliable() madvenka
2021-03-15 19:01   ` [RFC PATCH v2 0/8] arm64: Implement reliable stack trace Madhavan T. Venkataraman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=efaf82d9-8f66-7e0e-0cac-cee38450a224@linux.microsoft.com \
    --to=madvenka@linux.microsoft.com \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=jpoimboe@redhat.com \
    --cc=jthierry@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).