linux-kernel.vger.kernel.org archive mirror
From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: broonie@kernel.org, jpoimboe@redhat.com, jthierry@redhat.com,
	catalin.marinas@arm.com, will@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 5/8] arm64: Detect an FTRACE frame and mark a stack trace unreliable
Date: Tue, 23 Mar 2021 12:27:36 -0500
Message-ID: <2a390ffb-4931-9b7d-e203-5d0189052744@linux.microsoft.com>
In-Reply-To: <bc450f09-1881-9a9c-bfbc-5bb31c01d8ce@linux.microsoft.com>



On 3/23/21 12:23 PM, Madhavan T. Venkataraman wrote:
> 
> 
> On 3/23/21 12:02 PM, Mark Rutland wrote:
>> On Tue, Mar 23, 2021 at 11:20:44AM -0500, Madhavan T. Venkataraman wrote:
>>> On 3/23/21 10:26 AM, Madhavan T. Venkataraman wrote:
>>>> On 3/23/21 9:57 AM, Mark Rutland wrote:
>>>>> On Tue, Mar 23, 2021 at 09:15:36AM -0500, Madhavan T. Venkataraman wrote:
>>>> So, my next question is - can we define a practical limit for the
>>>> nesting so that any nesting beyond that is fatal? The reason I ask
>>>> is - if there is a max, then we can allocate an array of stack
>>>> frames out of band for the special frames so they are not part of
>>>> the stack and will not likely get corrupted.
>>>>
>>>> Also, we don't have to do any special detection. If the number of
>>>> out of band frames used is one or more then we have exceptions and
>>>> the stack trace is unreliable.
>>>
>>> Alternatively, if we can just increment a counter in the task
>>> structure when an exception is entered and decrement it when an
>>> exception returns, that counter will tell us that the stack trace is
>>> unreliable.
>>
>> As I noted earlier, we must treat *any* EL1 exception boundary as
>> unreliable for unwinding, and per my other comments w.r.t.
>> corrupting the call chain I don't think we need additional protection on
>> exception boundaries specifically.
>>
>>> Is this feasible?
>>>
>>> I think I have enough for v3 at this point. If you think that the
>>> counter idea is OK, I can implement it in v3. Once you confirm, I will
>>> start working on v3.
>>
>> Currently, I don't see a compelling reason to need this, and would
>> prefer to avoid it.
>>
> 
> I think that I did a bad job of explaining what I wanted to do. It is not
> for any additional protection at all.
> 
> So, let us say we create a field in the task structure:
> 
> 	u64		unreliable_stack;
> 
> Whenever an EL1 exception is entered or FTRACE is entered and pt_regs get
> set up and pt_regs->stackframe gets chained, increment unreliable_stack.
> On exiting the above, decrement unreliable_stack.
> 
> In arch_stack_walk_reliable(), simply do this check upfront:
> 
> 	if (task->unreliable_stack)
> 		return -EINVAL;
> 
> This way, the function does not even bother unwinding the stack to find
> exception frames or checking for different return addresses or anything.
> We also don't have to worry about code being reorganized, functions
> being renamed, etc. It also may help in debugging to know if a task is
> experiencing an exception and the level of nesting, etc.
> 
>> More generally, could we please break this work into smaller steps? I
>> reckon we can break this down into the following chunks:
>>
>> 1. Add the explicit final frame and associated handling. I suspect that
>>    this is complicated enough on its own to be an independent series,
>>    and it's something that we can merge without all the bits and pieces
>>    necessary for truly reliable stacktracing.
>>
> 
> OK. I can do that.
> 
>> 2. Figure out how we must handle kprobes and ftrace. That probably means
>>    rejecting unwinds from specific places, but we might also want to
>>    adjust the trampolines if that makes this easier.
>>
> 
> I think I am already doing all the checks except the one you mentioned
> earlier. Yes, I can do this separately.
> 
>> 3. Figure out exception boundary handling. I'm currently working to
>>    simplify the entry assembly down to a uniform set of stubs, and I'd
>>    prefer to get that sorted before we teach the unwinder about
>>    exception boundaries, as it'll be significantly simpler to reason
>>    about and won't end up clashing with the rework.
>>
> 
> So, here is where I still have a question. Is it necessary for the unwinder
> to know the exception boundaries? Is it not enough if it knows if there are
> exceptions present? For instance, using something like num_special_frames

Typo - num_special_frames should be unreliable_stack. That is the name of
the counter I used above.

Sorry about that.

Madhavan
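
The counter scheme proposed in the quoted text above can be modeled in
userspace roughly as follows. This is only a sketch under stated
assumptions: `struct task`, `special_frame_enter()`,
`special_frame_exit()` and `stack_walk_reliable()` are hypothetical
stand-ins and not kernel APIs. Only the `unreliable_stack` field name and
the upfront `-EINVAL` check come from the message itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's task structure. */
struct task {
	/* Nesting depth of EL1 exception / FTRACE frames currently live. */
	uint64_t unreliable_stack;
};

/*
 * Models the point where pt_regs are set up and pt_regs->stackframe is
 * chained on EL1 exception entry or FTRACE entry.
 */
void special_frame_enter(struct task *task)
{
	task->unreliable_stack++;
}

/* Models the matching exit path; the counter must never underflow. */
void special_frame_exit(struct task *task)
{
	assert(task->unreliable_stack > 0);
	task->unreliable_stack--;
}

/*
 * Models the upfront check proposed for arch_stack_walk_reliable():
 * refuse to unwind at all while any special frame is outstanding.
 */
int stack_walk_reliable(const struct task *task)
{
	if (task->unreliable_stack)
		return -EINVAL;

	/* A real implementation would unwind the stack here. */
	return 0;
}
```

Because the field is a counter rather than a flag, nested entries stay
unreliable until the outermost exit brings the count back to zero, and the
current value doubles as the nesting depth for debugging.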

