* Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace
       [not found]     ` <20201015154951.GD4390@sirena.org.uk>
@ 2020-10-15 21:29       ` Josh Poimboeuf
  2020-10-16 11:14         ` Mark Rutland
  2020-10-16 12:15         ` Mark Brown
From: Josh Poimboeuf @ 2020-10-15 21:29 UTC (permalink / raw)
  To: Mark Brown
  Cc: Mark Rutland, Miroslav Benes, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel, live-patching

I can't see the original patch.  Can the original poster (Mark B?) add
me to Cc on the next version?

It's also good practice to add lkml as well.  That way, those of us not
copied can at least find the patch in the archives.

live-patching@vger.kernel.org would also be a good idea for this one.

On Thu, Oct 15, 2020 at 04:49:51PM +0100, Mark Brown wrote:
> On Thu, Oct 15, 2020 at 03:16:12PM +0100, Mark Rutland wrote:
> > On Thu, Oct 15, 2020 at 03:39:37PM +0200, Miroslav Benes wrote:
> 
> > > I'll just copy an excerpt from my notes about the required guarantees. 
> > > Written by Josh (CCed, he has a better idea about the problem than me 
> > > anyway).
> 
> > > It also needs to:
> > > - detect preemption / page fault frames and return an error
> > > - only return success if it reaches the end of the task stack; for user
> > >   tasks, that means the syscall barrier; for kthreads/idle tasks, that
> > >   means finding a defined thread entry point
> > > - make sure it can't get into a recursive loop
> > > - make sure each return address is a valid text address
> > > - properly detect generated code hacks like function graph tracing and
> > >   kretprobes
> > > "
> 
> > It would be great if we could put something like the above into the
> > kernel tree, either under Documentation/ or in a comment somewhere for
> > the reliable stacktrace functions.
> 
> Yes, please - the expectations are quite hard to follow at the minute,
> and implementing it involves quite a bit of guesswork and cargo culting
> to figure out what the APIs are supposed to do.

Documentation is indeed long overdue.  I suppose everyone's looking at
me.  I can do that, but my bandwidth's limited for at least a few weeks.

[ Currently in week 4 of traveling cross-country with a camper
  ("caravan" in British-speak?), National Lampoon vacation style. ]

If by cargo culting, you mean reverse engineering the requirements due
to lack of documentation, that's fair.

Otherwise, if you see anything that doesn't make sense or that can be
improved, let me know.
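
In the meantime, for concreteness, the requirements above roughly boil
down to a loop like the one below.  This is purely an illustrative
sketch -- not any architecture's actual code -- and every sketch_*()
name is a made-up placeholder:

static int sketch_stack_walk_reliable(struct task_struct *task)
{
	struct sketch_frame frame;
	int max_frames = 512;			/* guard against loops */

	/*
	 * (A real implementation also has to see through generated-code
	 * hacks like function graph tracing and kretprobes.)
	 */
	sketch_unwind_start(&frame, task);

	while (!sketch_unwind_done(&frame)) {
		unsigned long pc = sketch_unwind_pc(&frame);

		if (max_frames-- <= 0)
			return -EINVAL;		/* possible recursion */

		if (!__kernel_text_address(pc))
			return -EINVAL;		/* bogus return address */

		if (sketch_frame_is_exception(&frame))
			return -EINVAL;		/* preemption / page fault */

		if (sketch_unwind_next(&frame))
			return -EINVAL;		/* unwind step failed */
	}

	/*
	 * Only a success if we positively reached the defined end of the
	 * task stack (syscall barrier or thread entry point).
	 */
	return sketch_reached_stack_end(&frame) ? 0 : -EINVAL;
}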

> > AFAICT, existing architectures don't always handle all of the above in
> > arch_stack_walk_reliable(). For example, it looks like x86 assumes
> > unwinding through exceptions is reliable for !CONFIG_FRAME_POINTER, but I
> > think this might not always be true.

Why not?

What else are the existing arches missing from the above list?

> I certainly wouldn't have inferred the list from what's there :/

Fair, presumably because of missing documentation.

> The searching for a defined thread entry point for example isn't
> entirely visible in the implementations.

For now I'll speak only of x86, because I don't quite remember how
powerpc does it.

For thread entry points, aka the "end" of the stack:

- For ORC, the end of the stack is either pt_regs, or -- when unwinding
  from kthreads, idle tasks, or irqs/exceptions in entry code --
  UNWIND_HINT_EMPTY (found by the unwinder's check for orc->end).

  [ Admittedly the implementation needs to be cleaned up a bit.  EMPTY
    is too broad and needs to be split into UNDEFINED and ENTRY. ]

- For frame pointers, by convention, the end of the stack for all tasks
  is a defined stack offset: end of stack page - sizeof(pt_regs).

And yes, all that needs to be documented.
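
Roughly, for the frame pointer case -- an illustrative sketch, not the
actual x86 unwinder code:

static bool sketch_at_final_frame(struct task_struct *task, unsigned long sp)
{
	/*
	 * By convention the last frame of every task sits just below
	 * pt_regs, i.e. at (end of stack page) - sizeof(pt_regs).
	 */
	unsigned long end = (unsigned long)task_stack_page(task) +
			    THREAD_SIZE - sizeof(struct pt_regs);

	return sp == end;
}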

-- 
Josh



* Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace
  2020-10-15 21:29       ` [RFC PATCH 0/3] arm64: Implement reliable stack trace Josh Poimboeuf
@ 2020-10-16 11:14         ` Mark Rutland
  2020-10-20 10:03           ` Mark Rutland
  2020-10-16 12:15         ` Mark Brown
From: Mark Rutland @ 2020-10-16 11:14 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Mark Brown, Miroslav Benes, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel, live-patching

Hi Josh,

On Thu, Oct 15, 2020 at 04:29:31PM -0500, Josh Poimboeuf wrote:
> > > AFAICT, existing architectures don't always handle all of the above in
> > > arch_stack_walk_reliable(). For example, it looks like x86 assumes
> > > unwinding through exceptions is reliable for !CONFIG_FRAME_POINTER, but I
> > > think this might not always be true.
> 
> Why not?

Mark B's reply dropped this, but the next paragraph covered that:

| I was planning to send a mail once I've finished writing a test, but
| IIUC there are some windows where ftrace/kretprobes
| detection/repainting may not work, e.g. if preempted after
| ftrace_return_to_handler() decrements curr_ret_stack, but before the
| arch trampoline asm restores the original return addr. So we might
| need something like an in_return_trampoline() to detect and report
| that reliably.

... so e.g. for a callchain A->B->C, where C is instrumented there are
windows where B might be missing from the trace, but the trace is
reported as reliable.
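
Roughly what I had in mind with in_return_trampoline() -- purely a
sketch, since no such helper exists today and the *_end symbols below
are invented for the example:

static bool in_return_trampoline(unsigned long pc)
{
	/*
	 * return_to_handler and kretprobe_trampoline are the ftrace /
	 * kretprobes return trampolines; the "_end" markers are made up
	 * here purely for illustration.
	 */
	return (pc >= (unsigned long)return_to_handler &&
		pc <  (unsigned long)return_to_handler_end) ||
	       (pc >= (unsigned long)kretprobe_trampoline &&
		pc <  (unsigned long)kretprobe_trampoline_end);
}

The reliable unwinder (or the saved-PC check for blocked tasks) would
then report an error whenever it sees an address in one of those
windows.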

I'll start a new thread on this (with a more fleshed-out example), with
the full set of livepatch folk, lkml, etc. I just want to write a test
case first, since it's entirely possible something I've missed is
catching this already.

Thanks,
Mark.


* Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace
  2020-10-15 21:29       ` [RFC PATCH 0/3] arm64: Implement reliable stack trace Josh Poimboeuf
  2020-10-16 11:14         ` Mark Rutland
@ 2020-10-16 12:15         ` Mark Brown
  2020-10-19 23:41           ` Josh Poimboeuf
From: Mark Brown @ 2020-10-16 12:15 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Mark Rutland, Miroslav Benes, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel, live-patching

On Thu, Oct 15, 2020 at 04:29:31PM -0500, Josh Poimboeuf wrote:
> I can't see the original patch.  Can the original poster (Mark B?) add
> me to Cc on the next version?

https://lore.kernel.org/linux-arm-kernel/20201012172605.10715-1-broonie@kernel.org/

> It's also good practice to add lkml as well.  That way, those of us not
> copied can at least find the patch in the archives.

> live-patching@vger.kernel.org would also be a good idea for this one.

Sorry about that.  I don't know if it's worth including a K: pattern for
arch_stack_walk_reliable() in the livepatch entry in MAINTAINERS?
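
Something like this in the LIVE PATCHING entry, say (untested, just to
show the idea):

	K:	arch_stack_walk_reliable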

> If by cargo culting, you mean reverse engineering the requirements due
> to lack of documentation, that's fair.

Yes, exactly - just copying the existing implementations and hoping that
it's sensible/relevant and covers everything that's needed.  It's not
entirely clear what a reliable stacktrace is expected to do that a
normal stacktrace doesn't do beyond returning an error code.

> > The searching for a defined thread entry point for example isn't
> > entirely visible in the implementations.

> For now I'll speak only of x86, because I don't quite remember how
> powerpc does it.

> For thread entry points, aka the "end" of the stack:

> - For ORC, the end of the stack is either pt_regs, or -- when unwinding
>   from kthreads, idle tasks, or irqs/exceptions in entry code --
>   UNWIND_HINT_EMPTY (found by the unwinder's check for orc->end).

>   [ Admittedly the implementation needs to be cleaned up a bit.  EMPTY
>     is too broad and needs to be split into UNDEFINED and ENTRY. ]

> - For frame pointers, by convention, the end of the stack for all tasks
>   is a defined stack offset: end of stack page - sizeof(pt_regs).

> And yes, all that needs to be documented.

Ah, I'd have interpreted "defined thread entry point" as meaning
expecting to find specific functions appearing at the end of the stack
rather than meaning positively identifying the end of the stack - for
arm64 we use a NULL frame pointer to indicate this in all situations.
In that case that's one bit that is already clear.
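
The check itself is essentially the following -- a minimal sketch, not
the in-tree arm64 code, and without the stack bounds checking a real
unwinder needs:

static int sketch_arm64_unwind(unsigned long fp)
{
	/* Walk AAPCS64 frame records {fp, lr} until the NULL fp sentinel. */
	while (fp) {
		unsigned long *record = (unsigned long *)fp;

		if (!__kernel_text_address(record[1]))
			return -EINVAL;	/* saved LR isn't kernel text */

		fp = record[0];		/* previous frame record */
	}

	return 0;	/* NULL fp == positively identified end of stack */
}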

From the list Miroslav posted, the bits I wouldn't have inferred were:

 - Detecting preemption/page faults
 - Preventing recursive loops
 - Verifying that return addresses are text addresses


* Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace
  2020-10-16 12:15         ` Mark Brown
@ 2020-10-19 23:41           ` Josh Poimboeuf
  2020-10-20 15:39             ` Mark Brown
From: Josh Poimboeuf @ 2020-10-19 23:41 UTC (permalink / raw)
  To: Mark Brown
  Cc: Mark Rutland, Miroslav Benes, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel, live-patching

On Fri, Oct 16, 2020 at 01:15:34PM +0100, Mark Brown wrote:
> 
> Yes, exactly - just copying the existing implementations and hoping that
> it's sensible/relevant and covers everything that's needed.  It's not
> entirely clear what a reliable stacktrace is expected to do that a
> normal stacktrace doesn't do beyond returning an error code.

While in the end there may not be much of a difference between normal
and reliable stacktraces beyond returning an error code, it still
requires beefing up the unwinder's error detection abilities.

> > > The searching for a defined thread entry point for example isn't
> > > entirely visible in the implementations.
> 
> > For now I'll speak only of x86, because I don't quite remember how
> > powerpc does it.
> 
> > For thread entry points, aka the "end" of the stack:
> 
> > - For ORC, the end of the stack is either pt_regs, or -- when unwinding
> >   from kthreads, idle tasks, or irqs/exceptions in entry code --
> >   UNWIND_HINT_EMPTY (found by the unwinder's check for orc->end).
> 
> >   [ Admittedly the implementation needs to be cleaned up a bit.  EMPTY
> >     is too broad and needs to be split into UNDEFINED and ENTRY. ]
> 
> > - For frame pointers, by convention, the end of the stack for all tasks
> >   is a defined stack offset: end of stack page - sizeof(pt_regs).
> 
> > And yes, all that needs to be documented.
> 
> Ah, I'd have interpreted "defined thread entry point" as meaning
> expecting to find specific functions appearing at the end of the stack
> rather than meaning positively identifying the end of the stack - for
> arm64 we use a NULL frame pointer to indicate this in all situations.
> In that case that's one bit that is already clear.

I think a NULL frame pointer isn't going to be robust enough.  For
example NULL could easily be introduced by a corrupt stack, or by asm
frame pointer misuse.

-- 
Josh



* Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace
  2020-10-16 11:14         ` Mark Rutland
@ 2020-10-20 10:03           ` Mark Rutland
  2020-10-20 15:58             ` Josh Poimboeuf
From: Mark Rutland @ 2020-10-20 10:03 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Mark Brown, Miroslav Benes, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel, live-patching

On Fri, Oct 16, 2020 at 12:14:31PM +0100, Mark Rutland wrote:
> Mark B's reply dropped this, but the next paragraph covered that:
> 
> | I was planning to send a mail once I've finished writing a test, but
> | IIUC there are some windows where ftrace/kretprobes
> | detection/repainting may not work, e.g. if preempted after
> | ftrace_return_to_handler() decrements curr_ret_stack, but before the
> | arch trampoline asm restores the original return addr. So we might
> | need something like an in_return_trampoline() to detect and report
> | that reliably.
> 
> ... so e.g. for a callchain A->B->C, where C is instrumented there are
> windows where B might be missing from the trace, but the trace is
> reported as reliable.

I'd missed a couple of details, and I think I see how each existing
architecture prevents this case now.

Josh, just to confirm the x86 case, am I right in thinking that the ORC
unwinder will refuse to unwind from the return_to_handler and
kretprobe_trampoline asm? IIRC objtool shouldn't build unwind info for
those as return_to_handler is marked with SYM_CODE_{START,END}() and
kretprobe_trampoline is marked with STACK_FRAME_NON_STANDARD().

Both powerpc and s390 refuse to reliably unwind through exceptions, so
they can rely on function call boundaries to keep the callchain in a
sane state.

Thanks,
Mark.


* Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace
  2020-10-19 23:41           ` Josh Poimboeuf
@ 2020-10-20 15:39             ` Mark Brown
  2020-10-20 16:28               ` Josh Poimboeuf
From: Mark Brown @ 2020-10-20 15:39 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Mark Rutland, Miroslav Benes, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel, live-patching

On Mon, Oct 19, 2020 at 06:41:55PM -0500, Josh Poimboeuf wrote:
> On Fri, Oct 16, 2020 at 01:15:34PM +0100, Mark Brown wrote:

> > Ah, I'd have interpreted "defined thread entry point" as meaning
> > expecting to find specific functions appearing at the end of the stack
> > rather than meaning positively identifying the end of the stack - for
> > arm64 we use a NULL frame pointer to indicate this in all situations.
> > In that case that's one bit that is already clear.

> I think a NULL frame pointer isn't going to be robust enough.  For
> example NULL could easily be introduced by a corrupt stack, or by asm
> frame pointer misuse.

Is it just the particular poison value that you're concerned about here
or are you looking for additional checks of some other kind?


* Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace
  2020-10-20 10:03           ` Mark Rutland
@ 2020-10-20 15:58             ` Josh Poimboeuf
From: Josh Poimboeuf @ 2020-10-20 15:58 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Mark Brown, Miroslav Benes, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel, live-patching

On Tue, Oct 20, 2020 at 11:03:52AM +0100, Mark Rutland wrote:
> On Fri, Oct 16, 2020 at 12:14:31PM +0100, Mark Rutland wrote:
> > Mark B's reply dropped this, but the next paragraph covered that:
> > 
> > | I was planning to send a mail once I've finished writing a test, but
> > | IIUC there are some windows where ftrace/kretprobes
> > | detection/repainting may not work, e.g. if preempted after
> > | ftrace_return_to_handler() decrements curr_ret_stack, but before the
> > | arch trampoline asm restores the original return addr. So we might
> > | need something like an in_return_trampoline() to detect and report
> > | that reliably.
> > 
> > ... so e.g. for a callchain A->B->C, where C is instrumented there are
> > windows where B might be missing from the trace, but the trace is
> > reported as reliable.
> 
> I'd missed a couple of details, and I think I see how each existing
> architecture prevents this case now.
> 
> Josh, just to confirm the x86 case, am I right in thinking that the ORC
> unwinder will refuse to unwind from the return_to_handler and
> kretprobe_trampoline asm? IIRC objtool shouldn't build unwind info for
> those as return_to_handler is marked with SYM_CODE_{START,END}() and
> kretprobe_trampoline is marked with STACK_FRAME_NON_STANDARD().

Hm, the handling of return_to_handler() actually looks like a bug.  UNWIND_HINT_EMPTY
sets end=1, which causes the ORC unwinder to treat it like entry code
(end of the stack).  So while it does stop the unwind, it fails to
report an error.

This would be fixed by the idea I previously mentioned, changing
UNWIND_HINT_EMPTY -> UNWIND_HINT_UNDEFINED (end=0) for the non-entry
cases.  I'll need to work up some patches.
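
Conceptually the unwinder-side check would end up something like this --
a hand-waving sketch only, not the eventual patches, and
ORC_TYPE_UNDEFINED is hypothetical:

/* Decide whether this ORC entry terminates the walk, and how. */
static bool sketch_orc_stops_unwind(const struct orc_entry *orc, bool *reliable)
{
	if (orc->type == ORC_TYPE_UNDEFINED) {	/* hypothetical type */
		*reliable = false;		/* e.g. return_to_handler */
		return true;
	}

	if (orc->end) {				/* genuine entry code */
		*reliable = true;
		return true;
	}

	return false;				/* keep unwinding */
}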

> Both powerpc and s390 refuse to reliably unwind through exceptions, so
> they can rely on function call boundaries to keep the callchain in a
> sane state.

Yes, and also true for x86 frame pointers.

-- 
Josh



* Re: [RFC PATCH 0/3] arm64: Implement reliable stack trace
  2020-10-20 15:39             ` Mark Brown
@ 2020-10-20 16:28               ` Josh Poimboeuf
From: Josh Poimboeuf @ 2020-10-20 16:28 UTC (permalink / raw)
  To: Mark Brown
  Cc: Mark Rutland, Miroslav Benes, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel, live-patching

On Tue, Oct 20, 2020 at 04:39:13PM +0100, Mark Brown wrote:
> On Mon, Oct 19, 2020 at 06:41:55PM -0500, Josh Poimboeuf wrote:
> > On Fri, Oct 16, 2020 at 01:15:34PM +0100, Mark Brown wrote:
> 
> > > Ah, I'd have interpreted "defined thread entry point" as meaning
> > > expecting to find specific functions appearing at the end of the stack
> > > rather than meaning positively identifying the end of the stack - for
> > > arm64 we use a NULL frame pointer to indicate this in all situations.
> > > In that case that's one bit that is already clear.
> 
> > I think a NULL frame pointer isn't going to be robust enough.  For
> > example NULL could easily be introduced by a corrupt stack, or by asm
> > frame pointer misuse.
> 
> Is it just the particular poison value that you're concerned about here
> or are you looking for additional checks of some other kind?

You just need to know you've conclusively reached the user entry point
on the stack, without missing any functions.

A sufficiently unique poison value might be ok.  Though, defining a
certain stack offset as the "end" seems more robust.

-- 
Josh


