All of lore.kernel.org
 help / color / mirror / Atom feed
* missing stack frames
@ 2023-02-09  1:53 Pete Stevenson
  2023-02-09  3:36 ` Ian Rogers
  0 siblings, 1 reply; 4+ messages in thread
From: Pete Stevenson @ 2023-02-09  1:53 UTC (permalink / raw)
  To: linux-perf-users

We have crafted a program where with some definite expectations about
what stack traces we expect, but in less than 1% of the samples, we
see one stack frame missing.

Roughly speaking, we see the following:
libc -> main -> foo -> bar (99.49%) (expected)
libc -> main -> bar (0.51%) (not expected)

That is, in 0.51% +/- of the cases, the stack trace samples collected
by perf appear to indicate that "main" is directly calling "bar" which
we do not believe to be the case. We first observed this in a slightly
more complicated setup with a different profiler, but have been
chasing this down to a fairly reductive example. We ran the perf
profiler using a few different command line options (with and without
-g and --call-graph dwarf) and saw the same basic results with each
method.

Curious to know if anyone understands the underlying cause of this...
Is it something like a non-atomic sample of the stack trace, i.e. the
stack frames are being manipulated while the stack trace is being
sampled? Something else?

Thank you,
John P. (Pete) Stevenson

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: missing stack frames
  2023-02-09  1:53 missing stack frames Pete Stevenson
@ 2023-02-09  3:36 ` Ian Rogers
  2023-02-13  9:54   ` James Clark
  0 siblings, 1 reply; 4+ messages in thread
From: Ian Rogers @ 2023-02-09  3:36 UTC (permalink / raw)
  To: Pete Stevenson; +Cc: linux-perf-users

On Wed, Feb 8, 2023 at 5:54 PM Pete Stevenson <jstevenson@pixielabs.ai> wrote:
>
> We have crafted a program where with some definite expectations about
> what stack traces we expect, but in less than 1% of the samples, we
> see one stack frame missing.
>
> Roughly speaking, we see the following:
> libc -> main -> foo -> bar (99.49%) (expected)
> libc -> main -> bar (0.51%) (not expected)
>
> That is, in 0.51% +/- of the cases, the stack trace samples collected
> by perf appear to indicate that "main" is directly calling "bar" which
> we do not believe to be the case. We first observed this in a slightly
> more complicated setup with a different profiler, but have been
> chasing this down to a fairly reductive example. We ran the perf
> profiler using a few different command line options (with and without
> -g and --call-graph dwarf) and saw the same basic results with each
> method.
>
> Curious to know if anyone understands the underlying cause of this...
> Is it something like a non-atomic sample of the stack trace, i.e. the
> stack frames are being manipulated while the stack trace is being
> sampled? Something else?

It is possibly an issue with frame pointers. On x86 function entry looks like:
push %rbp
mov %rsp, %rbp

Stacks grow down, the x86 representation is:
https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/stacktrace.h#L101

The stack can only be walked if both of the entry instructions have
been executed and the 0.51% may be inside these first two instructions
(or similar in the function epilogue).

Thanks,
Ian
> Thank you,
> John P. (Pete) Stevenson

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: missing stack frames
  2023-02-09  3:36 ` Ian Rogers
@ 2023-02-13  9:54   ` James Clark
  2023-02-13 17:29     ` Pete Stevenson
  0 siblings, 1 reply; 4+ messages in thread
From: James Clark @ 2023-02-13  9:54 UTC (permalink / raw)
  To: Ian Rogers, Pete Stevenson; +Cc: linux-perf-users



On 09/02/2023 03:36, Ian Rogers wrote:
> On Wed, Feb 8, 2023 at 5:54 PM Pete Stevenson <jstevenson@pixielabs.ai> wrote:
>>
>> We have crafted a program where with some definite expectations about
>> what stack traces we expect, but in less than 1% of the samples, we
>> see one stack frame missing.
>>
>> Roughly speaking, we see the following:
>> libc -> main -> foo -> bar (99.49%) (expected)
>> libc -> main -> bar (0.51%) (not expected)
>>
>> That is, in 0.51% +/- of the cases, the stack trace samples collected
>> by perf appear to indicate that "main" is directly calling "bar" which
>> we do not believe to be the case. We first observed this in a slightly
>> more complicated setup with a different profiler, but have been
>> chasing this down to a fairly reductive example. We ran the perf
>> profiler using a few different command line options (with and without
>> -g and --call-graph dwarf) and saw the same basic results with each
>> method.
>>
>> Curious to know if anyone understands the underlying cause of this...
>> Is it something like a non-atomic sample of the stack trace, i.e. the
>> stack frames are being manipulated while the stack trace is being
>> sampled? Something else?
> 
> It is possibly an issue with frame pointers. On x86 function entry looks like:
> push %rbp
> mov %rsp, %rbp
> 
> Stacks grow down, the x86 representation is:
> https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/stacktrace.h#L101
> 
> The stack can only be walked if both of the entry instructions have
> been executed and the 0.51% may be inside these first two instructions
> (or similar in the function epilogue).
> 

On Arm we had an issue that looked like this when using --call-graph=fp,
but we managed to insert the second to last missing frame in Perf using
the dwarf as a post processing step. This was because leaf functions
don't insert a frame pointer, even with -fno-omit-frame-pointer because
there is no point.

With --call-graph=dwarf I would expect the stacks to always be complete,
even with inlining and missing frame pointers. Unless you aren't
including all the debug symbols or stripping them afterwards or
something like that? Otherwise it could be a bug in libunwind or libdw
or the dwarf generation.

James

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: missing stack frames
  2023-02-13  9:54   ` James Clark
@ 2023-02-13 17:29     ` Pete Stevenson
  0 siblings, 0 replies; 4+ messages in thread
From: Pete Stevenson @ 2023-02-13 17:29 UTC (permalink / raw)
  To: James Clark; +Cc: Ian Rogers, linux-perf-users

James - the comments about dwarf make sense -- I will double check the
results using --call-graph=dwarf.

Thank you!

On Mon, Feb 13, 2023 at 1:54 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 09/02/2023 03:36, Ian Rogers wrote:
> > On Wed, Feb 8, 2023 at 5:54 PM Pete Stevenson <jstevenson@pixielabs.ai> wrote:
> >>
> >> We have crafted a program where with some definite expectations about
> >> what stack traces we expect, but in less than 1% of the samples, we
> >> see one stack frame missing.
> >>
> >> Roughly speaking, we see the following:
> >> libc -> main -> foo -> bar (99.49%) (expected)
> >> libc -> main -> bar (0.51%) (not expected)
> >>
> >> That is, in 0.51% +/- of the cases, the stack trace samples collected
> >> by perf appear to indicate that "main" is directly calling "bar" which
> >> we do not believe to be the case. We first observed this in a slightly
> >> more complicated setup with a different profiler, but have been
> >> chasing this down to a fairly reductive example. We ran the perf
> >> profiler using a few different command line options (with and without
> >> -g and --call-graph dwarf) and saw the same basic results with each
> >> method.
> >>
> >> Curious to know if anyone understands the underlying cause of this...
> >> Is it something like a non-atomic sample of the stack trace, i.e. the
> >> stack frames are being manipulated while the stack trace is being
> >> sampled? Something else?
> >
> > It is possibly an issue with frame pointers. On x86 function entry looks like:
> > push %rbp
> > mov %rsp, %rbp
> >
> > Stacks grow down, the x86 representation is:
> > https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/stacktrace.h#L101
> >
> > The stack can only be walked if both of the entry instructions have
> > been executed and the 0.51% may be inside these first two instructions
> > (or similar in the function epilogue).
> >
>
> On Arm we had an issue that looked like this when using --call-graph=fp,
> but we managed to insert the second to last missing frame in Perf using
> the dwarf as a post processing step. This was because leaf functions
> don't insert a frame pointer, even with -fno-omit-frame-pointer because
> there is no point.
>
> With --call-graph=dwarf I would expect the stacks to always be complete,
> even with inlining and missing frame pointers. Unless you aren't
> including all the debug symbols or stripping them afterwards or
> something like that? Otherwise it could be a bug in libunwind or libdw
> or the dwarf generation.
>
> James

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-02-13 17:29 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-09  1:53 missing stack frames Pete Stevenson
2023-02-09  3:36 ` Ian Rogers
2023-02-13  9:54   ` James Clark
2023-02-13 17:29     ` Pete Stevenson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.