From: Milian Wolff <milian.wolff@kdab.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Steinar H. Gunderson" <sgunderson@bigfoot.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jiri Olsa <jolsa@kernel.org>
Subject: Re: Inlined functions in perf report
Date: Tue, 20 Dec 2016 14:27:10 +0100
Message-ID: <2027151.EnbG4A8ymx@milian-kdab2>
In-Reply-To: <20161220121755.GL3124@twins.programming.kicks-ass.net>

On Tuesday, December 20, 2016 1:17:55 PM CET Peter Zijlstra wrote:
> On Tue, Dec 20, 2016 at 12:59:54PM +0100, Steinar H. Gunderson wrote:
> > Hi Peter,
> > 
> > I can't find a good point of contact for perf, so I'm contacting you based
> > on the MAINTAINERS file; feel free to redirect somewhere if you're not
> > the right person.
> 
> Cc'ed linux-perf-users@vger.kernel.org
> 
> > I'm trying to figure out how to deal with perf report when there are
> > inlined functions; they don't generally seem to show up in the call
> > stack, which sometimes can make it very hard to figure out what is going on,
> > especially in a code base one doesn't know too well. As an example, I
> > threw together a minimal test program:
> >
> >   #include <stdlib.h>
> >
> >   inline int foo()
> >   {
> >           int k = rand();
> >           int sum = 1;
> >           for (int i = 0; i < 10000000000; ++i)
> >           {
> >                   sum ^= k;
> >                   sum += k;
> >           }
> >           return sum;
> >   }
> >
> >   int main(void)
> >   {
> >           return foo();
> >   }
> > 
> > Compiling with -O2 -g, and running perf record -g yields:
> >   # Samples: 6K of event 'cycles:ppp'
> >   # Event count (approx.): 5876825543
> >   #
> >   # Children      Self  Command  Shared Object      Symbol
> >   # ........  ........  .......  .................  ......................
> >   #
> >       99.98%    99.98%  inline   inline             [.] main
> >               ---0x706258d4c544155
> >                  main
> >
> >       99.98%     0.00%  inline   [unknown]          [.] 0x0706258d4c544155
> >               ---0x706258d4c544155
> >                  main
> > 
> > Is there a way I can get it to show “foo” in the call graph? (I suppose
> > also ideally, “foo” and not “main” should show up in a non-graph run.) Of
> > course, this gets even more confusing if foo calls bar, since it now
> > looks like the call chain is main -> bar directly.
> > 
> > I have debug information that should be sufficient in the binary, because
> > if I break in gdb, I definitely get the call stack:
> >   Program received signal SIGINT, Interrupt.
> >   0x0000555555554589 in foo () at inline.c:5
> >   5               int k = rand();
> >   (gdb) bt
> >   #0  0x0000555555554589 in foo () at inline.c:5
> >   #1  main () at inline.c:17
> >   (gdb)
> > 
> > FWIW, this is with perf from 4.10 (git as of a few days ago) and GCC
> > 6.2.1.
> 
> OK, so it might be possible with: perf record -g --call-graph dwarf
> but that's fairly heavy on the overhead, it will dump the top-of-stack
> for each sample (8k default) and unwind using libunwind in userspace.

Even that is not enough: perf report lacks the step that adds inline frames.
It only adds the "real" frames it gets from either of the unwind libraries.
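
To make the missing step concrete, here is a minimal sketch in C of expanding
one unwound frame into inline frames. It is a toy model, not code from perf
itself: struct inline_range, expand_inline_frames() and the address ranges fed
to it are hypothetical stand-ins for what the debug info records about inlined
function bodies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for debug info: one inlined function body. */
struct inline_range {
	uint64_t start;   /* first address of the inlined body */
	uint64_t end;     /* one past the last address */
	const char *name; /* function that was inlined here */
	int depth;        /* 1 = inlined directly into the real function */
};

static int covers(const struct inline_range *r, uint64_t ip)
{
	return ip >= r->start && ip < r->end;
}

/*
 * Expand one "real" frame (ip, real_sym) into a list of names, innermost
 * first: every inlined function covering ip, then the real symbol.
 * Returns the number of names written to out (at most max).
 */
size_t expand_inline_frames(uint64_t ip, const char *real_sym,
			    const struct inline_range *ranges, size_t nranges,
			    const char **out, size_t max)
{
	size_t n = 0;
	int deepest = 0;

	assert(out != NULL && max > 0);

	/* Find the deepest inline nesting level that covers ip. */
	for (size_t i = 0; i < nranges; i++)
		if (covers(&ranges[i], ip) && ranges[i].depth > deepest)
			deepest = ranges[i].depth;

	/* Emit inlined frames from deepest to shallowest, keeping one
	 * slot free for the real symbol. */
	for (int d = deepest; d >= 1 && n + 1 < max; d--) {
		for (size_t i = 0; i < nranges; i++) {
			if (ranges[i].depth == d && covers(&ranges[i], ip)) {
				out[n++] = ranges[i].name;
				break;
			}
		}
	}

	/* The frame the unwinder actually reported comes last. */
	out[n++] = real_sym;
	return n;
}
```

With a single depth-1 range for foo covering the sampled address, a frame
reported as main expands to foo, main, which is the order gdb prints in the
backtrace quoted above.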

I have a WIP patch for this functionality, though. It depends on libbfd
(specifically bfd_find_inliner_info) and can be found here:

https://github.com/milianw/linux/commit/71d031c9d679bfb4a4044226e8903dd80ea601b3

This is not yet upstreamable, but any early comments would be welcome. I hope
to get some more time to drive this in the coming weeks. If you want to test
it out, check out my milian/perf branch of this repo, build it as you would
the normal user-space perf, then run:

perf report -g srcline -s sym,srcline

> The default mechanism used for call-graphs is frame-pointers which are
> (relatively) simple and fast to traverse from kernel space. The down
> side is of course that all your userspace needs to be compiled with
> frame pointers enabled and inlined functions, as you noticed, are
> 'lost'.
> 
> There has been talk to attempt to utilize the ELF EH frames which are
> mandatory in the x86_64 ABI (even for C) to attempt a kernel based
> 'DWARF' unwind, but nobody has put forward working code for this yet.
> Also, even if the EH stuff is mapped at runtime, it doesn't mean the
> pages will actually be loaded (due to demand paging) and available for
> use, which also will limit usability. (perf sampling is using
> interrupt/NMI context and we cannot page from that, so we're limited to
> memory that's present.)

While all of this would be nice to have, it is not directly related to 
inlining from what I gathered.
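
For illustration, a frame-pointer walk of the kind described in the quoted
text can be sketched in C as below. The struct frame_record layout and
walk_frame_pointers() are hypothetical simplifications, not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified view of what a frame pointer points at on
 * x86_64: the caller's saved frame pointer followed by the return address.
 */
struct frame_record {
	const struct frame_record *caller_fp; /* NULL ends the chain */
	uint64_t return_addr;
};

/*
 * Follow the chain starting at fp and collect one return address per
 * real call frame. Returns the number of addresses stored (at most max).
 */
size_t walk_frame_pointers(const struct frame_record *fp,
			   uint64_t *ips, size_t max)
{
	size_t n = 0;

	assert(ips != NULL);

	while (fp != NULL && n < max) {
		ips[n++] = fp->return_addr;
		fp = fp->caller_fp;
	}
	return n;
}
```

The walk yields one entry per frame record. An inlined function never pushes
such a record, so its frame has to be added afterwards from debug info,
whichever unwinding method produced the addresses.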

Bye

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
