* Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation
@ 2016-04-19  0:07 Mark Davis
  2016-04-19  0:18 ` David Ahern
  2016-04-19  2:26 ` Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation Taeung Song
  0 siblings, 2 replies; 10+ messages in thread
From: Mark Davis @ 2016-04-19  0:07 UTC (permalink / raw)
  To: linux-perf-users

I'm struggling to get perf_events to give me stack traces with symbols,
despite reading many tutorials on the subject and doing (I think) all the
necessary things. Is it possible that my local install of perf (details on
that below) is somehow botched? Anyway, here's what I did:

main.cpp is a simple C++ program that calls a few functions defined in the same 
file, allocates some memory and frees it, and prints a few things out.

compilation command:
    
    gcc -std=c++11 -lstdc++ main.cpp -Og -fno-omit-frame-pointer -fno-inline -o arr_test

profile command:

    perf record -a -g -- ./arr_test && perf report --stdio


I do get the following warnings about kernel symbols, but I don't think this 
should matter given that I only care about symbols in my application for now:

    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.052 MB perf.data (~2285 samples) ]
    [kernel.kallsyms] with build id e22966849c48748782a1be4fe0ce94db6838b806 not found, continuing without symbols
    [kernel.kallsyms] with build id e22966849c48748782a1be4fe0ce94db6838b806 not found, continuing without symbols
    Warning:
    Kernel address maps (/proc/{kallsyms,modules}) were restricted.
    
    Check /proc/sys/kernel/kptr_restrict before running 'perf record'.
    
    As no suitable kallsyms nor vmlinux was found, kernel samples
    can't be resolved.
    
    Samples in kernel modules can't be resolved as well.
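
The warning above refers to the kptr_restrict sysctl. As a minimal sketch (the helper name `describe_kptr_restrict` is made up for illustration and is not part of perf), the current value can be inspected before recording. It only affects the resolution of kernel samples, not symbols in arr_test itself:

```shell
# describe_kptr_restrict is a hypothetical helper, not a perf command.
describe_kptr_restrict() {
    case "$1" in
        0) echo "no restriction from this sysctl" ;;
        1) echo "kernel pointers hidden unless the reader has CAP_SYSLOG" ;;
        2) echo "kernel pointers hidden from everyone" ;;
        *) echo "unrecognized value: $1" ;;
    esac
}

# Report the current setting if the file is readable on this system.
f=/proc/sys/kernel/kptr_restrict
if [ -r "$f" ]; then
    v=$(cat "$f")
    echo "kptr_restrict=$v: $(describe_kptr_restrict "$v")"
fi

# To lift the restriction for a profiling session (as root):
#   sysctl -w kernel.kptr_restrict=0
```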

Here's a snippet of the output:

    # Overhead   Command      Shared Object
    # ........  ........  .................
    #
        83.27%  arr_test  arr_test         
                |          
                |--34.12%-- 0x400908
                |          0x7fe72b381ec5
                |          
                |--10.48%-- 0x400903
                |          0x7fe72b381ec5
                |          
                |--10.08%-- 0x4008b8
                |          0x7fe72b381ec5
                |          
                |--9.22%-- 0x4008e5
                |          0x7fe72b381ec5
                |          
                |--9.05%-- 0x4008da
                |          0x7fe72b381ec5
                |          
                |--8.49%-- 0x4008f0
                |          0x7fe72b381ec5
                |          
                |--6.87%-- 0x4008d5
                |          0x7fe72b381ec5
                |          
                |--6.23%-- 0x4008c2
                |          0x7fe72b381ec5
                |          
                |--4.76%-- 0x4008fd
                |          0x7fe72b381ec5
                 --0.70%-- [...]
    
         8.02%  arr_test  [kernel.kallsyms]
                |          
                |--4.87%-- 0xffffffff81140b64
                |          0xffffffff81146646
                |          0xffffffff81182751
                |          0xffffffff811829eb
                |          0xffffffff8173317d
                |          0x7fe72bab86a7
                |          0x7fe72baa7e00


file info (shows "not stripped"):

    $ file arr_test 
    arr_test: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

Details on my perf install (do any of these warnings prevent me from seeing 
symbols in stacks?)

    Auto-detecting system features:
    ...                     backtrace: [ on  ]
    ...                         dwarf: [ OFF ]
    ...                fortify-source: [ on  ]
    ...                         glibc: [ on  ]
    ...                          gtk2: [ on  ]
    ...                  gtk2-infobar: [ on  ]
    ...                      libaudit: [ OFF ]
    ...                        libbfd: [ OFF ]
    ...                        libelf: [ OFF ]
    ...             libelf-getphdrnum: [ OFF ]
    ...                   libelf-mmap: [ OFF ]
    ...                       libnuma: [ on  ]
    ...                       libperl: [ on  ]
    ...                     libpython: [ on  ]
    ...             libpython-version: [ on  ]
    ...                      libslang: [ on  ]
    ...                     libunwind: [ OFF ]
    ...                       on-exit: [ on  ]
    ...                stackprotector: [ on  ]
    ...            stackprotector-all: [ on  ]
    ...                       timerfd: [ on  ]
    
    config/Makefile:264: No libelf found, disables 'probe' tool, please install elfutils-libelf-devel/libelf-dev
    config/Makefile:329: No libunwind found, disabling post unwind support. Please install libunwind-dev[el] >= 1.1
    config/Makefile:354: No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev

How can I find my symbols in perf?
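
As a rough sketch of how to triage a build like this one, the feature table printed by the perf build can be filtered for entries marked OFF. The function name `list_off_features` is invented for this example; it assumes only the `...  name: [ OFF ]` layout shown in the table above.

```shell
# list_off_features: hypothetical helper. Reads the build's feature table
# on stdin and prints the names of the features marked OFF.
list_off_features() {
    awk '$1 == "..." && $4 == "OFF" { sub(/:$/, "", $2); print $2 }'
}

# Example with three rows copied from the table above.
list_off_features <<'EOF'
    ...                         dwarf: [ OFF ]
    ...                         glibc: [ on  ]
    ...                        libelf: [ OFF ]
EOF
```

Fed the full table from this report, it would list dwarf, libaudit, libbfd, libelf, libelf-getphdrnum, libelf-mmap and libunwind.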


* Re: Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation
  2016-04-19  0:07 Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation Mark Davis
@ 2016-04-19  0:18 ` David Ahern
  2016-04-25  9:01   ` Milian Wolff
  2016-04-19  2:26 ` Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation Taeung Song
  1 sibling, 1 reply; 10+ messages in thread
From: David Ahern @ 2016-04-19  0:18 UTC (permalink / raw)
  To: Mark Davis, linux-perf-users

On 4/18/16 6:07 PM, Mark Davis wrote:
>     Auto-detecting system features:
>      ...                     backtrace: [ on  ]
>      ...                         dwarf: [ OFF ]
>      ...                fortify-source: [ on  ]
>      ...                         glibc: [ on  ]
>      ...                          gtk2: [ on  ]
>      ...                  gtk2-infobar: [ on  ]
>      ...                      libaudit: [ OFF ]
>      ...                        libbfd: [ OFF ]
>      ...                        libelf: [ OFF ]

Install those 2 development packages.
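
A hedged sketch of what that could look like. The libelf, libunwind and libaudit package names are taken from the Makefile warnings in the original report; `binutils-dev` / `binutils-devel` as the packages providing the libbfd headers is an assumption about Debian-style and Fedora-style distributions. The reply points at two packages; libunwind and libaudit are listed only because the warnings mention them. `suggest_install` is a made-up helper that prints a command rather than running it.

```shell
# suggest_install: hypothetical helper, prints an install command per distro family.
suggest_install() {
    case "$1" in
        debian) echo "apt-get install libelf-dev binutils-dev libunwind-dev libaudit-dev" ;;
        fedora) echo "dnf install elfutils-libelf-devel binutils-devel libunwind-devel audit-libs-devel" ;;
        *)      echo "unknown distro family: $1" >&2; return 1 ;;
    esac
}

suggest_install debian

# After installing, rebuild perf from the kernel tree so that feature
# detection runs again:
#   make -C tools/perf
```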


* Re: Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation
  2016-04-19  0:07 Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation Mark Davis
  2016-04-19  0:18 ` David Ahern
@ 2016-04-19  2:26 ` Taeung Song
  1 sibling, 0 replies; 10+ messages in thread
From: Taeung Song @ 2016-04-19  2:26 UTC (permalink / raw)
  To: Mark Davis, linux-perf-users

Hi,

On 04/19/2016 09:07 AM, Mark Davis wrote:
> I'm struggling getting perf_events to give me stack traces with symbols,
> despite reading many tutorials on the subject and doing (I think) all the
> necessary things. It's possible that my local install of perf (details on that
> below) is somehow botched? Anyway, here's what I did:
>
> main.cpp is a simple C++ program that calls a few functions defined in the same
> file, allocates some memory and frees it, and prints a few things out.
>
> compilation command:
>
>      gcc -std=c++11 -lstdc++ main.cpp -Og -fno-omit-frame-pointer -fno-inline -o arr_test
>
> profile command:
>
>      perf record -a -g -- ./arr_test && perf report --stdio
>
>
> I do get the following warnings about kernel symbols, but I don't think this
> should matter given that I only care about symbols in my application for now:
>

As a guide: if you append ":u" to an event name, you measure only at the
user level, e.g.:

     perf record -ag -e cycles:u ./arr_test
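
A small variation on this, sketched as a wrapper (the function name `record_user_only` is hypothetical): dropping `-a` limits recording to the profiled command instead of every CPU system-wide, which tends to keep unrelated processes out of the report.

```shell
# record_user_only: hypothetical wrapper around perf record.
record_user_only() {
    # -g captures call chains; cycles:u counts cycles in user space only.
    perf record -g -e cycles:u -- "$@"
}

# Usage:
#   record_user_only ./arr_test && perf report --stdio
```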


* Re: Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation
  2016-04-19  0:18 ` David Ahern
@ 2016-04-25  9:01   ` Milian Wolff
  2016-04-26  1:03     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Milian Wolff @ 2016-04-25  9:01 UTC (permalink / raw)
  To: David Ahern; +Cc: Mark Davis, linux-perf-users


On Monday, April 18, 2016 6:18:45 PM CEST David Ahern wrote:
> On 4/18/16 6:07 PM, Mark Davis wrote:
> >     Auto-detecting system features:
> >      ...                     backtrace: [ on  ]
> >      ...                         dwarf: [ OFF ]
> >      ...                fortify-source: [ on  ]
> >      ...                         glibc: [ on  ]
> >      ...                          gtk2: [ on  ]
> >      ...                  gtk2-infobar: [ on  ]
> >      ...                      libaudit: [ OFF ]
> >      ...                        libbfd: [ OFF ]
> >      ...                        libelf: [ OFF ]
> 
> Install those 2 development packages.

And if that is still not helping, you may run into the case where the samples 
are recorded in a library (like libstdc++, libc,...) which was provided by 
your distribution without frame pointers. In such a case, the backtrace will 
still be broken.

If you want to use frame pointers and operate on user-space code, my advice
is to recompile all dependencies with frame pointers. On Yocto/Gentoo that is
easily doable; elsewhere you'll have a hard time and waste a ton of time.
I suggest you simply use DWARF unwinding.
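
For reference, a minimal sketch of DWARF-based recording (the wrapper name `record_dwarf` is made up). It assumes a perf binary built with libunwind or libdw support, which the build in the original report lacks (dwarf and libunwind are both OFF there).

```shell
# record_dwarf: hypothetical wrapper around perf record.
record_dwarf() {
    # --call-graph dwarf copies a chunk of the user stack with each sample
    # and unwinds it later, at report time.
    perf record --call-graph dwarf -- "$@"
}

# Usage:
#   record_dwarf ./arr_test && perf report --stdio
```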

Bye

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts



* Re: Missing stack symbols with perf_event's perf report, despite -fno-omit-frame-pointer compilation
  2016-04-25  9:01   ` Milian Wolff
@ 2016-04-26  1:03     ` Arnaldo Carvalho de Melo
  2016-04-26  1:24       ` LBR callchains from tracepoints Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-04-26  1:03 UTC (permalink / raw)
  To: Milian Wolff; +Cc: David Ahern, Mark Davis, linux-perf-users

Em Mon, Apr 25, 2016 at 11:01:54AM +0200, Milian Wolff escreveu:
> On Monday, April 18, 2016 6:18:45 PM CEST David Ahern wrote:
> > On 4/18/16 6:07 PM, Mark Davis wrote:
> > >     Auto-detecting system features:
> > >      ...                     backtrace: [ on  ]
> > >      ...                         dwarf: [ OFF ]
> > >      ...                fortify-source: [ on  ]
> > >      ...                         glibc: [ on  ]
> > >      ...                          gtk2: [ on  ]
> > >      ...                  gtk2-infobar: [ on  ]
> > >      ...                      libaudit: [ OFF ]
> > >      ...                        libbfd: [ OFF ]
> > >      ...                        libelf: [ OFF ]

> > Install those 2 development packages.
 
> And if that is still not helping, you may run into the case where the samples 
> are recorded in a library (like libstdc++, libc,...) which was provided by 
> your distribution without frame pointers. In such a case, the backtrace will 
> still be broken.
 
> If you want to use frame pointers, and operate on user space code, my advise 
> is to recompile all dependencies with frame pointers. On Yocto/Gentoo that is 
> easily doable, elsewhere you'll have a hard time and waste a ton of time.
> I suggest you simply use Dwarf unwinding.

Please try '--call-graph lbr' to check whether your hardware has LBR; it
will be much cheaper than DWARF unwinding.

For instance:

[acme@jouet linux]$ perf record --call-graph lbr usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (9 samples) ]
[acme@jouet linux]$ perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec:
1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
[acme@jouet linux]$

-   59.69%  usleep   [kernel]  [k] vma_interval_tree_insert
    →vma_interval_tree_insert  [kernel]
     vma_adjust                [kernel]
     __split_vma.isra.31       [kernel]
     split_vma                 [kernel]
     mprotect_fixup            [kernel]
     sys_mprotect              [kernel]
     entry_SYSCALL_64_fastpath [kernel]
     mprotect                  ld-2.22.so
     _dl_relocate_object       ld-2.22.so
     memcpy@GLIBC_2.2.5        libc-2.22.so
     _dl_relocate_object       ld-2.22.so
     __gettimeofday            libc-2.22.so
     _dl_vdso_vsym             libc-2.22.so
     _dl_lookup_symbol_x       ld-2.22.so 

This was done on a Broadwell system (ThinkPad t450s).
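
To confirm that a given perf.data was in fact recorded with LBR call stacks, one can look for BRANCH_STACK in sample_type and CALL_STACK in branch_sample_type, as in the `perf evlist -v` output above. A minimal sketch, with `uses_lbr_callstack` as an invented helper that reads that output on stdin:

```shell
# uses_lbr_callstack: hypothetical helper. Succeeds if the evlist output on
# stdin mentions BRANCH_STACK followed later by CALL_STACK.
uses_lbr_callstack() {
    out=$(cat)
    case "$out" in
        *BRANCH_STACK*CALL_STACK*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   perf evlist -v | uses_lbr_callstack && echo "LBR call stacks were requested"
```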

I now need to continue investigating why this doesn't seem to work from
tracepoints...

- Arnaldo


* LBR callchains from tracepoints
  2016-04-26  1:03     ` Arnaldo Carvalho de Melo
@ 2016-04-26  1:24       ` Arnaldo Carvalho de Melo
  2016-04-26 16:38         ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-04-26  1:24 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Peter Zijlstra, David Ahern, Milian Wolff,
	Frédéric Weisbecker, Ingo Molnar, Namhyung Kim,
	Linux Kernel Mailing List

Em Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo escreveu:
> I now need to continue investigation why this doesn't seem to work from
> tracepoints...

Bummer, the changeset (at the end of this message) has no explanation.
Is this really impossible, i.e. LBR callstacks from tracepoints? Even if
we set perf_event_attr.exclude_callchain_kernel?

I've read somewhere that LBR wouldn't work for the kernel, but when I
tried, for cycles:ppp I got:

[acme@jouet linux]$ perf record --call-graph lbr usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (9 samples) ]
[acme@jouet linux]$ perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec:
1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
[acme@jouet linux]$

-   59.69%  usleep   [kernel]  [k] vma_interval_tree_insert
    →vma_interval_tree_insert  [kernel]
     vma_adjust                [kernel]
     __split_vma.isra.31       [kernel]
     split_vma                 [kernel]
     mprotect_fixup            [kernel]
     sys_mprotect              [kernel]
     entry_SYSCALL_64_fastpath [kernel]
     mprotect                  ld-2.22.so
     _dl_relocate_object       ld-2.22.so
     memcpy@GLIBC_2.2.5        libc-2.22.so
     _dl_relocate_object       ld-2.22.so
     __gettimeofday            libc-2.22.so
     _dl_vdso_vsym             libc-2.22.so
     _dl_lookup_symbol_x       ld-2.22.so

This was done on a Broadwell system (ThinkPad t450s).

- Arnaldo

commit 2481c5fa6db0237e4f0168f88913178b2b495b7c
Author: Stephane Eranian <eranian@google.com>
Date:   Thu Feb 9 23:20:59 2012 +0100

    perf: Disable PERF_SAMPLE_BRANCH_* when not supported
    
    PERF_SAMPLE_BRANCH_* is disabled for:
    
     - SW events (sw counters, tracepoints)
     - HW breakpoints
     - ALL but Intel x86 architecture
     - AMD64 processors
    
    Signed-off-by: Stephane Eranian <eranian@google.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/1328826068-11713-10-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar <mingo@elte.hu>


* Re: LBR callchains from tracepoints
  2016-04-26  1:24       ` LBR callchains from tracepoints Arnaldo Carvalho de Melo
@ 2016-04-26 16:38         ` Peter Zijlstra
  2016-04-26 17:26           ` Alexei Starovoitov
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2016-04-26 16:38 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Stephane Eranian, David Ahern, Milian Wolff,
	Frédéric Weisbecker, Ingo Molnar, Namhyung Kim,
	Linux Kernel Mailing List

On Mon, Apr 25, 2016 at 10:24:31PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo escreveu:
> > I now need to continue investigation why this doesn't seem to work from
> > tracepoints...
> 
> Bummer, the changeset (at the end of this message) hasn't any
> explanation, is this really impossible? I.e. LBR callstacks from
> tracepoints? Even if we set perf_event_attr.exclude_callchain_kernel?

Could maybe be done, but it's tricky to implement, as the LBR is managed
by the hardware PMU and tracepoints are a software PMU, so we would then
need to somehow frob cross-PMU resources in a very arch-specific way.
And programmability of the hardware PMU would then depend on events
outside of it.

All rather icky.
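
To make the hardware-versus-software PMU distinction concrete: each event carries a perf_event_attr.type, and the kernel exposes each PMU's type id under /sys/bus/event_source/devices. A minimal sketch that lists them; `perf_type_name` is a hypothetical helper covering only the fixed PERF_TYPE_* values from the perf_event UAPI header.

```shell
# perf_type_name: hypothetical helper mapping a type id to its PERF_TYPE_* name.
perf_type_name() {
    case "$1" in
        0) echo PERF_TYPE_HARDWARE ;;
        1) echo PERF_TYPE_SOFTWARE ;;
        2) echo PERF_TYPE_TRACEPOINT ;;
        3) echo PERF_TYPE_HW_CACHE ;;
        4) echo PERF_TYPE_RAW ;;
        5) echo PERF_TYPE_BREAKPOINT ;;
        *) echo "dynamic PMU type $1" ;;
    esac
}

# Print "<pmu name> <type id> <type name>" for every PMU the kernel registered.
for d in /sys/bus/event_source/devices/*; do
    if [ -r "$d/type" ]; then
        t=$(cat "$d/type")
        printf '%s\t%s\t%s\n' "$(basename "$d")" "$t" "$(perf_type_name "$t")"
    fi
done
```

On a typical x86 machine the `tracepoint` and `software` entries are separate devices from the core `cpu` PMU that owns the LBR, which is the boundary described above.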


* Re: LBR callchains from tracepoints
  2016-04-26 16:38         ` Peter Zijlstra
@ 2016-04-26 17:26           ` Alexei Starovoitov
  2016-04-26 18:20             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Alexei Starovoitov @ 2016-04-26 17:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern,
	Milian Wolff, Frédéric Weisbecker, Ingo Molnar,
	Namhyung Kim, Linux Kernel Mailing List

On Tue, Apr 26, 2016 at 06:38:28PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 25, 2016 at 10:24:31PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > I now need to continue investigation why this doesn't seem to work from
> > > tracepoints...
> > 
> > Bummer, the changeset (at the end of this message) hasn't any
> > explanation, is this really impossible? I.e. LBR callstacks from
> > tracepoints? Even if we set perf_event_attr.exclude_callchain_kernel?
> 
> Could maybe be done, but its tricky to implement as the LBR is managed
> by the hardware PMU and tracepoints are a software PMU, so we need to
> then somehow frob with cross-pmu resources, in a very arch specific way.
> And programmability of the hardware PMU will then depend on events
> outside of it.

btw we're thinking of adding support for LBR to BPF, so that from the program
we can get accurate and fast stacks. That's especially important for user
space stacks. No clear idea how to do it yet, but it would be really useful.


* Re: LBR callchains from tracepoints
  2016-04-26 17:26           ` Alexei Starovoitov
@ 2016-04-26 18:20             ` Arnaldo Carvalho de Melo
  2016-04-26 19:07               ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-04-26 18:20 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Peter Zijlstra, Stephane Eranian, David Ahern, Milian Wolff,
	Frédéric Weisbecker, Ingo Molnar, Namhyung Kim,
	Linux Kernel Mailing List

Em Tue, Apr 26, 2016 at 10:26:32AM -0700, Alexei Starovoitov escreveu:
> On Tue, Apr 26, 2016 at 06:38:28PM +0200, Peter Zijlstra wrote:
> > On Mon, Apr 25, 2016 at 10:24:31PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > I now need to continue investigation why this doesn't seem to work from
> > > > tracepoints...

> > > Bummer, the changeset (at the end of this message) hasn't any
> > > explanation, is this really impossible? I.e. LBR callstacks from
> > > tracepoints? Even if we set perf_event_attr.exclude_callchain_kernel?

> > Could maybe be done, but its tricky to implement as the LBR is managed
> > by the hardware PMU and tracepoints are a software PMU, so we need to
> > then somehow frob with cross-pmu resources, in a very arch specific way.
> > And programmability of the hardware PMU will then depend on events
> > outside of it.
 
> btw we're thinking to add support for lbr to bpf, so that from the program
> we can get accurate and fast stacks. That's especially important for user
> space stacks. No clear idea how to do it yet, but it would be really useful.

Yeah, and that already works in perf; it's just that it doesn't work from some
event types (PERF_TYPE_SOFTWARE, PERF_TYPE_TRACEPOINT, etc), as described in
the changeset I mentioned.

'perf trace --call-graph lbr' doesn't work right now, even when it is
interested only in the user space bits, i.e. when setting
perf_event_attr.exclude_callchain_kernel.

  # perf trace --call-graph dwarf

works, but that, as you mention, really isn't "fast" and sometimes not
accurate, or at least wasn't with broken toolchains.

Example mixing strace-like output with userspace-only DWARF callchains (it
would be lovely if this were with LBR, huh?), plus fp callchains for the
sched:sched_switch tracepoint, plus LBR callchains for a hardware event,
cycles. Look further below for the reason for the broken timestamps on
PERF_TYPE_HARDWARE events:

  # perf trace -e nanosleep --event sched:sched_switch/call-graph=fp/ --ev cycles/call-graph=lbr,period=100/ usleep 1
18446744073709.551 (         ): cycles/call-graph=lbr,period=100/:)
                                       __intel_pmu_enable_all+0xfe200080 ([kernel.kallsyms])
                                       intel_pmu_enable_all+0xfe200010 ([kernel.kallsyms])
                                       x86_pmu_enable+0xfe200271 ([kernel.kallsyms])
                                       perf_pmu_enable.part.81+0xfe200007 ([kernel.kallsyms])
                                       ctx_resched+0xfe20007a ([kernel.kallsyms])
                                       perf_event_exec+0xfe20011d ([kernel.kallsyms])
                                       setup_new_exec+0xfe20006f ([kernel.kallsyms])
                                       load_elf_binary+0xfe2003e3 ([kernel.kallsyms])
                                       search_binary_handler+0xfe20009e ([kernel.kallsyms])
                                       do_execveat_common.isra.38+0xfe20052c ([kernel.kallsyms])
                                       sys_execve+0xfe20003a ([kernel.kallsyms])
                                       do_syscall_64+0xfe200062 ([kernel.kallsyms])
                                       return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
                                       [0] ([unknown])
     0.310 ( 0.006 ms): usleep/20951 nanosleep(rqtp: 0x7ffda8904500       ) ...
     0.310 (         ): sched:sched_switch:usleep:20951 [120] S ==> swapper/3:0 [120])
                                       __schedule+0xfe200402 ([kernel.kallsyms])
                                       schedule+0xfe200035 ([kernel.kallsyms])
                                       do_nanosleep+0xfe20006f ([kernel.kallsyms])
                                       hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms])
                                       sys_nanosleep+0xfe20007a ([kernel.kallsyms])
                                       do_syscall_64+0xfe200062 ([kernel.kallsyms])
                                       return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
                                       __nanosleep+0xffff00bfad62c010 (/usr/lib64/libc-2.22.so)
18446679523046.461 (         ): cycles/call-graph=lbr,period=100/:)
                                       perf_pmu_enable.part.81+0xfe200007 ([kernel.kallsyms])
                                       __perf_event_task_sched_in+0xfe2001ad ([kernel.kallsyms])
                                       finish_task_switch+0xfe200156 ([kernel.kallsyms])
                                       __schedule+0xfe200397 ([kernel.kallsyms])
                                       schedule+0xfe200035 ([kernel.kallsyms])
                                       do_nanosleep+0xfe20006f ([kernel.kallsyms])
                                       hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms])
                                       sys_nanosleep+0xfe20007a ([kernel.kallsyms])
                                       do_syscall_64+0xfe200062 ([kernel.kallsyms])
                                       return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
                                       [0] ([unknown])
     0.377 ( 0.073 ms): usleep/20951  ... [continued]: nanosleep()) = 0
  [root@jouet ~]# 


perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  size                             112
  { sample_period, sample_freq }   100
  sample_type                      IP|TID|CALLCHAIN|BRANCH_STACK|IDENTIFIER
  read_format                      ID
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  sample_id_all                    1
  exclude_guest                    1
  { wakeup_events, wakeup_watermark } 1
  branch_sample_type               USER|CALL_STACK|NO_FLAGS|NO_CYCLES

sample_type is missing PERF_SAMPLE_TIME; will fix.
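
As a side note on the first broken timestamp above: 18446744073709.551 is what 2^64 - 1 nanoseconds looks like when printed in milliseconds with three decimals, which is consistent with an unset (all-ones) 64-bit time field. That reading is an inference from the number itself, not something stated in the thread. The arithmetic, done with plain string slicing because the value does not fit a signed 64-bit shell integer:

```shell
max_u64=18446744073709551615     # 2^64 - 1, written out as a decimal string
whole_ms=${max_u64%??????}       # drop the last six digits: ns -> whole ms
rest=${max_u64#"$whole_ms"}      # the six digits that were dropped
# Keep three decimals (truncated, not rounded).
printf '%s.%s\n' "$whole_ms" "${rest%???}"
```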

- Arnaldo


* Re: LBR callchains from tracepoints
  2016-04-26 18:20             ` Arnaldo Carvalho de Melo
@ 2016-04-26 19:07               ` Peter Zijlstra
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2016-04-26 19:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Stephane Eranian, David Ahern, Milian Wolff,
	Frédéric Weisbecker, Ingo Molnar, Namhyung Kim,
	Linux Kernel Mailing List

On Tue, Apr 26, 2016 at 03:20:32PM -0300, Arnaldo Carvalho de Melo wrote:
> Yeah, and that already works in perf, its just that it doesn't work from some
> points (PERF_TYPE_SOFTWARE, PERF_TYPE_TRACEPOINT, etc), as described in the
> changeset I mentioned.

Look at it the other way around: it _only_ works for Intel CPU events,
_nothing_ else.

There is only a single PMU that supports LBR callgraph thingies, all the
others do not.
