All of lore.kernel.org
 help / color / mirror / Atom feed
* newbie question: tracing userspace call/return sequences
@ 2017-01-06  3:49 Christian Convey
  2017-01-06 10:16 ` Milian Wolff
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Christian Convey @ 2017-01-06  3:49 UTC (permalink / raw)
  To: linux-perf-users

Hi guys,

Any suggestions for the best way to achieve something with perf tools?

Systemtap has an example script, "para-callgraph.stp", that reports
the *sequence* of userspace function calls and returns made by an
application.

I'd like to perform that kind of analysis using perf tools rather than
Systemtap, because I recently learned that Canonical considers
Systemtap to be unsupported.  (E.g., it's pretty broken on Ubuntu
14.04.)

I think I've gotten fairly close with these steps:

  perf probe --exec=./fib -f --add='*'
  perf probe --exec=./fib -f --add='*:%return'
  perf record -e 'probe_fib:*' -aR ./fib
  perf script -g py
  perf script -s ./perf_script.py

The generated Python script contains two Python functions for each
userspace function that's called during the execution of "fib".  This
seems to work, at least on a small, toy program.

But AFAICT, this has two major downsides relative to the
Systemtap-based approach, in cases where the target program has
thousands of functions:

(1) The generated Python script will be huge, and

(2) It will require some non-trivial Python coding for me to tweak
what that script does for every call/return event.

Any suggestions for a good/better way to pull this off?

Thanks,
Christian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-06  3:49 newbie question: tracing userspace call/return sequences Christian Convey
@ 2017-01-06 10:16 ` Milian Wolff
  2017-01-06 20:05   ` Christian Convey
  2017-01-06 18:49 ` Andi Kleen
  2017-01-09 17:20 ` Frank Ch. Eigler
  2 siblings, 1 reply; 18+ messages in thread
From: Milian Wolff @ 2017-01-06 10:16 UTC (permalink / raw)
  To: Christian Convey; +Cc: linux-perf-users

[-- Attachment #1: Type: text/plain, Size: 2027 bytes --]

On Thursday, January 5, 2017 10:49:13 PM CET Christian Convey wrote:
> Hi guys,
> 
> Any suggestions for the best way to achieve something with perf tools?
> 
> Systemtap has an example script, "para-callgraph.stp", that reports
> the *sequence* of userspace function calls and returns made by an
> application.
> 
> I'd like to perform that kind of analysis using perf tools rather than
> Systemtap, because I recently learned that Canonical considers
> Systemtap to be unsupported.  (E.g., it's pretty broken on Ubuntu
> 14.04.)
> 
> I think I've gotten fairly close with these steps:
> 
>   perf probe --exec=./fib -f --add='*'
>   perf probe --exec=./fib -f --add='*:%return'
>   perf record -e 'probe_fib:*' -aR ./fib
>   perf script -g py
>   perf script -s ./perf_script.py
> 
> The generated Python script contains two Python functions for each
> userspace function that's called during the execution of "fib".  This
> seems to work, at least on a small, toy program.
> 
> But AFAICT, this has two major downsides relative to the
> Systemtap-based approach, in cases where the target program has
> thousands of functions:
> 
> (1) The generated Python script will be huge, and
> 
> (2) It will require some non-trivial Python coding for me to tweak
> what that script does for every call/return event.
> 
> Any suggestions for a good/better way to pull this off?

There is a catch-all handler:

*trace_unhandled*, if defined, is called after for any event that
 doesn't have a handler explicitly defined for it.  The standard set
 of common arguments are passed into it:

----
def trace_unhandled(event_name, context, common_cpu, common_secs,
        common_nsecs, common_pid, common_comm):
    pass
----

Maybe you can use that one instead? I'd still be interested to know whether 
you can get all required information. If not, extending the API is doable.

Bye
-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5903 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-06  3:49 newbie question: tracing userspace call/return sequences Christian Convey
  2017-01-06 10:16 ` Milian Wolff
@ 2017-01-06 18:49 ` Andi Kleen
  2017-01-06 19:07   ` Christian Convey
  2017-01-06 19:23   ` Arnaldo Carvalho de Melo
  2017-01-09 17:20 ` Frank Ch. Eigler
  2 siblings, 2 replies; 18+ messages in thread
From: Andi Kleen @ 2017-01-06 18:49 UTC (permalink / raw)
  To: Christian Convey; +Cc: linux-perf-users

Christian Convey <christian.convey@gmail.com> writes:
>
> Any suggestions for a good/better way to pull this off?

Use the right tool for the job.

On Intel, use a Broadwell or Skylake or Goldmont and use processor trace.

perf record -e intel_pt//u foo
perf script --itrace=cr

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-06 18:49 ` Andi Kleen
@ 2017-01-06 19:07   ` Christian Convey
  2017-01-07  4:37     ` Namhyung Kim
  2017-01-06 19:23   ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 18+ messages in thread
From: Christian Convey @ 2017-01-06 19:07 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-perf-users

> Use the right tool for the job.
>
> On Intel, use a Broadwell or Skylake or Goldmont and use processor trace.
>
> perf record -e intel_pt//u foo
> perf script --itrace=cr

Hi Andi,

Thanks for the idea.  It sounds promising.

Unfortunately for my current situation, the CPU's on which I need to
run this are too old to have intel_pt.

- Christian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-06 18:49 ` Andi Kleen
  2017-01-06 19:07   ` Christian Convey
@ 2017-01-06 19:23   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-01-06 19:23 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Christian Convey, linux-perf-users

Em Fri, Jan 06, 2017 at 10:49:35AM -0800, Andi Kleen escreveu:
> Christian Convey <christian.convey@gmail.com> writes:
> >
> > Any suggestions for a good/better way to pull this off?
> 
> Use the right tool for the job.
> 
> On Intel, use a Broadwell or Skylake or Goldmont and use processor trace.
> 
> perf record -e intel_pt//u foo
> perf script --itrace=cr

Does that already works while VT-x is in use?

[root@jouet ~]# grep "model name" /proc/cpuinfo 
model name	: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
model name	: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
model name	: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
model name	: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
[root@jouet ~]#

So Broadwell:

[root@jouet ~]# perf record -e intel_pt//u ls
Dockerfile  perf.data  perf.data.old
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data ]
[root@jouet ~]# perf script --itrace=cr
[root@jouet ~]# 

As soon as I stop a VM:

[root@jouet ~]# perf script --itrace=cr | wc -l
14644
[root@jouet ~]# perf script --itrace=cr | tail
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd672d97ae0 __do_global_dtors_aux (/usr/lib64/libpthread-2.24.so) => 7fd673c2a18a _dl_fini (/usr/lib64/ld-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd673c2a1a6 _dl_fini (/usr/lib64/ld-2.24.so) => 7fd672da4da4 _fini (/usr/lib64/libpthread-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd672da4dac _fini (/usr/lib64/libpthread-2.24.so) => 7fd673c2a1a8 _dl_fini (/usr/lib64/ld-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd673c2a38e _dl_fini (/usr/lib64/ld-2.24.so) => 7fd673461410 __run_exit_handlers (/usr/lib64/libc-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd6734613b0 __run_exit_handlers (/usr/lib64/libc-2.24.so) => 7fd6734a5a30 _IO_cleanup (/usr/lib64/libc-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd6734a5a40 _IO_cleanup (/usr/lib64/libc-2.24.so) => 7fd6734a56e0 _IO_flush_all_lockp (/usr/lib64/libc-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd6734a594f _IO_flush_all_lockp (/usr/lib64/libc-2.24.so) => 7fd6734a5a45 _IO_cleanup (/usr/lib64/libc-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd6734a5b99 _IO_cleanup (/usr/lib64/libc-2.24.so) => 7fd6734613b3 __run_exit_handlers (/usr/lib64/libc-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd6734613be __run_exit_handlers (/usr/lib64/libc-2.24.so) => 7fd6734f3c30 _exit (/usr/lib64/libc-2.24.so)
  ls 6527 [3] 6255.381805: 1 branches:u: 7fd6734f3c66 _exit (/usr/lib64/libc-2.24.so) => 0 [unknown] ([unknown])
[root@jouet ~]# 

Using -F and callident it get even better :-)

[root@jouet ~]# perf script -F callindent,time,comm,pid,sym,ip,addr,flags,cpu --itrace=cr | head -40
  ls 6527 [3] 6255.380814: tr strt  _start                                0 [unknown] =>     7fd673c19cd0 _start
  ls 6527 [3] 6255.380814: tr end   _start                     7fd673c19cd0 _start =>                0 [unknown]
  ls 6527 [3] 6255.380817: tr strt  _start                                0 [unknown] =>     7fd673c19cd0 _start
  ls 6527 [3] 6255.380817: call         _dl_start              7fd673c19cd3 _start =>     7fd673c1dce0 _dl_start
  ls 6527 [3] 6255.380817: tr end       _dl_start              7fd673c1dcff _dl_start =>                0 [unknown]
  ls 6527 [3] 6255.380818: tr strt      _dl_start                         0 [unknown] =>     7fd673c1dcff _dl_start
  ls 6527 [3] 6255.380818: tr end       _dl_start              7fd673c1dd06 _dl_start =>                0 [unknown]
  ls 6527 [3] 6255.380822: tr strt      _dl_start                         0 [unknown] =>     7fd673c1dd06 _dl_start
  ls 6527 [3] 6255.380822: tr end       _dl_start              7fd673c1e0e0 _dl_start =>                0 [unknown]
  ls 6527 [3] 6255.380824: tr strt      _dl_start                         0 [unknown] =>     7fd673c1e0e0 _dl_start
  ls 6527 [3] 6255.380824: call             _dl_setup_hash     7fd673c1df03 _dl_start =>     7fd673c24c10 _dl_setup_hash
  ls 6527 [3] 6255.380824: return           _dl_setup_hash     7fd673c24c6d _dl_setup_hash =>     7fd673c1df08 _dl_start
  ls 6527 [3] 6255.380824: call             _dl_sysdep_start   7fd673c1df63 _dl_start =>     7fd673c32710 _dl_sysdep_start
  ls 6527 [3] 6255.380824: tr end           _dl_sysdep_start   7fd673c32743 _dl_sysdep_start =>                0 [unknown]
  ls 6527 [3] 6255.380827: tr strt          _dl_sysdep_start              0 [unknown] =>     7fd673c32743 _dl_sysdep_start
  ls 6527 [3] 6255.380827: call                 brk            7fd673c329d0 _dl_sysdep_start =>     7fd673c33890 brk
  ls 6527 [3] 6255.380827: syscall                             7fd673c33897 brk =>                0 [unknown]
  ls 6527 [3] 6255.380828: tr strt              brk                       0 [unknown] =>     7fd673c33899 brk
  ls 6527 [3] 6255.380828: return               brk            7fd673c338b2 brk =>     7fd673c329d5 _dl_sysdep_start
  ls 6527 [3] 6255.380828: call                 get_common_indeces.constprop.1            7fd673c32c2f _dl_sysdep_start =>     7fd673c325e0 get_common_indeces.constprop.1
  ls 6527 [3] 6255.380828: return               get_common_indeces.constprop.1            7fd673c326c3 get_common_indeces.constprop.1 =>     7fd673c32c34 _dl_sysdep_start
  ls 6527 [3] 6255.380828: call                 strlen                                    7fd673c32aed _dl_sysdep_start =>     7fd673c36400 strlen
  ls 6527 [3] 6255.380828: return               strlen                                    7fd673c36439 strlen =>     7fd673c32af2 _dl_sysdep_start
  ls 6527 [3] 6255.380828: call                 sbrk                                      7fd673c32afb _dl_sysdep_start =>     7fd673c338e0 sbrk
  ls 6527 [3] 6255.380828: return               sbrk                                      7fd673c3392e sbrk =>     7fd673c32b00 _dl_sysdep_start
  ls 6527 [3] 6255.380828: call                 dl_main                                   7fd673c32b2c _dl_sysdep_start =>     7fd673c1a9a0 dl_main
  ls 6527 [3] 6255.380828: call                     _dl_next_ld_env_entry                 7fd673c1aa6c dl_main =>     7fd673c33110 _dl_next_ld_env_entry
  ls 6527 [3] 6255.380828: return                   _dl_next_ld_env_entry                 7fd673c33150 _dl_next_ld_env_entry =>     7fd673c1aa71 dl_main
  ls 6527 [3] 6255.380828: call                     _dl_new_object                        7fd673c1ae4f dl_main =>     7fd673c24d50 _dl_new_object
  ls 6527 [3] 6255.380828: call                         strlen                            7fd673c24d77 _dl_new_object =>     7fd673c36400 strlen
  ls 6527 [3] 6255.380828: return                       strlen                            7fd673c36439 strlen =>     7fd673c24d7c _dl_new_object
  ls 6527 [3] 6255.380828: call                             7fd673c19ab0                  7fd673c24dd1 _dl_new_object =>     7fd673c19ab0 [unknown]
  ls 6527 [3] 6255.380828: return                       __libc_memalign                   7fd673c332a9 __libc_memalign =>     7fd673c24dd6 _dl_new_object
  ls 6527 [3] 6255.380828: call                         memmove                           7fd673c24e13 _dl_new_object =>     7fd673c36f20 memmove
  ls 6527 [3] 6255.380828: return                       memmove                           7fd673c36f56 memmove =>     7fd673c24e18 _dl_new_object
  ls 6527 [3] 6255.380828: return                   _dl_new_object                        7fd673c24fc8 _dl_new_object =>     7fd673c1ae54 dl_main
  ls 6527 [3] 6255.380828: call                     _dl_add_to_namespace_list             7fd673c1ae88 dl_main =>     7fd673c24cc0 _dl_add_to_namespace_list
  ls 6527 [3] 6255.380828: call                         rtld_lock_default_lock_recursive  7fd673c24cd1 _dl_add_to_namespace_list =>     7fd673c19d30 rtld_lock_default_lock_recursive
  ls 6527 [3] 6255.380828: return                       rtld_lock_default_lock_recursive  7fd673c19d34 rtld_lock_default_lock_recursive =>     7fd673c24cd7 _dl_add_to_namespace_list
  ls 6527 [3] 6255.380828: return                   rtld_lock_default_unlock_recursive    7fd673c19d44 rtld_lock_default_unlock_recursive =>     7fd673c1ae8d dl_main
[root@jouet ~]# 

- Arnaldo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-06 10:16 ` Milian Wolff
@ 2017-01-06 20:05   ` Christian Convey
  2017-01-07 23:44     ` Masami Hiramatsu
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Convey @ 2017-01-06 20:05 UTC (permalink / raw)
  To: Milian Wolff; +Cc: linux-perf-users

On Fri, Jan 6, 2017 at 5:16 AM, Milian Wolff <milian.wolff@kdab.com> wrote:
...
> There is a catch-all handler:
>
> *trace_unhandled*, if defined, is called after for any event that
>  doesn't have a handler explicitly defined for it.  The standard set
>  of common arguments are passed into it:
>
> ----
> def trace_unhandled(event_name, context, common_cpu, common_secs,
>         common_nsecs, common_pid, common_comm):
>     pass
> ----
>
> Maybe you can use that one instead? I'd still be interested to know whether
> you can get all required information. If not, extending the API is doable.

Hi Milian,

Thanks for the idea - I tried it out.  My Python script was a bit ugly
/ hackish, but it did allow me to trace basic call/return information.

With your approach, it was also easy to trace the return-value of each
function call: I modified my return-probe definition to be "*:%return
$retval".  After that, the "trace_unhandled" function can access the
return value via "event_fields_dict['arg1']".

There are two more improvements I'd like to make, but they're less important:

(1) For each function being traced, I'd like to report the values of
all of its parameters.  (SystemTap makes this easy.)  Unfortunately,
`perf probe`'s probe-argument syntax seems to require that I know the
name of every function-parameter I want to probe.  So I may end up
having to write a script that parses either (a) the target program's
DWARF info, or (b) the output of `perf probe --vars`.

(2) I want the tracing to be robust even if the target program
dynamically links to `.so` files that I didn't anticipate when
creating the probes.  I haven't found any documentation regarding how
gracefully perf, SystemTap, etc. handle that situation.  I'm not sure
how I'd solve this problem.  Perhaps hooking `dlopen` is an option.

- Christian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-06 19:07   ` Christian Convey
@ 2017-01-07  4:37     ` Namhyung Kim
  2017-01-07 18:26       ` Christian Convey
  2017-01-07 19:08       ` Andi Kleen
  0 siblings, 2 replies; 18+ messages in thread
From: Namhyung Kim @ 2017-01-07  4:37 UTC (permalink / raw)
  To: Christian Convey; +Cc: Andi Kleen, linux-perf-users

Hello,

On Sat, Jan 7, 2017 at 4:07 AM, Christian Convey
<christian.convey@gmail.com> wrote:
>> Use the right tool for the job.
>>
>> On Intel, use a Broadwell or Skylake or Goldmont and use processor trace.
>>
>> perf record -e intel_pt//u foo
>> perf script --itrace=cr
>
> Hi Andi,
>
> Thanks for the idea.  It sounds promising.
>
> Unfortunately for my current situation, the CPU's on which I need to
> run this are too old to have intel_pt.

You can use uftrace for function tracing.  It doesn't have a hardware dependency
but you need to rebuild your program with instrumentation enabled (like -pg
or -finstrument-functions in gcc).

https://github.com/namhyung/uftrace

-- 
Thanks,
Namhyung Kim

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-07  4:37     ` Namhyung Kim
@ 2017-01-07 18:26       ` Christian Convey
  2017-01-07 19:08       ` Andi Kleen
  1 sibling, 0 replies; 18+ messages in thread
From: Christian Convey @ 2017-01-07 18:26 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: linux-perf-users

On Fri, Jan 6, 2017 at 11:37 PM, Namhyung Kim <namhyung@gmail.com> wrote:
...
> You can use uftrace for function tracing.  It doesn't have a hardware dependency
> but you need to rebuild your program with instrumentation enabled (like -pg
> or -finstrument-functions in gcc).
>
> https://github.com/namhyung/uftrace

Thanks Namhyung.

For one of my current projects, I'm limited to tools which don't
require me to recompile / relink the target program.  So I'm limited
to options such as Perf, Systemtap, Gdb, or Intel Pin.

- Christian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-07  4:37     ` Namhyung Kim
  2017-01-07 18:26       ` Christian Convey
@ 2017-01-07 19:08       ` Andi Kleen
  2017-01-08  8:08         ` Namhyung Kim
  1 sibling, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2017-01-07 19:08 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Christian Convey, Andi Kleen, linux-perf-users

> You can use uftrace for function tracing.  It doesn't have a hardware dependency
> but you need to rebuild your program with instrumentation enabled (like -pg
> or -finstrument-functions in gcc).
> 
> https://github.com/namhyung/uftrace

I have a similar tool here. 

http://github.com/andikleen/ftracer

But it's really far inferior to hardware tracing. Use the right tool for the
job.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-06 20:05   ` Christian Convey
@ 2017-01-07 23:44     ` Masami Hiramatsu
  2017-01-09 15:19       ` Christian Convey
  0 siblings, 1 reply; 18+ messages in thread
From: Masami Hiramatsu @ 2017-01-07 23:44 UTC (permalink / raw)
  To: Christian Convey; +Cc: Milian Wolff, linux-perf-users

Hi Christian,

On Fri, 6 Jan 2017 15:05:05 -0500
Christian Convey <christian.convey@gmail.com> wrote:

> On Fri, Jan 6, 2017 at 5:16 AM, Milian Wolff <milian.wolff@kdab.com> wrote:
> ...
> > There is a catch-all handler:
> >
> > *trace_unhandled*, if defined, is called after for any event that
> >  doesn't have a handler explicitly defined for it.  The standard set
> >  of common arguments are passed into it:
> >
> > ----
> > def trace_unhandled(event_name, context, common_cpu, common_secs,
> >         common_nsecs, common_pid, common_comm):
> >     pass
> > ----
> >
> > Maybe you can use that one instead? I'd still be interested to know whether
> > you can get all required information. If not, extending the API is doable.
> 
> Hi Milian,
> 
> Thanks for the idea - I tried it out.  My Python script was a bit ugly
> / hackish, but it did allow me to trace basic call/return information.
> 
> With your approach, it was also easy to trace the return-value of each
> function call: I modified my return-probe definition to be "*:%return
> $retval".  After that, the "trace_unhandled" function can access the
> return value via "event_fields_dict['arg1']".
> 
> There are two more improvements I'd like to make, but they're less important:
> 
> (1) For each function being traced, I'd like to report the values of
> all of its parameters.  (SystemTap makes this easy.)  Unfortunately,
> `perf probe`'s probe-argument syntax seems to require that I know the
> name of every function-parameter I want to probe.  So I may end up
> having to write a script that parses either (a) the target program's
> DWARF info, or (b) the output of `perf probe --vars`.

No, you can use $params for tracing all function parameters :)
So, please try "* $params" instead of "*".

> (2) I want the tracing to be robust even if the target program
> dynamically links to `.so` files that I didn't anticipate when
> creating the probes.  I haven't found any documentation regarding how
> gracefully perf, SystemTap, etc. handle that situation.  I'm not sure
> how I'd solve this problem.  Perhaps hooking `dlopen` is an option.

Hm, would you know which libraries(so) can be loaded?
In that case, you can put probes on those files (before loading).
The probe events are automatically enabled after loading the files.

Thank you,


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-07 19:08       ` Andi Kleen
@ 2017-01-08  8:08         ` Namhyung Kim
  0 siblings, 0 replies; 18+ messages in thread
From: Namhyung Kim @ 2017-01-08  8:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Christian Convey, linux-perf-users

Hi Andi,

On Sun, Jan 8, 2017 at 4:08 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> You can use uftrace for function tracing.  It doesn't have a hardware dependency
>> but you need to rebuild your program with instrumentation enabled (like -pg
>> or -finstrument-functions in gcc).
>>
>> https://github.com/namhyung/uftrace
>
> I have a similar tool here.
>
> http://github.com/andikleen/ftracer

Cool!

>
> But it's really far inferior to hardware tracing. Use the right tool for the
> job.

I admit that hardware tracing is much better in terms of performance
and precision.  But I don't have such a recent CPU.

Also uftrace has various types of filters to reduce data and
some useful features like capturing argument/return value,
showing kernel functions (system call, page fault, ...) with root
privilege (using ftrace) and support analysis using external programs
like chrome or flamegraph.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-07 23:44     ` Masami Hiramatsu
@ 2017-01-09 15:19       ` Christian Convey
  2017-01-09 17:03         ` Naveen N. Rao
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Convey @ 2017-01-09 15:19 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Milian Wolff, linux-perf-users

Hi Masami,

On Sat, Jan 7, 2017 at 6:44 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> No, you can use $params for tracing all function parameters :)
> So, please try "* $params" instead of "*".

Thanks for the tip!  I tried it out, and got some strange results: The
correct set of function parameter names is present in
"trace_unhandled" 's "event_fields_dict" dictionary.  However, the
actual runtime values of those parameters is sometimes inaccurate.

I'm using a simple Fibonacci program with just two functions: "main"
and "fibfunc".

Here are my steps:

gcc -g -fno-omit-frame-pointer -O0 fib.c -o fib

export PERF_EXEC_PATH=/home/cconvey/src/linux-4.8.0/tools/perf

sudo -E ${PERF_EXEC_PATH}/perf probe --exec=./fib --del='*'

sudo -E ${PERF_EXEC_PATH}/perf probe --exec=./fib --add='* $params'

sudo -E ${PERF_EXEC_PATH}/perf record -e 'probe_fib:*' -aR ./fib 3

# My script which currently just prints the parameters passed to
'trace_unhandled'.
sudo -E ${PERF_EXEC_PATH}/perf script -s ./perf-script-2.py


Here's an example of the output from my script:

> in trace_begin
> event_name = "probe_fib__main"
> event_fields_dict: {'common_callchain': [], 'common_pid': 18631, 'common_s': 98414, 'common_comm': 'fib', 'common_ns': 209748661, 'argv': 94718640125824, 'common_cpu': 6, 'argc': 22053, '__probe_ip': 94718640125737}
>
> event_name = "probe_fib__fibfunc"
> event_fields_dict: {'common_callchain': [], 'common_pid': 18631, 'common_s': 98414, 'common_comm': 'fib', 'common_ns': 209756797, 'n': 0, 'common_cpu': 6, '__probe_ip': 94718640125664}
>
> event_name = "probe_fib__fibfunc"
> event_fields_dict: {'common_callchain': [], 'common_pid': 18631, 'common_s': 98414, 'common_comm': 'fib', 'common_ns': 209757849, 'n': 32764, 'common_cpu': 6, '__probe_ip': 94718640125664}
>
> event_name = "probe_fib__fibfunc"
> event_fields_dict: {'common_callchain': [], 'common_pid': 18631, 'common_s': 98414, 'common_comm': 'fib', 'common_ns': 209758915, 'n': 2, 'common_cpu': 6, '__probe_ip': 94718640125664}
>
> in trace_end

Notice, for example, that the reported value for "argc" is 22053, rather than 2.

Any idea why that might be happening?


>
> > (2) I want the tracing to be robust even if the target program
> > dynamically links to `.so` files that I didn't anticipate when
> > creating the probes.  I haven't found any documentation regarding how
> > gracefully perf, SystemTap, etc. handle that situation.  I'm not sure
> > how I'd solve this problem.  Perhaps hooking `dlopen` is an option.
>
> Hm, would you know which libraries(so) can be loaded?

Yes, this is the problem I'm trying to solve.  One solution which
might solve my *current* needs is to simply perform two runs of the
target program.

In the first run, I use "strace" to discover which shared objects the
program opens at runtime.

And then, before running the target program a second time, I add
function-call probe events for those shared objects' functions.

Cheers,
Christian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-09 15:19       ` Christian Convey
@ 2017-01-09 17:03         ` Naveen N. Rao
  2017-01-09 17:59           ` Christian Convey
  0 siblings, 1 reply; 18+ messages in thread
From: Naveen N. Rao @ 2017-01-09 17:03 UTC (permalink / raw)
  To: Christian Convey; +Cc: Masami Hiramatsu, Milian Wolff, linux-perf-users

On 2017/01/09 10:19AM, Christian Convey wrote:
> Hi Masami,
> 
> On Sat, Jan 7, 2017 at 6:44 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > No, you can use $params for tracing all function parameters :)
> > So, please try "* $params" instead of "*".
> 
> Thanks for the tip!  I tried it out, and got some strange results: The
> correct set of function parameter names is present in
> "trace_unhandled" 's "event_fields_dict" dictionary.  However, the
> actual runtime values of those parameters is sometimes inaccurate.
> 
> I'm using a simple Fibonacci program with just two functions: "main"
> and "fibfunc".
> 
> Here are my steps:
> 
> gcc -g -fno-omit-frame-pointer -O0 fib.c -o fib
> 
> export PERF_EXEC_PATH=/home/cconvey/src/linux-4.8.0/tools/perf
> 
> sudo -E ${PERF_EXEC_PATH}/perf probe --exec=./fib --del='*'
> 
> sudo -E ${PERF_EXEC_PATH}/perf probe --exec=./fib --add='* $params'
> 
> sudo -E ${PERF_EXEC_PATH}/perf record -e 'probe_fib:*' -aR ./fib 3
> 
> # My script which currently just prints the parameters passed to
> 'trace_unhandled'.
> sudo -E ${PERF_EXEC_PATH}/perf script -s ./perf-script-2.py
> 
> 
> Here's an example of the output from my script:
> 
> > in trace_begin
> > event_name = "probe_fib__main"
> > event_fields_dict: {'common_callchain': [], 'common_pid': 18631, 'common_s': 98414, 'common_comm': 'fib', 'common_ns': 209748661, 'argv': 94718640125824, 'common_cpu': 6, 'argc': 22053, '__probe_ip': 94718640125737}
> >
> > event_name = "probe_fib__fibfunc"
> > event_fields_dict: {'common_callchain': [], 'common_pid': 18631, 'common_s': 98414, 'common_comm': 'fib', 'common_ns': 209756797, 'n': 0, 'common_cpu': 6, '__probe_ip': 94718640125664}
> >
> > event_name = "probe_fib__fibfunc"
> > event_fields_dict: {'common_callchain': [], 'common_pid': 18631, 'common_s': 98414, 'common_comm': 'fib', 'common_ns': 209757849, 'n': 32764, 'common_cpu': 6, '__probe_ip': 94718640125664}
> >
> > event_name = "probe_fib__fibfunc"
> > event_fields_dict: {'common_callchain': [], 'common_pid': 18631, 'common_s': 98414, 'common_comm': 'fib', 'common_ns': 209758915, 'n': 2, 'common_cpu': 6, '__probe_ip': 94718640125664}
> >
> > in trace_end
> 
> Notice, for example, that the reported value for "argc" is 22053, rather than 2.
> 
> Any idea why that might be happening?

Try compiling your program with -O2, or use a newer perf:
https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/commit/?id=e47392bf9c0613a058cd20ee89d8ce9d957d4b24

- Naveen

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-06  3:49 newbie question: tracing userspace call/return sequences Christian Convey
  2017-01-06 10:16 ` Milian Wolff
  2017-01-06 18:49 ` Andi Kleen
@ 2017-01-09 17:20 ` Frank Ch. Eigler
  2017-01-09 18:08   ` Christian Convey
  2 siblings, 1 reply; 18+ messages in thread
From: Frank Ch. Eigler @ 2017-01-09 17:20 UTC (permalink / raw)
  To: Christian Convey; +Cc: linux-perf-users


christian.convey wrote:

> [...]
> Systemtap has an example script, "para-callgraph.stp", that reports
> the *sequence* of userspace function calls and returns made by an
> application.
>
> I'd like to perform that kind of analysis using perf tools rather than
> Systemtap, because I recently learned that Canonical considers
> Systemtap to be unsupported.  (E.g., it's pretty broken on Ubuntu
> 14.04.)
> [...]

For what it's worth, upstream systemtap versions work generally fine on
Ubuntus old & new.  Consider building your own, if Canonical's community
is not keeping it fresh on their own distro.

That said, consider also dyninst (C++ binary instrumentation API), or
perhaps even as a kernel-independent pure-userspace systemtap backend:
  % stap --runtime=dyninst

- FChE

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-09 17:03         ` Naveen N. Rao
@ 2017-01-09 17:59           ` Christian Convey
  2017-01-10  0:25             ` Masami Hiramatsu
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Convey @ 2017-01-09 17:59 UTC (permalink / raw)
  To: Naveen N. Rao; +Cc: Masami Hiramatsu, Milian Wolff, linux-perf-users

On Mon, Jan 9, 2017 at 12:03 PM, Naveen N. Rao
<naveen.n.rao@linux.vnet.ibm.com> wrote:
...
> On 2017/01/09 10:19AM, Christian Convey wrote:
> Try compiling your program with -O2, or use a newer perf:
> https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/commit/?id=e47392bf9c0613a058cd20ee89d8ce9d957d4b24

Thanks for the suggestion.  I switched from -O0 to -O2, and
unfortunately I could no longer add the probe events at all:

cconvey@spaceman ~/dw/scratch/cconvey/src/fib $ sudo -E
${PERF_EXEC_PATH}/perf probe --exec=./fib -f --add='* $params'
Failed to write event: Invalid argument
Please upgrade your kernel to at least 3.14 to have access to feature @__fmt
 Error: Failed to add events.

cconvey@spaceman ~/dw/scratch/cconvey/src/fib $ sudo -E
${PERF_EXEC_PATH}/perf --version
perf version 4.8.11

cconvey@spaceman ~/dw/scratch/cconvey/src/fib $ uname -a
Linux spaceman 4.8.0-32-generic #34-Ubuntu SMP Tue Dec 13 14:30:43 UTC
2016 x86_64 x86_64 x86_64 GNU/Linux


- Christian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-09 17:20 ` Frank Ch. Eigler
@ 2017-01-09 18:08   ` Christian Convey
  0 siblings, 0 replies; 18+ messages in thread
From: Christian Convey @ 2017-01-09 18:08 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: linux-perf-users

On Mon, Jan 9, 2017 at 12:20 PM, Frank Ch. Eigler <fche@redhat.com> wrote:
...
> For what it's worth, upstream systemtap versions work generally fine on
> Ubuntus old & new.  Consider building your own, if Canonical's community
> is not keeping it fresh on their own distro.

Thanks for the tip.

When I was using Ubuntu 14.04, Ubuntu's systemtap package was at
version 2.3, and it didn't even pass unit tests.

On 14.04 I tried to build my own systemtap 3.0.  IIRC, I ran into
enough problems with library versions, that my messiness threshold was
exceeded.  I think where I stopped as a problem involving the version
of libc-json, where Ubuntu 14.04's packaged version was too old.

>
> That said, consider also dyninst (C++ binary instrumentation API), or
> perhaps even as a kernel-independent pure-userspace systemtap backend:
>   % stap --runtime=dyninst

Cool, thanks for the tip.  I wasn't aware of stap's `--runtime` option.

Cheers,
Christian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-09 17:59           ` Christian Convey
@ 2017-01-10  0:25             ` Masami Hiramatsu
  2017-01-10  2:32               ` Christian Convey
  0 siblings, 1 reply; 18+ messages in thread
From: Masami Hiramatsu @ 2017-01-10  0:25 UTC (permalink / raw)
  To: Christian Convey
  Cc: Naveen N. Rao, Masami Hiramatsu, Milian Wolff, linux-perf-users

On Mon, 9 Jan 2017 12:59:38 -0500
Christian Convey <christian.convey@gmail.com> wrote:

> On Mon, Jan 9, 2017 at 12:03 PM, Naveen N. Rao
> <naveen.n.rao@linux.vnet.ibm.com> wrote:
> ...
> > On 2017/01/09 10:19AM, Christian Convey wrote:
> > Try compiling your program with -O2, or use a newer perf:
> > https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/commit/?id=e47392bf9c0613a058cd20ee89d8ce9d957d4b24
> 
> Thanks for the suggestion.  I switched from -O0 to -O2, and
> unfortunately I could no longer add the probe events at all:
> 
> cconvey@spaceman ~/dw/scratch/cconvey/src/fib $ sudo -E
> ${PERF_EXEC_PATH}/perf probe --exec=./fib -f --add='* $params'
> Failed to write event: Invalid argument
> Please upgrade your kernel to at least 3.14 to have access to feature @__fmt
>  Error: Failed to add events.
> 
> cconvey@spaceman ~/dw/scratch/cconvey/src/fib $ sudo -E
> ${PERF_EXEC_PATH}/perf --version
> perf version 4.8.11
> 
> cconvey@spaceman ~/dw/scratch/cconvey/src/fib $ uname -a
> Linux spaceman 4.8.0-32-generic #34-Ubuntu SMP Tue Dec 13 14:30:43 UTC
> 2016 x86_64 x86_64 x86_64 GNU/Linux
 
It seems that the above fix is included v4.9 series, not v4.8.

Could you try to download the latest kernel and build the perf?
You don't need to build kernel, only tools/perf/perf is what you need.
So, what you need is
$ cd <somewhere-your-download-kernel>/tools/perf/
$ make

Not that you have to install libdw-dev for perf-probe before that.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: newbie question: tracing userspace call/return sequences
  2017-01-10  0:25             ` Masami Hiramatsu
@ 2017-01-10  2:32               ` Christian Convey
  0 siblings, 0 replies; 18+ messages in thread
From: Christian Convey @ 2017-01-10  2:32 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Naveen N. Rao, Milian Wolff, linux-perf-users

Hi Masami,

On Mon, Jan 9, 2017 at 7:25 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> It seems that the above fix is included v4.9 series, not v4.8.
>
> Could you try to download the latest kernel and build the perf?
> You don't need to build kernel, only tools/perf/perf is what you need.
> So, what you need is
> $ cd <somewhere-your-download-kernel>/tools/perf/
> $ make

I tried 4.9, and it worked!  Thanks very much!

- Christian

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-01-10  2:32 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-06  3:49 newbie question: tracing userspace call/return sequences Christian Convey
2017-01-06 10:16 ` Milian Wolff
2017-01-06 20:05   ` Christian Convey
2017-01-07 23:44     ` Masami Hiramatsu
2017-01-09 15:19       ` Christian Convey
2017-01-09 17:03         ` Naveen N. Rao
2017-01-09 17:59           ` Christian Convey
2017-01-10  0:25             ` Masami Hiramatsu
2017-01-10  2:32               ` Christian Convey
2017-01-06 18:49 ` Andi Kleen
2017-01-06 19:07   ` Christian Convey
2017-01-07  4:37     ` Namhyung Kim
2017-01-07 18:26       ` Christian Convey
2017-01-07 19:08       ` Andi Kleen
2017-01-08  8:08         ` Namhyung Kim
2017-01-06 19:23   ` Arnaldo Carvalho de Melo
2017-01-09 17:20 ` Frank Ch. Eigler
2017-01-09 18:08   ` Christian Convey

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.