linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
@ 2018-01-12 10:03 Thomas Richter
  2018-01-12 14:20 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Richter @ 2018-01-12 10:03 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, acme
  Cc: brueckner, schwidefsky, heiko.carstens, Thomas Richter

On Intel platform when I execute test case
   probe libc's inet_pton & backtrace it with ping
I get this output which is correct:

[root@f27 perf]# ./perf trace --no-syscalls
		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.064 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.064/0.064/0.064/0.000 ms
     0.000 probe_libc:inet_pton:(7f26a88498a0))
	   __GI___inet_pton (/usr/lib64/libc-2.26.so)
	   getaddrinfo (/usr/lib64/libc-2.26.so)
	   main (/usr/bin/ping)
[root@f27 perf]#

However when I add the global --call-graph dwarf to the command
line I get the exactly same result which is wrong:

[root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1

My understanding is that the '--call-graph dwarf' option should be global
and applies to all events. However it is ignored.
The issue is buried in function apply_config_terms() called from
trace__run()
 --> perf_evlist__config()
     --> perf_evsel__config()
           --> apply_config_terms()

This function iterates over all additional terms and handles the
/max-stack=<xx>/ case. However if there is no additional call-graph=dwarf
setting as in -e probe_libc:inet_pton/max-stack=3,call-graph=dwarf/
the function defaults to "fp" regardless of the value of --call-graph
global command line flag.

So when I add ,call-graph=dwarf to the event I get a good result:

[root@f27 perf]# ./perf trace --no-syscalls
	-e probe_libc:inet_pton/max-stack=3,call-graph=dwarf/
	ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.084 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.084/0.084/0.084/0.000 ms
     0.000 probe_libc:inet_pton:(7f9372a988a0))
	   __inet_pton (inlined)
	   gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
	   __GI_getaddrinfo (inlined)
	   main (/usr/bin/ping)
	   __libc_start_main (/usr/lib64/libc-2.26.so)
	   _start (/usr/bin/ping)
[root@f27 perf]#

In fact the looks pretty similar to the output on s390x which only has
dwarf support.
Ok the max-stack is not observed, but this time the global --max-stack <x>
comes to rescue and limits the output:

[root@f27 perf]# ./perf trace --no-syscalls --max-stack 4
	-e probe_libc:inet_pton/max-stack=3,call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.061 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
     0.000 probe_libc:inet_pton:(7fc9f6c188a0))
	   __inet_pton (inlined)
	   gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
	   __GI_getaddrinfo (inlined)
	   main (/usr/bin/ping)
[root@f27 perf]#

This patch applies the global --call-graph <xxx> setting to the individual
events.

With this patch applied:
[root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
	-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.066 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.066/0.066/0.066/0.000 ms
     0.000 probe_libc:inet_pton:(7fe74f8358a0))
	   __inet_pton (inlined)
	   gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
	   __GI_getaddrinfo (inlined)
	   main (/usr/bin/ping)
	   __libc_start_main (/usr/lib64/libc-2.26.so)
	   _start (/usr/bin/ping)
[root@f27 perf]#

When using dwarf the /max-stack=<x>/ is not observed.
The reason is that for call-graph=fp the kernel stack unwinder
is used and it stops after unwinding x entries.
For dwarf the complete stack data is returned and the unwinding
is done by perf itself and it uses --max-stack <x>
command line flag.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
---
 tools/perf/util/evsel.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d5fbcf8c7aa7..d1c25aafb821 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -788,8 +788,24 @@ static void apply_config_terms(struct perf_evsel *evsel,
 	if ((callgraph_buf != NULL) || (dump_size > 0) || max_stack) {
 		if (max_stack) {
 			param.max_stack = max_stack;
-			if (callgraph_buf == NULL)
-				callgraph_buf = "fp";
+			if (callgraph_buf == NULL) {
+				switch (callchain_param.record_mode) {
+				case CALLCHAIN_DWARF:
+					callgraph_buf = "dwarf";
+					break;
+				case CALLCHAIN_NONE:
+				case CALLCHAIN_FP:
+					callgraph_buf = "fp";
+					break;
+				case CALLCHAIN_LBR:
+					callgraph_buf = "lbr";
+					break;
+				default:
+				case CALLCHAIN_MAX:
+					callgraph_buf = "no";
+					break;
+				}
+			}
 		}
 
 		/* parse callgraph parameters */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-12 10:03 [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf Thomas Richter
@ 2018-01-12 14:20 ` Arnaldo Carvalho de Melo
  2018-01-12 15:01   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-12 14:20 UTC (permalink / raw)
  To: Thomas Richter
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

Em Fri, Jan 12, 2018 at 11:03:55AM +0100, Thomas Richter escreveu:
> On Intel platform when I execute test case
>    probe libc's inet_pton & backtrace it with ping
> I get this output which is correct:
> 
> [root@f27 perf]# ./perf trace --no-syscalls
> 		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.064 ms
> 
>  --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.064/0.064/0.064/0.000 ms
>      0.000 probe_libc:inet_pton:(7f26a88498a0))
> 	   __GI___inet_pton (/usr/lib64/libc-2.26.so)
> 	   getaddrinfo (/usr/lib64/libc-2.26.so)
> 	   main (/usr/bin/ping)
> [root@f27 perf]#
> 
> However when I add the global --call-graph dwarf to the command
> line I get the exactly same result which is wrong:
> 
> [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
> 		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> 
> My understanding is that the '--call-graph dwarf' option should be global
> and applies to all events. However it is ignored.

It is global, but then, per-event settings can override it.

So if you say that the following call (without that /max-stack=3/, which
implies DWARF, which is the safest bet for userspace callchains):
[root@jouet ~]# perf trace --no-syscalls --call-graph dwarf -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.086 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.086/0.086/0.086/0.000 ms
     0.000 probe_libc:inet_pton:(7fafee7ce350))
[root@jouet ~]#

And it doesn't work, right, that is a bug. I'll check if your patch
fixes that and then will correct the cset log, i.e. this part:

[root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1

Should read:

[root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
		-e probe_libc:inet_pton ping -6 -c 1 ::1

And _should_ produce a call-graph, without a stack limit, ok?

I'll reply to this message after testing your patch.

Thanks,

- Arnaldo

> The issue is buried in function apply_config_terms() called from
> trace__run()
>  --> perf_evlist__config()
>      --> perf_evsel__config()
>            --> apply_config_terms()
> 
> This function iterates over all additional terms and handles the
> /max-stack=<xx>/ case. However if there is no additional call-graph=dwarf
> setting as in -e probe_libc:inet_pton/max-stack=3,call-graph=dwarf/
> the function defaults to "fp" regardless of the value of --call-graph
> global command line flag.
> 
> So when I add ,call-graph=dwarf to the event I get a good result:
> 
> [root@f27 perf]# ./perf trace --no-syscalls
> 	-e probe_libc:inet_pton/max-stack=3,call-graph=dwarf/
> 	ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.084 ms
> 
>  --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.084/0.084/0.084/0.000 ms
>      0.000 probe_libc:inet_pton:(7f9372a988a0))
> 	   __inet_pton (inlined)
> 	   gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
> 	   __GI_getaddrinfo (inlined)
> 	   main (/usr/bin/ping)
> 	   __libc_start_main (/usr/lib64/libc-2.26.so)
> 	   _start (/usr/bin/ping)
> [root@f27 perf]#
> 
> In fact the looks pretty similar to the output on s390x which only has
> dwarf support.
> Ok the max-stack is not observed, but this time the global --max-stack <x>
> comes to rescue and limits the output:
> 
> [root@f27 perf]# ./perf trace --no-syscalls --max-stack 4
> 	-e probe_libc:inet_pton/max-stack=3,call-graph=dwarf/ ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.061 ms
> 
>  --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
>      0.000 probe_libc:inet_pton:(7fc9f6c188a0))
> 	   __inet_pton (inlined)
> 	   gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
> 	   __GI_getaddrinfo (inlined)
> 	   main (/usr/bin/ping)
> [root@f27 perf]#
> 
> This patch applies the global --call-graph <xxx> setting to the individual
> events.
> 
> With this patch applied:
> [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
> 	-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.066 ms
> 
>  --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.066/0.066/0.066/0.000 ms
>      0.000 probe_libc:inet_pton:(7fe74f8358a0))
> 	   __inet_pton (inlined)
> 	   gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
> 	   __GI_getaddrinfo (inlined)
> 	   main (/usr/bin/ping)
> 	   __libc_start_main (/usr/lib64/libc-2.26.so)
> 	   _start (/usr/bin/ping)
> [root@f27 perf]#
> 
> When using dwarf the /max-stack=<x>/ is not observed.
> The reason is that for call-graph=fp the kernel stack unwinder
> is used and it stops after unwinding x entries.
> For dwarf the complete stack data is returned and the unwinding
> is done by perf itself and it uses --max-stack <x>
> command line flag.
> 
> Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
> Reviewed-by: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
> ---
>  tools/perf/util/evsel.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index d5fbcf8c7aa7..d1c25aafb821 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -788,8 +788,24 @@ static void apply_config_terms(struct perf_evsel *evsel,
>  	if ((callgraph_buf != NULL) || (dump_size > 0) || max_stack) {
>  		if (max_stack) {
>  			param.max_stack = max_stack;
> -			if (callgraph_buf == NULL)
> -				callgraph_buf = "fp";
> +			if (callgraph_buf == NULL) {
> +				switch (callchain_param.record_mode) {
> +				case CALLCHAIN_DWARF:
> +					callgraph_buf = "dwarf";
> +					break;
> +				case CALLCHAIN_NONE:
> +				case CALLCHAIN_FP:
> +					callgraph_buf = "fp";
> +					break;
> +				case CALLCHAIN_LBR:
> +					callgraph_buf = "lbr";
> +					break;
> +				default:
> +				case CALLCHAIN_MAX:
> +					callgraph_buf = "no";
> +					break;
> +				}
> +			}
>  		}
>  
>  		/* parse callgraph parameters */
> -- 
> 2.14.3

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-12 14:20 ` Arnaldo Carvalho de Melo
@ 2018-01-12 15:01   ` Arnaldo Carvalho de Melo
  2018-01-12 16:47     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-12 15:01 UTC (permalink / raw)
  To: Thomas Richter
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

Em Fri, Jan 12, 2018 at 11:20:32AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Jan 12, 2018 at 11:03:55AM +0100, Thomas Richter escreveu:
> > On Intel platform when I execute test case
> >    probe libc's inet_pton & backtrace it with ping
> > I get this output which is correct:

> > [root@f27 perf]# ./perf trace --no-syscalls
> > 		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> > PING ::1(::1) 56 data bytes
> > 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.064 ms

> >  --- ::1 ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 0.064/0.064/0.064/0.000 ms
> >      0.000 probe_libc:inet_pton:(7f26a88498a0))
> > 	   __GI___inet_pton (/usr/lib64/libc-2.26.so)
> > 	   getaddrinfo (/usr/lib64/libc-2.26.so)
> > 	   main (/usr/bin/ping)
> > [root@f27 perf]#
> > 
> > However when I add the global --call-graph dwarf to the command
> > line I get the exactly same result which is wrong:
> > 
> > [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
> > 		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> > 
> > My understanding is that the '--call-graph dwarf' option should be global
> > and applies to all events. However it is ignored.
> 
> It is global, but then, per-event settings can override it.
> 
> So if you say that the following call (without that /max-stack=3/, which
> implies DWARF, which is the safest bet for userspace callchains):
> [root@jouet ~]# perf trace --no-syscalls --call-graph dwarf -e probe_libc:inet_pton ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.086 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.086/0.086/0.086/0.000 ms
>      0.000 probe_libc:inet_pton:(7fafee7ce350))
> [root@jouet ~]#
> 
> And it doesn't work, right, that is a bug. I'll check if your patch
> fixes that and then will correct the cset log, i.e. this part:
> 
> [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
> 		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> 
> Should read:
> 
> [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
> 		-e probe_libc:inet_pton ping -6 -c 1 ::1
> 
> And _should_ produce a call-graph, without a stack limit, ok?
> 
> I'll reply to this message after testing your patch.

You patch works as advertised, but we only get the callgraph if we
specify max-stack in the event definition, which is odd, i.e. it serves
to ask for callgraphs, but then its value is ignored, to make it clear,
with your patch applied I get:

[root@jouet ~]# perf trace --no-syscalls --call-graph dwarf -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.079 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms
     0.000 probe_libc:inet_pton:(7f4585f7d350))
[root@jouet ~]# 

Which is a bug, the global setting should have been applied to the only
event set.

Also:

[root@jouet ~]# perf trace --no-syscalls --call-graph dwarf --max-stack=3 -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.065 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.065/0.065/0.065/0.000 ms
     0.000 probe_libc:inet_pton:(7ffac40fa350))
[root@jouet ~]# 

Also doesn't work, which also is a bug, DWARF callgraphs should have
been set _and_ limited to 3 levels.

[root@jouet ~]# perf trace --no-syscalls -e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.063 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.063/0.063/0.063/0.000 ms
     0.000 probe_libc:inet_pton:(7f79a0312350))
                                       __GI___inet_pton (/usr/lib64/libc-2.26.so)
                                       getaddrinfo (/usr/lib64/libc-2.26.so)
                                       [0xffffa9fef2875f40] (/usr/bin/ping)
[root@jouet ~]#

Works as advertised, i.e. it sets DWARF callgraphs for that event only
and limits it to 3.

Then if we try with your patch:

[root@jouet ~]# perf trace --no-syscalls --call-graph dwarf -e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.082 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.082/0.082/0.082/0.000 ms
     0.000 probe_libc:inet_pton:(7f4c3d953350))
                                       __inet_pton (inlined)
                                       gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                       __GI_getaddrinfo (inlined)
                                       [0xffffaa1b49e1ef3f] (/usr/bin/ping)
                                       __libc_start_main (/usr/lib64/libc-2.26.so)
                                       [0xffffaa1b49e1f379] (/usr/bin/ping)
[root@jouet ~]#

It works but ignores the max-stack setting, which is wrong, as per-event
setting should override global ones.

So what I think is happening is that --call-graph is only affecting
syscall events, i.e.:

[root@jouet ~]# perf trace -e nanosleep sleep 1
     1.047 (1000.182 ms): sleep/11767 nanosleep(rqtp: 0x7ffc7fbdf060) = 0
[root@jouet ~]# perf trace --call-graph dwarf -e nanosleep sleep 1
no symbols found in /usr/bin/sleep, maybe install a debug package?
     0.380 (1000.161 ms): sleep/11772 nanosleep(rqtp: 0x7ffffe08a460) = 0
                                       __GI___nanosleep (inlined)
                                       [0xffffa9e3d99a0856] (/usr/bin/sleep)
                                       [0xffffa9e3d99a062f] (/usr/bin/sleep)
                                       [0xffffa9e3d999d903] (/usr/bin/sleep)
                                       __libc_start_main (/usr/lib64/libc-2.26.so)
                                       [0xffffa9e3d999d9f9] (/usr/bin/sleep)
[root@jouet ~]# perf trace --call-graph dwarf --max-stack 3 -e nanosleep sleep 1
no symbols found in /usr/bin/sleep, maybe install a debug package?
     0.954 (1000.160 ms): sleep/11786 nanosleep(rqtp: 0x7ffdd19d42c0) = 0
                                       __GI___nanosleep (inlined)
                                       [0xffffa9ed007bd856] (/usr/bin/sleep)
                                       [0xffffa9ed007bd62f] (/usr/bin/sleep)
[root@jouet ~]# perf trace --max-stack 3 -e nanosleep sleep 1
no symbols found in /usr/bin/sleep, maybe install a debug package?
     0.745 (1000.170 ms): sleep/11791 nanosleep(rqtp: 0x7ffd0f665590) = 0
                                       __GI___nanosleep (inlined)
                                       [0xffffaa6de4631856] (/usr/bin/sleep)
                                       [0xffffaa6de463162f] (/usr/bin/sleep)
[root@jouet ~]#

So what we should fix is the setting of --call-graph and --max-stack in
the non syscall events, that became supported by 'perf trace' later, so
I'm not applying your patch and will try later to fix this, unless you
beat me to it.

Thanks!

- Arnaldo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-12 15:01   ` Arnaldo Carvalho de Melo
@ 2018-01-12 16:47     ` Arnaldo Carvalho de Melo
  2018-01-12 20:02       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-12 16:47 UTC (permalink / raw)
  To: Thomas Richter
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

Em Fri, Jan 12, 2018 at 12:01:42PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Jan 12, 2018 at 11:20:32AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Fri, Jan 12, 2018 at 11:03:55AM +0100, Thomas Richter escreveu:
> > > On Intel platform when I execute test case
> > >    probe libc's inet_pton & backtrace it with ping
> > > I get this output which is correct:
> 
> > > [root@f27 perf]# ./perf trace --no-syscalls
> > > 		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> > > PING ::1(::1) 56 data bytes
> > > 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.064 ms
> 
> > >  --- ::1 ping statistics ---
> > > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > > rtt min/avg/max/mdev = 0.064/0.064/0.064/0.000 ms
> > >      0.000 probe_libc:inet_pton:(7f26a88498a0))
> > > 	   __GI___inet_pton (/usr/lib64/libc-2.26.so)
> > > 	   getaddrinfo (/usr/lib64/libc-2.26.so)
> > > 	   main (/usr/bin/ping)
> > > [root@f27 perf]#
> > > 
> > > However when I add the global --call-graph dwarf to the command
> > > line I get the exactly same result which is wrong:
> > > 
> > > [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
> > > 		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> > > 
> > > My understanding is that the '--call-graph dwarf' option should be global
> > > and applies to all events. However it is ignored.
> > 
> > It is global, but then, per-event settings can override it.
> > 
> > So if you say that the following call (without that /max-stack=3/, which
> > implies DWARF, which is the safest bet for userspace callchains):
> > [root@jouet ~]# perf trace --no-syscalls --call-graph dwarf -e probe_libc:inet_pton ping -6 -c 1 ::1
> > PING ::1(::1) 56 data bytes
> > 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.086 ms
> > 
> > --- ::1 ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 0.086/0.086/0.086/0.000 ms
> >      0.000 probe_libc:inet_pton:(7fafee7ce350))
> > [root@jouet ~]#
> > 
> > And it doesn't work, right, that is a bug. I'll check if your patch
> > fixes that and then will correct the cset log, i.e. this part:
> > 
> > [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
> > 		-e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
> > 
> > Should read:
> > 
> > [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf
> > 		-e probe_libc:inet_pton ping -6 -c 1 ::1
> > 
> > And _should_ produce a call-graph, without a stack limit, ok?
> > 
> > I'll reply to this message after testing your patch.
> 
> You patch works as advertised, but we only get the callgraph if we
> specify max-stack in the event definition, which is odd, i.e. it serves
> to ask for callgraphs, but then its value is ignored, to make it clear,
> with your patch applied I get:
> 
> [root@jouet ~]# perf trace --no-syscalls --call-graph dwarf -e probe_libc:inet_pton ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.079 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms
>      0.000 probe_libc:inet_pton:(7f4585f7d350))
> [root@jouet ~]# 
> 
> Which is a bug, the global setting should have been applied to the only
> event set.

Please pull my perf/core branch, there are two csets there that should
fix this:

[acme@jouet perf]$ git log --oneline -2
cc28fec89f2a (HEAD -> perf/core, acme.korg/perf/core) perf trace: Fix setting of --call-graph/--max-stack for non-syscall events
236d812c55c2 perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitely
[acme@jouet perf]$

https://git.kernel.org/acme/c/cc28fec89f2a
https://git.kernel.org/acme/c/236d812c55c2

236d812c55c2 is just a simplification to make cc28fec89f2a smaller and
with just one purpose.

There is still room for improvement, I noticed overriding is not working
for the probe event, investigating it now.

- Arnaldo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-12 16:47     ` Arnaldo Carvalho de Melo
@ 2018-01-12 20:02       ` Arnaldo Carvalho de Melo
  2018-01-15  9:11         ` Thomas-Mich Richter
  2018-01-15 12:31         ` Thomas-Mich Richter
  0 siblings, 2 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-12 20:02 UTC (permalink / raw)
  To: Thomas Richter
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

Em Fri, Jan 12, 2018 at 01:47:06PM -0300, Arnaldo Carvalho de Melo escreveu:
> There is still room for improvement, I noticed overriding is not working
> for the probe event, investigating it now.

So, I had to fix this another way to get the possibility of overwriting
the global options (--max-stack, --call-graph) in an specific tracepoint
event:

http://git.kernel.org/acme/c/08e26396c6f2

replaced that HEAD.

This cset may take some more minutes to show up, just pushed.

- Arnaldo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-12 20:02       ` Arnaldo Carvalho de Melo
@ 2018-01-15  9:11         ` Thomas-Mich Richter
  2018-01-15 13:20           ` Arnaldo Carvalho de Melo
  2018-01-15 12:31         ` Thomas-Mich Richter
  1 sibling, 1 reply; 13+ messages in thread
From: Thomas-Mich Richter @ 2018-01-15  9:11 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

On 01/12/2018 09:02 PM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Jan 12, 2018 at 01:47:06PM -0300, Arnaldo Carvalho de Melo escreveu:
>> There is still room for improvement, I noticed overriding is not working
>> for the probe event, investigating it now.
> 
> So, I had to fix this another way to get the possibility of overwriting
> the global options (--max-stack, --call-graph) in an specific tracepoint
> event:
> 
> http://git.kernel.org/acme/c/08e26396c6f2
> 
> replaced that HEAD.
> 
> This cset may take some more minutes to show up, just pushed.
> 

Sorry this does *not* work on my s390x.

I have cloned your perf/core tree and above commit is included.
Here is the command I tried:

[root@s8360047 perf]# ./perf trace -vv --no-syscalls  --max-stack 4 --call-graph dwarf 
           -e probe_libc:inet_pton -- ping -6 -c 1 ::1
callchain: type DWARF
callchain: stack dump size 8192
------------------------------------------------------------
perf_event_attr:
  type                             2
  size                             112
  config                           0x45f
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|ADDR|CPU|PERIOD|RAW|DATA_SRC
  disabled                         1
  inherit                          1
  mmap                             1
  comm                             1
  enable_on_exec                   1
  task                             1
  mmap_data                        1
  sample_id_all                    1
  exclude_guest                    1
  mmap2                            1
  comm_exec                        1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 6735  cpu 0  group_fd -1  flags 0x8 = 3
sys_perf_event_open: pid 6735  cpu 1  group_fd -1  flags 0x8 = 4
mmap size 2101248B
perf event ring buffer mmapped per cpu
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.070 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.070/0.070/0.070/0.000 ms
     0.000 probe_libc:inet_pton:(3ffada42060))
[root@s8360047 perf]# 

I do miss in sample_type bits for CALLCHAIN, STACK_USER and REG_USER
in the debug output of the perf_event_open() attribute printout.

When I invoke the command with
[root@s8360047 perf]# ./perf trace -vv --no-syscalls 
	-e probe_libc:inet_pton/call-graph=dwarf,max-stack=3/ -- ping -6 -c 1 ::1
callchain: type DWARF
callchain: stack dump size 8192
------------------------------------------------------------
perf_event_attr:
  type                             2
  size                             112
  config                           0x45f
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER
  disabled                         1
  inherit                          1
  mmap                             1
  comm                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  exclude_guest                    1
  exclude_callchain_user           1
  mmap2                            1
  comm_exec                        1
  { wakeup_events, wakeup_watermark } 1
  sample_regs_user                 0x3ffffffff
  sample_stack_user                8192
  sample_max_stack                 3
------------------------------------------------------------
sys_perf_event_open: pid 6768  cpu 0  group_fd -1  flags 0x8 = 3
sys_perf_event_open: pid 6768  cpu 1  group_fd -1  flags 0x8 = 4
mmap size 528384B
perf event ring buffer mmapped per cpu
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.074 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.074/0.074/0.074/0.000 ms
[... snip ....]

unwind: _start:ip = 0x2aa1e38457b (0x457b)
     0.000 probe_libc:inet_pton:(3ff9b142060))
                                       __GI___inet_pton (/usr/lib64/libc-2.26.so)
                                       gaih_inet (inlined)
                                       __GI_getaddrinfo (inlined)
                                       main (/usr/bin/ping)
                                       __libc_start_main (/usr/lib64/libc-2.26.so)
                                       _start (/usr/bin/ping)
[root@s8360047 perf]# 

I see the proper result as expected.

-- 
Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany
--
Vorsitzende des Aufsichtsrats: Martina Koederitz 
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-12 20:02       ` Arnaldo Carvalho de Melo
  2018-01-15  9:11         ` Thomas-Mich Richter
@ 2018-01-15 12:31         ` Thomas-Mich Richter
  2018-01-15 13:57           ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 13+ messages in thread
From: Thomas-Mich Richter @ 2018-01-15 12:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

On 01/12/2018 09:02 PM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Jan 12, 2018 at 01:47:06PM -0300, Arnaldo Carvalho de Melo escreveu:
>> There is still room for improvement, I noticed overriding is not working
>> for the probe event, investigating it now.
> 
> So, I had to fix this another way to get the possibility of overwriting
> the global options (--max-stack, --call-graph) in an specific tracepoint
> event:
> 
> http://git.kernel.org/acme/c/08e26396c6f2
> 
> replaced that HEAD.
> 
> This cset may take some more minutes to show up, just pushed.
> 

I have installed your perf/core tree on my Fedora 27 Virtual Machine
running on my Intel notebook.

Here are some commands and the output on an Intel platform:

[root@f27 perf]# uname -a
Linux f27 4.15.0-rc6acme+ #1 SMP Mon Jan 15 12:35:23 CET 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@f27 perf]#

[root@f27 perf]# ./perf trace --no-syscalls --call-graph fp --max-stack 3
        -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.022 ms
     0.000 probe_libc:inet_pton:
 --- ::1 ping statistics ---
(7f8fc407d8a0))
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.022/0.022/0.022/0.000 ms
[root@f27 perf]#

--> No call graph at all, the kernel as been compiled with ORC unwinder.

[root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf --max-stack 3
        -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.024 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.024/0.024/0.024/0.000 ms
0.000 probe_libc:inet_pton:(7f7ff38488a0))
[root@f27 perf]#

--> No call graph at all, the kernel as been compiled with OCR unwinder.

[root@f27 perf]# ./perf trace --no-syscalls
        -e probe_libc:inet_pton/call-graph=dwarf,max-stack=4/
        ping -6 -c 1 ::1PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.019 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.019/0.019/0.019/0.000 ms
     0.000 probe_libc:inet_pton:(7fc985d658a0))
           __inet_pton (inlined)
           gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
           __GI_getaddrinfo (inlined)
           main (/usr/bin/ping)
           __libc_start_main (/usr/lib64/libc-2.26.so)
           _start (/usr/bin/ping)
[root@f27 perf]#

--> Dwarf call graph, but max-stack=4 not honoured when specified as
        event specific restriction.

[root@f27 perf]# ./perf trace --no-syscalls --max-stack 4
        -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.020 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.020/0.020/0.020/0.000 ms
     0.000 probe_libc:inet_pton:(7ffbc5f768a0))
            __inet_pton (inlined)
            gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
            __GI_getaddrinfo (inlined)
            main (/usr/bin/ping)
[root@f27 perf]#


--> Dwarf call graph and --max-stack 4 is also honoured.

I have the feeling that your fix does not work very well when
used with the --no-syscalls option.
Omitting --no-syscalls shows your explained behavior.

So there must be a difference between --no-syscalls and --syscalls
invocation.     

-- 
Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany
--
Vorsitzende des Aufsichtsrats: Martina Koederitz 
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-15  9:11         ` Thomas-Mich Richter
@ 2018-01-15 13:20           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-15 13:20 UTC (permalink / raw)
  To: Thomas-Mich Richter
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

Em Mon, Jan 15, 2018 at 10:11:02AM +0100, Thomas-Mich Richter escreveu:
> On 01/12/2018 09:02 PM, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Jan 12, 2018 at 01:47:06PM -0300, Arnaldo Carvalho de Melo escreveu:
> >> There is still room for improvement, I noticed overriding is not working
> >> for the probe event, investigating it now.
> > 
> > So, I had to fix this another way to get the possibility of overwriting
> > the global options (--max-stack, --call-graph) in an specific tracepoint
> > event:
> > 
> > http://git.kernel.org/acme/c/08e26396c6f2
> > 
> > replaced that HEAD.
> > 
> > This cset may take some more minutes to show up, just pushed.
> > 
> 
> Sorry this does *not* work on my s390x.
 
> I have cloned your perf/core tree and above commit is included.
> Here is the command I tried:
> 
> [root@s8360047 perf]# ./perf trace -vv --no-syscalls  --max-stack 4 --call-graph dwarf 
>            -e probe_libc:inet_pton -- ping -6 -c 1 ::1

yeah, I'm looking at this now, reproduced here on x86_64 :-\

- Arnaldo

> callchain: type DWARF
> callchain: stack dump size 8192
> ------------------------------------------------------------
> perf_event_attr:
>   type                             2
>   size                             112
>   config                           0x45f
>   { sample_period, sample_freq }   1
>   sample_type                      IP|TID|TIME|ADDR|CPU|PERIOD|RAW|DATA_SRC
>   disabled                         1
>   inherit                          1
>   mmap                             1
>   comm                             1
>   enable_on_exec                   1
>   task                             1
>   mmap_data                        1
>   sample_id_all                    1
>   exclude_guest                    1
>   mmap2                            1
>   comm_exec                        1
>   { wakeup_events, wakeup_watermark } 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 6735  cpu 0  group_fd -1  flags 0x8 = 3
> sys_perf_event_open: pid 6735  cpu 1  group_fd -1  flags 0x8 = 4
> mmap size 2101248B
> perf event ring buffer mmapped per cpu
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.070 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.070/0.070/0.070/0.000 ms
>      0.000 probe_libc:inet_pton:(3ffada42060))
> [root@s8360047 perf]# 
> 
> I do miss in sample_type bits for CALLCHAIN, STACK_USER and REG_USER
> in the debug output of the perf_event_open() attribute printout.

 
> When I invoke the command with
> [root@s8360047 perf]# ./perf trace -vv --no-syscalls 
> 	-e probe_libc:inet_pton/call-graph=dwarf,max-stack=3/ -- ping -6 -c 1 ::1
> callchain: type DWARF
> callchain: stack dump size 8192
> ------------------------------------------------------------
> perf_event_attr:
>   type                             2
>   size                             112
>   config                           0x45f
>   { sample_period, sample_freq }   1
>   sample_type                      IP|TID|TIME|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER
>   disabled                         1
>   inherit                          1
>   mmap                             1
>   comm                             1
>   enable_on_exec                   1
>   task                             1
>   sample_id_all                    1
>   exclude_guest                    1
>   exclude_callchain_user           1
>   mmap2                            1
>   comm_exec                        1
>   { wakeup_events, wakeup_watermark } 1
>   sample_regs_user                 0x3ffffffff
>   sample_stack_user                8192
>   sample_max_stack                 3
> ------------------------------------------------------------
> sys_perf_event_open: pid 6768  cpu 0  group_fd -1  flags 0x8 = 3
> sys_perf_event_open: pid 6768  cpu 1  group_fd -1  flags 0x8 = 4
> mmap size 528384B
> perf event ring buffer mmapped per cpu
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.074 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.074/0.074/0.074/0.000 ms
> [... snip ....]
> 
> unwind: _start:ip = 0x2aa1e38457b (0x457b)
>      0.000 probe_libc:inet_pton:(3ff9b142060))
>                                        __GI___inet_pton (/usr/lib64/libc-2.26.so)
>                                        gaih_inet (inlined)
>                                        __GI_getaddrinfo (inlined)
>                                        main (/usr/bin/ping)
>                                        __libc_start_main (/usr/lib64/libc-2.26.so)
>                                        _start (/usr/bin/ping)
> [root@s8360047 perf]# 
> 
> I see the proper result as expected.
> 
> -- 
> Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany
> --
> Vorsitzende des Aufsichtsrats: Martina Koederitz 
> Geschäftsführung: Dirk Wittkopp
> Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-15 12:31         ` Thomas-Mich Richter
@ 2018-01-15 13:57           ` Arnaldo Carvalho de Melo
  2018-01-15 14:16             ` Arnaldo Carvalho de Melo
  2018-01-15 14:45             ` Hendrik Brueckner
  0 siblings, 2 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-15 13:57 UTC (permalink / raw)
  To: Thomas-Mich Richter
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

Em Mon, Jan 15, 2018 at 01:31:00PM +0100, Thomas-Mich Richter escreveu:
> On 01/12/2018 09:02 PM, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Jan 12, 2018 at 01:47:06PM -0300, Arnaldo Carvalho de Melo escreveu:
> >> There is still room for improvement, I noticed overriding is not working
> >> for the probe event, investigating it now.
> > 
> > So, I had to fix this another way to get the possibility of overwriting
> > the global options (--max-stack, --call-graph) in an specific tracepoint
> > event:
> > 
> > http://git.kernel.org/acme/c/08e26396c6f2
> > 
> > replaced that HEAD.
> > 
> > This cset may take some more minutes to show up, just pushed.
> > 
> 
> I have installed your perf/core tree on my Fedora 27 Virtual Machine
> running on my Intel notebook.
> 
> Here are some commands and the output on an Intel platform:
> 
> [root@f27 perf]# uname -a
> Linux f27 4.15.0-rc6acme+ #1 SMP Mon Jan 15 12:35:23 CET 2018 x86_64 x86_64 x86_64 GNU/Linux
> [root@f27 perf]#
> 
> [root@f27 perf]# ./perf trace --no-syscalls --call-graph fp --max-stack 3
>         -e probe_libc:inet_pton ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.022 ms
>      0.000 probe_libc:inet_pton:
>  --- ::1 ping statistics ---
> (7f8fc407d8a0))
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.022/0.022/0.022/0.000 ms
> [root@f27 perf]#
> 
> --> No call graph at all, the kernel as been compiled with ORC unwinder.

Results with the patch at the end of this message:

[root@jouet ~]# perf trace --no-syscalls --call-graph fp --max-stack 3 -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.061 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
     0.000 probe_libc:inet_pton:(7f475b8be350))
                                       __GI___inet_pton (/usr/lib64/libc-2.26.so)
                                       getaddrinfo (/usr/lib64/libc-2.26.so)
                                       [0xffffaa7de21b9f40] (/usr/bin/ping)
[root@jouet ~]#

 
> [root@f27 perf]# ./perf trace --no-syscalls --call-graph dwarf --max-stack 3
>         -e probe_libc:inet_pton ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.024 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.024/0.024/0.024/0.000 ms
> 0.000 probe_libc:inet_pton:(7f7ff38488a0))
> [root@f27 perf]#
> 
> --> No call graph at all, the kernel as been compiled with OCR unwinder.

[root@jouet ~]# perf trace --no-syscalls --call-graph fp --max-stack 3 -e probe_libc:inet_pton ping -6 -c 1 ::1
perf trace --no-syscalls --call-graph dwarf --max-stack 3 -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.080 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.080/0.080/0.080/0.000 ms
     0.000 probe_libc:inet_pton:(7f33e9647350))
                                       __inet_pton (inlined)
                                       gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                       __GI_getaddrinfo (inlined)
[root@jouet ~]# 

And here we see a difference in the fp and DWARF unwinders, have to dig
deeper into this one, perhaps using the DWARF one we have more info and
then can end up with inlines instead of what it calls.
 
> [root@f27 perf]# ./perf trace --no-syscalls
>         -e probe_libc:inet_pton/call-graph=dwarf,max-stack=4/
>         ping -6 -c 1 ::1PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.019 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.019/0.019/0.019/0.000 ms
>      0.000 probe_libc:inet_pton:(7fc985d658a0))
>            __inet_pton (inlined)
>            gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
>            __GI_getaddrinfo (inlined)
>            main (/usr/bin/ping)
>            __libc_start_main (/usr/lib64/libc-2.26.so)
>            _start (/usr/bin/ping)
> [root@f27 perf]#
> 
> --> Dwarf call graph, but max-stack=4 not honoured when specified as
>         event specific restriction.

[root@jouet ~]# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf,max-stack=4/
     0.000 probe_libc:inet_pton:(7f10482eb350))
Problem processing probe_libc:inet_pton callchain, skipping...
   717.753 probe_libc:inet_pton:(7f10482eb350))
Problem processing probe_libc:inet_pton callchain, skipping...
   718.182 probe_libc:inet_pton:(7f10482eb350))
Problem processing probe_libc:inet_pton callchain, skipping...
  2662.046 probe_libc:inet_pton:(7f10482eb350))
Problem processing probe_libc:inet_pton callchain, skipping...
^C[root@jouet ~]#

Need to investigate...
 
> [root@f27 perf]# ./perf trace --no-syscalls --max-stack 4
>         -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.020 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.020/0.020/0.020/0.000 ms
>      0.000 probe_libc:inet_pton:(7ffbc5f768a0))
>             __inet_pton (inlined)
>             gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
>             __GI_getaddrinfo (inlined)
>             main (/usr/bin/ping)
> [root@f27 perf]#
> 
> 
> --> Dwarf call graph and --max-stack 4 is also honoured.

[root@jouet ~]# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf,max-stack=4/
perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
Value too large for defined data type
[root@jouet ~]# 

Grrr.
> 
> I have the feeling that your fix does not work very well when
> used with the --no-syscalls option.
> Omitting --no-syscalls shows your explained behavior.
> 
> So there must be a difference between --no-syscalls and --syscalls
> invocation.     

Right, there is, please see the patch below, but I need to fix the other
issues I noticed here but you haven't, perhaps we're using a different
DWARF unwinder (there are two possible libraries to use).

commit 7a24f244b452cfc0acfe35d1a6168c9a8ff483ce
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Mon Jan 15 10:39:55 2018 -0300

    perf trace: Setup DWARF callchains for non-syscall events when --max-stack is used
    
    If we use:
    
            perf trace --max-stack=4
    
    then the syscall events will use DWARF callchains, when available
    (libunwind enabled in the build) and the printing will stop at 4 levels.
    
    When we introduced support for tracepoint events this ended up not
    applying for them, fix it.
    
    Reported-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: https://lkml.kernel.org/n/tip-xq2enzgcs5q40kg5wq5aac3s@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 0362974854e9..c6c7498562ea 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3065,11 +3065,23 @@ int cmd_trace(int argc, const char **argv)
 	}
 
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-	if ((trace.min_stack || max_stack_user_set) && !callchain_param.enabled && trace.trace_syscalls)
+	if ((trace.min_stack || max_stack_user_set) && !callchain_param.enabled) {
 		record_opts__parse_callchain(&trace.opts, &callchain_param, "dwarf", false);
+	}
 #endif
 
 	if (callchain_param.enabled) {
+		struct perf_evsel *evsel;
+		/*
+		 * When we use --max-stack and/or --call-graph we need to look at all the
+		 * other events passed in the command line to apply this config to them,
+		 * if not explicitely set.
+		 */
+		evlist__for_each_entry(trace.evlist, evsel) {
+			if (!(evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN))
+				perf_evsel__config_callchain(evsel, &trace.opts, &callchain_param);
+		}
+
 		if (!mmap_pages_user_set && geteuid() == 0)
 			trace.opts.mmap_pages = perf_event_mlock_kb_in_pages() * 4;
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-15 13:57           ` Arnaldo Carvalho de Melo
@ 2018-01-15 14:16             ` Arnaldo Carvalho de Melo
  2018-01-15 14:50               ` Thomas-Mich Richter
  2018-01-15 14:45             ` Hendrik Brueckner
  1 sibling, 1 reply; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-15 14:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas-Mich Richter, linux-kernel, linux-perf-users, brueckner,
	schwidefsky, heiko.carstens

Em Mon, Jan 15, 2018 at 10:57:52AM -0300, Arnaldo Carvalho de Melo escreveu:
> > [root@f27 perf]# ./perf trace --no-syscalls --max-stack 4
> >         -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
> > PING ::1(::1) 56 data bytes
> > 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.020 ms
> > 
> > --- ::1 ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 0.020/0.020/0.020/0.000 ms
> >      0.000 probe_libc:inet_pton:(7ffbc5f768a0))
> >             __inet_pton (inlined)
> >             gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
> >             __GI_getaddrinfo (inlined)
> >             main (/usr/bin/ping)
> > [root@f27 perf]#
> > 
> > 
> > --> Dwarf call graph and --max-stack 4 is also honoured.
> 
> [root@jouet ~]# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf,max-stack=4/
> perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
> Value too large for defined data type
> [root@jouet ~]# 
> 
> Grrr.


Got this one fixed with the following patch:

commit b78278e11f6992ca348a4b96aad3b2c0a9ecf0f0
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Mon Jan 15 11:07:58 2018 -0300

    perf callchain: Fix attr.sample_max_stack setting
    
    When setting the "dwarf" unwinder for a specific event and not
    specifying the max-stack, the attr.sample_max_stack ended up using an
    uninitialized callchain_param.max_stack, fix it by using designated
    initializers for that callchain_param variable, zeroing all non
    explicitely initialized struct members.
    
    Here is what happened:
    
      # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
      callchain: type DWARF
      callchain: stack dump size 8192
      perf_event_attr:
        type                             2
        size                             112
        config                           0x730
        { sample_period, sample_freq }   1
        sample_type                      IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
        exclude_callchain_user           1
        { wakeup_events, wakeup_watermark } 1
        sample_regs_user                 0xff0fff
        sample_stack_user                8192
        sample_max_stack                 50656
      sys_perf_event_open failed, error -75
      Value too large for defined data type
      # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
      callchain: type DWARF
      callchain: stack dump size 8192
      perf_event_attr:
        type                             2
        size                             112
        config                           0x730
        sample_type                      IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
        exclude_callchain_user           1
        sample_regs_user                 0xff0fff
        sample_stack_user                8192
        sample_max_stack                 30448
      sys_perf_event_open failed, error -75
      Value too large for defined data type
      #
    
    Now the attr.sample_max_stack is set to zero and the above works as
    expected:
    
      # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
      PING ::1(::1) 56 data bytes
      64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms
    
      --- ::1 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
           0.000 probe_libc:inet_pton:(7feb7a998350))
                                             __inet_pton (inlined)
                                             gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                             __GI_getaddrinfo (inlined)
                                             [0xffffaa39b6108f3f] (/usr/bin/ping)
      #
    
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: https://lkml.kernel.org/n/tip-qmqnot8y3zff5wh8kb7rkl4d@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index efa2e629a669..8f971a2301d1 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -731,14 +731,14 @@ static void apply_config_terms(struct perf_evsel *evsel,
 	struct perf_evsel_config_term *term;
 	struct list_head *config_terms = &evsel->config_terms;
 	struct perf_event_attr *attr = &evsel->attr;
-	struct callchain_param param;
+	/* callgraph default */
+	struct callchain_param param = {
+		.record_mode = callchain_param.record_mode,
+	};
 	u32 dump_size = 0;
 	int max_stack = 0;
 	const char *callgraph_buf = NULL;
 
-	/* callgraph default */
-	param.record_mode = callchain_param.record_mode;
-
 	list_for_each_entry(term, config_terms, list) {
 		switch (term->type) {
 		case PERF_EVSEL__CONFIG_TERM_PERIOD:

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-15 13:57           ` Arnaldo Carvalho de Melo
  2018-01-15 14:16             ` Arnaldo Carvalho de Melo
@ 2018-01-15 14:45             ` Hendrik Brueckner
  1 sibling, 0 replies; 13+ messages in thread
From: Hendrik Brueckner @ 2018-01-15 14:45 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas-Mich Richter, linux-kernel, linux-perf-users, brueckner,
	schwidefsky, heiko.carstens

On Mon, Jan 15, 2018 at 10:57:52AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 15, 2018 at 01:31:00PM +0100, Thomas-Mich Richter escreveu:
> > On 01/12/2018 09:02 PM, Arnaldo Carvalho de Melo wrote:
> > > Em Fri, Jan 12, 2018 at 01:47:06PM -0300, Arnaldo Carvalho de Melo escreveu:
> > >
> [root@jouet ~]# perf trace --no-syscalls --call-graph fp --max-stack 3 -e probe_libc:inet_pton ping -6 -c 1 ::1
> perf trace --no-syscalls --call-graph dwarf --max-stack 3 -e probe_libc:inet_pton ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.080 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.080/0.080/0.080/0.000 ms
>      0.000 probe_libc:inet_pton:(7f33e9647350))
>                                        __inet_pton (inlined)
>                                        gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
>                                        __GI_getaddrinfo (inlined)
> [root@jouet ~]# 
> 
> And here we see a difference in the fp and DWARF unwinders, have to dig
> deeper into this one, perhaps using the DWARF one we have more info and
> then can end up with inlines instead of what it calls.

DWARF includes information about which functions are inlined.  The ORC
unwinder data is based on the analysis of the generated instructions
(created with the objtool).  Hence, it does no (and no longer) know whether
a bunch of instructions are originally part of a function or inlined.

Thanks and kind regards,
  Hendrik

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-15 14:16             ` Arnaldo Carvalho de Melo
@ 2018-01-15 14:50               ` Thomas-Mich Richter
  2018-01-15 15:02                 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas-Mich Richter @ 2018-01-15 14:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, brueckner, schwidefsky, heiko.carstens

On 01/15/2018 03:16 PM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 15, 2018 at 10:57:52AM -0300, Arnaldo Carvalho de Melo escreveu:
>>> [root@f27 perf]# ./perf trace --no-syscalls --max-stack 4
>>>         -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
>>> PING ::1(::1) 56 data bytes
>>> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.020 ms
>>>
>>> --- ::1 ping statistics ---
>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>> rtt min/avg/max/mdev = 0.020/0.020/0.020/0.000 ms
>>>      0.000 probe_libc:inet_pton:(7ffbc5f768a0))
>>>             __inet_pton (inlined)
>>>             gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
>>>             __GI_getaddrinfo (inlined)
>>>             main (/usr/bin/ping)
>>> [root@f27 perf]#
>>>
>>>
>>> --> Dwarf call graph and --max-stack 4 is also honoured.
>>
>> [root@jouet ~]# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf,max-stack=4/
>> perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
>> Value too large for defined data type
>> [root@jouet ~]# 
>>
>> Grrr.
> 
> 
> Got this one fixed with the following patch:
> 
> commit b78278e11f6992ca348a4b96aad3b2c0a9ecf0f0

[...]

> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index efa2e629a669..8f971a2301d1 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -731,14 +731,14 @@ static void apply_config_terms(struct perf_evsel *evsel,
>  	struct perf_evsel_config_term *term;
>  	struct list_head *config_terms = &evsel->config_terms;
>  	struct perf_event_attr *attr = &evsel->attr;
> -	struct callchain_param param;
> +	/* callgraph default */
> +	struct callchain_param param = {
> +		.record_mode = callchain_param.record_mode,
> +	};
>  	u32 dump_size = 0;
>  	int max_stack = 0;
>  	const char *callgraph_buf = NULL;
> 
> -	/* callgraph default */
> -	param.record_mode = callchain_param.record_mode;
> -
>  	list_for_each_entry(term, config_terms, list) {
>  		switch (term->type) {
>  		case PERF_EVSEL__CONFIG_TERM_PERIOD:

This patch works for me.
Here is the output on my s390x:

[root@s8360047 perf]# ./perf trace --no-syscalls --max-stack 4 
	-e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.070 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.070/0.070/0.070/0.000 ms
     0.000 probe_libc:inet_pton:(3ffa70c2060))
           __GI___inet_pton (/usr/lib64/libc-2.26.so)
           gaih_inet (inlined)
           __GI_getaddrinfo (inlined)
           main (/usr/bin/ping)
[root@s8360047 perf]# ./perf trace --no-syscalls --call-graph dwarf 
	-e probe_libc:inet_pton -- ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.086 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.086/0.086/0.086/0.000 ms
     0.000 probe_libc:inet_pton:(3ff93fc2060))
           __GI___inet_pton (/usr/lib64/libc-2.26.so)
           gaih_inet (inlined)
           __GI_getaddrinfo (inlined)
           main (/usr/bin/ping)
           __libc_start_main (/usr/lib64/libc-2.26.so)
           _start (/usr/bin/ping)
[root@s8360047 perf]#

[root@s8360047 perf]# ./perf trace --no-syscalls 
	-e probe_libc:inet_pton/call-graph=dwarf,max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.066 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.066/0.066/0.066/0.000 ms
     0.000 probe_libc:inet_pton:(3ffb82c2060))
           __GI___inet_pton (/usr/lib64/libc-2.26.so)
           gaih_inet (inlined)
           __GI_getaddrinfo (inlined)
           main (/usr/bin/ping)
           __libc_start_main (/usr/lib64/libc-2.26.so)
           _start (/usr/bin/ping)
[root@s8360047 perf]# 

Not sure if this can work at all. Since dwarf stack unwinding is done in user space
the attr.sample_max_stack set to 3 is useless in this case.

Yout have my Tested-by.
-- 
Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany
--
Vorsitzende des Aufsichtsrats: Martina Koederitz 
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf
  2018-01-15 14:50               ` Thomas-Mich Richter
@ 2018-01-15 15:02                 ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-15 15:02 UTC (permalink / raw)
  To: Thomas-Mich Richter
  Cc: Arnaldo Carvalho de Melo, linux-kernel, linux-perf-users,
	brueckner, schwidefsky, heiko.carstens

Em Mon, Jan 15, 2018 at 03:50:15PM +0100, Thomas-Mich Richter escreveu:
> On 01/15/2018 03:16 PM, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Jan 15, 2018 at 10:57:52AM -0300, Arnaldo Carvalho de Melo escreveu:
> >>> [root@f27 perf]# ./perf trace --no-syscalls --max-stack 4
> >>>         -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
> >>> PING ::1(::1) 56 data bytes
> >>> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.020 ms
> >>>
> >>> --- ::1 ping statistics ---
> >>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> >>> rtt min/avg/max/mdev = 0.020/0.020/0.020/0.000 ms
> >>>      0.000 probe_libc:inet_pton:(7ffbc5f768a0))
> >>>             __inet_pton (inlined)
> >>>             gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
> >>>             __GI_getaddrinfo (inlined)
> >>>             main (/usr/bin/ping)
> >>> [root@f27 perf]#
> >>>
> >>>
> >>> --> Dwarf call graph and --max-stack 4 is also honoured.
> >>
> >> [root@jouet ~]# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf,max-stack=4/
> >> perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
> >> Value too large for defined data type
> >> [root@jouet ~]# 
> >>
> >> Grrr.
> > 
> > 
> > Got this one fixed with the following patch:
> > 
> > commit b78278e11f6992ca348a4b96aad3b2c0a9ecf0f0
> 
> [...]
> 
> > 
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index efa2e629a669..8f971a2301d1 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -731,14 +731,14 @@ static void apply_config_terms(struct perf_evsel *evsel,
> >  	struct perf_evsel_config_term *term;
> >  	struct list_head *config_terms = &evsel->config_terms;
> >  	struct perf_event_attr *attr = &evsel->attr;
> > -	struct callchain_param param;
> > +	/* callgraph default */
> > +	struct callchain_param param = {
> > +		.record_mode = callchain_param.record_mode,
> > +	};
> >  	u32 dump_size = 0;
> >  	int max_stack = 0;
> >  	const char *callgraph_buf = NULL;
> > 
> > -	/* callgraph default */
> > -	param.record_mode = callchain_param.record_mode;
> > -
> >  	list_for_each_entry(term, config_terms, list) {
> >  		switch (term->type) {
> >  		case PERF_EVSEL__CONFIG_TERM_PERIOD:
> 
> This patch works for me.
> Here is the output on my s390x:
> 
> [root@s8360047 perf]# ./perf trace --no-syscalls --max-stack 4 
> 	-e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.070 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.070/0.070/0.070/0.000 ms
>      0.000 probe_libc:inet_pton:(3ffa70c2060))
>            __GI___inet_pton (/usr/lib64/libc-2.26.so)
>            gaih_inet (inlined)
>            __GI_getaddrinfo (inlined)
>            main (/usr/bin/ping)
> [root@s8360047 perf]# ./perf trace --no-syscalls --call-graph dwarf 
> 	-e probe_libc:inet_pton -- ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.086 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.086/0.086/0.086/0.000 ms
>      0.000 probe_libc:inet_pton:(3ff93fc2060))
>            __GI___inet_pton (/usr/lib64/libc-2.26.so)
>            gaih_inet (inlined)
>            __GI_getaddrinfo (inlined)
>            main (/usr/bin/ping)
>            __libc_start_main (/usr/lib64/libc-2.26.so)
>            _start (/usr/bin/ping)
> [root@s8360047 perf]#
> 
> [root@s8360047 perf]# ./perf trace --no-syscalls 
> 	-e probe_libc:inet_pton/call-graph=dwarf,max-stack=3/ ping -6 -c 1 ::1
> PING ::1(::1) 56 data bytes
> 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.066 ms
> 
> --- ::1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.066/0.066/0.066/0.000 ms
>      0.000 probe_libc:inet_pton:(3ffb82c2060))
>            __GI___inet_pton (/usr/lib64/libc-2.26.so)
>            gaih_inet (inlined)
>            __GI_getaddrinfo (inlined)
>            main (/usr/bin/ping)
>            __libc_start_main (/usr/lib64/libc-2.26.so)
>            _start (/usr/bin/ping)
> [root@s8360047 perf]# 
> 
> Not sure if this can work at all. Since dwarf stack unwinding is done in user space
> the attr.sample_max_stack set to 3 is useless in this case.

right, I'm working on this case now, this has to be honoured in
trace__resolve_callchain, that has to use evsel->attr.sample_max_stack
if it is set, otherwise fallback to trace->max_stack.

- Arnaldo

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-01-15 15:02 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-12 10:03 [PATCH v2] perf trace: Fix missing handling of --call-graph dwarf Thomas Richter
2018-01-12 14:20 ` Arnaldo Carvalho de Melo
2018-01-12 15:01   ` Arnaldo Carvalho de Melo
2018-01-12 16:47     ` Arnaldo Carvalho de Melo
2018-01-12 20:02       ` Arnaldo Carvalho de Melo
2018-01-15  9:11         ` Thomas-Mich Richter
2018-01-15 13:20           ` Arnaldo Carvalho de Melo
2018-01-15 12:31         ` Thomas-Mich Richter
2018-01-15 13:57           ` Arnaldo Carvalho de Melo
2018-01-15 14:16             ` Arnaldo Carvalho de Melo
2018-01-15 14:50               ` Thomas-Mich Richter
2018-01-15 15:02                 ` Arnaldo Carvalho de Melo
2018-01-15 14:45             ` Hendrik Brueckner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).