linux-trace-users.vger.kernel.org archive mirror
* Wrong Perf Backtraces
       [not found] <157597d74ff17f781d9de7e7e3defd13@ut.ac.ir>
@ 2020-03-22 20:24 ` ahmadkhorrami
  2020-03-23  0:34   ` Steven Rostedt
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-22 20:24 UTC (permalink / raw)
  To: Linux-trace Users

Hi,
I used "Perf" to extract call graphs in an evince benchmark. The command 
used is as follows:
sudo perf record -d --call-graph dwarf -c 10000 -e 
mem_load_uops_retired.l3_miss:uppp /opt/evince-3.28.4/bin/evince

I extracted the backtraces using "perf script" and found out that there 
are many corrupted backtrace instances. Some contained repeated function 
calls, for example two consecutive gmallocn()s exactly at the same 
offsets. There are also some backtraces where the callers and callees do 
not match.
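
For reference, assuming the default "perf script" output format, such
back-to-back duplicate frame lines can be spotted with something like:

  perf script -i perf.data > backtraces.txt
  uniq -d backtraces.txt    # prints only lines that repeat consecutively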

Note that the mappings are correct. In other words, each single line of 
the reported backtraces is correct (i.e., addresses match with 
functions). But it seems that there are some function calls in the 
middle, which are missed by "Perf". Strangely, in all runs (and also 
with different sampling frequencies) the problem occurs exactly at the 
same place.

I am really confused and would welcome any help. I can also send the 
backtraces if needed.
Regards.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-22 20:24 ` Wrong Perf Backtraces ahmadkhorrami
@ 2020-03-23  0:34   ` Steven Rostedt
       [not found]     ` <21b3df4080709f193d62b159887e2a83@ut.ac.ir>
  0 siblings, 1 reply; 67+ messages in thread
From: Steven Rostedt @ 2020-03-23  0:34 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Linux-trace Users, Arnaldo Carvalho de Melo, Peter Zijlstra, Jiri Olsa

On Mon, 23 Mar 2020 00:54:01 +0430
ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:

> Hi,
> I used "Perf" to extract call graphs in an evince benchmark. The command 
> used is as follows:
> sudo perf record -d --call-graph dwarf -c 10000 -e 
> mem_load_uops_retired.l3_miss:uppp /opt/evince-3.28.4/bin/evince
> 
> I extracted the backtraces using "perf script" and found out that there 
> are many corrupted backtrace instances. Some contained repeated function 
> calls, for example two consecutive gmallocn()s exactly at the same 
> offsets. There are also some backtraces where the callers and callees do 
> not match.

Could you show some examples of the backtraces you mention?

> 
> Note that the mappings are correct. In other words, each single line of 
> the reported backtraces is correct (i.e., addresses match with 
> functions). But it seems that there are some function calls in the 
> middle, which are missed by "Perf". Strangely, in all runs (and also 
> with different sampling frequencies) the problem occurs exactly at the 
> same place.
> 
> I am really confused and looking forward to any help. I can also send 
> backtraces if needed.

-- Steve

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
       [not found]     ` <21b3df4080709f193d62b159887e2a83@ut.ac.ir>
@ 2020-03-23  8:49       ` Jiri Olsa
  2020-03-23 10:03         ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-23  8:49 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Linux-trace Users, Arnaldo Carvalho de Melo,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

On Mon, Mar 23, 2020 at 07:48:26AM +0430, ahmadkhorrami wrote:
> Here is a link to the detailed question at Stackoverflow: 
> 
> https://stackoverflow.com/questions/60766026/wrong-perf-backtraces 

hi,
what perf version are you running?

jirka


> 
> I can copy it here, if needed. 
> 
> Thanks 
> 
> On 2020-03-23 05:04, Steven Rostedt wrote:
> 
> > On Mon, 23 Mar 2020 00:54:01 +0430
> > ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
> > 
> >> Hi,
> >> I used "Perf" to extract call graphs in an evince benchmark. The command 
> >> used is as follows:
> >> sudo perf record -d --call-graph dwarf -c 10000 -e 
> >> mem_load_uops_retired.l3_miss:uppp /opt/evince-3.28.4/bin/evince
> >> 
> >> I extracted the backtraces using "perf script" and found out that there 
> >> are many corrupted backtrace instances. Some contained repeated function 
> >> calls, for example two consecutive gmallocn()s exactly at the same 
> >> offsets. There are also some backtraces where the callers and callees do 
> >> not match.
> > 
> > Could you show some examples of the backtraces you mention?
> > 
> >> Note that the mappings are correct. In other words, each single line of 
> >> the reported backtraces is correct (i.e., addresses match with 
> >> functions). But it seems that there are some function calls in the 
> >> middle, which are missed by "Perf". Strangely, in all runs (and also 
> >> with different sampling frequencies) the problem occurs exactly at the 
> >> same place.
> >> 
> >> I am really confused and looking forward to any help. I can also send 
> >> backtraces if needed.
> > 
> > -- Steve


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Re: Wrong Perf Backtraces
  2020-03-23  8:49       ` Jiri Olsa
@ 2020-03-23 10:03         ` ahmadkhorrami
  2020-03-25 15:18           ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-23 10:03 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Linux-trace Users, Arnaldo Carvalho de Melo,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

Hi,
It seems that my previous e-mail was not sent properly. So, here is a 
link to the Stackoverflow question:
https://stackoverflow.com/questions/60766026/wrong-perf-backtraces

The perf tool is the one shipped with Ubuntu 18.04, with the following 
"uname -a" output:
Linux Ahmad-Laptop 5.0.0-37-generic #40~18.04.1-Ubuntu SMP Thu Nov 14 
12:06:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
I also tried a self-compiled Linux 5.4.7 kernel and its corresponding 
perf tool.

Regards.

On 2020-03-23 13:19, Jiri Olsa wrote:

> On Mon, Mar 23, 2020 at 07:48:26AM +0430, ahmadkhorrami wrote:
> 
>> Here is a link to the detailed question at Stackoverflow:
>> 
>> https://stackoverflow.com/questions/60766026/wrong-perf-backtraces
> 
> hi,
> what perf version are you running?
> 
> jirka
> 
> I can copy it here, if needed.
> 
> Thanks
> 
> On 2020-03-23 05:04, Steven Rostedt wrote:
> 
> On Mon, 23 Mar 2020 00:54:01 +0430
> ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
> 
> Hi,
> I used "Perf" to extract call graphs in an evince benchmark. The 
> command
> used is as follows:
> sudo perf record -d --call-graph dwarf -c 10000 e
> mem_load_uops_retired.l3_miss:uppp /opt/evince-3.28.4/bin/evince
> 
> I extracted the backtraces using "perf script" and found out that there
> are many corrupted backtrace instances. Some contained repeated 
> function
> calls, for example two consecutive gmallocn()s exactly at the same
> offsets. There are also some backtraces where the callers and callees 
> do
> not match.
> Could you show some examples of the backtraces you mention?
> 
> Note that that mappings are correct. In other words, each single line 
> of
> the reported backtraces is correct (i.e., addresses match with
> functions). But is seems that there are some function calls in the
> middle, which are missed by "Perf". Strangely, in all runs (and also
> with different sampling frequencies) the problem occurs exactly at the
> same place.
> 
> I am really confused and looking forward to any help. I can also send
> backtraces if needed.
> -- Steve

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-23 10:03         ` ahmadkhorrami
@ 2020-03-25 15:18           ` ahmadkhorrami
  2020-03-25 15:46             ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-25 15:18 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Linux-trace Users, Arnaldo Carvalho de Melo,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

Hi,

Could you give me some hints about where the actual problem takes place? 
Is the problem in "Perf" or in the hardware (i.e., the "Hardware 
Performance Counters")? Can I resolve the problem by simply modifying the 
code? How much work would be needed?

Your suggestions would be appreciated, since you have much more 
experience and knowledge in this area.

Regards.

On 2020-03-23 14:33, ahmadkhorrami wrote:

> Hi,
> It seems that my previous e-mail is not sent, properly. So, here is a 
> link to the stackoverflow question:
> https://stackoverflow.com/questions/60766026/wrong-perf-backtraces
> 
> The perf is for Ubuntu 18.04 with the following "uname -a" output:
> Linux Ahmad-Laptop 5.0.0-37-generic #40~18.04.1-Ubuntu SMP Thu Nov 14 
> 12:06:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> I also used a compiled Linux-5.4.7 kernel and its corresponding Perf 
> tool.
> 
> Regards.
> 
> On 2020-03-23 13:19, Jiri Olsa wrote:
> 
> On Mon, Mar 23, 2020 at 07:48:26AM +0430, ahmadkhorrami wrote:
> 
> Here is a link to the detailed question at Stackoverflow:
> 
> https://stackoverflow.com/questions/60766026/wrong-perf-backtraces
> hi,
> what perf version are you running?
> 
> jirka
> 
> I can copy it here, if needed.
> 
> Thanks
> 
> On 2020-03-23 05:04, Steven Rostedt wrote:
> 
> On Mon, 23 Mar 2020 00:54:01 +0430
> ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
> 
> Hi,
> I used "Perf" to extract call graphs in an evince benchmark. The 
> command
> used is as follows:
> sudo perf record -d --call-graph dwarf -c 10000 e
> mem_load_uops_retired.l3_miss:uppp /opt/evince-3.28.4/bin/evince
> 
> I extracted the backtraces using "perf script" and found out that there
> are many corrupted backtrace instances. Some contained repeated 
> function
> calls, for example two consecutive gmallocn()s exactly at the same
> offsets. There are also some backtraces where the callers and callees 
> do
> not match.
> Could you show some examples of the backtraces you mention?
> 
> Note that that mappings are correct. In other words, each single line 
> of
> the reported backtraces is correct (i.e., addresses match with
> functions). But is seems that there are some function calls in the
> middle, which are missed by "Perf". Strangely, in all runs (and also
> with different sampling frequencies) the problem occurs exactly at the
> same place.
> 
> I am really confused and looking forward to any help. I can also send
> backtraces if needed.
> -- Steve

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 15:18           ` ahmadkhorrami
@ 2020-03-25 15:46             ` Jiri Olsa
  2020-03-25 18:54               ` ahmadkhorrami
  2020-03-25 18:58               ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 67+ messages in thread
From: Jiri Olsa @ 2020-03-25 15:46 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Linux-trace Users, Arnaldo Carvalho de Melo,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

On Wed, Mar 25, 2020 at 07:48:39PM +0430, ahmadkhorrami wrote:
> Hi,
> 
> Could you give me some hints about where the actual problem takes place? Is
> the problem with "Perf" or the hardware part (i.e., "Hardware Performance
> Counters")? Can I revise the problem by simply modifying the code? How much
> work is needed?

heya,
might be some callchain processing bug, but I can't reproduce it on my setup..
would you have/make some simple example that would reproduce the issue?

Another option is that you'd send perf.data together with 'perf archive' data.

Also.. we support 2 dwarf unwinders (libunwind/libdw).. not sure which one you
have compiled in, but would be helpful to see if the other shows the same.
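
If both libraries are installed the choice is made at build time; assuming
a perf built from the kernel source tree, something like the following
should switch between them (the NO_* knobs are build-time flags, not
runtime options):

	# use the libdw-dwarf unwinder by building without libunwind
	make -C tools/perf NO_LIBUNWIND=1

	# or use libunwind by building without the libdw unwinder
	make -C tools/perf NO_LIBDW_DWARF_UNWIND=1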

jirka


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 15:46             ` Jiri Olsa
@ 2020-03-25 18:54               ` ahmadkhorrami
  2020-03-25 18:58               ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-25 18:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Linux-trace Users, Arnaldo Carvalho de Melo,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

Thanks.
I used evince-3.28.4 and the following command:
sudo perf record -d --call-graph dwarf -c 100 -e 
mem_load_uops_retired.l3_miss:uppp /opt/evince-3.28.4/bin/evince

I only opened a file and then closed it using ALT+F4. The repeated 
gmallocn()s are reproducible with any sampling frequency, but the other 
backtrace is sometimes not reproducible at low frequencies. Where should 
I send you the log file?
Regards.

On 2020-03-25 20:16, Jiri Olsa wrote:

> On Wed, Mar 25, 2020 at 07:48:39PM +0430, ahmadkhorrami wrote:
> 
>> Hi,
>> 
>> Could you give me some hints about where the actual problem takes 
>> place? Is
>> the problem with "Perf" or the hardware part (i.e., "Hardware 
>> Performance
>> Counters")? Can I revise the problem by simply modifying the code? How 
>> much
>> work is needed?
> 
> heya,
> might be some callchain processing bug, but I can't reproduce it on my 
> setup..
> would you have/make some simple example that would reproduce the issue?
> 
> Another option is that you'd send perf.data together with 'perf 
> archive' data.
> 
> Also.. we support 2 dwarf unwinders (libunwind/libdw).. not sure which 
> one you
> have compiled in, but would be helpful to see if the other shows the 
> same.
> 
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 15:46             ` Jiri Olsa
  2020-03-25 18:54               ` ahmadkhorrami
@ 2020-03-25 18:58               ` Arnaldo Carvalho de Melo
  2020-03-25 19:10                 ` ahmadkhorrami
  1 sibling, 1 reply; 67+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-03-25 18:58 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: ahmadkhorrami, Steven Rostedt, Linux-trace Users, Peter Zijlstra,
	linux-trace-users-owner, Jin Yao

Em Wed, Mar 25, 2020 at 04:46:43PM +0100, Jiri Olsa escreveu:
> On Wed, Mar 25, 2020 at 07:48:39PM +0430, ahmadkhorrami wrote:
> > Hi,
> > 
> > Could you give me some hints about where the actual problem takes place? Is
> > the problem with "Perf" or the hardware part (i.e., "Hardware Performance
> > Counters")? Can I revise the problem by simply modifying the code? How much
> > work is needed?
> 
> heya,
> might be some callchain processing bug, but I can't reproduce it on my setup..
> would you have/make some simple example that would reproduce the issue?
> 
> Another option is that you'd send perf.data together with 'perf archive' data.
> 
> Also.. we support 2 dwarf unwinders (libunwind/libdw).. not sure which one you
> have compiled in, but would be helpful to see if the other shows the same.

perf -vv

+

ldd `which perf`

Output will help us find out which unwinder is being used, as well as
the version of perf being used.
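
As a quick check, assuming the perf in your PATH is the one you record
with, you can also filter the ldd output for the unwind libraries it was
linked against:

	ldd `which perf` | grep -E 'libunwind|libdw'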

- Arnaldo


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 18:58               ` Arnaldo Carvalho de Melo
@ 2020-03-25 19:10                 ` ahmadkhorrami
  2020-03-25 19:28                   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-25 19:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Steven Rostedt, Linux-trace Users, Peter Zijlstra,
	linux-trace-users-owner, Jin Yao

Thanks. But should I attach the files to the e-mail?

On 2020-03-25 23:28, Arnaldo Carvalho de Melo wrote:

> Em Wed, Mar 25, 2020 at 04:46:43PM +0100, Jiri Olsa escreveu: On Wed, 
> Mar 25, 2020 at 07:48:39PM +0430, ahmadkhorrami wrote: Hi,
> 
> Could you give me some hints about where the actual problem takes 
> place? Is
> the problem with "Perf" or the hardware part (i.e., "Hardware 
> Performance
> Counters")? Can I revise the problem by simply modifying the code? How 
> much
> work is needed?
> heya,
> might be some callchain processing bug, but I can't reproduce it on my 
> setup..
> would you have/make some simple example that would reproduce the issue?
> 
> Another option is that you'd send perf.data together with 'perf 
> archive' data.
> 
> Also.. we support 2 dwarf unwinders (libunwind/libdw).. not sure which 
> one you
> have compiled in, but would be helpful to see if the other shows the 
> same.

perf -vv

+

ldd `which perf`

Output will help us find out which unwinder is being used, as well as
the version of perf being used.

- Arnaldo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 19:10                 ` ahmadkhorrami
@ 2020-03-25 19:28                   ` Arnaldo Carvalho de Melo
  2020-03-25 20:01                     ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-03-25 19:28 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Jiri Olsa, Steven Rostedt, Linux-trace Users, Peter Zijlstra,
	linux-trace-users-owner, Jin Yao

Em Wed, Mar 25, 2020 at 11:40:36PM +0430, ahmadkhorrami escreveu:
> Thanks. But should I attach the files to the e-mail?

Which files?

I just want you to run the commands and send us the output from them,
like I'll do here:

[acme@seventh perf]$ perf -vv
perf version 5.6.rc6.g0d33b3435253
                 dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
    dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                 glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                  gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
         syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
               libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
               libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
             libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
              libslang: [ on  ]  # HAVE_SLANG_SUPPORT
             libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
             libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
    libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                  zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                  lzma: [ on  ]  # HAVE_LZMA_SUPPORT
             get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                   bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
                   aio: [ on  ]  # HAVE_AIO_SUPPORT
                  zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
[acme@seventh perf]$ ldd ~/bin/perf
	linux-vdso.so.1 (0x00007fffcb9cc000)
	libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f7c58d94000)
	libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f7c58d7a000)
	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f7c58d51000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7c58d30000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f7c58d26000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f7c58be0000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f7c58bd8000)
	libelf.so.1 => /lib64/libelf.so.1 (0x00007f7c58bbd000)
	libdw.so.1 => /lib64/libdw.so.1 (0x00007f7c58b1e000)
	libslang.so.2 => /lib64/libslang.so.2 (0x00007f7c58846000)
	libperl.so.5.28 => /lib64/libperl.so.5.28 (0x00007f7c5851e000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f7c58358000)
	libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007f7c580ee000)
	libz.so.1 => /lib64/libz.so.1 (0x00007f7c580d4000)
	libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f7c58029000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f7c58022000)
	libnuma.so.1 => /lib64/libnuma.so.1 (0x00007f7c58014000)
	libbabeltrace-ctf.so.1 => /lib64/libbabeltrace-ctf.so.1 (0x00007f7c57fbe000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f7c57fa2000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f7c58dd3000)
	libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f7c57f8e000)
	libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f7c57f53000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00007f7c57f4e000)
	libbabeltrace.so.1 => /lib64/libbabeltrace.so.1 (0x00007f7c57f3e000)
	libpopt.so.0 => /lib64/libpopt.so.0 (0x00007f7c57f2e000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f7c57f24000)
	libgmodule-2.0.so.0 => /lib64/libgmodule-2.0.so.0 (0x00007f7c57f1e000)
	libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007f7c57dfa000)
	libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f7c57d86000)
[acme@seventh perf]$

Just like I did, no attachments please.

- Arnaldo
 
> On 2020-03-25 23:28, Arnaldo Carvalho de Melo wrote:
> 
> >Em Wed, Mar 25, 2020 at 04:46:43PM +0100, Jiri Olsa escreveu: On
> >Wed, Mar 25, 2020 at 07:48:39PM +0430, ahmadkhorrami wrote: Hi,
> >
> >Could you give me some hints about where the actual problem takes
> >place? Is
> >the problem with "Perf" or the hardware part (i.e., "Hardware
> >Performance
> >Counters")? Can I revise the problem by simply modifying the code?
> >How much
> >work is needed?
> >heya,
> >might be some callchain processing bug, but I can't reproduce it
> >on my setup..
> >would you have/make some simple example that would reproduce the issue?
> >
> >Another option is that you'd send perf.data together with 'perf
> >archive' data.
> >
> >Also.. we support 2 dwarf unwinders (libunwind/libdw).. not sure
> >which one you
> >have compiled in, but would be helpful to see if the other shows
> >the same.
> 
> perf -vv
> 
> +
> 
> ldd `which perf`
> 
> Output will help us find out which unwinder is being used, as well as
> the version of perf being used.
> 
> - Arnaldo


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 19:28                   ` Arnaldo Carvalho de Melo
@ 2020-03-25 20:01                     ` ahmadkhorrami
  2020-03-25 20:39                       ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-25 20:01 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Steven Rostedt, Linux-trace Users, Peter Zijlstra,
	linux-trace-users-owner, Jin Yao

Here you are:
perf version 5.4.7
                  dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
     dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                  glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                   gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
          syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                 libbfd: [ OFF ]  # HAVE_LIBBFD_SUPPORT
                 libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                libnuma: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
                libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
              libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
               libslang: [ on  ]  # HAVE_SLANG_SUPPORT
              libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
              libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
     libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                   zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                   lzma: [ on  ]  # HAVE_LZMA_SUPPORT
              get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                    bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
                    aio: [ on  ]  # HAVE_AIO_SUPPORT
                   zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
and
	linux-vdso.so.1 (0x00007ffe55fca000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
(0x00007f82758f9000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f82756f1000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8275353000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f827514f000)
	libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 
(0x00007f8274f35000)
	libdw.so.1 => /usr/lib/x86_64-linux-gnu/libdw.so.1 (0x00007f8274ce9000)
	libunwind-x86_64.so.8 => 
/usr/lib/x86_64-linux-gnu/libunwind-x86_64.so.8 (0x00007f8274aca000)
	libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8 
(0x00007f82748af000)
	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f8274689000)
	libslang.so.2 => /lib/x86_64-linux-gnu/libslang.so.2 
(0x00007f82741a7000)
	libperl.so.5.26 => /usr/lib/x86_64-linux-gnu/libperl.so.5.26 
(0x00007f8273daa000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f82739b9000)
	libpython2.7.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0 
(0x00007f827343c000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f827321f000)
	libzstd.so.1 => /usr/lib/x86_64-linux-gnu/libzstd.so.1 
(0x00007f8272fa4000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f8276427000)
	libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 
(0x00007f8272d94000)
	libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 
(0x00007f8272b5c000)
	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f8272959000)

Mr. Olsa said he needs the output of perf archive.
Regards.

On 2020-03-25 23:58, Arnaldo Carvalho de Melo wrote:

> Em Wed, Mar 25, 2020 at 11:40:36PM +0430, ahmadkhorrami escreveu:
> 
>> Thanks. But should I attach the files to the e-mail?
> 
> Which files?
> 
> I want just that you run the commands and send us the output from them,
> like I'll do here:
> 
> [acme@seventh perf]$ perf -vv
> perf version 5.6.rc6.g0d33b3435253
> dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
> dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
> glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
> gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
> syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
> libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
> libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
> libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
> numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
> libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
> libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
> libslang: [ on  ]  # HAVE_SLANG_SUPPORT
> libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
> libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
> libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
> zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
> lzma: [ on  ]  # HAVE_LZMA_SUPPORT
> get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
> bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
> aio: [ on  ]  # HAVE_AIO_SUPPORT
> zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
> [acme@seventh perf]$ ldd ~/bin/perf
> linux-vdso.so.1 (0x00007fffcb9cc000)
> libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 
> (0x00007f7c58d94000)
> libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f7c58d7a000)
> liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f7c58d51000)
> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7c58d30000)
> librt.so.1 => /lib64/librt.so.1 (0x00007f7c58d26000)
> libm.so.6 => /lib64/libm.so.6 (0x00007f7c58be0000)
> libdl.so.2 => /lib64/libdl.so.2 (0x00007f7c58bd8000)
> libelf.so.1 => /lib64/libelf.so.1 (0x00007f7c58bbd000)
> libdw.so.1 => /lib64/libdw.so.1 (0x00007f7c58b1e000)
> libslang.so.2 => /lib64/libslang.so.2 (0x00007f7c58846000)
> libperl.so.5.28 => /lib64/libperl.so.5.28 (0x00007f7c5851e000)
> libc.so.6 => /lib64/libc.so.6 (0x00007f7c58358000)
> libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007f7c580ee000)
> libz.so.1 => /lib64/libz.so.1 (0x00007f7c580d4000)
> libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f7c58029000)
> libcap.so.2 => /lib64/libcap.so.2 (0x00007f7c58022000)
> libnuma.so.1 => /lib64/libnuma.so.1 (0x00007f7c58014000)
> libbabeltrace-ctf.so.1 => /lib64/libbabeltrace-ctf.so.1 
> (0x00007f7c57fbe000)
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f7c57fa2000)
> /lib64/ld-linux-x86-64.so.2 (0x00007f7c58dd3000)
> libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f7c57f8e000)
> libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f7c57f53000)
> libutil.so.1 => /lib64/libutil.so.1 (0x00007f7c57f4e000)
> libbabeltrace.so.1 => /lib64/libbabeltrace.so.1 (0x00007f7c57f3e000)
> libpopt.so.0 => /lib64/libpopt.so.0 (0x00007f7c57f2e000)
> libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f7c57f24000)
> libgmodule-2.0.so.0 => /lib64/libgmodule-2.0.so.0 (0x00007f7c57f1e000)
> libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007f7c57dfa000)
> libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f7c57d86000)
> [acme@seventh perf]$
> 
> Just like a did, no attachments please.
> 
> - Arnaldo
> On 2020-03-25 23:28, Arnaldo Carvalho de Melo wrote:
> 
> Em Wed, Mar 25, 2020 at 04:46:43PM +0100, Jiri Olsa escreveu: On
> Wed, Mar 25, 2020 at 07:48:39PM +0430, ahmadkhorrami wrote: Hi,
> 
> Could you give me some hints about where the actual problem takes
> place? Is
> the problem with "Perf" or the hardware part (i.e., "Hardware
> Performance
> Counters")? Can I revise the problem by simply modifying the code?
> How much
> work is needed?
> heya,
> might be some callchain processing bug, but I can't reproduce it
> on my setup..
> would you have/make some simple example that would reproduce the issue?
> 
> Another option is that you'd send perf.data together with 'perf
> archive' data.
> 
> Also.. we support 2 dwarf unwinders (libunwind/libdw).. not sure
> which one you
> have compiled in, but would be helpful to see if the other shows
> the same.
> perf -vv
> 
> +
> 
> ldd `which perf`
> 
> Output will help us find out which unwinder is being used, as well as
> the version of perf being used.
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 20:01                     ` ahmadkhorrami
@ 2020-03-25 20:39                       ` Jiri Olsa
  2020-03-25 21:02                         ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-25 20:39 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Arnaldo Carvalho de Melo, Steven Rostedt, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

On Thu, Mar 26, 2020 at 12:31:32AM +0430, ahmadkhorrami wrote:
> Here you are:
> perf version 5.4.7
>                  dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
>     dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
>                  glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
>                   gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
>          syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
>                 libbfd: [ OFF ]  # HAVE_LIBBFD_SUPPORT
>                 libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
>                libnuma: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
> numa_num_possible_cpus: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
>                libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
>              libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
>               libslang: [ on  ]  # HAVE_SLANG_SUPPORT
>              libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
>              libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
>     libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
>                   zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
>                   lzma: [ on  ]  # HAVE_LZMA_SUPPORT
>              get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
>                    bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
>                    aio: [ on  ]  # HAVE_AIO_SUPPORT
>                   zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
> and
> 	linux-vdso.so.1 (0x00007ffe55fca000)
> 	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> (0x00007f82758f9000)
> 	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f82756f1000)
> 	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8275353000)
> 	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f827514f000)
> 	libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f8274f35000)
> 	libdw.so.1 => /usr/lib/x86_64-linux-gnu/libdw.so.1 (0x00007f8274ce9000)
> 	libunwind-x86_64.so.8 => /usr/lib/x86_64-linux-gnu/libunwind-x86_64.so.8
> (0x00007f8274aca000)
> 	libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8

libunwind, ok

> (0x00007f82748af000)
> 	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f8274689000)
> 	libslang.so.2 => /lib/x86_64-linux-gnu/libslang.so.2 (0x00007f82741a7000)
> 	libperl.so.5.26 => /usr/lib/x86_64-linux-gnu/libperl.so.5.26
> (0x00007f8273daa000)
> 	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f82739b9000)
> 	libpython2.7.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
> (0x00007f827343c000)
> 	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f827321f000)
> 	libzstd.so.1 => /usr/lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f8272fa4000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007f8276427000)
> 	libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f8272d94000)
> 	libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f8272b5c000)
> 	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f8272959000)
> 
> Mr. Olsa said he needs the output of perf archive.

Mr. Olsa did not actually try to open/close a pdf before as you described.. let me try and I'll let you know

thanks,
jirka


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 20:39                       ` Jiri Olsa
@ 2020-03-25 21:02                         ` Jiri Olsa
  2020-03-25 21:09                           ` Steven Rostedt
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-25 21:02 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Arnaldo Carvalho de Melo, Steven Rostedt, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

On Wed, Mar 25, 2020 at 09:39:27PM +0100, Jiri Olsa wrote:
> On Thu, Mar 26, 2020 at 12:31:32AM +0430, ahmadkhorrami wrote:
> > Here you are:
> > perf version 5.4.7
> >                  dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
> >     dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
> >                  glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
> >                   gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
> >          syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
> >                 libbfd: [ OFF ]  # HAVE_LIBBFD_SUPPORT
> >                 libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
> >                libnuma: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
> > numa_num_possible_cpus: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
> >                libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
> >              libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
> >               libslang: [ on  ]  # HAVE_SLANG_SUPPORT
> >              libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
> >              libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
> >     libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
> >                   zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
> >                   lzma: [ on  ]  # HAVE_LZMA_SUPPORT
> >              get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
> >                    bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
> >                    aio: [ on  ]  # HAVE_AIO_SUPPORT
> >                   zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
> > and
> > 	linux-vdso.so.1 (0x00007ffe55fca000)
> > 	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> > (0x00007f82758f9000)
> > 	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f82756f1000)
> > 	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8275353000)
> > 	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f827514f000)
> > 	libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f8274f35000)
> > 	libdw.so.1 => /usr/lib/x86_64-linux-gnu/libdw.so.1 (0x00007f8274ce9000)
> > 	libunwind-x86_64.so.8 => /usr/lib/x86_64-linux-gnu/libunwind-x86_64.so.8
> > (0x00007f8274aca000)
> > 	libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8
> 
> libunwind, ok
> 
> > (0x00007f82748af000)
> > 	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f8274689000)
> > 	libslang.so.2 => /lib/x86_64-linux-gnu/libslang.so.2 (0x00007f82741a7000)
> > 	libperl.so.5.26 => /usr/lib/x86_64-linux-gnu/libperl.so.5.26
> > (0x00007f8273daa000)
> > 	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f82739b9000)
> > 	libpython2.7.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
> > (0x00007f827343c000)
> > 	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f827321f000)
> > 	libzstd.so.1 => /usr/lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f8272fa4000)
> > 	/lib64/ld-linux-x86-64.so.2 (0x00007f8276427000)
> > 	libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f8272d94000)
> > 	libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f8272b5c000)
> > 	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f8272959000)
> > 
> > Mr. Olsa said he needs the output of perf archive.
> 
> Mr Olsa did not actualy try to open/close pdf before as you described.. let me try and I'll let you know

yea, no luck.. so if you could generate some reasonable small perf.data that
shows the issue and send it over together with 'perf archive' data privately
to me and to whoever else asks for it, so we don't pollute the list..

or if you could put it somewhere on the web/ftp.. that'd be best

thanks,
jirka


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 21:02                         ` Jiri Olsa
@ 2020-03-25 21:09                           ` Steven Rostedt
  2020-03-25 21:37                             ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Steven Rostedt @ 2020-03-25 21:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: ahmadkhorrami, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

On Wed, 25 Mar 2020 22:02:52 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> yea, no luck.. so if you could generate some reasonable small perf.data that
> shows the issue and send it over together with 'perf archive' data privately
> to me and to whoever else asks for it, so we don't pollute the list..

Right. And it may be better if you compress it too.

 xz perf.data

and attach the perf.data.xz (and only privately send it to Mr. Olsa).

-- Steve

> 
> or if you could put it somewhere on the web/ftp.. that'd be best
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 21:09                           ` Steven Rostedt
@ 2020-03-25 21:37                             ` ahmadkhorrami
  2020-03-25 21:46                               ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-25 21:37 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

Here is the link for the gmallocn() case:
https://gofile.io/?c=o95O7N
The output for the second case is big. I have a small one produced 
several days ago; its link is as follows:
https://gofile.io/?c=OIPCjx
Regards.

On 2020-03-26 01:39, Steven Rostedt wrote:

> On Wed, 25 Mar 2020 22:02:52 +0100
> Jiri Olsa <jolsa@redhat.com> wrote:
> 
>> yea, no luck.. so if you could generate some reasonable small 
>> perf.data that
>> shows the issue and send it over together with 'perf archive' data 
>> privately
>> to me and to whoever else ask for it, so we don't polute the list..
> 
> Right. And it may be better if you compress it too.
> 
> xz perf.data
> 
> and attach the perf.data.xz (and only privately send it to Mr. Olsa).
> 
> -- Steve
> 
>> or if you could put it somewhere on the web/ftp.. that'd be best

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 21:37                             ` ahmadkhorrami
@ 2020-03-25 21:46                               ` Jiri Olsa
  2020-03-25 22:21                                 ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-25 21:46 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

On Thu, Mar 26, 2020 at 02:07:39AM +0430, ahmadkhorrami wrote:
> Here is the link for the gmallocn() case:
> https://gofile.io/?c=o95O7N
> The output for the second case is big. I have a small one produced several
> days ago the link of which is as follows:
> https://gofile.io/?c=OIPCjx

looking good, but I still need you to run 'perf archive' on top of your
data and send me the perf.data.tar.bz2 it generates, like:

	[jolsa@krava perf]$ sudo ./perf record
	^C[ perf record: Woken up 1 times to write data ]
	[ perf record: Captured and wrote 1.675 MB perf.data (6248 samples) ]

	[jolsa@krava perf]$ sudo ./perf archive
	Now please run:

	$ tar xvf perf.data.tar.bz2 -C ~/.debug

	wherever you need to run 'perf report' on.

I need that perf.data.tar.bz2 generated from your data

thanks,
jirka

> Regards.
> 
> On 2020-03-26 01:39, Steven Rostedt wrote:
> 
> > On Wed, 25 Mar 2020 22:02:52 +0100
> > Jiri Olsa <jolsa@redhat.com> wrote:
> > 
> > > yea, no luck.. so if you could generate some reasonable small
> > > perf.data that
> > > shows the issue and send it over together with 'perf archive' data
> > > privately
> > > to me and to whoever else ask for it, so we don't polute the list..
> > 
> > Right. And it may be better if you compress it too.
> > 
> > xz perf.data
> > 
> > and attach the perf.data.xz (and only privately send it to Mr. Olsa).
> > 
> > -- Steve
> > 
> > > or if you could put it somewhere on the web/ftp.. that'd be best
> 


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 21:46                               ` Jiri Olsa
@ 2020-03-25 22:21                                 ` ahmadkhorrami
  2020-03-25 23:09                                   ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-25 22:21 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

Here is the link for the gmallocn()s:
http://gofile.io/?c=qk6oXv
I will send the second one as soon as the upload is finished.

Regards.

On 2020-03-26 02:16, Jiri Olsa wrote:

> On Thu, Mar 26, 2020 at 02:07:39AM +0430, ahmadkhorrami wrote:
> 
>> Here is the link for the gmallocn() case:
>> https://gofile.io/?c=o95O7N
>> The output for the second case is big. I have a small one produced 
>> several
>> days ago the link of which is as follows:
>> https://gofile.io/?c=OIPCjx
> 
> looking good, but I still need you t run 'perf archive' on top of your
> data and send me the perf.data.tar.bz2 it generates, like:
> 
> [jolsa@krava perf]$ sudo ./perf record
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.675 MB perf.data (6248 samples) ]
> 
> [jolsa@krava perf]$ sudo ./perf archive
> Now please run:
> 
> $ tar xvf perf.data.tar.bz2 -C ~/.debug
> 
> wherever you need to run 'perf report' on.
> 
> I need that perf.data.tar.bz2 generated from your data
> 
> thanks,
> jirka
> 
> Regards.
> 
> On 2020-03-26 01:39, Steven Rostedt wrote:
> 
> On Wed, 25 Mar 2020 22:02:52 +0100
> Jiri Olsa <jolsa@redhat.com> wrote:
> 
> yea, no luck.. so if you could generate some reasonable small
> perf.data that
> shows the issue and send it over together with 'perf archive' data
> privately
> to me and to whoever else ask for it, so we don't polute the list..
> Right. And it may be better if you compress it too.
> 
> xz perf.data
> 
> and attach the perf.data.xz (and only privately send it to Mr. Olsa).
> 
> -- Steve
> 
> or if you could put it somewhere on the web/ftp.. that'd be best

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 22:21                                 ` ahmadkhorrami
@ 2020-03-25 23:09                                   ` ahmadkhorrami
  2020-03-26  9:59                                     ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-25 23:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao

And here is the second one:
https://gofile.io/?c=oGxgSM
Regards
On 2020-03-26 02:51, ahmadkhorrami wrote:

> Here is the link for the gmallocn()s:
> http://gofile.io/?c=qk6oXv
> I will send the second one as soon as the upload is finished:
> 
> Regards.
> 
> On 2020-03-26 02:16, Jiri Olsa wrote:
> 
> On Thu, Mar 26, 2020 at 02:07:39AM +0430, ahmadkhorrami wrote:
> 
> Here is the link for the gmallocn() case:
> https://gofile.io/?c=o95O7N
> The output for the second case is big. I have a small one produced 
> several
> days ago the link of which is as follows:
> https://gofile.io/?c=OIPCjx
> looking good, but I still need you t run 'perf archive' on top of your
> data and send me the perf.data.tar.bz2 it generates, like:
> 
> [jolsa@krava perf]$ sudo ./perf record
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.675 MB perf.data (6248 samples) ]
> 
> [jolsa@krava perf]$ sudo ./perf archive
> Now please run:
> 
> $ tar xvf perf.data.tar.bz2 -C ~/.debug
> 
> wherever you need to run 'perf report' on.
> 
> I need that perf.data.tar.bz2 generated from your data
> 
> thanks,
> jirka
> 
> Regards.
> 
> On 2020-03-26 01:39, Steven Rostedt wrote:
> 
> On Wed, 25 Mar 2020 22:02:52 +0100
> Jiri Olsa <jolsa@redhat.com> wrote:
> 
> yea, no luck.. so if you could generate some reasonable small
> perf.data that
> shows the issue and send it over together with 'perf archive' data
> privately
> to me and to whoever else ask for it, so we don't polute the list..
> Right. And it may be better if you compress it too.
> 
> xz perf.data
> 
> and attach the perf.data.xz (and only privately send it to Mr. Olsa).
> 
> -- Steve
> 
> or if you could put it somewhere on the web/ftp.. that'd be best

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-25 23:09                                   ` ahmadkhorrami
@ 2020-03-26  9:59                                     ` Jiri Olsa
  2020-03-26 13:20                                       ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-26  9:59 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

On Thu, Mar 26, 2020 at 03:39:31AM +0430, ahmadkhorrami wrote:
> An here is the second one:
> https://gofile.io/?c=oGxgSM

thanks, so far I don't see that, but I think it's because
the 'inline' code does not resolve the libc dso correctly,
CC-ing a few other folks..

so I'm able to get:

	EvJobScheduler 17382 13006.872877:      10000 mem_load_uops_retired.l3_miss:uppp:     7fffd2e06588         5080022 N/A|SNP N/A|TLB N/A|LCK N/A
		    7ffff4b04c74 [unknown] (/lib/x86_64-linux-gnu/libc-2.27.so)
		    7ffff4b072ec malloc+0x27c (/lib/x86_64-linux-gnu/libc-2.27.so)
		    7fffd9872ddd gmalloc+0xd (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
		    7fffd9873391 copyString+0x11 (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)

while you see:

	EvJobScheduler 17382 13006.872877:      10000 mem_load_uops_retired.l3_miss:uppp:     7fffd2e06588         5080022 N/A|SNP N/A|TLB N/A|LCK N/A
		    7ffff4b04c74 _int_malloc+0x9a4 (/lib/x86_64-linux-gnu/libc-2.27.so)
		    7ffff4b072ec __GI___libc_malloc+0x27c (inlined)
		    7fffd9872ddd gmalloc+0xd (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
		    7fffd9872ddd gmalloc+0xd (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)

so for some reason I don't resolve 7ffff4b04c74, which might
be the reason I don't see the following address twice as you do:

	    7fffd9872ddd gmalloc+0xd (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
	    7fffd9872ddd gmalloc+0xd (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)

the previous field (7ffff4b072ec) is resolved by the
'inline' code, so I wonder if it's related

I needed to make some changes to utils/srcline.c to be able
to properly open dso via buildid cache, I pushed it to:

  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/callchain

here are the steps to get above output:

  download perf.data and perf archive data from:
    https://gofile.io/?c=o95O7N
    http://gofile.io/?c=qk6oXv

  $ unxz ./perf.data.xz
  $ tar xvf perf.data.tar.bz2 -C ~/.debug

compile perf from the above tree/branch and run:

  $ perf script -i perf.data

I think we might be missing some libraries in 'perf archive'
by not following the debug_link section; will check on this
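
As a side note, whether a DSO points to separate debug info via a debug
link can be checked with readelf (assuming the usual .gnu_debuglink
section name used by distro packages), e.g. for the libc seen above:

	readelf -p .gnu_debuglink /lib/x86_64-linux-gnu/libc-2.27.so

If that prints a .debug file name, 'perf archive' would also need that
file to fully resolve symbols and inlines.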

jirka


> Regards
> On 2020-03-26 02:51, ahmadkhorrami wrote:
> 
> > Here is the link for the gmallocn()s:
> > http://gofile.io/?c=qk6oXv
> > I will send the second one as soon as the upload is finished:
> > 
> > Regards.
> > 
> > On 2020-03-26 02:16, Jiri Olsa wrote:
> > 
> > On Thu, Mar 26, 2020 at 02:07:39AM +0430, ahmadkhorrami wrote:
> > 
> > Here is the link for the gmallocn() case:
> > https://gofile.io/?c=o95O7N
> > The output for the second case is big. I have a small one produced
> > several
> > days ago the link of which is as follows:
> > https://gofile.io/?c=OIPCjx
> > looking good, but I still need you t run 'perf archive' on top of your
> > data and send me the perf.data.tar.bz2 it generates, like:
> > 
> > [jolsa@krava perf]$ sudo ./perf record
> > ^C[ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 1.675 MB perf.data (6248 samples) ]
> > 
> > [jolsa@krava perf]$ sudo ./perf archive
> > Now please run:
> > 
> > $ tar xvf perf.data.tar.bz2 -C ~/.debug
> > 
> > wherever you need to run 'perf report' on.
> > 
> > I need that perf.data.tar.bz2 generated from your data
> > 
> > thanks,
> > jirka
> > 
> > Regards.
> > 
> > On 2020-03-26 01:39, Steven Rostedt wrote:
> > 
> > On Wed, 25 Mar 2020 22:02:52 +0100
> > Jiri Olsa <jolsa@redhat.com> wrote:
> > 
> > yea, no luck.. so if you could generate some reasonable small
> > perf.data that
> > shows the issue and send it over together with 'perf archive' data
> > privately
> > to me and to whoever else ask for it, so we don't polute the list..
> > Right. And it may be better if you compress it too.
> > 
> > xz perf.data
> > 
> > and attach the perf.data.xz (and only privately send it to Mr. Olsa).
> > 
> > -- Steve
> > 
> > or if you could put it somewhere on the web/ftp.. that'd be best
> 


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-26  9:59                                     ` Jiri Olsa
@ 2020-03-26 13:20                                       ` ahmadkhorrami
  2020-03-26 15:39                                         ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-26 13:20 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

Hi,
First of all, many thanks for your time. Did you say that the first file 
has problems?

The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s 
while the second (https://gofile.io/?c=oGxgSM) also has problems with 
unmatched (not necessarily repeated) function calls. I am not sure if 
the kernel for the second one is 5.4.7 or the generic Ubuntu kernel. But 
the first one is certainly 5.4.7. Just to be clear, there were many 
instances of these unmatched <caller, callee> pairs.
I have a simple python script that checks for this situation. It 
disassembles functions using GDB and checks the (directly called) target 
of each caller. I will put some comments in the script and upload it. 
Could you check to see if the python script detects any mismatches in 
your backtraces? It takes the perf script output file as input. I will 
upload the script in an hour.
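
For a suspect frame such as the repeated gmalloc+0xd in libpoppler, the
GDB step amounts to something like (function name, offset and library
taken from the backtrace line; paths as on my machine):

  gdb --batch -ex 'disassemble gmalloc' \
      /usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0 | grep call

The script then checks whether any reported call target is the function
itself (or, for the unmatched cases, the callee named in the next line
of the backtrace).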

Regards.

On 2020-03-26 14:29, Jiri Olsa wrote:

> On Thu, Mar 26, 2020 at 03:39:31AM +0430, ahmadkhorrami wrote:
> 
>> An here is the second one:
>> https://gofile.io/?c=oGxgSM
> 
> thanks, so far I don't see that, but I think it's because
> the 'inline' code does not resolve the libc dso correctly,
> CC-ing few other folks..
> 
> so I'm able to get:
> 
> EvJobScheduler 17382 13006.872877:      10000 
> mem_load_uops_retired.l3_miss:uppp:     7fffd2e06588         5080022 
> N/A|SNP N/A|TLB N/A|LCK N/A
> 7ffff4b04c74 [unknown] (/lib/x86_64-linux-gnu/libc-2.27.so)
> 7ffff4b072ec malloc+0x27c (/lib/x86_64-linux-gnu/libc-2.27.so)
> 7fffd9872ddd gmalloc+0xd 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffd9873391 copyString+0x11 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 
> while you see:
> 
> EvJobScheduler 17382 13006.872877:      10000 
> mem_load_uops_retired.l3_miss:uppp:     7fffd2e06588         5080022 
> N/A|SNP N/A|TLB N/A|LCK N/A
> 7ffff4b04c74 _int_malloc+0x9a4 (/lib/x86_64-linux-gnu/libc-2.27.so)
> 7ffff4b072ec __GI___libc_malloc+0x27c (inlined)
> 7fffd9872ddd gmalloc+0xd 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffd9872ddd gmalloc+0xd 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 
> so for some reason I don't resolve 7ffff4b04c74, which might
> be the reason I don't see the following address twice as you do:
> 
> 7fffd9872ddd gmalloc+0xd 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffd9872ddd gmalloc+0xd 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 
> the previous field (7ffff4b072ec) is resolved by the
> 'inline' code, so I wonder it's related
> 
> I needed to make some changes to utils/srcline.c to be able
> to properly open dso via buildid cache, I pushed it to:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> perf/callchain
> 
> here are the steps to get above output:
> 
> download perf.data and perf archive data from:
> https://gofile.io/?c=o95O7N
> http://gofile.io/?c=qk6oXv
> 
> $ unxz ./perf.data.xz
> $ tar xvf perf.data.tar.bz2 -C ~/.debug
> 
> compile perf from aobve tree/branch and run:
> 
> $ perf script -i perf.data
> 
> I think we might be missing some libraries in 'perf archive'
> by not following debug_link section, will check on this
> 
> jirka
> 
> Regards
> On 2020-03-26 02:51, ahmadkhorrami wrote:
> 
> Here is the link for the gmallocn()s:
> http://gofile.io/?c=qk6oXv
> I will send the second one as soon as the upload is finished:
> 
> Regards.
> 
> On 2020-03-26 02:16, Jiri Olsa wrote:
> 
> On Thu, Mar 26, 2020 at 02:07:39AM +0430, ahmadkhorrami wrote:
> 
> Here is the link for the gmallocn() case:
> https://gofile.io/?c=o95O7N
> The output for the second case is big. I have a small one produced
> several
> days ago the link of which is as follows:
> https://gofile.io/?c=OIPCjx
> looking good, but I still need you t run 'perf archive' on top of your
> data and send me the perf.data.tar.bz2 it generates, like:
> 
> [jolsa@krava perf]$ sudo ./perf record
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.675 MB perf.data (6248 samples) ]
> 
> [jolsa@krava perf]$ sudo ./perf archive
> Now please run:
> 
> $ tar xvf perf.data.tar.bz2 -C ~/.debug
> 
> wherever you need to run 'perf report' on.
> 
> I need that perf.data.tar.bz2 generated from your data
> 
> thanks,
> jirka
> 
> Regards.
> 
> On 2020-03-26 01:39, Steven Rostedt wrote:
> 
> On Wed, 25 Mar 2020 22:02:52 +0100
> Jiri Olsa <jolsa@redhat.com> wrote:
> 
> yea, no luck.. so if you could generate some reasonable small
> perf.data that
> shows the issue and send it over together with 'perf archive' data
> privately
> to me and to whoever else ask for it, so we don't polute the list..
> Right. And it may be better if you compress it too.
> 
> xz perf.data
> 
> and attach the perf.data.xz (and only privately send it to Mr. Olsa).
> 
> -- Steve
> 
> or if you could put it somewhere on the web/ftp.. that'd be best

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-26 13:20                                       ` ahmadkhorrami
@ 2020-03-26 15:39                                         ` Jiri Olsa
  2020-03-26 18:19                                           ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-26 15:39 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
> Hi,
> First of all, many thanks for your time. Did you say that the first file has
> problems?
> 
> The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s while
> the second (https://gofile.io/?c=oGxgSM) also has problems with unmatched
> (not necessarily repeated) function calls. I am not sure if the kernel for
> the second one is 5.4.7 or the generic Ubuntu kernel. But the first one is
> certainly 5.4.7. Just to be clear, there were many instances of these
> unmatched <caller, callees>.

I can see all the files, but I just can't see the issue yet;
it's probably because of issues with perf archive..

let's see if somebody else can chime in

> I have a simple python script that checks for this situation. It
> disassembles functions using GDB and checks the (directly called) target of
> each caller. I will put some comments in the script and upload it. Could you
> check to see if the python script detects any mismatches in your backtraces?
> It takes the perf script output file as input. I will upload the script in
> an hour.

ok

jirka


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-26 15:39                                         ` Jiri Olsa
@ 2020-03-26 18:19                                           ` ahmadkhorrami
  2020-03-26 18:21                                             ` ahmadkhorrami
  2020-03-27  9:20                                             ` Jiri Olsa
  0 siblings, 2 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-26 18:19 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

Hi,
Here is the link for the python script:
https://gofile.io/?c=1ZSLwe
It is written in python-3 and takes the perf script output as input.
It looks for consecutive repeated backtrace lines and checks if the 
function in these lines calls itself at the offset in the line (i.e., 
checks if recursion is possible). If that is not possible, it reports an error. 
Could you check to see if any error is detected in your outputs, please?
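For reference, here is a rough sketch of the kind of check it performs (an
illustrative re-implementation, not the uploaded script; it assumes the plain
"addr symbol+0xoff (dso)" frame format of perf script):

```python
#!/usr/bin/env python3
# Illustrative sketch only: flag consecutive identical frames whose function
# cannot actually call itself (i.e. where recursion is impossible).
import re
import subprocess
import sys

FRAME_RE = re.compile(r'^\s*[0-9a-f]+\s+(\S+)\+0x([0-9a-f]+)\s+\((\S+)\)')

def calls_itself(dso, symbol):
    """Disassemble `symbol` in `dso` with GDB and report whether it contains
    a direct call back to itself."""
    out = subprocess.run(
        ['gdb', '-batch', '-ex', 'file ' + dso, '-ex', 'disass ' + symbol],
        capture_output=True, text=True).stdout
    return any('call' in l and '<' + symbol in l for l in out.splitlines())

prev = None
for line in open(sys.argv[1]):          # argument: perf script output file
    m = FRAME_RE.match(line)
    frame = m.groups() if m else None
    if frame and frame == prev:
        symbol, offset, dso = frame
        if not calls_itself(dso, symbol):
            print('suspicious repeat: %s+0x%s in %s' % (symbol, offset, dso))
    prev = frame
```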
Regards.
On 2020-03-26 20:09, Jiri Olsa wrote:

> On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
> 
>> Hi,
>> First of all, many thanks for your time. Did you say that the first 
>> file has
>> problems?
>> 
>> The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s 
>> while
>> the second (https://gofile.io/?c=oGxgSM) also has problems with 
>> unmatched
>> (not necessarily repeated) function calls. I am not sure if the kernel 
>> for
>> the second one is 5.4.7 or the generic Ubuntu kernel. But the first 
>> one is
>> certainly 5.4.7. Just to be clear, there were many instances of these
>> unmatched <caller, callees>.
> 
> I can se all the files, but I just can't see the issue yet
> but it's probably because of issues with perf archive..
> 
> let's see if somebody else can chime in
> 
>> I have a simple python script that checks for this situation. It
>> disassembles functions using GDB and checks the (directly called) 
>> target of
>> each caller. I will put some comments in the script and upload it. 
>> Could you
>> check to see if the python script detects any mismatches in your 
>> backtraces?
>> It takes the perf script output file as input. I will upload the 
>> script in
>> an hour.
> 
> ok
> 
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-26 18:19                                           ` ahmadkhorrami
@ 2020-03-26 18:21                                             ` ahmadkhorrami
  2020-03-27  9:20                                             ` Jiri Olsa
  1 sibling, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-26 18:21 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

Sorry. GDB should disassemble functions so that the names are not 
demangled.

On 2020-03-26 22:49, ahmadkhorrami wrote:

> Hi,
> Here is the link for the python script:
> https://gofile.io/?c=1ZSLwe
> It is written in python-3 and takes the perf script output as input.
> It looks for consecutive repeated backtrace lines and checks if the 
> function in these lines calls itself at the offset in the line (i.e., 
> checks if recursion is possible). If not possible it reports an error. 
> Could you check to see if any error is detected in your outputs, 
> please?
> Regards.
> On 2020-03-26 20:09, Jiri Olsa wrote:
> 
> On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
> 
> Hi,
> First of all, many thanks for your time. Did you say that the first 
> file has
> problems?
> 
> The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s 
> while
> the second (https://gofile.io/?c=oGxgSM) also has problems with 
> unmatched
> (not necessarily repeated) function calls. I am not sure if the kernel 
> for
> the second one is 5.4.7 or the generic Ubuntu kernel. But the first one 
> is
> certainly 5.4.7. Just to be clear, there were many instances of these
> unmatched <caller, callees>.
> I can se all the files, but I just can't see the issue yet
> but it's probably because of issues with perf archive..
> 
> let's see if somebody else can chime in
> 
> I have a simple python script that checks for this situation. It
> disassembles functions using GDB and checks the (directly called) 
> target of
> each caller. I will put some comments in the script and upload it. 
> Could you
> check to see if the python script detects any mismatches in your 
> backtraces?
> It takes the perf script output file as input. I will upload the script 
> in
> an hour.
> ok
> 
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-26 18:19                                           ` ahmadkhorrami
  2020-03-26 18:21                                             ` ahmadkhorrami
@ 2020-03-27  9:20                                             ` Jiri Olsa
  2020-03-27 10:59                                               ` ahmadkhorrami
  1 sibling, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-27  9:20 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

On Thu, Mar 26, 2020 at 10:49:12PM +0430, ahmadkhorrami wrote:
> Hi,
> Here is the link for the python script:
> https://gofile.io/?c=1ZSLwe
> It is written in python-3 and takes the perf script output as input.
> It looks for consecutive repeated backtrace lines and checks if the function
> in these lines calls itself at the offset in the line (i.e., checks if
> recursion is possible). If not possible it reports an error. Could you check
> to see if any error is detected in your outputs, please?

I'm getting tons of the following output:

/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30: No such file or directory.
No symbol table is loaded.  Use the "file" command.
            7ffff71b9bc1 gtk_css_node_invalidate_timestamp+0x31 (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30)

I assume it's because I have all the dso binaries stored
under the .build-id path, while you check the file name printed in the output

jirka

> Regards.
> On 2020-03-26 20:09, Jiri Olsa wrote:
> 
> > On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
> > 
> > > Hi,
> > > First of all, many thanks for your time. Did you say that the first
> > > file has
> > > problems?
> > > 
> > > The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s
> > > while
> > > the second (https://gofile.io/?c=oGxgSM) also has problems with
> > > unmatched
> > > (not necessarily repeated) function calls. I am not sure if the
> > > kernel for
> > > the second one is 5.4.7 or the generic Ubuntu kernel. But the first
> > > one is
> > > certainly 5.4.7. Just to be clear, there were many instances of these
> > > unmatched <caller, callees>.
> > 
> > I can se all the files, but I just can't see the issue yet
> > but it's probably because of issues with perf archive..
> > 
> > let's see if somebody else can chime in
> > 
> > > I have a simple python script that checks for this situation. It
> > > disassembles functions using GDB and checks the (directly called)
> > > target of
> > > each caller. I will put some comments in the script and upload it.
> > > Could you
> > > check to see if the python script detects any mismatches in your
> > > backtraces?
> > > It takes the perf script output file as input. I will upload the
> > > script in
> > > an hour.
> > 
> > ok
> > 
> > jirka
> 


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27  9:20                                             ` Jiri Olsa
@ 2020-03-27 10:59                                               ` ahmadkhorrami
  2020-03-27 11:04                                                 ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-27 10:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

Hi,
Thanks! Could you tell me what should be changed in order to make the 
code runnable on your system, if it is possible?
Regards.

On 2020-03-27 13:50, Jiri Olsa wrote:

> On Thu, Mar 26, 2020 at 10:49:12PM +0430, ahmadkhorrami wrote:
> 
>> Hi,
>> Here is the link for the python script:
>> https://gofile.io/?c=1ZSLwe
>> It is written in python-3 and takes the perf script output as input.
>> It looks for consecutive repeated backtrace lines and checks if the 
>> function
>> in these lines calls itself at the offset in the line (i.e., checks if
>> recursion is possible). If not possible it reports an error. Could you 
>> check
>> to see if any error is detected in your outputs, please?
> 
> I'm getting tons of following output:
> 
> /usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30: No such file or 
> directory.
> No symbol table is loaded.  Use the "file" command.
> 7ffff71b9bc1 gtk_css_node_invalidate_timestamp+0x31 
> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30)
> 
> I assume it's because I have all the dso binaries stored
> under .biuldid path, while you check the output name
> 
> jirka
> 
> Regards.
> On 2020-03-26 20:09, Jiri Olsa wrote:
> 
> On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
> 
> Hi,
> First of all, many thanks for your time. Did you say that the first
> file has
> problems?
> 
> The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s
> while
> the second (https://gofile.io/?c=oGxgSM) also has problems with
> unmatched
> (not necessarily repeated) function calls. I am not sure if the
> kernel for
> the second one is 5.4.7 or the generic Ubuntu kernel. But the first
> one is
> certainly 5.4.7. Just to be clear, there were many instances of these
> unmatched <caller, callees>.
> I can se all the files, but I just can't see the issue yet
> but it's probably because of issues with perf archive..
> 
> let's see if somebody else can chime in
> 
> I have a simple python script that checks for this situation. It
> disassembles functions using GDB and checks the (directly called)
> target of
> each caller. I will put some comments in the script and upload it.
> Could you
> check to see if the python script detects any mismatches in your
> backtraces?
> It takes the perf script output file as input. I will upload the
> script in
> an hour.
> ok
> 
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 10:59                                               ` ahmadkhorrami
@ 2020-03-27 11:04                                                 ` ahmadkhorrami
  2020-03-27 12:10                                                   ` Milian Wolff
  2020-03-27 18:43                                                   ` ahmadkhorrami
  0 siblings, 2 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-27 11:04 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

I do the following:
If this line is in the perf script backtrace:
             7f21ffe256db g_main_context_iteration+0x2b 
(/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
I run the following command:
gdb -batch -ex 'file /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4' 
-ex 'disass g_main_context_iteration'.

Regards.

On 2020-03-27 15:29, ahmadkhorrami wrote:

> Hi,
> Thanks! Could you tell me what should be changed in order to make the 
> code runnable on your system, if it is possible?
> Regards.
> 
> On 2020-03-27 13:50, Jiri Olsa wrote:
> 
> On Thu, Mar 26, 2020 at 10:49:12PM +0430, ahmadkhorrami wrote:
> 
> Hi,
> Here is the link for the python script:
> https://gofile.io/?c=1ZSLwe
> It is written in python-3 and takes the perf script output as input.
> It looks for consecutive repeated backtrace lines and checks if the 
> function
> in these lines calls itself at the offset in the line (i.e., checks if
> recursion is possible). If not possible it reports an error. Could you 
> check
> to see if any error is detected in your outputs, please?
> I'm getting tons of following output:
> 
> /usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30: No such file or 
> directory.
> No symbol table is loaded.  Use the "file" command.
> 7ffff71b9bc1 gtk_css_node_invalidate_timestamp+0x31 
> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30)
> 
> I assume it's because I have all the dso binaries stored
> under .biuldid path, while you check the output name
> 
> jirka
> 
> Regards.
> On 2020-03-26 20:09, Jiri Olsa wrote:
> 
> On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
> 
> Hi,
> First of all, many thanks for your time. Did you say that the first
> file has
> problems?
> 
> The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s
> while
> the second (https://gofile.io/?c=oGxgSM) also has problems with
> unmatched
> (not necessarily repeated) function calls. I am not sure if the
> kernel for
> the second one is 5.4.7 or the generic Ubuntu kernel. But the first
> one is
> certainly 5.4.7. Just to be clear, there were many instances of these
> unmatched <caller, callees>.
> I can se all the files, but I just can't see the issue yet
> but it's probably because of issues with perf archive..
> 
> let's see if somebody else can chime in
> 
> I have a simple python script that checks for this situation. It
> disassembles functions using GDB and checks the (directly called)
> target of
> each caller. I will put some comments in the script and upload it.
> Could you
> check to see if the python script detects any mismatches in your
> backtraces?
> It takes the perf script output file as input. I will upload the
> script in
> an hour.
> ok
> 
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 11:04                                                 ` ahmadkhorrami
@ 2020-03-27 12:10                                                   ` Milian Wolff
  2020-03-27 12:58                                                     ` ahmadkhorrami
  2020-03-27 18:43                                                   ` ahmadkhorrami
  1 sibling, 1 reply; 67+ messages in thread
From: Milian Wolff @ 2020-03-27 12:10 UTC (permalink / raw)
  To: Jiri Olsa, ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Namhyung Kim,
	Changbin Du, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 1714 bytes --]

On Freitag, 27. März 2020 12:04:20 CET ahmadkhorrami wrote:
> I do the following:
> If this line is in the perf script backtrace:
>              7f21ffe256db g_main_context_iteration+0x2b
> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
> I run the following command:
> gdb -batch -ex 'file /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4'
> -ex 'disass g_main_context_iteration'.

You can try to parse the output of `perf buildid-list` to get a mapping that 
can be used to run this on another person's machine. E.g.

```
$ perf buildid-list
5837b1e7495db791f9a3a56fb6ca29958da75b0c [kernel.kallsyms]
845f98f6a3019620c37a2f611b2f20c27de83d5b /home/milian/projects/kdab/rnd/hotspot/build/tests/test-clients/cpp-parallel/cpp-parallel
f6ca5853dae87d9f0503a9ef230f6d1fa15a832d /usr/lib/ld-2.30.so
92883b06055e8e21ded8eb0cd5a61f5704531152 [vdso]
8b04d1825b63d9a600e3d57ac71058935e7ad757 /usr/lib/libpthread-2.30.so
09639b80a8fad179004f2484608764d2b336dd4a /usr/lib/libstdc++.so.6.0.27
33d1f350f13728651d74dd2a56bad1e4e4648f5e /usr/lib/libc-2.30.so
$ file ~/.debug/usr/lib/ld-2.30.so/f6ca5853dae87d9f0503a9ef230f6d1fa15a832d/elf
/home/milian/.debug/usr/lib/ld-2.30.so/f6ca5853dae87d9f0503a9ef230f6d1fa15a832d/elf: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=f6ca5853dae87d9f0503a9ef230f6d1fa15a832d, not stripped
```

I'm afraid to say that I currently don't have the time required to be of more 
help in debugging this issue in general.

Good luck!
-- 
Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 12:10                                                   ` Milian Wolff
@ 2020-03-27 12:58                                                     ` ahmadkhorrami
  2020-03-27 13:25                                                       ` Milian Wolff
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-27 12:58 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Changbin Du, Andi Kleen

Hi,
Thanks Milian. So if I am right, I should do the following:
1) Run "perf buildid-list" and catch all the <file path, build-ids> 
pairs.
2) First check for the existence of the file path.
3) If failed, concatenate "~/.debug", file path, "/" and buildid and use 
it as the alternative.
I thought that GDB is smart enough to detect these situations in the 
same way that it detects debug info files.
Should I check any directory other than ~/.debug?
Regards.

On 2020-03-27 16:40, Milian Wolff wrote:

> On Freitag, 27. März 2020 12:04:20 CET ahmadkhorrami wrote:
> 
>> I do the following:
>> If this line is in the perf script backtrace:
>> 7f21ffe256db g_main_context_iteration+0x2b
>> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
>> I run the following command:
>> gdb -batch -ex 'file 
>> /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4'
>> -ex 'disass g_main_context_iteration'.
> 
> You can try to parse the output of `perf buildid-list` to get a mapping 
> that
> can be used to run this on another person's machine. E.g.
> 
> ```
> $ perf buildid-list
> 5837b1e7495db791f9a3a56fb6ca29958da75b0c [kernel.kallsyms]
> 845f98f6a3019620c37a2f611b2f20c27de83d5b 
> /home/milian/projects/kdab/rnd/
> hotspot/build/tests/test-clients/cpp-parallel/cpp-parallel
> f6ca5853dae87d9f0503a9ef230f6d1fa15a832d /usr/lib/ld-2.30.so
> 92883b06055e8e21ded8eb0cd5a61f5704531152 [vdso]
> 8b04d1825b63d9a600e3d57ac71058935e7ad757 /usr/lib/libpthread-2.30.so
> 09639b80a8fad179004f2484608764d2b336dd4a /usr/lib/libstdc++.so.6.0.27
> 33d1f350f13728651d74dd2a56bad1e4e4648f5e /usr/lib/libc-2.30.so
> $ file 
> ~/.debug/usr/lib/ld-2.30.so/f6ca5853dae87d9f0503a9ef230f6d1fa15a832d/
> elf
> /home/milian/.debug/usr/lib/ld-2.30.so/
> f6ca5853dae87d9f0503a9ef230f6d1fa15a832d/elf: ELF 64-bit LSB shared 
> object,
> x86-64, version 1 (SYSV), statically linked,
> BuildID[sha1]=f6ca5853dae87d9f0503a9ef230f6d1fa15a832d, not stripped
> ```
> 
> I'm afraid to say that I currently don't have the time required to be 
> of more
> help in debugging this issue in general.
> 
> Good luck!

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 12:58                                                     ` ahmadkhorrami
@ 2020-03-27 13:25                                                       ` Milian Wolff
  2020-03-27 13:33                                                         ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Milian Wolff @ 2020-03-27 13:25 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Changbin Du, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 2666 bytes --]

On Freitag, 27. März 2020 13:58:33 CET ahmadkhorrami wrote:
> Hi,
> Thanks Milian. So if I am right, I should do the following:
> 1) Run "perf buildid-list" and catch all the <file path, build-ids>
> pairs.
> 2) First check for the existence of the file path.
> 3) If failed, concatenate "~/.debug", file path, "/" and buildid and use
> it as the alternative.

you missed the trailing "/elf" but otherwise yes :)
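Something along these lines should work as the fallback (an untested sketch;
the ~/.debug layout is the one from the buildid-list example, and `perf` is
assumed to be in PATH):

```python
# Map each DSO path from `perf buildid-list` to its build-id, then prefer the
# on-disk path and fall back to the buildid cache entry ~/.debug/<path>/<id>/elf.
import os
import subprocess

def buildid_map():
    out = subprocess.run(['perf', 'buildid-list'],
                         capture_output=True, text=True).stdout
    pairs = (line.split(maxsplit=1) for line in out.splitlines() if line.strip())
    return {path: buildid for buildid, path in pairs}

def resolve_dso(path, mapping):
    if os.path.exists(path):
        return path
    buildid = mapping.get(path)
    if buildid:
        cached = os.path.expanduser('~/.debug' + path + '/' + buildid + '/elf')
        if os.path.exists(cached):
            return cached
    return None
```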

> I thought that GDB is smart enough to detect these situations in the
> same way that it detects debug info files.
> Should I check any directory other than ~/.debug?

Hmm good point, GDB should indeed do that by default but maybe it doesn't do 
it for the `file` command for some reason? I cannot answer this, sorry.

Cheers

> On 2020-03-27 16:40, Milian Wolff wrote:
> > On Freitag, 27. März 2020 12:04:20 CET ahmadkhorrami wrote:
> >> I do the following:
> >> If this line is in the perf script backtrace:
> >> 7f21ffe256db g_main_context_iteration+0x2b
> >> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
> >> I run the following command:
> >> gdb -batch -ex 'file
> >> /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4'
> >> -ex 'disass g_main_context_iteration'.
> > 
> > You can try to parse the output of `perf buildid-list` to get a mapping
> > that
> > can be used to run this on another person's machine. E.g.
> > 
> > ```
> > $ perf buildid-list
> > 5837b1e7495db791f9a3a56fb6ca29958da75b0c [kernel.kallsyms]
> > 845f98f6a3019620c37a2f611b2f20c27de83d5b
> > /home/milian/projects/kdab/rnd/
> > hotspot/build/tests/test-clients/cpp-parallel/cpp-parallel
> > f6ca5853dae87d9f0503a9ef230f6d1fa15a832d /usr/lib/ld-2.30.so
> > 92883b06055e8e21ded8eb0cd5a61f5704531152 [vdso]
> > 8b04d1825b63d9a600e3d57ac71058935e7ad757 /usr/lib/libpthread-2.30.so
> > 09639b80a8fad179004f2484608764d2b336dd4a /usr/lib/libstdc++.so.6.0.27
> > 33d1f350f13728651d74dd2a56bad1e4e4648f5e /usr/lib/libc-2.30.so
> > $ file
> > ~/.debug/usr/lib/ld-2.30.so/f6ca5853dae87d9f0503a9ef230f6d1fa15a832d/
> > elf
> > /home/milian/.debug/usr/lib/ld-2.30.so/
> > f6ca5853dae87d9f0503a9ef230f6d1fa15a832d/elf: ELF 64-bit LSB shared
> > object,
> > x86-64, version 1 (SYSV), statically linked,
> > BuildID[sha1]=f6ca5853dae87d9f0503a9ef230f6d1fa15a832d, not stripped
> > ```
> > 
> > I'm afraid to say that I currently don't have the time required to be
> > of more
> > help in debugging this issue in general.
> > 
> > Good luck!


-- 
Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 13:25                                                       ` Milian Wolff
@ 2020-03-27 13:33                                                         ` ahmadkhorrami
  0 siblings, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-27 13:33 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Changbin Du, Andi Kleen

Sounds good. Thanks for the help!

On 2020-03-27 17:55, Milian Wolff wrote:

> On Freitag, 27. März 2020 13:58:33 CET ahmadkhorrami wrote:
> 
>> Hi,
>> Thanks Milian. So if I am right, I should do the following:
>> 1) Run "perf buildid-list" and catch all the <file path, build-ids>
>> pairs.
>> 2) First check for the existence of the file path.
>> 3) If failed, concatenate "~/.debug", file path, "/" and buildid and 
>> use
>> it as the alternative.
> 
> you missed the trailing "/elf" but otherwise yes :)
> 
>> I thought that GDB is smart enough to detect these situations in the
>> same way that it detects debug info files.
>> Should I check any directory other than ~/.debug?
> 
> Hmm good point, GDB should indeed do that by default but maybe it 
> doesn't do
> it for the `file` command for some reason? I cannot answer this, sorry.
> 
> Cheers
> 
> On 2020-03-27 16:40, Milian Wolff wrote: On Freitag, 27. März 2020 
> 12:04:20 CET ahmadkhorrami wrote: I do the following:
> If this line is in the perf script backtrace:
> 7f21ffe256db g_main_context_iteration+0x2b
> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
> I run the following command:
> gdb -batch -ex 'file
> /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4'
> -ex 'disass g_main_context_iteration'.
> You can try to parse the output of `perf buildid-list` to get a mapping
> that
> can be used to run this on another person's machine. E.g.
> 
> ```
> $ perf buildid-list
> 5837b1e7495db791f9a3a56fb6ca29958da75b0c [kernel.kallsyms]
> 845f98f6a3019620c37a2f611b2f20c27de83d5b
> /home/milian/projects/kdab/rnd/
> hotspot/build/tests/test-clients/cpp-parallel/cpp-parallel
> f6ca5853dae87d9f0503a9ef230f6d1fa15a832d /usr/lib/ld-2.30.so
> 92883b06055e8e21ded8eb0cd5a61f5704531152 [vdso]
> 8b04d1825b63d9a600e3d57ac71058935e7ad757 /usr/lib/libpthread-2.30.so
> 09639b80a8fad179004f2484608764d2b336dd4a /usr/lib/libstdc++.so.6.0.27
> 33d1f350f13728651d74dd2a56bad1e4e4648f5e /usr/lib/libc-2.30.so
> $ file
> ~/.debug/usr/lib/ld-2.30.so/f6ca5853dae87d9f0503a9ef230f6d1fa15a832d/
> elf
> /home/milian/.debug/usr/lib/ld-2.30.so/
> f6ca5853dae87d9f0503a9ef230f6d1fa15a832d/elf: ELF 64-bit LSB shared
> object,
> x86-64, version 1 (SYSV), statically linked,
> BuildID[sha1]=f6ca5853dae87d9f0503a9ef230f6d1fa15a832d, not stripped
> ```
> 
> I'm afraid to say that I currently don't have the time required to be
> of more
> help in debugging this issue in general.
> 
> Good luck!

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 11:04                                                 ` ahmadkhorrami
  2020-03-27 12:10                                                   ` Milian Wolff
@ 2020-03-27 18:43                                                   ` ahmadkhorrami
  2020-03-27 22:37                                                     ` Jiri Olsa
  1 sibling, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-27 18:43 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

Hi,
I revised the code. Hopefully, it works. Could you check it with your 
own outputs? It is valuable for me to know if your output contains wrong 
backtraces or not. Here is the link:
https://gofile.io/?c=rz2kGc
Regards.

On 2020-03-27 15:34, ahmadkhorrami wrote:

> I do the following:
> If this line is in the perf script backtrace:
> 7f21ffe256db g_main_context_iteration+0x2b 
> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
> I run the following command:
> gdb -batch -ex 'file /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4' 
> -ex 'disass g_main_context_iteration'.
> 
> Regards.
> 
> On 2020-03-27 15:29, ahmadkhorrami wrote:
> 
>> Hi,
>> Thanks! Could you tell me what should be changed in order to make the 
>> code runnable on your system, if it is possible?
>> Regards.
>> 
>> On 2020-03-27 13:50, Jiri Olsa wrote:
>> 
>> On Thu, Mar 26, 2020 at 10:49:12PM +0430, ahmadkhorrami wrote:
>> 
>> Hi,
>> Here is the link for the python script:
>> https://gofile.io/?c=1ZSLwe
>> It is written in python-3 and takes the perf script output as input.
>> It looks for consecutive repeated backtrace lines and checks if the 
>> function
>> in these lines calls itself at the offset in the line (i.e., checks if
>> recursion is possible). If not possible it reports an error. Could you 
>> check
>> to see if any error is detected in your outputs, please?
>> I'm getting tons of following output:
>> 
>> /usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30: No such file or 
>> directory.
>> No symbol table is loaded.  Use the "file" command.
>> 7ffff71b9bc1 gtk_css_node_invalidate_timestamp+0x31 
>> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30)
>> 
>> I assume it's because I have all the dso binaries stored
>> under .biuldid path, while you check the output name
>> 
>> jirka
>> 
>> Regards.
>> On 2020-03-26 20:09, Jiri Olsa wrote:
>> 
>> On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
>> 
>> Hi,
>> First of all, many thanks for your time. Did you say that the first
>> file has
>> problems?
>> 
>> The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s
>> while
>> the second (https://gofile.io/?c=oGxgSM) also has problems with
>> unmatched
>> (not necessarily repeated) function calls. I am not sure if the
>> kernel for
>> the second one is 5.4.7 or the generic Ubuntu kernel. But the first
>> one is
>> certainly 5.4.7. Just to be clear, there were many instances of these
>> unmatched <caller, callees>.
>> I can se all the files, but I just can't see the issue yet
>> but it's probably because of issues with perf archive..
>> 
>> let's see if somebody else can chime in
>> 
>> I have a simple python script that checks for this situation. It
>> disassembles functions using GDB and checks the (directly called)
>> target of
>> each caller. I will put some comments in the script and upload it.
>> Could you
>> check to see if the python script detects any mismatches in your
>> backtraces?
>> It takes the perf script output file as input. I will upload the
>> script in
>> an hour.
>> ok
>> 
>> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 18:43                                                   ` ahmadkhorrami
@ 2020-03-27 22:37                                                     ` Jiri Olsa
  2020-03-27 23:12                                                       ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-27 22:37 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

On Fri, Mar 27, 2020 at 11:13:27PM +0430, ahmadkhorrami wrote:
> Hi,
> I revised the code. Hopefully, it works. Could you check it with your own
> outputs? It is valuable for me to know if your output contains wrong
> backtraces or not. Here is the link:
> https://gofile.io/?c=rz2kGc

will check, so far caught one.. might be same case:

evince 2168454 1549196.055094:      43831 cycles:u:
  ffffffffaec012f0 [unknown] ([unknown])
      7f0dd44776b6 __mmap64+0x26 (inlined)
      7f0dd44776b6 __mmap64+0x26 (inlined)

will try to investigate next week

thanks,
jirka

> Regards.
> 
> On 2020-03-27 15:34, ahmadkhorrami wrote:
> 
> > I do the following:
> > If this line is in the perf script backtrace:
> > 7f21ffe256db g_main_context_iteration+0x2b
> > (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
> > I run the following command:
> > gdb -batch -ex 'file /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4'
> > -ex 'disass g_main_context_iteration'.
> > 
> > Regards.
> > 
> > On 2020-03-27 15:29, ahmadkhorrami wrote:
> > 
> > > Hi,
> > > Thanks! Could you tell me what should be changed in order to make
> > > the code runnable on your system, if it is possible?
> > > Regards.
> > > 
> > > On 2020-03-27 13:50, Jiri Olsa wrote:
> > > 
> > > On Thu, Mar 26, 2020 at 10:49:12PM +0430, ahmadkhorrami wrote:
> > > 
> > > Hi,
> > > Here is the link for the python script:
> > > https://gofile.io/?c=1ZSLwe
> > > It is written in python-3 and takes the perf script output as input.
> > > It looks for consecutive repeated backtrace lines and checks if the
> > > function
> > > in these lines calls itself at the offset in the line (i.e., checks if
> > > recursion is possible). If not possible it reports an error. Could
> > > you check
> > > to see if any error is detected in your outputs, please?
> > > I'm getting tons of following output:
> > > 
> > > /usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30: No such file or
> > > directory.
> > > No symbol table is loaded.  Use the "file" command.
> > > 7ffff71b9bc1 gtk_css_node_invalidate_timestamp+0x31
> > > (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30)
> > > 
> > > I assume it's because I have all the dso binaries stored
> > > under .biuldid path, while you check the output name
> > > 
> > > jirka
> > > 
> > > Regards.
> > > On 2020-03-26 20:09, Jiri Olsa wrote:
> > > 
> > > On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
> > > 
> > > Hi,
> > > First of all, many thanks for your time. Did you say that the first
> > > file has
> > > problems?
> > > 
> > > The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s
> > > while
> > > the second (https://gofile.io/?c=oGxgSM) also has problems with
> > > unmatched
> > > (not necessarily repeated) function calls. I am not sure if the
> > > kernel for
> > > the second one is 5.4.7 or the generic Ubuntu kernel. But the first
> > > one is
> > > certainly 5.4.7. Just to be clear, there were many instances of these
> > > unmatched <caller, callees>.
> > > I can se all the files, but I just can't see the issue yet
> > > but it's probably because of issues with perf archive..
> > > 
> > > let's see if somebody else can chime in
> > > 
> > > I have a simple python script that checks for this situation. It
> > > disassembles functions using GDB and checks the (directly called)
> > > target of
> > > each caller. I will put some comments in the script and upload it.
> > > Could you
> > > check to see if the python script detects any mismatches in your
> > > backtraces?
> > > It takes the perf script output file as input. I will upload the
> > > script in
> > > an hour.
> > > ok
> > > 
> > > jirka
> 


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 22:37                                                     ` Jiri Olsa
@ 2020-03-27 23:12                                                       ` ahmadkhorrami
  2020-03-28 23:34                                                         ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-27 23:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

Hi,
Thanks. If you can point me to the potentially bogus locations in the source 
code, I will give it a try.
Regards.

On 2020-03-28 03:07, Jiri Olsa wrote:

> On Fri, Mar 27, 2020 at 11:13:27PM +0430, ahmadkhorrami wrote:
> 
>> Hi,
>> I revised the code. Hopefully, it works. Could you check it with your 
>> own
>> outputs? It is valuable for me to know if your output contains wrong
>> backtraces or not. Here is the link:
>> https://gofile.io/?c=rz2kGc
> 
> will check, so far caught one.. might be same case:
> 
> evince 2168454 1549196.055094:      43831 cycles:u:
> ffffffffaec012f0 [unknown] ([unknown])
> 7f0dd44776b6 __mmap64+0x26 (inlined)
> 7f0dd44776b6 __mmap64+0x26 (inlined)
> 
> will try to investigate next week
> 
> thanks,
> jirka
> 
> Regards.
> 
> On 2020-03-27 15:34, ahmadkhorrami wrote:
> 
> I do the following:
> If this line is in the perf script backtrace:
> 7f21ffe256db g_main_context_iteration+0x2b
> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
> I run the following command:
> gdb -batch -ex 'file /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4'
> -ex 'disass g_main_context_iteration'.
> 
> Regards.
> 
> On 2020-03-27 15:29, ahmadkhorrami wrote:
> 
> Hi,
> Thanks! Could you tell me what should be changed in order to make
> the code runnable on your system, if it is possible?
> Regards.
> 
> On 2020-03-27 13:50, Jiri Olsa wrote:
> 
> On Thu, Mar 26, 2020 at 10:49:12PM +0430, ahmadkhorrami wrote:
> 
> Hi,
> Here is the link for the python script:
> https://gofile.io/?c=1ZSLwe
> It is written in python-3 and takes the perf script output as input.
> It looks for consecutive repeated backtrace lines and checks if the
> function
> in these lines calls itself at the offset in the line (i.e., checks if
> recursion is possible). If not possible it reports an error. Could
> you check
> to see if any error is detected in your outputs, please?
> I'm getting tons of following output:
> 
> /usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30: No such file or
> directory.
> No symbol table is loaded.  Use the "file" command.
> 7ffff71b9bc1 gtk_css_node_invalidate_timestamp+0x31
> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30)
> 
> I assume it's because I have all the dso binaries stored
> under .biuldid path, while you check the output name
> 
> jirka
> 
> Regards.
> On 2020-03-26 20:09, Jiri Olsa wrote:
> 
> On Thu, Mar 26, 2020 at 05:50:27PM +0430, ahmadkhorrami wrote:
> 
> Hi,
> First of all, many thanks for your time. Did you say that the first
> file has
> problems?
> 
> The first file (http://gofile.io/?c=qk6oXv) has repeated gmallocn()s
> while
> the second (https://gofile.io/?c=oGxgSM) also has problems with
> unmatched
> (not necessarily repeated) function calls. I am not sure if the
> kernel for
> the second one is 5.4.7 or the generic Ubuntu kernel. But the first
> one is
> certainly 5.4.7. Just to be clear, there were many instances of these
> unmatched <caller, callees>.
> I can se all the files, but I just can't see the issue yet
> but it's probably because of issues with perf archive..
> 
> let's see if somebody else can chime in
> 
> I have a simple python script that checks for this situation. It
> disassembles functions using GDB and checks the (directly called)
> target of
> each caller. I will put some comments in the script and upload it.
> Could you
> check to see if the python script detects any mismatches in your
> backtraces?
> It takes the perf script output file as input. I will upload the
> script in
> an hour.
> ok
> 
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-27 23:12                                                       ` ahmadkhorrami
@ 2020-03-28 23:34                                                         ` Jiri Olsa
  2020-03-29  0:43                                                           ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-28 23:34 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

On Sat, Mar 28, 2020 at 03:42:53AM +0430, ahmadkhorrami wrote:
> Hi,
> Thanks. If you suggest the potentially bogus locations of the source code, I
> will give a try.
> Regards.

heya,
the change below 'fixes' it for me:

	$ perf script ...
	...
	evince 2220122 1605573.007639:      11759 cycles:u: 
		ffffffffaec012f0 [unknown] ([unknown])
		    7f93f17116b6 __mmap64+0x26 mmap64.c:59 (inlined)
		    7f93f17116b6 __mmap64+0x26 mmap64.c:47 (inlined)

it wasn't really broken; the perf script callchain output was just missing
the source line info, which supplies the "missing part": for inlined entries
the entry address stays the same for all of the inlined frames

could you try the change?

thanks,
jirka


diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index 3b4842840db0..7349dfbbef2e 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -174,8 +174,11 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 			if (print_srcline)
 				printed += map__fprintf_srcline(map, addr, "\n  ", fp);
 
-			if (sym && sym->inlined)
+			if (sym && sym->inlined) {
+				if (node->srcline)
+					printed += fprintf(fp, " %s", node->srcline);
 				printed += fprintf(fp, " (inlined)");
+			}
 
 			if (!print_oneline)
 				printed += fprintf(fp, "\n");


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-28 23:34                                                         ` Jiri Olsa
@ 2020-03-29  0:43                                                           ` ahmadkhorrami
  2020-03-29  1:16                                                             ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-29  0:43 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

Hi Mr. Olsa,
Thanks for your time. The problem is still there. Repeated lines are not 
limited to inline functions. If my script works on your system, it will 
show these lines.
Regards.

On 2020-03-29 04:04, Jiri Olsa wrote:

> On Sat, Mar 28, 2020 at 03:42:53AM +0430, ahmadkhorrami wrote:
> 
>> Hi,
>> Thanks. If you suggest the potentially bogus locations of the source 
>> code, I
>> will give a try.
>> Regards.
> 
> heya,
> the change below 'fixes' it for me:
> 
> $ perf script ...
> ...
> evince 2220122 1605573.007639:      11759 cycles:u:
> ffffffffaec012f0 [unknown] ([unknown])
> 7f93f17116b6 __mmap64+0x26 mmap64.c:59 (inlined)
> 7f93f17116b6 __mmap64+0x26 mmap64.c:47 (inlined)
> 
> it wasn't really broken, the output is just missing the source line
> info in perf script callchain output, which adds the "missing part",
> because for inlined entries the entry address stays the same for all
> its inlined parts
> 
> could you try the change?
> 
> thanks,
> jirka
> 
> diff --git a/tools/perf/util/evsel_fprintf.c 
> b/tools/perf/util/evsel_fprintf.c
> index 3b4842840db0..7349dfbbef2e 100644
> --- a/tools/perf/util/evsel_fprintf.c
> +++ b/tools/perf/util/evsel_fprintf.c
> @@ -174,8 +174,11 @@ int sample__fprintf_callchain(struct perf_sample 
> *sample, int left_alignment,
> if (print_srcline)
> printed += map__fprintf_srcline(map, addr, "\n  ", fp);
> 
> -            if (sym && sym->inlined)
> +            if (sym && sym->inlined) {
> +                if (node->srcline)
> +                    printed += fprintf(fp, " %s", node->srcline);
> printed += fprintf(fp, " (inlined)");
> +            }
> 
> if (!print_oneline)
> printed += fprintf(fp, "\n");

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29  0:43                                                           ` ahmadkhorrami
@ 2020-03-29  1:16                                                             ` ahmadkhorrami
  2020-03-29 11:19                                                               ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-29  1:16 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Changbin Du, Andi Kleen

Hi,
Each line is correct, i.e., the addresses match the symbols. But some callers 
and callees do not match; perhaps the callchain misses some callers in 
between.
Regards

On 2020-03-29 05:13, ahmadkhorrami wrote:

> Hi Mr. Olsa,
> Thanks for your time. The problem is still there. Repeated lines are 
> not limited to inline functions. If my script works on your system, it 
> will show these lines.
> Regards.
> 
> On 2020-03-29 04:04, Jiri Olsa wrote:
> 
> On Sat, Mar 28, 2020 at 03:42:53AM +0430, ahmadkhorrami wrote:
> 
> Hi,
> Thanks. If you suggest the potentially bogus locations of the source 
> code, I
> will give a try.
> Regards.
> heya,
> the change below 'fixes' it for me:
> 
> $ perf script ...
> ...
> evince 2220122 1605573.007639:      11759 cycles:u:
> ffffffffaec012f0 [unknown] ([unknown])
> 7f93f17116b6 __mmap64+0x26 mmap64.c:59 (inlined)
> 7f93f17116b6 __mmap64+0x26 mmap64.c:47 (inlined)
> 
> it wasn't really broken, the output is just missing the source line
> info in perf script callchain output, which adds the "missing part",
> because for inlined entries the entry address stays the same for all
> its inlined parts
> 
> could you try the change?
> 
> thanks,
> jirka
> 
> diff --git a/tools/perf/util/evsel_fprintf.c 
> b/tools/perf/util/evsel_fprintf.c
> index 3b4842840db0..7349dfbbef2e 100644
> --- a/tools/perf/util/evsel_fprintf.c
> +++ b/tools/perf/util/evsel_fprintf.c
> @@ -174,8 +174,11 @@ int sample__fprintf_callchain(struct perf_sample 
> *sample, int left_alignment,
> if (print_srcline)
> printed += map__fprintf_srcline(map, addr, "\n  ", fp);
> 
> -            if (sym && sym->inlined)
> +            if (sym && sym->inlined) {
> +                if (node->srcline)
> +                    printed += fprintf(fp, " %s", node->srcline);
> printed += fprintf(fp, " (inlined)");
> +            }
> 
> if (!print_oneline)
> printed += fprintf(fp, "\n");

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29  1:16                                                             ` ahmadkhorrami
@ 2020-03-29 11:19                                                               ` Jiri Olsa
  2020-03-29 11:52                                                                 ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-29 11:19 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Andi Kleen

On Sun, Mar 29, 2020 at 05:46:57AM +0430, ahmadkhorrami wrote:
> Hi,
> Each line is correct. I mean addresses match symbols. But some callers and
> callees do not match. Perhaps callchain misses some callers in between.
> Regards

right, I missed another case.. how about the change below?
your script is silent now on my data

jirka


---
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index 3b4842840db0..fc4fb88937ed 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -173,6 +173,8 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 
 			if (print_srcline)
 				printed += map__fprintf_srcline(map, addr, "\n  ", fp);
+			else if (node->srcline)
+				printed += fprintf(fp, " %s", node->srcline);
 
 			if (sym && sym->inlined)
 				printed += fprintf(fp, " (inlined)");


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29 11:19                                                               ` Jiri Olsa
@ 2020-03-29 11:52                                                                 ` ahmadkhorrami
  2020-03-29 12:08                                                                   ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-29 11:52 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Andi Kleen

Hi,
Thanks. Still no change. Sorry, I forgot to say that you should 
initialize the "perfCMD" variable to your perf binary path.
Regards.

On 2020-03-29 15:49, Jiri Olsa wrote:

> On Sun, Mar 29, 2020 at 05:46:57AM +0430, ahmadkhorrami wrote:
> 
>> Hi,
>> Each line is correct. I mean addresses match symbols. But some callers 
>> and
>> callees do not match. Perhaps callchain misses some callers in 
>> between.
>> Regards
> 
> right, I missed another case.. how about the change below?
> your script is silent now on my data
> 
> jirka
> 
> ---
> diff --git a/tools/perf/util/evsel_fprintf.c 
> b/tools/perf/util/evsel_fprintf.c
> index 3b4842840db0..fc4fb88937ed 100644
> --- a/tools/perf/util/evsel_fprintf.c
> +++ b/tools/perf/util/evsel_fprintf.c
> @@ -173,6 +173,8 @@ int sample__fprintf_callchain(struct perf_sample 
> *sample, int left_alignment,
> 
> if (print_srcline)
> printed += map__fprintf_srcline(map, addr, "\n  ", fp);
> +            else if (node->srcline)
> +                printed += fprintf(fp, " %s", node->srcline);
> 
> if (sym && sym->inlined)
> printed += fprintf(fp, " (inlined)");

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29 11:52                                                                 ` ahmadkhorrami
@ 2020-03-29 12:08                                                                   ` Jiri Olsa
  2020-03-29 12:39                                                                     ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-29 12:08 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Andi Kleen

On Sun, Mar 29, 2020 at 04:22:27PM +0430, ahmadkhorrami wrote:
> Hi,
> Thanks. Still no change. Sorry, I forgot to say that you should initialize
> the "perfCMD" variable to your perf binary path.

sure I did that, and your script detected the double entries;
now that we also show the srcline for them, it's silent

so there's no change at all in your perf script output?

jirka

> Regards.
> 
> On 2020-03-29 15:49, Jiri Olsa wrote:
> 
> > On Sun, Mar 29, 2020 at 05:46:57AM +0430, ahmadkhorrami wrote:
> > 
> > > Hi,
> > > Each line is correct. I mean addresses match symbols. But some
> > > callers and
> > > callees do not match. Perhaps callchain misses some callers in
> > > between.
> > > Regards
> > 
> > right, I missed another case.. how about the change below?
> > your script is silent now on my data
> > 
> > jirka
> > 
> > ---
> > diff --git a/tools/perf/util/evsel_fprintf.c
> > b/tools/perf/util/evsel_fprintf.c
> > index 3b4842840db0..fc4fb88937ed 100644
> > --- a/tools/perf/util/evsel_fprintf.c
> > +++ b/tools/perf/util/evsel_fprintf.c
> > @@ -173,6 +173,8 @@ int sample__fprintf_callchain(struct perf_sample
> > *sample, int left_alignment,
> > 
> > if (print_srcline)
> > printed += map__fprintf_srcline(map, addr, "\n  ", fp);
> > +            else if (node->srcline)
> > +                printed += fprintf(fp, " %s", node->srcline);
> > 
> > if (sym && sym->inlined)
> > printed += fprintf(fp, " (inlined)");
> 


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29 12:08                                                                   ` Jiri Olsa
@ 2020-03-29 12:39                                                                     ` ahmadkhorrami
  2020-03-29 13:50                                                                       ` Milian Wolff
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-29 12:39 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Milian Wolff,
	Namhyung Kim, Andi Kleen

Thanks. I applied both of your changes. Perhaps some outputs changed, but I 
still have repeated function calls, and the script detects them. Here is one 
of them, with the sampling period set to 1000 events:

evince  8751  2605.226573:      10000 mem_load_uops_retired.l3_miss:uppp:     5635f8c4cf3a         5080022 N/A|SNP N/A|TLB N/A|LCK N/A
             7f91785a737d gtk_box_forall+0x2d (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkbox.c:0
             7f91788120b5 gtk_widget_propagate_state+0x195 (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0
             7f91785a73cf gtk_box_forall+0x7f (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkbox.c:0
             7f91788120b5 gtk_widget_propagate_state+0x195 (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0
             7f917875ebcf gtk_stack_forall+0x2f (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkstack.c:0
             7f91788120b5 gtk_widget_propagate_state+0x195 (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0
             7f91788120b5 gtk_widget_propagate_state+0x195 (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0
             7f91788154aa gtk_window_forall+0x3a (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwindow.c:0
             7f91788120b5 gtk_widget_propagate_state+0x195 (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0
             7f9178812a2d gtk_widget_update_state_flags+0x5d (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0
             7f917881919e gtk_window_state_event+0x21e (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwindow.c:0
             5635f6fd53ce ev_window_state_event+0x5e (/opt/evince-3.28.4/bin/evince) ev-window.c:4410
             7f91786b87fa _gtk_marshal_BOOLEAN__BOXED+0x6a (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30)
             7f9176b9410c g_closure_invoke+0x19c (/usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.5600.4)
             7f9176ba6de7 signal_emit_unlocked_R+0xcd7 (/usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.5600.4) gsignal.c:0
             7f9176baf0ae g_signal_emit_valist+0x40e (/usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.5600.4)
             7f9176bb012e g_signal_emit+0x8e (/usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.5600.4)
             7f9178800533 gtk_widget_event_internal+0x163 (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0
             7f91786b78cd gtk_main_do_event+0x77d (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30)
             7f91781c8764 _gdk_event_emit+0x24 (/usr/lib/x86_64-linux-gnu/libgdk-3.so.0.2200.30)
             7f91781f8f91 gdk_event_source_dispatch+0x21 (/usr/lib/x86_64-linux-gnu/libgdk-3.so.0.2200.30) gdkeventsource.c:0
             7f91768b9416 g_main_context_dispatch+0x2e6 (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
             7f91768b964f g_main_context_iterate.isra.26+0x1ff (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4) gmain.c:0
             7f91768b96db g_main_context_iteration+0x2b (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
             7f9176e7ae3c g_application_run+0x1fc (/usr/lib/x86_64-linux-gnu/libgio-2.0.so.0.5600.4)
             5635f6fbc707 main+0x447 (/opt/evince-3.28.4/bin/evince) main.c:316
             7f9175ee0b96 __libc_start_main+0xe6 (/lib/x86_64-linux-gnu/libc-2.27.so) libc-start.c:310
             5635f6fbc899 _start+0x29 (/opt/evince-3.28.4/bin/evince)

Here, we have two consecutive "7f91788120b5 gtk_widget_propagate_state+0x195 
(/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" lines, while 
the call at "gtk_widget_propagate_state+0x195" is not recursive: it should be 
a call to "gtk_container_forall", which does not appear even after the second 
(inner) entry.
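(For reference, this is roughly the check the script performs for that frame;
the library path is taken from the backtrace above, and 0x195 is just
decimal 405 in GDB's disassembly listing:)

```python
# Minimal manual check: disassemble the caller and print the instruction just
# before gtk_widget_propagate_state+0x195 (the return address in the backtrace).
import subprocess

dso = "/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30"
out = subprocess.run(
    ["gdb", "-batch", "-ex", "file " + dso,
     "-ex", "disass gtk_widget_propagate_state"],
    capture_output=True, text=True).stdout
lines = out.splitlines()
for i, line in enumerate(lines):
    if "<+405>" in line:         # 0x195 == 405
        print(lines[i - 1])      # the call made right before returning to +0x195
        break
```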

Regards.

On 2020-03-29 16:38, Jiri Olsa wrote:

> On Sun, Mar 29, 2020 at 04:22:27PM +0430, ahmadkhorrami wrote:
> 
>> Hi,
>> Thanks. Still no change. Sorry, I forgot to say that you should 
>> initialize
>> the "perfCMD" variable to your perf binary path.
> 
> sure I did that, and your script detected the double entries,
> now when we show also srcline for them it's silent
> 
> so there's no change at all for your perf script output?
> 
> jirka
> 
> Regards.
> 
> On 2020-03-29 15:49, Jiri Olsa wrote:
> 
> On Sun, Mar 29, 2020 at 05:46:57AM +0430, ahmadkhorrami wrote:
> 
> Hi,
> Each line is correct. I mean addresses match symbols. But some
> callers and
> callees do not match. Perhaps callchain misses some callers in
> between.
> Regards
> right, I missed another case.. how about the change below?
> your script is silent now on my data
> 
> jirka
> 
> ---
> diff --git a/tools/perf/util/evsel_fprintf.c
> b/tools/perf/util/evsel_fprintf.c
> index 3b4842840db0..fc4fb88937ed 100644
> --- a/tools/perf/util/evsel_fprintf.c
> +++ b/tools/perf/util/evsel_fprintf.c
> @@ -173,6 +173,8 @@ int sample__fprintf_callchain(struct perf_sample
> *sample, int left_alignment,
> 
> if (print_srcline)
> printed += map__fprintf_srcline(map, addr, "\n  ", fp);
> +            else if (node->srcline)
> +                printed += fprintf(fp, " %s", node->srcline);
> 
> if (sym && sym->inlined)
> printed += fprintf(fp, " (inlined)");

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29 12:39                                                                     ` ahmadkhorrami
@ 2020-03-29 13:50                                                                       ` Milian Wolff
  2020-03-29 14:23                                                                         ` ahmadkhorrami
  2020-03-29 19:20                                                                         ` Jiri Olsa
  0 siblings, 2 replies; 67+ messages in thread
From: Milian Wolff @ 2020-03-29 13:50 UTC (permalink / raw)
  To: Jiri Olsa, ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Namhyung Kim,
	Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 1043 bytes --]

On Sonntag, 29. März 2020 14:39:33 CEST ahmadkhorrami wrote:
> Thanks. I did both of your changes. Perhaps some outputs are revised.
> But I still have repeated function calls and the script detects them.
> Here is one of them when the sampling period is 1000 events:

<snip>

> Here, we have two consecutive "7f91788120b5
> gtk_widget_propagate_state+0x195
> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" lines,
> while "gtk_widget_propagate_state+0x195" is not recursive. It should
> call "gtk_container_forall", which does not occur even after the second
> (inner) call.

Potentially you are just lacking some debug symbols here for this GTK library. 
note that "gtkwidget.c:0" is bogus already - the line numbers start with 1, so 
a line of 0 reported indicates a lack of (full) debug information for this 
file.
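A quick way to cross-check such a frame outside perf is to resolve the same
DSO-relative address with elfutils' libdwfl, roughly what eu-addr2line does.
A minimal sketch, assuming elfutils/libdw is installed; the DSO path and hex
offset are supplied on the command line, and a printed line number of 0
corresponds exactly to the "gtkwidget.c:0" case above:

/*
 * Minimal sketch (not part of perf): resolve a DSO-relative address to
 * file:line with libdwfl.  A printed line of 0 means the DWARF line table
 * could not attribute the address to a real source line.
 * Build: gcc -o srcline srcline.c -ldw
 */
#include <elfutils/libdwfl.h>
#include <stdio.h>
#include <stdlib.h>

static char *debuginfo_path;
static const Dwfl_Callbacks callbacks = {
	.find_elf	 = dwfl_build_id_find_elf,
	.find_debuginfo	 = dwfl_standard_find_debuginfo,
	.section_address = dwfl_offline_section_address,
	.debuginfo_path	 = &debuginfo_path,
};

int main(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s <dso> <hex address>\n", argv[0]);
		return 1;
	}

	Dwfl *dwfl = dwfl_begin(&callbacks);
	if (!dwfl)
		return 1;

	/* report the DSO as an "offline" module, then look the address up */
	Dwfl_Module *mod = dwfl_report_offline(dwfl, argv[1], argv[1], -1);
	dwfl_report_end(dwfl, NULL, NULL);
	if (!mod) {
		fprintf(stderr, "cannot open %s\n", argv[1]);
		return 1;
	}

	Dwarf_Addr addr = strtoull(argv[2], NULL, 16);
	Dwfl_Line *line = dwfl_module_getsrc(mod, addr);
	if (!line) {
		printf("no line info for %#llx\n", (unsigned long long)addr);
	} else {
		int lineno = 0;
		const char *file = dwfl_lineinfo(line, &addr, &lineno,
						 NULL, NULL, NULL);
		printf("%s:%d\n", file ? file : "?", lineno);
	}

	dwfl_end(dwfl);
	return 0;
}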

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29 13:50                                                                       ` Milian Wolff
@ 2020-03-29 14:23                                                                         ` ahmadkhorrami
  2020-03-29 19:20                                                                         ` Jiri Olsa
  1 sibling, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-29 14:23 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi,
Good point. But putting the symbols aside, the addresses themselves do not match.
Regards.

On 2020-03-29 18:20, Milian Wolff wrote:

> On Sonntag, 29. März 2020 14:39:33 CEST ahmadkhorrami wrote:
> 
>> Thanks. I did both of your changes. Perhaps some outputs are revised.
>> But I still have repeated function calls and the script detects them.
>> Here is one of them when the sampling period is 1000 events:
> 
> <snip>
> 
>> Here, we have two consecutive "7f91788120b5
>> gtk_widget_propagate_state+0x195
>> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" 
>> lines,
>> while "gtk_widget_propagate_state+0x195" is not recursive. It should
>> call "gtk_container_forall", which does not occur even after the 
>> second
>> (inner) call.
> 
> Potentially you are just lacking some debug symbols here for this GTK 
> library.
> note that "gtkwidget.c:0" is bogus already - the line numbers start 
> with 1, so
> a line of 0 reported indicates a lack of (full) debug information for 
> this
> file.
> 
> Cheers

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29 13:50                                                                       ` Milian Wolff
  2020-03-29 14:23                                                                         ` ahmadkhorrami
@ 2020-03-29 19:20                                                                         ` Jiri Olsa
  2020-03-30  6:09                                                                           ` Milian Wolff
  1 sibling, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-29 19:20 UTC (permalink / raw)
  To: Milian Wolff
  Cc: ahmadkhorrami, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

On Sun, Mar 29, 2020 at 03:50:57PM +0200, Milian Wolff wrote:
> On Sonntag, 29. März 2020 14:39:33 CEST ahmadkhorrami wrote:
> > Thanks. I did both of your changes. Perhaps some outputs are revised.
> > But I still have repeated function calls and the script detects them.
> > Here is one of them when the sampling period is 1000 events:
> 
> <snip>
> 
> > Here, we have two consecutive "7f91788120b5
> > gtk_widget_propagate_state+0x195
> > (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" lines,
> > while "gtk_widget_propagate_state+0x195" is not recursive. It should
> > call "gtk_container_forall", which does not occur even after the second
> > (inner) call.
> 
> Potentially you are just lacking some debug symbols here for this GTK library. 
> note that "gtkwidget.c:0" is bogus already - the line numbers start with 1, so 

could we just skip those then? pass on all zero lined inlines
and not add/display them
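
Concretely, a sketch of what such a filter could test (a hypothetical helper,
not perf's actual code; the function name is made up):

/*
 * Hypothetical helper: a srcline string such as "gtkwidget.c:0" carries no
 * usable line information, so a caller could skip printing the inline frame
 * that produced it.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static bool srcline_has_real_line(const char *srcline)
{
	const char *colon = srcline ? strrchr(srcline, ':') : NULL;

	return colon && atoi(colon + 1) > 0;
}

int main(void)
{
	printf("%d\n", srcline_has_real_line("gtkwidget.c:0"));   /* 0: skip */
	printf("%d\n", srcline_has_real_line("gtkwidget.c:316")); /* 1: keep */
	return 0;
}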

jirka

> a line of 0 reported indicates a lack of (full) debug information for this 
> file.
> 
> Cheers
> 
> -- 
> Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
> KDAB (Deutschland) GmbH, a KDAB Group company
> Tel: +49-30-521325470
> KDAB - The Qt, C++ and OpenGL Experts



^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-29 19:20                                                                         ` Jiri Olsa
@ 2020-03-30  6:09                                                                           ` Milian Wolff
  2020-03-30 13:07                                                                             ` Jiri Olsa
  0 siblings, 1 reply; 67+ messages in thread
From: Milian Wolff @ 2020-03-30  6:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: ahmadkhorrami, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 1731 bytes --]

On Sonntag, 29. März 2020 21:20:10 CEST Jiri Olsa wrote:
> On Sun, Mar 29, 2020 at 03:50:57PM +0200, Milian Wolff wrote:
> > On Sonntag, 29. März 2020 14:39:33 CEST ahmadkhorrami wrote:
> > > Thanks. I did both of your changes. Perhaps some outputs are revised.
> > > But I still have repeated function calls and the script detects them.
> > 
> > > Here is one of them when the sampling period is 1000 events:
> > <snip>
> > 
> > > Here, we have two consecutive "7f91788120b5
> > > gtk_widget_propagate_state+0x195
> > > (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" lines,
> > > while "gtk_widget_propagate_state+0x195" is not recursive. It should
> > > call "gtk_container_forall", which does not occur even after the second
> > > (inner) call.
> > 
> > Potentially you are just lacking some debug symbols here for this GTK
> > library. note that "gtkwidget.c:0" is bogus already - the line numbers
> > start with 1, so
> could we just skip those then? pass on all zero lined inlines
> and not add/display them

I guess so - but quite frankly I'm a bit uneasy to do this without further 
exploration about the actual causes here. Afaik DWARF does not say anything 
about the validity, so in theory there might be some language/compiler that 
encodes valid DWARF locations with line 0?

Generally, it would be quite interesting to figure out why there is /some/ 
DWARF data here such that the code thinks it's able to decode the inline frame 
but then fails and leads to this confusing result?

Cheers
-- 
Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-30  6:09                                                                           ` Milian Wolff
@ 2020-03-30 13:07                                                                             ` Jiri Olsa
  2020-03-30 13:49                                                                               ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-30 13:07 UTC (permalink / raw)
  To: Milian Wolff
  Cc: ahmadkhorrami, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

On Mon, Mar 30, 2020 at 08:09:53AM +0200, Milian Wolff wrote:
> On Sonntag, 29. März 2020 21:20:10 CEST Jiri Olsa wrote:
> > On Sun, Mar 29, 2020 at 03:50:57PM +0200, Milian Wolff wrote:
> > > On Sonntag, 29. März 2020 14:39:33 CEST ahmadkhorrami wrote:
> > > > Thanks. I did both of your changes. Perhaps some outputs are revised.
> > > > But I still have repeated function calls and the script detects them.
> > > 
> > > > Here is one of them when the sampling period is 1000 events:
> > > <snip>
> > > 
> > > > Here, we have two consecutive "7f91788120b5
> > > > gtk_widget_propagate_state+0x195
> > > > (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" lines,
> > > > while "gtk_widget_propagate_state+0x195" is not recursive. It should
> > > > call "gtk_container_forall", which does not occur even after the second
> > > > (inner) call.
> > > 
> > > Potentially you are just lacking some debug symbols here for this GTK
> > > library. note that "gtkwidget.c:0" is bogus already - the line numbers
> > > start with 1, so
> > could we just skip those then? pass on all zero lined inlines
> > and not add/display them
> 
> I guess so - but quite frankly I'm a bit uneasy to do this without further 
> exploration about the actual causes here. Afaik DWARF does not say anything 
> about the validity, so in theory there might be some language/compiler that 
> encodes valid DWARF locations with line 0?
> 
> Generally, it would be quite interesting to figure out why there is /some/ 
> DWARF data here such that the code thinks it's able to decode the inline frame 
> but then fails and leads to this confusing result?

right, but it's beyond my DWARF expertise ;-) we'll need
to wait till one of you guys could take a look on it

thanks,
jirka


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-30 13:07                                                                             ` Jiri Olsa
@ 2020-03-30 13:49                                                                               ` ahmadkhorrami
  2020-03-30 19:05                                                                                 ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-30 13:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Milian Wolff, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi,
thanks, but I still do not understand the cause of the problem. It seems 
that it is purely a software bug and the callchain addresses seem 
wrong by themselves. Am I right?
Regards.
On 2020-03-30 17:37, Jiri Olsa wrote:

> On Mon, Mar 30, 2020 at 08:09:53AM +0200, Milian Wolff wrote: On 
> Sonntag, 29. März 2020 21:20:10 CEST Jiri Olsa wrote: On Sun, Mar 29, 
> 2020 at 03:50:57PM +0200, Milian Wolff wrote: On Sonntag, 29. März 2020 
> 14:39:33 CEST ahmadkhorrami wrote: Thanks. I did both of your changes. 
> Perhaps some outputs are revised.
> But I still have repeated function calls and the script detects them.
> Here is one of them when the sampling period is 1000 events: <snip>
> 
> Here, we have two consecutive "7f91788120b5
> gtk_widget_propagate_state+0x195
> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" lines,
> while "gtk_widget_propagate_state+0x195" is not recursive. It should
> call "gtk_container_forall", which does not occur even after the second
> (inner) call.
> Potentially you are just lacking some debug symbols here for this GTK
> library. note that "gtkwidget.c:0" is bogus already - the line numbers
> start with 1, so
  could we just skip those then? pass on all zero lined inlines
and not add/display them
I guess so - but quite frankly I'm a bit uneasy to do this without 
further
exploration about the actual causes here. Afaik DWARF does not say 
anything
about the validity, so in theory there might be some language/compiler 
that
encodes valid DWARF locations with line 0?

Generally, it would be quite interesting to figure out why there is 
/some/
DWARF data here such that the code thinks it's able to decode the inline 
frame
but then fails and leads to this confusing result?
right, but it's beyond my DWARF expertise ;-) we'll need
to wait till one of you guys could take a look on it

thanks,
jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-30 13:49                                                                               ` ahmadkhorrami
@ 2020-03-30 19:05                                                                                 ` ahmadkhorrami
  2020-03-30 21:05                                                                                   ` debuginfod-based dwarf downloading, was " Frank Ch. Eigler
                                                                                                     ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-30 19:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Milian Wolff, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi,
Sorry, I did not pay attention to the fact that we need debug info to be 
able to decode the callchain. So, without complete debug info the 
backtrace (either the symbols or the addresses) is bogus. But you say 
that the backtrace portions containing binaries with full debug info are 
OK. Am I right? Please confirm if this is the case.
If so, these portions may be enough for my current usecase.
Regards.

On 2020-03-30 18:19, ahmadkhorrami wrote:

> Hi,
> thanks, but I still do not understand the cause of the problem. It 
> seems that it is purely a software bug and the callchain addresses seem 
> wrong by themselves. Am I right?
> Regards.
> On 2020-03-30 17:37, Jiri Olsa wrote:
> 
>> On Mon, Mar 30, 2020 at 08:09:53AM +0200, Milian Wolff wrote: On 
>> Sonntag, 29. März 2020 21:20:10 CEST Jiri Olsa wrote: On Sun, Mar 29, 
>> 2020 at 03:50:57PM +0200, Milian Wolff wrote: On Sonntag, 29. März 
>> 2020 14:39:33 CEST ahmadkhorrami wrote: Thanks. I did both of your 
>> changes. Perhaps some outputs are revised.
>> But I still have repeated function calls and the script detects them.
>> Here is one of them when the sampling period is 1000 events: <snip>
>> 
>> Here, we have two consecutive "7f91788120b5
>> gtk_widget_propagate_state+0x195
>> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" 
>> lines,
>> while "gtk_widget_propagate_state+0x195" is not recursive. It should
>> call "gtk_container_forall", which does not occur even after the 
>> second
>> (inner) call.
>> Potentially you are just lacking some debug symbols here for this GTK
>> library. note that "gtkwidget.c:0" is bogus already - the line numbers
>> start with 1, so
> could we just skip those then? pass on all zero lined inlines
> and not add/display them
> I guess so - but quite frankly I'm a bit uneasy to do this without 
> further
> exploration about the actual causes here. Afaik DWARF does not say 
> anything
> about the validity, so in theory there might be some language/compiler 
> that
> encodes valid DWARF locations with line 0?
> 
> Generally, it would be quite interesting to figure out why there is 
> /some/
> DWARF data here such that the code thinks it's able to decode the 
> inline frame
> but then fails and leads to this confusing result?
> right, but it's beyond my DWARF expertise ;-) we'll need
> to wait till one of you guys could take a look on it
> 
> thanks,
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* debuginfod-based dwarf downloading, was Re: Wrong Perf Backtraces
  2020-03-30 19:05                                                                                 ` ahmadkhorrami
@ 2020-03-30 21:05                                                                                   ` Frank Ch. Eigler
  2020-03-31  9:26                                                                                     ` Jiri Olsa
  2020-03-31  4:43                                                                                   ` ahmadkhorrami
  2020-03-31 12:43                                                                                   ` ahmadkhorrami
  2 siblings, 1 reply; 67+ messages in thread
From: Frank Ch. Eigler @ 2020-03-30 21:05 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Jiri Olsa, Milian Wolff, Steven Rostedt,
	Arnaldo Carvalho de Melo, Linux-trace Users, Peter Zijlstra,
	linux-trace-users-owner, Jin Yao, Namhyung Kim, Andi Kleen

Hi -

I was reminded by the thread that debuginfo downloading for perf is
worth automating more.  Since perf uses elfutils, recently-added
facilities to elfutils 0.178 (2019-11) already allow automated
downloading from some perf code paths.  The following patch [2] builds
a little bit on that by ensuring that the downloaded debuginfo is
plopped into the $HOME/.debug/ hierarchy explicitly, and should cover
a few more code paths.  Feel free to take / adapt this code as you
like.

debuginfod can also do source code downloading on the fly, but the
perf/util/srccode.c machinery would need buildids passed, and I
couldn't quickly figure it out.  (Sorry, I also couldn't quickly
figure out the perf testsuite.)

[1] https://sourceware.org/elfutils/Debuginfod.html
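
For reference, the -ldebuginfod client API used in the patch can also be
exercised standalone.  A minimal sketch, assuming elfutils with libdebuginfod
is installed; the build-id argument and the DEBUGINFOD_URLS server URL are
placeholders you must supply:

/*
 * Standalone sketch of the debuginfod client API used in the patch below.
 * Build: gcc -o fetch-debuginfo fetch-debuginfo.c -ldebuginfod
 * Run:   DEBUGINFOD_URLS=<your distro's debuginfod server> \
 *        ./fetch-debuginfo <build-id-hex>
 */
#include <elfutils/debuginfod.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "usage: %s <build-id-hex>\n", argv[0]);
		return 1;
	}

	debuginfod_client *c = debuginfod_begin();
	if (c == NULL)
		return 1;

	char *path = NULL;
	/* build_id_len == 0 means the build-id is passed as a hex string */
	int fd = debuginfod_find_debuginfo(c, (const unsigned char *)argv[1],
					   0, &path);
	if (fd >= 0) {
		printf("debuginfo cached at %s\n", path);
		free(path);
		close(fd);
	} else {
		fprintf(stderr, "lookup failed: %s\n", strerror(-fd));
	}

	debuginfod_end(c);
	return fd >= 0 ? 0 : 1;
}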

[2]

Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Mon Mar 30 16:15:47 2020 -0400

    perf build-ids: fall back to debuginfod query if debuginfo not found
    
    During a perf-record, use the -ldebuginfod API to query a debuginfod
    server, should the debug data not be found in the usual system
    locations.  If successful, the usual $HOME/.debug dir is populated.
    
    Signed-off-by: Frank Ch. Eigler <fche@redhat.com>

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 574c2e0b9d20..51e051858d21 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -48,6 +48,7 @@ FEATURE_TESTS_BASIC :=                  \
         libelf-gelf_getnote             \
         libelf-getshdrstrndx            \
         libelf-mmap                     \
+        libdebuginfod                   \
         libnuma                         \
         numa_num_possible_cpus          \
         libperl                         \
@@ -114,6 +115,7 @@ FEATURE_DISPLAY ?=              \
          libbfd                 \
          libcap                 \
          libelf                 \
+         libdebuginfod          \
          libnuma                \
          numa_num_possible_cpus \
          libperl                \
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 7ac0d8088565..1109f5ec96f7 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -26,6 +26,7 @@ FILES=                                          \
          test-libelf-gelf_getnote.bin           \
          test-libelf-getshdrstrndx.bin          \
          test-libelf-mmap.bin                   \
+         test-libdebuginfod.bin                 \
          test-libnuma.bin                       \
          test-numa_num_possible_cpus.bin        \
          test-libperl.bin                       \
@@ -155,6 +156,9 @@ endif
 $(OUTPUT)test-libelf-getshdrstrndx.bin:
 	$(BUILD) -lelf
 
+$(OUTPUT)test-libdebuginfod.bin:
+	$(BUILD) -ldebuginfod
+
 $(OUTPUT)test-libnuma.bin:
 	$(BUILD) -lnuma
 
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 80e55e796be9..15eeecf4ff17 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -467,6 +467,11 @@ ifndef NO_LIBELF
     CFLAGS += -DHAVE_ELF_GETSHDRSTRNDX_SUPPORT
   endif
 
+  ifeq ($(feature-libdebuginfod), 1)
+    CFLAGS += -DHAVE_DEBUGINFOD_SUPPORT
+    EXTLIBS += -ldebuginfod
+  endif
+
   ifndef NO_DWARF
     ifeq ($(origin PERF_HAVE_DWARF_REGS), undefined)
       msg := $(warning DWARF register mappings have not been defined for architecture $(SRCARCH), DWARF support disabled);
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index c076fc7fe025..a33d00837d81 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -31,6 +31,10 @@
 #include "probe-file.h"
 #include "strlist.h"
 
+#if HAVE_DEBUGINFOD_SUPPORT
+#include <elfutils/debuginfod.h>
+#endif
+
 #include <linux/ctype.h>
 #include <linux/zalloc.h>
 
@@ -636,6 +640,21 @@ static char *build_id_cache__find_debug(const char *sbuild_id,
 	if (realname && access(realname, R_OK))
 		zfree(&realname);
 	nsinfo__mountns_exit(&nsc);
+
+#if HAVE_DEBUGINFOD_SUPPORT
+        if (realname == NULL) {
+                debuginfod_client* c = debuginfod_begin();
+                if (c != NULL) {
+                        int fd = debuginfod_find_debuginfo(c,
+                                                           (const unsigned char*)sbuild_id, 0,
+                                                           &realname);
+                        if (fd >= 0)
+                                close(fd); /* retaining reference by realname */
+                        debuginfod_end(c);
+                }
+        }
+#endif
+
 out:
 	free(debugfile);
 	return realname;


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-30 19:05                                                                                 ` ahmadkhorrami
  2020-03-30 21:05                                                                                   ` debuginfod-based dwarf downloading, was " Frank Ch. Eigler
@ 2020-03-31  4:43                                                                                   ` ahmadkhorrami
  2020-03-31  9:30                                                                                     ` Jiri Olsa
  2020-03-31 12:43                                                                                   ` ahmadkhorrami
  2 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31  4:43 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Milian Wolff, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi,
I checked the reported address in libgtk. GDB could map it to the source 
files. Also, I once read on Stack Overflow that perf gives up on printing 
the backtrace (i.e., truncates the output backtrace) whenever it detects a 
binary file without debug info. Could you describe this situation in more 
detail?
And dear Mr. Olsa, if these cases do not occur on your system, could you 
tell me how you installed the debug info for the binaries on your 
system, please?
Regards.

On 2020-03-30 23:35, ahmadkhorrami wrote:

> Hi,
> Sorry, I did not pay attention to the fact that we need debug info to 
> be able to decode the callchain. So, without complete debug info the 
> backtrace (either the symbols or the addresses) is bogus. But you say 
> that the backtrace portions containing binaries with full debug info 
> are OK. Am I right? Please confirm if this is the case.
> If so, these portions may be enough for my current usecase.
> Regards.
> 
> On 2020-03-30 18:19, ahmadkhorrami wrote:
> 
> Hi,
> thanks, but I still do not understand the cause of the problem. It 
> seems that it is purely a software bug and the callchain addresses seem 
> wrong by themselves. Am I right?
> Regards.
> On 2020-03-30 17:37, Jiri Olsa wrote:
> 
> On Mon, Mar 30, 2020 at 08:09:53AM +0200, Milian Wolff wrote: On 
> Sonntag, 29. März 2020 21:20:10 CEST Jiri Olsa wrote: On Sun, Mar 29, 
> 2020 at 03:50:57PM +0200, Milian Wolff wrote: On Sonntag, 29. März 2020 
> 14:39:33 CEST ahmadkhorrami wrote: Thanks. I did both of your changes. 
> Perhaps some outputs are revised.
> But I still have repeated function calls and the script detects them.
> Here is one of them when the sampling period is 1000 events: <snip>
> 
> Here, we have two consecutive "7f91788120b5
> gtk_widget_propagate_state+0x195
> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" lines,
> while "gtk_widget_propagate_state+0x195" is not recursive. It should
> call "gtk_container_forall", which does not occur even after the second
> (inner) call.
> Potentially you are just lacking some debug symbols here for this GTK
> library. note that "gtkwidget.c:0" is bogus already - the line numbers
> start with 1, so could we just skip those then? pass on all zero lined 
> inlines
> and not add/display them
> I guess so - but quite frankly I'm a bit uneasy to do this without 
> further
> exploration about the actual causes here. Afaik DWARF does not say 
> anything
> about the validity, so in theory there might be some language/compiler 
> that
> encodes valid DWARF locations with line 0?
> 
> Generally, it would be quite interesting to figure out why there is 
> /some/
> DWARF data here such that the code thinks it's able to decode the 
> inline frame
> but then fails and leads to this confusing result?
> right, but it's beyond my DWARF expertise ;-) we'll need
> to wait till one of you guys could take a look on it
> 
> thanks,
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: debuginfod-based dwarf downloading, was Re: Wrong Perf Backtraces
  2020-03-30 21:05                                                                                   ` debuginfod-based dwarf downloading, was " Frank Ch. Eigler
@ 2020-03-31  9:26                                                                                     ` Jiri Olsa
  2020-03-31 14:00                                                                                       ` Frank Ch. Eigler
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-31  9:26 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: ahmadkhorrami, Milian Wolff, Steven Rostedt,
	Arnaldo Carvalho de Melo, Linux-trace Users, Peter Zijlstra,
	linux-trace-users-owner, Jin Yao, Namhyung Kim, Andi Kleen

On Mon, Mar 30, 2020 at 05:05:34PM -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> I was reminded by the thread that debuginfo downloading for perf is
> worth automating more.  Since perf uses elfutils, recently-added
> facilities to elfutils 0.178 (2019-11) already allow automated
> downloading from some perf code paths.  The following patch [2] builds
> a little bit on that by ensuring that the downloaded debuginfo is
> plopped into the $HOME/.debug/ hierarchy explicitly, and should cover
> a few more code paths.  Feel free to take / adapt this code as you
> like.

I think it's a good base, thanks a lot!

I made few comments before reading above,
feel free to post v2, if not we'll take over ;-)

thanks again,
jirka

> 
> debuginfod can also do source code downloading on the fly, but the
> perf/util/srccode.c machinery would need buildids passed, and I
> couldn't quickly figure it out.  (Sorry, I also couldn't quickly
> figure out the perf testsuite.)

np ;-)

> 
> [1] https://sourceware.org/elfutils/Debuginfod.html
> 
> [2]
> 
> Author: Frank Ch. Eigler <fche@redhat.com>
> Date:   Mon Mar 30 16:15:47 2020 -0400
> 
>     perf build-ids: fall back to debuginfod query if debuginfo not found
>     
>     During a perf-record, use the -ldebuginfod API to query a debuginfod
>     server, should the debug data not be found in the usual system
>     locations.  If successful, the usual $HOME/.debug dir is populated.
>     
>     Signed-off-by: Frank Ch. Eigler <fche@redhat.com>
> 
> diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
> index 574c2e0b9d20..51e051858d21 100644
> --- a/tools/build/Makefile.feature
> +++ b/tools/build/Makefile.feature
> @@ -48,6 +48,7 @@ FEATURE_TESTS_BASIC :=                  \
>          libelf-gelf_getnote             \
>          libelf-getshdrstrndx            \
>          libelf-mmap                     \
> +        libdebuginfod                   \

we put new feature checks which are not widely spread yet
and not vitally needed into FEATURE_TESTS_EXTRA and call:

    $(call feature_check,libdebuginfod)

before you check the feature

>          libnuma                         \
>          numa_num_possible_cpus          \
>          libperl                         \
> @@ -114,6 +115,7 @@ FEATURE_DISPLAY ?=              \
>           libbfd                 \
>           libcap                 \
>           libelf                 \
> +         libdebuginfod          \

I dont see the libdebuginfod.c, did you forget to add it?

>           libnuma                \
>           numa_num_possible_cpus \
>           libperl                \
> diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
> index 7ac0d8088565..1109f5ec96f7 100644
> --- a/tools/build/feature/Makefile
> +++ b/tools/build/feature/Makefile
> @@ -26,6 +26,7 @@ FILES=                                          \
>           test-libelf-gelf_getnote.bin           \
>           test-libelf-getshdrstrndx.bin          \
>           test-libelf-mmap.bin                   \
> +         test-libdebuginfod.bin                 \
>           test-libnuma.bin                       \
>           test-numa_num_possible_cpus.bin        \
>           test-libperl.bin                       \
> @@ -155,6 +156,9 @@ endif
>  $(OUTPUT)test-libelf-getshdrstrndx.bin:
>  	$(BUILD) -lelf
>  
> +$(OUTPUT)test-libdebuginfod.bin:
> +	$(BUILD) -ldebuginfod
> +
>  $(OUTPUT)test-libnuma.bin:
>  	$(BUILD) -lnuma
>  
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index 80e55e796be9..15eeecf4ff17 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -467,6 +467,11 @@ ifndef NO_LIBELF
>      CFLAGS += -DHAVE_ELF_GETSHDRSTRNDX_SUPPORT
>    endif
>  
> +  ifeq ($(feature-libdebuginfod), 1)
> +    CFLAGS += -DHAVE_DEBUGINFOD_SUPPORT
> +    EXTLIBS += -ldebuginfod
> +  endif
> +
>    ifndef NO_DWARF
>      ifeq ($(origin PERF_HAVE_DWARF_REGS), undefined)
>        msg := $(warning DWARF register mappings have not been defined for architecture $(SRCARCH), DWARF support disabled);
> diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
> index c076fc7fe025..a33d00837d81 100644
> --- a/tools/perf/util/build-id.c
> +++ b/tools/perf/util/build-id.c
> @@ -31,6 +31,10 @@
>  #include "probe-file.h"
>  #include "strlist.h"
>  
> +#if HAVE_DEBUGINFOD_SUPPORT

s/if/ifdef

> +#include <elfutils/debuginfod.h>
> +#endif
> +
>  #include <linux/ctype.h>
>  #include <linux/zalloc.h>
>  
> @@ -636,6 +640,21 @@ static char *build_id_cache__find_debug(const char *sbuild_id,
>  	if (realname && access(realname, R_OK))
>  		zfree(&realname);
>  	nsinfo__mountns_exit(&nsc);
> +
> +#if HAVE_DEBUGINFOD_SUPPORT

s/if/ifdef

> +        if (realname == NULL) {
> +                debuginfod_client* c = debuginfod_begin();
> +                if (c != NULL) {
> +                        int fd = debuginfod_find_debuginfo(c,
> +                                                           (const unsigned char*)sbuild_id, 0,
> +                                                           &realname);
> +                        if (fd >= 0)
> +                                close(fd); /* retaining reference by realname */
> +                        debuginfod_end(c);
> +                }
> +        }
> +#endif
> +
>  out:
>  	free(debugfile);
>  	return realname;


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31  4:43                                                                                   ` ahmadkhorrami
@ 2020-03-31  9:30                                                                                     ` Jiri Olsa
  2020-03-31 11:53                                                                                       ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-31  9:30 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Milian Wolff, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

On Tue, Mar 31, 2020 at 09:13:28AM +0430, ahmadkhorrami wrote:
> Hi,
> I checked the reported address in libgtk. GDB could map it to the source
> files. Also, once, I read at Stackoverflow that perf gives up on printing
> the backtrace (truncates the output backtrace), whenever it detects a binary
> file without debug info. Could you describe more about this situation?
> And dear Mr. Olsa, if these cases do not occur on your system, could you
> tell me how you installed the debug info for the binaries on your system,
> please?

google: 'how to install debuginfo on <whatever distro you run on>'

jirka


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31  9:30                                                                                     ` Jiri Olsa
@ 2020-03-31 11:53                                                                                       ` ahmadkhorrami
  0 siblings, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31 11:53 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Milian Wolff, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi, I surely did that. I thought there might be a more systematic way that 
avoids my perf crashes.
Regards.

On 2020-03-31 14:00, Jiri Olsa wrote:

> On Tue, Mar 31, 2020 at 09:13:28AM +0430, ahmadkhorrami wrote:
> 
>> Hi,
>> I checked the reported address in libgtk. GDB could map it to the 
>> source
>> files. Also, once, I read at Stackoverflow that perf gives up on 
>> printing
>> the backtrace (truncates the output backtrace), whenever it detects a 
>> binary
>> file without debug info. Could you describe more about this situation?
>> And dear Mr. Olsa, if these cases do not occur on your system, could 
>> you
>> tell me how you installed the debug info for the binaries on your 
>> system,
>> please?
> 
> google: 'how to install debuginfo on <whatever distro you run on>'
> 
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-30 19:05                                                                                 ` ahmadkhorrami
  2020-03-30 21:05                                                                                   ` debuginfod-based dwarf downloading, was " Frank Ch. Eigler
  2020-03-31  4:43                                                                                   ` ahmadkhorrami
@ 2020-03-31 12:43                                                                                   ` ahmadkhorrami
  2020-03-31 13:20                                                                                     ` Jiri Olsa
  2 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31 12:43 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Milian Wolff, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi,
Please confirm my thought in the previous email, if it is correct.
Regards.
On 2020-03-30 23:35, ahmadkhorrami wrote:

> Hi,
> Sorry, I did not pay attention to the fact that we need debug info to 
> be able to decode the callchain. So, without complete debug info the 
> backtrace (either the symbols or the addresses) is bogus. But you say 
> that the backtrace portions containing binaries with full debug info 
> are OK. Am I right? Please confirm if this is the case.
> If so, these portions may be enough for my current usecase.
> Regards.
> 
> On 2020-03-30 18:19, ahmadkhorrami wrote:
> 
> Hi,
> thanks, but I still do not understand the cause of the problem. It 
> seems that it is purely a software bug and the callchain addresses seem 
> wrong by themselves. Am I right?
> Regards.
> On 2020-03-30 17:37, Jiri Olsa wrote:
> 
> On Mon, Mar 30, 2020 at 08:09:53AM +0200, Milian Wolff wrote: On 
> Sonntag, 29. März 2020 21:20:10 CEST Jiri Olsa wrote: On Sun, Mar 29, 
> 2020 at 03:50:57PM +0200, Milian Wolff wrote: On Sonntag, 29. März 2020 
> 14:39:33 CEST ahmadkhorrami wrote: Thanks. I did both of your changes. 
> Perhaps some outputs are revised.
> But I still have repeated function calls and the script detects them.
> Here is one of them when the sampling period is 1000 events: <snip>
> 
> Here, we have two consecutive "7f91788120b5
> gtk_widget_propagate_state+0x195
> (/usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2200.30) gtkwidget.c:0" lines,
> while "gtk_widget_propagate_state+0x195" is not recursive. It should
> call "gtk_container_forall", which does not occur even after the second
> (inner) call.
> Potentially you are just lacking some debug symbols here for this GTK
> library. note that "gtkwidget.c:0" is bogus already - the line numbers
> start with 1, so could we just skip those then? pass on all zero lined 
> inlines
> and not add/display them
> I guess so - but quite frankly I'm a bit uneasy to do this without 
> further
> exploration about the actual causes here. Afaik DWARF does not say 
> anything
> about the validity, so in theory there might be some language/compiler 
> that
> encodes valid DWARF locations with line 0?
> 
> Generally, it would be quite interesting to figure out why there is 
> /some/
> DWARF data here such that the code thinks it's able to decode the 
> inline frame
> but then fails and leads to this confusing result?
> right, but it's beyond my DWARF expertise ;-) we'll need
> to wait till one of you guys could take a look on it
> 
> thanks,
> jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 12:43                                                                                   ` ahmadkhorrami
@ 2020-03-31 13:20                                                                                     ` Jiri Olsa
  2020-03-31 13:39                                                                                       ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Jiri Olsa @ 2020-03-31 13:20 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Milian Wolff, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

On Tue, Mar 31, 2020 at 05:13:00PM +0430, ahmadkhorrami wrote:
> Hi,
> Please confirm my thought in the previous email, if it is correct.
> Regards.
> On 2020-03-30 23:35, ahmadkhorrami wrote:
> 
> > Hi,
> > Sorry, I did not pay attention to the fact that we need debug info to be
> > able to decode the callchain. So, without complete debug info the
> > backtrace (either the symbols or the addresses) is bogus. But you say
> > that the backtrace portions containing binaries with full debug info are
> > OK. Am I right? Please confirm if this is the case.
> > If so, these portions may be enough for my current usecase.

you need debuginfo to resolve symbols and inlines,
raw addresses should be correct

jirka


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 13:20                                                                                     ` Jiri Olsa
@ 2020-03-31 13:39                                                                                       ` ahmadkhorrami
  2020-03-31 14:44                                                                                         ` Milian Wolff
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31 13:39 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Milian Wolff, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

But the addresses do not match. Do you confirm this as a bug in 
libdwarf,...?

So I will ignore addresses without a matching symbol. But they do not 
seem reliable!

Could you tell me the name of the library that generates the raw 
addresses, so that I can try to debug it?

Regards.

On 2020-03-31 17:50, Jiri Olsa wrote:

> On Tue, Mar 31, 2020 at 05:13:00PM +0430, ahmadkhorrami wrote: Hi,
> Please confirm my thought in the previous email, if it is correct.
> Regards.
> On 2020-03-30 23:35, ahmadkhorrami wrote:
> 
> Hi,
> Sorry, I did not pay attention to the fact that we need debug info to 
> be
> able to decode the callchain. So, without complete debug info the
> backtrace (either the symbols or the addresses) is bogus. But you say
> that the backtrace portions containing binaries with full debug info 
> are
> OK. Am I right? Please confirm if this is the case.
> If so, these portions may be enough for my current usecase.

you need debuginfo to resolve symbols and inlines,
raw addresses should be correct

jirka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: debuginfod-based dwarf downloading, was Re: Wrong Perf Backtraces
  2020-03-31  9:26                                                                                     ` Jiri Olsa
@ 2020-03-31 14:00                                                                                       ` Frank Ch. Eigler
  0 siblings, 0 replies; 67+ messages in thread
From: Frank Ch. Eigler @ 2020-03-31 14:00 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: ahmadkhorrami, Milian Wolff, Steven Rostedt,
	Arnaldo Carvalho de Melo, Linux-trace Users, Peter Zijlstra,
	linux-trace-users-owner, Jin Yao, Namhyung Kim, Andi Kleen

Hi -

> I think it's a good base, thanks a lot!
> I made few comments before reading above,
> feel free to post v2, if not we'll take over ;-)

Thanks, see below.

I tried to move the libdebuginfod feature test out of
FEATURE_TESTS_BASIC but couldn't get it going elsewhere
in perf/Makefile.config, so it's back here for v2:


Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Mon Mar 30 16:15:47 2020 -0400

    perf build-ids: fall back to debuginfod query if debuginfo not found
    
    During a perf-record, use the -ldebuginfod API to query a debuginfod
    server, should the debug data not be found in the usual system
    locations.  If successful, the usual $HOME/.debug dir is populated.
    
    v2: use #ifdef HAVE_DEBUGINFOD_SUPPORT guards
        include feature test source file
    
    Signed-off-by: Frank Ch. Eigler <fche@redhat.com>

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 574c2e0b9d20..51e051858d21 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -48,6 +48,7 @@ FEATURE_TESTS_BASIC :=                  \
         libelf-gelf_getnote             \
         libelf-getshdrstrndx            \
         libelf-mmap                     \
+        libdebuginfod                   \
         libnuma                         \
         numa_num_possible_cpus          \
         libperl                         \
@@ -114,6 +115,7 @@ FEATURE_DISPLAY ?=              \
          libbfd                 \
          libcap                 \
          libelf                 \
+         libdebuginfod          \
          libnuma                \
          numa_num_possible_cpus \
          libperl                \
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 7ac0d8088565..1109f5ec96f7 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -26,6 +26,7 @@ FILES=                                          \
          test-libelf-gelf_getnote.bin           \
          test-libelf-getshdrstrndx.bin          \
          test-libelf-mmap.bin                   \
+         test-libdebuginfod.bin                 \
          test-libnuma.bin                       \
          test-numa_num_possible_cpus.bin        \
          test-libperl.bin                       \
@@ -155,6 +156,9 @@ endif
 $(OUTPUT)test-libelf-getshdrstrndx.bin:
 	$(BUILD) -lelf
 
+$(OUTPUT)test-libdebuginfod.bin:
+	$(BUILD) -ldebuginfod
+
 $(OUTPUT)test-libnuma.bin:
 	$(BUILD) -lnuma
 
diff --git a/tools/build/feature/test-libdebuginfod.c b/tools/build/feature/test-libdebuginfod.c
new file mode 100644
index 000000000000..0d20b06b4b4f
--- /dev/null
+++ b/tools/build/feature/test-libdebuginfod.c
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <elfutils/debuginfod.h>
+
+int main(void)
+{
+        debuginfod_client* c = debuginfod_begin();
+	return (long)c;
+}
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 80e55e796be9..15eeecf4ff17 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -467,6 +467,11 @@ ifndef NO_LIBELF
     CFLAGS += -DHAVE_ELF_GETSHDRSTRNDX_SUPPORT
   endif
 
+  ifeq ($(feature-libdebuginfod), 1)
+    CFLAGS += -DHAVE_DEBUGINFOD_SUPPORT
+    EXTLIBS += -ldebuginfod
+  endif
+
   ifndef NO_DWARF
     ifeq ($(origin PERF_HAVE_DWARF_REGS), undefined)
       msg := $(warning DWARF register mappings have not been defined for architecture $(SRCARCH), DWARF support disabled);
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index c076fc7fe025..31207b6e2066 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -31,6 +31,10 @@
 #include "probe-file.h"
 #include "strlist.h"
 
+#ifdef HAVE_DEBUGINFOD_SUPPORT
+#include <elfutils/debuginfod.h>
+#endif
+
 #include <linux/ctype.h>
 #include <linux/zalloc.h>
 
@@ -636,6 +640,21 @@ static char *build_id_cache__find_debug(const char *sbuild_id,
 	if (realname && access(realname, R_OK))
 		zfree(&realname);
 	nsinfo__mountns_exit(&nsc);
+
+#ifdef HAVE_DEBUGINFOD_SUPPORT
+        if (realname == NULL) {
+                debuginfod_client* c = debuginfod_begin();
+                if (c != NULL) {
+                        int fd = debuginfod_find_debuginfo(c,
+                                                           (const unsigned char*)sbuild_id, 0,
+                                                           &realname);
+                        if (fd >= 0)
+                                close(fd); /* retaining reference by realname */
+                        debuginfod_end(c);
+                }
+        }
+#endif
+
 out:
 	free(debugfile);
 	return realname;


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 13:39                                                                                       ` ahmadkhorrami
@ 2020-03-31 14:44                                                                                         ` Milian Wolff
  2020-03-31 15:02                                                                                           ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: Milian Wolff @ 2020-03-31 14:44 UTC (permalink / raw)
  To: Jiri Olsa, ahmadkhorrami
  Cc: Steven Rostedt, Arnaldo Carvalho de Melo, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Namhyung Kim,
	Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 1947 bytes --]

On Dienstag, 31. März 2020 15:39:18 CEST ahmadkhorrami wrote:
> But the addresses do not match. Do you confirm this as a bug in
> libdwarf,...?
> 
> So I will ignore addresses without a matching symbol. But they do not
> seem reliable!
> 
> Could you tell me the name of the library that generates the raw
> addresses, so that I can try to debug it?

This is a platform specific question. There are multiple ways to unwind a 
stack. If you are on x86 then by default the .eh_frame section is available 
which holds the information necessary for unwinding. It doesn't depend on 
debug symbols, that's only used for symbolization and inline-frame resolution 
as Jiri indicated.

That said, in the context of perf, there are multiple scenarios that can lead 
to broken unwinding:

a) perf record --call-graph dwarf: unwinding can overflow the stack copy 
associated with every sample, so the upper end of the stack will be broken

b) perf record --call-graph $any: when you are sampling on a precise event, 
such as cycles:P which is the default afaik, then on Intel with PEBS e.g. the 
stack copy may be "wrong". See e.g. https://lkml.org/lkml/2018/11/6/257 and 
the overall thread. This is not solved yet afaik and after my initial attempt 
at workarounding this issue I stopped looking into it and instead opted for 
explicitly sampling on the non-precise events when I record call graphs... You 
could try that too: do you see the issue when you run e.g.:

`perf record --call-graph dwarf -e cycles`

This should take the non-precise version for sampling but then at least the 
call stacks are correct. I.e. you trade the accuracy of the instruction 
pointer to which a sample points with reduced call stack breakage.

c) bugs :)

-- 
Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 14:44                                                                                         ` Milian Wolff
@ 2020-03-31 15:02                                                                                           ` ahmadkhorrami
  2020-03-31 15:05                                                                                             ` ahmadkhorrami
  2020-03-31 15:29                                                                                             ` Milian Wolff
  0 siblings, 2 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31 15:02 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi Milian,
Thanks for the detailed answer. Well, the bug you mentioned is bad news, 
because I sample using uppp. Perhaps this leads to these weird traces. 
Is this a purely software bug?

On 2020-03-31 19:14, Milian Wolff wrote:

> On Dienstag, 31. März 2020 15:39:18 CEST ahmadkhorrami wrote:
> 
>> But the addresses do not match. Do you confirm this as a bug in
>> libdwarf,...?
>> 
>> So I will ignore addresses without a matching symbol. But they do not
>> seem reliable!
>> 
>> Could you tell me the name of the library that generates the raw
>> addresses, so that I can try to debug it?
> 
> This is a platform specific question. There are multiple ways to unwind 
> a
> stack. If you are on x86 then by default the .eh_frame section is 
> available
> which holds the information necessary for unwinding. It doesn't depend 
> on
> debug symbols, that's only used for symbolization and inline-frame 
> resolution
> as Jiri indicated.
> 
> That said, in the context of perf, there are multiple scenarios that 
> can lead
> to broken unwinding:
> 
> a) perf record --call-graph dwarf: unwinding can overflow the stack 
> copy
> associated with every sample, so the upper end of the stack will be 
> broken
> 
> b) perf record --call-graph $any: when you are sampling on a precise 
> event,
> such as cycles:P which is the default afaik, then on Intel with PEBS 
> e.g. the
> stack copy may be "wrong". See e.g. https://lkml.org/lkml/2018/11/6/257 
> and
> the overall thread. This is not solved yet afaik and after my initial 
> attempt
> at workarounding this issue I stopped looking into it and instead opted 
> for
> explicitly sampling on the non-precise events when I record call 
> graphs... You
> could try that too: do you see the issue when you run e.g.:
> 
> `perf record --call-graph dwarf -e cycles`
> 
> This should take the non-precise version for sampling but then at least 
> the
> call stacks are correct. I.e. you trade the accuracy of the instruction
> pointer to which a sample points with reduced call stack breakage.
> 
> c) bugs :)

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 15:02                                                                                           ` ahmadkhorrami
@ 2020-03-31 15:05                                                                                             ` ahmadkhorrami
  2020-03-31 15:29                                                                                             ` Milian Wolff
  1 sibling, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31 15:05 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

And it seems that the bogus backtraces constitute only a small portion 
of the whole log. This seems to be good news.

On 2020-03-31 19:32, ahmadkhorrami wrote:

> Hi Milian,
> Thanks for the detailed answer. Well, the bug you mentioned is bad 
> news, because I sample using uppp. Perhaps this leads to these weird 
> traces. Is this a purely software bug?
> 
> On 2020-03-31 19:14, Milian Wolff wrote:
> 
> On Dienstag, 31. März 2020 15:39:18 CEST ahmadkhorrami wrote:
> 
> But the addresses do not match. Do you confirm this as a bug in
> libdwarf,...?
> 
> So I will ignore addresses without a matching symbol. But they do not
> seem reliable!
> 
> Could you tell me the name of the library that generates the raw
> addresses, so that I can try to debug it?
> This is a platform specific question. There are multiple ways to unwind 
> a
> stack. If you are on x86 then by default the .eh_frame section is 
> available
> which holds the information necessary for unwinding. It doesn't depend 
> on
> debug symbols, that's only used for symbolization and inline-frame 
> resolution
> as Jiri indicated.
> 
> That said, in the context of perf, there are multiple scenarios that 
> can lead
> to broken unwinding:
> 
> a) perf record --call-graph dwarf: unwinding can overflow the stack 
> copy
> associated with every sample, so the upper end of the stack will be 
> broken
> 
> b) perf record --call-graph $any: when you are sampling on a precise 
> event,
> such as cycles:P which is the default afaik, then on Intel with PEBS 
> e.g. the
> stack copy may be "wrong". See e.g. https://lkml.org/lkml/2018/11/6/257 
> and
> the overall thread. This is not solved yet afaik and after my initial 
> attempt
> at workarounding this issue I stopped looking into it and instead opted 
> for
> explicitly sampling on the non-precise events when I record call 
> graphs... You
> could try that too: do you see the issue when you run e.g.:
> 
> `perf record --call-graph dwarf -e cycles`
> 
> This should take the non-precise version for sampling but then at least 
> the
> call stacks are correct. I.e. you trade the accuracy of the instruction
> pointer to which a sample points with reduced call stack breakage.
> 
> c) bugs :)

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 15:02                                                                                           ` ahmadkhorrami
  2020-03-31 15:05                                                                                             ` ahmadkhorrami
@ 2020-03-31 15:29                                                                                             ` Milian Wolff
  2020-03-31 16:10                                                                                               ` Arnaldo Carvalho de Melo
                                                                                                                 ` (2 more replies)
  1 sibling, 3 replies; 67+ messages in thread
From: Milian Wolff @ 2020-03-31 15:29 UTC (permalink / raw)
  To: ahmadkhorrami
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 1260 bytes --]

On Dienstag, 31. März 2020 17:02:37 CEST ahmadkhorrami wrote:
> Hi Milian,
> Thanks for the detailed answer. Well, the bug you mentioned is bad news.
> Because I sample using uppp. Perhaps this leads to these weird traces.

Please read the full thread from here on:

https://lkml.org/lkml/2018/11/2/86

But as I said - it should be easy to check if this is really the issue you are 
running into or not: Try to see if you see the problem when you sample without 
`ppp`. If not, then you can be pretty sure it's this issue. If you still see 
it, then it's something different.

> Is this a purely software bug?

I wouldn't call it that, personally. Rather, it's a limitation in the hardware 
and software. We would need something completely different to "fix" this, i.e. 
something like a deeper LBR. That's btw another alternative you could try: 
`perf record --call-graph lbr` and live with the short call stacks. But at 
least these should be correct (afaik). For me personally they are always far 
too short and thus not practical to use in reality.

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 15:29                                                                                             ` Milian Wolff
@ 2020-03-31 16:10                                                                                               ` Arnaldo Carvalho de Melo
  2020-03-31 19:20                                                                                                 ` ahmadkhorrami
  2020-03-31 19:17                                                                                               ` ahmadkhorrami
  2020-03-31 20:57                                                                                               ` ahmadkhorrami
  2 siblings, 1 reply; 67+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-03-31 16:10 UTC (permalink / raw)
  To: Milian Wolff
  Cc: ahmadkhorrami, Jiri Olsa, Steven Rostedt, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Namhyung Kim,
	Andi Kleen

On Tue, Mar 31, 2020 at 05:29:17PM +0200, Milian Wolff wrote:
> On Tuesday, 31 March 2020 17:02:37 CEST ahmadkhorrami wrote:
> > Hi Milian,
> > Thanks for the detailed answer. Well, the bug you mentioned is bad news.
> > Because I sample using uppp. Perhaps this leads to these weird traces.
> 
> Please read the full thread from here on:
> 
> https://lkml.org/lkml/2018/11/2/86
> 
> But as I said - it should be easy to check if this is really the issue are 
> running into or not: Try to see if you see the problem when you sample without 
> `ppp`. If not, then you can be pretty sure it's this issue. If you still see 
> it, then it's something different.
> 
> > Is this a purely software bug?
> 
> I wouldn't call it that, personally. Rather, it's a limitation in the hardware 
> and software. We would need something completely different to "fix" this, i.e. 
> something like a deeper LBR. That's btw another alternative you could try: 
> `perf record --call-graph lbr` and live with the short call stacks. But at 
> least these should be correct (afaik). For me personally they are always far 
> too short and thus not practical to use in reality.

Probably this may help:

  From: Kan Liang <kan.liang@linux.intel.com>
  Subject: [PATCH V4 00/17] Stitch LBR call stack (Perf Tools)
  Date: Thu, 19 Mar 2020 13:25:00 -0700
  https://lore.kernel.org/lkml/20200319202517.23423-1-kan.liang@linux.intel.com/

- Arnaldo
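
(If that series is available in your perf build, the usage, as far as I
understand the cover letter, is roughly: record with `perf record
--call-graph lbr` as above, then pass `--stitch-lbr` to `perf report` or
`perf script`, which stitches the LBR data of previous samples into longer
call stacks.)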


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 15:29                                                                                             ` Milian Wolff
  2020-03-31 16:10                                                                                               ` Arnaldo Carvalho de Melo
@ 2020-03-31 19:17                                                                                               ` ahmadkhorrami
  2020-03-31 20:57                                                                                               ` ahmadkhorrami
  2 siblings, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31 19:17 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Thanks.
On 2020-03-31 19:59, Milian Wolff wrote:

> On Tuesday, 31 March 2020 17:02:37 CEST ahmadkhorrami wrote:
> 
>> Hi Milian,
>> Thanks for the detailed answer. Well, the bug you mentioned is bad 
>> news.
>> Because I sample using uppp. Perhaps this leads to these weird traces.
> 
> Please read the full thread from here on:
> 
> https://lkml.org/lkml/2018/11/2/86
> 
> But as I said - it should be easy to check if this is really the issue 
> are
> running into or not: Try to see if you see the problem when you sample 
> without
> `ppp`. If not, then you can be pretty sure it's this issue. If you 
> still see
> it, then it's something different.
> 
>> Is this a purely software bug?
> 
> I wouldn't call it that, personally. Rather, it's a limitation in the 
> hardware
> and software. We would need something completely different to "fix" 
> this, i.e.
> something like a deeper LBR. That's btw another alternative you could 
> try:
> `perf record --call-graph lbr` and live with the short call stacks. But 
> at
> least these should be correct (afaik). For me personally they are 
> always far
> too short and thus not practical to use in reality.
> 
> Cheers

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 16:10                                                                                               ` Arnaldo Carvalho de Melo
@ 2020-03-31 19:20                                                                                                 ` ahmadkhorrami
  0 siblings, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31 19:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Milian Wolff, Jiri Olsa, Steven Rostedt, Linux-trace Users,
	Peter Zijlstra, linux-trace-users-owner, Jin Yao, Namhyung Kim,
	Andi Kleen

Thanks.
I checked "perf script -D", and it contains repeated addresses in 
itself. The problem seems to occur at the sampling instant. Does anybody 
know the location of the sampling handler?
I am looking forward to the answer.
Regards.

On 2020-03-31 20:40, Arnaldo Carvalho de Melo wrote:

> On Tue, Mar 31, 2020 at 05:29:17PM +0200, Milian Wolff wrote:
> On Tuesday, 31 March 2020 17:02:37 CEST ahmadkhorrami wrote:
> Hi Milian,
> Thanks for the detailed answer. Well, the bug you mentioned is bad news.
> Because I sample using uppp. Perhaps this leads to these weird traces.
> Please read the full thread from here on:
> 
> https://lkml.org/lkml/2018/11/2/86
> 
> But as I said - it should be easy to check if this is really the issue 
> are
> running into or not: Try to see if you see the problem when you sample 
> without
> `ppp`. If not, then you can be pretty sure it's this issue. If you 
> still see
> it, then it's something different.
> 
> Is this a purely software bug?
> I wouldn't call it that, personally. Rather, it's a limitation in the 
> hardware
> and software. We would need something completely different to "fix" 
> this, i.e.
> something like a deeper LBR. That's btw another alternative you could 
> try:
> `perf record --call-graph lbr` and live with the short call stacks. But 
> at
> least these should be correct (afaik). For me personally they are 
> always far
> too short and thus not practical to use in reality.

Probably this may help:

   From: Kan Liang <kan.liang@linux.intel.com>
   Subject: [PATCH V4 00/17] Stitch LBR call stack (Perf Tools)
   Date: Thu, 19 Mar 2020 13:25:00 -0700
   
https://lore.kernel.org/lkml/20200319202517.23423-1-kan.liang@linux.intel.com/

- Arnaldo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 15:29                                                                                             ` Milian Wolff
  2020-03-31 16:10                                                                                               ` Arnaldo Carvalho de Melo
  2020-03-31 19:17                                                                                               ` ahmadkhorrami
@ 2020-03-31 20:57                                                                                               ` ahmadkhorrami
  2020-04-04  1:01                                                                                                 ` ahmadkhorrami
  2 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-03-31 20:57 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi,
On 2020-03-31 19:59, Milian Wolff wrote:

> On Tuesday, 31 March 2020 17:02:37 CEST ahmadkhorrami wrote:
> 
>> Hi Milian,
>> Thanks for the detailed answer. Well, the bug you mentioned is bad 
>> news.
>> Because I sample using uppp. Perhaps this leads to these weird traces.
> 
> Please read the full thread from here on:
> 
> https://lkml.org/lkml/2018/11/2/86
> 
> But as I said - it should be easy to check if this is really the issue 
> are
> running into or not: Try to see if you see the problem when you sample 
> without
> `ppp`. If not, then you can be pretty sure it's this issue. If you 
> still see
> it, then it's something different.
> 
Well, the problem exists even when ":uppp" is changed to ":u".
>> Is this a purely software bug?
> 
> I wouldn't call it that, personally. Rather, it's a limitation in the 
> hardware
> and software. We would need something completely different to "fix" 
> this, i.e.
> something like a deeper LBR. That's btw another alternative you could 
> try:
> `perf record --call-graph lbr` and live with the short call stacks. But 
> at
> least these should be correct (afaik). For me personally they are 
> always far
> too short and thus not practical to use in reality.
> 
> Cheers
Regards.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-03-31 20:57                                                                                               ` ahmadkhorrami
@ 2020-04-04  1:01                                                                                                 ` ahmadkhorrami
  2020-04-11 16:42                                                                                                   ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-04-04  1:01 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi,
I think that I should give up and assume that all the seemingly illogical 
repeated addresses are inlined calls. Thanks, everybody, especially Jirka, 
Milian, Arnaldo and Steven. Perhaps some day I will come back and try to dig 
deeper (with your hints and assistance).
Regards.

On 2020-04-01 01:27, ahmadkhorrami wrote:

> Hi,
> On 2020-03-31 19:59, Milian Wolff wrote:
> 
> On Tuesday, 31 March 2020 17:02:37 CEST ahmadkhorrami wrote:
> 
> Hi Milian,
> Thanks for the detailed answer. Well, the bug you mentioned is bad 
> news.
> Because I sample using uppp. Perhaps this leads to these weird traces.
> Please read the full thread from here on:
> 
> https://lkml.org/lkml/2018/11/2/86
> 
> But as I said - it should be easy to check if this is really the issue 
> are
> running into or not: Try to see if you see the problem when you sample 
> without
> `ppp`. If not, then you can be pretty sure it's this issue. If you 
> still see
> it, then it's something different.
  Well, the problem exists even when ":uppp" is changed to ":u".

>> Is this a purely software bug?
> 
> I wouldn't call it that, personally. Rather, it's a limitation in the 
> hardware
> and software. We would need something completely different to "fix" 
> this, i.e.
> something like a deeper LBR. That's btw another alternative you could 
> try:
> `perf record --call-graph lbr` and live with the short call stacks. But 
> at
> least these should be correct (afaik). For me personally they are 
> always far
> too short and thus not practical to use in reality.
> 
> Cheers
  Regards.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-04-04  1:01                                                                                                 ` ahmadkhorrami
@ 2020-04-11 16:42                                                                                                   ` ahmadkhorrami
  2020-04-11 21:04                                                                                                     ` ahmadkhorrami
  0 siblings, 1 reply; 67+ messages in thread
From: ahmadkhorrami @ 2020-04-11 16:42 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

Hi,
I found the problem by building the libraries from source code. The 
"gmallocn()"s were inlined calls. So perhaps it would be better if Perf had 
reported them as "(inlined)", as in the common case.
The second problematic case (the repeated 
"gtk_action_muxer_query_action()"s) was purely due to my lack of 
knowledge. Sorry for wasting your time, especially Mr. Olsa, :D. It was 
inter-library tail-call elimination, which I did not know was possible (I 
thought tail calls happen only inside each translation unit). In other 
words, the last operation in the callee at 
"gtk_action_muxer_query_action+0x6c" is a jump to the beginning of 
"gtk_action_muxer_query_action()".
And the debug info in the dbg packages (Ubuntu, in my case) seems to be 
OK. I wonder why the source lines were reported as 0. Perhaps the unwinding 
library is bogus.

Finally, I have a new (hopefully not silly) question, :D. In the 
following backtrace, __GI___libc_malloc+0x197 is reported as inline, 
while its address (i.e., 0x7ffff4b07207) does not match that of its 
parent function's call point (i.e., gmalloc+0x59, which is 0x7fffd9872fb9). 
Is this expected? And, again, there are many such instances in the Perf 
output.

EvJobScheduler 10021  8653.926478:        100 
mem_load_uops_retired.l3_miss:uppp:     7fffd1062a00         5080022 
N/A|SNP N/A|TLB N/A|LCK N/A
    7ffff4b07207 tcache_get+0x197 (inlined)
    7ffff4b07207 __GI___libc_malloc+0x197 (inlined)
    7fffd9872fb9 gmalloc+0x59 (inlined)
    7fffd9872fb9 gmallocn+0x59 
(/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
    7fffd9872fb9 gmallocn+0x59 
(/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
    7fffd9951e6f _ZN8TextLine8coalesceEP10UnicodeMap+0xff 
(/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
    7fffd9952f82 _ZN9TextBlock8coalesceEP10UnicodeMapd+0x752 
(/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
    7fffd995bc37 _ZN8TextPage8coalesceEbdb+0x1507 
(/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
    7fffd995cb71 _ZN13TextOutputDev7endPageEv+0x31 
(/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
    7fffe803c6d2 _ZL26poppler_page_get_text_pageP12_PopplerPage+0x92 
(/usr/lib/x86_64-linux-gnu/libpoppler-glib.so.8.9.0)
    7fffe803deb3 poppler_page_get_selection_region+0x63 
(/usr/lib/x86_64-linux-gnu/libpoppler-glib.so.8.9.0)
    7fffe82ab650 [unknown] 
(/opt/evince-3.28.4/lib/evince/4/backends/libpdfdocument.so)
    7ffff795f165 ev_job_page_data_run+0x2f5 
(/opt/evince-3.28.4/lib/libevview3.so.3.0.0)
    7ffff7961309 ev_job_thread+0xe9 (inlined)
    7ffff7961309 ev_job_thread_proxy+0xe9 
(/opt/evince-3.28.4/lib/libevview3.so.3.0.0)
    7ffff5492194 g_thread_proxy+0x54 
(/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
    7ffff4e686da start_thread+0xda 
(/lib/x86_64-linux-gnu/libpthread-2.27.so)
    7ffff4b9188e __GI___clone+0x3e (inlined)

I am looking forward to any help or suggestions.
Regards.

On 2020-04-04 05:31, ahmadkhorrami wrote:

> Hi,
> I think that I should give up and assume that all illogical repeated 
> addresses are inlines. Thanks everybody, especially, Jirka, Milian, 
> Arnaldo and Steven. Perhaps some day I will be back and try to dig 
> deeper (with your hints and assistance).
> Regards.
> 
> On 2020-04-01 01:27, ahmadkhorrami wrote:
> 
>> Hi,
>> On 2020-03-31 19:59, Milian Wolff wrote:
>> 
>> On Tuesday, 31 March 2020 17:02:37 CEST ahmadkhorrami wrote:
>> 
>> Hi Milian,
>> Thanks for the detailed answer. Well, the bug you mentioned is bad 
>> news.
>> Because I sample using uppp. Perhaps this leads to these weird traces.
>> Please read the full thread from here on:
>> 
>> https://lkml.org/lkml/2018/11/2/86
>> 
>> But as I said - it should be easy to check if this is really the issue 
>> are
>> running into or not: Try to see if you see the problem when you sample 
>> without
>> `ppp`. If not, then you can be pretty sure it's this issue. If you 
>> still see
>> it, then it's something different.
> Well, the problem exists even when ":uppp" is changed to ":u".
> 
> Is this a purely software bug?
> I wouldn't call it that, personally. Rather, it's a limitation in the 
> hardware
> and software. We would need something completely different to "fix" 
> this, i.e.
> something like a deeper LBR. That's btw another alternative you could 
> try:
> `perf record --call-graph lbr` and live with the short call stacks. But 
> at
> least these should be correct (afaik). For me personally they are 
> always far
> too short and thus not practical to use in reality.
> 
> Cheers
   Regards.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Wrong Perf Backtraces
  2020-04-11 16:42                                                                                                   ` ahmadkhorrami
@ 2020-04-11 21:04                                                                                                     ` ahmadkhorrami
  0 siblings, 0 replies; 67+ messages in thread
From: ahmadkhorrami @ 2020-04-11 21:04 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Jiri Olsa, Steven Rostedt, Arnaldo Carvalho de Melo,
	Linux-trace Users, Peter Zijlstra, linux-trace-users-owner,
	Jin Yao, Namhyung Kim, Andi Kleen

I traced the code and it jumped into the PLT and .... Therefore, it seems 
that it is actually a function call that is incorrectly reported as 
inline. In other words, this is a called function incorrectly reported 
as inline, while the "gmallocn" case was an inlined call incorrectly omitted. 
Hopefully, I'm right.
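
One way to cross-check such a frame, assuming the library's debug info is 
installed, is to ask addr2line for the inline chain at the offset in 
question, for example (the library path and offset are placeholders):

    addr2line -f -C -i -e /path/to/library.so 0x<offset>

If only a single function is printed for that address, the frame is a real 
call rather than an inlined one.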

Regards.

On 2020-04-11 21:12, ahmadkhorrami wrote:

> Hi,
> I found the problem by building the libraries from source code. The 
> "gmallocn()"s were inlines. So, perhaps it would be better if Perf had 
> reported it as "(inlined)" as in the common case.
> The second problematic case (repeated 
> "gtk_action_muxer_query_action()"s) was purely due to my lack of 
> knowledge. Sorry for wasting your time, specially, Mr. Olsa, :D. It was 
> inter-library tail-call elimination, which I did not know is possible 
> (I thought tail-calls happen only inside each translation unit). In 
> other words, the last operation in the callee at 
> "gtk_action_muxer_query_action+0x6c" is a jump to the beginning of 
> "gtk_action_muxer_query_action()".
> And the debug info for the dbg-packages (Ubuntu in my case), seems to 
> be OK. I wonder why source lines were reported as 0. Perhaps, the 
> unwinding library is bogus.
> 
> Finally, I have a new (hopefully, unsilly) question, :D. In the 
> following backtrace, __GI___libc_malloc+0x197 is reported as inline. 
> While its address (i.e., 0x7ffff4b07207) does not match that of its 
> parent function call point (i.e., gmalloc+0x59 which is 
> 0x7fffd9872fb9). Is this logical? And, again, there are many such 
> instances in the Perf output.
> 
> EvJobScheduler 10021  8653.926478:        100 
> mem_load_uops_retired.l3_miss:uppp:     7fffd1062a00         5080022 
> N/A|SNP N/A|TLB N/A|LCK N/A
> 7ffff4b07207 tcache_get+0x197 (inlined)
> 7ffff4b07207 __GI___libc_malloc+0x197 (inlined)
> 7fffd9872fb9 gmalloc+0x59 (inlined)
> 7fffd9872fb9 gmallocn+0x59 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffd9872fb9 gmallocn+0x59 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffd9951e6f _ZN8TextLine8coalesceEP10UnicodeMap+0xff 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffd9952f82 _ZN9TextBlock8coalesceEP10UnicodeMapd+0x752 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffd995bc37 _ZN8TextPage8coalesceEbdb+0x1507 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffd995cb71 _ZN13TextOutputDev7endPageEv+0x31 
> (/usr/lib/x86_64-linux-gnu/libpoppler.so.73.0.0)
> 7fffe803c6d2 _ZL26poppler_page_get_text_pageP12_PopplerPage+0x92 
> (/usr/lib/x86_64-linux-gnu/libpoppler-glib.so.8.9.0)
> 7fffe803deb3 poppler_page_get_selection_region+0x63 
> (/usr/lib/x86_64-linux-gnu/libpoppler-glib.so.8.9.0)
> 7fffe82ab650 [unknown] 
> (/opt/evince-3.28.4/lib/evince/4/backends/libpdfdocument.so)
> 7ffff795f165 ev_job_page_data_run+0x2f5 
> (/opt/evince-3.28.4/lib/libevview3.so.3.0.0)
> 7ffff7961309 ev_job_thread+0xe9 (inlined)
> 7ffff7961309 ev_job_thread_proxy+0xe9 
> (/opt/evince-3.28.4/lib/libevview3.so.3.0.0)
> 7ffff5492194 g_thread_proxy+0x54 
> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.4)
> 7ffff4e686da start_thread+0xda 
> (/lib/x86_64-linux-gnu/libpthread-2.27.so)
> 7ffff4b9188e __GI___clone+0x3e (inlined)
> 
> I am looking forward to any help or suggestions.
> Regards.
> 
> On 2020-04-04 05:31, ahmadkhorrami wrote:
> 
> Hi,
> I think that I should give up and assume that all illogical repeated 
> addresses are inlines. Thanks everybody, especially, Jirka, Milian, 
> Arnaldo and Steven. Perhaps some day I will be back and try to dig 
> deeper (with your hints and assistance).
> Regards.
> 
> On 2020-04-01 01:27, ahmadkhorrami wrote:
> 
> Hi,
> On 2020-03-31 19:59, Milian Wolff wrote:
> 
> On Tuesday, 31 March 2020 17:02:37 CEST ahmadkhorrami wrote:
> 
> Hi Milian,
> Thanks for the detailed answer. Well, the bug you mentioned is bad 
> news.
> Because I sample using uppp. Perhaps this leads to these weird traces.
> Please read the full thread from here on:
> 
> https://lkml.org/lkml/2018/11/2/86
> 
> But as I said - it should be easy to check if this is really the issue 
> are
> running into or not: Try to see if you see the problem when you sample 
> without
> `ppp`. If not, then you can be pretty sure it's this issue. If you 
> still see
> it, then it's something different. Well, the problem exists even when 
> ":uppp" is changed to ":u".
> 
> Is this a purely software bug?
> I wouldn't call it that, personally. Rather, it's a limitation in the 
> hardware
> and software. We would need something completely different to "fix" 
> this, i.e.
> something like a deeper LBR. That's btw another alternative you could 
> try:
> `perf record --call-graph lbr` and live with the short call stacks. But 
> at
> least these should be correct (afaik). For me personally they are 
> always far
> too short and thus not practical to use in reality.
> 
> Cheers
    Regards.

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2020-04-11 21:04 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <157597d74ff17f781d9de7e7e3defd13@ut.ac.ir>
2020-03-22 20:24 ` Wrong Perf Backtraces ahmadkhorrami
2020-03-23  0:34   ` Steven Rostedt
     [not found]     ` <21b3df4080709f193d62b159887e2a83@ut.ac.ir>
2020-03-23  8:49       ` Jiri Olsa
2020-03-23 10:03         ` ahmadkhorrami
2020-03-25 15:18           ` ahmadkhorrami
2020-03-25 15:46             ` Jiri Olsa
2020-03-25 18:54               ` ahmadkhorrami
2020-03-25 18:58               ` Arnaldo Carvalho de Melo
2020-03-25 19:10                 ` ahmadkhorrami
2020-03-25 19:28                   ` Arnaldo Carvalho de Melo
2020-03-25 20:01                     ` ahmadkhorrami
2020-03-25 20:39                       ` Jiri Olsa
2020-03-25 21:02                         ` Jiri Olsa
2020-03-25 21:09                           ` Steven Rostedt
2020-03-25 21:37                             ` ahmadkhorrami
2020-03-25 21:46                               ` Jiri Olsa
2020-03-25 22:21                                 ` ahmadkhorrami
2020-03-25 23:09                                   ` ahmadkhorrami
2020-03-26  9:59                                     ` Jiri Olsa
2020-03-26 13:20                                       ` ahmadkhorrami
2020-03-26 15:39                                         ` Jiri Olsa
2020-03-26 18:19                                           ` ahmadkhorrami
2020-03-26 18:21                                             ` ahmadkhorrami
2020-03-27  9:20                                             ` Jiri Olsa
2020-03-27 10:59                                               ` ahmadkhorrami
2020-03-27 11:04                                                 ` ahmadkhorrami
2020-03-27 12:10                                                   ` Milian Wolff
2020-03-27 12:58                                                     ` ahmadkhorrami
2020-03-27 13:25                                                       ` Milian Wolff
2020-03-27 13:33                                                         ` ahmadkhorrami
2020-03-27 18:43                                                   ` ahmadkhorrami
2020-03-27 22:37                                                     ` Jiri Olsa
2020-03-27 23:12                                                       ` ahmadkhorrami
2020-03-28 23:34                                                         ` Jiri Olsa
2020-03-29  0:43                                                           ` ahmadkhorrami
2020-03-29  1:16                                                             ` ahmadkhorrami
2020-03-29 11:19                                                               ` Jiri Olsa
2020-03-29 11:52                                                                 ` ahmadkhorrami
2020-03-29 12:08                                                                   ` Jiri Olsa
2020-03-29 12:39                                                                     ` ahmadkhorrami
2020-03-29 13:50                                                                       ` Milian Wolff
2020-03-29 14:23                                                                         ` ahmadkhorrami
2020-03-29 19:20                                                                         ` Jiri Olsa
2020-03-30  6:09                                                                           ` Milian Wolff
2020-03-30 13:07                                                                             ` Jiri Olsa
2020-03-30 13:49                                                                               ` ahmadkhorrami
2020-03-30 19:05                                                                                 ` ahmadkhorrami
2020-03-30 21:05                                                                                   ` debuginfod-based dwarf downloading, was " Frank Ch. Eigler
2020-03-31  9:26                                                                                     ` Jiri Olsa
2020-03-31 14:00                                                                                       ` Frank Ch. Eigler
2020-03-31  4:43                                                                                   ` ahmadkhorrami
2020-03-31  9:30                                                                                     ` Jiri Olsa
2020-03-31 11:53                                                                                       ` ahmadkhorrami
2020-03-31 12:43                                                                                   ` ahmadkhorrami
2020-03-31 13:20                                                                                     ` Jiri Olsa
2020-03-31 13:39                                                                                       ` ahmadkhorrami
2020-03-31 14:44                                                                                         ` Milian Wolff
2020-03-31 15:02                                                                                           ` ahmadkhorrami
2020-03-31 15:05                                                                                             ` ahmadkhorrami
2020-03-31 15:29                                                                                             ` Milian Wolff
2020-03-31 16:10                                                                                               ` Arnaldo Carvalho de Melo
2020-03-31 19:20                                                                                                 ` ahmadkhorrami
2020-03-31 19:17                                                                                               ` ahmadkhorrami
2020-03-31 20:57                                                                                               ` ahmadkhorrami
2020-04-04  1:01                                                                                                 ` ahmadkhorrami
2020-04-11 16:42                                                                                                   ` ahmadkhorrami
2020-04-11 21:04                                                                                                     ` ahmadkhorrami

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).