From: Steven Rostedt <rostedt@goodmis.org>
To: ahmadkhorrami <ahmadkhorrami@ut.ac.ir>
Cc: Linux-trace Users <linux-trace-users@vger.kernel.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
linux-perf-users@vger.kernel.org
Subject: Re: Perf Script Erroneous User Stack Trace
Date: Mon, 15 Jun 2020 16:31:45 -0400
Message-ID: <20200615163145.458bd878@oasis.local.home>
In-Reply-To: <816cb5f558cd0e528812dff2168ef4ca@ut.ac.ir>
On Sun, 14 Jun 2020 18:13:21 +0430
ahmadkhorrami <ahmadkhorrami@ut.ac.ir> wrote:
> Hi,
>
> I used the following command to sample backtraces for a simple "ffmpeg"
> benchmark:
> sudo perf record -d --call-graph dwarf,65528 -c 1000000 -e
> mem_load_uops_retired.l3_miss:u ffmpeg -i
> /media/ahmad/DATA/Videos/video.mp4 -threads 1 -vf spp out.mp4
>
> As can be seen, PEBS is not used, the stack size is set to the maximum,
> and the sampling period is quite large. I also limited the thread count.
> Still, this is the first portion of the "perf script --no-demangle" output:
> ffmpeg 11750 6670.061261: 1000000 mem_load_uops_retired.l3_miss:u:
> 0 5080021 N/A|SNP N/A|TLB N/A|LCK N/A
> 7fffeab68844 x264_pixel_avg_w16_avx2+0x4
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
>
> ffmpeg 11750 6670.274835: 1000000 mem_load_uops_retired.l3_miss:u:
> 0 5080021 N/A|SNP N/A|TLB N/A|LCK N/A
> 7fffeab68844 x264_pixel_avg_w16_avx2+0x4
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
>
> ffmpeg 11750 6670.496159: 1000000 mem_load_uops_retired.l3_miss:u:
> 0 5080021 N/A|SNP N/A|TLB N/A|LCK N/A
> 7fffeab8ef89 x264_pixel_sad_x4_16x16_avx2+0x49
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
>
> ffmpeg 11750 6670.852598: 1000000 mem_load_uops_retired.l3_miss:u:
> 0 5080021 N/A|SNP N/A|TLB N/A|LCK N/A
> 7fffeaac97b3 pixel_memset+0x293 (inlined)
> 7fffeaac97b3 plane_expand_border+0x293 (inlined)
> 7fffeaac97b3 x264_frame_expand_border_filtered+0x293
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 7fffeab463bc x264_fdec_filter_row+0x69c
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 7fffeab49523 x264_slice_write+0x1873
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 7fffeab85285 x264_stack_align+0x15
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 7fffeab45bdb x264_slices_write+0xfb
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 5555561e3d87 [unknown] ([heap])
>
> ffmpeg 11750 6671.110007: 1000000 mem_load_uops_retired.l3_miss:u:
> 0 5080021 N/A|SNP N/A|TLB N/A|LCK N/A
> 7fffeab6cdde x264_frame_init_lowres_core_avx2+0x8e
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
>
> ffmpeg 11750 6671.463562: 1000000 mem_load_uops_retired.l3_miss:u:
> 0 5080021 N/A|SNP N/A|TLB N/A|LCK N/A
> 7fffeaabf806 x264_macroblock_load_pic_pointers+0x886 (inlined)
> 7fffeaabf806 x264_macroblock_cache_load+0x886 (inlined)
> 7fffeaabf806 x264_macroblock_cache_load_progressive+0x886
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 7fffeab49204 x264_slice_write+0x1554
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 7fffeab85285 x264_stack_align+0x15
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 7fffeab45bdb x264_slices_write+0xfb
> (/usr/lib/x86_64-linux-gnu/libx264.so.152)
> 1c [unknown] ([unknown])
>
> None of the backtraces are correct, because none of them begin with
> "__start" or "__GI___clone". I also tried "LBR" instead, but it has
> tighter size constraints and is therefore not suitable. The important
> thing to note is that the problem occurs only with user-space events
> (and for all the events that I checked). I do not think that the problem
> is with the debug info, because I called the "perf_event_open()" system
> call manually (without using "perf") and the problem was still there
> (with raw callstack IPs).
>
> Therefore, I assumed that the problem is inside the kernel. Precisely,
> it should be where the userspace callchain is extracted or dumped. I
> looked for the latter (i.e., the callchain dump implementation) and it
> seemed to be here:
> https://github.com/torvalds/linux/blob/master/kernel/events/core.c#L6786
>
> But I could not (or, equivalently, did not know how to) view the user
> callchain instruction pointers.
> Am I on the right track? Does anybody know the kernel mechanism for
> extracting userspace callchains?
>
> Please accept my apology for my frequent questions. I tried to get
> around the problem, myself, but it has taken more than three complete
> days and I'm stuck!
> I really appreciate any suggestions.
No problem, but please note that perf questions are more likely to be
answered on linux-perf-users@vger.kernel.org than on linux-trace-users,
since linux-trace-users is mainly for ftrace rather than perf.
-- Steve
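
One rough way to quantify the symptom described in the quoted report
(call chains that never reach a process or thread entry point) is to scan
the "perf script" text output and flag every sample whose outermost frame
is not a known root symbol. The sketch below is only an illustration and
not part of perf: the function names, the regular expression, and the set
of root symbols are assumptions, and it expects unwrapped output in which
each sample header starts in column 0 and each frame is an indented
"<hex address> <symbol>+<offset> (<dso>)" line.

```python
import re

# Assumption: a frame line is indented and starts with a hex address
# followed by the symbol (optionally with a "+0x..." offset).
FRAME_RE = re.compile(r"^\s+([0-9a-f]+)\s+(\S+)")

# Assumption: symbols taken to mark a fully unwound chain. The first two
# are common entry-point names; the last two come from the quoted report.
ROOT_SYMBOLS = {"_start", "__clone", "__start", "__GI___clone"}


def split_samples(text):
    """Return one list of symbol names (innermost first) per sample."""
    samples = []
    frames = None
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            if frames is not None:
                # Drop the "+0x..." offset so only the symbol name remains.
                frames.append(match.group(2).split("+0x")[0])
        elif line.strip():
            # Any other non-empty line is treated as a sample header.
            frames = []
            samples.append(frames)
    return samples


def truncated_samples(text):
    """Return indices of samples whose outermost frame is not a root symbol."""
    return [
        index
        for index, frames in enumerate(split_samples(text))
        if not frames or frames[-1] not in ROOT_SYMBOLS
    ]


if __name__ == "__main__":
    import sys

    data = sys.stdin.read()
    bad = truncated_samples(data)
    print(f"{len(bad)} of {len(split_samples(data))} samples look truncated")
```

It could be fed with something like "perf script --no-demangle | python3
check_roots.py", where check_roots.py is a hypothetical file name for the
script above. A flagged sample only shows that the chain stops early; it
does not say whether the cause is in the kernel's stack copy, the unwinder,
or missing unwind information.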