* Re: perf report with newest perf not showing jump targets on "most" C++ code
From: Travis Downs @ 2021-10-23 15:41 UTC
To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users
Just keeping the dream alive that this will be fixed one day. It's
still making perf report hard to use in C++.
As noted above it was introduced in 751b1783da784299b0509adb6a9cd3024cc4f837.
On Tue, Jun 25, 2019 at 1:18 PM Travis Downs <travis.downs@gmail.com> wrote:
>
> Hi,
>
> Just wondering if there is anything I can do to help move this along.
>
> I guess it hasn't been reported elsewhere yet because this perf
> version hasn't hit the mainstream, but I guess it will break in a
> widespread way for C++ perf report users when it starts to get
> deployed by the distros.
>
> Cheers,
> Travis
>
>
> On Tue, May 21, 2019 at 8:38 PM Travis Downs <travis.downs@gmail.com> wrote:
> >
> > Hi Arnaldo,
> >
> > Yes, --stdio2 worked well (but a bit tricky because --stdio2 was
> > introduced fairly close in the commit stream to this issue)!
> >
> > I bisected it to:
> >
> > 751b1783da784299b0509adb6a9cd3024cc4f837
> > perf annotate: Mark jumps to outher functions with the call arrow
> >
> > Indeed, in the detailed description we have:
> >
> > "... [f]or now lets just make some progress by marking jumps to outside the
> > current function as call like ..."
> >
> > And empirically I see most jumps being marked as call-like after this
> > change. Note however: with this change I see a middle-ground behavior
> > compared to what I described earlier: all the arrows on the jumps look
> > call-like, but when you select them in report tui it still correctly
> > shows the jump arrow. However, later on (by v4.20 or earlier) this
> > jump indicator disappears too.
> >
> > Here's some reproducing code:
> >
> > ---
> > #include <algorithm>
> > #include <cstdint>   // uint64_t
> > #include <cstdio>    // printf
> > #include <ctime>     // clock, clock_t, CLOCKS_PER_SEC
> > #include <memory>
> > #include <numeric>   // std::iota
> > #include <random>
> >
> > const size_t size = 10000000;
> >
> > int main() {
> >
> >     auto buf = std::unique_ptr<uint64_t[]>(new uint64_t[size]);
> >
> >     auto b = buf.get(), e = b + size;
> >     std::iota(b, e, 0);
> >     std::shuffle(b, e, std::default_random_engine{});
> >
> >     clock_t before, after;
> >
> >     before = clock();
> >     std::sort(b, e);
> >     after = clock();
> >
> >     printf("Sorting took %5.2f ns/elem\n", (after - before) * 1000000000. /
> >            size / CLOCKS_PER_SEC);
> > }
> > ---
> >
> > I compile it with:
> >
> > g++ -O3 -g -march=haswell -Wall -Wextra -std=c++11 -c bench.cpp
> >
> > If you run `perf annotate --stdio2 --no-source -Mintel` you see something like:
> >
> > Disassembly of section .text:
> >
> > 0000000000400dc0 <void std::__introsort_loop<unsigned
> > long*, long, __gnu_cxx::__ops::_Iter_less_iter>(unsigned long*,
> > unsigned long*, long, __gnu_cxx::__ops::_Iter_less_iter) [clone
> > .isra.20]
> > _ZSt16__introsort_loopIPmlN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S4_T0_T1_():
> > 0.03 mov rax,rsi
> > sub rax,rdi
> > cmp rax,0x87
> > 0.03 → jle 400f5f <void std::__introsort_loop<unsigned
> > long*, long, __gnu_cxx::__ops::_Iter_less_iter>(unsigned long*,
> > unsigned long*, long, __gnu_cxx::__ops::_Iter_less_iter) [clone
> > .isra.20]+
> > 0.18 push r14
> > mov r14,rdx
> > push r13
> > mov r13,rdi
> > 0.05 push r12
> > 0.05 push rbp
> > push rbx
> > test rdx,rdx
> > → je 400eeb <void std::__introsort_loop<unsigned
> > long*, long, __gnu_cxx::__ops::_Iter_less_iter>(unsigned long*,
> > unsigned long*, long, __gnu_cxx::__ops::_Iter_less_iter) [clone
> > .isra.20]+
> > 0.03 lea rbp,[rdi+0x10]
> > _ZSt27__unguarded_partition_pivotIPmN9__gnu_cxx5__ops15_Iter_less_iterEET_S4_S4_T0_():
> > sar rax,0x4
> > 0.15 mov rcx,QWORD PTR [r13+0x8]
> > _ZSt16__introsort_loopIPmlN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S4_T0_T1_():
> > 0.03 sub r14,0x1
> > _ZSt27__unguarded_partition_pivotIPmN9__gnu_cxx5__ops15_Iter_less_iterEET_S4_S4_T0_():
> > lea rdx,[r13+rax*8+0x0]
> > mov rdi,QWORD PTR [rsi-0x8]
> > 0.03 mov rax,QWORD PTR [rdx]
> > _ZSt22__move_median_to_firstIPmN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S4_S4_S4_T0_():
> > cmp rcx,rax
> > → jae 400eb0 <void std::__introsort_loop<unsigned
> > long*, long, __gnu_cxx::__ops::_Iter_less_iter>(unsigned long*,
> > unsigned long*, long, __gnu_cxx::__ops::_Iter_less_iter) [clone
> > .isra.20]+
> > 0.08 cmp rax,rdi
> > 0.18 → jb 400ebe <void std::__introsort_loop<unsigned
> > long*, long, __gnu_cxx::__ops::_Iter_less_iter>(unsigned long*,
> > unsigned long*, long, __gnu_cxx::__ops::_Iter_less_iter) [clone
> > .isra.20]+
> > 0.03 cmp rcx,rdi
> > → jae 400ed6 <void std::__introsort_loop<unsigned
> > long*, long, __gnu_cxx::__ops::_Iter_less_iter>(unsigned long*,
> > unsigned long*, long, __gnu_cxx::__ops::_Iter_less_iter) [clone
> > .isra.20]+
> > _ZSt4swapImEvRT_S1_():
> > 0.10 mov rdx,QWORD PTR [r13+0x0]
> >
> >
> > Note all the "call-like arrows", even though most of those jumps are
> > nearby in the same function. This function is a [clone .isra], i.e. a
> > cloned function with certain values inlined. Maybe annotate/report is
> > having trouble in those cases, thinking the jump is to another
> > function?
> >
> > Cheers,
> > Travis
> >
> >
> > On Wed, May 15, 2019 at 9:38 PM Arnaldo Carvalho de Melo
> > <arnaldo.melo@gmail.com> wrote:
> > >
> > > On May 15, 2019 10:04:05 PM GMT-03:00, Travis Downs <travis.downs@gmail.com> wrote:
> > > >Hi Arnaldo,
> > > >
> > > >I'll work on a repro case. One issue though: the arrows show up in the
> > > >interactive tui mode, but to git bisect I'd like to script it, so have some
> > > >arrow indication in --stdio mode: any way to do that?
> > >
> > > Try --stdio2, maybe it's enough.
> > >
> > > - Arnaldo
> > > >
> > > >Cheers,
> > > >Travis
> > > >
> > > >On Tue, May 14, 2019, 5:01 PM Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> wrote:
> > > >
> > > >> On Sun, May 05, 2019 at 06:58:51PM -0500, Travis Downs wrote:
> > > >> > For a few months the newest perf (built from Linus' tip) doesn't show
> > > >> > jump targets in most C++ code for me.
> > > >> >
> > > >> > The arrows on jcc instructions always point to the left, like for
> > > >> > jumps that can't be resolved, even when the jumps are only a few bytes
> > > >> > away.
> > > >> >
> > > >> > I tried to reproduce it on a "simple" C++ looping process, with the
> > > >> > same compile flags as my project, but it worked there! So something is
> > > >> > triggering failure to resolve jump targets.
> > > >> >
> > > >> > Any advice on how to bisect or diagnose this issue would be
> > > >> > appreciated, e.g., some way to turn on logging to see why perf report
> > > >> > fails to resolve jump targets.
> > > >> >
> > > >> > FWIW perf as included in my distribution (perf version 4.15.18) works
> > > >> > OK on the same files (it's perf report that matters, not perf record)
> > > >> > but doesn't demangle the C++ symbols.
> > > >>
> > > >> Adding this to my backlog, but you could just try to use normal 'git
> > > >> bisect', i.e. find a version that works for a workload you have, then go
> > > >> on rebuilding it till you find the version that introduced the problem.
> > > >>
> > > >> If you can send some simple C++ source code that exhibits the problem,
> > > >> please do, I'll try to work on this as soon as I process my backlog...
> > > >>
> > > >> I just came back from vacation.
> > > >>
> > > >> - Arnaldo
> > > >>
> > >
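
The hypothesis in the quoted May 21 message (jumps inside a [clone .isra]
function being treated as jumps to another function) comes down to a range
check on the jump target. Below is a minimal C++ sketch of that check. It is
not perf's code: FunctionRange, ArrowKind and classify_jump are hypothetical
names, and any end address passed in is an assumption, since the thread only
shows the function's start address (0x400dc0).

```cpp
#include <cstdint>

// Hypothetical description of one function's address range: [start, end).
// Not a perf data structure; for illustration only.
struct FunctionRange {
    std::uint64_t start;  // address of the first instruction
    std::uint64_t end;    // one past the last byte (assumed by the caller)
};

// Which arrow an annotation view would draw next to a jump instruction.
enum class ArrowKind { Jump, CallLike };

// A jump whose target lies inside the current function keeps the jump arrow;
// a target outside the range is treated as call-like, which is the behavior
// the commit message quoted in the thread describes.
constexpr ArrowKind classify_jump(const FunctionRange& fn, std::uint64_t target) {
    return (target >= fn.start && target < fn.end) ? ArrowKind::Jump
                                                   : ArrowKind::CallLike;
}
```

With an assumed range of [0x400dc0, 0x401000), classify_jump(fn, 0x400f5f)
yields ArrowKind::Jump for the first jle in the listing. If the range were
computed too small, or the target's symbol did not match the current function,
the same jump would come out as ArrowKind::CallLike, which matches the symptom
reported in the thread.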