linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Milian Wolff <milian.wolff@kdab.com>
To: Milian Wolff <milian.wolff@kdab.com>
Cc: Milian Wolff <mail@milianw.de>,
	linux-kernel@vger.kernel.org, Jiri Olsa <jolsa@kernel.org>,
	namhyung@kernel.org, linux-perf-users@vger.kernel.org,
	Arnaldo Carvalho <acme@kernel.org>
Subject: Re: Broken dwarf unwinding - wrong stack pointer register value?
Date: Mon, 22 Oct 2018 13:17:42 +0200	[thread overview]
Message-ID: <1775744.0RuD2nMEIt@agathebauer> (raw)
In-Reply-To: <3129539.p6j0Z5hDSJ@agathebauer>

[-- Attachment #1: Type: text/plain, Size: 9387 bytes --]

On Montag, 22. Oktober 2018 12:35:39 CEST Milian Wolff wrote:
> On Sonntag, 21. Oktober 2018 00:39:51 CEST Milian Wolff wrote:
> > Hey all,
> > 
> > I'm on the quest to figure out why perf regularly fails to unwind (some)
> > samples. I am seeing very strange behavior, where an apparently wrong
> > stack
> > pointer value is read from the register - see below for more information
> > and the end of this (long) mail for my open questions. Any help would be
> > greatly appreciated.
> > 
> > I am currently using this trivial C++ code to reproduce the issue:
> > 
> > ```
> > #include <cmath>
> > #include <complex>
> > #include <iostream>
> > #include <random>
> > 
> > using namespace std;
> > 
> > int main()
> > {
> > 
> >     uniform_real_distribution<double> uniform(-1E5, 1E5);
> >     default_random_engine engine;
> >     double s = 0;
> >     for (int i = 0; i < 10000000; ++i) {
> >     
> >         s += norm(complex<double>(uniform(engine), uniform(engine)));
> >     
> >     }
> >     cout << s << '\n';
> >     return 0;
> > 
> > }
> > ```
> > 
> > I compile it with `g++ -O2 -g` and then record it with `perf record
> > --call-
> > graph dwarf`. Using perf script, I then see e.g.:
> > 
> > ```
> > $ perf script -v --no-inline --time 90229.12668,90229.127158 --ns
> > ...
> > # first frame (working unwinding from __hypot_finite):
> > unwind: reg 16, val 7faf7dca2696
> > unwind: reg 7, val 7ffc80811ca0
> > unwind: find_proc_info dso /usr/lib/libm-2.28.so
> > unwind: access_mem addr 0x7ffc80811ce8 val 7faf7dc88af9, offset 72
> > unwind: find_proc_info dso /usr/lib/libm-2.28.so
> > unwind: access_mem addr 0x7ffc80811d08 val 56382b0fc129, offset 104
> > unwind: find_proc_info dso
> > /home/milian/projects/kdab/rnd/hotspot/build/tests/
> > test-clients/cpp-inlining/cpp-inlining
> > unwind: access_mem addr 0x7ffc80811d58 val 7faf7dabf223, offset 184
> > unwind: find_proc_info dso /usr/lib/libc-2.28.so
> > unwind: access_mem addr 0x7ffc80811e18 val 56382b0fc1ee, offset 376
> > unwind: find_proc_info dso
> > /home/milian/projects/kdab/rnd/hotspot/build/tests/
> > test-clients/cpp-inlining/cpp-inlining
> > unwind: __hypot_finite:ip = 0x7faf7dca2696 (0x29696)
> > unwind: hypotf32x:ip = 0x7faf7dc88af8 (0xfaf8)
> > unwind: main:ip = 0x56382b0fc128 (0x1128)
> > unwind: __libc_start_main:ip = 0x7faf7dabf222 (0x24222)
> > unwind: _start:ip = 0x56382b0fc1ed (0x11ed)
> > # second frame (unrelated)
> > unwind: reg 16, val 56382b0fc114
> > unwind: reg 7, val 7ffc80811d10
> > unwind: find_proc_info dso
> > /home/milian/projects/kdab/rnd/hotspot/build/tests/
> > test-clients/cpp-inlining/cpp-inlining
> > unwind: access_mem addr 0x7ffc80811d58 val 7faf7dabf223, offset 72
> > unwind: access_mem addr 0x7ffc80811e18 val 56382b0fc1ee, offset 264
> > unwind: main:ip = 0x56382b0fc114 (0x1114)
> > unwind: __libc_start_main:ip = 0x7faf7dabf222 (0x24222)
> > unwind: _start:ip = 0x56382b0fc1ed (0x11ed)
> > # third frame (broken unwinding from __hypot_finite)
> > unwind: reg 16, val 7faf7dca2688
> > unwind: reg 7, val 7ffc80811ca0
> > unwind: find_proc_info dso /usr/lib/libm-2.28.so
> > unwind: access_mem addr 0x7ffc80811cc0 val 0, offset 32
> > unwind: __hypot_finite:ip = 0x7faf7dca2688 (0x29688)
> > unwind: '':ip = 0xffffffffffffffff (0x0)
> > 
> > cpp-inlining 24617 90229.126685606:     711026 cycles:uppp:
> > 	    7faf7dca2696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
> > 	    7faf7dc88af8 hypotf32x+0x18 (/usr/lib/libm-2.28.so)
> > 	    56382b0fc128 main+0x88 (/home/milian/projects/kdab/rnd/hotspot/
> > 
> > build/tests/test-clients/cpp-inlining/cpp-inlining)
> > 
> > 	    7faf7dabf222 __libc_start_main+0xf2 (/usr/lib/libc-2.28.so)
> > 	    56382b0fc1ed _start+0x2d (/home/milian/projects/kdab/rnd/hotspot/
> > 
> > build/tests/test-clients/cpp-inlining/cpp-inlining)
> > 
> > cpp-inlining 24617 90229.126921551:     714657 cycles:uppp:
> > 	    56382b0fc114 main+0x74 (/home/milian/projects/kdab/rnd/hotspot/
> > 
> > build/tests/test-clients/cpp-inlining/cpp-inlining)
> > 
> > 	    7faf7dabf222 __libc_start_main+0xf2 (/usr/lib/libc-2.28.so)
> > 	    56382b0fc1ed _start+0x2d (/home/milian/projects/kdab/rnd/hotspot/
> > 
> > build/tests/test-clients/cpp-inlining/cpp-inlining)
> > 
> > cpp-inlining 24617 90229.127157818:     719976 cycles:uppp:
> > 	    7faf7dca2688 __hypot_finite+0x28 (/usr/lib/libm-2.28.so)
> > 	
> > 	ffffffffffffffff [unknown] ([unknown])
> > 
> > ...
> > ```
> > 
> > Now I'm trying to figure out why one __hypot_finite sample works but the
> > other one breaks for no apparent reason.
> 
> I've now collected some more background information, which is quite helpful
> I believe for the analysis of this issue:
> 
> Note how the broken sample has the IP pointing at __hypot_finite+0x28:
> 
> unwind: __hypot_finite:ip = 0x7faf7dca2688 (0x29688)
> 
> When we run my reproducer code in GDB, we can see that obtaining a backtrace
> from that address works just fine there:
> 
> ```
> $ gdb ./cpp-inlining
> GNU gdb (GDB) 8.2
> Copyright (C) 2018 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html> This is free software: you are free to
> change and redistribute it. There is NO WARRANTY, to the extent permitted
> by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-pc-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
>     <http://www.gnu.org/software/gdb/documentation/>.
> 
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from ./cpp-inlining...done.
> (gdb) break main
> Breakpoint 1 at 0x10a0: file /home/milian/projects/kdab/rnd/hotspot/tests/
> test-clients/cpp-inlining/main.cpp, line 34.
> (gdb) run
> Starting program: /home/milian/projects/kdab/rnd/hotspot/build/tests/test-
> clients/cpp-inlining/cpp-inlining
> 
> Breakpoint 1, main () at /home/milian/projects/kdab/rnd/hotspot/tests/test-
> clients/cpp-inlining/main.cpp:34
> 34      {
> (gdb) disassemble __hypot_finite
> Dump of assembler code for function __hypot_finite:
>    0x00007ffff7c96660 <+0>:     endbr64
>    0x00007ffff7c96664 <+4>:     push   %r13
>    0x00007ffff7c96666 <+6>:     push   %r12
>    0x00007ffff7c96668 <+8>:     push   %rbp
>    0x00007ffff7c96669 <+9>:     movq   %xmm1,%rbp
>    0x00007ffff7c9666e <+14>:    push   %rbx
>    0x00007ffff7c9666f <+15>:    movq   %xmm0,%rbx
>    0x00007ffff7c96674 <+20>:    shr    $0x20,%rbp
>    0x00007ffff7c96678 <+24>:    shr    $0x20,%rbx
>    0x00007ffff7c9667c <+28>:    and    $0x7fffffff,%ebp
>    0x00007ffff7c96682 <+34>:    and    $0x7fffffff,%ebx
>    0x00007ffff7c96688 <+40>:    sub    $0x28,%rsp
>    0x00007ffff7c9668c <+44>:    cmp    %ebp,%ebx
>    0x00007ffff7c9668e <+46>:    jge    0x7ffff7c966a2 <__hypot_finite+66>
>    0x00007ffff7c96690 <+48>:    mov    %ebx,%eax
>    0x00007ffff7c96692 <+50>:    movapd %xmm0,%xmm2
>    0x00007ffff7c96696 <+54>:    movapd %xmm1,%xmm0
>    0x00007ffff7c9669a <+58>:    mov    %ebp,%ebx
>    0x00007ffff7c9669c <+60>:    movapd %xmm2,%xmm1
>    0x00007ffff7c966a0 <+64>:    mov    %eax,%ebp
>    0x00007ffff7c966a2 <+66>:    movq   %xmm0,%r13
>    0x00007ffff7c966a7 <+71>:    mov    %rbx,%rax
>    0x00007ffff7c966aa <+74>:    movq   %xmm1,%r12
> --Type <RET> for more, q to quit, c to continue without paging--q
> Quit
> (gdb) break *0x00007ffff7c96688
> Breakpoint 2 at 0x7ffff7c96688
> (gdb) cont
> Continuing.
> 
> Breakpoint 2, 0x00007ffff7c96688 in __hypot_finite () from
> /usr/lib/libm.so.6 (gdb) bt
> #0  0x00007ffff7c96688 in __hypot_finite () from /usr/lib/libm.so.6
> #1  0x00007ffff7c7caf9 in hypotf64 () from /usr/lib/libm.so.6
> #2  0x0000555555555129 in std::__complex_abs (__z=<optimized out>) at /usr/
> include/c++/8.2.1/complex:1362
> #3  std::abs<double> (__z=...) at /usr/include/c++/8.2.1/complex:618
> #4  std::_Norm_helper<true>::_S_do_it<double> (__z=...) at /usr/include/c++/
> 8.2.1/complex:675
> #5  std::norm<double> (__z=...) at /usr/include/c++/8.2.1/complex:685
> #6  main () at
> /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-
> inlining/main.cpp:39
> ```
> 
> But note how __hypot_finite+0x28 actually points at an instruction that
> modifies the stack pointer by 40, just like I did manually to "fix"
> unwinding for this sample (see my initial email):
> 
>    0x00007ffff7c96688 <+40>:    sub    $0x28,%rsp
> 
> So we are still in the function prologue, and can't properly unwind from
> there apparently! Any suggestions on how this should be handled? Can we
> figure out the size of the function prologue somehow and manually fix the
> stack pointer then?

After more digging, it turns out that I've apparently chased a red herring. 
I'm running archlinux which isn't shipping debug symbols for libm. Without 
those, we are apparently out of luck to unwind from the function prologue.

I still wonder whether we could catch this somehow and handle it ourselves. I 
wonder what GDB does to unwind here, despite the lack of debug information in 
libm...

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

  reply	other threads:[~2018-10-22 11:17 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-20 22:39 Broken dwarf unwinding - wrong stack pointer register value? Milian Wolff
2018-10-21 20:32 ` Milian Wolff
2018-10-22 10:35 ` Milian Wolff
2018-10-22 11:17   ` Milian Wolff [this message]
2018-10-22 13:58     ` Andi Kleen
2018-10-22 19:26       ` Milian Wolff
2018-10-23  4:03         ` Andi Kleen
2018-10-23 10:34           ` Milian Wolff
2018-10-24 14:48             ` Andi Kleen
2018-10-30 22:34               ` Milian Wolff
2018-11-01 22:08                 ` PEBS level 2/3 breaks dwarf unwinding! [WAS: Re: Broken dwarf unwinding - wrong stack pointer register value?] Milian Wolff
2018-11-02 11:26                   ` Jiri Olsa
2018-11-02 17:56                     ` Milian Wolff
2018-11-05 20:51                       ` Jiri Olsa
2018-11-05 22:54                         ` Milian Wolff
2018-11-06  0:10                           ` Andi Kleen
2018-11-06  8:39                             ` Jiri Olsa
2018-11-06 17:26                               ` Andi Kleen
2018-11-06 20:04                               ` Milian Wolff
2018-11-06 20:24                                 ` Andi Kleen
2018-11-07 22:41                                   ` Milian Wolff
2018-11-08 12:41                                     ` Milian Wolff
2018-11-09  0:55                                       ` Andi Kleen
2018-11-09  0:54                                     ` Andi Kleen
2018-11-10 21:42                             ` Travis Downs
2018-11-11  1:07                               ` Andi Kleen
     [not found]                                 ` <CAOBGo4zirLiKX8VcROAE=kAD0+qkF0E-cBv9DtBiQr=_obDv5w@mail.gmail.com>
2018-11-11  2:54                                   ` Travis Downs
2018-11-12  3:26                                   ` Andi Kleen
2018-11-14 13:20                                     ` Milian Wolff
2018-11-15  2:05                                       ` Travis Downs
2018-11-15  9:10                                         ` Milian Wolff
2018-11-15 19:00                                           ` Andi Kleen
2018-11-15  2:15                                     ` Travis Downs

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1775744.0RuD2nMEIt@agathebauer \
    --to=milian.wolff@kdab.com \
    --cc=acme@kernel.org \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mail@milianw.de \
    --cc=namhyung@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).