linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Adrian Hunter <adrian.hunter@intel.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>, linux-kernel@vger.kernel.org
Subject: [PATCH 5/6] perf thread-stack: Improve thread_stack__no_call_return()
Date: Wed,  9 Jan 2019 11:18:34 +0200	[thread overview]
Message-ID: <20190109091835.5570-6-adrian.hunter@intel.com> (raw)
In-Reply-To: <20190109091835.5570-1-adrian.hunter@intel.com>

Improve thread_stack__no_call_return() to better handle 'returns' that do
not match the stack i.e. 'no call'. See code comments for details.
The example below shows how retpolines are affected:

Example:

$ cat simple-retpoline.c
__attribute__((noinline)) int bar(void)
{
        return -1;
}

int foo(void)
{
        return bar() + 1;
}

__attribute__((indirect_branch("thunk"))) int main()
{
        int (*volatile fn)(void) = foo;

        fn();
        return fn();
}
$ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
$ objdump -d simple-retpoline
<SNIP>
0000000000001040 <main>:
    1040:       48 83 ec 18             sub    $0x18,%rsp
    1044:       48 8d 05 25 01 00 00    lea    0x125(%rip),%rax        # 1170 <foo>
    104b:       48 89 44 24 08          mov    %rax,0x8(%rsp)
    1050:       48 8b 44 24 08          mov    0x8(%rsp),%rax
    1055:       e8 1f 01 00 00          callq  1179 <__x86_indirect_thunk_rax>
    105a:       48 8b 44 24 08          mov    0x8(%rsp),%rax
    105f:       48 83 c4 18             add    $0x18,%rsp
    1063:       e9 11 01 00 00          jmpq   1179 <__x86_indirect_thunk_rax>
<SNIP>
0000000000001160 <bar>:
    1160:       b8 ff ff ff ff          mov    $0xffffffff,%eax
    1165:       c3                      retq
<SNIP>
0000000000001170 <foo>:
    1170:       e8 eb ff ff ff          callq  1160 <bar>
    1175:       83 c0 01                add    $0x1,%eax
    1178:       c3                      retq
0000000000001179 <__x86_indirect_thunk_rax>:
    1179:       e8 07 00 00 00          callq  1185 <__x86_indirect_thunk_rax+0xc>
    117e:       f3 90                   pause
    1180:       0f ae e8                lfence
    1183:       eb f9                   jmp    117e <__x86_indirect_thunk_rax+0x5>
    1185:       48 89 04 24             mov    %rax,(%rsp)
    1189:       c3                      retq
<SNIP>
$ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
$ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
2019-01-08 14:03:37.851655 Creating database...
2019-01-08 14:03:37.863256 Writing records...
2019-01-08 14:03:38.069750 Adding indexes
2019-01-08 14:03:38.078799 Done
$ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db

Before:

    main
        -> __x86_indirect_thunk_rax
            -> __x86_indirect_thunk_rax
                -> __x86_indirect_thunk_rax
                    -> bar

After:

    main
        -> __x86_indirect_thunk_rax
            -> __x86_indirect_thunk_rax
                -> foo
                    -> bar

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/thread-stack.c | 49 +++++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index f52c0f90915d..632c07a125ab 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -638,14 +638,57 @@ static int thread_stack__no_call_return(struct thread *thread,
 	else
 		parent = root;
 
-	/* This 'return' had no 'call', so push and pop top of stack */
-	cp = call_path__findnew(cpr, parent, fsym, ip, ks);
+	if (parent->sym == from_al->sym) {
+		/*
+		 * At the bottom of the stack, assume the missing 'call' was
+		 * before the trace started. So, pop the current symbol and push
+		 * the 'to' symbol.
+		 */
+		if (ts->cnt == 1) {
+			err = thread_stack__call_return(thread, ts, --ts->cnt,
+							tm, ref, false);
+			if (err)
+				return err;
+		}
+
+		if (!ts->cnt) {
+			cp = call_path__findnew(cpr, root, tsym, addr, ks);
+
+			return thread_stack__push_cp(ts, addr, tm, ref, cp,
+						     true, false);
+		}
+
+		/*
+		 * Otherwise assume the 'return' is being used as a jump (e.g.
+		 * retpoline) and just push the 'to' symbol.
+		 */
+		cp = call_path__findnew(cpr, parent, tsym, addr, ks);
+
+		err = thread_stack__push_cp(ts, 0, tm, ref, cp, true, false);
+		if (!err)
+			ts->stack[ts->cnt - 1].non_call = true;
+
+		return err;
+	}
+
+	/*
+	 * Assume 'parent' has not yet returned, so push 'to', and then push and
+	 * pop 'from'.
+	 */
+
+	cp = call_path__findnew(cpr, parent, tsym, addr, ks);
 
 	err = thread_stack__push_cp(ts, addr, tm, ref, cp, true, false);
 	if (err)
 		return err;
 
-	return thread_stack__pop_cp(thread, ts, addr, tm, ref, tsym);
+	cp = call_path__findnew(cpr, cp, fsym, ip, ks);
+
+	err = thread_stack__push_cp(ts, ip, tm, ref, cp, true, false);
+	if (err)
+		return err;
+
+	return thread_stack__call_return(thread, ts, --ts->cnt, tm, ref, false);
 }
 
 static int thread_stack__trace_begin(struct thread *thread,
-- 
2.17.1


  parent reply	other threads:[~2019-01-09  9:20 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-09  9:18 [PATCH 0/6] perf thread-stack: x86 retpolines Adrian Hunter
2019-01-09  9:18 ` [PATCH 1/6] perf tools: Fix split_kallsyms_for_kcore for trampoline symbols Adrian Hunter
2019-02-09 12:57   ` [tip:perf/core] perf tools: Fix split_kallsyms_for_kcore() " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 2/6] perf thread-stack: Tidy thread_stack__push_cp() usage Adrian Hunter
2019-02-09 12:58   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 3/6] perf thread-stack: Tidy thread_stack__no_call_return() by adding more local variables Adrian Hunter
2019-02-09 12:58   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 4/6] perf thread-stack: Represent jmps to the start of a different symbol Adrian Hunter
2019-02-06 12:39   ` Arnaldo Carvalho de Melo
2019-02-06 13:25     ` Adrian Hunter
2019-02-09 12:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-09  9:18 ` Adrian Hunter [this message]
2019-02-22  9:48   ` [PATCH 5/6] perf thread-stack: Improve thread_stack__no_call_return() Adrian Hunter
2019-02-22 14:39     ` Arnaldo Carvalho de Melo
2019-02-28  7:49   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 6/6] perf thread-stack: Hide x86 retpolines Adrian Hunter
2019-01-09 15:38   ` Jiri Olsa
2019-01-10  7:52     ` Adrian Hunter
2019-02-22 19:42   ` Arnaldo Carvalho de Melo
2019-02-22 20:53     ` Arnaldo Carvalho de Melo
2019-02-28  7:49   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-10  9:55 ` [PATCH 0/6] perf thread-stack: " Jiri Olsa
2019-02-06  9:10   ` Adrian Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190109091835.5570-6-adrian.hunter@intel.com \
    --to=adrian.hunter@intel.com \
    --cc=acme@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).