linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Support inline symbols in callchains
@ 2019-02-21 16:06 Jonas Rabenstein
  2019-02-21 16:06 ` [PATCH 1/3] perf map: add function to lookup inline symbols Jonas Rabenstein
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Jonas Rabenstein @ 2019-02-21 16:06 UTC (permalink / raw)
  To: linux-perf-users
  Cc: Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Arnaldo Carvalho de Melo, David Miller, Eric Saint-Etienne,
	Ingo Molnar, Jiri Olsa, Kim Phillips, Konstantin Khlebnikov,
	Milian Wolff, Namhyung Kim, Peter Zijlstra, Rob Gardner,
	Sandipan Das, linux-kernel, Jonas Rabenstein

Hi,
This patchset supersedes my previous attempt to add inline symbols to
callchain of perf-script [1] by a more generic attempt to not hook in
the output stage but directly into the callchain generation. By a matter
of fact this adds those inline symbols automatically to other commands
like perf-report.
Additionally this fixes the regression reported by Jiri Olsa [2] that
some entries from previous outputs had been vanished and now only new
lines are added if symbols had been found.

The integration for perf-report is not completely done as there is an
issue if the root for an hist_entry as for inlined symbols there may be
multiple instances (for each address-range) with the same name. But in
util/sort.c:233 only the name is compared for inlined symbols. As a
consequence the returned hist_entry may hold a reference to another
instance for this inlined symbol (with another address range than
requested) we later on fail with -ERANGE in __symbol__inc_addr_samples
(util/annotate.c:857).
This issue does still permit perf-report to be executed without any
problems and the inlined symbols do show up but none of the samples is
actually accounted to them but to the original symbol :(

To further provide information what this changeset is doing, here is a
script-session to show the differences in the output:

[jonas@x60s]$  git reset --hard v5.0-rc7; make -C tools/perf >/dev/null; \
HEAD is now at a3b22b9f11d9 Linux 5.0-rc7
[jonas@x60s]$  git am *.patch; make -C tools/perf >/dev/null; \
	mv tools /perf/perf perf-new
Applying: perf map: add function to lookup inline symbols
Applying: perf machine: use map__inlines in append_inlines
Applying: perf machine: add inline symbols to callchains
[jonas@x60s]$  cat test.c
static int foo(int m){int r=0; for(int i=0;i<m;++i)r+=i; return r;}
static int bar(int m){int r=0; for(int i=0;i<m;++i)r+=foo(i); return r;}
static int baz(int m){int r=0; for(int i=0;i<m;++i)r+=foo(i)*bar(i); return r;}
int main() { return baz(421); }
[jonas@x60s]$  gcc -O2 -fno-omit-frame-pointer -g -o test test.c
[jonas@x60s]$  ./perf-new record --call-graph fp ./test.c
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (19 samples) ]
[jonas@x60s]$  time ./perf-old script >old.script
real	0m0.039s
user	0m0.012s
sys	0m0.022s
[jonas@x60s]$  time ./perf-new script >new.script
real	0m0.045s
user	0m0.010s
sys	0m0.032s
[jonas@x60s]$  diff -u old.script new.script
--- old.script	2019-02-21 16:30:47.997194180 +0100
+++ new.script	2019-02-21 16:30:51.757309790 +0100
@@ -23,66 +23,101 @@
 	    7f43b1148090 _start+0x0 (/lib/x86_64-linux-gnu/ld-2.27.so)
 
 test  7579 1470536.968092:     354758 cycles:uppp: 
+	    7f43b1148f4b elf_get_dynamic_info+0xab (inlined)
 	    7f43b1148f4b _dl_start+0xab (/lib/x86_64-linux-gnu/ld-2.27.so)
 	    7f43b1148098 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-2.27.so)
 [...]
 
 test  7579 1470536.969210:    1922435 cycles:uppp:
+	    563f0df9c53f foo+0x4f (inlined)
+	    563f0df9c53f bar+0x4f (inlined)
+	    563f0df9c53f baz+0x4f (inlined)
 	    563f0df9c53f main+0x4f (/home/jonas/linux/test)
 	    7f43b0d77b97 __libc_start_main+0xe7 (/lib/x86_64-linux-gnu/libc-2.27.so)
 	 75e258d4c544155 [unknown] ([unknown])
[jonas@x60s]$  time ./perf-new script --no-inline >new.noinline.script
real	0m0.035s
user	0m0.012s
sys	0m0.020s
[jonas@x60s]$  diff -u old.script new.noinline.script
[jonas@x60s]$  ./perf-old report --stdio --quiet | sed '/^$/Q'
       88.80%    88.80%  test     test              [.] main
            |
            ---0x75e258d4c544155
               __libc_start_main
               main
[jonas@x60s]$  ./perf-new report --stdio --quiet | sed '/^$/Q'
    88.80%    88.80%  test     test              [.] main
            |
            ---0x75e258d4c544155
               __libc_start_main
               main
               baz (inlined)
               bar (inlined)
               foo (inlined)
[jonas@x60s]$  ./perf-new report --stdio --quiet --no-inline | sed '/^$/Q'
    88.80%    88.80%  test     test              [.] main
            |
            ---0x75e258d4c544155
               __libc_start_main
               main

I am still trying to find a way that in the new output of report the
88.80% 'Self' are not accounted to main itself but split up for the
inlined baz, bar and foo symbols.

I'm open to any help as well as feedback,
 Jonas

[1] https://www.spinics.net/lists/linux-perf-users/msg07792.html
[2] https://www.spinics.net/lists/linux-perf-users/msg07798.html

Jonas Rabenstein (3):
  perf map: add function to lookup inline symbols
  perf machine: use map__inlines in append_inlines
  perf machine: add inline symbols to callchains

 tools/perf/util/machine.c | 130 ++++++++++++++++++++++++++------------
 tools/perf/util/map.c     |  23 +++++++
 tools/perf/util/map.h     |   3 +
 3 files changed, 115 insertions(+), 41 deletions(-)

-- 
2.19.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] perf map: add function to lookup inline symbols
  2019-02-21 16:06 [PATCH 0/3] Support inline symbols in callchains Jonas Rabenstein
@ 2019-02-21 16:06 ` Jonas Rabenstein
  2019-02-21 16:06 ` [PATCH 2/3] perf machine: use map__inlines in append_inlines Jonas Rabenstein
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Jonas Rabenstein @ 2019-02-21 16:06 UTC (permalink / raw)
  To: linux-perf-users
  Cc: Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Arnaldo Carvalho de Melo, David Miller, Eric Saint-Etienne,
	Ingo Molnar, Jiri Olsa, Kim Phillips, Konstantin Khlebnikov,
	Milian Wolff, Namhyung Kim, Peter Zijlstra, Rob Gardner,
	Sandipan Das, linux-kernel, Jonas Rabenstein

Inlined symbols should always be added to the dso's inlined_nodes tree
in order to reuse them for a later lookup for the same address. Instead
of repeating those steps at the different users, provide a central
method to lookup and register inline symbols for a map.

Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
---
 tools/perf/util/map.c | 23 +++++++++++++++++++++++
 tools/perf/util/map.h |  3 +++
 2 files changed, 26 insertions(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 6751301a755c..0fe74e83ca8d 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -988,3 +988,26 @@ struct map_groups *map__kmaps(struct map *map)
 	}
 	return kmap->kmaps;
 }
+
+struct inline_node *map__inlines(struct map *map, u64 addr, struct symbol *sym)
+{
+	struct inline_node *node;
+
+	if (!symbol_conf.inline_name)
+		return NULL;
+
+	if (!map->dso)
+		return NULL;
+
+	addr = map->map_ip(map, addr);
+	addr = map__rip_2objdump(map, addr);
+
+	node = inlines__tree_find(&map->dso->inlined_nodes, addr);
+	if (node)
+		return node;
+
+	node = dso__parse_addr_inlines(map->dso, addr, sym);
+	if (node)
+		inlines__tree_insert(&map->dso->inlined_nodes, node);
+	return node;
+}
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 09282aa45c80..bdcc6e77e26e 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -270,4 +270,7 @@ static inline bool is_entry_trampoline(const char *name)
 	return !strcmp(name, ENTRY_TRAMPOLINE_NAME);
 }
 
+struct inline_node *map__inlines(struct map *map, u64 addr,
+				 struct symbol *sym);
+
 #endif /* __PERF_MAP_H */
-- 
2.19.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] perf machine: use map__inlines in append_inlines
  2019-02-21 16:06 [PATCH 0/3] Support inline symbols in callchains Jonas Rabenstein
  2019-02-21 16:06 ` [PATCH 1/3] perf map: add function to lookup inline symbols Jonas Rabenstein
@ 2019-02-21 16:06 ` Jonas Rabenstein
  2019-02-21 16:06 ` [PATCH 3/3] perf machine: add inline symbols to callchains Jonas Rabenstein
  2019-02-23  4:56 ` [PATCH 0/3] Support inline symbols in callchains Namhyung Kim
  3 siblings, 0 replies; 7+ messages in thread
From: Jonas Rabenstein @ 2019-02-21 16:06 UTC (permalink / raw)
  To: linux-perf-users
  Cc: Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Arnaldo Carvalho de Melo, David Miller, Eric Saint-Etienne,
	Ingo Molnar, Jiri Olsa, Kim Phillips, Konstantin Khlebnikov,
	Milian Wolff, Namhyung Kim, Peter Zijlstra, Rob Gardner,
	Sandipan Das, linux-kernel, Jonas Rabenstein

The previous patch provides a generic way to lookup the root node of the
inlined symbols for a specific address. Reuse that implementation
instead of a duplicated version of the required steps.

Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
---
 tools/perf/util/machine.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 143f7057d581..dce29c21e4ea 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2330,23 +2330,12 @@ static int append_inlines(struct callchain_cursor *cursor,
 {
 	struct inline_node *inline_node;
 	struct inline_list *ilist;
-	u64 addr;
 	int ret = 1;
 
-	if (!symbol_conf.inline_name || !map || !sym)
+	inline_node = map__inlines(map, ip, sym);
+	if (!inline_node)
 		return ret;
 
-	addr = map__map_ip(map, ip);
-	addr = map__rip_2objdump(map, addr);
-
-	inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr);
-	if (!inline_node) {
-		inline_node = dso__parse_addr_inlines(map->dso, addr, sym);
-		if (!inline_node)
-			return ret;
-		inlines__tree_insert(&map->dso->inlined_nodes, inline_node);
-	}
-
 	list_for_each_entry(ilist, &inline_node->val, list) {
 		ret = callchain_cursor_append(cursor, ip, map,
 					      ilist->symbol, false,
-- 
2.19.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] perf machine: add inline symbols to callchains
  2019-02-21 16:06 [PATCH 0/3] Support inline symbols in callchains Jonas Rabenstein
  2019-02-21 16:06 ` [PATCH 1/3] perf map: add function to lookup inline symbols Jonas Rabenstein
  2019-02-21 16:06 ` [PATCH 2/3] perf machine: use map__inlines in append_inlines Jonas Rabenstein
@ 2019-02-21 16:06 ` Jonas Rabenstein
  2019-02-23  4:21   ` Namhyung Kim
  2019-02-23  4:56 ` [PATCH 0/3] Support inline symbols in callchains Namhyung Kim
  3 siblings, 1 reply; 7+ messages in thread
From: Jonas Rabenstein @ 2019-02-21 16:06 UTC (permalink / raw)
  To: linux-perf-users
  Cc: Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Arnaldo Carvalho de Melo, David Miller, Eric Saint-Etienne,
	Ingo Molnar, Jiri Olsa, Kim Phillips, Konstantin Khlebnikov,
	Milian Wolff, Namhyung Kim, Peter Zijlstra, Rob Gardner,
	Sandipan Das, linux-kernel, Jonas Rabenstein

Use map__inlines to resolve inlined functions for every entry with
an symbol that should be added to a callchain.

Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
---
 tools/perf/util/machine.c | 115 ++++++++++++++++++++++++++++----------
 1 file changed, 87 insertions(+), 28 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index dce29c21e4ea..070d074482b4 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1908,6 +1908,91 @@ struct iterations {
 	u64 cycles;
 };
 
+static int __add_callchain_location(struct callchain_cursor *cursor,
+				    struct symbol **parent,
+				    struct addr_location *root_al,
+				    u64 addr, struct addr_location *al,
+				    bool branch, struct branch_flags *flags,
+				    u64 branch_from, struct iterations *iter)
+{
+	int nr_loop_iter = 0;
+	u64 iter_cycles = 0;
+
+	if (symbol_conf.hide_unresolved && al->sym == NULL)
+		return 0;
+
+	if (al->sym) {
+		if (perf_hpp_list.parent && !*parent &&
+		  symbol__match_regex(al->sym, &parent_regex))
+			*parent = al->sym;
+		else if (have_ignore_callees && root_al &&
+		  symbol__match_regex(al->sym, &ignore_callees_regex)) {
+			/* Treat this symbol as the root,
+			   forgetting its callees. */
+			*root_al = *al;
+			callchain_cursor_reset(cursor);
+		}
+	}
+
+	if (iter) {
+		nr_loop_iter = iter->nr_loop_iter;
+		iter_cycles = iter->cycles;
+	}
+	return callchain_cursor_append(cursor, addr, al->map, al->sym, branch,
+				       flags, nr_loop_iter, iter_cycles,
+				       branch_from, al->srcline);
+}
+
+static int __add_callchain_ip(struct callchain_cursor *cursor, u64 ip,
+			      struct addr_location *al, bool branch,
+			      struct branch_flags *flags, u64 branch_from,
+			      struct iterations *iter, struct symbol **parent,
+			      struct addr_location *root_al)
+{
+	struct inline_node *inline_node;
+	struct inline_list *inline_list;
+	const char *srcline;
+	struct symbol *symbol;
+	int err = 0;
+
+	al->srcline = callchain_srcline(al->map, al->sym, al->addr);
+	if (callchain_param.order == ORDER_CALLER)
+		err = __add_callchain_location(cursor, parent, root_al, ip, al,
+					       branch, flags, branch_from, iter);
+	if (err || !al->map || !al->sym)
+		goto no_inline;
+
+	inline_node = map__inlines(al->map, ip, al->sym);
+	if (!inline_node || list_empty(&inline_node->val))
+		goto no_inline;
+
+	symbol = al->sym;
+	srcline = al->srcline;
+	list_for_each_entry(inline_list, &inline_node->val, list) {
+		if (inline_list->symbol == symbol)
+			continue;
+		al->sym = inline_list->symbol;
+		al->srcline = inline_list->srcline;
+		err = __add_callchain_location(cursor, parent, root_al, ip,
+					       al, branch, flags,
+					       branch_from, iter);
+		if (err)
+			break;
+	}
+
+	if (callchain_param.order == ORDER_CALLEE) {
+		al->srcline = srcline;
+		al->sym = symbol;
+	}
+
+no_inline:
+	if (!err && callchain_param.order == ORDER_CALLEE)
+		err = __add_callchain_location(cursor, parent, root_al, ip, al,
+					       branch, flags, branch_from, iter);
+	return err;
+}
+
+
 static int add_callchain_ip(struct thread *thread,
 			    struct callchain_cursor *cursor,
 			    struct symbol **parent,
@@ -1920,9 +2005,6 @@ static int add_callchain_ip(struct thread *thread,
 			    u64 branch_from)
 {
 	struct addr_location al;
-	int nr_loop_iter = 0;
-	u64 iter_cycles = 0;
-	const char *srcline = NULL;
 
 	al.filtered = 0;
 	al.sym = NULL;
@@ -1955,31 +2037,8 @@ static int add_callchain_ip(struct thread *thread,
 		thread__find_symbol(thread, *cpumode, ip, &al);
 	}
 
-	if (al.sym != NULL) {
-		if (perf_hpp_list.parent && !*parent &&
-		    symbol__match_regex(al.sym, &parent_regex))
-			*parent = al.sym;
-		else if (have_ignore_callees && root_al &&
-		  symbol__match_regex(al.sym, &ignore_callees_regex)) {
-			/* Treat this symbol as the root,
-			   forgetting its callees. */
-			*root_al = al;
-			callchain_cursor_reset(cursor);
-		}
-	}
-
-	if (symbol_conf.hide_unresolved && al.sym == NULL)
-		return 0;
-
-	if (iter) {
-		nr_loop_iter = iter->nr_loop_iter;
-		iter_cycles = iter->cycles;
-	}
-
-	srcline = callchain_srcline(al.map, al.sym, al.addr);
-	return callchain_cursor_append(cursor, ip, al.map, al.sym,
-				       branch, flags, nr_loop_iter,
-				       iter_cycles, branch_from, srcline);
+	return __add_callchain_ip(cursor, ip, &al, branch, flags, branch_from,
+				  iter, parent, root_al);
 }
 
 struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
-- 
2.19.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] perf machine: add inline symbols to callchains
  2019-02-21 16:06 ` [PATCH 3/3] perf machine: add inline symbols to callchains Jonas Rabenstein
@ 2019-02-23  4:21   ` Namhyung Kim
  2019-02-23 12:05     ` Jonas Rabenstein
  0 siblings, 1 reply; 7+ messages in thread
From: Namhyung Kim @ 2019-02-23  4:21 UTC (permalink / raw)
  To: Jonas Rabenstein
  Cc: linux-perf-users, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Arnaldo Carvalho de Melo, David Miller, Eric Saint-Etienne,
	Ingo Molnar, Jiri Olsa, Kim Phillips, Konstantin Khlebnikov,
	Milian Wolff, Peter Zijlstra, Rob Gardner, Sandipan Das,
	linux-kernel

Hello,

On Fri, Feb 22, 2019 at 1:07 AM Jonas Rabenstein
<jonas.rabenstein@studium.uni-erlangen.de> wrote:
>
> Use map__inlines to resolve inlined functions for every entry with
> an symbol that should be added to a callchain.
>
> Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
> ---
>  tools/perf/util/machine.c | 115 ++++++++++++++++++++++++++++----------
>  1 file changed, 87 insertions(+), 28 deletions(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index dce29c21e4ea..070d074482b4 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1908,6 +1908,91 @@ struct iterations {
>         u64 cycles;
>  };
>
> +static int __add_callchain_location(struct callchain_cursor *cursor,
> +                                   struct symbol **parent,
> +                                   struct addr_location *root_al,
> +                                   u64 addr, struct addr_location *al,
> +                                   bool branch, struct branch_flags *flags,
> +                                   u64 branch_from, struct iterations *iter)
> +{
> +       int nr_loop_iter = 0;
> +       u64 iter_cycles = 0;
> +
> +       if (symbol_conf.hide_unresolved && al->sym == NULL)
> +               return 0;
> +
> +       if (al->sym) {
> +               if (perf_hpp_list.parent && !*parent &&
> +                 symbol__match_regex(al->sym, &parent_regex))
> +                       *parent = al->sym;
> +               else if (have_ignore_callees && root_al &&
> +                 symbol__match_regex(al->sym, &ignore_callees_regex)) {
> +                       /* Treat this symbol as the root,
> +                          forgetting its callees. */
> +                       *root_al = *al;
> +                       callchain_cursor_reset(cursor);
> +               }
> +       }
> +
> +       if (iter) {
> +               nr_loop_iter = iter->nr_loop_iter;
> +               iter_cycles = iter->cycles;
> +       }
> +       return callchain_cursor_append(cursor, addr, al->map, al->sym, branch,
> +                                      flags, nr_loop_iter, iter_cycles,
> +                                      branch_from, al->srcline);
> +}
> +
> +static int __add_callchain_ip(struct callchain_cursor *cursor, u64 ip,
> +                             struct addr_location *al, bool branch,
> +                             struct branch_flags *flags, u64 branch_from,
> +                             struct iterations *iter, struct symbol **parent,
> +                             struct addr_location *root_al)
> +{
> +       struct inline_node *inline_node;
> +       struct inline_list *inline_list;
> +       const char *srcline;
> +       struct symbol *symbol;
> +       int err = 0;
> +
> +       al->srcline = callchain_srcline(al->map, al->sym, al->addr);
> +       if (callchain_param.order == ORDER_CALLER)
> +               err = __add_callchain_location(cursor, parent, root_al, ip, al,
> +                                              branch, flags, branch_from, iter);
> +       if (err || !al->map || !al->sym)
> +               goto no_inline;
> +
> +       inline_node = map__inlines(al->map, ip, al->sym);
> +       if (!inline_node || list_empty(&inline_node->val))
> +               goto no_inline;
> +
> +       symbol = al->sym;
> +       srcline = al->srcline;
> +       list_for_each_entry(inline_list, &inline_node->val, list) {
> +               if (inline_list->symbol == symbol)
> +                       continue;
> +               al->sym = inline_list->symbol;
> +               al->srcline = inline_list->srcline;
> +               err = __add_callchain_location(cursor, parent, root_al, ip,
> +                                              al, branch, flags,
> +                                              branch_from, iter);
> +               if (err)
> +                       break;
> +       }

Does this loop do the job both for ORDER_CALLER and ORDER_CALLEE?

Thanks,
Namhyung


> +
> +       if (callchain_param.order == ORDER_CALLEE) {
> +               al->srcline = srcline;
> +               al->sym = symbol;
> +       }
> +
> +no_inline:
> +       if (!err && callchain_param.order == ORDER_CALLEE)
> +               err = __add_callchain_location(cursor, parent, root_al, ip, al,
> +                                              branch, flags, branch_from, iter);
> +       return err;
> +}
> +
> +
>  static int add_callchain_ip(struct thread *thread,
>                             struct callchain_cursor *cursor,
>                             struct symbol **parent,
> @@ -1920,9 +2005,6 @@ static int add_callchain_ip(struct thread *thread,
>                             u64 branch_from)
>  {
>         struct addr_location al;
> -       int nr_loop_iter = 0;
> -       u64 iter_cycles = 0;
> -       const char *srcline = NULL;
>
>         al.filtered = 0;
>         al.sym = NULL;
> @@ -1955,31 +2037,8 @@ static int add_callchain_ip(struct thread *thread,
>                 thread__find_symbol(thread, *cpumode, ip, &al);
>         }
>
> -       if (al.sym != NULL) {
> -               if (perf_hpp_list.parent && !*parent &&
> -                   symbol__match_regex(al.sym, &parent_regex))
> -                       *parent = al.sym;
> -               else if (have_ignore_callees && root_al &&
> -                 symbol__match_regex(al.sym, &ignore_callees_regex)) {
> -                       /* Treat this symbol as the root,
> -                          forgetting its callees. */
> -                       *root_al = al;
> -                       callchain_cursor_reset(cursor);
> -               }
> -       }
> -
> -       if (symbol_conf.hide_unresolved && al.sym == NULL)
> -               return 0;
> -
> -       if (iter) {
> -               nr_loop_iter = iter->nr_loop_iter;
> -               iter_cycles = iter->cycles;
> -       }
> -
> -       srcline = callchain_srcline(al.map, al.sym, al.addr);
> -       return callchain_cursor_append(cursor, ip, al.map, al.sym,
> -                                      branch, flags, nr_loop_iter,
> -                                      iter_cycles, branch_from, srcline);
> +       return __add_callchain_ip(cursor, ip, &al, branch, flags, branch_from,
> +                                 iter, parent, root_al);
>  }
>
>  struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
> --
> 2.19.2
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] Support inline symbols in callchains
  2019-02-21 16:06 [PATCH 0/3] Support inline symbols in callchains Jonas Rabenstein
                   ` (2 preceding siblings ...)
  2019-02-21 16:06 ` [PATCH 3/3] perf machine: add inline symbols to callchains Jonas Rabenstein
@ 2019-02-23  4:56 ` Namhyung Kim
  3 siblings, 0 replies; 7+ messages in thread
From: Namhyung Kim @ 2019-02-23  4:56 UTC (permalink / raw)
  To: Jonas Rabenstein
  Cc: linux-perf-users, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Arnaldo Carvalho de Melo, David Miller, Eric Saint-Etienne,
	Ingo Molnar, Jiri Olsa, Kim Phillips, Konstantin Khlebnikov,
	Milian Wolff, Peter Zijlstra, Rob Gardner, Sandipan Das,
	linux-kernel

On Fri, Feb 22, 2019 at 1:07 AM Jonas Rabenstein
<jonas.rabenstein@studium.uni-erlangen.de> wrote:
>
> Hi,
> This patchset supersedes my previous attempt to add inline symbols to
> callchain of perf-script [1] by a more generic attempt to not hook in
> the output stage but directly into the callchain generation. By a matter
> of fact this adds those inline symbols automatically to other commands
> like perf-report.
> Additionally this fixes the regression reported by Jiri Olsa [2] that
> some entries from previous outputs had been vanished and now only new
> lines are added if symbols had been found.
>
> The integration for perf-report is not completely done as there is an
> issue if the root for an hist_entry as for inlined symbols there may be
> multiple instances (for each address-range) with the same name. But in
> util/sort.c:233 only the name is compared for inlined symbols. As a
> consequence the returned hist_entry may hold a reference to another
> instance for this inlined symbol (with another address range than
> requested) we later on fail with -ERANGE in __symbol__inc_addr_samples
> (util/annotate.c:857).
> This issue does still permit perf-report to be executed without any
> problems and the inlined symbols do show up but none of the samples is
> actually accounted to them but to the original symbol :(
>
> To further provide information what this changeset is doing, here is a
> script-session to show the differences in the output:
>
> [jonas@x60s]$  git reset --hard v5.0-rc7; make -C tools/perf >/dev/null; \
> HEAD is now at a3b22b9f11d9 Linux 5.0-rc7
> [jonas@x60s]$  git am *.patch; make -C tools/perf >/dev/null; \
>         mv tools /perf/perf perf-new
> Applying: perf map: add function to lookup inline symbols
> Applying: perf machine: use map__inlines in append_inlines
> Applying: perf machine: add inline symbols to callchains
> [jonas@x60s]$  cat test.c
> static int foo(int m){int r=0; for(int i=0;i<m;++i)r+=i; return r;}
> static int bar(int m){int r=0; for(int i=0;i<m;++i)r+=foo(i); return r;}
> static int baz(int m){int r=0; for(int i=0;i<m;++i)r+=foo(i)*bar(i); return r;}
> int main() { return baz(421); }
> [jonas@x60s]$  gcc -O2 -fno-omit-frame-pointer -g -o test test.c
> [jonas@x60s]$  ./perf-new record --call-graph fp ./test.c
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.002 MB perf.data (19 samples) ]
> [jonas@x60s]$  time ./perf-old script >old.script
> real    0m0.039s
> user    0m0.012s
> sys     0m0.022s
> [jonas@x60s]$  time ./perf-new script >new.script
> real    0m0.045s
> user    0m0.010s
> sys     0m0.032s
> [jonas@x60s]$  diff -u old.script new.script
> --- old.script  2019-02-21 16:30:47.997194180 +0100
> +++ new.script  2019-02-21 16:30:51.757309790 +0100
> @@ -23,66 +23,101 @@
>             7f43b1148090 _start+0x0 (/lib/x86_64-linux-gnu/ld-2.27.so)
>
>  test  7579 1470536.968092:     354758 cycles:uppp:
> +           7f43b1148f4b elf_get_dynamic_info+0xab (inlined)
>             7f43b1148f4b _dl_start+0xab (/lib/x86_64-linux-gnu/ld-2.27.so)
>             7f43b1148098 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-2.27.so)
>  [...]
>
>  test  7579 1470536.969210:    1922435 cycles:uppp:
> +           563f0df9c53f foo+0x4f (inlined)
> +           563f0df9c53f bar+0x4f (inlined)
> +           563f0df9c53f baz+0x4f (inlined)
>             563f0df9c53f main+0x4f (/home/jonas/linux/test)
>             7f43b0d77b97 __libc_start_main+0xe7 (/lib/x86_64-linux-gnu/libc-2.27.so)
>          75e258d4c544155 [unknown] ([unknown])
> [jonas@x60s]$  time ./perf-new script --no-inline >new.noinline.script
> real    0m0.035s
> user    0m0.012s
> sys     0m0.020s
> [jonas@x60s]$  diff -u old.script new.noinline.script
> [jonas@x60s]$  ./perf-old report --stdio --quiet | sed '/^$/Q'
>        88.80%    88.80%  test     test              [.] main
>             |
>             ---0x75e258d4c544155
>                __libc_start_main
>                main
> [jonas@x60s]$  ./perf-new report --stdio --quiet | sed '/^$/Q'
>     88.80%    88.80%  test     test              [.] main
>             |
>             ---0x75e258d4c544155
>                __libc_start_main
>                main
>                baz (inlined)
>                bar (inlined)
>                foo (inlined)
> [jonas@x60s]$  ./perf-new report --stdio --quiet --no-inline | sed '/^$/Q'
>     88.80%    88.80%  test     test              [.] main
>             |
>             ---0x75e258d4c544155
>                __libc_start_main
>                main
>
> I am still trying to find a way that in the new output of report the
> 88.80% 'Self' are not accounted to main itself but split up for the
> inlined baz, bar and foo symbols.
>
> I'm open to any help as well as feedback,

I guess you can set al->sym to a symbol of the last callchain entry in
the iter_add_single_cumulative_entry().  For annotation, al->addr
needs to be updated from the start of the inlined subroutine if DWARF
is used, but not sure for addr2line though.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] perf machine: add inline symbols to callchains
  2019-02-23  4:21   ` Namhyung Kim
@ 2019-02-23 12:05     ` Jonas Rabenstein
  0 siblings, 0 replies; 7+ messages in thread
From: Jonas Rabenstein @ 2019-02-23 12:05 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jonas Rabenstein, linux-perf-users, Adrian Hunter,
	Alexander Shishkin, Andi Kleen, Arnaldo Carvalho de Melo,
	David Miller, Eric Saint-Etienne, Ingo Molnar, Jiri Olsa,
	Kim Phillips, Konstantin Khlebnikov, Milian Wolff,
	Peter Zijlstra, Rob Gardner, Sandipan Das, linux-kernel

On Sat, Feb 23, 2019 at 01:21:34PM +0900, Namhyung Kim wrote:
> Hello,
> 
> On Fri, Feb 22, 2019 at 1:07 AM Jonas Rabenstein
> <jonas.rabenstein@studium.uni-erlangen.de> wrote:
> >
> > Use map__inlines to resolve inlined functions for every entry with
> > an symbol that should be added to a callchain.
> >
> > Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
> > ---
> >  tools/perf/util/machine.c | 115 ++++++++++++++++++++++++++++----------
> >  1 file changed, 87 insertions(+), 28 deletions(-)
> >
> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > index dce29c21e4ea..070d074482b4 100644
> > --- a/tools/perf/util/machine.c
> > +++ b/tools/perf/util/machine.c
> > @@ -1908,6 +1908,91 @@ struct iterations {
> >         u64 cycles;
> >  };
> >
> > +static int __add_callchain_location(struct callchain_cursor *cursor,
> > +                                   struct symbol **parent,
> > +                                   struct addr_location *root_al,
> > +                                   u64 addr, struct addr_location *al,
> > +                                   bool branch, struct branch_flags *flags,
> > +                                   u64 branch_from, struct iterations *iter)
> > +{
> > +       int nr_loop_iter = 0;
> > +       u64 iter_cycles = 0;
> > +
> > +       if (symbol_conf.hide_unresolved && al->sym == NULL)
> > +               return 0;
> > +
> > +       if (al->sym) {
> > +               if (perf_hpp_list.parent && !*parent &&
> > +                 symbol__match_regex(al->sym, &parent_regex))
> > +                       *parent = al->sym;
> > +               else if (have_ignore_callees && root_al &&
> > +                 symbol__match_regex(al->sym, &ignore_callees_regex)) {
> > +                       /* Treat this symbol as the root,
> > +                          forgetting its callees. */
> > +                       *root_al = *al;
> > +                       callchain_cursor_reset(cursor);
> > +               }
> > +       }
> > +
> > +       if (iter) {
> > +               nr_loop_iter = iter->nr_loop_iter;
> > +               iter_cycles = iter->cycles;
> > +       }
> > +       return callchain_cursor_append(cursor, addr, al->map, al->sym, branch,
> > +                                      flags, nr_loop_iter, iter_cycles,
> > +                                      branch_from, al->srcline);
> > +}
> > +
> > +static int __add_callchain_ip(struct callchain_cursor *cursor, u64 ip,
> > +                             struct addr_location *al, bool branch,
> > +                             struct branch_flags *flags, u64 branch_from,
> > +                             struct iterations *iter, struct symbol **parent,
> > +                             struct addr_location *root_al)
> > +{
> > +       struct inline_node *inline_node;
> > +       struct inline_list *inline_list;
> > +       const char *srcline;
> > +       struct symbol *symbol;
> > +       int err = 0;
> > +
> > +       al->srcline = callchain_srcline(al->map, al->sym, al->addr);
> > +       if (callchain_param.order == ORDER_CALLER)
> > +               err = __add_callchain_location(cursor, parent, root_al, ip, al,
> > +                                              branch, flags, branch_from, iter);
> > +       if (err || !al->map || !al->sym)
> > +               goto no_inline;
> > +
> > +       inline_node = map__inlines(al->map, ip, al->sym);
> > +       if (!inline_node || list_empty(&inline_node->val))
> > +               goto no_inline;
> > +
> > +       symbol = al->sym;
> > +       srcline = al->srcline;
> > +       list_for_each_entry(inline_list, &inline_node->val, list) {
> > +               if (inline_list->symbol == symbol)
> > +                       continue;
> > +               al->sym = inline_list->symbol;
> > +               al->srcline = inline_list->srcline;
> > +               err = __add_callchain_location(cursor, parent, root_al, ip,
> > +                                              al, branch, flags,
> > +                                              branch_from, iter);
> > +               if (err)
> > +                       break;
> > +       }
> 
> Does this loop do the job both for ORDER_CALLER and ORDER_CALLEE?
The list of inline symbols is already build in he correct order
depending on callchain_param.order (see util/srcline.c:37). As far as I
could see, perf-report in tui mode uses ORDER_CALLER and perf-script
uses ORDER_CALLEE and both show me a correct order (:
> 
> Thanks,
> Namhyung
> 
> 
> > +
> > +       if (callchain_param.order == ORDER_CALLEE) {
> > +               al->srcline = srcline;
> > +               al->sym = symbol;
> > +       }
> > +
> > +no_inline:
> > +       if (!err && callchain_param.order == ORDER_CALLEE)
> > +               err = __add_callchain_location(cursor, parent, root_al, ip, al,
> > +                                              branch, flags, branch_from, iter);
> > +       return err;
> > +}
> > +
> > +
> >  static int add_callchain_ip(struct thread *thread,
> >                             struct callchain_cursor *cursor,
> >                             struct symbol **parent,
> > @@ -1920,9 +2005,6 @@ static int add_callchain_ip(struct thread *thread,
> >                             u64 branch_from)
> >  {
> >         struct addr_location al;
> > -       int nr_loop_iter = 0;
> > -       u64 iter_cycles = 0;
> > -       const char *srcline = NULL;
> >
> >         al.filtered = 0;
> >         al.sym = NULL;
> > @@ -1955,31 +2037,8 @@ static int add_callchain_ip(struct thread *thread,
> >                 thread__find_symbol(thread, *cpumode, ip, &al);
> >         }
> >
> > -       if (al.sym != NULL) {
> > -               if (perf_hpp_list.parent && !*parent &&
> > -                   symbol__match_regex(al.sym, &parent_regex))
> > -                       *parent = al.sym;
> > -               else if (have_ignore_callees && root_al &&
> > -                 symbol__match_regex(al.sym, &ignore_callees_regex)) {
> > -                       /* Treat this symbol as the root,
> > -                          forgetting its callees. */
> > -                       *root_al = al;
> > -                       callchain_cursor_reset(cursor);
> > -               }
> > -       }
> > -
> > -       if (symbol_conf.hide_unresolved && al.sym == NULL)
> > -               return 0;
> > -
> > -       if (iter) {
> > -               nr_loop_iter = iter->nr_loop_iter;
> > -               iter_cycles = iter->cycles;
> > -       }
> > -
> > -       srcline = callchain_srcline(al.map, al.sym, al.addr);
> > -       return callchain_cursor_append(cursor, ip, al.map, al.sym,
> > -                                      branch, flags, nr_loop_iter,
> > -                                      iter_cycles, branch_from, srcline);
> > +       return __add_callchain_ip(cursor, ip, &al, branch, flags, branch_from,
> > +                                 iter, parent, root_al);
> >  }
> >
> >  struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
> > --
> > 2.19.2
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-02-23 12:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-21 16:06 [PATCH 0/3] Support inline symbols in callchains Jonas Rabenstein
2019-02-21 16:06 ` [PATCH 1/3] perf map: add function to lookup inline symbols Jonas Rabenstein
2019-02-21 16:06 ` [PATCH 2/3] perf machine: use map__inlines in append_inlines Jonas Rabenstein
2019-02-21 16:06 ` [PATCH 3/3] perf machine: add inline symbols to callchains Jonas Rabenstein
2019-02-23  4:21   ` Namhyung Kim
2019-02-23 12:05     ` Jonas Rabenstein
2019-02-23  4:56 ` [PATCH 0/3] Support inline symbols in callchains Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).