All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] generate full callchain cursor entries for inlined frames
@ 2017-05-18 19:34 Milian Wolff
  2017-05-18 19:34 ` [PATCH 1/7] perf report: remove code to handle inline frames from browsers Milian Wolff
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 19:34 UTC (permalink / raw)
  To: Linux-kernel; +Cc: linux-perf-users, Milian Wolff

This series of patches completely reworks the way inline frames are handled.
Instead of querying for the inline nodes on-demand in the individual tools,
we now create proper callchain nodes for inlined frames. The advantages this
approach brings are numerous:

- less duplicated code in the individual browser
- aggregated cost for inlined frames for the --children top-down list
- various bug fixes that arose from querying for a srcline/symbol based on
  the IP of a sample, which will always point to the last inlined frame
  instead of the corresponding non-inlined frame
- overall much better support for visualizing cost for heavily-inlined C++
  code, which simply was confusing and unreliably before
- srcline honors the global setting as to whether full paths or basenames
  should be shown

For comparison, below lists the output before and after for `perf script`
and `perf report`. The example file I used to generate the perf data is:

~~~~~
$ cat inlining.cpp
#include <complex>
#include <cmath>
#include <random>
#include <iostream>

using namespace std;

int main()
{
    uniform_real_distribution<double> uniform(-1E5, 1E5);
    default_random_engine engine;
    double s = 0;
    for (int i = 0; i < 10000000; ++i) {
        s += norm(complex<double>(uniform(engine), uniform(engine)));
    }
    cout << s << '\n';
    return 0;
}
$ g++ -O2 -g -o inlining inlining.cpp
$ perf record --call-graph dwarf ./inlining
~~~~~

Now, the (broken) status-quo looks like this. Look for "NOTE:" to see some
of my comments that outline the various issues I'm trying to solve by this
patch series.

~~~~~
$ perf script --inline
...
inlining 11083 97459.356656:      33680 cycles:
                   214f7 __hypot_finite (/usr/lib/libm-2.25.so)
                    ace3 hypot (/usr/lib/libm-2.25.so)
                     a4a main (/home/milian/projects/src/perf-tests/inlining)
                         std::__complex_abs
                         std::abs<double>
                         std::_Norm_helper<true>::_S_do_it<double>
                         std::norm<double>
                         main
                   20510 __libc_start_main (/usr/lib/libc-2.25.so)
                     bd9 _start (/home/milian/projects/src/perf-tests/inlining)
# NOTE: the above inlined stack is confusing: the a4a is an address into main,
#       which is the non-inlined symbol. the entry with the address should be
#       at the end of the stack, where it's actually duplicated once more but
#       there it's missing the address
...
$ perf report -s sym -g srcline -i perf.inlining.data --inline --stdio
...
             --38.86%--_start
                       __libc_start_main
                       |
                       |--15.68%--main random.tcc:3326
                       |          /home/milian/projects/src/perf-tests/inlining.cpp:14 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.tcc:3326 (inline)
                       |
                       |--10.36%--main random.h:143
                       |          /home/milian/projects/src/perf-tests/inlining.cpp:14 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.tcc:3332 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:332 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:151 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:143 (inline)
                       |
                       |--5.66%--main random.tcc:3332
                       |          /home/milian/projects/src/perf-tests/inlining.cpp:14 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.tcc:3332 (inline)
...
# NOTE: the grouping is totally off because the first and last frame of the
        inline nodes is completely bogus, since the IP is used to find the sym/srcline
        which is different from the actual inlined sym/srcline.
        also, the code currently displays either the inlined function name or
        the corresponding filename (but in full length, instead of just the basename).

$ perf report -s sym -g srcline -i perf.inlining.data --inline --stdio --no-children
...
    38.86%  [.] main
            |
            |--15.68%--main random.tcc:3326
            |          /usr/include/c++/6.3.1/bits/random.tcc:3326 (inline)
            |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
            |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
            |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
            |          /home/milian/projects/src/perf-tests/inlining.cpp:14 (inline)
            |          __libc_start_main
            |          _start
...
# NOTE: the srcline for main is wrong, it should be inlining.cpp:14,
        i.e. what is displayed in the line below (see also perf script issue above)
~~~~~

Afterwards, all of the above issues are resolved:

~~~~~
$ perf script --inline
...
inlining 11083 97459.356656:      33680 cycles:
                   214f7 __hypot_finite (/usr/lib/libm-2.25.so)
                    ace3 hypot (/usr/lib/libm-2.25.so)
                     a4a std::__complex_abs (inlined)
                     a4a std::abs<double> (inlined)
                     a4a std::_Norm_helper<true>::_S_do_it<double> (inlined)
                     a4a std::norm<double> (inlined)
                     a4a main (/home/milian/projects/src/perf-tests/inlining)
                   20510 __libc_start_main (/usr/lib/libc-2.25.so)
                     bd9 _start (/home/milian/projects/src/perf-tests/inlining)
...
# NOTE: only one main entry, at the correct position.
        we do display the (repeated) instruction pointer as that ensures
        interoperability with e.g. the stackcollapse-perf.pl script

$ perf report -s sym -g srcline -i perf.inlining.data --inline --stdio
...
   100.00%    38.86%  [.] main
            |
            |--61.14%--main inlining.cpp:14
            |          std::norm<double> complex:664 (inlined)
            |          std::_Norm_helper<true>::_S_do_it<double> complex:654 (inlined)
            |          std::abs<double> complex:597 (inlined)
            |          std::__complex_abs complex:589 (inlined)
            |          |
            |          |--60.29%--hypot
            |          |          |
            |          |           --56.03%--__hypot_finite
            |          |
            |           --0.85%--cabs
            |
             --38.86%--_start
                       __libc_start_main
                       |
                       |--38.19%--main inlining.cpp:14
                       |          |
                       |          |--35.59%--std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1809 (inlined)
                       |          |          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1818 (inlined)
                       |          |          |
                       |          |           --34.37%--std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() random.h:185 (inlined)
                       |          |                     |
                       |          |                     |--17.91%--std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.tcc:3332 (inlined)
                       |          |                     |          |
                       |          |                     |           --12.24%--std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>::operator() random.h:332 (inlined)
                       |          |                     |                     std::__detail::__mod<unsigned long, 2147483647ul, 16807ul, 0ul> random.h:151 (inlined)
                       |          |                     |                     |
                       |          |                     |                     |--10.36%--std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul, 0ul, true, true>::__calc random.h:143 (inlined)
                       |          |                     |                     |
                       |          |                     |                      --1.88%--std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul, 0ul, true, true>::__calc random.h:141 (inlined)
                       |          |                     |
                       |          |                     |--15.68%--std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.tcc:3326 (inlined)
                       |          |                     |
                       |          |                      --0.79%--std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.tcc:3335 (inlined)
                       |          |
                       |           --1.99%--std::norm<double> complex:664 (inlined)
                       |                     std::_Norm_helper<true>::_S_do_it<double> complex:654 (inlined)
                       |                     std::abs<double> complex:597 (inlined)
                       |                     std::__complex_abs complex:589 (inlined)
                       |
                        --0.67%--main inlining.cpp:13
...

# NOTE: still somewhat confusing due to the _start and __libc_start_main frames
        that actually are *above* the main frame. But at least the stuff below
        properly splits up and shows that mutiple functions got inlined into
        inlining.cpp:14, not just one as before.

$ perf report -s sym -g srcline -i perf.inlining.data --inline --stdio --no-children
...
    38.86%  [.] main
            |
            |--15.68%--std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.tcc:3326 (inlined)
            |          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() random.h:185 (inlined)
            |          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1818 (inlined)
            |          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1809 (inlined)
            |          main inlining.cpp:14
            |          __libc_start_main
            |          _start
...
# NOTE: the first and last entry of the inline stack have the correct symbol and srcline now
        both function and srcline is shown, as well as the (inlined) suffix
        only the basename of the srcline is shown
~~~~~

Milian Wolff (7):
  perf report: remove code to handle inline frames from browsers
  perf util: take elf_name as const string in dso__demangle_sym
  perf report: create real callchain entries for inlined frames
  perf report: use srcline from inlined frames
  perf report: fall-back to function name comparison for -g srcline
  perf report: mark inlined frames in output by " (inlined)" suffix
  perf script: mark inlined frames and do not print DSO for them

 tools/perf/ui/browsers/hists.c   | 183 +++------------------------------------
 tools/perf/ui/stdio/hist.c       |  80 +----------------
 tools/perf/util/callchain.c      |  52 +++++------
 tools/perf/util/callchain.h      |   5 +-
 tools/perf/util/dso.c            |   2 +
 tools/perf/util/dso.h            |   1 +
 tools/perf/util/evsel_fprintf.c  |  37 +-------
 tools/perf/util/hist.c           |   5 --
 tools/perf/util/machine.c        |  54 +++++++++++-
 tools/perf/util/sort.h           |   1 -
 tools/perf/util/srcline.c        | 183 ++++++++++++++++++++++++++++++---------
 tools/perf/util/srcline.h        |  19 +++-
 tools/perf/util/symbol-elf.c     |   2 +-
 tools/perf/util/symbol-minimal.c |   2 +-
 tools/perf/util/symbol.h         |   3 +-
 15 files changed, 264 insertions(+), 365 deletions(-)

-- 
2.13.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/7] perf report: remove code to handle inline frames from browsers
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
@ 2017-05-18 19:34 ` Milian Wolff
  2017-05-18 19:34 ` [PATCH 2/7] perf util: take elf_name as const string in dso__demangle_sym Milian Wolff
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 19:34 UTC (permalink / raw)
  To: Linux-kernel
  Cc: linux-perf-users, Milian Wolff, Arnaldo Carvalho de Melo,
	David Ahern, Namhyung Kim, Peter Zijlstra, Yao Jin

A follow-up commit will make inline frames first-class citizens
in the callchain, thereby obsoleting all of this special code.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
---
 tools/perf/ui/browsers/hists.c  | 183 +++-------------------------------------
 tools/perf/ui/stdio/hist.c      |  80 +-----------------
 tools/perf/util/evsel_fprintf.c |  32 -------
 tools/perf/util/hist.c          |   5 --
 tools/perf/util/sort.h          |   1 -
 5 files changed, 13 insertions(+), 288 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 69f4570bd4f9..48d056bb3747 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -154,60 +154,9 @@ static void callchain_list__set_folding(struct callchain_list *cl, bool unfold)
 	cl->unfolded = unfold ? cl->has_children : false;
 }
 
-static struct inline_node *inline_node__create(struct map *map, u64 ip)
-{
-	struct dso *dso;
-	struct inline_node *node;
-
-	if (map == NULL)
-		return NULL;
-
-	dso = map->dso;
-	if (dso == NULL)
-		return NULL;
-
-	if (dso->kernel != DSO_TYPE_USER)
-		return NULL;
-
-	node = dso__parse_addr_inlines(dso,
-				       map__rip_2objdump(map, ip));
-
-	return node;
-}
-
-static int inline__count_rows(struct inline_node *node)
-{
-	struct inline_list *ilist;
-	int i = 0;
-
-	if (node == NULL)
-		return 0;
-
-	list_for_each_entry(ilist, &node->val, list) {
-		if ((ilist->filename != NULL) || (ilist->funcname != NULL))
-			i++;
-	}
-
-	return i;
-}
-
-static int callchain_list__inline_rows(struct callchain_list *chain)
-{
-	struct inline_node *node;
-	int rows;
-
-	node = inline_node__create(chain->ms.map, chain->ip);
-	if (node == NULL)
-		return 0;
-
-	rows = inline__count_rows(node);
-	inline_node__delete(node);
-	return rows;
-}
-
 static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
 {
-	int n = 0, inline_rows;
+	int n = 0;
 	struct rb_node *nd;
 
 	for (nd = rb_first(&node->rb_root); nd; nd = rb_next(nd)) {
@@ -218,12 +167,6 @@ static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
 		list_for_each_entry(chain, &child->val, list) {
 			++n;
 
-			if (symbol_conf.inline_name) {
-				inline_rows =
-					callchain_list__inline_rows(chain);
-				n += inline_rows;
-			}
-
 			/* We need this because we may not have children */
 			folded_sign = callchain_list__folded(chain);
 			if (folded_sign == '+')
@@ -275,7 +218,7 @@ static int callchain_node__count_rows(struct callchain_node *node)
 {
 	struct callchain_list *chain;
 	bool unfolded = false;
-	int n = 0, inline_rows;
+	int n = 0;
 
 	if (callchain_param.mode == CHAIN_FLAT)
 		return callchain_node__count_flat_rows(node);
@@ -284,10 +227,6 @@ static int callchain_node__count_rows(struct callchain_node *node)
 
 	list_for_each_entry(chain, &node->val, list) {
 		++n;
-		if (symbol_conf.inline_name) {
-			inline_rows = callchain_list__inline_rows(chain);
-			n += inline_rows;
-		}
 
 		unfolded = chain->unfolded;
 	}
@@ -435,19 +374,6 @@ static void hist_entry__init_have_children(struct hist_entry *he)
 	he->init_have_children = true;
 }
 
-static void hist_entry_init_inline_node(struct hist_entry *he)
-{
-	if (he->inline_node)
-		return;
-
-	he->inline_node = inline_node__create(he->ms.map, he->ip);
-
-	if (he->inline_node == NULL)
-		return;
-
-	he->has_children = true;
-}
-
 static bool hist_browser__toggle_fold(struct hist_browser *browser)
 {
 	struct hist_entry *he = browser->he_selection;
@@ -479,12 +405,8 @@ static bool hist_browser__toggle_fold(struct hist_browser *browser)
 
 		if (he->unfolded) {
 			if (he->leaf)
-				if (he->inline_node)
-					he->nr_rows = inline__count_rows(
-							he->inline_node);
-				else
-					he->nr_rows = callchain__count_rows(
-							&he->sorted_chain);
+				he->nr_rows = callchain__count_rows(
+						&he->sorted_chain);
 			else
 				he->nr_rows = hierarchy_count_rows(browser, he, false);
 
@@ -844,71 +766,6 @@ static bool hist_browser__check_dump_full(struct hist_browser *browser __maybe_u
 
 #define LEVEL_OFFSET_STEP 3
 
-static int hist_browser__show_inline(struct hist_browser *browser,
-				     struct inline_node *node,
-				     unsigned short row,
-				     int offset)
-{
-	struct inline_list *ilist;
-	char buf[1024];
-	int color, width, first_row;
-
-	first_row = row;
-	width = browser->b.width - (LEVEL_OFFSET_STEP + 2);
-	list_for_each_entry(ilist, &node->val, list) {
-		if ((ilist->filename != NULL) || (ilist->funcname != NULL)) {
-			color = HE_COLORSET_NORMAL;
-			if (ui_browser__is_current_entry(&browser->b, row))
-				color = HE_COLORSET_SELECTED;
-
-			if (callchain_param.key == CCKEY_ADDRESS ||
-			    callchain_param.key == CCKEY_SRCLINE) {
-				if (ilist->filename != NULL)
-					scnprintf(buf, sizeof(buf),
-						  "%s:%d (inline)",
-						  ilist->filename,
-						  ilist->line_nr);
-				else
-					scnprintf(buf, sizeof(buf), "??");
-			} else if (ilist->funcname != NULL)
-				scnprintf(buf, sizeof(buf), "%s (inline)",
-					  ilist->funcname);
-			else if (ilist->filename != NULL)
-				scnprintf(buf, sizeof(buf),
-					  "%s:%d (inline)",
-					  ilist->filename,
-					  ilist->line_nr);
-			else
-				scnprintf(buf, sizeof(buf), "??");
-
-			ui_browser__set_color(&browser->b, color);
-			hist_browser__gotorc(browser, row, 0);
-			ui_browser__write_nstring(&browser->b, " ",
-				LEVEL_OFFSET_STEP + offset);
-			ui_browser__write_nstring(&browser->b, buf, width);
-			row++;
-		}
-	}
-
-	return row - first_row;
-}
-
-static size_t show_inline_list(struct hist_browser *browser, struct map *map,
-			       u64 ip, int row, int offset)
-{
-	struct inline_node *node;
-	int ret;
-
-	node = inline_node__create(map, ip);
-	if (node == NULL)
-		return 0;
-
-	ret = hist_browser__show_inline(browser, node, row, offset);
-
-	inline_node__delete(node);
-	return ret;
-}
-
 static int hist_browser__show_callchain_list(struct hist_browser *browser,
 					     struct callchain_node *node,
 					     struct callchain_list *chain,
@@ -920,7 +777,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
 	char bf[1024], *alloc_str;
 	char buf[64], *alloc_str2;
 	const char *str;
-	int inline_rows = 0, ret = 1;
+	int ret = 1;
 
 	if (arg->row_offset != 0) {
 		arg->row_offset--;
@@ -961,12 +818,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
 	free(alloc_str);
 	free(alloc_str2);
 
-	if (symbol_conf.inline_name) {
-		inline_rows = show_inline_list(browser, chain->ms.map,
-					       chain->ip, row + 1, offset);
-	}
-
-	return ret + inline_rows;
+	return ret;
 }
 
 static bool check_percent_display(struct rb_node *node, u64 parent_total)
@@ -1390,12 +1242,6 @@ static int hist_browser__show_entry(struct hist_browser *browser,
 		folded_sign = hist_entry__folded(entry);
 	}
 
-	if (symbol_conf.inline_name &&
-	    (!entry->has_children)) {
-		hist_entry_init_inline_node(entry);
-		folded_sign = hist_entry__folded(entry);
-	}
-
 	if (row_offset == 0) {
 		struct hpp_arg arg = {
 			.b		= &browser->b,
@@ -1427,8 +1273,7 @@ static int hist_browser__show_entry(struct hist_browser *browser,
 			}
 
 			if (first) {
-				if (symbol_conf.use_callchain ||
-					symbol_conf.inline_name) {
+				if (symbol_conf.use_callchain) {
 					ui_browser__printf(&browser->b, "%c ", folded_sign);
 					width -= 2;
 				}
@@ -1470,15 +1315,11 @@ static int hist_browser__show_entry(struct hist_browser *browser,
 			.is_current_entry = current_entry,
 		};
 
-		if (entry->inline_node)
-			printed += hist_browser__show_inline(browser,
-					entry->inline_node, row, 0);
-		else
-			printed += hist_browser__show_callchain(browser,
-					entry, 1, row,
-					hist_browser__show_callchain_entry,
-					&arg,
-					hist_browser__check_output_full);
+		printed += hist_browser__show_callchain(browser,
+				entry, 1, row,
+				hist_browser__show_callchain_entry,
+				&arg,
+				hist_browser__check_output_full);
 	}
 
 	return printed;
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 42e432bd2eb4..2938c5c9d3c0 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -20,67 +20,6 @@ static size_t callchain__fprintf_left_margin(FILE *fp, int left_margin)
 	return ret;
 }
 
-static size_t inline__fprintf(struct map *map, u64 ip, int left_margin,
-			      int depth, int depth_mask, FILE *fp)
-{
-	struct dso *dso;
-	struct inline_node *node;
-	struct inline_list *ilist;
-	int ret = 0, i;
-
-	if (map == NULL)
-		return 0;
-
-	dso = map->dso;
-	if (dso == NULL)
-		return 0;
-
-	if (dso->kernel != DSO_TYPE_USER)
-		return 0;
-
-	node = dso__parse_addr_inlines(dso,
-				       map__rip_2objdump(map, ip));
-	if (node == NULL)
-		return 0;
-
-	list_for_each_entry(ilist, &node->val, list) {
-		if ((ilist->filename != NULL) || (ilist->funcname != NULL)) {
-			ret += callchain__fprintf_left_margin(fp, left_margin);
-
-			for (i = 0; i < depth; i++) {
-				if (depth_mask & (1 << i))
-					ret += fprintf(fp, "|");
-				else
-					ret += fprintf(fp, " ");
-				ret += fprintf(fp, "          ");
-			}
-
-			if (callchain_param.key == CCKEY_ADDRESS ||
-			    callchain_param.key == CCKEY_SRCLINE) {
-				if (ilist->filename != NULL)
-					ret += fprintf(fp, "%s:%d (inline)",
-						       ilist->filename,
-						       ilist->line_nr);
-				else
-					ret += fprintf(fp, "??");
-			} else if (ilist->funcname != NULL)
-				ret += fprintf(fp, "%s (inline)",
-					       ilist->funcname);
-			else if (ilist->filename != NULL)
-				ret += fprintf(fp, "%s:%d (inline)",
-					       ilist->filename,
-					       ilist->line_nr);
-			else
-				ret += fprintf(fp, "??");
-
-			ret += fprintf(fp, "\n");
-		}
-	}
-
-	inline_node__delete(node);
-	return ret;
-}
-
 static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
 					  int left_margin)
 {
@@ -143,9 +82,6 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
 	fputc('\n', fp);
 	free(alloc_str);
 
-	if (symbol_conf.inline_name)
-		ret += inline__fprintf(chain->ms.map, chain->ip,
-				       left_margin, depth, depth_mask, fp);
 	return ret;
 }
 
@@ -320,13 +256,6 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,
 
 			if (++entries_printed == callchain_param.print_limit)
 				break;
-
-			if (symbol_conf.inline_name)
-				ret += inline__fprintf(chain->ms.map,
-						       chain->ip,
-						       left_margin,
-						       0, 0,
-						       fp);
 		}
 		root = &cnode->rb_root;
 	}
@@ -606,7 +535,6 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 {
 	int ret;
 	int callchain_ret = 0;
-	int inline_ret = 0;
 	struct perf_hpp hpp = {
 		.buf		= bf,
 		.size		= size,
@@ -628,13 +556,7 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 		callchain_ret = hist_entry_callchain__fprintf(he, total_period,
 							      0, fp);
 
-	if (callchain_ret == 0 && symbol_conf.inline_name) {
-		inline_ret = inline__fprintf(he->ms.map, he->ip, 0, 0, 0, fp);
-		ret += inline_ret;
-		if (inline_ret > 0)
-			ret += fprintf(fp, "\n");
-	} else
-		ret += callchain_ret;
+	ret += callchain_ret;
 
 	return ret;
 }
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index 583f3a602506..f2c6c5ee11e8 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -169,38 +169,6 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 			if (!print_oneline)
 				printed += fprintf(fp, "\n");
 
-			if (symbol_conf.inline_name && node->map) {
-				struct inline_node *inode;
-
-				addr = map__rip_2objdump(node->map, node->ip),
-				inode = dso__parse_addr_inlines(node->map->dso, addr);
-
-				if (inode) {
-					struct inline_list *ilist;
-
-					list_for_each_entry(ilist, &inode->val, list) {
-						if (print_arrow)
-							printed += fprintf(fp, " <-");
-
-						/* IP is same, just skip it */
-						if (print_ip)
-							printed += fprintf(fp, "%c%16s",
-									   s, "");
-						if (print_sym)
-							printed += fprintf(fp, " %s",
-									   ilist->funcname);
-						if (print_srcline)
-							printed += fprintf(fp, "\n  %s:%d",
-									   ilist->filename,
-									   ilist->line_nr);
-						if (!print_oneline)
-							printed += fprintf(fp, "\n");
-					}
-
-					inline_node__delete(inode);
-				}
-			}
-
 			if (symbol_conf.bt_stop_list &&
 			    node->sym &&
 			    strlist__has_entry(symbol_conf.bt_stop_list,
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index cf0186a088c1..ed7f5c0a41e6 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1140,11 +1140,6 @@ void hist_entry__delete(struct hist_entry *he)
 		zfree(&he->mem_info);
 	}
 
-	if (he->inline_node) {
-		inline_node__delete(he->inline_node);
-		he->inline_node = NULL;
-	}
-
 	zfree(&he->stat_acc);
 	free_srcline(he->srcline);
 	if (he->srcfile && he->srcfile[0])
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index b7c75597e18f..2f41416f01fa 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -129,7 +129,6 @@ struct hist_entry {
 	};
 	char			*srcline;
 	char			*srcfile;
-	struct inline_node	*inline_node;
 	struct symbol		*parent;
 	struct branch_info	*branch_info;
 	struct hists		*hists;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/7] perf util: take elf_name as const string in dso__demangle_sym
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
  2017-05-18 19:34 ` [PATCH 1/7] perf report: remove code to handle inline frames from browsers Milian Wolff
@ 2017-05-18 19:34 ` Milian Wolff
  2017-05-18 19:34 ` [PATCH 3/7] perf report: create real callchain entries for inlined frames Milian Wolff
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 19:34 UTC (permalink / raw)
  To: Linux-kernel
  Cc: linux-perf-users, Milian Wolff, Arnaldo Carvalho de Melo,
	David Ahern, Namhyung Kim, Peter Zijlstra, Yao Jin

The input string is not modified and thus can be passed
in as a pointer to const data.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
---
 tools/perf/util/symbol-elf.c     | 2 +-
 tools/perf/util/symbol-minimal.c | 2 +-
 tools/perf/util/symbol.h         | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index e7ee47f7377a..a3c4d0204439 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -391,7 +391,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss, struct map *
 	return 0;
 }
 
-char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name)
+char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name)
 {
 	return demangle_sym(dso, kmodule, elf_name);
 }
diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c
index 40bf5d4c0bfd..1a5aa35b0100 100644
--- a/tools/perf/util/symbol-minimal.c
+++ b/tools/perf/util/symbol-minimal.c
@@ -377,7 +377,7 @@ void symbol__elf_init(void)
 
 char *dso__demangle_sym(struct dso *dso __maybe_unused,
 			int kmodule __maybe_unused,
-			char *elf_name __maybe_unused)
+			const char *elf_name __maybe_unused)
 {
 	return NULL;
 }
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 41ebba9a2eb2..f0b08810d7fa 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -306,7 +306,7 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss,
 				struct map *map);
 
-char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name);
+char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
 void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel);
 void symbols__insert(struct rb_root *symbols, struct symbol *sym);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/7] perf report: create real callchain entries for inlined frames
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
  2017-05-18 19:34 ` [PATCH 1/7] perf report: remove code to handle inline frames from browsers Milian Wolff
  2017-05-18 19:34 ` [PATCH 2/7] perf util: take elf_name as const string in dso__demangle_sym Milian Wolff
@ 2017-05-18 19:34 ` Milian Wolff
  2017-05-22 12:19   ` Namhyung Kim
  2017-05-18 19:34 ` [PATCH 4/7] perf report: use srcline from " Milian Wolff
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 19:34 UTC (permalink / raw)
  To: Linux-kernel
  Cc: linux-perf-users, Milian Wolff, Arnaldo Carvalho de Melo,
	David Ahern, Namhyung Kim, Peter Zijlstra, Yao Jin

The inlined frames use a fake symbol that is tracked in a special
map inside the dso, which is always sorted by name. All other
entries of the symbol beside the function name are unused for
inline frames. The advantage of this approach is that all existing
users of the callchain API can now transparently display inlined
frames without having to patch their code.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
---
 tools/perf/util/dso.c     |   2 +
 tools/perf/util/dso.h     |   1 +
 tools/perf/util/machine.c |  37 +++++++++++++
 tools/perf/util/srcline.c | 136 ++++++++++++++++++++++++++++++++++++----------
 tools/perf/util/srcline.h |  16 +++++-
 tools/perf/util/symbol.h  |   1 +
 6 files changed, 162 insertions(+), 31 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index a96a99d2369f..3563d190e6f0 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -9,6 +9,7 @@
 #include "compress.h"
 #include "path.h"
 #include "symbol.h"
+#include "srcline.h"
 #include "dso.h"
 #include "machine.h"
 #include "auxtrace.h"
@@ -1132,6 +1133,7 @@ void dso__delete(struct dso *dso)
 		       dso->long_name);
 	for (i = 0; i < MAP__NR_TYPES; ++i)
 		symbols__delete(&dso->symbols[i]);
+	inlines__tree_delete(&dso->inlined_nodes);
 
 	if (dso->short_name_allocated) {
 		zfree((char **)&dso->short_name);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 12350b171727..26e717278d0d 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -139,6 +139,7 @@ struct dso {
 	struct rb_root	 *root;		/* root of rbtree that rb_node is in */
 	struct rb_root	 symbols[MAP__NR_TYPES];
 	struct rb_root	 symbol_names[MAP__NR_TYPES];
+	struct rb_root	 inlined_nodes;
 	struct {
 		u64		addr;
 		struct symbol	*symbol;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index d97e014c3df3..64bfa8b43706 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2031,12 +2031,49 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 	return 0;
 }
 
+static int append_inlines(struct callchain_cursor *cursor,
+			  struct map *map, struct symbol *sym, u64 ip)
+{
+	struct inline_node *inline_node;
+	struct inline_list *ilist;
+	u64 addr;
+
+	if (!symbol_conf.inline_name || !map || !sym)
+		return 1;
+
+	addr = map__rip_2objdump(map, ip);
+
+	inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr);
+	if (!inline_node) {
+		inline_node = dso__parse_addr_inlines(map->dso, addr, sym);
+		if (!inline_node)
+			return 1;
+
+		inlines__tree_insert(&map->dso->inlined_nodes, inline_node);
+	}
+
+	list_for_each_entry(ilist, &inline_node->val, list) {
+		int ret = callchain_cursor_append(cursor, ip, map,
+						  ilist->symbol, false,
+						  NULL, 0, 0);
+
+		if (ret != 0)
+			return ret;
+	}
+
+	return 0;
+}
+
 static int unwind_entry(struct unwind_entry *entry, void *arg)
 {
 	struct callchain_cursor *cursor = arg;
 
 	if (symbol_conf.hide_unresolved && entry->sym == NULL)
 		return 0;
+
+	if (append_inlines(cursor, entry->map, entry->sym, entry->ip) == 0)
+		return 0;
+
 	return callchain_cursor_append(cursor, entry->ip,
 				       entry->map, entry->sym,
 				       false, NULL, 0, 0);
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index ebc88a74e67b..25ceed8eb790 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -33,29 +33,19 @@ static const char *dso__name(struct dso *dso)
 	return dso_name;
 }
 
-static int inline_list__append(char *filename, char *funcname, int line_nr,
-			       struct inline_node *node, struct dso *dso)
+static int inline_list__append(struct symbol *symbol, char *filename, int line_nr,
+			       struct inline_node *node)
 {
 	struct inline_list *ilist;
-	char *demangled;
 
 	ilist = zalloc(sizeof(*ilist));
 	if (ilist == NULL)
 		return -1;
 
+	ilist->symbol = symbol;
 	ilist->filename = filename;
 	ilist->line_nr = line_nr;
 
-	if (dso != NULL) {
-		demangled = dso__demangle_sym(dso, 0, funcname);
-		if (demangled == NULL) {
-			ilist->funcname = funcname;
-		} else {
-			ilist->funcname = demangled;
-			free(funcname);
-		}
-	}
-
 	if (callchain_param.order == ORDER_CALLEE)
 		list_add_tail(&ilist->list, &node->val);
 	else
@@ -203,19 +193,52 @@ static void addr2line_cleanup(struct a2l_data *a2l)
 
 #define MAX_INLINE_NEST 1024
 
+static struct symbol *new_inline_sym(struct dso *dso,
+				     struct symbol *base_sym,
+				     const char *funcname)
+{
+	struct symbol *inline_sym;
+	char *demangled = NULL;
+
+	if (dso) {
+		demangled = dso__demangle_sym(dso, 0, funcname);
+		if (demangled)
+			funcname = demangled;
+	}
+
+	if (strcmp(funcname, base_sym->name) == 0) {
+		// reuse the real, existing symbol
+		inline_sym = base_sym;
+	} else {
+		// create a fake symbol for the inline frame
+		inline_sym = symbol__new(base_sym ? base_sym->start : 0,
+					 base_sym ? base_sym->end : 0,
+					 base_sym ? base_sym->binding : 0,
+					 funcname);
+		if (inline_sym)
+			inline_sym->inlined = 1;
+	}
+
+	free(demangled);
+
+	return inline_sym;
+}
+
 static int inline_list__append_dso_a2l(struct dso *dso,
-				       struct inline_node *node)
+				       struct inline_node *node,
+				       struct symbol *sym)
 {
 	struct a2l_data *a2l = dso->a2l;
-	char *funcname = a2l->funcname ? strdup(a2l->funcname) : NULL;
 	char *filename = a2l->filename ? strdup(a2l->filename) : NULL;
+	struct symbol *inline_sym = new_inline_sym(dso, sym, a2l->funcname);
 
-	return inline_list__append(filename, funcname, a2l->line, node, dso);
+	return inline_list__append(inline_sym, filename, a2l->line, node);
 }
 
 static int addr2line(const char *dso_name, u64 addr,
 		     char **file, unsigned int *line, struct dso *dso,
-		     bool unwind_inlines, struct inline_node *node)
+		     bool unwind_inlines, struct inline_node *node,
+		     struct symbol *sym)
 {
 	int ret = 0;
 	struct a2l_data *a2l = dso->a2l;
@@ -241,7 +264,7 @@ static int addr2line(const char *dso_name, u64 addr,
 	if (unwind_inlines) {
 		int cnt = 0;
 
-		if (node && inline_list__append_dso_a2l(dso, node))
+		if (node && inline_list__append_dso_a2l(dso, node, sym))
 			return 0;
 
 		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
@@ -249,7 +272,7 @@ static int addr2line(const char *dso_name, u64 addr,
 		       cnt++ < MAX_INLINE_NEST) {
 
 			if (node != NULL) {
-				if (inline_list__append_dso_a2l(dso, node))
+				if (inline_list__append_dso_a2l(dso, node, sym))
 					return 0;
 				// found at least one inline frame
 				ret = 1;
@@ -281,7 +304,7 @@ void dso__free_a2l(struct dso *dso)
 }
 
 static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
-	struct dso *dso)
+	struct dso *dso, struct symbol *sym)
 {
 	struct inline_node *node;
 
@@ -294,7 +317,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	INIT_LIST_HEAD(&node->val);
 	node->addr = addr;
 
-	if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node))
+	if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node, sym))
 		goto out_free_inline_node;
 
 	if (list_empty(&node->val))
@@ -334,7 +357,8 @@ static int addr2line(const char *dso_name, u64 addr,
 		     char **file, unsigned int *line_nr,
 		     struct dso *dso __maybe_unused,
 		     bool unwind_inlines __maybe_unused,
-		     struct inline_node *node __maybe_unused)
+		     struct inline_node *node __maybe_unused,
+		     struct symbol *sym __maybe_unused)
 {
 	FILE *fp;
 	char cmd[PATH_MAX];
@@ -374,7 +398,7 @@ void dso__free_a2l(struct dso *dso __maybe_unused)
 }
 
 static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
-	struct dso *dso __maybe_unused)
+	struct dso *dso __maybe_unused, struct symbol *sym)
 {
 	FILE *fp;
 	char cmd[PATH_MAX];
@@ -407,8 +431,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 			goto out;
 		}
 
-		if (inline_list__append(filename, NULL, line_nr, node,
-					NULL) != 0)
+		if (inline_list__append(sym, filename, line_nr, node) != 0)
 			goto out;
 
 		filename = NULL;
@@ -448,7 +471,8 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
 	if (dso_name == NULL)
 		goto out;
 
-	if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines, NULL))
+	if (!addr2line(dso_name, addr, &file, &line, dso,
+		       unwind_inlines, NULL, sym))
 		goto out;
 
 	if (asprintf(&srcline, "%s:%u",
@@ -494,7 +518,8 @@ char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
 	return __get_srcline(dso, addr, sym, show_sym, show_addr, false);
 }
 
-struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
+struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr,
+					    struct symbol *sym)
 {
 	const char *dso_name;
 
@@ -502,7 +527,7 @@ struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
 	if (dso_name == NULL)
 		return NULL;
 
-	return addr2inlines(dso_name, addr, dso);
+	return addr2inlines(dso_name, addr, dso, sym);
 }
 
 void inline_node__delete(struct inline_node *node)
@@ -512,9 +537,62 @@ void inline_node__delete(struct inline_node *node)
 	list_for_each_entry_safe(ilist, tmp, &node->val, list) {
 		list_del_init(&ilist->list);
 		zfree(&ilist->filename);
-		zfree(&ilist->funcname);
+		// only the inlined symbols are owned by the list
+		if (ilist->symbol && ilist->symbol->inlined)
+			symbol__delete(ilist->symbol);
 		free(ilist);
 	}
 
 	free(node);
 }
+
+void inlines__tree_insert(struct rb_root *tree, struct inline_node *inlines)
+{
+	struct rb_node **p = &tree->rb_node;
+	struct rb_node *parent = NULL;
+	const u64 addr = inlines->addr;
+	struct inline_node *i;
+
+	while (*p != NULL) {
+		parent = *p;
+		i = rb_entry(parent, struct inline_node, rb_node);
+		if (addr < i->addr)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	rb_link_node(&inlines->rb_node, parent, p);
+	rb_insert_color(&inlines->rb_node, tree);
+}
+
+struct inline_node *inlines__tree_find(struct rb_root *tree, u64 addr)
+{
+	struct rb_node *n = tree->rb_node;
+
+	while (n) {
+		struct inline_node *i = rb_entry(n, struct inline_node,
+						 rb_node);
+
+		if (addr < i->addr)
+			n = n->rb_left;
+		else if (addr > i->addr)
+			n = n->rb_right;
+		else
+			return i;
+	}
+
+	return NULL;
+}
+
+void inlines__tree_delete(struct rb_root *tree)
+{
+	struct inline_node *pos;
+	struct rb_node *next = rb_first(tree);
+
+	while (next) {
+		pos = rb_entry(next, struct inline_node, rb_node);
+		next = rb_next(&pos->rb_node);
+		rb_erase(&pos->rb_node, tree);
+		inline_node__delete(pos);
+	}
+}
diff --git a/tools/perf/util/srcline.h b/tools/perf/util/srcline.h
index 7b52ba88676e..ad9726987b80 100644
--- a/tools/perf/util/srcline.h
+++ b/tools/perf/util/srcline.h
@@ -2,6 +2,7 @@
 #define PERF_SRCLINE_H
 
 #include <linux/list.h>
+#include <linux/rbtree.h>
 #include <linux/types.h>
 
 struct dso;
@@ -17,8 +18,8 @@ void free_srcline(char *srcline);
 #define SRCLINE_UNKNOWN  ((char *) "??:0")
 
 struct inline_list {
+	struct symbol		*symbol;
 	char			*filename;
-	char			*funcname;
 	unsigned int		line_nr;
 	struct list_head	list;
 };
@@ -26,9 +27,20 @@ struct inline_list {
 struct inline_node {
 	u64			addr;
 	struct list_head	val;
+	struct rb_node		rb_node;
 };
 
-struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr);
+// parse inlined frames for the given address
+struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr,
+					    struct symbol *sym);
+// free resources associated to the inline node list
 void inline_node__delete(struct inline_node *node);
 
+// insert the inline node list into the DSO, which will take ownership
+void inlines__tree_insert(struct rb_root *tree, struct inline_node *inlines);
+// find previously inserted inline node list
+struct inline_node *inlines__tree_find(struct rb_root *tree, u64 addr);
+// delete all nodes within the tree of inline_node s
+void inlines__tree_delete(struct rb_root *tree);
+
 #endif /* PERF_SRCLINE_H */
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index f0b08810d7fa..b358570ce615 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -59,6 +59,7 @@ struct symbol {
 	u8		binding;
 	u8		idle:1;
 	u8		ignore:1;
+	u8		inlined:1;
 	u8		arch_sym;
 	char		name[0];
 };
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/7] perf report: use srcline from inlined frames
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
                   ` (2 preceding siblings ...)
  2017-05-18 19:34 ` [PATCH 3/7] perf report: create real callchain entries for inlined frames Milian Wolff
@ 2017-05-18 19:34 ` Milian Wolff
  2017-05-18 19:34 ` [PATCH 5/7] perf report: fall-back to function name comparison for -g srcline Milian Wolff
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 19:34 UTC (permalink / raw)
  To: Linux-kernel
  Cc: linux-perf-users, Milian Wolff, Arnaldo Carvalho de Melo,
	David Ahern, Namhyung Kim, Peter Zijlstra, Yao Jin

Instead of looking up the srcline based on the IP associated with
a given callchain entry, add a new srcline member and use that.
This member is set either based on the IP for non-inlined frames,
or uses the DWARF data for inlined frames.

To ensure we don't duplicate the work for inlined frames, and also
to ensure we properly free the allocated resources, the srcline
strings are associated with the inlined_list entries cached by the
DSO.

To enable this patch, we put the existing code to merge a fileline
pair into a srcline string. This function honors the global
srcline_full_filename setting, and displays only the basename if
this is set to true (the default). Because we want to be able to
get the basename for a `const char*` we get fro a2l, a GNU-like
implementation of basename is added (the posix version requires
a non-const `char*` input argument).

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
---
 tools/perf/util/callchain.c | 26 ++++++++--------------
 tools/perf/util/callchain.h |  5 +++--
 tools/perf/util/machine.c   | 19 +++++++++++++---
 tools/perf/util/srcline.c   | 53 ++++++++++++++++++++++++++++++++++-----------
 tools/perf/util/srcline.h   |  3 +--
 5 files changed, 69 insertions(+), 37 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 81fc29ac798f..3a349530fee1 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -558,6 +558,7 @@ fill_node(struct callchain_node *node, struct callchain_cursor *cursor)
 		call->ip = cursor_node->ip;
 		call->ms.sym = cursor_node->sym;
 		call->ms.map = map__get(cursor_node->map);
+		call->srcline = cursor_node->srcline;
 
 		if (cursor_node->branch) {
 			call->branch_count = 1;
@@ -621,12 +622,8 @@ enum match_result {
 static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
 					     struct callchain_list *cnode)
 {
-	char *left = get_srcline(cnode->ms.map->dso,
-				 map__rip_2objdump(cnode->ms.map, cnode->ip),
-				 cnode->ms.sym, true, false);
-	char *right = get_srcline(node->map->dso,
-				  map__rip_2objdump(node->map, node->ip),
-				  node->sym, true, false);
+	const char *left = cnode->srcline;
+	const char *right = node->srcline;
 	enum match_result ret = MATCH_EQ;
 	int cmp;
 
@@ -644,8 +641,6 @@ static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
 	if (cmp != 0)
 		ret = cmp < 0 ? MATCH_LT : MATCH_GT;
 
-	free_srcline(left);
-	free_srcline(right);
 	return ret;
 }
 
@@ -917,7 +912,7 @@ merge_chain_branch(struct callchain_cursor *cursor,
 	list_for_each_entry_safe(list, next_list, &src->val, list) {
 		callchain_cursor_append(cursor, list->ip,
 					list->ms.map, list->ms.sym,
-					false, NULL, 0, 0);
+					false, NULL, 0, 0, list->srcline);
 		list_del(&list->list);
 		map__zput(list->ms.map);
 		free(list);
@@ -957,7 +952,7 @@ int callchain_merge(struct callchain_cursor *cursor,
 int callchain_cursor_append(struct callchain_cursor *cursor,
 			    u64 ip, struct map *map, struct symbol *sym,
 			    bool branch, struct branch_flags *flags,
-			    int nr_loop_iter, int samples)
+			    int nr_loop_iter, int samples, const char *srcline)
 {
 	struct callchain_cursor_node *node = *cursor->last;
 
@@ -976,6 +971,7 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
 	node->branch = branch;
 	node->nr_loop_iter = nr_loop_iter;
 	node->samples = samples;
+	node->srcline = srcline;
 
 	if (flags)
 		memcpy(&node->branch_flags, flags,
@@ -1061,12 +1057,7 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 	int printed;
 
 	if (cl->ms.sym) {
-		if (show_srcline && cl->ms.map && !cl->srcline)
-			cl->srcline = get_srcline(cl->ms.map->dso,
-						  map__rip_2objdump(cl->ms.map,
-								    cl->ip),
-						  cl->ms.sym, false, show_addr);
-		if (cl->srcline)
+		if (show_srcline && cl->srcline)
 			printed = scnprintf(bf, bfsize, "%s %s",
 					cl->ms.sym->name, cl->srcline);
 		else
@@ -1454,7 +1445,8 @@ int callchain_cursor__copy(struct callchain_cursor *dst,
 
 		rc = callchain_cursor_append(dst, node->ip, node->map, node->sym,
 					     node->branch, &node->branch_flags,
-					     node->nr_loop_iter, node->samples);
+					     node->nr_loop_iter, node->samples,
+					     node->srcline);
 		if (rc)
 			break;
 
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index c56c23dbbf72..16f51a20df6a 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -119,7 +119,7 @@ struct callchain_list {
 	u64			cycles_count;
 	u64			iter_count;
 	u64			samples_count;
-	char		       *srcline;
+	const char		*srcline;
 	struct list_head	list;
 };
 
@@ -133,6 +133,7 @@ struct callchain_cursor_node {
 	u64				ip;
 	struct map			*map;
 	struct symbol			*sym;
+	const char			*srcline;
 	bool				branch;
 	struct branch_flags		branch_flags;
 	int				nr_loop_iter;
@@ -198,7 +199,7 @@ static inline void callchain_cursor_reset(struct callchain_cursor *cursor)
 int callchain_cursor_append(struct callchain_cursor *cursor, u64 ip,
 			    struct map *map, struct symbol *sym,
 			    bool branch, struct branch_flags *flags,
-			    int nr_loop_iter, int samples);
+			    int nr_loop_iter, int samples, const char *srcline);
 
 /* Close a cursor writing session. Initialize for the reader */
 static inline void callchain_cursor_commit(struct callchain_cursor *cursor)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 64bfa8b43706..df2f1e618360 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1679,6 +1679,15 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
 	return mi;
 }
 
+static char *callchain_srcline(struct map *map, struct symbol *sym, u64 ip)
+{
+	if (!map || callchain_param.key == CCKEY_FUNCTION)
+		return NULL;
+
+	return get_srcline(map->dso, map__rip_2objdump(map, ip),
+			   sym, false, callchain_param.key == CCKEY_ADDRESS);
+}
+
 static int add_callchain_ip(struct thread *thread,
 			    struct callchain_cursor *cursor,
 			    struct symbol **parent,
@@ -1741,7 +1750,9 @@ static int add_callchain_ip(struct thread *thread,
 	if (symbol_conf.hide_unresolved && al.sym == NULL)
 		return 0;
 	return callchain_cursor_append(cursor, al.addr, al.map, al.sym,
-				       branch, flags, nr_loop_iter, samples);
+				       branch, flags, nr_loop_iter, samples,
+				       callchain_srcline(al.map, al.sym,
+							 al.addr));
 }
 
 struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
@@ -2055,7 +2066,7 @@ static int append_inlines(struct callchain_cursor *cursor,
 	list_for_each_entry(ilist, &inline_node->val, list) {
 		int ret = callchain_cursor_append(cursor, ip, map,
 						  ilist->symbol, false,
-						  NULL, 0, 0);
+						  NULL, 0, 0, ilist->srcline);
 
 		if (ret != 0)
 			return ret;
@@ -2076,7 +2087,9 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 
 	return callchain_cursor_append(cursor, entry->ip,
 				       entry->map, entry->sym,
-				       false, NULL, 0, 0);
+				       false, NULL, 0, 0,
+				       callchain_srcline(entry->map, entry->sym,
+							 entry->ip));
 }
 
 static int thread__resolve_callchain_unwind(struct thread *thread,
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 25ceed8eb790..a1fdf035d1dd 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -33,7 +33,7 @@ static const char *dso__name(struct dso *dso)
 	return dso_name;
 }
 
-static int inline_list__append(struct symbol *symbol, char *filename, int line_nr,
+static int inline_list__append(struct symbol *symbol, char *srcline,
 			       struct inline_node *node)
 {
 	struct inline_list *ilist;
@@ -43,8 +43,7 @@ static int inline_list__append(struct symbol *symbol, char *filename, int line_n
 		return -1;
 
 	ilist->symbol = symbol;
-	ilist->filename = filename;
-	ilist->line_nr = line_nr;
+	ilist->srcline = srcline;
 
 	if (callchain_param.order == ORDER_CALLEE)
 		list_add_tail(&ilist->list, &node->val);
@@ -54,6 +53,30 @@ static int inline_list__append(struct symbol *symbol, char *filename, int line_n
 	return 0;
 }
 
+// basename version that takes a const input string
+static const char *gnu_basename(const char *path)
+{
+	const char *base = strrchr(path, '/');
+
+	return base ? base + 1 : path;
+}
+
+static char *srcline_from_fileline(const char *file, unsigned int line)
+{
+	char *srcline;
+
+	if (!file)
+		return NULL;
+
+	if (!srcline_full_filename)
+		file = gnu_basename(file);
+
+	if (asprintf(&srcline, "%s:%u", file, line) < 0)
+		return NULL;
+
+	return srcline;
+}
+
 #ifdef HAVE_LIBBFD_SUPPORT
 
 /*
@@ -229,10 +252,13 @@ static int inline_list__append_dso_a2l(struct dso *dso,
 				       struct symbol *sym)
 {
 	struct a2l_data *a2l = dso->a2l;
-	char *filename = a2l->filename ? strdup(a2l->filename) : NULL;
 	struct symbol *inline_sym = new_inline_sym(dso, sym, a2l->funcname);
+	char *srcline = NULL;
+
+	if (a2l->filename)
+		srcline = srcline_from_fileline(a2l->filename, a2l->line);
 
-	return inline_list__append(inline_sym, filename, a2l->line, node);
+	return inline_list__append(inline_sym, srcline, node);
 }
 
 static int addr2line(const char *dso_name, u64 addr,
@@ -426,12 +452,15 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	node->addr = addr;
 
 	while (getline(&filename, &len, fp) != -1) {
+		char *srcline;
+
 		if (filename_split(filename, &line_nr) != 1) {
 			free(filename);
 			goto out;
 		}
 
-		if (inline_list__append(sym, filename, line_nr, node) != 0)
+		srcline = srcline_from_fileline(filename, line_nr);
+		if (inline_list__append(sym, srcline, node) != 0)
 			goto out;
 
 		filename = NULL;
@@ -475,16 +504,14 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
 		       unwind_inlines, NULL, sym))
 		goto out;
 
-	if (asprintf(&srcline, "%s:%u",
-				srcline_full_filename ? file : basename(file),
-				line) < 0) {
-		free(file);
+	srcline = srcline_from_fileline(file, line);
+	free(file);
+
+	if (!srcline)
 		goto out;
-	}
 
 	dso->a2l_fails = 0;
 
-	free(file);
 	return srcline;
 
 out:
@@ -536,7 +563,7 @@ void inline_node__delete(struct inline_node *node)
 
 	list_for_each_entry_safe(ilist, tmp, &node->val, list) {
 		list_del_init(&ilist->list);
-		zfree(&ilist->filename);
+		zfree(&ilist->srcline);
 		// only the inlined symbols are owned by the list
 		if (ilist->symbol && ilist->symbol->inlined)
 			symbol__delete(ilist->symbol);
diff --git a/tools/perf/util/srcline.h b/tools/perf/util/srcline.h
index ad9726987b80..0d2aca92e8c7 100644
--- a/tools/perf/util/srcline.h
+++ b/tools/perf/util/srcline.h
@@ -19,8 +19,7 @@ void free_srcline(char *srcline);
 
 struct inline_list {
 	struct symbol		*symbol;
-	char			*filename;
-	unsigned int		line_nr;
+	char			*srcline;
 	struct list_head	list;
 };
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 5/7] perf report: fall-back to function name comparison for -g srcline
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
                   ` (3 preceding siblings ...)
  2017-05-18 19:34 ` [PATCH 4/7] perf report: use srcline from " Milian Wolff
@ 2017-05-18 19:34 ` Milian Wolff
  2017-05-18 19:34 ` [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix Milian Wolff
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 19:34 UTC (permalink / raw)
  To: Linux-kernel
  Cc: linux-perf-users, Milian Wolff, Arnaldo Carvalho de Melo,
	David Ahern, Namhyung Kim, Peter Zijlstra, Yao Jin

When a callchain entry has no srcline available, we ended up
comparing the instruction pointer. I consider this to be not
too useful. Rather, I think we should group the entries by
function name, which this patch adds. For people who want to
split the data on the IP boundary, using `-g address` is the
correct choice.

Before:

~~~~~
   100.00%    38.86%  [.] main
            |
            |--61.14%--main inlining.cpp:14
            |          std::norm<double> complex:664
            |          std::_Norm_helper<true>::_S_do_it<double> complex:654
            |          std::abs<double> complex:597
            |          std::__complex_abs complex:589
            |          |
            |          |--56.03%--hypot
            |          |          |
            |          |          |--8.45%--__hypot_finite
            |          |          |
            |          |          |--7.62%--__hypot_finite
            |          |          |
            |          |          |--2.29%--__hypot_finite
            |          |          |
            |          |          |--2.24%--__hypot_finite
            |          |          |
            |          |          |--2.06%--__hypot_finite
            |          |          |
            |          |          |--1.81%--__hypot_finite
...
~~~~~

After:

~~~~~
   100.00%    38.86%  [.] main
            |
            |--61.14%--main inlining.cpp:14
            |          std::norm<double> complex:664
            |          std::_Norm_helper<true>::_S_do_it<double> complex:654
            |          std::abs<double> complex:597
            |          std::__complex_abs complex:589
            |          |
            |          |--60.29%--hypot
            |          |          |
            |          |           --56.03%--__hypot_finite
            |          |
            |           --0.85%--cabs
~~~~~

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
---
 tools/perf/util/callchain.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 3a349530fee1..211ed3713fac 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -619,11 +619,9 @@ enum match_result {
 	MATCH_GT,
 };
 
-static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
-					     struct callchain_list *cnode)
+static enum match_result match_chain_strings(const char *left,
+					     const char *right)
 {
-	const char *left = cnode->srcline;
-	const char *right = node->srcline;
 	enum match_result ret = MATCH_EQ;
 	int cmp;
 
@@ -633,10 +631,8 @@ static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
 		cmp = 1;
 	else if (left && !right)
 		cmp = -1;
-	else if (cnode->ip == node->ip)
-		cmp = 0;
 	else
-		cmp = (cnode->ip < node->ip) ? -1 : 1;
+		return MATCH_ERROR;
 
 	if (cmp != 0)
 		ret = cmp < 0 ? MATCH_LT : MATCH_GT;
@@ -651,10 +647,18 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
 	u64 left, right;
 
 	if (callchain_param.key == CCKEY_SRCLINE) {
-		enum match_result match = match_chain_srcline(node, cnode);
+		enum match_result match = match_chain_strings(cnode->srcline,
+							      node->srcline);
+
+		// if no srcline is available, fallback to symbol name
+		if (match == MATCH_ERROR && cnode->ms.sym && node->sym)
+			match = match_chain_strings(cnode->ms.sym->name,
+						    node->sym->name);
 
 		if (match != MATCH_ERROR)
 			return match;
+
+		// otherwise fall-back to IP-based comparison below
 	}
 
 	if (cnode->ms.sym && sym && callchain_param.key == CCKEY_FUNCTION) {
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
                   ` (4 preceding siblings ...)
  2017-05-18 19:34 ` [PATCH 5/7] perf report: fall-back to function name comparison for -g srcline Milian Wolff
@ 2017-05-18 19:34 ` Milian Wolff
  2017-05-22 12:48   ` Namhyung Kim
  2017-05-18 19:34 ` [PATCH 7/7] perf script: mark inlined frames and do not print DSO for them Milian Wolff
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 19:34 UTC (permalink / raw)
  To: Linux-kernel
  Cc: linux-perf-users, Milian Wolff, Arnaldo Carvalho de Melo,
	David Ahern, Namhyung Kim, Peter Zijlstra, Yao Jin

The original patch that introduced inline frame output in the
various browsers used this suffix already. The new centralized
approach that uses fake symbols for inlined frames was missing
this approach so far.

Instead of changing the symbol name itself, we only print the
suffix where needed. This allows us to efficiently lookup
the symbol for a given name without first having to append the
suffix before the lookup.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
---
 tools/perf/util/callchain.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 211ed3713fac..4350c7ceaca1 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1061,11 +1061,15 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 	int printed;
 
 	if (cl->ms.sym) {
+		const char *inlined = cl->ms.sym->inlined ? " (inlined)" : "";
+
 		if (show_srcline && cl->srcline)
-			printed = scnprintf(bf, bfsize, "%s %s",
-					cl->ms.sym->name, cl->srcline);
+			printed = scnprintf(bf, bfsize, "%s %s%s",
+					    cl->ms.sym->name, cl->srcline,
+					    inlined);
 		else
-			printed = scnprintf(bf, bfsize, "%s", cl->ms.sym->name);
+			printed = scnprintf(bf, bfsize, "%s%s",
+					    cl->ms.sym->name, inlined);
 	} else
 		printed = scnprintf(bf, bfsize, "%#" PRIx64, cl->ip);
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 7/7] perf script: mark inlined frames and do not print DSO for them
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
                   ` (5 preceding siblings ...)
  2017-05-18 19:34 ` [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix Milian Wolff
@ 2017-05-18 19:34 ` Milian Wolff
  2017-05-22 12:11   ` Namhyung Kim
  2017-05-18 20:05 ` [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
  2017-05-22 12:09 ` Namhyung Kim
  8 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 19:34 UTC (permalink / raw)
  To: Linux-kernel
  Cc: linux-perf-users, Milian Wolff, Arnaldo Carvalho de Melo,
	David Ahern, Namhyung Kim, Peter Zijlstra, Yao Jin

Instead of showing the (repeated) DSO name of the non-inlined
frame, we now show the "(inlined)" suffix instead.

Before:
                   214f7 __hypot_finite (/usr/lib/libm-2.25.so)
                    ace3 hypot (/usr/lib/libm-2.25.so)
                         std::__complex_abs (/home/milian/projects/src/perf-tests/inlining)
                         std::abs<double> (/home/milian/projects/src/perf-tests/inlining)
                         std::_Norm_helper<true>::_S_do_it<double> (/home/milian/projects/src/perf-tests/inlining)
                         std::norm<double> (/home/milian/projects/src/perf-tests/inlining)
                     a4a main (/home/milian/projects/src/perf-tests/inlining)
                   20510 __libc_start_main (/usr/lib/libc-2.25.so)
                     bd9 _start (/home/milian/projects/src/perf-tests/inlining)

After:
                   214f7 __hypot_finite (/usr/lib/libm-2.25.so)
                    ace3 hypot (/usr/lib/libm-2.25.so)
                         std::__complex_abs (inlined)
                         std::abs<double> (inlined)
                         std::_Norm_helper<true>::_S_do_it<double> (inlined)
                         std::norm<double> (inlined)
                     a4a main (/home/milian/projects/src/perf-tests/inlining)
                   20510 __libc_start_main (/usr/lib/libc-2.25.so)
                     bd9 _start (/home/milian/projects/src/perf-tests/inlining)

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
---
 tools/perf/util/evsel_fprintf.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index f2c6c5ee11e8..5b9e89257aa7 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -157,7 +157,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 				}
 			}
 
-			if (print_dso) {
+			if (print_dso && (!node->sym || !node->sym->inlined)) {
 				printed += fprintf(fp, " (");
 				printed += map__fprintf_dsoname(node->map, fp);
 				printed += fprintf(fp, ")");
@@ -166,6 +166,9 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 			if (print_srcline)
 				printed += map__fprintf_srcline(node->map, addr, "\n  ", fp);
 
+			if (node->sym && node->sym->inlined)
+				printed += fprintf(fp, " (inlined)");
+
 			if (!print_oneline)
 				printed += fprintf(fp, "\n");
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
                   ` (6 preceding siblings ...)
  2017-05-18 19:34 ` [PATCH 7/7] perf script: mark inlined frames and do not print DSO for them Milian Wolff
@ 2017-05-18 20:05 ` Milian Wolff
  2017-05-22  9:06   ` Namhyung Kim
  2017-05-22 12:09 ` Namhyung Kim
  8 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-05-18 20:05 UTC (permalink / raw)
  To: Linux-kernel; +Cc: linux-perf-users

[-- Attachment #1: Type: text/plain, Size: 1860 bytes --]

On Donnerstag, 18. Mai 2017 21:34:04 CEST Milian Wolff wrote:
> This series of patches completely reworks the way inline frames are handled.
> Instead of querying for the inline nodes on-demand in the individual tools,
> we now create proper callchain nodes for inlined frames. The advantages
> this approach brings are numerous:
> 
> - less duplicated code in the individual browser
> - aggregated cost for inlined frames for the --children top-down list
> - various bug fixes that arose from querying for a srcline/symbol based on
>   the IP of a sample, which will always point to the last inlined frame
>   instead of the corresponding non-inlined frame
> - overall much better support for visualizing cost for heavily-inlined C++
>   code, which simply was confusing and unreliably before
> - srcline honors the global setting as to whether full paths or basenames
>   should be shown
> 
> For comparison, below lists the output before and after for `perf script`
> and `perf report`. The example file I used to generate the perf data is:

And of course shortly after sending this patch series I notice the first 
issues ;-) The new behavior shows confusing results for `-g function` because 
match_chain uses sym->start. I fixed this locally to compare the actual 
function name if either of the two symbols is an inlined fake symbol:

https://github.com/milianw/linux/commit/
a1fa4486c19976f36d6ee7df263676e888c2bdb9

Before resending this whole series, it would be great if people could review 
the stuff in here such that I know whether I'm on the right track.

You can checkout my clone from github for quick testing:

https://github.com/milianw/linux/commits/wip/distinguish-inliners

Thanks

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5903 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
  2017-05-18 20:05 ` [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
@ 2017-05-22  9:06   ` Namhyung Kim
  2017-05-24 11:46     ` Milian Wolff
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-05-22  9:06 UTC (permalink / raw)
  To: Milian Wolff; +Cc: Linux-kernel, linux-perf-users, kernel-team

Hi Milian,

On Thu, May 18, 2017 at 10:05:36PM +0200, Milian Wolff wrote:
> On Donnerstag, 18. Mai 2017 21:34:04 CEST Milian Wolff wrote:
> > This series of patches completely reworks the way inline frames are handled.
> > Instead of querying for the inline nodes on-demand in the individual tools,
> > we now create proper callchain nodes for inlined frames. The advantages
> > this approach brings are numerous:
> > 
> > - less duplicated code in the individual browser
> > - aggregated cost for inlined frames for the --children top-down list
> > - various bug fixes that arose from querying for a srcline/symbol based on
> >   the IP of a sample, which will always point to the last inlined frame
> >   instead of the corresponding non-inlined frame
> > - overall much better support for visualizing cost for heavily-inlined C++
> >   code, which simply was confusing and unreliably before
> > - srcline honors the global setting as to whether full paths or basenames
> >   should be shown
> > 
> > For comparison, below lists the output before and after for `perf script`
> > and `perf report`. The example file I used to generate the perf data is:
> 
> And of course shortly after sending this patch series I notice the first 
> issues ;-) The new behavior shows confusing results for `-g function` because 
> match_chain uses sym->start. I fixed this locally to compare the actual 
> function name if either of the two symbols is an inlined fake symbol:

Why not making the fake symbol has start addr of the sample IP and
length of 1.  The histogram sort code also compares the sym->start
which might confuse the output of the children mode too IMHO.

Thanks,
Namhyung


> 
> https://github.com/milianw/linux/commit/
> a1fa4486c19976f36d6ee7df263676e888c2bdb9
> 
> Before resending this whole series, it would be great if people could review 
> the stuff in here such that I know whether I'm on the right track.
> 
> You can checkout my clone from github for quick testing:
> 
> https://github.com/milianw/linux/commits/wip/distinguish-inliners
> 
> Thanks
> 
> -- 
> Milian Wolff | milian.wolff@kdab.com | Software Engineer
> KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
> Tel: +49-30-521325470
> KDAB - The Qt Experts

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
  2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
                   ` (7 preceding siblings ...)
  2017-05-18 20:05 ` [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
@ 2017-05-22 12:09 ` Namhyung Kim
  8 siblings, 0 replies; 26+ messages in thread
From: Namhyung Kim @ 2017-05-22 12:09 UTC (permalink / raw)
  To: Milian Wolff; +Cc: Linux-kernel, linux-perf-users, kernel-team

On Thu, May 18, 2017 at 09:34:04PM +0200, Milian Wolff wrote:
> This series of patches completely reworks the way inline frames are handled.
> Instead of querying for the inline nodes on-demand in the individual tools,
> we now create proper callchain nodes for inlined frames. The advantages this
> approach brings are numerous:
> 
> - less duplicated code in the individual browser
> - aggregated cost for inlined frames for the --children top-down list
> - various bug fixes that arose from querying for a srcline/symbol based on
>   the IP of a sample, which will always point to the last inlined frame
>   instead of the corresponding non-inlined frame
> - overall much better support for visualizing cost for heavily-inlined C++
>   code, which simply was confusing and unreliably before
> - srcline honors the global setting as to whether full paths or basenames
>   should be shown
> 
> For comparison, below lists the output before and after for `perf script`
> and `perf report`. The example file I used to generate the perf data is:
> 
> ~~~~~
> $ cat inlining.cpp
> #include <complex>
> #include <cmath>
> #include <random>
> #include <iostream>
> 
> using namespace std;
> 
> int main()
> {
>     uniform_real_distribution<double> uniform(-1E5, 1E5);
>     default_random_engine engine;
>     double s = 0;
>     for (int i = 0; i < 10000000; ++i) {
>         s += norm(complex<double>(uniform(engine), uniform(engine)));
>     }
>     cout << s << '\n';
>     return 0;
> }
> $ g++ -O2 -g -o inlining inlining.cpp
> $ perf record --call-graph dwarf ./inlining
> ~~~~~
> 
> Now, the (broken) status-quo looks like this. Look for "NOTE:" to see some
> of my comments that outline the various issues I'm trying to solve by this
> patch series.
> 
> ~~~~~
> $ perf script --inline
> ...
> inlining 11083 97459.356656:      33680 cycles:
>                    214f7 __hypot_finite (/usr/lib/libm-2.25.so)
>                     ace3 hypot (/usr/lib/libm-2.25.so)
>                      a4a main (/home/milian/projects/src/perf-tests/inlining)
>                          std::__complex_abs
>                          std::abs<double>
>                          std::_Norm_helper<true>::_S_do_it<double>
>                          std::norm<double>
>                          main
>                    20510 __libc_start_main (/usr/lib/libc-2.25.so)
>                      bd9 _start (/home/milian/projects/src/perf-tests/inlining)
> # NOTE: the above inlined stack is confusing: the a4a is an address into main,
> #       which is the non-inlined symbol. the entry with the address should be
> #       at the end of the stack, where it's actually duplicated once more but
> #       there it's missing the address

Omitting address was to distinguish inlined entries from others.  I
didn't think it's a problem by itself (missed inter-operability below),
but duplicated entries should be removed.

> ...
> $ perf report -s sym -g srcline -i perf.inlining.data --inline --stdio
> ...
>              --38.86%--_start
>                        __libc_start_main
>                        |
>                        |--15.68%--main random.tcc:3326
>                        |          /home/milian/projects/src/perf-tests/inlining.cpp:14 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.tcc:3326 (inline)
>                        |
>                        |--10.36%--main random.h:143
>                        |          /home/milian/projects/src/perf-tests/inlining.cpp:14 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.tcc:3332 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:332 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:151 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:143 (inline)
>                        |
>                        |--5.66%--main random.tcc:3332
>                        |          /home/milian/projects/src/perf-tests/inlining.cpp:14 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
>                        |          /usr/include/c++/6.3.1/bits/random.tcc:3332 (inline)
> ...
> # NOTE: the grouping is totally off because the first and last frame of the
>         inline nodes is completely bogus, since the IP is used to find the sym/srcline
>         which is different from the actual inlined sym/srcline.
>         also, the code currently displays either the inlined function name or
>         the corresponding filename (but in full length, instead of just the basename).

Yes, inlined nodes should have correct IP for later use.


> 
> $ perf report -s sym -g srcline -i perf.inlining.data --inline --stdio --no-children
> ...
>     38.86%  [.] main
>             |
>             |--15.68%--main random.tcc:3326
>             |          /usr/include/c++/6.3.1/bits/random.tcc:3326 (inline)
>             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
>             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
>             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
>             |          /home/milian/projects/src/perf-tests/inlining.cpp:14 (inline)
>             |          __libc_start_main
>             |          _start
> ...
> # NOTE: the srcline for main is wrong, it should be inlining.cpp:14,
>         i.e. what is displayed in the line below (see also perf script issue above)

OK.


> ~~~~~
> 
> Afterwards, all of the above issues are resolved:
> 
> ~~~~~
> $ perf script --inline
> ...
> inlining 11083 97459.356656:      33680 cycles:
>                    214f7 __hypot_finite (/usr/lib/libm-2.25.so)
>                     ace3 hypot (/usr/lib/libm-2.25.so)
>                      a4a std::__complex_abs (inlined)
>                      a4a std::abs<double> (inlined)
>                      a4a std::_Norm_helper<true>::_S_do_it<double> (inlined)
>                      a4a std::norm<double> (inlined)
>                      a4a main (/home/milian/projects/src/perf-tests/inlining)
>                    20510 __libc_start_main (/usr/lib/libc-2.25.so)
>                      bd9 _start (/home/milian/projects/src/perf-tests/inlining)
> ...
> # NOTE: only one main entry, at the correct position.
>         we do display the (repeated) instruction pointer as that ensures
>         interoperability with e.g. the stackcollapse-perf.pl script

Looks good.


> 
> $ perf report -s sym -g srcline -i perf.inlining.data --inline --stdio
> ...
>    100.00%    38.86%  [.] main
>             |
>             |--61.14%--main inlining.cpp:14
>             |          std::norm<double> complex:664 (inlined)
>             |          std::_Norm_helper<true>::_S_do_it<double> complex:654 (inlined)
>             |          std::abs<double> complex:597 (inlined)
>             |          std::__complex_abs complex:589 (inlined)
>             |          |
>             |          |--60.29%--hypot
>             |          |          |
>             |          |           --56.03%--__hypot_finite
>             |          |
>             |           --0.85%--cabs
>             |
>              --38.86%--_start
>                        __libc_start_main
>                        |
>                        |--38.19%--main inlining.cpp:14
>                        |          |
>                        |          |--35.59%--std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1809 (inlined)
>                        |          |          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1818 (inlined)
>                        |          |          |
>                        |          |           --34.37%--std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() random.h:185 (inlined)
>                        |          |                     |
>                        |          |                     |--17.91%--std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.tcc:3332 (inlined)
>                        |          |                     |          |
>                        |          |                     |           --12.24%--std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>::operator() random.h:332 (inlined)
>                        |          |                     |                     std::__detail::__mod<unsigned long, 2147483647ul, 16807ul, 0ul> random.h:151 (inlined)
>                        |          |                     |                     |
>                        |          |                     |                     |--10.36%--std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul, 0ul, true, true>::__calc random.h:143 (inlined)
>                        |          |                     |                     |
>                        |          |                     |                      --1.88%--std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul, 0ul, true, true>::__calc random.h:141 (inlined)
>                        |          |                     |
>                        |          |                     |--15.68%--std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.tcc:3326 (inlined)
>                        |          |                     |
>                        |          |                      --0.79%--std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.tcc:3335 (inlined)
>                        |          |
>                        |           --1.99%--std::norm<double> complex:664 (inlined)
>                        |                     std::_Norm_helper<true>::_S_do_it<double> complex:654 (inlined)
>                        |                     std::abs<double> complex:597 (inlined)
>                        |                     std::__complex_abs complex:589 (inlined)
>                        |
>                         --0.67%--main inlining.cpp:13
> ...
> 
> # NOTE: still somewhat confusing due to the _start and __libc_start_main frames
>         that actually are *above* the main frame. But at least the stuff below
>         properly splits up and shows that mutiple functions got inlined into
>         inlining.cpp:14, not just one as before.

Right.  Current callchain mode has the problem.  It shows a mix of
callchains of a sample and children of the sample.  I proposed a patch
[1] to handle it but I failed to find time to update it as Jiri
suggested.

[1] https://lkml.org/lkml/2014/8/14/49


> 
> $ perf report -s sym -g srcline -i perf.inlining.data --inline --stdio --no-children
> ...
>     38.86%  [.] main
>             |
>             |--15.68%--std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.tcc:3326 (inlined)
>             |          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() random.h:185 (inlined)
>             |          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1818 (inlined)
>             |          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1809 (inlined)
>             |          main inlining.cpp:14
>             |          __libc_start_main
>             |          _start
> ...
> # NOTE: the first and last entry of the inline stack have the correct symbol and srcline now
>         both function and srcline is shown, as well as the (inlined) suffix
>         only the basename of the srcline is shown

Nice work!

But I found some problems in both of your patch and current code
during testing (mostly for the children mode).  I'll send a fix soon.

Thanks,
Namhyung


> ~~~~~
> 
> Milian Wolff (7):
>   perf report: remove code to handle inline frames from browsers
>   perf util: take elf_name as const string in dso__demangle_sym
>   perf report: create real callchain entries for inlined frames
>   perf report: use srcline from inlined frames
>   perf report: fall-back to function name comparison for -g srcline
>   perf report: mark inlined frames in output by " (inlined)" suffix
>   perf script: mark inlined frames and do not print DSO for them
> 
>  tools/perf/ui/browsers/hists.c   | 183 +++------------------------------------
>  tools/perf/ui/stdio/hist.c       |  80 +----------------
>  tools/perf/util/callchain.c      |  52 +++++------
>  tools/perf/util/callchain.h      |   5 +-
>  tools/perf/util/dso.c            |   2 +
>  tools/perf/util/dso.h            |   1 +
>  tools/perf/util/evsel_fprintf.c  |  37 +-------
>  tools/perf/util/hist.c           |   5 --
>  tools/perf/util/machine.c        |  54 +++++++++++-
>  tools/perf/util/sort.h           |   1 -
>  tools/perf/util/srcline.c        | 183 ++++++++++++++++++++++++++++++---------
>  tools/perf/util/srcline.h        |  19 +++-
>  tools/perf/util/symbol-elf.c     |   2 +-
>  tools/perf/util/symbol-minimal.c |   2 +-
>  tools/perf/util/symbol.h         |   3 +-
>  15 files changed, 264 insertions(+), 365 deletions(-)
> 
> -- 
> 2.13.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-perf-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/7] perf script: mark inlined frames and do not print DSO for them
  2017-05-18 19:34 ` [PATCH 7/7] perf script: mark inlined frames and do not print DSO for them Milian Wolff
@ 2017-05-22 12:11   ` Namhyung Kim
  2017-05-24 11:40     ` Milian Wolff
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-05-22 12:11 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	David Ahern, Peter Zijlstra, Yao Jin, kernel-team

On Thu, May 18, 2017 at 09:34:11PM +0200, Milian Wolff wrote:
> Instead of showing the (repeated) DSO name of the non-inlined
> frame, we now show the "(inlined)" suffix instead.
> 
> Before:
>                    214f7 __hypot_finite (/usr/lib/libm-2.25.so)
>                     ace3 hypot (/usr/lib/libm-2.25.so)
>                          std::__complex_abs (/home/milian/projects/src/perf-tests/inlining)
>                          std::abs<double> (/home/milian/projects/src/perf-tests/inlining)
>                          std::_Norm_helper<true>::_S_do_it<double> (/home/milian/projects/src/perf-tests/inlining)
>                          std::norm<double> (/home/milian/projects/src/perf-tests/inlining)
>                      a4a main (/home/milian/projects/src/perf-tests/inlining)
>                    20510 __libc_start_main (/usr/lib/libc-2.25.so)
>                      bd9 _start (/home/milian/projects/src/perf-tests/inlining)
> 
> After:
>                    214f7 __hypot_finite (/usr/lib/libm-2.25.so)
>                     ace3 hypot (/usr/lib/libm-2.25.so)
>                          std::__complex_abs (inlined)
>                          std::abs<double> (inlined)
>                          std::_Norm_helper<true>::_S_do_it<double> (inlined)
>                          std::norm<double> (inlined)

Shouldn't they have 'a4a' too?

Thanks,
Namhyung


>                      a4a main (/home/milian/projects/src/perf-tests/inlining)
>                    20510 __libc_start_main (/usr/lib/libc-2.25.so)
>                      bd9 _start (/home/milian/projects/src/perf-tests/inlining)
> 
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Yao Jin <yao.jin@linux.intel.com>
> Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
> ---
>  tools/perf/util/evsel_fprintf.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
> index f2c6c5ee11e8..5b9e89257aa7 100644
> --- a/tools/perf/util/evsel_fprintf.c
> +++ b/tools/perf/util/evsel_fprintf.c
> @@ -157,7 +157,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
>  				}
>  			}
>  
> -			if (print_dso) {
> +			if (print_dso && (!node->sym || !node->sym->inlined)) {
>  				printed += fprintf(fp, " (");
>  				printed += map__fprintf_dsoname(node->map, fp);
>  				printed += fprintf(fp, ")");
> @@ -166,6 +166,9 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
>  			if (print_srcline)
>  				printed += map__fprintf_srcline(node->map, addr, "\n  ", fp);
>  
> +			if (node->sym && node->sym->inlined)
> +				printed += fprintf(fp, " (inlined)");
> +
>  			if (!print_oneline)
>  				printed += fprintf(fp, "\n");
>  
> -- 
> 2.13.0
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/7] perf report: create real callchain entries for inlined frames
  2017-05-18 19:34 ` [PATCH 3/7] perf report: create real callchain entries for inlined frames Milian Wolff
@ 2017-05-22 12:19   ` Namhyung Kim
  2017-05-24 11:41     ` Milian Wolff
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-05-22 12:19 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	David Ahern, Peter Zijlstra, Yao Jin, kernel-team

On Thu, May 18, 2017 at 09:34:07PM +0200, Milian Wolff wrote:
> +
> +	if (strcmp(funcname, base_sym->name) == 0) {
> +		// reuse the real, existing symbol

I don't know whether it's required by coding style guide but please
use C-style block comment if possible (especially for multiline comments).


> +		inline_sym = base_sym;
> +	} else {
> +		// create a fake symbol for the inline frame
> +		inline_sym = symbol__new(base_sym ? base_sym->start : 0,
> +					 base_sym ? base_sym->end : 0,
> +					 base_sym ? base_sym->binding : 0,
> +					 funcname);
> +		if (inline_sym)
> +			inline_sym->inlined = 1;
> +	}
> +
> +	free(demangled);
> +
> +	return inline_sym;
> +}
> +
>  static int inline_list__append_dso_a2l(struct dso *dso,
> -				       struct inline_node *node)
> +				       struct inline_node *node,
> +				       struct symbol *sym)
>  {
>  	struct a2l_data *a2l = dso->a2l;
> -	char *funcname = a2l->funcname ? strdup(a2l->funcname) : NULL;
>  	char *filename = a2l->filename ? strdup(a2l->filename) : NULL;
> +	struct symbol *inline_sym = new_inline_sym(dso, sym, a2l->funcname);
>  
> -	return inline_list__append(filename, funcname, a2l->line, node, dso);
> +	return inline_list__append(inline_sym, filename, a2l->line, node);
>  }
>  
>  static int addr2line(const char *dso_name, u64 addr,
>  		     char **file, unsigned int *line, struct dso *dso,
> -		     bool unwind_inlines, struct inline_node *node)
> +		     bool unwind_inlines, struct inline_node *node,
> +		     struct symbol *sym)
>  {
>  	int ret = 0;
>  	struct a2l_data *a2l = dso->a2l;
> @@ -241,7 +264,7 @@ static int addr2line(const char *dso_name, u64 addr,
>  	if (unwind_inlines) {
>  		int cnt = 0;
>  
> -		if (node && inline_list__append_dso_a2l(dso, node))
> +		if (node && inline_list__append_dso_a2l(dso, node, sym))
>  			return 0;
>  
>  		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
> @@ -249,7 +272,7 @@ static int addr2line(const char *dso_name, u64 addr,
>  		       cnt++ < MAX_INLINE_NEST) {
>  
>  			if (node != NULL) {
> -				if (inline_list__append_dso_a2l(dso, node))
> +				if (inline_list__append_dso_a2l(dso, node, sym))
>  					return 0;
>  				// found at least one inline frame
>  				ret = 1;
> @@ -281,7 +304,7 @@ void dso__free_a2l(struct dso *dso)
>  }
>  
>  static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
> -	struct dso *dso)
> +	struct dso *dso, struct symbol *sym)
>  {
>  	struct inline_node *node;
>  
> @@ -294,7 +317,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
>  	INIT_LIST_HEAD(&node->val);
>  	node->addr = addr;
>  
> -	if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node))
> +	if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node, sym))
>  		goto out_free_inline_node;
>  
>  	if (list_empty(&node->val))
> @@ -334,7 +357,8 @@ static int addr2line(const char *dso_name, u64 addr,
>  		     char **file, unsigned int *line_nr,
>  		     struct dso *dso __maybe_unused,
>  		     bool unwind_inlines __maybe_unused,
> -		     struct inline_node *node __maybe_unused)
> +		     struct inline_node *node __maybe_unused,
> +		     struct symbol *sym __maybe_unused)
>  {
>  	FILE *fp;
>  	char cmd[PATH_MAX];
> @@ -374,7 +398,7 @@ void dso__free_a2l(struct dso *dso __maybe_unused)
>  }
>  
>  static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
> -	struct dso *dso __maybe_unused)
> +	struct dso *dso __maybe_unused, struct symbol *sym)
>  {
>  	FILE *fp;
>  	char cmd[PATH_MAX];
> @@ -407,8 +431,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
>  			goto out;
>  		}
>  
> -		if (inline_list__append(filename, NULL, line_nr, node,
> -					NULL) != 0)
> +		if (inline_list__append(sym, filename, line_nr, node) != 0)
>  			goto out;
>  
>  		filename = NULL;
> @@ -448,7 +471,8 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
>  	if (dso_name == NULL)
>  		goto out;
>  
> -	if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines, NULL))
> +	if (!addr2line(dso_name, addr, &file, &line, dso,
> +		       unwind_inlines, NULL, sym))
>  		goto out;
>  
>  	if (asprintf(&srcline, "%s:%u",
> @@ -494,7 +518,8 @@ char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
>  	return __get_srcline(dso, addr, sym, show_sym, show_addr, false);
>  }
>  
> -struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
> +struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr,
> +					    struct symbol *sym)
>  {
>  	const char *dso_name;
>  
> @@ -502,7 +527,7 @@ struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
>  	if (dso_name == NULL)
>  		return NULL;
>  
> -	return addr2inlines(dso_name, addr, dso);
> +	return addr2inlines(dso_name, addr, dso, sym);
>  }
>  
>  void inline_node__delete(struct inline_node *node)
> @@ -512,9 +537,62 @@ void inline_node__delete(struct inline_node *node)
>  	list_for_each_entry_safe(ilist, tmp, &node->val, list) {
>  		list_del_init(&ilist->list);
>  		zfree(&ilist->filename);
> -		zfree(&ilist->funcname);
> +		// only the inlined symbols are owned by the list
> +		if (ilist->symbol && ilist->symbol->inlined)
> +			symbol__delete(ilist->symbol);
>  		free(ilist);
>  	}
>  
>  	free(node);
>  }
> +
> +void inlines__tree_insert(struct rb_root *tree, struct inline_node *inlines)
> +{
> +	struct rb_node **p = &tree->rb_node;
> +	struct rb_node *parent = NULL;
> +	const u64 addr = inlines->addr;
> +	struct inline_node *i;
> +
> +	while (*p != NULL) {
> +		parent = *p;
> +		i = rb_entry(parent, struct inline_node, rb_node);
> +		if (addr < i->addr)
> +			p = &(*p)->rb_left;
> +		else
> +			p = &(*p)->rb_right;
> +	}
> +	rb_link_node(&inlines->rb_node, parent, p);
> +	rb_insert_color(&inlines->rb_node, tree);
> +}
> +
> +struct inline_node *inlines__tree_find(struct rb_root *tree, u64 addr)
> +{
> +	struct rb_node *n = tree->rb_node;
> +
> +	while (n) {
> +		struct inline_node *i = rb_entry(n, struct inline_node,
> +						 rb_node);
> +
> +		if (addr < i->addr)
> +			n = n->rb_left;
> +		else if (addr > i->addr)
> +			n = n->rb_right;
> +		else
> +			return i;
> +	}
> +
> +	return NULL;
> +}
> +
> +void inlines__tree_delete(struct rb_root *tree)
> +{
> +	struct inline_node *pos;
> +	struct rb_node *next = rb_first(tree);
> +
> +	while (next) {
> +		pos = rb_entry(next, struct inline_node, rb_node);
> +		next = rb_next(&pos->rb_node);
> +		rb_erase(&pos->rb_node, tree);
> +		inline_node__delete(pos);
> +	}
> +}
> diff --git a/tools/perf/util/srcline.h b/tools/perf/util/srcline.h
> index 7b52ba88676e..ad9726987b80 100644
> --- a/tools/perf/util/srcline.h
> +++ b/tools/perf/util/srcline.h
> @@ -2,6 +2,7 @@
>  #define PERF_SRCLINE_H
>  
>  #include <linux/list.h>
> +#include <linux/rbtree.h>
>  #include <linux/types.h>
>  
>  struct dso;
> @@ -17,8 +18,8 @@ void free_srcline(char *srcline);
>  #define SRCLINE_UNKNOWN  ((char *) "??:0")
>  
>  struct inline_list {
> +	struct symbol		*symbol;
>  	char			*filename;
> -	char			*funcname;
>  	unsigned int		line_nr;
>  	struct list_head	list;
>  };
> @@ -26,9 +27,20 @@ struct inline_list {
>  struct inline_node {
>  	u64			addr;
>  	struct list_head	val;
> +	struct rb_node		rb_node;
>  };
>  
> -struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr);
> +// parse inlined frames for the given address
> +struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr,
> +					    struct symbol *sym);
> +// free resources associated to the inline node list
>  void inline_node__delete(struct inline_node *node);
>  
> +// insert the inline node list into the DSO, which will take ownership
> +void inlines__tree_insert(struct rb_root *tree, struct inline_node *inlines);
> +// find previously inserted inline node list
> +struct inline_node *inlines__tree_find(struct rb_root *tree, u64 addr);
> +// delete all nodes within the tree of inline_node s
> +void inlines__tree_delete(struct rb_root *tree);
> +
>  #endif /* PERF_SRCLINE_H */
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index f0b08810d7fa..b358570ce615 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -59,6 +59,7 @@ struct symbol {
>  	u8		binding;
>  	u8		idle:1;
>  	u8		ignore:1;
> +	u8		inlined:1;
>  	u8		arch_sym;
>  	char		name[0];
>  };
> -- 
> 2.13.0
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix
  2017-05-18 19:34 ` [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix Milian Wolff
@ 2017-05-22 12:48   ` Namhyung Kim
  2017-05-24 11:46     ` Milian Wolff
  2017-06-03 13:51     ` Milian Wolff
  0 siblings, 2 replies; 26+ messages in thread
From: Namhyung Kim @ 2017-05-22 12:48 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	David Ahern, Peter Zijlstra, Yao Jin, kernel-team

On Thu, May 18, 2017 at 09:34:10PM +0200, Milian Wolff wrote:
> The original patch that introduced inline frame output in the
> various browsers used this suffix already. The new centralized
> approach that uses fake symbols for inlined frames was missing
> this approach so far.
> 
> Instead of changing the symbol name itself, we only print the
> suffix where needed. This allows us to efficiently lookup
> the symbol for a given name without first having to append the
> suffix before the lookup.

You also need to do same thing for hist_entry__sym_snprintf().

Thanks,
Namhyung


> 
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Yao Jin <yao.jin@linux.intel.com>
> Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
> ---
>  tools/perf/util/callchain.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index 211ed3713fac..4350c7ceaca1 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -1061,11 +1061,15 @@ char *callchain_list__sym_name(struct callchain_list *cl,
>  	int printed;
>  
>  	if (cl->ms.sym) {
> +		const char *inlined = cl->ms.sym->inlined ? " (inlined)" : "";
> +
>  		if (show_srcline && cl->srcline)
> -			printed = scnprintf(bf, bfsize, "%s %s",
> -					cl->ms.sym->name, cl->srcline);
> +			printed = scnprintf(bf, bfsize, "%s %s%s",
> +					    cl->ms.sym->name, cl->srcline,
> +					    inlined);
>  		else
> -			printed = scnprintf(bf, bfsize, "%s", cl->ms.sym->name);
> +			printed = scnprintf(bf, bfsize, "%s%s",
> +					    cl->ms.sym->name, inlined);
>  	} else
>  		printed = scnprintf(bf, bfsize, "%#" PRIx64, cl->ip);
>  
> -- 
> 2.13.0
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/7] perf script: mark inlined frames and do not print DSO for them
  2017-05-22 12:11   ` Namhyung Kim
@ 2017-05-24 11:40     ` Milian Wolff
  0 siblings, 0 replies; 26+ messages in thread
From: Milian Wolff @ 2017-05-24 11:40 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	David Ahern, Peter Zijlstra, Yao Jin, kernel-team

[-- Attachment #1: Type: text/plain, Size: 2151 bytes --]

On Monday, May 22, 2017 2:11:58 PM CEST Namhyung Kim wrote:
> On Thu, May 18, 2017 at 09:34:11PM +0200, Milian Wolff wrote:
> > Instead of showing the (repeated) DSO name of the non-inlined
> > frame, we now show the "(inlined)" suffix instead.
> > 
> > Before:
> >                    214f7 __hypot_finite (/usr/lib/libm-2.25.so)
> >                    
> >                     ace3 hypot (/usr/lib/libm-2.25.so)
> >                     
> >                          std::__complex_abs
> >                          (/home/milian/projects/src/perf-tests/inlining)
> >                          std::abs<double>
> >                          (/home/milian/projects/src/perf-tests/inlining)
> >                          std::_Norm_helper<true>::_S_do_it<double>
> >                          (/home/milian/projects/src/perf-tests/inlining)
> >                          std::norm<double>
> >                          (/home/milian/projects/src/perf-tests/inlining)
> >                      
> >                      a4a main
> >                      (/home/milian/projects/src/perf-tests/inlining)
> >                    
> >                    20510 __libc_start_main (/usr/lib/libc-2.25.so)
> >                    
> >                      bd9 _start
> >                      (/home/milian/projects/src/perf-tests/inlining)
> > 
> > After:
> >                    214f7 __hypot_finite (/usr/lib/libm-2.25.so)
> >                    
> >                     ace3 hypot (/usr/lib/libm-2.25.so)
> >                     
> >                          std::__complex_abs (inlined)
> >                          std::abs<double> (inlined)
> >                          std::_Norm_helper<true>::_S_do_it<double>
> >                          (inlined)
> >                          std::norm<double> (inlined)
> 
> Shouldn't they have 'a4a' too?

Yes, I think I forgot to update the commit message after I changed the 
behavior to keep compatibility with stackcollapse-perf.pl.

Will update the message.

Thanks

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/7] perf report: create real callchain entries for inlined frames
  2017-05-22 12:19   ` Namhyung Kim
@ 2017-05-24 11:41     ` Milian Wolff
  0 siblings, 0 replies; 26+ messages in thread
From: Milian Wolff @ 2017-05-24 11:41 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	David Ahern, Peter Zijlstra, Yao Jin, kernel-team

[-- Attachment #1: Type: text/plain, Size: 676 bytes --]

On Monday, May 22, 2017 2:19:46 PM CEST Namhyung Kim wrote:
> On Thu, May 18, 2017 at 09:34:07PM +0200, Milian Wolff wrote:
> > +
> > +	if (strcmp(funcname, base_sym->name) == 0) {
> > +		// reuse the real, existing symbol
> 
> I don't know whether it's required by coding style guide but please
> use C-style block comment if possible (especially for multiline comments).

OK. Since the patch check script didn't complain I thought it was OK to use 
this style here.

I'll update the patch series accordingly.

Thanks

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
  2017-05-22  9:06   ` Namhyung Kim
@ 2017-05-24 11:46     ` Milian Wolff
  2017-05-24 13:42       ` Milian Wolff
  0 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-05-24 11:46 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Linux-kernel, linux-perf-users, kernel-team

[-- Attachment #1: Type: text/plain, Size: 2928 bytes --]

On Monday, May 22, 2017 11:06:43 AM CEST Namhyung Kim wrote:
> Hi Milian,
> 
> On Thu, May 18, 2017 at 10:05:36PM +0200, Milian Wolff wrote:
> > On Donnerstag, 18. Mai 2017 21:34:04 CEST Milian Wolff wrote:
> > > This series of patches completely reworks the way inline frames are
> > > handled. Instead of querying for the inline nodes on-demand in the
> > > individual tools, we now create proper callchain nodes for inlined
> > > frames. The advantages this approach brings are numerous:
> > > 
> > > - less duplicated code in the individual browser
> > > - aggregated cost for inlined frames for the --children top-down list
> > > - various bug fixes that arose from querying for a srcline/symbol based
> > > on
> > > 
> > >   the IP of a sample, which will always point to the last inlined frame
> > >   instead of the corresponding non-inlined frame
> > > 
> > > - overall much better support for visualizing cost for heavily-inlined
> > > C++
> > > 
> > >   code, which simply was confusing and unreliably before
> > > 
> > > - srcline honors the global setting as to whether full paths or
> > > basenames
> > > 
> > >   should be shown
> > > 
> > > For comparison, below lists the output before and after for `perf
> > > script`
> > 
> > > and `perf report`. The example file I used to generate the perf data is:
> > And of course shortly after sending this patch series I notice the first
> > issues ;-) The new behavior shows confusing results for `-g function`
> > because match_chain uses sym->start. I fixed this locally to compare the
> > actual
> > function name if either of the two symbols is an inlined fake symbol:
>
> Why not making the fake symbol has start addr of the sample IP and
> length of 1.  The histogram sort code also compares the sym->start
> which might confuse the output of the children mode too IMHO.

I can try that out, thank you for the suggestion. But I think it can easily 
break in different ways. I.e. when the same inline function gets used at 
different IPs, it should actually be considered to be the same function when 
we group/merge/aggregate. I updated the `match_chain` function accordingly, to 
do a symname / srcline comparison on inlined frames, instead of relying on the 
symbol start/end. I think using the IP for the fake symbols won't be more 
reliable here, don't you think?

In the end, I think we'll always have to special-case inlined fake symbols 
when we aggregate data, since the sym start/end is always going to be some 
arbitrary value that may or may not be what we want it to be. Doing the 
explicit comparison on e.g. srcline/symname is always going to be the most 
reliable option, as it also directly results in a proper aggregation based on 
the strings that the user will see in the end.

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix
  2017-05-22 12:48   ` Namhyung Kim
@ 2017-05-24 11:46     ` Milian Wolff
  2017-06-03 13:51     ` Milian Wolff
  1 sibling, 0 replies; 26+ messages in thread
From: Milian Wolff @ 2017-05-24 11:46 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	David Ahern, Peter Zijlstra, Yao Jin, kernel-team

[-- Attachment #1: Type: text/plain, Size: 826 bytes --]

On Monday, May 22, 2017 2:48:18 PM CEST Namhyung Kim wrote:
> On Thu, May 18, 2017 at 09:34:10PM +0200, Milian Wolff wrote:
> > The original patch that introduced inline frame output in the
> > various browsers used this suffix already. The new centralized
> > approach that uses fake symbols for inlined frames was missing
> > this approach so far.
> > 
> > Instead of changing the symbol name itself, we only print the
> > suffix where needed. This allows us to efficiently lookup
> > the symbol for a given name without first having to append the
> > suffix before the lookup.
> 
> You also need to do same thing for hist_entry__sym_snprintf().

Thank you, will do!

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
  2017-05-24 11:46     ` Milian Wolff
@ 2017-05-24 13:42       ` Milian Wolff
  2017-05-24 15:02         ` Namhyung Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-05-24 13:42 UTC (permalink / raw)
  To: Milian Wolff; +Cc: Namhyung Kim, Linux-kernel, linux-perf-users, kernel-team

[-- Attachment #1: Type: text/plain, Size: 4304 bytes --]

On Wednesday, May 24, 2017 1:46:04 PM CEST Milian Wolff wrote:
> On Monday, May 22, 2017 11:06:43 AM CEST Namhyung Kim wrote:
> > Hi Milian,
> > 
> > On Thu, May 18, 2017 at 10:05:36PM +0200, Milian Wolff wrote:
> > > On Donnerstag, 18. Mai 2017 21:34:04 CEST Milian Wolff wrote:
> > > > This series of patches completely reworks the way inline frames are
> > > > handled. Instead of querying for the inline nodes on-demand in the
> > > > individual tools, we now create proper callchain nodes for inlined
> > > > frames. The advantages this approach brings are numerous:
> > > > 
> > > > - less duplicated code in the individual browser
> > > > - aggregated cost for inlined frames for the --children top-down list
> > > > - various bug fixes that arose from querying for a srcline/symbol
> > > > based
> > > > on
> > > > 
> > > >   the IP of a sample, which will always point to the last inlined
> > > >   frame
> > > >   instead of the corresponding non-inlined frame
> > > > 
> > > > - overall much better support for visualizing cost for heavily-inlined
> > > > C++
> > > > 
> > > >   code, which simply was confusing and unreliably before
> > > > 
> > > > - srcline honors the global setting as to whether full paths or
> > > > basenames
> > > > 
> > > >   should be shown
> > > > 
> > > > For comparison, below lists the output before and after for `perf
> > > > script`
> > > 
> > > > and `perf report`. The example file I used to generate the perf data 
is:
> > > And of course shortly after sending this patch series I notice the first
> > > issues ;-) The new behavior shows confusing results for `-g function`
> > > because match_chain uses sym->start. I fixed this locally to compare the
> > > actual
> > 
> > > function name if either of the two symbols is an inlined fake symbol:
> > Why not making the fake symbol has start addr of the sample IP and
> > length of 1.  The histogram sort code also compares the sym->start
> > which might confuse the output of the children mode too IMHO.
> 
> I can try that out, thank you for the suggestion. But I think it can easily
> break in different ways. I.e. when the same inline function gets used at
> different IPs, it should actually be considered to be the same function when
> we group/merge/aggregate. I updated the `match_chain` function accordingly,
> to do a symname / srcline comparison on inlined frames, instead of relying
> on the symbol start/end. I think using the IP for the fake symbols won't be
> more reliable here, don't you think?
> 
> In the end, I think we'll always have to special-case inlined fake symbols
> when we aggregate data, since the sym start/end is always going to be some
> arbitrary value that may or may not be what we want it to be. Doing the
> explicit comparison on e.g. srcline/symname is always going to be the most
> reliable option, as it also directly results in a proper aggregation based
> on the strings that the user will see in the end.

I haven't yet tried it out, but I think I can come up with a way to break your 
approach easily. Assume the following pseudo-code:

void tail()
{
    instr1; // IP1
    instr2; // IP2
}

void mid()
{
    tail();
}

void main()
{
    mid();
}

Now, assume both `tail` and `mid` get inlined into `main`. If we get one 
sample each for both IP1 and IP2, we want the following merged structure if we 
merge based on symbol:

sym  | incl | self
main | 2    | 0
mid  | 2    | 0
tail | 2    | 2

If we would give the inlined fake-symbols a start of the IP, i.e. either IP1 
or IP2, then we would end up with this (unexpected) behavior instead:

sym  | incl | self
main | 2    | 0
mid  | 1    | 0
mid  | 1    | 0
tail | 1    | 1
tail | 1    | 1

The reason is that the fake symbols for the inlined frames would be considered 
to be different functions since their start/end are not equal. This is "wrong" 
in my eyes - we really have to do symbol name comparisons for inlined frames, 
and also include srcline if that is desired.

If you think the above is not a valid assessment, I'll try to change my patch 
series to use the IP + 1 trick you suggest. But I really don't think it's 
going to work.

Cheers
-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
  2017-05-24 13:42       ` Milian Wolff
@ 2017-05-24 15:02         ` Namhyung Kim
  2017-05-29 18:36           ` Milian Wolff
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-05-24 15:02 UTC (permalink / raw)
  To: Milian Wolff; +Cc: Linux-kernel, linux-perf-users, kernel-team

On Wed, May 24, 2017 at 03:42:59PM +0200, Milian Wolff wrote:
> On Wednesday, May 24, 2017 1:46:04 PM CEST Milian Wolff wrote:
> > On Monday, May 22, 2017 11:06:43 AM CEST Namhyung Kim wrote:
> > > Why not making the fake symbol has start addr of the sample IP and
> > > length of 1.  The histogram sort code also compares the sym->start
> > > which might confuse the output of the children mode too IMHO.
> > 
> > I can try that out, thank you for the suggestion. But I think it can easily
> > break in different ways. I.e. when the same inline function gets used at
> > different IPs, it should actually be considered to be the same function when
> > we group/merge/aggregate. I updated the `match_chain` function accordingly,
> > to do a symname / srcline comparison on inlined frames, instead of relying
> > on the symbol start/end. I think using the IP for the fake symbols won't be
> > more reliable here, don't you think?
> > 
> > In the end, I think we'll always have to special-case inlined fake symbols
> > when we aggregate data, since the sym start/end is always going to be some
> > arbitrary value that may or may not be what we want it to be. Doing the
> > explicit comparison on e.g. srcline/symname is always going to be the most
> > reliable option, as it also directly results in a proper aggregation based
> > on the strings that the user will see in the end.
> 
> I haven't yet tried it out, but I think I can come up with a way to break your 
> approach easily. Assume the following pseudo-code:
> 
> void tail()
> {
>     instr1; // IP1
>     instr2; // IP2
> }
> 
> void mid()
> {
>     tail();
> }
> 
> void main()
> {
>     mid();
> }
> 
> Now, assume both `tail` and `mid` get inlined into `main`. If we get one 
> sample each for both IP1 and IP2, we want the following merged structure if we 
> merge based on symbol:
> 
> sym  | incl | self
> main | 2    | 0
> mid  | 2    | 0
> tail | 2    | 2
> 
> If we would give the inlined fake-symbols a start of the IP, i.e. either IP1 
> or IP2, then we would end up with this (unexpected) behavior instead:
> 
> sym  | incl | self
> main | 2    | 0
> mid  | 1    | 0
> mid  | 1    | 0
> tail | 1    | 1
> tail | 1    | 1
> 
> The reason is that the fake symbols for the inlined frames would be considered 
> to be different functions since their start/end are not equal. This is "wrong" 
> in my eyes - we really have to do symbol name comparisons for inlined frames, 
> and also include srcline if that is desired.

That would depend on how we treat inlined function instances.  Each
instance might be considered as same or not - but I think it'd be
better treating them as same for simplicity.

Also currently perf aggregates samples using symbol name, but a new
sort key might be added to use symbol address later.  Thus it'd be
better to be prepared for such change.

> 
> If you think the above is not a valid assessment, I'll try to change my patch 
> series to use the IP + 1 trick you suggest. But I really don't think it's 
> going to work.

So I agree that we should do symbol name comparison, but I still
prefer setting fake symbol address to [IP, +1].  That would reduce
memory space for annotate as well.

Thanks,
Namhyung


> 
> Cheers
> -- 
> Milian Wolff | milian.wolff@kdab.com | Software Engineer
> KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
> Tel: +49-30-521325470
> KDAB - The Qt Experts

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
  2017-05-24 15:02         ` Namhyung Kim
@ 2017-05-29 18:36           ` Milian Wolff
  2017-05-30  1:33             ` Namhyung Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-05-29 18:36 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Linux-kernel, linux-perf-users, kernel-team

[-- Attachment #1: Type: text/plain, Size: 3962 bytes --]

On Mittwoch, 24. Mai 2017 17:02:37 CEST Namhyung Kim wrote:
> On Wed, May 24, 2017 at 03:42:59PM +0200, Milian Wolff wrote:
> > On Wednesday, May 24, 2017 1:46:04 PM CEST Milian Wolff wrote:
> > > On Monday, May 22, 2017 11:06:43 AM CEST Namhyung Kim wrote:
> > > > Why not making the fake symbol has start addr of the sample IP and
> > > > length of 1.  The histogram sort code also compares the sym->start
> > > > which might confuse the output of the children mode too IMHO.
> > > 
> > > I can try that out, thank you for the suggestion. But I think it can
> > > easily
> > > break in different ways. I.e. when the same inline function gets used at
> > > different IPs, it should actually be considered to be the same function
> > > when we group/merge/aggregate. I updated the `match_chain` function
> > > accordingly, to do a symname / srcline comparison on inlined frames,
> > > instead of relying on the symbol start/end. I think using the IP for
> > > the fake symbols won't be more reliable here, don't you think?
> > > 
> > > In the end, I think we'll always have to special-case inlined fake
> > > symbols
> > > when we aggregate data, since the sym start/end is always going to be
> > > some
> > > arbitrary value that may or may not be what we want it to be. Doing the
> > > explicit comparison on e.g. srcline/symname is always going to be the
> > > most
> > > reliable option, as it also directly results in a proper aggregation
> > > based
> > > on the strings that the user will see in the end.
> > 
> > I haven't yet tried it out, but I think I can come up with a way to break
> > your approach easily. Assume the following pseudo-code:
> > 
> > void tail()
> > {
> > 
> >     instr1; // IP1
> >     instr2; // IP2
> > 
> > }
> > 
> > void mid()
> > {
> > 
> >     tail();
> > 
> > }
> > 
> > void main()
> > {
> > 
> >     mid();
> > 
> > }
> > 
> > Now, assume both `tail` and `mid` get inlined into `main`. If we get one
> > sample each for both IP1 and IP2, we want the following merged structure
> > if we merge based on symbol:
> > 
> > sym  | incl | self
> > main | 2    | 0
> > mid  | 2    | 0
> > tail | 2    | 2
> > 
> > If we would give the inlined fake-symbols a start of the IP, i.e. either
> > IP1 or IP2, then we would end up with this (unexpected) behavior instead:
> > 
> > sym  | incl | self
> > main | 2    | 0
> > mid  | 1    | 0
> > mid  | 1    | 0
> > tail | 1    | 1
> > tail | 1    | 1
> > 
> > The reason is that the fake symbols for the inlined frames would be
> > considered to be different functions since their start/end are not equal.
> > This is "wrong" in my eyes - we really have to do symbol name comparisons
> > for inlined frames, and also include srcline if that is desired.
> 
> That would depend on how we treat inlined function instances.  Each
> instance might be considered as same or not - but I think it'd be
> better treating them as same for simplicity.
> 
> Also currently perf aggregates samples using symbol name, but a new
> sort key might be added to use symbol address later.  Thus it'd be
> better to be prepared for such change.
> 
> > If you think the above is not a valid assessment, I'll try to change my
> > patch series to use the IP + 1 trick you suggest. But I really don't
> > think it's going to work.
> 
> So I agree that we should do symbol name comparison, but I still
> prefer setting fake symbol address to [IP, +1].  That would reduce
> memory space for annotate as well.

Can you expand on this? I can implement this, but without having a way to test 
I don't know whether I'm doing it right or not ;-)

Note that fake symbols cannot be annotated from `perf report` currently. The 
browser just does nothing - no error, nothing. So I'm really unsure how this 
would influence the "memory space for annotate".

Thanks

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5903 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
  2017-05-29 18:36           ` Milian Wolff
@ 2017-05-30  1:33             ` Namhyung Kim
  0 siblings, 0 replies; 26+ messages in thread
From: Namhyung Kim @ 2017-05-30  1:33 UTC (permalink / raw)
  To: Milian Wolff; +Cc: Linux-kernel, linux-perf-users, kernel-team

On Mon, May 29, 2017 at 08:36:25PM +0200, Milian Wolff wrote:
> On Mittwoch, 24. Mai 2017 17:02:37 CEST Namhyung Kim wrote:
> > On Wed, May 24, 2017 at 03:42:59PM +0200, Milian Wolff wrote:
> > > If you think the above is not a valid assessment, I'll try to change my
> > > patch series to use the IP + 1 trick you suggest. But I really don't
> > > think it's going to work.
> > 
> > So I agree that we should do symbol name comparison, but I still
> > prefer setting fake symbol address to [IP, +1].  That would reduce
> > memory space for annotate as well.
> 
> Can you expand on this? I can implement this, but without having a way to test 
> I don't know whether I'm doing it right or not ;-)
> 
> Note that fake symbols cannot be annotated from `perf report` currently. The 
> browser just does nothing - no error, nothing. So I'm really unsure how this 
> would influence the "memory space for annotate".

Hmm.. nevermind.  I missed that it cannot be annotated.  And if it
aggregates symbols by name, start address might not be necessary.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix
  2017-05-22 12:48   ` Namhyung Kim
  2017-05-24 11:46     ` Milian Wolff
@ 2017-06-03 13:51     ` Milian Wolff
  2017-06-06  1:33       ` Namhyung Kim
  1 sibling, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-06-03 13:51 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	David Ahern, Peter Zijlstra, Yao Jin, kernel-team

On Montag, 22. Mai 2017 14:48:18 CEST Namhyung Kim wrote:
> On Thu, May 18, 2017 at 09:34:10PM +0200, Milian Wolff wrote:
> > The original patch that introduced inline frame output in the
> > various browsers used this suffix already. The new centralized
> > approach that uses fake symbols for inlined frames was missing
> > this approach so far.
> > 
> > Instead of changing the symbol name itself, we only print the
> > suffix where needed. This allows us to efficiently lookup
> > the symbol for a given name without first having to append the
> > suffix before the lookup.
> 
> You also need to do same thing for hist_entry__sym_snprintf().

Hey Namhyung,

I'm working on the next iteration of this patch series, the WIP can be found 
here: https://github.com/milianw/linux/tree/wip/distinguish-inliners.

I just stumbled upon another issue:

~~~~~
perf report --inline -s srcline -g srcline  --stdio -g none
    61.22%     0.00%  main+378
    61.22%     0.00%  std::_Norm_helper<true>::_S_do_it<double>+378
    61.22%     0.00%  std::__complex_abs+378
    61.22%     0.00%  std::abs<double>+378
    61.22%     0.00%  std::norm<double>+378
~~~~~

The problem here is that the srcline in the hist is detached from what we 
actually produce in the callchain nodes. We will have to somehow hand through 
the srcline from the callchain node to the hist entry, but this seems to be a 
super large invasive change - which is why I need your input on how to resolve 
this.

The current API seems to pass the data around mostly using the addr_location 
struct, which is usually constructed on the stack and not always memset to 
zero. As such, my initial plan of adding a srcline member there would require 
me to go through all the code to ensure that we memset the struct to zero...

Alternatively, I'd have to change the API of hist_iter_ops, to let the 
callback take another `const char **srcline` out parameter. This is also going 
to be quite a large invasive change.

Do you have any suggestions on how to make this work?

Thanks

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix
  2017-06-03 13:51     ` Milian Wolff
@ 2017-06-06  1:33       ` Namhyung Kim
  2017-06-06  7:26         ` Milian Wolff
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-06-06  1:33 UTC (permalink / raw)
  To: Milian Wolff
  Cc: linux-kernel@vger.kernel.org, linux-perf-users,
	Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra, Yao Jin,
	kernel-team

On Sat, Jun 3, 2017 at 10:51 PM, Milian Wolff <milian.wolff@kdab.com> wrote:
> On Montag, 22. Mai 2017 14:48:18 CEST Namhyung Kim wrote:
>> On Thu, May 18, 2017 at 09:34:10PM +0200, Milian Wolff wrote:
>> > The original patch that introduced inline frame output in the
>> > various browsers used this suffix already. The new centralized
>> > approach that uses fake symbols for inlined frames was missing
>> > this approach so far.
>> >
>> > Instead of changing the symbol name itself, we only print the
>> > suffix where needed. This allows us to efficiently lookup
>> > the symbol for a given name without first having to append the
>> > suffix before the lookup.
>>
>> You also need to do same thing for hist_entry__sym_snprintf().
>
> Hey Namhyung,
>
> I'm working on the next iteration of this patch series, the WIP can be found
> here: https://github.com/milianw/linux/tree/wip/distinguish-inliners.
>
> I just stumbled upon another issue:
>
> ~~~~~
> perf report --inline -s srcline -g srcline  --stdio -g none
>     61.22%     0.00%  main+378
>     61.22%     0.00%  std::_Norm_helper<true>::_S_do_it<double>+378
>     61.22%     0.00%  std::__complex_abs+378
>     61.22%     0.00%  std::abs<double>+378
>     61.22%     0.00%  std::norm<double>+378
> ~~~~~
>
> The problem here is that the srcline in the hist is detached from what we
> actually produce in the callchain nodes. We will have to somehow hand through
> the srcline from the callchain node to the hist entry, but this seems to be a
> super large invasive change - which is why I need your input on how to resolve
> this.
>
> The current API seems to pass the data around mostly using the addr_location
> struct, which is usually constructed on the stack and not always memset to
> zero. As such, my initial plan of adding a srcline member there would require
> me to go through all the code to ensure that we memset the struct to zero...
>
> Alternatively, I'd have to change the API of hist_iter_ops, to let the
> callback take another `const char **srcline` out parameter. This is also going
> to be quite a large invasive change.
>
> Do you have any suggestions on how to make this work?

I think passing srcline via addr_location might not be very invasive
since it calls machine__resolve() in most cases.  Missing places that
set al->sym should set al->srcline as well IMHO.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix
  2017-06-06  1:33       ` Namhyung Kim
@ 2017-06-06  7:26         ` Milian Wolff
  2017-06-06 19:52           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-06-06  7:26 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel@vger.kernel.org, linux-perf-users,
	Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra, Yao Jin,
	kernel-team

[-- Attachment #1: Type: text/plain, Size: 2750 bytes --]

On Tuesday, June 6, 2017 3:33:49 AM CEST Namhyung Kim wrote:
> On Sat, Jun 3, 2017 at 10:51 PM, Milian Wolff <milian.wolff@kdab.com> wrote:
> > On Montag, 22. Mai 2017 14:48:18 CEST Namhyung Kim wrote:
> >> On Thu, May 18, 2017 at 09:34:10PM +0200, Milian Wolff wrote:
> >> > The original patch that introduced inline frame output in the
> >> > various browsers used this suffix already. The new centralized
> >> > approach that uses fake symbols for inlined frames was missing
> >> > this approach so far.
> >> > 
> >> > Instead of changing the symbol name itself, we only print the
> >> > suffix where needed. This allows us to efficiently lookup
> >> > the symbol for a given name without first having to append the
> >> > suffix before the lookup.
> >> 
> >> You also need to do same thing for hist_entry__sym_snprintf().
> > 
> > Hey Namhyung,
> > 
> > I'm working on the next iteration of this patch series, the WIP can be
> > found here:
> > https://github.com/milianw/linux/tree/wip/distinguish-inliners.
> > 
> > I just stumbled upon another issue:
> > 
> > ~~~~~
> > perf report --inline -s srcline -g srcline  --stdio -g none
> > 
> >     61.22%     0.00%  main+378
> >     61.22%     0.00%  std::_Norm_helper<true>::_S_do_it<double>+378
> >     61.22%     0.00%  std::__complex_abs+378
> >     61.22%     0.00%  std::abs<double>+378
> >     61.22%     0.00%  std::norm<double>+378
> > 
> > ~~~~~
> > 
> > The problem here is that the srcline in the hist is detached from what we
> > actually produce in the callchain nodes. We will have to somehow hand
> > through the srcline from the callchain node to the hist entry, but this
> > seems to be a super large invasive change - which is why I need your
> > input on how to resolve this.
> > 
> > The current API seems to pass the data around mostly using the
> > addr_location struct, which is usually constructed on the stack and not
> > always memset to zero. As such, my initial plan of adding a srcline
> > member there would require me to go through all the code to ensure that
> > we memset the struct to zero...
> > 
> > Alternatively, I'd have to change the API of hist_iter_ops, to let the
> > callback take another `const char **srcline` out parameter. This is also
> > going to be quite a large invasive change.
> > 
> > Do you have any suggestions on how to make this work?
> 
> I think passing srcline via addr_location might not be very invasive
> since it calls machine__resolve() in most cases.  Missing places that
> set al->sym should set al->srcline as well IMHO.

OK, perfect - I'll implement that then.

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix
  2017-06-06  7:26         ` Milian Wolff
@ 2017-06-06 19:52           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-06 19:52 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Namhyung Kim, linux-kernel@vger.kernel.org, linux-perf-users,
	Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra, Yao Jin,
	kernel-team

Em Tue, Jun 06, 2017 at 09:26:47AM +0200, Milian Wolff escreveu:
> On Tuesday, June 6, 2017 3:33:49 AM CEST Namhyung Kim wrote:
> > On Sat, Jun 3, 2017 at 10:51 PM, Milian Wolff <milian.wolff@kdab.com> wrote:
> > > The current API seems to pass the data around mostly using the
> > > addr_location struct, which is usually constructed on the stack and not
> > > always memset to zero. As such, my initial plan of adding a srcline
> > > member there would require me to go through all the code to ensure that
> > > we memset the struct to zero...

> > > Alternatively, I'd have to change the API of hist_iter_ops, to let the
> > > callback take another `const char **srcline` out parameter. This is also
> > > going to be quite a large invasive change.

> > > Do you have any suggestions on how to make this work?

> > I think passing srcline via addr_location might not be very invasive
> > since it calls machine__resolve() in most cases.  Missing places that
> > set al->sym should set al->srcline as well IMHO.

I haven't looked if it would be invasive or not, good that it isn't, but
then I think addr_location is the right place for this to be stored.
 
> OK, perfect - I'll implement that then.

Thanks,

- Arnaldo

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2017-06-06 19:53 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
2017-05-18 19:34 ` [PATCH 1/7] perf report: remove code to handle inline frames from browsers Milian Wolff
2017-05-18 19:34 ` [PATCH 2/7] perf util: take elf_name as const string in dso__demangle_sym Milian Wolff
2017-05-18 19:34 ` [PATCH 3/7] perf report: create real callchain entries for inlined frames Milian Wolff
2017-05-22 12:19   ` Namhyung Kim
2017-05-24 11:41     ` Milian Wolff
2017-05-18 19:34 ` [PATCH 4/7] perf report: use srcline from " Milian Wolff
2017-05-18 19:34 ` [PATCH 5/7] perf report: fall-back to function name comparison for -g srcline Milian Wolff
2017-05-18 19:34 ` [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix Milian Wolff
2017-05-22 12:48   ` Namhyung Kim
2017-05-24 11:46     ` Milian Wolff
2017-06-03 13:51     ` Milian Wolff
2017-06-06  1:33       ` Namhyung Kim
2017-06-06  7:26         ` Milian Wolff
2017-06-06 19:52           ` Arnaldo Carvalho de Melo
2017-05-18 19:34 ` [PATCH 7/7] perf script: mark inlined frames and do not print DSO for them Milian Wolff
2017-05-22 12:11   ` Namhyung Kim
2017-05-24 11:40     ` Milian Wolff
2017-05-18 20:05 ` [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
2017-05-22  9:06   ` Namhyung Kim
2017-05-24 11:46     ` Milian Wolff
2017-05-24 13:42       ` Milian Wolff
2017-05-24 15:02         ` Namhyung Kim
2017-05-29 18:36           ` Milian Wolff
2017-05-30  1:33             ` Namhyung Kim
2017-05-22 12:09 ` Namhyung Kim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.