linux-kernel.vger.kernel.org archive mirror
* perf: Implement lbr-as-callgraph v2
@ 2014-01-11 19:42 Andi Kleen
  2014-01-11 19:42 ` [PATCH 1/9] perf, tools: fix BFD detection on opensuse Andi Kleen
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter, linux-kernel

This patchkit implements lbr-as-callgraphs in perf report,
as an alternative way to present LBR information.

Current perf report does a histogram over the branch edges,
which is useful to look at basic blocks, but doesn't tell
you anything about the larger control flow.

This patchkit adds a new option --branch-history that
adds the branch paths to the callgraph history instead.

This makes it possible to reason about individual branch paths
leading to specific samples.

Updates to v1:
- rebased on perf/core
- fix various issues
- rename the option to --branch-history
- various fixes to display the information more concisely

Example output:

    % perf record -b -g ./tsrc/tcall
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
    % perf report --branch-history
    ...
        54.91%  tcall.c:6  [.] f2                      tcall
                |
                |--66.53%-- f2 tcall.c:5
                |          |
                |          |--70.83%-- f1 tcall.c:11
                |          |          f1 tcall.c:10
                |          |          main tcall.c:18
                |          |          main tcall.c:18
                |          |          main tcall.c:17
                |          |          main tcall.c:17
                |          |          f1 tcall.c:13
                |          |          f1 tcall.c:13
                |          |          f2 tcall.c:7
                |          |          f2 tcall.c:5
                |          |          f1 tcall.c:12
                |          |          f1 tcall.c:12
                |          |          f2 tcall.c:7
                |          |          f2 tcall.c:5
                |          |          f1 tcall.c:11



* [PATCH 1/9] perf, tools: fix BFD detection on opensuse
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-12 15:16   ` Jiri Olsa
                     ` (2 more replies)
  2014-01-11 19:42 ` [PATCH 2/9] perf, tools: Support handling complete branch stacks as histograms Andi Kleen
                   ` (8 subsequent siblings)
  9 siblings, 3 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

opensuse libbfd requires -lz -liberty to build. Add those
to the BFD feature detection.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/config/Makefile                | 2 +-
 tools/perf/config/feature-checks/Makefile | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 01dd43d..d86d33c 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -478,7 +478,7 @@ else
 endif
 
 ifeq ($(feature-libbfd), 1)
-  EXTLIBS += -lbfd
+  EXTLIBS += -lbfd -lz -liberty
 endif
 
 ifdef NO_DEMANGLE
diff --git a/tools/perf/config/feature-checks/Makefile b/tools/perf/config/feature-checks/Makefile
index 7cf6fcd..a430e4f 100644
--- a/tools/perf/config/feature-checks/Makefile
+++ b/tools/perf/config/feature-checks/Makefile
@@ -120,7 +120,7 @@ test-libpython-version.bin:
 	$(BUILD) $(FLAGS_PYTHON_EMBED)
 
 test-libbfd.bin:
-	$(BUILD) -DPACKAGE='"perf"' -lbfd -ldl
+	$(BUILD) -DPACKAGE='"perf"' -lbfd -lz -liberty -ldl
 
 test-liberty.bin:
 	$(CC) -o $(OUTPUT)$@ test-libbfd.c -DPACKAGE='"perf"' -lbfd -ldl -liberty
-- 
1.8.3.1



* [PATCH 2/9] perf, tools: Support handling complete branch stacks as histograms
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
  2014-01-11 19:42 ` [PATCH 1/9] perf, tools: fix BFD detection on opensuse Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-11 19:42 ` [PATCH 3/9] perf, tools: Add --branch-history option to report v2 Andi Kleen
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Currently branch stacks can only be shown as edge histograms for
individual branches. I never found this display particularly useful.

This implements an alternative mode that creates histograms over complete
branch traces, instead of individual branches, similar to how normal
callgraphs are handled. This is done by putting it in
front of the normal callgraph and then using the normal callgraph
histogram infrastructure to unify them.

This way in complex functions we can understand the control flow
that led to a particular sample, and may even see some control
flow in the caller for short functions.
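
As a rough illustration of "putting it in front of the normal callgraph",
the sketch below flattens a branch stack and a call chain into one address
list. It is not the perf code: struct br_entry and build_history() are
hypothetical names, and the real implementation appends to a callchain
cursor instead of a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for perf's struct branch_entry. */
struct br_entry {
	uint64_t from;
	uint64_t to;
};

/*
 * Write the branch entries first (target, then source, for each
 * branch), followed by the ordinary call chain. Returns the number
 * of addresses written, never more than cap.
 */
static size_t build_history(const struct br_entry *br, size_t br_nr,
			    const uint64_t *chain, size_t chain_nr,
			    uint64_t *out, size_t cap)
{
	size_t n = 0, i;

	assert(out != NULL);

	for (i = 0; i < br_nr && n + 2 <= cap; i++) {
		out[n++] = br[i].to;	/* where the branch landed */
		out[n++] = br[i].from;	/* where it was taken from */
	}
	for (i = 0; i < chain_nr && n < cap; i++)
		out[n++] = chain[i];

	return n;
}
```

Because the result is just a longer chain of addresses, code that already
builds histograms over call chains can treat it like any other chain.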

Example (simplified, of course for such simple code this
is usually not needed):

tcall.c:

volatile a = 10000, b = 100000, c;

__attribute__((noinline)) f2()
{
	c = a / b;
}

__attribute__((noinline)) f1()
{
	f2();
	f2();
}
main()
{
	int i;
	for (i = 0; i < 1000000; i++)
		f1();
}

% perf record -b -g ./tsrc/tcall
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
% perf report --branch-history
...
    54.91%  tcall.c:6  [.] f2                      tcall
            |
            |--65.53%-- f2 tcall.c:5
            |          |
            |          |--70.83%-- f1 tcall.c:11
            |          |          f1 tcall.c:10
            |          |          main tcall.c:18
            |          |          main tcall.c:18
            |          |          main tcall.c:17
            |          |          main tcall.c:17
            |          |          f1 tcall.c:13
            |          |          f1 tcall.c:13
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:12
            |          |          f1 tcall.c:12
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:11
            |          |
            |           --29.17%-- f1 tcall.c:12
            |                     f1 tcall.c:12
            |                     f2 tcall.c:7
            |                     f2 tcall.c:5
            |                     f1 tcall.c:11
            |                     f1 tcall.c:10
            |                     main tcall.c:18
            |                     main tcall.c:18
            |                     main tcall.c:17
            |                     main tcall.c:17
            |                     f1 tcall.c:13
            |                     f1 tcall.c:13
            |                     f2 tcall.c:7
            |                     f2 tcall.c:5
            |                     f1 tcall.c:12

The default output is unchanged.

This is only implemented in perf report, no change to record
or anywhere else.

This adds the basic code to report:
- add a new "branch" option to the -g option parser to enable this mode
- when the flag is set include the LBR into the callstack in machine.c.
The rest of the history code is unchanged and doesn't know the difference
between LBR entry and normal call entry.
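
A simplified sketch of recognising the extra "branch" token in a
comma-separated option string follows. It deliberately differs from the
patch: the patch looks at the token in a fixed position after the sort
key, while this hypothetical helper, has_branch_token(), accepts it
anywhere and uses a fixed-size copy buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if any comma-separated token in arg starts with "branch". */
static bool has_branch_token(const char *arg)
{
	char buf[128];
	char *tok;

	assert(arg != NULL);

	if (strlen(arg) >= sizeof(buf))
		return false;	/* too long for this sketch */
	strcpy(buf, arg);	/* strtok() modifies its input */

	for (tok = strtok(buf, ","); tok; tok = strtok(NULL, ","))
		if (!strncmp(tok, "branch", 6))
			return true;

	return false;
}
```

Note that strtok() skips empty fields, so "fractal,,branch" and
"fractal,branch" are treated the same way here.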

Current limitations:
- The LBR flags (mispredict etc.) are not shown in the history
and LBR entries have no special marker.
- It would be nice if annotate marked the LBR entries somehow
(e.g. with arrows)

v2: Various fixes.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-report.c |  15 ++++--
 tools/perf/util/callchain.h |   1 +
 tools/perf/util/machine.c   | 113 ++++++++++++++++++++++++++++++++++++--------
 tools/perf/util/symbol.h    |   3 +-
 4 files changed, 106 insertions(+), 26 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 46864dd..19a74e1 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -658,7 +658,7 @@ parse_callchain_opt(const struct option *opt, const char *arg, int unset)
 		callchain_param.order = ORDER_CALLER;
 	else if (!strncmp(tok2, "callee", strlen("callee")))
 		callchain_param.order = ORDER_CALLEE;
-	else
+	else if (tok2[0] != 0)
 		return -1;
 
 	/* Get the sort key */
@@ -669,8 +669,15 @@ parse_callchain_opt(const struct option *opt, const char *arg, int unset)
 		callchain_param.key = CCKEY_FUNCTION;
 	else if (!strncmp(tok2, "address", strlen("address")))
 		callchain_param.key = CCKEY_ADDRESS;
-	else
+	else if (tok2[0] != 0)
 		return -1;
+
+	tok2 = strtok(NULL, ",");
+	if (!tok2)
+		goto setup;
+	if (!strncmp(tok2, "branch", 6))
+		callchain_param.branch_callstack = 1;
+
 setup:
 	if (callchain_register_param(&callchain_param) < 0) {
 		pr_err("Can't register callchain params\n");
@@ -786,8 +793,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 		   "regex filter to identify parent, see: '--sort parent'"),
 	OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
 		    "Only display entries with parent-match"),
-	OPT_CALLBACK_DEFAULT('g', "call-graph", &report, "output_type,min_percent[,print_limit],call_order",
-		     "Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold, optional print limit, callchain order, key (function or address). "
+	OPT_CALLBACK_DEFAULT('g', "call-graph", &report, "output_type,min_percent[,print_limit],call_order[,branch]",
+		     "Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold, optional print limit, callchain order, key (function or address), add branches. "
 		     "Default: fractal,0.5,callee,function", &parse_callchain_opt, callchain_default_opt),
 	OPT_INTEGER(0, "max-stack", &report.max_stack,
 		    "Set the maximum stack depth when parsing the callchain, "
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 08b25af..a1a298a 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -53,6 +53,7 @@ struct callchain_param {
 	sort_chain_func_t	sort;
 	enum chain_order	order;
 	enum chain_key		key;
+	bool			branch_callstack;
 };
 
 struct callchain_list {
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0130279..f2eaf85 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1248,9 +1248,58 @@ struct branch_info *machine__resolve_bstack(struct machine *machine,
 	return bi;
 }
 
+static int add_callchain_ip(struct machine *machine,
+			    struct thread *thread,
+			    struct symbol **parent,
+			    struct addr_location *root_al,
+			    int cpumode,
+			    u64 ip)
+{
+	struct addr_location al;
+
+	al.filtered = false;
+	al.sym = NULL;
+	if (cpumode == -1) {
+		int i;
+
+		for (i = 0; i < (int)NCPUMODES && !al.sym; i++) {
+			/*
+	 	 	 * We cannot use the header.misc hint to determine whether a
+		 	 * branch stack address is user, kernel, guest, hypervisor.
+		 	 * Branches may straddle the kernel/user/hypervisor boundaries.
+		 	 * Thus, we have to try consecutively until we find a match
+		 	 * or else, the symbol is unknown
+		 	 */
+			thread__find_addr_location(thread, machine, cpumodes[i], 
+					MAP__FUNCTION,
+					ip, &al);
+		}
+	} else {
+		thread__find_addr_location(thread, machine, cpumode,
+					   MAP__FUNCTION, ip, &al);
+	}
+	if (al.sym != NULL) {
+		if (sort__has_parent && !*parent &&
+		    symbol__match_regex(al.sym, &parent_regex))
+			*parent = al.sym;
+		else if (have_ignore_callees && root_al &&
+		  symbol__match_regex(al.sym, &ignore_callees_regex)) {
+			/* Treat this symbol as the root,
+			   forgetting its callees. */
+			*root_al = al;
+			callchain_cursor_reset(&callchain_cursor);
+		}
+		if (!symbol_conf.use_callchain)
+			return -EINVAL;
+	}
+
+	return callchain_cursor_append(&callchain_cursor, ip, al.map, al.sym);
+}
+
 static int machine__resolve_callchain_sample(struct machine *machine,
 					     struct thread *thread,
 					     struct ip_callchain *chain,
+					     struct branch_stack *branch,
 					     struct symbol **parent,
 					     struct addr_location *root_al,
 					     int max_stack)
@@ -1262,6 +1311,43 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 
 	callchain_cursor_reset(&callchain_cursor);
 
+	/* 
+	 * Add branches to call stack for easier browsing. This gives
+	 * more context for a sample than just the callers.
+	 * 
+	 * This uses individual histograms of paths compared to the
+	 * aggregated histograms the normal LBR mode uses.
+	 *
+	 * Limitations for now:
+	 * - No extra filters
+	 * - No annotations (should annotate somehow)
+	 * - When the sample is near the beginning of the function
+ 	 *   we may overlap with the real callstack. Could handle this
+	 *   case later, by checking against the last ip.
+	 */
+
+	if (callchain_param.branch_callstack) {
+		for (i = 0; i < branch->nr; i++) { 
+			struct branch_entry *b; 
+
+			if (callchain_param.order == ORDER_CALLEE)
+				b = &branch->entries[i];
+			else
+				b = &branch->entries[branch->nr - i - 1];
+
+			err = add_callchain_ip(machine, thread, parent, root_al,
+					       -1, b->to);
+			if (!err)
+				err = add_callchain_ip(machine, thread, parent, root_al,
+					       -1, b->from);
+			if (err == -EINVAL)
+				break;
+			if (err)
+				return err;
+
+		}
+	}
+
 	if (chain->nr > PERF_MAX_STACK_DEPTH) {
 		pr_warning("corrupted callchain. skipping...\n");
 		return 0;
@@ -1269,7 +1355,6 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 
 	for (i = 0; i < chain_nr; i++) {
 		u64 ip;
-		struct addr_location al;
 
 		if (callchain_param.order == ORDER_CALLEE)
 			ip = chain->ips[i];
@@ -1300,26 +1385,10 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 			continue;
 		}
 
-		al.filtered = false;
-		thread__find_addr_location(thread, machine, cpumode,
-					   MAP__FUNCTION, ip, &al);
-		if (al.sym != NULL) {
-			if (sort__has_parent && !*parent &&
-			    symbol__match_regex(al.sym, &parent_regex))
-				*parent = al.sym;
-			else if (have_ignore_callees && root_al &&
-			  symbol__match_regex(al.sym, &ignore_callees_regex)) {
-				/* Treat this symbol as the root,
-				   forgetting its callees. */
-				*root_al = al;
-				callchain_cursor_reset(&callchain_cursor);
-			}
-			if (!symbol_conf.use_callchain)
-				break;
-		}
 
-		err = callchain_cursor_append(&callchain_cursor,
-					      ip, al.map, al.sym);
+		err = add_callchain_ip(machine, thread, parent, root_al, cpumode, ip);
+		if (err == -EINVAL)
+			break;
 		if (err)
 			return err;
 	}
@@ -1345,7 +1414,9 @@ int machine__resolve_callchain(struct machine *machine,
 	int ret;
 
 	ret = machine__resolve_callchain_sample(machine, thread,
-						sample->callchain, parent,
+						sample->callchain, 
+						sample->branch_stack,
+						parent,
 						root_al, max_stack);
 	if (ret)
 		return ret;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index cbd6803..0da4b24 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -99,7 +99,8 @@ struct symbol_conf {
 			annotate_asm_raw,
 			annotate_src,
 			event_group,
-			demangle;
+			demangle,
+			branch_callstack;
 	const char	*vmlinux_name,
 			*kallsyms_name,
 			*source_prefix,
-- 
1.8.3.1



* [PATCH 3/9] perf, tools: Add --branch-history option to report v2
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
  2014-01-11 19:42 ` [PATCH 1/9] perf, tools: fix BFD detection on opensuse Andi Kleen
  2014-01-11 19:42 ` [PATCH 2/9] perf, tools: Support handling complete branch stacks as histograms Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-11 19:42 ` [PATCH 4/9] perf, tools: Filter out small loops from LBR-as-call-stack Andi Kleen
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add a --branch-history option to perf report that changes all
the settings necessary for using the branches in callstacks.

This is just a shortcut to make the feature nicer to use; it does
not enable any functionality by itself.

v2: Change sort order. Rename option to --branch-history to
be less confusing.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-report.txt |  5 +++++
 tools/perf/builtin-report.c              | 27 ++++++++++++++++++++++++---
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 8eab8a4..5410f35 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -223,6 +223,11 @@ OPTIONS
 	branch stacks and it will automatically switch to the branch view mode,
 	unless --no-branch-stack is used.
 
+--branch-history::
+	Add the addresses of sampled taken branches to the callstack.
+	This allows to examine the path the program took to each sample.
+	The data collection must have used -b (or -j) and -g.
+
 --objdump=<path>::
         Path to objdump binary.
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 19a74e1..24b48ec 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -715,6 +715,16 @@ parse_branch_mode(const struct option *opt __maybe_unused,
 }
 
 static int
+parse_branch_call_mode(const struct option *opt __maybe_unused,
+		  const char *str __maybe_unused, int unset)
+{
+	int *branch_mode = opt->value;
+
+	*branch_mode = !unset;
+	return 0;
+}
+
+static int
 parse_percent_limit(const struct option *opt, const char *str,
 		    int unset __maybe_unused)
 {
@@ -729,7 +739,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	struct perf_session *session;
 	struct stat st;
 	bool has_br_stack = false;
-	int branch_mode = -1;
+	int branch_mode = -1, branch_call_mode = -1;
 	int ret = -1;
 	char callchain_default_opt[] = "fractal,0.5,callee";
 	const char * const report_usage[] = {
@@ -838,7 +848,10 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
 		    "Show event group information together"),
 	OPT_CALLBACK_NOOPT('b', "branch-stack", &branch_mode, "",
-		    "use branch records for histogram filling", parse_branch_mode),
+		    "use branch records for per branch histogram filling", parse_branch_mode),
+	OPT_CALLBACK_NOOPT(0, "branch-history", &branch_call_mode, "",
+		    "add last branch records to call history",
+		    parse_branch_call_mode),
 	OPT_STRING(0, "objdump", &objdump_path, "path",
 		   "objdump binary to use for disassembly and annotations"),
 	OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
@@ -886,8 +899,16 @@ repeat:
 	has_br_stack = perf_header__has_feat(&session->header,
 					     HEADER_BRANCH_STACK);
 
-	if (branch_mode == -1 && has_br_stack)
+	if (branch_mode == -1 && has_br_stack && branch_call_mode == -1)
 		sort__mode = SORT_MODE__BRANCH;
+	if (branch_call_mode != -1) {
+		callchain_param.branch_callstack = 1;
+		callchain_param.key = CCKEY_ADDRESS;
+		symbol_conf.use_callchain = true;
+		callchain_register_param(&callchain_param);
+		if (sort_order == default_sort_order)
+			sort_order = "srcline,symbol,dso";
+	}
 
 	/* sort__mode could be NORMAL if --no-branch-stack */
 	if (sort__mode == SORT_MODE__BRANCH) {
-- 
1.8.3.1



* [PATCH 4/9] perf, tools: Filter out small loops from LBR-as-call-stack
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
                   ` (2 preceding siblings ...)
  2014-01-11 19:42 ` [PATCH 3/9] perf, tools: Add --branch-history option to report v2 Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-11 19:42 ` [PATCH 5/9] perf, tools: Enable printing the srcline in the history Andi Kleen
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Small loops can cause unnecessary duplication in the LBR-as-callstack,
because the loop body appears multiple times. Filter out duplications
from the LBR before unifying it into the histories.  This way the
same loop body only appears once.

This uses a simple hash based cycle detector. It takes some
shortcuts (not handling hash collisions) so in rare cases duplicates
may be missed.
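
The idea can be sketched standalone as below. This is an illustration,
not the code in the diff: drop_repeats(), slot_of() and struct br_entry
are hypothetical names, the hash is a plain modulo chosen for
readability, and unlike the patch a repetition is only removed when the
whole loop body repeats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOTS    127	/* table size (illustrative) */
#define NO_ENTRY 0xff	/* marks an unused slot */

struct br_entry {
	uint64_t from;
	uint64_t to;
};

/* Toy hash for the sketch; any hash onto 0..SLOTS-1 would do. */
static unsigned slot_of(uint64_t addr)
{
	return (unsigned)(addr % SLOTS);
}

/*
 * Remove immediately repeated runs of branches (loop bodies) from l[].
 * Colliding addresses are simply not tracked, so some repeats can be
 * missed. Returns the new number of entries.
 */
static int drop_repeats(struct br_entry *l, int nr)
{
	unsigned char seen[SLOTS];
	int i = 0;

	assert(nr >= 0 && nr < NO_ENTRY);
	memset(seen, NO_ENTRY, sizeof(seen));

	while (i < nr) {
		unsigned h = slot_of(l[i].from);
		int first = seen[h];

		if (first == NO_ENTRY) {
			seen[h] = (unsigned char)i;
		} else if (l[first].from == l[i].from) {
			int len = i - first;	/* candidate body length */
			int off = 0;

			while (off < len && i + off < nr &&
			       l[first + off].from == l[i + off].from)
				off++;
			if (off == len) {
				/* l[i..i+len) repeats l[first..i): cut it */
				memmove(l + i, l + i + len,
					(size_t)(nr - i - len) * sizeof(*l));
				nr -= len;
				continue;	/* re-examine the new l[i] */
			}
		}
		i++;
	}
	return nr;
}
```

For a branch sequence A B C B C B C D this leaves A B C D, so the loop
body B C shows up once in the history.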

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/machine.c | 73 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 62 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f2eaf85..2f440f2 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -11,6 +11,7 @@
 #include <stdbool.h>
 #include <symbol/kallsyms.h>
 #include "unwind.h"
+#include "linux/hash.h"
 
 int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 {
@@ -1296,6 +1297,46 @@ static int add_callchain_ip(struct machine *machine,
 	return callchain_cursor_append(&callchain_cursor, ip, al.map, al.sym);
 }
 
+#define CHASHSZ 127
+#define CHASHBITS 7
+#define NO_ENTRY 0xff
+
+#define PERF_MAX_BRANCH_DEPTH 127
+
+/* Remove loops. */
+static int remove_loops(struct branch_entry *l, int nr)
+{
+	int i, j, off;
+	unsigned char chash[CHASHSZ];
+	memset(chash, -1, sizeof(chash));
+
+	BUG_ON(nr >= 256);
+	for (i = 0; i < nr; i++) {
+		int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
+
+		/* no collision handling for now */
+		if (chash[h] == NO_ENTRY) {
+			chash[h] = i;
+		} else if (l[chash[h]].from == l[i].from) {
+			bool is_loop = true;
+			/* check if it is a real loop */
+			off = 0;
+			for (j = chash[h]; j < i && i + off < nr; j++, off++)
+				if (l[j].from != l[i + off].from) {
+					is_loop = false;
+					break;
+				}
+			if (is_loop) {
+				memmove(l + i, l + i + off, 
+					(nr - (i + off))
+					* sizeof(struct branch_entry));
+				nr -= off;
+			}
+		}
+	}
+	return nr;
+}
+
 static int machine__resolve_callchain_sample(struct machine *machine,
 					     struct thread *thread,
 					     struct ip_callchain *chain,
@@ -1322,29 +1363,39 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 	 * - No extra filters
 	 * - No annotations (should annotate somehow)
 	 * - When the sample is near the beginning of the function
- 	 *   we may overlap with the real callstack. Could handle this
-	 *   case later, by checking against the last ip.
+ 	 *   we may overlap with the real callstack. 
 	 */
 
+	if (branch->nr > PERF_MAX_BRANCH_DEPTH) {
+		pr_warning("corrupted branch chain. skipping...\n");
+		return 0;
+	}
+
 	if (callchain_param.branch_callstack) {
-		for (i = 0; i < branch->nr; i++) { 
-			struct branch_entry *b; 
+		int nr = branch->nr;
+		struct branch_entry be[nr];
 
+		for (i = 0; i < nr; i++) { 
 			if (callchain_param.order == ORDER_CALLEE)
-				b = &branch->entries[i];
+				be[i] = branch->entries[i];
 			else
-				b = &branch->entries[branch->nr - i - 1];
+				be[i] = branch->entries[branch->nr - i - 1];
+		}
 
-			err = add_callchain_ip(machine, thread, parent, root_al,
-					       -1, b->to);
+		nr = remove_loops(be, nr);
+
+		for (i = 0; i < nr; i++) {
+			err = add_callchain_ip(machine, thread, parent, 
+					       root_al,
+					       -1, be[i].to);
 			if (!err)
-				err = add_callchain_ip(machine, thread, parent, root_al,
-					       -1, b->from);
+				err = add_callchain_ip(machine, thread, 
+						       parent, root_al,
+						       -1, be[i].from);
 			if (err == -EINVAL)
 				break;
 			if (err)
 				return err;
-
 		}
 	}
 
-- 
1.8.3.1



* [PATCH 5/9] perf, tools: Enable printing the srcline in the history
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
                   ` (3 preceding siblings ...)
  2014-01-11 19:42 ` [PATCH 4/9] perf, tools: Filter out small loops from LBR-as-call-stack Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-11 19:42 ` [PATCH 6/9] perf, tools: Fix max stack handling with lbr-as-callgraph Andi Kleen
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

For lbr-as-callgraph we need to see the line number in the history,
because many LBR entries can be in a single function, and just
showing the same function name many times is not useful.

When the history code is configured to sort by address, also try to
resolve the address to a file:srcline and display this in the browser.
If that fails, still display the address.
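
The display fallback can be sketched as a small formatting helper. This
is only an illustration: format_entry() is a hypothetical name, and
looking up the symbol and source line is assumed to have happened
already, with NULL meaning "not resolved".

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one history entry. Prefer "symbol file:line", then the bare
 * symbol, then the raw address. Returns what snprintf() returns.
 */
static int format_entry(char *buf, size_t size, const char *sym,
			const char *srcline, uint64_t ip)
{
	assert(buf != NULL && size > 0);

	if (sym && srcline)
		return snprintf(buf, size, "%s %s", sym, srcline);
	if (sym)
		return snprintf(buf, size, "%s", sym);
	return snprintf(buf, size, "%#" PRIx64, ip);
}
```

With sym "f1" and srcline "tcall.c:11" this yields "f1 tcall.c:11",
matching the entries in the example output of this series.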

This can be also useful without LBRs for understanding which call in a large
function (or in which inlined function) called something else.

Contains fixes from Namhyung Kim

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/ui/browsers/hists.c | 15 ++++++++++++---
 tools/perf/ui/stdio/hist.c     | 16 +++++++++++++---
 tools/perf/util/callchain.h    |  1 +
 tools/perf/util/machine.c      |  2 +-
 tools/perf/util/srcline.c      |  6 ++++--
 5 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index b720b92..509f550 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -399,9 +399,18 @@ static char *callchain_list__sym_name(struct callchain_list *cl,
 {
 	int printed;
 
-	if (cl->ms.sym)
-		printed = scnprintf(bf, bfsize, "%s", cl->ms.sym->name);
-	else
+	if (cl->ms.sym) {
+		if (callchain_param.key == CCKEY_ADDRESS && 
+		    cl->ms.map && !cl->srcline)
+			cl->srcline = get_srcline(cl->ms.map->dso,
+						  map__rip_2objdump(cl->ms.map,
+								    cl->ip));
+		if (cl->srcline)
+			printed = scnprintf(bf, bfsize, "%s %s", 
+					cl->ms.sym->name, cl->srcline);
+		else
+			printed = scnprintf(bf, bfsize, "%s", cl->ms.sym->name);
+	} else
 		printed = scnprintf(bf, bfsize, "%#" PRIx64, cl->ip);
 
 	if (show_dso)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 831fbb7..894b12c 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -56,9 +56,19 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
 		} else
 			ret += fprintf(fp, "%s", "          ");
 	}
-	if (chain->ms.sym)
-		ret += fprintf(fp, "%s\n", chain->ms.sym->name);
-	else
+	if (chain->ms.sym) {
+		if (callchain_param.key == CCKEY_ADDRESS && 
+		    chain->ms.map)
+			chain->srcline = get_srcline(chain->ms.map->dso,
+						  map__rip_2objdump(
+							  chain->ms.map,
+							  chain->ip));
+		if (chain->srcline)
+			ret += fprintf(fp, "%s %s\n", 
+				       chain->ms.sym->name, chain->srcline);
+		else
+			ret += fprintf(fp, "%s\n", chain->ms.sym->name);
+	} else
 		ret += fprintf(fp, "0x%0" PRIx64 "\n", chain->ip);
 
 	return ret;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index a1a298a..0e4d016 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -59,6 +59,7 @@ struct callchain_param {
 struct callchain_list {
 	u64			ip;
 	struct map_symbol	ms;
+	char 		       *srcline;
 	struct list_head	list;
 };
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 2f440f2..4032634 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1294,7 +1294,7 @@ static int add_callchain_ip(struct machine *machine,
 			return -EINVAL;
 	}
 
-	return callchain_cursor_append(&callchain_cursor, ip, al.map, al.sym);
+	return callchain_cursor_append(&callchain_cursor, al.addr, al.map, al.sym);
 }
 
 #define CHASHSZ 127
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 7e67879..680c02b 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -258,7 +258,7 @@ char *get_srcline(struct dso *dso, unsigned long addr)
 	const char *dso_name;
 
 	if (!dso->has_srcline)
-		return SRCLINE_UNKNOWN;
+		goto out;
 
 	if (dso->symsrc_filename)
 		dso_name = dso->symsrc_filename;
@@ -289,7 +289,9 @@ out:
 		dso->has_srcline = 0;
 		dso__free_a2l(dso);
 	}
-	return SRCLINE_UNKNOWN;
+	if (asprintf(&srcline, "%s[%lx]", dso->short_name, addr) < 0)
+		return SRCLINE_UNKNOWN;
+	return srcline;
 }
 
 void free_srcline(char *srcline)
-- 
1.8.3.1



* [PATCH 6/9] perf, tools: Fix max stack handling with lbr-as-callgraph
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
                   ` (4 preceding siblings ...)
  2014-01-11 19:42 ` [PATCH 5/9] perf, tools: Enable printing the srcline in the history Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-11 19:42 ` [PATCH 7/9] perf, tools: Add overlap detection for report branch-call-stack mode Andi Kleen
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

The original lbr-as-callstack code ignored the maximum callgraph
length set by the user. Fix this.
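
One way to picture the limit is as a depth budget shared between branch
entries and call chain entries. The sketch below is an assumption about
how such a split could look, not the arithmetic in the diff:
split_depth() and struct depth_split are hypothetical, and the leftover
budget is clamped so it never goes below zero.

```c
#include <assert.h>

/* How many entries of each kind fit into the user's limit. */
struct depth_split {
	int branch_nr;
	int chain_nr;
};

/* Spend the budget on branch entries first, then on the call chain. */
static struct depth_split split_depth(int max_stack, int branch_nr,
				      int chain_nr)
{
	struct depth_split d;
	int left;

	assert(max_stack >= 0 && branch_nr >= 0 && chain_nr >= 0);

	d.branch_nr = branch_nr < max_stack ? branch_nr : max_stack;
	left = max_stack - d.branch_nr;
	d.chain_nr = chain_nr < left ? chain_nr : left;
	return d;
}
```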

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/machine.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 4032634..853639c 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1372,7 +1372,7 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 	}
 
 	if (callchain_param.branch_callstack) {
-		int nr = branch->nr;
+		int nr = min(max_stack, branch->nr);
 		struct branch_entry be[nr];
 
 		for (i = 0; i < nr; i++) { 
@@ -1397,6 +1397,7 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 			if (err)
 				return err;
 		}
+		chain_nr -= nr;
 	}
 
 	if (chain->nr > PERF_MAX_STACK_DEPTH) {
-- 
1.8.3.1



* [PATCH 7/9] perf, tools: Add overlap detection for report branch-call-stack mode
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
                   ` (5 preceding siblings ...)
  2014-01-11 19:42 ` [PATCH 6/9] perf, tools: Fix max stack handling with lbr-as-callgraph Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-11 19:42 ` [PATCH 8/9] perf, tools: Only print base source file for srcline Andi Kleen
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add a simple heuristic to detect overlap of LBR entries and the call
stack when in lbr-as-callgraph mode. The return address in the
normal callstack points just past the call instruction, while the
from entry in the branch stack points at the call itself. Handle
this with a simple "assume the call instruction is not longer than
8 bytes" heuristic. With that we can remove any call in the
callstack that is already in the branch stack.
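
The heuristic can be sketched in isolation roughly like this. The
function names and the flat arrays are hypothetical stand-ins for
branch->entries[].from and chain->ips[]; the bounds check on nr_ips
is an addition in this sketch and is not in the diff below.

```c
#include <stdint.h>

/* Assumed upper bound on the length of a call instruction. */
#define MAX_CALL_INSN_LEN 8

/*
 * A branch "from" address overlaps a callchain return address when it
 * lies within the MAX_CALL_INSN_LEN bytes just before the return address.
 */
static int overlaps_return_address(uint64_t from, uint64_t ret_addr)
{
	return from < ret_addr && ret_addr - from <= MAX_CALL_INSN_LEN;
}

/*
 * Walk the branch entries (callee order) and count how many leading
 * callchain entries are already covered by the branch stack. The
 * result is the index of the first callchain entry still to be shown.
 */
static int count_overlap(const uint64_t *from, int nr_branches,
			 const uint64_t *ips, int nr_ips)
{
	int first_call = 0;
	int i;

	for (i = 0; i < nr_branches; i++) {
		if (first_call < nr_ips &&
		    overlaps_return_address(from[i], ips[first_call]))
			first_call++;
	}
	return first_call;
}
```

The callchain loop then starts at the returned index instead of 0,
so calls that are already visible in the branch stack are not shown
twice.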

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/machine.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 853639c..1f167fe 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1349,6 +1349,7 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 	int chain_nr = min(max_stack, (int)chain->nr);
 	int i;
 	int err;
+	int first_call = 0;
 
 	callchain_cursor_reset(&callchain_cursor);
 
@@ -1362,8 +1363,6 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 	 * Limitations for now:
 	 * - No extra filters
 	 * - No annotations (should annotate somehow)
-	 * - When the sample is near the beginning of the function
- 	 *   we may overlap with the real callstack. 
 	 */
 
 	if (branch->nr > PERF_MAX_BRANCH_DEPTH) {
@@ -1372,13 +1371,23 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 	}
 
 	if (callchain_param.branch_callstack) {
-		int nr = min(max_stack, branch->nr);
+		int nr = min(max_stack, (int)branch->nr);
 		struct branch_entry be[nr];
 
 		for (i = 0; i < nr; i++) { 
-			if (callchain_param.order == ORDER_CALLEE)
+			if (callchain_param.order == ORDER_CALLEE) {
 				be[i] = branch->entries[i];
-			else
+				/* 
+				 * Check for overlap into the callchain.
+				 * The return address is one off compared to
+				 * the branch entry. To adjust for this 
+				 * assume the calling instruction is not longer
+				 * than 8 bytes.
+				 */
+				if (be[i].from < chain->ips[first_call] &&
+				    be[i].from >= chain->ips[first_call] - 8)
+					first_call++;
+			} else
 				be[i] = branch->entries[branch->nr - i - 1];
 		}
 
@@ -1405,7 +1414,7 @@ static int machine__resolve_callchain_sample(struct machine *machine,
 		return 0;
 	}
 
-	for (i = 0; i < chain_nr; i++) {
+	for (i = first_call; i < chain_nr; i++) {
 		u64 ip;
 
 		if (callchain_param.order == ORDER_CALLEE)
-- 
1.8.3.1


* [PATCH 8/9] perf, tools: Only print base source file for srcline
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
                   ` (6 preceding siblings ...)
  2014-01-11 19:42 ` [PATCH 7/9] perf, tools: Add overlap detection for report branch-call-stack mode Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-11 19:42 ` [PATCH 9/9] perf, tools: Support source line numbers in annotate Andi Kleen
  2014-01-12 15:16 ` perf: Implement lbr-as-callgraph v2 Jiri Olsa
  9 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

For perf report with --sort srcline, only print the base source file
name. This makes the results generally fit much better on the
screen. The path is usually not that useful anyway because it
often comes from a different system.
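
A minimal sketch of the resulting formatting, written without
basename() so it does not depend on which libc variant is in use.
The helpers base_name and format_srcline are hypothetical and only
illustrate the "file:line" output; the actual change below is the
one-line asprintf() edit.

```c
#include <stdio.h>
#include <string.h>

/* Return the part of path after the last '/', or path itself. */
static const char *base_name(const char *path)
{
	const char *slash = strrchr(path, '/');

	return slash ? slash + 1 : path;
}

/* Format "file:line" using only the base file name. */
static int format_srcline(char *buf, size_t size, const char *file,
			  unsigned int line)
{
	return snprintf(buf, size, "%s:%u", base_name(file), line);
}
```

For a made-up path such as /build/src/tcall.c and line 6 this gives
tcall.c:6 instead of the full path.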

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/srcline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 680c02b..88bf0e8 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -274,7 +274,7 @@ char *get_srcline(struct dso *dso, unsigned long addr)
 	if (!addr2line(dso_name, addr, &file, &line, dso))
 		goto out;
 
-	if (asprintf(&srcline, "%s:%u", file, line) < 0) {
+	if (asprintf(&srcline, "%s:%u", basename(file), line) < 0) {
 		free(file);
 		goto out;
 	}
-- 
1.8.3.1


* [PATCH 9/9] perf, tools: Support source line numbers in annotate
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
                   ` (7 preceding siblings ...)
  2014-01-11 19:42 ` [PATCH 8/9] perf, tools: Only print base source file for srcline Andi Kleen
@ 2014-01-11 19:42 ` Andi Kleen
  2014-01-12 15:16 ` perf: Implement lbr-as-callgraph v2 Jiri Olsa
  9 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2014-01-11 19:42 UTC (permalink / raw)
  To: acme
  Cc: jolsa, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

With srcline key/sort'ing it's useful to have line numbers
in the annotate window. This patch implements this.

Use objdump -l to request the line numbers and
save them in the line structure. Then the browser
displays them for source lines.

The line numbers are not displayed by default, but can be
toggled on with 'k'.

There is one unfortunate problem with this setup. For
lines that contain no source and are outside functions,
objdump -l reports the wrong line numbers (it always reports
the first line number of the next function, even for lines
that are outside the function).
I haven't found a nice way to detect/correct this. Probably objdump
has to be fixed.
See https://sourceware.org/bugzilla/show_bug.cgi?id=16433

The line numbers are still useful even with these problems,
as most are correct.
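
For illustration, the matching of the "/filename:linenr" lines that
objdump -l emits can be sketched standalone as below. It uses the
same POSIX regex as the patch; the function name parse_file_lineno
is hypothetical, and the sketch assumes the trailing newline has
already been stripped from the line. It compiles the regex on every
call for simplicity, whereas the patch compiles it once in a
constructor.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Return the line number if line looks like "/path/to/file:NNN",
 * or -1 if it does not match (or the regex fails to compile).
 */
static int parse_file_lineno(const char *line)
{
	regex_t re;
	regmatch_t match[2];
	int nr = -1;

	if (regcomp(&re, "^/[^:]+:([0-9]+)$", REG_EXTENDED) != 0)
		return -1;

	/* match[1] is the capture group holding the digits. */
	if (regexec(&re, line, 2, match, 0) == 0)
		nr = atoi(line + match[1].rm_so);

	regfree(&re);
	return nr;
}
```

Disassembly lines and relative paths do not match, so only the
absolute "/file:line" markers update the current line number.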

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/ui/browsers/annotate.c | 13 ++++++++++++-
 tools/perf/util/annotate.c        | 30 +++++++++++++++++++++++++-----
 tools/perf/util/annotate.h        |  1 +
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index f0697a3..1c8aab0 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -27,6 +27,7 @@ static struct annotate_browser_opt {
 	bool hide_src_code,
 	     use_offset,
 	     jump_arrows,
+	     show_linenr,
 	     show_nr_jumps;
 } annotate_browser__opts = {
 	.use_offset	= true,
@@ -128,7 +129,11 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 	if (!*dl->line)
 		slsmg_write_nstring(" ", width - pcnt_width);
 	else if (dl->offset == -1) {
-		printed = scnprintf(bf, sizeof(bf), "%*s  ",
+		if (dl->line_nr && annotate_browser__opts.show_linenr) 
+			printed = scnprintf(bf, sizeof(bf), "%*s %-5d ",
+					ab->addr_width, " ", dl->line_nr);
+		else
+			printed = scnprintf(bf, sizeof(bf), "%*s  ",
 				    ab->addr_width, " ");
 		slsmg_write_nstring(bf, printed);
 		slsmg_write_nstring(dl->line, width - printed - pcnt_width + 1);
@@ -733,6 +738,7 @@ static int annotate_browser__run(struct annotate_browser *browser,
 		"o             Toggle disassembler output/simplified view\n"
 		"s             Toggle source code view\n"
 		"/             Search string\n"
+		"k	       Toggle line numbers\n"
 		"r             Run available scripts\n"
 		"?             Search string backwards\n");
 			continue;
@@ -741,6 +747,10 @@ static int annotate_browser__run(struct annotate_browser *browser,
 				script_browse(NULL);
 				continue;
 			}
+		case 'k':
+			annotate_browser__opts.show_linenr =
+				!annotate_browser__opts.show_linenr;
+			break;
 		case 'H':
 			nd = browser->curr_hot;
 			break;
@@ -984,6 +994,7 @@ static struct annotate_config {
 } annotate__configs[] = {
 	ANNOTATE_CFG(hide_src_code),
 	ANNOTATE_CFG(jump_arrows),
+	ANNOTATE_CFG(show_linenr),
 	ANNOTATE_CFG(show_nr_jumps),
 	ANNOTATE_CFG(use_offset),
 };
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 469eb67..f020110 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -15,11 +15,13 @@
 #include "debug.h"
 #include "annotate.h"
 #include "evsel.h"
+#include <regex.h>
 #include <pthread.h>
 #include <linux/bitops.h>
 
 const char 	*disassembler_style;
 const char	*objdump_path;
+static regex_t	 file_lineno;
 
 static struct ins *ins__find(const char *name);
 static int disasm_line__parse(char *line, char **namep, char **rawp);
@@ -562,13 +564,15 @@ out_free_name:
 	return -1;
 }
 
-static struct disasm_line *disasm_line__new(s64 offset, char *line, size_t privsize)
+static struct disasm_line *disasm_line__new(s64 offset, char *line,
+					size_t privsize, int line_nr)
 {
 	struct disasm_line *dl = zalloc(sizeof(*dl) + privsize);
 
 	if (dl != NULL) {
 		dl->offset = offset;
 		dl->line = strdup(line);
+		dl->line_nr = line_nr;
 		if (dl->line == NULL)
 			goto out_delete;
 
@@ -780,13 +784,15 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st
  * The ops.raw part will be parsed further according to type of the instruction.
  */
 static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
-				      FILE *file, size_t privsize)
+				      FILE *file, size_t privsize,
+				      int *line_nr)
 {
 	struct annotation *notes = symbol__annotation(sym);
 	struct disasm_line *dl;
 	char *line = NULL, *parsed_line, *tmp, *tmp2, *c;
 	size_t line_len;
 	s64 line_ip, offset = -1;
+	regmatch_t match[2];
 
 	if (getline(&line, &line_len, file) < 0)
 		return -1;
@@ -804,6 +810,12 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
 	line_ip = -1;
 	parsed_line = line;
 
+	/* /filename:linenr ? Save line number and ignore. */
+	if (regexec(&file_lineno, line, 2, match, 0) == 0) {
+		*line_nr = atoi(line + match[1].rm_so);
+		return 0;
+	}
+
 	/*
 	 * Strip leading spaces:
 	 */
@@ -834,8 +846,9 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
 			parsed_line = tmp2 + 1;
 	}
 
-	dl = disasm_line__new(offset, parsed_line, privsize);
+	dl = disasm_line__new(offset, parsed_line, privsize, *line_nr);
 	free(line);
+	(*line_nr)++;
 
 	if (dl == NULL)
 		return -1;
@@ -861,6 +874,11 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
 	return 0;
 }
 
+static __attribute__((constructor)) void symbol__init_regexpr(void)
+{
+	regcomp(&file_lineno, "^/[^:]+:([0-9]+)$", REG_EXTENDED);
+}
+
 static void delete_last_nop(struct symbol *sym)
 {
 	struct annotation *notes = symbol__annotation(sym);
@@ -896,6 +914,7 @@ int symbol__annotate(struct symbol *sym, struct map *map, size_t privsize)
 	char symfs_filename[PATH_MAX];
 	struct kcore_extract kce;
 	bool delete_extract = false;
+	int lineno = 0;
 
 	if (filename) {
 		snprintf(symfs_filename, sizeof(symfs_filename), "%s%s",
@@ -977,7 +996,7 @@ fallback:
 	snprintf(command, sizeof(command),
 		 "%s %s%s --start-address=0x%016" PRIx64
 		 " --stop-address=0x%016" PRIx64
-		 " -d %s %s -C %s 2>/dev/null|grep -v %s|expand",
+		 " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand",
 		 objdump_path ? objdump_path : "objdump",
 		 disassembler_style ? "-M " : "",
 		 disassembler_style ? disassembler_style : "",
@@ -994,7 +1013,8 @@ fallback:
 		goto out_free_filename;
 
 	while (!feof(file))
-		if (symbol__parse_objdump_line(sym, map, file, privsize) < 0)
+		if (symbol__parse_objdump_line(sym, map, file, privsize,
+			    &lineno) < 0)
 			break;
 
 	/*
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index b2aef59..a124e7e 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -58,6 +58,7 @@ struct disasm_line {
 	char		    *line;
 	char		    *name;
 	struct ins	    *ins;
+	int		    line_nr;
 	struct ins_operands ops;
 };
 
-- 
1.8.3.1


* Re: [PATCH 1/9] perf, tools: fix BFD detection on opensuse
  2014-01-11 19:42 ` [PATCH 1/9] perf, tools: fix BFD detection on opensuse Andi Kleen
@ 2014-01-12 15:16   ` Jiri Olsa
  2014-01-13  9:03     ` Namhyung Kim
  2014-01-12 15:40   ` David Ahern
  2014-03-02  8:57   ` [tip:perf/urgent] perf " tip-bot for Andi Kleen
  2 siblings, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2014-01-12 15:16 UTC (permalink / raw)
  To: Andi Kleen
  Cc: acme, namhyung, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

On Sat, Jan 11, 2014 at 11:42:51AM -0800, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> opensuse libbfd requires -lz -liberty to build. Add those
> to the BFD feature detection.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/config/Makefile                | 2 +-
>  tools/perf/config/feature-checks/Makefile | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
> index 01dd43d..d86d33c 100644
> --- a/tools/perf/config/Makefile
> +++ b/tools/perf/config/Makefile
> @@ -478,7 +478,7 @@ else
>  endif
>  
>  ifeq ($(feature-libbfd), 1)
> -  EXTLIBS += -lbfd
> +  EXTLIBS += -lbfd -lz -liberty
>  endif

OK, Fedora uses a linker script in place of libbfd.so, which
adds those anyway.

---
[jolsa@krava perf]$ cat /usr/lib64/libbfd.so 
/* GNU ld script */

/* Ensure this .so library will not be used by a link for a different format
   on a multi-architecture system.  */
OUTPUT_FORMAT(elf64-x86-64)

/* The libz dependency is unexpected by legacy build scripts.  */
/* The libdl dependency is for plugin support.  (BZ 889134)  */
INPUT ( /usr/lib64/libbfd.a -liberty -lz -ldl )
---

We also need to check, and probably get rid of, the follow-up settings
of EXTLIBS, which seem useless now:

...
    ifneq ($(feature-libbfd), 1)
      $(call feature_check,liberty)
      ifeq ($(feature-liberty), 1)
        EXTLIBS += -lbfd -liberty
      else
        $(call feature_check,liberty-z)
        ifeq ($(feature-liberty-z), 1)
          EXTLIBS += -lbfd -liberty -lz
...

jirka

* Re: perf: Implement lbr-as-callgraph v2
  2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
                   ` (8 preceding siblings ...)
  2014-01-11 19:42 ` [PATCH 9/9] perf, tools: Support source line numbers in annotate Andi Kleen
@ 2014-01-12 15:16 ` Jiri Olsa
  9 siblings, 0 replies; 15+ messages in thread
From: Jiri Olsa @ 2014-01-12 15:16 UTC (permalink / raw)
  To: Andi Kleen
  Cc: acme, namhyung, mingo, dsahern, fweisbec, adrian.hunter, linux-kernel

On Sat, Jan 11, 2014 at 11:42:50AM -0800, Andi Kleen wrote:
> This patchkit implements lbr-as-callgraphs in perf report,
> as an alternative way to present LBR information.
> 
> Current perf report does a histogram over the branch edges,
> which is useful to look at basic blocks, but doesn't tell
> you anything about the larger control flow.
> 
> This patchkit adds a new option --branch-history that
> adds the branch paths to the callgraph history instead.
> 
> This allows to reason about individual branch paths leading
> to specific samples.
> 
> Updates to v1:
> - rebased on perf/core
> - fix various issues
> - rename the option to --branch-history
> - various fixes to display the information more concise
> 
> Example output:
> 
>     % perf record -b -g ./tsrc/tcall
>     [ perf record: Woken up 1 times to write data ]
>     [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
>     % perf report --branch-history
>     ...
>         54.91%  tcall.c:6  [.] f2                      tcall
>                 |
>                 |--66.53%-- f2 tcall.c:5
>                 |          |
>                 |          |--70.83%-- f1 tcall.c:11
>                 |          |          f1 tcall.c:10
>                 |          |          main tcall.c:18
>                 |          |          main tcall.c:18
>                 |          |          main tcall.c:17
>                 |          |          main tcall.c:17
>                 |          |          f1 tcall.c:13
>                 |          |          f1 tcall.c:13
>                 |          |          f2 tcall.c:7
>                 |          |          f2 tcall.c:5
>                 |          |          f1 tcall.c:12
>                 |          |          f1 tcall.c:12
>                 |          |          f2 tcall.c:7
>                 |          |          f2 tcall.c:5
>                 |          |          f1 tcall.c:11
> 

got some whitespace issues:

Applying: perf, tools: fix BFD detection on opensuse
Applying: perf, tools: Support handling complete branch stacks as histograms
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:85: space before tab in indent.
                         * We cannot use the header.misc hint to determine whether a
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:86: space before tab in indent.
                         * branch stack address is user, kernel, guest, hypervisor.
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:87: space before tab in indent.
                         * Branches may straddle the kernel/user/hypervisor boundaries.
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:88: space before tab in indent.
                         * Thus, we have to try consecutively until we find a match
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:89: space before tab in indent.
                         * or else, the symbol is unknown
warning: squelched 8 whitespace errors
warning: 13 lines add whitespace errors.
Applying: perf, tools: Add --branch-history option to report v2
Applying: perf, tools: Filter out small loops from LBR-as-call-stack
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:51: trailing whitespace.
                                memmove(l + i, l + i + off, 
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:70: trailing whitespace, space before tab in indent.
         *   we may overlap with the real callstack. 
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:84: trailing whitespace.
                for (i = 0; i < nr; i++) { 
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:98: trailing whitespace.
                        err = add_callchain_ip(machine, thread, parent, 
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:104: trailing whitespace.
                                err = add_callchain_ip(machine, thread, 
warning: 5 lines add whitespace errors.
Applying: perf, tools: Enable printing the srcline in the history
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:21: trailing whitespace.
                if (callchain_param.key == CCKEY_ADDRESS && 
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:27: trailing whitespace.
                        printed = scnprintf(bf, bfsize, "%s %s", 
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:47: trailing whitespace.
                if (callchain_param.key == CCKEY_ADDRESS && 
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:54: trailing whitespace.
                        ret += fprintf(fp, "%s %s\n", 
warning: 4 lines add whitespace errors.
Applying: perf, tools: Fix max stack handling with lbr-as-callgraph
Applying: perf, tools: Add overlap detection for report branch-call-stack mode
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:39: trailing whitespace.
                                /* 
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:42: trailing whitespace.
                                 * the branch entry. To adjust for this 
warning: 2 lines add whitespace errors.
Applying: perf, tools: Only print base source file for srcline
Applying: perf, tools: Support source line numbers in annotate
/home/jolsa/kernel.org/linux-perf/.git/rebase-apply/patch:24: trailing whitespace.
                if (dl->line_nr && annotate_browser__opts.show_linenr) 
warning: 1 line adds whitespace errors.


* Re: [PATCH 1/9] perf, tools: fix BFD detection on opensuse
  2014-01-11 19:42 ` [PATCH 1/9] perf, tools: fix BFD detection on opensuse Andi Kleen
  2014-01-12 15:16   ` Jiri Olsa
@ 2014-01-12 15:40   ` David Ahern
  2014-03-02  8:57   ` [tip:perf/urgent] perf " tip-bot for Andi Kleen
  2 siblings, 0 replies; 15+ messages in thread
From: David Ahern @ 2014-01-12 15:40 UTC (permalink / raw)
  To: Andi Kleen, acme
  Cc: jolsa, namhyung, mingo, fweisbec, adrian.hunter, linux-kernel,
	Andi Kleen

On 1/11/14, 12:42 PM, Andi Kleen wrote:
> From: Andi Kleen<ak@linux.intel.com>
>
> opensuse libbfd requires -lz -liberty to build. Add those
> to the BFD feature detection.
>

Fixes static builds on Fedora as well.

David


* Re: [PATCH 1/9] perf, tools: fix BFD detection on opensuse
  2014-01-12 15:16   ` Jiri Olsa
@ 2014-01-13  9:03     ` Namhyung Kim
  0 siblings, 0 replies; 15+ messages in thread
From: Namhyung Kim @ 2014-01-13  9:03 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, acme, mingo, dsahern, fweisbec, adrian.hunter,
	linux-kernel, Andi Kleen

On Sun, 12 Jan 2014 16:16:15 +0100, Jiri Olsa wrote:
> On Sat, Jan 11, 2014 at 11:42:51AM -0800, Andi Kleen wrote:
>> From: Andi Kleen <ak@linux.intel.com>
>> 
>> opensuse libbfd requires -lz -liberty to build. Add those
>> to the BFD feature detection.
>> 
>> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>> ---
>>  tools/perf/config/Makefile                | 2 +-
>>  tools/perf/config/feature-checks/Makefile | 2 +-
>>  2 files changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
>> index 01dd43d..d86d33c 100644
>> --- a/tools/perf/config/Makefile
>> +++ b/tools/perf/config/Makefile
>> @@ -478,7 +478,7 @@ else
>>  endif
>>  
>>  ifeq ($(feature-libbfd), 1)
>> -  EXTLIBS += -lbfd
>> +  EXTLIBS += -lbfd -lz -liberty
>>  endif
>
> OK, Fedora uses a linker script in place of libbfd.so, which
> adds those anyway.
>
> ---
> [jolsa@krava perf]$ cat /usr/lib64/libbfd.so 
> /* GNU ld script */
>
> /* Ensure this .so library will not be used by a link for a different format
>    on a multi-architecture system.  */
> OUTPUT_FORMAT(elf64-x86-64)
>
> /* The libz dependency is unexpected by legacy build scripts.  */
> /* The libdl dependency is for plugin support.  (BZ 889134)  */
> INPUT ( /usr/lib64/libbfd.a -liberty -lz -ldl )
> ---
>
> We also need to check, and probably get rid of, the follow-up settings
> of EXTLIBS, which seem useless now:
>
> ...
>     ifneq ($(feature-libbfd), 1)
>       $(call feature_check,liberty)
>       ifeq ($(feature-liberty), 1)
>         EXTLIBS += -lbfd -liberty
>       else
>         $(call feature_check,liberty-z)
>         ifeq ($(feature-liberty-z), 1)
>           EXTLIBS += -lbfd -liberty -lz
> ...

Agreed.  I think it's only for keeping dependency minimal.  But no need
to do it if it's always called with -liberty and -lz.

Thanks,
Namhyung

* [tip:perf/urgent] perf tools: fix BFD detection on opensuse
  2014-01-11 19:42 ` [PATCH 1/9] perf, tools: fix BFD detection on opensuse Andi Kleen
  2014-01-12 15:16   ` Jiri Olsa
  2014-01-12 15:40   ` David Ahern
@ 2014-03-02  8:57   ` tip-bot for Andi Kleen
  2 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Andi Kleen @ 2014-03-02  8:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, hpa, mingo, namhyung, jolsa, fweisbec, ak,
	dsahern, adrian.hunter, tglx

Commit-ID:  280e7c48c3b873e4987a63da276ecab25383f494
Gitweb:     http://git.kernel.org/tip/280e7c48c3b873e4987a63da276ecab25383f494
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Sat, 11 Jan 2014 11:42:51 -0800
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 27 Feb 2014 18:29:08 -0300

perf tools: fix BFD detection on opensuse

opensuse libbfd requires -lz -liberty to build. Add those to the BFD
feature detection.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1389469379-13340-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/config/Makefile                | 2 +-
 tools/perf/config/feature-checks/Makefile | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index c48d449..0331ea2 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -478,7 +478,7 @@ else
 endif
 
 ifeq ($(feature-libbfd), 1)
-  EXTLIBS += -lbfd
+  EXTLIBS += -lbfd -lz -liberty
 endif
 
 ifdef NO_DEMANGLE
diff --git a/tools/perf/config/feature-checks/Makefile b/tools/perf/config/feature-checks/Makefile
index 12e5513..523b7bc 100644
--- a/tools/perf/config/feature-checks/Makefile
+++ b/tools/perf/config/feature-checks/Makefile
@@ -121,7 +121,7 @@ test-libpython-version.bin:
 	$(BUILD) $(FLAGS_PYTHON_EMBED)
 
 test-libbfd.bin:
-	$(BUILD) -DPACKAGE='"perf"' -lbfd -ldl
+	$(BUILD) -DPACKAGE='"perf"' -lbfd -lz -liberty -ldl
 
 test-liberty.bin:
 	$(CC) -o $(OUTPUT)$@ test-libbfd.c -DPACKAGE='"perf"' -lbfd -ldl -liberty


Thread overview: 15+ messages
2014-01-11 19:42 perf: Implement lbr-as-callgraph v2 Andi Kleen
2014-01-11 19:42 ` [PATCH 1/9] perf, tools: fix BFD detection on opensuse Andi Kleen
2014-01-12 15:16   ` Jiri Olsa
2014-01-13  9:03     ` Namhyung Kim
2014-01-12 15:40   ` David Ahern
2014-03-02  8:57   ` [tip:perf/urgent] perf " tip-bot for Andi Kleen
2014-01-11 19:42 ` [PATCH 2/9] perf, tools: Support handling complete branch stacks as histograms Andi Kleen
2014-01-11 19:42 ` [PATCH 3/9] perf, tools: Add --branch-history option to report v2 Andi Kleen
2014-01-11 19:42 ` [PATCH 4/9] perf, tools: Filter out small loops from LBR-as-call-stack Andi Kleen
2014-01-11 19:42 ` [PATCH 5/9] perf, tools: Enable printing the srcline in the history Andi Kleen
2014-01-11 19:42 ` [PATCH 6/9] perf, tools: Fix max stack handling with lbr-as-callgraph Andi Kleen
2014-01-11 19:42 ` [PATCH 7/9] perf, tools: Add overlap detection for report branch-call-stack mode Andi Kleen
2014-01-11 19:42 ` [PATCH 8/9] perf, tools: Only print base source file for srcline Andi Kleen
2014-01-11 19:42 ` [PATCH 9/9] perf, tools: Support source line numbers in annotate Andi Kleen
2014-01-12 15:16 ` perf: Implement lbr-as-callgraph v2 Jiri Olsa
