linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view
@ 2016-10-19 22:01 Jin Yao
  2016-10-19 22:01 ` [PATCH v2 1/6] perf report: Add branch flag to callchain cursor node Jin Yao
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Jin Yao @ 2016-10-19 22:01 UTC (permalink / raw)
  To: acme, jolsa; +Cc: Linux-kernel, ak, kan.liang, yao.jin

v2: Just a rebase to Arnaldo's perf/core branch, no functional changes.

Initial post

perf record -g -b ...
perf report --branch-history

Currently the callgraph view only shows the branches from the LBR.
It would be useful to also annotate it with branch predictions, TSX
aborts and timed LBR cycles.

This gives a quick overview of how well branches are predicted and
how costly basic blocks are.

For example:

Overhead  Source:Line                                   Symbol     Shared Object   Predicted  Abort  Cycles
........  ............................................  .........  ..............  .........  .....  ......

  38.25%  div.c:45                                      [.] main   div             97.6%      0.0%   3
          |
          ---main div.c:42 (cycles:2)
             compute_flag div.c:28 (cycles:2)
             compute_flag div.c:27 (cycles:1)
             rand rand.c:28 (cycles:1)
             rand rand.c:28 (cycles:1)
             __random random.c:298 (cycles:1)
             __random random.c:297 (cycles:1)
             __random random.c:295 (cycles:1)
             __random random.c:295 (cycles:1)
             __random random.c:295 (cycles:1)
             __random random.c:295 (cycles:9)
             |
             |--36.73%--__random_r random_r.c:392 (cycles:9)
             |          __random_r random_r.c:357 (cycles:1)
             |          __random random.c:293 (cycles:1)
             |          __random random.c:293 (cycles:1)
             |          __random random.c:291 (cycles:1)
             |          __random random.c:291 (cycles:1)
             |          __random random.c:291 (cycles:1)
             |          __random random.c:288 (cycles:1)
             |          rand rand.c:27 (cycles:1)
             |          rand rand.c:26 (cycles:1)
             |          rand@plt +4194304 (cycles:1)
             |          rand@plt +4194304 (cycles:1)
             |          compute_flag div.c:25 (cycles:1)
             |          compute_flag div.c:22 (cycles:1)
             |          main div.c:40 (cycles:1)
             |          main div.c:40 (cycles:16)
             |          main div.c:39 (cycles:16)
             |          |
             |          |--29.93%--main div.c:39 (predicted:50.6%, cycles:1)
             |          |          main div.c:44 (predicted:50.6%, cycles:1)
             |          |          |
             |          |           --22.69%--main div.c:42 (cycles:2)

Predicted is hidden in a callchain entry if the branch is 100% predicted.
Abort is hidden in a callchain entry if the branch has no aborts.
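
The hiding rule above can be sketched in isolation as follows. This is
only an illustration: demo_format_branch_note() and its parameters are
hypothetical names, not part of perf, and entries without branch data are
simply left unannotated here, which is a simplification of this sketch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of perf: build the "(...)" annotation
 * for one callchain entry from its raw counters.
 * Returns the length of the formatted string, as snprintf() does.
 */
static int demo_format_branch_note(char *bf, size_t size, uint64_t branches,
				   uint64_t predicted, uint64_t aborts,
				   uint64_t cycles_sum)
{
	double predicted_pct, abort_pct;
	uint64_t cycles;

	assert(bf != NULL && size > 0);

	/* Not a branch entry: nothing to annotate in this sketch. */
	if (branches == 0) {
		bf[0] = '\0';
		return 0;
	}

	predicted_pct = predicted * 100.0 / branches;
	abort_pct = aborts * 100.0 / branches;
	cycles = cycles_sum / branches;	/* average cycles per branch */

	/* At least one abort: show all three values. */
	if (abort_pct > 0.0)
		return snprintf(bf, size,
				" (predicted:%.1f%%, abort:%.1f%%, cycles:%" PRIu64 ")",
				predicted_pct, abort_pct, cycles);

	/* No abort but some mispredictions: hide only "abort". */
	if (predicted_pct < 100.0)
		return snprintf(bf, size,
				" (predicted:%.1f%%, cycles:%" PRIu64 ")",
				predicted_pct, cycles);

	/* Fully predicted and never aborted: cycles only. */
	return snprintf(bf, size, " (cycles:%" PRIu64 ")", cycles);
}
```

With 1000 branches of which 506 were predicted and none aborted, the
sketch yields " (predicted:50.6%, cycles:1)" for a cycle sum of 1000,
matching the shape of the annotated entries in the example output.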

Now stdio and browser modes are both supported.

Jin Yao (6):
  perf report: Add branch flag to callchain cursor node
  perf report: Calculate and return the branch counting in callchain
  perf report: Create a symbol_conf flag for showing branch flag
    counting
  perf report: Show branch info in callchain entry for stdio mode
  perf report: Show branch info in callchain entry for browser mode
  perf report: Display columns Predicted/Abort/Cycles in
    --branch-history

 tools/perf/Documentation/perf-report.txt |   8 ++
 tools/perf/builtin-report.c              |   9 +-
 tools/perf/ui/browsers/hists.c           |  15 ++-
 tools/perf/ui/stdio/hist.c               |  30 +++++-
 tools/perf/util/callchain.c              | 176 ++++++++++++++++++++++++++++++-
 tools/perf/util/callchain.h              |  16 ++-
 tools/perf/util/hist.c                   |   3 +
 tools/perf/util/hist.h                   |   3 +
 tools/perf/util/machine.c                |  56 +++++++---
 tools/perf/util/sort.c                   | 117 +++++++++++++++++++-
 tools/perf/util/sort.h                   |   3 +
 tools/perf/util/symbol.h                 |   1 +
 12 files changed, 411 insertions(+), 26 deletions(-)

-- 
2.7.4


* [PATCH v2 1/6] perf report: Add branch flag to callchain cursor node
  2016-10-19 22:01 [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jin Yao
@ 2016-10-19 22:01 ` Jin Yao
  2016-10-19 22:01 ` [PATCH v2 2/6] perf report: Calculate and return the branch counting in callchain Jin Yao
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Jin Yao @ 2016-10-19 22:01 UTC (permalink / raw)
  To: acme, jolsa; +Cc: Linux-kernel, ak, kan.liang, yao.jin

Since the branch ip has been added to the call stack for easier browsing,
this patch adds more branch information: a flag that indicates whether
the ip is a branch, and the branch flags themselves.

With this we can tell whether a cursor node represents a branch and
which branch flags it carries.
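
As a rough standalone illustration of the idea (not the perf code itself),
the sketch below uses hypothetical demo_* types in place of perf's
branch_flags and callchain_cursor_node. Unlike the patch, it clears the
flags when none are passed; that is a choice made for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct branch_flags. */
struct demo_branch_flags {
	bool predicted;
	bool abort;
	uint16_t cycles;
};

/* Hypothetical stand-in for a callchain cursor node. */
struct demo_cursor_node {
	uint64_t ip;
	bool branch;			/* does this ip come from a branch entry? */
	struct demo_branch_flags flags;	/* only meaningful when branch is true */
};

static void demo_node_set(struct demo_cursor_node *node, uint64_t ip,
			  bool branch, const struct demo_branch_flags *flags)
{
	assert(node != NULL);

	node->ip = ip;
	node->branch = branch;

	/* Plain call-stack entries pass NULL; drop any stale flags then. */
	if (flags)
		node->flags = *flags;
	else
		memset(&node->flags, 0, sizeof(node->flags));
}
```

A caller walking LBR entries would pass branch = true together with the
entry's flags, while the regular callchain and unwind paths would pass
false and NULL.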

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/callchain.c | 11 +++++++--
 tools/perf/util/callchain.h |  5 +++-
 tools/perf/util/machine.c   | 56 +++++++++++++++++++++++++++++++++------------
 3 files changed, 55 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 07fd30b..342ef20 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -730,7 +730,8 @@ merge_chain_branch(struct callchain_cursor *cursor,
 
 	list_for_each_entry_safe(list, next_list, &src->val, list) {
 		callchain_cursor_append(cursor, list->ip,
-					list->ms.map, list->ms.sym);
+					list->ms.map, list->ms.sym,
+					false, NULL);
 		list_del(&list->list);
 		free(list);
 	}
@@ -767,7 +768,8 @@ int callchain_merge(struct callchain_cursor *cursor,
 }
 
 int callchain_cursor_append(struct callchain_cursor *cursor,
-			    u64 ip, struct map *map, struct symbol *sym)
+			    u64 ip, struct map *map, struct symbol *sym,
+			    bool branch, struct branch_flags *flags)
 {
 	struct callchain_cursor_node *node = *cursor->last;
 
@@ -782,6 +784,11 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
 	node->ip = ip;
 	node->map = map;
 	node->sym = sym;
+	node->branch = branch;
+
+	if (flags)
+		memcpy(&node->branch_flags, flags,
+			sizeof(struct branch_flags));
 
 	cursor->nr++;
 
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 13e7554..40ecf25 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -129,6 +129,8 @@ struct callchain_cursor_node {
 	u64				ip;
 	struct map			*map;
 	struct symbol			*sym;
+	bool				branch;
+	struct branch_flags		branch_flags;
 	struct callchain_cursor_node	*next;
 };
 
@@ -183,7 +185,8 @@ static inline void callchain_cursor_reset(struct callchain_cursor *cursor)
 }
 
 int callchain_cursor_append(struct callchain_cursor *cursor, u64 ip,
-			    struct map *map, struct symbol *sym);
+			    struct map *map, struct symbol *sym,
+			    bool branch, struct branch_flags *flags);
 
 /* Close a cursor writing session. Initialize for the reader */
 static inline void callchain_cursor_commit(struct callchain_cursor *cursor)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index df85b9e..c2d9d9f 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1616,7 +1616,9 @@ static int add_callchain_ip(struct thread *thread,
 			    struct symbol **parent,
 			    struct addr_location *root_al,
 			    u8 *cpumode,
-			    u64 ip)
+			    u64 ip,
+			    bool branch,
+			    struct branch_flags *flags)
 {
 	struct addr_location al;
 
@@ -1668,7 +1670,8 @@ static int add_callchain_ip(struct thread *thread,
 
 	if (symbol_conf.hide_unresolved && al.sym == NULL)
 		return 0;
-	return callchain_cursor_append(cursor, al.addr, al.map, al.sym);
+	return callchain_cursor_append(cursor, al.addr, al.map, al.sym,
+				       branch, flags);
 }
 
 struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
@@ -1757,7 +1760,9 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 	/* LBR only affects the user callchain */
 	if (i != chain_nr) {
 		struct branch_stack *lbr_stack = sample->branch_stack;
-		int lbr_nr = lbr_stack->nr, j;
+		int lbr_nr = lbr_stack->nr, j, k;
+		bool branch;
+		struct branch_flags *flags;
 		/*
 		 * LBR callstack can only get user call chain.
 		 * The mix_chain_nr is kernel call chain
@@ -1772,23 +1777,41 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 
 		for (j = 0; j < mix_chain_nr; j++) {
 			int err;
+			branch = false;
+			flags = NULL;
+
 			if (callchain_param.order == ORDER_CALLEE) {
 				if (j < i + 1)
 					ip = chain->ips[j];
-				else if (j > i + 1)
-					ip = lbr_stack->entries[j - i - 2].from;
-				else
+				else if (j > i + 1) {
+					k = j - i - 2;
+					ip = lbr_stack->entries[k].from;
+					branch = true;
+					flags = &lbr_stack->entries[k].flags;
+				} else {
 					ip = lbr_stack->entries[0].to;
+					branch = true;
+					flags = &lbr_stack->entries[0].flags;
+				}
 			} else {
-				if (j < lbr_nr)
-					ip = lbr_stack->entries[lbr_nr - j - 1].from;
+				if (j < lbr_nr) {
+					k = lbr_nr - j - 1;
+					ip = lbr_stack->entries[k].from;
+					branch = true;
+					flags = &lbr_stack->entries[k].flags;
+				}
 				else if (j > lbr_nr)
 					ip = chain->ips[i + 1 - (j - lbr_nr)];
-				else
+				else {
 					ip = lbr_stack->entries[0].to;
+					branch = true;
+					flags = &lbr_stack->entries[0].flags;
+				}
 			}
 
-			err = add_callchain_ip(thread, cursor, parent, root_al, &cpumode, ip);
+			err = add_callchain_ip(thread, cursor, parent,
+					       root_al, &cpumode, ip,
+					       branch, flags);
 			if (err)
 				return (err < 0) ? err : 0;
 		}
@@ -1872,10 +1895,12 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 
 		for (i = 0; i < nr; i++) {
 			err = add_callchain_ip(thread, cursor, parent, root_al,
-					       NULL, be[i].to);
+					       NULL, be[i].to,
+					       true, &be[i].flags);
 			if (!err)
 				err = add_callchain_ip(thread, cursor, parent, root_al,
-						       NULL, be[i].from);
+						       NULL, be[i].from,
+						       true, &be[i].flags);
 			if (err == -EINVAL)
 				break;
 			if (err)
@@ -1903,7 +1928,9 @@ check_calls:
 		if (ip < PERF_CONTEXT_MAX)
                        ++nr_entries;
 
-		err = add_callchain_ip(thread, cursor, parent, root_al, &cpumode, ip);
+		err = add_callchain_ip(thread, cursor, parent,
+				       root_al, &cpumode, ip,
+				       false, NULL);
 
 		if (err)
 			return (err < 0) ? err : 0;
@@ -1919,7 +1946,8 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 	if (symbol_conf.hide_unresolved && entry->sym == NULL)
 		return 0;
 	return callchain_cursor_append(cursor, entry->ip,
-				       entry->map, entry->sym);
+				       entry->map, entry->sym,
+				       false, NULL);
 }
 
 static int thread__resolve_callchain_unwind(struct thread *thread,
-- 
2.7.4


* [PATCH v2 2/6] perf report: Calculate and return the branch counting in callchain
  2016-10-19 22:01 [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jin Yao
  2016-10-19 22:01 ` [PATCH v2 1/6] perf report: Add branch flag to callchain cursor node Jin Yao
@ 2016-10-19 22:01 ` Jin Yao
  2016-10-20 16:41   ` Nilay Vaish
  2016-10-19 22:01 ` [PATCH v2 3/6] perf report: Create a symbol_conf flag for showing branch flag counting Jin Yao
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Jin Yao @ 2016-10-19 22:01 UTC (permalink / raw)
  To: acme, jolsa; +Cc: Linux-kernel, ak, kan.liang, yao.jin

Add branch counters to each callchain list entry. Each counter
corresponds to a branch flag. For example, predicted_count counts all
the *predicted* branches. The counters are updated while processing the
callchain cursor nodes.

Also provide functions to retrieve or print the counter values of a
callchain list entry.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/callchain.c | 165 +++++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/callchain.h |  11 +++
 2 files changed, 175 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 342ef20..8937a2c 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -440,6 +440,19 @@ fill_node(struct callchain_node *node, struct callchain_cursor *cursor)
 		call->ip = cursor_node->ip;
 		call->ms.sym = cursor_node->sym;
 		call->ms.map = cursor_node->map;
+
+		if (cursor_node->branch) {
+			call->branch_count = 1;
+
+			if (cursor_node->branch_flags.predicted)
+				call->predicted_count = 1;
+
+			if (cursor_node->branch_flags.abort)
+				call->abort_count = 1;
+
+			call->cycles_count = cursor_node->branch_flags.cycles;
+		}
+
 		list_add_tail(&call->list, &node->val);
 
 		callchain_cursor_advance(cursor);
@@ -499,8 +512,21 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
 		right = node->ip;
 	}
 
-	if (left == right)
+	if (left == right) {
+		if (node->branch) {
+			cnode->branch_count++;
+
+			if (node->branch_flags.predicted)
+				cnode->predicted_count++;
+
+			if (node->branch_flags.abort)
+				cnode->abort_count++;
+
+			cnode->cycles_count += node->branch_flags.cycles;
+		}
+
 		return MATCH_EQ;
+	}
 
 	return left > right ? MATCH_GT : MATCH_LT;
 }
@@ -946,6 +972,143 @@ int callchain_node__fprintf_value(struct callchain_node *node,
 	return 0;
 }
 
+static void callchain_counts_value(struct callchain_node *node,
+				   u64 *branch_count, u64 *predicted_count,
+				   u64 *abort_count, u64 *cycles_count)
+{
+	struct callchain_list *clist;
+
+	list_for_each_entry(clist, &node->val, list) {
+		if (branch_count)
+			*branch_count += clist->branch_count;
+
+		if (predicted_count)
+			*predicted_count += clist->predicted_count;
+
+		if (abort_count)
+			*abort_count += clist->abort_count;
+
+		if (cycles_count)
+			*cycles_count += clist->cycles_count;
+	}
+}
+
+static int callchain_node_branch_counts_cumul(struct callchain_node *node,
+					      u64 *branch_count,
+					      u64 *predicted_count,
+					      u64 *abort_count,
+					      u64 *cycles_count)
+{
+	struct callchain_node *child;
+	struct rb_node *n;
+
+	n = rb_first(&node->rb_root_in);
+	while (n) {
+		child = rb_entry(n, struct callchain_node, rb_node_in);
+		n = rb_next(n);
+
+		callchain_node_branch_counts_cumul(child, branch_count,
+						   predicted_count,
+						   abort_count,
+						   cycles_count);
+
+		callchain_counts_value(child, branch_count,
+				       predicted_count, abort_count,
+				       cycles_count);
+	}
+
+	return 0;
+}
+
+int callchain_branch_counts(struct callchain_root *root,
+			    u64 *branch_count, u64 *predicted_count,
+			    u64 *abort_count, u64 *cycles_count)
+{
+	if (branch_count)
+		*branch_count = 0;
+
+	if (predicted_count)
+		*predicted_count = 0;
+
+	if (abort_count)
+		*abort_count = 0;
+
+	if (cycles_count)
+		*cycles_count = 0;
+
+	return callchain_node_branch_counts_cumul(&root->node,
+						  branch_count,
+						  predicted_count,
+						  abort_count,
+						  cycles_count);
+}
+
+static int callchain_counts_printf(FILE *fp, char *bf, int bfsize,
+				   u64 branch_count, u64 predicted_count,
+				   u64 abort_count, u64 cycles_count,
+				   const char *cumul_str)
+{
+	double predicted_percent = 0.0;
+	double abort_percent = 0.0;
+	u64 cycles = 0;
+
+	if (branch_count == 0) {
+		if (fp)
+			return fprintf(fp, " (calltrace)");
+
+		return scnprintf(bf, bfsize, " (calltrace)");
+	}
+
+	predicted_percent = predicted_count * 100.0 / branch_count;
+	abort_percent = abort_count * 100.0 / branch_count;
+	cycles = cycles_count / branch_count;
+
+	if ((predicted_percent >= 100.0) && (abort_percent <= 0.0)) {
+		if (fp)
+			return fprintf(fp, " (%scycles:%" PRId64 ")",
+				       cumul_str, cycles);
+
+		return scnprintf(bf, bfsize, " (%scycles:%" PRId64 ")",
+				 cumul_str, cycles);
+	}
+
+	if ((predicted_percent < 100.0) && (abort_percent <= 0.0)) {
+		if (fp)
+			return fprintf(fp,
+				" (%spredicted:%.1f%%, cycles:%" PRId64 ")",
+				cumul_str, predicted_percent, cycles);
+
+		return scnprintf(bf, bfsize,
+			" (%spredicted:%.1f%%, cycles:%" PRId64 ")",
+			cumul_str, predicted_percent, cycles);
+	}
+
+	if (fp)
+		return fprintf(fp,
+		" (%spredicted:%.1f%%, abort:%.1f%%, cycles:%" PRId64 ")",
+			cumul_str, predicted_percent, abort_percent, cycles);
+
+	return scnprintf(bf, bfsize,
+		" (%spredicted:%.1f%%, abort:%.1f%%, cycles:%" PRId64 ")",
+		cumul_str, predicted_percent, abort_percent, cycles);
+}
+
+int callchain_list_counts__printf_value(struct callchain_list *clist,
+					FILE *fp, char *bf, int bfsize)
+{
+	u64 branch_count, predicted_count;
+	u64 abort_count, cycles_count;
+
+	branch_count = clist->branch_count;
+	predicted_count = clist->predicted_count;
+	abort_count = clist->abort_count;
+	cycles_count = clist->cycles_count;
+
+	return callchain_counts_printf(fp, bf, bfsize, branch_count,
+				       predicted_count, abort_count,
+				       cycles_count, "");
+}
+
 static void free_callchain_node(struct callchain_node *node)
 {
 	struct callchain_list *list, *tmp;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 40ecf25..4f6bf6c 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -115,6 +115,10 @@ struct callchain_list {
 		bool		unfolded;
 		bool		has_children;
 	};
+	u64			branch_count;
+	u64			predicted_count;
+	u64			abort_count;
+	u64			cycles_count;
 	char		       *srcline;
 	struct list_head	list;
 };
@@ -264,8 +268,15 @@ char *callchain_node__scnprintf_value(struct callchain_node *node,
 int callchain_node__fprintf_value(struct callchain_node *node,
 				  FILE *fp, u64 total);
 
+int callchain_list_counts__printf_value(struct callchain_list *clist,
+					FILE *fp, char *bf, int bfsize);
+
 void free_callchain(struct callchain_root *root);
 void decay_callchain(struct callchain_root *root);
 int callchain_node__make_parent_list(struct callchain_node *node);
 
+int callchain_branch_counts(struct callchain_root *root,
+			    u64 *branch_count, u64 *predicted_count,
+			    u64 *abort_count, u64 *cycles_count);
+
 #endif	/* __PERF_CALLCHAIN_H */
-- 
2.7.4


* [PATCH v2 3/6] perf report: Create a symbol_conf flag for showing branch flag counting
  2016-10-19 22:01 [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jin Yao
  2016-10-19 22:01 ` [PATCH v2 1/6] perf report: Add branch flag to callchain cursor node Jin Yao
  2016-10-19 22:01 ` [PATCH v2 2/6] perf report: Calculate and return the branch counting in callchain Jin Yao
@ 2016-10-19 22:01 ` Jin Yao
  2016-10-19 22:01 ` [PATCH v2 4/6] perf report: Show branch info in callchain entry for stdio mode Jin Yao
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Jin Yao @ 2016-10-19 22:01 UTC (permalink / raw)
  To: acme, jolsa; +Cc: Linux-kernel, ak, kan.liang, yao.jin

Create a new flag show_branchflag_count in symbol_conf. The flag
controls whether the branch flag counting information is shown. It is
set when perf.data has branch data and the user passes the
"--branch-history" option on the perf report command line.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-report.c | 3 +++
 tools/perf/util/symbol.h    | 1 +
 2 files changed, 4 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6e88460..c406393 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -905,6 +905,9 @@ repeat:
 	if (itrace_synth_opts.last_branch)
 		has_br_stack = true;
 
+	if (has_br_stack && branch_call_mode)
+		symbol_conf.show_branchflag_count = true;
+
 	/*
 	 * Branch mode is a tristate:
 	 * -1 means default, so decide based on the file having branch data.
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index d964844..2d0a905 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -100,6 +100,7 @@ struct symbol_conf {
 			show_total_period,
 			use_callchain,
 			cumulate_callchain,
+			show_branchflag_count,
 			exclude_other,
 			show_cpu_utilization,
 			initialized,
-- 
2.7.4


* [PATCH v2 4/6] perf report: Show branch info in callchain entry for stdio mode
  2016-10-19 22:01 [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jin Yao
                   ` (2 preceding siblings ...)
  2016-10-19 22:01 ` [PATCH v2 3/6] perf report: Create a symbol_conf flag for showing branch flag counting Jin Yao
@ 2016-10-19 22:01 ` Jin Yao
  2016-10-19 22:01 ` [PATCH v2 5/6] perf report: Show branch info in callchain entry for browser mode Jin Yao
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Jin Yao @ 2016-10-19 22:01 UTC (permalink / raw)
  To: acme, jolsa; +Cc: Linux-kernel, ak, kan.liang, yao.jin

If the branch is 100% predicted, "predicted" is hidden.
Similarly, if there is no TSX abort for the branch, "abort" is hidden.
In that case only cycles is shown (cycles is supported on Skylake;
on older platforms it is 0).

One example is:

|--36.73%--__random_r random_r.c:392 (cycles:9)
|          __random_r random_r.c:357 (cycles:1)
|          __random random.c:293 (cycles:1)
|          __random random.c:293 (cycles:1)
|          __random random.c:291 (cycles:1)
|          __random random.c:291 (cycles:1)
|          __random random.c:291 (cycles:1)
|          __random random.c:288 (cycles:1)
|          rand rand.c:27 (cycles:1)
|          rand rand.c:26 (cycles:1)
|          rand@plt +4194304 (cycles:1)
|          rand@plt +4194304 (cycles:1)
|          compute_flag div.c:25 (cycles:1)
|          compute_flag div.c:22 (cycles:1)
|          main div.c:40 (cycles:1)
|          main div.c:40 (cycles:16)
|          main div.c:39 (cycles:16)
|          |
|          |--29.93%--main div.c:39 (predicted:50.6%, cycles:1)
|          |          main div.c:44 (predicted:50.6%, cycles:1)

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/ui/stdio/hist.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 89d8441..57e1f6f 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -41,7 +41,9 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
 {
 	int i;
 	size_t ret = 0;
-	char bf[1024];
+	char bf[1024], *alloc_str = NULL;
+	char buf[64];
+	const char *str;
 
 	ret += callchain__fprintf_left_margin(fp, left_margin);
 	for (i = 0; i < depth; i++) {
@@ -56,8 +58,21 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
 		} else
 			ret += fprintf(fp, "%s", "          ");
 	}
-	fputs(callchain_list__sym_name(chain, bf, sizeof(bf), false), fp);
+
+	str = callchain_list__sym_name(chain, bf, sizeof(bf), false);
+
+	if (symbol_conf.show_branchflag_count) {
+		callchain_list_counts__printf_value(chain, NULL,
+						    buf, sizeof(buf));
+		if (asprintf(&alloc_str, "%s%s", str, buf) < 0)
+			str = "Not enough memory!";
+		else
+			str = alloc_str;
+	}
+
+	fputs(str, fp);
 	fputc('\n', fp);
+	free(alloc_str);
 	return ret;
 }
 
@@ -219,8 +234,15 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,
 			} else
 				ret += callchain__fprintf_left_margin(fp, left_margin);
 
-			ret += fprintf(fp, "%s\n", callchain_list__sym_name(chain, bf, sizeof(bf),
-							false));
+			ret += fprintf(fp, "%s",
+				       callchain_list__sym_name(chain, bf,
+								sizeof(bf),
+								false));
+
+			if (symbol_conf.show_branchflag_count)
+				ret += callchain_list_counts__printf_value(
+							chain, fp, NULL, 0);
+			ret += fprintf(fp, "\n");
 
 			if (++entries_printed == callchain_param.print_limit)
 				break;
-- 
2.7.4


* [PATCH v2 5/6] perf report: Show branch info in callchain entry for browser mode
  2016-10-19 22:01 [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jin Yao
                   ` (3 preceding siblings ...)
  2016-10-19 22:01 ` [PATCH v2 4/6] perf report: Show branch info in callchain entry for stdio mode Jin Yao
@ 2016-10-19 22:01 ` Jin Yao
  2016-10-19 22:01 ` [PATCH v2 6/6] perf report: Display columns Predicted/Abort/Cycles in --branch-history Jin Yao
  2016-10-23 14:10 ` [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jiri Olsa
  6 siblings, 0 replies; 14+ messages in thread
From: Jin Yao @ 2016-10-19 22:01 UTC (permalink / raw)
  To: acme, jolsa; +Cc: Linux-kernel, ak, kan.liang, yao.jin

If the branch is 100% predicted, "predicted" is hidden.
Similarly, if there is no TSX abort for the branch, "abort" is hidden.
In that case only cycles is shown (cycles is supported on Skylake;
on older platforms it is 0).

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/ui/browsers/hists.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index ddc4c3e..24d27c2 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -738,6 +738,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
 					     struct callchain_print_arg *arg)
 {
 	char bf[1024], *alloc_str;
+	char buf[64], *alloc_str2;
 	const char *str;
 
 	if (arg->row_offset != 0) {
@@ -746,12 +747,21 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
 	}
 
 	alloc_str = NULL;
+	alloc_str2 = NULL;
+
 	str = callchain_list__sym_name(chain, bf, sizeof(bf),
 				       browser->show_dso);
 
-	if (need_percent) {
-		char buf[64];
+	if (symbol_conf.show_branchflag_count) {
+		callchain_list_counts__printf_value(chain, NULL, buf,
+						    sizeof(buf));
+		if (asprintf(&alloc_str2, "%s%s", str, buf) < 0)
+			str = "Not enough memory!";
+		else
+			str = alloc_str2;
+	}
 
+	if (need_percent) {
 		callchain_node__scnprintf_value(node, buf, sizeof(buf),
 						total);
 
@@ -764,6 +774,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
 	print(browser, chain, str, offset, row, arg);
 
 	free(alloc_str);
+	free(alloc_str2);
 	return 1;
 }
 
-- 
2.7.4


* [PATCH v2 6/6] perf report: Display columns Predicted/Abort/Cycles in --branch-history
  2016-10-19 22:01 [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jin Yao
                   ` (4 preceding siblings ...)
  2016-10-19 22:01 ` [PATCH v2 5/6] perf report: Show branch info in callchain entry for browser mode Jin Yao
@ 2016-10-19 22:01 ` Jin Yao
  2016-10-23 14:10 ` [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jiri Olsa
  6 siblings, 0 replies; 14+ messages in thread
From: Jin Yao @ 2016-10-19 22:01 UTC (permalink / raw)
  To: acme, jolsa; +Cc: Linux-kernel, ak, kan.liang, yao.jin

Use the current sort mechanism, but have the real .se_cmp() just return 0,
so that the new columns "Predicted", "Abort" and "Cycles" are created in
the display without actually being used as sort keys.

For example:

Overhead  Source:Line   Symbol    Shared Object  Predicted  Abort  Cycles
........  ............  ........  .............  .........  .....  ......

  38.25%  div.c:45      [.] main  div            97.6%      0.0%   3
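
The effect of a comparator that always returns 0 can be illustrated with
a small standalone sketch. The demo_* names are hypothetical and this is
not how perf's hists code is structured; it only assumes that keys are
compared in order and that the first non-zero result wins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row with one real key and one display-only value. */
struct demo_row {
	int overhead;		/* real sort key */
	int predicted_pct;	/* shown as a column, never compared */
};

typedef int (*demo_cmp_fn)(const struct demo_row *a,
			   const struct demo_row *b);

static int demo_cmp_overhead(const struct demo_row *a,
			     const struct demo_row *b)
{
	return (a->overhead > b->overhead) - (a->overhead < b->overhead);
}

/* Display-only column: never distinguishes two rows. */
static int demo_cmp_display_only(const struct demo_row *a,
				 const struct demo_row *b)
{
	(void)a;
	(void)b;
	return 0;
}

/* Compare two rows key by key; the first non-zero result decides. */
static int demo_rows_cmp(const demo_cmp_fn *keys, size_t nr_keys,
			 const struct demo_row *a, const struct demo_row *b)
{
	size_t i;

	assert(keys != NULL);

	for (i = 0; i < nr_keys; i++) {
		int ret = keys[i](a, b);

		if (ret)
			return ret;
	}
	return 0;
}
```

Because the display-only comparator contributes nothing, two rows that
differ only in the displayed value compare as equal, and the ordering is
decided entirely by the remaining keys.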

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-report.txt |   8 +++
 tools/perf/builtin-report.c              |   6 +-
 tools/perf/util/hist.c                   |   3 +
 tools/perf/util/hist.h                   |   3 +
 tools/perf/util/sort.c                   | 117 ++++++++++++++++++++++++++++++-
 tools/perf/util/sort.h                   |   3 +
 6 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 2d17462..bb927cb 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -335,6 +335,14 @@ OPTIONS
 --branch-history::
 	Add the addresses of sampled taken branches to the callstack.
 	This allows to examine the path the program took to each sample.
+
+	Also show some branch flags, which can be:
+	- Predicted: display the average percentage of predicted branches.
+		     (predicted number / total number)
+	- Abort: display the average percentage of aborted branches.
+		 (abort number / total number)
+	- Cycles: cycles in basic block.
+
 	The data collection must have used -b (or -j) and -g.
 
 --objdump=<path>::
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index c406393..df83ea4 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -664,6 +664,10 @@ const char report_callchain_help[] = "Display call graph (stack chain/backtrace)
 				     CALLCHAIN_REPORT_HELP
 				     "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
 
+#define CALLCHAIN_BRANCH_SORT_ORDER	\
+	"srcline,symbol,dso,callchain_branch_predicted," \
+	"callchain_branch_abort,callchain_branch_cycles"
+
 int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 {
 	struct perf_session *session;
@@ -924,7 +928,7 @@ repeat:
 		symbol_conf.use_callchain = true;
 		callchain_register_param(&callchain_param);
 		if (sort_order == NULL)
-			sort_order = "srcline,symbol,dso";
+			sort_order = CALLCHAIN_BRANCH_SORT_ORDER;
 	}
 
 	if (report.mem_mode) {
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index e1be413..2470fff 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -176,6 +176,9 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
 	hists__new_col_len(hists, HISTC_MEM_LVL, 21 + 3);
 	hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
 	hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
+	hists__new_col_len(hists, HISTC_CALLCHAIN_BRANCH_PREDICTED, 9);
+	hists__new_col_len(hists, HISTC_CALLCHAIN_BRANCH_ABORT, 5);
+	hists__new_col_len(hists, HISTC_CALLCHAIN_BRANCH_CYCLES, 6);
 
 	if (h->srcline) {
 		len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index d4b6514..74e1dd4 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -57,6 +57,9 @@ enum hist_column {
 	HISTC_SRCLINE_FROM,
 	HISTC_SRCLINE_TO,
 	HISTC_TRACE,
+	HISTC_CALLCHAIN_BRANCH_PREDICTED,
+	HISTC_CALLCHAIN_BRANCH_ABORT,
+	HISTC_CALLCHAIN_BRANCH_CYCLES,
 	HISTC_NR_COLS, /* Last entry */
 };
 
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index df622f4..e47a984 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -435,6 +435,106 @@ struct sort_entry sort_srcline_to = {
 	.se_width_idx	= HISTC_SRCLINE_TO,
 };
 
+/* --sort callchain_branch_predicted */
+
+static int64_t
+sort__callchain_branch_predicted_cmp(struct hist_entry *left __maybe_unused,
+				     struct hist_entry *right __maybe_unused)
+{
+	return 0;
+}
+
+static int hist_entry__callchain_branch_predicted_snprintf(
+	struct hist_entry *he, char *bf, size_t size, unsigned int width)
+{
+	u64 branch_count, predicted_count;
+	double percent = 0.0;
+	char str[32];
+
+	callchain_branch_counts(he->callchain, &branch_count,
+				&predicted_count, NULL, NULL);
+
+	if (branch_count)
+		percent = predicted_count * 100.0 / branch_count;
+
+	snprintf(str, sizeof(str), "%.1f%%", percent);
+	return repsep_snprintf(bf, size, "%-*.*s", width, width, str);
+}
+
+struct sort_entry sort_callchain_branch_predicted = {
+	.se_header	= "Predicted",
+	.se_cmp		= sort__callchain_branch_predicted_cmp,
+	.se_snprintf	= hist_entry__callchain_branch_predicted_snprintf,
+	.se_width_idx	= HISTC_CALLCHAIN_BRANCH_PREDICTED,
+};
+
+/* --sort callchain_branch_abort */
+
+static int64_t
+sort__callchain_branch_abort_cmp(struct hist_entry *left __maybe_unused,
+				 struct hist_entry *right __maybe_unused)
+{
+	return 0;
+}
+
+static int hist_entry__callchain_branch_abort_snprintf(struct hist_entry *he,
+						       char *bf, size_t size,
+						       unsigned int width)
+{
+	u64 branch_count, abort_count;
+	double percent = 0.0;
+	char str[32];
+
+	callchain_branch_counts(he->callchain, &branch_count,
+				NULL, &abort_count, NULL);
+
+	if (branch_count)
+		percent = abort_count * 100.0 / branch_count;
+
+	snprintf(str, sizeof(str), "%.1f%%", percent);
+	return repsep_snprintf(bf, size, "%-*.*s", width, width, str);
+}
+
+struct sort_entry sort_callchain_branch_abort = {
+	.se_header	= "Abort",
+	.se_cmp		= sort__callchain_branch_abort_cmp,
+	.se_snprintf	= hist_entry__callchain_branch_abort_snprintf,
+	.se_width_idx	= HISTC_CALLCHAIN_BRANCH_ABORT,
+};
+
+/* --sort callchain_branch_cycles */
+
+static int64_t
+sort__callchain_branch_cycles_cmp(struct hist_entry *left __maybe_unused,
+				  struct hist_entry *right __maybe_unused)
+{
+	return 0;
+}
+
+static int hist_entry__callchain_branch_cycles_snprintf(struct hist_entry *he,
+							char *bf, size_t size,
+							unsigned int width)
+{
+	u64 branch_count, cycles_count, cycles = 0;
+	char str[32];
+
+	callchain_branch_counts(he->callchain, &branch_count,
+				NULL, NULL, &cycles_count);
+
+	if (branch_count)
+		cycles = cycles_count / branch_count;
+
+	snprintf(str, sizeof(str), "%" PRIu64, cycles);
+	return repsep_snprintf(bf, size, "%-*.*s", width, width, str);
+}
+
+struct sort_entry sort_callchain_branch_cycles = {
+	.se_header	= "Cycles",
+	.se_cmp		= sort__callchain_branch_cycles_cmp,
+	.se_snprintf	= hist_entry__callchain_branch_cycles_snprintf,
+	.se_width_idx	= HISTC_CALLCHAIN_BRANCH_CYCLES,
+};
+
 /* --sort srcfile */
 
 static char no_srcfile[1];
@@ -1435,6 +1535,15 @@ static struct sort_dimension bstack_sort_dimensions[] = {
 	DIM(SORT_CYCLES, "cycles", sort_cycles),
 	DIM(SORT_SRCLINE_FROM, "srcline_from", sort_srcline_from),
 	DIM(SORT_SRCLINE_TO, "srcline_to", sort_srcline_to),
+	DIM(SORT_CALLCHAIN_BRANCH_PREDICTED,
+		"callchain_branch_predicted",
+		sort_callchain_branch_predicted),
+	DIM(SORT_CALLCHAIN_BRANCH_ABORT,
+		"callchain_branch_abort",
+		sort_callchain_branch_abort),
+	DIM(SORT_CALLCHAIN_BRANCH_CYCLES,
+		"callchain_branch_cycles",
+		sort_callchain_branch_cycles),
 };
 
 #undef DIM
@@ -2369,7 +2478,13 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
 		if (strncasecmp(tok, sd->name, strlen(tok)))
 			continue;
 
-		if (sort__mode != SORT_MODE__BRANCH)
+		if ((sort__mode != SORT_MODE__BRANCH) &&
+			strncasecmp(tok, "callchain_branch_predicted",
+				    strlen(tok)) &&
+			strncasecmp(tok, "callchain_branch_abort",
+				    strlen(tok)) &&
+			strncasecmp(tok, "callchain_branch_cycles",
+				    strlen(tok)))
 			return -EINVAL;
 
 		if (sd->entry == &sort_sym_from || sd->entry == &sort_sym_to)
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 7aff317..30c6e97 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -224,6 +224,9 @@ enum sort_type {
 	SORT_CYCLES,
 	SORT_SRCLINE_FROM,
 	SORT_SRCLINE_TO,
+	SORT_CALLCHAIN_BRANCH_PREDICTED,
+	SORT_CALLCHAIN_BRANCH_ABORT,
+	SORT_CALLCHAIN_BRANCH_CYCLES,
 
 	/* memory mode specific sort keys */
 	__SORT_MEMORY_MODE,
-- 
2.7.4
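
For readers following the sort.c hunk above, the arithmetic behind the
Predicted and Cycles columns can be sketched outside of perf as below.
This is only an illustration: struct branch_totals and the format_*
helpers are hypothetical names, standing in for the totals that
callchain_branch_counts() fills in, and the sketch prints with PRIu64
because the values are unsigned.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the per-entry totals that
 * callchain_branch_counts() reports in the patch.
 */
struct branch_totals {
	uint64_t branch_count;
	uint64_t predicted_count;
	uint64_t cycles_count;
};

/* "Predicted" column: share of branches that were predicted, in percent. */
static int format_predicted(const struct branch_totals *t, char *buf,
			    size_t size)
{
	double percent = 0.0;

	if (t->branch_count)	/* guard against dividing by zero */
		percent = t->predicted_count * 100.0 / t->branch_count;

	return snprintf(buf, size, "%.1f%%", percent);
}

/* "Cycles" column: average cycles per branch, truncated by integer division. */
static int format_cycles(const struct branch_totals *t, char *buf,
			 size_t size)
{
	uint64_t cycles = 0;

	if (t->branch_count)
		cycles = t->cycles_count / t->branch_count;

	return snprintf(buf, size, "%" PRIu64, cycles);
}
```

With 1000 branches, 976 of them predicted and 3500 cycles in total, the
helpers produce "97.6%" and "3". An entry with no branches prints "0.0%"
and "0" rather than dividing by zero.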

* Re: [PATCH v2 2/6] perf report: Caculate and return the branch counting in callchain
  2016-10-19 22:01 ` [PATCH v2 2/6] perf report: Caculate and return the branch counting in callchain Jin Yao
@ 2016-10-20 16:41   ` Nilay Vaish
  2016-10-20 16:48     ` Andi Kleen
  0 siblings, 1 reply; 14+ messages in thread
From: Nilay Vaish @ 2016-10-20 16:41 UTC (permalink / raw)
  To: Jin Yao
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Linux Kernel list,
	Andi Kleen, kan.liang

On 19 October 2016 at 17:01, Jin Yao <yao.jin@linux.intel.com> wrote:
> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> index 40ecf25..4f6bf6c 100644
> --- a/tools/perf/util/callchain.h
> +++ b/tools/perf/util/callchain.h
> @@ -115,6 +115,10 @@ struct callchain_list {
>                 bool            unfolded;
>                 bool            has_children;
>         };
> +       u64                     branch_count;
> +       u64                     predicted_count;
> +       u64                     abort_count;

Can you explain what abort count is?  It seems you are referring to
mis-speculated branches.  If that is the case, I would prefer that we
replace abort by miss_speculated or miss_predicted.

--
Nilay

* Re: [PATCH v2 2/6] perf report: Caculate and return the branch counting in callchain
  2016-10-20 16:41   ` Nilay Vaish
@ 2016-10-20 16:48     ` Andi Kleen
  2016-10-20 17:06       ` Nilay Vaish
  0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2016-10-20 16:48 UTC (permalink / raw)
  To: Nilay Vaish
  Cc: Jin Yao, Arnaldo Carvalho de Melo, Jiri Olsa, Linux Kernel list,
	kan.liang

On Thu, Oct 20, 2016 at 11:41:11AM -0500, Nilay Vaish wrote:
> On 19 October 2016 at 17:01, Jin Yao <yao.jin@linux.intel.com> wrote:
> > diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> > index 40ecf25..4f6bf6c 100644
> > --- a/tools/perf/util/callchain.h
> > +++ b/tools/perf/util/callchain.h
> > @@ -115,6 +115,10 @@ struct callchain_list {
> >                 bool            unfolded;
> >                 bool            has_children;
> >         };
> > +       u64                     branch_count;
> > +       u64                     predicted_count;
> > +       u64                     abort_count;
> 
> Can you explain what abort count is?  It seems you are referring to
> mis-speculated branches.  If that is the case, I would prefer that we
> replace abort by miss_speculated or miss_predicted.

abort refers to TSX aborts. It has nothing to do with branch
mispredictions.

-Andi

* Re: [PATCH v2 2/6] perf report: Caculate and return the branch counting in callchain
  2016-10-20 16:48     ` Andi Kleen
@ 2016-10-20 17:06       ` Nilay Vaish
  2016-10-20 18:20         ` Andi Kleen
  0 siblings, 1 reply; 14+ messages in thread
From: Nilay Vaish @ 2016-10-20 17:06 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jin Yao, Arnaldo Carvalho de Melo, Jiri Olsa, Linux Kernel list,
	kan.liang

On 20 October 2016 at 11:48, Andi Kleen <ak@linux.intel.com> wrote:
> On Thu, Oct 20, 2016 at 11:41:11AM -0500, Nilay Vaish wrote:
>> On 19 October 2016 at 17:01, Jin Yao <yao.jin@linux.intel.com> wrote:
>> > diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
>> > index 40ecf25..4f6bf6c 100644
>> > --- a/tools/perf/util/callchain.h
>> > +++ b/tools/perf/util/callchain.h
>> > @@ -115,6 +115,10 @@ struct callchain_list {
>> >                 bool            unfolded;
>> >                 bool            has_children;
>> >         };
>> > +       u64                     branch_count;
>> > +       u64                     predicted_count;
>> > +       u64                     abort_count;
>>
>> Can you explain what abort count is?  It seems you are referring to
>> mis-speculated branches.  If that is the case, I would prefer that we
>> replace abort by miss_speculated or miss_predicted.
>
> abort refers to TSX aborts. It has nothing to do with branch
> mispredictions.

OK, I am more confused now.  Are you predicting some quantity related
to transactions?  Why would you divide abort count by branch count?
Further, I just looked at patch 6/6.  It has the following text:

+ Also show with some branch flags that can be:
+ - Predicted: display the average percentage of predicted branches.
+     (predicted number / total number)
+ - Abort: display the average percentage of abort branches.
+     (abort number / total number)
+ - Cycles: cycles in basic block.


I think there is inconsistency between what you are suggesting and
what the patch has.

--
Nilay

* Re: [PATCH v2 2/6] perf report: Caculate and return the branch counting in callchain
  2016-10-20 17:06       ` Nilay Vaish
@ 2016-10-20 18:20         ` Andi Kleen
  2016-10-21  0:23           ` Jin, Yao
  0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2016-10-20 18:20 UTC (permalink / raw)
  To: Nilay Vaish
  Cc: Jin Yao, Arnaldo Carvalho de Melo, Jiri Olsa, Linux Kernel list,
	kan.liang

> OK, I am more confused now.  Are you predicting some quantity related
> to transactions?  Why would you divide abort count by branch count?
> Further, I just looked at patch 6/6.  It has the following text:
> 
> + Also show with some branch flags that can be:
> + - Predicted: display the average percentage of predicted branches.
> +     (predicted number / total number)
> + - Abort: display the average percentage of abort branches.
> + (abort number /total number)
> + - Cycles: cycles in basic block.
> 
> 
> I think there is inconsistency between what you are suggesting and
> what the patch has.

An abort is a unique branch. But yes, there is no total number,
so the formula will always be 100%. So yes, it would probably be
better to just display a count for abort.

-Andi
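
A minimal sketch of what displaying a plain count for the Abort column
could look like, assuming the same left-justified, width-clipped layout
that the patch uses for the other columns. format_abort_count() is a
hypothetical helper, not code from this series or from any later
revision of it.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: render the Abort column as a raw count. */
static int format_abort_count(uint64_t abort_count, char *buf, size_t size,
			      unsigned int width)
{
	char str[32];

	/* No division by branch_count here: the count is shown as-is. */
	snprintf(str, sizeof(str), "%" PRIu64, abort_count);

	/* Left-justify and clip to the column width, as "%-*.*s" does. */
	return snprintf(buf, size, "%-*.*s", (int)width, (int)width, str);
}
```

With a column width of 5, a count of 7 is padded with four trailing
spaces, and a count wider than the column is clipped to its first five
digits.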

* Re: [PATCH v2 2/6] perf report: Caculate and return the branch counting in callchain
  2016-10-20 18:20         ` Andi Kleen
@ 2016-10-21  0:23           ` Jin, Yao
  2016-10-25 18:11             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 14+ messages in thread
From: Jin, Yao @ 2016-10-21  0:23 UTC (permalink / raw)
  To: Andi Kleen, Nilay Vaish
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Linux Kernel list, kan.liang

Hi Andi, Hi Nilay,

Thanks so much for your comments!

I will update the patch to just display the count for abort.

Thanks

Jin Yao

On 10/21/2016 2:20 AM, Andi Kleen wrote:
>> OK, I am more confused now.  Are you predicting some quantity related
>> to transactions?  Why would you divide abort count by branch count?
>> Further, I just looked at patch 6/6.  It has the following text:
>>
>> + Also show with some branch flags that can be:
>> + - Predicted: display the average percentage of predicted branches.
>> +     (predicted number / total number)
>> + - Abort: display the average percentage of abort branches.
>> + (abort number /total number)
>> + - Cycles: cycles in basic block.
>>
>>
>> I think there is inconsistency between what you are suggesting and
>> what the patch has.
> An abort is a unique branch. But yes, there is no total number,
> so the formula will always be 100%. So yes, it would probably be
> better to just display a count for abort.
>
> -Andi

* Re: [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view
  2016-10-19 22:01 [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jin Yao
                   ` (5 preceding siblings ...)
  2016-10-19 22:01 ` [PATCH v2 6/6] perf report: Display columns Predicted/Abort/Cycles in --branch-history Jin Yao
@ 2016-10-23 14:10 ` Jiri Olsa
  6 siblings, 0 replies; 14+ messages in thread
From: Jiri Olsa @ 2016-10-23 14:10 UTC (permalink / raw)
  To: Jin Yao; +Cc: acme, jolsa, Linux-kernel, ak, kan.liang

On Thu, Oct 20, 2016 at 06:01:11AM +0800, Jin Yao wrote:
> v2: Just a rebase to Arnaldo's perf/core branch, no functional changes.
> 

Reviewed-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

* Re: [PATCH v2 2/6] perf report: Caculate and return the branch counting in callchain
  2016-10-21  0:23           ` Jin, Yao
@ 2016-10-25 18:11             ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-25 18:11 UTC (permalink / raw)
  To: Jin, Yao; +Cc: Andi Kleen, Nilay Vaish, Jiri Olsa, Linux Kernel list, kan.liang

On Fri, Oct 21, 2016 at 08:23:41AM +0800, Jin, Yao wrote:
> Hi Andi, Hi Nilay,
> 
> Thanks so much for your comments!
> 
> I will update the patch to just display the count for abort.

Ok, waiting for that then,

- Arnaldo
 
> Thanks
> 
> Jin Yao
> 
> On 10/21/2016 2:20 AM, Andi Kleen wrote:
> > > OK, I am more confused now.  Are you predicting some quantity related
> > > to transactions?  Why would you divide abort count by branch count?
> > > Further, I just looked at patch 6/6.  It has the following text:
> > > 
> > > + Also show with some branch flags that can be:
> > > + - Predicted: display the average percentage of predicted branches.
> > > +     (predicted number / total number)
> > > + - Abort: display the average percentage of abort branches.
> > > + (abort number /total number)
> > > + - Cycles: cycles in basic block.
> > > 
> > > 
> > > I think there is inconsistency between what you are suggesting and
> > > what the patch has.
> > An abort is a unique branch. But yes, there is no total number,
> > so the formula will always be 100%. So yes, it would probably be
> > better to just display a count for abort.
> > 
> > -Andi

Thread overview: 14+ messages
2016-10-19 22:01 [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jin Yao
2016-10-19 22:01 ` [PATCH v2 1/6] perf report: Add branch flag to callchain cursor node Jin Yao
2016-10-19 22:01 ` [PATCH v2 2/6] perf report: Caculate and return the branch counting in callchain Jin Yao
2016-10-20 16:41   ` Nilay Vaish
2016-10-20 16:48     ` Andi Kleen
2016-10-20 17:06       ` Nilay Vaish
2016-10-20 18:20         ` Andi Kleen
2016-10-21  0:23           ` Jin, Yao
2016-10-25 18:11             ` Arnaldo Carvalho de Melo
2016-10-19 22:01 ` [PATCH v2 3/6] perf report: Create a symbol_conf flag for showing branch flag counting Jin Yao
2016-10-19 22:01 ` [PATCH v2 4/6] perf report: Show branch info in callchain entry for stdio mode Jin Yao
2016-10-19 22:01 ` [PATCH v2 5/6] perf report: Show branch info in callchain entry for browser mode Jin Yao
2016-10-19 22:01 ` [PATCH v2 6/6] perf report: Display columns Predicted/Abort/Cycles in --branch-history Jin Yao
2016-10-23 14:10 ` [PATCH v2 0/6] perf report: Show branch flags/cycles in --branch-history callgraph view Jiri Olsa
