linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC/PATCH 0/5] perf report: Support folded callchain output (v1)
@ 2015-10-30 17:15 Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 1/5] perf report: Support folded callchain mode on --stdio Namhyung Kim
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Namhyung Kim @ 2015-10-30 17:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, Brendan Gregg, David Ahern

Hello,

This is what Brendan requested on the perf-users mailing list to
support FlameGraphs [1] more efficiently.  This patchset adds a few
more callchain options to adjust the output for it.

At first, 'folded' output mode was added.  The folded output puts all
calchain nodes in a line separated by semicolons, a space and the
value.

The value is now can be one of 'percent', 'period', or 'count'.  The
percent is current default output and the period is the raw number of
sample periods.  The count is the number of samples for each callchain.

Here's an example:

  $ perf report --no-children --show-nr-samples --stdio -g folded,count
  ...
    39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idel
  intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 57
  intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;... 23


  $ perf report --no-children --stdio -g percent
  ...
    39.93%  swapper  [kernel.vmlinux]  [k] intel_idel
            |
            ---intel_idle
               cpuidle_enter_state
               cpuidle_enter
               call_cpuidle
               cpu_startup_entry
               |
               |--28.63%-- start_secondary
               |
                --11.30%-- rest_init


  $ perf report --no-children --show-total-period --stdio -g period
  ...
    39.93%   13018705  swapper  [kernel.vmlinux]  [k] intel_idel
            |
            ---intel_idle
               cpuidle_enter_state
               cpuidle_enter
               call_cpuidle
               cpu_startup_entry
               |
               |--9334403-- start_secondary
               |
                --3684302-- rest_init


  $ perf report --no-children --show-nr-samples --stdio -g count
  ...
    39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idel
            |
            ---intel_idle
               cpuidle_enter_state
               cpuidle_enter
               call_cpuidle
               cpu_startup_entry
               |
               |--57-- start_secondary
               |
                --23-- rest_init


You can get it from 'perf/callchain-fold-v1' branch on my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks
Namhyung


[1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html


Namhyung Kim (5):
  perf report: Support folded callchain mode on --stdio
  perf callchain: Abstract callchain print function
  perf callchain: Add count fields to struct callchain_node
  perf report: Add callchain value option
  perf report: Fix segfault on -g fractral with --stdio

 tools/perf/Documentation/perf-report.txt | 13 +++--
 tools/perf/builtin-report.c              |  2 +-
 tools/perf/ui/browsers/hists.c           |  8 +--
 tools/perf/ui/gtk/hists.c                |  8 +--
 tools/perf/ui/stdio/hist.c               | 91 ++++++++++++++++++++++++++------
 tools/perf/util/callchain.c              | 87 +++++++++++++++++++++++++++++-
 tools/perf/util/callchain.h              | 24 ++++++++-
 tools/perf/util/util.c                   |  3 +-
 8 files changed, 203 insertions(+), 33 deletions(-)

-- 
2.6.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC/PATCH 1/5] perf report: Support folded callchain mode on --stdio
  2015-10-30 17:15 [RFC/PATCH 0/5] perf report: Support folded callchain output (v1) Namhyung Kim
@ 2015-10-30 17:15 ` Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 2/5] perf callchain: Abstract callchain print function Namhyung Kim
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2015-10-30 17:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, Brendan Gregg, David Ahern

Add new call chain option (-g) 'folded' to print callchains in a line.
The callchains are separated by semicolons, a space, then the
(absolute) percent values like in 'flat' mode.

For example, following 20 lines can be printed in 3 lines with the
folded output mode;

  $ perf report -g flat --no-children | grep -v ^# | head -20
      60.48%  swapper  [kernel.vmlinux]  [k] intel_idle
              54.60%
                 intel_idle
                 cpuidle_enter_state
                 cpuidle_enter
                 call_cpuidle
                 cpu_startup_entry
                 start_secondary

              5.88%
                 intel_idle
                 cpuidle_enter_state
                 cpuidle_enter
                 call_cpuidle
                 cpu_startup_entry
                 rest_init
                 start_kernel
                 x86_64_start_reservations
                 x86_64_start_kernel

  $ perf report -g folded --no-children | grep -v ^# | head -3
      60.48%  swapper  [kernel.vmlinux]  [k] intel_idle
  intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 54.60%
  intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel  5.88%

This mode is supported only for --stdio now and intended to be used by
some scripts like in FlameGraphs[1].  Support for other UI might be
added later.

[1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html

Requested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/stdio/hist.c  | 53 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/callchain.c |  6 +++++
 tools/perf/util/callchain.h |  3 ++-
 3 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index dfcbc90146ef..2c7436241912 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -260,6 +260,56 @@ static size_t callchain__fprintf_flat(FILE *fp, struct rb_root *tree,
 	return ret;
 }
 
+static size_t __callchain__fprintf_folded(FILE *fp, struct callchain_node *node)
+{
+	struct callchain_list *chain;
+	size_t ret = 0;
+	char bf[1024];
+	bool first;
+
+	if (!node)
+		return 0;
+
+	ret += __callchain__fprintf_folded(fp, node->parent);
+
+	first = (ret == 0);
+	list_for_each_entry(chain, &node->val, list) {
+		if (chain->ip >= PERF_CONTEXT_MAX)
+			continue;
+		ret += fprintf(fp, "%s%s", first ? "" : ";",
+			       callchain_list__sym_name(chain,
+						bf, sizeof(bf), false));
+		first = false;
+	}
+
+	return ret;
+}
+
+static size_t callchain__fprintf_folded(FILE *fp, struct rb_root *tree,
+					u64 total_samples)
+{
+	size_t ret = 0;
+	u32 entries_printed = 0;
+	struct callchain_node *chain;
+	struct rb_node *rb_node = rb_first(tree);
+
+	while (rb_node) {
+		double percent;
+
+		chain = rb_entry(rb_node, struct callchain_node, rb_node);
+		percent = chain->hit * 100.0 / total_samples;
+
+		ret += __callchain__fprintf_folded(fp, chain);
+		ret += fprintf(fp, " %6.2f%%\n", percent);
+		if (++entries_printed == callchain_param.print_limit)
+			break;
+
+		rb_node = rb_next(rb_node);
+	}
+
+	return ret;
+}
+
 static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
 					    u64 total_samples, int left_margin,
 					    FILE *fp)
@@ -278,6 +328,9 @@ static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
 	case CHAIN_FLAT:
 		return callchain__fprintf_flat(fp, &he->sorted_chain, total_samples);
 		break;
+	case CHAIN_FOLDED:
+		return callchain__fprintf_folded(fp, &he->sorted_chain, total_samples);
+		break;
 	case CHAIN_NONE:
 		break;
 	default:
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 735ad48e1858..08cb220ba5ea 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -44,6 +44,10 @@ static int parse_callchain_mode(const char *value)
 		callchain_param.mode = CHAIN_GRAPH_REL;
 		return 0;
 	}
+	if (!strncmp(value, "folded", strlen(value))) {
+		callchain_param.mode = CHAIN_FOLDED;
+		return 0;
+	}
 	return -1;
 }
 
@@ -218,6 +222,7 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
 
 		switch (mode) {
 		case CHAIN_FLAT:
+		case CHAIN_FOLDED:
 			if (rnode->hit < chain->hit)
 				p = &(*p)->rb_left;
 			else
@@ -338,6 +343,7 @@ int callchain_register_param(struct callchain_param *param)
 		param->sort = sort_chain_graph_rel;
 		break;
 	case CHAIN_FLAT:
+	case CHAIN_FOLDED:
 		param->sort = sort_chain_flat;
 		break;
 	case CHAIN_NONE:
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index fce8161e54db..2f305384531f 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -43,7 +43,8 @@ enum chain_mode {
 	CHAIN_NONE,
 	CHAIN_FLAT,
 	CHAIN_GRAPH_ABS,
-	CHAIN_GRAPH_REL
+	CHAIN_GRAPH_REL,
+	CHAIN_FOLDED,
 };
 
 enum chain_order {
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC/PATCH 2/5] perf callchain: Abstract callchain print function
  2015-10-30 17:15 [RFC/PATCH 0/5] perf report: Support folded callchain output (v1) Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 1/5] perf report: Support folded callchain mode on --stdio Namhyung Kim
@ 2015-10-30 17:15 ` Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 3/5] perf callchain: Add count fields to struct callchain_node Namhyung Kim
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2015-10-30 17:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, Brendan Gregg, David Ahern

This is a preparation to support for printing other type of callchain
value like count or period.

Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/browsers/hists.c |  8 +++++---
 tools/perf/ui/gtk/hists.c      |  8 ++------
 tools/perf/ui/stdio/hist.c     | 33 +++++++++++++++------------------
 tools/perf/util/callchain.c    | 25 +++++++++++++++++++++++++
 tools/perf/util/callchain.h    |  4 ++++
 5 files changed, 51 insertions(+), 27 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index e5afb8936040..a8897aab4c4a 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -592,7 +592,6 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
 	while (node) {
 		struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
 		struct rb_node *next = rb_next(node);
-		u64 cumul = callchain_cumul_hits(child);
 		struct callchain_list *chain;
 		char folded_sign = ' ';
 		int first = true;
@@ -619,9 +618,12 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
 						       browser->show_dso);
 
 			if (was_first && need_percent) {
-				double percent = cumul * 100.0 / total;
+				char buf[64];
 
-				if (asprintf(&alloc_str, "%2.2f%% %s", percent, str) < 0)
+				callchain_node__sprintf_value(child, buf, sizeof(buf),
+							      total);
+
+				if (asprintf(&alloc_str, "%s %s", buf, str) < 0)
 					str = "Not enough memory!";
 				else
 					str = alloc_str;
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 4b3585eed1e8..d8037b7023e8 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -100,14 +100,10 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
 		struct callchain_list *chain;
 		GtkTreeIter iter, new_parent;
 		bool need_new_parent;
-		double percent;
-		u64 hits, child_total;
+		u64 child_total;
 
 		node = rb_entry(nd, struct callchain_node, rb_node);
 
-		hits = callchain_cumul_hits(node);
-		percent = 100.0 * hits / total;
-
 		new_parent = *parent;
 		need_new_parent = !has_single_node && (node->val_nr > 1);
 
@@ -116,7 +112,7 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
 
 			gtk_tree_store_append(store, &iter, &new_parent);
 
-			scnprintf(buf, sizeof(buf), "%5.2f%%", percent);
+			callchain_node__sprintf_value(node, buf, sizeof(buf), total);
 			gtk_tree_store_set(store, &iter, 0, buf, -1);
 
 			callchain_list__sym_name(chain, buf, sizeof(buf), false);
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 2c7436241912..4a7b007c2a87 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -34,10 +34,10 @@ static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
 	return ret;
 }
 
-static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
+static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
+				     struct callchain_list *chain,
 				     int depth, int depth_mask, int period,
-				     u64 total_samples, u64 hits,
-				     int left_margin)
+				     u64 total_samples, int left_margin)
 {
 	int i;
 	size_t ret = 0;
@@ -50,10 +50,9 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
 		else
 			ret += fprintf(fp, " ");
 		if (!period && i == depth - 1) {
-			double percent;
-
-			percent = hits * 100.0 / total_samples;
-			ret += percent_color_fprintf(fp, "--%2.2f%%-- ", percent);
+			ret += fprintf(fp, "--");
+			ret += callchain_node__fprintf_value(node, fp, total_samples);
+			ret += fprintf(fp, "--");
 		} else
 			ret += fprintf(fp, "%s", "          ");
 	}
@@ -120,10 +119,9 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
 						   left_margin);
 		i = 0;
 		list_for_each_entry(chain, &child->val, list) {
-			ret += ipchain__fprintf_graph(fp, chain, depth,
+			ret += ipchain__fprintf_graph(fp, child, chain, depth,
 						      new_depth_mask, i++,
 						      total_samples,
-						      cumul,
 						      left_margin);
 		}
 
@@ -148,9 +146,9 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
 			return ret;
 
 		new_depth_mask &= ~(1 << (depth - 1));
-		ret += ipchain__fprintf_graph(fp, &rem_hits, depth,
+		ret += ipchain__fprintf_graph(fp, NULL, &rem_hits, depth,
 					      new_depth_mask, 0, total_samples,
-					      remaining, left_margin);
+					      left_margin);
 	}
 
 	return ret;
@@ -243,12 +241,11 @@ static size_t callchain__fprintf_flat(FILE *fp, struct rb_root *tree,
 	struct rb_node *rb_node = rb_first(tree);
 
 	while (rb_node) {
-		double percent;
-
 		chain = rb_entry(rb_node, struct callchain_node, rb_node);
-		percent = chain->hit * 100.0 / total_samples;
 
-		ret = percent_color_fprintf(fp, "           %6.2f%%\n", percent);
+		ret += fprintf(fp, "           ");
+		ret += callchain_node__fprintf_value(chain, fp, total_samples);
+		ret += fprintf(fp, "\n");
 		ret += __callchain__fprintf_flat(fp, chain, total_samples);
 		ret += fprintf(fp, "\n");
 		if (++entries_printed == callchain_param.print_limit)
@@ -294,13 +291,13 @@ static size_t callchain__fprintf_folded(FILE *fp, struct rb_root *tree,
 	struct rb_node *rb_node = rb_first(tree);
 
 	while (rb_node) {
-		double percent;
 
 		chain = rb_entry(rb_node, struct callchain_node, rb_node);
-		percent = chain->hit * 100.0 / total_samples;
 
 		ret += __callchain__fprintf_folded(fp, chain);
-		ret += fprintf(fp, " %6.2f%%\n", percent);
+		ret += putc(' ', fp);
+		ret += callchain_node__fprintf_value(chain, fp, total_samples);
+		ret += putc('\n', fp);
 		if (++entries_printed == callchain_param.print_limit)
 			break;
 
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 08cb220ba5ea..44184d198855 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -805,6 +805,31 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 	return bf;
 }
 
+char *callchain_node__sprintf_value(struct callchain_node *node,
+				    char *bf, size_t bfsize, u64 total)
+{
+	double percent = 0.0;
+	u64 cumul = callchain_cumul_hits(node);
+
+	if (total)
+		percent = cumul * 100.0 / total;
+
+	scnprintf(bf, bfsize, "%6.2f%%", percent);
+	return bf;
+}
+
+int callchain_node__fprintf_value(struct callchain_node *node,
+				 FILE *fp, u64 total)
+{
+	double percent = 0.0;
+	u64 cumul = callchain_cumul_hits(node);
+
+	if (total)
+		percent = cumul * 100.0 / total;
+
+	return percent_color_fprintf(fp, "%.2f%%", percent);
+}
+
 static void free_callchain_node(struct callchain_node *node)
 {
 	struct callchain_list *list, *tmp;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 2f305384531f..3a90a57f6213 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -230,6 +230,10 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
 
 char *callchain_list__sym_name(struct callchain_list *cl,
 			       char *bf, size_t bfsize, bool show_dso);
+char *callchain_node__sprintf_value(struct callchain_node *node,
+				    char *bf, size_t bfsize, u64 total);
+int callchain_node__fprintf_value(struct callchain_node *node,
+				  FILE *fp, u64 total);
 
 void free_callchain(struct callchain_root *root);
 
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC/PATCH 3/5] perf callchain: Add count fields to struct callchain_node
  2015-10-30 17:15 [RFC/PATCH 0/5] perf report: Support folded callchain output (v1) Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 1/5] perf report: Support folded callchain mode on --stdio Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 2/5] perf callchain: Abstract callchain print function Namhyung Kim
@ 2015-10-30 17:15 ` Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 4/5] perf report: Add callchain value option Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 5/5] perf report: Fix segfault on -g fractral with --stdio Namhyung Kim
  4 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2015-10-30 17:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, Brendan Gregg, David Ahern

It's to track the count of occurrences of the callchains.

Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/callchain.c | 10 ++++++++++
 tools/perf/util/callchain.h |  7 +++++++
 2 files changed, 17 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 44184d198855..0a97d77509bd 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -437,6 +437,8 @@ add_child(struct callchain_node *parent,
 
 	new->children_hit = 0;
 	new->hit = period;
+	new->children_count = 0;
+	new->count = 1;
 	return new;
 }
 
@@ -484,6 +486,9 @@ split_add_child(struct callchain_node *parent,
 	parent->children_hit = callchain_cumul_hits(new);
 	new->val_nr = parent->val_nr - idx_local;
 	parent->val_nr = idx_local;
+	new->count = parent->count;
+	new->children_count = parent->children_count;
+	parent->children_count = callchain_cumul_counts(new);
 
 	/* create a new child for the new branch if any */
 	if (idx_total < cursor->nr) {
@@ -494,6 +499,8 @@ split_add_child(struct callchain_node *parent,
 
 		parent->hit = 0;
 		parent->children_hit += period;
+		parent->count = 0;
+		parent->children_count += 1;
 
 		node = callchain_cursor_current(cursor);
 		new = add_child(parent, cursor, period);
@@ -516,6 +523,7 @@ split_add_child(struct callchain_node *parent,
 		rb_insert_color(&new->rb_node_in, &parent->rb_root_in);
 	} else {
 		parent->hit = period;
+		parent->count = 1;
 	}
 }
 
@@ -562,6 +570,7 @@ append_chain_children(struct callchain_node *root,
 
 inc_children_hit:
 	root->children_hit += period;
+	root->children_count++;
 }
 
 static int
@@ -614,6 +623,7 @@ append_chain(struct callchain_node *root,
 	/* we match 100% of the path, increment the hit */
 	if (matches == root->val_nr && cursor->pos == cursor->nr) {
 		root->hit += period;
+		root->count++;
 		return 0;
 	}
 
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 3a90a57f6213..2f948f0ff034 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -60,6 +60,8 @@ struct callchain_node {
 	struct rb_root		rb_root_in; /* input tree of children */
 	struct rb_root		rb_root;    /* sorted output tree of children */
 	unsigned int		val_nr;
+	unsigned int		count;
+	unsigned int		children_count;
 	u64			hit;
 	u64			children_hit;
 };
@@ -145,6 +147,11 @@ static inline u64 callchain_cumul_hits(struct callchain_node *node)
 	return node->hit + node->children_hit;
 }
 
+static inline int callchain_cumul_counts(struct callchain_node *node)
+{
+	return node->count + node->children_count;
+}
+
 int callchain_register_param(struct callchain_param *param);
 int callchain_append(struct callchain_root *root,
 		     struct callchain_cursor *cursor,
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC/PATCH 4/5] perf report: Add callchain value option
  2015-10-30 17:15 [RFC/PATCH 0/5] perf report: Support folded callchain output (v1) Namhyung Kim
                   ` (2 preceding siblings ...)
  2015-10-30 17:15 ` [RFC/PATCH 3/5] perf callchain: Add count fields to struct callchain_node Namhyung Kim
@ 2015-10-30 17:15 ` Namhyung Kim
  2015-10-30 17:15 ` [RFC/PATCH 5/5] perf report: Fix segfault on -g fractral with --stdio Namhyung Kim
  4 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2015-10-30 17:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, Brendan Gregg, David Ahern

Now -g/--call-graph option supports how to display callchain values.
Possible values are 'percent', 'period' and 'count'.  The percent is
same as before and it's the default behavior.  The period displays the
raw period value rather than the percentage.  The count displays the
number of occurrences.

  $ perf report --no-children --stdio -g percent
  ...
    39.93%  swapper  [kernel.vmlinux]  [k] intel_idel
            |
            ---intel_idle
               cpuidle_enter_state
               cpuidle_enter
               call_cpuidle
               cpu_startup_entry
               |
               |--28.63%-- start_secondary
               |
                --11.30%-- rest_init

  $ perf report --no-children --show-total-period --stdio -g period
  ...
    39.93%   13018705  swapper  [kernel.vmlinux]  [k] intel_idel
            |
            ---intel_idle
               cpuidle_enter_state
               cpuidle_enter
               call_cpuidle
               cpu_startup_entry
               |
               |--9334403-- start_secondary
               |
                --3684302-- rest_init

  $ perf report --no-children --show-nr-samples --stdio -g count
  ...
    39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idel
            |
            ---intel_idle
               cpuidle_enter_state
               cpuidle_enter
               call_cpuidle
               cpu_startup_entry
               |
               |--57-- start_secondary
               |
                --23-- rest_init

Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-report.txt | 13 ++++---
 tools/perf/builtin-report.c              |  4 +--
 tools/perf/util/callchain.c              | 60 +++++++++++++++++++++++++++-----
 tools/perf/util/callchain.h              | 10 +++++-
 tools/perf/util/util.c                   |  3 +-
 5 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 5ce8da1e1256..bb9fd23a105e 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -170,11 +170,11 @@ OPTIONS
         Dump raw trace in ASCII.
 
 -g::
---call-graph=<print_type,threshold[,print_limit],order,sort_key,branch>::
+--call-graph=<print_type,threshold[,print_limit],order,sort_key[,branch],value>::
         Display call chains using type, min percent threshold, print limit,
-	call order, sort key and branch.  Note that ordering of parameters is not
-	fixed so any parement can be given in an arbitraty order.  One exception
-	is the print_limit which should be preceded by threshold.
+	call order, sort key, optional branch and value.  Note that ordering of
+	parameters is not fixed so any parement can be given in an arbitraty order.
+	One exception is the print_limit which should be preceded by threshold.
 
 	print_type can be either:
 	- flat: single column, linear exposure of call chains.
@@ -204,6 +204,11 @@ OPTIONS
 	- branch: include last branch information in callgraph when available.
 	          Usually more convenient to use --branch-history for this.
 
+	value can be:
+	- percent: diplay overhead percent (default)
+	- period: display event period
+	- count: display evnt count
+
 --children::
 	Accumulate callchain of children to parent entry so that then can
 	show up in the output.  The output will have a new "Children" column
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2853ad2bd435..3dd4bb4ded1a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -625,7 +625,7 @@ parse_percent_limit(const struct option *opt, const char *str,
 	return 0;
 }
 
-#define CALLCHAIN_DEFAULT_OPT  "graph,0.5,caller,function"
+#define CALLCHAIN_DEFAULT_OPT  "graph,0.5,caller,function,percent"
 
 const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
 				     CALLCHAIN_REPORT_HELP
@@ -708,7 +708,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
 		    "Only display entries with parent-match"),
 	OPT_CALLBACK_DEFAULT('g', "call-graph", &report,
-			     "print_type,threshold[,print_limit],order,sort_key[,branch]",
+			     "print_type,threshold[,print_limit],order,sort_key[,branch],value",
 			     report_callchain_help, &report_parse_callchain_opt,
 			     callchain_default_opt),
 	OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 0a97d77509bd..7f0a89584f1b 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -83,6 +83,23 @@ static int parse_callchain_sort_key(const char *value)
 	return -1;
 }
 
+static int parse_callchain_value(const char *value)
+{
+	if (!strncmp(value, "percent", strlen(value))) {
+		callchain_param.value = CCVAL_PERCENT;
+		return 0;
+	}
+	if (!strncmp(value, "period", strlen(value))) {
+		callchain_param.value = CCVAL_PERIOD;
+		return 0;
+	}
+	if (!strncmp(value, "count", strlen(value))) {
+		callchain_param.value = CCVAL_COUNT;
+		return 0;
+	}
+	return -1;
+}
+
 static int
 __parse_callchain_report_opt(const char *arg, bool allow_record_opt)
 {
@@ -106,7 +123,8 @@ __parse_callchain_report_opt(const char *arg, bool allow_record_opt)
 
 		if (!parse_callchain_mode(tok) ||
 		    !parse_callchain_order(tok) ||
-		    !parse_callchain_sort_key(tok)) {
+		    !parse_callchain_sort_key(tok) ||
+		    !parse_callchain_value(tok)) {
 			/* parsing ok - move on to the next */
 			try_stack_size = false;
 			goto next;
@@ -819,12 +837,26 @@ char *callchain_node__sprintf_value(struct callchain_node *node,
 				    char *bf, size_t bfsize, u64 total)
 {
 	double percent = 0.0;
-	u64 cumul = callchain_cumul_hits(node);
+	u64 period = callchain_cumul_hits(node);
+	int count = callchain_cumul_counts(node);
 
 	if (total)
-		percent = cumul * 100.0 / total;
+		percent = period * 100.0 / total;
+	if (callchain_param.mode == CHAIN_FOLDED)
+		count = node->count;
 
-	scnprintf(bf, bfsize, "%6.2f%%", percent);
+	switch (callchain_param.value) {
+	case CCVAL_PERIOD:
+		scnprintf(bf, bfsize, "%"PRIu64, period);
+		break;
+	case CCVAL_COUNT:
+		scnprintf(bf, bfsize, "%u", count);
+		break;
+	case CCVAL_PERCENT:
+	default:
+		scnprintf(bf, bfsize, "%.2f%%", percent);
+		break;
+	}
 	return bf;
 }
 
@@ -832,12 +864,24 @@ int callchain_node__fprintf_value(struct callchain_node *node,
 				 FILE *fp, u64 total)
 {
 	double percent = 0.0;
-	u64 cumul = callchain_cumul_hits(node);
+	u64 period = callchain_cumul_hits(node);
+	int count = callchain_cumul_counts(node);
 
 	if (total)
-		percent = cumul * 100.0 / total;
-
-	return percent_color_fprintf(fp, "%.2f%%", percent);
+		percent = period * 100.0 / total;
+	if (callchain_param.mode == CHAIN_FOLDED)
+		count = node->count;
+
+	switch (callchain_param.value) {
+	case CCVAL_PERIOD:
+		return fprintf(fp, "%"PRIu64, period);
+	case CCVAL_COUNT:
+		return fprintf(fp, "%u", count);
+	case CCVAL_PERCENT:
+	default:
+		return percent_color_fprintf(fp, "%.2f%%", percent);
+	}
+	return 0;
 }
 
 static void free_callchain_node(struct callchain_node *node)
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 2f948f0ff034..e8533e328a47 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -29,7 +29,8 @@
 	HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
 	HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
 	HELP_PAD "sort_key:\tcall graph sort key (function|address)\n"	\
-	HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n"
+	HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n" \
+	HELP_PAD "value:\t\tcall graph value (percent|period|count)\n"
 
 enum perf_call_graph_mode {
 	CALLCHAIN_NONE,
@@ -81,6 +82,12 @@ enum chain_key {
 	CCKEY_ADDRESS
 };
 
+enum chain_value {
+	CCVAL_PERCENT,
+	CCVAL_PERIOD,
+	CCVAL_COUNT,
+};
+
 struct callchain_param {
 	bool			enabled;
 	enum perf_call_graph_mode record_mode;
@@ -93,6 +100,7 @@ struct callchain_param {
 	bool			order_set;
 	enum chain_key		key;
 	bool			branch_callstack;
+	enum chain_value	value;
 };
 
 extern struct callchain_param callchain_param;
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index cd12c25e4ea4..174912f87913 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -20,7 +20,8 @@ struct callchain_param	callchain_param = {
 	.mode	= CHAIN_GRAPH_ABS,
 	.min_percent = 0.5,
 	.order  = ORDER_CALLEE,
-	.key	= CCKEY_FUNCTION
+	.key	= CCKEY_FUNCTION,
+	.value	= CCVAL_PERCENT,
 };
 
 /*
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC/PATCH 5/5] perf report: Fix segfault on -g fractral with --stdio
  2015-10-30 17:15 [RFC/PATCH 0/5] perf report: Support folded callchain output (v1) Namhyung Kim
                   ` (3 preceding siblings ...)
  2015-10-30 17:15 ` [RFC/PATCH 4/5] perf report: Add callchain value option Namhyung Kim
@ 2015-10-30 17:15 ` Namhyung Kim
  2015-10-31 11:09   ` Jiri Olsa
  4 siblings, 1 reply; 8+ messages in thread
From: Namhyung Kim @ 2015-10-30 17:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, Brendan Gregg, David Ahern

ipchain__fprintf_graph() should pass non-NULL callchain node when
printing remaining entries.  Otherwise it'll segfault since it tries to
access to the node to calculate the value.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/stdio/hist.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 4a7b007c2a87..2104b09d41a8 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -88,6 +88,7 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
 	size_t ret = 0;
 	int i;
 	uint entries_printed = 0;
+	int cumul_count = 0;
 
 	remaining = total_samples;
 
@@ -99,6 +100,7 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
 		child = rb_entry(node, struct callchain_node, rb_node);
 		cumul = callchain_cumul_hits(child);
 		remaining -= cumul;
+		cumul_count += callchain_cumul_counts(child);
 
 		/*
 		 * The depth mask manages the output of pipes that show
@@ -141,12 +143,21 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
 
 	if (callchain_param.mode == CHAIN_GRAPH_REL &&
 		remaining && remaining != total_samples) {
+		struct callchain_node rem_node = {
+			.hit = remaining,
+		};
 
 		if (!rem_sq_bracket)
 			return ret;
 
+		if (callchain_param.value == CCVAL_COUNT) {
+			rem_node.count = child->parent->children_count - cumul_count;
+			if (rem_node.count <= 0)
+				return ret;
+		}
+
 		new_depth_mask &= ~(1 << (depth - 1));
-		ret += ipchain__fprintf_graph(fp, NULL, &rem_hits, depth,
+		ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
 					      new_depth_mask, 0, total_samples,
 					      left_margin);
 	}
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH 5/5] perf report: Fix segfault on -g fractral with --stdio
  2015-10-30 17:15 ` [RFC/PATCH 5/5] perf report: Fix segfault on -g fractral with --stdio Namhyung Kim
@ 2015-10-31 11:09   ` Jiri Olsa
  2015-11-02  6:45     ` Namhyung Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Jiri Olsa @ 2015-10-31 11:09 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Brendan Gregg, David Ahern

On Sat, Oct 31, 2015 at 02:15:38AM +0900, Namhyung Kim wrote:

SNIP

>  
>  		if (!rem_sq_bracket)
>  			return ret;
>  
> +		if (callchain_param.value == CCVAL_COUNT) {
> +			rem_node.count = child->parent->children_count - cumul_count;
> +			if (rem_node.count <= 0)
> +				return ret;
> +		}
> +
>  		new_depth_mask &= ~(1 << (depth - 1));
> -		ret += ipchain__fprintf_graph(fp, NULL, &rem_hits, depth,
> +		ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
>  					      new_depth_mask, 0, total_samples,
>  					      left_margin);

this looks like being introduced within your patchset in patch:
  perf callchain: Abstract callchain print function

shouldn't it get fixed in there?

jirka

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH 5/5] perf report: Fix segfault on -g fractral with --stdio
  2015-10-31 11:09   ` Jiri Olsa
@ 2015-11-02  6:45     ` Namhyung Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2015-11-02  6:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Brendan Gregg, David Ahern

Hi Jiri,

On Sat, Oct 31, 2015 at 12:09:46PM +0100, Jiri Olsa wrote:
> On Sat, Oct 31, 2015 at 02:15:38AM +0900, Namhyung Kim wrote:
> 
> SNIP
> 
> >  
> >  		if (!rem_sq_bracket)
> >  			return ret;
> >  
> > +		if (callchain_param.value == CCVAL_COUNT) {
> > +			rem_node.count = child->parent->children_count - cumul_count;
> > +			if (rem_node.count <= 0)
> > +				return ret;
> > +		}
> > +
> >  		new_depth_mask &= ~(1 << (depth - 1));
> > -		ret += ipchain__fprintf_graph(fp, NULL, &rem_hits, depth,
> > +		ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
> >  					      new_depth_mask, 0, total_samples,
> >  					      left_margin);
> 
> this looks like being introduced within your patchset in patch:
>   perf callchain: Abstract callchain print function
> 
> shouldn't it get fixed in there?

Right. I'll move it to the commit

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-11-02  6:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-30 17:15 [RFC/PATCH 0/5] perf report: Support folded callchain output (v1) Namhyung Kim
2015-10-30 17:15 ` [RFC/PATCH 1/5] perf report: Support folded callchain mode on --stdio Namhyung Kim
2015-10-30 17:15 ` [RFC/PATCH 2/5] perf callchain: Abstract callchain print function Namhyung Kim
2015-10-30 17:15 ` [RFC/PATCH 3/5] perf callchain: Add count fields to struct callchain_node Namhyung Kim
2015-10-30 17:15 ` [RFC/PATCH 4/5] perf report: Add callchain value option Namhyung Kim
2015-10-30 17:15 ` [RFC/PATCH 5/5] perf report: Fix segfault on -g fractral with --stdio Namhyung Kim
2015-10-31 11:09   ` Jiri Olsa
2015-11-02  6:45     ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).