* [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4)
@ 2013-12-24  8:22 Namhyung Kim
  2013-12-24  8:22 ` [PATCH 01/21] perf tools: Introduce struct add_entry_iter Namhyung Kim
                   ` (20 more replies)
  0 siblings, 21 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

Hello,

This is my third attempt to implement the cumulative hist period report.
This work started from Arun's SORT_INCLUSIVE patch [1], but I completely
rewrote it from scratch.

Please see patch 01/21.  It refactors the functions that add hist
entries using struct add_entry_iter.  While I converted all of them
carefully, it would be better if someone could test and confirm that I
didn't mess anything up - especially the branch stack and mem code.

This patchset basically adds the period of a sample to every node in
its callchain.  A hist_entry now has an additional field to keep the
cumulative period when the --children option is given to perf report.
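
To make the idea concrete, here is a toy sketch (not perf code; the
names below are made up, but "self" and "children" correspond to
hist_entry->stat.period and hist_entry->stat_acc->period in this
series):

  #include <stdio.h>

  /*
   * Each sample adds its period to the "self" stat of the sampled
   * (leaf) function, and to the accumulated "children" stat of every
   * function on its callchain.
   */
  struct toy_entry {
      const char *name;
      unsigned long long self;      /* like hist_entry->stat.period      */
      unsigned long long children;  /* like hist_entry->stat_acc->period */
  };

  static void add_sample(struct toy_entry *chain[], int depth,
                         unsigned long long period)
  {
      int i;

      chain[0]->self += period;            /* leaf only */
      for (i = 0; i < depth; i++)
          chain[i]->children += period;    /* every node on the chain */
  }

  int main(void)
  {
      struct toy_entry a = { "a", 0, 0 }, b = { "b", 0, 0 };
      struct toy_entry c = { "c", 0, 0 }, m = { "main", 0, 0 };
      /* callchain is ordered leaf first, then its callers */
      struct toy_entry *chain[] = { &a, &b, &c, &m };
      int i;

      add_sample(chain, 4, 1000);

      for (i = 0; i < 4; i++)
          printf("%-5s self=%llu children=%llu\n",
                 chain[i]->name, chain[i]->self, chain[i]->children);
      return 0;
  }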

I changed the option to a separate --children option and added a new
"Children" column (and renamed the default "Overhead" column to
"Self").  The output is sorted by children (cumulative) overhead for
now.  The reason I moved to --children is that I still think it is
quite different from the other --callchain options, and I plan to add
support for showing the (remaining) callchains of cumulative entries
too, as Arun requested.  The --callchain option will take care of that
even when --children is given.

I know that the UI should also be changed to be more flexible, as Ingo
requested, but I'd like to finish this first and then move on to that
work.  I also added a new config option to enable it by default.
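
For reference, a ~/.perfconfig entry along these lines should turn it
on by default (a sketch assuming the git-style report.children key
added later in this series):

  # ~/.perfconfig
  [report]
          children = true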

 * changes in v4:
  - change to --children option (Ingo)
  - rebased on new annotation change (Arnaldo)
  - support perf top also
  - enable --children option by default (Ingo)

 * changes in v3:
  - change to --cumulate option
  - fix a couple of bugs (Jiri, Rodrigo)
  - rename some help functions (Arnaldo)
  - cache previous hist entries rather than just symbol and dso
  - add some preparatory cleanups
  - add report.cumulate config option


Let me show you an example:

  $ cat abc.c
  #define barrier() asm volatile("" ::: "memory")

  void a(void)
  {
  	int i;
  	for (i = 0; i < 1000000; i++)
  		barrier();
  }
  void b(void)
  {
  	a();
  }
  void c(void)
  {
  	b();
  }
  int main(void)
  {
  	c();
  	return 0;
  }

With this simple program I ran perf record and report:

  $ perf record -g -e cycles:u ./abc

  $ perf report --stdio
      88.29%      abc  abc                [.] a                  
                  |
                  --- a
                      b
                      c
                      main
                      __libc_start_main

       9.43%      abc  ld-2.17.so         [.] _dl_relocate_object
                  |
                  --- _dl_relocate_object
                      dl_main
                      _dl_sysdep_start

       2.27%      abc  [kernel.kallsyms]  [k] page_fault         
                  |
                  --- page_fault
                     |          
                     |--95.94%-- _dl_sysdep_start
                     |          _dl_start_user
                     |          
                      --4.06%-- _start

       0.00%      abc  ld-2.17.so         [.] _start             
                  |
                  --- _start


When the --children option is given, it'll be shown like this:

  $ perf report --children --stdio

  #     Self  Children  Command      Shared Object                   Symbol
  # ........  ........  .......  .................  .......................
  #
       0.00%    88.29%      abc  libc-2.17.so       [.] __libc_start_main  
       0.00%    88.29%      abc  abc                [.] main               
       0.00%    88.29%      abc  abc                [.] c                  
       0.00%    88.29%      abc  abc                [.] b                  
      88.29%    88.29%      abc  abc                [.] a                  
       0.00%    11.61%      abc  ld-2.17.so         [.] _dl_sysdep_start   
       0.00%     9.43%      abc  ld-2.17.so         [.] dl_main            
       9.43%     9.43%      abc  ld-2.17.so         [.] _dl_relocate_object
       2.27%     2.27%      abc  [kernel.kallsyms]  [k] page_fault         
       0.00%     2.18%      abc  ld-2.17.so         [.] _dl_start_user     
       0.00%     0.10%      abc  ld-2.17.so         [.] _start             

As you can see, the __libc_start_main -> main -> c -> b -> a callchain
shows up in the output.  For instance, the 11.61% "Children" value of
_dl_sysdep_start is the 9.43% of _dl_relocate_object reaching it via
dl_main, plus the ~2.18% of page_fault overhead whose callchain goes
through _dl_sysdep_start and _dl_start_user.

I know it has some rough edges or even bugs, but I really want to
release it and get reviews.  It does not handle event groups and
annotations yet.

You can also get this series from the 'perf/cumulate-v4' branch in my tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>

[1] https://lkml.org/lkml/2012/3/31/6


Namhyung Kim (21):
  perf tools: Introduce struct add_entry_iter
  perf hists: Convert hist entry functions to use struct he_stat
  perf hists: Add support for accumulated stat of hist entry
  perf hists: Check if accumulated when adding a hist entry
  perf hists: Accumulate hist entry stat based on the callchain
  perf tools: Update cpumode for each cumulative entry
  perf report: Cache cumulative callchains
  perf hists: Sort hist entries by accumulated period
  perf ui/hist: Add support to accumulated hist stat
  perf ui/browser: Add support to accumulated hist stat
  perf ui/gtk: Add support to accumulated hist stat
  perf tools: Apply percent-limit to cumulative percentage
  perf tools: Add more hpp helper functions
  perf report: Add --children option
  perf report: Add report.children config option
  perf tools: Factor out sample__resolve_callchain()
  perf tools: Factor out fill_callchain_info()
  perf top: Support callchain accumulation
  perf top: Add --children option
  perf top: Add top.children config option
  perf tools: Enable --children option by default

 tools/perf/Documentation/perf-report.txt |   5 +
 tools/perf/Documentation/perf-top.txt    |   6 +
 tools/perf/builtin-annotate.c            |   3 +-
 tools/perf/builtin-diff.c                |   2 +-
 tools/perf/builtin-report.c              | 534 +++++++++++++++++++++++++------
 tools/perf/builtin-top.c                 | 137 +++++++-
 tools/perf/tests/hists_link.c            |   4 +-
 tools/perf/ui/browsers/hists.c           |  51 ++-
 tools/perf/ui/gtk/hists.c                |  27 +-
 tools/perf/ui/hist.c                     |  62 ++++
 tools/perf/ui/stdio/hist.c               |  13 +-
 tools/perf/util/callchain.c              |  65 ++++
 tools/perf/util/callchain.h              |   8 +
 tools/perf/util/hist.c                   |  73 +++--
 tools/perf/util/hist.h                   |   7 +-
 tools/perf/util/sort.h                   |   1 +
 tools/perf/util/symbol.c                 |  11 +-
 tools/perf/util/symbol.h                 |   1 +
 18 files changed, 855 insertions(+), 155 deletions(-)

-- 
1.7.11.7



* [PATCH 01/21] perf tools: Introduce struct add_entry_iter
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 14:55   ` Jiri Olsa
                     ` (2 more replies)
  2013-12-24  8:22 ` [PATCH 02/21] perf hists: Convert hist entry functions to use struct he_stat Namhyung Kim
                   ` (19 subsequent siblings)
  20 siblings, 3 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos,
	Stephane Eranian

From: Namhyung Kim <namhyung.kim@lge.com>

There is some duplicated code when adding hist entries.  The paths
differ in that some have branch info or mem info, but they generally do
the same thing.  So introduce a new struct add_entry_iter and add
callbacks to customize each case in a general way.

The new perf_evsel__add_entry() function will look like:

  iter->prepare_entry();
  iter->add_single_entry();

  while (iter->next_entry())
    iter->add_next_entry();

  iter->finish_entry();

This will help further work like the cumulative callchain patchset.

Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 368 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 278 insertions(+), 90 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index bf8dd2e893e4..ff3f0d0a800f 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -93,29 +93,74 @@ static int hist_entry__append_callchain(struct hist_entry *he, struct perf_sampl
 	return callchain_append(he->callchain, &callchain_cursor, sample->period);
 }
 
-static int report__add_mem_hist_entry(struct perf_tool *tool, struct addr_location *al,
-				      struct perf_sample *sample, struct perf_evsel *evsel,
-				      union perf_event *event)
-{
-	struct report *rep = container_of(tool, struct report, tool);
-	struct symbol *parent = NULL;
-	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+struct add_entry_iter {
+	int total;
+	int curr;
+
+	struct report *rep;
+	struct perf_evsel *evsel;
+	struct perf_sample *sample;
 	struct hist_entry *he;
-	struct mem_info *mi, *mx;
-	uint64_t cost;
-	int err = report__resolve_callchain(rep, &parent, evsel, al, sample);
+	struct symbol *parent;
+	void *priv;
+
+	int (*prepare_entry)(struct add_entry_iter *, struct machine *,
+			     struct perf_evsel *, struct addr_location *,
+			     struct perf_sample *);
+	int (*add_single_entry)(struct add_entry_iter *, struct addr_location *);
+	int (*next_entry)(struct add_entry_iter *, struct addr_location *);
+	int (*add_next_entry)(struct add_entry_iter *, struct addr_location *);
+	int (*finish_entry)(struct add_entry_iter *, struct addr_location *);
+};
 
-	if (err)
-		return err;
+static int
+iter_next_nop_entry(struct add_entry_iter *iter __maybe_unused,
+		    struct addr_location *al __maybe_unused)
+{
+	return 0;
+}
+
+static int
+iter_add_next_nop_entry(struct add_entry_iter *iter __maybe_unused,
+			struct addr_location *al __maybe_unused)
+{
+	return 0;
+}
+
+static int
+iter_prepare_mem_entry(struct add_entry_iter *iter, struct machine *machine,
+		       struct perf_evsel *evsel, struct addr_location *al,
+		       struct perf_sample *sample)
+{
+	union perf_event *event = iter->priv;
+	struct mem_info *mi;
+	u8 cpumode;
+
+	BUG_ON(event == NULL);
 
-	mi = machine__resolve_mem(al->machine, al->thread, sample, cpumode);
-	if (!mi)
+	cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+
+	mi = machine__resolve_mem(machine, al->thread, sample, cpumode);
+	if (mi == NULL)
 		return -ENOMEM;
 
-	if (rep->hide_unresolved && !al->sym)
+	iter->evsel = evsel;
+	iter->sample = sample;
+	iter->priv = mi;
+	return 0;
+}
+
+static int
+iter_add_single_mem_entry(struct add_entry_iter *iter, struct addr_location *al)
+{
+	u64 cost;
+	struct mem_info *mi = iter->priv;
+	struct hist_entry *he;
+
+	if (iter->rep->hide_unresolved && !al->sym)
 		return 0;
 
-	cost = sample->weight;
+	cost = iter->sample->weight;
 	if (!cost)
 		cost = 1;
 
@@ -126,11 +171,27 @@ static int report__add_mem_hist_entry(struct perf_tool *tool, struct addr_locati
 	 * and this is indirectly achieved by passing period=weight here
 	 * and the he_stat__add_period() function.
 	 */
-	he = __hists__add_entry(&evsel->hists, al, parent, NULL, mi,
+	he = __hists__add_entry(&iter->evsel->hists, al, iter->parent, NULL, mi,
 				cost, cost, 0);
 	if (!he)
 		return -ENOMEM;
 
+	iter->he = he;
+	return 0;
+}
+
+static int
+iter_finish_mem_entry(struct add_entry_iter *iter, struct addr_location *al)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct hist_entry *he = iter->he;
+	struct mem_info *mx;
+	int err = -ENOMEM;
+	u64 cost;
+
+	if (he == NULL)
+		return 0;
+
 	err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
 	if (err)
 		goto out;
@@ -140,97 +201,223 @@ static int report__add_mem_hist_entry(struct perf_tool *tool, struct addr_locati
 	if (err)
 		goto out;
 
+	cost = iter->sample->weight;
+	if (!cost)
+		cost = 1;
+
 	evsel->hists.stats.total_period += cost;
 	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
-	err = hist_entry__append_callchain(he, sample);
+
+	err = hist_entry__append_callchain(he, iter->sample);
+
 out:
+	iter->he = NULL;
 	return err;
 }
 
-static int report__add_branch_hist_entry(struct perf_tool *tool, struct addr_location *al,
-					 struct perf_sample *sample, struct perf_evsel *evsel)
+static int
+iter_prepare_branch_entry(struct add_entry_iter *iter, struct machine *machine,
+			  struct perf_evsel *evsel, struct addr_location *al,
+			  struct perf_sample *sample)
 {
-	struct report *rep = container_of(tool, struct report, tool);
-	struct symbol *parent = NULL;
-	unsigned i;
-	struct hist_entry *he;
-	struct branch_info *bi, *bx;
-	int err = report__resolve_callchain(rep, &parent, evsel, al, sample);
+	struct branch_info *bi;
 
-	if (err)
-		return err;
-
-	bi = machine__resolve_bstack(al->machine, al->thread,
+	bi = machine__resolve_bstack(machine, al->thread,
 				     sample->branch_stack);
 	if (!bi)
 		return -ENOMEM;
 
-	for (i = 0; i < sample->branch_stack->nr; i++) {
-		if (rep->hide_unresolved && !(bi[i].from.sym && bi[i].to.sym))
-			continue;
+	iter->curr = 0;
+	iter->total = sample->branch_stack->nr;
 
-		err = -ENOMEM;
+	iter->evsel = evsel;
+	iter->sample = sample;
+	iter->priv = bi;
+	return 0;
+}
 
-		/* overwrite the 'al' to branch-to info */
-		al->map = bi[i].to.map;
-		al->sym = bi[i].to.sym;
-		al->addr = bi[i].to.addr;
-		/*
-		 * The report shows the percentage of total branches captured
-		 * and not events sampled. Thus we use a pseudo period of 1.
-		 */
-		he = __hists__add_entry(&evsel->hists, al, parent, &bi[i], NULL,
-					1, 1, 0);
-		if (he) {
-			bx = he->branch_info;
-			err = addr_map_symbol__inc_samples(&bx->from, evsel->idx);
-			if (err)
-				goto out;
-
-			err = addr_map_symbol__inc_samples(&bx->to, evsel->idx);
-			if (err)
-				goto out;
-
-			evsel->hists.stats.total_period += 1;
-			hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
-		} else
-			goto out;
-	}
-	err = 0;
-out:
-	free(bi);
-	return err;
+static int
+iter_add_single_branch_entry(struct add_entry_iter *iter __maybe_unused,
+			     struct addr_location *al __maybe_unused)
+{
+	return 0;
 }
 
-static int report__add_hist_entry(struct perf_tool *tool, struct perf_evsel *evsel,
-				  struct addr_location *al, struct perf_sample *sample)
+static int
+iter_next_branch_entry(struct add_entry_iter *iter, struct addr_location *al)
 {
-	struct report *rep = container_of(tool, struct report, tool);
-	struct symbol *parent = NULL;
+	struct branch_info *bi = iter->priv;
+	int i = iter->curr;
+
+	if (iter->curr >= iter->total)
+		return 0;
+
+	al->map = bi[i].to.map;
+	al->sym = bi[i].to.sym;
+	al->addr = bi[i].to.addr;
+	return 1;
+}
+
+static int
+iter_add_next_branch_entry(struct add_entry_iter *iter, struct addr_location *al)
+{
+	struct branch_info *bi, *bx;
+	struct perf_evsel *evsel = iter->evsel;
 	struct hist_entry *he;
-	int err = report__resolve_callchain(rep, &parent, evsel, al, sample);
+	int i = iter->curr;
+	int err = 0;
+
+	bi = iter->priv;
 
+	if (iter->rep->hide_unresolved && !(bi[i].from.sym && bi[i].to.sym))
+		goto out;
+
+	/*
+	 * The report shows the percentage of total branches captured
+	 * and not events sampled. Thus we use a pseudo period of 1.
+	 */
+	he = __hists__add_entry(&evsel->hists, al, iter->parent, &bi[i], NULL,
+				1, 1, 0);
+	if (he == NULL)
+		return -ENOMEM;
+
+	bx = he->branch_info;
+	err = addr_map_symbol__inc_samples(&bx->from, evsel->idx);
 	if (err)
-		return err;
+		goto out;
+
+	err = addr_map_symbol__inc_samples(&bx->to, evsel->idx);
+	if (err)
+		goto out;
+
+	evsel->hists.stats.total_period += 1;
+	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
+
+out:
+	iter->curr++;
+	return err;
+}
+
+static int
+iter_finish_branch_entry(struct add_entry_iter *iter,
+			 struct addr_location *al __maybe_unused)
+{
+	free(iter->priv);
+	iter->priv = NULL;
+
+	return iter->curr >= iter->total ? 0 : -1;
+}
 
-	he = __hists__add_entry(&evsel->hists, al, parent, NULL, NULL,
+static int
+iter_prepare_normal_entry(struct add_entry_iter *iter,
+			  struct machine *machine __maybe_unused,
+			  struct perf_evsel *evsel,
+			  struct addr_location *al __maybe_unused,
+			  struct perf_sample *sample)
+{
+	iter->evsel = evsel;
+	iter->sample = sample;
+	return 0;
+}
+
+static int
+iter_add_single_normal_entry(struct add_entry_iter *iter, struct addr_location *al)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+	struct hist_entry *he;
+
+	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
 				sample->transaction);
 	if (he == NULL)
 		return -ENOMEM;
 
-	err = hist_entry__append_callchain(he, sample);
-	if (err)
-		goto out;
+	iter->he = he;
+	return 0;
+}
+
+static int
+iter_finish_normal_entry(struct add_entry_iter *iter, struct addr_location *al)
+{
+	int err;
+	struct hist_entry *he = iter->he;
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+
+	if (he == NULL)
+		return 0;
+
+	iter->he = NULL;
 
 	err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+	if (err)
+		return err;
+
 	evsel->hists.stats.total_period += sample->period;
 	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
+
+	return hist_entry__append_callchain(he, sample);
+}
+
+static struct add_entry_iter mem_iter = {
+	.prepare_entry 		= iter_prepare_mem_entry,
+	.add_single_entry 	= iter_add_single_mem_entry,
+	.next_entry 		= iter_next_nop_entry,
+	.add_next_entry 	= iter_add_next_nop_entry,
+	.finish_entry 		= iter_finish_mem_entry,
+};
+
+static struct add_entry_iter branch_iter = {
+	.prepare_entry 		= iter_prepare_branch_entry,
+	.add_single_entry 	= iter_add_single_branch_entry,
+	.next_entry 		= iter_next_branch_entry,
+	.add_next_entry 	= iter_add_next_branch_entry,
+	.finish_entry 		= iter_finish_branch_entry,
+};
+
+static struct add_entry_iter normal_iter = {
+	.prepare_entry 		= iter_prepare_normal_entry,
+	.add_single_entry 	= iter_add_single_normal_entry,
+	.next_entry 		= iter_next_nop_entry,
+	.add_next_entry 	= iter_add_next_nop_entry,
+	.finish_entry 		= iter_finish_normal_entry,
+};
+
+static int
+perf_evsel__add_entry(struct perf_evsel *evsel, struct addr_location *al,
+		      struct perf_sample *sample, struct machine *machine,
+		      struct add_entry_iter *iter)
+{
+	int err, err2;
+
+	err = report__resolve_callchain(iter->rep, &iter->parent, evsel, al,
+					sample);
+	if (err)
+		return err;
+
+	err = iter->prepare_entry(iter, machine, evsel, al, sample);
+	if (err)
+		goto out;
+
+	err = iter->add_single_entry(iter, al);
+	if (err)
+		goto out;
+
+	while (iter->next_entry(iter, al)) {
+		err = iter->add_next_entry(iter, al);
+		if (err)
+			break;
+	}
+
 out:
+	err2 = iter->finish_entry(iter, al);
+	if (!err)
+		err = err2;
+
 	return err;
 }
 
-
 static int process_sample_event(struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_sample *sample,
@@ -239,6 +426,7 @@ static int process_sample_event(struct perf_tool *tool,
 {
 	struct report *rep = container_of(tool, struct report, tool);
 	struct addr_location al;
+	struct add_entry_iter *iter;
 	int ret;
 
 	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
@@ -253,22 +441,22 @@ static int process_sample_event(struct perf_tool *tool,
 	if (rep->cpu_list && !test_bit(sample->cpu, rep->cpu_bitmap))
 		return 0;
 
-	if (sort__mode == SORT_MODE__BRANCH) {
-		ret = report__add_branch_hist_entry(tool, &al, sample, evsel);
-		if (ret < 0)
-			pr_debug("problem adding lbr entry, skipping event\n");
-	} else if (rep->mem_mode == 1) {
-		ret = report__add_mem_hist_entry(tool, &al, sample, evsel, event);
-		if (ret < 0)
-			pr_debug("problem adding mem entry, skipping event\n");
-	} else {
-		if (al.map != NULL)
-			al.map->dso->hit = 1;
-
-		ret = report__add_hist_entry(tool, evsel, &al, sample);
-		if (ret < 0)
-			pr_debug("problem incrementing symbol period, skipping event\n");
-	}
+	if (sort__mode == SORT_MODE__BRANCH)
+		iter = &branch_iter;
+	else if (rep->mem_mode == 1) {
+		iter = &mem_iter;
+		iter->priv = event;
+	} else
+		iter = &normal_iter;
+
+	if (al.map != NULL)
+		al.map->dso->hit = 1;
+
+	iter->rep = rep;
+	ret = perf_evsel__add_entry(evsel, &al, sample, machine, iter);
+	if (ret < 0)
+		pr_debug("problem adding hist entry, skipping event\n");
+
 	return ret;
 }
 
-- 
1.7.11.7



* [PATCH 02/21] perf hists: Convert hist entry functions to use struct he_stat
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
  2013-12-24  8:22 ` [PATCH 01/21] perf tools: Introduce struct add_entry_iter Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 16:09   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry Namhyung Kim
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

hist_entry__add_cpumode_period() and hist_entry__decay() deal with a
hist_entry's stat fields only.  So make them take the struct he_stat
directly.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/hist.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6cd4823a7a8b..283aa1972b4f 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -181,21 +181,21 @@ void hists__output_recalc_col_len(struct hists *hists, int max_rows)
 	}
 }
 
-static void hist_entry__add_cpumode_period(struct hist_entry *he,
-					   unsigned int cpumode, u64 period)
+static void he_stat__add_cpumode_period(struct he_stat *he_stat,
+					unsigned int cpumode, u64 period)
 {
 	switch (cpumode) {
 	case PERF_RECORD_MISC_KERNEL:
-		he->stat.period_sys += period;
+		he_stat->period_sys += period;
 		break;
 	case PERF_RECORD_MISC_USER:
-		he->stat.period_us += period;
+		he_stat->period_us += period;
 		break;
 	case PERF_RECORD_MISC_GUEST_KERNEL:
-		he->stat.period_guest_sys += period;
+		he_stat->period_guest_sys += period;
 		break;
 	case PERF_RECORD_MISC_GUEST_USER:
-		he->stat.period_guest_us += period;
+		he_stat->period_guest_us += period;
 		break;
 	default:
 		break;
@@ -222,10 +222,10 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
 	dest->weight		+= src->weight;
 }
 
-static void hist_entry__decay(struct hist_entry *he)
+static void he_stat__decay(struct he_stat *he_stat)
 {
-	he->stat.period = (he->stat.period * 7) / 8;
-	he->stat.nr_events = (he->stat.nr_events * 7) / 8;
+	he_stat->period = (he_stat->period * 7) / 8;
+	he_stat->nr_events = (he_stat->nr_events * 7) / 8;
 	/* XXX need decay for weight too? */
 }
 
@@ -236,7 +236,7 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
 	if (prev_period == 0)
 		return true;
 
-	hist_entry__decay(he);
+	he_stat__decay(&he->stat);
 
 	if (!he->filtered)
 		hists->stats.total_period -= prev_period - he->stat.period;
@@ -402,7 +402,7 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 	rb_link_node(&he->rb_node_in, parent, p);
 	rb_insert_color(&he->rb_node_in, hists->entries_in);
 out:
-	hist_entry__add_cpumode_period(he, al->cpumode, period);
+	he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
 	return he;
 }
 
-- 
1.7.11.7



* [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
  2013-12-24  8:22 ` [PATCH 01/21] perf tools: Introduce struct add_entry_iter Namhyung Kim
  2013-12-24  8:22 ` [PATCH 02/21] perf hists: Convert hist entry functions to use struct he_stat Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 16:16   ` Jiri Olsa
                     ` (2 more replies)
  2013-12-24  8:22 ` [PATCH 04/21] perf hists: Check if accumulated when adding a " Namhyung Kim
                   ` (17 subsequent siblings)
  20 siblings, 3 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Maintain accumulated stat information in hist_entry->stat_acc if
symbol_conf.cumulate_callchain is set.  Fields in ->stat_acc have the
same values initially, and will be updated as the callchain is
processed later.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/hist.c   | 18 ++++++++++++++++++
 tools/perf/util/sort.h   |  1 +
 tools/perf/util/symbol.h |  1 +
 3 files changed, 20 insertions(+)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 283aa1972b4f..b61e2fa42412 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -237,6 +237,8 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
 		return true;
 
 	he_stat__decay(&he->stat);
+	if (symbol_conf.cumulate_callchain)
+		he_stat__decay(he->stat_acc);
 
 	if (!he->filtered)
 		hists->stats.total_period -= prev_period - he->stat.period;
@@ -284,6 +286,15 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
 	if (he != NULL) {
 		*he = *template;
 
+		if (symbol_conf.cumulate_callchain) {
+			he->stat_acc = malloc(sizeof(he->stat));
+			if (he->stat_acc == NULL) {
+				free(he);
+				return NULL;
+			}
+			memcpy(he->stat_acc, &he->stat, sizeof(he->stat));
+		}
+
 		if (he->ms.map)
 			he->ms.map->referenced = true;
 
@@ -295,6 +306,7 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
 			 */
 			he->branch_info = malloc(sizeof(*he->branch_info));
 			if (he->branch_info == NULL) {
+				free(he->stat_acc);
 				free(he);
 				return NULL;
 			}
@@ -367,6 +379,8 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 
 		if (!cmp) {
 			he_stat__add_period(&he->stat, period, weight);
+			if (symbol_conf.cumulate_callchain)
+				he_stat__add_period(he->stat_acc, period, weight);
 
 			/*
 			 * This mem info was allocated from machine__resolve_mem
@@ -403,6 +417,8 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 	rb_insert_color(&he->rb_node_in, hists->entries_in);
 out:
 	he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
+	if (symbol_conf.cumulate_callchain)
+		he_stat__add_cpumode_period(he->stat_acc, al->cpumode, period);
 	return he;
 }
 
@@ -502,6 +518,8 @@ static bool hists__collapse_insert_entry(struct hists *hists __maybe_unused,
 
 		if (!cmp) {
 			he_stat__add_stat(&iter->stat, &he->stat);
+			if (symbol_conf.cumulate_callchain)
+				he_stat__add_stat(iter->stat_acc, he->stat_acc);
 
 			if (symbol_conf.use_callchain) {
 				callchain_cursor_reset(&callchain_cursor);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 43e5ff42a609..309f2838a1b4 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -81,6 +81,7 @@ struct hist_entry {
 		struct list_head head;
 	} pairs;
 	struct he_stat		stat;
+	struct he_stat		*stat_acc;
 	struct map_symbol	ms;
 	struct thread		*thread;
 	struct comm		*comm;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index cbd680361806..ceee088988a0 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -92,6 +92,7 @@ struct symbol_conf {
 			show_nr_samples,
 			show_total_period,
 			use_callchain,
+			cumulate_callchain,
 			exclude_other,
 			show_cpu_utilization,
 			initialized,
-- 
1.7.11.7



* [PATCH 04/21] perf hists: Check if accumulated when adding a hist entry
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (2 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2013-12-24  8:22 ` [PATCH 05/21] perf hists: Accumulate hist entry stat based on the callchain Namhyung Kim
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

To support callchain accumulation, add_hist_entry() needs to know
whether @entry is an accumulated one or not when it is called.  The
period of an accumulated entry should be added to ->stat_acc but not to
->stat.  Add a @sample_self argument for that.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-annotate.c |  3 ++-
 tools/perf/builtin-diff.c     |  2 +-
 tools/perf/builtin-report.c   |  6 +++---
 tools/perf/builtin-top.c      |  2 +-
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c        | 23 +++++++++++++++--------
 tools/perf/util/hist.h        |  3 ++-
 7 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 4136f9970fd5..ab8efe552a20 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -65,7 +65,8 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 		return 0;
 	}
 
-	he = __hists__add_entry(&evsel->hists, al, NULL, NULL, NULL, 1, 1, 0);
+	he = __hists__add_entry(&evsel->hists, al, NULL, NULL, NULL, 1, 1, 0,
+				true);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 2a85cc9a2d09..4dbc14c33ab9 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -308,7 +308,7 @@ static int hists__add_entry(struct hists *hists,
 			    u64 weight, u64 transaction)
 {
 	if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
-			       transaction) != NULL)
+			       transaction, true) != NULL)
 		return 0;
 	return -ENOMEM;
 }
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ff3f0d0a800f..29fe19071d24 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -172,7 +172,7 @@ iter_add_single_mem_entry(struct add_entry_iter *iter, struct addr_location *al)
 	 * and the he_stat__add_period() function.
 	 */
 	he = __hists__add_entry(&iter->evsel->hists, al, iter->parent, NULL, mi,
-				cost, cost, 0);
+				cost, cost, 0, true);
 	if (!he)
 		return -ENOMEM;
 
@@ -277,7 +277,7 @@ iter_add_next_branch_entry(struct add_entry_iter *iter, struct addr_location *al
 	 * and not events sampled. Thus we use a pseudo period of 1.
 	 */
 	he = __hists__add_entry(&evsel->hists, al, iter->parent, &bi[i], NULL,
-				1, 1, 0);
+				1, 1, 0, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -329,7 +329,7 @@ iter_add_single_normal_entry(struct add_entry_iter *iter, struct addr_location *
 
 	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction);
+				sample->transaction, true);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 172e91a9ce62..142a41bd5062 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -245,7 +245,7 @@ static struct hist_entry *perf_evsel__add_hist_entry(struct perf_evsel *evsel,
 	pthread_mutex_lock(&evsel->hists.lock);
 	he = __hists__add_entry(&evsel->hists, al, NULL, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction);
+				sample->transaction, true);
 	pthread_mutex_unlock(&evsel->hists.lock);
 	if (he == NULL)
 		return NULL;
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 173bf42cc03e..b829c2a1a598 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -223,7 +223,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(&evsel->hists, &al, NULL,
-						NULL, NULL, 1, 1, 0);
+						NULL, NULL, 1, 1, 0, true);
 			if (he == NULL)
 				goto out;
 
@@ -246,7 +246,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(&evsel->hists, &al, NULL,
-						NULL, NULL, 1, 1, 0);
+						NULL, NULL, 1, 1, 0, true);
 			if (he == NULL)
 				goto out;
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b61e2fa42412..9a522a05b937 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -278,7 +278,8 @@ void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel)
  * histogram, sorted on item, collects periods
  */
 
-static struct hist_entry *hist_entry__new(struct hist_entry *template)
+static struct hist_entry *hist_entry__new(struct hist_entry *template,
+					  bool sample_self)
 {
 	size_t callchain_size = symbol_conf.use_callchain ? sizeof(struct callchain_root) : 0;
 	struct hist_entry *he = zalloc(sizeof(*he) + callchain_size);
@@ -293,6 +294,8 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
 				return NULL;
 			}
 			memcpy(he->stat_acc, &he->stat, sizeof(he->stat));
+			if (!sample_self)
+				memset(&he->stat, 0, sizeof(he->stat));
 		}
 
 		if (he->ms.map)
@@ -354,7 +357,8 @@ static u8 symbol__parent_filter(const struct symbol *parent)
 
 static struct hist_entry *add_hist_entry(struct hists *hists,
 					 struct hist_entry *entry,
-					 struct addr_location *al)
+					 struct addr_location *al,
+					 bool sample_self)
 {
 	struct rb_node **p;
 	struct rb_node *parent = NULL;
@@ -378,7 +382,8 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 		cmp = hist_entry__cmp(he, entry);
 
 		if (!cmp) {
-			he_stat__add_period(&he->stat, period, weight);
+			if (sample_self)
+				he_stat__add_period(&he->stat, period, weight);
 			if (symbol_conf.cumulate_callchain)
 				he_stat__add_period(he->stat_acc, period, weight);
 
@@ -408,7 +413,7 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 			p = &(*p)->rb_right;
 	}
 
-	he = hist_entry__new(entry);
+	he = hist_entry__new(entry, sample_self);
 	if (!he)
 		return NULL;
 
@@ -416,7 +421,8 @@ static struct hist_entry *add_hist_entry(struct hists *hists,
 	rb_link_node(&he->rb_node_in, parent, p);
 	rb_insert_color(&he->rb_node_in, hists->entries_in);
 out:
-	he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
+	if (sample_self)
+		he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
 	if (symbol_conf.cumulate_callchain)
 		he_stat__add_cpumode_period(he->stat_acc, al->cpumode, period);
 	return he;
@@ -427,7 +433,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct symbol *sym_parent,
 				      struct branch_info *bi,
 				      struct mem_info *mi,
-				      u64 period, u64 weight, u64 transaction)
+				      u64 period, u64 weight, u64 transaction,
+				      bool sample_self)
 {
 	struct hist_entry entry = {
 		.thread	= al->thread,
@@ -452,7 +459,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 		.transaction = transaction,
 	};
 
-	return add_hist_entry(hists, &entry, al);
+	return add_hist_entry(hists, &entry, al, sample_self);
 }
 
 int64_t
@@ -866,7 +873,7 @@ static struct hist_entry *hists__add_dummy_entry(struct hists *hists,
 			p = &(*p)->rb_right;
 	}
 
-	he = hist_entry__new(pair);
+	he = hist_entry__new(pair, true);
 	if (he) {
 		memset(&he->stat, 0, sizeof(he->stat));
 		he->hists = hists;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index a59743fa3ef7..eccde6c42f5c 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -89,7 +89,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct symbol *parent,
 				      struct branch_info *bi,
 				      struct mem_info *mi, u64 period,
-				      u64 weight, u64 transaction);
+				      u64 weight, u64 transaction,
+				      bool sample_self);
 int64_t hist_entry__cmp(struct hist_entry *left, struct hist_entry *right);
 int64_t hist_entry__collapse(struct hist_entry *left, struct hist_entry *right);
 int hist_entry__transaction_len(void);
-- 
1.7.11.7



* [PATCH 05/21] perf hists: Accumulate hist entry stat based on the callchain
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (3 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 04/21] perf hists: Check if accumulated when adding a " Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 16:58   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 06/21] perf tools: Update cpumode for each cumulative entry Namhyung Kim
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Call __hists__add_entry() for each callchain node to get an
accumulated stat for the entry.  Introduce a new cumulative_iter ops to
process them properly.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 103 +++++++++++++++++++++++++++++++++++++++++++-
 tools/perf/ui/stdio/hist.c  |   2 +-
 2 files changed, 103 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 29fe19071d24..4fde0ab82498 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -360,6 +360,97 @@ iter_finish_normal_entry(struct add_entry_iter *iter, struct addr_location *al)
 	return hist_entry__append_callchain(he, sample);
 }
 
+static int
+iter_prepare_cumulative_entry(struct add_entry_iter *iter,
+			      struct machine *machine __maybe_unused,
+			      struct perf_evsel *evsel,
+			      struct addr_location *al __maybe_unused,
+			      struct perf_sample *sample)
+{
+	callchain_cursor_commit(&callchain_cursor);
+
+	iter->evsel = evsel;
+	iter->sample = sample;
+	return 0;
+}
+
+static int
+iter_add_single_cumulative_entry(struct add_entry_iter *iter,
+				 struct addr_location *al)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+	struct hist_entry *he;
+
+	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
+				sample->period, sample->weight,
+				sample->transaction, true);
+	if (he == NULL)
+		return -ENOMEM;
+
+	/*
+	 * This is for putting parents upward during output resort iff
+	 * only a child gets sampled.  See hist_entry__sort_on_period().
+	 */
+	he->callchain->max_depth = PERF_MAX_STACK_DEPTH + 1;
+
+	return hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+}
+
+static int
+iter_next_cumulative_entry(struct add_entry_iter *iter,
+			   struct addr_location *al)
+{
+	struct callchain_cursor_node *node;
+
+	node = callchain_cursor_current(&callchain_cursor);
+	if (node == NULL)
+		return 0;
+
+	al->map = node->map;
+	al->sym = node->sym;
+	if (node->map)
+		al->addr = node->map->map_ip(node->map, node->ip);
+	else
+		al->addr = node->ip;
+
+	if (iter->rep->hide_unresolved && al->sym == NULL)
+		return 0;
+
+	callchain_cursor_advance(&callchain_cursor);
+	return 1;
+}
+
+static int
+iter_add_next_cumulative_entry(struct add_entry_iter *iter,
+			       struct addr_location *al)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+	struct hist_entry *he;
+
+	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
+				sample->period, sample->weight,
+				sample->transaction, false);
+	if (he == NULL)
+		return -ENOMEM;
+
+	return hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
+}
+
+static int
+iter_finish_cumulative_entry(struct add_entry_iter *iter,
+			     struct addr_location *al __maybe_unused)
+{
+	struct perf_evsel *evsel = iter->evsel;
+	struct perf_sample *sample = iter->sample;
+
+	evsel->hists.stats.total_period += sample->period;
+	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
+
+	return 0;
+}
+
 static struct add_entry_iter mem_iter = {
 	.prepare_entry 		= iter_prepare_mem_entry,
 	.add_single_entry 	= iter_add_single_mem_entry,
@@ -384,6 +475,14 @@ static struct add_entry_iter normal_iter = {
 	.finish_entry 		= iter_finish_normal_entry,
 };
 
+static struct add_entry_iter cumulative_iter = {
+	.prepare_entry 		= iter_prepare_cumulative_entry,
+	.add_single_entry 	= iter_add_single_cumulative_entry,
+	.next_entry 		= iter_next_cumulative_entry,
+	.add_next_entry 	= iter_add_next_cumulative_entry,
+	.finish_entry 		= iter_finish_cumulative_entry,
+};
+
 static int
 perf_evsel__add_entry(struct perf_evsel *evsel, struct addr_location *al,
 		      struct perf_sample *sample, struct machine *machine,
@@ -446,7 +545,9 @@ static int process_sample_event(struct perf_tool *tool,
 	else if (rep->mem_mode == 1) {
 		iter = &mem_iter;
 		iter->priv = event;
-	} else
+	} else if (symbol_conf.cumulate_callchain)
+		iter = &cumulative_iter;
+	else
 		iter = &normal_iter;
 
 	if (al.map != NULL)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index c244cb524ef2..4c4986e809d8 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -364,7 +364,7 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 
 	ret = fprintf(fp, "%s\n", bf);
 
-	if (symbol_conf.use_callchain)
+	if (symbol_conf.use_callchain && !symbol_conf.cumulate_callchain)
 		ret += hist_entry__callchain_fprintf(he, hists, fp);
 
 	return ret;
-- 
1.7.11.7



* [PATCH 06/21] perf tools: Update cpumode for each cumulative entry
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (4 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 05/21] perf hists: Accumulate hist entry stat based on the callchain Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 17:02   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 07/21] perf report: Cache cumulative callchains Namhyung Kim
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

The cpumode and level in struct addr_location were set for the sample
but not updated as cumulative callchain entries were added.  This led
to non-matching symbol and cpumode in the output.

Update them based on whether the map is a part of the kernel or not.
This is the reverse of what thread__find_addr_map() does.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 4fde0ab82498..17c41c686042 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -100,6 +100,7 @@ struct add_entry_iter {
 	struct report *rep;
 	struct perf_evsel *evsel;
 	struct perf_sample *sample;
+	struct machine *machine;
 	struct hist_entry *he;
 	struct symbol *parent;
 	void *priv;
@@ -362,7 +363,7 @@ iter_finish_normal_entry(struct add_entry_iter *iter, struct addr_location *al)
 
 static int
 iter_prepare_cumulative_entry(struct add_entry_iter *iter,
-			      struct machine *machine __maybe_unused,
+			      struct machine *machine,
 			      struct perf_evsel *evsel,
 			      struct addr_location *al __maybe_unused,
 			      struct perf_sample *sample)
@@ -371,6 +372,7 @@ iter_prepare_cumulative_entry(struct add_entry_iter *iter,
 
 	iter->evsel = evsel;
 	iter->sample = sample;
+	iter->machine = machine;
 	return 0;
 }
 
@@ -414,9 +416,35 @@ iter_next_cumulative_entry(struct add_entry_iter *iter,
 	else
 		al->addr = node->ip;
 
-	if (iter->rep->hide_unresolved && al->sym == NULL)
-		return 0;
+	if (al->sym == NULL) {
+		if (iter->rep->hide_unresolved)
+			return 0;
+		if (al->map == NULL)
+			goto out;
+	}
+
+	if (al->map->groups == &iter->machine->kmaps) {
+		if (machine__is_host(iter->machine)) {
+			al->cpumode = PERF_RECORD_MISC_KERNEL;
+			al->level = 'k';
+		} else {
+			al->cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
+			al->level = 'g';
+		}
+	} else {
+		if (machine__is_host(iter->machine)) {
+			al->cpumode = PERF_RECORD_MISC_USER;
+			al->level = '.';
+		} else if (perf_guest) {
+			al->cpumode = PERF_RECORD_MISC_GUEST_USER;
+			al->level = 'u';
+		} else {
+			al->cpumode = PERF_RECORD_MISC_HYPERVISOR;
+			al->level = 'H';
+		}
+	}
 
+out:
 	callchain_cursor_advance(&callchain_cursor);
 	return 1;
 }
-- 
1.7.11.7



* [PATCH 07/21] perf report: Cache cumulative callchains
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (5 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 06/21] perf tools: Update cpumode for each cumulative entry Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 17:10   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 08/21] perf hists: Sort hist entries by accumulated period Namhyung Kim
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

It is possible that a callchain has cycles or recursive calls.  In
that case it'll end up with entries having more than 100% overhead in
the output.  In order to prevent such entries, cache each callchain
node and skip it if the same entry was already accumulated.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 48 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 17c41c686042..1314841ffad1 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -368,8 +368,27 @@ iter_prepare_cumulative_entry(struct add_entry_iter *iter,
 			      struct addr_location *al __maybe_unused,
 			      struct perf_sample *sample)
 {
+	struct callchain_cursor_node *node;
+	struct hist_entry **he_cache;
+
 	callchain_cursor_commit(&callchain_cursor);
 
+	/*
+	 * This is for detecting cycles or recursions so that they're
+	 * cumulated only one time to prevent entries more than 100%
+	 * overhead.
+	 */
+	he_cache = malloc(sizeof(*he_cache) * (PERF_MAX_STACK_DEPTH + 1));
+	if (he_cache == NULL)
+		return -ENOMEM;
+
+	iter->priv = he_cache;
+	iter->curr = 0;
+
+	node = callchain_cursor_current(&callchain_cursor);
+	if (node == NULL)
+		return 0;
+
 	iter->evsel = evsel;
 	iter->sample = sample;
 	iter->machine = machine;
@@ -382,6 +401,7 @@ iter_add_single_cumulative_entry(struct add_entry_iter *iter,
 {
 	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
+	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
 
 	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
@@ -390,6 +410,8 @@ iter_add_single_cumulative_entry(struct add_entry_iter *iter,
 	if (he == NULL)
 		return -ENOMEM;
 
+	he_cache[iter->curr++] = he;
+
 	/*
 	 * This is for putting parents upward during output resort iff
 	 * only a child gets sampled.  See hist_entry__sort_on_period().
@@ -455,7 +477,29 @@ iter_add_next_cumulative_entry(struct add_entry_iter *iter,
 {
 	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
+	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
+	struct hist_entry he_tmp = {
+		.cpu = al->cpu,
+		.thread = al->thread,
+		.comm = thread__comm(al->thread),
+		.ip = al->addr,
+		.ms = {
+			.map = al->map,
+			.sym = al->sym,
+		},
+		.parent = iter->parent,
+	};
+	int i;
+
+	/*
+	 * Check if there's duplicate entries in the callchain.
+	 * It's possible that it has cycles or recursive calls.
+	 */
+	for (i = 0; i < iter->curr; i++) {
+		if (hist_entry__cmp(he_cache[i], &he_tmp) == 0)
+			return 0;
+	}
 
 	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
@@ -463,6 +507,8 @@ iter_add_next_cumulative_entry(struct add_entry_iter *iter,
 	if (he == NULL)
 		return -ENOMEM;
 
+	he_cache[iter->curr++] = he;
+
 	return hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
 }
 
@@ -476,6 +522,8 @@ iter_finish_cumulative_entry(struct add_entry_iter *iter,
 	evsel->hists.stats.total_period += sample->period;
 	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
 
+	free(iter->priv);
+	iter->priv = NULL;
 	return 0;
 }
 
-- 
1.7.11.7



* [PATCH 08/21] perf hists: Sort hist entries by accumulated period
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (6 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 07/21] perf report: Cache cumulative callchains Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2013-12-24  8:22 ` [PATCH 09/21] perf ui/hist: Add support to accumulated hist stat Namhyung Kim
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

When callchain accumulation is requested, we need to sort the entries
by the accumulated period value.  When the accumulated periods of two
entries are the same (i.e. a single-path callchain), put the caller
above, since accumulation tends to put callers in a higher position for
an obvious reason.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c |  6 ++++++
 tools/perf/util/hist.c      | 12 ++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 1314841ffad1..ae298ae91a47 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -509,6 +509,12 @@ iter_add_next_cumulative_entry(struct add_entry_iter *iter,
 
 	he_cache[iter->curr++] = he;
 
+	/*
+	 * This is for putting parents upward during output resort iff
+	 * only a child gets sampled.  See hist_entry__sort_on_period().
+	 */
+	he->callchain->max_depth = callchain_cursor.nr - callchain_cursor.pos;
+
 	return hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
 }
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 9a522a05b937..34bbff92b3d7 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -625,6 +625,18 @@ static int hist_entry__sort_on_period(struct hist_entry *a,
 	struct hist_entry *pair;
 	u64 *periods_a, *periods_b;
 
+	if (symbol_conf.cumulate_callchain) {
+		/*
+		 * Put caller above callee when they have equal period.
+		 */
+		if (a->stat_acc->period != b->stat_acc->period)
+			return a->stat_acc->period > b->stat_acc->period ? 1 : -1;
+
+		if (a->callchain->max_depth != b->callchain->max_depth)
+			return a->callchain->max_depth < b->callchain->max_depth ?
+				1 : -1;
+	}
+
 	ret = period_cmp(a->stat.period, b->stat.period);
 	if (ret || !symbol_conf.event_group)
 		return ret;
-- 
1.7.11.7



* [PATCH 09/21] perf ui/hist: Add support to accumulated hist stat
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (7 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 08/21] perf hists: Sort hist entries by accumulated period Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 17:31   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 10/21] perf ui/browser: " Namhyung Kim
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Print accumulated stat of a hist entry if requested.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/hist.c   | 45 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/hist.h |  1 +
 2 files changed, 46 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 78f4c92e9b73..eb9c07bcdb01 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -129,6 +129,28 @@ static int hpp__entry_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,		\
 			  scnprintf, true);					\
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
+static u64 he_get_acc_##_field(struct hist_entry *he)				\
+{										\
+	return he->stat_acc->_field;						\
+}										\
+										\
+static int hpp__color_acc_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
+			      struct perf_hpp *hpp, struct hist_entry *he) 	\
+{										\
+	return __hpp__fmt(hpp, he, he_get_acc_##_field, " %6.2f%%",		\
+			  (hpp_snprint_fn)percent_color_snprintf, true);	\
+}
+
+#define __HPP_ENTRY_ACC_PERCENT_FN(_type, _field)				\
+static int hpp__entry_acc_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,	\
+			      struct perf_hpp *hpp, struct hist_entry *he) 	\
+{										\
+	const char *fmt = symbol_conf.field_sep ? " %.2f" : " %6.2f%%";		\
+	return __hpp__fmt(hpp, he, he_get_acc_##_field, fmt,			\
+			  scnprintf, true);					\
+}
+
 #define __HPP_ENTRY_RAW_FN(_type, _field)					\
 static u64 he_get_raw_##_field(struct hist_entry *he)				\
 {										\
@@ -148,17 +170,25 @@ __HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
 __HPP_COLOR_PERCENT_FN(_type, _field)					\
 __HPP_ENTRY_PERCENT_FN(_type, _field)
 
+#define HPP_PERCENT_ACC_FNS(_type, _str, _field, _min_width, _unit_width)\
+__HPP_HEADER_FN(_type, _str, _min_width, _unit_width)			\
+__HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
+__HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
+__HPP_ENTRY_ACC_PERCENT_FN(_type, _field)
+
 #define HPP_RAW_FNS(_type, _str, _field, _min_width, _unit_width)	\
 __HPP_HEADER_FN(_type, _str, _min_width, _unit_width)			\
 __HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
 __HPP_ENTRY_RAW_FN(_type, _field)
 
+__HPP_HEADER_FN(overhead_self, "Self", 8, 8)
 
 HPP_PERCENT_FNS(overhead, "Overhead", period, 8, 8)
 HPP_PERCENT_FNS(overhead_sys, "sys", period_sys, 8, 8)
 HPP_PERCENT_FNS(overhead_us, "usr", period_us, 8, 8)
 HPP_PERCENT_FNS(overhead_guest_sys, "guest sys", period_guest_sys, 9, 8)
 HPP_PERCENT_FNS(overhead_guest_us, "guest usr", period_guest_us, 9, 8)
+HPP_PERCENT_ACC_FNS(overhead_acc, "Children", period, 8, 8)
 
 HPP_RAW_FNS(samples, "Samples", nr_events, 12, 12)
 HPP_RAW_FNS(period, "Period", period, 12, 12)
@@ -171,6 +201,14 @@ HPP_RAW_FNS(period, "Period", period, 12, 12)
 		.entry	= hpp__entry_ ## _name		\
 	}
 
+#define HPP__COLOR_ACC_PRINT_FNS(_name)			\
+	{						\
+		.header	= hpp__header_ ## _name,	\
+		.width	= hpp__width_ ## _name,		\
+		.color	= hpp__color_acc_ ## _name,	\
+		.entry	= hpp__entry_acc_ ## _name	\
+	}
+
 #define HPP__PRINT_FNS(_name)				\
 	{						\
 		.header	= hpp__header_ ## _name,	\
@@ -184,6 +222,7 @@ struct perf_hpp_fmt perf_hpp__format[] = {
 	HPP__COLOR_PRINT_FNS(overhead_us),
 	HPP__COLOR_PRINT_FNS(overhead_guest_sys),
 	HPP__COLOR_PRINT_FNS(overhead_guest_us),
+	HPP__COLOR_ACC_PRINT_FNS(overhead_acc),
 	HPP__PRINT_FNS(samples),
 	HPP__PRINT_FNS(period)
 };
@@ -208,6 +247,12 @@ void perf_hpp__init(void)
 {
 	perf_hpp__column_enable(PERF_HPP__OVERHEAD);
 
+	if (symbol_conf.cumulate_callchain) {
+		perf_hpp__format[PERF_HPP__OVERHEAD].header =
+						hpp__header_overhead_self;
+		perf_hpp__column_enable(PERF_HPP__OVERHEAD_ACC);
+	}
+
 	if (symbol_conf.show_cpu_utilization) {
 		perf_hpp__column_enable(PERF_HPP__OVERHEAD_SYS);
 		perf_hpp__column_enable(PERF_HPP__OVERHEAD_US);
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index eccde6c42f5c..8cf8dca37e13 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -157,6 +157,7 @@ enum {
 	PERF_HPP__OVERHEAD_US,
 	PERF_HPP__OVERHEAD_GUEST_SYS,
 	PERF_HPP__OVERHEAD_GUEST_US,
+	PERF_HPP__OVERHEAD_ACC,
 	PERF_HPP__SAMPLES,
 	PERF_HPP__PERIOD,
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 10/21] perf ui/browser: Add support to accumulated hist stat
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (8 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 09/21] perf ui/hist: Add support to accumulated hist stat Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 17:33   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 11/21] perf ui/gtk: " Namhyung Kim
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Print accumulated stat of a hist entry if requested.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/browsers/hists.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index a440e03cd8c2..efa78894f70d 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -693,11 +693,26 @@ hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
 	return __hpp__color_fmt(hpp, he, __hpp_get_##_field, _cb);	\
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field, _cb)			\
+static u64 __hpp_get_acc_##_field(struct hist_entry *he)		\
+{									\
+	return he->stat_acc->_field;					\
+}									\
+									\
+static int								\
+hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
+				struct perf_hpp *hpp,			\
+				struct hist_entry *he)			\
+{									\
+	return __hpp__color_fmt(hpp, he, __hpp_get_acc_##_field, _cb);	\
+}
+
 __HPP_COLOR_PERCENT_FN(overhead, period, __hpp__color_callchain)
 __HPP_COLOR_PERCENT_FN(overhead_sys, period_sys, NULL)
 __HPP_COLOR_PERCENT_FN(overhead_us, period_us, NULL)
 __HPP_COLOR_PERCENT_FN(overhead_guest_sys, period_guest_sys, NULL)
 __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us, NULL)
+__HPP_COLOR_ACC_PERCENT_FN(overhead_acc, period, NULL)
 
 #undef __HPP_COLOR_PERCENT_FN
 
@@ -715,6 +730,8 @@ void hist_browser__init_hpp(void)
 				hist_browser__hpp_color_overhead_guest_sys;
 	perf_hpp__format[PERF_HPP__OVERHEAD_GUEST_US].color =
 				hist_browser__hpp_color_overhead_guest_us;
+	perf_hpp__format[PERF_HPP__OVERHEAD_ACC].color =
+				hist_browser__hpp_color_overhead_acc;
 }
 
 static int hist_browser__show_entry(struct hist_browser *browser,
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 11/21] perf ui/gtk: Add support to accumulated hist stat
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (9 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 10/21] perf ui/browser: " Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 17:35   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 12/21] perf tools: Apply percent-limit to cumulative percentage Namhyung Kim
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Print accumulated stat of a hist entry if requested.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/gtk/hists.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 2ca66cc1160f..70ed0d5e1b94 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -98,11 +98,25 @@ static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,
 	return __hpp__color_fmt(hpp, he, he_get_##_field);			\
 }
 
+#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
+static u64 he_get_acc_##_field(struct hist_entry *he)				\
+{										\
+	return he->stat_acc->_field;						\
+}										\
+										\
+static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
+				       struct perf_hpp *hpp,			\
+				       struct hist_entry *he)			\
+{										\
+	return __hpp__color_fmt(hpp, he, he_get_acc_##_field);			\
+}
+
 __HPP_COLOR_PERCENT_FN(overhead, period)
 __HPP_COLOR_PERCENT_FN(overhead_sys, period_sys)
 __HPP_COLOR_PERCENT_FN(overhead_us, period_us)
 __HPP_COLOR_PERCENT_FN(overhead_guest_sys, period_guest_sys)
 __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us)
+__HPP_COLOR_ACC_PERCENT_FN(overhead_acc, period)
 
 #undef __HPP_COLOR_PERCENT_FN
 
@@ -121,6 +135,8 @@ void perf_gtk__init_hpp(void)
 				perf_gtk__hpp_color_overhead_guest_sys;
 	perf_hpp__format[PERF_HPP__OVERHEAD_GUEST_US].color =
 				perf_gtk__hpp_color_overhead_guest_us;
+	perf_hpp__format[PERF_HPP__OVERHEAD_ACC].color =
+				perf_gtk__hpp_color_overhead_acc;
 }
 
 static void callchain_list__sym_name(struct callchain_list *cl,
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 12/21] perf tools: Apply percent-limit to cumulative percentage
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (10 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 11/21] perf ui/gtk: " Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 17:40   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 13/21] perf tools: Add more hpp helper functions Namhyung Kim
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

If callchain accumulation (--children) is enabled, entries which have
no self overhead also need to be shown.  So apply the percent-limit to
the accumulated overhead percentage in this case.
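
For example, with a minimum percent limit of 1%, a function that has
only 0.1% self overhead but 60% accumulated (children) overhead is
kept in the output when accumulation is active, and dropped otherwise
(numbers are illustrative).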

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/browsers/hists.c | 34 ++++++++++++++++++++++++++--------
 tools/perf/ui/gtk/hists.c      | 11 +++++++++--
 tools/perf/ui/stdio/hist.c     | 11 +++++++++--
 3 files changed, 44 insertions(+), 12 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index efa78894f70d..b02e71ecc5fe 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -828,12 +828,19 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
 
 	for (nd = browser->top; nd; nd = rb_next(nd)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-		float percent = h->stat.period * 100.0 /
-					hb->hists->stats.total_period;
+		float percent;
 
 		if (h->filtered)
 			continue;
 
+		if (symbol_conf.cumulate_callchain) {
+			percent = h->stat_acc->period * 100.0 /
+					hb->hists->stats.total_period;
+		} else {
+			percent = h->stat.period * 100.0 /
+					hb->hists->stats.total_period;
+		}
+
 		if (percent < hb->min_pcnt)
 			continue;
 
@@ -851,13 +858,17 @@ static struct rb_node *hists__filter_entries(struct rb_node *nd,
 {
 	while (nd != NULL) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-		float percent = h->stat.period * 100.0 /
-					hists->stats.total_period;
+		float percent;
 
-		if (percent < min_pcnt)
-			return NULL;
+		if (symbol_conf.cumulate_callchain) {
+			percent = h->stat_acc->period * 100.0 /
+					hists->stats.total_period;
+		} else {
+			percent = h->stat.period * 100.0 /
+					hists->stats.total_period;
+		}
 
-		if (!h->filtered)
+		if (!h->filtered && percent >= min_pcnt)
 			return nd;
 
 		nd = rb_next(nd);
@@ -872,8 +883,15 @@ static struct rb_node *hists__filter_prev_entries(struct rb_node *nd,
 {
 	while (nd != NULL) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-		float percent = h->stat.period * 100.0 /
+		float percent;
+
+		if (symbol_conf.cumulate_callchain) {
+			percent = h->stat_acc->period * 100.0 /
+					hists->stats.total_period;
+		} else {
+			percent = h->stat.period * 100.0 /
 					hists->stats.total_period;
+		}
 
 		if (!h->filtered && percent >= min_pcnt)
 			return nd;
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 70ed0d5e1b94..06ae3342e14f 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -296,12 +296,19 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
 	for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 		GtkTreeIter iter;
-		float percent = h->stat.period * 100.0 /
-					hists->stats.total_period;
+		float percent;
 
 		if (h->filtered)
 			continue;
 
+		if (symbol_conf.cumulate_callchain) {
+			percent = h->stat_acc->period * 100.0 /
+					hists->stats.total_period;
+		} else {
+			percent = h->stat.period * 100.0 /
+					hists->stats.total_period;
+		}
+
 		if (percent < min_pcnt)
 			continue;
 
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 4c4986e809d8..7ea8502192b0 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -487,12 +487,19 @@ print_entries:
 
 	for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
-		float percent = h->stat.period * 100.0 /
-					hists->stats.total_period;
+		float percent;
 
 		if (h->filtered)
 			continue;
 
+		if (symbol_conf.cumulate_callchain) {
+			percent = h->stat_acc->period * 100.0 /
+					hists->stats.total_period;
+		} else {
+			percent = h->stat.period * 100.0 /
+					hists->stats.total_period;
+		}
+
 		if (percent < min_pcnt)
 			continue;
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 13/21] perf tools: Add more hpp helper functions
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (11 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 12/21] perf tools: Apply percent-limit to cumulative percentage Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2013-12-24  8:22 ` [PATCH 14/21] perf report: Add --children option Namhyung Kim
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Sometimes it is necessary to disable some columns at runtime.  Add
helper functions to support that.
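
For example, a caller can back out of callchain accumulation at
runtime roughly like this (a sketch; the actual call sites appear in
the later --children patches):

  	if (!symbol_conf.use_callchain) {
  		symbol_conf.cumulate_callchain = false;
  		perf_hpp__cancel_cumulate();
  	}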

Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/hist.c   | 17 +++++++++++++++++
 tools/perf/util/hist.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index eb9c07bcdb01..1f9e25211da2 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -275,12 +275,29 @@ void perf_hpp__column_register(struct perf_hpp_fmt *format)
 	list_add_tail(&format->list, &perf_hpp__list);
 }
 
+void perf_hpp__column_unregister(struct perf_hpp_fmt *format)
+{
+	list_del(&format->list);
+}
+
 void perf_hpp__column_enable(unsigned col)
 {
 	BUG_ON(col >= PERF_HPP__MAX_INDEX);
 	perf_hpp__column_register(&perf_hpp__format[col]);
 }
 
+void perf_hpp__column_disable(unsigned col)
+{
+	BUG_ON(col >= PERF_HPP__MAX_INDEX);
+	perf_hpp__column_unregister(&perf_hpp__format[col]);
+}
+
+void perf_hpp__cancel_cumulate(void)
+{
+	perf_hpp__column_disable(PERF_HPP__OVERHEAD_ACC);
+	perf_hpp__format[PERF_HPP__OVERHEAD].header = hpp__header_overhead;
+}
+
 int hist_entry__sort_snprintf(struct hist_entry *he, char *s, size_t size,
 			      struct hists *hists)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 8cf8dca37e13..945e2735ed69 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -166,7 +166,10 @@ enum {
 
 void perf_hpp__init(void);
 void perf_hpp__column_register(struct perf_hpp_fmt *format);
+void perf_hpp__column_unregister(struct perf_hpp_fmt *format);
 void perf_hpp__column_enable(unsigned col);
+void perf_hpp__column_disable(unsigned col);
+void perf_hpp__cancel_cumulate(void);
 
 static inline size_t perf_hpp__use_color(void)
 {
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 14/21] perf report: Add --children option
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (12 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 13/21] perf tools: Add more hpp helper functions Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2013-12-24  8:22 ` [PATCH 15/21] perf report: Add report.children config option Namhyung Kim
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

The --children option is for showing the accumulated overhead (period)
value as well as the self overhead.
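
A typical session would then look like this (illustrative; ./workload
is a placeholder for the profiled program):

  $ perf record -g ./workload
  $ perf report --children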

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-report.txt |  5 +++++
 tools/perf/builtin-report.c              | 10 ++++++++++
 2 files changed, 15 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 8eab8a4bdeb8..710fc8b18f6b 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -141,6 +141,11 @@ OPTIONS
 
 	Default: fractal,0.5,callee,function.
 
+--children::
+	Accumulate callchains of children to parent entries so that they can
+	show up in the output.  The output will have a new "Children" column
+	and will be sorted on it.  It requires callchains to be recorded.
+
 --max-stack::
 	Set the stack depth limit when parsing the callchain, anything
 	beyond the specified depth will be ignored. This is a trade-off
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ae298ae91a47..2e95ece29121 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -696,6 +696,14 @@ static int report__setup_sample_type(struct report *rep)
 			}
 	}
 
+	if (symbol_conf.cumulate_callchain) {
+		/* Silently ignore if callchain is not used */
+		if (!symbol_conf.use_callchain) {
+			symbol_conf.cumulate_callchain = false;
+			perf_hpp__cancel_cumulate();
+		}
+	}
+
 	if (sort__mode == SORT_MODE__BRANCH) {
 		if (!is_pipe &&
 		    !(sample_type & PERF_SAMPLE_BRANCH_STACK)) {
@@ -1126,6 +1134,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_CALLBACK_DEFAULT('g', "call-graph", &report, "output_type,min_percent[,print_limit],call_order",
 		     "Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold, optional print limit, callchain order, key (function or address). "
 		     "Default: fractal,0.5,callee,function", &parse_callchain_opt, callchain_default_opt),
+	OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
+		    "Accumulate callchains of children and show total overhead as well"),
 	OPT_INTEGER(0, "max-stack", &report.max_stack,
 		    "Set the maximum stack depth when parsing the callchain, "
 		    "anything beyond the specified depth will be ignored. "
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 15/21] perf report: Add report.children config option
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (13 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 14/21] perf report: Add --children option Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2013-12-24  8:22 ` [PATCH 16/21] perf tools: Factor out sample__resolve_callchain() Namhyung Kim
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Add the report.children config option for setting the default value
of callchain accumulation.  It affects the report output only if
perf.data contains callchain info.

A user can write .perfconfig file like below to enable accumulation
by default:

  $ cat ~/.perfconfig
  [report]
  children = true

And it can be disabled through command line:

  $ perf report --no-children

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2e95ece29121..615f078cc94e 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -71,6 +71,10 @@ static int report__config(const char *var, const char *value, void *cb)
 		rep->min_percent = strtof(value, NULL);
 		return 0;
 	}
+	if (!strcmp(var, "report.children")) {
+		symbol_conf.cumulate_callchain = perf_config_bool(var, value);
+		return 0;
+	}
 
 	return perf_default_config(var, value, cb);
 }
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 16/21] perf tools: Factor out sample__resolve_callchain()
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (14 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 15/21] perf report: Add report.children config option Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 17:56   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 17/21] perf tools: Factor out fill_callchain_info() Namhyung Kim
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

report__resolve_callchain() can be shared with the perf top code as it
doesn't really depend on perf report.  Factor it out as
sample__resolve_callchain().  The same goes for
hist_entry__append_callchain().
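
The intended call sequence in a tool's sample handler then becomes
roughly (a sketch; the hist entry creation in between is omitted):

  	err = sample__resolve_callchain(sample, &parent, evsel, &al, max_stack);
  	if (err)
  		return err;
  	...
  	err = hist_entry__append_callchain(he, sample);
  	if (err)
  		return err;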

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 22 ++--------------------
 tools/perf/builtin-top.c    | 22 +++++++---------------
 tools/perf/util/callchain.c | 23 +++++++++++++++++++++++
 tools/perf/util/callchain.h |  6 ++++++
 4 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 615f078cc94e..1ec00acfcd92 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -79,24 +79,6 @@ static int report__config(const char *var, const char *value, void *cb)
 	return perf_default_config(var, value, cb);
 }
 
-static int report__resolve_callchain(struct report *rep, struct symbol **parent,
-				     struct perf_evsel *evsel, struct addr_location *al,
-				     struct perf_sample *sample)
-{
-	if ((sort__has_parent || symbol_conf.use_callchain) && sample->callchain) {
-		return machine__resolve_callchain(al->machine, evsel, al->thread, sample,
-						  parent, al, rep->max_stack);
-	}
-	return 0;
-}
-
-static int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *sample)
-{
-	if (!symbol_conf.use_callchain)
-		return 0;
-	return callchain_append(he->callchain, &callchain_cursor, sample->period);
-}
-
 struct add_entry_iter {
 	int total;
 	int curr;
@@ -576,8 +558,8 @@ perf_evsel__add_entry(struct perf_evsel *evsel, struct addr_location *al,
 {
 	int err, err2;
 
-	err = report__resolve_callchain(iter->rep, &iter->parent, evsel, al,
-					sample);
+	err = sample__resolve_callchain(sample, &iter->parent, evsel, al,
+					iter->rep->max_stack);
 	if (err)
 		return err;
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 142a41bd5062..48c527a0f4c8 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -743,15 +743,10 @@ static void perf_event__process_sample(struct perf_tool *tool,
 	if (al.sym == NULL || !al.sym->ignore) {
 		struct hist_entry *he;
 
-		if ((sort__has_parent || symbol_conf.use_callchain) &&
-		    sample->callchain) {
-			err = machine__resolve_callchain(machine, evsel,
-							 al.thread, sample,
-							 &parent, &al,
-							 top->max_stack);
-			if (err)
-				return;
-		}
+		err = sample__resolve_callchain(sample, &parent, evsel, &al,
+						top->max_stack);
+		if (err)
+			return;
 
 		he = perf_evsel__add_hist_entry(evsel, &al, sample);
 		if (he == NULL) {
@@ -759,12 +754,9 @@ static void perf_event__process_sample(struct perf_tool *tool,
 			return;
 		}
 
-		if (symbol_conf.use_callchain) {
-			err = callchain_append(he->callchain, &callchain_cursor,
-					       sample->period);
-			if (err)
-				return;
-		}
+		err = hist_entry__append_callchain(he, sample);
+		if (err)
+			return;
 
 		if (sort__has_sym)
 			perf_top__record_precise_ip(top, he, evsel->idx, ip);
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index e3970e3eaacf..c938da83144e 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -17,6 +17,8 @@
 
 #include "hist.h"
 #include "util.h"
+#include "sort.h"
+#include "machine.h"
 #include "callchain.h"
 
 __thread struct callchain_cursor callchain_cursor;
@@ -531,3 +533,24 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
 
 	return 0;
 }
+
+int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent,
+			      struct perf_evsel *evsel, struct addr_location *al,
+			      int max_stack)
+{
+	if (sample->callchain == NULL)
+		return 0;
+
+	if (sort__has_parent || symbol_conf.use_callchain) {
+		return machine__resolve_callchain(al->machine, evsel, al->thread,
+						  sample, parent, al, max_stack);
+	}
+	return 0;
+}
+
+int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *sample)
+{
+	if (!symbol_conf.use_callchain)
+		return 0;
+	return callchain_append(he->callchain, &callchain_cursor, sample->period);
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 08b25af9eea1..8ad97e9b119f 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -145,10 +145,16 @@ static inline void callchain_cursor_advance(struct callchain_cursor *cursor)
 }
 
 struct option;
+struct hist_entry;
 
 int record_parse_callchain(const char *arg, struct record_opts *opts);
 int record_parse_callchain_opt(const struct option *opt, const char *arg, int unset);
 int record_callchain_opt(const struct option *opt, const char *arg, int unset);
 
+int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent,
+			      struct perf_evsel *evsel, struct addr_location *al,
+			      int max_stack);
+int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *sample);
+
 extern const char record_callchain_help[];
 #endif	/* __PERF_CALLCHAIN_H */
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 17/21] perf tools: Factor out fill_callchain_info()
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (15 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 16/21] perf tools: Factor out sample__resolve_callchain() Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2013-12-24  8:22 ` [PATCH 18/21] perf top: Support callchain accumulation Namhyung Kim
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

fill_callchain_info() will be shared with perf top --children.
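
A rough usage sketch (it returns 0 when the node is unresolved and
hide_unresolved is set, meaning the caller should skip that node):

  	node = callchain_cursor_current(&callchain_cursor);
  	if (fill_callchain_info(&al, node, hide_unresolved)) {
  		/* al is now set up to create a hist entry for this node */
  	}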

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 39 ++-------------------------------------
 tools/perf/util/callchain.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |  2 ++
 3 files changed, 46 insertions(+), 37 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 1ec00acfcd92..a0d33ae3c1ef 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -417,44 +417,9 @@ iter_next_cumulative_entry(struct add_entry_iter *iter,
 	if (node == NULL)
 		return 0;
 
-	al->map = node->map;
-	al->sym = node->sym;
-	if (node->map)
-		al->addr = node->map->map_ip(node->map, node->ip);
-	else
-		al->addr = node->ip;
-
-	if (al->sym == NULL) {
-		if (iter->rep->hide_unresolved)
-			return 0;
-		if (al->map == NULL)
-			goto out;
-	}
-
-	if (al->map->groups == &iter->machine->kmaps) {
-		if (machine__is_host(iter->machine)) {
-			al->cpumode = PERF_RECORD_MISC_KERNEL;
-			al->level = 'k';
-		} else {
-			al->cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
-			al->level = 'g';
-		}
-	} else {
-		if (machine__is_host(iter->machine)) {
-			al->cpumode = PERF_RECORD_MISC_USER;
-			al->level = '.';
-		} else if (perf_guest) {
-			al->cpumode = PERF_RECORD_MISC_GUEST_USER;
-			al->level = 'u';
-		} else {
-			al->cpumode = PERF_RECORD_MISC_HYPERVISOR;
-			al->level = 'H';
-		}
-	}
-
-out:
 	callchain_cursor_advance(&callchain_cursor);
-	return 1;
+
+	return fill_callchain_info(al, node, iter->rep->hide_unresolved);
 }
 
 static int
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index c938da83144e..619cd0a0b284 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -554,3 +554,45 @@ int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *samp
 		return 0;
 	return callchain_append(he->callchain, &callchain_cursor, sample->period);
 }
+
+int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *node,
+			bool hide_unresolved)
+{
+	al->map = node->map;
+	al->sym = node->sym;
+	if (node->map)
+		al->addr = node->map->map_ip(node->map, node->ip);
+	else
+		al->addr = node->ip;
+
+	if (al->sym == NULL) {
+		if (hide_unresolved)
+			return 0;
+		if (al->map == NULL)
+			goto out;
+	}
+
+	if (al->map->groups == &al->machine->kmaps) {
+		if (machine__is_host(al->machine)) {
+			al->cpumode = PERF_RECORD_MISC_KERNEL;
+			al->level = 'k';
+		} else {
+			al->cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
+			al->level = 'g';
+		}
+	} else {
+		if (machine__is_host(al->machine)) {
+			al->cpumode = PERF_RECORD_MISC_USER;
+			al->level = '.';
+		} else if (perf_guest) {
+			al->cpumode = PERF_RECORD_MISC_GUEST_USER;
+			al->level = 'u';
+		} else {
+			al->cpumode = PERF_RECORD_MISC_HYPERVISOR;
+			al->level = 'H';
+		}
+	}
+
+out:
+	return 1;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 8ad97e9b119f..66faae21370d 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -155,6 +155,8 @@ int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent
 			      struct perf_evsel *evsel, struct addr_location *al,
 			      int max_stack);
 int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *sample);
+int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *node,
+			bool hide_unresolved);
 
 extern const char record_callchain_help[];
 #endif	/* __PERF_CALLCHAIN_H */
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 18/21] perf top: Support callchain accumulation
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (16 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 17/21] perf tools: Factor out fill_callchain_info() Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 18:01   ` Jiri Olsa
  2013-12-24  8:22 ` [PATCH 19/21] perf top: Add --children option Namhyung Kim
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Enable accumulation of children's callchains in perf top.

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-top.c | 106 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 103 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 48c527a0f4c8..6a7a76496c94 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -657,6 +657,99 @@ static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
 	return 0;
 }
 
+static int process_cumulative_entry(struct perf_top *top,
+				    struct hist_entry *he,
+				    struct perf_evsel *evsel,
+				    struct addr_location *al,
+				    struct perf_sample *sample,
+				    struct symbol *parent)
+{
+	struct hist_entry **he_cache;
+	struct callchain_cursor_node *node;
+	int idx = 0, err = 0;
+
+	he_cache = malloc(sizeof(*he_cache) * (PERF_MAX_STACK_DEPTH + 1));
+	if (he_cache == NULL)
+		return -ENOMEM;
+
+	pthread_mutex_lock(&evsel->hists.lock);
+
+	he_cache[idx++] = he;
+
+	/*
+	 * This is for putting parents upward during output resort iff
+	 * only a child gets sampled.  See hist_entry__sort_on_period().
+	 */
+	he->callchain->max_depth = PERF_MAX_STACK_DEPTH + 1;
+
+	callchain_cursor_commit(&callchain_cursor);
+
+	node = callchain_cursor_current(&callchain_cursor);
+	while (node) {
+		int i;
+		struct hist_entry he_tmp = {
+			.cpu = al->cpu,
+			.thread = al->thread,
+			.comm = thread__comm(al->thread),
+			.parent = parent,
+		};
+
+		fill_callchain_info(al, node, false);
+
+		he_tmp.ip = al->addr;
+		he_tmp.ms.map = al->map;
+		he_tmp.ms.sym = al->sym;
+
+		if (al->sym && al->sym->ignore)
+			goto next;
+
+		/*
+		 * Check if there's duplicate entries in the callchain.
+		 * It's possible that it has cycles or recursive calls.
+		 */
+		for (i = 0; i < idx; i++) {
+			if (hist_entry__cmp(he_cache[i], &he_tmp) == 0)
+				goto next;
+		}
+
+		he = __hists__add_entry(&evsel->hists, al, parent, NULL, NULL,
+					sample->period, sample->weight,
+					sample->transaction, false);
+		if (he == NULL) {
+			err = -ENOMEM;
+			break;
+		}
+
+		he_cache[idx++] = he;
+
+		/*
+		 * This is for putting parents upward during output resort iff
+		 * only a child gets sampled.  See hist_entry__sort_on_period().
+		 */
+		he->callchain->max_depth = callchain_cursor.nr - callchain_cursor.pos;
+
+		if (sort__has_sym) {
+			u64 ip;
+
+			if (al->map)
+				ip = al->map->unmap_ip(al->map, al->addr);
+			else
+				ip = al->addr;
+
+			perf_top__record_precise_ip(top, he, evsel->idx, ip);
+		}
+
+next:
+		callchain_cursor_advance(&callchain_cursor);
+		node = callchain_cursor_current(&callchain_cursor);
+	}
+
+	pthread_mutex_unlock(&evsel->hists.lock);
+
+	free(he_cache);
+	return err;
+}
+
 static void perf_event__process_sample(struct perf_tool *tool,
 				       const union perf_event *event,
 				       struct perf_evsel *evsel,
@@ -754,9 +847,16 @@ static void perf_event__process_sample(struct perf_tool *tool,
 			return;
 		}
 
-		err = hist_entry__append_callchain(he, sample);
-		if (err)
-			return;
+		if (symbol_conf.cumulate_callchain) {
+			err = process_cumulative_entry(top, he, evsel, &al,
+						       sample, parent);
+			if (err)
+				return;
+		} else {
+			err = hist_entry__append_callchain(he, sample);
+			if (err)
+				return;
+		}
 
 		if (sort__has_sym)
 			perf_top__record_precise_ip(top, he, evsel->idx, ip);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 19/21] perf top: Add --children option
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (17 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 18/21] perf top: Support callchain accumulation Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2013-12-24  8:22 ` [PATCH 20/21] perf top: Add top.children config option Namhyung Kim
  2013-12-24  8:22 ` [PATCH 21/21] perf tools: Enable --children option by default Namhyung Kim
  20 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

The --children option is for showing the accumulated overhead (period)
value as well as the self overhead.  It should be used together with
the -g or --call-graph option.
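
For example (illustrative):

  $ perf top -g --children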

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-top.txt | 6 ++++++
 tools/perf/builtin-top.c              | 7 +++++++
 2 files changed, 13 insertions(+)

diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index cdd8d4946dba..01b6fd1a4428 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -149,6 +149,12 @@ Default is to monitor all CPUS.
 	Setup and enable call-graph (stack chain/backtrace) recording,
 	implies -g.
 
+--children::
+	Accumulate callchains of children to parent entries so that they can
+	show up in the output.  The output will have a new "Children" column
+	and will be sorted on it.  It requires the -g/--call-graph option to
+	be enabled.
+
 --max-stack::
 	Set the stack depth limit when parsing the callchain, anything
 	beyond the specified depth will be ignored. This is a trade-off
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 6a7a76496c94..10395cbfef19 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1179,6 +1179,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_CALLBACK(0, "call-graph", &top.record_opts,
 		     "mode[,dump_size]", record_callchain_help,
 		     &parse_callchain_opt),
+	OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
+		    "Accumulate callchains of children and show total overhead as well"),
 	OPT_INTEGER(0, "max-stack", &top.max_stack,
 		    "Set the maximum stack depth when parsing the callchain. "
 		    "Default: " __stringify(PERF_MAX_STACK_DEPTH)),
@@ -1278,6 +1280,11 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	top.sym_evsel = perf_evlist__first(top.evlist);
 
+	if (!symbol_conf.use_callchain) {
+		symbol_conf.cumulate_callchain = false;
+		perf_hpp__cancel_cumulate();
+	}
+
 	symbol_conf.priv_size = sizeof(struct annotation);
 
 	symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 20/21] perf top: Add top.children config option
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (18 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 19/21] perf top: Add --children option Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2013-12-24  8:22 ` [PATCH 21/21] perf tools: Enable --children option by default Namhyung Kim
  20 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Add the top.children config option for setting the default value of
callchain accumulation.  It affects the output only if the -g or
--call-graph option is given as well.

A user can write .perfconfig file like below to enable accumulation
by default:

  $ cat ~/.perfconfig
  [top]
  children = true

And it can be disabled through command line:

  $ perf top --no-children

Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-top.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 10395cbfef19..462a78fbc6a1 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -87,6 +87,16 @@ static void perf_top__sig_winch(int sig __maybe_unused,
 	perf_top__update_print_entries(top);
 }
 
+static int top__config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "top.children")) {
+		symbol_conf.cumulate_callchain = perf_config_bool(var, value);
+		return 0;
+	}
+
+	return perf_default_config(var, value, cb);
+}
+
 static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
 {
 	struct symbol *sym;
@@ -1217,6 +1227,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (top.evlist == NULL)
 		return -ENOMEM;
 
+	perf_config(top__config, &top);
+
 	argc = parse_options(argc, argv, options, top_usage, 0);
 	if (argc)
 		usage_with_options(top_usage, options);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 21/21] perf tools: Enable --children option by default
  2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
                   ` (19 preceding siblings ...)
  2013-12-24  8:22 ` [PATCH 20/21] perf top: Add top.children config option Namhyung Kim
@ 2013-12-24  8:22 ` Namhyung Kim
  2014-01-05 18:08   ` Jiri Olsa
  20 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2013-12-24  8:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Namhyung Kim, LKML,
	Arun Sharma, Frederic Weisbecker, Jiri Olsa, Rodrigo Campos

From: Namhyung Kim <namhyung.kim@lge.com>

Now perf top and perf report will show the Children column by default
if the perf data has callchain information.
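
The previous self-only behavior is still available, e.g.:

  $ perf report --no-children
  $ perf top --no-children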

Requested-by: Ingo Molnar <mingo@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/symbol.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 923d00040bbf..3e73079e6c92 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -29,11 +29,12 @@ int vmlinux_path__nr_entries;
 char **vmlinux_path;
 
 struct symbol_conf symbol_conf = {
-	.use_modules	  = true,
-	.try_vmlinux_path = true,
-	.annotate_src	  = true,
-	.demangle	  = true,
-	.symfs            = "",
+	.use_modules		= true,
+	.try_vmlinux_path	= true,
+	.annotate_src		= true,
+	.demangle		= true,
+	.cumulate_callchain	= true,
+	.symfs			= "",
 };
 
 static enum dso_binary_type binary_type_symtab[] = {
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/21] perf tools: Introduce struct add_entry_iter
  2013-12-24  8:22 ` [PATCH 01/21] perf tools: Introduce struct add_entry_iter Namhyung Kim
@ 2014-01-05 14:55   ` Jiri Olsa
  2014-01-06  7:45     ` Namhyung Kim
  2014-01-05 15:28   ` Jiri Olsa
  2014-01-05 15:55   ` Jiri Olsa
  2 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 14:55 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos, Stephane Eranian

On Tue, Dec 24, 2013 at 05:22:07PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>

SNIP

> -		return err;
> +		goto out;
> +
> +	err = addr_map_symbol__inc_samples(&bx->to, evsel->idx);
> +	if (err)
> +		goto out;
> +
> +	evsel->hists.stats.total_period += 1;
> +	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
> +
> +out:
> +	iter->curr++;
> +	return err;
> +}
> +
> +static int
> +iter_finish_branch_entry(struct add_entry_iter *iter,
> +			 struct addr_location *al __maybe_unused)
> +{
> +	free(iter->priv);
> +	iter->priv = NULL;

Arnaldo just added zfree ;-)

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/21] perf tools: Introduce struct add_entry_iter
  2013-12-24  8:22 ` [PATCH 01/21] perf tools: Introduce struct add_entry_iter Namhyung Kim
  2014-01-05 14:55   ` Jiri Olsa
@ 2014-01-05 15:28   ` Jiri Olsa
  2014-01-06  7:45     ` Namhyung Kim
  2014-01-05 15:55   ` Jiri Olsa
  2 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 15:28 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos, Stephane Eranian

On Tue, Dec 24, 2013 at 05:22:07PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>

SNIP

> @@ -239,6 +426,7 @@ static int process_sample_event(struct perf_tool *tool,
>  {
>  	struct report *rep = container_of(tool, struct report, tool);
>  	struct addr_location al;
> +	struct add_entry_iter *iter;
>  	int ret;
>  
>  	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
> @@ -253,22 +441,22 @@ static int process_sample_event(struct perf_tool *tool,
>  	if (rep->cpu_list && !test_bit(sample->cpu, rep->cpu_bitmap))
>  		return 0;
>  
> -	if (sort__mode == SORT_MODE__BRANCH) {
> -		ret = report__add_branch_hist_entry(tool, &al, sample, evsel);
> -		if (ret < 0)
> -			pr_debug("problem adding lbr entry, skipping event\n");
> -	} else if (rep->mem_mode == 1) {
> -		ret = report__add_mem_hist_entry(tool, &al, sample, evsel, event);
> -		if (ret < 0)
> -			pr_debug("problem adding mem entry, skipping event\n");
> -	} else {
> -		if (al.map != NULL)
> -			al.map->dso->hit = 1;
> -
> -		ret = report__add_hist_entry(tool, evsel, &al, sample);
> -		if (ret < 0)
> -			pr_debug("problem incrementing symbol period, skipping event\n");
> -	}
> +	if (sort__mode == SORT_MODE__BRANCH)
> +		iter = &branch_iter;
> +	else if (rep->mem_mode == 1) {
> +		iter = &mem_iter;
> +		iter->priv = event;
> +	} else
> +		iter = &normal_iter;
> +
> +	if (al.map != NULL)
> +		al.map->dso->hit = 1;
> +
> +	iter->rep = rep;
> +	ret = perf_evsel__add_entry(evsel, &al, sample, machine, iter);

you dont need to pass machine pointer, it's already in 'al'

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/21] perf tools: Introduce struct add_entry_iter
  2013-12-24  8:22 ` [PATCH 01/21] perf tools: Introduce struct add_entry_iter Namhyung Kim
  2014-01-05 14:55   ` Jiri Olsa
  2014-01-05 15:28   ` Jiri Olsa
@ 2014-01-05 15:55   ` Jiri Olsa
  2014-01-06  8:03     ` Namhyung Kim
  2 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 15:55 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos, Stephane Eranian

On Tue, Dec 24, 2013 at 05:22:07PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>

SNIP

> +
> +out:
> +	iter->curr++;
> +	return err;
> +}
> +
> +static int
> +iter_finish_branch_entry(struct add_entry_iter *iter,
> +			 struct addr_location *al __maybe_unused)
> +{
> +	free(iter->priv);
> +	iter->priv = NULL;

so branch_info is duplicated in the hist_entry__new.. it's not
easy to find why this one gets freed while mem_info stays ;-)

some comment about that here would help

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 02/21] perf hists: Convert hist entry functions to use struct he_stat
  2013-12-24  8:22 ` [PATCH 02/21] perf hists: Convert hist entry functions to use struct he_stat Namhyung Kim
@ 2014-01-05 16:09   ` Jiri Olsa
  0 siblings, 0 replies; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 16:09 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:08PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> hist_entry__add_cpumode_period() and hist_entry__decay() are dealing
> with hist_entry's stat fields only.  So make them use the struct
> directly.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/hist.c | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)

Acked-by: Jiri Olsa <jolsa@redhat.com>

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry
  2013-12-24  8:22 ` [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry Namhyung Kim
@ 2014-01-05 16:16   ` Jiri Olsa
  2014-01-05 16:26   ` Jiri Olsa
  2014-01-05 16:45   ` Jiri Olsa
  2 siblings, 0 replies; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 16:16 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:09PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Maintain accumulated stat information in hist_entry->stat_acc if
> symbol_conf.cumulate_callchain is set.  Fields in ->stat_acc have same
> vaules initially, and will be updated as callchain is processed later.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/hist.c   | 18 ++++++++++++++++++
>  tools/perf/util/sort.h   |  1 +
>  tools/perf/util/symbol.h |  1 +
>  3 files changed, 20 insertions(+)
> 
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 283aa1972b4f..b61e2fa42412 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -237,6 +237,8 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
>  		return true;
>  
>  	he_stat__decay(&he->stat);
> +	if (symbol_conf.cumulate_callchain)
> +		he_stat__decay(he->stat_acc);
>  
>  	if (!he->filtered)
>  		hists->stats.total_period -= prev_period - he->stat.period;
> @@ -284,6 +286,15 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
>  	if (he != NULL) {
>  		*he = *template;
>  
> +		if (symbol_conf.cumulate_callchain) {
> +			he->stat_acc = malloc(sizeof(he->stat));
> +			if (he->stat_acc == NULL) {
> +				free(he);
> +				return NULL;
> +			}
> +			memcpy(he->stat_acc, &he->stat, sizeof(he->stat));
> +		}
> +
>  		if (he->ms.map)
>  			he->ms.map->referenced = true;
>  
> @@ -295,6 +306,7 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
>  			 */
>  			he->branch_info = malloc(sizeof(*he->branch_info));
>  			if (he->branch_info == NULL) {
> +				free(he->stat_acc);
>  				free(he);
>  				return NULL;

hum, he->stat_acc should get freed in hist_entry__free

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry
  2013-12-24  8:22 ` [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry Namhyung Kim
  2014-01-05 16:16   ` Jiri Olsa
@ 2014-01-05 16:26   ` Jiri Olsa
  2014-01-06  8:09     ` Namhyung Kim
  2014-01-05 16:45   ` Jiri Olsa
  2 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 16:26 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:09PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Maintain accumulated stat information in hist_entry->stat_acc if
> symbol_conf.cumulate_callchain is set.  Fields in ->stat_acc have the same
> values initially, and will be updated as the callchain is processed later.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/hist.c   | 18 ++++++++++++++++++
>  tools/perf/util/sort.h   |  1 +
>  tools/perf/util/symbol.h |  1 +
>  3 files changed, 20 insertions(+)
> 
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 283aa1972b4f..b61e2fa42412 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -237,6 +237,8 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
>  		return true;
>  
>  	he_stat__decay(&he->stat);
> +	if (symbol_conf.cumulate_callchain)
> +		he_stat__decay(he->stat_acc);
>  
>  	if (!he->filtered)
>  		hists->stats.total_period -= prev_period - he->stat.period;
> @@ -284,6 +286,15 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
>  	if (he != NULL) {
>  		*he = *template;
>  
> +		if (symbol_conf.cumulate_callchain) {
> +			he->stat_acc = malloc(sizeof(he->stat));
> +			if (he->stat_acc == NULL) {
> +				free(he);
> +				return NULL;
> +			}
> +			memcpy(he->stat_acc, &he->stat, sizeof(he->stat));
> +		}
> +
>  		if (he->ms.map)
>  			he->ms.map->referenced = true;
>  
> @@ -295,6 +306,7 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
>  			 */
>  			he->branch_info = malloc(sizeof(*he->branch_info));
>  			if (he->branch_info == NULL) {
> +				free(he->stat_acc);

probably not big deal but above free should be under
symbol_conf.cumulate_callchain check..

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry
  2013-12-24  8:22 ` [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry Namhyung Kim
  2014-01-05 16:16   ` Jiri Olsa
  2014-01-05 16:26   ` Jiri Olsa
@ 2014-01-05 16:45   ` Jiri Olsa
  2014-01-06  8:10     ` Namhyung Kim
  2 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 16:45 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:09PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Maintain accumulated stat information in hist_entry->stat_acc if
> symbol_conf.cumulate_callchain is set.  Fields in ->stat_acc have the same
> values initially, and will be updated as the callchain is processed later.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/hist.c   | 18 ++++++++++++++++++
>  tools/perf/util/sort.h   |  1 +
>  tools/perf/util/symbol.h |  1 +
>  3 files changed, 20 insertions(+)
> 
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 283aa1972b4f..b61e2fa42412 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -237,6 +237,8 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
>  		return true;
>  
>  	he_stat__decay(&he->stat);
> +	if (symbol_conf.cumulate_callchain)
> +		he_stat__decay(he->stat_acc);
>  
>  	if (!he->filtered)
>  		hists->stats.total_period -= prev_period - he->stat.period;
> @@ -284,6 +286,15 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)

there's callchain size computation in here:
        size_t callchain_size = symbol_conf.use_callchain ?  sizeof(struct callchain_root) : 0;

I wonder if we could make it zero for symbol_conf.cumulate_callchain

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 05/21] perf hists: Accumulate hist entry stat based on the callchain
  2013-12-24  8:22 ` [PATCH 05/21] perf hists: Accumulate hist entry stat based on the callchain Namhyung Kim
@ 2014-01-05 16:58   ` Jiri Olsa
  2014-01-05 17:15     ` Jiri Olsa
  0 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 16:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:11PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Call __hists__add_entry() for each callchain node to get an
> accumulated stat for an entry.  Introduce new cumulative_iter ops to
> process them properly.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-report.c | 103 +++++++++++++++++++++++++++++++++++++++++++-
>  tools/perf/ui/stdio/hist.c  |   2 +-
>  2 files changed, 103 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 29fe19071d24..4fde0ab82498 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -360,6 +360,97 @@ iter_finish_normal_entry(struct add_entry_iter *iter, struct addr_location *al)
>  	return hist_entry__append_callchain(he, sample);
>  }
>  
> +static int
> +iter_prepare_cumulative_entry(struct add_entry_iter *iter,
> +			      struct machine *machine __maybe_unused,
> +			      struct perf_evsel *evsel,
> +			      struct addr_location *al __maybe_unused,
> +			      struct perf_sample *sample)
> +{
> +	callchain_cursor_commit(&callchain_cursor);
> +
> +	iter->evsel = evsel;
> +	iter->sample = sample;
> +	return 0;
> +}
> +
> +static int
> +iter_add_single_cumulative_entry(struct add_entry_iter *iter,
> +				 struct addr_location *al)
> +{
> +	struct perf_evsel *evsel = iter->evsel;
> +	struct perf_sample *sample = iter->sample;
> +	struct hist_entry *he;
> +
> +	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
> +				sample->period, sample->weight,
> +				sample->transaction, true);
> +	if (he == NULL)
> +		return -ENOMEM;
> +
> +	/*
> +	 * This is for putting parents upward during output resort iff
> +	 * only a child gets sampled.  See hist_entry__sort_on_period().
> +	 */
> +	he->callchain->max_depth = PERF_MAX_STACK_DEPTH + 1;

so you're using callchain struct to hold the entry's stack
position for sorting.. I think we could store this info
inside hist_entry itself, and omit 'struct callchain_root'
size being allocated for hist_entry

I checked 'struct hist_entry' and the 'position' entry seems to
be abandoned ;-)) or we could use some unused entry (mem_info?)
and create some union.
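
something like this maybe (the field name is made up, just a sketch):

	struct hist_entry {
		...
		union {
			struct mem_info	*mem_info;
			int		callchain_depth;
		};
		...
	};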

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 06/21] perf tools: Update cpumode for each cumulative entry
  2013-12-24  8:22 ` [PATCH 06/21] perf tools: Update cpumode for each cumulative entry Namhyung Kim
@ 2014-01-05 17:02   ` Jiri Olsa
  2014-01-06  8:18     ` Namhyung Kim
  0 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 17:02 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:12PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> The cpumode and level in struct addr_location were set for a sample
> but not updated as cumulative callchains were added.  This led to
> non-matching symbol and cpumode in the output.
> 
> Update it accordingly based on whether the map is a part of the kernel
> or not.  This is the reverse of what thread__find_addr_map() does.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-report.c | 34 +++++++++++++++++++++++++++++++---
>  1 file changed, 31 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 4fde0ab82498..17c41c686042 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -100,6 +100,7 @@ struct add_entry_iter {
>  	struct report *rep;
>  	struct perf_evsel *evsel;
>  	struct perf_sample *sample;
> +	struct machine *machine;
>  	struct hist_entry *he;
>  	struct symbol *parent;
>  	void *priv;
> @@ -362,7 +363,7 @@ iter_finish_normal_entry(struct add_entry_iter *iter, struct addr_location *al)
>  
>  static int
>  iter_prepare_cumulative_entry(struct add_entry_iter *iter,
> -			      struct machine *machine __maybe_unused,
> +			      struct machine *machine,
>  			      struct perf_evsel *evsel,
>  			      struct addr_location *al __maybe_unused,
>  			      struct perf_sample *sample)
> @@ -371,6 +372,7 @@ iter_prepare_cumulative_entry(struct add_entry_iter *iter,
>  
>  	iter->evsel = evsel;
>  	iter->sample = sample;
> +	iter->machine = machine;

machine is already in 'al->machine'

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 07/21] perf report: Cache cumulative callchains
  2013-12-24  8:22 ` [PATCH 07/21] perf report: Cache cumulative callchains Namhyung Kim
@ 2014-01-05 17:10   ` Jiri Olsa
  0 siblings, 0 replies; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 17:10 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:13PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 

SNIP

>  	return hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
>  }
>  
> @@ -476,6 +522,8 @@ iter_finish_cumulative_entry(struct add_entry_iter *iter,
>  	evsel->hists.stats.total_period += sample->period;
>  	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
>  
> +	free(iter->priv);
> +	iter->priv = NULL;

zfree ;-) 
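
i.e. (assuming the usual tools/perf helper) the two lines above collapse to:

	zfree(&iter->priv);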

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 05/21] perf hists: Accumulate hist entry stat based on the callchain
  2014-01-05 16:58   ` Jiri Olsa
@ 2014-01-05 17:15     ` Jiri Olsa
  2014-01-06  8:17       ` Namhyung Kim
  0 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 17:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, Jan 05, 2014 at 05:58:31PM +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:11PM +0900, Namhyung Kim wrote:
> > From: Namhyung Kim <namhyung.kim@lge.com>
> > 
> > Call __hists__add_entry() for each callchain node to get an
> > accumulated stat for an entry.  Introduce new cumulative_iter ops to
> > process them properly.
> > 
> > Cc: Arun Sharma <asharma@fb.com>
> > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/builtin-report.c | 103 +++++++++++++++++++++++++++++++++++++++++++-
> >  tools/perf/ui/stdio/hist.c  |   2 +-
> >  2 files changed, 103 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> > index 29fe19071d24..4fde0ab82498 100644
> > --- a/tools/perf/builtin-report.c
> > +++ b/tools/perf/builtin-report.c
> > @@ -360,6 +360,97 @@ iter_finish_normal_entry(struct add_entry_iter *iter, struct addr_location *al)
> >  	return hist_entry__append_callchain(he, sample);
> >  }
> >  
> > +static int
> > +iter_prepare_cumulative_entry(struct add_entry_iter *iter,
> > +			      struct machine *machine __maybe_unused,
> > +			      struct perf_evsel *evsel,
> > +			      struct addr_location *al __maybe_unused,
> > +			      struct perf_sample *sample)
> > +{
> > +	callchain_cursor_commit(&callchain_cursor);
> > +
> > +	iter->evsel = evsel;
> > +	iter->sample = sample;
> > +	return 0;
> > +}
> > +
> > +static int
> > +iter_add_single_cumulative_entry(struct add_entry_iter *iter,
> > +				 struct addr_location *al)
> > +{
> > +	struct perf_evsel *evsel = iter->evsel;
> > +	struct perf_sample *sample = iter->sample;
> > +	struct hist_entry *he;
> > +
> > +	he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
> > +				sample->period, sample->weight,
> > +				sample->transaction, true);
> > +	if (he == NULL)
> > +		return -ENOMEM;
> > +
> > +	/*
> > +	 * This is for putting parents upward during output resort iff
> > +	 * only a child gets sampled.  See hist_entry__sort_on_period().
> > +	 */
> > +	he->callchain->max_depth = PERF_MAX_STACK_DEPTH + 1;
> 
> so you're using callchain struct to hold the entry's stack
> position for sorting.. I think we could store this info
> inside hist_entry itself, and omit 'struct callchain_root'
> size being allocated for hist_entry
> 
> I checked 'struct hist_entry' and the 'position' entry seems to
> be abandoned ;-)) or we could use some unused entry (mem_info?)
> and create some union.

also perhaps above 5 lines should be part of the later commit:
  perf hists: Sort hist entries by accumulated period

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 09/21] perf ui/hist: Add support to accumulated hist stat
  2013-12-24  8:22 ` [PATCH 09/21] perf ui/hist: Add support to accumulated hist stat Namhyung Kim
@ 2014-01-05 17:31   ` Jiri Olsa
  2014-01-06  8:32     ` Namhyung Kim
  0 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 17:31 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:15PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Print accumulated stat of a hist entry if requested.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/ui/hist.c   | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/hist.h |  1 +
>  2 files changed, 46 insertions(+)
> 
> diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
> index 78f4c92e9b73..eb9c07bcdb01 100644
> --- a/tools/perf/ui/hist.c
> +++ b/tools/perf/ui/hist.c
> @@ -129,6 +129,28 @@ static int hpp__entry_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,		\
>  			  scnprintf, true);					\
>  }
>  
> +#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
> +static u64 he_get_acc_##_field(struct hist_entry *he)				\
> +{										\
> +	return he->stat_acc->_field;						\
> +}										\
> +										\
> +static int hpp__color_acc_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
> +			      struct perf_hpp *hpp, struct hist_entry *he) 	\
> +{										\
> +	return __hpp__fmt(hpp, he, he_get_acc_##_field, " %6.2f%%",		\
> +			  (hpp_snprint_fn)percent_color_snprintf, true);	\
> +}
> +
> +#define __HPP_ENTRY_ACC_PERCENT_FN(_type, _field)				\
> +static int hpp__entry_acc_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,	\
> +			      struct perf_hpp *hpp, struct hist_entry *he) 	\
> +{										\
> +	const char *fmt = symbol_conf.field_sep ? " %.2f" : " %6.2f%%";		\
> +	return __hpp__fmt(hpp, he, he_get_acc_##_field, fmt,			\
> +			  scnprintf, true);					\
> +}
> +
>  #define __HPP_ENTRY_RAW_FN(_type, _field)					\
>  static u64 he_get_raw_##_field(struct hist_entry *he)				\
>  {										\
> @@ -148,17 +170,25 @@ __HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
>  __HPP_COLOR_PERCENT_FN(_type, _field)					\
>  __HPP_ENTRY_PERCENT_FN(_type, _field)
>  
> +#define HPP_PERCENT_ACC_FNS(_type, _str, _field, _min_width, _unit_width)\
> +__HPP_HEADER_FN(_type, _str, _min_width, _unit_width)			\
> +__HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
> +__HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
> +__HPP_ENTRY_ACC_PERCENT_FN(_type, _field)

how about add 'stat|stat_acc' parameter to the HPP_PERCENT_FNS
macro and use it like:

#define __HPP_COLOR_PERCENT_FN(_type, _field, _stat)                            \
static u64 he_get_##_stat##_field(struct hist_entry *he)                        \
{                                                                               \
        return he->##_stat._field;                                              \
}                                                                               \


and use 'he_get_##_stat##_field' in __HPP_ENTRY_PERCENT_FN

This way you don't need to duplicate the code that differs
only in the used stat data.. but it's possible I missed
something ;-)

btw just noticed __HPP_COLOR_PERCENT_FN does not use _type

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 10/21] perf ui/browser: Add support to accumulated hist stat
  2013-12-24  8:22 ` [PATCH 10/21] perf ui/browser: " Namhyung Kim
@ 2014-01-05 17:33   ` Jiri Olsa
  0 siblings, 0 replies; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 17:33 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:16PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Print accumulated stat of a hist entry if requested.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/ui/browsers/hists.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
> index a440e03cd8c2..efa78894f70d 100644
> --- a/tools/perf/ui/browsers/hists.c
> +++ b/tools/perf/ui/browsers/hists.c
> @@ -693,11 +693,26 @@ hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
>  	return __hpp__color_fmt(hpp, he, __hpp_get_##_field, _cb);	\
>  }
>  
> +#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field, _cb)			\
> +static u64 __hpp_get_acc_##_field(struct hist_entry *he)		\
> +{									\
> +	return he->stat_acc->_field;					\
> +}									\
> +									\
> +static int								\
> +hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
> +				struct perf_hpp *hpp,			\
> +				struct hist_entry *he)			\
> +{									\
> +	return __hpp__color_fmt(hpp, he, __hpp_get_acc_##_field, _cb);	\
> +}
> +
>  __HPP_COLOR_PERCENT_FN(overhead, period, __hpp__color_callchain)
>  __HPP_COLOR_PERCENT_FN(overhead_sys, period_sys, NULL)
>  __HPP_COLOR_PERCENT_FN(overhead_us, period_us, NULL)
>  __HPP_COLOR_PERCENT_FN(overhead_guest_sys, period_guest_sys, NULL)
>  __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us, NULL)
> +__HPP_COLOR_ACC_PERCENT_FN(overhead_acc, period, NULL)

same comment as for ui/hist.c..

including _type parameter in __HPP_COLOR_PERCENT_FN ;-)

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 11/21] perf ui/gtk: Add support to accumulated hist stat
  2013-12-24  8:22 ` [PATCH 11/21] perf ui/gtk: " Namhyung Kim
@ 2014-01-05 17:35   ` Jiri Olsa
  0 siblings, 0 replies; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 17:35 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:17PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Print accumulated stat of a hist entry if requested.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/ui/gtk/hists.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
> index 2ca66cc1160f..70ed0d5e1b94 100644
> --- a/tools/perf/ui/gtk/hists.c
> +++ b/tools/perf/ui/gtk/hists.c
> @@ -98,11 +98,25 @@ static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,
>  	return __hpp__color_fmt(hpp, he, he_get_##_field);			\
>  }
>  
> +#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
> +static u64 he_get_acc_##_field(struct hist_entry *he)				\
> +{										\
> +	return he->stat_acc->_field;						\
> +}										\
> +										\
> +static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
> +				       struct perf_hpp *hpp,			\
> +				       struct hist_entry *he)			\
> +{										\
> +	return __hpp__color_fmt(hpp, he, he_get_acc_##_field);			\
> +}
> +
>  __HPP_COLOR_PERCENT_FN(overhead, period)
>  __HPP_COLOR_PERCENT_FN(overhead_sys, period_sys)
>  __HPP_COLOR_PERCENT_FN(overhead_us, period_us)
>  __HPP_COLOR_PERCENT_FN(overhead_guest_sys, period_guest_sys)
>  __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us)
> +__HPP_COLOR_ACC_PERCENT_FN(overhead_acc, period)

and same comment as for ui/hist.c in here

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 12/21] perf tools: Apply percent-limit to cumulative percentage
  2013-12-24  8:22 ` [PATCH 12/21] perf tools: Apply percent-limit to cumulative percentage Namhyung Kim
@ 2014-01-05 17:40   ` Jiri Olsa
  2014-01-06  8:33     ` Namhyung Kim
  0 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 17:40 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:18PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> If the -g cumulative option is given, it needs to show entries which
> have no self overhead.  So apply the percent-limit to the accumulated
> overhead percentage in this case.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/ui/browsers/hists.c | 34 ++++++++++++++++++++++++++--------
>  tools/perf/ui/gtk/hists.c      | 11 +++++++++--
>  tools/perf/ui/stdio/hist.c     | 11 +++++++++--
>  3 files changed, 44 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
> index efa78894f70d..b02e71ecc5fe 100644
> --- a/tools/perf/ui/browsers/hists.c
> +++ b/tools/perf/ui/browsers/hists.c
> @@ -828,12 +828,19 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
>  
>  	for (nd = browser->top; nd; nd = rb_next(nd)) {
>  		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
> -		float percent = h->stat.period * 100.0 /
> -					hb->hists->stats.total_period;
> +		float percent;
>  
>  		if (h->filtered)
>  			continue;
>  
> +		if (symbol_conf.cumulate_callchain) {
> +			percent = h->stat_acc->period * 100.0 /
> +					hb->hists->stats.total_period;
> +		} else {
> +			percent = h->stat.period * 100.0 /
> +					hb->hists->stats.total_period;
> +		}

seems like above code (7 lines) should go to function
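
e.g. something like (name made up, just a sketch):

	static float hist_entry__get_percent(struct hist_entry *he,
					     struct hists *hists)
	{
		u64 period = symbol_conf.cumulate_callchain ?
			     he->stat_acc->period : he->stat.period;

		return period * 100.0 / hists->stats.total_period;
	}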

jirka

> +
>  		if (percent < hb->min_pcnt)
>  			continue;
>  
> @@ -851,13 +858,17 @@ static struct rb_node *hists__filter_entries(struct rb_node *nd,
>  {
>  	while (nd != NULL) {
>  		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
> -		float percent = h->stat.period * 100.0 /
> -					hists->stats.total_period;
> +		float percent;
>  
> -		if (percent < min_pcnt)
> -			return NULL;
> +		if (symbol_conf.cumulate_callchain) {
> +			percent = h->stat_acc->period * 100.0 /
> +					hists->stats.total_period;
> +		} else {
> +			percent = h->stat.period * 100.0 /
> +					hists->stats.total_period;
> +		}
>  
> -		if (!h->filtered)
> +		if (!h->filtered && percent >= min_pcnt)
>  			return nd;
>  
>  		nd = rb_next(nd);
> @@ -872,8 +883,15 @@ static struct rb_node *hists__filter_prev_entries(struct rb_node *nd,
>  {
>  	while (nd != NULL) {
>  		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
> -		float percent = h->stat.period * 100.0 /
> +		float percent;
> +
> +		if (symbol_conf.cumulate_callchain) {
> +			percent = h->stat_acc->period * 100.0 /
> +					hists->stats.total_period;
> +		} else {
> +			percent = h->stat.period * 100.0 /
>  					hists->stats.total_period;
> +		}
>  
>  		if (!h->filtered && percent >= min_pcnt)
>  			return nd;
> diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
> index 70ed0d5e1b94..06ae3342e14f 100644
> --- a/tools/perf/ui/gtk/hists.c
> +++ b/tools/perf/ui/gtk/hists.c
> @@ -296,12 +296,19 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
>  	for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
>  		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
>  		GtkTreeIter iter;
> -		float percent = h->stat.period * 100.0 /
> -					hists->stats.total_period;
> +		float percent;
>  
>  		if (h->filtered)
>  			continue;
>  
> +		if (symbol_conf.cumulate_callchain) {
> +			percent = h->stat_acc->period * 100.0 /
> +					hists->stats.total_period;
> +		} else {
> +			percent = h->stat.period * 100.0 /
> +					hists->stats.total_period;
> +		}
> +
>  		if (percent < min_pcnt)
>  			continue;
>  
> diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
> index 4c4986e809d8..7ea8502192b0 100644
> --- a/tools/perf/ui/stdio/hist.c
> +++ b/tools/perf/ui/stdio/hist.c
> @@ -487,12 +487,19 @@ print_entries:
>  
>  	for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
>  		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
> -		float percent = h->stat.period * 100.0 /
> -					hists->stats.total_period;
> +		float percent;
>  
>  		if (h->filtered)
>  			continue;
>  
> +		if (symbol_conf.cumulate_callchain) {
> +			percent = h->stat_acc->period * 100.0 /
> +					hists->stats.total_period;
> +		} else {
> +			percent = h->stat.period * 100.0 /
> +					hists->stats.total_period;
> +		}
> +
>  		if (percent < min_pcnt)
>  			continue;
>  
> -- 
> 1.7.11.7
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 16/21] perf tools: Factor out sample__resolve_callchain()
  2013-12-24  8:22 ` [PATCH 16/21] perf tools: Factor out sample__resolve_callchain() Namhyung Kim
@ 2014-01-05 17:56   ` Jiri Olsa
  0 siblings, 0 replies; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 17:56 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:22PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> The report__resolve_callchain() can be shared with perf top code as it
> doesn't really depend on the perf report code.  Factor it out as
> sample__resolve_callchain().  The same goes for
> hist_entry__append_callchain() too.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-report.c | 22 ++--------------------
>  tools/perf/builtin-top.c    | 22 +++++++---------------
>  tools/perf/util/callchain.c | 23 +++++++++++++++++++++++
>  tools/perf/util/callchain.h |  6 ++++++
>  4 files changed, 38 insertions(+), 35 deletions(-)

Acked-by: Jiri Olsa <jolsa@redhat.com>

jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 18/21] perf top: Support callchain accumulation
  2013-12-24  8:22 ` [PATCH 18/21] perf top: Support callchain accumulation Namhyung Kim
@ 2014-01-05 18:01   ` Jiri Olsa
  2014-01-06  8:34     ` Namhyung Kim
  0 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 18:01 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:24PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Enable cumulation of callchain of children in perf top.
> 
> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-top.c | 106 +++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 103 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 48c527a0f4c8..6a7a76496c94 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -657,6 +657,99 @@ static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
>  	return 0;
>  }
>  
> +static int process_cumulative_entry(struct perf_top *top,
> +				    struct hist_entry *he,
> +				    struct perf_evsel *evsel,
> +				    struct addr_location *al,
> +				    struct perf_sample *sample,
> +				    struct symbol *parent)
> +{

hum, I wonder how hard it would be to move the iterator
stuff out of the report command and export it to be usable
from here as well.. too much code duplicated below :-\
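
e.g. moving the add_entry_iter bits into util/hist.[ch] with some shared
entry point (names are just a guess, sketch only):

	int hist_entry_iter__add(struct add_entry_iter *iter,
				 struct addr_location *al,
				 struct perf_evsel *evsel,
				 struct perf_sample *sample);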

jirka

> +	struct hist_entry **he_cache;
> +	struct callchain_cursor_node *node;
> +	int idx = 0, err;
> +
> +	he_cache = malloc(sizeof(*he_cache) * (PERF_MAX_STACK_DEPTH + 1));
> +	if (he_cache == NULL)
> +		return -ENOMEM;
> +
> +	pthread_mutex_lock(&evsel->hists.lock);
> +
> +	he_cache[idx++] = he;
> +
> +	/*
> +	 * This is for putting parents upward during output resort iff
> +	 * only a child gets sampled.  See hist_entry__sort_on_period().
> +	 */
> +	he->callchain->max_depth = PERF_MAX_STACK_DEPTH + 1;
> +
> +	callchain_cursor_commit(&callchain_cursor);
> +
> +	node = callchain_cursor_current(&callchain_cursor);
> +	while (node) {
> +		int i;
> +		struct hist_entry he_tmp = {
> +			.cpu = al->cpu,
> +			.thread = al->thread,
> +			.comm = thread__comm(al->thread),
> +			.parent = parent,
> +		};
> +
> +		fill_callchain_info(al, node, false);
> +
> +		he_tmp.ip = al->addr;
> +		he_tmp.ms.map = al->map;
> +		he_tmp.ms.sym = al->sym;
> +
> +		if (al->sym && al->sym->ignore)
> +			goto next;
> +
> +		/*
> +		 * Check if there's duplicate entries in the callchain.
> +		 * It's possible that it has cycles or recursive calls.
> +		 */
> +		for (i = 0; i < idx; i++) {
> +			if (hist_entry__cmp(he_cache[i], &he_tmp) == 0)
> +				goto next;
> +		}
> +
> +		he = __hists__add_entry(&evsel->hists, al, parent, NULL, NULL,
> +					sample->period, sample->weight,
> +					sample->transaction, false);
> +		if (he == NULL) {
> +			err = -ENOMEM;
> +			break;
> +		}
> +
> +		he_cache[idx++] = he;
> +
> +		/*
> +		 * This is for putting parents upward during output resort iff
> +		 * only a child gets sampled.  See hist_entry__sort_on_period().
> +		 */
> +		he->callchain->max_depth = callchain_cursor.nr - callchain_cursor.pos;
> +
> +		if (sort__has_sym) {
> +			u64 ip;
> +
> +			if (al->map)
> +				ip = al->map->unmap_ip(al->map, al->addr);
> +			else
> +				ip = al->addr;
> +
> +			perf_top__record_precise_ip(top, he, evsel->idx, ip);
> +		}
> +
> +next:
> +		callchain_cursor_advance(&callchain_cursor);
> +		node = callchain_cursor_current(&callchain_cursor);
> +	}
> +
> +	pthread_mutex_unlock(&evsel->hists.lock);
> +
> +	free(he_cache);
> +	return err;
> +}
> +
>  static void perf_event__process_sample(struct perf_tool *tool,
>  				       const union perf_event *event,
>  				       struct perf_evsel *evsel,
> @@ -754,9 +847,16 @@ static void perf_event__process_sample(struct perf_tool *tool,
>  			return;
>  		}
>  
> -		err = hist_entry__append_callchain(he, sample);
> -		if (err)
> -			return;
> +		if (symbol_conf.cumulate_callchain) {
> +			err = process_cumulative_entry(top, he, evsel, &al,
> +						       sample, parent);
> +			if (err)
> +				return;
> +		} else {
> +			err = hist_entry__append_callchain(he, sample);
> +			if (err)
> +				return;
> +		}
>  
>  		if (sort__has_sym)
>  			perf_top__record_precise_ip(top, he, evsel->idx, ip);
> -- 
> 1.7.11.7
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 21/21] perf tools: Enable --children option by default
  2013-12-24  8:22 ` [PATCH 21/21] perf tools: Enable --children option by default Namhyung Kim
@ 2014-01-05 18:08   ` Jiri Olsa
  2014-01-06  8:47     ` Namhyung Kim
  0 siblings, 1 reply; 53+ messages in thread
From: Jiri Olsa @ 2014-01-05 18:08 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Tue, Dec 24, 2013 at 05:22:27PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
> 
> Now perf top and perf report will show children column by default if
> it has callchain information.
> 
> Requested-by: Ingo Molnar <mingo@kernel.org>

hum, so I could easily put 'children = false' to my .perfconfig
to get the previous default behaviour back..

but I thought the notion was not to disturb users with
new features ...and I feel disturbed ;-)

I probably missed some discussion about this
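
(for reference, with the report.children / top.children options added
earlier in the series, that would presumably be something like

	[report]
		children = false
	[top]
		children = false

in ~/.perfconfig)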

jirka

> Cc: Arun Sharma <asharma@fb.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/symbol.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 923d00040bbf..3e73079e6c92 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -29,11 +29,12 @@ int vmlinux_path__nr_entries;
>  char **vmlinux_path;
>  
>  struct symbol_conf symbol_conf = {
> -	.use_modules	  = true,
> -	.try_vmlinux_path = true,
> -	.annotate_src	  = true,
> -	.demangle	  = true,
> -	.symfs            = "",
> +	.use_modules		= true,
> +	.try_vmlinux_path	= true,
> +	.annotate_src		= true,
> +	.demangle		= true,
> +	.cumulate_callchain	= true,
> +	.symfs			= "",
>  };
>  
>  static enum dso_binary_type binary_type_symtab[] = {
> -- 
> 1.7.11.7
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/21] perf tools: Introduce struct add_entry_iter
  2014-01-05 14:55   ` Jiri Olsa
@ 2014-01-06  7:45     ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  7:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos, Stephane Eranian

On Sun, 5 Jan 2014 15:55:42 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:07PM +0900, Namhyung Kim wrote:
>> From: Namhyung Kim <namhyung.kim@lge.com>
>> +static int
>> +iter_finish_branch_entry(struct add_entry_iter *iter,
>> +			 struct addr_location *al __maybe_unused)
>> +{
>> +	free(iter->priv);
>> +	iter->priv = NULL;
>
> Arnaldo just added zfree ;-)

Will change.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/21] perf tools: Introduce struct add_entry_iter
  2014-01-05 15:28   ` Jiri Olsa
@ 2014-01-06  7:45     ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  7:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos, Stephane Eranian

On Sun, 5 Jan 2014 16:28:15 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:07PM +0900, Namhyung Kim wrote:
>> From: Namhyung Kim <namhyung.kim@lge.com>
>> +	iter->rep = rep;
>> +	ret = perf_evsel__add_entry(evsel, &al, sample, machine, iter);
>
> you dont need to pass machine pointer, it's already in 'al'

Right.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/21] perf tools: Introduce struct add_entry_iter
  2014-01-05 15:55   ` Jiri Olsa
@ 2014-01-06  8:03     ` Namhyung Kim
  2014-01-06 14:32       ` Jiri Olsa
  0 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:03 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos, Stephane Eranian

On Sun, 5 Jan 2014 16:55:10 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:07PM +0900, Namhyung Kim wrote:
>> From: Namhyung Kim <namhyung.kim@lge.com>
>
> SNIP
>
>> +
>> +out:
>> +	iter->curr++;
>> +	return err;
>> +}
>> +
>> +static int
>> +iter_finish_branch_entry(struct add_entry_iter *iter,
>> +			 struct addr_location *al __maybe_unused)
>> +{
>> +	free(iter->priv);
>> +	iter->priv = NULL;
>
> so branch_info is duplicated in the hist_entry__new.. it's not
> easy to find why this one gets freed while mem_info stays ;-)

The branch_info is an array and each entry gets its own part in the
array - so entries alloc & copy (their part of) the branch info as the
branch info is iterated.  But mem_info is a single piece of data for a
single entry, so a new entry steals ownership of the mem_info when
created and it is not freed.

Maybe we can alloc & copy another mem_info when adding an entry and free
the original at iter->finish_entry(), but I felt it's unnecessary..

>
> some comment about that here would help

How about this?

  /*
   * The mem_info was either already freed in add_hist_entry() or
   * passed to a new hist entry by hist_entry__new().
   */


Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry
  2014-01-05 16:26   ` Jiri Olsa
@ 2014-01-06  8:09     ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, 5 Jan 2014 17:26:50 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:09PM +0900, Namhyung Kim wrote:
>> @@ -295,6 +306,7 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
>>  			 */
>>  			he->branch_info = malloc(sizeof(*he->branch_info));
>>  			if (he->branch_info == NULL) {
>> +				free(he->stat_acc);
>
> probably not big deal but above free should be under
> symbol_conf.cumulate_callchain check..

The he->stat_acc will be NULL if !symbol_conf.cumulate_callchain so it's
safe to call free().

And I'll add the call to free() into hist_entry__free() anyway.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry
  2014-01-05 16:45   ` Jiri Olsa
@ 2014-01-06  8:10     ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:10 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, 5 Jan 2014 17:45:46 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:09PM +0900, Namhyung Kim wrote:
>> From: Namhyung Kim <namhyung.kim@lge.com>
>> 
>> Maintain accumulated stat information in hist_entry->stat_acc if
>> symbol_conf.cumulate_callchain is set.  Fields in ->stat_acc have the same
>> values initially, and will be updated as callchain is processed later.
>> 
>> Cc: Arun Sharma <asharma@fb.com>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> ---
>>  tools/perf/util/hist.c   | 18 ++++++++++++++++++
>>  tools/perf/util/sort.h   |  1 +
>>  tools/perf/util/symbol.h |  1 +
>>  3 files changed, 20 insertions(+)
>> 
>> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
>> index 283aa1972b4f..b61e2fa42412 100644
>> --- a/tools/perf/util/hist.c
>> +++ b/tools/perf/util/hist.c
>> @@ -237,6 +237,8 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
>>  		return true;
>>  
>>  	he_stat__decay(&he->stat);
>> +	if (symbol_conf.cumulate_callchain)
>> +		he_stat__decay(he->stat_acc);
>>  
>>  	if (!he->filtered)
>>  		hists->stats.total_period -= prev_period - he->stat.period;
>> @@ -284,6 +286,15 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
>
> there's callchain size computation in here:
>         size_t callchain_size = symbol_conf.use_callchain ?  sizeof(struct callchain_root) : 0;
>
> I wonder we could make it zero for symbol_conf.cumulate_callchain

The plan is to support callchain with --children enabled. :)

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 05/21] perf hists: Accumulate hist entry stat based on the callchain
  2014-01-05 17:15     ` Jiri Olsa
@ 2014-01-06  8:17       ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:17 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, 5 Jan 2014 18:15:28 +0100, Jiri Olsa wrote:
> On Sun, Jan 05, 2014 at 05:58:31PM +0100, Jiri Olsa wrote:
>> On Tue, Dec 24, 2013 at 05:22:11PM +0900, Namhyung Kim wrote:
>> > +	/*
>> > +	 * This is for putting parents upward during output resort iff
>> > +	 * only a child gets sampled.  See hist_entry__sort_on_period().
>> > +	 */
>> > +	he->callchain->max_depth = PERF_MAX_STACK_DEPTH + 1;
>> 
>> so you're using callchain struct to hold the entry's stack
>> position for sorting.. I think we could store this info
>> inside hist_entry itself, and omit 'struct callchain_root'
>> size being allocated for hist_entry

I know it's hacky - it's there to keep the sorting ordering entries properly
without the callchain support Arun requested.  And I think I can add the
callchain support in the next version so it should be cleaned up too.

>> 
>> I checked 'struct hist_entry' and the 'position' entry seems to
>> be abandoned ;-)) or we could use some unused entry (mem_info?)
>> and create some union.

Yes, I also have a plan to cleanup/diet the struct hist_entry. :)

>
> also perhaps above 5 lines should be part of the later commit:
>   perf hists: Sort hist entries by accumulated period

As I said it'll be changed in the next version.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 06/21] perf tools: Update cpumode for each cumulative entry
  2014-01-05 17:02   ` Jiri Olsa
@ 2014-01-06  8:18     ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:18 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, 5 Jan 2014 18:02:19 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:12PM +0900, Namhyung Kim wrote:
[SNIP]

>> @@ -100,6 +100,7 @@ struct add_entry_iter {
>>  	struct report *rep;
>>  	struct perf_evsel *evsel;
>>  	struct perf_sample *sample;
>> +	struct machine *machine;
>>  	struct hist_entry *he;
>>  	struct symbol *parent;
>>  	void *priv;
>> @@ -362,7 +363,7 @@ iter_finish_normal_entry(struct add_entry_iter *iter, struct addr_location *al)
>>  
>>  static int
>>  iter_prepare_cumulative_entry(struct add_entry_iter *iter,
>> -			      struct machine *machine __maybe_unused,
>> +			      struct machine *machine,
>>  			      struct perf_evsel *evsel,
>>  			      struct addr_location *al __maybe_unused,
>>  			      struct perf_sample *sample)
>> @@ -371,6 +372,7 @@ iter_prepare_cumulative_entry(struct add_entry_iter *iter,
>>  
>>  	iter->evsel = evsel;
>>  	iter->sample = sample;
>> +	iter->machine = machine;
>
> machine is already in 'al->machine'

Okay, will remove ->machine in the iter.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 09/21] perf ui/hist: Add support to accumulated hist stat
  2014-01-05 17:31   ` Jiri Olsa
@ 2014-01-06  8:32     ` Namhyung Kim
  2014-01-06 14:30       ` Jiri Olsa
  0 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:32 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, 5 Jan 2014 18:31:13 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:15PM +0900, Namhyung Kim wrote:
>> +#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
>> +static u64 he_get_acc_##_field(struct hist_entry *he)				\
>> +{										\
>> +	return he->stat_acc->_field;						\
>> +}										\
>> +										\
>> +static int hpp__color_acc_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
>> +			      struct perf_hpp *hpp, struct hist_entry *he) 	\
>> +{										\
>> +	return __hpp__fmt(hpp, he, he_get_acc_##_field, " %6.2f%%",		\
>> +			  (hpp_snprint_fn)percent_color_snprintf, true);	\
>> +}
>> +
>> +#define __HPP_ENTRY_ACC_PERCENT_FN(_type, _field)				\
>> +static int hpp__entry_acc_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,	\
>> +			      struct perf_hpp *hpp, struct hist_entry *he) 	\
>> +{										\
>> +	const char *fmt = symbol_conf.field_sep ? " %.2f" : " %6.2f%%";		\
>> +	return __hpp__fmt(hpp, he, he_get_acc_##_field, fmt,			\
>> +			  scnprintf, true);					\
>> +}
>> +
>>  #define __HPP_ENTRY_RAW_FN(_type, _field)					\
>>  static u64 he_get_raw_##_field(struct hist_entry *he)				\
>>  {										\
>> @@ -148,17 +170,25 @@ __HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
>>  __HPP_COLOR_PERCENT_FN(_type, _field)					\
>>  __HPP_ENTRY_PERCENT_FN(_type, _field)
>>  
>> +#define HPP_PERCENT_ACC_FNS(_type, _str, _field, _min_width, _unit_width)\
>> +__HPP_HEADER_FN(_type, _str, _min_width, _unit_width)			\
>> +__HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
>> +__HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
>> +__HPP_ENTRY_ACC_PERCENT_FN(_type, _field)
>
> how about add 'stat|stat_acc' parameter to the HPP_PERCENT_FNS
> macro and use it like:
>
> #define __HPP_COLOR_PERCENT_FN(_type, _field, _stat)                            \
> static u64 he_get_##_stat##_field(struct hist_entry *he)                        \
> {                                                                               \
>         return he->##_stat._field;                                              \
> }                                                                               \
>
>
> and use 'he_get_##_stat##_field' in __HPP_ENTRY_PERCENT_FN
>
> This way you don't need to duplicate the code that differs
> only in the used stat data.. but it's possible I missed
> something ;-)

The problem is that he->stat is in-struct data but ->stat_acc is
dynamically allocated.  So it should be 

  "he->stat. ## _field"
           ~
vs.

  "he->stat_acc-> ## _field"
               ~~
>
> btw just noticed __HPP_COLOR_PERCENT_FN does not use _type

???  It's used in hpp__color_##_type().


Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 12/21] perf tools: Apply percent-limit to cumulative percentage
  2014-01-05 17:40   ` Jiri Olsa
@ 2014-01-06  8:33     ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:33 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, 5 Jan 2014 18:40:33 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:18PM +0900, Namhyung Kim wrote:
>> +		if (symbol_conf.cumulate_callchain) {
>> +			percent = h->stat_acc->period * 100.0 /
>> +					hb->hists->stats.total_period;
>> +		} else {
>> +			percent = h->stat.period * 100.0 /
>> +					hb->hists->stats.total_period;
>> +		}
>
> seems like above code (7 lines) should go to function

Will do!

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 18/21] perf top: Support callchain accumulation
  2014-01-05 18:01   ` Jiri Olsa
@ 2014-01-06  8:34     ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:34 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, 5 Jan 2014 19:01:54 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:24PM +0900, Namhyung Kim wrote:
>> From: Namhyung Kim <namhyung.kim@lge.com>
>> 
>> Enable cumulation of callchain of children in perf top.
>> 
>> Cc: Arun Sharma <asharma@fb.com>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> ---
>>  tools/perf/builtin-top.c | 106 +++++++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 103 insertions(+), 3 deletions(-)
>> 
>> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
>> index 48c527a0f4c8..6a7a76496c94 100644
>> --- a/tools/perf/builtin-top.c
>> +++ b/tools/perf/builtin-top.c
>> @@ -657,6 +657,99 @@ static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
>>  	return 0;
>>  }
>>  
>> +static int process_cumulative_entry(struct perf_top *top,
>> +				    struct hist_entry *he,
>> +				    struct perf_evsel *evsel,
>> +				    struct addr_location *al,
>> +				    struct perf_sample *sample,
>> +				    struct symbol *parent)
>> +{
>
> hum, I wonder how hard it would be to move the iterator
> stuff out of the report command and export it to be usable
> from here as well.. too much code duplicated below :-\

Yeah, I should find a way to reuse the code.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 21/21] perf tools: Enable --children option by default
  2014-01-05 18:08   ` Jiri Olsa
@ 2014-01-06  8:47     ` Namhyung Kim
  0 siblings, 0 replies; 53+ messages in thread
From: Namhyung Kim @ 2014-01-06  8:47 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Sun, 5 Jan 2014 19:08:18 +0100, Jiri Olsa wrote:
> On Tue, Dec 24, 2013 at 05:22:27PM +0900, Namhyung Kim wrote:
>> From: Namhyung Kim <namhyung.kim@lge.com>
>> 
>> Now perf top and perf report will show children column by default if
>> it has callchain information.
>> 
>> Requested-by: Ingo Molnar <mingo@kernel.org>
>
> hum, so I could easily put 'children = false' to my .perfconfig
> to get the previous default behaviour back..
>
> but I thought the notion was not to disturb users with
> new features ...and I feel disturbed ;-)
>
> I probably missed some discussion about this

Well, Ingo always wants to enable this by default (like sysprof does)
and to see how many people complain. :)

  https://lkml.org/lkml/2013/10/31/97
  http://www.spinics.net/lists/kernel/msg1656502.html

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 09/21] perf ui/hist: Add support to accumulated hist stat
  2014-01-06  8:32     ` Namhyung Kim
@ 2014-01-06 14:30       ` Jiri Olsa
  0 siblings, 0 replies; 53+ messages in thread
From: Jiri Olsa @ 2014-01-06 14:30 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos

On Mon, Jan 06, 2014 at 05:32:16PM +0900, Namhyung Kim wrote:
> On Sun, 5 Jan 2014 18:31:13 +0100, Jiri Olsa wrote:
> > On Tue, Dec 24, 2013 at 05:22:15PM +0900, Namhyung Kim wrote:
> >> +#define __HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
> >> +static u64 he_get_acc_##_field(struct hist_entry *he)				\
> >> +{										\
> >> +	return he->stat_acc->_field;						\
> >> +}										\
> >> +										\
> >> +static int hpp__color_acc_##_type(struct perf_hpp_fmt *fmt __maybe_unused,	\
> >> +			      struct perf_hpp *hpp, struct hist_entry *he) 	\
> >> +{										\
> >> +	return __hpp__fmt(hpp, he, he_get_acc_##_field, " %6.2f%%",		\
> >> +			  (hpp_snprint_fn)percent_color_snprintf, true);	\
> >> +}
> >> +
> >> +#define __HPP_ENTRY_ACC_PERCENT_FN(_type, _field)				\
> >> +static int hpp__entry_acc_##_type(struct perf_hpp_fmt *_fmt __maybe_unused,	\
> >> +			      struct perf_hpp *hpp, struct hist_entry *he) 	\
> >> +{										\
> >> +	const char *fmt = symbol_conf.field_sep ? " %.2f" : " %6.2f%%";		\
> >> +	return __hpp__fmt(hpp, he, he_get_acc_##_field, fmt,			\
> >> +			  scnprintf, true);					\
> >> +}
> >> +
> >>  #define __HPP_ENTRY_RAW_FN(_type, _field)					\
> >>  static u64 he_get_raw_##_field(struct hist_entry *he)				\
> >>  {										\
> >> @@ -148,17 +170,25 @@ __HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
> >>  __HPP_COLOR_PERCENT_FN(_type, _field)					\
> >>  __HPP_ENTRY_PERCENT_FN(_type, _field)
> >>  
> >> +#define HPP_PERCENT_ACC_FNS(_type, _str, _field, _min_width, _unit_width)\
> >> +__HPP_HEADER_FN(_type, _str, _min_width, _unit_width)			\
> >> +__HPP_WIDTH_FN(_type, _min_width, _unit_width)				\
> >> +__HPP_COLOR_ACC_PERCENT_FN(_type, _field)				\
> >> +__HPP_ENTRY_ACC_PERCENT_FN(_type, _field)
> >
> > how about add 'stat|stat_acc' parameter to the HPP_PERCENT_FNS
> > macro and use it like:
> >
> > #define __HPP_COLOR_PERCENT_FN(_type, _field, _stat)                            \
> > static u64 he_get_##_stat##_field(struct hist_entry *he)                        \
> > {                                                                               \
> >         return he->##_stat._field;                                              \
> > }                                                                               \
> >
> >
> > and use 'he_get_##_stat##_field' in __HPP_ENTRY_PERCENT_FN
> >
> > This way you don't need to duplicate the code that differs
> > only in the used stat data.. but it's possible I missed
> > something ;-)
> 
> The problem is that he->stat is in-struct data but ->stat_acc is
> dynamically allocated.  So it should be 
> 
>   "he->stat. ## _field"
>            ~
> vs.
> 
>   "he->stat_acc-> ## _field"
>                ~~

I see.. maybe some other 'is_pointer' parameter is better
than duplicating the code.. but it's not that big anyway.. nevermind ;-)
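
fwiw, one way around the '.' vs '->' problem - not what the series does,
just a sketch - would be to pass the whole accessor expression in:

	#define __HPP_COLOR_PERCENT_FN(_type, _get)			\
	static u64 he_get_##_type(struct hist_entry *he)		\
	{								\
		return _get;						\
	}								\
	/* ... rest as before, built around he_get_##_type ... */

	__HPP_COLOR_PERCENT_FN(overhead, he->stat.period)
	__HPP_COLOR_PERCENT_FN(overhead_acc, he->stat_acc->period)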

> >
> > btw just noticed __HPP_COLOR_PERCENT_FN does not use _type
> 
> ???  It's used in hpp__color_##_type().

aah ok, missed that

thanks,
jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/21] perf tools: Introduce struct add_entry_iter
  2014-01-06  8:03     ` Namhyung Kim
@ 2014-01-06 14:32       ` Jiri Olsa
  0 siblings, 0 replies; 53+ messages in thread
From: Jiri Olsa @ 2014-01-06 14:32 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, LKML, Arun Sharma,
	Frederic Weisbecker, Rodrigo Campos, Stephane Eranian

On Mon, Jan 06, 2014 at 05:03:22PM +0900, Namhyung Kim wrote:
> On Sun, 5 Jan 2014 16:55:10 +0100, Jiri Olsa wrote:
> > On Tue, Dec 24, 2013 at 05:22:07PM +0900, Namhyung Kim wrote:
> >> From: Namhyung Kim <namhyung.kim@lge.com>
> >
> > SNIP
> >
> >> +
> >> +out:
> >> +	iter->curr++;
> >> +	return err;
> >> +}
> >> +
> >> +static int
> >> +iter_finish_branch_entry(struct add_entry_iter *iter,
> >> +			 struct addr_location *al __maybe_unused)
> >> +{
> >> +	free(iter->priv);
> >> +	iter->priv = NULL;
> >
> > so branch_info is duplicated in the hist_entry__new.. it's not
> > easy to find why this one gets freed while mem_info stays ;-)
> 
> The branch_info is an array and each entry gets its own part in the
> array - so entries alloc & copy the (part of) branch info as the branch
> info iterated.  But mem info is a single data for a single entry so a
> new entry steals ownership of the mem info when created and not freed.
> 
> Maybe we can alloc & copy another mem info when adds an entry and free
> the original at iter->finish_entry(), but I felt it's unnecessary..
> 
> >
> > some comment about that here would help
> 
> How about this?
> 
>   /*
>    * The mem_info was either already freed in add_hist_entry() or
>    * passed to a new hist entry by hist_entry__new().
>    */
> 

sounds good, thanks
jirka

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2014-01-06 15:20 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-24  8:22 [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v4) Namhyung Kim
2013-12-24  8:22 ` [PATCH 01/21] perf tools: Introduce struct add_entry_iter Namhyung Kim
2014-01-05 14:55   ` Jiri Olsa
2014-01-06  7:45     ` Namhyung Kim
2014-01-05 15:28   ` Jiri Olsa
2014-01-06  7:45     ` Namhyung Kim
2014-01-05 15:55   ` Jiri Olsa
2014-01-06  8:03     ` Namhyung Kim
2014-01-06 14:32       ` Jiri Olsa
2013-12-24  8:22 ` [PATCH 02/21] perf hists: Convert hist entry functions to use struct he_stat Namhyung Kim
2014-01-05 16:09   ` Jiri Olsa
2013-12-24  8:22 ` [PATCH 03/21] perf hists: Add support for accumulated stat of hist entry Namhyung Kim
2014-01-05 16:16   ` Jiri Olsa
2014-01-05 16:26   ` Jiri Olsa
2014-01-06  8:09     ` Namhyung Kim
2014-01-05 16:45   ` Jiri Olsa
2014-01-06  8:10     ` Namhyung Kim
2013-12-24  8:22 ` [PATCH 04/21] perf hists: Check if accumulated when adding a " Namhyung Kim
2013-12-24  8:22 ` [PATCH 05/21] perf hists: Accumulate hist entry stat based on the callchain Namhyung Kim
2014-01-05 16:58   ` Jiri Olsa
2014-01-05 17:15     ` Jiri Olsa
2014-01-06  8:17       ` Namhyung Kim
2013-12-24  8:22 ` [PATCH 06/21] perf tools: Update cpumode for each cumulative entry Namhyung Kim
2014-01-05 17:02   ` Jiri Olsa
2014-01-06  8:18     ` Namhyung Kim
2013-12-24  8:22 ` [PATCH 07/21] perf report: Cache cumulative callchains Namhyung Kim
2014-01-05 17:10   ` Jiri Olsa
2013-12-24  8:22 ` [PATCH 08/21] perf hists: Sort hist entries by accumulated period Namhyung Kim
2013-12-24  8:22 ` [PATCH 09/21] perf ui/hist: Add support to accumulated hist stat Namhyung Kim
2014-01-05 17:31   ` Jiri Olsa
2014-01-06  8:32     ` Namhyung Kim
2014-01-06 14:30       ` Jiri Olsa
2013-12-24  8:22 ` [PATCH 10/21] perf ui/browser: " Namhyung Kim
2014-01-05 17:33   ` Jiri Olsa
2013-12-24  8:22 ` [PATCH 11/21] perf ui/gtk: " Namhyung Kim
2014-01-05 17:35   ` Jiri Olsa
2013-12-24  8:22 ` [PATCH 12/21] perf tools: Apply percent-limit to cumulative percentage Namhyung Kim
2014-01-05 17:40   ` Jiri Olsa
2014-01-06  8:33     ` Namhyung Kim
2013-12-24  8:22 ` [PATCH 13/21] perf tools: Add more hpp helper functions Namhyung Kim
2013-12-24  8:22 ` [PATCH 14/21] perf report: Add --children option Namhyung Kim
2013-12-24  8:22 ` [PATCH 15/21] perf report: Add report.children config option Namhyung Kim
2013-12-24  8:22 ` [PATCH 16/21] perf tools: Factor out sample__resolve_callchain() Namhyung Kim
2014-01-05 17:56   ` Jiri Olsa
2013-12-24  8:22 ` [PATCH 17/21] perf tools: Factor out fill_callchain_info() Namhyung Kim
2013-12-24  8:22 ` [PATCH 18/21] perf top: Support callchain accumulation Namhyung Kim
2014-01-05 18:01   ` Jiri Olsa
2014-01-06  8:34     ` Namhyung Kim
2013-12-24  8:22 ` [PATCH 19/21] perf top: Add --children option Namhyung Kim
2013-12-24  8:22 ` [PATCH 20/21] perf top: Add top.children config option Namhyung Kim
2013-12-24  8:22 ` [PATCH 21/21] perf tools: Enable --children option by default Namhyung Kim
2014-01-05 18:08   ` Jiri Olsa
2014-01-06  8:47     ` Namhyung Kim
