linux-kernel.vger.kernel.org archive mirror
* [GIT PULL 0/9] perf/core improvements
@ 2014-12-01 23:09 Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 1/9] perf stat: Use perf_evsel__read_cb in read_counter Arnaldo Carvalho de Melo
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Jiri Olsa,
	Matt Fleming, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Stephane Eranian, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling, on top of one previous pull request.

- Arnaldo


The following changes since commit 1d9e446b91e182055d874fbb30150aad479a4981:

  perf tools: Add snapshot format file parsing (2014-11-24 18:03:51 -0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-2

for you to fetch changes up to 09a6a1b07e5a579ef770d9728f5b158408c73c23:

  perf report: In branch stack mode use address history sorting (2014-12-01 20:00:31 -0300)

----------------------------------------------------------------
perf/core improvements:

User visible:

o Support handling complete branch stacks as histograms (Andi Kleen)

Infrastructure:

o Prep work for supporting per-pkg and snapshot counters in 'perf stat' (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Andi Kleen (3):
      perf callchain: Support handling complete branch stacks as histograms
      perf report: Add --branch-history option
      perf report: In branch stack mode use address history sorting

Jiri Olsa (6):
      perf stat: Use perf_evsel__read_cb in read_counter
      perf stat: Make read_counter work over the thread dimension
      perf stat: Use read_counter in read_counter_aggr
      perf tools: Remove perf_evsel__read interface
      perf stat: Add support for per-pkg counters
      perf stat: Add support for snapshot counters

 tools/perf/Documentation/perf-report.txt |  12 ++-
 tools/perf/builtin-report.c              |  31 ++++++--
 tools/perf/builtin-stat.c                | 105 +++++++++++++++++++++++---
 tools/perf/util/callchain.c              |   4 +
 tools/perf/util/callchain.h              |   1 +
 tools/perf/util/evsel.c                  |  34 ---------
 tools/perf/util/evsel.h                  |  30 +-------
 tools/perf/util/machine.c                | 126 ++++++++++++++++++++++++++++---
 tools/perf/util/symbol.h                 |   3 +-
 9 files changed, 257 insertions(+), 89 deletions(-)


* [PATCH 1/9] perf stat: Use perf_evsel__read_cb in read_counter
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 2/9] perf stat: Make read_counter work over the thread dimension Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Andi Kleen, Corey Ashford, David Ahern,
	Frederic Weisbecker, Matt Fleming, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Replace the __perf_evsel__read_on_cpu function with the
perf_evsel__read_cb function. The read_cb callback will also be used
later for globally aggregated counter values.
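
As an illustration of the callback-style read described here, a minimal
standalone sketch follows. It is not the perf code: counts_values,
read_cb_fn, read_with_cb and sum_cb are hypothetical names, and the
reading comes from a caller-supplied table rather than from a perf
event file descriptor.

```c
#include <stdint.h>

/* Hypothetical stand-in for one reading: value, time enabled, time running. */
struct counts_values {
	uint64_t val, ena, run;
};

/* Callback invoked once per reading; a non-zero return reports failure. */
typedef int (*read_cb_fn)(void *ctx, int cpu, int thread,
			  const struct counts_values *count);

/*
 * Hypothetical reader: looks up the reading for (cpu, thread) in a
 * table laid out as table[thread * ncpus + cpu] and hands it to cb.
 */
static int read_with_cb(const struct counts_values *table, int ncpus,
			int cpu, int thread, read_cb_fn cb, void *ctx)
{
	const struct counts_values *count = &table[thread * ncpus + cpu];

	return cb(ctx, cpu, thread, count);
}

/* Example callback: accumulate the raw values into a running total. */
static int sum_cb(void *ctx, int cpu, int thread,
		  const struct counts_values *count)
{
	uint64_t *total = ctx;

	(void)cpu;
	(void)thread;
	*total += count->val;
	return 0;
}
```

The point of the design is that the reader stays generic while the
callback decides what to do with each value (store it per CPU, or add
it to an aggregate).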

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1416562275-12404-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 055ce9232c9e..9cc0db1d7f06 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -388,6 +388,26 @@ static void update_shadow_stats(struct perf_evsel *counter, u64 *count)
 		update_stats(&runtime_itlb_cache_stats[0], count[0]);
 }
 
+static int read_cb(struct perf_evsel *evsel, int cpu, int thread __maybe_unused,
+		   struct perf_counts_values *count)
+{
+	switch (aggr_mode) {
+	case AGGR_CORE:
+	case AGGR_SOCKET:
+	case AGGR_NONE:
+		perf_evsel__compute_deltas(evsel, cpu, count);
+		perf_counts_values__scale(count, scale, NULL);
+		evsel->counts->cpu[cpu] = *count;
+		update_shadow_stats(evsel, count->values);
+		break;
+	case AGGR_GLOBAL:
+	default:
+		break;
+	}
+
+	return 0;
+}
+
 /*
  * Read out the results of a single counter:
  * aggregate counts across CPUs in system-wide mode
@@ -424,16 +444,11 @@ static int read_counter_aggr(struct perf_evsel *counter)
  */
 static int read_counter(struct perf_evsel *counter)
 {
-	u64 *count;
 	int cpu;
 
 	for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
-		if (__perf_evsel__read_on_cpu(counter, cpu, 0, scale) < 0)
+		if (perf_evsel__read_cb(counter, cpu, 0, read_cb))
 			return -1;
-
-		count = counter->counts->cpu[cpu].values;
-
-		update_shadow_stats(counter, count);
 	}
 
 	return 0;
-- 
1.9.3



* [PATCH 2/9] perf stat: Make read_counter work over the thread dimension
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 1/9] perf stat: Use perf_evsel__read_cb in read_counter Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 3/9] perf stat: Use read_counter in read_counter_aggr Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Andi Kleen, Corey Ashford, David Ahern,
	Frederic Weisbecker, Matt Fleming, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

The read function will be used later for both aggr and cpu counters, so
we need to make it work over threads as well.
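
A rough standalone sketch of walking both dimensions, assuming a
hypothetical visit callback (for_each_cpu_thread, visit_fn and
count_visit are illustrative names, not perf interfaces):

```c
/* Callback type for this sketch; returns non-zero on failure. */
typedef int (*visit_fn)(int cpu, int thread, void *ctx);

/* Visit every (thread, cpu) pair once, stopping at the first failure. */
static int for_each_cpu_thread(int ncpus, int nthreads, int system_wide,
			       visit_fn visit, void *ctx)
{
	int cpu, thread;

	/* A system-wide counter has one reading per CPU, not per thread. */
	if (system_wide)
		nthreads = 1;

	for (thread = 0; thread < nthreads; thread++) {
		for (cpu = 0; cpu < ncpus; cpu++) {
			if (visit(cpu, thread, ctx))
				return -1;
		}
	}
	return 0;
}

/* Example callback: count how many pairs were visited. */
static int count_visit(int cpu, int thread, void *ctx)
{
	(void)cpu;
	(void)thread;
	++*(int *)ctx;
	return 0;
}
```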

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1416562275-12404-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9cc0db1d7f06..2511d3aae708 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -444,11 +444,18 @@ static int read_counter_aggr(struct perf_evsel *counter)
  */
 static int read_counter(struct perf_evsel *counter)
 {
-	int cpu;
+	int nthreads = thread_map__nr(evsel_list->threads);
+	int ncpus = perf_evsel__nr_cpus(counter);
+	int cpu, thread;
 
-	for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
-		if (perf_evsel__read_cb(counter, cpu, 0, read_cb))
-			return -1;
+	if (counter->system_wide)
+		nthreads = 1;
+
+	for (thread = 0; thread < nthreads; thread++) {
+		for (cpu = 0; cpu < ncpus; cpu++) {
+			if (perf_evsel__read_cb(counter, cpu, thread, read_cb))
+				return -1;
+		}
 	}
 
 	return 0;
-- 
1.9.3



* [PATCH 3/9] perf stat: Use read_counter in read_counter_aggr
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 1/9] perf stat: Use perf_evsel__read_cb in read_counter Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 2/9] perf stat: Make read_counter work over the thread dimension Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 4/9] perf tools: Remove perf_evsel__read interface Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Andi Kleen, Corey Ashford, David Ahern,
	Frederic Weisbecker, Matt Fleming, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Use the read_counter function to retrieve aggr counter values, thus
eliminating the use of the __perf_evsel__read function.
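
To illustrate the aggregation step, here is a minimal sketch that sums
readings into one aggregate and then extrapolates the value when the
counter ran for only part of its enabled time. The names are
hypothetical, and the rounding and zero-run handling are assumptions of
this sketch, not a description of perf_counts_values__scale.

```c
#include <stdint.h>

/* Hypothetical stand-in for one reading: value, time enabled, time running. */
struct counts_values {
	uint64_t val, ena, run;
};

/* Add one reading to the aggregate; times are only needed when scaling. */
static void aggr_add(struct counts_values *aggr,
		     const struct counts_values *count, int scale)
{
	aggr->val += count->val;
	if (scale) {
		aggr->ena += count->ena;
		aggr->run += count->run;
	}
}

/*
 * Extrapolate val to the full enabled time.
 * Assumption: a counter that never ran reports 0.
 */
static uint64_t scaled_value(const struct counts_values *aggr)
{
	if (aggr->run == 0)
		return 0;
	if (aggr->run >= aggr->ena)
		return aggr->val;	/* ran the whole time, nothing to scale */
	return (uint64_t)((double)aggr->val * aggr->ena / aggr->run + 0.5);
}
```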

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1416562275-12404-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2511d3aae708..b24a7a08bd1d 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -391,6 +391,8 @@ static void update_shadow_stats(struct perf_evsel *counter, u64 *count)
 static int read_cb(struct perf_evsel *evsel, int cpu, int thread __maybe_unused,
 		   struct perf_counts_values *count)
 {
+	struct perf_counts_values *aggr = &evsel->counts->aggr;
+
 	switch (aggr_mode) {
 	case AGGR_CORE:
 	case AGGR_SOCKET:
@@ -401,6 +403,11 @@ static int read_cb(struct perf_evsel *evsel, int cpu, int thread __maybe_unused,
 		update_shadow_stats(evsel, count->values);
 		break;
 	case AGGR_GLOBAL:
+		aggr->val += count->val;
+		if (scale) {
+			aggr->ena += count->ena;
+			aggr->run += count->run;
+		}
 	default:
 		break;
 	}
@@ -408,20 +415,27 @@ static int read_cb(struct perf_evsel *evsel, int cpu, int thread __maybe_unused,
 	return 0;
 }
 
+static int read_counter(struct perf_evsel *counter);
+
 /*
  * Read out the results of a single counter:
  * aggregate counts across CPUs in system-wide mode
  */
 static int read_counter_aggr(struct perf_evsel *counter)
 {
+	struct perf_counts_values *aggr = &counter->counts->aggr;
 	struct perf_stat *ps = counter->priv;
 	u64 *count = counter->counts->aggr.values;
 	int i;
 
-	if (__perf_evsel__read(counter, perf_evsel__nr_cpus(counter),
-			       thread_map__nr(evsel_list->threads), scale) < 0)
+	aggr->val = aggr->ena = aggr->run = 0;
+
+	if (read_counter(counter))
 		return -1;
 
+	perf_evsel__compute_deltas(counter, -1, aggr);
+	perf_counts_values__scale(aggr, scale, &counter->counts->scaled);
+
 	for (i = 0; i < 3; i++)
 		update_stats(&ps->res_stats[i], count[i]);
 
-- 
1.9.3



* [PATCH 4/9] perf tools: Remove perf_evsel__read interface
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2014-12-01 23:09 ` [PATCH 3/9] perf stat: Use read_counter in read_counter_aggr Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 5/9] perf stat: Add support for per-pkg counters Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Andi Kleen, Corey Ashford, David Ahern,
	Frederic Weisbecker, Matt Fleming, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Remove the perf_evsel__read interfaces, because their only user, in the
stat command code, has been replaced.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1416562275-12404-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 34 ----------------------------------
 tools/perf/util/evsel.h | 29 -----------------------------
 2 files changed, 63 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2d26b7ad6fe0..1e90c8557ede 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -954,40 +954,6 @@ int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
 	return 0;
 }
 
-int __perf_evsel__read(struct perf_evsel *evsel,
-		       int ncpus, int nthreads, bool scale)
-{
-	size_t nv = scale ? 3 : 1;
-	int cpu, thread;
-	struct perf_counts_values *aggr = &evsel->counts->aggr, count;
-
-	if (evsel->system_wide)
-		nthreads = 1;
-
-	aggr->val = aggr->ena = aggr->run = 0;
-
-	for (cpu = 0; cpu < ncpus; cpu++) {
-		for (thread = 0; thread < nthreads; thread++) {
-			if (FD(evsel, cpu, thread) < 0)
-				continue;
-
-			if (readn(FD(evsel, cpu, thread),
-				  &count, nv * sizeof(u64)) < 0)
-				return -errno;
-
-			aggr->val += count.val;
-			if (scale) {
-				aggr->ena += count.ena;
-				aggr->run += count.run;
-			}
-		}
-	}
-
-	perf_evsel__compute_deltas(evsel, -1, aggr);
-	perf_counts_values__scale(aggr, scale, &evsel->counts->scaled);
-	return 0;
-}
-
 static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
 {
 	struct perf_evsel *leader = evsel->leader;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index b18d58da580b..3207f4861038 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -271,35 +271,6 @@ static inline int perf_evsel__read_on_cpu_scaled(struct perf_evsel *evsel,
 	return __perf_evsel__read_on_cpu(evsel, cpu, thread, true);
 }
 
-int __perf_evsel__read(struct perf_evsel *evsel, int ncpus, int nthreads,
-		       bool scale);
-
-/**
- * perf_evsel__read - Read the aggregate results on all CPUs
- *
- * @evsel - event selector to read value
- * @ncpus - Number of cpus affected, from zero
- * @nthreads - Number of threads affected, from zero
- */
-static inline int perf_evsel__read(struct perf_evsel *evsel,
-				    int ncpus, int nthreads)
-{
-	return __perf_evsel__read(evsel, ncpus, nthreads, false);
-}
-
-/**
- * perf_evsel__read_scaled - Read the aggregate results on all CPUs, scaled
- *
- * @evsel - event selector to read value
- * @ncpus - Number of cpus affected, from zero
- * @nthreads - Number of threads affected, from zero
- */
-static inline int perf_evsel__read_scaled(struct perf_evsel *evsel,
-					  int ncpus, int nthreads)
-{
-	return __perf_evsel__read(evsel, ncpus, nthreads, true);
-}
-
 int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 			     struct perf_sample *sample);
 
-- 
1.9.3



* [PATCH 5/9] perf stat: Add support for per-pkg counters
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2014-12-01 23:09 ` [PATCH 4/9] perf tools: Remove perf_evsel__read interface Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 6/9] perf stat: Add support for snapshot counters Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Andi Kleen, Corey Ashford, David Ahern,
	Frederic Weisbecker, Matt Fleming, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

The .per-pkg file indicates that all but one value per socket should be
discarded. Add logic to skip the rest of the socket once the first
value has been read.
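
The skipping idea can be sketched in isolation as follows. This is an
illustration only: MAX_SOCKETS, test_and_set, sum_per_pkg and the
cpu_to_socket table are hypothetical, and a single 64-bit word stands in
for the per-counter mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SOCKETS 64	/* hypothetical limit for this sketch */

/* Set the bit and report whether it was already set. */
static bool test_and_set(uint64_t *mask, int bit)
{
	uint64_t m = UINT64_C(1) << bit;
	bool was_set = (*mask & m) != 0;

	*mask |= m;
	return was_set;
}

/*
 * Sum per-CPU values, counting only the first CPU seen in each socket.
 * cpu_to_socket[] is a hypothetical CPU -> socket id table.
 * Returns -1 if a socket id is out of range.
 */
static int sum_per_pkg(const uint64_t *vals, const int *cpu_to_socket,
		       int ncpus, uint64_t *sum)
{
	uint64_t seen = 0;	/* cleared at the start of every read pass */
	int cpu;

	*sum = 0;
	for (cpu = 0; cpu < ncpus; cpu++) {
		int s = cpu_to_socket[cpu];

		if (s < 0 || s >= MAX_SOCKETS)
			return -1;
		if (test_and_set(&seen, s))
			continue;	/* socket already counted, skip */
		*sum += vals[cpu];
	}
	return 0;
}
```

Clearing the mask before each pass matters: without it, every socket
would be skipped on the second read.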

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1416562275-12404-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/evsel.h   |  1 +
 2 files changed, 50 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b24a7a08bd1d..860e8ad06616 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -388,10 +388,56 @@ static void update_shadow_stats(struct perf_evsel *counter, u64 *count)
 		update_stats(&runtime_itlb_cache_stats[0], count[0]);
 }
 
+static void zero_per_pkg(struct perf_evsel *counter)
+{
+	if (counter->per_pkg_mask)
+		memset(counter->per_pkg_mask, 0, MAX_NR_CPUS);
+}
+
+static int check_per_pkg(struct perf_evsel *counter, int cpu, bool *skip)
+{
+	unsigned long *mask = counter->per_pkg_mask;
+	struct cpu_map *cpus = perf_evsel__cpus(counter);
+	int s;
+
+	*skip = false;
+
+	if (!counter->per_pkg)
+		return 0;
+
+	if (cpu_map__empty(cpus))
+		return 0;
+
+	if (!mask) {
+		mask = zalloc(MAX_NR_CPUS);
+		if (!mask)
+			return -ENOMEM;
+
+		counter->per_pkg_mask = mask;
+	}
+
+	s = cpu_map__get_socket(cpus, cpu);
+	if (s < 0)
+		return -1;
+
+	*skip = test_and_set_bit(s, mask) == 1;
+	return 0;
+}
+
 static int read_cb(struct perf_evsel *evsel, int cpu, int thread __maybe_unused,
 		   struct perf_counts_values *count)
 {
 	struct perf_counts_values *aggr = &evsel->counts->aggr;
+	static struct perf_counts_values zero;
+	bool skip = false;
+
+	if (check_per_pkg(evsel, cpu, &skip)) {
+		pr_err("failed to read per-pkg counter\n");
+		return -1;
+	}
+
+	if (skip)
+		count = &zero;
 
 	switch (aggr_mode) {
 	case AGGR_CORE:
@@ -465,6 +511,9 @@ static int read_counter(struct perf_evsel *counter)
 	if (counter->system_wide)
 		nthreads = 1;
 
+	if (counter->per_pkg)
+		zero_per_pkg(counter);
+
 	for (thread = 0; thread < nthreads; thread++) {
 		for (cpu = 0; cpu < ncpus; cpu++) {
 			if (perf_evsel__read_cb(counter, cpu, thread, read_cb))
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3207f4861038..38622747d130 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -93,6 +93,7 @@ struct perf_evsel {
 	bool			system_wide;
 	bool			tracking;
 	bool			per_pkg;
+	unsigned long		*per_pkg_mask;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
-- 
1.9.3



* [PATCH 6/9] perf stat: Add support for snapshot counters
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2014-12-01 23:09 ` [PATCH 5/9] perf stat: Add support for per-pkg counters Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 7/9] perf callchain: Support handling complete branch stacks as histograms Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Andi Kleen, Corey Ashford, David Ahern,
	Frederic Weisbecker, Matt Fleming, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

The .snapshot file indicates that the provided event value is a snapshot
value. Bypass the delta computation logic for such events.
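
The difference between a cumulative counter and a snapshot counter can
be sketched as below, assuming that the delta is simply the current raw
reading minus the previous one. counter_state and report_value are
hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-counter state for this sketch. */
struct counter_state {
	uint64_t prev;		/* last raw reading, used for deltas */
	bool snapshot;		/* true: raw reading is already the result */
};

/* Return the value to report for a new raw reading. */
static uint64_t report_value(struct counter_state *st, uint64_t raw)
{
	uint64_t delta;

	if (st->snapshot)
		return raw;	/* bypass the delta computation */

	delta = raw - st->prev;
	st->prev = raw;
	return delta;
}
```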

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1416562275-12404-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 860e8ad06616..891086376381 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -443,7 +443,8 @@ static int read_cb(struct perf_evsel *evsel, int cpu, int thread __maybe_unused,
 	case AGGR_CORE:
 	case AGGR_SOCKET:
 	case AGGR_NONE:
-		perf_evsel__compute_deltas(evsel, cpu, count);
+		if (!evsel->snapshot)
+			perf_evsel__compute_deltas(evsel, cpu, count);
 		perf_counts_values__scale(count, scale, NULL);
 		evsel->counts->cpu[cpu] = *count;
 		update_shadow_stats(evsel, count->values);
@@ -479,7 +480,8 @@ static int read_counter_aggr(struct perf_evsel *counter)
 	if (read_counter(counter))
 		return -1;
 
-	perf_evsel__compute_deltas(counter, -1, aggr);
+	if (!counter->snapshot)
+		perf_evsel__compute_deltas(counter, -1, aggr);
 	perf_counts_values__scale(aggr, scale, &counter->counts->scaled);
 
 	for (i = 0; i < 3; i++)
-- 
1.9.3



* [PATCH 7/9] perf callchain: Support handling complete branch stacks as histograms
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2014-12-01 23:09 ` [PATCH 6/9] perf stat: Add support for snapshot counters Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 8/9] perf report: Add --branch-history option Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Andi Kleen, Jiri Olsa, Namhyung Kim,
	Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Currently branch stacks can only be shown as edge histograms for
individual branches. I never found this display particularly useful.

This implements an alternative mode that creates histograms over
complete branch traces, instead of individual branches, similar to how
normal callgraphs are handled. This is done by putting it in front of
the normal callgraph and then using the normal callgraph histogram
infrastructure to unify them.

This way, in complex functions, we can understand the control flow that
led to a particular sample, and may even see some control flow in the
caller for short functions.

Example (simplified; for such simple code this is usually not needed).
Please run this after the whole patchkit is in: at this point in the
patch order there is no --branch-history option yet, it is added in a
later patch:

tcall.c:

volatile int a = 10000, b = 100000, c;

__attribute__((noinline)) void f2(void)
{
	c = a / b;
}

__attribute__((noinline)) void f1(void)
{
	f2();
	f2();
}
int main(void)
{
	int i;
	for (i = 0; i < 1000000; i++)
		f1();
}

% perf record -b -g ./tsrc/tcall
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
% perf report --no-children --branch-history
...
    54.91%  tcall.c:6  [.] f2                      tcall
            |
            |--65.53%-- f2 tcall.c:5
            |          |
            |          |--70.83%-- f1 tcall.c:11
            |          |          f1 tcall.c:10
            |          |          main tcall.c:18
            |          |          main tcall.c:18
            |          |          main tcall.c:17
            |          |          main tcall.c:17
            |          |          f1 tcall.c:13
            |          |          f1 tcall.c:13
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:12
            |          |          f1 tcall.c:12
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:11
            |          |
            |           --29.17%-- f1 tcall.c:12
            |                     f1 tcall.c:12
            |                     f2 tcall.c:7
            |                     f2 tcall.c:5
            |                     f1 tcall.c:11
            |                     f1 tcall.c:10
            |                     main tcall.c:18
            |                     main tcall.c:18
            |                     main tcall.c:17
            |                     main tcall.c:17
            |                     f1 tcall.c:13
            |                     f1 tcall.c:13
            |                     f2 tcall.c:7
            |                     f2 tcall.c:5
            |                     f1 tcall.c:12

The default output is unchanged.

This is only implemented in perf report, no change to record or anywhere
else.

This adds the basic code to report:

- add a new "branch" option to the -g option parser to enable this mode
- when the flag is set, include the LBR in the callstack in machine.c
- detect overlaps with the callchain
- remove small loop duplicates in the LBR

The rest of the history code is unchanged and doesn't know the
difference between an LBR entry and a normal call entry.
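
As a rough illustration of removing small loop duplicates from a branch
history, the sketch below collapses a block of addresses that is
immediately repeated. It is not the patch's remove_loops(), which uses a
small hash table; a plain quadratic scan is swapped in here for brevity,
and remove_repeats is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Collapse immediately repeated blocks in a[0..n), e.g.
 * 1 2 3 1 2 3 4 becomes 1 2 3 4. Returns the new length.
 */
static size_t remove_repeats(uint64_t *a, size_t n)
{
	size_t i, j;

	for (i = 1; i < n; i++) {
		for (j = 0; j < i; j++) {
			size_t period = i - j;

			if (a[j] != a[i])
				continue;
			/* Only drop a block that is repeated in full. */
			if (i + period > n)
				continue;
			if (memcmp(&a[j], &a[i], period * sizeof(*a)) != 0)
				continue;

			/* Drop the second copy and close the gap. */
			memmove(&a[i], &a[i + period],
				(n - (i + period)) * sizeof(*a));
			n -= period;
			i--;	/* position i holds a new entry, recheck it */
			break;
		}
	}
	return n;
}
```

A sequence that merely revisits an address without repeating the whole
block (for example 1 2 1 3) is left untouched.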

Current limitations:

- The LBR flags (mispredict etc.) are not shown in the history
and LBR entries have no special marker.
- It would be nice if annotate marked the LBR entries somehow
(e.g. with arrows)

v2: Various fixes.
v3: Merge further patches into this one. Fix white space.
v4: Improve manpage. Address review feedback.
v5: Rename functions. Better error message without -g. Fix crash without
    -b.
v6: Rebase
v7: Rebase. Use NO_ENTRY in memset.
v8: Port to latest tip. Move add_callchain_ip to separate
    patch. Skip initial entries in callchain. Minor cleanups.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-report.txt |   7 +-
 tools/perf/builtin-report.c              |   4 +-
 tools/perf/util/callchain.c              |   4 +
 tools/perf/util/callchain.h              |   1 +
 tools/perf/util/machine.c                | 126 ++++++++++++++++++++++++++++---
 tools/perf/util/symbol.h                 |   3 +-
 6 files changed, 132 insertions(+), 13 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 0927bf4e6c2a..22706beffabc 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -159,7 +159,7 @@ OPTIONS
 --dump-raw-trace::
         Dump raw trace in ASCII.
 
--g [type,min[,limit],order[,key]]::
+-g [type,min[,limit],order[,key][,branch]]::
 --call-graph::
         Display call chains using type, min percent threshold, optional print
 	limit and order.
@@ -177,6 +177,11 @@ OPTIONS
 	- function: compare on functions
 	- address: compare on individual code addresses
 
+	branch can be:
+	- branch: include last branch information in callgraph
+	when available. Usually more convenient to use --branch-history
+	for this.
+
 	Default: fractal,0.5,callee,function.
 
 --children::
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 140a6cd88351..410d44fac64f 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -637,8 +637,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 		   "regex filter to identify parent, see: '--sort parent'"),
 	OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
 		    "Only display entries with parent-match"),
-	OPT_CALLBACK_DEFAULT('g', "call-graph", &report, "output_type,min_percent[,print_limit],call_order",
-		     "Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold, optional print limit, callchain order, key (function or address). "
+	OPT_CALLBACK_DEFAULT('g', "call-graph", &report, "output_type,min_percent[,print_limit],call_order[,branch]",
+		     "Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold, optional print limit, callchain order, key (function or address), add branches. "
 		     "Default: fractal,0.5,callee,function", &report_parse_callchain_opt, callchain_default_opt),
 	OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
 		    "Accumulate callchains of children and show total overhead as well"),
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 517ed84db97a..cf524a35cc84 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -149,6 +149,10 @@ static int parse_callchain_sort_key(const char *value)
 		callchain_param.key = CCKEY_ADDRESS;
 		return 0;
 	}
+	if (!strncmp(value, "branch", strlen(value))) {
+		callchain_param.branch_callstack = 1;
+		return 0;
+	}
 	return -1;
 }
 
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 3f158474c892..dbc08cf5f970 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -63,6 +63,7 @@ struct callchain_param {
 	sort_chain_func_t	sort;
 	enum chain_order	order;
 	enum chain_key		key;
+	bool			branch_callstack;
 };
 
 extern struct callchain_param callchain_param;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index b75b487574c7..15dd0a9691ce 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -12,6 +12,7 @@
 #include <stdbool.h>
 #include <symbol/kallsyms.h>
 #include "unwind.h"
+#include "linux/hash.h"
 
 static void dsos__init(struct dsos *dsos)
 {
@@ -1391,7 +1392,11 @@ static int add_callchain_ip(struct thread *thread,
 
 	al.filtered = 0;
 	al.sym = NULL;
-	thread__find_addr_location(thread, cpumode, MAP__FUNCTION,
+	if (cpumode == -1)
+		thread__find_cpumode_addr_location(thread, MAP__FUNCTION,
+						   ip, &al);
+	else
+		thread__find_addr_location(thread, cpumode, MAP__FUNCTION,
 				   ip, &al);
 	if (al.sym != NULL) {
 		if (sort__has_parent && !*parent &&
@@ -1427,8 +1432,50 @@ struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
 	return bi;
 }
 
+#define CHASHSZ 127
+#define CHASHBITS 7
+#define NO_ENTRY 0xff
+
+#define PERF_MAX_BRANCH_DEPTH 127
+
+/* Remove loops. */
+static int remove_loops(struct branch_entry *l, int nr)
+{
+	int i, j, off;
+	unsigned char chash[CHASHSZ];
+
+	memset(chash, NO_ENTRY, sizeof(chash));
+
+	BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
+
+	for (i = 0; i < nr; i++) {
+		int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
+
+		/* no collision handling for now */
+		if (chash[h] == NO_ENTRY) {
+			chash[h] = i;
+		} else if (l[chash[h]].from == l[i].from) {
+			bool is_loop = true;
+			/* check if it is a real loop */
+			off = 0;
+			for (j = chash[h]; j < i && i + off < nr; j++, off++)
+				if (l[j].from != l[i + off].from) {
+					is_loop = false;
+					break;
+				}
+			if (is_loop) {
+				memmove(l + i, l + i + off,
+					(nr - (i + off)) * sizeof(*l));
+				nr -= off;
+			}
+		}
+	}
+	return nr;
+}
+
 static int thread__resolve_callchain_sample(struct thread *thread,
 					     struct ip_callchain *chain,
+					     struct branch_stack *branch,
 					     struct symbol **parent,
 					     struct addr_location *root_al,
 					     int max_stack)
@@ -1438,22 +1485,82 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 	int i;
 	int j;
 	int err;
-	int skip_idx __maybe_unused;
+	int skip_idx = -1;
+	int first_call = 0;
+
+	/*
+	 * Based on DWARF debug information, some architectures skip
+	 * a callchain entry saved by the kernel.
+	 */
+	if (chain->nr < PERF_MAX_STACK_DEPTH)
+		skip_idx = arch_skip_callchain_idx(thread, chain);
 
 	callchain_cursor_reset(&callchain_cursor);
 
+	/*
+	 * Add branches to call stack for easier browsing. This gives
+	 * more context for a sample than just the callers.
+	 *
+	 * This uses individual histograms of paths compared to the
+	 * aggregated histograms the normal LBR mode uses.
+	 *
+	 * Limitations for now:
+	 * - No extra filters
+	 * - No annotations (should annotate somehow)
+	 */
+
+	if (branch && callchain_param.branch_callstack) {
+		int nr = min(max_stack, (int)branch->nr);
+		struct branch_entry be[nr];
+
+		if (branch->nr > PERF_MAX_BRANCH_DEPTH) {
+			pr_warning("corrupted branch chain. skipping...\n");
+			goto check_calls;
+		}
+
+		for (i = 0; i < nr; i++) {
+			if (callchain_param.order == ORDER_CALLEE) {
+				be[i] = branch->entries[i];
+				/*
+				 * Check for overlap into the callchain.
+				 * The return address is one off compared to
+				 * the branch entry. To adjust for this
+				 * assume the calling instruction is not longer
+				 * than 8 bytes.
+				 */
+				if (i == skip_idx ||
+				    chain->ips[first_call] >= PERF_CONTEXT_MAX)
+					first_call++;
+				else if (be[i].from < chain->ips[first_call] &&
+				    be[i].from >= chain->ips[first_call] - 8)
+					first_call++;
+			} else
+				be[i] = branch->entries[branch->nr - i - 1];
+		}
+
+		nr = remove_loops(be, nr);
+
+		for (i = 0; i < nr; i++) {
+			err = add_callchain_ip(thread, parent, root_al,
+					       -1, be[i].to);
+			if (!err)
+				err = add_callchain_ip(thread, parent, root_al,
+						       -1, be[i].from);
+			if (err == -EINVAL)
+				break;
+			if (err)
+				return err;
+		}
+		chain_nr -= nr;
+	}
+
+check_calls:
 	if (chain->nr > PERF_MAX_STACK_DEPTH) {
 		pr_warning("corrupted callchain. skipping...\n");
 		return 0;
 	}
 
-	/*
-	 * Based on DWARF debug information, some architectures skip
-	 * a callchain entry saved by the kernel.
-	 */
-	skip_idx = arch_skip_callchain_idx(thread, chain);
-
-	for (i = 0; i < chain_nr; i++) {
+	for (i = first_call; i < chain_nr; i++) {
 		u64 ip;
 
 		if (callchain_param.order == ORDER_CALLEE)
@@ -1517,6 +1624,7 @@ int thread__resolve_callchain(struct thread *thread,
 			      int max_stack)
 {
 	int ret = thread__resolve_callchain_sample(thread, sample->callchain,
+						   sample->branch_stack,
 						   parent, root_al, max_stack);
 	if (ret)
 		return ret;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index e0b297c50f9d..9d602e9c6f59 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -102,7 +102,8 @@ struct symbol_conf {
 			demangle,
 			demangle_kernel,
 			filter_relative,
-			show_hist_headers;
+			show_hist_headers,
+			branch_callstack;
 	const char	*vmlinux_name,
 			*kallsyms_name,
 			*source_prefix,
-- 
1.9.3
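
Editorial sketch (not part of the patch): the loop removal done by
remove_loops() above can be tried out in isolation. The version below is a
simplified stand-in, assuming bare 'from' addresses instead of struct
branch_entry and a trivial modulo hash in place of hash_64(). The names
remove_loops_sketch() and slot_of() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CHASHSZ   127
#define NO_ENTRY  0xff
#define MAX_DEPTH 127	/* mirrors PERF_MAX_BRANCH_DEPTH in the patch */

/* Assumption: a trivial stand-in for the hash_64() call used in the patch. */
static unsigned int slot_of(uint64_t from)
{
	return (unsigned int)(from % CHASHSZ);
}

/*
 * Remove repeated runs of branch sources in place and return the new
 * length. As in the patch, hash collisions are not handled: a colliding
 * address is simply never remembered.
 */
static int remove_loops_sketch(uint64_t *from, int nr)
{
	unsigned char chash[CHASHSZ];
	int i, j, off;

	/* Indices are stored in one byte, so the depth must stay small. */
	assert(nr >= 0 && nr <= MAX_DEPTH);
	memset(chash, NO_ENTRY, sizeof(chash));

	for (i = 0; i < nr; i++) {
		unsigned int h = slot_of(from[i]);

		if (chash[h] == NO_ENTRY) {
			/* First time this source address is seen. */
			chash[h] = (unsigned char)i;
		} else if (from[chash[h]] == from[i]) {
			bool is_loop = true;

			/* Does the run starting at the first hit repeat at i? */
			off = 0;
			for (j = chash[h]; j < i && i + off < nr; j++, off++) {
				if (from[j] != from[i + off]) {
					is_loop = false;
					break;
				}
			}
			if (is_loop) {
				/* Drop the repeated run, keep the tail. */
				memmove(from + i, from + i + off,
					(size_t)(nr - (i + off)) * sizeof(*from));
				nr -= off;
			}
		}
	}
	return nr;
}
```

With the input 0x10, 0x20, 0x30, 0x20, 0x30, 0x40 the second "0x20, 0x30"
run is dropped and four entries remain. Like the original, the sketch also
treats a repeat that is cut off by the end of the array as a loop.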



* [PATCH 8/9] perf report: Add --branch-history option
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2014-12-01 23:09 ` [PATCH 7/9] perf callchain: Support handling complete branch stacks as histograms Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-01 23:09 ` [PATCH 9/9] perf report: In branch stack mode use address history sorting Arnaldo Carvalho de Melo
  2014-12-08  6:46 ` [GIT PULL 0/9] perf/core improvements Ingo Molnar
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Andi Kleen, Jiri Olsa, Namhyung Kim,
	Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Add a --branch-history option to perf report that changes all the
settings necessary for using the branches in callstacks.

This is just a shortcut to make the feature nicer to use; it does not
enable any functionality by itself.

v2: Change sort order. Rename option to --branch-history to
    be less confusing.
v3: Updates
v4: Fix conflict with newer perf base
v5: Port to latest tip
v6: Add more comments. Remove CCKEY_ADDRESS setting. Remove
    unnecessary branch_mode setting. Use a boolean.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-report.txt |  5 +++++
 tools/perf/builtin-report.c              | 26 ++++++++++++++++++++++----
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 22706beffabc..dd7cccdde498 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -271,6 +271,11 @@ OPTIONS
 	branch stacks and it will automatically switch to the branch view mode,
 	unless --no-branch-stack is used.
 
+--branch-history::
+	Add the addresses of sampled taken branches to the callstack.
+	This allows examining the path the program took to reach each sample.
+	The data collection must have used -b (or -j) and -g.
+
 --objdump=<path>::
         Path to objdump binary.
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 410d44fac64f..fb272ff435c9 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -226,8 +226,9 @@ static int report__setup_sample_type(struct report *rep)
 			return -EINVAL;
 		}
 		if (symbol_conf.use_callchain) {
-			ui__error("Selected -g but no callchain data. Did "
-				    "you call 'perf record' without -g?\n");
+			ui__error("Selected -g or --branch-history but no "
+				  "callchain data. Did\n"
+				  "you call 'perf record' without -g?\n");
 			return -1;
 		}
 	} else if (!rep->dont_use_callchains &&
@@ -575,6 +576,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	struct stat st;
 	bool has_br_stack = false;
 	int branch_mode = -1;
+	bool branch_call_mode = false;
 	char callchain_default_opt[] = "fractal,0.5,callee";
 	const char * const report_usage[] = {
 		"perf report [<options>]",
@@ -684,7 +686,10 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
 		    "Show event group information together"),
 	OPT_CALLBACK_NOOPT('b', "branch-stack", &branch_mode, "",
-		    "use branch records for histogram filling", parse_branch_mode),
+		    "use branch records for per branch histogram filling",
+		    parse_branch_mode),
+	OPT_BOOLEAN(0, "branch-history", &branch_call_mode,
+		    "add last branch records to call history"),
 	OPT_STRING(0, "objdump", &objdump_path, "path",
 		   "objdump binary to use for disassembly and annotations"),
 	OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
@@ -745,10 +750,23 @@ repeat:
 	has_br_stack = perf_header__has_feat(&session->header,
 					     HEADER_BRANCH_STACK);
 
-	if ((branch_mode == -1 && has_br_stack) || branch_mode == 1) {
+	/*
+	 * Branch mode is a tristate:
+	 * -1 means default, so decide based on the file having branch data.
+	 * 0/1 means the user chose a mode.
+	 */
+	if (((branch_mode == -1 && has_br_stack) || branch_mode == 1) &&
+	    !branch_call_mode) {
 		sort__mode = SORT_MODE__BRANCH;
 		symbol_conf.cumulate_callchain = false;
 	}
+	if (branch_call_mode) {
+		callchain_param.branch_callstack = 1;
+		symbol_conf.use_callchain = true;
+		callchain_register_param(&callchain_param);
+		if (sort_order == NULL)
+			sort_order = "srcline,symbol,dso";
+	}
 
 	if (report.mem_mode) {
 		if (sort__mode == SORT_MODE__BRANCH) {
-- 
1.9.3
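
Editorial sketch (not part of the patch): the mode selection described in
the tristate comment above can be modelled as a small pure function. This
assumes the branch view is meant to be skipped whenever --branch-history is
given, so the boolean is tested directly. The struct, the enum and
pick_modes() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum sort_mode_sketch { MODE_NORMAL, MODE_BRANCH };

/* Hypothetical bundle of the settings the option handling touches. */
struct report_modes {
	enum sort_mode_sketch sort_mode;
	bool branch_callstack;
	bool use_callchain;
};

/*
 * branch_mode is a tristate: -1 means "decide from the file",
 * 0 and 1 mean the user chose explicitly.
 * branch_call_mode models the --branch-history flag.
 */
static struct report_modes pick_modes(int branch_mode, bool has_br_stack,
				      bool branch_call_mode)
{
	struct report_modes m = { MODE_NORMAL, false, false };

	assert(branch_mode >= -1 && branch_mode <= 1);

	/* The per-branch histogram view is used only without --branch-history. */
	if (((branch_mode == -1 && has_br_stack) || branch_mode == 1) &&
	    !branch_call_mode)
		m.sort_mode = MODE_BRANCH;

	/* --branch-history turns on callchains with branch entries added. */
	if (branch_call_mode) {
		m.branch_callstack = true;
		m.use_callchain = true;
	}
	return m;
}
```

For example, pick_modes(-1, true, false) selects the branch view, while
pick_modes(-1, true, true) stays in the normal sort mode and enables
callchains with branch entries.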



* [PATCH 9/9] perf report: In branch stack mode use address history sorting
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2014-12-01 23:09 ` [PATCH 8/9] perf report: Add --branch-history option Arnaldo Carvalho de Melo
@ 2014-12-01 23:09 ` Arnaldo Carvalho de Melo
  2014-12-08  6:46 ` [GIT PULL 0/9] perf/core improvements Ingo Molnar
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-01 23:09 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Enable CCKEY_ADDRESS address history sorting with --branch-history.
This makes get_srcline display the source lines correctly; otherwise all
history entries for a function are lumped into one.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1416275935-20971-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fb272ff435c9..39367609c707 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -761,6 +761,7 @@ repeat:
 		symbol_conf.cumulate_callchain = false;
 	}
 	if (branch_call_mode) {
+		callchain_param.key = CCKEY_ADDRESS;
 		callchain_param.branch_callstack = 1;
 		symbol_conf.use_callchain = true;
 		callchain_register_param(&callchain_param);
-- 
1.9.3
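
Editorial sketch (not part of the patch): a toy model of why the key
matters. It assumes that with a function key callchain entries merge when
they resolve to the same symbol, and that with an address key they merge
only on an identical address. The types and helpers below are hypothetical
and do not reproduce perf's callchain code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum key_sketch { KEY_FUNCTION, KEY_ADDRESS };

/* Hypothetical entry: start of the resolved symbol (0 if none) and the ip. */
struct chain_entry_sketch {
	uint64_t sym_start;
	uint64_t ip;
};

/* Value on which two entries are considered "the same" for merging. */
static uint64_t merge_key(const struct chain_entry_sketch *e,
			  enum key_sketch key)
{
	if (key == KEY_FUNCTION && e->sym_start != 0)
		return e->sym_start;
	return e->ip;
}

/* Count how many distinct groups the entries fall into under 'key'. */
static size_t count_groups(const struct chain_entry_sketch *e, size_t n,
			   enum key_sketch key)
{
	size_t groups = 0;

	assert(e != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int seen = 0;

		for (size_t j = 0; j < i; j++) {
			if (merge_key(&e[j], key) == merge_key(&e[i], key)) {
				seen = 1;
				break;
			}
		}
		if (!seen)
			groups++;
	}
	return groups;
}
```

Three branch addresses inside one function plus one in another give two
groups under KEY_FUNCTION but four under KEY_ADDRESS, so each history entry
keeps its own source line.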



* Re: [GIT PULL 0/9] perf/core improvements
  2014-12-01 23:09 [GIT PULL 0/9] perf/core improvements Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2014-12-01 23:09 ` [PATCH 9/9] perf report: In branch stack mode use address history sorting Arnaldo Carvalho de Melo
@ 2014-12-08  6:46 ` Ingo Molnar
  9 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2014-12-08  6:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Andi Kleen, Corey Ashford, David Ahern,
	Frederic Weisbecker, Jiri Olsa, Matt Fleming, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling, on top of one previous pull request.
> 
> - Arnaldo
> 
> 
> The following changes since commit 1d9e446b91e182055d874fbb30150aad479a4981:
> 
>   perf tools: Add snapshot format file parsing (2014-11-24 18:03:51 -0300)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-2
> 
> for you to fetch changes up to 09a6a1b07e5a579ef770d9728f5b158408c73c23:
> 
>   perf report: In branch stack mode use address history sorting (2014-12-01 20:00:31 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements:
> 
> User visible:
> 
> o Support handling complete branch stacks as histograms (Andi Kleen)
> 
> Infrastructure:
> 
> o Prep work for supporting per-pkg and snapshot counters in 'perf stat' (Jiri Olsa)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Andi Kleen (3):
>       perf callchain: Support handling complete branch stacks as histograms
>       perf report: Add --branch-history option
>       perf report: In branch stack mode use address history sorting
> 
> Jiri Olsa (6):
>       perf stat: Use perf_evsel__read_cb in read_counter
>       perf stat: Make read_counter work over the thread dimension
>       perf stat: Use read_counter in read_counter_aggr
>       perf tools: Remove perf_evsel__read interface
>       perf stat: Add support for per-pkg counters
>       perf stat: Add support for snapshot counters
> 
>  tools/perf/Documentation/perf-report.txt |  12 ++-
>  tools/perf/builtin-report.c              |  31 ++++++--
>  tools/perf/builtin-stat.c                | 105 +++++++++++++++++++++++---
>  tools/perf/util/callchain.c              |   4 +
>  tools/perf/util/callchain.h              |   1 +
>  tools/perf/util/evsel.c                  |  34 ---------
>  tools/perf/util/evsel.h                  |  30 +-------
>  tools/perf/util/machine.c                | 126 ++++++++++++++++++++++++++++---
>  tools/perf/util/symbol.h                 |   3 +-
>  9 files changed, 257 insertions(+), 89 deletions(-)

Pulled, thanks a lot Arnaldo!

	Ingo
