* Cycles annotation support for perf tools v2
@ 2015-05-27 17:51 Andi Kleen
  2015-05-27 17:51 ` [PATCH 01/11] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
                   ` (11 more replies)
  0 siblings, 12 replies; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel

[v2: Addressed review comments. Fixed display problems and
IPC is now computed correctly. See patches for detailed changes.]

The upcoming Skylake CPU has a new timed branch stack feature
that reports cycle counts for individual branches in the
last branch record (LBR).

This makes it possible to get fine-grained cost information for code
and to compute fine-grained IPC (instructions per cycle).

Available from
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools2

This patchkit adds support for this in the perf tools:
- Basic support for the cycles field like other branch fields
- Show cycles in the standard branch sort view (no IPC here,
  as IPC needs the instruction counts from annotation)
- Annotate cycles and IPC in the assembler annotate view
- Add branch support to top, so we can do live annotation.
- Misc support, like dumping it in perf report -D

The kernel support has been posted separately. I included a test patch
to generate fake data for testing on existing systems.

Example output for annotate (with made-up numbers). The second column
is the IPC and the third is the average cycles for the basic block.

                   │    static int hex(char ch)
                   │    {
        0.12       │      push   %rbp
        0.12       │      mov    %rsp,%rbp
        0.12       │      sub    $0x20,%rsp
        0.12       │      mov    %edi,%eax
        0.12       │      mov    %al,-0x14(%rbp)
        0.12       │      mov    %fs:0x28,%rax
        0.12       │      mov    %rax,-0x8(%rbp)
        0.12       │      xor    %eax,%eax
                   │            if ((ch >= '0') && (ch <= '9'))
        0.12       │      cmpb   $0x2f,-0x14(%rbp)
 66.67  0.12   123 │    ↓ jle    31
        0.12       │      cmpb   $0x39,-0x14(%rbp)
        0.12   123 │    ↓ jg     31
                   │                    return ch - '0';
 22.22  0.12       │      movsbl -0x14(%rbp),%eax
        0.12       │      sub    $0x30,%eax
        0.12   123 │    ↓ jmp    60
                   │            if ((ch >= 'a') && (ch <= 'f'))
        0.06       │31:   cmpb   $0x60,-0x14(%rbp)
        0.06   123 │    ↓ jle    46
        0.06       │      cmpb   $0x66,-0x14(%rbp)
        0.06       │    ↓ jg     46
                   │                    return ch - 'a' + 10;
        0.06       │      movsbl -0x14(%rbp),%eax

Example output for branch view (again with fake data):

Overhead  Command  Source Shared Object  Source Symbol                               Target Symbol                               Basic Block Cycles
  30.08%  tcall    tcall                 [.] f1                                      [.] f2                                      123
  27.44%  tcall    tcall                 [.] f2                                      [.] f1                                      123
  15.60%  tcall    tcall                 [.] main                                    [.] f1                                      123
  12.96%  tcall    tcall                 [.] f1                                      [.] main                                    123
  12.86%  tcall    tcall                 [.] main                                    [.] main                                    123
   0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt                       [k] hrtimer_interrupt                       123

The IPC computation has a few limitations (see the comments in the
respective patches); in particular, it punts on overlapping basic blocks.

The cycles/IPC annotation currently works only in the interactive annotate
view. It does not work in the scripted perf annotate, as that is missing
a lot of the infrastructure needed for per-instruction state.

It would be nice to add column headers to annotate.

So far there is no support in --branch-history or in perf script.



* [PATCH 01/11] perf, tools: Add tools support for cycles, weight branch_info field
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-06-01 14:16   ` Jiri Olsa
  2015-05-27 17:51 ` [PATCH 02/11] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

cycles is a new branch_info field available on some CPUs
that indicates the time deltas between branches in the LBR.

Add a sort key and output code for cycles, so that the basic block
cycles can be displayed individually in perf report.

We also pass the cycles in as the weight when LBRs are processed,
which makes the global and local weight usable as an estimate
of the total cost.

Also print the cycles information in perf report -D, together with
the previously missing LBR flags (mispredict etc.).

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/util/event.h                  |  3 ++-
 tools/perf/util/hist.c                   |  3 ++-
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/session.c                | 16 ++++++++++++----
 tools/perf/util/sort.c                   | 24 ++++++++++++++++++++++++
 tools/perf/util/sort.h                   |  1 +
 7 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index c33b69f..960da20 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -109,6 +109,7 @@ OPTIONS
 	- mispredict: "N" for predicted branch, "Y" for mispredicted branch
 	- in_tx: branch in TSX transaction
 	- abort: TSX transaction abort.
+	- cycles: Cycles in basic block
 
 	And default sort keys are changed to comm, dso_from, symbol_from, dso_to
 	and symbol_to, see '--branch-stack'.
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 97179ab..cb08dce 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -129,7 +129,8 @@ struct branch_flags {
 	u64 predicted:1;
 	u64 in_tx:1;
 	u64 abort:1;
-	u64 reserved:60;
+	u64 cycles:16;
+	u64 reserved:44;
 };
 
 struct branch_entry {
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 3387706..302fc05 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -623,7 +623,8 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
 	 * and not events sampled. Thus we use a pseudo period of 1.
 	 */
 	he = __hists__add_entry(hists, al, iter->parent, &bi[i], NULL,
-				1, 1, 0, true);
+				1, bi->flags.cycles ? bi->flags.cycles : 1,
+				0, true);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 9f31b89..b55c904 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -47,6 +47,7 @@ enum hist_column {
 	HISTC_MEM_SNOOP,
 	HISTC_MEM_DCACHELINE,
 	HISTC_TRANSACTION,
+	HISTC_CYCLES,
 	HISTC_NR_COLS, /* Last entry */
 };
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index e722107..484b974 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -738,10 +738,18 @@ static void branch_stack__printf(struct perf_sample *sample)
 
 	printf("... branch stack: nr:%" PRIu64 "\n", sample->branch_stack->nr);
 
-	for (i = 0; i < sample->branch_stack->nr; i++)
-		printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 "\n",
-			i, sample->branch_stack->entries[i].from,
-			sample->branch_stack->entries[i].to);
+	for (i = 0; i < sample->branch_stack->nr; i++) {
+		struct branch_entry *e = &sample->branch_stack->entries[i];
+
+		printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 " %hu cycles %s%s%s%s %x\n",
+			i, e->from, e->to,
+			e->flags.cycles,
+			e->flags.mispred ? "M" : " ",
+			e->flags.predicted ? "P" : " ",
+			e->flags.abort ? "A" : " ",
+			e->flags.in_tx ? "T" : " ",
+			(unsigned)e->flags.reserved);
+	}
 }
 
 static void regs_dump__printf(u64 mask, u64 *regs)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 09d4696..2a23d62 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -528,6 +528,29 @@ static int hist_entry__mispredict_snprintf(struct hist_entry *he, char *bf,
 	return repsep_snprintf(bf, size, "%-*.*s", width, width, out);
 }
 
+static int64_t
+sort__cycles_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return left->branch_info->flags.cycles -
+		right->branch_info->flags.cycles;
+}
+
+static int hist_entry__cycles_snprintf(struct hist_entry *he, char *bf,
+				    size_t size, unsigned int width)
+{
+	if (he->branch_info->flags.cycles == 0)
+		return repsep_snprintf(bf, size, "%-*s", width, "-");
+	return repsep_snprintf(bf, size, "%-*hd", width,
+			       he->branch_info->flags.cycles);
+}
+
+struct sort_entry sort_cycles = {
+	.se_header	= "Basic Block Cycles",
+	.se_cmp		= sort__cycles_cmp,
+	.se_snprintf	= hist_entry__cycles_snprintf,
+	.se_width_idx	= HISTC_CYCLES,
+};
+
 /* --sort daddr_sym */
 static int64_t
 sort__daddr_cmp(struct hist_entry *left, struct hist_entry *right)
@@ -1192,6 +1215,7 @@ static struct sort_dimension bstack_sort_dimensions[] = {
 	DIM(SORT_MISPREDICT, "mispredict", sort_mispredict),
 	DIM(SORT_IN_TX, "in_tx", sort_in_tx),
 	DIM(SORT_ABORT, "abort", sort_abort),
+	DIM(SORT_CYCLES, "cycles", sort_cycles),
 };
 
 #undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index e97cd47..bc6c87a 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -185,6 +185,7 @@ enum sort_type {
 	SORT_MISPREDICT,
 	SORT_ABORT,
 	SORT_IN_TX,
+	SORT_CYCLES,
 
 	/* memory mode specific sort keys */
 	__SORT_MEMORY_MODE,
-- 
2.1.0



* [PATCH 02/11] perf, tools, report: Add flag for non ANY branch mode
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
  2015-05-27 17:51 ` [PATCH 01/11] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-06-01 14:16   ` Jiri Olsa
  2015-05-27 17:51 ` [PATCH 03/11] perf, tools: Add symbol__get_annotation Andi Kleen
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Later patches need a cheap way to check whether the branch mode is ANY.
Add a new function that combines the branch sample types of all event
attrs, and add a flag to the report state that is initialized from it.
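
The check can be sketched in isolation as follows. This is only an
illustration under stated assumptions: the SKETCH_* bit values are
placeholders chosen for the example (perf takes the real ones from the
perf_event UAPI header), and both function names are hypothetical
stand-ins that operate on a plain array instead of an evlist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bits for the sketch, not the real UAPI definitions. */
#define SKETCH_BRANCH_ANY	(1ULL << 3)
#define SKETCH_BRANCH_ANY_CALL	(1ULL << 4)

/* OR together the branch sample type of every event. */
static uint64_t combined_branch_type(const uint64_t *types, size_t n)
{
	uint64_t combined = 0;

	assert(types != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		combined |= types[i];
	return combined;
}

/* True when no event was recorded with the ANY branch filter. */
static bool nonany_branch_mode(const uint64_t *types, size_t n)
{
	return !(combined_branch_type(types, n) & SKETCH_BRANCH_ANY);
}
```

Because the types are OR-ed, a session that mixes an ANY event with a
filtered one still counts as ANY, which is the kind of case the "???"
comment in the diff leaves open.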

v2: Rename flag
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-report.c |  7 +++++++
 tools/perf/util/evlist.c    | 10 ++++++++++
 tools/perf/util/evlist.h    |  1 +
 3 files changed, 18 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 92fca21..bb15896 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -53,6 +53,7 @@ struct report {
 	bool			mem_mode;
 	bool			header;
 	bool			header_only;
+	bool			nonany_branch_mode;
 	int			max_stack;
 	struct perf_read_values	show_threads_values;
 	const char		*pretty_printing_style;
@@ -257,6 +258,12 @@ static int report__setup_sample_type(struct report *rep)
 		else
 			callchain_param.record_mode = CALLCHAIN_FP;
 	}
+
+	/* ??? handle more cases than just ANY? */
+	if (!(perf_evlist__combined_branch_type(session->evlist) &
+				PERF_SAMPLE_BRANCH_ANY))
+		rep->nonany_branch_mode = true;
+
 	return 0;
 }
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dc1dc2c..f607141 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1216,6 +1216,16 @@ u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist)
 	return __perf_evlist__combined_sample_type(evlist);
 }
 
+u64 perf_evlist__combined_branch_type(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	u64 branch_type = 0;
+
+	evlist__for_each(evlist, evsel)
+		branch_type |= evsel->attr.branch_sample_type;
+	return branch_type;
+}
+
 bool perf_evlist__valid_read_format(struct perf_evlist *evlist)
 {
 	struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 955bf31..8c47581 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -167,6 +167,7 @@ void perf_evlist__set_leader(struct perf_evlist *evlist);
 u64 perf_evlist__read_format(struct perf_evlist *evlist);
 u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist);
 u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist);
+u64 perf_evlist__combined_branch_type(struct perf_evlist *evlist);
 bool perf_evlist__sample_id_all(struct perf_evlist *evlist);
 u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);
 
-- 
2.1.0



* [PATCH 03/11] perf, tools: Add symbol__get_annotation
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
  2015-05-27 17:51 ` [PATCH 01/11] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
  2015-05-27 17:51 ` [PATCH 02/11] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-05-28  9:32   ` [tip:perf/core] perf annotation: " tip-bot for Andi Kleen
  2015-06-01 14:17   ` [PATCH 03/11] perf, tools: " Jiri Olsa
  2015-05-27 17:51 ` [PATCH 04/11] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add a new utility function to get a function's annotation, allocating
its histogram on first use.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/annotate.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 7f5bdfc..bf80430 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -506,6 +506,17 @@ static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 	return 0;
 }
 
+static struct annotation *symbol__get_annotation(struct symbol *sym)
+{
+	struct annotation *notes = symbol__annotation(sym);
+
+	if (notes->src == NULL) {
+		if (symbol__alloc_hist(sym) < 0)
+			return NULL;
+	}
+	return notes;
+}
+
 static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 				    int evidx, u64 addr)
 {
@@ -513,13 +524,9 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 
 	if (sym == NULL)
 		return 0;
-
-	notes = symbol__annotation(sym);
-	if (notes->src == NULL) {
-		if (symbol__alloc_hist(sym) < 0)
-			return -ENOMEM;
-	}
-
+	notes = symbol__get_annotation(sym);
+	if (notes == NULL)
+		return -ENOMEM;
 	return __symbol__inc_addr_samples(sym, map, notes, evidx, addr);
 }
 
-- 
2.1.0



* [PATCH 04/11] perf, tools, report: Add infrastructure for a cycles histogram
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (2 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 03/11] perf, tools: Add symbol__get_annotation Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-06-01 14:19   ` Jiri Olsa
  2015-05-27 17:51 ` [PATCH 05/11] perf, tools, report: Add processing for cycle histograms Andi Kleen
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

This adds the basic infrastructure to keep track of cycle counts
per basic block for annotate. We allocate an array similar to the
normal accounting, and then account branch cycles there.

We handle two cases: cycles per basic block with a known start, and
cycles per branch. (These are later used for IPC and for plain
cycles per BB, respectively.)

In the case with a known start we cannot handle overlaps, so the longest
basic block always wins.

For the cycles per branch case everything is accurately accounted.
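
To make the "longest wins" rule concrete, here is a standalone sketch
of the accounting for a single branch offset. It restates the logic of
__symbol__account_cycles() from this patch on one slot instead of an
array; sketch_account() is a hypothetical name, and uint64_t and friends
stand in for the kernel's u64/u32/u16/u8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as struct cyc_hist in the patch. */
struct cyc_hist {
	uint64_t start;		/* start offset of the tracked block */
	uint64_t cycles;	/* cycles of the tracked (longest) block */
	uint64_t cycles_aggr;	/* cycles of all branches seen here */
	uint32_t num;		/* samples of the tracked block */
	uint32_t num_aggr;	/* all samples seen here */
	uint8_t  have_start;
	uint16_t reset;		/* how often a longer block took over */
};

static void sketch_account(struct cyc_hist *ch, uint64_t start,
			   unsigned cycles, unsigned have_start)
{
	assert(ch != NULL);

	/* The per-branch totals are always accounted. */
	ch->num_aggr++;
	ch->cycles_aggr += cycles;

	/* A sample without a start never replaces one with a start. */
	if (!have_start && ch->have_start)
		return;
	if (ch->num) {
		if (have_start && (!ch->have_start || ch->start > start)) {
			/* Longer block (earlier start): drop the old data. */
			ch->have_start = 0;
			ch->cycles = 0;
			ch->num = 0;
			if (ch->reset < 0xffff)
				ch->reset++;
		} else if (have_start && ch->start < start) {
			/* Shorter block: ignore it. */
			return;
		}
	}
	ch->have_start = have_start;
	ch->start = start;
	ch->cycles += cycles;
	ch->num++;
}
```

A block that starts at a lower offset and ends at the same branch is the
longer one, so it replaces any shorter block seen earlier, while the
aggregate counters keep every sample.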

v2: Remove unnecessary checks. Slight restructure. Move
symbol__get_annotation to another patch. Move histogram allocation.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-annotate.c |   1 +
 tools/perf/util/annotate.c    | 127 +++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/annotate.h    |  17 ++++++
 3 files changed, 142 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index b57a027..f530050 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -183,6 +183,7 @@ find_next:
 			 * symbol, free he->ms.sym->src to signal we already
 			 * processed this symbol.
 			 */
+			zfree(&notes->src->cycles_hist);
 			zfree(&notes->src);
 		}
 	}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index bf80430..97637de 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -473,17 +473,73 @@ int symbol__alloc_hist(struct symbol *sym)
 	return 0;
 }
 
+/* The cycles histogram is lazily allocated. */
+static int symbol__alloc_hist_cycles(struct symbol *sym)
+{
+	struct annotation *notes = symbol__annotation(sym);
+	const size_t size = symbol__size(sym);
+
+	notes->src->cycles_hist = calloc(size, sizeof(struct cyc_hist));
+	if (notes->src->cycles_hist == NULL)
+		return -1;
+	return 0;
+}
+
 void symbol__annotate_zero_histograms(struct symbol *sym)
 {
 	struct annotation *notes = symbol__annotation(sym);
 
 	pthread_mutex_lock(&notes->lock);
-	if (notes->src != NULL)
+	if (notes->src != NULL) {
 		memset(notes->src->histograms, 0,
 		       notes->src->nr_histograms * notes->src->sizeof_sym_hist);
+		if (notes->src->cycles_hist)
+			memset(notes->src->cycles_hist, 0,
+				symbol__size(sym) * sizeof(struct cyc_hist));
+	}
 	pthread_mutex_unlock(&notes->lock);
 }
 
+static int __symbol__account_cycles(struct annotation *notes,
+				    u64 start,
+				    unsigned offset, unsigned cycles,
+				    unsigned have_start)
+{
+	struct cyc_hist *ch;
+
+	ch = notes->src->cycles_hist;
+	/*
+	 * For now we can only account one basic block per
+	 * final jump. But multiple could be overlapping.
+	 * Always account the longest one. So when
+	 * a shorter one has been already seen throw it away.
+	 *
+	 * We separately always account the full cycles.
+	 */
+	ch[offset].num_aggr++;
+	ch[offset].cycles_aggr += cycles;
+
+	if (!have_start && ch[offset].have_start)
+		return 0;
+	if (ch[offset].num) {
+		if (have_start && (!ch[offset].have_start ||
+				   ch[offset].start > start)) {
+			ch[offset].have_start = 0;
+			ch[offset].cycles = 0;
+			ch[offset].num = 0;
+			if (ch[offset].reset < 0xffff)
+				ch[offset].reset++;
+		} else if (have_start &&
+			   ch[offset].start < start)
+			return 0;
+	}
+	ch[offset].have_start = have_start;
+	ch[offset].start = start;
+	ch[offset].cycles += cycles;
+	ch[offset].num++;
+	return 0;
+}
+
 static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 				      struct annotation *notes, int evidx, u64 addr)
 {
@@ -506,7 +562,7 @@ static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 	return 0;
 }
 
-static struct annotation *symbol__get_annotation(struct symbol *sym)
+static struct annotation *symbol__get_annotation(struct symbol *sym, bool cycles)
 {
 	struct annotation *notes = symbol__annotation(sym);
 
@@ -514,6 +570,10 @@ static struct annotation *symbol__get_annotation(struct symbol *sym)
 		if (symbol__alloc_hist(sym) < 0)
 			return NULL;
 	}
+	if (!notes->src->cycles_hist && cycles) {
+		if (symbol__alloc_hist_cycles(sym) < 0)
+			return NULL;
+	}
 	return notes;
 }
 
@@ -524,12 +584,73 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 
 	if (sym == NULL)
 		return 0;
-	notes = symbol__get_annotation(sym);
+	notes = symbol__get_annotation(sym, false);
 	if (notes == NULL)
 		return -ENOMEM;
 	return __symbol__inc_addr_samples(sym, map, notes, evidx, addr);
 }
 
+static int symbol__account_cycles(u64 addr, u64 start,
+				  struct symbol *sym, unsigned cycles)
+{
+	struct annotation *notes;
+	unsigned offset;
+
+	if (sym == NULL)
+		return 0;
+	notes = symbol__get_annotation(sym, true);
+	if (notes == NULL)
+		return -ENOMEM;
+	if (addr < sym->start || addr >= sym->end)
+		return -ERANGE;
+
+	if (start) {
+		if (start < sym->start || start >= sym->end)
+			return -ERANGE;
+		if (start >= addr)
+			start = 0;
+	}
+	offset = addr - sym->start;
+	return __symbol__account_cycles(notes,
+					start ? start - sym->start : 0,
+					offset, cycles,
+					!!start);
+}
+
+int addr_map_symbol__account_cycles(struct addr_map_symbol *ams,
+				    struct addr_map_symbol *start,
+				    unsigned cycles)
+{
+	unsigned long saddr = 0;
+	int err;
+
+	if (!cycles)
+		return 0;
+
+	/*
+	 * Only set start when IPC can be computed. We can only
+	 * compute it when the basic block is completely in a single
+	 * function.
+	 * Special case the case when the jump is elsewhere, but
+	 * it starts on the function start.
+	 */
+	if (start &&
+		(start->sym == ams->sym ||
+		 (ams->sym &&
+		   start->addr == ams->sym->start + ams->map->start)))
+		saddr = start->al_addr;
+	if (saddr == 0)
+		pr_debug2("BB with bad start: addr %lx start %lx sym %lx saddr %lx\n",
+			ams->addr,
+			start ? start->addr : 0,
+			ams->sym ? ams->sym->start + ams->map->start : 0,
+			saddr);
+	err = symbol__account_cycles(ams->al_addr, saddr, ams->sym, cycles);
+	if (err)
+		pr_debug2("account_cycles failed %d\n", err);
+	return err;
+}
+
 int addr_map_symbol__inc_samples(struct addr_map_symbol *ams, int evidx)
 {
 	return symbol__inc_addr_samples(ams->sym, ams->map, evidx, ams->al_addr);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index cadbdc9..9080181 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -79,6 +79,17 @@ struct sym_hist {
 	u64		addr[0];
 };
 
+struct cyc_hist {
+	u64	start;
+	u64	cycles;
+	u64	cycles_aggr;
+	u32	num;
+	u32	num_aggr;
+	u8	have_start;
+	/* 1 byte padding */
+	u16	reset;
+};
+
 struct source_line_percent {
 	double		percent;
 	double		percent_sum;
@@ -96,6 +107,7 @@ struct source_line {
  * @histogram: Array of addr hit histograms per event being monitored
  * @lines: If 'print_lines' is specified, per source code line percentages
  * @source: source parsed from a disassembler like objdump -dS
+ * @cycles_hist: Average cycles per basic block
  *
  * lines is allocated, percentages calculated and all sorted by percentage
  * when the annotation is about to be presented, so the percentages are for
@@ -108,6 +120,7 @@ struct annotated_source {
 	struct source_line *lines;
 	int    		   nr_histograms;
 	int    		   sizeof_sym_hist;
+	struct cyc_hist	   *cycles_hist;
 	struct sym_hist	   histograms[0];
 };
 
@@ -129,6 +142,10 @@ static inline struct annotation *symbol__annotation(struct symbol *sym)
 
 int addr_map_symbol__inc_samples(struct addr_map_symbol *ams, int evidx);
 
+int addr_map_symbol__account_cycles(struct addr_map_symbol *ams,
+				    struct addr_map_symbol *start,
+				    unsigned cycles);
+
 int hist_entry__inc_addr_samples(struct hist_entry *he, int evidx, u64 addr);
 
 int symbol__alloc_hist(struct symbol *sym);
-- 
2.1.0



* [PATCH 05/11] perf, tools, report: Add processing for cycle histograms
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (3 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 04/11] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-06-01 14:10   ` Jiri Olsa
  2015-05-27 17:51 ` [PATCH 06/11] perf, tools: Compute IPC and basic block cycles for annotate Andi Kleen
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Call the earlier added cycle histogram infrastructure from the perf report
hist iter callback. For this we walk the branch records.
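
The walk can be sketched on its own as below. The types and the
function name are hypothetical simplifications (plain addresses instead
of addr_map_symbol); the points being illustrated are that the records
are stored newest first, so they are walked backwards, and that the
target of the previous branch becomes the start of the next basic block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for perf's branch entry and an accounted block. */
struct sketch_branch {
	uint64_t from;		/* address of the branch instruction */
	uint64_t to;		/* branch target */
	unsigned cycles;	/* time delta to the previous branch */
};

struct sketch_block {
	uint64_t start;		/* valid only if have_start is set */
	uint64_t end;		/* address of the branch ending the block */
	unsigned cycles;
	int have_start;
};

/*
 * br[] is ordered newest first. out[] needs room for nr entries.
 * Returns the number of blocks written, in program order.
 */
static size_t branches_to_blocks(const struct sketch_branch *br, size_t nr,
				 struct sketch_block *out,
				 int nonany_branch_mode)
{
	const struct sketch_branch *prev = NULL;
	size_t n = 0;

	assert(br != NULL && out != NULL);
	for (size_t i = nr; i-- > 0; ) {
		/* Entries without a cycle count are not accounted. */
		if (br[i].cycles) {
			/* Filtered branch modes never get a start. */
			int have_start = prev && !nonany_branch_mode;

			out[n].end = br[i].from;
			out[n].start = have_start ? prev->to : 0;
			out[n].have_start = have_start;
			out[n].cycles = br[i].cycles;
			n++;
		}
		prev = &br[i];
	}
	return n;
}
```

The oldest record has no predecessor, so its block has no known start
and can contribute cycles per branch only.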

This makes cycle histograms available when browsing annotations from
perf report.

v2: Rename flag
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-report.c |  4 +++-
 tools/perf/util/hist.c      | 33 +++++++++++++++++++++++++++++++++
 tools/perf/util/hist.h      |  3 +++
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index bb15896..b2b1232 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -103,6 +103,9 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
 	if (!ui__has_annotation())
 		return 0;
 
+	hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
+			     rep->nonany_branch_mode);
+
 	if (sort__mode == SORT_MODE__BRANCH) {
 		bi = he->branch_info;
 		err = addr_map_symbol__inc_samples(&bi->from, evsel->idx);
@@ -110,7 +113,6 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
 			goto out;
 
 		err = addr_map_symbol__inc_samples(&bi->to, evsel->idx);
-
 	} else if (rep->mem_mode) {
 		mi = he->mem_info;
 		err = addr_map_symbol__inc_samples(&mi->daddr, evsel->idx);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 302fc05..6f1e411 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1412,6 +1412,39 @@ int hists__link(struct hists *leader, struct hists *other)
 	return 0;
 }
 
+void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
+			  struct perf_sample *sample, bool nonany_branch_mode)
+{
+	struct branch_info *bi;
+
+	/* If we have branch cycles always annotate them. */
+	if (bs && bs->nr && bs->entries[0].flags.cycles) {
+		int i;
+
+		bi = sample__resolve_bstack(sample, al);
+		if (bi) {
+			struct addr_map_symbol *prev = NULL;
+
+			/*
+			 * Ignore errors, still want to process the
+			 * other entries.
+			 *
+			 * For non standard branch modes always
+			 * force no IPC (prev == NULL)
+			 *
+			 * Note that perf stores branches reversed from
+			 * program order!
+			 */
+			for (i = bs->nr - 1; i >= 0; i--) {
+				addr_map_symbol__account_cycles(&bi[i].from,
+					nonany_branch_mode ? NULL : prev,
+					bi[i].flags.cycles);
+				prev = &bi[i].to;
+			}
+			free(bi);
+		}
+	}
+}
 
 size_t perf_evlist__fprintf_nr_events(struct perf_evlist *evlist, FILE *fp)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index b55c904..77e5739 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -351,6 +351,9 @@ static inline int script_browse(const char *script_opt __maybe_unused)
 
 unsigned int hists__sort_list_width(struct hists *hists);
 
+void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
+			  struct perf_sample *sample, bool nonany_branch_mode);
+
 struct option;
 int parse_filter_percentage(const struct option *opt __maybe_unused,
 			    const char *arg, int unset __maybe_unused);
-- 
2.1.0



* [PATCH 06/11] perf, tools: Compute IPC and basic block cycles for annotate
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (4 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 05/11] perf, tools, report: Add processing for cycle histograms Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-05-27 17:51 ` [PATCH 07/11] perf, tools, annotate: Finally display IPC and cycle accounting Andi Kleen
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Compute the IPC and the basic block cycles for the annotate display.

IPC is computed by counting the instructions in the basic block, and
then dividing that count by the average cycles accounted to the block.

The actual IPC computation can only be done at annotate time,
because we need to parse the objdump output first to know
the number of instructions in the basic block.

The cycles/IPC are also put into the perf function annotation
so that the display code can show them.

Again, basic block overlaps are not handled: the longest block wins,
but there are some heuristics to hide the IPC when the longest is not
the most common.
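
The computation described above can be sketched in isolation as follows. This is a minimal illustration, not the patch's code: struct block_cycles and block_ipc() are hypothetical names, a simplified stand-in for the cyc_hist handling in count_and_fill(), and the meaning given to the reset field is an assumption based on the "too many overlaps" comment in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-offset cycle histogram. */
struct block_cycles {
	uint64_t cycles; /* cycles summed over all samples of the block */
	uint32_t num;    /* number of samples that contributed to cycles */
	uint32_t reset;  /* overlap counter (assumed meaning) */
};

/*
 * Store the IPC of a basic block in *ipc and return true when the data
 * is usable. n_insn is the instruction count of the block, which is
 * only known once the disassembly has been parsed.
 */
bool block_ipc(const struct block_cycles *bc, unsigned n_insn, double *ipc)
{
	assert(bc && ipc);

	if (!n_insn || !bc->num || !bc->cycles)
		return false;

	/* Hide the value when there were too many overlapping blocks. */
	if (bc->reset >= 0x7fff || bc->reset >= bc->num / 2)
		return false;

	/* instructions / average cycles per execution of the block */
	*ipc = n_insn / ((double)bc->cycles / (double)bc->num);
	return true;
}
```

For example, 50 instructions in a block that accumulated 400 cycles over 4 samples gives 50 / (400 / 4) = 0.5. Note that with this threshold a block seen only once is always hidden, because num / 2 is then 0.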

v2: Compute IPC correctly.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/ui/browsers/annotate.c | 73 ++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/annotate.h        |  2 ++
 2 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index e5250eb..441e074 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -47,6 +47,7 @@ struct annotate_browser {
 	int		    max_jump_sources;
 	int		    nr_jumps;
 	bool		    searching_backwards;
+	bool		    have_cycles;
 	u8		    addr_width;
 	u8		    jumps_width;
 	u8		    target_width;
@@ -376,7 +377,7 @@ static void annotate_browser__calc_percent(struct annotate_browser *browser,
 				max_percent = bpos->percent[i];
 		}
 
-		if (max_percent < 0.01) {
+		if (max_percent < 0.01 && pos->ipc == 0) {
 			RB_CLEAR_NODE(&bpos->rb_node);
 			continue;
 		}
@@ -841,6 +842,75 @@ int hist_entry__tui_annotate(struct hist_entry *he, struct perf_evsel *evsel,
 	return map_symbol__tui_annotate(&he->ms, evsel, hbt);
 }
 
+
+static unsigned count_insn(struct annotate_browser *browser, u64 start, u64 end)
+{
+	unsigned n_insn = 0;
+	u64 offset;
+
+	for (offset = start; offset <= end; offset++) {
+		if (browser->offsets[offset])
+			n_insn++;
+	}
+	return n_insn;
+}
+
+static void count_and_fill(struct annotate_browser *browser, u64 start, u64 end,
+			   struct cyc_hist *ch)
+{
+	unsigned n_insn;
+	u64 offset;
+
+	n_insn = count_insn(browser, start, end);
+	if (n_insn && ch->num && ch->cycles) {
+		float ipc = n_insn / ((double)ch->cycles / (double)ch->num);
+
+		/* Hide data when there are too many overlaps. */
+		if (ch->reset >= 0x7fff || ch->reset >= ch->num / 2)
+			return;
+
+		for (offset = start; offset <= end; offset++) {
+			struct disasm_line *dl = browser->offsets[offset];
+
+			if (dl)
+				dl->ipc = ipc;
+		}
+	}
+}
+
+/*
+ * This should probably be in util/annotate.c to share with the tty
+ * annotate, but right now we need the per byte offsets arrays,
+ * which are only here.
+ */
+static void annotate__compute_ipc(struct annotate_browser *browser, size_t size,
+			   struct symbol *sym)
+{
+	u64 offset;
+	struct annotation *notes = symbol__annotation(sym);
+
+	if (!notes->src || !notes->src->cycles_hist)
+		return;
+
+	pthread_mutex_lock(&notes->lock);
+	for (offset = 0; offset < size; ++offset) {
+		struct cyc_hist *ch;
+
+		ch = &notes->src->cycles_hist[offset];
+		if (ch && ch->cycles) {
+			struct disasm_line *dl;
+
+			if (ch->have_start)
+				count_and_fill(browser, ch->start, offset, ch);
+			dl = browser->offsets[offset];
+			if (dl && ch->num_aggr)
+				dl->cycles = ch->cycles_aggr / ch->num_aggr;
+			browser->have_cycles = true;
+		}
+	}
+	pthread_mutex_unlock(&notes->lock);
+}
+
 static void annotate_browser__mark_jump_targets(struct annotate_browser *browser,
 						size_t size)
 {
@@ -962,6 +1032,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
 	}
 
 	annotate_browser__mark_jump_targets(&browser, size);
+	annotate__compute_ipc(&browser, size, sym);
 
 	browser.addr_width = browser.target_width = browser.min_addr_width = hex_width(size);
 	browser.max_addr_width = hex_width(sym->end);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 9080181..be6bafa 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -59,6 +59,8 @@ struct disasm_line {
 	char		    *name;
 	struct ins	    *ins;
 	int		    line_nr;
+	float		    ipc;
+	u64		    cycles;
 	struct ins_operands ops;
 };
 
-- 
2.1.0



* [PATCH 07/11] perf, tools, annotate: Finally display IPC and cycle accounting
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (5 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 06/11] perf, tools: Compute IPC and basic block cycles for annotate Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-05-27 17:51 ` [PATCH 08/11] perf, tools, report: Move branch option parsing to own file Andi Kleen
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC (permalink / raw)
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add two new columns to the annotate display and display the average cycles
and the computed IPC if available.

When the LBR was not recording in "any" branch mode, the IPC
computation is automatically disabled. We still display
the cycle information.

Example output (with made up numbers):

The second column is the IPC and the third is the average cycles.

                 │    __attribute__((noinline)) f2()
                 │    {
  5.15  0.07     │       push   %rbp
  0.01  0.07     │       mov    %rsp,%rbp
                 │            c = a / b;
  9.87  0.07     │       mov    a,%eax
        0.07     │       mov    b,%ecx
        0.07     │       cltd
  4.92  0.07  123│       idiv   %ecx
 70.79  0.07     │       mov    %eax,__TMC_END__
                 │    }
  9.25  0.07     │       pop    %rbp
  0.01  0.07  123│     ← retq
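
The column layout shown above can be sketched with plain snprintf() instead of the slang calls the browser uses. This is an illustration under assumptions: pcnt_width() mirrors the helper added in the diff, while format_cycle_columns() and PCNT_WIDTH are hypothetical names introduced here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define PCNT_WIDTH   7 /* one "%6.2f " column per event */
#define IPC_WIDTH    6
#define CYCLES_WIDTH 6

/* Total width of the numeric columns left of the disassembly. */
int pcnt_width(int nr_events, int have_cycles)
{
	int w = PCNT_WIDTH * nr_events;

	if (have_cycles)
		w += IPC_WIDTH + CYCLES_WIDTH;
	return w;
}

/*
 * Format the IPC and cycles columns into buf. A zero value is shown
 * as a blank column. Returns the value of the final snprintf() call.
 */
int format_cycle_columns(char *buf, size_t len, double ipc, uint64_t cycles)
{
	char ipc_col[32], cyc_col[32];

	assert(buf && len > 0);

	if (ipc)
		snprintf(ipc_col, sizeof(ipc_col), "%*.2f ", IPC_WIDTH - 1, ipc);
	else
		snprintf(ipc_col, sizeof(ipc_col), "%*s", IPC_WIDTH, "");

	if (cycles)
		snprintf(cyc_col, sizeof(cyc_col), "%*" PRIu64 " ",
			 CYCLES_WIDTH - 1, cycles);
	else
		snprintf(cyc_col, sizeof(cyc_col), "%*s", CYCLES_WIDTH, "");

	return snprintf(buf, len, "%s%s", ipc_col, cyc_col);
}
```

Each column is its value right-aligned in width minus one, followed by one separating space, so values that fit the field always produce IPC_WIDTH + CYCLES_WIDTH characters.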

v2: Fix display problems.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/ui/browsers/annotate.c | 48 +++++++++++++++++++++++++++++----------
 1 file changed, 36 insertions(+), 12 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 441e074..9f69c82 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -11,6 +11,9 @@
 #include "../../util/evsel.h"
 #include <pthread.h>
 
+#define IPC_WIDTH 6
+#define CYCLES_WIDTH 6
+
 struct browser_disasm_line {
 	struct rb_node	rb_node;
 	u32		idx;
@@ -91,6 +94,15 @@ static int annotate_browser__set_jumps_percent_color(struct annotate_browser *br
 	 return ui_browser__set_color(&browser->b, color);
 }
 
+static int annotate_browser__pcnt_width(struct annotate_browser *ab)
+{
+	int w = 7 * ab->nr_events;
+
+	if (ab->have_cycles)
+		w += IPC_WIDTH + CYCLES_WIDTH;
+	return w;
+}
+
 static void annotate_browser__write(struct ui_browser *browser, void *entry, int row)
 {
 	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
@@ -101,7 +113,7 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 			     (!current_entry || (browser->use_navkeypressed &&
 					         !browser->navkeypressed)));
 	int width = browser->width, printed;
-	int i, pcnt_width = 7 * ab->nr_events;
+	int i, pcnt_width = annotate_browser__pcnt_width(ab);
 	double percent_max = 0.0;
 	char bf[256];
 
@@ -111,14 +123,30 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 	}
 
 	if (dl->offset != -1 && percent_max != 0.0) {
-		for (i = 0; i < ab->nr_events; i++) {
-			ui_browser__set_percent_color(browser, bdl->percent[i],
-						      current_entry);
-			slsmg_printf("%6.2f ", bdl->percent[i]);
+		if (percent_max != 0.0) {
+			for (i = 0; i < ab->nr_events; i++) {
+				ui_browser__set_percent_color(browser,
+							      bdl->percent[i],
+							      current_entry);
+				slsmg_printf("%6.2f ", bdl->percent[i]);
+			}
+		} else {
+			slsmg_write_nstring(" ", 7 * ab->nr_events);
 		}
 	} else {
 		ui_browser__set_percent_color(browser, 0, current_entry);
-		slsmg_write_nstring(" ", pcnt_width);
+		slsmg_write_nstring(" ", 7 * ab->nr_events);
+	}
+	if (ab->have_cycles) {
+		if (dl->ipc)
+			slsmg_printf("%*.2f ", IPC_WIDTH - 1, dl->ipc);
+		else
+			slsmg_write_nstring(" ", IPC_WIDTH);
+		if (dl->cycles)
+			slsmg_printf("%*" PRIu64 " ",
+				     CYCLES_WIDTH - 1, dl->cycles);
+		else
+			slsmg_write_nstring(" ", CYCLES_WIDTH);
 	}
 
 	SLsmg_write_char(' ');
@@ -221,7 +249,7 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 	unsigned int from, to;
 	struct map_symbol *ms = ab->b.priv;
 	struct symbol *sym = ms->sym;
-	u8 pcnt_width = 7;
+	u8 pcnt_width = annotate_browser__pcnt_width(ab);
 
 	/* PLT symbols contain external offsets */
 	if (strstr(sym->name, "@plt"))
@@ -245,8 +273,6 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 		to = (u64)btarget->idx;
 	}
 
-	pcnt_width *= ab->nr_events;
-
 	ui_browser__set_color(browser, HE_COLORSET_CODE);
 	__ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width,
 				 from, to);
@@ -256,9 +282,7 @@ static unsigned int annotate_browser__refresh(struct ui_browser *browser)
 {
 	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
 	int ret = ui_browser__list_head_refresh(browser);
-	int pcnt_width;
-
-	pcnt_width = 7 * ab->nr_events;
+	int pcnt_width = annotate_browser__pcnt_width(ab);
 
 	if (annotate_browser__opts.jump_arrows)
 		annotate_browser__draw_current_jump(browser);
-- 
2.1.0



* [PATCH 08/11] perf, tools, report: Move branch option parsing to own file
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (6 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 07/11] perf, tools, annotate: Finally display IPC and cycle accounting Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-05-28  9:32   ` [tip:perf/core] perf tools: " tip-bot for Andi Kleen
  2015-06-01 14:20   ` [PATCH 08/11] perf, tools, report: " Jiri Olsa
  2015-05-27 17:51 ` [PATCH 09/11] perf, tools, top: Add branch annotation code to top Andi Kleen
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC (permalink / raw)
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

.. to allow sharing between builtin-record and builtin-top later.
No code changes, just moved code.

v2: Add header
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-record.c | 89 +------------------------------------------
 tools/perf/util/Build       |  1 +
 tools/perf/util/branch.c    | 93 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/branch.h    |  5 +++
 4 files changed, 100 insertions(+), 88 deletions(-)
 create mode 100644 tools/perf/util/branch.c
 create mode 100644 tools/perf/util/branch.h

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5dfe913..c513620 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -28,6 +28,7 @@
 #include "util/thread_map.h"
 #include "util/data.h"
 #include "util/auxtrace.h"
+#include "util/branch.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -751,94 +752,6 @@ out_delete_session:
 	return status;
 }
 
-#define BRANCH_OPT(n, m) \
-	{ .name = n, .mode = (m) }
-
-#define BRANCH_END { .name = NULL }
-
-struct branch_mode {
-	const char *name;
-	int mode;
-};
-
-static const struct branch_mode branch_modes[] = {
-	BRANCH_OPT("u", PERF_SAMPLE_BRANCH_USER),
-	BRANCH_OPT("k", PERF_SAMPLE_BRANCH_KERNEL),
-	BRANCH_OPT("hv", PERF_SAMPLE_BRANCH_HV),
-	BRANCH_OPT("any", PERF_SAMPLE_BRANCH_ANY),
-	BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
-	BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
-	BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
-	BRANCH_OPT("abort_tx", PERF_SAMPLE_BRANCH_ABORT_TX),
-	BRANCH_OPT("in_tx", PERF_SAMPLE_BRANCH_IN_TX),
-	BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
-	BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
-	BRANCH_END
-};
-
-static int
-parse_branch_stack(const struct option *opt, const char *str, int unset)
-{
-#define ONLY_PLM \
-	(PERF_SAMPLE_BRANCH_USER	|\
-	 PERF_SAMPLE_BRANCH_KERNEL	|\
-	 PERF_SAMPLE_BRANCH_HV)
-
-	uint64_t *mode = (uint64_t *)opt->value;
-	const struct branch_mode *br;
-	char *s, *os = NULL, *p;
-	int ret = -1;
-
-	if (unset)
-		return 0;
-
-	/*
-	 * cannot set it twice, -b + --branch-filter for instance
-	 */
-	if (*mode)
-		return -1;
-
-	/* str may be NULL in case no arg is passed to -b */
-	if (str) {
-		/* because str is read-only */
-		s = os = strdup(str);
-		if (!s)
-			return -1;
-
-		for (;;) {
-			p = strchr(s, ',');
-			if (p)
-				*p = '\0';
-
-			for (br = branch_modes; br->name; br++) {
-				if (!strcasecmp(s, br->name))
-					break;
-			}
-			if (!br->name) {
-				ui__warning("unknown branch filter %s,"
-					    " check man page\n", s);
-				goto error;
-			}
-
-			*mode |= br->mode;
-
-			if (!p)
-				break;
-
-			s = p + 1;
-		}
-	}
-	ret = 0;
-
-	/* default to any branch */
-	if ((*mode & ~ONLY_PLM) == 0) {
-		*mode = PERF_SAMPLE_BRANCH_ANY;
-	}
-error:
-	free(os);
-	return ret;
-}
-
 static void callchain_debug(void)
 {
 	static const char *str[CALLCHAIN_MAX] = { "NONE", "FP", "DWARF", "LBR" };
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 6966d07..486e77e 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -75,6 +75,7 @@ libperf-$(CONFIG_X86) += tsc.o
 libperf-y += cloexec.o
 libperf-y += thread-stack.o
 libperf-$(CONFIG_AUXTRACE) += auxtrace.o
+libperf-y += branch.o
 
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
 libperf-$(CONFIG_LIBELF) += probe-event.o
diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
new file mode 100644
index 0000000..1555064
--- /dev/null
+++ b/tools/perf/util/branch.c
@@ -0,0 +1,93 @@
+#include "perf.h"
+#include "util/util.h"
+#include "util/debug.h"
+#include "util/parse-options.h"
+#include "util/branch.h"
+
+#define BRANCH_OPT(n, m) \
+	{ .name = n, .mode = (m) }
+
+#define BRANCH_END { .name = NULL }
+
+struct branch_mode {
+	const char *name;
+	int mode;
+};
+
+static const struct branch_mode branch_modes[] = {
+	BRANCH_OPT("u", PERF_SAMPLE_BRANCH_USER),
+	BRANCH_OPT("k", PERF_SAMPLE_BRANCH_KERNEL),
+	BRANCH_OPT("hv", PERF_SAMPLE_BRANCH_HV),
+	BRANCH_OPT("any", PERF_SAMPLE_BRANCH_ANY),
+	BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
+	BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
+	BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+	BRANCH_OPT("abort_tx", PERF_SAMPLE_BRANCH_ABORT_TX),
+	BRANCH_OPT("in_tx", PERF_SAMPLE_BRANCH_IN_TX),
+	BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
+	BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
+	BRANCH_END
+};
+
+int
+parse_branch_stack(const struct option *opt, const char *str, int unset)
+{
+#define ONLY_PLM \
+	(PERF_SAMPLE_BRANCH_USER	|\
+	 PERF_SAMPLE_BRANCH_KERNEL	|\
+	 PERF_SAMPLE_BRANCH_HV)
+
+	uint64_t *mode = (uint64_t *)opt->value;
+	const struct branch_mode *br;
+	char *s, *os = NULL, *p;
+	int ret = -1;
+
+	if (unset)
+		return 0;
+
+	/*
+	 * cannot set it twice, -b + --branch-filter for instance
+	 */
+	if (*mode)
+		return -1;
+
+	/* str may be NULL in case no arg is passed to -b */
+	if (str) {
+		/* because str is read-only */
+		s = os = strdup(str);
+		if (!s)
+			return -1;
+
+		for (;;) {
+			p = strchr(s, ',');
+			if (p)
+				*p = '\0';
+
+			for (br = branch_modes; br->name; br++) {
+				if (!strcasecmp(s, br->name))
+					break;
+			}
+			if (!br->name) {
+				ui__warning("unknown branch filter %s,"
+					    " check man page\n", s);
+				goto error;
+			}
+
+			*mode |= br->mode;
+
+			if (!p)
+				break;
+
+			s = p + 1;
+		}
+	}
+	ret = 0;
+
+	/* default to any branch */
+	if ((*mode & ~ONLY_PLM) == 0) {
+		*mode = PERF_SAMPLE_BRANCH_ANY;
+	}
+error:
+	free(os);
+	return ret;
+}
diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
new file mode 100644
index 0000000..78a9be4
--- /dev/null
+++ b/tools/perf/util/branch.h
@@ -0,0 +1,5 @@
+#ifndef _BRANCH_H
+#define _BRANCH_H 1
+struct option;
+int parse_branch_stack(const struct option *opt, const char *str, int unset);
+#endif
-- 
2.1.0



* [PATCH 09/11] perf, tools, top: Add branch annotation code to top
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (7 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 08/11] perf, tools, report: Move branch option parsing to own file Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-05-27 17:51 ` [PATCH 10/11] perf, tools, report: Display cycles in branch sort mode Andi Kleen
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC (permalink / raw)
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Now that we can process branch data in annotate it makes sense to support
enabling branch recording from top too. Most of the code needed for
this is already in shared code with report. But we need to add:

- The option parsing code (using shared code from the previous patch)
- Document the options
- Set up the IPC/cycles accounting state in the top session
- Call the accounting code in the hist iter callback
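
The shared option parsing that top now reuses can be sketched on its own like this. It is a simplified illustration rather than the perf code: parse_filter(), the BR_* names and their bit values are placeholders invented here (the real parser uses the PERF_SAMPLE_BRANCH_* constants and the perf option callback signature), and only a subset of the filter names is listed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Placeholder bits for illustration only. */
enum {
	BR_USER     = 1 << 0,
	BR_KERNEL   = 1 << 1,
	BR_HV       = 1 << 2,
	BR_ANY      = 1 << 3,
	BR_ANY_CALL = 1 << 4,
	BR_ANY_RET  = 1 << 5,
	BR_COND     = 1 << 6,
};
#define BR_PLM_MASK (BR_USER | BR_KERNEL | BR_HV)

struct br_name {
	const char *name;
	uint64_t bit;
};

static const struct br_name br_names[] = {
	{ "u", BR_USER }, { "k", BR_KERNEL }, { "hv", BR_HV },
	{ "any", BR_ANY }, { "any_call", BR_ANY_CALL },
	{ "any_ret", BR_ANY_RET }, { "cond", BR_COND },
	{ NULL, 0 }
};

/*
 * Parse a comma separated filter list into *mode.
 * str may be NULL (the "-b" case). Returns 0 on success, -1 on an
 * unknown name or allocation failure.
 */
int parse_filter(const char *str, uint64_t *mode)
{
	char *copy = NULL, *s, *p;
	const struct br_name *br;
	int ret = -1;

	assert(mode);
	*mode = 0;

	if (str) {
		/* work on a copy because str is read-only */
		s = copy = strdup(str);
		if (!copy)
			return -1;

		for (;;) {
			p = strchr(s, ',');
			if (p)
				*p = '\0';

			for (br = br_names; br->name; br++) {
				if (!strcasecmp(s, br->name))
					break;
			}
			if (!br->name)
				goto out;

			*mode |= br->bit;
			if (!p)
				break;
			s = p + 1;
		}
	}

	/* no branch type given: default to any branch */
	if ((*mode & ~(uint64_t)BR_PLM_MASK) == 0)
		*mode = BR_ANY;
	ret = 0;
out:
	free(copy);
	return ret;
}
```

As in the moved code, a list that names only privilege levels is replaced by the "any" default rather than combined with it.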

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-top.txt | 21 +++++++++++++++++++++
 tools/perf/builtin-top.c              | 10 +++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 9e5b07eb..fc2bffd 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -202,6 +202,27 @@ Default is to monitor all CPUS.
 	readability.  0 means no limit (default behavior).
 
 
+-b::
+--branch-any::
+	Enable taken branch stack sampling. Any type of taken branch may be sampled.
+	This is a shortcut for --branch-filter any. See --branch-filter for more infos.
+
+-j::
+--branch-filter::
+	Enable taken branch stack sampling. Each sample captures a series of consecutive
+	taken branches. The number of branches captured with each sample depends on the
+	underlying hardware, the type of branches of interest, and the executed code.
+	It is possible to select the types of branches captured by enabling filters.
+	For a full list of modifiers please see the perf record manpage.
+
+	The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
+	The privilege levels may be omitted, in which case, the privilege levels of the associated
+	event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege
+	levels are subject to permissions.  When sampling on multiple events, branch stack sampling
+	is enabled for all the sampling events. The sampled branch type is the same for all events.
+	The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
+	Note that this feature may not be available on all processors.
+
 INTERACTIVE PROMPTING KEYS
 --------------------------
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index a193517..b299ce4 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -40,6 +40,7 @@
 #include "util/xyarray.h"
 #include "util/sort.h"
 #include "util/intlist.h"
+#include "util/branch.h"
 #include "arch/common.h"
 
 #include "util/debug.h"
@@ -687,6 +688,8 @@ static int hist_iter__top_callback(struct hist_entry_iter *iter,
 		perf_top__record_precise_ip(top, he, evsel->idx, ip);
 	}
 
+	hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
+		     !(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY));
 	return 0;
 }
 
@@ -923,7 +926,6 @@ static int perf_top__setup_sample_type(struct perf_top *top __maybe_unused)
 			return -EINVAL;
 		}
 	}
-
 	return 0;
 }
 
@@ -1159,6 +1161,12 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_STRING('w', "column-widths", &symbol_conf.col_width_list_str,
 		   "width[,width...]",
 		   "don't try to adjust column width, use these fixed values"),
+	OPT_CALLBACK_NOOPT('b', "branch-any", &opts->branch_stack,
+		     "branch any", "sample any taken branches",
+		     parse_branch_stack),
+	OPT_CALLBACK('j', "branch-filter", &opts->branch_stack,
+		     "branch filter mask", "branch stack filter modes",
+		     parse_branch_stack),
 	OPT_END()
 	};
 	const char * const top_usage[] = {
-- 
2.1.0



* [PATCH 10/11] perf, tools, report: Display cycles in branch sort mode
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (8 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 09/11] perf, tools, top: Add branch annotation code to top Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-05-27 17:51 ` [PATCH 11/11] test patch: Add fake branch cycles to input data in report/top Andi Kleen
  2015-06-01 14:21 ` Cycles annotation support for perf tools v2 Jiri Olsa
  11 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC (permalink / raw)
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Display the cycles by default in branch sort mode.

To make enough room for the new column I removed dso_to. It is usually
redundant with dso_from.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/sort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 2a23d62..fb61611 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -9,7 +9,7 @@ regex_t		parent_regex;
 const char	default_parent_pattern[] = "^sys_|^do_page_fault";
 const char	*parent_pattern = default_parent_pattern;
 const char	default_sort_order[] = "comm,dso,symbol";
-const char	default_branch_sort_order[] = "comm,dso_from,symbol_from,dso_to,symbol_to";
+const char	default_branch_sort_order[] = "comm,dso_from,symbol_from,symbol_to,cycles";
 const char	default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked";
 const char	default_top_sort_order[] = "dso,symbol";
 const char	default_diff_sort_order[] = "dso,symbol";
-- 
2.1.0



* [PATCH 11/11] test patch: Add fake branch cycles to input data in report/top
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (9 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 10/11] perf, tools, report: Display cycles in branch sort mode Andi Kleen
@ 2015-05-27 17:51 ` Andi Kleen
  2015-06-01 14:21 ` Cycles annotation support for perf tools v2 Jiri Olsa
  11 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2015-05-27 17:51 UTC (permalink / raw)
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Not to be merged, but useful for testing if you don't have
hardware with cycles branch stack support.
---
 tools/perf/util/hist.c    | 2 +-
 tools/perf/util/machine.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6f1e411..f091f1d 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1418,7 +1418,7 @@ void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
 	struct branch_info *bi;
 
 	/* If we have branch cycles always annotate them. */
-	if (bs && bs->nr && bs->entries[0].flags.cycles) {
+	if (bs && bs->nr /* && bs->entries[0].flags.cycles */) {
 		int i;
 
 		bi = sample__resolve_bstack(sample, al);
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index daa5591..418b6bd 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1552,6 +1552,8 @@ struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
 		ip__resolve_ams(al->thread, &bi[i].to, bs->entries[i].to);
 		ip__resolve_ams(al->thread, &bi[i].from, bs->entries[i].from);
 		bi[i].flags = bs->entries[i].flags;
+		if (bi[i].flags.cycles == 0)
+			bi[i].flags.cycles = 123;
 	}
 	return bi;
 }
-- 
2.1.0



* [tip:perf/core] perf annotation: Add symbol__get_annotation
  2015-05-27 17:51 ` [PATCH 03/11] perf, tools: Add symbol__get_annotation Andi Kleen
@ 2015-05-28  9:32   ` tip-bot for Andi Kleen
  2015-06-01 14:17   ` [PATCH 03/11] perf, tools: " Jiri Olsa
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot for Andi Kleen @ 2015-05-28  9:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, tglx, namhyung, mingo, jolsa, acme, ak, hpa, linux-kernel

Commit-ID:  83be34a7a913bdf9f21f524333c63d9c48a28ef4
Gitweb:     http://git.kernel.org/tip/83be34a7a913bdf9f21f524333c63d9c48a28ef4
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 27 May 2015 10:51:46 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 May 2015 20:30:56 -0300

perf annotation: Add symbol__get_annotation

Add a new utility function to get a function annotation out of existing
code.
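
The pattern being factored out is "return the annotation, allocating its histogram on first use". A minimal standalone sketch, assuming hypothetical names (struct notes, struct hist_data, notes_alloc_hist() and notes_get() are not perf APIs):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the annotation and its histogram storage. */
struct hist_data {
	unsigned long *counts;
	size_t size;
};

struct notes {
	struct hist_data *src; /* NULL until first use */
};

/* Allocate the histogram; returns 0 on success, -1 on failure. */
int notes_alloc_hist(struct notes *n, size_t size)
{
	struct hist_data *h = calloc(1, sizeof(*h));

	if (!h)
		return -1;
	h->counts = calloc(size, sizeof(*h->counts));
	if (!h->counts) {
		free(h);
		return -1;
	}
	h->size = size;
	n->src = h;
	return 0;
}

/* Return n with its histogram allocated, or NULL if allocation failed. */
struct notes *notes_get(struct notes *n, size_t size)
{
	assert(n);

	if (n->src == NULL && notes_alloc_hist(n, size) < 0)
		return NULL;
	return n;
}
```

Callers then only need a single NULL check, which is what lets the caller in the diff shrink to three lines.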

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1432749114-904-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/annotate.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 7f5bdfc..bf80430 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -506,6 +506,17 @@ static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 	return 0;
 }
 
+static struct annotation *symbol__get_annotation(struct symbol *sym)
+{
+	struct annotation *notes = symbol__annotation(sym);
+
+	if (notes->src == NULL) {
+		if (symbol__alloc_hist(sym) < 0)
+			return NULL;
+	}
+	return notes;
+}
+
 static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 				    int evidx, u64 addr)
 {
@@ -513,13 +524,9 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 
 	if (sym == NULL)
 		return 0;
-
-	notes = symbol__annotation(sym);
-	if (notes->src == NULL) {
-		if (symbol__alloc_hist(sym) < 0)
-			return -ENOMEM;
-	}
-
+	notes = symbol__get_annotation(sym);
+	if (notes == NULL)
+		return -ENOMEM;
 	return __symbol__inc_addr_samples(sym, map, notes, evidx, addr);
 }
 


* [tip:perf/core] perf tools: Move branch option parsing to own file
  2015-05-27 17:51 ` [PATCH 08/11] perf, tools, report: Move branch option parsing to own file Andi Kleen
@ 2015-05-28  9:32   ` tip-bot for Andi Kleen
  2015-06-01 14:20   ` [PATCH 08/11] perf, tools, report: " Jiri Olsa
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot for Andi Kleen @ 2015-05-28  9:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, namhyung, linux-kernel, eranian, ak, mingo, tglx, hpa, acme

Commit-ID:  f00898f4e20b286877b8d6d96d6e404661fd7985
Gitweb:     http://git.kernel.org/tip/f00898f4e20b286877b8d6d96d6e404661fd7985
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 27 May 2015 10:51:51 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 May 2015 21:02:17 -0300

perf tools: Move branch option parsing to own file

.. to allow sharing between builtin-record and builtin-top later.  No
code changes, just moved code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1432749114-904-9-git-send-email-andi@firstfloor.org
[ Rename too generic branch.[ch] name to parse-branch-options.[ch] ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c            | 89 +-------------------------------
 tools/perf/util/Build                  |  1 +
 tools/perf/util/parse-branch-options.c | 93 ++++++++++++++++++++++++++++++++++
 tools/perf/util/parse-branch-options.h |  5 ++
 4 files changed, 100 insertions(+), 88 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5dfe913..91aa2a3 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -28,6 +28,7 @@
 #include "util/thread_map.h"
 #include "util/data.h"
 #include "util/auxtrace.h"
+#include "util/parse-branch-options.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -751,94 +752,6 @@ out_delete_session:
 	return status;
 }
 
-#define BRANCH_OPT(n, m) \
-	{ .name = n, .mode = (m) }
-
-#define BRANCH_END { .name = NULL }
-
-struct branch_mode {
-	const char *name;
-	int mode;
-};
-
-static const struct branch_mode branch_modes[] = {
-	BRANCH_OPT("u", PERF_SAMPLE_BRANCH_USER),
-	BRANCH_OPT("k", PERF_SAMPLE_BRANCH_KERNEL),
-	BRANCH_OPT("hv", PERF_SAMPLE_BRANCH_HV),
-	BRANCH_OPT("any", PERF_SAMPLE_BRANCH_ANY),
-	BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
-	BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
-	BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
-	BRANCH_OPT("abort_tx", PERF_SAMPLE_BRANCH_ABORT_TX),
-	BRANCH_OPT("in_tx", PERF_SAMPLE_BRANCH_IN_TX),
-	BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
-	BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
-	BRANCH_END
-};
-
-static int
-parse_branch_stack(const struct option *opt, const char *str, int unset)
-{
-#define ONLY_PLM \
-	(PERF_SAMPLE_BRANCH_USER	|\
-	 PERF_SAMPLE_BRANCH_KERNEL	|\
-	 PERF_SAMPLE_BRANCH_HV)
-
-	uint64_t *mode = (uint64_t *)opt->value;
-	const struct branch_mode *br;
-	char *s, *os = NULL, *p;
-	int ret = -1;
-
-	if (unset)
-		return 0;
-
-	/*
-	 * cannot set it twice, -b + --branch-filter for instance
-	 */
-	if (*mode)
-		return -1;
-
-	/* str may be NULL in case no arg is passed to -b */
-	if (str) {
-		/* because str is read-only */
-		s = os = strdup(str);
-		if (!s)
-			return -1;
-
-		for (;;) {
-			p = strchr(s, ',');
-			if (p)
-				*p = '\0';
-
-			for (br = branch_modes; br->name; br++) {
-				if (!strcasecmp(s, br->name))
-					break;
-			}
-			if (!br->name) {
-				ui__warning("unknown branch filter %s,"
-					    " check man page\n", s);
-				goto error;
-			}
-
-			*mode |= br->mode;
-
-			if (!p)
-				break;
-
-			s = p + 1;
-		}
-	}
-	ret = 0;
-
-	/* default to any branch */
-	if ((*mode & ~ONLY_PLM) == 0) {
-		*mode = PERF_SAMPLE_BRANCH_ANY;
-	}
-error:
-	free(os);
-	return ret;
-}
-
 static void callchain_debug(void)
 {
 	static const char *str[CALLCHAIN_MAX] = { "NONE", "FP", "DWARF", "LBR" };
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 6966d07..e4b676d 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -75,6 +75,7 @@ libperf-$(CONFIG_X86) += tsc.o
 libperf-y += cloexec.o
 libperf-y += thread-stack.o
 libperf-$(CONFIG_AUXTRACE) += auxtrace.o
+libperf-y += parse-branch-options.o
 
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
 libperf-$(CONFIG_LIBELF) += probe-event.o
diff --git a/tools/perf/util/parse-branch-options.c b/tools/perf/util/parse-branch-options.c
new file mode 100644
index 0000000..9d99943
--- /dev/null
+++ b/tools/perf/util/parse-branch-options.c
@@ -0,0 +1,93 @@
+#include "perf.h"
+#include "util/util.h"
+#include "util/debug.h"
+#include "util/parse-options.h"
+#include "util/parse-branch-options.h"
+
+#define BRANCH_OPT(n, m) \
+	{ .name = n, .mode = (m) }
+
+#define BRANCH_END { .name = NULL }
+
+struct branch_mode {
+	const char *name;
+	int mode;
+};
+
+static const struct branch_mode branch_modes[] = {
+	BRANCH_OPT("u", PERF_SAMPLE_BRANCH_USER),
+	BRANCH_OPT("k", PERF_SAMPLE_BRANCH_KERNEL),
+	BRANCH_OPT("hv", PERF_SAMPLE_BRANCH_HV),
+	BRANCH_OPT("any", PERF_SAMPLE_BRANCH_ANY),
+	BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
+	BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
+	BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+	BRANCH_OPT("abort_tx", PERF_SAMPLE_BRANCH_ABORT_TX),
+	BRANCH_OPT("in_tx", PERF_SAMPLE_BRANCH_IN_TX),
+	BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
+	BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
+	BRANCH_END
+};
+
+int
+parse_branch_stack(const struct option *opt, const char *str, int unset)
+{
+#define ONLY_PLM \
+	(PERF_SAMPLE_BRANCH_USER	|\
+	 PERF_SAMPLE_BRANCH_KERNEL	|\
+	 PERF_SAMPLE_BRANCH_HV)
+
+	uint64_t *mode = (uint64_t *)opt->value;
+	const struct branch_mode *br;
+	char *s, *os = NULL, *p;
+	int ret = -1;
+
+	if (unset)
+		return 0;
+
+	/*
+	 * cannot set it twice, -b + --branch-filter for instance
+	 */
+	if (*mode)
+		return -1;
+
+	/* str may be NULL in case no arg is passed to -b */
+	if (str) {
+		/* because str is read-only */
+		s = os = strdup(str);
+		if (!s)
+			return -1;
+
+		for (;;) {
+			p = strchr(s, ',');
+			if (p)
+				*p = '\0';
+
+			for (br = branch_modes; br->name; br++) {
+				if (!strcasecmp(s, br->name))
+					break;
+			}
+			if (!br->name) {
+				ui__warning("unknown branch filter %s,"
+					    " check man page\n", s);
+				goto error;
+			}
+
+			*mode |= br->mode;
+
+			if (!p)
+				break;
+
+			s = p + 1;
+		}
+	}
+	ret = 0;
+
+	/* default to any branch */
+	if ((*mode & ~ONLY_PLM) == 0) {
+		*mode = PERF_SAMPLE_BRANCH_ANY;
+	}
+error:
+	free(os);
+	return ret;
+}
diff --git a/tools/perf/util/parse-branch-options.h b/tools/perf/util/parse-branch-options.h
new file mode 100644
index 0000000..b9d9470
--- /dev/null
+++ b/tools/perf/util/parse-branch-options.h
@@ -0,0 +1,5 @@
+#ifndef _PERF_PARSE_BRANCH_OPTIONS_H
+#define _PERF_PARSE_BRANCH_OPTIONS_H 1
+struct option;
+int parse_branch_stack(const struct option *opt, const char *str, int unset);
+#endif /* _PERF_PARSE_BRANCH_OPTIONS_H */
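
The parsing loop moved by this patch can be tried outside the perf tree.
What follows is a minimal standalone sketch, not the perf code itself:
the DEMO_* bit values, struct demo_branch_mode and
demo_parse_branch_filter() are hypothetical stand-ins for the
PERF_SAMPLE_BRANCH_* flags, struct branch_mode and parse_branch_stack(),
and the "cannot set it twice" check and ui__warning() call are left out.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() and strcasecmp() */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical stand-ins for the PERF_SAMPLE_BRANCH_* bits. */
enum {
	DEMO_BRANCH_USER     = 1 << 0,
	DEMO_BRANCH_KERNEL   = 1 << 1,
	DEMO_BRANCH_HV       = 1 << 2,
	DEMO_BRANCH_ANY      = 1 << 3,
	DEMO_BRANCH_ANY_CALL = 1 << 4,
};

/* Privilege-level bits only, mirroring ONLY_PLM in the patch. */
#define DEMO_ONLY_PLM \
	(DEMO_BRANCH_USER | DEMO_BRANCH_KERNEL | DEMO_BRANCH_HV)

struct demo_branch_mode {
	const char *name;
	uint64_t mode;
};

static const struct demo_branch_mode demo_modes[] = {
	{ "u", DEMO_BRANCH_USER },
	{ "k", DEMO_BRANCH_KERNEL },
	{ "hv", DEMO_BRANCH_HV },
	{ "any", DEMO_BRANCH_ANY },
	{ "any_call", DEMO_BRANCH_ANY_CALL },
	{ NULL, 0 },		/* table terminator */
};

/* Returns 0 and fills *mode, or -1 on an unknown name or failed strdup. */
int demo_parse_branch_filter(const char *str, uint64_t *mode)
{
	const struct demo_branch_mode *br;
	char *s, *copy = NULL, *p;
	int ret = -1;

	*mode = 0;

	/* str may be NULL when no argument was given */
	if (str) {
		/* str is read-only, so split a private copy */
		s = copy = strdup(str);
		if (!s)
			return -1;

		for (;;) {
			p = strchr(s, ',');
			if (p)
				*p = '\0';

			/* case-insensitive lookup of one filter name */
			for (br = demo_modes; br->name; br++) {
				if (!strcasecmp(s, br->name))
					break;
			}
			if (!br->name)
				goto error;

			*mode |= br->mode;

			if (!p)
				break;
			s = p + 1;
		}
	}
	ret = 0;

	/*
	 * Default to any branch. As in the patch this is an assignment,
	 * so privilege-level-only input ends up as plain ANY.
	 */
	if ((*mode & ~(uint64_t)DEMO_ONLY_PLM) == 0)
		*mode = DEMO_BRANCH_ANY;
error:
	free(copy);
	return ret;
}
```

With this sketch, "any_call,u" yields DEMO_BRANCH_ANY_CALL | DEMO_BRANCH_USER,
while "u,k" or a NULL string falls through to DEMO_BRANCH_ANY.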


* Re: [PATCH 05/11] perf, tools, report: Add processing for cycle histograms
  2015-05-27 17:51 ` [PATCH 05/11] perf, tools, report: Add processing for cycle histograms Andi Kleen
@ 2015-06-01 14:10   ` Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2015-06-01 14:10 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, namhyung, eranian, linux-kernel, Andi Kleen

On Wed, May 27, 2015 at 10:51:48AM -0700, Andi Kleen wrote:

SNIP

> +void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
> +			  struct perf_sample *sample, bool nonany_branch_mode)
> +{
> +	struct branch_info *bi;
> +
> +	/* If we have branch cycles always annotate them. */
> +	if (bs && bs->nr && bs->entries[0].flags.cycles) {
> +		int i;
> +
> +		bi = sample__resolve_bstack(sample, al);
> +		if (bi) {
> +			struct addr_map_symbol *prev = NULL;
> +
> +			/*
> +			 * Ignore errors, still want to process the
> +			 * other entries.
> +			 *
> +			 * For non standard branch modes always
> +			 * force no IPC (prev == NULL)
> +			 *
> +			 * Note that perf stores branches reversed from
> +			 * program order!
> +			 */
> +			for (i = bs->nr - 1; i >= 0; i--) {
> +				addr_map_symbol__account_cycles(&bi[i].from,
> +					nonany_branch_mode ? NULL : prev,
> +					bi[i].flags.cycles);

It's only ERANGE that you want to ignore here.

jirka
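
One possible reading of the comment above, as a minimal standalone
sketch: walk the entries from last to first (perf stores branches
reversed from program order), skip entries that fail with -ERANGE, and
propagate any other error. struct demo_hist, demo_account() and
demo_account_all() are hypothetical names and not part of the patch; the
-ENOMEM case only stands in for "some error other than ERANGE".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical branch entry: source address and cycle count. */
struct demo_branch {
	unsigned long from;
	unsigned int cycles;
};

/* Hypothetical per-symbol histogram covering [start, start + len). */
struct demo_hist {
	unsigned long start;
	unsigned long len;
	unsigned long *cycles;	/* NULL simulates a failed allocation */
};

/* Account one branch; -ERANGE if the address is outside the symbol. */
int demo_account(struct demo_hist *h, unsigned long addr,
		 unsigned int cycles)
{
	if (!h->cycles)
		return -ENOMEM;
	if (addr < h->start || addr - h->start >= h->len)
		return -ERANGE;
	h->cycles[addr - h->start] += cycles;
	return 0;
}

/* Process all entries, ignoring only -ERANGE. */
int demo_account_all(struct demo_hist *h, const struct demo_branch *br,
		     int nr)
{
	int i, err;

	/* entries are stored newest first, so iterate backwards */
	for (i = nr - 1; i >= 0; i--) {
		err = demo_account(h, br[i].from, br[i].cycles);
		if (err && err != -ERANGE)
			return err;	/* real failure: stop here */
	}
	return 0;
}
```

An out-of-range entry is skipped and the remaining entries are still
accounted, whereas an allocation failure is reported to the caller.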


* Re: [PATCH 01/11] perf, tools: Add tools support for cycles, weight branch_info field
  2015-05-27 17:51 ` [PATCH 01/11] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
@ 2015-06-01 14:16   ` Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2015-06-01 14:16 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, namhyung, eranian, linux-kernel, Andi Kleen

On Wed, May 27, 2015 at 10:51:44AM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> cycles is a new branch_info field available on some CPUs
> that indicates the time deltas between branches in the LBR.
> 
> Add a sort key and output code for the cycles
> to allow displaying the basic block cycles individually in perf report.
> 
> We also pass in the cycles for weight when LBRs are processed,
> which allows getting global and local weight, to get an estimate
> of the total cost.
> 
> And also print the cycles information for perf report -D.
> I also added printing for the previously missing LBR flags
> (mispredict etc.)
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka


* Re: [PATCH 02/11] perf, tools, report: Add flag for non ANY branch mode
  2015-05-27 17:51 ` [PATCH 02/11] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
@ 2015-06-01 14:16   ` Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2015-06-01 14:16 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, namhyung, eranian, linux-kernel, Andi Kleen

On Wed, May 27, 2015 at 10:51:45AM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> Later patches need to cheaply check that the branch mode is in ANY.
> Add a new function to check all event attrs and add a flag to the
> report state, which is then initialized.
> 
> v2: Rename flag
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka


* Re: [PATCH 03/11] perf, tools: Add symbol__get_annotation
  2015-05-27 17:51 ` [PATCH 03/11] perf, tools: Add symbol__get_annotation Andi Kleen
  2015-05-28  9:32   ` [tip:perf/core] perf annotation: " tip-bot for Andi Kleen
@ 2015-06-01 14:17   ` Jiri Olsa
  1 sibling, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2015-06-01 14:17 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, namhyung, eranian, linux-kernel, Andi Kleen

On Wed, May 27, 2015 at 10:51:46AM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> Add a new utility function to get a function annotation.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka


* Re: [PATCH 04/11] perf, tools, report: Add infrastructure for a cycles histogram
  2015-05-27 17:51 ` [PATCH 04/11] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
@ 2015-06-01 14:19   ` Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2015-06-01 14:19 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, namhyung, eranian, linux-kernel, Andi Kleen

On Wed, May 27, 2015 at 10:51:47AM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> This adds the basic infrastructure to keep track of cycle counts
> per basic block for annotate. We allocate an array similar to the
> normal accounting, and then account branch cycles there.
> 
> We handle two cases:
> cycles per basic block with start and cycles per branch
> (these are later used for either IPC or just cycles per BB)
> 
> In the start case we cannot handle overlaps, so the longest
> basic block always wins.
> 
> For the cycles per branch case everything is accurately accounted.
> 
> v2: Remove unnecessary checks. Slight restructure. Move
> symbol__get_annotation to another patch. Move histogram allocation.
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka
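
The two accounting cases described in the quoted commit message can be
illustrated with a small standalone sketch. It is an assumption about
how "longest basic block wins" could be expressed, not the code from the
patch: struct demo_cyc_slot and demo_account_block() are hypothetical,
one slot stands for the histogram entry at a branch's offset, and a
smaller start offset means a longer block ending at that branch.

```c
/* Hypothetical histogram slot for one branch offset. */
struct demo_cyc_slot {
	/* cycles per basic block with start: only the longest block */
	unsigned int start;
	int have_start;
	unsigned long bb_cycles;
	unsigned int bb_num;
	/* cycles per branch: every sample is accounted */
	unsigned long br_cycles;
	unsigned int br_num;
};

void demo_account_block(struct demo_cyc_slot *slot, unsigned int start,
			int have_start, unsigned int cycles)
{
	/* the per-branch totals are always updated */
	slot->br_cycles += cycles;
	slot->br_num++;

	if (!have_start)
		return;

	if (slot->have_start) {
		/* a shorter overlapping block loses: drop the sample */
		if (start > slot->start)
			return;
		/* a longer block wins: restart the block accounting */
		if (start < slot->start) {
			slot->bb_cycles = 0;
			slot->bb_num = 0;
		}
	}

	slot->have_start = 1;
	slot->start = start;
	slot->bb_cycles += cycles;
	slot->bb_num++;
}
```

Feeding one slot blocks that start at 0x20, then 0x10, then 0x20 again
leaves only the 0x10 block in the bb_* fields, while br_cycles and
br_num cover every sample.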


* Re: [PATCH 08/11] perf, tools, report: Move branch option parsing to own file
  2015-05-27 17:51 ` [PATCH 08/11] perf, tools, report: Move branch option parsing to own file Andi Kleen
  2015-05-28  9:32   ` [tip:perf/core] perf tools: " tip-bot for Andi Kleen
@ 2015-06-01 14:20   ` Jiri Olsa
  1 sibling, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2015-06-01 14:20 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, namhyung, eranian, linux-kernel, Andi Kleen

On Wed, May 27, 2015 at 10:51:51AM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> .. to allow sharing between builtin-record and builtin-top later.
> No code changes, just moved code.
> 
> v2: Add header
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka


* Re: Cycles annotation support for perf tools v2
  2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
                   ` (10 preceding siblings ...)
  2015-05-27 17:51 ` [PATCH 11/11] test patch: Add fake branch cycles to input data in report/top Andi Kleen
@ 2015-06-01 14:21 ` Jiri Olsa
  2015-06-01 14:43   ` Arnaldo Carvalho de Melo
  11 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2015-06-01 14:21 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, namhyung, eranian, linux-kernel

On Wed, May 27, 2015 at 10:51:43AM -0700, Andi Kleen wrote:
> [v2: Addressed review comments. Fixed display problems and 
> correctly compute IPC now. See patches for detailed changes.]
> 
> The upcoming Skylake CPU has a new timed branch stack feature,
> that reports cycle counts for individual branches in the
> last branch record.
> 
> This allows to get fine grained cost information for code, and also allows
> to compute fine grained IPC.
> 
> Available from
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools2
> 
> This patchkit adds support for this in the perf tools:
> - Basic support for the cycles field like other branch fields
> - Show cycles in the standard branch sort view (no IPC here,
>   as IPC needs the instruction counts from annotation)
> - Annotate cycles and IPC in the assembler annotate view
> - Add branch support to top, so we can do live annotation.
> - Misc support, like dumping it in perf report -D

v2 seems OK to me. All my comments were addressed,
and I posted one more comment.

Anyway, I don't touch the annotate code enough to ack the annotate
core patches, so I'm acking only a portion of the patchset.

thanks,
jirka
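
The quoted cover letter notes that IPC needs the instruction counts from
annotation in addition to the branch cycles. A minimal sketch of that
arithmetic, assuming IPC for a basic block is the number of instructions
in the block divided by the average cycles per sample of that block;
demo_block_ipc() and its parameters are hypothetical names, not taken
from the patches.

```c
/*
 * n_insn: instructions in the basic block (from the annotation)
 * cycles: summed cycles over all samples of the block
 * num:    number of samples that contributed to cycles
 */
double demo_block_ipc(unsigned int n_insn, unsigned long cycles,
		      unsigned int num)
{
	double avg_cycles;

	/* no samples or no cycles recorded: report no IPC */
	if (num == 0 || cycles == 0)
		return 0.0;

	avg_cycles = (double)cycles / num;
	return n_insn / avg_cycles;
}
```

For a block of 12 instructions sampled 10 times with 60 cycles in total,
the average is 6 cycles per pass, which gives an IPC of 2.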


* Re: Cycles annotation support for perf tools v2
  2015-06-01 14:21 ` Cycles annotation support for perf tools v2 Jiri Olsa
@ 2015-06-01 14:43   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 22+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-06-01 14:43 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, namhyung, eranian, linux-kernel

On Mon, Jun 01, 2015 at 04:21:36PM +0200, Jiri Olsa wrote:
> On Wed, May 27, 2015 at 10:51:43AM -0700, Andi Kleen wrote:
> > [v2: Addressed review comments. Fixed display problems and 
> > correctly compute IPC now. See patches for detailed changes.]
> > 
> > The upcoming Skylake CPU has a new timed branch stack feature,
> > that reports cycle counts for individual branches in the
> > last branch record.
> > 
> > This allows to get fine grained cost information for code, and also allows
> > to compute fine grained IPC.
> > 
> > Available from
> > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools2
> > 
> > This patchkit adds support for this in the perf tools:
> > - Basic support for the cycles field like other branch fields
> > - Show cycles in the standard branch sort view (no IPC here,
> >   as IPC needs the instruction counts from annotation)
> > - Annotate cycles and IPC in the assembler annotate view
> > - Add branch support to top, so we can do live annotation.
> > - Misc support, like dumping it in perf report -D
> 
> v2 seems OK to me. All my comments were addressed,
> and I posted one more comment.
> 
> Anyway, I don't touch the annotate code enough to ack the annotate
> core patches, so I'm acking only a portion of the patchset.

I took and pushed two of those patches already, will try to look at the
annotation bits later today...

- Arnaldo


