linux-kernel.vger.kernel.org archive mirror
* Cycles annotation support for perf tools
@ 2015-05-10 13:51 Andi Kleen
  2015-05-10 13:51 ` [PATCH 01/10] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
                   ` (12 more replies)
  0 siblings, 13 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:51 UTC
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian

The upcoming Skylake CPU has a new timed branch stack feature
that reports cycle counts for individual branches in the
last branch record (LBR).

This makes it possible to get fine-grained cost information for code and
to compute fine-grained IPC.

This patchkit adds support for this in the perf tools:
- Basic support for the cycles field like other branch fields
- Show cycles in the standard branch sort view (no IPC here,
  as IPC needs the instruction counts from annotation)
- Annotate cycles and IPC in the assembler annotate view
- Add branch support to top, so we can do live annotation.
- Misc support, like dumping it in perf report -D

The kernel support has been posted separately. I included a test patch
to generate fake data for testing on existing systems.

Example output for annotate (with made-up numbers):

The second column is the IPC and the third is the average cycles for the
basic block.

                   │    static int hex(char ch)
                   │    {
        8.20       │      push   %rbp
        8.20       │      mov    %rsp,%rbp
        8.20       │      sub    $0x20,%rsp
        8.20       │      mov    %edi,%eax
        8.20       │      mov    %al,-0x14(%rbp)
        8.20       │      mov    %fs:0x28,%rax
        8.20       │      mov    %rax,-0x8(%rbp)
        8.20       │      xor    %eax,%eax
                   │            if ((ch >= '0') && (ch <= '9'))
        8.20       │      cmpb   $0x2f,-0x14(%rbp)
 66.67  8.20   123 │    ↓ jle    31
        8.20       │      cmpb   $0x39,-0x14(%rbp)
        8.20   123 │    ↓ jg     31
                   │                    return ch - '0';
 22.22  8.20       │      movsbl -0x14(%rbp),%eax
        8.20       │      sub    $0x30,%eax
        8.20   123 │    ↓ jmp    60
                   │            if ((ch >= 'a') && (ch <= 'f'))
       17.57       │31:   cmpb   $0x60,-0x14(%rbp)
       17.57   123 │    ↓ jle    46
       17.57       │      cmpb   $0x66,-0x14(%rbp)
       17.57       │    ↓ jg     46
                   │                    return ch - 'a' + 10;
       17.57       │      movsbl -0x14(%rbp),%eax

Example output for branch view (again with fake data):

Overhead  Command  Source Shared Object  Source Symbol          Target Symbol          Basic Block Cycles
  30.08%  tcall    tcall                 [.] f1                 [.] f2                 123
  27.44%  tcall    tcall                 [.] f2                 [.] f1                 123
  15.60%  tcall    tcall                 [.] main               [.] f1                 123
  12.96%  tcall    tcall                 [.] f1                 [.] main               123
  12.86%  tcall    tcall                 [.] main               [.] main               123
   0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt  [k] hrtimer_interrupt  123

IPC computation has a few limitations (see the comments in the respective
patches); in particular, it punts on overlapping basic blocks.

The annotation only works in the interactive (TUI) annotate browser.
It does not yet work in the scripted perf annotate, which is missing a lot
of the infrastructure needed for per-instruction state.

It would be nice to add column headers to annotate.

So far there is no support in --branch-history or in perf script.



* [PATCH 01/10] perf, tools: Add tools support for cycles, weight branch_info field
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
@ 2015-05-10 13:51 ` Andi Kleen
  2015-05-10 13:51 ` [PATCH 02/10] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:51 UTC
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

cycles is a new branch_info field, available on some CPUs,
that indicates the time deltas between branches in the LBR.

Add a sort key and output code for cycles so that perf report
can display the basic block cycles individually.

We also pass the cycles in as the weight when LBRs are processed,
which makes global and local weight available as an estimate
of the total cost.

Also print the cycles information in perf report -D.
I also added printing for the previously missing LBR flags
(mispredict etc.)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/util/event.h                  |  3 ++-
 tools/perf/util/hist.c                   |  3 ++-
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/session.c                | 16 ++++++++++++----
 tools/perf/util/sort.c                   | 24 ++++++++++++++++++++++++
 tools/perf/util/sort.h                   |  1 +
 7 files changed, 43 insertions(+), 6 deletions(-)
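
To illustrate the commit message above, here is a minimal standalone sketch
(not the perf code itself) of a 16-bit cycles field in the branch flags word,
the fallback to a weight of 1 when no cycle count was reported, and a
comparator on cycles. The names demo_branch_flags, demo_weight and
demo_cycles_cmp are hypothetical; only the field layout follows the diff
below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the patched struct branch_flags layout. */
struct demo_branch_flags {
	uint64_t mispred:1;
	uint64_t predicted:1;
	uint64_t in_tx:1;
	uint64_t abort:1;
	uint64_t cycles:16;	/* cycles since the previous LBR entry */
	uint64_t reserved:44;
};

/* Weight for the hist entry: the cycle count, or 1 if none was reported. */
static uint64_t demo_weight(const struct demo_branch_flags *f)
{
	return f->cycles ? f->cycles : 1;
}

/* Three-way compare on cycles, usable as a sort key comparator. */
static int64_t demo_cycles_cmp(const struct demo_branch_flags *l,
			       const struct demo_branch_flags *r)
{
	return (int64_t)l->cycles - (int64_t)r->cycles;
}
```

With cycles == 0 (for example on a CPU without timed LBRs) demo_weight()
yields 1, which matches the pseudo weight used before this patch.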

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 27190ed..034a2b4 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -108,6 +108,7 @@ OPTIONS
 	- mispredict: "N" for predicted branch, "Y" for mispredicted branch
 	- in_tx: branch in TSX transaction
 	- abort: TSX transaction abort.
+	- cycles: Cycles in basic block
 
 	And default sort keys are changed to comm, dso_from, symbol_from, dso_to
 	and symbol_to, see '--branch-stack'.
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 7eecd5e..f028a3c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -129,7 +129,8 @@ struct branch_flags {
 	u64 predicted:1;
 	u64 in_tx:1;
 	u64 abort:1;
-	u64 reserved:60;
+	u64 cycles:16;
+	u64 reserved:44;
 };
 
 struct branch_entry {
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 3387706..302fc05 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -623,7 +623,8 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
 	 * and not events sampled. Thus we use a pseudo period of 1.
 	 */
 	he = __hists__add_entry(hists, al, iter->parent, &bi[i], NULL,
-				1, 1, 0, true);
+				1, bi->flags.cycles ? bi->flags.cycles : 1,
+				0, true);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 9f31b89..b55c904 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -47,6 +47,7 @@ enum hist_column {
 	HISTC_MEM_SNOOP,
 	HISTC_MEM_DCACHELINE,
 	HISTC_TRANSACTION,
+	HISTC_CYCLES,
 	HISTC_NR_COLS, /* Last entry */
 };
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index e722107..484b974 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -738,10 +738,18 @@ static void branch_stack__printf(struct perf_sample *sample)
 
 	printf("... branch stack: nr:%" PRIu64 "\n", sample->branch_stack->nr);
 
-	for (i = 0; i < sample->branch_stack->nr; i++)
-		printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 "\n",
-			i, sample->branch_stack->entries[i].from,
-			sample->branch_stack->entries[i].to);
+	for (i = 0; i < sample->branch_stack->nr; i++) {
+		struct branch_entry *e = &sample->branch_stack->entries[i];
+
+		printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 " %hu cycles %s%s%s%s %x\n",
+			i, e->from, e->to,
+			e->flags.cycles,
+			e->flags.mispred ? "M" : " ",
+			e->flags.predicted ? "P" : " ",
+			e->flags.abort ? "A" : " ",
+			e->flags.in_tx ? "T" : " ",
+			(unsigned)e->flags.reserved);
+	}
 }
 
 static void regs_dump__printf(u64 mask, u64 *regs)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 4593f36..03d8e6e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -528,6 +528,29 @@ static int hist_entry__mispredict_snprintf(struct hist_entry *he, char *bf,
 	return repsep_snprintf(bf, size, "%-*.*s", width, width, out);
 }
 
+static int64_t
+sort__cycles_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return left->branch_info->flags.cycles -
+		right->branch_info->flags.cycles;
+}
+
+static int hist_entry__cycles_snprintf(struct hist_entry *he, char *bf,
+				    size_t size, unsigned int width)
+{
+	if (he->branch_info->flags.cycles == 0)
+		return repsep_snprintf(bf, size, "%-*s", width, "-");
+	return repsep_snprintf(bf, size, "%-*hd", width,
+			       he->branch_info->flags.cycles);
+}
+
+struct sort_entry sort_cycles = {
+	.se_header	= "Basic Block Cycles",
+	.se_cmp		= sort__cycles_cmp,
+	.se_snprintf	= hist_entry__cycles_snprintf,
+	.se_width_idx	= HISTC_CYCLES,
+};
+
 /* --sort daddr_sym */
 static int64_t
 sort__daddr_cmp(struct hist_entry *left, struct hist_entry *right)
@@ -1192,6 +1215,7 @@ static struct sort_dimension bstack_sort_dimensions[] = {
 	DIM(SORT_MISPREDICT, "mispredict", sort_mispredict),
 	DIM(SORT_IN_TX, "in_tx", sort_in_tx),
 	DIM(SORT_ABORT, "abort", sort_abort),
+	DIM(SORT_CYCLES, "cycles", sort_cycles),
 };
 
 #undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index e97cd47..bc6c87a 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -185,6 +185,7 @@ enum sort_type {
 	SORT_MISPREDICT,
 	SORT_ABORT,
 	SORT_IN_TX,
+	SORT_CYCLES,
 
 	/* memory mode specific sort keys */
 	__SORT_MEMORY_MODE,
-- 
1.9.3



* [PATCH 02/10] perf, tools, report: Add flag for non ANY branch mode
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
  2015-05-10 13:51 ` [PATCH 01/10] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
@ 2015-05-10 13:51 ` Andi Kleen
  2015-05-10 13:51 ` [PATCH 03/10] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:51 UTC
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Later patches need a cheap way to check whether the branch mode is ANY.
Add a new function that combines the branch sample type of all event
attrs, and add a flag to the report state that is initialized from it.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-report.c |  7 +++++++
 tools/perf/util/evlist.c    | 10 ++++++++++
 tools/perf/util/evlist.h    |  1 +
 3 files changed, 18 insertions(+)
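
The check described above can be sketched in isolation as follows. This is
only an illustration under assumptions: the DEMO_BRANCH_* values and both
function names are hypothetical stand-ins, and the event list is reduced to
a plain array of branch sample types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; perf itself uses PERF_SAMPLE_BRANCH_*. */
#define DEMO_BRANCH_USER	(1ULL << 0)
#define DEMO_BRANCH_ANY		(1ULL << 3)
#define DEMO_BRANCH_ANY_CALL	(1ULL << 4)

/* OR together the branch sample type of every event in the list. */
static uint64_t demo_combined_branch_type(const uint64_t *types, size_t n)
{
	uint64_t combined = 0;
	size_t i;

	for (i = 0; i < n; i++)
		combined |= types[i];
	return combined;
}

/* True when no event sampled in ANY mode. */
static bool demo_nonstd_branch_mode(const uint64_t *types, size_t n)
{
	return !(demo_combined_branch_type(types, n) & DEMO_BRANCH_ANY);
}
```

Because the types are OR-ed, a single event recorded in ANY mode is enough
to clear the flag even when another event used a narrower filter.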

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 18cb0ff..3b35f1e 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -53,6 +53,7 @@ struct report {
 	bool			mem_mode;
 	bool			header;
 	bool			header_only;
+	bool			nonstd_branch_mode;
 	int			max_stack;
 	struct perf_read_values	show_threads_values;
 	const char		*pretty_printing_style;
@@ -256,6 +257,12 @@ static int report__setup_sample_type(struct report *rep)
 		else
 			callchain_param.record_mode = CALLCHAIN_FP;
 	}
+
+	/* ??? handle more cases than just ANY? */
+	if (!(perf_evlist__combined_branch_type(session->evlist) &
+				PERF_SAMPLE_BRANCH_ANY))
+		rep->nonstd_branch_mode = true;
+
 	return 0;
 }
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7ec1bf9..ee0a610 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1216,6 +1216,16 @@ u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist)
 	return __perf_evlist__combined_sample_type(evlist);
 }
 
+u64 perf_evlist__combined_branch_type(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	u64 branch_type = 0;
+
+	evlist__for_each(evlist, evsel)
+		branch_type |= evsel->attr.branch_sample_type;
+	return branch_type;
+}
+
 bool perf_evlist__valid_read_format(struct perf_evlist *evlist)
 {
 	struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index c07b1a9..3277851 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -166,6 +166,7 @@ void perf_evlist__set_leader(struct perf_evlist *evlist);
 u64 perf_evlist__read_format(struct perf_evlist *evlist);
 u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist);
 u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist);
+u64 perf_evlist__combined_branch_type(struct perf_evlist *evlist);
 bool perf_evlist__sample_id_all(struct perf_evlist *evlist);
 u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);
 
-- 
1.9.3



* [PATCH 03/10] perf, tools, report: Add infrastructure for a cycles histogram
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
  2015-05-10 13:51 ` [PATCH 01/10] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
  2015-05-10 13:51 ` [PATCH 02/10] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
@ 2015-05-10 13:51 ` Andi Kleen
  2015-05-10 13:52 ` [PATCH 04/10] perf, tools, report: Add processing for cycle histograms Andi Kleen
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:51 UTC
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

This adds the basic infrastructure to keep track of cycle counts
per basic block for annotate. We allocate an array similar to the
normal accounting, and then account branch cycles there.

We handle two cases:
- cycles per basic block with a known start
- cycles per branch
(these are later used for IPC and for plain cycles per BB, respectively)

In the start case we cannot handle overlaps, so the longest
basic block always wins.

In the cycles per branch case everything is accounted accurately.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-annotate.c |   1 +
 tools/perf/util/annotate.c    | 145 ++++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/annotate.h    |  17 +++++
 3 files changed, 157 insertions(+), 6 deletions(-)
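
The "longest basic block wins" rule from the commit message can be tried out
with the standalone sketch below. It is a trimmed-down illustration, not the
perf code: demo_cyc_hist and demo_account are hypothetical names, and the
symbol lookup, locking and allocation around the real function are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down version of the per-offset cycle histogram. */
struct demo_cyc_hist {
	uint64_t start;		/* offset where the longest block starts */
	uint64_t cycles;	/* cycles summed for that block only */
	uint64_t cycles_aggr;	/* cycles summed for every sample */
	uint32_t num;
	uint32_t num_aggr;
	uint8_t  have_start;
	uint16_t reset;		/* how often a shorter block was discarded */
};

/* Account one sample for the branch described by ch. */
static void demo_account(struct demo_cyc_hist *ch, uint64_t start,
			 unsigned cycles, unsigned have_start)
{
	if (!cycles)
		return;
	/* The per-branch totals are always accounted. */
	ch->num_aggr++;
	ch->cycles_aggr += cycles;

	/* A sample without a start never replaces a block that has one. */
	if (!have_start && ch->have_start)
		return;
	if (ch->num) {
		if (have_start && (!ch->have_start || ch->start > start)) {
			/* Longer block seen: drop what was recorded so far. */
			ch->have_start = 0;
			ch->cycles = 0;
			ch->num = 0;
			if (ch->reset < 0xffff)
				ch->reset++;
		} else if (have_start && ch->start < start) {
			/* Shorter block: ignore it for the per-block data. */
			return;
		}
	}
	ch->have_start = have_start;
	ch->start = start;
	ch->cycles += cycles;
	ch->num++;
}
```

A block that starts at a lower offset and ends at the same branch is the
longer one, so an earlier start replaces a later one and bumps the reset
counter, while cycles_aggr and num_aggr keep counting every sample.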

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 71bf745..52e7575 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -181,6 +181,7 @@ find_next:
 			 * symbol, free he->ms.sym->src to signal we already
 			 * processed this symbol.
 			 */
+			zfree(&notes->src->cycles_hist);
 			zfree(&notes->src);
 		}
 	}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 7f5bdfc..7701dfb 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -473,17 +473,85 @@ int symbol__alloc_hist(struct symbol *sym)
 	return 0;
 }
 
+/* The cycles histogram is lazily allocated. */
+static int symbol__alloc_hist_cycles(struct symbol *sym)
+{
+	struct annotation *notes = symbol__annotation(sym);
+	const size_t size = symbol__size(sym);
+
+	notes->src->cycles_hist = calloc(size, sizeof(struct cyc_hist));
+	if (notes->src->cycles_hist == NULL)
+		return -1;
+	return 0;
+}
+
 void symbol__annotate_zero_histograms(struct symbol *sym)
 {
 	struct annotation *notes = symbol__annotation(sym);
 
 	pthread_mutex_lock(&notes->lock);
-	if (notes->src != NULL)
+	if (notes->src != NULL) {
 		memset(notes->src->histograms, 0,
 		       notes->src->nr_histograms * notes->src->sizeof_sym_hist);
+		if (notes->src->cycles_hist)
+			memset(notes->src->cycles_hist, 0,
+				symbol__size(sym) * sizeof(struct cyc_hist));
+	}
 	pthread_mutex_unlock(&notes->lock);
 }
 
+static int __symbol__account_cycles(struct symbol *sym,
+				    struct annotation *notes,
+				    u64 start,
+				    unsigned offset, unsigned cycles,
+				    unsigned have_start)
+{
+	/*
+	 * If available record cycles of last basic block.
+	 */
+	if (cycles) {
+		struct cyc_hist *ch;
+
+		if (!notes->src->cycles_hist) {
+			if (symbol__alloc_hist_cycles(sym) < 0)
+				return -ENOMEM;
+		}
+		ch = notes->src->cycles_hist;
+		if (ch != NULL) {
+			/*
+			 * For now we can only account one basic block per
+			 * final jump. But multiple could be overlapping.
+			 * Always account the longest one. So when
+			 * a shorter one has been already seen throw it away.
+			 *
+			 * We separately always account the full cycles.
+			 */
+			ch[offset].num_aggr++;
+			ch[offset].cycles_aggr += cycles;
+			if (!have_start && ch[offset].have_start)
+				return 0;
+			if (ch[offset].num) {
+				if (have_start &&
+					(!ch[offset].have_start ||
+					ch[offset].start > start)) {
+					ch[offset].have_start = 0;
+					ch[offset].cycles = 0;
+					ch[offset].num = 0;
+					if (ch[offset].reset < 0xffff)
+						ch[offset].reset++;
+				} else if (have_start &&
+					   ch[offset].start < start)
+					return 0;
+			}
+			ch[offset].have_start = have_start;
+			ch[offset].start = start;
+			ch[offset].cycles += cycles;
+			ch[offset].num++;
+		}
+	}
+	return 0;
+}
+
 static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 				      struct annotation *notes, int evidx, u64 addr)
 {
@@ -506,6 +574,17 @@ static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 	return 0;
 }
 
+static struct annotation *symbol__get_annotation(struct symbol *sym)
+{
+	struct annotation *notes = symbol__annotation(sym);
+
+	if (notes->src == NULL) {
+		if (symbol__alloc_hist(sym) < 0)
+			return NULL;
+	}
+	return notes;
+}
+
 static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 				    int evidx, u64 addr)
 {
@@ -513,14 +592,68 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 
 	if (sym == NULL)
 		return 0;
+	notes = symbol__get_annotation(sym);
+	if (notes == NULL)
+		return -ENOMEM;
+	return __symbol__inc_addr_samples(sym, map, notes, evidx, addr);
+}
 
-	notes = symbol__annotation(sym);
-	if (notes->src == NULL) {
-		if (symbol__alloc_hist(sym) < 0)
-			return -ENOMEM;
+static int symbol__account_cycles(u64 addr, u64 start,
+				  struct symbol *sym, unsigned cycles)
+{
+	struct annotation *notes;
+	unsigned offset;
+
+	if (sym == NULL)
+		return 0;
+	notes = symbol__get_annotation(sym);
+	if (notes == NULL)
+		return -ENOMEM;
+	if (addr < sym->start || addr >= sym->end)
+		return -ERANGE;
+
+	if (start) {
+		if (start < sym->start || start >= sym->end)
+			return -ERANGE;
+		if (start >= addr)
+			start = 0;
 	}
+	offset = addr - sym->start;
+	return __symbol__account_cycles(sym, notes,
+					start ? start - sym->start : 0,
+					offset, cycles,
+					!!start);
+}
 
-	return __symbol__inc_addr_samples(sym, map, notes, evidx, addr);
+int addr_map_symbol__account_cycles(struct addr_map_symbol *ams,
+				    struct addr_map_symbol *start,
+				    unsigned cycles)
+{
+	unsigned long saddr = 0;
+	int err;
+
+	/*
+	 * Only set start when IPC can be computed. We can only
+	 * compute it when the basic block is completely in a single
+	 * function.
+	 * Special case the case when the jump is elsewhere, but
+	 * it starts on the function start.
+	 */
+	if (start &&
+		(start->sym == ams->sym ||
+		 (ams->sym &&
+		   start->addr == ams->sym->start + ams->map->start)))
+		saddr = start->al_addr;
+	if (saddr == 0)
+		pr_debug2("BB with bad start: addr %lx start %lx sym %lx saddr %lx\n",
+			ams->addr,
+			start ? start->addr : 0,
+			ams->sym ? ams->sym->start + ams->map->start : 0,
+			saddr);
+	err = symbol__account_cycles(ams->al_addr, saddr, ams->sym, cycles);
+	if (err)
+		pr_debug2("account_cycles failed %d\n", err);
+	return err;
 }
 
 int addr_map_symbol__inc_samples(struct addr_map_symbol *ams, int evidx)
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index cadbdc9..9080181 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -79,6 +79,17 @@ struct sym_hist {
 	u64		addr[0];
 };
 
+struct cyc_hist {
+	u64	start;
+	u64	cycles;
+	u64	cycles_aggr;
+	u32	num;
+	u32	num_aggr;
+	u8	have_start;
+	/* 1 byte padding */
+	u16	reset;
+};
+
 struct source_line_percent {
 	double		percent;
 	double		percent_sum;
@@ -96,6 +107,7 @@ struct source_line {
  * @histogram: Array of addr hit histograms per event being monitored
  * @lines: If 'print_lines' is specified, per source code line percentages
  * @source: source parsed from a disassembler like objdump -dS
+ * @cycles_hist: Average cycles per basic block
  *
  * lines is allocated, percentages calculated and all sorted by percentage
  * when the annotation is about to be presented, so the percentages are for
@@ -108,6 +120,7 @@ struct annotated_source {
 	struct source_line *lines;
 	int    		   nr_histograms;
 	int    		   sizeof_sym_hist;
+	struct cyc_hist	   *cycles_hist;
 	struct sym_hist	   histograms[0];
 };
 
@@ -129,6 +142,10 @@ static inline struct annotation *symbol__annotation(struct symbol *sym)
 
 int addr_map_symbol__inc_samples(struct addr_map_symbol *ams, int evidx);
 
+int addr_map_symbol__account_cycles(struct addr_map_symbol *ams,
+				    struct addr_map_symbol *start,
+				    unsigned cycles);
+
 int hist_entry__inc_addr_samples(struct hist_entry *he, int evidx, u64 addr);
 
 int symbol__alloc_hist(struct symbol *sym);
-- 
1.9.3



* [PATCH 04/10] perf, tools, report: Add processing for cycle histograms
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (2 preceding siblings ...)
  2015-05-10 13:51 ` [PATCH 03/10] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
@ 2015-05-10 13:52 ` Andi Kleen
  2015-05-26 10:09   ` Jiri Olsa
  2015-05-10 13:52 ` [PATCH 05/10] perf, tools: Compute IPC and basic block cycles for annotate Andi Kleen
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:52 UTC
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Call the cycle histogram infrastructure added earlier from the perf report
hist iter callback. For this we walk the branch records.

This makes cycle histograms available when browsing perf report annotate.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-report.c |  4 +++-
 tools/perf/util/hist.c      | 33 +++++++++++++++++++++++++++++++++
 tools/perf/util/hist.h      |  3 +++
 3 files changed, 39 insertions(+), 1 deletion(-)
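
The walk over the branch records can be sketched on its own as below. This
is an illustration under assumptions, not the perf code: demo_branch,
demo_block and demo_walk are hypothetical names, addresses are plain
integers, and a start of 0 stands for "start unknown, no IPC".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified branch record; the newest entry is stored first. */
struct demo_branch {
	uint64_t from;
	uint64_t to;
	unsigned cycles;
};

/* One accounted basic block: ends at 'end', began at 'start' (0 = unknown). */
struct demo_block {
	uint64_t start;
	uint64_t end;
	unsigned cycles;
};

/*
 * Walk the stack from the oldest entry (last index) to the newest (index 0).
 * The target of the previous branch is the start of the block that the
 * current branch terminates. Returns the number of blocks written to out,
 * which must have room for nr entries.
 */
static size_t demo_walk(const struct demo_branch *bs, size_t nr,
			int nonstd_branch_mode, struct demo_block *out)
{
	uint64_t prev_to = 0;
	size_t n = 0;
	size_t i;

	for (i = nr; i-- > 0; ) {
		out[n].start = nonstd_branch_mode ? 0 : prev_to;
		out[n].end = bs[i].from;
		out[n].cycles = bs[i].cycles;
		n++;
		prev_to = bs[i].to;
	}
	return n;
}
```

The oldest record has no predecessor, so its block start stays unknown. In a
non-standard branch mode every start is forced to unknown, because filtered
branch records no longer delimit basic blocks.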

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3b35f1e..c89b51a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -103,6 +103,9 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
 	if (!ui__has_annotation())
 		return 0;
 
+	hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
+			     rep->nonstd_branch_mode);
+
 	if (sort__mode == SORT_MODE__BRANCH) {
 		bi = he->branch_info;
 		err = addr_map_symbol__inc_samples(&bi->from, evsel->idx);
@@ -110,7 +113,6 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
 			goto out;
 
 		err = addr_map_symbol__inc_samples(&bi->to, evsel->idx);
-
 	} else if (rep->mem_mode) {
 		mi = he->mem_info;
 		err = addr_map_symbol__inc_samples(&mi->daddr, evsel->idx);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 302fc05..cf6b48b 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1412,6 +1412,39 @@ int hists__link(struct hists *leader, struct hists *other)
 	return 0;
 }
 
+void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
+			  struct perf_sample *sample, bool nonstd_branch_mode)
+{
+	struct branch_info *bi;
+
+	/* If we have branch cycles always annotate them. */
+	if (bs && bs->nr && bs->entries[0].flags.cycles) {
+		int i;
+
+		bi = sample__resolve_bstack(sample, al);
+		if (bi) {
+			struct addr_map_symbol *prev = NULL;
+
+			/*
+			 * Ignore errors, still want to process the
+			 * other entries.
+			 *
+			 * For non standard branch modes always
+			 * force no IPC (prev == NULL)
+			 *
+			 * Note that perf stores branches reversed from
+			 * program order!
+			 */
+			for (i = bs->nr - 1; i >= 0; i--) {
+				addr_map_symbol__account_cycles(&bi[i].from,
+					nonstd_branch_mode ? NULL : prev,
+					bi[i].flags.cycles);
+				prev = &bi[i].to;
+			}
+			free(bi);
+		}
+	}
+}
 
 size_t perf_evlist__fprintf_nr_events(struct perf_evlist *evlist, FILE *fp)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index b55c904..88bb041 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -351,6 +351,9 @@ static inline int script_browse(const char *script_opt __maybe_unused)
 
 unsigned int hists__sort_list_width(struct hists *hists);
 
+void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
+			  struct perf_sample *sample, bool nonstd_branch_mode);
+
 struct option;
 int parse_filter_percentage(const struct option *opt __maybe_unused,
 			    const char *arg, int unset __maybe_unused);
-- 
1.9.3



* [PATCH 05/10] perf, tools: Compute IPC and basic block cycles for annotate
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (3 preceding siblings ...)
  2015-05-10 13:52 ` [PATCH 04/10] perf, tools, report: Add processing for cycle histograms Andi Kleen
@ 2015-05-10 13:52 ` Andi Kleen
  2015-05-10 13:52 ` [PATCH 06/10] perf, tools, annotate: Finally display IPC and cycle accounting Andi Kleen
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:52 UTC
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Compute the IPC and the basic block cycles for the annotate display.

IPC is computed by counting the instructions in the basic block and
dividing that count by the average cycles accounted for the block.

The actual IPC computation can only be done at annotate time,
because we need to parse the objdump output first to know
the number of instructions in the basic block.

The cycles/IPC are also put into the perf function annotation
so that the display code can show them.

Again, basic block overlaps are not handled: the longest block wins,
but there are some heuristics to hide the IPC when the longest is not
the most common.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/ui/browsers/annotate.c | 76 ++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/annotate.h        |  2 ++
 2 files changed, 77 insertions(+), 1 deletion(-)
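
As a worked illustration of the computation described in the commit message,
the sketch below counts instructions in a byte range and derives IPC using
the usual definition (instructions divided by cycles). It is a standalone
approximation: demo_count_insn and demo_ipc are hypothetical names, and the
insn_at array is an assumed stand-in for the browser's per byte offsets
array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count instructions in [start, end]. insn_at[off] is nonzero when an
 * instruction begins at byte offset off.
 */
static unsigned demo_count_insn(const uint8_t *insn_at, size_t start,
				size_t end)
{
	unsigned n = 0;
	size_t off;

	for (off = start; off <= end; off++)
		if (insn_at[off])
			n++;
	return n;
}

/*
 * IPC of a basic block: instruction count divided by the average cycles
 * per execution. Returns 0 when there is no data or when too many
 * shorter blocks were discarded for the result to be trusted.
 */
static double demo_ipc(unsigned n_insn, uint64_t cycles, uint32_t num,
		       uint16_t reset)
{
	if (!n_insn || !num || !cycles)
		return 0.0;
	if (reset >= 0x7fff || reset >= num / 2)
		return 0.0;
	return n_insn / ((double)cycles / (double)num);
}
```

For example, a block of 4 instructions that accumulated 8 cycles over 4
executions averages 2 cycles per execution, giving an IPC of 2.0. Note that
with the reset >= num / 2 threshold a block seen only once is always hidden.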

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index e5250eb..6c135b5 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -47,6 +47,7 @@ struct annotate_browser {
 	int		    max_jump_sources;
 	int		    nr_jumps;
 	bool		    searching_backwards;
+	bool		    have_cycles;
 	u8		    addr_width;
 	u8		    jumps_width;
 	u8		    target_width;
@@ -376,7 +377,7 @@ static void annotate_browser__calc_percent(struct annotate_browser *browser,
 				max_percent = bpos->percent[i];
 		}
 
-		if (max_percent < 0.01) {
+		if (max_percent < 0.01 && pos->ipc == 0) {
 			RB_CLEAR_NODE(&bpos->rb_node);
 			continue;
 		}
@@ -841,6 +842,78 @@ int hist_entry__tui_annotate(struct hist_entry *he, struct perf_evsel *evsel,
 	return map_symbol__tui_annotate(&he->ms, evsel, hbt);
 }
 
+
+static unsigned count_insn(struct annotate_browser *browser, u64 start, u64 end)
+{
+	unsigned n_insn = 0;
+	u64 offset;
+
+	for (offset = start; offset <= end; offset++) {
+		if (browser->offsets[offset])
+			n_insn++;
+	}
+	return n_insn;
+}
+
+static void count_and_fill(struct annotate_browser *browser, u64 start, u64 end,
+			   struct cyc_hist *ch)
+{
+	unsigned n_insn;
+	u64 offset;
+
+	n_insn = count_insn(browser, start, end);
+	if (n_insn && ch->num) {
+		float ipc = n_insn / ((double)ch->cycles / (double)ch->num);
+
+		if (ipc < 0.01)
+			pr_debug2("bogus ipc %f cyc:%lu num:%u n_insn:%u\n",
+					ipc, ch->cycles, ch->num, n_insn);
+		/* Hide data when there are too many overlaps. */
+		if (ch->reset >= 0x7fff || ch->reset >= ch->num / 2)
+			return;
+
+		for (offset = start; offset <= end; offset++) {
+			struct disasm_line *dl = browser->offsets[offset];
+
+			if (dl)
+				dl->ipc = ipc;
+		}
+	}
+}
+
+/*
+ * This should probably be in util/annotate.c to share with the tty
+ * annotate, but right now we need the per byte offsets arrays,
+ * which are only here.
+ */
+static void annotate__compute_ipc(struct annotate_browser *browser, size_t size,
+			   struct symbol *sym)
+{
+	u64 offset;
+	struct annotation *notes = symbol__annotation(sym);
+
+	if (!notes->src || !notes->src->cycles_hist)
+		return;
+
+	pthread_mutex_lock(&notes->lock);
+	for (offset = 0; offset < size; ++offset) {
+		struct cyc_hist *ch;
+
+		ch = &notes->src->cycles_hist[offset];
+		if (ch && ch->cycles) {
+			struct disasm_line *dl;
+
+			if (ch->have_start)
+				count_and_fill(browser, ch->start, offset, ch);
+			dl = browser->offsets[offset];
+			if (dl && ch->num_aggr)
+				dl->cycles = ch->cycles_aggr / ch->num_aggr;
+			browser->have_cycles = true;
+		}
+	}
+	pthread_mutex_unlock(&notes->lock);
+}
+
 static void annotate_browser__mark_jump_targets(struct annotate_browser *browser,
 						size_t size)
 {
@@ -962,6 +1035,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
 	}
 
 	annotate_browser__mark_jump_targets(&browser, size);
+	annotate__compute_ipc(&browser, size, sym);
 
 	browser.addr_width = browser.target_width = browser.min_addr_width = hex_width(size);
 	browser.max_addr_width = hex_width(sym->end);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 9080181..be6bafa 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -59,6 +59,8 @@ struct disasm_line {
 	char		    *name;
 	struct ins	    *ins;
 	int		    line_nr;
+	float		    ipc;
+	u64		    cycles;
 	struct ins_operands ops;
 };
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 06/10] perf, tools, annotate: Finally display IPC and cycle accounting
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (4 preceding siblings ...)
  2015-05-10 13:52 ` [PATCH 05/10] perf, tools: Compute IPC and basic block cycles for annotate Andi Kleen
@ 2015-05-10 13:52 ` Andi Kleen
  2015-05-10 13:52 ` [PATCH 07/10] perf, tools, report: Move branch option parsing to own file Andi Kleen
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:52 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add two new columns to the annotate display, showing the average cycles
and the computed IPC when available.

When the LBR was not recording in any-branch mode, the IPC
computation is automatically disabled. We still display
the cycle information.

Example output (with made up numbers):

The second column is the IPC and the third is the average cycles.

                 │    __attribute__((noinline)) f2()
                 │    {
  5.15 13.67     │       push   %rbp
  0.01 13.67     │       mov    %rsp,%rbp
                 │            c = a / b;
  9.87 13.67     │       mov    a,%eax
       13.67     │       mov    b,%ecx
       13.67     │       cltd
  4.92 13.67  123│       idiv   %ecx
 70.79 13.67     │       mov    %eax,__TMC_END__
                 │    }
  9.25 13.67     │       pop    %rbp
  0.01 13.67  123│     ← retq
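
As a rough illustration of how the two columns can be derived and laid
out, here is a minimal standalone sketch. It is not the code from this
patch: struct block_cycles, block_ipc() and format_cycle_columns() are
hypothetical names, and only the two width constants mirror the patch.
It assumes IPC is the instruction count of a basic block divided by the
average cycles per pass through that block.

```c
#include <assert.h>
#include <stdio.h>

#define IPC_WIDTH 6
#define CYCLES_WIDTH 6

/* Hypothetical stand-in for the per-block cycle histogram. */
struct block_cycles {
	unsigned long long cycles;	/* cycles summed over all samples */
	unsigned int num;		/* number of samples */
};

/* Instructions per cycle: instructions / average cycles per pass. */
static double block_ipc(const struct block_cycles *bc, unsigned int n_insn)
{
	if (!bc->num || !bc->cycles)
		return 0.0;
	return n_insn / ((double)bc->cycles / bc->num);
}

/*
 * Format the IPC and cycles columns into buf. A zero value is shown as
 * blank padding so the columns stay aligned. Returns the number of
 * characters written, or -1 if buf is too small.
 */
static int format_cycle_columns(char *buf, size_t len, double ipc,
				unsigned long long avg_cycles)
{
	int n, m;

	assert(buf != NULL);
	if (ipc)
		n = snprintf(buf, len, "%*.2f ", IPC_WIDTH - 1, ipc);
	else
		n = snprintf(buf, len, "%*s", IPC_WIDTH, "");
	if (n < 0 || (size_t)n >= len)
		return -1;

	if (avg_cycles)
		m = snprintf(buf + n, len - n, "%*llu ",
			     CYCLES_WIDTH - 1, avg_cycles);
	else
		m = snprintf(buf + n, len - n, "%*s", CYCLES_WIDTH, "");
	if (m < 0 || (size_t)m >= len - n)
		return -1;
	return n + m;
}
```

Each column always occupies its full width, whether or not a value is
present, which is why the percent width can simply grow by
IPC_WIDTH + CYCLES_WIDTH when cycle data exists.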

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/ui/browsers/annotate.c | 50 +++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 6c135b5..af22061 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -11,6 +11,9 @@
 #include "../../util/evsel.h"
 #include <pthread.h>
 
+#define IPC_WIDTH 6
+#define CYCLES_WIDTH 6
+
 struct browser_disasm_line {
 	struct rb_node	rb_node;
 	u32		idx;
@@ -91,6 +94,15 @@ static int annotate_browser__set_jumps_percent_color(struct annotate_browser *br
 	 return ui_browser__set_color(&browser->b, color);
 }
 
+static int annotate_browser__pcnt_width(struct annotate_browser *ab)
+{
+	int w = 7 * ab->nr_events;
+
+	if (ab->have_cycles)
+		w += IPC_WIDTH + CYCLES_WIDTH;
+	return w;
+}
+
 static void annotate_browser__write(struct ui_browser *browser, void *entry, int row)
 {
 	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
@@ -101,7 +113,7 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 			     (!current_entry || (browser->use_navkeypressed &&
 					         !browser->navkeypressed)));
 	int width = browser->width, printed;
-	int i, pcnt_width = 7 * ab->nr_events;
+	int i, pcnt_width = annotate_browser__pcnt_width(ab);
 	double percent_max = 0.0;
 	char bf[256];
 
@@ -110,11 +122,29 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 			percent_max = bdl->percent[i];
 	}
 
-	if (dl->offset != -1 && percent_max != 0.0) {
-		for (i = 0; i < ab->nr_events; i++) {
-			ui_browser__set_percent_color(browser, bdl->percent[i],
-						      current_entry);
-			slsmg_printf("%6.2f ", bdl->percent[i]);
+	if (dl->offset != -1) {
+		if (percent_max != 0.0) {
+			for (i = 0; i < ab->nr_events; i++) {
+				ui_browser__set_percent_color(browser,
+							      bdl->percent[i],
+							      current_entry);
+				slsmg_printf("%6.2f ", bdl->percent[i]);
+			}
+		} else {
+			slsmg_write_nstring(" ", 7 * ab->nr_events);
+		}
+
+		if (ab->have_cycles) {
+			ui_browser__set_color(browser, HE_COLORSET_NORMAL);
+			if (dl->ipc)
+				slsmg_printf("%*.2f ", IPC_WIDTH - 1, dl->ipc);
+			else
+				slsmg_write_nstring(" ", IPC_WIDTH);
+			if (dl->cycles)
+				slsmg_printf("%*" PRIu64 " ",
+						CYCLES_WIDTH - 1, dl->cycles);
+			else
+				slsmg_write_nstring(" ", CYCLES_WIDTH);
 		}
 	} else {
 		ui_browser__set_percent_color(browser, 0, current_entry);
@@ -221,7 +251,7 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 	unsigned int from, to;
 	struct map_symbol *ms = ab->b.priv;
 	struct symbol *sym = ms->sym;
-	u8 pcnt_width = 7;
+	u8 pcnt_width = annotate_browser__pcnt_width(ab);
 
 	/* PLT symbols contain external offsets */
 	if (strstr(sym->name, "@plt"))
@@ -245,8 +275,6 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 		to = (u64)btarget->idx;
 	}
 
-	pcnt_width *= ab->nr_events;
-
 	ui_browser__set_color(browser, HE_COLORSET_CODE);
 	__ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width,
 				 from, to);
@@ -256,9 +284,7 @@ static unsigned int annotate_browser__refresh(struct ui_browser *browser)
 {
 	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
 	int ret = ui_browser__list_head_refresh(browser);
-	int pcnt_width;
-
-	pcnt_width = 7 * ab->nr_events;
+	int pcnt_width = annotate_browser__pcnt_width(ab);
 
 	if (annotate_browser__opts.jump_arrows)
 		annotate_browser__draw_current_jump(browser);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 07/10] perf, tools, report: Move branch option parsing to own file
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (5 preceding siblings ...)
  2015-05-10 13:52 ` [PATCH 06/10] perf, tools, annotate: Finally display IPC and cycle accounting Andi Kleen
@ 2015-05-10 13:52 ` Andi Kleen
  2015-05-26 10:09   ` Jiri Olsa
  2015-05-10 13:52 ` [PATCH 08/10] perf, tools, top: Add branch annotation code to top Andi Kleen
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:52 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

.. to allow sharing between builtin-record and builtin-top later.
No code changes, just moved code.
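
For readers who want to try the parsing logic outside the perf tree,
the following is a trimmed standalone sketch of what the moved function
does. The DEMO_* bit values and the demo_* names are made up for
illustration; the real code uses the PERF_SAMPLE_BRANCH_* constants and
takes a struct option.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Illustrative bit values only, not the real PERF_SAMPLE_BRANCH_* ones. */
enum {
	DEMO_BRANCH_USER	= 1 << 0,
	DEMO_BRANCH_KERNEL	= 1 << 1,
	DEMO_BRANCH_HV		= 1 << 2,
	DEMO_BRANCH_ANY		= 1 << 3,
	DEMO_BRANCH_ANY_RETURN	= 1 << 4,
};

#define DEMO_ONLY_PLM \
	(DEMO_BRANCH_USER | DEMO_BRANCH_KERNEL | DEMO_BRANCH_HV)

struct demo_branch_mode {
	const char *name;
	int mode;
};

static const struct demo_branch_mode demo_modes[] = {
	{ "u", DEMO_BRANCH_USER },
	{ "k", DEMO_BRANCH_KERNEL },
	{ "hv", DEMO_BRANCH_HV },
	{ "any", DEMO_BRANCH_ANY },
	{ "any_ret", DEMO_BRANCH_ANY_RETURN },
	{ NULL, 0 },
};

/* Parse "a,b,c" into *mode. Returns 0 on success, -1 on error. */
static int demo_parse_branch_filter(const char *str, uint64_t *mode)
{
	const struct demo_branch_mode *br;
	char *s, *copy = NULL, *p;
	int ret = -1;

	assert(mode != NULL);
	if (*mode)		/* cannot set it twice */
		return -1;

	if (str) {		/* str may be NULL when no argument is given */
		s = copy = strdup(str);	/* str itself is read-only */
		if (!s)
			return -1;
		for (;;) {
			p = strchr(s, ',');
			if (p)
				*p = '\0';
			for (br = demo_modes; br->name; br++) {
				if (!strcasecmp(s, br->name))
					break;
			}
			if (!br->name)
				goto out;	/* unknown filter name */
			*mode |= br->mode;
			if (!p)
				break;
			s = p + 1;
		}
	}
	ret = 0;

	/* No branch type given: default to any branch. */
	if ((*mode & ~(uint64_t)DEMO_ONLY_PLM) == 0)
		*mode = DEMO_BRANCH_ANY;
out:
	free(copy);
	return ret;
}
```

With these demo values, "any_ret,u,k" yields the OR of the three bits,
while "u" alone (or a NULL string) falls back to the "any" mode.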

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-record.c | 89 +------------------------------------------
 tools/perf/util/Build       |  1 +
 tools/perf/util/branch.c    | 93 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/branch.h    |  2 +
 4 files changed, 97 insertions(+), 88 deletions(-)
 create mode 100644 tools/perf/util/branch.c
 create mode 100644 tools/perf/util/branch.h

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5dfe913..c513620 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -28,6 +28,7 @@
 #include "util/thread_map.h"
 #include "util/data.h"
 #include "util/auxtrace.h"
+#include "util/branch.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -751,94 +752,6 @@ out_delete_session:
 	return status;
 }
 
-#define BRANCH_OPT(n, m) \
-	{ .name = n, .mode = (m) }
-
-#define BRANCH_END { .name = NULL }
-
-struct branch_mode {
-	const char *name;
-	int mode;
-};
-
-static const struct branch_mode branch_modes[] = {
-	BRANCH_OPT("u", PERF_SAMPLE_BRANCH_USER),
-	BRANCH_OPT("k", PERF_SAMPLE_BRANCH_KERNEL),
-	BRANCH_OPT("hv", PERF_SAMPLE_BRANCH_HV),
-	BRANCH_OPT("any", PERF_SAMPLE_BRANCH_ANY),
-	BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
-	BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
-	BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
-	BRANCH_OPT("abort_tx", PERF_SAMPLE_BRANCH_ABORT_TX),
-	BRANCH_OPT("in_tx", PERF_SAMPLE_BRANCH_IN_TX),
-	BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
-	BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
-	BRANCH_END
-};
-
-static int
-parse_branch_stack(const struct option *opt, const char *str, int unset)
-{
-#define ONLY_PLM \
-	(PERF_SAMPLE_BRANCH_USER	|\
-	 PERF_SAMPLE_BRANCH_KERNEL	|\
-	 PERF_SAMPLE_BRANCH_HV)
-
-	uint64_t *mode = (uint64_t *)opt->value;
-	const struct branch_mode *br;
-	char *s, *os = NULL, *p;
-	int ret = -1;
-
-	if (unset)
-		return 0;
-
-	/*
-	 * cannot set it twice, -b + --branch-filter for instance
-	 */
-	if (*mode)
-		return -1;
-
-	/* str may be NULL in case no arg is passed to -b */
-	if (str) {
-		/* because str is read-only */
-		s = os = strdup(str);
-		if (!s)
-			return -1;
-
-		for (;;) {
-			p = strchr(s, ',');
-			if (p)
-				*p = '\0';
-
-			for (br = branch_modes; br->name; br++) {
-				if (!strcasecmp(s, br->name))
-					break;
-			}
-			if (!br->name) {
-				ui__warning("unknown branch filter %s,"
-					    " check man page\n", s);
-				goto error;
-			}
-
-			*mode |= br->mode;
-
-			if (!p)
-				break;
-
-			s = p + 1;
-		}
-	}
-	ret = 0;
-
-	/* default to any branch */
-	if ((*mode & ~ONLY_PLM) == 0) {
-		*mode = PERF_SAMPLE_BRANCH_ANY;
-	}
-error:
-	free(os);
-	return ret;
-}
-
 static void callchain_debug(void)
 {
 	static const char *str[CALLCHAIN_MAX] = { "NONE", "FP", "DWARF", "LBR" };
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index d552203..4ee4649 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -75,6 +75,7 @@ libperf-$(CONFIG_X86) += tsc.o
 libperf-y += cloexec.o
 libperf-y += thread-stack.o
 libperf-$(CONFIG_AUXTRACE) += auxtrace.o
+libperf-y += branch.o
 
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
 libperf-$(CONFIG_LIBELF) += probe-event.o
diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
new file mode 100644
index 0000000..1555064
--- /dev/null
+++ b/tools/perf/util/branch.c
@@ -0,0 +1,93 @@
+#include "perf.h"
+#include "util/util.h"
+#include "util/debug.h"
+#include "util/parse-options.h"
+#include "util/branch.h"
+
+#define BRANCH_OPT(n, m) \
+	{ .name = n, .mode = (m) }
+
+#define BRANCH_END { .name = NULL }
+
+struct branch_mode {
+	const char *name;
+	int mode;
+};
+
+static const struct branch_mode branch_modes[] = {
+	BRANCH_OPT("u", PERF_SAMPLE_BRANCH_USER),
+	BRANCH_OPT("k", PERF_SAMPLE_BRANCH_KERNEL),
+	BRANCH_OPT("hv", PERF_SAMPLE_BRANCH_HV),
+	BRANCH_OPT("any", PERF_SAMPLE_BRANCH_ANY),
+	BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
+	BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
+	BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+	BRANCH_OPT("abort_tx", PERF_SAMPLE_BRANCH_ABORT_TX),
+	BRANCH_OPT("in_tx", PERF_SAMPLE_BRANCH_IN_TX),
+	BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
+	BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
+	BRANCH_END
+};
+
+int
+parse_branch_stack(const struct option *opt, const char *str, int unset)
+{
+#define ONLY_PLM \
+	(PERF_SAMPLE_BRANCH_USER	|\
+	 PERF_SAMPLE_BRANCH_KERNEL	|\
+	 PERF_SAMPLE_BRANCH_HV)
+
+	uint64_t *mode = (uint64_t *)opt->value;
+	const struct branch_mode *br;
+	char *s, *os = NULL, *p;
+	int ret = -1;
+
+	if (unset)
+		return 0;
+
+	/*
+	 * cannot set it twice, -b + --branch-filter for instance
+	 */
+	if (*mode)
+		return -1;
+
+	/* str may be NULL in case no arg is passed to -b */
+	if (str) {
+		/* because str is read-only */
+		s = os = strdup(str);
+		if (!s)
+			return -1;
+
+		for (;;) {
+			p = strchr(s, ',');
+			if (p)
+				*p = '\0';
+
+			for (br = branch_modes; br->name; br++) {
+				if (!strcasecmp(s, br->name))
+					break;
+			}
+			if (!br->name) {
+				ui__warning("unknown branch filter %s,"
+					    " check man page\n", s);
+				goto error;
+			}
+
+			*mode |= br->mode;
+
+			if (!p)
+				break;
+
+			s = p + 1;
+		}
+	}
+	ret = 0;
+
+	/* default to any branch */
+	if ((*mode & ~ONLY_PLM) == 0) {
+		*mode = PERF_SAMPLE_BRANCH_ANY;
+	}
+error:
+	free(os);
+	return ret;
+}
diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
new file mode 100644
index 0000000..66fd619
--- /dev/null
+++ b/tools/perf/util/branch.h
@@ -0,0 +1,2 @@
+struct option;
+int parse_branch_stack(const struct option *opt, const char *str, int unset);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 08/10] perf, tools, top: Add branch annotation code to top
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (6 preceding siblings ...)
  2015-05-10 13:52 ` [PATCH 07/10] perf, tools, report: Move branch option parsing to own file Andi Kleen
@ 2015-05-10 13:52 ` Andi Kleen
  2015-05-10 13:52 ` [PATCH 09/10] perf, tools, report: Display cycles in branch sort mode Andi Kleen
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:52 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Now that we can process branch data in annotate it makes sense to support
enabling branch recording from top too. Most of the code needed for
this is already in shared code with report. But we need to add:

- The option parsing code (using shared code from the previous patch)
- Document the options
- Set up the IPC/cycles accounting state in the top session
- Call the accounting code in the hist iter callback

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-top.txt | 21 +++++++++++++++++++++
 tools/perf/builtin-top.c              | 10 +++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 9e5b07eb..fc2bffd 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -202,6 +202,27 @@ Default is to monitor all CPUS.
 	readability.  0 means no limit (default behavior).
 
 
+-b::
+--branch-any::
+	Enable taken branch stack sampling. Any type of taken branch may be sampled.
+	This is a shortcut for --branch-filter any. See --branch-filter for more information.
+
+-j::
+--branch-filter::
+	Enable taken branch stack sampling. Each sample captures a series of consecutive
+	taken branches. The number of branches captured with each sample depends on the
+	underlying hardware, the type of branches of interest, and the executed code.
+	It is possible to select the types of branches captured by enabling filters.
+	For a full list of modifiers please see the perf record manpage.
+
+	The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
+	The privilege levels may be omitted, in which case, the privilege levels of the associated
+	event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege
+	levels are subject to permissions.  When sampling on multiple events, branch stack sampling
+	is enabled for all the sampling events. The sampled branch type is the same for all events.
+	The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
+	Note that this feature may not be available on all processors.
+
 INTERACTIVE PROMPTING KEYS
 --------------------------
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 1cb3436..11750da 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -40,6 +40,7 @@
 #include "util/xyarray.h"
 #include "util/sort.h"
 #include "util/intlist.h"
+#include "util/branch.h"
 #include "arch/common.h"
 
 #include "util/debug.h"
@@ -687,6 +688,8 @@ static int hist_iter__top_callback(struct hist_entry_iter *iter,
 		perf_top__record_precise_ip(top, he, evsel->idx, ip);
 	}
 
+	hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
+		     !(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY));
 	return 0;
 }
 
@@ -923,7 +926,6 @@ static int perf_top__setup_sample_type(struct perf_top *top __maybe_unused)
 			return -EINVAL;
 		}
 	}
-
 	return 0;
 }
 
@@ -1159,6 +1161,12 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_STRING('w', "column-widths", &symbol_conf.col_width_list_str,
 		   "width[,width...]",
 		   "don't try to adjust column width, use these fixed values"),
+	OPT_CALLBACK_NOOPT('b', "branch-any", &opts->branch_stack,
+		     "branch any", "sample any taken branches",
+		     parse_branch_stack),
+	OPT_CALLBACK('j', "branch-filter", &opts->branch_stack,
+		     "branch filter mask", "branch stack filter modes",
+		     parse_branch_stack),
 	OPT_END()
 	};
 	const char * const top_usage[] = {
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 09/10] perf, tools, report: Display cycles in branch sort mode
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (7 preceding siblings ...)
  2015-05-10 13:52 ` [PATCH 08/10] perf, tools, top: Add branch annotation code to top Andi Kleen
@ 2015-05-10 13:52 ` Andi Kleen
  2015-05-10 13:52 ` [PATCH 10/10] test patch: Add fake branch cycles to input data in report/top Andi Kleen
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:52 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Display the cycles by default in branch sort mode.

To make enough room for the new column I removed dso_to. It is usually
redundant with dso_from.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/sort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 03d8e6e..3022807 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -9,7 +9,7 @@ regex_t		parent_regex;
 const char	default_parent_pattern[] = "^sys_|^do_page_fault";
 const char	*parent_pattern = default_parent_pattern;
 const char	default_sort_order[] = "comm,dso,symbol";
-const char	default_branch_sort_order[] = "comm,dso_from,symbol_from,dso_to,symbol_to";
+const char	default_branch_sort_order[] = "comm,dso_from,symbol_from,symbol_to,cycles";
 const char	default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked";
 const char	default_top_sort_order[] = "dso,symbol";
 const char	default_diff_sort_order[] = "dso,symbol";
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 10/10] test patch: Add fake branch cycles to input data in report/top
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (8 preceding siblings ...)
  2015-05-10 13:52 ` [PATCH 09/10] perf, tools, report: Display cycles in branch sort mode Andi Kleen
@ 2015-05-10 13:52 ` Andi Kleen
  2015-05-18 19:06 ` Cycles annotation support for perf tools Andi Kleen
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-10 13:52 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Not to be merged, but useful for testing if you don't have
hardware with cycles branch stack support.
---
 tools/perf/util/hist.c    | 2 +-
 tools/perf/util/machine.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index cf6b48b..8acac89 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1418,7 +1418,7 @@ void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
 	struct branch_info *bi;
 
 	/* If we have branch cycles always annotate them. */
-	if (bs && bs->nr && bs->entries[0].flags.cycles) {
+	if (bs && bs->nr /* && bs->entries[0].flags.cycles */) {
 		int i;
 
 		bi = sample__resolve_bstack(sample, al);
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 2f47110..72edc4e 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1498,6 +1498,8 @@ struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
 		ip__resolve_ams(al->thread, &bi[i].to, bs->entries[i].to);
 		ip__resolve_ams(al->thread, &bi[i].from, bs->entries[i].from);
 		bi[i].flags = bs->entries[i].flags;
+		if (bi[i].flags.cycles == 0)
+			bi[i].flags.cycles = 123;
 	}
 	return bi;
 }
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: Cycles annotation support for perf tools
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (9 preceding siblings ...)
  2015-05-10 13:52 ` [PATCH 10/10] test patch: Add fake branch cycles to input data in report/top Andi Kleen
@ 2015-05-18 19:06 ` Andi Kleen
  2015-05-19  9:55   ` Jiri Olsa
  2015-05-24 18:55 ` Jiri Olsa
  2015-05-26 10:08 ` Jiri Olsa
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2015-05-18 19:06 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, namhyung, eranian

Andi Kleen <andi@firstfloor.org> writes:

> The upcoming Skylake CPU has a new timed branch stack feature,
> that reports cycle counts for individual branches in the
> last branch record.

Any comments on this patchkit?

Thanks,
-Andi

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Cycles annotation support for perf tools
  2015-05-18 19:06 ` Cycles annotation support for perf tools Andi Kleen
@ 2015-05-19  9:55   ` Jiri Olsa
  0 siblings, 0 replies; 23+ messages in thread
From: Jiri Olsa @ 2015-05-19  9:55 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, linux-kernel, namhyung, eranian

On Mon, May 18, 2015 at 12:06:23PM -0700, Andi Kleen wrote:
> Andi Kleen <andi@firstfloor.org> writes:
> 
> > The upcoming Skylake CPU has a new timed branch stack feature,
> > that reports cycle counts for individual branches in the
> > last branch record.
> 
> Any comments on this patchkit?

I plan to check on it.. but probably not until next week :-\ sry

jirka

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Cycles annotation support for perf tools
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (10 preceding siblings ...)
  2015-05-18 19:06 ` Cycles annotation support for perf tools Andi Kleen
@ 2015-05-24 18:55 ` Jiri Olsa
  2015-05-24 19:19   ` Andi Kleen
  2015-05-26 10:08 ` Jiri Olsa
  12 siblings, 1 reply; 23+ messages in thread
From: Jiri Olsa @ 2015-05-24 18:55 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, linux-kernel, namhyung, eranian

On Sun, May 10, 2015 at 06:51:56AM -0700, Andi Kleen wrote:
> The upcoming Skylake CPU has a new timed branch stack feature,
> that reports cycle counts for individual branches in the
> last branch record.
> 
> This allows to get fine grained cost information for code, and also allows
> to compute fine grained IPC.
> 
> This patchkit adds support for this in the perf tools:
> - Basic support for the cycles field like other branch fields
> - Show cycles in the standard branch sort view (no IPC here,
>   as IPC needs the instruction counts from annotation)
> - Annotate cycles and IPC in the assembler annotate view
> - Add branch support to top, so we can do live annotation.
> - Misc support, like dumping it in perf report -D

hi,
do you have a branch with this change?

thanks,
jirka

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Cycles annotation support for perf tools
  2015-05-24 18:55 ` Jiri Olsa
@ 2015-05-24 19:19   ` Andi Kleen
  0 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-24 19:19 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, acme, linux-kernel, namhyung, eranian

On Sun, May 24, 2015 at 08:55:00PM +0200, Jiri Olsa wrote:
> On Sun, May 10, 2015 at 06:51:56AM -0700, Andi Kleen wrote:
> > The upcoming Skylake CPU has a new timed branch stack feature,
> > that reports cycle counts for individual branches in the
> > last branch record.
> > 
> > This allows to get fine grained cost information for code, and also allows
> > to compute fine grained IPC.
> > 
> > This patchkit adds support for this in the perf tools:
> > - Basic support for the cycles field like other branch fields
> > - Show cycles in the standard branch sort view (no IPC here,
> >   as IPC needs the instruction counts from annotation)
> > - Annotate cycles and IPC in the assembler annotate view
> > - Add branch support to top, so we can do live annotation.
> > - Misc support, like dumping it in perf report -D
> 
> hi,
> do you have a branch with this change?

git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/skl-tools1

This version contains one bugfix over the posted version.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Cycles annotation support for perf tools
  2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
                   ` (11 preceding siblings ...)
  2015-05-24 18:55 ` Jiri Olsa
@ 2015-05-26 10:08 ` Jiri Olsa
  2015-05-26 16:56   ` Andi Kleen
  12 siblings, 1 reply; 23+ messages in thread
From: Jiri Olsa @ 2015-05-26 10:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, linux-kernel, namhyung, eranian

On Sun, May 10, 2015 at 06:51:56AM -0700, Andi Kleen wrote:
> The upcoming Skylake CPU has a new timed branch stack feature,
> that reports cycle counts for individual branches in the
> last branch record.
> 
> This allows to get fine grained cost information for code, and also allows
> to compute fine grained IPC.
> 
> This patchkit adds support for this in the perf tools:
> - Basic support for the cycles field like other branch fields
> - Show cycles in the standard branch sort view (no IPC here,
>   as IPC needs the instruction counts from annotation)
> - Annotate cycles and IPC in the assembler annotate view
> - Add branch support to top, so we can do live annotation.
> - Misc support, like dumping it in perf report -D
> 
> The kernel support has been posted separately. I included a test patch
> to generate fake data for testing on existing systems.
> 
> Example output for annotate (with made up numbers):
>     
> The second column is the IPC and third average cycles for the basic block.
> 
>                    │    static int hex(char ch)                                                                                                       ▒
>                    │    {                                                                                                                             ▒
>         8.20       │      push   %rbp                                                                                                                 ◆
>         8.20       │      mov    %rsp,%rbp                                                                                                            ▒
>         8.20       │      sub    $0x20,%rsp                                                                                                           ▒
>         8.20       │      mov    %edi,%eax                                                                                                            ▒
>         8.20       │      mov    %al,-0x14(%rbp)                                                                                                      ▒
>         8.20       │      mov    %fs:0x28,%rax                                                                                                        ▒
>         8.20       │      mov    %rax,-0x8(%rbp)                                                                                                      ▒
>         8.20       │      xor    %eax,%eax                                                                                                            ▒
>                    │            if ((ch >= '0') && (ch <= '9'))                                                                                       ▒
>         8.20       │      cmpb   $0x2f,-0x14(%rbp)                                                                                                    ▒
>  66.67  8.20   123 │    ↓ jle    31                                                                                                                   ▒
>         8.20       │      cmpb   $0x39,-0x14(%rbp)                                                                                                    ▒
>         8.20   123 │    ↓ jg     31                                                                                                                   ▒
>                    │                    return ch - '0';                                                                                              ▒
>  22.22  8.20       │      movsbl -0x14(%rbp),%eax                                                                                                     ▒
>         8.20       │      sub    $0x30,%eax                                                                                                           ▒
>         8.20   123 │    ↓ jmp    60                                                                                                                   ▒
>                    │            if ((ch >= 'a') && (ch <= 'f'))                                                                                       ▒
>        17.57       │31:   cmpb   $0x60,-0x14(%rbp)                                                                                                    ▒
>        17.57   123 │    ↓ jle    46                                                                                                                   ▒
>        17.57       │      cmpb   $0x66,-0x14(%rbp)                                                                                                    ▒
>        17.57       │    ↓ jg     46                                                                                                                   ▒
>                    │                    return ch - 'a' + 10;                                                                                         ▒
>        17.57       │      movsbl -0x14(%rbp),%eax                                 

heya,
columns are displayed fine, but the currently highlighted line disappeared
and also the standard annotation (without LBR) is broken..

it looks like the color management is screwed up, because it keeps changing
background colors all over the place when I hit arrow keys

please test your patchset with standard use cases

thanks,
jirka

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 04/10] perf, tools, report: Add processing for cycle histograms
  2015-05-10 13:52 ` [PATCH 04/10] perf, tools, report: Add processing for cycle histograms Andi Kleen
@ 2015-05-26 10:09   ` Jiri Olsa
  2015-05-26 17:37     ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Jiri Olsa @ 2015-05-26 10:09 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, linux-kernel, namhyung, eranian, Andi Kleen

On Sun, May 10, 2015 at 06:52:00AM -0700, Andi Kleen wrote:

SNIP

>  		mi = he->mem_info;
>  		err = addr_map_symbol__inc_samples(&mi->daddr, evsel->idx);
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 302fc05..cf6b48b 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -1412,6 +1412,39 @@ int hists__link(struct hists *leader, struct hists *other)
>  	return 0;
>  }
>  
> +void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
> +			  struct perf_sample *sample, bool nonstd_branch_mode)
> +{
> +	struct branch_info *bi;
> +
> +	/* If we have branch cycles always annotate them. */
> +	if (bs && bs->nr && bs->entries[0].flags.cycles) {

hum, so this is assuming that having cycles for the 1st entry
means there'll be for the rest?

Also in that case why is there the '!= cycles' check within
addr_map_symbol__account_cycles ?

jirka


* Re: [PATCH 07/10] perf, tools, report: Move branch option parsing to own file
  2015-05-10 13:52 ` [PATCH 07/10] perf, tools, report: Move branch option parsing to own file Andi Kleen
@ 2015-05-26 10:09   ` Jiri Olsa
  0 siblings, 0 replies; 23+ messages in thread
From: Jiri Olsa @ 2015-05-26 10:09 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, linux-kernel, namhyung, eranian, Andi Kleen

On Sun, May 10, 2015 at 06:52:03AM -0700, Andi Kleen wrote:

SNIP

> diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
> new file mode 100644
> index 0000000..66fd619
> --- /dev/null
> +++ b/tools/perf/util/branch.h
> @@ -0,0 +1,2 @@
> +struct option;
> +int parse_branch_stack(const struct option *opt, const char *str, int unset);

please add the standard #ifndef header
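
A minimal sketch of what the guarded header could look like. The guard
name _PERF_BRANCH_H is an assumption made for illustration; the patch
does not name one, and any name that is unique within the tree would do.

```c
/* Hypothetical guard name; the original patch does not specify one. */
#ifndef _PERF_BRANCH_H
#define _PERF_BRANCH_H 1

/* Forward declaration is enough, only a pointer to it is used here. */
struct option;
int parse_branch_stack(const struct option *opt, const char *str, int unset);

#endif /* _PERF_BRANCH_H */
```

With the guard in place the header can be included more than once in
the same translation unit without the contents being processed twice.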

jirka


* Re: Cycles annotation support for perf tools
  2015-05-26 10:08 ` Jiri Olsa
@ 2015-05-26 16:56   ` Andi Kleen
  2015-05-26 21:40     ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2015-05-26 16:56 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, acme, linux-kernel, namhyung, eranian

> columns are displayed fine, but the current highlighted line disappeared
> and also the standard annotation (without LBR) is broken..

I can't reproduce that. Everything looks fine to me.

Please investigate on your side.

That's the only hunk that's changing colors. It looks equivalent to me:


@@ -110,11 +122,29 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
                        percent_max = bdl->percent[i];
        }
 
-       if (dl->offset != -1 && percent_max != 0.0) {
-               for (i = 0; i < ab->nr_events; i++) {
-                       ui_browser__set_percent_color(browser, bdl->percent[i],
-                                                     current_entry);
-                       slsmg_printf("%6.2f ", bdl->percent[i]);
+       if (dl->offset != -1) {
+               if (percent_max != 0.0) {
+                       for (i = 0; i < ab->nr_events; i++) {
+                               ui_browser__set_percent_color(browser,
+                                                             bdl->percent[i],
+                                                             current_entry);
+                               slsmg_printf("%6.2f ", bdl->percent[i]);
+                       }
+               } else {
+                       slsmg_write_nstring(" ", 7 * ab->nr_events);
+               }
+
+               if (ab->have_cycles) {
+                       ui_browser__set_color(browser, HE_COLORSET_NORMAL);
+                       if (dl->ipc)
+                               slsmg_printf("%*.2f ", IPC_WIDTH - 1, dl->ipc);
+                       else
+                               slsmg_write_nstring(" ", IPC_WIDTH);
+                       if (dl->cycles)
+                               slsmg_printf("%*" PRIu64 " ",
+                                               CYCLES_WIDTH - 1, dl->cycles);
+                       else
+                               slsmg_write_nstring(" ", CYCLES_WIDTH);
                }
        } else {
                ui_browser__set_percent_color(browser, 0, current_entry);

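The column layout of the new branch in that hunk can be tried outside
the TUI with a small stand-alone sketch. It formats into a buffer
instead of calling slsmg_printf. The helper name format_cycle_columns
and the width values (6 for both IPC_WIDTH and CYCLES_WIDTH) are
assumptions for illustration, since the hunk does not show the
definitions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed values; the quoted hunk does not show the real definitions. */
#define IPC_WIDTH	6
#define CYCLES_WIDTH	6

/*
 * Hypothetical helper: write the IPC and cycles columns into buf.
 * A zero value produces a blank column of the same width, so the
 * following columns stay aligned. Returns the number of characters
 * written, or -1 if buf is too small.
 */
static int format_cycle_columns(char *buf, size_t len, double ipc,
				uint64_t cycles)
{
	int n, m;

	if (ipc)
		n = snprintf(buf, len, "%*.2f ", IPC_WIDTH - 1, ipc);
	else
		n = snprintf(buf, len, "%*s", IPC_WIDTH, "");
	if (n < 0 || (size_t)n >= len)
		return -1;

	if (cycles)
		m = snprintf(buf + n, len - n, "%*" PRIu64 " ",
			     CYCLES_WIDTH - 1, cycles);
	else
		m = snprintf(buf + n, len - n, "%*s", CYCLES_WIDTH, "");
	if (m < 0 || (size_t)m >= len - n)
		return -1;

	return n + m;
}
```

Both columns take the same space whether or not a value is present,
as long as the value fits in the column width.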

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH 04/10] perf, tools, report: Add processing for cycle histograms
  2015-05-26 10:09   ` Jiri Olsa
@ 2015-05-26 17:37     ` Andi Kleen
  2015-06-01 13:02       ` Jiri Olsa
  0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2015-05-26 17:37 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, acme, linux-kernel, namhyung, eranian, Andi Kleen

> hum, so this is assuming that having cycles for the 1st entry
> means there'll be for the rest?
> Also in that case why is there the '!= cycles' check within
> addr_map_symbol__account_cycles ?
>
It means there might be. It's just a short cut. But rarely
branches may still have 0 cycles, so it still needs to be
checked later.

In theory it could miss a valid one if the first happened
to be zero, but that seems very unlikely.
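
To make the trade-off concrete, here is a rough sketch comparing the
first-entry short cut with a scan over all entries. The demo_* types
are simplified stand-ins invented for this example, not the real perf
structures, and any_entry_has_cycles() is a hypothetical alternative
that is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_ENTRIES 32

/* Simplified stand-ins for illustration, not the real perf types. */
struct demo_branch_flags {
	uint64_t cycles;	/* 0 means no cycle count was recorded */
};

struct demo_branch_entry {
	uint64_t from;
	uint64_t to;
	struct demo_branch_flags flags;
};

struct demo_branch_stack {
	uint64_t nr;
	struct demo_branch_entry entries[DEMO_MAX_ENTRIES];
};

/* The short cut: only look at the first entry. */
bool first_entry_has_cycles(const struct demo_branch_stack *bs)
{
	return bs && bs->nr && bs->entries[0].flags.cycles;
}

/* Hypothetical alternative: true if any entry carries a cycle count. */
bool any_entry_has_cycles(const struct demo_branch_stack *bs)
{
	uint64_t i;

	if (!bs)
		return false;
	for (i = 0; i < bs->nr && i < DEMO_MAX_ENTRIES; i++)
		if (bs->entries[i].flags.cycles)
			return true;
	return false;
}
```

The two only disagree for a stack whose first entry has 0 cycles while
a later entry has a count, which is the rare case described above. The
scan costs one pass over the stack per sample.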


-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: Cycles annotation support for perf tools
  2015-05-26 16:56   ` Andi Kleen
@ 2015-05-26 21:40     ` Andi Kleen
  0 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-05-26 21:40 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jiri Olsa, acme, linux-kernel, namhyung, eranian

On Tue, May 26, 2015 at 06:56:16PM +0200, Andi Kleen wrote:
> > columns are displayed fine, but the current highlighted line disappeared
> > and also the standard annotation (without LBR) is broken..
> 
> I can't reproduce that. Everything looks fine to me.

Never mind. Fixed it now.

-Andi


* Re: [PATCH 04/10] perf, tools, report: Add processing for cycle histograms
  2015-05-26 17:37     ` Andi Kleen
@ 2015-06-01 13:02       ` Jiri Olsa
  2015-06-01 13:07         ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Jiri Olsa @ 2015-06-01 13:02 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, linux-kernel, namhyung, eranian, Andi Kleen

On Tue, May 26, 2015 at 07:37:30PM +0200, Andi Kleen wrote:
> > hum, so this is assuming that having cycles for the 1st entry
> > means there'll be for the rest?
> > Also in that case why is there the '!= cycles' check within
> > addr_map_symbol__account_cycles ?
> >
> It means there might be. It's just a short cut. But rarely
> branches may still have 0 cycles, so it still needs to be
> checked later.
> 
> In theory it could miss a valid one if the first happened
> to be zero, but that seems very unlikely.

so having 'bs->entries[0].flags.cycles' is the only way
of knowing that we have the feature enabled?

jirka


* Re: [PATCH 04/10] perf, tools, report: Add processing for cycle histograms
  2015-06-01 13:02       ` Jiri Olsa
@ 2015-06-01 13:07         ` Andi Kleen
  0 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2015-06-01 13:07 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, acme, linux-kernel, namhyung, eranian, Andi Kleen

On Mon, Jun 01, 2015 at 03:02:23PM +0200, Jiri Olsa wrote:
> On Tue, May 26, 2015 at 07:37:30PM +0200, Andi Kleen wrote:
> > > hum, so this is assuming that having cycles for the 1st entry
> > > means there'll be for the rest?
> > > Also in that case why is there the '!= cycles' check within
> > > addr_map_symbol__account_cycles ?
> > >
> > It means there might be. It's just a short cut. But rarely
> > branches may still have 0 cycles, so it still needs to be
> > checked later.
> > 
> > In theory it could miss a valid one if the first happened
> > to be zero, but that seems very unlikely.
> 
> so having 'bs->entries[0].flags.cycles' is the only way
> of knowing that we have the feature enabled?

Yes.

In theory we could add caps in sysfs like the PT code,
but that's not implemented currently.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

Thread overview: 23+ messages
2015-05-10 13:51 Cycles annotation support for perf tools Andi Kleen
2015-05-10 13:51 ` [PATCH 01/10] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
2015-05-10 13:51 ` [PATCH 02/10] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
2015-05-10 13:51 ` [PATCH 03/10] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
2015-05-10 13:52 ` [PATCH 04/10] perf, tools, report: Add processing for cycle histograms Andi Kleen
2015-05-26 10:09   ` Jiri Olsa
2015-05-26 17:37     ` Andi Kleen
2015-06-01 13:02       ` Jiri Olsa
2015-06-01 13:07         ` Andi Kleen
2015-05-10 13:52 ` [PATCH 05/10] perf, tools: Compute IPC and basic block cycles for annotate Andi Kleen
2015-05-10 13:52 ` [PATCH 06/10] perf, tools, annotate: Finally display IPC and cycle accounting Andi Kleen
2015-05-10 13:52 ` [PATCH 07/10] perf, tools, report: Move branch option parsing to own file Andi Kleen
2015-05-26 10:09   ` Jiri Olsa
2015-05-10 13:52 ` [PATCH 08/10] perf, tools, top: Add branch annotation code to top Andi Kleen
2015-05-10 13:52 ` [PATCH 09/10] perf, tools, report: Display cycles in branch sort mode Andi Kleen
2015-05-10 13:52 ` [PATCH 10/10] test patch: Add fake branch cycles to input data in report/top Andi Kleen
2015-05-18 19:06 ` Cycles annotation support for perf tools Andi Kleen
2015-05-19  9:55   ` Jiri Olsa
2015-05-24 18:55 ` Jiri Olsa
2015-05-24 19:19   ` Andi Kleen
2015-05-26 10:08 ` Jiri Olsa
2015-05-26 16:56   ` Andi Kleen
2015-05-26 21:40     ` Andi Kleen
