* Re: [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function
2018-11-29 15:39 [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function Jin Yao
@ 2018-11-29 11:33 ` Jiri Olsa
2018-11-30 0:26 ` Jin, Yao
2018-11-29 15:39 ` [PATCH v4 1/4] perf annotate: Compute average IPC and IPC coverage per symbol Jin Yao
` (3 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Jiri Olsa @ 2018-11-29 11:33 UTC (permalink / raw)
To: Jin Yao
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On Thu, Nov 29, 2018 at 11:39:34PM +0800, Jin Yao wrote:
> Add supporting of displaying the average IPC and IPC coverage
> percentage per function.
>
> For example,
>
> $ perf record -b ...
> $ perf report -s symbol or
> perf report -s symbol --stdio
>
> Overhead Symbol IPC [IPC Coverage]
> 39.60% [.] __random 2.30 [ 54.8%]
> 18.02% [.] main 0.43 [ 54.3%]
> 14.21% [.] compute_flag 2.29 [100.0%]
> 14.16% [.] rand 0.36 [100.0%]
> 7.06% [.] __random_r 2.57 [ 70.5%]
> 6.85% [.] rand@plt 0.00 [ 0.0%]
> ...
>
> $ perf annotate --stdio2
>
> Percent IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)
>
> Disassembly of section .text:
>
> 000000000003aac0 <random@@GLIBC_2.2.5>:
> 8.32 3.28 sub $0x18,%rsp
> 3.28 mov $0x1,%esi
> 3.28 xor %eax,%eax
> 3.28 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
> 11.57 3.28 1 ↓ je 20
> lock cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
> ↓ jne 29
> ↓ jmp 43
> 11.57 1.10 20: cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
> ...
>
> v4:
> ---
> Add a new patch "perf report: Documentation average IPC and IPC coverage"
> to explain the new columns "IPC" and "IPC Coverage".
>
> Other patches are not changed.
>
> v3:
> ---
> Remove the sortkey "ipc" from command-line. The columns "IPC"
> and "[IPC Coverage]" are automatically enabled when "symbol"
> is specified.
>
> Patch "perf report: Display average IPC and IPC coverage per symbol"
> is impacted.
>
> v2:
> ---
> 1. Merge in Jiri's patch to support stdio mode
>
> 2. Add a new patch "perf annotate: Create a annotate2 flag
> in struct symbol" which records if the symbol has been
> annotated yet.
>
> 3. Minor update such as adding { } for multiline code in 'if'
> condition.
>
> Jin Yao (4):
> perf annotate: Compute average IPC and IPC coverage per symbol
> perf annotate: Create a annotate2 flag in struct symbol
> perf report: Display average IPC and IPC coverage per symbol
> perf report: Documentation average IPC and IPC coverage
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
thanks,
jirka
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 4/4] perf report: Documentation average IPC and IPC coverage
2018-11-29 15:39 ` [PATCH v4 4/4] perf report: Documentation average IPC and IPC coverage Jin Yao
@ 2018-11-29 13:27 ` Ingo Molnar
2018-11-30 0:22 ` Jin, Yao
0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2018-11-29 13:27 UTC (permalink / raw)
To: Jin Yao
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
* Jin Yao <yao.jin@linux.intel.com> wrote:
> Add explanations for new columns "IPC" and "IPC coverage" in perf
> documentation.
>
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
> tools/perf/Documentation/perf-report.txt | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 474a494..e5a32f3 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -126,6 +126,14 @@ OPTIONS
> And default sort keys are changed to comm, dso_from, symbol_from, dso_to
> and symbol_to, see '--branch-stack'.
>
> + When the sort key symbol is specified, columns "IPC" and "IPC Coverage"
> + are enabled automatically. Column "IPC" reports the average IPC per function
> + and column "IPC coverage" reports the percentage of instructions with
> + sampled IPC in this function. IPC means Instruction Per Cycle. If it's low,
> + it indicates there may be performance bottleneck when the function is
> + executed, such as, memory access bottleneck. If a function has high overhead
> + and low IPC, it's worth further analysis for performance optimization.
Thank you for adding this!
Just a few small nits:
s/may be performance bottleneck
/may be a performance bottleneck
s/such as, memory access bottleneck
/such as a memory access bottleneck
s/it's worth further analysis for performance optimization.
/it's worth further analyzing it to optimize its performance.
?
Other than that:
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Ingo
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function
@ 2018-11-29 15:39 Jin Yao
2018-11-29 11:33 ` Jiri Olsa
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Jin Yao @ 2018-11-29 15:39 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
Add supporting of displaying the average IPC and IPC coverage
percentage per function.
For example,
$ perf record -b ...
$ perf report -s symbol or
perf report -s symbol --stdio
Overhead Symbol IPC [IPC Coverage]
39.60% [.] __random 2.30 [ 54.8%]
18.02% [.] main 0.43 [ 54.3%]
14.21% [.] compute_flag 2.29 [100.0%]
14.16% [.] rand 0.36 [100.0%]
7.06% [.] __random_r 2.57 [ 70.5%]
6.85% [.] rand@plt 0.00 [ 0.0%]
...
$ perf annotate --stdio2
Percent IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)
Disassembly of section .text:
000000000003aac0 <random@@GLIBC_2.2.5>:
8.32 3.28 sub $0x18,%rsp
3.28 mov $0x1,%esi
3.28 xor %eax,%eax
3.28 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
11.57 3.28 1 ↓ je 20
lock cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
↓ jne 29
↓ jmp 43
11.57 1.10 20: cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
...
v4:
---
Add a new patch "perf report: Documentation average IPC and IPC coverage"
to explain the new columns "IPC" and "IPC Coverage".
Other patches are not changed.
v3:
---
Remove the sortkey "ipc" from command-line. The columns "IPC"
and "[IPC Coverage]" are automatically enabled when "symbol"
is specified.
Patch "perf report: Display average IPC and IPC coverage per symbol"
is impacted.
v2:
---
1. Merge in Jiri's patch to support stdio mode
2. Add a new patch "perf annotate: Create a annotate2 flag
in struct symbol" which records if the symbol has been
annotated yet.
3. Minor update such as adding { } for multiline code in 'if'
condition.
Jin Yao (4):
perf annotate: Compute average IPC and IPC coverage per symbol
perf annotate: Create a annotate2 flag in struct symbol
perf report: Display average IPC and IPC coverage per symbol
perf report: Documentation average IPC and IPC coverage
tools/perf/Documentation/perf-report.txt | 8 +++++
tools/perf/builtin-report.c | 26 ++++++++++++--
tools/perf/util/annotate.c | 42 ++++++++++++++++++++--
tools/perf/util/annotate.h | 5 +++
tools/perf/util/hist.h | 1 +
tools/perf/util/sort.c | 61 ++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 2 ++
tools/perf/util/symbol.h | 1 +
8 files changed, 140 insertions(+), 6 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v4 1/4] perf annotate: Compute average IPC and IPC coverage per symbol
2018-11-29 15:39 [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function Jin Yao
2018-11-29 11:33 ` Jiri Olsa
@ 2018-11-29 15:39 ` Jin Yao
2018-11-29 15:39 ` [PATCH v4 2/4] perf annotate: Create a annotate2 flag in struct symbol Jin Yao
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Jin Yao @ 2018-11-29 15:39 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
Add support to perf report annotate view or perf annotate --stdio2 to
aggregate the IPC derived from timed LBRs per symbol. We compute the
average IPC and the IPC coverage percentage.
For example,
$ perf annotate --stdio2
Percent IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)
Disassembly of section .text:
000000000003aac0 <random@@GLIBC_2.2.5>:
8.32 3.28 sub $0x18,%rsp
3.28 mov $0x1,%esi
3.28 xor %eax,%eax
3.28 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
11.57 3.28 1 ↓ je 20
lock cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
↓ jne 29
↓ jmp 43
11.57 1.10 20: cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
0.00 1.10 1 ↓ je 43
29: lea __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
sub $0x80,%rsp
→ callq __lll_lock_wait_private
add $0x80,%rsp
0.00 3.00 43: lea __ctype_b@GLIBC_2.2.5+0x38,%rdi
3.00 lea 0xc(%rsp),%rsi
8.49 3.00 1 → callq __random_r
7.91 1.94 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
0.00 1.94 1 ↓ je 68
lock decl __abort_msg@@GLIBC_PRIVATE+0x8a0
↓ jne 70
↓ jmp 8a
0.00 2.00 68: decl __abort_msg@@GLIBC_PRIVATE+0x8a0
21.56 2.00 1 ↓ je 8a
70: lea __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
sub $0x80,%rsp
→ callq __lll_unlock_wake_private
add $0x80,%rsp
21.56 2.90 8a: movslq 0xc(%rsp),%rax
2.90 add $0x18,%rsp
9.03 2.90 1 ← retq
It shows for this symbol the average IPC is 2.30 and the IPC coverage
is 54.8%.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/util/annotate.c | 41 ++++++++++++++++++++++++++++++++++++++---
tools/perf/util/annotate.h | 5 +++++
2 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 6936daf..4b2b1b0 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1000,6 +1000,7 @@ static unsigned annotation__count_insn(struct annotation *notes, u64 start, u64
static void annotation__count_and_fill(struct annotation *notes, u64 start, u64 end, struct cyc_hist *ch)
{
unsigned n_insn;
+ unsigned int cover_insn = 0;
u64 offset;
n_insn = annotation__count_insn(notes, start, end);
@@ -1013,21 +1014,34 @@ static void annotation__count_and_fill(struct annotation *notes, u64 start, u64
for (offset = start; offset <= end; offset++) {
struct annotation_line *al = notes->offsets[offset];
- if (al)
+ if (al && al->ipc == 0.0) {
al->ipc = ipc;
+ cover_insn++;
+ }
+ }
+
+ if (cover_insn) {
+ notes->hit_cycles += ch->cycles;
+ notes->hit_insn += n_insn * ch->num;
+ notes->cover_insn += cover_insn;
}
}
}
void annotation__compute_ipc(struct annotation *notes, size_t size)
{
- u64 offset;
+ s64 offset;
if (!notes->src || !notes->src->cycles_hist)
return;
+ notes->total_insn = annotation__count_insn(notes, 0, size - 1);
+ notes->hit_cycles = 0;
+ notes->hit_insn = 0;
+ notes->cover_insn = 0;
+
pthread_mutex_lock(¬es->lock);
- for (offset = 0; offset < size; ++offset) {
+ for (offset = size - 1; offset >= 0; --offset) {
struct cyc_hist *ch;
ch = ¬es->src->cycles_hist[offset];
@@ -2563,6 +2577,22 @@ static void disasm_line__write(struct disasm_line *dl, struct annotation *notes,
disasm_line__scnprintf(dl, bf, size, !notes->options->use_offset);
}
+static void ipc_coverage_string(char *bf, int size, struct annotation *notes)
+{
+ double ipc = 0.0, coverage = 0.0;
+
+ if (notes->hit_cycles)
+ ipc = notes->hit_insn / ((double)notes->hit_cycles);
+
+ if (notes->total_insn) {
+ coverage = notes->cover_insn * 100.0 /
+ ((double)notes->total_insn);
+ }
+
+ scnprintf(bf, size, "(Average IPC: %.2f, IPC Coverage: %.1f%%)",
+ ipc, coverage);
+}
+
static void __annotation_line__write(struct annotation_line *al, struct annotation *notes,
bool first_line, bool current_entry, bool change_color, int width,
void *obj, unsigned int percent_type,
@@ -2658,6 +2688,11 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
ANNOTATION__MINMAX_CYCLES_WIDTH - 1,
"Cycle(min/max)");
}
+
+ if (show_title && !*al->line) {
+ ipc_coverage_string(bf, sizeof(bf), notes);
+ obj__printf(obj, "%*s", ANNOTATION__AVG_IPC_WIDTH, bf);
+ }
}
obj__printf(obj, " ");
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 5399ba2..fb64637 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -64,6 +64,7 @@ bool ins__is_fused(struct arch *arch, const char *ins1, const char *ins2);
#define ANNOTATION__IPC_WIDTH 6
#define ANNOTATION__CYCLES_WIDTH 6
#define ANNOTATION__MINMAX_CYCLES_WIDTH 19
+#define ANNOTATION__AVG_IPC_WIDTH 36
struct annotation_options {
bool hide_src_code,
@@ -262,6 +263,10 @@ struct annotation {
pthread_mutex_t lock;
u64 max_coverage;
u64 start;
+ u64 hit_cycles;
+ u64 hit_insn;
+ unsigned int total_insn;
+ unsigned int cover_insn;
struct annotation_options *options;
struct annotation_line **offsets;
int nr_events;
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v4 2/4] perf annotate: Create a annotate2 flag in struct symbol
2018-11-29 15:39 [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function Jin Yao
2018-11-29 11:33 ` Jiri Olsa
2018-11-29 15:39 ` [PATCH v4 1/4] perf annotate: Compute average IPC and IPC coverage per symbol Jin Yao
@ 2018-11-29 15:39 ` Jin Yao
2018-11-29 15:39 ` [PATCH v4 3/4] perf report: Display average IPC and IPC coverage per symbol Jin Yao
2018-11-29 15:39 ` [PATCH v4 4/4] perf report: Documentation average IPC and IPC coverage Jin Yao
4 siblings, 0 replies; 9+ messages in thread
From: Jin Yao @ 2018-11-29 15:39 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
We often use the symbol__annotate2() to annotate a specified symbol.
While annotating may take some time, so in order to avoid annotating
the same symbol repeatedly, the patch creates a new flag to indicate
the symbol has been annotated.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/util/annotate.c | 1 +
tools/perf/util/symbol.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4b2b1b0..f69d8e1 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2798,6 +2798,7 @@ int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel *ev
notes->nr_events = nr_pcnt;
annotation__update_column_widths(notes);
+ sym->annotate2 = true;
return 0;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index d026d21..14d9d43 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -63,6 +63,7 @@ struct symbol {
u8 ignore:1;
u8 inlined:1;
u8 arch_sym;
+ bool annotate2;
char name[0];
};
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v4 3/4] perf report: Display average IPC and IPC coverage per symbol
2018-11-29 15:39 [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function Jin Yao
` (2 preceding siblings ...)
2018-11-29 15:39 ` [PATCH v4 2/4] perf annotate: Create a annotate2 flag in struct symbol Jin Yao
@ 2018-11-29 15:39 ` Jin Yao
2018-11-29 15:39 ` [PATCH v4 4/4] perf report: Documentation average IPC and IPC coverage Jin Yao
4 siblings, 0 replies; 9+ messages in thread
From: Jin Yao @ 2018-11-29 15:39 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
Support displaying the average IPC and IPC coverage per symbol
in perf report TUI and stdio modes.
For example,
$ perf record -b ...
$ perf report -s symbol
Overhead Symbol IPC [IPC Coverage]
39.60% [.] __random 2.30 [ 54.8%]
18.02% [.] main 0.43 [ 54.3%]
14.21% [.] compute_flag 2.29 [100.0%]
14.16% [.] rand 0.36 [100.0%]
7.06% [.] __random_r 2.57 [ 70.5%]
6.85% [.] rand@plt 0.00 [ 0.0%]
Jiri Olsa <jolsa@redhat.com> provides the patch to support the
stdio mode. I merge Jiri's code in this patch.
$ perf report -s symbol --stdio
# Overhead Symbol IPC [IPC Coverage]
# ........ ........................... ....................
#
39.60% [.] __random 2.30 [ 54.8%]
18.02% [.] main 0.43 [ 54.3%]
14.21% [.] compute_flag 2.29 [100.0%]
14.16% [.] rand 0.36 [100.0%]
7.06% [.] __random_r 2.57 [ 70.5%]
6.85% [.] rand@plt 0.00 [ 0.0%]
0.02% [k] run_timer_softirq 1.60 [ 57.2%]
The columns "IPC" and "[IPC Coverage]" are automatically enabled
when the sort-key "symbol" is specified. If the perf.data doesn't
contain timed LBR information, columns are filled with "-".
For example,
# Overhead Symbol IPC [IPC Coverage]
# ........ ........................... ....................
#
46.57% [.] main - -
17.60% [.] rand - -
15.84% [.] __random_r - -
11.90% [.] __random - -
6.50% [.] compute_flag - -
1.59% [.] rand@plt - -
0.00% [.] _dl_relocate_object - -
0.00% [k] tlb_flush_mmu - -
0.00% [k] perf_event_mmap - -
0.00% [k] native_sched_clock - -
0.00% [k] intel_pmu_handle_irq_v4 - -
0.00% [k] native_write_msr - -
v3:
---
Removed the sortkey 'ipc' from command-line. The columns "IPC"
and "[IPC Coverage]" are automatically enabled when "symbol"
is specified.
v2:
---
Merge in Jiri's patch to support stdio mode
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/builtin-report.c | 26 ++++++++++++++++---
tools/perf/util/hist.h | 1 +
tools/perf/util/sort.c | 61 +++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 2 ++
4 files changed, 87 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 257c9c1..4958095 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -85,6 +85,7 @@ struct report {
int socket_filter;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
struct branch_type_stat brtype_stat;
+ bool symbol_ipc;
};
static int report__config(const char *var, const char *value, void *cb)
@@ -129,7 +130,7 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
struct mem_info *mi;
struct branch_info *bi;
- if (!ui__has_annotation())
+ if (!ui__has_annotation() && !rep->symbol_ipc)
return 0;
hist__account_cycles(sample->branch_stack, al, sample,
@@ -174,7 +175,7 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
struct perf_evsel *evsel = iter->evsel;
int err;
- if (!ui__has_annotation())
+ if (!ui__has_annotation() && !rep->symbol_ipc)
return 0;
hist__account_cycles(sample->branch_stack, al, sample,
@@ -1133,6 +1134,7 @@ int cmd_report(int argc, const char **argv)
.mode = PERF_DATA_MODE_READ,
};
int ret = hists__init();
+ char sort_tmp[128];
if (ret < 0)
return ret;
@@ -1284,6 +1286,24 @@ int cmd_report(int argc, const char **argv)
else
use_browser = 0;
+ if (sort_order && strstr(sort_order, "ipc")) {
+ parse_options_usage(report_usage, options, "s", 1);
+ goto error;
+ }
+
+ if (sort_order && strstr(sort_order, "symbol")) {
+ if (sort__mode == SORT_MODE__BRANCH) {
+ snprintf(sort_tmp, sizeof(sort_tmp), "%s,%s",
+ sort_order, "ipc_lbr");
+ report.symbol_ipc = true;
+ } else {
+ snprintf(sort_tmp, sizeof(sort_tmp), "%s,%s",
+ sort_order, "ipc_null");
+ }
+
+ sort_order = sort_tmp;
+ }
+
if (setup_sorting(session->evlist) < 0) {
if (sort_order)
parse_options_usage(report_usage, options, "s", 1);
@@ -1311,7 +1331,7 @@ int cmd_report(int argc, const char **argv)
* so don't allocate extra space that won't be used in the stdio
* implementation.
*/
- if (ui__has_annotation()) {
+ if (ui__has_annotation() || report.symbol_ipc) {
ret = symbol__annotation_init();
if (ret < 0)
goto error;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 3badd7f..664b5ed 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -62,6 +62,7 @@ enum hist_column {
HISTC_TRACE,
HISTC_SYM_SIZE,
HISTC_DSO_SIZE,
+ HISTC_SYMBOL_IPC,
HISTC_NR_COLS, /* Last entry */
};
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f96c005..0477935 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -13,6 +13,7 @@
#include "strlist.h"
#include <traceevent/event-parse.h>
#include "mem-events.h"
+#include "annotate.h"
#include <linux/kernel.h>
regex_t parent_regex;
@@ -422,6 +423,64 @@ struct sort_entry sort_srcline_to = {
.se_width_idx = HISTC_SRCLINE_TO,
};
+static int hist_entry__sym_ipc_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+
+ struct symbol *sym = he->ms.sym;
+ struct map *map = he->ms.map;
+ struct perf_evsel *evsel = hists_to_evsel(he->hists);
+ struct annotation *notes;
+ double ipc = 0.0, coverage = 0.0;
+ char tmp[64];
+
+ if (!sym)
+ return repsep_snprintf(bf, size, "%-*s", width, "-");
+
+ if (!sym->annotate2 && symbol__annotate2(sym, map, evsel,
+ &annotation__default_options, NULL) < 0) {
+ return 0;
+ }
+
+ notes = symbol__annotation(sym);
+
+ if (notes->hit_cycles)
+ ipc = notes->hit_insn / ((double)notes->hit_cycles);
+
+ if (notes->total_insn) {
+ coverage = notes->cover_insn * 100.0 /
+ ((double)notes->total_insn);
+ }
+
+ snprintf(tmp, sizeof(tmp), "%-5.2f [%5.1f%%]", ipc, coverage);
+ return repsep_snprintf(bf, size, "%-*s", width, tmp);
+}
+
+struct sort_entry sort_sym_ipc = {
+ .se_header = "IPC [IPC Coverage]",
+ .se_cmp = sort__sym_cmp,
+ .se_snprintf = hist_entry__sym_ipc_snprintf,
+ .se_width_idx = HISTC_SYMBOL_IPC,
+};
+
+static int hist_entry__sym_ipc_null_snprintf(struct hist_entry *he
+ __maybe_unused,
+ char *bf, size_t size,
+ unsigned int width)
+{
+ char tmp[64];
+
+ snprintf(tmp, sizeof(tmp), "%-5s %2s", "-", "-");
+ return repsep_snprintf(bf, size, "%-*s", width, tmp);
+}
+
+struct sort_entry sort_sym_ipc_null = {
+ .se_header = "IPC [IPC Coverage]",
+ .se_cmp = sort__sym_cmp,
+ .se_snprintf = hist_entry__sym_ipc_null_snprintf,
+ .se_width_idx = HISTC_SYMBOL_IPC,
+};
+
/* --sort srcfile */
static char no_srcfile[1];
@@ -1574,6 +1633,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_SYM_SIZE, "symbol_size", sort_sym_size),
DIM(SORT_DSO_SIZE, "dso_size", sort_dso_size),
DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id),
+ DIM(SORT_SYM_IPC_NULL, "ipc_null", sort_sym_ipc_null),
};
#undef DIM
@@ -1591,6 +1651,7 @@ static struct sort_dimension bstack_sort_dimensions[] = {
DIM(SORT_CYCLES, "cycles", sort_cycles),
DIM(SORT_SRCLINE_FROM, "srcline_from", sort_srcline_from),
DIM(SORT_SRCLINE_TO, "srcline_to", sort_srcline_to),
+ DIM(SORT_SYM_IPC, "ipc_lbr", sort_sym_ipc),
};
#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index a97cf8e..130fe37 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -229,6 +229,7 @@ enum sort_type {
SORT_SYM_SIZE,
SORT_DSO_SIZE,
SORT_CGROUP_ID,
+ SORT_SYM_IPC_NULL,
/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
@@ -242,6 +243,7 @@ enum sort_type {
SORT_CYCLES,
SORT_SRCLINE_FROM,
SORT_SRCLINE_TO,
+ SORT_SYM_IPC,
/* memory mode specific sort keys */
__SORT_MEMORY_MODE,
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v4 4/4] perf report: Documentation average IPC and IPC coverage
2018-11-29 15:39 [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function Jin Yao
` (3 preceding siblings ...)
2018-11-29 15:39 ` [PATCH v4 3/4] perf report: Display average IPC and IPC coverage per symbol Jin Yao
@ 2018-11-29 15:39 ` Jin Yao
2018-11-29 13:27 ` Ingo Molnar
4 siblings, 1 reply; 9+ messages in thread
From: Jin Yao @ 2018-11-29 15:39 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
Add explanations for new columns "IPC" and "IPC coverage" in perf
documentation.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/Documentation/perf-report.txt | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 474a494..e5a32f3 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -126,6 +126,14 @@ OPTIONS
And default sort keys are changed to comm, dso_from, symbol_from, dso_to
and symbol_to, see '--branch-stack'.
+ When the sort key symbol is specified, columns "IPC" and "IPC Coverage"
+ are enabled automatically. Column "IPC" reports the average IPC per function
+ and column "IPC coverage" reports the percentage of instructions with
+ sampled IPC in this function. IPC means Instruction Per Cycle. If it's low,
+ it indicates there may be performance bottleneck when the function is
+ executed, such as, memory access bottleneck. If a function has high overhead
+ and low IPC, it's worth further analysis for performance optimization.
+
If the --mem-mode option is used, the following sort keys are also available
(incompatible with --branch-stack):
symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v4 4/4] perf report: Documentation average IPC and IPC coverage
2018-11-29 13:27 ` Ingo Molnar
@ 2018-11-30 0:22 ` Jin, Yao
0 siblings, 0 replies; 9+ messages in thread
From: Jin, Yao @ 2018-11-30 0:22 UTC (permalink / raw)
To: Ingo Molnar
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On 11/29/2018 9:27 PM, Ingo Molnar wrote:
>
> * Jin Yao <yao.jin@linux.intel.com> wrote:
>
>> Add explanations for new columns "IPC" and "IPC coverage" in perf
>> documentation.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>> tools/perf/Documentation/perf-report.txt | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
>> index 474a494..e5a32f3 100644
>> --- a/tools/perf/Documentation/perf-report.txt
>> +++ b/tools/perf/Documentation/perf-report.txt
>> @@ -126,6 +126,14 @@ OPTIONS
>> And default sort keys are changed to comm, dso_from, symbol_from, dso_to
>> and symbol_to, see '--branch-stack'.
>>
>> + When the sort key symbol is specified, columns "IPC" and "IPC Coverage"
>> + are enabled automatically. Column "IPC" reports the average IPC per function
>> + and column "IPC coverage" reports the percentage of instructions with
>> + sampled IPC in this function. IPC means Instruction Per Cycle. If it's low,
>> + it indicates there may be performance bottleneck when the function is
>> + executed, such as, memory access bottleneck. If a function has high overhead
>> + and low IPC, it's worth further analysis for performance optimization.
>
> Thank you for adding this!
>
> Just a few small nits:
>
> s/may be performance bottleneck
> /may be a performance bottleneck
>
> s/such as, memory access bottleneck
> /such as a memory access bottleneck
>
> s/it's worth further analysis for performance optimization.
> /it's worth further analyzing it to optimize its performance.
>
> ?
>
> Other than that:
>
> Reviewed-by: Ingo Molnar <mingo@kernel.org>
>
> Ingo
>
Thanks Ingo!
I will add these fixes in v5.
Thanks
Jin Yao
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function
2018-11-29 11:33 ` Jiri Olsa
@ 2018-11-30 0:26 ` Jin, Yao
0 siblings, 0 replies; 9+ messages in thread
From: Jin, Yao @ 2018-11-30 0:26 UTC (permalink / raw)
To: Jiri Olsa
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On 11/29/2018 7:33 PM, Jiri Olsa wrote:
> On Thu, Nov 29, 2018 at 11:39:34PM +0800, Jin Yao wrote:
>> Add supporting of displaying the average IPC and IPC coverage
>> percentage per function.
>>
>> For example,
>>
>> $ perf record -b ...
>> $ perf report -s symbol or
>> perf report -s symbol --stdio
>>
>> Overhead Symbol IPC [IPC Coverage]
>> 39.60% [.] __random 2.30 [ 54.8%]
>> 18.02% [.] main 0.43 [ 54.3%]
>> 14.21% [.] compute_flag 2.29 [100.0%]
>> 14.16% [.] rand 0.36 [100.0%]
>> 7.06% [.] __random_r 2.57 [ 70.5%]
>> 6.85% [.] rand@plt 0.00 [ 0.0%]
>> ...
>>
>> $ perf annotate --stdio2
>>
>> Percent IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)
>>
>> Disassembly of section .text:
>>
>> 000000000003aac0 <random@@GLIBC_2.2.5>:
>> 8.32 3.28 sub $0x18,%rsp
>> 3.28 mov $0x1,%esi
>> 3.28 xor %eax,%eax
>> 3.28 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
>> 11.57 3.28 1 ↓ je 20
>> lock cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
>> ↓ jne 29
>> ↓ jmp 43
>> 11.57 1.10 20: cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
>> ...
>>
>> v4:
>> ---
>> Add a new patch "perf report: Documentation average IPC and IPC coverage"
>> to explain the new columns "IPC" and "IPC Coverage".
>>
>> Other patches are not changed.
>>
>> v3:
>> ---
>> Remove the sortkey "ipc" from command-line. The columns "IPC"
>> and "[IPC Coverage]" are automatically enabled when "symbol"
>> is specified.
>>
>> Patch "perf report: Display average IPC and IPC coverage per symbol"
>> is impacted.
>>
>> v2:
>> ---
>> 1. Merge in Jiri's patch to support stdio mode
>>
>> 2. Add a new patch "perf annotate: Create a annotate2 flag
>> in struct symbol" which records if the symbol has been
>> annotated yet.
>>
>> 3. Minor update such as adding { } for multiline code in 'if'
>> condition.
>>
>> Jin Yao (4):
>> perf annotate: Compute average IPC and IPC coverage per symbol
>> perf annotate: Create a annotate2 flag in struct symbol
>> perf report: Display average IPC and IPC coverage per symbol
>> perf report: Documentation average IPC and IPC coverage
>
> Reviewed-by: Jiri Olsa <jolsa@kernel.org>
>
> thanks,
> jirka
>
Thanks so much for reviewing the patches.
I will post v5 which includes some minor updates for documentation.
Thanks
Jin Yao
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-11-30 0:26 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-29 15:39 [PATCH v4 0/4] perf report/annotate: Support average IPC and IPC coverage for function Jin Yao
2018-11-29 11:33 ` Jiri Olsa
2018-11-30 0:26 ` Jin, Yao
2018-11-29 15:39 ` [PATCH v4 1/4] perf annotate: Compute average IPC and IPC coverage per symbol Jin Yao
2018-11-29 15:39 ` [PATCH v4 2/4] perf annotate: Create a annotate2 flag in struct symbol Jin Yao
2018-11-29 15:39 ` [PATCH v4 3/4] perf report: Display average IPC and IPC coverage per symbol Jin Yao
2018-11-29 15:39 ` [PATCH v4 4/4] perf report: Documentation average IPC and IPC coverage Jin Yao
2018-11-29 13:27 ` Ingo Molnar
2018-11-30 0:22 ` Jin, Yao
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.