linux-kernel.vger.kernel.org archive mirror
* Cycles annotation support for perf tools v3
@ 2015-07-18 15:24 Andi Kleen
From: Andi Kleen @ 2015-07-18 15:24 UTC
  To: acme; +Cc: jolsa, linux-kernel, namhyung

[v2: Addressed review comments. Fixed display problems and 
correctly compute IPC now. See patches for detailed changes.]
[v3: Merged with current Arnaldo perf/core and added acked-by.]

[Note: the corresponding kernel patches to report cycles are in
peterz's perf/core queue, but not yet in tip. The patchkit
can still be tested with the "fake cycles" debug patch added at
the end.]

The upcoming Skylake CPU has a new timed branch stack feature
that reports cycle counts for individual branches in the
last branch record.
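
As a rough illustration of what a tool can do with such per-branch cycle
counts, here is a minimal sketch that averages the cycles reported for one
branch. The struct and function names (branch_rec, cycles_stat,
cycles_stat_add, cycles_stat_avg) are hypothetical and are not taken from
the patches; treating a cycle count of 0 as "not available" is likewise an
assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical, simplified view of one timed branch record. */
struct branch_rec {
	uint64_t from;    /* address of the branch instruction */
	uint64_t to;      /* branch target address */
	uint16_t cycles;  /* cycles reported for this branch; 0 = not available (assumed) */
};

/* Hypothetical accumulator for all samples of one branch. */
struct cycles_stat {
	uint64_t total;   /* sum of reported cycles */
	uint64_t samples; /* number of records that carried a cycle count */
};

/* Add one record to the accumulator, skipping records without cycles. */
static void cycles_stat_add(struct cycles_stat *st, const struct branch_rec *br)
{
	if (br->cycles == 0)
		return;
	st->total += br->cycles;
	st->samples++;
}

/* Integer average of the accumulated cycles, or 0 if nothing was recorded. */
static uint64_t cycles_stat_avg(const struct cycles_stat *st)
{
	return st->samples ? st->total / st->samples : 0;
}
```

Feeding every record for the same from/to pair through cycles_stat_add()
yields one average per branch.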

This makes it possible to get fine-grained cost information for code and
to compute fine-grained IPC.
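
The IPC arithmetic itself is simple: instructions in a basic block divided
by the cycles spent in it. A minimal sketch, assuming a hypothetical
per-block summary (bb_stat and its helpers are illustrative names, not
structures from the patches):

```c
#include <stdint.h>

/* Hypothetical per-basic-block summary, not a perf data structure. */
struct bb_stat {
	unsigned int nr_insn;  /* instructions in the block, counted from the disassembly */
	uint64_t cycles_total; /* sum of cycles over all samples of this block */
	uint64_t nr_samples;   /* number of samples that covered this block */
};

/* Average cycles per execution of the block, or 0.0 without samples. */
static double bb_avg_cycles(const struct bb_stat *bb)
{
	return bb->nr_samples ? (double)bb->cycles_total / bb->nr_samples : 0.0;
}

/* Instructions per cycle for the block, or 0.0 if no cycles are known. */
static double bb_ipc(const struct bb_stat *bb)
{
	double avg = bb_avg_cycles(bb);

	return avg > 0.0 ? bb->nr_insn / avg : 0.0;
}
```

For instance, a block of 15 instructions that averages 123 cycles works out
to an IPC of about 0.12.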

Available from
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools3

This patchkit adds support for this in the perf tools:
- Basic support for the cycles field like other branch fields
- Show cycles in the standard branch sort view (no IPC here,
  as IPC needs the instruction counts from annotation)
- Annotate cycles and IPC in the assembler annotate view
- Add branch support to top, so we can do live annotation.
- Misc support, like dumping it in perf report -D

Example output for annotate (with made-up numbers):

The second column is the IPC and the third is the average cycle count for the basic block.

                   │    static int hex(char ch)
                   │    {
        0.12       │      push   %rbp
        0.12       │      mov    %rsp,%rbp
        0.12       │      sub    $0x20,%rsp
        0.12       │      mov    %edi,%eax
        0.12       │      mov    %al,-0x14(%rbp)
        0.12       │      mov    %fs:0x28,%rax
        0.12       │      mov    %rax,-0x8(%rbp)
        0.12       │      xor    %eax,%eax
                   │            if ((ch >= '0') && (ch <= '9'))
        0.12       │      cmpb   $0x2f,-0x14(%rbp)
 66.67  0.12   123 │    ↓ jle    31
        0.12       │      cmpb   $0x39,-0x14(%rbp)
        0.12   123 │    ↓ jg     31
                   │                    return ch - '0';
 22.22  0.12       │      movsbl -0x14(%rbp),%eax
        0.12       │      sub    $0x30,%eax
        0.12   123 │    ↓ jmp    60
                   │            if ((ch >= 'a') && (ch <= 'f'))
        0.06       │31:   cmpb   $0x60,-0x14(%rbp)
        0.06   123 │    ↓ jle    46
        0.06       │      cmpb   $0x66,-0x14(%rbp)
        0.06       │    ↓ jg     46
                   │                    return ch - 'a' + 10;
        0.06       │      movsbl -0x14(%rbp),%eax

Example output for branch view (again with fake data):

Overhead  Command  Source Shared Object  Source Symbol                               Target Symbol                               Basic Block Cycles
  30.08%  tcall    tcall                 [.] f1                                      [.] f2                                      123
  27.44%  tcall    tcall                 [.] f2                                      [.] f1                                      123
  15.60%  tcall    tcall                 [.] main                                    [.] f1                                      123
  12.96%  tcall    tcall                 [.] f1                                      [.] main                                    123
  12.86%  tcall    tcall                 [.] main                                    [.] main                                    123
   0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt                       [k] hrtimer_interrupt                       123

IPC computation has a few limitations (see the comments in the respective patches);
in particular, it punts on overlapping basic blocks.

The cycle and IPC annotation only works in the interactive annotate view. It
currently does not work in the scripted perf annotate, as that is missing a lot
of the infrastructure needed for per-instruction state.

It would be nice to add column headers to annotate.

So far there is no support in --branch-history or in perf script.



Thread overview: 21+ messages
2015-07-18 15:24 Cycles annotation support for perf tools v3 Andi Kleen
2015-07-18 15:24 ` [PATCH 1/9] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
2015-08-07  7:19   ` [tip:perf/core] perf tools: Add " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 2/9] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
2015-08-07  7:19   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 3/9] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
2015-08-07  7:20   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 4/9] perf, tools, report: Add processing for cycle histograms Andi Kleen
2015-08-07  7:20   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 5/9] perf, tools: Compute IPC and basic block cycles for annotate Andi Kleen
2015-08-07  7:20   ` [tip:perf/core] perf annotate: Compute IPC and basic block cycles tip-bot for Andi Kleen
2016-06-30  8:53   ` [PATCH 5/9] perf, tools: Compute IPC and basic block cycles for annotate Peter Zijlstra
2016-07-02 20:38     ` Andi Kleen
2015-07-18 15:24 ` [PATCH 6/9] perf, tools, annotate: Finally display IPC and cycle accounting Andi Kleen
2015-08-07  7:21   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 7/9] perf, tools, top: Add branch annotation code to top Andi Kleen
2015-08-07  7:21   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 8/9] perf, tools, report: Display cycles in branch sort mode Andi Kleen
2015-08-07  7:21   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 9/9] test patch: Add fake branch cycles to input data in report/top Andi Kleen
2015-08-06 19:44 ` Cycles annotation support for perf tools v3 Arnaldo Carvalho de Melo
