* Cycles annotation support for perf tools v2
@ 2015-05-27 17:51 Andi Kleen
From: Andi Kleen @ 2015-05-27 17:51 UTC (permalink / raw)
  To: acme; +Cc: jolsa, namhyung, eranian, linux-kernel

[v2: Addressed review comments. Fixed display problems; IPC is now
computed correctly. See patches for detailed changes.]

The upcoming Skylake CPU has a new timed branch stack feature
that reports cycle counts for individual branches in the
last branch record.

This makes it possible to get fine-grained cost information for code,
and also to compute fine-grained IPC.

Available from
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools2

This patchkit adds support for this in the perf tools:
- Basic support for the cycles field like other branch fields
- Show cycles in the standard branch sort view (no IPC here,
  as IPC needs the instruction counts from annotation)
- Annotate cycles and IPC in the assembler annotate view
- Add branch support to top, so we can do live annotation.
- Misc support, like dumping the cycles field in perf report -D

The kernel support has been posted separately. I included a test patch
to generate fake data for testing on existing systems.

Example output for annotate (with made-up numbers):

The second column is the IPC and the third is the average cycles for the basic block.

                   │    static int hex(char ch)
                   │    {
        0.12       │      push   %rbp
        0.12       │      mov    %rsp,%rbp
        0.12       │      sub    $0x20,%rsp
        0.12       │      mov    %edi,%eax
        0.12       │      mov    %al,-0x14(%rbp)
        0.12       │      mov    %fs:0x28,%rax
        0.12       │      mov    %rax,-0x8(%rbp)
        0.12       │      xor    %eax,%eax
                   │            if ((ch >= '0') && (ch <= '9'))
        0.12       │      cmpb   $0x2f,-0x14(%rbp)
 66.67  0.12   123 │    ↓ jle    31
        0.12       │      cmpb   $0x39,-0x14(%rbp)
        0.12   123 │    ↓ jg     31
                   │                    return ch - '0';
 22.22  0.12       │      movsbl -0x14(%rbp),%eax
        0.12       │      sub    $0x30,%eax
        0.12   123 │    ↓ jmp    60
                   │            if ((ch >= 'a') && (ch <= 'f'))
        0.06       │31:   cmpb   $0x60,-0x14(%rbp)
        0.06   123 │    ↓ jle    46
        0.06       │      cmpb   $0x66,-0x14(%rbp)
        0.06       │    ↓ jg     46
                   │                    return ch - 'a' + 10;
        0.06       │      movsbl -0x14(%rbp),%eax
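
As a rough illustration of how the IPC and cycles columns relate, here is a
minimal sketch. It assumes IPC is the number of instructions in a basic block
divided by the average cycle count sampled for that block. The struct and
function names are hypothetical and are not taken from the patches; see the
patches for the real computation and its limitations.

```c
#include <stdint.h>

/*
 * Hypothetical per-block accumulator (illustrative names only,
 * not the data structures used in the patches).
 */
struct block_cycles {
	uint64_t cycles;	/* sum of cycle counts reported for this block */
	uint64_t samples;	/* number of branch records that contributed */
};

/* Account one cycle count reported for the branch that ends the block. */
static void block_add_sample(struct block_cycles *bc, uint16_t cycles)
{
	bc->cycles += cycles;
	bc->samples++;
}

/* Average cycles per execution of the block, or 0 if nothing was sampled. */
static double block_avg_cycles(const struct block_cycles *bc)
{
	return bc->samples ? (double)bc->cycles / (double)bc->samples : 0.0;
}

/* Assumed definition: IPC = instructions in block / average cycles. */
static double block_ipc(const struct block_cycles *bc, unsigned int n_insn)
{
	double avg = block_avg_cycles(bc);

	return avg > 0.0 ? (double)n_insn / avg : 0.0;
}
```

Under that assumption, the 15 instructions from the function entry through
"jmp 60" in the output above, at an average of 123 cycles, give
15 / 123 = 0.12 (rounded), which matches the IPC column for those lines.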

Example output for branch view (again with fake data):

Overhead  Command  Source Shared Object  Source Symbol          Target Symbol          Basic Block Cycles
  30.08%  tcall    tcall                 [.] f1                 [.] f2                 123
  27.44%  tcall    tcall                 [.] f2                 [.] f1                 123
  15.60%  tcall    tcall                 [.] main               [.] f1                 123
  12.96%  tcall    tcall                 [.] f1                 [.] main               123
  12.86%  tcall    tcall                 [.] main               [.] main               123
   0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt  [k] hrtimer_interrupt  123

IPC computation has a few limitations (see the comments in the respective patches);
in particular, it punts on overlapping basic blocks.

The cycles annotation currently works only in the interactive annotate view. It
does not yet work in the scripted perf annotate, as that is missing a lot of the
infrastructure needed for per-instruction state.

It would be nice to add column headers to annotate.

So far there is no support in --branch-history or in perf script.



Thread overview: 22+ messages
2015-05-27 17:51 Cycles annotation support for perf tools v2 Andi Kleen
2015-05-27 17:51 ` [PATCH 01/11] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
2015-06-01 14:16   ` Jiri Olsa
2015-05-27 17:51 ` [PATCH 02/11] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
2015-06-01 14:16   ` Jiri Olsa
2015-05-27 17:51 ` [PATCH 03/11] perf, tools: Add symbol__get_annotation Andi Kleen
2015-05-28  9:32   ` [tip:perf/core] perf annotation: " tip-bot for Andi Kleen
2015-06-01 14:17   ` [PATCH 03/11] perf, tools: " Jiri Olsa
2015-05-27 17:51 ` [PATCH 04/11] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
2015-06-01 14:19   ` Jiri Olsa
2015-05-27 17:51 ` [PATCH 05/11] perf, tools, report: Add processing for cycle histograms Andi Kleen
2015-06-01 14:10   ` Jiri Olsa
2015-05-27 17:51 ` [PATCH 06/11] perf, tools: Compute IPC and basic block cycles for annotate Andi Kleen
2015-05-27 17:51 ` [PATCH 07/11] perf, tools, annotate: Finally display IPC and cycle accounting Andi Kleen
2015-05-27 17:51 ` [PATCH 08/11] perf, tools, report: Move branch option parsing to own file Andi Kleen
2015-05-28  9:32   ` [tip:perf/core] perf tools: " tip-bot for Andi Kleen
2015-06-01 14:20   ` [PATCH 08/11] perf, tools, report: " Jiri Olsa
2015-05-27 17:51 ` [PATCH 09/11] perf, tools, top: Add branch annotation code to top Andi Kleen
2015-05-27 17:51 ` [PATCH 10/11] perf, tools, report: Display cycles in branch sort mode Andi Kleen
2015-05-27 17:51 ` [PATCH 11/11] test patch: Add fake branch cycles to input data in report/top Andi Kleen
2015-06-01 14:21 ` Cycles annotation support for perf tools v2 Jiri Olsa
2015-06-01 14:43   ` Arnaldo Carvalho de Melo
