linux-kernel.vger.kernel.org archive mirror
From: Andi Kleen <andi@firstfloor.org>
To: acme@kernel.org
Cc: jolsa@kernel.org, linux-kernel@vger.kernel.org, namhyung@kernel.org
Subject: Cycles annotation support for perf tools v3
Date: Sat, 18 Jul 2015 08:24:45 -0700
Message-ID: <1437233094-12844-1-git-send-email-andi@firstfloor.org>

[v2: Addressed review comments. Fixed display problems and 
correctly compute IPC now. See patches for detailed changes.]
[v3: Merged with current Arnaldo perf/core and added acked-by.]

[Note: the corresponding kernel patches to report cycles are in
peterz's perf/core queue, but not yet in tip. The patchkit can
still be tested with the "fake cycles" debug patch added at
the end.]

The upcoming Skylake CPU has a new timed branch stack feature
that reports cycle counts for individual branches in the
last branch record (LBR).

This provides fine-grained cost information for code and also makes it
possible to compute fine-grained IPC (instructions per cycle).
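
As a rough illustration of how per-branch cycle counts can yield per-block IPC, here is a minimal sketch. It assumes each branch record carries the cycles elapsed since the previous recorded branch, so the code between the previous branch's target and the current branch's source forms one basic block. The `BranchEntry` layout and the function names are hypothetical and are not taken from the patches.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BranchEntry:
    """One branch-stack entry (hypothetical layout, for illustration only)."""
    from_addr: int  # address of the branch instruction
    to_addr: int    # address of the branch target
    cycles: int     # cycles elapsed since the previous recorded branch


def basic_blocks(entries: List[BranchEntry]) -> List[Tuple[int, int, int]]:
    """Return (start, end, cycles) per block, given entries oldest first.

    A block starts at the previous branch's target and ends at the
    current branch's source. The oldest entry has no known start, so
    it never produces a block.
    """
    blocks = []
    for prev, cur in zip(entries, entries[1:]):
        # Skip entries without a cycle count or with an inverted range.
        if cur.cycles > 0 and prev.to_addr <= cur.from_addr:
            blocks.append((prev.to_addr, cur.from_addr, cur.cycles))
    return blocks


def block_ipc(start: int, end: int, cycles: int,
              insn_addrs: List[int]) -> float:
    """IPC = instructions located in [start, end] / cycles for the block."""
    count = sum(1 for addr in insn_addrs if start <= addr <= end)
    return count / cycles


# Four instructions in the block 0x10..0x1c, executed in 8 cycles.
insns = [0x10, 0x14, 0x18, 0x1C]
trace = [BranchEntry(0x100, 0x10, 0), BranchEntry(0x1C, 0x200, 8)]
for start, end, cycles in basic_blocks(trace):
    print(hex(start), hex(end), block_ipc(start, end, cycles, insns))
```

The instruction addresses have to come from disassembly, which is why IPC can only be shown where annotation data is available.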

Available from
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools3

This patchkit adds support for this in the perf tools:
- Basic support for the cycles field like other branch fields
- Show cycles in the standard branch sort view (no IPC here,
  as IPC needs the instruction counts from annotation)
- Annotate cycles and IPC in the assembler annotate view
- Add branch support to top, so we can do live annotation.
- Misc support, like dumping it in perf report -D

Example output for annotate (with made-up numbers):

The second column is the IPC and the third is the average cycles for the basic block.

                   │    static int hex(char ch)
                   │    {
        0.12       │      push   %rbp
        0.12       │      mov    %rsp,%rbp
        0.12       │      sub    $0x20,%rsp
        0.12       │      mov    %edi,%eax
        0.12       │      mov    %al,-0x14(%rbp)
        0.12       │      mov    %fs:0x28,%rax
        0.12       │      mov    %rax,-0x8(%rbp)
        0.12       │      xor    %eax,%eax
                   │            if ((ch >= '0') && (ch <= '9'))
        0.12       │      cmpb   $0x2f,-0x14(%rbp)
 66.67  0.12   123 │    ↓ jle    31
        0.12       │      cmpb   $0x39,-0x14(%rbp)
        0.12   123 │    ↓ jg     31
                   │                    return ch - '0';
 22.22  0.12       │      movsbl -0x14(%rbp),%eax
        0.12       │      sub    $0x30,%eax
        0.12   123 │    ↓ jmp    60
                   │            if ((ch >= 'a') && (ch <= 'f'))
        0.06       │31:   cmpb   $0x60,-0x14(%rbp)
        0.06   123 │    ↓ jle    46
        0.06       │      cmpb   $0x66,-0x14(%rbp)
        0.06       │    ↓ jg     46
                   │                    return ch - 'a' + 10;
        0.06       │      movsbl -0x14(%rbp),%eax

Example output for branch view (again with fake data):

Overhead  Command  Source Shared Object  Source Symbol                               Target Symbol                               Basic Block Cycles
  30.08%  tcall    tcall                 [.] f1                                      [.] f2                                      123
  27.44%  tcall    tcall                 [.] f2                                      [.] f1                                      123
  15.60%  tcall    tcall                 [.] main                                    [.] f1                                      123
  12.96%  tcall    tcall                 [.] f1                                      [.] main                                    123
  12.86%  tcall    tcall                 [.] main                                    [.] main                                    123
   0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt                       [k] hrtimer_interrupt                       123
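
To make the shape of this table concrete, the following sketch builds similar rows from a list of branch samples. It assumes that each row is keyed by source symbol, target symbol and cycle count, and that overhead is the row's share of all samples. Both are assumptions made for illustration, and `branch_view` is a hypothetical helper, not a perf function.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Sample = Tuple[str, str, int]  # (source symbol, target symbol, cycles)


def branch_view(samples: Iterable[Sample]) -> List[Tuple[float, str, str, int]]:
    """Group samples by (source, target, cycles) and compute overhead.

    Returns rows of (overhead_percent, source, target, cycles), sorted
    by overhead in descending order.
    """
    counts = Counter(samples)
    total = sum(counts.values())
    rows = [
        (100.0 * n / total, src, dst, cycles)
        for (src, dst, cycles), n in counts.items()
    ]
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


samples = [("f1", "f2", 123)] * 3 + [("f2", "f1", 123)] * 1
for overhead, src, dst, cycles in branch_view(samples):
    print(f"{overhead:6.2f}%  {src:<8} {dst:<8} {cycles}")
```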

IPC computation has a few limitations (see the comments in the respective patches);
in particular, it punts on overlapping basic blocks.
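
To illustrate what overlapping basic blocks means here: two samples can end at the same branch but start at different addresses, because execution entered the same code at different points. The sketch below shows one possible way to punt, keeping only the longest block seen for each end address and dropping data from shorter ones. This policy and the name `accumulate_blocks` are assumptions for illustration; the patches' own comments describe what the code really does.

```python
from typing import Dict, Iterable, Tuple

Block = Tuple[int, int, int]  # (start address, end address, cycles)


def accumulate_blocks(blocks: Iterable[Block]) -> Dict[int, Tuple[int, int, int]]:
    """Accumulate cycles per block end address.

    Returns {end: (start, total_cycles, num_samples)}. When blocks that
    end at the same address have different starts, only the longest one
    (lowest start) is kept; shorter overlapping blocks are ignored.
    """
    acc: Dict[int, Tuple[int, int, int]] = {}
    for start, end, cycles in blocks:
        cur = acc.get(end)
        if cur is None or start < cur[0]:
            # First block for this end, or a longer one: restart the count.
            acc[end] = (start, cycles, 1)
        elif start == cur[0]:
            acc[end] = (start, cur[1] + cycles, cur[2] + 1)
        # Otherwise the block is shorter and overlaps: skip it.
    return acc


blocks = [(0x20, 0x40, 10), (0x10, 0x40, 30), (0x10, 0x40, 50), (0x20, 0x40, 12)]
start, total, num = accumulate_blocks(blocks)[0x40]
print(hex(start), total / num)  # average cycles for the kept block
```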

The cycles annotation currently works only in the interactive annotate view.
It does not work in the scripted perf annotate, which is missing much of the
infrastructure needed for per-instruction state.

It would be nice to add column headers to annotate.

So far there is no support in --branch-history or in perf script.



Thread overview: 21+ messages
2015-07-18 15:24 Andi Kleen [this message]
2015-07-18 15:24 ` [PATCH 1/9] perf, tools: Add tools support for cycles, weight branch_info field Andi Kleen
2015-08-07  7:19   ` [tip:perf/core] perf tools: Add " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 2/9] perf, tools, report: Add flag for non ANY branch mode Andi Kleen
2015-08-07  7:19   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 3/9] perf, tools, report: Add infrastructure for a cycles histogram Andi Kleen
2015-08-07  7:20   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 4/9] perf, tools, report: Add processing for cycle histograms Andi Kleen
2015-08-07  7:20   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 5/9] perf, tools: Compute IPC and basic block cycles for annotate Andi Kleen
2015-08-07  7:20   ` [tip:perf/core] perf annotate: Compute IPC and basic block cycles tip-bot for Andi Kleen
2016-06-30  8:53   ` [PATCH 5/9] perf, tools: Compute IPC and basic block cycles for annotate Peter Zijlstra
2016-07-02 20:38     ` Andi Kleen
2015-07-18 15:24 ` [PATCH 6/9] perf, tools, annotate: Finally display IPC and cycle accounting Andi Kleen
2015-08-07  7:21   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 7/9] perf, tools, top: Add branch annotation code to top Andi Kleen
2015-08-07  7:21   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 8/9] perf, tools, report: Display cycles in branch sort mode Andi Kleen
2015-08-07  7:21   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2015-07-18 15:24 ` [PATCH 9/9] test patch: Add fake branch cycles to input data in report/top Andi Kleen
2015-08-06 19:44 ` Cycles annotation support for perf tools v3 Arnaldo Carvalho de Melo
