* [PATCH v2 00/14] perf: Stream comparison
@ 2020-03-13  7:11 Jin Yao
From: Jin Yao @ 2020-03-13  7:11 UTC
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Sometimes a small change in a hot function reduces the cycles spent in
that function, but the overall workload doesn't get faster. It is then
interesting to find out where the cycles have moved to.

What we would like is to diff the before/after streams. We define a
stream as a callchain aggregated from the branch records of the samples.

By browsing the hot streams, we can understand the hot code flow.
By comparing how the cycles of the same stream vary between the old
perf data and the new perf data, we can understand whether the cycles
have moved to unchanged code.

The before stream is the stream before the source code was changed
(e.g. streams in perf.data.old). The after stream is the stream
after the source code was changed (e.g. streams in perf.data).

Diffing before/after streams compares all streams (or compares top
N hot streams) between two perf data files.

If all entries of one stream in perf.data.old fully match all entries
of another stream in perf.data, we consider the two streams matched;
otherwise they are not matched.

For example,

   cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
--------------------------              --------------------------
             main div.c:39                           main div.c:39
             main div.c:44                           main div.c:44

Here the two streams are matched. For the same stream the cycles are
equal and the callchain hit percentages differ only slightly, which is
within the expected normal range.
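
As a rough illustration of the matching rule above (not the code from
this series), the sketch below treats a stream as an ordered tuple of
callchain entries and calls two streams matched only when every entry
is equal. The Entry and Stream types and streams_match() are
hypothetical names; the sample values are taken from the outputs in
this letter.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One callchain entry (hypothetical layout): function and 'file:line'."""
    func: str
    srcline: str


class Stream(NamedTuple):
    cycles: int
    hits_pct: float
    entries: tuple  # ordered tuple of Entry


def streams_match(old: Stream, new: Stream) -> bool:
    """Two streams match only if all entries are equal, in the same order."""
    return old.entries == new.entries


old = Stream(1, 26.80, (Entry("main", "div.c:39"), Entry("main", "div.c:44")))
new = Stream(1, 27.30, (Entry("main", "div.c:39"), Entry("main", "div.c:44")))
other = Stream(2, 4.08, (Entry("main", "div.c:42"),
                         Entry("compute_flag", "div.c:28")))

print(streams_match(old, new))    # same entries on both sides
print(streams_match(old, other))  # different entries
```

Cycles and hit percentages are deliberately left out of the comparison:
they are the values being diffed, not part of a stream's identity.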

But that does not always hold if the source code behind perf.data has
changed (e.g. source line div.c:39 is changed). If a source line is
changed, the two streams are different and cannot be compared, so we
treat the stream in perf.data as a new stream.

The challenge is how to identify the changed source lines. The basic
idea is to use the Linux "diff" command to compare source file A with
source file A* line by line (assume A was used for perf.data.old
and A* is the updated file used for perf.data). From the "diff" output,
we can create a source line mapping table.

For example,

  Execute 'diff ./before/div.c ./after/div.c'

  25c25
  <       i = rand() % 2;
  ---
  >       i = rand() % 4;
  39c39
  <       for (i = 0; i < 2000000000; i++) {
  ---
  >       for (i = 0; i < 20000000001; i++) {

  div.c (after -> before) lines mapping:
  0 -> 0
  1 -> 1
  2 -> 2
  3 -> 3
  4 -> 4
  5 -> 5
  6 -> 6
  7 -> 7
  8 -> 8
  9 -> 9
  ...
  24 -> 24
  25 -> -1
  26 -> 26
  27 -> 27
  28 -> 28
  29 -> 29
  30 -> 30
  31 -> 31
  32 -> 32
  33 -> 33
  34 -> 34
  35 -> 35
  36 -> 36
  37 -> 37
  38 -> 38
  39 -> -1
  40 -> 40
  ...

From the table, we can easily see that source line div.c:39 has changed
(it is mapped to -1). So these two streams are not matched.
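
The mapping table can be sketched in a few lines. This is only an
illustration under stated assumptions: it uses Python's difflib
instead of running "diff" and parsing its output as the series does,
it numbers lines from 1, and build_line_mapping() and line_changed()
are hypothetical names. Only the two rand() lines come from the diff
above; the surrounding lines are made up.

```python
import difflib


def build_line_mapping(before_lines, after_lines):
    """Map each 1-based 'after' line number to its 'before' line, or -1."""
    mapping = {}
    matcher = difflib.SequenceMatcher(None, before_lines, after_lines,
                                      autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        for offset, j in enumerate(range(j1, j2)):
            # Only lines inside an 'equal' run have a counterpart in
            # 'before'; replaced or inserted lines map to -1.
            mapping[j + 1] = i1 + offset + 1 if tag == "equal" else -1
    return mapping


def line_changed(mapping, line):
    """A line is treated as changed if it is unknown or maps to -1."""
    return mapping.get(line, -1) == -1


before = ["int i;", "i = rand() % 2;", "return i;"]
after = ["int i;", "i = rand() % 4;", "return i;"]

mapping = build_line_mapping(before, after)
for line in sorted(mapping):
    print(f"{line} -> {mapping[line]}")
```

A stream whose entries include a line for which line_changed() is true
would then be reported as changed rather than matched.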

Besides the hot stream comparison, this patch series also supports
comparing the top N hottest blocks.

It is also useful to find the top N hottest blocks in the old perf
data file and the top N hottest blocks in the new perf data file,
and then compare their cycles. This lets us easily see how many cycles
have moved from one block to another.

Now let's see examples.

perf record -b ...      Generate perf.data.old with branch data
perf record -b ...      Generate perf.data with branch data
perf diff --stream --percent-limit 2

[ Matched hot chains between old perf data and new perf data) ]

hot chain pair 1:
            cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
        ---------------------------              --------------------------
                      main div.c:39                           main div.c:39
                      main div.c:44                           main div.c:44

hot chain pair 2:
           cycles: 35, hits: 21.43%                cycles: 33, hits: 19.37%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40
                      main div.c:40                           main div.c:40
                      main div.c:39                           main div.c:39

hot chain pair 3:
            cycles: 18, hits: 6.10%                 cycles: 19, hits: 6.51%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40

hot chain pair 4:
             cycles: 9, hits: 5.95%                  cycles: 8, hits: 5.03%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380

[ Hot chains in old perf data but source line changed (*) in new perf data ]

[ Hot chains in old perf data only ]

hot chain 1:
             cycles: 2, hits: 4.08%
         --------------------------
                      main div.c:42
              compute_flag div.c:28

[ Hot chains in new perf data only ]

hot chain 1:
                                                    cycles: 36, hits: 3.36%
                                                 --------------------------
                                                  __random_r random_r.c:357
                                                      __random random.c:293
                                                      __random random.c:293
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:288
                                                             rand rand.c:27
                                                             rand rand.c:26
                                                                   rand@plt
                                                                   rand@plt
                                                      compute_flag div.c:25
                                                      compute_flag div.c:22
                                                              main div.c:40
                                                              main div.c:40

The rightmost columns, such as '[Program Block Range]' and 'Shared Object', are omitted here to save space.

# Output based on old perf data:
#
# Sampled Cycles%  Avg Cycles  New Stream Diff(cycles%,cycles)  New Stream Sampled Cycles%  New Stream Avg Cycles
# ...............  ..........  ...............................  ..........................  .....................
#
           25.20%          18                     -0.36%,   -1                           -                      -
           15.24%           7                     -0.45%,    0                           -                      -
            5.07%           2                      0.09%,    0                           -                      -
            4.84%           2                      0.26%,    0                           -                      -
            4.72%           2                      0.30%,    0                           -                      -
            3.91%           1                      0.29%,    0                           -                      -
            3.05%           1                      0.11%,    0                           -                      -
            2.90%           1                      0.08%,    0                           -                      -
            2.71%           1                     -0.11%,    0                           -                      -
            2.44%           1                      0.09%,    0                           -                      -
            2.35%           1                     -0.09%,    0                           -                      -
            2.27%           1                      0.15%,    0                           -                      -
            2.27%           1                      0.06%,    0                           -                      -
            2.17%           1                      0.09%,    0                           -                      -
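
The diff column can be reproduced with simple arithmetic. The sketch
below is an assumption about how the figures relate, not code from the
series: Block, top_n() and block_diff() are hypothetical names, and
the block keys stand in for the omitted block ranges. The old values
are the first two rows of the table above; the new values are derived
by adding the reported diff to them.

```python
from typing import NamedTuple


class Block(NamedTuple):
    cycles_pct: float  # "Sampled Cycles%" column
    avg_cycles: int    # "Avg Cycles" column


def top_n(blocks, n):
    """Return the keys of the n hottest blocks, hottest first."""
    return sorted(blocks, key=lambda key: blocks[key].cycles_pct,
                  reverse=True)[:n]


def block_diff(old, new):
    """Return (cycles% diff, avg cycles diff) for a block, new minus old."""
    return (round(new.cycles_pct - old.cycles_pct, 2),
            new.avg_cycles - old.avg_cycles)


# Keys are hypothetical placeholders for the omitted block ranges.
old_blocks = {"block-a": Block(25.20, 18), "block-b": Block(15.24, 7)}
new_blocks = {"block-a": Block(24.84, 17), "block-b": Block(14.79, 7)}

for key in top_n(old_blocks, 2):
    pct, cycles = block_diff(old_blocks[key], new_blocks[key])
    print(f"{key}: {pct:+.2f}%, {cycles:+d}")
```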

If we enable the source line comparison, the output might be different.

perf diff --stream --before ./before --after ./after

[ Matched hot chains between old perf data and new perf data) ]

hot chain pair 1:
            cycles: 18, hits: 6.10%                 cycles: 19, hits: 6.51%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40

hot chain pair 2:
             cycles: 9, hits: 5.95%                  cycles: 8, hits: 5.03%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380

[ Hot chains in old perf data but source line changed (*) in new perf data ]

hot chain pair 1:
            cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
        ---------------------------              --------------------------
                      main div.c:39                           main div.c:39*
                      main div.c:44                           main div.c:44

hot chain pair 2:
           cycles: 35, hits: 21.43%                cycles: 33, hits: 19.37%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40
                      main div.c:40                           main div.c:40
                      main div.c:39                           main div.c:39*

[ Hot chains in old perf data only ]

hot chain 1:
             cycles: 2, hits: 4.08%
         --------------------------
                      main div.c:42
              compute_flag div.c:28

[ Hot chains in new perf data only ]

hot chain 1:
                                                    cycles: 36, hits: 3.36%
                                                 --------------------------
                                                  __random_r random_r.c:357
                                                      __random random.c:293
                                                      __random random.c:293
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:288
                                                             rand rand.c:27
                                                             rand rand.c:26
                                                                   rand@plt
                                                                   rand@plt
                                                      compute_flag div.c:25
                                                      compute_flag div.c:22
                                                              main div.c:40
                                                              main div.c:40

# Output based on old perf data:
#
# Sampled Cycles%  Avg Cycles  New Stream Diff(cycles%,cycles)  New Stream Sampled Cycles%  New Stream Avg Cycles
# ...............  ..........  ...............................  ..........................  .....................
#
           25.20%          18    [block changed in new stream]                      24.84%                     17
           15.24%           7                     -0.45%,    0                           -                      -
            5.07%           2                      0.09%,    0                           -                      -
            4.84%           2                      0.26%,    0                           -                      -
            4.72%           2                      0.30%,    0                           -                      -
            3.91%           1                      0.29%,    0                           -                      -
            3.05%           1                      0.11%,    0                           -                      -
            2.90%           1                      0.08%,    0                           -                      -
            2.71%           1                     -0.11%,    0                           -                      -
            2.44%           1                      0.09%,    0                           -                      -
            2.35%           1                     -0.09%,    0                           -                      -
            2.27%           1                      0.15%,    0                           -                      -
            2.27%           1                      0.06%,    0                           -                      -
            2.17%           1                      0.09%,    0                           -                      -

Sometimes a change is not reflected in the source code, e.g. a change
of compiler options. Such changes cannot be found by diffing the
source code lines.

This patch series therefore also introduces a new perf-diff option,
"--changed-func". It passes the names of the changed functions so that
perf-diff knows which functions have changed.

For example,
perf diff --stream --changed-func main --changed-func rand
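
One way to picture the effect of this option, assuming a stream counts
as changed as soon as any of its entries belongs to a listed function
(the patches themselves may apply the list differently), is the sketch
below. touches_changed_func() is a hypothetical name; the entries are
taken from the outputs above.

```python
def touches_changed_func(entries, changed_funcs):
    """True if any (function, srcline) entry belongs to a changed function."""
    return any(func in changed_funcs for func, _srcline in entries)


# From: perf diff --stream --changed-func main --changed-func rand
changed = {"main", "rand"}

main_stream = [("main", "div.c:39"), ("main", "div.c:44")]
random_stream = [("__random_r", "random_r.c:360"),
                 ("__random_r", "random_r.c:388")]

print(touches_changed_func(main_stream, changed))
print(touches_changed_func(random_stream, changed))
```

The comparison is by exact function name, so an entry such as rand@plt
would not be covered by "--changed-func rand" in this sketch.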

NOTE:
-----
1. For the patches:

  perf util: Create source line mapping table
  perf util: Create streams for managing top N hottest callchains
  perf util: Return per-event callchain streams
  perf util: Compare two streams
  perf util: Calculate the sum of all streams hits
  perf util: Report hot streams
  perf diff: Support hot streams comparison

  These patches support the hot stream comparison.

2. For the patches:
  perf util: Add new block info functions for top N hot blocks comparison
  perf util: Add new block info fmts for showing hot blocks comparison
  perf util: Enable block source line comparison
  perf diff: support hot blocks comparison

  These patches support the hot blocks comparison.

3. For the patches:
  perf util: Filter out streams by name of changed functions
  perf util: Filter out blocks by name of changed functions
  perf diff: Filter out streams by changed functions

  These patches support a user-specified function name list, which
  lets perf-diff know that these functions have changed.

 v2:
 ---
 Refine the code for the following patches:
  perf util: Create source line mapping table
  perf util: Create streams for managing top N hottest callchains
  perf util: Calculate the sum of all streams hits
  perf util: Add new block info functions for top N hot blocks comparison 

Jin Yao (14):
  perf util: Create source line mapping table
  perf util: Create streams for managing top N hottest callchains
  perf util: Return per-event callchain streams
  perf util: Compare two streams
  perf util: Calculate the sum of all streams hits
  perf util: Report hot streams
  perf diff: Support hot streams comparison
  perf util: Add new block info functions for top N hot blocks
    comparison
  perf util: Add new block info fmts for showing hot blocks comparison
  perf util: Enable block source line comparison
  perf diff: support hot blocks comparison
  perf util: Filter out streams by name of changed functions
  perf util: Filter out blocks by name of changed functions
  perf diff: Filter out streams by changed functions

 tools/perf/Documentation/perf-diff.txt |  19 +
 tools/perf/builtin-diff.c              | 324 ++++++++++++---
 tools/perf/util/Build                  |   1 +
 tools/perf/util/block-info.c           | 433 ++++++++++++++++++-
 tools/perf/util/block-info.h           |  38 +-
 tools/perf/util/callchain.c            | 517 +++++++++++++++++++++++
 tools/perf/util/callchain.h            |  34 ++
 tools/perf/util/srclist.c              | 555 +++++++++++++++++++++++++
 tools/perf/util/srclist.h              |  74 ++++
 9 files changed, 1935 insertions(+), 60 deletions(-)
 create mode 100644 tools/perf/util/srclist.c
 create mode 100644 tools/perf/util/srclist.h

-- 
2.17.1



