[PATCH RFC 00/10] stat read during perf sampling

From: kan.liang @ 2015-08-18  9:25 UTC
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

This patch series makes it possible to read counter statistics during
sampling. The immediate benefit is that memory bandwidth can be read
from uncore events while a CPU PMU event is sampling. In addition, perf
already supports, or plans to support, a growing number of free-running
counter events (such as freq and power), which would benefit as well.

The patch series includes 10 patches.
  - Patch 1: Fixes a potential bug that occurs when evlist and evsel
    have different CPU maps. It can be merged separately.
  - Patches 2-3: Introduce a new sort option, --socket, so that the
    user can sort perf results by socket. The user can also get a
    per-socket view of the samples from perf report --stdio --socket.
    This feature should be useful for per-socket events.
  - Patches 4-10: Introduce the 'N' event/group modifier. An event with
    this modifier does counting, not sampling. If a group has this
    modifier, only the group leader does sampling. The counter
    statistics are written as a new record type,
    PERF_RECORD_STAT_READ, and stored in perf.data, so perf report can
    present the counter statistics accordingly.
    An alternative way to get counter statistics during sampling would
    be to run perf record and perf stat together from a script. But the
    script approach has various issues, and its output is complex to
    parse. For example, sophisticated bandwidth analysis requires fine
    granularity (10-20ms intervals), while the perf stat interval is
    100ms. It is better to record all the data in perf.data, as this
    patchset does.
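
The sampling rule for the 'N' modifier can be sketched as a small model.
This is only an illustration in Python, not the perf implementation: the
Event and Group classes and the samplers() helper are hypothetical names,
and the model assumes the first event of a group is its leader.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    stat_read: bool = False  # hypothetical flag: event carries the 'N' modifier


@dataclass
class Group:
    events: list = field(default_factory=list)  # first entry is the leader
    stat_read: bool = False  # hypothetical flag: 'N' applied to the whole group


def samplers(groups):
    """Return the names of events that sample; all others only count."""
    out = []
    for group in groups:
        for index, event in enumerate(group.events):
            if event.stat_read:
                continue  # 'N' on the event itself: counting only
            if group.stat_read and index > 0:
                continue  # 'N' on the group: only the leader samples
            out.append(event.name)
    return out


# A standalone event is modelled as a group of one.
record = [
    Group([Event("cycles")]),
    Group([Event("uncore_imc_1/cas_count_read/", stat_read=True)]),
]
print(samplers(record))  # ['cycles']
```

In this model the uncore event is only counted, while cycles keeps
sampling as usual.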

Example:

 #perf record -e 'cycles,uncore_imc_1/cas_count_read/N'
	--stat-read-interval 10 -a ./tchain_edit
 [ perf record: Woken up 520 times to write data ]
 [ perf record: Captured and wrote 1.454 MB perf.data (21328 samples) ]

 $perf report --stdio --socket

 # Samples: 21K of event 'cycles'
 # Event count (approx.): 12073951084
 #

 # Socket: 0
 #
 # Overhead  Command       Shared Object     Symbol
 # ........  ............  ................  .......................................
 #
    97.58%  tchain_edit   tchain_edit       [.] f3
     0.08%  tchain_edit   tchain_edit       [.] f2
     0.05%  swapper       [kernel.vmlinux]  [k] run_timer_softirq
 # Socket: 1
 #
 # Overhead  Command       Shared Object     Symbol
 # ........  ............  ................  .......................................
 #
     0.43%  swapper       [kernel.vmlinux]  [k] acpi_idle_do_entry
     0.24%  kworker/22:1  [kernel.vmlinux]  [k] delay_tsc
     0.17%  perf          [kernel.vmlinux]  [k] smp_call_function_single

 # Socket: 0
 #
 # Performance counter stats:
 # uncore_imc_1/cas_count_read/N  29004
 #
 # Socket: 1
 #
 # Performance counter stats:
 # uncore_imc_1/cas_count_read/N  11350
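
One way to think about the per-socket counter totals is as per-CPU
readings summed by socket. The Python sketch below illustrates that idea
only. It is not how perf report is implemented, the totals_by_socket()
helper is a hypothetical name, and the CPU-to-socket map (CPU 0 on
socket 0, CPU 18 on socket 1) and the readings are assumptions chosen
for illustration.

```python
from collections import defaultdict

# Hypothetical CPU -> socket map; a real tool would take this from the
# system topology.
CPU_TO_SOCKET = {0: 0, 18: 1}


def totals_by_socket(readings, cpu_to_socket):
    """Sum (cpu, value) counter readings per socket."""
    totals = defaultdict(int)
    for cpu, value in readings:
        totals[cpu_to_socket[cpu]] += value
    return dict(totals)


# Illustrative readings as (cpu, value) pairs.
readings = [(0, 29), (18, 15), (0, 25), (18, 12)]
print(totals_by_socket(readings, CPU_TO_SOCKET))  # {0: 54, 1: 27}
```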

 $perf report -D

...
0x3e3a8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0:
value 29 time: 78608435366512
0x3e3c8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 18:
value 15 time: 78608435429055
...
0x3e468 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0:
value 25 time: 78608445379258
0x3e488 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 18:
value 12 time: 78608445423995
...
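
The dumped records can be post-processed into a rate per read interval.
The Python sketch below parses text in the format shown above (the regex
tolerates the line wrap between "CPU n:" and "value") and divides each
value by the time elapsed since the previous record for the same event
and CPU. It rests on assumptions that this cover letter does not state:
that the time field is in nanoseconds, that each value is the count for
one interval rather than a running total, and that one CAS count
corresponds to one 64-byte cache line. The parse() and rates() helpers
are hypothetical names.

```python
import re

# Matches one STAT_READ record, including the wrapped "value ... time:" line.
RECORD = re.compile(
    r"PERF_RECORD_STAT_READ:\s+(?P<event>\S+)\s+CPU\s+(?P<cpu>\d+):\s+"
    r"value\s+(?P<value>\d+)\s+time:\s+(?P<time>\d+)"
)


def parse(dump):
    """Yield (event, cpu, value, time) tuples from 'perf report -D' style text."""
    for m in RECORD.finditer(dump):
        yield m["event"], int(m["cpu"]), int(m["value"]), int(m["time"])


def rates(dump, bytes_per_count=64):
    """Return (cpu, bytes_per_second) for each record after the first per CPU."""
    last_time = {}
    out = []
    for event, cpu, value, time in parse(dump):
        key = (event, cpu)
        if key in last_time:
            seconds = (time - last_time[key]) / 1e9  # assumes ns timestamps
            out.append((cpu, value * bytes_per_count / seconds))
        last_time[key] = time
    return out


dump = """\
0x3e3a8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0:
value 29 time: 78608435366512
0x3e468 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0:
value 25 time: 78608445379258
"""
for cpu, bps in rates(dump):
    print(f"CPU {cpu}: {bps / 1e6:.3f} MB/s")
```

The first record of each CPU only establishes a starting timestamp, so
it produces no rate of its own.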

Kan Liang (10):
  perf,tools: open event on evsel cpus and threads
  perf,tools: Support new sort type --socket
  perf,tools: support option --socket
  perf,tools: Add 'N' event/group modifier
  perf,tools: Enable statistic read for perf record
  perf,tools: New RECORD type PERF_RECORD_STAT_READ
  perf,tools: record counter statistics during sampling
  perf,tools: option to set stat read interval
  perf,tools: don't validate non-sample event
  perf,tools: Show STAT_READ in perf report

 tools/perf/Documentation/perf-list.txt   |   5 ++
 tools/perf/Documentation/perf-record.txt |   7 ++
 tools/perf/Documentation/perf-report.txt |   6 +-
 tools/perf/builtin-diff.c                |   2 +-
 tools/perf/builtin-record.c              | 140 ++++++++++++++++++++++++++++++-
 tools/perf/builtin-report.c              | 108 +++++++++++++++++++++++-
 tools/perf/builtin-top.c                 |   2 +-
 tools/perf/ui/stdio/hist.c               |  14 +++-
 tools/perf/util/cpumap.c                 |  35 ++++++--
 tools/perf/util/cpumap.h                 |   4 +
 tools/perf/util/event.c                  |   1 +
 tools/perf/util/event.h                  |  10 +++
 tools/perf/util/evlist.c                 |   9 ++
 tools/perf/util/evsel.h                  |   1 +
 tools/perf/util/hist.h                   |   3 +-
 tools/perf/util/parse-events.c           |   8 +-
 tools/perf/util/parse-events.l           |   2 +-
 tools/perf/util/session.c                |  15 ++++
 tools/perf/util/sort.c                   |  34 ++++++++
 tools/perf/util/sort.h                   |   1 +
 tools/perf/util/symbol.h                 |   1 +
 tools/perf/util/tool.h                   |   1 +
 22 files changed, 387 insertions(+), 22 deletions(-)

-- 
1.8.3.1
