From: Jiri Olsa
To: Arnaldo Carvalho de Melo
Cc: Andi Kleen, Ulrich Drepper, Will Deacon, Stephane Eranian, lkml,
    David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra
Subject: [RFC 00/47] perf stat: Add scripting support
Date: Tue, 21 Jul 2015 14:31:20 +0200
Message-Id: <1437481927-29538-1-git-send-email-jolsa@kernel.org>

hi,
sending RFC on another attempt for stat scripting.

The initial attempt defined its own formula language and allowed
triggering a user's script at the end of the stat command:
  http://marc.info/?l=linux-kernel&m=136742146322273&w=2

This patchset abandons the idea of a new formula language
and instead adds support to:
  - store stat data into the perf.data file
  - process stat events with python scripts

Basically it allows storing stat data in perf.data and
post-processing it with python scripts, in a similar way
to what we do for sampling data.

Examples:

To record stat data for a command workload:

  $ perf stat record kill
  ...

   Performance counter stats for 'kill':

          0.372007      task-clock (msec)         #    0.613 CPUs utilized
                 3      context-switches          #    0.008 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                62      page-faults               #    0.167 M/sec
         1,129,973      cycles                    #    3.038 GHz
                        stalled-cycles-frontend
                        stalled-cycles-backend
           813,313      instructions              #    0.72  insns per cycle
           166,161      branches                  #  446.661 M/sec
             8,747      branch-misses             #    5.26% of all branches

       0.000607287 seconds time elapsed

To report perf stat data:

  $ perf stat report

   Performance counter stats for '/home/jolsa/bin/perf stat record kill':

          0.372007      task-clock (msec)         #      inf CPUs utilized
                 3      context-switches          #    0.008 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                62      page-faults               #    0.167 M/sec
         1,129,973      cycles                    #    3.038 GHz
                        stalled-cycles-frontend
                        stalled-cycles-backend
           813,313      instructions              #    0.72  insns per cycle
           166,161      branches                  #  446.661 M/sec
             8,747      branch-misses             #    5.26% of all branches

       0.000000000 seconds time elapsed

To store system-wide period stat data:

  $ perf stat -e cycles:u,instructions:u -a -I 1000 record
  #           time             counts unit events
       1.000265471        462,311,482      cycles:u                 (100.00%)
       1.000265471        590,037,440      instructions:u
       2.000483453        722,532,336      cycles:u                 (100.00%)
       2.000483453        848,678,197      instructions:u
       3.000759876         75,990,880      cycles:u                 (100.00%)
       3.000759876         86,187,813      instructions:u
  ^C   3.213960893         85,329,533      cycles:u                 (100.00%)
       3.213960893        135,954,296      instructions:u

To report perf stat data:

  $ perf stat report
  #           time             counts unit events
       1.000265471        462,311,482      cycles:u                 (100.00%)
       1.000265471        590,037,440      instructions:u
       2.000483453        722,532,336      cycles:u                 (100.00%)
       2.000483453        848,678,197      instructions:u
       3.000759876         75,990,880      cycles:u                 (100.00%)
       3.000759876         86,187,813      instructions:u
       3.213960893         85,329,533      cycles:u                 (100.00%)
       3.213960893        135,954,296      instructions:u

To run the stat-cpi.py script over perf.data:

  $ perf script -s scripts/python/stat-cpi.py
         1.000265: cpu -1, thread -1 -> cpi 0.783529 (462311482/590037440)
         2.000483: cpu -1, thread -1 -> cpi 0.851362 (722532336/848678197)
         3.000760: cpu -1, thread -1 -> cpi 0.881689 (75990880/86187813)
         3.213961: cpu -1, thread -1 -> cpi 0.627634 (85329533/135954296)

To pipe data from stat to the stat-cpi script:

  $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf script -s scripts/python/stat-cpi.py
         1.000192: cpu 0, thread -1 -> cpi 0.739535 (23921908/32347236)
         2.000376: cpu 0, thread -1 -> cpi 1.663482 (2519340/1514498)
         3.000621: cpu 0, thread -1 -> cpi 1.396308 (16162767/11575362)
         4.000700: cpu 0, thread -1 -> cpi 1.092246 (20077258/18381624)
         5.000867: cpu 0, thread -1 -> cpi 0.473816 (45157586/95306156)
         6.001034: cpu 0, thread -1 -> cpi 0.532792 (43701668/82023818)
         7.001195: cpu 0, thread -1 -> cpi 1.122059 (29890042/26638561)

Raw script stat data output:

  $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf --no-pager script
  CPU  THREAD       VAL         ENA         RUN        TIME  EVENT
    0      -1  12302059  1000811347  1000810712  1000198821  cycles:u
    0      -1   2565362  1000823218  1000823218  1000198821  instructions:u
    0      -1  14453353  1000812704  1000812704  2000382283  cycles:u
    0      -1   4600932  1000799342  1000799342  2000382283  instructions:u
    0      -1  15245106  1000774425  1000774425  3000538255  cycles:u
    0      -1   2624324  1000769310  1000769310  3000538255  instructions:u

The stat data are stored in new stat, stat-round, stat-config user events.
  stat        - stored for each read syscall of the counter
  stat round  - stored for each interval or end of the command invocation
  stat config - stores all the config information needed to process data
                so the report tool can restore the same output as record

The python script can now define 'stat___' functions to get
stat events data and 'stat__interval' to get stat-round data.

See the CPI script example in scripts/python/stat-cpi.py.

Also available in:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/stat_script

This patchset still has MANY rough edges and loose ends, but I think
it's a better approach than defining our own formula scripting language.
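
For illustration only, here is a minimal standalone sketch of the CPI arithmetic (cycles divided by instructions per interval), applied to the raw script stat output shown above. It is not the stat-cpi.py from this patchset and does not use the perf scripting callbacks; the names parse_rows and cpi_per_interval are made up for this example, and it assumes the seven-column CPU/THREAD/VAL/ENA/RUN/TIME/EVENT layout from the raw output.

```python
from collections import defaultdict

# Rows copied from the raw 'perf --no-pager script' example in this mail.
# Columns: CPU THREAD VAL ENA RUN TIME EVENT
RAW = """\
CPU THREAD VAL ENA RUN TIME EVENT
0 -1 12302059 1000811347 1000810712 1000198821 cycles:u
0 -1 2565362 1000823218 1000823218 1000198821 instructions:u
0 -1 14453353 1000812704 1000812704 2000382283 cycles:u
0 -1 4600932 1000799342 1000799342 2000382283 instructions:u
0 -1 15245106 1000774425 1000774425 3000538255 cycles:u
0 -1 2624324 1000769310 1000769310 3000538255 instructions:u
"""


def parse_rows(text):
    """Yield (cpu, thread, val, time_ns, event) for each data row."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[0] == "CPU":
            continue  # skip blank lines and the header row
        cpu, thread, val, _ena, _run, time_ns, event = fields
        yield int(cpu), int(thread), int(val), int(time_ns), event


def cpi_per_interval(text):
    """Return {(time_ns, cpu, thread): cpi}, where cpi = cycles / instructions."""
    counts = defaultdict(dict)
    for cpu, thread, val, time_ns, event in parse_rows(text):
        name = event.split(":")[0]  # drop the modifier: 'cycles:u' -> 'cycles'
        counts[(time_ns, cpu, thread)][name] = val

    result = {}
    for key, events in counts.items():
        cyc = events.get("cycles", 0)
        ins = events.get("instructions", 0)
        # Report 0.0 when no instructions were counted, to avoid dividing by zero.
        result[key] = cyc / ins if ins else 0.0
    return result


if __name__ == "__main__":
    for (time_ns, cpu, thread), cpi in sorted(cpi_per_interval(RAW).items()):
        print("%f: cpu %d, thread %d -> cpi %f" % (time_ns / 1e9, cpu, thread, cpi))
```

The sketch groups counts by (time, cpu, thread) so that the cycles and instructions values read in the same interval are divided against each other, which is the same pairing the cpi lines in the examples show in parentheses.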

The patchset is growing and there's still a long way to go,
so I'd like to hear some opinions before I go further ;-)

thanks for comments,
jirka


Cc: Andi Kleen
Cc: Ulrich Drepper
Cc: Will Deacon
Cc: Stephane Eranian
---
Jiri Olsa (47):
  perf test: Check for refcnt in thread_map test
  perf stat: Introduce struct perf_stat_config
  perf stat: Move scale into struct perf_stat_config
  perf stat: Move output into struct perf_stat_config
  perf stat: Move interval into struct perf_stat_config
  perf stat: Pass struct perf_stat_config into process_counter
  perf stat: Move counter processing code into stat object
  perf tools: Use bool instead of target argument in perf_evlist__propagate_maps
  perf tools: Tolerate NULL maps in perf_evlist__propagate_maps
  perf tools: Force perf_evlist__set_maps to propagate maps through events
  perf tools: Use argv style storage for cmdline feature data
  perf tools: Add thread_map event
  perf tools: Add thread_map event sythesize function
  perf tools: Add thread_map__new_event function
  perf tools: Add cpu_map event
  perf tools: Add cpu_map event synthesize function
  perf tools: Add cpu_map__new_event function
  perf tools: Add stat config event
  perf tools: Add stat config event synthesize function
  perf tools: Add stat config event read function
  perf tools: Add stat event
  perf tools: Add stat event synthesize function
  perf tools: Add stat event read function
  perf tools: Add stat round event
  perf tools: Add stat round event synthesize function
  perf tools: Introduce stat feature
  perf tools: Move id_offset out of struct perf_evsel union
  perf stat record: Add record command
  perf stat record: Initialize record features
  perf stat record: Synthesize stat record data
  perf stat record: Store events IDs in perf data file
  perf stat record: Add pipe support for record command
  perf stat record: Write stat events on record
  perf stat record: Write stat round events on record
  perf stat report: Add report command
  perf stat report: Process cpu/threads maps
  perf stat report: Process stat config event
  perf stat report: Process stat and stat round events
  perf stat report: Move csv_sep initialization before report command
  perf script: Check output fields only for samples
  perf script: Process cpu/threads maps
  perf script: Process stat config event
  perf script: Add process_stat/process_stat_interval scripting interface
  perf script: Add stat default handlers
  perf script: Display stat events by default
  perf script: Add python support for stat events
  perf script: Add stat-cpi.py script

 tools/perf/Documentation/perf-stat.txt             |  24 +
 tools/perf/builtin-record.c                        |   2 +
 tools/perf/builtin-script.c                        | 143 ++++-
 tools/perf/builtin-stat.c                          | 579 +++++++++++++++------
 tools/perf/scripts/python/stat-cpi.py              |  74 +++
 tools/perf/tests/Build                             |   2 +
 tools/perf/tests/builtin-test.c                    |  20 +
 tools/perf/tests/cpumap.c                          |  39 ++
 tools/perf/tests/stat.c                            | 108 ++++
 tools/perf/tests/tests.h                           |   6 +
 tools/perf/tests/thread-map.c                      |  47 ++
 tools/perf/util/cpumap.c                           |  27 +
 tools/perf/util/cpumap.h                           |   3 +
 tools/perf/util/event.c                            | 170 ++++++
 tools/perf/util/event.h                            |  94 +++-
 tools/perf/util/evlist.c                           |  34 +-
 tools/perf/util/evlist.h                           |  14 +-
 tools/perf/util/evsel.h                            |   2 +-
 tools/perf/util/header.c                           |  49 +-
 tools/perf/util/header.h                           |   2 +
 .../util/scripting-engines/trace-event-python.c    | 113 +++-
 tools/perf/util/session.c                          | 123 +++++
 tools/perf/util/stat.c                             | 162 ++++++
 tools/perf/util/stat.h                             |  16 +
 tools/perf/util/thread_map.c                       |  27 +
 tools/perf/util/thread_map.h                       |   3 +
 tools/perf/util/tool.h                             |   7 +-
 tools/perf/util/trace-event.h                      |   4 +
 28 files changed, 1686 insertions(+), 208 deletions(-)
 create mode 100644 tools/perf/scripts/python/stat-cpi.py
 create mode 100644 tools/perf/tests/cpumap.c
 create mode 100644 tools/perf/tests/stat.c