From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932490AbbJ0OYK (ORCPT ); Tue, 27 Oct 2015 10:24:10 -0400 Received: from mga03.intel.com ([134.134.136.65]:21762 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752614AbbJ0OYH convert rfc822-to-8bit (ORCPT ); Tue, 27 Oct 2015 10:24:07 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,205,1444719600"; d="scan'208";a="804367986" From: "Liang, Kan" To: Arnaldo Carvalho de Melo , Jiri Olsa CC: Andi Kleen , Ulrich Drepper , Will Deacon , Stephane Eranian , Don Zickus , lkml , "David Ahern" , Ingo Molnar , Namhyung Kim , Peter Zijlstra Subject: RE: [PATCHv5 00/52] perf stat: Add scripting support Thread-Topic: [PATCHv5 00/52] perf stat: Add scripting support Thread-Index: AQHRDzTFXL6NrPfpJU63OgRZCNqzT559hKsAgAHhL/A= Date: Tue, 27 Oct 2015 14:24:03 +0000 Message-ID: <37D7C6CF3E00A74B8858931C1DB2F07701977956@SHSMSX103.ccr.corp.intel.com> References: <1445784728-21732-1-git-send-email-jolsa@kernel.org> <20151026173442.GM27006@kernel.org> In-Reply-To: <20151026173442.GM27006@kernel.org> Accept-Language: zh-CN, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > > Em Sun, Oct 25, 2015 at 03:51:16PM +0100, Jiri Olsa escreveu: > > hi, > > sending another version of stat scripting. > > > > v5 changes: > > - several patches from v4 already taken > > - using u16 for cpu number in cpu_map_event > > - renamed PERF_RECORD_HEADER_ATTR_UPDATE to > PERF_RECORD_EVENT_UPDATE > > - moved low hanging fuits patches to the start of the patchset > > - patchset tested by Kan Liang, thanks! > > So, next time please add the Tested-by: tag for the patches you haven't > changed since Kan (or any other tester, I guess) said he tested. > > I'm not adding it now, since I'm not checking one by one the patches to see > the ones that changed, only re-reviewing them on their own. > > If Kan could provide retest, I can go on and add the Tested-by tags, > Redo the test for V5. It looks good to me. Tested-by: Kan Liang Thanks, Kan > Thanks, > > - Arnaldo > > > v4 changes: > > - added attr update event for event's cpumask > > - forbig aggregation on task workloads > > - some minor reorders and changelog fixes > > > > v3 changes: > > - added attr update event to handle unit,scale,name for event > > it fixed the uncore_imc_1/cas_count_read/ record/report > > - perf report -D now displays stat related events > > - some minor and changelog fixes > > > > v2 changes: > > - rebased to latest Arnaldo's perf/core > > - patches 1 to 11 already merged in > > - added --per-core/--per-socket/-A options for perf stat report > > command to allow custom aggregation in stat report, please > > check new examples below > > - couple changelogs changes > > > > The initial attempt defined its own formula lang and allowed > > triggering user's script on the end of the stat command: > > http://marc.info/?l=linux-kernel&m=136742146322273&w=2 > > > > This patchset abandons the idea of new formula language and rather > > adds support to: > > - store stat data into perf.data file > > - add python support to process stat events > > > > Basically it allows to store stat data into perf.data and post process > > it with python scripts in a similar way we do for sampling data. > > > > The stat data are stored in new stat, stat-round, stat-config user events. > > stat - stored for each read syscall of the counter > > stat round - stored for each interval or end of the command invocation > > stat config - stores all the config information needed to process data > > so report tool could restore the same output as record > > > > The python script can now define 'stat___' > > functions to get stat events data and 'stat__interval' to get stat-round > data. > > > > See CPI script example in scripts/python/stat-cpi.py. > > > > Also available in: > > git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git > > perf/stat_script > > > > thanks, > > jirka > > > > Examples: > > > > - To record data for command stat workload: > > > > $ perf stat record kill > > ... > > > > Performance counter stats for 'kill': > > > > 0.372007 task-clock (msec) # 0.613 CPUs utilized > > 3 context-switches # 0.008 M/sec > > 0 cpu-migrations # 0.000 K/sec > > 62 page-faults # 0.167 M/sec > > 1,129,973 cycles # 3.038 GHz > > stalled-cycles-frontend > > stalled-cycles-backend > > 813,313 instructions # 0.72 insns per cycle > > 166,161 branches # 446.661 M/sec > > 8,747 branch-misses # 5.26% of all branches > > > > 0.000607287 seconds time elapsed > > > > - To report perf stat data: > > > > $ perf stat report > > > > Performance counter stats for '/home/jolsa/bin/perf stat record kill': > > > > 0.372007 task-clock (msec) # inf CPUs utilized > > 3 context-switches # 0.008 M/sec > > 0 cpu-migrations # 0.000 K/sec > > 62 page-faults # 0.167 M/sec > > 1,129,973 cycles # 3.038 GHz > > stalled-cycles-frontend > > stalled-cycles-backend > > 813,313 instructions # 0.72 insns per cycle > > 166,161 branches # 446.661 M/sec > > 8,747 branch-misses # 5.26% of all branches > > > > 0.000000000 seconds time elapsed > > > > - To store system-wide period stat data: > > > > $ perf stat -e cycles:u,instructions:u -a -I 1000 record > > # time counts unit events > > 1.000265471 462,311,482 cycles:u (100.00%) > > 1.000265471 590,037,440 instructions:u > > 2.000483453 722,532,336 cycles:u (100.00%) > > 2.000483453 848,678,197 instructions:u > > 3.000759876 75,990,880 cycles:u (100.00%) > > 3.000759876 86,187,813 instructions:u > > ^C 3.213960893 85,329,533 cycles:u (100.00%) > > 3.213960893 135,954,296 instructions:u > > > > - To report perf stat data: > > > > $ perf stat report > > # time counts unit events > > 1.000265471 462,311,482 cycles:u (100.00%) > > 1.000265471 590,037,440 instructions:u > > 2.000483453 722,532,336 cycles:u (100.00%) > > 2.000483453 848,678,197 instructions:u > > 3.000759876 75,990,880 cycles:u (100.00%) > > 3.000759876 86,187,813 instructions:u > > 3.213960893 85,329,533 cycles:u (100.00%) > > 3.213960893 135,954,296 instructions:u > > > > - To run stat-cpi.py script over perf.data: > > > > $ perf script -s scripts/python/stat-cpi.py > > 1.000265: cpu -1, thread -1 -> cpi 0.783529 (462311482/590037440) > > 2.000483: cpu -1, thread -1 -> cpi 0.851362 (722532336/848678197) > > 3.000760: cpu -1, thread -1 -> cpi 0.881689 (75990880/86187813) > > 3.213961: cpu -1, thread -1 -> cpi 0.627634 > > (85329533/135954296) > > > > - To pipe data from stat to stat-cpi script: > > > > $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf script -s > scripts/python/stat-cpi.py > > 1.000192: cpu 0, thread -1 -> cpi 0.739535 (23921908/32347236) > > 2.000376: cpu 0, thread -1 -> cpi 1.663482 (2519340/1514498) > > 3.000621: cpu 0, thread -1 -> cpi 1.396308 (16162767/11575362) > > 4.000700: cpu 0, thread -1 -> cpi 1.092246 (20077258/18381624) > > 5.000867: cpu 0, thread -1 -> cpi 0.473816 (45157586/95306156) > > 6.001034: cpu 0, thread -1 -> cpi 0.532792 (43701668/82023818) > > 7.001195: cpu 0, thread -1 -> cpi 1.122059 > > (29890042/26638561) > > > > - Raw script stat data output: > > > > $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf --no- > pager script > > CPU THREAD VAL ENA RUN TIME EVENT > > 0 -1 12302059 1000811347 1000810712 1000198821 > cycles:u > > 0 -1 2565362 1000823218 1000823218 1000198821 > instructions:u > > 0 -1 14453353 1000812704 1000812704 2000382283 > cycles:u > > 0 -1 4600932 1000799342 1000799342 2000382283 > instructions:u > > 0 -1 15245106 1000774425 1000774425 3000538255 > cycles:u > > 0 -1 2624324 1000769310 1000769310 3000538255 > instructions:u > > > > - To display different aggregation in report: > > > > $ perf stat -e cycles -a -I 1000 record sleep 3 > > # time counts unit events > > 1.000223609 703,427,617 cycles > > 2.000443651 609,975,307 cycles > > 3.000569616 668,479,597 cycles > > 3.000735323 1,155,816 cycles > > > > $ perf stat report > > # time counts unit events > > 1.000223609 703,427,617 cycles > > 2.000443651 609,975,307 cycles > > 3.000569616 668,479,597 cycles > > 3.000735323 1,155,816 cycles > > > > $ perf stat report --per-core > > # time core cpus counts unit events > > 1.000223609 S0-C0 2 327,612,412 cycles > > 1.000223609 S0-C1 2 375,815,205 cycles > > 2.000443651 S0-C0 2 287,462,177 cycles > > 2.000443651 S0-C1 2 322,513,130 cycles > > 3.000569616 S0-C0 2 271,571,908 cycles > > 3.000569616 S0-C1 2 396,907,689 cycles > > 3.000735323 S0-C0 2 694,977 cycles > > 3.000735323 S0-C1 2 460,839 cycles > > > > $ perf stat report --per-socket > > # time socket cpus counts unit events > > 1.000223609 S0 4 703,427,617 cycles > > 2.000443651 S0 4 609,975,307 cycles > > 3.000569616 S0 4 668,479,597 cycles > > 3.000735323 S0 4 1,155,816 cycles > > > > $ perf stat report -A > > # time CPU counts unit events > > 1.000223609 CPU0 205,431,505 cycles > > 1.000223609 CPU1 122,180,907 cycles > > 1.000223609 CPU2 176,649,682 cycles > > 1.000223609 CPU3 199,165,523 cycles > > 2.000443651 CPU0 148,447,922 cycles > > 2.000443651 CPU1 139,014,255 cycles > > 2.000443651 CPU2 204,436,559 cycles > > 2.000443651 CPU3 118,076,571 cycles > > 3.000569616 CPU0 149,788,954 cycles > > 3.000569616 CPU1 121,782,954 cycles > > 3.000569616 CPU2 247,277,700 cycles > > 3.000569616 CPU3 149,629,989 cycles > > 3.000735323 CPU0 269,675 cycles > > 3.000735323 CPU1 425,302 cycles > > 3.000735323 CPU2 364,169 cycles > > 3.000735323 CPU3 96,670 cycles > > > > > > Cc: Andi Kleen > > Cc: Ulrich Drepper > > Cc: Will Deacon > > Cc: Stephane Eranian > > Cc: Don Zickus > > Tested-by: Kan Liang > > --- > > Jiri Olsa (52): > > perf cpu_map: Add cpu_map__empty_new function > > perf stat: Cache aggregated map entries in extra cpumap > > perf tools: Add thread_map event > > perf tools: Add thread_map event sythesize function > > perf tools: Add thread_map__new_event function > > perf tools: Add perf_event__fprintf_thread_map function > > perf tools: Add cpu_map event > > perf tools: Add cpu_map event synthesize function > > perf tools: Add cpu_map__new_event function > > perf tools: Add perf_event__fprintf_cpu_map function > > perf tools: Add stat config event > > perf tools: Add stat config event synthesize function > > perf tools: Add stat config event read function > > perf tools: Add stat event > > perf tools: Add stat event synthesize function > > perf tools: Add stat event read function > > perf tools: Add stat round event > > perf tools: Add stat round event synthesize function > > perf tools: Add stat events fprintf functions > > perf tools: Add event_update event > > perf tools: Add event_update event unit type > > perf tools: Add event_update event scale type > > perf tools: Add event_update event name type > > perf tools: Add event_update event cpus type > > perf tools: Add perf_event__fprintf_event_update function > > perf report: Display newly added events in raw dump > > perf tools: Introduce stat feature > > perf tools: Move id_offset out of struct perf_evsel union > > perf stat record: Add record command > > perf stat record: Initialize record features > > perf stat record: Synthesize stat record data > > perf stat record: Store events IDs in perf data file > > perf stat record: Add pipe support for record command > > perf stat record: Write stat events on record > > perf stat record: Write stat round events on record > > perf stat record: Do not allow record with multiple runs mode > > perf stat record: Synthesize event update events > > perf stat report: Add report command > > perf stat report: Process cpu/threads maps > > perf stat report: Process stat config event > > perf stat report: Add support to initialize aggr_map from file > > perf stat report: Process stat and stat round events > > perf stat report: Process event update events > > perf stat report: Move csv_sep initialization before report command > > perf stat report: Allow to override aggr_mode > > perf script: Process cpu/threads maps > > perf script: Process stat config event > > perf script: Add process_stat/process_stat_interval scripting interface > > perf script: Add stat default handlers > > perf script: Display stat events by default > > perf script: Add python support for stat events > > perf script: Add stat-cpi.py script > > > > tools/perf/Documentation/perf-stat.txt | 34 ++ > > tools/perf/builtin-record.c | 2 + > > tools/perf/builtin-script.c | 139 +++++ > > tools/perf/builtin-stat.c | 643 ++++++++++++++++++++- > > tools/perf/scripts/python/stat-cpi.py | 74 +++ > > tools/perf/tests/Build | 3 + > > tools/perf/tests/builtin-test.c | 24 + > > tools/perf/tests/cpumap.c | 88 +++ > > tools/perf/tests/event_update.c | 117 ++++ > > tools/perf/tests/stat.c | 111 ++++ > > tools/perf/tests/tests.h | 7 + > > tools/perf/tests/thread-map.c | 43 ++ > > tools/perf/util/cpumap.c | 59 ++ > > tools/perf/util/cpumap.h | 2 + > > tools/perf/util/event.c | 308 ++++++++++ > > tools/perf/util/event.h | 150 ++++- > > tools/perf/util/evlist.c | 6 +- > > tools/perf/util/evlist.h | 3 + > > tools/perf/util/evsel.h | 2 +- > > tools/perf/util/header.c | 205 +++++++ > > tools/perf/util/header.h | 17 + > > .../util/scripting-engines/trace-event-python.c | 114 +++- > > tools/perf/util/session.c | 189 ++++++ > > tools/perf/util/stat.c | 62 ++ > > tools/perf/util/stat.h | 10 + > > tools/perf/util/thread_map.c | 27 + > > tools/perf/util/thread_map.h | 3 + > > tools/perf/util/tool.h | 8 +- > > tools/perf/util/trace-event.h | 4 + > > 29 files changed, 2427 insertions(+), 27 deletions(-) create mode > > 100644 tools/perf/scripts/python/stat-cpi.py > > create mode 100644 tools/perf/tests/cpumap.c create mode 100644 > > tools/perf/tests/event_update.c create mode 100644 > > tools/perf/tests/stat.c