Support standalone metrics and metric groups for perf

* Support standalone metrics and metric groups for perf
@ 2017-07-24 23:40 Andi Kleen
  2017-07-24 23:40 ` [PATCH v1 01/15] perf, tools, stat: Fix buffer overflow while freeing events Andi Kleen
                   ` (15 more replies)
  0 siblings, 16 replies; 36+ messages in thread
From: Andi Kleen @ 2017-07-24 23:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel

Add generic support for standalone metrics specified in JSON files
to perf stat. A metric is a formula that uses multiple events
to compute a higher level result (e.g. IPC). 

For more complex metrics we need to have micro architecture
specific knowledge, so it makes sense to tie metrics to
JSON event lists.

Previously metrics were always tied to an event and automatically
enabled with that event. But now change it that we can have
standalone metrics. They are in the same JSON data structure
as events, but don't have an event name, only a metric name.

We also allow to organize the metrics in metric groups, which
allows a short cut to select several related metrics at once.

This patch kit adds the code to perf to manage metric groups

The first few patches are generic bug fixes and can be applied
directly. Then there is a 'weak group' feature that is useful
independently from metrics. After there are metrics specific
patches.

The patches are available in

   git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/metric-group-4

The actual Intel JSON metrics are available in git as a separate pull
request in 

   git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2

Some example output:

   % perf list metricgroup
    ..
    Metric Groups:

    DSB:
      DSB_Coverage
            [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
    FLOPS:
      GFLOPs
            [Giga Floating Point Operations Per Second]
    Frontend:
      IFetch_Line_Utilization
            [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
    Frontend_Bandwidth:
      DSB_Coverage
            [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
    Memory_BW:
      MLP
            [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]

   % perf stat -M Summary --metric-only -a sleep 1

     Performance counter stats for 'system wide':

    Instructions                              CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
    317614222.0                              1392930775.0             0.0                 0.0                 0.2                 0.1

           1.001497549 seconds time elapsed

   % perf stat -M GFLOPs flops

     Performance counter stats for 'flops':

         3,999,541,471      fp_comp_ops_exe.sse_scalar_single #      1.2 GFLOPs                   (66.65%)
                    14      fp_comp_ops_exe.sse_scalar_double                                     (66.65%)
                     0      fp_comp_ops_exe.sse_packed_double                                     (66.67%)
                     0      fp_comp_ops_exe.sse_packed_single                                     (66.70%)
                     0      simd_fp_256.packed_double                                     (66.70%)
                     0      simd_fp_256.packed_single                                     (66.67%)
                     0      duration_time

           3.238372845 seconds time elapsed

v1: Initial post

^ permalink raw reply	[flat|nested] 36+ messages in thread