* Support standalone metrics and metric groups for perf
@ 2017-08-31 19:40 Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 01/11] perf, tools: Support weak groups Andi Kleen
                   ` (11 more replies)
  0 siblings, 12 replies; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel

Add generic support for standalone metrics specified in JSON files
to perf stat. A metric is a formula that uses multiple events
to compute a higher level result (e.g. IPC). 
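
To make "metric" concrete, here is a minimal sketch of the IPC example as plain C. The helper name `metric_ipc` is hypothetical and not part of perf; in the patch kit the formula lives in a JSON file and is evaluated by perf's expression parser, not by hand-written code like this.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of perf: a metric combines several
 * raw event counts into one derived value. Here: instructions per cycle.
 */
static double metric_ipc(uint64_t instructions, uint64_t cycles)
{
	/* Avoid dividing by zero when the cycles counter never ran. */
	if (cycles == 0)
		return 0.0;
	return (double)instructions / (double)cycles;
}
```

Returning 0.0 for a zero denominator is a choice made for this sketch only; it is not a statement about how perf reports a failed expression.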

For more complex metrics we need microarchitecture-specific
knowledge, so it makes sense to tie metrics to the
JSON event lists.
    
Previously metrics were always tied to an event and automatically
enabled with that event. Change this so that we can also have
standalone metrics. They live in the same JSON data structure
as events, but don't have an event name, only a metric name.

Metrics can also be organized into metric groups, which provide
a shortcut for selecting several related metrics at once.

This patch kit adds the code to perf to manage metric groups.

The first few patches are generic bug fixes and can be applied
directly. Then there is a 'weak group' feature that is useful
independently of metrics. After that come the metrics-specific
patches.

The patches are available in

   git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/metric-group-6

The actual Intel JSON metrics are available in git as a separate pull
request in 

   git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2

Some example output:

   % perf list metricgroup
    ..
    Metric Groups:
    
    DSB:
      DSB_Coverage
            [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
    FLOPS:
      GFLOPs
            [Giga Floating Point Operations Per Second]
    Frontend:
      IFetch_Line_Utilization
            [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
    Frontend_Bandwidth:
      DSB_Coverage
            [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
    Memory_BW:
      MLP
            [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]

   % perf stat -M Summary --metric-only -a sleep 1
    
     Performance counter stats for 'system wide':
    
    Instructions                              CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
    317614222.0                              1392930775.0             0.0                 0.0                 0.2                 0.1
    
           1.001497549 seconds time elapsed
    
   % perf stat -M GFLOPs flops
    
     Performance counter stats for 'flops':
    
         3,999,541,471      fp_comp_ops_exe.sse_scalar_single #      1.2 GFLOPs                   (66.65%)
                    14      fp_comp_ops_exe.sse_scalar_double                                     (66.65%)
                     0      fp_comp_ops_exe.sse_packed_double                                     (66.67%)
                     0      fp_comp_ops_exe.sse_packed_single                                     (66.70%)
                     0      simd_fp_256.packed_double                                     (66.70%)
                     0      simd_fp_256.packed_single                                     (66.67%)
    
           3.238372845 seconds time elapsed

v1: Initial post
v2: Address all review feedback (see individual patches)
BPF now works again.
Fix some bugs in perf list printing that I added last minute last time.
v3: Address all review feedback. Some patches are split. Rebased.
Not caching cpuids because it's too complicated.


* [PATCH v3 01/11] perf, tools: Support weak groups
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-01 16:57   ` Jiri Olsa
  2017-09-22 16:28   ` [tip:perf/core] perf tools: Support weak groups in 'perf stat' tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 02/11] perf, tools: Support metric_group and no event name in json parser Andi Kleen
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Setting up groups can be complicated due to the
scheduling restrictions of different PMUs.
User tools usually don't understand all these restrictions.
Still, in many cases it is useful to set up groups, and
they work most of the time. However, if the group
is set up wrong, some members will not report any values
because they never get scheduled.

Add a concept of a 'weak group': try to set up a group,
but if it's not schedulable fall back to not using
a group. That gives us the best of both worlds:
groups if they work, but still a usable fallback if they don't.
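
The fallback decision can be sketched in isolation as follows. This is a simplified model, not the perf code: `open_as_group`, `open_events` and the counter limit are hypothetical stand-ins, and a strict group that does not fit is treated here as a plain failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption for this sketch: the PMU can schedule four events at once. */
#define SKETCH_NR_COUNTERS 4

enum open_result { OPENED_GROUPED, OPENED_UNGROUPED, OPEN_FAILED };

/* Stand-in for the kernel: a group must fit on the PMU all at once. */
static int open_as_group(int nr_events)
{
	if (nr_events > SKETCH_NR_COUNTERS) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}

/* Try the group first; only a weak group may be broken up on EINVAL. */
static enum open_result open_events(int nr_events, bool weak)
{
	if (open_as_group(nr_events) == 0)
		return OPENED_GROUPED;
	if (errno == EINVAL && weak)
		return OPENED_UNGROUPED;
	return OPEN_FAILED;
}
```

With five events, `open_events(5, true)` ends up ungrouped, which corresponds to the :W case: every event gets counted, possibly with multiplexing.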

In theory it would be possible to have more complex fallback
strategies (e.g. try to split the group in half), but
the simple fallback of not using a group seems to work for now.

So far the weak group is only implemented for perf stat,
not for record.

Here's an unschedulable group (on IvyBridge with SMT on):

% perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1

        73,806,067      branches
         4,848,144      branch-misses             #    6.57% of all branches
        14,754,458      l1d.replacement
        24,905,558      l2_lines_in.all
   <not supported>      l2_rqsts.all_code_rd         <------- will never report anything

With the weak group:

% perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1

       125,366,055      branches                                                      (80.02%)
         9,208,402      branch-misses             #    7.35% of all branches          (80.01%)
        24,560,249      l1d.replacement                                               (80.00%)
        43,174,971      l2_lines_in.all                                               (80.05%)
        31,891,457      l2_rqsts.all_code_rd                                          (79.92%)

The extra event is now scheduled, at the cost of some extra multiplexing.

v2: Move fallback code to separate function.
Add comment on for_each_group_member
Adjust to new perf_evsel__close interface
v3: Fix debug print out.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-list.txt |  1 +
 tools/perf/builtin-stat.c              | 35 ++++++++++++++++++++++++++++++++++
 tools/perf/util/evsel.h                |  1 +
 tools/perf/util/parse-events.c         |  8 +++++++-
 tools/perf/util/parse-events.l         |  2 +-
 5 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index f709de54707b..d432965d728d 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -47,6 +47,7 @@ counted. The following modifiers exist:
  P - use maximum detected precise level
  S - read sample value (PERF_SAMPLE_READ)
  D - pin the event to the PMU
+ W - group is weak and will fallback to non-group if not schedulable
 
 The 'p' modifier can be used for specifying how precise the instruction
 address should be. The 'p' modifier can be specified multiple times:
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 866da7aa54bf..501c1e0272fe 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -582,6 +582,32 @@ static bool perf_evsel__should_store_id(struct perf_evsel *counter)
 	return STAT_RECORD || counter->attr.read_format & PERF_FORMAT_ID;
 }
 
+static struct perf_evsel *reset_weak_group(struct perf_evsel *counter)
+{
+	struct perf_evsel *c2, *leader;
+	bool is_open = true;
+
+	leader = counter->leader;
+	pr_debug("Weak group for %s/%d failed\n",
+			leader->name, leader->nr_members);
+
+	/*
+	 * for_each_group_member doesn't work here because it doesn't
+	 * include the first entry.
+	 */
+	evlist__for_each_entry(evsel_list, c2) {
+		if (c2 == counter)
+			is_open = false;
+		if (c2->leader == leader) {
+			if (is_open)
+				perf_evsel__close(c2);
+			c2->leader = c2;
+			c2->nr_members = 0;
+		}
+	}
+	return leader;
+}
+
 static int __run_perf_stat(int argc, const char **argv)
 {
 	int interval = stat_config.interval;
@@ -618,6 +644,15 @@ static int __run_perf_stat(int argc, const char **argv)
 	evlist__for_each_entry(evsel_list, counter) {
 try_again:
 		if (create_perf_stat_counter(counter) < 0) {
+
+			/* Weak group failed. Reset the group. */
+			if (errno == EINVAL &&
+			    counter->leader != counter &&
+			    counter->weak_group) {
+				counter = reset_weak_group(counter);
+				goto try_again;
+			}
+
 			/*
 			 * PPC returns ENXIO for HW counters until 2.6.37
 			 * (behavior changed with commit b0a873e).
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 351d3b2d8887..f538c3530227 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -136,6 +136,7 @@ struct perf_evsel {
 	const char *		metric_name;
 	struct perf_evsel	**metric_events;
 	bool			collect_stat;
+	bool			weak_group;
 };
 
 union u64_swap {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index f44aeba51d1f..cd80c4eac569 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1358,6 +1358,7 @@ struct event_modifier {
 	int exclude_GH;
 	int sample_read;
 	int pinned;
+	int weak;
 };
 
 static int get_event_modifier(struct event_modifier *mod, char *str,
@@ -1376,6 +1377,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 
 	int exclude = eu | ek | eh;
 	int exclude_GH = evsel ? evsel->exclude_GH : 0;
+	int weak = 0;
 
 	memset(mod, 0, sizeof(*mod));
 
@@ -1413,6 +1415,8 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 			sample_read = 1;
 		} else if (*str == 'D') {
 			pinned = 1;
+		} else if (*str == 'W') {
+			weak = 1;
 		} else
 			break;
 
@@ -1443,6 +1447,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 	mod->exclude_GH = exclude_GH;
 	mod->sample_read = sample_read;
 	mod->pinned = pinned;
+	mod->weak = weak;
 
 	return 0;
 }
@@ -1456,7 +1461,7 @@ static int check_modifier(char *str)
 	char *p = str;
 
 	/* The sizeof includes 0 byte as well. */
-	if (strlen(str) > (sizeof("ukhGHpppPSDI") - 1))
+	if (strlen(str) > (sizeof("ukhGHpppPSDIW") - 1))
 		return -1;
 
 	while (*p) {
@@ -1496,6 +1501,7 @@ int parse_events__modifier_event(struct list_head *list, char *str, bool add)
 		evsel->exclude_GH          = mod.exclude_GH;
 		evsel->sample_read         = mod.sample_read;
 		evsel->precise_max         = mod.precise_max;
+		evsel->weak_group	   = mod.weak;
 
 		if (perf_evsel__is_group_leader(evsel))
 			evsel->attr.pinned = mod.pinned;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index c42edeac451f..fdb5bb52f01f 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -161,7 +161,7 @@ name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
 name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
 drv_cfg_term	[a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
 /* If you add a modifier you need to update check_modifier() */
-modifier_event	[ukhpPGHSDI]+
+modifier_event	[ukhpPGHSDIW]+
 modifier_bp	[rwx]{1,3}
 
 %%
-- 
2.9.5


* [PATCH v3 02/11] perf, tools: Support metric_group and no event name in json parser
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 01/11] perf, tools: Support weak groups Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:29   ` [tip:perf/core] perf vendor events: Support metric_group and no event name in JSON parser tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 03/11] perf, tools, stat: Factor out generic metric printing Andi Kleen
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Some enhancements to the JSON parser to prepare for metrics support:

- Parse the new MetricGroup field.
- Support JSON entries that have no event name, only a MetricName.
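
As a rough illustration of the second point, a standalone metric can be recognized by what it lacks. The struct below is a reduced, hypothetical copy of the relevant fields, and `is_standalone_metric` is not a function in this patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical copy of the fields that matter for this sketch. */
struct sketch_pmu_event {
	const char *name;		/* event name, NULL for a standalone metric */
	const char *event;		/* event encoding, NULL for a standalone metric */
	const char *metric_expr;
	const char *metric_name;
	const char *metric_group;
};

/* A standalone metric has a metric name and expression but no event name. */
static bool is_standalone_metric(const struct sketch_pmu_event *pe)
{
	return !pe->name && pe->metric_name && pe->metric_expr;
}
```

This is also why the table printer and real_event() have to tolerate a NULL name.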

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/pmu-events/jevents.c    | 24 ++++++++++++++++++------
 tools/perf/pmu-events/jevents.h    |  2 +-
 tools/perf/pmu-events/pmu-events.h |  1 +
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index d51dc9ca8861..9eb7047bafe4 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -292,7 +292,7 @@ static int print_events_table_entry(void *data, char *name, char *event,
 				    char *desc, char *long_desc,
 				    char *pmu, char *unit, char *perpkg,
 				    char *metric_expr,
-				    char *metric_name)
+				    char *metric_name, char *metric_group)
 {
 	struct perf_entry_data *pd = data;
 	FILE *outfp = pd->outfp;
@@ -304,8 +304,10 @@ static int print_events_table_entry(void *data, char *name, char *event,
 	 */
 	fprintf(outfp, "{\n");
 
-	fprintf(outfp, "\t.name = \"%s\",\n", name);
-	fprintf(outfp, "\t.event = \"%s\",\n", event);
+	if (name)
+		fprintf(outfp, "\t.name = \"%s\",\n", name);
+	if (event)
+		fprintf(outfp, "\t.event = \"%s\",\n", event);
 	fprintf(outfp, "\t.desc = \"%s\",\n", desc);
 	fprintf(outfp, "\t.topic = \"%s\",\n", topic);
 	if (long_desc && long_desc[0])
@@ -320,6 +322,8 @@ static int print_events_table_entry(void *data, char *name, char *event,
 		fprintf(outfp, "\t.metric_expr = \"%s\",\n", metric_expr);
 	if (metric_name)
 		fprintf(outfp, "\t.metric_name = \"%s\",\n", metric_name);
+	if (metric_group)
+		fprintf(outfp, "\t.metric_group = \"%s\",\n", metric_group);
 	fprintf(outfp, "},\n");
 
 	return 0;
@@ -357,6 +361,9 @@ static char *real_event(const char *name, char *event)
 {
 	int i;
 
+	if (!name)
+		return NULL;
+
 	for (i = 0; fixed[i].name; i++)
 		if (!strcasecmp(name, fixed[i].name))
 			return (char *)fixed[i].event;
@@ -369,7 +376,7 @@ int json_events(const char *fn,
 		      char *long_desc,
 		      char *pmu, char *unit, char *perpkg,
 		      char *metric_expr,
-		      char *metric_name),
+		      char *metric_name, char *metric_group),
 	  void *data)
 {
 	int err = -EIO;
@@ -397,6 +404,7 @@ int json_events(const char *fn,
 		char *unit = NULL;
 		char *metric_expr = NULL;
 		char *metric_name = NULL;
+		char *metric_group = NULL;
 		unsigned long long eventcode = 0;
 		struct msrmap *msr = NULL;
 		jsmntok_t *msrval = NULL;
@@ -476,6 +484,8 @@ int json_events(const char *fn,
 				addfield(map, &perpkg, "", "", val);
 			} else if (json_streq(map, field, "MetricName")) {
 				addfield(map, &metric_name, "", "", val);
+			} else if (json_streq(map, field, "MetricGroup")) {
+				addfield(map, &metric_group, "", "", val);
 			} else if (json_streq(map, field, "MetricExpr")) {
 				addfield(map, &metric_expr, "", "", val);
 				for (s = metric_expr; *s; s++)
@@ -501,10 +511,11 @@ int json_events(const char *fn,
 			addfield(map, &event, ",", filter, NULL);
 		if (msr != NULL)
 			addfield(map, &event, ",", msr->pname, msrval);
-		fixname(name);
+		if (name)
+			fixname(name);
 
 		err = func(data, name, real_event(name, event), desc, long_desc,
-				pmu, unit, perpkg, metric_expr, metric_name);
+			   pmu, unit, perpkg, metric_expr, metric_name, metric_group);
 		free(event);
 		free(desc);
 		free(name);
@@ -516,6 +527,7 @@ int json_events(const char *fn,
 		free(unit);
 		free(metric_expr);
 		free(metric_name);
+		free(metric_group);
 		if (err)
 			break;
 		tok += j;
diff --git a/tools/perf/pmu-events/jevents.h b/tools/perf/pmu-events/jevents.h
index 611fac01913d..557994754410 100644
--- a/tools/perf/pmu-events/jevents.h
+++ b/tools/perf/pmu-events/jevents.h
@@ -6,7 +6,7 @@ int json_events(const char *fn,
 				char *long_desc,
 				char *pmu,
 				char *unit, char *perpkg, char *metric_expr,
-				char *metric_name),
+				char *metric_name, char *metric_group),
 		void *data);
 char *get_cpu_str(void);
 
diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h
index 569eab3688dd..94fa1720f6fd 100644
--- a/tools/perf/pmu-events/pmu-events.h
+++ b/tools/perf/pmu-events/pmu-events.h
@@ -15,6 +15,7 @@ struct pmu_event {
 	const char *perpkg;
 	const char *metric_expr;
 	const char *metric_name;
+	const char *metric_group;
 };
 
 /*
-- 
2.9.5


* [PATCH v3 03/11] perf, tools, stat: Factor out generic metric printing
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 01/11] perf, tools: Support weak groups Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 02/11] perf, tools: Support metric_group and no event name in json parser Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:29   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 04/11] perf, tools: Print generic metric header even for failed expressions Andi Kleen
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

The perf stat shadow metric printing already supports generic metrics.
Factor out the code doing that into a separate function that can be re-used
in a later patch.

No behavior changes.

v2: Fix indentation
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/stat-shadow.c | 69 ++++++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index a04cf56d3517..96aa6cbf24d6 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -627,6 +627,46 @@ static void print_smi_cost(int cpu, struct perf_evsel *evsel,
 	out->print_metric(out->ctx, NULL, "%4.0f", "SMI#", smi_num);
 }
 
+static void generic_metric(const char *metric_expr,
+			   struct perf_evsel **metric_events,
+			   char *name,
+			   const char *metric_name,
+			   double avg,
+			   int cpu,
+			   int ctx,
+			   struct perf_stat_output_ctx *out)
+{
+	print_metric_t print_metric = out->print_metric;
+	struct parse_ctx pctx;
+	double ratio;
+	int i;
+	void *ctxp = out->ctx;
+
+	expr__ctx_init(&pctx);
+	expr__add_id(&pctx, name, avg);
+	for (i = 0; metric_events[i]; i++) {
+		struct saved_value *v;
+
+		v = saved_value_lookup(metric_events[i], cpu, ctx, false);
+		if (!v)
+			break;
+		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
+	}
+	if (!metric_events[i]) {
+		const char *p = metric_expr;
+
+		if (expr__parse(&ratio, &pctx, &p) == 0)
+			print_metric(ctxp, NULL, "%8.1f",
+				metric_name ?
+				metric_name :
+				out->force_header ?  name : "",
+				ratio);
+		else
+			print_metric(ctxp, NULL, NULL, "", 0);
+	} else
+		print_metric(ctxp, NULL, NULL, "", 0);
+}
+
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
 				   struct perf_stat_output_ctx *out)
@@ -819,33 +859,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		else
 			print_metric(ctxp, NULL, NULL, name, 0);
 	} else if (evsel->metric_expr) {
-		struct parse_ctx pctx;
-		int i;
-
-		expr__ctx_init(&pctx);
-		expr__add_id(&pctx, evsel->name, avg);
-		for (i = 0; evsel->metric_events[i]; i++) {
-			struct saved_value *v;
-
-			v = saved_value_lookup(evsel->metric_events[i], cpu, ctx, false);
-			if (!v)
-				break;
-			expr__add_id(&pctx, evsel->metric_events[i]->name,
-					     avg_stats(&v->stats));
-		}
-		if (!evsel->metric_events[i]) {
-			const char *p = evsel->metric_expr;
-
-			if (expr__parse(&ratio, &pctx, &p) == 0)
-				print_metric(ctxp, NULL, "%8.1f",
-					evsel->metric_name ?
-					evsel->metric_name :
-					out->force_header ?  evsel->name : "",
-					ratio);
-			else
-				print_metric(ctxp, NULL, NULL, "", 0);
-		} else
-			print_metric(ctxp, NULL, NULL, "", 0);
+		generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
+				evsel->metric_name, avg, cpu, ctx, out);
 	} else if (runtime_nsecs_stats[cpu].n != 0) {
 		char unit = 'M';
 		char unit_buf[10];
-- 
2.9.5


* [PATCH v3 04/11] perf, tools: Print generic metric header even for failed expressions
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (2 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 03/11] perf, tools, stat: Factor out generic metric printing Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:30   ` [tip:perf/core] perf stat: " tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 05/11] perf, tools: Extract function to get json alias map Andi Kleen
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Print the generic metric header even when the expression evaluation
failed. Otherwise an expression that fails on the first collections
due to division by zero may suddenly reappear later without
a header.
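
The label selection for the failed case boils down to the following. The function name `failed_metric_label` is made up for this sketch; the parameters mirror `out->force_header`, `metric_name` and `name` in the hunk below.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Which header text to print when the expression could not be evaluated:
 * nothing unless headers are forced, otherwise the metric name if there
 * is one, else the event name.
 */
static const char *failed_metric_label(bool force_header,
				       const char *metric_name,
				       const char *name)
{
	if (!force_header)
		return "";
	return metric_name ? metric_name : name;
}
```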

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/stat-shadow.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 96aa6cbf24d6..8c7ab29169b9 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -662,7 +662,9 @@ static void generic_metric(const char *metric_expr,
 				out->force_header ?  name : "",
 				ratio);
 		else
-			print_metric(ctxp, NULL, NULL, "", 0);
+			print_metric(ctxp, NULL, NULL,
+				     out->force_header ?
+				     (metric_name ? metric_name : name) : "", 0);
 	} else
 		print_metric(ctxp, NULL, NULL, "", 0);
 }
-- 
2.9.5


* [PATCH v3 05/11] perf, tools: Extract function to get json alias map
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (3 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 04/11] perf, tools: Print generic metric header even for failed expressions Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:30   ` [tip:perf/core] perf pmu: Extract function to get JSON " tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat Andi Kleen
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Extract the code to get the per-CPU JSON alias map into a separate
function for reuse. No behavior changes.
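
The lookup itself is a walk over a sentinel-terminated table. A self-contained sketch, with hypothetical names (`struct sketch_map`, `sketch_find_map`), might look like this. Note that the NULL check on `cpuid` is an addition in this sketch: the hunk below appears to pass the result of perf_pmu__getcpuid() to strcmp() unchecked.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of a cpuid -> event table mapping. */
struct sketch_map {
	const char *cpuid;
	const char *table;	/* a NULL table terminates the array */
};

/* Return the entry matching cpuid, or NULL if there is none. */
static const struct sketch_map *sketch_find_map(const struct sketch_map *maps,
						const char *cpuid)
{
	if (!cpuid)
		return NULL;

	for (; maps->table; maps++)
		if (!strcmp(maps->cpuid, cpuid))
			return maps;
	return NULL;
}
```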

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/pmu.c | 49 +++++++++++++++++++++++++++++++++----------------
 tools/perf/util/pmu.h |  2 ++
 2 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ac16a9db1fb5..ed25d7f88731 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -516,16 +516,8 @@ char * __weak get_cpuid_str(void)
 	return NULL;
 }
 
-/*
- * From the pmu_events_map, find the table of PMU events that corresponds
- * to the current running CPU. Then, add all PMU events from that table
- * as aliases.
- */
-static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+static char *perf_pmu__getcpuid(void)
 {
-	int i;
-	struct pmu_events_map *map;
-	struct pmu_event *pe;
 	char *cpuid;
 	static bool printed;
 
@@ -535,22 +527,50 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 	if (!cpuid)
 		cpuid = get_cpuid_str();
 	if (!cpuid)
-		return;
+		return NULL;
 
 	if (!printed) {
 		pr_debug("Using CPUID %s\n", cpuid);
 		printed = true;
 	}
+	return cpuid;
+}
+
+struct pmu_events_map *perf_pmu__find_map(void)
+{
+	struct pmu_events_map *map;
+	char *cpuid = perf_pmu__getcpuid();
+	int i;
 
 	i = 0;
-	while (1) {
+	for (;;) {
 		map = &pmu_events_map[i++];
-		if (!map->table)
-			goto out;
+		if (!map->table) {
+			map = NULL;
+			break;
+		}
 
 		if (!strcmp(map->cpuid, cpuid))
 			break;
 	}
+	free(cpuid);
+	return map;
+}
+
+/*
+ * From the pmu_events_map, find the table of PMU events that corresponds
+ * to the current running CPU. Then, add all PMU events from that table
+ * as aliases.
+ */
+static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+{
+	int i;
+	struct pmu_events_map *map;
+	struct pmu_event *pe;
+
+	map = perf_pmu__find_map();
+	if (!map)
+		return;
 
 	/*
 	 * Found a matching PMU events table. Create aliases
@@ -575,9 +595,6 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 				(char *)pe->metric_expr,
 				(char *)pe->metric_name);
 	}
-
-out:
-	free(cpuid);
 }
 
 struct perf_event_attr * __weak
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 389e9729331f..060f6abba8ed 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -90,4 +90,6 @@ int perf_pmu__test(void);
 
 struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu);
 
+struct pmu_events_map *perf_pmu__find_map(void);
+
 #endif /* __PMU_H */
-- 
2.9.5


* [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (4 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 05/11] perf, tools: Extract function to get json alias map Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-04 17:11   ` Arnaldo Carvalho de Melo
  2017-09-22 16:31   ` [tip:perf/core] perf stat: Support JSON metrics in perf stat tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 07/11] perf, tools, list: Add metric groups to perf list Andi Kleen
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add generic support for standalone metrics specified in JSON files
to perf stat. A metric is a formula that uses multiple events
to compute a higher level result (e.g. IPC).

Previously metrics were always tied to an event and automatically
enabled with that event. Change this so that we can also have
standalone metrics. They live in the same JSON data structure
as events, but don't have an event name.

Metrics can also be organized into metric groups, which provide
a shortcut for selecting several related metrics at once.

Add a new -M / --metrics option to perf stat that adds the metrics
or metric groups specified.

Add the core code to manage and parse the metric groups. They
are collected from the JSON data structures into a separate rblist.
When computing shadow values, look for metrics in that list.
They are then computed using the existing saved values infrastructure
in stat-shadow.c.
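
How a comma-separated -M argument selects metrics can be sketched roughly as below. This is not the code in metricgroup.c: the struct and function names are hypothetical, matching is exact and case sensitive, and each metric is assumed to belong to exactly one group.

```c
#include <stddef.h>
#include <string.h>

struct sketch_metric {
	const char *name;
	const char *group;
};

/* True if the token [tok, tok + len) equals the NUL-terminated string s. */
static int token_eq(const char *tok, size_t len, const char *s)
{
	return strlen(s) == len && strncmp(tok, s, len) == 0;
}

/*
 * Count the table entries selected by a comma-separated list in which
 * each item is either a metric name or a metric group name.
 */
static int count_selected(const struct sketch_metric *tbl, size_t n,
			  const char *list)
{
	int selected = 0;

	while (*list) {
		size_t len = strcspn(list, ",");
		size_t i;

		for (i = 0; i < n; i++)
			if (token_eq(list, len, tbl[i].name) ||
			    token_eq(list, len, tbl[i].group))
				selected++;
		list += len;
		if (*list == ',')
			list++;
	}
	return selected;
}
```

An entry named both directly and through its group is counted twice here; the sketch makes no attempt at deduplication.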

The actual JSON metrics are in a separate pull request.

% perf stat -M Summary --metric-only -a sleep 1

 Performance counter stats for 'system wide':

Instructions                              CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
317614222.0                              1392930775.0             0.0                 0.0                 0.2                 0.1

       1.001497549 seconds time elapsed

% perf stat -M GFLOPs flops

 Performance counter stats for 'flops':

     3,999,541,471      fp_comp_ops_exe.sse_scalar_single #      1.2 GFLOPs                   (66.65%)
                14      fp_comp_ops_exe.sse_scalar_double                                     (66.65%)
                 0      fp_comp_ops_exe.sse_packed_double                                     (66.67%)
                 0      fp_comp_ops_exe.sse_packed_single                                     (66.70%)
                 0      simd_fp_256.packed_double                                     (66.70%)
                 0      simd_fp_256.packed_single                                     (66.67%)
                 0      duration_time

       3.238372845 seconds time elapsed

v2: Add missing header file
v3: Move find_map to pmu.c
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt |   7 +
 tools/perf/builtin-stat.c              |  18 +-
 tools/perf/util/Build                  |   1 +
 tools/perf/util/metricgroup.c          | 313 +++++++++++++++++++++++++++++++++
 tools/perf/util/metricgroup.h          |  31 ++++
 tools/perf/util/pmu.c                  |   5 +-
 tools/perf/util/stat-shadow.c          |  22 ++-
 tools/perf/util/stat.h                 |   4 +-
 8 files changed, 395 insertions(+), 6 deletions(-)
 create mode 100644 tools/perf/util/metricgroup.c
 create mode 100644 tools/perf/util/metricgroup.h

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index c37d61682dfb..823fce7674bb 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -199,6 +199,13 @@ Aggregate counts per processor socket for system-wide mode measurements.
 --per-core::
 Aggregate counts per physical processor for system-wide mode measurements.
 
+-M::
+--metrics::
+Print metrics or metricgroups specified in a comma separated list.
+For a group all metrics from the group are added.
+The events from the metrics are automatically measured.
+See perf list output for the possible metrics and metricgroups.
+
 -A::
 --no-aggr::
 Do not aggregate counts across all monitored CPUs.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 501c1e0272fe..045efd7d7785 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -65,6 +65,7 @@
 #include "util/tool.h"
 #include "util/group.h"
 #include "util/string2.h"
+#include "util/metricgroup.h"
 #include "asm/bug.h"
 
 #include <linux/time64.h>
@@ -133,6 +134,8 @@ static const char *smi_cost_attrs = {
 
 static struct perf_evlist	*evsel_list;
 
+static struct rblist		 metric_events;
+
 static struct target target = {
 	.uid	= UINT_MAX,
 };
@@ -1234,7 +1237,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 
 	perf_stat__print_shadow_stats(counter, uval,
 				first_shadow_cpu(counter, id),
-				&out);
+				&out, &metric_events);
 	if (!csv_output && !metric_only) {
 		print_noise(counter, noise);
 		print_running(run, ena);
@@ -1565,7 +1568,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
 		os.evsel = counter;
 		perf_stat__print_shadow_stats(counter, 0,
 					      0,
-					      &out);
+					      &out,
+					      &metric_events);
 	}
 	fputc('\n', stat_config.output);
 }
@@ -1789,6 +1793,13 @@ static int enable_metric_only(const struct option *opt __maybe_unused,
 	return 0;
 }
 
+static int parse_metric_groups(const struct option *opt,
+			       const char *str,
+			       int unset __maybe_unused)
+{
+	return metricgroup__parse_groups(opt, str, &metric_events);
+}
+
 static const struct option stat_options[] = {
 	OPT_BOOLEAN('T', "transaction", &transaction_run,
 		    "hardware transaction statistics"),
@@ -1854,6 +1865,9 @@ static const struct option stat_options[] = {
 			"measure topdown level 1 statistics"),
 	OPT_BOOLEAN(0, "smi-cost", &smi_cost,
 			"measure SMI cost"),
+	OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
+		     "monitor specified metrics or metric groups (separated by ,)",
+		     parse_metric_groups),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 94518c1bf8b6..71ab8466714d 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -34,6 +34,7 @@ libperf-y += dso.o
 libperf-y += symbol.o
 libperf-y += symbol_fprintf.o
 libperf-y += color.o
+libperf-y += metricgroup.o
 libperf-y += header.o
 libperf-y += callchain.o
 libperf-y += values.o
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
new file mode 100644
index 000000000000..7516b1746594
--- /dev/null
+++ b/tools/perf/util/metricgroup.c
@@ -0,0 +1,313 @@
+/*
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+/* Manage metrics and groups of metrics from JSON files */
+
+#include "metricgroup.h"
+#include "evlist.h"
+#include "strbuf.h"
+#include "pmu.h"
+#include "expr.h"
+#include "rblist.h"
+#include "pmu.h"
+#include <string.h>
+#include <stdbool.h>
+#include <errno.h>
+#include "pmu-events/pmu-events.h"
+#include "strbuf.h"
+#include "strlist.h"
+#include <assert.h>
+#include <ctype.h>
+
+struct metric_event *metricgroup__lookup(struct rblist *metric_events,
+					 struct perf_evsel *evsel,
+					 bool create)
+{
+	struct rb_node *nd;
+	struct metric_event me = {
+		.evsel = evsel
+	};
+	nd = rblist__find(metric_events, &me);
+	if (nd)
+		return container_of(nd, struct metric_event, nd);
+	if (create) {
+		rblist__add_node(metric_events, &me);
+		nd = rblist__find(metric_events, &me);
+		if (nd)
+			return container_of(nd, struct metric_event, nd);
+	}
+	return NULL;
+}
+
+static int metric_event_cmp(struct rb_node *rb_node, const void *entry)
+{
+	struct metric_event *a = container_of(rb_node,
+					      struct metric_event,
+					      nd);
+	const struct metric_event *b = entry;
+
+	if (a->evsel == b->evsel)
+		return 0;
+	if ((char *)a->evsel < (char *)b->evsel)
+		return -1;
+	return +1;
+}
+
+static struct rb_node *metric_event_new(struct rblist *rblist __maybe_unused,
+					const void *entry)
+{
+	struct metric_event *me = malloc(sizeof(struct metric_event));
+
+	if (!me)
+		return NULL;
+	memcpy(me, entry, sizeof(struct metric_event));
+	me->evsel = ((struct metric_event *)entry)->evsel;
+	INIT_LIST_HEAD(&me->head);
+	return &me->nd;
+}
+
+static void metricgroup__rblist_init(struct rblist *metric_events)
+{
+	rblist__init(metric_events);
+	metric_events->node_cmp = metric_event_cmp;
+	metric_events->node_new = metric_event_new;
+}
+
+struct egroup {
+	struct list_head nd;
+	int idnum;
+	const char **ids;
+	const char *metric_name;
+	const char *metric_expr;
+};
+
+static struct perf_evsel *find_evsel(struct perf_evlist *perf_evlist,
+				     const char **ids,
+				     int idnum,
+				     struct perf_evsel **metric_events)
+{
+	struct perf_evsel *ev, *start = NULL;
+	int ind = 0;
+
+	evlist__for_each_entry (perf_evlist, ev) {
+		if (!strcmp(ev->name, ids[ind])) {
+			metric_events[ind] = ev;
+			if (ind == 0)
+				start = ev;
+			if (++ind == idnum) {
+				metric_events[ind] = NULL;
+				return start;
+			}
+		} else {
+			ind = 0;
+			start = NULL;
+		}
+	}
+	/*
+	 * This can happen when an alias expands to multiple
+	 * events, like for uncore events.
+	 * We don't support this case for now.
+	 */
+	return NULL;
+}
+
+static int metricgroup__setup_events(struct list_head *groups,
+				     struct perf_evlist *perf_evlist,
+				     struct rblist *metric_events_list)
+{
+	struct metric_event *me;
+	struct metric_expr *expr;
+	int i = 0;
+	int ret = 0;
+	struct egroup *eg;
+	struct perf_evsel *evsel;
+
+	list_for_each_entry (eg, groups, nd) {
+		struct perf_evsel **metric_events;
+
+		metric_events = calloc(sizeof(void *), eg->idnum + 1);
+		if (!metric_events) {
+			ret = -ENOMEM;
+			break;
+		}
+		evsel = find_evsel(perf_evlist, eg->ids, eg->idnum,
+				   metric_events);
+		if (!evsel) {
+			pr_debug("Cannot resolve %s: %s\n",
+					eg->metric_name, eg->metric_expr);
+			continue;
+		}
+		for (i = 0; i < eg->idnum; i++)
+			metric_events[i]->collect_stat = true;
+		me = metricgroup__lookup(metric_events_list, evsel, true);
+		if (!me) {
+			ret = -ENOMEM;
+			break;
+		}
+		expr = malloc(sizeof(struct metric_expr));
+		if (!expr) {
+			ret = -ENOMEM;
+			break;
+		}
+		expr->metric_expr = eg->metric_expr;
+		expr->metric_name = eg->metric_name;
+		expr->metric_events = metric_events;
+		list_add(&expr->nd, &me->head);
+	}
+	return ret;
+}
+
+static bool match_metric(const char *n, const char *list)
+{
+	int len;
+	char *m;
+
+	if (!list)
+		return false;
+	if (!strcmp(list, "all"))
+		return true;
+	if (!n)
+		return !strcasecmp(list, "No_group");
+	len = strlen(list);
+	m = strcasestr(n, list);
+	if (!m)
+		return false;
+	if ((m == n || m[-1] == ';' || m[-1] == ' ') &&
+	    (m[len] == 0 || m[len] == ';'))
+		return true;
+	return false;
+}
+
+static int metricgroup__add_metric(const char *metric, struct strbuf *events,
+				   struct list_head *group_list)
+{
+	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_event *pe;
+	int ret = -EINVAL;
+	int i, j;
+
+	strbuf_init(events, 100);
+	strbuf_addf(events, "%s", "");
+
+	if (!map)
+		return 0;
+
+	for (i = 0; ; i++) {
+		pe = &map->table[i];
+
+		if (!pe->name && !pe->metric_group && !pe->metric_name)
+			break;
+		if (!pe->metric_expr)
+			continue;
+		if (match_metric(pe->metric_group, metric) ||
+		    match_metric(pe->metric_name, metric)) {
+			const char **ids;
+			int idnum;
+			struct egroup *eg;
+
+			pr_debug("metric expr %s for %s\n", pe->metric_expr, pe->metric_name);
+
+			if (expr__find_other(pe->metric_expr,
+					     NULL, &ids, &idnum) < 0)
+				continue;
+			if (events->len > 0)
+				strbuf_addf(events, ",");
+			for (j = 0; j < idnum; j++) {
+				pr_debug("found event %s\n", ids[j]);
+				strbuf_addf(events, "%s%s",
+					j == 0 ? "{" : ",",
+					ids[j]);
+			}
+			strbuf_addf(events, "}:W");
+
+			eg = malloc(sizeof(struct egroup));
+			if (!eg) {
+				ret = -ENOMEM;
+				break;
+			}
+			eg->ids = ids;
+			eg->idnum = idnum;
+			eg->metric_name = pe->metric_name;
+			eg->metric_expr = pe->metric_expr;
+			list_add_tail(&eg->nd, group_list);
+			ret = 0;
+		}
+	}
+	return ret;
+}
+
+static int metricgroup__add_metric_list(const char *list, struct strbuf *events,
+				        struct list_head *group_list)
+{
+	char *llist, *nlist, *p;
+	int ret = -EINVAL;
+
+	nlist = strdup(list);
+	if (!nlist)
+		return -ENOMEM;
+	llist = nlist;
+	while ((p = strsep(&llist, ",")) != NULL) {
+		ret = metricgroup__add_metric(p, events, group_list);
+		if (ret == -EINVAL) {
+			fprintf(stderr, "Cannot find metric or group `%s'\n",
+					p);
+			break;
+		}
+	}
+	free(nlist);
+	return ret;
+}
+
+static void metricgroup__free_egroups(struct list_head *group_list)
+{
+	struct egroup *eg, *egtmp;
+	int i;
+
+	list_for_each_entry_safe (eg, egtmp, group_list, nd) {
+		for (i = 0; i < eg->idnum; i++)
+			free((char *)eg->ids[i]);
+		free(eg->ids);
+		free(eg);
+	}
+}
+
+int metricgroup__parse_groups(const struct option *opt,
+			   const char *str,
+			   struct rblist *metric_events)
+{
+	struct parse_events_error parse_error;
+	struct perf_evlist *perf_evlist = *(struct perf_evlist **)opt->value;
+	struct strbuf extra_events;
+	LIST_HEAD(group_list);
+	int ret;
+
+	if (metric_events->nr_entries == 0)
+		metricgroup__rblist_init(metric_events);
+	ret = metricgroup__add_metric_list(str, &extra_events, &group_list);
+	if (ret)
+		return ret;
+	pr_debug("adding %s\n", extra_events.buf);
+	memset(&parse_error, 0, sizeof(struct parse_events_error));
+	ret = parse_events(perf_evlist, extra_events.buf, &parse_error);
+	if (ret) {
+		pr_err("Cannot set up events %s\n", extra_events.buf);
+		goto out;
+	}
+	strbuf_release(&extra_events);
+	ret = metricgroup__setup_events(&group_list, perf_evlist,
+					metric_events);
+out:
+	metricgroup__free_egroups(&group_list);
+	return ret;
+}
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
new file mode 100644
index 000000000000..06854e125ee7
--- /dev/null
+++ b/tools/perf/util/metricgroup.h
@@ -0,0 +1,31 @@
+#ifndef METRICGROUP_H
+#define METRICGROUP_H 1
+
+#include "linux/list.h"
+#include "rblist.h"
+#include <subcmd/parse-options.h>
+#include "evlist.h"
+#include "strbuf.h"
+
+struct metric_event {
+	struct rb_node nd;
+	struct perf_evsel *evsel;
+	struct list_head head; /* list of metric_expr */
+};
+
+struct metric_expr {
+	struct list_head nd;
+	const char *metric_expr;
+	const char *metric_name;
+	struct perf_evsel **metric_events;
+};
+
+struct metric_event *metricgroup__lookup(struct rblist *metric_events,
+					 struct perf_evsel *evsel,
+					 bool create);
+int metricgroup__parse_groups(const struct option *opt,
+			const char *str,
+			struct rblist *metric_events);
+
+void metricgroup__print(bool metrics, bool groups, char *filter, bool raw);
+#endif
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ed25d7f88731..7070638ab600 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -580,8 +580,11 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 		const char *pname;
 
 		pe = &map->table[i++];
-		if (!pe->name)
+		if (!pe->name) {
+			if (pe->metric_group || pe->metric_name)
+				continue;
 			break;
+		}
 
 		pname = pe->pmu ? pe->pmu : "cpu";
 		if (strncmp(pname, name, strlen(pname)))
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 8c7ab29169b9..42e6c17be7ff 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -6,6 +6,7 @@
 #include "rblist.h"
 #include "evlist.h"
 #include "expr.h"
+#include "metricgroup.h"
 
 enum {
 	CTX_BIT_USER	= 1 << 0,
@@ -671,13 +672,16 @@ static void generic_metric(const char *metric_expr,
 
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
-				   struct perf_stat_output_ctx *out)
+				   struct perf_stat_output_ctx *out,
+				   struct rblist *metric_events)
 {
 	void *ctxp = out->ctx;
 	print_metric_t print_metric = out->print_metric;
 	double total, ratio = 0.0, total2;
 	const char *color = NULL;
 	int ctx = evsel_context(evsel);
+	struct metric_event *me;
+	int num = 1;
 
 	if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
 		total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
@@ -880,6 +884,20 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 	} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
 		print_smi_cost(cpu, evsel, out);
 	} else {
-		print_metric(ctxp, NULL, NULL, NULL, 0);
+		num = 0;
 	}
+
+	if ((me = metricgroup__lookup(metric_events, evsel, false)) != NULL) {
+		struct metric_expr *mexp;
+
+		list_for_each_entry (mexp, &me->head, nd) {
+			if (num++ > 0)
+				out->new_line(ctxp);
+			generic_metric(mexp->metric_expr, mexp->metric_events,
+					evsel->name, mexp->metric_name,
+					avg, cpu, ctx, out);
+		}
+	}
+	if (num == 0)
+		print_metric(ctxp, NULL, NULL, NULL, 0);
 }
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index eacaf958e19d..47915df346fb 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -91,9 +91,11 @@ struct perf_stat_output_ctx {
 	bool force_header;
 };
 
+struct rblist;
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
-				   struct perf_stat_output_ctx *out);
+				   struct perf_stat_output_ctx *out,
+				   struct rblist *metric_events);
 void perf_stat__collect_metric_expr(struct perf_evlist *);
 
 int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw);
-- 
2.9.5

* [PATCH v3 07/11] perf, tools, list: Add metric groups to perf list
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (5 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:31   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  2017-10-13 14:50   ` [PATCH v3 07/11] perf, tools, " Arnaldo Carvalho de Melo
  2017-08-31 19:40 ` [PATCH v3 08/11] perf, tools, stat: Don't use ctx for saved values lookup Andi Kleen
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add code to perf list to print metric groups and metrics
that don't have an event name. The metricgroup code collects
the metric groups and their metrics into an rblist, and then
prints them according to the configured filters.

Metric groups are printed by default, but the output can be
limited with 'perf list metric' or 'perf list metricgroup':

% perf list metricgroup
..
Metric Groups:

DSB:
  DSB_Coverage
        [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
FLOPS:
  GFLOPs
        [Giga Floating Point Operations Per Second]
Frontend:
  IFetch_Line_Utilization
        [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
Frontend_Bandwidth:
  DSB_Coverage
        [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
Memory_BW:
  MLP
        [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]

v2: Check return value of asprintf to fix warning on FC26
Fix key in lookup/addition for the groups list
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-list.txt |   7 +-
 tools/perf/builtin-list.c              |   7 ++
 tools/perf/util/metricgroup.c          | 176 +++++++++++++++++++++++++++++++++
 tools/perf/util/parse-events.c         |   3 +
 4 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index d432965d728d..ca6369fd83c1 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -8,7 +8,8 @@ perf-list - List all symbolic event types
 SYNOPSIS
 --------
 [verse]
-'perf list' [--no-desc] [--long-desc] [hw|sw|cache|tracepoint|pmu|sdt|event_glob]
+'perf list' [--no-desc] [--long-desc]
+            [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]
 
 DESCRIPTION
 -----------
@@ -247,6 +248,10 @@ To limit the list use:
 
 . 'sdt' to list all Statically Defined Tracepoint events.
 
+. 'metric' to list metrics.
+
+. 'metricgroup' to list metricgroups with metrics.
+
 . If none of the above is matched, it will apply the supplied glob to all
   events, printing the ones that match.
 
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 4bf2cb4d25aa..b2d2ad3dd478 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -15,6 +15,7 @@
 #include "util/cache.h"
 #include "util/pmu.h"
 #include "util/debug.h"
+#include "util/metricgroup.h"
 #include <subcmd/parse-options.h>
 
 static bool desc_flag = true;
@@ -79,6 +80,10 @@ int cmd_list(int argc, const char **argv)
 						long_desc_flag, details_flag);
 		else if (strcmp(argv[i], "sdt") == 0)
 			print_sdt_events(NULL, NULL, raw_dump);
+		else if (strcmp(argv[i], "metric") == 0)
+			metricgroup__print(true, false, NULL, raw_dump);
+		else if (strcmp(argv[i], "metricgroup") == 0)
+			metricgroup__print(false, true, NULL, raw_dump);
 		else if ((sep = strchr(argv[i], ':')) != NULL) {
 			int sep_idx;
 
@@ -96,6 +101,7 @@ int cmd_list(int argc, const char **argv)
 			s[sep_idx] = '\0';
 			print_tracepoint_events(s, s + sep_idx + 1, raw_dump);
 			print_sdt_events(s, s + sep_idx + 1, raw_dump);
+			metricgroup__print(true, true, s, raw_dump);
 			free(s);
 		} else {
 			if (asprintf(&s, "*%s*", argv[i]) < 0) {
@@ -112,6 +118,7 @@ int cmd_list(int argc, const char **argv)
 						details_flag);
 			print_tracepoint_events(NULL, s, raw_dump);
 			print_sdt_events(NULL, s, raw_dump);
+			metricgroup__print(true, true, NULL, raw_dump);
 			free(s);
 		}
 	}
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 7516b1746594..2d60114f1870 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -189,6 +189,182 @@ static bool match_metric(const char *n, const char *list)
 	return false;
 }
 
+struct mep {
+	struct rb_node nd;
+	const char *name;
+	struct strlist *metrics;
+};
+
+static int mep_cmp(struct rb_node *rb_node, const void *entry)
+{
+	struct mep *a = container_of(rb_node, struct mep, nd);
+	struct mep *b = (struct mep *)entry;
+
+	return strcmp(a->name, b->name);
+}
+
+static struct rb_node *mep_new(struct rblist *rl __maybe_unused,
+					const void *entry)
+{
+	struct mep *me = malloc(sizeof(struct mep));
+
+	if (!me)
+		return NULL;
+	memcpy(me, entry, sizeof(struct mep));
+	me->name = strdup(me->name);
+	if (!me->name)
+		goto out_me;
+	me->metrics = strlist__new(NULL, NULL);
+	if (!me->metrics)
+		goto out_name;
+	return &me->nd;
+out_name:
+	free((char *)me->name);
+out_me:
+	free(me);
+	return NULL;
+}
+
+static struct mep *mep_lookup(struct rblist *groups, const char *name)
+{
+	struct rb_node *nd;
+	struct mep me = {
+		.name = name
+	};
+	nd = rblist__find(groups, &me);
+	if (nd)
+		return container_of(nd, struct mep, nd);
+	rblist__add_node(groups, &me);
+	nd = rblist__find(groups, &me);
+	if (nd)
+		return container_of(nd, struct mep, nd);
+	return NULL;
+}
+
+static void mep_delete(struct rblist *rl __maybe_unused,
+		       struct rb_node *nd)
+{
+	struct mep *me = container_of(nd, struct mep, nd);
+
+	strlist__delete(me->metrics);
+	free((void *)me->name);
+	free(me);
+}
+
+static void metricgroup__print_strlist(struct strlist *metrics, bool raw)
+{
+	struct str_node *sn;
+	int n = 0;
+
+	strlist__for_each_entry (sn, metrics) {
+		if (raw)
+			printf("%s%s", n > 0 ? " " : "", sn->s);
+		else
+			printf("  %s\n", sn->s);
+		n++;
+	}
+	if (raw)
+		putchar('\n');
+}
+
+void metricgroup__print(bool metrics, bool metricgroups, char *filter,
+			bool raw)
+{
+	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_event *pe;
+	int i;
+	struct rblist groups;
+	struct rb_node *node, *next;
+	struct strlist *metriclist = NULL;
+
+	if (!map)
+		return;
+
+	if (!metricgroups) {
+		metriclist = strlist__new(NULL, NULL);
+		if (!metriclist)
+			return;
+	}
+
+	rblist__init(&groups);
+	groups.node_new = mep_new;
+	groups.node_cmp = mep_cmp;
+	groups.node_delete = mep_delete;
+	for (i = 0; ; i++) {
+		const char *g;
+		pe = &map->table[i];
+
+		if (!pe->name && !pe->metric_group && !pe->metric_name)
+			break;
+		if (!pe->metric_expr)
+			continue;
+		g = pe->metric_group;
+		if (!g && pe->metric_name) {
+			if (pe->name)
+				continue;
+			g = "No_group";
+		}
+		if (g) {
+			char *omg;
+			char *mg = strdup(g);
+
+			if (!mg)
+				return;
+			omg = mg;
+			while ((g = strsep(&mg, ";")) != NULL) {
+				struct mep *me;
+				char *s;
+
+				if (*g == 0)
+					g = "No_group";
+				while (isspace(*g))
+					g++;
+				if (filter && !strstr(g, filter))
+					continue;
+				if (raw)
+					s = (char *)pe->metric_name;
+				else {
+					if (asprintf(&s, "%s\n\t[%s]",
+						     pe->metric_name, pe->desc) < 0)
+						return;
+				}
+
+				if (!s)
+					continue;
+
+				if (!metricgroups) {
+					strlist__add(metriclist, s);
+				} else {
+					me = mep_lookup(&groups, g);
+					if (!me)
+						continue;
+					strlist__add(me->metrics, s);
+				}
+			}
+			free(omg);
+		}
+	}
+
+	if (metricgroups && !raw)
+		printf("\nMetric Groups:\n\n");
+	else if (metrics && !raw)
+		printf("\nMetrics:\n\n");
+
+	for (node = rb_first(&groups.entries); node; node = next) {
+		struct mep *me = container_of(node, struct mep, nd);
+
+		if (metricgroups)
+			printf("%s%s%s", me->name, metrics ? ":" : "", raw ? " " : "\n");
+		if (metrics)
+			metricgroup__print_strlist(me->metrics, raw);
+		next = rb_next(node);
+		rblist__remove_node(&groups, node);
+	}
+	if (!metricgroups)
+		metricgroup__print_strlist(metriclist, raw);
+	strlist__delete(metriclist);
+}
+
 static int metricgroup__add_metric(const char *metric, struct strbuf *events,
 				   struct list_head *group_list)
 {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index cd80c4eac569..97507d5c37dc 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -28,6 +28,7 @@
 #include "probe-file.h"
 #include "asm/bug.h"
 #include "util/parse-branch-options.h"
+#include "metricgroup.h"
 
 #define MAX_NAME_LEN 100
 
@@ -2372,6 +2373,8 @@ void print_events(const char *event_glob, bool name_only, bool quiet_flag,
 	print_tracepoint_events(NULL, NULL, name_only);
 
 	print_sdt_events(NULL, NULL, name_only);
+
+	metricgroup__print(true, true, NULL, name_only);
 }
 
 int parse_events__is_hardcoded_term(struct parse_events_term *term)
-- 
2.9.5

* [PATCH v3 08/11] perf, tools, stat: Don't use ctx for saved values lookup
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (6 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 07/11] perf, tools, list: Add metric groups to perf list Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:31   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 09/11] perf, tools, stat: Support duration_time for metrics Andi Kleen
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

We don't need to use ctx to look up events for saved values.
The context is already part of the evsel pointer, which is the
primary key.
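
As a rough illustration of the reduced key, the lookup only has to
order entries by cpu and then by evsel pointer. The sketch below is
standalone; struct saved_key and saved_key_cmp are hypothetical
stand-ins, not the perf structures themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the lookup key: no ctx member is needed. */
struct saved_key {
	const void *evsel;	/* identity of the event; implies its context */
	int cpu;
};

/* Order by cpu first, then by evsel address; 0 means "same entry". */
static int saved_key_cmp(const struct saved_key *a, const struct saved_key *b)
{
	assert(a && b);
	if (a->cpu != b->cpu)
		return a->cpu < b->cpu ? -1 : 1;
	if (a->evsel == b->evsel)
		return 0;
	return (uintptr_t)a->evsel < (uintptr_t)b->evsel ? -1 : 1;
}
```

Two keys that share cpu and evsel always compare equal here, so a
separate context field could never distinguish them.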

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/stat-shadow.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 42e6c17be7ff..664f49a9b012 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -56,7 +56,6 @@ struct saved_value {
 	struct rb_node rb_node;
 	struct perf_evsel *evsel;
 	int cpu;
-	int ctx;
 	struct stats stats;
 };
 
@@ -67,8 +66,6 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
 					     rb_node);
 	const struct saved_value *b = entry;
 
-	if (a->ctx != b->ctx)
-		return a->ctx - b->ctx;
 	if (a->cpu != b->cpu)
 		return a->cpu - b->cpu;
 	if (a->evsel == b->evsel)
@@ -90,13 +87,12 @@ static struct rb_node *saved_value_new(struct rblist *rblist __maybe_unused,
 }
 
 static struct saved_value *saved_value_lookup(struct perf_evsel *evsel,
-					      int cpu, int ctx,
+					      int cpu,
 					      bool create)
 {
 	struct rb_node *nd;
 	struct saved_value dm = {
 		.cpu = cpu,
-		.ctx = ctx,
 		.evsel = evsel,
 	};
 	nd = rblist__find(&runtime_saved_values, &dm);
@@ -232,8 +228,7 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 *count,
 		update_stats(&runtime_aperf_stats[ctx][cpu], count[0]);
 
 	if (counter->collect_stat) {
-		struct saved_value *v = saved_value_lookup(counter, cpu, ctx,
-							   true);
+		struct saved_value *v = saved_value_lookup(counter, cpu, true);
 		update_stats(&v->stats, count[0]);
 	}
 }
@@ -634,7 +629,6 @@ static void generic_metric(const char *metric_expr,
 			   const char *metric_name,
 			   double avg,
 			   int cpu,
-			   int ctx,
 			   struct perf_stat_output_ctx *out)
 {
 	print_metric_t print_metric = out->print_metric;
@@ -648,7 +642,7 @@ static void generic_metric(const char *metric_expr,
 	for (i = 0; metric_events[i]; i++) {
 		struct saved_value *v;
 
-		v = saved_value_lookup(metric_events[i], cpu, ctx, false);
+		v = saved_value_lookup(metric_events[i], cpu, false);
 		if (!v)
 			break;
 		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
@@ -866,7 +860,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 			print_metric(ctxp, NULL, NULL, name, 0);
 	} else if (evsel->metric_expr) {
 		generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
-				evsel->metric_name, avg, cpu, ctx, out);
+				evsel->metric_name, avg, cpu, out);
 	} else if (runtime_nsecs_stats[cpu].n != 0) {
 		char unit = 'M';
 		char unit_buf[10];
@@ -895,7 +889,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				out->new_line(ctxp);
 			generic_metric(mexp->metric_expr, mexp->metric_events,
 					evsel->name, mexp->metric_name,
-					avg, cpu, ctx, out);
+					avg, cpu, out);
 		}
 	}
 	if (num == 0)
-- 
2.9.5

* [PATCH v3 09/11] perf, tools, stat: Support duration_time for metrics
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (7 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 08/11] perf, tools, stat: Don't use ctx for saved values lookup Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:32   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 10/11] perf, tools, stat: Hide internal duration_time counter Andi Kleen
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Some metric formulas (like GFLOPs) need to know how long the
measurement period is. Support an internal event called duration_time,
which reports time in seconds. It maps to the dummy event, but is
special-cased in the statistics code to report the wall-clock duration.

So far it is not printed, but only used internally for metrics.
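
To illustrate the unit handling: the wall-clock time is tracked in
nanoseconds and scaled by 1e-9 before a formula sees it as
duration_time. The sketch below is standalone, and the gflops()
formula in it is an assumed example, not the expression from the
Intel JSON files.

```c
#include <assert.h>

/* Scale a nanosecond wall-clock value to seconds (the 1e-9 factor). */
static double duration_time_seconds(double walltime_nsecs)
{
	return walltime_nsecs * 1e-9;
}

/*
 * Hypothetical rate metric: floating point operations per second,
 * in units of 1e9. The real GFLOPs expression is not shown here.
 */
static double gflops(double flop_count, double walltime_nsecs)
{
	double secs = duration_time_seconds(walltime_nsecs);

	assert(secs > 0.0);
	return flop_count / 1e9 / secs;
}
```

With 4e9 operations counted over 2e9 ns this evaluates to roughly 2,
i.e. about 2 GFLOPs.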

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/parse-events.l |  1 +
 tools/perf/util/stat-shadow.c  | 17 +++++++++++++----
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index fdb5bb52f01f..ea2426daf7e8 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -288,6 +288,7 @@ cpu-migrations|migrations			{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COU
 alignment-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
 emulation-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
 dummy						{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
+duration_time					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
 bpf-output					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
 
 	/*
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 664f49a9b012..a2c12d1ef32a 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -641,11 +641,20 @@ static void generic_metric(const char *metric_expr,
 	expr__add_id(&pctx, name, avg);
 	for (i = 0; metric_events[i]; i++) {
 		struct saved_value *v;
+		struct stats *stats;
+		double scale;
 
-		v = saved_value_lookup(metric_events[i], cpu, false);
-		if (!v)
-			break;
-		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
+		if (!strcmp(metric_events[i]->name, "duration_time")) {
+			stats = &walltime_nsecs_stats;
+			scale = 1e-9;
+		} else {
+			v = saved_value_lookup(metric_events[i], cpu, false);
+			if (!v)
+				break;
+			stats = &v->stats;
+			scale = 1.0;
+		}
+		expr__add_id(&pctx, metric_events[i]->name, avg_stats(stats)*scale);
 	}
 	if (!metric_events[i]) {
 		const char *p = metric_expr;
-- 
2.9.5

* [PATCH v3 10/11] perf, tools, stat: Hide internal duration_time counter
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (8 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 09/11] perf, tools, stat: Support duration_time for metrics Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:32   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  2017-08-31 19:40 ` [PATCH v3 11/11] perf, tools, stat: Update walltime_nsecs_stats in interval mode Andi Kleen
  2017-09-01 17:26 ` Support standalone metrics and metric groups for perf Jiri Olsa
  11 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Some perf stat metrics use an internal "duration_time" event. However,
it is not printed correctly, so hide it in the output to avoid confusing
users with 0 counts.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-stat.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 045efd7d7785..bf6ae0144ecf 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -195,6 +195,11 @@ static struct perf_stat_config stat_config = {
 	.scale		= true,
 };
 
+static bool is_duration_time(struct perf_evsel *evsel)
+{
+	return !strcmp(evsel->name, "duration_time");
+}
+
 static inline void diff_timespec(struct timespec *r, struct timespec *a,
 				 struct timespec *b)
 {
@@ -1363,6 +1368,9 @@ static void print_aggr(char *prefix)
 		ad.id = id = aggr_map->map[s];
 		first = true;
 		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
+
 			ad.val = ad.ena = ad.run = 0;
 			ad.nr = 0;
 			if (!collect_data(counter, aggr_cb, &ad))
@@ -1506,6 +1514,8 @@ static void print_no_aggr_metric(char *prefix)
 		if (prefix)
 			fputs(prefix, stat_config.output);
 		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			if (first) {
 				aggr_printout(counter, cpu, 0);
 				first = false;
@@ -1560,6 +1570,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
 
 	/* Print metrics headers only */
 	evlist__for_each_entry(evsel_list, counter) {
+		if (is_duration_time(counter))
+			continue;
 		os.evsel = counter;
 		out.ctx = &os;
 		out.print_metric = print_metric_header;
@@ -1707,12 +1719,18 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 		print_aggr(prefix);
 		break;
 	case AGGR_THREAD:
-		evlist__for_each_entry(evsel_list, counter)
+		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			print_aggr_thread(counter, prefix);
+		}
 		break;
 	case AGGR_GLOBAL:
-		evlist__for_each_entry(evsel_list, counter)
+		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			print_counter_aggr(counter, prefix);
+		}
 		if (metric_only)
 			fputc('\n', stat_config.output);
 		break;
@@ -1720,8 +1738,11 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 		if (metric_only)
 			print_no_aggr_metric(prefix);
 		else {
-			evlist__for_each_entry(evsel_list, counter)
+			evlist__for_each_entry(evsel_list, counter) {
+				if (is_duration_time(counter))
+					continue;
 				print_counter(counter, prefix);
+			}
 		}
 		break;
 	case AGGR_UNSET:
-- 
2.9.5

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v3 11/11] perf, tools, stat: Update walltime_nsecs_stats in interval mode
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (9 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 10/11] perf, tools, stat: Hide internal duration_time counter Andi Kleen
@ 2017-08-31 19:40 ` Andi Kleen
  2017-09-22 16:33   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  2017-09-01 17:26 ` Support standalone metrics and metric groups for perf Jiri Olsa
  11 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-08-31 19:40 UTC (permalink / raw)
  To: acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Some metrics (like GFLOPs) need walltime_nsecs_stats for each interval.
Compute it for each interval instead of only at the end.

Pointed out by Jiri.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-stat.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index bf6ae0144ecf..60206a1e03aa 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -415,6 +415,8 @@ static void process_interval(void)
 			pr_err("failed to write stat round event\n");
 	}
 
+	init_stats(&walltime_nsecs_stats);
+	update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000);
 	print_counters(&rs, 0, NULL);
 }
 
-- 
2.9.5
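
A minimal sketch of the unit handling behind the two added lines, not the
perf code itself: the -I interval is given in milliseconds while
walltime_nsecs_stats holds nanoseconds, hence the factor of 1000000. The
helper names below are hypothetical. The sketch widens to 64 bit before
multiplying; assuming the interval is stored as a 32-bit unsigned value,
the plain product would wrap for intervals above roughly 4.29 seconds.

```c
#include <stdint.h>

/* Hypothetical helper: convert a 'perf stat -I' interval (ms) to ns. */
static inline uint64_t interval_ms_to_ns(unsigned int interval_ms)
{
	/* Widen first so the multiplication is done in 64 bit. */
	return (uint64_t)interval_ms * 1000000ULL;
}

/*
 * Hypothetical helper: rate in units of 1e9 operations per second for
 * one interval. ops / 1e9 / (ns / 1e9) simplifies to ops / ns.
 */
static inline double giga_ops_per_sec(uint64_t ops, uint64_t walltime_ns)
{
	if (walltime_ns == 0)
		return 0.0;
	return (double)ops / (double)walltime_ns;
}
```

With a fresh wall time per interval, a metric like GFLOPs is scaled by the
length of that interval rather than by the total run time.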

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 01/11] perf, tools: Support weak groups
  2017-08-31 19:40 ` [PATCH v3 01/11] perf, tools: Support weak groups Andi Kleen
@ 2017-09-01 16:57   ` Jiri Olsa
  2017-09-01 17:00     ` Jiri Olsa
  2017-09-22 16:28   ` [tip:perf/core] perf tools: Support weak groups in 'perf stat' tip-bot for Andi Kleen
  1 sibling, 1 reply; 47+ messages in thread
From: Jiri Olsa @ 2017-09-01 16:57 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel, Andi Kleen

On Thu, Aug 31, 2017 at 12:40:26PM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> Setting up groups can be complicated due to the
> complicated scheduling restrictions of different PMUs.
> User tools usually don't understand all these restrictions.
> Still in many cases it is useful to set up groups and
> they work most of the time. However if the group
> is set up wrong some members will not report any values
> because they never get scheduled.
> 
> Add a concept of a 'weak group': try to set up a group,
> but if it's not schedulable fall back to not using
> a group. That gives us the best of both worlds:
> groups if they work, but still a usable fallback if they don't.
> 
> In theory it would be possible to have more complex fallback
> strategies (e.g. try to split the group in half), but
> the simple fallback of not using a group seems to work for now.
> 
> So far the weak group is only implemented for perf stat,
> not for record.
> 
> Here's an unschedulable group (on IvyBridge with SMT on)
> 
> % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
> 
>         73,806,067      branches
>          4,848,144      branch-misses             #    6.57% of all branches
>         14,754,458      l1d.replacement
>         24,905,558      l2_lines_in.all
>    <not supported>      l2_rqsts.all_code_rd         <------- will never report anything
> 
> With the weak group:
> 
> % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1
> 
>        125,366,055      branches                                                      (80.02%)
>          9,208,402      branch-misses             #    7.35% of all branches          (80.01%)
>         24,560,249      l1d.replacement                                               (80.00%)
>         43,174,971      l2_lines_in.all                                               (80.05%)
>         31,891,457      l2_rqsts.all_code_rd                                          (79.92%)
> 
> The extra event is scheduled, with some extra multiplexing.
> 
> v2: Move fallback code to separate function.
> Add comment on for_each_group_member
> Adjust to new perf_evsel__close interface
> v3:
> Fix debug print out.
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka
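
A minimal sketch of the weak group idea quoted above, under stated
assumptions: open_fn, fake_pmu and fake_open are hypothetical stand-ins,
not perf internals. In a real tool the callback would wrap
perf_event_open() with the leader's fd as group_fd, and the already
opened members would be closed before the fallback.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical callback: open event 'idx'. 'leader' is the index of the
 * group leader, or -1 to open the event standalone. Returns 0 on success.
 */
typedef int (*open_fn)(size_t idx, int leader, void *ctx);

/* Try to open 'nr' events as one group; fall back to no group at all. */
static int open_weak_group(size_t nr, open_fn open_event, void *ctx,
			   bool *grouped)
{
	size_t i;
	int err = 0;

	*grouped = true;
	for (i = 0; i < nr; i++) {
		err = open_event(i, i == 0 ? -1 : 0, ctx);
		if (err)
			break;
	}
	if (!err)
		return 0;

	/* Fallback: reopen every event standalone (multiplexed if needed). */
	*grouped = false;
	for (i = 0; i < nr; i++) {
		err = open_event(i, -1, ctx);
		if (err)
			return err;
	}
	return 0;
}

/* Toy PMU model: a group cannot hold more members than counters. */
struct fake_pmu {
	size_t nr_counters;
};

static int fake_open(size_t idx, int leader, void *ctx)
{
	const struct fake_pmu *pmu = ctx;

	if (leader >= 0 && idx >= pmu->nr_counters)
		return -1;
	return 0;
}
```

With a four-counter fake_pmu, a five-event group fails as a group and is
reopened ungrouped, which mirrors the :W example above where all five
events report values at roughly 80% running time.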

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 01/11] perf, tools: Support weak groups
  2017-09-01 16:57   ` Jiri Olsa
@ 2017-09-01 17:00     ` Jiri Olsa
  2017-09-04 16:51       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 47+ messages in thread
From: Jiri Olsa @ 2017-09-01 17:00 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel, Andi Kleen

On Fri, Sep 01, 2017 at 06:57:59PM +0200, Jiri Olsa wrote:
> On Thu, Aug 31, 2017 at 12:40:26PM -0700, Andi Kleen wrote:
> > From: Andi Kleen <ak@linux.intel.com>
> > 
> > Setting up groups can be complicated due to the
> > complicated scheduling restrictions of different PMUs.
> > User tools usually don't understand all these restrictions.
> > Still in many cases it is useful to set up groups and
> > they work most of the time. However if the group
> > is set up wrong some members will not report any values
> > because they never get scheduled.
> > 
> > Add a concept of a 'weak group': try to set up a group,
> > but if it's not schedulable fall back to not using
> > a group. That gives us the best of both worlds:
> > groups if they work, but still a usable fallback if they don't.
> > 
> > In theory it would be possible to have more complex fallback
> > strategies (e.g. try to split the group in half), but
> > the simple fallback of not using a group seems to work for now.
> > 
> > So far the weak group is only implemented for perf stat,
> > not for record.
> > 
> > Here's an unschedulable group (on IvyBridge with SMT on)
> > 
> > % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
> > 
> >         73,806,067      branches
> >          4,848,144      branch-misses             #    6.57% of all branches
> >         14,754,458      l1d.replacement
> >         24,905,558      l2_lines_in.all
> >    <not supported>      l2_rqsts.all_code_rd         <------- will never report anything
> > 
> > With the weak group:
> > 
> > % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1
> > 
> >        125,366,055      branches                                                      (80.02%)
> >          9,208,402      branch-misses             #    7.35% of all branches          (80.01%)
> >         24,560,249      l1d.replacement                                               (80.00%)
> >         43,174,971      l2_lines_in.all                                               (80.05%)
> >         31,891,457      l2_rqsts.all_code_rd                                          (79.92%)
> > 
> > The extra event is scheduled, with some extra multiplexing.
> > 
> > v2: Move fallback code to separate function.
> > Add comment on for_each_group_member
> > Adjust to new perf_evsel__close interface
> > v3:
> > Fix debug print out.
> > Signed-off-by: Andi Kleen <ak@linux.intel.com>
> 
> Acked-by: Jiri Olsa <jolsa@kernel.org>

just realized we support this in stat only and the doc
indicates it's global.. maybe the perf-list.txt line
could have the '(perf stat only)' suffix ;-)

jirka

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Support standalone metrics and metric groups for perf
  2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
                   ` (10 preceding siblings ...)
  2017-08-31 19:40 ` [PATCH v3 11/11] perf, tools, stat: Update walltime_nsecs_stats in interval mode Andi Kleen
@ 2017-09-01 17:26 ` Jiri Olsa
  2017-09-01 17:36   ` Jiri Olsa
  2017-09-01 17:42   ` Andi Kleen
  11 siblings, 2 replies; 47+ messages in thread
From: Jiri Olsa @ 2017-09-01 17:26 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel

On Thu, Aug 31, 2017 at 12:40:25PM -0700, Andi Kleen wrote:

SNIP

> 
>    % perf stat -M Summary --metric-only -a sleep 1
>     
>      Performance counter stats for 'system wide':
>     
>     Instructions                              CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
>     317614222.0                              1392930775.0             0.0                 0.0                 0.2                 0.1
>     
>            1.001497549 seconds time elapsed
>     
>    % perf stat -M GFLOPs flops
>     
>      Performance counter stats for 'flops':
>     
>          3,999,541,471      fp_comp_ops_exe.sse_scalar_single #      1.2 GFLOPs                   (66.65%)
>                     14      fp_comp_ops_exe.sse_scalar_double                                     (66.65%)
>                      0      fp_comp_ops_exe.sse_packed_double                                     (66.67%)
>                      0      fp_comp_ops_exe.sse_packed_single                                     (66.70%)
>                      0      simd_fp_256.packed_double                                     (66.70%)
>                      0      simd_fp_256.packed_single                                     (66.67%)

looks like some events are probably crossing some
output boundaries we have:

[jolsa@krava perf]$ sudo ./perf stat -M SMT -I 1000
#           time             counts unit events
     1.000565706        408,879,985      inst_retired.any          #      0.7 CoreIPC                  (66.68%)
     1.000565706      1,120,999,114      cpu_clk_unhalted.thread_any                                     (66.68%)
     1.000565706        701,285,312      cycles                                                        (66.68%)
     1.000565706      1,148,325,740      cpu_clk_unhalted.thread_any # 574162870.0 CORE_CLKS             (66.67%)
     1.000565706        711,565,247      cpu_clk_unhalted.thread                                       (66.66%)
     1.000565706         24,057,590      cpu_clk_thread_unhalted.one_thread_active #      0.3 SMT_2T_Utilization       (66.67%)
     1.000565706         65,753,475      cpu_clk_thread_unhalted.ref_xclk_any                                     (66.67%)
^C     1.349436822         21,198,385      inst_retired.any          #      0.1 CoreIPC                  (66.70%)
     1.349436822        112,740,282      cpu_clk_unhalted.thread_any                                     (66.70%)
     1.349436822         84,509,414      cycles                                                        (66.70%)
     1.349436822        108,181,315      cpu_clk_unhalted.thread_any # 54090657.5 CORE_CLKS              (66.62%)
     1.349436822         79,700,353      cpu_clk_unhalted.thread                                       (66.61%)
     1.349436822          3,911,698      cpu_clk_thread_unhalted.one_thread_active #      0.8 SMT_2T_Utilization       (66.69%)
     1.349436822         14,739,671      cpu_clk_thread_unhalted.ref_xclk_any                                     (66.69%)


could you please check on that and maybe shift the alignment for the longest name?

thanks,
jirka

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Support standalone metrics and metric groups for perf
  2017-09-01 17:26 ` Support standalone metrics and metric groups for perf Jiri Olsa
@ 2017-09-01 17:36   ` Jiri Olsa
  2017-09-01 17:42   ` Andi Kleen
  1 sibling, 0 replies; 47+ messages in thread
From: Jiri Olsa @ 2017-09-01 17:36 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel

On Fri, Sep 01, 2017 at 07:26:18PM +0200, Jiri Olsa wrote:
> On Thu, Aug 31, 2017 at 12:40:25PM -0700, Andi Kleen wrote:
> 
> SNIP
> 
> > 
> >    % perf stat -M Summary --metric-only -a sleep 1
> >     
> >      Performance counter stats for 'system wide':
> >     
> >     Instructions                              CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
> >     317614222.0                              1392930775.0             0.0                 0.0                 0.2                 0.1
> >     
> >            1.001497549 seconds time elapsed
> >     
> >    % perf stat -M GFLOPs flops
> >     
> >      Performance counter stats for 'flops':
> >     
> >          3,999,541,471      fp_comp_ops_exe.sse_scalar_single #      1.2 GFLOPs                   (66.65%)
> >                     14      fp_comp_ops_exe.sse_scalar_double                                     (66.65%)
> >                      0      fp_comp_ops_exe.sse_packed_double                                     (66.67%)
> >                      0      fp_comp_ops_exe.sse_packed_single                                     (66.70%)
> >                      0      simd_fp_256.packed_double                                     (66.70%)
> >                      0      simd_fp_256.packed_single                                     (66.67%)
> 
> looks like some events are probably crossing some
> output boundaries we have:
> 
> [jolsa@krava perf]$ sudo ./perf stat -M SMT -I 1000
> #           time             counts unit events
>      1.000565706        408,879,985      inst_retired.any          #      0.7 CoreIPC                  (66.68%)
>      1.000565706      1,120,999,114      cpu_clk_unhalted.thread_any                                     (66.68%)
>      1.000565706        701,285,312      cycles                                                        (66.68%)
>      1.000565706      1,148,325,740      cpu_clk_unhalted.thread_any # 574162870.0 CORE_CLKS             (66.67%)
>      1.000565706        711,565,247      cpu_clk_unhalted.thread                                       (66.66%)
>      1.000565706         24,057,590      cpu_clk_thread_unhalted.one_thread_active #      0.3 SMT_2T_Utilization       (66.67%)
>      1.000565706         65,753,475      cpu_clk_thread_unhalted.ref_xclk_any                                     (66.67%)
> ^C     1.349436822         21,198,385      inst_retired.any          #      0.1 CoreIPC                  (66.70%)
>      1.349436822        112,740,282      cpu_clk_unhalted.thread_any                                     (66.70%)
>      1.349436822         84,509,414      cycles                                                        (66.70%)
>      1.349436822        108,181,315      cpu_clk_unhalted.thread_any # 54090657.5 CORE_CLKS              (66.62%)
>      1.349436822         79,700,353      cpu_clk_unhalted.thread                                       (66.61%)
>      1.349436822          3,911,698      cpu_clk_thread_unhalted.one_thread_active #      0.8 SMT_2T_Utilization       (66.69%)
>      1.349436822         14,739,671      cpu_clk_thread_unhalted.ref_xclk_any                                     (66.69%)
> 
> 
> could you please check on that and maybe shift the alignment for the longest name?

other than this the rest of the patchset looks ok to me

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Support standalone metrics and metric groups for perf
  2017-09-01 17:26 ` Support standalone metrics and metric groups for perf Jiri Olsa
  2017-09-01 17:36   ` Jiri Olsa
@ 2017-09-01 17:42   ` Andi Kleen
  2017-09-01 17:50     ` Jiri Olsa
  1 sibling, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-09-01 17:42 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, acme, jolsa, linux-kernel

> could you please check on that and maybe shift the alignment for the longest name?

Making everything align would require shifting every column. Then most
usages, which don't have events that wide, would no longer fit into
80 character columns. That would be worse.

The alternative is a two-pass output that computes the lengths first. But
that's fairly complicated and I would prefer not to tackle it right now.

Again, it's only visible in a few cases at all.

-Andi
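
For reference, a rough sketch of what such a two-pass output could look
like. This is not the perf stat printing code; the function names are
hypothetical, and the 25 column minimum is an assumption read off the
output quoted above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Pass 1: widest event name, but never narrower than min_width. */
static int name_column_width(const char *const *names, size_t nr,
			     int min_width)
{
	int width = min_width;
	size_t i;

	for (i = 0; i < nr; i++) {
		int len = (int)strlen(names[i]);

		if (len > width)
			width = len;
	}
	return width;
}

/* Pass 2: format one row, padding the name to the computed width. */
static int format_row(char *buf, size_t size, const char *count,
		      const char *name, int width, const char *metric)
{
	return snprintf(buf, size, "%18s      %-*s # %s",
			count, width, name, metric);
}
```

The cost is that all rows of an interval have to be known before the first
one is printed, and narrow event lists keep the current width only because
of the minimum.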

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Support standalone metrics and metric groups for perf
  2017-09-01 17:42   ` Andi Kleen
@ 2017-09-01 17:50     ` Jiri Olsa
  0 siblings, 0 replies; 47+ messages in thread
From: Jiri Olsa @ 2017-09-01 17:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, linux-kernel

On Fri, Sep 01, 2017 at 10:42:04AM -0700, Andi Kleen wrote:
> > could you please check on that and maybe shift the alignment for the longest name?
> 
> Making everything align would require shifting every column. Then most
> usages, which don't have events that wide, would no longer fit into
> 80 character columns. That would be worse.
> 
> The alternative is a two-pass output that computes the lengths first. But
> that's fairly complicated and I would prefer not to tackle it right now.

that's what I had in mind.. but could be dealt with later

jirka

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 01/11] perf, tools: Support weak groups
  2017-09-01 17:00     ` Jiri Olsa
@ 2017-09-04 16:51       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-04 16:51 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, jolsa, linux-kernel, Andi Kleen

On Fri, Sep 01, 2017 at 07:00:39PM +0200, Jiri Olsa wrote:
> On Fri, Sep 01, 2017 at 06:57:59PM +0200, Jiri Olsa wrote:
> > On Thu, Aug 31, 2017 at 12:40:26PM -0700, Andi Kleen wrote:
> > > From: Andi Kleen <ak@linux.intel.com>
> > > Add a concept of a 'weak group': try to set up a group,
> > > > but if it's not schedulable fall back to not using
> > > a group. That gives us the best of both worlds:
> > > groups if they work, but still a usable fallback if they don't.

> > Acked-by: Jiri Olsa <jolsa@kernel.org>
 
> just realized we support this in stat only and the doc indicates it's
> global.. maybe the perf-list.txt line could have the '(perf stat
> only)' suffix ;-)

I'll try to add something to that effect.

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-08-31 19:40 ` [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat Andi Kleen
@ 2017-09-04 17:11   ` Arnaldo Carvalho de Melo
  2017-09-04 17:37     ` Andi Kleen
  2017-09-22 16:31   ` [tip:perf/core] perf stat: Support JSON metrics in perf stat tip-bot for Andi Kleen
  1 sibling, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-04 17:11 UTC (permalink / raw)
  To: Andi Kleen; +Cc: jolsa, linux-kernel, Andi Kleen

On Thu, Aug 31, 2017 at 12:40:31PM -0700, Andi Kleen wrote:
> The actual JSON metrics are in a separate pull request.

Yeah, I noticed when trying to test it :-\

Was this pull req submitted?

- Arnaldo
 
> % perf stat -M Summary --metric-only -a sleep 1

[root@jouet ~]# perf stat -M Summary --metric-only -a sleep 1
Cannot find metric or group `Summary'

 Usage: perf stat [<options>] [<command>]

    -M, --metrics <metric/metric group list>
                          monitor specified metrics or metric groups (separated by ,)
[root@jouet ~]# perf stat -M GFLOPs flops
Cannot find metric or group `GFLOPs'

 Usage: perf stat [<options>] [<command>]

    -M, --metrics <metric/metric group list>
                          monitor specified metrics or metric groups (separated by ,)
[root@jouet ~]#

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-04 17:11   ` Arnaldo Carvalho de Melo
@ 2017-09-04 17:37     ` Andi Kleen
  2017-09-05 18:09       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-09-04 17:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Andi Kleen, jolsa, linux-kernel, Andi Kleen

On Mon, Sep 04, 2017 at 02:11:28PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Aug 31, 2017 at 12:40:31PM -0700, Andi Kleen wrote:
> > The actual JSON metrics are in a separate pull request.
> 
> Yeah, I noticed when trying to test it :-\
> 
> Was this pull req submitted?

Not yet. I was waiting to finish review.

It makes perf crash until you apply the metricgroup patchkit first.

But you can get it here

git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2


-Andi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-04 17:37     ` Andi Kleen
@ 2017-09-05 18:09       ` Arnaldo Carvalho de Melo
  2017-09-05 18:16         ` Arnaldo Carvalho de Melo
  2017-09-05 18:19         ` Andi Kleen
  0 siblings, 2 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-05 18:09 UTC (permalink / raw)
  To: Andi Kleen; +Cc: jolsa, linux-kernel, Andi Kleen

On Mon, Sep 04, 2017 at 10:37:25AM -0700, Andi Kleen wrote:
> On Mon, Sep 04, 2017 at 02:11:28PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, Aug 31, 2017 at 12:40:31PM -0700, Andi Kleen wrote:
> > > The actual JSON metrics are in a separate pull request.
> > 
> > Yeah, I noticed when trying to test it :-\
> > 
> > Was this pull req submitted?
> 
> Not yet. I was waiting to finish review.
> 
> It makes perf crash until you apply the metricgroup patchkit first.
> 
> But you can get it here
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2

Ok, so I tried installing the Broadwell ones (one of my test machines)
and got into:

[root@jouet ~]# perf stat -M Pipeline flops
bpf: builtin compilation failed: -95, try external compiler
ERROR: problems with path cpu/uops_executed.c: No such file or directory
Cannot set up events {uops_retired.retire_slots,inst_retired.any}:W,{inst_retired.any,cycles}:W,{uops_executed.thread,cpu/uops_executed.core,cmask=1/,uops_executed.cycles_ge_1_uop_exec}:W

 Usage: perf stat [<options>] [<command>]

    -M, --metrics <metric/metric group list>
                          monitor specified metrics or metric groups (separated by ,)
[root@jouet ~]# perf stat -M Pipeline flops


Wasn't this already fixed?

This is with these patches applied:

[acme@jouet linux]$ git log --oneline -5
5e6bf24a1929 (HEAD) perf stat: Support JSON metrics in perf stat
b1e1f50913c8 perf pmu: Extract function to get JSON alias map
af8aca46b528 perf stat: Print generic metric header even for failed expressions
15669491ee54 perf stat: Factor out generic metric printing
7d8900f2984c perf vendor events: Support metric_group and no event name in JSON parser
[acme@jouet linux]$ git log --oneline -10
5e6bf24a1929 (HEAD) perf stat: Support JSON metrics in perf stat
b1e1f50913c8 perf pmu: Extract function to get JSON alias map
af8aca46b528 perf stat: Print generic metric header even for failed expressions
15669491ee54 perf stat: Factor out generic metric printing
7d8900f2984c perf vendor events: Support metric_group and no event name in JSON parser
8f6ea3aac2e2 perf tools: Support weak groups in 'perf stat'
b5ef5ea5e8ed perf sched timehist: Add pid and tid options
eba9fac01761 (tag: perf-core-for-mingo-4.14-20170901, acme/perf/core) perf annotate browser: Help for cycling thru hottest instructions with TAB/shift+TAB
63ce8449bc10 perf stat: Only auto-merge events that are PMU aliases
fc33dccba395 perf test: Add test case for PERF_SAMPLE_PHYS_ADDR
[acme@jouet linux]$ 

Will try with all of them...

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-05 18:09       ` Arnaldo Carvalho de Melo
@ 2017-09-05 18:16         ` Arnaldo Carvalho de Melo
  2017-09-05 18:32           ` Arnaldo Carvalho de Melo
  2017-09-05 18:19         ` Andi Kleen
  1 sibling, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-05 18:16 UTC (permalink / raw)
  To: Andi Kleen; +Cc: jolsa, linux-kernel, Andi Kleen

On Tue, Sep 05, 2017 at 03:09:19PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Sep 04, 2017 at 10:37:25AM -0700, Andi Kleen wrote:
> > On Mon, Sep 04, 2017 at 02:11:28PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Thu, Aug 31, 2017 at 12:40:31PM -0700, Andi Kleen wrote:
> > > > The actual JSON metrics are in a separate pull request.
> > > 
> > > Yeah, I noticed when trying to test it :-\
> > > 
> > > Was this pull req submitted?
> > 
> > Not yet. I was waiting to finish review.
> > 
> > It makes perf crash until you apply the metricgroup patchkit first.
> > 
> > But you can get it here
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2
> 
> Ok, so I tried installing the Broadwell ones (one of my test machines)
> and got into:
> 
> [root@jouet ~]# perf stat -M Pipeline flops
> bpf: builtin compilation failed: -95, try external compiler


So there is no 'flops' binary here. I'm unsure whether the problem is
related to that; if so, we need a fix.

But that doesn't seem to be the case, as with a metric that works I get
a sensible error message:

[root@jouet ~]# perf stat -M Memory_BW --metric-only -a flops
Workload failed: No such file or directory
[root@jouet ~]#

Anyway, testing with some other metric I got some results:

[root@jouet ~]# perf stat -M FLOPS --metric-only -a sleep 1
Cannot set up events {fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W

 Usage: perf stat [<options>] [<command>]

    -M, --metrics <metric/metric group list>
                          monitor specified metrics or metric groups (separated by ,)
[root@jouet ~]# perf stat -M Memory_BW --metric-only -a sleep 1

 Performance counter stats for 'system wide':

MLP                  
     2.9            

       1.001544615 seconds time elapsed

[root@jouet ~]# 


> ERROR: problems with path cpu/uops_executed.c: No such file or directory
> Cannot set up events {uops_retired.retire_slots,inst_retired.any}:W,{inst_retired.any,cycles}:W,{uops_executed.thread,cpu/uops_executed.core,cmask=1/,uops_executed.cycles_ge_1_uop_exec}:W
> 
>  Usage: perf stat [<options>] [<command>]
> 
>     -M, --metrics <metric/metric group list>
>                           monitor specified metrics or metric groups (separated by ,)
> [root@jouet ~]# perf stat -M Pipeline flops
> 
> 
> Wasn't this already fixed?
> 
> This is with these patches applied:
> 
> [acme@jouet linux]$ git log --oneline -5
> 5e6bf24a1929 (HEAD) perf stat: Support JSON metrics in perf stat
> b1e1f50913c8 perf pmu: Extract function to get JSON alias map
> af8aca46b528 perf stat: Print generic metric header even for failed expressions
> 15669491ee54 perf stat: Factor out generic metric printing
> 7d8900f2984c perf vendor events: Support metric_group and no event name in JSON parser
> [acme@jouet linux]$ git log --oneline -10
> 5e6bf24a1929 (HEAD) perf stat: Support JSON metrics in perf stat
> b1e1f50913c8 perf pmu: Extract function to get JSON alias map
> af8aca46b528 perf stat: Print generic metric header even for failed expressions
> 15669491ee54 perf stat: Factor out generic metric printing
> 7d8900f2984c perf vendor events: Support metric_group and no event name in JSON parser
> 8f6ea3aac2e2 perf tools: Support weak groups in 'perf stat'
> b5ef5ea5e8ed perf sched timehist: Add pid and tid options
> eba9fac01761 (tag: perf-core-for-mingo-4.14-20170901, acme/perf/core) perf annotate browser: Help for cycling thru hottest instructions with TAB/shift+TAB
> 63ce8449bc10 perf stat: Only auto-merge events that are PMU aliases
> fc33dccba395 perf test: Add test case for PERF_SAMPLE_PHYS_ADDR
> [acme@jouet linux]$ 
> 
> Will try with all of them...
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-05 18:09       ` Arnaldo Carvalho de Melo
  2017-09-05 18:16         ` Arnaldo Carvalho de Melo
@ 2017-09-05 18:19         ` Andi Kleen
  2017-09-05 18:52           ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-09-05 18:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Andi Kleen, jolsa, linux-kernel

On Tue, Sep 05, 2017 at 03:09:19PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Sep 04, 2017 at 10:37:25AM -0700, Andi Kleen wrote:
> > On Mon, Sep 04, 2017 at 02:11:28PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Thu, Aug 31, 2017 at 12:40:31PM -0700, Andi Kleen wrote:
> > > > The actual JSON metrics are in a separate pull request.
> > > 
> > > Yeah, I noticed when trying to test it :-\
> > > 
> > > Was this pull req submitted?
> > 
> > Not yet. I was waiting to finish review.
> > 
> > It makes perf crash until you apply the metricgroup patchkit first.
> > 
> > But you can get it here
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2
> 
> Ok, so I tried installing the Broadwell ones (one of my test machines)
> and got into:

I would suggest applying all the patches from the patchkit before you test.
Some of the later ones add/fix stuff needed by some of the metrics.

> 
> [root@jouet ~]# perf stat -M Pipeline flops
> bpf: builtin compilation failed: -95, try external compiler
> ERROR: problems with path cpu/uops_executed.c: No such file or directory
> Cannot set up events {uops_retired.retire_slots,inst_retired.any}:W,{inst_retired.any,cycles}:W,{uops_executed.thread,cpu/uops_executed.core,cmask=1/,uops_executed.cycles_ge_1_uop_exec}:W
> 
>  Usage: perf stat [<options>] [<command>]
> 
>     -M, --metrics <metric/metric group list>
>                           monitor specified metrics or metric groups (separated by ,)
> [root@jouet ~]# perf stat -M Pipeline flops
> 
> 
> Wasn't this already fixed?

Yes that was supposed to be fixed with 

commit 77d0871c76bad1093a3d86870fe76dd1ad0ca397
Author: Andi Kleen <ak@linux.intel.com>
Date:   Fri Aug 11 16:26:19 2017 -0700

    perf bpf: Tighten detection of BPF events
    
      perf stat -e cpu/uops_executed.core,cmask=1/


I'll check.

-Andi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-05 18:16         ` Arnaldo Carvalho de Melo
@ 2017-09-05 18:32           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-05 18:32 UTC (permalink / raw)
  To: Andi Kleen; +Cc: jolsa, linux-kernel, Andi Kleen

On Tue, Sep 05, 2017 at 03:16:12PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Sep 05, 2017 at 03:09:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Sep 04, 2017 at 10:37:25AM -0700, Andi Kleen wrote:
> > > On Mon, Sep 04, 2017 at 02:11:28PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Thu, Aug 31, 2017 at 12:40:31PM -0700, Andi Kleen wrote:
> > > > > The actual JSON metrics are in a separate pull request.
> > > > 
> > > > Yeah, I noticed when trying to test it :-\
> > > > 
> > > > Was this pull req submitted?
> > > 
> > > Not yet. I was waiting to finish review.
> > > 
> > > It makes perf crash until you apply the metricgroup patchkit first.
> > > 
> > > But you can get it here
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2
> > 
> > Ok, so I tried installing the Broadwell ones (one of my test machines)
> > and got into:
> > 
> > [root@jouet ~]# perf stat -M Pipeline flops
> > bpf: builtin compilation failed: -95, try external compiler

further info:

[acme@jouet linux]$ perf stat -v -M GFLOPs sleep 1
Using CPUID GenuineIntel-6-3D
metric expr ( 1*( fp_arith_inst_retired.scalar_single + fp_arith_inst_retired.scalar_double ) + 2* fp_arith_inst_retired.128b_packed_double + 4*( fp_arith_inst_retired.128b_packed_single + fp_arith_inst_retired.256b_packed_double ) + 8* fp_arith_inst_retired.256b_packed_single ) / 1000000000 / duration_time for GFLOPs
found event fp_arith_inst_retired.scalar_single
found event fp_arith_inst_retired.scalar_double
found event fp_arith_inst_retired.128b_packed_double
found event fp_arith_inst_retired.128b_packed_single
found event fp_arith_inst_retired.256b_packed_double
found event fp_arith_inst_retired.256b_packed_single
found event duration_time
adding {fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W
fp_arith_inst_retired.scalar_single -> cpu/umask=0x2,period=2000003,event=0xc7/
fp_arith_inst_retired.scalar_double -> cpu/umask=0x1,period=2000003,event=0xc7/
fp_arith_inst_retired.128b_packed_double -> cpu/umask=0x4,period=2000003,event=0xc7/
fp_arith_inst_retired.128b_packed_single -> cpu/umask=0x8,period=2000003,event=0xc7/
fp_arith_inst_retired.256b_packed_double -> cpu/umask=0x10,period=2000003,event=0xc7/
fp_arith_inst_retired.256b_packed_single -> cpu/umask=0x20,period=2000003,event=0xc7/
Cannot set up events {fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W

 Usage: perf stat [<options>] [<command>]

    -M, --metrics <metric/metric group list>
                          monitor specified metrics or metric groups (separated by ,)
[acme@jouet linux]$ 

Humm, so it uses that duration_time and that gets added only later?

/me tries with all the patches:

Yeah:

[acme@jouet linux]$ perf stat -M GFLOPs sleep 1

 Performance counter stats for 'sleep 1':

                 0      fp_arith_inst_retired.scalar_single:u                                   
                 3      fp_arith_inst_retired.scalar_double:u                                   
                 0      fp_arith_inst_retired.128b_packed_double:u                                   
                 0      fp_arith_inst_retired.128b_packed_single:u                                   
                 0      fp_arith_inst_retired.256b_packed_double:u                                     (40.86%)
     <not counted>      fp_arith_inst_retired.256b_packed_single:u                                     (0.00%)
                 0      duration_time:u                                             

       1.003398909 seconds time elapsed

[acme@jouet linux]$
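
For reference, the GFLOPs expression that -v prints above can be written out as a small C function. This is only a sketch of the arithmetic, not perf code: struct fp_counts and gflops() are made-up names, duration_time is taken in whatever unit perf supplies it, and returning 0 for a zero duration is an assumption made here to avoid a division by zero.

```c
#include <stdint.h>

/* Hypothetical container for the six raw counts; not a perf structure. */
struct fp_counts {
	uint64_t scalar_single;
	uint64_t scalar_double;
	uint64_t packed_128b_double;
	uint64_t packed_128b_single;
	uint64_t packed_256b_double;
	uint64_t packed_256b_single;
};

/*
 * Mirrors the metric expression term by term:
 *   ( 1*(scalar_single + scalar_double) + 2*128b_packed_double
 *     + 4*(128b_packed_single + 256b_packed_double)
 *     + 8*256b_packed_single ) / 1000000000 / duration_time
 */
static double gflops(const struct fp_counts *c, double duration_time)
{
	double flops;

	/* Assumption of this sketch: report 0 instead of dividing by zero. */
	if (duration_time == 0.0)
		return 0.0;

	flops = 1.0 * (double)(c->scalar_single + c->scalar_double)
	      + 2.0 * (double)c->packed_128b_double
	      + 4.0 * (double)(c->packed_128b_single + c->packed_256b_double)
	      + 8.0 * (double)c->packed_256b_single;

	return flops / 1000000000.0 / duration_time;
}
```

The weights 1, 2, 4 and 8 are the per-instruction multipliers that appear in the expression itself.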

With --metric-only I get nothing, what am I doing wrong?

[acme@jouet linux]$ perf stat -M GFLOPs --metric-only sleep 1

 Performance counter stats for 'sleep 1':

GFLOPs               
                                         

       1.003217768 seconds time elapsed

[acme@jouet linux]$

I just rebuilt the tools completely from scratch, deleting the build
dir, to see if this was some build artifact. That didn't help: I still
get just the time elapsed.

Now with -v:

[acme@jouet linux]$ perf stat -v -M GFLOPs --metric-only sleep 1
Using CPUID GenuineIntel-6-3D
metric expr ( 1*( fp_arith_inst_retired.scalar_single + fp_arith_inst_retired.scalar_double ) + 2* fp_arith_inst_retired.128b_packed_double + 4*( fp_arith_inst_retired.128b_packed_single + fp_arith_inst_retired.256b_packed_double ) + 8* fp_arith_inst_retired.256b_packed_single ) / 1000000000 / duration_time for GFLOPs
found event fp_arith_inst_retired.scalar_single
found event fp_arith_inst_retired.scalar_double
found event fp_arith_inst_retired.128b_packed_double
found event fp_arith_inst_retired.128b_packed_single
found event fp_arith_inst_retired.256b_packed_double
found event fp_arith_inst_retired.256b_packed_single
found event duration_time
adding {fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W
fp_arith_inst_retired.scalar_single -> cpu/umask=0x2,period=2000003,event=0xc7/
fp_arith_inst_retired.scalar_double -> cpu/umask=0x1,period=2000003,event=0xc7/
fp_arith_inst_retired.128b_packed_double -> cpu/umask=0x4,period=2000003,event=0xc7/
fp_arith_inst_retired.128b_packed_single -> cpu/umask=0x8,period=2000003,event=0xc7/
fp_arith_inst_retired.256b_packed_double -> cpu/umask=0x10,period=2000003,event=0xc7/
fp_arith_inst_retired.256b_packed_single -> cpu/umask=0x20,period=2000003,event=0xc7/
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
Weak group for fp_arith_inst_retired.scalar_single:u/7 failed
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
fp_arith_inst_retired.scalar_single:u: 0 2137416 524337
fp_arith_inst_retired.scalar_double:u: 0 1525387 1525387
fp_arith_inst_retired.128b_packed_double:u: 0 2429065 2429065
fp_arith_inst_retired.128b_packed_single:u: 0 2429065 2429065
fp_arith_inst_retired.256b_packed_double:u: 0 2429065 1904728
fp_arith_inst_retired.256b_packed_single:u: 0 2429065 903678
duration_time:u: 0 2394375 2394375

 Performance counter stats for 'sleep 1':

fp_arith_inst_retired.scalar_single not found
GFLOPs               
fp_arith_inst_retired.scalar_single not found
                     

       1.005367355 seconds time elapsed

[acme@jouet linux]$
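
The three numbers after each event name in the -v output above are the raw count, the time the event was enabled and the time it was actually running. A minimal sketch of how those relate to the percentage shown in the normal output and to a multiplexing-scaled count, assuming the usual count * enabled / running extrapolation; running_pct() and scaled_count() are illustrative names, not perf functions:

```c
#include <stdint.h>

/* Share of the enabled time the event spent on a counter, in percent. */
static double running_pct(uint64_t ena, uint64_t run)
{
	return ena ? 100.0 * (double)run / (double)ena : 0.0;
}

/*
 * Extrapolate a count taken while the event ran only part of the time.
 * An event that never ran has nothing to extrapolate from, so this
 * sketch returns 0 for it.
 */
static double scaled_count(uint64_t val, uint64_t ena, uint64_t run)
{
	if (run == 0)
		return 0.0;
	return (double)val * (double)ena / (double)run;
}
```

With the scalar_single line above (enabled 2137416, running 524337) the event was on a counter for roughly a quarter of the time it was enabled.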


We need to improve these messages...

As root, with the Summary metric group:

[acme@jouet linux]$ sudo ~acme/bin/perf stat -M Summary --metric-only -a sleep 1

 Performance counter stats for 'system wide':

Instructions         CPI                  CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization   
134707420.0              0.0            698757419.0              0.1                 0.0                 0.3                 0.4            

       1.003365814 seconds time elapsed

[acme@jouet linux]$ 

- Arnaldo


* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-05 18:19         ` Andi Kleen
@ 2017-09-05 18:52           ` Arnaldo Carvalho de Melo
  2017-09-05 19:52             ` Andi Kleen
  0 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-05 18:52 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andi Kleen, jolsa, linux-kernel

Em Tue, Sep 05, 2017 at 11:19:52AM -0700, Andi Kleen escreveu:
> On Tue, Sep 05, 2017 at 03:09:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Sep 04, 2017 at 10:37:25AM -0700, Andi Kleen escreveu:
> > > But you can get it here

> > > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2

> > Ok, so I tried installing the Broadwell ones (one of my test machines)
> > and got into:
 
> I would suggest to apply all the patches from the patchkit before you test.
> Some of the later ones add/fix stuff needed by some of the metrics.
> 
> > 
> > [root@jouet ~]# perf stat -M Pipeline flops
> > bpf: builtin compilation failed: -95, try external compiler
> > ERROR: problems with path cpu/uops_executed.c: No such file or directory
> > Cannot set up events {uops_retired.retire_slots,inst_retired.any}:W,{inst_retired.any,cycles}:W,{uops_executed.thread,cpu/uops_executed.core,cmask=1/,uops_executed.cycles_ge_1_uop_exec}:W
> > 
> >  Usage: perf stat [<options>] [<command>]
> > 
> >     -M, --metrics <metric/metric group list>
> >                           monitor specified metrics or metric groups (separated by ,)
> > [root@jouet ~]# perf stat -M Pipeline flops
> > 
> > 
> > Wasn't this already fixed?
> 
> Yes that was supposed to be fixed with 
 
>     perf bpf: Tighten detection of BPF events
>     
>       perf stat -e cpu/uops_executed.core,cmask=1/
 
> I'll check.

Ok, I couldn't reproduce it anymore, after you check this, please let me
know if I should pull that perf/intel-json-metrics-2 branch so that
someone wanting to test this can have it all in one place, ok?

- Arnaldo


* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-05 18:52           ` Arnaldo Carvalho de Melo
@ 2017-09-05 19:52             ` Andi Kleen
  2017-09-05 20:07               ` Arnaldo Carvalho de Melo
                                 ` (3 more replies)
  0 siblings, 4 replies; 47+ messages in thread
From: Andi Kleen @ 2017-09-05 19:52 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Andi Kleen, Andi Kleen, jolsa, linux-kernel

> > I'll check.
> 
> Ok, I couldn't reproduce it anymore, after you check this, please let me
> know if I should pull that perf/intel-json-metrics-2 branch so that
> someone wanting to test this can have it all in one place, ok?

Sure please pull.

The only missing thing are metrics for Skylake Server, but I can
submit those later.

-Andi


* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-05 19:52             ` Andi Kleen
@ 2017-09-05 20:07               ` Arnaldo Carvalho de Melo
  2017-09-05 20:37                 ` Andi Kleen
  2017-09-22 16:37               ` [tip:perf/core] perf vendor events: Add JSON metrics for Broadwell tip-bot for Andi Kleen
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-05 20:07 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andi Kleen, jolsa, linux-kernel

Em Tue, Sep 05, 2017 at 12:52:35PM -0700, Andi Kleen escreveu:
> > > I'll check.
> > 
> > Ok, I couldn't reproduce it anymore, after you check this, please let me
> > know if I should pull that perf/intel-json-metrics-2 branch so that
> > someone wanting to test this can have it all in one place, ok?
> 
> Sure please pull.
> 
> The only missing thing are metrics for Skylake Server, but I can
> submit those later.

Ok, so I looked at the commit logs and found them rather dull, how is
that option to fake a CPU so that the tool thinks it is running on some
specific machine (broadwell, skylake, etc) so that I can augment those
with the output of 'perf list metricgroup' for each of them?

- Arnaldo


* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-05 20:07               ` Arnaldo Carvalho de Melo
@ 2017-09-05 20:37                 ` Andi Kleen
  2017-09-08 18:10                   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-09-05 20:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Andi Kleen, jolsa, linux-kernel

On Tue, Sep 05, 2017 at 05:07:09PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Sep 05, 2017 at 12:52:35PM -0700, Andi Kleen escreveu:
> > > > I'll check.
> > > 
> > > Ok, I couldn't reproduce it anymore, after you check this, please let me
> > > know if I should pull that perf/intel-json-metrics-2 branch so that
> > > someone wanting to test this can have it all in one place, ok?
> > 
> > Sure please pull.
> > 
> > The only missing thing are metrics for Skylake Server, but I can
> > submit those later.
> 
> Ok, so I looked at the commit logs and found them rather dull, how is
> that option to fake a CPU so that the tool thinks it is running on some
> specific machine (broadwell, skylake, etc) so that I can augment those
> with the output of 'perf list metricgroup' for each of them?

PERF_CPUID=GenuineIntel-...

See the mapfile.csv for valid codes

But it's quite a few.

-Andi


* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-05 20:37                 ` Andi Kleen
@ 2017-09-08 18:10                   ` Arnaldo Carvalho de Melo
  2017-09-08 19:08                     ` Andi Kleen
  0 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-08 18:10 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andi Kleen, jolsa, linux-kernel

Em Tue, Sep 05, 2017 at 01:37:08PM -0700, Andi Kleen escreveu:
> On Tue, Sep 05, 2017 at 05:07:09PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Sep 05, 2017 at 12:52:35PM -0700, Andi Kleen escreveu:
> > > > > I'll check.
> > > > 
> > > > Ok, I couldn't reproduce it anymore, after you check this, please let me
> > > > know if I should pull that perf/intel-json-metrics-2 branch so that
> > > > someone wanting to test this can have it all in one place, ok?
> > > 
> > > Sure please pull.
> > > 
> > > The only missing thing are metrics for Skylake Server, but I can
> > > submit those later.
> > 
> > Ok, so I looked at the commit logs and found them rather dull, how is
> > that option to fake a CPU so that the tool thinks it is running on some
> > specific machine (broadwell, skylake, etc) so that I can augment those
> > with the output of 'perf list metricgroup' for each of them?
> 
> PERF_CPUID=GenuineIntel-...
> 
> See the mapfile.csv for valid codes
> 
> But it's quite a few.

yeah, I'm testing on the ones I have access and on a Skylake machine I'm having
trouble, see below.

That error message is not that helpful. What could be at play here? The kernel
running there is a recent RT variant of the RHEL kernel, with lots of
backports in the perf codebase, so I wasn't expecting that to be an issue.
Please take a look.

I'll try later with a more recent kernel, maybe there are kernel patches
supporting Skylake that aren't in this kernel :-\

[root@seventh ~]# uname -a
Linux seventh 3.10.0-703.rt56.630.el7.x86_64 #1 SMP PREEMPT RT Sat Aug 19 22:32:17 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@seventh ~]# grep "model name" /proc/cpuinfo | head -1
model name	: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
[root@seventh ~]# perf list metricgroup

List of pre-defined events (to be used in -e):


Metric Groups:

DSB
FLOPS
Frontend
Frontend_Bandwidth
Memory_BW
Memory_Bound
Memory_Lat
Pipeline
Ports_Utilization
Power
SMT
Summary
TLB
TopDownL1
Unknown_Branches
[root@seventh ~]# perf stat --metric-only -M Summary -a sleep 1
Cannot set up events {inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W,{cpu_clk_thread_unhalted.one_thread_active,cpu_clk_thread_unhalted.ref_xclk_any}:W,{cpu_clk_unhalted.ref_tsc:u,cpu_clk_unhalted.ref_tsc}:W

 Usage: perf stat [<options>] [<command>]

    -M, --metrics <metric/metric group list>
                          monitor specified metrics or metric groups (separated by ,)
[root@seventh ~]#

[root@seventh ~]# perf stat -vv --metric-only -M Summary -a sleep 1
Using CPUID GenuineIntel-6-9E
metric expr 1 / inst_retired.any / cycles for CPI
found event inst_retired.any
found event cycles
metric expr cpu_clk_unhalted.thread for CLKS
found event cpu_clk_unhalted.thread
metric expr inst_retired.any for Instructions
found event inst_retired.any
metric expr cpu_clk_unhalted.ref_tsc / msr@tsc@ for CPU_Utilization
found event cpu_clk_unhalted.ref_tsc
found event msr/tsc/
metric expr ( 1*( fp_arith_inst_retired.scalar_single + fp_arith_inst_retired.scalar_double ) + 2* fp_arith_inst_retired.128b_packed_double + 4*( fp_arith_inst_retired.128b_packed_single + fp_arith_inst_retired.256b_packed_double ) + 8* fp_arith_inst_retired.256b_packed_single ) / 1000000000 / duration_time for GFLOPs
found event fp_arith_inst_retired.scalar_single
found event fp_arith_inst_retired.scalar_double
found event fp_arith_inst_retired.128b_packed_double
found event fp_arith_inst_retired.128b_packed_single
found event fp_arith_inst_retired.256b_packed_double
found event fp_arith_inst_retired.256b_packed_single
found event duration_time
metric expr 1 - cpu_clk_thread_unhalted.one_thread_active / ( cpu_clk_thread_unhalted.ref_xclk_any / 2 ) if #smt_on else 0 for SMT_2T_Utilization
found event cpu_clk_thread_unhalted.one_thread_active
found event cpu_clk_thread_unhalted.ref_xclk_any
metric expr cpu_clk_unhalted.ref_tsc:u / cpu_clk_unhalted.ref_tsc for Kernel_Utilization
found event cpu_clk_unhalted.ref_tsc:u
found event cpu_clk_unhalted.ref_tsc
adding {inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W,{cpu_clk_thread_unhalted.one_thread_active,cpu_clk_thread_unhalted.ref_xclk_any}:W,{cpu_clk_unhalted.ref_tsc:u,cpu_clk_unhalted.ref_tsc}:W
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3
inst_retired.any -> cpu/event=0xc0/
cpu_clk_unhalted.thread -> cpu/event=0x3c/
inst_retired.any -> cpu/event=0xc0/
cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
Cannot set up events {inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W,{cpu_clk_thread_unhalted.one_thread_active,cpu_clk_thread_unhalted.ref_xclk_any}:W,{cpu_clk_unhalted.ref_tsc:u,cpu_clk_unhalted.ref_tsc}:W

 Usage: perf stat [<options>] [<command>]

    -M, --metrics <metric/metric group list>
                          monitor specified metrics or metric groups (separated by ,)
[root@seventh ~]#


Humm, it doesn't even get to try sys_perf_event_open(). It seems to be bailing
out while reading some sysfs event_source files, so better error output seems
to be in order:

[root@seventh ~]# perf trace -e perf_* perf stat --metric-only -M Summary -a sleep 1
Cannot set up events {inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W,{cpu_clk_thread_unhalted.one_thread_active,cpu_clk_thread_unhalted.ref_xclk_any}:W,{cpu_clk_unhalted.ref_tsc:u,cpu_clk_unhalted.ref_tsc}:W

 Usage: perf stat [<options>] [<command>]

    -M, --metrics <metric/metric group list>
                          monitor specified metrics or metric groups (separated by ,)
[root@seventh ~]# 

[root@seventh linux]# strace perf stat --metric-only -M Summary -a sleep 1 |& tail -30
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
openat(AT_FDCWD, "/sys/bus/event_source/devices/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, /* 8 entries */, 32768)     = 232
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
stat("/sys/bus/event_source/devices/msr/format", 0x7ffd36f79480) = -1 ENOENT (No such file or directory)
stat("/sys/bus/event_source/devices/msr/type", 0x7ffd36f79470) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/sys/bus/event_source/devices/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, /* 8 entries */, 32768)     = 232
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
write(2, "Cannot set up events {inst_retir"..., 529Cannot set up events {inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W,{cpu_clk_thread_unhalted.one_thread_active,cpu_clk_thread_unhalted.ref_xclk_any}:W,{cpu_clk_unhalted.ref_tsc:u,cpu_clk_unhalted.ref_tsc}:W
) = 529
write(2, "\n Usage: perf stat [<options>] ["..., 43
 Usage: perf stat [<options>] [<command>]
) = 43
write(2, "\n", 1
)                       = 1
write(2, "    ", 4    )                     = 4
write(2, "-M", 2-M)                       = 2
write(2, ", ", 2, )                       = 2
write(2, "--metrics", 9--metrics)                = 9
write(2, " <metric/metric group list>", 27 <metric/metric group list>) = 27
write(2, "\n", 1
)                       = 1
write(2, "                          monito"..., 86                          monitor specified metrics or metric groups (separated by ,)
) = 86
exit_group(129)                         = ?
+++ exited with 129 +++
[root@seventh linux]#


* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-08 18:10                   ` Arnaldo Carvalho de Melo
@ 2017-09-08 19:08                     ` Andi Kleen
  2017-09-11 14:05                       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2017-09-08 19:08 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Andi Kleen, jolsa, linux-kernel

> > PERF_CPUID=GenuineIntel-...
> > 
> > See the mapfile.csv for valid codes
> > 
> > But it's quite a few.
> 
> yeah, I'm testing on the ones I have access and on a Skylake machine I'm having
> trouble, see below.
> [root@seventh ~]# perf stat -vv --metric-only -M Summary -a sleep 1

The problem is that your kernel is missing the msr// PMU.

It should work with a newer kernel.
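
A quick way to check for this before running perf is to test for the sysfs directory that the strace output shows perf probing. A minimal sketch, assuming each PMU appears as a directory under /sys/bus/event_source/devices/; pmu_present() is a made-up helper for illustration, not something perf provides:

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if "<base>/<pmu>" exists and is a
 * directory (symlinks are followed), 0 otherwise.
 */
static int pmu_present(const char *base, const char *pmu)
{
	char path[4096];
	struct stat st;
	int n;

	n = snprintf(path, sizeof(path), "%s/%s", base, pmu);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0; /* path did not fit in the buffer */

	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

Calling pmu_present("/sys/bus/event_source/devices", "msr") on the failing kernel would be expected to return 0, matching the ENOENT results in the strace.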

I can look at better error messages.

-Andi


* Re: [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat
  2017-09-08 19:08                     ` Andi Kleen
@ 2017-09-11 14:05                       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-11 14:05 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andi Kleen, jolsa, linux-kernel

Em Fri, Sep 08, 2017 at 12:08:48PM -0700, Andi Kleen escreveu:
> > > PERF_CPUID=GenuineIntel-...
> > > 
> > > See the mapfile.csv for valid codes
> > > 
> > > But it's quite a few.
> > 
> > yeah, I'm testing on the ones I have access and on a Skylake machine I'm having
> > trouble, see below.
> > [root@seventh ~]# perf stat -vv --metric-only -M Summary -a sleep 1
> 
> The problem is that your kernel is missing the msr// PMU.
> 
> It should work with a newer kernel.

Yes:

[root@seventh ~]# uname -r
4.12.0-rc6+
[root@seventh ~]# perf stat --metric-only -M Summary -a sleep 1 

 Performance counter stats for 'system wide':

Instructions         CPI                  CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization   
34021097.0               0.0            119424171.0              0.0                 0.0                 0.0                 0.0            

       1.001001793 seconds time elapsed

[root@seventh ~]#
 
> I can look at better error messages.

Please do,

Thanks!

- Arnaldo


* [tip:perf/core] perf tools: Support weak groups in 'perf stat'
  2017-08-31 19:40 ` [PATCH v3 01/11] perf, tools: Support weak groups Andi Kleen
  2017-09-01 16:57   ` Jiri Olsa
@ 2017-09-22 16:28   ` tip-bot for Andi Kleen
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:28 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: hpa, acme, jolsa, linux-kernel, ak, tglx, mingo

Commit-ID:  5a5dfe4b8548d806bf433090995ee0ee4c139f11
Gitweb:     http://git.kernel.org/tip/5a5dfe4b8548d806bf433090995ee0ee4c139f11
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:26 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:12 -0300

perf tools: Support weak groups in 'perf stat'

Setting up groups can be complicated due to the complicated scheduling
restrictions of different PMUs.

User tools usually don't understand all these restrictions.

Still in many cases it is useful to set up groups and they work most of
the time. However if the group is set up wrong some members will not
report any value because they never get scheduled.

Add a concept of a 'weak group': try to set up a group, but if it's not
schedulable fall back to not using a group. That gives us the best of
both worlds: groups if they work, but still a usable fallback if they
don't.

In theory it would be possible to have more complex fallback strategies
(e.g. try to split the group in half), but the simple fallback of not
using a group seems to work for now.

So far the weak group is only implemented for perf stat, not for record.
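
The fallback described above can be modelled with a toy, self-contained sketch. It is not the code from this patch: struct toy_event, toy_open() and open_all() are invented for illustration, and the "a group holds at most max_group members" rule is only a stand-in for the real PMU scheduling restrictions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an event; not struct perf_evsel. */
struct toy_event {
	size_t leader;   /* index of the group leader; own index if ungrouped */
	bool weak_group; /* set when the group was given the W modifier */
	bool opened;
};

/* Toy model of the kernel: opening fails once a group exceeds max_group. */
static bool toy_open(const struct toy_event *ev, size_t i, size_t max_group)
{
	size_t j, members = 0;

	for (j = 0; j <= i; j++)
		if (ev[j].leader == ev[i].leader)
			members++;
	return members <= max_group;
}

/*
 * Open all events in order. If a member of a weak group fails to open,
 * close what was opened of that group, make every member its own leader
 * and retry from the former leader. Returns the number of groups that
 * were broken up, or -1 if an event could not be opened at all.
 */
static int open_all(struct toy_event *ev, size_t n, size_t max_group)
{
	size_t i = 0, j;
	int resets = 0;

	while (i < n) {
		if (!toy_open(ev, i, max_group)) {
			size_t leader = ev[i].leader;

			if (leader == i || !ev[i].weak_group)
				return -1;
			for (j = 0; j < n; j++) {
				if (ev[j].leader == leader) {
					ev[j].opened = false;
					ev[j].leader = j;
				}
			}
			resets++;
			i = leader;
			continue;
		}
		ev[i].opened = true;
		i++;
	}
	return resets;
}
```

A group of five weak events with max_group set to 4 ends up as five independent events after one reset, which corresponds to the multiplexed output shown below for the :W case.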

Here's an unschedulable group (on IvyBridge with SMT on)

  % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1

        73,806,067      branches
         4,848,144      branch-misses             #    6.57% of all branches
        14,754,458      l1d.replacement
        24,905,558      l2_lines_in.all
   <not supported>      l2_rqsts.all_code_rd         <------- will never report anything

With the weak group:

  % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1

       125,366,055      branches                                                      (80.02%)
         9,208,402      branch-misses             #    7.35% of all branches          (80.01%)
        24,560,249      l1d.replacement                                               (80.00%)
        43,174,971      l2_lines_in.all                                               (80.05%)
        31,891,457      l2_rqsts.all_code_rd                                          (79.92%)

The extra event is scheduled with some extra multiplexing.

v2: Move fallback code to separate function.
Add comment on for_each_group_member
Adjust to new perf_evsel__close interface
v3: Fix debug print out.

Committer testing:

Before:

  # perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1

   Performance counter stats for 'system wide':

     <not counted>      branches
     <not counted>      branch-misses
     <not counted>      l1d.replacement
     <not counted>      l2_lines_in.all
   <not supported>      l2_rqsts.all_code_rd

       1.002147212 seconds time elapsed

  # perf stat -e '{branches,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1

   Performance counter stats for 'system wide':

        83,207,892      branches
        11,065,444      l1d.replacement
        28,484,024      l2_lines_in.all
        12,186,179      l2_rqsts.all_code_rd

       1.001739493 seconds time elapsed

After:

  # perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}':W -a sleep 1

   Performance counter stats for 'system wide':

       543,323,909      branches                                                      (80.01%)
        27,100,512      branch-misses             #    4.99% of all branches          (80.02%)
        50,402,905      l1d.replacement                                               (80.03%)
        67,385,892      l2_lines_in.all                                               (80.01%)
        21,352,885      l2_rqsts.all_code_rd                                          (79.94%)

       1.001086658 seconds time elapsed

  #

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20170831194036.30146-2-andi@firstfloor.org
[ Add a "'perf stat' only, for now" comment in the man page, suggested by Jiri ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-list.txt |  2 ++
 tools/perf/builtin-stat.c              | 35 ++++++++++++++++++++++++++++++++++
 tools/perf/util/evsel.h                |  1 +
 tools/perf/util/parse-events.c         |  8 +++++++-
 tools/perf/util/parse-events.l         |  2 +-
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index f709de5..75fc17f 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -47,6 +47,8 @@ counted. The following modifiers exist:
  P - use maximum detected precise level
  S - read sample value (PERF_SAMPLE_READ)
  D - pin the event to the PMU
+ W - group is weak and will fallback to non-group if not schedulable,
+     only supported in 'perf stat' for now.
 
 The 'p' modifier can be used for specifying how precise the instruction
 address should be. The 'p' modifier can be specified multiple times:
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 69523ed..7cc61eb 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -582,6 +582,32 @@ static bool perf_evsel__should_store_id(struct perf_evsel *counter)
 	return STAT_RECORD || counter->attr.read_format & PERF_FORMAT_ID;
 }
 
+static struct perf_evsel *perf_evsel__reset_weak_group(struct perf_evsel *evsel)
+{
+	struct perf_evsel *c2, *leader;
+	bool is_open = true;
+
+	leader = evsel->leader;
+	pr_debug("Weak group for %s/%d failed\n",
+			leader->name, leader->nr_members);
+
+	/*
+	 * for_each_group_member doesn't work here because it doesn't
+	 * include the first entry.
+	 */
+	evlist__for_each_entry(evsel_list, c2) {
+		if (c2 == evsel)
+			is_open = false;
+		if (c2->leader == leader) {
+			if (is_open)
+				perf_evsel__close(c2);
+			c2->leader = c2;
+			c2->nr_members = 0;
+		}
+	}
+	return leader;
+}
+
 static int __run_perf_stat(int argc, const char **argv)
 {
 	int interval = stat_config.interval;
@@ -618,6 +644,15 @@ static int __run_perf_stat(int argc, const char **argv)
 	evlist__for_each_entry(evsel_list, counter) {
 try_again:
 		if (create_perf_stat_counter(counter) < 0) {
+
+			/* Weak group failed. Reset the group. */
+			if (errno == EINVAL &&
+			    counter->leader != counter &&
+			    counter->weak_group) {
+				counter = perf_evsel__reset_weak_group(counter);
+				goto try_again;
+			}
+
 			/*
 			 * PPC returns ENXIO for HW counters until 2.6.37
 			 * (behavior changed with commit b0a873e).
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index dd2c4b5..db65878 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -137,6 +137,7 @@ struct perf_evsel {
 	const char *		metric_name;
 	struct perf_evsel	**metric_events;
 	bool			collect_stat;
+	bool			weak_group;
 };
 
 union u64_swap {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index f6257fb..57d7acf 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1366,6 +1366,7 @@ struct event_modifier {
 	int exclude_GH;
 	int sample_read;
 	int pinned;
+	int weak;
 };
 
 static int get_event_modifier(struct event_modifier *mod, char *str,
@@ -1384,6 +1385,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 
 	int exclude = eu | ek | eh;
 	int exclude_GH = evsel ? evsel->exclude_GH : 0;
+	int weak = 0;
 
 	memset(mod, 0, sizeof(*mod));
 
@@ -1421,6 +1423,8 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 			sample_read = 1;
 		} else if (*str == 'D') {
 			pinned = 1;
+		} else if (*str == 'W') {
+			weak = 1;
 		} else
 			break;
 
@@ -1451,6 +1455,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 	mod->exclude_GH = exclude_GH;
 	mod->sample_read = sample_read;
 	mod->pinned = pinned;
+	mod->weak = weak;
 
 	return 0;
 }
@@ -1464,7 +1469,7 @@ static int check_modifier(char *str)
 	char *p = str;
 
 	/* The sizeof includes 0 byte as well. */
-	if (strlen(str) > (sizeof("ukhGHpppPSDI") - 1))
+	if (strlen(str) > (sizeof("ukhGHpppPSDIW") - 1))
 		return -1;
 
 	while (*p) {
@@ -1504,6 +1509,7 @@ int parse_events__modifier_event(struct list_head *list, char *str, bool add)
 		evsel->exclude_GH          = mod.exclude_GH;
 		evsel->sample_read         = mod.sample_read;
 		evsel->precise_max         = mod.precise_max;
+		evsel->weak_group	   = mod.weak;
 
 		if (perf_evsel__is_group_leader(evsel))
 			evsel->attr.pinned = mod.pinned;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index c42edea..fdb5bb5 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -161,7 +161,7 @@ name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
 name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
 drv_cfg_term	[a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
 /* If you add a modifier you need to update check_modifier() */
-modifier_event	[ukhpPGHSDI]+
+modifier_event	[ukhpPGHSDIW]+
 modifier_bp	[rwx]{1,3}
 
 %%

* [tip:perf/core] perf vendor events: Support metric_group and no event name in JSON parser
  2017-08-31 19:40 ` [PATCH v3 02/11] perf, tools: Support metric_group and no event name in json parser Andi Kleen
@ 2017-09-22 16:29   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:29 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: mingo, ak, linux-kernel, tglx, hpa, acme, jolsa

Commit-ID:  3ba36d3620d08be31f5ee9ae20abb9bf3bdeb05a
Gitweb:     http://git.kernel.org/tip/3ba36d3620d08be31f5ee9ae20abb9bf3bdeb05a
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:27 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:12 -0300

perf vendor events: Support metric_group and no event name in JSON parser

Some enhancements to the JSON parser to prepare for metrics support:

- Parse the new MetricGroup field.
- Support JSON entries that have no event name, only a MetricName.
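
As a rough illustration of the second point, the sketch below formats a
table entry and simply skips any field whose value is NULL, so a
metric-only entry carries no .name or .event line. It is not the jevents
code: format_entry() and emit_field() are hypothetical helpers that write
into a caller-supplied buffer instead of a FILE, and the field values used
with them are only examples.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append "\t.key = \"val\",\n" to buf, or nothing at all when val is NULL. */
static int emit_field(char *buf, size_t len, size_t *off,
		      const char *key, const char *val)
{
	int n;

	assert(buf && off && key);
	if (!val)
		return 0;
	n = snprintf(buf + *off, len - *off, "\t.%s = \"%s\",\n", key, val);
	if (n < 0 || (size_t)n >= len - *off)
		return -1;	/* output would be truncated */
	*off += (size_t)n;
	return 0;
}

/* Hypothetical helper: returns the number of bytes written, or -1 on overflow. */
int format_entry(char *buf, size_t len, const char *name, const char *event,
		 const char *metric_name, const char *metric_group)
{
	size_t off = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';
	if (emit_field(buf, len, &off, "name", name) ||
	    emit_field(buf, len, &off, "event", event) ||
	    emit_field(buf, len, &off, "metric_name", metric_name) ||
	    emit_field(buf, len, &off, "metric_group", metric_group))
		return -1;
	return (int)off;
}
```

Calling format_entry() with NULL for both name and event yields only the
metric_name and metric_group lines, which is the shape a standalone metric
is expected to take in the generated table.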

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/pmu-events/jevents.c    | 24 ++++++++++++++++++------
 tools/perf/pmu-events/jevents.h    |  2 +-
 tools/perf/pmu-events/pmu-events.h |  1 +
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index d51dc9c..9eb7047 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -292,7 +292,7 @@ static int print_events_table_entry(void *data, char *name, char *event,
 				    char *desc, char *long_desc,
 				    char *pmu, char *unit, char *perpkg,
 				    char *metric_expr,
-				    char *metric_name)
+				    char *metric_name, char *metric_group)
 {
 	struct perf_entry_data *pd = data;
 	FILE *outfp = pd->outfp;
@@ -304,8 +304,10 @@ static int print_events_table_entry(void *data, char *name, char *event,
 	 */
 	fprintf(outfp, "{\n");
 
-	fprintf(outfp, "\t.name = \"%s\",\n", name);
-	fprintf(outfp, "\t.event = \"%s\",\n", event);
+	if (name)
+		fprintf(outfp, "\t.name = \"%s\",\n", name);
+	if (event)
+		fprintf(outfp, "\t.event = \"%s\",\n", event);
 	fprintf(outfp, "\t.desc = \"%s\",\n", desc);
 	fprintf(outfp, "\t.topic = \"%s\",\n", topic);
 	if (long_desc && long_desc[0])
@@ -320,6 +322,8 @@ static int print_events_table_entry(void *data, char *name, char *event,
 		fprintf(outfp, "\t.metric_expr = \"%s\",\n", metric_expr);
 	if (metric_name)
 		fprintf(outfp, "\t.metric_name = \"%s\",\n", metric_name);
+	if (metric_group)
+		fprintf(outfp, "\t.metric_group = \"%s\",\n", metric_group);
 	fprintf(outfp, "},\n");
 
 	return 0;
@@ -357,6 +361,9 @@ static char *real_event(const char *name, char *event)
 {
 	int i;
 
+	if (!name)
+		return NULL;
+
 	for (i = 0; fixed[i].name; i++)
 		if (!strcasecmp(name, fixed[i].name))
 			return (char *)fixed[i].event;
@@ -369,7 +376,7 @@ int json_events(const char *fn,
 		      char *long_desc,
 		      char *pmu, char *unit, char *perpkg,
 		      char *metric_expr,
-		      char *metric_name),
+		      char *metric_name, char *metric_group),
 	  void *data)
 {
 	int err = -EIO;
@@ -397,6 +404,7 @@ int json_events(const char *fn,
 		char *unit = NULL;
 		char *metric_expr = NULL;
 		char *metric_name = NULL;
+		char *metric_group = NULL;
 		unsigned long long eventcode = 0;
 		struct msrmap *msr = NULL;
 		jsmntok_t *msrval = NULL;
@@ -476,6 +484,8 @@ int json_events(const char *fn,
 				addfield(map, &perpkg, "", "", val);
 			} else if (json_streq(map, field, "MetricName")) {
 				addfield(map, &metric_name, "", "", val);
+			} else if (json_streq(map, field, "MetricGroup")) {
+				addfield(map, &metric_group, "", "", val);
 			} else if (json_streq(map, field, "MetricExpr")) {
 				addfield(map, &metric_expr, "", "", val);
 				for (s = metric_expr; *s; s++)
@@ -501,10 +511,11 @@ int json_events(const char *fn,
 			addfield(map, &event, ",", filter, NULL);
 		if (msr != NULL)
 			addfield(map, &event, ",", msr->pname, msrval);
-		fixname(name);
+		if (name)
+			fixname(name);
 
 		err = func(data, name, real_event(name, event), desc, long_desc,
-				pmu, unit, perpkg, metric_expr, metric_name);
+			   pmu, unit, perpkg, metric_expr, metric_name, metric_group);
 		free(event);
 		free(desc);
 		free(name);
@@ -516,6 +527,7 @@ int json_events(const char *fn,
 		free(unit);
 		free(metric_expr);
 		free(metric_name);
+		free(metric_group);
 		if (err)
 			break;
 		tok += j;
diff --git a/tools/perf/pmu-events/jevents.h b/tools/perf/pmu-events/jevents.h
index 611fac0..5579947 100644
--- a/tools/perf/pmu-events/jevents.h
+++ b/tools/perf/pmu-events/jevents.h
@@ -6,7 +6,7 @@ int json_events(const char *fn,
 				char *long_desc,
 				char *pmu,
 				char *unit, char *perpkg, char *metric_expr,
-				char *metric_name),
+				char *metric_name, char *metric_group),
 		void *data);
 char *get_cpu_str(void);
 
diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h
index 569eab3..94fa172 100644
--- a/tools/perf/pmu-events/pmu-events.h
+++ b/tools/perf/pmu-events/pmu-events.h
@@ -15,6 +15,7 @@ struct pmu_event {
 	const char *perpkg;
 	const char *metric_expr;
 	const char *metric_name;
+	const char *metric_group;
 };
 
 /*

* [tip:perf/core] perf stat: Factor out generic metric printing
  2017-08-31 19:40 ` [PATCH v3 03/11] perf, tools, stat: Factor out generic metric printing Andi Kleen
@ 2017-09-22 16:29   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:29 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: acme, ak, mingo, tglx, hpa, linux-kernel, jolsa

Commit-ID:  bba49af87393ebc8960bf8abdcbb9af53bf1aba1
Gitweb:     http://git.kernel.org/tip/bba49af87393ebc8960bf8abdcbb9af53bf1aba1
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:28 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:12 -0300

perf stat: Factor out generic metric printing

The 'perf stat' shadow metric printing already supports generic metrics.
Factor out the code doing that into a separate function that can be
re-used in a later patch.

No behavior changes.

v2: Fix indentation

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 69 ++++++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index a04cf56..96aa6cb 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -627,6 +627,46 @@ static void print_smi_cost(int cpu, struct perf_evsel *evsel,
 	out->print_metric(out->ctx, NULL, "%4.0f", "SMI#", smi_num);
 }
 
+static void generic_metric(const char *metric_expr,
+			   struct perf_evsel **metric_events,
+			   char *name,
+			   const char *metric_name,
+			   double avg,
+			   int cpu,
+			   int ctx,
+			   struct perf_stat_output_ctx *out)
+{
+	print_metric_t print_metric = out->print_metric;
+	struct parse_ctx pctx;
+	double ratio;
+	int i;
+	void *ctxp = out->ctx;
+
+	expr__ctx_init(&pctx);
+	expr__add_id(&pctx, name, avg);
+	for (i = 0; metric_events[i]; i++) {
+		struct saved_value *v;
+
+		v = saved_value_lookup(metric_events[i], cpu, ctx, false);
+		if (!v)
+			break;
+		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
+	}
+	if (!metric_events[i]) {
+		const char *p = metric_expr;
+
+		if (expr__parse(&ratio, &pctx, &p) == 0)
+			print_metric(ctxp, NULL, "%8.1f",
+				metric_name ?
+				metric_name :
+				out->force_header ?  name : "",
+				ratio);
+		else
+			print_metric(ctxp, NULL, NULL, "", 0);
+	} else
+		print_metric(ctxp, NULL, NULL, "", 0);
+}
+
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
 				   struct perf_stat_output_ctx *out)
@@ -819,33 +859,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		else
 			print_metric(ctxp, NULL, NULL, name, 0);
 	} else if (evsel->metric_expr) {
-		struct parse_ctx pctx;
-		int i;
-
-		expr__ctx_init(&pctx);
-		expr__add_id(&pctx, evsel->name, avg);
-		for (i = 0; evsel->metric_events[i]; i++) {
-			struct saved_value *v;
-
-			v = saved_value_lookup(evsel->metric_events[i], cpu, ctx, false);
-			if (!v)
-				break;
-			expr__add_id(&pctx, evsel->metric_events[i]->name,
-					     avg_stats(&v->stats));
-		}
-		if (!evsel->metric_events[i]) {
-			const char *p = evsel->metric_expr;
-
-			if (expr__parse(&ratio, &pctx, &p) == 0)
-				print_metric(ctxp, NULL, "%8.1f",
-					evsel->metric_name ?
-					evsel->metric_name :
-					out->force_header ?  evsel->name : "",
-					ratio);
-			else
-				print_metric(ctxp, NULL, NULL, "", 0);
-		} else
-			print_metric(ctxp, NULL, NULL, "", 0);
+		generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
+				evsel->metric_name, avg, cpu, ctx, out);
 	} else if (runtime_nsecs_stats[cpu].n != 0) {
 		char unit = 'M';
 		char unit_buf[10];

* [tip:perf/core] perf stat: Print generic metric header even for failed expressions
  2017-08-31 19:40 ` [PATCH v3 04/11] perf, tools: Print generic metric header even for failed expressions Andi Kleen
@ 2017-09-22 16:30   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:30 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: jolsa, ak, mingo, hpa, linux-kernel, tglx, acme

Commit-ID:  4ed962eb38c8a33b8b6ded911410afaefa1ca48c
Gitweb:     http://git.kernel.org/tip/4ed962eb38c8a33b8b6ded911410afaefa1ca48c
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:29 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:13 -0300

perf stat: Print generic metric header even for failed expressions

Print the generic metric header even when the expression evaluation
failed. Otherwise an expression that fails during the first collections
due to division by zero may suddenly reappear later without a header.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 96aa6cb..8c7ab29 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -662,7 +662,9 @@ static void generic_metric(const char *metric_expr,
 				out->force_header ?  name : "",
 				ratio);
 		else
-			print_metric(ctxp, NULL, NULL, "", 0);
+			print_metric(ctxp, NULL, NULL,
+				     out->force_header ?
+				     (metric_name ? metric_name : name) : "", 0);
 	} else
 		print_metric(ctxp, NULL, NULL, "", 0);
 }

* [tip:perf/core] perf pmu: Extract function to get JSON alias map
  2017-08-31 19:40 ` [PATCH v3 05/11] perf, tools: Extract function to get json alias map Andi Kleen
@ 2017-09-22 16:30   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:30 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: acme, tglx, mingo, ak, jolsa, linux-kernel, hpa

Commit-ID:  d77ade9f4199c77c63e2ae382a8c8fbe0582ede2
Gitweb:     http://git.kernel.org/tip/d77ade9f4199c77c63e2ae382a8c8fbe0582ede2
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:30 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:13 -0300

perf pmu: Extract function to get JSON alias map

Extract the code to get the per-CPU JSON alias map into a separate
function for reuse. No behavior changes.
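
The lookup being factored out is a linear scan over an array whose last
entry has a NULL table pointer. A minimal sketch of that pattern follows.
struct demo_map and demo_find_map() are hypothetical stand-ins, not the
perf structures, and the early return for a missing CPUID string is a
choice made in this sketch rather than something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an entry of the per-CPU events map. */
struct demo_map {
	const char *cpuid;
	const char *table;	/* NULL marks the end of the array */
};

/* Return the entry whose cpuid matches, or NULL if there is none. */
const struct demo_map *demo_find_map(const struct demo_map *maps,
				     const char *cpuid)
{
	assert(maps);
	if (!cpuid)
		return NULL;	/* no CPUID string: nothing can match */
	for (; maps->table; maps++)
		if (!strcmp(maps->cpuid, cpuid))
			return maps;
	return NULL;
}
```

Returning NULL for "not found" lets each caller decide what to do, which
is what makes a helper of this shape reusable outside the alias code.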

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 49 +++++++++++++++++++++++++++++++++----------------
 tools/perf/util/pmu.h |  2 ++
 2 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ac16a9d..ed25d7f 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -516,16 +516,8 @@ char * __weak get_cpuid_str(void)
 	return NULL;
 }
 
-/*
- * From the pmu_events_map, find the table of PMU events that corresponds
- * to the current running CPU. Then, add all PMU events from that table
- * as aliases.
- */
-static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+static char *perf_pmu__getcpuid(void)
 {
-	int i;
-	struct pmu_events_map *map;
-	struct pmu_event *pe;
 	char *cpuid;
 	static bool printed;
 
@@ -535,22 +527,50 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 	if (!cpuid)
 		cpuid = get_cpuid_str();
 	if (!cpuid)
-		return;
+		return NULL;
 
 	if (!printed) {
 		pr_debug("Using CPUID %s\n", cpuid);
 		printed = true;
 	}
+	return cpuid;
+}
+
+struct pmu_events_map *perf_pmu__find_map(void)
+{
+	struct pmu_events_map *map;
+	char *cpuid = perf_pmu__getcpuid();
+	int i;
 
 	i = 0;
-	while (1) {
+	for (;;) {
 		map = &pmu_events_map[i++];
-		if (!map->table)
-			goto out;
+		if (!map->table) {
+			map = NULL;
+			break;
+		}
 
 		if (!strcmp(map->cpuid, cpuid))
 			break;
 	}
+	free(cpuid);
+	return map;
+}
+
+/*
+ * From the pmu_events_map, find the table of PMU events that corresponds
+ * to the current running CPU. Then, add all PMU events from that table
+ * as aliases.
+ */
+static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+{
+	int i;
+	struct pmu_events_map *map;
+	struct pmu_event *pe;
+
+	map = perf_pmu__find_map();
+	if (!map)
+		return;
 
 	/*
 	 * Found a matching PMU events table. Create aliases
@@ -575,9 +595,6 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 				(char *)pe->metric_expr,
 				(char *)pe->metric_name);
 	}
-
-out:
-	free(cpuid);
 }
 
 struct perf_event_attr * __weak
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 389e972..060f6ab 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -90,4 +90,6 @@ int perf_pmu__test(void);
 
 struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu);
 
+struct pmu_events_map *perf_pmu__find_map(void);
+
 #endif /* __PMU_H */

* [tip:perf/core] perf stat: Support JSON metrics in perf stat
  2017-08-31 19:40 ` [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat Andi Kleen
  2017-09-04 17:11   ` Arnaldo Carvalho de Melo
@ 2017-09-22 16:31   ` tip-bot for Andi Kleen
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:31 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: acme, ak, linux-kernel, jolsa, hpa, tglx, mingo

Commit-ID:  b18f3e365019de1a5b26a851e123f0aedcce881f
Gitweb:     http://git.kernel.org/tip/b18f3e365019de1a5b26a851e123f0aedcce881f
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:31 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:13 -0300

perf stat: Support JSON metrics in perf stat

Add generic support for standalone metrics specified in JSON files to
perf stat. A metric is a formula that uses multiple events to compute a
higher level result (e.g. IPC).

Previously metrics were always tied to an event and automatically
enabled with that event. Change this so that metrics can also be
standalone. They live in the same JSON data structure as events, but
have no event name, only a metric name.

Metrics can also be organized into metric groups, which provide a
shortcut for selecting several related metrics at once.

Add a new -M / --metrics option to perf stat that adds the metrics or
metric groups specified.
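
To illustrate how a name given to -M can select a group, here is a
simplified sketch of matching one name against a ';'-separated group
list, ignoring case and requiring a whole-entry match. demo_match_group()
is a hypothetical helper. The matcher in the patch additionally accepts
"all", "No_group" and a leading space, none of which is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/*
 * Return true if `want` names one whole entry of the ';'-separated
 * list `groups`, compared case-insensitively.
 */
bool demo_match_group(const char *groups, const char *want)
{
	size_t len;
	const char *p;

	assert(want);
	if (!groups)
		return false;
	len = strlen(want);
	if (len == 0)
		return false;
	for (p = groups; *p; ) {
		const char *end = strchr(p, ';');
		size_t n = end ? (size_t)(end - p) : strlen(p);

		/* Only a full entry counts, so "Front" does not match "Frontend". */
		if (n == len && !strncasecmp(p, want, len))
			return true;
		if (!end)
			break;
		p = end + 1;
	}
	return false;
}
```

Requiring a whole-entry match keeps a short name from accidentally
selecting every group that merely contains it as a substring.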

Add the core code to manage and parse the metric groups. They are
collected from the JSON data structures into a separate rblist. When
computing shadow values, look for metrics in that list. They are then
computed using the existing saved values infrastructure in stat-shadow.c.
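
Each selected metric contributes its events as one weak group in the
event string handed to the event parser. A minimal sketch of building
such a "{a,b}:W" string into a fixed buffer is shown below.
demo_build_group() is a hypothetical helper (the patch itself uses a
strbuf), and the event names used with it are only examples.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "{id0,id1,...}:W" into buf. Returns 0 on success, -1 on error. */
int demo_build_group(char *buf, size_t len, const char **ids, int idnum)
{
	size_t off = 0;
	int i, n;

	assert(buf && ids);
	if (idnum <= 0 || len == 0)
		return -1;
	for (i = 0; i < idnum; i++) {
		n = snprintf(buf + off, len - off, "%s%s",
			     i == 0 ? "{" : ",", ids[i]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;	/* would not fit */
		off += (size_t)n;
	}
	/* The W modifier marks the whole group as weak. */
	n = snprintf(buf + off, len - off, "}:W");
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	return 0;
}
```

Marking the group weak means that if its events cannot be scheduled
together, the group is broken up and the events are opened individually
instead of the whole measurement failing.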

The actual JSON metrics are in a separate pull request.

  % perf stat -M Summary --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  Instructions   CLKS          CPU_Utilization  GFLOPs   SMT_2T_Utilization   Kernel_Utilization
  317614222.0    1392930775.0  0.0              0.0      0.2                  0.1

       1.001497549 seconds time elapsed

  % perf stat -M GFLOPs flops

   Performance counter stats for 'flops':

     3,999,541,471  fp_comp_ops_exe.sse_scalar_single #  1.2 GFLOPs   (66.65%)
                14  fp_comp_ops_exe.sse_scalar_double                 (66.65%)
                 0  fp_comp_ops_exe.sse_packed_double                 (66.67%)
                 0  fp_comp_ops_exe.sse_packed_single                 (66.70%)
                 0  simd_fp_256.packed_double                         (66.70%)
                 0  simd_fp_256.packed_single                         (66.67%)
                 0  duration_time

       3.238372845 seconds time elapsed

v2: Add missing header file
v3: Move find_map to pmu.c

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-stat.txt |   7 +
 tools/perf/builtin-stat.c              |  18 +-
 tools/perf/util/Build                  |   1 +
 tools/perf/util/metricgroup.c          | 313 +++++++++++++++++++++++++++++++++
 tools/perf/util/metricgroup.h          |  31 ++++
 tools/perf/util/pmu.c                  |   5 +-
 tools/perf/util/stat-shadow.c          |  22 ++-
 tools/perf/util/stat.h                 |   4 +-
 8 files changed, 395 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index c37d616..823fce7 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -199,6 +199,13 @@ Aggregate counts per processor socket for system-wide mode measurements.
 --per-core::
 Aggregate counts per physical processor for system-wide mode measurements.
 
+-M::
+--metrics::
+Print metrics or metricgroups specified in a comma separated list.
+For a group all metrics from the group are added.
+The events from the metrics are automatically measured.
+See perf list output for the possible metrics and metricgroups.
+
 -A::
 --no-aggr::
 Do not aggregate counts across all monitored CPUs.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7cc61eb..874bc6d 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -65,6 +65,7 @@
 #include "util/tool.h"
 #include "util/group.h"
 #include "util/string2.h"
+#include "util/metricgroup.h"
 #include "asm/bug.h"
 
 #include <linux/time64.h>
@@ -133,6 +134,8 @@ static const char *smi_cost_attrs = {
 
 static struct perf_evlist	*evsel_list;
 
+static struct rblist		 metric_events;
+
 static struct target target = {
 	.uid	= UINT_MAX,
 };
@@ -1234,7 +1237,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 
 	perf_stat__print_shadow_stats(counter, uval,
 				first_shadow_cpu(counter, id),
-				&out);
+				&out, &metric_events);
 	if (!csv_output && !metric_only) {
 		print_noise(counter, noise);
 		print_running(run, ena);
@@ -1565,7 +1568,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
 		os.evsel = counter;
 		perf_stat__print_shadow_stats(counter, 0,
 					      0,
-					      &out);
+					      &out,
+					      &metric_events);
 	}
 	fputc('\n', stat_config.output);
 }
@@ -1789,6 +1793,13 @@ static int enable_metric_only(const struct option *opt __maybe_unused,
 	return 0;
 }
 
+static int parse_metric_groups(const struct option *opt,
+			       const char *str,
+			       int unset __maybe_unused)
+{
+	return metricgroup__parse_groups(opt, str, &metric_events);
+}
+
 static const struct option stat_options[] = {
 	OPT_BOOLEAN('T', "transaction", &transaction_run,
 		    "hardware transaction statistics"),
@@ -1854,6 +1865,9 @@ static const struct option stat_options[] = {
 			"measure topdown level 1 statistics"),
 	OPT_BOOLEAN(0, "smi-cost", &smi_cost,
 			"measure SMI cost"),
+	OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
+		     "monitor specified metrics or metric groups (separated by ,)",
+		     parse_metric_groups),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 94518c1..71ab846 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -34,6 +34,7 @@ libperf-y += dso.o
 libperf-y += symbol.o
 libperf-y += symbol_fprintf.o
 libperf-y += color.o
+libperf-y += metricgroup.o
 libperf-y += header.o
 libperf-y += callchain.o
 libperf-y += values.o
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
new file mode 100644
index 0000000..7516b17
--- /dev/null
+++ b/tools/perf/util/metricgroup.c
@@ -0,0 +1,313 @@
+/*
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+/* Manage metrics and groups of metrics from JSON files */
+
+#include "metricgroup.h"
+#include "evlist.h"
+#include "strbuf.h"
+#include "pmu.h"
+#include "expr.h"
+#include "rblist.h"
+#include "pmu.h"
+#include <string.h>
+#include <stdbool.h>
+#include <errno.h>
+#include "pmu-events/pmu-events.h"
+#include "strbuf.h"
+#include "strlist.h"
+#include <assert.h>
+#include <ctype.h>
+
+struct metric_event *metricgroup__lookup(struct rblist *metric_events,
+					 struct perf_evsel *evsel,
+					 bool create)
+{
+	struct rb_node *nd;
+	struct metric_event me = {
+		.evsel = evsel
+	};
+	nd = rblist__find(metric_events, &me);
+	if (nd)
+		return container_of(nd, struct metric_event, nd);
+	if (create) {
+		rblist__add_node(metric_events, &me);
+		nd = rblist__find(metric_events, &me);
+		if (nd)
+			return container_of(nd, struct metric_event, nd);
+	}
+	return NULL;
+}
+
+static int metric_event_cmp(struct rb_node *rb_node, const void *entry)
+{
+	struct metric_event *a = container_of(rb_node,
+					      struct metric_event,
+					      nd);
+	const struct metric_event *b = entry;
+
+	if (a->evsel == b->evsel)
+		return 0;
+	if ((char *)a->evsel < (char *)b->evsel)
+		return -1;
+	return +1;
+}
+
+static struct rb_node *metric_event_new(struct rblist *rblist __maybe_unused,
+					const void *entry)
+{
+	struct metric_event *me = malloc(sizeof(struct metric_event));
+
+	if (!me)
+		return NULL;
+	memcpy(me, entry, sizeof(struct metric_event));
+	me->evsel = ((struct metric_event *)entry)->evsel;
+	INIT_LIST_HEAD(&me->head);
+	return &me->nd;
+}
+
+static void metricgroup__rblist_init(struct rblist *metric_events)
+{
+	rblist__init(metric_events);
+	metric_events->node_cmp = metric_event_cmp;
+	metric_events->node_new = metric_event_new;
+}
+
+struct egroup {
+	struct list_head nd;
+	int idnum;
+	const char **ids;
+	const char *metric_name;
+	const char *metric_expr;
+};
+
+static struct perf_evsel *find_evsel(struct perf_evlist *perf_evlist,
+				     const char **ids,
+				     int idnum,
+				     struct perf_evsel **metric_events)
+{
+	struct perf_evsel *ev, *start = NULL;
+	int ind = 0;
+
+	evlist__for_each_entry (perf_evlist, ev) {
+		if (!strcmp(ev->name, ids[ind])) {
+			metric_events[ind] = ev;
+			if (ind == 0)
+				start = ev;
+			if (++ind == idnum) {
+				metric_events[ind] = NULL;
+				return start;
+			}
+		} else {
+			ind = 0;
+			start = NULL;
+		}
+	}
+	/*
+	 * This can happen when an alias expands to multiple
+	 * events, like for uncore events.
+	 * We don't support this case for now.
+	 */
+	return NULL;
+}
+
+static int metricgroup__setup_events(struct list_head *groups,
+				     struct perf_evlist *perf_evlist,
+				     struct rblist *metric_events_list)
+{
+	struct metric_event *me;
+	struct metric_expr *expr;
+	int i = 0;
+	int ret = 0;
+	struct egroup *eg;
+	struct perf_evsel *evsel;
+
+	list_for_each_entry (eg, groups, nd) {
+		struct perf_evsel **metric_events;
+
+		metric_events = calloc(sizeof(void *), eg->idnum + 1);
+		if (!metric_events) {
+			ret = -ENOMEM;
+			break;
+		}
+		evsel = find_evsel(perf_evlist, eg->ids, eg->idnum,
+				   metric_events);
+		if (!evsel) {
+			pr_debug("Cannot resolve %s: %s\n",
+					eg->metric_name, eg->metric_expr);
+			continue;
+		}
+		for (i = 0; i < eg->idnum; i++)
+			metric_events[i]->collect_stat = true;
+		me = metricgroup__lookup(metric_events_list, evsel, true);
+		if (!me) {
+			ret = -ENOMEM;
+			break;
+		}
+		expr = malloc(sizeof(struct metric_expr));
+		if (!expr) {
+			ret = -ENOMEM;
+			break;
+		}
+		expr->metric_expr = eg->metric_expr;
+		expr->metric_name = eg->metric_name;
+		expr->metric_events = metric_events;
+		list_add(&expr->nd, &me->head);
+	}
+	return ret;
+}
+
+static bool match_metric(const char *n, const char *list)
+{
+	int len;
+	char *m;
+
+	if (!list)
+		return false;
+	if (!strcmp(list, "all"))
+		return true;
+	if (!n)
+		return !strcasecmp(list, "No_group");
+	len = strlen(list);
+	m = strcasestr(n, list);
+	if (!m)
+		return false;
+	if ((m == n || m[-1] == ';' || m[-1] == ' ') &&
+	    (m[len] == 0 || m[len] == ';'))
+		return true;
+	return false;
+}
+
+static int metricgroup__add_metric(const char *metric, struct strbuf *events,
+				   struct list_head *group_list)
+{
+	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_event *pe;
+	int ret = -EINVAL;
+	int i, j;
+
+	strbuf_init(events, 100);
+	strbuf_addf(events, "%s", "");
+
+	if (!map)
+		return 0;
+
+	for (i = 0; ; i++) {
+		pe = &map->table[i];
+
+		if (!pe->name && !pe->metric_group && !pe->metric_name)
+			break;
+		if (!pe->metric_expr)
+			continue;
+		if (match_metric(pe->metric_group, metric) ||
+		    match_metric(pe->metric_name, metric)) {
+			const char **ids;
+			int idnum;
+			struct egroup *eg;
+
+			pr_debug("metric expr %s for %s\n", pe->metric_expr, pe->metric_name);
+
+			if (expr__find_other(pe->metric_expr,
+					     NULL, &ids, &idnum) < 0)
+				continue;
+			if (events->len > 0)
+				strbuf_addf(events, ",");
+			for (j = 0; j < idnum; j++) {
+				pr_debug("found event %s\n", ids[j]);
+				strbuf_addf(events, "%s%s",
+					j == 0 ? "{" : ",",
+					ids[j]);
+			}
+			strbuf_addf(events, "}:W");
+
+			eg = malloc(sizeof(struct egroup));
+			if (!eg) {
+				ret = -ENOMEM;
+				break;
+			}
+			eg->ids = ids;
+			eg->idnum = idnum;
+			eg->metric_name = pe->metric_name;
+			eg->metric_expr = pe->metric_expr;
+			list_add_tail(&eg->nd, group_list);
+			ret = 0;
+		}
+	}
+	return ret;
+}
+
+static int metricgroup__add_metric_list(const char *list, struct strbuf *events,
+				        struct list_head *group_list)
+{
+	char *llist, *nlist, *p;
+	int ret = -EINVAL;
+
+	nlist = strdup(list);
+	if (!nlist)
+		return -ENOMEM;
+	llist = nlist;
+	while ((p = strsep(&llist, ",")) != NULL) {
+		ret = metricgroup__add_metric(p, events, group_list);
+		if (ret == -EINVAL) {
+			fprintf(stderr, "Cannot find metric or group `%s'\n",
+					p);
+			break;
+		}
+	}
+	free(nlist);
+	return ret;
+}
+
+static void metricgroup__free_egroups(struct list_head *group_list)
+{
+	struct egroup *eg, *egtmp;
+	int i;
+
+	list_for_each_entry_safe (eg, egtmp, group_list, nd) {
+		for (i = 0; i < eg->idnum; i++)
+			free((char *)eg->ids[i]);
+		free(eg->ids);
+		free(eg);
+	}
+}
+
+int metricgroup__parse_groups(const struct option *opt,
+			   const char *str,
+			   struct rblist *metric_events)
+{
+	struct parse_events_error parse_error;
+	struct perf_evlist *perf_evlist = *(struct perf_evlist **)opt->value;
+	struct strbuf extra_events;
+	LIST_HEAD(group_list);
+	int ret;
+
+	if (metric_events->nr_entries == 0)
+		metricgroup__rblist_init(metric_events);
+	ret = metricgroup__add_metric_list(str, &extra_events, &group_list);
+	if (ret)
+		return ret;
+	pr_debug("adding %s\n", extra_events.buf);
+	memset(&parse_error, 0, sizeof(struct parse_events_error));
+	ret = parse_events(perf_evlist, extra_events.buf, &parse_error);
+	if (ret) {
+		pr_err("Cannot set up events %s\n", extra_events.buf);
+		goto out;
+	}
+	strbuf_release(&extra_events);
+	ret = metricgroup__setup_events(&group_list, perf_evlist,
+					metric_events);
+out:
+	metricgroup__free_egroups(&group_list);
+	return ret;
+}
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
new file mode 100644
index 0000000..06854e1
--- /dev/null
+++ b/tools/perf/util/metricgroup.h
@@ -0,0 +1,31 @@
+#ifndef METRICGROUP_H
+#define METRICGROUP_H 1
+
+#include "linux/list.h"
+#include "rblist.h"
+#include <subcmd/parse-options.h>
+#include "evlist.h"
+#include "strbuf.h"
+
+struct metric_event {
+	struct rb_node nd;
+	struct perf_evsel *evsel;
+	struct list_head head; /* list of metric_expr */
+};
+
+struct metric_expr {
+	struct list_head nd;
+	const char *metric_expr;
+	const char *metric_name;
+	struct perf_evsel **metric_events;
+};
+
+struct metric_event *metricgroup__lookup(struct rblist *metric_events,
+					 struct perf_evsel *evsel,
+					 bool create);
+int metricgroup__parse_groups(const struct option *opt,
+			const char *str,
+			struct rblist *metric_events);
+
+void metricgroup__print(bool metrics, bool groups, char *filter, bool raw);
+#endif
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ed25d7f..7070638 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -580,8 +580,11 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 		const char *pname;
 
 		pe = &map->table[i++];
-		if (!pe->name)
+		if (!pe->name) {
+			if (pe->metric_group || pe->metric_name)
+				continue;
 			break;
+		}
 
 		pname = pe->pmu ? pe->pmu : "cpu";
 		if (strncmp(pname, name, strlen(pname)))
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 8c7ab29..42e6c17 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -6,6 +6,7 @@
 #include "rblist.h"
 #include "evlist.h"
 #include "expr.h"
+#include "metricgroup.h"
 
 enum {
 	CTX_BIT_USER	= 1 << 0,
@@ -671,13 +672,16 @@ static void generic_metric(const char *metric_expr,
 
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
-				   struct perf_stat_output_ctx *out)
+				   struct perf_stat_output_ctx *out,
+				   struct rblist *metric_events)
 {
 	void *ctxp = out->ctx;
 	print_metric_t print_metric = out->print_metric;
 	double total, ratio = 0.0, total2;
 	const char *color = NULL;
 	int ctx = evsel_context(evsel);
+	struct metric_event *me;
+	int num = 1;
 
 	if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
 		total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
@@ -880,6 +884,20 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 	} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
 		print_smi_cost(cpu, evsel, out);
 	} else {
-		print_metric(ctxp, NULL, NULL, NULL, 0);
+		num = 0;
 	}
+
+	if ((me = metricgroup__lookup(metric_events, evsel, false)) != NULL) {
+		struct metric_expr *mexp;
+
+		list_for_each_entry (mexp, &me->head, nd) {
+			if (num++ > 0)
+				out->new_line(ctxp);
+			generic_metric(mexp->metric_expr, mexp->metric_events,
+					evsel->name, mexp->metric_name,
+					avg, cpu, ctx, out);
+		}
+	}
+	if (num == 0)
+		print_metric(ctxp, NULL, NULL, NULL, 0);
 }
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index eacaf95..47915df 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -91,9 +91,11 @@ struct perf_stat_output_ctx {
 	bool force_header;
 };
 
+struct rblist;
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
-				   struct perf_stat_output_ctx *out);
+				   struct perf_stat_output_ctx *out,
+				   struct rblist *metric_events);
 void perf_stat__collect_metric_expr(struct perf_evlist *);
 
 int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw);

* [tip:perf/core] perf list: Add metric groups to perf list
  2017-08-31 19:40 ` [PATCH v3 07/11] perf, tools, list: Add metric groups to perf list Andi Kleen
@ 2017-09-22 16:31   ` tip-bot for Andi Kleen
  2017-10-13 14:50   ` [PATCH v3 07/11] perf, tools, " Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:31 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: jolsa, linux-kernel, mingo, ak, tglx, acme, hpa

Commit-ID:  71b0acce78d12e99eeda6fd6642ba89cc2b2b49c
Gitweb:     http://git.kernel.org/tip/71b0acce78d12e99eeda6fd6642ba89cc2b2b49c
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:32 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:13 -0300

perf list: Add metric groups to perf list

Add code to perf list to print metric groups, and metrics
that don't have an event name. The metricgroup code collects
the eventgroups and events into an rblist, and then prints
them according to the configured filters.

The metricgroups are printed by default, but the output can be
limited with 'perf list metric' or 'perf list metricgroup':

  % perf list metricgroup
  ..
  Metric Groups:

  DSB:
    DSB_Coverage
          [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
  FLOPS:
    GFLOPs
          [Giga Floating Point Operations Per Second]
  Frontend:
    IFetch_Line_Utilization
          [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
  Frontend_Bandwidth:
    DSB_Coverage
          [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
  Memory_BW:
    MLP
          [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]

v2: Check return value of asprintf to fix warning on FC26
Fix key in lookup/addition for the groups list
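
As a rough illustration of the grouping step described above: a metric's
MetricGroup string can name several groups separated by ';', and the metric
is listed under each of them. The sketch below is not the code from this
patch (which uses strsep() together with perf's rblist and strlist); the
helper name split_groups, the fixed-size arrays and the sample input are
assumptions made for the example. An empty or all-blank name falls back to
"No_group".

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_GROUPS 8
#define MAX_NAME   64

/*
 * Hypothetical helper: split a ';'-separated MetricGroup string into
 * individual group names. Leading whitespace of each name is skipped
 * and an empty name becomes "No_group". Returns the number of names
 * stored in out[].
 */
static int split_groups(const char *metric_group,
			char out[MAX_GROUPS][MAX_NAME])
{
	const char *p = metric_group;
	int n = 0;

	assert(metric_group != NULL);
	while (n < MAX_GROUPS) {
		const char *end = strchr(p, ';');
		size_t len;

		/* ';' is not whitespace, so p never moves past end */
		while (isspace((unsigned char)*p))
			p++;
		len = end ? (size_t)(end - p) : strlen(p);
		if (len == 0) {
			strcpy(out[n], "No_group");
		} else {
			if (len >= MAX_NAME)
				len = MAX_NAME - 1;
			memcpy(out[n], p, len);
			out[n][len] = '\0';
		}
		n++;
		if (!end)
			break;
		p = end + 1;
	}
	return n;
}
```

With a made-up input such as "DSB; Frontend_Bandwidth" this yields the two
names "DSB" and "Frontend_Bandwidth", which is how one metric can show up
under more than one group heading in the listing.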

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-8-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-list.txt |   7 +-
 tools/perf/builtin-list.c              |   7 ++
 tools/perf/util/metricgroup.c          | 176 +++++++++++++++++++++++++++++++++
 tools/perf/util/parse-events.c         |   3 +
 4 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 75fc17f..24679ae 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -8,7 +8,8 @@ perf-list - List all symbolic event types
 SYNOPSIS
 --------
 [verse]
-'perf list' [--no-desc] [--long-desc] [hw|sw|cache|tracepoint|pmu|sdt|event_glob]
+'perf list' [--no-desc] [--long-desc]
+            [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]
 
 DESCRIPTION
 -----------
@@ -248,6 +249,10 @@ To limit the list use:
 
 . 'sdt' to list all Statically Defined Tracepoint events.
 
+. 'metric' to list metrics
+
+. 'metricgroup' to list metricgroups with metrics.
+
 . If none of the above is matched, it will apply the supplied glob to all
   events, printing the ones that match.
 
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 4bf2cb4..b2d2ad3 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -15,6 +15,7 @@
 #include "util/cache.h"
 #include "util/pmu.h"
 #include "util/debug.h"
+#include "util/metricgroup.h"
 #include <subcmd/parse-options.h>
 
 static bool desc_flag = true;
@@ -79,6 +80,10 @@ int cmd_list(int argc, const char **argv)
 						long_desc_flag, details_flag);
 		else if (strcmp(argv[i], "sdt") == 0)
 			print_sdt_events(NULL, NULL, raw_dump);
+		else if (strcmp(argv[i], "metric") == 0)
+			metricgroup__print(true, false, NULL, raw_dump);
+		else if (strcmp(argv[i], "metricgroup") == 0)
+			metricgroup__print(false, true, NULL, raw_dump);
 		else if ((sep = strchr(argv[i], ':')) != NULL) {
 			int sep_idx;
 
@@ -96,6 +101,7 @@ int cmd_list(int argc, const char **argv)
 			s[sep_idx] = '\0';
 			print_tracepoint_events(s, s + sep_idx + 1, raw_dump);
 			print_sdt_events(s, s + sep_idx + 1, raw_dump);
+			metricgroup__print(true, true, s, raw_dump);
 			free(s);
 		} else {
 			if (asprintf(&s, "*%s*", argv[i]) < 0) {
@@ -112,6 +118,7 @@ int cmd_list(int argc, const char **argv)
 						details_flag);
 			print_tracepoint_events(NULL, s, raw_dump);
 			print_sdt_events(NULL, s, raw_dump);
+			metricgroup__print(true, true, NULL, raw_dump);
 			free(s);
 		}
 	}
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 7516b17..2d60114 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -189,6 +189,182 @@ static bool match_metric(const char *n, const char *list)
 	return false;
 }
 
+struct mep {
+	struct rb_node nd;
+	const char *name;
+	struct strlist *metrics;
+};
+
+static int mep_cmp(struct rb_node *rb_node, const void *entry)
+{
+	struct mep *a = container_of(rb_node, struct mep, nd);
+	struct mep *b = (struct mep *)entry;
+
+	return strcmp(a->name, b->name);
+}
+
+static struct rb_node *mep_new(struct rblist *rl __maybe_unused,
+					const void *entry)
+{
+	struct mep *me = malloc(sizeof(struct mep));
+
+	if (!me)
+		return NULL;
+	memcpy(me, entry, sizeof(struct mep));
+	me->name = strdup(me->name);
+	if (!me->name)
+		goto out_me;
+	me->metrics = strlist__new(NULL, NULL);
+	if (!me->metrics)
+		goto out_name;
+	return &me->nd;
+out_name:
+	free((char *)me->name);
+out_me:
+	free(me);
+	return NULL;
+}
+
+static struct mep *mep_lookup(struct rblist *groups, const char *name)
+{
+	struct rb_node *nd;
+	struct mep me = {
+		.name = name
+	};
+	nd = rblist__find(groups, &me);
+	if (nd)
+		return container_of(nd, struct mep, nd);
+	rblist__add_node(groups, &me);
+	nd = rblist__find(groups, &me);
+	if (nd)
+		return container_of(nd, struct mep, nd);
+	return NULL;
+}
+
+static void mep_delete(struct rblist *rl __maybe_unused,
+		       struct rb_node *nd)
+{
+	struct mep *me = container_of(nd, struct mep, nd);
+
+	strlist__delete(me->metrics);
+	free((void *)me->name);
+	free(me);
+}
+
+static void metricgroup__print_strlist(struct strlist *metrics, bool raw)
+{
+	struct str_node *sn;
+	int n = 0;
+
+	strlist__for_each_entry (sn, metrics) {
+		if (raw)
+			printf("%s%s", n > 0 ? " " : "", sn->s);
+		else
+			printf("  %s\n", sn->s);
+		n++;
+	}
+	if (raw)
+		putchar('\n');
+}
+
+void metricgroup__print(bool metrics, bool metricgroups, char *filter,
+			bool raw)
+{
+	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_event *pe;
+	int i;
+	struct rblist groups;
+	struct rb_node *node, *next;
+	struct strlist *metriclist = NULL;
+
+	if (!map)
+		return;
+
+	if (!metricgroups) {
+		metriclist = strlist__new(NULL, NULL);
+		if (!metriclist)
+			return;
+	}
+
+	rblist__init(&groups);
+	groups.node_new = mep_new;
+	groups.node_cmp = mep_cmp;
+	groups.node_delete = mep_delete;
+	for (i = 0; ; i++) {
+		const char *g;
+		pe = &map->table[i];
+
+		if (!pe->name && !pe->metric_group && !pe->metric_name)
+			break;
+		if (!pe->metric_expr)
+			continue;
+		g = pe->metric_group;
+		if (!g && pe->metric_name) {
+			if (pe->name)
+				continue;
+			g = "No_group";
+		}
+		if (g) {
+			char *omg;
+			char *mg = strdup(g);
+
+			if (!mg)
+				return;
+			omg = mg;
+			while ((g = strsep(&mg, ";")) != NULL) {
+				struct mep *me;
+				char *s;
+
+				if (*g == 0)
+					g = "No_group";
+				while (isspace(*g))
+					g++;
+				if (filter && !strstr(g, filter))
+					continue;
+				if (raw)
+					s = (char *)pe->metric_name;
+				else {
+					if (asprintf(&s, "%s\n\t[%s]",
+						     pe->metric_name, pe->desc) < 0)
+						return;
+				}
+
+				if (!s)
+					continue;
+
+				if (!metricgroups) {
+					strlist__add(metriclist, s);
+				} else {
+					me = mep_lookup(&groups, g);
+					if (!me)
+						continue;
+					strlist__add(me->metrics, s);
+				}
+			}
+			free(omg);
+		}
+	}
+
+	if (metricgroups && !raw)
+		printf("\nMetric Groups:\n\n");
+	else if (metrics && !raw)
+		printf("\nMetrics:\n\n");
+
+	for (node = rb_first(&groups.entries); node; node = next) {
+		struct mep *me = container_of(node, struct mep, nd);
+
+		if (metricgroups)
+			printf("%s%s%s", me->name, metrics ? ":" : "", raw ? " " : "\n");
+		if (metrics)
+			metricgroup__print_strlist(me->metrics, raw);
+		next = rb_next(node);
+		rblist__remove_node(&groups, node);
+	}
+	if (!metricgroups)
+		metricgroup__print_strlist(metriclist, raw);
+	strlist__delete(metriclist);
+}
+
 static int metricgroup__add_metric(const char *metric, struct strbuf *events,
 				   struct list_head *group_list)
 {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 57d7acf..7558892 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -28,6 +28,7 @@
 #include "probe-file.h"
 #include "asm/bug.h"
 #include "util/parse-branch-options.h"
+#include "metricgroup.h"
 
 #define MAX_NAME_LEN 100
 
@@ -2380,6 +2381,8 @@ void print_events(const char *event_glob, bool name_only, bool quiet_flag,
 	print_tracepoint_events(NULL, NULL, name_only);
 
 	print_sdt_events(NULL, NULL, name_only);
+
+	metricgroup__print(true, true, NULL, name_only);
 }
 
 int parse_events__is_hardcoded_term(struct parse_events_term *term)

* [tip:perf/core] perf stat: Don't use ctx for saved values lookup
  2017-08-31 19:40 ` [PATCH v3 08/11] perf, tools, stat: Don't use ctx for saved values lookup Andi Kleen
@ 2017-09-22 16:31   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:31 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: hpa, acme, linux-kernel, jolsa, ak, tglx, mingo

Commit-ID:  4e1a096380e3b558ef021afc08e193ce5d1be478
Gitweb:     http://git.kernel.org/tip/4e1a096380e3b558ef021afc08e193ce5d1be478
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:33 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:13 -0300

perf stat: Don't use ctx for saved values lookup

We don't need to use ctx to look up events for saved values.  The
context is already part of the evsel pointer, which is the primary key.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-9-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 42e6c17..664f49a 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -56,7 +56,6 @@ struct saved_value {
 	struct rb_node rb_node;
 	struct perf_evsel *evsel;
 	int cpu;
-	int ctx;
 	struct stats stats;
 };
 
@@ -67,8 +66,6 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
 					     rb_node);
 	const struct saved_value *b = entry;
 
-	if (a->ctx != b->ctx)
-		return a->ctx - b->ctx;
 	if (a->cpu != b->cpu)
 		return a->cpu - b->cpu;
 	if (a->evsel == b->evsel)
@@ -90,13 +87,12 @@ static struct rb_node *saved_value_new(struct rblist *rblist __maybe_unused,
 }
 
 static struct saved_value *saved_value_lookup(struct perf_evsel *evsel,
-					      int cpu, int ctx,
+					      int cpu,
 					      bool create)
 {
 	struct rb_node *nd;
 	struct saved_value dm = {
 		.cpu = cpu,
-		.ctx = ctx,
 		.evsel = evsel,
 	};
 	nd = rblist__find(&runtime_saved_values, &dm);
@@ -232,8 +228,7 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 *count,
 		update_stats(&runtime_aperf_stats[ctx][cpu], count[0]);
 
 	if (counter->collect_stat) {
-		struct saved_value *v = saved_value_lookup(counter, cpu, ctx,
-							   true);
+		struct saved_value *v = saved_value_lookup(counter, cpu, true);
 		update_stats(&v->stats, count[0]);
 	}
 }
@@ -634,7 +629,6 @@ static void generic_metric(const char *metric_expr,
 			   const char *metric_name,
 			   double avg,
 			   int cpu,
-			   int ctx,
 			   struct perf_stat_output_ctx *out)
 {
 	print_metric_t print_metric = out->print_metric;
@@ -648,7 +642,7 @@ static void generic_metric(const char *metric_expr,
 	for (i = 0; metric_events[i]; i++) {
 		struct saved_value *v;
 
-		v = saved_value_lookup(metric_events[i], cpu, ctx, false);
+		v = saved_value_lookup(metric_events[i], cpu, false);
 		if (!v)
 			break;
 		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
@@ -866,7 +860,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 			print_metric(ctxp, NULL, NULL, name, 0);
 	} else if (evsel->metric_expr) {
 		generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
-				evsel->metric_name, avg, cpu, ctx, out);
+				evsel->metric_name, avg, cpu, out);
 	} else if (runtime_nsecs_stats[cpu].n != 0) {
 		char unit = 'M';
 		char unit_buf[10];
@@ -895,7 +889,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				out->new_line(ctxp);
 			generic_metric(mexp->metric_expr, mexp->metric_events,
 					evsel->name, mexp->metric_name,
-					avg, cpu, ctx, out);
+					avg, cpu, out);
 		}
 	}
 	if (num == 0)

* [tip:perf/core] perf stat: Support duration_time for metrics
  2017-08-31 19:40 ` [PATCH v3 09/11] perf, tools, stat: Support duration_time for metrics Andi Kleen
@ 2017-09-22 16:32   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:32 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: tglx, linux-kernel, acme, mingo, ak, jolsa, hpa

Commit-ID:  fd48aad9b0f3f7654433dfae3a72ceda36e2de28
Gitweb:     http://git.kernel.org/tip/fd48aad9b0f3f7654433dfae3a72ceda36e2de28
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:34 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:14 -0300

perf stat: Support duration_time for metrics

Some of the metric formulas (like GFLOPs) need to know how long the
measurement period is. Support an internal event called duration_time,
which reports time in seconds. It maps to the dummy event, but is
special-cased for statistics to report the walltime duration.

So far it is not printed, but only used internally for metrics.
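
To make the unit handling concrete, here is a small sketch of how a
GFLOPs-style formula can consume duration_time. It assumes the wall-clock
time is tracked in nanoseconds and scaled by 1e-9 before it enters the
expression, as the patch does; the helper names duration_seconds and
gflops are made up for this example.

```c
#include <assert.h>

/*
 * Convert an average wall-clock time in nanoseconds to seconds,
 * mirroring the 1e-9 scale applied for duration_time.
 */
static double duration_seconds(double walltime_nsecs)
{
	return walltime_nsecs * 1e-9;
}

/*
 * Hypothetical helper: floating point operations per second, in units
 * of 1e9, given a total operation count and the elapsed wall time.
 */
static double gflops(double flop_count, double walltime_nsecs)
{
	double secs = duration_seconds(walltime_nsecs);

	assert(secs > 0.0);
	return flop_count / 1e9 / secs;
}
```

For instance, 4e9 operations over 2e9 ns of wall time come out as roughly
2 GFLOPs.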

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-10-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.l |  1 +
 tools/perf/util/stat-shadow.c  | 17 +++++++++++++----
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index fdb5bb5..ea2426d 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -288,6 +288,7 @@ cpu-migrations|migrations			{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COU
 alignment-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
 emulation-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
 dummy						{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
+duration_time					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
 bpf-output					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
 
 	/*
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 664f49a..a2c12d1e 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -641,11 +641,20 @@ static void generic_metric(const char *metric_expr,
 	expr__add_id(&pctx, name, avg);
 	for (i = 0; metric_events[i]; i++) {
 		struct saved_value *v;
+		struct stats *stats;
+		double scale;
 
-		v = saved_value_lookup(metric_events[i], cpu, false);
-		if (!v)
-			break;
-		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
+		if (!strcmp(metric_events[i]->name, "duration_time")) {
+			stats = &walltime_nsecs_stats;
+			scale = 1e-9;
+		} else {
+			v = saved_value_lookup(metric_events[i], cpu, false);
+			if (!v)
+				break;
+			stats = &v->stats;
+			scale = 1.0;
+		}
+		expr__add_id(&pctx, metric_events[i]->name, avg_stats(stats)*scale);
 	}
 	if (!metric_events[i]) {
 		const char *p = metric_expr;

* [tip:perf/core] perf stat: Hide internal duration_time counter
  2017-08-31 19:40 ` [PATCH v3 10/11] perf, tools, stat: Hide internal duration_time counter Andi Kleen
@ 2017-09-22 16:32   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:32 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: jolsa, hpa, linux-kernel, mingo, tglx, acme, ak

Commit-ID:  e864c5ca145e49bfce4847bd14b47b5f8549b2b1
Gitweb:     http://git.kernel.org/tip/e864c5ca145e49bfce4847bd14b47b5f8549b2b1
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:35 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:14 -0300

perf stat: Hide internal duration_time counter

Some perf stat metrics use an internal "duration_time" metric. It is not
printed correctly, however, so hide it during output to avoid confusing
users with 0 counts.
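
The filtering pattern is the same in every print loop: skip the counter
when its name is "duration_time". A stripped-down sketch on plain strings
follows; is_duration_time_name and count_visible are illustrative names,
and the real code walks the evlist rather than an array of names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same comparison as the patch's is_duration_time(), on a plain string. */
static int is_duration_time_name(const char *name)
{
	assert(name != NULL);
	return !strcmp(name, "duration_time");
}

/* Count the counters that would be printed, skipping duration_time. */
static size_t count_visible(const char *const *names, size_t n)
{
	size_t i, visible = 0;

	for (i = 0; i < n; i++) {
		if (is_duration_time_name(names[i]))
			continue;
		visible++;
	}
	return visible;
}
```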

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-11-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 874bc6d..855890e 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -195,6 +195,11 @@ static struct perf_stat_config stat_config = {
 	.scale		= true,
 };
 
+static bool is_duration_time(struct perf_evsel *evsel)
+{
+	return !strcmp(evsel->name, "duration_time");
+}
+
 static inline void diff_timespec(struct timespec *r, struct timespec *a,
 				 struct timespec *b)
 {
@@ -1363,6 +1368,9 @@ static void print_aggr(char *prefix)
 		ad.id = id = aggr_map->map[s];
 		first = true;
 		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
+
 			ad.val = ad.ena = ad.run = 0;
 			ad.nr = 0;
 			if (!collect_data(counter, aggr_cb, &ad))
@@ -1506,6 +1514,8 @@ static void print_no_aggr_metric(char *prefix)
 		if (prefix)
 			fputs(prefix, stat_config.output);
 		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			if (first) {
 				aggr_printout(counter, cpu, 0);
 				first = false;
@@ -1560,6 +1570,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
 
 	/* Print metrics headers only */
 	evlist__for_each_entry(evsel_list, counter) {
+		if (is_duration_time(counter))
+			continue;
 		os.evsel = counter;
 		out.ctx = &os;
 		out.print_metric = print_metric_header;
@@ -1707,12 +1719,18 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 		print_aggr(prefix);
 		break;
 	case AGGR_THREAD:
-		evlist__for_each_entry(evsel_list, counter)
+		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			print_aggr_thread(counter, prefix);
+		}
 		break;
 	case AGGR_GLOBAL:
-		evlist__for_each_entry(evsel_list, counter)
+		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			print_counter_aggr(counter, prefix);
+		}
 		if (metric_only)
 			fputc('\n', stat_config.output);
 		break;
@@ -1720,8 +1738,11 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 		if (metric_only)
 			print_no_aggr_metric(prefix);
 		else {
-			evlist__for_each_entry(evsel_list, counter)
+			evlist__for_each_entry(evsel_list, counter) {
+				if (is_duration_time(counter))
+					continue;
 				print_counter(counter, prefix);
+			}
 		}
 		break;
 	case AGGR_UNSET:

* [tip:perf/core] perf stat: Update walltime_nsecs_stats in interval mode
  2017-08-31 19:40 ` [PATCH v3 11/11] perf, tools, stat: Update walltime_nsecs_stats in interval mode Andi Kleen
@ 2017-09-22 16:33   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:33 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: hpa, linux-kernel, tglx, mingo, acme, jolsa, ak

Commit-ID:  b90f1333ef08d2a497ae239798868b046f4e3a97
Gitweb:     http://git.kernel.org/tip/b90f1333ef08d2a497ae239798868b046f4e3a97
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Thu, 31 Aug 2017 12:40:36 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:14 -0300

perf stat: Update walltime_nsecs_stats in interval mode

Some metrics (like GFLOPs) need walltime_nsecs_stats for each interval.
Compute it for each interval instead of only at the end.

Pointed out by Jiri.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-12-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 855890e..88f1d5f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -415,6 +415,8 @@ static void process_interval(void)
 			pr_err("failed to write stat round event\n");
 	}
 
+	init_stats(&walltime_nsecs_stats);
+	update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000);
 	print_counters(&rs, 0, NULL);
 }
 

* [tip:perf/core] perf vendor events: Add JSON metrics for Broadwell
  2017-09-05 19:52             ` Andi Kleen
  2017-09-05 20:07               ` Arnaldo Carvalho de Melo
@ 2017-09-22 16:37               ` tip-bot for Andi Kleen
  2017-09-22 16:38               ` [tip:perf/core] perf vendor events: Add JSON metrics for Skylake tip-bot for Andi Kleen
  2017-09-22 16:38               ` [tip:perf/core] perf vendor events: Add JSON metrics for Sandy Bridge tip-bot for Andi Kleen
  3 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: hpa, mingo, linux-kernel, acme, ak, tglx, jolsa

Commit-ID:  cf97962308ba6b6afdeb038505032c7c0972bdfa
Gitweb:     http://git.kernel.org/tip/cf97962308ba6b6afdeb038505032c7c0972bdfa
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Sun, 23 Jul 2017 21:53:25 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:16 -0300

perf vendor events: Add JSON metrics for Broadwell

Add JSON metrics for Broadwell.

Committer testing:

  # uname -a
  Linux jouet 4.13.0-rc7+ #3 SMP Sat Sep 2 09:04:44 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
  # grep "model name" /proc/cpuinfo  | head -1
  model name	: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
  # perf list metricgroup

  List of pre-defined events (to be used in -e):

  Metric Groups:

  DSB
  FLOPS
  Frontend
  Frontend_Bandwidth
  Memory_BW
  Memory_Bound
  Memory_Lat
  Pipeline
  Ports_Utilization
  Power
  SMT
  Summary
  TLB
  TopDownL1
  Unknown_Branches
  # perf stat -M Power --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  Turbo_Utilization  C3_Core_Residency  C6_Core_Residency  C7_Core_Residency  C2_Pkg_Residency  C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
       1.1               0.0                 0.0               0.0                0.0               0.0               0.0               0.0

         1.003502904 seconds time elapsed

  #
  # perf stat -M Memory_BW --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  MLP
       1.7

         1.001364525 seconds time elapsed

  #
  # perf stat -M TLB --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  Page_Walks_Utilization
       0.1

         1.005962198 seconds time elapsed

  #
  # perf stat -M Summary --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  Instructions   CPI          CLKS          CPU_Utilization   GFLOPs  SMT_2T_Utilization  Kernel_Utilization
  7281856697.0       0.0    11150898087.0     1.0              0.0    1.0                 0.7

         1.012134025 seconds time elapsed

  #

Running in verbose mode shows which counters and expressions are being
used:

  # perf stat -v -M Summary --metric-only -a sleep 1
  Using CPUID GenuineIntel-6-3D
  metric expr 1 / inst_retired.any / cycles for CPI
  found event inst_retired.any
  found event cycles
  metric expr cpu_clk_unhalted.thread for CLKS
  found event cpu_clk_unhalted.thread
  metric expr inst_retired.any for Instructions
  found event inst_retired.any
  metric expr cpu_clk_unhalted.ref_tsc / msr@tsc@ for CPU_Utilization
  found event cpu_clk_unhalted.ref_tsc
  found event msr/tsc/
  metric expr ( 1*( fp_arith_inst_retired.scalar_single + fp_arith_inst_retired.scalar_double ) + 2* fp_arith_inst_retired.128b_packed_double + 4*( fp_arith_inst_retired.128b_packed_single + fp_arith_inst_retired.256b_packed_double ) + 8* fp_arith_inst_retired.256b_packed_single ) / 1000000000 / duration_time for GFLOPs
  found event fp_arith_inst_retired.scalar_single
  found event fp_arith_inst_retired.scalar_double
  found event fp_arith_inst_retired.128b_packed_double
  found event fp_arith_inst_retired.128b_packed_single
  found event fp_arith_inst_retired.256b_packed_double
  found event fp_arith_inst_retired.256b_packed_single
  found event duration_time
  metric expr 1 - cpu_clk_thread_unhalted.one_thread_active / ( cpu_clk_thread_unhalted.ref_xclk_any / 2 ) if #smt_on else 0 for SMT_2T_Utilization
  found event cpu_clk_thread_unhalted.one_thread_active
  found event cpu_clk_thread_unhalted.ref_xclk_any
  metric expr cpu_clk_unhalted.ref_tsc:u / cpu_clk_unhalted.ref_tsc for Kernel_Utilization
  found event cpu_clk_unhalted.ref_tsc:u
  found event cpu_clk_unhalted.ref_tsc
  adding {inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W,{cpu_clk_thread_unhalted.one_thread_active,cpu_clk_thread_unhalted.ref_xclk_any}:W,{cpu_clk_unhalted.ref_tsc:u,cpu_clk_unhalted.ref_tsc}:W
  inst_retired.any -> cpu/event=0xc0/
  cpu_clk_unhalted.thread -> cpu/event=0x3c/
  inst_retired.any -> cpu/event=0xc0/
  cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
  fp_arith_inst_retired.scalar_single -> cpu/umask=0x2,period=2000003,event=0xc7/
  fp_arith_inst_retired.scalar_double -> cpu/umask=0x1,period=2000003,event=0xc7/
  fp_arith_inst_retired.128b_packed_double -> cpu/umask=0x4,period=2000003,event=0xc7/
  fp_arith_inst_retired.128b_packed_single -> cpu/umask=0x8,period=2000003,event=0xc7/
  fp_arith_inst_retired.256b_packed_double -> cpu/umask=0x10,period=2000003,event=0xc7/
  fp_arith_inst_retired.256b_packed_single -> cpu/umask=0x20,period=2000003,event=0xc7/
  cpu_clk_thread_unhalted.one_thread_active -> cpu/umask=0x2,period=2000003,event=0x3c/
  cpu_clk_thread_unhalted.ref_xclk_any -> cpu/umask=0x1,any=1,period=2000003,event=0x3c/
  cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
  cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
  Weak group for fp_arith_inst_retired.scalar_single/7 failed
  Weak group for cpu_clk_unhalted.ref_tsc:u/2 failed
  inst_retired.any: 8704146437 4026374016 619883741
  cycles: 11180800018 4026374016 619883741
  cpu_clk_unhalted.thread: 11140030295 4026323772 931621933
  inst_retired.any: 8643115117 4026260510 1243595906
  cpu_clk_unhalted.ref_tsc: 10201638510 4026184297 1247351077
  msr/tsc/: 10378022785 4026184297 1247351077
  fp_arith_inst_retired.scalar_single: 134697 4026102728 1559210545
  fp_arith_inst_retired.scalar_double: 274339 4026007348 1870014984
  fp_arith_inst_retired.128b_packed_double: 1639 4025886054 1866736918
  fp_arith_inst_retired.128b_packed_single: 0 4025776614 2175106569
  fp_arith_inst_retired.256b_packed_double: 0 4025681734 1235551129
  fp_arith_inst_retired.256b_packed_single: 0 4025582962 1232398454
  duration_time: 0 4025552913 4025552913
  cpu_clk_thread_unhalted.one_thread_active: 10505 4025474649 923893076
  cpu_clk_thread_unhalted.ref_xclk_any: 394992110 4025474649 923893076
  cpu_clk_unhalted.ref_tsc:u: 5341421014 4025360315 1231634198
  cpu_clk_unhalted.ref_tsc: 10258278508 4025252611 307909362

   Performance counter stats for 'system wide':

  Instructions         CPI                  CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
  8704146437.0             0.0            11140030295.0            1.0                 0.0                 1.0                 0.5

         1.006783654 seconds time elapsed

  #

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20170905195235.GW2482@two.firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../pmu-events/arch/x86/broadwell/bdw-metrics.json | 164 +++++++++++++++++++++
 1 file changed, 164 insertions(+)

diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
new file mode 100644
index 0000000..49c5f12
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
@@ -0,0 +1,164 @@
+[
+    {
+        "BriefDescription": "Instructions Per Cycle (per logical thread)",
+        "MetricExpr": "INST_RETIRED.ANY / CPU_CLK_UNHALTED.THREAD",
+        "MetricGroup": "TopDownL1",
+        "MetricName": "IPC"
+    },
+    {
+        "BriefDescription": "Uops Per Instruction",
+        "MetricExpr": "UOPS_RETIRED.RETIRE_SLOTS / INST_RETIRED.ANY",
+        "MetricGroup": "Pipeline",
+        "MetricName": "UPI"
+    },
+    {
+        "BriefDescription": "Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions",
+        "MetricExpr": "min( 1 , IDQ.MITE_UOPS / ( UOPS_RETIRED.RETIRE_SLOTS / INST_RETIRED.ANY * 16 * ( ICACHE.HIT + ICACHE.MISSES ) / 4.0 ) )",
+        "MetricGroup": "Frontend",
+        "MetricName": "IFetch_Line_Utilization"
+    },
+    {
+        "BriefDescription": "Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)",
+        "MetricExpr": "IDQ.DSB_UOPS / ( IDQ.DSB_UOPS + LSD.UOPS + IDQ.MITE_UOPS + IDQ.MS_UOPS )",
+        "MetricGroup": "DSB; Frontend_Bandwidth",
+        "MetricName": "DSB_Coverage"
+    },
+    {
+        "BriefDescription": "Cycles Per Instruction (threaded)",
+        "MetricExpr": "1 / INST_RETIRED.ANY / cycles",
+        "MetricGroup": "Pipeline;Summary",
+        "MetricName": "CPI"
+    },
+    {
+        "BriefDescription": "Per-thread actual clocks when the logical processor is active. This is called 'Clockticks' in VTune.",
+        "MetricExpr": "CPU_CLK_UNHALTED.THREAD",
+        "MetricGroup": "Summary",
+        "MetricName": "CLKS"
+    },
+    {
+        "BriefDescription": "Total issue-pipeline slots",
+        "MetricExpr": "4*( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles",
+        "MetricGroup": "TopDownL1",
+        "MetricName": "SLOTS"
+    },
+    {
+        "BriefDescription": "Total number of retired Instructions",
+        "MetricExpr": "INST_RETIRED.ANY",
+        "MetricGroup": "Summary",
+        "MetricName": "Instructions"
+    },
+    {
+        "BriefDescription": "Instructions Per Cycle (per physical core)",
+        "MetricExpr": "INST_RETIRED.ANY / ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles",
+        "MetricGroup": "SMT",
+        "MetricName": "CoreIPC"
+    },
+    {
+        "BriefDescription": "Instruction-Level-Parallelism (average number of uops executed when there is at least 1 uop executed)",
+        "MetricExpr": "UOPS_EXECUTED.THREAD / ( cpu@uops_executed.core\\,cmask\\=1@ / 2) if #SMT_on else UOPS_EXECUTED.CYCLES_GE_1_UOP_EXEC",
+        "MetricGroup": "Pipeline;Ports_Utilization",
+        "MetricName": "ILP"
+    },
+    {
+        "BriefDescription": "Average Branch Address Clear Cost (fraction of cycles)",
+	"MetricExpr": "2* ( RS_EVENTS.EMPTY_CYCLES - ICACHE.IFDATA_STALL  - ( 14 * ITLB_MISSES.STLB_HIT + cpu@ITLB_MISSES.WALK_DURATION\\,cmask\\=1@ + 7* ITLB_MISSES.WALK_COMPLETED ) ) / RS_EVENTS.EMPTY_END",
+        "MetricGroup": "Unknown_Branches",
+        "MetricName": "BAClear_Cost"
+    },
+    {
+        "BriefDescription": "Core actual clocks when any thread is active on the physical core",
+        "MetricExpr": "( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else CPU_CLK_UNHALTED.THREAD",
+        "MetricGroup": "SMT",
+        "MetricName": "CORE_CLKS"
+    },
+    {
+        "BriefDescription": "Actual Average Latency for L1 data-cache miss demand loads",
+        "MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_UOPS_RETIRED.L1_MISS + mem_load_uops_retired.hit_lfb )",
+        "MetricGroup": "Memory_Bound;Memory_Lat",
+        "MetricName": "Load_Miss_Real_Latency"
+    },
+    {
+        "BriefDescription": "Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)",
+        "MetricExpr": "L1D_PEND_MISS.PENDING / ( cpu@l1d_pend_miss.pending_cycles\\,any\\=1@ / 2) if #SMT_on else L1D_PEND_MISS.PENDING_CYCLES",
+        "MetricGroup": "Memory_Bound;Memory_BW",
+        "MetricName": "MLP"
+    },
+    {
+        "BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
+	"MetricExpr": "( cpu@ITLB_MISSES.WALK_DURATION\\,cmask\\=1@ + cpu@DTLB_LOAD_MISSES.WALK_DURATION\\,cmask\\=1@ + cpu@DTLB_STORE_MISSES.WALK_DURATION\\,cmask\\=1@ + 7*(DTLB_STORE_MISSES.WALK_COMPLETED+DTLB_LOAD_MISSES.WALK_COMPLETED+ITLB_MISSES.WALK_COMPLETED)) / ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles",
+        "MetricGroup": "TLB",
+        "MetricName": "Page_Walks_Utilization"
+    },
+    {
+        "BriefDescription": "Average CPU Utilization",
+        "MetricExpr": "CPU_CLK_UNHALTED.REF_TSC / msr@tsc@",
+        "MetricGroup": "Summary",
+        "MetricName": "CPU_Utilization"
+    },
+    {
+        "BriefDescription": "Giga Floating Point Operations Per Second",
+        "MetricExpr": "( 1*( FP_ARITH_INST_RETIRED.SCALAR_SINGLE + FP_ARITH_INST_RETIRED.SCALAR_DOUBLE ) + 2* FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE + 4*( FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE + FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE ) + 8* FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE ) / 1000000000 / duration_time",
+        "MetricGroup": "FLOPS;Summary",
+        "MetricName": "GFLOPs"
+    },
+    {
+        "BriefDescription": "Average Frequency Utilization relative nominal frequency",
+        "MetricExpr": "CPU_CLK_UNHALTED.THREAD / CPU_CLK_UNHALTED.REF_TSC",
+        "MetricGroup": "Power",
+        "MetricName": "Turbo_Utilization"
+    },
+    {
+        "BriefDescription": "Fraction of cycles where both hardware threads were active",
+        "MetricExpr": "1 - CPU_CLK_THREAD_UNHALTED.ONE_THREAD_ACTIVE / ( CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY / 2 ) if #SMT_on else 0",
+        "MetricGroup": "SMT;Summary",
+        "MetricName": "SMT_2T_Utilization"
+    },
+    {
+        "BriefDescription": "Fraction of cycles spent in Kernel mode",
+        "MetricExpr": "CPU_CLK_UNHALTED.REF_TSC:u / CPU_CLK_UNHALTED.REF_TSC",
+        "MetricGroup": "Summary",
+        "MetricName": "Kernel_Utilization"
+    },
+    {
+        "BriefDescription": "C3 residency percent per core",
+        "MetricExpr": "(cstate_core@c3\\-residency@ / msr@tsc@) * 100",
+        "MetricGroup": "Power",
+        "MetricName": "C3_Core_Residency"
+    },
+    {
+        "BriefDescription": "C6 residency percent per core",
+        "MetricExpr": "(cstate_core@c6\\-residency@ / msr@tsc@) * 100",
+        "MetricGroup": "Power",
+        "MetricName": "C6_Core_Residency"
+    },
+    {
+        "BriefDescription": "C7 residency percent per core",
+        "MetricExpr": "(cstate_core@c7\\-residency@ / msr@tsc@) * 100",
+        "MetricGroup": "Power",
+        "MetricName": "C7_Core_Residency"
+    },
+    {
+        "BriefDescription": "C2 residency percent per package",
+        "MetricExpr": "(cstate_pkg@c2\\-residency@ / msr@tsc@) * 100",
+        "MetricGroup": "Power",
+        "MetricName": "C2_Pkg_Residency"
+    },
+    {
+        "BriefDescription": "C3 residency percent per package",
+        "MetricExpr": "(cstate_pkg@c3\\-residency@ / msr@tsc@) * 100",
+        "MetricGroup": "Power",
+        "MetricName": "C3_Pkg_Residency"
+    },
+    {
+        "BriefDescription": "C6 residency percent per package",
+        "MetricExpr": "(cstate_pkg@c6\\-residency@ / msr@tsc@) * 100",
+        "MetricGroup": "Power",
+        "MetricName": "C6_Pkg_Residency"
+    },
+    {
+        "BriefDescription": "C7 residency percent per package",
+        "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
+        "MetricGroup": "Power",
+        "MetricName": "C7_Pkg_Residency"
+    }
+]
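
As a cross-check of the committer testing output, here is a minimal sketch that recomputes three of the Summary metrics from the counter values printed by `perf stat -v`. The variable names are illustrative only, and the choice of which `cpu_clk_unhalted.ref_tsc` reading feeds which metric is an assumption, since the event is counted in more than one group. Read left to right, `1 / INST_RETIRED.ANY / cycles` divides by both counts, which is consistent with the CPI of 0.0 shown in the output. The parenthesized variant is included only for comparison and is not part of this patch.

```python
# Counter values copied from the 'perf stat -v' lines in the commit message.
inst_retired_any = 8704146437
cycles = 11180800018
ref_tsc = 10201638510        # reading from the {cpu_clk_unhalted.ref_tsc,msr/tsc/} group
tsc = 10378022785
ref_tsc_user = 5341421014    # cpu_clk_unhalted.ref_tsc:u
ref_tsc_all = 10258278508    # assumption: the reading listed right after the :u event

# "MetricExpr": "CPU_CLK_UNHALTED.REF_TSC / msr@tsc@"
cpu_utilization = ref_tsc / tsc

# "MetricExpr": "CPU_CLK_UNHALTED.REF_TSC:u / CPU_CLK_UNHALTED.REF_TSC"
kernel_utilization = ref_tsc_user / ref_tsc_all

# "MetricExpr": "1 / INST_RETIRED.ANY / cycles", evaluated left to right
cpi_as_written = 1 / inst_retired_any / cycles

# For comparison only (not in the patch): cycles per instruction
cpi_grouped = 1 / (inst_retired_any / cycles)

print(f"CPU_Utilization     {cpu_utilization:.1f}")
print(f"Kernel_Utilization  {kernel_utilization:.1f}")
print(f"CPI (as written)    {cpi_as_written:.1f}")
print(f"CPI (grouped)       {cpi_grouped:.1f}")
```

With one decimal place, the first three values round to 1.0, 0.5 and 0.0, matching the table in the commit message.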

* [tip:perf/core] perf vendor events: Add JSON metrics for Skylake
  2017-09-05 19:52             ` Andi Kleen
  2017-09-05 20:07               ` Arnaldo Carvalho de Melo
  2017-09-22 16:37               ` [tip:perf/core] perf vendor events: Add JSON metrics for Broadwell tip-bot for Andi Kleen
@ 2017-09-22 16:38               ` tip-bot for Andi Kleen
  2017-09-22 16:38               ` [tip:perf/core] perf vendor events: Add JSON metrics for Sandy Bridge tip-bot for Andi Kleen
  3 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:38 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: mingo, acme, ak, linux-kernel, jolsa, hpa, tglx

Commit-ID:  2e006a24127ad88422632f9e5d6a8039a40f01da
Gitweb:     http://git.kernel.org/tip/2e006a24127ad88422632f9e5d6a8039a40f01da
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Sun, 23 Jul 2017 21:54:49 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:17 -0300

perf vendor events: Add JSON metrics for Skylake

Add JSON metrics for Skylake.

Committer testing:

  # grep "model name" /proc/cpuinfo | head -1
  model name	: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  # uname -a
  Linux seventh 4.12.0-rc6+ #1 SMP Fri Jun 30 16:40:55 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
  # perf stat --metric-only -M Summary -a sleep 1

   Performance counter stats for 'system wide':

  Instructions         CPI                  CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
  34021097.0               0.0            119424171.0              0.0                 0.0                 0.0                 0.0

         1.001001793 seconds time elapsed

  # perf list metricgroup

  List of pre-defined events (to be used in -e):

  Metric Groups:

  DSB
  FLOPS
  Frontend
  Frontend_Bandwidth
  Memory_BW
  Memory_Bound
  Memory_Lat
  Pipeline
  Ports_Utilization
  Power
  SMT
  Summary
  TLB
  TopDownL1
  Unknown_Branches
  # perf stat --metric-only -M Ports_Utilization -a sleep 1

   Performance counter stats for 'system wide':

  ILP
  1475828.0

       1.000688547 seconds time elapsed

  # perf stat -v --metric-only -M Ports_Utilization -a sleep 1
  Using CPUID GenuineIntel-6-9E
  metric expr uops_executed.thread / ( uops_executed.core_cycles_ge_1 / 2) if #smt_on else uops_executed.core_cycles_ge_1 for ILP
  found event uops_executed.thread
  found event uops_executed.core_cycles_ge_1
  adding {uops_executed.thread,uops_executed.core_cycles_ge_1}:W
  uops_executed.thread -> cpu/umask=0x1,period=2000003,event=0xb1/
  uops_executed.core_cycles_ge_1 -> cpu/umask=0x2,period=2000003,cmask=1,event=0xb1/
  uops_executed.thread: 8115271 4002547654 4002547654
  uops_executed.core_cycles_ge_1: 3282969 4002547654 4002547654

   Performance counter stats for 'system wide':

  ILP
  3282969.0

         1.000719870 seconds time elapsed

  #

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20170905195235.GW2482@two.firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../{broadwell/bdw-metrics.json => skylake/skl-metrics.json} | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json b/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
similarity index 85%
copy from tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
copy to tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
index 49c5f12..411f941 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
@@ -13,7 +13,7 @@
     },
     {
         "BriefDescription": "Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions",
-        "MetricExpr": "min( 1 , IDQ.MITE_UOPS / ( UOPS_RETIRED.RETIRE_SLOTS / INST_RETIRED.ANY * 16 * ( ICACHE.HIT + ICACHE.MISSES ) / 4.0 ) )",
+        "MetricExpr": "min( 1 , UOPS_ISSUED.ANY / (UOPS_RETIRED.RETIRE_SLOTS / INST_RETIRED.ANY * 64 * ( ICACHE_64B.IFTAG_HIT + ICACHE_64B.IFTAG_MISS ) / 4.1) )",
         "MetricGroup": "Frontend",
         "MetricName": "IFetch_Line_Utilization"
     },
@@ -55,13 +55,13 @@
     },
     {
         "BriefDescription": "Instruction-Level-Parallelism (average number of uops executed when there is at least 1 uop executed)",
-        "MetricExpr": "UOPS_EXECUTED.THREAD / ( cpu@uops_executed.core\\,cmask\\=1@ / 2) if #SMT_on else UOPS_EXECUTED.CYCLES_GE_1_UOP_EXEC",
+        "MetricExpr": "UOPS_EXECUTED.THREAD / ( UOPS_EXECUTED.CORE_CYCLES_GE_1 / 2) if #SMT_on else UOPS_EXECUTED.CORE_CYCLES_GE_1",
         "MetricGroup": "Pipeline;Ports_Utilization",
         "MetricName": "ILP"
     },
     {
         "BriefDescription": "Average Branch Address Clear Cost (fraction of cycles)",
-	"MetricExpr": "2* ( RS_EVENTS.EMPTY_CYCLES - ICACHE.IFDATA_STALL  - ( 14 * ITLB_MISSES.STLB_HIT + cpu@ITLB_MISSES.WALK_DURATION\\,cmask\\=1@ + 7* ITLB_MISSES.WALK_COMPLETED ) ) / RS_EVENTS.EMPTY_END",
+        "MetricExpr": "2* ( RS_EVENTS.EMPTY_CYCLES - ICACHE_16B.IFDATA_STALL  - ICACHE_64B.IFTAG_STALL ) / RS_EVENTS.EMPTY_END",
         "MetricGroup": "Unknown_Branches",
         "MetricName": "BAClear_Cost"
     },
@@ -73,19 +73,19 @@
     },
     {
         "BriefDescription": "Actual Average Latency for L1 data-cache miss demand loads",
-        "MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_UOPS_RETIRED.L1_MISS + mem_load_uops_retired.hit_lfb )",
+	"MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_RETIRED.L1_MISS + MEM_LOAD_RETIRED.FB_HIT )",
         "MetricGroup": "Memory_Bound;Memory_Lat",
         "MetricName": "Load_Miss_Real_Latency"
     },
     {
         "BriefDescription": "Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)",
-        "MetricExpr": "L1D_PEND_MISS.PENDING / ( cpu@l1d_pend_miss.pending_cycles\\,any\\=1@ / 2) if #SMT_on else L1D_PEND_MISS.PENDING_CYCLES",
+        "MetricExpr": "L1D_PEND_MISS.PENDING / ( L1D_PEND_MISS.PENDING_CYCLES_ANY / 2) if #SMT_on else L1D_PEND_MISS.PENDING_CYCLES",
         "MetricGroup": "Memory_Bound;Memory_BW",
         "MetricName": "MLP"
     },
     {
         "BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
-	"MetricExpr": "( cpu@ITLB_MISSES.WALK_DURATION\\,cmask\\=1@ + cpu@DTLB_LOAD_MISSES.WALK_DURATION\\,cmask\\=1@ + cpu@DTLB_STORE_MISSES.WALK_DURATION\\,cmask\\=1@ + 7*(DTLB_STORE_MISSES.WALK_COMPLETED+DTLB_LOAD_MISSES.WALK_COMPLETED+ITLB_MISSES.WALK_COMPLETED)) / ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles",
+        "MetricExpr": "( ITLB_MISSES.WALK_PENDING + DTLB_LOAD_MISSES.WALK_PENDING + DTLB_STORE_MISSES.WALK_PENDING + EPT.WALK_PENDING ) / ( 2 * ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles )",
         "MetricGroup": "TLB",
         "MetricName": "Page_Walks_Utilization"
     },
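
The ILP value of 3282969.0 in the verbose run above equals the raw `uops_executed.core_cycles_ge_1` count. A minimal sketch of why, assuming `#SMT_on` evaluated to false on the test machine and that `if ... else` applies to the whole division on its left (Python parses the same shape the same way; the function name `ilp` is hypothetical):

```python
def ilp(uops_thread, core_cycles_ge_1, smt_on):
    # Same shape as the Skylake ILP MetricExpr. The conditional covers the
    # entire division, so with SMT off only the cycle count is returned.
    return uops_thread / (core_cycles_ge_1 / 2) if smt_on else core_cycles_ge_1

# Counts copied from the 'perf stat -v' output in the commit message.
smt_off = ilp(8115271, 3282969, smt_on=False)
smt_on = ilp(8115271, 3282969, smt_on=True)

print(f"SMT off: {smt_off:.1f}")
print(f"SMT on:  {smt_on:.2f}")
```

Under these assumptions the SMT-off branch reproduces the value printed by perf, while the SMT-on branch would give a ratio a little below 5 uops per cycle.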

* [tip:perf/core] perf vendor events: Add JSON metrics for Sandy Bridge
  2017-09-05 19:52             ` Andi Kleen
                                 ` (2 preceding siblings ...)
  2017-09-22 16:38               ` [tip:perf/core] perf vendor events: Add JSON metrics for Skylake tip-bot for Andi Kleen
@ 2017-09-22 16:38               ` tip-bot for Andi Kleen
  3 siblings, 0 replies; 47+ messages in thread
From: tip-bot for Andi Kleen @ 2017-09-22 16:38 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, mingo, hpa, acme, tglx, ak, jolsa

Commit-ID:  97dca6715d0a058a6af028a3019432740b4a0011
Gitweb:     http://git.kernel.org/tip/97dca6715d0a058a6af028a3019432740b4a0011
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Sun, 23 Jul 2017 21:50:34 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 13 Sep 2017 09:49:17 -0300

perf vendor events: Add JSON metrics for Sandy Bridge

Add JSON metrics for Sandy Bridge.

Committer testing:

  # grep "model name" /proc/cpuinfo | head -1
  model name	: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
  # perf list metricgroup

  List of pre-defined events (to be used in -e):

  Metric Groups:

  DSB
  FLOPS
  Frontend
  Frontend_Bandwidth
  Pipeline
  Ports_Utilization
  Power
  SMT
  Summary
  TopDownL1
  # perf stat -M Power --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  Turbo_Utilization  C3_Core_Residency  C6_Core_Residency  C7_Core_Residency  C2_Pkg_Residency  C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
     0.8               0.0                98.1               0.0                0.0               0.0               23.4              0.0

       1.001153658 seconds time elapsed

  # perf stat -v -M Power --metric-only -a sleep 1
  Using CPUID GenuineIntel-6-2A
  metric expr cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc for Turbo_Utilization
  found event cpu_clk_unhalted.thread
  found event cpu_clk_unhalted.ref_tsc
  metric expr (cstate_core@c3\-residency@ / msr@tsc@) * 100 for C3_Core_Residency
  found event cstate_core/c3-residency/
  found event msr/tsc/
  metric expr (cstate_core@c6\-residency@ / msr@tsc@) * 100 for C6_Core_Residency
  found event cstate_core/c6-residency/
  found event msr/tsc/
  metric expr (cstate_core@c7\-residency@ / msr@tsc@) * 100 for C7_Core_Residency
  found event cstate_core/c7-residency/
  found event msr/tsc/
  metric expr (cstate_pkg@c2\-residency@ / msr@tsc@) * 100 for C2_Pkg_Residency
  found event cstate_pkg/c2-residency/
  found event msr/tsc/
  metric expr (cstate_pkg@c3\-residency@ / msr@tsc@) * 100 for C3_Pkg_Residency
  found event cstate_pkg/c3-residency/
  found event msr/tsc/
  metric expr (cstate_pkg@c6\-residency@ / msr@tsc@) * 100 for C6_Pkg_Residency
  found event cstate_pkg/c6-residency/
  found event msr/tsc/
  metric expr (cstate_pkg@c7\-residency@ / msr@tsc@) * 100 for C7_Pkg_Residency
  found event cstate_pkg/c7-residency/
  found event msr/tsc/
  adding {cpu_clk_unhalted.thread,cpu_clk_unhalted.ref_tsc}:W,{cstate_core/c3-residency/,msr/tsc/}:W,{cstate_core/c6-residency/,msr/tsc/}:W,{cstate_core/c7-residency/,msr/tsc/}:W,{cstate_pkg/c2-residency/,msr/tsc/}:W,{cstate_pkg/c3-residency/,msr/tsc/}:W,{cstate_pkg/c6-residency/,msr/tsc/}:W,{cstate_pkg/c7-residency/,msr/tsc/}:W
  cpu_clk_unhalted.thread -> cpu/event=0x3c/
  cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
  Weak group for cstate_pkg/c2-residency//2 failed
  Weak group for cstate_pkg/c3-residency//2 failed
  Weak group for cstate_pkg/c6-residency//2 failed
  Weak group for cstate_pkg/c7-residency//2 failed
  cpu_clk_unhalted.thread: 5564185 4002833569 4002833569
  cpu_clk_unhalted.ref_tsc: 7325424 4002833569 4002833569
  cstate_core/c3-residency/: 68293 4003027101 4003027101
  msr/tsc/: 12451294472 4003027101 4003027101
  cstate_core/c6-residency/: 12238830163 4003260984 4003260984
  msr/tsc/: 12452017806 4003260984 4003260984
  cstate_core/c7-residency/: 0 4003489648 4003489648
  msr/tsc/: 12452725162 4003489648 4003489648
  cstate_pkg/c2-residency/: 1830054 1000913138 1000913138
  msr/tsc/: 12453441079 4003717513 4003717513
  cstate_pkg/c3-residency/: 0 1000973570 1000973570
  msr/tsc/: 12454177865 4003954758 4003954758
  cstate_pkg/c6-residency/: 2940448859 1001032370 1001032370
  msr/tsc/: 12454833890 4004166118 4004166118
  cstate_pkg/c7-residency/: 0 1001049818 1001049818
  msr/tsc/: 12454919470 4004194204 4004194204

   Performance counter stats for 'system wide':

  Turbo_Utilization  C3_Core_Residency  C6_Core_Residency  C7_Core_Residency  C2_Pkg_Residency  C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
       0.8             0.0                98.3               0.0                0.0               0.0               23.6              0.0

         1.001126519 seconds time elapsed

  #
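
The `adding` line and the `Weak group for ... failed` lines in the log above can be correlated mechanically. The sketch below assumes that the text after the last `/` in a failure line is the number of events in the group and that the name before it is the group's first event. That is a reading of the log, not something the commit message states. The helper names are hypothetical.

```python
import re

# Group specification copied from the 'adding' line of the verbose output.
adding = (
    "{cpu_clk_unhalted.thread,cpu_clk_unhalted.ref_tsc}:W,"
    "{cstate_core/c3-residency/,msr/tsc/}:W,"
    "{cstate_core/c6-residency/,msr/tsc/}:W,"
    "{cstate_core/c7-residency/,msr/tsc/}:W,"
    "{cstate_pkg/c2-residency/,msr/tsc/}:W,"
    "{cstate_pkg/c3-residency/,msr/tsc/}:W,"
    "{cstate_pkg/c6-residency/,msr/tsc/}:W,"
    "{cstate_pkg/c7-residency/,msr/tsc/}:W"
)

failures = [
    "Weak group for cstate_pkg/c2-residency//2 failed",
    "Weak group for cstate_pkg/c3-residency//2 failed",
    "Weak group for cstate_pkg/c6-residency//2 failed",
    "Weak group for cstate_pkg/c7-residency//2 failed",
]

def parse_groups(spec):
    """Split '{a,b}:W,{c,d}:W' into [['a', 'b'], ['c', 'd']]."""
    return [body.split(",") for body in re.findall(r"\{([^}]*)\}:W", spec)]

def parse_failure(line):
    """Return (first event, member count) from a 'Weak group for' line."""
    match = re.fullmatch(r"Weak group for (.+)/(\d+) failed", line.strip())
    return match.group(1), int(match.group(2))

groups = parse_groups(adding)
failed = [parse_failure(line) for line in failures]

for leader, size in failed:
    matching = [g for g in groups if g[0] == leader and len(g) == size]
    print(f"{leader}: {len(matching)} matching group(s) of size {size}")
```

In this log, the four failures all name `cstate_pkg` groups, each paired with `msr/tsc/`.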

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20170905195235.GW2482@two.firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../snb-metrics.json}                              | 30 +++-------------------
 1 file changed, 3 insertions(+), 27 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json b/tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
similarity index 72%
copy from tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
copy to tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
index 411f941..b35b1c1 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
@@ -13,7 +13,7 @@
     },
     {
         "BriefDescription": "Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions",
-        "MetricExpr": "min( 1 , UOPS_ISSUED.ANY / (UOPS_RETIRED.RETIRE_SLOTS / INST_RETIRED.ANY * 64 * ( ICACHE_64B.IFTAG_HIT + ICACHE_64B.IFTAG_MISS ) / 4.1) )",
+        "MetricExpr": "min( 1 , UOPS_ISSUED.ANY / ( UOPS_RETIRED.RETIRE_SLOTS / INST_RETIRED.ANY * 32 * ( ICACHE.HIT + ICACHE.MISSES ) / 4) )",
         "MetricGroup": "Frontend",
         "MetricName": "IFetch_Line_Utilization"
     },
@@ -55,41 +55,17 @@
     },
     {
         "BriefDescription": "Instruction-Level-Parallelism (average number of uops executed when there is at least 1 uop executed)",
-        "MetricExpr": "UOPS_EXECUTED.THREAD / ( UOPS_EXECUTED.CORE_CYCLES_GE_1 / 2) if #SMT_on else UOPS_EXECUTED.CORE_CYCLES_GE_1",
+	"MetricExpr": "UOPS_DISPATCHED.THREAD / ( cpu@UOPS_DISPATCHED.CORE\\,cmask\\=1@ / 2) if #SMT_on else cpu@UOPS_DISPATCHED.CORE\\,cmask\\=1@",
         "MetricGroup": "Pipeline;Ports_Utilization",
         "MetricName": "ILP"
     },
     {
-        "BriefDescription": "Average Branch Address Clear Cost (fraction of cycles)",
-        "MetricExpr": "2* ( RS_EVENTS.EMPTY_CYCLES - ICACHE_16B.IFDATA_STALL  - ICACHE_64B.IFTAG_STALL ) / RS_EVENTS.EMPTY_END",
-        "MetricGroup": "Unknown_Branches",
-        "MetricName": "BAClear_Cost"
-    },
-    {
         "BriefDescription": "Core actual clocks when any thread is active on the physical core",
         "MetricExpr": "( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else CPU_CLK_UNHALTED.THREAD",
         "MetricGroup": "SMT",
         "MetricName": "CORE_CLKS"
     },
     {
-        "BriefDescription": "Actual Average Latency for L1 data-cache miss demand loads",
-	"MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_RETIRED.L1_MISS + MEM_LOAD_RETIRED.FB_HIT )",
-        "MetricGroup": "Memory_Bound;Memory_Lat",
-        "MetricName": "Load_Miss_Real_Latency"
-    },
-    {
-        "BriefDescription": "Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)",
-        "MetricExpr": "L1D_PEND_MISS.PENDING / ( L1D_PEND_MISS.PENDING_CYCLES_ANY / 2) if #SMT_on else L1D_PEND_MISS.PENDING_CYCLES",
-        "MetricGroup": "Memory_Bound;Memory_BW",
-        "MetricName": "MLP"
-    },
-    {
-        "BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
-        "MetricExpr": "( ITLB_MISSES.WALK_PENDING + DTLB_LOAD_MISSES.WALK_PENDING + DTLB_STORE_MISSES.WALK_PENDING + EPT.WALK_PENDING ) / ( 2 * ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles )",
-        "MetricGroup": "TLB",
-        "MetricName": "Page_Walks_Utilization"
-    },
-    {
         "BriefDescription": "Average CPU Utilization",
         "MetricExpr": "CPU_CLK_UNHALTED.REF_TSC / msr@tsc@",
         "MetricGroup": "Summary",
@@ -97,7 +73,7 @@
     },
     {
         "BriefDescription": "Giga Floating Point Operations Per Second",
-        "MetricExpr": "( 1*( FP_ARITH_INST_RETIRED.SCALAR_SINGLE + FP_ARITH_INST_RETIRED.SCALAR_DOUBLE ) + 2* FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE + 4*( FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE + FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE ) + 8* FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE ) / 1000000000 / duration_time",
+        "MetricExpr": "( 1*( FP_COMP_OPS_EXE.SSE_SCALAR_SINGLE + FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE ) + 2* FP_COMP_OPS_EXE.SSE_PACKED_DOUBLE + 4*( FP_COMP_OPS_EXE.SSE_PACKED_SINGLE + SIMD_FP_256.PACKED_DOUBLE ) + 8* SIMD_FP_256.PACKED_SINGLE ) / 1000000000 / duration_time",
         "MetricGroup": "FLOPS;Summary",
         "MetricName": "GFLOPs"
     },
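
The Power metrics in the committer testing output can be reproduced from the counts printed by the verbose run. This is a minimal sketch that applies the `(cstate / msr@tsc@) * 100` expressions and the `Turbo_Utilization` ratio to those counts. The function name and the dictionary layout are illustrative, not part of perf.

```python
def residency_percent(cstate_ticks, tsc_ticks):
    # "MetricExpr": "(cstate_...@cN\\-residency@ / msr@tsc@) * 100"
    return cstate_ticks / tsc_ticks * 100

# (cstate count, msr/tsc/ count) pairs copied from the 'perf stat -v' output.
samples = {
    "C3_Core_Residency": (68293, 12451294472),
    "C6_Core_Residency": (12238830163, 12452017806),
    "C2_Pkg_Residency": (1830054, 12453441079),
    "C6_Pkg_Residency": (2940448859, 12454833890),
}

results = {name: residency_percent(c, t) for name, (c, t) in samples.items()}

# "MetricExpr": "CPU_CLK_UNHALTED.THREAD / CPU_CLK_UNHALTED.REF_TSC"
turbo_utilization = 5564185 / 7325424

print(f"Turbo_Utilization  {turbo_utilization:.1f}")
for name, value in results.items():
    print(f"{name}  {value:.1f}")
```

Rounded to one decimal place, these values agree with the second results table in the commit message (0.8, 0.0, 98.3, 0.0 and 23.6).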

* Re: [PATCH v3 07/11] perf, tools, list: Add metric groups to perf list
  2017-08-31 19:40 ` [PATCH v3 07/11] perf, tools, list: Add metric groups to perf list Andi Kleen
  2017-09-22 16:31   ` [tip:perf/core] perf " tip-bot for Andi Kleen
@ 2017-10-13 14:50   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-10-13 14:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: jolsa, linux-kernel, Andi Kleen

Em Thu, Aug 31, 2017 at 12:40:32PM -0700, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
> 
> Add code to perf list to print metric groups, and metrics
> that don't have an event name. The metricgroup code collects
> the eventgroups and events into a rblist, and then prints
> them according to the configured filters.
> 
> The metricgroups are printed by default, but can be
> limited by perf list metric or perf list metricgroup

Andi, I just noticed that the metric groups are listed even when we pass
'perf list' a substring that matches only unrelated events. Can you
please take a look at this?

Thanks,

- Arnaldo

[root@jouet ~]# perf list energy-cores 
List of pre-defined events (to be used in -e):

  power/energy-cores/                                [Kernel PMU event]


Metric Groups:

DSB:
  DSB_Coverage
        [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
FLOPS:
  GFLOPs
        [Giga Floating Point Operations Per Second]
Frontend:
  IFetch_Line_Utilization
        [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
Frontend_Bandwidth:
  DSB_Coverage
        [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
Memory_BW:
  MLP
        [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]
Memory_Bound:
  Load_Miss_Real_Latency
        [Actual Average Latency for L1 data-cache miss demand loads]
  MLP
        [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]
Memory_Lat:
  Load_Miss_Real_Latency
        [Actual Average Latency for L1 data-cache miss demand loads]
Pipeline:
  CPI
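
The listing above shows the same metric under several groups, for example `DSB_Coverage` under both `DSB` and `Frontend_Bandwidth`. This matches the semicolon-separated `MetricGroup` strings in the JSON files. Here is a minimal sketch of how such an index could be built, using a few entries copied from the Broadwell metrics file. The function `index_by_group` is hypothetical and is not perf's implementation.

```python
from collections import defaultdict

# A few entries copied from bdw-metrics.json (other fields omitted).
metrics = [
    {"MetricName": "DSB_Coverage", "MetricGroup": "DSB; Frontend_Bandwidth"},
    {"MetricName": "Load_Miss_Real_Latency", "MetricGroup": "Memory_Bound;Memory_Lat"},
    {"MetricName": "MLP", "MetricGroup": "Memory_Bound;Memory_BW"},
    {"MetricName": "GFLOPs", "MetricGroup": "FLOPS;Summary"},
]

def index_by_group(entries):
    """Map each group name to the metric names that list it."""
    groups = defaultdict(list)
    for entry in entries:
        for name in entry["MetricGroup"].split(";"):
            # Strip whitespace so "DSB; Frontend_Bandwidth" yields two clean names.
            groups[name.strip()].append(entry["MetricName"])
    return dict(sorted(groups.items()))

for group, names in index_by_group(metrics).items():
    print(f"{group}:")
    for metric in names:
        print(f"  {metric}")
```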

end of thread, other threads:[~2017-10-13 15:00 UTC | newest]

Thread overview: 47+ messages
2017-08-31 19:40 Support standalone metrics and metric groups for perf Andi Kleen
2017-08-31 19:40 ` [PATCH v3 01/11] perf, tools: Support weak groups Andi Kleen
2017-09-01 16:57   ` Jiri Olsa
2017-09-01 17:00     ` Jiri Olsa
2017-09-04 16:51       ` Arnaldo Carvalho de Melo
2017-09-22 16:28   ` [tip:perf/core] perf tools: Support weak groups in 'perf stat' tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 02/11] perf, tools: Support metric_group and no event name in json parser Andi Kleen
2017-09-22 16:29   ` [tip:perf/core] perf vendor events: Support metric_group and no event name in JSON parser tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 03/11] perf, tools, stat: Factor out generic metric printing Andi Kleen
2017-09-22 16:29   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 04/11] perf, tools: Print generic metric header even for failed expressions Andi Kleen
2017-09-22 16:30   ` [tip:perf/core] perf stat: " tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 05/11] perf, tools: Extract function to get json alias map Andi Kleen
2017-09-22 16:30   ` [tip:perf/core] perf pmu: Extract function to get JSON " tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 06/11] perf, tools, stat: Support JSON metrics in perf stat Andi Kleen
2017-09-04 17:11   ` Arnaldo Carvalho de Melo
2017-09-04 17:37     ` Andi Kleen
2017-09-05 18:09       ` Arnaldo Carvalho de Melo
2017-09-05 18:16         ` Arnaldo Carvalho de Melo
2017-09-05 18:32           ` Arnaldo Carvalho de Melo
2017-09-05 18:19         ` Andi Kleen
2017-09-05 18:52           ` Arnaldo Carvalho de Melo
2017-09-05 19:52             ` Andi Kleen
2017-09-05 20:07               ` Arnaldo Carvalho de Melo
2017-09-05 20:37                 ` Andi Kleen
2017-09-08 18:10                   ` Arnaldo Carvalho de Melo
2017-09-08 19:08                     ` Andi Kleen
2017-09-11 14:05                       ` Arnaldo Carvalho de Melo
2017-09-22 16:37               ` [tip:perf/core] perf vendor events: Add JSON metrics for Broadwell tip-bot for Andi Kleen
2017-09-22 16:38               ` [tip:perf/core] perf vendor events: Add JSON metrics for Skylake tip-bot for Andi Kleen
2017-09-22 16:38               ` [tip:perf/core] perf vendor events: Add JSON metrics for Sandy Bridge tip-bot for Andi Kleen
2017-09-22 16:31   ` [tip:perf/core] perf stat: Support JSON metrics in perf stat tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 07/11] perf, tools, list: Add metric groups to perf list Andi Kleen
2017-09-22 16:31   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2017-10-13 14:50   ` [PATCH v3 07/11] perf, tools, " Arnaldo Carvalho de Melo
2017-08-31 19:40 ` [PATCH v3 08/11] perf, tools, stat: Don't use ctx for saved values lookup Andi Kleen
2017-09-22 16:31   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 09/11] perf, tools, stat: Support duration_time for metrics Andi Kleen
2017-09-22 16:32   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 10/11] perf, tools, stat: Hide internal duration_time counter Andi Kleen
2017-09-22 16:32   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2017-08-31 19:40 ` [PATCH v3 11/11] perf, tools, stat: Update walltime_nsecs_stats in interval mode Andi Kleen
2017-09-22 16:33   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2017-09-01 17:26 ` Support standalone metrics and metric groups for perf Jiri Olsa
2017-09-01 17:36   ` Jiri Olsa
2017-09-01 17:42   ` Andi Kleen
2017-09-01 17:50     ` Jiri Olsa
