* [PATCHSET 0/6] perf stat: Small random cleanups (v1)
@ 2022-09-26 20:07 Namhyung Kim
  2022-09-26 20:07 ` [PATCH 1/6] perf stat: Convert perf_stat_evsel.res_stats array Namhyung Kim
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Namhyung Kim @ 2022-09-26 20:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing

Hello,

While working on perf stat, I found some items to clean up.  This time
I removed the separate runtime stats for the per-thread aggregation
mode; we can simply use the thread map index to access the shadow stat
values in the global rt_stat.
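
To illustrate the idea, here is a minimal standalone sketch (made-up
names and types, not the actual perf code): a single global table can
serve every aggregation mode as long as it is indexed by whatever map
index the mode provides -- the thread map index in the AGGR_THREAD
case.

  #include <stdio.h>

  #define MAX_IDX 8

  struct shadow_entry {
  	double sum;	/* running sum of counter values */
  	int n;		/* number of updates */
  };

  /* one global table, indexed by a cpu/thread/aggr map index */
  static struct shadow_entry rt_stat_tbl[MAX_IDX];

  static void update_shadow(int map_idx, double count)
  {
  	rt_stat_tbl[map_idx].sum += count;
  	rt_stat_tbl[map_idx].n++;
  }

  static double shadow_avg(int map_idx)
  {
  	struct shadow_entry *e = &rt_stat_tbl[map_idx];

  	return e->n ? e->sum / e->n : 0.0;
  }

  int main(void)
  {
  	/* under AGGR_THREAD the "map index" is the thread map index */
  	update_shadow(0, 100.0);	/* thread 0 */
  	update_shadow(1, 300.0);	/* thread 1 */
  	printf("thread 0 avg: %.1f\n", shadow_avg(0));
  	printf("thread 1 avg: %.1f\n", shadow_avg(1));
  	return 0;
  }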

The code is available at 'perf/stat-cleanup-v1' branch in

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Thanks,
Namhyung


Namhyung Kim (6):
  perf stat: Convert perf_stat_evsel.res_stats array
  perf stat: Don't call perf_stat_evsel_id_init() repeatedly
  perf stat: Rename saved_value->cpu_map_idx
  perf stat: Use thread map index for shadow stat
  perf stat: Kill unused per-thread runtime stats
  perf stat: Don't compare runtime stat for shadow stats

 tools/perf/builtin-stat.c      |  54 ------
 tools/perf/util/stat-display.c |  22 ++-
 tools/perf/util/stat-shadow.c  | 320 ++++++++++++++++-----------------
 tools/perf/util/stat.c         |  20 +--
 tools/perf/util/stat.h         |   4 +-
 5 files changed, 171 insertions(+), 249 deletions(-)


base-commit: 62e64c9d2fd12839c02f1b3e8b873e7cb34e8720
-- 
2.37.3.998.g577e59143f-goog


* [PATCH 1/6] perf stat: Convert perf_stat_evsel.res_stats array
  2022-09-26 20:07 [PATCHSET 0/6] perf stat: Small random cleanups (v1) Namhyung Kim
@ 2022-09-26 20:07 ` Namhyung Kim
  2022-09-28 10:33   ` James Clark
  2022-09-26 20:07 ` [PATCH 2/6] perf stat: Don't call perf_stat_evsel_id_init() repeatedly Namhyung Kim
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2022-09-26 20:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing

Only the first member of the res_stats array is actually read (by
stddev_stats() when printing the noise), so there is no need to keep
it as an array.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/stat-display.c |  2 +-
 tools/perf/util/stat.c         | 10 +++-------
 tools/perf/util/stat.h         |  2 +-
 3 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index b82844cb0ce7..234491f43c36 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -67,7 +67,7 @@ static void print_noise(struct perf_stat_config *config,
 		return;
 
 	ps = evsel->stats;
-	print_noise_pct(config, stddev_stats(&ps->res_stats[0]), avg);
+	print_noise_pct(config, stddev_stats(&ps->res_stats), avg);
 }
 
 static void print_cgroup(struct perf_stat_config *config, struct evsel *evsel)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..6bcd3dc32a71 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -132,12 +132,9 @@ static void perf_stat_evsel_id_init(struct evsel *evsel)
 
 static void evsel__reset_stat_priv(struct evsel *evsel)
 {
-	int i;
 	struct perf_stat_evsel *ps = evsel->stats;
 
-	for (i = 0; i < 3; i++)
-		init_stats(&ps->res_stats[i]);
-
+	init_stats(&ps->res_stats);
 	perf_stat_evsel_id_init(evsel);
 }
 
@@ -440,7 +437,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 	struct perf_counts_values *aggr = &counter->counts->aggr;
 	struct perf_stat_evsel *ps = counter->stats;
 	u64 *count = counter->counts->aggr.values;
-	int i, ret;
+	int ret;
 
 	aggr->val = aggr->ena = aggr->run = 0;
 
@@ -458,8 +455,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 		evsel__compute_deltas(counter, -1, -1, aggr);
 	perf_counts_values__scale(aggr, config->scale, &counter->counts->scaled);
 
-	for (i = 0; i < 3; i++)
-		update_stats(&ps->res_stats[i], count[i]);
+	update_stats(&ps->res_stats, *count);
 
 	if (verbose > 0) {
 		fprintf(config->output, "%s: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 72713b344b79..3eba38a1a149 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -43,7 +43,7 @@ enum perf_stat_evsel_id {
 };
 
 struct perf_stat_evsel {
-	struct stats		 res_stats[3];
+	struct stats		 res_stats;
 	enum perf_stat_evsel_id	 id;
 	u64			*group_data;
 };
-- 
2.37.3.998.g577e59143f-goog


* [PATCH 2/6] perf stat: Don't call perf_stat_evsel_id_init() repeatedly
  2022-09-26 20:07 [PATCHSET 0/6] perf stat: Small random cleanups (v1) Namhyung Kim
  2022-09-26 20:07 ` [PATCH 1/6] perf stat: Convert perf_stat_evsel.res_stats array Namhyung Kim
@ 2022-09-26 20:07 ` Namhyung Kim
  2022-09-28 10:41   ` James Clark
  2022-09-26 20:07 ` [PATCH 3/6] perf stat: Rename saved_value->cpu_map_idx Namhyung Kim
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2022-09-26 20:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing

evsel__reset_stat_priv() is called more than once if the user gave the
-r option for multiple runs.  But it doesn't need to re-initialize the
id every time, so move perf_stat_evsel_id_init() to the one-time
allocation path.
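
The pattern, as a minimal standalone sketch (hypothetical simplified
types, not the perf source): one-time setup belongs in the allocation
path, while the reset path only clears per-run state.

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  struct stat_priv {
  	int id;			/* one-time: fixed once the event is known */
  	double per_run[3];	/* cleared before every run */
  };

  static void reset_priv(struct stat_priv *ps)
  {
  	/* invoked once per run with -r; must not redo one-time setup */
  	memset(ps->per_run, 0, sizeof(ps->per_run));
  }

  static struct stat_priv *alloc_priv(int id)
  {
  	struct stat_priv *ps = calloc(1, sizeof(*ps));

  	if (ps) {
  		ps->id = id;	/* like perf_stat_evsel_id_init() */
  		reset_priv(ps);
  	}
  	return ps;
  }

  int main(void)
  {
  	struct stat_priv *ps = alloc_priv(42);

  	if (!ps)
  		return 1;
  	/* e.g. 'perf stat -r 3 ...' resets per run, allocates once */
  	reset_priv(ps);
  	reset_priv(ps);
  	printf("id stays %d across resets\n", ps->id);
  	free(ps);
  	return 0;
  }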

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/stat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 6bcd3dc32a71..e1d3152ce664 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -135,7 +135,6 @@ static void evsel__reset_stat_priv(struct evsel *evsel)
 	struct perf_stat_evsel *ps = evsel->stats;
 
 	init_stats(&ps->res_stats);
-	perf_stat_evsel_id_init(evsel);
 }
 
 static int evsel__alloc_stat_priv(struct evsel *evsel)
@@ -143,6 +142,7 @@ static int evsel__alloc_stat_priv(struct evsel *evsel)
 	evsel->stats = zalloc(sizeof(struct perf_stat_evsel));
 	if (evsel->stats == NULL)
 		return -ENOMEM;
+	perf_stat_evsel_id_init(evsel);
 	evsel__reset_stat_priv(evsel);
 	return 0;
 }
-- 
2.37.3.998.g577e59143f-goog


* [PATCH 3/6] perf stat: Rename saved_value->cpu_map_idx
  2022-09-26 20:07 [PATCHSET 0/6] perf stat: Small random cleanups (v1) Namhyung Kim
  2022-09-26 20:07 ` [PATCH 1/6] perf stat: Convert perf_stat_evsel.res_stats array Namhyung Kim
  2022-09-26 20:07 ` [PATCH 2/6] perf stat: Don't call perf_stat_evsel_id_init() repeatedly Namhyung Kim
@ 2022-09-26 20:07 ` Namhyung Kim
  2022-09-28 10:50   ` James Clark
  2022-09-26 20:07 ` [PATCH 4/6] perf stat: Use thread map index for shadow stat Namhyung Kim
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2022-09-26 20:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing

The cpu_map_idx field is there just to differentiate values from other
entries; it doesn't strictly need to be a CPU map index.  In fact, a
thread map index or an aggr map index can be passed as well.  So rename
the field (and the matching function parameters) to map_idx first.

No functional change intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/stat-shadow.c | 308 +++++++++++++++++-----------------
 1 file changed, 154 insertions(+), 154 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 9e1eddeff21b..99d05262055c 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -33,7 +33,7 @@ struct saved_value {
 	struct evsel *evsel;
 	enum stat_type type;
 	int ctx;
-	int cpu_map_idx;
+	int map_idx;
 	struct cgroup *cgrp;
 	struct runtime_stat *stat;
 	struct stats stats;
@@ -48,8 +48,8 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
 					     rb_node);
 	const struct saved_value *b = entry;
 
-	if (a->cpu_map_idx != b->cpu_map_idx)
-		return a->cpu_map_idx - b->cpu_map_idx;
+	if (a->map_idx != b->map_idx)
+		return a->map_idx - b->map_idx;
 
 	/*
 	 * Previously the rbtree was used to link generic metrics.
@@ -106,7 +106,7 @@ static void saved_value_delete(struct rblist *rblist __maybe_unused,
 }
 
 static struct saved_value *saved_value_lookup(struct evsel *evsel,
-					      int cpu_map_idx,
+					      int map_idx,
 					      bool create,
 					      enum stat_type type,
 					      int ctx,
@@ -116,7 +116,7 @@ static struct saved_value *saved_value_lookup(struct evsel *evsel,
 	struct rblist *rblist;
 	struct rb_node *nd;
 	struct saved_value dm = {
-		.cpu_map_idx = cpu_map_idx,
+		.map_idx = map_idx,
 		.evsel = evsel,
 		.type = type,
 		.ctx = ctx,
@@ -215,10 +215,10 @@ struct runtime_stat_data {
 
 static void update_runtime_stat(struct runtime_stat *st,
 				enum stat_type type,
-				int cpu_map_idx, u64 count,
+				int map_idx, u64 count,
 				struct runtime_stat_data *rsd)
 {
-	struct saved_value *v = saved_value_lookup(NULL, cpu_map_idx, true, type,
+	struct saved_value *v = saved_value_lookup(NULL, map_idx, true, type,
 						   rsd->ctx, st, rsd->cgrp);
 
 	if (v)
@@ -231,7 +231,7 @@ static void update_runtime_stat(struct runtime_stat *st,
  * instruction rates, etc:
  */
 void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
-				    int cpu_map_idx, struct runtime_stat *st)
+				    int map_idx, struct runtime_stat *st)
 {
 	u64 count_ns = count;
 	struct saved_value *v;
@@ -243,88 +243,88 @@ void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
 	count *= counter->scale;
 
 	if (evsel__is_clock(counter))
-		update_runtime_stat(st, STAT_NSECS, cpu_map_idx, count_ns, &rsd);
+		update_runtime_stat(st, STAT_NSECS, map_idx, count_ns, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
-		update_runtime_stat(st, STAT_CYCLES, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_CYCLES, map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
-		update_runtime_stat(st, STAT_CYCLES_IN_TX, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_CYCLES_IN_TX, map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TRANSACTION_START))
-		update_runtime_stat(st, STAT_TRANSACTION, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_TRANSACTION, map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, ELISION_START))
-		update_runtime_stat(st, STAT_ELISION, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_ELISION, map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_TOTAL_SLOTS))
 		update_runtime_stat(st, STAT_TOPDOWN_TOTAL_SLOTS,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_ISSUED))
 		update_runtime_stat(st, STAT_TOPDOWN_SLOTS_ISSUED,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_RETIRED))
 		update_runtime_stat(st, STAT_TOPDOWN_SLOTS_RETIRED,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_BUBBLES))
 		update_runtime_stat(st, STAT_TOPDOWN_FETCH_BUBBLES,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_RECOVERY_BUBBLES))
 		update_runtime_stat(st, STAT_TOPDOWN_RECOVERY_BUBBLES,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_RETIRING))
 		update_runtime_stat(st, STAT_TOPDOWN_RETIRING,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_BAD_SPEC))
 		update_runtime_stat(st, STAT_TOPDOWN_BAD_SPEC,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_FE_BOUND))
 		update_runtime_stat(st, STAT_TOPDOWN_FE_BOUND,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_BE_BOUND))
 		update_runtime_stat(st, STAT_TOPDOWN_BE_BOUND,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_HEAVY_OPS))
 		update_runtime_stat(st, STAT_TOPDOWN_HEAVY_OPS,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_BR_MISPREDICT))
 		update_runtime_stat(st, STAT_TOPDOWN_BR_MISPREDICT,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_LAT))
 		update_runtime_stat(st, STAT_TOPDOWN_FETCH_LAT,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_MEM_BOUND))
 		update_runtime_stat(st, STAT_TOPDOWN_MEM_BOUND,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
 		update_runtime_stat(st, STAT_STALLED_CYCLES_FRONT,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_BACKEND))
 		update_runtime_stat(st, STAT_STALLED_CYCLES_BACK,
-				    cpu_map_idx, count, &rsd);
+				    map_idx, count, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_BRANCH_INSTRUCTIONS))
-		update_runtime_stat(st, STAT_BRANCHES, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_BRANCHES, map_idx, count, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_CACHE_REFERENCES))
-		update_runtime_stat(st, STAT_CACHEREFS, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_CACHEREFS, map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_L1D))
-		update_runtime_stat(st, STAT_L1_DCACHE, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_L1_DCACHE, map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_L1I))
-		update_runtime_stat(st, STAT_L1_ICACHE, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_L1_ICACHE, map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_LL))
-		update_runtime_stat(st, STAT_LL_CACHE, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_LL_CACHE, map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_DTLB))
-		update_runtime_stat(st, STAT_DTLB_CACHE, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_DTLB_CACHE, map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
-		update_runtime_stat(st, STAT_ITLB_CACHE, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_ITLB_CACHE, map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, SMI_NUM))
-		update_runtime_stat(st, STAT_SMI_NUM, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_SMI_NUM, map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, APERF))
-		update_runtime_stat(st, STAT_APERF, cpu_map_idx, count, &rsd);
+		update_runtime_stat(st, STAT_APERF, map_idx, count, &rsd);
 
 	if (counter->collect_stat) {
-		v = saved_value_lookup(counter, cpu_map_idx, true, STAT_NONE, 0, st,
+		v = saved_value_lookup(counter, map_idx, true, STAT_NONE, 0, st,
 				       rsd.cgrp);
 		update_stats(&v->stats, count);
 		if (counter->metric_leader)
 			v->metric_total += count;
 	} else if (counter->metric_leader) {
 		v = saved_value_lookup(counter->metric_leader,
-				       cpu_map_idx, true, STAT_NONE, 0, st, rsd.cgrp);
+				       map_idx, true, STAT_NONE, 0, st, rsd.cgrp);
 		v->metric_total += count;
 		v->metric_other++;
 	}
@@ -466,12 +466,12 @@ void perf_stat__collect_metric_expr(struct evlist *evsel_list)
 }
 
 static double runtime_stat_avg(struct runtime_stat *st,
-			       enum stat_type type, int cpu_map_idx,
+			       enum stat_type type, int map_idx,
 			       struct runtime_stat_data *rsd)
 {
 	struct saved_value *v;
 
-	v = saved_value_lookup(NULL, cpu_map_idx, false, type, rsd->ctx, st, rsd->cgrp);
+	v = saved_value_lookup(NULL, map_idx, false, type, rsd->ctx, st, rsd->cgrp);
 	if (!v)
 		return 0.0;
 
@@ -479,12 +479,12 @@ static double runtime_stat_avg(struct runtime_stat *st,
 }
 
 static double runtime_stat_n(struct runtime_stat *st,
-			     enum stat_type type, int cpu_map_idx,
+			     enum stat_type type, int map_idx,
 			     struct runtime_stat_data *rsd)
 {
 	struct saved_value *v;
 
-	v = saved_value_lookup(NULL, cpu_map_idx, false, type, rsd->ctx, st, rsd->cgrp);
+	v = saved_value_lookup(NULL, map_idx, false, type, rsd->ctx, st, rsd->cgrp);
 	if (!v)
 		return 0.0;
 
@@ -492,7 +492,7 @@ static double runtime_stat_n(struct runtime_stat *st,
 }
 
 static void print_stalled_cycles_frontend(struct perf_stat_config *config,
-					  int cpu_map_idx, double avg,
+					  int map_idx, double avg,
 					  struct perf_stat_output_ctx *out,
 					  struct runtime_stat *st,
 					  struct runtime_stat_data *rsd)
@@ -500,7 +500,7 @@ static void print_stalled_cycles_frontend(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_CYCLES, map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -515,7 +515,7 @@ static void print_stalled_cycles_frontend(struct perf_stat_config *config,
 }
 
 static void print_stalled_cycles_backend(struct perf_stat_config *config,
-					 int cpu_map_idx, double avg,
+					 int map_idx, double avg,
 					 struct perf_stat_output_ctx *out,
 					 struct runtime_stat *st,
 					 struct runtime_stat_data *rsd)
@@ -523,7 +523,7 @@ static void print_stalled_cycles_backend(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_CYCLES, map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -534,7 +534,7 @@ static void print_stalled_cycles_backend(struct perf_stat_config *config,
 }
 
 static void print_branch_misses(struct perf_stat_config *config,
-				int cpu_map_idx, double avg,
+				int map_idx, double avg,
 				struct perf_stat_output_ctx *out,
 				struct runtime_stat *st,
 				struct runtime_stat_data *rsd)
@@ -542,7 +542,7 @@ static void print_branch_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_BRANCHES, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_BRANCHES, map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -553,7 +553,7 @@ static void print_branch_misses(struct perf_stat_config *config,
 }
 
 static void print_l1_dcache_misses(struct perf_stat_config *config,
-				   int cpu_map_idx, double avg,
+				   int map_idx, double avg,
 				   struct perf_stat_output_ctx *out,
 				   struct runtime_stat *st,
 				   struct runtime_stat_data *rsd)
@@ -561,7 +561,7 @@ static void print_l1_dcache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_L1_DCACHE, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_L1_DCACHE, map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -572,7 +572,7 @@ static void print_l1_dcache_misses(struct perf_stat_config *config,
 }
 
 static void print_l1_icache_misses(struct perf_stat_config *config,
-				   int cpu_map_idx, double avg,
+				   int map_idx, double avg,
 				   struct perf_stat_output_ctx *out,
 				   struct runtime_stat *st,
 				   struct runtime_stat_data *rsd)
@@ -580,7 +580,7 @@ static void print_l1_icache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_L1_ICACHE, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_L1_ICACHE, map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -590,7 +590,7 @@ static void print_l1_icache_misses(struct perf_stat_config *config,
 }
 
 static void print_dtlb_cache_misses(struct perf_stat_config *config,
-				    int cpu_map_idx, double avg,
+				    int map_idx, double avg,
 				    struct perf_stat_output_ctx *out,
 				    struct runtime_stat *st,
 				    struct runtime_stat_data *rsd)
@@ -598,7 +598,7 @@ static void print_dtlb_cache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_DTLB_CACHE, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_DTLB_CACHE, map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -608,7 +608,7 @@ static void print_dtlb_cache_misses(struct perf_stat_config *config,
 }
 
 static void print_itlb_cache_misses(struct perf_stat_config *config,
-				    int cpu_map_idx, double avg,
+				    int map_idx, double avg,
 				    struct perf_stat_output_ctx *out,
 				    struct runtime_stat *st,
 				    struct runtime_stat_data *rsd)
@@ -616,7 +616,7 @@ static void print_itlb_cache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_ITLB_CACHE, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_ITLB_CACHE, map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -626,7 +626,7 @@ static void print_itlb_cache_misses(struct perf_stat_config *config,
 }
 
 static void print_ll_cache_misses(struct perf_stat_config *config,
-				  int cpu_map_idx, double avg,
+				  int map_idx, double avg,
 				  struct perf_stat_output_ctx *out,
 				  struct runtime_stat *st,
 				  struct runtime_stat_data *rsd)
@@ -634,7 +634,7 @@ static void print_ll_cache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_LL_CACHE, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_LL_CACHE, map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -692,61 +692,61 @@ static double sanitize_val(double x)
 	return x;
 }
 
-static double td_total_slots(int cpu_map_idx, struct runtime_stat *st,
+static double td_total_slots(int map_idx, struct runtime_stat *st,
 			     struct runtime_stat_data *rsd)
 {
-	return runtime_stat_avg(st, STAT_TOPDOWN_TOTAL_SLOTS, cpu_map_idx, rsd);
+	return runtime_stat_avg(st, STAT_TOPDOWN_TOTAL_SLOTS, map_idx, rsd);
 }
 
-static double td_bad_spec(int cpu_map_idx, struct runtime_stat *st,
+static double td_bad_spec(int map_idx, struct runtime_stat *st,
 			  struct runtime_stat_data *rsd)
 {
 	double bad_spec = 0;
 	double total_slots;
 	double total;
 
-	total = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_ISSUED, cpu_map_idx, rsd) -
-		runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED, cpu_map_idx, rsd) +
-		runtime_stat_avg(st, STAT_TOPDOWN_RECOVERY_BUBBLES, cpu_map_idx, rsd);
+	total = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_ISSUED, map_idx, rsd) -
+		runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED, map_idx, rsd) +
+		runtime_stat_avg(st, STAT_TOPDOWN_RECOVERY_BUBBLES, map_idx, rsd);
 
-	total_slots = td_total_slots(cpu_map_idx, st, rsd);
+	total_slots = td_total_slots(map_idx, st, rsd);
 	if (total_slots)
 		bad_spec = total / total_slots;
 	return sanitize_val(bad_spec);
 }
 
-static double td_retiring(int cpu_map_idx, struct runtime_stat *st,
+static double td_retiring(int map_idx, struct runtime_stat *st,
 			  struct runtime_stat_data *rsd)
 {
 	double retiring = 0;
-	double total_slots = td_total_slots(cpu_map_idx, st, rsd);
+	double total_slots = td_total_slots(map_idx, st, rsd);
 	double ret_slots = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED,
-					    cpu_map_idx, rsd);
+					    map_idx, rsd);
 
 	if (total_slots)
 		retiring = ret_slots / total_slots;
 	return retiring;
 }
 
-static double td_fe_bound(int cpu_map_idx, struct runtime_stat *st,
+static double td_fe_bound(int map_idx, struct runtime_stat *st,
 			  struct runtime_stat_data *rsd)
 {
 	double fe_bound = 0;
-	double total_slots = td_total_slots(cpu_map_idx, st, rsd);
+	double total_slots = td_total_slots(map_idx, st, rsd);
 	double fetch_bub = runtime_stat_avg(st, STAT_TOPDOWN_FETCH_BUBBLES,
-					    cpu_map_idx, rsd);
+					    map_idx, rsd);
 
 	if (total_slots)
 		fe_bound = fetch_bub / total_slots;
 	return fe_bound;
 }
 
-static double td_be_bound(int cpu_map_idx, struct runtime_stat *st,
+static double td_be_bound(int map_idx, struct runtime_stat *st,
 			  struct runtime_stat_data *rsd)
 {
-	double sum = (td_fe_bound(cpu_map_idx, st, rsd) +
-		      td_bad_spec(cpu_map_idx, st, rsd) +
-		      td_retiring(cpu_map_idx, st, rsd));
+	double sum = (td_fe_bound(map_idx, st, rsd) +
+		      td_bad_spec(map_idx, st, rsd) +
+		      td_retiring(map_idx, st, rsd));
 	if (sum == 0)
 		return 0;
 	return sanitize_val(1.0 - sum);
@@ -757,15 +757,15 @@ static double td_be_bound(int cpu_map_idx, struct runtime_stat *st,
  * the ratios we need to recreate the sum.
  */
 
-static double td_metric_ratio(int cpu_map_idx, enum stat_type type,
+static double td_metric_ratio(int map_idx, enum stat_type type,
 			      struct runtime_stat *stat,
 			      struct runtime_stat_data *rsd)
 {
-	double sum = runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, cpu_map_idx, rsd) +
-		runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, cpu_map_idx, rsd) +
-		runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, cpu_map_idx, rsd) +
-		runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, cpu_map_idx, rsd);
-	double d = runtime_stat_avg(stat, type, cpu_map_idx, rsd);
+	double sum = runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, map_idx, rsd) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, map_idx, rsd) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, map_idx, rsd) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, map_idx, rsd);
+	double d = runtime_stat_avg(stat, type, map_idx, rsd);
 
 	if (sum)
 		return d / sum;
@@ -777,23 +777,23 @@ static double td_metric_ratio(int cpu_map_idx, enum stat_type type,
  * We allow two missing.
  */
 
-static bool full_td(int cpu_map_idx, struct runtime_stat *stat,
+static bool full_td(int map_idx, struct runtime_stat *stat,
 		    struct runtime_stat_data *rsd)
 {
 	int c = 0;
 
-	if (runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, cpu_map_idx, rsd) > 0)
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, map_idx, rsd) > 0)
 		c++;
-	if (runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, cpu_map_idx, rsd) > 0)
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, map_idx, rsd) > 0)
 		c++;
-	if (runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, cpu_map_idx, rsd) > 0)
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, map_idx, rsd) > 0)
 		c++;
-	if (runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, cpu_map_idx, rsd) > 0)
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, map_idx, rsd) > 0)
 		c++;
 	return c >= 2;
 }
 
-static void print_smi_cost(struct perf_stat_config *config, int cpu_map_idx,
+static void print_smi_cost(struct perf_stat_config *config, int map_idx,
 			   struct perf_stat_output_ctx *out,
 			   struct runtime_stat *st,
 			   struct runtime_stat_data *rsd)
@@ -801,9 +801,9 @@ static void print_smi_cost(struct perf_stat_config *config, int cpu_map_idx,
 	double smi_num, aperf, cycles, cost = 0.0;
 	const char *color = NULL;
 
-	smi_num = runtime_stat_avg(st, STAT_SMI_NUM, cpu_map_idx, rsd);
-	aperf = runtime_stat_avg(st, STAT_APERF, cpu_map_idx, rsd);
-	cycles = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
+	smi_num = runtime_stat_avg(st, STAT_SMI_NUM, map_idx, rsd);
+	aperf = runtime_stat_avg(st, STAT_APERF, map_idx, rsd);
+	cycles = runtime_stat_avg(st, STAT_CYCLES, map_idx, rsd);
 
 	if ((cycles == 0) || (aperf == 0))
 		return;
@@ -820,7 +820,7 @@ static void print_smi_cost(struct perf_stat_config *config, int cpu_map_idx,
 static int prepare_metric(struct evsel **metric_events,
 			  struct metric_ref *metric_refs,
 			  struct expr_parse_ctx *pctx,
-			  int cpu_map_idx,
+			  int map_idx,
 			  struct runtime_stat *st)
 {
 	double scale;
@@ -859,7 +859,7 @@ static int prepare_metric(struct evsel **metric_events,
 				abort();
 			}
 		} else {
-			v = saved_value_lookup(metric_events[i], cpu_map_idx, false,
+			v = saved_value_lookup(metric_events[i], map_idx, false,
 					       STAT_NONE, 0, st,
 					       metric_events[i]->cgrp);
 			if (!v)
@@ -897,7 +897,7 @@ static void generic_metric(struct perf_stat_config *config,
 			   const char *metric_name,
 			   const char *metric_unit,
 			   int runtime,
-			   int cpu_map_idx,
+			   int map_idx,
 			   struct perf_stat_output_ctx *out,
 			   struct runtime_stat *st)
 {
@@ -915,7 +915,7 @@ static void generic_metric(struct perf_stat_config *config,
 		pctx->sctx.user_requested_cpu_list = strdup(config->user_requested_cpu_list);
 	pctx->sctx.runtime = runtime;
 	pctx->sctx.system_wide = config->system_wide;
-	i = prepare_metric(metric_events, metric_refs, pctx, cpu_map_idx, st);
+	i = prepare_metric(metric_events, metric_refs, pctx, map_idx, st);
 	if (i < 0) {
 		expr__ctx_free(pctx);
 		return;
@@ -960,7 +960,7 @@ static void generic_metric(struct perf_stat_config *config,
 	expr__ctx_free(pctx);
 }
 
-double test_generic_metric(struct metric_expr *mexp, int cpu_map_idx, struct runtime_stat *st)
+double test_generic_metric(struct metric_expr *mexp, int map_idx, struct runtime_stat *st)
 {
 	struct expr_parse_ctx *pctx;
 	double ratio = 0.0;
@@ -969,7 +969,7 @@ double test_generic_metric(struct metric_expr *mexp, int cpu_map_idx, struct run
 	if (!pctx)
 		return NAN;
 
-	if (prepare_metric(mexp->metric_events, mexp->metric_refs, pctx, cpu_map_idx, st) < 0)
+	if (prepare_metric(mexp->metric_events, mexp->metric_refs, pctx, map_idx, st) < 0)
 		goto out;
 
 	if (expr__parse(&ratio, pctx, mexp->metric_expr))
@@ -982,7 +982,7 @@ double test_generic_metric(struct metric_expr *mexp, int cpu_map_idx, struct run
 
 void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				   struct evsel *evsel,
-				   double avg, int cpu_map_idx,
+				   double avg, int map_idx,
 				   struct perf_stat_output_ctx *out,
 				   struct rblist *metric_events,
 				   struct runtime_stat *st)
@@ -1001,7 +1001,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 	if (config->iostat_run) {
 		iostat_print_metric(config, evsel, out);
 	} else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
-		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES, map_idx, &rsd);
 
 		if (total) {
 			ratio = avg / total;
@@ -1011,11 +1011,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			print_metric(config, ctxp, NULL, NULL, "insn per cycle", 0);
 		}
 
-		total = runtime_stat_avg(st, STAT_STALLED_CYCLES_FRONT, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_STALLED_CYCLES_FRONT, map_idx, &rsd);
 
 		total = max(total, runtime_stat_avg(st,
 						    STAT_STALLED_CYCLES_BACK,
-						    cpu_map_idx, &rsd));
+						    map_idx, &rsd));
 
 		if (total && avg) {
 			out->new_line(config, ctxp);
@@ -1025,8 +1025,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					ratio);
 		}
 	} else if (evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
-		if (runtime_stat_n(st, STAT_BRANCHES, cpu_map_idx, &rsd) != 0)
-			print_branch_misses(config, cpu_map_idx, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_BRANCHES, map_idx, &rsd) != 0)
+			print_branch_misses(config, map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all branches", 0);
 	} else if (
@@ -1035,8 +1035,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_L1_DCACHE, cpu_map_idx, &rsd) != 0)
-			print_l1_dcache_misses(config, cpu_map_idx, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_L1_DCACHE, map_idx, &rsd) != 0)
+			print_l1_dcache_misses(config, map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all L1-dcache accesses", 0);
 	} else if (
@@ -1045,8 +1045,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_L1_ICACHE, cpu_map_idx, &rsd) != 0)
-			print_l1_icache_misses(config, cpu_map_idx, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_L1_ICACHE, map_idx, &rsd) != 0)
+			print_l1_icache_misses(config, map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all L1-icache accesses", 0);
 	} else if (
@@ -1055,8 +1055,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_DTLB_CACHE, cpu_map_idx, &rsd) != 0)
-			print_dtlb_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_DTLB_CACHE, map_idx, &rsd) != 0)
+			print_dtlb_cache_misses(config, map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all dTLB cache accesses", 0);
 	} else if (
@@ -1065,8 +1065,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_ITLB_CACHE, cpu_map_idx, &rsd) != 0)
-			print_itlb_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_ITLB_CACHE, map_idx, &rsd) != 0)
+			print_itlb_cache_misses(config, map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all iTLB cache accesses", 0);
 	} else if (
@@ -1075,27 +1075,27 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_LL_CACHE, cpu_map_idx, &rsd) != 0)
-			print_ll_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_LL_CACHE, map_idx, &rsd) != 0)
+			print_ll_cache_misses(config, map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all LL-cache accesses", 0);
 	} else if (evsel__match(evsel, HARDWARE, HW_CACHE_MISSES)) {
-		total = runtime_stat_avg(st, STAT_CACHEREFS, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_CACHEREFS, map_idx, &rsd);
 
 		if (total)
 			ratio = avg * 100 / total;
 
-		if (runtime_stat_n(st, STAT_CACHEREFS, cpu_map_idx, &rsd) != 0)
+		if (runtime_stat_n(st, STAT_CACHEREFS, map_idx, &rsd) != 0)
 			print_metric(config, ctxp, NULL, "%8.3f %%",
 				     "of all cache refs", ratio);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all cache refs", 0);
 	} else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) {
-		print_stalled_cycles_frontend(config, cpu_map_idx, avg, out, st, &rsd);
+		print_stalled_cycles_frontend(config, map_idx, avg, out, st, &rsd);
 	} else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND)) {
-		print_stalled_cycles_backend(config, cpu_map_idx, avg, out, st, &rsd);
+		print_stalled_cycles_backend(config, map_idx, avg, out, st, &rsd);
 	} else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
-		total = runtime_stat_avg(st, STAT_NSECS, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_NSECS, map_idx, &rsd);
 
 		if (total) {
 			ratio = avg / total;
@@ -1104,7 +1104,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			print_metric(config, ctxp, NULL, NULL, "Ghz", 0);
 		}
 	} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX)) {
-		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES, map_idx, &rsd);
 
 		if (total)
 			print_metric(config, ctxp, NULL,
@@ -1114,8 +1114,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			print_metric(config, ctxp, NULL, NULL, "transactional cycles",
 				     0);
 	} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX_CP)) {
-		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
-		total2 = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES, map_idx, &rsd);
+		total2 = runtime_stat_avg(st, STAT_CYCLES_IN_TX, map_idx, &rsd);
 
 		if (total2 < avg)
 			total2 = avg;
@@ -1125,19 +1125,19 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		else
 			print_metric(config, ctxp, NULL, NULL, "aborted cycles", 0);
 	} else if (perf_stat_evsel__is(evsel, TRANSACTION_START)) {
-		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, map_idx, &rsd);
 
 		if (avg)
 			ratio = total / avg;
 
-		if (runtime_stat_n(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd) != 0)
+		if (runtime_stat_n(st, STAT_CYCLES_IN_TX, map_idx, &rsd) != 0)
 			print_metric(config, ctxp, NULL, "%8.0f",
 				     "cycles / transaction", ratio);
 		else
 			print_metric(config, ctxp, NULL, NULL, "cycles / transaction",
 				      0);
 	} else if (perf_stat_evsel__is(evsel, ELISION_START)) {
-		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, map_idx, &rsd);
 
 		if (avg)
 			ratio = total / avg;
@@ -1150,28 +1150,28 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		else
 			print_metric(config, ctxp, NULL, NULL, "CPUs utilized", 0);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_BUBBLES)) {
-		double fe_bound = td_fe_bound(cpu_map_idx, st, &rsd);
+		double fe_bound = td_fe_bound(map_idx, st, &rsd);
 
 		if (fe_bound > 0.2)
 			color = PERF_COLOR_RED;
 		print_metric(config, ctxp, color, "%8.1f%%", "frontend bound",
 				fe_bound * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_RETIRED)) {
-		double retiring = td_retiring(cpu_map_idx, st, &rsd);
+		double retiring = td_retiring(map_idx, st, &rsd);
 
 		if (retiring > 0.7)
 			color = PERF_COLOR_GREEN;
 		print_metric(config, ctxp, color, "%8.1f%%", "retiring",
 				retiring * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_RECOVERY_BUBBLES)) {
-		double bad_spec = td_bad_spec(cpu_map_idx, st, &rsd);
+		double bad_spec = td_bad_spec(map_idx, st, &rsd);
 
 		if (bad_spec > 0.1)
 			color = PERF_COLOR_RED;
 		print_metric(config, ctxp, color, "%8.1f%%", "bad speculation",
 				bad_spec * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_ISSUED)) {
-		double be_bound = td_be_bound(cpu_map_idx, st, &rsd);
+		double be_bound = td_be_bound(map_idx, st, &rsd);
 		const char *name = "backend bound";
 		static int have_recovery_bubbles = -1;
 
@@ -1184,14 +1184,14 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 
 		if (be_bound > 0.2)
 			color = PERF_COLOR_RED;
-		if (td_total_slots(cpu_map_idx, st, &rsd) > 0)
+		if (td_total_slots(map_idx, st, &rsd) > 0)
 			print_metric(config, ctxp, color, "%8.1f%%", name,
 					be_bound * 100.);
 		else
 			print_metric(config, ctxp, NULL, NULL, name, 0);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_RETIRING) &&
-		   full_td(cpu_map_idx, st, &rsd)) {
-		double retiring = td_metric_ratio(cpu_map_idx,
+		   full_td(map_idx, st, &rsd)) {
+		double retiring = td_metric_ratio(map_idx,
 						  STAT_TOPDOWN_RETIRING, st,
 						  &rsd);
 		if (retiring > 0.7)
@@ -1199,8 +1199,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "Retiring",
 				retiring * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FE_BOUND) &&
-		   full_td(cpu_map_idx, st, &rsd)) {
-		double fe_bound = td_metric_ratio(cpu_map_idx,
+		   full_td(map_idx, st, &rsd)) {
+		double fe_bound = td_metric_ratio(map_idx,
 						  STAT_TOPDOWN_FE_BOUND, st,
 						  &rsd);
 		if (fe_bound > 0.2)
@@ -1208,8 +1208,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "Frontend Bound",
 				fe_bound * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BE_BOUND) &&
-		   full_td(cpu_map_idx, st, &rsd)) {
-		double be_bound = td_metric_ratio(cpu_map_idx,
+		   full_td(map_idx, st, &rsd)) {
+		double be_bound = td_metric_ratio(map_idx,
 						  STAT_TOPDOWN_BE_BOUND, st,
 						  &rsd);
 		if (be_bound > 0.2)
@@ -1217,8 +1217,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "Backend Bound",
 				be_bound * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BAD_SPEC) &&
-		   full_td(cpu_map_idx, st, &rsd)) {
-		double bad_spec = td_metric_ratio(cpu_map_idx,
+		   full_td(map_idx, st, &rsd)) {
+		double bad_spec = td_metric_ratio(map_idx,
 						  STAT_TOPDOWN_BAD_SPEC, st,
 						  &rsd);
 		if (bad_spec > 0.1)
@@ -1226,11 +1226,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "Bad Speculation",
 				bad_spec * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_HEAVY_OPS) &&
-			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
-		double retiring = td_metric_ratio(cpu_map_idx,
+			full_td(map_idx, st, &rsd) && (config->topdown_level > 1)) {
+		double retiring = td_metric_ratio(map_idx,
 						  STAT_TOPDOWN_RETIRING, st,
 						  &rsd);
-		double heavy_ops = td_metric_ratio(cpu_map_idx,
+		double heavy_ops = td_metric_ratio(map_idx,
 						   STAT_TOPDOWN_HEAVY_OPS, st,
 						   &rsd);
 		double light_ops = retiring - heavy_ops;
@@ -1246,11 +1246,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "Light Operations",
 				light_ops * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BR_MISPREDICT) &&
-			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
-		double bad_spec = td_metric_ratio(cpu_map_idx,
+			full_td(map_idx, st, &rsd) && (config->topdown_level > 1)) {
+		double bad_spec = td_metric_ratio(map_idx,
 						  STAT_TOPDOWN_BAD_SPEC, st,
 						  &rsd);
-		double br_mis = td_metric_ratio(cpu_map_idx,
+		double br_mis = td_metric_ratio(map_idx,
 						STAT_TOPDOWN_BR_MISPREDICT, st,
 						&rsd);
 		double m_clears = bad_spec - br_mis;
@@ -1266,11 +1266,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "Machine Clears",
 				m_clears * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_LAT) &&
-			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
-		double fe_bound = td_metric_ratio(cpu_map_idx,
+			full_td(map_idx, st, &rsd) && (config->topdown_level > 1)) {
+		double fe_bound = td_metric_ratio(map_idx,
 						  STAT_TOPDOWN_FE_BOUND, st,
 						  &rsd);
-		double fetch_lat = td_metric_ratio(cpu_map_idx,
+		double fetch_lat = td_metric_ratio(map_idx,
 						   STAT_TOPDOWN_FETCH_LAT, st,
 						   &rsd);
 		double fetch_bw = fe_bound - fetch_lat;
@@ -1286,11 +1286,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "Fetch Bandwidth",
 				fetch_bw * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_MEM_BOUND) &&
-			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
-		double be_bound = td_metric_ratio(cpu_map_idx,
+			full_td(map_idx, st, &rsd) && (config->topdown_level > 1)) {
+		double be_bound = td_metric_ratio(map_idx,
 						  STAT_TOPDOWN_BE_BOUND, st,
 						  &rsd);
-		double mem_bound = td_metric_ratio(cpu_map_idx,
+		double mem_bound = td_metric_ratio(map_idx,
 						   STAT_TOPDOWN_MEM_BOUND, st,
 						   &rsd);
 		double core_bound = be_bound - mem_bound;
@@ -1308,12 +1308,12 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 	} else if (evsel->metric_expr) {
 		generic_metric(config, evsel->metric_expr, evsel->metric_events, NULL,
 			       evsel->name, evsel->metric_name, NULL, 1,
-			       cpu_map_idx, out, st);
-	} else if (runtime_stat_n(st, STAT_NSECS, cpu_map_idx, &rsd) != 0) {
+			       map_idx, out, st);
+	} else if (runtime_stat_n(st, STAT_NSECS, map_idx, &rsd) != 0) {
 		char unit = ' ';
 		char unit_buf[10] = "/sec";
 
-		total = runtime_stat_avg(st, STAT_NSECS, cpu_map_idx, &rsd);
+		total = runtime_stat_avg(st, STAT_NSECS, map_idx, &rsd);
 		if (total)
 			ratio = convert_unit_double(1000000000.0 * avg / total, &unit);
 
@@ -1321,7 +1321,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			snprintf(unit_buf, sizeof(unit_buf), "%c/sec", unit);
 		print_metric(config, ctxp, NULL, "%8.3f", unit_buf, ratio);
 	} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
-		print_smi_cost(config, cpu_map_idx, out, st, &rsd);
+		print_smi_cost(config, map_idx, out, st, &rsd);
 	} else {
 		num = 0;
 	}
@@ -1335,7 +1335,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			generic_metric(config, mexp->metric_expr, mexp->metric_events,
 				       mexp->metric_refs, evsel->name, mexp->metric_name,
 				       mexp->metric_unit, mexp->runtime,
-				       cpu_map_idx, out, st);
+				       map_idx, out, st);
 		}
 	}
 	if (num == 0)
-- 
2.37.3.998.g577e59143f-goog


* [PATCH 4/6] perf stat: Use thread map index for shadow stat
  2022-09-26 20:07 [PATCHSET 0/6] perf stat: Small random cleanups (v1) Namhyung Kim
                   ` (2 preceding siblings ...)
  2022-09-26 20:07 ` [PATCH 3/6] perf stat: Rename saved_value->cpu_map_idx Namhyung Kim
@ 2022-09-26 20:07 ` Namhyung Kim
  2022-09-28 14:49   ` James Clark
  2022-09-29  2:10   ` Ian Rogers
  2022-09-26 20:07 ` [PATCH 5/6] perf stat: Kill unused per-thread runtime stats Namhyung Kim
  2022-09-26 20:07 ` [PATCH 6/6] perf stat: Don't compare runtime stat for shadow stats Namhyung Kim
  5 siblings, 2 replies; 17+ messages in thread
From: Namhyung Kim @ 2022-09-26 20:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing

When AGGR_THREAD is active (i.e. perf stat --per-thread), the values
are aggregated for each thread.  Previously it used the CPU map index,
which is invalid for AGGR_THREAD, so it had to keep separate per-thread
runtime stats and always access them with index 0.

But it can just use rt_stat with the thread map index.  Rename
first_shadow_cpu_map_idx() to first_shadow_map_idx() and make it
return the thread map index under AGGR_THREAD.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/stat-display.c | 20 +++++++++-----------
 tools/perf/util/stat.c         |  8 ++------
 2 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 234491f43c36..570e2c04d47d 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -442,7 +442,7 @@ static void print_metric_header(struct perf_stat_config *config,
 		fprintf(os->fh, "%*s ", config->metric_only_len, unit);
 }
 
-static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
+static int first_shadow_map_idx(struct perf_stat_config *config,
 				struct evsel *evsel, const struct aggr_cpu_id *id)
 {
 	struct perf_cpu_map *cpus = evsel__cpus(evsel);
@@ -452,6 +452,9 @@ static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
 	if (config->aggr_mode == AGGR_NONE)
 		return perf_cpu_map__idx(cpus, id->cpu);
 
+	if (config->aggr_mode == AGGR_THREAD)
+		return id->thread;
+
 	if (!config->aggr_get_id)
 		return 0;
 
@@ -646,7 +649,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
 	}
 
 	perf_stat__print_shadow_stats(config, counter, uval,
-				first_shadow_cpu_map_idx(config, counter, &id),
+				first_shadow_map_idx(config, counter, &id),
 				&out, &config->metric_events, st);
 	if (!config->csv_output && !config->metric_only && !config->json_output) {
 		print_noise(config, counter, noise);
@@ -676,7 +679,7 @@ static void aggr_update_shadow(struct perf_stat_config *config,
 				val += perf_counts(counter->counts, idx, 0)->val;
 			}
 			perf_stat__update_shadow_stats(counter, val,
-					first_shadow_cpu_map_idx(config, counter, &id),
+					first_shadow_map_idx(config, counter, &id),
 					&rt_stat);
 		}
 	}
@@ -979,14 +982,9 @@ static void print_aggr_thread(struct perf_stat_config *config,
 			fprintf(output, "%s", prefix);
 
 		id = buf[thread].id;
-		if (config->stats)
-			printout(config, id, 0, buf[thread].counter, buf[thread].uval,
-				 prefix, buf[thread].run, buf[thread].ena, 1.0,
-				 &config->stats[id.thread]);
-		else
-			printout(config, id, 0, buf[thread].counter, buf[thread].uval,
-				 prefix, buf[thread].run, buf[thread].ena, 1.0,
-				 &rt_stat);
+		printout(config, id, 0, buf[thread].counter, buf[thread].uval,
+			 prefix, buf[thread].run, buf[thread].ena, 1.0,
+			 &rt_stat);
 		fputc('\n', output);
 	}
 
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index e1d3152ce664..21137c9d5259 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -389,12 +389,8 @@ process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
 		}
 
 		if (config->aggr_mode == AGGR_THREAD) {
-			if (config->stats)
-				perf_stat__update_shadow_stats(evsel,
-					count->val, 0, &config->stats[thread]);
-			else
-				perf_stat__update_shadow_stats(evsel,
-					count->val, 0, &rt_stat);
+			perf_stat__update_shadow_stats(evsel, count->val,
+						       thread, &rt_stat);
 		}
 		break;
 	case AGGR_GLOBAL:
-- 
2.37.3.998.g577e59143f-goog


* [PATCH 5/6] perf stat: Kill unused per-thread runtime stats
  2022-09-26 20:07 [PATCHSET 0/6] perf stat: Small random cleanups (v1) Namhyung Kim
                   ` (3 preceding siblings ...)
  2022-09-26 20:07 ` [PATCH 4/6] perf stat: Use thread map index for shadow stat Namhyung Kim
@ 2022-09-26 20:07 ` Namhyung Kim
  2022-09-28 14:51   ` James Clark
  2022-09-26 20:07 ` [PATCH 6/6] perf stat: Don't compare runtime stat for shadow stats Namhyung Kim
  5 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2022-09-26 20:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing

Now that it uses the global rt_stat, there is no need for the separate
per-thread runtime stats.  Let's get rid of them.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-stat.c | 54 ---------------------------------------
 tools/perf/util/stat.h    |  2 --
 2 files changed, 56 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e05fe72c1d87..b86ebb25a799 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -292,13 +292,8 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
 
 static void perf_stat__reset_stats(void)
 {
-	int i;
-
 	evlist__reset_stats(evsel_list);
 	perf_stat__reset_shadow_stats();
-
-	for (i = 0; i < stat_config.stats_num; i++)
-		perf_stat__reset_shadow_per_stat(&stat_config.stats[i]);
 }
 
 static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
@@ -489,46 +484,6 @@ static void read_counters(struct timespec *rs)
 	}
 }
 
-static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
-{
-	int i;
-
-	config->stats = calloc(nthreads, sizeof(struct runtime_stat));
-	if (!config->stats)
-		return -1;
-
-	config->stats_num = nthreads;
-
-	for (i = 0; i < nthreads; i++)
-		runtime_stat__init(&config->stats[i]);
-
-	return 0;
-}
-
-static void runtime_stat_delete(struct perf_stat_config *config)
-{
-	int i;
-
-	if (!config->stats)
-		return;
-
-	for (i = 0; i < config->stats_num; i++)
-		runtime_stat__exit(&config->stats[i]);
-
-	zfree(&config->stats);
-}
-
-static void runtime_stat_reset(struct perf_stat_config *config)
-{
-	int i;
-
-	if (!config->stats)
-		return;
-
-	for (i = 0; i < config->stats_num; i++)
-		perf_stat__reset_shadow_per_stat(&config->stats[i]);
-}
-
 static void process_interval(void)
 {
 	struct timespec ts, rs;
@@ -537,7 +492,6 @@ static void process_interval(void)
 	diff_timespec(&rs, &ts, &ref_time);
 
 	perf_stat__reset_shadow_per_stat(&rt_stat);
-	runtime_stat_reset(&stat_config);
 	read_counters(&rs);
 
 	if (STAT_RECORD) {
@@ -1018,7 +972,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 
 		evlist__copy_prev_raw_counts(evsel_list);
 		evlist__reset_prev_raw_counts(evsel_list);
-		runtime_stat_reset(&stat_config);
 		perf_stat__reset_shadow_per_stat(&rt_stat);
 	} else {
 		update_stats(&walltime_nsecs_stats, t1 - t0);
@@ -2514,12 +2467,6 @@ int cmd_stat(int argc, const char **argv)
 	 */
 	if (stat_config.aggr_mode == AGGR_THREAD) {
 		thread_map__read_comms(evsel_list->core.threads);
-		if (target.system_wide) {
-			if (runtime_stat_new(&stat_config,
-				perf_thread_map__nr(evsel_list->core.threads))) {
-				goto out;
-			}
-		}
 	}
 
 	if (stat_config.aggr_mode == AGGR_NODE)
@@ -2660,7 +2607,6 @@ int cmd_stat(int argc, const char **argv)
 	evlist__delete(evsel_list);
 
 	metricgroup__rblist_exit(&stat_config.metric_events);
-	runtime_stat_delete(&stat_config);
 	evlist__close_control(stat_config.ctl_fd, stat_config.ctl_fd_ack, &stat_config.ctl_fd_close);
 
 	return status;
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 3eba38a1a149..43cb3f13d4d6 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -153,8 +153,6 @@ struct perf_stat_config {
 	int			 run_count;
 	int			 print_free_counters_hint;
 	int			 print_mixed_hw_group_error;
-	struct runtime_stat	*stats;
-	int			 stats_num;
 	const char		*csv_sep;
 	struct stats		*walltime_nsecs_stats;
 	struct rusage		 ru_data;
-- 
2.37.3.998.g577e59143f-goog


* [PATCH 6/6] perf stat: Don't compare runtime stat for shadow stats
  2022-09-26 20:07 [PATCHSET 0/6] perf stat: Small random cleanups (v1) Namhyung Kim
                   ` (4 preceding siblings ...)
  2022-09-26 20:07 ` [PATCH 5/6] perf stat: Kill unused per-thread runtime stats Namhyung Kim
@ 2022-09-26 20:07 ` Namhyung Kim
  2022-09-28 14:52   ` James Clark
  5 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2022-09-26 20:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing

Now it always uses the global rt_stat, so get rid of the stat field
from struct saved_value.  With the field gone, when both evsels are
NULL the comparison falls through and returns 0 anyway, so remove that
block from saved_value_cmp() too.
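
To see why the block was redundant, here is a cut-down standalone
sketch of the comparator tail (simplified, not the full
saved_value_cmp()): once every entry shares the one global rt_stat, the
removed stat comparison could only ever return 0, which is exactly what
the remaining evsel comparison returns when both are NULL.

  #include <stdio.h>

  struct saved_value {
  	void *evsel;	/* NULL for the type-keyed runtime entries */
  	/* other keys (map_idx, type, ctx, cgrp) compared before this */
  };

  static int cmp_tail(const struct saved_value *a,
  		    const struct saved_value *b)
  {
  	/* both-NULL evsels compare equal, making a stat tie-break moot */
  	if (a->evsel == b->evsel)
  		return 0;
  	return (char *)a->evsel < (char *)b->evsel ? -1 : 1;
  }

  int main(void)
  {
  	struct saved_value a = { .evsel = NULL };
  	struct saved_value b = { .evsel = NULL };

  	printf("%d\n", cmp_tail(&a, &b));	/* prints 0 */
  	return 0;
  }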

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/stat-shadow.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 99d05262055c..700563306637 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -35,7 +35,6 @@ struct saved_value {
 	int ctx;
 	int map_idx;
 	struct cgroup *cgrp;
-	struct runtime_stat *stat;
 	struct stats stats;
 	u64 metric_total;
 	int metric_other;
@@ -67,16 +66,6 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
 	if (a->cgrp != b->cgrp)
 		return (char *)a->cgrp < (char *)b->cgrp ? -1 : +1;
 
-	if (a->evsel == NULL && b->evsel == NULL) {
-		if (a->stat == b->stat)
-			return 0;
-
-		if ((char *)a->stat < (char *)b->stat)
-			return -1;
-
-		return 1;
-	}
-
 	if (a->evsel == b->evsel)
 		return 0;
 	if ((char *)a->evsel < (char *)b->evsel)
@@ -120,7 +109,6 @@ static struct saved_value *saved_value_lookup(struct evsel *evsel,
 		.evsel = evsel,
 		.type = type,
 		.ctx = ctx,
-		.stat = st,
 		.cgrp = cgrp,
 	};
 
-- 
2.37.3.998.g577e59143f-goog


* Re: [PATCH 1/6] perf stat: Convert perf_stat_evsel.res_stats array
  2022-09-26 20:07 ` [PATCH 1/6] perf stat: Convert perf_stat_evsel.res_stats array Namhyung Kim
@ 2022-09-28 10:33   ` James Clark
  0 siblings, 0 replies; 17+ messages in thread
From: James Clark @ 2022-09-28 10:33 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing,
	Arnaldo Carvalho de Melo, Jiri Olsa



On 26/09/2022 21:07, Namhyung Kim wrote:
> It uses only one member, so there is no need to have it as an array.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/stat-display.c |  2 +-
>  tools/perf/util/stat.c         | 10 +++-------
>  tools/perf/util/stat.h         |  2 +-
>  3 files changed, 5 insertions(+), 9 deletions(-)
> 

Reviewed-by: James Clark <james.clark@arm.com>

> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index b82844cb0ce7..234491f43c36 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -67,7 +67,7 @@ static void print_noise(struct perf_stat_config *config,
>  		return;
>  
>  	ps = evsel->stats;
> -	print_noise_pct(config, stddev_stats(&ps->res_stats[0]), avg);
> +	print_noise_pct(config, stddev_stats(&ps->res_stats), avg);
>  }
>  
>  static void print_cgroup(struct perf_stat_config *config, struct evsel *evsel)
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index ce5e9e372fc4..6bcd3dc32a71 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -132,12 +132,9 @@ static void perf_stat_evsel_id_init(struct evsel *evsel)
>  
>  static void evsel__reset_stat_priv(struct evsel *evsel)
>  {
> -	int i;
>  	struct perf_stat_evsel *ps = evsel->stats;
>  
> -	for (i = 0; i < 3; i++)
> -		init_stats(&ps->res_stats[i]);
> -
> +	init_stats(&ps->res_stats);
>  	perf_stat_evsel_id_init(evsel);
>  }
>  
> @@ -440,7 +437,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
>  	struct perf_counts_values *aggr = &counter->counts->aggr;
>  	struct perf_stat_evsel *ps = counter->stats;
>  	u64 *count = counter->counts->aggr.values;
> -	int i, ret;
> +	int ret;
>  
>  	aggr->val = aggr->ena = aggr->run = 0;
>  
> @@ -458,8 +455,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
>  		evsel__compute_deltas(counter, -1, -1, aggr);
>  	perf_counts_values__scale(aggr, config->scale, &counter->counts->scaled);
>  
> -	for (i = 0; i < 3; i++)
> -		update_stats(&ps->res_stats[i], count[i]);
> +	update_stats(&ps->res_stats, *count);
>  
>  	if (verbose > 0) {
>  		fprintf(config->output, "%s: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index 72713b344b79..3eba38a1a149 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -43,7 +43,7 @@ enum perf_stat_evsel_id {
>  };
>  
>  struct perf_stat_evsel {
> -	struct stats		 res_stats[3];
> +	struct stats		 res_stats;
>  	enum perf_stat_evsel_id	 id;
>  	u64			*group_data;
>  };

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/6] perf stat: Don't call perf_stat_evsel_id_init() repeatedly
  2022-09-26 20:07 ` [PATCH 2/6] perf stat: Don't call perf_stat_evsel_id_init() repeatedly Namhyung Kim
@ 2022-09-28 10:41   ` James Clark
  0 siblings, 0 replies; 17+ messages in thread
From: James Clark @ 2022-09-28 10:41 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing,
	Arnaldo Carvalho de Melo, Jiri Olsa



On 26/09/2022 21:07, Namhyung Kim wrote:
> The evsel__reset_stat_priv() is called more than once if the user gave
> the -r option for multiple runs.  But it doesn't need to re-initialize
> the id.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/stat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: James Clark <james.clark@arm.com>

> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 6bcd3dc32a71..e1d3152ce664 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -135,7 +135,6 @@ static void evsel__reset_stat_priv(struct evsel *evsel)
>  	struct perf_stat_evsel *ps = evsel->stats;
>  
>  	init_stats(&ps->res_stats);
> -	perf_stat_evsel_id_init(evsel);
>  }
>  
>  static int evsel__alloc_stat_priv(struct evsel *evsel)
> @@ -143,6 +142,7 @@ static int evsel__alloc_stat_priv(struct evsel *evsel)
>  	evsel->stats = zalloc(sizeof(struct perf_stat_evsel));
>  	if (evsel->stats == NULL)
>  		return -ENOMEM;
> +	perf_stat_evsel_id_init(evsel);
>  	evsel__reset_stat_priv(evsel);
>  	return 0;
>  }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/6] perf stat: Rename saved_value->cpu_map_idx
  2022-09-26 20:07 ` [PATCH 3/6] perf stat: Rename saved_value->cpu_map_idx Namhyung Kim
@ 2022-09-28 10:50   ` James Clark
  2022-09-28 23:56     ` Namhyung Kim
  0 siblings, 1 reply; 17+ messages in thread
From: James Clark @ 2022-09-28 10:50 UTC (permalink / raw)
  To: Namhyung Kim, Ian Rogers
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing,
	Arnaldo Carvalho de Melo, Jiri Olsa



On 26/09/2022 21:07, Namhyung Kim wrote:
> The cpu_map_idx field is just there to differentiate values from other
> entries.  It doesn't need to be strictly a cpu map index.  Actually we can
> pass a thread map index or an aggr map index.  So rename the field first.
> 
> No functional change intended.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/stat-shadow.c | 308 +++++++++++++++++-----------------
>  1 file changed, 154 insertions(+), 154 deletions(-)
> 
> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> index 9e1eddeff21b..99d05262055c 100644
> --- a/tools/perf/util/stat-shadow.c
> +++ b/tools/perf/util/stat-shadow.c
> @@ -33,7 +33,7 @@ struct saved_value {
>  	struct evsel *evsel;
>  	enum stat_type type;
>  	int ctx;
> -	int cpu_map_idx;
> +	int map_idx;

Do the same variables in stat.c and stat.h also need to be updated? The
previous change to do this exact thing (5b1af93dbc7e) changed more than
just these ones.

>  	struct cgroup *cgrp;
>  	struct runtime_stat *stat;
>  	struct stats stats;
> @@ -48,8 +48,8 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
>  					     rb_node);
>  	const struct saved_value *b = entry;
>  
> -	if (a->cpu_map_idx != b->cpu_map_idx)
> -		return a->cpu_map_idx - b->cpu_map_idx;
> +	if (a->map_idx != b->map_idx)
> +		return a->map_idx - b->map_idx;
>  
>  	/*
>  	 * Previously the rbtree was used to link generic metrics.
> @@ -106,7 +106,7 @@ static void saved_value_delete(struct rblist *rblist __maybe_unused,
>  }
>  
>  static struct saved_value *saved_value_lookup(struct evsel *evsel,
> -					      int cpu_map_idx,
> +					      int map_idx,
>  					      bool create,
>  					      enum stat_type type,
>  					      int ctx,
> @@ -116,7 +116,7 @@ static struct saved_value *saved_value_lookup(struct evsel *evsel,
>  	struct rblist *rblist;
>  	struct rb_node *nd;
>  	struct saved_value dm = {
> -		.cpu_map_idx = cpu_map_idx,
> +		.map_idx = map_idx,
>  		.evsel = evsel,
>  		.type = type,
>  		.ctx = ctx,
> @@ -215,10 +215,10 @@ struct runtime_stat_data {
>  
>  static void update_runtime_stat(struct runtime_stat *st,
>  				enum stat_type type,
> -				int cpu_map_idx, u64 count,
> +				int map_idx, u64 count,
>  				struct runtime_stat_data *rsd)
>  {
> -	struct saved_value *v = saved_value_lookup(NULL, cpu_map_idx, true, type,
> +	struct saved_value *v = saved_value_lookup(NULL, map_idx, true, type,
>  						   rsd->ctx, st, rsd->cgrp);
>  
>  	if (v)
> @@ -231,7 +231,7 @@ static void update_runtime_stat(struct runtime_stat *st,
>   * instruction rates, etc:
>   */
>  void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
> -				    int cpu_map_idx, struct runtime_stat *st)
> +				    int map_idx, struct runtime_stat *st)
>  {
>  	u64 count_ns = count;
>  	struct saved_value *v;
> @@ -243,88 +243,88 @@ void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
>  	count *= counter->scale;
>  
>  	if (evsel__is_clock(counter))
> -		update_runtime_stat(st, STAT_NSECS, cpu_map_idx, count_ns, &rsd);
> +		update_runtime_stat(st, STAT_NSECS, map_idx, count_ns, &rsd);
>  	else if (evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
> -		update_runtime_stat(st, STAT_CYCLES, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_CYCLES, map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
> -		update_runtime_stat(st, STAT_CYCLES_IN_TX, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_CYCLES_IN_TX, map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TRANSACTION_START))
> -		update_runtime_stat(st, STAT_TRANSACTION, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_TRANSACTION, map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, ELISION_START))
> -		update_runtime_stat(st, STAT_ELISION, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_ELISION, map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_TOTAL_SLOTS))
>  		update_runtime_stat(st, STAT_TOPDOWN_TOTAL_SLOTS,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_ISSUED))
>  		update_runtime_stat(st, STAT_TOPDOWN_SLOTS_ISSUED,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_RETIRED))
>  		update_runtime_stat(st, STAT_TOPDOWN_SLOTS_RETIRED,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_BUBBLES))
>  		update_runtime_stat(st, STAT_TOPDOWN_FETCH_BUBBLES,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_RECOVERY_BUBBLES))
>  		update_runtime_stat(st, STAT_TOPDOWN_RECOVERY_BUBBLES,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_RETIRING))
>  		update_runtime_stat(st, STAT_TOPDOWN_RETIRING,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_BAD_SPEC))
>  		update_runtime_stat(st, STAT_TOPDOWN_BAD_SPEC,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_FE_BOUND))
>  		update_runtime_stat(st, STAT_TOPDOWN_FE_BOUND,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_BE_BOUND))
>  		update_runtime_stat(st, STAT_TOPDOWN_BE_BOUND,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_HEAVY_OPS))
>  		update_runtime_stat(st, STAT_TOPDOWN_HEAVY_OPS,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_BR_MISPREDICT))
>  		update_runtime_stat(st, STAT_TOPDOWN_BR_MISPREDICT,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_LAT))
>  		update_runtime_stat(st, STAT_TOPDOWN_FETCH_LAT,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, TOPDOWN_MEM_BOUND))
>  		update_runtime_stat(st, STAT_TOPDOWN_MEM_BOUND,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
>  		update_runtime_stat(st, STAT_STALLED_CYCLES_FRONT,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_BACKEND))
>  		update_runtime_stat(st, STAT_STALLED_CYCLES_BACK,
> -				    cpu_map_idx, count, &rsd);
> +				    map_idx, count, &rsd);
>  	else if (evsel__match(counter, HARDWARE, HW_BRANCH_INSTRUCTIONS))
> -		update_runtime_stat(st, STAT_BRANCHES, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_BRANCHES, map_idx, count, &rsd);
>  	else if (evsel__match(counter, HARDWARE, HW_CACHE_REFERENCES))
> -		update_runtime_stat(st, STAT_CACHEREFS, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_CACHEREFS, map_idx, count, &rsd);
>  	else if (evsel__match(counter, HW_CACHE, HW_CACHE_L1D))
> -		update_runtime_stat(st, STAT_L1_DCACHE, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_L1_DCACHE, map_idx, count, &rsd);
>  	else if (evsel__match(counter, HW_CACHE, HW_CACHE_L1I))
> -		update_runtime_stat(st, STAT_L1_ICACHE, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_L1_ICACHE, map_idx, count, &rsd);
>  	else if (evsel__match(counter, HW_CACHE, HW_CACHE_LL))
> -		update_runtime_stat(st, STAT_LL_CACHE, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_LL_CACHE, map_idx, count, &rsd);
>  	else if (evsel__match(counter, HW_CACHE, HW_CACHE_DTLB))
> -		update_runtime_stat(st, STAT_DTLB_CACHE, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_DTLB_CACHE, map_idx, count, &rsd);
>  	else if (evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
> -		update_runtime_stat(st, STAT_ITLB_CACHE, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_ITLB_CACHE, map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, SMI_NUM))
> -		update_runtime_stat(st, STAT_SMI_NUM, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_SMI_NUM, map_idx, count, &rsd);
>  	else if (perf_stat_evsel__is(counter, APERF))
> -		update_runtime_stat(st, STAT_APERF, cpu_map_idx, count, &rsd);
> +		update_runtime_stat(st, STAT_APERF, map_idx, count, &rsd);
>  
>  	if (counter->collect_stat) {
> -		v = saved_value_lookup(counter, cpu_map_idx, true, STAT_NONE, 0, st,
> +		v = saved_value_lookup(counter, map_idx, true, STAT_NONE, 0, st,
>  				       rsd.cgrp);
>  		update_stats(&v->stats, count);
>  		if (counter->metric_leader)
>  			v->metric_total += count;
>  	} else if (counter->metric_leader) {
>  		v = saved_value_lookup(counter->metric_leader,
> -				       cpu_map_idx, true, STAT_NONE, 0, st, rsd.cgrp);
> +				       map_idx, true, STAT_NONE, 0, st, rsd.cgrp);
>  		v->metric_total += count;
>  		v->metric_other++;
>  	}
> @@ -466,12 +466,12 @@ void perf_stat__collect_metric_expr(struct evlist *evsel_list)
>  }
>  
>  static double runtime_stat_avg(struct runtime_stat *st,
> -			       enum stat_type type, int cpu_map_idx,
> +			       enum stat_type type, int map_idx,
>  			       struct runtime_stat_data *rsd)
>  {
>  	struct saved_value *v;
>  
> -	v = saved_value_lookup(NULL, cpu_map_idx, false, type, rsd->ctx, st, rsd->cgrp);
> +	v = saved_value_lookup(NULL, map_idx, false, type, rsd->ctx, st, rsd->cgrp);
>  	if (!v)
>  		return 0.0;
>  
> @@ -479,12 +479,12 @@ static double runtime_stat_avg(struct runtime_stat *st,
>  }
>  
>  static double runtime_stat_n(struct runtime_stat *st,
> -			     enum stat_type type, int cpu_map_idx,
> +			     enum stat_type type, int map_idx,
>  			     struct runtime_stat_data *rsd)
>  {
>  	struct saved_value *v;
>  
> -	v = saved_value_lookup(NULL, cpu_map_idx, false, type, rsd->ctx, st, rsd->cgrp);
> +	v = saved_value_lookup(NULL, map_idx, false, type, rsd->ctx, st, rsd->cgrp);
>  	if (!v)
>  		return 0.0;
>  
> @@ -492,7 +492,7 @@ static double runtime_stat_n(struct runtime_stat *st,
>  }
>  
>  static void print_stalled_cycles_frontend(struct perf_stat_config *config,
> -					  int cpu_map_idx, double avg,
> +					  int map_idx, double avg,
>  					  struct perf_stat_output_ctx *out,
>  					  struct runtime_stat *st,
>  					  struct runtime_stat_data *rsd)
> @@ -500,7 +500,7 @@ static void print_stalled_cycles_frontend(struct perf_stat_config *config,
>  	double total, ratio = 0.0;
>  	const char *color;
>  
> -	total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_CYCLES, map_idx, rsd);
>  
>  	if (total)
>  		ratio = avg / total * 100.0;
> @@ -515,7 +515,7 @@ static void print_stalled_cycles_frontend(struct perf_stat_config *config,
>  }
>  
>  static void print_stalled_cycles_backend(struct perf_stat_config *config,
> -					 int cpu_map_idx, double avg,
> +					 int map_idx, double avg,
>  					 struct perf_stat_output_ctx *out,
>  					 struct runtime_stat *st,
>  					 struct runtime_stat_data *rsd)
> @@ -523,7 +523,7 @@ static void print_stalled_cycles_backend(struct perf_stat_config *config,
>  	double total, ratio = 0.0;
>  	const char *color;
>  
> -	total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_CYCLES, map_idx, rsd);
>  
>  	if (total)
>  		ratio = avg / total * 100.0;
> @@ -534,7 +534,7 @@ static void print_stalled_cycles_backend(struct perf_stat_config *config,
>  }
>  
>  static void print_branch_misses(struct perf_stat_config *config,
> -				int cpu_map_idx, double avg,
> +				int map_idx, double avg,
>  				struct perf_stat_output_ctx *out,
>  				struct runtime_stat *st,
>  				struct runtime_stat_data *rsd)
> @@ -542,7 +542,7 @@ static void print_branch_misses(struct perf_stat_config *config,
>  	double total, ratio = 0.0;
>  	const char *color;
>  
> -	total = runtime_stat_avg(st, STAT_BRANCHES, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_BRANCHES, map_idx, rsd);
>  
>  	if (total)
>  		ratio = avg / total * 100.0;
> @@ -553,7 +553,7 @@ static void print_branch_misses(struct perf_stat_config *config,
>  }
>  
>  static void print_l1_dcache_misses(struct perf_stat_config *config,
> -				   int cpu_map_idx, double avg,
> +				   int map_idx, double avg,
>  				   struct perf_stat_output_ctx *out,
>  				   struct runtime_stat *st,
>  				   struct runtime_stat_data *rsd)
> @@ -561,7 +561,7 @@ static void print_l1_dcache_misses(struct perf_stat_config *config,
>  	double total, ratio = 0.0;
>  	const char *color;
>  
> -	total = runtime_stat_avg(st, STAT_L1_DCACHE, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_L1_DCACHE, map_idx, rsd);
>  
>  	if (total)
>  		ratio = avg / total * 100.0;
> @@ -572,7 +572,7 @@ static void print_l1_dcache_misses(struct perf_stat_config *config,
>  }
>  
>  static void print_l1_icache_misses(struct perf_stat_config *config,
> -				   int cpu_map_idx, double avg,
> +				   int map_idx, double avg,
>  				   struct perf_stat_output_ctx *out,
>  				   struct runtime_stat *st,
>  				   struct runtime_stat_data *rsd)
> @@ -580,7 +580,7 @@ static void print_l1_icache_misses(struct perf_stat_config *config,
>  	double total, ratio = 0.0;
>  	const char *color;
>  
> -	total = runtime_stat_avg(st, STAT_L1_ICACHE, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_L1_ICACHE, map_idx, rsd);
>  
>  	if (total)
>  		ratio = avg / total * 100.0;
> @@ -590,7 +590,7 @@ static void print_l1_icache_misses(struct perf_stat_config *config,
>  }
>  
>  static void print_dtlb_cache_misses(struct perf_stat_config *config,
> -				    int cpu_map_idx, double avg,
> +				    int map_idx, double avg,
>  				    struct perf_stat_output_ctx *out,
>  				    struct runtime_stat *st,
>  				    struct runtime_stat_data *rsd)
> @@ -598,7 +598,7 @@ static void print_dtlb_cache_misses(struct perf_stat_config *config,
>  	double total, ratio = 0.0;
>  	const char *color;
>  
> -	total = runtime_stat_avg(st, STAT_DTLB_CACHE, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_DTLB_CACHE, map_idx, rsd);
>  
>  	if (total)
>  		ratio = avg / total * 100.0;
> @@ -608,7 +608,7 @@ static void print_dtlb_cache_misses(struct perf_stat_config *config,
>  }
>  
>  static void print_itlb_cache_misses(struct perf_stat_config *config,
> -				    int cpu_map_idx, double avg,
> +				    int map_idx, double avg,
>  				    struct perf_stat_output_ctx *out,
>  				    struct runtime_stat *st,
>  				    struct runtime_stat_data *rsd)
> @@ -616,7 +616,7 @@ static void print_itlb_cache_misses(struct perf_stat_config *config,
>  	double total, ratio = 0.0;
>  	const char *color;
>  
> -	total = runtime_stat_avg(st, STAT_ITLB_CACHE, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_ITLB_CACHE, map_idx, rsd);
>  
>  	if (total)
>  		ratio = avg / total * 100.0;
> @@ -626,7 +626,7 @@ static void print_itlb_cache_misses(struct perf_stat_config *config,
>  }
>  
>  static void print_ll_cache_misses(struct perf_stat_config *config,
> -				  int cpu_map_idx, double avg,
> +				  int map_idx, double avg,
>  				  struct perf_stat_output_ctx *out,
>  				  struct runtime_stat *st,
>  				  struct runtime_stat_data *rsd)
> @@ -634,7 +634,7 @@ static void print_ll_cache_misses(struct perf_stat_config *config,
>  	double total, ratio = 0.0;
>  	const char *color;
>  
> -	total = runtime_stat_avg(st, STAT_LL_CACHE, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_LL_CACHE, map_idx, rsd);
>  
>  	if (total)
>  		ratio = avg / total * 100.0;
> @@ -692,61 +692,61 @@ static double sanitize_val(double x)
>  	return x;
>  }
>  
> -static double td_total_slots(int cpu_map_idx, struct runtime_stat *st,
> +static double td_total_slots(int map_idx, struct runtime_stat *st,
>  			     struct runtime_stat_data *rsd)
>  {
> -	return runtime_stat_avg(st, STAT_TOPDOWN_TOTAL_SLOTS, cpu_map_idx, rsd);
> +	return runtime_stat_avg(st, STAT_TOPDOWN_TOTAL_SLOTS, map_idx, rsd);
>  }
>  
> -static double td_bad_spec(int cpu_map_idx, struct runtime_stat *st,
> +static double td_bad_spec(int map_idx, struct runtime_stat *st,
>  			  struct runtime_stat_data *rsd)
>  {
>  	double bad_spec = 0;
>  	double total_slots;
>  	double total;
>  
> -	total = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_ISSUED, cpu_map_idx, rsd) -
> -		runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED, cpu_map_idx, rsd) +
> -		runtime_stat_avg(st, STAT_TOPDOWN_RECOVERY_BUBBLES, cpu_map_idx, rsd);
> +	total = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_ISSUED, map_idx, rsd) -
> +		runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED, map_idx, rsd) +
> +		runtime_stat_avg(st, STAT_TOPDOWN_RECOVERY_BUBBLES, map_idx, rsd);
>  
> -	total_slots = td_total_slots(cpu_map_idx, st, rsd);
> +	total_slots = td_total_slots(map_idx, st, rsd);
>  	if (total_slots)
>  		bad_spec = total / total_slots;
>  	return sanitize_val(bad_spec);
>  }
>  
> -static double td_retiring(int cpu_map_idx, struct runtime_stat *st,
> +static double td_retiring(int map_idx, struct runtime_stat *st,
>  			  struct runtime_stat_data *rsd)
>  {
>  	double retiring = 0;
> -	double total_slots = td_total_slots(cpu_map_idx, st, rsd);
> +	double total_slots = td_total_slots(map_idx, st, rsd);
>  	double ret_slots = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED,
> -					    cpu_map_idx, rsd);
> +					    map_idx, rsd);
>  
>  	if (total_slots)
>  		retiring = ret_slots / total_slots;
>  	return retiring;
>  }
>  
> -static double td_fe_bound(int cpu_map_idx, struct runtime_stat *st,
> +static double td_fe_bound(int map_idx, struct runtime_stat *st,
>  			  struct runtime_stat_data *rsd)
>  {
>  	double fe_bound = 0;
> -	double total_slots = td_total_slots(cpu_map_idx, st, rsd);
> +	double total_slots = td_total_slots(map_idx, st, rsd);
>  	double fetch_bub = runtime_stat_avg(st, STAT_TOPDOWN_FETCH_BUBBLES,
> -					    cpu_map_idx, rsd);
> +					    map_idx, rsd);
>  
>  	if (total_slots)
>  		fe_bound = fetch_bub / total_slots;
>  	return fe_bound;
>  }
>  
> -static double td_be_bound(int cpu_map_idx, struct runtime_stat *st,
> +static double td_be_bound(int map_idx, struct runtime_stat *st,
>  			  struct runtime_stat_data *rsd)
>  {
> -	double sum = (td_fe_bound(cpu_map_idx, st, rsd) +
> -		      td_bad_spec(cpu_map_idx, st, rsd) +
> -		      td_retiring(cpu_map_idx, st, rsd));
> +	double sum = (td_fe_bound(map_idx, st, rsd) +
> +		      td_bad_spec(map_idx, st, rsd) +
> +		      td_retiring(map_idx, st, rsd));
>  	if (sum == 0)
>  		return 0;
>  	return sanitize_val(1.0 - sum);
> @@ -757,15 +757,15 @@ static double td_be_bound(int cpu_map_idx, struct runtime_stat *st,
>   * the ratios we need to recreate the sum.
>   */
>  
> -static double td_metric_ratio(int cpu_map_idx, enum stat_type type,
> +static double td_metric_ratio(int map_idx, enum stat_type type,
>  			      struct runtime_stat *stat,
>  			      struct runtime_stat_data *rsd)
>  {
> -	double sum = runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, cpu_map_idx, rsd) +
> -		runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, cpu_map_idx, rsd) +
> -		runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, cpu_map_idx, rsd) +
> -		runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, cpu_map_idx, rsd);
> -	double d = runtime_stat_avg(stat, type, cpu_map_idx, rsd);
> +	double sum = runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, map_idx, rsd) +
> +		runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, map_idx, rsd) +
> +		runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, map_idx, rsd) +
> +		runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, map_idx, rsd);
> +	double d = runtime_stat_avg(stat, type, map_idx, rsd);
>  
>  	if (sum)
>  		return d / sum;
> @@ -777,23 +777,23 @@ static double td_metric_ratio(int cpu_map_idx, enum stat_type type,
>   * We allow two missing.
>   */
>  
> -static bool full_td(int cpu_map_idx, struct runtime_stat *stat,
> +static bool full_td(int map_idx, struct runtime_stat *stat,
>  		    struct runtime_stat_data *rsd)
>  {
>  	int c = 0;
>  
> -	if (runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, cpu_map_idx, rsd) > 0)
> +	if (runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, map_idx, rsd) > 0)
>  		c++;
> -	if (runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, cpu_map_idx, rsd) > 0)
> +	if (runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, map_idx, rsd) > 0)
>  		c++;
> -	if (runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, cpu_map_idx, rsd) > 0)
> +	if (runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, map_idx, rsd) > 0)
>  		c++;
> -	if (runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, cpu_map_idx, rsd) > 0)
> +	if (runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, map_idx, rsd) > 0)
>  		c++;
>  	return c >= 2;
>  }
>  
> -static void print_smi_cost(struct perf_stat_config *config, int cpu_map_idx,
> +static void print_smi_cost(struct perf_stat_config *config, int map_idx,
>  			   struct perf_stat_output_ctx *out,
>  			   struct runtime_stat *st,
>  			   struct runtime_stat_data *rsd)
> @@ -801,9 +801,9 @@ static void print_smi_cost(struct perf_stat_config *config, int cpu_map_idx,
>  	double smi_num, aperf, cycles, cost = 0.0;
>  	const char *color = NULL;
>  
> -	smi_num = runtime_stat_avg(st, STAT_SMI_NUM, cpu_map_idx, rsd);
> -	aperf = runtime_stat_avg(st, STAT_APERF, cpu_map_idx, rsd);
> -	cycles = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
> +	smi_num = runtime_stat_avg(st, STAT_SMI_NUM, map_idx, rsd);
> +	aperf = runtime_stat_avg(st, STAT_APERF, map_idx, rsd);
> +	cycles = runtime_stat_avg(st, STAT_CYCLES, map_idx, rsd);
>  
>  	if ((cycles == 0) || (aperf == 0))
>  		return;
> @@ -820,7 +820,7 @@ static void print_smi_cost(struct perf_stat_config *config, int cpu_map_idx,
>  static int prepare_metric(struct evsel **metric_events,
>  			  struct metric_ref *metric_refs,
>  			  struct expr_parse_ctx *pctx,
> -			  int cpu_map_idx,
> +			  int map_idx,
>  			  struct runtime_stat *st)
>  {
>  	double scale;
> @@ -859,7 +859,7 @@ static int prepare_metric(struct evsel **metric_events,
>  				abort();
>  			}
>  		} else {
> -			v = saved_value_lookup(metric_events[i], cpu_map_idx, false,
> +			v = saved_value_lookup(metric_events[i], map_idx, false,
>  					       STAT_NONE, 0, st,
>  					       metric_events[i]->cgrp);
>  			if (!v)
> @@ -897,7 +897,7 @@ static void generic_metric(struct perf_stat_config *config,
>  			   const char *metric_name,
>  			   const char *metric_unit,
>  			   int runtime,
> -			   int cpu_map_idx,
> +			   int map_idx,
>  			   struct perf_stat_output_ctx *out,
>  			   struct runtime_stat *st)
>  {
> @@ -915,7 +915,7 @@ static void generic_metric(struct perf_stat_config *config,
>  		pctx->sctx.user_requested_cpu_list = strdup(config->user_requested_cpu_list);
>  	pctx->sctx.runtime = runtime;
>  	pctx->sctx.system_wide = config->system_wide;
> -	i = prepare_metric(metric_events, metric_refs, pctx, cpu_map_idx, st);
> +	i = prepare_metric(metric_events, metric_refs, pctx, map_idx, st);
>  	if (i < 0) {
>  		expr__ctx_free(pctx);
>  		return;
> @@ -960,7 +960,7 @@ static void generic_metric(struct perf_stat_config *config,
>  	expr__ctx_free(pctx);
>  }
>  
> -double test_generic_metric(struct metric_expr *mexp, int cpu_map_idx, struct runtime_stat *st)
> +double test_generic_metric(struct metric_expr *mexp, int map_idx, struct runtime_stat *st)
>  {
>  	struct expr_parse_ctx *pctx;
>  	double ratio = 0.0;
> @@ -969,7 +969,7 @@ double test_generic_metric(struct metric_expr *mexp, int cpu_map_idx, struct run
>  	if (!pctx)
>  		return NAN;
>  
> -	if (prepare_metric(mexp->metric_events, mexp->metric_refs, pctx, cpu_map_idx, st) < 0)
> +	if (prepare_metric(mexp->metric_events, mexp->metric_refs, pctx, map_idx, st) < 0)
>  		goto out;
>  
>  	if (expr__parse(&ratio, pctx, mexp->metric_expr))
> @@ -982,7 +982,7 @@ double test_generic_metric(struct metric_expr *mexp, int cpu_map_idx, struct run
>  
>  void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  				   struct evsel *evsel,
> -				   double avg, int cpu_map_idx,
> +				   double avg, int map_idx,
>  				   struct perf_stat_output_ctx *out,
>  				   struct rblist *metric_events,
>  				   struct runtime_stat *st)
> @@ -1001,7 +1001,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  	if (config->iostat_run) {
>  		iostat_print_metric(config, evsel, out);
>  	} else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
> -		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_CYCLES, map_idx, &rsd);
>  
>  		if (total) {
>  			ratio = avg / total;
> @@ -1011,11 +1011,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  			print_metric(config, ctxp, NULL, NULL, "insn per cycle", 0);
>  		}
>  
> -		total = runtime_stat_avg(st, STAT_STALLED_CYCLES_FRONT, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_STALLED_CYCLES_FRONT, map_idx, &rsd);
>  
>  		total = max(total, runtime_stat_avg(st,
>  						    STAT_STALLED_CYCLES_BACK,
> -						    cpu_map_idx, &rsd));
> +						    map_idx, &rsd));
>  
>  		if (total && avg) {
>  			out->new_line(config, ctxp);
> @@ -1025,8 +1025,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  					ratio);
>  		}
>  	} else if (evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
> -		if (runtime_stat_n(st, STAT_BRANCHES, cpu_map_idx, &rsd) != 0)
> -			print_branch_misses(config, cpu_map_idx, avg, out, st, &rsd);
> +		if (runtime_stat_n(st, STAT_BRANCHES, map_idx, &rsd) != 0)
> +			print_branch_misses(config, map_idx, avg, out, st, &rsd);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "of all branches", 0);
>  	} else if (
> @@ -1035,8 +1035,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>  					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>  
> -		if (runtime_stat_n(st, STAT_L1_DCACHE, cpu_map_idx, &rsd) != 0)
> -			print_l1_dcache_misses(config, cpu_map_idx, avg, out, st, &rsd);
> +		if (runtime_stat_n(st, STAT_L1_DCACHE, map_idx, &rsd) != 0)
> +			print_l1_dcache_misses(config, map_idx, avg, out, st, &rsd);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "of all L1-dcache accesses", 0);
>  	} else if (
> @@ -1045,8 +1045,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>  					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>  
> -		if (runtime_stat_n(st, STAT_L1_ICACHE, cpu_map_idx, &rsd) != 0)
> -			print_l1_icache_misses(config, cpu_map_idx, avg, out, st, &rsd);
> +		if (runtime_stat_n(st, STAT_L1_ICACHE, map_idx, &rsd) != 0)
> +			print_l1_icache_misses(config, map_idx, avg, out, st, &rsd);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "of all L1-icache accesses", 0);
>  	} else if (
> @@ -1055,8 +1055,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>  					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>  
> -		if (runtime_stat_n(st, STAT_DTLB_CACHE, cpu_map_idx, &rsd) != 0)
> -			print_dtlb_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
> +		if (runtime_stat_n(st, STAT_DTLB_CACHE, map_idx, &rsd) != 0)
> +			print_dtlb_cache_misses(config, map_idx, avg, out, st, &rsd);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "of all dTLB cache accesses", 0);
>  	} else if (
> @@ -1065,8 +1065,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>  					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>  
> -		if (runtime_stat_n(st, STAT_ITLB_CACHE, cpu_map_idx, &rsd) != 0)
> -			print_itlb_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
> +		if (runtime_stat_n(st, STAT_ITLB_CACHE, map_idx, &rsd) != 0)
> +			print_itlb_cache_misses(config, map_idx, avg, out, st, &rsd);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "of all iTLB cache accesses", 0);
>  	} else if (
> @@ -1075,27 +1075,27 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>  					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>  
> -		if (runtime_stat_n(st, STAT_LL_CACHE, cpu_map_idx, &rsd) != 0)
> -			print_ll_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
> +		if (runtime_stat_n(st, STAT_LL_CACHE, map_idx, &rsd) != 0)
> +			print_ll_cache_misses(config, map_idx, avg, out, st, &rsd);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "of all LL-cache accesses", 0);
>  	} else if (evsel__match(evsel, HARDWARE, HW_CACHE_MISSES)) {
> -		total = runtime_stat_avg(st, STAT_CACHEREFS, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_CACHEREFS, map_idx, &rsd);
>  
>  		if (total)
>  			ratio = avg * 100 / total;
>  
> -		if (runtime_stat_n(st, STAT_CACHEREFS, cpu_map_idx, &rsd) != 0)
> +		if (runtime_stat_n(st, STAT_CACHEREFS, map_idx, &rsd) != 0)
>  			print_metric(config, ctxp, NULL, "%8.3f %%",
>  				     "of all cache refs", ratio);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "of all cache refs", 0);
>  	} else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) {
> -		print_stalled_cycles_frontend(config, cpu_map_idx, avg, out, st, &rsd);
> +		print_stalled_cycles_frontend(config, map_idx, avg, out, st, &rsd);
>  	} else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND)) {
> -		print_stalled_cycles_backend(config, cpu_map_idx, avg, out, st, &rsd);
> +		print_stalled_cycles_backend(config, map_idx, avg, out, st, &rsd);
>  	} else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
> -		total = runtime_stat_avg(st, STAT_NSECS, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_NSECS, map_idx, &rsd);
>  
>  		if (total) {
>  			ratio = avg / total;
> @@ -1104,7 +1104,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  			print_metric(config, ctxp, NULL, NULL, "Ghz", 0);
>  		}
>  	} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX)) {
> -		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_CYCLES, map_idx, &rsd);
>  
>  		if (total)
>  			print_metric(config, ctxp, NULL,
> @@ -1114,8 +1114,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  			print_metric(config, ctxp, NULL, NULL, "transactional cycles",
>  				     0);
>  	} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX_CP)) {
> -		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
> -		total2 = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_CYCLES, map_idx, &rsd);
> +		total2 = runtime_stat_avg(st, STAT_CYCLES_IN_TX, map_idx, &rsd);
>  
>  		if (total2 < avg)
>  			total2 = avg;
> @@ -1125,19 +1125,19 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "aborted cycles", 0);
>  	} else if (perf_stat_evsel__is(evsel, TRANSACTION_START)) {
> -		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, map_idx, &rsd);
>  
>  		if (avg)
>  			ratio = total / avg;
>  
> -		if (runtime_stat_n(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd) != 0)
> +		if (runtime_stat_n(st, STAT_CYCLES_IN_TX, map_idx, &rsd) != 0)
>  			print_metric(config, ctxp, NULL, "%8.0f",
>  				     "cycles / transaction", ratio);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "cycles / transaction",
>  				      0);
>  	} else if (perf_stat_evsel__is(evsel, ELISION_START)) {
> -		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, map_idx, &rsd);
>  
>  		if (avg)
>  			ratio = total / avg;
> @@ -1150,28 +1150,28 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		else
>  			print_metric(config, ctxp, NULL, NULL, "CPUs utilized", 0);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_BUBBLES)) {
> -		double fe_bound = td_fe_bound(cpu_map_idx, st, &rsd);
> +		double fe_bound = td_fe_bound(map_idx, st, &rsd);
>  
>  		if (fe_bound > 0.2)
>  			color = PERF_COLOR_RED;
>  		print_metric(config, ctxp, color, "%8.1f%%", "frontend bound",
>  				fe_bound * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_RETIRED)) {
> -		double retiring = td_retiring(cpu_map_idx, st, &rsd);
> +		double retiring = td_retiring(map_idx, st, &rsd);
>  
>  		if (retiring > 0.7)
>  			color = PERF_COLOR_GREEN;
>  		print_metric(config, ctxp, color, "%8.1f%%", "retiring",
>  				retiring * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_RECOVERY_BUBBLES)) {
> -		double bad_spec = td_bad_spec(cpu_map_idx, st, &rsd);
> +		double bad_spec = td_bad_spec(map_idx, st, &rsd);
>  
>  		if (bad_spec > 0.1)
>  			color = PERF_COLOR_RED;
>  		print_metric(config, ctxp, color, "%8.1f%%", "bad speculation",
>  				bad_spec * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_ISSUED)) {
> -		double be_bound = td_be_bound(cpu_map_idx, st, &rsd);
> +		double be_bound = td_be_bound(map_idx, st, &rsd);
>  		const char *name = "backend bound";
>  		static int have_recovery_bubbles = -1;
>  
> @@ -1184,14 +1184,14 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  
>  		if (be_bound > 0.2)
>  			color = PERF_COLOR_RED;
> -		if (td_total_slots(cpu_map_idx, st, &rsd) > 0)
> +		if (td_total_slots(map_idx, st, &rsd) > 0)
>  			print_metric(config, ctxp, color, "%8.1f%%", name,
>  					be_bound * 100.);
>  		else
>  			print_metric(config, ctxp, NULL, NULL, name, 0);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_RETIRING) &&
> -		   full_td(cpu_map_idx, st, &rsd)) {
> -		double retiring = td_metric_ratio(cpu_map_idx,
> +		   full_td(map_idx, st, &rsd)) {
> +		double retiring = td_metric_ratio(map_idx,
>  						  STAT_TOPDOWN_RETIRING, st,
>  						  &rsd);
>  		if (retiring > 0.7)
> @@ -1199,8 +1199,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		print_metric(config, ctxp, color, "%8.1f%%", "Retiring",
>  				retiring * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FE_BOUND) &&
> -		   full_td(cpu_map_idx, st, &rsd)) {
> -		double fe_bound = td_metric_ratio(cpu_map_idx,
> +		   full_td(map_idx, st, &rsd)) {
> +		double fe_bound = td_metric_ratio(map_idx,
>  						  STAT_TOPDOWN_FE_BOUND, st,
>  						  &rsd);
>  		if (fe_bound > 0.2)
> @@ -1208,8 +1208,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		print_metric(config, ctxp, color, "%8.1f%%", "Frontend Bound",
>  				fe_bound * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BE_BOUND) &&
> -		   full_td(cpu_map_idx, st, &rsd)) {
> -		double be_bound = td_metric_ratio(cpu_map_idx,
> +		   full_td(map_idx, st, &rsd)) {
> +		double be_bound = td_metric_ratio(map_idx,
>  						  STAT_TOPDOWN_BE_BOUND, st,
>  						  &rsd);
>  		if (be_bound > 0.2)
> @@ -1217,8 +1217,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		print_metric(config, ctxp, color, "%8.1f%%", "Backend Bound",
>  				be_bound * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BAD_SPEC) &&
> -		   full_td(cpu_map_idx, st, &rsd)) {
> -		double bad_spec = td_metric_ratio(cpu_map_idx,
> +		   full_td(map_idx, st, &rsd)) {
> +		double bad_spec = td_metric_ratio(map_idx,
>  						  STAT_TOPDOWN_BAD_SPEC, st,
>  						  &rsd);
>  		if (bad_spec > 0.1)
> @@ -1226,11 +1226,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		print_metric(config, ctxp, color, "%8.1f%%", "Bad Speculation",
>  				bad_spec * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_HEAVY_OPS) &&
> -			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
> -		double retiring = td_metric_ratio(cpu_map_idx,
> +			full_td(map_idx, st, &rsd) && (config->topdown_level > 1)) {
> +		double retiring = td_metric_ratio(map_idx,
>  						  STAT_TOPDOWN_RETIRING, st,
>  						  &rsd);
> -		double heavy_ops = td_metric_ratio(cpu_map_idx,
> +		double heavy_ops = td_metric_ratio(map_idx,
>  						   STAT_TOPDOWN_HEAVY_OPS, st,
>  						   &rsd);
>  		double light_ops = retiring - heavy_ops;
> @@ -1246,11 +1246,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		print_metric(config, ctxp, color, "%8.1f%%", "Light Operations",
>  				light_ops * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BR_MISPREDICT) &&
> -			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
> -		double bad_spec = td_metric_ratio(cpu_map_idx,
> +			full_td(map_idx, st, &rsd) && (config->topdown_level > 1)) {
> +		double bad_spec = td_metric_ratio(map_idx,
>  						  STAT_TOPDOWN_BAD_SPEC, st,
>  						  &rsd);
> -		double br_mis = td_metric_ratio(cpu_map_idx,
> +		double br_mis = td_metric_ratio(map_idx,
>  						STAT_TOPDOWN_BR_MISPREDICT, st,
>  						&rsd);
>  		double m_clears = bad_spec - br_mis;
> @@ -1266,11 +1266,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		print_metric(config, ctxp, color, "%8.1f%%", "Machine Clears",
>  				m_clears * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_LAT) &&
> -			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
> -		double fe_bound = td_metric_ratio(cpu_map_idx,
> +			full_td(map_idx, st, &rsd) && (config->topdown_level > 1)) {
> +		double fe_bound = td_metric_ratio(map_idx,
>  						  STAT_TOPDOWN_FE_BOUND, st,
>  						  &rsd);
> -		double fetch_lat = td_metric_ratio(cpu_map_idx,
> +		double fetch_lat = td_metric_ratio(map_idx,
>  						   STAT_TOPDOWN_FETCH_LAT, st,
>  						   &rsd);
>  		double fetch_bw = fe_bound - fetch_lat;
> @@ -1286,11 +1286,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  		print_metric(config, ctxp, color, "%8.1f%%", "Fetch Bandwidth",
>  				fetch_bw * 100.);
>  	} else if (perf_stat_evsel__is(evsel, TOPDOWN_MEM_BOUND) &&
> -			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
> -		double be_bound = td_metric_ratio(cpu_map_idx,
> +			full_td(map_idx, st, &rsd) && (config->topdown_level > 1)) {
> +		double be_bound = td_metric_ratio(map_idx,
>  						  STAT_TOPDOWN_BE_BOUND, st,
>  						  &rsd);
> -		double mem_bound = td_metric_ratio(cpu_map_idx,
> +		double mem_bound = td_metric_ratio(map_idx,
>  						   STAT_TOPDOWN_MEM_BOUND, st,
>  						   &rsd);
>  		double core_bound = be_bound - mem_bound;
> @@ -1308,12 +1308,12 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  	} else if (evsel->metric_expr) {
>  		generic_metric(config, evsel->metric_expr, evsel->metric_events, NULL,
>  			       evsel->name, evsel->metric_name, NULL, 1,
> -			       cpu_map_idx, out, st);
> -	} else if (runtime_stat_n(st, STAT_NSECS, cpu_map_idx, &rsd) != 0) {
> +			       map_idx, out, st);
> +	} else if (runtime_stat_n(st, STAT_NSECS, map_idx, &rsd) != 0) {
>  		char unit = ' ';
>  		char unit_buf[10] = "/sec";
>  
> -		total = runtime_stat_avg(st, STAT_NSECS, cpu_map_idx, &rsd);
> +		total = runtime_stat_avg(st, STAT_NSECS, map_idx, &rsd);
>  		if (total)
>  			ratio = convert_unit_double(1000000000.0 * avg / total, &unit);
>  
> @@ -1321,7 +1321,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  			snprintf(unit_buf, sizeof(unit_buf), "%c/sec", unit);
>  		print_metric(config, ctxp, NULL, "%8.3f", unit_buf, ratio);
>  	} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
> -		print_smi_cost(config, cpu_map_idx, out, st, &rsd);
> +		print_smi_cost(config, map_idx, out, st, &rsd);
>  	} else {
>  		num = 0;
>  	}
> @@ -1335,7 +1335,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  			generic_metric(config, mexp->metric_expr, mexp->metric_events,
>  				       mexp->metric_refs, evsel->name, mexp->metric_name,
>  				       mexp->metric_unit, mexp->runtime,
> -				       cpu_map_idx, out, st);
> +				       map_idx, out, st);
>  		}
>  	}
>  	if (num == 0)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] perf stat: Use thread map index for shadow stat
  2022-09-26 20:07 ` [PATCH 4/6] perf stat: Use thread map index for shadow stat Namhyung Kim
@ 2022-09-28 14:49   ` James Clark
  2022-09-29  2:10   ` Ian Rogers
  1 sibling, 0 replies; 17+ messages in thread
From: James Clark @ 2022-09-28 14:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing,
	Arnaldo Carvalho de Melo, Jiri Olsa



On 26/09/2022 21:07, Namhyung Kim wrote:
> When AGGR_THREAD is active, it aggregates the values for each thread.
> Previously it used the cpu map index, which is invalid for AGGR_THREAD,
> so it had to use separate runtime stats with index 0.
> 
> But it can just use rt_stat with the thread map index.  Rename
> first_shadow_cpu_map_idx() to first_shadow_map_idx() and make it return
> the thread index.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/stat-display.c | 20 +++++++++-----------
>  tools/perf/util/stat.c         |  8 ++------
>  2 files changed, 11 insertions(+), 17 deletions(-)

Reviewed-by: James Clark <james.clark@arm.com>
> 
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 234491f43c36..570e2c04d47d 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -442,7 +442,7 @@ static void print_metric_header(struct perf_stat_config *config,
>  		fprintf(os->fh, "%*s ", config->metric_only_len, unit);
>  }
>  
> -static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
> +static int first_shadow_map_idx(struct perf_stat_config *config,
>  				struct evsel *evsel, const struct aggr_cpu_id *id)
>  {
>  	struct perf_cpu_map *cpus = evsel__cpus(evsel);
> @@ -452,6 +452,9 @@ static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
>  	if (config->aggr_mode == AGGR_NONE)
>  		return perf_cpu_map__idx(cpus, id->cpu);
>  
> +	if (config->aggr_mode == AGGR_THREAD)
> +		return id->thread;
> +
>  	if (!config->aggr_get_id)
>  		return 0;
>  
> @@ -646,7 +649,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
>  	}
>  
>  	perf_stat__print_shadow_stats(config, counter, uval,
> -				first_shadow_cpu_map_idx(config, counter, &id),
> +				first_shadow_map_idx(config, counter, &id),
>  				&out, &config->metric_events, st);
>  	if (!config->csv_output && !config->metric_only && !config->json_output) {
>  		print_noise(config, counter, noise);
> @@ -676,7 +679,7 @@ static void aggr_update_shadow(struct perf_stat_config *config,
>  				val += perf_counts(counter->counts, idx, 0)->val;
>  			}
>  			perf_stat__update_shadow_stats(counter, val,
> -					first_shadow_cpu_map_idx(config, counter, &id),
> +					first_shadow_map_idx(config, counter, &id),
>  					&rt_stat);
>  		}
>  	}
> @@ -979,14 +982,9 @@ static void print_aggr_thread(struct perf_stat_config *config,
>  			fprintf(output, "%s", prefix);
>  
>  		id = buf[thread].id;
> -		if (config->stats)
> -			printout(config, id, 0, buf[thread].counter, buf[thread].uval,
> -				 prefix, buf[thread].run, buf[thread].ena, 1.0,
> -				 &config->stats[id.thread]);
> -		else
> -			printout(config, id, 0, buf[thread].counter, buf[thread].uval,
> -				 prefix, buf[thread].run, buf[thread].ena, 1.0,
> -				 &rt_stat);
> +		printout(config, id, 0, buf[thread].counter, buf[thread].uval,
> +			 prefix, buf[thread].run, buf[thread].ena, 1.0,
> +			 &rt_stat);
>  		fputc('\n', output);
>  	}
>  
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index e1d3152ce664..21137c9d5259 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -389,12 +389,8 @@ process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
>  		}
>  
>  		if (config->aggr_mode == AGGR_THREAD) {
> -			if (config->stats)
> -				perf_stat__update_shadow_stats(evsel,
> -					count->val, 0, &config->stats[thread]);
> -			else
> -				perf_stat__update_shadow_stats(evsel,
> -					count->val, 0, &rt_stat);
> +			perf_stat__update_shadow_stats(evsel, count->val,
> +						       thread, &rt_stat);
>  		}
>  		break;
>  	case AGGR_GLOBAL:

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/6] perf stat: Kill unused per-thread runtime stats
  2022-09-26 20:07 ` [PATCH 5/6] perf stat: Kill unused per-thread runtime stats Namhyung Kim
@ 2022-09-28 14:51   ` James Clark
  0 siblings, 0 replies; 17+ messages in thread
From: James Clark @ 2022-09-28 14:51 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing,
	Arnaldo Carvalho de Melo, Jiri Olsa



On 26/09/2022 21:07, Namhyung Kim wrote:
> Now it's using the global rt_stat, so there is no need for per-thread
> stats.  Let's get rid of them.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-stat.c | 54 ---------------------------------------
>  tools/perf/util/stat.h    |  2 --
>  2 files changed, 56 deletions(-)

Reviewed-by: James Clark <james.clark@arm.com>
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index e05fe72c1d87..b86ebb25a799 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -292,13 +292,8 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
>  
>  static void perf_stat__reset_stats(void)
>  {
> -	int i;
> -
>  	evlist__reset_stats(evsel_list);
>  	perf_stat__reset_shadow_stats();
> -
> -	for (i = 0; i < stat_config.stats_num; i++)
> -		perf_stat__reset_shadow_per_stat(&stat_config.stats[i]);
>  }
>  
>  static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
> @@ -489,46 +484,6 @@ static void read_counters(struct timespec *rs)
>  	}
>  }
>  
> -static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
> -{
> -	int i;
> -
> -	config->stats = calloc(nthreads, sizeof(struct runtime_stat));
> -	if (!config->stats)
> -		return -1;
> -
> -	config->stats_num = nthreads;
> -
> -	for (i = 0; i < nthreads; i++)
> -		runtime_stat__init(&config->stats[i]);
> -
> -	return 0;
> -}
> -
> -static void runtime_stat_delete(struct perf_stat_config *config)
> -{
> -	int i;
> -
> -	if (!config->stats)
> -		return;
> -
> -	for (i = 0; i < config->stats_num; i++)
> -		runtime_stat__exit(&config->stats[i]);
> -
> -	zfree(&config->stats);
> -}
> -
> -static void runtime_stat_reset(struct perf_stat_config *config)
> -{
> -	int i;
> -
> -	if (!config->stats)
> -		return;
> -
> -	for (i = 0; i < config->stats_num; i++)
> -		perf_stat__reset_shadow_per_stat(&config->stats[i]);
> -}
> -
>  static void process_interval(void)
>  {
>  	struct timespec ts, rs;
> @@ -537,7 +492,6 @@ static void process_interval(void)
>  	diff_timespec(&rs, &ts, &ref_time);
>  
>  	perf_stat__reset_shadow_per_stat(&rt_stat);
> -	runtime_stat_reset(&stat_config);
>  	read_counters(&rs);
>  
>  	if (STAT_RECORD) {
> @@ -1018,7 +972,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  
>  		evlist__copy_prev_raw_counts(evsel_list);
>  		evlist__reset_prev_raw_counts(evsel_list);
> -		runtime_stat_reset(&stat_config);
>  		perf_stat__reset_shadow_per_stat(&rt_stat);
>  	} else {
>  		update_stats(&walltime_nsecs_stats, t1 - t0);
> @@ -2514,12 +2467,6 @@ int cmd_stat(int argc, const char **argv)
>  	 */
>  	if (stat_config.aggr_mode == AGGR_THREAD) {
>  		thread_map__read_comms(evsel_list->core.threads);
> -		if (target.system_wide) {
> -			if (runtime_stat_new(&stat_config,
> -				perf_thread_map__nr(evsel_list->core.threads))) {
> -				goto out;
> -			}
> -		}
>  	}
>  
>  	if (stat_config.aggr_mode == AGGR_NODE)
> @@ -2660,7 +2607,6 @@ int cmd_stat(int argc, const char **argv)
>  	evlist__delete(evsel_list);
>  
>  	metricgroup__rblist_exit(&stat_config.metric_events);
> -	runtime_stat_delete(&stat_config);
>  	evlist__close_control(stat_config.ctl_fd, stat_config.ctl_fd_ack, &stat_config.ctl_fd_close);
>  
>  	return status;
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index 3eba38a1a149..43cb3f13d4d6 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -153,8 +153,6 @@ struct perf_stat_config {
>  	int			 run_count;
>  	int			 print_free_counters_hint;
>  	int			 print_mixed_hw_group_error;
> -	struct runtime_stat	*stats;
> -	int			 stats_num;
>  	const char		*csv_sep;
>  	struct stats		*walltime_nsecs_stats;
>  	struct rusage		 ru_data;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/6] perf stat: Don't compare runtime stat for shadow stats
  2022-09-26 20:07 ` [PATCH 6/6] perf stat: Don't compare runtime stat for shadow stats Namhyung Kim
@ 2022-09-28 14:52   ` James Clark
  0 siblings, 0 replies; 17+ messages in thread
From: James Clark @ 2022-09-28 14:52 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing,
	Arnaldo Carvalho de Melo, Jiri Olsa



On 26/09/2022 21:07, Namhyung Kim wrote:
> Now it always uses the global rt_stat.  Let's get rid of the field from
> the saved_value.  When both evsels are NULL, the comparison returns 0
> anyway, so remove that block from saved_value_cmp().
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/stat-shadow.c | 12 ------------
>  1 file changed, 12 deletions(-)
> 
Reviewed-by: James Clark <james.clark@arm.com>

> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> index 99d05262055c..700563306637 100644
> --- a/tools/perf/util/stat-shadow.c
> +++ b/tools/perf/util/stat-shadow.c
> @@ -35,7 +35,6 @@ struct saved_value {
>  	int ctx;
>  	int map_idx;
>  	struct cgroup *cgrp;
> -	struct runtime_stat *stat;
>  	struct stats stats;
>  	u64 metric_total;
>  	int metric_other;
> @@ -67,16 +66,6 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
>  	if (a->cgrp != b->cgrp)
>  		return (char *)a->cgrp < (char *)b->cgrp ? -1 : +1;
>  
> -	if (a->evsel == NULL && b->evsel == NULL) {
> -		if (a->stat == b->stat)
> -			return 0;
> -
> -		if ((char *)a->stat < (char *)b->stat)
> -			return -1;
> -
> -		return 1;
> -	}
> -
>  	if (a->evsel == b->evsel)
>  		return 0;
>  	if ((char *)a->evsel < (char *)b->evsel)
> @@ -120,7 +109,6 @@ static struct saved_value *saved_value_lookup(struct evsel *evsel,
>  		.evsel = evsel,
>  		.type = type,
>  		.ctx = ctx,
> -		.stat = st,
>  		.cgrp = cgrp,
>  	};
>  

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/6] perf stat: Rename saved_value->cpu_map_idx
  2022-09-28 10:50   ` James Clark
@ 2022-09-28 23:56     ` Namhyung Kim
  2022-09-29  1:58       ` Ian Rogers
  0 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2022-09-28 23:56 UTC (permalink / raw)
  To: James Clark
  Cc: Ian Rogers, Ingo Molnar, Peter Zijlstra, LKML, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing,
	Arnaldo Carvalho de Melo, Jiri Olsa

Hello,

On Wed, Sep 28, 2022 at 3:50 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 26/09/2022 21:07, Namhyung Kim wrote:
> > The cpu_map_idx field is just there to differentiate values from other
> > entries.  It doesn't need to be strictly a cpu map index.  Actually we can
> > pass a thread map index or an aggr map index.  So rename the field first.
> >
> > No functional change intended.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/util/stat-shadow.c | 308 +++++++++++++++++-----------------
> >  1 file changed, 154 insertions(+), 154 deletions(-)
> >
> > diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> > index 9e1eddeff21b..99d05262055c 100644
> > --- a/tools/perf/util/stat-shadow.c
> > +++ b/tools/perf/util/stat-shadow.c
> > @@ -33,7 +33,7 @@ struct saved_value {
> >       struct evsel *evsel;
> >       enum stat_type type;
> >       int ctx;
> > -     int cpu_map_idx;
> > +     int map_idx;
>
> Do the same variables in stat.c and stat.h also need to be updated? The
> previous change to do this exact thing (5b1af93dbc7e) changed more than
> just these ones.

Thanks for your review!  I'll change the header too.

Note that callers of perf_stat__update_shadow_stats() are free
to use cpu_map_idx as they want.  The previous change fixed the
confusion between cpu number and map index.  Either one is fine
for us as long as it's used consistently, but we use the cpu map
index in most cases.
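
To make it concrete, the two call sites end up like this after the
series (a sketch lifted from the patch 4 hunks, not the verbatim
files):

	/* process_counter_values(): AGGR_THREAD keys the shadow
	 * stats by the thread map index */
	if (config->aggr_mode == AGGR_THREAD) {
		perf_stat__update_shadow_stats(evsel, count->val,
					       thread, &rt_stat);
	}

	/* aggr_update_shadow(): other modes pass a cpu (or aggr)
	 * map index instead */
	perf_stat__update_shadow_stats(counter, val,
			first_shadow_map_idx(config, counter, &id),
			&rt_stat);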

Thanks,
Namhyung


* Re: [PATCH 3/6] perf stat: Rename saved_value->cpu_map_idx
  2022-09-28 23:56     ` Namhyung Kim
@ 2022-09-29  1:58       ` Ian Rogers
  0 siblings, 0 replies; 17+ messages in thread
From: Ian Rogers @ 2022-09-29  1:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: James Clark, Ingo Molnar, Peter Zijlstra, LKML, Adrian Hunter,
	linux-perf-users, Andi Kleen, Kan Liang, Leo Yan, Zhengjun Xing,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Wed, Sep 28, 2022 at 4:57 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> On Wed, Sep 28, 2022 at 3:50 AM James Clark <james.clark@arm.com> wrote:
> >
> >
> >
> > On 26/09/2022 21:07, Namhyung Kim wrote:
> > > The cpu_map_idx field is just there to differentiate values from other
> > > entries.  It doesn't need to be strictly a cpu map index; we can
> > > actually pass a thread map index or an aggr map index.  So rename the
> > > field first.
> > >
> > > No functional change intended.
> > >
> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > ---
> > >  tools/perf/util/stat-shadow.c | 308 +++++++++++++++++-----------------
> > >  1 file changed, 154 insertions(+), 154 deletions(-)
> > >
> > > diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> > > index 9e1eddeff21b..99d05262055c 100644
> > > --- a/tools/perf/util/stat-shadow.c
> > > +++ b/tools/perf/util/stat-shadow.c
> > > @@ -33,7 +33,7 @@ struct saved_value {
> > >       struct evsel *evsel;
> > >       enum stat_type type;
> > >       int ctx;
> > > -     int cpu_map_idx;
> > > +     int map_idx;
> >
> > Do the same variables in stat.c and stat.h also need to be updated? The
> > previous change to do this exact thing (5b1af93dbc7e) changed more than
> > just these ones.
>
> Thanks for your review!  I'll change the header too.
>
> Note that callers of perf_stat__update_shadow_stats() are free
> to use cpu_map_idx as they want.  The previous change fixed the
> confusion between cpu number and map index.  Either one is fine
> for us as long as it's used consistently, but we use the cpu map
> index in most cases.
>
> Thanks,
> Namhyung

It is only fine to interchange a CPU and a CPU map index if the CPU map
contains all CPUs and not the "any CPU" entry. I wonder if we should
introduce a 'struct thread' to wrap the pid_t in thread maps, to avoid
swapping threads and indices? Given that pids are not generally in the
same range as indices (unlike CPU numbers, which substantially broke
aggregation), it is much less likely this is broken. In any case it is
worth documenting map_idx to say what indices it is expected to hold.
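
Purely as a sketch of the idea (nothing like this exists in the tree
today; the names are made up):

	/*
	 * Hypothetical wrapper: make a thread map entry a distinct type
	 * so that passing a map index where a thread is expected becomes
	 * a compile error instead of a silent aggregation bug.
	 */
	struct perf_thread {
		pid_t pid;
	};

	static inline pid_t perf_thread__pid(const struct perf_thread *thread)
	{
		return thread->pid;
	}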

Thanks,
Ian


* Re: [PATCH 4/6] perf stat: Use thread map index for shadow stat
  2022-09-26 20:07 ` [PATCH 4/6] perf stat: Use thread map index for shadow stat Namhyung Kim
  2022-09-28 14:49   ` James Clark
@ 2022-09-29  2:10   ` Ian Rogers
  2022-09-29  4:55     ` Namhyung Kim
  1 sibling, 1 reply; 17+ messages in thread
From: Ian Rogers @ 2022-09-29  2:10 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Andi Kleen, Kan Liang,
	Leo Yan, Zhengjun Xing

On Mon, Sep 26, 2022 at 1:08 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> When AGGR_THREAD is active, it aggregates the values for each thread.
> Previously it used the cpu map index, which is invalid for AGGR_THREAD,
> so it had to use separate runtime stats with index 0.
>
> But it can just use rt_stat with the thread map index.  Rename
> first_shadow_cpu_map_idx() to first_shadow_map_idx() and make it
> return the thread index.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/stat-display.c | 20 +++++++++-----------
>  tools/perf/util/stat.c         |  8 ++------
>  2 files changed, 11 insertions(+), 17 deletions(-)
>
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 234491f43c36..570e2c04d47d 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -442,7 +442,7 @@ static void print_metric_header(struct perf_stat_config *config,
>                 fprintf(os->fh, "%*s ", config->metric_only_len, unit);
>  }
>
> -static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
> +static int first_shadow_map_idx(struct perf_stat_config *config,
>                                 struct evsel *evsel, const struct aggr_cpu_id *id)
>  {
>         struct perf_cpu_map *cpus = evsel__cpus(evsel);
> @@ -452,6 +452,9 @@ static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
>         if (config->aggr_mode == AGGR_NONE)
>                 return perf_cpu_map__idx(cpus, id->cpu);
>
> +       if (config->aggr_mode == AGGR_THREAD)
> +               return id->thread;

The function's name implies returning an index but that isn't clear
here. Can we change the aggr_cpu_id's thread to be called thread_idx?
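
Something like this, I mean (a sketch of the struct in cpumap.h from
memory, so treat the surrounding fields as approximate):

	struct aggr_cpu_id {
		int thread_idx;		/* was: int thread */
		int node;
		int socket;
		int die;
		int core;
		struct perf_cpu cpu;
	};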

Thanks,
Ian

> +
>         if (!config->aggr_get_id)
>                 return 0;
>
> @@ -646,7 +649,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
>         }
>
>         perf_stat__print_shadow_stats(config, counter, uval,
> -                               first_shadow_cpu_map_idx(config, counter, &id),
> +                               first_shadow_map_idx(config, counter, &id),
>                                 &out, &config->metric_events, st);
>         if (!config->csv_output && !config->metric_only && !config->json_output) {
>                 print_noise(config, counter, noise);
> @@ -676,7 +679,7 @@ static void aggr_update_shadow(struct perf_stat_config *config,
>                                 val += perf_counts(counter->counts, idx, 0)->val;
>                         }
>                         perf_stat__update_shadow_stats(counter, val,
> -                                       first_shadow_cpu_map_idx(config, counter, &id),
> +                                       first_shadow_map_idx(config, counter, &id),
>                                         &rt_stat);
>                 }
>         }
> @@ -979,14 +982,9 @@ static void print_aggr_thread(struct perf_stat_config *config,
>                         fprintf(output, "%s", prefix);
>
>                 id = buf[thread].id;
> -               if (config->stats)
> -                       printout(config, id, 0, buf[thread].counter, buf[thread].uval,
> -                                prefix, buf[thread].run, buf[thread].ena, 1.0,
> -                                &config->stats[id.thread]);
> -               else
> -                       printout(config, id, 0, buf[thread].counter, buf[thread].uval,
> -                                prefix, buf[thread].run, buf[thread].ena, 1.0,
> -                                &rt_stat);
> +               printout(config, id, 0, buf[thread].counter, buf[thread].uval,
> +                        prefix, buf[thread].run, buf[thread].ena, 1.0,
> +                        &rt_stat);
>                 fputc('\n', output);
>         }
>
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index e1d3152ce664..21137c9d5259 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -389,12 +389,8 @@ process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
>                 }
>
>                 if (config->aggr_mode == AGGR_THREAD) {
> -                       if (config->stats)
> -                               perf_stat__update_shadow_stats(evsel,
> -                                       count->val, 0, &config->stats[thread]);
> -                       else
> -                               perf_stat__update_shadow_stats(evsel,
> -                                       count->val, 0, &rt_stat);
> +                       perf_stat__update_shadow_stats(evsel, count->val,
> +                                                      thread, &rt_stat);
>                 }
>                 break;
>         case AGGR_GLOBAL:
> --
> 2.37.3.998.g577e59143f-goog
>


* Re: [PATCH 4/6] perf stat: Use thread map index for shadow stat
  2022-09-29  2:10   ` Ian Rogers
@ 2022-09-29  4:55     ` Namhyung Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Namhyung Kim @ 2022-09-29  4:55 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Andi Kleen, Kan Liang,
	Leo Yan, Zhengjun Xing

Hi Ian,

On Wed, Sep 28, 2022 at 7:10 PM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Sep 26, 2022 at 1:08 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > When AGGR_THREAD is active, it aggregates the values for each thread.
> > Previously it used the cpu map index, which is invalid for AGGR_THREAD,
> > so it had to use separate runtime stats with index 0.
> >
> > But it can just use rt_stat with the thread map index.  Rename
> > first_shadow_cpu_map_idx() to first_shadow_map_idx() and make it
> > return the thread index.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/util/stat-display.c | 20 +++++++++-----------
> >  tools/perf/util/stat.c         |  8 ++------
> >  2 files changed, 11 insertions(+), 17 deletions(-)
> >
> > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > index 234491f43c36..570e2c04d47d 100644
> > --- a/tools/perf/util/stat-display.c
> > +++ b/tools/perf/util/stat-display.c
> > @@ -442,7 +442,7 @@ static void print_metric_header(struct perf_stat_config *config,
> >                 fprintf(os->fh, "%*s ", config->metric_only_len, unit);
> >  }
> >
> > -static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
> > +static int first_shadow_map_idx(struct perf_stat_config *config,
> >                                 struct evsel *evsel, const struct aggr_cpu_id *id)
> >  {
> >         struct perf_cpu_map *cpus = evsel__cpus(evsel);
> > @@ -452,6 +452,9 @@ static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
> >         if (config->aggr_mode == AGGR_NONE)
> >                 return perf_cpu_map__idx(cpus, id->cpu);
> >
> > +       if (config->aggr_mode == AGGR_THREAD)
> > +               return id->thread;
>
> The function's name implies returning an index but that isn't clear
> here. Can we change the aggr_cpu_id's thread to be called thread_idx?

Right, I'll do that in a separate commit.

Thanks,
Namhyung


Thread overview: 17+ messages
2022-09-26 20:07 [PATCHSET 0/6] perf stat: Small random cleanups (v1) Namhyung Kim
2022-09-26 20:07 ` [PATCH 1/6] perf stat: Convert perf_stat_evsel.res_stats array Namhyung Kim
2022-09-28 10:33   ` James Clark
2022-09-26 20:07 ` [PATCH 2/6] perf stat: Don't call perf_stat_evsel_id_init() repeatedly Namhyung Kim
2022-09-28 10:41   ` James Clark
2022-09-26 20:07 ` [PATCH 3/6] perf stat: Rename saved_value->cpu_map_idx Namhyung Kim
2022-09-28 10:50   ` James Clark
2022-09-28 23:56     ` Namhyung Kim
2022-09-29  1:58       ` Ian Rogers
2022-09-26 20:07 ` [PATCH 4/6] perf stat: Use thread map index for shadow stat Namhyung Kim
2022-09-28 14:49   ` James Clark
2022-09-29  2:10   ` Ian Rogers
2022-09-29  4:55     ` Namhyung Kim
2022-09-26 20:07 ` [PATCH 5/6] perf stat: Kill unused per-thread runtime stats Namhyung Kim
2022-09-28 14:51   ` James Clark
2022-09-26 20:07 ` [PATCH 6/6] perf stat: Don't compare runtime stat for shadow stats Namhyung Kim
2022-09-28 14:52   ` James Clark
