All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] perf stat: Support overall statistics for interval mode
@ 2020-05-08  7:58 Jin Yao
  2020-05-08  7:58 ` [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Jin Yao @ 2020-05-08  7:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Currently perf-stat supports to print counts at regular interval (-I),
but it's not very easy for user to get the overall statistics.

With this patchset, it supports to report the summary at the end of
interval output.

For example,

 root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
 #           time             counts unit events
      1.000412064          2,281,114      cycles
      2.001383658          2,547,880      cycles

  Performance counter stats for 'system wide':

          4,828,994      cycles

        2.002860349 seconds time elapsed

 root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
 #           time             counts unit events
      1.000389902          1,536,093      cycles
      1.000389902            420,226      instructions              #    0.27  insn per cycle
      2.001433453          2,213,952      cycles
      2.001433453            735,465      instructions              #    0.33  insn per cycle

  Performance counter stats for 'system wide':

          3,750,045      cycles
          1,155,691      instructions              #    0.31  insn per cycle

        2.003023361 seconds time elapsed

 root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
 #           time             counts unit events
      1.000435121            905,303      inst_retired.any          #      2.9 CPI
      1.000435121          2,663,333      cycles
      1.000435121            914,702      inst_retired.any          #      0.3 IPC
      1.000435121          2,676,559      cpu_clk_unhalted.thread
      2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
      2.001615941          3,551,357      cycles
      2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
      2.001615941          3,551,044      cpu_clk_unhalted.thread

  Performance counter stats for 'system wide':

          2,856,395      inst_retired.any          #      2.2 CPI
          6,214,690      cycles
          2,865,539      inst_retired.any          #      0.5 IPC
          6,227,603      cpu_clk_unhalted.thread

        2.003403078 seconds time elapsed

 v4:
 ---
 1. Create runtime_stat_reset.

 2. Zero the aggr in perf_counts__reset and use it to reset
    prev_raw_counts.

 3. Move affinity setup and read_counter_cpu to a new function
    read_affinity_counters. It's only called when stat_config.summary
    is not set.

 v3:
 ---
 1. 'perf stat: Fix wrong per-thread runtime stat for interval mode'
    is a new patch which fixes an existing issue found in test.

 2. We use the prev_raw_counts for summary counts. Drop the summary_counts in v2.

 3. Fix some issues.

 v2:
 ---
 Rebase to perf/core branch

Jin Yao (4):
  perf stat: Fix wrong per-thread runtime stat for interval mode
  perf counts: Reset prev_raw_counts counts
  perf stat: Copy counts from prev_raw_counts to evsel->counts
  perf stat: Report summary for interval mode

 tools/perf/builtin-stat.c | 97 ++++++++++++++++++++++++++-------------
 tools/perf/util/counts.c  |  4 +-
 tools/perf/util/counts.h  |  1 +
 tools/perf/util/evsel.c   |  1 +
 tools/perf/util/stat.c    | 33 ++++++++++---
 tools/perf/util/stat.h    |  2 +
 6 files changed, 99 insertions(+), 39 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat for interval mode
  2020-05-08  7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
@ 2020-05-08  7:58 ` Jin Yao
  2020-05-08  7:58 ` [PATCH v4 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Jin Yao @ 2020-05-08  7:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

root@kbl-ppc:~# perf stat --per-thread -e cycles,instructions -I1000 --interval-count 2
     1.004171683             perf-3696              8,747,311      cycles
        ...
     1.004171683             perf-3696                691,730      instructions              #    0.08  insn per cycle
        ...
     2.006490373             perf-3696              1,749,936      cycles
        ...
     2.006490373             perf-3696              1,484,582      instructions              #    0.28  insn per cycle
        ...

Let's see interval 2.006490373

perf-3696              1,749,936      cycles
perf-3696              1,484,582      instructions              #    0.28  insn per cycle

insn per cycle = 1,484,582 / 1,749,936 = 0.85.
But now it's 0.28, that's not correct.

stat_config.stats[] records the per-thread runtime stat. But for interval
mode, it should be reset for each interval.

So now, with this patch,

root@kbl-ppc:~# perf stat --per-thread -e cycles,instructions -I1000 --interval-count 2
     1.005818121             perf-8633              9,898,045      cycles
        ...
     1.005818121             perf-8633                693,298      instructions              #    0.07  insn per cycle
        ...
     2.007863743             perf-8633              1,551,619      cycles
        ...
     2.007863743             perf-8633              1,317,514      instructions              #    0.85  insn per cycle
        ...

Let's check interval 2.007863743.

insn per cycle = 1,317,514 / 1,551,619 = 0.85. It's correct.

This patch creates runtime_stat_reset, places it next to
untime_stat_new/runtime_stat_delete and moves all runtime_stat
functions before process_interval.

 v4:
 ---
 Create runtime_stat_reset.

Fixes: commit 14e72a21c783 ("perf stat: Update or print per-thread stats")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-stat.c | 70 +++++++++++++++++++++++----------------
 1 file changed, 41 insertions(+), 29 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e0c1ad23c768..f3b3a59ac7d2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -351,6 +351,46 @@ static void read_counters(struct timespec *rs)
 	}
 }
 
+static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
+{
+	int i;
+
+	config->stats = calloc(nthreads, sizeof(struct runtime_stat));
+	if (!config->stats)
+		return -1;
+
+	config->stats_num = nthreads;
+
+	for (i = 0; i < nthreads; i++)
+		runtime_stat__init(&config->stats[i]);
+
+	return 0;
+}
+
+static void runtime_stat_delete(struct perf_stat_config *config)
+{
+	int i;
+
+	if (!config->stats)
+		return;
+
+	for (i = 0; i < config->stats_num; i++)
+		runtime_stat__exit(&config->stats[i]);
+
+	zfree(&config->stats);
+}
+
+static void runtime_stat_reset(struct perf_stat_config *config)
+{
+	int i;
+
+	if (!config->stats)
+		return;
+
+	for (i = 0; i < config->stats_num; i++)
+		perf_stat__reset_shadow_per_stat(&config->stats[i]);
+}
+
 static void process_interval(void)
 {
 	struct timespec ts, rs;
@@ -359,6 +399,7 @@ static void process_interval(void)
 	diff_timespec(&rs, &ts, &ref_time);
 
 	perf_stat__reset_shadow_per_stat(&rt_stat);
+	runtime_stat_reset(&stat_config);
 	read_counters(&rs);
 
 	if (STAT_RECORD) {
@@ -1737,35 +1778,6 @@ int process_cpu_map_event(struct perf_session *session,
 	return set_maps(st);
 }
 
-static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
-{
-	int i;
-
-	config->stats = calloc(nthreads, sizeof(struct runtime_stat));
-	if (!config->stats)
-		return -1;
-
-	config->stats_num = nthreads;
-
-	for (i = 0; i < nthreads; i++)
-		runtime_stat__init(&config->stats[i]);
-
-	return 0;
-}
-
-static void runtime_stat_delete(struct perf_stat_config *config)
-{
-	int i;
-
-	if (!config->stats)
-		return;
-
-	for (i = 0; i < config->stats_num; i++)
-		runtime_stat__exit(&config->stats[i]);
-
-	zfree(&config->stats);
-}
-
 static const char * const stat_report_usage[] = {
 	"perf stat report [<options>]",
 	NULL,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v4 2/4] perf counts: Reset prev_raw_counts counts
  2020-05-08  7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
  2020-05-08  7:58 ` [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
@ 2020-05-08  7:58 ` Jin Yao
  2020-05-08  7:58 ` [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
  2020-05-08  7:58 ` [PATCH v4 4/4] perf stat: Report summary for interval mode Jin Yao
  3 siblings, 0 replies; 7+ messages in thread
From: Jin Yao @ 2020-05-08  7:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

When we want to reset the evsel->prev_raw_counts, zeroing the aggr
is not enough, we need to reset the perf_counts too.

The perf_counts__reset zeros the perf_counts, and it should zero
the aggr too. This patch changes perf_counts__reset to non-static,
and calls it in evsel__reset_prev_raw_counts to reset the
prev_raw_counts.

 v4:
 ---
 Zeroing the aggr in perf_counts__reset and use it to reset
 prev_raw_counts.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/counts.c | 4 +++-
 tools/perf/util/counts.h | 1 +
 tools/perf/util/stat.c   | 7 ++-----
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/counts.c b/tools/perf/util/counts.c
index 615c9f3e95cb..582f3aeaf5e4 100644
--- a/tools/perf/util/counts.c
+++ b/tools/perf/util/counts.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <errno.h>
 #include <stdlib.h>
+#include <string.h>
 #include "evsel.h"
 #include "counts.h"
 #include <linux/zalloc.h>
@@ -42,10 +43,11 @@ void perf_counts__delete(struct perf_counts *counts)
 	}
 }
 
-static void perf_counts__reset(struct perf_counts *counts)
+void perf_counts__reset(struct perf_counts *counts)
 {
 	xyarray__reset(counts->loaded);
 	xyarray__reset(counts->values);
+	memset(&counts->aggr, 0, sizeof(struct perf_counts_values));
 }
 
 void evsel__reset_counts(struct evsel *evsel)
diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
index 8f556c6d98fa..7ff36bf6d644 100644
--- a/tools/perf/util/counts.h
+++ b/tools/perf/util/counts.h
@@ -37,6 +37,7 @@ perf_counts__set_loaded(struct perf_counts *counts, int cpu, int thread, bool lo
 
 struct perf_counts *perf_counts__new(int ncpus, int nthreads);
 void perf_counts__delete(struct perf_counts *counts);
+void perf_counts__reset(struct perf_counts *counts);
 
 void evsel__reset_counts(struct evsel *evsel);
 int evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index f4a44df9b221..e397815f0dfb 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -163,11 +163,8 @@ static void evsel__free_prev_raw_counts(struct evsel *evsel)
 
 static void evsel__reset_prev_raw_counts(struct evsel *evsel)
 {
-	if (evsel->prev_raw_counts) {
-		evsel->prev_raw_counts->aggr.val = 0;
-		evsel->prev_raw_counts->aggr.ena = 0;
-		evsel->prev_raw_counts->aggr.run = 0;
-       }
+	if (evsel->prev_raw_counts)
+		perf_counts__reset(evsel->prev_raw_counts);
 }
 
 static int evsel__alloc_stats(struct evsel *evsel, bool alloc_raw)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
  2020-05-08  7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
  2020-05-08  7:58 ` [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
  2020-05-08  7:58 ` [PATCH v4 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
@ 2020-05-08  7:58 ` Jin Yao
  2020-05-13 15:31   ` Jiri Olsa
  2020-05-08  7:58 ` [PATCH v4 4/4] perf stat: Report summary for interval mode Jin Yao
  3 siblings, 1 reply; 7+ messages in thread
From: Jin Yao @ 2020-05-08  7:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It would be useful to support the overall statistics for perf-stat
interval mode. For example, report the summary at the end of
"perf-stat -I" output.

But since perf-stat can support many aggregation modes, such as
--per-thread, --per-socket, -M and etc, we need a solution which
doesn't bring much complexity.

The idea is to use 'evsel->prev_raw_counts' which is updated in
each interval and it's saved with the latest counts. Before reporting
the summary, we copy the counts from evsel->prev_raw_counts to
evsel->counts, and next we just follow non-interval processing.

In evsel__compute_deltas, this patch saves counts to the member
[cpu0,thread0] of perf_counts for AGGR_GLOBAL.

That's because after copying evsel->prev_raw_counts to evsel->counts,
perf_counts(evsel->counts, cpu, thread) are all 0 for AGGR_GLOBAL.
Once we go to process_counter_maps again, all members of perf_counts
are 0.

So this patch uses a trick that saves the previous aggr value to
the member [cpu0,thread0] of perf_counts, then aggr calculation
in process_counter_values can work correctly.

 v4:
 ---
 Change the commit message.
 No functional change.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/evsel.c |  1 +
 tools/perf/util/stat.c  | 24 ++++++++++++++++++++++++
 tools/perf/util/stat.h  |  1 +
 3 files changed, 26 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 28683b0eb738..6fae1ec28886 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1283,6 +1283,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
 	if (cpu == -1) {
 		tmp = evsel->prev_raw_counts->aggr;
 		evsel->prev_raw_counts->aggr = *count;
+		*perf_counts(evsel->prev_raw_counts, 0, 0) = *count;
 	} else {
 		tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
 		*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index e397815f0dfb..aadc723ce871 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -225,6 +225,30 @@ void perf_evlist__reset_prev_raw_counts(struct evlist *evlist)
 		evsel__reset_prev_raw_counts(evsel);
 }
 
+static void perf_evsel__copy_prev_raw_counts(struct evsel *evsel)
+{
+	int ncpus = evsel__nr_cpus(evsel);
+	int nthreads = perf_thread_map__nr(evsel->core.threads);
+
+	for (int thread = 0; thread < nthreads; thread++) {
+		for (int cpu = 0; cpu < ncpus; cpu++) {
+			*perf_counts(evsel->counts, cpu, thread) =
+				*perf_counts(evsel->prev_raw_counts, cpu,
+					     thread);
+		}
+	}
+
+	evsel->counts->aggr = evsel->prev_raw_counts->aggr;
+}
+
+void perf_evlist__copy_prev_raw_counts(struct evlist *evlist)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(evlist, evsel)
+		perf_evsel__copy_prev_raw_counts(evsel);
+}
+
 static void zero_per_pkg(struct evsel *counter)
 {
 	if (counter->per_pkg_mask)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index b4fdfaa7f2c0..62cf72c71869 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -198,6 +198,7 @@ int perf_evlist__alloc_stats(struct evlist *evlist, bool alloc_raw);
 void perf_evlist__free_stats(struct evlist *evlist);
 void perf_evlist__reset_stats(struct evlist *evlist);
 void perf_evlist__reset_prev_raw_counts(struct evlist *evlist);
+void perf_evlist__copy_prev_raw_counts(struct evlist *evlist);
 
 int perf_stat_process_counter(struct perf_stat_config *config,
 			      struct evsel *counter);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v4 4/4] perf stat: Report summary for interval mode
  2020-05-08  7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
                   ` (2 preceding siblings ...)
  2020-05-08  7:58 ` [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
@ 2020-05-08  7:58 ` Jin Yao
  3 siblings, 0 replies; 7+ messages in thread
From: Jin Yao @ 2020-05-08  7:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Currently perf-stat supports to print counts at regular interval (-I),
but it's not very easy for user to get the overall statistics.

The patch uses 'evsel->prev_raw_counts' to get counts for summary.
Copy the counts to 'evsel->counts' after printing the interval results.
Next, we just follow the non-interval processing.

Let's see some examples,

 root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
 #           time             counts unit events
      1.000412064          2,281,114      cycles
      2.001383658          2,547,880      cycles

  Performance counter stats for 'system wide':

          4,828,994      cycles

        2.002860349 seconds time elapsed

 root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
 #           time             counts unit events
      1.000389902          1,536,093      cycles
      1.000389902            420,226      instructions              #    0.27  insn per cycle
      2.001433453          2,213,952      cycles
      2.001433453            735,465      instructions              #    0.33  insn per cycle

  Performance counter stats for 'system wide':

          3,750,045      cycles
          1,155,691      instructions              #    0.31  insn per cycle

        2.003023361 seconds time elapsed

 root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
 #           time             counts unit events
      1.000435121            905,303      inst_retired.any          #      2.9 CPI
      1.000435121          2,663,333      cycles
      1.000435121            914,702      inst_retired.any          #      0.3 IPC
      1.000435121          2,676,559      cpu_clk_unhalted.thread
      2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
      2.001615941          3,551,357      cycles
      2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
      2.001615941          3,551,044      cpu_clk_unhalted.thread

  Performance counter stats for 'system wide':

          2,856,395      inst_retired.any          #      2.2 CPI
          6,214,690      cycles
          2,865,539      inst_retired.any          #      0.5 IPC
          6,227,603      cpu_clk_unhalted.thread

        2.003403078 seconds time elapsed

 v4:
 ---
 Move affinity setup and read_counter_cpu to a new function
 read_affinity_counters. It's only called when stat_config.summary
 is not set.

 v3:
 ---
 Use evsel->prev_raw_counts for summary counts

 v2:
 ---
 Rebase to perf/core branch

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-stat.c | 27 ++++++++++++++++++++++++---
 tools/perf/util/stat.c    |  2 +-
 tools/perf/util/stat.h    |  1 +
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f3b3a59ac7d2..d6a6aa6b997d 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -314,14 +314,14 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu)
 	return 0;
 }
 
-static void read_counters(struct timespec *rs)
+static int read_affinity_counters(struct timespec *rs)
 {
 	struct evsel *counter;
 	struct affinity affinity;
 	int i, ncpus, cpu;
 
 	if (affinity__setup(&affinity) < 0)
-		return;
+		return -1;
 
 	ncpus = perf_cpu_map__nr(evsel_list->core.all_cpus);
 	if (!target__has_cpu(&target) || target__has_per_thread(&target))
@@ -341,6 +341,15 @@ static void read_counters(struct timespec *rs)
 		}
 	}
 	affinity__cleanup(&affinity);
+	return 0;
+}
+
+static void read_counters(struct timespec *rs)
+{
+	struct evsel *counter;
+
+	if (!stat_config.summary && (read_affinity_counters(rs) < 0))
+		return;
 
 	evlist__for_each_entry(evsel_list, counter) {
 		if (counter->err)
@@ -394,6 +403,7 @@ static void runtime_stat_reset(struct perf_stat_config *config)
 static void process_interval(void)
 {
 	struct timespec ts, rs;
+	struct stats walltime_nsecs_stats_bak;
 
 	clock_gettime(CLOCK_MONOTONIC, &ts);
 	diff_timespec(&rs, &ts, &ref_time);
@@ -407,9 +417,11 @@ static void process_interval(void)
 			pr_err("failed to write stat round event\n");
 	}
 
+	walltime_nsecs_stats_bak = walltime_nsecs_stats;
 	init_stats(&walltime_nsecs_stats);
 	update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000);
 	print_counters(&rs, 0, NULL);
+	walltime_nsecs_stats = walltime_nsecs_stats_bak;
 }
 
 static void enable_counters(void)
@@ -765,6 +777,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 
 	update_stats(&walltime_nsecs_stats, t1 - t0);
 
+	if (interval) {
+		stat_config.interval = 0;
+		stat_config.summary = true;
+		perf_evlist__copy_prev_raw_counts(evsel_list);
+		perf_evlist__reset_prev_raw_counts(evsel_list);
+		runtime_stat_reset(&stat_config);
+		perf_stat__reset_shadow_per_stat(&rt_stat);
+	}
+
 	/*
 	 * Closing a group leader splits the group, and as we only disable
 	 * group leaders, results in remaining events becoming enabled. To
@@ -2159,7 +2180,7 @@ int cmd_stat(int argc, const char **argv)
 		}
 	}
 
-	if (!forever && status != -1 && !interval)
+	if (!forever && status != -1 && (!interval || stat_config.summary))
 		print_counters(NULL, argc, argv);
 
 	if (STAT_RECORD) {
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index aadc723ce871..49e832a9c109 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -388,7 +388,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 	 * interval mode, otherwise overall avg running
 	 * averages will be shown for each interval.
 	 */
-	if (config->interval) {
+	if (config->interval || config->summary) {
 		for (i = 0; i < 3; i++)
 			init_stats(&ps->res_stats[i]);
 	}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 62cf72c71869..c60e9e5d6474 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -110,6 +110,7 @@ struct perf_stat_config {
 	bool			 all_kernel;
 	bool			 all_user;
 	bool			 percore_show_thread;
+	bool			 summary;
 	FILE			*output;
 	unsigned int		 interval;
 	unsigned int		 timeout;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
  2020-05-08  7:58 ` [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
@ 2020-05-13 15:31   ` Jiri Olsa
  2020-05-14  5:42     ` Jin, Yao
  0 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2020-05-13 15:31 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Fri, May 08, 2020 at 03:58:16PM +0800, Jin Yao wrote:
> It would be useful to support the overall statistics for perf-stat
> interval mode. For example, report the summary at the end of
> "perf-stat -I" output.
> 
> But since perf-stat can support many aggregation modes, such as
> --per-thread, --per-socket, -M and etc, we need a solution which
> doesn't bring much complexity.
> 
> The idea is to use 'evsel->prev_raw_counts' which is updated in
> each interval and it's saved with the latest counts. Before reporting
> the summary, we copy the counts from evsel->prev_raw_counts to
> evsel->counts, and next we just follow non-interval processing.
> 
> In evsel__compute_deltas, this patch saves counts to the member
> [cpu0,thread0] of perf_counts for AGGR_GLOBAL.
> 
> That's because after copying evsel->prev_raw_counts to evsel->counts,
> perf_counts(evsel->counts, cpu, thread) are all 0 for AGGR_GLOBAL.
> Once we go to process_counter_maps again, all members of perf_counts
> are 0.
> 
> So this patch uses a trick that saves the previous aggr value to
> the member [cpu0,thread0] of perf_counts, then aggr calculation
> in process_counter_values can work correctly.
> 
>  v4:
>  ---
>  Change the commit message.
>  No functional change.
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/util/evsel.c |  1 +
>  tools/perf/util/stat.c  | 24 ++++++++++++++++++++++++
>  tools/perf/util/stat.h  |  1 +
>  3 files changed, 26 insertions(+)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 28683b0eb738..6fae1ec28886 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1283,6 +1283,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
>  	if (cpu == -1) {
>  		tmp = evsel->prev_raw_counts->aggr;
>  		evsel->prev_raw_counts->aggr = *count;
> +		*perf_counts(evsel->prev_raw_counts, 0, 0) = *count;

ok, I think I understand that now.. it's only for AGGR_GLOBAL mode,
because the perf_stat_process_counter will create aggr values from
per cpu values

but why do we need to do that all the time? can't we just set it up
before you zero prev_raw_counts in next patch?


        if (interval) {
                stat_config.interval = 0;
                stat_config.summary = true;
                perf_evlist__copy_prev_raw_counts(evsel_list);

	-> for AGGR_GLOBAL set the counts[0,0] to prev_raw_counts->aggr

                perf_evlist__reset_prev_raw_counts(evsel_list);
                runtime_stat_reset(&stat_config);
                perf_stat__reset_shadow_per_stat(&rt_stat);
        }


thanks,
jirka


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
  2020-05-13 15:31   ` Jiri Olsa
@ 2020-05-14  5:42     ` Jin, Yao
  0 siblings, 0 replies; 7+ messages in thread
From: Jin, Yao @ 2020-05-14  5:42 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Jiri,

On 5/13/2020 11:31 PM, Jiri Olsa wrote:
> On Fri, May 08, 2020 at 03:58:16PM +0800, Jin Yao wrote:
>> It would be useful to support the overall statistics for perf-stat
>> interval mode. For example, report the summary at the end of
>> "perf-stat -I" output.
>>
>> But since perf-stat can support many aggregation modes, such as
>> --per-thread, --per-socket, -M and etc, we need a solution which
>> doesn't bring much complexity.
>>
>> The idea is to use 'evsel->prev_raw_counts' which is updated in
>> each interval and it's saved with the latest counts. Before reporting
>> the summary, we copy the counts from evsel->prev_raw_counts to
>> evsel->counts, and next we just follow non-interval processing.
>>
>> In evsel__compute_deltas, this patch saves counts to the member
>> [cpu0,thread0] of perf_counts for AGGR_GLOBAL.
>>
>> That's because after copying evsel->prev_raw_counts to evsel->counts,
>> perf_counts(evsel->counts, cpu, thread) are all 0 for AGGR_GLOBAL.
>> Once we go to process_counter_maps again, all members of perf_counts
>> are 0.
>>
>> So this patch uses a trick that saves the previous aggr value to
>> the member [cpu0,thread0] of perf_counts, then aggr calculation
>> in process_counter_values can work correctly.
>>
>>   v4:
>>   ---
>>   Change the commit message.
>>   No functional change.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/util/evsel.c |  1 +
>>   tools/perf/util/stat.c  | 24 ++++++++++++++++++++++++
>>   tools/perf/util/stat.h  |  1 +
>>   3 files changed, 26 insertions(+)
>>
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index 28683b0eb738..6fae1ec28886 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -1283,6 +1283,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
>>   	if (cpu == -1) {
>>   		tmp = evsel->prev_raw_counts->aggr;
>>   		evsel->prev_raw_counts->aggr = *count;
>> +		*perf_counts(evsel->prev_raw_counts, 0, 0) = *count;
> 
> ok, I think I understand that now.. it's only for AGGR_GLOBAL mode,
> because the perf_stat_process_counter will create aggr values from
> per cpu values
> 
> but why do we need to do that all the time? can't we just set it up
> before you zero prev_raw_counts in next patch?
> 
> 
>          if (interval) {
>                  stat_config.interval = 0;
>                  stat_config.summary = true;
>                  perf_evlist__copy_prev_raw_counts(evsel_list);
> 
> 	-> for AGGR_GLOBAL set the counts[0,0] to prev_raw_counts->aggr
> 
>                  perf_evlist__reset_prev_raw_counts(evsel_list);
>                  runtime_stat_reset(&stat_config);
>                  perf_stat__reset_shadow_per_stat(&rt_stat);
>          }
>


Yes, I think that's a good idea.

Now in v5, I create a new patch "perf stat: Save aggr value to first member of 
prev_raw_counts" to save aggr value to first member of prev_raw_counts for 
AGGR_GLOBAL. Then next, perf_stat_process_counter can create aggr values from 
per cpu values successfully.

Thanks
Jin Yao

> 
> thanks,
> jirka
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-05-14  5:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-08  7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
2020-05-08  7:58 ` [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
2020-05-08  7:58 ` [PATCH v4 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
2020-05-08  7:58 ` [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
2020-05-13 15:31   ` Jiri Olsa
2020-05-14  5:42     ` Jin, Yao
2020-05-08  7:58 ` [PATCH v4 4/4] perf stat: Report summary for interval mode Jin Yao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.