linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/4] perf stat: Support overall statistics for interval mode
@ 2020-05-07  6:58 Jin Yao
  2020-05-07  6:58 ` [PATCH v3 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Jin Yao @ 2020-05-07  6:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Currently perf-stat supports printing counts at regular intervals (-I),
but it's not very easy for users to get the overall statistics.

With this patchset, perf-stat reports a summary at the end of the
interval output.

For example,

 root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
 #           time             counts unit events
      1.000412064          2,281,114      cycles
      2.001383658          2,547,880      cycles

  Performance counter stats for 'system wide':

          4,828,994      cycles

        2.002860349 seconds time elapsed

 root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
 #           time             counts unit events
      1.000389902          1,536,093      cycles
      1.000389902            420,226      instructions              #    0.27  insn per cycle
      2.001433453          2,213,952      cycles
      2.001433453            735,465      instructions              #    0.33  insn per cycle

  Performance counter stats for 'system wide':

          3,750,045      cycles
          1,155,691      instructions              #    0.31  insn per cycle

        2.003023361 seconds time elapsed

 root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
 #           time             counts unit events
      1.000435121            905,303      inst_retired.any          #      2.9 CPI
      1.000435121          2,663,333      cycles
      1.000435121            914,702      inst_retired.any          #      0.3 IPC
      1.000435121          2,676,559      cpu_clk_unhalted.thread
      2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
      2.001615941          3,551,357      cycles
      2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
      2.001615941          3,551,044      cpu_clk_unhalted.thread

  Performance counter stats for 'system wide':

          2,856,395      inst_retired.any          #      2.2 CPI
          6,214,690      cycles
          2,865,539      inst_retired.any          #      0.5 IPC
          6,227,603      cpu_clk_unhalted.thread

        2.003403078 seconds time elapsed

 v3:
 ---
 1. 'perf stat: Fix wrong per-thread runtime stat for interval mode'
    is a new patch which fixes an existing issue found during testing.

 2. Use prev_raw_counts for the summary counts; the summary_counts from
    v2 is dropped.

 3. Fix some issues.

 v2:
 ---
 Rebase to perf/core branch

Jin Yao (4):
  perf stat: Fix wrong per-thread runtime stat for interval mode
  perf counts: Reset prev_raw_counts counts
  perf stat: Copy counts from prev_raw_counts to evsel->counts
  perf stat: Report summary for interval mode

 tools/perf/builtin-stat.c | 27 +++++++++++++++++++++++++--
 tools/perf/util/counts.c  |  5 +++++
 tools/perf/util/counts.h  |  2 ++
 tools/perf/util/evsel.c   |  1 +
 tools/perf/util/stat.c    | 27 ++++++++++++++++++++++++++-
 tools/perf/util/stat.h    |  2 ++
 6 files changed, 61 insertions(+), 3 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/4] perf stat: Fix wrong per-thread runtime stat for interval mode
  2020-05-07  6:58 [PATCH v3 0/4] perf stat: Support overall statistics for interval mode Jin Yao
@ 2020-05-07  6:58 ` Jin Yao
  2020-05-07 15:19   ` Jiri Olsa
  2020-05-07  6:58 ` [PATCH v3 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Jin Yao @ 2020-05-07  6:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

root@kbl-ppc:~# perf stat --per-thread -e cycles,instructions -I1000 --interval-count 2
     1.004171683             perf-3696              8,747,311      cycles
	...
     1.004171683             perf-3696                691,730      instructions              #    0.08  insn per cycle
	...
     2.006490373             perf-3696              1,749,936      cycles
	...
     2.006490373             perf-3696              1,484,582      instructions              #    0.28  insn per cycle
	...

Let's check interval 2.006490373:

perf-3696              1,749,936      cycles
perf-3696              1,484,582      instructions              #    0.28  insn per cycle

insn per cycle should be 1,484,582 / 1,749,936 = 0.85,
but 0.28 is reported, which is not correct.

stat_config.stats[] records the per-thread runtime stats, but for
interval mode they must be reset at the start of each interval;
otherwise the values accumulate across intervals.

Now, with this patch applied:

root@kbl-ppc:~# perf stat --per-thread -e cycles,instructions -I1000 --interval-count 2
     1.005818121             perf-8633              9,898,045      cycles
	...
     1.005818121             perf-8633                693,298      instructions              #    0.07  insn per cycle
	...
     2.007863743             perf-8633              1,551,619      cycles
	...
     2.007863743             perf-8633              1,317,514      instructions              #    0.85  insn per cycle
	...

Let's check interval 2.007863743.

insn per cycle = 1,317,514 / 1,551,619 = 0.85. It's correct.

Fixes: 14e72a21c783 ("perf stat: Update or print per-thread stats")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-stat.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e0c1ad23c768..97ee941649e6 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -351,6 +351,16 @@ static void read_counters(struct timespec *rs)
 	}
 }
 
+static void thread_stats_reset(struct perf_stat_config *config)
+{
+	int i;
+
+	if (config->stats) {
+		for (i = 0; i < config->stats_num; i++)
+			perf_stat__reset_shadow_per_stat(&config->stats[i]);
+	}
+}
+
 static void process_interval(void)
 {
 	struct timespec ts, rs;
@@ -359,6 +369,7 @@ static void process_interval(void)
 	diff_timespec(&rs, &ts, &ref_time);
 
 	perf_stat__reset_shadow_per_stat(&rt_stat);
+	thread_stats_reset(&stat_config);
 	read_counters(&rs);
 
 	if (STAT_RECORD) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/4] perf counts: Reset prev_raw_counts counts
  2020-05-07  6:58 [PATCH v3 0/4] perf stat: Support overall statistics for interval mode Jin Yao
  2020-05-07  6:58 ` [PATCH v3 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
@ 2020-05-07  6:58 ` Jin Yao
  2020-05-07 15:19   ` Jiri Olsa
  2020-05-07  6:58 ` [PATCH v3 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
  2020-05-07  6:58 ` [PATCH v3 4/4] perf stat: Report summary for interval mode Jin Yao
  3 siblings, 1 reply; 13+ messages in thread
From: Jin Yao @ 2020-05-07  6:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

evsel->prev_raw_counts is updated in perf_evsel__compute_deltas():

perf_evsel__compute_deltas()
{
	tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
	*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
}

When we reset evsel->prev_raw_counts in
perf_evsel__reset_prev_raw_counts, zeroing the aggr is not enough:
the per-cpu/thread counts need to be reset too.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/counts.c | 5 +++++
 tools/perf/util/counts.h | 2 ++
 tools/perf/util/stat.c   | 1 +
 3 files changed, 8 insertions(+)

diff --git a/tools/perf/util/counts.c b/tools/perf/util/counts.c
index f94e1a23dad6..af3bf36f7c63 100644
--- a/tools/perf/util/counts.c
+++ b/tools/perf/util/counts.c
@@ -64,3 +64,8 @@ void perf_evsel__free_counts(struct evsel *evsel)
 	perf_counts__delete(evsel->counts);
 	evsel->counts = NULL;
 }
+
+void perf_evsel__reset_raw_counts(struct evsel *evsel)
+{
+	perf_counts__reset(evsel->prev_raw_counts);
+}
diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
index 92196df4945f..15bb9acb7cb0 100644
--- a/tools/perf/util/counts.h
+++ b/tools/perf/util/counts.h
@@ -42,4 +42,6 @@ void perf_evsel__reset_counts(struct evsel *evsel);
 int perf_evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads);
 void perf_evsel__free_counts(struct evsel *evsel);
 
+void perf_evsel__reset_raw_counts(struct evsel *evsel);
+
 #endif /* __PERF_COUNTS_H */
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 774468341851..89e541564ed5 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -168,6 +168,7 @@ static void perf_evsel__reset_prev_raw_counts(struct evsel *evsel)
 		evsel->prev_raw_counts->aggr.val = 0;
 		evsel->prev_raw_counts->aggr.ena = 0;
 		evsel->prev_raw_counts->aggr.run = 0;
+		perf_evsel__reset_raw_counts(evsel);
        }
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
  2020-05-07  6:58 [PATCH v3 0/4] perf stat: Support overall statistics for interval mode Jin Yao
  2020-05-07  6:58 ` [PATCH v3 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
  2020-05-07  6:58 ` [PATCH v3 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
@ 2020-05-07  6:58 ` Jin Yao
  2020-05-07 15:19   ` Jiri Olsa
  2020-05-07  6:58 ` [PATCH v3 4/4] perf stat: Report summary for interval mode Jin Yao
  3 siblings, 1 reply; 13+ messages in thread
From: Jin Yao @ 2020-05-07  6:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It would be useful to support overall statistics for the perf-stat
interval mode, e.g. to report a summary at the end of the "perf-stat -I"
output.

But since perf-stat supports many aggregation modes, such as
--per-thread, --per-socket and -M, we need a solution which doesn't
add much complexity.

The idea is to use 'evsel->prev_raw_counts', which is updated in each
interval and always holds the latest raw counts. Before reporting the
summary, we copy the counts from evsel->prev_raw_counts to
evsel->counts, and from there just follow the non-interval processing
path.

In evsel__compute_deltas, this patch also saves the counts to the
[cpu0,thread0] position for AGGR_GLOBAL. After copying the counts from
evsel->prev_raw_counts to evsel->counts, process_counter_maps in
perf_stat_process_counter then works without modification.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/evsel.c |  1 +
 tools/perf/util/stat.c  | 24 ++++++++++++++++++++++++
 tools/perf/util/stat.h  |  1 +
 3 files changed, 26 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f3e60c45d59a..898a697c7cdd 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1288,6 +1288,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
 	if (cpu == -1) {
 		tmp = evsel->prev_raw_counts->aggr;
 		evsel->prev_raw_counts->aggr = *count;
+		*perf_counts(evsel->prev_raw_counts, 0, 0) = *count;
 	} else {
 		tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
 		*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 89e541564ed5..ede113805ecd 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -230,6 +230,30 @@ void perf_evlist__reset_prev_raw_counts(struct evlist *evlist)
 		perf_evsel__reset_prev_raw_counts(evsel);
 }
 
+static void perf_evsel__copy_prev_raw_counts(struct evsel *evsel)
+{
+	int ncpus = evsel__nr_cpus(evsel);
+	int nthreads = perf_thread_map__nr(evsel->core.threads);
+
+	for (int thread = 0; thread < nthreads; thread++) {
+		for (int cpu = 0; cpu < ncpus; cpu++) {
+			*perf_counts(evsel->counts, cpu, thread) =
+				*perf_counts(evsel->prev_raw_counts, cpu,
+					     thread);
+		}
+	}
+
+	evsel->counts->aggr = evsel->prev_raw_counts->aggr;
+}
+
+void perf_evlist__copy_prev_raw_counts(struct evlist *evlist)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(evlist, evsel)
+		perf_evsel__copy_prev_raw_counts(evsel);
+}
+
 static void zero_per_pkg(struct evsel *counter)
 {
 	if (counter->per_pkg_mask)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index b4fdfaa7f2c0..62cf72c71869 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -198,6 +198,7 @@ int perf_evlist__alloc_stats(struct evlist *evlist, bool alloc_raw);
 void perf_evlist__free_stats(struct evlist *evlist);
 void perf_evlist__reset_stats(struct evlist *evlist);
 void perf_evlist__reset_prev_raw_counts(struct evlist *evlist);
+void perf_evlist__copy_prev_raw_counts(struct evlist *evlist);
 
 int perf_stat_process_counter(struct perf_stat_config *config,
 			      struct evsel *counter);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 4/4] perf stat: Report summary for interval mode
  2020-05-07  6:58 [PATCH v3 0/4] perf stat: Support overall statistics for interval mode Jin Yao
                   ` (2 preceding siblings ...)
  2020-05-07  6:58 ` [PATCH v3 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
@ 2020-05-07  6:58 ` Jin Yao
  2020-05-07 15:18   ` Jiri Olsa
  3 siblings, 1 reply; 13+ messages in thread
From: Jin Yao @ 2020-05-07  6:58 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Currently perf-stat supports printing counts at regular intervals (-I),
but it's not very easy for users to get the overall statistics.

This patch uses 'evsel->prev_raw_counts' to get the counts for the
summary. It copies the counts to 'evsel->counts' after printing the
interval results, and from there just follows the non-interval
processing path.

Let's see some examples,

 root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
 #           time             counts unit events
      1.000412064          2,281,114      cycles
      2.001383658          2,547,880      cycles

  Performance counter stats for 'system wide':

          4,828,994      cycles

        2.002860349 seconds time elapsed

 root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
 #           time             counts unit events
      1.000389902          1,536,093      cycles
      1.000389902            420,226      instructions              #    0.27  insn per cycle
      2.001433453          2,213,952      cycles
      2.001433453            735,465      instructions              #    0.33  insn per cycle

  Performance counter stats for 'system wide':

          3,750,045      cycles
          1,155,691      instructions              #    0.31  insn per cycle

        2.003023361 seconds time elapsed

 root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
 #           time             counts unit events
      1.000435121            905,303      inst_retired.any          #      2.9 CPI
      1.000435121          2,663,333      cycles
      1.000435121            914,702      inst_retired.any          #      0.3 IPC
      1.000435121          2,676,559      cpu_clk_unhalted.thread
      2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
      2.001615941          3,551,357      cycles
      2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
      2.001615941          3,551,044      cpu_clk_unhalted.thread

  Performance counter stats for 'system wide':

          2,856,395      inst_retired.any          #      2.2 CPI
          6,214,690      cycles
          2,865,539      inst_retired.any          #      0.5 IPC
          6,227,603      cpu_clk_unhalted.thread

        2.003403078 seconds time elapsed

 v3:
 ---
 Use evsel->prev_raw_counts for summary counts

 v2:
 ---
 Rebase to perf/core branch

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-stat.c | 16 ++++++++++++++--
 tools/perf/util/stat.c    |  2 +-
 tools/perf/util/stat.h    |  1 +
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 97ee941649e6..f67d6a50274e 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -334,7 +334,7 @@ static void read_counters(struct timespec *rs)
 		evlist__for_each_entry(evsel_list, counter) {
 			if (evsel__cpu_iter_skip(counter, cpu))
 				continue;
-			if (!counter->err) {
+			if (!counter->err && !stat_config.summary) {
 				counter->err = read_counter_cpu(counter, rs,
 								counter->cpu_iter - 1);
 			}
@@ -364,6 +364,7 @@ static void thread_stats_reset(struct perf_stat_config *config)
 static void process_interval(void)
 {
 	struct timespec ts, rs;
+	struct stats walltime_nsecs_stats_bak;
 
 	clock_gettime(CLOCK_MONOTONIC, &ts);
 	diff_timespec(&rs, &ts, &ref_time);
@@ -377,9 +378,11 @@ static void process_interval(void)
 			pr_err("failed to write stat round event\n");
 	}
 
+	walltime_nsecs_stats_bak = walltime_nsecs_stats;
 	init_stats(&walltime_nsecs_stats);
 	update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000);
 	print_counters(&rs, 0, NULL);
+	walltime_nsecs_stats = walltime_nsecs_stats_bak;
 }
 
 static void enable_counters(void)
@@ -735,6 +738,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 
 	update_stats(&walltime_nsecs_stats, t1 - t0);
 
+	if (interval) {
+		stat_config.interval = 0;
+		stat_config.summary = true;
+		perf_evlist__copy_prev_raw_counts(evsel_list);
+		perf_evlist__reset_prev_raw_counts(evsel_list);
+		thread_stats_reset(&stat_config);
+		perf_stat__reset_shadow_per_stat(&rt_stat);
+	}
+
 	/*
 	 * Closing a group leader splits the group, and as we only disable
 	 * group leaders, results in remaining events becoming enabled. To
@@ -2158,7 +2170,7 @@ int cmd_stat(int argc, const char **argv)
 		}
 	}
 
-	if (!forever && status != -1 && !interval)
+	if (!forever && status != -1 && (!interval || stat_config.summary))
 		print_counters(NULL, argc, argv);
 
 	if (STAT_RECORD) {
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ede113805ecd..47b26c024830 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -393,7 +393,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 	 * interval mode, otherwise overall avg running
 	 * averages will be shown for each interval.
 	 */
-	if (config->interval) {
+	if (config->interval || config->summary) {
 		for (i = 0; i < 3; i++)
 			init_stats(&ps->res_stats[i]);
 	}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 62cf72c71869..c60e9e5d6474 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -110,6 +110,7 @@ struct perf_stat_config {
 	bool			 all_kernel;
 	bool			 all_user;
 	bool			 percore_show_thread;
+	bool			 summary;
 	FILE			*output;
 	unsigned int		 interval;
 	unsigned int		 timeout;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 4/4] perf stat: Report summary for interval mode
  2020-05-07  6:58 ` [PATCH v3 4/4] perf stat: Report summary for interval mode Jin Yao
@ 2020-05-07 15:18   ` Jiri Olsa
  2020-05-08  1:11     ` Jin, Yao
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-05-07 15:18 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Thu, May 07, 2020 at 02:58:22PM +0800, Jin Yao wrote:
> Currently perf-stat supports printing counts at regular intervals (-I),
> but it's not very easy for users to get the overall statistics.
> 
> This patch uses 'evsel->prev_raw_counts' to get the counts for the
> summary. It copies the counts to 'evsel->counts' after printing the
> interval results, and from there just follows the non-interval
> processing path.
> 
> Let's see some examples,
> 
>  root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
>  #           time             counts unit events
>       1.000412064          2,281,114      cycles
>       2.001383658          2,547,880      cycles
> 
>   Performance counter stats for 'system wide':
> 
>           4,828,994      cycles
> 
>         2.002860349 seconds time elapsed
> 
>  root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
>  #           time             counts unit events
>       1.000389902          1,536,093      cycles
>       1.000389902            420,226      instructions              #    0.27  insn per cycle
>       2.001433453          2,213,952      cycles
>       2.001433453            735,465      instructions              #    0.33  insn per cycle
> 
>   Performance counter stats for 'system wide':
> 
>           3,750,045      cycles
>           1,155,691      instructions              #    0.31  insn per cycle
> 
>         2.003023361 seconds time elapsed
> 
>  root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
>  #           time             counts unit events
>       1.000435121            905,303      inst_retired.any          #      2.9 CPI
>       1.000435121          2,663,333      cycles
>       1.000435121            914,702      inst_retired.any          #      0.3 IPC
>       1.000435121          2,676,559      cpu_clk_unhalted.thread
>       2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
>       2.001615941          3,551,357      cycles
>       2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
>       2.001615941          3,551,044      cpu_clk_unhalted.thread
> 
>   Performance counter stats for 'system wide':
> 
>           2,856,395      inst_retired.any          #      2.2 CPI
>           6,214,690      cycles
>           2,865,539      inst_retired.any          #      0.5 IPC
>           6,227,603      cpu_clk_unhalted.thread
> 
>         2.003403078 seconds time elapsed
> 
>  v3:
>  ---
>  Use evsel->prev_raw_counts for summary counts
> 
>  v2:
>  ---
>  Rebase to perf/core branch
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/builtin-stat.c | 16 ++++++++++++++--
>  tools/perf/util/stat.c    |  2 +-
>  tools/perf/util/stat.h    |  1 +
>  3 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 97ee941649e6..f67d6a50274e 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -334,7 +334,7 @@ static void read_counters(struct timespec *rs)
>  		evlist__for_each_entry(evsel_list, counter) {
>  			if (evsel__cpu_iter_skip(counter, cpu))
>  				continue;
> -			if (!counter->err) {
> +			if (!counter->err && !stat_config.summary) {

you'll go through all the affinity setup and do nothing at the end,
even though you know at the beginning that's the case.. not good

we need to call only the perf_stat_process_counter in summary case

jirka

>  				counter->err = read_counter_cpu(counter, rs,
>  								counter->cpu_iter - 1);
>  			}
> @@ -364,6 +364,7 @@ static void thread_stats_reset(struct perf_stat_config *config)
>  static void process_interval(void)
>  {
>  	struct timespec ts, rs;
> +	struct stats walltime_nsecs_stats_bak;
>  
>  	clock_gettime(CLOCK_MONOTONIC, &ts);
>  	diff_timespec(&rs, &ts, &ref_time);
> @@ -377,9 +378,11 @@ static void process_interval(void)
>  			pr_err("failed to write stat round event\n");
>  	}
>  
> +	walltime_nsecs_stats_bak = walltime_nsecs_stats;
>  	init_stats(&walltime_nsecs_stats);
>  	update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000);
>  	print_counters(&rs, 0, NULL);
> +	walltime_nsecs_stats = walltime_nsecs_stats_bak;
>  }
>  
>  static void enable_counters(void)
> @@ -735,6 +738,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  
>  	update_stats(&walltime_nsecs_stats, t1 - t0);
>  
> +	if (interval) {
> +		stat_config.interval = 0;
> +		stat_config.summary = true;
> +		perf_evlist__copy_prev_raw_counts(evsel_list);
> +		perf_evlist__reset_prev_raw_counts(evsel_list);
> +		thread_stats_reset(&stat_config);
> +		perf_stat__reset_shadow_per_stat(&rt_stat);
> +	}
> +
>  	/*
>  	 * Closing a group leader splits the group, and as we only disable
>  	 * group leaders, results in remaining events becoming enabled. To
> @@ -2158,7 +2170,7 @@ int cmd_stat(int argc, const char **argv)
>  		}
>  	}
>  
> -	if (!forever && status != -1 && !interval)
> +	if (!forever && status != -1 && (!interval || stat_config.summary))
>  		print_counters(NULL, argc, argv);
>  
>  	if (STAT_RECORD) {
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index ede113805ecd..47b26c024830 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -393,7 +393,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
>  	 * interval mode, otherwise overall avg running
>  	 * averages will be shown for each interval.
>  	 */
> -	if (config->interval) {
> +	if (config->interval || config->summary) {
>  		for (i = 0; i < 3; i++)
>  			init_stats(&ps->res_stats[i]);
>  	}
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index 62cf72c71869..c60e9e5d6474 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -110,6 +110,7 @@ struct perf_stat_config {
>  	bool			 all_kernel;
>  	bool			 all_user;
>  	bool			 percore_show_thread;
> +	bool			 summary;
>  	FILE			*output;
>  	unsigned int		 interval;
>  	unsigned int		 timeout;
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/4] perf stat: Fix wrong per-thread runtime stat for interval mode
  2020-05-07  6:58 ` [PATCH v3 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
@ 2020-05-07 15:19   ` Jiri Olsa
  2020-05-08  2:03     ` Jin, Yao
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-05-07 15:19 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Thu, May 07, 2020 at 02:58:19PM +0800, Jin Yao wrote:

SNIP

> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index e0c1ad23c768..97ee941649e6 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -351,6 +351,16 @@ static void read_counters(struct timespec *rs)
>  	}
>  }
>  
> +static void thread_stats_reset(struct perf_stat_config *config)
> +{
> +	int i;
> +
> +	if (config->stats) {
> +		for (i = 0; i < config->stats_num; i++)
> +			perf_stat__reset_shadow_per_stat(&config->stats[i]);
> +	}
> +}
> +
>  static void process_interval(void)
>  {
>  	struct timespec ts, rs;
> @@ -359,6 +369,7 @@ static void process_interval(void)
>  	diff_timespec(&rs, &ts, &ref_time);
>  
>  	perf_stat__reset_shadow_per_stat(&rt_stat);
> +	thread_stats_reset(&stat_config);

can't you call perf_stat__reset_stats in here?

and if not, I know it's threads related, but new
and delete functions are:

  runtime_stat_new, runtime_stat_delete

so let's call it runtime_stat_reset and place it next to
the new/delete functions

other than that it looks ok, thanks

jirka

>  	read_counters(&rs);
>  
>  	if (STAT_RECORD) {
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/4] perf counts: Reset prev_raw_counts counts
  2020-05-07  6:58 ` [PATCH v3 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
@ 2020-05-07 15:19   ` Jiri Olsa
  2020-05-08  2:45     ` Jin, Yao
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-05-07 15:19 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Thu, May 07, 2020 at 02:58:20PM +0800, Jin Yao wrote:
> The evsel->prev_raw_counts is updated in perf_evsel__compute_deltas:
> 
> perf_evsel__compute_deltas()
> {
> 	tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
> 	*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
> }
> 
> When we reset evsel->prev_raw_counts in
> perf_evsel__reset_prev_raw_counts, zeroing the aggr is not enough:
> the per-cpu/thread counts need to be reset too.
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/util/counts.c | 5 +++++
>  tools/perf/util/counts.h | 2 ++
>  tools/perf/util/stat.c   | 1 +
>  3 files changed, 8 insertions(+)
> 
> diff --git a/tools/perf/util/counts.c b/tools/perf/util/counts.c
> index f94e1a23dad6..af3bf36f7c63 100644
> --- a/tools/perf/util/counts.c
> +++ b/tools/perf/util/counts.c
> @@ -64,3 +64,8 @@ void perf_evsel__free_counts(struct evsel *evsel)
>  	perf_counts__delete(evsel->counts);
>  	evsel->counts = NULL;
>  }
> +
> +void perf_evsel__reset_raw_counts(struct evsel *evsel)
> +{
> +	perf_counts__reset(evsel->prev_raw_counts);
> +}
> diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
> index 92196df4945f..15bb9acb7cb0 100644
> --- a/tools/perf/util/counts.h
> +++ b/tools/perf/util/counts.h
> @@ -42,4 +42,6 @@ void perf_evsel__reset_counts(struct evsel *evsel);
>  int perf_evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads);
>  void perf_evsel__free_counts(struct evsel *evsel);
>  
> +void perf_evsel__reset_raw_counts(struct evsel *evsel);
> +
>  #endif /* __PERF_COUNTS_H */
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 774468341851..89e541564ed5 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -168,6 +168,7 @@ static void perf_evsel__reset_prev_raw_counts(struct evsel *evsel)
>  		evsel->prev_raw_counts->aggr.val = 0;
>  		evsel->prev_raw_counts->aggr.ena = 0;
>  		evsel->prev_raw_counts->aggr.run = 0;
> +		perf_evsel__reset_raw_counts(evsel);

that seems needed, but we have it scattered all over the place,
could you centralize the reset? the way I see it the perf_counts__reset
should zero all the members of struct perf_counts.. so also
the aggr values

it could also check for counts != NULL and we could call it
instead of:
  perf_evsel__reset_prev_raw_counts
    perf_evsel__reset_raw_counts
      perf_counts__reset

jirka


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
  2020-05-07  6:58 ` [PATCH v3 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
@ 2020-05-07 15:19   ` Jiri Olsa
  2020-05-08  3:34     ` Jin, Yao
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-05-07 15:19 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Thu, May 07, 2020 at 02:58:21PM +0800, Jin Yao wrote:
> It would be useful to support overall statistics for the perf-stat
> interval mode, e.g. to report a summary at the end of the "perf-stat -I"
> output.
> 
> But since perf-stat supports many aggregation modes, such as
> --per-thread, --per-socket and -M, we need a solution which doesn't
> add much complexity.
> 
> The idea is to use 'evsel->prev_raw_counts', which is updated in each
> interval and always holds the latest raw counts. Before reporting the
> summary, we copy the counts from evsel->prev_raw_counts to
> evsel->counts, and from there just follow the non-interval processing
> path.

I did not realize we already store the count in prev_raw_counts ;-) nice catch!

> 
> In evsel__compute_deltas, this patch also saves the counts to the
> [cpu0,thread0] position for AGGR_GLOBAL. After copying the counts from
> evsel->prev_raw_counts to evsel->counts, process_counter_maps in
> perf_stat_process_counter then works without modification.

I don't understand why you need to store it in here.. what's the catch
in process_counter_maps?

thanks,
jirka



* Re: [PATCH v3 4/4] perf stat: Report summary for interval mode
  2020-05-07 15:18   ` Jiri Olsa
@ 2020-05-08  1:11     ` Jin, Yao
  0 siblings, 0 replies; 13+ messages in thread
From: Jin, Yao @ 2020-05-08  1:11 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Jiri,

On 5/7/2020 11:18 PM, Jiri Olsa wrote:
> On Thu, May 07, 2020 at 02:58:22PM +0800, Jin Yao wrote:
>> Currently perf-stat supports printing counts at regular intervals (-I),
>> but it's not easy for the user to get the overall statistics.
>>
>> The patch uses 'evsel->prev_raw_counts' to get the counts for the
>> summary. It copies the counts to 'evsel->counts' after printing the
>> interval results, and then just follows the non-interval processing.
>>
>> Let's see some examples,
>>
>>   root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
>>   #           time             counts unit events
>>        1.000412064          2,281,114      cycles
>>        2.001383658          2,547,880      cycles
>>
>>    Performance counter stats for 'system wide':
>>
>>            4,828,994      cycles
>>
>>          2.002860349 seconds time elapsed
>>
>>   root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
>>   #           time             counts unit events
>>        1.000389902          1,536,093      cycles
>>        1.000389902            420,226      instructions              #    0.27  insn per cycle
>>        2.001433453          2,213,952      cycles
>>        2.001433453            735,465      instructions              #    0.33  insn per cycle
>>
>>    Performance counter stats for 'system wide':
>>
>>            3,750,045      cycles
>>            1,155,691      instructions              #    0.31  insn per cycle
>>
>>          2.003023361 seconds time elapsed
>>
>>   root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
>>   #           time             counts unit events
>>        1.000435121            905,303      inst_retired.any          #      2.9 CPI
>>        1.000435121          2,663,333      cycles
>>        1.000435121            914,702      inst_retired.any          #      0.3 IPC
>>        1.000435121          2,676,559      cpu_clk_unhalted.thread
>>        2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
>>        2.001615941          3,551,357      cycles
>>        2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
>>        2.001615941          3,551,044      cpu_clk_unhalted.thread
>>
>>    Performance counter stats for 'system wide':
>>
>>            2,856,395      inst_retired.any          #      2.2 CPI
>>            6,214,690      cycles
>>            2,865,539      inst_retired.any          #      0.5 IPC
>>            6,227,603      cpu_clk_unhalted.thread
>>
>>          2.003403078 seconds time elapsed
>>
>>   v3:
>>   ---
>>   Use evsel->prev_raw_counts for summary counts
>>
>>   v2:
>>   ---
>>   Rebase to perf/core branch
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/builtin-stat.c | 16 ++++++++++++++--
>>   tools/perf/util/stat.c    |  2 +-
>>   tools/perf/util/stat.h    |  1 +
>>   3 files changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 97ee941649e6..f67d6a50274e 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -334,7 +334,7 @@ static void read_counters(struct timespec *rs)
>>   		evlist__for_each_entry(evsel_list, counter) {
>>   			if (evsel__cpu_iter_skip(counter, cpu))
>>   				continue;
>> -			if (!counter->err) {
>> +			if (!counter->err && !stat_config.summary) {
> 
> you'll go through all the affinity setup and do nothing at the end,
> even if you know at the beginning that it's the case.. not good
> 
> we need to call only the perf_stat_process_counter in summary case
> 
> jirka
> 

Both the interval processing and the non-interval processing need to call 
read_counters.

I need to prevent calling read_counter_cpu in read_counters for the 
non-interval pass, otherwise the summary counts will be bigger than the sum 
of the interval counts.

e.g.
interval: read_counter_cpu
interval: read_counter_cpu
interval: read_counter_cpu
interval mode is over,
non-interval: read_counter_cpu /* prevent this time read_counter_cpu */

Since read_counter_cpu is called before perf_stat_process_counter, I can't 
do this in perf_stat_process_counter.

Yes, checking stat_config.summary inside the loops is not good; it's 
inefficient.

Maybe we can move the affinity setup and read_counter_cpu code into a new 
function, such as affinity_read_counters, and change read_counters to:

void read_counters(void)
{
	if (!stat_config.summary)
		affinity_read_counters();

	evlist__for_each_entry(evsel_list, counter) {
		perf_stat_process_counter(&stat_config, counter);
	}
}

Thanks
Jin Yao

>>   				counter->err = read_counter_cpu(counter, rs,
>>   								counter->cpu_iter - 1);
>>   			}
>> @@ -364,6 +364,7 @@ static void thread_stats_reset(struct perf_stat_config *config)
>>   static void process_interval(void)
>>   {
>>   	struct timespec ts, rs;
>> +	struct stats walltime_nsecs_stats_bak;
>>   
>>   	clock_gettime(CLOCK_MONOTONIC, &ts);
>>   	diff_timespec(&rs, &ts, &ref_time);
>> @@ -377,9 +378,11 @@ static void process_interval(void)
>>   			pr_err("failed to write stat round event\n");
>>   	}
>>   
>> +	walltime_nsecs_stats_bak = walltime_nsecs_stats;
>>   	init_stats(&walltime_nsecs_stats);
>>   	update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000);
>>   	print_counters(&rs, 0, NULL);
>> +	walltime_nsecs_stats = walltime_nsecs_stats_bak;
>>   }
>>   
>>   static void enable_counters(void)
>> @@ -735,6 +738,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>>   
>>   	update_stats(&walltime_nsecs_stats, t1 - t0);
>>   
>> +	if (interval) {
>> +		stat_config.interval = 0;
>> +		stat_config.summary = true;
>> +		perf_evlist__copy_prev_raw_counts(evsel_list);
>> +		perf_evlist__reset_prev_raw_counts(evsel_list);
>> +		thread_stats_reset(&stat_config);
>> +		perf_stat__reset_shadow_per_stat(&rt_stat);
>> +	}
>> +
>>   	/*
>>   	 * Closing a group leader splits the group, and as we only disable
>>   	 * group leaders, results in remaining events becoming enabled. To
>> @@ -2158,7 +2170,7 @@ int cmd_stat(int argc, const char **argv)
>>   		}
>>   	}
>>   
>> -	if (!forever && status != -1 && !interval)
>> +	if (!forever && status != -1 && (!interval || stat_config.summary))
>>   		print_counters(NULL, argc, argv);
>>   
>>   	if (STAT_RECORD) {
>> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
>> index ede113805ecd..47b26c024830 100644
>> --- a/tools/perf/util/stat.c
>> +++ b/tools/perf/util/stat.c
>> @@ -393,7 +393,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
>>   	 * interval mode, otherwise overall avg running
>>   	 * averages will be shown for each interval.
>>   	 */
>> -	if (config->interval) {
>> +	if (config->interval || config->summary) {
>>   		for (i = 0; i < 3; i++)
>>   			init_stats(&ps->res_stats[i]);
>>   	}
>> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
>> index 62cf72c71869..c60e9e5d6474 100644
>> --- a/tools/perf/util/stat.h
>> +++ b/tools/perf/util/stat.h
>> @@ -110,6 +110,7 @@ struct perf_stat_config {
>>   	bool			 all_kernel;
>>   	bool			 all_user;
>>   	bool			 percore_show_thread;
>> +	bool			 summary;
>>   	FILE			*output;
>>   	unsigned int		 interval;
>>   	unsigned int		 timeout;
>> -- 
>> 2.17.1
>>
> 


* Re: [PATCH v3 1/4] perf stat: Fix wrong per-thread runtime stat for interval mode
  2020-05-07 15:19   ` Jiri Olsa
@ 2020-05-08  2:03     ` Jin, Yao
  0 siblings, 0 replies; 13+ messages in thread
From: Jin, Yao @ 2020-05-08  2:03 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Jiri,

On 5/7/2020 11:19 PM, Jiri Olsa wrote:
> On Thu, May 07, 2020 at 02:58:19PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index e0c1ad23c768..97ee941649e6 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -351,6 +351,16 @@ static void read_counters(struct timespec *rs)
>>   	}
>>   }
>>   
>> +static void thread_stats_reset(struct perf_stat_config *config)
>> +{
>> +	int i;
>> +
>> +	if (config->stats) {
>> +		for (i = 0; i < config->stats_num; i++)
>> +			perf_stat__reset_shadow_per_stat(&config->stats[i]);
>> +	}
>> +}
>> +
>>   static void process_interval(void)
>>   {
>>   	struct timespec ts, rs;
>> @@ -359,6 +369,7 @@ static void process_interval(void)
>>   	diff_timespec(&rs, &ts, &ref_time);
>>   
>>   	perf_stat__reset_shadow_per_stat(&rt_stat);
>> +	thread_stats_reset(&stat_config);
> 
> can't you call in here perf_stat__reset_stats?
> 

If we call perf_stat__reset_stats here, it will reset evsel->counts, but I 
don't think that's necessary. The counts will be updated in read_counters() 
soon anyway.

> and if not, I know it's threads related, but new
> and delete functions are:
> 
>    runtime_stat_new, runtime_stat_delete
> 
> so let's call it runtime_stat_reset and place it next to
> the new/delete functions
>

Yes, that's a good idea. I will create runtime_stat_reset and place it next to
runtime_stat_new/runtime_stat_delete.

> other than that it looks ok, thanks
> 
> jirka
> 

Thanks!

Thanks
Jin Yao

>>   	read_counters(&rs);
>>   
>>   	if (STAT_RECORD) {
>> -- 
>> 2.17.1
>>
> 


* Re: [PATCH v3 2/4] perf counts: Reset prev_raw_counts counts
  2020-05-07 15:19   ` Jiri Olsa
@ 2020-05-08  2:45     ` Jin, Yao
  0 siblings, 0 replies; 13+ messages in thread
From: Jin, Yao @ 2020-05-08  2:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Jiri,

On 5/7/2020 11:19 PM, Jiri Olsa wrote:
> On Thu, May 07, 2020 at 02:58:20PM +0800, Jin Yao wrote:
>> The evsel->prev_raw_counts is updated in perf_evsel__compute_deltas:
>>
>> perf_evsel__compute_deltas()
>> {
>> 	tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
>> 	*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
>> }
>>
>> When we want to reset the evsel->prev_raw_counts in
>> perf_evsel__reset_prev_raw_counts, zeroing the aggr is not enough,
>> we need to reset the counts too.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/util/counts.c | 5 +++++
>>   tools/perf/util/counts.h | 2 ++
>>   tools/perf/util/stat.c   | 1 +
>>   3 files changed, 8 insertions(+)
>>
>> diff --git a/tools/perf/util/counts.c b/tools/perf/util/counts.c
>> index f94e1a23dad6..af3bf36f7c63 100644
>> --- a/tools/perf/util/counts.c
>> +++ b/tools/perf/util/counts.c
>> @@ -64,3 +64,8 @@ void perf_evsel__free_counts(struct evsel *evsel)
>>   	perf_counts__delete(evsel->counts);
>>   	evsel->counts = NULL;
>>   }
>> +
>> +void perf_evsel__reset_raw_counts(struct evsel *evsel)
>> +{
>> +	perf_counts__reset(evsel->prev_raw_counts);
>> +}
>> diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
>> index 92196df4945f..15bb9acb7cb0 100644
>> --- a/tools/perf/util/counts.h
>> +++ b/tools/perf/util/counts.h
>> @@ -42,4 +42,6 @@ void perf_evsel__reset_counts(struct evsel *evsel);
>>   int perf_evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads);
>>   void perf_evsel__free_counts(struct evsel *evsel);
>>   
>> +void perf_evsel__reset_raw_counts(struct evsel *evsel);
>> +
>>   #endif /* __PERF_COUNTS_H */
>> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
>> index 774468341851..89e541564ed5 100644
>> --- a/tools/perf/util/stat.c
>> +++ b/tools/perf/util/stat.c
>> @@ -168,6 +168,7 @@ static void perf_evsel__reset_prev_raw_counts(struct evsel *evsel)
>>   		evsel->prev_raw_counts->aggr.val = 0;
>>   		evsel->prev_raw_counts->aggr.ena = 0;
>>   		evsel->prev_raw_counts->aggr.run = 0;
>> +		perf_evsel__reset_raw_counts(evsel);
> 
> that seems needed, but we have it scattered all over the place,
> could you centralize the reset? the way I see it the perf_counts__reset
> should zero all the members of struct perf_counts.. so also
> the aggr values
> 
> it could also check for counts != NULL and we could call it
> instead of:
>    perf_evsel__reset_prev_raw_counts
>      perf_evsel__reset_raw_counts
>        perf_counts__reset
> 
> jirka
> 

Sometimes we may want to reset only evsel->counts, and sometimes just 
evsel->prev_raw_counts. So maybe we can improve perf_evsel__reset_counts to:

void perf_evsel__reset_counts(struct evsel *evsel, bool counts,
			      bool prev_raw_counts)
{
	if (counts)
		perf_counts__reset(evsel->counts);

	if (prev_raw_counts && evsel->prev_raw_counts)
		perf_counts__reset(evsel->prev_raw_counts);
}

static void perf_counts__reset(struct perf_counts *counts)
{
	counts->aggr.val = 0;
	counts->aggr.ena = 0;
	counts->aggr.run = 0;
	xyarray__reset(counts->loaded);
	xyarray__reset(counts->values);
}

So we will only keep perf_evsel__reset_counts and remove 
perf_evsel__reset_prev_raw_counts.

Thanks
Jin Yao


* Re: [PATCH v3 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
  2020-05-07 15:19   ` Jiri Olsa
@ 2020-05-08  3:34     ` Jin, Yao
  0 siblings, 0 replies; 13+ messages in thread
From: Jin, Yao @ 2020-05-08  3:34 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Jiri,

On 5/7/2020 11:19 PM, Jiri Olsa wrote:
> On Thu, May 07, 2020 at 02:58:21PM +0800, Jin Yao wrote:
>> It would be useful to support the overall statistics for perf-stat
>> interval mode. For example, report the summary at the end of
>> "perf-stat -I" output.
>>
>> But since perf-stat supports many aggregation modes, such as
>> --per-thread, --per-socket, -M, etc., we need a solution that
>> doesn't add much complexity.
>>
>> The idea is to use 'evsel->prev_raw_counts', which is updated in
>> each interval and holds the latest counts. Before reporting the
>> summary, we copy the counts from evsel->prev_raw_counts to
>> evsel->counts, and then just follow the non-interval processing.
> 
> I did not realize we already store the count in prev_raw_counts ;-) nice catch!
> 

Thanks! :)

>>
>> For AGGR_GLOBAL, this patch makes evsel__compute_deltas save the
>> counts to the [cpu0,thread0] position. After copying counts from
>> evsel->prev_raw_counts to evsel->counts, process_counter_maps in
>> perf_stat_process_counter then works without modification.
> 
> I don't understand why you need to store it in here.. what's the catch
> in process_counter_maps?
> 

Sorry, I didn't explain this clearly.

The idea is to copy evsel->prev_raw_counts to evsel->counts before 
reporting the summary.

But for AGGR_GLOBAL (cpu == -1 in perf_evsel__compute_deltas), 
evsel->prev_raw_counts only stores the aggr value:

if (cpu == -1) {
	tmp = evsel->prev_raw_counts->aggr;
	evsel->prev_raw_counts->aggr = *count;
} else {
	tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
	*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
}

So after copying evsel->prev_raw_counts to evsel->counts, the 
perf_counts(evsel->counts, cpu, thread) values are all 0.

When we go through process_counter_maps again, count->val in 
process_counter_values is 0:

case AGGR_GLOBAL:
	aggr->val += count->val;
	aggr->ena += count->ena;
	aggr->run += count->run;

And the aggr->val is 0.

So this patch uses a trick: it saves the previous aggr value to the 
cpu0/thread0 position, so that the aggr->val calculation above works 
correctly.

Thanks
Jin Yao

> thanks,
> jirka
> 



Thread overview: 13+ messages
2020-05-07  6:58 [PATCH v3 0/4] perf stat: Support overall statistics for interval mode Jin Yao
2020-05-07  6:58 ` [PATCH v3 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
2020-05-07 15:19   ` Jiri Olsa
2020-05-08  2:03     ` Jin, Yao
2020-05-07  6:58 ` [PATCH v3 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
2020-05-07 15:19   ` Jiri Olsa
2020-05-08  2:45     ` Jin, Yao
2020-05-07  6:58 ` [PATCH v3 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
2020-05-07 15:19   ` Jiri Olsa
2020-05-08  3:34     ` Jin, Yao
2020-05-07  6:58 ` [PATCH v3 4/4] perf stat: Report summary for interval mode Jin Yao
2020-05-07 15:18   ` Jiri Olsa
2020-05-08  1:11     ` Jin, Yao
