* [PATCH v4 0/4] perf stat: Support overall statistics for interval mode
@ 2020-05-08 7:58 Jin Yao
2020-05-08 7:58 ` [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Jin Yao @ 2020-05-08 7:58 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
Currently perf-stat can print counts at a regular interval (-I),
but it is not easy for the user to get the overall statistics.
With this patchset, perf-stat reports a summary at the end of the
interval output.
For example,
root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
# time counts unit events
1.000412064 2,281,114 cycles
2.001383658 2,547,880 cycles
Performance counter stats for 'system wide':
4,828,994 cycles
2.002860349 seconds time elapsed
root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
# time counts unit events
1.000389902 1,536,093 cycles
1.000389902 420,226 instructions # 0.27 insn per cycle
2.001433453 2,213,952 cycles
2.001433453 735,465 instructions # 0.33 insn per cycle
Performance counter stats for 'system wide':
3,750,045 cycles
1,155,691 instructions # 0.31 insn per cycle
2.003023361 seconds time elapsed
root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
# time counts unit events
1.000435121 905,303 inst_retired.any # 2.9 CPI
1.000435121 2,663,333 cycles
1.000435121 914,702 inst_retired.any # 0.3 IPC
1.000435121 2,676,559 cpu_clk_unhalted.thread
2.001615941 1,951,092 inst_retired.any # 1.8 CPI
2.001615941 3,551,357 cycles
2.001615941 1,950,837 inst_retired.any # 0.5 IPC
2.001615941 3,551,044 cpu_clk_unhalted.thread
Performance counter stats for 'system wide':
2,856,395 inst_retired.any # 2.2 CPI
6,214,690 cycles
2,865,539 inst_retired.any # 0.5 IPC
6,227,603 cpu_clk_unhalted.thread
2.003403078 seconds time elapsed
v4:
---
1. Create runtime_stat_reset.
2. Zero the aggr in perf_counts__reset and use it to reset
prev_raw_counts.
3. Move affinity setup and read_counter_cpu to a new function
read_affinity_counters. It's only called when stat_config.summary
is not set.
v3:
---
1. 'perf stat: Fix wrong per-thread runtime stat for interval mode'
is a new patch which fixes an existing issue found in test.
2. We use the prev_raw_counts for summary counts. Drop the summary_counts in v2.
3. Fix some issues.
v2:
---
Rebase to perf/core branch
Jin Yao (4):
perf stat: Fix wrong per-thread runtime stat for interval mode
perf counts: Reset prev_raw_counts counts
perf stat: Copy counts from prev_raw_counts to evsel->counts
perf stat: Report summary for interval mode
tools/perf/builtin-stat.c | 97 ++++++++++++++++++++++++++-------------
tools/perf/util/counts.c | 4 +-
tools/perf/util/counts.h | 1 +
tools/perf/util/evsel.c | 1 +
tools/perf/util/stat.c | 33 ++++++++++---
tools/perf/util/stat.h | 2 +
6 files changed, 99 insertions(+), 39 deletions(-)
--
2.17.1
* [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat for interval mode
2020-05-08 7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
@ 2020-05-08 7:58 ` Jin Yao
2020-05-08 7:58 ` [PATCH v4 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Jin Yao @ 2020-05-08 7:58 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
root@kbl-ppc:~# perf stat --per-thread -e cycles,instructions -I1000 --interval-count 2
1.004171683 perf-3696 8,747,311 cycles
...
1.004171683 perf-3696 691,730 instructions # 0.08 insn per cycle
...
2.006490373 perf-3696 1,749,936 cycles
...
2.006490373 perf-3696 1,484,582 instructions # 0.28 insn per cycle
...
Let's see interval 2.006490373
perf-3696 1,749,936 cycles
perf-3696 1,484,582 instructions # 0.28 insn per cycle
insn per cycle should be 1,484,582 / 1,749,936 = 0.85,
but 0.28 is reported, which is not correct.
stat_config.stats[] records the per-thread runtime stat. In interval
mode it must be reset for each interval, which is not done today.
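To illustrate the failure mode, here is a minimal standalone sketch (not
perf code). It assumes the stale per-thread stat behaves like a running
mean of the cycles seen so far; struct running_mean and all helper names
are hypothetical. Under that assumption the numbers above are reproduced:
the mean of 8,747,311 and 1,749,936 cycles is 5,248,623.5, and
1,484,582 / 5,248,623.5 is about 0.28.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one per-thread shadow stat: a running mean. */
struct running_mean {
	uint64_t n;
	double mean;
};

static void running_mean_reset(struct running_mean *s)
{
	s->n = 0;
	s->mean = 0.0;
}

static void running_mean_update(struct running_mean *s, uint64_t val)
{
	s->n++;
	s->mean += ((double)val - s->mean) / (double)s->n;
}

/*
 * Feed the per-interval cycle counts into the stat and return the
 * insn-per-cycle ratio that would be printed for the last interval.
 * With reset_each_interval == 0 the stat carries over stale data.
 */
static double last_interval_ipc(const uint64_t *cycles, const uint64_t *insns,
				int nr_intervals, int reset_each_interval)
{
	struct running_mean cyc;
	int i;

	assert(nr_intervals > 0);
	running_mean_reset(&cyc);

	for (i = 0; i < nr_intervals; i++) {
		if (reset_each_interval)
			running_mean_reset(&cyc);
		running_mean_update(&cyc, cycles[i]);
	}

	return (double)insns[nr_intervals - 1] / cyc.mean;
}
```

Called with the two intervals shown above, the sketch gives roughly 0.28
without the per-interval reset and roughly 0.85 with it.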
So now, with this patch,
root@kbl-ppc:~# perf stat --per-thread -e cycles,instructions -I1000 --interval-count 2
1.005818121 perf-8633 9,898,045 cycles
...
1.005818121 perf-8633 693,298 instructions # 0.07 insn per cycle
...
2.007863743 perf-8633 1,551,619 cycles
...
2.007863743 perf-8633 1,317,514 instructions # 0.85 insn per cycle
...
Let's check interval 2.007863743.
insn per cycle = 1,317,514 / 1,551,619 = 0.85. It's correct.
This patch creates runtime_stat_reset, places it next to
runtime_stat_new/runtime_stat_delete, and moves all runtime_stat
functions before process_interval.
v4:
---
Create runtime_stat_reset.
Fixes: 14e72a21c783 ("perf stat: Update or print per-thread stats")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/builtin-stat.c | 70 +++++++++++++++++++++++----------------
1 file changed, 41 insertions(+), 29 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e0c1ad23c768..f3b3a59ac7d2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -351,6 +351,46 @@ static void read_counters(struct timespec *rs)
}
}
+static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
+{
+ int i;
+
+ config->stats = calloc(nthreads, sizeof(struct runtime_stat));
+ if (!config->stats)
+ return -1;
+
+ config->stats_num = nthreads;
+
+ for (i = 0; i < nthreads; i++)
+ runtime_stat__init(&config->stats[i]);
+
+ return 0;
+}
+
+static void runtime_stat_delete(struct perf_stat_config *config)
+{
+ int i;
+
+ if (!config->stats)
+ return;
+
+ for (i = 0; i < config->stats_num; i++)
+ runtime_stat__exit(&config->stats[i]);
+
+ zfree(&config->stats);
+}
+
+static void runtime_stat_reset(struct perf_stat_config *config)
+{
+ int i;
+
+ if (!config->stats)
+ return;
+
+ for (i = 0; i < config->stats_num; i++)
+ perf_stat__reset_shadow_per_stat(&config->stats[i]);
+}
+
static void process_interval(void)
{
struct timespec ts, rs;
@@ -359,6 +399,7 @@ static void process_interval(void)
diff_timespec(&rs, &ts, &ref_time);
perf_stat__reset_shadow_per_stat(&rt_stat);
+ runtime_stat_reset(&stat_config);
read_counters(&rs);
if (STAT_RECORD) {
@@ -1737,35 +1778,6 @@ int process_cpu_map_event(struct perf_session *session,
return set_maps(st);
}
-static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
-{
- int i;
-
- config->stats = calloc(nthreads, sizeof(struct runtime_stat));
- if (!config->stats)
- return -1;
-
- config->stats_num = nthreads;
-
- for (i = 0; i < nthreads; i++)
- runtime_stat__init(&config->stats[i]);
-
- return 0;
-}
-
-static void runtime_stat_delete(struct perf_stat_config *config)
-{
- int i;
-
- if (!config->stats)
- return;
-
- for (i = 0; i < config->stats_num; i++)
- runtime_stat__exit(&config->stats[i]);
-
- zfree(&config->stats);
-}
-
static const char * const stat_report_usage[] = {
"perf stat report [<options>]",
NULL,
--
2.17.1
* [PATCH v4 2/4] perf counts: Reset prev_raw_counts counts
2020-05-08 7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
2020-05-08 7:58 ` [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
@ 2020-05-08 7:58 ` Jin Yao
2020-05-08 7:58 ` [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
2020-05-08 7:58 ` [PATCH v4 4/4] perf stat: Report summary for interval mode Jin Yao
3 siblings, 0 replies; 7+ messages in thread
From: Jin Yao @ 2020-05-08 7:58 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
Zeroing only the aggr is not enough to reset evsel->prev_raw_counts;
the perf_counts values need to be reset too.
perf_counts__reset already zeros the perf_counts values, and it should
zero the aggr as well. This patch makes perf_counts__reset non-static
and calls it in evsel__reset_prev_raw_counts to reset the
prev_raw_counts.
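The difference between the two reset strategies can be sketched in a
standalone model (not perf code). It assumes a fixed-size array in place
of perf's xyarray; struct counts, NR_CPUS and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS 4 /* hypothetical fixed size; perf uses an xyarray */

struct counts_values {
	uint64_t val, ena, run;
};

struct counts {
	struct counts_values aggr;            /* aggregated value */
	struct counts_values values[NR_CPUS]; /* per-cpu values */
	int loaded[NR_CPUS];
};

/* Models the old prev_raw_counts reset: only the aggr is cleared. */
static void counts_reset_aggr_only(struct counts *c)
{
	assert(c);
	c->aggr.val = 0;
	c->aggr.ena = 0;
	c->aggr.run = 0;
}

/* Models the reset after this patch: per-cpu values and aggr are cleared. */
static void counts_reset_all(struct counts *c)
{
	assert(c);
	memset(c->loaded, 0, sizeof(c->loaded));
	memset(c->values, 0, sizeof(c->values));
	memset(&c->aggr, 0, sizeof(c->aggr));
}

/* Sum of the per-cpu values, to show what survives a reset. */
static uint64_t counts_sum_val(const struct counts *c)
{
	uint64_t sum = 0;
	int i;

	for (i = 0; i < NR_CPUS; i++)
		sum += c->values[i].val;

	return sum;
}
```

After counts_reset_aggr_only() the per-cpu values still hold the old
counts, so any later delta computed against them starts from stale
data; counts_reset_all() leaves nothing behind.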
v4:
---
Zero the aggr in perf_counts__reset and use it to reset
prev_raw_counts.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/util/counts.c | 4 +++-
tools/perf/util/counts.h | 1 +
tools/perf/util/stat.c | 7 ++-----
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/counts.c b/tools/perf/util/counts.c
index 615c9f3e95cb..582f3aeaf5e4 100644
--- a/tools/perf/util/counts.c
+++ b/tools/perf/util/counts.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include <errno.h>
#include <stdlib.h>
+#include <string.h>
#include "evsel.h"
#include "counts.h"
#include <linux/zalloc.h>
@@ -42,10 +43,11 @@ void perf_counts__delete(struct perf_counts *counts)
}
}
-static void perf_counts__reset(struct perf_counts *counts)
+void perf_counts__reset(struct perf_counts *counts)
{
xyarray__reset(counts->loaded);
xyarray__reset(counts->values);
+ memset(&counts->aggr, 0, sizeof(struct perf_counts_values));
}
void evsel__reset_counts(struct evsel *evsel)
diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
index 8f556c6d98fa..7ff36bf6d644 100644
--- a/tools/perf/util/counts.h
+++ b/tools/perf/util/counts.h
@@ -37,6 +37,7 @@ perf_counts__set_loaded(struct perf_counts *counts, int cpu, int thread, bool lo
struct perf_counts *perf_counts__new(int ncpus, int nthreads);
void perf_counts__delete(struct perf_counts *counts);
+void perf_counts__reset(struct perf_counts *counts);
void evsel__reset_counts(struct evsel *evsel);
int evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index f4a44df9b221..e397815f0dfb 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -163,11 +163,8 @@ static void evsel__free_prev_raw_counts(struct evsel *evsel)
static void evsel__reset_prev_raw_counts(struct evsel *evsel)
{
- if (evsel->prev_raw_counts) {
- evsel->prev_raw_counts->aggr.val = 0;
- evsel->prev_raw_counts->aggr.ena = 0;
- evsel->prev_raw_counts->aggr.run = 0;
- }
+ if (evsel->prev_raw_counts)
+ perf_counts__reset(evsel->prev_raw_counts);
}
static int evsel__alloc_stats(struct evsel *evsel, bool alloc_raw)
--
2.17.1
* [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
2020-05-08 7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
2020-05-08 7:58 ` [PATCH v4 1/4] perf stat: Fix wrong per-thread runtime stat " Jin Yao
2020-05-08 7:58 ` [PATCH v4 2/4] perf counts: Reset prev_raw_counts counts Jin Yao
@ 2020-05-08 7:58 ` Jin Yao
2020-05-13 15:31 ` Jiri Olsa
2020-05-08 7:58 ` [PATCH v4 4/4] perf stat: Report summary for interval mode Jin Yao
3 siblings, 1 reply; 7+ messages in thread
From: Jin Yao @ 2020-05-08 7:58 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
It would be useful to support overall statistics for perf-stat
interval mode, for example by reporting a summary at the end of the
"perf-stat -I" output.
But since perf-stat supports many aggregation modes, such as
--per-thread, --per-socket, -M, etc., we need a solution which
doesn't add much complexity.
The idea is to use 'evsel->prev_raw_counts', which is updated in
each interval and so always holds the latest counts. Before reporting
the summary, we copy the counts from evsel->prev_raw_counts to
evsel->counts, and then we just follow the non-interval processing.
In evsel__compute_deltas, this patch also saves the counts to the member
[cpu0,thread0] of perf_counts for AGGR_GLOBAL.
That's because after copying evsel->prev_raw_counts to evsel->counts,
perf_counts(evsel->counts, cpu, thread) are all 0 for AGGR_GLOBAL,
so when we go through process_counter_maps again, every member of
perf_counts it reads is 0.
So this patch uses a trick: it saves the previous aggr value to
the member [cpu0,thread0] of perf_counts, so that the aggr calculation
in process_counter_values works correctly.
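The AGGR_GLOBAL trick can be modeled in a few lines of standalone C (not
perf code). The sketch assumes a fixed number of cpus, a single thread and
only the 'val' field; struct counts, NR_CPUS and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical; a single thread is assumed */

struct counts {
	uint64_t aggr;          /* aggregated value */
	uint64_t cpu[NR_CPUS];  /* per-cpu values, slot 0 is [cpu0,thread0] */
};

/*
 * Models the cpu == -1 branch of the delta computation: the raw global
 * count is remembered in prev->aggr. With seed_first_slot set it is also
 * stored in slot 0, which is the trick described above.
 */
static void save_global(struct counts *prev, uint64_t raw, int seed_first_slot)
{
	assert(prev);
	prev->aggr = raw;
	if (seed_first_slot)
		prev->cpu[0] = raw;
}

/* Models copying prev_raw_counts into counts before the summary. */
static void copy_counts(struct counts *dst, const struct counts *src)
{
	*dst = *src;
}

/* Models the non-interval path: aggr is rebuilt from the per-cpu values. */
static uint64_t reaggregate(struct counts *c)
{
	uint64_t sum = 0;
	int i;

	for (i = 0; i < NR_CPUS; i++)
		sum += c->cpu[i];

	c->aggr = sum;
	return sum;
}
```

Without the seeded slot, reaggregate() rebuilds the aggr from per-cpu
values that were never written and returns 0; with it, the summary
recovers the saved global count.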
v4:
---
Change the commit message.
No functional change.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/util/evsel.c | 1 +
tools/perf/util/stat.c | 24 ++++++++++++++++++++++++
tools/perf/util/stat.h | 1 +
3 files changed, 26 insertions(+)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 28683b0eb738..6fae1ec28886 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1283,6 +1283,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
if (cpu == -1) {
tmp = evsel->prev_raw_counts->aggr;
evsel->prev_raw_counts->aggr = *count;
+ *perf_counts(evsel->prev_raw_counts, 0, 0) = *count;
} else {
tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index e397815f0dfb..aadc723ce871 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -225,6 +225,30 @@ void perf_evlist__reset_prev_raw_counts(struct evlist *evlist)
evsel__reset_prev_raw_counts(evsel);
}
+static void perf_evsel__copy_prev_raw_counts(struct evsel *evsel)
+{
+ int ncpus = evsel__nr_cpus(evsel);
+ int nthreads = perf_thread_map__nr(evsel->core.threads);
+
+ for (int thread = 0; thread < nthreads; thread++) {
+ for (int cpu = 0; cpu < ncpus; cpu++) {
+ *perf_counts(evsel->counts, cpu, thread) =
+ *perf_counts(evsel->prev_raw_counts, cpu,
+ thread);
+ }
+ }
+
+ evsel->counts->aggr = evsel->prev_raw_counts->aggr;
+}
+
+void perf_evlist__copy_prev_raw_counts(struct evlist *evlist)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(evlist, evsel)
+ perf_evsel__copy_prev_raw_counts(evsel);
+}
+
static void zero_per_pkg(struct evsel *counter)
{
if (counter->per_pkg_mask)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index b4fdfaa7f2c0..62cf72c71869 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -198,6 +198,7 @@ int perf_evlist__alloc_stats(struct evlist *evlist, bool alloc_raw);
void perf_evlist__free_stats(struct evlist *evlist);
void perf_evlist__reset_stats(struct evlist *evlist);
void perf_evlist__reset_prev_raw_counts(struct evlist *evlist);
+void perf_evlist__copy_prev_raw_counts(struct evlist *evlist);
int perf_stat_process_counter(struct perf_stat_config *config,
struct evsel *counter);
--
2.17.1
* [PATCH v4 4/4] perf stat: Report summary for interval mode
2020-05-08 7:58 [PATCH v4 0/4] perf stat: Support overall statistics for interval mode Jin Yao
` (2 preceding siblings ...)
2020-05-08 7:58 ` [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
@ 2020-05-08 7:58 ` Jin Yao
3 siblings, 0 replies; 7+ messages in thread
From: Jin Yao @ 2020-05-08 7:58 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
Currently perf-stat can print counts at a regular interval (-I),
but it is not easy for the user to get the overall statistics.
The patch uses 'evsel->prev_raw_counts' to get the counts for the summary.
The counts are copied to 'evsel->counts' after the interval results
are printed, and then we just follow the non-interval processing.
Let's see some examples:
root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
# time counts unit events
1.000412064 2,281,114 cycles
2.001383658 2,547,880 cycles
Performance counter stats for 'system wide':
4,828,994 cycles
2.002860349 seconds time elapsed
root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
# time counts unit events
1.000389902 1,536,093 cycles
1.000389902 420,226 instructions # 0.27 insn per cycle
2.001433453 2,213,952 cycles
2.001433453 735,465 instructions # 0.33 insn per cycle
Performance counter stats for 'system wide':
3,750,045 cycles
1,155,691 instructions # 0.31 insn per cycle
2.003023361 seconds time elapsed
root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
# time counts unit events
1.000435121 905,303 inst_retired.any # 2.9 CPI
1.000435121 2,663,333 cycles
1.000435121 914,702 inst_retired.any # 0.3 IPC
1.000435121 2,676,559 cpu_clk_unhalted.thread
2.001615941 1,951,092 inst_retired.any # 1.8 CPI
2.001615941 3,551,357 cycles
2.001615941 1,950,837 inst_retired.any # 0.5 IPC
2.001615941 3,551,044 cpu_clk_unhalted.thread
Performance counter stats for 'system wide':
2,856,395 inst_retired.any # 2.2 CPI
6,214,690 cycles
2,865,539 inst_retired.any # 0.5 IPC
6,227,603 cpu_clk_unhalted.thread
2.003403078 seconds time elapsed
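The relation between the interval lines and the summary can be checked
with a small standalone sketch (not perf code). It assumes the counter is
read as a cumulative raw value, that each interval prints the delta
against the previous raw value, and that the summary is the last raw
value; interval_delta is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the count for one interval and remember the raw value so the
 * next interval is computed against it. After the last interval,
 * *prev_raw holds the total that the summary reports.
 */
static uint64_t interval_delta(uint64_t *prev_raw, uint64_t raw)
{
	uint64_t delta;

	assert(prev_raw);
	assert(raw >= *prev_raw);

	delta = raw - *prev_raw;
	*prev_raw = raw;
	return delta;
}
```

For the first example, raw reads of 2,281,114 and 4,828,994 cycles give
deltas of 2,281,114 and 2,547,880, and the saved raw value of 4,828,994
matches the summary line.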
v4:
---
Move affinity setup and read_counter_cpu to a new function
read_affinity_counters. It's only called when stat_config.summary
is not set.
v3:
---
Use evsel->prev_raw_counts for summary counts
v2:
---
Rebase to perf/core branch
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/builtin-stat.c | 27 ++++++++++++++++++++++++---
tools/perf/util/stat.c | 2 +-
tools/perf/util/stat.h | 1 +
3 files changed, 26 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f3b3a59ac7d2..d6a6aa6b997d 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -314,14 +314,14 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu)
return 0;
}
-static void read_counters(struct timespec *rs)
+static int read_affinity_counters(struct timespec *rs)
{
struct evsel *counter;
struct affinity affinity;
int i, ncpus, cpu;
if (affinity__setup(&affinity) < 0)
- return;
+ return -1;
ncpus = perf_cpu_map__nr(evsel_list->core.all_cpus);
if (!target__has_cpu(&target) || target__has_per_thread(&target))
@@ -341,6 +341,15 @@ static void read_counters(struct timespec *rs)
}
}
affinity__cleanup(&affinity);
+ return 0;
+}
+
+static void read_counters(struct timespec *rs)
+{
+ struct evsel *counter;
+
+ if (!stat_config.summary && (read_affinity_counters(rs) < 0))
+ return;
evlist__for_each_entry(evsel_list, counter) {
if (counter->err)
@@ -394,6 +403,7 @@ static void runtime_stat_reset(struct perf_stat_config *config)
static void process_interval(void)
{
struct timespec ts, rs;
+ struct stats walltime_nsecs_stats_bak;
clock_gettime(CLOCK_MONOTONIC, &ts);
diff_timespec(&rs, &ts, &ref_time);
@@ -407,9 +417,11 @@ static void process_interval(void)
pr_err("failed to write stat round event\n");
}
+ walltime_nsecs_stats_bak = walltime_nsecs_stats;
init_stats(&walltime_nsecs_stats);
update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000);
print_counters(&rs, 0, NULL);
+ walltime_nsecs_stats = walltime_nsecs_stats_bak;
}
static void enable_counters(void)
@@ -765,6 +777,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
update_stats(&walltime_nsecs_stats, t1 - t0);
+ if (interval) {
+ stat_config.interval = 0;
+ stat_config.summary = true;
+ perf_evlist__copy_prev_raw_counts(evsel_list);
+ perf_evlist__reset_prev_raw_counts(evsel_list);
+ runtime_stat_reset(&stat_config);
+ perf_stat__reset_shadow_per_stat(&rt_stat);
+ }
+
/*
* Closing a group leader splits the group, and as we only disable
* group leaders, results in remaining events becoming enabled. To
@@ -2159,7 +2180,7 @@ int cmd_stat(int argc, const char **argv)
}
}
- if (!forever && status != -1 && !interval)
+ if (!forever && status != -1 && (!interval || stat_config.summary))
print_counters(NULL, argc, argv);
if (STAT_RECORD) {
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index aadc723ce871..49e832a9c109 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -388,7 +388,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
* interval mode, otherwise overall avg running
* averages will be shown for each interval.
*/
- if (config->interval) {
+ if (config->interval || config->summary) {
for (i = 0; i < 3; i++)
init_stats(&ps->res_stats[i]);
}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 62cf72c71869..c60e9e5d6474 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -110,6 +110,7 @@ struct perf_stat_config {
bool all_kernel;
bool all_user;
bool percore_show_thread;
+ bool summary;
FILE *output;
unsigned int interval;
unsigned int timeout;
--
2.17.1
* Re: [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
2020-05-08 7:58 ` [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts Jin Yao
@ 2020-05-13 15:31 ` Jiri Olsa
2020-05-14 5:42 ` Jin, Yao
0 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2020-05-13 15:31 UTC (permalink / raw)
To: Jin Yao
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On Fri, May 08, 2020 at 03:58:16PM +0800, Jin Yao wrote:
> It would be useful to support the overall statistics for perf-stat
> interval mode. For example, report the summary at the end of
> "perf-stat -I" output.
>
> But since perf-stat can support many aggregation modes, such as
> --per-thread, --per-socket, -M and etc, we need a solution which
> doesn't bring much complexity.
>
> The idea is to use 'evsel->prev_raw_counts' which is updated in
> each interval and it's saved with the latest counts. Before reporting
> the summary, we copy the counts from evsel->prev_raw_counts to
> evsel->counts, and next we just follow non-interval processing.
>
> In evsel__compute_deltas, this patch saves counts to the member
> [cpu0,thread0] of perf_counts for AGGR_GLOBAL.
>
> That's because after copying evsel->prev_raw_counts to evsel->counts,
> perf_counts(evsel->counts, cpu, thread) are all 0 for AGGR_GLOBAL.
> Once we go to process_counter_maps again, all members of perf_counts
> are 0.
>
> So this patch uses a trick that saves the previous aggr value to
> the member [cpu0,thread0] of perf_counts, then aggr calculation
> in process_counter_values can work correctly.
>
> v4:
> ---
> Change the commit message.
> No functional change.
>
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
> tools/perf/util/evsel.c | 1 +
> tools/perf/util/stat.c | 24 ++++++++++++++++++++++++
> tools/perf/util/stat.h | 1 +
> 3 files changed, 26 insertions(+)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 28683b0eb738..6fae1ec28886 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1283,6 +1283,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
> if (cpu == -1) {
> tmp = evsel->prev_raw_counts->aggr;
> evsel->prev_raw_counts->aggr = *count;
> + *perf_counts(evsel->prev_raw_counts, 0, 0) = *count;
ok, I think I understand that now.. it's only for AGGR_GLOBAL mode,
because the perf_stat_process_counter will create aggr values from
per cpu values
but why do we need to do that all the time? can't we just set it up
before you zero prev_raw_counts in next patch?
if (interval) {
stat_config.interval = 0;
stat_config.summary = true;
perf_evlist__copy_prev_raw_counts(evsel_list);
-> for AGGR_GLOBAL set the counts[0,0] to prev_raw_counts->aggr
perf_evlist__reset_prev_raw_counts(evsel_list);
runtime_stat_reset(&stat_config);
perf_stat__reset_shadow_per_stat(&rt_stat);
}
thanks,
jirka
* Re: [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts
2020-05-13 15:31 ` Jiri Olsa
@ 2020-05-14 5:42 ` Jin, Yao
0 siblings, 0 replies; 7+ messages in thread
From: Jin, Yao @ 2020-05-14 5:42 UTC (permalink / raw)
To: Jiri Olsa
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
Hi Jiri,
On 5/13/2020 11:31 PM, Jiri Olsa wrote:
> On Fri, May 08, 2020 at 03:58:16PM +0800, Jin Yao wrote:
>> It would be useful to support the overall statistics for perf-stat
>> interval mode. For example, report the summary at the end of
>> "perf-stat -I" output.
>>
>> But since perf-stat can support many aggregation modes, such as
>> --per-thread, --per-socket, -M and etc, we need a solution which
>> doesn't bring much complexity.
>>
>> The idea is to use 'evsel->prev_raw_counts' which is updated in
>> each interval and it's saved with the latest counts. Before reporting
>> the summary, we copy the counts from evsel->prev_raw_counts to
>> evsel->counts, and next we just follow non-interval processing.
>>
>> In evsel__compute_deltas, this patch saves counts to the member
>> [cpu0,thread0] of perf_counts for AGGR_GLOBAL.
>>
>> That's because after copying evsel->prev_raw_counts to evsel->counts,
>> perf_counts(evsel->counts, cpu, thread) are all 0 for AGGR_GLOBAL.
>> Once we go to process_counter_maps again, all members of perf_counts
>> are 0.
>>
>> So this patch uses a trick that saves the previous aggr value to
>> the member [cpu0,thread0] of perf_counts, then aggr calculation
>> in process_counter_values can work correctly.
>>
>> v4:
>> ---
>> Change the commit message.
>> No functional change.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>> tools/perf/util/evsel.c | 1 +
>> tools/perf/util/stat.c | 24 ++++++++++++++++++++++++
>> tools/perf/util/stat.h | 1 +
>> 3 files changed, 26 insertions(+)
>>
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index 28683b0eb738..6fae1ec28886 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -1283,6 +1283,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
>> if (cpu == -1) {
>> tmp = evsel->prev_raw_counts->aggr;
>> evsel->prev_raw_counts->aggr = *count;
>> + *perf_counts(evsel->prev_raw_counts, 0, 0) = *count;
>
> ok, I think I understand that now.. it's only for AGGR_GLOBAL mode,
> because the perf_stat_process_counter will create aggr values from
> per cpu values
>
> but why do we need to do that all the time? can't we just set it up
> before you zero prev_raw_counts in next patch?
>
>
> if (interval) {
> stat_config.interval = 0;
> stat_config.summary = true;
> perf_evlist__copy_prev_raw_counts(evsel_list);
>
> -> for AGGR_GLOBAL set the counts[0,0] to prev_raw_counts->aggr
>
> perf_evlist__reset_prev_raw_counts(evsel_list);
> runtime_stat_reset(&stat_config);
> perf_stat__reset_shadow_per_stat(&rt_stat);
> }
>
Yes, I think that's a good idea.
In v5, I create a new patch, "perf stat: Save aggr value to first member of
prev_raw_counts", which saves the aggr value to the first member of
prev_raw_counts for AGGR_GLOBAL. After that, perf_stat_process_counter can
create the aggr values from the per cpu values successfully.
Thanks
Jin Yao
>
> thanks,
> jirka
>