* [PATCH] perf stat: Use proper cpu for shadow stats
@ 2020-11-13 5:02 Namhyung Kim
2020-11-13 19:18 ` Andi Kleen
2020-11-13 22:16 ` Jiri Olsa
0 siblings, 2 replies; 4+ messages in thread
From: Namhyung Kim @ 2020-11-13 5:02 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, Mark Rutland, Alexander Shishkin,
LKML, Stephane Eranian, Ian Rogers, Andi Kleen, Sam Xi
Currently perf stat shows some metrics (like IPC) for defined events.
But when no aggregation mode is used (-A option), it shows incorrect
values since it used a value from a different cpu.
Before:
$ perf stat -aA -e cycles,instructions sleep 1
Performance counter stats for 'system wide':
CPU0 116,057,380 cycles
CPU1 86,084,722 cycles
CPU2 99,423,125 cycles
CPU3 98,272,994 cycles
CPU0 53,369,217 instructions # 0.46 insn per cycle
CPU1 33,378,058 instructions # 0.29 insn per cycle
CPU2 58,150,086 instructions # 0.50 insn per cycle
CPU3 40,029,703 instructions # 0.34 insn per cycle
1.001816971 seconds time elapsed
So the IPC for CPU1 should be 0.38 (= 33,378,058 / 86,084,722)
but it was 0.29 (= 33,378,058 / 116,057,380) and so on.
After:
$ perf stat -aA -e cycles,instructions sleep 1
Performance counter stats for 'system wide':
CPU0 109,621,384 cycles
CPU1 159,026,454 cycles
CPU2 99,460,366 cycles
CPU3 124,144,142 cycles
CPU0 44,396,706 instructions # 0.41 insn per cycle
CPU1 120,195,425 instructions # 0.76 insn per cycle
CPU2 44,763,978 instructions # 0.45 insn per cycle
CPU3 69,049,079 instructions # 0.56 insn per cycle
1.001910444 seconds time elapsed
Reported-by: Sam Xi <xyzsam@google.com>
Fixes: 44d49a600259 ("perf stat: Support metrics in --per-core/socket mode")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/stat-display.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 4b57c0c07632..a963b5b8eb72 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -324,13 +324,10 @@ static int first_shadow_cpu(struct perf_stat_config *config,
struct evlist *evlist = evsel->evlist;
int i;
- if (!config->aggr_get_id)
- return 0;
-
if (config->aggr_mode == AGGR_NONE)
return id;
- if (config->aggr_mode == AGGR_GLOBAL)
+ if (!config->aggr_get_id)
return 0;
for (i = 0; i < evsel__nr_cpus(evsel); i++) {
--
2.29.2.299.gdc1121823c-goog
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] perf stat: Use proper cpu for shadow stats
2020-11-13 5:02 [PATCH] perf stat: Use proper cpu for shadow stats Namhyung Kim
@ 2020-11-13 19:18 ` Andi Kleen
2020-11-13 22:16 ` Jiri Olsa
1 sibling, 0 replies; 4+ messages in thread
From: Andi Kleen @ 2020-11-13 19:18 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
Ian Rogers, Sam Xi
> CPU0 109,621,384 cycles
> CPU1 159,026,454 cycles
> CPU2 99,460,366 cycles
> CPU3 124,144,142 cycles
> CPU0 44,396,706 instructions # 0.41 insn per cycle
> CPU1 120,195,425 instructions # 0.76 insn per cycle
> CPU2 44,763,978 instructions # 0.45 insn per cycle
> CPU3 69,049,079 instructions # 0.56 insn per cycle
>
> 1.001910444 seconds time elapsed
Yes makes a lot of sense. Thanks for tracking that down.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
-Andi
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] perf stat: Use proper cpu for shadow stats
2020-11-13 5:02 [PATCH] perf stat: Use proper cpu for shadow stats Namhyung Kim
2020-11-13 19:18 ` Andi Kleen
@ 2020-11-13 22:16 ` Jiri Olsa
2020-11-14 0:23 ` Namhyung Kim
1 sibling, 1 reply; 4+ messages in thread
From: Jiri Olsa @ 2020-11-13 22:16 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
Ian Rogers, Andi Kleen, Sam Xi
On Fri, Nov 13, 2020 at 02:02:36PM +0900, Namhyung Kim wrote:
> Currently perf stat shows some metrics (like IPC) for defined events.
> But when no aggregation mode is used (-A option), it shows incorrect
> values since it used a value from a different cpu.
>
> Before:
>
> $ perf stat -aA -e cycles,instructions sleep 1
>
> Performance counter stats for 'system wide':
>
> CPU0 116,057,380 cycles
> CPU1 86,084,722 cycles
> CPU2 99,423,125 cycles
> CPU3 98,272,994 cycles
> CPU0 53,369,217 instructions # 0.46 insn per cycle
> CPU1 33,378,058 instructions # 0.29 insn per cycle
> CPU2 58,150,086 instructions # 0.50 insn per cycle
> CPU3 40,029,703 instructions # 0.34 insn per cycle
>
> 1.001816971 seconds time elapsed
>
> So the IPC for CPU1 should be 0.38 (= 33,378,058 / 86,084,722)
> but it was 0.29 (= 33,378,058 / 116,057,380) and so on.
>
> After:
>
> $ perf stat -aA -e cycles,instructions sleep 1
>
> Performance counter stats for 'system wide':
>
> CPU0 109,621,384 cycles
> CPU1 159,026,454 cycles
> CPU2 99,460,366 cycles
> CPU3 124,144,142 cycles
> CPU0 44,396,706 instructions # 0.41 insn per cycle
> CPU1 120,195,425 instructions # 0.76 insn per cycle
> CPU2 44,763,978 instructions # 0.45 insn per cycle
> CPU3 69,049,079 instructions # 0.56 insn per cycle
>
> 1.001910444 seconds time elapsed
>
> Reported-by: Sam Xi <xyzsam@google.com>
> Fixes: 44d49a600259 ("perf stat: Support metrics in --per-core/socket mode")
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
nice catch! would be great to have test for this
Acked-by: Jiri Olsa <jolsa@redhat.com>
thanks,
jirka
> ---
> tools/perf/util/stat-display.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 4b57c0c07632..a963b5b8eb72 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -324,13 +324,10 @@ static int first_shadow_cpu(struct perf_stat_config *config,
> struct evlist *evlist = evsel->evlist;
> int i;
>
> - if (!config->aggr_get_id)
> - return 0;
> -
> if (config->aggr_mode == AGGR_NONE)
> return id;
>
> - if (config->aggr_mode == AGGR_GLOBAL)
> + if (!config->aggr_get_id)
> return 0;
>
> for (i = 0; i < evsel__nr_cpus(evsel); i++) {
> --
> 2.29.2.299.gdc1121823c-goog
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] perf stat: Use proper cpu for shadow stats
2020-11-13 22:16 ` Jiri Olsa
@ 2020-11-14 0:23 ` Namhyung Kim
0 siblings, 0 replies; 4+ messages in thread
From: Namhyung Kim @ 2020-11-14 0:23 UTC (permalink / raw)
To: Jiri Olsa
Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
Ian Rogers, Andi Kleen, Sam Xi
On Sat, Nov 14, 2020 at 7:16 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Nov 13, 2020 at 02:02:36PM +0900, Namhyung Kim wrote:
> > Currently perf stat shows some metrics (like IPC) for defined events.
> > But when no aggregation mode is used (-A option), it shows incorrect
> > values since it used a value from a different cpu.
> >
> > Before:
> >
> > $ perf stat -aA -e cycles,instructions sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > CPU0 116,057,380 cycles
> > CPU1 86,084,722 cycles
> > CPU2 99,423,125 cycles
> > CPU3 98,272,994 cycles
> > CPU0 53,369,217 instructions # 0.46 insn per cycle
> > CPU1 33,378,058 instructions # 0.29 insn per cycle
> > CPU2 58,150,086 instructions # 0.50 insn per cycle
> > CPU3 40,029,703 instructions # 0.34 insn per cycle
> >
> > 1.001816971 seconds time elapsed
> >
> > So the IPC for CPU1 should be 0.38 (= 33,378,058 / 86,084,722)
> > but it was 0.29 (= 33,378,058 / 116,057,380) and so on.
> >
> > After:
> >
> > $ perf stat -aA -e cycles,instructions sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > CPU0 109,621,384 cycles
> > CPU1 159,026,454 cycles
> > CPU2 99,460,366 cycles
> > CPU3 124,144,142 cycles
> > CPU0 44,396,706 instructions # 0.41 insn per cycle
> > CPU1 120,195,425 instructions # 0.76 insn per cycle
> > CPU2 44,763,978 instructions # 0.45 insn per cycle
> > CPU3 69,049,079 instructions # 0.56 insn per cycle
> >
> > 1.001910444 seconds time elapsed
> >
> > Reported-by: Sam Xi <xyzsam@google.com>
> > Fixes: 44d49a600259 ("perf stat: Support metrics in --per-core/socket mode")
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>
> nice catch! would be great to have test for this
>
> Acked-by: Jiri Olsa <jolsa@redhat.com>
Thanks! Will add a test later.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-11-14 0:23 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-13 5:02 [PATCH] perf stat: Use proper cpu for shadow stats Namhyung Kim
2020-11-13 19:18 ` Andi Kleen
2020-11-13 22:16 ` Jiri Olsa
2020-11-14 0:23 ` Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).