linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf stat: Use proper cpu for shadow stats
@ 2020-11-13  5:02 Namhyung Kim
  2020-11-13 19:18 ` Andi Kleen
  2020-11-13 22:16 ` Jiri Olsa
  0 siblings, 2 replies; 4+ messages in thread
From: Namhyung Kim @ 2020-11-13  5:02 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, Mark Rutland, Alexander Shishkin,
	LKML, Stephane Eranian, Ian Rogers, Andi Kleen, Sam Xi

Currently perf stat shows some metrics (like IPC) for defined events.
But when no aggregation mode is used (-A option), it shows incorrect
values since it used a value from a different cpu.

Before:

  $ perf stat -aA -e cycles,instructions sleep 1

   Performance counter stats for 'system wide':

  CPU0      116,057,380      cycles
  CPU1       86,084,722      cycles
  CPU2       99,423,125      cycles
  CPU3       98,272,994      cycles
  CPU0       53,369,217      instructions      #    0.46  insn per cycle
  CPU1       33,378,058      instructions      #    0.29  insn per cycle
  CPU2       58,150,086      instructions      #    0.50  insn per cycle
  CPU3       40,029,703      instructions      #    0.34  insn per cycle

       1.001816971 seconds time elapsed

So the IPC for CPU1 should be 0.38 (= 33,378,058 / 86,084,722)
but it was 0.29 (= 33,378,058 / 116,057,380) and so on.

After:

  $ perf stat -aA -e cycles,instructions sleep 1

   Performance counter stats for 'system wide':

  CPU0      109,621,384      cycles
  CPU1      159,026,454      cycles
  CPU2       99,460,366      cycles
  CPU3      124,144,142      cycles
  CPU0       44,396,706      instructions      #    0.41  insn per cycle
  CPU1      120,195,425      instructions      #    0.76  insn per cycle
  CPU2       44,763,978      instructions      #    0.45  insn per cycle
  CPU3       69,049,079      instructions      #    0.56  insn per cycle

       1.001910444 seconds time elapsed

Reported-by: Sam Xi <xyzsam@google.com>
Fixes: 44d49a600259 ("perf stat: Support metrics in --per-core/socket mode")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/stat-display.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 4b57c0c07632..a963b5b8eb72 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -324,13 +324,10 @@ static int first_shadow_cpu(struct perf_stat_config *config,
 	struct evlist *evlist = evsel->evlist;
 	int i;
 
-	if (!config->aggr_get_id)
-		return 0;
-
 	if (config->aggr_mode == AGGR_NONE)
 		return id;
 
-	if (config->aggr_mode == AGGR_GLOBAL)
+	if (!config->aggr_get_id)
 		return 0;
 
 	for (i = 0; i < evsel__nr_cpus(evsel); i++) {
-- 
2.29.2.299.gdc1121823c-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] perf stat: Use proper cpu for shadow stats
  2020-11-13  5:02 [PATCH] perf stat: Use proper cpu for shadow stats Namhyung Kim
@ 2020-11-13 19:18 ` Andi Kleen
  2020-11-13 22:16 ` Jiri Olsa
  1 sibling, 0 replies; 4+ messages in thread
From: Andi Kleen @ 2020-11-13 19:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Ian Rogers, Sam Xi

>   CPU0      109,621,384      cycles
>   CPU1      159,026,454      cycles
>   CPU2       99,460,366      cycles
>   CPU3      124,144,142      cycles
>   CPU0       44,396,706      instructions      #    0.41  insn per cycle
>   CPU1      120,195,425      instructions      #    0.76  insn per cycle
>   CPU2       44,763,978      instructions      #    0.45  insn per cycle
>   CPU3       69,049,079      instructions      #    0.56  insn per cycle
> 
>        1.001910444 seconds time elapsed

Yes makes a lot of sense. Thanks for tracking that down.

Reviewed-by: Andi Kleen <ak@linux.intel.com>

-Andi

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] perf stat: Use proper cpu for shadow stats
  2020-11-13  5:02 [PATCH] perf stat: Use proper cpu for shadow stats Namhyung Kim
  2020-11-13 19:18 ` Andi Kleen
@ 2020-11-13 22:16 ` Jiri Olsa
  2020-11-14  0:23   ` Namhyung Kim
  1 sibling, 1 reply; 4+ messages in thread
From: Jiri Olsa @ 2020-11-13 22:16 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Ian Rogers, Andi Kleen, Sam Xi

On Fri, Nov 13, 2020 at 02:02:36PM +0900, Namhyung Kim wrote:
> Currently perf stat shows some metrics (like IPC) for defined events.
> But when no aggregation mode is used (-A option), it shows incorrect
> values since it used a value from a different cpu.
> 
> Before:
> 
>   $ perf stat -aA -e cycles,instructions sleep 1
> 
>    Performance counter stats for 'system wide':
> 
>   CPU0      116,057,380      cycles
>   CPU1       86,084,722      cycles
>   CPU2       99,423,125      cycles
>   CPU3       98,272,994      cycles
>   CPU0       53,369,217      instructions      #    0.46  insn per cycle
>   CPU1       33,378,058      instructions      #    0.29  insn per cycle
>   CPU2       58,150,086      instructions      #    0.50  insn per cycle
>   CPU3       40,029,703      instructions      #    0.34  insn per cycle
> 
>        1.001816971 seconds time elapsed
> 
> So the IPC for CPU1 should be 0.38 (= 33,378,058 / 86,084,722)
> but it was 0.29 (= 33,378,058 / 116,057,380) and so on.
> 
> After:
> 
>   $ perf stat -aA -e cycles,instructions sleep 1
> 
>    Performance counter stats for 'system wide':
> 
>   CPU0      109,621,384      cycles
>   CPU1      159,026,454      cycles
>   CPU2       99,460,366      cycles
>   CPU3      124,144,142      cycles
>   CPU0       44,396,706      instructions      #    0.41  insn per cycle
>   CPU1      120,195,425      instructions      #    0.76  insn per cycle
>   CPU2       44,763,978      instructions      #    0.45  insn per cycle
>   CPU3       69,049,079      instructions      #    0.56  insn per cycle
> 
>        1.001910444 seconds time elapsed
> 
> Reported-by: Sam Xi <xyzsam@google.com>
> Fixes: 44d49a600259 ("perf stat: Support metrics in --per-core/socket mode")
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

nice catch! would be great to have test for this

Acked-by: Jiri Olsa <jolsa@redhat.com>

thanks,
jirka

> ---
>  tools/perf/util/stat-display.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 4b57c0c07632..a963b5b8eb72 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -324,13 +324,10 @@ static int first_shadow_cpu(struct perf_stat_config *config,
>  	struct evlist *evlist = evsel->evlist;
>  	int i;
>  
> -	if (!config->aggr_get_id)
> -		return 0;
> -
>  	if (config->aggr_mode == AGGR_NONE)
>  		return id;
>  
> -	if (config->aggr_mode == AGGR_GLOBAL)
> +	if (!config->aggr_get_id)
>  		return 0;
>  
>  	for (i = 0; i < evsel__nr_cpus(evsel); i++) {
> -- 
> 2.29.2.299.gdc1121823c-goog
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] perf stat: Use proper cpu for shadow stats
  2020-11-13 22:16 ` Jiri Olsa
@ 2020-11-14  0:23   ` Namhyung Kim
  0 siblings, 0 replies; 4+ messages in thread
From: Namhyung Kim @ 2020-11-14  0:23 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Ian Rogers, Andi Kleen, Sam Xi

On Sat, Nov 14, 2020 at 7:16 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Nov 13, 2020 at 02:02:36PM +0900, Namhyung Kim wrote:
> > Currently perf stat shows some metrics (like IPC) for defined events.
> > But when no aggregation mode is used (-A option), it shows incorrect
> > values since it used a value from a different cpu.
> >
> > Before:
> >
> >   $ perf stat -aA -e cycles,instructions sleep 1
> >
> >    Performance counter stats for 'system wide':
> >
> >   CPU0      116,057,380      cycles
> >   CPU1       86,084,722      cycles
> >   CPU2       99,423,125      cycles
> >   CPU3       98,272,994      cycles
> >   CPU0       53,369,217      instructions      #    0.46  insn per cycle
> >   CPU1       33,378,058      instructions      #    0.29  insn per cycle
> >   CPU2       58,150,086      instructions      #    0.50  insn per cycle
> >   CPU3       40,029,703      instructions      #    0.34  insn per cycle
> >
> >        1.001816971 seconds time elapsed
> >
> > So the IPC for CPU1 should be 0.38 (= 33,378,058 / 86,084,722)
> > but it was 0.29 (= 33,378,058 / 116,057,380) and so on.
> >
> > After:
> >
> >   $ perf stat -aA -e cycles,instructions sleep 1
> >
> >    Performance counter stats for 'system wide':
> >
> >   CPU0      109,621,384      cycles
> >   CPU1      159,026,454      cycles
> >   CPU2       99,460,366      cycles
> >   CPU3      124,144,142      cycles
> >   CPU0       44,396,706      instructions      #    0.41  insn per cycle
> >   CPU1      120,195,425      instructions      #    0.76  insn per cycle
> >   CPU2       44,763,978      instructions      #    0.45  insn per cycle
> >   CPU3       69,049,079      instructions      #    0.56  insn per cycle
> >
> >        1.001910444 seconds time elapsed
> >
> > Reported-by: Sam Xi <xyzsam@google.com>
> > Fixes: 44d49a600259 ("perf stat: Support metrics in --per-core/socket mode")
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>
> nice catch! would be great to have test for this
>
> Acked-by: Jiri Olsa <jolsa@redhat.com>

Thanks!  Will add a test later.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-11-14  0:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-13  5:02 [PATCH] perf stat: Use proper cpu for shadow stats Namhyung Kim
2020-11-13 19:18 ` Andi Kleen
2020-11-13 22:16 ` Jiri Olsa
2020-11-14  0:23   ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).