linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf stat: Fix per socket shadow aggregation
@ 2021-12-04  2:34 Ian Rogers
  2021-12-06 10:44 ` James Clark
  0 siblings, 1 reply; 4+ messages in thread
From: Ian Rogers @ 2021-12-04  2:34 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh
  Cc: eranian, Ian Rogers

An uncore device may have a CPU mask that specifies one CPU per socket:
$ cat /sys/devices/uncore_imc_0/cpumask
0,18
The perf_stat_config aggr_map will map a CPU to the socket and other
aggregation values for it. Fix an error where the index into CPU mask
was being used as the index into the aggr_map. For the cpumask above the
indexes 0 and 1 are passed to aggr_map rather than the CPUs 0 and 18.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat-display.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 588601000f3f..7cfad5cfec38 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -516,7 +516,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
 static void aggr_update_shadow(struct perf_stat_config *config,
 			       struct evlist *evlist)
 {
-	int cpu, s;
+	int idx, cpu, s;
 	struct aggr_cpu_id s2, id;
 	u64 val;
 	struct evsel *counter;
@@ -525,11 +525,12 @@ static void aggr_update_shadow(struct perf_stat_config *config,
 		id = config->aggr_map->map[s];
 		evlist__for_each_entry(evlist, counter) {
 			val = 0;
-			for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
+			for (idx = 0; idx < evsel__nr_cpus(counter); idx++) {
+				cpu = perf_cpu_map__cpu(evsel__cpus(counter), idx);
 				s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
 				if (!cpu_map__compare_aggr_cpu_id(s2, id))
 					continue;
-				val += perf_counts(counter->counts, cpu, 0)->val;
+				val += perf_counts(counter->counts, idx, 0)->val;
 			}
 			perf_stat__update_shadow_stats(counter, val,
 					first_shadow_cpu(config, counter, id),
-- 
2.34.1.400.ga245620fadb-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] perf stat: Fix per socket shadow aggregation
  2021-12-04  2:34 [PATCH] perf stat: Fix per socket shadow aggregation Ian Rogers
@ 2021-12-06 10:44 ` James Clark
  2021-12-06 22:16   ` Ian Rogers
  0 siblings, 1 reply; 4+ messages in thread
From: James Clark @ 2021-12-06 10:44 UTC (permalink / raw)
  To: Ian Rogers, Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry,
	Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
	Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, linux-perf-users, linux-kernel,
	Vineet Singh
  Cc: eranian



On 04/12/2021 02:34, Ian Rogers wrote:
> An uncore device may have a CPU mask that specifies one CPU per socket:
> $ cat /sys/devices/uncore_imc_0/cpumask
> 0,18
> The perf_stat_config aggr_map will map a CPU to the socket and other
> aggregation values for it. Fix an error where the index into CPU mask
> was being used as the index into the aggr_map. For the cpumask above the
> indexes 0 and 1 are passed to aggr_map rather than the CPUs 0 and 18.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/util/stat-display.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 588601000f3f..7cfad5cfec38 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -516,7 +516,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
>  static void aggr_update_shadow(struct perf_stat_config *config,
>  			       struct evlist *evlist)
>  {
> -	int cpu, s;
> +	int idx, cpu, s;
>  	struct aggr_cpu_id s2, id;
>  	u64 val;
>  	struct evsel *counter;
> @@ -525,11 +525,12 @@ static void aggr_update_shadow(struct perf_stat_config *config,
>  		id = config->aggr_map->map[s];
>  		evlist__for_each_entry(evlist, counter) {
>  			val = 0;
> -			for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
> +			for (idx = 0; idx < evsel__nr_cpus(counter); idx++) {
> +				cpu = perf_cpu_map__cpu(evsel__cpus(counter), idx);
>  				s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);

Hi Ian,

This same pattern of looping over the CPUs and calling aggr_get_id() is used a couple of
other times. For example in aggr_cb() and first_shadow_cpu(). Do you think these also
need updating?

Or could we fix it in the aggr_get_id() functions so that they expect an index instead
of CPU ID and do the conversion themselves? The callbacks do say "idx" rather than "cpu"
so maybe there is still come confusion.

For example: 

	perf_stat__get_die_cached(struct perf_stat_config *config,
					struct perf_cpu_map *map, int idx)

>  				if (!cpu_map__compare_aggr_cpu_id(s2, id))
>  					continue;
> -				val += perf_counts(counter->counts, cpu, 0)->val;
> +				val += perf_counts(counter->counts, idx, 0)->val;
>  			}
>  			perf_stat__update_shadow_stats(counter, val,
>  					first_shadow_cpu(config, counter, id),
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] perf stat: Fix per socket shadow aggregation
  2021-12-06 10:44 ` James Clark
@ 2021-12-06 22:16   ` Ian Rogers
  2021-12-07  8:48     ` Ian Rogers
  0 siblings, 1 reply; 4+ messages in thread
From: Ian Rogers @ 2021-12-06 22:16 UTC (permalink / raw)
  To: James Clark
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	eranian

On Mon, Dec 6, 2021 at 2:44 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 04/12/2021 02:34, Ian Rogers wrote:
> > An uncore device may have a CPU mask that specifies one CPU per socket:
> > $ cat /sys/devices/uncore_imc_0/cpumask
> > 0,18
> > The perf_stat_config aggr_map will map a CPU to the socket and other
> > aggregation values for it. Fix an error where the index into CPU mask
> > was being used as the index into the aggr_map. For the cpumask above the
> > indexes 0 and 1 are passed to aggr_map rather than the CPUs 0 and 18.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/util/stat-display.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > index 588601000f3f..7cfad5cfec38 100644
> > --- a/tools/perf/util/stat-display.c
> > +++ b/tools/perf/util/stat-display.c
> > @@ -516,7 +516,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
> >  static void aggr_update_shadow(struct perf_stat_config *config,
> >                              struct evlist *evlist)
> >  {
> > -     int cpu, s;
> > +     int idx, cpu, s;
> >       struct aggr_cpu_id s2, id;
> >       u64 val;
> >       struct evsel *counter;
> > @@ -525,11 +525,12 @@ static void aggr_update_shadow(struct perf_stat_config *config,
> >               id = config->aggr_map->map[s];
> >               evlist__for_each_entry(evlist, counter) {
> >                       val = 0;
> > -                     for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
> > +                     for (idx = 0; idx < evsel__nr_cpus(counter); idx++) {
> > +                             cpu = perf_cpu_map__cpu(evsel__cpus(counter), idx);
> >                               s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
>
> Hi Ian,
>
> This same pattern of looping over the CPUs and calling aggr_get_id() is used a couple of
> other times. For example in aggr_cb() and first_shadow_cpu(). Do you think these also
> need updating?

Thanks for the feedback James!

For first_shadow_cpu the index is translated to the the cpu via the
cpu map here:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/stat-display.c?h=perf/core#n343
so I think it is sound.

aggr_cb looks to have the same bug as the index is being passed as the cpu:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/stat-display.c?h=perf/core#n643

> Or could we fix it in the aggr_get_id() functions so that they expect an index instead
> of CPU ID and do the conversion themselves? The callbacks do say "idx" rather than "cpu"
> so maybe there is still come confusion.
>
> For example:
>
>         perf_stat__get_die_cached(struct perf_stat_config *config,
>                                         struct perf_cpu_map *map, int idx)

Agreed on the naming confusion. I'm a fan of using single element
structs to get type safety in code like this. I wonder here if a
for_each_cpu on perf_cpu_map would clean this up best. I'll play with
a v2 patch set that addresses this problem more widely.

Thanks,
Ian

> >                               if (!cpu_map__compare_aggr_cpu_id(s2, id))
> >                                       continue;
> > -                             val += perf_counts(counter->counts, cpu, 0)->val;
> > +                             val += perf_counts(counter->counts, idx, 0)->val;
> >                       }
> >                       perf_stat__update_shadow_stats(counter, val,
> >                                       first_shadow_cpu(config, counter, id),
> >

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] perf stat: Fix per socket shadow aggregation
  2021-12-06 22:16   ` Ian Rogers
@ 2021-12-07  8:48     ` Ian Rogers
  0 siblings, 0 replies; 4+ messages in thread
From: Ian Rogers @ 2021-12-07  8:48 UTC (permalink / raw)
  To: James Clark
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	eranian

On Mon, Dec 6, 2021 at 2:16 PM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Dec 6, 2021 at 2:44 AM James Clark <james.clark@arm.com> wrote:
> >
> >
> >
> > On 04/12/2021 02:34, Ian Rogers wrote:
> > > An uncore device may have a CPU mask that specifies one CPU per socket:
> > > $ cat /sys/devices/uncore_imc_0/cpumask
> > > 0,18
> > > The perf_stat_config aggr_map will map a CPU to the socket and other
> > > aggregation values for it. Fix an error where the index into CPU mask
> > > was being used as the index into the aggr_map. For the cpumask above the
> > > indexes 0 and 1 are passed to aggr_map rather than the CPUs 0 and 18.
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > >  tools/perf/util/stat-display.c | 7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > > index 588601000f3f..7cfad5cfec38 100644
> > > --- a/tools/perf/util/stat-display.c
> > > +++ b/tools/perf/util/stat-display.c
> > > @@ -516,7 +516,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
> > >  static void aggr_update_shadow(struct perf_stat_config *config,
> > >                              struct evlist *evlist)
> > >  {
> > > -     int cpu, s;
> > > +     int idx, cpu, s;
> > >       struct aggr_cpu_id s2, id;
> > >       u64 val;
> > >       struct evsel *counter;
> > > @@ -525,11 +525,12 @@ static void aggr_update_shadow(struct perf_stat_config *config,
> > >               id = config->aggr_map->map[s];
> > >               evlist__for_each_entry(evlist, counter) {
> > >                       val = 0;
> > > -                     for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
> > > +                     for (idx = 0; idx < evsel__nr_cpus(counter); idx++) {
> > > +                             cpu = perf_cpu_map__cpu(evsel__cpus(counter), idx);
> > >                               s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
> >
> > Hi Ian,
> >
> > This same pattern of looping over the CPUs and calling aggr_get_id() is used a couple of
> > other times. For example in aggr_cb() and first_shadow_cpu(). Do you think these also
> > need updating?
>
> Thanks for the feedback James!
>
> For first_shadow_cpu the index is translated to the the cpu via the
> cpu map here:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/stat-display.c?h=perf/core#n343
> so I think it is sound.
>
> aggr_cb looks to have the same bug as the index is being passed as the cpu:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/stat-display.c?h=perf/core#n643
>
> > Or could we fix it in the aggr_get_id() functions so that they expect an index instead
> > of CPU ID and do the conversion themselves? The callbacks do say "idx" rather than "cpu"
> > so maybe there is still come confusion.
> >
> > For example:
> >
> >         perf_stat__get_die_cached(struct perf_stat_config *config,
> >                                         struct perf_cpu_map *map, int idx)
>
> Agreed on the naming confusion. I'm a fan of using single element
> structs to get type safety in code like this. I wonder here if a
> for_each_cpu on perf_cpu_map would clean this up best. I'll play with
> a v2 patch set that addresses this problem more widely.

So the real issue is the cpu map for the evlist (populated with all
CPUs) is being indexed from values from the uncore counter's cpu mask
in aggr_update_shadow. Given the easy confusion of index and cpu, I'm
pushing to eliminate the passing of the map and index in v2.

Thanks,
Ian

> Thanks,
> Ian
>
> > >                               if (!cpu_map__compare_aggr_cpu_id(s2, id))
> > >                                       continue;
> > > -                             val += perf_counts(counter->counts, cpu, 0)->val;
> > > +                             val += perf_counts(counter->counts, idx, 0)->val;
> > >                       }
> > >                       perf_stat__update_shadow_stats(counter, val,
> > >                                       first_shadow_cpu(config, counter, id),
> > >

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-12-07  8:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-04  2:34 [PATCH] perf stat: Fix per socket shadow aggregation Ian Rogers
2021-12-06 10:44 ` James Clark
2021-12-06 22:16   ` Ian Rogers
2021-12-07  8:48     ` Ian Rogers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).