linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
@ 2022-09-13 11:57 Athira Rajeev
  2022-09-16 11:31 ` Disha Goel
  2022-09-28 15:35 ` James Clark
  0 siblings, 2 replies; 19+ messages in thread
From: Athira Rajeev @ 2022-09-13 11:57 UTC (permalink / raw)
  To: acme, jolsa, mpe
  Cc: maddy, rnsastry, kjain, linux-perf-users, disgoel, linuxppc-dev

perf stat includes option to specify aggr_mode to display
per-socket, per-core, per-die, per-node counter details.
Also there is option -A ( AGGR_NONE, -no-aggr ), where the
counter values are displayed for each cpu along with "CPU"
value in one field of the output.

Each of the aggregate mode uses the information fetched
from "/sys/devices/system/cpu/cpuX/topology" like core_id,
physical_package_id. Utility functions in "cpumap.c" fetches
this information and populates the socket id, core id, cpu etc.
If the platform does not expose the topology information,
these values will be set to -1. Example, in case of powerpc,
details like physical_package_id is restricted to be exposed
in pSeries platform. So id.socket, id.core, id.cpu all will
be set as -1.

In case of displaying socket or die value, there is no check
done in the "aggr_printout" function to see if it points to
valid socket id or die. But for displaying "cpu" value, there
is a check for "if (id.core > -1)". In case of powerpc pSeries
where detail like physical_package_id is restricted to be
exposed, id.core will be set to -1. Hence the column or field
itself for CPU won't be displayed in the output.

Result for per-socket:

<<>>
perf stat -e branches --per-socket -a true

 Performance counter stats for 'system wide':

S-1      32            416,851      branches
<<>>

Here S has -1 in above result. But with -A option which also
expects CPU in one column in the result, below is observed.

<<>>
 /bin/perf stat -e instructions -A -a true

 Performance counter stats for 'system wide':

            47,146      instructions
            45,226      instructions
            43,354      instructions
            45,184      instructions
<<>>

If the cpu id value is pointing to -1 also, it makes sense
to display the column in the output to replicate the behaviour
or to be in precedence with other aggr options(like per-socket,
per-core). Remove the check "id.core" so that CPU field gets
displayed in the output.

After the fix:

<<>>
perf stat -e instructions -A -a true

 Performance counter stats for 'system wide':

CPU-1                  64,034      instructions
CPU-1                  68,941      instructions
CPU-1                  59,418      instructions
CPU-1                  70,478      instructions
CPU-1                  65,201      instructions
CPU-1                  63,704      instructions
<<>>

This is caught while running "perf test" for
"stat+json_output.sh" and "stat+csv_output.sh".

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 tools/perf/util/stat-display.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index b82844cb0ce7..1b751a730271 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
 					id.socket,
 					id.die,
 					id.core);
-			} else if (id.core > -1) {
+			} else
 				fprintf(config->output, "\"cpu\" : \"%d\", ",
 					id.cpu.cpu);
-			}
 		} else {
 			if (evsel->percore && !config->percore_show_thread) {
 				fprintf(config->output, "S%d-D%d-C%*d%s",
@@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
 					id.die,
 					config->csv_output ? 0 : -3,
 					id.core, config->csv_sep);
-			} else if (id.core > -1) {
+			} else
 				fprintf(config->output, "CPU%*d%s",
 					config->csv_output ? 0 : -7,
 					id.cpu.cpu, config->csv_sep);
-			}
 		}
 		break;
 	case AGGR_THREAD:
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-09-13 11:57 [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value Athira Rajeev
@ 2022-09-16 11:31 ` Disha Goel
  2022-09-28 14:33   ` Athira Rajeev
  2022-09-28 15:35 ` James Clark
  1 sibling, 1 reply; 19+ messages in thread
From: Disha Goel @ 2022-09-16 11:31 UTC (permalink / raw)
  To: Athira Rajeev, acme, jolsa, mpe
  Cc: maddy, rnsastry, kjain, linux-perf-users, disgoel, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3923 bytes --]


On 9/13/22 5:27 PM, Athira Rajeev wrote:
> perf stat includes option to specify aggr_mode to display
> per-socket, per-core, per-die, per-node counter details.
> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
> counter values are displayed for each cpu along with "CPU"
> value in one field of the output.
>
> Each of the aggregate mode uses the information fetched
> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> physical_package_id. Utility functions in "cpumap.c" fetches
> this information and populates the socket id, core id, cpu etc.
> If the platform does not expose the topology information,
> these values will be set to -1. Example, in case of powerpc,
> details like physical_package_id is restricted to be exposed
> in pSeries platform. So id.socket, id.core, id.cpu all will
> be set as -1.
>
> In case of displaying socket or die value, there is no check
> done in the "aggr_printout" function to see if it points to
> valid socket id or die. But for displaying "cpu" value, there
> is a check for "if (id.core > -1)". In case of powerpc pSeries
> where detail like physical_package_id is restricted to be
> exposed, id.core will be set to -1. Hence the column or field
> itself for CPU won't be displayed in the output.
>
> Result for per-socket:
>
> <<>>
> perf stat -e branches --per-socket -a true
>
>   Performance counter stats for 'system wide':
>
> S-1      32            416,851      branches
> <<>>
>
> Here S has -1 in above result. But with -A option which also
> expects CPU in one column in the result, below is observed.
>
> <<>>
>   /bin/perf stat -e instructions -A -a true
>
>   Performance counter stats for 'system wide':
>
>              47,146      instructions
>              45,226      instructions
>              43,354      instructions
>              45,184      instructions
> <<>>
>
> If the cpu id value is pointing to -1 also, it makes sense
> to display the column in the output to replicate the behaviour
> or to be in precedence with other aggr options(like per-socket,
> per-core). Remove the check "id.core" so that CPU field gets
> displayed in the output.
>
> After the fix:
>
> <<>>
> perf stat -e instructions -A -a true
>
>   Performance counter stats for 'system wide':
>
> CPU-1                  64,034      instructions
> CPU-1                  68,941      instructions
> CPU-1                  59,418      instructions
> CPU-1                  70,478      instructions
> CPU-1                  65,201      instructions
> CPU-1                  63,704      instructions
> <<>>
>
> This is caught while running "perf test" for
> "stat+json_output.sh" and "stat+csv_output.sh".
>
> Reported-by: Disha Goel<disgoel@linux.vnet.ibm.com>
> Signed-off-by: Athira Rajeev<atrajeev@linux.vnet.ibm.com>

Tested-by: Disha Goel<disgoel@linux.vnet.ibm.com>

> ---
>   tools/perf/util/stat-display.c | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index b82844cb0ce7..1b751a730271 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
>   					id.socket,
>   					id.die,
>   					id.core);
> -			} else if (id.core > -1) {
> +			} else
>   				fprintf(config->output, "\"cpu\" : \"%d\", ",
>   					id.cpu.cpu);
> -			}
>   		} else {
>   			if (evsel->percore && !config->percore_show_thread) {
>   				fprintf(config->output, "S%d-D%d-C%*d%s",
> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
>   					id.die,
>   					config->csv_output ? 0 : -3,
>   					id.core, config->csv_sep);
> -			} else if (id.core > -1) {
> +			} else
>   				fprintf(config->output, "CPU%*d%s",
>   					config->csv_output ? 0 : -7,
>   					id.cpu.cpu, config->csv_sep);
> -			}
>   		}
>   		break;
>   	case AGGR_THREAD:

[-- Attachment #2: Type: text/html, Size: 4612 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-09-16 11:31 ` Disha Goel
@ 2022-09-28 14:33   ` Athira Rajeev
  0 siblings, 0 replies; 19+ messages in thread
From: Athira Rajeev @ 2022-09-28 14:33 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Michael Ellerman,
	Ian Rogers, Adrian Hunter
  Cc: maddy, Nageswara Sastry, Kajol Jain, linux-perf-users,
	Disha Goel, linuxppc-dev



> On 16-Sep-2022, at 5:01 PM, Disha Goel <disgoel@linux.ibm.com> wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> 
> 
> On 9/13/22 5:27 PM, Athira Rajeev wrote:
>> perf stat includes option to specify aggr_mode to display
>> per-socket, per-core, per-die, per-node counter details.
>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
>> counter values are displayed for each cpu along with "CPU"
>> value in one field of the output.
>> 
>> Each of the aggregate mode uses the information fetched
>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
>> physical_package_id. Utility functions in "cpumap.c" fetches
>> this information and populates the socket id, core id, cpu etc.
>> If the platform does not expose the topology information,
>> these values will be set to -1. Example, in case of powerpc,
>> details like physical_package_id is restricted to be exposed
>> in pSeries platform. So id.socket, id.core, id.cpu all will
>> be set as -1.
>> 
>> In case of displaying socket or die value, there is no check
>> done in the "aggr_printout" function to see if it points to
>> valid socket id or die. But for displaying "cpu" value, there
>> is a check for "if (id.core > -1)". In case of powerpc pSeries
>> where detail like physical_package_id is restricted to be
>> exposed, id.core will be set to -1. Hence the column or field
>> itself for CPU won't be displayed in the output.
>> 
>> Result for per-socket:
>> 
>> <<>>
>> perf stat -e branches --per-socket -a true
>> 
>>  Performance counter stats for 'system wide':
>> 
>> S-1      32            416,851      branches
>> <<>>
>> 
>> Here S has -1 in above result. But with -A option which also
>> expects CPU in one column in the result, below is observed.
>> 
>> <<>>
>>  /bin/perf stat -e instructions -A -a true
>> 
>>  Performance counter stats for 'system wide':
>> 
>>             47,146      instructions
>>             45,226      instructions
>>             43,354      instructions
>>             45,184      instructions
>> <<>>
>> 
>> If the cpu id value is pointing to -1 also, it makes sense
>> to display the column in the output to replicate the behaviour
>> or to be in precedence with other aggr options(like per-socket,
>> per-core). Remove the check "id.core" so that CPU field gets
>> displayed in the output.
>> 
>> After the fix:
>> 
>> <<>>
>> perf stat -e instructions -A -a true
>> 
>>  Performance counter stats for 'system wide':
>> 
>> CPU-1                  64,034      instructions
>> CPU-1                  68,941      instructions
>> CPU-1                  59,418      instructions
>> CPU-1                  70,478      instructions
>> CPU-1                  65,201      instructions
>> CPU-1                  63,704      instructions
>> <<>>
>> 
>> This is caught while running "perf test" for
>> "stat+json_output.sh" and "stat+csv_output.sh".
>> 
>> Reported-by: Disha Goel 
>> <disgoel@linux.vnet.ibm.com>
>> 
>> Signed-off-by: Athira Rajeev 
>> <atrajeev@linux.vnet.ibm.com>
> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>

Hi,

Looking for review comments for this change.

Thanks
Athira
>> ---
>>  tools/perf/util/stat-display.c | 6 ++----
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>> index b82844cb0ce7..1b751a730271 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
>>  					id.socket,
>>  					id.die,
>>  					id.core);
>> -			} else if (id.core > -1) {
>> +			} else
>>  				fprintf(config->output, "\"cpu\" : \"%d\", ",
>>  					id.cpu.cpu);
>> -			}
>>  		} else {
>>  			if (evsel->percore && !config->percore_show_thread) {
>>  				fprintf(config->output, "S%d-D%d-C%*d%s",
>> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
>>  					id.die,
>>  					config->csv_output ? 0 : -3,
>>  					id.core, config->csv_sep);
>> -			} else if (id.core > -1) {
>> +			} else
>>  				fprintf(config->output, "CPU%*d%s",
>>  					config->csv_output ? 0 : -7,
>>  					id.cpu.cpu, config->csv_sep);
>> -			}
>>  		}
>>  		break;
>>  	case AGGR_THREAD:
>> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-09-13 11:57 [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value Athira Rajeev
  2022-09-16 11:31 ` Disha Goel
@ 2022-09-28 15:35 ` James Clark
  2022-09-29  8:49   ` Athira Rajeev
  1 sibling, 1 reply; 19+ messages in thread
From: James Clark @ 2022-09-28 15:35 UTC (permalink / raw)
  To: Athira Rajeev, Ian Rogers
  Cc: maddy, rnsastry, kjain, acme, linux-perf-users, jolsa, disgoel,
	linuxppc-dev



On 13/09/2022 12:57, Athira Rajeev wrote:
> perf stat includes option to specify aggr_mode to display
> per-socket, per-core, per-die, per-node counter details.
> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
> counter values are displayed for each cpu along with "CPU"
> value in one field of the output.
> 
> Each of the aggregate mode uses the information fetched
> from "/sys/devices/system/cpu/cpuX/topology" like core_id,

I thought that this wouldn't apply to the cpu field because cpu is
basically interchangeable as an index in cpumap, rather than anything
being read from the topology file.

> physical_package_id. Utility functions in "cpumap.c" fetches
> this information and populates the socket id, core id, cpu etc.
> If the platform does not expose the topology information,
> these values will be set to -1. Example, in case of powerpc,
> details like physical_package_id is restricted to be exposed
> in pSeries platform. So id.socket, id.core, id.cpu all will
> be set as -1.
> 
> In case of displaying socket or die value, there is no check
> done in the "aggr_printout" function to see if it points to
> valid socket id or die. But for displaying "cpu" value, there
> is a check for "if (id.core > -1)". In case of powerpc pSeries
> where detail like physical_package_id is restricted to be
> exposed, id.core will be set to -1. Hence the column or field
> itself for CPU won't be displayed in the output.
> 
> Result for per-socket:
> 
> <<>>
> perf stat -e branches --per-socket -a true
> 
>  Performance counter stats for 'system wide':
> 
> S-1      32            416,851      branches
> <<>>
> 
> Here S has -1 in above result. But with -A option which also
> expects CPU in one column in the result, below is observed.
> 
> <<>>
>  /bin/perf stat -e instructions -A -a true
> 
>  Performance counter stats for 'system wide':
> 
>             47,146      instructions
>             45,226      instructions
>             43,354      instructions
>             45,184      instructions
> <<>>
> 
> If the cpu id value is pointing to -1 also, it makes sense
> to display the column in the output to replicate the behaviour
> or to be in precedence with other aggr options(like per-socket,
> per-core). Remove the check "id.core" so that CPU field gets
> displayed in the output.

Why would you want to print -1 out? Seems like the if statement was a
good one to me, otherwise the output looks a bit broken to users. Are
the other aggregation modes even working if -1 is set for socket and
die? Maybe we need to not print -1 in those cases or exit earlier with a
failure.

The -1 value has a specific internal meaning which is "to not
aggregate". It doesn't mean "not set".

> 
> After the fix:
> 
> <<>>
> perf stat -e instructions -A -a true
> 
>  Performance counter stats for 'system wide':
> 
> CPU-1                  64,034      instructions
> CPU-1                  68,941      instructions
> CPU-1                  59,418      instructions
> CPU-1                  70,478      instructions
> CPU-1                  65,201      instructions
> CPU-1                  63,704      instructions
> <<>>
> 
> This is caught while running "perf test" for
> "stat+json_output.sh" and "stat+csv_output.sh".

Is it possible to fix the issue by making the tests cope with the lack
of the CPU id?

> 
> Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  tools/perf/util/stat-display.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index b82844cb0ce7..1b751a730271 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
>  					id.socket,
>  					id.die,
>  					id.core);
> -			} else if (id.core > -1) {
> +			} else

This should have been "id.cpu.cpu > -1". Looks like it was changed by
some kind of bad merge or rebase in df936cadfb because there is no
obvious justification for the change to .core in that commit.

>  				fprintf(config->output, "\"cpu\" : \"%d\", ",
>  					id.cpu.cpu);
> -			}
>  		} else {
>  			if (evsel->percore && !config->percore_show_thread) {
>  				fprintf(config->output, "S%d-D%d-C%*d%s",
> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
>  					id.die,
>  					config->csv_output ? 0 : -3,
>  					id.core, config->csv_sep);
> -			} else if (id.core > -1) {
> +			} else
>  				fprintf(config->output, "CPU%*d%s",
>  					config->csv_output ? 0 : -7,
>  					id.cpu.cpu, config->csv_sep);
> -			}
>  		}
>  		break;
>  	case AGGR_THREAD:

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-09-28 15:35 ` James Clark
@ 2022-09-29  8:49   ` Athira Rajeev
  2022-09-29 12:56     ` James Clark
  0 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2022-09-29  8:49 UTC (permalink / raw)
  To: James Clark
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	Arnaldo Carvalho de Melo, linux-perf-users, Jiri Olsa,
	Disha Goel, linuxppc-dev



> On 28-Sep-2022, at 9:05 PM, James Clark <james.clark@arm.com> wrote:
> 
> 
> 

Hi James,

Thanks for looking at the patch and sharing review comments.

> On 13/09/2022 12:57, Athira Rajeev wrote:
>> perf stat includes option to specify aggr_mode to display
>> per-socket, per-core, per-die, per-node counter details.
>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
>> counter values are displayed for each cpu along with "CPU"
>> value in one field of the output.
>> 
>> Each of the aggregate mode uses the information fetched
>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> 
> I thought that this wouldn't apply to the cpu field because cpu is
> basically interchangeable as an index in cpumap, rather than anything
> being read from the topology file.

The cpu value is filled in this function:

Function : aggr_cpu_id__cpu
Code: util/cpumap.c

> 
>> physical_package_id. Utility functions in "cpumap.c" fetches
>> this information and populates the socket id, core id, cpu etc.
>> If the platform does not expose the topology information,
>> these values will be set to -1. Example, in case of powerpc,
>> details like physical_package_id is restricted to be exposed
>> in pSeries platform. So id.socket, id.core, id.cpu all will
>> be set as -1.
>> 
>> In case of displaying socket or die value, there is no check
>> done in the "aggr_printout" function to see if it points to
>> valid socket id or die. But for displaying "cpu" value, there
>> is a check for "if (id.core > -1)". In case of powerpc pSeries
>> where detail like physical_package_id is restricted to be
>> exposed, id.core will be set to -1. Hence the column or field
>> itself for CPU won't be displayed in the output.
>> 
>> Result for per-socket:
>> 
>> <<>>
>> perf stat -e branches --per-socket -a true
>> 
>> Performance counter stats for 'system wide':
>> 
>> S-1      32            416,851      branches
>> <<>>
>> 
>> Here S has -1 in above result. But with -A option which also
>> expects CPU in one column in the result, below is observed.
>> 
>> <<>>
>> /bin/perf stat -e instructions -A -a true
>> 
>> Performance counter stats for 'system wide':
>> 
>>            47,146      instructions
>>            45,226      instructions
>>            43,354      instructions
>>            45,184      instructions
>> <<>>
>> 
>> If the cpu id value is pointing to -1 also, it makes sense
>> to display the column in the output to replicate the behaviour
>> or to be in precedence with other aggr options(like per-socket,
>> per-core). Remove the check "id.core" so that CPU field gets
>> displayed in the output.
> 
> Why would you want to print -1 out? Seems like the if statement was a
> good one to me, otherwise the output looks a bit broken to users. Are
> the other aggregation modes even working if -1 is set for socket and
> die? Maybe we need to not print -1 in those cases or exit earlier with a
> failure.
> 
> The -1 value has a specific internal meaning which is "to not
> aggregate". It doesn't mean "not set".

Currently, this check is done only for printing cpu value.
For socket/die/core values, this check is not done. Pasting an
example snippet from a powerpc system ( specifically from pseries platform where
the value is set to -1 )

./perf stat --per-core -a -C 1 true

 Performance counter stats for 'system wide':

S-1-D-1-C-1          1               1.06 msec cpu-clock                        #    1.018 CPUs utilized          
S-1-D-1-C-1          1                  2      context-switches                 #    1.879 K/sec                  
S-1-D-1-C-1          1                  0      cpu-migrations                   #    0.000 /sec                   

Here though the value is -1, we are displaying it. Where as in case of cpu, the first column will be
empty since we do a check before printing. 

Example:

./perf stat --per-core -A -C 1 true

 Performance counter stats for 'CPU(s) 1':

              0.88 msec cpu-clock                        #    1.022 CPUs utilized          
                 2      context-switches                                                   
                 0      cpu-migrations                                                     


No sure, whether there are scripts out there, which consume the current format and
not displaying -1 may break it. That is why we tried with change to remove check for cpu, similar to
other modes like socket, die, core etc.

Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the
values to -1 . I was checking to see where we are mapping -1 to “to not aggregate”.
What I could find is AGGR_NONE ( which is for no-aggr ) has value as zero.

Reference: defined in util/stat.h

enum aggr_mode {
        AGGR_NONE,

James, can you point me to reference for that meaning if I have missed anything.

Thanks
Athira

> 
>> 
>> After the fix:
>> 
>> <<>>
>> perf stat -e instructions -A -a true
>> 
>> Performance counter stats for 'system wide':
>> 
>> CPU-1                  64,034      instructions
>> CPU-1                  68,941      instructions
>> CPU-1                  59,418      instructions
>> CPU-1                  70,478      instructions
>> CPU-1                  65,201      instructions
>> CPU-1                  63,704      instructions
>> <<>>
>> 
>> This is caught while running "perf test" for
>> "stat+json_output.sh" and "stat+csv_output.sh".
> 
> Is it possible to fix the issue by making the tests cope with the lack
> of the CPU id?
> 
>> 
>> Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> ---
>> tools/perf/util/stat-display.c | 6 ++----
>> 1 file changed, 2 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>> index b82844cb0ce7..1b751a730271 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
>> 					id.socket,
>> 					id.die,
>> 					id.core);
>> -			} else if (id.core > -1) {
>> +			} else
> 
> This should have been "id.cpu.cpu > -1". Looks like it was changed by
> some kind of bad merge or rebase in df936cadfb because there is no
> obvious justification for the change to .core in that commit.

> 
>> 				fprintf(config->output, "\"cpu\" : \"%d\", ",
>> 					id.cpu.cpu);
>> -			}
>> 		} else {
>> 			if (evsel->percore && !config->percore_show_thread) {
>> 				fprintf(config->output, "S%d-D%d-C%*d%s",
>> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
>> 					id.die,
>> 					config->csv_output ? 0 : -3,
>> 					id.core, config->csv_sep);
>> -			} else if (id.core > -1) {
>> +			} else
>> 				fprintf(config->output, "CPU%*d%s",
>> 					config->csv_output ? 0 : -7,
>> 					id.cpu.cpu, config->csv_sep);
>> -			}
>> 		}
>> 		break;
>> 	case AGGR_THREAD:


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-09-29  8:49   ` Athira Rajeev
@ 2022-09-29 12:56     ` James Clark
  2022-10-01 23:47       ` Ian Rogers
  0 siblings, 1 reply; 19+ messages in thread
From: James Clark @ 2022-09-29 12:56 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	Arnaldo Carvalho de Melo, linux-perf-users, Jiri Olsa,
	Disha Goel, linuxppc-dev



On 29/09/2022 09:49, Athira Rajeev wrote:
> 
> 
>> On 28-Sep-2022, at 9:05 PM, James Clark <james.clark@arm.com> wrote:
>>
>>
>>
> 
> Hi James,
> 
> Thanks for looking at the patch and sharing review comments.
> 
>> On 13/09/2022 12:57, Athira Rajeev wrote:
>>> perf stat includes option to specify aggr_mode to display
>>> per-socket, per-core, per-die, per-node counter details.
>>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
>>> counter values are displayed for each cpu along with "CPU"
>>> value in one field of the output.
>>>
>>> Each of the aggregate mode uses the information fetched
>>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
>>
>> I thought that this wouldn't apply to the cpu field because cpu is
>> basically interchangeable as an index in cpumap, rather than anything
>> being read from the topology file.
> 
> The cpu value is filled in this function:
> 
> Function : aggr_cpu_id__cpu
> Code: util/cpumap.c
> 
>>
>>> physical_package_id. Utility functions in "cpumap.c" fetches
>>> this information and populates the socket id, core id, cpu etc.
>>> If the platform does not expose the topology information,
>>> these values will be set to -1. Example, in case of powerpc,
>>> details like physical_package_id is restricted to be exposed
>>> in pSeries platform. So id.socket, id.core, id.cpu all will
>>> be set as -1.
>>>
>>> In case of displaying socket or die value, there is no check
>>> done in the "aggr_printout" function to see if it points to
>>> valid socket id or die. But for displaying "cpu" value, there
>>> is a check for "if (id.core > -1)". In case of powerpc pSeries
>>> where detail like physical_package_id is restricted to be
>>> exposed, id.core will be set to -1. Hence the column or field
>>> itself for CPU won't be displayed in the output.
>>>
>>> Result for per-socket:
>>>
>>> <<>>
>>> perf stat -e branches --per-socket -a true
>>>
>>> Performance counter stats for 'system wide':
>>>
>>> S-1      32            416,851      branches
>>> <<>>
>>>
>>> Here S has -1 in above result. But with -A option which also
>>> expects CPU in one column in the result, below is observed.
>>>
>>> <<>>
>>> /bin/perf stat -e instructions -A -a true
>>>
>>> Performance counter stats for 'system wide':
>>>
>>>            47,146      instructions
>>>            45,226      instructions
>>>            43,354      instructions
>>>            45,184      instructions
>>> <<>>
>>>
>>> If the cpu id value is pointing to -1 also, it makes sense
>>> to display the column in the output to replicate the behaviour
>>> or to be in precedence with other aggr options(like per-socket,
>>> per-core). Remove the check "id.core" so that CPU field gets
>>> displayed in the output.
>>
>> Why would you want to print -1 out? Seems like the if statement was a
>> good one to me, otherwise the output looks a bit broken to users. Are
>> the other aggregation modes even working if -1 is set for socket and
>> die? Maybe we need to not print -1 in those cases or exit earlier with a
>> failure.
>>
>> The -1 value has a specific internal meaning which is "to not
>> aggregate". It doesn't mean "not set".
> 
> Currently, this check is done only for printing cpu value.
> For socket/die/core values, this check is not done. Pasting an
> example snippet from a powerpc system ( specifically from pseries platform where
> the value is set to -1 )
> 
> ./perf stat --per-core -a -C 1 true
> 
>  Performance counter stats for 'system wide':
> 
> S-1-D-1-C-1          1               1.06 msec cpu-clock                        #    1.018 CPUs utilized          
> S-1-D-1-C-1          1                  2      context-switches                 #    1.879 K/sec                  
> S-1-D-1-C-1          1                  0      cpu-migrations                   #    0.000 /sec                   
> 
> Here though the value is -1, we are displaying it. Where as in case of cpu, the first column will be
> empty since we do a check before printing. 
> 
> Example:
> 
> ./perf stat --per-core -A -C 1 true
> 
>  Performance counter stats for 'CPU(s) 1':
> 
>               0.88 msec cpu-clock                        #    1.022 CPUs utilized          
>                  2      context-switches                                                   
>                  0      cpu-migrations                                                     
> 
> 
> No sure, whether there are scripts out there, which consume the current format and
> not displaying -1 may break it. That is why we tried with change to remove check for cpu, similar to
> other modes like socket, die, core etc.

I wouldn't worry about that because there are json and CSV modes which
are machine readable, and -1 is already not always displayed. If
anything this change here is also likely to break parsing by adding -1
where it wasn't before.

> 
> Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the
> values to -1 . I was checking to see where we are mapping -1 to “to not aggregate”.
> What I could find is AGGR_NONE ( which is for no-aggr ) has value as zero.
> 
> Reference: defined in util/stat.h
> 
> enum aggr_mode {
>         AGGR_NONE,
> 

That enum is never written to any of the cpumap members, that defines
the mode of how to fill the cpu map instead. 0 is a valid value, for
example "CPU 0". -1 is used as a special case and shouldn't be displayed
IMO.

Did you see my comment in the code below about the bad merge? Could that
not be related to your issue?

Or the one about fixing it in the test instead? Or failing early if the
topology can't be read?

I'm still not convinced that any of the modes where -1 is printed are
even working properly so it might be best to fix that rather than just
the printout.

> James, can you point me to reference for that meaning if I have missed anything.

It's here:

  /** Identify where counts are aggregated, -1 implies not to aggregate. */
  struct aggr_cpu_id {

> 
> Thanks
> Athira
> 
>>
>>>
>>> After the fix:
>>>
>>> <<>>
>>> perf stat -e instructions -A -a true
>>>
>>> Performance counter stats for 'system wide':
>>>
>>> CPU-1                  64,034      instructions
>>> CPU-1                  68,941      instructions
>>> CPU-1                  59,418      instructions
>>> CPU-1                  70,478      instructions
>>> CPU-1                  65,201      instructions
>>> CPU-1                  63,704      instructions
>>> <<>>
>>>
>>> This is caught while running "perf test" for
>>> "stat+json_output.sh" and "stat+csv_output.sh".
>>
>> Is it possible to fix the issue by making the tests cope with the lack
>> of the CPU id?
>>
>>>
>>> Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
>>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>>> ---
>>> tools/perf/util/stat-display.c | 6 ++----
>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>>> index b82844cb0ce7..1b751a730271 100644
>>> --- a/tools/perf/util/stat-display.c
>>> +++ b/tools/perf/util/stat-display.c
>>> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
>>> 					id.socket,
>>> 					id.die,
>>> 					id.core);
>>> -			} else if (id.core > -1) {
>>> +			} else
>>
>> This should have been "id.cpu.cpu > -1". Looks like it was changed by
>> some kind of bad merge or rebase in df936cadfb because there is no
>> obvious justification for the change to .core in that commit.
> 
>>
>>> 				fprintf(config->output, "\"cpu\" : \"%d\", ",
>>> 					id.cpu.cpu);
>>> -			}
>>> 		} else {
>>> 			if (evsel->percore && !config->percore_show_thread) {
>>> 				fprintf(config->output, "S%d-D%d-C%*d%s",
>>> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
>>> 					id.die,
>>> 					config->csv_output ? 0 : -3,
>>> 					id.core, config->csv_sep);
>>> -			} else if (id.core > -1) {
>>> +			} else
>>> 				fprintf(config->output, "CPU%*d%s",
>>> 					config->csv_output ? 0 : -7,
>>> 					id.cpu.cpu, config->csv_sep);
>>> -			}
>>> 		}
>>> 		break;
>>> 	case AGGR_THREAD:
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-09-29 12:56     ` James Clark
@ 2022-10-01 23:47       ` Ian Rogers
       [not found]         ` <993a1391ee931e859d972c460644d171@imap.linux.ibm.com>
  0 siblings, 1 reply; 19+ messages in thread
From: Ian Rogers @ 2022-10-01 23:47 UTC (permalink / raw)
  To: James Clark
  Cc: Athira Rajeev, Nageswara Sastry, Kajol Jain,
	Arnaldo Carvalho de Melo, linux-perf-users, maddy, Jiri Olsa,
	Disha Goel, linuxppc-dev

On Thu, Sep 29, 2022 at 5:56 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 29/09/2022 09:49, Athira Rajeev wrote:
> >
> >
> >> On 28-Sep-2022, at 9:05 PM, James Clark <james.clark@arm.com> wrote:
> >>
> >>
> >>
> >
> > Hi James,
> >
> > Thanks for looking at the patch and sharing review comments.
> >
> >> On 13/09/2022 12:57, Athira Rajeev wrote:
> >>> perf stat includes option to specify aggr_mode to display
> >>> per-socket, per-core, per-die, per-node counter details.
> >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
> >>> counter values are displayed for each cpu along with "CPU"
> >>> value in one field of the output.
> >>>
> >>> Each of the aggregate mode uses the information fetched
> >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> >>
> >> I thought that this wouldn't apply to the cpu field because cpu is
> >> basically interchangeable as an index in cpumap, rather than anything
> >> being read from the topology file.
> >
> > The cpu value is filled in this function:
> >
> > Function : aggr_cpu_id__cpu
> > Code: util/cpumap.c
> >
> >>
> >>> physical_package_id. Utility functions in "cpumap.c" fetches
> >>> this information and populates the socket id, core id, cpu etc.
> >>> If the platform does not expose the topology information,
> >>> these values will be set to -1. Example, in case of powerpc,
> >>> details like physical_package_id is restricted to be exposed
> >>> in pSeries platform. So id.socket, id.core, id.cpu all will
> >>> be set as -1.
> >>>
> >>> In case of displaying socket or die value, there is no check
> >>> done in the "aggr_printout" function to see if it points to
> >>> valid socket id or die. But for displaying "cpu" value, there
> >>> is a check for "if (id.core > -1)". In case of powerpc pSeries
> >>> where detail like physical_package_id is restricted to be
> >>> exposed, id.core will be set to -1. Hence the column or field
> >>> itself for CPU won't be displayed in the output.
> >>>
> >>> Result for per-socket:
> >>>
> >>> <<>>
> >>> perf stat -e branches --per-socket -a true
> >>>
> >>> Performance counter stats for 'system wide':
> >>>
> >>> S-1      32            416,851      branches
> >>> <<>>
> >>>
> >>> Here S has -1 in above result. But with -A option which also
> >>> expects CPU in one column in the result, below is observed.
> >>>
> >>> <<>>
> >>> /bin/perf stat -e instructions -A -a true
> >>>
> >>> Performance counter stats for 'system wide':
> >>>
> >>>            47,146      instructions
> >>>            45,226      instructions
> >>>            43,354      instructions
> >>>            45,184      instructions
> >>> <<>>
> >>>
> >>> If the cpu id value is pointing to -1 also, it makes sense
> >>> to display the column in the output to replicate the behaviour
> >>> or to be in precedence with other aggr options(like per-socket,
> >>> per-core). Remove the check "id.core" so that CPU field gets
> >>> displayed in the output.
> >>
> >> Why would you want to print -1 out? Seems like the if statement was a
> >> good one to me, otherwise the output looks a bit broken to users. Are
> >> the other aggregation modes even working if -1 is set for socket and
> >> die? Maybe we need to not print -1 in those cases or exit earlier with a
> >> failure.
> >>
> >> The -1 value has a specific internal meaning which is "to not
> >> aggregate". It doesn't mean "not set".
> >
> > Currently, this check is done only for printing cpu value.
> > For socket/die/core values, this check is not done. Pasting an
> > example snippet from a powerpc system ( specifically from pseries platform where
> > the value is set to -1 )
> >
> > ./perf stat --per-core -a -C 1 true
> >
> >  Performance counter stats for 'system wide':
> >
> > S-1-D-1-C-1          1               1.06 msec cpu-clock                        #    1.018 CPUs utilized
> > S-1-D-1-C-1          1                  2      context-switches                 #    1.879 K/sec
> > S-1-D-1-C-1          1                  0      cpu-migrations                   #    0.000 /sec
> >
> > Here though the value is -1, we are displaying it. Where as in case of cpu, the first column will be
> > empty since we do a check before printing.
> >
> > Example:
> >
> > ./perf stat --per-core -A -C 1 true
> >
> >  Performance counter stats for 'CPU(s) 1':
> >
> >               0.88 msec cpu-clock                        #    1.022 CPUs utilized
> >                  2      context-switches
> >                  0      cpu-migrations
> >
> >
> > No sure, whether there are scripts out there, which consume the current format and
> > not displaying -1 may break it. That is why we tried with change to remove check for cpu, similar to
> > other modes like socket, die, core etc.
>
> I wouldn't worry about that because there are json and CSV modes which
> are machine readable, and -1 is already not always displayed. If
> anything this change here is also likely to break parsing by adding -1
> where it wasn't before.
>
> >
> > Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the
> > values to -1 . I was checking to see where we are mapping -1 to “to not aggregate”.
> > What I could find is AGGR_NONE ( which is for no-aggr ) has value as zero.
> >
> > Reference: defined in util/stat.h
> >
> > enum aggr_mode {
> >         AGGR_NONE,
> >
>
> That enum is never written to any of the cpumap members, that defines
> the mode of how to fill the cpu map instead. 0 is a valid value, for
> example "CPU 0". -1 is used as a special case and shouldn't be displayed
> IMO.
>
> Did you see my comment in the code below about the bad merge? Could that
> not be related to your issue?

I'm suspicious of this too. In Claire's patch:

        case AGGR_NONE:
-               if (evsel->percore && !config->percore_show_thread) {
-                       fprintf(config->output, "S%d-D%d-C%*d%s",
-                               id.socket,
-                               id.die,
-                               config->csv_output ? 0 : -3,
-                               id.core, config->csv_sep);
-               } else if (id.cpu.cpu > -1) {
-                       fprintf(config->output, "CPU%*d%s",
-                               config->csv_output ? 0 : -7,
-                               id.cpu.cpu, config->csv_sep);
+               if (config->json_output) {
+                       if (evsel->percore && !config->percore_show_thread) {
+                               fprintf(config->output, "\"core\" :
\"S%d-D%d-C%d\"",
+                                       id.socket,
+                                       id.die,
+                                       id.core);
+                       } else if (id.core > -1) {
+                               fprintf(config->output, "\"cpu\" : \"%d\", ",
+                                       id.cpu.cpu);
+                       }
+               } else {
+                       if (evsel->percore && !config->percore_show_thread) {
+                               fprintf(config->output, "S%d-D%d-C%*d%s",
+                                       id.socket,
+                                       id.die,
+                                       config->csv_output ? 0 : -3,
+                                       id.core, config->csv_sep);
+                       } else if (id.core > -1) {
+                               fprintf(config->output, "CPU%*d%s",
+                                       config->csv_output ? 0 : -7,
+                                       id.cpu.cpu, config->csv_sep);
+                       }
               }
               break;

The old code was using "id.cpu.cpu > -1" while the new code is
"id.core > -1". The value printed is id.cpu.cpu and so testing id.core
makes less sense to me. Going back to the original patch:

https://lore.kernel.org/lkml/20210811224317.1811618-1-cjense@google.com/
  case AGGR_NONE:
- if (evsel->percore && !config->percore_show_thread) {
- fprintf(config->output, "S%d-D%d-C%*d%s",
- id.socket,
- id.die,
- config->csv_output ? 0 : -3,
- id.core, config->csv_sep);
+ if (config->json_output) {
+ if (evsel->percore && !config->percore_show_thread) {
+ fprintf(config->output, "\"core\" : \"S%d-D%d-C%d\"",
+ id.socket,
+ id.die,
+ id.core);
+ } else if (id.core > -1) {
+ fprintf(config->output, "\"cpu\" : \"%d\", ",
+ evsel__cpus(evsel)->map[id.core]);
+ }
+ } else {
+ if (evsel->percore && !config->percore_show_thread) {
+ fprintf(config->output, "S%d-D%d-C%*d%s",
+ id.socket,
+ id.die,
+ config->csv_output ? 0 : -3,
+ id.core, config->csv_sep);
  } else if (id.core > -1) {
  fprintf(config->output, "CPU%*d%s",
  config->csv_output ? 0 : -7,
  evsel__cpus(evsel)->map[id.core],
  config->csv_sep);
- }
+ }
+ }
+
  break;

So testing the id.core isn't a bad index makes sense. However, we
changed from core to CPU here:
https://lore.kernel.org/all/20220105061351.120843-26-irogers@google.com/
and that was because of:
https://lore.kernel.org/r/20220105061351.120843-25-irogers@google.com

So I think the code needs to test CPU and not core. Whether that is
addressing the Power test failures is another matter, as James said we
may need a fix in the tests for that.

Thanks,
Ian

> Or the one about fixing it in the test instead? Or failing early if the
> topology can't be read?
>
> I'm still not convinced that any of the modes where -1 is printed are
> even working properly so it might be best to fix that rather than just
> the printout.
>
> > James, can you point me to reference for that meaning if I have missed anything.
>
> It's here:
>
>   /** Identify where counts are aggregated, -1 implies not to aggregate. */
>   struct aggr_cpu_id {
>
> >
> > Thanks
> > Athira
> >
> >>
> >>>
> >>> After the fix:
> >>>
> >>> <<>>
> >>> perf stat -e instructions -A -a true
> >>>
> >>> Performance counter stats for 'system wide':
> >>>
> >>> CPU-1                  64,034      instructions
> >>> CPU-1                  68,941      instructions
> >>> CPU-1                  59,418      instructions
> >>> CPU-1                  70,478      instructions
> >>> CPU-1                  65,201      instructions
> >>> CPU-1                  63,704      instructions
> >>> <<>>
> >>>
> >>> This is caught while running "perf test" for
> >>> "stat+json_output.sh" and "stat+csv_output.sh".
> >>
> >> Is it possible to fix the issue by making the tests cope with the lack
> >> of the CPU id?
> >>
> >>>
> >>> Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
> >>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> >>> ---
> >>> tools/perf/util/stat-display.c | 6 ++----
> >>> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> >>> index b82844cb0ce7..1b751a730271 100644
> >>> --- a/tools/perf/util/stat-display.c
> >>> +++ b/tools/perf/util/stat-display.c
> >>> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
> >>>                                     id.socket,
> >>>                                     id.die,
> >>>                                     id.core);
> >>> -                   } else if (id.core > -1) {
> >>> +                   } else
> >>
> >> This should have been "id.cpu.cpu > -1". Looks like it was changed by
> >> some kind of bad merge or rebase in df936cadfb because there is no
> >> obvious justification for the change to .core in that commit.
> >
> >>
> >>>                             fprintf(config->output, "\"cpu\" : \"%d\", ",
> >>>                                     id.cpu.cpu);
> >>> -                   }
> >>>             } else {
> >>>                     if (evsel->percore && !config->percore_show_thread) {
> >>>                             fprintf(config->output, "S%d-D%d-C%*d%s",
> >>> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
> >>>                                     id.die,
> >>>                                     config->csv_output ? 0 : -3,
> >>>                                     id.core, config->csv_sep);
> >>> -                   } else if (id.core > -1) {
> >>> +                   } else
> >>>                             fprintf(config->output, "CPU%*d%s",
> >>>                                     config->csv_output ? 0 : -7,
> >>>                                     id.cpu.cpu, config->csv_sep);
> >>> -                   }
> >>>             }
> >>>             break;
> >>>     case AGGR_THREAD:
> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
       [not found]         ` <993a1391ee931e859d972c460644d171@imap.linux.ibm.com>
@ 2022-10-03 18:51           ` Ian Rogers
  2022-10-04  7:06             ` Athira Rajeev
  0 siblings, 1 reply; 19+ messages in thread
From: Ian Rogers @ 2022-10-03 18:51 UTC (permalink / raw)
  To: atrajeev
  Cc: Athira Rajeev, Nageswara Sastry, Kajol Jain,
	Arnaldo Carvalho de Melo, linux-perf-users, maddy, james.clark,
	Jiri Olsa, Disha Goel, linuxppc-dev

On Mon, Oct 3, 2022 at 7:03 AM atrajeev <atrajeev@imap.linux.ibm.com> wrote:
>
> On 2022-10-02 05:17, Ian Rogers wrote:
> > On Thu, Sep 29, 2022 at 5:56 AM James Clark <james.clark@arm.com>
> > wrote:
> >>
> >>
> >>
> >> On 29/09/2022 09:49, Athira Rajeev wrote:
> >> >
> >> >
> >> >> On 28-Sep-2022, at 9:05 PM, James Clark <james.clark@arm.com> wrote:
> >> >>
> >> >>
> >> >>
> >> >
> >> > Hi James,
> >> >
> >> > Thanks for looking at the patch and sharing review comments.
> >> >
> >> >> On 13/09/2022 12:57, Athira Rajeev wrote:
> >> >>> perf stat includes option to specify aggr_mode to display
> >> >>> per-socket, per-core, per-die, per-node counter details.
> >> >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
> >> >>> counter values are displayed for each cpu along with "CPU"
> >> >>> value in one field of the output.
> >> >>>
> >> >>> Each of the aggregate mode uses the information fetched
> >> >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> >> >>
> >> >> I thought that this wouldn't apply to the cpu field because cpu is
> >> >> basically interchangeable as an index in cpumap, rather than anything
> >> >> being read from the topology file.
> >> >
> >> > The cpu value is filled in this function:
> >> >
> >> > Function : aggr_cpu_id__cpu
> >> > Code: util/cpumap.c
> >> >
> >> >>
> >> >>> physical_package_id. Utility functions in "cpumap.c" fetches
> >> >>> this information and populates the socket id, core id, cpu etc.
> >> >>> If the platform does not expose the topology information,
> >> >>> these values will be set to -1. Example, in case of powerpc,
> >> >>> details like physical_package_id is restricted to be exposed
> >> >>> in pSeries platform. So id.socket, id.core, id.cpu all will
> >> >>> be set as -1.
> >> >>>
> >> >>> In case of displaying socket or die value, there is no check
> >> >>> done in the "aggr_printout" function to see if it points to
> >> >>> valid socket id or die. But for displaying "cpu" value, there
> >> >>> is a check for "if (id.core > -1)". In case of powerpc pSeries
> >> >>> where detail like physical_package_id is restricted to be
> >> >>> exposed, id.core will be set to -1. Hence the column or field
> >> >>> itself for CPU won't be displayed in the output.
> >> >>>
> >> >>> Result for per-socket:
> >> >>>
> >> >>> <<>>
> >> >>> perf stat -e branches --per-socket -a true
> >> >>>
> >> >>> Performance counter stats for 'system wide':
> >> >>>
> >> >>> S-1      32            416,851      branches
> >> >>> <<>>
> >> >>>
> >> >>> Here S has -1 in above result. But with -A option which also
> >> >>> expects CPU in one column in the result, below is observed.
> >> >>>
> >> >>> <<>>
> >> >>> /bin/perf stat -e instructions -A -a true
> >> >>>
> >> >>> Performance counter stats for 'system wide':
> >> >>>
> >> >>>            47,146      instructions
> >> >>>            45,226      instructions
> >> >>>            43,354      instructions
> >> >>>            45,184      instructions
> >> >>> <<>>
> >> >>>
> >> >>> If the cpu id value is pointing to -1 also, it makes sense
> >> >>> to display the column in the output to replicate the behaviour
> >> >>> or to be in precedence with other aggr options(like per-socket,
> >> >>> per-core). Remove the check "id.core" so that CPU field gets
> >> >>> displayed in the output.
> >> >>
> >> >> Why would you want to print -1 out? Seems like the if statement was a
> >> >> good one to me, otherwise the output looks a bit broken to users. Are
> >> >> the other aggregation modes even working if -1 is set for socket and
> >> >> die? Maybe we need to not print -1 in those cases or exit earlier with a
> >> >> failure.
> >> >>
> >> >> The -1 value has a specific internal meaning which is "to not
> >> >> aggregate". It doesn't mean "not set".
> >> >
> >> > Currently, this check is done only for printing cpu value.
> >> > For socket/die/core values, this check is not done. Pasting an
> >> > example snippet from a powerpc system ( specifically from pseries platform where
> >> > the value is set to -1 )
> >> >
> >> > ./perf stat --per-core -a -C 1 true
> >> >
> >> >  Performance counter stats for 'system wide':
> >> >
> >> > S-1-D-1-C-1          1               1.06 msec cpu-clock                        #    1.018 CPUs utilized
> >> > S-1-D-1-C-1          1                  2      context-switches                 #    1.879 K/sec
> >> > S-1-D-1-C-1          1                  0      cpu-migrations                   #    0.000 /sec
> >> >
> >> > Here though the value is -1, we are displaying it. Where as in case of cpu, the first column will be
> >> > empty since we do a check before printing.
> >> >
> >> > Example:
> >> >
> >> > ./perf stat --per-core -A -C 1 true
> >> >
> >> >  Performance counter stats for 'CPU(s) 1':
> >> >
> >> >               0.88 msec cpu-clock                        #    1.022 CPUs utilized
> >> >                  2      context-switches
> >> >                  0      cpu-migrations
> >> >
> >> >
> >> > No sure, whether there are scripts out there, which consume the current format and
> >> > not displaying -1 may break it. That is why we tried with change to remove check for cpu, similar to
> >> > other modes like socket, die, core etc.
> >>
> >> I wouldn't worry about that because there are json and CSV modes which
> >> are machine readable, and -1 is already not always displayed. If
> >> anything this change here is also likely to break parsing by adding -1
> >> where it wasn't before.
> >>
> >> >
> >> > Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the
> >> > values to -1 . I was checking to see where we are mapping -1 to “to not aggregate”.
> >> > What I could find is AGGR_NONE ( which is for no-aggr ) has value as zero.
> >> >
> >> > Reference: defined in util/stat.h
> >> >
> >> > enum aggr_mode {
> >> >         AGGR_NONE,
> >> >
> >>
> >> That enum is never written to any of the cpumap members, that defines
> >> the mode of how to fill the cpu map instead. 0 is a valid value, for
> >> example "CPU 0". -1 is used as a special case and shouldn't be
> >> displayed
> >> IMO.
> >>
> >> Did you see my comment in the code below about the bad merge? Could
> >> that
> >> not be related to your issue?
> >
> > I'm suspicious of this too. In Claire's patch:
> >
> >         case AGGR_NONE:
> > -               if (evsel->percore && !config->percore_show_thread) {
> > -                       fprintf(config->output, "S%d-D%d-C%*d%s",
> > -                               id.socket,
> > -                               id.die,
> > -                               config->csv_output ? 0 : -3,
> > -                               id.core, config->csv_sep);
> > -               } else if (id.cpu.cpu > -1) {
> > -                       fprintf(config->output, "CPU%*d%s",
> > -                               config->csv_output ? 0 : -7,
> > -                               id.cpu.cpu, config->csv_sep);
> > +               if (config->json_output) {
> > +                       if (evsel->percore &&
> > !config->percore_show_thread) {
> > +                               fprintf(config->output, "\"core\" :
> > \"S%d-D%d-C%d\"",
> > +                                       id.socket,
> > +                                       id.die,
> > +                                       id.core);
> > +                       } else if (id.core > -1) {
> > +                               fprintf(config->output, "\"cpu\" :
> > \"%d\", ",
> > +                                       id.cpu.cpu);
> > +                       }
> > +               } else {
> > +                       if (evsel->percore &&
> > !config->percore_show_thread) {
> > +                               fprintf(config->output,
> > "S%d-D%d-C%*d%s",
> > +                                       id.socket,
> > +                                       id.die,
> > +                                       config->csv_output ? 0 : -3,
> > +                                       id.core, config->csv_sep);
> > +                       } else if (id.core > -1) {
> > +                               fprintf(config->output, "CPU%*d%s",
> > +                                       config->csv_output ? 0 : -7,
> > +                                       id.cpu.cpu, config->csv_sep);
> > +                       }
> >                }
> >                break;
> >
> > The old code was using "id.cpu.cpu > -1" while the new code is
> > "id.core > -1". The value printed is id.cpu.cpu and so testing id.core
> > makes less sense to me. Going back to the original patch:
> >
> > https://lore.kernel.org/lkml/20210811224317.1811618-1-cjense@google.com/
> >   case AGGR_NONE:
> > - if (evsel->percore && !config->percore_show_thread) {
> > - fprintf(config->output, "S%d-D%d-C%*d%s",
> > - id.socket,
> > - id.die,
> > - config->csv_output ? 0 : -3,
> > - id.core, config->csv_sep);
> > + if (config->json_output) {
> > + if (evsel->percore && !config->percore_show_thread) {
> > + fprintf(config->output, "\"core\" : \"S%d-D%d-C%d\"",
> > + id.socket,
> > + id.die,
> > + id.core);
> > + } else if (id.core > -1) {
> > + fprintf(config->output, "\"cpu\" : \"%d\", ",
> > + evsel__cpus(evsel)->map[id.core]);
> > + }
> > + } else {
> > + if (evsel->percore && !config->percore_show_thread) {
> > + fprintf(config->output, "S%d-D%d-C%*d%s",
> > + id.socket,
> > + id.die,
> > + config->csv_output ? 0 : -3,
> > + id.core, config->csv_sep);
> >   } else if (id.core > -1) {
> >   fprintf(config->output, "CPU%*d%s",
> >   config->csv_output ? 0 : -7,
> >   evsel__cpus(evsel)->map[id.core],
> >   config->csv_sep);
> > - }
> > + }
> > + }
> > +
> >   break;
> >
> > So testing the id.core isn't a bad index makes sense. However, we
> > changed from core to CPU here:
> > https://lore.kernel.org/all/20220105061351.120843-26-irogers@google.com/
> > and that was because of:
> > https://lore.kernel.org/r/20220105061351.120843-25-irogers@google.com
> >
> > So I think the code needs to test CPU and not core. Whether that is
> > addressing the Power test failures is another matter, as James said we
> > may need a fix in the tests for that.
> >
>
> Hi Ian, James
>
> Thanks for the reviews and suggestions.
>
> After checking through the original commits for id.core vs cpu check,
> sharing patch below to test CPU and not core.
>
>  From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001
>  From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Date: Mon, 3 Oct 2022 15:47:27 +0530
> Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in
> aggr_printout
>
> perf stat has options to aggregate the counts in different
> modes like per socket, per core etc. The function "aggr_printout"
> in util/stat-display.c which is used to print the aggregates,
> has a check for cpu in case of AGGR_NONE. This check was
> originally using condition : "if (id.cpu.cpu > -1)". But
> this got changed after commit df936cadfb58 ("perf stat: Add
> JSON output option"), which added option to output json format
> for different aggregation modes. After this commit, the
> check in "aggr_printout" is using "if (id.core > -1)".
>
> The old code was using "id.cpu.cpu > -1" while the new code
> is using "id.core > -1". But since the value printed is
> id.cpu.cpu, fix this check to use cpu and not core.
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Suggested-by: James Clark <james.clark@arm.com>
> Suggested-by: Ian Rogers <irogers@google.com>

The change below works on my dual socket SkylakeX:
..
85: perf stat CSV output linter                                     :
Ok
86: perf stat csv summary test                                      : Ok
87: perf stat JSON output linter                                    : Ok
..
I don't see anything else out of the ordinary.

Thanks,
Ian

> ---
>   tools/perf/util/stat-display.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/stat-display.c
> b/tools/perf/util/stat-display.c
> index b82844cb0ce7..cf28020798ec 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config
> *config,
>                                         id.socket,
>                                         id.die,
>                                         id.core);
> -                       } else if (id.core > -1) {
> +                       } else if (id.cpu.cpu > -1) {
>                                 fprintf(config->output, "\"cpu\" : \"%d\", ",
>                                         id.cpu.cpu);
>                         }
> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config
> *config,
>                                         id.die,
>                                         config->csv_output ? 0 : -3,
>                                         id.core, config->csv_sep);
> -                       } else if (id.core > -1) {
> +                       } else if (id.cpu.cpu > -1) {
>                                 fprintf(config->output, "CPU%*d%s",
>                                         config->csv_output ? 0 : -7,
>                                         id.cpu.cpu, config->csv_sep);
> --
> 2.31.1
>
> Can you suggest or help to test this patch change.
>
> To address the test failure, as James suggested, I will handle fix in
> testcases and post them
> as a separate patch. Plan is to add a sanity check in the tests to see
> if the "physical_packagge_id" ( ie socket id ) in topology points to -1
> and if so skip the test. Also in parallel, checking to see how we can
> handle the aggregation modes to work incase of "-1" value for socket or
> die
>
> Thanks
> Athira
>
> > Thanks,
> > Ian
> >
> >> Or the one about fixing it in the test instead? Or failing early if
> >> the
> >> topology can't be read?
> >>
> >> I'm still not convinced that any of the modes where -1 is printed are
> >> even working properly so it might be best to fix that rather than just
> >> the printout.
> >>
> >> > James, can you point me to reference for that meaning if I have missed anything.
> >>
> >> It's here:
> >>
> >>   /** Identify where counts are aggregated, -1 implies not to
> >> aggregate. */
> >>   struct aggr_cpu_id {
> >>
> >> >
> >> > Thanks
> >> > Athira
> >> >
> >> >>
> >> >>>
> >> >>> After the fix:
> >> >>>
> >> >>> <<>>
> >> >>> perf stat -e instructions -A -a true
> >> >>>
> >> >>> Performance counter stats for 'system wide':
> >> >>>
> >> >>> CPU-1                  64,034      instructions
> >> >>> CPU-1                  68,941      instructions
> >> >>> CPU-1                  59,418      instructions
> >> >>> CPU-1                  70,478      instructions
> >> >>> CPU-1                  65,201      instructions
> >> >>> CPU-1                  63,704      instructions
> >> >>> <<>>
> >> >>>
> >> >>> This is caught while running "perf test" for
> >> >>> "stat+json_output.sh" and "stat+csv_output.sh".
> >> >>
> >> >> Is it possible to fix the issue by making the tests cope with the lack
> >> >> of the CPU id?
> >> >>
> >> >>>
> >> >>> Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
> >> >>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> >> >>> ---
> >> >>> tools/perf/util/stat-display.c | 6 ++----
> >> >>> 1 file changed, 2 insertions(+), 4 deletions(-)
> >> >>>
> >> >>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> >> >>> index b82844cb0ce7..1b751a730271 100644
> >> >>> --- a/tools/perf/util/stat-display.c
> >> >>> +++ b/tools/perf/util/stat-display.c
> >> >>> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
> >> >>>                                     id.socket,
> >> >>>                                     id.die,
> >> >>>                                     id.core);
> >> >>> -                   } else if (id.core > -1) {
> >> >>> +                   } else
> >> >>
> >> >> This should have been "id.cpu.cpu > -1". Looks like it was changed by
> >> >> some kind of bad merge or rebase in df936cadfb because there is no
> >> >> obvious justification for the change to .core in that commit.
> >> >
> >> >>
> >> >>>                             fprintf(config->output, "\"cpu\" : \"%d\", ",
> >> >>>                                     id.cpu.cpu);
> >> >>> -                   }
> >> >>>             } else {
> >> >>>                     if (evsel->percore && !config->percore_show_thread) {
> >> >>>                             fprintf(config->output, "S%d-D%d-C%*d%s",
> >> >>> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
> >> >>>                                     id.die,
> >> >>>                                     config->csv_output ? 0 : -3,
> >> >>>                                     id.core, config->csv_sep);
> >> >>> -                   } else if (id.core > -1) {
> >> >>> +                   } else
> >> >>>                             fprintf(config->output, "CPU%*d%s",
> >> >>>                                     config->csv_output ? 0 : -7,
> >> >>>                                     id.cpu.cpu, config->csv_sep);
> >> >>> -                   }
> >> >>>             }
> >> >>>             break;
> >> >>>     case AGGR_THREAD:
> >> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-03 18:51           ` Ian Rogers
@ 2022-10-04  7:06             ` Athira Rajeev
  2022-10-04 14:49               ` Ian Rogers
  0 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2022-10-04  7:06 UTC (permalink / raw)
  To: Ian Rogers, Arnaldo Carvalho de Melo, James Clark
  Cc: maddy, Nageswara Sastry, Kajol Jain, linux-perf-users, Jiri Olsa,
	atrajeev, Disha Goel, linuxppc-dev



> On 04-Oct-2022, at 12:21 AM, Ian Rogers <irogers@google.com> wrote:
> 
> On Mon, Oct 3, 2022 at 7:03 AM atrajeev <atrajeev@imap.linux.ibm.com> wrote:
>> 
>> On 2022-10-02 05:17, Ian Rogers wrote:
>>> On Thu, Sep 29, 2022 at 5:56 AM James Clark <james.clark@arm.com>
>>> wrote:
>>>> 
>>>> 
>>>> 
>>>> On 29/09/2022 09:49, Athira Rajeev wrote:
>>>>> 
>>>>> 
>>>>>> On 28-Sep-2022, at 9:05 PM, James Clark <james.clark@arm.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> Hi James,
>>>>> 
>>>>> Thanks for looking at the patch and sharing review comments.
>>>>> 
>>>>>> On 13/09/2022 12:57, Athira Rajeev wrote:
>>>>>>> perf stat includes option to specify aggr_mode to display
>>>>>>> per-socket, per-core, per-die, per-node counter details.
>>>>>>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
>>>>>>> counter values are displayed for each cpu along with "CPU"
>>>>>>> value in one field of the output.
>>>>>>> 
>>>>>>> Each of the aggregate mode uses the information fetched
>>>>>>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
>>>>>> 
>>>>>> I thought that this wouldn't apply to the cpu field because cpu is
>>>>>> basically interchangeable as an index in cpumap, rather than anything
>>>>>> being read from the topology file.
>>>>> 
>>>>> The cpu value is filled in this function:
>>>>> 
>>>>> Function : aggr_cpu_id__cpu
>>>>> Code: util/cpumap.c
>>>>> 
>>>>>> 
>>>>>>> physical_package_id. Utility functions in "cpumap.c" fetches
>>>>>>> this information and populates the socket id, core id, cpu etc.
>>>>>>> If the platform does not expose the topology information,
>>>>>>> these values will be set to -1. Example, in case of powerpc,
>>>>>>> details like physical_package_id is restricted to be exposed
>>>>>>> in pSeries platform. So id.socket, id.core, id.cpu all will
>>>>>>> be set as -1.
>>>>>>> 
>>>>>>> In case of displaying socket or die value, there is no check
>>>>>>> done in the "aggr_printout" function to see if it points to
>>>>>>> valid socket id or die. But for displaying "cpu" value, there
>>>>>>> is a check for "if (id.core > -1)". In case of powerpc pSeries
>>>>>>> where detail like physical_package_id is restricted to be
>>>>>>> exposed, id.core will be set to -1. Hence the column or field
>>>>>>> itself for CPU won't be displayed in the output.
>>>>>>> 
>>>>>>> Result for per-socket:
>>>>>>> 
>>>>>>> <<>>
>>>>>>> perf stat -e branches --per-socket -a true
>>>>>>> 
>>>>>>> Performance counter stats for 'system wide':
>>>>>>> 
>>>>>>> S-1      32            416,851      branches
>>>>>>> <<>>
>>>>>>> 
>>>>>>> Here S has -1 in above result. But with -A option which also
>>>>>>> expects CPU in one column in the result, below is observed.
>>>>>>> 
>>>>>>> <<>>
>>>>>>> /bin/perf stat -e instructions -A -a true
>>>>>>> 
>>>>>>> Performance counter stats for 'system wide':
>>>>>>> 
>>>>>>>           47,146      instructions
>>>>>>>           45,226      instructions
>>>>>>>           43,354      instructions
>>>>>>>           45,184      instructions
>>>>>>> <<>>
>>>>>>> 
>>>>>>> If the cpu id value is pointing to -1 also, it makes sense
>>>>>>> to display the column in the output to replicate the behaviour
>>>>>>> or to be in precedence with other aggr options(like per-socket,
>>>>>>> per-core). Remove the check "id.core" so that CPU field gets
>>>>>>> displayed in the output.
>>>>>> 
>>>>>> Why would you want to print -1 out? Seems like the if statement was a
>>>>>> good one to me, otherwise the output looks a bit broken to users. Are
>>>>>> the other aggregation modes even working if -1 is set for socket and
>>>>>> die? Maybe we need to not print -1 in those cases or exit earlier with a
>>>>>> failure.
>>>>>> 
>>>>>> The -1 value has a specific internal meaning which is "to not
>>>>>> aggregate". It doesn't mean "not set".
>>>>> 
>>>>> Currently, this check is done only for printing cpu value.
>>>>> For socket/die/core values, this check is not done. Pasting an
>>>>> example snippet from a powerpc system ( specifically from pseries platform where
>>>>> the value is set to -1 )
>>>>> 
>>>>> ./perf stat --per-core -a -C 1 true
>>>>> 
>>>>> Performance counter stats for 'system wide':
>>>>> 
>>>>> S-1-D-1-C-1          1               1.06 msec cpu-clock                        #    1.018 CPUs utilized
>>>>> S-1-D-1-C-1          1                  2      context-switches                 #    1.879 K/sec
>>>>> S-1-D-1-C-1          1                  0      cpu-migrations                   #    0.000 /sec
>>>>> 
>>>>> Here though the value is -1, we are displaying it. Where as in case of cpu, the first column will be
>>>>> empty since we do a check before printing.
>>>>> 
>>>>> Example:
>>>>> 
>>>>> ./perf stat --per-core -A -C 1 true
>>>>> 
>>>>> Performance counter stats for 'CPU(s) 1':
>>>>> 
>>>>>              0.88 msec cpu-clock                        #    1.022 CPUs utilized
>>>>>                 2      context-switches
>>>>>                 0      cpu-migrations
>>>>> 
>>>>> 
>>>>> No sure, whether there are scripts out there, which consume the current format and
>>>>> not displaying -1 may break it. That is why we tried with change to remove check for cpu, similar to
>>>>> other modes like socket, die, core etc.
>>>> 
>>>> I wouldn't worry about that because there are json and CSV modes which
>>>> are machine readable, and -1 is already not always displayed. If
>>>> anything this change here is also likely to break parsing by adding -1
>>>> where it wasn't before.
>>>> 
>>>>> 
>>>>> Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the
>>>>> values to -1 . I was checking to see where we are mapping -1 to “to not aggregate”.
>>>>> What I could find is AGGR_NONE ( which is for no-aggr ) has value as zero.
>>>>> 
>>>>> Reference: defined in util/stat.h
>>>>> 
>>>>> enum aggr_mode {
>>>>>        AGGR_NONE,
>>>>> 
>>>> 
>>>> That enum is never written to any of the cpumap members, that defines
>>>> the mode of how to fill the cpu map instead. 0 is a valid value, for
>>>> example "CPU 0". -1 is used as a special case and shouldn't be
>>>> displayed
>>>> IMO.
>>>> 
>>>> Did you see my comment in the code below about the bad merge? Could
>>>> that
>>>> not be related to your issue?
>>> 
>>> I'm suspicious of this too. In Claire's patch:
>>> 
>>>        case AGGR_NONE:
>>> -               if (evsel->percore && !config->percore_show_thread) {
>>> -                       fprintf(config->output, "S%d-D%d-C%*d%s",
>>> -                               id.socket,
>>> -                               id.die,
>>> -                               config->csv_output ? 0 : -3,
>>> -                               id.core, config->csv_sep);
>>> -               } else if (id.cpu.cpu > -1) {
>>> -                       fprintf(config->output, "CPU%*d%s",
>>> -                               config->csv_output ? 0 : -7,
>>> -                               id.cpu.cpu, config->csv_sep);
>>> +               if (config->json_output) {
>>> +                       if (evsel->percore &&
>>> !config->percore_show_thread) {
>>> +                               fprintf(config->output, "\"core\" :
>>> \"S%d-D%d-C%d\"",
>>> +                                       id.socket,
>>> +                                       id.die,
>>> +                                       id.core);
>>> +                       } else if (id.core > -1) {
>>> +                               fprintf(config->output, "\"cpu\" :
>>> \"%d\", ",
>>> +                                       id.cpu.cpu);
>>> +                       }
>>> +               } else {
>>> +                       if (evsel->percore &&
>>> !config->percore_show_thread) {
>>> +                               fprintf(config->output,
>>> "S%d-D%d-C%*d%s",
>>> +                                       id.socket,
>>> +                                       id.die,
>>> +                                       config->csv_output ? 0 : -3,
>>> +                                       id.core, config->csv_sep);
>>> +                       } else if (id.core > -1) {
>>> +                               fprintf(config->output, "CPU%*d%s",
>>> +                                       config->csv_output ? 0 : -7,
>>> +                                       id.cpu.cpu, config->csv_sep);
>>> +                       }
>>>               }
>>>               break;
>>> 
>>> The old code was using "id.cpu.cpu > -1" while the new code is
>>> "id.core > -1". The value printed is id.cpu.cpu and so testing id.core
>>> makes less sense to me. Going back to the original patch:
>>> 
>>> https://lore.kernel.org/lkml/20210811224317.1811618-1-cjense@google.com/
>>>  case AGGR_NONE:
>>> - if (evsel->percore && !config->percore_show_thread) {
>>> - fprintf(config->output, "S%d-D%d-C%*d%s",
>>> - id.socket,
>>> - id.die,
>>> - config->csv_output ? 0 : -3,
>>> - id.core, config->csv_sep);
>>> + if (config->json_output) {
>>> + if (evsel->percore && !config->percore_show_thread) {
>>> + fprintf(config->output, "\"core\" : \"S%d-D%d-C%d\"",
>>> + id.socket,
>>> + id.die,
>>> + id.core);
>>> + } else if (id.core > -1) {
>>> + fprintf(config->output, "\"cpu\" : \"%d\", ",
>>> + evsel__cpus(evsel)->map[id.core]);
>>> + }
>>> + } else {
>>> + if (evsel->percore && !config->percore_show_thread) {
>>> + fprintf(config->output, "S%d-D%d-C%*d%s",
>>> + id.socket,
>>> + id.die,
>>> + config->csv_output ? 0 : -3,
>>> + id.core, config->csv_sep);
>>>  } else if (id.core > -1) {
>>>  fprintf(config->output, "CPU%*d%s",
>>>  config->csv_output ? 0 : -7,
>>>  evsel__cpus(evsel)->map[id.core],
>>>  config->csv_sep);
>>> - }
>>> + }
>>> + }
>>> +
>>>  break;
>>> 
>>> So testing the id.core isn't a bad index makes sense. However, we
>>> changed from core to CPU here:
>>> https://lore.kernel.org/all/20220105061351.120843-26-irogers@google.com/
>>> and that was because of:
>>> https://lore.kernel.org/r/20220105061351.120843-25-irogers@google.com
>>> 
>>> So I think the code needs to test CPU and not core. Whether that is
>>> addressing the Power test failures is another matter, as James said we
>>> may need a fix in the tests for that.
>>> 
>> 
>> Hi Ian, James
>> 
>> Thanks for the reviews and suggestions.
>> 
>> After checking through the original commits for id.core vs cpu check,
>> sharing patch below to test CPU and not core.
>> 
>> From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001
>> From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> Date: Mon, 3 Oct 2022 15:47:27 +0530
>> Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in
>> aggr_printout
>> 
>> perf stat has options to aggregate the counts in different
>> modes like per socket, per core etc. The function "aggr_printout"
>> in util/stat-display.c which is used to print the aggregates,
>> has a check for cpu in case of AGGR_NONE. This check was
>> originally using condition : "if (id.cpu.cpu > -1)". But
>> this got changed after commit df936cadfb58 ("perf stat: Add
>> JSON output option"), which added option to output json format
>> for different aggregation modes. After this commit, the
>> check in "aggr_printout" is using "if (id.core > -1)".
>> 
>> The old code was using "id.cpu.cpu > -1" while the new code
>> is using "id.core > -1". But since the value printed is
>> id.cpu.cpu, fix this check to use cpu and not core.
>> 
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> Suggested-by: James Clark <james.clark@arm.com>
>> Suggested-by: Ian Rogers <irogers@google.com>
> 
> The change below works on my dual socket SkylakeX:
> ..
> 85: perf stat CSV output linter                                     :
> Ok
> 86: perf stat csv summary test                                      : Ok
> 87: perf stat JSON output linter                                    : Ok
> ..
> I don't see anything else out of the ordinary.
> 
> Thanks,
> Ian
> 

Hi Ian,
Thanks for helping with testing. Can I add your Tested-by for the patch ?

Arnaldo,
Please suggest if I have to send as separate patch for the cpu check fix patch pasted above:
 "tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout”

Thanks
Athira
>> ---
>>  tools/perf/util/stat-display.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tools/perf/util/stat-display.c
>> b/tools/perf/util/stat-display.c
>> index b82844cb0ce7..cf28020798ec 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config
>> *config,
>>                                        id.socket,
>>                                        id.die,
>>                                        id.core);
>> -                       } else if (id.core > -1) {
>> +                       } else if (id.cpu.cpu > -1) {
>>                                fprintf(config->output, "\"cpu\" : \"%d\", ",
>>                                        id.cpu.cpu);
>>                        }
>> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config
>> *config,
>>                                        id.die,
>>                                        config->csv_output ? 0 : -3,
>>                                        id.core, config->csv_sep);
>> -                       } else if (id.core > -1) {
>> +                       } else if (id.cpu.cpu > -1) {
>>                                fprintf(config->output, "CPU%*d%s",
>>                                        config->csv_output ? 0 : -7,
>>                                        id.cpu.cpu, config->csv_sep);
>> --
>> 2.31.1
>> 
>> Can you suggest or help to test this patch change.
>> 
>> To address the test failure, as James suggested, I will handle fix in
>> testcases and post them
>> as a separate patch. Plan is to add a sanity check in the tests to see
>> if the "physical_packagge_id" ( ie socket id ) in topology points to -1
>> and if so skip the test. Also in parallel, checking to see how we can
>> handle the aggregation modes to work incase of "-1" value for socket or
>> die
>> 
>> Thanks
>> Athira
>> 
>>> Thanks,
>>> Ian
>>> 
>>>> Or the one about fixing it in the test instead? Or failing early if
>>>> the
>>>> topology can't be read?
>>>> 
>>>> I'm still not convinced that any of the modes where -1 is printed are
>>>> even working properly so it might be best to fix that rather than just
>>>> the printout.
>>>> 
>>>>> James, can you point me to reference for that meaning if I have missed anything.
>>>> 
>>>> It's here:
>>>> 
>>>>  /** Identify where counts are aggregated, -1 implies not to
>>>> aggregate. */
>>>>  struct aggr_cpu_id {
>>>> 
>>>>> 
>>>>> Thanks
>>>>> Athira
>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> After the fix:
>>>>>>> 
>>>>>>> <<>>
>>>>>>> perf stat -e instructions -A -a true
>>>>>>> 
>>>>>>> Performance counter stats for 'system wide':
>>>>>>> 
>>>>>>> CPU-1                  64,034      instructions
>>>>>>> CPU-1                  68,941      instructions
>>>>>>> CPU-1                  59,418      instructions
>>>>>>> CPU-1                  70,478      instructions
>>>>>>> CPU-1                  65,201      instructions
>>>>>>> CPU-1                  63,704      instructions
>>>>>>> <<>>
>>>>>>> 
>>>>>>> This is caught while running "perf test" for
>>>>>>> "stat+json_output.sh" and "stat+csv_output.sh".
>>>>>> 
>>>>>> Is it possible to fix the issue by making the tests cope with the lack
>>>>>> of the CPU id?
>>>>>> 
>>>>>>> 
>>>>>>> Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
>>>>>>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>>>>>>> ---
>>>>>>> tools/perf/util/stat-display.c | 6 ++----
>>>>>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>>>>>>> index b82844cb0ce7..1b751a730271 100644
>>>>>>> --- a/tools/perf/util/stat-display.c
>>>>>>> +++ b/tools/perf/util/stat-display.c
>>>>>>> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
>>>>>>>                                    id.socket,
>>>>>>>                                    id.die,
>>>>>>>                                    id.core);
>>>>>>> -                   } else if (id.core > -1) {
>>>>>>> +                   } else
>>>>>> 
>>>>>> This should have been "id.cpu.cpu > -1". Looks like it was changed by
>>>>>> some kind of bad merge or rebase in df936cadfb because there is no
>>>>>> obvious justification for the change to .core in that commit.
>>>>> 
>>>>>> 
>>>>>>>                            fprintf(config->output, "\"cpu\" : \"%d\", ",
>>>>>>>                                    id.cpu.cpu);
>>>>>>> -                   }
>>>>>>>            } else {
>>>>>>>                    if (evsel->percore && !config->percore_show_thread) {
>>>>>>>                            fprintf(config->output, "S%d-D%d-C%*d%s",
>>>>>>> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
>>>>>>>                                    id.die,
>>>>>>>                                    config->csv_output ? 0 : -3,
>>>>>>>                                    id.core, config->csv_sep);
>>>>>>> -                   } else if (id.core > -1) {
>>>>>>> +                   } else
>>>>>>>                            fprintf(config->output, "CPU%*d%s",
>>>>>>>                                    config->csv_output ? 0 : -7,
>>>>>>>                                    id.cpu.cpu, config->csv_sep);
>>>>>>> -                   }
>>>>>>>            }
>>>>>>>            break;
>>>>>>>    case AGGR_THREAD:
>>>>> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-04  7:06             ` Athira Rajeev
@ 2022-10-04 14:49               ` Ian Rogers
  2022-10-04 18:14                 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 19+ messages in thread
From: Ian Rogers @ 2022-10-04 14:49 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: maddy, Nageswara Sastry, Kajol Jain, Arnaldo Carvalho de Melo,
	linux-perf-users, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 18965 bytes --]

On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev <atrajeev@linux.vnet.ibm.com>
wrote:

>
>
> > On 04-Oct-2022, at 12:21 AM, Ian Rogers <irogers@google.com> wrote:
> >
> > On Mon, Oct 3, 2022 at 7:03 AM atrajeev <atrajeev@imap.linux.ibm.com>
> wrote:
> >>
> >> On 2022-10-02 05:17, Ian Rogers wrote:
> >>> On Thu, Sep 29, 2022 at 5:56 AM James Clark <james.clark@arm.com>
> >>> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 29/09/2022 09:49, Athira Rajeev wrote:
> >>>>>
> >>>>>
> >>>>>> On 28-Sep-2022, at 9:05 PM, James Clark <james.clark@arm.com>
> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> Hi James,
> >>>>>
> >>>>> Thanks for looking at the patch and sharing review comments.
> >>>>>
> >>>>>> On 13/09/2022 12:57, Athira Rajeev wrote:
> >>>>>>> perf stat includes option to specify aggr_mode to display
> >>>>>>> per-socket, per-core, per-die, per-node counter details.
> >>>>>>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
> >>>>>>> counter values are displayed for each cpu along with "CPU"
> >>>>>>> value in one field of the output.
> >>>>>>>
> >>>>>>> Each of the aggregate mode uses the information fetched
> >>>>>>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> >>>>>>
> >>>>>> I thought that this wouldn't apply to the cpu field because cpu is
> >>>>>> basically interchangeable as an index in cpumap, rather than
> anything
> >>>>>> being read from the topology file.
> >>>>>
> >>>>> The cpu value is filled in this function:
> >>>>>
> >>>>> Function : aggr_cpu_id__cpu
> >>>>> Code: util/cpumap.c
> >>>>>
> >>>>>>
> >>>>>>> physical_package_id. Utility functions in "cpumap.c" fetches
> >>>>>>> this information and populates the socket id, core id, cpu etc.
> >>>>>>> If the platform does not expose the topology information,
> >>>>>>> these values will be set to -1. Example, in case of powerpc,
> >>>>>>> details like physical_package_id is restricted to be exposed
> >>>>>>> in pSeries platform. So id.socket, id.core, id.cpu all will
> >>>>>>> be set as -1.
> >>>>>>>
> >>>>>>> In case of displaying socket or die value, there is no check
> >>>>>>> done in the "aggr_printout" function to see if it points to
> >>>>>>> valid socket id or die. But for displaying "cpu" value, there
> >>>>>>> is a check for "if (id.core > -1)". In case of powerpc pSeries
> >>>>>>> where detail like physical_package_id is restricted to be
> >>>>>>> exposed, id.core will be set to -1. Hence the column or field
> >>>>>>> itself for CPU won't be displayed in the output.
> >>>>>>>
> >>>>>>> Result for per-socket:
> >>>>>>>
> >>>>>>> <<>>
> >>>>>>> perf stat -e branches --per-socket -a true
> >>>>>>>
> >>>>>>> Performance counter stats for 'system wide':
> >>>>>>>
> >>>>>>> S-1      32            416,851      branches
> >>>>>>> <<>>
> >>>>>>>
> >>>>>>> Here S has -1 in above result. But with -A option which also
> >>>>>>> expects CPU in one column in the result, below is observed.
> >>>>>>>
> >>>>>>> <<>>
> >>>>>>> /bin/perf stat -e instructions -A -a true
> >>>>>>>
> >>>>>>> Performance counter stats for 'system wide':
> >>>>>>>
> >>>>>>>           47,146      instructions
> >>>>>>>           45,226      instructions
> >>>>>>>           43,354      instructions
> >>>>>>>           45,184      instructions
> >>>>>>> <<>>
> >>>>>>>
> >>>>>>> If the cpu id value is pointing to -1 also, it makes sense
> >>>>>>> to display the column in the output to replicate the behaviour
> >>>>>>> or to be in precedence with other aggr options(like per-socket,
> >>>>>>> per-core). Remove the check "id.core" so that CPU field gets
> >>>>>>> displayed in the output.
> >>>>>>
> >>>>>> Why would you want to print -1 out? Seems like the if statement was
> a
> >>>>>> good one to me, otherwise the output looks a bit broken to users.
> Are
> >>>>>> the other aggregation modes even working if -1 is set for socket and
> >>>>>> die? Maybe we need to not print -1 in those cases or exit earlier
> with a
> >>>>>> failure.
> >>>>>>
> >>>>>> The -1 value has a specific internal meaning which is "to not
> >>>>>> aggregate". It doesn't mean "not set".
> >>>>>
> >>>>> Currently, this check is done only for printing cpu value.
> >>>>> For socket/die/core values, this check is not done. Pasting an
> >>>>> example snippet from a powerpc system ( specifically from pseries
> platform where
> >>>>> the value is set to -1 )
> >>>>>
> >>>>> ./perf stat --per-core -a -C 1 true
> >>>>>
> >>>>> Performance counter stats for 'system wide':
> >>>>>
> >>>>> S-1-D-1-C-1          1               1.06 msec cpu-clock
>             #    1.018 CPUs utilized
> >>>>> S-1-D-1-C-1          1                  2      context-switches
>            #    1.879 K/sec
> >>>>> S-1-D-1-C-1          1                  0      cpu-migrations
>            #    0.000 /sec
> >>>>>
> >>>>> Here though the value is -1, we are displaying it. Where as in case
> of cpu, the first column will be
> >>>>> empty since we do a check before printing.
> >>>>>
> >>>>> Example:
> >>>>>
> >>>>> ./perf stat --per-core -A -C 1 true
> >>>>>
> >>>>> Performance counter stats for 'CPU(s) 1':
> >>>>>
> >>>>>              0.88 msec cpu-clock                        #    1.022
> CPUs utilized
> >>>>>                 2      context-switches
> >>>>>                 0      cpu-migrations
> >>>>>
> >>>>>
> >>>>> No sure, whether there are scripts out there, which consume the
> current format and
> >>>>> not displaying -1 may break it. That is why we tried with change to
> remove check for cpu, similar to
> >>>>> other modes like socket, die, core etc.
> >>>>
> >>>> I wouldn't worry about that because there are json and CSV modes which
> >>>> are machine readable, and -1 is already not always displayed. If
> >>>> anything this change here is also likely to break parsing by adding -1
> >>>> where it wasn't before.
> >>>>
> >>>>>
> >>>>> Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises
> the
> >>>>> values to -1 . I was checking to see where we are mapping -1 to “to
> not aggregate”.
> >>>>> What I could find is AGGR_NONE ( which is for no-aggr ) has value as
> zero.
> >>>>>
> >>>>> Reference: defined in util/stat.h
> >>>>>
> >>>>> enum aggr_mode {
> >>>>>        AGGR_NONE,
> >>>>>
> >>>>
> >>>> That enum is never written to any of the cpumap members, that defines
> >>>> the mode of how to fill the cpu map instead. 0 is a valid value, for
> >>>> example "CPU 0". -1 is used as a special case and shouldn't be
> >>>> displayed
> >>>> IMO.
> >>>>
> >>>> Did you see my comment in the code below about the bad merge? Could
> >>>> that
> >>>> not be related to your issue?
> >>>
> >>> I'm suspicious of this too. In Claire's patch:
> >>>
> >>>        case AGGR_NONE:
> >>> -               if (evsel->percore && !config->percore_show_thread) {
> >>> -                       fprintf(config->output, "S%d-D%d-C%*d%s",
> >>> -                               id.socket,
> >>> -                               id.die,
> >>> -                               config->csv_output ? 0 : -3,
> >>> -                               id.core, config->csv_sep);
> >>> -               } else if (id.cpu.cpu > -1) {
> >>> -                       fprintf(config->output, "CPU%*d%s",
> >>> -                               config->csv_output ? 0 : -7,
> >>> -                               id.cpu.cpu, config->csv_sep);
> >>> +               if (config->json_output) {
> >>> +                       if (evsel->percore &&
> >>> !config->percore_show_thread) {
> >>> +                               fprintf(config->output, "\"core\" :
> >>> \"S%d-D%d-C%d\"",
> >>> +                                       id.socket,
> >>> +                                       id.die,
> >>> +                                       id.core);
> >>> +                       } else if (id.core > -1) {
> >>> +                               fprintf(config->output, "\"cpu\" :
> >>> \"%d\", ",
> >>> +                                       id.cpu.cpu);
> >>> +                       }
> >>> +               } else {
> >>> +                       if (evsel->percore &&
> >>> !config->percore_show_thread) {
> >>> +                               fprintf(config->output,
> >>> "S%d-D%d-C%*d%s",
> >>> +                                       id.socket,
> >>> +                                       id.die,
> >>> +                                       config->csv_output ? 0 : -3,
> >>> +                                       id.core, config->csv_sep);
> >>> +                       } else if (id.core > -1) {
> >>> +                               fprintf(config->output, "CPU%*d%s",
> >>> +                                       config->csv_output ? 0 : -7,
> >>> +                                       id.cpu.cpu, config->csv_sep);
> >>> +                       }
> >>>               }
> >>>               break;
> >>>
> >>> The old code was using "id.cpu.cpu > -1" while the new code is
> >>> "id.core > -1". The value printed is id.cpu.cpu and so testing id.core
> >>> makes less sense to me. Going back to the original patch:
> >>>
> >>>
> https://lore.kernel.org/lkml/20210811224317.1811618-1-cjense@google.com/
> >>>  case AGGR_NONE:
> >>> - if (evsel->percore && !config->percore_show_thread) {
> >>> - fprintf(config->output, "S%d-D%d-C%*d%s",
> >>> - id.socket,
> >>> - id.die,
> >>> - config->csv_output ? 0 : -3,
> >>> - id.core, config->csv_sep);
> >>> + if (config->json_output) {
> >>> + if (evsel->percore && !config->percore_show_thread) {
> >>> + fprintf(config->output, "\"core\" : \"S%d-D%d-C%d\"",
> >>> + id.socket,
> >>> + id.die,
> >>> + id.core);
> >>> + } else if (id.core > -1) {
> >>> + fprintf(config->output, "\"cpu\" : \"%d\", ",
> >>> + evsel__cpus(evsel)->map[id.core]);
> >>> + }
> >>> + } else {
> >>> + if (evsel->percore && !config->percore_show_thread) {
> >>> + fprintf(config->output, "S%d-D%d-C%*d%s",
> >>> + id.socket,
> >>> + id.die,
> >>> + config->csv_output ? 0 : -3,
> >>> + id.core, config->csv_sep);
> >>>  } else if (id.core > -1) {
> >>>  fprintf(config->output, "CPU%*d%s",
> >>>  config->csv_output ? 0 : -7,
> >>>  evsel__cpus(evsel)->map[id.core],
> >>>  config->csv_sep);
> >>> - }
> >>> + }
> >>> + }
> >>> +
> >>>  break;
> >>>
> >>> So testing the id.core isn't a bad index makes sense. However, we
> >>> changed from core to CPU here:
> >>>
> https://lore.kernel.org/all/20220105061351.120843-26-irogers@google.com/
> >>> and that was because of:
> >>> https://lore.kernel.org/r/20220105061351.120843-25-irogers@google.com
> >>>
> >>> So I think the code needs to test CPU and not core. Whether that is
> >>> addressing the Power test failures is another matter, as James said we
> >>> may need a fix in the tests for that.
> >>>
> >>
> >> Hi Ian, James
> >>
> >> Thanks for the reviews and suggestions.
> >>
> >> After checking through the original commits for id.core vs cpu check,
> >> sharing patch below to test CPU and not core.
> >>
> >> From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001
> >> From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> >> Date: Mon, 3 Oct 2022 15:47:27 +0530
> >> Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in
> >> aggr_printout
> >>
> >> perf stat has options to aggregate the counts in different
> >> modes like per socket, per core etc. The function "aggr_printout"
> >> in util/stat-display.c which is used to print the aggregates,
> >> has a check for cpu in case of AGGR_NONE. This check was
> >> originally using condition : "if (id.cpu.cpu > -1)". But
> >> this got changed after commit df936cadfb58 ("perf stat: Add
> >> JSON output option"), which added option to output json format
> >> for different aggregation modes. After this commit, the
> >> check in "aggr_printout" is using "if (id.core > -1)".
> >>
> >> The old code was using "id.cpu.cpu > -1" while the new code
> >> is using "id.core > -1". But since the value printed is
> >> id.cpu.cpu, fix this check to use cpu and not core.
> >>
> >> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> >> Suggested-by: James Clark <james.clark@arm.com>
> >> Suggested-by: Ian Rogers <irogers@google.com>
> >
> > The change below works on my dual socket SkylakeX:
> > ..
> > 85: perf stat CSV output linter                                     :
> > Ok
> > 86: perf stat csv summary test                                      : Ok
> > 87: perf stat JSON output linter                                    : Ok
> > ..
> > I don't see anything else out of the ordinary.
> >
> > Thanks,
> > Ian
> >
>
> Hi Ian,
> Thanks for helping with testing. Can I add your Tested-by for the patch ?
>

Yep.

Tested-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

Arnaldo,
> Please suggest if I have to send as separate patch for the cpu check fix
> patch pasted above:
>  "tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout”
>
> Thanks
> Athira
> >> ---
> >>  tools/perf/util/stat-display.c | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tools/perf/util/stat-display.c
> >> b/tools/perf/util/stat-display.c
> >> index b82844cb0ce7..cf28020798ec 100644
> >> --- a/tools/perf/util/stat-display.c
> >> +++ b/tools/perf/util/stat-display.c
> >> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config
> >> *config,
> >>                                        id.socket,
> >>                                        id.die,
> >>                                        id.core);
> >> -                       } else if (id.core > -1) {
> >> +                       } else if (id.cpu.cpu > -1) {
> >>                                fprintf(config->output, "\"cpu\" :
> \"%d\", ",
> >>                                        id.cpu.cpu);
> >>                        }
> >> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config
> >> *config,
> >>                                        id.die,
> >>                                        config->csv_output ? 0 : -3,
> >>                                        id.core, config->csv_sep);
> >> -                       } else if (id.core > -1) {
> >> +                       } else if (id.cpu.cpu > -1) {
> >>                                fprintf(config->output, "CPU%*d%s",
> >>                                        config->csv_output ? 0 : -7,
> >>                                        id.cpu.cpu, config->csv_sep);
> >> --
> >> 2.31.1
> >>
> >> Can you suggest or help to test this patch change.
> >>
> >> To address the test failure, as James suggested, I will handle fix in
> >> testcases and post them
> >> as a separate patch. Plan is to add a sanity check in the tests to see
> >> if the "physical_packagge_id" ( ie socket id ) in topology points to -1
> >> and if so skip the test. Also in parallel, checking to see how we can
> >> handle the aggregation modes to work incase of "-1" value for socket or
> >> die
> >>
> >> Thanks
> >> Athira
> >>
> >>> Thanks,
> >>> Ian
> >>>
> >>>> Or the one about fixing it in the test instead? Or failing early if
> >>>> the
> >>>> topology can't be read?
> >>>>
> >>>> I'm still not convinced that any of the modes where -1 is printed are
> >>>> even working properly so it might be best to fix that rather than just
> >>>> the printout.
> >>>>
> >>>>> James, can you point me to reference for that meaning if I have
> missed anything.
> >>>>
> >>>> It's here:
> >>>>
> >>>>  /** Identify where counts are aggregated, -1 implies not to
> >>>> aggregate. */
> >>>>  struct aggr_cpu_id {
> >>>>
> >>>>>
> >>>>> Thanks
> >>>>> Athira
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> After the fix:
> >>>>>>>
> >>>>>>> <<>>
> >>>>>>> perf stat -e instructions -A -a true
> >>>>>>>
> >>>>>>> Performance counter stats for 'system wide':
> >>>>>>>
> >>>>>>> CPU-1                  64,034      instructions
> >>>>>>> CPU-1                  68,941      instructions
> >>>>>>> CPU-1                  59,418      instructions
> >>>>>>> CPU-1                  70,478      instructions
> >>>>>>> CPU-1                  65,201      instructions
> >>>>>>> CPU-1                  63,704      instructions
> >>>>>>> <<>>
> >>>>>>>
> >>>>>>> This is caught while running "perf test" for
> >>>>>>> "stat+json_output.sh" and "stat+csv_output.sh".
> >>>>>>
> >>>>>> Is it possible to fix the issue by making the tests cope with the
> lack
> >>>>>> of the CPU id?
> >>>>>>
> >>>>>>>
> >>>>>>> Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
> >>>>>>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> >>>>>>> ---
> >>>>>>> tools/perf/util/stat-display.c | 6 ++----
> >>>>>>> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/tools/perf/util/stat-display.c
> b/tools/perf/util/stat-display.c
> >>>>>>> index b82844cb0ce7..1b751a730271 100644
> >>>>>>> --- a/tools/perf/util/stat-display.c
> >>>>>>> +++ b/tools/perf/util/stat-display.c
> >>>>>>> @@ -168,10 +168,9 @@ static void aggr_printout(struct
> perf_stat_config *config,
> >>>>>>>                                    id.socket,
> >>>>>>>                                    id.die,
> >>>>>>>                                    id.core);
> >>>>>>> -                   } else if (id.core > -1) {
> >>>>>>> +                   } else
> >>>>>>
> >>>>>> This should have been "id.cpu.cpu > -1". Looks like it was changed
> by
> >>>>>> some kind of bad merge or rebase in df936cadfb because there is no
> >>>>>> obvious justification for the change to .core in that commit.
> >>>>>
> >>>>>>
> >>>>>>>                            fprintf(config->output, "\"cpu\" :
> \"%d\", ",
> >>>>>>>                                    id.cpu.cpu);
> >>>>>>> -                   }
> >>>>>>>            } else {
> >>>>>>>                    if (evsel->percore &&
> !config->percore_show_thread) {
> >>>>>>>                            fprintf(config->output,
> "S%d-D%d-C%*d%s",
> >>>>>>> @@ -179,11 +178,10 @@ static void aggr_printout(struct
> perf_stat_config *config,
> >>>>>>>                                    id.die,
> >>>>>>>                                    config->csv_output ? 0 : -3,
> >>>>>>>                                    id.core, config->csv_sep);
> >>>>>>> -                   } else if (id.core > -1) {
> >>>>>>> +                   } else
> >>>>>>>                            fprintf(config->output, "CPU%*d%s",
> >>>>>>>                                    config->csv_output ? 0 : -7,
> >>>>>>>                                    id.cpu.cpu, config->csv_sep);
> >>>>>>> -                   }
> >>>>>>>            }
> >>>>>>>            break;
> >>>>>>>    case AGGR_THREAD:
> >>>>>
>
>

[-- Attachment #2: Type: text/html, Size: 29653 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-04 14:49               ` Ian Rogers
@ 2022-10-04 18:14                 ` Arnaldo Carvalho de Melo
  2022-10-04 18:14                   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 19+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-10-04 18:14 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Athira Rajeev, Nageswara Sastry, Kajol Jain, linux-perf-users,
	maddy, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev

Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu:
> On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> > Thanks for helping with testing. Can I add your Tested-by for the patch ?
 
> Yep.
 
> Tested-by: Ian Rogers <irogers@google.com>
 
> Thanks,
> Ian

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-04 18:14                 ` Arnaldo Carvalho de Melo
@ 2022-10-04 18:14                   ` Arnaldo Carvalho de Melo
  2022-10-05  4:53                     ` Athira Rajeev
  0 siblings, 1 reply; 19+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-10-04 18:14 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Athira Rajeev, Nageswara Sastry, Kajol Jain, linux-perf-users,
	maddy, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev

Em Tue, Oct 04, 2022 at 03:14:27PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu:
> > On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> > > Thanks for helping with testing. Can I add your Tested-by for the patch ?
>  
> > Yep.
>  
> > Tested-by: Ian Rogers <irogers@google.com>


Thanks, applied.

- Arnaldo


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-04 18:14                   ` Arnaldo Carvalho de Melo
@ 2022-10-05  4:53                     ` Athira Rajeev
  2022-10-05 12:24                       ` Arnaldo Carvalho de Melo
  2022-10-05 12:28                       ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 19+ messages in thread
From: Athira Rajeev @ 2022-10-05  4:53 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	linux-perf-users, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev



> On 04-Oct-2022, at 11:44 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Tue, Oct 04, 2022 at 03:14:27PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu:
>>> On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>>>> Thanks for helping with testing. Can I add your Tested-by for the patch ?
>> 
>>> Yep.
>> 
>>> Tested-by: Ian Rogers <irogers@google.com>
> 
> 
> Thanks, applied.
> 
> - Arnaldo

Hi Arnaldo,

Looks like you have taken change to remove id.core check:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core&id=db83f447b323958cdc5fedcf2134effb2ec9a6fe

But the patch that has to go in is :
"[PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in ggr_printout"
which is tested by Ian and "pasted" by me in same mail thread.

Re-pasting here for reference:

From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001
From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Date: Mon, 3 Oct 2022 15:47:27 +0530
Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout

perf stat has options to aggregate the counts in different
modes like per socket, per core etc. The function "aggr_printout"
in util/stat-display.c which is used to print the aggregates,
has a check for cpu in case of AGGR_NONE. This check was
originally using condition : "if (id.cpu.cpu > -1)". But
this got changed after commit df936cadfb58 ("perf stat: Add
JSON output option"), which added option to output json format
for different aggregation modes. After this commit, the
check in "aggr_printout" is using "if (id.core > -1)".

The old code was using "id.cpu.cpu > -1" while the new code
is using "id.core > -1". But since the value printed is
id.cpu.cpu, fix this check to use cpu and not core.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Suggested-by: James Clark <james.clark@arm.com>
Suggested-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/stat-display.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index b82844cb0ce7..cf28020798ec 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
					id.socket,
					id.die,
					id.core);
-			} else if (id.core > -1) {
+			} else if (id.cpu.cpu > -1) {
				fprintf(config->output, "\"cpu\" : \"%d\", ",
					id.cpu.cpu);
			}
@@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
					id.die,
					config->csv_output ? 0 : -3,
					id.core, config->csv_sep);
-			} else if (id.core > -1) {
+			} else if (id.cpu.cpu > -1) {
				fprintf(config->output, "CPU%*d%s",
					config->csv_output ? 0 : -7,
					id.cpu.cpu, config->csv_sep);
-- 
2.31.1

If it is confusing, shall I send it as a separate patch along with Tested-by from Ian ?

Please revert https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core&id=db83f447b323958cdc5fedcf2134effb2ec9a6fe

Thanks
Athira

> 


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-05  4:53                     ` Athira Rajeev
@ 2022-10-05 12:24                       ` Arnaldo Carvalho de Melo
  2022-10-05 12:28                       ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 19+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-10-05 12:24 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	linux-perf-users, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev

Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu:
> 
> 
> > On 04-Oct-2022, at 11:44 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > 
> > Em Tue, Oct 04, 2022 at 03:14:27PM -0300, Arnaldo Carvalho de Melo escreveu:
> >> Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu:
> >>> On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> >>>> Thanks for helping with testing. Can I add your Tested-by for the patch ?
> >> 
> >>> Yep.
> >> 
> >>> Tested-by: Ian Rogers <irogers@google.com>
> > 
> > 
> > Thanks, applied.
> > 
> > - Arnaldo
> 
> Hi Arnaldo,
> 
> Looks like you have taken change to remove id.core check:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core&id=db83f447b323958cdc5fedcf2134effb2ec9a6fe
> 
> But the patch that has to go in is :
> "[PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in ggr_printout"
> which is tested by Ian and "pasted" by me in same mail thread.
> 
> Re-pasting here for reference:
> 
> >From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001
> From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Date: Mon, 3 Oct 2022 15:47:27 +0530
> Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout
> 
> perf stat has options to aggregate the counts in different
> modes like per socket, per core etc. The function "aggr_printout"
> in util/stat-display.c which is used to print the aggregates,
> has a check for cpu in case of AGGR_NONE. This check was
> originally using condition : "if (id.cpu.cpu > -1)". But
> this got changed after commit df936cadfb58 ("perf stat: Add
> JSON output option"), which added option to output json format
> for different aggregation modes. After this commit, the
> check in "aggr_printout" is using "if (id.core > -1)".
> 
> The old code was using "id.cpu.cpu > -1" while the new code
> is using "id.core > -1". But since the value printed is
> id.cpu.cpu, fix this check to use cpu and not core.
> 
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Suggested-by: James Clark <james.clark@arm.com>
> Suggested-by: Ian Rogers <irogers@google.com>
> ---
> tools/perf/util/stat-display.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index b82844cb0ce7..cf28020798ec 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
> 					id.socket,
> 					id.die,
> 					id.core);
> -			} else if (id.core > -1) {
> +			} else if (id.cpu.cpu > -1) {
> 				fprintf(config->output, "\"cpu\" : \"%d\", ",
> 					id.cpu.cpu);
> 			}
> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
> 					id.die,
> 					config->csv_output ? 0 : -3,
> 					id.core, config->csv_sep);
> -			} else if (id.core > -1) {
> +			} else if (id.cpu.cpu > -1) {
> 				fprintf(config->output, "CPU%*d%s",
> 					config->csv_output ? 0 : -7,
> 					id.cpu.cpu, config->csv_sep);
> -- 
> 2.31.1
> 
> If it is confusing, shall I send it as a separate patch along with Tested-by from Ian ?
> 
> Please revert https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core&id=db83f447b323958cdc5fedcf2134effb2ec9a6fe

I'll do it, but in these cases just go ahead and resubmit with a v N+1
patchset, so that b4 can pick the newer version even if I point it to
the previous version message-id.

- Arnaldo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-05  4:53                     ` Athira Rajeev
  2022-10-05 12:24                       ` Arnaldo Carvalho de Melo
@ 2022-10-05 12:28                       ` Arnaldo Carvalho de Melo
  2022-10-05 12:35                         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 19+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-10-05 12:28 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	linux-perf-users, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev

Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu:
> 
> 
> > On 04-Oct-2022, at 11:44 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > 
> > Em Tue, Oct 04, 2022 at 03:14:27PM -0300, Arnaldo Carvalho de Melo escreveu:
> >> Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu:
> >>> On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> >>>> Thanks for helping with testing. Can I add your Tested-by for the patch ?
> >> 
> >>> Yep.
> >> 
> >>> Tested-by: Ian Rogers <irogers@google.com>
> > 
> > 
> > Thanks, applied.
> > 
> > - Arnaldo
> 
> Hi Arnaldo,
> 
> Looks like you have taken change to remove id.core check:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core&id=db83f447b323958cdc5fedcf2134effb2ec9a6fe
> 
> But the patch that has to go in is :
> "[PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in ggr_printout"
> which is tested by Ian and "pasted" by me in same mail thread.
> 
> Re-pasting here for reference:
> 
> >From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001
> From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Date: Mon, 3 Oct 2022 15:47:27 +0530
> Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout
> 
> perf stat has options to aggregate the counts in different
> modes like per socket, per core etc. The function "aggr_printout"
> in util/stat-display.c which is used to print the aggregates,
> has a check for cpu in case of AGGR_NONE. This check was
> originally using condition : "if (id.cpu.cpu > -1)". But
> this got changed after commit df936cadfb58 ("perf stat: Add
> JSON output option"), which added option to output json format
> for different aggregation modes. After this commit, the
> check in "aggr_printout" is using "if (id.core > -1)".
> 
> The old code was using "id.cpu.cpu > -1" while the new code
> is using "id.core > -1". But since the value printed is
> id.cpu.cpu, fix this check to use cpu and not core.
> 
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Suggested-by: James Clark <james.clark@arm.com>
> Suggested-by: Ian Rogers <irogers@google.com>
> ---
> tools/perf/util/stat-display.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index b82844cb0ce7..cf28020798ec 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
> 					id.socket,
> 					id.die,
> 					id.core);
> -			} else if (id.core > -1) {
> +			} else if (id.cpu.cpu > -1) {
> 				fprintf(config->output, "\"cpu\" : \"%d\", ",
> 					id.cpu.cpu);
> 			}
> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
> 					id.die,
> 					config->csv_output ? 0 : -3,
> 					id.core, config->csv_sep);
> -			} else if (id.core > -1) {
> +			} else if (id.cpu.cpu > -1) {
> 				fprintf(config->output, "CPU%*d%s",
> 					config->csv_output ? 0 : -7,
> 					id.cpu.cpu, config->csv_sep);
> -- 
> 2.31.1
> 
> If it is confusing, shall I send it as a separate patch along with Tested-by from Ian ?

I'll have to do this by hand, tried pointing b4 to this message and it
picked the old one, also tried to save the message and apply by hand,
its mangled.

- Arnaldo
> 
> Thanks
> Athira
> 
> > 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-05 12:28                       ` Arnaldo Carvalho de Melo
@ 2022-10-05 12:35                         ` Arnaldo Carvalho de Melo
  2022-10-06 12:46                           ` Athira Rajeev
  0 siblings, 1 reply; 19+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-10-05 12:35 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	linux-perf-users, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev

Em Wed, Oct 05, 2022 at 09:28:52AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu:
> > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > index b82844cb0ce7..cf28020798ec 100644
> > --- a/tools/perf/util/stat-display.c
> > +++ b/tools/perf/util/stat-display.c
> > @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
> > 					id.socket,
> > 					id.die,
> > 					id.core);
> > -			} else if (id.core > -1) {
> > +			} else if (id.cpu.cpu > -1) {
> > 				fprintf(config->output, "\"cpu\" : \"%d\", ",
> > 					id.cpu.cpu);
> > 			}
> > @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
> > 					id.die,
> > 					config->csv_output ? 0 : -3,
> > 					id.core, config->csv_sep);
> > -			} else if (id.core > -1) {
> > +			} else if (id.cpu.cpu > -1) {
> > 				fprintf(config->output, "CPU%*d%s",
> > 					config->csv_output ? 0 : -7,
> > 					id.cpu.cpu, config->csv_sep);
> > -- 
> > If it is confusing, shall I send it as a separate patch along with Tested-by from Ian ?
> 
> I'll have to do this by hand, tried pointing b4 to this message and it
> picked the old one, also tried to save the message and apply by hand,
> its mangled.

This is what I have now, will force push later, please triple check :-)

- Arnaldo

From b7dd96f9211e4ddbd6fa080da8dec2eac98d3f2a Mon Sep 17 00:00:00 2001
From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Date: Tue, 13 Sep 2022 17:27:17 +0530
Subject: [PATCH 1/1] perf stat: Fix aggr_printout to display CPU field
 irrespective of core value

perf stat includes option to specify aggr_mode to display per-socket,
per-core, per-die, per-node counter details.  Also there is option -A (
AGGR_NONE, -no-aggr ), where the counter values are displayed for each
CPU along with "CPU" value in one field of the output.

Each of the aggregate mode uses the information fetched from
"/sys/devices/system/cpu/cpuX/topology" like core_id,
physical_package_id. Utility functions in "cpumap.c" fetches this
information and populates the socket id, core id, CPU etc.  If the
platform does not expose the topology information, these values will be
set to -1. Example, in case of powerpc, details like physical_package_id
is restricted to be exposed in pSeries platform. So id.socket, id.core,
id.cpu all will be set as -1.

In case of displaying socket or die value, there is no check done in the
"aggr_printout" function to see if it points to valid socket id or die.
But for displaying "cpu" value, there is a check for "if (id.core >
-1)". In case of powerpc pSeries where detail like physical_package_id
is restricted to be exposed, id.core will be set to -1. Hence the column
or field itself for CPU won't be displayed in the output.

Result for per-socket:

  perf stat -e branches --per-socket -a true

   Performance counter stats for 'system wide':

  S-1      32            416,851      branches

Here S has -1 in above result. But with -A option which also expects CPU
in one column in the result, below is observed.

  perf stat -e instructions -A -a true

   Performance counter stats for 'system wide':

            47,146      instructions
            45,226      instructions
            43,354      instructions
            45,184      instructions

If the CPU id value is pointing to -1 also, it makes sense to display
the column in the output to replicate the behaviour or to be in
precedence with other aggr options(like per-socket, per-core). Remove
the check "id.core" so that CPU field gets displayed in the output.

After the fix:

  perf stat -e instructions -A -a true

   Performance counter stats for 'system wide':

  CPU-1                  64,034      instructions
  CPU-1                  68,941      instructions
  CPU-1                  59,418      instructions
  CPU-1                  70,478      instructions
  CPU-1                  65,201      instructions
  CPU-1                  63,704      instructions

This is caught while running "perf test" for "stat+json_output.sh" and
"stat+csv_output.sh".

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220913115717.36191-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-display.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index df26fb5eb072be9f..5c47ee9963a7c04c 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
 					id.socket,
 					id.die,
 					id.core);
-			} else if (id.core > -1) {
+			} else if (id.cpu.cpu > -1) {
 				fprintf(config->output, "\"cpu\" : \"%d\", ",
 					id.cpu.cpu);
 			}
@@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
 					id.die,
 					config->csv_output ? 0 : -3,
 					id.core, config->csv_sep);
-			} else if (id.core > -1) {
+			} else if (id.cpu.cpu > -1) {
 				fprintf(config->output, "CPU%*d%s",
 					config->csv_output ? 0 : -7,
 					id.cpu.cpu, config->csv_sep);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-05 12:35                         ` Arnaldo Carvalho de Melo
@ 2022-10-06 12:46                           ` Athira Rajeev
  2022-10-06 14:03                             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2022-10-06 12:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	linux-perf-users, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev



> On 05-Oct-2022, at 6:05 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Wed, Oct 05, 2022 at 09:28:52AM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu:
>>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>>> index b82844cb0ce7..cf28020798ec 100644
>>> --- a/tools/perf/util/stat-display.c
>>> +++ b/tools/perf/util/stat-display.c
>>> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
>>> 					id.socket,
>>> 					id.die,
>>> 					id.core);
>>> -			} else if (id.core > -1) {
>>> +			} else if (id.cpu.cpu > -1) {
>>> 				fprintf(config->output, "\"cpu\" : \"%d\", ",
>>> 					id.cpu.cpu);
>>> 			}
>>> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
>>> 					id.die,
>>> 					config->csv_output ? 0 : -3,
>>> 					id.core, config->csv_sep);
>>> -			} else if (id.core > -1) {
>>> +			} else if (id.cpu.cpu > -1) {
>>> 				fprintf(config->output, "CPU%*d%s",
>>> 					config->csv_output ? 0 : -7,
>>> 					id.cpu.cpu, config->csv_sep);
>>> -- 
>>> If it is confusing, shall I send it as a separate patch along with Tested-by from Ian ?
>> 
>> I'll have to do this by hand, tried pointing b4 to this message and it
>> picked the old one, also tried to save the message and apply by hand,
>> its mangled.
> 
> This is what I have now, will force push later, please triple check :-)


Sorry for all the confusion Arnaldo. Hereafter, I will resubmit the patch to avoid this.
In below patch which you shared, code change is correct. But commit message is different.
So to avoid further confusion, I went ahead and posted a separate patch in the mailing list with:

subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout
Patch link: https://lore.kernel.org/lkml/20221006114225.66303-1-atrajeev@linux.vnet.ibm.com/T/#u

Please pick the separately send patch.

Thanks
Athira

> 
> - Arnaldo
> 
> From b7dd96f9211e4ddbd6fa080da8dec2eac98d3f2a Mon Sep 17 00:00:00 2001
> From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Date: Tue, 13 Sep 2022 17:27:17 +0530
> Subject: [PATCH 1/1] perf stat: Fix aggr_printout to display CPU field
> irrespective of core value
> 
> perf stat includes option to specify aggr_mode to display per-socket,
> per-core, per-die, per-node counter details.  Also there is option -A (
> AGGR_NONE, -no-aggr ), where the counter values are displayed for each
> CPU along with "CPU" value in one field of the output.
> 
> Each of the aggregate mode uses the information fetched from
> "/sys/devices/system/cpu/cpuX/topology" like core_id,
> physical_package_id. Utility functions in "cpumap.c" fetches this
> information and populates the socket id, core id, CPU etc.  If the
> platform does not expose the topology information, these values will be
> set to -1. Example, in case of powerpc, details like physical_package_id
> is restricted to be exposed in pSeries platform. So id.socket, id.core,
> id.cpu all will be set as -1.
> 
> In case of displaying socket or die value, there is no check done in the
> "aggr_printout" function to see if it points to valid socket id or die.
> But for displaying "cpu" value, there is a check for "if (id.core >
> -1)". In case of powerpc pSeries where detail like physical_package_id
> is restricted to be exposed, id.core will be set to -1. Hence the column
> or field itself for CPU won't be displayed in the output.
> 
> Result for per-socket:
> 
>  perf stat -e branches --per-socket -a true
> 
>   Performance counter stats for 'system wide':
> 
>  S-1      32            416,851      branches
> 
> Here S has -1 in above result. But with -A option which also expects CPU
> in one column in the result, below is observed.
> 
>  perf stat -e instructions -A -a true
> 
>   Performance counter stats for 'system wide':
> 
>            47,146      instructions
>            45,226      instructions
>            43,354      instructions
>            45,184      instructions
> 
> If the CPU id value is pointing to -1 also, it makes sense to display
> the column in the output to replicate the behaviour or to be in
> precedence with other aggr options(like per-socket, per-core). Remove
> the check "id.core" so that CPU field gets displayed in the output.
> 
> After the fix:
> 
>  perf stat -e instructions -A -a true
> 
>   Performance counter stats for 'system wide':
> 
>  CPU-1                  64,034      instructions
>  CPU-1                  68,941      instructions
>  CPU-1                  59,418      instructions
>  CPU-1                  70,478      instructions
>  CPU-1                  65,201      instructions
>  CPU-1                  63,704      instructions
> 
> This is caught while running "perf test" for "stat+json_output.sh" and
> "stat+csv_output.sh".
> 
> Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
> Suggested-by: Ian Rogers <irogers@google.com>
> Suggested-by: James Clark <james.clark@arm.com>
> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
> Tested-by: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kajol Jain <kjain@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Link: https://lore.kernel.org/r/20220913115717.36191-1-atrajeev@linux.vnet.ibm.com
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
> tools/perf/util/stat-display.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index df26fb5eb072be9f..5c47ee9963a7c04c 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
> 					id.socket,
> 					id.die,
> 					id.core);
> -			} else if (id.core > -1) {
> +			} else if (id.cpu.cpu > -1) {
> 				fprintf(config->output, "\"cpu\" : \"%d\", ",
> 					id.cpu.cpu);
> 			}
> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
> 					id.die,
> 					config->csv_output ? 0 : -3,
> 					id.core, config->csv_sep);
> -			} else if (id.core > -1) {
> +			} else if (id.cpu.cpu > -1) {
> 				fprintf(config->output, "CPU%*d%s",
> 					config->csv_output ? 0 : -7,
> 					id.cpu.cpu, config->csv_sep);
> -- 
> 2.37.3
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-06 12:46                           ` Athira Rajeev
@ 2022-10-06 14:03                             ` Arnaldo Carvalho de Melo
  2022-10-06 14:37                               ` Athira Rajeev
  0 siblings, 1 reply; 19+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-10-06 14:03 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	linux-perf-users, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev

Em Thu, Oct 06, 2022 at 06:16:14PM +0530, Athira Rajeev escreveu:
> 
> 
> > On 05-Oct-2022, at 6:05 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > 
> > Em Wed, Oct 05, 2022 at 09:28:52AM -0300, Arnaldo Carvalho de Melo escreveu:
> >> Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu:
> >>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> >>> index b82844cb0ce7..cf28020798ec 100644
> >>> --- a/tools/perf/util/stat-display.c
> >>> +++ b/tools/perf/util/stat-display.c
> >>> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
> >>> 					id.socket,
> >>> 					id.die,
> >>> 					id.core);
> >>> -			} else if (id.core > -1) {
> >>> +			} else if (id.cpu.cpu > -1) {
> >>> 				fprintf(config->output, "\"cpu\" : \"%d\", ",
> >>> 					id.cpu.cpu);
> >>> 			}
> >>> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
> >>> 					id.die,
> >>> 					config->csv_output ? 0 : -3,
> >>> 					id.core, config->csv_sep);
> >>> -			} else if (id.core > -1) {
> >>> +			} else if (id.cpu.cpu > -1) {
> >>> 				fprintf(config->output, "CPU%*d%s",
> >>> 					config->csv_output ? 0 : -7,
> >>> 					id.cpu.cpu, config->csv_sep);
> >>> -- 
> >>> If it is confusing, shall I send it as a separate patch along with Tested-by from Ian ?
> >> 
> >> I'll have to do this by hand, tried pointing b4 to this message and it
> >> picked the old one, also tried to save the message and apply by hand,
> >> its mangled.
> > 
> > This is what I have now, will force push later, please triple check :-)
> 
> 
> Sorry for all the confusion Arnaldo. Hereafter, I will resubmit the patch to avoid this.
> In below patch which you shared, code change is correct. But commit message is different.
> So to avoid further confusion, I went ahead and posted a separate patch in the mailing list with:
> 
> subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout
> Patch link: https://lore.kernel.org/lkml/20221006114225.66303-1-atrajeev@linux.vnet.ibm.com/T/#u
> 
> Please pick the separately send patch.

Ok, will do.

- Arnaldo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
  2022-10-06 14:03                             ` Arnaldo Carvalho de Melo
@ 2022-10-06 14:37                               ` Athira Rajeev
  0 siblings, 0 replies; 19+ messages in thread
From: Athira Rajeev @ 2022-10-06 14:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ian Rogers, maddy, Nageswara Sastry, Kajol Jain,
	linux-perf-users, James Clark, Jiri Olsa, atrajeev, Disha Goel,
	linuxppc-dev



> On 06-Oct-2022, at 7:33 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Thu, Oct 06, 2022 at 06:16:14PM +0530, Athira Rajeev escreveu:
>> 
>> 
>>> On 05-Oct-2022, at 6:05 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>> 
>>> Em Wed, Oct 05, 2022 at 09:28:52AM -0300, Arnaldo Carvalho de Melo escreveu:
>>>> Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu:
>>>>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>>>>> index b82844cb0ce7..cf28020798ec 100644
>>>>> --- a/tools/perf/util/stat-display.c
>>>>> +++ b/tools/perf/util/stat-display.c
>>>>> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config,
>>>>> 					id.socket,
>>>>> 					id.die,
>>>>> 					id.core);
>>>>> -			} else if (id.core > -1) {
>>>>> +			} else if (id.cpu.cpu > -1) {
>>>>> 				fprintf(config->output, "\"cpu\" : \"%d\", ",
>>>>> 					id.cpu.cpu);
>>>>> 			}
>>>>> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config,
>>>>> 					id.die,
>>>>> 					config->csv_output ? 0 : -3,
>>>>> 					id.core, config->csv_sep);
>>>>> -			} else if (id.core > -1) {
>>>>> +			} else if (id.cpu.cpu > -1) {
>>>>> 				fprintf(config->output, "CPU%*d%s",
>>>>> 					config->csv_output ? 0 : -7,
>>>>> 					id.cpu.cpu, config->csv_sep);
>>>>> -- 
>>>>> If it is confusing, shall I send it as a separate patch along with Tested-by from Ian ?
>>>> 
>>>> I'll have to do this by hand, tried pointing b4 to this message and it
>>>> picked the old one, also tried to save the message and apply by hand,
>>>> its mangled.
>>> 
>>> This is what I have now, will force push later, please triple check :-)
>> 
>> 
>> Sorry for all the confusion Arnaldo. Hereafter, I will resubmit the patch to avoid this.
>> In below patch which you shared, code change is correct. But commit message is different.
>> So to avoid further confusion, I went ahead and posted a separate patch in the mailing list with:
>> 
>> subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout
>> Patch link: https://lore.kernel.org/lkml/20221006114225.66303-1-atrajeev@linux.vnet.ibm.com/T/#u
>> 
>> Please pick the separately send patch.
> 
> Ok, will do.

Thanks Arnaldo.
> 
> - Arnaldo


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-10-06 14:38 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-13 11:57 [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value Athira Rajeev
2022-09-16 11:31 ` Disha Goel
2022-09-28 14:33   ` Athira Rajeev
2022-09-28 15:35 ` James Clark
2022-09-29  8:49   ` Athira Rajeev
2022-09-29 12:56     ` James Clark
2022-10-01 23:47       ` Ian Rogers
     [not found]         ` <993a1391ee931e859d972c460644d171@imap.linux.ibm.com>
2022-10-03 18:51           ` Ian Rogers
2022-10-04  7:06             ` Athira Rajeev
2022-10-04 14:49               ` Ian Rogers
2022-10-04 18:14                 ` Arnaldo Carvalho de Melo
2022-10-04 18:14                   ` Arnaldo Carvalho de Melo
2022-10-05  4:53                     ` Athira Rajeev
2022-10-05 12:24                       ` Arnaldo Carvalho de Melo
2022-10-05 12:28                       ` Arnaldo Carvalho de Melo
2022-10-05 12:35                         ` Arnaldo Carvalho de Melo
2022-10-06 12:46                           ` Athira Rajeev
2022-10-06 14:03                             ` Arnaldo Carvalho de Melo
2022-10-06 14:37                               ` Athira Rajeev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).