* [PATCH v2] perf stat: Improve runtime stat for interval mode
@ 2020-04-20 14:54 Jin Yao
2020-04-21 7:10 ` kajoljain
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Jin Yao @ 2020-04-20 14:54 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
For interval mode, the metric is printed after # if it exists. But
it's not calculated by the counts generated in this interval. See
following examples,
root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
# time counts unit events
1.000422803 764,809 inst_retired.any # 2.9 CPI
1.000422803 2,234,932 cycles
2.001464585 1,960,061 inst_retired.any # 1.6 CPI
2.001464585 4,022,591 cycles
The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
# time counts unit events
1.000429493 2,869,311 cycles
1.000429493 816,875 instructions # 0.28 insn per cycle
2.001516426 9,260,973 cycles
2.001516426 5,250,634 instructions # 0.87 insn per cycle
The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is 0.57).
The current code uses a global variable rt_stat for tracking and
updating the std dev of runtime stat. Unlike the counts, rt_stat is
not reset for interval. While the counts are reset for interval.
perf_stat_process_counter()
{
if (config->interval)
init_stats(ps->res_stats);
}
So for interval, the rt_stat should be reset either.
This patch resets rt_stat before read_counters, so the runtime
stat is only calculated by the counts generated in this interval.
With this patch,
root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
# time counts unit events
1.000420924 2,408,818 inst_retired.any # 2.1 CPI
1.000420924 5,010,111 cycles
2.001448579 2,798,407 inst_retired.any # 1.6 CPI
2.001448579 4,599,861 cycles
root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
# time counts unit events
1.000428555 2,769,714 cycles
1.000428555 774,462 instructions # 0.28 insn per cycle
2.001471562 3,595,904 cycles
2.001471562 1,243,703 instructions # 0.35 insn per cycle
Now the second 'insn per cycle' and CPI are calculated by the counts
generated in this interval.
v2:
---
Use just existing perf_stat__reset_shadow_per_stat(&rt_stat).
We don't need to define new function perf_stat__reset_rt_stat.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/Documentation/perf-stat.txt | 2 ++
tools/perf/builtin-stat.c | 1 +
2 files changed, 3 insertions(+)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4d56586b2fb9..3fb5028aef08 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -176,6 +176,8 @@ Print count deltas every N milliseconds (minimum: 1ms)
The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. Use with caution.
example: 'perf stat -I 1000 -e cycles -a sleep 5'
+If the metric exists, it is calculated by the counts generated in this interval and the metric is printed after #.
+
--interval-count times::
Print count deltas for fixed number of times.
This option should be used together with "-I" option.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9207b6c45475..3f050d85c277 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -359,6 +359,7 @@ static void process_interval(void)
clock_gettime(CLOCK_MONOTONIC, &ts);
diff_timespec(&rs, &ts, &ref_time);
+ perf_stat__reset_shadow_per_stat(&rt_stat);
read_counters(&rs);
if (STAT_RECORD) {
--
2.17.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] perf stat: Improve runtime stat for interval mode
2020-04-20 14:54 [PATCH v2] perf stat: Improve runtime stat for interval mode Jin Yao
@ 2020-04-21 7:10 ` kajoljain
2020-04-21 13:53 ` Arnaldo Carvalho de Melo
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: kajoljain @ 2020-04-21 7:10 UTC (permalink / raw)
To: Jin Yao, acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin
On 4/20/20 8:24 PM, Jin Yao wrote:
> For interval mode, the metric is printed after # if it exists. But
> it's not calculated by the counts generated in this interval. See
> following examples,
>
> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
> # time counts unit events
> 1.000422803 764,809 inst_retired.any # 2.9 CPI
> 1.000422803 2,234,932 cycles
> 2.001464585 1,960,061 inst_retired.any # 1.6 CPI
> 2.001464585 4,022,591 cycles
>
> The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
>
> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
> # time counts unit events
> 1.000429493 2,869,311 cycles
> 1.000429493 816,875 instructions # 0.28 insn per cycle
> 2.001516426 9,260,973 cycles
> 2.001516426 5,250,634 instructions # 0.87 insn per cycle
>
> The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is 0.57).
>
> The current code uses a global variable rt_stat for tracking and
> updating the std dev of runtime stat. Unlike the counts, rt_stat is
> not reset for interval. While the counts are reset for interval.
>
> perf_stat_process_counter()
> {
> if (config->interval)
> init_stats(ps->res_stats);
> }
>
> So for interval, the rt_stat should be reset either.
>
> This patch resets rt_stat before read_counters, so the runtime
> stat is only calculated by the counts generated in this interval.
>
> With this patch,
>
> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
> # time counts unit events
> 1.000420924 2,408,818 inst_retired.any # 2.1 CPI
> 1.000420924 5,010,111 cycles
> 2.001448579 2,798,407 inst_retired.any # 1.6 CPI
> 2.001448579 4,599,861 cycles
>
> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
> # time counts unit events
> 1.000428555 2,769,714 cycles
> 1.000428555 774,462 instructions # 0.28 insn per cycle
> 2.001471562 3,595,904 cycles
> 2.001471562 1,243,703 instructions # 0.35 insn per cycle
>
> Now the second 'insn per cycle' and CPI are calculated by the counts
> generated in this interval.
>
> v2:
> ---
> Use just existing perf_stat__reset_shadow_per_stat(&rt_stat).
> We don't need to define new function perf_stat__reset_rt_stat.
>
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
> tools/perf/Documentation/perf-stat.txt | 2 ++
> tools/perf/builtin-stat.c | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 4d56586b2fb9..3fb5028aef08 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -176,6 +176,8 @@ Print count deltas every N milliseconds (minimum: 1ms)
> The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. Use with caution.
> example: 'perf stat -I 1000 -e cycles -a sleep 5'
>
> +If the metric exists, it is calculated by the counts generated in this interval and the metric is printed after #.
> +
> --interval-count times::
> Print count deltas for fixed number of times.
> This option should be used together with "-I" option.
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 9207b6c45475..3f050d85c277 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -359,6 +359,7 @@ static void process_interval(void)
> clock_gettime(CLOCK_MONOTONIC, &ts);
> diff_timespec(&rs, &ts, &ref_time);
>
> + perf_stat__reset_shadow_per_stat(&rt_stat);
Tested-By: Kajol Jain <kjain@linux.ibm.com>
Thanks,
Kajol
> read_counters(&rs);
>
> if (STAT_RECORD) {
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] perf stat: Improve runtime stat for interval mode
2020-04-20 14:54 [PATCH v2] perf stat: Improve runtime stat for interval mode Jin Yao
2020-04-21 7:10 ` kajoljain
@ 2020-04-21 13:53 ` Arnaldo Carvalho de Melo
2020-04-22 0:53 ` Jin, Yao
2020-04-22 7:22 ` Jiri Olsa
2020-05-08 13:05 ` [tip: perf/core] " tip-bot2 for Jin Yao
3 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-21 13:53 UTC (permalink / raw)
To: Jin Yao
Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
Em Mon, Apr 20, 2020 at 10:54:17PM +0800, Jin Yao escreveu:
> For interval mode, the metric is printed after # if it exists. But
> it's not calculated by the counts generated in this interval. See
> following examples,
>
> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
> # time counts unit events
> 1.000422803 764,809 inst_retired.any # 2.9 CPI
> 1.000422803 2,234,932 cycles
> 2.001464585 1,960,061 inst_retired.any # 1.6 CPI
> 2.001464585 4,022,591 cycles
>
> The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
>
> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
> # time counts unit events
> 1.000429493 2,869,311 cycles
> 1.000429493 816,875 instructions # 0.28 insn per cycle
> 2.001516426 9,260,973 cycles
> 2.001516426 5,250,634 instructions # 0.87 insn per cycle
>
> The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is 0.57).
>
> The current code uses a global variable rt_stat for tracking and
> updating the std dev of runtime stat. Unlike the counts, rt_stat is
> not reset for interval. While the counts are reset for interval.
>
> perf_stat_process_counter()
> {
> if (config->interval)
> init_stats(ps->res_stats);
> }
>
> So for interval, the rt_stat should be reset either.
s/either/too/g right?
And please try and find what was the cset that introduced the problem,
so that we can have a Fixes: line and the stable series can pick it, ok?
- Arnaldo
> This patch resets rt_stat before read_counters, so the runtime
> stat is only calculated by the counts generated in this interval.
>
> With this patch,
>
> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
> # time counts unit events
> 1.000420924 2,408,818 inst_retired.any # 2.1 CPI
> 1.000420924 5,010,111 cycles
> 2.001448579 2,798,407 inst_retired.any # 1.6 CPI
> 2.001448579 4,599,861 cycles
>
> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
> # time counts unit events
> 1.000428555 2,769,714 cycles
> 1.000428555 774,462 instructions # 0.28 insn per cycle
> 2.001471562 3,595,904 cycles
> 2.001471562 1,243,703 instructions # 0.35 insn per cycle
>
> Now the second 'insn per cycle' and CPI are calculated by the counts
> generated in this interval.
>
> v2:
> ---
> Use just existing perf_stat__reset_shadow_per_stat(&rt_stat).
> We don't need to define new function perf_stat__reset_rt_stat.
>
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
> tools/perf/Documentation/perf-stat.txt | 2 ++
> tools/perf/builtin-stat.c | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 4d56586b2fb9..3fb5028aef08 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -176,6 +176,8 @@ Print count deltas every N milliseconds (minimum: 1ms)
> The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. Use with caution.
> example: 'perf stat -I 1000 -e cycles -a sleep 5'
>
> +If the metric exists, it is calculated by the counts generated in this interval and the metric is printed after #.
> +
> --interval-count times::
> Print count deltas for fixed number of times.
> This option should be used together with "-I" option.
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 9207b6c45475..3f050d85c277 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -359,6 +359,7 @@ static void process_interval(void)
> clock_gettime(CLOCK_MONOTONIC, &ts);
> diff_timespec(&rs, &ts, &ref_time);
>
> + perf_stat__reset_shadow_per_stat(&rt_stat);
> read_counters(&rs);
>
> if (STAT_RECORD) {
> --
> 2.17.1
>
--
- Arnaldo
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] perf stat: Improve runtime stat for interval mode
2020-04-21 13:53 ` Arnaldo Carvalho de Melo
@ 2020-04-22 0:53 ` Jin, Yao
2020-04-22 1:08 ` Arnaldo Melo
2020-04-23 13:44 ` Arnaldo Carvalho de Melo
0 siblings, 2 replies; 8+ messages in thread
From: Jin, Yao @ 2020-04-22 0:53 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
Hi Arnaldo,
On 4/21/2020 9:53 PM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Apr 20, 2020 at 10:54:17PM +0800, Jin Yao escreveu:
>> For interval mode, the metric is printed after # if it exists. But
>> it's not calculated by the counts generated in this interval. See
>> following examples,
>>
>> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
>> # time counts unit events
>> 1.000422803 764,809 inst_retired.any # 2.9 CPI
>> 1.000422803 2,234,932 cycles
>> 2.001464585 1,960,061 inst_retired.any # 1.6 CPI
>> 2.001464585 4,022,591 cycles
>>
>> The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
>>
>> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
>> # time counts unit events
>> 1.000429493 2,869,311 cycles
>> 1.000429493 816,875 instructions # 0.28 insn per cycle
>> 2.001516426 9,260,973 cycles
>> 2.001516426 5,250,634 instructions # 0.87 insn per cycle
>>
>> The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is 0.57).
>>
>> The current code uses a global variable rt_stat for tracking and
>> updating the std dev of runtime stat. Unlike the counts, rt_stat is
>> not reset for interval. While the counts are reset for interval.
>>
>> perf_stat_process_counter()
>> {
>> if (config->interval)
>> init_stats(ps->res_stats);
>> }
>>
>> So for interval, the rt_stat should be reset either.
>
> s/either/too/g right?
>
Yes, should use "too" here. :)
> And please try and find what was the cset that introduced the problem,
> so that we can have a Fixes: line and the stable series can pick it, ok?
>
> - Arnaldo
>
I have tried to find the patch which introduced this issue.
51fd2df1e882 ("perf stat: Fix interval output values").
This patch zeros stats for interval mode. I just think it should reset
rt_stat too.
But I really don't know if it's fair to this patch so I don't add it in
my patch description.
Thanks
Jin Yao
>> This patch resets rt_stat before read_counters, so the runtime
>> stat is only calculated by the counts generated in this interval.
>>
>> With this patch,
>>
>> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
>> # time counts unit events
>> 1.000420924 2,408,818 inst_retired.any # 2.1 CPI
>> 1.000420924 5,010,111 cycles
>> 2.001448579 2,798,407 inst_retired.any # 1.6 CPI
>> 2.001448579 4,599,861 cycles
>>
>> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
>> # time counts unit events
>> 1.000428555 2,769,714 cycles
>> 1.000428555 774,462 instructions # 0.28 insn per cycle
>> 2.001471562 3,595,904 cycles
>> 2.001471562 1,243,703 instructions # 0.35 insn per cycle
>>
>> Now the second 'insn per cycle' and CPI are calculated by the counts
>> generated in this interval.
>>
>> v2:
>> ---
>> Use just existing perf_stat__reset_shadow_per_stat(&rt_stat).
>> We don't need to define new function perf_stat__reset_rt_stat.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>> tools/perf/Documentation/perf-stat.txt | 2 ++
>> tools/perf/builtin-stat.c | 1 +
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>> index 4d56586b2fb9..3fb5028aef08 100644
>> --- a/tools/perf/Documentation/perf-stat.txt
>> +++ b/tools/perf/Documentation/perf-stat.txt
>> @@ -176,6 +176,8 @@ Print count deltas every N milliseconds (minimum: 1ms)
>> The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. Use with caution.
>> example: 'perf stat -I 1000 -e cycles -a sleep 5'
>>
>> +If the metric exists, it is calculated by the counts generated in this interval and the metric is printed after #.
>> +
>> --interval-count times::
>> Print count deltas for fixed number of times.
>> This option should be used together with "-I" option.
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 9207b6c45475..3f050d85c277 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -359,6 +359,7 @@ static void process_interval(void)
>> clock_gettime(CLOCK_MONOTONIC, &ts);
>> diff_timespec(&rs, &ts, &ref_time);
>>
>> + perf_stat__reset_shadow_per_stat(&rt_stat);
>> read_counters(&rs);
>>
>> if (STAT_RECORD) {
>> --
>> 2.17.1
>>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] perf stat: Improve runtime stat for interval mode
2020-04-22 0:53 ` Jin, Yao
@ 2020-04-22 1:08 ` Arnaldo Melo
2020-04-23 13:44 ` Arnaldo Carvalho de Melo
1 sibling, 0 replies; 8+ messages in thread
From: Arnaldo Melo @ 2020-04-22 1:08 UTC (permalink / raw)
To: Jin, Yao
Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On April 21, 2020 9:53:41 PM GMT-03:00, "Jin, Yao" <yao.jin@linux.intel.com> wrote:
>Hi Arnaldo,
>
>On 4/21/2020 9:53 PM, Arnaldo Carvalho de Melo wrote:
>> Em Mon, Apr 20, 2020 at 10:54:17PM +0800, Jin Yao escreveu:
>>> For interval mode, the metric is printed after # if it exists. But
>>> it's not calculated by the counts generated in this interval. See
>>> following examples,
>>>
>>> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
>>> # time counts unit events
>>> 1.000422803 764,809 inst_retired.any
># 2.9 CPI
>>> 1.000422803 2,234,932 cycles
>>> 2.001464585 1,960,061 inst_retired.any
># 1.6 CPI
>>> 2.001464585 4,022,591 cycles
>>>
>>> The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
>>>
>>> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000
>--interval-count 2
>>> # time counts unit events
>>> 1.000429493 2,869,311 cycles
>>> 1.000429493 816,875 instructions
># 0.28 insn per cycle
>>> 2.001516426 9,260,973 cycles
>>> 2.001516426 5,250,634 instructions
># 0.87 insn per cycle
>>>
>>> The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973
>is 0.57).
>>>
>>> The current code uses a global variable rt_stat for tracking and
>>> updating the std dev of runtime stat. Unlike the counts, rt_stat is
>>> not reset for interval. While the counts are reset for interval.
>>>
>>> perf_stat_process_counter()
>>> {
>>> if (config->interval)
>>> init_stats(ps->res_stats);
>>> }
>>>
>>> So for interval, the rt_stat should be reset either.
>>
>> s/either/too/g right?
>>
>
>Yes, should use "too" here. :)
Ok
>
>> And please try and find what was the cset that introduced the
>problem,
>> so that we can have a Fixes: line and the stable series can pick it,
>ok?
> >
>> - Arnaldo
>>
>
>I have tried to find the patch which introduced this issue.
>
>51fd2df1e882 ("perf stat: Fix interval output values").
>
>This patch zeros stats for interval mode. I just think it should reset
>rt_stat too.
>
>But I really don't know if it's fair to this patch so I don't add it in
>
>my patch description.
That's ok, this just speeds up the process, I'll check it :-)
- Arnaldo
>
>Thanks
>Jin Yao
>
>>> This patch resets rt_stat before read_counters, so the runtime
>>> stat is only calculated by the counts generated in this interval.
>>>
>>> With this patch,
>>>
>>> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
>>> # time counts unit events
>>> 1.000420924 2,408,818 inst_retired.any
># 2.1 CPI
>>> 1.000420924 5,010,111 cycles
>>> 2.001448579 2,798,407 inst_retired.any
># 1.6 CPI
>>> 2.001448579 4,599,861 cycles
>>>
>>> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000
>--interval-count 2
>>> # time counts unit events
>>> 1.000428555 2,769,714 cycles
>>> 1.000428555 774,462 instructions
># 0.28 insn per cycle
>>> 2.001471562 3,595,904 cycles
>>> 2.001471562 1,243,703 instructions
># 0.35 insn per cycle
>>>
>>> Now the second 'insn per cycle' and CPI are calculated by the counts
>>> generated in this interval.
>>>
>>> v2:
>>> ---
>>> Use just existing perf_stat__reset_shadow_per_stat(&rt_stat).
>>> We don't need to define new function perf_stat__reset_rt_stat.
>>>
>>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>>> ---
>>> tools/perf/Documentation/perf-stat.txt | 2 ++
>>> tools/perf/builtin-stat.c | 1 +
>>> 2 files changed, 3 insertions(+)
>>>
>>> diff --git a/tools/perf/Documentation/perf-stat.txt
>b/tools/perf/Documentation/perf-stat.txt
>>> index 4d56586b2fb9..3fb5028aef08 100644
>>> --- a/tools/perf/Documentation/perf-stat.txt
>>> +++ b/tools/perf/Documentation/perf-stat.txt
>>> @@ -176,6 +176,8 @@ Print count deltas every N milliseconds
>(minimum: 1ms)
>>> The overhead percentage could be high in some cases, for instance
>with small, sub 100ms intervals. Use with caution.
>>> example: 'perf stat -I 1000 -e cycles -a sleep 5'
>>>
>>> +If the metric exists, it is calculated by the counts generated in
>this interval and the metric is printed after #.
>>> +
>>> --interval-count times::
>>> Print count deltas for fixed number of times.
>>> This option should be used together with "-I" option.
>>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>>> index 9207b6c45475..3f050d85c277 100644
>>> --- a/tools/perf/builtin-stat.c
>>> +++ b/tools/perf/builtin-stat.c
>>> @@ -359,6 +359,7 @@ static void process_interval(void)
>>> clock_gettime(CLOCK_MONOTONIC, &ts);
>>> diff_timespec(&rs, &ts, &ref_time);
>>>
>>> + perf_stat__reset_shadow_per_stat(&rt_stat);
>>> read_counters(&rs);
>>>
>>> if (STAT_RECORD) {
>>> --
>>> 2.17.1
>>>
>>
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] perf stat: Improve runtime stat for interval mode
2020-04-20 14:54 [PATCH v2] perf stat: Improve runtime stat for interval mode Jin Yao
2020-04-21 7:10 ` kajoljain
2020-04-21 13:53 ` Arnaldo Carvalho de Melo
@ 2020-04-22 7:22 ` Jiri Olsa
2020-05-08 13:05 ` [tip: perf/core] " tip-bot2 for Jin Yao
3 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2020-04-22 7:22 UTC (permalink / raw)
To: Jin Yao
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On Mon, Apr 20, 2020 at 10:54:17PM +0800, Jin Yao wrote:
> For interval mode, the metric is printed after # if it exists. But
> it's not calculated by the counts generated in this interval. See
> following examples,
>
> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
> # time counts unit events
> 1.000422803 764,809 inst_retired.any # 2.9 CPI
> 1.000422803 2,234,932 cycles
> 2.001464585 1,960,061 inst_retired.any # 1.6 CPI
> 2.001464585 4,022,591 cycles
>
> The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
>
> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
> # time counts unit events
> 1.000429493 2,869,311 cycles
> 1.000429493 816,875 instructions # 0.28 insn per cycle
> 2.001516426 9,260,973 cycles
> 2.001516426 5,250,634 instructions # 0.87 insn per cycle
>
> The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is 0.57).
>
> The current code uses a global variable rt_stat for tracking and
> updating the std dev of runtime stat. Unlike the counts, rt_stat is
> not reset for interval. While the counts are reset for interval.
>
> perf_stat_process_counter()
> {
> if (config->interval)
> init_stats(ps->res_stats);
> }
>
> So for interval, the rt_stat should be reset either.
>
> This patch resets rt_stat before read_counters, so the runtime
> stat is only calculated by the counts generated in this interval.
>
> With this patch,
>
> root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
> # time counts unit events
> 1.000420924 2,408,818 inst_retired.any # 2.1 CPI
> 1.000420924 5,010,111 cycles
> 2.001448579 2,798,407 inst_retired.any # 1.6 CPI
> 2.001448579 4,599,861 cycles
>
> root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
> # time counts unit events
> 1.000428555 2,769,714 cycles
> 1.000428555 774,462 instructions # 0.28 insn per cycle
> 2.001471562 3,595,904 cycles
> 2.001471562 1,243,703 instructions # 0.35 insn per cycle
>
> Now the second 'insn per cycle' and CPI are calculated by the counts
> generated in this interval.
>
> v2:
> ---
> Use just existing perf_stat__reset_shadow_per_stat(&rt_stat).
> We don't need to define new function perf_stat__reset_rt_stat.
Acked-by: Jiri Olsa <jolsa@redhat.com>
thanks,
jirka
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] perf stat: Improve runtime stat for interval mode
2020-04-22 0:53 ` Jin, Yao
2020-04-22 1:08 ` Arnaldo Melo
@ 2020-04-23 13:44 ` Arnaldo Carvalho de Melo
1 sibling, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-23 13:44 UTC (permalink / raw)
To: Jin, Yao
Cc: Arnaldo Carvalho de Melo, jolsa, peterz, mingo,
alexander.shishkin, Linux-kernel, ak, kan.liang, yao.jin
Em Wed, Apr 22, 2020 at 08:53:41AM +0800, Jin, Yao escreveu:
> Hi Arnaldo,
>
> On 4/21/2020 9:53 PM, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Apr 20, 2020 at 10:54:17PM +0800, Jin Yao escreveu:
> > > For interval mode, the metric is printed after # if it exists. But
> > > it's not calculated by the counts generated in this interval. See
> > > following examples,
> > >
> > > root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
> > > # time counts unit events
> > > 1.000422803 764,809 inst_retired.any # 2.9 CPI
> > > 1.000422803 2,234,932 cycles
> > > 2.001464585 1,960,061 inst_retired.any # 1.6 CPI
> > > 2.001464585 4,022,591 cycles
> > >
> > > The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
> > >
> > > root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
> > > # time counts unit events
> > > 1.000429493 2,869,311 cycles
> > > 1.000429493 816,875 instructions # 0.28 insn per cycle
> > > 2.001516426 9,260,973 cycles
> > > 2.001516426 5,250,634 instructions # 0.87 insn per cycle
> > >
> > > The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is 0.57).
> > >
> > > The current code uses a global variable rt_stat for tracking and
> > > updating the std dev of runtime stat. Unlike the counts, rt_stat is
> > > not reset for interval. While the counts are reset for interval.
> > >
> > > perf_stat_process_counter()
> > > {
> > > if (config->interval)
> > > init_stats(ps->res_stats);
> > > }
> > >
> > > So for interval, the rt_stat should be reset either.
> >
> > s/either/too/g right?
> >
>
> Yes, should use "too" here. :)
>
> > And please try and find what was the cset that introduced the problem,
> > so that we can have a Fixes: line and the stable series can pick it, ok?
> >
> > - Arnaldo
> >
>
> I have tried to find the patch which introduced this issue.
>
> 51fd2df1e882 ("perf stat: Fix interval output values").
>
> This patch zeros stats for interval mode. I just think it should reset
> rt_stat too.
> But I really don't know if it's fair to this patch so I don't add it in my
> patch description.
I think you are right that 51fd2df1e882 patch missed the fix you're
providing now, so I think it is fair to say that your current patch
fixes 51fd2df1e882, so I added it in the Fixes tag for the patch
discussed in this message, thanks.
- Arnaldo
> Thanks
> Jin Yao
>
> > > This patch resets rt_stat before read_counters, so the runtime
> > > stat is only calculated by the counts generated in this interval.
> > >
> > > With this patch,
> > >
> > > root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
> > > # time counts unit events
> > > 1.000420924 2,408,818 inst_retired.any # 2.1 CPI
> > > 1.000420924 5,010,111 cycles
> > > 2.001448579 2,798,407 inst_retired.any # 1.6 CPI
> > > 2.001448579 4,599,861 cycles
> > >
> > > root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
> > > # time counts unit events
> > > 1.000428555 2,769,714 cycles
> > > 1.000428555 774,462 instructions # 0.28 insn per cycle
> > > 2.001471562 3,595,904 cycles
> > > 2.001471562 1,243,703 instructions # 0.35 insn per cycle
> > >
> > > Now the second 'insn per cycle' and CPI are calculated by the counts
> > > generated in this interval.
> > >
> > > v2:
> > > ---
> > > Use just existing perf_stat__reset_shadow_per_stat(&rt_stat).
> > > We don't need to define new function perf_stat__reset_rt_stat.
> > >
> > > Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> > > ---
> > > tools/perf/Documentation/perf-stat.txt | 2 ++
> > > tools/perf/builtin-stat.c | 1 +
> > > 2 files changed, 3 insertions(+)
> > >
> > > diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> > > index 4d56586b2fb9..3fb5028aef08 100644
> > > --- a/tools/perf/Documentation/perf-stat.txt
> > > +++ b/tools/perf/Documentation/perf-stat.txt
> > > @@ -176,6 +176,8 @@ Print count deltas every N milliseconds (minimum: 1ms)
> > > The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. Use with caution.
> > > example: 'perf stat -I 1000 -e cycles -a sleep 5'
> > > +If the metric exists, it is calculated by the counts generated in this interval and the metric is printed after #.
> > > +
> > > --interval-count times::
> > > Print count deltas for fixed number of times.
> > > This option should be used together with "-I" option.
> > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > > index 9207b6c45475..3f050d85c277 100644
> > > --- a/tools/perf/builtin-stat.c
> > > +++ b/tools/perf/builtin-stat.c
> > > @@ -359,6 +359,7 @@ static void process_interval(void)
> > > clock_gettime(CLOCK_MONOTONIC, &ts);
> > > diff_timespec(&rs, &ts, &ref_time);
> > > + perf_stat__reset_shadow_per_stat(&rt_stat);
> > > read_counters(&rs);
> > > if (STAT_RECORD) {
> > > --
> > > 2.17.1
> > >
> >
--
- Arnaldo
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tip: perf/core] perf stat: Improve runtime stat for interval mode
2020-04-20 14:54 [PATCH v2] perf stat: Improve runtime stat for interval mode Jin Yao
` (2 preceding siblings ...)
2020-04-22 7:22 ` Jiri Olsa
@ 2020-05-08 13:05 ` tip-bot2 for Jin Yao
3 siblings, 0 replies; 8+ messages in thread
From: tip-bot2 for Jin Yao @ 2020-05-08 13:05 UTC (permalink / raw)
To: linux-tip-commits
Cc: Jin Yao, Jiri Olsa, Alexander Shishkin, Andi Kleen, Jin Yao,
Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo, x86, LKML
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 197ba86fdc888dc0d3d6b89b402c9c6851d4c6fb
Gitweb: https://git.kernel.org/tip/197ba86fdc888dc0d3d6b89b402c9c6851d4c6fb
Author: Jin Yao <yao.jin@linux.intel.com>
AuthorDate: Mon, 20 Apr 2020 22:54:17 +08:00
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 23 Apr 2020 11:03:46 -03:00
perf stat: Improve runtime stat for interval mode
For interval mode, the metric is printed after the '#' character if it
exists. But it's not calculated by the counts generated in this
interval.
See the following examples:
root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
# time counts unit events
1.000422803 764,809 inst_retired.any # 2.9 CPI
1.000422803 2,234,932 cycles
2.001464585 1,960,061 inst_retired.any # 1.6 CPI
2.001464585 4,022,591 cycles
The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
# time counts unit events
1.000429493 2,869,311 cycles
1.000429493 816,875 instructions # 0.28 insn per cycle
2.001516426 9,260,973 cycles
2.001516426 5,250,634 instructions # 0.87 insn per cycle
The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is
0.57).
The current code uses a global variable 'rt_stat' for tracking and
updating the std dev of runtime stat. Unlike the counts, 'rt_stat' is not
reset for interval. While the counts are reset for interval.
perf_stat_process_counter()
{
if (config->interval)
init_stats(ps->res_stats);
}
So for interval mode, the 'rt_stat' variable should be reset too.
This patch resets 'rt_stat' before read_counters(), so the runtime stat
is only calculated by the counts generated in this interval.
With this patch:
root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
# time counts unit events
1.000420924 2,408,818 inst_retired.any # 2.1 CPI
1.000420924 5,010,111 cycles
2.001448579 2,798,407 inst_retired.any # 1.6 CPI
2.001448579 4,599,861 cycles
root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
# time counts unit events
1.000428555 2,769,714 cycles
1.000428555 774,462 instructions # 0.28 insn per cycle
2.001471562 3,595,904 cycles
2.001471562 1,243,703 instructions # 0.35 insn per cycle
Now the second 'insn per cycle' and CPI are calculated by the counts
generated in this interval.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-By: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200420145417.6864-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Documentation/perf-stat.txt | 2 ++
tools/perf/builtin-stat.c | 1 +
2 files changed, 3 insertions(+)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4d56586..3fb5028 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -176,6 +176,8 @@ Print count deltas every N milliseconds (minimum: 1ms)
The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. Use with caution.
example: 'perf stat -I 1000 -e cycles -a sleep 5'
+If the metric exists, it is calculated by the counts generated in this interval and the metric is printed after #.
+
--interval-count times::
Print count deltas for fixed number of times.
This option should be used together with "-I" option.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9207b6c..3f050d8 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -359,6 +359,7 @@ static void process_interval(void)
clock_gettime(CLOCK_MONOTONIC, &ts);
diff_timespec(&rs, &ts, &ref_time);
+ perf_stat__reset_shadow_per_stat(&rt_stat);
read_counters(&rs);
if (STAT_RECORD) {
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-05-08 13:08 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-20 14:54 [PATCH v2] perf stat: Improve runtime stat for interval mode Jin Yao
2020-04-21 7:10 ` kajoljain
2020-04-21 13:53 ` Arnaldo Carvalho de Melo
2020-04-22 0:53 ` Jin, Yao
2020-04-22 1:08 ` Arnaldo Melo
2020-04-23 13:44 ` Arnaldo Carvalho de Melo
2020-04-22 7:22 ` Jiri Olsa
2020-05-08 13:05 ` [tip: perf/core] " tip-bot2 for Jin Yao
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.