linux-perf-users.vger.kernel.org archive mirror
* perf tool: About tests debug level
@ 2021-06-21 11:20 John Garry
  2021-06-22  5:04 ` Ian Rogers
  0 siblings, 1 reply; 5+ messages in thread
From: John Garry @ 2021-06-21 11:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, irogers, linux-perf-users,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Jin Yao

Hi guys,

I noticed, when it was recently discovered that the new icelake JSON 
did not pass the "PMU events" test, that running the test with -v makes 
pr_err() and pr_debug() messages come out at the same level, so it is 
hard to distinguish the important logs.

Here is a link:
https://lore.kernel.org/lkml/YLdq%2FH8CXYgHWzCL@kernel.org/

And here is an extract:

parsing '(cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc) * msr@tsc@ 
/ 1000000000 / duration_time'
parsing '( ( 1 * ( fp_arith_inst_retired.scalar_single + 
fp_arith_inst_retired.scalar_double ) + 2 * 
fp_arith_inst_retired.128b_packed_double + 4 * ( 
fp_arith_inst_retired.128b_packed_single + 
fp_arith_inst_retired.256b_packed_double ) + 8 * ( 
fp_arith_inst_retired.256b_packed_single + 
fp_arith_inst_retired.512b_packed_double ) + 16 * 
fp_arith_inst_retired.512b_packed_single ) / 1000000000 ) / duration_time'
parsing 'cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc'
parsing '1 - cpu_clk_unhalted.one_thread_active / 
cpu_clk_unhalted.ref_distributed'
parsing 'cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread'
parsing '( 64 * ( uncore_imc@cas_count_read@ + 
uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time'
parsing '1000000000 * ( cha@event\=0x36\,umask\=0x21\,config\=0x40433@ / 
cha@event\=0x35\,umask\=0x21\,config\=0x40433@ ) / ( cha_0@event\=0x0@ / 
duration_time )'
parsing 'cha@event\=0x36\,umask\=0x21\,config\=0x40433@ / 
cha@event\=0x36\,umask\=0x21\,config\=0x40433\,thresh\=1@'
parsing '( 1000000000 * ( 
cha@event\=0x36\,umask\=0x21\,config\=0x40433@_pmm / 
cha@event\=0x35\,umask\=0x21\,config\=0x40433@_pmm ) / cha_0@event\=0x0@ 
)' [pr_debug]
check_parse_fake failed [pr_err]
test child finished with -1
---- end ----

I annotated the originating print functions in [], above.
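
For reference, the mechanism behind this boils down to something like 
the following sketch. This is a hand-written simplification, not the 
actual tools/perf/util/debug.c source: pr_err() is level 0, pr_debug() 
is level 1, and a message is emitted whenever the verbose count (one 
per -v) is >= its level:

```c
#include <stdarg.h>
#include <stdio.h>

/* Simplified stand-in for perf's eprintf()/pr_*() machinery: a message
 * is printed when the global verbose count reaches the message's level. */
static int verbose;	/* bumped once per -v on the command line */

static int eprintf(int level, const char *fmt, ...)
{
	int ret = 0;

	if (verbose >= level) {
		va_list args;

		va_start(args, fmt);
		ret = vfprintf(stderr, fmt, args);
		va_end(args);
	}
	return ret;
}

#define pr_err(fmt, ...)	eprintf(0, fmt, ##__VA_ARGS__)
#define pr_debug(fmt, ...)	eprintf(1, fmt, ##__VA_ARGS__)
```

With verbose == 1, both macros print to the same stderr stream with no 
marker, which is why the extract above needs manual annotation.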

As for another issue: if you consider "Parse and process metrics", debug 
output from core code comes out at the same level as that from test 
code, i.e. with -v we see pr_debug() from both the test code and the 
core code. Again, this makes it hard to distinguish the various levels. 
As an illustration, here I hack the code to fail a test:

sudo ./perf test -v 68
68: Parse and process metrics                                       :
--- start ---
test child forked, pid 9747
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
found event inst_retired.any verbose=1 pr_debug
found event inst_retired.any verbose=1 pr_err
found event cpu_clk_unhalted.thread verbose=1 pr_debug
found event cpu_clk_unhalted.thread verbose=1 pr_err
adding {inst_retired.any,cpu_clk_unhalted.thread}:W
FAILED tests/parse-metric.c:223 IPC failed, wrong ratio
FAILED tests/parse-metric.c:374 IPC failed
test child finished with -1
---- end ----
Parse and process metrics: FAILED!

Note that the "FAILED" messages from the test code come from pr_debug().

In a way, I feel that pr_debug()/err from the test is more important 
than pr_debug() from the core code (when running a test).

Any opinion on this or how to improve (if anyone agrees with me)? Or am 
I missing something? Or is it not so important?

Thanks,
John

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: perf tool: About tests debug level
  2021-06-21 11:20 perf tool: About tests debug level John Garry
@ 2021-06-22  5:04 ` Ian Rogers
  2021-06-22 11:52   ` John Garry
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Rogers @ 2021-06-22  5:04 UTC (permalink / raw)
  To: John Garry
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-perf-users,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Jin Yao

On Mon, Jun 21, 2021 at 4:26 AM John Garry <john.garry@huawei.com> wrote:
>
> Hi guys,
>
> I noticed, when it was recently discovered that the new icelake JSON
> did not pass the "PMU events" test, that running the test with -v makes
> pr_err() and pr_debug() messages come out at the same level, so it is
> hard to distinguish the important logs.
>
> Here is a link:
> https://lore.kernel.org/lkml/YLdq%2FH8CXYgHWzCL@kernel.org/
>
> And here is an extract:
>
> parsing '(cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc) * msr@tsc@
> / 1000000000 / duration_time'
> parsing '( ( 1 * ( fp_arith_inst_retired.scalar_single +
> fp_arith_inst_retired.scalar_double ) + 2 *
> fp_arith_inst_retired.128b_packed_double + 4 * (
> fp_arith_inst_retired.128b_packed_single +
> fp_arith_inst_retired.256b_packed_double ) + 8 * (
> fp_arith_inst_retired.256b_packed_single +
> fp_arith_inst_retired.512b_packed_double ) + 16 *
> fp_arith_inst_retired.512b_packed_single ) / 1000000000 ) / duration_time'
> parsing 'cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc'
> parsing '1 - cpu_clk_unhalted.one_thread_active /
> cpu_clk_unhalted.ref_distributed'
> parsing 'cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread'
> parsing '( 64 * ( uncore_imc@cas_count_read@ +
> uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time'
> parsing '1000000000 * ( cha@event\=0x36\,umask\=0x21\,config\=0x40433@ /
> cha@event\=0x35\,umask\=0x21\,config\=0x40433@ ) / ( cha_0@event\=0x0@ /
> duration_time )'
> parsing 'cha@event\=0x36\,umask\=0x21\,config\=0x40433@ /
> cha@event\=0x36\,umask\=0x21\,config\=0x40433\,thresh\=1@'
> parsing '( 1000000000 * (
> cha@event\=0x36\,umask\=0x21\,config\=0x40433@_pmm /
> cha@event\=0x35\,umask\=0x21\,config\=0x40433@_pmm ) / cha_0@event\=0x0@
> )' [pr_debug]
> check_parse_fake failed [pr_err]
> test child finished with -1
> ---- end ----
>
> I annotated the originating print functions in [], above.
>
> As for another issue: if you consider "Parse and process metrics", debug
> output from core code comes out at the same level as that from test
> code, i.e. with -v we see pr_debug() from both the test code and the
> core code. Again, this makes it hard to distinguish the various levels.
> As an illustration, here I hack the code to fail a test:
>
> sudo ./perf test -v 68
> 68: Parse and process metrics                                       :
> --- start ---
> test child forked, pid 9747
> metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
> found event inst_retired.any verbose=1 pr_debug
> found event inst_retired.any verbose=1 pr_err
> found event cpu_clk_unhalted.thread verbose=1 pr_debug
> found event cpu_clk_unhalted.thread verbose=1 pr_err
> adding {inst_retired.any,cpu_clk_unhalted.thread}:W
> FAILED tests/parse-metric.c:223 IPC failed, wrong ratio
> FAILED tests/parse-metric.c:374 IPC failed
> test child finished with -1
> ---- end ----
> Parse and process metrics: FAILED!
>
> Note that the "FAILED" messages from the test code come from pr_debug().
>
> In a way, I feel that pr_debug()/err from the test is more important
> than pr_debug() from the core code (when running a test).
>
> Any opinion on this or how to improve (if anyone agrees with me)? Or am
> I missing something? Or is it not so important?

Hi John,

I think the issue is that in the parsing you don't know it's broken
until something goes wrong. Putting everything on pr_err would cause
spam in the non-broken case. Improving the parsing error handling is a
big task, with lex and yacc to some extent getting in the way. Perhaps
a middle way is to have a parameter to the parser that logs more, and
to call the parser again with that when parsing fails. I guess there is
also the danger of a performance hit.
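
That middle way could look roughly like the sketch below. Here 
do_parse() is a hypothetical stand-in for the real lex/yacc-driven 
entry point, taking a flag that controls extra diagnostics; only the 
quiet-then-loud retry wrapper is the point:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the lex/yacc parse entry point; 'loud'
 * would make the generated parser emit extra diagnostics. */
static int do_parse(const char *expr, bool loud)
{
	if (strstr(expr, "@") == NULL) {	/* fake failure condition */
		if (loud)
			fprintf(stderr, "parse failed near '%s'\n", expr);
		return -1;
	}
	return 0;
}

/* Parse quietly first so the non-broken case produces no spam; only on
 * failure re-run the parse with logging enabled to explain the error.
 * The cost of the second pass is paid only when something is wrong. */
static int parse_expr(const char *expr)
{
	int ret = do_parse(expr, false);

	if (ret)
		ret = do_parse(expr, true);
	return ret;
}
```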

Thanks,
Ian

> Thanks,
> John


* Re: perf tool: About tests debug level
  2021-06-22  5:04 ` Ian Rogers
@ 2021-06-22 11:52   ` John Garry
  2021-06-22 16:00     ` Ian Rogers
  0 siblings, 1 reply; 5+ messages in thread
From: John Garry @ 2021-06-22 11:52 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-perf-users,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Jin Yao

On 22/06/2021 06:04, Ian Rogers wrote:
>> ---- end ----
>> Parse and process metrics: FAILED!
>>
>> Note that the "FAILED" messages from the test code come from pr_debug().
>>
>> In a way, I feel that pr_debug()/err from the test is more important
>> than pr_debug() from the core code (when running a test).
>>
>> Any opinion on this or how to improve (if anyone agrees with me)? Or am
>> I missing something? Or is it not so important?
> Hi John,
> 

Hi Ian,

> I think the issue is that in the parsing you don't know it's broken
> until something goes wrong. Putting everything on pr_err would cause
> spam in the non-broken case.

Right, I would not suggest using pr_err everywhere.

> Improving the parsing error handling is a
> big task, with lex and yacc to some extent getting in the way. Perhaps
> a middle way is to have a parameter to the parser that logs more, and
> to call the parser again with that when parsing fails. I guess there is
> also the danger of a performance hit.

So I am thinking that, for running a test, -v would mean different log 
levels for test code and for core (non-test) code. For example, -v 
prints pr_warn() and higher for test logs, but nothing for core logs. 
And then -vv for running a test gives pr_debug() and above for test 
logs, and pr_warn() and above for core logs. Or something like that.

Maybe that is not a good idea. But I'm just saying that it is currently 
hard to debug tests at -v.
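
Purely as an illustration of that split (perf has no such knob today, 
and the names here are made up), the -v count could map to a different 
threshold per message source:

```c
enum log_src { LOG_TEST, LOG_CORE };
enum log_level { LOG_ERR, LOG_WARN, LOG_DEBUG };	/* 0, 1, 2 */

/* Hypothetical mapping of the proposal: test logs get the full -v
 * count while core logs trail one level behind, so -v shows test
 * warnings with core quiet, and -vv shows test debug plus core warn. */
static int max_level(enum log_src src, int verbose)
{
	return src == LOG_TEST ? verbose : verbose - 1;
}

static int should_print(enum log_src src, enum log_level level, int verbose)
{
	return (int)level <= max_level(src, verbose);
}
```

For example, should_print(LOG_TEST, LOG_WARN, 1) holds while 
should_print(LOG_CORE, LOG_WARN, 1) does not.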

Thanks,
John


* Re: perf tool: About tests debug level
  2021-06-22 11:52   ` John Garry
@ 2021-06-22 16:00     ` Ian Rogers
  2021-06-22 17:42       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Rogers @ 2021-06-22 16:00 UTC (permalink / raw)
  To: John Garry
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-perf-users,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Jin Yao

On Tue, Jun 22, 2021 at 4:58 AM John Garry <john.garry@huawei.com> wrote:
>
> On 22/06/2021 06:04, Ian Rogers wrote:
> >> ---- end ----
> >> Parse and process metrics: FAILED!
> >>
> >> Note that the "FAILED" messages from the test code come from pr_debug().
> >>
> >> In a way, I feel that pr_debug()/err from the test is more important
> >> than pr_debug() from the core code (when running a test).
> >>
> >> Any opinion on this or how to improve (if anyone agrees with me)? Or am
> >> I missing something? Or is it not so important?
> > Hi John,
> >
>
> Hi Ian,
>
> > I think the issue is that in the parsing you don't know it's broken
> > until something goes wrong. Putting everything on pr_err would cause
> > spam in the non-broken case.
>
> Right, I would not suggest using pr_err everywhere.
>
> > Improving the parsing error handling is a
> > big task, with lex and yacc to some extent getting in the way. Perhaps
> > a middle way is to have a parameter to the parser that logs more, and
> > to call the parser again with that when parsing fails. I guess there is
> > also the danger of a performance hit.
>
> So I am thinking that, for running a test, -v would mean different log
> levels for test code and for core (non-test) code. For example, -v
> prints pr_warn() and higher for test logs, but nothing for core logs.
> And then -vv for running a test gives pr_debug() and above for test
> logs, and pr_warn() and above for core logs. Or something like that.
>
> Maybe that is not a good idea. But I'm just saying that it is currently
> hard to debug tests at -v.
>
> Thanks,
> John

I think this sounds good. It'd also be nice to have verbose output in
the shell tests, following the same convention. There's currently no
verbose logging in shell tests, but I propose it here:
https://lore.kernel.org/lkml/20210621215648.2991319-1-irogers@google.com/
By their nature some of the shell tests launch perf; perhaps there could
be some convention for passing the verbose flag through in those cases.

Thanks,
Ian


* Re: perf tool: About tests debug level
  2021-06-22 16:00     ` Ian Rogers
@ 2021-06-22 17:42       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-22 17:42 UTC (permalink / raw)
  To: Ian Rogers
  Cc: John Garry, Jiri Olsa, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Namhyung Kim, Jin Yao

Em Tue, Jun 22, 2021 at 09:00:31AM -0700, Ian Rogers escreveu:
> On Tue, Jun 22, 2021 at 4:58 AM John Garry <john.garry@huawei.com> wrote:
> >
> > On 22/06/2021 06:04, Ian Rogers wrote:
> > >> ---- end ----
> > >> Parse and process metrics: FAILED!
> > >>
> > >> Note that the "FAILED" messages from the test code come from pr_debug().
> > >>
> > >> In a way, I feel that pr_debug()/err from the test is more important
> > >> than pr_debug() from the core code (when running a test).
> > >>
> > >> Any opinion on this or how to improve (if anyone agrees with me)? Or am
> > >> I missing something? Or is it not so important?
> > > Hi John,
> > >
> >
> > Hi Ian,
> >
> > > I think the issue is that in the parsing you don't know it's broken
> > > until something goes wrong. Putting everything on pr_err would cause
> > > spam in the non-broken case.
> >
> > Right, I would not suggest using pr_err everywhere.
> >
> > > Improving the parsing error handling is a
> > > big task, with lex and yacc to some extent getting in the way. Perhaps
> > > a middle way is to have a parameter to the parser that logs more, and
> > > to call the parser again with that when parsing fails. I guess there is
> > > also the danger of a performance hit.
> >
> > So I am thinking that, for running a test, -v would mean different log
> > levels for test code and for core (non-test) code. For example, -v
> > prints pr_warn() and higher for test logs, but nothing for core logs.
> > And then -vv for running a test gives pr_debug() and above for test
> > logs, and pr_warn() and above for core logs. Or something like that.
> >
> > Maybe that is not a good idea. But I'm just saying that it is currently
> > hard to debug tests at -v.
> >
> > Thanks,
> > John
> 
> I think this sounds good. It'd also be nice to have verbose output in
> the shell tests, following the same convention. There's currently no
> verbose logging in shell tests, but I propose it here:
> https://lore.kernel.org/lkml/20210621215648.2991319-1-irogers@google.com/
> By their nature some of the shell tests launch perf; perhaps there could
> be some convention for passing the verbose flag through in those cases.

Hey, there is even a v2 for that one, lemme process it :-)

- Arnaldo

