All of lore.kernel.org
* perf tool: About tests debug level
@ 2021-06-21 11:20 John Garry
  2021-06-22  5:04 ` Ian Rogers
  0 siblings, 1 reply; 5+ messages in thread
From: John Garry @ 2021-06-21 11:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, irogers, linux-perf-users,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Jin Yao

Hi guys,

When it was recently discovered that the new Icelake JSON did not pass 
the "PMU events" test, I noticed that running the test with -v makes 
pr_err() and pr_debug() come out at the same level, so it is hard to 
distinguish the important logs.

Here is a link:
https://lore.kernel.org/lkml/YLdq%2FH8CXYgHWzCL@kernel.org/

And here is an extract:

parsing '(cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc) * msr@tsc@ 
/ 1000000000 / duration_time'
parsing '( ( 1 * ( fp_arith_inst_retired.scalar_single + 
fp_arith_inst_retired.scalar_double ) + 2 * 
fp_arith_inst_retired.128b_packed_double + 4 * ( 
fp_arith_inst_retired.128b_packed_single + 
fp_arith_inst_retired.256b_packed_double ) + 8 * ( 
fp_arith_inst_retired.256b_packed_single + 
fp_arith_inst_retired.512b_packed_double ) + 16 * 
fp_arith_inst_retired.512b_packed_single ) / 1000000000 ) / duration_time'
parsing 'cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc'
parsing '1 - cpu_clk_unhalted.one_thread_active / 
cpu_clk_unhalted.ref_distributed'
parsing 'cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread'
parsing '( 64 * ( uncore_imc@cas_count_read@ + 
uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time'
parsing '1000000000 * ( cha@event\=0x36\,umask\=0x21\,config\=0x40433@ / 
cha@event\=0x35\,umask\=0x21\,config\=0x40433@ ) / ( cha_0@event\=0x0@ / 
duration_time )'
parsing 'cha@event\=0x36\,umask\=0x21\,config\=0x40433@ / 
cha@event\=0x36\,umask\=0x21\,config\=0x40433\,thresh\=1@'
parsing '( 1000000000 * ( 
cha@event\=0x36\,umask\=0x21\,config\=0x40433@_pmm / 
cha@event\=0x35\,umask\=0x21\,config\=0x40433@_pmm ) / cha_0@event\=0x0@ 
)' [pr_debug]
check_parse_fake failed [pr_err]
test child finished with -1
---- end ----

I added the annotations in square brackets above.

As for another issue, consider "Parse and process metrics": debug 
output from the core code comes out at the same level as that from the 
test code, i.e. with -v we see pr_debug() from both the test code and 
the core code. Again, this makes it hard to distinguish the various 
levels. As an illustration, here I hacked the code to fail a test:

sudo ./perf test -v 68
68: Parse and process metrics                                       :
--- start ---
test child forked, pid 9747
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
found event inst_retired.any verbose=1 pr_debug
found event inst_retired.any verbose=1 pr_err
found event cpu_clk_unhalted.thread verbose=1 pr_debug
found event cpu_clk_unhalted.thread verbose=1 pr_err
adding {inst_retired.any,cpu_clk_unhalted.thread}:W
FAILED tests/parse-metric.c:223 IPC failed, wrong ratio
FAILED tests/parse-metric.c:374 IPC failed
test child finished with -1
---- end ----
Parse and process metrics: FAILED!

Note that the "FAILED" messages from the test code come from pr_debug().

In a way, I feel that pr_debug()/pr_err() from the test code is more 
important than pr_debug() from the core code (when running a test).

Any opinions on this, or on how to improve it (if anyone agrees with 
me)? Or am I missing something? Or is it just not that important?

Thanks,
John

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: perf tool: About tests debug level
  2021-06-21 11:20 perf tool: About tests debug level John Garry
@ 2021-06-22  5:04 ` Ian Rogers
  2021-06-22 11:52   ` John Garry
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Rogers @ 2021-06-22  5:04 UTC (permalink / raw)
  To: John Garry
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-perf-users,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Jin Yao

On Mon, Jun 21, 2021 at 4:26 AM John Garry <john.garry@huawei.com> wrote:
>
> Hi guys,
>
> I noticed that when it was discovered recently that the new icelake JSON
> did not pass "PMU events" test, running the test with -v makes pr_err()
> and pr_debug() come out at the same level, so it's hard to distinguish
> the important logs.
>
> Here is a link:
> https://lore.kernel.org/lkml/YLdq%2FH8CXYgHWzCL@kernel.org/
>
> And here is an extract:
>
> parsing '(cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc) * msr@tsc@
> / 1000000000 / duration_time'
> parsing '( ( 1 * ( fp_arith_inst_retired.scalar_single +
> fp_arith_inst_retired.scalar_double ) + 2 *
> fp_arith_inst_retired.128b_packed_double + 4 * (
> fp_arith_inst_retired.128b_packed_single +
> fp_arith_inst_retired.256b_packed_double ) + 8 * (
> fp_arith_inst_retired.256b_packed_single +
> fp_arith_inst_retired.512b_packed_double ) + 16 *
> fp_arith_inst_retired.512b_packed_single ) / 1000000000 ) / duration_time'
> parsing 'cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc'
> parsing '1 - cpu_clk_unhalted.one_thread_active /
> cpu_clk_unhalted.ref_distributed'
> parsing 'cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread'
> parsing '( 64 * ( uncore_imc@cas_count_read@ +
> uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time'
> parsing '1000000000 * ( cha@event\=0x36\,umask\=0x21\,config\=0x40433@ /
> cha@event\=0x35\,umask\=0x21\,config\=0x40433@ ) / ( cha_0@event\=0x0@ /
> duration_time )'
> parsing 'cha@event\=0x36\,umask\=0x21\,config\=0x40433@ /
> cha@event\=0x36\,umask\=0x21\,config\=0x40433\,thresh\=1@'
> parsing '( 1000000000 * (
> cha@event\=0x36\,umask\=0x21\,config\=0x40433@_pmm /
> cha@event\=0x35\,umask\=0x21\,config\=0x40433@_pmm ) / cha_0@event\=0x0@
> )' [pr_debug]
> check_parse_fake failed [pr_err]
> test child finished with -1
> ---- end ----
>
> I annotated in [], above
>
> As for another issue, if you consider "Parse and process metrics", debug
> from core code comes out at same level as test code, e.g. with -v, we
> see pr_debug() from test code and core code. Again, this makes it hard
> to distinguish various levels. As an illustration, here I hack the code
> to fail a test:
>
> sudo ./perf test -v 68
> 68: Parse and process metrics                                       :
> --- start ---
> test child forked, pid 9747
> metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
> found event inst_retired.any verbose=1 pr_debug
> found event inst_retired.any verbose=1 pr_err
> found event cpu_clk_unhalted.thread verbose=1 pr_debug
> found event cpu_clk_unhalted.thread verbose=1 pr_err
> adding {inst_retired.any,cpu_clk_unhalted.thread}:W
> FAILED tests/parse-metric.c:223 IPC failed, wrong ratio
> FAILED tests/parse-metric.c:374 IPC failed
> test child finished with -1
> ---- end ----
> Parse and process metrics: FAILED!
>
> Note that the "FAILED" messages from the test code come from pr_debug().
>
> In a way, I feel that pr_debug()/err from the test is more important
> than pr_debug() from the core code (when running a test).
>
> Any opinion on this or how to improve (if anyone agrees with me)? Or am
> I missing something? Or is it not so important?

Hi John,

I think the issue is that, in the parsing, you don't know it's broken
until something goes wrong. Putting everything on pr_err would cause
spam in the non-broken case. Improving the parsing error handling is a
big task, with lex and yacc to some extent getting in the way. Perhaps
a middle way is to have a parameter to the parser that logs more, and
to recursively call the parser with it when parsing fails. I guess
there is also a danger of a performance hit.

Thanks,
Ian

> Thanks,
> John


* Re: perf tool: About tests debug level
  2021-06-22  5:04 ` Ian Rogers
@ 2021-06-22 11:52   ` John Garry
  2021-06-22 16:00     ` Ian Rogers
  0 siblings, 1 reply; 5+ messages in thread
From: John Garry @ 2021-06-22 11:52 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-perf-users,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Jin Yao

On 22/06/2021 06:04, Ian Rogers wrote:
>> ---- end ----
>> Parse and process metrics: FAILED!
>>
>> Note that the "FAILED" messages from the test code come from pr_debug().
>>
>> In a way, I feel that pr_debug()/err from the test is more important
>> than pr_debug() from the core code (when running a test).
>>
>> Any opinion on this or how to improve (if anyone agrees with me)? Or am
>> I missing something? Or is it not so important?
> Hi John,
> 

Hi Ian,

> I think the issue is that in the parsing you don't know it's broken
> until something goes wrong. Putting everything on pr_err would cause
> spam in the not broken case.

Right, I would not suggest using pr_err everywhere.

> Improving the parsing error handling is a
> big task with lex and yacc to some extent getting in the way. Perhaps
> a middle way is to have a parameter to the parser that logs more, and
> recursively call this in the parser when parsing fails. I guess there
> is also a danger of a performance hit.

So I am thinking that, when running a test, -v means different log 
levels for test code and for core (non-test) code. For example, -v 
prints pr_warn() and higher for test logs, but nothing for core logs. 
And then -vv gives pr_debug() and above for test logs, and pr_warn() 
and above for core logs. Or something like that.

Maybe that is not a good idea. But I'm just saying that it's currently 
hard to debug tests at -v.

Thanks,
John


* Re: perf tool: About tests debug level
  2021-06-22 11:52   ` John Garry
@ 2021-06-22 16:00     ` Ian Rogers
  2021-06-22 17:42       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Rogers @ 2021-06-22 16:00 UTC (permalink / raw)
  To: John Garry
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-perf-users,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Jin Yao

On Tue, Jun 22, 2021 at 4:58 AM John Garry <john.garry@huawei.com> wrote:
>
> On 22/06/2021 06:04, Ian Rogers wrote:
> >> ---- end ----
> >> Parse and process metrics: FAILED!
> >>
> >> Note that the "FAILED" messages from the test code come from pr_debug().
> >>
> >> In a way, I feel that pr_debug()/err from the test is more important
> >> than pr_debug() from the core code (when running a test).
> >>
> >> Any opinion on this or how to improve (if anyone agrees with me)? Or am
> >> I missing something? Or is it not so important?
> > Hi John,
> >
>
> Hi Ian,
>
> > I think the issue is that in the parsing you don't know it's broken
> > until something goes wrong. Putting everything on pr_err would cause
> > spam in the not broken case.
>
> Right, I would not suggest using pr_err everywhere.
>
> > Improving the parsing error handling is a
> > big task with lex and yacc to some extent getting in the way. Perhaps
> > a middle way is to have a parameter to the parser that logs more, and
> > recursively call this in the parser when parsing fails. I guess there
> > is also a danger of a performance hit.
>
> So I am thinking that for running a test, -v means different levels logs
> for test code and for core (non-test code). For example, -v prints
> pr_warn() and higher for test logs, but nothing for core logs. And then
> -vv for running a test gives pr_debug and above for test logs, and
> pr_warn and above for core logs. Or something like that.
>
> Maybe that is not a good idea. But I'm just saying that it's hard to
> debug currently at -v for tests.
>
> Thanks,
> John

I think this sounds good. It'd be nice also to have verbose output in
the shell tests following the same convention. There's currently no
verbose logging in shell tests but I propose it here:
https://lore.kernel.org/lkml/20210621215648.2991319-1-irogers@google.com/
By their nature some of the shell tests launch perf, perhaps there can
be some convention on passing the verbose flag through in those cases.

Thanks,
Ian


* Re: perf tool: About tests debug level
  2021-06-22 16:00     ` Ian Rogers
@ 2021-06-22 17:42       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-22 17:42 UTC (permalink / raw)
  To: Ian Rogers
  Cc: John Garry, Jiri Olsa, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Namhyung Kim, Jin Yao

Em Tue, Jun 22, 2021 at 09:00:31AM -0700, Ian Rogers escreveu:
> On Tue, Jun 22, 2021 at 4:58 AM John Garry <john.garry@huawei.com> wrote:
> >
> > On 22/06/2021 06:04, Ian Rogers wrote:
> > >> ---- end ----
> > >> Parse and process metrics: FAILED!
> > >>
> > >> Note that the "FAILED" messages from the test code come from pr_debug().
> > >>
> > >> In a way, I feel that pr_debug()/err from the test is more important
> > >> than pr_debug() from the core code (when running a test).
> > >>
> > >> Any opinion on this or how to improve (if anyone agrees with me)? Or am
> > >> I missing something? Or is it not so important?
> > > Hi John,
> > >
> >
> > Hi Ian,
> >
> > > I think the issue is that in the parsing you don't know it's broken
> > > until something goes wrong. Putting everything on pr_err would cause
> > > spam in the not broken case.
> >
> > Right, I would not suggest using pr_err everywhere.
> >
> > > Improving the parsing error handling is a
> > > big task with lex and yacc to some extent getting in the way. Perhaps
> > > a middle way is to have a parameter to the parser that logs more, and
> > > recursively call this in the parser when parsing fails. I guess there
> > > is also a danger of a performance hit.
> >
> > So I am thinking that for running a test, -v means different levels logs
> > for test code and for core (non-test code). For example, -v prints
> > pr_warn() and higher for test logs, but nothing for core logs. And then
> > -vv for running a test gives pr_debug and above for test logs, and
> > pr_warn and above for core logs. Or something like that.
> >
> > Maybe that is not a good idea. But I'm just saying that it's hard to
> > debug currently at -v for tests.
> >
> > Thanks,
> > John
> 
> I think this sounds good. It'd be nice also to have verbose output in
> the shell tests following the same convention. There's currently no
> verbose logging in shell tests but I propose it here:
> https://lore.kernel.org/lkml/20210621215648.2991319-1-irogers@google.com/
> By their nature some of the shell tests launch perf, perhaps there can
> be some convention on passing the verbose flag through in those cases.

Hey, there is even a v2 for that one, lemme process it :-)

- Arnaldo

