* [Fuego] LPC Increase Test Coverage in a Linux-based OS
@ 2016-11-05 17:15 Victor Rodriguez
  2016-11-08  0:26 ` Bird, Timothy
  2016-11-10  3:09 ` Daniel Sangorrin
  0 siblings, 2 replies; 13+ messages in thread
From: Victor Rodriguez @ 2016-11-05 17:15 UTC (permalink / raw)
  To: fuego, Guillermo Adrian Ponce Castañeda

Hi Fuego team.

This week I presented a case study on the problem of the lack of test
log output standardization in the majority of packages that are used
to build current Linux distributions. It was presented as a BOF
( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555 ) during
the Linux Plumbers Conference.

It was a productive discussion that let us share the problem we have
with the projects we use every day to build a distribution (whether
an embedded or a cloud-based distribution). Open source projects
don't follow a standard log format for reporting the passing and
failing tests they run at packaging time ("make test" or "make
check").

The Clear Linux project is using a simple Perl script that helps them
count the number of passing and failing tests (which would be trivial
if we had a single standard output format across all the projects,
but we don’t):

https://github.com/clearlinux/autospec/blob/master/autospec/count.pl

# perl count.pl <build.log>
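
For illustration only, a minimal sketch of the kind of counting such a
script does (with made-up patterns, not the actual rules in count.pl)
could look like this in Python:

#!/usr/bin/env python3
# count_results.py - hypothetical sketch, not the real count.pl.
# Counts PASS/FAIL/SKIP lines in a build log using two common patterns.
import re
import sys

PATTERNS = [
    re.compile(r'^(PASS|FAIL|SKIP): '),   # automake-style "standard" output
    re.compile(r'^(ok|not ok) \d+'),      # TAP-style output
]

def count(path):
    counts = {'pass': 0, 'fail': 0, 'skip': 0}
    with open(path, errors='replace') as f:
        for line in f:
            for pat in PATTERNS:
                m = pat.match(line)
                if not m:
                    continue
                token = m.group(1).lower()
                if token in ('pass', 'ok'):
                    counts['pass'] += 1
                elif token in ('fail', 'not ok'):
                    counts['fail'] += 1
                else:
                    counts['skip'] += 1
                break
    return counts

if __name__ == '__main__':
    print(count(sys.argv[1]))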

Examples of real packages build logs:

https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x86_64/build.log
https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x86_64/build.log

So far that simple (and not well engineered) parser has found 26
“standard” output formats (and counting). The script's main weakness
is that it does not recognize the names of the individual tests, so
it cannot detect regressions: a test that passed in the previous
release may fail in the new one while the number of failing tests
remains the same.
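
As a rough sketch of what fixing that would take (assuming the parser
were extended to emit test names along with their results; the data
layout here is hypothetical), a regression check between two releases
could be as simple as:

# regression_check.py - hypothetical sketch; assumes a parser that emits
# {test_name: "pass"|"fail"} dictionaries for each release's build log.
def find_regressions(old_results, new_results):
    """Return tests that passed in the old release but fail in the new one."""
    return sorted(
        name for name, status in new_results.items()
        if status == 'fail' and old_results.get(name) == 'pass'
    )

# Example usage with made-up test names:
old = {'test_acl_basic': 'pass', 'test_acl_perms': 'pass'}
new = {'test_acl_basic': 'pass', 'test_acl_perms': 'fail'}
print(find_regressions(old, new))   # ['test_acl_perms']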

To be honest, before presenting at LPC I was very confident that this
script (or a much smarter version of it) could be the beginning of
the solution to the problem we have. However, during the discussion
at LPC I came to understand that solving the nightmare we already
have may require a much bigger effort.

Tim Bird participated in the BOF and recommended that I send a mail
to the Fuego project team to gather more input and ideas about this
topic.

I really believe it is important to attack this problem before it
becomes a bigger one.

All feedback is more than welcome.

Regards

Victor Rodriguez

[presentation slides] :
https://drive.google.com/open?id=0B7iKrGdVkDhIcVpncUdGTGhEQTQ
[BOF notes] : https://drive.google.com/open?id=1lOPXQcrhL4AoOBSDnwUlJAKIXsReU8OqP82usZn-DCo


* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-05 17:15 [Fuego] LPC Increase Test Coverage in a Linux-based OS Victor Rodriguez
@ 2016-11-08  0:26 ` Bird, Timothy
  2016-11-08 19:38   ` Guillermo Adrian Ponce Castañeda
  2016-11-10  3:09 ` Daniel Sangorrin
  1 sibling, 1 reply; 13+ messages in thread
From: Bird, Timothy @ 2016-11-08  0:26 UTC (permalink / raw)
  To: Victor Rodriguez, fuego, Guillermo Adrian Ponce Castañeda

Victor,

Thanks for raising this topic.  I think it's an important one.  I have some comments below, inline.

> -----Original Message-----
> From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM
>
> This week I presented a case of study for the problem of lack of test
> log output standardization in the majority of packages that are used
> to build the current Linux distributions. This was presented as a BOF
> ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555)  during
> the Linux Plumbers Conference.
> 
> it was a productive  discussion that let us share the problem that we
> have in the current projects that we use every day to build a
> distribution ( either in embedded as in a cloud base distribution).
> The open source projects don't follow a standard output log format to
> print the passing and failing tests that they run during packaging
> time ( "make test" or "make check" )
> 
> The Clear Linux project is using a simple Perl script that helps them
> to count the number of passing and failing tests (which should be
> trivial if could have a single standard output among all the projects,
> but we don’t):
> 
> https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> 
> # perl count.pl <build.log>

A few remarks about this.  This will be something of a stream of ideas, not
very well organized.  I'd like to avoid requiring too many different
language skills in Fuego.  In order to write a test for Fuego, we already require
knowledge of shell script, python (for the benchmark parsers) and json formats
(for the test specs and plans).  I'd be hesitant to adopt something in perl, but maybe
there's a way to leverage the expertise embedded in your script.

I'm not that fond of the idea of integrating all the parsers into a single program.
I think it's conceptually simpler to have a parser per log file format.  However,
I haven't looked in detail at your parser, so I can't really comment on its
complexity.  I note that 0day has a parser per test (but I haven't checked to
see if they re-use common parsers between tests.)  Possibly some combination
of code-driven and data-driven parsers is best, but I don't have the experience
you guys do with your parser.

If I understood your presentation, you are currently parsing
logs for thousands of packages. I thought you said that about half of the
20,000 packages in a distro have unit tests, and I thought you said that
your parser was covering about half of those (so, about 5000 packages currently).
And this is with 26 log formats parsed so far.

I'm guessing that packages have a "long tail" of formats, with them getting
weirder and weirder the farther out on the tail of formats you get.

Please correct my numbers if I'm mistaken.

> Examples of real packages build logs:
> 
> https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x8
> 6_64/build.log
> https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x
> 86_64/build.log
> 
> So far that simple (and not well engineered) parser has found 26
> “standard” outputs ( and counting ) . 

This is actually remarkable, as Fuego is only handling the formats for the
standalone tests we ship with Fuego.  As I stated in the BOF, we have two 
mechanisms, one for functional tests that uses shell, grep and diff, and
one for benchmark tests that uses a very small python program that uses
regexes.   So, currently we only have 50 tests covered, but many of these
parsers use very simple one-line grep regexes.

Neither of these Fuego log results parser methods supports tracking individual
subtest results.

> The script has the fail that it
> does not recognize the name of the tests in order to detect
> regressions. Maybe one test was passing in the previous release and in
> the new one is failing, and then the number of failing tests remains
> the same.

This is a concern with the Fuego log parsing as well.

I would like to modify Fuego's parser to not just parse out counts, but to
also convert the results to something where individual sub-tests can be
tracked over time.  Daniel Sangorrin's recent work converting the output
of LTP into Excel format might be one way to do this (although I'm not
that comfortable with using a proprietary format - I would prefer CSV
or json, but I think Daniel is going for ease of use first.)

I need to do some more research, but I'm hoping that there are Jenkins
plugins (maybe xUnit) that will provide tools to automatically handle 
visualization of test and sub-test results over time.  If so, I might
try converting the Fuego parsers to produce that format.
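
(As a rough illustration, and assuming the plugin accepts the usual
JUnit-style XML, emitting that format from parsed results could be
sketched like this; the exact schema the plugin expects would still
need to be checked.)

# junit_out.py - rough sketch: emit minimal JUnit-style XML from parsed results.
from xml.sax.saxutils import quoteattr

def to_junit_xml(suite_name, results):
    # results: list of (test_name, status, message) tuples, status in {"pass", "fail"}
    failures = sum(1 for _, status, _ in results if status == "fail")
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<testsuite name=%s tests="%d" failures="%d">'
             % (quoteattr(suite_name), len(results), failures)]
    for name, status, message in results:
        lines.append('  <testcase name=%s>' % quoteattr(name))
        if status == "fail":
            lines.append('    <failure message=%s/>' % quoteattr(message))
        lines.append('  </testcase>')
    lines.append('</testsuite>')
    return "\n".join(lines)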

> To be honest, before presenting at LPC I was very confident that this
> script ( or another version of it , much smarter ) could be beginning
> of the solution to the problem we have. However, during the discussion
> at LPC I understand that this might be a huge effort (not sure if
> bigger) in order to solve the nightmare we already have.

So far, I think you're solving a somewhat different problem than Fuego is, and in one sense you are
much farther along than Fuego.  I'm hoping we can learn from your
experience with this.

I do think we share the goal of producing a standard, or at least a recommendation,
for a common test log output format.  This would help the industry going forward.
Even if individual tests don't produce the standard format, it will help 3rd parties
write parsers that conform the test output to the format, as well as encourage the
development of tools that utilize the format for visualization or regression checking.

Do you feel confident enough to propose a format?  I don't at the moment.
I'd like to survey the industry for 1) existing formats produced by tests (which you have good experience
with, and which may already be captured well by your Perl script), and 2) existing tools
that use common formats as input (e.g. the Jenkins xUnit plugin).  From this I'd like
to develop some ideas about the fields that are most commonly used, and a good language to
express those fields. My preference would be JSON - I'm something of an XML naysayer, but
I could be talked into YAML.  Under no circumstances do I want to invent a new language for
this.
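
(Purely as a strawman, with field names invented for illustration
rather than proposed, a per-test record in JSON might carry something
like the following.)

# strawman_record.py - illustration only; field names are invented, not a proposal.
import json

record = {
    "test_suite": "acl",            # package or test suite name
    "test_case": "test_acl_perms",  # individual sub-test name
    "result": "fail",               # pass | fail | skip
    "duration_sec": 0.42,           # optional timing information
    "board": "beaglebone-black",    # target the test ran on
    "timestamp": "2016-11-08T00:26:00Z",
}
print(json.dumps(record, indent=2))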
 
> Tim Bird participates at the BOF and recommends me to send a mail to
> the Fuego project team in order to look for more inputs and ideas bout
> this topic.
> 
> I really believe in the importance of attack this problem before we
> have a bigger problem
> 
> All feedback is more than welcome

Here is how I propose moving forward on this.  I'd like to get a group together to study this
issue.  I wrote down a list of people at LPC who seem to be working on test issues.  I'd like to
do the following:
 1) perform a survey of the areas I mentioned above
 2) write up a draft spec
 3) send it around for comments (to which individuals and lists is an open issue)
 4) discuss it at a future face-to-face meeting (probably at ELC or maybe next year's Plumbers)
 5) publish it as a standard endorsed by the Linux Foundation

Let me know what you think, and if you'd like to be involved.

Thanks and regards,
 -- Tim



* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-08  0:26 ` Bird, Timothy
@ 2016-11-08 19:38   ` Guillermo Adrian Ponce Castañeda
  2016-11-09  0:21     ` Bird, Timothy
  0 siblings, 1 reply; 13+ messages in thread
From: Guillermo Adrian Ponce Castañeda @ 2016-11-08 19:38 UTC (permalink / raw)
  To: Bird, Timothy; +Cc: fuego


Hello Tim and Victor,

I am a co-author of this code, and I must confess that it was more or less
my fault that it was written in Perl.

Regarding how many logs the program analyzes, I think it is nowhere near
5000; it is much less. But taking into account that some logs are similar,
I think it is possible that some logs that haven't been tested will still
work, but who knows :).

And about the output file: right now it delivers a comma-separated list
of numbers, without headers, because this code is part of a bigger tool.
I think that code is not open source yet, but that doesn't matter, I
guess. The point is that the output could be changed to JSON like you
suggested, and I can try to translate the code from Perl to Python. I'm
still not sure how long it will take, but I can certainly try.

Thanks.
- Guillermo Ponce

On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy <Tim.Bird@am.sony.com> wrote:

> Victor,
>
> Thanks for raising this topic.  I think it's an important one.  I have
> some comments below, inline.
>
> > -----Original Message-----
> > From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM
> >
> > This week I presented a case of study for the problem of lack of test
> > log output standardization in the majority of packages that are used
> > to build the current Linux distributions. This was presented as a BOF
> > ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555)  during
> > the Linux Plumbers Conference.
> >
> > it was a productive  discussion that let us share the problem that we
> > have in the current projects that we use every day to build a
> > distribution ( either in embedded as in a cloud base distribution).
> > The open source projects don't follow a standard output log format to
> > print the passing and failing tests that they run during packaging
> > time ( "make test" or "make check" )
> >
> > The Clear Linux project is using a simple Perl script that helps them
> > to count the number of passing and failing tests (which should be
> > trivial if could have a single standard output among all the projects,
> > but we don’t):
> >
> > https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> >
> > # perl count.pl <build.log>
>
> A few remarks about this.  This will be something of a stream of ideas, not
> very well organized.  I'd like to prevent requiring too many different
> language skills in Fuego.  In order to write a test for Fuego, we already
> require
> knowledge of shell script, python (for the benchmark parsers) and json
> formats
> (for the test specs and plans).  I'd be hesitant to adopt something in
> perl, but maybe
> there's a way to leverage the expertise embedded in your script.
>
> I'm not that fond of the idea of integrating all the parsers into a single
> program.
> I think it's conceptually simpler to have a parser per log file format.
> However,
> I haven't looked in detail at your parser, so I can't really comment on
> it's
> complexity.  I note that 0day has a parser per test (but I haven't checked
> to
> see if they re-use common parsers between tests.)  Possibly some
> combination
> of code-driven and data-driven parsers is best, but I don't have the
> experience
> you guys do with your parser.
>
> If I understood your presentation, you are currently parsing
> logs for thousands of packages. I thought you said that about half of the
> 20,000 packages in a distro have unit tests, and I thought you said that
> your parser was covering about half of those (so, about 5000 packages
> currently).
> And this is with 26 log formats parsed so far.
>
> I'm guessing that packages have a "long tail" of formats, with them getting
> weirder and weirder the farther out on the tail of formats you get.
>
> Please correct my numbers if I'm mistaken.
>
> > Examples of real packages build logs:
> >
> > https://kojipkgs.fedoraproject.org//packages/
> gcc/6.2.1/2.fc25/data/logs/x8
> > 6_64/build.log
> > https://kojipkgs.fedoraproject.org//packages/
> acl/2.2.52/11.fc24/data/logs/x
> > 86_64/build.log
> >
> > So far that simple (and not well engineered) parser has found 26
> > “standard” outputs ( and counting ) .
>
> This is actually remarkable, as Fuego is only handing the formats for the
> standalone tests we ship with Fuego.  As I stated in the BOF, we have two
> mechanisms, one for functional tests that uses shell, grep and diff, and
> one for benchmark tests that uses a very small python program that uses
> regexes.   So, currently we only have 50 tests covered, but many of these
> parsers use very simple one-line grep regexes.
>
> Neither of these Fuego log results parser methods supports tracking
> individual
> subtest results.
>
> > The script has the fail that it
> > does not recognize the name of the tests in order to detect
> > regressions. Maybe one test was passing in the previous release and in
> > the new one is failing, and then the number of failing tests remains
> > the same.
>
> This is a concern with the Fuego log parsing as well.
>
> I would like to modify Fuego's parser to not just parse out counts, but to
> also convert the results to something where individual sub-tests can be
> tracked over time.  Daniel Sangorrin's recent work converting the output
> of LTP into excel format might be one way to do this (although I'm not
> that comfortable with using a proprietary format - I would prefer CSV
> or json, but I think Daniel is going for ease of use first.)
>
> I need to do some more research, but I'm hoping that there are Jenkins
> plugins (maybe xUnit) that will provide tools to automatically handle
> visualization of test and sub-test results over time.  If so, I might
> try converting the Fuego parsers to product that format.
>
> > To be honest, before presenting at LPC I was very confident that this
> > script ( or another version of it , much smarter ) could be beginning
> > of the solution to the problem we have. However, during the discussion
> > at LPC I understand that this might be a huge effort (not sure if
> > bigger) in order to solve the nightmare we already have.
>
> So far, I think you're solving a bit different problem than Fuego is, and
> in one sense are
> much farther along than Fuego.  I'm hoping we can learn from your
> experience with this.
>
> I do think we share the goal of producing a standard, or at least a
> recommendation,
> for a common test log output format.  This would help the industry going
> forward.
> Even if individual tests don't produce the standard format, it will help
> 3rd parties
> write parsers that conform the test output to the format, as well as
> encourage the
> development of tools that utilize the format for visualization or
> regression checking.
>
> Do you feel confident enough to propose a format?  I don't at the moment.
> I'd like to survey the industry for 1) existing formats produced by tests
> (which you have good experience
> with, which is already maybe capture well by your perl script), and 2)
> existing tools
> that use common formats as input (e.g. the Jenkins xunit plugin).  From
> this I'd like
> to develop some ideas about the fields that are most commonly used, and a
> good language to
> express those fields. My preference would be JSON - I'm something of an
> XML naysayer, but
> I could be talked into YAML.  Under no circumstances do I want to invent a
> new language for
> this.
>
> > Tim Bird participates at the BOF and recommends me to send a mail to
> > the Fuego project team in order to look for more inputs and ideas bout
> > this topic.
> >
> > I really believe in the importance of attack this problem before we
> > have a bigger problem
> >
> > All feedback is more than welcome
>
> Here is how I propose moving forward on this.  I'd like to get a group
> together to study this
> issue.  I wrote down a list of people at LPC who seem to be working on
> test issues.  I'd like to
> do the following:
>  1) perform a survey of the areas I mentioned above
>  2) write up a draft spec
>  3) send it around for comments (to what individual and lists? is an open
> issue)
>  4) discuss it at a future face-to-face meeting (probably at ELC or maybe
> next year's plumbers)
>  5) publish it as a standard endorsed by the Linux Foundation
>
> Let me know what you think, and if you'd like to be involved.
>
> Thanks and regards,
>  -- Tim
>
>


-- 
- Guillermo Ponce



* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-08 19:38   ` Guillermo Adrian Ponce Castañeda
@ 2016-11-09  0:21     ` Bird, Timothy
  2016-11-09 18:04       ` Guillermo Adrian Ponce Castañeda
  2016-11-10  4:07       ` Daniel Sangorrin
  0 siblings, 2 replies; 13+ messages in thread
From: Bird, Timothy @ 2016-11-09  0:21 UTC (permalink / raw)
  To: Guillermo Adrian Ponce Castañeda; +Cc: fuego



> -----Original Message-----
> From: Guillermo Adrian Ponce Castañeda on Tuesday, November 08, 2016 11:38 AM
>
> I am a co-author of this code and I must confess that it was more or less my
> fault that it was made on Perl.

No blame intended. :-)
 
> 
> Regarding how many logs the program analyzes, I think it is nowhere near
> 5000, it is much less, but taking in count that some logs are similar I think it is
> possible that some logs that haven't been tested are going to work, but who
> knows :).
> 
> 
> And about the output file, right now it delivers a comma separated list of
> numbers, without headers, this is because this code is part of a  bigger tool, I
> think that code is not open source yet, but that doesn't matter I guess, the
> thing here is that I think the output could be changed into a json like you
> suggested and i can try to translate the code from Perl to Python, still not
> sure how long it's gonna take, but I can sure try.

Well, don't do any re-writing just yet.  I think we need to consider the
output format some more, and decide whether it makes sense to have a
single vs. multiple parsers first.

An important issue here is scalability of the project, and making it easy
to allow (and incentivize) other developers to create and maintain
parsers for the log files.  Or, to help encourage people to use a common
format either initially, or by conversion from their current log format.
The only way to scale this is by having 3rd parties adopt the format, and
be willing to maintain compatibility with it over time.

I think it's important to consider what will motivate people to adopt a common
log format.  They either need to 1) write a parser for their current format, or
2) identify an existing parser which is close enough and modify it to
support their format, or 3) convert their test output directly to the desired
format.  This will be some amount of work whichever route people take.

I think what will be of value is having tools that read and process the format,
and provide utility to those who use the format for output.  So I want to do a bit
of a survey on what tools (visualizers, aggregators, automated processors, 
notifiers, etc.) might be useful to different developer groups, and make sure
the format is something that can be used by existing tools or by envisioned
future tools, that would be valuable to community members.

In more high-level terms, we should be trying to create a double-sided network effect,
where use (output) of the format drives tool creation, and tool usage
of the format (input) drives format popularity.

Can you describe a bit more what tools, if any, you use to view the results,
or any other processing systems that the results are used with?  If you are reviewing
results manually, are there steps you are doing now by hand that you'd like to
do automatically in the future, that a common format would help you with?

I'll go first - Fuego is currently just using the standard Jenkins "weather" report
and 'list of recent overall pass/failure' for each test. So we don't have anything
visualizing the results of sub-tests, or even displaying the counts for each test run, at the moment.
Daniel Sangorrin has just recently proposed a facility to put LTP results into spreadsheet format,
to allow visualizing test results over time via spreadsheet tools.  I'd like to add better
sub-test visualization in the future, but that's lower on our priority list at the moment.

Also in the future, we'd like to do test results aggregation, to allow for data mining
of results from tests on different hardware platforms and embedded distributions.
This will require that the parsed log output be machine-readable, and consistent.
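
(As a sketch of the kind of aggregation I mean, assuming per-run
results were stored one JSON record per line with fields like the
strawman earlier in this thread, mining pass rates across boards could
look like this.)

# aggregate_results.py - sketch only; assumes one JSON record per line
# (fields as in the earlier strawman) collected from many test runs.
import json
from collections import defaultdict

def pass_rate_by_board(path):
    totals = defaultdict(lambda: [0, 0])   # board -> [passed, total]
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            board = rec.get("board", "unknown")
            totals[board][1] += 1
            if rec.get("result") == "pass":
                totals[board][0] += 1
    return {board: passed / total for board, (passed, total) in totals.items()}
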
 -- Tim

> On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy <Tim.Bird@am.sony.com
> <mailto:Tim.Bird@am.sony.com> > wrote:
> 
> 
> 	Victor,
> 
> 	Thanks for raising this topic.  I think it's an important one.  I have
> some comments below, inline.
> 
> 	> -----Original Message-----
> 	> From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM
> 	>
> 	> This week I presented a case of study for the problem of lack of
> test
> 	> log output standardization in the majority of packages that are used
> 	> to build the current Linux distributions. This was presented as a BOF
> 	> ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555
> <https://www.linuxplumbersconf.org/2016/ocw/proposals/3555> )  during
> 	> the Linux Plumbers Conference.
> 	>
> 	> it was a productive  discussion that let us share the problem that
> we
> 	> have in the current projects that we use every day to build a
> 	> distribution ( either in embedded as in a cloud base distribution).
> 	> The open source projects don't follow a standard output log format
> to
> 	> print the passing and failing tests that they run during packaging
> 	> time ( "make test" or "make check" )
> 	>
> 	> The Clear Linux project is using a simple Perl script that helps them
> 	> to count the number of passing and failing tests (which should be
> 	> trivial if could have a single standard output among all the projects,
> 	> but we don’t):
> 	>
> 	>
> https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> <https://github.com/clearlinux/autospec/blob/master/autospec/count.pl>
> 	>
> 	> # perl count.pl <http://count.pl>  <build.log>
> 
> 	A few remarks about this.  This will be something of a stream of
> ideas, not
> 	very well organized.  I'd like to prevent requiring too many different
> 	language skills in Fuego.  In order to write a test for Fuego, we
> already require
> 	knowledge of shell script, python (for the benchmark parsers) and
> json formats
> 	(for the test specs and plans).  I'd be hesitant to adopt something in
> perl, but maybe
> 	there's a way to leverage the expertise embedded in your script.
> 
> 	I'm not that fond of the idea of integrating all the parsers into a single
> program.
> 	I think it's conceptually simpler to have a parser per log file format.
> However,
> 	I haven't looked in detail at your parser, so I can't really comment on
> it's
> 	complexity.  I note that 0day has a parser per test (but I haven't
> checked to
> 	see if they re-use common parsers between tests.)  Possibly some
> combination
> 	of code-driven and data-driven parsers is best, but I don't have the
> experience
> 	you guys do with your parser.
> 
> 	If I understood your presentation, you are currently parsing
> 	logs for thousands of packages. I thought you said that about half of
> the
> 	20,000 packages in a distro have unit tests, and I thought you said
> that
> 	your parser was covering about half of those (so, about 5000
> packages currently).
> 	And this is with 26 log formats parsed so far.
> 
> 	I'm guessing that packages have a "long tail" of formats, with them
> getting
> 	weirder and weirder the farther out on the tail of formats you get.
> 
> 	Please correct my numbers if I'm mistaken.
> 
> 	> Examples of real packages build logs:
> 	>
> 	>
> https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x8
> <https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x
> 8>
> 	> 6_64/build.log
> 	>
> https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x
> <https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/
> x>
> 	> 86_64/build.log
> 	>
> 	> So far that simple (and not well engineered) parser has found 26
> 	> “standard” outputs ( and counting ) .
> 
> 	This is actually remarkable, as Fuego is only handing the formats for
> the
> 	standalone tests we ship with Fuego.  As I stated in the BOF, we have
> two
> 	mechanisms, one for functional tests that uses shell, grep and diff,
> and
> 	one for benchmark tests that uses a very small python program that
> uses
> 	regexes.   So, currently we only have 50 tests covered, but many of
> these
> 	parsers use very simple one-line grep regexes.
> 
> 	Neither of these Fuego log results parser methods supports tracking
> individual
> 	subtest results.
> 
> 	> The script has the fail that it
> 	> does not recognize the name of the tests in order to detect
> 	> regressions. Maybe one test was passing in the previous release
> and in
> 	> the new one is failing, and then the number of failing tests remains
> 	> the same.
> 
> 	This is a concern with the Fuego log parsing as well.
> 
> 	I would like to modify Fuego's parser to not just parse out counts, but
> to
> 	also convert the results to something where individual sub-tests can
> be
> 	tracked over time.  Daniel Sangorrin's recent work converting the
> output
> 	of LTP into excel format might be one way to do this (although I'm
> not
> 	that comfortable with using a proprietary format - I would prefer CSV
> 	or json, but I think Daniel is going for ease of use first.)
> 
> 	I need to do some more research, but I'm hoping that there are
> Jenkins
> 	plugins (maybe xUnit) that will provide tools to automatically handle
> 	visualization of test and sub-test results over time.  If so, I might
> 	try converting the Fuego parsers to product that format.
> 
> 	> To be honest, before presenting at LPC I was very confident that
> this
> 	> script ( or another version of it , much smarter ) could be beginning
> 	> of the solution to the problem we have. However, during the
> discussion
> 	> at LPC I understand that this might be a huge effort (not sure if
> 	> bigger) in order to solve the nightmare we already have.
> 
> 	So far, I think you're solving a bit different problem than Fuego is,
> and in one sense are
> 	much farther along than Fuego.  I'm hoping we can learn from your
> 	experience with this.
> 
> 	I do think we share the goal of producing a standard, or at least a
> recommendation,
> 	for a common test log output format.  This would help the industry
> going forward.
> 	Even if individual tests don't produce the standard format, it will help
> 3rd parties
> 	write parsers that conform the test output to the format, as well as
> encourage the
> 	development of tools that utilize the format for visualization or
> regression checking.
> 
> 	Do you feel confident enough to propose a format?  I don't at the
> moment.
> 	I'd like to survey the industry for 1) existing formats produced by
> tests (which you have good experience
> 	with, which is already maybe capture well by your perl script), and 2)
> existing tools
> 	that use common formats as input (e.g. the Jenkins xunit plugin).
> From this I'd like
> 	to develop some ideas about the fields that are most commonly
> used, and a good language to
> 	express those fields. My preference would be JSON - I'm something
> of an XML naysayer, but
> 	I could be talked into YAML.  Under no circumstances do I want to
> invent a new language for
> 	this.
> 
> 	> Tim Bird participates at the BOF and recommends me to send a mail
> to
> 	> the Fuego project team in order to look for more inputs and ideas
> bout
> 	> this topic.
> 	>
> 	> I really believe in the importance of attack this problem before we
> 	> have a bigger problem
> 	>
> 	> All feedback is more than welcome
> 
> 	Here is how I propose moving forward on this.  I'd like to get a group
> together to study this
> 	issue.  I wrote down a list of people at LPC who seem to be working
> on test issues.  I'd like to
> 	do the following:
> 	 1) perform a survey of the areas I mentioned above
> 	 2) write up a draft spec
> 	 3) send it around for comments (to what individual and lists? is an
> open issue)
> 	 4) discuss it at a future face-to-face meeting (probably at ELC or
> maybe next year's plumbers)
> 	 5) publish it as a standard endorsed by the Linux Foundation
> 
> 	Let me know what you think, and if you'd like to be involved.
> 
> 	Thanks and regards,
> 	 -- Tim
> 
> 
> 
> 
> 
> 
> --
> 
> - Guillermo Ponce


* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-09  0:21     ` Bird, Timothy
@ 2016-11-09 18:04       ` Guillermo Adrian Ponce Castañeda
  2016-11-09 20:07         ` Victor Rodriguez
  2016-11-10  4:07       ` Daniel Sangorrin
  1 sibling, 1 reply; 13+ messages in thread
From: Guillermo Adrian Ponce Castañeda @ 2016-11-09 18:04 UTC (permalink / raw)
  To: Bird, Timothy; +Cc: fuego


Hi Tim and Victor,

OK, since we would like to talk about the tools that are used to visualize
the results, I will try to describe how it works without revealing
proprietary information, since that code is not open source yet.

There is an initial script that gets the names of the active packages for
the Linux distro and fetches the build log for each one.
That script calls the count.pl script and prepends the package name to
count.pl's results. The resulting string looks like
'<package>,100,80,20,0,0', if I remember correctly, and each package's
output is appended to a big CSV file with headers.
Once we have that CSV file, we pass it to another script that creates
some graphs.

So basically it is all CSV and home-made tools to analyze it. I think it
can be automated, but I guess Victor can give us more details on the
current process, if any.
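
A rough sketch of what the consuming side does with that file (with
made-up header names, since the real tool is not public) would be
something like:

# csv_summary.py - rough sketch with invented header names; the real tool is not public.
import csv

def totals(path):
    passed = failed = 0
    with open(path, newline='') as f:
        for row in csv.DictReader(f):   # e.g. package,total,passed,failed,skipped,errors
            passed += int(row['passed'])
            failed += int(row['failed'])
    return passed, failed

print(totals('results.csv'))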

Thanks and Regards.
- Guillermo Ponce


On Tue, Nov 8, 2016 at 6:21 PM, Bird, Timothy <Tim.Bird@am.sony.com> wrote:

>
>
> > -----Original Message-----
> > From: Guillermo Adrian Ponce Castañeda on Tuesday, November 08, 2016
> 11:38 AM
> >
> > I am a co-author of this code and I must confess that it was more or
> less my
> > fault that it was made on Perl.
>
> No blame intended. :-)
>
> >
> > Regarding how many logs the program analyzes, I think it is nowhere near
> > 5000, it is much less, but taking in count that some logs are similar I
> think it is
> > possible that some logs that haven't been tested are going to work, but
> who
> > knows :).
> >
> >
> > And about the output file, right now it delivers a comma separated list
> of
> > numbers, without headers, this is because this code is part of a  bigger
> tool, I
> > think that code is not open source yet, but that doesn't matter I guess,
> the
> > thing here is that I think the output could be changed into a json like
> you
> > suggested and i can try to translate the code from Perl to Python, still
> not
> > sure how long it's gonna take, but I can sure try.
>
> Well, don't do any re-writing just yet.  I think we need to consider the
> output format some more, and decide whether it makes sense to have a
> single vs. multiple parsers first.
>
> An important issue here is scalability of the project, and making it easy
> to allow (and incentivize) other developers to create and maintain
> parsers for the log files.  Or, to help encourage people to use a common
> format either initially, or by conversion from their current log format.
> The only way to scale this is by having 3rd parties adopt the format, and
> be willing to maintain compatibility with it over time.
>
> I think it's important to consider what will motivate people to adopt a
> common
> log format.  They either need to 1) write a parser for their current
> format, or
> 2) identify an existing parser which is close enough and modify it to
> support their format, or 3) convert their test output directly to the
> desired
> format.  This will be some amount of work whichever route people take.
>
> I think what will be of value is having tools that read and process the
> format,
> and provide utility to those who use the format for output.  So I want to
> do a bit
> of a survey on what tools (visualizers, aggregators, automated processors,
> notifiers, etc.) might be useful to different developer groups, and make
> sure
> the format is something that can be used by existing tools or by envisioned
> future tools, that would be valuable to community members.
>
> In more high-level terms, we should trying to create a double-sided
> network effect,
> where use  (output) of the format drives tools creation, and tools usage
> of the format (input) drives format popularity.
>
> Can you describe a bit more what tools, if any, you use to view the
> results,
> or any other processing systems that the results are used with?  If you
> are reviewing
> results manually, are there steps you are doing now by hand that you'd
> like to
> do automatically in the future, that a common format would help you with?
>
> I'll go first - Fuego is currently just using the standard Jenkins
> "weather" report
> and 'list of recent overall pass/failure' for each test. So we don't have
> anything
> visualizing the results of sub-tests, or even displaying the counts for
> each test run, at the moment.
> Daniel Sangorrin has just recently proposed a facility to put LTP results
> into spreadsheet format,
> to allow visualizing test results over time via spreadsheet tools.  I'd
> like to add better
> sub-test visualization in the future, but that's lower on our priority
> list at the moment.
>
> Also in the future, we'd like to do test results aggregation, to allow for
> data mining
> of results from tests on different hardware platforms and embedded
> distributions.
> This will require that the parsed log output be machine-readable, and
> consistent.
>  -- Tim
>
> > On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy <Tim.Bird@am.sony.com
> > <mailto:Tim.Bird@am.sony.com> > wrote:
> >
> >
> >       Victor,
> >
> >       Thanks for raising this topic.  I think it's an important one.  I
> have
> > some comments below, inline.
> >
> >       > -----Original Message-----
> >       > From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM
> >       >
> >       > This week I presented a case of study for the problem of lack of
> > test
> >       > log output standardization in the majority of packages that are
> used
> >       > to build the current Linux distributions. This was presented as
> a BOF
> >       > ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555
> > <https://www.linuxplumbersconf.org/2016/ocw/proposals/3555> )  during
> >       > the Linux Plumbers Conference.
> >       >
> >       > it was a productive  discussion that let us share the problem
> that
> > we
> >       > have in the current projects that we use every day to build a
> >       > distribution ( either in embedded as in a cloud base
> distribution).
> >       > The open source projects don't follow a standard output log
> format
> > to
> >       > print the passing and failing tests that they run during
> packaging
> >       > time ( "make test" or "make check" )
> >       >
> >       > The Clear Linux project is using a simple Perl script that helps
> them
> >       > to count the number of passing and failing tests (which should be
> >       > trivial if could have a single standard output among all the
> projects,
> >       > but we don’t):
> >       >
> >       >
> > https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> > <https://github.com/clearlinux/autospec/blob/master/autospec/count.pl>
> >       >
> >       > # perl count.pl <http://count.pl>  <build.log>
> >
> >       A few remarks about this.  This will be something of a stream of
> > ideas, not
> >       very well organized.  I'd like to prevent requiring too many
> different
> >       language skills in Fuego.  In order to write a test for Fuego, we
> > already require
> >       knowledge of shell script, python (for the benchmark parsers) and
> > json formats
> >       (for the test specs and plans).  I'd be hesitant to adopt
> something in
> > perl, but maybe
> >       there's a way to leverage the expertise embedded in your script.
> >
> >       I'm not that fond of the idea of integrating all the parsers into
> a single
> > program.
> >       I think it's conceptually simpler to have a parser per log file
> format.
> > However,
> >       I haven't looked in detail at your parser, so I can't really
> comment on
> > it's
> >       complexity.  I note that 0day has a parser per test (but I haven't
> > checked to
> >       see if they re-use common parsers between tests.)  Possibly some
> > combination
> >       of code-driven and data-driven parsers is best, but I don't have
> the
> > experience
> >       you guys do with your parser.
> >
> >       If I understood your presentation, you are currently parsing
> >       logs for thousands of packages. I thought you said that about half
> of
> > the
> >       20,000 packages in a distro have unit tests, and I thought you said
> > that
> >       your parser was covering about half of those (so, about 5000
> > packages currently).
> >       And this is with 26 log formats parsed so far.
> >
> >       I'm guessing that packages have a "long tail" of formats, with them
> > getting
> >       weirder and weirder the farther out on the tail of formats you get.
> >
> >       Please correct my numbers if I'm mistaken.
> >
> >       > Examples of real packages build logs:
> >       >
> >       >
> > https://kojipkgs.fedoraproject.org//packages/
> gcc/6.2.1/2.fc25/data/logs/x8
> > <https://kojipkgs.fedoraproject.org//packages/
> gcc/6.2.1/2.fc25/data/logs/x
> > 8>
> >       > 6_64/build.log
> >       >
> > https://kojipkgs.fedoraproject.org//packages/
> acl/2.2.52/11.fc24/data/logs/x
> > <https://kojipkgs.fedoraproject.org//packages/
> acl/2.2.52/11.fc24/data/logs/
> > x>
> >       > 86_64/build.log
> >       >
> >       > So far that simple (and not well engineered) parser has found 26
> >       > “standard” outputs ( and counting ) .
> >
> >       This is actually remarkable, as Fuego is only handing the formats
> for
> > the
> >       standalone tests we ship with Fuego.  As I stated in the BOF, we
> have
> > two
> >       mechanisms, one for functional tests that uses shell, grep and
> diff,
> > and
> >       one for benchmark tests that uses a very small python program that
> > uses
> >       regexes.   So, currently we only have 50 tests covered, but many of
> > these
> >       parsers use very simple one-line grep regexes.
> >
> >       Neither of these Fuego log results parser methods supports tracking
> > individual
> >       subtest results.
> >
> >       > The script has the fail that it
> >       > does not recognize the name of the tests in order to detect
> >       > regressions. Maybe one test was passing in the previous release
> > and in
> >       > the new one is failing, and then the number of failing tests
> remains
> >       > the same.
> >
> >       This is a concern with the Fuego log parsing as well.
> >
> >       I would like to modify Fuego's parser to not just parse out
> counts, but
> > to
> >       also convert the results to something where individual sub-tests
> can
> > be
> >       tracked over time.  Daniel Sangorrin's recent work converting the
> > output
> >       of LTP into excel format might be one way to do this (although I'm
> > not
> >       that comfortable with using a proprietary format - I would prefer
> CSV
> >       or json, but I think Daniel is going for ease of use first.)
> >
> >       I need to do some more research, but I'm hoping that there are
> > Jenkins
> >       plugins (maybe xUnit) that will provide tools to automatically
> handle
> >       visualization of test and sub-test results over time.  If so, I
> might
> >       try converting the Fuego parsers to product that format.
> >
> >       > To be honest, before presenting at LPC I was very confident that
> > this
> >       > script ( or another version of it , much smarter ) could be
> beginning
> >       > of the solution to the problem we have. However, during the
> > discussion
> >       > at LPC I understand that this might be a huge effort (not sure if
> >       > bigger) in order to solve the nightmare we already have.
> >
> >       So far, I think you're solving a bit different problem than Fuego
> is,
> > and in one sense are
> >       much farther along than Fuego.  I'm hoping we can learn from your
> >       experience with this.
> >
> >       I do think we share the goal of producing a standard, or at least a
> > recommendation,
> >       for a common test log output format.  This would help the industry
> > going forward.
> >       Even if individual tests don't produce the standard format, it
> will help
> > 3rd parties
> >       write parsers that conform the test output to the format, as well
> as
> > encourage the
> >       development of tools that utilize the format for visualization or
> > regression checking.
> >
> >       Do you feel confident enough to propose a format?  I don't at the
> > moment.
> >       I'd like to survey the industry for 1) existing formats produced by
> > tests (which you have good experience
> >       with, which is already maybe capture well by your perl script),
> and 2)
> > existing tools
> >       that use common formats as input (e.g. the Jenkins xunit plugin).
> > From this I'd like
> >       to develop some ideas about the fields that are most commonly
> > used, and a good language to
> >       express those fields. My preference would be JSON - I'm something
> > of an XML naysayer, but
> >       I could be talked into YAML.  Under no circumstances do I want to
> > invent a new language for
> >       this.
> >
> >       > Tim Bird participates at the BOF and recommends me to send a mail
> > to
> >       > the Fuego project team in order to look for more inputs and ideas
> > bout
> >       > this topic.
> >       >
> >       > I really believe in the importance of attack this problem before
> we
> >       > have a bigger problem
> >       >
> >       > All feedback is more than welcome
> >
> >       Here is how I propose moving forward on this.  I'd like to get a
> group
> > together to study this
> >       issue.  I wrote down a list of people at LPC who seem to be working
> > on test issues.  I'd like to
> >       do the following:
> >        1) perform a survey of the areas I mentioned above
> >        2) write up a draft spec
> >        3) send it around for comments (to what individual and lists? is
> an
> > open issue)
> >        4) discuss it at a future face-to-face meeting (probably at ELC or
> > maybe next year's plumbers)
> >        5) publish it as a standard endorsed by the Linux Foundation
> >
> >       Let me know what you think, and if you'd like to be involved.
> >
> >       Thanks and regards,
> >        -- Tim
> >
> >
> >
> >
> >
> >
> > --
> >
> > - Guillermo Ponce
>



-- 
- Guillermo Ponce



* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-09 18:04       ` Guillermo Adrian Ponce Castañeda
@ 2016-11-09 20:07         ` Victor Rodriguez
  0 siblings, 0 replies; 13+ messages in thread
From: Victor Rodriguez @ 2016-11-09 20:07 UTC (permalink / raw)
  To: Guillermo Adrian Ponce Castañeda; +Cc: fuego

Hi team

Sorry for the delay

On Wed, Nov 9, 2016 at 1:04 PM, Guillermo Adrian Ponce Castañeda
<ga.poncec@gmail.com> wrote:
> Hi Tim and Victor,
>
> Ok, if we would like to talk about the tools that are used to visualize the
> results, I will try to describe the way it works without incurring in
> revealing proprietary information, since that code is not open source yet.
>
> There is an initial script that gets the active packages names for the linux
> distro and gets the build logs for each one.
> That script calls count.pl script and attaches the package name to the
> results of the count.pl script. That string will result like
> '<package>,100,80,20,0,0', if I remind correctly, and each package output
> will be appended to big csv file with headers.
> After we have that csv file we pass it to another script that will create
> some graphs.
>
> So basically it is all CSV and home made tools to analyze them, I think it
> can be automated, but I guess Victor can give us more details on the current
> process on that matter if any.
>
> Thanks and Regards.
> - Guillermo Ponce
>
>
> On Tue, Nov 8, 2016 at 6:21 PM, Bird, Timothy <Tim.Bird@am.sony.com> wrote:
>>
>>
>>
>> > -----Original Message-----
>> > From: Guillermo Adrian Ponce Castañeda on Tuesday, November 08, 2016
>> > 11:38 AM
>> >
>> > I am a co-author of this code and I must confess that it was more or
>> > less my
>> > fault that it was made on Perl.
>>
>> No blame intended. :-)
>>
>> >
>> > Regarding how many logs the program analyzes, I think it is nowhere near
>> > 5000, it is much less, but taking in count that some logs are similar I
>> > think it is
>> > possible that some logs that haven't been tested are going to work, but
>> > who
>> > knows :).
>> >
>> >
>> > And about the output file, right now it delivers a comma separated list
>> > of
>> > numbers, without headers, this is because this code is part of a  bigger
>> > tool, I
>> > think that code is not open source yet, but that doesn't matter I guess,
>> > the
>> > thing here is that I think the output could be changed into a json like
>> > you
>> > suggested and i can try to translate the code from Perl to Python, still
>> > not
>> > sure how long it's gonna take, but I can sure try.
>>
>> Well, don't do any re-writing just yet.  I think we need to consider the
>> output format some more, and decide whether it makes sense to have a
>> single vs. multiple parsers first.
>>
>> An important issue here is scalability of the project, and making it easy
>> to allow (and incentivize) other developers to create and maintain
>> parsers for the log files.  Or, to help encourage people to use a common
>> format either initially, or by conversion from their current log format.
>> The only way to scale this is by having 3rd parties adopt the format, and
>> be willing to maintain compatibility with it over time.
>>

I am more than happy to send patches upstream once we can agree on a
standard output format.

Memo, I will try to count how many packages use each parser and
find the most used one.

>> I think it's important to consider what will motivate people to adopt a
>> common
>> log format.  They either need to 1) write a parser for their current
>> format, or
>> 2) identify an existing parser which is close enough and modify it to
>> support their format, or 3) convert their test output directly to the
>> desired
>> format.  This will be some amount of work whichever route people take.
>>
>> I think what will be of value is having tools that read and process the
>> format,
>> and provide utility to those who use the format for output.  So I want to
>> do a bit
>> of a survey on what tools (visualizers, aggregators, automated processors,
>> notifiers, etc.) might be useful to different developer groups, and make
>> sure
>> the format is something that can be used by existing tools or by
>> envisioned
>> future tools, that would be valuable to community members.
>>

I think this is a pretty good standard if we are looking for compatibility:

https://testanything.org/

Jenkins also has plugins for this.
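
For example, TAP output is just plain text along these lines (a
minimal, made-up run):

1..3
ok 1 - test_acl_basic
not ok 2 - test_acl_perms
# expected mode 0644, got 0600
ok 3 - test_acl_remove # SKIP requires root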


>> In more high-level terms, we should trying to create a double-sided
>> network effect,
>> where use  (output) of the format drives tools creation, and tools usage
>> of the format (input) drives format popularity.
>>
>> Can you describe a bit more what tools, if any, you use to view the
>> results,
>> or any other processing systems that the results are used with?  If you
>> are reviewing
>> results manually, are there steps you are doing now by hand that you'd
>> like to
>> do automatically in the future, that a common format would help you with?
>>
>> I'll go first - Fuego is currently just using the standard Jenkins
>> "weather" report
>> and 'list of recent overall pass/failure' for each test. So we don't have
>> anything
>> visualizing the results of sub-tests, or even displaying the counts for
>> each test run, at the moment.
>> Daniel Sangorrin has just recently proposed a facility to put LTP results
>> into spreadsheet format,
>> to allow visualizing test results over time via spreadsheet tools.  I'd
>> like to add better
>> sub-test visualization in the future, but that's lower on our priority
>> list at the moment.
>>
>> Also in the future, we'd like to do test results aggregation, to allow for
>> data mining
>> of results from tests on different hardware platforms and embedded
>> distributions.
>> This will require that the parsed log output be machine-readable, and
>> consistent.

Agreed, this will apply to multiple platforms.


Thanks a lot for following up on this, Tim. This is a good topic, and if
we can solve it, we will help the Linux community a lot.

Do you think we should do a survey of other OS distributions?
(Debian, Fedora, SUSE... maybe even Yocto)

Where else do you think we should raise this problem?

Do you think we should write a small article about it in LWN (or
something similar) to gain traction (a call to action)?

The idea of a Linux testing summit is amazing; count on me to help
with that :)

Regards

Victor Rodriguez

>>  -- Tim
>>
>> > On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy <Tim.Bird@am.sony.com
>> > <mailto:Tim.Bird@am.sony.com> > wrote:
>> >
>> >
>> >       Victor,
>> >
>> >       Thanks for raising this topic.  I think it's an important one.  I
>> > have
>> > some comments below, inline.
>> >
>> >       > -----Original Message-----
>> >       > From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM
>> >       >
>> >       > This week I presented a case of study for the problem of lack of
>> > test
>> >       > log output standardization in the majority of packages that are
>> > used
>> >       > to build the current Linux distributions. This was presented as
>> > a BOF
>> >       > ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555
>> > <https://www.linuxplumbersconf.org/2016/ocw/proposals/3555> )  during
>> >       > the Linux Plumbers Conference.
>> >       >
>> >       > it was a productive  discussion that let us share the problem
>> > that
>> > we
>> >       > have in the current projects that we use every day to build a
>> >       > distribution ( either in embedded as in a cloud base
>> > distribution).
>> >       > The open source projects don't follow a standard output log
>> > format
>> > to
>> >       > print the passing and failing tests that they run during
>> > packaging
>> >       > time ( "make test" or "make check" )
>> >       >
>> >       > The Clear Linux project is using a simple Perl script that helps
>> > them
>> >       > to count the number of passing and failing tests (which should
>> > be
>> >       > trivial if could have a single standard output among all the
>> > projects,
>> >       > but we don’t):
>> >       >
>> >       >
>> > https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
>> > <https://github.com/clearlinux/autospec/blob/master/autospec/count.pl>
>> >       >
>> >       > # perl count.pl <http://count.pl>  <build.log>
>> >
>> >       A few remarks about this.  This will be something of a stream of
>> > ideas, not
>> >       very well organized.  I'd like to prevent requiring too many
>> > different
>> >       language skills in Fuego.  In order to write a test for Fuego, we
>> > already require
>> >       knowledge of shell script, python (for the benchmark parsers) and
>> > json formats
>> >       (for the test specs and plans).  I'd be hesitant to adopt
>> > something in
>> > perl, but maybe
>> >       there's a way to leverage the expertise embedded in your script.
>> >
>> >       I'm not that fond of the idea of integrating all the parsers into
>> > a single
>> > program.
>> >       I think it's conceptually simpler to have a parser per log file
>> > format.
>> > However,
>> >       I haven't looked in detail at your parser, so I can't really
>> > comment on
>> > it's
>> >       complexity.  I note that 0day has a parser per test (but I haven't
>> > checked to
>> >       see if they re-use common parsers between tests.)  Possibly some
>> > combination
>> >       of code-driven and data-driven parsers is best, but I don't have
>> > the
>> > experience
>> >       you guys do with your parser.
>> >
>> >       If I understood your presentation, you are currently parsing
>> >       logs for thousands of packages. I thought you said that about half
>> > of
>> > the
>> >       20,000 packages in a distro have unit tests, and I thought you
>> > said
>> > that
>> >       your parser was covering about half of those (so, about 5000
>> > packages currently).
>> >       And this is with 26 log formats parsed so far.
>> >
>> >       I'm guessing that packages have a "long tail" of formats, with
>> > them
>> > getting
>> >       weirder and weirder the farther out on the tail of formats you
>> > get.
>> >
>> >       Please correct my numbers if I'm mistaken.
>> >
>> >       > Examples of real packages build logs:
>> >       >
>> >       >
>> >
>> > https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x8
>> >
>> > <https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x
>> > 8>
>> >       > 6_64/build.log
>> >       >
>> >
>> > https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x
>> >
>> > <https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/
>> > x>
>> >       > 86_64/build.log
>> >       >
>> >       > So far that simple (and not well engineered) parser has found 26
>> >       > “standard” outputs ( and counting ) .
>> >
>> >       This is actually remarkable, as Fuego is only handing the formats
>> > for
>> > the
>> >       standalone tests we ship with Fuego.  As I stated in the BOF, we
>> > have
>> > two
>> >       mechanisms, one for functional tests that uses shell, grep and
>> > diff,
>> > and
>> >       one for benchmark tests that uses a very small python program that
>> > uses
>> >       regexes.   So, currently we only have 50 tests covered, but many
>> > of
>> > these
>> >       parsers use very simple one-line grep regexes.
>> >
>> >       Neither of these Fuego log results parser methods supports
>> > tracking
>> > individual
>> >       subtest results.
>> >
>> >       > The script has the fail that it
>> >       > does not recognize the name of the tests in order to detect
>> >       > regressions. Maybe one test was passing in the previous release
>> > and in
>> >       > the new one is failing, and then the number of failing tests
>> > remains
>> >       > the same.
>> >
>> >       This is a concern with the Fuego log parsing as well.
>> >
>> >       I would like to modify Fuego's parser to not just parse out
>> > counts, but
>> > to
>> >       also convert the results to something where individual sub-tests
>> > can
>> > be
>> >       tracked over time.  Daniel Sangorrin's recent work converting the
>> > output
>> >       of LTP into excel format might be one way to do this (although I'm
>> > not
>> >       that comfortable with using a proprietary format - I would prefer
>> > CSV
>> >       or json, but I think Daniel is going for ease of use first.)
>> >
>> >       I need to do some more research, but I'm hoping that there are
>> > Jenkins
>> >       plugins (maybe xUnit) that will provide tools to automatically
>> > handle
>> >       visualization of test and sub-test results over time.  If so, I
>> > might
>> >       try converting the Fuego parsers to product that format.
>> >
>> >       > To be honest, before presenting at LPC I was very confident that
>> > this
>> >       > script ( or another version of it , much smarter ) could be
>> > beginning
>> >       > of the solution to the problem we have. However, during the
>> > discussion
>> >       > at LPC I understand that this might be a huge effort (not sure
>> > if
>> >       > bigger) in order to solve the nightmare we already have.
>> >
>> >       So far, I think you're solving a bit different problem than Fuego
>> > is,
>> > and in one sense are
>> >       much farther along than Fuego.  I'm hoping we can learn from your
>> >       experience with this.
>> >
>> >       I do think we share the goal of producing a standard, or at least
>> > a
>> > recommendation,
>> >       for a common test log output format.  This would help the industry
>> > going forward.
>> >       Even if individual tests don't produce the standard format, it
>> > will help
>> > 3rd parties
>> >       write parsers that conform the test output to the format, as well
>> > as
>> > encourage the
>> >       development of tools that utilize the format for visualization or
>> > regression checking.
>> >
>> >       Do you feel confident enough to propose a format?  I don't at the
>> > moment.
>> >       I'd like to survey the industry for 1) existing formats produced
>> > by
>> > tests (which you have good experience
>> >       with, which is already maybe capture well by your perl script),
>> > and 2)
>> > existing tools
>> >       that use common formats as input (e.g. the Jenkins xunit plugin).
>> > From this I'd like
>> >       to develop some ideas about the fields that are most commonly
>> > used, and a good language to
>> >       express those fields. My preference would be JSON - I'm something
>> > of an XML naysayer, but
>> >       I could be talked into YAML.  Under no circumstances do I want to
>> > invent a new language for
>> >       this.
>> >
>> >       > Tim Bird participates at the BOF and recommends me to send a
>> > mail
>> > to
>> >       > the Fuego project team in order to look for more inputs and
>> > ideas
>> > bout
>> >       > this topic.
>> >       >
>> >       > I really believe in the importance of attack this problem before
>> > we
>> >       > have a bigger problem
>> >       >
>> >       > All feedback is more than welcome
>> >
>> >       Here is how I propose moving forward on this.  I'd like to get a
>> > group
>> > together to study this
>> >       issue.  I wrote down a list of people at LPC who seem to be
>> > working
>> > on test issues.  I'd like to
>> >       do the following:
>> >        1) perform a survey of the areas I mentioned above
>> >        2) write up a draft spec
>> >        3) send it around for comments (to what individual and lists? is
>> > an
>> > open issue)
>> >        4) discuss it at a future face-to-face meeting (probably at ELC
>> > or
>> > maybe next year's plumbers)
>> >        5) publish it as a standard endorsed by the Linux Foundation
>> >
>> >       Let me know what you think, and if you'd like to be involved.
>> >
>> >       Thanks and regards,
>> >        -- Tim
>> >
>> >
>> >
>> >
>> >
>> >
>> > --
>> >
>> > - Guillermo Ponce
>
>
>
>
> --
> - Guillermo Ponce

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-05 17:15 [Fuego] LPC Increase Test Coverage in a Linux-based OS Victor Rodriguez
  2016-11-08  0:26 ` Bird, Timothy
@ 2016-11-10  3:09 ` Daniel Sangorrin
  2016-11-10 13:30   ` Victor Rodriguez
  1 sibling, 1 reply; 13+ messages in thread
From: Daniel Sangorrin @ 2016-11-10  3:09 UTC (permalink / raw)
  To: 'Victor Rodriguez',
	fuego, 'Guillermo Adrian Ponce Castañeda'

Hi Victor,

> -----Original Message-----
> From: fuego-bounces@lists.linuxfoundation.org [mailto:fuego-bounces@lists.linuxfoundation.org] On Behalf Of Victor Rodriguez
> Sent: Sunday, November 06, 2016 2:15 AM
> To: fuego@lists.linuxfoundation.org; Guillermo Adrian Ponce Castañeda
> Subject: [Fuego] LPC Increase Test Coverage in a Linux-based OS
> 
> Hi Fuego team.
> 
> This week I presented a case of study for the problem of lack of test
> log output standardization in the majority of packages that are used
> to build the current Linux distributions. This was presented as a BOF
> ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555)  during
> the Linux Plumbers Conference.
> 
> it was a productive  discussion that let us share the problem that we
> have in the current projects that we use every day to build a
> distribution ( either in embedded as in a cloud base distribution).
> The open source projects don't follow a standard output log format to
> print the passing and failing tests that they run during packaging
> time ( "make test" or "make check" )

Sorry, I couldn't download your slides because of proxy issues, but
I think you are talking about the tests that are inside packages (e.g. .deb/.rpm files).
For example, autopkgtest for Debian. Is that correct?

I'm not an expert on them, but I believe these tests can also be executed
decoupled from the build process in a flexible way (e.g. locally, on qemu,
remotely through ssh, or in an lxc/schroot environment).

Being able to leverage all these tests in Fuego for testing package-based 
embedded systems would be great. 

For non-package-based embedded systems, I think those tests [2]
could be ported and made cross-compilable. In particular, Yocto/OpenEmbedded's ptest
framework decouples the compiling phase from the testing phase and
produces "a consistent output format".

[1] https://packages.debian.org/sid/autopkgtest
[2] https://wiki.yoctoproject.org/wiki/Ptest
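
For reference, if I remember correctly the "consistent output format" produced by ptest
is just one line per test case, something like this (the test names here are made up):

PASS: tst-strtod
FAIL: tst-mktime
SKIP: tst-locale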

> The Clear Linux project is using a simple Perl script that helps them
> to count the number of passing and failing tests (which should be
> trivial if could have a single standard output among all the projects,
> but we don’t):

I think that counting is good, but we also need to know which specific test/subtest
failed and what the error log looked like.

Best regards
Daniel

--
IoT Technology center
Toshiba Corp. Industrial ICT solutions, 
Daniel SANGORRIN



> https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> 
> # perl count.pl <build.log>
> 
> Examples of real packages build logs:
> 
> https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x86_64/build.log
> https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x86_64/build.log
> 
> So far that simple (and not well engineered) parser has found 26
> “standard” outputs ( and counting ) .  The script has the fail that it
> does not recognize the name of the tests in order to detect
> regressions. Maybe one test was passing in the previous release and in
> the new one is failing, and then the number of failing tests remains
> the same.
> 
> To be honest, before presenting at LPC I was very confident that this
> script ( or another version of it , much smarter ) could be beginning
> of the solution to the problem we have. However, during the discussion
> at LPC I understand that this might be a huge effort (not sure if
> bigger) in order to solve the nightmare we already have.
> 
> Tim Bird participates at the BOF and recommends me to send a mail to
> the Fuego project team in order to look for more inputs and ideas bout
> this topic.
> 
> I really believe in the importance of attack this problem before we
> have a bigger problem
> 
> All feedback is more than welcome
> 
> Regards
> 
> Victor Rodriguez
> 
> [presentation slides] :
> https://drive.google.com/open?id=0B7iKrGdVkDhIcVpncUdGTGhEQTQ
> [BOF notes] : https://drive.google.com/open?id=1lOPXQcrhL4AoOBSDnwUlJAKIXsReU8OqP82usZn-DCo
> _______________________________________________
> Fuego mailing list
> Fuego@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/fuego



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-09  0:21     ` Bird, Timothy
  2016-11-09 18:04       ` Guillermo Adrian Ponce Castañeda
@ 2016-11-10  4:07       ` Daniel Sangorrin
  1 sibling, 0 replies; 13+ messages in thread
From: Daniel Sangorrin @ 2016-11-10  4:07 UTC (permalink / raw)
  To: 'Bird, Timothy', 'Guillermo Adrian Ponce Castañeda'
  Cc: fuego

Hi all,

> -----Original Message-----
> From: fuego-bounces@lists.linuxfoundation.org [mailto:fuego-bounces@lists.linuxfoundation.org] On Behalf Of Bird, Timothy
> Sent: Wednesday, November 09, 2016 9:21 AM
> To: Guillermo Adrian Ponce Castañeda
> Cc: fuego@lists.linuxfoundation.org
> Subject: Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
> 
...
> I'll go first - Fuego is currently just using the standard Jenkins "weather" report
> and 'list of recent overall pass/failure' for each test. So we don't have anything
> visualizing the results of sub-tests, or even displaying the counts for each test run, at the moment.
> Daniel Sangorrin has just recently proposed a facility to put LTP results into spreadsheet format,
> to allow visualizing test results over time via spreadsheet tools.  I'd like to add better
> sub-test visualization in the future, but that's lower on our priority list at the moment.

Actually, the spreadsheet format I'm using is basically CSV + some colors to easily distinguish
failed from passed tests. It can be opened with LibreOffice or exported to CSV format.
I can add CSV output to my script (CSV can also be opened with LibreOffice).
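As a rough sketch (the columns here are just an example, not a fixed format), the CSV
output could be as simple as:

import csv

# Hypothetical example: one row per LTP test case (names and columns are made up)
results = [("math01", "PASS", ""), ("mmap02", "FAIL", "errno=12")]
with open("results.csv", "w") as f:
    writer = csv.writer(f)
    writer.writerow(["testcase", "result", "error_log"])
    writer.writerows(results)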
 
Best regards
Daniel

> Also in the future, we'd like to do test results aggregation, to allow for data mining
> of results from tests on different hardware platforms and embedded distributions.
> This will require that the parsed log output be machine-readable, and consistent.
>  -- Tim
> 
> > On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy <Tim.Bird@am.sony.com
> > <mailto:Tim.Bird@am.sony.com> > wrote:
> >
> >
> > 	Victor,
> >
> > 	Thanks for raising this topic.  I think it's an important one.  I have
> > some comments below, inline.
> >
> > 	> -----Original Message-----
> > 	> From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM
> > 	>
> > 	> This week I presented a case of study for the problem of lack of
> > test
> > 	> log output standardization in the majority of packages that are used
> > 	> to build the current Linux distributions. This was presented as a BOF
> > 	> ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555
> > <https://www.linuxplumbersconf.org/2016/ocw/proposals/3555> )  during
> > 	> the Linux Plumbers Conference.
> > 	>
> > 	> it was a productive  discussion that let us share the problem that
> > we
> > 	> have in the current projects that we use every day to build a
> > 	> distribution ( either in embedded as in a cloud base distribution).
> > 	> The open source projects don't follow a standard output log format
> > to
> > 	> print the passing and failing tests that they run during packaging
> > 	> time ( "make test" or "make check" )
> > 	>
> > 	> The Clear Linux project is using a simple Perl script that helps them
> > 	> to count the number of passing and failing tests (which should be
> > 	> trivial if could have a single standard output among all the projects,
> > 	> but we don’t):
> > 	>
> > 	>
> > https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> > <https://github.com/clearlinux/autospec/blob/master/autospec/count.pl>
> > 	>
> > 	> # perl count.pl <http://count.pl>  <build.log>
> >
> > 	A few remarks about this.  This will be something of a stream of
> > ideas, not
> > 	very well organized.  I'd like to prevent requiring too many different
> > 	language skills in Fuego.  In order to write a test for Fuego, we
> > already require
> > 	knowledge of shell script, python (for the benchmark parsers) and
> > json formats
> > 	(for the test specs and plans).  I'd be hesitant to adopt something in
> > perl, but maybe
> > 	there's a way to leverage the expertise embedded in your script.
> >
> > 	I'm not that fond of the idea of integrating all the parsers into a single
> > program.
> > 	I think it's conceptually simpler to have a parser per log file format.
> > However,
> > 	I haven't looked in detail at your parser, so I can't really comment on
> > it's
> > 	complexity.  I note that 0day has a parser per test (but I haven't
> > checked to
> > 	see if they re-use common parsers between tests.)  Possibly some
> > combination
> > 	of code-driven and data-driven parsers is best, but I don't have the
> > experience
> > 	you guys do with your parser.
> >
> > 	If I understood your presentation, you are currently parsing
> > 	logs for thousands of packages. I thought you said that about half of
> > the
> > 	20,000 packages in a distro have unit tests, and I thought you said
> > that
> > 	your parser was covering about half of those (so, about 5000
> > packages currently).
> > 	And this is with 26 log formats parsed so far.
> >
> > 	I'm guessing that packages have a "long tail" of formats, with them
> > getting
> > 	weirder and weirder the farther out on the tail of formats you get.
> >
> > 	Please correct my numbers if I'm mistaken.
> >
> > 	> Examples of real packages build logs:
> > 	>
> > 	>
> > https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x8
> > <https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x
> > 8>
> > 	> 6_64/build.log
> > 	>
> > https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x
> > <https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/
> > x>
> > 	> 86_64/build.log
> > 	>
> > 	> So far that simple (and not well engineered) parser has found 26
> > 	> “standard” outputs ( and counting ) .
> >
> > 	This is actually remarkable, as Fuego is only handing the formats for
> > the
> > 	standalone tests we ship with Fuego.  As I stated in the BOF, we have
> > two
> > 	mechanisms, one for functional tests that uses shell, grep and diff,
> > and
> > 	one for benchmark tests that uses a very small python program that
> > uses
> > 	regexes.   So, currently we only have 50 tests covered, but many of
> > these
> > 	parsers use very simple one-line grep regexes.
> >
> > 	Neither of these Fuego log results parser methods supports tracking
> > individual
> > 	subtest results.
> >
> > 	> The script has the fail that it
> > 	> does not recognize the name of the tests in order to detect
> > 	> regressions. Maybe one test was passing in the previous release
> > and in
> > 	> the new one is failing, and then the number of failing tests remains
> > 	> the same.
> >
> > 	This is a concern with the Fuego log parsing as well.
> >
> > 	I would like to modify Fuego's parser to not just parse out counts, but
> > to
> > 	also convert the results to something where individual sub-tests can
> > be
> > 	tracked over time.  Daniel Sangorrin's recent work converting the
> > output
> > 	of LTP into excel format might be one way to do this (although I'm
> > not
> > 	that comfortable with using a proprietary format - I would prefer CSV
> > 	or json, but I think Daniel is going for ease of use first.)
> >
> > 	I need to do some more research, but I'm hoping that there are
> > Jenkins
> > 	plugins (maybe xUnit) that will provide tools to automatically handle
> > 	visualization of test and sub-test results over time.  If so, I might
> > 	try converting the Fuego parsers to product that format.
> >
> > 	> To be honest, before presenting at LPC I was very confident that
> > this
> > 	> script ( or another version of it , much smarter ) could be beginning
> > 	> of the solution to the problem we have. However, during the
> > discussion
> > 	> at LPC I understand that this might be a huge effort (not sure if
> > 	> bigger) in order to solve the nightmare we already have.
> >
> > 	So far, I think you're solving a bit different problem than Fuego is,
> > and in one sense are
> > 	much farther along than Fuego.  I'm hoping we can learn from your
> > 	experience with this.
> >
> > 	I do think we share the goal of producing a standard, or at least a
> > recommendation,
> > 	for a common test log output format.  This would help the industry
> > going forward.
> > 	Even if individual tests don't produce the standard format, it will help
> > 3rd parties
> > 	write parsers that conform the test output to the format, as well as
> > encourage the
> > 	development of tools that utilize the format for visualization or
> > regression checking.
> >
> > 	Do you feel confident enough to propose a format?  I don't at the
> > moment.
> > 	I'd like to survey the industry for 1) existing formats produced by
> > tests (which you have good experience
> > 	with, which is already maybe capture well by your perl script), and 2)
> > existing tools
> > 	that use common formats as input (e.g. the Jenkins xunit plugin).
> > From this I'd like
> > 	to develop some ideas about the fields that are most commonly
> > used, and a good language to
> > 	express those fields. My preference would be JSON - I'm something
> > of an XML naysayer, but
> > 	I could be talked into YAML.  Under no circumstances do I want to
> > invent a new language for
> > 	this.
> >
> > 	> Tim Bird participates at the BOF and recommends me to send a mail
> > to
> > 	> the Fuego project team in order to look for more inputs and ideas
> > bout
> > 	> this topic.
> > 	>
> > 	> I really believe in the importance of attack this problem before we
> > 	> have a bigger problem
> > 	>
> > 	> All feedback is more than welcome
> >
> > 	Here is how I propose moving forward on this.  I'd like to get a group
> > together to study this
> > 	issue.  I wrote down a list of people at LPC who seem to be working
> > on test issues.  I'd like to
> > 	do the following:
> > 	 1) perform a survey of the areas I mentioned above
> > 	 2) write up a draft spec
> > 	 3) send it around for comments (to what individual and lists? is an
> > open issue)
> > 	 4) discuss it at a future face-to-face meeting (probably at ELC or
> > maybe next year's plumbers)
> > 	 5) publish it as a standard endorsed by the Linux Foundation
> >
> > 	Let me know what you think, and if you'd like to be involved.
> >
> > 	Thanks and regards,
> > 	 -- Tim
> >
> >
> >
> >
> >
> >
> > --
> >
> > - Guillermo Ponce
> _______________________________________________
> Fuego mailing list
> Fuego@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/fuego



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-10  3:09 ` Daniel Sangorrin
@ 2016-11-10 13:30   ` Victor Rodriguez
  2016-11-14  1:44     ` Daniel Sangorrin
  0 siblings, 1 reply; 13+ messages in thread
From: Victor Rodriguez @ 2016-11-10 13:30 UTC (permalink / raw)
  To: Daniel Sangorrin; +Cc: fuego

On Wed, Nov 9, 2016 at 10:09 PM, Daniel Sangorrin
<daniel.sangorrin@toshiba.co.jp> wrote:
> Hi Victor,
>
>> -----Original Message-----
>> From: fuego-bounces@lists.linuxfoundation.org [mailto:fuego-bounces@lists.linuxfoundation.org] On Behalf Of Victor Rodriguez
>> Sent: Sunday, November 06, 2016 2:15 AM
>> To: fuego@lists.linuxfoundation.org; Guillermo Adrian Ponce Castañeda
>> Subject: [Fuego] LPC Increase Test Coverage in a Linux-based OS
>>
>> Hi Fuego team.
>>
>> This week I presented a case of study for the problem of lack of test
>> log output standardization in the majority of packages that are used
>> to build the current Linux distributions. This was presented as a BOF
>> ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555)  during
>> the Linux Plumbers Conference.
>>
>> it was a productive  discussion that let us share the problem that we
>> have in the current projects that we use every day to build a
>> distribution ( either in embedded as in a cloud base distribution).
>> The open source projects don't follow a standard output log format to
>> print the passing and failing tests that they run during packaging
>> time ( "make test" or "make check" )
>
> Sorry I couldn't download your slides because of proxy issues but
> I think you are talking about the tests that are inside packages (e.g. .deb .rpm files).
> For example, autopkgtest for debian. Is that correct?
>

Yes

> I'm not an expert about them, but I believe these tests can also be executed
> decoupled  from the build process in a flexible way (e.g.: locally, on qemu,
> remotely through ssh, or on an lxc/schroot environment for example).
>

Yes, with a little extra work in the tool paths; for example, some
of the tests point to the binary they build
instead of the one in /usr/bin. But yes, with a little
extra work all these tests can be decoupled.

> Being able to leverage all these tests in Fuego for testing package-based
> embedded systems would be great.
>

Yes !!!

> For non-package-based embedded systems, I think those tests [2]
> could be ported and made cross-compilable. In particular, Yocto/OpenEmbedded's ptest
> framework decouples the compiling phase from the testing phase and
> produces "a consistent output format".
>
> [1] https://packages.debian.org/sid/autopkgtest
> [2] https://wiki.yoctoproject.org/wiki/Ptest
>

I knew I was not wrong when I mentioned ptest during the conference.

Let me take a look and see how it works.

>> The Clear Linux project is using a simple Perl script that helps them
>> to count the number of passing and failing tests (which should be
>> trivial if could have a single standard output among all the projects,
>> but we don’t):
>
> I think that counting is good but we also need to know specifically which test/subtest
> in particular failed and what the error log was like.
>

Great. How do you push this to Jenkins?

What do you think about TAP?

If you could share a CSV example, that would be great.

Thanks, Daniel

> Best regards
> Daniel
>
> --
> IoT Technology center
> Toshiba Corp. Industrial ICT solutions,
> Daniel SANGORRIN
>
>
>
>> https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
>>
>> # perl count.pl <build.log>
>>
>> Examples of real packages build logs:
>>
>> https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x86_64/build.log
>> https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x86_64/build.log
>>
>> So far that simple (and not well engineered) parser has found 26
>> “standard” outputs ( and counting ) .  The script has the fail that it
>> does not recognize the name of the tests in order to detect
>> regressions. Maybe one test was passing in the previous release and in
>> the new one is failing, and then the number of failing tests remains
>> the same.
>>
>> To be honest, before presenting at LPC I was very confident that this
>> script ( or another version of it , much smarter ) could be beginning
>> of the solution to the problem we have. However, during the discussion
>> at LPC I understand that this might be a huge effort (not sure if
>> bigger) in order to solve the nightmare we already have.
>>
>> Tim Bird participates at the BOF and recommends me to send a mail to
>> the Fuego project team in order to look for more inputs and ideas bout
>> this topic.
>>
>> I really believe in the importance of attack this problem before we
>> have a bigger problem
>>
>> All feedback is more than welcome
>>
>> Regards
>>
>> Victor Rodriguez
>>
>> [presentation slides] :
>> https://drive.google.com/open?id=0B7iKrGdVkDhIcVpncUdGTGhEQTQ
>> [BOF notes] : https://drive.google.com/open?id=1lOPXQcrhL4AoOBSDnwUlJAKIXsReU8OqP82usZn-DCo
>> _______________________________________________
>> Fuego mailing list
>> Fuego@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/fuego
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-10 13:30   ` Victor Rodriguez
@ 2016-11-14  1:44     ` Daniel Sangorrin
  2016-12-14  1:53       ` Victor Rodriguez
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Sangorrin @ 2016-11-14  1:44 UTC (permalink / raw)
  To: 'Victor Rodriguez'; +Cc: fuego

[-- Attachment #1: Type: text/plain, Size: 9498 bytes --]

> -----Original Message-----
> From: Victor Rodriguez [mailto:vm.rod25@gmail.com]
> Sent: Thursday, November 10, 2016 10:30 PM
> To: Daniel Sangorrin
> Cc: fuego@lists.linuxfoundation.org; Guillermo Adrian Ponce Castañeda
> Subject: Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
> 
> On Wed, Nov 9, 2016 at 10:09 PM, Daniel Sangorrin
> <daniel.sangorrin@toshiba.co.jp> wrote:
> > Hi Victor,
> >
> >> -----Original Message-----
> >> From: fuego-bounces@lists.linuxfoundation.org [mailto:fuego-bounces@lists.linuxfoundation.org] On Behalf Of Victor Rodriguez
> >> Sent: Sunday, November 06, 2016 2:15 AM
> >> To: fuego@lists.linuxfoundation.org; Guillermo Adrian Ponce Castañeda
> >> Subject: [Fuego] LPC Increase Test Coverage in a Linux-based OS
> >>
> >> Hi Fuego team.
> >>
> >> This week I presented a case of study for the problem of lack of test
> >> log output standardization in the majority of packages that are used
> >> to build the current Linux distributions. This was presented as a BOF
> >> ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555)  during
> >> the Linux Plumbers Conference.
> >>
> >> it was a productive  discussion that let us share the problem that we
> >> have in the current projects that we use every day to build a
> >> distribution ( either in embedded as in a cloud base distribution).
> >> The open source projects don't follow a standard output log format to
> >> print the passing and failing tests that they run during packaging
> >> time ( "make test" or "make check" )
> >
> > Sorry I couldn't download your slides because of proxy issues but
> > I think you are talking about the tests that are inside packages (e.g. .deb .rpm files).
> > For example, autopkgtest for debian. Is that correct?
> >
> 
> Yes
> 
> > I'm not an expert about them, but I believe these tests can also be executed
> > decoupled  from the build process in a flexible way (e.g.: locally, on qemu,
> > remotely through ssh, or on an lxc/schroot environment for example).
> >
> 
> Yes , with a little of extra work in the tool path , for example some
> of the test point to the binary they build
> isntead of the one in /usr/bin for example. But yes, with a little of
> extra work all these test can be decoupled
> 
> > Being able to leverage all these tests in Fuego for testing package-based
> > embedded systems would be great.
> >
> 
> Yes !!!
> 
> > For non-package-based embedded systems, I think those tests [2]
> > could be ported and made cross-compilable. In particular, Yocto/OpenEmbedded's ptest
> > framework decouples the compiling phase from the testing phase and
> > produces "a consistent output format".
> >
> > [1] https://packages.debian.org/sid/autopkgtest
> > [2] https://wiki.yoctoproject.org/wiki/Ptest
> >
> 
> I knew I was not wrong when I mention about this Ptest during the conference
> 
> Let me take a look and see how they work

Ptest normally uses the test suite that comes with the original source code. For
example, for the openssh recipe it uses some of the tests inside openssh's "regress" folder.

However, there are many recipes without their corresponding ptest definitions. I'm not sure
if that is just because nobody added them yet, or because there was no test suite
in the original source code.

There is another thing called testimage [1] that seems closer to the build tests you
are talking about, but I have never used it. Might be worth asking about it.

[1] https://wiki.yoctoproject.org/wiki/Image_tests

> >> The Clear Linux project is using a simple Perl script that helps them
> >> to count the number of passing and failing tests (which should be
> >> trivial if could have a single standard output among all the projects,
> >> but we don’t):
> >
> > I think that counting is good but we also need to know specifically which test/subtest
> > in particular failed and what the error log was like.
> >
> 
> Great , how do you push this to jenkins ?

At the moment, I just put a link from the jenkins LTP webpage to the spreadsheet file.
When the LTP tests finish, you click on that link and get the updated spreadsheet.
In the future I want to split the parsing functionality from the spreadsheet (see below).

> What do you think about the TAP ?

I didn't know about TAP before. From my understanding, the part of TAP that would be
useful for us is the "specification of the test output format" and the available
"consumers/parsers" (including one for jenkins [2] and a python library [3]).

Although subtests (grouping/test suites) seem not to be officially in the TAP 13
specification (according to [2]), the format looks quite flexible. 

1..2
ok 1
not ok 2 - br.eti.kinoshita.selenium.TestListVeterinarians#testGoogle
  ---
  extensions:
      Files:
          my_message.txt:
            File-Title: my_message.txt
            File-Description: Sample message
            File-Size: 31
            File-Name: message.txt
            File-Content: TuNvIGNvbnRhdmFtIGNvbSBtaW5oYSBhc3T6Y2lhIQ==
            File-Type: image/png
  ...

Comparing it with the output of
ctest (the test framework provided by CMake), I only miss two things:
  - The name of the test/subtest (TAP uses numbers, I don't like that)
  - Information about timing (ctest tells you how long it took for the test to finish)

Still, I think that could be worked out easily.

[2] https://wiki.jenkins-ci.org/display/JENKINS/TAP+Plugin
[3] https://pypi.python.org/pypi/tap.py
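
For example, something like this would cover both of the points above (the duration key
is not part of the TAP specification, it would just be our own convention inside the
YAML diagnostics block):

1..2
ok 1 - math01
not ok 2 - mmap02
  ---
  duration_ms: 153
  message: "mmap02 failed: errno=12"
  ...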

> If you could share a csv example, that will be great

I'm using a normal spreadsheet, not csv (although it's just a table like csv). The advantages over
csv are that you don't have to care about 'commas' in the error logs; that you can separate
test cases using sheets; and that you can apply colors for easier visualization when you have
lots of tests passing and only a few of them failing. You can see the example attached or 
on the slides at [4]. 

Another advantage is that you can later analyze it (e.g. calculate the 5 number summary, 
create some figures, etc..), write comments about why it's failing (e.g. only because that 
functionality is not present), and hand it as a report to your customer.
# This is an important point that some people miss. The test results are not just for the
# developers. They are used for compliance, certification, or customer deliverables as well.

However, I admit that my implementation is mixing a "parser" (extract information
from the LTP logs) and a "visualizer" (the spreadsheet). Once we decide on the "standard
output format" we can just use it as a visualizer.

I think we should go for an architecture that looks a bit like the cloud log collectors 
fluentd [5] or logstash.

LTP log format   -----adapter---+
PTS log format   -----adapter---+---> TAP/Ctest format --> Visualizers (jenkins, spreadsheet, gnuplot..)
NNN log format   ---adapter---+

Actually, I think we could even use them by writing input adapter plugins and a TAP output plugin.
Probably there is some value in doing it like that, because there are many powerful tools around 
for visualization (kibana) and searching (elasticsearch). 

[4] http://elinux.org/images/7/77/Fuego-jamboree-oct-2016.pdf
[5] https://camo.githubusercontent.com/c4abfe337c0b54b36f81bce78481f8965acbc7a9/687474703a2f2f646f63732e666c75656e74642e6f72672f696d616765732f666c75656e74642d6172636869746563747572652e706e67
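
As a very rough sketch of one of those adapters (assuming LTP result lines that look
like "testname   1  TPASS  :  message"; this is just an illustration, not actual Fuego code):

#!/usr/bin/env python
# Rough sketch: convert LTP-style result lines into TAP output
import re
import sys

results = []
for line in open(sys.argv[1]):
    m = re.match(r'^(\S+)\s+\d+\s+(TPASS|TFAIL)', line)
    if m:
        results.append((m.group(1), m.group(2) == 'TPASS'))

print("1..%d" % len(results))
for i, (name, passed) in enumerate(results, 1):
    print("%s %d - %s" % ("ok" if passed else "not ok", i, name))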

Cheers

--
IoT Technology center
Toshiba Corp. Industrial ICT solutions, 
Daniel SANGORRIN

> >> https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> >>
> >> # perl count.pl <build.log>
> >>
> >> Examples of real packages build logs:
> >>
> >> https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x86_64/build.log
> >> https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x86_64/build.log
> >>
> >> So far that simple (and not well engineered) parser has found 26
> >> “standard” outputs ( and counting ) .  The script has the fail that it
> >> does not recognize the name of the tests in order to detect
> >> regressions. Maybe one test was passing in the previous release and in
> >> the new one is failing, and then the number of failing tests remains
> >> the same.
> >>
> >> To be honest, before presenting at LPC I was very confident that this
> >> script ( or another version of it , much smarter ) could be beginning
> >> of the solution to the problem we have. However, during the discussion
> >> at LPC I understand that this might be a huge effort (not sure if
> >> bigger) in order to solve the nightmare we already have.
> >>
> >> Tim Bird participates at the BOF and recommends me to send a mail to
> >> the Fuego project team in order to look for more inputs and ideas bout
> >> this topic.
> >>
> >> I really believe in the importance of attack this problem before we
> >> have a bigger problem
> >>
> >> All feedback is more than welcome
> >>
> >> Regards
> >>
> >> Victor Rodriguez
> >>
> >> [presentation slides] :
> >> https://drive.google.com/open?id=0B7iKrGdVkDhIcVpncUdGTGhEQTQ
> >> [BOF notes] : https://drive.google.com/open?id=1lOPXQcrhL4AoOBSDnwUlJAKIXsReU8OqP82usZn-DCo
> >> _______________________________________________
> >> Fuego mailing list
> >> Fuego@lists.linuxfoundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/fuego
> >
> >

[-- Attachment #2: results-jamboree.xlsx --]
[-- Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, Size: 54705 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-11-14  1:44     ` Daniel Sangorrin
@ 2016-12-14  1:53       ` Victor Rodriguez
  2016-12-14  6:13         ` Daniel Sangorrin
  0 siblings, 1 reply; 13+ messages in thread
From: Victor Rodriguez @ 2016-12-14  1:53 UTC (permalink / raw)
  To: Daniel Sangorrin; +Cc: fuego

On Sun, Nov 13, 2016 at 7:44 PM, Daniel Sangorrin
<daniel.sangorrin@toshiba.co.jp> wrote:
>> -----Original Message-----
>> From: Victor Rodriguez [mailto:vm.rod25@gmail.com]
>> Sent: Thursday, November 10, 2016 10:30 PM
>> To: Daniel Sangorrin
>> Cc: fuego@lists.linuxfoundation.org; Guillermo Adrian Ponce Castañeda
>> Subject: Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
>>
>> On Wed, Nov 9, 2016 at 10:09 PM, Daniel Sangorrin
>> <daniel.sangorrin@toshiba.co.jp> wrote:
>> > Hi Victor,
>> >
>> >> -----Original Message-----
>> >> From: fuego-bounces@lists.linuxfoundation.org [mailto:fuego-bounces@lists.linuxfoundation.org] On Behalf Of Victor Rodriguez
>> >> Sent: Sunday, November 06, 2016 2:15 AM
>> >> To: fuego@lists.linuxfoundation.org; Guillermo Adrian Ponce Castañeda
>> >> Subject: [Fuego] LPC Increase Test Coverage in a Linux-based OS
>> >>
>> >> Hi Fuego team.
>> >>
>> >> This week I presented a case of study for the problem of lack of test
>> >> log output standardization in the majority of packages that are used
>> >> to build the current Linux distributions. This was presented as a BOF
>> >> ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555)  during
>> >> the Linux Plumbers Conference.
>> >>
>> >> it was a productive  discussion that let us share the problem that we
>> >> have in the current projects that we use every day to build a
>> >> distribution ( either in embedded as in a cloud base distribution).
>> >> The open source projects don't follow a standard output log format to
>> >> print the passing and failing tests that they run during packaging
>> >> time ( "make test" or "make check" )
>> >
>> > Sorry I couldn't download your slides because of proxy issues but
>> > I think you are talking about the tests that are inside packages (e.g. .deb .rpm files).
>> > For example, autopkgtest for debian. Is that correct?
>> >
>>
>> Yes
>>
>> > I'm not an expert about them, but I believe these tests can also be executed
>> > decoupled  from the build process in a flexible way (e.g.: locally, on qemu,
>> > remotely through ssh, or on an lxc/schroot environment for example).
>> >
>>
>> Yes , with a little of extra work in the tool path , for example some
>> of the test point to the binary they build
>> isntead of the one in /usr/bin for example. But yes, with a little of
>> extra work all these test can be decoupled
>>
>> > Being able to leverage all these tests in Fuego for testing package-based
>> > embedded systems would be great.
>> >
>>
>> Yes !!!
>>
>> > For non-package-based embedded systems, I think those tests [2]
>> > could be ported and made cross-compilable. In particular, Yocto/OpenEmbedded's ptest
>> > framework decouples the compiling phase from the testing phase and
>> > produces "a consistent output format".
>> >
>> > [1] https://packages.debian.org/sid/autopkgtest
>> > [2] https://wiki.yoctoproject.org/wiki/Ptest
>> >
>>
>> I knew I was not wrong when I mention about this Ptest during the conference
>>
>> Let me take a look and see how they work
>

Sorry for the long, long delay.

> Ptest normally uses the test suite that comes with the original source code. For
> example, for the openssh recipe it uses some of the tests inside openssh's "regress" folder.
>
> However, there are many recipes without their corresponding ptest definitions. I'm not sure
> if that is just because nobody added them yet, or because there was no test suite
> in the original source code.
>
> There is another thing called testimage [1] that seems closer to the build tests you
> are talking about, but I have never used it. Might be worth asking about it.
>
> [1] https://wiki.yoctoproject.org/wiki/Image_tests
>

I will track this with the Yocto team

>> >> The Clear Linux project is using a simple Perl script that helps them
>> >> to count the number of passing and failing tests (which should be
>> >> trivial if could have a single standard output among all the projects,
>> >> but we don’t):
>> >
>> > I think that counting is good but we also need to know specifically which test/subtest
>> > in particular failed and what the error log was like.
>> >
>>
>> Great , how do you push this to jenkins ?
>
> At the moment, I just put a link from the jenkins LTP webpage to the spreadsheet file.
> When the LTP tests finish, you click on that link and get the updated spreadsheet.
> In the future I want to split the parsing functionality from the spreadsheet (see below).
>
>> What do you think about the TAP ?
>
> I didn't know TAP before. From my understanding, the part of TAP that would be
> useful for us is the "specification of the test output format" and the available
> "consumers/parsers" (including one for jenkins [2] and a python library [3]).
>
> Although subtests (grouping/test suites) seem not to be officially in the TAP 13
> specification (according to [2]), the format looks quite flexible.
>
> 1..2
> ok 1
> not ok 2 - br.eti.kinoshita.selenium.TestListVeterinarians#testGoogle
>   ---
>   extensions:
>       Files:
>           my_message.txt:
>             File-Title: my_message.txt
>             File-Description: Sample message
>             File-Size: 31
>             File-Name: message.txt
>             File-Content: TuNvIGNvbnRhdmFtIGNvbSBtaW5oYSBhc3T6Y2lhIQ==
>             File-Type: image/png
>   ...
>
> Comparing with the output of
> ctest (test framework provided by cmake) I only miss two things:
>   - The name of the test/subtest (TAP uses numbers, I don't like that)

When it is exported to Jenkins it can track the name of the test, i.e.:

#!/usr/bin/env bats

@test "addition using bc" {
  result="$(echo 2+2 | bc)"
  [ "$result" -eq 4 ]
}

@test "addition using dc" {
  result="$(echo 2 2+p | dc)"
  [ "$result" -eq 4 ]
}

This is an example of what we use in our shell scripts:

https://github.com/sstephenson/bats


>   - Information about timing (ctest tells you how long it took for the test to finish)
>

Ctest is more accurate regarding the way it is printed, but I
don't see how ctest can be linked to Jenkins.

Actually, I think everyone should follow ctest as the standard output.

> Still, I think that could be worked out easily.
>
> [2] https://wiki.jenkins-ci.org/display/JENKINS/TAP+Plugin
> [3] https://pypi.python.org/pypi/tap.py
>
>> If you could share a csv example, that will be great
>
> I'm using a normal spreadsheet, not csv (although it's just a table like csv). The advantages over
> csv are that you don't have to care about 'commas' in the error logs; that you can separate
> test cases using sheets; and that you can apply colors for easier visualization when you have
> lots of tests passing and only a few of them failing. You can see the example attached or
> on the slides at [4].
>
Great implementation


> Another advantage is that you can later analyze it (e.g. calculate the 5 number summary,
> create some figures, etc..), write comments about why it's failing (e.g. only because that
> functionality is not present), and hand it as a report to your customer.
> # This is an important point that some people miss. The test results are not just for the
> # developers. They are used for compliance, certification, or customer deliverables as well.
>
> However, I admit that my implementation is mixing a "parser" (extract information
> from the LTP logs) and a "visualizer" (the spreadsheet). Once we decide on the standard
> output format" we can just use it as a visualizer.
>
> I think we should go for an architecture that looks a bit like the cloud log collectors
> fluentd [5] or logstash.
>
> LTP log format   -----adapter---+
> PTS log format   -----adapter---+---> TAP/Ctest format --> Visualizers (jenkins, spreadsheet, gnuplot..)
> NNN log format   ---adapter---+
>
> Actually, I think we could even use them by writing input adapter plugins and a TAP output plugin.
> Probably there is some value in doing it like that, because there are many powerful tools around
> for visualization (kibana) and searching (elasticsearch).
>

OK, this has had me thinking all day about how these great tools could be used
here. The questions that I have are:
how can I send the logs to these tools, and how do they parse the output?
Do we have to send the parser to the open source project?
Is Fuego willing to add this to its infrastructure?




> [4] http://elinux.org/images/7/77/Fuego-jamboree-oct-2016.pdf
> [5] https://camo.githubusercontent.com/c4abfe337c0b54b36f81bce78481f8965acbc7a9/687474703a2f2f646f63732e666c75656e74642e6f72672f696d616765732f666c75656e74642d6172636869746563747572652e706e67
>
> Cheers
>


After talking with Guillermo we came to the idea of moving our parsers
into the Fuego modules.

We are going to attack this problem with two solutions; happy to hear feedback.


1) Merge the parsers we have into the Fuego infrastructure
2) Provide an API to the new developers (and current maintainers of
the existing packages) to check if their logs are easy to track (meaning
that we can get the status and name of each test). If the API
can't read the log file, we suggest the developer adapt their test to a
standard (such as CMake or Autotools). A rough sketch of such a check is below.
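
Something like this, just as a sketch (the patterns and names here are hypothetical,
not an agreed list):

import re

# Hypothetical check: can we extract (test name, status) pairs from a build log?
KNOWN_PATTERNS = [
    re.compile(r'^(PASS|FAIL|SKIP): (?P<name>\S+)'),    # automake/ptest style
    re.compile(r'^(ok|not ok) \d+ - (?P<name>.+)$'),    # TAP style
]

def extract_results(log_path):
    results = []
    with open(log_path) as f:
        for line in f:
            for pattern in KNOWN_PATTERNS:
                m = pattern.match(line)
                if m:
                    results.append((m.group('name'), m.group(1)))
                    break
    # An empty list means the log is not trackable, so we would suggest a standard format
    return results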

To be honest it seems like a titanic amount of work to change all the packages
to a standard log output (especially since there are things from the
80's), but we can make the new ones fit the standards we have and
suggest that the maintainers adopt one.

Tim, I think that we should make a call for action to the Linux
community. Do you think a publication might be useful? Maybe LWN or
someplace else?

It's good to see we can help to fix this problem.

Willing to help :)

Regards

Victor Rodriguez



> --
> IoT Technology center
> Toshiba Corp. Industrial ICT solutions,
> Daniel SANGORRIN
>
>> >> https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
>> >>
>> >> # perl count.pl <build.log>
>> >>
>> >> Examples of real packages build logs:
>> >>
>> >> https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x86_64/build.log
>> >> https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x86_64/build.log
>> >>
>> >> So far that simple (and not well engineered) parser has found 26
>> >> “standard” outputs ( and counting ) .  The script has the fail that it
>> >> does not recognize the name of the tests in order to detect
>> >> regressions. Maybe one test was passing in the previous release and in
>> >> the new one is failing, and then the number of failing tests remains
>> >> the same.
>> >>
>> >> To be honest, before presenting at LPC I was very confident that this
>> >> script ( or another version of it , much smarter ) could be beginning
>> >> of the solution to the problem we have. However, during the discussion
>> >> at LPC I understand that this might be a huge effort (not sure if
>> >> bigger) in order to solve the nightmare we already have.
>> >>
>> >> Tim Bird participates at the BOF and recommends me to send a mail to
>> >> the Fuego project team in order to look for more inputs and ideas bout
>> >> this topic.
>> >>
>> >> I really believe in the importance of attack this problem before we
>> >> have a bigger problem
>> >>
>> >> All feedback is more than welcome
>> >>
>> >> Regards
>> >>
>> >> Victor Rodriguez
>> >>
>> >> [presentation slides] :
>> >> https://drive.google.com/open?id=0B7iKrGdVkDhIcVpncUdGTGhEQTQ
>> >> [BOF notes] : https://drive.google.com/open?id=1lOPXQcrhL4AoOBSDnwUlJAKIXsReU8OqP82usZn-DCo
>> >> _______________________________________________
>> >> Fuego mailing list
>> >> Fuego@lists.linuxfoundation.org
>> >> https://lists.linuxfoundation.org/mailman/listinfo/fuego
>> >
>> >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-12-14  1:53       ` Victor Rodriguez
@ 2016-12-14  6:13         ` Daniel Sangorrin
  2016-12-17 23:51           ` Victor Rodriguez
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Sangorrin @ 2016-12-14  6:13 UTC (permalink / raw)
  To: 'Victor Rodriguez'; +Cc: fuego

Hi Victor :)

> -----Original Message-----
> From: Victor Rodriguez [mailto:vm.rod25@gmail.com]
> Sent: Wednesday, December 14, 2016 10:54 AM
> > I think we should go for an architecture that looks a bit like the cloud log collectors
> > fluentd [5] or logstash.
> >
> > LTP log format   -----adapter---+
> > PTS log format   -----adapter---+---> TAP/Ctest format --> Visualizers (jenkins, spreadsheet, gnuplot..)
> > NNN log format   ---adapter---+
> >
> > Actually, I think we could even use them by writing input adapter plugins and a TAP output plugin.
> > Probably there is some value in doing it like that, because there are many powerful tools around
> > for visualization (kibana) and searching (elasticsearch).
> >
> 
> Ok this makes me think all day in how these grat tools can be used
> here.. the question that I have is
> how can I send the logs to these tools and how they parse the output ?

The execution flow goes like this:

1) Jenkins calls mytest.sh (with several environment variables predefined)
2) mytest.sh defines the contents of several functions (pre_check, build, deploy, run, processing, post_test) 
that are executed depending on the phase of the test.

The logs are parsed in the processing function. So to add a new parser ("adapter" in the graph) you need to call it from
the test processing function.

There are already 2 parsers in Fuego:
  - For benchmark tests (bench_processing): this parser allows creating plots (png) and JSON files with the test results and associated metadata. It requires
one smaller parser.py, which uses a python regex, for each test (a rough sketch of the regex idea is shown below). It also has a file to determine whether the
benchmark was successful or not.
  - For functional tests (log_compare): a parser that compares a test log with an expected log. This parser also generates JSON files ultimately.
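
For the benchmark case, a simplified illustration of the regex idea (not the actual
Fuego parser.py interface, and the log line here is made up) would be:

import re

# Simplified illustration: pull one benchmark metric out of a test log
def parse_metric(log_path):
    pattern = re.compile(r'Throughput:\s+([\d.]+)\s+MB/sec')  # hypothetical log line
    for line in open(log_path):
        m = pattern.search(line)
        if m:
            return float(m.group(1))
    return None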

By the way, AGL-JTA has done something very similar. They have implemented parsers that convert
the build.xml files generated by jenkins (test timing, result, board name, ...) as well as the test log 
into XML and then visualize it in HTML.

> do we have to send the parser to the opensource project ?
> Is Fuego willing to add thsi to his infraestructure ?

My opinion is that we should do this inside Fuego for now.

> After talking with Guillermo we came to the idea of move our parsers
> to the Fuego modules
> 
> We are going to attack this problem with two solutions, happy to hear feeadback
> 
> 
> 1) Merge the parsers we have into the Fuego infrastructure

Thanks. 
By the way, have you checked the "log_compare" parser? 

> 2) Provide an API to the new developers ( and current maintainers of
> the existing packages ) to check if their logs are easy to track ( it
> means that we can get the status and name of each test ) if the API
> can't read the log file we sugest the developer to fit their test to a
> standard ( as CMAKE or autotools )

One question: do you think that package maintainers would prefer
Fuego over OpenQA/DebianCI/Taskotron/Beaker/Open Build Service, etc.?

Fuego is oriented towards embedded systems: minimal requirements on 
the target, small, simple and easy to customize.

> To be honest it seems like a Titanic work to change all the packages
> to a standard log output ( specially since ther eare things from the
> 80's ) but we can make the new ones fit the standards we have and
> sugest the maintainers to fit into one.

Rather than changing the logs completely, what we want is to
extract the meaningful information (e.g. passed, failed, score, .., parameters used, host machine)
and output it in a common format (JSON) that can then be visualized or sent through a 
REST API to a centralized Fuego.
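
Just to illustrate (the field names are made up, this is not a format proposal yet),
one record could look like:

{
  "test": "mmap02",
  "result": "FAIL",
  "score": null,
  "params": "-i 10",
  "host": "beaglebone-1",
  "log": "mmap02: errno=12"
}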
 
Best regards
Daniel




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
  2016-12-14  6:13         ` Daniel Sangorrin
@ 2016-12-17 23:51           ` Victor Rodriguez
  0 siblings, 0 replies; 13+ messages in thread
From: Victor Rodriguez @ 2016-12-17 23:51 UTC (permalink / raw)
  To: Daniel Sangorrin; +Cc: fuego

[-- Attachment #1: Type: text/plain, Size: 4525 bytes --]

On Wed, Dec 14, 2016 at 12:13 AM, Daniel Sangorrin <
daniel.sangorrin@toshiba.co.jp> wrote:

> Hi Victor :)
>
> > -----Original Message-----
> > From: Victor Rodriguez [mailto:vm.rod25@gmail.com]
> > Sent: Wednesday, December 14, 2016 10:54 AM
> > > I think we should go for an architecture that looks a bit like the
> cloud log collectors
> > > fluentd [5] or logstash.
> > >
> > > LTP log format   -----adapter---+
> > > PTS log format   -----adapter---+---> TAP/Ctest format --> Visualizers
> (jenkins, spreadsheet, gnuplot..)
> > > NNN log format   ---adapter---+
> > >
> > > Actually, I think we could even use them by writing input adapter
> plugins and a TAP output plugin.
> > > Probably there is some value in doing it like that, because there are
> many powerful tools around
> > > for visualization (kibana) and searching (elasticsearch).
> > >
> >
> > Ok this makes me think all day in how these grat tools can be used
> > here.. the question that I have is
> > how can I send the logs to these tools and how they parse the output ?
>
> The execution flow goes like this:
>
> 1) Jenkins calls mytest.sh (with several environment variables predefined)
> 2) mytest.sh defines the contents of several functions (pre_check, build,
> deploy, run, processing, post_test)
> that are executed depending on the phase of the test.
>
> The logs are parsed in the processing function. So to add a new parser
> ("adapter" in the graph) you need to call it from
> the test processing function.
>
> There are already 2 parsers in Fuego:
>   - For benchmark tests (bench_processing):  This parser allows creating
> plots (png) and JSON files with the test results and associated metadata.
> It requires
> one smaller parser.py, which uses a python regex, for each test. It also
> has a file to determine whether the benchmark was
> successful or not.
>   - For functional tests (log_compare): parser that compares a test log
> with an expected log. This parser also generates JSON files ultimately.
>
> By the way, AGL-JTA has done something very similar. They have implemented
> parsers that convert
> the build.xml files generated by jenkins (test timing, result, board name,
> ...) as well as the test log
> into XML and then visualize it in HTML.
>
> > do we have to send the parser to the opensource project ?
> > Is Fuego willing to add thsi to his infraestructure ?
>
> My opinion is that we should do this inside Fuego for now.
>
> > After talking with Guillermo we came to the idea of move our parsers
> > to the Fuego modules
> >
> > We are going to attack this problem with two solutions, happy to hear
> feeadback
> >
> >
> > 1) Merge the parsers we have into the Fuego infrastructure
>
> Thanks.
> By the way, have you checked the "log_compare" parser?
>
> Thanks, doing this now.


> > 2) Provide an API to the new developers ( and current maintainers of
> > the existing packages ) to check if their logs are easy to track ( it
> > means that we can get the status and name of each test ) if the API
> > can't read the log file we sugest the developer to fit their test to a
> > standard ( as CMAKE or autotools )
>
> One question. Do you think that package maintainers would prefer
> Fuego over OpenQA/DebianCI/Taskotron/Beaker/ Open Build Service etc?
>
> I don't know, but what we are going to do is a good start.

Tim also has the idea of writing about this problem in a public place like
a wiki or elinux.org.

Another idea is to hold a testing summit day at one of the Linux Foundation
events.


> Fuego is oriented towards embedded systems: minimal requirements on
> the target, small, simple and easy to customize.
>
>
We will see how the community reacts to the idea.


> > To be honest it seems like a Titanic work to change all the packages
> > to a standard log output ( specially since ther eare things from the
> > 80's ) but we can make the new ones fit the standards we have and
> > sugest the maintainers to fit into one.
>
> Rather than changing the logs completely, what we want is to
> extract the meaningful information (e.g. passed, failed, score, ..,
> parameters used, host machine)
> and output it in a common format (JSON) that can then be visualized or
> sent through a
> REST API to a centralized Fuego.
>
>
Agreed, but some of the people at my conference suggested changing the logs as the
best way to solve the issue; yes in terms of computer engineering, not in
terms of effort :(

Thanks, Daniel, we will work on this and send the patches to Fuego.

Regards

Victor

> Best regards
> Daniel
>
>
>
>

[-- Attachment #2: Type: text/html, Size: 6257 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-12-17 23:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-05 17:15 [Fuego] LPC Increase Test Coverage in a Linux-based OS Victor Rodriguez
2016-11-08  0:26 ` Bird, Timothy
2016-11-08 19:38   ` Guillermo Adrian Ponce Castañeda
2016-11-09  0:21     ` Bird, Timothy
2016-11-09 18:04       ` Guillermo Adrian Ponce Castañeda
2016-11-09 20:07         ` Victor Rodriguez
2016-11-10  4:07       ` Daniel Sangorrin
2016-11-10  3:09 ` Daniel Sangorrin
2016-11-10 13:30   ` Victor Rodriguez
2016-11-14  1:44     ` Daniel Sangorrin
2016-12-14  1:53       ` Victor Rodriguez
2016-12-14  6:13         ` Daniel Sangorrin
2016-12-17 23:51           ` Victor Rodriguez
