Hi RP,

I looked into ptest and regression. The existing "resultstool regression" can be used to perform regression on ptest, since testresults.json captures the ptest status. I ran the regression script on the two ptest testresults.json files below. The regression report for ptest is attached.

https://autobuilder.yocto.io/pub/releases/yocto-2.7_M2.rc1/testresults/qemux86-64-ptest/testresults.json
https://autobuilder.yocto.io/pub/releases/yocto-2.7_M1.rc1/testresults/qemux86-64-ptest/testresults.json

The only challenge now is that the ptest result set is relatively large, so computing the regression takes some time. There is also a "ptestresult.rawlogs" testcase that contains no status, only the large raw log.

I did an experiment in which I ran the regression on testresults.json with and without the ptest rawlog. The regression took significantly longer when the file contained the rawlog (16 min 20 sec versus 17 sec, figures below). I will try to improve the regression time by discarding the rawlog at runtime before computing the regression.

testresults.json with rawlog
Regression start time: 20190131122805
Regression end time:   20190131124425
Time taken: 16 mins 20 sec

testresults.json without rawlog
Regression start time: 20190131124512
Regression end time:   20190131124529
Time taken: 17 sec
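
A minimal sketch of discarding the rawlog before regression. It assumes testresults.json maps each result id to a dictionary with "configuration" and "result" entries, and that the raw log sits under the "ptestresult.rawlogs" testcase key. The names strip_rawlogs and load_without_rawlogs are hypothetical and not part of resultstool.

```python
import json

RAWLOG_TESTCASE = "ptestresult.rawlogs"  # testcase that holds only the raw log


def strip_rawlogs(results):
    """Return a copy of the loaded results without the rawlog testcase.

    Assumes the layout {result_id: {"configuration": {...}, "result": {...}}}.
    """
    stripped = {}
    for result_id, entry in results.items():
        testcases = entry.get("result", {})
        kept = {name: data for name, data in testcases.items()
                if name != RAWLOG_TESTCASE}
        # Shallow-copy the entry so the caller's data is left untouched.
        stripped[result_id] = dict(entry, result=kept)
    return stripped


def load_without_rawlogs(path):
    """Load a testresults.json file and drop the rawlog before regression."""
    with open(path) as f:
        return strip_rawlogs(json.load(f))
```

Filtering right after loading keeps the large log string out of every later comparison step, which is where the experiment above suggests the time goes.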

Thanks,
Ee Peng 

-----Original Message-----
From: Yeoh, Ee Peng 
Sent: Tuesday, January 29, 2019 5:15 PM
To: Richard Purdie <richard.purdie@linuxfoundation.org>; openembedded-core@lists.openembedded.org
Cc: Eggleton, Paul <paul.eggleton@intel.com>; Burton, Ross <ross.burton@intel.com>
Subject: RE: [OE-core] [PATCH 1/2 v5] resultstool: enable merge, store, report and regression analysis

Hi RP,

I have submitted the v6 patches with the changes below.
v6:
  Add regression for directory and git repository
  Enable regression pairing base set to multiple target sets
  Revise selftest testing for regression

http://lists.openembedded.org/pipermail/openembedded-core/2019-January/278486.html
http://lists.openembedded.org/pipermail/openembedded-core/2019-January/278487.html
http://lists.openembedded.org/pipermail/openembedded-core/2019-January/278488.html

Directory and git regression support arbitrary directory layouts. The regression selects pairs of result instances for comparison based on the unique configuration data inside each result instance.
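
A rough sketch of how such pairing could work, assuming each result instance carries a "configuration" dictionary. The field list in MATCH_FIELDS and the function names are hypothetical choices for illustration, not necessarily what the patches use.

```python
from collections import defaultdict

# Hypothetical configuration fields that identify comparable result instances.
MATCH_FIELDS = ("TEST_TYPE", "MACHINE", "IMAGE_BASENAME")


def config_key(configuration, fields=MATCH_FIELDS):
    """Build a hashable key from the selected configuration fields."""
    return tuple(configuration.get(field) for field in fields)


def pair_results(base, target, fields=MATCH_FIELDS):
    """Pair each base result with every target result of matching configuration.

    Both arguments map result ids to {"configuration": {...}, "result": {...}}.
    One base result may pair with several target results.
    """
    targets_by_key = defaultdict(list)
    for target_id, entry in target.items():
        key = config_key(entry["configuration"], fields)
        targets_by_key[key].append(target_id)

    pairs = []
    for base_id, entry in base.items():
        key = config_key(entry["configuration"], fields)
        for target_id in targets_by_key.get(key, []):
            pairs.append((base_id, target_id))
    return pairs
```

Because the key comes only from the data inside each result instance, the directory or branch layout the files were found in does not matter.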

I have some questions regarding the items below:
>I think there is a third thing we also need to look at:
>
>It would be great if there was some way of allowing some kind of
>templating when storing into the git repository. This way a general
>local log file from tmp/log/oeqa could be stored into the git repo,
>being split according to the layout of the repo if needed.
>
>Our default layout could match that from the autobuilder but the repository could define a layout?

Before developing a custom layout template for storing into the git repo, I would like to understand more so that the output fulfills the requirement. May I know the intention behind storing results into the git repo with a custom layout template? Can you share the use case?

For ptest and the performance tests, let me look into them. Thank you for sharing the logparser.
http://git.yoctoproject.org/cgit.cgi/poky-contrib/tree/meta/lib/oeqa/utils/logparser.py#n101

Best regards,
Yeoh Ee Peng 

-----Original Message-----
From: Richard Purdie [mailto:richard.purdie@linuxfoundation.org]
Sent: Tuesday, January 29, 2019 12:29 AM
To: Yeoh, Ee Peng <ee.peng.yeoh@intel.com>; openembedded-core@lists.openembedded.org
Cc: Eggleton, Paul <paul.eggleton@intel.com>; Burton, Ross <ross.burton@intel.com>
Subject: Re: [OE-core] [PATCH 1/2 v5] resultstool: enable merge, store, report and regression analysis

Hi Ee Peng,

On Mon, 2019-01-28 at 02:12 +0000, Yeoh, Ee Peng wrote:
> Thanks for providing the precious inputs. 
> Agreed with you that the current patch that enable files based 
> regression was not enough for other use cases.
> 
> From the information that you had shared, there are 2 more regression 
> use cases that I have in mind:
> Use case#1: directory based regression
> Given that Autobuilder stored result files inside /testresults
> directories, user shall be able to perform the directory based
> regression using output from Autobuilder directly, such as below
> Autobuilder directories.
> https://autobuilder.yocto.io/pub/releases/yocto-2.6.1.rc1/testresults/qemux86/testresults.json
> https://autobuilder.yocto.io/pub/releases/yocto-2.7_M1.rc1/testresults/qemux86/testresults.json
> https://autobuilder.yocto.io/pub/releases/yocto-2.7_M2.rc1/testresults/qemux86/testresults.json
> 
> Assumed that there are 2 directories storing list of result files.
> User shall provide these 2 directories for regression, regression 
> scripts will first parse through all the available files inside each 
> directories, then perform regression based on available configuration 
> data to determine the regression pair (eg. select result_set_1 from
> directory#1 and result_set_x from directory#2 if they both have 
> matching configurations).

Yes, this would be very useful. I suspect you don't need to have matching layouts, just import from all the json files in a given directory for the comparison.

This way we can support arbitrary layouts.
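
A minimal sketch of that idea, assuming every .json file under the directory holds a dictionary of result instances keyed by result id. The function name collect_results is hypothetical.

```python
import json
import os


def collect_results(directory):
    """Load every .json file under directory, whatever the layout.

    Returns {(file_path, result_id): entry}, so that identical result ids
    coming from different files do not overwrite each other.
    """
    merged = {}
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if not name.endswith(".json"):
                continue
            path = os.path.join(root, name)
            with open(path) as f:
                data = json.load(f)
            for result_id, entry in data.items():
                merged[(path, result_id)] = entry
    return merged
```

Since the walk ignores how deep or how the files are arranged, two directories with different layouts can still be compared once their results are collected.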

> Use case#2: git branch based regression
> Given that Autobuilder stored result files inside /testresults
> directories, user shall first store these directories and the result
> files in each git branch accordingly using the existing store plugin.
> After that, user can used the git branch based regression to analysis 
> the information.
> Store in yocto-2.6.1.rc1, yocto-2.7_M1.rc1, yocto-2.7_M2.rc1 git 
> branch accordingly 
> https://autobuilder.yocto.io/pub/releases/yocto-2.6.1.rc1/testresults/
> https://autobuilder.yocto.io/pub/releases/yocto-2.7_M1.rc1/testresults/
> https://autobuilder.yocto.io/pub/releases/yocto-2.7_M2.rc1/testresults/
>  
> Assumed that result files are stored inside git repository with 
> specific git branch storing result files for single commit. User shall 
> provide the 2 specific git branches for regression, regression scripts 
> will first parse through all the available files inside each git 
> branch, then perform regression based on available configuration data 
> to determine the regression pair (eg. select result_set_1 from
> git_branch_1 and result_set_x from git_branch_2 if they both have 
> matching configurations).
> 
> The current codebase can be easily extended to enable both use cases 
> above. Please let me know if both use cases above are important and 
> please give us your inputs.

Yes, I think these are two key use cases we need to support.

I think there is a third thing we also need to look at:

It would be great if there was some way of allowing some kind of templating when storing into the git repository. This way a general local log file from tmp/log/oeqa could be stored into the git repo, being split according to the layout of the repo if needed.

Our default layout could match that from the autobuilder but the repository could define a layout?

As mentioned, we also need to think about ptest. Currently the runtime ptest code has some parsing and log generation code. I think pieces like:

http://git.yoctoproject.org/cgit.cgi/poky-contrib/tree/meta/lib/oeqa/utils/logparser.py#n101

and the log_as_files() code should be moved to the reporting tool and that the test code should just generate the json results file which can later be parsed/processed as needed. I did post on the list earlier about some of the other challenges we have with the ptest data.

buildperf is the other piece of this which we'll need to think about.

Cheers,

Richard