* A common place for CI results?
From: Veronika Kabatova @ 2019-04-05 14:41 UTC
To: automated-testing, info

Hi,

as we know from this list, there's plenty of CI systems doing some testing
on the upstream kernels (and maybe some others we don't know about).

It would be great if there was a single common place where all the CI
systems can put their results. This would make it much easier for the
kernel maintainers and developers to see testing status, since they only
need to check one place instead of keeping a list of sites/mailing lists
where each CI posts its contributions.

A few weeks ago we've been talking with some people about kernelci.org
being in a good place to act as the central upstream kernel CI piece that
most maintainers already know about. So I'm wondering if it would be
possible for kernelci to also act as an aggregator of all results? There's
already an API for publishing a report [0], so it shouldn't be too hard to
adjust it to handle and show more information. I also found the beta
version for test results [1], so most of the needed functionality seems to
be already there. Since there will be multiple CI systems, the source and
contact point for the contributor (so maintainers know whom to ask about
results if needed) would likely be the only missing essential data point.

The common place for results would also make it easier for new CI systems
to get involved with upstream. There are likely other companies out there
running some tests on the kernel internally that don't publish the results
anywhere. Only adding some API calls into their code (with the data they
are allowed to publish) would make it very simple for them to start
contributing. If we want to make them interested, the starting point needs
to be trivial.
Different companies have different setups and policies and might not be
able to fulfill arbitrary requirements, so they opt not to get involved at
all, which is a shame because their results can be useful. After the
initial "onboarding" step they might be willing to contribute more and
more too.

Please let me know if the idea makes sense or if something similar is
already planned. I'd be happy to contribute to the effort because I
believe it would make everyone's life easier and we'd all benefit from it
(and maybe someone else from my team would be willing to help out too if
needed).

Thanks,
Veronika Kabatova
CKI Project

[0] https://api.kernelci.org/examples.html#sending-a-boot-report
[1] https://kernelci.org/test/

^ permalink raw reply [flat|nested] 15+ messages in thread
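To make the "trivial starting point" concrete: publishing a result through an API like [0] could look roughly like the sketch below. The payload field names are assumptions for illustration, not the confirmed kernelci schema, and the 'contributor' field is the proposed addition from this thread, not an existing API field.

```python
import json

def build_boot_report(lab, job, kernel, board, status, contributor):
    """Assemble a minimal result payload for a report API like [0].

    All field names here are assumed for illustration; 'contributor' is
    the contact-point field proposed in this thread, not part of the
    current API.
    """
    return {
        "lab_name": lab,
        "job": job,
        "kernel": kernel,
        "board": board,
        "boot_result": status,
        "contributor": contributor,
    }

payload = build_boot_report(
    lab="lab-example", job="mainline", kernel="v5.1-rc4",
    board="qemu-x86_64", status="PASS",
    contributor="ci-team@example.com",
)
print(json.dumps(payload, indent=2))

# Sending would then be a single authenticated POST, e.g. with requests:
#   requests.post("https://api.kernelci.org/boot",
#                 headers={"Authorization": token}, json=payload)
```

A CI system that already collects results internally would only need a small shim like this at the end of its pipeline, which is what makes onboarding cheap.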
* Re: A common place for CI results?
From: Tim.Bird @ 2019-04-08 22:16 UTC
To: vkabatov, automated-testing, info

> -----Original Message-----
> From: Veronika Kabatova
> ...
> as we know from this list, there's plenty CI systems doing some testing
> on the upstream kernels (and maybe some others we don't know about).
>
> It would be great if there was a single common place where all the CI
> systems can put their results. This would make it much easier for the
> kernel maintainers and developers to see testing status since they only
> need to check one place instead of having a list of sites/mailing lists
> where each CI posts their contributions.

We've had discussions about this, and decided there are a few issues.
Some of these you identify below.

> A few weeks ago, with some people we've been talking about kernelci.org
> being in a good place to act as the central upstream kernel CI piece
> that most maintainers already know about. So I'm wondering if it would
> be possible for kernelci to also act as an aggregator of all results?

Right now, the kernelCI central server is (to my knowledge) maintained
by Kevin Hilman, on his own dime. That may be changing with the Linux
Foundation possibly creating a testing project to provide support for
this. But in any event, at the scale we're talking about (with lots of
test frameworks and potentially thousands of boards and hundreds of
thousands of test run results arriving daily), hosting this is costly.
So there's a question of who pays for this.

> There's already an API
> for publishing a report [0] so it shouldn't be too hard to adjust it to
> handle and show more information.
> I also found the beta version for test
> results [1] so actually, most of the needed functionality seems to be
> already there. Since there will be multiple CI systems, the source and
> contact point for the contributor (so maintainers know whom to ask
> about results if needed) would likely be the only missing essential
> data point.

One of the things on our action item list is to have discussions about a
common results format. See https://elinux.org/ATS_2018_Minutes (towards
the end, right before "Decisions from the summit"). I think this
addresses the issue of what information is needed for a universal
results format. I think we should definitely add a 'contributor' field
to a common definition, for the reasons you mention.

Another issue is making it so that different test frameworks emit the
same testcase names when they run the same test. For example, in Fuego
there is a testcase called Functional.LTP.syscalls.abort07. It's not
required, but it seems like it would be valuable if CKI, Linaro, Fuego
and others decided on a canonical name for this particular testcase, so
it was the same in each run result.

I took an action item from our meetings at Linaro last week to look at
this issue (testcase name harmonization).

> The common place for results would also make it easier for new CI
> systems to get involved with upstream. There are likely other companies
> out there running some tests on kernel internally but don't publish the
> results anywhere. Only adding some API calls into their code (with the
> data they are allowed to publish) would make it very simple for them to
> start contributing. If we want to make them interested, the starting
> point needs to be trivial. Different companies have different setups
> and policies and they might not be able to fulfill arbitrary
> requirements so they opt to not get involved at all, which is a shame
> because their results can be useful.
> After the initial "onboarding"
> step they might be willing to contribute more and more too.

Indeed. Probably most groups don't publish their test results, even when
they are using open source tests. There are lots of reasons for this
(including there not being a place to publish them, as you mention). It
would be good to also address the other reasons that testing entities
don't publish, and try to remove as many obstacles to (or encourage as
much as possible) publishing of test results.

> Please let me know if the idea makes sense or if something similar is
> already in plans. I'd be happy to contribute to the effort because I
> believe it would make everyone's life easier and we'd all benefit from
> it (and maybe someone else from my team would be willing to help out
> too if needed).

I think it makes a lot of sense, and we'd like to take steps to make
that possible.

The aspect of this that I plan to work on myself is testcase name
harmonization. That's one aspect of standardizing a common or universal
results format. But I've already got a lot of things I'm working on. If
someone else wants to volunteer to work on this, or head up a workgroup
for it, let me know.

Regards,
-- Tim
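The testcase-name harmonization discussed above could start as a small alias table that maps each framework's name to one canonical name. Only the Fuego name below comes from the thread; the CKI- and LKFT-style aliases are invented for illustration.

```python
# Hypothetical alias table: canonical testcase name -> names that
# individual frameworks report for the same test. The Fuego name is
# from the message above; the other aliases are assumed examples.
CANONICAL = {
    "Functional.LTP.syscalls.abort07": {
        "ltp.syscalls.abort07",            # assumed CKI-style name
        "ltp-syscalls-tests/abort07",      # assumed LKFT-style name
    },
}

def canonical_name(reported):
    """Map a framework-specific testcase name to its canonical form."""
    for canon, aliases in CANONICAL.items():
        if reported == canon or reported in aliases:
            return canon
    return reported  # unknown names pass through unchanged

print(canonical_name("ltp-syscalls-tests/abort07"))
# -> Functional.LTP.syscalls.abort07
```

Dropping such a translation in front of an aggregator would let each framework keep its internal names while results still line up per-testcase.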
* Re: A common place for CI results?
From: Guenter Roeck @ 2019-04-09 13:41 UTC
To: kernelci, Bird, Timothy; +Cc: vkabatov, automated-testing, info

On Mon, Apr 8, 2019 at 10:48 PM <Tim.Bird@sony.com> wrote:

[...]

> Right now, the kernelCI central server is (to my knowledge) maintained
> by Kevin Hilman, on his own dime. That may be changing with the Linux
> Foundation possibly creating a testing project to provide support for
> this. But in any event, at the scale we're talking about (with lots of
> test frameworks and potentially thousands of boards and hundreds of
> thousands of test run results arriving daily), hosting this is costly.
> So there's a question of who pays for this.

In theory that would be the Linux Foundation as part of the KernelCI
project.
Unfortunately, while companies and people do show interest in KernelCI,
there seems to be little interest in actually joining the project. My
understanding is that the Linux Foundation will only make it official
if/when there are five members. Currently there are three, Google being
one of them. Any company interested in the project may want to consider
joining it. When doing so, you'll have influence in setting its
direction, and that may include hosting test results other than those
from KernelCI itself.

Guenter

[...]
* Re: [Automated-testing] A common place for CI results?
From: Mark Brown @ 2019-04-10 9:28 UTC
To: Guenter Roeck; +Cc: kernelci, Bird, Timothy, info, automated-testing

On Tue, Apr 09, 2019 at 06:41:24AM -0700, Guenter Roeck wrote:
> On Mon, Apr 8, 2019 at 10:48 PM <Tim.Bird@sony.com> wrote:

> > Right now, the kernelCI central server is (to my knowledge) maintained
> > by Kevin Hilman, on his own dime. That may be changing with the Linux

Linaro is paying for the core servers (the Hetzner boxes with the core
servers are a combination of Linaro and Collabora; IIRC the boxes
Collabora is paying for are all builders). As far as I'm aware, no
individual is paying out of pocket for anything except for labs at the
minute.
* Re: A common place for CI results?
From: Veronika Kabatova @ 2019-04-10 17:47 UTC
To: Guenter Roeck, Timothy Bird; +Cc: kernelci, automated-testing, info

----- Original Message -----
> From: "Guenter Roeck" <groeck@google.com>
> Sent: Tuesday, April 9, 2019 3:41:24 PM
> Subject: Re: A common place for CI results?
>
> On Mon, Apr 8, 2019 at 10:48 PM <Tim.Bird@sony.com> wrote:

[...]

> In theory that would be the Linux Foundation as part of the KernelCI
> project. Unfortunately, while companies and people do show interest in
> KernelCI, there seems to be little interest in actually joining the
> project. My understanding is that the Linux Foundation will only make
> it official if/when there are five members. Currently there are three,
> Google being one of them. Any company interested in the project may
> possibly want to consider joining it. When doing so, you'll have
> influence in setting its direction, and that may include hosting test
> results other than those from KernelCI itself.

Is there any page with details on how to join, and what the requirements
on us are, that I can pass along to management to get an official
statement? We are definitely interested in more involvement with
upstream, both the kernel and different CI systems, as we have a common
goal. If we can help each other out and build a central CI system for
upstream kernels that people can rely on, we want to be a part of this
effort. We have started our own interaction with upstream (see my intro
email on this list), but as all CI systems face the same challenges it
only makes sense to join forces.

> > Some other issues, are making it so that different test frameworks
> > emit the same testcase names when they run the same test. For
> > example, in Fuego there is a testcase called
> > Functional.LTP.syscalls.abort07. It's not required, but it seems like
> > it would be valuable if CKI, Linaro, Fuego and others decided on a
> > canonical name for this particular testcase, so they were the same in
> > each run result.

Good point. CKI only reports full testsuite names as results (so it
would be "LTP lite"), and then we have a short log with subtests and
results, and a longer log with details. But for CI systems that report
each subtest separately, having a common name (with maybe "LTP" as an
aggregated result) would definitely be beneficial and easier to parse by
both humans and automation.

> > Indeed. Probably most groups don't publish their test results, even
> > when they are using open source tests. There are lots of reasons for
> > this (including there not being a place to publish them, as you
> > mention). It would be good to also address the other reasons that
> > testing entities don't publish, and try to remove as many obstacles
> > (or to try to encourage as much as possible) publishing of test
> > results.

Absolutely agreed.

> > The aspect of this that I plan to work on myself is testcase name
> > harmonization. That's one aspect of standardizing a common or
> > universal results format. But I've already got a lot of things I'm
> > working on. If someone else wants to volunteer to work on this, or
> > head up a workgroup to work on this, let me know.

Totally understand your situation, too much work and too little time :)
I can try to put an idea together and post it here for feedback to help
out. Do you have any data points or previous discussions to link? It
would be great to have something to build upon, instead of posting a
brain dump that won't work for already-known issues (that aren't known
by me).

Veronika
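The two reporting granularities Veronika describes (CKI publishing one "LTP lite" verdict vs. frameworks publishing every subtest) can coexist if the aggregated result is derived from the subtests. A minimal sketch, assuming PASS/FAIL/SKIP statuses:

```python
# Collapse per-subtest results into the one suite-level verdict that a
# coarser-grained CI would publish. Status names are assumed.
def suite_verdict(subtests):
    """Reduce {testcase: 'PASS'|'FAIL'|'SKIP'} to a single suite result."""
    results = set(subtests.values())
    if "FAIL" in results:
        return "FAIL"                # any failing subtest fails the suite
    if results <= {"SKIP"}:
        return "SKIP"                # everything skipped (or no subtests)
    return "PASS"

subtests = {
    "Functional.LTP.syscalls.abort07": "PASS",
    "Functional.LTP.syscalls.accept01": "FAIL",
}
print(suite_verdict(subtests))  # -> FAIL
```

With a rule like this, a shared results store could accept subtest-level data where available and still display a comparable suite-level row for every CI.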
* Re: [Automated-testing] A common place for CI results?
From: Kevin Hilman @ 2019-04-10 21:13 UTC
To: Veronika Kabatova
Cc: Guenter Roeck, Timothy Bird, info, automated-testing, kernelci

On Wed, Apr 10, 2019 at 10:47 AM Veronika Kabatova <vkabatov@redhat.com> wrote:

[...]

> Is there any page with details on how to join and what are the requirements
> on us that I can pass along to management to get an official statement?

Attached is the LF slide deck with the project overview, membership
levels, costs, etc. I'd be happy to discuss more with you on a call
after you review the deck, but it would have to be next week as I'm OoO
for the rest of this week.

Thanks,

Kevin

[-- Attachment #2: kernelCI Project Pitch.pdf --]
* Re: [Automated-testing] A common place for CI results?
From: Veronika Kabatova @ 2019-04-11 16:02 UTC
To: Kevin Hilman
Cc: Guenter Roeck, Timothy Bird, info, automated-testing, kernelci

----- Original Message -----
> From: "Kevin Hilman" <khilman@baylibre.com>
> Sent: Wednesday, April 10, 2019 11:13:40 PM
> Subject: Re: [Automated-testing] A common place for CI results?
>
> Attached is the LF slide deck with the project overview, membership
> levels and costs etc.
>
> I'd be happy to discuss more with you on a call after you review the
> deck, but it would have to be next week as I'm OoO for the rest of
> this week.

Sounds good. Feel free to reach out off-list to set up the time and call
location. I'll prepare a list of questions to discuss (especially as I
have no idea how project memberships work, even though Red Hat is
already a member of the Linux Foundation). Afterwards I can pass along
all the information and try to get funding.

Thanks,
Veronika
* Re: A common place for CI results?
From: Tim.Bird @ 2019-05-14 23:01 UTC
To: vkabatov, automated-testing, info

> -----Original Message-----
> From: Veronika Kabatova

[...]

> Please let me know if the idea makes sense or if something similar is
> already in plans. I'd be happy to contribute to the effort because I
> believe it would make everyone's life easier and we'd all benefit from
> it (and maybe someone else from my team would be willing to help out
> too if needed).

I never responded to this, but this sounds like a really good idea to
me. I don't care much which backend we aggregate to, but it would be
good as a community to start using one service. It would help to find
issues with the API, or the results schema, if multiple people started
using it.

I know that people using Fuego are sending data to their own instances
of KernelCI. But I don't know what the issues are for sending this data
to a shared KernelCI service.

I would be interested in hooking up my lab to send Fuego results to
KernelCI. This would be a good exercise. I'm not sure what the next
steps would be, but maybe we could discuss this on the next automated
testing conference call.

-- Tim
* Re: A common place for CI results? 2019-05-14 23:01 ` Tim.Bird @ 2019-05-15 20:33 ` Dan Rue 2019-05-15 21:06 ` Tom Gall 2019-05-15 22:58 ` [Automated-testing] " Carlos Hernandez 0 siblings, 2 replies; 15+ messages in thread From: Dan Rue @ 2019-05-15 20:33 UTC (permalink / raw) To: kernelci, Tim.Bird; +Cc: vkabatov, automated-testing, info On Tue, May 14, 2019 at 11:01:35PM +0000, Tim.Bird@sony.com wrote: > > > > -----Original Message----- > > From: Veronika Kabatova > > > > Hi, > > > > as we know from this list, there's plenty CI systems doing some testing on > > the > > upstream kernels (and maybe some others we don't know about). > > > > It would be great if there was a single common place where all the CI systems > > can put their results. This would make it much easier for the kernel > > maintainers and developers to see testing status since they only need to > > check one place instead of having a list of sites/mailing lists where each CI > > posts their contributions. > > > > > > A few weeks ago, with some people we've been talking about kernelci.org > > being > > in a good place to act as the central upstream kernel CI piece that most > > maintainers already know about. So I'm wondering if it would be possible for > > kernelci to also act as an aggregator of all results? There's already an API > > for publishing a report [0] so it shouldn't be too hard to adjust it to > > handle and show more information. I also found the beta version for test > > results [1] so actually, most of the needed functionality seems to be already > > there. Since there will be multiple CI systems, the source and contact point > > for the contributor (so maintainers know whom to ask about results if > > needed) > > would likely be the only missing essential data point. > > > > > > The common place for results would also make it easier for new CI systems > > to > > get involved with upstream. 
There are likely other companies out there > > running > > some tests on kernel internally but don't publish the results anywhere. Only > > adding some API calls into their code (with the data they are allowed to > > publish) would make it very simple for them to start contributing. If we want > > to make them interested, the starting point needs to be trivial. Different > > companies have different setups and policies and they might not be able to > > fulfill arbitrary requirements so they opt to not get involved at all, which > > is a shame because their results can be useful. After the initial "onboarding" > > step they might be willing to contribute more and more too. > > > > > > Please let me know if the idea makes sense or if something similar is already > > in plans. I'd be happy to contribute to the effort because I believe it would > > make everyone's life easier and we'd all benefit from it (and maybe > > someone > > else from my team would be willing to help out too if needed). > > I never responded to this, yea, you did. ;) > but this sounds like a really good idea to me. I don't care much which > backend we aggregate to, but it would be good as a community to start > using one service to start with. It would help to find issues with > the API, or the results schema, if multiple people started using it. > > I know that people using Fuego are sending data to their own instances > of KernelCI. But I don't know what the issues are for sending this > data to a shared KernelCI service. > > I would be interested in hooking up my lab to send Fuego results to > KernelCI. This would be a good exercise. I'm not sure what the next > steps would be, but maybe we could discuss this on the next automated > testing conference call. OK here's my idea. I don't personally think kernelci (or LKFT) are set up to aggregate results currently. We have too many assumptions about where tests are coming from, how things are built, etc. 
In other words, dealing with noisy data is going to be non-trivial in any
existing project.

I would propose aggregating data into something like google's BigQuery.
This has a few benefits:
- Non-opinionated place to hold structured data
- Allows many downstream use-cases
- Managed hosting, and data is publicly available
- Storage is sponsored by google as a part of
  https://cloud.google.com/bigquery/public-data/
- First 1TB of query per 'project' is free, and users pay for more
  queries than that

With storage taken care of, how do we get the data in?

First, we'll need some canonical data structure defined. I would approach
defining the canonical structure in conjunction with the first few
projects that are interested in contributing their results. Each project
will have an ETL pipeline which will extract the test results from a
given project (such as kernelci, lkft, etc), translate it into the
canonical data structure, and load it into the google bigquery dataset at
a regular interval or in real-time. The translation layer is where things
like test names are handled.

The things this leaves me wanting are:
- raw data storage. It would be nice if raw data were stored somewhere
  permanent in some intermediary place so that later implementations
  could happen, and for data that doesn't fit into whatever structure we
  end up with.
- time, to actually try it and find the gaps. This is just an idea I've
  been thinking about. Anyone with experience here that can help flesh
  this out?

Dan

--
Linaro - Kernel Validation

^ permalink raw reply [flat|nested] 15+ messages in thread
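Dan's extract/translate/load flow can be made concrete with a short sketch. The following Python is purely illustrative: the canonical field names, and the shape of the incoming kernelci-style record, are hypothetical stand-ins, since the thread leaves the actual schema to be defined together with the first contributing projects.

```python
# Sketch of the "translate" step of an ETL pipeline feeding a shared
# results dataset. Every field name here is hypothetical; the thread
# proposes defining the real schema with the first contributing projects.
import hashlib
import json
from datetime import datetime, timezone

CANONICAL_FIELDS = {
    "origin",        # which CI system produced the result (kernelci, cki, ...)
    "contact",       # whom to ask about the result, per Veronika's point
    "tree", "branch", "commit",
    "test_name",     # normalized name, handled by the translation layer
    "environment",   # hardware / arch the test ran on
    "status",        # PASS / FAIL / SKIP
    "timestamp",
    "raw_ref",       # pointer back to the raw record ("raw data storage")
}

def translate_kernelci_result(raw: dict) -> dict:
    """Map one hypothetical kernelci-style record to the canonical shape."""
    record = {
        "origin": "kernelci",
        "contact": raw.get("lab_contact", "unknown"),
        "tree": raw["job"],
        "branch": raw.get("git_branch", "unknown"),
        "commit": raw["git_commit"],
        # Translation layer: normalize project-specific test names here.
        "test_name": raw["test_case"].lower().replace(" ", "_"),
        "environment": raw.get("device_type", "unknown"),
        "status": raw["status"].upper(),
        "timestamp": raw.get("created_on",
                             datetime.now(timezone.utc).isoformat()),
        # Content-address the raw payload so it can be found again later.
        "raw_ref": hashlib.sha256(
            json.dumps(raw, sort_keys=True).encode()).hexdigest(),
    }
    # Sanity check: every canonical field is present, nothing extra.
    assert set(record) == CANONICAL_FIELDS
    return record
```

The load step is deliberately omitted; in practice it would stream such records into the shared dataset, for example with the BigQuery client's insert_rows_json().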
* Re: A common place for CI results? 2019-05-15 20:33 ` Dan Rue @ 2019-05-15 21:06 ` Tom Gall 2019-05-20 15:32 ` Veronika Kabatova 2019-05-15 22:58 ` [Automated-testing] " Carlos Hernandez 1 sibling, 1 reply; 15+ messages in thread From: Tom Gall @ 2019-05-15 21:06 UTC (permalink / raw) To: kernelci, Dan Rue; +Cc: Tim.Bird, vkabatov, automated-testing, info > On May 15, 2019, at 3:33 PM, Dan Rue <dan.rue@linaro.org> wrote: > > On Tue, May 14, 2019 at 11:01:35PM +0000, Tim.Bird@sony.com wrote: >> >> >>> -----Original Message----- >>> From: Veronika Kabatova >>> >>> Hi, >>> >>> as we know from this list, there's plenty CI systems doing some testing on >>> the >>> upstream kernels (and maybe some others we don't know about). >>> >>> It would be great if there was a single common place where all the CI systems >>> can put their results. This would make it much easier for the kernel >>> maintainers and developers to see testing status since they only need to >>> check one place instead of having a list of sites/mailing lists where each CI >>> posts their contributions. >>> >>> >>> A few weeks ago, with some people we've been talking about kernelci.org >>> being >>> in a good place to act as the central upstream kernel CI piece that most >>> maintainers already know about. So I'm wondering if it would be possible for >>> kernelci to also act as an aggregator of all results? There's already an API >>> for publishing a report [0] so it shouldn't be too hard to adjust it to >>> handle and show more information. I also found the beta version for test >>> results [1] so actually, most of the needed functionality seems to be already >>> there. Since there will be multiple CI systems, the source and contact point >>> for the contributor (so maintainers know whom to ask about results if >>> needed) >>> would likely be the only missing essential data point. >>> >>> >>> The common place for results would also make it easier for new CI systems >>> to >>> get involved with upstream. 
There are likely other companies out there >>> running >>> some tests on kernel internally but don't publish the results anywhere. Only >>> adding some API calls into their code (with the data they are allowed to >>> publish) would make it very simple for them to start contributing. If we want >>> to make them interested, the starting point needs to be trivial. Different >>> companies have different setups and policies and they might not be able to >>> fulfill arbitrary requirements so they opt to not get involved at all, which >>> is a shame because their results can be useful. After the initial "onboarding" >>> step they might be willing to contribute more and more too. >>> >>> >>> Please let me know if the idea makes sense or if something similar is already >>> in plans. I'd be happy to contribute to the effort because I believe it would >>> make everyone's life easier and we'd all benefit from it (and maybe >>> someone >>> else from my team would be willing to help out too if needed). >> >> I never responded to this, > > yea, you did. ;) > >> but this sounds like a really good idea to me. I don't care much which >> backend we aggregate to, but it would be good as a community to start >> using one service to start with. It would help to find issues with >> the API, or the results schema, if multiple people started using it. >> >> I know that people using Fuego are sending data to their own instances >> of KernelCI. But I don't know what the issues are for sending this >> data to a shared KernelCI service. >> >> I would be interested in hooking up my lab to send Fuego results to >> KernelCI. This would be a good exercise. I'm not sure what the next >> steps would be, but maybe we could discuss this on the next automated >> testing conference call. > > OK here's my idea. > > I don't personally think kernelci (or LKFT) are set up to aggregate > results currently. We have too many assumptions about where tests are > coming from, how things are built, etc. 
In other words, dealing with
> noisy data is going to be non-trivial in any existing project.

I completely agree.

> I would propose aggregating data into something like google's BigQuery.
> This has a few benefits:
> - Non-opinionated place to hold structured data
> - Allows many downstream use-cases
> - Managed hosting, and data is publicly available
> - Storage is sponsored by google as a part of
>   https://cloud.google.com/bigquery/public-data/
> - First 1TB of query per 'project' is free, and users pay for more
>   queries than that

I very much like this idea. I do lots of android kernel testing and being
able to work with / compare / contribute to what is essentially a pile of
data in BQ would be great. As an end user working with the data I’d also
have lots of dashboard options to customize and share queries with
others.

> With storage taken care of, how do we get the data in?
> First, we'll need some canonical data structure defined. I would
> approach defining the canonical structure in conjunction with the first
> few projects that are interested in contributing their results. Each
> project will have an ETL pipeline which will extract the test results
> from a given project (such as kernelci, lkft, etc), translate it into
> the canonical data structure, and load it into the google bigquery
> dataset at a regular interval or in real-time. The translation layer is
> where things like test names are handled.

Exactly. I would hope that the various projects that are producing data
would be motivated to plug in. After all, it makes the data they are
producing more useful and available to a larger group of people.

> The things this leaves me wanting are:
> - raw data storage. It would be nice if raw data were stored somewhere
>   permanent in some intermediary place so that later implementations
>   could happen, and for data that doesn't fit into whatever structure we
>   end up with.

I agree.

> - time, to actually try it and find the gaps.
> This is just an idea I've been thinking about. Anyone with experience
> here that can help flesh this out?

I’m willing to lend a hand.

> Dan
>
> --
> Linaro - Kernel Validation

Tom

—
Director, Linaro Consumer Group

^ permalink raw reply [flat|nested] 15+ messages in thread
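As a concrete example of the kind of downstream use-case Tom has in mind (comparing results across labs over the shared pile of data), here is a rough sketch of one query. In a real deployment this would likely be a SQL GROUP BY run against the BigQuery dataset; the record fields below are illustrative only, no schema has been agreed in this thread.

```python
# Hypothetical downstream consumer of the shared results dataset:
# find (commit, test) pairs where different CI origins disagree.
# Field names are illustrative, not a settled schema.
from collections import defaultdict

def find_disagreements(records):
    """Group results by (commit, test_name) and report cases where
    two origins saw different statuses for the same test."""
    by_key = defaultdict(dict)  # (commit, test) -> {origin: status}
    for r in records:
        by_key[(r["commit"], r["test_name"])][r["origin"]] = r["status"]
    return {
        key: origins
        for key, origins in by_key.items()
        if len(set(origins.values())) > 1
    }

results = [
    {"commit": "deadbeef", "test_name": "boot", "origin": "kernelci",
     "status": "PASS"},
    {"commit": "deadbeef", "test_name": "boot", "origin": "cki",
     "status": "FAIL"},
    {"commit": "deadbeef", "test_name": "kselftest_net", "origin": "lkft",
     "status": "PASS"},
]
print(find_disagreements(results))
# → {('deadbeef', 'boot'): {'kernelci': 'PASS', 'cki': 'FAIL'}}
```

This is also the shape of the "redundancy check" that only becomes possible once everyone reports into one place.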
* Re: A common place for CI results? 2019-05-15 21:06 ` Tom Gall @ 2019-05-20 15:32 ` Veronika Kabatova 2019-05-28 8:24 ` Guillaume Tucker 0 siblings, 1 reply; 15+ messages in thread From: Veronika Kabatova @ 2019-05-20 15:32 UTC (permalink / raw) To: Tom Gall, Dan Rue, Tim Bird; +Cc: kernelci, automated-testing, info ----- Original Message ----- > From: "Tom Gall" <tom.gall@linaro.org> > To: kernelci@groups.io, "Dan Rue" <dan.rue@linaro.org> > Cc: "Tim Bird" <Tim.Bird@sony.com>, vkabatov@redhat.com, automated-testing@yoctoproject.org, info@kernelci.org > Sent: Wednesday, May 15, 2019 11:06:33 PM > Subject: Re: A common place for CI results? > > > > > On May 15, 2019, at 3:33 PM, Dan Rue <dan.rue@linaro.org> wrote: > > > > On Tue, May 14, 2019 at 11:01:35PM +0000, Tim.Bird@sony.com wrote: > >> > >> > >>> -----Original Message----- > >>> From: Veronika Kabatova > >>> > >>> Hi, > >>> > >>> as we know from this list, there's plenty CI systems doing some testing > >>> on > >>> the > >>> upstream kernels (and maybe some others we don't know about). > >>> > >>> It would be great if there was a single common place where all the CI > >>> systems > >>> can put their results. This would make it much easier for the kernel > >>> maintainers and developers to see testing status since they only need to > >>> check one place instead of having a list of sites/mailing lists where > >>> each CI > >>> posts their contributions. > >>> > >>> > >>> A few weeks ago, with some people we've been talking about kernelci.org > >>> being > >>> in a good place to act as the central upstream kernel CI piece that most > >>> maintainers already know about. So I'm wondering if it would be possible > >>> for > >>> kernelci to also act as an aggregator of all results? There's already an > >>> API > >>> for publishing a report [0] so it shouldn't be too hard to adjust it to > >>> handle and show more information. 
I also found the beta version for test > >>> results [1] so actually, most of the needed functionality seems to be > >>> already > >>> there. Since there will be multiple CI systems, the source and contact > >>> point > >>> for the contributor (so maintainers know whom to ask about results if > >>> needed) > >>> would likely be the only missing essential data point. > >>> > >>> > >>> The common place for results would also make it easier for new CI systems > >>> to > >>> get involved with upstream. There are likely other companies out there > >>> running > >>> some tests on kernel internally but don't publish the results anywhere. > >>> Only > >>> adding some API calls into their code (with the data they are allowed to > >>> publish) would make it very simple for them to start contributing. If we > >>> want > >>> to make them interested, the starting point needs to be trivial. > >>> Different > >>> companies have different setups and policies and they might not be able > >>> to > >>> fulfill arbitrary requirements so they opt to not get involved at all, > >>> which > >>> is a shame because their results can be useful. After the initial > >>> "onboarding" > >>> step they might be willing to contribute more and more too. > >>> > >>> > >>> Please let me know if the idea makes sense or if something similar is > >>> already > >>> in plans. I'd be happy to contribute to the effort because I believe it > >>> would > >>> make everyone's life easier and we'd all benefit from it (and maybe > >>> someone > >>> else from my team would be willing to help out too if needed). > >> > >> I never responded to this, > > > > yea, you did. ;) > > > >> but this sounds like a really good idea to me. I don't care much which > >> backend we aggregate to, but it would be good as a community to start > >> using one service to start with. It would help to find issues with > >> the API, or the results schema, if multiple people started using it. 
> >> > >> I know that people using Fuego are sending data to their own instances > >> of KernelCI. But I don't know what the issues are for sending this > >> data to a shared KernelCI service. > >> > >> I would be interested in hooking up my lab to send Fuego results to > >> KernelCI. This would be a good exercise. I'm not sure what the next > >> steps would be, but maybe we could discuss this on the next automated > >> testing conference call. > > > > OK here's my idea. > > > > I don't personally think kernelci (or LKFT) are set up to aggregate > > results currently. We have too many assumptions about where tests are > > coming from, how things are built, etc. In other words, dealing with > > noisy data is going to be non-trivial in any existing project. > > I completely agree. > This is a good point. I'm totally fine with having a separate independent place for aggregation. > > I would propose aggregating data into something like google's BigQuery. > > This has a few benefits: > > - Non-opinionated place to hold structured data > > - Allows many downstream use-cases > > - Managed hosting, and data is publicly available > > - Storage is sponsored by google as a part of > > https://cloud.google.com/bigquery/public-data/ > > - First 1TB of query per 'project' is free, and users pay for more > > queries than that > > I very much like this idea. I do lots of android kernel testing > and being able to work with / compare / contribute to what > is essentially a pile of data in BQ would be great. As an > end user working with the data I’d also have lots of dash > board options to customize and share queries with others. > > > With storage taken care of, how do we get the data in? > > > First, we'll need some canonical data structure defined. I would > > approach defining the canonical structure in conjunction with the first > > few projects that are interested in contributing their results. 
Each > > project will have an ETL pipeline which will extract the test results > > from a given project (such as kernelci, lkft, etc), translate it into > > the canonical data structure, and load it into the google bigquery > > dataset at a regular interval or in real-time. The translation layer is > > where things like test names are handled. > +1, exactly how I imagined this part. > Exactly. I would hope that the various projects that are producing > data would be motived to plug in. After all, it makes the data > they are producing more useful and available to a larger group > of people. > > > The things this leaves me wanting are: > > - raw data storage. It would be nice if raw data were stored somewhere > > permanent in some intermediary place so that later implementations > > could happen, and for data that doesn't fit into whatever structure we > > end up with. > > I agree. +1 > > > - time, to actually try it and find the gaps. This is just an idea I've > > been thinking about. Anyone with experience here that can help flesh > > this out? > > I’m willing to lend a hand. > Thanks for starting up a specific proposal! I agree with everything that was brought up. I'll try to find time to participate in the implementation part too (although my experience with data storage is.. limited, I should be able to help out with the structure prototyping and maybe other parts too). Thanks again, Veronika CKI Project > > Dan > > > > -- > > Linaro - Kernel Validation > > Tom > > — > Directory, Linaro Consumer Group > > ^ permalink raw reply [flat|nested] 15+ messages in thread
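The "raw data storage" item that gets a +1 above could be prototyped very simply: keep raw submissions content-addressed in a flat store, so canonical records only carry a hash back-reference and richer re-imports stay possible later. A stdlib-only sketch follows; the on-disk layout is made up for illustration and is not any project's actual format.

```python
# Minimal content-addressed store for raw CI submissions, so that data
# which doesn't fit the canonical structure yet is never thrown away.
# Purely illustrative; no project in this thread defines this layout.
import hashlib
import json
import tempfile
from pathlib import Path

def store_raw(root: Path, payload: dict) -> str:
    """Write the raw payload once, keyed by its sha256; return the key
    that a canonical record can carry as a back-reference."""
    blob = json.dumps(payload, sort_keys=True).encode()
    key = hashlib.sha256(blob).hexdigest()
    path = root / key[:2] / f"{key}.json"  # fan out like git's object store
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():                  # idempotent: duplicates are free
        path.write_bytes(blob)
    return key

def load_raw(root: Path, key: str) -> dict:
    """Fetch a raw submission back for a later, richer import."""
    return json.loads((root / key[:2] / f"{key}.json").read_text())

# Example round-trip in a throwaway directory:
store = Path(tempfile.mkdtemp())
key = store_raw(store, {"suite": "ltp", "arch": "x86_64", "result": "pass"})
```

Storing by content hash also means a resubmitted identical result costs nothing and can never diverge from what was originally reported.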
* Re: A common place for CI results? 2019-05-20 15:32 ` Veronika Kabatova @ 2019-05-28 8:24 ` Guillaume Tucker 2019-05-28 14:45 ` Veronika Kabatova 0 siblings, 1 reply; 15+ messages in thread From: Guillaume Tucker @ 2019-05-28 8:24 UTC (permalink / raw) To: kernelci, vkabatov; +Cc: Tom Gall, Dan Rue, Tim Bird, automated-testing, info [-- Attachment #1: Type: text/plain, Size: 11085 bytes --] Hello, On Mon, May 20, 2019 at 4:38 PM Veronika Kabatova <vkabatov@redhat.com> wrote: > > > ----- Original Message ----- > > From: "Tom Gall" <tom.gall@linaro.org> > > To: kernelci@groups.io, "Dan Rue" <dan.rue@linaro.org> > > Cc: "Tim Bird" <Tim.Bird@sony.com>, vkabatov@redhat.com, > automated-testing@yoctoproject.org, info@kernelci.org > > Sent: Wednesday, May 15, 2019 11:06:33 PM > > Subject: Re: A common place for CI results? > > > > > > > > > On May 15, 2019, at 3:33 PM, Dan Rue <dan.rue@linaro.org> wrote: > > > > > > On Tue, May 14, 2019 at 11:01:35PM +0000, Tim.Bird@sony.com wrote: > > >> > > >> > > >>> -----Original Message----- > > >>> From: Veronika Kabatova > > >>> > > >>> Hi, > > >>> > > >>> as we know from this list, there's plenty CI systems doing some > testing > > >>> on > > >>> the > > >>> upstream kernels (and maybe some others we don't know about). > > >>> > > >>> It would be great if there was a single common place where all the CI > > >>> systems > > >>> can put their results. This would make it much easier for the kernel > > >>> maintainers and developers to see testing status since they only > need to > > >>> check one place instead of having a list of sites/mailing lists where > > >>> each CI > > >>> posts their contributions. > > >>> > > >>> > > >>> A few weeks ago, with some people we've been talking about > kernelci.org > > >>> being > > >>> in a good place to act as the central upstream kernel CI piece that > most > > >>> maintainers already know about. 
So I'm wondering if it would be > possible > > >>> for > > >>> kernelci to also act as an aggregator of all results? There's > already an > > >>> API > > >>> for publishing a report [0] so it shouldn't be too hard to adjust it > to > > >>> handle and show more information. I also found the beta version for > test > > >>> results [1] so actually, most of the needed functionality seems to be > > >>> already > > >>> there. Since there will be multiple CI systems, the source and > contact > > >>> point > > >>> for the contributor (so maintainers know whom to ask about results if > > >>> needed) > > >>> would likely be the only missing essential data point. > > >>> > > >>> > > >>> The common place for results would also make it easier for new CI > systems > > >>> to > > >>> get involved with upstream. There are likely other companies out > there > > >>> running > > >>> some tests on kernel internally but don't publish the results > anywhere. > > >>> Only > > >>> adding some API calls into their code (with the data they are > allowed to > > >>> publish) would make it very simple for them to start contributing. > If we > > >>> want > > >>> to make them interested, the starting point needs to be trivial. > > >>> Different > > >>> companies have different setups and policies and they might not be > able > > >>> to > > >>> fulfill arbitrary requirements so they opt to not get involved at > all, > > >>> which > > >>> is a shame because their results can be useful. After the initial > > >>> "onboarding" > > >>> step they might be willing to contribute more and more too. > > >>> > > >>> > > >>> Please let me know if the idea makes sense or if something similar is > > >>> already > > >>> in plans. I'd be happy to contribute to the effort because I believe > it > > >>> would > > >>> make everyone's life easier and we'd all benefit from it (and maybe > > >>> someone > > >>> else from my team would be willing to help out too if needed). 
> > >> > > >> I never responded to this, > > > > > > yea, you did. ;) > > > > > >> but this sounds like a really good idea to me. I don't care much which > > >> backend we aggregate to, but it would be good as a community to start > > >> using one service to start with. It would help to find issues with > > >> the API, or the results schema, if multiple people started using it. > > >> > > >> I know that people using Fuego are sending data to their own instances > > >> of KernelCI. But I don't know what the issues are for sending this > > >> data to a shared KernelCI service. > > >> > > >> I would be interested in hooking up my lab to send Fuego results to > > >> KernelCI. This would be a good exercise. I'm not sure what the next > > >> steps would be, but maybe we could discuss this on the next automated > > >> testing conference call. > > > > > > OK here's my idea. > > > > > > I don't personally think kernelci (or LKFT) are set up to aggregate > > > results currently. We have too many assumptions about where tests are > > > coming from, how things are built, etc. In other words, dealing with > > > noisy data is going to be non-trivial in any existing project. > > > > I completely agree. > > > > This is a good point. I'm totally fine with having a separate independent > place for aggregation. > > > > I would propose aggregating data into something like google's BigQuery. > > > This has a few benefits: > > > - Non-opinionated place to hold structured data > > > - Allows many downstream use-cases > > > - Managed hosting, and data is publicly available > > > - Storage is sponsored by google as a part of > > > https://cloud.google.com/bigquery/public-data/ > > > - First 1TB of query per 'project' is free, and users pay for more > > > queries than that > > > > I very much like this idea. I do lots of android kernel testing > > and being able to work with / compare / contribute to what > > is essentially a pile of data in BQ would be great. 
As an > > end user working with the data I’d also have lots of dash > > board options to customize and share queries with others. > > > > > With storage taken care of, how do we get the data in? > > > > > First, we'll need some canonical data structure defined. I would > > > approach defining the canonical structure in conjunction with the first > > > few projects that are interested in contributing their results. Each > > > project will have an ETL pipeline which will extract the test results > > > from a given project (such as kernelci, lkft, etc), translate it into > > > the canonical data structure, and load it into the google bigquery > > > dataset at a regular interval or in real-time. The translation layer is > > > where things like test names are handled. > > > > +1, exactly how I imagined this part. > > > Exactly. I would hope that the various projects that are producing > > data would be motived to plug in. After all, it makes the data > > they are producing more useful and available to a larger group > > of people. > > > > > The things this leaves me wanting are: > > > - raw data storage. It would be nice if raw data were stored somewhere > > > permanent in some intermediary place so that later implementations > > > could happen, and for data that doesn't fit into whatever structure we > > > end up with. > > > > I agree. > > +1 > > > > > > - time, to actually try it and find the gaps. This is just an idea I've > > > been thinking about. Anyone with experience here that can help flesh > > > this out? > > > > I’m willing to lend a hand. > > > > Thanks for starting up a specific proposal! I agree with everything that > was > brought up. I'll try to find time to participate in the implementation part > too (although my experience with data storage is.. limited, I should be > able > to help out with the structure prototyping and maybe other parts too). 
>

This all sounds great: a scalable common location to store the results,
and definitions of test case names. However, there is a whole layer of
logic above and around this which KernelCI does, and I'm sure other CI
systems also do with some degree of overlap. So it seems to me that
solving how to deal with the results is only one piece in the puzzle to
get a common CI architecture for upstream kernel testing. Sorry I'm a
bit late to the party so I'll add my 2¢ here...

Around the end of last year I made this document and mentioned it on
this list, about making KernelCI more modular:

https://docs.google.com/document/d/15F42HdHTO6NbSL53_iLl77lfe1XQKdWaHAf7XCNkKD8/edit?usp=sharing
https://groups.io/g/kernelci/topic/kernelci_modular_pipeline/29692355

The idea is to make it possible to have alternative components in the
KernelCI "pipeline". Right now, KernelCI has these components:

* Jenkins job to monitor git branches and build kernels
* LAVA to run tests
* Custom backend and storage server to keep binaries and data
* Custom web frontend to show the results

They could all be replaced or used in conjunction with alternative build
systems, database engines, test lab schedulers and dashboards. The key
thing is the code orchestrating all this, which is kept in the
kernelci-core repository.

For example, when a change has been detected in a tree, rather than
triggering kernel builds on Jenkins there could be a request sent to
another build system to do that elsewhere. Likewise, when some builds
are ready to be tested, jobs could be scheduled in non-LAVA labs simply
by sending another kind of HTTP request than the ones we're currently
sending to the LAVA APIs. This could easily be described in some config
files; in fact we already have one with the list of labs to submit jobs
to. Builds and tests are configured in YAML files in KernelCI, which
could also easily be extended with new attributes.
The big advantage of having a central way to orchestrate all this is
that results are going to be consistent and higher-level features can be
enabled: each tree will be sampled at the same commit, so we don't end
up with one CI lab running a version of mainline a few patches behind
another one etc... It means we can expand some KernelCI features to a
larger ecosystem of CI labs, such as:

* redundancy checks when the same test / hardware is tested in multiple
  places (say, if a RPi fails to boot in a LAVA lab but not in CKI's
  lab...)
* regression tracking across the whole spectrum of CI labs
* common reports for each single kernel revision being tested
* bisections extended to non-LAVA labs

It feels like a diagram would be needed to really give an idea of how
this would work. APIs and callback mechanisms would need to be well
defined to have clear entry points for the various components in a
modular system like this. I think we would be able to reuse some of the
things currently used by KernelCI and improve them, taking into account
what other CI labs have been doing (LKFT, CKI...).

I'm only scratching the surface here, but I wanted to raise this point
to see if others shared the same vision. It would be unfortunate if we
came up with a great solution focused on results, but then realised that
it had big design limitations when trying to add more abstract
functionality across all the contributing CI labs.

Well, I tried to keep it short - hope this makes sense.

Cheers,
Guillaume

[-- Attachment #2: Type: text/html, Size: 14230 bytes --]

^ permalink raw reply [flat|nested] 15+ messages in thread
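One possible shape for the config-driven dispatch Guillaume describes: the orchestrator reads a lab list (KernelCI already keeps one in YAML) and picks a submission method per lab type, so a non-LAVA lab only differs by the kind of HTTP request it receives. Everything below is a hypothetical sketch; the keys, lab names and endpoints do not come from kernelci-core.

```python
# Sketch of config-driven job dispatch for a modular pipeline: one
# orchestrator, multiple lab "drivers". Lab entries mimic the YAML lab
# config KernelCI already has, but every key here is hypothetical.

def submit_lava(lab: dict, job: dict) -> dict:
    # Real code would POST this to the LAVA REST API at lab["url"].
    return {"method": "lava-rest", "url": lab["url"] + "/jobs", "body": job}

def submit_callback(lab: dict, job: dict) -> dict:
    # Non-LAVA labs get a plain HTTP callback with the same job payload.
    return {"method": "http-callback", "url": lab["url"], "body": job}

DRIVERS = {"lava": submit_lava, "callback": submit_callback}

LABS = [  # stand-in for the parsed YAML lab config
    {"name": "lab-a", "type": "lava",
     "url": "https://lava.example.org"},
    {"name": "lab-b", "type": "callback",
     "url": "https://ci.example.org/submit"},
]

def dispatch(job: dict, labs=LABS) -> list:
    """Build one submission request per configured lab, chosen by type."""
    return [DRIVERS[lab["type"]](lab, job) for lab in labs]

requests = dispatch({"kernel": "v5.2-rc1", "test": "baseline"})
```

Because every lab receives the same job description for the same commit, the consistency needed for redundancy checks and common reports falls out of the dispatch step itself.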
* Re: A common place for CI results? 2019-05-28 8:24 ` Guillaume Tucker @ 2019-05-28 14:45 ` Veronika Kabatova 0 siblings, 0 replies; 15+ messages in thread From: Veronika Kabatova @ 2019-05-28 14:45 UTC (permalink / raw) To: Guillaume Tucker Cc: kernelci, Tom Gall, Dan Rue, Tim Bird, automated-testing, info ----- Original Message ----- > From: "Guillaume Tucker" <guillaume.tucker@gmail.com> > To: kernelci@groups.io, vkabatov@redhat.com > Cc: "Tom Gall" <tom.gall@linaro.org>, "Dan Rue" <dan.rue@linaro.org>, "Tim > Bird" <Tim.Bird@sony.com>, automated-testing@yoctoproject.org, > info@kernelci.org > Sent: Tuesday, May 28, 2019 10:24:44 AM > Subject: Re: A common place for CI results? > Hello, > On Mon, May 20, 2019 at 4:38 PM Veronika Kabatova < vkabatov@redhat.com > > wrote: > > ----- Original Message ----- > > > > From: "Tom Gall" < tom.gall@linaro.org > > > > > To: kernelci@groups.io , "Dan Rue" < dan.rue@linaro.org > > > > > Cc: "Tim Bird" < Tim.Bird@sony.com >, vkabatov@redhat.com , > > > automated-testing@yoctoproject.org , info@kernelci.org > > > > Sent: Wednesday, May 15, 2019 11:06:33 PM > > > > Subject: Re: A common place for CI results? > > > > > > > > > > > > > > > > > On May 15, 2019, at 3:33 PM, Dan Rue < dan.rue@linaro.org > wrote: > > > > > > > > > > On Tue, May 14, 2019 at 11:01:35PM +0000, Tim.Bird@sony.com wrote: > > > > >> > > > > >> > > > > >>> -----Original Message----- > > > > >>> From: Veronika Kabatova > > > > >>> > > > > >>> Hi, > > > > >>> > > > > >>> as we know from this list, there's plenty CI systems doing some > > > >>> testing > > > > >>> on > > > > >>> the > > > > >>> upstream kernels (and maybe some others we don't know about). > > > > >>> > > > > >>> It would be great if there was a single common place where all the CI > > > > >>> systems > > > > >>> can put their results. 
This would make it much easier for the kernel > > > > >>> maintainers and developers to see testing status since they only need > > > >>> to > > > > >>> check one place instead of having a list of sites/mailing lists where > > > > >>> each CI > > > > >>> posts their contributions. > > > > >>> > > > > >>> > > > > >>> A few weeks ago, with some people we've been talking about > > > >>> kernelci.org > > > > >>> being > > > > >>> in a good place to act as the central upstream kernel CI piece that > > > >>> most > > > > >>> maintainers already know about. So I'm wondering if it would be > > > >>> possible > > > > >>> for > > > > >>> kernelci to also act as an aggregator of all results? There's already > > > >>> an > > > > >>> API > > > > >>> for publishing a report [0] so it shouldn't be too hard to adjust it > > > >>> to > > > > >>> handle and show more information. I also found the beta version for > > > >>> test > > > > >>> results [1] so actually, most of the needed functionality seems to be > > > > >>> already > > > > >>> there. Since there will be multiple CI systems, the source and > > > >>> contact > > > > >>> point > > > > >>> for the contributor (so maintainers know whom to ask about results if > > > > >>> needed) > > > > >>> would likely be the only missing essential data point. > > > > >>> > > > > >>> > > > > >>> The common place for results would also make it easier for new CI > > > >>> systems > > > > >>> to > > > > >>> get involved with upstream. There are likely other companies out > > > >>> there > > > > >>> running > > > > >>> some tests on kernel internally but don't publish the results > > > >>> anywhere. > > > > >>> Only > > > > >>> adding some API calls into their code (with the data they are allowed > > > >>> to > > > > >>> publish) would make it very simple for them to start contributing. If > > > >>> we > > > > >>> want > > > > >>> to make them interested, the starting point needs to be trivial. 
> > > > >>> Different > > > > >>> companies have different setups and policies and they might not be > > > >>> able > > > > >>> to > > > > >>> fulfill arbitrary requirements so they opt to not get involved at > > > >>> all, > > > > >>> which > > > > >>> is a shame because their results can be useful. After the initial > > > > >>> "onboarding" > > > > >>> step they might be willing to contribute more and more too. > > > > >>> > > > > >>> > > > > >>> Please let me know if the idea makes sense or if something similar is > > > > >>> already > > > > >>> in plans. I'd be happy to contribute to the effort because I believe > > > >>> it > > > > >>> would > > > > >>> make everyone's life easier and we'd all benefit from it (and maybe > > > > >>> someone > > > > >>> else from my team would be willing to help out too if needed). > > > > >> > > > > >> I never responded to this, > > > > > > > > > > yea, you did. ;) > > > > > > > > > >> but this sounds like a really good idea to me. I don't care much which > > > > >> backend we aggregate to, but it would be good as a community to start > > > > >> using one service to start with. It would help to find issues with > > > > >> the API, or the results schema, if multiple people started using it. > > > > >> > > > > >> I know that people using Fuego are sending data to their own instances > > > > >> of KernelCI. But I don't know what the issues are for sending this > > > > >> data to a shared KernelCI service. > > > > >> > > > > >> I would be interested in hooking up my lab to send Fuego results to > > > > >> KernelCI. This would be a good exercise. I'm not sure what the next > > > > >> steps would be, but maybe we could discuss this on the next automated > > > > >> testing conference call. > > > > > > > > > > OK here's my idea. > > > > > > > > > > I don't personally think kernelci (or LKFT) are set up to aggregate > > > > > results currently. 
> > > > > We have too many assumptions about where tests are coming from, how
> > > > > things are built, etc. In other words, dealing with noisy data is
> > > > > going to be non-trivial in any existing project.
> > > >
> > > > I completely agree.
> > >
> > > This is a good point. I'm totally fine with having a separate
> > > independent place for aggregation.
> > >
> > > > > I would propose aggregating data into something like google's
> > > > > BigQuery. This has a few benefits:
> > > > > - Non-opinionated place to hold structured data
> > > > > - Allows many downstream use-cases
> > > > > - Managed hosting, and data is publicly available
> > > > > - Storage is sponsored by google as a part of
> > > > >   https://cloud.google.com/bigquery/public-data/
> > > > > - First 1TB of query per 'project' is free, and users pay for more
> > > > >   queries than that
> > > >
> > > > I very much like this idea. I do lots of android kernel testing and
> > > > being able to work with / compare / contribute to what is essentially
> > > > a pile of data in BQ would be great. As an end user working with the
> > > > data I'd also have lots of dashboard options to customize and share
> > > > queries with others.
> > > >
> > > > > With storage taken care of, how do we get the data in?
> > > > >
> > > > > First, we'll need some canonical data structure defined. I would
> > > > > approach defining the canonical structure in conjunction with the
> > > > > first few projects that are interested in contributing their
> > > > > results. Each project will have an ETL pipeline which will extract
> > > > > the test results from a given project (such as kernelci, lkft, etc),
> > > > > translate it into the canonical data structure, and load it into the
> > > > > google bigquery dataset at a regular interval or in real-time. The
> > > > > translation layer is where things like test names are handled.
> > >
> > > +1, exactly how I imagined this part.
> > >
> > > > Exactly.
> > > > I would hope that the various projects that are producing data would
> > > > be motivated to plug in. After all, it makes the data they are
> > > > producing more useful and available to a larger group of people.
> > > >
> > > > > The things this leaves me wanting are:
> > > > > - raw data storage. It would be nice if raw data were stored
> > > > >   somewhere permanent in some intermediary place so that later
> > > > >   implementations could happen, and for data that doesn't fit into
> > > > >   whatever structure we end up with.
> > > >
> > > > I agree.
> > >
> > > +1
> > >
> > > > > - time, to actually try it and find the gaps. This is just an idea
> > > > >   I've been thinking about. Anyone with experience here that can
> > > > >   help flesh this out?
> > > >
> > > > I'm willing to lend a hand.
> > >
> > > Thanks for starting up a specific proposal! I agree with everything
> > > that was brought up. I'll try to find time to participate in the
> > > implementation part too (although my experience with data storage is...
> > > limited, I should be able to help out with the structure prototyping
> > > and maybe other parts too).
>
> This all sounds great: having a common location to store the results that
> is scalable, and definitions of test case names. However, there is a
> whole layer of logic above and around this which KernelCI does, and I'm
> sure other CI systems also do with some degree of overlap. So it seems to
> me that solving how to deal with the results is only one piece in the
> puzzle to get a common CI architecture for upstream kernel testing.
> Sorry I'm a bit late to the party so I'll add my 2¢ here...
> Around the end of last year I made this document and mentioned it on
> this list, about making KernelCI more modular:
>
> https://docs.google.com/document/d/15F42HdHTO6NbSL53_iLl77lfe1XQKdWaHAf7XCNkKD8/edit?usp=sharing
> https://groups.io/g/kernelci/topic/kernelci_modular_pipeline/29692355

I have a plan to read this document but still didn't get around to it :(

> The idea is to make it possible to have alternative components in the
> KernelCI "pipeline". Right now, KernelCI has these components:
>
> * Jenkins job to monitor git branches and build kernels
> * LAVA to run tests
> * Custom backend and storage server to keep binaries and data
> * Custom web frontend to show the results
>
> They could all be replaced or used in conjunction with alternative build
> systems, database engines, test lab schedulers and dashboards. The key
> thing is the code orchestrating all this, which is kept in the
> kernelci-core repository.
>
> For example, when a change has been detected in a tree, rather than
> triggering kernel builds on Jenkins there could be a request sent to
> another build system to do that elsewhere. Likewise, when some builds
> are ready to be tested, jobs could be scheduled in non-LAVA labs simply
> by sending another kind of HTTP request than the ones we're currently
> sending to the LAVA APIs. This could be easily described in some config
> files; in fact we already have one with the list of labs to submit jobs
> to. Builds and tests are configured in YAML files in KernelCI, which
> could easily be extended too with new attributes.
>
> The big advantage of having a central way to orchestrate all this is
> that results are going to be consistent and higher-level features can be
> enabled: each tree will be sampled at the same commit, so we don't end
> up with one CI lab running a version of mainline a few patches later
> than another one etc...
> It means we can expand some KernelCI features to a larger ecosystem of
> CI labs, such as:
>
> * redundancy checks when the same test / hardware is tested in multiple
>   places (say, if a RPi fails to boot in a LAVA lab but not in CKI's
>   lab...)
> * regression tracking across the whole spectrum of CI labs
> * common reports for each single kernel revision being tested
> * bisections extended to non-LAVA labs
>
> It feels like a diagram would be needed to really give an idea of how
> this would work. APIs and callback mechanisms would need to be well
> defined to have clear entry points for the various components in a
> modular system like this. I think we would be able to reuse some of the
> things currently used by KernelCI and improve them, taking into account
> what other CI labs have been doing (LKFT, CKI...).
>
> I'm only scratching the surface here, but I wanted to raise this point
> to see if others shared the same vision. It would be unfortunate if we
> came up with a great solution focused on results, but then realised that
> it had big design limitations when trying to add more abstract
> functionality across all the contributing CI labs.

This is definitely something I'd love to see in the (likely very far)
future too. However, I'd say that it would require changes in the CI
systems and not necessarily in the result format / way of displaying the
data, which is what we are trying to set up and agree on here. Each of
the CI systems in question would be responsible for their API calls and
the receiving side would just validate what it got, and this is likely
not something that would change even in the far future when we get more
integrated. I agree with all the points you made and we should definitely
keep them in mind going forward, but unless I overlooked something (which
is totally possible :) these two things don't depend on each other.

> Well, I tried to keep it short - hope this makes any sense.

It does.
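The redundancy checks mentioned above (the same test on the same hardware reported from several labs) become a simple aggregation query once results share a common structure. A rough sketch, with every field name and lab name invented for illustration:

```python
from collections import defaultdict

def find_lab_disagreements(results):
    # Group aggregated results by (test, hardware) and flag any
    # combination where labs report different outcomes -- e.g. a board
    # that boots in one LAVA lab but fails to boot in CKI's lab.
    by_key = defaultdict(dict)  # (test, hardware) -> {lab: status}
    for r in results:
        by_key[(r["test_name"], r["hardware"])][r["lab"]] = r["status"]
    return {key: labs for key, labs in by_key.items()
            if len(set(labs.values())) > 1}

results = [
    {"lab": "lava-lab-1", "test_name": "boot", "hardware": "rpi3", "status": "fail"},
    {"lab": "cki",        "test_name": "boot", "hardware": "rpi3", "status": "pass"},
    {"lab": "lava-lab-1", "test_name": "boot", "hardware": "x86",  "status": "pass"},
]
print(sorted(find_lab_disagreements(results)))  # → [('boot', 'rpi3')]
```

In practice this would be a query over the shared dataset rather than an in-memory pass, but the grouping logic is the same.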
I really have to find some time to get through that doc.

Veronika

> Cheers,
> Guillaume

^ permalink raw reply	[flat|nested] 15+ messages in thread
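The "translate" stage of the ETL pipeline proposed earlier in the thread might look roughly like the sketch below. No canonical schema had been agreed at this point, so every field name here, and the kernelci-style input record, is hypothetical:

```python
# Hypothetical sketch of one project's ETL "translate" step: map a
# CI-system-specific result record into a canonical structure before
# loading it into the shared dataset. All field names are invented.

def translate_kernelci_result(raw):
    """Map a (hypothetical) kernelci-style record to the canonical form."""
    status_map = {"PASS": "pass", "FAIL": "fail"}
    return {
        "origin": "kernelci",            # which CI system produced this
        "kernel_revision": raw["git_commit"],
        "tree": raw["git_branch"],
        "test_name": raw["test_case"],   # name normalization happens here
        "hardware": raw.get("board", "unknown"),
        # unknown statuses fall back to "skip", purely for illustration
        "status": status_map.get(raw["result"], "skip"),
    }

raw = {
    "git_commit": "a1b2c3d",
    "git_branch": "mainline/master",
    "test_case": "baseline.dmesg",
    "result": "PASS",
}
row = translate_kernelci_result(raw)
print(row["status"])  # → pass
```

Each contributing project would maintain its own translator like this, which is also where per-project quirks (test naming, missing fields) get absorbed before the load step.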
* Re: [Automated-testing] A common place for CI results?
  2019-05-15 20:33 ` Dan Rue
  2019-05-15 21:06 ` Tom Gall
@ 2019-05-15 22:58 ` Carlos Hernandez
  2019-05-16 12:05 ` Mark Brown
  1 sibling, 1 reply; 15+ messages in thread
From: Carlos Hernandez @ 2019-05-15 22:58 UTC (permalink / raw)
To: Dan Rue, kernelci, Tim.Bird; +Cc: info, automated-testing

On 5/15/19 4:33 PM, Dan Rue wrote:
> OK here's my idea.
>
> I don't personally think kernelci (or LKFT) are set up to aggregate
> results currently. We have too many assumptions about where tests are
> coming from, how things are built, etc. In other words, dealing with
> noisy data is going to be non-trivial in any existing project.
>
> I would propose aggregating data into something like google's BigQuery.
> This has a few benefits:
> - Non-opinionated place to hold structured data
> - Allows many downstream use-cases
> - Managed hosting, and data is publicly available
> - Storage is sponsored by google as a part of
>   https://cloud.google.com/bigquery/public-data/
> - First 1TB of query per 'project' is free, and users pay for more
>   queries than that
>
> With storage taken care of, how do we get the data in?
>
> First, we'll need some canonical data structure defined. I would
> approach defining the canonical structure in conjunction with the first
> few projects that are interested in contributing their results. Each
> project will have an ETL pipeline which will extract the test results
> from a given project (such as kernelci, lkft, etc), translate it into
> the canonical data structure, and load it into the google bigquery
> dataset at a regular interval or in real-time. The translation layer is
> where things like test names are handled.

+1

I like the idea

> The things this leaves me wanting are:
> - raw data storage.
>   It would be nice if raw data were stored somewhere permanent in some
>   intermediary place so that later implementations could happen, and
>   for data that doesn't fit into whatever structure we end up with.

If required, we could setup a related table w/ raw data. I believe max
cell size ~ 100MB per https://cloud.google.com/bigquery/quotas

However, another approach could be to define the structure version in
the schema. New fields can be added and left blank for old data.

> - time, to actually try it and find the gaps. This is just an idea I've
>   been thinking about. Anyone with experience here that can help flesh
>   this out?
>
> Dan

--
Carlos
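The structure-version idea above could be sketched as follows: each row records the schema version it was written with, and newer fields are left blank for older rows. Version numbers and field names here are made up:

```python
# Hypothetical versioned result schema: each version adds fields, and
# rows written under an older version are upgraded on read by filling
# the fields they lack with None, rather than rewriting stored data.
SCHEMA_FIELDS = {
    1: ["kernel_revision", "test_name", "status"],
    2: ["kernel_revision", "test_name", "status", "contact"],  # v2 adds a contact point
}
LATEST = max(SCHEMA_FIELDS)

def upgrade_row(row):
    """Return a copy of the row matching the latest schema, with fields
    unknown at its original version left as None."""
    upgraded = {field: row.get(field) for field in SCHEMA_FIELDS[LATEST]}
    upgraded["schema_version"] = LATEST
    return upgraded

old = {"schema_version": 1, "kernel_revision": "a1b2c3d",
       "test_name": "boot", "status": "pass"}
print(upgrade_row(old)["contact"])  # → None
```

This keeps old data queryable without migration, at the cost Mark notes in his reply: every consumer has to know that blanks may mean "predates this field".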
* Re: [Automated-testing] A common place for CI results?
  2019-05-15 22:58 ` [Automated-testing] " Carlos Hernandez
@ 2019-05-16 12:05 ` Mark Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2019-05-16 12:05 UTC (permalink / raw)
To: Carlos Hernandez; +Cc: Dan Rue, kernelci, Tim.Bird, automated-testing, info

On Wed, May 15, 2019 at 06:58:04PM -0400, Carlos Hernandez wrote:
> On 5/15/19 4:33 PM, Dan Rue wrote:

> > This has a few benefits:
> > - Non-opinionated place to hold structured data

Of course structure is opinion :/

> +1
> I like the idea

Me too.

> > The things this leaves me wanting are:
> > - raw data storage. It would be nice if raw data were stored somewhere
> >   permanent in some intermediary place so that later implementations
> >   could happen, and for data that doesn't fit into whatever structure
> >   we end up with.

> If required, we could setup a related table w/ raw data. I believe max
> cell size ~ 100MB per https://cloud.google.com/bigquery/quotas

> However, another approach could be to define the structure version in
> the schema. New fields can be added and left blank for old data.

Versioned structures do make tooling to use the data more difficult to
implement. I think Dan's idea is good, especially early on when things
are being tried for the first time.
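Dan's wish for permanent raw data storage and Carlos's related-table suggestion, both quoted above, could combine into something like the sketch below: structured rows in one table, the untouched original payload in another, linked by an id. All names are hypothetical, and plain dicts stand in for the real tables:

```python
import hashlib
import json

canonical_rows = []  # stand-in for the structured table queries run against
raw_payloads = {}    # stand-in for the related "raw" table: id -> payload

def load_result(canonical, raw_payload):
    """Store the translated row, and keep the original payload verbatim
    so future schema changes can re-translate from the source data."""
    raw_json = json.dumps(raw_payload, sort_keys=True)
    result_id = hashlib.sha256(raw_json.encode()).hexdigest()[:12]
    canonical_rows.append({**canonical, "result_id": result_id})
    raw_payloads[result_id] = raw_json
    return result_id

rid = load_result(
    {"test_name": "boot", "status": "pass"},
    {"vendor_specific_field": 42, "log": "ok"},  # data with no canonical home
)
print(rid in raw_payloads)  # → True
```

Since the raw side is never reshaped, it also sidesteps the versioned-structure tooling cost: consumers query the canonical table, and only a re-translation job ever touches the raw one.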
end of thread, other threads:[~2019-05-28 14:45 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <299272045.11819252.1554465036421.JavaMail.zimbra@redhat.com>
2019-04-05 14:41 ` A common place for CI results? Veronika Kabatova
2019-04-08 22:16   ` Tim.Bird
2019-04-09 13:41   ` Guenter Roeck
2019-04-10  9:28   ` [Automated-testing] " Mark Brown
2019-04-10 17:47   ` Veronika Kabatova
2019-04-10 21:13   ` [Automated-testing] " Kevin Hilman
2019-04-11 16:02   ` Veronika Kabatova
2019-05-14 23:01 ` Tim.Bird
2019-05-15 20:33   ` Dan Rue
2019-05-15 21:06   ` Tom Gall
2019-05-20 15:32   ` Veronika Kabatova
2019-05-28  8:24   ` Guillaume Tucker
2019-05-28 14:45   ` Veronika Kabatova
2019-05-15 22:58 ` [Automated-testing] " Carlos Hernandez
2019-05-16 12:05   ` Mark Brown