From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: Working with the KernelCI project
References: <20200707222342.scrz75265etaqlmd@redhat.com>
 <20200709110029.GB27682@intel.com>
 <69138572-7241-1636-8018-34cd380ec540@redhat.com>
 <20200713001929.GA1812@intel.com>
 <4a5d8379-b96d-6777-0d98-4ef13e56e0b3@redhat.com>
 <20200809022529.GB26573@intel.com>
From: "Nikolai Kondrashov"
Message-ID: <80b657a1-de83-6d59-c26a-aa3352822986@redhat.com>
Date: Mon, 10 Aug 2020 11:50:33 +0300
MIME-Version: 1.0
In-Reply-To: <20200809022529.GB26573@intel.com>
Content-Language: en-US
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
List-ID:
To: Philip Li
Cc: Don Zickus, dvyukov@google.com, kernelci-members@groups.io,
 nkondras@redhat.com, julie.du@intel.com, kernelci@groups.io,
 Iñaki Malerba

On 8/9/20 5:25 AM, Philip Li wrote:
> On Wed, Jul 22, 2020 at 03:42:41PM +0300, Nikolai Kondrashov wrote:
>> On 7/13/20 3:19 AM, Philip Li wrote:
>>> On Thu, Jul 09, 2020 at 10:05:04PM +0300, Nikolai Kondrashov wrote:
>> The above describes the revision you're testing as a patch being applied to a
>> particular commit in the stm32 repo's master branch. The revision has a build,
>> which failed, the build has the config URL and the log linked, as well as the
>
> Thanks, besides the final result like success or failed, does the db currently
> support start but not completed build (or planned but not start)?

Not yet, and we don't support updating the submitted objects yet either.
The database considers the results final, although you can send them in
gradually.

This lets us keep the design simpler, so we can concentrate on figuring out
more controversial topics, such as which objects we need, which data we need
for each object and how we can represent it. For this reason we need your data
now, whatever it is, so we can figure it out :)

Once we more-or-less settle on that, we can consider making reporting more
gradual and flexible.
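
To illustrate "send them in gradually": below is a rough Python sketch, not
the official client. It assumes the v3 layout used in the examples quoted
further down; the build ID "0day:build-1234", the function names and the
choice of fields are made up for illustration. The build goes out first, and
the test follows later in its own submission, linked only through "build_id".

```python
import json

# Schema version used by the examples in this thread.
SCHEMA_VERSION = {"major": 3, "minor": 0}


def make_build_submission(build_id, revision_id, start_time, valid):
    """Wrap one finished build into a submission of its own."""
    build = {
        "id": build_id,
        "origin": "0day",
        "revision_id": revision_id,
        "start_time": start_time,
        "valid": valid,
    }
    return {"version": SCHEMA_VERSION, "builds": [build]}


def make_test_submission(test_id, build_id, path, status):
    """Wrap one finished test; only the build ID links it to the build."""
    test = {
        "id": test_id,
        "origin": "0day",
        "build_id": build_id,
        "path": path,
        "status": status,
        "waived": False,
    }
    return {"version": SCHEMA_VERSION, "tests": [test]}


# Hypothetical ID: any string unique within your CI system would do.
BUILD_ID = "0day:build-1234"
first = make_build_submission(
    BUILD_ID,
    "5155be9994e557618a8312389fb4e52dfbf28a3c",
    "2020-07-17T09:04:55+03:00",
    True,
)
# ...later, once the test has finished:
second = make_test_submission(BUILD_ID + ":trinity", BUILD_ID, "trinity", "FAIL")

for submission in (first, second):
    print(json.dumps(submission, indent=4))
```

Each printed document could be saved to a file and piped to kcidb-submit
separately. Since objects can't be updated yet, each one should only be sent
once its result is final.
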
>> Since you don't provide any build information in that report, the build object
>> doesn't have any data. However, that's still valid according to the current
>> schema.
>
> If the commit from test result is same as build report's commit, does it mean
> when uploading test result, we need figure out the revision's id and builds' id
> to be part of test result's JSON, if they have been uploaded during build result?

You don't need to know the revision ID (the commit hash) when submitting a
test result, only the build ID. But yes, you have to know the build ID to link
your test to the build (which links to the revision, in turn). You don't have
to include the revision/build objects themselves in the same submission.

It's up to you what you use for the build ID, as long as it's unique across
all the builds you submit. E.g. you could use an existing ID from your system
which corresponds to a build, and which is available to you when you report
the test.

>> Note the "contacts" field all revisions have: this will help us determine who
>> to send the reports to.
>
> is this a feature to be triggered by the db to send reports? As 0day sends
> mail to notify the bad commit to related contacts to be aware of the problem.

Yes, this is intended for that. However, we are still developing the
notification system and are not sending the reports to real people yet.

>> Perhaps we need to add support for test input files to accommodate your
>> reproduction instructions and custom scripts.
>>
>> I have omitted some fields I could've added, and we need to improve the schema
>> to accommodate your reports better, of course.
>>
>> However, if you'd be interested, we could help you set up forwarding your
>> reports to KernelCI. You can start very simple and small, as the schema only
>> requires a handful of fields. This will help us see your needs: what data you
>> want in reports and on the dashboard, how many reports you want to push (both
>> positive and negative), etc.
>
> Thanks for the offering, the aggregation is a nice feature, but we are fully
> occupied by planned effort and it's hard to do actual implementation level thing
> in short term. Meanwhile I'd like to know more about it, and look fwd to the
> hacking session if i could attend it.

That's understandable, of course. If you make it, I hope I'll be able to show
how easy it is to start submitting :)

As a primer, this is literally how the interface could look:

    cat report.json | kcidb-submit -p kernelci-production -t kernelci_new

See you at Plumbers!
Nick

On 8/9/20 5:25 AM, Philip Li wrote:
> On Wed, Jul 22, 2020 at 03:42:41PM +0300, Nikolai Kondrashov wrote:
>> Hi Philip,
>>
>> Re-sending this to add the kernelci mailing list and a colleague, sorry.
>> Please reply to *this* message instead of the first one.
>>
>> On 7/13/20 3:19 AM, Philip Li wrote:
>>> On Thu, Jul 09, 2020 at 10:05:04PM +0300, Nikolai Kondrashov wrote:
>>>> How about I try to take a 0-day report and express it as a KCIDB submission,
>>>> as an illustration of how this could work? Would that help you understand what
>>>> we're trying to do? If yes, could you give me a link to one?
>>> Right, the "accuracy" for all single branch we test is more related to technical
>>> problem if we need look for a way to solve it. Here assume we have a report, it
>>> does have chance to be aggregated. You can pick up any link from
>>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/ (build) or
>>> https://lists.01.org/hyperkitty/list/lkp@lists.01.org/ (runtime) for example.
>>
>> Alright, first I'll copy the KCIDB intro I sent Dmitry in another branch of
>> this thread, in case you didn't read it (otherwise skip past END-OF-INTRO):
>>
>> You can submit all your data by just piping JSON to the "kcidb-submit" tool
>> (providing some options for authentication and destination), or you could
>> use the Python 3 API.
>>
>> The expected JSON data is a dictionary of arrays, along with the schema
>> version identifier. Each of the arrays in the dictionary could be missing or
>> empty, but otherwise they contain various report objects. The database will
>> accept them in any (reasonable) amount, order, or combination, although
>> submission order can affect when and how notifications are sent out.
>>
>> Every object has an ID which can be used to refer to it and link objects
>> together, but all the IDs are generated by the submitter, who is making sure
>> they're unique. For most of the objects (except revisions at this moment) you
>> can just use your CI system's ID for it and prefix it with your CI system's
>> name. This way you don't have to maintain a mapping between your system's IDs
>> and our IDs when you report results gradually. If you don't have that, you can
>> just generate them, for example hash some key fields or, as the last resort,
>> use UUIDs. Revisions are just using commit hashes at the moment.
>>
>> Every object has a special property called "misc" which can contain arbitrary
>> data. You could use that to submit data we don't have a schema for yet. The
>> users/developers will be able to see it, and we can use that as samples for
>> implementing future support.
>>
>> See example use of "misc" for a KernelCI test and its execution environment
>> (yes, it could be formatted better):
>> https://staging.kernelci.org:3000/d/test/test?orgId=1&var-id=kernelci:staging.kernelci.org:5ef9ab28baa38e14753eeeec
>>
>> END-OF-INTRO
>>
>> I think it's great that 0-day e-mails contain everything needed to investigate
>> and reproduce the issue and are self-sufficient. However, at the moment KCIDB
>> doesn't allow embedding artifacts or logs into submissions, but instead
>> expects them to be stored somewhere else and have the URLs provided.
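
Coming back to the ID paragraph in the intro above, here is how the three
options could be sketched in Python for builds and tests (revisions use commit
hashes, so they're left out). This is only an illustration: the function name,
its parameters and the choice of SHA-256 are my assumptions, nothing in KCIDB
requires them.

```python
import hashlib
import uuid


def make_object_id(origin, native_id=None, key_fields=None):
    """Build an object ID prefixed with the CI system's name.

    Preference order follows the intro: the CI system's own ID,
    then a hash of some key fields, then a random UUID as the last resort.
    """
    if native_id is not None:
        suffix = str(native_id)
    elif key_fields:
        # Join with NUL so ("ab", "c") and ("a", "bc") hash differently.
        joined = "\0".join(str(field) for field in key_fields)
        suffix = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    else:
        suffix = str(uuid.uuid4())
    return f"{origin}:{suffix}"


# Hypothetical inputs, for illustration only.
print(make_object_id("0day", native_id=1234))
print(make_object_id("0day", key_fields=["aa63af1b0824", "i386", "defconfig"]))
print(make_object_id("0day"))
```

The first two variants are reproducible, so a test reported later can
recompute its build's ID instead of looking it up. The UUID variant isn't, and
the generated value has to be stored somewhere.
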
>>
>> At the moment we would need you to do that, at least with .config files, but
>> if you can't, we can work on supporting embedding them. We planned on copying
>> linked files to KernelCI-managed storage anyway, eventually.
>>
>> KCIDB also doesn't support embedding the nice error summaries and log excerpts
>> you include in your reports, but that should be quite easy to amend by storing
>> them in the database itself (provided they're not too big). Just needs a
>> little thinking about the exact schema to use.
>>
>> For now, though, my examples below assume you can provide links to files
>> (spoofed here), and you're supplying the excerpts or complete logs with those.
> thanks, the samples below are helpful, i have added a few comments to discuss.
>
>>
>> KCIDB doesn't support source code linters and static analysis (such as
>> checkpatch, or coverity) at the moment, and we should add that. However,
>> a sparse run could be expressed in KCIDB as a build.
>>
>> Taking this report as a sample:
>>
>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/GUFRHHPKTWFYNLRH4LE2E2YELI6XG2IE/
>>
>> this is how a submission could look:
>>
>>     {
>>         "revisions": [
>>             {
>>                 "id": "391e437eedc0dab0a9f2c26997e68e040ae04ea3+88b9f2633622d8ca67ee0357740daa51095e59fa672db64895e871d8195d777c",
>>                 "origin": "0day",
>>                 "discovery_time": "2020-07-08T07:57:24+03:00",
>>                 "git_repository_url": "https://git.kernel.org/pub/scm/linux/kernel/git/atorgue/stm32.git",
>>                 "git_repository_commit_hash": "391e437eedc0dab0a9f2c26997e68e040ae04ea3",
>>                 "git_repository_branch": "master",
>>                 "patch_mboxes": [
>>                     {
>>                         "name": "0001-irqchip-stm32-exti-map-direct-event-to-irq-parent.patch",
>>                         "url": "https://github.com/0day-ci/linux/commit/3f47dd3217f24edfd442b35784001979e7aeacc7.patch"
>>                     }
>>                 ],
>>                 "message_id": "20200706081106.25125-1-alexandre.torgue@st.com",
>>                 "contacts": [
>>                     "Alexandre Torgue ",
>>                     "kbuild-all ",
>>                     "Marc Zyngier ",
>>                     "Thomas Gleixner ",
>>                     "Jason Cooper ",
>>                     "LKML "
>>                 ],
>>                 "valid": true
>>             }
>>         ],
>>         "builds": [
>>             {
>>                 "id": "0day:391e437eedc0dab0a9f2c26997e68e040ae04ea3+88b9f2633622d8ca67ee0357740daa51095e59fa672db64895e871d8195d777c:sparse",
>>                 "origin": "0day",
>>                 "revision_id": "391e437eedc0dab0a9f2c26997e68e040ae04ea3+88b9f2633622d8ca67ee0357740daa51095e59fa672db64895e871d8195d777c",
>>                 "architecture": "arm",
>>                 "compiler": "arm-linux-gnueabi-gcc (GCC) 9.3.0",
>>                 "start_time": "2020-07-08T07:57:24+03:00",
>>                 "config_url": "https://01.org/0day/391e437eedc0dab0a9f2c26997e68e040ae04ea3+88b9f2633622d8ca67ee0357740daa51095e59fa672db64895e871d8195d777c/config",
>>                 "log_url": "https://01.org/0day/391e437eedc0dab0a9f2c26997e68e040ae04ea3+88b9f2633622d8ca67ee0357740daa51095e59fa672db64895e871d8195d777c/sparse_build.log",
>>                 "command": "COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=arm",
>>                 "input_files": [
>>                     {
>>                         "name": "make.cross",
>>                         "url": "https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross"
>>                     },
>>                     {
>>                         "name": "instructions.txt",
>>                         "url": "https://01.org/0day/391e437eedc0dab0a9f2c26997e68e040ae04ea3+88b9f2633622d8ca67ee0357740daa51095e59fa672db64895e871d8195d777c/sparse_repro.txt"
>>                     }
>>                 ],
>>                 "valid": false
>>             }
>>         ],
>>         "version": {
>>             "major": 3,
>>             "minor": 0
>>         }
>>     }
>>
>>
>> The above describes the revision you're testing as a patch being applied to a
>> particular commit in the stm32 repo's master branch. The revision has a build,
>> which failed, the build has the config URL and the log linked, as well as the
> Thanks, besides the final result like success or failed, does the db currently
> support start but not completed build (or planned but not start)?
>
>> reproduction instructions linked as one of the "input files". We can work on
>> adding a dedicated field for reproduction instructions for both builds and
>> tests, since they're very useful and syzbot also produces them.
>>
>> A failed W=1 build:
>>
>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/WEKO2YRAZIPZFUQAY2D4XAOWJGC3HGBD/
>>
>> Would look similar:
>>
>>     {
>>         "revisions": [
>>             {
>>                 "id": "c46ed28dbe95844c1d15addd26ff05499057c4d5+3a6bc6d39be8f6c3acc50a89c648859f1ee0d638f4969ec4d2cab6c7135518c2",
>>                 "origin": "0day",
>>                 "discovery_time": "2020-07-08T07:57:24+03:00",
>>                 "git_repository_url": "https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git",
>>                 "git_repository_commit_hash": "c46ed28dbe95844c1d15addd26ff05499057c4d5",
>>                 "git_repository_branch": "for-next",
>>                 "patch_mboxes": [
>>                     {
>>                         "name": "0001-arm64-dts-qcom-sc7180-Add-lpass-cpu-node-for-I2S-dri.patch",
>>                         "url": "https://github.com/0day-ci/linux/commit/d20696ca206ae45d9d27fbeffb23fe5431b5de9d.patch"
>>                     }
>>                 ],
>>                 "message_id": "20200716061445.628709-1-cychiang@chromium.org",
>>                 "contacts": [
>>                     "Ajit Pandey ",
>>                     "Cheng-Yi Chiang ",
>>                     "kbuild-all ",
>>                     "Andy Gross ",
>>                     "Bjorn Andersson ",
>>                     "Rob Herring ",
>>                     "linux-arm-msm@vger.kernel.org",
>>                     "devicetree@vger.kernel.org"
>>                 ],
>>                 "valid": true
>>             }
>>         ],
>>         "builds": [
>>             {
>>                 "id": "0day:c46ed28dbe95844c1d15addd26ff05499057c4d5+3a6bc6d39be8f6c3acc50a89c648859f1ee0d638f4969ec4d2cab6c7135518c2:w=1",
>>                 "origin": "0day",
>>                 "revision_id": "c46ed28dbe95844c1d15addd26ff05499057c4d5+3a6bc6d39be8f6c3acc50a89c648859f1ee0d638f4969ec4d2cab6c7135518c2",
>>                 "architecture": "arm64",
>>                 "compiler": "clang version 12.0.0",
>>                 "start_time": "2020-07-08T07:57:24+03:00",
>>                 "config_url": "https://01.org/0day/c46ed28dbe95844c1d15addd26ff05499057c4d5+3a6bc6d39be8f6c3acc50a89c648859f1ee0d638f4969ec4d2cab6c7135518c2/config",
>>                 "log_url": "https://01.org/0day/c46ed28dbe95844c1d15addd26ff05499057c4d5+3a6bc6d39be8f6c3acc50a89c648859f1ee0d638f4969ec4d2cab6c7135518c2/w=1_build.log",
>>                 "command": "COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64",
>>                 "input_files": [
>>                     {
>>                         "name": "make.cross",
>>                         "url": "https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross"
>>                     },
>>                     {
>>                         "name": "instructions.txt",
>>                         "url": "https://01.org/0day/c46ed28dbe95844c1d15addd26ff05499057c4d5+3a6bc6d39be8f6c3acc50a89c648859f1ee0d638f4969ec4d2cab6c7135518c2/w=1_repro.txt"
>>                     }
>>                 ],
>>                 "valid": false
>>             }
>>         ],
>>         "version": {
>>             "major": 3,
>>             "minor": 0
>>         }
>>     }
>>
>> KCIDB also doesn't support non-runtime tests for compiled kernels (such as
>> size regression tests you're running), and we should add that, but meanwhile
>> we can accommodate them as tests without "environments".
>>
>> This one:
>>
>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/PX2O2OJZT2TZ7SU4VUB5ODM4KRBPTXD7/
>>
>> Could look like this:
>>
>>     {
>>         "revisions": [
>>             {
>>                 "id": "aa63af1b08246bd31b77d056bf1d47f775cecbe2",
>>                 "origin": "0day",
>>                 "discovery_time": "2020-07-17T14:41:52+03:00",
>>                 "git_repository_url": "https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git",
>>                 "git_repository_commit_hash": "aa63af1b08246bd31b77d056bf1d47f775cecbe2",
>>                 "git_repository_branch": "mount_setattr",
>>                 "contacts": [
>>                     "Christian Brauner ",
>>                     "kbuild-all "
>>                 ],
>>                 "valid": true
>>             }
>>         ],
>>         "builds": [
>>             {
>>                 "id": "0day:aa63af1b08246bd31b77d056bf1d47f775cecbe2",
>>                 "origin": "0day",
>>                 "revision_id": "aa63af1b08246bd31b77d056bf1d47f775cecbe2",
>>                 "start_time": "2020-07-17T14:41:52+03:00",
>>                 "valid": true
>>             }
>>         ],
>>         "tests": [
>>             {
>>                 "id": "0day:aa63af1b08246bd31b77d056bf1d47f775cecbe2:size",
>>                 "origin": "0day",
>>                 "build_id": "0day:aa63af1b08246bd31b77d056bf1d47f775cecbe2",
>>                 "path": "size_regression",
>>                 "output_files": [
>>                     {
>>                         "name": "details.txt",
>>                         "url": "https://01.org/0day/aa63af1b08246bd31b77d056bf1d47f775cecbe2/size_regression_details.log"
>>                     }
>>                 ],
>>                 "start_time": "2020-07-17T14:41:52+03:00",
>>                 "status": "FAIL",
>>                 "waived": false
>>             }
>>         ],
>>         "version": {
>>             "major": 3,
>>             "minor": 0
>>         }
>>     }
>>
>> Since you don't provide any build
>> information in that report, the build object
>> doesn't have any data. However, that's still valid according to the current
>> schema.
> If the commit from test result is same as build report's commit, does it mean
> when uploading test result, we need figure out the revision's id and builds' id
> to be part of test result's JSON, if they have been uploaded during build result?
>
>>
>> Finally, this runtime test failure:
>>
>> https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/ULKTFB6NGLORWH2WLEKFSFEQFSIWLT5F/
>>
>> you can report like this:
>>
>>     {
>>         "revisions": [
>>             {
>>                 "id": "5155be9994e557618a8312389fb4e52dfbf28a3c",
>>                 "origin": "0day",
>>                 "discovery_time": "2020-07-17T09:04:55+03:00",
>>                 "git_repository_url": "https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git",
>>                 "git_repository_commit_hash": "5155be9994e557618a8312389fb4e52dfbf28a3c",
>>                 "git_repository_branch": "master",
>>                 "contacts": [
>>                     "Paul E. McKenney ",
>>                     "LKP "
>>                 ],
>>                 "valid": true
>>             }
>>         ],
>>         "builds": [
>>             {
>>                 "id": "0day:5155be9994e557618a8312389fb4e52dfbf28a3c",
>>                 "origin": "0day",
>>                 "revision_id": "5155be9994e557618a8312389fb4e52dfbf28a3c",
>>                 "start_time": "2020-07-17T09:04:55+03:00",
>>                 "architecture": "i386",
>>                 "command": "make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage",
>>                 "config_url": "https://01.org/0day/5155be9994e557618a8312389fb4e52dfbf28a3c/config",
>>                 "valid": true
>>             }
>>         ],
>>         "tests": [
>>             {
>>                 "id": "0day:5155be9994e557618a8312389fb4e52dfbf28a3c:trinity",
>>                 "origin": "0day",
>>                 "build_id": "0day:5155be9994e557618a8312389fb4e52dfbf28a3c",
>>                 "path": "trinity",
>>                 "output_files": [
>>                     {
>>                         "name": "dmesg.xz",
>>                         "url": "https://01.org/0day/5155be9994e557618a8312389fb4e52dfbf28a3c/dmesg.xz"
>>                     },
>>                     {
>>                         "name": "details.txt",
>>                         "url": "https://01.org/0day/5155be9994e557618a8312389fb4e52dfbf28a3c/trinity_details.log"
>>                     }
>>                 ],
>>                 "environment": {
>>                     "description": "qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 16G"
>>                 },
>>                 "start_time": "2020-07-17T09:04:55+03:00",
>>                 "status": "FAIL",
>>                 "waived": false
>>             }
>>         ],
>>         "version": {
>>             "major": 3,
>>             "minor": 0
>>         }
>>     }
>>
>> Note the "contacts" field all revisions have: this will help us determine who
>> to send the reports to.
> is this a feature to be triggered by the db to send reports? As 0day sends
> mail to notify the bad commit to related contacts to be aware of the problem.
>
>>
>> Perhaps we need to add support for test input files to accommodate your
>> reproduction instructions and custom scripts.
>>
>> I have omitted some fields I could've added, and we need to improve the schema
>> to accommodate your reports better, of course.
>>
>> However, if you'd be interested, we could help you set up forwarding your
>> reports to KernelCI. You can start very simple and small, as the schema only
>> requires a handful of fields. This will help us see your needs: what data you
>> want in reports and on the dashboard, how many reports you want to push (both
>> positive and negative), etc.
> Thanks for the offering, the aggregation is a nice feature, but we are fully
> occupied by planned effort and it's hard to do actual implementation level thing
> in short term. Meanwhile I'd like to know more about it, and look fwd to the
> hacking session if i could attend it.
>
>>
>> Don't hesitate to write with questions, suggestions, and hope to "see" you at
>> this year's Plumbers, where we hopefully will be presenting more about this
>> effort. I'll also be writing a separate article introducing the schema this
>> week, will copy both you and Dmitry here.
>>
>> Nick
>>
>> On 7/13/20 3:19 AM, Philip Li wrote:
>>> On Thu, Jul 09, 2020 at 10:05:04PM +0300, Nikolai Kondrashov wrote:
>>>> Hi Philip,
>>>>
>>>> On 7/9/20 2:00 PM, Philip Li wrote:
>>>>> The 0-day ci mostly focus on regression and then bisection, and the strategy
>>>>> is to merge various branches to run the test.
>>>>> This is not exactly as the
>>>>> traditional CI. The worry here is to know exactly one branch is pass or
>>>>> fail currently is not 100% available. For instance, the final merged branch
>>>>> is fail doesn't provide fail/pass info of each individual branch. This triggers
>>>>> bisection to kick out the bad branch.
>>>>>
>>>>> Then it need redo the testing of remaining ones, which is not always
>>>>> feasible for us (considering the computing power).
>>>>
>>>> Yes, I think everyone here could sympathise with limited hardware resources :)
>>>>
>>>>> Especially, sometimes the bisection would fail.
>>>>
>>>> Can you give an example of how a bisection would fail?
>>>> Would that be a flaky test failing on a previously assumed-good commit, for
>>>> example?
>>> thanks, one example is build issue that breaks the bisectability which can
>>> lead to bisect fail.
>>>
>>>>
>>>>> As we focus on regression a lot to bisect to first bad commit, there would
>>>>> be uncertainty to draw conclusion for single branch.
>>>>
>>>> I think not having complete certainty for a project as large as the Linux
>>>> kernel is normal. Kernel CI has the bisection system as well, and syzbot is
>>>> going to even greater lengths with identifying similar failures. We at CKI
>>>> have test maintainers constantly looking at test failures and deciding whether
>>>> they're false or not. These are things we just have to handle for common
>>>> reporting to work.
>>>>
>>>>> This requires more careful thinking for us without increasing the needs of
>>>>> computing resource. This is one bottleneck I can see so far. Not sure any
>>>>> idea or recommendation for this.
>>>>
>>>> Our aim with common reporting is simply to provide a unified way to reach
>>>> developers with testing results, essentially to send them a single e-mail
>>>> report, instead of one report per CI system, to make a single database
>>>> available for analysis and a single dashboard UI.
>>>>
>>>> I.e.
>>>> instead of sending an e-mail report to a developer we ask you to send a
>>>> JSON report to us, and then we try to handle analyzing and reporting for you.
>>>>
>>>> It is up to the submitting CI system to choose how many or how few tests
>>>> to run, and how much or how little data to send. Kernel CI is not going to ask
>>>> you to run any tests, it is up to you.
>>>>
>>>> In the end, we trust you want the developers to notice and fix the problems
>>>> you find, you'll try to provide enough data, and we'd like to make a system
>>>> which will help you do that. If you can pinpoint the exact commit - great! If
>>>> not, we'll just have some data from you which can be analyzed otherwise.
>>>>
>>>> How about I try to take a 0-day report and express it as a KCIDB submission,
>>>> as an illustration of how this could work? Would that help you understand what
>>>> we're trying to do? If yes, could you give me a link to one?
>>> Right, the "accuracy" for all single branch we test is more related to technical
>>> problem if we need look for a way to solve it. Here assume we have a report, it
>>> does have chance to be aggregated. You can pick up any link from
>>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/ (build) or
>>> https://lists.01.org/hyperkitty/list/lkp@lists.01.org/ (runtime) for example.
>>>
>>> Thanks
>>>
>>>>
>>>> Nick
>>>>
>>>
>>
>