* Contributing ARM tests results to KCIDB
@ 2020-09-17 12:50 cristian.marussi
  2020-09-17 13:52 ` Nikolai Kondrashov
  0 siblings, 1 reply; 23+ messages in thread
From: cristian.marussi @ 2020-09-17 12:50 UTC (permalink / raw)
  To: Nikolai.Kondrashov; +Cc: broonie, basil.eljuse, cristian.marussi, kernelci

Hi Nikolai,

I work at ARM in the Kernel team and, in short, we'd certainly like to
contribute our internal Kernel test results to KCIDB.

After attending your LPC2020 TestMC and KernelCI BoF, I've now cooked up
a KCIDB JSON test report (seemingly valid against your KCIDB v3 schema)
and I'd like to start experimenting with kcidb-submit (on non-production
instances), so as to assess how to fit our results into your schema and maybe
contribute some new KCIDB requirements if strictly needed.
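
To give a rough idea of its shape, here is a heavily stripped-down sketch of
what my generator script emits (field names and IDs are approximate
placeholders; the authoritative reference is of course your published v3
schema):

    import json

    # Rough, illustrative sketch only: the origin prefix and the IDs below are
    # placeholders, and the authoritative field list is the KCIDB v3 schema.
    report = {
        "version": {"major": 3, "minor": 0},
        "revisions": [{
            "id": "myorigin:example-revision",   # "<origin>:"-prefixed ID
            "origin": "myorigin",
        }],
        "builds": [{
            "id": "myorigin:example-build",
            "revision_id": "myorigin:example-revision",
            "origin": "myorigin",
            "valid": True,
        }],
        "tests": [],
    }

    # Dumped to a file and then fed to kcidb-submit.
    with open("report.json", "w") as out:
        json.dump(report, out, indent=4)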

Is it possible to get some valid credentials and a playground instance to
point at?

Thanks

Regards

Cristian

* Re: Contributing ARM tests results to KCIDB
  2020-09-17 12:50 Contributing ARM tests results to KCIDB cristian.marussi
@ 2020-09-17 13:52 ` Nikolai Kondrashov
  2020-09-17 16:22   ` Cristian Marussi
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-09-17 13:52 UTC (permalink / raw)
  To: kernelci, cristian.marussi; +Cc: broonie, basil.eljuse

Hi Cristian,

On 9/17/20 3:50 PM, Cristian Marussi wrote:
 > Hi Nikolai,
 >
 > I work at ARM in the Kernel team and, in short, we'd certainly like to
 > contribute our internal Kernel test results to KCIDB.

Wonderful!

 > After attending your LPC2020 TestMC and KernelCI BoF, I've now cooked up
 > a KCIDB JSON test report (seemingly valid against your KCIDB v3 schema)
 > and I'd like to start experimenting with kcidb-submit (on non-production
 > instances), so as to assess how to fit our results into your schema and maybe
 > contribute some new KCIDB requirements if strictly needed.

Great, this is exactly what we need, welcome aboard :)

Please don't hesitate to reach out on kernelci@groups.io or on #kernelci on
freenode.net, if you have any questions, problems, or requirements.

 > Is it possible to get some valid credentials and a playground instance to
 > point at?

Absolutely, I created credentials for you and sent them in a separate message.

You can use the origin "arm" for a start, unless you have multiple CI systems
and want to differentiate them somehow in your reports.
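
Just to illustrate what I mean (a rough sketch, not an exact schema excerpt):
the origin is expected to appear both as the "origin" property of each object
and as the prefix of its ID, e.g.:

    # Sketch only (IDs made up for illustration): the origin appears as the
    # "origin" property and as the prefix of every object ID.
    revision = {
        "id": "arm:example-revision",
        "origin": "arm",
    }
    build = {
        "id": "arm:example-build",
        "revision_id": revision["id"],
        "origin": "arm",
    }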

Nick


* Re: Contributing ARM tests results to KCIDB
  2020-09-17 13:52 ` Nikolai Kondrashov
@ 2020-09-17 16:22   ` Cristian Marussi
  2020-09-17 17:26     ` Nikolai Kondrashov
  0 siblings, 1 reply; 23+ messages in thread
From: Cristian Marussi @ 2020-09-17 16:22 UTC (permalink / raw)
  To: Nikolai Kondrashov; +Cc: kernelci, broonie, basil.eljuse

On Thu, Sep 17, 2020 at 04:52:30PM +0300, Nikolai Kondrashov wrote:
> Hi Cristian,
> 
> On 9/17/20 3:50 PM, Cristian Marussi wrote:
> > Hi Nikolai,
> >
> > I work at ARM in the Kernel team and, in short, we'd certainly like to
> > contribute our internal Kernel test results to KCIDB.
> 
> Wonderful!
> 
> > After attending your LPC2020 TestMC and KernelCI BoF, I've now cooked up
> > a KCIDB JSON test report (seemingly valid against your KCIDB v3 schema)
> > and I'd like to start experimenting with kcidb-submit (on non-production
> > instances), so as to assess how to fit our results into your schema and maybe
> > contribute some new KCIDB requirements if strictly needed.
> 
> Great, this is exactly what we need, welcome aboard :)
> 
> Please don't hesitate to reach out on kernelci@groups.io or on #kernelci on
> freenode.net, if you have any questions, problems, or requirements.
> 
> > Is it possible to get some valid credentials and a playground instance to
> > point at?
> 
> Absolutely, I created credentials for you and sent them in a separate message.
> 
> You can use the origin "arm" for a start, unless you have multiple CI systems
> and want to differentiate them somehow in your reports.
> 
> Nick
> 
Thanks!

It works too ... :D 

https://staging.kernelci.org:3000/d/build/build?orgId=1&var-dataset=playground_kernelci04&var-id=arm:2020-07-07:d3d7689c2cc9503266cac3bc777bb4ddae2e5f2e

A quick question, though: given that I'll now have to play with this quite a bit
to see how best to present our data, whether anything is missing, etc., is there
any chance (or way) that, if I submit the same JSON report multiple times with
slight differences here and there (but with the same IDs, clearly), my DB gets
updated in the bits I have changed? As an example, I've just resubmitted the
same report with discovery_time and descriptions added, and got NO errors, but
I cannot see the changes in the UI (unless they still have to propagate). Or
maybe I can obtain the same effect by dropping my dataset before re-submitting?

Regards

Thanks

Cristian


* Re: Contributing ARM tests results to KCIDB
  2020-09-17 16:22   ` Cristian Marussi
@ 2020-09-17 17:26     ` Nikolai Kondrashov
  2020-09-18 15:21       ` Cristian Marussi
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-09-17 17:26 UTC (permalink / raw)
  To: Cristian Marussi; +Cc: kernelci, broonie, basil.eljuse

On 9/17/20 7:22 PM, Cristian Marussi wrote:
 > It works too ... :D
 >
 > https://staging.kernelci.org:3000/d/build/build?orgId=1&var-dataset=playground_kernelci04&var-id=arm:2020-07-07:d3d7689c2cc9503266cac3bc777bb4ddae2e5f2e

Whoa, awesome!

And you have already uncovered a few issues we need to fix, too!
I will deal with them tomorrow.

 > A quick question, though: given that I'll now have to play with this quite a bit
 > to see how best to present our data, whether anything is missing, etc., is there
 > any chance (or way) that, if I submit the same JSON report multiple times with
 > slight differences here and there (but with the same IDs, clearly), my DB gets
 > updated in the bits I have changed? As an example, I've just resubmitted the
 > same report with discovery_time and descriptions added, and got NO errors, but
 > I cannot see the changes in the UI (unless they still have to propagate). Or
 > maybe I can obtain the same effect by dropping my dataset before re-submitting?

Right now this is not supported (with various possible quirks if attempted).
So, preferably, submit only one complete and final instance of each object
(with a unique ID) for now.

We have a plan to support merging missing properties across multiple reported
objects with the same ID.

             Object A        Object B    Dashboard/Notifications

FieldX:     Foo             Foo         Foo
FieldY:                     Bar         Bar
FieldZ:     Baz                         Baz
FieldU:     Red             Blue        Red/Blue

Since we're using a distributed database we cannot really maintain order
(without introducing an artificial global lock), so the order of the reports
doesn't matter. We can only guarantee that a present value will override a
missing value; which value gets picked among multiple different values is
undefined.
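
Expressed as code, the merge rule we have in mind would behave roughly like
this (purely an illustrative sketch, nothing like it is implemented yet):

    # Illustrative sketch of the planned merge semantics: a present value fills
    # in a missing one; for conflicting values the winner is effectively
    # undefined, because the order the reports arrive in is not guaranteed.
    def merge_objects(first, second):
        merged = dict(second)
        for key, value in first.items():
            if value is not None:
                merged[key] = value
        return merged

    object_a = {"FieldX": "Foo", "FieldZ": "Baz", "FieldU": "Red"}
    object_b = {"FieldX": "Foo", "FieldY": "Bar", "FieldU": "Blue"}

    print(merge_objects(object_a, object_b))  # FieldU ends up "Red"
    print(merge_objects(object_b, object_a))  # FieldU ends up "Blue"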

This would allow gradual reporting of each object, but no editing, sorry.

However, once again, this is only a plan with some research behind it.
I intend to start implementing it within a few weeks.

Nick


* Re: Contributing ARM tests results to KCIDB
  2020-09-17 17:26     ` Nikolai Kondrashov
@ 2020-09-18 15:21       ` Cristian Marussi
  2020-09-18 15:30         ` Nikolai Kondrashov
  0 siblings, 1 reply; 23+ messages in thread
From: Cristian Marussi @ 2020-09-18 15:21 UTC (permalink / raw)
  To: Nikolai Kondrashov; +Cc: kernelci, broonie, basil.eljuse

Hi Nikolai,

On Thu, Sep 17, 2020 at 08:26:15PM +0300, Nikolai Kondrashov wrote:
> On 9/17/20 7:22 PM, Cristian Marussi wrote:
> > It works too ... :D
> >
> > https://staging.kernelci.org:3000/d/build/build?orgId=1&var-dataset=playground_kernelci04&var-id=arm:2020-07-07:d3d7689c2cc9503266cac3bc777bb4ddae2e5f2e
> 
> Whoa, awesome!
> 
> And you have already uncovered a few issues we need to fix, too!
> I will deal with them tomorrow.
> 
> > A quick question, though: given that I'll now have to play with this quite a bit
> > to see how best to present our data, whether anything is missing, etc., is there
> > any chance (or way) that, if I submit the same JSON report multiple times with
> > slight differences here and there (but with the same IDs, clearly), my DB gets
> > updated in the bits I have changed? As an example, I've just resubmitted the
> > same report with discovery_time and descriptions added, and got NO errors, but
> > I cannot see the changes in the UI (unless they still have to propagate). Or
> > maybe I can obtain the same effect by dropping my dataset before re-submitting?
> 
> Right now this is not supported (with various possible quirks if attempted).
> So, preferably, submit only one complete and final instance of each object
> (with a unique ID) for now.
> 
> We have a plan to support merging missing properties across multiple reported
> objects with the same ID.
> 
>             Object A        Object B    Dashboard/Notifications
> 
> FieldX:     Foo             Foo         Foo
> FieldY:                     Bar         Bar
> FieldZ:     Baz                         Baz
> FieldU:     Red             Blue        Red/Blue
> 
> Since we're using a distributed database we cannot really maintain order
> (without introducing an artificial global lock), so the order of the reports
> doesn't matter. We can only guarantee that a present value will override a
> missing value; which value gets picked among multiple different values is
> undefined.
> 
> This would allow gradual reporting of each object, but no editing, sorry.
> 
> However, once again, this is only a plan with some research behind it.
> I intend to start implementing it within a few weeks.
> 

So, in order to carry on with my experiments, I've just tried to push a new dataset
with a few changes in my data layout to mimic what I see other origins doing; it
contained something like 38 builds across 4 different revisions (with brand-new
revision IDs), but I cannot see anything in the UI: I just keep seeing the old
push from yesterday.

The JSON seems valid and kcidb-submit does not report any error, even using -l DEBUG
(I pushed more than 30 minutes ago).

Any idea?

Thanks

Cristian


* Re: Contributing ARM tests results to KCIDB
  2020-09-18 15:21       ` Cristian Marussi
@ 2020-09-18 15:30         ` Nikolai Kondrashov
  2020-09-18 15:53           ` Nikolai Kondrashov
  2020-09-18 16:06           ` Cristian Marussi
  0 siblings, 2 replies; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-09-18 15:30 UTC (permalink / raw)
  To: kernelci, cristian.marussi; +Cc: broonie, basil.eljuse

On 9/18/20 6:21 PM, Cristian Marussi wrote:
 > So, in order to carry on with my experiments, I've just tried to push a new dataset
 > with a few changes in my data layout to mimic what I see other origins doing; it
 > contained something like 38 builds across 4 different revisions (with brand-new
 > revision IDs), but I cannot see anything in the UI: I just keep seeing the old
 > push from yesterday.
 >
 > The JSON seems valid and kcidb-submit does not report any error, even using -l DEBUG
 > (I pushed more than 30 minutes ago).
 >
 > Any idea?

Yes, I think it's one of the problems you uncovered :)

The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
database on the backend doesn't understand some of them. In particular it
doesn't understand the date-only timestamps you send. E.g. "2020-09-13".
That's what I wanted to fix today, but ran out of time.

Additionally, the backend doesn't have a way to report a problem to the
submitter at the moment. We intend to fix that, but for now it's possible only
through us looking at the logs and sending a message to the submitter :)

To work around this you can pad your date-only timestamps with dummy time and
timezone data.

E.g. instead of sending:

     2020-09-13

you can send:

     2020-09-13 00:00:00+00:00

Hopefully that's the only problem. It could be, since you managed to send data
before :)

Nick


* Re: Contributing ARM tests results to KCIDB
  2020-09-18 15:30         ` Nikolai Kondrashov
@ 2020-09-18 15:53           ` Nikolai Kondrashov
  2020-09-18 16:42             ` Cristian Marussi
  2020-09-18 16:06           ` Cristian Marussi
  1 sibling, 1 reply; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-09-18 15:53 UTC (permalink / raw)
  To: kernelci, cristian.marussi; +Cc: broonie, basil.eljuse

On 9/18/20 6:30 PM, Nikolai Kondrashov wrote:
 > Yes, I think it's one of the problems you uncovered :)
 >
 > The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
 > database on the backend doesn't understand some of them. In particular it
 > doesn't understand the date-only timestamps you send. E.g. "2020-09-13".
 > That's what I wanted to fix today, but ran out of time.

Looking at this more, it seems that Python's jsonschema module simply doesn't
enforce the requirements we put on those fields 🤦. You can send essentially
whatever you want and then hit BigQuery, which is serious about them.

Sorry about that.

I opened an issue for this: https://github.com/kernelci/kcidb/issues/108

For now please just make sure your timestamps comply with RFC3339.

You can produce such a timestamp e.g. using "date --rfc-3339=s".
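
Or, if you generate your reports from Python, something along these lines
should do (just a sketch, adjust as needed):

    from datetime import datetime, timezone

    # Full timestamp with an explicit UTC offset, e.g. "2020-09-18 18:53:28+00:00"
    # (use sep="T" if you prefer the stricter RFC3339 form).
    timestamp = datetime.now(timezone.utc).isoformat(sep=" ", timespec="seconds")
    print(timestamp)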

Nick


* Re: Contributing ARM tests results to KCIDB
  2020-09-18 15:30         ` Nikolai Kondrashov
  2020-09-18 15:53           ` Nikolai Kondrashov
@ 2020-09-18 16:06           ` Cristian Marussi
  1 sibling, 0 replies; 23+ messages in thread
From: Cristian Marussi @ 2020-09-18 16:06 UTC (permalink / raw)
  To: Nikolai Kondrashov; +Cc: kernelci, broonie, basil.eljuse

Hi Nikolai,

On Fri, Sep 18, 2020 at 06:30:30PM +0300, Nikolai Kondrashov wrote:
> On 9/18/20 6:21 PM, Cristian Marussi wrote:
> > So, in order to carry on with my experiments, I've just tried to push a new dataset
> > with a few changes in my data layout to mimic what I see other origins doing; it
> > contained something like 38 builds across 4 different revisions (with brand-new
> > revision IDs), but I cannot see anything in the UI: I just keep seeing the old
> > push from yesterday.
> >
> > The JSON seems valid and kcidb-submit does not report any error, even using -l DEBUG
> > (I pushed more than 30 minutes ago).
> >
> > Any idea?
> 
> Yes, I think it's one of the problems you uncovered :)
> 
> The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
> database on the backend doesn't understand some of them. In particular it
> doesn't understand the date-only timestamps you send. E.g. "2020-09-13".

Ah damn, I was in fact dubious about that, but I'll add the full timestamp.

> That's what I wanted to fix today, but ran out of time.
> 
> Additionally, the backend doesn't have a way to report a problem to the
> submitter at the moment. We intend to fix that, but for now it's possible only
> through us looking at the logs and sending a message to the submitter :)
> 

Doesn't sound like much fun :D

> To work around this you can pad your date-only timestamps with dummy time and
> timezone data.
> 
> E.g. instead of sending:
> 
>     2020-09-13
> 
> you can send:
> 
>     2020-09-13 00:00:00+00:00
> 
> Hopefully that's the only problem. It could be, since you managed to send data
> before :)
> 
Great, it works now, as you advised, with a dummy time component added!

Thanks

Cristian


* Re: Contributing ARM tests results to KCIDB
  2020-09-18 15:53           ` Nikolai Kondrashov
@ 2020-09-18 16:42             ` Cristian Marussi
  2020-09-18 16:57               ` Nikolai Kondrashov
  2020-11-05 18:46               ` Cristian Marussi
  0 siblings, 2 replies; 23+ messages in thread
From: Cristian Marussi @ 2020-09-18 16:42 UTC (permalink / raw)
  To: kernelci, Nikolai.Kondrashov; +Cc: broonie, basil.eljuse

Hi Nick,

On Fri, Sep 18, 2020 at 06:53:28PM +0300, Nikolai Kondrashov wrote:
> On 9/18/20 6:30 PM, Nikolai Kondrashov wrote:
> > Yes, I think it's one of the problems you uncovered :)
> >
> > The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
> > database on the backend doesn't understand some of them. In particular it
> > doesn't understand the date-only timestamps you send. E.g. "2020-09-13".
> > That's what I wanted to fix today, but ran out of time.
> 
> Looking at this more, it seems that Python's jsonschema module simply doesn't
> enforce the requirements we put on those fields 🤦. You can send essentially
> whatever you want and then hit BigQuery, which is serious about them.

...in fact, on my side I also check with jsonschema in my script before using kcidb :D
> 
> Sorry about that.
> 

No worries.

> I opened an issue for this: https://github.com/kernelci/kcidb/issues/108
> 
> For now please just make sure your timestamps comply with RFC3339.
> 
> You can produce such a timestamp e.g. using "date --rfc-3339=s".

I'll fix my data on my side anyway, to have the real discovery timestamp.

> 
> Nick
> 

Thanks

Cristian


* Re: Contributing ARM tests results to KCIDB
  2020-09-18 16:42             ` Cristian Marussi
@ 2020-09-18 16:57               ` Nikolai Kondrashov
  2020-11-05 18:46               ` Cristian Marussi
  1 sibling, 0 replies; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-09-18 16:57 UTC (permalink / raw)
  To: kernelci, cristian.marussi; +Cc: broonie, basil.eljuse

On 9/18/20 7:42 PM, Cristian Marussi wrote:
 > Hi Nick,
 >
 > On Fri, Sep 18, 2020 at 06:53:28PM +0300, Nikolai Kondrashov wrote:
 >> On 9/18/20 6:30 PM, Nikolai Kondrashov wrote:
 >>> Yes, I think it's one of the problems you uncovered :)
 >>>
 >>> The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
 >>> database on the backend doesn't understand some of them. In particular it
 >>> doesn't understand the date-only timestamps you send. E.g. "2020-09-13".
 >>> That's what I wanted to fix today, but ran out of time.
 >>
 >> Looking at this more, it seems that Python's jsonschema module simply doesn't
 >> enforce the requirements we put on those fields 🤦. You can send essentially
 >> whatever you want and then hit BigQuery, which is serious about them.
 >
 > ...in fact, on my side I also check with jsonschema in my script before using kcidb :D

Taking a peek into jsonschema's code, it seems it should support verifying
that, but perhaps you need to enable it explicitly?
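
I.e. something along these lines, though this is an untested sketch on my side:

    import jsonschema

    # Untested sketch: "format" assertions such as date-time are only checked
    # when a FormatChecker is passed in explicitly, and even then jsonschema
    # needs an extra package (e.g. strict-rfc3339) installed to actually
    # validate the date-time format.
    def validate_report(report, schema):
        jsonschema.validate(
            instance=report,
            schema=schema,
            format_checker=jsonschema.FormatChecker(),
        )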

 >>
 >> Sorry about that.
 >>
 >
 > No worries.
 >
 >> I opened an issue for this: https://github.com/kernelci/kcidb/issues/108
 >>
 >> For now please just make sure your timestamps comply with RFC3339.
 >>
 >> You can produce such a timestamp e.g. using "date --rfc-3339=s".
 >
 > I'll fix my data on my side anyway, to have the real discovery timestamp.

Great! And glad to hear it worked for you :)

Have a nice weekend!
Nick

 >>>   > Cristian
 >>>   >
 >>>   >> Nick
 >>>   >>
 >>>   >> On 9/17/20 7:22 PM, Cristian Marussi wrote:
 >>>   >>> On Thu, Sep 17, 2020 at 04:52:30PM +0300, Nikolai Kondrashov wrote:
 >>>   >>>> Hi Christian,
 >>>   >>>>
 >>>   >>>> On 9/17/20 3:50 PM, Cristian Marussi wrote:
 >>>   >>>>> Hi Nikolai,
 >>>   >>>>>
 >>>   >>>>> I work at ARM in the Kernel team and, in short, we'd like certainly to
 >>>   >>>>> contribute our internal Kernel test results to KCIDB.
 >>>   >>>>
 >>>   >>>> Wonderful!
 >>>   >>>>
 >>>   >>>>> After having attended your LPC2020 TestMC and KernelCI/BoF, I've now cooked
 >>>   >>>>> up some KCIDB JSON test report (seemingly valid against your KCIDB v3 schema)
 >>>   >>>>> and I'd like to start experimenting with kci-submit (on non-production
 >>>   >>>>> instances), so as to assess how to fit our results into your schema and maybe
 >>>   >>>>> contribute with some new KCIDB requirements if strictly needed.
 >>>   >>>>
 >>>   >>>> Great, this is exactly what we need, welcome aboard :)
 >>>   >>>>
 >>>   >>>> Please don't hesitate to reach out on kernelci@groups.io or on #kernelci on
 >>>   >>>> freenode.net, if you have any questions, problems, or requirements.
 >>>   >>>>
 >>>   >>>>> Is it possible to get some valid credentials and a playground instance to
 >>>   >>>>> point at ?
 >>>   >>>>
 >>>   >>>> Absolutely, I created credentials for you and sent them in a separate message.
 >>>   >>>>
 >>>   >>>> You can use origin "arm" for the start, unless you have multiple CI systems
 >>>   >>>> and want to differentiate them somehow in your reports.
 >>>   >>>>
 >>>   >>>> Nick
 >>>   >>>>
 >>>   >>>    Thanks !
 >>>   >>>
 >>>   >>> It works too ... :D
 >>>   >>>
 >>>   >>> https://staging.kernelci.org:3000/d/build/build?orgId=1&var-dataset=playground_kernelci04&var-id=arm:2020-07-07:d3d7689c2cc9503266cac3bc777bb4ddae2e5f2e
 >>>   >>>
 >>>   >>> ..quick question though....given that now I'll have to play quite a bit
 >>>   >>> with it and see how's better to present our data, if anythinjg missing etc etc,
 >>>   >>> is there any chance (or way) that if I submmit the same JSON report multiple
 >>>   >>> times with slight differences here and there (but with the same IDs clearly)
 >>>   >>> I'll get my DB updated in the bits I have changed: as an example I've just
 >>>   >>> resubmitted the same report with added discovery_time and descriptions, and got
 >>>   >>> NO errors, but I cannot see the changes in the UI (unless they have still to
 >>>   >>> propagate...)..or maybe I can obtain the same effect by dropping my dataset
 >>>   >>> before re-submitting ?
 >>>   >>>
 >>>   >>> Regards
 >>>   >>>
 >>>   >>> Thanks
 >>>   >>>
 >>>   >>> Cristian
 >>>   >>>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-09-18 16:42             ` Cristian Marussi
  2020-09-18 16:57               ` Nikolai Kondrashov
@ 2020-11-05 18:46               ` Cristian Marussi
  2020-11-06 10:35                 ` Nikolai Kondrashov
  2020-12-02  8:05                 ` Nikolai Kondrashov
  1 sibling, 2 replies; 23+ messages in thread
From: Cristian Marussi @ 2020-11-05 18:46 UTC (permalink / raw)
  To: kernelci, Nikolai.Kondrashov; +Cc: broonie, basil.eljuse

Hi Nick,

after last month's few experiments with ARM KCIDB submissions against your
KCIDB staging instance, I was dragged away from this by other work before
I could deploy any real automation on our side to push our daily results to
KCIDB... now I'm back at it, and I'll keep testing our automation against
your KCIDB staging instance for a while before eventually asking you to
move us to production.

Today, though, I realized that I can no longer push data into staging
successfully: even the same test script I used a month ago to push some
new test data seems to fail now (I tested a few different days, and the
JSON validates fine with jsonschema, with proper dates including hours).
I cannot see any of today's test pushes on:

https://staging.kernelci.org:3000/d/home/home?orgId=1&from=now-1y&to=now&refresh=30m&var-origin=arm&var-git_repository_url=All&var-dataset=playground_kernelci04

Auth seems to proceed fine, but I cannot find any submission dated after
the old ~15/18-09-2020 ones. I'm still using the same kcidb-submit tools
version I installed from your GitHub in past months, though.

Do you see any errors on your side that could shed some light on this?

Thanks

Regards

Cristian


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-11-05 18:46               ` Cristian Marussi
@ 2020-11-06 10:35                 ` Nikolai Kondrashov
  2020-12-02  8:05                 ` Nikolai Kondrashov
  1 sibling, 0 replies; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-11-06 10:35 UTC (permalink / raw)
  To: kernelci, cristian.marussi; +Cc: broonie, basil.eljuse

[-- Attachment #1: Type: text/plain, Size: 13952 bytes --]

Hi Cristian,

On 11/5/20 8:46 PM, Cristian Marussi wrote:
 > after past month few experiments on ARM KCIDB submissions against your
 > KCIDB staging instance , I was dragged a bit away from this by other stuff
 > before effectively deploying some real automation on our side to push our
 > daily results to KCIDB...now I'm back at it and I'll keep on testing
 > some automation on our side for a bit against your KCIDB staging instance
 > before asking you to move to production eventually.

Glad to see you returning to this :)

 > But, today I realized, though, that I cannot push anymore data successfully
 > into staging even using the same test script I used one month ago to push
 > some new test data seems to fail now (I tested a few different days and
 > JSON validates fine with jsonschema...with proper dates with hours...)...
 > ...I cannot see any of my today tests' pushes on:
 >
 > https://staging.kernelci.org:3000/d/home/home?orgId=1&from=now-1y&to=now&refresh=30m&var-origin=arm&var-git_repository_url=All&var-dataset=playground_kernelci04
 >
 > Auth seems to proceed fine, but I cannot find any submission dated after
 > the old ~15/18-09-2020 submissions. I'm using the same kci-submit tools
 > version installed past months from your github though.
 >
 > Do you see any errors on your side that can shed a light on this ?

Yeah, I can see your submissions, and they're failing validation because the
timestamps (yeah, them again ^_^) are missing the "T" between the date and
the time, as required by the "date-time" format of the JSON schema
(https://json-schema.org/draft/2019-09/json-schema-validation.html#rfc.section.7.3.1),
which is basically https://tools.ietf.org/html/rfc3339#section-5.6
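
If it helps, a timezone-aware datetime in Python already serializes to the
right shape, "T" included (just a sketch):

    from datetime import datetime, timezone

    # Prints e.g. "2020-11-06T10:35:12.345678+00:00" (RFC3339, "T" included).
    print(datetime.now(timezone.utc).isoformat())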

KCIDB had an issue where we didn't enable validating "formats" in the JSON
schema, partly because the jsonschema package is sneaky about it.

That is fixed in the latest release. You can catch those issues yourself if
you update your kcidb installation, e.g. with:

     pip3 install --user git+https://github.com/kernelci/kcidb.git@v8

I have manually fixed up and attached the first of your recent submissions, so
you can easily see what needs changing.
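
For what it's worth, the gist of the fix-up is just re-serializing the
timestamps with the "T" separator; a rough sketch, assuming the values parse
as ISO 8601 with a space separator:

    from datetime import datetime

    def fix_timestamp(value):
        # fromisoformat() accepts "2020-11-05 18:46:00+00:00" (space-separated);
        # isoformat() re-emits it as "2020-11-05T18:46:00+00:00".
        return datetime.fromisoformat(value).isoformat()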

We still have the server-side error-reporting issue open and queued for fixing
in the next release (https://github.com/kernelci/kcidb/issues/125), so that you
can see such errors yourself; we just didn't have time to get to it yet ^_^.

Nick



[-- Attachment #2: fixed.json.gz --]
[-- Type: application/gzip, Size: 223562 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-11-05 18:46               ` Cristian Marussi
  2020-11-06 10:35                 ` Nikolai Kondrashov
@ 2020-12-02  8:05                 ` Nikolai Kondrashov
  2020-12-02  9:23                   ` Cristian Marussi
  1 sibling, 1 reply; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-12-02  8:05 UTC (permalink / raw)
  To: kernelci, cristian.marussi; +Cc: broonie, basil.eljuse

Hi Cristian,

On 11/5/20 8:46 PM, Cristian Marussi wrote:
 > Hi Nick,
 >
 > after past month few experiments on ARM KCIDB submissions against your
 > KCIDB staging instance , I was dragged a bit away from this by other stuff
 > before effectively deploying some real automation on our side to push our
 > daily results to KCIDB...now I'm back at it and I'll keep on testing
 > some automation on our side for a bit against your KCIDB staging instance
 > before asking you to move to production eventually.

I see your data has been steadily trickling into our playground database and
it looks quite good. Would you like to move to the production instance?

I can review your data for you, we can fix the remaining issues if we find
them, and I can give you the permissions to push to production. Then you will
only need to change the topic you push to from "playground_kernelci_new" to
"kernelci_new".

Nick



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-02  8:05                 ` Nikolai Kondrashov
@ 2020-12-02  9:23                   ` Cristian Marussi
  2020-12-02 10:16                     ` Nikolai Kondrashov
  0 siblings, 1 reply; 23+ messages in thread
From: Cristian Marussi @ 2020-12-02  9:23 UTC (permalink / raw)
  To: kernelci, Nikolai.Kondrashov; +Cc: broonie, basil.eljuse

Hi Nick,

On Wed, Dec 02, 2020 at 10:05:05AM +0200, Nikolai Kondrashov via groups.io wrote:
> Hi Cristian,
> 
> On 11/5/20 8:46 PM, Cristian Marussi wrote:
> > Hi Nick,
> >
> > after past month few experiments on ARM KCIDB submissions against your
> > KCIDB staging instance , I was dragged a bit away from this by other stuff
> > before effectively deploying some real automation on our side to push our
> > daily results to KCIDB...now I'm back at it and I'll keep on testing
> > some automation on our side for a bit against your KCIDB staging instance
> > before asking you to move to production eventually.
> 
> I see your data has been steadily trickling into our playground database and
> it looks quite good. Would you like to move to the production instance?
> 
> I can review your data for you, we can fix the remaining issues if we find
> them, and I can give you the permissions to push to production. Then you will
> only need to change the topic you push to from "playground_kernelci_new" to
> "kernelci_new".

In fact I left a staging instance on our side pushing data to your staging
instance to verify the remaining issues on our end (and there are a couple
of minor ones I spotted that I'd indeed like to fix); moreover, I saw a
little while ago that you're going to switch to schema v4, with some minor
changes around revisions and commit_hashes, so I wanted to conform to that
once it's published (even though you're backward compatible with v3, AFAIU)...

... then I got dragged away from this again this past week :D

In fact my next step (possibly next week), besides my own fixes, would have
been to ask you how to proceed further towards production KCIDB.

Do you want me to stop flooding your staging instance in the meantime (:D),
at least until I'm back at it? I think I have enough data to debug with now
anyway. (I could make a few more checks next week, though.)

If it's just a matter of switching projects (once I've got the extra
permissions from you), please go ahead, and I'll try to finalize everything
on our side next week and move to production.

Thanks for the patience

Cristian


> 
> Nick
> 
> On 11/5/20 8:46 PM, Cristian Marussi wrote:
> > Hi Nick,
> >
> > after past month few experiments on ARM KCIDB submissions against your
> > KCIDB staging instance , I was dragged a bit away from this by other stuff
> > before effectively deploying some real automation on our side to push our
> > daily results to KCIDB...now I'm back at it and I'll keep on testing
> > some automation on our side for a bit against your KCIDB staging instance
> > before asking you to move to production eventually.
> >
> > But, today I realized, though, that I cannot push anymore data successfully
> > into staging even using the same test script I used one month ago to push
> > some new test data seems to fail now (I tested a few different days and
> > JSON validates fine with jsonschema...with proper dates with hours...)...
> > ...I cannot see any of my today tests' pushes on:
> >
> > https://staging.kernelci.org:3000/d/home/home?orgId=1&from=now-1y&to=now&refresh=30m&var-origin=arm&var-git_repository_url=All&var-dataset=playground_kernelci04
> >
> > Auth seems to proceed fine, but I cannot find any submission dated after
> > the old ~15/18-09-2020 submissions. I'm using the same kci-submit tools
> > version installed past months from your github though.
> >
> > Do you see any errors on your side that can shed a light on this ?
> >
> > Thanks
> >
> > Regards
> >
> > Cristian
> >
> > On Fri, Sep 18, 2020 at 05:42:28PM +0100, Cristian Marussi wrote:
> >> Hi Nick,
> >>
> >> On Fri, Sep 18, 2020 at 06:53:28PM +0300, Nikolai Kondrashov wrote:
> >>> On 9/18/20 6:30 PM, Nikolai Kondrashov wrote:
> >>>> Yes, I think it's one of the problems you uncovered :)
> >>>>
> >>>> The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
> >>>> database on the backend doesn't understand some of them. In particular it
> >>>> doesn't understand the date-only timestamps you send. E.g. "2020-09-13".
> >>>> That's what I wanted to fix today, but ran out of time.
> >>>
> >>> Looking at this more it seems that Python's jsonschema module simply doesn't
> >>> enforce the requirements we put on those fields 🤦. You can send essentially
> >>> what you want and then hit BigQuery, which is serious about them.
> >>
> >> ...in fact on my side I check too with jsonschema in my script before using kcidb :D
> >>>
> >>> Sorry about that.
> >>>
> >>
> >> No worries.
> >>
> >>> I opened an issue for this: https://github.com/kernelci/kcidb/issues/108
> >>>
> >>> For now please just make sure your timestamp comply with RFC3339.
> >>>
> >>> You can produce such a timestamp e.g. using "date --rfc-3339=s".
> >>
> >> I'll anyway fix my data on my side too, to have the real discovery timestamp.
> >>
> >>>
> >>> Nick
> >>>
> >>
> >> Thanks
> >>
> >> Cristian
> >>
> >>> On 9/18/20 6:30 PM, Nikolai Kondrashov wrote:
> >>>> On 9/18/20 6:21 PM, Cristian Marussi wrote:
> >>>>   > So in order to carry on my experiments, I've just tried to push a new dataset
> >>>>   > with a few changes in my data-layout to mimic what I see other origins do; this
> >>>>   > contained something like 38 builds across 4 different revisions (with brand new
> >>>>   > revisions IDs), but I cannot see anything on the UI: I just keep seeing the old
> >>>>   > push from yesterday.
> >>>>   >
> >>>>   > JSON seems valid and kcidb-submit does not report any error even using -l DEBUG.
> >>>>   > (I pushed >30mins ago)
> >>>>   >
> >>>>   > Any idea ?
> >>>>
> >>>> Yes, I think it's one of the problems you uncovered :)
> >>>>
> >>>> The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
> >>>> database on the backend doesn't understand some of them. In particular it
> >>>> doesn't understand the date-only timestamps you send. E.g. "2020-09-13".
> >>>> That's what I wanted to fix today, but ran out of time.
> >>>>
> >>>> Additionally, the backend doesn't have a way to report a problem to the
> >>>> submitter at the moment. We intend to fix that, but for now it's possible only
> >>>> through us looking at the logs and sending a message to the submitter :)
> >>>>
> >>>> To work around this you can pad your timestamps with dummy date and time
> >>>> data.
> >>>>
> >>>> E.g. instead of sending:
> >>>>
> >>>>       2020-09-13
> >>>>
> >>>> you can send:
> >>>>
> >>>>       2020-09-13 00:00:00+00:00
> >>>>
> >>>> Hopefully that's the only problem. It could be, since you managed to send data
> >>>> before :)
> >>>>
> >>>> Nick
> >>>>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-02  9:23                   ` Cristian Marussi
@ 2020-12-02 10:16                     ` Nikolai Kondrashov
  2020-12-02 12:01                       ` Cristian Marussi
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-12-02 10:16 UTC (permalink / raw)
  To: Cristian Marussi, kernelci; +Cc: broonie, basil.eljuse

On 12/2/20 11:23 AM, Cristian Marussi wrote:
 > On Wed, Dec 02, 2020 at 10:05:05AM +0200, Nikolai Kondrashov via groups.io wrote:
 >> On 11/5/20 8:46 PM, Cristian Marussi wrote:
 >>> after past month few experiments on ARM KCIDB submissions against your
 >>> KCIDB staging instance , I was dragged a bit away from this by other stuff
 >>> before effectively deploying some real automation on our side to push our
 >>> daily results to KCIDB...now I'm back at it and I'll keep on testing
 >>> some automation on our side for a bit against your KCIDB staging instance
 >>> before asking you to move to production eventually.
 >>
 >> I see your data has been steadily trickling into our playground database and
 >> it looks quite good. Would you like to move to the production instance?
 >>
 >> I can review your data for you, we can fix the remaining issues if we find
 >> them, and I can give you the permissions to push to production. Then you will
 >> only need to change the topic you push to from "playground_kernelci_new" to
 >> "kernelci_new".
 >
 > In fact I left one staging instance on our side to push data on your
 > staging instance to verify remaining issues on our side *and there are a
 > couple of minor ones I spotted that I'd like to fix indeed);

Sure, it's up to you when you decide to switch. However, if you'd like, list
your issues here, and I would be able to tell you if those are important from
KCIDB POV.

Looking at your data, I can only find one serious issue: the test run ("test")
IDs are not unique. E.g. there are 1460 objects with ID "arm:LTP:11" which
use 643 distinct build_id's among them.

The test run IDs should correspond to a single execution of a test. Otherwise
we won't be able to tell them apart. You can send multiple reports containing
test runs ("tests") with the same ID, but that would still mean the same
execution, only repeating the same data, or adding more.

A little more explanation:
https://github.com/kernelci/kcidb/blob/master/SUBMISSION_HOWTO.md#submitting-objects-multiple-times

 From POV of KCIDB, what you're sending now is overwriting the same test runs
over and over, and we can't really tell which one of those objects is the
final version.
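
Just to illustrate, a rough sketch of what a per-execution ID could look like
(the helper and the execution_id argument are placeholders of mine, not
anything from the schema; the only point is that each run gets its own ID):

    # Hypothetical sketch: derive a test run ID that is unique per execution,
    # instead of reusing the per-test-case ID (e.g. "arm:LTP:11") across runs.
    def make_test_run_id(origin, suite, case_no, execution_id):
        # execution_id: whatever uniquely identifies one run on your side
        return "%s:%s:%s:%s" % (origin, suite, case_no, execution_id)

    print(make_test_run_id("arm", "LTP", 11, 78342))  # -> arm:LTP:11:78342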

Aside from that, you might want to add `"valid": true` to your "revision"
objects to indicate they're alright. You never seem to send patched revisions,
so it should always be true for you. Then instead of the blank "Status" field:

     https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=playground_kernelci04&var-id=f0d5c8f71bbb1aa1e98cb1a89adb9d57c04ede3d

you would get a nice green check mark, like this:

     https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=kernelci04&var-id=8af5fe40bd59d8aa26dd76d9971435177aacbfce
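
For example, a minimal sketch of such an object, built as a Python dict before
serializing to JSON (the hash is the one from the first link above; the field
names are only the ones we've discussed here, nothing else is implied):

    # Hypothetical revision object carrying the "valid" flag:
    revision = {
        "origin": "arm",
        "id": "f0d5c8f71bbb1aa1e98cb1a89adb9d57c04ede3d",
        "valid": True,   # serializes to "valid": true in the JSON report
    }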

Finally, at this stage we really need a breadth of data coming from
different CI systems, rather than its depth or precision, so we can understand
the problem at hand better and faster. It would do us no good to concentrate
on just a few, and solidify the design around them. That would make it more
difficult for others to join.

You can refine and add more data afterwards.

 > moreover I saw a little while a go that you're going to switch to schema v4
 > with some minor changes in revisions and commit_hashes so I wanted to
 > conform to that once it's published (even though you're back compatible with
 > v3 AFAIU)....

I would rather you didn't wait for that, as I'm neck deep in research for the
next release right now, and it doesn't seem like it's gonna come out soon.
I'm concentrating on getting our result notifications in a good shape so we
can reach actual kernel developers ASAP.

We can work on upgrading your setup later, when it comes out. And there are
going to be other changes, anyway. So, I'd rather we released early and
iterated.

 > ... then I've got dragged away again from this past week :D
 >
 > In fact my next steps (possibly next week) would have been (beside my fixes)
 > to ask you how to proceed further to production KCIDB.

There's never enough time for everything :)

 > Would you want me to stop flooding your staging instance in the meantime (:D)
 > till I'm back at it at least , I think I have enugh data now to debug anyway.
 > (I could made a few more check next week though)

Don't worry about that, and keep pushing, maybe you'll manage to break it
again and then we can fix it :)

 > If it's just a matter of switching project (once got enhanced permissions
 > from you) please do it, and I'll try to finalize all next week on our
 > side and move to production.

Permission granted! Switch when you feel ready, and don't hesitate to ping me
for another review, if you need it.

Just replace "playground_kernelci_new" topic with "kernelci_new" in your
setup when you're ready.
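
In other words, something along these lines, wherever your submission script
keeps that value (a sketch, assuming it's a plain constant; adjust to however
you actually pass the topic to kcidb-submit on your side):

    # Hypothetical configuration constant in your submission script:
    KCIDB_TOPIC = "playground_kernelci_new"   # while testing against staging
    # KCIDB_TOPIC = "kernelci_new"            # for production, once ready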

 > Thanks for the patience

Thank you for your effort, we need your data :D

Nick



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-02 10:16                     ` Nikolai Kondrashov
@ 2020-12-02 12:01                       ` Cristian Marussi
  2020-12-02 13:38                         ` Nikolai Kondrashov
  2021-03-15  9:00                         ` Nikolai Kondrashov
  0 siblings, 2 replies; 23+ messages in thread
From: Cristian Marussi @ 2020-12-02 12:01 UTC (permalink / raw)
  To: Nikolai Kondrashov; +Cc: kernelci, broonie, basil.eljuse

On Wed, Dec 02, 2020 at 12:16:10PM +0200, Nikolai Kondrashov wrote:
> On 12/2/20 11:23 AM, Cristian Marussi wrote:
> > On Wed, Dec 02, 2020 at 10:05:05AM +0200, Nikolai Kondrashov via groups.io wrote:
> >> On 11/5/20 8:46 PM, Cristian Marussi wrote:
> >>> after past month few experiments on ARM KCIDB submissions against your
> >>> KCIDB staging instance , I was dragged a bit away from this by other stuff
> >>> before effectively deploying some real automation on our side to push our
> >>> daily results to KCIDB...now I'm back at it and I'll keep on testing
> >>> some automation on our side for a bit against your KCIDB staging instance
> >>> before asking you to move to production eventually.
> >>
> >> I see your data has been steadily trickling into our playground database and
> >> it looks quite good. Would you like to move to the production instance?
> >>
> >> I can review your data for you, we can fix the remaining issues if we find
> >> them, and I can give you the permissions to push to production. Then you will
> >> only need to change the topic you push to from "playground_kernelci_new" to
> >> "kernelci_new".
> >
> > In fact I left one staging instance on our side to push data on your
> > staging instance to verify remaining issues on our side *and there are a
> > couple of minor ones I spotted that I'd like to fix indeed);
> 
> Sure, it's up to you when you decide to switch. However, if you'd like, list
> your issues here, and I would be able to tell you if those are important from
> KCIDB POV.
> 
> Looking at your data, I can only find one serious issue: the test run ("test")
> IDs are not unique. E.g. there are 1460 objects with ID "arm:LTP:11" which
> use 643 distinct build_id's among them.
> 
> The test run IDs should correspond to a single execution of a test. Otherwise
> we won't be able to tell them apart. You can send multiple reports containing
> test runs ("tests") with the same ID, but that would still mean the same
> execution, only repeating the same data, or adding more.
> 
> A little more explanation:
> https://github.com/kernelci/kcidb/blob/master/SUBMISSION_HOWTO.md#submitting-objects-multiple-times
> 
> From POV of KCIDB, what you're sending now is overwriting the same test runs
> over and over, and we can't really tell which one of those objects is the
> final version.


Ah, that was exactly what I used to do in my first experiments and then,
looking at the data on the UI, I was dumb enough to decide that I must have got
it wrong, so I started using the test_id instead of the test_execution_id,
because I thought that, anyway, you could recognize the different test
executions of the same test_id by looking at the different build_id each one is
part of (which for us represents the different test suite runs)... but I
suppose this wrong assumption of mine stemmed from the relational data model I
use on our side. I'll fix it.

> 
> Aside from that, you might want to add `"valid": true` to your "revision"
> objects to indicate they're alright. You never seem to send patched revisions,
> so it should always be true for you. Then instead of the blank "Status" field:
> 
>     https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=playground_kernelci04&var-id=f0d5c8f71bbb1aa1e98cb1a89adb9d57c04ede3d
> 
> you would get a nice green check mark, like this:
> 
>     https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=kernelci04&var-id=8af5fe40bd59d8aa26dd76d9971435177aacbfce
> 

Ah, I missed this valid flag on revisions too; I'll fix that.

> Finally, at this stage we really need a breadth of data coming from
> different CI systems, rather than its depth or precision, so we can understand
> the problem at hand better and faster. It would do us no good to concentrate
> on just a few, and solidify the design around them. That would make it more
> difficult for others to join.
> 
> You can refine and add more data afterwards.
> 

Sure. In fact, as of now I still have to ask for some changes in our reporting
backend (which generates the original data stored in our DB and then pushed
to you), so I have to admit the git commit hashes are partially faked (since I
only have a git describe string to start from) and, as a consequence, they
won't really be that useful for comparisons amongst different origins (given
they don't refer to real kernel commits). BUT I thought this was NOT a blocking
problem for now, so that I can start pushing data to KCIDB and then later on
(once I get real full hashes on my side) I'll start pushing the real, valid
ones. Does that sound good?
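
As a side note on how I plan to get the real hashes eventually: since the
`git describe` output is itself a valid revision name, something like the rough
sketch below could resolve it to the full hash, assuming the kernel tree that
produced it is reachable from wherever the report is generated (names here are
just placeholders):

    # Hypothetical helper: resolve a `git describe` string (e.g.
    # "v5.9-123-gd3d7689") back to the full commit hash, given a local tree.
    import subprocess

    def describe_to_hash(describe_str, repo="."):
        return subprocess.check_output(
            ["git", "-C", repo, "rev-parse", describe_str + "^{commit}"],
            text=True,
        ).strip()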


> > moreover I saw a little while a go that you're going to switch to schema v4
> > with some minor changes in revisions and commit_hashes so I wanted to
> > conform to that once it's published (even though you're back compatible with
> > v3 AFAIU)....
> 
> I would rather you didn't wait for that, as I'm neck deep in research for the
> next release right now, and it doesn't seem like it's gonna come out soon.
> I'm concentrating on getting our result notifications in a good shape so we
> can reach actual kernel developers ASAP.
> 
> We can work on upgrading your setup later, when it comes out. And there are
> going to be other changes, anyway. So, I'd rather we released early and
> iterated.
> 

Good, I'll stick to v3.

Side question, for dynamic schema validation purposes: is there any URL
where I can fetch the latest currently valid schema, something like:

https://github.com/kernelci/kcidb/releases/kcidb.latest.schema.json

so that I can check automatically against the latest and greatest instead of
using a built-in, pre-downloaded one (or is that a bad idea in your opinion)?

> > ... then I've got dragged away again from this past week :D
> >
> > In fact my next steps (possibly next week) would have been (beside my fixes)
> > to ask you how to proceed further to production KCIDB.
> 
> There's never enough time for everything :)
> 

eh..

> > Would you want me to stop flooding your staging instance in the meantime (:D)
> > till I'm back at it at least , I think I have enugh data now to debug anyway.
> > (I could made a few more check next week though)
> 
> Don't worry about that, and keep pushing, maybe you'll manage to break it
> again and then we can fix it :)
> 

Fine :D

> > If it's just a matter of switching project (once got enhanced permissions
> > from you) please do it, and I'll try to finalize all next week on our
> > side and move to production.
> 
> Permission granted! Switch when you feel ready, and don't hesitate to ping me
> for another review, if you need it.
> 
> Just replace "playground_kernelci_new" topic with "kernelci_new" in your
> setup when you're ready.
> 

Cool, thanks.

> > Thanks for the patience
> 
> Thank you for your effort, we need your data :D
> 
> Nick
> 

Thank you Nick

Cheers,

Cristian



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-02 12:01                       ` Cristian Marussi
@ 2020-12-02 13:38                         ` Nikolai Kondrashov
  2020-12-10 17:23                           ` Cristian Marussi
  2021-03-15  9:00                         ` Nikolai Kondrashov
  1 sibling, 1 reply; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-12-02 13:38 UTC (permalink / raw)
  To: Cristian Marussi; +Cc: kernelci, broonie, basil.eljuse

On 12/2/20 2:01 PM, Cristian Marussi wrote:
 >>  From POV of KCIDB, what you're sending now is overwriting the same test runs
 >> over and over, and we can't really tell which one of those objects is the
 >> final version.
 >
 >
 > Ah, that was exactly what I used to do in my first experiments and then,
 > looking at the data on the UI, I was dumb enough to decide that I must have got
 > it wrong, so I started using the test_id instead of the test_execution_id,
 > because I thought that, anyway, you could recognize the different test
 > executions of the same test_id by looking at the different build_id each one is
 > part of (which for us represents the different test suite runs)... but I
 > suppose this wrong assumption of mine stemmed from the relational data model I
 > use on our side. I'll fix it.

Yes, that would work, but then we get a "foreign key explosion" as we start
linking to tests from other objects besides builds. So, for now, we're sticking
to the "one ID column per table" policy.

Thanks for bearing with us, and I'm glad to hear you already have
`test_execution_id` in your database, so the fix shouldn't take long :)

 > Sure. In fact, as of now I still have to ask for some changes in our reporting
 > backend (which generates the original data stored in our DB and then pushed
 > to you), so I have to admit the git commit hashes are partially faked (since I
 > only have a git describe string to start from) and, as a consequence, they
 > won't really be that useful for comparisons amongst different origins (given
 > they don't refer to real kernel commits). BUT I thought this was NOT a blocking
 > problem for now, so that I can start pushing data to KCIDB and then later on
 > (once I get real full hashes on my side) I'll start pushing the real, valid
 > ones. Does that sound good?

Yes, no problem. We don't have maintainers/developers to get angry yet :D

I'm looking forward to having four-origin revisions in the dashboard, though,
one more than e.g. this one:

     https://staging.kernelci.org:3000/d/revision/revision?var-id=3650b228f83adda7e5ee532e2b90429c03f7b9ec

 > Side question, for dynamic schema validation purposes: is there any URL
 > where I can fetch the latest currently valid schema, something like:
 >
 > https://github.com/kernelci/kcidb/releases/kcidb.latest.schema.json
 >
 > so that I can check automatically against the latest and greatest instead of
 > using a built-in, pre-downloaded one (or is that a bad idea in your opinion)?

The JSON schemas we generate with `kcidb-schema`, and use inside KCIDB, only
validate *one* major version. So v3 data would only validate with v3 schema,
but not with e.g. v4.

So if you e.g. download and validate against the latest-release schema
automatically, validation will start failing the moment a release with v4
comes out.

Automatic data upgrades between major versions are done in Python whenever we
see a difference between the numbers.

OTOH, minor version bumps of the schema are backwards-compatible, and you
would be fine upgrading validation to those. However, we don't have many of
those at all yet, as we're still changing the schema a lot.

So, I think a reasonable workflow right now is to download and switch to a new
version at the same time you're upgrading your submission code to the next
major release of the schema. You'll need more work on the code than just
switching the schema, anyway.
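
To make that concrete, a rough sketch of the workflow I have in mind, assuming
you dump the schema once at upgrade time (e.g. `kcidb-schema >
kcidb.v3.schema.json`) and keep using the Python jsonschema module you already
have (file names are just placeholders):

    # Validate a report against the pinned v3 schema before submitting.
    import json
    import jsonschema

    with open("kcidb.v3.schema.json") as f:
        schema = json.load(f)
    with open("report.json") as f:
        report = json.load(f)

    jsonschema.validate(instance=report, schema=schema)  # raises on mismatch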

However, let's get back to this further along the way, perhaps we can think of
something smoother and more automated. E.g. set up a way to have automatic
upgrades between minor versions.

Thanks :)
Nick

On 12/2/20 2:01 PM, Cristian Marussi wrote:
 > On Wed, Dec 02, 2020 at 12:16:10PM +0200, Nikolai Kondrashov wrote:
 >> On 12/2/20 11:23 AM, Cristian Marussi wrote:
 >>> On Wed, Dec 02, 2020 at 10:05:05AM +0200, Nikolai Kondrashov via groups.io wrote:
 >>>> On 11/5/20 8:46 PM, Cristian Marussi wrote:
 >>>>> after past month few experiments on ARM KCIDB submissions against your
 >>>>> KCIDB staging instance , I was dragged a bit away from this by other stuff
 >>>>> before effectively deploying some real automation on our side to push our
 >>>>> daily results to KCIDB...now I'm back at it and I'll keep on testing
 >>>>> some automation on our side for a bit against your KCIDB staging instance
 >>>>> before asking you to move to production eventually.
 >>>>
 >>>> I see your data has been steadily trickling into our playground database and
 >>>> it looks quite good. Would you like to move to the production instance?
 >>>>
 >>>> I can review your data for you, we can fix the remaining issues if we find
 >>>> them, and I can give you the permissions to push to production. Then you will
 >>>> only need to change the topic you push to from "playground_kernelci_new" to
 >>>> "kernelci_new".
 >>>
 >>> In fact I left one staging instance on our side to push data on your
 >>> staging instance to verify remaining issues on our side *and there are a
 >>> couple of minor ones I spotted that I'd like to fix indeed);
 >>
 >> Sure, it's up to you when you decide to switch. However, if you'd like, list
 >> your issues here, and I would be able to tell you if those are important from
 >> KCIDB POV.
 >>
 >> Looking at your data, I can only find one serious issue: the test run ("test")
 >> IDs are not unique. E.g. there are 1460 objects with ID "arm:LTP:11" which
 >> use 643 distinct build_id's among them.
 >>
 >> The test run IDs should correspond to a single execution of a test. Otherwise
 >> we won't be able to tell them apart. You can send multiple reports containing
 >> test runs ("tests") with the same ID, but that would still mean the same
 >> execution, only repeating the same data, or adding more.
 >>
 >> A little more explanation:
 >> https://github.com/kernelci/kcidb/blob/master/SUBMISSION_HOWTO.md#submitting-objects-multiple-times
 >>
 >>  From POV of KCIDB, what you're sending now is overwriting the same test runs
 >> over and over, and we can't really tell which one of those objects is the
 >> final version.
 >
 >
 > Ah, that was exactly what I used to do in my first initial experiments and then,
 > looking at the data on the UI, I was dumb enough to decide that I should have got
 > it wrong and I started using the test_id instead of the test_execution_id, because
 > I thought that, anyway, you can recognize the different test executions of the
 > same test_id looking at the different build_id is part of (which for us represent
 > the different test suite runs)....but I suppose this wrong assumption of mine
 > sparked from the relational data model I use on our side. I'll fix it.
 >
 >>
 >> Aside from that, you might want to add `"valid": true` to your "revision"
 >> objects to indicate they're alright. You never seem to send patched revisions,
 >> so it should always be true for you. Then instead of the blank "Status" field:
 >>
 >>      https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=playground_kernelci04&var-id=f0d5c8f71bbb1aa1e98cb1a89adb9d57c04ede3d
 >>
 >> you would get a nice green check mark, like this:
 >>
 >>      https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=kernelci04&var-id=8af5fe40bd59d8aa26dd76d9971435177aacbfce
 >>
 >
 > Ah I missed this valid flag on revision too, I'll fix.
 >
 >> Finally, at this stage we really need a breadth of data coming from
 >> different CI system, rather than its depth or precision, so we can understand
 >> the problem at hand better and faster. It would do us no good to concentrate
 >> on just a few, and solidify the design around them. That would make it more
 >> difficult for others to join.
 >>
 >> You can refine and add more data afterwards.
 >>
 >
 > Sure, in fact, as of now I still have to ask for some changes in our reporting
 > backend, (which generates the original data stored in our DB and then pushed
 > to you), so I have to admit the git commit hash are partially faked (since I
 > have only a git describe string to start from) and as a consequence they won't
 > really be so much useful for comparisons amongst different origins (given
 > they don't refer real kernel commits), BUT I thought this NOT to be a
 > blocking problem for now, so that I can start pushing data to KCIDB and
 > then later on (once I get real full hashes on my side) I'll start pushing the
 > real valid ones, does it sounds good ?
 >
 >
 >>> moreover I saw a little while a go that you're going to switch to schema v4
 >>> with some minor changes in revisions and commit_hashes so I wanted to
 >>> conform to that once it's published (even though you're back compatible with
 >>> v3 AFAIU)....
 >>
 >> I would rather you didn't wait for that, as I'm neck deep in research for the
 >> next release right now, and it doesn't seem like it's gonna come out soon.
 >> I'm concentrating on getting our result notifications in a good shape so we
 >> can reach actual kernel developers ASAP.
 >>
 >> We can work on upgrading your setup later, when it comes out. And there are
 >> going to be other changes, anyway. So, I'd rather we released early and
 >> iterated.
 >>
 >
 > Good I'l stick to v3.
 >
 > Side question...for dynamic schema validation purposes...is there any URL
 > where I can fetch the latest currently valid schema ... something like:
 >
 > https://github.com/kernelci/kcidb/releases/kcidb.latest.schema.json
 >
 > so that I can check automatically against the latest greatest instead of
 > using a builtin predownloaded one (or is it a bad idea in your opinion ?)
 >
 >>> ... then I've got dragged away again from this past week :D
 >>>
 >>> In fact my next steps (possibly next week) would have been (beside my fixes)
 >>> to ask you how to proceed further to production KCIDB.
 >>
 >> There's never enough time for everything :)
 >>
 >
 > eh..
 >
 >>> Would you want me to stop flooding your staging instance in the meantime (:D)
 >>> till I'm back at it at least , I think I have enugh data now to debug anyway.
 >>> (I could made a few more check next week though)
 >>
 >> Don't worry about that, and keep pushing, maybe you'll manage to break it
 >> again and then we can fix it :)
 >>
 >
 > Fine :D
 >
 >>> If it's just a matter of switching projects (once I've got enhanced permissions
 >>> from you) please do it, and I'll try to finalize everything next week on our
 >>> side and move to production.
 >>
 >> Permission granted! Switch when you feel ready, and don't hesitate to ping me
 >> for another review, if you need it.
 >>
 >> Just replace "playground_kernelci_new" topic with "kernelci_new" in your
 >> setup when you're ready.
 >>
 >
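For what it's worth, a sketch of keeping that switch down to a single setting on
the submitter side; the kcidb-submit options and the project name below are
assumptions to be checked against the kcidb SUBMISSION_HOWTO, not confirmed
values:

    import os
    import subprocess

    # "playground_kernelci_new" for staging, "kernelci_new" for production
    topic = os.environ.get("KCIDB_TOPIC", "playground_kernelci_new")
    project = os.environ.get("KCIDB_PROJECT", "<gcp-project>")  # placeholder

    with open("report.json") as report:
        # assumed -p/-t options; verify against the SUBMISSION_HOWTO before use
        subprocess.run(["kcidb-submit", "-p", project, "-t", topic],
                       stdin=report, check=True)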
 > Cool, thanks.
 >
 >>> Thanks for the patience
 >>
 >> Thank you for your effort, we need your data :D
 >>
 >> Nick
 >>
 >
 > Thank you Nick
 >
 > Cheers,
 >
 > Cristian
 >
 >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-02 13:38                         ` Nikolai Kondrashov
@ 2020-12-10 17:23                           ` Cristian Marussi
  2020-12-10 18:17                             ` Nikolai Kondrashov
  0 siblings, 1 reply; 23+ messages in thread
From: Cristian Marussi @ 2020-12-10 17:23 UTC (permalink / raw)
  To: kernelci, Nikolai.Kondrashov; +Cc: broonie, basil.eljuse

Hi Nick

On Wed, Dec 02, 2020 at 03:38:19PM +0200, Nikolai Kondrashov via groups.io wrote:
> On 12/2/20 2:01 PM, Cristian Marussi wrote:
> >> From the POV of KCIDB, what you're sending now is overwriting the same test runs
> >> over and over, and we can't really tell which one of those objects is the
> >> final version.
> >
> >
> > Ah, that was exactly what I used to do in my first experiments, and then,
> > looking at the data on the UI, I was dumb enough to decide that I must have got
> > it wrong, so I started using the test_id instead of the test_execution_id, because
> > I thought that, anyway, you can recognize the different executions of the same
> > test_id by looking at the different build_id each one is part of (which for us
> > represents the different test suite runs)... but I suppose this wrong assumption
> > of mine sparked from the relational data model I use on our side. I'll fix it.
> 
> Yes, that would work, but then we get a "foreign key explosion" as we start
> linking to tests from other objects besides builds. So, for now we're sticking
> to the "one ID column per table" policy.
> 
> Thanks for bearing with us, and I'm glad to hear you already have
> `test_execution_id` in your database, so the fix shouldn't take long :)
> 
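A sketch of the ID scheme this implies, assuming the submitter side already has
a per-execution primary key (the suite name and execution number below are
illustrative, not values from the schema):

    def kcidb_test_id(origin, suite, execution_id):
        # one KCIDB "test" object per execution, keyed on the execution's own ID,
        # e.g. "arm:LTP:84721" instead of reusing the definition ID "arm:LTP:11",
        # so two runs of the same LTP case never collide
        return "{}:{}:{}".format(origin, suite, execution_id)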
> > Sure. In fact, as of now I still have to ask for some changes in our reporting
> > backend (which generates the original data stored in our DB and then pushed
> > to you), so I have to admit the git commit hashes are partially faked (since I
> > only have a git describe string to start from) and as a consequence they won't
> > really be that useful for comparisons amongst different origins (given
> > they don't refer to real kernel commits), BUT I thought this was NOT a
> > blocking problem for now, so that I can start pushing data to KCIDB and
> > then later on (once I get real full hashes on my side) I'll start pushing the
> > real valid ones. Does that sound good?
> 
> Yes, no problem. We don't have maintainers/developers to get angry yet :D
> 
> I'm looking forward to having four-origin revisions in the dashboard, though,
> one more than e.g. this one:
> 
>     https://staging.kernelci.org:3000/d/revision/revision?var-id=3650b228f83adda7e5ee532e2b90429c03f7b9ec
> 

I fixed the issue about the uniqueness of the test IDs, but left the valid
flag on the revisions undefined for now, given that the revision hash is
temporarily faked (as I told you), just to have an indication that the
revision is bogus.
Anyway, I'll have that fixed in our backend soon, and once I start
receiving proper real hashes the system 'should' automatically start
tagging revisions as valid: true.
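A sketch of the kind of check that could drive that tagging, assuming a full
40-hex-digit string is the signal that the backend has started delivering real
commit hashes (the helper and field usage are illustrative, not from the schema):

    import re

    FULL_HASH = re.compile(r"^[0-9a-f]{40}$")

    def revision_valid_fields(git_commit_hash):
        # only claim "valid": True once the hash is a real, full commit hash;
        # otherwise return nothing, so the flag stays undefined and the
        # revision keeps showing up as unknown/bogus
        if FULL_HASH.fullmatch(git_commit_hash or ""):
            return {"valid": True}
        return {}

A revision dict could then be extended with revision.update(revision_valid_fields(h))
before serializing the report.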

> > Side question: for dynamic schema validation purposes, is there any URL
> > where I can fetch the latest currently valid schema... something like:
> >
> > https://github.com/kernelci/kcidb/releases/kcidb.latest.schema.json
> >
> > so that I can check automatically against the latest and greatest instead of
> > using a built-in, pre-downloaded one (or is that a bad idea in your opinion?)
> 
> The JSON schemas we generate with `kcidb-schema`, and use inside KCIDB, only
> validate *one* major version. So v3 data would only validate with v3 schema,
> but not with e.g. v4.
> 
> So if you e.g. download and validate against the latest-release schema
> automatically, validation will start failing the moment a release with v4
> comes out.
> 
> Automatic data upgrades between major versions are done in Python whenever we
> see a difference between the numbers.
> 
> OTOH, minor version bumps of the schema are backwards-compatible, and you
> would be fine upgrading validation to those. However, we don't have many of
> those at all yet, as we're still changing the schema a lot.
> 
> So, I think a reasonable workflow right now is to download and switch to a new
> version at the same time you're upgrading your submission code to the next
> major release of the schema. You'll need more work on the code than just
> switching the schema, anyway.
> 
> However, let's get back to this further along the way, perhaps we can think of
> something smoother and more automated. E.g. set up a way to have automatic
> upgrades between minor versions.
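Concretely, that suggests validating against a pinned v3 schema file shipped next
to the submission code rather than a live "latest" URL; a sketch, assuming the
schema file is whatever `kcidb-schema` emitted for the targeted v3 release and
the report sits in report.json:

    import json
    import jsonschema

    with open("kcidb.v3.schema.json") as f:  # pinned copy generated with kcidb-schema
        schema = json.load(f)

    with open("report.json") as f:
        report = json.load(f)

    # raises jsonschema.exceptions.ValidationError if the report doesn't
    # match the pinned major version of the schema
    jsonschema.validate(instance=report, schema=schema)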

Agreed, using v3 for the moment.

Moreover, after fixing a few more annoyances on my side, today I switched to
KCIDB production and pushed December results; from tomorrow morning it should
start feeding daily data to KCIDB production.

Thanks for the support and patience.

Cristian

> 
> Thanks :)
> Nick
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-10 17:23                           ` Cristian Marussi
@ 2020-12-10 18:17                             ` Nikolai Kondrashov
  2020-12-10 20:19                               ` Cristian Marussi
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-12-10 18:17 UTC (permalink / raw)
  To: Cristian Marussi, kernelci; +Cc: broonie, basil.eljuse

Hi Cristian,

On 12/10/20 7:23 PM, Cristian Marussi wrote:
 > I fixed the issue about the uniqueness of the test IDs, but left the valid
 > flag on the revisions undefined for now, given that the revision hash is
 > temporarily faked (as I told you), just to have an indication that the
 > revision is bogus.
 > Anyway, I'll have that fixed in our backend soon, and once I start
 > receiving proper real hashes the system 'should' automatically start
 > tagging revisions as valid: true.

Good plan!

 > Moreover, after fixing a few more annoyances on my side, today I switched to
 > KCIDB production and pushed December results; from tomorrow morning it should
 > start feeding daily data to KCIDB production.

Woo-hoo! Wonderful, this is a nice Christmas present :)

 > Thanks for the support and patience.

Thank you for your work, Cristian!

I notice a bit of strange data: failed builds have one (failed) boot test
submitted. Is this on purpose? Does it mean something special? Logically, we
can't boot a build if it hasn't completed, can we?

Here's an example:

     https://staging.kernelci.org:3000/d/build/build?orgId=1&var-id=arm:2020-12-08:d6051b14fced47d1983fd70171b9bcd7170491ce
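If that is not intentional, a sketch of a pre-submission filter that would drop
such tests, assuming the report carries the v3 "builds" and "tests" lists with
"valid" on builds and "build_id" on tests:

    def drop_tests_of_invalid_builds(report):
        # keep tests only for builds that completed (valid is True); builds
        # with valid False or missing are treated as not bootable/testable
        valid_builds = {b["id"] for b in report.get("builds", [])
                        if b.get("valid") is True}
        report["tests"] = [t for t in report.get("tests", [])
                           if t.get("build_id") in valid_builds]
        return report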

Nick

On 12/10/20 7:23 PM, Cristian Marussi wrote:
 > Hi Nick
 >
 > On Wed, Dec 02, 2020 at 03:38:19PM +0200, Nikolai Kondrashov via groups.io wrote:
 >> On 12/2/20 2:01 PM, Cristian Marussi wrote:
 >>>>   From POV of KCIDB, what you're sending now is overwriting the same test runs
 >>>> over and over, and we can't really tell which one of those objects is the
 >>>> final version.
 >>>
 >>>
 >>> Ah, that was exactly what I used to do in my first initial experiments and then,
 >>> looking at the data on the UI, I was dumb enough to decide that I should have got
 >>> it wrong and I started using the test_id instead of the test_execution_id, because
 >>> I thought that, anyway, you can recognize the different test executions of the
 >>> same test_id looking at the different build_id is part of (which for us represent
 >>> the different test suite runs)....but I suppose this wrong assumption of mine
 >>> sparked from the relational data model I use on our side. I'll fix it.
 >>
 >> Yes, that would work, but then we get a "foreign key explosion" as we start
 >> linking to tests from other objects beside builds. So, for now we're sticking
 >> to the "one ID column per table" policy.
 >>
 >> Thanks for bearing with us, and am glad to hear you already have
 >> `test_execution_id` in your database, so the fix shouldn't take long :)
 >>
 >>> Sure, in fact, as of now I still have to ask for some changes in our reporting
 >>> backend, (which generates the original data stored in our DB and then pushed
 >>> to you), so I have to admit the git commit hash are partially faked (since I
 >>> have only a git describe string to start from) and as a consequence they won't
 >>> really be so much useful for comparisons amongst different origins (given
 >>> they don't refer real kernel commits), BUT I thought this NOT to be a
 >>> blocking problem for now, so that I can start pushing data to KCIDB and
 >>> then later on (once I get real full hashes on my side) I'll start pushing the
 >>> real valid ones, does it sounds good ?
 >>
 >> Yes, no problem. We don't have maintainers/developers to get angry yet :D
 >>
 >> I'm looking forward to having four-origin revisions in the dashboard, though,
 >> one more than e.g. this one:
 >>
 >>      https://staging.kernelci.org:3000/d/revision/revision?var-id=3650b228f83adda7e5ee532e2b90429c03f7b9ec
 >>
 >
 > I fixed the issue about uniqueness of the tests IDs but left the valid
 > flag on the revision undefined as of now given the revision hash is
 > temporarily faked (as I told you)...just to have an indication that the
 > revision is bogus.
 > Anyway I'll have that fixed in our backend soon, and once I'll start
 > receiving a proper real hash the system 'should' automatically start
 > tagging revisions as valid: True.
 >
 >>> Side question...for dynamic schema validation purposes...is there any URL
 >>> where I can fetch the latest currently valid schema ... something like:
 >>>
 >>> https://github.com/kernelci/kcidb/releases/kcidb.latest.schema.json
 >>>
 >>> so that I can check automatically against the latest greatest instead of
 >>> using a builtin predownloaded one (or is it a bad idea in your opinion ?)
 >>
 >> The JSON schemas we generate with `kcidb-schema`, and use inside KCIDB, only
 >> validate *one* major version. So v3 data would only validate with v3 schema,
 >> but not with e.g. v4.
 >>
 >> So if you e.g. download and validate against the latest-release schema
 >> automatically, validation will start failing the moment a release with v4
 >> comes out.
 >>
 >> Automatic data upgrades between major versions are done in Python whenever we
 >> see a difference between the numbers.
 >>
 >> OTOH, minor version bumps of the schema are backwards-compatible, and you
 >> would be fine upgrading validation to those. However, we don't have many of
 >> those at all yet, as we're still changing the schema a lot.
 >>
 >> So, I think a reasonable workflow right now is to download and switch to a new
 >> version at the same time you're upgrading your submission code to the next
 >> major release of the schema. You'll need more work on the code than just
 >> switching the schema, anyway.
 >>
 >> However, let's get back to this further along the way, perhaps we can think of
 >> something smoother and more automated. E.g. set up a way to have automatic
 >> upgrades between minor versions.
 >
 > Agreed, using v3 for the moment.
 >
 > Moreover, after fixing a few more annoyances on my side, today I switched to
 > KCIDB production and pushed December results; from tomorrow morning it should
 > start feeding daily data to KCIDB production.
 >
 > Thanks for the support and patience.
 >
 > Cristian
 >
 >>
 >> Thanks :)
 >> Nick
 >>
 >> On 12/2/20 2:01 PM, Cristian Marussi wrote:
 >>> On Wed, Dec 02, 2020 at 12:16:10PM +0200, Nikolai Kondrashov wrote:
 >>>> On 12/2/20 11:23 AM, Cristian Marussi wrote:
 >>>>> On Wed, Dec 02, 2020 at 10:05:05AM +0200, Nikolai Kondrashov via groups.io wrote:
 >>>>>> On 11/5/20 8:46 PM, Cristian Marussi wrote:
 >>>>>>> after past month few experiments on ARM KCIDB submissions against your
 >>>>>>> KCIDB staging instance , I was dragged a bit away from this by other stuff
 >>>>>>> before effectively deploying some real automation on our side to push our
 >>>>>>> daily results to KCIDB...now I'm back at it and I'll keep on testing
 >>>>>>> some automation on our side for a bit against your KCIDB staging instance
 >>>>>>> before asking you to move to production eventually.
 >>>>>>
 >>>>>> I see your data has been steadily trickling into our playground database and
 >>>>>> it looks quite good. Would you like to move to the production instance?
 >>>>>>
 >>>>>> I can review your data for you, we can fix the remaining issues if we find
 >>>>>> them, and I can give you the permissions to push to production. Then you will
 >>>>>> only need to change the topic you push to from "playground_kernelci_new" to
 >>>>>> "kernelci_new".
 >>>>>
 >>>>> In fact I left one staging instance on our side to push data on your
 >>>>> staging instance to verify remaining issues on our side *and there are a
 >>>>> couple of minor ones I spotted that I'd like to fix indeed);
 >>>>
 >>>> Sure, it's up to you when you decide to switch. However, if you'd like, list
 >>>> your issues here, and I would be able to tell you if those are important from
 >>>> KCIDB POV.
 >>>>
 >>>> Looking at your data, I can only find one serious issue: the test run ("test")
 >>>> IDs are not unique. E.g. there are 1460 objects with ID "arm:LTP:11" which
 >>>> use 643 distinct build_id's among them.
 >>>>
 >>>> The test run IDs should correspond to a single execution of a test. Otherwise
 >>>> we won't be able to tell them apart. You can send multiple reports containing
 >>>> test runs ("tests") with the same ID, but that would still mean the same
 >>>> execution, only repeating the same data, or adding more.
 >>>>
 >>>> A little more explanation:
 >>>> https://github.com/kernelci/kcidb/blob/master/SUBMISSION_HOWTO.md#submitting-objects-multiple-times
 >>>>
 >>>>   From POV of KCIDB, what you're sending now is overwriting the same test runs
 >>>> over and over, and we can't really tell which one of those objects is the
 >>>> final version.
 >>>
 >>>
 >>> Ah, that was exactly what I used to do in my first initial experiments and then,
 >>> looking at the data on the UI, I was dumb enough to decide that I should have got
 >>> it wrong and I started using the test_id instead of the test_execution_id, because
 >>> I thought that, anyway, you can recognize the different test executions of the
 >>> same test_id looking at the different build_id is part of (which for us represent
 >>> the different test suite runs)....but I suppose this wrong assumption of mine
 >>> sparked from the relational data model I use on our side. I'll fix it.
 >>>
 >>>>
 >>>> Aside from that, you might want to add `"valid": true` to your "revision"
 >>>> objects to indicate they're alright. You never seem to send patched revisions,
 >>>> so it should always be true for you. Then instead of the blank "Status" field:
 >>>>
 >>>>       https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=playground_kernelci04&var-id=f0d5c8f71bbb1aa1e98cb1a89adb9d57c04ede3d
 >>>>
 >>>> you would get a nice green check mark, like this:
 >>>>
 >>>>       https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=kernelci04&var-id=8af5fe40bd59d8aa26dd76d9971435177aacbfce
 >>>>
 >>>
 >>> Ah I missed this valid flag on revision too, I'll fix.
 >>>
 >>>> Finally, at this stage we really need a breadth of data coming from
 >>>> different CI system, rather than its depth or precision, so we can understand
 >>>> the problem at hand better and faster. It would do us no good to concentrate
 >>>> on just a few, and solidify the design around them. That would make it more
 >>>> difficult for others to join.
 >>>>
 >>>> You can refine and add more data afterwards.
 >>>>
 >>>
 >>> Sure, in fact, as of now I still have to ask for some changes in our reporting
 >>> backend, (which generates the original data stored in our DB and then pushed
 >>> to you), so I have to admit the git commit hash are partially faked (since I
 >>> have only a git describe string to start from) and as a consequence they won't
 >>> really be so much useful for comparisons amongst different origins (given
 >>> they don't refer real kernel commits), BUT I thought this NOT to be a
 >>> blocking problem for now, so that I can start pushing data to KCIDB and
 >>> then later on (once I get real full hashes on my side) I'll start pushing the
 >>> real valid ones, does it sounds good ?
 >>>
 >>>
 >>>>> moreover I saw a little while a go that you're going to switch to schema v4
 >>>>> with some minor changes in revisions and commit_hashes so I wanted to
 >>>>> conform to that once it's published (even though you're back compatible with
 >>>>> v3 AFAIU)....
 >>>>
 >>>> I would rather you didn't wait for that, as I'm neck deep in research for the
 >>>> next release right now, and it doesn't seem like it's gonna come out soon.
 >>>> I'm concentrating on getting our result notifications in a good shape so we
 >>>> can reach actual kernel developers ASAP.
 >>>>
 >>>> We can work on upgrading your setup later, when it comes out. And there are
 >>>> going to be other changes, anyway. So, I'd rather we released early and
 >>>> iterated.
 >>>>
 >>>
 >>> Good I'l stick to v3.
 >>>
 >>> Side question...for dynamic schema validation purposes...is there any URL
 >>> where I can fetch the latest currently valid schema ... something like:
 >>>
 >>> https://github.com/kernelci/kcidb/releases/kcidb.latest.schema.json
 >>>
 >>> so that I can check automatically against the latest greatest instead of
 >>> using a builtin predownloaded one (or is it a bad idea in your opinion ?)
 >>>
 >>>>> ... then I've got dragged away again from this past week :D
 >>>>>
 >>>>> In fact my next steps (possibly next week) would have been (beside my fixes)
 >>>>> to ask you how to proceed further to production KCIDB.
 >>>>
 >>>> There's never enough time for everything :)
 >>>>
 >>>
 >>> eh..
 >>>
 >>>>> Would you want me to stop flooding your staging instance in the meantime (:D),
 >>>>> at least till I'm back at it? I think I have enough data now to debug anyway.
 >>>>> (I could make a few more checks next week though.)
 >>>>
 >>>> Don't worry about that, and keep pushing, maybe you'll manage to break it
 >>>> again and then we can fix it :)
 >>>>
 >>>
 >>> Fine :D
 >>>
 >>>>> If it's just a matter of switching project (once I've got enhanced permissions
 >>>>> from you) please do it, and I'll try to finalize everything next week on our
 >>>>> side and move to production.
 >>>>
 >>>> Permission granted! Switch when you feel ready, and don't hesitate to ping me
 >>>> for another review, if you need it.
 >>>>
 >>>> Just replace "playground_kernelci_new" topic with "kernelci_new" in your
 >>>> setup when you're ready.
 >>>>
 >>>
 >>> Cool, thanks.
 >>>
 >>>>> Thanks for the patience
 >>>>
 >>>> Thank you for your effort, we need your data :D
 >>>>
 >>>> Nick
 >>>>
 >>>
 >>> Thank you Nick
 >>>
 >>> Cheers,
 >>>
 >>> Cristian


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-10 18:17                             ` Nikolai Kondrashov
@ 2020-12-10 20:19                               ` Cristian Marussi
  2020-12-14 10:23                                 ` Nikolai Kondrashov
  0 siblings, 1 reply; 23+ messages in thread
From: Cristian Marussi @ 2020-12-10 20:19 UTC (permalink / raw)
  To: kernelci, Nikolai.Kondrashov; +Cc: broonie, basil.eljuse

Hi

On Thu, Dec 10, 2020 at 08:17:42PM +0200, Nikolai Kondrashov via groups.io wrote:
> Hi Cristian,
> 
> On 12/10/20 7:23 PM, Cristian Marussi wrote:
> > I fixed the issue about the uniqueness of the test IDs but left the valid
> > flag on the revision undefined as of now, given the revision hash is
> > temporarily faked (as I told you)... just to have an indication that the
> > revision is bogus.
> > Anyway I'll have that fixed in our backend soon, and once I start
> > receiving a proper real hash the system 'should' automatically start
> > tagging revisions as valid: True.
> 
> Good plan!
> 
> > Moreover, after fixing a few more annoyances on my side, today I switched to
> > KCIDB production and pushed December results; from tomorrow morning it should
> > start feeding daily data to KCIDB production.
> 
> Woo-hoo! Wonderful, this is a nice Christmas present :)
> 
> > Thanks for the support and patience.
> 
> Thank you for your work, Cristian!
> 
> I notice a bit of strange data: failed builds have one (failed) boot test
> submitted. Is this on purpose, does this mean something special? Logically, we
> can't boot a build if it hasn't completed, can we?
> 
> Here's an example:
> 
>     https://staging.kernelci.org:3000/d/build/build?orgId=1&var-id=arm:2020-12-08:d6051b14fced47d1983fd70171b9bcd7170491ce
> 
> Nick
> 

So basically, everything I have on my side represents a test run of some kind
of suite (LTP, KSELFTEST, KVM-UT, etc.), because that is basically what we
currently trace (at least in the data we accumulate in the DB); failed builds
(as in: compilation failed) are not really tracked, so in this scenario all
my builds would show up green in KCIDB.

If a test run (kernel) successfully boots and runs to completion, I gather a
number of individual test results.
Then I 'synthesize' a boot test and a cumulative test-suite result in
addition to all the individual test results I could find.
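
For reference, this is roughly the shape of the "tests" entries I synthesize for
one such run (only a sketch against the v3 schema as I understand it; the IDs,
the build_id value and the "boot"/"ltp" paths here are made up for illustration):

    [
        {
            "id": "arm:ltp-run-1234:boot",
            "build_id": "arm:2020-12-08:<commit-sha>",
            "origin": "arm",
            "path": "boot",
            "status": "PASS"
        },
        {
            "id": "arm:ltp-run-1234:suite",
            "build_id": "arm:2020-12-08:<commit-sha>",
            "origin": "arm",
            "path": "ltp",
            "status": "FAIL"
        },
        {
            "id": "arm:ltp-run-1234:syscalls.openat01",
            "build_id": "arm:2020-12-08:<commit-sha>",
            "origin": "arm",
            "path": "ltp.syscalls.openat01",
            "status": "FAIL"
        }
    ]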

In order to fit the above into your schema currently, and to give some info
about the general health of the test run (build), I mark builds valid only if
the test run / kernel has both:
  -> booted
  -> run the test suite to completion (without hanging), with or without
  individual test failures

In all the other cases, i.e. no boot, or a hang with incomplete results, the
build gets red; but anyway, a failed boot test (on no-boot) or a successful
boot test and nothing else (on hang) could still be present.
(And at the moment I don't have public logs to provide, as you can see, which
is not so useful.)

Alternatively, probably sticking closer to the intended usage of your schema,
I could just mark all builds valid for now, and in the future mark invalid
only the broken compilations, as expected (if and when such data becomes
available programmatically on my side): in that case we'd anyway have the
boot test results to see what's going on with a green build that apparently
has no other results, as sketched below.
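
Just to make this concrete, a run that didn't boot would then come out roughly
like this (again only a sketch with placeholder IDs; field names are per the v3
schema as far as I can tell):

    {
        "builds": [
            {
                "id": "arm:2020-12-08:<commit-sha>",
                "revision_id": "<commit-sha>",
                "origin": "arm",
                "valid": true
            }
        ],
        "tests": [
            {
                "id": "arm:ltp-run-5678:boot",
                "build_id": "arm:2020-12-08:<commit-sha>",
                "origin": "arm",
                "path": "boot",
                "status": "FAIL"
            }
        ]
    }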

Maybe it's really better to go this latter way, to fit the usual meaning of
the schema and be able to provide compilation-failure results in the
future.
If you feel this is reasonable I can easily fix it immediately (since the real
final deployment still has to be fully done :D)

Thanks

Cristian



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-10 20:19                               ` Cristian Marussi
@ 2020-12-14 10:23                                 ` Nikolai Kondrashov
  0 siblings, 0 replies; 23+ messages in thread
From: Nikolai Kondrashov @ 2020-12-14 10:23 UTC (permalink / raw)
  To: Cristian Marussi, kernelci; +Cc: broonie, basil.eljuse

Hi Cristian,

Thank you for your quick answer!

On 12/10/20 10:19 PM, Cristian Marussi wrote:
 > So basically, everything I have on my side represents a test run of some kind
 > of suite (LTP, KSELFTEST, KVM-UT, etc.), because that is basically what we
 > currently trace (at least in the data we accumulate in the DB); failed builds
 > (as in: compilation failed) are not really tracked, so in this scenario all
 > my builds would show up green in KCIDB.
 >
 > If a test run (kernel) successfully boots and runs to completion, I gather a
 > number of individual test results.
 > Then I 'synthesize' a boot test and a cumulative test-suite result in
 > addition to all the individual test results I could find.
 >
 > In order to fit the above into your schema currently, and to give some info
 > about the general health of the test run (build), I mark builds valid only if
 > the test run / kernel has both:
 >    -> booted
 >    -> run the test suite to completion (without hanging), with or without
 >    individual test failures
 >
 > In all the other cases, i.e. no boot, or a hang with incomplete results, the
 > build gets red; but anyway, a failed boot test (on no-boot) or a successful
 > boot test and nothing else (on hang) could still be present.
 > (And at the moment I don't have public logs to provide, as you can see, which
 > is not so useful.)
 >
 > Alternatively, probably sticking closer to the intended usage of your schema,
 > I could just mark all builds valid for now, and in the future mark invalid
 > only the broken compilations, as expected (if and when such data becomes
 > available programmatically on my side): in that case we'd anyway have the
 > boot test results to see what's going on with a green build that apparently
 > has no other results.

I think this would be best from the POV of KCIDB. The build's "valid" flag
really only indicates whether the build completed or not.
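
So, once you can track compilation failures on your side, a broken build would
simply be reported along these lines (just a rough sketch; the ID and revision
are placeholders, and I'm leaving out the other build fields):

    {
        "id": "arm:2020-12-15:<commit-sha>",
        "revision_id": "<commit-sha>",
        "origin": "arm",
        "valid": false
    }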

You don't really need to report the overall status of tests, as KCIDB will be
summarizing them, but if you want, you can report it with a "root" "test"
object. A "root" test object is one which has "path" set to the empty string.

Like this:

     {
         "id": "arm::36189",
         "origin": "arm",
         "path": "",
         "status": "PASS"
     }

 > Maybe it's really better to go this latter way, to fit the usual meaning of
 > the schema and be able to provide compilation-failure results in the
 > future.
 > If you feel this is reasonable I can easily fix it immediately (since the real
 > final deployment still has to be fully done :D)

Awesome :) Yeah, I think it's best to switch to using "valid" only for
"compilation" status. You can add links to logs (and/or supply relevant log
excerpts when we support that), later.

Thank you!
Nick

On 12/10/20 10:19 PM, Cristian Marussi wrote:
 > Hi
 >
 > On Thu, Dec 10, 2020 at 08:17:42PM +0200, Nikolai Kondrashov via groups.io wrote:
 >> Hi Cristian,
 >>
 >> On 12/10/20 7:23 PM, Cristian Marussi wrote:
 >>> I fixed the issue about uniqueness of the tests IDs but left the valid
 >>> flag on the revision undefined as of now given the revision hash is
 >>> temporarily faked (as I told you)...just to have an indication that the
 >>> revision is bogus.
 >>> Anyway I'll have that fixed in our backend soon, and once I'll start
 >>> receiving a proper real hash the system 'should' automatically start
 >>> tagging revisions as valid: True.
 >>
 >> Good plan!
 >>
 >>> Moreover, after fixing a few more annoyances on my side, today I switched to
 >>> KCIDB production and pushed December results; from tomorrow morning it should
 >>> start feeding daily data to KCIDB production.
 >>
 >> Woo-hoo! Wonderful, this is a nice Christmas present :)
 >>
 >>> Thanks for the support and patience.
 >>
 >> Thank you for your work, Cristian!
 >>
 >> I notice a bit of strange data: failed builds have one (failed) boot test
 >> submitted. Is this on purpose, does this mean something special? Logically, we
 >> can't boot a build if it hasn't completed, don't we?
 >>
 >> Here's an example:
 >>
 >>      https://staging.kernelci.org:3000/d/build/build?orgId=1&var-id=arm:2020-12-08:d6051b14fced47d1983fd70171b9bcd7170491ce
 >>
 >> Nick
 >>
 >
 > So basically, everything I have on my side represents a test run of some kind
 > of suite (LTP, KSELFTEST, KVM-UT...etc), because basically this is what we
 > trace currently (at least the data we accumulate in the DB); failed builds
 > (as in compilation failed) are not really tracked, so I would have all the
 > builds green in KCIDB in this scenario.
 >
 > If a testrun(kernel) successfully boots and successfully runs till the end I
 > gather a number of individual test results.
 > Then I 'synthetize' a boot test and a cumulative test-suite result in
 > addition to all the singular tests results I could find.
 >
 > In order to fit the above in your schema currently, and give some info about
 > the testrun(build) general health, I mark builds valid only if the testrun/
 > kernel has both:
 >    -> booted
 >    -> run the test_suite till completion (without hang) with or without
 >    singular tests failures
 >
 > In all the other cases, so no boot or hang with imcomplete results, build
 > gets red but anyway, a failed boot test (on noboot) or a successfull boot
 > test and nothing else (on hang) could be present.
 > (and at the moment I don't have public logs to provide as you can see
 > which is not so useful)
 >
 > Alternatively, sticking probably better to the intended usage of your schema,
 > I could just mark all builds valid for now, and then mark invalid in the
 > future only the broken compilations as expected (once and if such data will
 > be available programmatically on my side): in such case we'd anyway have the
 > boot test results to see what's going on a green build with apparently
 > no other results.
 >
 > Maybe it's better really going this latter way to fit the usual meaning of
 > the schema and be able to provide compilation issues results in the
 > future.
 > If you feel this is reasonable I can easily fix it immediately (for the real
 > final deployment is still be fully done :D)
 >
 > Thanks
 >
 > Cristian
 >
 >
 >> On 12/10/20 7:23 PM, Cristian Marussi wrote:
 >>> Hi Nick
 >>>
 >>> On Wed, Dec 02, 2020 at 03:38:19PM +0200, Nikolai Kondrashov via groups.io wrote:
 >>>> On 12/2/20 2:01 PM, Cristian Marussi wrote:
 >>>>>>    From POV of KCIDB, what you're sending now is overwriting the same test runs
 >>>>>> over and over, and we can't really tell which one of those objects is the
 >>>>>> final version.
 >>>>>
 >>>>>
 >>>>> Ah, that was exactly what I used to do in my initial experiments and then,
 >>>>> looking at the data on the UI, I was dumb enough to decide that I must have got
 >>>>> it wrong, and I started using the test_id instead of the test_execution_id, because
 >>>>> I thought that, anyway, you can recognize the different executions of the
 >>>>> same test_id by looking at the different build_id each is part of (which for us
 >>>>> represents the different test suite runs)....but I suppose this wrong assumption
 >>>>> of mine stemmed from the relational data model I use on our side. I'll fix it.
 >>>>
 >>>> Yes, that would work, but then we get a "foreign key explosion" as we start
 >>>> linking to tests from other objects beside builds. So, for now we're sticking
 >>>> to the "one ID column per table" policy.
 >>>>
 >>>> Thanks for bearing with us, and am glad to hear you already have
 >>>> `test_execution_id` in your database, so the fix shouldn't take long :)
 >>>>
 >>>>> Sure, in fact, as of now I still have to ask for some changes in our reporting
 >>>>> backend (which generates the original data stored in our DB and then pushed
 >>>>> to you), so I have to admit the git commit hashes are partially faked (since I
 >>>>> have only a git describe string to start from) and as a consequence they won't
 >>>>> really be that useful for comparisons amongst different origins (given
 >>>>> they don't refer to real kernel commits), BUT I thought this was NOT a
 >>>>> blocking problem for now, so that I can start pushing data to KCIDB and
 >>>>> then later on (once I get real full hashes on my side) I'll start pushing the
 >>>>> real valid ones; does that sound good?
 >>>>
 >>>> Yes, no problem. We don't have maintainers/developers to get angry yet :D
 >>>>
 >>>> I'm looking forward to having four-origin revisions in the dashboard, though,
 >>>> one more than e.g. this one:
 >>>>
 >>>>       https://staging.kernelci.org:3000/d/revision/revision?var-id=3650b228f83adda7e5ee532e2b90429c03f7b9ec
 >>>>
 >>>
 >>> I fixed the issue about the uniqueness of the test IDs, but left the valid
 >>> flag on the revision undefined for now, given that the revision hash is
 >>> temporarily faked (as I told you)...just to have an indication that the
 >>> revision is bogus.
 >>> Anyway, I'll have that fixed in our backend soon, and once I start
 >>> receiving a proper real hash the system 'should' automatically start
 >>> tagging revisions as valid: True.
 >>>
 >>>>> Side question...for dynamic schema validation purposes...is there any URL
 >>>>> where I can fetch the latest currently valid schema ... something like:
 >>>>>
 >>>>> https://github.com/kernelci/kcidb/releases/kcidb.latest.schema.json
 >>>>>
 >>>>> so that I can check automatically against the latest and greatest instead of
 >>>>> using a built-in pre-downloaded one (or is it a bad idea, in your opinion?)
 >>>>
 >>>> The JSON schemas we generate with `kcidb-schema`, and use inside KCIDB, only
 >>>> validate *one* major version. So v3 data would only validate with v3 schema,
 >>>> but not with e.g. v4.
 >>>>
 >>>> So if you e.g. download and validate against the latest-release schema
 >>>> automatically, validation will start failing the moment a release with v4
 >>>> comes out.
 >>>>
 >>>> Automatic data upgrades between major versions are done in Python whenever we
 >>>> see a difference between the numbers.
 >>>>
 >>>> OTOH, minor version bumps of the schema are backwards-compatible, and you
 >>>> would be fine upgrading validation to those. However, we don't have many of
 >>>> those at all yet, as we're still changing the schema a lot.
 >>>>
 >>>> So, I think a reasonable workflow right now is to download and switch to a new
 >>>> version at the same time you're upgrading your submission code to the next
 >>>> major release of the schema. You'll need more work on the code than just
 >>>> switching the schema, anyway.
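 >>>>
 >>>> For example (just a sketch, assuming you keep a local copy of the v3 schema
 >>>> pinned next to your submission code; the file names here are made up):
 >>>>
 >>>>     import json
 >>>>     import jsonschema
 >>>>
 >>>>     # Validate against the pinned major version only; bump the pinned
 >>>>     # file together with the submission code, not automatically.
 >>>>     with open("kcidb.v3.schema.json") as schema_file:
 >>>>         schema = json.load(schema_file)
 >>>>     with open("report.json") as report_file:
 >>>>         report = json.load(report_file)
 >>>>     jsonschema.validate(instance=report, schema=schema)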
 >>>>
 >>>> However, let's get back to this further along the way, perhaps we can think of
 >>>> something smoother and more automated. E.g. set up a way to have automatic
 >>>> upgrades between minor versions.
 >>>
 >>> Agreed, using v3 for the moment.
 >>>
 >>> Moreover, after fixing a few more annoyances on my side, today I switched to
 >>> KCIDB production and pushed December results; from tomorrow morning it should
 >>> start feeding daily data to KCIDB production.
 >>>
 >>> Thanks for the support and patience.
 >>>
 >>> Cristian
 >>>
 >>>>
 >>>> Thanks :)
 >>>> Nick
 >>>>
 >>>> On 12/2/20 2:01 PM, Cristian Marussi wrote:
 >>>>> On Wed, Dec 02, 2020 at 12:16:10PM +0200, Nikolai Kondrashov wrote:
 >>>>>> On 12/2/20 11:23 AM, Cristian Marussi wrote:
 >>>>>>> On Wed, Dec 02, 2020 at 10:05:05AM +0200, Nikolai Kondrashov via groups.io wrote:
 >>>>>>>> On 11/5/20 8:46 PM, Cristian Marussi wrote:
 >>>>>>>>> after the past month's few experiments with ARM KCIDB submissions against your
 >>>>>>>>> KCIDB staging instance, I was dragged a bit away from this by other stuff
 >>>>>>>>> before effectively deploying some real automation on our side to push our
 >>>>>>>>> daily results to KCIDB...now I'm back at it and I'll keep on testing
 >>>>>>>>> some automation on our side for a bit against your KCIDB staging instance
 >>>>>>>>> before asking you to move to production eventually.
 >>>>>>>>
 >>>>>>>> I see your data has been steadily trickling into our playground database and
 >>>>>>>> it looks quite good. Would you like to move to the production instance?
 >>>>>>>>
 >>>>>>>> I can review your data for you, we can fix the remaining issues if we find
 >>>>>>>> them, and I can give you the permissions to push to production. Then you will
 >>>>>>>> only need to change the topic you push to from "playground_kernelci_new" to
 >>>>>>>> "kernelci_new".
 >>>>>>>
 >>>>>>> In fact I left one staging instance on our side to push data on your
 >>>>>>> staging instance to verify remaining issues on our side (and there are a
 >>>>>>> couple of minor ones I spotted that I'd like to fix indeed);
 >>>>>>
 >>>>>> Sure, it's up to you when you decide to switch. However, if you'd like, list
 >>>>>> your issues here, and I would be able to tell you if those are important from
 >>>>>> KCIDB POV.
 >>>>>>
 >>>>>> Looking at your data, I can only find one serious issue: the test run ("test")
 >>>>>> IDs are not unique. E.g. there are 1460 objects with ID "arm:LTP:11" which
 >>>>>> use 643 distinct build_id's among them.
 >>>>>>
 >>>>>> The test run IDs should correspond to a single execution of a test. Otherwise
 >>>>>> we won't be able to tell them apart. You can send multiple reports containing
 >>>>>> test runs ("tests") with the same ID, but that would still mean the same
 >>>>>> execution, only repeating the same data, or adding more.
 >>>>>>
 >>>>>> A little more explanation:
 >>>>>> https://github.com/kernelci/kcidb/blob/master/SUBMISSION_HOWTO.md#submitting-objects-multiple-times
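 >>>>>>
 >>>>>> For instance (a sketch with made-up components), including something that
 >>>>>> identifies the particular suite run in the ID keeps each execution distinct:
 >>>>>>
 >>>>>>     # Hypothetical ID scheme: origin, suite run, then the test name.
 >>>>>>     def test_run_id(suite_run, test_name):
 >>>>>>         return f"arm:{suite_run}:{test_name}"
 >>>>>>
 >>>>>>     test_run_id("ltp-2020-12-08-01", "LTP:11")
 >>>>>>     # -> "arm:ltp-2020-12-08-01:LTP:11"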
 >>>>>>
 >>>>>>    From POV of KCIDB, what you're sending now is overwriting the same test runs
 >>>>>> over and over, and we can't really tell which one of those objects is the
 >>>>>> final version.
 >>>>>
 >>>>>
 >>>>> Ah, that was exactly what I used to do in my initial experiments and then,
 >>>>> looking at the data on the UI, I was dumb enough to decide that I must have got
 >>>>> it wrong, and I started using the test_id instead of the test_execution_id, because
 >>>>> I thought that, anyway, you can recognize the different executions of the
 >>>>> same test_id by looking at the different build_id each is part of (which for us
 >>>>> represents the different test suite runs)....but I suppose this wrong assumption
 >>>>> of mine stemmed from the relational data model I use on our side. I'll fix it.
 >>>>>
 >>>>>>
 >>>>>> Aside from that, you might want to add `"valid": true` to your "revision"
 >>>>>> objects to indicate they're alright. You never seem to send patched revisions,
 >>>>>> so it should always be true for you. Then instead of the blank "Status" field:
 >>>>>>
 >>>>>>        https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=playground_kernelci04&var-id=f0d5c8f71bbb1aa1e98cb1a89adb9d57c04ede3d
 >>>>>>
 >>>>>> you would get a nice green check mark, like this:
 >>>>>>
 >>>>>>        https://staging.kernelci.org:3000/d/revision/revision?orgId=1&var-dataset=kernelci04&var-id=8af5fe40bd59d8aa26dd76d9971435177aacbfce
 >>>>>>
 >>>>>
 >>>>> Ah I missed this valid flag on revision too, I'll fix.
 >>>>>
 >>>>>> Finally, at this stage we really need a breadth of data coming from
 >>>>>> different CI systems, rather than their depth or precision, so we can understand
 >>>>>> the problem at hand better and faster. It would do us no good to concentrate
 >>>>>> on just a few, and solidify the design around them. That would make it more
 >>>>>> difficult for others to join.
 >>>>>>
 >>>>>> You can refine and add more data afterwards.
 >>>>>>
 >>>>>
 >>>>> Sure, in fact, as of now I still have to ask for some changes in our reporting
 >>>>> backend (which generates the original data stored in our DB and then pushed
 >>>>> to you), so I have to admit the git commit hashes are partially faked (since I
 >>>>> have only a git describe string to start from) and as a consequence they won't
 >>>>> really be that useful for comparisons amongst different origins (given
 >>>>> they don't refer to real kernel commits), BUT I thought this was NOT a
 >>>>> blocking problem for now, so that I can start pushing data to KCIDB and
 >>>>> then later on (once I get real full hashes on my side) I'll start pushing the
 >>>>> real valid ones; does that sound good?
 >>>>>
 >>>>>
 >>>>>>> moreover I saw a little while ago that you're going to switch to schema v4
 >>>>>>> with some minor changes in revisions and commit_hashes, so I wanted to
 >>>>>>> conform to that once it's published (even though you're backward compatible with
 >>>>>>> v3 AFAIU)....
 >>>>>>
 >>>>>> I would rather you didn't wait for that, as I'm neck deep in research for the
 >>>>>> next release right now, and it doesn't seem like it's gonna come out soon.
 >>>>>> I'm concentrating on getting our result notifications in a good shape so we
 >>>>>> can reach actual kernel developers ASAP.
 >>>>>>
 >>>>>> We can work on upgrading your setup later, when it comes out. And there are
 >>>>>> going to be other changes, anyway. So, I'd rather we released early and
 >>>>>> iterated.
 >>>>>>
 >>>>>
 >>>>> Good, I'll stick to v3.
 >>>>>
 >>>>> Side question...for dynamic schema validation purposes...is there any URL
 >>>>> where I can fetch the latest currently valid schema ... something like:
 >>>>>
 >>>>> https://github.com/kernelci/kcidb/releases/kcidb.latest.schema.json
 >>>>>
 >>>>> so that I can check automatically against the latest and greatest instead of
 >>>>> using a built-in pre-downloaded one (or is it a bad idea, in your opinion?)
 >>>>>
 >>>>>>> ... then I got dragged away from this again this past week :D
 >>>>>>>
 >>>>>>> In fact my next steps (possibly next week) would have been (besides my fixes)
 >>>>>>> to ask you how to proceed further towards production KCIDB.
 >>>>>>
 >>>>>> There's never enough time for everything :)
 >>>>>>
 >>>>>
 >>>>> eh..
 >>>>>
 >>>>>>> Would you want me to stop flooding your staging instance in the meantime (:D),
 >>>>>>> at least till I'm back at it? I think I have enough data now to debug anyway.
 >>>>>>> (I could make a few more checks next week though.)
 >>>>>>
 >>>>>> Don't worry about that, and keep pushing, maybe you'll manage to break it
 >>>>>> again and then we can fix it :)
 >>>>>>
 >>>>>
 >>>>> Fine :D
 >>>>>
 >>>>>>> If it's just a matter of switching project (once I've got the enhanced
 >>>>>>> permissions from you) please do it, and I'll try to finalize everything next
 >>>>>>> week on our side and move to production.
 >>>>>>
 >>>>>> Permission granted! Switch when you feel ready, and don't hesitate to ping me
 >>>>>> for another review, if you need it.
 >>>>>>
 >>>>>> Just replace "playground_kernelci_new" topic with "kernelci_new" in your
 >>>>>> setup when you're ready.
 >>>>>>
 >>>>>
 >>>>> Cool, thanks.
 >>>>>
 >>>>>>> Thanks for the patience
 >>>>>>
 >>>>>> Thank you for your effort, we need your data :D
 >>>>>>
 >>>>>> Nick
 >>>>>>
 >>>>>
 >>>>> Thank you Nick
 >>>>>
 >>>>> Cheers,
 >>>>>
 >>>>> Cristian
 >>>>>
 >>>>>
 >>>>>> On 12/2/20 11:23 AM, Cristian Marussi wrote:
 >>>>>>> Hi Nick
 >>>>>>>
 >>>>>>> On Wed, Dec 02, 2020 at 10:05:05AM +0200, Nikolai Kondrashov via groups.io wrote:
 >>>>>>>> Hi Cristian,
 >>>>>>>>
 >>>>>>>> On 11/5/20 8:46 PM, Cristian Marussi wrote:
 >>>>>>>>> Hi Nick,
 >>>>>>>>>
 >>>>>>>>> after the past month's few experiments with ARM KCIDB submissions against your
 >>>>>>>>> KCIDB staging instance, I was dragged a bit away from this by other stuff
 >>>>>>>>> before effectively deploying some real automation on our side to push our
 >>>>>>>>> daily results to KCIDB...now I'm back at it and I'll keep on testing
 >>>>>>>>> some automation on our side for a bit against your KCIDB staging instance
 >>>>>>>>> before asking you to move to production eventually.
 >>>>>>>>
 >>>>>>>> I see your data has been steadily trickling into our playground database and
 >>>>>>>> it looks quite good. Would you like to move to the production instance?
 >>>>>>>>
 >>>>>>>> I can review your data for you, we can fix the remaining issues if we find
 >>>>>>>> them, and I can give you the permissions to push to production. Then you will
 >>>>>>>> only need to change the topic you push to from "playground_kernelci_new" to
 >>>>>>>> "kernelci_new".
 >>>>>>>
 >>>>>>> In fact I left one staging instance on our side to push data on your
 >>>>>>> staging instance to verify remaining issues on our side (and there are a
 >>>>>>> couple of minor ones I spotted that I'd like to fix indeed); moreover I saw
 >>>>>>> a little while ago that you're going to switch to schema v4 with some minor
 >>>>>>> changes in revisions and commit_hashes, so I wanted to conform to that once
 >>>>>>> it's published (even though you're backward compatible with v3 AFAIU)....
 >>>>>>>
 >>>>>>> ... then I got dragged away from this again this past week :D
 >>>>>>>
 >>>>>>> In fact my next steps (possibly next week) would have been (besides my fixes)
 >>>>>>> to ask you how to proceed further towards production KCIDB.
 >>>>>>>
 >>>>>>> Would you want me to stop flooding your staging instance in the meantime (:D),
 >>>>>>> at least till I'm back at it? I think I have enough data now to debug anyway.
 >>>>>>> (I could make a few more checks next week though.)
 >>>>>>>
 >>>>>>> If it's just a matter of switching project (once I've got the enhanced
 >>>>>>> permissions from you) please do it, and I'll try to finalize everything next
 >>>>>>> week on our side and move to production.
 >>>>>>>
 >>>>>>> Thanks for the patience
 >>>>>>>
 >>>>>>> Cristian
 >>>>>>>
 >>>>>>>
 >>>>>>>>
 >>>>>>>> Nick
 >>>>>>>>
 >>>>>>>> On 11/5/20 8:46 PM, Cristian Marussi wrote:
 >>>>>>>>> Hi Nick,
 >>>>>>>>>
 >>>>>>>>> after the past month's few experiments with ARM KCIDB submissions against your
 >>>>>>>>> KCIDB staging instance, I was dragged a bit away from this by other stuff
 >>>>>>>>> before effectively deploying some real automation on our side to push our
 >>>>>>>>> daily results to KCIDB...now I'm back at it and I'll keep on testing
 >>>>>>>>> some automation on our side for a bit against your KCIDB staging instance
 >>>>>>>>> before asking you to move to production eventually.
 >>>>>>>>>
 >>>>>>>>> But today I realized that I cannot push any more data successfully into
 >>>>>>>>> staging: even the same test script I used one month ago to push some new
 >>>>>>>>> test data seems to fail now (I tested a few different days, and the JSON
 >>>>>>>>> validates fine with jsonschema...with proper dates including hours...)...
 >>>>>>>>> ...I cannot see any of today's test pushes on:
 >>>>>>>>>
 >>>>>>>>> https://staging.kernelci.org:3000/d/home/home?orgId=1&from=now-1y&to=now&refresh=30m&var-origin=arm&var-git_repository_url=All&var-dataset=playground_kernelci04
 >>>>>>>>>
 >>>>>>>>> Auth seems to proceed fine, but I cannot find any submission dated after
 >>>>>>>>> the old ~15/18-09-2020 submissions. I'm using the same kci-submit tools
 >>>>>>>>> version installed in past months from your GitHub, though.
 >>>>>>>>>
 >>>>>>>>> Do you see any errors on your side that can shed some light on this?
 >>>>>>>>>
 >>>>>>>>> Thanks
 >>>>>>>>>
 >>>>>>>>> Regards
 >>>>>>>>>
 >>>>>>>>> Cristian
 >>>>>>>>>
 >>>>>>>>> On Fri, Sep 18, 2020 at 05:42:28PM +0100, Cristian Marussi wrote:
 >>>>>>>>>> Hi Nick,
 >>>>>>>>>>
 >>>>>>>>>> On Fri, Sep 18, 2020 at 06:53:28PM +0300, Nikolai Kondrashov wrote:
 >>>>>>>>>>> On 9/18/20 6:30 PM, Nikolai Kondrashov wrote:
 >>>>>>>>>>>> Yes, I think it's one of the problems you uncovered :)
 >>>>>>>>>>>>
 >>>>>>>>>>>> The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
 >>>>>>>>>>>> database on the backend doesn't understand some of them. In particular it
 >>>>>>>>>>>> doesn't understand the date-only timestamps you send. E.g. "2020-09-13".
 >>>>>>>>>>>> That's what I wanted to fix today, but ran out of time.
 >>>>>>>>>>>
 >>>>>>>>>>> Looking at this more it seems that Python's jsonschema module simply doesn't
 >>>>>>>>>>> enforce the requirements we put on those fields 🤦. You can send essentially
 >>>>>>>>>>> what you want and then hit BigQuery, which is serious about them.
 >>>>>>>>>>
 >>>>>>>>>> ...in fact, on my side I also check with jsonschema in my script before using kcidb :D
 >>>>>>>>>>>
 >>>>>>>>>>> Sorry about that.
 >>>>>>>>>>>
 >>>>>>>>>>
 >>>>>>>>>> No worries.
 >>>>>>>>>>
 >>>>>>>>>>> I opened an issue for this: https://github.com/kernelci/kcidb/issues/108
 >>>>>>>>>>>
 >>>>>>>>>>> For now please just make sure your timestamps comply with RFC3339.
 >>>>>>>>>>>
 >>>>>>>>>>> You can produce such a timestamp e.g. using "date --rfc-3339=s".
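 >>>>>>>>>>>
 >>>>>>>>>>> Or, if you build the reports in Python, something like this sketch (not
 >>>>>>>>>>> tied to kcidb) gives an equivalent value:
 >>>>>>>>>>>
 >>>>>>>>>>>     from datetime import datetime, timezone
 >>>>>>>>>>>
 >>>>>>>>>>>     # second precision, explicit UTC offset,
 >>>>>>>>>>>     # e.g. "2020-09-18 15:53:28+00:00"
 >>>>>>>>>>>     datetime.now(timezone.utc).isoformat(sep=" ", timespec="seconds")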
 >>>>>>>>>>
 >>>>>>>>>> I'll anyway fix my data on my side too, to have the real discovery timestamp.
 >>>>>>>>>>
 >>>>>>>>>>>
 >>>>>>>>>>> Nick
 >>>>>>>>>>>
 >>>>>>>>>>
 >>>>>>>>>> Thanks
 >>>>>>>>>>
 >>>>>>>>>> Cristian
 >>>>>>>>>>
 >>>>>>>>>>> On 9/18/20 6:30 PM, Nikolai Kondrashov wrote:
 >>>>>>>>>>>> On 9/18/20 6:21 PM, Cristian Marussi wrote:
 >>>>>>>>>>>>       > So in order to carry on my experiments, I've just tried to push a new dataset
 >>>>>>>>>>>>       > with a few changes in my data-layout to mimic what I see other origins do; this
 >>>>>>>>>>>>       > contained something like 38 builds across 4 different revisions (with brand new
 >>>>>>>>>>>>       > revision IDs), but I cannot see anything on the UI: I just keep seeing the old
 >>>>>>>>>>>>       > push from yesterday.
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > JSON seems valid and kcidb-submit does not report any error even using -l DEBUG.
 >>>>>>>>>>>>       > (I pushed >30mins ago)
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > Any idea ?
 >>>>>>>>>>>>
 >>>>>>>>>>>> Yes, I think it's one of the problems you uncovered :)
 >>>>>>>>>>>>
 >>>>>>>>>>>> The schema allows for fully-compliant RFC3339 timestamps, but the BigQuery
 >>>>>>>>>>>> database on the backend doesn't understand some of them. In particular it
 >>>>>>>>>>>> doesn't understand the date-only timestamps you send. E.g. "2020-09-13".
 >>>>>>>>>>>> That's what I wanted to fix today, but ran out of time.
 >>>>>>>>>>>>
 >>>>>>>>>>>> Additionally, the backend doesn't have a way to report a problem to the
 >>>>>>>>>>>> submitter at the moment. We intend to fix that, but for now it's possible only
 >>>>>>>>>>>> through us looking at the logs and sending a message to the submitter :)
 >>>>>>>>>>>>
 >>>>>>>>>>>> To work around this you can pad your timestamps with dummy date and time
 >>>>>>>>>>>> data.
 >>>>>>>>>>>>
 >>>>>>>>>>>> E.g. instead of sending:
 >>>>>>>>>>>>
 >>>>>>>>>>>>           2020-09-13
 >>>>>>>>>>>>
 >>>>>>>>>>>> you can send:
 >>>>>>>>>>>>
 >>>>>>>>>>>>           2020-09-13 00:00:00+00:00
 >>>>>>>>>>>>
 >>>>>>>>>>>> Hopefully that's the only problem. It could be, since you managed to send data
 >>>>>>>>>>>> before :)
 >>>>>>>>>>>>
 >>>>>>>>>>>> Nick
 >>>>>>>>>>>>
 >>>>>>>>>>>> On 9/18/20 6:21 PM, Cristian Marussi wrote:
 >>>>>>>>>>>>       > Hi Nikolai,
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > On Thu, Sep 17, 2020 at 08:26:15PM +0300, Nikolai Kondrashov wrote:
 >>>>>>>>>>>>       >> On 9/17/20 7:22 PM, Cristian Marussi wrote:
 >>>>>>>>>>>>       >>> It works too ... :D
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>> https://staging.kernelci.org:3000/d/build/build?orgId=1&var-dataset=playground_kernelci04&var-id=arm:2020-07-07:d3d7689c2cc9503266cac3bc777bb4ddae2e5f2e
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> Whoa, awesome!
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> And you have already uncovered a few issues we need to fix, too!
 >>>>>>>>>>>>       >> I will deal with them tomorrow.
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >>> ..quick question though....given that now I'll have to play quite a bit
 >>>>>>>>>>>>       >>> with it and see how it's best to present our data, if anything is missing, etc.,
 >>>>>>>>>>>>       >>> is there any chance (or way) that if I submit the same JSON report multiple
 >>>>>>>>>>>>       >>> times with slight differences here and there (but with the same IDs clearly)
 >>>>>>>>>>>>       >>> I'll get my DB updated in the bits I have changed: as an example I've just
 >>>>>>>>>>>>       >>> resubmitted the same report with added discovery_time and descriptions, and got
 >>>>>>>>>>>>       >>> NO errors, but I cannot see the changes in the UI (unless they have still to
 >>>>>>>>>>>>       >>> propagate...)..or maybe I can obtain the same effect by dropping my dataset
 >>>>>>>>>>>>       >>> before re-submitting ?
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> Right now it's not supported (with various possible quirks if attempted).
 >>>>>>>>>>>>       >> So, preferably, submit only one, complete and final instance of each object
 >>>>>>>>>>>>       >> (with unique ID) for now.
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> We have a plan to support merging missing properties across multiple reported
 >>>>>>>>>>>>       >> objects with the same ID.
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >>              Object A        Object B    Dashboard/Notifications
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> FieldX:     Foo             Foo         Foo
 >>>>>>>>>>>>       >> FieldY:                     Bar         Bar
 >>>>>>>>>>>>       >> FieldZ:     Baz                         Baz
 >>>>>>>>>>>>       >> FieldU:     Red             Blue        Red/Blue
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> Since we're using a distributed database we cannot really maintain order
 >>>>>>>>>>>>       >> (without introducing an artificial global lock), so the order of the reports
 >>>>>>>>>>>>       >> doesn't matter. We can only guarantee that a present value would override
 >>>>>>>>>>>>       >> a missing value. It would be undefined which value would be picked among
 >>>>>>>>>>>>       >> multiple different values.
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> This would allow gradual reporting of each object, but no editing, sorry.
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> However, once again, this is a plan with some research done, only.
 >>>>>>>>>>>>       >> I plan to start implementing it within a few weeks.
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > So in order to carry on my experiments, I've just tried to push a new dataset
 >>>>>>>>>>>>       > with a few changes in my data-layout to mimic what I see other origins do; this
 >>>>>>>>>>>>       > contained something like 38 builds across 4 different revisions (with brand new
 >>>>>>>>>>>>       > revision IDs), but I cannot see anything on the UI: I just keep seeing the old
 >>>>>>>>>>>>       > push from yesterday.
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > JSON seems valid and kcidb-submit does not report any error even using -l DEBUG.
 >>>>>>>>>>>>       > (I pushed >30mins ago)
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > Any idea ?
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > Thanks
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > Cristian
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       >> Nick
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >> On 9/17/20 7:22 PM, Cristian Marussi wrote:
 >>>>>>>>>>>>       >>> On Thu, Sep 17, 2020 at 04:52:30PM +0300, Nikolai Kondrashov wrote:
 >>>>>>>>>>>>       >>>> Hi Christian,
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>> On 9/17/20 3:50 PM, Cristian Marussi wrote:
 >>>>>>>>>>>>       >>>>> Hi Nikolai,
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>> I work at ARM in the Kernel team and, in short, we'd like certainly to
 >>>>>>>>>>>>       >>>>> contribute our internal Kernel test results to KCIDB.
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>> Wonderful!
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>>> After having attended your LPC2020 TestMC and KernelCI/BoF, I've now cooked
 >>>>>>>>>>>>       >>>>> up some KCIDB JSON test report (seemingly valid against your KCIDB v3 schema)
 >>>>>>>>>>>>       >>>>> and I'd like to start experimenting with kci-submit (on non-production
 >>>>>>>>>>>>       >>>>> instances), so as to assess how to fit our results into your schema and maybe
 >>>>>>>>>>>>       >>>>> contribute with some new KCIDB requirements if strictly needed.
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>> Great, this is exactly what we need, welcome aboard :)
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>> Please don't hesitate to reach out on kernelci@groups.io or on #kernelci on
 >>>>>>>>>>>>       >>>> freenode.net, if you have any questions, problems, or requirements.
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>>> Is it possible to get some valid credentials and a playground instance to
 >>>>>>>>>>>>       >>>>> point at ?
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>> Absolutely, I created credentials for you and sent them in a separate message.
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>> You can use origin "arm" for the start, unless you have multiple CI systems
 >>>>>>>>>>>>       >>>> and want to differentiate them somehow in your reports.
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>> Nick
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>    Thanks !
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>> It works too ... :D
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>> https://staging.kernelci.org:3000/d/build/build?orgId=1&var-dataset=playground_kernelci04&var-id=arm:2020-07-07:d3d7689c2cc9503266cac3bc777bb4ddae2e5f2e
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>> ..quick question though....given that now I'll have to play quite a bit
 >>>>>>>>>>>>       >>> with it and see how it's best to present our data, if anything is missing, etc.,
 >>>>>>>>>>>>       >>> is there any chance (or way) that if I submit the same JSON report multiple
 >>>>>>>>>>>>       >>> times with slight differences here and there (but with the same IDs clearly)
 >>>>>>>>>>>>       >>> I'll get my DB updated in the bits I have changed: as an example I've just
 >>>>>>>>>>>>       >>> resubmitted the same report with added discovery_time and descriptions, and got
 >>>>>>>>>>>>       >>> NO errors, but I cannot see the changes in the UI (unless they have still to
 >>>>>>>>>>>>       >>> propagate...)..or maybe I can obtain the same effect by dropping my dataset
 >>>>>>>>>>>>       >>> before re-submitting ?
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>> Regards
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>> Thanks
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>> Cristian
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>>> On 9/17/20 3:50 PM, Cristian Marussi wrote:
 >>>>>>>>>>>>       >>>>> Hi Nikolai,
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>> I work at ARM in the Kernel team and, in short, we'd like certainly to
 >>>>>>>>>>>>       >>>>> contribute our internal Kernel test results to KCIDB.
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>> After having attended your LPC2020 TestMC and KernelCI/BoF, I've now cooked
 >>>>>>>>>>>>       >>>>> up some KCIDB JSON test report (seemingly valid against your KCIDB v3 schema)
 >>>>>>>>>>>>       >>>>> and I'd like to start experimenting with kci-submit (on non-production
 >>>>>>>>>>>>       >>>>> instances), so as to assess how to fit our results into your schema and maybe
 >>>>>>>>>>>>       >>>>> contribute with some new KCIDB requirements if strictly needed.
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>> Is it possible to get some valid credentials and a playground instance to
 >>>>>>>>>>>>       >>>>> point at ?
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>> Thanks
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>> Regards
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>> Cristian
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>>
 >>>>>>>>>>>>       >>>>
 >>>>>>>>>>>>       >>>
 >>>>>>>>>>>>       >>
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>       > >
 >>>>>>>>>>>>       >
 >>>>>>>>>>>>
 >>>>>>>>>>>>
 >>>>>>>>>>>>
 >>>>>>>>>>>>
 >>>>>>>>>>>>
 >>>>>>>>>>>>
 >>>>>>>>>>>
 >>>>>>>>>
 >>>>>>>>>
 >>>>>>>>>>
 >>>>>>>>>
 >>>>>>>>
 >>>>>>>>
 >>>>>>>>
 >>>>>>>>>>>>
 >>>>>>>>
 >>>>>>>
 >>>>>>
 >>>>>
 >>>>
 >>>>
 >>>>
 >>>>>>
 >>>>
 >>>
 >>
 >>
 >>
 >> 
 >>
 >>
 >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2020-12-02 12:01                       ` Cristian Marussi
  2020-12-02 13:38                         ` Nikolai Kondrashov
@ 2021-03-15  9:00                         ` Nikolai Kondrashov
  2021-03-17 19:07                           ` Cristian Marussi
  1 sibling, 1 reply; 23+ messages in thread
From: Nikolai Kondrashov @ 2021-03-15  9:00 UTC (permalink / raw)
  To: Cristian Marussi; +Cc: kernelci, broonie, basil.eljuse

Hi Cristian,

On 12/2/20 2:01 PM, Cristian Marussi wrote:
> On Wed, Dec 02, 2020 at 12:16:10PM +0200, Nikolai Kondrashov wrote:
>> Finally, at this stage we really need a breadth of data coming from
>> different CI systems, rather than their depth or precision, so we can understand
>> the problem at hand better and faster. It would do us no good to concentrate
>> on just a few, and solidify the design around them. That would make it more
>> difficult for others to join.
>>
>> You can refine and add more data afterwards.
>>
> 
> Sure, in fact, as of now I still have to ask for some changes in our reporting
> backend (which generates the original data stored in our DB and then pushed
> to you), so I have to admit the git commit hashes are partially faked (since I
> have only a git describe string to start from) and as a consequence they won't
> really be that useful for comparisons amongst different origins (given
> they don't refer to real kernel commits), BUT I thought this was NOT a
> blocking problem for now, so that I can start pushing data to KCIDB and
> then later on (once I get real full hashes on my side) I'll start pushing the
> real valid ones; does that sound good?

Is there any progress towards having the full commit hashes available?

I'm working on aggregating testing data for notification e-mails and I could
use a few samples of data which has both summarized LTP results from Red Hat's
CKI and the detailed LTP results from ARM, under the same revision.

Plus, we're moving ever closer to reaching out to developers with our data,
and it would be good to have the right hashes in your data :)
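
In case it helps: if the describe string comes from a tree (or a mirror) you
also have locally, git itself can expand it back into the full hash, since
`git describe` output is a valid revision specifier. A small sketch (the
helper and the repo path are made up):

    import subprocess

    def full_hash_from_describe(describe, repo="linux"):
        # gitrevisions(7): "<tag>-<n>-g<abbrev>" resolves to the commit
        # behind the abbreviated hash, so rev-parse yields the full SHA.
        return subprocess.check_output(
            ["git", "-C", repo, "rev-parse", "--verify",
             describe + "^{commit}"],
            text=True,
        ).strip()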

Nick


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Contributing ARM tests results to KCIDB
  2021-03-15  9:00                         ` Nikolai Kondrashov
@ 2021-03-17 19:07                           ` Cristian Marussi
  0 siblings, 0 replies; 23+ messages in thread
From: Cristian Marussi @ 2021-03-17 19:07 UTC (permalink / raw)
  To: kernelci, Nikolai.Kondrashov; +Cc: broonie, basil.eljuse

Hi Nick

sorry for the delay.

On Mon, Mar 15, 2021 at 11:00:24AM +0200, Nikolai Kondrashov via groups.io wrote:
> Hi Cristian,
> 
> On 12/2/20 2:01 PM, Cristian Marussi wrote:
> > On Wed, Dec 02, 2020 at 12:16:10PM +0200, Nikolai Kondrashov wrote:
> > > Finally, at this stage we really need a breadth of data coming from
> > > different CI systems, rather than their depth or precision, so we can understand
> > > the problem at hand better and faster. It would do us no good to concentrate
> > > on just a few, and solidify the design around them. That would make it more
> > > difficult for others to join.
> > > 
> > > You can refine and add more data afterwards.
> > > 
> > 
> > Sure, in fact, as of now I still have to ask for some changes in our reporting
> > backend (which generates the original data stored in our DB and then pushed
> > to you), so I have to admit the git commit hashes are partially faked (since I
> > have only a git describe string to start from) and as a consequence they won't
> > really be that useful for comparisons amongst different origins (given
> > they don't refer to real kernel commits), BUT I thought this was NOT a
> > blocking problem for now, so that I can start pushing data to KCIDB and
> > then later on (once I get real full hashes on my side) I'll start pushing the
> > real valid ones; does that sound good?
> 
> Is there any progress towards having the full commit hashes available?
> 

No, sorry. I'll look into this next and see if I can speed things up a bit.

Thanks

Cristian
> I'm working on aggregating testing data for notification e-mails and I could
> use a few samples of data which has both summarized LTP results from Red Hat's
> CKI and the detailed LTP results from ARM, under the same revision.
> 
> Plus, we're moving ever closer to reaching out to developers with our data,
> and it would be good to have the right hashes in your data :)
> 
> Nick
> 
> 
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-03-17 19:07 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-17 12:50 Contributing ARM tests results to KCIDB cristian.marussi
2020-09-17 13:52 ` Nikolai Kondrashov
2020-09-17 16:22   ` Cristian Marussi
2020-09-17 17:26     ` Nikolai Kondrashov
2020-09-18 15:21       ` Cristian Marussi
2020-09-18 15:30         ` Nikolai Kondrashov
2020-09-18 15:53           ` Nikolai Kondrashov
2020-09-18 16:42             ` Cristian Marussi
2020-09-18 16:57               ` Nikolai Kondrashov
2020-11-05 18:46               ` Cristian Marussi
2020-11-06 10:35                 ` Nikolai Kondrashov
2020-12-02  8:05                 ` Nikolai Kondrashov
2020-12-02  9:23                   ` Cristian Marussi
2020-12-02 10:16                     ` Nikolai Kondrashov
2020-12-02 12:01                       ` Cristian Marussi
2020-12-02 13:38                         ` Nikolai Kondrashov
2020-12-10 17:23                           ` Cristian Marussi
2020-12-10 18:17                             ` Nikolai Kondrashov
2020-12-10 20:19                               ` Cristian Marussi
2020-12-14 10:23                                 ` Nikolai Kondrashov
2021-03-15  9:00                         ` Nikolai Kondrashov
2021-03-17 19:07                           ` Cristian Marussi
2020-09-18 16:06           ` Cristian Marussi
