All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
       [not found] <2B31D2BD02000066A2327079@prv1-mh.provo.novell.com>
@ 2019-06-19  7:28 ` Jan Beulich
  2019-06-19  9:02   ` Julien Grall
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2019-06-19  7:28 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

>>> On 19.06.19 at 09:06, <osstest-admin@xenproject.org> wrote:
> branch xen-4.10-testing
> xenbranch xen-4.10-testing
> job test-armhf-armhf-xl-arndale
> testid debian-install
> 
> Tree: linux git://xenbits.xen.org/linux-pvops.git
> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: xen git://xenbits.xen.org/xen.git
> 
> *** Found and reproduced problem changeset ***
> 
>   Bug is in tree:  xen git://xenbits.xen.org/xen.git
>   Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
>   Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
>   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/138005/ 
> 
> 
>   commit 702c9146c00d65d1e9c5955335ba002505e97e09
>   Author: Julien Grall <julien.grall@arm.com>
>   Date:   Mon Apr 29 15:05:16 2019 +0100
>   
>       xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent re-ordering
>       
>       Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur
>       speculatively and out of order relative to other instructions executed
>       on the same PE."
>       
>       Add an instruction barrier to get accurate number of cycles when
>       requested in get_cycles(). For the other users of CNPCT_EL0, replace by
>       a call to get_cycles().
>       
>       This is part of XSA-295.
>       
>       Signed-off-by: Julien Grall <julien.grall@arm.com>
>       Acked-by: Stefano Stabellini <sstabellini@kernel.org>

Just in case you didn't notice this coming in.

Jan

> For bisection revision-tuple graph see:
>    
> http://logs.test-lab.xenproject.org/osstest/results/bisect/xen-4.10-testing/t 
> est-armhf-armhf-xl-arndale.debian-install.html
> Revision IDs in each graph node refer, respectively, to the Trees above.
> 
> ----------------------------------------
> Running cs-bisection-step 
> --graph-out=/home/logs/results/bisect/xen-4.10-testing/test-armhf-armhf-xl-ar
> ndale.debian-install --summary-out=tmp/138005.bisection-summary 
> --basis-template=137381 --blessings=real,real-bisect xen-4.10-testing 
> test-armhf-armhf-xl-arndale debian-install
> Searching for failure / basis pass:
>  137854 fail [host=arndale-metrocentre] / 137381 [host=arndale-bluewater] 
> 137277 [host=arndale-lakeside] 136692 [host=arndale-westfield] 136552 
> [host=arndale-bluewater] 136391 [host=arndale-lakeside] 136232 ok.
> Failure / basis pass flights: 137854 / 136232
> (tree in latest but not in basispass: seabios)
> Tree: linux git://xenbits.xen.org/linux-pvops.git
> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: xen git://xenbits.xen.org/xen.git
> Latest e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 6197b859ec77e4950e5ae0202002d6d4dbef143b
> Basis pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> f684c3f5eef4be691e137ae64e7d00521ec201de 
> 6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2 
> ab261f5ac491a0a4d65a641fc7da29b810ec0fb2
> Generating revisions with ./adhoc-revtuple-generator  
> git://xenbits.xen.org/linux-pvops.git#e64ac26749dc2c0f390caccd04274608ab31c8c
> f-e64ac26749dc2c0f390caccd04274608ab31c8cf 
> git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b5
> 18f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> git://xenbits.xen.org/osstest/ovmf.git#f684c3f5eef4be691e137ae64e7d00521ec201
> de-fc7d997c35372126823c3b0acf7b67c45cbeea36 
> git://xenbits.xen.org/qemu-xen.git#6ea4cef2bd717045ac0e84b52a5b1b7716feb0c\
>  2-8acabec966263f90ad493e4af2642947c0c43d23 
> git://xenbits.xen.org/xen.git#ab261f5ac491a0a4d65a641fc7da29b810ec0fb2-6197b8
> 59ec77e4950e5ae0202002d6d4dbef143b
> Loaded 3009 nodes in revision graph
> Searching for test results:
>  136232 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> f684c3f5eef4be691e137ae64e7d00521ec201de 
> 6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2 
> ab261f5ac491a0a4d65a641fc7da29b810ec0fb2
>  136391 [host=arndale-lakeside]
>  136552 [host=arndale-bluewater]
>  136692 [host=arndale-westfield]
>  137277 [host=arndale-lakeside]
>  137381 [host=arndale-bluewater]
>  137727 fail irrelevant
>  137854 fail e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 6197b859ec77e4950e5ae0202002d6d4dbef143b
>  137924 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> f684c3f5eef4be691e137ae64e7d00521ec201de 
> 6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2 
> ab261f5ac491a0a4d65a641fc7da29b810ec0fb2
>  137934 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> e2afc8ab59c3dc6cf368c339a26840f0869a2356 
> 04a43f76e2d73e8387bd3e3bd439ef8c6d69d361 
> ac516e89402b0c8df92b139831727ece5db700e3
>  137926 fail irrelevant
>  137944 fail e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 6197b859ec77e4950e5ae0202002d6d4dbef143b
>  137970 fail e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 7484d020f0e91b70fe3a14030a45369f959c932f
>  137951 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> 49fb605709796488589c6f282522283911a575f6 
> 04a43f76e2d73e8387bd3e3bd439ef8c6d69d361 
> adf037bba1e6af47fef8584c1ad41f424ebda01e
>  137957 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> 1631bb26ae991e530d3c96fe3161ea15144b358e 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> adf037bba1e6af47fef8584c1ad41f424ebda01e
>  138005 fail e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 702c9146c00d65d1e9c5955335ba002505e97e09
>  137983 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> c11933bda1eeb6ffc3350720bf4701d37b02211c
>  137988 fail e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 3f10c53b3ef12f770f64b914914b9f13882e9dae
>  137993 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 52220b5f437a8d03ba108e127e7d717657edf99c
>  137996 fail e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 702c9146c00d65d1e9c5955335ba002505e97e09
>  137998 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 52220b5f437a8d03ba108e127e7d717657edf99c
>  138000 fail e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 702c9146c00d65d1e9c5955335ba002505e97e09
>  138003 pass e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 52220b5f437a8d03ba108e127e7d717657edf99c
> Searching for interesting versions
>  Result found: flight 136232 (pass), for basis pass
>  Result found: flight 137854 (fail), for basis failure
>  Repro found: flight 137924 (pass), for basis pass
>  Repro found: flight 137944 (fail), for basis failure
>  0 revisions at e64ac26749dc2c0f390caccd04274608ab31c8cf 
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 
> fc7d997c35372126823c3b0acf7b67c45cbeea36 
> 8acabec966263f90ad493e4af2642947c0c43d23 
> 52220b5f437a8d03ba108e127e7d717657edf99c
> No revisions left to test, checking graph state.
>  Result found: flight 137993 (pass), for last pass
>  Result found: flight 137996 (fail), for first failure
>  Repro found: flight 137998 (pass), for last pass
>  Repro found: flight 138000 (fail), for first failure
>  Repro found: flight 138003 (pass), for last pass
>  Repro found: flight 138005 (fail), for first failure
> 
> *** Found and reproduced problem changeset ***
> 
>   Bug is in tree:  xen git://xenbits.xen.org/xen.git
>   Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
>   Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
>   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/138005/ 
> 
> 
>   commit 702c9146c00d65d1e9c5955335ba002505e97e09
>   Author: Julien Grall <julien.grall@arm.com>
>   Date:   Mon Apr 29 15:05:16 2019 +0100
>   
>       xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent re-ordering
>       
>       Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur
>       speculatively and out of order relative to other instructions executed
>       on the same PE."
>       
>       Add an instruction barrier to get accurate number of cycles when
>       requested in get_cycles(). For the other users of CNPCT_EL0, replace 
> by
>       a call to get_cycles().
>       
>       This is part of XSA-295.
>       
>       Signed-off-by: Julien Grall <julien.grall@arm.com>
>       Acked-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> pnmtopng: 203 colors found
> Revision graph left in 
> /home/logs/results/bisect/xen-4.10-testing/test-armhf-armhf-xl-arndale.debian
> -install.{dot,ps,png,html,svg}.
> ----------------------------------------
> 138005: tolerable ALL FAIL
> 
> flight 138005 xen-4.10-testing real-bisect [real]
> http://logs.test-lab.xenproject.org/osstest/logs/138005/ 
> 
> Failures :-/ but no regressions.
> 
> Tests which did not succeed,
> including tests which could not be run:
>  test-armhf-armhf-xl-arndale  10 debian-install          fail baseline 
> untested
> 
> 
> jobs:
>  test-armhf-armhf-xl-arndale                                  fail    
> 
> 
> ------------------------------------------------------------
> sg-report-flight on osstest.test-lab.xenproject.org
> logs: /home/logs/logs
> images: /home/logs/images
> 
> Logs, config files, etc. are available at
>     http://logs.test-lab.xenproject.org/osstest/logs 
> 
> Explanation of these reports, and of osstest in general, is at
>     
> http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master 
>     http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master 
> 
> Test harness code can be found at
>     http://xenbits.xen.org/gitweb?p=osstest.git;a=summary 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org 
> https://lists.xenproject.org/mailman/listinfo/xen-devel 




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
  2019-06-19  7:28 ` [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale Jan Beulich
@ 2019-06-19  9:02   ` Julien Grall
  2019-06-19 15:59     ` Julien Grall
  0 siblings, 1 reply; 8+ messages in thread
From: Julien Grall @ 2019-06-19  9:02 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini; +Cc: xen-devel

Hi,

On 6/19/19 8:28 AM, Jan Beulich wrote:
>>>> On 19.06.19 at 09:06, <osstest-admin@xenproject.org> wrote:
>> branch xen-4.10-testing
>> xenbranch xen-4.10-testing
>> job test-armhf-armhf-xl-arndale
>> testid debian-install
>>
>> Tree: linux git://xenbits.xen.org/linux-pvops.git
>> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
>> Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
>> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
>> Tree: xen git://xenbits.xen.org/xen.git
>>
>> *** Found and reproduced problem changeset ***
>>
>>    Bug is in tree:  xen git://xenbits.xen.org/xen.git
>>    Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
>>    Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
>>    Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/138005/
>>
>>
>>    commit 702c9146c00d65d1e9c5955335ba002505e97e09
>>    Author: Julien Grall <julien.grall@arm.com>
>>    Date:   Mon Apr 29 15:05:16 2019 +0100
>>    
>>        xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent re-ordering
>>        
>>        Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur
>>        speculatively and out of order relative to other instructions executed
>>        on the same PE."
>>        
>>        Add an instruction barrier to get accurate number of cycles when
>>        requested in get_cycles(). For the other users of CNPCT_EL0, replace by
>>        a call to get_cycles().
>>        
>>        This is part of XSA-295.
>>        
>>        Signed-off-by: Julien Grall <julien.grall@arm.com>
>>        Acked-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> Just in case you didn't notice this coming in.

I saw it this morning. But I fail to understand how the instruction 
barrier will result in a timeout during the guest installation...

The more this only doesn't seem to appear on staging-4.12.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
  2019-06-19  9:02   ` Julien Grall
@ 2019-06-19 15:59     ` Julien Grall
  2019-06-19 16:56       ` Stefano Stabellini
  0 siblings, 1 reply; 8+ messages in thread
From: Julien Grall @ 2019-06-19 15:59 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini; +Cc: xen-devel, Ian Jackson

Answering to myself.

On 19/06/2019 10:02, Julien Grall wrote:
> Hi,
> 
> On 6/19/19 8:28 AM, Jan Beulich wrote:
>>>>> On 19.06.19 at 09:06, <osstest-admin@xenproject.org> wrote:
>>> branch xen-4.10-testing
>>> xenbranch xen-4.10-testing
>>> job test-armhf-armhf-xl-arndale
>>> testid debian-install
>>>
>>> Tree: linux git://xenbits.xen.org/linux-pvops.git
>>> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
>>> Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
>>> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
>>> Tree: xen git://xenbits.xen.org/xen.git
>>>
>>> *** Found and reproduced problem changeset ***
>>>
>>>    Bug is in tree:  xen git://xenbits.xen.org/xen.git
>>>    Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
>>>    Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
>>>    Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/138005/
>>>
>>>
>>>    commit 702c9146c00d65d1e9c5955335ba002505e97e09
>>>    Author: Julien Grall <julien.grall@arm.com>
>>>    Date:   Mon Apr 29 15:05:16 2019 +0100
>>>        xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent re-ordering
>>>        Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur
>>>        speculatively and out of order relative to other instructions executed
>>>        on the same PE."
>>>        Add an instruction barrier to get accurate number of cycles when
>>>        requested in get_cycles(). For the other users of CNPCT_EL0, replace by
>>>        a call to get_cycles().
>>>        This is part of XSA-295.
>>>        Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>        Acked-by: Stefano Stabellini <sstabellini@kernel.org>
>>
>> Just in case you didn't notice this coming in.
> 
> I saw it this morning. But I fail to understand how the instruction barrier will 
> result in a timeout during the guest installation...

One thing to keep in mind here is isb() will also carry a compiler barrier. So 
the resulting binary may be different as the compiler may re-order the load.

One possibility is there are a missing data barrier in other part of the code. 
The isb() would expose it.

> 
> The more this only doesn't seem to appear on staging-4.12.

@Stefano, as we know staging-4.12 is working, one way to debug this is to try to 
reproduce in different commit between staging-4.12 with the isb() patch applied.

Hopefully this should test us where the bug was fixed.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
  2019-06-19 15:59     ` Julien Grall
@ 2019-06-19 16:56       ` Stefano Stabellini
  2019-06-20 17:24         ` Julien Grall
  0 siblings, 1 reply; 8+ messages in thread
From: Stefano Stabellini @ 2019-06-19 16:56 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, Ian Jackson, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 3822 bytes --]

On Wed, 19 Jun 2019, Julien Grall wrote:
> > On 6/19/19 8:28 AM, Jan Beulich wrote:
> > > > > > On 19.06.19 at 09:06, <osstest-admin@xenproject.org> wrote:
> > > > branch xen-4.10-testing
> > > > xenbranch xen-4.10-testing
> > > > job test-armhf-armhf-xl-arndale
> > > > testid debian-install
> > > > 
> > > > Tree: linux git://xenbits.xen.org/linux-pvops.git
> > > > Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> > > > Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
> > > > Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> > > > Tree: xen git://xenbits.xen.org/xen.git
> > > > 
> > > > *** Found and reproduced problem changeset ***
> > > > 
> > > >    Bug is in tree:  xen git://xenbits.xen.org/xen.git
> > > >    Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
> > > >    Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
> > > >    Last fail repro:
> > > > http://logs.test-lab.xenproject.org/osstest/logs/138005/
> > > > 
> > > > 
> > > >    commit 702c9146c00d65d1e9c5955335ba002505e97e09
> > > >    Author: Julien Grall <julien.grall@arm.com>
> > > >    Date:   Mon Apr 29 15:05:16 2019 +0100
> > > >        xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent
> > > > re-ordering
> > > >        Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur
> > > >        speculatively and out of order relative to other instructions
> > > > executed
> > > >        on the same PE."
> > > >        Add an instruction barrier to get accurate number of cycles when
> > > >        requested in get_cycles(). For the other users of CNPCT_EL0,
> > > > replace by
> > > >        a call to get_cycles().
> > > >        This is part of XSA-295.
> > > >        Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > >        Acked-by: Stefano Stabellini <sstabellini@kernel.org>
> > > 
> > > Just in case you didn't notice this coming in.
> > 
> > I saw it this morning. But I fail to understand how the instruction barrier
> > will result in a timeout during the guest installation...
> 
> One thing to keep in mind here is isb() will also carry a compiler barrier. So
> the resulting binary may be different as the compiler may re-order the load.
> 
> One possibility is there are a missing data barrier in other part of the code.
> The isb() would expose it.
> 
> > 
> > The more this only doesn't seem to appear on staging-4.12.
> 
> @Stefano, as we know staging-4.12 is working, one way to debug this is to try
> to reproduce in different commit between staging-4.12 with the isb() patch
> applied.
> 
> Hopefully this should test us where the bug was fixed.

Ian, I noticed the bisector hasn't managed to pinpoint a commit on
staging-4.11 yet. Obviously, we suspect it is the same patch ("xen/arm:
Add an isb() before reading CNTPCT_EL0 to prevent re-ordering") causing
the issue, but it would be really good to confirm.

Could you please schedule a bisector run on staging-4.11, ideally on
Arndale hardware (the same used for the bisection on 4.10 here)?


After we confirm that 4.11 is suffering from the same issue, we could
try to identify which commit "fixed" the problem between 4.11 and 4.12,
as 4.12 passed the tests yesterday. I am getting a bit ahead of myself
here, but I would love if we could use the bisector to spot the "fix"
somehow, maybe preparing a special branch for the bisector. The special
branch could start from the common root between staging-4.11 and
staging-4.12, which is tag 4.11.0-rc6, apply the patch that breaks, then
apply the other 4.12 commits, one of them we suspect is the fix we
need:

4.11.0-rc6 | CNTPCT_EL0 patch | other 4.12 commits
 ^               ^
 |               broken patch
 current common root between staging-4.12 and staging-4.11

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
  2019-06-19 16:56       ` Stefano Stabellini
@ 2019-06-20 17:24         ` Julien Grall
  2019-06-20 23:55           ` Stefano Stabellini
  2019-06-21  8:58           ` Jan Beulich
  0 siblings, 2 replies; 8+ messages in thread
From: Julien Grall @ 2019-06-20 17:24 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Ian Jackson, Jan Beulich

Hi,

On 6/19/19 5:56 PM, Stefano Stabellini wrote:
> On Wed, 19 Jun 2019, Julien Grall wrote:
>>> On 6/19/19 8:28 AM, Jan Beulich wrote:
>>>>>>> On 19.06.19 at 09:06, <osstest-admin@xenproject.org> wrote:
>>>>> branch xen-4.10-testing
>>>>> xenbranch xen-4.10-testing
>>>>> job test-armhf-armhf-xl-arndale
>>>>> testid debian-install
>>>>>
>>>>> Tree: linux git://xenbits.xen.org/linux-pvops.git
>>>>> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
>>>>> Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
>>>>> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
>>>>> Tree: xen git://xenbits.xen.org/xen.git
>>>>>
>>>>> *** Found and reproduced problem changeset ***
>>>>>
>>>>>     Bug is in tree:  xen git://xenbits.xen.org/xen.git
>>>>>     Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
>>>>>     Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
>>>>>     Last fail repro:
>>>>> http://logs.test-lab.xenproject.org/osstest/logs/138005/
>>>>>
>>>>>
>>>>>     commit 702c9146c00d65d1e9c5955335ba002505e97e09
>>>>>     Author: Julien Grall <julien.grall@arm.com>
>>>>>     Date:   Mon Apr 29 15:05:16 2019 +0100
>>>>>         xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent
>>>>> re-ordering
>>>>>         Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur
>>>>>         speculatively and out of order relative to other instructions
>>>>> executed
>>>>>         on the same PE."
>>>>>         Add an instruction barrier to get accurate number of cycles when
>>>>>         requested in get_cycles(). For the other users of CNPCT_EL0,
>>>>> replace by
>>>>>         a call to get_cycles().
>>>>>         This is part of XSA-295.
>>>>>         Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>>>         Acked-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>
>>>> Just in case you didn't notice this coming in.
>>>
>>> I saw it this morning. But I fail to understand how the instruction barrier
>>> will result in a timeout during the guest installation...
>>
>> One thing to keep in mind here is isb() will also carry a compiler barrier. So
>> the resulting binary may be different as the compiler may re-order the load.
>>
>> One possibility is there are a missing data barrier in other part of the code.
>> The isb() would expose it.
>>
>>>
>>> The more this only doesn't seem to appear on staging-4.12.
>>
>> @Stefano, as we know staging-4.12 is working, one way to debug this is to try
>> to reproduce in different commit between staging-4.12 with the isb() patch
>> applied.
>>
>> Hopefully this should test us where the bug was fixed.
> 
> Ian, I noticed the bisector hasn't managed to pinpoint a commit on
> staging-4.11 yet. Obviously, we suspect it is the same patch ("xen/arm:
> Add an isb() before reading CNTPCT_EL0 to prevent re-ordering") causing
> the issue, but it would be really good to confirm.
> 
> Could you please schedule a bisector run on staging-4.11, ideally on
> Arndale hardware (the same used for the bisection on 4.10 here)?
> 
> 
> After we confirm that 4.11 is suffering from the same issue, we could
> try to identify which commit "fixed" the problem between 4.11 and 4.12,
> as 4.12 passed the tests yesterday. I am getting a bit ahead of myself
> here, but I would love if we could use the bisector to spot the "fix"
> somehow, maybe preparing a special branch for the bisector. The special
> branch could start from the common root between staging-4.11 and
> staging-4.12, which is tag 4.11.0-rc6, apply the patch that breaks, then
> apply the other 4.12 commits, one of them we suspect is the fix we
> need:
> 
> 4.11.0-rc6 | CNTPCT_EL0 patch | other 4.12 commits
>   ^               ^
>   |               broken patch
>   current common root between staging-4.12 and staging-4.11
> 

Actually I may have found the error. I feel quite ashamed I didn't spot 
this during review and when the bisector fingered it.

staging-4.11 and staging.4.12 didn't have get_cycles implemented (i.e it 
returned 0). During the backport, get_cycles() got suddenly implemented 
(aside the isb()) so it now returns the number of cycles.

However, before commit da3d55ae67 "console: avoid printing no or null 
time stamps", cycles_t (return type of get_cycles) was unsigned long. On 
arm32 unsigned long is 32-bit and therefore does not cover the full 
counter (64-bit). So the number of cycles will be truncated leading to 
all sort of timing issue.

The correct fix is to switch cycles_t from unsigned long to uint64_t. I 
think we would need this to be backported for xen 4.9 and 4.8 as well.

I will send a fix.

This is the second instance where backport gone wrong in this XSA. 
Stefano, when you have a clash during backport, can you make sure to 
investigate the problem and also write in the commit message what you 
changed?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
  2019-06-20 17:24         ` Julien Grall
@ 2019-06-20 23:55           ` Stefano Stabellini
  2019-06-21  8:58           ` Jan Beulich
  1 sibling, 0 replies; 8+ messages in thread
From: Stefano Stabellini @ 2019-06-20 23:55 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, Ian Jackson, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 5789 bytes --]

On Thu, 20 Jun 2019, Julien Grall wrote:
> On 6/19/19 5:56 PM, Stefano Stabellini wrote:
> > On Wed, 19 Jun 2019, Julien Grall wrote:
> > > > On 6/19/19 8:28 AM, Jan Beulich wrote:
> > > > > > > > On 19.06.19 at 09:06, <osstest-admin@xenproject.org> wrote:
> > > > > > branch xen-4.10-testing
> > > > > > xenbranch xen-4.10-testing
> > > > > > job test-armhf-armhf-xl-arndale
> > > > > > testid debian-install
> > > > > > 
> > > > > > Tree: linux git://xenbits.xen.org/linux-pvops.git
> > > > > > Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> > > > > > Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
> > > > > > Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> > > > > > Tree: xen git://xenbits.xen.org/xen.git
> > > > > > 
> > > > > > *** Found and reproduced problem changeset ***
> > > > > > 
> > > > > >     Bug is in tree:  xen git://xenbits.xen.org/xen.git
> > > > > >     Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
> > > > > >     Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
> > > > > >     Last fail repro:
> > > > > > http://logs.test-lab.xenproject.org/osstest/logs/138005/
> > > > > > 
> > > > > > 
> > > > > >     commit 702c9146c00d65d1e9c5955335ba002505e97e09
> > > > > >     Author: Julien Grall <julien.grall@arm.com>
> > > > > >     Date:   Mon Apr 29 15:05:16 2019 +0100
> > > > > >         xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent
> > > > > > re-ordering
> > > > > >         Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can
> > > > > > occur
> > > > > >         speculatively and out of order relative to other
> > > > > > instructions
> > > > > > executed
> > > > > >         on the same PE."
> > > > > >         Add an instruction barrier to get accurate number of cycles
> > > > > > when
> > > > > >         requested in get_cycles(). For the other users of CNPCT_EL0,
> > > > > > replace by
> > > > > >         a call to get_cycles().
> > > > > >         This is part of XSA-295.
> > > > > >         Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > > > >         Acked-by: Stefano Stabellini <sstabellini@kernel.org>
> > > > > 
> > > > > Just in case you didn't notice this coming in.
> > > > 
> > > > I saw it this morning. But I fail to understand how the instruction
> > > > barrier
> > > > will result in a timeout during the guest installation...
> > > 
> > > One thing to keep in mind here is isb() will also carry a compiler
> > > barrier. So
> > > the resulting binary may be different as the compiler may re-order the
> > > load.
> > > 
> > > One possibility is there are a missing data barrier in other part of the
> > > code.
> > > The isb() would expose it.
> > > 
> > > > 
> > > > The more this only doesn't seem to appear on staging-4.12.
> > > 
> > > @Stefano, as we know staging-4.12 is working, one way to debug this is to
> > > try
> > > to reproduce in different commit between staging-4.12 with the isb() patch
> > > applied.
> > > 
> > > Hopefully this should test us where the bug was fixed.
> > 
> > Ian, I noticed the bisector hasn't managed to pinpoint a commit on
> > staging-4.11 yet. Obviously, we suspect it is the same patch ("xen/arm:
> > Add an isb() before reading CNTPCT_EL0 to prevent re-ordering") causing
> > the issue, but it would be really good to confirm.
> > 
> > Could you please schedule a bisector run on staging-4.11, ideally on
> > Arndale hardware (the same used for the bisection on 4.10 here)?
> > 
> > 
> > After we confirm that 4.11 is suffering from the same issue, we could
> > try to identify which commit "fixed" the problem between 4.11 and 4.12,
> > as 4.12 passed the tests yesterday. I am getting a bit ahead of myself
> > here, but I would love if we could use the bisector to spot the "fix"
> > somehow, maybe preparing a special branch for the bisector. The special
> > branch could start from the common root between staging-4.11 and
> > staging-4.12, which is tag 4.11.0-rc6, apply the patch that breaks, then
> > apply the other 4.12 commits, one of them we suspect is the fix we
> > need:
> > 
> > 4.11.0-rc6 | CNTPCT_EL0 patch | other 4.12 commits
> >   ^               ^
> >   |               broken patch
> >   current common root between staging-4.12 and staging-4.11
> > 
> 
> Actually I may have found the error. I feel quite ashamed I didn't spot this
> during review and when the bisector fingered it.
> 
> staging-4.11 and staging.4.12 didn't have get_cycles implemented (i.e it
> returned 0). During the backport, get_cycles() got suddenly implemented (aside
> the isb()) so it now returns the number of cycles.
> 
> However, before commit da3d55ae67 "console: avoid printing no or null time
> stamps", cycles_t (return type of get_cycles) was unsigned long. On arm32
> unsigned long is 32-bit and therefore does not cover the full counter
> (64-bit). So the number of cycles will be truncated leading to all sort of
> timing issue.
> 
> The correct fix is to switch cycles_t from unsigned long to uint64_t. I think
> we would need this to be backported for xen 4.9 and 4.8 as well.
> 
> I will send a fix.

Thank you, I am glad it got resolved quicker than expected.


> This is the second instance where backport gone wrong in this XSA. Stefano,
> when you have a clash during backport, can you make sure to investigate the
> problem and also write in the commit message what you changed?

Investigation is not bulletproof, in this case the issue slipped through
several reviews by multiple people. But one thing we can do is to adopt
the Linux common practice of writing the changes during backports and
merges between square brackets, such as:

  [fix blah blah blah]

It would make it a bit easier to spot mistakes.

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
  2019-06-20 17:24         ` Julien Grall
  2019-06-20 23:55           ` Stefano Stabellini
@ 2019-06-21  8:58           ` Jan Beulich
  2019-06-21  9:14             ` Julien Grall
  1 sibling, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2019-06-21  8:58 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Ian Jackson

>>> On 20.06.19 at 19:24, <julien.grall@arm.com> wrote:
> Actually I may have found the error. I feel quite ashamed I didn't spot 
> this during review and when the bisector fingered it.
> 
> staging-4.11 and staging.4.12 didn't have get_cycles implemented (i.e it 
> returned 0). During the backport, get_cycles() got suddenly implemented 
> (aside the isb()) so it now returns the number of cycles.

Stefano, how can this be a valid backport under the given title? The
(imo) only correct way of backporting that hunk would have been to
simply drop it, adding isb() instead of the switch to call the function
in the two other places.

To both of you: How certain are you that the subsequent type
change is really all that's needed, and that the sudden change in
behavior of get_cycles() won't have other undue side effects?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
  2019-06-21  8:58           ` Jan Beulich
@ 2019-06-21  9:14             ` Julien Grall
  0 siblings, 0 replies; 8+ messages in thread
From: Julien Grall @ 2019-06-21  9:14 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini; +Cc: xen-devel, Ian Jackson

Hi Jan,

On 21/06/2019 09:58, Jan Beulich wrote:
>>>> On 20.06.19 at 19:24, <julien.grall@arm.com> wrote:
>> Actually I may have found the error. I feel quite ashamed I didn't spot
>> this during review and when the bisector fingered it.
>>
>> staging-4.11 and staging.4.12 didn't have get_cycles implemented (i.e it
>> returned 0). During the backport, get_cycles() got suddenly implemented
>> (aside the isb()) so it now returns the number of cycles.
> 
> Stefano, how can this be a valid backport under the given title? The
> (imo) only correct way of backporting that hunk would have been to
> simply drop it, adding isb() instead of the switch to call the function
> in the two other places.
> 
> To both of you: How certain are you that the subsequent type
> change is really all that's needed, and that the sudden change in
> behavior of get_cycles() won't have other undue side effects?

Here the current callers:

42sh> ack get_cycles

include/asm-arm/time.h
12:static inline cycles_t get_cycles (void)

include/asm-x86/time.h
29:static inline cycles_t get_cycles(void)

arch/arm/time.c
152:    boot_count = get_cycles();
193:    uint64_t ticks = get_cycles() - boot_count;

arch/x86/irq.c
898:        tsc_in = tb_init_done ? get_cycles() : 0;
900:        TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
921:        tsc_in = tb_init_done ? get_cycles() : 0;
923:        TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());

common/tmem.c
275:    pgp->timestamp = get_cycles();
389:    life = get_cycles() - pgp->timestamp;

common/keyhandler.c
389:    per_cpu(read_cycles_time, cpu) = get_cycles();
412:    per_cpu(read_cycles_time, cpu) = get_cycles();

common/trace.c
595:        u64 tsc = (u64)get_cycles();
778:            this_cpu(lost_records_first_tsc)=(u64)get_cycles();

The two callers we care the most are in arch/arm/time.c. They were added with 
the patch backported and I know this is fine because the value before was 64-bit 
as well. So there are no change except for the isb() here.

I am not sure anyone ever tested/used tmem.c on Arm (and it is dropped on recent 
release). So that's not going to make much difference.

common/keyhandler.c, this is mostly for debugging.

common/trace.c does not matter as tracing does not work on Arm before 4.12.

So I am pretty confident that this change will not provoke more side effects 
than an isb() directly in the code.

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-06-21  9:14 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <2B31D2BD02000066A2327079@prv1-mh.provo.novell.com>
2019-06-19  7:28 ` [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale Jan Beulich
2019-06-19  9:02   ` Julien Grall
2019-06-19 15:59     ` Julien Grall
2019-06-19 16:56       ` Stefano Stabellini
2019-06-20 17:24         ` Julien Grall
2019-06-20 23:55           ` Stefano Stabellini
2019-06-21  8:58           ` Jan Beulich
2019-06-21  9:14             ` Julien Grall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.