xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Julien Grall <julien.grall@arm.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
Date: Thu, 20 Jun 2019 18:24:27 +0100	[thread overview]
Message-ID: <f1dbb360-1946-83c8-128e-caf179b5681d@arm.com> (raw)
In-Reply-To: <alpine.DEB.2.21.1906190943500.2072@sstabellini-ThinkPad-T480s>

Hi,

On 6/19/19 5:56 PM, Stefano Stabellini wrote:
> On Wed, 19 Jun 2019, Julien Grall wrote:
>>> On 6/19/19 8:28 AM, Jan Beulich wrote:
>>>>>>> On 19.06.19 at 09:06, <osstest-admin@xenproject.org> wrote:
>>>>> branch xen-4.10-testing
>>>>> xenbranch xen-4.10-testing
>>>>> job test-armhf-armhf-xl-arndale
>>>>> testid debian-install
>>>>>
>>>>> Tree: linux git://xenbits.xen.org/linux-pvops.git
>>>>> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
>>>>> Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
>>>>> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
>>>>> Tree: xen git://xenbits.xen.org/xen.git
>>>>>
>>>>> *** Found and reproduced problem changeset ***
>>>>>
>>>>>     Bug is in tree:  xen git://xenbits.xen.org/xen.git
>>>>>     Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
>>>>>     Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
>>>>>     Last fail repro:
>>>>> http://logs.test-lab.xenproject.org/osstest/logs/138005/
>>>>>
>>>>>
>>>>>     commit 702c9146c00d65d1e9c5955335ba002505e97e09
>>>>>     Author: Julien Grall <julien.grall@arm.com>
>>>>>     Date:   Mon Apr 29 15:05:16 2019 +0100
>>>>>         xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent
>>>>> re-ordering
>>>>>         Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur
>>>>>         speculatively and out of order relative to other instructions
>>>>> executed
>>>>>         on the same PE."
>>>>>         Add an instruction barrier to get accurate number of cycles when
>>>>>         requested in get_cycles(). For the other users of CNPCT_EL0,
>>>>> replace by
>>>>>         a call to get_cycles().
>>>>>         This is part of XSA-295.
>>>>>         Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>>>         Acked-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>
>>>> Just in case you didn't notice this coming in.
>>>
>>> I saw it this morning. But I fail to understand how the instruction barrier
>>> will result in a timeout during the guest installation...
>>
>> One thing to keep in mind here is isb() will also carry a compiler barrier. So
>> the resulting binary may be different as the compiler may re-order the load.
>>
>> One possibility is there are a missing data barrier in other part of the code.
>> The isb() would expose it.
>>
>>>
>>> The more this only doesn't seem to appear on staging-4.12.
>>
>> @Stefano, as we know staging-4.12 is working, one way to debug this is to try
>> to reproduce in different commit between staging-4.12 with the isb() patch
>> applied.
>>
>> Hopefully this should test us where the bug was fixed.
> 
> Ian, I noticed the bisector hasn't managed to pinpoint a commit on
> staging-4.11 yet. Obviously, we suspect it is the same patch ("xen/arm:
> Add an isb() before reading CNTPCT_EL0 to prevent re-ordering") causing
> the issue, but it would be really good to confirm.
> 
> Could you please schedule a bisector run on staging-4.11, ideally on
> Arndale hardware (the same used for the bisection on 4.10 here)?
> 
> 
> After we confirm that 4.11 is suffering from the same issue, we could
> try to identify which commit "fixed" the problem between 4.11 and 4.12,
> as 4.12 passed the tests yesterday. I am getting a bit ahead of myself
> here, but I would love if we could use the bisector to spot the "fix"
> somehow, maybe preparing a special branch for the bisector. The special
> branch could start from the common root between staging-4.11 and
> staging-4.12, which is tag 4.11.0-rc6, apply the patch that breaks, then
> apply the other 4.12 commits, one of them we suspect is the fix we
> need:
> 
> 4.11.0-rc6 | CNTPCT_EL0 patch | other 4.12 commits
>   ^               ^
>   |               broken patch
>   current common root between staging-4.12 and staging-4.11
> 

Actually I may have found the error. I feel quite ashamed I didn't spot 
this during review and when the bisector fingered it.

staging-4.11 and staging.4.12 didn't have get_cycles implemented (i.e it 
returned 0). During the backport, get_cycles() got suddenly implemented 
(aside the isb()) so it now returns the number of cycles.

However, before commit da3d55ae67 "console: avoid printing no or null 
time stamps", cycles_t (return type of get_cycles) was unsigned long. On 
arm32 unsigned long is 32-bit and therefore does not cover the full 
counter (64-bit). So the number of cycles will be truncated leading to 
all sort of timing issue.

The correct fix is to switch cycles_t from unsigned long to uint64_t. I 
think we would need this to be backported for xen 4.9 and 4.8 as well.

I will send a fix.

This is the second instance where backport gone wrong in this XSA. 
Stefano, when you have a clash during backport, can you make sure to 
investigate the problem and also write in the commit message what you 
changed?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  reply	other threads:[~2019-06-20 17:24 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <2B31D2BD02000066A2327079@prv1-mh.provo.novell.com>
2019-06-19  7:28 ` [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale Jan Beulich
2019-06-19  9:02   ` Julien Grall
2019-06-19 15:59     ` Julien Grall
2019-06-19 16:56       ` Stefano Stabellini
2019-06-20 17:24         ` Julien Grall [this message]
2019-06-20 23:55           ` Stefano Stabellini
2019-06-21  8:58           ` Jan Beulich
2019-06-21  9:14             ` Julien Grall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f1dbb360-1946-83c8-128e-caf179b5681d@arm.com \
    --to=julien.grall@arm.com \
    --cc=JBeulich@suse.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).