All of lore.kernel.org
 help / color / mirror / Atom feed
From: Julien Grall <julien.grall@arm.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: Re: [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale
Date: Thu, 20 Jun 2019 18:24:27 +0100	[thread overview]
Message-ID: <f1dbb360-1946-83c8-128e-caf179b5681d@arm.com> (raw)
In-Reply-To: <alpine.DEB.2.21.1906190943500.2072@sstabellini-ThinkPad-T480s>

Hi,

On 6/19/19 5:56 PM, Stefano Stabellini wrote:
> On Wed, 19 Jun 2019, Julien Grall wrote:
>>> On 6/19/19 8:28 AM, Jan Beulich wrote:
>>>>>>> On 19.06.19 at 09:06, <osstest-admin@xenproject.org> wrote:
>>>>> branch xen-4.10-testing
>>>>> xenbranch xen-4.10-testing
>>>>> job test-armhf-armhf-xl-arndale
>>>>> testid debian-install
>>>>>
>>>>> Tree: linux git://xenbits.xen.org/linux-pvops.git
>>>>> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
>>>>> Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
>>>>> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
>>>>> Tree: xen git://xenbits.xen.org/xen.git
>>>>>
>>>>> *** Found and reproduced problem changeset ***
>>>>>
>>>>>     Bug is in tree:  xen git://xenbits.xen.org/xen.git
>>>>>     Bug introduced:  702c9146c00d65d1e9c5955335ba002505e97e09
>>>>>     Bug not present: 52220b5f437a8d03ba108e127e7d717657edf99c
>>>>>     Last fail repro:
>>>>> http://logs.test-lab.xenproject.org/osstest/logs/138005/
>>>>>
>>>>>
>>>>>     commit 702c9146c00d65d1e9c5955335ba002505e97e09
>>>>>     Author: Julien Grall <julien.grall@arm.com>
>>>>>     Date:   Mon Apr 29 15:05:16 2019 +0100
>>>>>         xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent
>>>>> re-ordering
>>>>>         Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur
>>>>>         speculatively and out of order relative to other instructions
>>>>> executed
>>>>>         on the same PE."
>>>>>         Add an instruction barrier to get accurate number of cycles when
>>>>>         requested in get_cycles(). For the other users of CNPCT_EL0,
>>>>> replace by
>>>>>         a call to get_cycles().
>>>>>         This is part of XSA-295.
>>>>>         Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>>>         Acked-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>
>>>> Just in case you didn't notice this coming in.
>>>
>>> I saw it this morning. But I fail to understand how the instruction barrier
>>> will result in a timeout during the guest installation...
>>
>> One thing to keep in mind here is isb() will also carry a compiler barrier. So
>> the resulting binary may be different as the compiler may re-order the load.
>>
>> One possibility is there are a missing data barrier in other part of the code.
>> The isb() would expose it.
>>
>>>
>>> The more this only doesn't seem to appear on staging-4.12.
>>
>> @Stefano, as we know staging-4.12 is working, one way to debug this is to try
>> to reproduce in different commit between staging-4.12 with the isb() patch
>> applied.
>>
>> Hopefully this should test us where the bug was fixed.
> 
> Ian, I noticed the bisector hasn't managed to pinpoint a commit on
> staging-4.11 yet. Obviously, we suspect it is the same patch ("xen/arm:
> Add an isb() before reading CNTPCT_EL0 to prevent re-ordering") causing
> the issue, but it would be really good to confirm.
> 
> Could you please schedule a bisector run on staging-4.11, ideally on
> Arndale hardware (the same used for the bisection on 4.10 here)?
> 
> 
> After we confirm that 4.11 is suffering from the same issue, we could
> try to identify which commit "fixed" the problem between 4.11 and 4.12,
> as 4.12 passed the tests yesterday. I am getting a bit ahead of myself
> here, but I would love if we could use the bisector to spot the "fix"
> somehow, maybe preparing a special branch for the bisector. The special
> branch could start from the common root between staging-4.11 and
> staging-4.12, which is tag 4.11.0-rc6, apply the patch that breaks, then
> apply the other 4.12 commits, one of them we suspect is the fix we
> need:
> 
> 4.11.0-rc6 | CNTPCT_EL0 patch | other 4.12 commits
>   ^               ^
>   |               broken patch
>   current common root between staging-4.12 and staging-4.11
> 

Actually I may have found the error. I feel quite ashamed I didn't spot 
this during review and when the bisector fingered it.

staging-4.11 and staging.4.12 didn't have get_cycles implemented (i.e it 
returned 0). During the backport, get_cycles() got suddenly implemented 
(aside the isb()) so it now returns the number of cycles.

However, before commit da3d55ae67 "console: avoid printing no or null 
time stamps", cycles_t (return type of get_cycles) was unsigned long. On 
arm32 unsigned long is 32-bit and therefore does not cover the full 
counter (64-bit). So the number of cycles will be truncated leading to 
all sort of timing issue.

The correct fix is to switch cycles_t from unsigned long to uint64_t. I 
think we would need this to be backported for xen 4.9 and 4.8 as well.

I will send a fix.

This is the second instance where backport gone wrong in this XSA. 
Stefano, when you have a clash during backport, can you make sure to 
investigate the problem and also write in the commit message what you 
changed?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  reply	other threads:[~2019-06-20 17:24 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <2B31D2BD02000066A2327079@prv1-mh.provo.novell.com>
2019-06-19  7:28 ` [Xen-devel] Fwd: [xen-4.10-testing bisection] complete test-armhf-armhf-xl-arndale Jan Beulich
2019-06-19  9:02   ` Julien Grall
2019-06-19 15:59     ` Julien Grall
2019-06-19 16:56       ` Stefano Stabellini
2019-06-20 17:24         ` Julien Grall [this message]
2019-06-20 23:55           ` Stefano Stabellini
2019-06-21  8:58           ` Jan Beulich
2019-06-21  9:14             ` Julien Grall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f1dbb360-1946-83c8-128e-caf179b5681d@arm.com \
    --to=julien.grall@arm.com \
    --cc=JBeulich@suse.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.