On Thu, 3 Dec 2020, Wei Chen wrote: > Hi Julien, > > > -----Original Message----- > > From: Julien Grall > > Sent: 2020年12月3日 2:11 > > To: Wei Chen ; Stefano Stabellini > > Cc: Penny Zheng ; xen-devel@lists.xenproject.org; > > Andre Przywara ; Bertrand Marquis > > ; Kaly Xin ; nd > > > > Subject: Re: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround > > > > > > > > On 26/11/2020 11:27, Wei Chen wrote: > > > Hi Julien, > > > > Hi Wei, > > > > >> -----Original Message----- > > >> From: Julien Grall > > >> Sent: 2020年11月26日 18:55 > > >> To: Wei Chen ; Stefano Stabellini > > > > >> Cc: Penny Zheng ; xen-devel@lists.xenproject.org; > > >> Andre Przywara ; Bertrand Marquis > > >> ; Kaly Xin ; nd > > >> > > >> Subject: Re: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround > > >> > > >> Hi Wei, > > >> > > >> Your e-mail font seems to be different to the usual plain text one. Are > > >> you sending the e-mail using HTML by any chance? > > >> > > > > > > It's strange, I always use the plain text. > > > > Maybe exchange decided to mangle the e-mail :). Anyway, this new message > > looks fine. > > > > > > > >> On 26/11/2020 02:07, Wei Chen wrote: > > >>> Hi Stefano, > > >>> > > >>>> -----Original Message----- > > >>>> From: Stefano Stabellini > > >>>> Sent: 2020??????11??????26?????? 8:00 > > >>>> To: Wei Chen > > >>>> Cc: Julien Grall ; Penny Zheng ; > > >> xen- > > >>>> devel@lists.xenproject.org; sstabellini@kernel.org; Andre Przywara > > >>>> ; Bertrand Marquis > > >> ; > > >>>> Kaly Xin ; nd > > >>>> Subject: RE: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 > > workaround > > >>>> > > >>>> Resuming this old thread. > > >>>> > > >>>> On Fri, 13 Nov 2020, Wei Chen wrote: > > >>>>>> Hi, > > >>>>>> > > >>>>>> On 09/11/2020 08:21, Penny Zheng wrote: > > >>>>>>> CNTVCT_EL0 or CNTPCT_EL0 counter read in Cortex-A73 (all versions) > > >>>>>>> might return a wrong value when the counter crosses a 32bit boundary. > > >>>>>>> > > >>>>>>> Until now, there is no case for Xen itself to access CNTVCT_EL0, > > >>>>>>> and it also should be the Guest OS's responsibility to deal with > > >>>>>>> this part. > > >>>>>>> > > >>>>>>> But for CNTPCT, there exists several cases in Xen involving reading > > >>>>>>> CNTPCT, so a possible workaround is that performing the read twice, > > >>>>>>> and to return one or the other depending on whether a transition has > > >>>>>>> taken place. > > >>>>>>> > > >>>>>>> Signed-off-by: Penny Zheng > > >>>>>> > > >>>>>> Acked-by: Julien Grall > > >>>>>> > > >>>>>> On a related topic, do we need a fix similar to Linux commit > > >>>>>> 75a19a0202db "arm64: arch_timer: Ensure counter register reads occur > > >>>>>> with seqlock held"? > > >>>>>> > > >>>>> > > >>>>> I take a look at this Linux commit, it seems to prevent the seq-lock to be > > >>>>> speculated. Using an enforce ordering instead of ISB after the read > > counter > > >>>>> operation seems to be for performance reasons. > > >>>>> > > >>>>> I have found that you had placed an ISB before read counter in get_cycles > > >>>>> to prevent counter value to be speculated. But you haven't placed the > > >> second > > >>>>> ISB after reading. Is it because we haven't used the get_cycles in seq lock > > >>>>> critical context (Maybe I didn't find the right place)? So should we need to > > >> fix it > > >>>>> now, or you prefer to fix it now for future usage? > > >>>> > > >>>> Looking at the call sites, it doesn't look like we need any ISB after > > >>>> get_cycles as it is not used in any critical context. There is also a > > >>>> data dependency with the value returned by it. > > >> > > >> I am assuming you looked at all the users of NOW(). Is that right? > > >> > > >>>> > > >>>> So I am thinking we don't need any fix. At most we need an in-code > > comment? > > >>> > > >>> I agree with you to add an in-code comment. It will remind us in future > > when > > >> we > > >>> use the get_cycles in critical context. Adding it now will probably only lead > > to > > >>> meaningless performance degradation. > > >> > > >> I read this as there would be no perfomance impact if we add the > > >> ordering it now. Did you intend to say? > > > > > > Sorry about my English. I intended to say "Adding it now may introduce some > > > performance cost. And this performance cost may be not worth. Because Xen > > > may never use it in a similar scenario " > > > > Don't worry! I think the performance should not be noticeable if we use > > the same trick as Linux. > > > > >> In addition, AFAICT, the x86 version of get_cycles() is already able to > > >> provide that ordering. So there are chances that code may rely on it. > > >> > > >> While I don't necessarily agree to add barriers everywhere by default > > >> (this may have big impact on the platform). I think it is better to have > > >> an accurate number of cycles. > > >> > > > > > > As x86 had done it, I think it’s ok to do it for Arm. This will keep a function > > > behaves the same on different architectures. > > > > Just to be clear, I am not 100% sure this is what Intel is doing. > > Although this is my understanding of the comment in the code. > > > > @Stefano, what do you think? > > > > @Wei, assuming Stefano is happy with the proposal, would you be happy to > > send a patch for that? > > > > Of course, I am willing to do that. It seems the enforce_ordering patch has been > merged. And Vincenzo reported the enforce_ordering method will have ~4.5 > performance improvement[1] (Compare to ISB). So I will use enforce_ordering > method directly instead of using ISB. > > [1]https://lkml.org/lkml/2020/3/13/645 If we can enforce ordering without adding an ISB, I am all for it.