From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from Galois.linutronix.de ([146.0.238.70]:45364 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727144AbfBHT2N (ORCPT ); Fri, 8 Feb 2019 14:28:13 -0500 Date: Fri, 8 Feb 2019 20:28:07 +0100 (CET) From: Thomas Gleixner Subject: Re: [PATCH v2 06/28] kernel: Define gettimeofday vdso common code In-Reply-To: <20190208173539.GD24375@fuggles.cambridge.arm.com> Message-ID: References: <20181129170530.37789-1-vincenzo.frascino@arm.com> <20181129170530.37789-7-vincenzo.frascino@arm.com> <20181207175321.GA11430@edgewater-inn.cambridge.arm.com> <20190208173539.GD24375@fuggles.cambridge.arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-arch-owner@vger.kernel.org List-ID: To: Will Deacon Cc: Vincenzo Frascino , linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Catalin Marinas , Arnd Bergmann , Russell King , Ralf Baechle , Paul Burton , Daniel Lezcano , Mark Salyzyn , Peter Collingbourne Message-ID: <20190208192807.zQsU5mpHpb5ttjoqjQ5RxjcRq0n8mLZ7lTuuknLHt90@z> Will, On Fri, 8 Feb 2019, Will Deacon wrote: > On Fri, Dec 07, 2018 at 05:53:21PM +0000, Will Deacon wrote: > > Anyway, moving the counter read into the protected region is a little fiddly > > because the memory barriers we have in there won't give us the ordering we > > need. We'll instead need to do something nasty, like create a dependency > > from the counter read to the read of the seqlock: > > > > Maybe the untested crufty hack below, although this will be a nightmare to > > implement in C. > > We discussed this in person this week, but you couldn't recall the details > off the top of your head so I'm replying here. Please could you clarify what > your concern was with the existing code, and whether or not I've got the > wrong end of the stick? If you just collect the variables under the seqcount protection and have the readout of the timer outside of it then you are not guaranteeing consistent state. The problem is: do { seq = read_seqcount_begin(d->seq); last = d->cycle_last; mult = d->mult; shift = d->shift; ns = d->ns_base; while (read_seqcount_retry(d->seq, seq)); ns += ((read_clock() - last) * mult); ns >>= shift; So on the first glance this looks consistent because you collect all data and then do the read and calc outside the loop. But if 'd->mult' gets updated _before_ read_clock() then you can run into a situation where time goes backwards with the next read. Here is the flow you need for that: t1 = gettime() { collect_data() ---> Interrupt, updates mult (mult becomes smaller) This can expand over a very long time when the task is scheduled out here and there are multiple updates in between. The farther out the read is delayed, the more likely the problem is going to observable. read_clock_and_calc() } t2 = gettime() { collect_data() read_clock_and_calc() } This second read uses updated data, i.e. the smaller mult. So depending on the distance of the two reads and the difference of the mult, the caller can observe t2 < t1, which is a NONO. You can observe it on a single CPU. No virt, SMP, migration needed at all. Thanks, tglx