From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753273AbdE0Hbz (ORCPT ); Sat, 27 May 2017 03:31:55 -0400 Received: from mail-wr0-f196.google.com ([209.85.128.196]:35806 "EHLO mail-wr0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753210AbdE0Hbx (ORCPT ); Sat, 27 May 2017 03:31:53 -0400 Date: Sat, 27 May 2017 09:31:39 +0200 From: Ingo Molnar To: John Stultz Cc: lkml , Thomas Gleixner , Miroslav Lichvar , Richard Cochran , Prarit Bhargava , Stephen Boyd , Daniel Mentz Subject: Re: [RFC][PATCH 1/4] time: Fix clock->read(clock) race around clocksource changes Message-ID: <20170527073138.btn3qta2af6b2wzu@gmail.com> References: <1495856035-6622-1-git-send-email-john.stultz@linaro.org> <1495856035-6622-2-git-send-email-john.stultz@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1495856035-6622-2-git-send-email-john.stultz@linaro.org> User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * John Stultz wrote: > In some testing on arm64 platforms, I was seeing null ptr > crashes in the kselftest/timers clocksource-switch test. > > This was happening in a read function like: > u64 clocksource_mmio_readl_down(struct clocksource *c) > { > return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask; > } > > Where the callers enter the seqlock, and then call something > like: > cycle_now = tkr->read(tkr->clock); > > The problem seeming to be that since the read and clock > references are happening separately, its possible the > clocksource change happens in between and we end up calling the > old read with the new clocksource, (or vice-versa) which causes > the to_mmio_clksrc() in the read function to run off into space. Maybe: s/clock/->clock pointer s/read/old ->read() function for increased clarity - because both are ambiguous in an English sentence. > > This patch tries to address the issue by providing a helper > function that atomically reads the clock value and then calls > the clock->read(clock) call so that we always call the read > funciton with the appropriate clocksource and don't accidentally > mix them. s/call/function because 'call' is used both as a verb and as noun here, making for a difficult read. > > The one exception where this helper isn't necessary is for the > fast-timekepers which use their own locking and update logic > to the tkr structures. > > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: Miroslav Lichvar > Cc: Richard Cochran > Cc: Prarit Bhargava > Cc: Stephen Boyd > Cc: Daniel Mentz > Signed-off-by: John Stultz > --- > kernel/time/timekeeping.c | 40 +++++++++++++++++++++++++++++----------- > 1 file changed, 29 insertions(+), 11 deletions(-) > > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c > index 9652bc5..abc1968 100644 > --- a/kernel/time/timekeeping.c > +++ b/kernel/time/timekeeping.c > @@ -118,6 +118,26 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta) > tk->offs_boot = ktime_add(tk->offs_boot, delta); > } > > +/* > + * tk_clock_read - atomic clocksource read() helper > + * > + * This helper is necessary to use in the read paths as, while the seqlock > + * ensures we don't return a bad value while structures are updated, it > + * doesn't protect from potential crashes. There is the possibility that > + * the tkr's clocksource may change between the read reference, and the > + * clock refernce passed to the read function. This can cause crashes if > + * the wrong clocksource is passed to the wrong read function. > + * This isn't necessary to use when holding the timekeeper_lock or doing > + * a read of the fast-timekeeper tkrs (which is protected by its own locking > + * and update logic). typo: s/refernce/reference I'd also do: s/as/because because while 'as' is not wrong, it's somewhat harder to parse on first reading. Other than that: Acked-by: Ingo Molnar Thanks, Ingo