All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch 0/8] ntp/rtc: Fixes and cleanups
@ 2020-12-06 21:46 Thomas Gleixner
  2020-12-06 21:46 ` [patch 1/8] rtc: mc146818: Prevent reading garbage Thomas Gleixner
                   ` (8 more replies)
  0 siblings, 9 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

Miroslav ran into a situation where the periodic RTC synchronization almost
never was able to hit the time window for the update. That happens due the
usage of delayed_work and the properties of the timer wheel.

While that particular problem is halfways simple to fix this started to
unearth other problems with that code particularly with rtc_set_npt_time()
but expanded into other things as well.

  1) The update offset for rtc-cmos is off by a full second

  2) The readout of MC146818 (rtc-cmos and arch code) is broken and can
     return garbage.

  2) Alexandre questioned the approach in general and wants to get rid of
     it. Of course there are better methods to do that and it can be
     completely done in user space.

     Unfortunately it's not that simple as this would be a user visible
     change, so making it at least halfways correct.

  3) Alexandre requested to move that code into the NTP code as this is not
     really RTC functionality and just usage of the RTC API.

  4) The update offset itself was questioned, but the time between the
     write and the next seconds increment in the RTC is fundamentaly a
     hardware property. The transport time, which is pretty irrelevant for
     direct accessible RTCs (rtc-cmos), but relevant for RTC behind i2c/SPI
     needs to be added on top.

     It's undebated that this transport time cannot be correctly estimated,
     but right now it's 500ms which is far off. The correct transport time
     can be calibrated, a halfways correct value supplied via DT, but
     that's an orthogonal problem.

The following series addresses the above:

    1) Fix the readout function of MC146818
    2) Fix the rtc-cmos offset
    3) Reduce the default transport time

    4) Switch the NTP periodic sync code to a hrtimer/work combo

    5) Move rtc_set_npt_time() to the ntp code
    6) Make the update offset more intuitive
    7) Simplify the whole machinery
     
Thanks,

	tglx
---
 a/drivers/rtc/systohc.c        |   61 ----------
 drivers/rtc/Makefile           |    1 
 drivers/rtc/class.c            |    9 +
 drivers/rtc/rtc-cmos.c         |    3 
 drivers/rtc/rtc-mc146818-lib.c |   70 +++++++-----
 include/linux/rtc.h            |   69 +++++-------
 include/linux/timex.h          |    1 
 kernel/time/ntp.c              |  232 ++++++++++++++++++++++++-----------------
 kernel/time/ntp_internal.h     |    7 +
 9 files changed, 225 insertions(+), 228 deletions(-)




^ permalink raw reply	[flat|nested] 54+ messages in thread

* [patch 1/8] rtc: mc146818: Prevent reading garbage
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
@ 2020-12-06 21:46 ` Thomas Gleixner
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2021-01-25 18:40   ` [patch 1/8] rtc: mc146818: Prevent reading garbage - bug Mickaël Salaün
  2020-12-06 21:46 ` [patch 2/8] rtc: mc146818: Reduce spinlock section in mc146818_set_time() Thomas Gleixner
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

The MC146818 driver is prone to read garbage from the RTC. There are
several issues all related to the update cycle of the MC146818. The chip
increments seconds obviously once per second and indicates that by a bit in
a register. The bit goes high 244us before the actual update starts. During
the update the readout of the time values is undefined.

The code just checks whether the update in progress bit (UIP) is set before
reading the clock. If it's set it waits arbitrary 20ms before retrying,
which is ample because the maximum update time is ~2ms.

But this check does not guarantee that the UIP bit goes high and the actual
update happens during the readout. So the following can happen

 0.997 	       UIP = False
   -> Interrupt/NMI/preemption
 0.998	       UIP -> True
 0.999	       Readout	<- Undefined

To prevent this rework the code so it checks UIP before and after the
readout and if set after the readout try again.

But that's not enough to cover the following:

 0.997 	       UIP = False
 	       Readout seconds
   -> NMI (or vCPU scheduled out)
 0.998	       UIP -> True
 	       update completes
	       UIP -> False
 1.000	       Readout	minutes,....
 	       UIP check succeeds

That can make the readout wrong up to 59 seconds.

To prevent this, read the seconds value before the first UIP check,
validate it after checking UIP and after reading out the rest.

It's amazing that the original i386 code had this actually correct and
the generic implementation of the MC146818 driver got it wrong in 2002 and
it stayed that way until today.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/rtc/rtc-mc146818-lib.c |   64 ++++++++++++++++++++++++-----------------
 1 file changed, 39 insertions(+), 25 deletions(-)

--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -8,41 +8,41 @@
 #include <linux/acpi.h>
 #endif
 
-/*
- * Returns true if a clock update is in progress
- */
-static inline unsigned char mc146818_is_updating(void)
-{
-	unsigned char uip;
-	unsigned long flags;
-
-	spin_lock_irqsave(&rtc_lock, flags);
-	uip = (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP);
-	spin_unlock_irqrestore(&rtc_lock, flags);
-	return uip;
-}
-
 unsigned int mc146818_get_time(struct rtc_time *time)
 {
 	unsigned char ctrl;
 	unsigned long flags;
 	unsigned char century = 0;
+	bool retry;
 
 #ifdef CONFIG_MACH_DECSTATION
 	unsigned int real_year;
 #endif
 
+again:
+	spin_lock_irqsave(&rtc_lock, flags);
 	/*
-	 * read RTC once any update in progress is done. The update
-	 * can take just over 2ms. We wait 20ms. There is no need to
-	 * to poll-wait (up to 1s - eeccch) for the falling edge of RTC_UIP.
-	 * If you need to know *exactly* when a second has started, enable
-	 * periodic update complete interrupts, (via ioctl) and then
-	 * immediately read /dev/rtc which will block until you get the IRQ.
-	 * Once the read clears, read the RTC time (again via ioctl). Easy.
+	 * Check whether there is an update in progress during which the
+	 * readout is unspecified. The maximum update time is ~2ms. Poll
+	 * every msec for completion.
+	 *
+	 * Store the second value before checking UIP so a long lasting NMI
+	 * which happens to hit after the UIP check cannot make an update
+	 * cycle invisible.
 	 */
-	if (mc146818_is_updating())
-		mdelay(20);
+	time->tm_sec = CMOS_READ(RTC_SECONDS);
+
+	if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		mdelay(1);
+		goto again;
+	}
+
+	/* Revalidate the above readout */
+	if (time->tm_sec != CMOS_READ(RTC_SECONDS)) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		goto again;
+	}
 
 	/*
 	 * Only the values that we read from the RTC are set. We leave
@@ -50,8 +50,6 @@ unsigned int mc146818_get_time(struct rt
 	 * RTC has RTC_DAY_OF_WEEK, we ignore it, as it is only updated
 	 * by the RTC when initially set to a non-zero value.
 	 */
-	spin_lock_irqsave(&rtc_lock, flags);
-	time->tm_sec = CMOS_READ(RTC_SECONDS);
 	time->tm_min = CMOS_READ(RTC_MINUTES);
 	time->tm_hour = CMOS_READ(RTC_HOURS);
 	time->tm_mday = CMOS_READ(RTC_DAY_OF_MONTH);
@@ -66,8 +64,24 @@ unsigned int mc146818_get_time(struct rt
 		century = CMOS_READ(acpi_gbl_FADT.century);
 #endif
 	ctrl = CMOS_READ(RTC_CONTROL);
+	/*
+	 * Check for the UIP bit again. If it is set now then
+	 * the above values may contain garbage.
+	 */
+	retry = CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP;
+	/*
+	 * A NMI might have interrupted the above sequence so check whether
+	 * the seconds value has changed which indicates that the NMI took
+	 * longer than the UIP bit was set. Unlikely, but possible and
+	 * there is also virt...
+	 */
+	retry |= time->tm_sec != CMOS_READ(RTC_SECONDS);
+
 	spin_unlock_irqrestore(&rtc_lock, flags);
 
+	if (retry)
+		goto again;
+
 	if (!(ctrl & RTC_DM_BINARY) || RTC_ALWAYS_BCD)
 	{
 		time->tm_sec = bcd2bin(time->tm_sec);


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [patch 2/8] rtc: mc146818: Reduce spinlock section in mc146818_set_time()
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
  2020-12-06 21:46 ` [patch 1/8] rtc: mc146818: Prevent reading garbage Thomas Gleixner
@ 2020-12-06 21:46 ` Thomas Gleixner
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-12-06 21:46 ` [patch 3/8] rtc: cmos: Make rtc_cmos sync offset correct Thomas Gleixner
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

No need to hold the lock and disable interrupts for doing math.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/rtc/rtc-mc146818-lib.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -135,7 +135,6 @@ int mc146818_set_time(struct rtc_time *t
 	if (yrs > 255)	/* They are unsigned */
 		return -EINVAL;
 
-	spin_lock_irqsave(&rtc_lock, flags);
 #ifdef CONFIG_MACH_DECSTATION
 	real_yrs = yrs;
 	leap_yr = ((!((yrs + 1900) % 4) && ((yrs + 1900) % 100)) ||
@@ -164,10 +163,8 @@ int mc146818_set_time(struct rtc_time *t
 	/* These limits and adjustments are independent of
 	 * whether the chip is in binary mode or not.
 	 */
-	if (yrs > 169) {
-		spin_unlock_irqrestore(&rtc_lock, flags);
+	if (yrs > 169)
 		return -EINVAL;
-	}
 
 	if (yrs >= 100)
 		yrs -= 100;
@@ -183,6 +180,7 @@ int mc146818_set_time(struct rtc_time *t
 		century = bin2bcd(century);
 	}
 
+	spin_lock_irqsave(&rtc_lock, flags);
 	save_control = CMOS_READ(RTC_CONTROL);
 	CMOS_WRITE((save_control|RTC_SET), RTC_CONTROL);
 	save_freq_select = CMOS_READ(RTC_FREQ_SELECT);


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [patch 3/8] rtc: cmos: Make rtc_cmos sync offset correct
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
  2020-12-06 21:46 ` [patch 1/8] rtc: mc146818: Prevent reading garbage Thomas Gleixner
  2020-12-06 21:46 ` [patch 2/8] rtc: mc146818: Reduce spinlock section in mc146818_set_time() Thomas Gleixner
@ 2020-12-06 21:46 ` Thomas Gleixner
  2020-12-07 20:50   ` Jason Gunthorpe
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-12-06 21:46 ` [patch 4/8] rtc: core: Make the sync offset default more realistic Thomas Gleixner
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

The offset for rtc_cmos must be -500ms to work correctly with the current
implementation of rtc_set_ntp_time() due to the following:

  tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)

twrite - tsched is the transport time for the write to hit the device,
which is negligible for this chip because it's accessed directly.

t2 - twrite = 500ms according to the datasheet.

But rtc_set_ntp_time() calculation of tsched is:

    tsched = t2 - 1sec - (t2 - twrite)

The default for the sync offset is 500ms which means that the write happens
at t2 - 1.5 seconds which is obviously off by a second for this device.

Make the offset -500ms so it works correct.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/rtc/rtc-cmos.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -868,6 +868,9 @@ cmos_do_probe(struct device *dev, struct
 	if (retval)
 		goto cleanup2;
 
+	/* Set the sync offset for the periodic 11min update correct */
+	cmos_rtc.rtc->set_offset_nsec = -(NSEC_PER_SEC / 2);
+
 	/* export at least the first block of NVRAM */
 	nvmem_cfg.size = address_space - NVRAM_OFFSET;
 	if (rtc_nvmem_register(cmos_rtc.rtc, &nvmem_cfg))


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [patch 4/8] rtc: core: Make the sync offset default more realistic
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
                   ` (2 preceding siblings ...)
  2020-12-06 21:46 ` [patch 3/8] rtc: cmos: Make rtc_cmos sync offset correct Thomas Gleixner
@ 2020-12-06 21:46 ` Thomas Gleixner
  2020-12-07 20:55   ` Jason Gunthorpe
                     ` (2 more replies)
  2020-12-06 21:46 ` [patch 5/8] ntp: Make the RTC synchronization more reliable Thomas Gleixner
                   ` (4 subsequent siblings)
  8 siblings, 3 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

The offset which is used to steer the start of an RTC synchronization
update via rtc_set_ntp_time() is huge. The math behind this is:

  tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)

twrite - tsched is the transport time for the write to hit the device.

t2 - twrite depends on the chip and is for most chips one second.

The rtc_set_ntp_time() calculation of tsched is:

    tsched = t2 - 1sec - (t2 - twrite)

The default for the sync offset is 500ms which means that twrite - tsched
is 500ms assumed that t2 - twrite is one second.

This is 0.5 seconds off for RTCs which are directly accessible by IO writes
and probably for the majority of i2C/SPI based RTC off by an order of
magnitude. Set it to 10ms which should bring it closer to reality.

The default can be adjusted by drivers (rtc_cmos does so) and could be
adjusted further by a calibration method which is an orthogonal problem.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/rtc/class.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -201,7 +201,7 @@ static struct rtc_device *rtc_allocate_d
 	device_initialize(&rtc->dev);
 
 	/* Drivers can revise this default after allocating the device. */
-	rtc->set_offset_nsec =  NSEC_PER_SEC / 2;
+	rtc->set_offset_nsec =  10 * NSEC_PER_MSEC;
 
 	rtc->irq_freq = 1;
 	rtc->max_user_freq = 64;


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [patch 5/8] ntp: Make the RTC synchronization more reliable
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
                   ` (3 preceding siblings ...)
  2020-12-06 21:46 ` [patch 4/8] rtc: core: Make the sync offset default more realistic Thomas Gleixner
@ 2020-12-06 21:46 ` Thomas Gleixner
  2020-12-07 12:47   ` Miroslav Lichvar
                     ` (3 more replies)
  2020-12-06 21:46 ` [patch 6/8] ntp, rtc: Move rtc_set_ntp_time() to ntp code Thomas Gleixner
                   ` (3 subsequent siblings)
  8 siblings, 4 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

Miroslav reported that the periodic RTC synchronization in the NTP code
fails more often than not to hit the specified update window.

The reason is that the code uses delayed_work to schedule the update which
needs to be in thread context as the underlying RTC might be connected via
a slow bus, e.g. I2C. In the update function it verifies whether the
current time is correct vs. the requirements of the underlying RTC.

But delayed_work is using the timer wheel for scheduling which is
inaccurate by design. Depending on the distance to the expiry the wheel
gets less granular to allow batching and to avoid the cascading of the
original timer wheel. See 500462a9de65 ("timers: Switch to a non-cascading
wheel") and the code for further details.

The code already deals with this by splitting the 660 seconds period into a
long 659 seconds timer and then retrying with a smaller delta.

But looking at the actual granularities of the timer wheel (which depend on
the HZ configuration) the 659 seconds timer ends up in an outer wheel level
and is affected by a worst case granularity of:

HZ          Granularity
1000        32s
 250        16s
 100        40s

So the initial timer can be already off by max 12.5% which is not a big
issue as the period of the sync is defined as ~11 minutes.

The fine grained second attempt schedules to the desired update point with
a timer expiring less than a second from now. Depending on the actual delta
and the HZ setting even the second attempt can end up in outer wheel levels
which have a large enough granularity to make the correctness check fail.

As this is a fundamental property of the timer wheel there is no way to
make this more accurate short of iterating in one jiffies steps towards the
update point.

Switch it to an hrtimer instead which schedules the actual update work. The
hrtimer will expire precisely (max 1 jiffie delay when high resolution
timers are not available). The actual scheduling delay of the work is the
same as before.

The update is triggered from do_adjtimex() which is a bit racy but not much
more racy than it was before:

     if (ntp_synced())
     	queue_delayed_work(system_power_efficient_wq, &sync_work, 0);

which is racy when the work is currently executed and has not managed to
reschedule itself.

This becomes now:

     if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
     	queue_work(system_power_efficient_wq, &sync_work, 0);

which is racy when the hrtimer has expired and the work is currently
executed and has not yet managed to rearm the hrtimer.

Not a big problem as it just schedules work for nothing.

The new implementation has a safe guard in place to catch the case where
the hrtimer is queued on entry to the work function and avoids an extra
update attempt of the RTC that way.

Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
---
 include/linux/timex.h      |    1 
 kernel/time/ntp.c          |   90 ++++++++++++++++++++++++---------------------
 kernel/time/ntp_internal.h |    7 +++
 3 files changed, 55 insertions(+), 43 deletions(-)

--- a/include/linux/timex.h
+++ b/include/linux/timex.h
@@ -157,7 +157,6 @@ extern int do_clock_adjtime(const clocki
 extern void hardpps(const struct timespec64 *, const struct timespec64 *);
 
 int read_current_timer(unsigned long *timer_val);
-void ntp_notify_cmos_timer(void);
 
 /* The clock frequency of the i8253/i8254 PIT */
 #define PIT_TICK_RATE 1193182ul
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -494,65 +494,55 @@ int second_overflow(time64_t secs)
 	return leap;
 }
 
+#if defined(CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC)
 static void sync_hw_clock(struct work_struct *work);
-static DECLARE_DELAYED_WORK(sync_work, sync_hw_clock);
-
-static void sched_sync_hw_clock(struct timespec64 now,
-				unsigned long target_nsec, bool fail)
+static DECLARE_WORK(sync_work, sync_hw_clock);
+static struct hrtimer sync_hrtimer;
+#define SYNC_PERIOD_NS (11UL * 60 * NSEC_PER_SEC)
 
+static enum hrtimer_restart sync_timer_callback(struct hrtimer *timer)
 {
-	struct timespec64 next;
+	queue_work(system_power_efficient_wq, &sync_work);
 
-	ktime_get_real_ts64(&next);
-	if (!fail)
-		next.tv_sec = 659;
-	else {
-		/*
-		 * Try again as soon as possible. Delaying long periods
-		 * decreases the accuracy of the work queue timer. Due to this
-		 * the algorithm is very likely to require a short-sleep retry
-		 * after the above long sleep to synchronize ts_nsec.
-		 */
-		next.tv_sec = 0;
-	}
+	return HRTIMER_NORESTART;
+}
 
-	/* Compute the needed delay that will get to tv_nsec == target_nsec */
-	next.tv_nsec = target_nsec - next.tv_nsec;
-	if (next.tv_nsec <= 0)
-		next.tv_nsec += NSEC_PER_SEC;
-	if (next.tv_nsec >= NSEC_PER_SEC) {
-		next.tv_sec++;
-		next.tv_nsec -= NSEC_PER_SEC;
-	}
+static void sched_sync_hw_clock(unsigned long offset_nsec, bool retry)
+{
+	ktime_t exp = ktime_set(ktime_get_real_seconds(), 0);
 
-	queue_delayed_work(system_power_efficient_wq, &sync_work,
-			   timespec64_to_jiffies(&next));
+	if (retry)
+		exp = ktime_add_ns(exp, 2 * NSEC_PER_SEC - offset_nsec);
+	else
+		exp = ktime_add_ns(exp, SYNC_PERIOD_NS - offset_nsec);
+
+	hrtimer_start(&sync_hrtimer, exp, HRTIMER_MODE_ABS);
 }
 
 static void sync_rtc_clock(void)
 {
-	unsigned long target_nsec;
-	struct timespec64 adjust, now;
+	unsigned long offset_nsec;
+	struct timespec64 adjust;
 	int rc;
 
 	if (!IS_ENABLED(CONFIG_RTC_SYSTOHC))
 		return;
 
-	ktime_get_real_ts64(&now);
+	ktime_get_real_ts64(&adjust);
 
-	adjust = now;
 	if (persistent_clock_is_local)
 		adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
 
 	/*
-	 * The current RTC in use will provide the target_nsec it wants to be
-	 * called at, and does rtc_tv_nsec_ok internally.
+	 * The current RTC in use will provide the nanoseconds offset prior
+	 * to a full second it wants to be called at, and invokes
+	 * rtc_tv_nsec_ok() internally.
 	 */
-	rc = rtc_set_ntp_time(adjust, &target_nsec);
+	rc = rtc_set_ntp_time(adjust, &offset_nsec);
 	if (rc == -ENODEV)
 		return;
 
-	sched_sync_hw_clock(now, target_nsec, rc);
+	sched_sync_hw_clock(offset_nsec, rc != 0);
 }
 
 #ifdef CONFIG_GENERIC_CMOS_UPDATE
@@ -599,7 +589,7 @@ static bool sync_cmos_clock(void)
 		}
 	}
 
-	sched_sync_hw_clock(now, target_nsec, rc);
+	sched_sync_hw_clock(target_nsec, rc != 0);
 	return true;
 }
 
@@ -613,7 +603,12 @@ static bool sync_cmos_clock(void)
  */
 static void sync_hw_clock(struct work_struct *work)
 {
-	if (!ntp_synced())
+	/*
+	 * Don't update if STA_UNSYNC is set and if ntp_notify_cmos_timer()
+	 * managed to schedule the work between the timer firing and the
+	 * work being able to rearm the timer. Wait for the timer to expire.
+	 */
+	if (!ntp_synced() || hrtimer_is_queued(&sync_hrtimer))
 		return;
 
 	if (sync_cmos_clock())
@@ -624,13 +619,23 @@ static void sync_hw_clock(struct work_st
 
 void ntp_notify_cmos_timer(void)
 {
-	if (!ntp_synced())
-		return;
+	/*
+	 * When the work is currently executed but has not yet the timer
+	 * rearmed this queues the work immediately again. No big issue,
+	 * just a pointless work scheduled.
+	 */
+	if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
+		queue_work(system_power_efficient_wq, &sync_work);
+}
 
-	if (IS_ENABLED(CONFIG_GENERIC_CMOS_UPDATE) ||
-	    IS_ENABLED(CONFIG_RTC_SYSTOHC))
-		queue_delayed_work(system_power_efficient_wq, &sync_work, 0);
+static void __init ntp_init_cmos_sync(void)
+{
+	hrtimer_init(&sync_hrtimer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
+	sync_hrtimer.function = sync_timer_callback;
 }
+#else /* CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC) */
+static inline void __init ntp_init_cmos_sync(void) { }
+#endif /* !CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC) */
 
 /*
  * Propagate a new txc->status value into the NTP state:
@@ -1044,4 +1049,5 @@ static int __init ntp_tick_adj_setup(cha
 void __init ntp_init(void)
 {
 	ntp_clear();
+	ntp_init_cmos_sync();
 }
--- a/kernel/time/ntp_internal.h
+++ b/kernel/time/ntp_internal.h
@@ -12,4 +12,11 @@ extern int __do_adjtimex(struct __kernel
 			 const struct timespec64 *ts,
 			 s32 *time_tai, struct audit_ntp_data *ad);
 extern void __hardpps(const struct timespec64 *phase_ts, const struct timespec64 *raw_ts);
+
+#if defined(CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC)
+extern void ntp_notify_cmos_timer(void);
+#else
+static inline void ntp_notify_cmos_timer(void) { }
+#endif
+
 #endif /* _LINUX_NTP_INTERNAL_H */


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [patch 6/8] ntp, rtc: Move rtc_set_ntp_time() to ntp code
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
                   ` (4 preceding siblings ...)
  2020-12-06 21:46 ` [patch 5/8] ntp: Make the RTC synchronization more reliable Thomas Gleixner
@ 2020-12-06 21:46 ` Thomas Gleixner
  2020-12-07 20:59   ` Jason Gunthorpe
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-12-06 21:46 ` [patch 7/8] ntp: Make the RTC sync offset less obscure Thomas Gleixner
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

rtc_set_ntp_time() is not really RTC functionality as the code is just a
user of RTC. Move it into the NTP code which allows further cleanups.

Requested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/rtc/Makefile  |    1 
 drivers/rtc/systohc.c |   61 ----------------------------------
 include/linux/rtc.h   |   34 -------------------
 kernel/time/ntp.c     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 85 insertions(+), 99 deletions(-)

--- a/drivers/rtc/Makefile
+++ b/drivers/rtc/Makefile
@@ -6,7 +6,6 @@
 ccflags-$(CONFIG_RTC_DEBUG)	:= -DDEBUG
 
 obj-$(CONFIG_RTC_LIB)		+= lib.o
-obj-$(CONFIG_RTC_SYSTOHC)	+= systohc.o
 obj-$(CONFIG_RTC_CLASS)		+= rtc-core.o
 obj-$(CONFIG_RTC_MC146818_LIB)	+= rtc-mc146818-lib.o
 rtc-core-y			:= class.o interface.o
--- a/drivers/rtc/systohc.c
+++ /dev/null
@@ -1,61 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <linux/rtc.h>
-#include <linux/time.h>
-
-/**
- * rtc_set_ntp_time - Save NTP synchronized time to the RTC
- * @now: Current time of day
- * @target_nsec: pointer for desired now->tv_nsec value
- *
- * Replacement for the NTP platform function update_persistent_clock64
- * that stores time for later retrieval by rtc_hctosys.
- *
- * Returns 0 on successful RTC update, -ENODEV if a RTC update is not
- * possible at all, and various other -errno for specific temporary failure
- * cases.
- *
- * -EPROTO is returned if now.tv_nsec is not close enough to *target_nsec.
- *
- * If temporary failure is indicated the caller should try again 'soon'
- */
-int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec)
-{
-	struct rtc_device *rtc;
-	struct rtc_time tm;
-	struct timespec64 to_set;
-	int err = -ENODEV;
-	bool ok;
-
-	rtc = rtc_class_open(CONFIG_RTC_SYSTOHC_DEVICE);
-	if (!rtc)
-		goto out_err;
-
-	if (!rtc->ops || !rtc->ops->set_time)
-		goto out_close;
-
-	/* Compute the value of tv_nsec we require the caller to supply in
-	 * now.tv_nsec.  This is the value such that (now +
-	 * set_offset_nsec).tv_nsec == 0.
-	 */
-	set_normalized_timespec64(&to_set, 0, -rtc->set_offset_nsec);
-	*target_nsec = to_set.tv_nsec;
-
-	/* The ntp code must call this with the correct value in tv_nsec, if
-	 * it does not we update target_nsec and return EPROTO to make the ntp
-	 * code try again later.
-	 */
-	ok = rtc_tv_nsec_ok(rtc->set_offset_nsec, &to_set, &now);
-	if (!ok) {
-		err = -EPROTO;
-		goto out_close;
-	}
-
-	rtc_time64_to_tm(to_set.tv_sec, &tm);
-
-	err = rtc_set_time(rtc, &tm);
-
-out_close:
-	rtc_class_close(rtc);
-out_err:
-	return err;
-}
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -165,7 +165,6 @@ int __rtc_register_device(struct module
 
 extern int rtc_read_time(struct rtc_device *rtc, struct rtc_time *tm);
 extern int rtc_set_time(struct rtc_device *rtc, struct rtc_time *tm);
-extern int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec);
 int __rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm);
 extern int rtc_read_alarm(struct rtc_device *rtc,
 			struct rtc_wkalrm *alrm);
@@ -205,39 +204,6 @@ static inline bool is_leap_year(unsigned
 	return (!(year % 4) && (year % 100)) || !(year % 400);
 }
 
-/* Determine if we can call to driver to set the time. Drivers can only be
- * called to set a second aligned time value, and the field set_offset_nsec
- * specifies how far away from the second aligned time to call the driver.
- *
- * This also computes 'to_set' which is the time we are trying to set, and has
- * a zero in tv_nsecs, such that:
- *    to_set - set_delay_nsec == now +/- FUZZ
- *
- */
-static inline bool rtc_tv_nsec_ok(s64 set_offset_nsec,
-				  struct timespec64 *to_set,
-				  const struct timespec64 *now)
-{
-	/* Allowed error in tv_nsec, arbitarily set to 5 jiffies in ns. */
-	const unsigned long TIME_SET_NSEC_FUZZ = TICK_NSEC * 5;
-	struct timespec64 delay = {.tv_sec = 0,
-				   .tv_nsec = set_offset_nsec};
-
-	*to_set = timespec64_add(*now, delay);
-
-	if (to_set->tv_nsec < TIME_SET_NSEC_FUZZ) {
-		to_set->tv_nsec = 0;
-		return true;
-	}
-
-	if (to_set->tv_nsec > NSEC_PER_SEC - TIME_SET_NSEC_FUZZ) {
-		to_set->tv_sec++;
-		to_set->tv_nsec = 0;
-		return true;
-	}
-	return false;
-}
-
 #define rtc_register_device(device) \
 	__rtc_register_device(THIS_MODULE, device)
 
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -519,15 +519,94 @@ static void sched_sync_hw_clock(unsigned
 	hrtimer_start(&sync_hrtimer, exp, HRTIMER_MODE_ABS);
 }
 
+/*
+ * Determine if we can call to driver to set the time. Drivers can only be
+ * called to set a second aligned time value, and the field set_offset_nsec
+ * specifies how far away from the second aligned time to call the driver.
+ *
+ * This also computes 'to_set' which is the time we are trying to set, and has
+ * a zero in tv_nsecs, such that:
+ *    to_set - set_delay_nsec == now +/- FUZZ
+ *
+ */
+static inline bool rtc_tv_nsec_ok(long set_offset_nsec,
+				  struct timespec64 *to_set,
+				  const struct timespec64 *now)
+{
+	/* Allowed error in tv_nsec, arbitarily set to 5 jiffies in ns. */
+	const unsigned long TIME_SET_NSEC_FUZZ = TICK_NSEC * 5;
+	struct timespec64 delay = {.tv_sec = 0,
+				   .tv_nsec = set_offset_nsec};
+
+	*to_set = timespec64_add(*now, delay);
+
+	if (to_set->tv_nsec < TIME_SET_NSEC_FUZZ) {
+		to_set->tv_nsec = 0;
+		return true;
+	}
+
+	if (to_set->tv_nsec > NSEC_PER_SEC - TIME_SET_NSEC_FUZZ) {
+		to_set->tv_sec++;
+		to_set->tv_nsec = 0;
+		return true;
+	}
+	return false;
+}
+
+#ifdef CONFIG_RTC_SYSTOHC
+/*
+ * rtc_set_ntp_time - Save NTP synchronized time to the RTC
+ */
+static int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec)
+{
+	struct rtc_device *rtc;
+	struct rtc_time tm;
+	struct timespec64 to_set;
+	int err = -ENODEV;
+	bool ok;
+
+	rtc = rtc_class_open(CONFIG_RTC_SYSTOHC_DEVICE);
+	if (!rtc)
+		goto out_err;
+
+	if (!rtc->ops || !rtc->ops->set_time)
+		goto out_close;
+
+	/*
+	 * Compute the value of tv_nsec we require the caller to supply in
+	 * now.tv_nsec.  This is the value such that (now +
+	 * set_offset_nsec).tv_nsec == 0.
+	 */
+	set_normalized_timespec64(&to_set, 0, -rtc->set_offset_nsec);
+	*target_nsec = to_set.tv_nsec;
+
+	/*
+	 * The ntp code must call this with the correct value in tv_nsec, if
+	 * it does not we update target_nsec and return EPROTO to make the ntp
+	 * code try again later.
+	 */
+	ok = rtc_tv_nsec_ok(rtc->set_offset_nsec, &to_set, &now);
+	if (!ok) {
+		err = -EPROTO;
+		goto out_close;
+	}
+
+	rtc_time64_to_tm(to_set.tv_sec, &tm);
+
+	err = rtc_set_time(rtc, &tm);
+
+out_close:
+	rtc_class_close(rtc);
+out_err:
+	return err;
+}
+
 static void sync_rtc_clock(void)
 {
 	unsigned long offset_nsec;
 	struct timespec64 adjust;
 	int rc;
 
-	if (!IS_ENABLED(CONFIG_RTC_SYSTOHC))
-		return;
-
 	ktime_get_real_ts64(&adjust);
 
 	if (persistent_clock_is_local)
@@ -544,6 +623,9 @@ static void sync_rtc_clock(void)
 
 	sched_sync_hw_clock(offset_nsec, rc != 0);
 }
+#else
+static inline void sync_rtc_clock(void) { }
+#endif
 
 #ifdef CONFIG_GENERIC_CMOS_UPDATE
 int __weak update_persistent_clock64(struct timespec64 now64)


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [patch 7/8] ntp: Make the RTC sync offset less obscure
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
                   ` (5 preceding siblings ...)
  2020-12-06 21:46 ` [patch 6/8] ntp, rtc: Move rtc_set_ntp_time() to ntp code Thomas Gleixner
@ 2020-12-06 21:46 ` Thomas Gleixner
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-12-06 21:46 ` [patch 8/8] ntp: Consolidate the RTC update implementation Thomas Gleixner
  2020-12-09  0:33 ` [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
  8 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

The current RTC set_offset_nsec value is not really intuitive to
understand. 

  tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)

The offset is calculated from twrite based on the assumption that t2 -
twrite == 1s. That means for the MC146818 RTC the offset needs to be
negative so that the write happens 500ms before t2.

It's easier to understand when the whole calculation is based on t2. That
avoids negative offsets and the meaning is obvious:

 t2 - twrite:     The time defined by the chip when seconds increment
      		  after the write.

 twrite - tsched: The time for the transport to the point where the chip
 	  	  is updated. 

==> set_offset_nsec =  t2 - tsched
    ttransport      =  twrite - tsched
    tRTCinc         =  t2 - twrite
==> set_offset_nsec =  ttransport + tRTCinc

tRTCinc is a chip property and can be obtained from the data sheet.

ttransport depends on how the RTC is connected. It is close to 0 for
directly accessible RTCs. For RTCs behind a slow bus, e.g. i2c, it's the
time required to send the update over the bus. This can be estimated or
even calibrated, but that's a different problem.

Adjust the implementation and update comments accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/rtc/class.c    |    9 +++++++--
 drivers/rtc/rtc-cmos.c |    2 +-
 include/linux/rtc.h    |   35 +++++++++++++++++++++++++++++------
 kernel/time/ntp.c      |   49 +++++++++++++++++++++++++------------------------
 4 files changed, 62 insertions(+), 33 deletions(-)

--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -200,8 +200,13 @@ static struct rtc_device *rtc_allocate_d
 
 	device_initialize(&rtc->dev);
 
-	/* Drivers can revise this default after allocating the device. */
-	rtc->set_offset_nsec =  10 * NSEC_PER_MSEC;
+	/*
+	 * Drivers can revise this default after allocating the device.
+	 * The default is what most RTCs do: Increment seconds exactly one
+	 * second after the write happened. This adds a default transport
+	 * time of 10ms which is at least halfways close to reality.
+	 */
+	rtc->set_offset_nsec = NSEC_PER_SEC + 10 * NSEC_PER_MSEC;
 
 	rtc->irq_freq = 1;
 	rtc->max_user_freq = 64;
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -869,7 +869,7 @@ cmos_do_probe(struct device *dev, struct
 		goto cleanup2;
 
 	/* Set the sync offset for the periodic 11min update correct */
-	cmos_rtc.rtc->set_offset_nsec = -(NSEC_PER_SEC / 2);
+	cmos_rtc.rtc->set_offset_nsec = NSEC_PER_SEC / 2;
 
 	/* export at least the first block of NVRAM */
 	nvmem_cfg.size = address_space - NVRAM_OFFSET;
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -110,13 +110,36 @@ struct rtc_device {
 	/* Some hardware can't support UIE mode */
 	int uie_unsupported;
 
-	/* Number of nsec it takes to set the RTC clock. This influences when
-	 * the set ops are called. An offset:
-	 *   - of 0.5 s will call RTC set for wall clock time 10.0 s at 9.5 s
-	 *   - of 1.5 s will call RTC set for wall clock time 10.0 s at 8.5 s
-	 *   - of -0.5 s will call RTC set for wall clock time 10.0 s at 10.5 s
+	/*
+	 * This offset specifies the update timing of the RTC.
+	 *
+	 * tsched     t1 write(t2.tv_sec - 1sec))  t2 RTC increments seconds
+	 *
+	 * The offset defines how tsched is computed so that the write to
+	 * the RTC (t2.tv_sec - 1sec) is correct versus the time required
+	 * for the transport of the write and the time which the RTC needs
+	 * to increment seconds the first time after the write (t2).
+	 *
+	 * For direct accessible RTCs tsched ~= t1 because the write time
+	 * is negligible. For RTCs behind slow busses the transport time is
+	 * significant and has to be taken into account.
+	 *
+	 * The time between the write (t1) and the first increment after
+	 * the write (t2) is RTC specific. For a MC146818 RTC it's 500ms,
+	 * for many others it's exactly 1 second. Consult the datasheet.
+	 *
+	 * The value of this offset is also used to calculate the to be
+	 * written value (t2.tv_sec - 1sec) at tsched.
+	 *
+	 * The default value for this is NSEC_PER_SEC + 10 msec default
+	 * transport time. The offset can be adjusted by drivers so the
+	 * calculation for the to be written value at tsched becomes
+	 * correct:
+	 *
+	 *	newval = tsched + set_offset_nsec - NSEC_PER_SEC
+	 * and  (tsched + set_offset_nsec) % NSEC_PER_SEC == 0
 	 */
-	long set_offset_nsec;
+	unsigned long set_offset_nsec;
 
 	bool registered;
 
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -520,22 +520,33 @@ static void sched_sync_hw_clock(unsigned
 }
 
 /*
- * Determine if we can call to driver to set the time. Drivers can only be
- * called to set a second aligned time value, and the field set_offset_nsec
- * specifies how far away from the second aligned time to call the driver.
- *
- * This also computes 'to_set' which is the time we are trying to set, and has
- * a zero in tv_nsecs, such that:
- *    to_set - set_delay_nsec == now +/- FUZZ
+ * Check whether @now is correct versus the required time to update the RTC
+ * and calculate the value which needs to be written to the RTC so that the
+ * next seconds increment of the RTC after the write is aligned with the next
+ * seconds increment of clock REALTIME.
  *
+ * tsched     t1 write(t2.tv_sec - 1sec))	t2 RTC increments seconds
+ *
+ * t2.tv_nsec == 0
+ * tsched = t2 - set_offset_nsec
+ * newval = t2 - NSEC_PER_SEC
+ *
+ * ==> neval = tsched + set_offset_nsec - NSEC_PER_SEC
+ *
+ * As the execution of this code is not guaranteed to happen exactly at
+ * tsched this allows it to happen within a fuzzy region:
+ *
+ *	abs(now - tsched) < FUZZ
+ *
+ * If @now is not inside the allowed window the function returns false.
  */
-static inline bool rtc_tv_nsec_ok(long set_offset_nsec,
+static inline bool rtc_tv_nsec_ok(unsigned long set_offset_nsec,
 				  struct timespec64 *to_set,
 				  const struct timespec64 *now)
 {
 	/* Allowed error in tv_nsec, arbitarily set to 5 jiffies in ns. */
 	const unsigned long TIME_SET_NSEC_FUZZ = TICK_NSEC * 5;
-	struct timespec64 delay = {.tv_sec = 0,
+	struct timespec64 delay = {.tv_sec = -1,
 				   .tv_nsec = set_offset_nsec};
 
 	*to_set = timespec64_add(*now, delay);
@@ -557,11 +568,11 @@ static inline bool rtc_tv_nsec_ok(long s
 /*
  * rtc_set_ntp_time - Save NTP synchronized time to the RTC
  */
-static int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec)
+static int rtc_set_ntp_time(struct timespec64 now, unsigned long *offset_nsec)
 {
+	struct timespec64 to_set;
 	struct rtc_device *rtc;
 	struct rtc_time tm;
-	struct timespec64 to_set;
 	int err = -ENODEV;
 	bool ok;
 
@@ -572,19 +583,9 @@ static int rtc_set_ntp_time(struct times
 	if (!rtc->ops || !rtc->ops->set_time)
 		goto out_close;
 
-	/*
-	 * Compute the value of tv_nsec we require the caller to supply in
-	 * now.tv_nsec.  This is the value such that (now +
-	 * set_offset_nsec).tv_nsec == 0.
-	 */
-	set_normalized_timespec64(&to_set, 0, -rtc->set_offset_nsec);
-	*target_nsec = to_set.tv_nsec;
+	/* Store the update offset for this RTC */
+	*offset_nsec = rtc->set_offset_nsec;
 
-	/*
-	 * The ntp code must call this with the correct value in tv_nsec, if
-	 * it does not we update target_nsec and return EPROTO to make the ntp
-	 * code try again later.
-	 */
 	ok = rtc_tv_nsec_ok(rtc->set_offset_nsec, &to_set, &now);
 	if (!ok) {
 		err = -EPROTO;
@@ -657,7 +658,7 @@ static bool sync_cmos_clock(void)
 	 * implement this legacy API.
 	 */
 	ktime_get_real_ts64(&now);
-	if (rtc_tv_nsec_ok(-1 * target_nsec, &adjust, &now)) {
+	if (rtc_tv_nsec_ok(target_nsec, &adjust, &now)) {
 		if (persistent_clock_is_local)
 			adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
 		rc = update_persistent_clock64(adjust);


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [patch 8/8] ntp: Consolidate the RTC update implementation
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
                   ` (6 preceding siblings ...)
  2020-12-06 21:46 ` [patch 7/8] ntp: Make the RTC sync offset less obscure Thomas Gleixner
@ 2020-12-06 21:46 ` Thomas Gleixner
  2020-12-07 21:05   ` Jason Gunthorpe
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-12-09  0:33 ` [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
  8 siblings, 2 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:46 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

The code for the legacy RTC and the RTC class based update are pretty much
the same. Consolidate the common parts into one function and just invoke
the actual setter functions.

For RTC class based devices the update code checks whether the offset is
valid for the device, which is usually not the case for the first
invocation. If it's not the same it stores the correct offset and lets the
caller try again. That's not much different from the previous approach
where the first invocation had a pretty low probability to actually hit the
allowed window.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/ntp.c |  139 ++++++++++++++++++------------------------------------
 1 file changed, 47 insertions(+), 92 deletions(-)

--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -564,118 +564,53 @@ static inline bool rtc_tv_nsec_ok(unsign
 	return false;
 }
 
+#ifdef CONFIG_GENERIC_CMOS_UPDATE
+int __weak update_persistent_clock64(struct timespec64 now64)
+{
+	return -ENODEV;
+}
+#else
+static inline int update_persistent_clock64(struct timespec64 now64)
+{
+	return -ENODEV;
+}
+#endif
+
 #ifdef CONFIG_RTC_SYSTOHC
-/*
- * rtc_set_ntp_time - Save NTP synchronized time to the RTC
- */
-static int rtc_set_ntp_time(struct timespec64 now, unsigned long *offset_nsec)
+/* Save NTP synchronized time to the RTC */
+static int update_rtc(struct timespec64 *to_set, unsigned long *offset_nsec)
 {
-	struct timespec64 to_set;
 	struct rtc_device *rtc;
 	struct rtc_time tm;
 	int err = -ENODEV;
-	bool ok;
 
 	rtc = rtc_class_open(CONFIG_RTC_SYSTOHC_DEVICE);
 	if (!rtc)
-		goto out_err;
+		return -ENODEV;
 
 	if (!rtc->ops || !rtc->ops->set_time)
 		goto out_close;
 
-	/* Store the update offset for this RTC */
-	*offset_nsec = rtc->set_offset_nsec;
-
-	ok = rtc_tv_nsec_ok(rtc->set_offset_nsec, &to_set, &now);
-	if (!ok) {
-		err = -EPROTO;
-		goto out_close;
+	/* First call might not have the correct offset */
+	if (*offset_nsec == rtc->set_offset_nsec) {
+		rtc_time64_to_tm(to_set->tv_sec, &tm);
+		err = rtc_set_time(rtc, &tm);
+	} else {
+		/* Store the update offset and let the caller try again */
+		*offset_nsec = rtc->set_offset_nsec;
+		err = -EAGAIN;
 	}
-
-	rtc_time64_to_tm(to_set.tv_sec, &tm);
-
-	err = rtc_set_time(rtc, &tm);
-
 out_close:
 	rtc_class_close(rtc);
-out_err:
 	return err;
 }
-
-static void sync_rtc_clock(void)
-{
-	unsigned long offset_nsec;
-	struct timespec64 adjust;
-	int rc;
-
-	ktime_get_real_ts64(&adjust);
-
-	if (persistent_clock_is_local)
-		adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
-
-	/*
-	 * The current RTC in use will provide the nanoseconds offset prior
-	 * to a full second it wants to be called at, and invokes
-	 * rtc_tv_nsec_ok() internally.
-	 */
-	rc = rtc_set_ntp_time(adjust, &offset_nsec);
-	if (rc == -ENODEV)
-		return;
-
-	sched_sync_hw_clock(offset_nsec, rc != 0);
-}
 #else
-static inline void sync_rtc_clock(void) { }
-#endif
-
-#ifdef CONFIG_GENERIC_CMOS_UPDATE
-int __weak update_persistent_clock64(struct timespec64 now64)
+static inline int update_rtc(struct timespec64 *to_set, unsigned long *offset_nsec)
 {
 	return -ENODEV;
 }
 #endif
 
-static bool sync_cmos_clock(void)
-{
-	static bool no_cmos;
-	struct timespec64 now;
-	struct timespec64 adjust;
-	int rc = -EPROTO;
-	long target_nsec = NSEC_PER_SEC / 2;
-
-	if (!IS_ENABLED(CONFIG_GENERIC_CMOS_UPDATE))
-		return false;
-
-	if (no_cmos)
-		return false;
-
-	/*
-	 * Historically update_persistent_clock64() has followed x86
-	 * semantics, which match the MC146818A/etc RTC. This RTC will store
-	 * 'adjust' and then in .5s it will advance once second.
-	 *
-	 * Architectures are strongly encouraged to use rtclib and not
-	 * implement this legacy API.
-	 */
-	ktime_get_real_ts64(&now);
-	if (rtc_tv_nsec_ok(target_nsec, &adjust, &now)) {
-		if (persistent_clock_is_local)
-			adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
-		rc = update_persistent_clock64(adjust);
-		/*
-		 * The machine does not support update_persistent_clock64 even
-		 * though it defines CONFIG_GENERIC_CMOS_UPDATE.
-		 */
-		if (rc == -ENODEV) {
-			no_cmos = true;
-			return false;
-		}
-	}
-
-	sched_sync_hw_clock(target_nsec, rc != 0);
-	return true;
-}
-
 /*
  * If we have an externally synchronized Linux clock, then update RTC clock
  * accordingly every ~11 minutes. Generally RTCs can only store second
@@ -686,6 +621,10 @@ static bool sync_cmos_clock(void)
  */
 static void sync_hw_clock(struct work_struct *work)
 {
+	static unsigned long offset_nsec = NSEC_PER_SEC / 2;
+	struct timespec64 now, to_set;
+	int res = -EAGAIN;
+
 	/*
 	 * Don't update if STA_UNSYNC is set and if ntp_notify_cmos_timer()
 	 * managed to schedule the work between the timer firing and the
@@ -694,10 +633,26 @@ static void sync_hw_clock(struct work_st
 	if (!ntp_synced() || hrtimer_is_queued(&sync_hrtimer))
 		return;
 
-	if (sync_cmos_clock())
-		return;
+	ktime_get_real_ts64(&now);
+	/* If @now is not in the allowed window, try again */
+	if (!rtc_tv_nsec_ok(offset_nsec, &to_set, &now))
+		goto rearm;
 
-	sync_rtc_clock();
+	/* Take timezone adjusted RTCs into account */
+	if (persistent_clock_is_local)
+		to_set.tv_sec -= (sys_tz.tz_minuteswest * 60);
+
+	/* Try the legacy RTC first */
+	res = update_persistent_clock64(to_set);
+	if (res != -ENODEV)
+		goto rearm;
+
+	/* Try the RTC class */
+	res = update_rtc(&to_set, &offset_nsec);
+	if (res == -ENODEV)
+		return;
+rearm:
+	sched_sync_hw_clock(offset_nsec, res != 0);
 }
 
 void ntp_notify_cmos_timer(void)


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 5/8] ntp: Make the RTC synchronization more reliable
  2020-12-06 21:46 ` [patch 5/8] ntp: Make the RTC synchronization more reliable Thomas Gleixner
@ 2020-12-07 12:47   ` Miroslav Lichvar
  2020-12-07 20:59   ` Jason Gunthorpe
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 54+ messages in thread
From: Miroslav Lichvar @ 2020-12-07 12:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexandre Belloni, Jason Gunthorpe, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On Sun, Dec 06, 2020 at 10:46:18PM +0100, Thomas Gleixner wrote:
> Switch it to an hrtimer instead which schedules the actual update work. The
> hrtimer will expire precisely (max 1 jiffie delay when high resolution
> timers are not available). The actual scheduling delay of the work is the
> same as before.

It works well in my tests.

> This becomes now:
> 
>      if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
>      	queue_work(system_power_efficient_wq, &sync_work, 0);
> 
> which is racy when the hrtimer has expired and the work is currently
> executed and has not yet managed to rearm the hrtimer.
> 
> Not a big problem as it just schedules work for nothing.

No more unexpected updates of the RTC observed.

Tested-by: Miroslav Lichvar <mlichvar@redhat.com>

Thanks,

-- 
Miroslav Lichvar


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 3/8] rtc: cmos: Make rtc_cmos sync offset correct
  2020-12-06 21:46 ` [patch 3/8] rtc: cmos: Make rtc_cmos sync offset correct Thomas Gleixner
@ 2020-12-07 20:50   ` Jason Gunthorpe
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 54+ messages in thread
From: Jason Gunthorpe @ 2020-12-07 20:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexandre Belloni, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On Sun, Dec 06, 2020 at 10:46:16PM +0100, Thomas Gleixner wrote:
> The offset for rtc_cmos must be -500ms to work correctly with the current
> implementation of rtc_set_ntp_time() due to the following:
> 
>   tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)
> 
> twrite - tsched is the transport time for the write to hit the device,
> which is negligible for this chip because it's accessed directly.
> 
> t2 - twrite = 500ms according to the datasheet.
> 
> But rtc_set_ntp_time() calculation of tsched is:
> 
>     tsched = t2 - 1sec - (t2 - twrite)
> 
> The default for the sync offset is 500ms which means that the write happens
> at t2 - 1.5 seconds which is obviously off by a second for this device.
> 
> Make the offset -500ms so it works correct.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/rtc/rtc-cmos.c |    3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Matches what sync_cmos_clock() does at least.

My recollection is this change was not supposed to change anything, so
either I got things mixed up and this is wrong:

drivers/rtc/class.c:    rtc->set_offset_nsec =  NSEC_PER_SEC / 2;

Or it faithfully preserved some nonsense from before..

Jason

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 4/8] rtc: core: Make the sync offset default more realistic
  2020-12-06 21:46 ` [patch 4/8] rtc: core: Make the sync offset default more realistic Thomas Gleixner
@ 2020-12-07 20:55   ` Jason Gunthorpe
  2020-12-10 23:59   ` Alexandre Belloni
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 54+ messages in thread
From: Jason Gunthorpe @ 2020-12-07 20:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexandre Belloni, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On Sun, Dec 06, 2020 at 10:46:17PM +0100, Thomas Gleixner wrote:
> The offset which is used to steer the start of an RTC synchronization
> update via rtc_set_ntp_time() is huge. The math behind this is:
> 
>   tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)
> 
> twrite - tsched is the transport time for the write to hit the device.
> 
> t2 - twrite depends on the chip and is for most chips one second.
> 
> The rtc_set_ntp_time() calculation of tsched is:
> 
>     tsched = t2 - 1sec - (t2 - twrite)
> 
> The default for the sync offset is 500ms which means that twrite - tsched
> is 500ms assumed that t2 - twrite is one second.
> 
> This is 0.5 seconds off for RTCs which are directly accessible by IO writes
> and probably for the majority of i2C/SPI based RTC off by an order of
> magnitude. Set it to 10ms which should bring it closer to reality.
> 
> The default can be adjusted by drivers (rtc_cmos does so) and could be
> adjusted further by a calibration method which is an orthogonal problem.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>  drivers/rtc/class.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> +++ b/drivers/rtc/class.c
> @@ -201,7 +201,7 @@ static struct rtc_device *rtc_allocate_d
>  	device_initialize(&rtc->dev);
>  
>  	/* Drivers can revise this default after allocating the device. */
> -	rtc->set_offset_nsec =  NSEC_PER_SEC / 2;
> +	rtc->set_offset_nsec =  10 * NSEC_PER_MSEC;

So the old value is clearly wrong for CMOS, and I have a strong
feeling this was an error and it should have been -NSEC_PER_SEC/2

I have no idea if CMOS behavior or 0s behavior is more common in the
rtclib drivers, but it seems since nobody noticed the huge offset
mistake in 3 years it doesn't actually really matter.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 5/8] ntp: Make the RTC synchronization more reliable
  2020-12-06 21:46 ` [patch 5/8] ntp: Make the RTC synchronization more reliable Thomas Gleixner
  2020-12-07 12:47   ` Miroslav Lichvar
@ 2020-12-07 20:59   ` Jason Gunthorpe
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2020-12-29 19:41   ` [patch 5/8] " Geert Uytterhoeven
  3 siblings, 0 replies; 54+ messages in thread
From: Jason Gunthorpe @ 2020-12-07 20:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexandre Belloni, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On Sun, Dec 06, 2020 at 10:46:18PM +0100, Thomas Gleixner wrote:
> Miroslav reported that the periodic RTC synchronization in the NTP code
> fails more often than not to hit the specified update window.
> 
> The reason is that the code uses delayed_work to schedule the update which
> needs to be in thread context as the underlying RTC might be connected via
> a slow bus, e.g. I2C. In the update function it verifies whether the
> current time is correct vs. the requirements of the underlying RTC.
> 
> But delayed_work is using the timer wheel for scheduling which is
> inaccurate by design. Depending on the distance to the expiry the wheel
> gets less granular to allow batching and to avoid the cascading of the
> original timer wheel. See 500462a9de65 ("timers: Switch to a non-cascading
> wheel") and the code for further details.
> 
> The code already deals with this by splitting the 660 seconds period into a
> long 659 seconds timer and then retrying with a smaller delta.
> 
> But looking at the actual granularities of the timer wheel (which depend on
> the HZ configuration) the 659 seconds timer ends up in an outer wheel level
> and is affected by a worst case granularity of:
> 
> HZ          Granularity
> 1000        32s
>  250        16s
>  100        40s
> 
> So the initial timer can be already off by max 12.5% which is not a big
> issue as the period of the sync is defined as ~11 minutes.
> 
> The fine grained second attempt schedules to the desired update point with
> a timer expiring less than a second from now. Depending on the actual delta
> and the HZ setting even the second attempt can end up in outer wheel levels
> which have a large enough granularity to make the correctness check fail.
> 
> As this is a fundamental property of the timer wheel there is no way to
> make this more accurate short of iterating in one jiffies steps towards the
> update point.
> 
> Switch it to an hrtimer instead which schedules the actual update work. The
> hrtimer will expire precisely (max 1 jiffie delay when high resolution
> timers are not available). The actual scheduling delay of the work is the
> same as before.
> 
> The update is triggered from do_adjtimex() which is a bit racy but not much
> more racy than it was before:
> 
>      if (ntp_synced())
>      	queue_delayed_work(system_power_efficient_wq, &sync_work, 0);
> 
> which is racy when the work is currently executed and has not managed to
> reschedule itself.
> 
> This becomes now:
> 
>      if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
>      	queue_work(system_power_efficient_wq, &sync_work, 0);
> 
> which is racy when the hrtimer has expired and the work is currently
> executed and has not yet managed to rearm the hrtimer.
> 
> Not a big problem as it just schedules work for nothing.
> 
> The new implementation has a safe guard in place to catch the case where
> the hrtimer is queued on entry to the work function and avoids an extra
> update attempt of the RTC that way.
> 
> Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Prarit Bhargava <prarit@redhat.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> ---
>  include/linux/timex.h      |    1 
>  kernel/time/ntp.c          |   90 ++++++++++++++++++++++++---------------------
>  kernel/time/ntp_internal.h |    7 +++
>  3 files changed, 55 insertions(+), 43 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 6/8] ntp, rtc: Move rtc_set_ntp_time() to ntp code
  2020-12-06 21:46 ` [patch 6/8] ntp, rtc: Move rtc_set_ntp_time() to ntp code Thomas Gleixner
@ 2020-12-07 20:59   ` Jason Gunthorpe
  2020-12-09  3:51     ` Alexandre Belloni
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  1 sibling, 1 reply; 54+ messages in thread
From: Jason Gunthorpe @ 2020-12-07 20:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexandre Belloni, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On Sun, Dec 06, 2020 at 10:46:19PM +0100, Thomas Gleixner wrote:
> rtc_set_ntp_time() is not really RTC functionality as the code is just a
> user of RTC. Move it into the NTP code which allows further cleanups.
> 
> Requested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/rtc/Makefile  |    1 
>  drivers/rtc/systohc.c |   61 ----------------------------------
>  include/linux/rtc.h   |   34 -------------------
>  kernel/time/ntp.c     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 85 insertions(+), 99 deletions(-)

Fair enough, it is asymmetric with how HCTOSYS works, but not a big
deal

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 8/8] ntp: Consolidate the RTC update implementation
  2020-12-06 21:46 ` [patch 8/8] ntp: Consolidate the RTC update implementation Thomas Gleixner
@ 2020-12-07 21:05   ` Jason Gunthorpe
  2020-12-11  9:23     ` Thomas Gleixner
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  1 sibling, 1 reply; 54+ messages in thread
From: Jason Gunthorpe @ 2020-12-07 21:05 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexandre Belloni, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On Sun, Dec 06, 2020 at 10:46:21PM +0100, Thomas Gleixner wrote:
>  /*
>   * If we have an externally synchronized Linux clock, then update RTC clock
>   * accordingly every ~11 minutes. Generally RTCs can only store second
> @@ -686,6 +621,10 @@ static bool sync_cmos_clock(void)
>   */
>  static void sync_hw_clock(struct work_struct *work)
>  {
> +	static unsigned long offset_nsec = NSEC_PER_SEC / 2;

A comment here explaining this is the default: because the platform is
assumed to use CMOS, and by the way, this whole thing is obsolete
don't use it, seems appropriate..

The time split is clearer if you think of it from a bus/datasheet
perspective, less clear if you try to measure the system directly, eg
from an alarm. But, I think this  has a better chance of some rtclib
driver authors to fill in the datasheet value at least.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 0/8] ntp/rtc: Fixes and cleanups
  2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
                   ` (7 preceding siblings ...)
  2020-12-06 21:46 ` [patch 8/8] ntp: Consolidate the RTC update implementation Thomas Gleixner
@ 2020-12-09  0:33 ` Thomas Gleixner
  2020-12-09  4:01   ` Alexandre Belloni
  8 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-09  0:33 UTC (permalink / raw)
  To: LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

Alexandre,

On Sun, Dec 06 2020 at 22:46, Thomas Gleixner wrote:
> Miroslav ran into a situation where the periodic RTC synchronization almost
> never was able to hit the time window for the update. That happens due the
> usage of delayed_work and the properties of the timer wheel.
>
> While that particular problem is halfways simple to fix this started to
> unearth other problems with that code particularly with rtc_set_npt_time()
> but expanded into other things as well.
>
>   1) The update offset for rtc-cmos is off by a full second
>
>   2) The readout of MC146818 (rtc-cmos and arch code) is broken and can
>      return garbage.
>
>   2) Alexandre questioned the approach in general and wants to get rid of
>      it. Of course there are better methods to do that and it can be
>      completely done in user space.
>
>      Unfortunately it's not that simple as this would be a user visible
>      change, so making it at least halfways correct.
>
>   3) Alexandre requested to move that code into the NTP code as this is not
>      really RTC functionality and just usage of the RTC API.
>
>   4) The update offset itself was questioned, but the time between the
>      write and the next seconds increment in the RTC is fundamentaly a
>      hardware property. The transport time, which is pretty irrelevant for
>      direct accessible RTCs (rtc-cmos), but relevant for RTC behind i2c/SPI
>      needs to be added on top.
>
>      It's undebated that this transport time cannot be correctly estimated,
>      but right now it's 500ms which is far off. The correct transport time
>      can be calibrated, a halfways correct value supplied via DT, but
>      that's an orthogonal problem.
>
> The following series addresses the above:
>
>     1) Fix the readout function of MC146818
>     2) Fix the rtc-cmos offset
>     3) Reduce the default transport time
>
>     4) Switch the NTP periodic sync code to a hrtimer/work combo
>
>     5) Move rtc_set_npt_time() to the ntp code
>     6) Make the update offset more intuitive
>     7) Simplify the whole machinery

any opinion on this?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 6/8] ntp, rtc: Move rtc_set_ntp_time() to ntp code
  2020-12-07 20:59   ` Jason Gunthorpe
@ 2020-12-09  3:51     ` Alexandre Belloni
  0 siblings, 0 replies; 54+ messages in thread
From: Alexandre Belloni @ 2020-12-09  3:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Thomas Gleixner, LKML, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On 07/12/2020 16:59:52-0400, Jason Gunthorpe wrote:
> On Sun, Dec 06, 2020 at 10:46:19PM +0100, Thomas Gleixner wrote:
> > rtc_set_ntp_time() is not really RTC functionality as the code is just a
> > user of RTC. Move it into the NTP code which allows further cleanups.
> > 
> > Requested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > ---
> >  drivers/rtc/Makefile  |    1 
> >  drivers/rtc/systohc.c |   61 ----------------------------------
> >  include/linux/rtc.h   |   34 -------------------
> >  kernel/time/ntp.c     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++--
> >  4 files changed, 85 insertions(+), 99 deletions(-)
> 
> Fair enough, it is asymmetric with how HCTOSYS works, but not a big
> deal
> 

It is already asymmetric as hctosys will always happen as soon as the
rtc driver is loaded but systohc will only actually be called only if
ntp is enabled.

I must admit I did consider moving it ou of the rtc subsystem instead of
having it moved in class.c to solve the issue with rtc drivers compiled
as modules but I still carry the nasty BITS_PER_LONG check to keep
systemd happy on a 32bit userspace.

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 0/8] ntp/rtc: Fixes and cleanups
  2020-12-09  0:33 ` [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
@ 2020-12-09  4:01   ` Alexandre Belloni
  2020-12-09 12:39     ` Thomas Gleixner
  0 siblings, 1 reply; 54+ messages in thread
From: Alexandre Belloni @ 2020-12-09  4:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Jason Gunthorpe, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On 09/12/2020 01:33:08+0100, Thomas Gleixner wrote:
> Alexandre,
> 
> On Sun, Dec 06 2020 at 22:46, Thomas Gleixner wrote:
> > Miroslav ran into a situation where the periodic RTC synchronization almost
> > never was able to hit the time window for the update. That happens due the
> > usage of delayed_work and the properties of the timer wheel.
> >
> > While that particular problem is halfways simple to fix this started to
> > unearth other problems with that code particularly with rtc_set_npt_time()
> > but expanded into other things as well.
> >
> >   1) The update offset for rtc-cmos is off by a full second
> >
> >   2) The readout of MC146818 (rtc-cmos and arch code) is broken and can
> >      return garbage.
> >
> >   2) Alexandre questioned the approach in general and wants to get rid of
> >      it. Of course there are better methods to do that and it can be
> >      completely done in user space.
> >
> >      Unfortunately it's not that simple as this would be a user visible
> >      change, so making it at least halfways correct.
> >
> >   3) Alexandre requested to move that code into the NTP code as this is not
> >      really RTC functionality and just usage of the RTC API.
> >
> >   4) The update offset itself was questioned, but the time between the
> >      write and the next seconds increment in the RTC is fundamentaly a
> >      hardware property. The transport time, which is pretty irrelevant for
> >      direct accessible RTCs (rtc-cmos), but relevant for RTC behind i2c/SPI
> >      needs to be added on top.
> >
> >      It's undebated that this transport time cannot be correctly estimated,
> >      but right now it's 500ms which is far off. The correct transport time
> >      can be calibrated, a halfways correct value supplied via DT, but
> >      that's an orthogonal problem.
> >
> > The following series addresses the above:
> >
> >     1) Fix the readout function of MC146818
> >     2) Fix the rtc-cmos offset
> >     3) Reduce the default transport time
> >
> >     4) Switch the NTP periodic sync code to a hrtimer/work combo
> >
> >     5) Move rtc_set_npt_time() to the ntp code
> >     6) Make the update offset more intuitive
> >     7) Simplify the whole machinery
> 
> any opinion on this?
> 

This looks very good to me, however, I think the 10ms offset is a bit
too much. Do you mind waiting one or two days so I can get my test setup
back up?


-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 0/8] ntp/rtc: Fixes and cleanups
  2020-12-09  4:01   ` Alexandre Belloni
@ 2020-12-09 12:39     ` Thomas Gleixner
  0 siblings, 0 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-09 12:39 UTC (permalink / raw)
  To: Alexandre Belloni
  Cc: LKML, Jason Gunthorpe, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On Wed, Dec 09 2020 at 05:01, Alexandre Belloni wrote:
> On 09/12/2020 01:33:08+0100, Thomas Gleixner wrote:
>> > The following series addresses the above:
>> >
>> >     1) Fix the readout function of MC146818
>> >     2) Fix the rtc-cmos offset
>> >     3) Reduce the default transport time
>> >
>> >     4) Switch the NTP periodic sync code to a hrtimer/work combo
>> >
>> >     5) Move rtc_set_npt_time() to the ntp code
>> >     6) Make the update offset more intuitive
>> >     7) Simplify the whole machinery
>> 
>> any opinion on this?
>> 
> This looks very good to me, however, I think the 10ms offset is a bit
> too much.

I pulled it out of thin air TBH. It's at least 50 times more accurate
than what we had before :)

> Do you mind waiting one or two days so I can get my test setup
> back up?

No problem.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 4/8] rtc: core: Make the sync offset default more realistic
  2020-12-06 21:46 ` [patch 4/8] rtc: core: Make the sync offset default more realistic Thomas Gleixner
  2020-12-07 20:55   ` Jason Gunthorpe
@ 2020-12-10 23:59   ` Alexandre Belloni
  2020-12-11  0:23     ` Thomas Gleixner
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
  2 siblings, 1 reply; 54+ messages in thread
From: Alexandre Belloni @ 2020-12-10 23:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Jason Gunthorpe, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

Hi,

On 06/12/2020 22:46:17+0100, Thomas Gleixner wrote:
> The offset which is used to steer the start of an RTC synchronization
> update via rtc_set_ntp_time() is huge. The math behind this is:
> 
>   tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)
> 
> twrite - tsched is the transport time for the write to hit the device.
> 
> t2 - twrite depends on the chip and is for most chips one second.
> 
> The rtc_set_ntp_time() calculation of tsched is:
> 
>     tsched = t2 - 1sec - (t2 - twrite)
> 
> The default for the sync offset is 500ms which means that twrite - tsched
> is 500ms assumed that t2 - twrite is one second.
> 
> This is 0.5 seconds off for RTCs which are directly accessible by IO writes
> and probably for the majority of i2C/SPI based RTC off by an order of
> magnitude. Set it to 10ms which should bring it closer to reality.
> 
> The default can be adjusted by drivers (rtc_cmos does so) and could be
> adjusted further by a calibration method which is an orthogonal problem.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/rtc/class.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/drivers/rtc/class.c
> +++ b/drivers/rtc/class.c
> @@ -201,7 +201,7 @@ static struct rtc_device *rtc_allocate_d
>  	device_initialize(&rtc->dev);
>  
>  	/* Drivers can revise this default after allocating the device. */
> -	rtc->set_offset_nsec =  NSEC_PER_SEC / 2;
> +	rtc->set_offset_nsec =  10 * NSEC_PER_MSEC;

I did retest, on a slow 100kHz i2c bus, with a fairly inconvenient RTC,
The maximum offset to set the RTC was 4845533ns so I'd say 10ms is too
large. Should we make that 5ms ?

Apart from that, on the series, you can add my
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 4/8] rtc: core: Make the sync offset default more realistic
  2020-12-10 23:59   ` Alexandre Belloni
@ 2020-12-11  0:23     ` Thomas Gleixner
  2020-12-11  0:28       ` Alexandre Belloni
  0 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-11  0:23 UTC (permalink / raw)
  To: Alexandre Belloni
  Cc: LKML, Jason Gunthorpe, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

Alexandre,

On Fri, Dec 11 2020 at 00:59, Alexandre Belloni wrote:
> On 06/12/2020 22:46:17+0100, Thomas Gleixner wrote:
>>  	/* Drivers can revise this default after allocating the device. */
>> -	rtc->set_offset_nsec =  NSEC_PER_SEC / 2;
>> +	rtc->set_offset_nsec =  10 * NSEC_PER_MSEC;
>
> I did retest, on a slow 100kHz i2c bus, with a fairly inconvenient RTC,
> The maximum offset to set the RTC was 4845533ns so I'd say 10ms is too
> large. Should we make that 5ms ?

Sure. As I said I pulled the 10 out of thin air.

> Apart from that, on the series, you can add my
> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

I'll amend the s/10/5/ throughout the series while queueing the whole
pile in the timers/core branch unless you want it to be handled
differently.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 4/8] rtc: core: Make the sync offset default more realistic
  2020-12-11  0:23     ` Thomas Gleixner
@ 2020-12-11  0:28       ` Alexandre Belloni
  0 siblings, 0 replies; 54+ messages in thread
From: Alexandre Belloni @ 2020-12-11  0:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Jason Gunthorpe, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On 11/12/2020 01:23:57+0100, Thomas Gleixner wrote:
> Alexandre,
> 
> On Fri, Dec 11 2020 at 00:59, Alexandre Belloni wrote:
> > On 06/12/2020 22:46:17+0100, Thomas Gleixner wrote:
> >>  	/* Drivers can revise this default after allocating the device. */
> >> -	rtc->set_offset_nsec =  NSEC_PER_SEC / 2;
> >> +	rtc->set_offset_nsec =  10 * NSEC_PER_MSEC;
> >
> > I did retest, on a slow 100kHz i2c bus, with a fairly inconvenient RTC,
> > The maximum offset to set the RTC was 4845533ns so I'd say 10ms is too
> > large. Should we make that 5ms ?
> 
> Sure. As I said I pulled the 10 out of thin air.
> 
> > Apart from that, on the series, you can add my
> > Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> 
> I'll amend the s/10/5/ throughout the series while queueing the whole
> pile in the timers/core branch unless you want it to be handled
> differently.
> 

That's fine for me.

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 8/8] ntp: Consolidate the RTC update implementation
  2020-12-07 21:05   ` Jason Gunthorpe
@ 2020-12-11  9:23     ` Thomas Gleixner
  0 siblings, 0 replies; 54+ messages in thread
From: Thomas Gleixner @ 2020-12-11  9:23 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: LKML, Alexandre Belloni, Miroslav Lichvar, John Stultz,
	Prarit Bhargava, Alessandro Zummo, linux-rtc, Peter Zijlstra

On Mon, Dec 07 2020 at 17:05, Jason Gunthorpe wrote:
> On Sun, Dec 06, 2020 at 10:46:21PM +0100, Thomas Gleixner wrote:
>>  static void sync_hw_clock(struct work_struct *work)
>>  {
>> +	static unsigned long offset_nsec = NSEC_PER_SEC / 2;
>
> A comment here explaining this is the default: because the platform is
> assumed to use CMOS, and by the way, this whole thing is obsolete
> don't use it, seems appropriate..

Will add something like that.

> The time split is clearer if you think of it from a bus/datasheet
> perspective, less clear if you try to measure the system directly, eg
> from an alarm. But, I think this  has a better chance of some rtclib
> driver authors to fill in the datasheet value at least.

That's the hope. You know hope dies last...

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [tip: timers/core] ntp: Make the RTC sync offset less obscure
  2020-12-06 21:46 ` [patch 7/8] ntp: Make the RTC sync offset less obscure Thomas Gleixner
@ 2020-12-11 10:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-11 10:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     69eca258c85000564577642ba28335eb4e1df8f0
Gitweb:        https://git.kernel.org/tip/69eca258c85000564577642ba28335eb4e1df8f0
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:46:20 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 11 Dec 2020 10:40:53 +01:00

ntp: Make the RTC sync offset less obscure

The current RTC set_offset_nsec value is not really intuitive to
understand. 

  tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)

The offset is calculated from twrite based on the assumption that t2 -
twrite == 1s. That means for the MC146818 RTC the offset needs to be
negative so that the write happens 500ms before t2.

It's easier to understand when the whole calculation is based on t2. That
avoids negative offsets and the meaning is obvious:

 t2 - twrite:     The time defined by the chip when seconds increment
      		  after the write.

 twrite - tsched: The time for the transport to the point where the chip
 	  	  is updated. 

==> set_offset_nsec =  t2 - tsched
    ttransport      =  twrite - tsched
    tRTCinc         =  t2 - twrite
==> set_offset_nsec =  ttransport + tRTCinc

tRTCinc is a chip property and can be obtained from the data sheet.

ttransport depends on how the RTC is connected. It is close to 0 for
directly accessible RTCs. For RTCs behind a slow bus, e.g. i2c, it's the
time required to send the update over the bus. This can be estimated or
even calibrated, but that's a different problem.

Adjust the implementation and update comments accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.263204937@linutronix.de

---
 drivers/rtc/class.c    |  9 ++++++--
 drivers/rtc/rtc-cmos.c |  2 +-
 include/linux/rtc.h    | 35 +++++++++++++++++++++++++------
 kernel/time/ntp.c      | 47 ++++++++++++++++++++---------------------
 4 files changed, 61 insertions(+), 32 deletions(-)

diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index d795737..5855aa2 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -200,8 +200,13 @@ static struct rtc_device *rtc_allocate_device(void)
 
 	device_initialize(&rtc->dev);
 
-	/* Drivers can revise this default after allocating the device. */
-	rtc->set_offset_nsec =  5 * NSEC_PER_MSEC;
+	/*
+	 * Drivers can revise this default after allocating the device.
+	 * The default is what most RTCs do: Increment seconds exactly one
+	 * second after the write happened. This adds a default transport
+	 * time of 5ms which is at least halfways close to reality.
+	 */
+	rtc->set_offset_nsec = NSEC_PER_SEC + 5 * NSEC_PER_MSEC;
 
 	rtc->irq_freq = 1;
 	rtc->max_user_freq = 64;
diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 7728fac..c5bcd2a 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -869,7 +869,7 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq)
 		goto cleanup2;
 
 	/* Set the sync offset for the periodic 11min update correct */
-	cmos_rtc.rtc->set_offset_nsec = -(NSEC_PER_SEC / 2);
+	cmos_rtc.rtc->set_offset_nsec = NSEC_PER_SEC / 2;
 
 	/* export at least the first block of NVRAM */
 	nvmem_cfg.size = address_space - NVRAM_OFFSET;
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index ff62680..b829382 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -110,13 +110,36 @@ struct rtc_device {
 	/* Some hardware can't support UIE mode */
 	int uie_unsupported;
 
-	/* Number of nsec it takes to set the RTC clock. This influences when
-	 * the set ops are called. An offset:
-	 *   - of 0.5 s will call RTC set for wall clock time 10.0 s at 9.5 s
-	 *   - of 1.5 s will call RTC set for wall clock time 10.0 s at 8.5 s
-	 *   - of -0.5 s will call RTC set for wall clock time 10.0 s at 10.5 s
+	/*
+	 * This offset specifies the update timing of the RTC.
+	 *
+	 * tsched     t1 write(t2.tv_sec - 1sec))  t2 RTC increments seconds
+	 *
+	 * The offset defines how tsched is computed so that the write to
+	 * the RTC (t2.tv_sec - 1sec) is correct versus the time required
+	 * for the transport of the write and the time which the RTC needs
+	 * to increment seconds the first time after the write (t2).
+	 *
+	 * For direct accessible RTCs tsched ~= t1 because the write time
+	 * is negligible. For RTCs behind slow busses the transport time is
+	 * significant and has to be taken into account.
+	 *
+	 * The time between the write (t1) and the first increment after
+	 * the write (t2) is RTC specific. For a MC146818 RTC it's 500ms,
+	 * for many others it's exactly 1 second. Consult the datasheet.
+	 *
+	 * The value of this offset is also used to calculate the to be
+	 * written value (t2.tv_sec - 1sec) at tsched.
+	 *
+	 * The default value for this is NSEC_PER_SEC + 10 msec default
+	 * transport time. The offset can be adjusted by drivers so the
+	 * calculation for the to be written value at tsched becomes
+	 * correct:
+	 *
+	 *	newval = tsched + set_offset_nsec - NSEC_PER_SEC
+	 * and  (tsched + set_offset_nsec) % NSEC_PER_SEC == 0
 	 */
-	long set_offset_nsec;
+	unsigned long set_offset_nsec;
 
 	bool registered;
 
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 84a5546..a34ac06 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -520,22 +520,33 @@ static void sched_sync_hw_clock(unsigned long offset_nsec, bool retry)
 }
 
 /*
- * Determine if we can call to driver to set the time. Drivers can only be
- * called to set a second aligned time value, and the field set_offset_nsec
- * specifies how far away from the second aligned time to call the driver.
+ * Check whether @now is correct versus the required time to update the RTC
+ * and calculate the value which needs to be written to the RTC so that the
+ * next seconds increment of the RTC after the write is aligned with the next
+ * seconds increment of clock REALTIME.
  *
- * This also computes 'to_set' which is the time we are trying to set, and has
- * a zero in tv_nsecs, such that:
- *    to_set - set_delay_nsec == now +/- FUZZ
+ * tsched     t1 write(t2.tv_sec - 1sec))	t2 RTC increments seconds
  *
+ * t2.tv_nsec == 0
+ * tsched = t2 - set_offset_nsec
+ * newval = t2 - NSEC_PER_SEC
+ *
+ * ==> neval = tsched + set_offset_nsec - NSEC_PER_SEC
+ *
+ * As the execution of this code is not guaranteed to happen exactly at
+ * tsched this allows it to happen within a fuzzy region:
+ *
+ *	abs(now - tsched) < FUZZ
+ *
+ * If @now is not inside the allowed window the function returns false.
  */
-static inline bool rtc_tv_nsec_ok(long set_offset_nsec,
+static inline bool rtc_tv_nsec_ok(unsigned long set_offset_nsec,
 				  struct timespec64 *to_set,
 				  const struct timespec64 *now)
 {
 	/* Allowed error in tv_nsec, arbitarily set to 5 jiffies in ns. */
 	const unsigned long TIME_SET_NSEC_FUZZ = TICK_NSEC * 5;
-	struct timespec64 delay = {.tv_sec = 0,
+	struct timespec64 delay = {.tv_sec = -1,
 				   .tv_nsec = set_offset_nsec};
 
 	*to_set = timespec64_add(*now, delay);
@@ -557,11 +568,11 @@ static inline bool rtc_tv_nsec_ok(long set_offset_nsec,
 /*
  * rtc_set_ntp_time - Save NTP synchronized time to the RTC
  */
-static int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec)
+static int rtc_set_ntp_time(struct timespec64 now, unsigned long *offset_nsec)
 {
+	struct timespec64 to_set;
 	struct rtc_device *rtc;
 	struct rtc_time tm;
-	struct timespec64 to_set;
 	int err = -ENODEV;
 	bool ok;
 
@@ -572,19 +583,9 @@ static int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec)
 	if (!rtc->ops || !rtc->ops->set_time)
 		goto out_close;
 
-	/*
-	 * Compute the value of tv_nsec we require the caller to supply in
-	 * now.tv_nsec.  This is the value such that (now +
-	 * set_offset_nsec).tv_nsec == 0.
-	 */
-	set_normalized_timespec64(&to_set, 0, -rtc->set_offset_nsec);
-	*target_nsec = to_set.tv_nsec;
+	/* Store the update offset for this RTC */
+	*offset_nsec = rtc->set_offset_nsec;
 
-	/*
-	 * The ntp code must call this with the correct value in tv_nsec, if
-	 * it does not we update target_nsec and return EPROTO to make the ntp
-	 * code try again later.
-	 */
 	ok = rtc_tv_nsec_ok(rtc->set_offset_nsec, &to_set, &now);
 	if (!ok) {
 		err = -EPROTO;
@@ -657,7 +658,7 @@ static bool sync_cmos_clock(void)
 	 * implement this legacy API.
 	 */
 	ktime_get_real_ts64(&now);
-	if (rtc_tv_nsec_ok(-1 * target_nsec, &adjust, &now)) {
+	if (rtc_tv_nsec_ok(target_nsec, &adjust, &now)) {
 		if (persistent_clock_is_local)
 			adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
 		rc = update_persistent_clock64(adjust);

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [tip: timers/core] ntp: Consolidate the RTC update implementation
  2020-12-06 21:46 ` [patch 8/8] ntp: Consolidate the RTC update implementation Thomas Gleixner
  2020-12-07 21:05   ` Jason Gunthorpe
@ 2020-12-11 10:07   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-11 10:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Jason Gunthorpe, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     76e87d96b30b5fee91b381fbc444a3eabcd9469a
Gitweb:        https://git.kernel.org/tip/76e87d96b30b5fee91b381fbc444a3eabcd9469a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:46:21 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 11 Dec 2020 10:40:53 +01:00

ntp: Consolidate the RTC update implementation

The code for the legacy RTC and the RTC class based update are pretty much
the same. Consolidate the common parts into one function and just invoke
the actual setter functions.

For RTC class based devices the update code checks whether the offset is
valid for the device, which is usually not the case for the first
invocation. If it's not the same it stores the correct offset and lets the
caller try again. That's not much different from the previous approach
where the first invocation had a pretty low probability to actually hit the
allowed window.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.355743355@linutronix.de

---
 kernel/time/ntp.c | 144 ++++++++++++++++-----------------------------
 1 file changed, 52 insertions(+), 92 deletions(-)

diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index a34ac06..7404d38 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -564,118 +564,53 @@ static inline bool rtc_tv_nsec_ok(unsigned long set_offset_nsec,
 	return false;
 }
 
+#ifdef CONFIG_GENERIC_CMOS_UPDATE
+int __weak update_persistent_clock64(struct timespec64 now64)
+{
+	return -ENODEV;
+}
+#else
+static inline int update_persistent_clock64(struct timespec64 now64)
+{
+	return -ENODEV;
+}
+#endif
+
 #ifdef CONFIG_RTC_SYSTOHC
-/*
- * rtc_set_ntp_time - Save NTP synchronized time to the RTC
- */
-static int rtc_set_ntp_time(struct timespec64 now, unsigned long *offset_nsec)
+/* Save NTP synchronized time to the RTC */
+static int update_rtc(struct timespec64 *to_set, unsigned long *offset_nsec)
 {
-	struct timespec64 to_set;
 	struct rtc_device *rtc;
 	struct rtc_time tm;
 	int err = -ENODEV;
-	bool ok;
 
 	rtc = rtc_class_open(CONFIG_RTC_SYSTOHC_DEVICE);
 	if (!rtc)
-		goto out_err;
+		return -ENODEV;
 
 	if (!rtc->ops || !rtc->ops->set_time)
 		goto out_close;
 
-	/* Store the update offset for this RTC */
-	*offset_nsec = rtc->set_offset_nsec;
-
-	ok = rtc_tv_nsec_ok(rtc->set_offset_nsec, &to_set, &now);
-	if (!ok) {
-		err = -EPROTO;
-		goto out_close;
+	/* First call might not have the correct offset */
+	if (*offset_nsec == rtc->set_offset_nsec) {
+		rtc_time64_to_tm(to_set->tv_sec, &tm);
+		err = rtc_set_time(rtc, &tm);
+	} else {
+		/* Store the update offset and let the caller try again */
+		*offset_nsec = rtc->set_offset_nsec;
+		err = -EAGAIN;
 	}
-
-	rtc_time64_to_tm(to_set.tv_sec, &tm);
-
-	err = rtc_set_time(rtc, &tm);
-
 out_close:
 	rtc_class_close(rtc);
-out_err:
 	return err;
 }
-
-static void sync_rtc_clock(void)
-{
-	unsigned long offset_nsec;
-	struct timespec64 adjust;
-	int rc;
-
-	ktime_get_real_ts64(&adjust);
-
-	if (persistent_clock_is_local)
-		adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
-
-	/*
-	 * The current RTC in use will provide the nanoseconds offset prior
-	 * to a full second it wants to be called at, and invokes
-	 * rtc_tv_nsec_ok() internally.
-	 */
-	rc = rtc_set_ntp_time(adjust, &offset_nsec);
-	if (rc == -ENODEV)
-		return;
-
-	sched_sync_hw_clock(offset_nsec, rc != 0);
-}
 #else
-static inline void sync_rtc_clock(void) { }
-#endif
-
-#ifdef CONFIG_GENERIC_CMOS_UPDATE
-int __weak update_persistent_clock64(struct timespec64 now64)
+static inline int update_rtc(struct timespec64 *to_set, unsigned long *offset_nsec)
 {
 	return -ENODEV;
 }
 #endif
 
-static bool sync_cmos_clock(void)
-{
-	static bool no_cmos;
-	struct timespec64 now;
-	struct timespec64 adjust;
-	int rc = -EPROTO;
-	long target_nsec = NSEC_PER_SEC / 2;
-
-	if (!IS_ENABLED(CONFIG_GENERIC_CMOS_UPDATE))
-		return false;
-
-	if (no_cmos)
-		return false;
-
-	/*
-	 * Historically update_persistent_clock64() has followed x86
-	 * semantics, which match the MC146818A/etc RTC. This RTC will store
-	 * 'adjust' and then in .5s it will advance once second.
-	 *
-	 * Architectures are strongly encouraged to use rtclib and not
-	 * implement this legacy API.
-	 */
-	ktime_get_real_ts64(&now);
-	if (rtc_tv_nsec_ok(target_nsec, &adjust, &now)) {
-		if (persistent_clock_is_local)
-			adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
-		rc = update_persistent_clock64(adjust);
-		/*
-		 * The machine does not support update_persistent_clock64 even
-		 * though it defines CONFIG_GENERIC_CMOS_UPDATE.
-		 */
-		if (rc == -ENODEV) {
-			no_cmos = true;
-			return false;
-		}
-	}
-
-	sched_sync_hw_clock(target_nsec, rc != 0);
-	return true;
-}
-
 /*
  * If we have an externally synchronized Linux clock, then update RTC clock
  * accordingly every ~11 minutes. Generally RTCs can only store second
@@ -687,6 +622,15 @@ static bool sync_cmos_clock(void)
 static void sync_hw_clock(struct work_struct *work)
 {
 	/*
+	 * The default synchronization offset is 500ms for the deprecated
+	 * update_persistent_clock64() under the assumption that it uses
+	 * the infamous CMOS clock (MC146818).
+	 */
+	static unsigned long offset_nsec = NSEC_PER_SEC / 2;
+	struct timespec64 now, to_set;
+	int res = -EAGAIN;
+
+	/*
 	 * Don't update if STA_UNSYNC is set and if ntp_notify_cmos_timer()
 	 * managed to schedule the work between the timer firing and the
 	 * work being able to rearm the timer. Wait for the timer to expire.
@@ -694,10 +638,26 @@ static void sync_hw_clock(struct work_struct *work)
 	if (!ntp_synced() || hrtimer_is_queued(&sync_hrtimer))
 		return;
 
-	if (sync_cmos_clock())
-		return;
+	ktime_get_real_ts64(&now);
+	/* If @now is not in the allowed window, try again */
+	if (!rtc_tv_nsec_ok(offset_nsec, &to_set, &now))
+		goto rearm;
 
-	sync_rtc_clock();
+	/* Take timezone adjusted RTCs into account */
+	if (persistent_clock_is_local)
+		to_set.tv_sec -= (sys_tz.tz_minuteswest * 60);
+
+	/* Try the legacy RTC first. */
+	res = update_persistent_clock64(to_set);
+	if (res != -ENODEV)
+		goto rearm;
+
+	/* Try the RTC class */
+	res = update_rtc(&to_set, &offset_nsec);
+	if (res == -ENODEV)
+		return;
+rearm:
+	sched_sync_hw_clock(offset_nsec, res != 0);
 }
 
 void ntp_notify_cmos_timer(void)

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [tip: timers/core] ntp, rtc: Move rtc_set_ntp_time() to ntp code
  2020-12-06 21:46 ` [patch 6/8] ntp, rtc: Move rtc_set_ntp_time() to ntp code Thomas Gleixner
  2020-12-07 20:59   ` Jason Gunthorpe
@ 2020-12-11 10:07   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-11 10:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Jason Gunthorpe, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     33e62e832384c8cb523044e0e9d99d7133f98e93
Gitweb:        https://git.kernel.org/tip/33e62e832384c8cb523044e0e9d99d7133f98e93
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:46:19 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 11 Dec 2020 10:40:52 +01:00

ntp, rtc: Move rtc_set_ntp_time() to ntp code

rtc_set_ntp_time() is not really RTC functionality as the code is just a
user of RTC. Move it into the NTP code which allows further cleanups.

Requested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.166871172@linutronix.de

---
 drivers/rtc/Makefile  |  1 +-
 drivers/rtc/systohc.c | 61 +-----------------------------
 include/linux/rtc.h   | 34 +----------------
 kernel/time/ntp.c     | 88 ++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 85 insertions(+), 99 deletions(-)
 delete mode 100644 drivers/rtc/systohc.c

diff --git a/drivers/rtc/Makefile b/drivers/rtc/Makefile
index bfb5746..bb8f319 100644
--- a/drivers/rtc/Makefile
+++ b/drivers/rtc/Makefile
@@ -6,7 +6,6 @@
 ccflags-$(CONFIG_RTC_DEBUG)	:= -DDEBUG
 
 obj-$(CONFIG_RTC_LIB)		+= lib.o
-obj-$(CONFIG_RTC_SYSTOHC)	+= systohc.o
 obj-$(CONFIG_RTC_CLASS)		+= rtc-core.o
 obj-$(CONFIG_RTC_MC146818_LIB)	+= rtc-mc146818-lib.o
 rtc-core-y			:= class.o interface.o
diff --git a/drivers/rtc/systohc.c b/drivers/rtc/systohc.c
deleted file mode 100644
index 8b70f05..0000000
--- a/drivers/rtc/systohc.c
+++ /dev/null
@@ -1,61 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <linux/rtc.h>
-#include <linux/time.h>
-
-/**
- * rtc_set_ntp_time - Save NTP synchronized time to the RTC
- * @now: Current time of day
- * @target_nsec: pointer for desired now->tv_nsec value
- *
- * Replacement for the NTP platform function update_persistent_clock64
- * that stores time for later retrieval by rtc_hctosys.
- *
- * Returns 0 on successful RTC update, -ENODEV if a RTC update is not
- * possible at all, and various other -errno for specific temporary failure
- * cases.
- *
- * -EPROTO is returned if now.tv_nsec is not close enough to *target_nsec.
- *
- * If temporary failure is indicated the caller should try again 'soon'
- */
-int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec)
-{
-	struct rtc_device *rtc;
-	struct rtc_time tm;
-	struct timespec64 to_set;
-	int err = -ENODEV;
-	bool ok;
-
-	rtc = rtc_class_open(CONFIG_RTC_SYSTOHC_DEVICE);
-	if (!rtc)
-		goto out_err;
-
-	if (!rtc->ops || !rtc->ops->set_time)
-		goto out_close;
-
-	/* Compute the value of tv_nsec we require the caller to supply in
-	 * now.tv_nsec.  This is the value such that (now +
-	 * set_offset_nsec).tv_nsec == 0.
-	 */
-	set_normalized_timespec64(&to_set, 0, -rtc->set_offset_nsec);
-	*target_nsec = to_set.tv_nsec;
-
-	/* The ntp code must call this with the correct value in tv_nsec, if
-	 * it does not we update target_nsec and return EPROTO to make the ntp
-	 * code try again later.
-	 */
-	ok = rtc_tv_nsec_ok(rtc->set_offset_nsec, &to_set, &now);
-	if (!ok) {
-		err = -EPROTO;
-		goto out_close;
-	}
-
-	rtc_time64_to_tm(to_set.tv_sec, &tm);
-
-	err = rtc_set_time(rtc, &tm);
-
-out_close:
-	rtc_class_close(rtc);
-out_err:
-	return err;
-}
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index 22d1575..ff62680 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -165,7 +165,6 @@ int __rtc_register_device(struct module *owner, struct rtc_device *rtc);
 
 extern int rtc_read_time(struct rtc_device *rtc, struct rtc_time *tm);
 extern int rtc_set_time(struct rtc_device *rtc, struct rtc_time *tm);
-extern int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec);
 int __rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm);
 extern int rtc_read_alarm(struct rtc_device *rtc,
 			struct rtc_wkalrm *alrm);
@@ -205,39 +204,6 @@ static inline bool is_leap_year(unsigned int year)
 	return (!(year % 4) && (year % 100)) || !(year % 400);
 }
 
-/* Determine if we can call to driver to set the time. Drivers can only be
- * called to set a second aligned time value, and the field set_offset_nsec
- * specifies how far away from the second aligned time to call the driver.
- *
- * This also computes 'to_set' which is the time we are trying to set, and has
- * a zero in tv_nsecs, such that:
- *    to_set - set_delay_nsec == now +/- FUZZ
- *
- */
-static inline bool rtc_tv_nsec_ok(s64 set_offset_nsec,
-				  struct timespec64 *to_set,
-				  const struct timespec64 *now)
-{
-	/* Allowed error in tv_nsec, arbitarily set to 5 jiffies in ns. */
-	const unsigned long TIME_SET_NSEC_FUZZ = TICK_NSEC * 5;
-	struct timespec64 delay = {.tv_sec = 0,
-				   .tv_nsec = set_offset_nsec};
-
-	*to_set = timespec64_add(*now, delay);
-
-	if (to_set->tv_nsec < TIME_SET_NSEC_FUZZ) {
-		to_set->tv_nsec = 0;
-		return true;
-	}
-
-	if (to_set->tv_nsec > NSEC_PER_SEC - TIME_SET_NSEC_FUZZ) {
-		to_set->tv_sec++;
-		to_set->tv_nsec = 0;
-		return true;
-	}
-	return false;
-}
-
 #define rtc_register_device(device) \
 	__rtc_register_device(THIS_MODULE, device)
 
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index ff1a7b8..84a5546 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -519,15 +519,94 @@ static void sched_sync_hw_clock(unsigned long offset_nsec, bool retry)
 	hrtimer_start(&sync_hrtimer, exp, HRTIMER_MODE_ABS);
 }
 
+/*
+ * Determine if we can call to driver to set the time. Drivers can only be
+ * called to set a second aligned time value, and the field set_offset_nsec
+ * specifies how far away from the second aligned time to call the driver.
+ *
+ * This also computes 'to_set' which is the time we are trying to set, and has
+ * a zero in tv_nsecs, such that:
+ *    to_set - set_delay_nsec == now +/- FUZZ
+ *
+ */
+static inline bool rtc_tv_nsec_ok(long set_offset_nsec,
+				  struct timespec64 *to_set,
+				  const struct timespec64 *now)
+{
+	/* Allowed error in tv_nsec, arbitarily set to 5 jiffies in ns. */
+	const unsigned long TIME_SET_NSEC_FUZZ = TICK_NSEC * 5;
+	struct timespec64 delay = {.tv_sec = 0,
+				   .tv_nsec = set_offset_nsec};
+
+	*to_set = timespec64_add(*now, delay);
+
+	if (to_set->tv_nsec < TIME_SET_NSEC_FUZZ) {
+		to_set->tv_nsec = 0;
+		return true;
+	}
+
+	if (to_set->tv_nsec > NSEC_PER_SEC - TIME_SET_NSEC_FUZZ) {
+		to_set->tv_sec++;
+		to_set->tv_nsec = 0;
+		return true;
+	}
+	return false;
+}
+
+#ifdef CONFIG_RTC_SYSTOHC
+/*
+ * rtc_set_ntp_time - Save NTP synchronized time to the RTC
+ */
+static int rtc_set_ntp_time(struct timespec64 now, unsigned long *target_nsec)
+{
+	struct rtc_device *rtc;
+	struct rtc_time tm;
+	struct timespec64 to_set;
+	int err = -ENODEV;
+	bool ok;
+
+	rtc = rtc_class_open(CONFIG_RTC_SYSTOHC_DEVICE);
+	if (!rtc)
+		goto out_err;
+
+	if (!rtc->ops || !rtc->ops->set_time)
+		goto out_close;
+
+	/*
+	 * Compute the value of tv_nsec we require the caller to supply in
+	 * now.tv_nsec.  This is the value such that (now +
+	 * set_offset_nsec).tv_nsec == 0.
+	 */
+	set_normalized_timespec64(&to_set, 0, -rtc->set_offset_nsec);
+	*target_nsec = to_set.tv_nsec;
+
+	/*
+	 * The ntp code must call this with the correct value in tv_nsec, if
+	 * it does not we update target_nsec and return EPROTO to make the ntp
+	 * code try again later.
+	 */
+	ok = rtc_tv_nsec_ok(rtc->set_offset_nsec, &to_set, &now);
+	if (!ok) {
+		err = -EPROTO;
+		goto out_close;
+	}
+
+	rtc_time64_to_tm(to_set.tv_sec, &tm);
+
+	err = rtc_set_time(rtc, &tm);
+
+out_close:
+	rtc_class_close(rtc);
+out_err:
+	return err;
+}
+
 static void sync_rtc_clock(void)
 {
 	unsigned long offset_nsec;
 	struct timespec64 adjust;
 	int rc;
 
-	if (!IS_ENABLED(CONFIG_RTC_SYSTOHC))
-		return;
-
 	ktime_get_real_ts64(&adjust);
 
 	if (persistent_clock_is_local)
@@ -544,6 +623,9 @@ static void sync_rtc_clock(void)
 
 	sched_sync_hw_clock(offset_nsec, rc != 0);
 }
+#else
+static inline void sync_rtc_clock(void) { }
+#endif
 
 #ifdef CONFIG_GENERIC_CMOS_UPDATE
 int __weak update_persistent_clock64(struct timespec64 now64)

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [tip: timers/core] rtc: core: Make the sync offset default more realistic
  2020-12-06 21:46 ` [patch 4/8] rtc: core: Make the sync offset default more realistic Thomas Gleixner
  2020-12-07 20:55   ` Jason Gunthorpe
  2020-12-10 23:59   ` Alexandre Belloni
@ 2020-12-11 10:07   ` tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-11 10:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Jason Gunthorpe, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     354c796b9270eb4780e59e3bdb83a3ae4930a832
Gitweb:        https://git.kernel.org/tip/354c796b9270eb4780e59e3bdb83a3ae4930a832
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:46:17 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 11 Dec 2020 10:40:52 +01:00

rtc: core: Make the sync offset default more realistic

The offset which is used to steer the start of an RTC synchronization
update via rtc_set_ntp_time() is huge. The math behind this is:

  tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)

twrite - tsched is the transport time for the write to hit the device.

t2 - twrite depends on the chip and is for most chips one second.

The rtc_set_ntp_time() calculation of tsched is:

    tsched = t2 - 1sec - (t2 - twrite)

The default for the sync offset is 500ms which means that twrite - tsched
is 500ms assumed that t2 - twrite is one second.

This is 0.5 seconds off for RTCs which are directly accessible by IO writes
and probably for the majority of i2C/SPI based RTC off by an order of
magnitude. Set it to 5ms which should bring it closer to reality.

The default can be adjusted by drivers (rtc_cmos does so) and could be
adjusted further by a calibration method which is an orthogonal problem.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220541.960333166@linutronix.de

---
 drivers/rtc/class.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index 7c88d19..d795737 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -201,7 +201,7 @@ static struct rtc_device *rtc_allocate_device(void)
 	device_initialize(&rtc->dev);
 
 	/* Drivers can revise this default after allocating the device. */
-	rtc->set_offset_nsec =  NSEC_PER_SEC / 2;
+	rtc->set_offset_nsec =  5 * NSEC_PER_MSEC;
 
 	rtc->irq_freq = 1;
 	rtc->max_user_freq = 64;

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [tip: timers/core] rtc: cmos: Make rtc_cmos sync offset correct
  2020-12-06 21:46 ` [patch 3/8] rtc: cmos: Make rtc_cmos sync offset correct Thomas Gleixner
  2020-12-07 20:50   ` Jason Gunthorpe
@ 2020-12-11 10:07   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-11 10:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Jason Gunthorpe, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     b0ecd8e8c5ef376777277c4c2db7de92ac59f23f
Gitweb:        https://git.kernel.org/tip/b0ecd8e8c5ef376777277c4c2db7de92ac59f23f
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:46:16 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 11 Dec 2020 10:40:52 +01:00

rtc: cmos: Make rtc_cmos sync offset correct

The offset for rtc_cmos must be -500ms to work correctly with the current
implementation of rtc_set_ntp_time() due to the following:

  tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)

twrite - tsched is the transport time for the write to hit the device,
which is negligible for this chip because it's accessed directly.

t2 - twrite = 500ms according to the datasheet.

But rtc_set_ntp_time() calculation of tsched is:

    tsched = t2 - 1sec - (t2 - twrite)

The default for the sync offset is 500ms which means that the write happens
at t2 - 1.5 seconds which is obviously off by a second for this device.

Make the offset -500ms so it works correct.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220541.830517160@linutronix.de

---
 drivers/rtc/rtc-cmos.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index c633319..7728fac 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -868,6 +868,9 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq)
 	if (retval)
 		goto cleanup2;
 
+	/* Set the sync offset for the periodic 11min update correct */
+	cmos_rtc.rtc->set_offset_nsec = -(NSEC_PER_SEC / 2);
+
 	/* export at least the first block of NVRAM */
 	nvmem_cfg.size = address_space - NVRAM_OFFSET;
 	if (rtc_nvmem_register(cmos_rtc.rtc, &nvmem_cfg))

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [tip: timers/core] ntp: Make the RTC synchronization more reliable
  2020-12-06 21:46 ` [patch 5/8] ntp: Make the RTC synchronization more reliable Thomas Gleixner
  2020-12-07 12:47   ` Miroslav Lichvar
  2020-12-07 20:59   ` Jason Gunthorpe
@ 2020-12-11 10:07   ` tip-bot2 for Thomas Gleixner
  2020-12-29 19:41   ` [patch 5/8] " Geert Uytterhoeven
  3 siblings, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-11 10:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Miroslav Lichvar, Thomas Gleixner, Jason Gunthorpe,
	Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     c9e6189fb03123a7dfb93589280347b46f30b161
Gitweb:        https://git.kernel.org/tip/c9e6189fb03123a7dfb93589280347b46f30b161
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:46:18 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 11 Dec 2020 10:40:52 +01:00

ntp: Make the RTC synchronization more reliable

Miroslav reported that the periodic RTC synchronization in the NTP code
fails more often than not to hit the specified update window.

The reason is that the code uses delayed_work to schedule the update which
needs to be in thread context as the underlying RTC might be connected via
a slow bus, e.g. I2C. In the update function it verifies whether the
current time is correct vs. the requirements of the underlying RTC.

But delayed_work is using the timer wheel for scheduling which is
inaccurate by design. Depending on the distance to the expiry the wheel
gets less granular to allow batching and to avoid the cascading of the
original timer wheel. See 500462a9de65 ("timers: Switch to a non-cascading
wheel") and the code for further details.

The code already deals with this by splitting the 660 seconds period into a
long 659 seconds timer and then retrying with a smaller delta.

But looking at the actual granularities of the timer wheel (which depend on
the HZ configuration) the 659 seconds timer ends up in an outer wheel level
and is affected by a worst case granularity of:

HZ          Granularity
1000        32s
 250        16s
 100        40s

So the initial timer can be already off by max 12.5% which is not a big
issue as the period of the sync is defined as ~11 minutes.

The fine grained second attempt schedules to the desired update point with
a timer expiring less than a second from now. Depending on the actual delta
and the HZ setting even the second attempt can end up in outer wheel levels
which have a large enough granularity to make the correctness check fail.

As this is a fundamental property of the timer wheel there is no way to
make this more accurate short of iterating in one jiffies steps towards the
update point.

Switch it to an hrtimer instead which schedules the actual update work. The
hrtimer will expire precisely (max 1 jiffie delay when high resolution
timers are not available). The actual scheduling delay of the work is the
same as before.

The update is triggered from do_adjtimex() which is a bit racy but not much
more racy than it was before:

     if (ntp_synced())
     	queue_delayed_work(system_power_efficient_wq, &sync_work, 0);

which is racy when the work is currently executed and has not managed to
reschedule itself.

This becomes now:

     if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
     	queue_work(system_power_efficient_wq, &sync_work, 0);

which is racy when the hrtimer has expired and the work is currently
executed and has not yet managed to rearm the hrtimer.

Not a big problem as it just schedules work for nothing.

The new implementation has a safe guard in place to catch the case where
the hrtimer is queued on entry to the work function and avoids an extra
update attempt of the RTC that way.

Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Miroslav Lichvar <mlichvar@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.062910520@linutronix.de

---
 include/linux/timex.h      |  1 +-
 kernel/time/ntp.c          | 90 +++++++++++++++++++------------------
 kernel/time/ntp_internal.h |  7 +++-
 3 files changed, 55 insertions(+), 43 deletions(-)

diff --git a/include/linux/timex.h b/include/linux/timex.h
index ce08597..9c2e54f 100644
--- a/include/linux/timex.h
+++ b/include/linux/timex.h
@@ -157,7 +157,6 @@ extern int do_clock_adjtime(const clockid_t which_clock, struct __kernel_timex *
 extern void hardpps(const struct timespec64 *, const struct timespec64 *);
 
 int read_current_timer(unsigned long *timer_val);
-void ntp_notify_cmos_timer(void);
 
 /* The clock frequency of the i8253/i8254 PIT */
 #define PIT_TICK_RATE 1193182ul
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 069ca78..ff1a7b8 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -494,65 +494,55 @@ out:
 	return leap;
 }
 
+#if defined(CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC)
 static void sync_hw_clock(struct work_struct *work);
-static DECLARE_DELAYED_WORK(sync_work, sync_hw_clock);
-
-static void sched_sync_hw_clock(struct timespec64 now,
-				unsigned long target_nsec, bool fail)
+static DECLARE_WORK(sync_work, sync_hw_clock);
+static struct hrtimer sync_hrtimer;
+#define SYNC_PERIOD_NS (11UL * 60 * NSEC_PER_SEC)
 
+static enum hrtimer_restart sync_timer_callback(struct hrtimer *timer)
 {
-	struct timespec64 next;
+	queue_work(system_power_efficient_wq, &sync_work);
 
-	ktime_get_real_ts64(&next);
-	if (!fail)
-		next.tv_sec = 659;
-	else {
-		/*
-		 * Try again as soon as possible. Delaying long periods
-		 * decreases the accuracy of the work queue timer. Due to this
-		 * the algorithm is very likely to require a short-sleep retry
-		 * after the above long sleep to synchronize ts_nsec.
-		 */
-		next.tv_sec = 0;
-	}
+	return HRTIMER_NORESTART;
+}
 
-	/* Compute the needed delay that will get to tv_nsec == target_nsec */
-	next.tv_nsec = target_nsec - next.tv_nsec;
-	if (next.tv_nsec <= 0)
-		next.tv_nsec += NSEC_PER_SEC;
-	if (next.tv_nsec >= NSEC_PER_SEC) {
-		next.tv_sec++;
-		next.tv_nsec -= NSEC_PER_SEC;
-	}
+static void sched_sync_hw_clock(unsigned long offset_nsec, bool retry)
+{
+	ktime_t exp = ktime_set(ktime_get_real_seconds(), 0);
+
+	if (retry)
+		exp = ktime_add_ns(exp, 2 * NSEC_PER_SEC - offset_nsec);
+	else
+		exp = ktime_add_ns(exp, SYNC_PERIOD_NS - offset_nsec);
 
-	queue_delayed_work(system_power_efficient_wq, &sync_work,
-			   timespec64_to_jiffies(&next));
+	hrtimer_start(&sync_hrtimer, exp, HRTIMER_MODE_ABS);
 }
 
 static void sync_rtc_clock(void)
 {
-	unsigned long target_nsec;
-	struct timespec64 adjust, now;
+	unsigned long offset_nsec;
+	struct timespec64 adjust;
 	int rc;
 
 	if (!IS_ENABLED(CONFIG_RTC_SYSTOHC))
 		return;
 
-	ktime_get_real_ts64(&now);
+	ktime_get_real_ts64(&adjust);
 
-	adjust = now;
 	if (persistent_clock_is_local)
 		adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
 
 	/*
-	 * The current RTC in use will provide the target_nsec it wants to be
-	 * called at, and does rtc_tv_nsec_ok internally.
+	 * The current RTC in use will provide the nanoseconds offset prior
+	 * to a full second it wants to be called at, and invokes
+	 * rtc_tv_nsec_ok() internally.
 	 */
-	rc = rtc_set_ntp_time(adjust, &target_nsec);
+	rc = rtc_set_ntp_time(adjust, &offset_nsec);
 	if (rc == -ENODEV)
 		return;
 
-	sched_sync_hw_clock(now, target_nsec, rc);
+	sched_sync_hw_clock(offset_nsec, rc != 0);
 }
 
 #ifdef CONFIG_GENERIC_CMOS_UPDATE
@@ -599,7 +589,7 @@ static bool sync_cmos_clock(void)
 		}
 	}
 
-	sched_sync_hw_clock(now, target_nsec, rc);
+	sched_sync_hw_clock(target_nsec, rc != 0);
 	return true;
 }
 
@@ -613,7 +603,12 @@ static bool sync_cmos_clock(void)
  */
 static void sync_hw_clock(struct work_struct *work)
 {
-	if (!ntp_synced())
+	/*
+	 * Don't update if STA_UNSYNC is set and if ntp_notify_cmos_timer()
+	 * managed to schedule the work between the timer firing and the
+	 * work being able to rearm the timer. Wait for the timer to expire.
+	 */
+	if (!ntp_synced() || hrtimer_is_queued(&sync_hrtimer))
 		return;
 
 	if (sync_cmos_clock())
@@ -624,13 +619,23 @@ static void sync_hw_clock(struct work_struct *work)
 
 void ntp_notify_cmos_timer(void)
 {
-	if (!ntp_synced())
-		return;
+	/*
+	 * When the work is currently executed but has not yet the timer
+	 * rearmed this queues the work immediately again. No big issue,
+	 * just a pointless work scheduled.
+	 */
+	if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
+		queue_work(system_power_efficient_wq, &sync_work);
+}
 
-	if (IS_ENABLED(CONFIG_GENERIC_CMOS_UPDATE) ||
-	    IS_ENABLED(CONFIG_RTC_SYSTOHC))
-		queue_delayed_work(system_power_efficient_wq, &sync_work, 0);
+static void __init ntp_init_cmos_sync(void)
+{
+	hrtimer_init(&sync_hrtimer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
+	sync_hrtimer.function = sync_timer_callback;
 }
+#else /* CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC) */
+static inline void __init ntp_init_cmos_sync(void) { }
+#endif /* !CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC) */
 
 /*
  * Propagate a new txc->status value into the NTP state:
@@ -1044,4 +1049,5 @@ __setup("ntp_tick_adj=", ntp_tick_adj_setup);
 void __init ntp_init(void)
 {
 	ntp_clear();
+	ntp_init_cmos_sync();
 }
diff --git a/kernel/time/ntp_internal.h b/kernel/time/ntp_internal.h
index 908ecaa..23d1b74 100644
--- a/kernel/time/ntp_internal.h
+++ b/kernel/time/ntp_internal.h
@@ -12,4 +12,11 @@ extern int __do_adjtimex(struct __kernel_timex *txc,
 			 const struct timespec64 *ts,
 			 s32 *time_tai, struct audit_ntp_data *ad);
 extern void __hardpps(const struct timespec64 *phase_ts, const struct timespec64 *raw_ts);
+
+#if defined(CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC)
+extern void ntp_notify_cmos_timer(void);
+#else
+static inline void ntp_notify_cmos_timer(void) { }
+#endif
+
 #endif /* _LINUX_NTP_INTERNAL_H */

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [tip: timers/core] rtc: mc146818: Prevent reading garbage
  2020-12-06 21:46 ` [patch 1/8] rtc: mc146818: Prevent reading garbage Thomas Gleixner
@ 2020-12-11 10:07   ` tip-bot2 for Thomas Gleixner
  2021-01-25 18:40   ` [patch 1/8] rtc: mc146818: Prevent reading garbage - bug Mickaël Salaün
  1 sibling, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-11 10:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     05a0302c35481e9b47fb90ba40922b0a4cae40d8
Gitweb:        https://git.kernel.org/tip/05a0302c35481e9b47fb90ba40922b0a4cae40d8
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:46:14 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 11 Dec 2020 10:40:52 +01:00

rtc: mc146818: Prevent reading garbage

The MC146818 driver is prone to read garbage from the RTC. There are
several issues all related to the update cycle of the MC146818. The chip
increments seconds obviously once per second and indicates that by a bit in
a register. The bit goes high 244us before the actual update starts. During
the update the readout of the time values is undefined.

The code just checks whether the update in progress bit (UIP) is set before
reading the clock. If it's set it waits arbitrary 20ms before retrying,
which is ample because the maximum update time is ~2ms.

But this check does not guarantee that the UIP bit goes high and the actual
update happens during the readout. So the following can happen

 0.997 	       UIP = False
   -> Interrupt/NMI/preemption
 0.998	       UIP -> True
 0.999	       Readout	<- Undefined

To prevent this rework the code so it checks UIP before and after the
readout and if set after the readout try again.

But that's not enough to cover the following:

 0.997 	       UIP = False
 	       Readout seconds
   -> NMI (or vCPU scheduled out)
 0.998	       UIP -> True
 	       update completes
	       UIP -> False
 1.000	       Readout	minutes,....
 	       UIP check succeeds

That can make the readout wrong up to 59 seconds.

To prevent this, read the seconds value before the first UIP check,
validate it after checking UIP and after reading out the rest.

It's amazing that the original i386 code had this actually correct and
the generic implementation of the MC146818 driver got it wrong in 2002 and
it stayed that way until today.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220541.594826678@linutronix.de

---
 drivers/rtc/rtc-mc146818-lib.c | 64 ++++++++++++++++++++-------------
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index 2ecd875..98048bb 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -8,41 +8,41 @@
 #include <linux/acpi.h>
 #endif
 
-/*
- * Returns true if a clock update is in progress
- */
-static inline unsigned char mc146818_is_updating(void)
-{
-	unsigned char uip;
-	unsigned long flags;
-
-	spin_lock_irqsave(&rtc_lock, flags);
-	uip = (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP);
-	spin_unlock_irqrestore(&rtc_lock, flags);
-	return uip;
-}
-
 unsigned int mc146818_get_time(struct rtc_time *time)
 {
 	unsigned char ctrl;
 	unsigned long flags;
 	unsigned char century = 0;
+	bool retry;
 
 #ifdef CONFIG_MACH_DECSTATION
 	unsigned int real_year;
 #endif
 
+again:
+	spin_lock_irqsave(&rtc_lock, flags);
 	/*
-	 * read RTC once any update in progress is done. The update
-	 * can take just over 2ms. We wait 20ms. There is no need to
-	 * to poll-wait (up to 1s - eeccch) for the falling edge of RTC_UIP.
-	 * If you need to know *exactly* when a second has started, enable
-	 * periodic update complete interrupts, (via ioctl) and then
-	 * immediately read /dev/rtc which will block until you get the IRQ.
-	 * Once the read clears, read the RTC time (again via ioctl). Easy.
+	 * Check whether there is an update in progress during which the
+	 * readout is unspecified. The maximum update time is ~2ms. Poll
+	 * every msec for completion.
+	 *
+	 * Store the second value before checking UIP so a long lasting NMI
+	 * which happens to hit after the UIP check cannot make an update
+	 * cycle invisible.
 	 */
-	if (mc146818_is_updating())
-		mdelay(20);
+	time->tm_sec = CMOS_READ(RTC_SECONDS);
+
+	if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		mdelay(1);
+		goto again;
+	}
+
+	/* Revalidate the above readout */
+	if (time->tm_sec != CMOS_READ(RTC_SECONDS)) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		goto again;
+	}
 
 	/*
 	 * Only the values that we read from the RTC are set. We leave
@@ -50,8 +50,6 @@ unsigned int mc146818_get_time(struct rtc_time *time)
 	 * RTC has RTC_DAY_OF_WEEK, we ignore it, as it is only updated
 	 * by the RTC when initially set to a non-zero value.
 	 */
-	spin_lock_irqsave(&rtc_lock, flags);
-	time->tm_sec = CMOS_READ(RTC_SECONDS);
 	time->tm_min = CMOS_READ(RTC_MINUTES);
 	time->tm_hour = CMOS_READ(RTC_HOURS);
 	time->tm_mday = CMOS_READ(RTC_DAY_OF_MONTH);
@@ -66,8 +64,24 @@ unsigned int mc146818_get_time(struct rtc_time *time)
 		century = CMOS_READ(acpi_gbl_FADT.century);
 #endif
 	ctrl = CMOS_READ(RTC_CONTROL);
+	/*
+	 * Check for the UIP bit again. If it is set now then
+	 * the above values may contain garbage.
+	 */
+	retry = CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP;
+	/*
+	 * A NMI might have interrupted the above sequence so check whether
+	 * the seconds value has changed which indicates that the NMI took
+	 * longer than the UIP bit was set. Unlikely, but possible and
+	 * there is also virt...
+	 */
+	retry |= time->tm_sec != CMOS_READ(RTC_SECONDS);
+
 	spin_unlock_irqrestore(&rtc_lock, flags);
 
+	if (retry)
+		goto again;
+
 	if (!(ctrl & RTC_DM_BINARY) || RTC_ALWAYS_BCD)
 	{
 		time->tm_sec = bcd2bin(time->tm_sec);

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [tip: timers/core] rtc: mc146818: Reduce spinlock section in mc146818_set_time()
  2020-12-06 21:46 ` [patch 2/8] rtc: mc146818: Reduce spinlock section in mc146818_set_time() Thomas Gleixner
@ 2020-12-11 10:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-11 10:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     dcf257e92622ba0e25fdc4b6699683e7ae67e2a1
Gitweb:        https://git.kernel.org/tip/dcf257e92622ba0e25fdc4b6699683e7ae67e2a1
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:46:15 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 11 Dec 2020 10:40:52 +01:00

rtc: mc146818: Reduce spinlock section in mc146818_set_time()

No need to hold the lock and disable interrupts for doing math.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220541.709243630@linutronix.de

---
 drivers/rtc/rtc-mc146818-lib.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index 98048bb..972a5b9 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -135,7 +135,6 @@ int mc146818_set_time(struct rtc_time *time)
 	if (yrs > 255)	/* They are unsigned */
 		return -EINVAL;
 
-	spin_lock_irqsave(&rtc_lock, flags);
 #ifdef CONFIG_MACH_DECSTATION
 	real_yrs = yrs;
 	leap_yr = ((!((yrs + 1900) % 4) && ((yrs + 1900) % 100)) ||
@@ -164,10 +163,8 @@ int mc146818_set_time(struct rtc_time *time)
 	/* These limits and adjustments are independent of
 	 * whether the chip is in binary mode or not.
 	 */
-	if (yrs > 169) {
-		spin_unlock_irqrestore(&rtc_lock, flags);
+	if (yrs > 169)
 		return -EINVAL;
-	}
 
 	if (yrs >= 100)
 		yrs -= 100;
@@ -183,6 +180,7 @@ int mc146818_set_time(struct rtc_time *time)
 		century = bin2bcd(century);
 	}
 
+	spin_lock_irqsave(&rtc_lock, flags);
 	save_control = CMOS_READ(RTC_CONTROL);
 	CMOS_WRITE((save_control|RTC_SET), RTC_CONTROL);
 	save_freq_select = CMOS_READ(RTC_FREQ_SELECT);

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [patch 5/8] ntp: Make the RTC synchronization more reliable
  2020-12-06 21:46 ` [patch 5/8] ntp: Make the RTC synchronization more reliable Thomas Gleixner
                     ` (2 preceding siblings ...)
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
@ 2020-12-29 19:41   ` Geert Uytterhoeven
  2021-01-11 10:12     ` Thomas Gleixner
  3 siblings, 1 reply; 54+ messages in thread
From: Geert Uytterhoeven @ 2020-12-29 19:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra, Wolfram Sang, Linux-Renesas

Hi Thomas,

On Sun, Dec 6, 2020 at 11:10 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> Miroslav reported that the periodic RTC synchronization in the NTP code
> fails more often than not to hit the specified update window.
>
> The reason is that the code uses delayed_work to schedule the update which
> needs to be in thread context as the underlying RTC might be connected via
> a slow bus, e.g. I2C. In the update function it verifies whether the
> current time is correct vs. the requirements of the underlying RTC.
>
> But delayed_work is using the timer wheel for scheduling which is
> inaccurate by design. Depending on the distance to the expiry the wheel
> gets less granular to allow batching and to avoid the cascading of the
> original timer wheel. See 500462a9de65 ("timers: Switch to a non-cascading
> wheel") and the code for further details.
>
> The code already deals with this by splitting the 660 seconds period into a
> long 659 seconds timer and then retrying with a smaller delta.
>
> But looking at the actual granularities of the timer wheel (which depend on
> the HZ configuration) the 659 seconds timer ends up in an outer wheel level
> and is affected by a worst case granularity of:
>
> HZ          Granularity
> 1000        32s
>  250        16s
>  100        40s
>
> So the initial timer can be already off by max 12.5% which is not a big
> issue as the period of the sync is defined as ~11 minutes.
>
> The fine grained second attempt schedules to the desired update point with
> a timer expiring less than a second from now. Depending on the actual delta
> and the HZ setting even the second attempt can end up in outer wheel levels
> which have a large enough granularity to make the correctness check fail.
>
> As this is a fundamental property of the timer wheel there is no way to
> make this more accurate short of iterating in one jiffies steps towards the
> update point.
>
> Switch it to an hrtimer instead which schedules the actual update work. The
> hrtimer will expire precisely (max 1 jiffie delay when high resolution
> timers are not available). The actual scheduling delay of the work is the
> same as before.
>
> The update is triggered from do_adjtimex() which is a bit racy but not much
> more racy than it was before:
>
>      if (ntp_synced())
>         queue_delayed_work(system_power_efficient_wq, &sync_work, 0);
>
> which is racy when the work is currently executed and has not managed to
> reschedule itself.
>
> This becomes now:
>
>      if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
>         queue_work(system_power_efficient_wq, &sync_work, 0);
>
> which is racy when the hrtimer has expired and the work is currently
> executed and has not yet managed to rearm the hrtimer.
>
> Not a big problem as it just schedules work for nothing.
>
> The new implementation has a safe guard in place to catch the case where
> the hrtimer is queued on entry to the work function and avoids an extra
> update attempt of the RTC that way.
>
> Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Thanks for your patch, which is now commit c9e6189fb03123a7 ("ntp: Make
the RTC synchronization more reliable").

Since this commit, the I2C RTC on the R-Car M2-W Koelsch development
board is accessed every two seconds.  Sticking a WARN() in the I2C
activation path gives e.g.

    Modules linked in:
    CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted
5.10.0-rc1-koelsch-00038-gc9e6189fb031-dirty #1021
    Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
    Workqueue: events rtc_timer_do_work
    [<c020ea7c>] (unwind_backtrace) from [<c020a4dc>] (show_stack+0x10/0x14)
    [<c020a4dc>] (show_stack) from [<c090e314>] (dump_stack+0xa0/0xc8)
    [<c090e314>] (dump_stack) from [<c02228a8>] (__warn+0xbc/0xfc)
    [<c02228a8>] (__warn) from [<c090b434>] (warn_slowpath_fmt+0x78/0xb0)
    [<c090b434>] (warn_slowpath_fmt) from [<c058f184>]
(cpg_mstp_clock_endisable+0x94/0x1f4)
    [<c058f184>] (cpg_mstp_clock_endisable) from [<c05881c8>]
(clk_core_enable+0x194/0x29c)
    [<c05881c8>] (clk_core_enable) from [<c05882e8>]
(clk_core_enable_lock+0x18/0x2c)
    [<c05882e8>] (clk_core_enable_lock) from [<c063b5c4>]
(pm_clk_resume+0x68/0xa0)
    [<c063b5c4>] (pm_clk_resume) from [<c063a738>]
(genpd_runtime_resume+0xc8/0x174)
    [<c063a738>] (genpd_runtime_resume) from [<c063236c>]
(__rpm_callback+0x30/0xe0)
    [<c063236c>] (__rpm_callback) from [<c063248c>] (rpm_callback+0x70/0x80)
    [<c063248c>] (rpm_callback) from [<c063216c>] (rpm_resume+0x438/0x4dc)
    [<c063216c>] (rpm_resume) from [<c0632274>] (__pm_runtime_resume+0x64/0x80)
    [<c0632274>] (__pm_runtime_resume) from [<c0731ab4>]
(sh_mobile_i2c_xfer+0x38/0x554)
    [<c0731ab4>] (sh_mobile_i2c_xfer) from [<c072a6c4>]
(__i2c_transfer+0x4e4/0x680)
    [<c072a6c4>] (__i2c_transfer) from [<c072a8b8>] (i2c_transfer+0x58/0xf8)
    [<c072a8b8>] (i2c_transfer) from [<c0645688>] (regmap_i2c_read+0x58/0x94)
    [<c0645688>] (regmap_i2c_read) from [<c0641490>]
(_regmap_raw_read+0x19c/0x2f4)
    [<c0641490>] (_regmap_raw_read) from [<c064162c>]
(_regmap_bus_read+0x44/0x68)
    [<c064162c>] (_regmap_bus_read) from [<c0640308>] (_regmap_read+0x84/0x1a4)
    [<c0640308>] (_regmap_read) from [<c0640984>]
(_regmap_update_bits+0xa8/0xf4)
    [<c0640984>] (_regmap_update_bits) from [<c0641b5c>]
(regmap_update_bits_base+0x4c/0x70)
    [<c0641b5c>] (regmap_update_bits_base) from [<c0728244>]
(regmap_update_bits+0x18/0x20)
    [<c0728244>] (regmap_update_bits) from [<c072491c>]
(rtc_alarm_disable+0x28/0x38)
    [<c072491c>] (rtc_alarm_disable) from [<c0726408>]
(rtc_timer_do_work+0x88/0x294)
    [<c0726408>] (rtc_timer_do_work) from [<c023e27c>]
(process_one_work+0x308/0x524)
    [<c023e27c>] (process_one_work) from [<c023ec80>]
(worker_thread+0x22c/0x2ec)
    [<c023ec80>] (worker_thread) from [<c024417c>] (kthread+0x128/0x138)
    [<c024417c>] (kthread) from [<c020010c>] (ret_from_fork+0x14/0x28)
    Exception stack(0xc220ffb0 to 0xc220fff8)
    ffa0:                                     00000000 00000000
00000000 00000000
    ffc0: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
    ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
    irq event stamp: 74448
    hardirqs last  enabled at (74447): [<c02a5f90>]
seqcount_lockdep_reader_access.constprop.0+0x58/0x68
    hardirqs last disabled at (74448): [<c0919a68>]
_raw_spin_lock_irqsave+0x20/0x70
    softirqs last  enabled at (74360): [<c02012b0>] __do_softirq+0x1b8/0x468
    softirqs last disabled at (74353): [<c0228f24>] __irq_exit_rcu+0x110/0x17c
    ---[ end trace 8c279400e5758606 ]---

Before, the synchronization happened only every 696s.

Worse, this synchronization may also happen while the system is partly
suspended, sometimes triggering a WARNING during resume from s2ram:

     Disabling non-boot CPUs ...
     Enabling non-boot CPUs ...
    +------------[ cut here ]------------
    +WARNING: CPU: 0 PID: 21 at drivers/i2c/i2c-core.h:54
__i2c_transfer+0x464/0x4a0
    +i2c i2c-6: Transfer while suspended
    +CPU: 0 PID: 21 Comm: kworker/0:1 Not tainted
5.11.0-rc1-shmobile-00107-gcf9760aa181f #829
    +Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
    +Workqueue: events_power_efficient sync_hw_clock
    +[<c010dba4>] (unwind_backtrace) from [<c0109b28>] (show_stack+0x10/0x14)
    +[<c0109b28>] (show_stack) from [<c07a120c>] (dump_stack+0x8c/0xa8)
    +[<c07a120c>] (dump_stack) from [<c011c538>] (__warn+0xc0/0xec)
    +[<c011c538>] (__warn) from [<c079a7bc>] (warn_slowpath_fmt+0x78/0xb0)
    +[<c079a7bc>] (warn_slowpath_fmt) from [<c0566574>]
(__i2c_transfer+0x464/0x4a0)
    +[<c0566574>] (__i2c_transfer) from [<c0566608>] (i2c_transfer+0x58/0xf8)
    +[<c0566608>] (i2c_transfer) from [<c0489f80>] (regmap_i2c_read+0x58/0x94)
    +[<c0489f80>] (regmap_i2c_read) from [<c0485e00>]
(_regmap_raw_read+0x108/0x1bc)
    +[<c0485e00>] (_regmap_raw_read) from [<c0485ef8>]
(_regmap_bus_read+0x44/0x68)
    +[<c0485ef8>] (_regmap_bus_read) from [<c0484018>] (_regmap_read+0x84/0x100)
    +[<c0484018>] (_regmap_read) from [<c0485444>]
(_regmap_update_bits+0xa8/0xf4)
    +[<c0485444>] (_regmap_update_bits) from [<c0485574>]
    (_regmap_select_page+0xe4/0x100)
    +[<c0485574>] (_regmap_select_page) from [<c0485664>]
    (_regmap_raw_write_impl+0xd4/0x608)
    +[<c0485664>] (_regmap_raw_write_impl) from [<c04863f4>]
    (_regmap_raw_write+0xd8/0x114)
    +[<c04863f4>] (_regmap_raw_write) from [<c0486488>]
(regmap_raw_write+0x58/0x7c)
    +[<c0486488>] (regmap_raw_write) from [<c04866cc>]
    (regmap_bulk_write+0x118/0x13c)
    +[<c04866cc>] (regmap_bulk_write) from [<c05605b4>]
    (da9063_rtc_set_time+0x44/0x8c)
    +[<c05605b4>] (da9063_rtc_set_time) from [<c055e428>]
(rtc_set_time+0x8c/0x15c)
    +[<c055e428>] (rtc_set_time) from [<c01872cc>] (sync_hw_clock+0x12c/0x210)
    +[<c01872cc>] (sync_hw_clock) from [<c01337d0>]
(process_one_work+0x1bc/0x2ac)
    +[<c01337d0>] (process_one_work) from [<c0133b18>]
(worker_thread+0x22c/0x2d0)
    +[<c0133b18>] (worker_thread) from [<c01388a8>] (kthread+0x100/0x10c)
    +[<c01388a8>] (kthread) from [<c0100150>] (ret_from_fork+0x14/0x24)
    +Exception stack(0xc1195fb0 to 0xc1195ff8)
    +5fa0:                                     00000000 00000000
00000000 00000000
    +5fc0: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
    +5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
    +---[ end trace 5d3a7a10ee0cec3d ]---
    +da9063-rtc da9063-rtc: Failed to set RTC time data: -108
    +da9063-rtc da9063-rtc: Failed to read RTC time data: -108
     CPU1 is up

The latter is probably a pre-existing issue, just more likely to trigger
now the sync interval is 2s.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 5/8] ntp: Make the RTC synchronization more reliable
  2020-12-29 19:41   ` [patch 5/8] " Geert Uytterhoeven
@ 2021-01-11 10:12     ` Thomas Gleixner
  2021-01-11 10:40       ` Geert Uytterhoeven
  0 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2021-01-11 10:12 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: LKML, Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra, Wolfram Sang, Linux-Renesas

On Tue, Dec 29 2020 at 20:41, Geert Uytterhoeven wrote:
> Hi Thomas,
>> Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> Thanks for your patch, which is now commit c9e6189fb03123a7 ("ntp: Make
> the RTC synchronization more reliable").
>
> Since this commit, the I2C RTC on the R-Car M2-W Koelsch development
> board is accessed every two seconds.  Sticking a WARN() in the I2C
> activation path gives e.g.

Huch? Every two seconds? The timer is armed with 11 * 60 * NSEC_PER_SEC,
which is 11 minutes. Confused....

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 5/8] ntp: Make the RTC synchronization more reliable
  2021-01-11 10:12     ` Thomas Gleixner
@ 2021-01-11 10:40       ` Geert Uytterhoeven
  0 siblings, 0 replies; 54+ messages in thread
From: Geert Uytterhoeven @ 2021-01-11 10:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra, Wolfram Sang, Linux-Renesas

Hi Thomas,

On Mon, Jan 11, 2021 at 11:12 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> On Tue, Dec 29 2020 at 20:41, Geert Uytterhoeven wrote:
> >> Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
> >> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> >
> > Thanks for your patch, which is now commit c9e6189fb03123a7 ("ntp: Make
> > the RTC synchronization more reliable").
> >
> > Since this commit, the I2C RTC on the R-Car M2-W Koelsch development
> > board is accessed every two seconds.  Sticking a WARN() in the I2C
> > activation path gives e.g.
>
> Huch? Every two seconds? The timer is armed with 11 * 60 * NSEC_PER_SEC,
> which is 11 minutes. Confused....

Thanks for the hint:

    #define SYNC_PERIOD_NS (11UL * 60 * NSEC_PER_SEC)

is truncated on 32-bit platforms.

Patch sent.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 1/8] rtc: mc146818: Prevent reading garbage - bug
  2020-12-06 21:46 ` [patch 1/8] rtc: mc146818: Prevent reading garbage Thomas Gleixner
  2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
@ 2021-01-25 18:40   ` Mickaël Salaün
  2021-01-26 13:26     ` Thomas Gleixner
  1 sibling, 1 reply; 54+ messages in thread
From: Mickaël Salaün @ 2021-01-25 18:40 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

Hi,

After some bisecting, I found that commit 05a0302c3548 ("rtc: mc146818:
Prevent reading garbage", this patch, introduced since v5.11-rc1) makes
my VM hang at boot. Before this commit, I got this (and didn't notice)
at every boot:
rtc_cmos rtc_cmos: registered as rtc0
rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
rtc_cmos rtc_cmos: alarms up to one day, 114 bytes nvram

I notice that this patch creates infinite loops, which my VM falls into
(cf. below).

I didn't succeed to properly fix this without a revert. I tried to set a
maximum number of jumps, but I got pvqspinlock warnings.

Regards,
 Mickaël


On 06/12/2020 22:46, Thomas Gleixner wrote:
> The MC146818 driver is prone to read garbage from the RTC. There are
> several issues all related to the update cycle of the MC146818. The chip
> increments seconds obviously once per second and indicates that by a bit in
> a register. The bit goes high 244us before the actual update starts. During
> the update the readout of the time values is undefined.
> 
> The code just checks whether the update in progress bit (UIP) is set before
> reading the clock. If it's set it waits arbitrary 20ms before retrying,
> which is ample because the maximum update time is ~2ms.
> 
> But this check does not guarantee that the UIP bit goes high and the actual
> update happens during the readout. So the following can happen
> 
>  0.997 	       UIP = False
>    -> Interrupt/NMI/preemption
>  0.998	       UIP -> True
>  0.999	       Readout	<- Undefined
> 
> To prevent this rework the code so it checks UIP before and after the
> readout and if set after the readout try again.
> 
> But that's not enough to cover the following:
> 
>  0.997 	       UIP = False
>  	       Readout seconds
>    -> NMI (or vCPU scheduled out)
>  0.998	       UIP -> True
>  	       update completes
> 	       UIP -> False
>  1.000	       Readout	minutes,....
>  	       UIP check succeeds
> 
> That can make the readout wrong up to 59 seconds.
> 
> To prevent this, read the seconds value before the first UIP check,
> validate it after checking UIP and after reading out the rest.
> 
> It's amazing that the original i386 code had this actually correct and
> the generic implementation of the MC146818 driver got it wrong in 2002 and
> it stayed that way until today.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/rtc/rtc-mc146818-lib.c |   64 ++++++++++++++++++++++++-----------------
>  1 file changed, 39 insertions(+), 25 deletions(-)
> 
> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -8,41 +8,41 @@
>  #include <linux/acpi.h>
>  #endif
>  
> -/*
> - * Returns true if a clock update is in progress
> - */
> -static inline unsigned char mc146818_is_updating(void)
> -{
> -	unsigned char uip;
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&rtc_lock, flags);
> -	uip = (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP);
> -	spin_unlock_irqrestore(&rtc_lock, flags);
> -	return uip;
> -}
> -
>  unsigned int mc146818_get_time(struct rtc_time *time)
>  {
>  	unsigned char ctrl;
>  	unsigned long flags;
>  	unsigned char century = 0;
> +	bool retry;
>  
>  #ifdef CONFIG_MACH_DECSTATION
>  	unsigned int real_year;
>  #endif
>  
> +again:
> +	spin_lock_irqsave(&rtc_lock, flags);
>  	/*
> -	 * read RTC once any update in progress is done. The update
> -	 * can take just over 2ms. We wait 20ms. There is no need to
> -	 * to poll-wait (up to 1s - eeccch) for the falling edge of RTC_UIP.
> -	 * If you need to know *exactly* when a second has started, enable
> -	 * periodic update complete interrupts, (via ioctl) and then
> -	 * immediately read /dev/rtc which will block until you get the IRQ.
> -	 * Once the read clears, read the RTC time (again via ioctl). Easy.
> +	 * Check whether there is an update in progress during which the
> +	 * readout is unspecified. The maximum update time is ~2ms. Poll
> +	 * every msec for completion.
> +	 *
> +	 * Store the second value before checking UIP so a long lasting NMI
> +	 * which happens to hit after the UIP check cannot make an update
> +	 * cycle invisible.
>  	 */
> -	if (mc146818_is_updating())
> -		mdelay(20);
> +	time->tm_sec = CMOS_READ(RTC_SECONDS);
> +
> +	if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
> +		spin_unlock_irqrestore(&rtc_lock, flags);
> +		mdelay(1);

My VM loops here.
time->tm_sec is always 255.

> +		goto again;
> +	}
> +
> +	/* Revalidate the above readout */
> +	if (time->tm_sec != CMOS_READ(RTC_SECONDS)) {
> +		spin_unlock_irqrestore(&rtc_lock, flags);
> +		goto again;
> +	}
>  
>  	/*
>  	 * Only the values that we read from the RTC are set. We leave
> @@ -50,8 +50,6 @@ unsigned int mc146818_get_time(struct rt
>  	 * RTC has RTC_DAY_OF_WEEK, we ignore it, as it is only updated
>  	 * by the RTC when initially set to a non-zero value.
>  	 */
> -	spin_lock_irqsave(&rtc_lock, flags);
> -	time->tm_sec = CMOS_READ(RTC_SECONDS);
>  	time->tm_min = CMOS_READ(RTC_MINUTES);
>  	time->tm_hour = CMOS_READ(RTC_HOURS);
>  	time->tm_mday = CMOS_READ(RTC_DAY_OF_MONTH);
> @@ -66,8 +64,24 @@ unsigned int mc146818_get_time(struct rt
>  		century = CMOS_READ(acpi_gbl_FADT.century);
>  #endif
>  	ctrl = CMOS_READ(RTC_CONTROL);
> +	/*
> +	 * Check for the UIP bit again. If it is set now then
> +	 * the above values may contain garbage.
> +	 */
> +	retry = CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP;
> +	/*
> +	 * A NMI might have interrupted the above sequence so check whether
> +	 * the seconds value has changed which indicates that the NMI took
> +	 * longer than the UIP bit was set. Unlikely, but possible and
> +	 * there is also virt...
> +	 */
> +	retry |= time->tm_sec != CMOS_READ(RTC_SECONDS);
> +
>  	spin_unlock_irqrestore(&rtc_lock, flags);
>  
> +	if (retry)
> +		goto again;
> +
>  	if (!(ctrl & RTC_DM_BINARY) || RTC_ALWAYS_BCD)
>  	{
>  		time->tm_sec = bcd2bin(time->tm_sec);
> 
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 1/8] rtc: mc146818: Prevent reading garbage - bug
  2021-01-25 18:40   ` [patch 1/8] rtc: mc146818: Prevent reading garbage - bug Mickaël Salaün
@ 2021-01-26 13:26     ` Thomas Gleixner
  2021-01-26 14:17       ` Mickaël Salaün
  0 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2021-01-26 13:26 UTC (permalink / raw)
  To: Mickaël Salaün, LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

On Mon, Jan 25 2021 at 19:40, Mickaël Salaün wrote:
> After some bisecting, I found that commit 05a0302c3548 ("rtc: mc146818:
> Prevent reading garbage", this patch, introduced since v5.11-rc1) makes
> my VM hang at boot. Before this commit, I got this (and didn't notice)
> at every boot:
> rtc_cmos rtc_cmos: registered as rtc0
> rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
> rtc_cmos rtc_cmos: alarms up to one day, 114 bytes nvram
>
> I notice that this patch creates infinite loops, which my VM falls into
> (cf. below).
>> +	time->tm_sec = CMOS_READ(RTC_SECONDS);
>> +
>> +	if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
>> +		spin_unlock_irqrestore(&rtc_lock, flags);
>> +		mdelay(1);
>
> My VM loops here.
> time->tm_sec is always 255.

That means there is no RTC and therefore the CMOS_READ($REG) returns
0xFF which makes the loop stuck because RTC_UIP is always set.

Yet another proof that VIRT creates more problems than it solves.

Fix below.

Thanks,

        tglx
---
Subject: rtc: mc146818: Detect and handle broken RTCs
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 26 Jan 2021 11:38:40 +0100

The recent fix for handling the UIP bit unearthed another issue in the RTC
code. If the RTC is advertised but the readout is straight 0xFF because
it's not available, the old code just proceeded with crappy values, but the
new code hangs because it waits for the UIP bit to become low.

Add a sanity check in the RTC CMOS probe function which reads the RTC_VALID
register (Register D) which should have bit 0-6 cleared. If that's not the
case then fail to register the CMOS.

Add the same check to mc146818_get_time(), warn once when the condition
is true and invalidate the rtc_time data.

Reported-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/rtc/rtc-cmos.c         |    8 ++++++++
 drivers/rtc/rtc-mc146818-lib.c |    7 +++++++
 2 files changed, 15 insertions(+)

--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -805,6 +805,14 @@ cmos_do_probe(struct device *dev, struct
 
 	spin_lock_irq(&rtc_lock);
 
+	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
+	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
+		spin_unlock_irq(&rtc_lock);
+		dev_warn(dev, "not accessible\n");
+		retval = -ENXIO;
+		goto cleanup1;
+	}
+
 	if (!(flags & CMOS_RTC_FLAGS_NOFREQ)) {
 		/* force periodic irq to CMOS reset default of 1024Hz;
 		 *
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -21,6 +21,13 @@ unsigned int mc146818_get_time(struct rt
 
 again:
 	spin_lock_irqsave(&rtc_lock, flags);
+	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
+	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		memset(time, 0xff, sizeof(time));
+		return 0;
+	}
+
 	/*
 	 * Check whether there is an update in progress during which the
 	 * readout is unspecified. The maximum update time is ~2ms. Poll

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 1/8] rtc: mc146818: Prevent reading garbage - bug
  2021-01-26 13:26     ` Thomas Gleixner
@ 2021-01-26 14:17       ` Mickaël Salaün
  2021-01-26 15:35         ` Thomas Gleixner
  0 siblings, 1 reply; 54+ messages in thread
From: Mickaël Salaün @ 2021-01-26 14:17 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

Thanks for the fix! It boots now with a new message:
rtc_cmos rtc_cmos: not accessible

I tested with rtc=on and rtc=off (which didn't make a difference before
this fix) on a microvm:
https://github.com/qemu/qemu/blob/master/docs/system/i386/microvm.rst

There is one issue with the memset though:

On 26/01/2021 14:26, Thomas Gleixner wrote:
> On Mon, Jan 25 2021 at 19:40, Mickaël Salaün wrote:
>> After some bisecting, I found that commit 05a0302c3548 ("rtc: mc146818:
>> Prevent reading garbage", this patch, introduced since v5.11-rc1) makes
>> my VM hang at boot. Before this commit, I got this (and didn't notice)
>> at every boot:
>> rtc_cmos rtc_cmos: registered as rtc0
>> rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
>> rtc_cmos rtc_cmos: alarms up to one day, 114 bytes nvram
>>
>> I notice that this patch creates infinite loops, which my VM falls into
>> (cf. below).
>>> +	time->tm_sec = CMOS_READ(RTC_SECONDS);
>>> +
>>> +	if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
>>> +		spin_unlock_irqrestore(&rtc_lock, flags);
>>> +		mdelay(1);
>>
>> My VM loops here.
>> time->tm_sec is always 255.
> 
> That means there is no RTC and therefore the CMOS_READ($REG) returns
> 0xFF which makes the loop stuck because RTC_UIP is always set.
> 
> Yet another proof that VIRT creates more problems than it solves.
> 
> Fix below.
> 
> Thanks,
> 
>         tglx
> ---
> Subject: rtc: mc146818: Detect and handle broken RTCs
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Tue, 26 Jan 2021 11:38:40 +0100
> 
> The recent fix for handling the UIP bit unearthed another issue in the RTC
> code. If the RTC is advertised but the readout is straight 0xFF because
> it's not available, the old code just proceeded with crappy values, but the
> new code hangs because it waits for the UIP bit to become low.
> 
> Add a sanity check in the RTC CMOS probe function which reads the RTC_VALID
> register (Register D) which should have bit 0-6 cleared. If that's not the
> case then fail to register the CMOS.
> 
> Add the same check to mc146818_get_time(), warn once when the condition
> is true and invalidate the rtc_time data.
> 
> Reported-by: Mickaël Salaün <mic@digikod.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/rtc/rtc-cmos.c         |    8 ++++++++
>  drivers/rtc/rtc-mc146818-lib.c |    7 +++++++
>  2 files changed, 15 insertions(+)
> 
> --- a/drivers/rtc/rtc-cmos.c
> +++ b/drivers/rtc/rtc-cmos.c
> @@ -805,6 +805,14 @@ cmos_do_probe(struct device *dev, struct
>  
>  	spin_lock_irq(&rtc_lock);
>  
> +	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> +	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
> +		spin_unlock_irq(&rtc_lock);
> +		dev_warn(dev, "not accessible\n");
> +		retval = -ENXIO;
> +		goto cleanup1;
> +	}
> +
>  	if (!(flags & CMOS_RTC_FLAGS_NOFREQ)) {
>  		/* force periodic irq to CMOS reset default of 1024Hz;
>  		 *
> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -21,6 +21,13 @@ unsigned int mc146818_get_time(struct rt
>  
>  again:
>  	spin_lock_irqsave(&rtc_lock, flags);
> +	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> +	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
> +		spin_unlock_irqrestore(&rtc_lock, flags);
> +		memset(time, 0xff, sizeof(time));

This should be: sizeof(*time)

> +		return 0;
> +	}
> +
>  	/*
>  	 * Check whether there is an update in progress during which the
>  	 * readout is unspecified. The maximum update time is ~2ms. Poll
> 

Tested-by: Mickaël Salaün <mic@linux.microsoft.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [patch 1/8] rtc: mc146818: Prevent reading garbage - bug
  2021-01-26 14:17       ` Mickaël Salaün
@ 2021-01-26 15:35         ` Thomas Gleixner
  2021-01-26 17:02           ` [PATCH V2] rtc: mc146818: Detect and handle broken RTCs Thomas Gleixner
  0 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2021-01-26 15:35 UTC (permalink / raw)
  To: Mickaël Salaün, LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

On Tue, Jan 26 2021 at 15:17, Mickaël Salaün wrote:
> Thanks for the fix! It boots now with a new message:
> rtc_cmos rtc_cmos: not accessible
>>  	spin_lock_irqsave(&rtc_lock, flags);
>> +	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
>> +	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
>> +		spin_unlock_irqrestore(&rtc_lock, flags);
>> +		memset(time, 0xff, sizeof(time));
>
> This should be: sizeof(*time)

Of course ....

>> +		return 0;
>> +	}
>> +
>>  	/*
>>  	 * Check whether there is an update in progress during which the
>>  	 * readout is unspecified. The maximum update time is ~2ms. Poll
>> 
>
> Tested-by: Mickaël Salaün <mic@linux.microsoft.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH V2] rtc: mc146818: Detect and handle broken RTCs
  2021-01-26 15:35         ` Thomas Gleixner
@ 2021-01-26 17:02           ` Thomas Gleixner
  2021-01-26 18:00             ` Alexandre Belloni
                               ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Thomas Gleixner @ 2021-01-26 17:02 UTC (permalink / raw)
  To: Mickaël Salaün, LKML
  Cc: Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

The recent fix for handling the UIP bit unearthed another issue in the RTC
code. If the RTC is advertised but the readout is straight 0xFF because
it's not available, the old code just proceeded with crappy values, but the
new code hangs because it waits for the UIP bit to become low.

Add a sanity check in the RTC CMOS probe function which reads the RTC_VALID
register (Register D) which should have bit 0-6 cleared. If that's not the
case then fail to register the CMOS.

Add the same check to mc146818_get_time(), warn once when the condition
is true and invalidate the rtc_time data.

Reported-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mickaël Salaün <mic@linux.microsoft.com>
---
V2: Fixed the sizeof() as spotted by Mickaël
---
 drivers/rtc/rtc-cmos.c         |    8 ++++++++
 drivers/rtc/rtc-mc146818-lib.c |    7 +++++++
 2 files changed, 15 insertions(+)

--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -805,6 +805,14 @@ cmos_do_probe(struct device *dev, struct
 
 	spin_lock_irq(&rtc_lock);
 
+	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
+	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
+		spin_unlock_irq(&rtc_lock);
+		dev_warn(dev, "not accessible\n");
+		retval = -ENXIO;
+		goto cleanup1;
+	}
+
 	if (!(flags & CMOS_RTC_FLAGS_NOFREQ)) {
 		/* force periodic irq to CMOS reset default of 1024Hz;
 		 *
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -21,6 +21,13 @@ unsigned int mc146818_get_time(struct rt
 
 again:
 	spin_lock_irqsave(&rtc_lock, flags);
+	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
+	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		memset(time, 0xff, sizeof(*time));
+		return 0;
+	}
+
 	/*
 	 * Check whether there is an update in progress during which the
 	 * readout is unspecified. The maximum update time is ~2ms. Poll

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Detect and handle broken RTCs
  2021-01-26 17:02           ` [PATCH V2] rtc: mc146818: Detect and handle broken RTCs Thomas Gleixner
@ 2021-01-26 18:00             ` Alexandre Belloni
  2021-01-27  8:41             ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
  2021-01-31 10:59             ` [PATCH V2] " Dirk Gouders
  2 siblings, 0 replies; 54+ messages in thread
From: Alexandre Belloni @ 2021-01-26 18:00 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mickaël Salaün, LKML, Jason Gunthorpe,
	Miroslav Lichvar, John Stultz, Prarit Bhargava, Alessandro Zummo,
	linux-rtc, Peter Zijlstra

On 26/01/2021 18:02:11+0100, Thomas Gleixner wrote:
> The recent fix for handling the UIP bit unearthed another issue in the RTC
> code. If the RTC is advertised but the readout is straight 0xFF because
> it's not available, the old code just proceeded with crappy values, but the
> new code hangs because it waits for the UIP bit to become low.
> 
> Add a sanity check in the RTC CMOS probe function which reads the RTC_VALID
> register (Register D) which should have bit 0-6 cleared. If that's not the
> case then fail to register the CMOS.
> 
> Add the same check to mc146818_get_time(), warn once when the condition
> is true and invalidate the rtc_time data.
> 
> Reported-by: Mickaël Salaün <mic@digikod.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Mickaël Salaün <mic@linux.microsoft.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

> ---
> V2: Fixed the sizeof() as spotted by Mickaël
> ---
>  drivers/rtc/rtc-cmos.c         |    8 ++++++++
>  drivers/rtc/rtc-mc146818-lib.c |    7 +++++++
>  2 files changed, 15 insertions(+)
> 
> --- a/drivers/rtc/rtc-cmos.c
> +++ b/drivers/rtc/rtc-cmos.c
> @@ -805,6 +805,14 @@ cmos_do_probe(struct device *dev, struct
>  
>  	spin_lock_irq(&rtc_lock);
>  
> +	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> +	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
> +		spin_unlock_irq(&rtc_lock);
> +		dev_warn(dev, "not accessible\n");
> +		retval = -ENXIO;
> +		goto cleanup1;
> +	}
> +
>  	if (!(flags & CMOS_RTC_FLAGS_NOFREQ)) {
>  		/* force periodic irq to CMOS reset default of 1024Hz;
>  		 *
> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -21,6 +21,13 @@ unsigned int mc146818_get_time(struct rt
>  
>  again:
>  	spin_lock_irqsave(&rtc_lock, flags);
> +	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> +	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
> +		spin_unlock_irqrestore(&rtc_lock, flags);
> +		memset(time, 0xff, sizeof(*time));
> +		return 0;
> +	}
> +
>  	/*
>  	 * Check whether there is an update in progress during which the
>  	 * readout is unspecified. The maximum update time is ~2ms. Poll

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [tip: timers/urgent] rtc: mc146818: Detect and handle broken RTCs
  2021-01-26 17:02           ` [PATCH V2] rtc: mc146818: Detect and handle broken RTCs Thomas Gleixner
  2021-01-26 18:00             ` Alexandre Belloni
@ 2021-01-27  8:41             ` tip-bot2 for Thomas Gleixner
  2021-01-31 10:59             ` [PATCH V2] " Dirk Gouders
  2 siblings, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-01-27  8:41 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mic, Thomas Gleixner, mic, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     211e5db19d15a721b2953ea54b8f26c2963720eb
Gitweb:        https://git.kernel.org/tip/211e5db19d15a721b2953ea54b8f26c2963720eb
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 26 Jan 2021 18:02:11 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 27 Jan 2021 09:36:22 +01:00

rtc: mc146818: Detect and handle broken RTCs

The recent fix for handling the UIP bit unearthed another issue in the RTC
code. If the RTC is advertised but the readout is straight 0xFF because
it's not available, the old code just proceeded with crappy values, but the
new code hangs because it waits for the UIP bit to become low.

Add a sanity check in the RTC CMOS probe function which reads the RTC_VALID
register (Register D) which should have bit 0-6 cleared. If that's not the
case then fail to register the CMOS.

Add the same check to mc146818_get_time(), warn once when the condition
is true and invalidate the rtc_time data.

Reported-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mickaël Salaün <mic@linux.microsoft.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/87tur3fx7w.fsf@nanos.tec.linutronix.de

---
 drivers/rtc/rtc-cmos.c         | 8 ++++++++
 drivers/rtc/rtc-mc146818-lib.c | 7 +++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 51e80bc..68a9ac6 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -805,6 +805,14 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq)
 
 	spin_lock_irq(&rtc_lock);
 
+	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
+	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
+		spin_unlock_irq(&rtc_lock);
+		dev_warn(dev, "not accessible\n");
+		retval = -ENXIO;
+		goto cleanup1;
+	}
+
 	if (!(flags & CMOS_RTC_FLAGS_NOFREQ)) {
 		/* force periodic irq to CMOS reset default of 1024Hz;
 		 *
diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index 972a5b9..f83c138 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -21,6 +21,13 @@ unsigned int mc146818_get_time(struct rtc_time *time)
 
 again:
 	spin_lock_irqsave(&rtc_lock, flags);
+	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
+	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		memset(time, 0xff, sizeof(*time));
+		return 0;
+	}
+
 	/*
 	 * Check whether there is an update in progress during which the
 	 * readout is unspecified. The maximum update time is ~2ms. Poll

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Detect and handle broken RTCs
  2021-01-26 17:02           ` [PATCH V2] rtc: mc146818: Detect and handle broken RTCs Thomas Gleixner
  2021-01-26 18:00             ` Alexandre Belloni
  2021-01-27  8:41             ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
@ 2021-01-31 10:59             ` Dirk Gouders
  2021-02-01 13:53               ` Serge Belyshev
  2 siblings, 1 reply; 54+ messages in thread
From: Dirk Gouders @ 2021-01-31 10:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mickaël Salaün, LKML, Alexandre Belloni,
	Jason Gunthorpe, Miroslav Lichvar, John Stultz, Prarit Bhargava,
	Alessandro Zummo, linux-rtc, Peter Zijlstra

Thomas Gleixner <tglx@linutronix.de> writes:

> The recent fix for handling the UIP bit unearthed another issue in the RTC
> code. If the RTC is advertised but the readout is straight 0xFF because
> it's not available, the old code just proceeded with crappy values, but the
> new code hangs because it waits for the UIP bit to become low.
>
> Add a sanity check in the RTC CMOS probe function which reads the RTC_VALID
> register (Register D) which should have bit 0-6 cleared. If that's not the
> case then fail to register the CMOS.
>
> Add the same check to mc146818_get_time(), warn once when the condition
> is true and invalidate the rtc_time data.

In case it is helpful: on my hardware this patch triggers a warning
(attached below).

Without it the rtc messages look like this:

[    2.783386] rtc_cmos 00:01: RTC can wake from S4
[    2.784302] rtc_cmos 00:01: registered as rtc0
[    2.785036] rtc_cmos 00:01: setting system clock to 2021-01-31T10:13:40 UTC (1612088020)
[    2.785713] rtc_cmos 00:01: alarms up to one month, y3k, 114 bytes nvram, hpet irqs

Dirk

[    7.258410] ------------[ cut here ]------------
[    7.258414] WARNING: CPU: 2 PID: 0 at drivers/rtc/rtc-mc146818-lib.c:25 mc146818_get_time+0x2b/0x1e5
[    7.258420] Modules linked in: iwlmvm(+) mac80211 iwlwifi sdhci_pci amdgpu(+) drm_ttm_helper cfg80211 ttm cqhci gpu_sched sdhci ccp thinkpad_acpi(+) rng_core nvram tpm_tis(+) tpm_tis_core wmi tpm pinctrl_amd
[    7.258432] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G        W         5.11.0-rc5-next-20210129-x86_64 #180
[    7.258434] Hardware name: LENOVO 20U50008GE/20U50008GE, BIOS R19ET26W (1.10 ) 06/22/2020
[    7.258435] RIP: 0010:mc146818_get_time+0x2b/0x1e5
[    7.258437] Code: 56 41 55 45 31 ed 41 54 55 53 48 89 fb 48 c7 c7 bc d9 eb 82 e8 26 d8 36 00 bf 0d 00 00 00 48 89 c5 e8 6d d1 8f ff a8 7f 74 24 <0f> 0b 48 c7 c7 bc d9 eb 82 48 89 ee e8 bc d6 36 00 b0 ff b9 24 00
[    7.258438] RSP: 0018:ffffc9000022cef0 EFLAGS: 00010002
[    7.258440] RAX: 0000000000000031 RBX: ffffc9000022cf24 RCX: 0000000000000000
[    7.258441] RDX: 0000000000000001 RSI: ffff888105607000 RDI: 000000000000000d
[    7.258441] RBP: 0000000000000046 R08: ffffc9000022cf24 R09: 0000000000000000
[    7.258442] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888105607000
[    7.258443] R13: 0000000000000000 R14: ffffc9000022cfa4 R15: 0000000000000000
[    7.258444] FS:  0000000000000000(0000) GS:ffff88840ec80000(0000) knlGS:0000000000000000
[    7.258445] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    7.258446] CR2: 00007f2ed26c4160 CR3: 000000000480a000 CR4: 0000000000350ee0
[    7.258447] Call Trace:
[    7.258449]  <IRQ>
[    7.258450]  hpet_rtc_interrupt+0xd3/0x1a3
[    7.258454]  __handle_irq_event_percpu+0x6b/0x12e
[    7.258457]  handle_irq_event_percpu+0x2c/0x6f
[    7.258459]  handle_irq_event+0x23/0x43
[    7.258461]  handle_edge_irq+0x9e/0xbb
[    7.258463]  asm_call_irq_on_stack+0x12/0x20
[    7.258467]  </IRQ>
[    7.258467]  common_interrupt+0x9a/0x123
[    7.258470]  asm_common_interrupt+0x1e/0x40
[    7.258472] RIP: 0010:cpuidle_enter_state+0x13e/0x1fe
[    7.258475] Code: 49 89 c4 e8 bd fd ff ff 31 ff e8 3e 80 92 ff 45 84 ff 74 12 9c 58 0f ba e0 09 73 03 0f 0b fa 31 ff e8 13 16 96 ff fb 45 85 f6 <0f> 88 97 00 00 00 49 63 d6 4c 2b 24 24 48 6b ca 68 48 6b c2 30 4c
[    7.258476] RSP: 0018:ffffc90000167eb0 EFLAGS: 00000206
[    7.258477] RAX: ffff88840eca8240 RBX: ffff888101e0d400 RCX: 00000001b0a24b16
[    7.258478] RDX: 0000000000000002 RSI: 0000000000000002 RDI: 0000000000000000
[    7.258478] RBP: 0000000000000003 R08: 00000000ffffffff R09: 0000000000000000
[    7.258479] R10: ffff88810083c4a8 R11: 0000000000000000 R12: 00000001b0a24b48
[    7.258480] R13: ffffffff8299cc60 R14: 0000000000000003 R15: 0000000000000000
[    7.258482]  cpuidle_enter+0x2b/0x37
[    7.258483]  do_idle+0x126/0x184
[    7.258485]  cpu_startup_entry+0x18/0x1a
[    7.258486]  secondary_startup_64_no_verify+0xb0/0xbb
[    7.258489] ---[ end trace 9da59c3696ed99d8 ]---


> Reported-by: Mickaël Salaün <mic@digikod.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Mickaël Salaün <mic@linux.microsoft.com>
> ---
> V2: Fixed the sizeof() as spotted by Mickaël
> ---
>  drivers/rtc/rtc-cmos.c         |    8 ++++++++
>  drivers/rtc/rtc-mc146818-lib.c |    7 +++++++
>  2 files changed, 15 insertions(+)
>
> --- a/drivers/rtc/rtc-cmos.c
> +++ b/drivers/rtc/rtc-cmos.c
> @@ -805,6 +805,14 @@ cmos_do_probe(struct device *dev, struct
>  
>  	spin_lock_irq(&rtc_lock);
>  
> +	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> +	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
> +		spin_unlock_irq(&rtc_lock);
> +		dev_warn(dev, "not accessible\n");
> +		retval = -ENXIO;
> +		goto cleanup1;
> +	}
> +
>  	if (!(flags & CMOS_RTC_FLAGS_NOFREQ)) {
>  		/* force periodic irq to CMOS reset default of 1024Hz;
>  		 *
> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -21,6 +21,13 @@ unsigned int mc146818_get_time(struct rt
>  
>  again:
>  	spin_lock_irqsave(&rtc_lock, flags);
> +	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> +	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
> +		spin_unlock_irqrestore(&rtc_lock, flags);
> +		memset(time, 0xff, sizeof(*time));
> +		return 0;
> +	}
> +
>  	/*
>  	 * Check whether there is an update in progress during which the
>  	 * readout is unspecified. The maximum update time is ~2ms. Poll

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Detect and handle broken RTCs
  2021-01-31 10:59             ` [PATCH V2] " Dirk Gouders
@ 2021-02-01 13:53               ` Serge Belyshev
  2021-02-01 19:09                 ` [PATCH] rtc: mc146818: Dont test for bit 0-5 in Register D Thomas Gleixner
  0 siblings, 1 reply; 54+ messages in thread
From: Serge Belyshev @ 2021-02-01 13:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dirk Gouders, Mickaël Salaün, LKML, Alexandre Belloni,
	Jason Gunthorpe, Miroslav Lichvar, John Stultz, Prarit Bhargava,
	Alessandro Zummo, linux-rtc, Peter Zijlstra

Hi!  "Me too":

> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -21,6 +21,13 @@ unsigned int mc146818_get_time(struct rt
>  
>  again:
>  	spin_lock_irqsave(&rtc_lock, flags);
> +	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> +	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
> +		spin_unlock_irqrestore(&rtc_lock, flags);
> +		memset(time, 0xff, sizeof(*time));
> +		return 0;
> +	}
> +

... triggers here on a different box (Xiaomi mi notebook air 12.5):

[    3.524002] ------------[ cut here ]------------
[    3.528317] WARNING: CPU: 3 PID: 273 at drivers/rtc/rtc-mc146818-lib.c:25 mc146818_get_time+0x1b6/0x210
[    3.532558] CPU: 3 PID: 273 Comm: udevadm Not tainted 5.11.0-rc6 #760
[    3.536748] Hardware name: Timi TM1612/TM1612, BIOS A04 08/06/2016
[    3.540947] RIP: 0010:mc146818_get_time+0x1b6/0x210
[    3.545103] Code: 76 0b 0f b6 d0 83 ea 13 6b d2 64 01 d5 83 fd 45 89 6b 14 7f 06 83 c5 64 89 6b 14 41 83 ed 01 b8 02 00 00 00 44 89 6b 10 eb 39 <0f> 0b 48 c7 c7 b4 e0 9e 82 48 89 ee e8 29 6b 34 00 48 c7 03 ff ff
[    3.549883] RSP: 0000:ffffc900012efe30 EFLAGS: 00010002
[    3.554387] RAX: 0000000000000081 RBX: ffffc900012efe64 RCX: 000000000005b8d7
[    3.558867] RDX: 0000000000000001 RSI: ffff8881000aa000 RDI: 000000000000000d
[    3.563333] RBP: 0000000000000046 R08: 0000000000000004 R09: fffffffe5e075ac6
[    3.567748] iwlwifi 0000:02:00.0: Applying debug destination EXTERNAL_DRAM
[    3.567822] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[    3.567827] R13: ffffc900012efedc R14: 0000000000000008 R15: ffff888100051200
[    3.577223] FS:  0000000000000000(0000) GS:ffff88816ad80000(0000) knlGS:0000000000000000
[    3.579870] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.581947] CR2: 00007fface455e28 CR3: 0000000103244005 CR4: 00000000003706a0
[    3.583836] Call Trace:
[    3.585699]  hpet_rtc_interrupt+0x1af/0x220
[    3.587585]  __handle_irq_event_percpu+0x5a/0xc0
[    3.589230]  handle_irq_event_percpu+0x1b/0x50
[    3.590673]  handle_irq_event+0x22/0x40
[    3.592107]  handle_edge_irq+0x6b/0x190
[    3.593545]  common_interrupt+0x67/0x130
[    3.594983]  ? asm_common_interrupt+0x8/0x40
[    3.596432]  asm_common_interrupt+0x1e/0x40
[    3.597618] RIP: 0033:0x7ffaceac9b31
[    3.598794] Code: 48 83 fe 0a 0f 87 f5 fe ff ff be 41 ff ff 6f 48 29 d6 48 89 04 f1 e9 e4 fe ff ff 48 85 ff 74 79 49 8b 44 24 60 48 85 c0 74 04 <48> 01 78 08 49 8b 44 24 58 48 85 c0 74 04 48 01 78 08 49 8b 44 24
[    3.600048] RSP: 002b:00007ffc12303b00 EFLAGS: 00010202
[    3.601343] RAX: 00007fface455e20 RBX: 000000006ffffdff RCX: 00007fface80c040
[    3.602587] RDX: 0000000000000000 RSI: 0000000000000029 RDI: 00007fface451000
[    3.603809] RBP: 00007ffc12303c50 R08: 000000006fffffff R09: 00000000effffef5
[    3.605015] R10: 0000000070000022 R11: 0000000000000032 R12: 00007fface80c000
[    3.606223] R13: 000000006ffffeff R14: 000000006ffffe35 R15: 00007ffc12303ce0
[    3.607421] ---[ end trace 5922ddf43b0f7b83 ]---
[    3.608692] hpet: Lost 3 RTC interrupts

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 13:53               ` Serge Belyshev
@ 2021-02-01 19:09                 ` Thomas Gleixner
  2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
  0 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2021-02-01 19:09 UTC (permalink / raw)
  To: Serge Belyshev
  Cc: Dirk Gouders, Mickaël Salaün, LKML, Alexandre Belloni,
	Jason Gunthorpe, Miroslav Lichvar, John Stultz, Prarit Bhargava,
	Alessandro Zummo, linux-rtc, Peter Zijlstra, Linus Torvalds

The recent change to validate the RTC turned out to be overly tight.

While it cures the problem on the reporters machine it breaks machines
with Intel chipsets which use bit 0-5 of the D register. So check only
for bit 6 being 0 which is the case on these Intel machines as well.

Fixes: 211e5db19d15 ("rtc: mc146818: Detect and handle broken RTCs")
Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
Reported-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/rtc/rtc-cmos.c         |    8 ++++++++
 drivers/rtc/rtc-mc146818-lib.c |    7 +++++++
 2 files changed, 15 insertions(+)

--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -805,6 +805,14 @@ cmos_do_probe(struct device *dev, struct
 
 	spin_lock_irq(&rtc_lock);
 
+	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
+	if ((CMOS_READ(RTC_VALID) & 0x40) != 0) {
+		spin_unlock_irq(&rtc_lock);
+		dev_warn(dev, "not accessible\n");
+		retval = -ENXIO;
+		goto cleanup1;
+	}
+
 	if (!(flags & CMOS_RTC_FLAGS_NOFREQ)) {
 		/* force periodic irq to CMOS reset default of 1024Hz;
 		 *
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -21,6 +21,13 @@ unsigned int mc146818_get_time(struct rt
 
 again:
 	spin_lock_irqsave(&rtc_lock, flags);
+	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
+	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x40) != 0)) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		memset(time, 0xff, sizeof(*time));
+		return 0;
+	}
+
 	/*
 	 * Check whether there is an update in progress during which the
 	 * readout is unspecified. The maximum update time is ~2ms. Poll

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:09                 ` [PATCH] rtc: mc146818: Dont test for bit 0-5 in Register D Thomas Gleixner
@ 2021-02-01 19:24                   ` Thomas Gleixner
  2021-02-01 19:32                     ` Linus Torvalds
                                       ` (6 more replies)
  0 siblings, 7 replies; 54+ messages in thread
From: Thomas Gleixner @ 2021-02-01 19:24 UTC (permalink / raw)
  To: Serge Belyshev
  Cc: Dirk Gouders, Mickaël Salaün, LKML, Alexandre Belloni,
	Jason Gunthorpe, Miroslav Lichvar, John Stultz, Prarit Bhargava,
	Alessandro Zummo, linux-rtc, Peter Zijlstra, Linus Torvalds

The recent change to validate the RTC turned out to be overly tight.

While it cures the problem on the reporters machine it breaks machines
with Intel chipsets which use bit 0-5 of the D register. So check only
for bit 6 being 0 which is the case on these Intel machines as well.

Fixes: 211e5db19d15 ("rtc: mc146818: Detect and handle broken RTCs")
Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
Reported-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Provide the actual delta patch. Should have stayed away from
    computers today....
---
 drivers/rtc/rtc-cmos.c         |    4 ++--
 drivers/rtc/rtc-mc146818-lib.c |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -805,8 +805,8 @@ cmos_do_probe(struct device *dev, struct
 
 	spin_lock_irq(&rtc_lock);
 
-	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
-	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
+	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
+	if ((CMOS_READ(RTC_VALID) & 0x40) != 0) {
 		spin_unlock_irq(&rtc_lock);
 		dev_warn(dev, "not accessible\n");
 		retval = -ENXIO;
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -21,8 +21,8 @@ unsigned int mc146818_get_time(struct rt
 
 again:
 	spin_lock_irqsave(&rtc_lock, flags);
-	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
-	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
+	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
+	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x40) != 0)) {
 		spin_unlock_irqrestore(&rtc_lock, flags);
 		memset(time, 0xff, sizeof(*time));
 		return 0;

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
@ 2021-02-01 19:32                     ` Linus Torvalds
  2021-02-01 19:40                       ` Thomas Gleixner
  2021-02-01 19:38                     ` Alexandre Belloni
                                       ` (5 subsequent siblings)
  6 siblings, 1 reply; 54+ messages in thread
From: Linus Torvalds @ 2021-02-01 19:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Serge Belyshev, Dirk Gouders, Mickaël Salaün, LKML,
	Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

On Mon, Feb 1, 2021 at 11:24 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> While it cures the problem on the reporters machine it breaks machines
> with Intel chipsets which use bit 0-5 of the D register. So check only
> for bit 6 being 0 which is the case on these Intel machines as well.

This looks fine, but it might also be worth it simply just checking
for the only really special value: 0xff, and going "ok, that looks
like missing hardware".

That's what a few other drivers historically do in their probing
routines, so it's not unheard of (ie you can find drivers doing that
kind of

        /* If we read 0xff from the LSR, there is no UART here. */
        if (inb(.. port ..) == 0xff)

in their init routines.

Not a big deal either way, I just think it would be more in like with
what other places do in similar situations

      Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
  2021-02-01 19:32                     ` Linus Torvalds
@ 2021-02-01 19:38                     ` Alexandre Belloni
  2021-02-01 20:09                     ` Borislav Petkov
                                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 54+ messages in thread
From: Alexandre Belloni @ 2021-02-01 19:38 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Serge Belyshev, Dirk Gouders, Mickaël Salaün, LKML,
	Jason Gunthorpe, Miroslav Lichvar, John Stultz, Prarit Bhargava,
	Alessandro Zummo, linux-rtc, Peter Zijlstra, Linus Torvalds

On 01/02/2021 20:24:17+0100, Thomas Gleixner wrote:
> The recent change to validate the RTC turned out to be overly tight.
> 
> While it cures the problem on the reporters machine it breaks machines
> with Intel chipsets which use bit 0-5 of the D register. So check only
> for bit 6 being 0 which is the case on these Intel machines as well.
> 
> Fixes: 211e5db19d15 ("rtc: mc146818: Detect and handle broken RTCs")
> Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
> Reported-by: Dirk Gouders <dirk@gouders.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

I'm still fine with that going through your tree.

Thanks for this work I do hope this will be the last issue...

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:32                     ` Linus Torvalds
@ 2021-02-01 19:40                       ` Thomas Gleixner
  2021-02-11 23:09                         ` Maciej W. Rozycki
  0 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2021-02-01 19:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Serge Belyshev, Dirk Gouders, Mickaël Salaün, LKML,
	Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra

On Mon, Feb 01 2021 at 11:32, Linus Torvalds wrote:
> On Mon, Feb 1, 2021 at 11:24 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> While it cures the problem on the reporters machine it breaks machines
>> with Intel chipsets which use bit 0-5 of the D register. So check only
>> for bit 6 being 0 which is the case on these Intel machines as well.
>
> This looks fine, but it might also be worth it simply just checking
> for the only really special value: 0xff, and going "ok, that looks
> like missing hardware".
>
> That's what a few other drivers historically do in their probing
> routines, so it's not unheard of (ie you can find drivers doing that
> kind of
>
>         /* If we read 0xff from the LSR, there is no UART here. */
>         if (inb(.. port ..) == 0xff)
>
> in their init routines.
>
> Not a big deal either way, I just think it would be more in like with
> what other places do in similar situations

Yeah, we can do that as well. Either way is fine.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
  2021-02-01 19:32                     ` Linus Torvalds
  2021-02-01 19:38                     ` Alexandre Belloni
@ 2021-02-01 20:09                     ` Borislav Petkov
  2021-02-01 20:15                     ` Dirk Gouders
                                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 54+ messages in thread
From: Borislav Petkov @ 2021-02-01 20:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Serge Belyshev, Dirk Gouders, Mickaël Salaün, LKML,
	Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra, Linus Torvalds

On Mon, Feb 01, 2021 at 08:24:17PM +0100, Thomas Gleixner wrote:
> The recent change to validate the RTC turned out to be overly tight.
> 
> While it cures the problem on the reporters machine it breaks machines
> with Intel chipsets which use bit 0-5 of the D register. So check only
> for bit 6 being 0 which is the case on these Intel machines as well.
> 
> Fixes: 211e5db19d15 ("rtc: mc146818: Detect and handle broken RTCs")
> Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
> Reported-by: Dirk Gouders <dirk@gouders.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: Provide the actual delta patch. Should have stayed away from
>     computers today....
> ---
>  drivers/rtc/rtc-cmos.c         |    4 ++--
>  drivers/rtc/rtc-mc146818-lib.c |    4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)

FWIW:

Reported-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@suse.de>

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
                                       ` (2 preceding siblings ...)
  2021-02-01 20:09                     ` Borislav Petkov
@ 2021-02-01 20:15                     ` Dirk Gouders
  2021-02-02  4:22                     ` Len Brown
                                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 54+ messages in thread
From: Dirk Gouders @ 2021-02-01 20:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Serge Belyshev, Mickaël Salaün, LKML,
	Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra, Linus Torvalds

Thomas Gleixner <tglx@linutronix.de> writes:

> The recent change to validate the RTC turned out to be overly tight.
>
> While it cures the problem on the reporters machine it breaks machines
> with Intel chipsets which use bit 0-5 of the D register. So check only
> for bit 6 being 0 which is the case on these Intel machines as well.
>
> Fixes: 211e5db19d15 ("rtc: mc146818: Detect and handle broken RTCs")
> Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
> Reported-by: Dirk Gouders <dirk@gouders.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: Provide the actual delta patch. Should have stayed away from
>     computers today....

I tested V2 and it eliminates the warning, here.

Thank you,

Dirk

> ---
>  drivers/rtc/rtc-cmos.c         |    4 ++--
>  drivers/rtc/rtc-mc146818-lib.c |    4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> --- a/drivers/rtc/rtc-cmos.c
> +++ b/drivers/rtc/rtc-cmos.c
> @@ -805,8 +805,8 @@ cmos_do_probe(struct device *dev, struct
>  
>  	spin_lock_irq(&rtc_lock);
>  
> -	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> -	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
> +	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
> +	if ((CMOS_READ(RTC_VALID) & 0x40) != 0) {
>  		spin_unlock_irq(&rtc_lock);
>  		dev_warn(dev, "not accessible\n");
>  		retval = -ENXIO;
> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -21,8 +21,8 @@ unsigned int mc146818_get_time(struct rt
>  
>  again:
>  	spin_lock_irqsave(&rtc_lock, flags);
> -	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> -	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
> +	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
> +	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x40) != 0)) {
>  		spin_unlock_irqrestore(&rtc_lock, flags);
>  		memset(time, 0xff, sizeof(*time));
>  		return 0;

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
                                       ` (3 preceding siblings ...)
  2021-02-01 20:15                     ` Dirk Gouders
@ 2021-02-02  4:22                     ` Len Brown
  2021-02-02 19:40                     ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
  2021-02-03 13:00                     ` [PATCH V2] " Mickaël Salaün
  6 siblings, 0 replies; 54+ messages in thread
From: Len Brown @ 2021-02-02  4:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Serge Belyshev, Dirk Gouders, Mickaël Salaün, LKML,
	Alexandre Belloni, Jason Gunthorpe, Miroslav Lichvar,
	John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc,
	Peter Zijlstra, Linus Torvalds

Thanks for the update, Thomas.

V1 prevented rc6 automated suspend/resume testing on all 13 of my
local machines.
V2 applied, and they are back in business.

tested-by: Len Brown <len.brown@intel.com>

On Mon, Feb 1, 2021 at 2:25 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The recent change to validate the RTC turned out to be overly tight.
>
> While it cures the problem on the reporters machine it breaks machines
> with Intel chipsets which use bit 0-5 of the D register. So check only
> for bit 6 being 0 which is the case on these Intel machines as well.
>
> Fixes: 211e5db19d15 ("rtc: mc146818: Detect and handle broken RTCs")
> Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
> Reported-by: Dirk Gouders <dirk@gouders.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: Provide the actual delta patch. Should have stayed away from
>     computers today....
> ---
>  drivers/rtc/rtc-cmos.c         |    4 ++--
>  drivers/rtc/rtc-mc146818-lib.c |    4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> --- a/drivers/rtc/rtc-cmos.c
> +++ b/drivers/rtc/rtc-cmos.c
> @@ -805,8 +805,8 @@ cmos_do_probe(struct device *dev, struct
>
>         spin_lock_irq(&rtc_lock);
>
> -       /* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> -       if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
> +       /* Ensure that the RTC is accessible. Bit 6 must be 0! */
> +       if ((CMOS_READ(RTC_VALID) & 0x40) != 0) {
>                 spin_unlock_irq(&rtc_lock);
>                 dev_warn(dev, "not accessible\n");
>                 retval = -ENXIO;
> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -21,8 +21,8 @@ unsigned int mc146818_get_time(struct rt
>
>  again:
>         spin_lock_irqsave(&rtc_lock, flags);
> -       /* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> -       if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
> +       /* Ensure that the RTC is accessible. Bit 6 must be 0! */
> +       if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x40) != 0)) {
>                 spin_unlock_irqrestore(&rtc_lock, flags);
>                 memset(time, 0xff, sizeof(*time));
>                 return 0;



-- 
Len Brown, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [tip: timers/urgent] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
                                       ` (4 preceding siblings ...)
  2021-02-02  4:22                     ` Len Brown
@ 2021-02-02 19:40                     ` tip-bot2 for Thomas Gleixner
  2021-02-03 13:00                     ` [PATCH V2] " Mickaël Salaün
  6 siblings, 0 replies; 54+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-02-02 19:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Serge Belyshev, Dirk Gouders, Borislav Petkov, Thomas Gleixner,
	Len Brown, Alexandre Belloni, x86, linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     ebb22a05943666155e6da04407cc6e913974c78c
Gitweb:        https://git.kernel.org/tip/ebb22a05943666155e6da04407cc6e913974c78c
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 01 Feb 2021 20:24:17 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 02 Feb 2021 20:35:02 +01:00

rtc: mc146818: Dont test for bit 0-5 in Register D

The recent change to validate the RTC turned out to be overly tight.

While it cures the problem on the reporters machine it breaks machines
with Intel chipsets which use bit 0-5 of the D register. So check only
for bit 6 being 0 which is the case on these Intel machines as well.

Fixes: 211e5db19d15 ("rtc: mc146818: Detect and handle broken RTCs")
Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
Reported-by: Dirk Gouders <dirk@gouders.net>
Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Dirk Gouders <dirk@gouders.net>
Tested-by: Len Brown <len.brown@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/87zh0nbnha.fsf@nanos.tec.linutronix.de

---
 drivers/rtc/rtc-cmos.c         | 4 ++--
 drivers/rtc/rtc-mc146818-lib.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 68a9ac6..a701dae 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -805,8 +805,8 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq)
 
 	spin_lock_irq(&rtc_lock);
 
-	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
-	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
+	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
+	if ((CMOS_READ(RTC_VALID) & 0x40) != 0) {
 		spin_unlock_irq(&rtc_lock);
 		dev_warn(dev, "not accessible\n");
 		retval = -ENXIO;
diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index f83c138..dcfaf09 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -21,8 +21,8 @@ unsigned int mc146818_get_time(struct rtc_time *time)
 
 again:
 	spin_lock_irqsave(&rtc_lock, flags);
-	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
-	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
+	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
+	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x40) != 0)) {
 		spin_unlock_irqrestore(&rtc_lock, flags);
 		memset(time, 0xff, sizeof(*time));
 		return 0;

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
                                       ` (5 preceding siblings ...)
  2021-02-02 19:40                     ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
@ 2021-02-03 13:00                     ` Mickaël Salaün
  6 siblings, 0 replies; 54+ messages in thread
From: Mickaël Salaün @ 2021-02-03 13:00 UTC (permalink / raw)
  To: Thomas Gleixner, Serge Belyshev
  Cc: Dirk Gouders, LKML, Alexandre Belloni, Jason Gunthorpe,
	Miroslav Lichvar, John Stultz, Prarit Bhargava, Alessandro Zummo,
	linux-rtc, Peter Zijlstra, Linus Torvalds

FWIW, it's still OK for me.

Tested-by: Mickaël Salaün <mic@linux.microsoft.com>

On 01/02/2021 20:24, Thomas Gleixner wrote:
> The recent change to validate the RTC turned out to be overly tight.
> 
> While it cures the problem on the reporters machine it breaks machines
> with Intel chipsets which use bit 0-5 of the D register. So check only
> for bit 6 being 0 which is the case on these Intel machines as well.
> 
> Fixes: 211e5db19d15 ("rtc: mc146818: Detect and handle broken RTCs")
> Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
> Reported-by: Dirk Gouders <dirk@gouders.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: Provide the actual delta patch. Should have stayed away from
>     computers today....
> ---
>  drivers/rtc/rtc-cmos.c         |    4 ++--
>  drivers/rtc/rtc-mc146818-lib.c |    4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> --- a/drivers/rtc/rtc-cmos.c
> +++ b/drivers/rtc/rtc-cmos.c
> @@ -805,8 +805,8 @@ cmos_do_probe(struct device *dev, struct
>  
>  	spin_lock_irq(&rtc_lock);
>  
> -	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> -	if ((CMOS_READ(RTC_VALID) & 0x7f) != 0) {
> +	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
> +	if ((CMOS_READ(RTC_VALID) & 0x40) != 0) {
>  		spin_unlock_irq(&rtc_lock);
>  		dev_warn(dev, "not accessible\n");
>  		retval = -ENXIO;
> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -21,8 +21,8 @@ unsigned int mc146818_get_time(struct rt
>  
>  again:
>  	spin_lock_irqsave(&rtc_lock, flags);
> -	/* Ensure that the RTC is accessible. Bit 0-6 must be 0! */
> -	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x7f) != 0)) {
> +	/* Ensure that the RTC is accessible. Bit 6 must be 0! */
> +	if (WARN_ON_ONCE((CMOS_READ(RTC_VALID) & 0x40) != 0)) {
>  		spin_unlock_irqrestore(&rtc_lock, flags);
>  		memset(time, 0xff, sizeof(*time));
>  		return 0;
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V2] rtc: mc146818: Dont test for bit 0-5 in Register D
  2021-02-01 19:40                       ` Thomas Gleixner
@ 2021-02-11 23:09                         ` Maciej W. Rozycki
  0 siblings, 0 replies; 54+ messages in thread
From: Maciej W. Rozycki @ 2021-02-11 23:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Serge Belyshev, Dirk Gouders,
	Mickaël Salaün, LKML, Alexandre Belloni,
	Jason Gunthorpe, Miroslav Lichvar, John Stultz, Prarit Bhargava,
	Alessandro Zummo, linux-rtc, Peter Zijlstra

On Mon, 1 Feb 2021, Thomas Gleixner wrote:

> >> While it cures the problem on the reporters machine it breaks machines
> >> with Intel chipsets which use bit 0-5 of the D register. So check only
> >> for bit 6 being 0 which is the case on these Intel machines as well.
> >
> > This looks fine, but it might also be worth it simply just checking
> > for the only really special value: 0xff, and going "ok, that looks
> > like missing hardware".
> >
> > That's what a few other drivers historically do in their probing
> > routines, so it's not unheard of (ie you can find drivers doing that
> > kind of
> >
> >         /* If we read 0xff from the LSR, there is no UART here. */
> >         if (inb(.. port ..) == 0xff)
> >
> > in their init routines.
> >
> > Not a big deal either way, I just think it would be more in like with
> > what other places do in similar situations
> 
> Yeah, we can do that as well. Either way is fine.

 Given that evidently vendors appear to start playing with 146818 clones 
it may be worth it to peek at the D and the C register and checking they 
are not 0xff both at a time for robustness before concluding no RTC is 
present.  The C register is supposed to hold zeros in bits 3:0.  A read of 
the C register will drop interrupt bits, but I guess it does not matter at 
the probe time.

 FWIW,

  Maciej

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2021-02-11 23:10 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-06 21:46 [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
2020-12-06 21:46 ` [patch 1/8] rtc: mc146818: Prevent reading garbage Thomas Gleixner
2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2021-01-25 18:40   ` [patch 1/8] rtc: mc146818: Prevent reading garbage - bug Mickaël Salaün
2021-01-26 13:26     ` Thomas Gleixner
2021-01-26 14:17       ` Mickaël Salaün
2021-01-26 15:35         ` Thomas Gleixner
2021-01-26 17:02           ` [PATCH V2] rtc: mc146818: Detect and handle broken RTCs Thomas Gleixner
2021-01-26 18:00             ` Alexandre Belloni
2021-01-27  8:41             ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
2021-01-31 10:59             ` [PATCH V2] " Dirk Gouders
2021-02-01 13:53               ` Serge Belyshev
2021-02-01 19:09                 ` [PATCH] rtc: mc146818: Dont test for bit 0-5 in Register D Thomas Gleixner
2021-02-01 19:24                   ` [PATCH V2] " Thomas Gleixner
2021-02-01 19:32                     ` Linus Torvalds
2021-02-01 19:40                       ` Thomas Gleixner
2021-02-11 23:09                         ` Maciej W. Rozycki
2021-02-01 19:38                     ` Alexandre Belloni
2021-02-01 20:09                     ` Borislav Petkov
2021-02-01 20:15                     ` Dirk Gouders
2021-02-02  4:22                     ` Len Brown
2021-02-02 19:40                     ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
2021-02-03 13:00                     ` [PATCH V2] " Mickaël Salaün
2020-12-06 21:46 ` [patch 2/8] rtc: mc146818: Reduce spinlock section in mc146818_set_time() Thomas Gleixner
2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-12-06 21:46 ` [patch 3/8] rtc: cmos: Make rtc_cmos sync offset correct Thomas Gleixner
2020-12-07 20:50   ` Jason Gunthorpe
2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-12-06 21:46 ` [patch 4/8] rtc: core: Make the sync offset default more realistic Thomas Gleixner
2020-12-07 20:55   ` Jason Gunthorpe
2020-12-10 23:59   ` Alexandre Belloni
2020-12-11  0:23     ` Thomas Gleixner
2020-12-11  0:28       ` Alexandre Belloni
2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-12-06 21:46 ` [patch 5/8] ntp: Make the RTC synchronization more reliable Thomas Gleixner
2020-12-07 12:47   ` Miroslav Lichvar
2020-12-07 20:59   ` Jason Gunthorpe
2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-12-29 19:41   ` [patch 5/8] " Geert Uytterhoeven
2021-01-11 10:12     ` Thomas Gleixner
2021-01-11 10:40       ` Geert Uytterhoeven
2020-12-06 21:46 ` [patch 6/8] ntp, rtc: Move rtc_set_ntp_time() to ntp code Thomas Gleixner
2020-12-07 20:59   ` Jason Gunthorpe
2020-12-09  3:51     ` Alexandre Belloni
2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-12-06 21:46 ` [patch 7/8] ntp: Make the RTC sync offset less obscure Thomas Gleixner
2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-12-06 21:46 ` [patch 8/8] ntp: Consolidate the RTC update implementation Thomas Gleixner
2020-12-07 21:05   ` Jason Gunthorpe
2020-12-11  9:23     ` Thomas Gleixner
2020-12-11 10:07   ` [tip: timers/core] " tip-bot2 for Thomas Gleixner
2020-12-09  0:33 ` [patch 0/8] ntp/rtc: Fixes and cleanups Thomas Gleixner
2020-12-09  4:01   ` Alexandre Belloni
2020-12-09 12:39     ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.