All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/3] Improve stability of system clock
@ 2017-05-17 16:13 Miroslav Lichvar
  2017-05-17 16:13 ` [PATCH RFC 1/3] timekeeping: Remove support for old vsyscalls Miroslav Lichvar
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Miroslav Lichvar @ 2017-05-17 16:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: John Stultz, Prarit Bhargava, Richard Cochran

This is an attempt to improve stability and accuracy of the system clock
with very accurate time sources like the new PTP KVM clock or NTP/PTP
using hardware timestamping. It affects mainly kernels running with
NOHZ. It requires updating of the old ia64 and powerpc vsyscalls.

The main problem is that the error accumulated in the ntp_error register
takes too long to correct and this cannot be easily fixed. There are
four sources of the error:
- rounding of time for old vsyscalls
- alignment of frequency adjustments to ticks
- iterative correction of the multiplier
- limited resolution of the multipler

Instead of trying to correct the error faster, the patches remove the
first three sources. With the only remaining source the correction logic
can be simplified and the frequency of the clock is much more stable and
accurate.

Simulations of a frequency step in linux-tktest (values are in ppm and
nanoseconds):

Before:

nohz on             [1, samples/2]           [samples/2 + 1, samples]
samples         freq       dev       max      freq       dev       max
10           1.47222    1341.3    2217.8   0.06322       0.2       0.5
30           0.20799     849.5    2448.7   0.06311       0.2       0.6
100          0.04101     492.1    2895.2   0.06311       0.2       0.5
300          0.05660     295.5    3026.1   0.02064      28.3     108.9
1000         0.01994     409.8    2732.1   0.00355      13.7      52.2
3000         0.00477     469.1    3238.9   0.00070      11.0      40.9
10000        0.00081     377.3    3791.6   0.00013       9.4      36.2
30000        0.00016     259.9    4055.7   0.00004       8.9      34.1
100000       0.00003     159.0    4177.2   0.00000      13.7      58.4

nohz off            [1, samples/2]           [samples/2 + 1, samples]
samples         freq       dev       max      freq       dev       max
10           3.55062       6.2      10.8   0.05730       0.0       0.0
30           0.44672       4.5      14.1   0.05724       0.2       0.5
100          0.03649       2.7      17.4   0.05711       0.2       0.5
300          0.05815       1.7      18.7   0.06313       0.2       0.5
1000         0.06270       1.0      19.1   0.06315       0.2       0.5
3000         0.05720       1.9      19.9   0.02065       1.1       4.1
10000        0.01947      13.5      41.0   0.00339       0.5       1.7
30000        0.00448      17.5      75.9   0.00065       0.3       1.0
100000       0.00078      14.2     101.7   0.00012       0.2       0.7

After:

nohz on             [1, samples/2]           [samples/2 + 1, samples]
samples         freq       dev       max      freq       dev       max
10           0.01584       9.0      14.2   0.02937       2.7       7.2
30           0.00426      10.9      22.4   0.00481       6.5      19.2
100          0.00077      11.6      26.3   0.00074       9.0      26.9
300          0.00013      12.4      29.9   0.00018       8.7      29.3
1000         0.00003      12.6      31.8   0.00003       8.7      32.1
3000         0.00001      12.6      33.3   0.00001       9.1      33.4
10000        0.00000      12.9      34.0   0.00000       9.0      34.1
30000        0.00000      12.8      34.5   0.00000       9.0      34.5
100000       0.00000      16.5      51.2   0.00000      13.7      58.5

nohz off            [1, samples/2]           [samples/2 + 1, samples]
samples         freq       dev       max      freq       dev       max
10           0.10309       0.1       0.1   0.12717       0.0       0.1
30           0.04269       0.1       0.3   0.02592       0.1       0.4
100          0.00629       0.3       0.5   0.00521       0.2       0.5
300          0.00109       0.3       0.6   0.00099       0.2       0.5
1000         0.00019       0.3       0.6   0.00022       0.2       0.6
3000         0.00002       0.3       0.6   0.00002       0.2       0.6
10000        0.00000       0.3       0.6   0.00000       0.2       0.6
30000        0.00000       0.3       0.6   0.00000       0.2       0.6
100000       0.00000       0.3       0.6   0.00000       0.2       0.6

Miroslav Lichvar (3):
  timekeeping: Remove support for old vsyscalls
  timekeeping: Don't align frequency adjustments to ticks
  timekeeping: Determine multiplier directly from NTP tick length

 include/linux/timekeeper_internal.h |   9 +-
 kernel/time/Kconfig                 |   4 -
 kernel/time/timekeeping.c           | 184 +++++++++---------------------------
 3 files changed, 48 insertions(+), 149 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH RFC 1/3] timekeeping: Remove support for old vsyscalls
  2017-05-17 16:13 [PATCH RFC 0/3] Improve stability of system clock Miroslav Lichvar
@ 2017-05-17 16:13 ` Miroslav Lichvar
  2017-05-17 16:13 ` [PATCH RFC 2/3] timekeeping: Don't align frequency adjustments to ticks Miroslav Lichvar
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Miroslav Lichvar @ 2017-05-17 16:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: John Stultz, Prarit Bhargava, Richard Cochran

As the last users of CONFIG_GENERIC_TIME_VSYSCALL_OLD have been updated,
the support can be removed from the timekeeping code.

Cc: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
 include/linux/timekeeper_internal.h |  7 ------
 kernel/time/Kconfig                 |  4 ----
 kernel/time/timekeeping.c           | 44 -------------------------------------
 3 files changed, 55 deletions(-)

diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 110f453..b7ae5b0 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -132,13 +132,6 @@ struct timekeeper {
 extern void update_vsyscall(struct timekeeper *tk);
 extern void update_vsyscall_tz(void);
 
-#elif defined(CONFIG_GENERIC_TIME_VSYSCALL_OLD)
-
-extern void update_vsyscall_old(struct timespec *ts, struct timespec *wtm,
-				struct clocksource *c, u32 mult,
-				u64 cycle_last);
-extern void update_vsyscall_tz(void);
-
 #else
 
 static inline void update_vsyscall(struct timekeeper *tk)
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 4008d9f..55d61a3 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -21,10 +21,6 @@ config CLOCKSOURCE_VALIDATE_LAST_CYCLE
 config GENERIC_TIME_VSYSCALL
 	bool
 
-# Timekeeping vsyscall support
-config GENERIC_TIME_VSYSCALL_OLD
-	bool
-
 # Old style timekeeping
 config ARCH_USES_GETTIMEOFFSET
 	bool
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 8fd77c6..ff542dd 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -487,44 +487,6 @@ static void halt_fast_timekeeper(struct timekeeper *tk)
 	update_fast_timekeeper(&tkr_dummy, &tk_fast_raw);
 }
 
-#ifdef CONFIG_GENERIC_TIME_VSYSCALL_OLD
-
-static inline void update_vsyscall(struct timekeeper *tk)
-{
-	struct timespec xt, wm;
-
-	xt = timespec64_to_timespec(tk_xtime(tk));
-	wm = timespec64_to_timespec(tk->wall_to_monotonic);
-	update_vsyscall_old(&xt, &wm, tk->tkr_mono.clock, tk->tkr_mono.mult,
-			    tk->tkr_mono.cycle_last);
-}
-
-static inline void old_vsyscall_fixup(struct timekeeper *tk)
-{
-	s64 remainder;
-
-	/*
-	* Store only full nanoseconds into xtime_nsec after rounding
-	* it up and add the remainder to the error difference.
-	* XXX - This is necessary to avoid small 1ns inconsistnecies caused
-	* by truncating the remainder in vsyscalls. However, it causes
-	* additional work to be done in timekeeping_adjust(). Once
-	* the vsyscall implementations are converted to use xtime_nsec
-	* (shifted nanoseconds), and CONFIG_GENERIC_TIME_VSYSCALL_OLD
-	* users are removed, this can be killed.
-	*/
-	remainder = tk->tkr_mono.xtime_nsec & ((1ULL << tk->tkr_mono.shift) - 1);
-	if (remainder != 0) {
-		tk->tkr_mono.xtime_nsec -= remainder;
-		tk->tkr_mono.xtime_nsec += 1ULL << tk->tkr_mono.shift;
-		tk->ntp_error += remainder << tk->ntp_error_shift;
-		tk->ntp_error -= (1ULL << tk->tkr_mono.shift) << tk->ntp_error_shift;
-	}
-}
-#else
-#define old_vsyscall_fixup(tk)
-#endif
-
 static RAW_NOTIFIER_HEAD(pvclock_gtod_chain);
 
 static void update_pvclock_gtod(struct timekeeper *tk, bool was_set)
@@ -2065,12 +2027,6 @@ void update_wall_time(void)
 	timekeeping_adjust(tk, offset);
 
 	/*
-	 * XXX This can be killed once everyone converts
-	 * to the new update_vsyscall.
-	 */
-	old_vsyscall_fixup(tk);
-
-	/*
 	 * Finally, make sure that after the rounding
 	 * xtime_nsec isn't larger than NSEC_PER_SEC
 	 */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH RFC 2/3] timekeeping: Don't align frequency adjustments to ticks
  2017-05-17 16:13 [PATCH RFC 0/3] Improve stability of system clock Miroslav Lichvar
  2017-05-17 16:13 ` [PATCH RFC 1/3] timekeeping: Remove support for old vsyscalls Miroslav Lichvar
@ 2017-05-17 16:13 ` Miroslav Lichvar
  2017-05-17 16:13 ` [PATCH RFC 3/3] timekeeping: Determine multiplier directly from NTP tick length Miroslav Lichvar
  2017-05-17 16:30 ` [PATCH RFC 0/3] Improve stability of system clock John Stultz
  3 siblings, 0 replies; 14+ messages in thread
From: Miroslav Lichvar @ 2017-05-17 16:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: John Stultz, Prarit Bhargava, Richard Cochran

When the timekeeping multiplier is adjusted, the NTP error is adjusted
to correct the clock for the misalignment of the update to the start of
the tick. This error is corrected in later updates and the clock appears
as if the frequency was changed exactly on the tick.

Remove this correction to keep the point where the frequency is
effectively changed at the time of the update. This removes a major
source of the NTP error.

Cc: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
 kernel/time/timekeeping.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index ff542dd..5ae6f27 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1760,8 +1760,6 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
 	 *	xtime_nsec_2 = xtime_nsec_1 - offset
 	 * Which simplfies to:
 	 *	xtime_nsec -= offset
-	 *
-	 * XXX - TODO: Doc ntp_error calculation.
 	 */
 	if ((mult_adj > 0) && (tk->tkr_mono.mult + mult_adj < mult_adj)) {
 		/* NTP adjustment caused clocksource mult overflow */
@@ -1772,7 +1770,6 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
 	tk->tkr_mono.mult += mult_adj;
 	tk->xtime_interval += interval;
 	tk->tkr_mono.xtime_nsec -= offset;
-	tk->ntp_error -= (interval - offset) << tk->ntp_error_shift;
 }
 
 /*
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH RFC 3/3] timekeeping: Determine multiplier directly from NTP tick length
  2017-05-17 16:13 [PATCH RFC 0/3] Improve stability of system clock Miroslav Lichvar
  2017-05-17 16:13 ` [PATCH RFC 1/3] timekeeping: Remove support for old vsyscalls Miroslav Lichvar
  2017-05-17 16:13 ` [PATCH RFC 2/3] timekeeping: Don't align frequency adjustments to ticks Miroslav Lichvar
@ 2017-05-17 16:13 ` Miroslav Lichvar
  2017-05-17 16:30 ` [PATCH RFC 0/3] Improve stability of system clock John Stultz
  3 siblings, 0 replies; 14+ messages in thread
From: Miroslav Lichvar @ 2017-05-17 16:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: John Stultz, Prarit Bhargava, Richard Cochran

When the length of the NTP tick changes significantly, e.g. when an
NTP/PTP implementation corrects the initial offset of the clock, a large
value may accumulate in the NTP error before the multiplier converges to
the correct value. It may then take a very long time (hours or even
days) before the error is corrected. This creates a small unstable
frequency offset and prevents stable synchronization of the clock with
very stable time sources (e.g. NTP/PTP using hardware timestamping or
PTP KVM clock).

Use division to determine the correct multiplier directly from the NTP
tick length in order to replace the iterative approach and remove the
last major source of the NTP error. The only remaining source of the
error is now limited resolution of the multiplier, which can be
effectively corrected by adding 0 or 1 to the multiplier according to
the sign of the error.

Cc: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
 include/linux/timekeeper_internal.h |   2 +
 kernel/time/timekeeping.c           | 137 ++++++++++++------------------------
 2 files changed, 48 insertions(+), 91 deletions(-)

diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index b7ae5b0..e9a46e5 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -113,6 +113,8 @@ struct timekeeper {
 	s64			ntp_error;
 	u32			ntp_error_shift;
 	u32			ntp_err_mult;
+	/* Flag used to avoid updating NTP twice with same second */
+	u32			skip_second_overflow;
 #ifdef CONFIG_DEBUG_TIMEKEEPING
 	long			last_warning;
 	/*
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5ae6f27..b4cc606 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -289,6 +289,7 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
 	tk->tkr_mono.mult = clock->mult;
 	tk->tkr_raw.mult = clock->mult;
 	tk->ntp_err_mult = 0;
+	tk->skip_second_overflow = 0;
 }
 
 /* Timekeeper helper functions. */
@@ -1699,20 +1700,19 @@ device_initcall(timekeeping_init_ops);
  */
 static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
 							 s64 offset,
-							 bool negative,
-							 int adj_scale)
+							 s32 mult_adj)
 {
 	s64 interval = tk->cycle_interval;
-	s32 mult_adj = 1;
 
-	if (negative) {
-		mult_adj = -mult_adj;
+	if (mult_adj == 0) {
+		return;
+	} else if (mult_adj == -1) {
 		interval = -interval;
-		offset  = -offset;
+		offset = -offset;
+	} else if (mult_adj != 1) {
+		interval *= mult_adj;
+		offset *= mult_adj;
 	}
-	mult_adj <<= adj_scale;
-	interval <<= adj_scale;
-	offset <<= adj_scale;
 
 	/*
 	 * So the following can be confusing.
@@ -1773,85 +1773,35 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
 }
 
 /*
- * Calculate the multiplier adjustment needed to match the frequency
- * specified by NTP
+ * Adjust the timekeeper's multiplier to the correct frequency
+ * and also to reduce the accumulated error value.
  */
-static __always_inline void timekeeping_freqadjust(struct timekeeper *tk,
-							s64 offset)
+static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
 {
-	s64 interval = tk->cycle_interval;
-	s64 xinterval = tk->xtime_interval;
-	u32 base = tk->tkr_mono.clock->mult;
-	u32 max = tk->tkr_mono.clock->maxadj;
-	u32 cur_adj = tk->tkr_mono.mult;
-	s64 tick_error;
-	bool negative;
-	u32 adj_scale;
-
-	/* Remove any current error adj from freq calculation */
-	if (tk->ntp_err_mult)
-		xinterval -= tk->cycle_interval;
-
-	tk->ntp_tick = ntp_tick_length();
-
-	/* Calculate current error per tick */
-	tick_error = ntp_tick_length() >> tk->ntp_error_shift;
-	tick_error -= (xinterval + tk->xtime_remainder);
-
-	/* Don't worry about correcting it if its small */
-	if (likely((tick_error >= 0) && (tick_error <= interval)))
-		return;
-
-	/* preserve the direction of correction */
-	negative = (tick_error < 0);
+	u32 mult;
 
-	/* If any adjustment would pass the max, just return */
-	if (negative && (cur_adj - 1) <= (base - max))
-		return;
-	if (!negative && (cur_adj + 1) >= (base + max))
-		return;
 	/*
-	 * Sort out the magnitude of the correction, but
-	 * avoid making so large a correction that we go
-	 * over the max adjustment.
+	 * Determine the multiplier from the current NTP tick length.
+	 * Avoid expensive division when the tick length doesn't change.
 	 */
-	adj_scale = 0;
-	tick_error = abs(tick_error);
-	while (tick_error > interval) {
-		u32 adj = 1 << (adj_scale + 1);
-
-		/* Check if adjustment gets us within 1 unit from the max */
-		if (negative && (cur_adj - adj) <= (base - max))
-			break;
-		if (!negative && (cur_adj + adj) >= (base + max))
-			break;
-
-		adj_scale++;
-		tick_error >>= 1;
+	if (likely(tk->ntp_tick == ntp_tick_length())) {
+		mult = tk->tkr_mono.mult - tk->ntp_err_mult;
+	} else {
+		tk->ntp_tick = ntp_tick_length();
+		mult = div64_u64((tk->ntp_tick >> tk->ntp_error_shift) -
+				 tk->xtime_remainder, tk->cycle_interval);
 	}
 
-	/* scale the corrections */
-	timekeeping_apply_adjustment(tk, offset, negative, adj_scale);
-}
+	/*
+	 * If the clock is behind the NTP time, increase the multiplier by 1
+	 * to catch up with it. If it's ahead and there was a remainder in the
+	 * tick division, the clock will slow down. Otherwise it will stay
+	 * ahead until the tick length changes to a non-divisible value.
+	 */
+	tk->ntp_err_mult = tk->ntp_error > 0 ? 1 : 0;
+	mult += tk->ntp_err_mult;
 
-/*
- * Adjust the timekeeper's multiplier to the correct frequency
- * and also to reduce the accumulated error value.
- */
-static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
-{
-	/* Correct for the current frequency error */
-	timekeeping_freqadjust(tk, offset);
-
-	/* Next make a small adjustment to fix any cumulative error */
-	if (!tk->ntp_err_mult && (tk->ntp_error > 0)) {
-		tk->ntp_err_mult = 1;
-		timekeeping_apply_adjustment(tk, offset, 0, 0);
-	} else if (tk->ntp_err_mult && (tk->ntp_error <= 0)) {
-		/* Undo any existing error adjustment */
-		timekeeping_apply_adjustment(tk, offset, 1, 0);
-		tk->ntp_err_mult = 0;
-	}
+	timekeeping_apply_adjustment(tk, offset, mult - tk->tkr_mono.mult);
 
 	if (unlikely(tk->tkr_mono.clock->maxadj &&
 		(abs(tk->tkr_mono.mult - tk->tkr_mono.clock->mult)
@@ -1868,18 +1818,14 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
 	 * in the code above, its possible the required corrective factor to
 	 * xtime_nsec could cause it to underflow.
 	 *
-	 * Now, since we already accumulated the second, cannot simply roll
-	 * the accumulated second back, since the NTP subsystem has been
-	 * notified via second_overflow. So instead we push xtime_nsec forward
-	 * by the amount we underflowed, and add that amount into the error.
-	 *
-	 * We'll correct this error next time through this function, when
-	 * xtime_nsec is not as small.
+	 * Now, since we already accumulated the second and the NTP subsystem has
+	 * been notified via second_overflow(), we need to skip the next update.
 	 */
 	if (unlikely((s64)tk->tkr_mono.xtime_nsec < 0)) {
-		s64 neg = -(s64)tk->tkr_mono.xtime_nsec;
-		tk->tkr_mono.xtime_nsec = 0;
-		tk->ntp_error += neg << tk->ntp_error_shift;
+		tk->tkr_mono.xtime_nsec += (u64)NSEC_PER_SEC <<
+							tk->tkr_mono.shift;
+		tk->xtime_sec--;
+		tk->skip_second_overflow = 1;
 	}
 }
 
@@ -1902,6 +1848,15 @@ static inline unsigned int accumulate_nsecs_to_secs(struct timekeeper *tk)
 		tk->tkr_mono.xtime_nsec -= nsecps;
 		tk->xtime_sec++;
 
+		/*
+		 * Skip NTP update if this second was accumulated before,
+		 * i.e. xtime_nsec underflowed in timekeeping_adjust()
+		 */
+		if (unlikely(tk->skip_second_overflow)) {
+			tk->skip_second_overflow = 0;
+			continue;
+		}
+
 		/* Figure out if its a leap sec and apply if needed */
 		leap = second_overflow(tk->xtime_sec);
 		if (unlikely(leap)) {
@@ -2020,7 +1975,7 @@ void update_wall_time(void)
 			shift--;
 	}
 
-	/* correct the clock when NTP error is too big */
+	/* Adjust the multiplier to correct NTP error */
 	timekeeping_adjust(tk, offset);
 
 	/*
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-17 16:13 [PATCH RFC 0/3] Improve stability of system clock Miroslav Lichvar
                   ` (2 preceding siblings ...)
  2017-05-17 16:13 ` [PATCH RFC 3/3] timekeeping: Determine multiplier directly from NTP tick length Miroslav Lichvar
@ 2017-05-17 16:30 ` John Stultz
  2017-05-17 16:57   ` Miroslav Lichvar
  3 siblings, 1 reply; 14+ messages in thread
From: John Stultz @ 2017-05-17 16:30 UTC (permalink / raw)
  To: Miroslav Lichvar; +Cc: lkml, Prarit Bhargava, Richard Cochran

On Wed, May 17, 2017 at 9:13 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> This is an attempt to improve stability and accuracy of the system clock
> with very accurate time sources like the new PTP KVM clock or NTP/PTP
> using hardware timestamping. It affects mainly kernels running with
> NOHZ. It requires updating of the old ia64 and powerpc vsyscalls.
>
> The main problem is that the error accumulated in the ntp_error register
> takes too long to correct and this cannot be easily fixed. There are
> four sources of the error:
> - rounding of time for old vsyscalls
> - alignment of frequency adjustments to ticks
> - iterative correction of the multiplier
> - limited resolution of the multipler
>
> Instead of trying to correct the error faster, the patches remove the
> first three sources. With the only remaining source the correction logic
> can be simplified and the frequency of the clock is much more stable and
> accurate.
>
> Simulations of a frequency step in linux-tktest (values are in ppm and
> nanoseconds):

So thanks for sending these out. I still need to look them over in
depth, but can I make another ask here?  :)

Could you submit your linux-tktest infrastructure to the kselftests dir?

It would be really nice for folks to be able to reproduce your results
be able to do similar testing for regressions.

thanks
-john

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-17 16:30 ` [PATCH RFC 0/3] Improve stability of system clock John Stultz
@ 2017-05-17 16:57   ` Miroslav Lichvar
  2017-05-17 17:02     ` John Stultz
  0 siblings, 1 reply; 14+ messages in thread
From: Miroslav Lichvar @ 2017-05-17 16:57 UTC (permalink / raw)
  To: John Stultz; +Cc: lkml, Prarit Bhargava, Richard Cochran

On Wed, May 17, 2017 at 09:30:31AM -0700, John Stultz wrote:
> So thanks for sending these out. I still need to look them over in
> depth, but can I make another ask here?  :)
> 
> Could you submit your linux-tktest infrastructure to the kselftests dir?

I can, but it's a mess that breaks frequently as the timekeeping and
other kernel code changes. Are you sure you want that in the kernel
tree? :)

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-17 16:57   ` Miroslav Lichvar
@ 2017-05-17 17:02     ` John Stultz
  2017-05-17 17:22       ` Miroslav Lichvar
  0 siblings, 1 reply; 14+ messages in thread
From: John Stultz @ 2017-05-17 17:02 UTC (permalink / raw)
  To: Miroslav Lichvar; +Cc: lkml, Prarit Bhargava, Richard Cochran

On Wed, May 17, 2017 at 9:57 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Wed, May 17, 2017 at 09:30:31AM -0700, John Stultz wrote:
>> So thanks for sending these out. I still need to look them over in
>> depth, but can I make another ask here?  :)
>>
>> Could you submit your linux-tktest infrastructure to the kselftests dir?
>
> I can, but it's a mess that breaks frequently as the timekeeping and
> other kernel code changes. Are you sure you want that in the kernel
> tree? :)

Being a mess is a slight concern, but as for breaking, if its
in-kernel, then folks can't make changes that break it, right?

thanks
-john

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-17 17:02     ` John Stultz
@ 2017-05-17 17:22       ` Miroslav Lichvar
  2017-05-17 23:06         ` John Stultz
  0 siblings, 1 reply; 14+ messages in thread
From: Miroslav Lichvar @ 2017-05-17 17:22 UTC (permalink / raw)
  To: John Stultz; +Cc: lkml, Prarit Bhargava, Richard Cochran

On Wed, May 17, 2017 at 10:02:00AM -0700, John Stultz wrote:
> On Wed, May 17, 2017 at 9:57 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > On Wed, May 17, 2017 at 09:30:31AM -0700, John Stultz wrote:
> >> Could you submit your linux-tktest infrastructure to the kselftests dir?
> >
> > I can, but it's a mess that breaks frequently as the timekeeping and
> > other kernel code changes. Are you sure you want that in the kernel
> > tree? :)
> 
> Being a mess is a slight concern, but as for breaking, if its
> in-kernel, then folks can't make changes that break it, right?

It duplicates/stubs quite a few kernel functions that are needed to
compile and link the timekeeping.c file into an executable. See
linux-tktest/missing.c. If their signature changes, or new functions
are needed, it will break.

Is there a better way to run the timekeeping code in an userspace
application? I suspect it would need something like the Linux Kernel
Library project.

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-17 17:22       ` Miroslav Lichvar
@ 2017-05-17 23:06         ` John Stultz
  2017-05-18  4:54           ` Richard Cochran
  0 siblings, 1 reply; 14+ messages in thread
From: John Stultz @ 2017-05-17 23:06 UTC (permalink / raw)
  To: Miroslav Lichvar; +Cc: lkml, Prarit Bhargava, Richard Cochran

On Wed, May 17, 2017 at 10:22 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Wed, May 17, 2017 at 10:02:00AM -0700, John Stultz wrote:
>> On Wed, May 17, 2017 at 9:57 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>> > On Wed, May 17, 2017 at 09:30:31AM -0700, John Stultz wrote:
>> >> Could you submit your linux-tktest infrastructure to the kselftests dir?
>> >
>> > I can, but it's a mess that breaks frequently as the timekeeping and
>> > other kernel code changes. Are you sure you want that in the kernel
>> > tree? :)
>>
>> Being a mess is a slight concern, but as for breaking, if its
>> in-kernel, then folks can't make changes that break it, right?
>
> It duplicates/stubs quite a few kernel functions that are needed to
> compile and link the timekeeping.c file into an executable. See
> linux-tktest/missing.c. If their signature changes, or new functions
> are needed, it will break.

Hopefully we can fix it in the kernel as well then?

Or maybe it will help inform how we could refactor the time code to
better support the simulation code?

> Is there a better way to run the timekeeping code in an userspace
> application? I suspect it would need something like the Linux Kernel
> Library project.

I dunno. There's probably a cleaner way to go about it, but I also
feel like the benefit of just having the test in the kernel tree is
that it can be managed as a unified whole, rather then the test being
a separate thing and always playing catchup to kernel changes.

thanks
-john

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-17 23:06         ` John Stultz
@ 2017-05-18  4:54           ` Richard Cochran
  2017-05-20  0:35             ` John Stultz
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Cochran @ 2017-05-18  4:54 UTC (permalink / raw)
  To: John Stultz; +Cc: Miroslav Lichvar, lkml, Prarit Bhargava

On Wed, May 17, 2017 at 04:06:07PM -0700, John Stultz wrote:
> On Wed, May 17, 2017 at 10:22 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > Is there a better way to run the timekeeping code in an userspace
> > application? I suspect it would need something like the Linux Kernel
> > Library project.
> 
> I dunno. There's probably a cleaner way to go about it, but I also
> feel like the benefit of just having the test in the kernel tree is
> that it can be managed as a unified whole, rather then the test being
> a separate thing and always playing catchup to kernel changes.

I vaguely recall a rant on the list years ago from a Linux bigwhig
saying how we don't support that kind of thing.  But maybe it is my
imagination.  In any case, IMHO running user space tests for chunks of
kernel code can be quite useful.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-18  4:54           ` Richard Cochran
@ 2017-05-20  0:35             ` John Stultz
  2017-05-21  3:19               ` Rusty Russell
  2017-06-08 16:17               ` Miroslav Lichvar
  0 siblings, 2 replies; 14+ messages in thread
From: John Stultz @ 2017-05-20  0:35 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Miroslav Lichvar, lkml, Prarit Bhargava, Rusty Russell

On Wed, May 17, 2017 at 9:54 PM, Richard Cochran
<richardcochran@gmail.com> wrote:
> On Wed, May 17, 2017 at 04:06:07PM -0700, John Stultz wrote:
>> On Wed, May 17, 2017 at 10:22 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>> > Is there a better way to run the timekeeping code in an userspace
>> > application? I suspect it would need something like the Linux Kernel
>> > Library project.
>>
>> I dunno. There's probably a cleaner way to go about it, but I also
>> feel like the benefit of just having the test in the kernel tree is
>> that it can be managed as a unified whole, rather then the test being
>> a separate thing and always playing catchup to kernel changes.
>
> I vaguely recall a rant on the list years ago from a Linux bigwhig
> saying how we don't support that kind of thing.  But maybe it is my
> imagination.  In any case, IMHO running user space tests for chunks of
> kernel code can be quite useful.

So a few years ago I mentioned this at a testing session at I think
Linux Plubmers' and Rusty (CC'ed) commented that he had some netfilter
(or iptables?) simulator code that never made it upstream. However,
now that kselftests are integrated with the kernel this could change.
At least that's my memory of the discussion.

Anyway, I still think its worth trying to submit. Worse case its a
huge pain and we pull it back out?

thanks
-john

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-20  0:35             ` John Stultz
@ 2017-05-21  3:19               ` Rusty Russell
  2017-06-08 16:17               ` Miroslav Lichvar
  1 sibling, 0 replies; 14+ messages in thread
From: Rusty Russell @ 2017-05-21  3:19 UTC (permalink / raw)
  To: John Stultz, Richard Cochran; +Cc: Miroslav Lichvar, lkml, Prarit Bhargava

John Stultz <john.stultz@linaro.org> writes:
> On Wed, May 17, 2017 at 9:54 PM, Richard Cochran
> <richardcochran@gmail.com> wrote:
>> On Wed, May 17, 2017 at 04:06:07PM -0700, John Stultz wrote:
>>> On Wed, May 17, 2017 at 10:22 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>>> > Is there a better way to run the timekeeping code in an userspace
>>> > application? I suspect it would need something like the Linux Kernel
>>> > Library project.
>>>
>>> I dunno. There's probably a cleaner way to go about it, but I also
>>> feel like the benefit of just having the test in the kernel tree is
>>> that it can be managed as a unified whole, rather then the test being
>>> a separate thing and always playing catchup to kernel changes.
>>
>> I vaguely recall a rant on the list years ago from a Linux bigwhig
>> saying how we don't support that kind of thing.  But maybe it is my
>> imagination.  In any case, IMHO running user space tests for chunks of
>> kernel code can be quite useful.
>
> So a few years ago I mentioned this at a testing session at I think
> Linux Plubmers' and Rusty (CC'ed) commented that he had some netfilter
> (or iptables?) simulator code that never made it upstream. However,
> now that kselftests are integrated with the kernel this could change.
> At least that's my memory of the discussion.

Yep, we did it with nfsim, but forward porting was a PITA.  Good luck!

Rusty.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-05-20  0:35             ` John Stultz
  2017-05-21  3:19               ` Rusty Russell
@ 2017-06-08 16:17               ` Miroslav Lichvar
  2017-06-08 18:36                 ` John Stultz
  1 sibling, 1 reply; 14+ messages in thread
From: Miroslav Lichvar @ 2017-06-08 16:17 UTC (permalink / raw)
  To: John Stultz; +Cc: Richard Cochran, lkml, Prarit Bhargava, Rusty Russell

On Fri, May 19, 2017 at 05:35:38PM -0700, John Stultz wrote:
> >> On Wed, May 17, 2017 at 10:22 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> >> > Is there a better way to run the timekeeping code in an userspace
> >> > application? I suspect it would need something like the Linux Kernel
> >> > Library project.

> So a few years ago I mentioned this at a testing session at I think
> Linux Plubmers' and Rusty (CC'ed) commented that he had some netfilter
> (or iptables?) simulator code that never made it upstream. However,
> now that kselftests are integrated with the kernel this could change.
> At least that's my memory of the discussion.
> 
> Anyway, I still think its worth trying to submit. Worse case its a
> huge pain and we pull it back out?

I've tried something different. I've reimplemented the simulated test
as an ordinary user-space application using the CLOCK_MONOTONIC and
CLOCK_MONOTONIC_RAW clocks. It's not deterministic, it doesn't give
results instantly, and it's not as precise as the original test, but
it can clearly show the difference that this patchset makes.

Before:
Clock precision: 46 ns
CLOCK_MONOTONIC_RAW frequency offset: 0.00 ppm
   base   step       freq    dev     max       freq    dev     max
  15.24  40960      -0.04    273    1852      +0.01      2       3
 228.26  40960      +6.07  33089  225914      -0.09      2       4
  55.42  40960     -11.90  46413  232834      +0.01      1       3
 237.65  40960      -4.75  25574  173479      +0.05      1       3
 240.63  40960      -0.08    846    5758      +0.05      2       3
 205.52    640      -0.11    781    5231      -0.01      2       4
 246.81    640      +0.21    809    5486      +0.08      1       3
 164.04    640      +0.16    282    1920      +0.11      2       3
 171.32    640      -0.08    408    2756      -0.01      2       3
 243.04    640      -0.03    349    2377      +0.03      2       3
 179.91     10      +0.07     28      62      -0.00      2       6
  45.44     10      +0.10     18     119      +0.00      6      29
 204.30     10      -0.00     21     122      -0.00      4       9
  76.18     10      +0.03     39      85      -0.00      3       5
 158.18     10      -0.02     26      94      -0.00      4       9

After:
Clock precision: 46 ns
CLOCK_MONOTONIC_RAW frequency offset: -0.00 ppm
   base   step       freq    dev     max       freq    dev     max
  93.98  40960      +0.00      3       7      +0.00      4       8
 117.95  40960      +0.00      3       9      -0.00      3       8
 230.44  40960      -0.00      4       9      -0.00      2       5
 240.56  40960      +0.00      3       6      -0.00      3       7
 228.39  40960      +0.00      3       7      -0.00      3       9
 237.85    640      -0.00      4      10      +0.00      3       8
 250.74    640      +0.00      4      11      -0.00      3       7
 249.06    640      +0.00      4       9      -0.00      3       9
 114.98    640      -0.00      3       8      +0.00      3       8
 120.59    640      +0.00      3       7      +0.00      3       7
 190.66     10      -0.00      3       7      +0.00      3       7
 228.83     10      +0.00      3       7      -0.00      3       6
  18.91     10      +0.00      3       8      -0.00      3       8
  12.39     10      +0.00      3       8      +0.00      4       8
  12.01     10      +0.00      4       9      -0.00      4       9

Each line has statistics from 100 samples collected in 0.1 second
interval.

The frequency error in the second "freq" column with values up to 0.11
ppm shows the problem with the clock very slowly correcting a large
NTP error.

I can add some limits for the measured errors and submit it as new a
kselftest. If the measured precision is too large (e.g. >100ns), the
test can return "skip" in order to avoid false negatives.

If adjtimex() had an option to return the NTP error directly, or it
was possible to read the two clocks at the same time, the test could
be much more sensitive and observe shorter intervals (spanning fewer
clock updates).

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RFC 0/3] Improve stability of system clock
  2017-06-08 16:17               ` Miroslav Lichvar
@ 2017-06-08 18:36                 ` John Stultz
  0 siblings, 0 replies; 14+ messages in thread
From: John Stultz @ 2017-06-08 18:36 UTC (permalink / raw)
  To: Miroslav Lichvar; +Cc: Richard Cochran, lkml, Prarit Bhargava, Rusty Russell

On Thu, Jun 8, 2017 at 9:17 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Fri, May 19, 2017 at 05:35:38PM -0700, John Stultz wrote:
>> >> On Wed, May 17, 2017 at 10:22 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>> >> > Is there a better way to run the timekeeping code in an userspace
>> >> > application? I suspect it would need something like the Linux Kernel
>> >> > Library project.
>
>> So a few years ago I mentioned this at a testing session at I think
>> Linux Plubmers' and Rusty (CC'ed) commented that he had some netfilter
>> (or iptables?) simulator code that never made it upstream. However,
>> now that kselftests are integrated with the kernel this could change.
>> At least that's my memory of the discussion.
>>
>> Anyway, I still think its worth trying to submit. Worse case its a
>> huge pain and we pull it back out?
>
> I've tried something different. I've reimplemented the simulated test
> as an ordinary user-space application using the CLOCK_MONOTONIC and
> CLOCK_MONOTONIC_RAW clocks. It's not deterministic, it doesn't give
> results instantly, and it's not as precise as the original test, but
> it can clearly show the difference that this patchset makes.
>
> Before:
> Clock precision: 46 ns
> CLOCK_MONOTONIC_RAW frequency offset: 0.00 ppm
>    base   step       freq    dev     max       freq    dev     max
>   15.24  40960      -0.04    273    1852      +0.01      2       3
>  228.26  40960      +6.07  33089  225914      -0.09      2       4
>   55.42  40960     -11.90  46413  232834      +0.01      1       3
>  237.65  40960      -4.75  25574  173479      +0.05      1       3
>  240.63  40960      -0.08    846    5758      +0.05      2       3
>  205.52    640      -0.11    781    5231      -0.01      2       4
>  246.81    640      +0.21    809    5486      +0.08      1       3
>  164.04    640      +0.16    282    1920      +0.11      2       3
>  171.32    640      -0.08    408    2756      -0.01      2       3
>  243.04    640      -0.03    349    2377      +0.03      2       3
>  179.91     10      +0.07     28      62      -0.00      2       6
>   45.44     10      +0.10     18     119      +0.00      6      29
>  204.30     10      -0.00     21     122      -0.00      4       9
>   76.18     10      +0.03     39      85      -0.00      3       5
>  158.18     10      -0.02     26      94      -0.00      4       9
>
> After:
> Clock precision: 46 ns
> CLOCK_MONOTONIC_RAW frequency offset: -0.00 ppm
>    base   step       freq    dev     max       freq    dev     max
>   93.98  40960      +0.00      3       7      +0.00      4       8
>  117.95  40960      +0.00      3       9      -0.00      3       8
>  230.44  40960      -0.00      4       9      -0.00      2       5
>  240.56  40960      +0.00      3       6      -0.00      3       7
>  228.39  40960      +0.00      3       7      -0.00      3       9
>  237.85    640      -0.00      4      10      +0.00      3       8
>  250.74    640      +0.00      4      11      -0.00      3       7
>  249.06    640      +0.00      4       9      -0.00      3       9
>  114.98    640      -0.00      3       8      +0.00      3       8
>  120.59    640      +0.00      3       7      +0.00      3       7
>  190.66     10      -0.00      3       7      +0.00      3       7
>  228.83     10      +0.00      3       7      -0.00      3       6
>   18.91     10      +0.00      3       8      -0.00      3       8
>   12.39     10      +0.00      3       8      +0.00      4       8
>   12.01     10      +0.00      4       9      -0.00      4       9
>
> Each line has statistics from 100 samples collected in 0.1 second
> interval.
>
> The frequency error in the second "freq" column with values up to 0.11
> ppm shows the problem with the clock very slowly correcting a large
> NTP error.

Might rename the headers for the second column set for clarity?

>
> I can add some limits for the measured errors and submit it as new a
> kselftest. If the measured precision is too large (e.g. >100ns), the
> test can return "skip" in order to avoid false negatives.

That all sounds great!

(Though I still do really like your simulator!  Being able to have
deterministic test results from known inputs is a big plus.)

thanks
-john

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-06-08 18:36 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-17 16:13 [PATCH RFC 0/3] Improve stability of system clock Miroslav Lichvar
2017-05-17 16:13 ` [PATCH RFC 1/3] timekeeping: Remove support for old vsyscalls Miroslav Lichvar
2017-05-17 16:13 ` [PATCH RFC 2/3] timekeeping: Don't align frequency adjustments to ticks Miroslav Lichvar
2017-05-17 16:13 ` [PATCH RFC 3/3] timekeeping: Determine multiplier directly from NTP tick length Miroslav Lichvar
2017-05-17 16:30 ` [PATCH RFC 0/3] Improve stability of system clock John Stultz
2017-05-17 16:57   ` Miroslav Lichvar
2017-05-17 17:02     ` John Stultz
2017-05-17 17:22       ` Miroslav Lichvar
2017-05-17 23:06         ` John Stultz
2017-05-18  4:54           ` Richard Cochran
2017-05-20  0:35             ` John Stultz
2017-05-21  3:19               ` Rusty Russell
2017-06-08 16:17               ` Miroslav Lichvar
2017-06-08 18:36                 ` John Stultz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.