* [PATCH v2 0/2] Allwinner A64 timer workaround @ 2018-07-12 2:25 Samuel Holland 2018-07-12 2:25 ` [PATCH v2 1/2] arm64: arch_timer: Workaround for Allwinner A64 timer instability Samuel Holland 2018-07-12 2:25 ` [PATCH v2 2/2] arm64: dts: allwinner: a64: Enable A64 timer workaround Samuel Holland 0 siblings, 2 replies; 4+ messages in thread From: Samuel Holland @ 2018-07-12 2:25 UTC (permalink / raw) To: Maxime Ripard, Chen-Yu Tsai, Catalin Marinas, Will Deacon, Daniel Lezcano, Thomas Gleixner, Marc Zyngier, Mark Rutland Cc: linux-arm-kernel, linux-kernel, linux-sunxi, Samuel Holland Hello, Several people (including me) have experienced extremely large system clock jumps on their A64-based devices, apparently due to the architectural timer going backward, which is interpreted by Linux as the timer wrapping around after 2^56 cycles. Investigation led to discovery of some obvious problems with this SoC's architectural timer, and this patch series introduces at least a partial workaround. It covers up small-scale instability in the timer, which reduces the incidence of the large system clock jumps. More details are in the commit message for patch 1. Patch 2 simply enables the workaround in the device tree. I chose not to send this to stable, since it is most likely not a complete solution. Thanks, Samuel changes since v1: - Add an iteration limit like most other arch timer workarounds - Added Andre's Tested-by Samuel Holland (2): arm64: arch_timer: Workaround for Allwinner A64 timer instability arm64: dts: allwinner: a64: Enable A64 timer workaround arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 1 + drivers/clocksource/Kconfig | 11 +++++++ drivers/clocksource/arm_arch_timer.c | 43 +++++++++++++++++++++++++++ 3 files changed, 55 insertions(+) -- 2.16.1 ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v2 1/2] arm64: arch_timer: Workaround for Allwinner A64 timer instability 2018-07-12 2:25 [PATCH v2 0/2] Allwinner A64 timer workaround Samuel Holland @ 2018-07-12 2:25 ` Samuel Holland 2018-07-12 10:59 ` Marc Zyngier 2018-07-12 2:25 ` [PATCH v2 2/2] arm64: dts: allwinner: a64: Enable A64 timer workaround Samuel Holland 1 sibling, 1 reply; 4+ messages in thread From: Samuel Holland @ 2018-07-12 2:25 UTC (permalink / raw) To: Maxime Ripard, Chen-Yu Tsai, Catalin Marinas, Will Deacon, Daniel Lezcano, Thomas Gleixner, Marc Zyngier, Mark Rutland Cc: linux-arm-kernel, linux-kernel, linux-sunxi, Samuel Holland The Allwinner A64 SoC is known [1] to have an unstable architectural timer, which manifests itself most obviously in the time jumping forward a multiple of 95 years [2][3]. This coincides with 2^56 cycles at a timer frequency of 24 MHz, implying that the time went slightly backward (and this was interpreted by the kernel as it jumping forward and wrapping around past the epoch). Further investigation revealed instability in the low bits of CNTVCT at the point a high bit rolls over. This leads to power-of-two cycle forward and backward jumps. (Testing shows that forward jumps are about twice as likely as backward jumps.) Without trapping reads to CNTVCT, a userspace program is able to read it in a loop faster than it changes. A test program running on all 4 CPU cores that reported jumps larger than 100 ms was run for 13.6 hours and reported the following: Count | Event -------+--------------------------- 9940 | jumped backward 699ms 268 | jumped backward 1398ms 1 | jumped backward 2097ms 16020 | jumped forward 175ms 6443 | jumped forward 699ms 2976 | jumped forward 1398ms 9 | jumped forward 356516ms 9 | jumped forward 357215ms 4 | jumped forward 714430ms 1 | jumped forward 3578440ms This works out to a jump larger than 100 ms about every 5.5 seconds on each CPU core. The largest jump (almost an hour!) was the following sequence of reads: 0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000 Note that the middle bits don't necessarily all read as all zeroes or all ones during the anomalous behavior; however the low 11 bits checked by the function in this patch have never been observed with any other value. Also note that smaller jumps are much more common, with the smallest backward jumps of 2048 (2^11) cycles observed over 400 times per second on each core. (Of course, this is partially due to lower bits rolling over more frequently.) Any one of these could have caused the 95 year time skip. Similar anomalies were observed while reading CNTPCT (after patching the kernel to allow reads from userspace). However, the CNTPCT jumps are much less frequent, and only small jumps were observed. The same program as before (except now reading CNTPCT) observed after 72 hours: Count | Event -------+--------------------------- 17 | jumped backward 699ms 52 | jumped forward 175ms 2831 | jumped forward 699ms 5 | jumped forward 1398ms ======================================================================== Because the CPU can read the CNTPCT/CNTVCT registers faster than they change, performing two reads of the register and comparing the high bits (like other workarounds) is not a workable solution. And because the timer can jump both forward and backward, no pair of reads can distinguish a good value from a bad one. The only way to guarantee a good value from consecutive reads would be to read _three_ times, and take the middle value iff the three values are 1) individually unique and 2) increasing. This takes at minimum 3 cycles (125 ns), or more if an anomaly is detected. However, since there is a distinct pattern to the bad values, we can optimize the common case (2046/2048 of the time) to a single read by simply ignoring values that match the pattern. This still takes no more than 3 cycles in the worst case, and requires much less code. As an additional safety check, limit the loop iteration based on the number of maximum-frequency CPU cycles in three 24 MHz counter periods. [1]: https://github.com/armbian/build/commit/a08cd6fe7ae9 [2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/ [3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26 Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Samuel Holland <samuel@sholland.org> --- drivers/clocksource/Kconfig | 11 +++++++++ drivers/clocksource/arm_arch_timer.c | 43 ++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig index 8e8a09755d10..7a5d434dd30b 100644 --- a/drivers/clocksource/Kconfig +++ b/drivers/clocksource/Kconfig @@ -364,6 +364,17 @@ config ARM64_ERRATUM_858921 The workaround will be dynamically enabled when an affected core is detected. +config SUN50I_A64_UNSTABLE_TIMER + bool "Workaround for Allwinner A64 timer instability" + default y + depends on ARM_ARCH_TIMER && ARM64 && ARCH_SUNXI + select ARM_ARCH_TIMER_OOL_WORKAROUND + help + This option enables a workaround for instability in the timer on + the Allwinner A64 SoC. The workaround will only be active if the + allwinner,sun50i-a64-unstable-timer property is found in the + timer node. + config ARM_GLOBAL_TIMER bool "Support for the ARM global timer" if COMPILE_TEST select TIMER_OF if OF diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index 57cb2f00fc07..4fba50716bda 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -319,6 +319,40 @@ static u64 notrace arm64_858921_read_cntvct_el0(void) } #endif +#ifdef CONFIG_SUN50I_A64_UNSTABLE_TIMER +/* + * The low bits of each register can transiently read as all ones or all zeroes + * when bit 11 or greater rolls over. Since the value can jump both backward + * (7ff -> 000 -> 800) and forward (7ff -> fff -> 800), it is simplest to just + * ignore register values with all ones or zeros in the low bits. + * + * Bound the loop by the worst-case number of CPU cycles that can occur during + * three distinct counter periods. + */ +#define __sun50i_a64_read_reg(reg) ({ \ + u64 _val; \ + int _retries = 150; \ + \ + do { \ + _val = read_sysreg(reg); \ + _retries--; \ + } while (((_val + 1) & GENMASK(10, 0)) <= 1 && _retries); \ + \ + WARN_ON_ONCE(!_retries); \ + _val; \ +}) + +static u64 notrace sun50i_a64_read_cntpct_el0(void) +{ + return __sun50i_a64_read_reg(cntpct_el0); +} + +static u64 notrace sun50i_a64_read_cntvct_el0(void) +{ + return __sun50i_a64_read_reg(cntvct_el0); +} +#endif + #ifdef CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND DEFINE_PER_CPU(const struct arch_timer_erratum_workaround *, timer_unstable_counter_workaround); EXPORT_SYMBOL_GPL(timer_unstable_counter_workaround); @@ -408,6 +442,15 @@ static const struct arch_timer_erratum_workaround ool_workarounds[] = { .read_cntvct_el0 = arm64_858921_read_cntvct_el0, }, #endif +#ifdef CONFIG_SUN50I_A64_UNSTABLE_TIMER + { + .match_type = ate_match_dt, + .id = "allwinner,sun50i-a64-unstable-timer", + .desc = "Allwinner A64 timer instability", + .read_cntpct_el0 = sun50i_a64_read_cntpct_el0, + .read_cntvct_el0 = sun50i_a64_read_cntvct_el0, + }, +#endif }; typedef bool (*ate_match_fn_t)(const struct arch_timer_erratum_workaround *, -- 2.16.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v2 1/2] arm64: arch_timer: Workaround for Allwinner A64 timer instability 2018-07-12 2:25 ` [PATCH v2 1/2] arm64: arch_timer: Workaround for Allwinner A64 timer instability Samuel Holland @ 2018-07-12 10:59 ` Marc Zyngier 0 siblings, 0 replies; 4+ messages in thread From: Marc Zyngier @ 2018-07-12 10:59 UTC (permalink / raw) To: Samuel Holland, Maxime Ripard, Chen-Yu Tsai, Catalin Marinas, Will Deacon, Daniel Lezcano, Thomas Gleixner, Mark Rutland Cc: linux-arm-kernel, linux-kernel, linux-sunxi On 12/07/18 03:25, Samuel Holland wrote: > The Allwinner A64 SoC is known [1] to have an unstable architectural > timer, which manifests itself most obviously in the time jumping forward > a multiple of 95 years [2][3]. This coincides with 2^56 cycles at a > timer frequency of 24 MHz, implying that the time went slightly backward > (and this was interpreted by the kernel as it jumping forward and > wrapping around past the epoch). > > Further investigation revealed instability in the low bits of CNTVCT at > the point a high bit rolls over. This leads to power-of-two cycle > forward and backward jumps. (Testing shows that forward jumps are about > twice as likely as backward jumps.) > > Without trapping reads to CNTVCT, a userspace program is able to read it > in a loop faster than it changes. A test program running on all 4 CPU > cores that reported jumps larger than 100 ms was run for 13.6 hours and > reported the following: > > Count | Event > -------+--------------------------- > 9940 | jumped backward 699ms > 268 | jumped backward 1398ms > 1 | jumped backward 2097ms > 16020 | jumped forward 175ms > 6443 | jumped forward 699ms > 2976 | jumped forward 1398ms > 9 | jumped forward 356516ms > 9 | jumped forward 357215ms > 4 | jumped forward 714430ms > 1 | jumped forward 3578440ms > > This works out to a jump larger than 100 ms about every 5.5 seconds on > each CPU core. > > The largest jump (almost an hour!) was the following sequence of reads: > 0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000 > > Note that the middle bits don't necessarily all read as all zeroes or > all ones during the anomalous behavior; however the low 11 bits checked > by the function in this patch have never been observed with any other > value. > > Also note that smaller jumps are much more common, with the smallest > backward jumps of 2048 (2^11) cycles observed over 400 times per second > on each core. (Of course, this is partially due to lower bits rolling > over more frequently.) Any one of these could have caused the 95 year > time skip. > > Similar anomalies were observed while reading CNTPCT (after patching the > kernel to allow reads from userspace). However, the CNTPCT jumps are > much less frequent, and only small jumps were observed. The same program > as before (except now reading CNTPCT) observed after 72 hours: > > Count | Event > -------+--------------------------- > 17 | jumped backward 699ms > 52 | jumped forward 175ms > 2831 | jumped forward 699ms > 5 | jumped forward 1398ms > > ======================================================================== > > Because the CPU can read the CNTPCT/CNTVCT registers faster than they > change, performing two reads of the register and comparing the high bits > (like other workarounds) is not a workable solution. And because the > timer can jump both forward and backward, no pair of reads can > distinguish a good value from a bad one. The only way to guarantee a > good value from consecutive reads would be to read _three_ times, and > take the middle value iff the three values are 1) individually unique > and 2) increasing. This takes at minimum 3 cycles (125 ns), or more if > an anomaly is detected. > > However, since there is a distinct pattern to the bad values, we can > optimize the common case (2046/2048 of the time) to a single read by > simply ignoring values that match the pattern. This still takes no more > than 3 cycles in the worst case, and requires much less code. > > As an additional safety check, limit the loop iteration based on the > number of maximum-frequency CPU cycles in three 24 MHz counter periods. > > [1]: https://github.com/armbian/build/commit/a08cd6fe7ae9 > [2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/ > [3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26 > > Tested-by: Andre Przywara <andre.przywara@arm.com> > Signed-off-by: Samuel Holland <samuel@sholland.org> > --- > drivers/clocksource/Kconfig | 11 +++++++++ > drivers/clocksource/arm_arch_timer.c | 43 ++++++++++++++++++++++++++++++++++++ > 2 files changed, 54 insertions(+) > > diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig > index 8e8a09755d10..7a5d434dd30b 100644 > --- a/drivers/clocksource/Kconfig > +++ b/drivers/clocksource/Kconfig > @@ -364,6 +364,17 @@ config ARM64_ERRATUM_858921 > The workaround will be dynamically enabled when an affected > core is detected. > > +config SUN50I_A64_UNSTABLE_TIMER > + bool "Workaround for Allwinner A64 timer instability" > + default y > + depends on ARM_ARCH_TIMER && ARM64 && ARCH_SUNXI > + select ARM_ARCH_TIMER_OOL_WORKAROUND > + help > + This option enables a workaround for instability in the timer on > + the Allwinner A64 SoC. The workaround will only be active if the > + allwinner,sun50i-a64-unstable-timer property is found in the > + timer node. > + This still missing some documentation in Documentation/arm64/silicon-errata.txt. Since the silicon vendor has gone AWOL, I suggest the platform maintainers put together an SoC-specific namespace, and maintain it, so that we track what is handled where. Something like "AW_SUN50I_UNKOWN_1". > config ARM_GLOBAL_TIMER > bool "Support for the ARM global timer" if COMPILE_TEST > select TIMER_OF if OF > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c > index 57cb2f00fc07..4fba50716bda 100644 > --- a/drivers/clocksource/arm_arch_timer.c > +++ b/drivers/clocksource/arm_arch_timer.c > @@ -319,6 +319,40 @@ static u64 notrace arm64_858921_read_cntvct_el0(void) > } > #endif > > +#ifdef CONFIG_SUN50I_A64_UNSTABLE_TIMER > +/* > + * The low bits of each register can transiently read as all ones or all zeroes > + * when bit 11 or greater rolls over. Since the value can jump both backward > + * (7ff -> 000 -> 800) and forward (7ff -> fff -> 800), it is simplest to just > + * ignore register values with all ones or zeros in the low bits. > + * > + * Bound the loop by the worst-case number of CPU cycles that can occur during > + * three distinct counter periods. > + */ > +#define __sun50i_a64_read_reg(reg) ({ \ > + u64 _val; \ > + int _retries = 150; \ > + \ > + do { \ > + _val = read_sysreg(reg); \ > + _retries--; \ > + } while (((_val + 1) & GENMASK(10, 0)) <= 1 && _retries); \ > + \ > + WARN_ON_ONCE(!_retries); \ > + _val; \ > +}) > + > +static u64 notrace sun50i_a64_read_cntpct_el0(void) > +{ > + return __sun50i_a64_read_reg(cntpct_el0); > +} > + > +static u64 notrace sun50i_a64_read_cntvct_el0(void) > +{ > + return __sun50i_a64_read_reg(cntvct_el0); > +} > +#endif > + > #ifdef CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND > DEFINE_PER_CPU(const struct arch_timer_erratum_workaround *, timer_unstable_counter_workaround); > EXPORT_SYMBOL_GPL(timer_unstable_counter_workaround); > @@ -408,6 +442,15 @@ static const struct arch_timer_erratum_workaround ool_workarounds[] = { > .read_cntvct_el0 = arm64_858921_read_cntvct_el0, > }, > #endif > +#ifdef CONFIG_SUN50I_A64_UNSTABLE_TIMER > + { > + .match_type = ate_match_dt, > + .id = "allwinner,sun50i-a64-unstable-timer", > + .desc = "Allwinner A64 timer instability", > + .read_cntpct_el0 = sun50i_a64_read_cntpct_el0, > + .read_cntvct_el0 = sun50i_a64_read_cntvct_el0, > + }, > +#endif > }; > > typedef bool (*ate_match_fn_t)(const struct arch_timer_erratum_workaround *, > As we discussed last week, it is likely that this is not enough to address all the problems with this wonderful piece of kit. I strongly suspect that TVAL read/write is affected as well, due to the implicit counter read, so it needs to be emulated in SW as well. Something like: diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index 4fba50716bda..11d9b53d19da 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -351,6 +351,16 @@ static u64 notrace sun50i_a64_read_cntvct_el0(void) { return __sun50i_a64_read_reg(cntvct_el0); } + +static u32 sun50i_a64_read_cntp_tval_el0(void) +{ + return read_sysreg(cntp_cval_el0) - sun50i_a64_read_cntpct_el0(); +} + +static u32 sun50i_a64_read_cntv_tval_el0(void) +{ + return read_sysreg(cntv_cval_el0) - sun50i_a64_read_cntvct_el0(); +} #endif #ifdef CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND @@ -447,8 +457,12 @@ static const struct arch_timer_erratum_workaround ool_workarounds[] = { .match_type = ate_match_dt, .id = "allwinner,sun50i-a64-unstable-timer", .desc = "Allwinner A64 timer instability", + .read_cntp_tval_el0 = sun50i_a64_read_cntp_tval_el0, + .read_cntv_tval_el0 = sun50i_a64_read_cntv_tval_el0, .read_cntpct_el0 = sun50i_a64_read_cntpct_el0, .read_cntvct_el0 = sun50i_a64_read_cntvct_el0, + .set_next_event_phys = erratum_set_next_event_tval_phys, + .set_next_event_virt = erratum_set_next_event_tval_virt, }, #endif }; which is of course completely untested. Let me know if that helps making the system behave. Thanks, M. -- Jazz is not dead. It just smells funny... ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v2 2/2] arm64: dts: allwinner: a64: Enable A64 timer workaround 2018-07-12 2:25 [PATCH v2 0/2] Allwinner A64 timer workaround Samuel Holland 2018-07-12 2:25 ` [PATCH v2 1/2] arm64: arch_timer: Workaround for Allwinner A64 timer instability Samuel Holland @ 2018-07-12 2:25 ` Samuel Holland 1 sibling, 0 replies; 4+ messages in thread From: Samuel Holland @ 2018-07-12 2:25 UTC (permalink / raw) To: Maxime Ripard, Chen-Yu Tsai, Catalin Marinas, Will Deacon, Daniel Lezcano, Thomas Gleixner, Marc Zyngier, Mark Rutland Cc: linux-arm-kernel, linux-kernel, linux-sunxi, Samuel Holland As instability in the architectural timer has been observed on multiple devices using this SoC, inluding the Pine64 and the Orange Pi Win, enable the workaround in the SoC's device tree. Signed-off-by: Samuel Holland <samuel@sholland.org> --- arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi index 1b2ef28c42bd..5202b76e9684 100644 --- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi +++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi @@ -152,6 +152,7 @@ timer { compatible = "arm,armv8-timer"; + allwinner,sun50i-a64-unstable-timer; interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>, <GIC_PPI 14 -- 2.16.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-07-12 10:59 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-07-12 2:25 [PATCH v2 0/2] Allwinner A64 timer workaround Samuel Holland 2018-07-12 2:25 ` [PATCH v2 1/2] arm64: arch_timer: Workaround for Allwinner A64 timer instability Samuel Holland 2018-07-12 10:59 ` Marc Zyngier 2018-07-12 2:25 ` [PATCH v2 2/2] arm64: dts: allwinner: a64: Enable A64 timer workaround Samuel Holland
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).