linux-kernel.vger.kernel.org archive mirror
* [PATCH v14 clocksource 0/5] Do not mark clocks unstable due to delays for v5.14
@ 2021-05-11 23:34 Paul E. McKenney
From: Paul E. McKenney @ 2021-05-11 23:34 UTC
  To: tglx
  Cc: linux-kernel, john.stultz, sboyd, corbet, Mark.Rutland, maz,
	kernel-team, neeraju, ak, feng.tang, zhengjun.xing

Hello!

If there is a sufficient delay between reading the watchdog clock and the
clock under test, the clock under test will be marked unstable through no
fault of its own.  This series checks for this, doing limited retries
to get a good set of clock reads.  If the clock is marked unstable
and is marked as being per-CPU, cross-CPU synchronization is checked.
This series also provides a clocksource-watchdog-test kernel module that
tests this new ability of distinguishing delay-induced clock skew from
true clock skew.

Note that "sufficient delay" can be provided by SMIs, NMIs, and of course
vCPU preemption.
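
For readers who want the retry idea in concrete form, here is a minimal
user-space sketch.  It is not the code from patch 1: the names
read_with_retry(), struct clock_sample, struct sim_time, and the sim_*()
helpers are hypothetical, and the simulated clock exists only so that the
example can run outside the kernel.  The idea is to read the watchdog,
then the clock under test, then the watchdog again, and to discard the
sample if the two watchdog reads are too far apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch, not the kernel's definitions. */
typedef uint64_t (*read_ns_fn)(void *ctx);

struct clock_sample {
	uint64_t wd_ns;   /* watchdog reading taken before the clock read */
	uint64_t cs_ns;   /* reading of the clock under test */
	int retries;      /* extra attempts needed to get a clean sample */
};

/*
 * Read watchdog, clock under test, then watchdog again.  If the two
 * watchdog reads are more than max_delay_ns apart, something (SMI, NMI,
 * vCPU preemption) got in the way, so discard the sample and retry.
 * Returns true on a clean sample, false if every attempt was delayed.
 */
static bool read_with_retry(read_ns_fn wd_read, void *wd_ctx,
			    read_ns_fn cs_read, void *cs_ctx,
			    uint64_t max_delay_ns, int max_retries,
			    struct clock_sample *out)
{
	assert(max_retries >= 0);
	for (int attempt = 0; attempt <= max_retries; attempt++) {
		uint64_t wd_before = wd_read(wd_ctx);
		uint64_t cs_now = cs_read(cs_ctx);
		uint64_t wd_after = wd_read(wd_ctx);

		if (wd_after - wd_before <= max_delay_ns) {
			out->wd_ns = wd_before;
			out->cs_ns = cs_now;
			out->retries = attempt;
			return true;
		}
	}
	return false;
}

/* Simulated time shared by both clocks, for demonstration only. */
struct sim_time {
	uint64_t now_ns;     /* current simulated time */
	int delayed_reads;   /* upcoming clock reads that get hit by a delay */
	uint64_t delay_ns;   /* size of each injected delay */
};

/* Each watchdog read costs 10ns of simulated time. */
static uint64_t sim_wd_read(void *ctx)
{
	struct sim_time *t = ctx;

	t->now_ns += 10;
	return t->now_ns;
}

/* Clock-under-test read, optionally preceded by an injected delay. */
static uint64_t sim_cs_read(void *ctx)
{
	struct sim_time *t = ctx;

	if (t->delayed_reads > 0) {
		t->delayed_reads--;
		t->now_ns += t->delay_ns;
	}
	t->now_ns += 10;
	return t->now_ns;
}
```

With a 50us delay limit and two injected 100us delays, the third attempt
yields a clean sample.  If the delays outlast the retry budget, the caller
gets false back and can skip the comparison rather than blame the clock.
The retry budget itself is an assumption of this sketch.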

The patches in this series are as follows:

1.	Retry clock read if long delays detected.

2.	Check per-CPU clock synchronization when marked unstable.

3.	Limit number of CPUs checked for clock synchronization.

4.	Reduce clocksource-skew threshold for TSC.

5.	Provide kernel module to test clocksource watchdog.

6.	Print deviation in nanoseconds for unstable case, courtesy of
	Feng Tang.

Changes since v13:

o	Forward-port to v5.13-rc1.

o	Add patch 6 from Feng Tang.

Changes since v12, based on feedback from kernel test robot and Stephen
Rothwell:

o	Export clocksource_verify_percpu().

Link: https://lore.kernel.org/lkml/20210501003204.GA2447938@paulmck-ThinkPad-P17-Gen-1/

Changes since v11, based on feedback from Thomas Gleixner:

o	Remove the fault-injection code from clocksource.c.

o	Create a kernel/time/clocksource-wdtest.c kernel module that
	creates its own clocksource structures and injects delays
	as part of their ->read() functions.

o	Make this kernel module splat upon error, for example, when
	a clocksource is not marked unstable but should have been.

o	Apply a couple more "Link:" fields to all patches.

Changes since v10 based on feedback from Thomas Gleixner, Feng Tang,
and Andi Kleen:

o	Add an uncertainty_margin field to the clocksource structure to
	allow skew cutoffs to be tailored to the pair of clocksources
	that the watchdog is comparing.

o	Manually initialize this uncertainty_margin field for
	clocksource_tsc_early and clocksource_jiffies, thus avoiding
	the need for special-case code to allow for the unusually
	large skews inherent to these clocksources.
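
As a rough illustration of a per-pair skew cutoff, the sketch below
compares the intervals measured by two clocksources against a margin
built from both.  Combining the two margins by simple addition is an
assumption of this sketch, and struct sketch_clocksource and
skew_exceeds_margin() are hypothetical names, not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, pared-down clocksource carrying only what the sketch needs. */
struct sketch_clocksource {
	const char *name;
	uint32_t uncertainty_margin_ns; /* how far this clock may be off */
};

/*
 * Return true if the interval measured by the clock under test differs
 * from the interval measured by the watchdog by more than the combined
 * margin of the two clocks (assumed here to be their sum).
 */
static bool skew_exceeds_margin(const struct sketch_clocksource *cs,
				const struct sketch_clocksource *wd,
				int64_t cs_interval_ns,
				int64_t wd_interval_ns)
{
	int64_t skew = cs_interval_ns - wd_interval_ns;
	int64_t margin = (int64_t)cs->uncertainty_margin_ns +
			 (int64_t)wd->uncertainty_margin_ns;

	if (skew < 0)
		skew = -skew;
	return skew > margin;
}
```

A clocksource known to be coarse can then carry a large margin of its
own, and any pair that includes it gets a correspondingly loose cutoff
without special-case code.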

Changes since v9:

o	Forgive tsc_early drift, based on feedback from Feng Tang; Xing,
	Zhengjun; and Thomas Gleixner.

o	Improve CPU selection for clock-synchronization checking.

Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/

Changes since v8, based on Thomas Gleixner feedback:

o	Reduced clock-skew threshold to 200us and delay limit to 50us.

o	Split out a cs_watchdog_read() function.

o	Removed the pointless CLOCK_SOURCE_VERIFY_PERCPU from kvm_clock.

o	Initialized cs_nsec_max and cs_nsec_min to avoid first-time checks.

Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/

Changes since v7, based on Thomas Gleixner feedback:

o	Fix embarrassing git-format-patch operator error.

o	Merge pairwise clock-desynchronization checking into the checking
	of per-CPU clock synchronization when marked unstable.

o	Do selective per-CPU checking rather than blindly checking all
	CPUs.  Provide a clocksource.verify_n_cpus kernel boot parameter
	to control this behavior, with the value -1 choosing the old
	check-all-CPUs behavior.  The default is to randomly check 8 CPUs.

o	Fix the clock-desynchronization checking to avoid a potential
	use-after-free error for dynamically allocated clocksource
	structures.

o	Remove redundant "wdagain_nsec < 0" from clocksource_watchdog()
	clocksource skew checking.

o	Update commit logs and do code-style updates.
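
The selective per-CPU checking described above can be sketched in user
space as follows.  This is not the kernel implementation: it models the
online CPUs as a 64-bit mask, uses rand() for the random choice, and the
names choose_cpus_to_check() and count_bits() are hypothetical.  A
negative verify_n_cpus selects every online CPU, mirroring the -1
setting of clocksource.verify_n_cpus.

```c
#include <stdint.h>
#include <stdlib.h>

/* Count the set bits in a mask (number of CPUs it names). */
static int count_bits(uint64_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Return a mask of CPUs to check, drawn from online_mask.
 * verify_n_cpus < 0: check all online CPUs.
 * Otherwise: check at most verify_n_cpus randomly chosen online CPUs.
 */
static uint64_t choose_cpus_to_check(uint64_t online_mask, int verify_n_cpus)
{
	uint64_t chosen = 0;
	int picked = 0;

	if (verify_n_cpus < 0 || verify_n_cpus >= count_bits(online_mask))
		return online_mask;

	/* Fewer requested than online, so this loop is sure to find enough. */
	while (picked < verify_n_cpus) {
		uint64_t bit = 1ULL << (rand() % 64);

		if ((online_mask & bit) && !(chosen & bit)) {
			chosen |= bit;
			picked++;
		}
	}
	return chosen;
}
```

With 16 CPUs online and the default of 8, each call returns some 8-CPU
subset, so repeated checks cover different CPUs over time while keeping
the cost of any single check bounded.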

Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/

Changes since v5:

o	Rebased to v5.12-rc5.

Changes since v4:

o	Rebased to v5.12-rc1.

Changes since v3:

o	Rebased to v5.11.

o	Apply Randy Dunlap feedback.

Changes since v2:

o	Rebased to v5.11-rc6.

o	Updated Cc: list.

Changes since v1:

o	Applied feedback from Rik van Riel.

o	Rebased to v5.11-rc3.

o	Stripped "RFC" from the subject lines.

						Thanx, Paul

------------------------------------------------------------------------

 Documentation/admin-guide/kernel-parameters.txt   |   16 +
 arch/x86/kernel/tsc.c                             |    1 
 b/Documentation/admin-guide/kernel-parameters.txt |    6 
 b/arch/x86/kernel/tsc.c                           |    3 
 b/include/linux/clocksource.h                     |    2 
 b/kernel/time/Makefile                            |    1 
 b/kernel/time/clocksource-wdtest.c                |  202 ++++++++++++++++++++++
 b/kernel/time/clocksource.c                       |   52 +++++
 b/kernel/time/jiffies.c                           |   15 -
 b/lib/Kconfig.debug                               |   12 +
 include/linux/clocksource.h                       |    6 
 kernel/time/clocksource.c                         |  202 +++++++++++++++++++---
 12 files changed, 482 insertions(+), 36 deletions(-)


Thread overview: 15+ messages
2021-05-11 23:34 [PATCH v14 clocksource 0/5] Do not mark clocks unstable due to delays for v5.14 Paul E. McKenney
2021-05-11 23:34 ` [PATCH v14 clocksource 1/6] clocksource: Retry clock read if long delays detected Paul E. McKenney
2021-05-11 23:34 ` [PATCH v14 clocksource 2/6] clocksource: Check per-CPU clock synchronization when marked unstable Paul E. McKenney
2021-05-11 23:34 ` [PATCH v14 clocksource 3/6] clocksource: Limit number of CPUs checked for clock synchronization Paul E. McKenney
2021-05-11 23:34 ` [PATCH v14 clocksource 4/6] clocksource: Reduce clocksource-skew threshold for TSC Paul E. McKenney
2021-05-12  2:18   ` Feng Tang
2021-05-12  3:51     ` Paul E. McKenney
2021-05-12 13:18       ` Feng Tang
2021-05-12 17:14         ` Paul E. McKenney
2021-05-11 23:34 ` [PATCH v14 clocksource 5/6] clocksource: Provide kernel module to test clocksource watchdog Paul E. McKenney
2021-05-13  3:29   ` Feng Tang
2021-05-13  4:01     ` Paul E. McKenney
2021-05-11 23:34 ` [PATCH v14 clocksource 6/6] clocksource: Print deviation in nanoseconds for unstable case Paul E. McKenney
2021-05-12  2:21   ` Feng Tang
2021-05-12  3:38     ` Paul E. McKenney
