All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org,
	sboyd@kernel.org, corbet@lwn.net, Mark.Rutland@arm.com,
	maz@kernel.org, kernel-team@fb.com, neeraju@codeaurora.org,
	ak@linux.intel.com, feng.tang@intel.com, zhengjun.xing@intel.com
Subject: [PATCH V11 clocksource 0/6] Do not mark clocks unstable due to delays for v5.13
Date: Wed, 28 Apr 2021 18:29:09 -0700	[thread overview]
Message-ID: <20210429012909.GA3958584@paulmck-ThinkPad-P17-Gen-1> (raw)

Hello!

If there is a sufficient delay between reading the watchdog clock and the
clock under test, the clock under test will be marked unstable through no
fault of its own.  This series checks for this, doing limited retries
to get a good set of clock reads.  If the clock is marked unstable
and is marked as being per-CPU, cross-CPU synchronization is checked.
This series also provides delay injection, which may be enabled via
kernel boot parameters to test the checking for delays.

Note that "sufficient delay" can be provided by SMIs, NMIs, and of course
vCPU preemption.

1.	Provide module parameters to inject delays in watchdog.

2.	Retry clock read if long delays detected.

3.	Check per-CPU clock synchronization when marked unstable.

4.	Provide a module parameter to fuzz per-CPU clock checking.

5.	Limit number of CPUs checked for clock synchronization.

6.	Reduce clocksource-skew threshold for TSC.

Changes since v10, based on feedback from Thomas Gleixner, Peter Zijlstra,
Feng Tang, Andi Kleen, Luming Yu, Xing Zhengju, and the indefatigible
kernel test robot:

o	Automatically compute the uncertainty margin for clocksource, and
	also allow them to be specified manually before that clocksource
	is registered.

o	For the automatically computed uncertainty margins, bound them
	below by 100 microseconds (2 * WATCHDOG_MAX_SKEW).

o	For the manually specified uncertainty margins, splat (but
	continue) if they are less than 100 microseconds (again 2 *
	WATCHDOG_MAX_SKEW).  The purpose of splatting is to discourage
	production use of this clock-skew-inducing debugging technique.

o	Manually set the uncertainty margin for clocksource_jiffies
	(and thus refined_jiffies) to TICK_NSEC to compensate for the
	very low frequency of these clocks.

o	Manually set the uncertainty margin for clocksource_tsc_early
	to 32 milliseconds.

o	Apply numerous "Link:" fields to all patches.

o	Add some acks and CCs.

Changes since v9:

o	Forgive tsc_early drift, based on feedback from Feng Tang; Xing,
	Zhengjun; and Thomas Gleixner.

o	Improve CPU selection for clock-synchronization checking.

Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/

Changes since v8, based on Thomas Gleixner feedback:

o	Reduced clock-skew threshold to 200us and delay limit to 50us.

o	Split out a cs_watchdog_read() function.

o	Removed the pointless CLOCK_SOURCE_VERIFY_PERCPU from kvm_clock.

o	Initialized cs_nsec_max and cs_nsec_min to avoid firsttime checks.

Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/

Changes since v7, based on Thomas Gleixner feedback:

o	Fix embarrassing git-format-patch operator error.

o	Merge pairwise clock-desynchronization checking into the checking
	of per-CPU clock synchronization when marked unstable.

o	Do selective per-CPU checking rather than blindly checking all
	CPUs.  Provide a clocksource.verify_n_cpus kernel boot parameter
	to control this behavior, with the value -1 choosing the old
	check-all-CPUs behavior.  The default is to randomly check 8 CPUs.

o	Fix the clock-desynchronization checking to avoid a potential
	use-after-free error for dynamically allocated clocksource
	structures.

o	Remove redundance "wdagain_nsec < 0" from clocksource_watchdog()
	clocksource skew checking.

o	Update commit logs and do code-style updates.

Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/

Changes since v5:

o	Rebased to v5.12-rc5.

Changes since v4:

o	Rebased to v5.12-rc1.

Changes since v3:

o	Rebased to v5.11.

o	Apply Randy Dunlap feedback.

Changes since v2:

o	Rebased to v5.11-rc6.

o	Updated Cc: list.

Changes since v1:

o	Applied feedback from Rik van Riel.

o	Rebased to v5.11-rc3.

o	Stripped "RFC" from the subject lines.

						Thanx, Paul

------------------------------------------------------------------------

 Documentation/admin-guide/kernel-parameters.txt   |   32 +++
 arch/x86/kernel/tsc.c                             |    1 
 b/Documentation/admin-guide/kernel-parameters.txt |   21 ++
 b/arch/x86/kernel/tsc.c                           |    3 
 b/include/linux/clocksource.h                     |    2 
 b/kernel/time/clocksource.c                       |   23 ++
 b/kernel/time/jiffies.c                           |   15 -
 include/linux/clocksource.h                       |    3 
 kernel/time/clocksource.c                         |  227 ++++++++++++++++++++--
 9 files changed, 304 insertions(+), 23 deletions(-)

             reply	other threads:[~2021-04-29  1:29 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-29  1:29 Paul E. McKenney [this message]
2021-04-29  1:30 ` [PATCH v11 clocksource 1/6] clocksource: Provide module parameters to inject delays in watchdog Paul E. McKenney
2021-04-29  1:30 ` [PATCH v11 clocksource 2/6] clocksource: Retry clock read if long delays detected Paul E. McKenney
2021-04-29  1:30 ` [PATCH v11 clocksource 3/6] clocksource: Check per-CPU clock synchronization when marked unstable Paul E. McKenney
2021-04-29  1:30 ` [PATCH v11 clocksource 4/6] clocksource: Provide a module parameter to fuzz per-CPU clock checking Paul E. McKenney
2021-04-29  1:30 ` [PATCH v11 clocksource 5/6] clocksource: Limit number of CPUs checked for clock synchronization Paul E. McKenney
2021-04-29  1:30 ` [PATCH v11 clocksource 6/6] clocksource: Reduce clocksource-skew threshold for TSC Paul E. McKenney
2021-04-29  7:02   ` kernel test robot
2021-04-29  7:02     ` kernel test robot
2021-04-29 14:04     ` Paul E. McKenney
2021-04-29 14:04       ` Paul E. McKenney
2021-04-29 11:13 ` [PATCH V11 clocksource 0/6] Do not mark clocks unstable due to delays for v5.13 Luming Yu
2021-04-30  5:10   ` Paul E. McKenney
2021-04-30  6:52     ` Luming Yu
2021-05-01  4:28       ` Paul E. McKenney
2021-06-02  5:10         ` Luming Yu
2021-06-02 17:46           ` Paul E. McKenney
2021-06-02 18:24             ` Paul E. McKenney
2021-06-03  4:25               ` Luming Yu
2021-06-03 14:40                 ` Paul E. McKenney
2021-06-04  4:23                   ` Luming Yu
2021-06-04 20:26                     ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210429012909.GA3958584@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=Mark.Rutland@arm.com \
    --cc=ak@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=feng.tang@intel.com \
    --cc=john.stultz@linaro.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=neeraju@codeaurora.org \
    --cc=sboyd@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=zhengjun.xing@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.