linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org,
	sboyd@kernel.org, corbet@lwn.net, Mark.Rutland@arm.com,
	maz@kernel.org, kernel-team@fb.com, neeraju@codeaurora.org,
	ak@linux.intel.com, "Paul E. McKenney" <paulmck@kernel.org>,
	Chris Mason <clm@fb.com>
Subject: [PATCH v9 clocksource 1/6] clocksource: Provide module parameters to inject delays in watchdog
Date: Sun, 18 Apr 2021 21:52:55 -0700	[thread overview]
Message-ID: <20210419045300.596332-1-paulmck@kernel.org> (raw)
In-Reply-To: <20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1>

When the clocksource watchdog marks a clock as unstable, this might be due
to that clock being unstable or it might be due to delays that happen to
occur between the reads of the two clocks.  Yes, interrupts are disabled
across those two reads, but there are no shortage of things that can
delay interrupts-disabled regions of code ranging from SMI handlers to
vCPU preemption.  It would be good to have some indication as to why
the clock was marked unstable.

The first step is a way of injecting such delays.  Therefore, provide
clocksource.inject_delay_freq and clocksource.inject_delay_run kernel boot
parameters that specify that sufficient delay be injected to cause the
clocksource_watchdog() function to mark a clock unstable.  This delay is
injected every Nth set of M calls to clocksource_watchdog(), where N is
the value specified for the inject_delay_freq boot parameter and M is the
value specified for the inject_delay_run boot parameter.  Values of zero
or less for either parameter disable delay injection, and the default for
clocksource.inject_delay_freq is zero, that is, disabled.  The default for
clocksource.inject_delay_run is the value one, that is single-call runs.

This facility is intended for diagnostic use only, and should be avoided
on production systems.

Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <Mark.Rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
[ paulmck: Apply Rik van Riel feedback. ]
[ paulmck: Apply Thomas Gleixner feedback. ]
Reported-by: Chris Mason <clm@fb.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 .../admin-guide/kernel-parameters.txt         | 16 +++++++++++++
 kernel/time/clocksource.c                     | 23 +++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..4a372037b49f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -583,6 +583,22 @@
 			loops can be debugged more effectively on production
 			systems.
 
+	clocksource.inject_delay_period= [KNL]
+			Number of calls to clocksource_watchdog() before
+			delays are injected between reads from the
+			two clocksources.  Values of zero disable this
+			delay injection.  These delays can cause clocks
+			to be marked unstable, so use of this parameter
+			should therefore be avoided on production systems.
+			Defaults to zero (disabled).
+
+	clocksource.inject_delay_repeat= [KNL]
+			Number of repeated clocksource_watchdog() delay
+			injections per period.	If inject_delay_period
+			is five and inject_delay_repeat is three, there
+			will be five delay-free reads followed by three
+			delayed reads.
+
 	clearcpuid=BITNUM[,BITNUM...] [X86]
 			Disable CPUID feature X for the kernel. See
 			arch/x86/include/asm/cpufeatures.h for the valid bit
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index cce484a2cc7c..d77700c45f4e 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -14,6 +14,7 @@
 #include <linux/sched.h> /* for spin_unlock_irq() using preempt_count() m68k */
 #include <linux/tick.h>
 #include <linux/kthread.h>
+#include <linux/delay.h>
 
 #include "tick-internal.h"
 #include "timekeeping_internal.h"
@@ -184,6 +185,27 @@ void clocksource_mark_unstable(struct clocksource *cs)
 	spin_unlock_irqrestore(&watchdog_lock, flags);
 }
 
+static ulong inject_delay_period;
+module_param(inject_delay_period, ulong, 0644);
+static ulong inject_delay_repeat = 1;
+module_param(inject_delay_repeat, ulong, 0644);
+
+static void clocksource_watchdog_inject_delay(void)
+{
+	static unsigned int invocations = 1, injections;
+
+	if (!inject_delay_period || !inject_delay_repeat)
+		return;
+	if (!(invocations % inject_delay_period)) {
+		pr_warn("%s(): Injecting delay.\n", __func__);
+		mdelay(2 * WATCHDOG_THRESHOLD / NSEC_PER_MSEC);
+		if (++injections < inject_delay_repeat)
+			return;
+		injections = 0;
+	}
+	invocations++;
+}
+
 static void clocksource_watchdog(struct timer_list *unused)
 {
 	struct clocksource *cs;
@@ -208,6 +230,7 @@ static void clocksource_watchdog(struct timer_list *unused)
 
 		local_irq_disable();
 		csnow = cs->read(cs);
+		clocksource_watchdog_inject_delay();
 		wdnow = watchdog->read(watchdog);
 		local_irq_enable();
 
-- 
2.31.1.189.g2e36527f23


  reply	other threads:[~2021-04-19  4:53 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-19  4:51 [PATCH v9 clocksource 0/6] Do not mark clocks unstable due to delays for v5.13 Paul E. McKenney
2021-04-19  4:52 ` Paul E. McKenney [this message]
2021-04-19  4:52 ` [PATCH v9 clocksource 2/6] clocksource: Retry clock read if long delays detected Paul E. McKenney
2021-04-19  4:52 ` [PATCH v9 clocksource 3/6] clocksource: Check per-CPU clock synchronization when marked unstable Paul E. McKenney
2021-04-19  4:52 ` [PATCH v9 clocksource 4/6] clocksource: Provide a module parameter to fuzz per-CPU clock checking Paul E. McKenney
2021-04-19  4:52 ` [PATCH v9 clocksource 5/6] clocksource: Limit number of CPUs checked for clock synchronization Paul E. McKenney
2021-04-19  4:53 ` [PATCH v9 clocksource 6/6] clocksource: Reduce WATCHDOG_THRESHOLD Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210419045300.596332-1-paulmck@kernel.org \
    --to=paulmck@kernel.org \
    --cc=Mark.Rutland@arm.com \
    --cc=ak@linux.intel.com \
    --cc=clm@fb.com \
    --cc=corbet@lwn.net \
    --cc=john.stultz@linaro.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=neeraju@codeaurora.org \
    --cc=sboyd@kernel.org \
    --cc=tglx@linutronix.de \
    --subject='Re: [PATCH v9 clocksource 1/6] clocksource: Provide module parameters to inject delays in watchdog' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).