linux-kernel.vger.kernel.org archive mirror
From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>, Borislav Petkov <bp@suse.de>
Cc: Ashok Raj <ashok.raj@intel.com>,
	Andi Kleen <andi.kleen@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Ricardo Neri <ricardo.neri@intel.com>,
	Ricardo Neri <ricardo.neri-calderon@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Tony Luck <tony.luck@intel.com>,
	Clemens Ladisch <clemens@ladisch.de>,
	Arnd Bergmann <arnd@arndb.de>,
	Philippe Ombredanne <pombredanne@nexb.com>,
	Kate Stewart <kstewart@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Mimi Zohar <zohar@linux.ibm.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	Nayna Jain <nayna@linux.ibm.com>
Subject: [RFC PATCH v2 12/14] x86/watchdog/hardlockup/hpet: Determine if HPET timer caused NMI
Date: Wed, 27 Feb 2019 08:05:16 -0800
Message-ID: <1551283518-18922-13-git-send-email-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <1551283518-18922-1-git-send-email-ricardo.neri-calderon@linux.intel.com>

The only direct method to determine whether an HPET timer caused an
interrupt is to read the Interrupt Status register. Unfortunately,
reading HPET registers is slow and, therefore, it is not recommended to
read them while in NMI context. Furthermore, status is not available if
the interrupt is generated via the Front Side Bus.

An indirect method is to compute the expected value of the time-stamp
counter at the time of the interrupt and verify that its actual value
is within a range of the expected value. Since the hardlockup detector
operates in seconds, high precision is not needed. This implementation
considers that the HPET caused the NMI if the time-stamp counter reads
the expected value +/- ~1.5%. This value is selected because it is
approximately 1/64 (1.5625%), so the division can be performed using a
bit shift. Experimentally, the error in the estimation is consistently
less than 1%.
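
As an illustration only, the window check described above can be sketched
in plain user-space C. The struct and function names below (tsc_window,
tsc_window_arm, tsc_window_hit) are hypothetical and do not appear in the
patch; the sketch takes the TSC readings as arguments instead of calling
rdtsc(), and it computes the absolute difference explicitly on unsigned
values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields the patch adds to hpet_hld_data. */
struct tsc_window {
	uint64_t tsc_next;       /* expected TSC value at timer expiration */
	uint64_t tsc_next_error; /* allowed deviation: 1/64 of the interval */
};

/*
 * Compute the expected TSC value at expiration and the tolerance.
 * tsc_khz * 1000 is the number of TSC cycles per second.
 */
static void tsc_window_arm(struct tsc_window *w, uint64_t tsc_curr,
			   uint64_t thresh_secs, uint64_t tsc_khz)
{
	uint64_t tsc_delta = thresh_secs * tsc_khz * 1000ULL;

	w->tsc_next = tsc_curr + tsc_delta;
	w->tsc_next_error = tsc_delta >> 6;	/* delta / 64, about 1.56% */
}

/* Return true if tsc_now is within the tolerance of the expected value. */
static bool tsc_window_hit(const struct tsc_window *w, uint64_t tsc_now)
{
	/* Absolute difference without going through a signed type. */
	uint64_t diff = tsc_now > w->tsc_next ? tsc_now - w->tsc_next
					      : w->tsc_next - tsc_now;

	return diff < w->tsc_next_error;
}
```

For example, assuming a 2 GHz TSC (tsc_khz = 2000000) and a threshold of
10 seconds, the interval is 20,000,000,000 cycles and the tolerance is
312,500,000 cycles, or roughly 156 ms on either side of the expected
expiration.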

Also, only read the time-stamp counter of the handling CPU (the one
targeted by the HPET timer). This helps to avoid variability of the time
stamp across CPUs.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Nayna Jain <nayna@linux.ibm.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Suggested-by: Andi Kleen <andi.kleen@intel.com> 
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/hpet.h         |  2 ++
 arch/x86/kernel/watchdog_hld_hpet.c | 28 +++++++++++++++++++++++++---
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 15dc3b576496..09763340c911 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -123,6 +123,8 @@ struct hpet_hld_data {
 	u32		num;
 	u32		flags;
 	u64		ticks_per_second;
+	u64		tsc_next;
+	u64		tsc_next_error;
 	u32		handling_cpu;
 	struct cpumask	cpu_monitored_mask;
 	struct msi_msg	msi_msg;
diff --git a/arch/x86/kernel/watchdog_hld_hpet.c b/arch/x86/kernel/watchdog_hld_hpet.c
index cfa284da4bf6..65b4699f249a 100644
--- a/arch/x86/kernel/watchdog_hld_hpet.c
+++ b/arch/x86/kernel/watchdog_hld_hpet.c
@@ -55,6 +55,11 @@ static inline void set_comparator(struct hpet_hld_data *hdata,
  *
  * Reprogram the timer to expire within watchdog_thresh seconds in the future.
  *
+ * Also compute the expected value of the time-stamp counter at the time of
+ * expiration, as well as the maximum allowed deviation from that value. The
+ * maximum deviation is ~1.5% (1/64), computed by shifting right by 6 positions
+ * the delta between the current and expected time-stamp values.
+ *
  * Returns:
  *
  * None
@@ -62,7 +67,18 @@ static inline void set_comparator(struct hpet_hld_data *hdata,
 static void kick_timer(struct hpet_hld_data *hdata, bool force)
 {
 	bool kick_needed = force || !(hdata->flags & HPET_DEV_PERI_CAP);
-	unsigned long new_compare, count;
+	unsigned long tsc_curr, tsc_delta, new_compare, count;
+
+	/* Start obtaining the current TSC and HPET counts. */
+	tsc_curr = rdtsc();
+
+	if (kick_needed)
+		count = get_count();
+
+	tsc_delta = (unsigned long)watchdog_thresh * (unsigned long)tsc_khz
+		    * 1000L;
+	hdata->tsc_next = tsc_curr + tsc_delta;
+	hdata->tsc_next_error = tsc_delta >> 6;
 
 	/*
 	 * Update the comparator in increments of watch_thresh seconds relative
@@ -74,8 +90,6 @@ static void kick_timer(struct hpet_hld_data *hdata, bool force)
 	 */
 
 	if (kick_needed) {
-		count = get_count();
-
 		new_compare = count + watchdog_thresh * hdata->ticks_per_second;
 
 		set_comparator(hdata, new_compare);
@@ -147,6 +161,14 @@ static void set_periodic(struct hpet_hld_data *hdata)
  */
 static bool is_hpet_wdt_interrupt(struct hpet_hld_data *hdata)
 {
+	if (smp_processor_id() == hdata->handling_cpu) {
+		unsigned long tsc_curr;
+
+		tsc_curr = rdtsc();
+		if (abs(tsc_curr - hdata->tsc_next) < hdata->tsc_next_error)
+			return true;
+	}
+
 	return false;
 }
 
-- 
2.17.1


Thread overview: 49+ messages
2019-02-27 16:05 [RFC PATCH v2 00/14] Implement an HPET-based hardlockup detector Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 01/14] x86/msi: Add definition for NMI delivery mode Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 02/14] x86/hpet: Expose more functions to read and write registers Ricardo Neri
2019-03-26 21:00   ` Thomas Gleixner
2019-04-09  2:03     ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 03/14] x86/hpet: Calculate ticks-per-second in a separate function Ricardo Neri
2019-03-26 21:03   ` Thomas Gleixner
2019-04-09  2:04     ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 04/14] x86/hpet: Reserve timer for the HPET hardlockup detector Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 05/14] x86/hpet: Relocate flag definitions to a header file Ricardo Neri
2019-03-26 21:11   ` Thomas Gleixner
2019-04-09  2:04     ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 06/14] x86/hpet: Configure the timer used by the hardlockup detector Ricardo Neri
2019-03-26 21:13   ` Thomas Gleixner
2019-04-09  2:04     ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 07/14] watchdog/hardlockup: Define a generic function to detect hardlockups Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 08/14] watchdog/hardlockup: Decouple the hardlockup detector from perf Ricardo Neri
2019-03-26 21:18   ` Thomas Gleixner
2019-04-09  2:05     ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 09/14] watchdog/hardlockup: Make arch_touch_nmi_watchdog() to hpet-based implementation Ricardo Neri
2019-02-27 16:17   ` Paul E. McKenney
2019-03-01  1:17     ` Ricardo Neri
2019-03-26 21:20       ` Thomas Gleixner
2019-04-09  2:05         ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 10/14] kernel/watchdog: Add a function to obtain the watchdog_allowed_mask Ricardo Neri
2019-03-26 21:22   ` Thomas Gleixner
2019-04-09  2:05     ` Ricardo Neri
2019-04-09 11:34   ` Peter Zijlstra
2019-04-11  1:15     ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 11/14] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector Ricardo Neri
2019-03-26 20:49   ` Thomas Gleixner
2019-04-09  2:02     ` Ricardo Neri
2019-04-09 10:59     ` Peter Zijlstra
2019-04-10  1:13       ` Ricardo Neri
2019-04-05 16:12   ` Suthikulpanit, Suravee
2019-04-09  2:14     ` Ricardo Neri
2019-04-09 11:03   ` Peter Zijlstra
2019-04-10  1:05     ` Ricardo Neri
2019-02-27 16:05 ` Ricardo Neri [this message]
2019-03-26 20:55   ` [RFC PATCH v2 12/14] x86/watchdog/hardlockup/hpet: Determine if HPET timer caused NMI Thomas Gleixner
2019-04-09  2:02     ` Ricardo Neri
2019-04-09 11:28   ` Peter Zijlstra
2019-04-10  1:19     ` Ricardo Neri
2019-04-10  7:01       ` Peter Zijlstra
2019-04-11  1:12         ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 13/14] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter Ricardo Neri
2019-03-26 21:29   ` Thomas Gleixner
2019-04-09  2:07     ` Ricardo Neri
2019-02-27 16:05 ` [RFC PATCH v2 14/14] x86/watchdog: Add a shim hardlockup detector Ricardo Neri
