All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, maz@kernel.org, paulmck@kernel.org,
	peterz@infradead.org, tglx@linutronix.de
Subject: [PATCH v2 2/2] irq: detect long-running IRQ handlers
Date: Tue, 15 Jun 2021 11:25:07 +0100	[thread overview]
Message-ID: <20210615102507.9677-3-mark.rutland@arm.com> (raw)
In-Reply-To: <20210615102507.9677-1-mark.rutland@arm.com>

If a hard IRQ handler takes a long time to handle an IRQ, it may cause a
soft lockup or RCU stall, but as this will be detected once the handler
has returned it can be difficult to attribute the delay to the specific
IRQ handler.

It's possible to trace IRQ handlers to diagnose this, but that's not a
great fit for automated testing environments (e.g. fuzzers), where
something like the existing lockup/stall detectors works well.

This patch adds a new stall detector for IRQ handlers, which reports
when handlers took longer than a given timeout value (defaulting to 1
second). This won't detect hung IRQ handlers (which requires an NMI, and
should already be caught by hung task detection on systems with NMIs),
but helps on platforms without NMI or where a periodic watchdog is
undesireable.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/internals.h | 35 ++++++++++++++++++++++++++++++++---
 lib/Kconfig.debug      | 15 +++++++++++++++
 2 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 70a4694cc891..191b6a9d30e2 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -6,6 +6,7 @@
  * kernel/irq/. Do not even think about using any information outside
  * of this file for your non core code.
  */
+#include <linux/bug.h>
 #include <linux/irqdesc.h>
 #include <linux/kernel_stat.h>
 #include <linux/pm_runtime.h>
@@ -122,17 +123,45 @@ static inline irqreturn_t __handle_irqaction(unsigned int irq,
 	return res;
 }
 
+#ifdef CONFIG_DETECT_SLOW_IRQ_HANDLER
+static inline irqreturn_t __handle_check_irqaction(unsigned int irq,
+						   struct irqaction *action,
+						   void *dev_id)
+{
+	u64 timeout = CONFIG_IRQ_HANDLER_TIMEOUT_NS;
+	u64 start, end, duration;
+	int res;
+
+	start = local_clock();
+	res = __handle_irqaction(irq, action, dev_id);
+	end = local_clock();
+
+	duration = end - start;
+	WARN(duration > timeout, "IRQ %d handler %ps took %llu ns\n",
+	     irq, action->handler, duration);
+
+	return res;
+}
+#else
+static inline irqreturn_t __handle_check_irqaction(unsigned int irq,
+						   struct irqaction *action,
+						   void *dev_id)
+{
+	return __handle_irqaction(irq, action, dev_id);
+}
+#endif
+
 static inline irqreturn_t handle_irqaction(unsigned int irq,
 					   struct irqaction *action)
 {
-	return __handle_irqaction(irq, action, action->dev_id);
+	return __handle_check_irqaction(irq, action, action->dev_id);
 }
 
 static inline irqreturn_t handle_irqaction_percpu_devid(unsigned int irq,
 							struct irqaction *action)
 {
-	return __handle_irqaction(irq, action,
-				  raw_cpu_ptr(action->percpu_dev_id));
+	return __handle_check_irqaction(irq, action,
+					raw_cpu_ptr(action->percpu_dev_id));
 }
 
 /* Resending of interrupts :*/
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 678c13967580..2c6a501fa9b9 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1144,6 +1144,21 @@ config WQ_WATCHDOG
 	  state.  This can be configured through kernel parameter
 	  "workqueue.watchdog_thresh" and its sysfs counterpart.
 
+config DETECT_SLOW_IRQ_HANDLER
+	bool "Detect long-running IRQ handlers"
+	help
+	  Say Y here to enable detection of long-running IRQ handlers. When a
+	  (hard) IRQ handler returns after a given timeout value (1s by
+	  default) a warning will be printed with the name of the handler.
+
+	  This can help to identify specific IRQ handlers which are
+	  contributing to stalls.
+
+config IRQ_HANDLER_TIMEOUT_NS
+	int "Timeout for long-running IRQ handlers (in nanoseconds)"
+	depends on DETECT_SLOW_IRQ_HANDLER
+	default 1000000000
+
 config TEST_LOCKUP
 	tristate "Test module to generate lockups"
 	depends on m
-- 
2.11.0


      parent reply	other threads:[~2021-06-15 10:25 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-15 10:25 [PATCH v2 0/2] irq: detect slow IRQ handlers Mark Rutland
2021-06-15 10:25 ` [PATCH v2 1/2] irq: abstract irqaction handler invocation Mark Rutland
2021-06-15 10:25 ` Mark Rutland [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210615102507.9677-3-mark.rutland@arm.com \
    --to=mark.rutland@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.