linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] irq: detect slow IRQ handlers
@ 2021-01-12 13:59 Mark Rutland
  2021-01-12 13:59 ` [PATCH 1/2] irq: abstract irqaction handler invocation Mark Rutland
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Mark Rutland @ 2021-01-12 13:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: mark.rutland, maz, paulmck, peterz, tglx

Hi,

While fuzzing arm64 with Syzkaller (under QEMU+KVM) over a number of releases,
I've occasionally seen some ridiculously long stalls (20+ seconds), where it
appears that a CPU is stuck in a hard IRQ context. As this gets detected after
the CPU returns to the interrupted context, it's difficult to identify where
exactly the stall is coming from.

These patches are intended to help track this down, with a WARN() if an IRQ
handler takes longer than a given timeout (1 second by default), logging the
specific IRQ and handler function. While it's possible to achieve something
similar with tracing, it's harder to integrate that into an automated fuzzing
setup.

I've been running this for a short while, and haven't yet seen any of the
stalls with this applied, but I've tested with smaller timeout periods in the 1
millisecond range by overloading the host, so I'm confident that the check
works.

Thanks,
Mark.

Mark Rutland (2):
  irq: abstract irqaction handler invocation
  irq: detect long-running IRQ handlers

 kernel/irq/chip.c      | 15 +++----------
 kernel/irq/handle.c    |  4 +---
 kernel/irq/internals.h | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug      | 15 +++++++++++++
 4 files changed, 76 insertions(+), 15 deletions(-)

-- 
2.11.0



* [PATCH 1/2] irq: abstract irqaction handler invocation
  2021-01-12 13:59 [PATCH 0/2] irq: detect slow IRQ handlers Mark Rutland
@ 2021-01-12 13:59 ` Mark Rutland
  2021-01-12 13:59 ` [PATCH 2/2] irq: detect long-running IRQ handlers Mark Rutland
  2021-01-13  0:09 ` [PATCH 0/2] irq: detect slow " Paul E. McKenney
  2 siblings, 0 replies; 6+ messages in thread
From: Mark Rutland @ 2021-01-12 13:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: mark.rutland, maz, paulmck, peterz, tglx

We have a few functions which invoke irqaction handlers, all of which
need to call trace_irq_handler_entry() and trace_irq_handler_exit().

In preparation for adding some additional debug logic to each irqaction
handler invocation, let's factor out this work to a helper.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/chip.c      | 15 +++------------
 kernel/irq/handle.c    |  4 +---
 kernel/irq/internals.h | 28 ++++++++++++++++++++++++++++
 3 files changed, 32 insertions(+), 15 deletions(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6d89e33fe3aa..ce3f40d881a8 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -741,16 +741,13 @@ void handle_fasteoi_nmi(struct irq_desc *desc)
 	struct irq_chip *chip = irq_desc_get_chip(desc);
 	struct irqaction *action = desc->action;
 	unsigned int irq = irq_desc_get_irq(desc);
-	irqreturn_t res;
 
 	__kstat_incr_irqs_this_cpu(desc);
 
-	trace_irq_handler_entry(irq, action);
 	/*
 	 * NMIs cannot be shared, there is only one action.
 	 */
-	res = action->handler(irq, action->dev_id);
-	trace_irq_handler_exit(irq, action, res);
+	handle_irqaction(irq, action);
 
 	if (chip->irq_eoi)
 		chip->irq_eoi(&desc->irq_data);
@@ -914,7 +911,6 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
 	struct irq_chip *chip = irq_desc_get_chip(desc);
 	struct irqaction *action = desc->action;
 	unsigned int irq = irq_desc_get_irq(desc);
-	irqreturn_t res;
 
 	/*
 	 * PER CPU interrupts are not serialized. Do not touch
@@ -926,9 +922,7 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
 		chip->irq_ack(&desc->irq_data);
 
 	if (likely(action)) {
-		trace_irq_handler_entry(irq, action);
-		res = action->handler(irq, raw_cpu_ptr(action->percpu_dev_id));
-		trace_irq_handler_exit(irq, action, res);
+		handle_irqaction_percpu_devid(irq, action);
 	} else {
 		unsigned int cpu = smp_processor_id();
 		bool enabled = cpumask_test_cpu(cpu, desc->percpu_enabled);
@@ -957,13 +951,10 @@ void handle_percpu_devid_fasteoi_nmi(struct irq_desc *desc)
 	struct irq_chip *chip = irq_desc_get_chip(desc);
 	struct irqaction *action = desc->action;
 	unsigned int irq = irq_desc_get_irq(desc);
-	irqreturn_t res;
 
 	__kstat_incr_irqs_this_cpu(desc);
 
-	trace_irq_handler_entry(irq, action);
-	res = action->handler(irq, raw_cpu_ptr(action->percpu_dev_id));
-	trace_irq_handler_exit(irq, action, res);
+	handle_irqaction_percpu_devid(irq, action);
 
 	if (chip->irq_eoi)
 		chip->irq_eoi(&desc->irq_data);
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index 762a928e18f9..65994befd280 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -152,9 +152,7 @@ irqreturn_t __handle_irq_event_percpu(struct irq_desc *desc, unsigned int *flags
 		    !(action->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT)))
 			lockdep_hardirq_threaded();
 
-		trace_irq_handler_entry(irq, action);
-		res = action->handler(irq, action->dev_id);
-		trace_irq_handler_exit(irq, action, res);
+		res = handle_irqaction(irq, action);
 
 		if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pS enabled interrupts\n",
 			      irq, action->handler))
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 54363527feea..70a4694cc891 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -11,6 +11,8 @@
 #include <linux/pm_runtime.h>
 #include <linux/sched/clock.h>
 
+#include <trace/events/irq.h>
+
 #ifdef CONFIG_SPARSE_IRQ
 # define IRQ_BITMAP_BITS	(NR_IRQS + 8196)
 #else
@@ -107,6 +109,32 @@ irqreturn_t __handle_irq_event_percpu(struct irq_desc *desc, unsigned int *flags
 irqreturn_t handle_irq_event_percpu(struct irq_desc *desc);
 irqreturn_t handle_irq_event(struct irq_desc *desc);
 
+static inline irqreturn_t __handle_irqaction(unsigned int irq,
+					     struct irqaction *action,
+					     void *dev_id)
+{
+	irqreturn_t res;
+
+	trace_irq_handler_entry(irq, action);
+	res = action->handler(irq, dev_id);
+	trace_irq_handler_exit(irq, action, res);
+
+	return res;
+}
+
+static inline irqreturn_t handle_irqaction(unsigned int irq,
+					   struct irqaction *action)
+{
+	return __handle_irqaction(irq, action, action->dev_id);
+}
+
+static inline irqreturn_t handle_irqaction_percpu_devid(unsigned int irq,
+							struct irqaction *action)
+{
+	return __handle_irqaction(irq, action,
+				  raw_cpu_ptr(action->percpu_dev_id));
+}
+
 /* Resending of interrupts :*/
 int check_irq_resend(struct irq_desc *desc, bool inject);
 bool irq_wait_for_poll(struct irq_desc *desc);
-- 
2.11.0



* [PATCH 2/2] irq: detect long-running IRQ handlers
  2021-01-12 13:59 [PATCH 0/2] irq: detect slow IRQ handlers Mark Rutland
  2021-01-12 13:59 ` [PATCH 1/2] irq: abstract irqaction handler invocation Mark Rutland
@ 2021-01-12 13:59 ` Mark Rutland
  2021-01-13  0:09 ` [PATCH 0/2] irq: detect slow " Paul E. McKenney
  2 siblings, 0 replies; 6+ messages in thread
From: Mark Rutland @ 2021-01-12 13:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: mark.rutland, maz, paulmck, peterz, tglx

If a hard IRQ handler takes a long time to handle an IRQ, it may cause a
soft lockup or RCU stall, but as this will be detected once the handler
has returned it can be difficult to attribute the delay to the specific
IRQ handler.

It's possible to trace IRQ handlers to diagnose this, but that's not a
great fit for automated testing environments (e.g. fuzzers), where
something like the existing lockup/stall detectors works well.

This patch adds a new stall detector for IRQ handlers, which reports
when a handler takes longer than a given timeout (defaulting to 1
second). This won't detect hung IRQ handlers (catching those requires an
NMI, and they should already be caught by hung task detection on systems
with NMIs), but it helps on platforms without NMIs or where a periodic
watchdog is undesirable.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/internals.h | 35 ++++++++++++++++++++++++++++++++---
 lib/Kconfig.debug      | 15 +++++++++++++++
 2 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 70a4694cc891..191b6a9d30e2 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -6,6 +6,7 @@
  * kernel/irq/. Do not even think about using any information outside
  * of this file for your non core code.
  */
+#include <linux/bug.h>
 #include <linux/irqdesc.h>
 #include <linux/kernel_stat.h>
 #include <linux/pm_runtime.h>
@@ -122,17 +123,45 @@ static inline irqreturn_t __handle_irqaction(unsigned int irq,
 	return res;
 }
 
+#ifdef CONFIG_DETECT_SLOW_IRQ_HANDLER
+static inline irqreturn_t __handle_check_irqaction(unsigned int irq,
+						   struct irqaction *action,
+						   void *dev_id)
+{
+	u64 timeout = CONFIG_IRQ_HANDLER_TIMEOUT_NS;
+	u64 start, end, duration;
+	int res;
+
+	start = local_clock();
+	res = __handle_irqaction(irq, action, dev_id);
+	end = local_clock();
+
+	duration = end - start;
+	WARN(duration > timeout, "IRQ %d handler %ps took %llu ns\n",
+	     irq, action->handler, duration);
+
+	return res;
+}
+#else
+static inline irqreturn_t __handle_check_irqaction(unsigned int irq,
+						   struct irqaction *action,
+						   void *dev_id)
+{
+	return __handle_irqaction(irq, action, dev_id);
+}
+#endif
+
 static inline irqreturn_t handle_irqaction(unsigned int irq,
 					   struct irqaction *action)
 {
-	return __handle_irqaction(irq, action, action->dev_id);
+	return __handle_check_irqaction(irq, action, action->dev_id);
 }
 
 static inline irqreturn_t handle_irqaction_percpu_devid(unsigned int irq,
 							struct irqaction *action)
 {
-	return __handle_irqaction(irq, action,
-				  raw_cpu_ptr(action->percpu_dev_id));
+	return __handle_check_irqaction(irq, action,
+					raw_cpu_ptr(action->percpu_dev_id));
 }
 
 /* Resending of interrupts :*/
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 5ea0c1773b0a..ea5289518dd6 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1097,6 +1097,21 @@ config WQ_WATCHDOG
 	  state.  This can be configured through kernel parameter
 	  "workqueue.watchdog_thresh" and its sysfs counterpart.
 
+config DETECT_SLOW_IRQ_HANDLER
+	bool "Detect long-running IRQ handlers"
+	help
+	  Say Y here to enable detection of long-running IRQ handlers. When a
+	  (hard) IRQ handler takes longer than a given timeout (1s by default)
+	  to return, a warning will be printed with the name of the handler.
+
+	  This can help to identify specific IRQ handlers which are
+	  contributing to stalls.
+
+config IRQ_HANDLER_TIMEOUT_NS
+	int "Timeout for long-running IRQ handlers (in nanoseconds)"
+	depends on DETECT_SLOW_IRQ_HANDLER
+	default 1000000000
+
 config TEST_LOCKUP
 	tristate "Test module to generate lockups"
 	depends on m
-- 
2.11.0



* Re: [PATCH 0/2] irq: detect slow IRQ handlers
  2021-01-12 13:59 [PATCH 0/2] irq: detect slow IRQ handlers Mark Rutland
  2021-01-12 13:59 ` [PATCH 1/2] irq: abstract irqaction handler invocation Mark Rutland
  2021-01-12 13:59 ` [PATCH 2/2] irq: detect long-running IRQ handlers Mark Rutland
@ 2021-01-13  0:09 ` Paul E. McKenney
  2021-01-13 12:39   ` Mark Rutland
  2 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2021-01-13  0:09 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-kernel, maz, peterz, tglx

On Tue, Jan 12, 2021 at 01:59:48PM +0000, Mark Rutland wrote:
> Hi,
> 
> While fuzzing arm64 with Syzkaller (under QEMU+KVM) over a number of releases,
> I've occasionally seen some ridiculously long stalls (20+ seconds), where it
> appears that a CPU is stuck in a hard IRQ context. As this gets detected after
> the CPU returns to the interrupted context, it's difficult to identify where
> exactly the stall is coming from.
> 
> These patches are intended to help track this down, with a WARN() if an IRQ
> handler takes longer than a given timeout (1 second by default), logging the
> specific IRQ and handler function. While it's possible to achieve something
> similar with tracing, it's harder to integrate that into an automated fuzzing
> setup.
> 
> I've been running this for a short while, and haven't yet seen any of the
> stalls with this applied, but I've tested with smaller timeout periods in the 1
> millisecond range by overloading the host, so I'm confident that the check
> works.
> 
> Thanks,
> Mark.

Nice!

Acked-by: Paul E. McKenney <paulmck@kernel.org>

I added the patch below to add a three-second delay to the scheduling
clock interrupt handler.  This executed, but did not cause your warning
to be emitted, probably because rcutorture runs under qemu/KVM.  So no
Tested-by, not yet, anyway.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e04e336..dac8c7a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2606,6 +2606,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
  */
 void rcu_sched_clock_irq(int user)
 {
+	static atomic_t invctr;
+
 	trace_rcu_utilization(TPS("Start scheduler-tick"));
 	lockdep_assert_irqs_disabled();
 	raw_cpu_inc(rcu_data.ticks_this_gp);
@@ -2623,6 +2625,14 @@ void rcu_sched_clock_irq(int user)
 		invoke_rcu_core();
 	lockdep_assert_irqs_disabled();
 
+	if (atomic_inc_return(&invctr) % 0x3ffff == 0) {
+		int i;
+
+		pr_alert("%s: 3-second delay.\n", __func__);
+		for (i = 0; i < 3000; i++)
+			udelay(1000);
+	}
+
 	trace_rcu_utilization(TPS("End scheduler-tick"));
 }
 


* Re: [PATCH 0/2] irq: detect slow IRQ handlers
  2021-01-13  0:09 ` [PATCH 0/2] irq: detect slow " Paul E. McKenney
@ 2021-01-13 12:39   ` Mark Rutland
  2021-01-13 17:18     ` Paul E. McKenney
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Rutland @ 2021-01-13 12:39 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, maz, peterz, tglx

On Tue, Jan 12, 2021 at 04:09:53PM -0800, Paul E. McKenney wrote:
> On Tue, Jan 12, 2021 at 01:59:48PM +0000, Mark Rutland wrote:
> > Hi,
> > 
> > While fuzzing arm64 with Syzkaller (under QEMU+KVM) over a number of releases,
> > I've occasionally seen some ridiculously long stalls (20+ seconds), where it
> > appears that a CPU is stuck in a hard IRQ context. As this gets detected after
> > the CPU returns to the interrupted context, it's difficult to identify where
> > exactly the stall is coming from.
> > 
> > These patches are intended to help track this down, with a WARN() if an IRQ
> > handler takes longer than a given timeout (1 second by default), logging the
> > specific IRQ and handler function. While it's possible to achieve something
> > similar with tracing, it's harder to integrate that into an automated fuzzing
> > setup.
> > 
> > I've been running this for a short while, and haven't yet seen any of the
> > stalls with this applied, but I've tested with smaller timeout periods in the 1
> > millisecond range by overloading the host, so I'm confident that the check
> > works.
> > 
> > Thanks,
> > Mark.
> 
> Nice!
> 
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> 
> I added the patch below to add a three-second delay to the scheduling
> clock interrupt handler.  This executed, but did not cause your warning
> to be emitted, probably because rcutorture runs under qemu/KVM.  So no
> Tested-by, not yet, anyway.

I think this is because on x86, APIC timer interrupts are handled in
arch code without going through the usual IRQ management infrastructure.
A dump_stack() in rcu_sched_clock_irq() shows:

[   75.131594] rcu: rcu_sched_clock_irq: 3-second delay.
[   75.132557] CPU: 2 PID: 135 Comm: sh Not tainted 5.11.0-rc3+ #12
[   75.133610] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   75.135639] Call Trace:
[   75.136100]  dump_stack+0x57/0x6a
[   75.136713]  rcu_sched_clock_irq+0x76d/0x880
[   75.137493]  update_process_times+0x77/0xb0
[   75.138254]  tick_sched_handle.isra.17+0x2b/0x40
[   75.139105]  tick_sched_timer+0x36/0x70
[   75.139803]  ? tick_sched_handle.isra.17+0x40/0x40
[   75.140665]  __hrtimer_run_queues+0xf8/0x230
[   75.141441]  hrtimer_interrupt+0xfc/0x240
[   75.142169]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[   75.143117]  __sysvec_apic_timer_interrupt+0x58/0xf0
[   75.144017]  sysvec_apic_timer_interrupt+0x27/0x80
[   75.144892]  asm_sysvec_apic_timer_interrupt+0x12/0x20

Here __sysvec_apic_timer_interrupt() calls local_apic_timer_interrupt()
which calls the clock_event_device::event_handler() directly. Since that
never goes via an irqaction handler, the code I add is never invoked in
this path. I believe this is true for a number of IRQs on x86 (e.g.
IPIs). A slow handler for a peripheral interrupt should still be caught,
though.

On arm64, timer interrupts (and IIUC IPIs too) go through the usual IRQ
management code, and so delays there get caught:

[  311.703932] rcu: rcu_sched_clock_irq: 3-second delay.
[  311.705012] CPU: 3 PID: 199 Comm: bash Not tainted 5.11.0-rc3-00003-gbe60490b2295-dirty #13
[  311.706694] Hardware name: linux,dummy-virt (DT)
[  311.707688] Call trace:
[  311.708233]  dump_backtrace+0x0/0x1a0
[  311.709053]  show_stack+0x18/0x70
[  311.709774]  dump_stack+0xd0/0x12c
[  311.710468]  rcu_sched_clock_irq+0x7d4/0xcf0
[  311.711356]  update_process_times+0x9c/0xec
[  311.712288]  tick_sched_handle+0x34/0x60
[  311.713191]  tick_sched_timer+0x4c/0xa4
[  311.714043]  __hrtimer_run_queues+0x140/0x1e0
[  311.715012]  hrtimer_interrupt+0xe8/0x290
[  311.715943]  arch_timer_handler_virt+0x38/0x4c
[  311.716951]  handle_percpu_devid_irq+0x94/0x190
[  311.717953]  __handle_domain_irq+0x7c/0xe0
[  311.718890]  gic_handle_irq+0xc0/0x140
[  311.719729]  el0_irq_naked+0x4c/0x54
[  314.720833] ------------[ cut here ]------------
[  314.721950] IRQ 11 handler arch_timer_handler_virt took 3016901740 ns
[  314.723421] WARNING: CPU: 3 PID: 199 at kernel/irq/internals.h:140 handle_percpu_devid_irq+0x158/0x190

I think our options are:

1) Live with it, and don't check these special cases.

2) Rework the special cases to go through the regular irqaction
   processing.

3) Open-code checks in each special case.

4) Add a helper/wrapper function that can be called in each special
   case, and update each one accordingly.

... and I reckon some mixture of #3 and #4 is plausible. We could add a
__handle_check_irq_function() or similar and use that to wrap the call
to local_apic_timer_interrupt() from sysvec_apic_timer_interrupt(), but
I'm not sure exactly what that needs to look like to cover any other
special cases.

Thanks,
Mark.

> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index e04e336..dac8c7a 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2606,6 +2606,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
>   */
>  void rcu_sched_clock_irq(int user)
>  {
> +	static atomic_t invctr;
> +
>  	trace_rcu_utilization(TPS("Start scheduler-tick"));
>  	lockdep_assert_irqs_disabled();
>  	raw_cpu_inc(rcu_data.ticks_this_gp);
> @@ -2623,6 +2625,14 @@ void rcu_sched_clock_irq(int user)
>  		invoke_rcu_core();
>  	lockdep_assert_irqs_disabled();
>  
> +	if (atomic_inc_return(&invctr) % 0x3ffff == 0) {
> +		int i;
> +
> +		pr_alert("%s: 3-second delay.\n", __func__);
> +		for (i = 0; i < 3000; i++)
> +			udelay(1000);
> +	}
> +
>  	trace_rcu_utilization(TPS("End scheduler-tick"));
>  }
>  


* Re: [PATCH 0/2] irq: detect slow IRQ handlers
  2021-01-13 12:39   ` Mark Rutland
@ 2021-01-13 17:18     ` Paul E. McKenney
  0 siblings, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2021-01-13 17:18 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-kernel, maz, peterz, tglx

On Wed, Jan 13, 2021 at 12:39:15PM +0000, Mark Rutland wrote:
> On Tue, Jan 12, 2021 at 04:09:53PM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 12, 2021 at 01:59:48PM +0000, Mark Rutland wrote:
> > > Hi,
> > > 
> > > While fuzzing arm64 with Syzkaller (under QEMU+KVM) over a number of releases,
> > > I've occasionally seen some ridiculously long stalls (20+ seconds), where it
> > > appears that a CPU is stuck in a hard IRQ context. As this gets detected after
> > > the CPU returns to the interrupted context, it's difficult to identify where
> > > exactly the stall is coming from.
> > > 
> > > These patches are intended to help track this down, with a WARN() if an IRQ
> > > handler takes longer than a given timeout (1 second by default), logging the
> > > specific IRQ and handler function. While it's possible to achieve something
> > > similar with tracing, it's harder to integrate that into an automated fuzzing
> > > setup.
> > > 
> > > I've been running this for a short while, and haven't yet seen any of the
> > > stalls with this applied, but I've tested with smaller timeout periods in the 1
> > > millisecond range by overloading the host, so I'm confident that the check
> > > works.
> > > 
> > > Thanks,
> > > Mark.
> > 
> > Nice!
> > 
> > Acked-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > I added the patch below to add a three-second delay to the scheduling
> > clock interrupt handler.  This executed, but did not cause your warning
> > to be emitted, probably because rcutorture runs under qemu/KVM.  So no
> > Tested-by, not yet, anyway.
> 
> I think this is because on x86, APIC timer interrupts are handled in
> arch code without going through the usual IRQ management infrastructure.
> A dump_stack() in rcu_sched_clock_irq() shows:
> 
> [   75.131594] rcu: rcu_sched_clock_irq: 3-second delay.
> [   75.132557] CPU: 2 PID: 135 Comm: sh Not tainted 5.11.0-rc3+ #12
> [   75.133610] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [   75.135639] Call Trace:
> [   75.136100]  dump_stack+0x57/0x6a
> [   75.136713]  rcu_sched_clock_irq+0x76d/0x880
> [   75.137493]  update_process_times+0x77/0xb0
> [   75.138254]  tick_sched_handle.isra.17+0x2b/0x40
> [   75.139105]  tick_sched_timer+0x36/0x70
> [   75.139803]  ? tick_sched_handle.isra.17+0x40/0x40
> [   75.140665]  __hrtimer_run_queues+0xf8/0x230
> [   75.141441]  hrtimer_interrupt+0xfc/0x240
> [   75.142169]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> [   75.143117]  __sysvec_apic_timer_interrupt+0x58/0xf0
> [   75.144017]  sysvec_apic_timer_interrupt+0x27/0x80
> [   75.144892]  asm_sysvec_apic_timer_interrupt+0x12/0x20
> 
> Here __sysvec_apic_timer_interrupt() calls local_apic_timer_interrupt()
> which calls the clock_event_device::event_handler() directly. Since that
> never goes via an irqaction handler, the code I add is never invoked in
> this path. I believe this is true for a number of IRQs on x86 (e.g.
> IPIs). A slow handler for a peripheral interrupt should still be caught,
> though.

This seems to me to be the most important case.  IPIs are already covered
by CONFIG_CSD_LOCK_WAIT_DEBUG=y, which prints out additional
IPI-specific information.

> On arm64, timer interrupts (and IIUC IPIs too) go through the usual IRQ
> management code, and so delays there get caught:
> 
> [  311.703932] rcu: rcu_sched_clock_irq: 3-second delay.
> [  311.705012] CPU: 3 PID: 199 Comm: bash Not tainted 5.11.0-rc3-00003-gbe60490b2295-dirty #13
> [  311.706694] Hardware name: linux,dummy-virt (DT)
> [  311.707688] Call trace:
> [  311.708233]  dump_backtrace+0x0/0x1a0
> [  311.709053]  show_stack+0x18/0x70
> [  311.709774]  dump_stack+0xd0/0x12c
> [  311.710468]  rcu_sched_clock_irq+0x7d4/0xcf0
> [  311.711356]  update_process_times+0x9c/0xec
> [  311.712288]  tick_sched_handle+0x34/0x60
> [  311.713191]  tick_sched_timer+0x4c/0xa4
> [  311.714043]  __hrtimer_run_queues+0x140/0x1e0
> [  311.715012]  hrtimer_interrupt+0xe8/0x290
> [  311.715943]  arch_timer_handler_virt+0x38/0x4c
> [  311.716951]  handle_percpu_devid_irq+0x94/0x190
> [  311.717953]  __handle_domain_irq+0x7c/0xe0
> [  311.718890]  gic_handle_irq+0xc0/0x140
> [  311.719729]  el0_irq_naked+0x4c/0x54
> [  314.720833] ------------[ cut here ]------------
> [  314.721950] IRQ 11 handler arch_timer_handler_virt took 3016901740 ns
> [  314.723421] WARNING: CPU: 3 PID: 199 at kernel/irq/internals.h:140 handle_percpu_devid_irq+0x158/0x190

And that is in fact the trace in my case.  ;-)

> I think our options are:
> 
> 1) Live with it, and don't check these special cases.
> 
> 2) Rework the special cases to go through the regular irqaction
>    processing.
> 
> 3) Open-code checks in each special case.
> 
> 4) Add a helper/wrapper function that can be called in each special
>    case, and update each one accordingly.
> 
> ... and I reckon some mixture of #3 and #4 is plausible. We could add a
> __handle_check_irq_function() or similar and use that to wrap the call
> to local_apic_timer_interrupt() from sysvec_apic_timer_interrupt(), but
> I'm not sure exactly what that needs to look like to cover any other
> special cases.

I believe that what you have is a good starting point.

							Thanx, Paul


end of thread, other threads:[~2021-01-13 17:19 UTC | newest]

Thread overview: 6+ messages
2021-01-12 13:59 [PATCH 0/2] irq: detect slow IRQ handlers Mark Rutland
2021-01-12 13:59 ` [PATCH 1/2] irq: abstract irqaction handler invocation Mark Rutland
2021-01-12 13:59 ` [PATCH 2/2] irq: detect long-running IRQ handlers Mark Rutland
2021-01-13  0:09 ` [PATCH 0/2] irq: detect slow " Paul E. McKenney
2021-01-13 12:39   ` Mark Rutland
2021-01-13 17:18     ` Paul E. McKenney
