* [PATCH 0/3] warn and suppress irqflood @ 2020-10-22 5:56 Pingfan Liu 2020-10-22 5:56 ` [PATCH 1/3] kernel/watchdog: show irq percentage if irq floods Pingfan Liu ` (3 more replies) 0 siblings, 4 replies; 24+ messages in thread From: Pingfan Liu @ 2020-10-22 5:56 UTC (permalink / raw) To: linux-kernel Cc: Pingfan Liu, Thomas Gleixner, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, kexec I hit a irqflood bug on powerpc platform, and two years ago, on a x86 platform. When the bug happens, the kernel is totally occupies by irq. Currently, there may be nothing or just soft lockup warning showed in console. It is better to warn users with irq flood info. In the kdump case, the kernel can move on by suppressing the irq flood. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: afzal mohammed <afzal.mohd.ma@gmail.com> Cc: Lina Iyer <ilina@codeaurora.org> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: Maulik Shah <mkshah@codeaurora.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oliver Neukum <oneukum@suse.com> To: linux-kernel@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: kexec@lists.infradead.org Pingfan Liu (3): kernel/watchdog: show irq percentage if irq floods kernel/watchdog: suppress max irq when irq floods Documentation: introduce a param "irqflood_suppress" Documentation/admin-guide/kernel-parameters.txt | 3 ++ include/linux/irq.h | 2 ++ kernel/irq/spurious.c | 32 +++++++++++++++++ kernel/watchdog.c | 48 +++++++++++++++++++++++++ 4 files changed, 85 insertions(+) -- 2.7.5 ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/3] kernel/watchdog: show irq percentage if irq floods 2020-10-22 5:56 [PATCH 0/3] warn and suppress irqflood Pingfan Liu @ 2020-10-22 5:56 ` Pingfan Liu 2020-10-22 5:56 ` [PATCH 2/3] kernel/watchdog: suppress max irq when " Pingfan Liu ` (2 subsequent siblings) 3 siblings, 0 replies; 24+ messages in thread From: Pingfan Liu @ 2020-10-22 5:56 UTC (permalink / raw) To: linux-kernel Cc: Pingfan Liu, Thomas Gleixner, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, kexec In kdump case, the capture kernel has no chance to shutdown devices, and leave the devices in unstable state. Irq flood is one of the consequence. When irq floods, the kernel is totally occupies by irq. Currently, there may be nothing or just soft lockup warning showed in console. It is better to warn users with irq flood info. Soft lockup watchdog is a good foundation to implement the detector. A irq flood is reported if the following two conditions are met: -1. the irq occupies too much (98%) of the past interval. -2. no tasks has been scheduled in the past interval. This is implemented by check_hint. A note: is_softlockup() implies the 2nd condition, but unfortunately, irq flood can come from anywhere. If irq_enter_rcu()->tick_irq_enter(), then touch_softlockup_watchdog_sched() will reset watchdog, and softlockup is never detected. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: afzal mohammed <afzal.mohd.ma@gmail.com> Cc: Lina Iyer <ilina@codeaurora.org> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: Maulik Shah <mkshah@codeaurora.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oliver Neukum <oneukum@suse.com> To: linux-kernel@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: kexec@lists.infradead.org --- kernel/watchdog.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 5abb5b2..230ac38 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -23,6 +23,7 @@ #include <linux/sched/debug.h> #include <linux/sched/isolation.h> #include <linux/stop_machine.h> +#include <linux/kernel_stat.h> #include <asm/irq_regs.h> #include <linux/kvm_para.h> @@ -175,6 +176,13 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync); static DEFINE_PER_CPU(bool, soft_watchdog_warn); static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts); static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved); + +#ifdef CONFIG_IRQ_TIME_ACCOUNTING +static DEFINE_PER_CPU(long, check_hint) = {-1}; +static DEFINE_PER_CPU(unsigned long, last_irq_ts); +static DEFINE_PER_CPU(unsigned long, last_total_ts); +#endif + static unsigned long soft_lockup_nmi_warn; static int __init nowatchdog_setup(char *str) @@ -331,12 +339,43 @@ static DEFINE_PER_CPU(struct cpu_stop_work, softlockup_stop_work); */ static int softlockup_fn(void *data) { +#ifdef CONFIG_IRQ_TIME_ACCOUNTING + __this_cpu_write(check_hint, -1); +#endif __touch_watchdog(); complete(this_cpu_ptr(&softlockup_completion)); return 0; } +#ifdef CONFIG_IRQ_TIME_ACCOUNTING +static void check_irq_flood(void) +{ + unsigned long irqts, totalts, percent, cnt; + u64 *cpustat = kcpustat_this_cpu->cpustat; + + totalts = running_clock(); + irqts = cpustat[CPUTIME_IRQ] + cpustat[CPUTIME_SOFTIRQ]; + cnt = __this_cpu_inc_return(check_hint); + + if (cnt >= 5) { + totalts = totalts - __this_cpu_read(last_total_ts); + irqts = irqts - __this_cpu_read(last_irq_ts); + percent = irqts * 100 / totalts; + percent = percent < 100 ? percent : 100; + __this_cpu_write(check_hint, -1); + if (percent >= 98) + pr_info("Irq flood occupies more than %lu%% of the past %lu seconds\n", + percent, totalts >> 30); + } else if (cnt == 0) { + __this_cpu_write(last_total_ts, totalts); + __this_cpu_write(last_irq_ts, irqts); + } +} +#else +static void check_irq_flood(void){} +#endif + /* watchdog kicker functions */ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) { @@ -348,6 +387,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) if (!watchdog_enabled) return HRTIMER_NORESTART; + /* When irq floods, watchdog may be still touched. Hence it can not be done inside lockup */ + check_irq_flood(); /* kick the hardlockup detector */ watchdog_interrupt_count(); -- 2.7.5 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/3] kernel/watchdog: suppress max irq when irq floods 2020-10-22 5:56 [PATCH 0/3] warn and suppress irqflood Pingfan Liu 2020-10-22 5:56 ` [PATCH 1/3] kernel/watchdog: show irq percentage if irq floods Pingfan Liu @ 2020-10-22 5:56 ` Pingfan Liu 2020-10-22 5:56 ` [PATCH 3/3] Documentation: introduce a param "irqflood_suppress" Pingfan Liu 2020-10-22 8:37 ` [PATCH 0/3] warn and suppress irqflood Thomas Gleixner 3 siblings, 0 replies; 24+ messages in thread From: Pingfan Liu @ 2020-10-22 5:56 UTC (permalink / raw) To: linux-kernel Cc: Pingfan Liu, Thomas Gleixner, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, kexec The capture kernel should try its best to save the crash info. Normally, irq flood is caused by some trivial devices, which has no impact on saving vmcore. Introducing a parameter "irqflood_suppress" to enable suppress irq flood. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: afzal mohammed <afzal.mohd.ma@gmail.com> Cc: Lina Iyer <ilina@codeaurora.org> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: Maulik Shah <mkshah@codeaurora.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oliver Neukum <oneukum@suse.com> To: linux-kernel@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: kexec@lists.infradead.org --- include/linux/irq.h | 2 ++ kernel/irq/spurious.c | 32 ++++++++++++++++++++++++++++++++ kernel/watchdog.c | 9 ++++++++- 3 files changed, 42 insertions(+), 1 deletion(-) diff --git a/include/linux/irq.h b/include/linux/irq.h index 1b7f4df..140cb61 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -684,6 +684,8 @@ extern void note_interrupt(struct irq_desc *desc, irqreturn_t action_ret); /* Enable/disable irq debugging output: */ extern int noirqdebug_setup(char *str); +void suppress_max_irq(void); + /* Checks whether the interrupt can be requested by request_irq(): */ extern int can_request_irq(unsigned int irq, unsigned long irqflags); diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c index f865e5f..d3d94d6 100644 --- a/kernel/irq/spurious.c +++ b/kernel/irq/spurious.c @@ -464,3 +464,35 @@ static int __init irqpoll_setup(char *str) } __setup("irqpoll", irqpoll_setup); + +#ifdef CONFIG_IRQ_TIME_ACCOUNTING + +static bool irqflood_suppress; + +static int __init irqflood_suppress_setup(char *str) +{ + irqflood_suppress = true; + pr_info("enable auto suppress irqflood\n"); + return 1; +} +__setup("irqflood_suppress", irqflood_suppress_setup); + +void suppress_max_irq(void) +{ + unsigned int tmp, maxirq = 0, max = 0; + int irq; + + if (!irqflood_suppress) + return; + for_each_active_irq(irq) { + tmp = kstat_irqs_cpu(irq, smp_processor_id()); + if (max < tmp) { + maxirq = irq; + max = tmp; + } + } + pr_warn("Suppress irq:%u, which is triggered %u times\n", + maxirq, max); + disable_irq_nosync(maxirq); +} +#endif diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 230ac38..28a74e5 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -24,6 +24,7 @@ #include <linux/sched/isolation.h> #include <linux/stop_machine.h> #include <linux/kernel_stat.h> +#include <linux/irq.h> #include <asm/irq_regs.h> #include <linux/kvm_para.h> @@ -364,9 +365,15 @@ static void check_irq_flood(void) percent = irqts * 100 / totalts; percent = percent < 100 ? percent : 100; __this_cpu_write(check_hint, -1); - if (percent >= 98) + if (percent >= 98) { pr_info("Irq flood occupies more than %lu%% of the past %lu seconds\n", percent, totalts >> 30); + /* + * Suppress top irq when scheduler does not work for long time and irq + * occupies too much time. + */ + suppress_max_irq(); + } } else if (cnt == 0) { __this_cpu_write(last_total_ts, totalts); __this_cpu_write(last_irq_ts, irqts); -- 2.7.5 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 3/3] Documentation: introduce a param "irqflood_suppress" 2020-10-22 5:56 [PATCH 0/3] warn and suppress irqflood Pingfan Liu 2020-10-22 5:56 ` [PATCH 1/3] kernel/watchdog: show irq percentage if irq floods Pingfan Liu 2020-10-22 5:56 ` [PATCH 2/3] kernel/watchdog: suppress max irq when " Pingfan Liu @ 2020-10-22 5:56 ` Pingfan Liu 2020-10-22 8:37 ` [PATCH 0/3] warn and suppress irqflood Thomas Gleixner 3 siblings, 0 replies; 24+ messages in thread From: Pingfan Liu @ 2020-10-22 5:56 UTC (permalink / raw) To: linux-kernel Cc: Pingfan Liu, Thomas Gleixner, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, kexec The param "irqflood_suppress" is helpful for capture kernel to survive irq flood. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: afzal mohammed <afzal.mohd.ma@gmail.com> Cc: Lina Iyer <ilina@codeaurora.org> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: Maulik Shah <mkshah@codeaurora.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oliver Neukum <oneukum@suse.com> To: linux-kernel@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: kexec@lists.infradead.org --- Documentation/admin-guide/kernel-parameters.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index a106874..0a25a05 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2009,6 +2009,9 @@ for it. Also check all handlers each timer interrupt. Intended to get systems with badly broken firmware running. + irqflood_suppress [HW] + When a irq fully occupies a cpu in a long time, suppressing + it to make kernel move on. It is useful in the capture kernel. isapnp= [ISAPNP] Format: <RDP>,<reset>,<pci_scan>,<verbosity> -- 2.7.5 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-22 5:56 [PATCH 0/3] warn and suppress irqflood Pingfan Liu ` (2 preceding siblings ...) 2020-10-22 5:56 ` [PATCH 3/3] Documentation: introduce a param "irqflood_suppress" Pingfan Liu @ 2020-10-22 8:37 ` Thomas Gleixner 2020-10-25 11:12 ` Pingfan Liu 3 siblings, 1 reply; 24+ messages in thread From: Thomas Gleixner @ 2020-10-22 8:37 UTC (permalink / raw) To: Pingfan Liu, linux-kernel Cc: Pingfan Liu, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, kexec On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote: > I hit a irqflood bug on powerpc platform, and two years ago, on a x86 platform. > When the bug happens, the kernel is totally occupies by irq. Currently, there > may be nothing or just soft lockup warning showed in console. It is better > to warn users with irq flood info. > > In the kdump case, the kernel can move on by suppressing the irq flood. You're curing the symptom not the cause and the cure is just magic and can't work reliably. Where is that irq flood originated from and why is none of the mechanisms we have in place to shut it up working? Thanks, tglx ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-22 8:37 ` [PATCH 0/3] warn and suppress irqflood Thomas Gleixner @ 2020-10-25 11:12 ` Pingfan Liu 2020-10-25 12:21 ` [Skiboot] " Oliver O'Halloran 2020-10-26 15:06 ` Guilherme Piccoli 0 siblings, 2 replies; 24+ messages in thread From: Pingfan Liu @ 2020-10-25 11:12 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote: > > I hit a irqflood bug on powerpc platform, and two years ago, on a x86 platform. > > When the bug happens, the kernel is totally occupies by irq. Currently, there > > may be nothing or just soft lockup warning showed in console. It is better > > to warn users with irq flood info. > > > > In the kdump case, the kernel can move on by suppressing the irq flood. > > You're curing the symptom not the cause and the cure is just magic and > can't work reliably. Yeah, it is magic. But at least, it is better to printk something and alarm users about what happens. With current code, it may show nothing when system hangs. > > Where is that irq flood originated from and why is none of the > mechanisms we have in place to shut it up working? The bug originates from a driver tpm_i2c_nuvoton, which calls i2c-bus driver (i2c-opal.c). After i2c_opal_send_request(), the bug is triggered. But things are complicated by introducing a firmware layer: Skiboot. This software layer hides the detail of manipulating the hardware from Linux. I guess the software logic can not enter a sane state when kernel crashes. Cc Skiboot and ppc64 community to see whether anyone has idea about it. Thanks, Pingfan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Skiboot] [PATCH 0/3] warn and suppress irqflood 2020-10-25 11:12 ` Pingfan Liu @ 2020-10-25 12:21 ` Oliver O'Halloran 2020-10-25 13:11 ` Pingfan Liu 2020-10-26 15:06 ` Guilherme Piccoli 1 sibling, 1 reply; 24+ messages in thread From: Oliver O'Halloran @ 2020-10-25 12:21 UTC (permalink / raw) To: Pingfan Liu Cc: Thomas Gleixner, Maulik Shah, Petr Mladek, Oliver Neukum, Jonathan Corbet, Gustavo A. R. Silva, Peter Zijlstra, Marc Zyngier, Linus Walleij, Guilherme G. Piccoli, linux-doc, LKML, Lina Iyer, Jisheng Zhang, Pawan Gupta, Al Viro, Andrew Morton, afzal mohammed, Kexec Mailing List, Mike Kravetz On Sun, Oct 25, 2020 at 10:22 PM Pingfan Liu <kernelfans@gmail.com> wrote: > > On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > > > On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote: > > > I hit a irqflood bug on powerpc platform, and two years ago, on a x86 platform. > > > When the bug happens, the kernel is totally occupies by irq. Currently, there > > > may be nothing or just soft lockup warning showed in console. It is better > > > to warn users with irq flood info. > > > > > > In the kdump case, the kernel can move on by suppressing the irq flood. > > > > You're curing the symptom not the cause and the cure is just magic and > > can't work reliably. > Yeah, it is magic. But at least, it is better to printk something and > alarm users about what happens. With current code, it may show nothing > when system hangs. > > > > Where is that irq flood originated from and why is none of the > > mechanisms we have in place to shut it up working? > The bug originates from a driver tpm_i2c_nuvoton, which calls i2c-bus > driver (i2c-opal.c). After i2c_opal_send_request(), the bug is > triggered. > > But things are complicated by introducing a firmware layer: Skiboot. > This software layer hides the detail of manipulating the hardware from > Linux. > > I guess the software logic can not enter a sane state when kernel crashes. > > Cc Skiboot and ppc64 community to see whether anyone has idea about it. What system are you using? There's an external interrupt pin which is supposed to be wired to the TPM. I think we bounce that interrupt to FW by default since the external interrupt is sometimes used for other system-specific purposes. Odds are FW doesn't know what to do with it so you effectively have an always-on LSI. I fixed a similar bug a while ago by having skiboot mask any interrupts it doesn't have a handler for, but I have no idea when that fix will land in a released FW build. Oh well. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Skiboot] [PATCH 0/3] warn and suppress irqflood 2020-10-25 12:21 ` [Skiboot] " Oliver O'Halloran @ 2020-10-25 13:11 ` Pingfan Liu 2020-10-25 13:51 ` Oliver O'Halloran 0 siblings, 1 reply; 24+ messages in thread From: Pingfan Liu @ 2020-10-25 13:11 UTC (permalink / raw) To: Oliver O'Halloran Cc: Thomas Gleixner, Maulik Shah, Petr Mladek, Oliver Neukum, Jonathan Corbet, Gustavo A. R. Silva, Peter Zijlstra, Marc Zyngier, Linus Walleij, Guilherme G. Piccoli, linux-doc, LKML, Lina Iyer, Jisheng Zhang, Pawan Gupta, Al Viro, Andrew Morton, afzal mohammed, Kexec Mailing List, Mike Kravetz On Sun, Oct 25, 2020 at 8:21 PM Oliver O'Halloran <oohall@gmail.com> wrote: > > On Sun, Oct 25, 2020 at 10:22 PM Pingfan Liu <kernelfans@gmail.com> wrote: > > > > On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > > > > > On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote: > > > > I hit a irqflood bug on powerpc platform, and two years ago, on a x86 platform. > > > > When the bug happens, the kernel is totally occupies by irq. Currently, there > > > > may be nothing or just soft lockup warning showed in console. It is better > > > > to warn users with irq flood info. > > > > > > > > In the kdump case, the kernel can move on by suppressing the irq flood. > > > > > > You're curing the symptom not the cause and the cure is just magic and > > > can't work reliably. > > Yeah, it is magic. But at least, it is better to printk something and > > alarm users about what happens. With current code, it may show nothing > > when system hangs. > > > > > > Where is that irq flood originated from and why is none of the > > > mechanisms we have in place to shut it up working? > > The bug originates from a driver tpm_i2c_nuvoton, which calls i2c-bus > > driver (i2c-opal.c). After i2c_opal_send_request(), the bug is > > triggered. > > > > But things are complicated by introducing a firmware layer: Skiboot. > > This software layer hides the detail of manipulating the hardware from > > Linux. > > > > I guess the software logic can not enter a sane state when kernel crashes. > > > > Cc Skiboot and ppc64 community to see whether anyone has idea about it. > > What system are you using? Here is the info, if not enough, I will get more. Product Name : OpenPOWER Firmware Product Version : open-power-SUPERMICRO-P9DSU-V1.16-20180531-imp Product Extra : op-build-e4b3eb5 Product Extra : skiboot-v6.0-p1da203b Product Extra : hostboot-f911e5c-pda8239f Product Extra : occ-77bb5e6-p623d1cd Product Extra : linux-4.16.7-openpower2-pbc45895 Product Extra : petitboot-v1.7.1-pf773c0d Product Extra : machine-xml-218a77a > > There's an external interrupt pin which is supposed to be wired to the > TPM. I think we bounce that interrupt to FW by default since the > external interrupt is sometimes used for other system-specific > purposes. Odds are FW doesn't know what to do with it so you > effectively have an always-on LSI. I fixed a similar bug a while ago > by having skiboot mask any interrupts it doesn't have a handler for, This sounds like the root cause. But here Skiboot should have handler, otherwise the first kernel can not run smoothly. Do you have any idea about an unexpected re-initialization introducing an unsane stage? Thanks, Pingfan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Skiboot] [PATCH 0/3] warn and suppress irqflood 2020-10-25 13:11 ` Pingfan Liu @ 2020-10-25 13:51 ` Oliver O'Halloran 0 siblings, 0 replies; 24+ messages in thread From: Oliver O'Halloran @ 2020-10-25 13:51 UTC (permalink / raw) To: Pingfan Liu Cc: Thomas Gleixner, Maulik Shah, Petr Mladek, Oliver Neukum, Jonathan Corbet, Gustavo A. R. Silva, Peter Zijlstra, Marc Zyngier, Linus Walleij, Guilherme G. Piccoli, linux-doc, LKML, Lina Iyer, Jisheng Zhang, Pawan Gupta, Al Viro, Andrew Morton, afzal mohammed, Kexec Mailing List, Mike Kravetz On Mon, Oct 26, 2020 at 12:11 AM Pingfan Liu <kernelfans@gmail.com> wrote: > > On Sun, Oct 25, 2020 at 8:21 PM Oliver O'Halloran <oohall@gmail.com> wrote: > > > > On Sun, Oct 25, 2020 at 10:22 PM Pingfan Liu <kernelfans@gmail.com> wrote: > > > > > > On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > > > > > > > On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote: > > > > > I hit a irqflood bug on powerpc platform, and two years ago, on a x86 platform. > > > > > When the bug happens, the kernel is totally occupies by irq. Currently, there > > > > > may be nothing or just soft lockup warning showed in console. It is better > > > > > to warn users with irq flood info. > > > > > > > > > > In the kdump case, the kernel can move on by suppressing the irq flood. > > > > > > > > You're curing the symptom not the cause and the cure is just magic and > > > > can't work reliably. > > > Yeah, it is magic. But at least, it is better to printk something and > > > alarm users about what happens. With current code, it may show nothing > > > when system hangs. > > > > > > > > Where is that irq flood originated from and why is none of the > > > > mechanisms we have in place to shut it up working? > > > The bug originates from a driver tpm_i2c_nuvoton, which calls i2c-bus > > > driver (i2c-opal.c). After i2c_opal_send_request(), the bug is > > > triggered. > > > > > > But things are complicated by introducing a firmware layer: Skiboot. > > > This software layer hides the detail of manipulating the hardware from > > > Linux. > > > > > > I guess the software logic can not enter a sane state when kernel crashes. > > > > > > Cc Skiboot and ppc64 community to see whether anyone has idea about it. > > > > What system are you using? > > Here is the info, if not enough, I will get more. > Product Name : OpenPOWER Firmware > Product Version : open-power-SUPERMICRO-P9DSU-V1.16-20180531-imp > Product Extra : op-build-e4b3eb5 > Product Extra : skiboot-v6.0-p1da203b > Product Extra : hostboot-f911e5c-pda8239f > Product Extra : occ-77bb5e6-p623d1cd > Product Extra : linux-4.16.7-openpower2-pbc45895 > Product Extra : petitboot-v1.7.1-pf773c0d > Product Extra : machine-xml-218a77a Unfortunately I don't have a schematic for that one. > > There's an external interrupt pin which is supposed to be wired to the > > TPM. I think we bounce that interrupt to FW by default since the > > external interrupt is sometimes used for other system-specific > > purposes. Odds are FW doesn't know what to do with it so you > > effectively have an always-on LSI. I fixed a similar bug a while ago > > by having skiboot mask any interrupts it doesn't have a handler for, > > This sounds like the root cause. But here Skiboot should have handler, > otherwise the first kernel can not run smoothly. I don't know why the TPM interrupt is asserted. If the TPM driver is polling for a response it might clear the underlying condition as a side effect of it's normal operation. > Do you have any idea about an unexpected re-initialization introducing > an unsane stage? No idea, but those TPMs have a history of bricking themselves if you do anything slightly odd to them. It wouldn't surprise me if the re-probe can cause issues. > Thanks, > Pingfan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-25 11:12 ` Pingfan Liu 2020-10-25 12:21 ` [Skiboot] " Oliver O'Halloran @ 2020-10-26 15:06 ` Guilherme Piccoli 2020-10-26 19:59 ` Thomas Gleixner 1 sibling, 1 reply; 24+ messages in thread From: Guilherme Piccoli @ 2020-10-26 15:06 UTC (permalink / raw) To: Pingfan Liu Cc: Thomas Gleixner, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List On Sun, Oct 25, 2020 at 8:12 AM Pingfan Liu <kernelfans@gmail.com> wrote: > > On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > > > On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote: > > > I hit a irqflood bug on powerpc platform, and two years ago, on a x86 platform. > > > When the bug happens, the kernel is totally occupies by irq. Currently, there > > > may be nothing or just soft lockup warning showed in console. It is better > > > to warn users with irq flood info. > > > > > > In the kdump case, the kernel can move on by suppressing the irq flood. > > > > You're curing the symptom not the cause and the cure is just magic and > > can't work reliably. > Yeah, it is magic. But at least, it is better to printk something and > alarm users about what happens. With current code, it may show nothing > when system hangs. Thanks Pingfan and Thomas for the points - I'd like to have a mechanism in the kernel to warn users when an IRQ flood is potentially happening. Some time ago (2 years) we faced a similar issue in x86-64, a hard to debug problem in kdump, that eventually was narrowed to a buggy NIC FW flooding IRQs in kdump kernel, and no messages showed (although kernel changed a lot since that time, today we might have better IRQ handling/warning). We tried an early-boot fix, by disabling MSIs (as per PCI spec) early in x86 boot, but it wasn't accepted - Bjorn asked pertinent questions that I couldn't respond (I lost the reproducer) [0]. Cheers, Guilherme [0] lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-26 15:06 ` Guilherme Piccoli @ 2020-10-26 19:59 ` Thomas Gleixner 2020-10-26 20:28 ` Guilherme Piccoli 2020-10-28 6:02 ` Pingfan Liu 0 siblings, 2 replies; 24+ messages in thread From: Thomas Gleixner @ 2020-10-26 19:59 UTC (permalink / raw) To: Guilherme Piccoli, Pingfan Liu Cc: LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List, Bjorn Helgaas On Mon, Oct 26 2020 at 12:06, Guilherme Piccoli wrote: > On Sun, Oct 25, 2020 at 8:12 AM Pingfan Liu <kernelfans@gmail.com> wrote: > > Some time ago (2 years) we faced a similar issue in x86-64, a hard to > debug problem in kdump, that eventually was narrowed to a buggy NIC FW > flooding IRQs in kdump kernel, and no messages showed (although kernel > changed a lot since that time, today we might have better IRQ > handling/warning). We tried an early-boot fix, by disabling MSIs (as > per PCI spec) early in x86 boot, but it wasn't accepted - Bjorn asked > pertinent questions that I couldn't respond (I lost the reproducer) > [0]. ... > [0] lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com With that broken firmware the NIC continued to send MSI messages to the vector/CPU which was assigned to it before the crash. But the crash kernel has no interrupt descriptor for this vector installed. So Liu's patches wont print anything simply because the interrupt core cannot detect it. To answer Bjorns still open question about when the point X is: https://lore.kernel.org/linux-pci/20181023170343.GA4587@bhelgaas-glaptop.roam.corp.google.com/ It gets flooded right at the point where the crash kernel enables interrupts in start_kernel(). At that point there is no device driver and no interupt requested. All you can see on the console for this is "common_interrupt: $VECTOR.$CPU No irq handler for vector" And contrary to Liu's patches which try to disable a requested interrupt if too many of them arrive, the kernel cannot do anything because there is nothing to disable in your case. That's why you needed to do the MSI disable magic in the early PCI quirks which run before interrupts get enabled. Also Liu's patch only works if: 1) CONFIG_IRQ_TIME_ACCOUNTING is enabled 2) the runaway interrupt has been requested by the relevant driver in the dump kernel. Especially #1 is not a sensible restriction. Thanks, tglx ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-26 19:59 ` Thomas Gleixner @ 2020-10-26 20:28 ` Guilherme Piccoli 2020-10-26 21:21 ` Thomas Gleixner 2020-10-28 6:02 ` Pingfan Liu 1 sibling, 1 reply; 24+ messages in thread From: Guilherme Piccoli @ 2020-10-26 20:28 UTC (permalink / raw) To: Thomas Gleixner Cc: Pingfan Liu, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List, Bjorn Helgaas On Mon, Oct 26, 2020 at 4:59 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Mon, Oct 26 2020 at 12:06, Guilherme Piccoli wrote: > > On Sun, Oct 25, 2020 at 8:12 AM Pingfan Liu <kernelfans@gmail.com> wrote: > > > > Some time ago (2 years) we faced a similar issue in x86-64, a hard to > > debug problem in kdump, that eventually was narrowed to a buggy NIC FW > > flooding IRQs in kdump kernel, and no messages showed (although kernel > > changed a lot since that time, today we might have better IRQ > > handling/warning). We tried an early-boot fix, by disabling MSIs (as > > per PCI spec) early in x86 boot, but it wasn't accepted - Bjorn asked > > pertinent questions that I couldn't respond (I lost the reproducer) > > [0]. > ... > > [0] lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com > > With that broken firmware the NIC continued to send MSI messages to the > vector/CPU which was assigned to it before the crash. But the crash > kernel has no interrupt descriptor for this vector installed. So Liu's > patches wont print anything simply because the interrupt core cannot > detect it. > > To answer Bjorns still open question about when the point X is: > > https://lore.kernel.org/linux-pci/20181023170343.GA4587@bhelgaas-glaptop.roam.corp.google.com/ > > It gets flooded right at the point where the crash kernel enables > interrupts in start_kernel(). At that point there is no device driver > and no interupt requested. All you can see on the console for this is > > "common_interrupt: $VECTOR.$CPU No irq handler for vector" > > And contrary to Liu's patches which try to disable a requested interrupt > if too many of them arrive, the kernel cannot do anything because there > is nothing to disable in your case. That's why you needed to do the MSI > disable magic in the early PCI quirks which run before interrupts get > enabled. > > Also Liu's patch only works if: > > 1) CONFIG_IRQ_TIME_ACCOUNTING is enabled > > 2) the runaway interrupt has been requested by the relevant driver in > the dump kernel. > > Especially #1 is not a sensible restriction. > > Thanks, > > tglx Wow, thank you very much for this great explanation (without a reproducer) - it's nice to hear somebody that deeply understands the code! And double thanks for CCing Bjorn. So, I don't want to hijack Liu's thread, but do you think it makes sense to have my approach as a (debug) parameter to prevent such a degenerate case? Or could we have something in core IRQ code to prevent irq flooding in such scenarios, something "stronger" than disabling MSIs (APIC-level, likely)? Cheers, Guilherme ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-26 20:28 ` Guilherme Piccoli @ 2020-10-26 21:21 ` Thomas Gleixner 2020-10-27 12:28 ` Guilherme Piccoli 0 siblings, 1 reply; 24+ messages in thread From: Thomas Gleixner @ 2020-10-26 21:21 UTC (permalink / raw) To: Guilherme Piccoli Cc: Pingfan Liu, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List, Bjorn Helgaas On Mon, Oct 26 2020 at 17:28, Guilherme Piccoli wrote: > On Mon, Oct 26, 2020 at 4:59 PM Thomas Gleixner <tglx@linutronix.de> wrote: >> It gets flooded right at the point where the crash kernel enables >> interrupts in start_kernel(). At that point there is no device driver >> and no interupt requested. All you can see on the console for this is >> >> "common_interrupt: $VECTOR.$CPU No irq handler for vector" >> >> And contrary to Liu's patches which try to disable a requested interrupt >> if too many of them arrive, the kernel cannot do anything because there >> is nothing to disable in your case. That's why you needed to do the MSI >> disable magic in the early PCI quirks which run before interrupts get >> enabled. > > Wow, thank you very much for this great explanation (without a > reproducer) - it's nice to hear somebody that deeply understands the > code! And double thanks for CCing Bjorn. Understanding the code is only half of the picture. You need to understand how the hardware works or not :) > So, I don't want to hijack Liu's thread, but do you think it makes > sense to have my approach as a (debug) parameter to prevent such a > degenerate case? At least it makes sense to some extent even if it's incomplete. What bothers me is that it'd be x86 specific while the issue is pretty much architecture independent. I don't think that the APIC is special in that regard. Rogue MSIs should be able to bring down pretty much all architectures. > Or could we have something in core IRQ code to prevent irq flooding in > such scenarios, something "stronger" than disabling MSIs (APIC-level, > likely)? For your case? No. The APIC cannot be protected against rogue MSIs. The only cure is to disable interrupts or disable MSIs on all PCI[E] devices early on. Disabling interrupts is not so much of an option obviously :) Thanks, tglx ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-26 21:21 ` Thomas Gleixner @ 2020-10-27 12:28 ` Guilherme Piccoli 0 siblings, 0 replies; 24+ messages in thread From: Guilherme Piccoli @ 2020-10-27 12:28 UTC (permalink / raw) To: Thomas Gleixner Cc: Pingfan Liu, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List, Bjorn Helgaas On Mon, Oct 26, 2020 at 6:21 PM Thomas Gleixner <tglx@linutronix.de> wrote: > [...] > > So, I don't want to hijack Liu's thread, but do you think it makes > > sense to have my approach as a (debug) parameter to prevent such a > > degenerate case? > > At least it makes sense to some extent even if it's incomplete. What > bothers me is that it'd be x86 specific while the issue is pretty much > architecture independent. I don't think that the APIC is special in that > regard. Rogue MSIs should be able to bring down pretty much all > architectures. > Thanks Thomas! I partially agree with you, I can speak only for x86 and powerpc. In x86 we know that happens, OK. But in powerpc, we had a special PCI reset, we called it IIRC "fundamental"/PHB reset - that procedure would put the PCI devices in good shape, it was something that the kernel could request from FW - see [0] for an example. It was present in all incarnations of powerpc (bare-metal, powerVM/PHyp - a virtual thing) except maybe in qemu (although it'd be possible to do that, since the PCI devices are attached on host and passthrough'ed via vfio). Anyway, in powerpc the PCI devices are really reset across "soft-reboots" be it kexec or what was called a fast reboot (that skipped some FW initializations), effectively disabling MSIs - x86 has no such default/vendor-agnostic reset infrastructure, BIOSes usually do some kind of PCI reset but with no interface for the kernel to request that in kexec, for example. That said, the option was to use the arch code to early-clear the MSI state in all devices, that being a kind of reset. And it's "supported" by the spec, that claims MSIs should be clear before devices' initialization =) Anyway, I'm glad to discuss more, and I'm even more glad that you consider the approach useful. We could revive that if Bjorn agrees, I could respin an updated version. ARM64/RISC-V or whatever other architectures I can't say about, but I think if they have early-PCI handlers (and !FW reset, like powerpc) it would be possible to implement that in a more complete way. > > Or could we have something in core IRQ code to prevent irq flooding in > > such scenarios, something "stronger" than disabling MSIs (APIC-level, > > likely)? > > For your case? No. The APIC cannot be protected against rogue MSIs. The > only cure is to disable interrupts or disable MSIs on all PCI[E] devices > early on. Disabling interrupts is not so much of an option obviously :) Great to know that, we imagined if it would be possible to have a more "soft" option, but it seems clearing MSIs is the way to go. Cheers, Guilherme [0] kernel portion: git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/platforms/powernv/pci-ioda.c#n3161 FW portion: github.com/open-power/skiboot/blob/master/core/pci-opal.c#L545 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-26 19:59 ` Thomas Gleixner 2020-10-26 20:28 ` Guilherme Piccoli @ 2020-10-28 6:02 ` Pingfan Liu 2020-10-28 11:58 ` Thomas Gleixner 1 sibling, 1 reply; 24+ messages in thread From: Pingfan Liu @ 2020-10-28 6:02 UTC (permalink / raw) To: Thomas Gleixner Cc: Guilherme Piccoli, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List, Bjorn Helgaas On Tue, Oct 27, 2020 at 3:59 AM Thomas Gleixner <tglx@linutronix.de> wrote: > [...] > > And contrary to Liu's patches which try to disable a requested interrupt > if too many of them arrive, the kernel cannot do anything because there > is nothing to disable in your case. That's why you needed to do the MSI > disable magic in the early PCI quirks which run before interrupts get > enabled. > > Also Liu's patch only works if: > > 1) CONFIG_IRQ_TIME_ACCOUNTING is enabled I wonder whether it can not be a default option or not by the following method: DEFINE_STATIC_KEY_FALSE(irqtime_account), and enable it according to a boot param. This will have no impact on performance with the disabled branch. Meanwhile users can easily turn on the option to detect an irq flood without recompiling the kernel. If it is doable, I will rework only on [1/2]. > > 2) the runaway interrupt has been requested by the relevant driver in > the dump kernel. Yes, it raises a big challenge to my method. Kdump kernel miss the whole picture of the first kernel's irq routing. Thanks, Pingfan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-28 6:02 ` Pingfan Liu @ 2020-10-28 11:58 ` Thomas Gleixner 2020-10-29 6:26 ` Pingfan Liu ` (3 more replies) 0 siblings, 4 replies; 24+ messages in thread From: Thomas Gleixner @ 2020-10-28 11:58 UTC (permalink / raw) To: Pingfan Liu Cc: Guilherme Piccoli, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List, Bjorn Helgaas On Wed, Oct 28 2020 at 14:02, Pingfan Liu wrote: > On Tue, Oct 27, 2020 at 3:59 AM Thomas Gleixner <tglx@linutronix.de> wrote: >> Also Liu's patch only works if: >> >> 1) CONFIG_IRQ_TIME_ACCOUNTING is enabled > > I wonder whether it can not be a default option or not by the following method: > DEFINE_STATIC_KEY_FALSE(irqtime_account), and enable it according to > a boot param. How so? config IRQ_TIME_ACCOUNTING depends on HAVE_IRQ_TIME_ACCOUNTING && !VIRT_CPU_ACCOUNTING_NATIVE > This will have no impact on performance with the disabled branch. > Meanwhile users can easily turn on the option to detect an irq flood > without recompiling the kernel. > > If it is doable, I will rework only on [1/2]. See above :) >> 2) the runaway interrupt has been requested by the relevant driver in >> the dump kernel. > > Yes, it raises a big challenge to my method. Kdump kernel miss the > whole picture of the first kernel's irq routing. Correct. If there is anything stale then you get what Guilherme observed. But the irq core can do nothing about that. Something like the completly untested below should work independent of config options. Thanks, tglx --- include/linux/irqdesc.h | 4 ++ kernel/irq/manage.c | 3 + kernel/irq/spurious.c | 74 +++++++++++++++++++++++++++++++++++------------- 3 files changed, 61 insertions(+), 20 deletions(-) --- a/include/linux/irqdesc.h +++ b/include/linux/irqdesc.h @@ -30,6 +30,8 @@ struct pt_regs; * @tot_count: stats field for non-percpu irqs * @irq_count: stats field to detect stalled irqs * @last_unhandled: aging timer for unhandled count + * @storm_count: Counter for irq storm detection + * @storm_checked: Timestamp for irq storm detection * @irqs_unhandled: stats field for spurious unhandled interrupts * @threads_handled: stats field for deferred spurious detection of threaded handlers * @threads_handled_last: comparator field for deferred spurious detection of theraded handlers @@ -65,6 +67,8 @@ struct irq_desc { unsigned int tot_count; unsigned int irq_count; /* For detecting broken IRQs */ unsigned long last_unhandled; /* Aging timer for unhandled count */ + unsigned long storm_count; + unsigned long storm_checked; unsigned int irqs_unhandled; atomic_t threads_handled; int threads_handled_last; --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1581,6 +1581,9 @@ static int if (!shared) { init_waitqueue_head(&desc->wait_for_threads); + /* Take a timestamp for interrupt storm detection */ + desc->storm_checked = jiffies; + /* Setup the type (level, edge polarity) if configured: */ if (new->flags & IRQF_TRIGGER_MASK) { ret = __irq_set_trigger(desc, --- a/kernel/irq/spurious.c +++ b/kernel/irq/spurious.c @@ -21,6 +21,7 @@ static void poll_spurious_irqs(struct ti static DEFINE_TIMER(poll_spurious_irq_timer, poll_spurious_irqs); static int irq_poll_cpu; static atomic_t irq_poll_active; +static unsigned long irqstorm_limit __ro_after_init; /* * We wait here for a poller to finish. @@ -189,18 +190,21 @@ static inline int bad_action_ret(irqretu * (The other 100-of-100,000 interrupts may have been a correctly * functioning device sharing an IRQ with the failing one) */ -static void __report_bad_irq(struct irq_desc *desc, irqreturn_t action_ret) +static void __report_bad_irq(struct irq_desc *desc, irqreturn_t action_ret, + bool storm) { unsigned int irq = irq_desc_get_irq(desc); struct irqaction *action; unsigned long flags; - if (bad_action_ret(action_ret)) { - printk(KERN_ERR "irq event %d: bogus return value %x\n", - irq, action_ret); - } else { - printk(KERN_ERR "irq %d: nobody cared (try booting with " + if (!storm) { + if (bad_action_ret(action_ret)) { + pr_err("irq event %d: bogus return value %x\n", + irq, action_ret); + } else { + pr_err("irq %d: nobody cared (try booting with " "the \"irqpoll\" option)\n", irq); + } } dump_stack(); printk(KERN_ERR "handlers:\n"); @@ -228,7 +232,7 @@ static void report_bad_irq(struct irq_de if (count > 0) { count--; - __report_bad_irq(desc, action_ret); + __report_bad_irq(desc, action_ret, false); } } @@ -267,6 +271,33 @@ try_misrouted_irq(unsigned int irq, stru return action && (action->flags & IRQF_IRQPOLL); } +static void disable_stuck_irq(struct irq_desc *desc, irqreturn_t action_ret, + const char *reason, bool storm) +{ + __report_bad_irq(desc, action_ret, storm); + pr_emerg("Disabling %s IRQ #%d\n", reason, irq_desc_get_irq(desc)); + desc->istate |= IRQS_SPURIOUS_DISABLED; + desc->depth++; + irq_disable(desc); +} + +/* Interrupt storm detector for runaway interrupts (handled or not). */ +static bool irqstorm_detected(struct irq_desc *desc) +{ + unsigned long now = jiffies; + + if (++desc->storm_count < irqstorm_limit) { + if (time_after(now, desc->storm_checked + HZ)) { + desc->storm_count = 0; + desc->storm_checked = now; + } + return false; + } + + disable_stuck_irq(desc, IRQ_NONE, "runaway", true); + return true; +} + #define SPURIOUS_DEFERRED 0x80000000 void note_interrupt(struct irq_desc *desc, irqreturn_t action_ret) @@ -403,24 +434,16 @@ void note_interrupt(struct irq_desc *des desc->irqs_unhandled -= ok; } + if (unlikely(irqstorm_limit && irqstorm_detected(desc))) + return; + desc->irq_count++; if (likely(desc->irq_count < 100000)) return; desc->irq_count = 0; if (unlikely(desc->irqs_unhandled > 99900)) { - /* - * The interrupt is stuck - */ - __report_bad_irq(desc, action_ret); - /* - * Now kill the IRQ - */ - printk(KERN_EMERG "Disabling IRQ #%d\n", irq); - desc->istate |= IRQS_SPURIOUS_DISABLED; - desc->depth++; - irq_disable(desc); - + disable_stuck_irq(desc, action_ret, "unhandled", false); mod_timer(&poll_spurious_irq_timer, jiffies + POLL_SPURIOUS_IRQ_INTERVAL); } @@ -462,5 +485,16 @@ static int __init irqpoll_setup(char *st "performance\n"); return 1; } - __setup("irqpoll", irqpoll_setup); + +static int __init irqstorm_setup(char *arg) +{ + int res = kstrtoul(arg, 0, &irqstorm_limit); + + if (!res) { + pr_info("Interrupt storm detector enabled. Limit=%lu / s\n", + irqstorm_limit); + } + return !!res; +} +__setup("irqstorm_limit", irqstorm_setup); ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-28 11:58 ` Thomas Gleixner @ 2020-10-29 6:26 ` Pingfan Liu 2020-11-06 5:53 ` Pingfan Liu ` (2 subsequent siblings) 3 siblings, 0 replies; 24+ messages in thread From: Pingfan Liu @ 2020-10-29 6:26 UTC (permalink / raw) To: Thomas Gleixner Cc: Guilherme Piccoli, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List, Bjorn Helgaas On Wed, Oct 28, 2020 at 12:58:41PM +0100, Thomas Gleixner wrote: > On Wed, Oct 28 2020 at 14:02, Pingfan Liu wrote: > > On Tue, Oct 27, 2020 at 3:59 AM Thomas Gleixner <tglx@linutronix.de> wrote: > >> Also Liu's patch only works if: > >> > >> 1) CONFIG_IRQ_TIME_ACCOUNTING is enabled > > > > I wonder whether it can not be a default option or not by the following method: > > DEFINE_STATIC_KEY_FALSE(irqtime_account), and enable it according to > > a boot param. > > How so? > > config IRQ_TIME_ACCOUNTING > depends on HAVE_IRQ_TIME_ACCOUNTING && !VIRT_CPU_ACCOUNTING_NATIVE > Look closely at the two config value: -1. HAVE_IRQ_TIME_ACCOUNTING, it is selected by most of the popular arches, and can be further relaxed. It implies sched_clock() is fast enough for sampling. With current code, the variable sched_clock_irqtime=0 can be used to turn off irqtime accounting on some arches with slow sched_clock(). And it can be even better by using DEFINE_STATIC_KEY_FALSE(sched_clock_irqtime) So the pre-requirement can be relaxed as "depends on !VIRT_CPU_ACCOUNTING_NATIVE" In case that I can not express clearly, could you have a look at the demo patch? That patch _assumes_ that irqtime accounting costs much and is not turned on by default. If turned on, it will cost an extra jmp than current implement. And I think it is critical to my [1/3] whether this assumption is reasonable. -2. For VIRT_CPU_ACCOUNTING_NATIVE, it can only be selected by powerpc and ia64 In fact, I have a seperate patch for powerpc with CONFIG_VIRT_CPU_ACCOUNTING_NATIVE to utilize my [1/3]. --- diff --git a/init/Kconfig b/init/Kconfig index c944691..16d168b 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -490,7 +490,7 @@ endchoice config IRQ_TIME_ACCOUNTING bool "Fine granularity task level IRQ time accounting" - depends on HAVE_IRQ_TIME_ACCOUNTING && !VIRT_CPU_ACCOUNTING_NATIVE + depends on !VIRT_CPU_ACCOUNTING_NATIVE help Select this option to enable fine granularity task irq time accounting. This is done by reading a timestamp on each diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index 5a55d23..3ab7e1d 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -19,7 +19,7 @@ */ DEFINE_PER_CPU(struct irqtime, cpu_irqtime); -static int sched_clock_irqtime; +DEFINE_STATIC_KEY_FALSE(sched_clock_irqtime); void enable_sched_clock_irqtime(void) { @@ -49,13 +49,14 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta, */ void irqtime_account_irq(struct task_struct *curr) { - struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime); + struct irqtime *irqtime; s64 delta; int cpu; - if (!sched_clock_irqtime) + if (static_branch_unlikely(&sched_clock_irqtime)) return; + irqtime = this_cpu_ptr(&cpu_irqtime); cpu = smp_processor_id(); delta = sched_clock_cpu(cpu) - irqtime->irq_start_time; irqtime->irq_start_time += delta; @@ -84,16 +85,7 @@ static u64 irqtime_tick_accounted(u64 maxtime) return delta; } -#else /* CONFIG_IRQ_TIME_ACCOUNTING */ - -#define sched_clock_irqtime (0) - -static u64 irqtime_tick_accounted(u64 dummy) -{ - return 0; -} - -#endif /* !CONFIG_IRQ_TIME_ACCOUNTING */ +#endif static inline void task_group_account_field(struct task_struct *p, int index, u64 tmp) @@ -475,7 +467,7 @@ void account_process_tick(struct task_struct *p, int user_tick) if (vtime_accounting_enabled_this_cpu()) return; - if (sched_clock_irqtime) { + if (static_branch_unlikely(&sched_clock_irqtime)) irqtime_account_process_tick(p, user_tick, 1); return; } @@ -504,7 +496,7 @@ void account_idle_ticks(unsigned long ticks) { u64 cputime, steal; - if (sched_clock_irqtime) { + if (static_branch_unlikely(&sched_clock_irqtime)) irqtime_account_idle_ticks(ticks); return; } -- 2.7.5 [...] > + > +static int __init irqstorm_setup(char *arg) > +{ > + int res = kstrtoul(arg, 0, &irqstorm_limit); > + > + if (!res) { > + pr_info("Interrupt storm detector enabled. Limit=%lu / s\n", > + irqstorm_limit); > + } > + return !!res; > +} > +__setup("irqstorm_limit", irqstorm_setup); This configuration independent method looks appealing. And I am glad to have a try. But irqstorm_limit may be a hard choice. Maybe by formula: instruction-percpu-per-second / insn num of irq failed path ? It is hard to estimate "instruction-percpu-per-second". Thanks, Pingfan ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-28 11:58 ` Thomas Gleixner 2020-10-29 6:26 ` Pingfan Liu @ 2020-11-06 5:53 ` Pingfan Liu 2020-11-18 3:36 ` [PATCH 0/3] use soft lockup to detect irq flood Pingfan Liu 2021-03-02 7:45 ` [PATCH 0/3] warn and suppress irqflood Sai Prakash Ranjan 3 siblings, 0 replies; 24+ messages in thread From: Pingfan Liu @ 2020-11-06 5:53 UTC (permalink / raw) To: Thomas Gleixner Cc: Guilherme Piccoli, LKML, Peter Zijlstra, Jisheng Zhang, Andrew Morton, Petr Mladek, Marc Zyngier, Linus Walleij, afzal mohammed, Lina Iyer, Gustavo A. R. Silva, Maulik Shah, Al Viro, Jonathan Corbet, Pawan Gupta, Mike Kravetz, Oliver Neukum, linux-doc, Kexec Mailing List, Bjorn Helgaas On Wed, Oct 28, 2020 at 7:58 PM Thomas Gleixner <tglx@linutronix.de> wrote: > [...] > --- > include/linux/irqdesc.h | 4 ++ > kernel/irq/manage.c | 3 + > kernel/irq/spurious.c | 74 +++++++++++++++++++++++++++++++++++------------- > 3 files changed, 61 insertions(+), 20 deletions(-) > > --- a/include/linux/irqdesc.h > +++ b/include/linux/irqdesc.h > @@ -30,6 +30,8 @@ struct pt_regs; > * @tot_count: stats field for non-percpu irqs > * @irq_count: stats field to detect stalled irqs > * @last_unhandled: aging timer for unhandled count > + * @storm_count: Counter for irq storm detection > + * @storm_checked: Timestamp for irq storm detection > * @irqs_unhandled: stats field for spurious unhandled interrupts > * @threads_handled: stats field for deferred spurious detection of threaded handlers > * @threads_handled_last: comparator field for deferred spurious detection of theraded handlers > @@ -65,6 +67,8 @@ struct irq_desc { > unsigned int tot_count; > unsigned int irq_count; /* For detecting broken IRQs */ > unsigned long last_unhandled; /* Aging timer for unhandled count */ > + unsigned long storm_count; > + unsigned long storm_checked; > unsigned int irqs_unhandled; > atomic_t threads_handled; > int threads_handled_last; > --- a/kernel/irq/manage.c > +++ b/kernel/irq/manage.c > @@ -1581,6 +1581,9 @@ static int > if (!shared) { > init_waitqueue_head(&desc->wait_for_threads); > > + /* Take a timestamp for interrupt storm detection */ > + desc->storm_checked = jiffies; > + > /* Setup the type (level, edge polarity) if configured: */ > if (new->flags & IRQF_TRIGGER_MASK) { > ret = __irq_set_trigger(desc, > --- a/kernel/irq/spurious.c > +++ b/kernel/irq/spurious.c > @@ -21,6 +21,7 @@ static void poll_spurious_irqs(struct ti > static DEFINE_TIMER(poll_spurious_irq_timer, poll_spurious_irqs); > static int irq_poll_cpu; > static atomic_t irq_poll_active; > +static unsigned long irqstorm_limit __ro_after_init; > > /* > * We wait here for a poller to finish. > @@ -189,18 +190,21 @@ static inline int bad_action_ret(irqretu > * (The other 100-of-100,000 interrupts may have been a correctly > * functioning device sharing an IRQ with the failing one) > */ > -static void __report_bad_irq(struct irq_desc *desc, irqreturn_t action_ret) > +static void __report_bad_irq(struct irq_desc *desc, irqreturn_t action_ret, > + bool storm) > { > unsigned int irq = irq_desc_get_irq(desc); > struct irqaction *action; > unsigned long flags; > > - if (bad_action_ret(action_ret)) { > - printk(KERN_ERR "irq event %d: bogus return value %x\n", > - irq, action_ret); > - } else { > - printk(KERN_ERR "irq %d: nobody cared (try booting with " > + if (!storm) { > + if (bad_action_ret(action_ret)) { > + pr_err("irq event %d: bogus return value %x\n", > + irq, action_ret); > + } else { > + pr_err("irq %d: nobody cared (try booting with " > "the \"irqpoll\" option)\n", irq); > + } > } > dump_stack(); > printk(KERN_ERR "handlers:\n"); > @@ -228,7 +232,7 @@ static void report_bad_irq(struct irq_de > > if (count > 0) { > count--; > - __report_bad_irq(desc, action_ret); > + __report_bad_irq(desc, action_ret, false); > } > } > > @@ -267,6 +271,33 @@ try_misrouted_irq(unsigned int irq, stru > return action && (action->flags & IRQF_IRQPOLL); > } > > +static void disable_stuck_irq(struct irq_desc *desc, irqreturn_t action_ret, > + const char *reason, bool storm) > +{ > + __report_bad_irq(desc, action_ret, storm); > + pr_emerg("Disabling %s IRQ #%d\n", reason, irq_desc_get_irq(desc)); > + desc->istate |= IRQS_SPURIOUS_DISABLED; > + desc->depth++; > + irq_disable(desc); > +} > + > +/* Interrupt storm detector for runaway interrupts (handled or not). */ > +static bool irqstorm_detected(struct irq_desc *desc) > +{ > + unsigned long now = jiffies; > + > + if (++desc->storm_count < irqstorm_limit) { > + if (time_after(now, desc->storm_checked + HZ)) { > + desc->storm_count = 0; > + desc->storm_checked = now; > + } > + return false; > + } > + > + disable_stuck_irq(desc, IRQ_NONE, "runaway", true); > + return true; > +} > + > #define SPURIOUS_DEFERRED 0x80000000 > > void note_interrupt(struct irq_desc *desc, irqreturn_t action_ret) > @@ -403,24 +434,16 @@ void note_interrupt(struct irq_desc *des > desc->irqs_unhandled -= ok; > } > > + if (unlikely(irqstorm_limit && irqstorm_detected(desc))) > + return; > + > desc->irq_count++; > if (likely(desc->irq_count < 100000)) > return; > > desc->irq_count = 0; > if (unlikely(desc->irqs_unhandled > 99900)) { > - /* > - * The interrupt is stuck > - */ > - __report_bad_irq(desc, action_ret); > - /* > - * Now kill the IRQ > - */ > - printk(KERN_EMERG "Disabling IRQ #%d\n", irq); > - desc->istate |= IRQS_SPURIOUS_DISABLED; > - desc->depth++; > - irq_disable(desc); > - > + disable_stuck_irq(desc, action_ret, "unhandled", false); > mod_timer(&poll_spurious_irq_timer, > jiffies + POLL_SPURIOUS_IRQ_INTERVAL); > } > @@ -462,5 +485,16 @@ static int __init irqpoll_setup(char *st > "performance\n"); > return 1; > } > - > __setup("irqpoll", irqpoll_setup); > + > +static int __init irqstorm_setup(char *arg) > +{ > + int res = kstrtoul(arg, 0, &irqstorm_limit); > + > + if (!res) { > + pr_info("Interrupt storm detector enabled. Limit=%lu / s\n", > + irqstorm_limit); > + } > + return !!res; > +} > +__setup("irqstorm_limit", irqstorm_setup); It should be __setup("irqstorm_limit=", irqstorm_setup); And I have tested this patch on the P9 machine, where I set the limit to 70000. It works for kdump kernel. Thanks, Pingfan ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 0/3] use soft lockup to detect irq flood 2020-10-28 11:58 ` Thomas Gleixner 2020-10-29 6:26 ` Pingfan Liu 2020-11-06 5:53 ` Pingfan Liu @ 2020-11-18 3:36 ` Pingfan Liu 2020-11-18 3:36 ` [PATCH 1/3] x86/irq: account the unused irq Pingfan Liu ` (2 more replies) 2021-03-02 7:45 ` [PATCH 0/3] warn and suppress irqflood Sai Prakash Ranjan 3 siblings, 3 replies; 24+ messages in thread From: Pingfan Liu @ 2020-11-18 3:36 UTC (permalink / raw) To: linux-kernel Cc: Pingfan Liu, Thomas Gleixner, Jisheng Zhang, Peter Zijlstra (Intel), Vlastimil Babka, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, kexec The chipset's pins are often multi-used because greedy usage compares to a finite pin number. This requests a carefully configuration in firmware or OS. If not, it may contribute to most of irq flood cases, which appears as a soft lockup issue on Linux. Strictly speaking, soft lockup and irq flood are different things to overcome. And it is helpful to make users aware of that situation for prime time. This series shows the irq statistics when soft lockup. The statistics can be used to evaluate the possibility of irq flood and as a rough evaluated input to the kernel parameter "irqstorm_limit" in [1]. It is not easy to find a common way to work around irq flood, which may be raised by different root cause. To now, it is still a open question. Thomas and Guilherme suggested patches to suppress the odd irq in different situation. [1].[2] [1]: https://lore.kernel.org/lkml/87tuueftou.fsf@nanos.tec.linutronix.de/ [2]: https://lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com/ Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Petr Mladek <pmladek@suse.com> Cc: kexec@lists.infradead.org To: linux-kernel@vger.kernel.org Pingfan Liu (3): x86/irq: account the unused irq kernel/watchdog: make watchdog_touch_ts more accurate by using nanosecond kernel/watchdog: use soft lockup to detect irq flood arch/x86/kernel/irq.c | 1 + include/linux/kernel_stat.h | 1 + kernel/watchdog.c | 37 ++++++++++++++++++++++++++----------- 3 files changed, 28 insertions(+), 11 deletions(-) -- 2.7.5 ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/3] x86/irq: account the unused irq 2020-11-18 3:36 ` [PATCH 0/3] use soft lockup to detect irq flood Pingfan Liu @ 2020-11-18 3:36 ` Pingfan Liu 2020-11-18 3:36 ` [PATCH 2/3] kernel/watchdog: make watchdog_touch_ts more accurate by using nanosecond Pingfan Liu 2020-11-18 3:36 ` [PATCH 3/3] kernel/watchdog: use soft lockup to detect irq flood Pingfan Liu 2 siblings, 0 replies; 24+ messages in thread From: Pingfan Liu @ 2020-11-18 3:36 UTC (permalink / raw) To: linux-kernel Cc: Pingfan Liu, Thomas Gleixner, Jisheng Zhang, Peter Zijlstra (Intel), Vlastimil Babka, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, kexec Accounting the unused irq in order to count it if irq flood. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Petr Mladek <pmladek@suse.com> Cc: kexec@lists.infradead.org To: linux-kernel@vger.kernel.org --- arch/x86/kernel/irq.c | 1 + include/linux/kernel_stat.h | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index c5dd503..6f583a7 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -254,6 +254,7 @@ DEFINE_IDTENTRY_IRQ(common_interrupt) pr_emerg_ratelimited("%s: %d.%u No irq handler for vector\n", __func__, smp_processor_id(), vector); + __this_cpu_inc(kstat.unused_irqs_sum); } else { __this_cpu_write(vector_irq[vector], VECTOR_UNUSED); } diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h index 89f0745..c8d5cb8 100644 --- a/include/linux/kernel_stat.h +++ b/include/linux/kernel_stat.h @@ -37,6 +37,7 @@ struct kernel_cpustat { struct kernel_stat { unsigned long irqs_sum; + unsigned long unused_irqs_sum; unsigned int softirqs[NR_SOFTIRQS]; }; -- 2.7.5 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/3] kernel/watchdog: make watchdog_touch_ts more accurate by using nanosecond 2020-11-18 3:36 ` [PATCH 0/3] use soft lockup to detect irq flood Pingfan Liu 2020-11-18 3:36 ` [PATCH 1/3] x86/irq: account the unused irq Pingfan Liu @ 2020-11-18 3:36 ` Pingfan Liu 2020-11-18 3:36 ` [PATCH 3/3] kernel/watchdog: use soft lockup to detect irq flood Pingfan Liu 2 siblings, 0 replies; 24+ messages in thread From: Pingfan Liu @ 2020-11-18 3:36 UTC (permalink / raw) To: linux-kernel Cc: Pingfan Liu, Thomas Gleixner, Jisheng Zhang, Peter Zijlstra (Intel), Vlastimil Babka, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, kexec The incoming statistics of irq average number will base on the delta of watchdog_touch_ts. Improving the accuracy of watchdog_touch_ts from second to nanosecond can help improve the accuracy of the statistics. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Petr Mladek <pmladek@suse.com> Cc: kexec@lists.infradead.org To: linux-kernel@vger.kernel.org --- kernel/watchdog.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 5abb5b2..1cc619a 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -207,7 +207,7 @@ static void __lockup_detector_cleanup(void); * the thresholds with a factor: we make the soft threshold twice the amount of * time the hard threshold is. */ -static int get_softlockup_thresh(void) +static unsigned int get_softlockup_thresh(void) { return watchdog_thresh * 2; } @@ -217,9 +217,9 @@ static int get_softlockup_thresh(void) * resolution, and we don't need to waste time with a big divide when * 2^30ns == 1.074s. */ -static unsigned long get_timestamp(void) +static unsigned long convert_seconds(unsigned long ns) { - return running_clock() >> 30LL; /* 2^30 ~= 10^9 */ + return ns >> 30LL; /* 2^30 ~= 10^9 */ } static void set_sample_period(void) @@ -238,7 +238,7 @@ static void set_sample_period(void) /* Commands for resetting the watchdog */ static void __touch_watchdog(void) { - __this_cpu_write(watchdog_touch_ts, get_timestamp()); + __this_cpu_write(watchdog_touch_ts, running_clock()); } /** @@ -289,14 +289,15 @@ void touch_softlockup_watchdog_sync(void) __this_cpu_write(watchdog_touch_ts, SOFTLOCKUP_RESET); } -static int is_softlockup(unsigned long touch_ts) +static unsigned long is_softlockup(unsigned long touch_ts) { - unsigned long now = get_timestamp(); + unsigned long span, now = running_clock(); + span = now - touch_ts; if ((watchdog_enabled & SOFT_WATCHDOG_ENABLED) && watchdog_thresh){ /* Warn about unreasonable delays. */ - if (time_after(now, touch_ts + get_softlockup_thresh())) - return now - touch_ts; + if (time_after(convert_seconds(span), (unsigned long)get_softlockup_thresh())) + return span; } return 0; } @@ -340,9 +341,8 @@ static int softlockup_fn(void *data) /* watchdog kicker functions */ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) { - unsigned long touch_ts = __this_cpu_read(watchdog_touch_ts); + unsigned long duration, touch_ts = __this_cpu_read(watchdog_touch_ts); struct pt_regs *regs = get_irq_regs(); - int duration; int softlockup_all_cpu_backtrace = sysctl_softlockup_all_cpu_backtrace; if (!watchdog_enabled) @@ -410,7 +410,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) } pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n", - smp_processor_id(), duration, + smp_processor_id(), (unsigned int)convert_seconds(duration), current->comm, task_pid_nr(current)); print_modules(); print_irqtrace_events(current); -- 2.7.5 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 3/3] kernel/watchdog: use soft lockup to detect irq flood 2020-11-18 3:36 ` [PATCH 0/3] use soft lockup to detect irq flood Pingfan Liu 2020-11-18 3:36 ` [PATCH 1/3] x86/irq: account the unused irq Pingfan Liu 2020-11-18 3:36 ` [PATCH 2/3] kernel/watchdog: make watchdog_touch_ts more accurate by using nanosecond Pingfan Liu @ 2020-11-18 3:36 ` Pingfan Liu 2 siblings, 0 replies; 24+ messages in thread From: Pingfan Liu @ 2020-11-18 3:36 UTC (permalink / raw) To: linux-kernel Cc: Pingfan Liu, Thomas Gleixner, Jisheng Zhang, Peter Zijlstra (Intel), Vlastimil Babka, Andrew Morton, Guilherme G. Piccoli, Petr Mladek, kexec When irq flood happens, interrupt handler occupies all of the cpu time. This results in a situation where soft lockup can be observed, although it is different from the design purpose of soft lockup. In order to distinguish this situation, it is helpful to print out the statistics of irq frequency when warning soft lockup to evaluate the potential irq flood. Thomas and Guilherme suggested patches to suppress the odd irq in different situation. [1].[2]. But it seems to be an open question in a near future. For now, it had better print some hints for users than nothing. [1]: https://lore.kernel.org/lkml/87tuueftou.fsf@nanos.tec.linutronix.de/ [2]: https://lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com/ Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Petr Mladek <pmladek@suse.com> Cc: kexec@lists.infradead.org To: linux-kernel@vger.kernel.org --- kernel/watchdog.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 1cc619a..a0ab2a8 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -23,6 +23,7 @@ #include <linux/sched/debug.h> #include <linux/sched/isolation.h> #include <linux/stop_machine.h> +#include <linux/kernel_stat.h> #include <asm/irq_regs.h> #include <linux/kvm_para.h> @@ -175,6 +176,9 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync); static DEFINE_PER_CPU(bool, soft_watchdog_warn); static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts); static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved); +static DEFINE_PER_CPU(unsigned long, last_irq_sum); +static DEFINE_PER_CPU(unsigned long, last_unused_irq_sum); + static unsigned long soft_lockup_nmi_warn; static int __init nowatchdog_setup(char *str) @@ -353,6 +357,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) /* kick the softlockup detector */ if (completion_done(this_cpu_ptr(&softlockup_completion))) { + __this_cpu_write(last_irq_sum, kstat_this_cpu->irqs_sum); + __this_cpu_write(last_unused_irq_sum, kstat_this_cpu->unused_irqs_sum); reinit_completion(this_cpu_ptr(&softlockup_completion)); stop_one_cpu_nowait(smp_processor_id(), softlockup_fn, NULL, @@ -386,6 +392,9 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) */ duration = is_softlockup(touch_ts); if (unlikely(duration)) { + unsigned long irq_sum, unused_irq_sum; + unsigned int seconds; + /* * If a virtual machine is stopped by the host it can look to * the watchdog like a soft lockup, check to see if the host @@ -409,9 +418,15 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) } } + irq_sum = kstat_this_cpu->irqs_sum - __this_cpu_read(last_irq_sum); + unused_irq_sum = kstat_this_cpu->unused_irqs_sum - + __this_cpu_read(last_unused_irq_sum); + seconds = (unsigned int)convert_seconds(duration); pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n", - smp_processor_id(), (unsigned int)convert_seconds(duration), + smp_processor_id(), seconds, current->comm, task_pid_nr(current)); + pr_emerg("%lu irqs at rate: %lu / s, %lu unused irq at rate: %lu / s\n", + irq_sum, irq_sum/seconds, unused_irq_sum, unused_irq_sum/seconds); print_modules(); print_irqtrace_events(current); if (regs) -- 2.7.5 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2020-10-28 11:58 ` Thomas Gleixner ` (2 preceding siblings ...) 2020-11-18 3:36 ` [PATCH 0/3] use soft lockup to detect irq flood Pingfan Liu @ 2021-03-02 7:45 ` Sai Prakash Ranjan 2021-06-05 2:32 ` Sai Prakash Ranjan 3 siblings, 1 reply; 24+ messages in thread From: Sai Prakash Ranjan @ 2021-03-02 7:45 UTC (permalink / raw) To: tglx Cc: Jisheng.Zhang, afzal.mohd.ma, akpm, corbet, gpiccoli, gustavo, helgaas, ilina, kernelfans, kexec, linus.walleij, linux-doc, linux-kernel, maz, mike.kravetz, mkshah, oneukum, pawan.kumar.gupta, peterz, pmladek, viro, Sai Prakash Ranjan Hi Thomas, > On Wed, Oct 28, 2020 at 12:58:41PM +0100, Thomas Gleixner wrote: > <snip>... > Something like the completly untested below should work independent of > config options. > > Thanks, > > tglx > --- > include/linux/irqdesc.h | 4 ++ > kernel/irq/manage.c | 3 + > kernel/irq/spurious.c | 74 +++++++++++++++++++++++++++++++++++------------- > 3 files changed, 61 insertions(+), 20 deletions(-) > > --- a/include/linux/irqdesc.h > +++ b/include/linux/irqdesc.h > @@ -30,6 +30,8 @@ struct pt_regs; > * @tot_count: stats field for non-percpu irqs > * @irq_count: stats field to detect stalled irqs > * @last_unhandled: aging timer for unhandled count > + * @storm_count: Counter for irq storm detection > + * @storm_checked: Timestamp for irq storm detection > * @irqs_unhandled: stats field for spurious unhandled interrupts > * @threads_handled: stats field for deferred spurious detection of threaded handlers > * @threads_handled_last: comparator field for deferred spurious detection of theraded handlers > @@ -65,6 +67,8 @@ struct irq_desc { > unsigned int tot_count; > unsigned int irq_count; /* For detecting broken IRQs */ > unsigned long last_unhandled; /* Aging timer for unhandled count */ > + unsigned long storm_count; > + unsigned long storm_checked; > unsigned int irqs_unhandled; > atomic_t threads_handled; > int threads_handled_last; > --- a/kernel/irq/manage.c > +++ b/kernel/irq/manage.c > @@ -1581,6 +1581,9 @@ static int > if (!shared) { > init_waitqueue_head(&desc->wait_for_threads); > > + /* Take a timestamp for interrupt storm detection */ > + desc->storm_checked = jiffies; > + > /* Setup the type (level, edge polarity) if configured: */ > if (new->flags & IRQF_TRIGGER_MASK) { > ret = __irq_set_trigger(desc, > --- a/kernel/irq/spurious.c > +++ b/kernel/irq/spurious.c > @@ -21,6 +21,7 @@ static void poll_spurious_irqs(struct ti > static DEFINE_TIMER(poll_spurious_irq_timer, poll_spurious_irqs); > static int irq_poll_cpu; > static atomic_t irq_poll_active; > +static unsigned long irqstorm_limit __ro_after_init; > > /* > * We wait here for a poller to finish. > @@ -189,18 +190,21 @@ static inline int bad_action_ret(irqretu > * (The other 100-of-100,000 interrupts may have been a correctly > * functioning device sharing an IRQ with the failing one) > */ > -static void __report_bad_irq(struct irq_desc *desc, irqreturn_t action_ret) > +static void __report_bad_irq(struct irq_desc *desc, irqreturn_t action_ret, > + bool storm) > { > unsigned int irq = irq_desc_get_irq(desc); > struct irqaction *action; > unsigned long flags; > > - if (bad_action_ret(action_ret)) { > - printk(KERN_ERR "irq event %d: bogus return value %x\n", > - irq, action_ret); > - } else { > - printk(KERN_ERR "irq %d: nobody cared (try booting with " > + if (!storm) { > + if (bad_action_ret(action_ret)) { > + pr_err("irq event %d: bogus return value %x\n", > + irq, action_ret); > + } else { > + pr_err("irq %d: nobody cared (try booting with " > "the \"irqpoll\" option)\n", irq); > + } > } > dump_stack(); > printk(KERN_ERR "handlers:\n"); > @@ -228,7 +232,7 @@ static void report_bad_irq(struct irq_de > > if (count > 0) { > count--; > - __report_bad_irq(desc, action_ret); > + __report_bad_irq(desc, action_ret, false); > } > } > > @@ -267,6 +271,33 @@ try_misrouted_irq(unsigned int irq, stru > return action && (action->flags & IRQF_IRQPOLL); > } > > +static void disable_stuck_irq(struct irq_desc *desc, irqreturn_t action_ret, > + const char *reason, bool storm) > +{ > + __report_bad_irq(desc, action_ret, storm); > + pr_emerg("Disabling %s IRQ #%d\n", reason, irq_desc_get_irq(desc)); > + desc->istate |= IRQS_SPURIOUS_DISABLED; > + desc->depth++; > + irq_disable(desc); > +} > + > +/* Interrupt storm detector for runaway interrupts (handled or not). */ > +static bool irqstorm_detected(struct irq_desc *desc) > +{ > + unsigned long now = jiffies; > + > + if (++desc->storm_count < irqstorm_limit) { > + if (time_after(now, desc->storm_checked + HZ)) { > + desc->storm_count = 0; > + desc->storm_checked = now; > + } > + return false; > + } > + > + disable_stuck_irq(desc, IRQ_NONE, "runaway", true); > + return true; > +} > + > #define SPURIOUS_DEFERRED 0x80000000 > > void note_interrupt(struct irq_desc *desc, irqreturn_t action_ret) > @@ -403,24 +434,16 @@ void note_interrupt(struct irq_desc *des > desc->irqs_unhandled -= ok; > } > > + if (unlikely(irqstorm_limit && irqstorm_detected(desc))) > + return; > + > desc->irq_count++; > if (likely(desc->irq_count < 100000)) > return; > > desc->irq_count = 0; > if (unlikely(desc->irqs_unhandled > 99900)) { > - /* > - * The interrupt is stuck > - */ > - __report_bad_irq(desc, action_ret); > - /* > - * Now kill the IRQ > - */ > - printk(KERN_EMERG "Disabling IRQ #%d\n", irq); > - desc->istate |= IRQS_SPURIOUS_DISABLED; > - desc->depth++; > - irq_disable(desc); > - > + disable_stuck_irq(desc, action_ret, "unhandled", false); > mod_timer(&poll_spurious_irq_timer, > jiffies + POLL_SPURIOUS_IRQ_INTERVAL); > } > @@ -462,5 +485,16 @@ static int __init irqpoll_setup(char *st > "performance\n"); > return 1; > } > - > __setup("irqpoll", irqpoll_setup); > + > +static int __init irqstorm_setup(char *arg) > +{ > + int res = kstrtoul(arg, 0, &irqstorm_limit); > + > + if (!res) { > + pr_info("Interrupt storm detector enabled. Limit=%lu / s\n", > + irqstorm_limit); > + } > + return !!res; > +} > +__setup("irqstorm_limit", irqstorm_setup); This irq storm detection feature is very useful, any chance to get this merged? We will be happy to test. People seem to be having their own copy of such feature out-of-tree [1]. [1] https://elinux.org/images/d/de/Oct28_InterruptStormDetectionFeature_KentoKobayashi.pdf Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] warn and suppress irqflood 2021-03-02 7:45 ` [PATCH 0/3] warn and suppress irqflood Sai Prakash Ranjan @ 2021-06-05 2:32 ` Sai Prakash Ranjan 0 siblings, 0 replies; 24+ messages in thread From: Sai Prakash Ranjan @ 2021-06-05 2:32 UTC (permalink / raw) To: tglx Cc: Jisheng.Zhang, afzal.mohd.ma, akpm, corbet, gpiccoli, gustavo, helgaas, ilina, kernelfans, kexec, linus.walleij, linux-doc, linux-kernel, maz, mike.kravetz, mkshah, oneukum, pawan.kumar.gupta, peterz, pmladek, viro Hi Thomas, On 2021-03-02 13:15, Sai Prakash Ranjan wrote: > Hi Thomas, > >> On Wed, Oct 28, 2020 at 12:58:41PM +0100, Thomas Gleixner wrote: >> > > <snip>... > >> Something like the completly untested below should work independent of >> config options. >> >> Thanks, >> >> tglx >> --- >> include/linux/irqdesc.h | 4 ++ >> kernel/irq/manage.c | 3 + >> kernel/irq/spurious.c | 74 >> +++++++++++++++++++++++++++++++++++------------- >> 3 files changed, 61 insertions(+), 20 deletions(-) >> >> --- a/include/linux/irqdesc.h >> +++ b/include/linux/irqdesc.h >> @@ -30,6 +30,8 @@ struct pt_regs; >> * @tot_count: stats field for non-percpu irqs >> * @irq_count: stats field to detect stalled irqs >> * @last_unhandled: aging timer for unhandled count >> + * @storm_count: Counter for irq storm detection >> + * @storm_checked: Timestamp for irq storm detection >> * @irqs_unhandled: stats field for spurious unhandled interrupts >> * @threads_handled: stats field for deferred spurious detection of >> threaded handlers >> * @threads_handled_last: comparator field for deferred spurious >> detection of theraded handlers >> @@ -65,6 +67,8 @@ struct irq_desc { >> unsigned int tot_count; >> unsigned int irq_count; /* For detecting broken IRQs */ >> unsigned long last_unhandled; /* Aging timer for unhandled count */ >> + unsigned long storm_count; >> + unsigned long storm_checked; >> unsigned int irqs_unhandled; >> atomic_t threads_handled; >> int threads_handled_last; >> --- a/kernel/irq/manage.c >> +++ b/kernel/irq/manage.c >> @@ -1581,6 +1581,9 @@ static int >> if (!shared) { >> init_waitqueue_head(&desc->wait_for_threads); >> >> + /* Take a timestamp for interrupt storm detection */ >> + desc->storm_checked = jiffies; >> + >> /* Setup the type (level, edge polarity) if configured: */ >> if (new->flags & IRQF_TRIGGER_MASK) { >> ret = __irq_set_trigger(desc, >> --- a/kernel/irq/spurious.c >> +++ b/kernel/irq/spurious.c >> @@ -21,6 +21,7 @@ static void poll_spurious_irqs(struct ti >> static DEFINE_TIMER(poll_spurious_irq_timer, poll_spurious_irqs); >> static int irq_poll_cpu; >> static atomic_t irq_poll_active; >> +static unsigned long irqstorm_limit __ro_after_init; >> >> /* >> * We wait here for a poller to finish. >> @@ -189,18 +190,21 @@ static inline int bad_action_ret(irqretu >> * (The other 100-of-100,000 interrupts may have been a correctly >> * functioning device sharing an IRQ with the failing one) >> */ >> -static void __report_bad_irq(struct irq_desc *desc, irqreturn_t >> action_ret) >> +static void __report_bad_irq(struct irq_desc *desc, irqreturn_t >> action_ret, >> + bool storm) >> { >> unsigned int irq = irq_desc_get_irq(desc); >> struct irqaction *action; >> unsigned long flags; >> >> - if (bad_action_ret(action_ret)) { >> - printk(KERN_ERR "irq event %d: bogus return value %x\n", >> - irq, action_ret); >> - } else { >> - printk(KERN_ERR "irq %d: nobody cared (try booting with " >> + if (!storm) { >> + if (bad_action_ret(action_ret)) { >> + pr_err("irq event %d: bogus return value %x\n", >> + irq, action_ret); >> + } else { >> + pr_err("irq %d: nobody cared (try booting with " >> "the \"irqpoll\" option)\n", irq); >> + } >> } >> dump_stack(); >> printk(KERN_ERR "handlers:\n"); >> @@ -228,7 +232,7 @@ static void report_bad_irq(struct irq_de >> >> if (count > 0) { >> count--; >> - __report_bad_irq(desc, action_ret); >> + __report_bad_irq(desc, action_ret, false); >> } >> } >> >> @@ -267,6 +271,33 @@ try_misrouted_irq(unsigned int irq, stru >> return action && (action->flags & IRQF_IRQPOLL); >> } >> >> +static void disable_stuck_irq(struct irq_desc *desc, irqreturn_t >> action_ret, >> + const char *reason, bool storm) >> +{ >> + __report_bad_irq(desc, action_ret, storm); >> + pr_emerg("Disabling %s IRQ #%d\n", reason, irq_desc_get_irq(desc)); >> + desc->istate |= IRQS_SPURIOUS_DISABLED; >> + desc->depth++; >> + irq_disable(desc); >> +} >> + >> +/* Interrupt storm detector for runaway interrupts (handled or not). >> */ >> +static bool irqstorm_detected(struct irq_desc *desc) >> +{ >> + unsigned long now = jiffies; >> + >> + if (++desc->storm_count < irqstorm_limit) { >> + if (time_after(now, desc->storm_checked + HZ)) { >> + desc->storm_count = 0; >> + desc->storm_checked = now; >> + } >> + return false; >> + } >> + >> + disable_stuck_irq(desc, IRQ_NONE, "runaway", true); >> + return true; >> +} >> + >> #define SPURIOUS_DEFERRED 0x80000000 >> >> void note_interrupt(struct irq_desc *desc, irqreturn_t action_ret) >> @@ -403,24 +434,16 @@ void note_interrupt(struct irq_desc *des >> desc->irqs_unhandled -= ok; >> } >> >> + if (unlikely(irqstorm_limit && irqstorm_detected(desc))) >> + return; >> + >> desc->irq_count++; >> if (likely(desc->irq_count < 100000)) >> return; >> >> desc->irq_count = 0; >> if (unlikely(desc->irqs_unhandled > 99900)) { >> - /* >> - * The interrupt is stuck >> - */ >> - __report_bad_irq(desc, action_ret); >> - /* >> - * Now kill the IRQ >> - */ >> - printk(KERN_EMERG "Disabling IRQ #%d\n", irq); >> - desc->istate |= IRQS_SPURIOUS_DISABLED; >> - desc->depth++; >> - irq_disable(desc); >> - >> + disable_stuck_irq(desc, action_ret, "unhandled", false); >> mod_timer(&poll_spurious_irq_timer, >> jiffies + POLL_SPURIOUS_IRQ_INTERVAL); >> } >> @@ -462,5 +485,16 @@ static int __init irqpoll_setup(char *st >> "performance\n"); >> return 1; >> } >> - >> __setup("irqpoll", irqpoll_setup); >> + >> +static int __init irqstorm_setup(char *arg) >> +{ >> + int res = kstrtoul(arg, 0, &irqstorm_limit); >> + >> + if (!res) { >> + pr_info("Interrupt storm detector enabled. Limit=%lu / s\n", >> + irqstorm_limit); >> + } >> + return !!res; >> +} >> +__setup("irqstorm_limit", irqstorm_setup); > > This irq storm detection feature is very useful, any chance to get this > merged? > We will be happy to test. People seem to be having their own copy of > such feature > out-of-tree [1]. > > [1] > https://elinux.org/images/d/de/Oct28_InterruptStormDetectionFeature_KentoKobayashi.pdf > Any chance of having this useful debug feature in upstream kernel? Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2021-06-05 2:32 UTC | newest] Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-22 5:56 [PATCH 0/3] warn and suppress irqflood Pingfan Liu 2020-10-22 5:56 ` [PATCH 1/3] kernel/watchdog: show irq percentage if irq floods Pingfan Liu 2020-10-22 5:56 ` [PATCH 2/3] kernel/watchdog: suppress max irq when " Pingfan Liu 2020-10-22 5:56 ` [PATCH 3/3] Documentation: introduce a param "irqflood_suppress" Pingfan Liu 2020-10-22 8:37 ` [PATCH 0/3] warn and suppress irqflood Thomas Gleixner 2020-10-25 11:12 ` Pingfan Liu 2020-10-25 12:21 ` [Skiboot] " Oliver O'Halloran 2020-10-25 13:11 ` Pingfan Liu 2020-10-25 13:51 ` Oliver O'Halloran 2020-10-26 15:06 ` Guilherme Piccoli 2020-10-26 19:59 ` Thomas Gleixner 2020-10-26 20:28 ` Guilherme Piccoli 2020-10-26 21:21 ` Thomas Gleixner 2020-10-27 12:28 ` Guilherme Piccoli 2020-10-28 6:02 ` Pingfan Liu 2020-10-28 11:58 ` Thomas Gleixner 2020-10-29 6:26 ` Pingfan Liu 2020-11-06 5:53 ` Pingfan Liu 2020-11-18 3:36 ` [PATCH 0/3] use soft lockup to detect irq flood Pingfan Liu 2020-11-18 3:36 ` [PATCH 1/3] x86/irq: account the unused irq Pingfan Liu 2020-11-18 3:36 ` [PATCH 2/3] kernel/watchdog: make watchdog_touch_ts more accurate by using nanosecond Pingfan Liu 2020-11-18 3:36 ` [PATCH 3/3] kernel/watchdog: use soft lockup to detect irq flood Pingfan Liu 2021-03-02 7:45 ` [PATCH 0/3] warn and suppress irqflood Sai Prakash Ranjan 2021-06-05 2:32 ` Sai Prakash Ranjan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).