linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Guilherme Piccoli <gpiccoli@canonical.com>,
	Pingfan Liu <kernelfans@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Jisheng Zhang <Jisheng.Zhang@synaptics.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Petr Mladek <pmladek@suse.com>, Marc Zyngier <maz@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	afzal mohammed <afzal.mohd.ma@gmail.com>,
	Lina Iyer <ilina@codeaurora.org>,
	"Gustavo A. R. Silva" <gustavo@embeddedor.com>,
	Maulik Shah <mkshah@codeaurora.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Jonathan Corbet <corbet@lwn.net>,
	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Oliver Neukum <oneukum@suse.com>,
	linux-doc@vger.kernel.org,
	Kexec Mailing List <kexec@lists.infradead.org>,
	Bjorn Helgaas <helgaas@kernel.org>
Subject: Re: [PATCH 0/3] warn and suppress irqflood
Date: Mon, 26 Oct 2020 20:59:56 +0100	[thread overview]
Message-ID: <87y2js3ghv.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <CAHD1Q_x99XW1zDr5HpVR27F_ksHLkaxc2W83e-N6F_xLYKyGbQ@mail.gmail.com>

On Mon, Oct 26 2020 at 12:06, Guilherme Piccoli wrote:
> On Sun, Oct 25, 2020 at 8:12 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> Some time ago (2 years) we faced a similar issue in x86-64, a hard to
> debug problem in kdump, that eventually was narrowed to a buggy NIC FW
> flooding IRQs in kdump kernel, and no messages showed (although kernel
> changed a lot since that time, today we might have better IRQ
> handling/warning). We tried an early-boot fix, by disabling MSIs (as
> per PCI spec) early in x86 boot, but it wasn't accepted - Bjorn asked
> pertinent questions that I couldn't respond (I lost the reproducer)
> [0].
...
> [0] lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com

With that broken firmware the NIC continued to send MSI messages to the
vector/CPU which was assigned to it before the crash. But the crash
kernel has no interrupt descriptor for this vector installed. So Liu's
patches wont print anything simply because the interrupt core cannot
detect it.

To answer Bjorns still open question about when the point X is:

  https://lore.kernel.org/linux-pci/20181023170343.GA4587@bhelgaas-glaptop.roam.corp.google.com/

It gets flooded right at the point where the crash kernel enables
interrupts in start_kernel(). At that point there is no device driver
and no interupt requested. All you can see on the console for this is

 "common_interrupt: $VECTOR.$CPU No irq handler for vector"

And contrary to Liu's patches which try to disable a requested interrupt
if too many of them arrive, the kernel cannot do anything because there
is nothing to disable in your case. That's why you needed to do the MSI
disable magic in the early PCI quirks which run before interrupts get
enabled.

Also Liu's patch only works if:

  1) CONFIG_IRQ_TIME_ACCOUNTING is enabled

  2) the runaway interrupt has been requested by the relevant driver in
     the dump kernel.

Especially #1 is not a sensible restriction.

Thanks,

        tglx



  reply	other threads:[~2020-10-26 20:02 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-22  5:56 [PATCH 0/3] warn and suppress irqflood Pingfan Liu
2020-10-22  5:56 ` [PATCH 1/3] kernel/watchdog: show irq percentage if irq floods Pingfan Liu
2020-10-22  5:56 ` [PATCH 2/3] kernel/watchdog: suppress max irq when " Pingfan Liu
2020-10-22  5:56 ` [PATCH 3/3] Documentation: introduce a param "irqflood_suppress" Pingfan Liu
2020-10-22  8:37 ` [PATCH 0/3] warn and suppress irqflood Thomas Gleixner
2020-10-25 11:12   ` Pingfan Liu
2020-10-25 12:21     ` [Skiboot] " Oliver O'Halloran
2020-10-25 13:11       ` Pingfan Liu
2020-10-25 13:51         ` Oliver O'Halloran
2020-10-26 15:06     ` Guilherme Piccoli
2020-10-26 19:59       ` Thomas Gleixner [this message]
2020-10-26 20:28         ` Guilherme Piccoli
2020-10-26 21:21           ` Thomas Gleixner
2020-10-27 12:28             ` Guilherme Piccoli
2020-10-28  6:02         ` Pingfan Liu
2020-10-28 11:58           ` Thomas Gleixner
2020-10-29  6:26             ` Pingfan Liu
2020-11-06  5:53             ` Pingfan Liu
2020-11-18  3:36             ` [PATCH 0/3] use soft lockup to detect irq flood Pingfan Liu
2020-11-18  3:36               ` [PATCH 1/3] x86/irq: account the unused irq Pingfan Liu
2020-11-18  3:36               ` [PATCH 2/3] kernel/watchdog: make watchdog_touch_ts more accurate by using nanosecond Pingfan Liu
2020-11-18  3:36               ` [PATCH 3/3] kernel/watchdog: use soft lockup to detect irq flood Pingfan Liu
2021-03-02  7:45             ` [PATCH 0/3] warn and suppress irqflood Sai Prakash Ranjan
2021-06-05  2:32               ` Sai Prakash Ranjan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y2js3ghv.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=Jisheng.Zhang@synaptics.com \
    --cc=afzal.mohd.ma@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=gpiccoli@canonical.com \
    --cc=gustavo@embeddedor.com \
    --cc=helgaas@kernel.org \
    --cc=ilina@codeaurora.org \
    --cc=kernelfans@gmail.com \
    --cc=kexec@lists.infradead.org \
    --cc=linus.walleij@linaro.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=mkshah@codeaurora.org \
    --cc=oneukum@suse.com \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).