linux-kernel.vger.kernel.org archive mirror
From: "Oliver O'Halloran" <oohall@gmail.com>
To: Pingfan Liu <kernelfans@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Maulik Shah <mkshah@codeaurora.org>,
	Petr Mladek <pmladek@suse.com>, Oliver Neukum <oneukum@suse.com>,
	Jonathan Corbet <corbet@lwn.net>,
	"Gustavo A. R. Silva" <gustavo@embeddedor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Marc Zyngier <maz@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	"Guilherme G. Piccoli" <gpiccoli@canonical.com>,
	linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Lina Iyer <ilina@codeaurora.org>,
	Jisheng Zhang <Jisheng.Zhang@synaptics.com>,
	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	afzal mohammed <afzal.mohd.ma@gmail.com>,
	Kexec Mailing List <kexec@lists.infradead.org>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [Skiboot] [PATCH 0/3] warn and suppress irqflood
Date: Mon, 26 Oct 2020 00:51:58 +1100
Message-ID: <CAOSf1CGHPUZUBQV0Zm3onMxCZ-zBpOxE9tmMeBODeKUyuO3Rpg@mail.gmail.com>
In-Reply-To: <CAFgQCTveoz0fOELrwUY5ZSG_iNKkjGJ32QW1POo-OfjvXM=YLQ@mail.gmail.com>

On Mon, Oct 26, 2020 at 12:11 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> On Sun, Oct 25, 2020 at 8:21 PM Oliver O'Halloran <oohall@gmail.com> wrote:
> >
> > On Sun, Oct 25, 2020 at 10:22 PM Pingfan Liu <kernelfans@gmail.com> wrote:
> > >
> > > On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > >
> > > > On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote:
> > > > > > I hit an irqflood bug on a powerpc platform, and two years ago, on an x86
> > > > > > platform. When the bug happens, the kernel is totally occupied by irqs.
> > > > > > Currently, there may be nothing or just a soft lockup warning shown on the
> > > > > > console. It is better to warn users with irq flood info.
> > > > >
> > > > > In the kdump case, the kernel can move on by suppressing the irq flood.
> > > >
> > > > You're curing the symptom not the cause and the cure is just magic and
> > > > can't work reliably.
> > > > Yeah, it is magic. But at least, it is better to printk something and
> > > > alert users about what is happening. With the current code, it may show
> > > > nothing when the system hangs.
> > > >
> > > > Where is that irq flood originated from and why is none of the
> > > > mechanisms we have in place to shut it up working?
> > > The bug originates from a driver tpm_i2c_nuvoton, which calls i2c-bus
> > > driver (i2c-opal.c). After i2c_opal_send_request(), the bug is
> > > triggered.
> > >
> > > But things are complicated by introducing a firmware layer: Skiboot.
> > > This software layer hides the detail of manipulating the hardware from
> > > Linux.
> > >
> > > > I guess the software logic cannot enter a sane state when the kernel crashes.
> > > >
> > > > Cc Skiboot and ppc64 community to see whether anyone has any idea about it.
> >
> > What system are you using?
>
> Here is the info, if not enough, I will get more.
>  Product Name          : OpenPOWER Firmware
>  Product Version       : open-power-SUPERMICRO-P9DSU-V1.16-20180531-imp
>  Product Extra         : op-build-e4b3eb5
>  Product Extra         : skiboot-v6.0-p1da203b
>  Product Extra         : hostboot-f911e5c-pda8239f
>  Product Extra         : occ-77bb5e6-p623d1cd
>  Product Extra         : linux-4.16.7-openpower2-pbc45895
>  Product Extra         : petitboot-v1.7.1-pf773c0d
>  Product Extra         : machine-xml-218a77a

Unfortunately I don't have a schematic for that one.

> > There's an external interrupt pin which is supposed to be wired to the
> > TPM. I think we bounce that interrupt to FW by default since the
> > external interrupt is sometimes used for other system-specific
> > purposes. Odds are FW doesn't know what to do with it so you
> > effectively have an always-on LSI. I fixed a similar bug a while ago
> > by having skiboot mask any interrupts it doesn't have a handler for,
>
> This sounds like the root cause. But here Skiboot should have a handler,
> otherwise the first kernel could not run smoothly.

I don't know why the TPM interrupt is asserted. If the TPM driver is
polling for a response it might clear the underlying condition as a
side effect of its normal operation.
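
For reference, the "mask any interrupt without a handler" idea from the
quoted text above can be sketched roughly as follows. This is not skiboot's
code: struct irq_ctrl, irq_register(), irq_dispatch(), count_handler() and
NUM_IRQS are all hypothetical names, and the controller is modelled as two
plain bitmasks. It only illustrates why an unclaimed level-sensitive
interrupt (LSI) stops re-firing once it is masked.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_IRQS 8	/* hypothetical number of interrupt sources */

typedef void (*irq_handler_t)(unsigned int irq);

/* Toy model of a controller; not a real skiboot or Linux structure. */
struct irq_ctrl {
	uint32_t pending;			/* bit n set: source n is asserted */
	uint32_t masked;			/* bit n set: source n is masked */
	irq_handler_t handlers[NUM_IRQS];	/* NULL: nobody claimed source n */
};

static unsigned int handled_count;	/* bumped by the example handler */

/* Example handler: only counts calls, does not clear the source. */
static void count_handler(unsigned int irq)
{
	(void)irq;
	handled_count++;
}

static void irq_register(struct irq_ctrl *c, unsigned int irq, irq_handler_t h)
{
	assert(irq < NUM_IRQS);
	c->handlers[irq] = h;
}

/*
 * Service each asserted, unmasked source once and return how many were
 * serviced. A source with no handler is masked instead of being left
 * asserted, so it cannot show up again on the next pass.
 */
static unsigned int irq_dispatch(struct irq_ctrl *c)
{
	unsigned int serviced = 0;

	for (unsigned int irq = 0; irq < NUM_IRQS; irq++) {
		uint32_t bit = UINT32_C(1) << irq;

		if (!(c->pending & bit) || (c->masked & bit))
			continue;

		if (c->handlers[irq])
			c->handlers[irq](irq);	/* owner deals with the device */
		else
			c->masked |= bit;	/* unclaimed LSI: shut it up */
		serviced++;
	}
	return serviced;
}
```

With one claimed and one unclaimed source asserted, the first irq_dispatch()
pass services both, and every later pass sees only the claimed one, because
the unclaimed source is now masked rather than permanently pending.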

> Do you have any idea about an unexpected re-initialization introducing
> an insane state?

No idea, but those TPMs have a history of bricking themselves if you
do anything slightly odd to them. It wouldn't surprise me if the
re-probe can cause issues.

> Thanks,
> Pingfan
