linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kairui Song <kasong@redhat.com>
To: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Khalid Aziz <khalid@gonehiking.org>, Baoquan He <bhe@redhat.com>,
	Jerry Hoemann <Jerry.Hoemann@hpe.com>,
	Bjorn Helgaas <helgaas@kernel.org>,
	linux-pci@vger.kernel.org, kexec@lists.infradead.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Randy Wright <rwright@hpe.com>
Subject: Re: [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel
Date: Tue, 14 Jan 2020 01:07:09 +0800	[thread overview]
Message-ID: <CACPcB9cQY9Vu3wG-QYZS6W6T_PZxnJ1ABNUUAF_qvk-VSxbpTA@mail.gmail.com> (raw)
In-Reply-To: <CABeXuvqquCU+1G=5onk9owASorhpcYWeWBge9U35BrorABcsuw@mail.gmail.com>

On Sun, Jan 12, 2020 at 2:33 AM Deepa Dinamani <deepa.kernel@gmail.com> wrote:
>
> > Hi, there are some previous works about this issue, reset PCI devices
> > in kdump kernel to stop ongoing DMA:
> >
> > [v7,0/5] Reset PCIe devices to address DMA problem on kdump with iommu
> > https://lore.kernel.org/patchwork/cover/343767/
> >
> > [v2] PCI: Reset PCIe devices to stop ongoing DMA
> > https://lore.kernel.org/patchwork/patch/379191/
> >
> > And didn't get merged, that patch are trying to fix some DMAR error
> > problem, but resetting devices is a bit too destructive, and the
> > problem is later fixed in IOMMU side. And in most case the DMA seems
> > harmless, as they targets first kernel's memory and kdump kernel only
> > live in crash memory.
>
> I was going to ask the same. If the kdump kernel had IOMMU on, would
> that still be a problem?

It will still fail, doing DMA is not a problem, it only go wrong when
a device's upstream bridge is mistakenly shutdown before the device
shutdown.

>
> > Also, by the time kdump kernel is able to scan and reset devices,
> > there are already a very large time window where things could go
> > wrong.
> >
> > The currently problem observed only happens upon kdump kernel
> > shutdown, as the upper bridge is disabled before the device is
> > disabledm so DMA will raise error. It's more like a problem of wrong
> > device shutting down order.
>
> The way it was described earlier "During this time, the SUT sometimes
> gets a PCI error that raises an NMI." suggests that it isn't really
> restricted to kexec/kdump.
> Any attached device without an active driver might attempt spurious or
> malicious DMA and trigger the same during normal operation.
> Do you have available some more reporting of what happens during the
> PCIe error handling?

Let me add more info about this:

On the machine where I can reproduce this issue, the first kernel
always runs fine, and kdump kernel works fine during dumping the
vmcore, even if I keep the kdump kernel running for hours, nothing
goes wrong. If there are DMA during normal operation that will cause
problem, this should have exposed it.

The problem only occur when kdump kernel try to reboot, no matter how
long the kdump kernel have been running (few minutes or hours). The
machine is dead after printing:
[  101.438300] reboot: Restarting system^M
[  101.455360] reboot: machine restart^M

And I can find following logs happend just at that time, in the
"Integrated Management Log" from the iLO web interface:
1254 OS 12/25/2019 09:08 12/25/2019 09:08 1 User Remotely Initiated NMI Switch
1253 System Error 12/25/2019 09:08 12/25/2019 09:08 1 An Unrecoverable
System Error (NMI) has occurred (Service Information: 0x00000000,
0x00000000)
1252 PCI Bus 12/25/2019 09:07 12/25/2019 09:07 1 Uncorrectable PCI
Express Error (Embedded device, Bus 0, Device 2, Function 2, Error
status 0x00100000)
1251 System Error 12/25/2019 09:07 12/25/2019 09:07 1 Unrecoverable
System Error (NMI) has occurred.  System Firmware will log additional
details in a separate IML entry if possible
1250 PCI Bus 12/25/2019 09:07 12/25/2019 09:07 1 PCI Bus Error (Slot
0, Bus 0, Device 2, Function 2)

And the topology is:
[0000:00]-+-00.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMI2
          +-01.0-[02]--
          +-01.1-[05]--
          +-02.0-[06]--+-00.0  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.1  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.2  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.3  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.4  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.5  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.6  Emulex Corporation OneConnect NIC (Skyhawk)
          |            \-00.7  Emulex Corporation OneConnect NIC (Skyhawk)
          +-02.1-[0f]--
          +-02.2-[07]----00.0  Hewlett-Packard Company Smart Array
Gen9 Controllers

It's a bridge reporting the error. It should be an unsupported request
error, bacause downstream device is still alive and sending request,
but the port have bus mastering off. If I manually shutdown the "Smart
Array" (HPSA) device before kdump reboot, it will always reboot just
fine.

And as the patch descriptions said, the HPSA is used in first kernel,
but didn't get reset in kdump kernel because driver is not loaded.
When shutting down a bridge, kernel should shutdown downstream device
first, and then shutdown and clear bus master bit of the bridge. But
in kdump case, kernel skipped some device shutdown due to driver not
loaded issue, and kernel don't know they are enabled.

This problem is not limited to HPSA, the NIC listed in above topology
maybe also make the bridge error out, if HPSA get loaded in kdump
kernel and NIC get ignored.

>
> "The reaction to the NMI that the kdump kernel takes is problematic."
> Or the NMI should not have been triggered to begin with? Where does that happen?

The NMI is triggered by firmware upon the bridge error mentioned
above, it should have triggered a kernel panic, but on the test
system, it just hanged and no longer give any output, so I can't post
any log about it.


  reply	other threads:[~2020-01-13 17:07 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-25 19:21 [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel Kairui Song
2020-01-03  7:58 ` Kairui Song
2020-01-10 21:42 ` Bjorn Helgaas
2020-01-10 22:25   ` Khalid Aziz and Shuah Khan
2020-01-10 23:00     ` Jerry Hoemann
2020-01-11  0:18       ` Khalid Aziz
2020-01-11  0:50         ` Baoquan He
2020-01-11  3:45           ` Khalid Aziz
2020-01-11  9:35             ` Kairui Song
2020-01-11 18:32               ` Deepa Dinamani
2020-01-13 17:07                 ` Kairui Song [this message]
2020-01-15  1:16                   ` Deepa Dinamani
2020-01-15  7:56                     ` Kairui Song
2020-01-15 17:30                   ` Khalid Aziz
2020-01-15 18:05                     ` Kairui Song
2020-01-15 21:17                       ` Khalid Aziz
2020-01-17  3:24                         ` Dave Young
2020-01-17  3:46                           ` Baoquan He
2020-01-17 15:44                           ` Khalid Aziz
2020-01-11 10:04             ` Baoquan He
2020-01-11  0:45       ` Baoquan He
2020-01-11  0:51         ` Baoquan He
2020-01-11  1:46         ` Baoquan He
2020-01-11  9:24         ` Kairui Song
2020-01-10 23:36   ` Jerry Hoemann
2020-01-11  8:46   ` Kairui Song
2020-02-22 16:56 ` Bjorn Helgaas
2020-02-24  4:56   ` Dave Young
2020-02-24 17:30   ` Kairui Song
2020-02-28 19:53     ` Deepa Dinamani
2020-03-03 21:01       ` Deepa Dinamani
2020-03-05  3:53         ` Baoquan He
2020-03-05  4:53           ` Deepa Dinamani
2020-03-05  6:06             ` Deepa Dinamani
2020-03-06  9:38             ` Baoquan He
2020-07-22 14:52               ` Kairui Song
2020-07-22 15:21                 ` Bjorn Helgaas
2020-07-22 21:50                   ` Jerry Hoemann
2020-07-23  0:00                     ` Bjorn Helgaas
2020-07-23 18:34                       ` Kairui Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACPcB9cQY9Vu3wG-QYZS6W6T_PZxnJ1ABNUUAF_qvk-VSxbpTA@mail.gmail.com \
    --to=kasong@redhat.com \
    --cc=Jerry.Hoemann@hpe.com \
    --cc=bhe@redhat.com \
    --cc=deepa.kernel@gmail.com \
    --cc=helgaas@kernel.org \
    --cc=kexec@lists.infradead.org \
    --cc=khalid@gonehiking.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=rwright@hpe.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).