linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kairui Song <kasong@redhat.com>
To: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	kexec@lists.infradead.org, Jerry Hoemann <jerry.hoemann@hpe.com>,
	Baoquan He <bhe@redhat.com>, Kairui Song <kasong@redhat.com>
Subject: [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel
Date: Thu, 26 Dec 2019 03:21:18 +0800	[thread overview]
Message-ID: <20191225192118.283637-1-kasong@redhat.com> (raw)

There are reports about kdump hang upon reboot on some HPE machines,
kernel hanged when trying to shutdown a PCIe port, an uncorrectable
error occurred and crashed the system.

On the machine I can reproduce this issue, part of the topology
looks like this:

[0000:00]-+-00.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMI2
          +-01.0-[02]--
          +-01.1-[05]--
          +-02.0-[06]--+-00.0  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.1  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.2  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.3  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.4  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.5  Emulex Corporation OneConnect NIC (Skyhawk)
          |            +-00.6  Emulex Corporation OneConnect NIC (Skyhawk)
          |            \-00.7  Emulex Corporation OneConnect NIC (Skyhawk)
          +-02.1-[0f]--
          +-02.2-[07]----00.0  Hewlett-Packard Company Smart Array Gen9 Controllers

When shuting down PCIe port 0000:00:02.2 or 0000:00:02.0, the machine
will hang, depend on which device is reinitialized in kdump kernel.

If force remove unused device then trigger kdump, the problem will never
happen:

    echo 1 > /sys/bus/pci/devices/0000\:00\:02.2/0000\:07\:00.0/remove
    echo c > /proc/sysrq-trigger

    ... Kdump save vmcore through network, the NIC get reinitialized and
    hpsa is untouched. Then reboot with no problem. (If hpsa is used
    instead, shutdown the NIC in first kernel will help)

The cause is that some devices are enabled by the first kernel, but it
don't have the chance to shutdown the device, and kdump kernel is not
aware of it, unless it reinitialize the device.

Upon reboot, kdump kernel will skip downstream device shutdown and
clears its bridge's master bit directly. The downstream device could
error out as it can still send requests but upstream refuses it.

So for kdump, let kernel read the correct hardware power state on boot,
and always clear the bus master bit of PCI device upon shutdown if the
device is on. PCIe port driver will always shutdown all downstream
devices first, so this should ensure all downstream devices have bus
master bit off before clearing the bridge's bus master bit.

Signed-off-by: Kairui Song <kasong@redhat.com>
---
 drivers/pci/pci-driver.c | 11 ++++++++---
 drivers/pci/quirks.c     | 20 ++++++++++++++++++++
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 0454ca0e4e3f..84a7fd643b4d 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -18,6 +18,7 @@
 #include <linux/kexec.h>
 #include <linux/of_device.h>
 #include <linux/acpi.h>
+#include <linux/crash_dump.h>
 #include "pci.h"
 #include "pcie/portdrv.h"
 
@@ -488,10 +489,14 @@ static void pci_device_shutdown(struct device *dev)
 	 * If this is a kexec reboot, turn off Bus Master bit on the
 	 * device to tell it to not continue to do DMA. Don't touch
 	 * devices in D3cold or unknown states.
-	 * If it is not a kexec reboot, firmware will hit the PCI
-	 * devices with big hammer and stop their DMA any way.
+	 * If this is kdump kernel, also turn off Bus Master, the device
+	 * could be activated by previous crashed kernel and may block
+	 * it's upstream from shutting down.
+	 * Else, firmware will hit the PCI devices with big hammer
+	 * and stop their DMA any way.
 	 */
-	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
+	if ((kexec_in_progress || is_kdump_kernel()) &&
+			pci_dev->current_state <= PCI_D3hot)
 		pci_clear_master(pci_dev);
 }
 
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4937a088d7d8..c65d11ab3939 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -28,6 +28,7 @@
 #include <linux/platform_data/x86/apple.h>
 #include <linux/pm_runtime.h>
 #include <linux/switchtec.h>
+#include <linux/crash_dump.h>
 #include <asm/dma.h>	/* isa_dma_bridge_buggy */
 #include "pci.h"
 
@@ -192,6 +193,25 @@ static int __init pci_apply_final_quirks(void)
 }
 fs_initcall_sync(pci_apply_final_quirks);
 
+/*
+ * Read the device state even if it's not enabled. The device could be
+ * activated by previous crashed kernel, this will read and correct the
+ * cached state.
+ */
+static void quirk_read_pm_state_in_kdump(struct pci_dev *dev)
+{
+	u16 pmcsr;
+
+	if (!is_kdump_kernel())
+		return;
+
+	if (dev->pm_cap) {
+		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
+		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
+	}
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, quirk_read_pm_state_in_kdump);
+
 /*
  * Decoding should be disabled for a PCI device during BAR sizing to avoid
  * conflict. But doing so may cause problems on host bridge and perhaps other
-- 
2.24.1


             reply	other threads:[~2019-12-25 19:21 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-25 19:21 Kairui Song [this message]
2020-01-03  7:58 ` [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel Kairui Song
2020-01-10 21:42 ` Bjorn Helgaas
2020-01-10 22:25   ` Khalid Aziz and Shuah Khan
2020-01-10 23:00     ` Jerry Hoemann
2020-01-11  0:18       ` Khalid Aziz
2020-01-11  0:50         ` Baoquan He
2020-01-11  3:45           ` Khalid Aziz
2020-01-11  9:35             ` Kairui Song
2020-01-11 18:32               ` Deepa Dinamani
2020-01-13 17:07                 ` Kairui Song
2020-01-15  1:16                   ` Deepa Dinamani
2020-01-15  7:56                     ` Kairui Song
2020-01-15 17:30                   ` Khalid Aziz
2020-01-15 18:05                     ` Kairui Song
2020-01-15 21:17                       ` Khalid Aziz
2020-01-17  3:24                         ` Dave Young
2020-01-17  3:46                           ` Baoquan He
2020-01-17 15:44                           ` Khalid Aziz
2020-01-11 10:04             ` Baoquan He
2020-01-11  0:45       ` Baoquan He
2020-01-11  0:51         ` Baoquan He
2020-01-11  1:46         ` Baoquan He
2020-01-11  9:24         ` Kairui Song
2020-01-10 23:36   ` Jerry Hoemann
2020-01-11  8:46   ` Kairui Song
2020-02-22 16:56 ` Bjorn Helgaas
2020-02-24  4:56   ` Dave Young
2020-02-24 17:30   ` Kairui Song
2020-02-28 19:53     ` Deepa Dinamani
2020-03-03 21:01       ` Deepa Dinamani
2020-03-05  3:53         ` Baoquan He
2020-03-05  4:53           ` Deepa Dinamani
2020-03-05  6:06             ` Deepa Dinamani
2020-03-06  9:38             ` Baoquan He
2020-07-22 14:52               ` Kairui Song
2020-07-22 15:21                 ` Bjorn Helgaas
2020-07-22 21:50                   ` Jerry Hoemann
2020-07-23  0:00                     ` Bjorn Helgaas
2020-07-23 18:34                       ` Kairui Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191225192118.283637-1-kasong@redhat.com \
    --to=kasong@redhat.com \
    --cc=bhe@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=jerry.hoemann@hpe.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).