linux-pci.vger.kernel.org archive mirror
From: Bjorn Helgaas <helgaas@kernel.org>
To: Kairui Song <kasong@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	kexec@lists.infradead.org, Jerry Hoemann <jerry.hoemann@hpe.com>,
	Baoquan He <bhe@redhat.com>, Khalid Aziz <khalid@gonehiking.org>,
	Deepa Dinamani <deepa.kernel@gmail.com>,
	Randy Wright <rwright@hpe.com>, Dave Young <dyoung@redhat.com>,
	Myron Stowe <myron.stowe@redhat.com>
Subject: Re: [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel
Date: Sat, 22 Feb 2020 10:56:31 -0600
Message-ID: <20200222165631.GA213225@google.com>
In-Reply-To: <20191225192118.283637-1-kasong@redhat.com>

[+cc Khalid, Deepa, Randy, Dave, Myron]

On Thu, Dec 26, 2019 at 03:21:18AM +0800, Kairui Song wrote:
> There are reports of kdump hanging upon reboot on some HPE machines:
> the kernel hangs when trying to shut down a PCIe port, and an
> uncorrectable error occurs and crashes the system.

Did we ever make progress on this?  This definitely sounds like a
problem that needs to be fixed, but I don't see a resolution here.

> On the machine where I can reproduce this issue, part of the topology
> looks like this:
> 
> [0000:00]-+-00.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMI2
>           +-01.0-[02]--
>           +-01.1-[05]--
>           +-02.0-[06]--+-00.0  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.1  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.2  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.3  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.4  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.5  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.6  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            \-00.7  Emulex Corporation OneConnect NIC (Skyhawk)
>           +-02.1-[0f]--
>           +-02.2-[07]----00.0  Hewlett-Packard Company Smart Array Gen9 Controllers
> 
> When shutting down PCIe port 0000:00:02.2 or 0000:00:02.0, the machine
> will hang, depending on which device is reinitialized in the kdump kernel.
> 
> If the unused device is force-removed before kdump is triggered, the
> problem never happens:
> 
>     echo 1 > /sys/bus/pci/devices/0000\:00\:02.2/0000\:07\:00.0/remove
>     echo c > /proc/sysrq-trigger
> 
>     ... Kdump saves the vmcore over the network; the NIC gets reinitialized
>     and hpsa is untouched. The reboot then completes with no problem. (If
>     hpsa is used instead, shutting down the NIC in the first kernel helps.)
> 
> The cause is that some devices are enabled by the first kernel, which
> doesn't get the chance to shut them down, and the kdump kernel is not
> aware of them unless it reinitializes them.
> 
> Upon reboot, the kdump kernel skips the downstream device shutdown and
> clears its bridge's bus master bit directly. The downstream device can
> then error out, as it can still send requests but upstream refuses them.
> 
> So for kdump, let the kernel read the correct hardware power state on
> boot, and always clear the bus master bit of a PCI device upon shutdown
> if the device is on. The PCIe port driver always shuts down all downstream
> devices first, so this should ensure all downstream devices have the bus
> master bit off before the bridge's bus master bit is cleared.
> 
> Signed-off-by: Kairui Song <kasong@redhat.com>
> ---
>  drivers/pci/pci-driver.c | 11 ++++++++---
>  drivers/pci/quirks.c     | 20 ++++++++++++++++++++
>  2 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 0454ca0e4e3f..84a7fd643b4d 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -18,6 +18,7 @@
>  #include <linux/kexec.h>
>  #include <linux/of_device.h>
>  #include <linux/acpi.h>
> +#include <linux/crash_dump.h>
>  #include "pci.h"
>  #include "pcie/portdrv.h"
>  
> @@ -488,10 +489,14 @@ static void pci_device_shutdown(struct device *dev)
>  	 * If this is a kexec reboot, turn off Bus Master bit on the
>  	 * device to tell it to not continue to do DMA. Don't touch
>  	 * devices in D3cold or unknown states.
> -	 * If it is not a kexec reboot, firmware will hit the PCI
> -	 * devices with big hammer and stop their DMA any way.
> +	 * If this is a kdump kernel, also turn off Bus Master: the device
> +	 * may have been activated by the previous, crashed kernel and may
> +	 * block its upstream port from shutting down.
> +	 * Otherwise, firmware will hit the PCI devices with a big hammer
> +	 * and stop their DMA anyway.
>  	 */
> -	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
> +	if ((kexec_in_progress || is_kdump_kernel()) &&
> +			pci_dev->current_state <= PCI_D3hot)
>  		pci_clear_master(pci_dev);
>  }
>  
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 4937a088d7d8..c65d11ab3939 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -28,6 +28,7 @@
>  #include <linux/platform_data/x86/apple.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/switchtec.h>
> +#include <linux/crash_dump.h>
>  #include <asm/dma.h>	/* isa_dma_bridge_buggy */
>  #include "pci.h"
>  
> @@ -192,6 +193,25 @@ static int __init pci_apply_final_quirks(void)
>  }
>  fs_initcall_sync(pci_apply_final_quirks);
>  
> +/*
> + * Read the device state even if it's not enabled. The device could have
> + * been activated by the previous, crashed kernel; this reads and corrects
> + * the cached state.
> + */
> +static void quirk_read_pm_state_in_kdump(struct pci_dev *dev)
> +{
> +	u16 pmcsr;
> +
> +	if (!is_kdump_kernel())
> +		return;
> +
> +	if (dev->pm_cap) {
> +		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> +		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
> +	}
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, quirk_read_pm_state_in_kdump);
> +
>  /*
>   * Decoding should be disabled for a PCI device during BAR sizing to avoid
>   * conflict. But doing so may cause problems on host bridge and perhaps other
> -- 
> 2.24.1
> 
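For anyone following along, here is a minimal userspace sketch of the
logic the quoted patch proposes, not the kernel code itself. The struct
fake_dev type and the read_pm_state() and shutdown_dev() helpers are
hypothetical stand-ins invented for illustration; only the two register
masks and the D-state numbering are taken from the PCI headers. It
assumes the cached power state starts out unknown in the kdump kernel
and is refreshed from the PMCSR before shutdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as in the PCI headers; everything else below is hypothetical. */
#define PCI_COMMAND_MASTER     0x0004 /* Bus Master Enable in the Command register */
#define PCI_PM_CTRL_STATE_MASK 0x0003 /* power state field in the PMCSR */

enum pm_state { D0 = 0, D1 = 1, D2 = 2, D3HOT = 3, D3COLD = 4, PM_UNKNOWN = 5 };

/* Hypothetical stand-in for struct pci_dev: only the fields used here. */
struct fake_dev {
	uint16_t command;           /* simulated Command register */
	uint16_t pmcsr;             /* simulated PMCSR */
	bool has_pm_cap;            /* device has a PM capability */
	enum pm_state cached_state; /* what the kernel believes */
};

/* Refresh the cached state from "hardware", as the proposed quirk does. */
static void read_pm_state(struct fake_dev *d)
{
	assert(d != NULL);
	if (d->has_pm_cap)
		d->cached_state = (enum pm_state)(d->pmcsr & PCI_PM_CTRL_STATE_MASK);
}

/*
 * Clear Bus Master on kexec or in a kdump kernel, but only if the
 * cached state says the device is in D0..D3hot.
 */
static void shutdown_dev(struct fake_dev *d, bool kexec_in_progress, bool kdump)
{
	assert(d != NULL);
	if ((kexec_in_progress || kdump) && d->cached_state <= D3HOT)
		d->command &= (uint16_t)~PCI_COMMAND_MASTER;
}
```

With the cached state left unknown, shutdown_dev() leaves the Command
register alone even in the kdump case. Once read_pm_state() has seen a
PMCSR reporting D0, the same call clears the Bus Master bit, which is
why the patch pairs the quirk with the changed shutdown condition.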
