linux-pci.vger.kernel.org archive mirror
From: Lukas Wunner <lukas@wunner.de>
To: stuart hayes <stuart.w.hayes@gmail.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	Kuppuswamy Sathyanarayanan 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Ethan Zhao <haifeng.zhao@intel.com>,
	Sinan Kaya <okaya@kernel.org>, Ashok Raj <ashok.raj@intel.com>,
	Keith Busch <kbusch@kernel.org>,
	Yicong Yang <yangyicong@hisilicon.com>,
	linux-pci@vger.kernel.org, Russell Currey <ruscur@russell.cc>,
	Oliver OHalloran <oohall@gmail.com>,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Subject: Re: [PATCH v2] PCI: pciehp: Ignore Link Down/Up caused by DPC
Date: Sat, 26 Jun 2021 08:50:49 +0200
Message-ID: <20210626065049.GA19767@wunner.de>
In-Reply-To: <08c046b0-c9f2-3489-eeef-7e7aca435bb9@gmail.com>

On Fri, Jun 25, 2021 at 03:38:41PM -0500, stuart hayes wrote:
> I have a system that is failing to recover after an EDR event with (or
> without...) this patch.  It looks like the problem is similar to what this
> patch is trying to fix, except that on my system, the hotplug port is
> downstream of the root port that has DPC, so the "link down" event on it is
> not being ignored.  So the hotplug code disables the slot (which contains an
> NVMe device on this system) while the nvme driver is trying to use it, which
> results in a failed recovery and another EDR event, and the kernel ends up
> with the DPC trigger status bit set in the root port, so everything
> downstream is gone.
> 
> I added the hack below so the hotplug code will ignore the "link down"
> events on the ports downstream of the root port during DPC recovery, and it
> recovers no problem.  (I'm not proposing this as a correct fix.)

Please help me understand what's causing the Link Down event in the
first place:

With DPC, the hardware disables the link only on the port containing the
error.  Since that's the Root Port above the hotplug port in your case,
the link between the hotplug port and the NVMe drive should remain up.
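
The containment behavior described above can be pictured with a small
user-space model. This is only a sketch: struct port, dpc_trigger() and
reachable_below() are hypothetical names made up for illustration, not
kernel structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a PCIe port; not a kernel structure. */
struct port {
	const char *name;
	struct port *parent;	/* upstream port, NULL for the Root Port */
	bool link_up;		/* state of the link below this port */
};

/* DPC containment: only the link below the triggering port goes down. */
static void dpc_trigger(struct port *p)
{
	assert(p != NULL);
	p->link_up = false;
}

/* A device below @p is reachable only if every link up to the root is up. */
static bool reachable_below(const struct port *p)
{
	for (; p; p = p->parent)
		if (!p->link_up)
			return false;
	return true;
}
```

In this model, triggering DPC on a Root Port leaves link_up untouched on a
hotplug port below it, even though devices under that hotplug port are no
longer reachable.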

Since your patch sets the PCI_DPC_RECOVERING flag during invocation
of the dev->driver->err_handler->slot_reset() hook, I assume that's
what's causing the Link Down.  However, pcie_portdrv_slot_reset()
only restores and saves PCI config space, so I don't think that's
causing a Link Down.

Is nvme_slot_reset() perhaps causing the Link Down on the parent hotplug port?

Thanks,

Lukas

> 
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index b576aa890c76..dfd983c3c5bf 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -119,8 +132,10 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>  		!dev->driver->err_handler->slot_reset)
>  		goto out;
> 
> +	set_bit(PCI_DPC_RECOVERING, &dev->priv_flags);
>  	err_handler = dev->driver->err_handler;
>  	vote = err_handler->slot_reset(dev);
> +	clear_bit(PCI_DPC_RECOVERING, &dev->priv_flags);
>  	*result = merge_result(*result, vote);
>  out:
>  	device_unlock(&dev->dev);
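
The effect of the quoted hack can be modeled in user space as follows.
This is a minimal sketch under stated assumptions: struct model_port,
RECOVERING_BIT, handle_link_down() and the other names are hypothetical
stand-ins, and the way the hotplug code consults the flag is assumed
rather than taken from the message. The model also treats the port as
the device whose slot_reset() callback runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RECOVERING_BIT 0	/* hypothetical stand-in for PCI_DPC_RECOVERING */

/* Hypothetical model of a hotplug-capable port; not a kernel structure. */
struct model_port {
	unsigned long priv_flags;
	bool slot_enabled;
	int (*slot_reset)(struct model_port *port);
};

static bool recovering(const struct model_port *port)
{
	return (port->priv_flags & (1UL << RECOVERING_BIT)) != 0;
}

/* Hotplug side: returns true if the Link Down event was acted upon. */
static bool handle_link_down(struct model_port *port)
{
	if (recovering(port))
		return false;		/* assumed: event ignored during recovery */
	port->slot_enabled = false;	/* otherwise the slot is disabled */
	return true;
}

/* Recovery side: the flag is set only while the callback runs. */
static int report_slot_reset_model(struct model_port *port)
{
	int vote;

	assert(port != NULL && port->slot_reset != NULL);
	port->priv_flags |= 1UL << RECOVERING_BIT;
	vote = port->slot_reset(port);
	port->priv_flags &= ~(1UL << RECOVERING_BIT);
	return vote;
}

/* Example callback during which a Link Down event arrives. */
static int bouncing_slot_reset(struct model_port *port)
{
	handle_link_down(port);
	return 0;
}
```

With bouncing_slot_reset() installed as the callback, the slot stays
enabled across report_slot_reset_model(), whereas a Link Down event
arriving after the call returns disables the slot as usual.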
