From: "Kuppuswamy, Sathyanarayanan"  <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Ethan Zhao <haifeng.zhao@intel.com>,
	Sinan Kaya <okaya@kernel.org>, Ashok Raj <ashok.raj@intel.com>,
	Keith Busch <kbusch@kernel.org>,
	linux-pci@vger.kernel.org, Russell Currey <ruscur@russell.cc>,
	Oliver O'Halloran <oohall@gmail.com>,
	Stuart Hayes <stuart.w.hayes@gmail.com>,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Subject: Re: [PATCH] PCI: pciehp: Ignore Link Down/Up caused by DPC
Date: Tue, 27 Apr 2021 17:39:43 -0700
Message-ID: <c7a09e8d-d2a9-02c9-8ac1-224147bd7827@linux.intel.com>
In-Reply-To: <13bbd4f9-dff4-be79-d80a-342399961939@linux.intel.com>

Hi Bjorn,

On 3/30/21 1:53 PM, Kuppuswamy, Sathyanarayanan wrote:
>> Downstream Port Containment (PCIe Base Spec, sec. 6.2.10) disables the
>> link upon an error and attempts to re-enable it when instructed by the
>> DPC driver.
>>
>> A slot which is both DPC- and hotplug-capable is currently brought down
>> by pciehp once DPC is triggered (due to the link change) and brought up
>> on successful recovery.  That's undesirable, the slot should remain up
>> so that the hotplugged device remains bound to its driver.  DPC notifies
>> the driver of the error and of successful recovery in pcie_do_recovery()
>> and the driver may then restore the device to working state.
>>
>> Moreover, Sinan points out that turning off slot power by pciehp may
>> foil recovery by DPC:  Power off/on is a cold reset concurrently to
>> DPC's warm reset.  Sathyanarayanan reports extended delays or failure
>> in link retraining by DPC if pciehp brings down the slot.
>>
>> Fix by detecting whether a Link Down event is caused by DPC and awaiting
>> recovery if so.  On successful recovery, ignore both the Link Down and
>> the subsequent Link Up event.
>>
>> Afterwards, check whether the link is down to detect surprise-removal or
>> another DPC event immediately after DPC recovery.  Ensure that the
>> corresponding DLLSC event is not ignored by synthesizing it and
>> invoking irq_wake_thread() to trigger a re-run of pciehp_ist().
>>
>> The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and
>> dpc_handler(), race against each other.  If pciehp is faster than DPC,
>> it will wait until DPC recovery completes.
>>
>> Recovery consists of two steps:  The first step (waiting for link
>> disablement) is recognizable by pciehp through a set DPC Trigger Status
>> bit.  The second step (waiting for link retraining) is recognizable
>> through a newly introduced PCI_DPC_RECOVERING flag.
>>
>> If DPC is faster than pciehp, neither of the two flags will be set and
>> pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag.
>> The flag is zero if DPC didn't occur at all, hence DLLSC events are not
>> ignored by default.
>>
>> This commit draws inspiration from previous attempts to synchronize DPC
>> with pciehp:
>>
>> By Sinan Kaya, August 2018:
>> https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/
>>
>> By Ethan Zhao, October 2020:
>> https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/
>>
>> By Sathyanarayanan Kuppuswamy, March 2021:
>> https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/ 
>>
> Looks good to me. This patch fixes the reported issue in our environment.
> 
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
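
For readers following the thread, the decision logic described in the
quoted commit message can be sketched roughly as below.  This is a
standalone userspace model under stated assumptions, not the kernel
patch: struct dpc_state, enum dllsc_action and both function names are
hypothetical.  Only the DPC Trigger Status bit and the
PCI_DPC_RECOVERING / PCI_DPC_RECOVERED flags come from the description
above, and they are modelled here as plain booleans.

```c
#include <stdbool.h>

/* Hypothetical snapshot of what pciehp can observe about DPC on a port. */
struct dpc_state {
	bool trigger_status;	/* DPC Trigger Status bit: link is being disabled */
	bool recovering;	/* models PCI_DPC_RECOVERING: link retraining pending */
	bool recovered;		/* models PCI_DPC_RECOVERED: last recovery succeeded */
};

/* Hypothetical outcomes for a Link Down (DLLSC) event seen by pciehp. */
enum dllsc_action {
	DLLSC_HANDLE,		/* genuine hotplug event: bring the slot down */
	DLLSC_WAIT_FOR_DPC,	/* DPC recovery in flight: wait, then re-evaluate */
	DLLSC_IGNORE,		/* caused by DPC and recovered: keep the slot up */
};

/*
 * Classify a Link Down event.  If either "in progress" indicator is set,
 * pciehp won the race and has to wait for DPC.  If neither is set, DPC
 * either finished first (recovered flag set) or never triggered at all
 * (recovered flag zero), in which case the event is handled as usual.
 */
enum dllsc_action classify_link_down(const struct dpc_state *s)
{
	if (s->trigger_status || s->recovering)
		return DLLSC_WAIT_FOR_DPC;
	if (s->recovered)
		return DLLSC_IGNORE;
	return DLLSC_HANDLE;
}

/*
 * After an ignored Link Down/Up pair, a link that is down again indicates
 * surprise removal or another DPC event, so a DLLSC event has to be
 * synthesized to make the hotplug IRQ thread run once more.
 */
bool must_synthesize_dllsc(bool link_active_after_recovery)
{
	return !link_active_after_recovery;
}
```

Note that a failed recovery leaves the recovered flag at zero in this
model, so the event falls through to normal hotplug handling, which
matches the "not ignored by default" behaviour described above.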

Any update on this patch? Is it queued for merge? One of our customers is
waiting for this fix, so I am wondering about its status.

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
