All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Sean V Kelley" <sean.v.kelley@intel.com>
To: "Jonathan Cameron" <Jonathan.Cameron@Huawei.com>
Cc: bhelgaas@google.com, rjw@rjwysocki.net, ashok.raj@intel.com,
	tony.luck@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Qiuxu Zhuo" <qiuxu.zhuo@intel.com>
Subject: Re: [PATCH V2 6/9] PCI: Add 'rcec' field to pci_dev for associated RCiEPs
Date: Wed, 05 Aug 2020 10:48:46 -0700	[thread overview]
Message-ID: <BE94DF65-ED67-4113-A932-58BFE86B1152@intel.com> (raw)
In-Reply-To: <20200805184020.00000cf3@Huawei.com>

On 5 Aug 2020, at 10:40, Jonathan Cameron wrote:

> On Tue, 4 Aug 2020 12:40:49 -0700
> Sean V Kelley <sean.v.kelley@intel.com> wrote:
>
>> From: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>>
>> When attempting error recovery for an RCiEP associated with an RCEC 
>> device,
>> there needs to be a way to update the Root Error Status, the 
>> Uncorrectable
>> Error Status and the Uncorrectable Error Severity of the parent RCEC.
>> So add the 'rcec' field to the pci_dev structure and provide a hook 
>> for the
>> Root Port Driver to associate RCiEPs with their respective parent 
>> RCEC.
>>
>> Co-developed-by: Sean V Kelley <sean.v.kelley@intel.com>
>> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
> Hi,
>
> One question in line.
>
>> ---
>>  drivers/pci/pcie/aer.c         |  9 +++++----
>>  drivers/pci/pcie/err.c         | 12 ++++++++++++
>>  drivers/pci/pcie/portdrv_pci.c | 15 +++++++++++++++
>>  include/linux/pci.h            |  3 +++
>>  4 files changed, 35 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index 87283cda3990..f658607e8e00 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device *dev)
>>  static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>>  {
>>  	int aer = dev->aer_cap;
>> +	int rc = 0;
>>  	u32 reg32;
>> -	int rc;
>> -
>>
>>  	/* Disable Root's interrupt in response to error messages */
>>  	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
>>  	reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
>>  	pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>>
>> -	rc = pci_bus_error_reset(dev);
>> -	pci_info(dev, "Root Port link has been reset\n");
>> +	if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
>> +		rc = pci_bus_error_reset(dev);
>> +		pci_info(dev, "Root Port link has been reset\n");
>> +	}
>>
>>  	/* Clear Root Error Status */
>>  	pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>> index 4812aa678eff..43f1c55c76db 100644
>> --- a/drivers/pci/pcie/err.c
>> +++ b/drivers/pci/pcie/err.c
>> @@ -203,6 +203,12 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev 
>> *dev,
>>  		pci_walk_dev_affected(dev, report_frozen_detected, &status);
>>  		if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
>>  			status = flr_on_rciep(dev);
>> +			/*
>> +			 * The callback only clears the Root Error Status
>> +			 * of the RCEC (see aer.c).
>> +			 */
>> +			if (pcie_aer_is_native(dev) && dev->rcec)
>> +				reset_link(dev->rcec);
>
> I'm not sure about this pcie_aer_is_native check.
> We don't check in the normal EP path.  Perhaps we should be checking 
> there
> as well?

Looks like we should.  Will make adjustments and test.

>
> I can contrive a CPER record that hits the reset_link for the normal 
> EP on my qemu
> test setup.  Just for fun it causes a synchronous external abort that 
> I need
> to track down but not related this patch or indeed reset_link and may 
> just reflect
> an impossible to hit path in the e1000e driver.
>
> It needs a very contrived combination of blocks that say the error is 
> fatal
> and others that say it isn't so I'm not that worried about that.

Okay, will also test on my end.

Thanks,

Sean

>
>
>>  			if (status != PCI_ERS_RESULT_RECOVERED) {
>>  				pci_warn(dev, "function level reset failed\n");
>>  				goto failed;
>> @@ -247,7 +253,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev 
>> *dev,
>>  		if (pcie_aer_is_native(dev))
>>  			pcie_clear_device_status(dev);
>>  		pci_aer_clear_nonfatal_status(dev);
>> +	} else if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
>> +		if (pcie_aer_is_native(dev) && dev->rcec)
>> +			pcie_clear_device_status(dev->rcec);
>> +		if (dev->rcec)
>> +			pci_aer_clear_nonfatal_status(dev->rcec);
>>  	}
>> +
>>  	pci_info(dev, "device recovery successful\n");
>>  	return status;
>>
>> diff --git a/drivers/pci/pcie/portdrv_pci.c 
>> b/drivers/pci/pcie/portdrv_pci.c
>> index 4d880679b9b1..dff5c9e13412 100644
>> --- a/drivers/pci/pcie/portdrv_pci.c
>> +++ b/drivers/pci/pcie/portdrv_pci.c
>> @@ -90,6 +90,18 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops 
>> = {
>>  #define PCIE_PORTDRV_PM_OPS	NULL
>>  #endif /* !PM */
>>
>> +static int pcie_hook_rcec(struct pci_dev *pdev, void *data)
>> +{
>> +	struct pci_dev *rcec = (struct pci_dev *)data;
>> +
>> +	pdev->rcec = rcec;
>> +	pci_dbg(rcec, "RCiEP(under an RCEC) %04x:%02x:%02x.%d\n",
>> +		pci_domain_nr(pdev->bus), pdev->bus->number,
>> +		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
>> +
>> +	return 0;
>> +}
>> +
>>  /*
>>   * pcie_portdrv_probe - Probe PCI-Express port devices
>>   * @dev: PCI-Express port device being probed
>> @@ -110,6 +122,9 @@ static int pcie_portdrv_probe(struct pci_dev 
>> *dev,
>>  	     (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC)))
>>  		return -ENODEV;
>>
>> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC)
>> +		pcie_walk_rcec(dev, pcie_hook_rcec, dev);
>> +
>>  	status = pcie_port_device_register(dev);
>>  	if (status)
>>  		return status;
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index ee49469bd2b5..d5f7dbbf5e2f 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -326,6 +326,9 @@ struct pci_dev {
>>  #ifdef CONFIG_PCIEAER
>>  	u16		aer_cap;	/* AER capability offset */
>>  	struct aer_stats *aer_stats;	/* AER stats for this device */
>> +#endif
>> +#ifdef CONFIG_PCIEPORTBUS
>> +	struct pci_dev	*rcec;		/* Associated RCEC device */
>>  #endif
>>  	u8		pcie_cap;	/* PCIe capability offset */
>>  	u8		msi_cap;	/* MSI capability offset */

  reply	other threads:[~2020-08-05 19:20 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-04 19:40 [PATCH V2 0/9] Add RCEC handling to PCI/AER Sean V Kelley
2020-08-04 19:40 ` [PATCH V2 1/9] pci_ids: Add class code and extended capability for RCEC Sean V Kelley
2020-08-05  3:01   ` Bjorn Helgaas
2020-08-04 19:40 ` [PATCH V2 2/9] PCI: Extend Root Port Driver to support RCEC Sean V Kelley
2020-08-05 17:43   ` Bjorn Helgaas
2020-08-05 18:14     ` Sean V Kelley
2020-08-05 17:45   ` Jonathan Cameron
2020-08-05 17:50     ` Sean V Kelley
2020-08-26 16:16   ` Kuppuswamy, Sathyanarayanan
2020-08-26 18:29     ` sean.v.kelley
2020-08-04 19:40 ` [PATCH V2 3/9] PCI/portdrv: Add pcie_walk_rcec() to walk RCiEPs associated with RCEC Sean V Kelley
2020-08-05 17:43   ` Bjorn Helgaas
2020-08-05 18:07     ` Sean V Kelley
2020-08-26 16:20   ` Kuppuswamy, Sathyanarayanan
2020-08-26 18:37     ` sean.v.kelley
2020-08-04 19:40 ` [PATCH V2 4/9] PCI/AER: Extend AER error handling to RCECs Sean V Kelley
2020-08-07 22:53   ` Bjorn Helgaas
2020-08-08  0:55     ` Sean V Kelley
2020-08-10  9:32       ` Jonathan Cameron
2020-08-17 22:24         ` Bjorn Helgaas
2020-08-18  9:01           ` Jonathan Cameron
2020-08-04 19:40 ` [PATCH V2 5/9] PCI/AER: Apply function level reset to RCiEP on fatal error Sean V Kelley
2020-08-04 19:40 ` [PATCH V2 6/9] PCI: Add 'rcec' field to pci_dev for associated RCiEPs Sean V Kelley
2020-08-05 17:40   ` Jonathan Cameron
2020-08-05 17:48     ` Sean V Kelley [this message]
2020-08-04 19:40 ` [PATCH V2 7/9] PCI/AER: Add RCEC AER handling Sean V Kelley
2020-08-05 17:49   ` Jonathan Cameron
2020-08-04 19:40 ` [PATCH V2 8/9] PCI/PME: Add RCEC PME handling Sean V Kelley
2020-08-05 17:51   ` Jonathan Cameron
2020-08-04 19:40 ` [PATCH V2 9/9] PCI/AER: Add RCEC AER error injection support Sean V Kelley
2020-08-05 17:54   ` Jonathan Cameron
2020-08-05 18:09     ` Sean V Kelley
2020-08-05 18:00 ` [PATCH V2 0/9] Add RCEC handling to PCI/AER Bjorn Helgaas
2020-08-05 18:12   ` Sean V Kelley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BE94DF65-ED67-4113-A932-58BFE86B1152@intel.com \
    --to=sean.v.kelley@intel.com \
    --cc=Jonathan.Cameron@Huawei.com \
    --cc=ashok.raj@intel.com \
    --cc=bhelgaas@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=qiuxu.zhuo@intel.com \
    --cc=rjw@rjwysocki.net \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.