linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Sean V Kelley <sean.v.kelley@intel.com>
Cc: bhelgaas@google.com, Jonathan.Cameron@huawei.com,
	xerces.zhao@gmail.com, rafael.j.wysocki@intel.com,
	ashok.raj@intel.com, tony.luck@intel.com,
	sathyanarayanan.kuppuswamy@intel.com, qiuxu.zhuo@intel.com,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v12 10/15] PCI/ERR: Limit AER resets in pcie_do_recovery()
Date: Mon, 23 Nov 2020 17:28:38 -0600	[thread overview]
Message-ID: <20201123232838.GA498923@bjorn-Precision-5520> (raw)
In-Reply-To: <20201121001036.8560-11-sean.v.kelley@intel.com>

On Fri, Nov 20, 2020 at 04:10:31PM -0800, Sean V Kelley wrote:
> In some cases a bridge may not exist as the hardware controlling may be
> handled only by firmware and so is not visible to the OS. This scenario is
> also possible in future use cases involving non-native use of RCECs by
> firmware.
> 
> Explicitly apply conditional logic around these resets by limiting them to
> Root Ports and Downstream Ports.

Can you help me understand this?  The subject says "Limit AER resets"
and here you say "limit them to RPs and DPs", but it's not completely
obvious how the resets are being limited, i.e., the patch doesn't add
anything like:

 +  if (type == PCI_EXP_TYPE_ROOT_PORT ||
 +      type == PCI_EXP_TYPE_DOWNSTREAM)
      reset_subordinates(bridge);

It *does* add checks around pcie_clear_device_status(), but that also
includes RC_EC.  And that's not a reset, so I don't think that's
explicitly mentioned in the commit log.

Also see the question below.

> Link: https://lore.kernel.org/r/20201002184735.1229220-8-seanvk.dev@oregontracks.org
> Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/pci/pcie/err.c | 31 +++++++++++++++++++++++++------
>  1 file changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 8b53aecdb43d..7883c9791562 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -148,13 +148,17 @@ static int report_resume(struct pci_dev *dev, void *data)
>  
>  /**
>   * pci_walk_bridge - walk bridges potentially AER affected
> - * @bridge:	bridge which may be a Port
> + * @bridge:	bridge which may be a Port, an RCEC with associated RCiEPs,
> + *		or an RCiEP associated with an RCEC
>   * @cb:		callback to be called for each device found
>   * @userdata:	arbitrary pointer to be passed to callback
>   *
>   * If the device provided is a bridge, walk the subordinate bus, including
>   * any bridged devices on buses under this bus.  Call the provided callback
>   * on each device found.
> + *
> + * If the device provided has no subordinate bus, call the callback on the
> + * device itself.
>   */
>  static void pci_walk_bridge(struct pci_dev *bridge,
>  			    int (*cb)(struct pci_dev *, void *),
> @@ -162,6 +166,8 @@ static void pci_walk_bridge(struct pci_dev *bridge,
>  {
>  	if (bridge->subordinate)
>  		pci_walk_bus(bridge->subordinate, cb, userdata);
> +	else
> +		cb(bridge, userdata);
>  }
>  
>  pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> @@ -174,10 +180,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>  
>  	/*
>  	 * Error recovery runs on all subordinates of the bridge.  If the
> -	 * bridge detected the error, it is cleared at the end.
> +	 * bridge detected the error, it is cleared at the end.  For RCiEPs
> +	 * we should reset just the RCiEP itself.
>  	 */
>  	if (type == PCI_EXP_TYPE_ROOT_PORT ||
> -	    type == PCI_EXP_TYPE_DOWNSTREAM)
> +	    type == PCI_EXP_TYPE_DOWNSTREAM ||
> +	    type == PCI_EXP_TYPE_RC_EC ||
> +	    type == PCI_EXP_TYPE_RC_END)
>  		bridge = dev;
>  	else
>  		bridge = pci_upstream_bridge(dev);
> @@ -185,6 +194,12 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>  	pci_dbg(bridge, "broadcast error_detected message\n");
>  	if (state == pci_channel_io_frozen) {
>  		pci_walk_bridge(bridge, report_frozen_detected, &status);
> +		if (type == PCI_EXP_TYPE_RC_END) {
> +			pci_warn(dev, "subordinate device reset not possible for RCiEP\n");
> +			status = PCI_ERS_RESULT_NONE;
> +			goto failed;
> +		}
> +
>  		status = reset_subordinates(bridge);
>  		if (status != PCI_ERS_RESULT_RECOVERED) {
>  			pci_warn(bridge, "subordinate device reset failed\n");
> @@ -217,9 +232,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>  	pci_dbg(bridge, "broadcast resume message\n");
>  	pci_walk_bridge(bridge, report_resume, &status);
>  
> -	if (pcie_aer_is_native(bridge))
> -		pcie_clear_device_status(bridge);
> -	pci_aer_clear_nonfatal_status(bridge);
> +	if (type == PCI_EXP_TYPE_ROOT_PORT ||
> +	    type == PCI_EXP_TYPE_DOWNSTREAM ||
> +	    type == PCI_EXP_TYPE_RC_EC) {
> +		if (pcie_aer_is_native(bridge))
> +			pcie_clear_device_status(bridge);
> +		pci_aer_clear_nonfatal_status(bridge);

This is hard to understand because "type" is from "dev", but "bridge"
is not necessarily the same device.  Should it be this?

  type = pci_pcie_type(bridge);
  if (type == PCI_EXP_TYPE_ROOT_PORT ||
      ...)

> +	}
>  	pci_info(bridge, "device recovery successful\n");
>  	return status;
>  
> -- 
> 2.29.2
> 

  reply	other threads:[~2020-11-23 23:28 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-21  0:10 [PATCH v12 00/15] Add RCEC handling to PCI/AER Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 01/15] AER: aer_root_reset() non-native handling Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 02/15] PCI/RCEC: Bind RCEC devices to the Root Port driver Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 03/15] PCI/RCEC: Cache RCEC capabilities in pci_init_capabilities() Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 04/15] PCI/ERR: Rename reset_link() to reset_subordinates() Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 05/15] PCI/ERR: Simplify by using pci_upstream_bridge() Sean V Kelley
2020-12-03 18:45   ` Kelley, Sean V
2020-12-03 22:25     ` Bjorn Helgaas
2020-11-21  0:10 ` [PATCH v12 06/15] PCI/ERR: Simplify by computing pci_pcie_type() once Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 07/15] PCI/ERR: Use "bridge" for clarity in pcie_do_recovery() Sean V Kelley
2020-12-02 23:18   ` Bjorn Helgaas
2020-11-21  0:10 ` [PATCH v12 08/15] PCI/ERR: Avoid negated conditional for clarity Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 09/15] PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery() Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 10/15] PCI/ERR: Limit AER resets in pcie_do_recovery() Sean V Kelley
2020-11-23 23:28   ` Bjorn Helgaas [this message]
2020-11-23 23:57     ` Kelley, Sean V
2020-11-24 17:17       ` Bjorn Helgaas
2020-11-30 19:54         ` Kelley, Sean V
2020-12-01  0:25           ` Bjorn Helgaas
2020-12-01  1:09             ` Kuppuswamy, Sathyanarayanan
2020-12-01  1:13             ` Kelley, Sean V
2020-12-02 20:53             ` Kelley, Sean V
2020-12-02 21:27               ` Bjorn Helgaas
2020-12-02 22:54                 ` Kelley, Sean V
2020-11-21  0:10 ` [PATCH v12 11/15] PCI/RCEC: Add pcie_link_rcec() to associate RCiEPs Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 12/15] PCI/RCEC: Add RCiEP's linked RCEC to AER/ERR Sean V Kelley
2020-12-02 23:44   ` Bjorn Helgaas
2020-12-03  0:51     ` Kelley, Sean V
2020-12-04  0:01       ` Bjorn Helgaas
2020-12-04 17:17         ` Kelley, Sean V
2020-12-04 17:24           ` Bjorn Helgaas
2020-12-05 21:30           ` Bjorn Helgaas
2020-12-07 17:23             ` Kelley, Sean V
2020-11-21  0:10 ` [PATCH v12 13/15] PCI/AER: Add pcie_walk_rcec() to RCEC AER handling Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 14/15] PCI/PME: Add pcie_walk_rcec() to RCEC PME handling Sean V Kelley
2020-11-21  0:10 ` [PATCH v12 15/15] PCI/AER: Add RCEC AER error injection support Sean V Kelley
2020-11-21  4:26 ` [PATCH v12 00/15] Add RCEC handling to PCI/AER Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201123232838.GA498923@bjorn-Precision-5520 \
    --to=helgaas@kernel.org \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=ashok.raj@intel.com \
    --cc=bhelgaas@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=qiuxu.zhuo@intel.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=sathyanarayanan.kuppuswamy@intel.com \
    --cc=sean.v.kelley@intel.com \
    --cc=tony.luck@intel.com \
    --cc=xerces.zhao@gmail.com \
    --subject='Re: [PATCH v12 10/15] PCI/ERR: Limit AER resets in pcie_do_recovery()' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).