From: Stephen Rothwell <sfr@canb.auug.org.au>
To: Alex Deucher <alexdeucher@gmail.com>,
	Bjorn Helgaas <bhelgaas@google.com>
Cc: Alex Deucher <alexander.deucher@amd.com>,
	Jay Vosburgh <jay.vosburgh@canonical.com>,
	Kuppuswamy Sathyanarayanan 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Next Mailing List <linux-next@vger.kernel.org>,
	Sean V Kelley <sean.v.kelley@intel.com>
Subject: linux-next: manual merge of the amdgpu tree with the pci tree
Date: Tue, 8 Dec 2020 13:56:20 +1100
Message-ID: <20201208135620.237dbbd1@canb.auug.org.au>

Hi all,

Today's linux-next merge of the amdgpu tree got a conflict in:

  drivers/pci/pcie/err.c

between commits:

  8f1bbfbc3596 ("PCI/ERR: Rename reset_link() to reset_subordinates()")
  0791721d8007 ("PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()")
  05e9ae19ab83 ("PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery()")

from the pci tree and commit:

  36a8901e900a ("PCI/ERR: Fix reset logic in pcie_do_recovery() call")

from the amdgpu tree.

I fixed it up (I think - see below) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non-trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/pci/pcie/err.c
index 510f31f0ef6d,4a2735b70fa6..000000000000
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@@ -146,61 -146,49 +146,82 @@@ out
  	return 0;
  }
  
 +/**
 + * pci_walk_bridge - walk bridges potentially AER affected
 + * @bridge:	bridge which may be a Port, an RCEC, or an RCiEP
 + * @cb:		callback to be called for each device found
 + * @userdata:	arbitrary pointer to be passed to callback
 + *
 + * If the device provided is a bridge, walk the subordinate bus, including
 + * any bridged devices on buses under this bus.  Call the provided callback
 + * on each device found.
 + *
 + * If the device provided has no subordinate bus, e.g., an RCEC or RCiEP,
 + * call the callback on the device itself.
 + */
 +static void pci_walk_bridge(struct pci_dev *bridge,
 +			    int (*cb)(struct pci_dev *, void *),
 +			    void *userdata)
 +{
 +	if (bridge->subordinate)
 +		pci_walk_bus(bridge->subordinate, cb, userdata);
 +	else
 +		cb(bridge, userdata);
 +}
 +
  pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 -			pci_channel_state_t state,
 -			pci_ers_result_t (*reset_link)(struct pci_dev *pdev))
 +		pci_channel_state_t state,
 +		pci_ers_result_t (*reset_subordinates)(struct pci_dev *pdev))
  {
 +	int type = pci_pcie_type(dev);
 +	struct pci_dev *bridge;
  	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
 -	struct pci_bus *bus;
 +	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
  
  	/*
 -	 * Error recovery runs on all subordinates of the first downstream port.
 -	 * If the downstream port detected the error, it is cleared at the end.
 +	 * If the error was detected by a Root Port, Downstream Port, RCEC,
 +	 * or RCiEP, recovery runs on the device itself.  For Ports, that
 +	 * also includes any subordinate devices.
 +	 *
 +	 * If it was detected by another device (Endpoint, etc), recovery
 +	 * runs on the device and anything else under the same Port, i.e.,
 +	 * everything under "bridge".
  	 */
 -	if (!(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
 -	      pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
 -		dev = dev->bus->self;
 -	bus = dev->subordinate;
 +	if (type == PCI_EXP_TYPE_ROOT_PORT ||
 +	    type == PCI_EXP_TYPE_DOWNSTREAM ||
 +	    type == PCI_EXP_TYPE_RC_EC ||
 +	    type == PCI_EXP_TYPE_RC_END)
 +		bridge = dev;
 +	else
 +		bridge = pci_upstream_bridge(dev);
  
 -	pci_dbg(dev, "broadcast error_detected message\n");
 +	pci_dbg(bridge, "broadcast error_detected message\n");
  	if (state == pci_channel_io_frozen) {
 -		pci_walk_bus(bus, report_frozen_detected, &status);
 +		pci_walk_bridge(bridge, report_frozen_detected, &status);
+ 		/*
+ 		 * After resetting the link using reset_link() call, the
+ 		 * possible value of error status is either
+ 		 * PCI_ERS_RESULT_DISCONNECT (failure case) or
+ 		 * PCI_ERS_RESULT_NEED_RESET (success case).
+ 		 * So ignore the return value of report_error_detected()
+ 		 * call for fatal errors.
+ 		 *
+ 		 * In EDR mode, since AER and DPC Capabilities are owned by
+ 		 * firmware, reported_error_detected() will return error
+ 		 * status PCI_ERS_RESULT_NO_AER_DRIVER. Continuing
+ 		 * pcie_do_recovery() with error status as
+ 		 * PCI_ERS_RESULT_NO_AER_DRIVER will report recovery failure
+ 		 * irrespective of recovery status. But successful reset_link()
+ 		 * call usually recovers all fatal errors. So ignoring the
+ 		 * status result of report_error_detected() also helps EDR based
+ 		 * error recovery.
+ 		 */
 -		status = reset_link(dev);
 +		status = reset_subordinates(bridge);
- 		if (status != PCI_ERS_RESULT_RECOVERED) {
+ 		if (status == PCI_ERS_RESULT_RECOVERED) {
+ 			status = PCI_ERS_RESULT_NEED_RESET;
+ 		} else {
+ 			status = PCI_ERS_RESULT_DISCONNECT;
 -			pci_warn(dev, "link reset failed\n");
 +			pci_warn(bridge, "subordinate device reset failed\n");
  			goto failed;
  		}
  	} else {
@@@ -215,13 -203,25 +236,25 @@@
  
  	if (status == PCI_ERS_RESULT_NEED_RESET) {
  		/*
- 		 * TODO: Should call platform-specific
- 		 * functions to reset slot before calling
- 		 * drivers' slot_reset callbacks?
+ 		 * TODO: Optimize the call to pci_reset_bus()
+ 		 *
+ 		 * There are two components to pci_reset_bus().
+ 		 *
+ 		 * 1. Do platform specific slot/bus reset.
+ 		 * 2. Save/Restore all devices in the bus.
+ 		 *
+ 		 * For hotplug capable devices and fatal errors,
+ 		 * device is already in reset state due to link
+ 		 * reset. So repeating platform specific slot/bus
+ 		 * reset via pci_reset_bus() call is redundant. So
+ 		 * can optimize this logic and conditionally call
+ 		 * pci_reset_bus().
  		 */
+ 		pci_reset_bus(dev);
+ 
  		status = PCI_ERS_RESULT_RECOVERED;
 -		pci_dbg(dev, "broadcast slot_reset message\n");
 -		pci_walk_bus(bus, report_slot_reset, &status);
 +		pci_dbg(bridge, "broadcast slot_reset message\n");
 +		pci_walk_bridge(bridge, report_slot_reset, &status);
  	}
  
  	if (status != PCI_ERS_RESULT_RECOVERED)
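
For readers checking the resolution, the two decisions it combines can be
modeled outside the kernel. This is only a sketch: every name prefixed
model_ or MODEL_ is a hypothetical stand-in, not a kernel identifier, and
the enum values are illustrative. It shows which device types start
recovery at the reporting device itself (the pci tree side) and how the
result of the reset callback is mapped on the frozen-channel path (the
amdgpu tree side).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PCIe device types named in the diff. */
enum model_port_type {
	MODEL_ENDPOINT,
	MODEL_ROOT_PORT,
	MODEL_UPSTREAM,
	MODEL_DOWNSTREAM,
	MODEL_RC_END,	/* RCiEP */
	MODEL_RC_EC,	/* RCEC */
};

/* Hypothetical stand-ins for the pci_ers_result_t values used above. */
enum model_ers_result {
	MODEL_CAN_RECOVER,
	MODEL_NEED_RESET,
	MODEL_DISCONNECT,
	MODEL_RECOVERED,
};

/*
 * True when recovery runs on the reporting device itself ("bridge = dev"
 * in the diff); false when the upstream bridge is used instead.
 */
bool model_recovers_from_self(enum model_port_type type)
{
	return type == MODEL_ROOT_PORT ||
	       type == MODEL_DOWNSTREAM ||
	       type == MODEL_RC_EC ||
	       type == MODEL_RC_END;
}

/*
 * Frozen-channel path: a successful reset becomes NEED_RESET so that the
 * slot_reset broadcast still runs; any other result becomes DISCONNECT.
 */
enum model_ers_result model_frozen_status(enum model_ers_result reset_result)
{
	if (reset_result == MODEL_RECOVERED)
		return MODEL_NEED_RESET;
	return MODEL_DISCONNECT;
}
```

In this model an Endpoint or an Upstream Port falls through to the upstream
bridge, and any reset result other than "recovered" ends in the failed path.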
