All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/16] PCI, error handling and hot plug
@ 2018-08-31 21:26 Keith Busch
  2018-08-31 21:26 ` [PATCH 01/16] PCI: Simplify disconnected marking Keith Busch
                   ` (15 more replies)
  0 siblings, 16 replies; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

This series is all about error handling and hot plug. It should get the
AER fatal handling much like how it was before where we don't always
tear down and re-enumerate the topology, but even better in taht it
won't trigger hotplug events to the extent possibe to suppress link
events that occur from the secondary bus reset.

Ports that do not implement an out-of-band presence detection mechanism
may still have their device removed on the DPC and AER error handling,
but at least error handling for non-hot pluggable ports won't experience
removals anymore.

Some of these are simple bug fixes, some are more complicated. Patch
11 fixes a long-standing issue where a RP detected fatal error never
restored the downstream bridges, making recovery of end devices below
those switches bound to fail after a secondary bus reset resets the
bridge's config memory and bus windows.

Comments welcomed!

Keith Busch (15):
  PCI: Fix pci_reset_bus
  PCI/AER: Remove dead code
  PCI/ERR: Use slot reset if available
  PCI/ERR: Handle fatal error recovery
  PCI/ERR: Remove devices on recovery failure
  PCI/ERR: Always use the first downstream port
  PCI/ERR: Simplify broadcast callouts
  PCI/ERR: Report current recovery status for udev
  PCI/portdrv: Provide pci error callbacks
  PCI/portdrv: Restore pci state on slot reset
  PCI/pciehp: Fix powerfault detection order
  PCI/pciehp: Implement error handling callbacks
  PCI/pciehp: Ignore link events during DPC event
  PCI/DPC: Wait for reset complete
  PCI: Unify device inaccessible

Lukas Wunner (1):
  PCI: Simplify disconnected marking

 drivers/pci/hotplug/pciehp.h      |   1 +
 drivers/pci/hotplug/pciehp_core.c |  51 +++++++
 drivers/pci/hotplug/pciehp_hpc.c  |  34 +++--
 drivers/pci/hotplug/pciehp_pci.c  |   9 +-
 drivers/pci/pci.c                 |  32 ++++-
 drivers/pci/pci.h                 |  12 +-
 drivers/pci/pcie/aer.c            |  15 +-
 drivers/pci/pcie/dpc.c            |  40 ++++--
 drivers/pci/pcie/err.c            | 293 ++++++++++++++------------------------
 drivers/pci/pcie/portdrv.h        |   6 +
 drivers/pci/pcie/portdrv_pci.c    |  37 ++++-
 include/linux/pci.h               |   6 +
 12 files changed, 299 insertions(+), 237 deletions(-)

-- 
2.14.4

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 01/16] PCI: Simplify disconnected marking
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-08-31 21:26 ` [PATCH 02/16] PCI: Fix pci_reset_bus Keith Busch
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza, Lukas Wunner

From: Lukas Wunner <lukas@wunner.de>

Commit 89ee9f768003 ("PCI: Add device disconnected state") iterates over
the devices on a parent bus, marks each as disconnected, then marks
each device's children as disconnected using pci_walk_bus().

The same can be achieved more succinctly by calling pci_walk_bus() on
the parent bus.  Moreover, this does not need to wait until acquiring
pci_lock_rescan_remove(), so move it out of that critical section.

The critical section in err.c contains a pci_dev_get() / pci_dev_put()
pair which was apparently copy-pasted from pciehp_pci.c.  In the latter
it serves the purpose of holding the struct pci_dev in place until the
Command register is updated.  err.c doesn't do anything like that, hence
the pair is unnecessary.  Remove it.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Cc: Oza Pawandeep <poza@codeaurora.org>
Cc: Sinan Kaya <okaya@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/pci/hotplug/pciehp_pci.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 5c58c22e0c08..3ef5c0744249 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -74,6 +74,9 @@ void pciehp_unconfigure_device(struct slot *p_slot)
 		 __func__, pci_domain_nr(parent), parent->number);
 	pciehp_get_adapter_status(p_slot, &presence);
 
+	if (!presence)
+		pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
+
 	pci_lock_rescan_remove();
 
 	/*
@@ -85,12 +88,6 @@ void pciehp_unconfigure_device(struct slot *p_slot)
 	list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
 					 bus_list) {
 		pci_dev_get(dev);
-		if (!presence) {
-			pci_dev_set_disconnected(dev, NULL);
-			if (pci_has_subordinate(dev))
-				pci_walk_bus(dev->subordinate,
-					     pci_dev_set_disconnected, NULL);
-		}
 		pci_stop_and_remove_bus_device(dev);
 		/*
 		 * Ensure that no new Requests will be generated from
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 02/16] PCI: Fix pci_reset_bus
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
  2018-08-31 21:26 ` [PATCH 01/16] PCI: Simplify disconnected marking Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-08-31 21:52   ` Sinan Kaya
  2018-08-31 21:26 ` [PATCH 03/16] PCI/AER: Remove dead code Keith Busch
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

The check for slot reset capable, pci_probe_reset_slot, returns 0 when
slot reset is possible, but pci_reset_bus was proceeding to reset the
slot when the probe returned an error. This patch reverses the check to
fix that logic.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 29ff9619b5fa..bc8419e0065a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5200,7 +5200,7 @@ static int __pci_reset_bus(struct pci_bus *bus)
  */
 int pci_reset_bus(struct pci_dev *pdev)
 {
-	return pci_probe_reset_slot(pdev->slot) ?
+	return !pci_probe_reset_slot(pdev->slot) ?
 	    __pci_reset_slot(pdev->slot) : __pci_reset_bus(pdev->bus);
 }
 EXPORT_SYMBOL_GPL(pci_reset_bus);
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 03/16] PCI/AER: Remove dead code
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
  2018-08-31 21:26 ` [PATCH 01/16] PCI: Simplify disconnected marking Keith Busch
  2018-08-31 21:26 ` [PATCH 02/16] PCI: Fix pci_reset_bus Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-08-31 21:26 ` [PATCH 04/16] PCI/ERR: Use slot reset if available Keith Busch
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

The error recovery callbacks are only run on child devices. A root port
is never a child device, so this error resume callback was never
invoked.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/aer.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 83180edd6ed4..a95ceba977bb 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1541,18 +1541,6 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 	return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
 }
 
-/**
- * aer_error_resume - clean up corresponding error status bits
- * @dev: pointer to Root Port's pci_dev data structure
- *
- * Invoked by Port Bus driver during nonfatal recovery.
- */
-static void aer_error_resume(struct pci_dev *dev)
-{
-	pci_aer_clear_device_status(dev);
-	pci_cleanup_aer_uncorrect_error_status(dev);
-}
-
 static struct pcie_port_service_driver aerdriver = {
 	.name		= "aer",
 	.port_type	= PCI_EXP_TYPE_ROOT_PORT,
@@ -1560,7 +1548,6 @@ static struct pcie_port_service_driver aerdriver = {
 
 	.probe		= aer_probe,
 	.remove		= aer_remove,
-	.error_resume	= aer_error_resume,
 	.reset_link	= aer_root_reset,
 };
 
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 04/16] PCI/ERR: Use slot reset if available
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (2 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 03/16] PCI/AER: Remove dead code Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-01 17:20   ` Lukas Wunner
  2018-08-31 21:26 ` [PATCH 05/16] PCI/ERR: Handle fatal error recovery Keith Busch
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

The secondary bus reset may have link side effects that a hot plug
capable port may incorrectly react to. This patch will use the slot
specific reset for hotplug ports, fixing the undesirable link down-up
handling during error recovering.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pci.c      | 30 ++++++++++++++++++++++++++++++
 drivers/pci/pci.h      |  1 +
 drivers/pci/pcie/aer.c |  2 +-
 drivers/pci/pcie/err.c |  2 +-
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index bc8419e0065a..b128deaf7ffb 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5152,6 +5152,36 @@ static int pci_bus_reset(struct pci_bus *bus, int probe)
 	return ret;
 }
 
+/**
+ * pci_bus_error_reset - reset the bridge's subordinate bus
+ * @bridge: The parent device that connects to the bus to reset
+ *
+ * This function will first try to reset the slots on this bus if the method is
+ * available. If slot reset fails or is not available, this will fall back to a
+ * secondary bus reset.
+ */
+int pci_bus_error_reset(struct pci_dev *bridge)
+{
+	struct pci_bus *bus = bridge->subordinate;
+
+	if (!bus)
+		return -ENOTTY;
+
+	if (!list_empty(&bus->slots)) {
+		struct pci_slot *slot;
+
+		list_for_each_entry(slot, &bus->slots, list)
+			if (pci_probe_reset_slot(slot))
+				goto bus_reset;
+		list_for_each_entry(slot, &bus->slots, list)
+			if (pci_slot_reset(slot, 0))
+				goto bus_reset;
+		return 0;
+	}
+bus_reset:
+	return pci_bus_reset(bridge->subordinate, 0);
+}
+
 /**
  * pci_probe_reset_bus - probe whether a PCI bus can be reset
  * @bus: PCI bus to probe
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 6e0d1528d471..ed2965ee6779 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -35,6 +35,7 @@ int pci_mmap_fits(struct pci_dev *pdev, int resno, struct vm_area_struct *vmai,
 
 int pci_probe_reset_function(struct pci_dev *dev);
 int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
+int pci_bus_error_reset(struct pci_dev *dev);
 
 /**
  * struct pci_platform_pm_ops - Firmware PM callbacks
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a95ceba977bb..a96cea39ee80 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1526,7 +1526,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 	reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
 	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, reg32);
 
-	rc = pci_bridge_secondary_bus_reset(dev);
+	rc = pci_bus_error_reset(dev);
 	pci_printk(KERN_DEBUG, dev, "Root Port link has been reset\n");
 
 	/* Clear Root Error Status */
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 708fd3a0d646..12c1205e1d80 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -177,7 +177,7 @@ static pci_ers_result_t default_reset_link(struct pci_dev *dev)
 {
 	int rc;
 
-	rc = pci_bridge_secondary_bus_reset(dev);
+	rc = pci_bus_error_reset(dev);
 	pci_printk(KERN_DEBUG, dev, "downstream link has been reset\n");
 	return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
 }
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 05/16] PCI/ERR: Handle fatal error recovery
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (3 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 04/16] PCI/ERR: Use slot reset if available Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-01  8:31   ` Christoph Hellwig
  2018-09-05  5:56   ` poza
  2018-08-31 21:26 ` [PATCH 06/16] PCI/ERR: Remove devices on recovery failure Keith Busch
                   ` (10 subsequent siblings)
  15 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

We don't need to be paranoid about the topology changing while handling an
error. If the device has changed in a hotplug capable slot, we can rely
on the presence detection handling to react to a changing topology. This
patch restores the fatal error handling behavior that existed before
merging DPC with AER with 7e9084b3674 ("PCI/AER: Handle ERR_FATAL with
removal and re-enumeration of devices").

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/dpc.c |  2 +-
 drivers/pci/pcie/err.c | 89 +++++++++-----------------------------------------
 2 files changed, 17 insertions(+), 74 deletions(-)

diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index f03279fc87cd..eadfd835af13 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -169,7 +169,7 @@ static irqreturn_t dpc_handler(int irq, void *context)
 
 	reason = (status & PCI_EXP_DPC_STATUS_TRIGGER_RSN) >> 1;
 	ext_reason = (status & PCI_EXP_DPC_STATUS_TRIGGER_RSN_EXT) >> 5;
-	dev_warn(dev, "DPC %s detected, remove downstream devices\n",
+	dev_warn(dev, "DPC %s detected\n",
 		 (reason == 0) ? "unmasked uncorrectable error" :
 		 (reason == 1) ? "ERR_NONFATAL" :
 		 (reason == 2) ? "ERR_FATAL" :
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 12c1205e1d80..44c55f7ceb39 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -271,87 +271,20 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 	return result_data.result;
 }
 
-/**
- * pcie_do_fatal_recovery - handle fatal error recovery process
- * @dev: pointer to a pci_dev data structure of agent detecting an error
- *
- * Invoked when an error is fatal. Once being invoked, removes the devices
- * beneath this AER agent, followed by reset link e.g. secondary bus reset
- * followed by re-enumeration of devices.
- */
-void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
-{
-	struct pci_dev *udev;
-	struct pci_bus *parent;
-	struct pci_dev *pdev, *temp;
-	pci_ers_result_t result;
-
-	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
-		udev = dev;
-	else
-		udev = dev->bus->self;
-
-	parent = udev->subordinate;
-	pci_lock_rescan_remove();
-	pci_dev_get(dev);
-	list_for_each_entry_safe_reverse(pdev, temp, &parent->devices,
-					 bus_list) {
-		pci_dev_get(pdev);
-		pci_dev_set_disconnected(pdev, NULL);
-		if (pci_has_subordinate(pdev))
-			pci_walk_bus(pdev->subordinate,
-				     pci_dev_set_disconnected, NULL);
-		pci_stop_and_remove_bus_device(pdev);
-		pci_dev_put(pdev);
-	}
-
-	result = reset_link(udev, service);
-
-	if ((service == PCIE_PORT_SERVICE_AER) &&
-	    (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)) {
-		/*
-		 * If the error is reported by a bridge, we think this error
-		 * is related to the downstream link of the bridge, so we
-		 * do error recovery on all subordinates of the bridge instead
-		 * of the bridge and clear the error status of the bridge.
-		 */
-		pci_aer_clear_fatal_status(dev);
-		pci_aer_clear_device_status(dev);
-	}
-
-	if (result == PCI_ERS_RESULT_RECOVERED) {
-		if (pcie_wait_for_link(udev, true))
-			pci_rescan_bus(udev->bus);
-		pci_info(dev, "Device recovery from fatal error successful\n");
-	} else {
-		pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
-		pci_info(dev, "Device recovery from fatal error failed\n");
-	}
-
-	pci_dev_put(dev);
-	pci_unlock_rescan_remove();
-}
-
-/**
- * pcie_do_nonfatal_recovery - handle nonfatal error recovery process
- * @dev: pointer to a pci_dev data structure of agent detecting an error
- *
- * Invoked when an error is nonfatal/fatal. Once being invoked, broadcast
- * error detected message to all downstream drivers within a hierarchy in
- * question and return the returned code.
- */
-void pcie_do_nonfatal_recovery(struct pci_dev *dev)
+static void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
+			     u32 service)
 {
 	pci_ers_result_t status;
-	enum pci_channel_state state;
-
-	state = pci_channel_io_normal;
 
 	status = broadcast_error_message(dev,
 			state,
 			"error_detected",
 			report_error_detected);
 
+	if (state == pci_channel_io_frozen &&
+	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
+		goto failed;
+
 	if (status == PCI_ERS_RESULT_CAN_RECOVER)
 		status = broadcast_error_message(dev,
 				state,
@@ -387,3 +320,13 @@ void pcie_do_nonfatal_recovery(struct pci_dev *dev)
 	/* TODO: Should kernel panic here? */
 	pci_info(dev, "AER: Device recovery failed\n");
 }
+
+void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
+{
+	pcie_do_recovery(dev, pci_channel_io_frozen, service);
+}
+
+void pcie_do_nonfatal_recovery(struct pci_dev *dev)
+{
+	pcie_do_recovery(dev, pci_channel_io_normal, PCIE_PORT_SERVICE_AER);
+}
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 06/16] PCI/ERR: Remove devices on recovery failure
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (4 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 05/16] PCI/ERR: Handle fatal error recovery Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-08-31 22:26   ` Sinan Kaya
  2018-08-31 21:26 ` [PATCH 07/16] PCI/ERR: Always use the first downstream port Keith Busch
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

This patch removes devices connected through a bus that can't recover
from an error.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/err.c | 42 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 44c55f7ceb39..45f574954fd6 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -166,6 +166,15 @@ static int report_resume(struct pci_dev *dev, void *data)
 	return 0;
 }
 
+static int report_disconnect(struct pci_dev *dev, void *data)
+{
+	device_lock(&dev->dev);
+	pci_dev_set_disconnected(dev, NULL);
+	pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
+	device_unlock(&dev->dev);
+	return 0;
+}
+
 /**
  * default_reset_link - default reset function
  * @dev: pointer to pci_dev data structure
@@ -271,6 +280,34 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 	return result_data.result;
 }
 
+/**
+ * pcie_disconnect_device - Called when error handling ends with
+ * 			    PCI_ERS_RESULT_DISCONNECT status.
+ *
+ * Reaching here means error handling has irrevocably failed. This function
+ * will ungracefully disconnect all the devices below the bus that has
+ * experienced the unrecoverable error.
+ *
+ * If the link is active after the removing all devices on the bus, this will
+ * attempt to re-enumerate the bus from scratch.
+ */
+static void pcie_disconnect_device(struct pci_dev *dev)
+{
+	struct pci_bus *bus = dev->subordinate;
+	struct pci_dev *child, *tmp;
+
+	broadcast_error_message(dev, PCI_ERS_RESULT_DISCONNECT,
+				"disconnect", report_disconnect);
+	pci_lock_rescan_remove();
+	list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
+		pci_stop_and_remove_bus_device(child);
+
+	pci_bridge_secondary_bus_reset(dev);
+	if (pcie_wait_for_link(dev, true))
+		pci_rescan_bus(bus);
+	pci_unlock_rescan_remove();
+}
+
 static void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 			     u32 service)
 {
@@ -313,12 +350,9 @@ static void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 
 	pci_info(dev, "AER: Device recovery successful\n");
 	return;
-
 failed:
-	pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
-
-	/* TODO: Should kernel panic here? */
 	pci_info(dev, "AER: Device recovery failed\n");
+	pcie_disconnect_device(dev);
 }
 
 void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 07/16] PCI/ERR: Always use the first downstream port
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (5 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 06/16] PCI/ERR: Remove devices on recovery failure Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-08-31 21:26 ` [PATCH 08/16] PCI/ERR: Simplify broadcast callouts Keith Busch
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

The link reset always used the first bridge device, but AER broadcast
error handling may have reported an end device. This means the reset may
hit devices that were never notified of the impending error recovery.

This patch uses the first downstream port in the hierarchy considered
reliable. An error detected by a  switch upstream port detected should
mean it occurred on its upstream link, so the patch selects the parent
device if the error is not a root or downstream port.

This allows two other clean-ups. First, error handling can only run on
bridges so this patch removes checks and branches for end devices. Second,
the first accessible port does not inherit the channel error state since
we can access it, so the special cases for error detect and resume are
no longer necessary.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/err.c | 85 +++++++++++++-------------------------------------
 1 file changed, 21 insertions(+), 64 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 45f574954fd6..efef4c4ce7b2 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -63,30 +63,12 @@ static int report_error_detected(struct pci_dev *dev, void *data)
 	if (!dev->driver ||
 		!dev->driver->err_handler ||
 		!dev->driver->err_handler->error_detected) {
-		if (result_data->state == pci_channel_io_frozen &&
-			dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
-			/*
-			 * In case of fatal recovery, if one of down-
-			 * stream device has no driver. We might be
-			 * unable to recover because a later insmod
-			 * of a driver for this device is unaware of
-			 * its hw state.
-			 */
-			pci_printk(KERN_DEBUG, dev, "device has %s\n",
-				   dev->driver ?
-				   "no AER-aware driver" : "no driver");
-		}
-
 		/*
-		 * If there's any device in the subtree that does not
-		 * have an error_detected callback, returning
-		 * PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of
-		 * the subsequent mmio_enabled/slot_reset/resume
-		 * callbacks of "any" device in the subtree. All the
-		 * devices in the subtree are left in the error state
-		 * without recovery.
+		 * If any device in the subtree does not have an error_detected
+		 * callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
+		 * erronr callbacks of "any" device in the subtree, and will
+		 * exit in the disconnected error state.
 		 */
-
 		if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
 			vote = PCI_ERS_RESULT_NO_AER_DRIVER;
 		else
@@ -193,34 +175,23 @@ static pci_ers_result_t default_reset_link(struct pci_dev *dev)
 
 static pci_ers_result_t reset_link(struct pci_dev *dev, u32 service)
 {
-	struct pci_dev *udev;
 	pci_ers_result_t status;
 	struct pcie_port_service_driver *driver = NULL;
 
-	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
-		/* Reset this port for all subordinates */
-		udev = dev;
-	} else {
-		/* Reset the upstream component (likely downstream port) */
-		udev = dev->bus->self;
-	}
-
-	/* Use the aer driver of the component firstly */
-	driver = pcie_port_find_service(udev, service);
-
+	driver = pcie_port_find_service(dev, service);
 	if (driver && driver->reset_link) {
-		status = driver->reset_link(udev);
-	} else if (udev->has_secondary_link) {
-		status = default_reset_link(udev);
+		status = driver->reset_link(dev);
+	} else if (dev->has_secondary_link) {
+		status = default_reset_link(dev);
 	} else {
 		pci_printk(KERN_DEBUG, dev, "no link-reset support at upstream device %s\n",
-			pci_name(udev));
+			pci_name(dev));
 		return PCI_ERS_RESULT_DISCONNECT;
 	}
 
 	if (status != PCI_ERS_RESULT_RECOVERED) {
 		pci_printk(KERN_DEBUG, dev, "link reset at upstream device %s failed\n",
-			pci_name(udev));
+			pci_name(dev));
 		return PCI_ERS_RESULT_DISCONNECT;
 	}
 
@@ -252,31 +223,7 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 	else
 		result_data.result = PCI_ERS_RESULT_RECOVERED;
 
-	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
-		/*
-		 * If the error is reported by a bridge, we think this error
-		 * is related to the downstream link of the bridge, so we
-		 * do error recovery on all subordinates of the bridge instead
-		 * of the bridge and clear the error status of the bridge.
-		 */
-		if (cb == report_error_detected)
-			dev->error_state = state;
-		pci_walk_bus(dev->subordinate, cb, &result_data);
-		if (cb == report_resume) {
-			pci_aer_clear_device_status(dev);
-			pci_cleanup_aer_uncorrect_error_status(dev);
-			dev->error_state = pci_channel_io_normal;
-		}
-	} else {
-		/*
-		 * If the error is reported by an end point, we think this
-		 * error is related to the upstream link of the end point.
-		 * The error is non fatal so the bus is ok; just invoke
-		 * the callback for the function that logged the error.
-		 */
-		cb(dev, &result_data);
-	}
-
+	pci_walk_bus(dev->subordinate, cb, &result_data);
 	return result_data.result;
 }
 
@@ -313,6 +260,14 @@ static void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 {
 	pci_ers_result_t status;
 
+	/*
+	 * Error recovery runs on all subordinates of the first downstream port.
+	 * If the downstream port detected the error, it is cleared at the end.
+	 */
+	if (!(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
+	      pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
+		dev = dev->bus->self;
+
 	status = broadcast_error_message(dev,
 			state,
 			"error_detected",
@@ -348,6 +303,8 @@ static void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 				"resume",
 				report_resume);
 
+	pci_aer_clear_device_status(dev);
+	pci_cleanup_aer_uncorrect_error_status(dev);
 	pci_info(dev, "AER: Device recovery successful\n");
 	return;
 failed:
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 08/16] PCI/ERR: Simplify broadcast callouts
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (6 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 07/16] PCI/ERR: Always use the first downstream port Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-01  8:33   ` Christoph Hellwig
  2018-08-31 21:26 ` [PATCH 09/16] PCI/ERR: Report current recovery status for udev Keith Busch
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

There is no point in having a generic broadcast function if it needs to
have special cases for each callback it broadcasts. This patch abstracts
the error broadcast to only the necessary information to make it easier
to use.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/err.c | 89 +++++++++++++++++++++-----------------------------
 1 file changed, 38 insertions(+), 51 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index efef4c4ce7b2..ee9014615add 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -19,11 +19,6 @@
 #include "portdrv.h"
 #include "../pci.h"
 
-struct aer_broadcast_data {
-	enum pci_channel_state state;
-	enum pci_ers_result result;
-};
-
 static pci_ers_result_t merge_result(enum pci_ers_result orig,
 				  enum pci_ers_result new)
 {
@@ -49,16 +44,15 @@ static pci_ers_result_t merge_result(enum pci_ers_result orig,
 	return orig;
 }
 
-static int report_error_detected(struct pci_dev *dev, void *data)
+static int report_error_detected(struct pci_dev *dev,
+				 enum pci_channel_state state,
+				 enum pci_ers_result *result)
 {
 	pci_ers_result_t vote;
 	const struct pci_error_handlers *err_handler;
-	struct aer_broadcast_data *result_data;
-
-	result_data = (struct aer_broadcast_data *) data;
 
 	device_lock(&dev->dev);
-	dev->error_state = result_data->state;
+	dev->error_state = state;
 
 	if (!dev->driver ||
 		!dev->driver->err_handler ||
@@ -75,22 +69,29 @@ static int report_error_detected(struct pci_dev *dev, void *data)
 			vote = PCI_ERS_RESULT_NONE;
 	} else {
 		err_handler = dev->driver->err_handler;
-		vote = err_handler->error_detected(dev, result_data->state);
+		vote = err_handler->error_detected(dev, state);
 		pci_uevent_ers(dev, PCI_ERS_RESULT_NONE);
 	}
 
-	result_data->result = merge_result(result_data->result, vote);
+	*result = merge_result(*result, vote);
 	device_unlock(&dev->dev);
 	return 0;
 }
 
+static int report_forzen_detected(struct pci_dev *dev, void *data)
+{
+	return report_error_detected(dev, pci_channel_io_frozen, data);
+}
+
+static int report_normal_detected(struct pci_dev *dev, void *data)
+{
+	return report_error_detected(dev, pci_channel_io_normal, data);
+}
+
 static int report_mmio_enabled(struct pci_dev *dev, void *data)
 {
-	pci_ers_result_t vote;
+	pci_ers_result_t vote, *result = data;
 	const struct pci_error_handlers *err_handler;
-	struct aer_broadcast_data *result_data;
-
-	result_data = (struct aer_broadcast_data *) data;
 
 	device_lock(&dev->dev);
 	if (!dev->driver ||
@@ -100,7 +101,7 @@ static int report_mmio_enabled(struct pci_dev *dev, void *data)
 
 	err_handler = dev->driver->err_handler;
 	vote = err_handler->mmio_enabled(dev);
-	result_data->result = merge_result(result_data->result, vote);
+	*result = merge_result(*result, vote);
 out:
 	device_unlock(&dev->dev);
 	return 0;
@@ -108,11 +109,8 @@ static int report_mmio_enabled(struct pci_dev *dev, void *data)
 
 static int report_slot_reset(struct pci_dev *dev, void *data)
 {
-	pci_ers_result_t vote;
+	pci_ers_result_t vote, *result = data;
 	const struct pci_error_handlers *err_handler;
-	struct aer_broadcast_data *result_data;
-
-	result_data = (struct aer_broadcast_data *) data;
 
 	device_lock(&dev->dev);
 	if (!dev->driver ||
@@ -122,7 +120,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
 
 	err_handler = dev->driver->err_handler;
 	vote = err_handler->slot_reset(dev);
-	result_data->result = merge_result(result_data->result, vote);
+	*result = merge_result(*result, vote);
 out:
 	device_unlock(&dev->dev);
 	return 0;
@@ -201,7 +199,7 @@ static pci_ers_result_t reset_link(struct pci_dev *dev, u32 service)
 /**
  * broadcast_error_message - handle message broadcast to downstream drivers
  * @dev: pointer to from where in a hierarchy message is broadcasted down
- * @state: error state
+ * @result: starting result
  * @error_mesg: message to print
  * @cb: callback to be broadcasted
  *
@@ -210,21 +208,12 @@ static pci_ers_result_t reset_link(struct pci_dev *dev, u32 service)
  * hierarchy in question.
  */
 static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
-	enum pci_channel_state state,
-	char *error_mesg,
-	int (*cb)(struct pci_dev *, void *))
+				pci_ers_result_t result, char *error_mesg,
+				int (*cb)(struct pci_dev *, void *))
 {
-	struct aer_broadcast_data result_data;
-
 	pci_printk(KERN_DEBUG, dev, "broadcast %s message\n", error_mesg);
-	result_data.state = state;
-	if (cb == report_error_detected)
-		result_data.result = PCI_ERS_RESULT_CAN_RECOVER;
-	else
-		result_data.result = PCI_ERS_RESULT_RECOVERED;
-
-	pci_walk_bus(dev->subordinate, cb, &result_data);
-	return result_data.result;
+	pci_walk_bus(dev->subordinate, cb, &result);
+	return result;
 }
 
 /**
@@ -268,20 +257,22 @@ static void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 	      pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
 		dev = dev->bus->self;
 
-	status = broadcast_error_message(dev,
-			state,
-			"error_detected",
-			report_error_detected);
+	if (state == pci_channel_io_frozen)
+		status = broadcast_error_message(dev,
+				PCI_ERS_RESULT_CAN_RECOVER, "error_detected",
+				report_forzen_detected);
+	else
+		status = broadcast_error_message(dev,
+				PCI_ERS_RESULT_CAN_RECOVER, "error_detected",
+				report_normal_detected);
 
 	if (state == pci_channel_io_frozen &&
 	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
 		goto failed;
 
 	if (status == PCI_ERS_RESULT_CAN_RECOVER)
-		status = broadcast_error_message(dev,
-				state,
-				"mmio_enabled",
-				report_mmio_enabled);
+		status = broadcast_error_message(dev, PCI_ERS_RESULT_RECOVERED,
+					"mmio_enabled", report_mmio_enabled);
 
 	if (status == PCI_ERS_RESULT_NEED_RESET) {
 		/*
@@ -289,18 +280,14 @@ static void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 		 * functions to reset slot before calling
 		 * drivers' slot_reset callbacks?
 		 */
-		status = broadcast_error_message(dev,
-				state,
-				"slot_reset",
-				report_slot_reset);
+		status = broadcast_error_message(dev, PCI_ERS_RESULT_RECOVERED,
+					"slot_reset", report_slot_reset);
 	}
 
 	if (status != PCI_ERS_RESULT_RECOVERED)
 		goto failed;
 
-	broadcast_error_message(dev,
-				state,
-				"resume",
+	broadcast_error_message(dev, PCI_ERS_RESULT_RECOVERED, "resume",
 				report_resume);
 
 	pci_aer_clear_device_status(dev);
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 09/16] PCI/ERR: Report current recovery status for udev
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (7 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 08/16] PCI/ERR: Simplify broadcast callouts Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-01  8:36   ` Christoph Hellwig
  2018-08-31 21:26 ` [PATCH 10/16] PCI/portdrv: Provide pci error callbacks Keith Busch
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

A device still participates in error recovery even if it doesn't have
the error callbacks. This patch provides the status for user event
watchers.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/err.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index ee9014615add..a42a17a851fb 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -67,12 +67,12 @@ static int report_error_detected(struct pci_dev *dev,
 			vote = PCI_ERS_RESULT_NO_AER_DRIVER;
 		else
 			vote = PCI_ERS_RESULT_NONE;
-	} else {
-		err_handler = dev->driver->err_handler;
-		vote = err_handler->error_detected(dev, state);
-		pci_uevent_ers(dev, PCI_ERS_RESULT_NONE);
+		goto out;
 	}
-
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->error_detected(dev, state);
+out:
+	pci_uevent_ers(dev, vote);
 	*result = merge_result(*result, vote);
 	device_unlock(&dev->dev);
 	return 0;
@@ -140,8 +140,8 @@ static int report_resume(struct pci_dev *dev, void *data)
 
 	err_handler = dev->driver->err_handler;
 	err_handler->resume(dev);
-	pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
 out:
+	pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
 	device_unlock(&dev->dev);
 	return 0;
 }
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/16] PCI/portdrv: Provide pci error callbacks
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (8 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 09/16] PCI/ERR: Report current recovery status for udev Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-02 10:16   ` Lukas Wunner
  2018-08-31 21:26 ` [PATCH 11/16] PCI/portdrv: Restore pci state on slot reset Keith Busch
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

The error notification walks the whole bus, and some ports may need to
do something to prepare for error handling and recover after slot resets
too. This patch chains the error notification to the port services that
register this callback.

And while we're here, remove the misleading comment referring to AER
root ports in the generic port services driver, as the port may be
involved with other services.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/portdrv.h     |  6 ++++++
 drivers/pci/pcie/portdrv_pci.c | 35 ++++++++++++++++++++++++++++++++++-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index d59afa42fc14..70b770dd83f1 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -53,6 +53,12 @@ struct pcie_port_service_driver {
 	int (*resume_noirq) (struct pcie_device *dev);
 	int (*resume) (struct pcie_device *dev);
 
+	/* Device driver notification of error handling */
+	void (*error_detected)(struct pci_dev *dev);
+
+	/* Device driver notification of slot reset */
+	void (*slot_reset)(struct pci_dev *dev);
+
 	/* Device driver may resume normal operations */
 	void (*error_resume)(struct pci_dev *dev);
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index eef22dc29140..b1deab941f68 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -139,13 +139,45 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
 	pcie_port_device_remove(dev);
 }
 
+static int detected_iter(struct device *device, void *data)
+{
+	struct pci_dev *pdev = data;
+	struct pcie_port_service_driver *driver;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver && driver->error_detected)
+			driver->error_detected(pdev);
+	}
+	return 0;
+}
+
 static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
 					enum pci_channel_state error)
 {
-	/* Root Port has no impact. Always recovers. */
+	device_for_each_child(&dev->dev, dev, detected_iter);
 	return PCI_ERS_RESULT_CAN_RECOVER;
 }
 
+static int slot_reset_iter(struct device *device, void *data)
+{
+	struct pci_dev *pdev = data;
+	struct pcie_port_service_driver *driver;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver && driver->slot_reset)
+			driver->slot_reset(pdev);
+	}
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
+{
+	device_for_each_child(&dev->dev, dev, slot_reset_iter);
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
 static pci_ers_result_t pcie_portdrv_mmio_enabled(struct pci_dev *dev)
 {
 	return PCI_ERS_RESULT_RECOVERED;
@@ -185,6 +217,7 @@ static const struct pci_device_id port_pci_ids[] = { {
 
 static const struct pci_error_handlers pcie_portdrv_err_handler = {
 	.error_detected = pcie_portdrv_error_detected,
+	.slot_reset = pcie_portdrv_slot_reset,
 	.mmio_enabled = pcie_portdrv_mmio_enabled,
 	.resume = pcie_portdrv_err_resume,
 };
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 11/16] PCI/portdrv: Restore pci state on slot reset
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (9 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 10/16] PCI/portdrv: Provide pci error callbacks Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-02  9:34   ` Lukas Wunner
  2018-08-31 21:26 ` [PATCH 12/16] PCI/pciehp: Fix powerfault detection order Keith Busch
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

The port's config space may be reset after a link reset, which is often
implementated as a secondary bus reset. We need to restore the config
space in order for downstream devices to successfully use it.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/portdrv_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index b1deab941f68..7b5b3d7ded5b 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -174,7 +174,9 @@ static int slot_reset_iter(struct device *device, void *data)
 
 static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
 {
+	pci_restore_state(dev);
 	device_for_each_child(&dev->dev, dev, slot_reset_iter);
+	pci_save_state(dev);
 	return PCI_ERS_RESULT_RECOVERED;
 }
 
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 12/16] PCI/pciehp: Fix powerfault detection order
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (10 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 11/16] PCI/portdrv: Restore pci state on slot reset Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-01 15:18   ` Lukas Wunner
  2018-08-31 21:26 ` [PATCH 13/16] PCI/pciehp: Implement error handling callbacks Keith Busch
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

A device add in a power controller controlled slot will power on and
clear power fault slot events, but this was happening before the interrupt
handler attempted to set the sticky status and attention indicators. The
wrong status will be set if a hot-add and power fault are handled in
one interrupt. This patch fixes that by checking for power faults before
checking for new devices.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/hotplug/pciehp_hpc.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 7136e3430925..c6116e516e1e 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -646,6 +646,14 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
 		pciehp_handle_button_press(slot);
 	}
 
+	/* Check Power Fault Detected */
+	if ((events & PCI_EXP_SLTSTA_PFD) && !ctrl->power_fault_detected) {
+		ctrl->power_fault_detected = 1;
+		ctrl_err(ctrl, "Slot(%s): Power fault\n", slot_name(slot));
+		pciehp_set_attention_status(slot, 1);
+		pciehp_green_led_off(slot);
+	}
+
 	/*
 	 * Disable requests have higher priority than Presence Detect Changed
 	 * or Data Link Layer State Changed events.
@@ -657,14 +665,6 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
 		pciehp_handle_presence_or_link_change(slot, events);
 	up_read(&ctrl->reset_lock);
 
-	/* Check Power Fault Detected */
-	if ((events & PCI_EXP_SLTSTA_PFD) && !ctrl->power_fault_detected) {
-		ctrl->power_fault_detected = 1;
-		ctrl_err(ctrl, "Slot(%s): Power fault\n", slot_name(slot));
-		pciehp_set_attention_status(slot, 1);
-		pciehp_green_led_off(slot);
-	}
-
 	pci_config_pm_runtime_put(pdev);
 	wake_up(&ctrl->requester);
 	return IRQ_HANDLED;
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 13/16] PCI/pciehp: Implement error handling callbacks
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (11 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 12/16] PCI/pciehp: Fix powerfault detection order Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-02 10:39   ` Lukas Wunner
  2018-08-31 21:26 ` [PATCH 14/16] pciehp: Ignore link events during DPC event Keith Busch
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

Error handling may trigger spurious link events, which may trigger
hotplug handling to re-enumerate the topology. This patch temporarily
disables notification during such error handling and checks the topology
for changes after reset.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/hotplug/pciehp.h      |  1 +
 drivers/pci/hotplug/pciehp_core.c | 51 +++++++++++++++++++++++++++++++++++++++
 drivers/pci/hotplug/pciehp_hpc.c  |  9 +++++++
 3 files changed, 61 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 811cf83f956d..27abfb504b35 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -195,6 +195,7 @@ void pciehp_get_attention_status(struct slot *slot, u8 *status);
 void pciehp_set_attention_status(struct slot *slot, u8 status);
 void pciehp_get_latch_status(struct slot *slot, u8 *status);
 void pciehp_get_adapter_status(struct slot *slot, u8 *status);
+void pciehp_get_adapter_changed(struct slot *slot, u8 *changed);
 int pciehp_query_power_fault(struct slot *slot);
 void pciehp_green_led_on(struct slot *slot);
 void pciehp_green_led_off(struct slot *slot);
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index ec48c9433ae5..acb4d864b4e7 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -301,6 +301,55 @@ static void pciehp_remove(struct pcie_device *dev)
 	pciehp_release_ctrl(ctrl);
 }
 
+static void pciehp_error_detected(struct pci_dev *dev)
+{
+	struct controller *ctrl;
+	struct pcie_device *pcie_dev;
+	struct device *device = pcie_port_find_device(dev, PCIE_PORT_SERVICE_HP);
+
+	if (!device)
+		return;
+
+	pcie_dev = to_pcie_device(device);
+	ctrl = get_service_data(pcie_dev);
+
+	/*
+	 * Shutdown notification to ignore hotplug events during error
+	 * handling. We'll recheck the status when error handling completes.
+	 *
+	 * It is possible link event related to this error handling may have
+	 * triggered a the hotplug interrupt ahead of this notification, but we
+	 * can't do anything about that race.
+	 */
+	pcie_shutdown_notification(ctrl);
+}
+
+static void pciehp_slot_reset(struct pci_dev *dev)
+{
+	u8 changed;
+	struct controller *ctrl;
+	struct pcie_device *pcie_dev;
+	struct device *device = pcie_port_find_device(dev, PCIE_PORT_SERVICE_HP);
+
+	if (!device)
+		return;
+
+	pcie_dev = to_pcie_device(device);
+	ctrl = get_service_data(pcie_dev);
+
+	/*
+	 * Cache presence detect change, but ignore other hotplug events that
+	 * may occur during error handling. Ports that implement only in-band
+	 * presence detection may inadvertently believe the device has changed,
+	 * so those devices will have to be re-enumerated.
+	 */
+	pciehp_get_adapter_changed(ctrl->slot, &changed);
+	pcie_clear_hotplug_events(ctrl);
+	if (changed)
+		pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC);
+	pcie_init_notification(ctrl);
+}
+
 #ifdef CONFIG_PM
 static int pciehp_suspend(struct pcie_device *dev)
 {
@@ -340,6 +389,8 @@ static struct pcie_port_service_driver hpdriver_portdrv = {
 
 	.probe		= pciehp_probe,
 	.remove		= pciehp_remove,
+	.error_detected	= pciehp_error_detected,
+	.slot_reset	= pciehp_slot_reset,
 
 #ifdef	CONFIG_PM
 	.suspend	= pciehp_suspend,
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index c6116e516e1e..7c43336e08ba 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -398,6 +398,15 @@ void pciehp_get_adapter_status(struct slot *slot, u8 *status)
 	*status = !!(slot_status & PCI_EXP_SLTSTA_PDS);
 }
 
+void pciehp_get_adapter_changed(struct slot *slot, u8 *changed)
+{
+	struct pci_dev *pdev = ctrl_dev(slot->ctrl);
+	u16 slot_status;
+
+	pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
+	*changed = !!(slot_status & PCI_EXP_SLTSTA_PDC);
+}
+
 int pciehp_query_power_fault(struct slot *slot)
 {
 	struct pci_dev *pdev = ctrl_dev(slot->ctrl);
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (12 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 13/16] PCI/pciehp: Implement error handling callbacks Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-08-31 22:18   ` Sinan Kaya
  2018-09-02 14:27   ` Lukas Wunner
  2018-08-31 21:26 ` [PATCH 15/16] PCI/DPC: Wait for reset complete Keith Busch
  2018-08-31 21:26 ` [PATCH 16/16] PCI: Unify device inaccessible Keith Busch
  15 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

This patch adds a channel state to a subordinate bus. When a DPC event is
triggered, the DPC driver will set the channel state to frozen, and the
pciehp driver will ignore link events if the subordinate bus is being
managed by DPC error handling.

This is safe because the pciehp and DPC drivers share the same
interrupt. The DPC driver sets the bus state in the top-half interrupt
context, and the pciehp driver checks and masks off link events in its
bottom-half error handler.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/hotplug/pciehp_hpc.c | 9 +++++++++
 drivers/pci/pcie/dpc.c           | 8 ++++++--
 include/linux/pci.h              | 6 ++++++
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 7c43336e08ba..a928dcccf78b 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -643,6 +643,15 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
 
 	synchronize_hardirq(irq);
 	events = atomic_xchg(&ctrl->pending_events, 0);
+
+	/*
+	 * Ignore link events on the suborinate bus if error handling is
+	 * active, as link down may be expected. We'll continue to handle
+	 * presence detect changes.
+	 */
+	if (pdev->subordinate && pci_bus_offline(pdev->subordinate))
+		events &= ~PCI_EXP_SLTSTA_DLLSC;
+
 	if (!events) {
 		pci_config_pm_runtime_put(pdev);
 		return IRQ_NONE;
diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index eadfd835af13..93ce26191a2b 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -92,7 +92,8 @@ static pci_ers_result_t dpc_reset_link(struct pci_dev *pdev)
 
 	pci_write_config_word(pdev, cap + PCI_EXP_DPC_STATUS,
 			      PCI_EXP_DPC_STATUS_TRIGGER);
-
+	if (pdev->subordinate)
+		pdev->subordinate->error_state = pci_channel_io_normal;
 	return PCI_ERS_RESULT_RECOVERED;
 }
 
@@ -204,8 +205,11 @@ static irqreturn_t dpc_irq(int irq, void *context)
 
 	pci_write_config_word(pdev, cap + PCI_EXP_DPC_STATUS,
 			      PCI_EXP_DPC_STATUS_INTERRUPT);
-	if (status & PCI_EXP_DPC_STATUS_TRIGGER)
+	if (status & PCI_EXP_DPC_STATUS_TRIGGER) {
+		if (pdev->subordinate)
+			pdev->subordinate->error_state = pci_channel_io_frozen;
 		return IRQ_WAKE_THREAD;
+	}
 	return IRQ_HANDLED;
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e72ca8dd6241..a76e10049b08 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -573,9 +573,15 @@ struct pci_bus {
 	struct device		dev;
 	struct bin_attribute	*legacy_io;	/* Legacy I/O for this bus */
 	struct bin_attribute	*legacy_mem;	/* Legacy mem */
+	pci_channel_state_t error_state;	/* Current connectivity state */
 	unsigned int		is_added:1;
 };
 
+static inline int pci_bus_offline(struct pci_bus *bus)
+{
+	return (bus->error_state != pci_channel_io_normal);
+}
+
 #define to_pci_bus(n)	container_of(n, struct pci_bus, dev)
 
 /*
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 15/16] PCI/DPC: Wait for reset complete
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (13 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 14/16] pciehp: Ignore link events during DPC event Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-08-31 22:15   ` Sinan Kaya
  2018-08-31 21:26 ` [PATCH 16/16] PCI: Unify device inaccessible Keith Busch
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

DPC is only for error recovery. Since we're not removing the topology
and relying on link state interrupts to re-enumerate anymore, the dpc
reset handler has to implement the required reset timings and checks.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/dpc.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 93ce26191a2b..34bac3e2beac 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -66,32 +66,38 @@ static int dpc_wait_rp_inactive(struct dpc_dev *dpc)
 
 static pci_ers_result_t dpc_reset_link(struct pci_dev *pdev)
 {
-	struct dpc_dev *dpc;
-	struct pcie_device *pciedev;
-	struct device *devdpc;
-
-	u16 cap;
+	struct device *dev = pcie_port_find_device(pdev, PCIE_PORT_SERVICE_DPC);
+	struct pcie_device *pciedev = to_pcie_device(dev);
+	struct dpc_dev *dpc = get_service_data(pciedev);
+	u16 cap = dpc->cap_pos;
 
 	/*
 	 * DPC disables the Link automatically in hardware, so it has
 	 * already been reset by the time we get here.
-	 */
-	devdpc = pcie_port_find_device(pdev, PCIE_PORT_SERVICE_DPC);
-	pciedev = to_pcie_device(devdpc);
-	dpc = get_service_data(pciedev);
-	cap = dpc->cap_pos;
-
-	/*
+	 *
 	 * Wait until the Link is inactive, then clear DPC Trigger Status
 	 * to allow the Port to leave DPC.
 	 */
 	pcie_wait_for_link(pdev, false);
-
 	if (dpc->rp_extensions && dpc_wait_rp_inactive(dpc))
 		return PCI_ERS_RESULT_DISCONNECT;
 
 	pci_write_config_word(pdev, cap + PCI_EXP_DPC_STATUS,
 			      PCI_EXP_DPC_STATUS_TRIGGER);
+	/*
+	 * PCIe 4.0r1 6.6.1, a component must enter LTSSM Detect within 20ms,
+	 * after which we should expect an link active if the reset was
+	 * successful. If so, software must wait a minimum 100ms before sending
+	 * configuration requests to devices downstream this port.
+	 *
+	 * If the link fails to activate, either the device was physically
+	 * removed or the link is permanently failed.
+	 */
+	msleep(20);
+	if (!pcie_wait_for_link(pdev, true))
+		return PCI_ERS_RESULT_DISCONNECT;
+	msleep(100);
+
 	if (pdev->subordinate)
 		pdev->subordinate->error_state = pci_channel_io_normal;
 	return PCI_ERS_RESULT_RECOVERED;
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 16/16] PCI: Unify device inaccessible
  2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
                   ` (14 preceding siblings ...)
  2018-08-31 21:26 ` [PATCH 15/16] PCI/DPC: Wait for reset complete Keith Busch
@ 2018-08-31 21:26 ` Keith Busch
  2018-09-02 14:39   ` Lukas Wunner
  15 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 21:26 UTC (permalink / raw)
  To: Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Sinan Kaya, Thomas Tai, poza,
	Lukas Wunner, Keith Busch

This patch brings surprise removals and permanent failures together so
no longer need separate flags.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pci.h | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index ed2965ee6779..20cadf91cd9c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -294,21 +294,20 @@ struct pci_sriov {
 	bool		drivers_autoprobe; /* Auto probing of VFs by driver */
 };
 
-/* pci_dev priv_flags */
-#define PCI_DEV_DISCONNECTED 0
-#define PCI_DEV_ADDED 1
-
 static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
 {
-	set_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags);
+	dev->error_state = pci_channel_io_perm_failure;
 	return 0;
 }
 
 static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
 {
-	return test_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags);
+	return dev->error_state == pci_channel_io_perm_failure;
 }
 
+/* pci_dev priv_flags */
+#define PCI_DEV_ADDED 0
+
 static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
 {
 	assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 02/16] PCI: Fix pci_reset_bus
  2018-08-31 21:26 ` [PATCH 02/16] PCI: Fix pci_reset_bus Keith Busch
@ 2018-08-31 21:52   ` Sinan Kaya
  2018-08-31 22:08     ` Keith Busch
  0 siblings, 1 reply; 46+ messages in thread
From: Sinan Kaya @ 2018-08-31 21:52 UTC (permalink / raw)
  To: Keith Busch, Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Thomas Tai, poza, Lukas Wunner

On 8/31/2018 2:26 PM, Keith Busch wrote:
> The check for slot reset capable, pci_probe_reset_slot, returns 0 when
> slot reset is possible, but pci_reset_bus was proceeding to reset the
> slot when the probe returned an error. This patch reverses the check to
> fix that logic.
> 
> Signed-off-by: Keith Busch<keith.busch@intel.com>

Thanks, dennis.dalessandro@intel.com just posted the same patch.

Reviewed-by: Sinan Kaya <okaya@kernel.org>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 02/16] PCI: Fix pci_reset_bus
  2018-08-31 21:52   ` Sinan Kaya
@ 2018-08-31 22:08     ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-08-31 22:08 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Thomas Tai,
	poza, Lukas Wunner

On Fri, Aug 31, 2018 at 02:52:25PM -0700, Sinan Kaya wrote:
> On 8/31/2018 2:26 PM, Keith Busch wrote:
> > The check for slot reset capable, pci_probe_reset_slot, returns 0 when
> > slot reset is possible, but pci_reset_bus was proceeding to reset the
> > slot when the probe returned an error. This patch reverses the check to
> > fix that logic.
> > 
> > Signed-off-by: Keith Busch<keith.busch@intel.com>
> 
> Thanks, dennis.dalessandro@intel.com just posted the same patch.

Oh, he sure did! Nothing in my series depends on this duplicate patch,
so no problem to use either one.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] PCI/DPC: Wait for reset complete
  2018-08-31 21:26 ` [PATCH 15/16] PCI/DPC: Wait for reset complete Keith Busch
@ 2018-08-31 22:15   ` Sinan Kaya
  0 siblings, 0 replies; 46+ messages in thread
From: Sinan Kaya @ 2018-08-31 22:15 UTC (permalink / raw)
  To: Keith Busch, Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Thomas Tai, poza, Lukas Wunner

On 8/31/2018 2:26 PM, Keith Busch wrote:
> 	/*
> +	 * PCIe 4.0r1 6.6.1, a component must enter LTSSM Detect within 20ms,
> +	 * after which we should expect an link active if the reset was
> +	 * successful. If so, software must wait a minimum 100ms before sending
> +	 * configuration requests to devices downstream this port.
> +	 *
> +	 * If the link fails to activate, either the device was physically
> +	 * removed or the link is permanently failed.
> +	 */
> +	msleep(20);
> +	if (!pcie_wait_for_link(pdev, true))
> +		return PCI_ERS_RESULT_DISCONNECT;
> +	msleep(100);

Can you move these wait statements and spec reference inside the 
pcie_wait_for_link() function?

Nit. There are some whitespace changes in this file.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-08-31 21:26 ` [PATCH 14/16] pciehp: Ignore link events during DPC event Keith Busch
@ 2018-08-31 22:18   ` Sinan Kaya
  2018-08-31 22:33     ` Keith Busch
  2018-09-02 14:27   ` Lukas Wunner
  1 sibling, 1 reply; 46+ messages in thread
From: Sinan Kaya @ 2018-08-31 22:18 UTC (permalink / raw)
  To: Keith Busch, Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Thomas Tai, poza, Lukas Wunner

On 8/31/2018 2:26 PM, Keith Busch wrote:
> This is safe because the pciehp and DPC drivers share the same
> interrupt. The DPC driver sets the bus state in the top-half interrupt
> context, and the pciehp driver checks and masks off link events in its
> bottom-half error handler.

Where is this coming from? Is there a spec reference?

DPC and HP interrupts can be implemented as MSI-x interrupts and could
be unrelated interrupt IDs?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 06/16] PCI/ERR: Remove devices on recovery failure
  2018-08-31 21:26 ` [PATCH 06/16] PCI/ERR: Remove devices on recovery failure Keith Busch
@ 2018-08-31 22:26   ` Sinan Kaya
  0 siblings, 0 replies; 46+ messages in thread
From: Sinan Kaya @ 2018-08-31 22:26 UTC (permalink / raw)
  To: Keith Busch, Linux PCI, Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Thomas Tai, poza, Lukas Wunner

On 8/31/2018 2:26 PM, Keith Busch wrote:
> +static void pcie_disconnect_device(struct pci_dev *dev)
> +{
> +	struct pci_bus *bus = dev->subordinate;
> +	struct pci_dev *child, *tmp;
> +
> +	broadcast_error_message(dev, PCI_ERS_RESULT_DISCONNECT,
> +				"disconnect", report_disconnect);
> +	pci_lock_rescan_remove();
> +	list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
> +		pci_stop_and_remove_bus_device(child);
> +
> +	pci_bridge_secondary_bus_reset(dev);

Series look great so far. Please check the return value of
pci_bridge_secondary_bus_reset().

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-08-31 22:18   ` Sinan Kaya
@ 2018-08-31 22:33     ` Keith Busch
  2018-08-31 22:55       ` Sinan Kaya
  0 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 22:33 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Thomas Tai,
	poza, Lukas Wunner

On Fri, Aug 31, 2018 at 03:18:15PM -0700, Sinan Kaya wrote:
> On 8/31/2018 2:26 PM, Keith Busch wrote:
> > This is safe because the pciehp and DPC drivers share the same
> > interrupt. The DPC driver sets the bus state in the top-half interrupt
> > context, and the pciehp driver checks and masks off link events in its
> > bottom-half error handler.
> 
> Where is this coming from? Is there a spec reference?
> 
> DPC and HP interrupts can be implemented as MSI-x interrupts and could
> be unrelated interrupt IDs?

Darn, you're right. The kernel allocates up to 32 vectors and the port is
free to divvy them up however it wants amont its supported services. It
just so happened most of the ports I tested used the same one. There's
no way to really close this race if they are separate vectors though,
so maybe just leave this as  a 'best effort' approach and update the
change log accodingly.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-08-31 22:33     ` Keith Busch
@ 2018-08-31 22:55       ` Sinan Kaya
  2018-08-31 22:59         ` Keith Busch
  0 siblings, 1 reply; 46+ messages in thread
From: Sinan Kaya @ 2018-08-31 22:55 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Thomas Tai,
	poza, Lukas Wunner

On 8/31/2018 3:33 PM, Keith Busch wrote:
> Darn, you're right. The kernel allocates up to 32 vectors and the port is
> free to divvy them up however it wants amont its supported services. It
> just so happened most of the ports I tested used the same one. There's
> no way to really close this race if they are separate vectors though,
> so maybe just leave this as  a 'best effort' approach and update the
> change log accodingly.

or take the big hammer and move them into a single workqueue?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-08-31 22:55       ` Sinan Kaya
@ 2018-08-31 22:59         ` Keith Busch
  2018-08-31 23:07           ` Sinan Kaya
  0 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-08-31 22:59 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Thomas Tai,
	poza, Lukas Wunner

On Fri, Aug 31, 2018 at 03:55:58PM -0700, Sinan Kaya wrote:
> On 8/31/2018 3:33 PM, Keith Busch wrote:
> > Darn, you're right. The kernel allocates up to 32 vectors and the port is
> > free to divvy them up however it wants amont its supported services. It
> > just so happened most of the ports I tested used the same one. There's
> > no way to really close this race if they are separate vectors though,
> > so maybe just leave this as  a 'best effort' approach and update the
> > change log accodingly.
> 
> or take the big hammer and move them into a single workqueue?

That wouldn't really help if they're different interrupt vectors since
they could happen in either order with a delay between the CPU seeing
each of them.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-08-31 22:59         ` Keith Busch
@ 2018-08-31 23:07           ` Sinan Kaya
  0 siblings, 0 replies; 46+ messages in thread
From: Sinan Kaya @ 2018-08-31 23:07 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Thomas Tai,
	poza, Lukas Wunner

On 8/31/2018 3:59 PM, Keith Busch wrote:
> On Fri, Aug 31, 2018 at 03:55:58PM -0700, Sinan Kaya wrote:
>> On 8/31/2018 3:33 PM, Keith Busch wrote:
>>> Darn, you're right. The kernel allocates up to 32 vectors and the port is
>>> free to divvy them up however it wants amont its supported services. It
>>> just so happened most of the ports I tested used the same one. There's
>>> no way to really close this race if they are separate vectors though,
>>> so maybe just leave this as  a 'best effort' approach and update the
>>> change log accodingly.
>>
>> or take the big hammer and move them into a single workqueue?
> 
> That wouldn't really help if they're different interrupt vectors since
> they could happen in either order with a delay between the CPU seeing
> each of them.
> 

I was trying to make it look like single interrupt in software even though
they are sourced from different interrupt handlers.

But, you are right. Interrupts can arrive in arbitrary order.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 05/16] PCI/ERR: Handle fatal error recovery
  2018-08-31 21:26 ` [PATCH 05/16] PCI/ERR: Handle fatal error recovery Keith Busch
@ 2018-09-01  8:31   ` Christoph Hellwig
  2018-09-05  5:56   ` poza
  1 sibling, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2018-09-01  8:31 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza, Lukas Wunner

> +void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
> +{
> +	pcie_do_recovery(dev, pci_channel_io_frozen, service);
> +}
> +
> +void pcie_do_nonfatal_recovery(struct pci_dev *dev)
> +{
> +	pcie_do_recovery(dev, pci_channel_io_normal, PCIE_PORT_SERVICE_AER);
> +}

Is there any good reason to not just expose pcie_do_recovery
directly and skip these wrappers?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 08/16] PCI/ERR: Simplify broadcast callouts
  2018-08-31 21:26 ` [PATCH 08/16] PCI/ERR: Simplify broadcast callouts Keith Busch
@ 2018-09-01  8:33   ` Christoph Hellwig
  0 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2018-09-01  8:33 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza, Lukas Wunner

> +static int report_forzen_detected(struct pci_dev *dev, void *data)

s/forzen/frozen/ ?

>  static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
> +				pci_ers_result_t result, char *error_mesg,
> +				int (*cb)(struct pci_dev *, void *))
>  {
>  	pci_printk(KERN_DEBUG, dev, "broadcast %s message\n", error_mesg);
> +	pci_walk_bus(dev->subordinate, cb, &result);
> +	return result;
>  }

Is there any point in keeping this helper?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 09/16] PCI/ERR: Report current recovery status for udev
  2018-08-31 21:26 ` [PATCH 09/16] PCI/ERR: Report current recovery status for udev Keith Busch
@ 2018-09-01  8:36   ` Christoph Hellwig
  0 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2018-09-01  8:36 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza, Lukas Wunner

On Fri, Aug 31, 2018 at 03:26:32PM -0600, Keith Busch wrote:
> A device still participates in error recovery even if it doesn't have
> the error callbacks. This patch provides the status for user event
> watchers.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  drivers/pci/pcie/err.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index ee9014615add..a42a17a851fb 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -67,12 +67,12 @@ static int report_error_detected(struct pci_dev *dev,
>  			vote = PCI_ERS_RESULT_NO_AER_DRIVER;
>  		else
>  			vote = PCI_ERS_RESULT_NONE;
> +		goto out;
>  	}
> -
> +	err_handler = dev->driver->err_handler;
> +	vote = err_handler->error_detected(dev, state);
> +out:
> +	pci_uevent_ers(dev, vote);
>  	*result = merge_result(*result, vote);
>  	device_unlock(&dev->dev);
>  	return 0;

The goto out looks a little odd here.  Why not keep the else
and just move the uevent notification out of it?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 12/16] PCI/pciehp: Fix powerfault detection order
  2018-08-31 21:26 ` [PATCH 12/16] PCI/pciehp: Fix powerfault detection order Keith Busch
@ 2018-09-01 15:18   ` Lukas Wunner
  2018-09-04 14:27     ` Keith Busch
  0 siblings, 1 reply; 46+ messages in thread
From: Lukas Wunner @ 2018-09-01 15:18 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Fri, Aug 31, 2018 at 03:26:35PM -0600, Keith Busch wrote:
> A device add in a power controller controlled slot will power on and
> clear power fault slot events, but this was happening before the interrupt
> handler attempted to set the sticky status and attention indicators. The
> wrong status will be set if a hot-add and power fault are handled in
> one interrupt. This patch fixes that by checking for power faults before
> checking for new devices.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>

Reviewed-by: Lukas Wunner <lukas@wunner.de>

I think this should go in as a fix for v4.19, right?

Thanks for catching this, I missed it during my pciehp refactoring
because Thunderbolt doesn't have a power controller.

Actually I have further power controller related questions, maybe you
can help me out there:

* In pcie_enable_notification() there's a code comment:
  "TBD: Power fault detected software notification support"
  Is there really anything left to be done or can this code comment be
  deleted?  AFAICS, a power fault *is* detected and acted upon, then
  has to be cleared by the user, either through enablement via sysfs
  or by physically replacing the card.

* In pciehp_ist(), the if-condition to check for a power fault also
  checks for "&& !ctrl->power_fault_detected".  The check seems
  superfluous to me because pciehp_isr() clears the PCI_EXP_SLTSTA_PFD
  bit in the pending events if ctrl->power_fault_detected is true.
  Can the additional check be removed from the if-condition in
  pciehp_ist()?

* In pciehp_ist(), we call pciehp_green_led_off() if a power fault is
  detected.  However my (limited) understanding is that a power fault
  always results in a Link Down, and when bringing down the slot in
  response to that, we already call pciehp_green_led_off() in
  remove_board().  Can the call to pciehp_green_led_off() be removed
  from pciehp_ist()?  Or does the link not necessarily go down on a
  power fault and we should rather bring the slot down when a power
  fault is detected?  Is that the "TBD" mentioned in the code comment?
  Or is this just an intentional cosmetic behavior that we turn off
  the green LED as early as possible once we detect the slot is off?

Thanks,

Lukas

> ---
>  drivers/pci/hotplug/pciehp_hpc.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 7136e3430925..c6116e516e1e 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -646,6 +646,14 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
>  		pciehp_handle_button_press(slot);
>  	}
>  
> +	/* Check Power Fault Detected */
> +	if ((events & PCI_EXP_SLTSTA_PFD) && !ctrl->power_fault_detected) {
> +		ctrl->power_fault_detected = 1;
> +		ctrl_err(ctrl, "Slot(%s): Power fault\n", slot_name(slot));
> +		pciehp_set_attention_status(slot, 1);
> +		pciehp_green_led_off(slot);
> +	}
> +
>  	/*
>  	 * Disable requests have higher priority than Presence Detect Changed
>  	 * or Data Link Layer State Changed events.
> @@ -657,14 +665,6 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
>  		pciehp_handle_presence_or_link_change(slot, events);
>  	up_read(&ctrl->reset_lock);
>  
> -	/* Check Power Fault Detected */
> -	if ((events & PCI_EXP_SLTSTA_PFD) && !ctrl->power_fault_detected) {
> -		ctrl->power_fault_detected = 1;
> -		ctrl_err(ctrl, "Slot(%s): Power fault\n", slot_name(slot));
> -		pciehp_set_attention_status(slot, 1);
> -		pciehp_green_led_off(slot);
> -	}
> -
>  	pci_config_pm_runtime_put(pdev);
>  	wake_up(&ctrl->requester);
>  	return IRQ_HANDLED;
> -- 
> 2.14.4
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/16] PCI/ERR: Use slot reset if available
  2018-08-31 21:26 ` [PATCH 04/16] PCI/ERR: Use slot reset if available Keith Busch
@ 2018-09-01 17:20   ` Lukas Wunner
  2018-09-04 14:53     ` Keith Busch
  0 siblings, 1 reply; 46+ messages in thread
From: Lukas Wunner @ 2018-09-01 17:20 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Fri, Aug 31, 2018 at 03:26:27PM -0600, Keith Busch wrote:
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> +int pci_bus_error_reset(struct pci_dev *bridge)
> +{
> +	struct pci_bus *bus = bridge->subordinate;
> +
> +	if (!bus)
> +		return -ENOTTY;
> +
> +	if (!list_empty(&bus->slots)) {
> +		struct pci_slot *slot;
> +
> +		list_for_each_entry(slot, &bus->slots, list)
> +			if (pci_probe_reset_slot(slot))
> +				goto bus_reset;
> +		list_for_each_entry(slot, &bus->slots, list)
> +			if (pci_slot_reset(slot, 0))
> +				goto bus_reset;
> +		return 0;
> +	}

I don't see any locking to protect the slots list access.  It seems
pci_slot_mutex is meant to serve this purpose, but it's scoped to slot.c.
Which begs the question if this function should live there as well?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/16] PCI/portdrv: Restore pci state on slot reset
  2018-08-31 21:26 ` [PATCH 11/16] PCI/portdrv: Restore pci state on slot reset Keith Busch
@ 2018-09-02  9:34   ` Lukas Wunner
  2018-09-04 14:36     ` Keith Busch
  0 siblings, 1 reply; 46+ messages in thread
From: Lukas Wunner @ 2018-09-02  9:34 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Fri, Aug 31, 2018 at 03:26:34PM -0600, Keith Busch wrote:
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -174,7 +174,9 @@ static int slot_reset_iter(struct device *device, void *data)
>  
>  static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>  {
> +	pci_restore_state(dev);
>  	device_for_each_child(&dev->dev, dev, slot_reset_iter);
> +	pci_save_state(dev);
>  	return PCI_ERS_RESULT_RECOVERED;
>  }

Shouldn't this be the other way round, i.e. save, reset, restore?

Also, the function pcie_portdrv_slot_reset() was introduced in the prior
patch, so it seems this is a fix for that other patch and the two should
be squashed together.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 10/16] PCI/portdrv: Provide pci error callbacks
  2018-08-31 21:26 ` [PATCH 10/16] PCI/portdrv: Provide pci error callbacks Keith Busch
@ 2018-09-02 10:16   ` Lukas Wunner
  2018-09-04 21:38     ` Keith Busch
  0 siblings, 1 reply; 46+ messages in thread
From: Lukas Wunner @ 2018-09-02 10:16 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Fri, Aug 31, 2018 at 03:26:33PM -0600, Keith Busch wrote:
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -139,13 +139,45 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
>  	pcie_port_device_remove(dev);
>  }
>  
> +static int detected_iter(struct device *device, void *data)
> +{
> +	struct pci_dev *pdev = data;
> +	struct pcie_port_service_driver *driver;
> +
> +	if (device->bus == &pcie_port_bus_type && device->driver) {
> +		driver = to_service_driver(device->driver);
> +		if (driver && driver->error_detected)
> +			driver->error_detected(pdev);
> +	}
> +	return 0;
> +}

You're passing a pci_dev to the ->error_detected callback and this
forces the callback to laboriously find its own pcie port service device,
as visible in patch [13/16], where pciehp_error_detected() contains:

	struct device *device = pcie_port_find_device(dev, PCIE_PORT_SERVICE_HP);
	if (!device)
		return;
	pcie_dev = to_pcie_device(device);

This seems backwards, the callbacks should be passed a pcie_device
instead of a pci_dev.

Same for the ->slot_reset callback iterator added in this patch.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 13/16] PCI/pciehp: Implement error handling callbacks
  2018-08-31 21:26 ` [PATCH 13/16] PCI/pciehp: Implement error handling callbacks Keith Busch
@ 2018-09-02 10:39   ` Lukas Wunner
  2018-09-04 14:19     ` Keith Busch
  0 siblings, 1 reply; 46+ messages in thread
From: Lukas Wunner @ 2018-09-02 10:39 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Fri, Aug 31, 2018 at 03:26:36PM -0600, Keith Busch wrote:
> +	/*
> +	 * Shutdown notification to ignore hotplug events during error
> +	 * handling. We'll recheck the status when error handling completes.
> +	 *
> +	 * It is possible link event related to this error handling may have
> +	 * triggered a the hotplug interrupt ahead of this notification, but we
> +	 * can't do anything about that race.
> +	 */
> +	pcie_shutdown_notification(ctrl);

No, if you look at pciehp_remove() you'll notice that I call pci_hp_del()
before pcie_shutdown_notification().  The IRQ thread is needed to respond
to user requests to enable/disable the slot.  If you just release the IRQ,
the sysfs interface is still there but will no longer function while the
reset is handled.  Not good.

I think what you want to do instead is "down_write(&ctrl->reset_lock)",
see commit 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset").
And possibly mask PDCE/DLLSCE like pciehp_reset_slot() does.


> +static void pciehp_slot_reset(struct pci_dev *dev)
[...]
> +	/*
> +	 * Cache presence detect change, but ignore other hotplug events that
> +	 * may occur during error handling. Ports that implement only in-band
> +	 * presence detection may inadvertently believe the device has changed,
> +	 * so those devices will have to be re-enumerated.
> +	 */
> +	pciehp_get_adapter_changed(ctrl->slot, &changed);
> +	pcie_clear_hotplug_events(ctrl);
> +	if (changed)
> +		pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC);
> +	pcie_init_notification(ctrl);
> +}

Hm, can we be certain that error handling does not affect PDC?
Because pciehp_reset_slot() does mask it and Sinan once said that in-band
presence detect may cause presence to flap as well during reset:
https://lkml.org/lkml/2018/7/3/693

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-08-31 21:26 ` [PATCH 14/16] pciehp: Ignore link events during DPC event Keith Busch
  2018-08-31 22:18   ` Sinan Kaya
@ 2018-09-02 14:27   ` Lukas Wunner
  2018-09-04 14:16     ` Keith Busch
  1 sibling, 1 reply; 46+ messages in thread
From: Lukas Wunner @ 2018-09-02 14:27 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Fri, Aug 31, 2018 at 03:26:37PM -0600, Keith Busch wrote:
> This patch adds a channel state to a subordinate bus. When a DPC event is
> triggered, the DPC driver will set the channel state to frozen, and the
> pciehp driver will ignore link events if the subordinate bus is being
> managed by DPC error handling.
> 
> This is safe because the pciehp and DPC drivers share the same
> interrupt. The DPC driver sets the bus state in the top-half interrupt
> context, and the pciehp driver checks and masks off link events in its
> bottom-half error handler.

I really liked Sinan's approach of checking in pciehp whether a fatal
error is pending and waiting for it to be handled:
https://patchwork.ozlabs.org/patch/959464/

This seemed to avoid any races with DPC and is small and simple.
Can we pursue a solution along those lines?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 16/16] PCI: Unify device inaccessible
  2018-08-31 21:26 ` [PATCH 16/16] PCI: Unify device inaccessible Keith Busch
@ 2018-09-02 14:39   ` Lukas Wunner
  2018-09-03  0:38     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 46+ messages in thread
From: Lukas Wunner @ 2018-09-02 14:39 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Fri, Aug 31, 2018 at 03:26:39PM -0600, Keith Busch wrote:
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -294,21 +294,20 @@ struct pci_sriov {
>  static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
>  {
> -	set_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags);
> +	dev->error_state = pci_channel_io_perm_failure;
>  	return 0;
>  }

Back in 2016 when I floated the idea of using error_state to store
that the device has been removed, you responded:

   "I'd be happy if we can reuse that, but concerned about overloading
    error_state's intended purpose for AER. The conditions under which an
    'is_removed' may be set can also create AER events, and the aer driver
    overrides the error_state."
    https://spinics.net/lists/linux-pci/msg55417.html

Is it guaranteed that AER refrains from writing a different value to
error_state once it has been set to pci_channel_io_perm_failure due
to removal?  If so I'm happy with this patch.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 16/16] PCI: Unify device inaccessible
  2018-09-02 14:39   ` Lukas Wunner
@ 2018-09-03  0:38     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 46+ messages in thread
From: Benjamin Herrenschmidt @ 2018-09-03  0:38 UTC (permalink / raw)
  To: Lukas Wunner, Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Sinan Kaya, Thomas Tai, poza

On Sun, 2018-09-02 at 16:39 +0200, Lukas Wunner wrote:
> On Fri, Aug 31, 2018 at 03:26:39PM -0600, Keith Busch wrote:
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -294,21 +294,20 @@ struct pci_sriov {
> >  static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
> >  {
> > -	set_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags);
> > +	dev->error_state = pci_channel_io_perm_failure;
> >  	return 0;
> >  }
> 
> Back in 2016 when I floated the idea of using error_state to store
> that the device has been removed, you responded:

Wow, lots of activity while I wasn't looking :-) Unfortunately I'll be
away for a few weeks...

A quick note:

>    "I'd be happy if we can reuse that, but concerned about overloading
>     error_state's intended purpose for AER. The conditions under which an
>     'is_removed' may be set can also create AER events, and the aer driver
>     overrides the error_state."
>     https://spinics.net/lists/linux-pci/msg55417.html
> 
> Is it guaranteed that AER refrains from writing a different value to
> error_state once it has been set to pci_channel_io_perm_failure due
> to removal?  If so I'm happy with this patch.

My suggestion to avoid that problem (we have a similar one in theory
with EEH which can set error_state from interrupts) is to make
error_state an atomic by having the "set" function use cmpxchg to
enforce that there is no valid transition from perm_failure.

I was hoping to cookup some patches along the line of the RFC I already
sent factoring the above, but a number of things here got in the way
and I'm about to head out of the country for 3 weeks.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-09-02 14:27   ` Lukas Wunner
@ 2018-09-04 14:16     ` Keith Busch
  2018-09-04 14:40       ` Lukas Wunner
  0 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2018-09-04 14:16 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Sun, Sep 02, 2018 at 04:27:14PM +0200, Lukas Wunner wrote:
> On Fri, Aug 31, 2018 at 03:26:37PM -0600, Keith Busch wrote:
> > This patch adds a channel state to a subordinate bus. When a DPC event is
> > triggered, the DPC driver will set the channel state to frozen, and the
> > pciehp driver will ignore link events if the subordinate bus is being
> > managed by DPC error handling.
> > 
> > This is safe because the pciehp and DPC drivers share the same
> > interrupt. The DPC driver sets the bus state in the top-half interrupt
> > context, and the pciehp driver checks and masks off link events in its
> > bottom-half error handler.
> 
> I really liked Sinan's approach of checking in pciehp whether a fatal
> error is pending and waiting for it to be handled:
> https://patchwork.ozlabs.org/patch/959464/
> 
> This seemed to avoid any races with DPC and is small and simple.
> Can we pursue a solution along those lines?

That introduces a completely different race between the error handling
and hotplug threads. We don't control  which interrupt fires first or
any way ensure they're even the same event.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 13/16] PCI/pciehp: Implement error handling callbacks
  2018-09-02 10:39   ` Lukas Wunner
@ 2018-09-04 14:19     ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-09-04 14:19 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Sun, Sep 02, 2018 at 12:39:55PM +0200, Lukas Wunner wrote:
> On Fri, Aug 31, 2018 at 03:26:36PM -0600, Keith Busch wrote:
> > +	/*
> > +	 * Shutdown notification to ignore hotplug events during error
> > +	 * handling. We'll recheck the status when error handling completes.
> > +	 *
> > +	 * It is possible link event related to this error handling may have
> > +	 * triggered a the hotplug interrupt ahead of this notification, but we
> > +	 * can't do anything about that race.
> > +	 */
> > +	pcie_shutdown_notification(ctrl);
> 
> No, if you look at pciehp_remove() you'll notice that I call pci_hp_del()
> before pcie_shutdown_notification().  The IRQ thread is needed to respond
> to user requests to enable/disable the slot.  If you just release the IRQ,
> the sysfs interface is still there but will no longer function while the
> reset is handled.  Not good.
> 
> I think what you want to do instead is "down_write(&ctrl->reset_lock)",
> see commit 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset").
> And possibly mask PDCE/DLLSCE like pciehp_reset_slot() does.
> 
> 
> > +static void pciehp_slot_reset(struct pci_dev *dev)
> [...]
> > +	/*
> > +	 * Cache presence detect change, but ignore other hotplug events that
> > +	 * may occur during error handling. Ports that implement only in-band
> > +	 * presence detection may inadvertently believe the device has changed,
> > +	 * so those devices will have to be re-enumerated.
> > +	 */
> > +	pciehp_get_adapter_changed(ctrl->slot, &changed);
> > +	pcie_clear_hotplug_events(ctrl);
> > +	if (changed)
> > +		pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC);
> > +	pcie_init_notification(ctrl);
> > +}
> 
> Hm, can we be certain that error handling does not affect PDC?

No, I'm actually saying exactly the opposite in the code comments. The
error handling may affect PDC if the port doesn't have an out-of-band
presence detection capability.

> Because pciehp_reset_slot() does mask it and Sinan once said that in-band
> presence detect may cause presence to flap as well during reset:
> https://lkml.org/lkml/2018/7/3/693

Right, but how do you know which PDC is okay to ignore and which one
isn't? Hotplug events freqently trigger other error handling so we can't
just ignore hot plug during error handling. Link events should be safe
to ignore, though.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 12/16] PCI/pciehp: Fix powerfault detection order
  2018-09-01 15:18   ` Lukas Wunner
@ 2018-09-04 14:27     ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-09-04 14:27 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Sat, Sep 01, 2018 at 05:18:36PM +0200, Lukas Wunner wrote:
> * In pcie_enable_notification() there's a code comment:
>   "TBD: Power fault detected software notification support"
>   Is there really anything left to be done or can this code comment be
>   deleted?  AFAICS, a power fault *is* detected and acted upon, then
>   has to be cleared by the user, either through enablement via sysfs
>   or by physically replacing the card.
> 
> * In pciehp_ist(), the if-condition to check for a power fault also
>   checks for "&& !ctrl->power_fault_detected".  The check seems
>   superfluous to me because pciehp_isr() clears the PCI_EXP_SLTSTA_PFD
>   bit in the pending events if ctrl->power_fault_detected is true.
>   Can the additional check be removed from the if-condition in
>   pciehp_ist()?

The slot is supposed to have PFD set while the controller detects a
power fault. The host acknowledging the event doesn't actually make the
power fault stop exisiting, so we have to make the observation sticky
until it goes away.
 
> * In pciehp_ist(), we call pciehp_green_led_off() if a power fault is
>   detected.  However my (limited) understanding is that a power fault
>   always results in a Link Down, and when bringing down the slot in
>   response to that, we already call pciehp_green_led_off() in
>   remove_board().  Can the call to pciehp_green_led_off() be removed
>   from pciehp_ist()?  Or does the link not necessarily go down on a
>   power fault and we should rather bring the slot down when a power
>   fault is detected?  Is that the "TBD" mentioned in the code comment?
>   Or is this just an intentional cosmetic behavior that we turn off
>   the green LED as early as possible once we detect the slot is off?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/16] PCI/portdrv: Restore pci state on slot reset
  2018-09-02  9:34   ` Lukas Wunner
@ 2018-09-04 14:36     ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-09-04 14:36 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Sun, Sep 02, 2018 at 02:34:02AM -0700, Lukas Wunner wrote:
> On Fri, Aug 31, 2018 at 03:26:34PM -0600, Keith Busch wrote:
> > --- a/drivers/pci/pcie/portdrv_pci.c
> > +++ b/drivers/pci/pcie/portdrv_pci.c
> > @@ -174,7 +174,9 @@ static int slot_reset_iter(struct device *device, void *data)
> >  
> >  static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
> >  {
> > +	pci_restore_state(dev);
> >  	device_for_each_child(&dev->dev, dev, slot_reset_iter);
> > +	pci_save_state(dev);
> >  	return PCI_ERS_RESULT_RECOVERED;
> >  }
> 
> Shouldn't this be the other way round, i.e. save, reset, restore?

No, this is the correct order. The state is originally saved in
pcie_portdrv_probe, and we need to restore to that state in the slot
reset callback. Once the slot has been restored through all the child
drivers, then we need to save the state again for any future error
handling.
 
> Also, the function pcie_portdrv_slot_reset() was introduced in the prior
> patch, so it seems this is a fix for that other patch and the two should
> be squashed together.

I guess they could be squashed, but the problems they're addressing seemed
different enough to warrant separate change logs. The previous patch
just set up the infrastructure for port error callbacks and chaining to
child drivers. This patch is fixing a specific problem to restore the
slot to a usable state.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-09-04 14:16     ` Keith Busch
@ 2018-09-04 14:40       ` Lukas Wunner
  2018-09-04 15:31         ` Keith Busch
  0 siblings, 1 reply; 46+ messages in thread
From: Lukas Wunner @ 2018-09-04 14:40 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Tue, Sep 04, 2018 at 08:16:02AM -0600, Keith Busch wrote:
> On Sun, Sep 02, 2018 at 04:27:14PM +0200, Lukas Wunner wrote:
> > On Fri, Aug 31, 2018 at 03:26:37PM -0600, Keith Busch wrote:
> > > This patch adds a channel state to a subordinate bus. When a DPC event is
> > > triggered, the DPC driver will set the channel state to frozen, and the
> > > pciehp driver will ignore link events if the subordinate bus is being
> > > managed by DPC error handling.
> > > 
> > > This is safe because the pciehp and DPC drivers share the same
> > > interrupt. The DPC driver sets the bus state in the top-half interrupt
> > > context, and the pciehp driver checks and masks off link events in its
> > > bottom-half error handler.
> > 
> > I really liked Sinan's approach of checking in pciehp whether a fatal
> > error is pending and waiting for it to be handled:
> > https://patchwork.ozlabs.org/patch/959464/
> > 
> > This seemed to avoid any races with DPC and is small and simple.
> > Can we pursue a solution along those lines?
> 
> That introduces a completely different race between the error handling
> and hotplug threads. We don't control  which interrupt fires first or
> any way ensure they're even the same event.

pciehp may react quicker than dpc, hence needs to determine a fatal
error is pending without relying on dpc.  My understanding is that
this is achieved by Sinan checking PCI_EXP_DEVSTA_FED directly from
pciehp.

For the case when dpc reacts quicker and clears the error before
pciehp checks for PCI_EXP_DEVSTA_FED, you need an additional
synchronization mechanism between dpc and pciehp, such as a flag
that is set by dpc before clearing the error, and that is checked
by pciehp.  Though you need to take care that pciehp does not see
a stale flag when the next error occurs.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/16] PCI/ERR: Use slot reset if available
  2018-09-01 17:20   ` Lukas Wunner
@ 2018-09-04 14:53     ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-09-04 14:53 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Sat, Sep 01, 2018 at 10:20:11AM -0700, Lukas Wunner wrote:
> On Fri, Aug 31, 2018 at 03:26:27PM -0600, Keith Busch wrote:
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > +int pci_bus_error_reset(struct pci_dev *bridge)
> > +{
> > +	struct pci_bus *bus = bridge->subordinate;
> > +
> > +	if (!bus)
> > +		return -ENOTTY;
> > +
> > +	if (!list_empty(&bus->slots)) {
> > +		struct pci_slot *slot;
> > +
> > +		list_for_each_entry(slot, &bus->slots, list)
> > +			if (pci_probe_reset_slot(slot))
> > +				goto bus_reset;
> > +		list_for_each_entry(slot, &bus->slots, list)
> > +			if (pci_slot_reset(slot, 0))
> > +				goto bus_reset;
> > +		return 0;
> > +	}
> 
> I don't see any locking to protect the slots list access.  It seems
> pci_slot_mutex is meant to serve this purpose, but it's scoped to slot.c.
> Which begs the question if this function should live there as well?

Thanks for pointing that out. I think we ought to have all the needed
locking between pci_bus_sem and pci_rescan_remove_lock. I'll see if we
can make pci_slot_mutex unnecessary.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/16] pciehp: Ignore link events during DPC event
  2018-09-04 14:40       ` Lukas Wunner
@ 2018-09-04 15:31         ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-09-04 15:31 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Tue, Sep 04, 2018 at 04:40:14PM +0200, Lukas Wunner wrote:
> On Tue, Sep 04, 2018 at 08:16:02AM -0600, Keith Busch wrote:
> > On Sun, Sep 02, 2018 at 04:27:14PM +0200, Lukas Wunner wrote:
> > > On Fri, Aug 31, 2018 at 03:26:37PM -0600, Keith Busch wrote:
> > > > This patch adds a channel state to a subordinate bus. When a DPC event is
> > > > triggered, the DPC driver will set the channel state to frozen, and the
> > > > pciehp driver will ignore link events if the subordinate bus is being
> > > > managed by DPC error handling.
> > > > 
> > > > This is safe because the pciehp and DPC drivers share the same
> > > > interrupt. The DPC driver sets the bus state in the top-half interrupt
> > > > context, and the pciehp driver checks and masks off link events in its
> > > > bottom-half error handler.
> > > 
> > > I really liked Sinan's approach of checking in pciehp whether a fatal
> > > error is pending and waiting for it to be handled:
> > > https://patchwork.ozlabs.org/patch/959464/
> > > 
> > > This seemed to avoid any races with DPC and is small and simple.
> > > Can we pursue a solution along those lines?
> > 
> > That introduces a completely different race between the error handling
> > and hotplug threads. We don't control  which interrupt fires first or
> > any way ensure they're even the same event.
> 
> pciehp may react quicker than dpc, hence needs to determine a fatal
> error is pending without relying on dpc.  My understanding is that
> this is achieved by Sinan checking PCI_EXP_DEVSTA_FED directly from
> pciehp.

That's only true if the bridge detects ERR_FATAL, which is one of several
ways to trigger DPC or AER. If the message comes from the end device,
then PCI_EXP_DEVSTA_FED won't be set in the bridge that pciehp can
read.

> For the case when dpc reacts quicker and clears the error before
> pciehp checks for PCI_EXP_DEVSTA_FED, you need an additional
> synchronization mechanism between dpc and pciehp, such as a flag
> that is set by dpc before clearing the error, and that is checked
> by pciehp.  Though you need to take care that pciehp does not see
> a stale flag when the next error occurs.

Yes, the pci_bus error_state this patch creates was intended to be
that flag.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 10/16] PCI/portdrv: Provide pci error callbacks
  2018-09-02 10:16   ` Lukas Wunner
@ 2018-09-04 21:38     ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2018-09-04 21:38 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, poza

On Sun, Sep 02, 2018 at 12:16:41PM +0200, Lukas Wunner wrote:
> On Fri, Aug 31, 2018 at 03:26:33PM -0600, Keith Busch wrote:
> > --- a/drivers/pci/pcie/portdrv_pci.c
> > +++ b/drivers/pci/pcie/portdrv_pci.c
> > @@ -139,13 +139,45 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
> >  	pcie_port_device_remove(dev);
> >  }
> >  
> > +static int detected_iter(struct device *device, void *data)
> > +{
> > +	struct pci_dev *pdev = data;
> > +	struct pcie_port_service_driver *driver;
> > +
> > +	if (device->bus == &pcie_port_bus_type && device->driver) {
> > +		driver = to_service_driver(device->driver);
> > +		if (driver && driver->error_detected)
> > +			driver->error_detected(pdev);
> > +	}
> > +	return 0;
> > +}
> 
> You're passing a pci_dev to the ->error_detected callback and this
> forces the callback to laboriously find its own pcie port service device,
> as visible in patch [13/16], where pciehp_error_detected() contains:
> 
> 	struct device *device = pcie_port_find_device(dev, PCIE_PORT_SERVICE_HP);
> 	if (!device)
> 		return;
> 	pcie_dev = to_pcie_device(device);
> 
> This seems backwards, the callbacks should be passed a pcie_device
> instead of a pci_dev.
> 
> Same for the ->slot_reset callback iterator added in this patch.

I agree that is better. I was only trying to stick with the existing
pattern, but I can change that as a prep patch preceeding adding these
error callbacks, no problem.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 05/16] PCI/ERR: Handle fatal error recovery
  2018-08-31 21:26 ` [PATCH 05/16] PCI/ERR: Handle fatal error recovery Keith Busch
  2018-09-01  8:31   ` Christoph Hellwig
@ 2018-09-05  5:56   ` poza
  1 sibling, 0 replies; 46+ messages in thread
From: poza @ 2018-09-05  5:56 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux PCI, Bjorn Helgaas, Benjamin Herrenschmidt, Sinan Kaya,
	Thomas Tai, Lukas Wunner

On 2018-09-01 02:56, Keith Busch wrote:
> We don't need to be paranoid about the topology changing while handling 
> an
> error. If the device has changed in a hotplug capable slot, we can rely
> on the presence detection handling to react to a changing topology. 
> This
> patch restores the fatal error handling behavior that existed before
> merging DPC with AER with 7e9084b3674 ("PCI/AER: Handle ERR_FATAL with
> removal and re-enumeration of devices").
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  drivers/pci/pcie/dpc.c |  2 +-
>  drivers/pci/pcie/err.c | 89 
> +++++++++-----------------------------------------
>  2 files changed, 17 insertions(+), 74 deletions(-)
> 
> diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
> index f03279fc87cd..eadfd835af13 100644
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -169,7 +169,7 @@ static irqreturn_t dpc_handler(int irq, void 
> *context)
> 
>  	reason = (status & PCI_EXP_DPC_STATUS_TRIGGER_RSN) >> 1;
>  	ext_reason = (status & PCI_EXP_DPC_STATUS_TRIGGER_RSN_EXT) >> 5;
> -	dev_warn(dev, "DPC %s detected, remove downstream devices\n",
> +	dev_warn(dev, "DPC %s detected\n",
>  		 (reason == 0) ? "unmasked uncorrectable error" :
>  		 (reason == 1) ? "ERR_NONFATAL" :
>  		 (reason == 2) ? "ERR_FATAL" :
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 12c1205e1d80..44c55f7ceb39 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -271,87 +271,20 @@ static pci_ers_result_t
> broadcast_error_message(struct pci_dev *dev,
>  	return result_data.result;
>  }
> 
> -/**
> - * pcie_do_fatal_recovery - handle fatal error recovery process
> - * @dev: pointer to a pci_dev data structure of agent detecting an 
> error
> - *
> - * Invoked when an error is fatal. Once being invoked, removes the 
> devices
> - * beneath this AER agent, followed by reset link e.g. secondary bus 
> reset
> - * followed by re-enumeration of devices.
> - */
> -void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
> -{
> -	struct pci_dev *udev;
> -	struct pci_bus *parent;
> -	struct pci_dev *pdev, *temp;
> -	pci_ers_result_t result;
> -
> -	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
> -		udev = dev;
> -	else
> -		udev = dev->bus->self;
> -
> -	parent = udev->subordinate;
> -	pci_lock_rescan_remove();
> -	pci_dev_get(dev);
> -	list_for_each_entry_safe_reverse(pdev, temp, &parent->devices,
> -					 bus_list) {
> -		pci_dev_get(pdev);
> -		pci_dev_set_disconnected(pdev, NULL);
> -		if (pci_has_subordinate(pdev))
> -			pci_walk_bus(pdev->subordinate,
> -				     pci_dev_set_disconnected, NULL);
> -		pci_stop_and_remove_bus_device(pdev);
> -		pci_dev_put(pdev);
> -	}
> -
> -	result = reset_link(udev, service);
> -
> -	if ((service == PCIE_PORT_SERVICE_AER) &&
> -	    (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)) {
> -		/*
> -		 * If the error is reported by a bridge, we think this error
> -		 * is related to the downstream link of the bridge, so we
> -		 * do error recovery on all subordinates of the bridge instead
> -		 * of the bridge and clear the error status of the bridge.
> -		 */
> -		pci_aer_clear_fatal_status(dev);
> -		pci_aer_clear_device_status(dev);
> -	}
> -
> -	if (result == PCI_ERS_RESULT_RECOVERED) {
> -		if (pcie_wait_for_link(udev, true))
> -			pci_rescan_bus(udev->bus);
> -		pci_info(dev, "Device recovery from fatal error successful\n");
> -	} else {
> -		pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
> -		pci_info(dev, "Device recovery from fatal error failed\n");
> -	}
> -
> -	pci_dev_put(dev);
> -	pci_unlock_rescan_remove();
> -}
> -
> -/**
> - * pcie_do_nonfatal_recovery - handle nonfatal error recovery process
> - * @dev: pointer to a pci_dev data structure of agent detecting an 
> error
> - *
> - * Invoked when an error is nonfatal/fatal. Once being invoked, 
> broadcast
> - * error detected message to all downstream drivers within a hierarchy 
> in
> - * question and return the returned code.
> - */
> -void pcie_do_nonfatal_recovery(struct pci_dev *dev)
> +static void pcie_do_recovery(struct pci_dev *dev, enum 
> pci_channel_state state,
> +			     u32 service)
>  {
>  	pci_ers_result_t status;
> -	enum pci_channel_state state;
> -
> -	state = pci_channel_io_normal;
> 
>  	status = broadcast_error_message(dev,
>  			state,
>  			"error_detected",
>  			report_error_detected);
> 
> +	if (state == pci_channel_io_frozen &&
> +	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
> +		goto failed;
> +
>  	if (status == PCI_ERS_RESULT_CAN_RECOVER)
>  		status = broadcast_error_message(dev,
>  				state,
> @@ -387,3 +320,13 @@ void pcie_do_nonfatal_recovery(struct pci_dev 
> *dev)
>  	/* TODO: Should kernel panic here? */
>  	pci_info(dev, "AER: Device recovery failed\n");
>  }
> +
> +void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
> +{
> +	pcie_do_recovery(dev, pci_channel_io_frozen, service);
> +}
> +
> +void pcie_do_nonfatal_recovery(struct pci_dev *dev)
> +{
> +	pcie_do_recovery(dev, pci_channel_io_normal, PCIE_PORT_SERVICE_AER);
> +}


Overall I like this idea of not being paranoid about the topology 
changing while handling an
error. this is what was always on my mind, now since we are there, its 
good looking series.

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2018-09-05 10:24 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-31 21:26 [PATCH 00/16] PCI, error handling and hot plug Keith Busch
2018-08-31 21:26 ` [PATCH 01/16] PCI: Simplify disconnected marking Keith Busch
2018-08-31 21:26 ` [PATCH 02/16] PCI: Fix pci_reset_bus Keith Busch
2018-08-31 21:52   ` Sinan Kaya
2018-08-31 22:08     ` Keith Busch
2018-08-31 21:26 ` [PATCH 03/16] PCI/AER: Remove dead code Keith Busch
2018-08-31 21:26 ` [PATCH 04/16] PCI/ERR: Use slot reset if available Keith Busch
2018-09-01 17:20   ` Lukas Wunner
2018-09-04 14:53     ` Keith Busch
2018-08-31 21:26 ` [PATCH 05/16] PCI/ERR: Handle fatal error recovery Keith Busch
2018-09-01  8:31   ` Christoph Hellwig
2018-09-05  5:56   ` poza
2018-08-31 21:26 ` [PATCH 06/16] PCI/ERR: Remove devices on recovery failure Keith Busch
2018-08-31 22:26   ` Sinan Kaya
2018-08-31 21:26 ` [PATCH 07/16] PCI/ERR: Always use the first downstream port Keith Busch
2018-08-31 21:26 ` [PATCH 08/16] PCI/ERR: Simplify broadcast callouts Keith Busch
2018-09-01  8:33   ` Christoph Hellwig
2018-08-31 21:26 ` [PATCH 09/16] PCI/ERR: Report current recovery status for udev Keith Busch
2018-09-01  8:36   ` Christoph Hellwig
2018-08-31 21:26 ` [PATCH 10/16] PCI/portdrv: Provide pci error callbacks Keith Busch
2018-09-02 10:16   ` Lukas Wunner
2018-09-04 21:38     ` Keith Busch
2018-08-31 21:26 ` [PATCH 11/16] PCI/portdrv: Restore pci state on slot reset Keith Busch
2018-09-02  9:34   ` Lukas Wunner
2018-09-04 14:36     ` Keith Busch
2018-08-31 21:26 ` [PATCH 12/16] PCI/pciehp: Fix powerfault detection order Keith Busch
2018-09-01 15:18   ` Lukas Wunner
2018-09-04 14:27     ` Keith Busch
2018-08-31 21:26 ` [PATCH 13/16] PCI/pciehp: Implement error handling callbacks Keith Busch
2018-09-02 10:39   ` Lukas Wunner
2018-09-04 14:19     ` Keith Busch
2018-08-31 21:26 ` [PATCH 14/16] pciehp: Ignore link events during DPC event Keith Busch
2018-08-31 22:18   ` Sinan Kaya
2018-08-31 22:33     ` Keith Busch
2018-08-31 22:55       ` Sinan Kaya
2018-08-31 22:59         ` Keith Busch
2018-08-31 23:07           ` Sinan Kaya
2018-09-02 14:27   ` Lukas Wunner
2018-09-04 14:16     ` Keith Busch
2018-09-04 14:40       ` Lukas Wunner
2018-09-04 15:31         ` Keith Busch
2018-08-31 21:26 ` [PATCH 15/16] PCI/DPC: Wait for reset complete Keith Busch
2018-08-31 22:15   ` Sinan Kaya
2018-08-31 21:26 ` [PATCH 16/16] PCI: Unify device inaccessible Keith Busch
2018-09-02 14:39   ` Lukas Wunner
2018-09-03  0:38     ` Benjamin Herrenschmidt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.