* [PATCH v5 0/4] Address error and recovery for AER and DPC
@ 2018-01-17 10:37 Oza Pawandeep
  2018-01-17 10:37 ` [PATCH v5 1/4] PCI/AER: Rename error recovery to generic pci naming Oza Pawandeep
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Oza Pawandeep @ 2018-01-17 10:37 UTC (permalink / raw)
  To: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
  Cc: Oza Pawandeep

This patch set adds error handling support for DPC.

The current implementation of AER and of error message broadcasting to the
EP driver is tightly coupled and limited to the AER service driver.
It is important to factor out the broadcasting and other link handling
callbacks so that they are handled appropriately not only when AER is
triggered, but also when DPC is triggered (e.g. by ERR_FATAL).

DPC should enumerate the devices after recovering the link, which is
achieved by implementing the error_resume callback.
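
For readers new to the recovery flow, the order of steps that
pci_do_recovery() (patch 2/4) goes through can be sketched in userspace C.
This is only a simplified model under stated assumptions, not kernel code:
it handles a single driver instead of walking a bus, and every name in it
(ers_result, err_handlers, do_recovery_model, the sample_* handlers,
link_reset_ok, link_reset_fails) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for pci_ers_result_t values. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED,
	ERS_NO_DRIVER,
};

/* Model of one driver's error handlers; NULL means "not implemented". */
struct err_handlers {
	enum ers_result (*error_detected)(int fatal);
	enum ers_result (*mmio_enabled)(void);
	enum ers_result (*slot_reset)(void);
	void (*resume)(void);
};

/*
 * Simplified model of the pci_do_recovery() sequence for a single driver.
 * reset_link stands in for the port service's link reset.
 * Returns 1 if recovery succeeded, 0 otherwise.
 */
int do_recovery_model(const struct err_handlers *h, int fatal,
		      enum ers_result (*reset_link)(void))
{
	enum ers_result status;

	assert(h != NULL && reset_link != NULL);

	/* Step 1: tell the driver that an error was detected. */
	status = h->error_detected ? h->error_detected(fatal) : ERS_NO_DRIVER;

	/* Step 2: a fatal error requires a link reset; give up if it fails. */
	if (fatal && reset_link() != ERS_RECOVERED)
		return 0;

	/* Step 3: the driver may inspect the device if it voted CAN_RECOVER. */
	if (status == ERS_CAN_RECOVER)
		status = h->mmio_enabled ? h->mmio_enabled() : ERS_RECOVERED;

	/* Step 4: the driver asked for a reset; notify it via slot_reset. */
	if (status == ERS_NEED_RESET)
		status = h->slot_reset ? h->slot_reset() : ERS_RECOVERED;

	if (status != ERS_RECOVERED)
		return 0;

	/* Step 5: normal operation may resume. */
	if (h->resume)
		h->resume();
	return 1;
}

/* Sample handlers and link resets, purely for illustration. */
int resume_calls;

enum ers_result sample_detected(int fatal)
{
	return fatal ? ERS_NEED_RESET : ERS_CAN_RECOVER;
}

enum ers_result sample_mmio_enabled(void) { return ERS_RECOVERED; }
enum ers_result sample_slot_reset(void) { return ERS_RECOVERED; }
void sample_resume(void) { resume_calls++; }

enum ers_result link_reset_ok(void) { return ERS_RECOVERED; }
enum ers_result link_reset_fails(void) { return ERS_DISCONNECT; }

const struct err_handlers sample_driver = {
	.error_detected	= sample_detected,
	.mmio_enabled	= sample_mmio_enabled,
	.slot_reset	= sample_slot_reset,
	.resume		= sample_resume,
};
```

In this model, do_recovery_model(&sample_driver, 1, link_reset_ok) takes the
fatal path (link reset, then slot_reset, then resume), while passing
link_reset_fails stops right after the failed link reset and never calls
resume.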

Changes since v4:
	Bjorn's comments incorporated.
		> Renamed only do_recovery.
		> Moved things more locally, to drivers/pci/pci.h
Changes since v3:
	Bjorn's comments incorporated.
		> Made separate patch renaming generic pci_err.c
		> Introduce pci_err.h to contain all the error types and recovery
		> removed all the dependencies on pci.h
Changes since v2:
	Based on feedback from Keith:
	"
	When DPC is triggered due to receipt of an uncorrectable error Message,
	the Requester ID from the Message is recorded in the DPC Error
	Source ID register and that Message is discarded and not forwarded Upstream.
	"
	Removed the patch where AER checks if DPC service is active
Changes since v1:
	Kbuild errors fixed:
		> pci_find_dpc_dev made static
		> ras_event.h updated
		> pci_find_aer_service call with CONFIG check
		> pci_find_dpc_service call with CONFIG check

Oza Pawandeep (4):
  PCI/AER: Rename error recovery to generic pci naming
  PCI/AER: factor out error reporting from AER
  PCI/DPC: Unify and plumb error handling into DPC
  PCI/DPC: Enumerate the devices after DPC trigger event

 drivers/acpi/apei/ghes.c               |   2 +-
 drivers/pci/pcie/Makefile              |   2 +-
 drivers/pci/pcie/aer/aerdrv.h          |  30 ---
 drivers/pci/pcie/aer/aerdrv_core.c     | 306 +------------------------
 drivers/pci/pcie/aer/aerdrv_errprint.c |  27 ++-
 drivers/pci/pcie/pcie-dpc.c            | 127 ++++++++++-
 drivers/pci/pcie/pcie-err.c            | 399 +++++++++++++++++++++++++++++++++
 drivers/pci/pcie/portdrv.h             |   2 +
 include/linux/aer.h                    |   4 -
 include/linux/pci.h                    |  23 ++
 include/ras/ras_event.h                |   6 +-
 11 files changed, 579 insertions(+), 349 deletions(-)
 create mode 100644 drivers/pci/pcie/pcie-err.c

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* [PATCH v5 1/4] PCI/AER: Rename error recovery to generic pci naming
  2018-01-17 10:37 [PATCH v5 0/4] Address error and recovery for AER and DPC Oza Pawandeep
@ 2018-01-17 10:37 ` Oza Pawandeep
  2018-01-17 10:37 ` [PATCH v5 2/4] PCI/AER: factor out error reporting from AER Oza Pawandeep
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Oza Pawandeep @ 2018-01-17 10:37 UTC (permalink / raw)
  To: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
  Cc: Oza Pawandeep

This patch renames the error recovery function do_recovery() to
pci_do_recovery(), a generic name with the pci prefix.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>

diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 7448052..6cb1b36 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -482,7 +482,7 @@ static pci_ers_result_t reset_link(struct pci_dev *dev)
 }
 
 /**
- * do_recovery - handle nonfatal/fatal error recovery process
+ * pci_do_recovery - handle nonfatal/fatal error recovery process
  * @dev: pointer to a pci_dev data structure of agent detecting an error
  * @severity: error severity type
  *
@@ -490,7 +490,7 @@ static pci_ers_result_t reset_link(struct pci_dev *dev)
  * error detected message to all downstream drivers within a hierarchy in
  * question and return the returned code.
  */
-static void do_recovery(struct pci_dev *dev, int severity)
+static void pci_do_recovery(struct pci_dev *dev, int severity)
 {
 	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
 	enum pci_channel_state state;
@@ -569,7 +569,7 @@ static void handle_error_source(struct pcie_device *aerdev,
 			pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
 					info->status);
 	} else
-		do_recovery(dev, info->severity);
+		pci_do_recovery(dev, info->severity);
 }
 
 #ifdef CONFIG_ACPI_APEI_PCIEAER
@@ -633,7 +633,7 @@ static void aer_recover_work_func(struct work_struct *work)
 			continue;
 		}
 		cper_print_aer(pdev, entry.severity, entry.regs);
-		do_recovery(pdev, entry.severity);
+		pci_do_recovery(pdev, entry.severity);
 		pci_dev_put(pdev);
 	}
 }

* [PATCH v5 2/4] PCI/AER: factor out error reporting from AER
  2018-01-17 10:37 [PATCH v5 0/4] Address error and recovery for AER and DPC Oza Pawandeep
  2018-01-17 10:37 ` [PATCH v5 1/4] PCI/AER: Rename error recovery to generic pci naming Oza Pawandeep
@ 2018-01-17 10:37 ` Oza Pawandeep
  2018-01-17 10:37 ` [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC Oza Pawandeep
  2018-01-17 10:37 ` [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event Oza Pawandeep
  3 siblings, 0 replies; 24+ messages in thread
From: Oza Pawandeep @ 2018-01-17 10:37 UTC (permalink / raw)
  To: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
  Cc: Oza Pawandeep

This patch factors out the error reporting callbacks, which are currently
tightly coupled with AER.
DPC should be able to register callbacks and attempt recovery when a DPC
trigger event occurs.
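
As an illustration of what the factored-out callbacks compute, the vote
merging done by merge_result() can be modelled in userspace C. This is a
sketch only: the decision table follows merge_result() as it appears in this
patch, but the ERS_* enumerators and the merge_votes() helper are
hypothetical stand-ins, and the loop only approximates the per-device
callbacks that pci_walk_bus() drives.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for pci_ers_result_t values. */
enum ers_result {
	ERS_NONE,
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED,
	ERS_NO_AER_DRIVER,
};

/* Same decision table as merge_result() in this patch. */
enum ers_result merge_result(enum ers_result orig, enum ers_result new_vote)
{
	/* A device without error handlers vetoes recovery for the subtree. */
	if (new_vote == ERS_NO_AER_DRIVER)
		return ERS_NO_AER_DRIVER;

	/* A vote of "none" leaves the running result untouched. */
	if (new_vote == ERS_NONE)
		return orig;

	switch (orig) {
	case ERS_CAN_RECOVER:
	case ERS_RECOVERED:
		/* Optimistic results are replaced by the new vote. */
		orig = new_vote;
		break;
	case ERS_DISCONNECT:
		/* Only a reset request overrides a disconnect. */
		if (new_vote == ERS_NEED_RESET)
			orig = ERS_NEED_RESET;
		break;
	default:
		break;
	}

	return orig;
}

/* Fold the votes of n devices into one result, one vote at a time. */
enum ers_result merge_votes(enum ers_result initial,
			    const enum ers_result *votes, size_t n)
{
	enum ers_result result = initial;
	size_t i;

	assert(votes != NULL || n == 0);
	for (i = 0; i < n; i++)
		result = merge_result(result, votes[i]);
	return result;
}
```

Starting from ERS_CAN_RECOVER, as broadcast_error_message() does for
error_detected, a single ERS_NEED_RESET vote is enough to move the whole
hierarchy to the slot_reset step, and a single ERS_NO_AER_DRIVER vote sticks
regardless of later votes.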

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index f6b58b3..665ff6c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -342,6 +342,9 @@ static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
 
 void pci_enable_acs(struct pci_dev *dev);
 
+/* PCI error reporting and recovery */
+void pci_do_recovery(struct pci_dev *dev, int severity);
+
 #ifdef CONFIG_PCIE_PTM
 void pci_ptm_init(struct pci_dev *dev);
 #else
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 223e4c3..d669497 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -6,7 +6,7 @@
 # Build PCI Express ASPM if needed
 obj-$(CONFIG_PCIEASPM)		+= aspm.o
 
-pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o
+pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o pcie-err.o
 pcieportdrv-$(CONFIG_ACPI)	+= portdrv_acpi.o
 
 obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
index 5449e5c..bc9db53 100644
--- a/drivers/pci/pcie/aer/aerdrv.h
+++ b/drivers/pci/pcie/aer/aerdrv.h
@@ -76,36 +76,6 @@ struct aer_rpc {
 					 */
 };
 
-struct aer_broadcast_data {
-	enum pci_channel_state state;
-	enum pci_ers_result result;
-};
-
-static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
-		enum pci_ers_result new)
-{
-	if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
-		return PCI_ERS_RESULT_NO_AER_DRIVER;
-
-	if (new == PCI_ERS_RESULT_NONE)
-		return orig;
-
-	switch (orig) {
-	case PCI_ERS_RESULT_CAN_RECOVER:
-	case PCI_ERS_RESULT_RECOVERED:
-		orig = new;
-		break;
-	case PCI_ERS_RESULT_DISCONNECT:
-		if (new == PCI_ERS_RESULT_NEED_RESET)
-			orig = PCI_ERS_RESULT_NEED_RESET;
-		break;
-	default:
-		break;
-	}
-
-	return orig;
-}
-
 extern struct bus_type pcie_port_bus_type;
 void aer_isr(struct work_struct *work);
 void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 6cb1b36..7934de0 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -26,6 +26,7 @@
 #include <linux/slab.h>
 #include <linux/kfifo.h>
 #include "aerdrv.h"
+#include "../../pci.h"
 
 #define	PCI_EXP_AER_FLAGS	(PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \
 				 PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE)
@@ -234,189 +235,6 @@ static bool find_source_device(struct pci_dev *parent,
 	return true;
 }
 
-static int report_error_detected(struct pci_dev *dev, void *data)
-{
-	pci_ers_result_t vote;
-	const struct pci_error_handlers *err_handler;
-	struct aer_broadcast_data *result_data;
-	result_data = (struct aer_broadcast_data *) data;
-
-	device_lock(&dev->dev);
-	dev->error_state = result_data->state;
-
-	if (!dev->driver ||
-		!dev->driver->err_handler ||
-		!dev->driver->err_handler->error_detected) {
-		if (result_data->state == pci_channel_io_frozen &&
-			dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
-			/*
-			 * In case of fatal recovery, if one of down-
-			 * stream device has no driver. We might be
-			 * unable to recover because a later insmod
-			 * of a driver for this device is unaware of
-			 * its hw state.
-			 */
-			dev_printk(KERN_DEBUG, &dev->dev, "device has %s\n",
-				   dev->driver ?
-				   "no AER-aware driver" : "no driver");
-		}
-
-		/*
-		 * If there's any device in the subtree that does not
-		 * have an error_detected callback, returning
-		 * PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of
-		 * the subsequent mmio_enabled/slot_reset/resume
-		 * callbacks of "any" device in the subtree. All the
-		 * devices in the subtree are left in the error state
-		 * without recovery.
-		 */
-
-		if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
-			vote = PCI_ERS_RESULT_NO_AER_DRIVER;
-		else
-			vote = PCI_ERS_RESULT_NONE;
-	} else {
-		err_handler = dev->driver->err_handler;
-		vote = err_handler->error_detected(dev, result_data->state);
-	}
-
-	result_data->result = merge_result(result_data->result, vote);
-	device_unlock(&dev->dev);
-	return 0;
-}
-
-static int report_mmio_enabled(struct pci_dev *dev, void *data)
-{
-	pci_ers_result_t vote;
-	const struct pci_error_handlers *err_handler;
-	struct aer_broadcast_data *result_data;
-	result_data = (struct aer_broadcast_data *) data;
-
-	device_lock(&dev->dev);
-	if (!dev->driver ||
-		!dev->driver->err_handler ||
-		!dev->driver->err_handler->mmio_enabled)
-		goto out;
-
-	err_handler = dev->driver->err_handler;
-	vote = err_handler->mmio_enabled(dev);
-	result_data->result = merge_result(result_data->result, vote);
-out:
-	device_unlock(&dev->dev);
-	return 0;
-}
-
-static int report_slot_reset(struct pci_dev *dev, void *data)
-{
-	pci_ers_result_t vote;
-	const struct pci_error_handlers *err_handler;
-	struct aer_broadcast_data *result_data;
-	result_data = (struct aer_broadcast_data *) data;
-
-	device_lock(&dev->dev);
-	if (!dev->driver ||
-		!dev->driver->err_handler ||
-		!dev->driver->err_handler->slot_reset)
-		goto out;
-
-	err_handler = dev->driver->err_handler;
-	vote = err_handler->slot_reset(dev);
-	result_data->result = merge_result(result_data->result, vote);
-out:
-	device_unlock(&dev->dev);
-	return 0;
-}
-
-static int report_resume(struct pci_dev *dev, void *data)
-{
-	const struct pci_error_handlers *err_handler;
-
-	device_lock(&dev->dev);
-	dev->error_state = pci_channel_io_normal;
-
-	if (!dev->driver ||
-		!dev->driver->err_handler ||
-		!dev->driver->err_handler->resume)
-		goto out;
-
-	err_handler = dev->driver->err_handler;
-	err_handler->resume(dev);
-out:
-	device_unlock(&dev->dev);
-	return 0;
-}
-
-/**
- * broadcast_error_message - handle message broadcast to downstream drivers
- * @dev: pointer to from where in a hierarchy message is broadcasted down
- * @state: error state
- * @error_mesg: message to print
- * @cb: callback to be broadcasted
- *
- * Invoked during error recovery process. Once being invoked, the content
- * of error severity will be broadcasted to all downstream drivers in a
- * hierarchy in question.
- */
-static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
-	enum pci_channel_state state,
-	char *error_mesg,
-	int (*cb)(struct pci_dev *, void *))
-{
-	struct aer_broadcast_data result_data;
-
-	dev_printk(KERN_DEBUG, &dev->dev, "broadcast %s message\n", error_mesg);
-	result_data.state = state;
-	if (cb == report_error_detected)
-		result_data.result = PCI_ERS_RESULT_CAN_RECOVER;
-	else
-		result_data.result = PCI_ERS_RESULT_RECOVERED;
-
-	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
-		/*
-		 * If the error is reported by a bridge, we think this error
-		 * is related to the downstream link of the bridge, so we
-		 * do error recovery on all subordinates of the bridge instead
-		 * of the bridge and clear the error status of the bridge.
-		 */
-		if (cb == report_error_detected)
-			dev->error_state = state;
-		pci_walk_bus(dev->subordinate, cb, &result_data);
-		if (cb == report_resume) {
-			pci_cleanup_aer_uncorrect_error_status(dev);
-			dev->error_state = pci_channel_io_normal;
-		}
-	} else {
-		/*
-		 * If the error is reported by an end point, we think this
-		 * error is related to the upstream link of the end point.
-		 */
-		if (state == pci_channel_io_normal)
-			/*
-			 * the error is non fatal so the bus is ok, just invoke
-			 * the callback for the function that logged the error.
-			 */
-			cb(dev, &result_data);
-		else
-			pci_walk_bus(dev->bus, cb, &result_data);
-	}
-
-	return result_data.result;
-}
-
-/**
- * default_reset_link - default reset function
- * @dev: pointer to pci_dev data structure
- *
- * Invoked when performing link reset on a Downstream Port or a
- * Root Port with no aer driver.
- */
-static pci_ers_result_t default_reset_link(struct pci_dev *dev)
-{
-	pci_reset_bridge_secondary_bus(dev);
-	dev_printk(KERN_DEBUG, &dev->dev, "downstream link has been reset\n");
-	return PCI_ERS_RESULT_RECOVERED;
-}
-
 static int find_aer_service_iter(struct device *device, void *data)
 {
 	struct pcie_port_service_driver *service_driver, **drv;
@@ -434,7 +252,7 @@ static int find_aer_service_iter(struct device *device, void *data)
 	return 0;
 }
 
-static struct pcie_port_service_driver *find_aer_service(struct pci_dev *dev)
+struct pcie_port_service_driver *pci_find_aer_service(struct pci_dev *dev)
 {
 	struct pcie_port_service_driver *drv = NULL;
 
@@ -442,108 +260,7 @@ static struct pcie_port_service_driver *find_aer_service(struct pci_dev *dev)
 
 	return drv;
 }
-
-static pci_ers_result_t reset_link(struct pci_dev *dev)
-{
-	struct pci_dev *udev;
-	pci_ers_result_t status;
-	struct pcie_port_service_driver *driver;
-
-	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
-		/* Reset this port for all subordinates */
-		udev = dev;
-	} else {
-		/* Reset the upstream component (likely downstream port) */
-		udev = dev->bus->self;
-	}
-
-	/* Use the aer driver of the component firstly */
-	driver = find_aer_service(udev);
-
-	if (driver && driver->reset_link) {
-		status = driver->reset_link(udev);
-	} else if (udev->has_secondary_link) {
-		status = default_reset_link(udev);
-	} else {
-		dev_printk(KERN_DEBUG, &dev->dev,
-			"no link-reset support at upstream device %s\n",
-			pci_name(udev));
-		return PCI_ERS_RESULT_DISCONNECT;
-	}
-
-	if (status != PCI_ERS_RESULT_RECOVERED) {
-		dev_printk(KERN_DEBUG, &dev->dev,
-			"link reset at upstream device %s failed\n",
-			pci_name(udev));
-		return PCI_ERS_RESULT_DISCONNECT;
-	}
-
-	return status;
-}
-
-/**
- * pci_do_recovery - handle nonfatal/fatal error recovery process
- * @dev: pointer to a pci_dev data structure of agent detecting an error
- * @severity: error severity type
- *
- * Invoked when an error is nonfatal/fatal. Once being invoked, broadcast
- * error detected message to all downstream drivers within a hierarchy in
- * question and return the returned code.
- */
-static void pci_do_recovery(struct pci_dev *dev, int severity)
-{
-	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
-	enum pci_channel_state state;
-
-	if (severity == AER_FATAL)
-		state = pci_channel_io_frozen;
-	else
-		state = pci_channel_io_normal;
-
-	status = broadcast_error_message(dev,
-			state,
-			"error_detected",
-			report_error_detected);
-
-	if (severity == AER_FATAL) {
-		result = reset_link(dev);
-		if (result != PCI_ERS_RESULT_RECOVERED)
-			goto failed;
-	}
-
-	if (status == PCI_ERS_RESULT_CAN_RECOVER)
-		status = broadcast_error_message(dev,
-				state,
-				"mmio_enabled",
-				report_mmio_enabled);
-
-	if (status == PCI_ERS_RESULT_NEED_RESET) {
-		/*
-		 * TODO: Should call platform-specific
-		 * functions to reset slot before calling
-		 * drivers' slot_reset callbacks?
-		 */
-		status = broadcast_error_message(dev,
-				state,
-				"slot_reset",
-				report_slot_reset);
-	}
-
-	if (status != PCI_ERS_RESULT_RECOVERED)
-		goto failed;
-
-	broadcast_error_message(dev,
-				state,
-				"resume",
-				report_resume);
-
-	dev_info(&dev->dev, "AER: Device recovery successful\n");
-	return;
-
-failed:
-	/* TODO: Should kernel panic here? */
-	dev_info(&dev->dev, "AER: Device recovery failed\n");
-}
+EXPORT_SYMBOL(pci_find_aer_service);
 
 /**
  * handle_error_source - handle logging error into an event log
diff --git a/drivers/pci/pcie/pcie-err.c b/drivers/pci/pcie/pcie-err.c
new file mode 100644
index 0000000..a532fe0
--- /dev/null
+++ b/drivers/pci/pcie/pcie-err.c
@@ -0,0 +1,334 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file implements the error recovery as a core part of PCIe error reporting.
+ * When a PCIe error is delivered, an error message will be collected and printed
+ * to console, then, an error recovery procedure will be executed by following
+ * the PCI error recovery rules.
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/aer.h>
+#include <linux/pcieport_if.h>
+#include "portdrv.h"
+
+struct aer_broadcast_data {
+	enum pci_channel_state state;
+	enum pci_ers_result result;
+};
+
+static pci_ers_result_t merge_result(enum pci_ers_result orig,
+				  enum pci_ers_result new)
+{
+	if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
+		return PCI_ERS_RESULT_NO_AER_DRIVER;
+
+	if (new == PCI_ERS_RESULT_NONE)
+		return orig;
+
+	switch (orig) {
+	case PCI_ERS_RESULT_CAN_RECOVER:
+	case PCI_ERS_RESULT_RECOVERED:
+		orig = new;
+		break;
+	case PCI_ERS_RESULT_DISCONNECT:
+		if (new == PCI_ERS_RESULT_NEED_RESET)
+			orig = PCI_ERS_RESULT_NEED_RESET;
+		break;
+	default:
+		break;
+	}
+
+	return orig;
+}
+
+static int report_mmio_enabled(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	const struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+
+	result_data = (struct aer_broadcast_data *) data;
+
+	device_lock(&dev->dev);
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->mmio_enabled)
+		goto out;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->mmio_enabled(dev);
+	result_data->result = merge_result(result_data->result, vote);
+out:
+	device_unlock(&dev->dev);
+	return 0;
+}
+
+static int report_slot_reset(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	const struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+
+	result_data = (struct aer_broadcast_data *) data;
+
+	device_lock(&dev->dev);
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->slot_reset)
+		goto out;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->slot_reset(dev);
+	result_data->result = merge_result(result_data->result, vote);
+out:
+	device_unlock(&dev->dev);
+	return 0;
+}
+
+static int report_resume(struct pci_dev *dev, void *data)
+{
+	const struct pci_error_handlers *err_handler;
+
+	device_lock(&dev->dev);
+	dev->error_state = pci_channel_io_normal;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->resume)
+		goto out;
+
+	err_handler = dev->driver->err_handler;
+	err_handler->resume(dev);
+out:
+	device_unlock(&dev->dev);
+	return 0;
+}
+
+static int report_error_detected(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	const struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+
+	result_data = (struct aer_broadcast_data *) data;
+
+	device_lock(&dev->dev);
+	dev->error_state = result_data->state;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->error_detected) {
+		if (result_data->state == pci_channel_io_frozen &&
+			dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
+			/*
+			 * In case of fatal recovery, if one of down-
+			 * stream device has no driver. We might be
+			 * unable to recover because a later insmod
+			 * of a driver for this device is unaware of
+			 * its hw state.
+			 */
+			dev_printk(KERN_DEBUG, &dev->dev, "device has %s\n",
+				   dev->driver ?
+				   "no error-aware driver" : "no driver");
+		}
+
+		/*
+		 * If there's any device in the subtree that does not
+		 * have an error_detected callback, returning
+		 * PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of
+		 * the subsequent mmio_enabled/slot_reset/resume
+		 * callbacks of "any" device in the subtree. All the
+		 * devices in the subtree are left in the error state
+		 * without recovery.
+		 */
+
+		if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
+			vote = PCI_ERS_RESULT_NO_AER_DRIVER;
+		else
+			vote = PCI_ERS_RESULT_NONE;
+	} else {
+		err_handler = dev->driver->err_handler;
+		vote = err_handler->error_detected(dev, result_data->state);
+	}
+
+	result_data->result = merge_result(result_data->result, vote);
+	device_unlock(&dev->dev);
+	return 0;
+}
+
+/**
+ * default_reset_link - default reset function
+ * @dev: pointer to pci_dev data structure
+ *
+ * Invoked when performing link reset on a Downstream Port or a
+ * Root Port with no aer driver.
+ */
+static pci_ers_result_t default_reset_link(struct pci_dev *dev)
+{
+	pci_reset_bridge_secondary_bus(dev);
+	dev_printk(KERN_DEBUG, &dev->dev, "downstream link has been reset\n");
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
+static pci_ers_result_t reset_link(struct pci_dev *dev)
+{
+	struct pci_dev *udev;
+	pci_ers_result_t status;
+	struct pcie_port_service_driver *driver = NULL;
+
+	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+		/* Reset this port for all subordinates */
+		udev = dev;
+	} else {
+		/* Reset the upstream component (likely downstream port) */
+		udev = dev->bus->self;
+	}
+
+#if IS_ENABLED(CONFIG_PCIEAER)
+	/* Use the aer driver of the component firstly */
+	driver = pci_find_aer_service(udev);
+#endif
+
+	if (driver && driver->reset_link) {
+		status = driver->reset_link(udev);
+	} else if (udev->has_secondary_link) {
+		status = default_reset_link(udev);
+	} else {
+		dev_printk(KERN_DEBUG, &dev->dev,
+			"no link-reset support at upstream device %s\n",
+			pci_name(udev));
+		return PCI_ERS_RESULT_DISCONNECT;
+	}
+
+	if (status != PCI_ERS_RESULT_RECOVERED) {
+		dev_printk(KERN_DEBUG, &dev->dev,
+			"link reset at upstream device %s failed\n",
+			pci_name(udev));
+		return PCI_ERS_RESULT_DISCONNECT;
+	}
+
+	return status;
+}
+
+/**
+ * broadcast_error_message - handle message broadcast to downstream drivers
+ * @dev: pointer to from where in a hierarchy message is broadcasted down
+ * @state: error state
+ * @error_mesg: message to print
+ * @cb: callback to be broadcasted
+ *
+ * Invoked during error recovery process. Once being invoked, the content
+ * of error severity will be broadcasted to all downstream drivers in a
+ * hierarchy in question.
+ */
+static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
+	enum pci_channel_state state,
+	char *error_mesg,
+	int (*cb)(struct pci_dev *, void *))
+{
+	struct aer_broadcast_data result_data;
+
+	dev_printk(KERN_DEBUG, &dev->dev, "broadcast %s message\n", error_mesg);
+	result_data.state = state;
+	if (cb == report_error_detected)
+		result_data.result = PCI_ERS_RESULT_CAN_RECOVER;
+	else
+		result_data.result = PCI_ERS_RESULT_RECOVERED;
+
+	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+		/*
+		 * If the error is reported by a bridge, we think this error
+		 * is related to the downstream link of the bridge, so we
+		 * do error recovery on all subordinates of the bridge instead
+		 * of the bridge and clear the error status of the bridge.
+		 */
+		if (cb == report_error_detected)
+			dev->error_state = state;
+		pci_walk_bus(dev->subordinate, cb, &result_data);
+		if (cb == report_resume) {
+			pci_cleanup_aer_uncorrect_error_status(dev);
+			dev->error_state = pci_channel_io_normal;
+		}
+	} else {
+		/*
+		 * If the error is reported by an end point, we think this
+		 * error is related to the upstream link of the end point.
+		 */
+		pci_walk_bus(dev->bus, cb, &result_data);
+	}
+
+	return result_data.result;
+}
+
+/**
+ * pci_do_recovery - handle nonfatal/fatal error recovery process
+ * @dev: pointer to a pci_dev data structure of agent detecting an error
+ * @severity: error severity type
+ *
+ * Invoked when an error is nonfatal/fatal. Once being invoked, broadcast
+ * error detected message to all downstream drivers within a hierarchy in
+ * question and return the returned code.
+ */
+void pci_do_recovery(struct pci_dev *dev, int severity)
+{
+	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
+	enum pci_channel_state state;
+
+	if (severity == AER_FATAL)
+		state = pci_channel_io_frozen;
+	else
+		state = pci_channel_io_normal;
+
+	status = broadcast_error_message(dev,
+			state,
+			"error_detected",
+			report_error_detected);
+
+	if (severity == AER_FATAL) {
+		result = reset_link(dev);
+		if (result != PCI_ERS_RESULT_RECOVERED)
+			goto failed;
+	}
+
+	if (status == PCI_ERS_RESULT_CAN_RECOVER)
+		status = broadcast_error_message(dev,
+				state,
+				"mmio_enabled",
+				report_mmio_enabled);
+
+	if (status == PCI_ERS_RESULT_NEED_RESET) {
+		/*
+		 * TODO: Should call platform-specific
+		 * functions to reset slot before calling
+		 * drivers' slot_reset callbacks?
+		 */
+		status = broadcast_error_message(dev,
+				state,
+				"slot_reset",
+				report_slot_reset);
+	}
+
+	if (status != PCI_ERS_RESULT_RECOVERED)
+		goto failed;
+
+	broadcast_error_message(dev,
+				state,
+				"resume",
+				report_resume);
+
+	dev_info(&dev->dev, "Device recovery successful\n");
+	return;
+
+failed:
+	/* TODO: Should kernel panic here? */
+	dev_info(&dev->dev, "Device recovery failed\n");
+}
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index a854bc5..4f1992d 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -79,4 +79,5 @@ static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask)
 static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask){}
 #endif /* !CONFIG_ACPI */
 
+struct pcie_port_service_driver *pci_find_aer_service(struct pci_dev *dev);
 #endif /* _PORTDRV_H_ */
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 8f87bbe..cd4f086 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -11,9 +11,9 @@
 #include <linux/errno.h>
 #include <linux/types.h>
 
-#define AER_NONFATAL			0
-#define AER_FATAL			1
-#define AER_CORRECTABLE			2
+#define AER_NONFATAL		0
+#define AER_FATAL		1
+#define AER_CORRECTABLE		2
 
 struct pci_dev;
 

* [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-17 10:37 [PATCH v5 0/4] Address error and recovery for AER and DPC Oza Pawandeep
  2018-01-17 10:37 ` [PATCH v5 1/4] PCI/AER: Rename error recovery to generic pci naming Oza Pawandeep
  2018-01-17 10:37 ` [PATCH v5 2/4] PCI/AER: factor out error reporting from AER Oza Pawandeep
@ 2018-01-17 10:37 ` Oza Pawandeep
  2018-01-17 16:45   ` Sinan Kaya
  2018-01-17 16:46   ` Sinan Kaya
  2018-01-17 10:37 ` [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event Oza Pawandeep
  3 siblings, 2 replies; 24+ messages in thread
From: Oza Pawandeep @ 2018-01-17 10:37 UTC (permalink / raw)
  To: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
  Cc: Oza Pawandeep

The current DPC driver does not do recovery, e.g. calling the endpoint
driver's callbacks, which sanitize the software state.

The DPC driver now implements the reset_link callback and calls
pci_do_recovery().
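
The way a service-provided reset_link callback slots into recovery can be
sketched in userspace C. This is a minimal model that follows the shape of
reset_link() in pcie-err.c; how the recovery core finds the service bound to
a port is not modelled, and all names here (port, service_driver,
reset_link_model, sample_service_reset, sample_service) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two results that matter here. */
enum ers_result { ERS_DISCONNECT, ERS_RECOVERED };

struct port;

/* Hypothetical model of a port service driver such as AER or DPC. */
struct service_driver {
	enum ers_result (*reset_link)(struct port *p);
};

/* Hypothetical model of the port upstream of the failed link. */
struct port {
	const struct service_driver *service;	/* NULL if none is bound */
	int has_secondary_link;
	int secondary_bus_resets;	/* default resets performed */
	int service_resets;		/* service-driver resets performed */
};

/* Fallback: stands in for a plain secondary bus reset. */
enum ers_result default_reset_link(struct port *p)
{
	p->secondary_bus_resets++;
	return ERS_RECOVERED;
}

/* Prefer the service's reset_link, then the default, else give up. */
enum ers_result reset_link_model(struct port *p)
{
	enum ers_result status;

	assert(p != NULL);

	if (p->service && p->service->reset_link)
		status = p->service->reset_link(p);
	else if (p->has_secondary_link)
		status = default_reset_link(p);
	else
		return ERS_DISCONNECT;

	/* Anything other than a successful reset is treated as disconnect. */
	return status == ERS_RECOVERED ? ERS_RECOVERED : ERS_DISCONNECT;
}

/* Sample callback standing in for a DPC-style reset_link. */
enum ers_result sample_service_reset(struct port *p)
{
	p->service_resets++;
	return ERS_RECOVERED;
}

const struct service_driver sample_service = {
	.reset_link = sample_service_reset,
};
```

With a service bound, only the service's callback runs; a port without one
falls back to the default reset, and a port with neither a callback nor a
secondary link reports a disconnect.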

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>

diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c
index 2d976a6..ed916765 100644
--- a/drivers/pci/pcie/pcie-dpc.c
+++ b/drivers/pci/pcie/pcie-dpc.c
@@ -13,8 +13,12 @@
 #include <linux/interrupt.h>
 #include <linux/init.h>
 #include <linux/pci.h>
+#include <linux/dpc.h>
 #include <linux/pcieport_if.h>
 #include "../pci.h"
+#include "portdrv.h"
+
+static pci_ers_result_t dpc_reset_link(struct pci_dev *pdev);
 
 struct rp_pio_header_log_regs {
 	u32 dw0;
@@ -67,6 +71,60 @@ struct dpc_dev {
 	"Memory Request Completion Timeout",		 /* Bit Position 18 */
 };
 
+static int find_dpc_dev_iter(struct device *device, void *data)
+{
+	struct pcie_port_service_driver *service_driver;
+	struct device **dev;
+
+	dev = (struct device **) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		service_driver = to_service_driver(device->driver);
+		if (service_driver->service == PCIE_PORT_SERVICE_DPC) {
+			*dev = device;
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+static struct device *pci_find_dpc_dev(struct pci_dev *pdev)
+{
+	struct device *dev = NULL;
+
+	device_for_each_child(&pdev->dev, &dev, find_dpc_dev_iter);
+
+	return dev;
+}
+
+static int find_dpc_service_iter(struct device *device, void *data)
+{
+	struct pcie_port_service_driver *service_driver, **drv;
+
+	drv = (struct pcie_port_service_driver **) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		service_driver = to_service_driver(device->driver);
+		if (service_driver->service == PCIE_PORT_SERVICE_DPC) {
+			*drv = service_driver;
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+struct pcie_port_service_driver *pci_find_dpc_service(struct pci_dev *dev)
+{
+	struct pcie_port_service_driver *drv = NULL;
+
+	device_for_each_child(&dev->dev, &drv, find_dpc_service_iter);
+
+	return drv;
+}
+EXPORT_SYMBOL(pci_find_dpc_service);
+
 static int dpc_wait_rp_inactive(struct dpc_dev *dpc)
 {
 	unsigned long timeout = jiffies + HZ;
@@ -104,11 +162,23 @@ static void dpc_wait_link_inactive(struct dpc_dev *dpc)
 		dev_warn(dev, "Link state not disabled for DPC event\n");
 }
 
-static void interrupt_event_handler(struct work_struct *work)
+/**
+ * dpc_reset_link - DPC link reset routine
+ * @pdev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver when performing link reset at Root Port.
+ */
+static pci_ers_result_t dpc_reset_link(struct pci_dev *pdev)
 {
-	struct dpc_dev *dpc = container_of(work, struct dpc_dev, work);
-	struct pci_dev *dev, *temp, *pdev = dpc->dev->port;
 	struct pci_bus *parent = pdev->subordinate;
+	struct pci_dev *dev, *temp;
+	struct dpc_dev *dpc;
+	struct pcie_device *pciedev;
+	struct device *devdpc;
+
+	devdpc = pci_find_dpc_dev(pdev);
+	pciedev = to_pcie_device(devdpc);
+	dpc = get_service_data(pciedev);
 
 	pci_lock_rescan_remove();
 	list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
@@ -125,7 +195,7 @@ static void interrupt_event_handler(struct work_struct *work)
 
 	dpc_wait_link_inactive(dpc);
 	if (dpc->rp && dpc_wait_rp_inactive(dpc))
-		return;
+		return PCI_ERS_RESULT_DISCONNECT;
 	if (dpc->rp && dpc->rp_pio_status) {
 		pci_write_config_dword(pdev,
 				      dpc->cap_pos + PCI_EXP_DPC_RP_PIO_STATUS,
@@ -135,6 +205,17 @@ static void interrupt_event_handler(struct work_struct *work)
 
 	pci_write_config_word(pdev, dpc->cap_pos + PCI_EXP_DPC_STATUS,
 		PCI_EXP_DPC_STATUS_TRIGGER | PCI_EXP_DPC_STATUS_INTERRUPT);
+
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void interrupt_event_handler(struct work_struct *work)
+{
+	struct dpc_dev *dpc = container_of(work, struct dpc_dev, work);
+	struct pci_dev *pdev = dpc->dev->port;
+
+	/* From DPC point of view error is always FATAL. */
+	pci_do_recovery(pdev, DPC_FATAL);
 }
 
 static void dpc_rp_pio_print_tlp_header(struct device *dev,
@@ -339,6 +420,7 @@ static void dpc_remove(struct pcie_device *dev)
 	.service	= PCIE_PORT_SERVICE_DPC,
 	.probe		= dpc_probe,
 	.remove		= dpc_remove,
+	.reset_link     = dpc_reset_link,
 };
 
 static int __init dpc_service_init(void)
diff --git a/drivers/pci/pcie/pcie-err.c b/drivers/pci/pcie/pcie-err.c
index a532fe0..8ce1de1 100644
--- a/drivers/pci/pcie/pcie-err.c
+++ b/drivers/pci/pcie/pcie-err.c
@@ -17,9 +17,12 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/aer.h>
+#include <linux/dpc.h>
 #include <linux/pcieport_if.h>
 #include "portdrv.h"
 
+static DEFINE_MUTEX(pci_err_recovery_lock);
+
 struct aer_broadcast_data {
 	enum pci_channel_state state;
 	enum pci_ers_result result;
@@ -179,7 +182,7 @@ static pci_ers_result_t default_reset_link(struct pci_dev *dev)
 	return PCI_ERS_RESULT_RECOVERED;
 }
 
-static pci_ers_result_t reset_link(struct pci_dev *dev)
+static pci_ers_result_t reset_link(struct pci_dev *dev, int severity)
 {
 	struct pci_dev *udev;
 	pci_ers_result_t status;
@@ -193,9 +196,17 @@ static pci_ers_result_t reset_link(struct pci_dev *dev)
 		udev = dev->bus->self;
 	}
 
+
+	/* Use the service driver of the component firstly */
+#if IS_ENABLED(CONFIG_PCIE_DPC)
+	if (severity == DPC_FATAL)
+		driver = pci_find_dpc_service(udev);
+#endif
 #if IS_ENABLED(CONFIG_PCIEAER)
-	/* Use the aer driver of the component firstly */
-	driver = pci_find_aer_service(udev);
+	if ((severity == AER_FATAL) ||
+	    (severity == AER_NONFATAL) ||
+	    (severity == AER_CORRECTABLE))
+		driver = pci_find_aer_service(udev);
 #endif
 
 	if (driver && driver->reset_link) {
@@ -283,7 +294,10 @@ void pci_do_recovery(struct pci_dev *dev, int severity)
 	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
 	enum pci_channel_state state;
 
-	if (severity == AER_FATAL)
+	mutex_lock(&pci_err_recovery_lock);
+
+	if ((severity == AER_FATAL) ||
+	    (severity == DPC_FATAL))
 		state = pci_channel_io_frozen;
 	else
 		state = pci_channel_io_normal;
@@ -293,8 +307,9 @@ void pci_do_recovery(struct pci_dev *dev, int severity)
 			"error_detected",
 			report_error_detected);
 
-	if (severity == AER_FATAL) {
-		result = reset_link(dev);
+	if ((severity == AER_FATAL) ||
+	    (severity == DPC_FATAL)) {
+		result = reset_link(dev, severity);
 		if (result != PCI_ERS_RESULT_RECOVERED)
 			goto failed;
 	}
@@ -326,9 +341,11 @@ void pci_do_recovery(struct pci_dev *dev, int severity)
 				report_resume);
 
 	dev_info(&dev->dev, "Device recovery successful\n");
+	mutex_unlock(&pci_err_recovery_lock);
 	return;
 
 failed:
 	/* TODO: Should kernel panic here? */
+	mutex_unlock(&pci_err_recovery_lock);
 	dev_info(&dev->dev, "Device recovery failed\n");
 }
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 4f1992d..b013e24 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -80,4 +80,5 @@ static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask){}
 #endif /* !CONFIG_ACPI */
 
 struct pcie_port_service_driver *pci_find_aer_service(struct pci_dev *dev);
+struct pcie_port_service_driver *pci_find_dpc_service(struct pci_dev *dev);
 #endif /* _PORTDRV_H_ */
diff --git a/include/linux/dpc.h b/include/linux/dpc.h
new file mode 100644
index 0000000..2019ce4
--- /dev/null
+++ b/include/linux/dpc.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _DPC_H_
+#define _DPC_H_
+
+#define DPC_FATAL		4
+
+#endif //_DPC_H_
+
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
  2018-01-17 10:37 [PATCH v5 0/4] Address error and recovery for AER and DPC Oza Pawandeep
                   ` (2 preceding siblings ...)
  2018-01-17 10:37 ` [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC Oza Pawandeep
@ 2018-01-17 10:37 ` Oza Pawandeep
  2018-01-17 16:27   ` Sinan Kaya
  3 siblings, 1 reply; 24+ messages in thread
From: Oza Pawandeep @ 2018-01-17 10:37 UTC (permalink / raw)
  To: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
  Cc: Oza Pawandeep

Implement the error_resume callback in DPC so that, after a DPC trigger
event, the devices beneath the port are enumerated.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>

diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c
index ed916765..f99adcae 100644
--- a/drivers/pci/pcie/pcie-dpc.c
+++ b/drivers/pci/pcie/pcie-dpc.c
@@ -162,6 +162,43 @@ static void dpc_wait_link_inactive(struct dpc_dev *dpc)
 		dev_warn(dev, "Link state not disabled for DPC event\n");
 }
 
+static bool dpc_wait_link_active(struct pci_dev *pdev)
+{
+	unsigned long timeout = jiffies + HZ;
+	u16 lnk_status;
+	bool ret = true;
+
+	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
+
+	while (!(lnk_status & PCI_EXP_LNKSTA_DLLLA) &&
+					!time_after(jiffies, timeout)) {
+		msleep(10);
+		pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
+	}
+
+	if (!(lnk_status & PCI_EXP_LNKSTA_DLLLA)) {
+		dev_warn(&pdev->dev, "Link state not enabled after DPC event\n");
+		ret = false;
+	}
+
+	return ret;
+}
+
+/**
+ * dpc_error_resume - enumerate the devices beneath
+ * @pdev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver during nonfatal recovery.
+ */
+static void dpc_error_resume(struct pci_dev *pdev)
+{
+	if (dpc_wait_link_active(pdev)) {
+		pci_lock_rescan_remove();
+		pci_rescan_bus(pdev->bus);
+		pci_unlock_rescan_remove();
+	}
+}
+
 /**
  * dpc_reset_link - reset link DPC  routine
  * @dev: pointer to Root Port's pci_dev data structure
@@ -420,6 +457,7 @@ static void dpc_remove(struct pcie_device *dev)
 	.service	= PCIE_PORT_SERVICE_DPC,
 	.probe		= dpc_probe,
 	.remove		= dpc_remove,
+	.error_resume	= dpc_error_resume,
 	.reset_link     = dpc_reset_link,
 };
 
diff --git a/drivers/pci/pcie/pcie-err.c b/drivers/pci/pcie/pcie-err.c
index 8ce1de1..de72b0a 100644
--- a/drivers/pci/pcie/pcie-err.c
+++ b/drivers/pci/pcie/pcie-err.c
@@ -236,6 +236,7 @@ static pci_ers_result_t reset_link(struct pci_dev *dev, int severity)
  * @state: error state
  * @error_mesg: message to print
  * @cb: callback to be broadcasted
+ * @severity: error severity
  *
  * Invoked during error recovery process. Once being invoked, the content
  * of error severity will be broadcasted to all downstream drivers in a
@@ -244,7 +245,8 @@ static pci_ers_result_t reset_link(struct pci_dev *dev, int severity)
 static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 	enum pci_channel_state state,
 	char *error_mesg,
-	int (*cb)(struct pci_dev *, void *))
+	int (*cb)(struct pci_dev *, void *),
+	int severity)
 {
 	struct aer_broadcast_data result_data;
 
@@ -256,6 +258,15 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 		result_data.result = PCI_ERS_RESULT_RECOVERED;
 
 	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+		/* If DPC is triggered, call resume error handler
+		 * because, at this point we can safely assume that
+		 * link recovery has happened.
+		 */
+		if ((severity == DPC_FATAL) &&
+			(cb == report_resume)) {
+			cb(dev, NULL);
+			return PCI_ERS_RESULT_RECOVERED;
+		}
 		/*
 		 * If the error is reported by a bridge, we think this error
 		 * is related to the downstream link of the bridge, so we
@@ -305,7 +316,8 @@ void pci_do_recovery(struct pci_dev *dev, int severity)
 	status = broadcast_error_message(dev,
 			state,
 			"error_detected",
-			report_error_detected);
+			report_error_detected,
+			severity);
 
 	if ((severity == AER_FATAL) ||
 	    (severity == DPC_FATAL)) {
@@ -318,7 +330,8 @@ void pci_do_recovery(struct pci_dev *dev, int severity)
 		status = broadcast_error_message(dev,
 				state,
 				"mmio_enabled",
-				report_mmio_enabled);
+				report_mmio_enabled,
+				severity);
 
 	if (status == PCI_ERS_RESULT_NEED_RESET) {
 		/*
@@ -329,7 +342,8 @@ void pci_do_recovery(struct pci_dev *dev, int severity)
 		status = broadcast_error_message(dev,
 				state,
 				"slot_reset",
-				report_slot_reset);
+				report_slot_reset,
+				severity);
 	}
 
 	if (status != PCI_ERS_RESULT_RECOVERED)
@@ -338,7 +352,8 @@ void pci_do_recovery(struct pci_dev *dev, int severity)
 	broadcast_error_message(dev,
 				state,
 				"resume",
-				report_resume);
+				report_resume,
+				severity);
 
 	dev_info(&dev->dev, "Device recovery successful\n");
 	mutex_unlock(&pci_err_recovery_lock);
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
  2018-01-17 10:37 ` [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event Oza Pawandeep
@ 2018-01-17 16:27   ` Sinan Kaya
  2018-01-18  2:56     ` Keith Busch
  2018-01-18  5:26     ` poza
  0 siblings, 2 replies; 24+ messages in thread
From: Sinan Kaya @ 2018-01-17 16:27 UTC (permalink / raw)
  To: Oza Pawandeep, Bjorn Helgaas, Philippe Ombredanne,
	Thomas Gleixner, Greg Kroah-Hartman, Kate Stewart, linux-pci,
	linux-kernel, Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
> +static bool dpc_wait_link_active(struct pci_dev *pdev)
> +{

I think you can also make this function common instead of making another copy here.
Of course, this would be another patch.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-17 10:37 ` [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC Oza Pawandeep
@ 2018-01-17 16:45   ` Sinan Kaya
  2018-01-18  5:22     ` poza
  2018-01-17 16:46   ` Sinan Kaya
  1 sibling, 1 reply; 24+ messages in thread
From: Sinan Kaya @ 2018-01-17 16:45 UTC (permalink / raw)
  To: Oza Pawandeep, Bjorn Helgaas, Philippe Ombredanne,
	Thomas Gleixner, Greg Kroah-Hartman, Kate Stewart, linux-pci,
	linux-kernel, Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
> +		driver = pci_find_dpc_service(udev);
> +#endif
>  #if IS_ENABLED(CONFIG_PCIEAER)
> -	/* Use the aer driver of the component firstly */
> -	driver = pci_find_aer_service(udev);

I think we need a pci_find_service function that unifies these two.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-17 10:37 ` [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC Oza Pawandeep
  2018-01-17 16:45   ` Sinan Kaya
@ 2018-01-17 16:46   ` Sinan Kaya
  2018-01-18  5:17     ` poza
  1 sibling, 1 reply; 24+ messages in thread
From: Sinan Kaya @ 2018-01-17 16:46 UTC (permalink / raw)
  To: Oza Pawandeep, Bjorn Helgaas, Philippe Ombredanne,
	Thomas Gleixner, Greg Kroah-Hartman, Kate Stewart, linux-pci,
	linux-kernel, Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
> +++ b/include/linux/dpc.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _DPC_H_
> +#define _DPC_H_
> +
> +#define DPC_FATAL		4
> +
> +#endif //_DPC_H_
> +

Can you keep this in drivers/pci/pci.h and get rid of this file?

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
  2018-01-17 16:27   ` Sinan Kaya
@ 2018-01-18  2:56     ` Keith Busch
  2018-01-18  5:32       ` poza
  2018-01-18  5:26     ` poza
  1 sibling, 1 reply; 24+ messages in thread
From: Keith Busch @ 2018-01-18  2:56 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Oza Pawandeep, Bjorn Helgaas, Philippe Ombredanne,
	Thomas Gleixner, Greg Kroah-Hartman, Kate Stewart, linux-pci,
	linux-kernel, Dongdong Liu, Wei Zhang, Timur Tabi

On Wed, Jan 17, 2018 at 08:27:39AM -0800, Sinan Kaya wrote:
> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
> > +static bool dpc_wait_link_active(struct pci_dev *pdev)
> > +{
> 
> I think you can also make this function common instead of making another copy here.
> Of course, this would be another patch.

It is actually very similar to __pcie_wait_link_active in pciehp_hpc.c,
so there's some opportunity to make even more common code.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-17 16:46   ` Sinan Kaya
@ 2018-01-18  5:17     ` poza
  2018-01-18  5:57       ` poza
  0 siblings, 1 reply; 24+ messages in thread
From: poza @ 2018-01-18  5:17 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 2018-01-17 22:16, Sinan Kaya wrote:
> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>> +++ b/include/linux/dpc.h
>> @@ -0,0 +1,9 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef _DPC_H_
>> +#define _DPC_H_
>> +
>> +#define DPC_FATAL		4
>> +
>> +#endif //_DPC_H_
>> +
> 
> can you keep this in drivers/pci.h and get rid of this file?

I thought about this, but if I keep it in drivers/pci/pci.h,
then AER's defines have to be in there as well (for unification).

Then all the dependent files that use AER_FATAL, such as
drivers/acpi/apei/ghes.c,
would have to include this driver-private header, which is an odd way of doing it.

So I am not very sure about this. Since the AER_FATAL defines are in aer.h, I have
made dpc.h.


Regards,
Oza.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-17 16:45   ` Sinan Kaya
@ 2018-01-18  5:22     ` poza
  2018-01-18  6:04       ` poza
  0 siblings, 1 reply; 24+ messages in thread
From: poza @ 2018-01-18  5:22 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 2018-01-17 22:15, Sinan Kaya wrote:
> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>> +		driver = pci_find_dpc_service(udev);
>> +#endif
>>  #if IS_ENABLED(CONFIG_PCIEAER)
>> -	/* Use the aer driver of the component firstly */
>> -	driver = pci_find_aer_service(udev);
> 
> I think we need a pci_find_service function that unifies these two.

Right now, the find_xxx_service functions live in their respective files,
which export them.
That makes no less sense than having a generic function.

If I have to change this to pci_find_service(...., int service_name), then it
has to live somewhere in a generic file,
probably portdrv_core.c.

Either way I am fine, but I am just wondering whether it is really required.
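
For illustration only (this is not code from the series), the unified lookup under discussion could be sketched as a single function that takes the service as a parameter. The sketch below is a userspace model: `struct service_driver`, `struct port_child`, `find_service()` and the `SERVICE_*` values are hypothetical stand-ins for `struct pcie_port_service_driver`, the port's child devices, and the `PCIE_PORT_SERVICE_*` IDs.

```c
#include <stddef.h>

/* Illustrative service IDs, standing in for the PCIE_PORT_SERVICE_* values. */
enum port_service { SERVICE_AER = 1, SERVICE_DPC = 2 };

/* Simplified stand-in for struct pcie_port_service_driver. */
struct service_driver {
	const char *name;
	enum port_service service;
};

/* Simplified stand-in for a child device of the port; driver may be NULL. */
struct port_child {
	const struct service_driver *driver;
};

/*
 * Hypothetical unified lookup: return the driver bound to the first child
 * that provides the requested service, or NULL if no child does.
 */
const struct service_driver *find_service(const struct port_child *children,
					  size_t n, enum port_service service)
{
	for (size_t i = 0; i < n; i++) {
		const struct service_driver *drv = children[i].driver;

		if (drv && drv->service == service)
			return drv;
	}
	return NULL;
}
```

With something of this shape, reset_link() would only need to map the severity to a service ID and make one call, instead of choosing between two near-identical finders.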

Regards,
Oza.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
  2018-01-17 16:27   ` Sinan Kaya
  2018-01-18  2:56     ` Keith Busch
@ 2018-01-18  5:26     ` poza
  1 sibling, 0 replies; 24+ messages in thread
From: poza @ 2018-01-18  5:26 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 2018-01-17 21:57, Sinan Kaya wrote:
> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>> +static bool dpc_wait_link_active(struct pci_dev *pdev)
>> +{
> 
> I think you can also make this function common instead of making
> another copy here.
> Of course, this would be another patch.

OK, I will make a separate patch that takes one more parameter:
dpc_wait_link_active(struct pci_dev *, bool)

If not in this series, then in the one immediately after.

Regards,
Oza.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
  2018-01-18  2:56     ` Keith Busch
@ 2018-01-18  5:32       ` poza
  2018-01-18 16:35         ` Sinan Kaya
  0 siblings, 1 reply; 24+ messages in thread
From: poza @ 2018-01-18  5:32 UTC (permalink / raw)
  To: Keith Busch
  Cc: Sinan Kaya, Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Wei Zhang, Timur Tabi

On 2018-01-18 08:26, Keith Busch wrote:
> On Wed, Jan 17, 2018 at 08:27:39AM -0800, Sinan Kaya wrote:
>> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>> > +static bool dpc_wait_link_active(struct pci_dev *pdev)
>> > +{
>> 
>> I think you can also make this function common instead of making 
>> another copy here.
>> Of course, this would be another patch.
> 
> It is actually very similar to __pcie_wait_link_active in pciehp_hpc.c,
> so there's some opportunity to make even more common code.

In that case there has to be a generic function in
drivers/pci/pci.c

which addresses the following functions from

pcie-dpc.c:
dpc_wait_link_inactive
dpc_wait_link_active

drivers/pci/hotplug/pciehp_hpc.c
pcie_wait_link_active


All of the above would be folded into one generic function, to be moved to drivers/pci/pci.c.

Please let me know if this is okay.
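
A rough sketch of what such a generic helper might look like follows. It is a userspace model under stated assumptions, not the proposed kernel code: `struct link_port`, `read_lnksta()` and `wait_for_link()` are hypothetical names, the Link Status register is simulated, and the poll count replaces the jiffies-based one-second timeout with msleep(10) used in the patch. Only the DLLLA bit value (0x2000, as in PCI_EXP_LNKSTA_DLLLA) is taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Data Link Layer Link Active bit, same value as PCI_EXP_LNKSTA_DLLLA. */
#define LNKSTA_DLLLA 0x2000

/* Hypothetical simulated port: Link Status plus a knob to flip DLLLA later. */
struct link_port {
	uint16_t lnksta;	/* current Link Status value */
	int polls_until_flip;	/* DLLLA toggles on the read after this many reads; negative = never */
};

/* Stand-in for pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, ...). */
uint16_t read_lnksta(struct link_port *port)
{
	if (port->polls_until_flip == 0)
		port->lnksta ^= LNKSTA_DLLLA;
	if (port->polls_until_flip >= 0)
		port->polls_until_flip--;
	return port->lnksta;
}

/*
 * Poll until the link reaches the requested state (active or inactive).
 * Returns true on success, false if max_polls reads did not see that state.
 */
bool wait_for_link(struct link_port *port, bool active, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		bool up = (read_lnksta(port) & LNKSTA_DLLLA) != 0;

		if (up == active)
			return true;
		/* A kernel version would sleep here, e.g. msleep(10). */
	}
	return false;
}
```

The single bool parameter is what lets one loop cover both the "wait for inactive" and the "wait for active" callers listed above.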

Regards,
Oza.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-18  5:17     ` poza
@ 2018-01-18  5:57       ` poza
  2018-01-18 16:31         ` Sinan Kaya
  0 siblings, 1 reply; 24+ messages in thread
From: poza @ 2018-01-18  5:57 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 2018-01-18 10:47, poza@codeaurora.org wrote:
> On 2018-01-17 22:16, Sinan Kaya wrote:
>> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>>> +++ b/include/linux/dpc.h
>>> @@ -0,0 +1,9 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +
>>> +#ifndef _DPC_H_
>>> +#define _DPC_H_
>>> +
>>> +#define DPC_FATAL		4
>>> +
>>> +#endif //_DPC_H_
>>> +
>> 
>> can you keep this in drivers/pci.h and get rid of this file?
> 
> I thought about this, but if I keep it in drivers/pci.h,
> then AER's defines have to be in that as well. (for unification)
> 
> and then all the dependent files who are using AER_FATAL such as
> drivers/acpi/apei/ghes.c
> have to go on including this drivers file which is odd way of doing it.
> 
> So I am not very sure about this....since AER_FATAL are in aer.h, I
> have made dpc.h
> 
> 
> Regards,
> Oza.

Should I do this in the next patch-set series?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-18  5:22     ` poza
@ 2018-01-18  6:04       ` poza
  0 siblings, 0 replies; 24+ messages in thread
From: poza @ 2018-01-18  6:04 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 2018-01-18 10:52, poza@codeaurora.org wrote:
> On 2018-01-17 22:15, Sinan Kaya wrote:
>> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>>> +		driver = pci_find_dpc_service(udev);
>>> +#endif
>>>  #if IS_ENABLED(CONFIG_PCIEAER)
>>> -	/* Use the aer driver of the component firstly */
>>> -	driver = pci_find_aer_service(udev);
>> 
>> I think we need a pci_find_service function that unifies these two.
> 
> Right now, find_xxx_service are in their respective file and exporting 
> it.
> which makes sense no less than having generic function.
> 
> If I have to change pci_find_service(...., int service_name) then it
> has to be somewhere in generic file.
> probably portdrv_core.c
> 
> either way I am fine but just thinking out if its really required.
> 
> Regards,
> Oza.

Should I do this in the next patch-set series?

Regards,
Oza.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-18  5:57       ` poza
@ 2018-01-18 16:31         ` Sinan Kaya
  2018-01-18 18:00           ` poza
  0 siblings, 1 reply; 24+ messages in thread
From: Sinan Kaya @ 2018-01-18 16:31 UTC (permalink / raw)
  To: poza
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 1/18/2018 12:57 AM, poza@codeaurora.org wrote:
> On 2018-01-18 10:47, poza@codeaurora.org wrote:
>> On 2018-01-17 22:16, Sinan Kaya wrote:
>>> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>>>> +++ b/include/linux/dpc.h
>>>> @@ -0,0 +1,9 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>>> +
>>>> +#ifndef _DPC_H_
>>>> +#define _DPC_H_
>>>> +
>>>> +#define DPC_FATAL        4
>>>> +
>>>> +#endif //_DPC_H_
>>>> +
>>>
>>> can you keep this in drivers/pci.h and get rid of this file?
>>
>> I thought about this, but if I keep it in drivers/pci.h,
>> then AER's defines have to be in that as well. (for unification)
>>
>> and then all the dependent files who are using AER_FATAL such as
>> drivers/acpi/apei/ghes.c
>> have to go on including this drivers file which is odd way of doing it.
>>
>> So I am not very sure about this....since AER_FATAL are in aer.h, I
>> have made dpc.h
>>
>>
>> Regards,
>> Oza.
> 
> Should I be doing in next patch-set series ?
> 

I think you would put into include/linux/pci.h only if there is an external
use of constant outside of drivers/pci directory. Otherwise, you should keep
the setting inside one of the header files in drivers/pci directory.

I don't see any other subsystem caring about DPC_FATAL definition.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
  2018-01-18  5:32       ` poza
@ 2018-01-18 16:35         ` Sinan Kaya
  2018-01-19  1:43           ` Keith Busch
  0 siblings, 1 reply; 24+ messages in thread
From: Sinan Kaya @ 2018-01-18 16:35 UTC (permalink / raw)
  To: poza, Keith Busch
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Wei Zhang, Timur Tabi

On 1/18/2018 12:32 AM, poza@codeaurora.org wrote:
> On 2018-01-18 08:26, Keith Busch wrote:
>> On Wed, Jan 17, 2018 at 08:27:39AM -0800, Sinan Kaya wrote:
>>> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>>> > +static bool dpc_wait_link_active(struct pci_dev *pdev)
>>> > +{
>>>
>>> I think you can also make this function common instead of making another copy here.
>>> Of course, this would be another patch.
>>
>> It is actually very similar to __pcie_wait_link_active in pciehp_hpc.c,
>> so there's some opportunity to make even more common code.
> 
> in that case there has to be a generic function in
> drivers/pci/pci.c
> 
> which addresses following functions from
> 
> pcie-dpc.c:
> dpc_wait_link_inactive
> dpc_wait_link_active
> 
> drivers/pci/hotplug/pciehp_hpc.c
> pcie_wait_link_active
> 
> 
> all above making one generic function to be moved to drivers/pci/pci.c
> 
> please let me know if this is okay.

Works for me. Keith/Bjorn?

> 
> Regards,
> Oza.
> 
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-18 16:31         ` Sinan Kaya
@ 2018-01-18 18:00           ` poza
  2018-01-18 18:03             ` Sinan Kaya
  0 siblings, 1 reply; 24+ messages in thread
From: poza @ 2018-01-18 18:00 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 2018-01-18 22:01, Sinan Kaya wrote:
> On 1/18/2018 12:57 AM, poza@codeaurora.org wrote:
>> On 2018-01-18 10:47, poza@codeaurora.org wrote:
>>> On 2018-01-17 22:16, Sinan Kaya wrote:
>>>> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>>>>> +++ b/include/linux/dpc.h
>>>>> @@ -0,0 +1,9 @@
>>>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>>>> +
>>>>> +#ifndef _DPC_H_
>>>>> +#define _DPC_H_
>>>>> +
>>>>> +#define DPC_FATAL        4
>>>>> +
>>>>> +#endif //_DPC_H_
>>>>> +
>>>> 
>>>> can you keep this in drivers/pci.h and get rid of this file?
>>> 
>>> I thought about this, but if I keep it in drivers/pci.h,
>>> then AER's defines have to be in that as well. (for unification)
>>> 
>>> and then all the dependent files who are using AER_FATAL such as
>>> drivers/acpi/apei/ghes.c
>>> have to go on including this drivers file which is odd way of doing 
>>> it.
>>> 
>>> So I am not very sure about this....since AER_FATAL are in aer.h, I
>>> have made dpc.h
>>> 
>>> 
>>> Regards,
>>> Oza.
>> 
>> Should I be doing in next patch-set series ?
>> 
> 
> I think you would put into include/linux/pci.h only if there is an 
> external
> use of constant outside of drivers/pci directory. Otherwise, you should 
> keep
> the setting inside one of the header files in drivers/pci directory.
> 
> I don't see any other subsystem caring about DPC_FATAL definition.

OK, so you are suggesting moving only DPC_FATAL? Then AER can stay
where it is.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-18 18:00           ` poza
@ 2018-01-18 18:03             ` Sinan Kaya
  2018-01-19  4:23               ` poza
  0 siblings, 1 reply; 24+ messages in thread
From: Sinan Kaya @ 2018-01-18 18:03 UTC (permalink / raw)
  To: poza
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 1/18/2018 1:00 PM, poza@codeaurora.org wrote:
>> I think you would put into include/linux/pci.h only if there is an external
>> use of constant outside of drivers/pci directory. Otherwise, you should keep
>> the setting inside one of the header files in drivers/pci directory.
>>
>> I don't see any other subsystem caring about DPC_FATAL definition.
> 
> ok so you are suggesting to move only DPC_FATAL ? so then AER can stay where it is.

Now that AER and DPC handling are getting unified, I think it makes sense to
keep all error codes (AER+DPC) together in drivers/pci/pci.h rather than having
them split between aer.h and dpc.h.

Otherwise, how would we avoid a new error type being defined with one of the existing values?
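
To make the collision concern concrete, here is a minimal sketch of keeping every severity in one definition. It is illustrative only: the `ERR_*` and `CHANNEL_*` names and both functions are hypothetical, the AER values are assumed to mirror include/linux/aer.h (0, 1, 2), and 4 is the DPC_FATAL value from this patch. The switch in `severity_name()` names every enumerator, so the compiler rejects the file if two of them are ever given the same value.

```c
/* All severities in one place; values assumed from aer.h plus this patch. */
enum err_severity {
	ERR_AER_NONFATAL = 0,
	ERR_AER_FATAL = 1,
	ERR_AER_CORRECTABLE = 2,
	ERR_DPC_FATAL = 4,
};

/* Simplified stand-in for enum pci_channel_state. */
enum channel_state { CHANNEL_IO_NORMAL, CHANNEL_IO_FROZEN };

/* Fatal errors from either source freeze the channel, as in pci_do_recovery(). */
enum channel_state severity_to_state(enum err_severity severity)
{
	switch (severity) {
	case ERR_AER_FATAL:
	case ERR_DPC_FATAL:
		return CHANNEL_IO_FROZEN;
	default:
		return CHANNEL_IO_NORMAL;
	}
}

/* Duplicate case labels do not compile, so this doubles as a collision check. */
const char *severity_name(enum err_severity severity)
{
	switch (severity) {
	case ERR_AER_NONFATAL:
		return "AER nonfatal";
	case ERR_AER_FATAL:
		return "AER fatal";
	case ERR_AER_CORRECTABLE:
		return "AER correctable";
	case ERR_DPC_FATAL:
		return "DPC fatal";
	}
	return "unknown";
}
```

With the values split across two headers, nothing would flag a later addition to aer.h that happened to reuse 4.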



-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
  2018-01-18 16:35         ` Sinan Kaya
@ 2018-01-19  1:43           ` Keith Busch
  2018-01-19  4:21             ` poza
  0 siblings, 1 reply; 24+ messages in thread
From: Keith Busch @ 2018-01-19  1:43 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: poza, Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Wei Zhang, Timur Tabi

On Thu, Jan 18, 2018 at 11:35:59AM -0500, Sinan Kaya wrote:
> On 1/18/2018 12:32 AM, poza@codeaurora.org wrote:
> > On 2018-01-18 08:26, Keith Busch wrote:
> >> On Wed, Jan 17, 2018 at 08:27:39AM -0800, Sinan Kaya wrote:
> >>> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
> >>> > +static bool dpc_wait_link_active(struct pci_dev *pdev)
> >>> > +{
> >>>
> >>> I think you can also make this function common instead of making another copy here.
> >>> Of course, this would be another patch.
> >>
> >> It is actually very similar to __pcie_wait_link_active in pciehp_hpc.c,
> >> so there's some opportunity to make even more common code.
> > 
> > in that case there has to be a generic function in
> > drivers/pci/pci.c
> > 
> > which addresses following functions from
> > 
> > pcie-dpc.c:
> > dpc_wait_link_inactive
> > dpc_wait_link_active
> > 
> > drivers/pci/hotplug/pciehp_hpc.c
> > pcie_wait_link_active
> > 
> > 
> > all above making one generic function to be moved to drivers/pci/pci.c
> > 
> > please let me know if this is okay.
> 
> Works for me. Keith/Bjorn?

Yep, I believe common solutions that reduce code are always encouraged
in the Linux kernel.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
  2018-01-19  1:43           ` Keith Busch
@ 2018-01-19  4:21             ` poza
  0 siblings, 0 replies; 24+ messages in thread
From: poza @ 2018-01-19  4:21 UTC (permalink / raw)
  To: Keith Busch
  Cc: Sinan Kaya, Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Wei Zhang, Timur Tabi

On 2018-01-19 07:13, Keith Busch wrote:
> On Thu, Jan 18, 2018 at 11:35:59AM -0500, Sinan Kaya wrote:
>> On 1/18/2018 12:32 AM, poza@codeaurora.org wrote:
>> > On 2018-01-18 08:26, Keith Busch wrote:
>> >> On Wed, Jan 17, 2018 at 08:27:39AM -0800, Sinan Kaya wrote:
>> >>> On 1/17/2018 5:37 AM, Oza Pawandeep wrote:
>> >>> > +static bool dpc_wait_link_active(struct pci_dev *pdev)
>> >>> > +{
>> >>>
>> >>> I think you can also make this function common instead of making another copy here.
>> >>> Of course, this would be another patch.
>> >>
>> >> It is actually very similar to __pcie_wait_link_active in pciehp_hpc.c,
>> >> so there's some opportunity to make even more common code.
>> >
>> > In that case there has to be a generic function in
>> > drivers/pci/pci.c
>> >
>> > which replaces the following functions:
>> >
>> > pcie-dpc.c:
>> > dpc_wait_link_inactive
>> > dpc_wait_link_active
>> >
>> > drivers/pci/hotplug/pciehp_hpc.c:
>> > pcie_wait_link_active
>> >
>> > All of the above would become one generic function in drivers/pci/pci.c.
>> >
>> > Please let me know if this is okay.
>> 
>> Works for me. Keith/Bjorn?
> 
> Yep, I believe common solutions that reduce code are always encouraged
> in the Linux kernel.


Okay, I will work on this.

Regards,
Oza.
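
For illustration only, the consolidation agreed on above could look roughly like the sketch below: a single helper that waits for the link to become either active or inactive, so that dpc_wait_link_active, dpc_wait_link_inactive and pcie_wait_link_active would all reduce to one call. This is a userspace model, not the patch. The names fake_dev, read_lnksta and wait_for_link, the 10 ms poll interval, and the scripted register reads are assumptions made up for the example; only the Data Link Layer Link Active bit (0x2000 in the PCIe Link Status register) is taken from the specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Data Link Layer Link Active bit of the PCIe Link Status register. */
#define LNKSTA_DLLLA 0x2000

/* Hypothetical stand-in for a port: a scripted series of Link Status reads. */
struct fake_dev {
	const uint16_t *lnksta;	/* successive register values */
	int nreads;		/* number of scripted values */
	int pos;		/* index of the next value to return */
};

/* Return the next scripted value; the last value repeats forever. */
static uint16_t read_lnksta(struct fake_dev *dev)
{
	assert(dev->nreads > 0);
	int i = dev->pos < dev->nreads ? dev->pos++ : dev->nreads - 1;
	return dev->lnksta[i];
}

/*
 * One helper for both directions: returns true once the link reaches the
 * requested state, false if timeout_ms runs out first.
 */
static bool wait_for_link(struct fake_dev *dev, bool active, int timeout_ms)
{
	const int poll_ms = 10;	/* assumed poll interval */

	for (;;) {
		bool up = (read_lnksta(dev) & LNKSTA_DLLLA) != 0;

		if (up == active)
			return true;
		if (timeout_ms <= 0)
			return false;
		/* A kernel version would sleep for poll_ms here. */
		timeout_ms -= poll_ms;
	}
}
```

With a helper of this shape, "wait for active" and "wait for inactive" differ only in the boolean argument, which is what makes the per-driver copies redundant.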

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-18 18:03             ` Sinan Kaya
@ 2018-01-19  4:23               ` poza
  2018-01-19  4:44                 ` Sinan Kaya
  0 siblings, 1 reply; 24+ messages in thread
From: poza @ 2018-01-19  4:23 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 2018-01-18 23:33, Sinan Kaya wrote:
> On 1/18/2018 1:00 PM, poza@codeaurora.org wrote:
>>> I think you would put into include/linux/pci.h only if there is an external
>>> use of constant outside of drivers/pci directory. Otherwise, you should keep
>>> the setting inside one of the header files in drivers/pci directory.
>>> 
>>> I don't see any other subsystem caring about DPC_FATAL definition.
>> 
>> OK, so you are suggesting to move only DPC_FATAL? Then AER can stay where it is.
> 
> Now that both AER and DPC handling is getting unified, I think it makes sense to
> keep all error codes (AER+DPC) together in drivers/pci/pci.h rather than having
> them split in aer.h and dpc.h.
> 
> Otherwise, how would we avoid having a new error type defined with the
> existing values?

I agree, it is just that drivers/acpi/apei/ghes.c has to do
#include "../../pci/pci.h"

but that's okay, I think. Let me move the error codes to drivers/pci/pci.h.

Regards,
Oza.

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-19  4:23               ` poza
@ 2018-01-19  4:44                 ` Sinan Kaya
  2018-01-19  9:03                   ` poza
  0 siblings, 1 reply; 24+ messages in thread
From: Sinan Kaya @ 2018-01-19  4:44 UTC (permalink / raw)
  To: poza
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 1/18/2018 11:23 PM, poza@codeaurora.org wrote:
> On 2018-01-18 23:33, Sinan Kaya wrote:
>> On 1/18/2018 1:00 PM, poza@codeaurora.org wrote:
>>>> I think you would put into include/linux/pci.h only if there is an external
>>>> use of constant outside of drivers/pci directory. Otherwise, you should keep
>>>> the setting inside one of the header files in drivers/pci directory.
>>>>
>>>> I don't see any other subsystem caring about DPC_FATAL definition.
>>>
>>> OK, so you are suggesting to move only DPC_FATAL? Then AER can stay where it is.
>>
>> Now that both AER and DPC handling is getting unified, I think it makes sense to
>> keep all error codes (AER+DPC) together in drivers/pci/pci.h rather than having
>> them split in aer.h and dpc.h.
>>
>> Otherwise, how would we avoid having a new error type defined with the
>> existing values?
> 
> I agree, it is just that drivers/acpi/apei/ghes.c has to do
> #include "../../pci/pci.h"

That's bad. I was just thinking about the DPC error code only. I didn't realize
AER error codes are being referenced from ghes.c.

> 
> but that's okay, I think. Let me move the error codes to drivers/pci/pci.h.

It would be better to move the error codes to include/linux/pci.h and keep them together.

> 
> Regards,
> Oza.
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

* Re: [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC
  2018-01-19  4:44                 ` Sinan Kaya
@ 2018-01-19  9:03                   ` poza
  0 siblings, 0 replies; 24+ messages in thread
From: poza @ 2018-01-19  9:03 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
	Greg Kroah-Hartman, Kate Stewart, linux-pci, linux-kernel,
	Dongdong Liu, Keith Busch, Wei Zhang, Timur Tabi

On 2018-01-19 10:14, Sinan Kaya wrote:
> On 1/18/2018 11:23 PM, poza@codeaurora.org wrote:
>> On 2018-01-18 23:33, Sinan Kaya wrote:
>>> On 1/18/2018 1:00 PM, poza@codeaurora.org wrote:
>>>>> I think you would put into include/linux/pci.h only if there is an external
>>>>> use of constant outside of drivers/pci directory. Otherwise, you should keep
>>>>> the setting inside one of the header files in drivers/pci directory.
>>>>> 
>>>>> I don't see any other subsystem caring about DPC_FATAL definition.
>>>> 
>>>> OK, so you are suggesting to move only DPC_FATAL? Then AER can stay where it is.
>>> 
>>> Now that both AER and DPC handling is getting unified, I think it makes sense to
>>> keep all error codes (AER+DPC) together in drivers/pci/pci.h rather than having
>>> them split in aer.h and dpc.h.
>>> 
>>> Otherwise, how would we avoid having a new error type defined with the
>>> existing values?
>> 
>> I agree, it is just that drivers/acpi/apei/ghes.c has to do
>> #include "../../pci/pci.h"
> 
> That's bad. I was just thinking about the DPC error code only. I didn't realize
> AER error codes are being referenced from ghes.c.
> 
>> 
>> but that's okay, I think. Let me move the error codes to drivers/pci/pci.h.
> 
> It would be better to move the error codes to include/linux/pci.h and keep them together.
> 

The problem with moving them to include/linux/pci.h is that they fall into
global scope; besides, they would have to be renamed with a PCI_ERR_xxx prefix.

The use of AER_FATAL, DPC_FATAL, etc. is very limited across Linux, and is
likely to stay that way. I think moving them to drivers/pci/pci.h would keep
them more restricted/local.

Let me make a patch set based on that and see how it looks. We can
arrive at some consensus then.
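
As a rough illustration of the point about value collisions raised above, keeping every severity in one definition lets the compiler hand out distinct values, so a newly added type cannot silently reuse an existing number. The sketch below is an assumption, not the patch: only AER_FATAL and DPC_FATAL are named in this thread, AER_NONFATAL and AER_CORRECTABLE are added to round out the AER set, and the enum name, the sentinel, the helper and all numeric values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical single home for all error severities. An enum assigns each
 * enumerator a distinct value, which separate #defines spread over aer.h and
 * dpc.h cannot guarantee. The values are illustrative only.
 */
enum pci_err_type {
	AER_NONFATAL,
	AER_FATAL,
	AER_CORRECTABLE,
	DPC_FATAL,
	PCI_ERR_TYPE_COUNT	/* hypothetical sentinel, not a real severity */
};

/* Adding or removing a severity forces whoever does it to revisit this line. */
static_assert(PCI_ERR_TYPE_COUNT == 4, "update users when adding a severity");

/* Hypothetical helper: does this severity require link recovery? */
static bool pci_err_is_fatal(enum pci_err_type type)
{
	return type == AER_FATAL || type == DPC_FATAL;
}
```

Whether such a definition lives in drivers/pci/pci.h or include/linux/pci.h is the open question in this thread; the sketch takes no position on that.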

>> 
>> Regards,
>> Oza.
>> 

end of thread, other threads:[~2018-01-19  9:03 UTC | newest]

Thread overview: 24+ messages
2018-01-17 10:37 [PATCH v5 0/4] Address error and recovery for AER and DPC Oza Pawandeep
2018-01-17 10:37 ` [PATCH v5 1/4] PCI/AER: Rename error recovery to generic pci naming Oza Pawandeep
2018-01-17 10:37 ` [PATCH v5 2/4] PCI/AER: factor out error reporting from AER Oza Pawandeep
2018-01-17 10:37 ` [PATCH v5 3/4] PCI/DPC: Unify and plumb error handling into DPC Oza Pawandeep
2018-01-17 16:45   ` Sinan Kaya
2018-01-18  5:22     ` poza
2018-01-18  6:04       ` poza
2018-01-17 16:46   ` Sinan Kaya
2018-01-18  5:17     ` poza
2018-01-18  5:57       ` poza
2018-01-18 16:31         ` Sinan Kaya
2018-01-18 18:00           ` poza
2018-01-18 18:03             ` Sinan Kaya
2018-01-19  4:23               ` poza
2018-01-19  4:44                 ` Sinan Kaya
2018-01-19  9:03                   ` poza
2018-01-17 10:37 ` [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event Oza Pawandeep
2018-01-17 16:27   ` Sinan Kaya
2018-01-18  2:56     ` Keith Busch
2018-01-18  5:32       ` poza
2018-01-18 16:35         ` Sinan Kaya
2018-01-19  1:43           ` Keith Busch
2018-01-19  4:21             ` poza
2018-01-18  5:26     ` poza
