* [PATCH 1/5] PCI-Express AER implemetation: aer howto document
@ 2006-07-12  7:10 Zhang, Yanmin
  2006-07-12  7:16 ` [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h Zhang, Yanmin
  2006-07-14  5:25 ` [PATCH 1/5] PCI-Express AER implemetation: aer howto document Zhang, Yanmin
  0 siblings, 2 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-12  7:10 UTC (permalink / raw)
  To: LKML, linux-pci maillist; +Cc: Greg KH, Tom Long Nguyen

I changed some code and split the series into 5 patches. Thanks to Greg for his comments.

From: Zhang, Yanmin <yanmin.zhang@intel.com>

PCI-Express AER (Advanced Error Reporting) provides more robust error reporting.
This series of patches enables kernel support for AER.

The initial patches were written by Tom Long Nguyen. I ported them to the kernel
2.6.17. Many thanks to Rajesh Shah and Narayanan Chandramouli for their great
review comments and testing help.

Patch 1 consists of the pcieaer-howto.txt document.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> 

---

--- linux-2.6.17/Documentation/pcieaer-howto.txt	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/Documentation/pcieaer-howto.txt	2006-06-30 15:45:26.000000000 +0800
@@ -0,0 +1,224 @@
+   The PCI Express Advanced Error Reporting Driver Guide HOWTO
+		T. Long Nguyen	<tom.l.nguyen@intel.com>
+		Yanmin Zhang	<yanmin.zhang@intel.com>
+				03/30/2006
+
+1. About this guide
+
+This guide describes the basics of the PCI Express Advanced Error
+Reporting (AER) driver and provides information on how to enable
+the drivers of endpoint devices to conform with the PCI Express AER
+driver.
+
+2. Copyright © Intel Corporation 2006.
+
+3. What is the PCI Express AER Driver?
+
+PCI Express error signaling can occur on the PCI Express link itself
+or on behalf of transactions initiated on the link. PCI Express
+defines two error reporting paradigms: the baseline capability and
+the Advanced Error Reporting capability. The baseline capability is
+required of all PCI Express components providing a minimum defined
+set of error reporting requirements. Advanced Error Reporting
+capability is implemented with a PCI Express advanced error reporting
+extended capability structure providing more robust error reporting.
+
+The PCI Express AER driver provides the infrastructure to support PCI
+Express Advanced Error Reporting capability. The PCI Express AER
+driver provides three basic functions:
+
+-	Gathers comprehensive error information when errors
+	occur.
+-	Performs error recovery actions.
+-	Reports errors to the users.
+
+The AER driver attaches only to root ports that support the
+PCI-Express AER capability.
+
+4. Why Use the PCI Express AER Driver?
+
+In a PCI Express-aware system, when AER is enabled, a PCI Express
+device will automatically send an error message to the PCIE root
+port above it when the device captures an error. The Root Port,
+upon receiving an error reporting message, internally processes
+and logs the error message in its PCI Express capability structure.
+Error information being logged includes storing the error reporting
+agent's requestor ID into the Error Source Identification Registers
+and setting the error bits of the Root Error Status Register
+accordingly. If AER error reporting is enabled in Root Error Command
+Register, the Root Port generates an interrupt if an error is
+detected.
+
+All kernels released before 2.6.18 have no root service driver
+available to manage the PCI Express advanced error reporting
+extended capability structure. BIOS could provide the baseline
+capability, but it is unable to coordinate with the downstream device
+drivers to determine more precisely which error and what severity,
+and unable to reset the downstream links while handling fatal error
+recovery.
+
+Solving these BIOS issues requires a PCI Express AER Root driver
+that provides:
+
+-	An infrastructure for the OS and application to determine whether
+	a fatal error is fatal to the system, OS, or application, which
+	increases uptime.
+
+-	An infrastructure to notify the downstream device drivers when
+	errors occur.
+
+-	An infrastructure to dynamically perform error recovery actions
+	based on configuration options.
+
+- 	Platform-specific independence.
+
+5. Including the PCI Express AER Root Driver into the Linux Kernel
+
+The PCI Express AER Root driver is a Root Port service driver attached
+to the PCI Express Port Bus driver. Its service must be registered
+with the PCI Express Port Bus driver and users are required to include
+the PCI Express Port Bus driver in the kernel (refer to
+PCIEBUS-HOWTO.txt). Once the kernel config CONFIG_PCIEPORTBUS is
+included, the PCI Express AER Root driver is automatically included
+as a kernel driver by default (CONFIG_PCIEAER = y). Users may disable
+the PCI Express AER driver by clearing CONFIG_PCIEAER.
+
+Note that some systems have AER support in the BIOS. Enabling the
+AER Root driver while the BIOS also handles AER may result in
+unpredictable behavior. To avoid this conflict, a successful load
+of the AER Root driver requires ACPI _OSC support in the BIOS, which
+allows the AER Root driver to request native control of AER. See
+the PCI FW 3.0 Specification for details regarding _OSC usage. Currently,
+many firmware implementations use PCI-Express but do not provide
+_OSC support. To support such firmware, forceload, a module parameter
+of type bool, lets AER initialization continue even when the
+firmware has no _OSC support. The default is forceload=n.
+
+6. Enabling AER Aware Support in PCI Express Device Driver
+
+Enabling AER-aware support requires a device driver to configure
+the AER capability structure within its device and to provide its
+error-recovery callbacks as described below.
+
+6.1. Configuring the AER capability structure
+
+PCI Express errors are classified into two types: correctable errors
+and uncorrectable errors. This classification is based on the impacts
+of those errors, which may result in function failure or in degraded
+performance.
+
+Correctable errors have no impact on the functionality of the
+interface. The PCI Express protocol can recover without any software
+intervention or any loss of data. These errors are detected and
+corrected by hardware. Unlike correctable errors, uncorrectable
+errors impact functionality of the interface. Uncorrectable errors
+can cause a particular transaction or a particular PCI Express link
+to be unreliable. Depending on those error conditions, uncorrectable
+errors are further classified into fatal errors and non-fatal errors.
+Non-fatal errors cause the particular transaction to be unreliable,
+but the PCI Express link itself is fully functional. Fatal errors, on
+the other hand, cause the link to be unreliable.
+
+AER-aware drivers of PCI Express components need to change the device
+control registers to enable AER. They may also change AER registers,
+including the mask and severity registers.
+
+Note that the errors as described above are related to the PCI Express
+hierarchy and links. These errors do not include any device specific
+errors because device specific errors will still get sent directly to
+the device driver.
+
+6.2. Provide PCI error-recovery callbacks
+
+The PCI Express AER Root driver uses callbacks to coordinate with
+downstream device drivers associated with a hierarchy in question
+when performing error recovery actions. The AER driver follows the
+rules defined in pci-error-recovery.txt.
+
+Note that correctable errors have no impact on the functionality of
+the interface. The PCI Express protocol can recover without any
+software intervention or any loss of data. These errors do not
+require any recovery actions. The AER driver clears the device's
+correctable error status register accordingly and logs these errors.
+
+If an error message indicates a non-fatal error, performing link reset
+at upstream is not required. The AER driver calls error_detected(dev,
+pci_channel_io_normal) on all drivers within the hierarchy in
+question. A driver may return PCI_ERS_RESULT_CAN_RECOVER,
+PCI_ERS_RESULT_DISCONNECT, or PCI_ERS_RESULT_NEED_RESET. If the merged
+result is PCI_ERS_RESULT_CAN_RECOVER, the AER driver calls mmio_enabled next.
+ 
+If an error message indicates a fatal error, the kernel broadcasts
+error_detected(dev, pci_channel_io_frozen) to all drivers within
+the hierarchy in question. Then a link reset at upstream is
+necessary. As different kinds of devices might use different approaches
+to reset the link, the AER port service driver is required to provide
+the function that resets the link. First, the kernel checks whether the
+upstream component has an AER driver. If it does, the kernel uses the
+reset_link callback of that driver. If the upstream component has no AER
+driver and the port is a downstream port, the kernel uses the AER driver
+of the root port that reported the AER error. Upstream ports
+should provide their own AER service drivers with a reset_link
+function. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER and
+reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
+to mmio_enabled.
+
+6.2.5 helper functions
+
+6.2.5.1 int pci_find_aer_capability(struct pci_dev *dev);
+pci_find_aer_capability locates the PCI-Express AER capability
+in the device configuration space. If the device doesn't support
+PCI-Express AER, the function returns 0.
+
+6.2.5.2 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
+pci_enable_pcie_error_reporting enables the device to send error
+messages to the root port when an error is detected. Note that devices
+don't enable error reporting by default, so device drivers need to
+call this function to enable it.
+
+6.2.5.3 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+pci_disable_pcie_error_reporting stops the device from sending error
+messages to the root port when an error is detected.
+
+6.2.5.4 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+pci_cleanup_aer_uncorrect_error_status clears the uncorrectable
+error status register.
+
+7. AER error output
+
+When any AER error is reported, the kernel calls printk to output
+error messages.
+
+Below is a sample.
++------ PCI-Express Device Error -----+
+Error Severity          : Uncorrected (Fatal)
+PCIE Bus Error type     : Transaction Layer
+Unsupported Request     : First
+Requester ID            : 0500
+VendorID=8086h, DeviceID=0329h, Bus=05h, Device=00h, Function=00h
+TLB Header:
+04000001 00200a03 05010000 00050100
+
+8. Frequently Asked Questions
+
+Q: What happens if a PCI Express device driver does not provide an
+error recovery handler?
+
+A: The devices bound to the driver won't be recovered. If the
+error is fatal, the kernel will print warning messages. Please refer
+to section 6 for more information.
+
+Q: How does this infrastructure deal with a driver that is not PCI
+Express aware?
+
+A: This infrastructure calls the error callback functions of the
+driver when an error happens. But if the driver is not aware of
+PCI Express, the device might not report its own errors to root
+port.
+
+Q: What modifications does such a driver need to make it compatible
+with the PCI Express AER Root driver?
+
+A: It can call the helper functions to enable AER in its devices and
+clear the uncorrectable error status register. Please refer to section 6.2.5.
+

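The sample output in section 7 of the howto prints a Requester ID of 0500 next to Bus=05h, Device=00h, Function=00h. Below is a minimal userspace sketch of that decoding, assuming the usual 8-bit bus, 5-bit device, 3-bit function layout that the series relies on when it compares an ID against (bus->number << 8) | devfn. The struct and function names are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical holder for a decoded requester ID (not from the patch). */
struct sketch_bdf {
	uint8_t bus;		/* bits 15:8 */
	uint8_t device;		/* bits 7:3  */
	uint8_t function;	/* bits 2:0  */
};

/* Split a 16-bit requester ID into bus, device and function numbers. */
static struct sketch_bdf sketch_decode_requester_id(uint16_t id)
{
	struct sketch_bdf out;

	out.bus = (uint8_t)(id >> 8);
	out.device = (uint8_t)((id >> 3) & 0x1f);
	out.function = (uint8_t)(id & 0x07);
	return out;
}
```

With the ID from the sample, sketch_decode_requester_id(0x0500) yields bus 0x05, device 0 and function 0, which agrees with the line printed under it.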

* Re: [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h
  2006-07-12  7:10 [PATCH 1/5] PCI-Express AER implemetation: aer howto document Zhang, Yanmin
@ 2006-07-12  7:16 ` Zhang, Yanmin
  2006-07-12  7:22   ` [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type Zhang, Yanmin
  2006-07-14  5:25 ` [PATCH 1/5] PCI-Express AER implemetation: aer howto document Zhang, Yanmin
  1 sibling, 1 reply; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-12  7:16 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

Although Greg already accepted the second patch into his testing tree,
I am resending it to keep the series complete.

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 2 adds new defines of PCI-Express AER registers
and their bits into file include/linux/pci_regs.h.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/include/linux/pci_regs.h	2006-06-22 16:26:31.000000000 +0800
+++ linux-2.6.17_aer/include/linux/pci_regs.h	2006-06-22 16:46:29.000000000 +0800
@@ -421,7 +421,23 @@
 #define  PCI_ERR_CAP_ECRC_CHKE	0x00000100	/* ECRC Check Enable */
 #define PCI_ERR_HEADER_LOG	28	/* Header Log Register (16 bytes) */
 #define PCI_ERR_ROOT_COMMAND	44	/* Root Error Command */
+/* Correctable Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_COR_EN		0x00000001
+/* Non-fatal Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_NONFATAL_EN	0x00000002
+/* Fatal Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_FATAL_EN	0x00000004
 #define PCI_ERR_ROOT_STATUS	48
+#define PCI_ERR_ROOT_COR_RCV		0x00000001	/* ERR_COR Received */
+/* Multi ERR_COR Received */
+#define PCI_ERR_ROOT_MULTI_COR_RCV	0x00000002
+/* ERR_FATAL/NONFATAL Received */
+#define PCI_ERR_ROOT_UNCOR_RCV		0x00000004
+/* Multi ERR_FATAL/NONFATAL Received */
+#define PCI_ERR_ROOT_MULTI_UNCOR_RCV	0x00000008
+#define PCI_ERR_ROOT_FIRST_FATAL	0x00000010	/* First Fatal */
+#define PCI_ERR_ROOT_NONFATAL_RCV	0x00000020	/* Non-Fatal Received */
+#define PCI_ERR_ROOT_FATAL_RCV		0x00000040	/* Fatal Received */
 #define PCI_ERR_ROOT_COR_SRC	52
 #define PCI_ERR_ROOT_SRC	54
 

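To show how the Root Error Status bits added above might be read together, here is a small userspace sketch that picks the worst severity reported in a status value. The constants are copied from the patch; the enum and function names are hypothetical, and this is only one possible reading of the register, not the logic the AER driver in patch 4 uses.

```c
#include <stdint.h>

/* Root Error Status bits, copied from the pci_regs.h hunk above. */
#define PCI_ERR_ROOT_COR_RCV		0x00000001	/* ERR_COR Received */
#define PCI_ERR_ROOT_UNCOR_RCV		0x00000004	/* ERR_FATAL/NONFATAL Received */
#define PCI_ERR_ROOT_FATAL_RCV		0x00000040	/* Fatal Received */

/* Hypothetical severity labels for this sketch only. */
enum sketch_severity {
	SKETCH_NONE,
	SKETCH_CORRECTABLE,
	SKETCH_NONFATAL,
	SKETCH_FATAL
};

/* Return the most severe error class flagged in a Root Error Status value. */
static enum sketch_severity sketch_worst_severity(uint32_t root_status)
{
	if (root_status & PCI_ERR_ROOT_UNCOR_RCV) {
		if (root_status & PCI_ERR_ROOT_FATAL_RCV)
			return SKETCH_FATAL;
		return SKETCH_NONFATAL;
	}
	if (root_status & PCI_ERR_ROOT_COR_RCV)
		return SKETCH_CORRECTABLE;
	return SKETCH_NONE;
}
```

Checking the uncorrectable bit before the correctable one is a design choice of the sketch: when both are set, the caller sees the class that needs recovery.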

* Re: [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type
  2006-07-12  7:16 ` [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h Zhang, Yanmin
@ 2006-07-12  7:22   ` Zhang, Yanmin
  2006-07-12  7:32     ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
  2006-07-12  8:00     ` [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type Zhang, Yanmin
  0 siblings, 2 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-12  7:22 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 3 exports pcie_port_bus_type. The AER driver can be compiled
as a module, so it needs access to pcie_port_bus_type.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/portdrv_bus.c	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/portdrv_bus.c	2006-06-22 16:46:29.000000000 +0800
@@ -76,3 +76,6 @@ static int pcie_port_bus_resume(struct d
 		driver->resume(pciedev);
 	return 0;
 }
+
+EXPORT_SYMBOL(pcie_port_bus_type);
+


* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12  7:22   ` [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type Zhang, Yanmin
@ 2006-07-12  7:32     ` Zhang, Yanmin
  2006-07-12  7:38       ` [PATCH 5/5] PCI-Express AER implemetation: pcie_portdrv error handler Zhang, Yanmin
  2006-07-12  8:06       ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
  2006-07-12  8:00     ` [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type Zhang, Yanmin
  1 sibling, 2 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-12  7:32 UTC (permalink / raw)
  To: LKML, linux-pci maillist; +Cc: Greg KH, Tom Long Nguyen

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 4 implements the core part of PCI-Express AER and aerdrv
port service driver.

When a root port service device is probed, aerdrv calls
request_irq to register an irq handler for the AER error interrupt.

When a device sends a PCI-Express error message to the root port,
the root port triggers an interrupt, by either MSI or IO-APIC,
and the kernel runs the irq handler. The handler collects the root
error status register and schedules a work item. The work item calls
the core part to process the error based on its type
(correctable/non-fatal/fatal).

As for Correctable errors, the patch chooses to just clear the correctable
error status register of the device.

As for the non-fatal error, the patch follows generic PCI error handler
rules to call the error callback functions of the endpoint's driver. If
the device is a bridge, the patch chooses to broadcast the error to
downstream devices.

As for the fatal error, the patch resets the pci-express link and
follows generic PCI error handler rules to call the error callback
functions of the endpoint's driver. If the device is a bridge, the patch
chooses to broadcast the error to downstream devices.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

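As a reading aid for pci_find_aer_capability in this patch, the following userspace sketch walks the extended capability list of a mock configuration space held in a byte array. The layout assumed here is the one the patch uses: the list starts at offset 0x100, the low 16 bits of each header hold the capability ID, and the top 12 bits hold the next offset. All names starting with sketch_ or SKETCH_ are hypothetical. The hop limit is an addition of the sketch; the loop in the patch has none.

```c
#include <stdint.h>

#define SKETCH_CFG_SPACE_SIZE	0x100	/* extended capabilities start here */
#define SKETCH_EXT_CFG_SIZE	0x1000	/* size of the mock config space */
#define SKETCH_EXT_CAP_ID_ERR	0x0001	/* AER extended capability ID */

/* Read a little-endian dword from the mock config space. */
static uint32_t sketch_read_dword(const uint8_t *cfg, int pos)
{
	return (uint32_t)cfg[pos] |
	       ((uint32_t)cfg[pos + 1] << 8) |
	       ((uint32_t)cfg[pos + 2] << 16) |
	       ((uint32_t)cfg[pos + 3] << 24);
}

/* Write a little-endian dword; used only to build test fixtures. */
static void sketch_write_dword(uint8_t *cfg, int pos, uint32_t val)
{
	cfg[pos] = (uint8_t)val;
	cfg[pos + 1] = (uint8_t)(val >> 8);
	cfg[pos + 2] = (uint8_t)(val >> 16);
	cfg[pos + 3] = (uint8_t)(val >> 24);
}

/* Return the offset of the AER capability, or 0 if it is not found. */
static int sketch_find_aer_cap(const uint8_t *cfg)
{
	int pos = SKETCH_CFG_SPACE_SIZE;
	/* Each capability needs at least 8 bytes, which bounds the walk. */
	int ttl = (SKETCH_EXT_CFG_SIZE - SKETCH_CFG_SPACE_SIZE) / 8;

	while (pos >= SKETCH_CFG_SPACE_SIZE &&
	       pos <= SKETCH_EXT_CFG_SIZE - 4 && ttl-- > 0) {
		uint32_t header = sketch_read_dword(cfg, pos);

		/* An empty list reads as 0; some broken boards return ~0. */
		if (header == 0 || header == 0xffffffffu)
			return 0;
		if ((header & 0xffff) == SKETCH_EXT_CAP_ID_ERR)
			return pos;
		pos = (int)((header >> 20) & 0xffc);
	}
	return 0;
}
```

A next pointer of 0 ends the walk, and a list that loops back on itself ends when the hop limit runs out, so the function always terminates.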
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_acpi.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_acpi.c	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+/**
+ * aer_osc_setup - run ACPI _OSC method
+ *
+ * Return:
+ *	Zero on success, nonzero otherwise.
+ *
+ * Invoked when PCIE bus loads AER service driver. To avoid conflicts with
+ * BIOS AER support, the BIOS must yield AER control to the OS native driver.
+ **/
+int aer_osc_setup(struct pci_dev *dev)
+{
+	int retval = OSC_METHOD_RUN_SUCCESS;
+	acpi_status status;
+	acpi_handle handle = DEVICE_ACPI_HANDLE(&dev->dev);
+	struct pci_dev *pdev = dev;
+	struct pci_bus *parent;
+
+	while (!handle) {
+		if (!pdev || !pdev->bus->parent)
+			break;
+		parent = pdev->bus->parent;
+		if (!parent->self)
+			/* Parent must be a host bridge */
+			handle = acpi_get_pci_rootbridge_handle(
+					pci_domain_nr(parent),
+					parent->number);
+		else
+			handle = DEVICE_ACPI_HANDLE(
+					&(parent->self->dev));
+		pdev = parent->self;
+	}
+
+	if (!handle)
+		return OSC_METHOD_NOT_SUPPORTED;
+
+	pci_osc_support_set(OSC_EXT_PCI_CONFIG_SUPPORT);
+	status = pci_osc_control_set(handle, OSC_PCI_EXPRESS_AER_CONTROL |
+		OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL);
+	if (ACPI_FAILURE(status)) {
+		if (status == AE_SUPPORT) 
+			retval = OSC_METHOD_NOT_SUPPORTED;
+	 	else
+			retval = OSC_METHOD_RUN_FAILURE;
+	}
+
+	return retval;
+}
+
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_core.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_core.c	2006-07-12 14:35:48.000000000 +0800
@@ -0,0 +1,737 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+static LIST_HEAD(rc_list);		/* Define Root Complex List */
+
+static int forceload;
+module_param(forceload, bool, 0);
+
+#define PCI_CFG_SPACE_SIZE	(0x100)
+int pci_find_aer_capability(struct pci_dev *dev)
+{
+	int pos;
+	u32 reg32 = 0;
+
+	/* Check if it's a pci-express device */
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return 0;
+
+	/* Check if it supports pci-express AER */
+	pos = PCI_CFG_SPACE_SIZE;
+	while (pos) {
+		if (pci_read_config_dword(dev, pos, &reg32))
+			return 0;
+
+		/* some broken boards return ~0 */
+		if (reg32 == 0xffffffff)
+			return 0;
+
+		if (PCI_EXT_CAP_ID(reg32) == PCI_EXT_CAP_ID_ERR)
+			break;
+
+		pos = reg32 >> 20;
+	}
+
+	return pos;
+}
+
+int pci_disable_pcie_error_reporting(struct pci_dev *dev)
+{
+	u16 reg16 = 0;
+	int pos;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 & ~(PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE);
+	pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+			reg16);
+	return 0;
+}
+
+int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, mask;
+
+	pos = pci_find_aer_capability(dev);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+	if (dev->error_state == pci_channel_io_normal)
+		status &= ~mask; /* Clear corresponding nonfatal bits */
+	else
+		status &= mask; /* Clear corresponding fatal bits */
+	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+
+	return 0;
+}
+
+static int find_device_iter(struct device *device, void *data)
+{
+	struct pci_dev *dev;
+	u16 id = *(unsigned long *)data;
+	u8 secondary, subordinate, d_bus = id >> 8;
+
+	if (device->bus == &pci_bus_type) {
+		dev = to_pci_dev(device);
+		if (id == ((dev->bus->number << 8) | dev->devfn)) {
+			/*
+			 * Device ID match
+			 */
+			*(unsigned long*)data = (unsigned long)device;
+			return 1;
+		}
+
+		/* 
+		 * If the device is a P2P bridge, check whether it is upstream.
+		 */
+		if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+			pci_read_config_byte(dev, PCI_SECONDARY_BUS,
+				&secondary);
+			pci_read_config_byte(dev, PCI_SUBORDINATE_BUS,
+				&subordinate);
+			if (d_bus >= secondary && d_bus <= subordinate) {
+				*(unsigned long*)data = (unsigned long)device;
+				return 1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * find_source_device - search through device hierarchy for source device
+ * @parent: pointer to Root Port pci_dev data structure
+ * @id: device ID of agent who sends an error message to this Root Port
+ *
+ * Invoked when error is detected at the Root Port.
+ **/
+static struct device* find_source_device(struct pci_dev *parent, u16 id)
+{
+	struct pci_dev *dev = parent;
+	struct device *device;
+	unsigned long device_addr;
+	int status;
+
+	/* Is Root Port an agent that sends error message? */
+	if (id == ((dev->bus->number << 8) | dev->devfn)) 
+		return &dev->dev;
+
+	do {
+		device_addr = id;
+ 		if ((status = device_for_each_child(&dev->dev,
+			&device_addr, find_device_iter))) {
+			device = (struct device*)device_addr;
+			dev = to_pci_dev(device);
+			if (id == ((dev->bus->number << 8) | dev->devfn))
+				return device;
+		}
+ 	}while (status);
+
+	return NULL;
+}
+
+static void report_error_detected(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	dev->error_state = result_data->state;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->error_detected) {
+		if (result_data->state == pci_channel_io_frozen &&
+			!(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
+			/* 
+			 * In a fatal recovery, if one of the downstream
+			 * devices has no driver, we might be unable to
+			 * recover, because a driver loaded later with
+			 * insmod for this device is unaware of its
+			 * hw state.
+			 */
+			printk(KERN_DEBUG "Device ID[%s] has %s\n",
+					dev->dev.bus_id, (dev->driver) ?
+					"no AER-aware driver" : "no driver");
+		}
+		return;
+	}
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->error_detected(dev, result_data->state);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_mmio_enabled(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->mmio_enabled)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->mmio_enabled(dev);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_slot_reset(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->slot_reset)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->slot_reset(dev);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_resume(struct pci_dev *dev, void *data)
+{
+	struct pci_error_handlers *err_handler;
+
+	dev->error_state = pci_channel_io_normal;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->resume)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	err_handler->resume(dev);
+	return;
+}
+
+/**
+ * broadcast_error_message - handle message broadcast to downstream drivers
+ * @dev: pointer to the device from which the message is broadcast down
+ * @state: error state
+ * @error_mesg: name of the callback, used in the debug message
+ * @cb: callback to be broadcast
+ *
+ * Invoked during the error recovery process. The callback is broadcast to
+ * all downstream drivers in the hierarchy in question.
+ **/
+static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
+	enum pci_channel_state state,
+	char *error_mesg,
+	void (*cb)(struct pci_dev *, void *))
+{
+	struct aer_broadcast_data result_data;
+
+	printk(KERN_DEBUG "Broadcast %s message\n", error_mesg);
+	result_data.state = state;
+	if (cb == report_error_detected)
+		result_data.result = PCI_ERS_RESULT_CAN_RECOVER;
+	else
+		result_data.result = PCI_ERS_RESULT_RECOVERED;
+
+	if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+		/*
+		 * If the error is reported by a bridge, we think this error
+		 * is related to the downstream link of the bridge, so we
+		 * do error recovery on all subordinates of the bridge instead
+		 * of the bridge and clear the error status of the bridge.
+		 */
+		if (cb == report_error_detected)
+			dev->error_state = state;
+		pci_walk_bus(dev->subordinate, cb, &result_data);
+		if (cb == report_resume) {
+			pci_cleanup_aer_uncorrect_error_status(dev);
+			dev->error_state = pci_channel_io_normal;
+		}
+	}
+	else {
+		/*
+		 * If the error is reported by an end point, we think this
+		 * error is related to the upstream link of the end point.
+		 */
+		pci_walk_bus(dev->bus, cb, &result_data);
+	}
+
+	return result_data.result;
+}
+
+struct find_aer_service_data {
+        struct pcie_port_service_driver *aer_driver;
+        int is_downstream;
+};
+
+static int find_aer_service_iter(struct device *device, void *data)
+{
+	struct device_driver *driver;
+	struct pcie_port_service_driver *service_driver;
+	struct pcie_device *pcie_dev;
+	struct find_aer_service_data *result;
+
+	result = (struct find_aer_service_data *) data;
+
+	if (device->bus == &pcie_port_bus_type) {
+		pcie_dev = to_pcie_device(device);
+		if (pcie_dev->id.port_type == PCIE_SW_DOWNSTREAM_PORT)
+			result->is_downstream = 1;
+
+		driver = device->driver;
+		if (driver) {
+			service_driver = to_service_driver(driver);
+			if (service_driver->id_table->service_type ==
+					PCIE_PORT_SERVICE_AER) {
+				result->aer_driver = service_driver;
+				return 1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static void find_aer_service(struct pci_dev *dev,
+		struct find_aer_service_data *data)
+{
+	device_for_each_child(&dev->dev, data, find_aer_service_iter);
+}
+
+static pci_ers_result_t reset_link(struct pcie_device *aerdev,
+		struct pci_dev *dev)
+{
+	struct pci_dev *udev;
+	pci_ers_result_t status;
+	struct find_aer_service_data data;
+
+	if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)
+		udev = dev;
+	else
+		udev= dev->bus->self;
+
+	data.is_downstream = 0;
+	data.aer_driver = NULL;
+	find_aer_service(udev, &data);
+
+	/*
+	 * Use the aer driver of the error agent first.
+	 * If it has no aer driver, use the root port's.
+	 */
+	if (!data.aer_driver || !data.aer_driver->reset_link) {
+		if (data.is_downstream &&
+			aerdev->device.driver &&
+			to_service_driver(aerdev->device.driver)->reset_link) {
+			data.aer_driver =
+				to_service_driver(aerdev->device.driver);
+		} else {
+			printk(KERN_DEBUG "No link-reset support to Device ID"
+				"[%s]\n",
+				dev->dev.bus_id);
+			return PCI_ERS_RESULT_DISCONNECT;
+		}
+	}
+
+	status = data.aer_driver->reset_link(udev);
+	if (status != PCI_ERS_RESULT_RECOVERED) {
+		printk(KERN_DEBUG "Link reset at upstream Device ID"
+			"[%s] failed\n",
+			udev->dev.bus_id);
+		return PCI_ERS_RESULT_DISCONNECT;
+	}
+
+	return status;
+}
+
+/**
+ * do_recovery - handle nonfatal/fatal error recovery process
+ * @aerdev: pointer to a pcie_device data structure of root port
+ * @dev: pointer to a pci_dev data structure of agent detecting an error
+ * @severity: error severity type
+ *
+ * Invoked when an error is nonfatal/fatal. Once being invoked, broadcast
+ * error detected message to all downstream drivers within a hierarchy in 
+ * question and return the returned code.
+ **/
+static pci_ers_result_t do_recovery(struct pcie_device *aerdev,
+		struct pci_dev *dev,
+		int severity)
+{
+	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
+	enum pci_channel_state state;
+
+	if (severity == AER_FATAL)
+		state = pci_channel_io_frozen;
+	else
+		state = pci_channel_io_normal;
+
+	status = broadcast_error_message(dev,
+			state,
+			"error_detected",
+			report_error_detected);
+
+	if (severity == AER_FATAL) {
+		result = reset_link(aerdev, dev);
+		if (result != PCI_ERS_RESULT_RECOVERED) {
+			/* TODO: Should panic here? */
+			return result;
+		}
+	}
+
+	if (status == PCI_ERS_RESULT_CAN_RECOVER)
+		status = broadcast_error_message(dev,
+				state,
+				"mmio_enabled",
+				report_mmio_enabled);
+
+	if (status == PCI_ERS_RESULT_NEED_RESET) {
+		/*
+		 * TODO: Should call platform-specific
+		 * functions to reset slot before calling
+		 * drivers' slot_reset callbacks?
+		 */
+		status = broadcast_error_message(dev,
+				state,
+				"slot_reset",
+				report_slot_reset);
+	}
+
+	if (status == PCI_ERS_RESULT_RECOVERED)
+		broadcast_error_message(dev,
+				state,
+				"resume",
+				report_resume);
+
+	return status;
+}
+
+/**
+ * handle_error_source - handle logging error into an event log
+ * @aerdev: pointer to pcie_device data structure of the root port
+ * @dev: pointer to pci_dev data structure of error source device
+ * @info: comprehensive error information
+ *
+ * Invoked when an error is detected by the Root Port.
+ **/
+static void handle_error_source(struct pcie_device * aerdev,
+	struct pci_dev *dev,
+	struct aer_err_info info)
+{
+	pci_ers_result_t status = 0;
+	int pos;
+
+	if (info.severity == AER_CORRECTABLE) {
+		/* 
+		 * Correctable error does not need software intervention.
+		 * No need to go through error recovery process.
+		 */
+		pos = pci_find_aer_capability(dev);
+		if (pos)
+			pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+					info.status);
+	} else {
+		status = do_recovery(aerdev, dev, info.severity);
+		if (status == PCI_ERS_RESULT_RECOVERED) {
+			printk(KERN_DEBUG "AER driver successfully recovered\n");
+		} else {
+			/* TODO: Should kernel panic here? */ 
+			printk(KERN_DEBUG "AER driver didn't recover\n");
+		}
+	}
+}
+
+/**
+ * enable_root_aer - enable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus loads AER service driver.
+ **/
+static void enable_root_aer(struct aer_rpc *rpc)
+{
+	struct pci_dev *pdev = rpc->rpd->port;
+	int pos, aer_pos;
+	u16 reg16;
+	u32 reg32;
+
+	pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+	/* Clear PCIE Capability's Device Status */
+	pci_read_config_word(pdev, pos+PCI_EXP_DEVSTA, &reg16);
+	pci_write_config_word(pdev, pos+PCI_EXP_DEVSTA, reg16);
+
+	/* Disable system error generation in response to error messages */
+	pci_read_config_word(pdev, pos + PCI_EXP_RTCTL, &reg16);
+	reg16 &= ~(SYSTEM_ERROR_INTR_ON_MESG_MASK);
+	pci_write_config_word(pdev, pos + PCI_EXP_RTCTL, reg16);
+
+	aer_pos = pci_find_aer_capability(pdev);
+	/* Clear error status */
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, reg32);
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, reg32);
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
+
+	/* Enable the Root Port to report its own errors */
+	pci_read_config_word(pdev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 |
+		PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE;
+	pci_write_config_word(pdev, pos+PCI_EXP_DEVCTL,
+		reg16);
+
+	/* Enable Root Port's interrupt in response to error messages */
+	pci_write_config_dword(pdev,
+		aer_pos + PCI_ERR_ROOT_COMMAND,
+		ROOT_PORT_INTR_ON_MESG_MASK);
+}
+
+/**
+ * disable_root_aer - disable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus unloads AER service driver.
+ **/
+static void disable_root_aer(struct aer_rpc *rpc)
+{
+	struct pci_dev *pdev = rpc->rpd->port;
+	u32 reg32;
+	int pos;
+
+	pos = pci_find_aer_capability(pdev);
+	/* Disable Root's interrupt in response to error messages */
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+	/* Clear Root's error status reg */
+	pci_read_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, reg32);
+}
+
+/**
+ * get_e_source - retrieve an error source
+ * @rpc: pointer to the root port which holds an error
+ *
+ * Invoked by DPC handler to consume an error.
+ **/
+static struct aer_err_source* get_e_source(struct aer_rpc *rpc)
+{
+	struct aer_err_source *e_source;
+	unsigned long flags;
+
+	/* Lock access to Root error producer/consumer index */
+	spin_lock_irqsave(&rpc->e_lock, flags);
+	if (rpc->prod_idx == rpc->cons_idx) {
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return NULL;
+	}
+	e_source = &rpc->e_sources[rpc->cons_idx];
+	rpc->cons_idx++;
+	if (rpc->cons_idx == AER_ERROR_SOURCES_MAX)
+		rpc->cons_idx = 0;
+	spin_unlock_irqrestore(&rpc->e_lock, flags);
+	
+	return e_source;
+}
+
+static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
+{
+	int pos;
+
+	pos = pci_find_aer_capability(dev);
+
+	/* The device might not support AER */
+	if (!pos)
+		return AER_SUCCESS;
+
+	if (info->severity == AER_CORRECTABLE) {
+		pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+			&info->status);
+		if (!(info->status & ERR_CORRECTABLE_ERROR_MASK))
+			return AER_UNSUCCESS; 
+	} else if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE ||
+		info->severity == AER_NONFATAL) {
+
+		/* Link is still healthy for IO reads */
+		pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS,
+			&info->status);
+		if (!(info->status & ERR_UNCORRECTABLE_ERROR_MASK))
+			return AER_UNSUCCESS;
+
+		if (info->status & AER_LOG_TLP_MASKS) {
+			info->flags |= AER_TLP_HEADER_VALID_FLAG;
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG, &info->tlp.dw0);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 4, &info->tlp.dw1);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 8, &info->tlp.dw2);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 12, &info->tlp.dw3);
+		}
+	}
+
+	return AER_SUCCESS;
+}
+
+/**
+ * aer_isr - consume an error detected by root port
+ * @context: pointer to a private data of pcie device
+ *
+ * Invoked, as DPC, when root port records new detected error
+ **/
+void aer_isr(void *context)
+{
+	struct pcie_device *p_device = (struct pcie_device *) context;
+	struct device *s_device;
+	struct aer_rpc *rpc = get_service_data(p_device);
+	struct aer_err_source *e_src;
+	struct aer_err_info e_info = {0, 0, 0,};
+	int i;
+	u16 id;
+
+	/* 
+	 * Lock access into an error buffer associated with this Root Port.
+	 * Process one error at a time.
+	 */
+	down(&rpc->rpc_sema);
+	if (!(e_src = get_e_source(rpc))) {
+		printk(KERN_DEBUG "%s->DPC fails to get an error source\n",
+			__FUNCTION__);
+		up(&rpc->rpc_sema);
+		return;
+	}
+
+	/*
+	 * Both a correctable error and an uncorrectable error may have been
+	 * logged. Report the correctable error first.
+	 */
+	for (i = 1; i & ROOT_ERR_STATUS_MASKS ; i <<= 2) {
+		if (i > 4)
+			break;
+		if (!(e_src->status & i))
+			continue;
+
+		/* Init comprehensive error information */
+		if (i & PCI_ERR_ROOT_COR_RCV) {
+			id = ERR_COR_ID(e_src->id);
+			e_info.severity = AER_CORRECTABLE;
+		} else {
+			id = ERR_UNCOR_ID(e_src->id);
+			e_info.severity = ((e_src->status >> 6) & 1);
+		}
+		if (e_src->status &
+			(PCI_ERR_ROOT_MULTI_COR_RCV |
+			 PCI_ERR_ROOT_MULTI_UNCOR_RCV))
+			e_info.flags |= AER_MULTI_ERROR_VALID_FLAG;
+		if (!(s_device = find_source_device(p_device->port, id))) {
+			printk(KERN_DEBUG "%s->can't find device of ID%04x\n",
+				__FUNCTION__, id);
+			continue;
+		}
+		if (get_device_error_info(to_pci_dev(s_device), &e_info) ==
+				AER_SUCCESS) {
+			aer_print_error(to_pci_dev(s_device), &e_info);
+			handle_error_source(p_device,
+				to_pci_dev(s_device),
+				e_info);
+		}
+	}
+	up(&rpc->rpc_sema);
+}
+
+/**
+ * aer_add_rootport - add a new root port into Root Complex's port hierarchy
+ * @rpc: pointer to a new root port device being added
+ *
+ * Invoked when AER service loaded on a new Root Port
+ **/
+void aer_add_rootport(struct aer_rpc *rpc)
+{
+	/* Add new Root Port into RC List */
+	list_add_tail(&rpc->node, &rc_list);
+
+	/* Enable root port AER itself */
+	enable_root_aer(rpc);
+}
+
+/**
+ * aer_delete_rootport - delete a root port from Root Complex's port hierarchy
+ * @rpc: pointer to a root port device being deleted
+ *
+ * Invoked when AER service unloaded on a specific Root Port
+ **/
+void aer_delete_rootport(struct aer_rpc *rpc)
+{
+	/* Disable root port AER itself */
+	disable_root_aer(rpc);
+	
+	/* Free all source nodes under this root port */
+	list_del(&rpc->node);
+	kfree(rpc);
+}
+
+/**
+ * aer_init - provide AER initialization
+ * @dev: pointer to AER pcie device
+ *
+ * Invoked when AER service driver is loaded.
+ **/
+int aer_init(struct pcie_device *dev)
+{
+	int status;
+
+	/* Run _OSC Method */
+	status = aer_osc_setup(dev->port);
+
+	if(status != OSC_METHOD_RUN_SUCCESS) {
+		printk(KERN_DEBUG "%s: AER service init fails - %s\n",
+		__FUNCTION__,
+		(status == OSC_METHOD_NOT_SUPPORTED) ?
+			"No ACPI _OSC support" : "Run ACPI _OSC fails");
+
+		if (!forceload)
+			return status;
+	}
+
+	return AER_SUCCESS;
+}
+
+EXPORT_SYMBOL(pci_find_aer_capability);
+EXPORT_SYMBOL(pci_disable_pcie_error_reporting);
+EXPORT_SYMBOL(pci_cleanup_aer_uncorrect_error_status);
+
--- linux-2.6.17/include/linux/aer.h	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/include/linux/aer.h	2006-07-12 14:37:17.000000000 +0800
@@ -0,0 +1,43 @@
+/*
+ * Copyright (C) 2006 Intel
+ *     Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *     Zhang Yanmin (yanmin.zhang@intel.com)
+ */
+
+#ifndef _AER_H_
+#define _AER_H_
+
+#if defined(CONFIG_PCIEAER) || defined(CONFIG_PCIEAER_MODULE)
+/* pci-e port driver needs this function to enable aer */
+static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
+{
+	u16 reg16 = 0;
+	int pos;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 |
+		PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE;
+	pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+			reg16);
+	return 0;
+}
+
+extern int pci_find_aer_capability(struct pci_dev *dev);
+extern int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+extern int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+#else
+#define pci_enable_pcie_error_reporting(dev)		do { } while (0)
+#define pci_find_aer_capability(dev)			(0)
+#define pci_disable_pcie_error_reporting(dev)		do { } while (0)
+#define pci_cleanup_aer_uncorrect_error_status(dev)	do { } while (0)
+#endif
+
+#endif //_AER_H_
+
--- linux-2.6.17/include/linux/pcieport_if.h	2006-06-22 16:26:32.000000000 +0800
+++ linux-2.6.17_aer/include/linux/pcieport_if.h	2006-06-22 16:46:29.000000000 +0800
@@ -61,6 +61,12 @@ struct pcie_port_service_driver {
 	void (*remove) (struct pcie_device *dev);
 	int (*suspend) (struct pcie_device *dev, pm_message_t state);
 	int (*resume) (struct pcie_device *dev);
+	
+	/* Service Error Recovery Handler */
+	struct pci_error_handlers *err_handler;
+
+	/* Link Reset Capability - AER service driver specific */
+	pci_ers_result_t (*reset_link) (struct pci_dev *dev);
 
 	const struct pcie_port_service_id *id_table;
 	struct device_driver driver;
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv.h	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv.h	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,137 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#ifndef _AERDRV_H_
+#define _AERDRV_H_
+
+#include <linux/pcieport_if.h>
+#include <linux/aer.h>
+
+#define AER_NONFATAL			0
+#define AER_FATAL			1
+#define AER_CORRECTABLE			2
+#define AER_UNCORRECTABLE		4
+#define AER_ERROR_MASK			0x001fffff
+#define AER_ERROR(d)			(d & AER_ERROR_MASK)
+
+#define VERBOSE_LIMIT_DISPLAY		1
+#define VERBOSE_FULL_DISPLAY		2
+#define VERBOSE_RAW_DISPLAY		3
+#define VERBOSE_MASK			0x3
+
+#define OSC_METHOD_RUN_SUCCESS		0
+#define OSC_METHOD_NOT_SUPPORTED	1
+#define OSC_METHOD_RUN_FAILURE		2
+
+/* Root Error Status Register Bits */
+#define ROOT_ERR_STATUS_MASKS			0x0f
+
+#define SYSTEM_ERROR_INTR_ON_MESG_MASK	(PCI_EXP_RTCTL_SECEE|	\
+					PCI_EXP_RTCTL_SENFEE|	\
+					PCI_EXP_RTCTL_SEFEE)
+#define ROOT_PORT_INTR_ON_MESG_MASK	(PCI_ERR_ROOT_CMD_COR_EN|	\
+					PCI_ERR_ROOT_CMD_NONFATAL_EN|	\
+					PCI_ERR_ROOT_CMD_FATAL_EN)
+#define ERR_COR_ID(d)			(d & 0xffff)
+#define ERR_UNCOR_ID(d)			(d >> 16)
+
+#define AER_SUCCESS			0
+#define AER_UNSUCCESS			1
+#define AER_ERROR_SOURCES_MAX		100
+
+#define AER_LOG_TLP_MASKS		(PCI_ERR_UNC_POISON_TLP|	\
+					PCI_ERR_UNC_ECRC|		\
+					PCI_ERR_UNC_UNSUP|		\
+					PCI_ERR_UNC_COMP_ABORT|		\
+					PCI_ERR_UNC_UNX_COMP|		\
+					PCI_ERR_UNC_MALF_TLP)
+
+/* AER Error Info Flags */
+#define AER_TLP_HEADER_VALID_FLAG	0x00000001
+#define AER_MULTI_ERROR_VALID_FLAG	0x00000002
+
+#define ERR_CORRECTABLE_ERROR_MASK	0x000031c1
+#define ERR_UNCORRECTABLE_ERROR_MASK	0x001ff010
+
+struct header_log_regs {
+	unsigned int dw0;
+	unsigned int dw1;
+	unsigned int dw2;
+	unsigned int dw3;
+};
+
+struct aer_err_info {
+	int severity;			/* 0:NONFATAL | 1:FATAL | 2:COR */
+	int flags;			
+	unsigned int status;		/* COR/UNCOR Error Status */
+	struct header_log_regs tlp; 	/* TLP Header */
+};
+
+struct aer_err_source {
+	unsigned int status;
+	unsigned int id;
+};
+
+struct aer_rpc {
+ 	struct list_head node;
+ 	struct list_head children;	/* AER children of this root port */
+	struct pcie_device *rpd;	/* Root Port device */
+	struct work_struct dpc_handler;
+	struct aer_err_source e_sources[AER_ERROR_SOURCES_MAX];
+	unsigned short prod_idx;	/* Error Producer Index */
+	unsigned short cons_idx;	/* Error Consumer Index */
+	int isr;
+	spinlock_t e_lock;		/* 
+					 * Lock access to Error Status/ID Regs
+					 * and error producer/consumer index
+					 */
+ 
+	struct semaphore rpc_sema;	/* 
+					 * Semaphore access required to
+					 * access, add, remove, or print AER
+				 	 * aware devices in this RPC hierarchy
+					 */
+};
+
+struct aer_broadcast_data {
+	enum pci_channel_state state;
+	enum pci_ers_result result;
+};
+
+static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
+		enum pci_ers_result new)
+{
+	switch (orig) {
+	case PCI_ERS_RESULT_CAN_RECOVER:
+	case PCI_ERS_RESULT_RECOVERED:
+		orig = new;
+		break;
+	case PCI_ERS_RESULT_DISCONNECT:
+		if (new == PCI_ERS_RESULT_NEED_RESET)
+			orig = new;
+		break;
+	default:
+		break;
+	}
+
+	return orig;
+}
+
+extern struct bus_type pcie_port_bus_type;
+extern void aer_add_rootport(struct aer_rpc *rpc);
+extern void aer_delete_rootport(struct aer_rpc *rpc);
+extern int aer_init(struct pcie_device *dev);
+extern void aer_isr(void *context);
+extern void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
+
+#ifdef CONFIG_ACPI
+extern int aer_osc_setup(struct pci_dev *dev);
+#else
+#define  aer_osc_setup(dev)		(OSC_METHOD_NOT_SUPPORTED)
+#endif
+
+#endif //_AERDRV_H_
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv.c	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,342 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/pcieport_if.h>
+
+#include "aerdrv.h"
+
+/*
+ * Version Information
+ */
+#define DRIVER_VERSION "v1.0"
+#define DRIVER_AUTHOR "tom.l.nguyen@intel.com"
+#define DRIVER_DESC "Root Port Advanced Error Reporting Driver"
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
+MODULE_LICENSE("GPL");
+
+static int __devinit aer_probe (struct pcie_device *dev,
+	const struct pcie_port_service_id *id );
+static void aer_remove(struct pcie_device *dev);
+static int aer_suspend(struct pcie_device *dev, pm_message_t state)
+{return 0;}
+static int aer_resume(struct pcie_device *dev) {return 0;}
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+	enum pci_channel_state error);
+static void aer_error_resume(struct pci_dev *dev);
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
+
+/*
+ * PCI Express bus's AER Root service driver data structure
+ */
+static struct pcie_port_service_id aer_id[] = {
+	{
+	.vendor 	= PCI_ANY_ID, 
+	.device 	= PCI_ANY_ID,
+	.port_type 	= PCIE_RC_PORT, 
+	.service_type 	= PCIE_PORT_SERVICE_AER,
+	},
+	{ /* end: all zeroes */ }
+};
+
+static struct pci_error_handlers aer_error_handlers = {
+	.error_detected = aer_error_detected,
+	.resume = aer_error_resume,
+};
+
+static struct pcie_port_service_driver aerdrv = {
+	.name		= "aer",
+	.id_table	= &aer_id[0],
+
+	.probe		= aer_probe,
+	.remove		= aer_remove,
+
+	.suspend	= aer_suspend,
+	.resume		= aer_resume,
+
+	.err_handler	= &aer_error_handlers,
+
+	.reset_link	= aer_root_reset,
+};
+
+/**
+ * aer_irq - Root Port's ISR
+ * @irq: IRQ assigned to Root Port
+ * @context: pointer to Root Port data structure
+ * @r: pointer struct pt_regs
+ *
+ * Invoked when Root Port detects AER messages.
+ **/
+static irqreturn_t aer_irq(int irq, void *context, struct pt_regs * r)
+{
+	unsigned int status, id;
+	struct pcie_device *pdev = (struct pcie_device *)context;
+	struct aer_rpc *rpc = get_service_data(pdev);
+	int next_prod_idx;
+	unsigned long flags;
+	int pos;
+
+	pos = pci_find_aer_capability(pdev->port);
+	/* 
+	 * Must lock access to Root Error Status Reg, Root Error ID Reg, 
+	 * and Root error producer/consumer index 
+	 */
+	spin_lock_irqsave(&rpc->e_lock, flags);
+
+	/* Read error status */
+	pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, &status);
+	if (!(status & ROOT_ERR_STATUS_MASKS)) {
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return IRQ_NONE;
+	}
+
+	/* Read error source and clear error status */
+	pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_COR_SRC, &id);
+	pci_write_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, status);
+
+	/* Store error source for later DPC handler */
+	next_prod_idx = rpc->prod_idx + 1;
+	if (next_prod_idx == AER_ERROR_SOURCES_MAX)
+		next_prod_idx = 0;
+	if (next_prod_idx == rpc->cons_idx) {
+		/* 
+		 * Error Storm Condition - possibly the same error occurred.
+		 * Drop the error.
+		 */
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return IRQ_HANDLED;
+	}
+	rpc->e_sources[rpc->prod_idx].status =  status;
+	rpc->e_sources[rpc->prod_idx].id = id;
+	rpc->prod_idx = next_prod_idx;
+	spin_unlock_irqrestore(&rpc->e_lock, flags);
+
+	/*  Invoke DPC handler */
+	schedule_work(&rpc->dpc_handler);
+
+	return IRQ_HANDLED;
+}
+
+/**
+ * aer_alloc_rpc - allocate Root Port data structure
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when Root Port's AER service is loaded.
+ **/
+static struct aer_rpc* aer_alloc_rpc(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc;
+
+	if (!(rpc = (struct aer_rpc *)kmalloc(sizeof(struct aer_rpc), 
+		GFP_KERNEL)))
+		return NULL;
+
+	memset(rpc, 0, sizeof(struct aer_rpc));
+	/* 
+	 * Initialize Root lock access, e_lock, to Root Error Status Reg, 
+	 * Root Error ID Reg, and Root error producer/consumer index. 
+	 */
+	rpc->e_lock = SPIN_LOCK_UNLOCKED;
+
+	/* 
+	 * Initialize semaphore access required to access, add, remove,
+	 * or print AER aware devices in this RPC hierarchy 
+	 */
+	sema_init(&rpc->rpc_sema, 1);
+
+	INIT_LIST_HEAD(&rpc->node);
+	INIT_LIST_HEAD(&rpc->children);
+	rpc->rpd = dev;
+	INIT_WORK(&rpc->dpc_handler, aer_isr, (void *)dev);
+	rpc->prod_idx = rpc->cons_idx = 0;
+
+	/* Use PCIE bus function to store rpc into PCIE device */
+	set_service_data(dev, rpc);
+
+	return rpc;
+}
+
+/**
+ * aer_remove - clean up resources
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when PCI Express bus unloads or AER probe fails.
+ **/
+static void aer_remove(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc = get_service_data(dev);
+
+	if (rpc) {
+		/* If the interrupt service was registered, it must be freed. */
+		if (rpc->isr)
+			free_irq(dev->irq, dev);
+
+		/* Delete this node from a RC hierarchy */
+		aer_delete_rootport(rpc);
+		set_service_data(dev, NULL);
+	}
+}
+
+/**
+ * aer_probe - initialize resources
+ * @dev: pointer to the pcie_dev data structure
+ * @id: pointer to the service id data structure
+ *
+ * Invoked when PCI Express bus loads AER service driver.
+ **/
+static int __devinit aer_probe (struct pcie_device *dev, 
+				const struct pcie_port_service_id *id )
+{
+	int status;
+	struct aer_rpc *rpc;
+	struct device *device = &dev->device;
+
+	/* Init */
+	if ((status = aer_init(dev)))
+		return status;
+
+	/* Alloc rpc data structure */
+	if (!(rpc = aer_alloc_rpc(dev))) {
+		printk(KERN_DEBUG "%s: Alloc rpc fails on PCIE device[%s]\n",
+			__FUNCTION__, device->bus_id);
+		aer_remove(dev);
+		return -ENOMEM;
+	}
+
+	/* Request IRQ ISR */
+	if ((status = request_irq(dev->irq, aer_irq, SA_SHIRQ, "aerdrv", 
+				dev))) {
+		printk(KERN_DEBUG "%s: Request ISR fails on PCIE device[%s]\n", 
+			__FUNCTION__, device->bus_id);
+		aer_remove(dev);
+		return status;
+	}
+
+	rpc->isr = 1;
+
+	/* Add rpc into a RC hierarchy */
+	aer_add_rootport(rpc);
+
+	return status;
+}
+
+/**
+ * aer_root_reset - reset link on Root Port
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver when performing link reset at Root Port.
+ **/
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
+{
+	u16 p2p_ctrl;
+	u32 status;
+	int pos;
+
+	pos = pci_find_aer_capability(dev);
+
+	/* Disable Root's interrupt in response to error messages */ 
+	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+	/* Assert Secondary Bus Reset */
+	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &p2p_ctrl);
+	p2p_ctrl |= PCI_CB_BRIDGE_CTL_CB_RESET;
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+	/* De-assert Secondary Bus Reset */
+	p2p_ctrl &= ~PCI_CB_BRIDGE_CTL_CB_RESET;
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+	/* 
+	 * System software must wait for at least 100ms from the end 
+	 * of a reset of one or more devices before it is permitted
+	 * to issue Configuration Requests to those devices.
+	 */
+	msleep(200);
+	printk(KERN_DEBUG "Complete link reset at Root[%s]\n", dev->dev.bus_id);
+
+	/* Enable Root Port's interrupt in response to error messages */ 
+	pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
+	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, status);
+	pci_write_config_dword(dev,
+		pos + PCI_ERR_ROOT_COMMAND,
+		ROOT_PORT_INTR_ON_MESG_MASK);
+
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
+/**
+ * aer_error_detected - update severity status
+ * @dev: pointer to Root Port's pci_dev data structure
+ * @error: error severity being notified by port bus
+ *
+ * Invoked by Port Bus driver during error recovery.
+ **/
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+			enum pci_channel_state error)
+{
+	/* Root Port has no impact. Always recovers. */
+	return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+/**
+ * aer_error_resume - clean up corresponding error status bits
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver during nonfatal recovery.
+ **/
+static void aer_error_resume(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, mask;
+	u16 reg16;
+
+	/* Clean up Root device status */
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	pci_read_config_word(dev, pos + PCI_EXP_DEVSTA, &reg16);
+	pci_write_config_word(dev, pos + PCI_EXP_DEVSTA, reg16);
+
+	/* Clean AER Root Error Status */
+	pos = pci_find_aer_capability(dev);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+	if (dev->error_state == pci_channel_io_normal)
+		status &= ~mask; /* Clear corresponding nonfatal bits */
+	else
+		status &= mask; /* Clear corresponding fatal bits */
+	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+}
+
+/**
+ * aer_service_init - register AER root service driver
+ *
+ * Invoked when AER root service driver is loaded.
+ **/
+static int __init aer_service_init(void)
+{
+	return pcie_port_service_register(&aerdrv);
+}
+
+/**
+ * aer_service_exit - unregister AER root service driver
+ *
+ * Invoked when AER root service driver is unloaded.
+ **/
+static void __exit aer_service_exit(void) 
+{
+	pcie_port_service_unregister(&aerdrv);
+}
+
+module_init(aer_service_init);
+module_exit(aer_service_exit);
--- linux-2.6.17/drivers/pci/pcie/aer/Kconfig	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/Kconfig	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,12 @@
+#
+# Root Port Device AER Configuration
+#
+
+config PCIEAER
+	tristate "Root Port Advanced Error Reporting support"
+	depends on PCIEPORTBUS 
+	default y
+	help
+	  This enables Root Port Advanced Error Reporting (AER) driver
+	  support. Error reporting messages sent to Root Port will be
+	  handled by PCI Express AER driver.
--- linux-2.6.17/drivers/pci/pcie/aer/Makefile	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/Makefile	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,10 @@
+#
+# Makefile for PCI-Express Root Port Advanced Error Reporting Driver
+#
+
+obj-$(CONFIG_PCIEAER)		+= aerdriver.o
+aerdrv_acpi-$(CONFIG_ACPI)	+= aerdrv_acpi.o
+
+aerdriver-objs		:= aerdrv_errprint.o aerdrv_core.o aerdrv.o
+aerdriver-objs		+= $(aerdrv_acpi-y)
+
--- linux-2.6.17/drivers/pci/pcie/Kconfig	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/Kconfig	2006-06-22 16:46:29.000000000 +0800
@@ -34,3 +34,4 @@ config HOTPLUG_PCI_PCIE_POLL_EVENT_MODE
 	   
 	  When in doubt, say N.
 
+source "drivers/pci/pcie/aer/Kconfig"
--- linux-2.6.17/drivers/pci/pcie/Makefile	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/Makefile	2006-06-22 16:46:29.000000000 +0800
@@ -5,3 +5,6 @@
 pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o
 
 obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
+
+# Build PCI Express AER if needed
+obj-$(CONFIG_PCIEAER)		+= aer/
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_errprint.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_errprint.c	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,216 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+
+#include "aerdrv.h"
+
+#define AER_AGENT_RECEIVER		0
+#define AER_AGENT_REQUESTER		1
+#define AER_AGENT_COMPLETER		2
+#define AER_AGENT_TRANSMITTER		3		
+
+#define AER_AGENT_REQUESTER_MASK	(PCI_ERR_UNC_COMP_TIME|	\
+					PCI_ERR_UNC_UNSUP)
+
+#define AER_AGENT_COMPLETER_MASK	PCI_ERR_UNC_COMP_ABORT
+
+#define AER_AGENT_TRANSMITTER_MASK(t, e) (e & (PCI_ERR_COR_REP_ROLL| \
+	((t == AER_CORRECTABLE) ? PCI_ERR_COR_REP_TIMER: 0))) 
+
+#define AER_GET_AGENT(t, e)						\
+	((e & AER_AGENT_COMPLETER_MASK) ? AER_AGENT_COMPLETER :		\
+	(e & AER_AGENT_REQUESTER_MASK) ? AER_AGENT_REQUESTER :		\
+	(AER_AGENT_TRANSMITTER_MASK(t, e)) ? AER_AGENT_TRANSMITTER :	\
+	AER_AGENT_RECEIVER)
+
+#define AER_PHYSICAL_LAYER_ERROR_MASK	PCI_ERR_COR_RCVR
+#define AER_DATA_LINK_LAYER_ERROR_MASK(t, e)	\
+		(PCI_ERR_UNC_DLP|		\
+		PCI_ERR_COR_BAD_TLP| 		\
+		PCI_ERR_COR_BAD_DLLP|		\
+		PCI_ERR_COR_REP_ROLL| 		\
+		((t == AER_CORRECTABLE) ?	\
+		PCI_ERR_COR_REP_TIMER: 0))
+
+#define AER_PHYSICAL_LAYER_ERROR	0
+#define AER_DATA_LINK_LAYER_ERROR	1
+#define AER_TRANSACTION_LAYER_ERROR	2
+
+#define AER_GET_LAYER_ERROR(t, e)				\
+	((e & AER_PHYSICAL_LAYER_ERROR_MASK) ?			\
+	AER_PHYSICAL_LAYER_ERROR :				\
+	(e & AER_DATA_LINK_LAYER_ERROR_MASK(t, e)) ?		\
+		AER_DATA_LINK_LAYER_ERROR : 			\
+		AER_TRANSACTION_LAYER_ERROR)
+
+/* 
+ * AER error strings 
+ */
+static char* aer_error_severity_string[] = {
+	"Uncorrected (Non-Fatal)", 
+	"Uncorrected (Fatal)",
+	"Corrected"
+};
+
+static char* aer_error_layer[] = {
+	"Physical Layer",
+	"Data Link Layer",
+	"Transaction Layer" 
+};
+static char* aer_correctable_error_string[] = {
+	"Receiver Error        ",	/* Bit Position 0 	*/
+	"Unknown Error Bit 1   ", 	/* Bit Position 1	*/
+	"Unknown Error Bit 2   ",	/* Bit Position 2	*/
+	"Unknown Error Bit 3   ", 	/* Bit Position 3	*/
+	"Unknown Error Bit 4   ", 	/* Bit Position 4 	*/
+	"Unknown Error Bit 5   ",	/* Bit Position 5	*/
+	"Bad TLP               ",	/* Bit Position 6 	*/
+	"Bad DLLP              ",	/* Bit Position 7 	*/
+	"REPLAY_NUM Rollover   ",	/* Bit Position 8 	*/
+	"Unknown Error Bit 9   ", 	/* Bit Position 9	*/
+	"Unknown Error Bit 10  ",	/* Bit Position 10	*/
+	"Unknown Error Bit 11  ", 	/* Bit Position 11	*/
+	"Replay Timer Timeout  ",	/* Bit Position 12 	*/
+	"Advisory Non-Fatal    ", 	/* Bit Position 13	*/
+	"Unknown Error Bit 14  ",	/* Bit Position 14	*/
+	"Unknown Error Bit 15  ", 	/* Bit Position 15	*/
+	"Unknown Error Bit 16  ", 	/* Bit Position 16 	*/
+	"Unknown Error Bit 17  ",	/* Bit Position 17	*/
+	"Unknown Error Bit 18  ", 	/* Bit Position 18	*/
+	"Unknown Error Bit 19  ",	/* Bit Position 19	*/
+	"Unknown Error Bit 20  ", 	/* Bit Position 20	*/
+	"Unknown Error Bit 21  ", 	/* Bit Position 21 	*/
+	"Unknown Error Bit 22  ",	/* Bit Position 22	*/
+	"Unknown Error Bit 23  ", 	/* Bit Position 23	*/
+	"Unknown Error Bit 24  ",	/* Bit Position 24	*/
+	"Unknown Error Bit 25  ", 	/* Bit Position 25	*/
+	"Unknown Error Bit 26  ", 	/* Bit Position 26 	*/
+	"Unknown Error Bit 27  ",	/* Bit Position 27	*/
+	"Unknown Error Bit 28  ",	/* Bit Position 28	*/
+	"Unknown Error Bit 29  ", 	/* Bit Position 29	*/
+	"Unknown Error Bit 30  ", 	/* Bit Position 30 	*/
+	"Unknown Error Bit 31  "	/* Bit Position 31	*/
+};
+
+static char* aer_uncorrectable_error_string[] = {
+	"Unknown Error Bit 0   ", 	/* Bit Position 0	*/
+	"Unknown Error Bit 1   ", 	/* Bit Position 1	*/
+	"Unknown Error Bit 2   ",	/* Bit Position 2	*/
+	"Unknown Error Bit 3   ", 	/* Bit Position 3	*/
+	"Data Link Protocol    ",	/* Bit Position 4	*/
+	"Unknown Error Bit 5   ", 	/* Bit Position 5	*/
+	"Unknown Error Bit 6   ", 	/* Bit Position 6	*/
+	"Unknown Error Bit 7   ",	/* Bit Position 7	*/
+	"Unknown Error Bit 8   ", 	/* Bit Position 8	*/
+	"Unknown Error Bit 9   ", 	/* Bit Position 9	*/
+	"Unknown Error Bit 10  ",	/* Bit Position 10	*/
+	"Unknown Error Bit 11  ", 	/* Bit Position 11	*/
+	"Poisoned TLP          ",	/* Bit Position 12 	*/
+	"Flow Control Protocol ",	/* Bit Position 13	*/
+	"Completion Timeout    ",	/* Bit Position 14 	*/
+	"Completer Abort       ",	/* Bit Position 15 	*/
+	"Unexpected Completion ",	/* Bit Position 16	*/
+	"Receiver Overflow     ",	/* Bit Position 17	*/
+	"Malformed TLP         ",	/* Bit Position 18	*/
+	"ECRC                  ",	/* Bit Position 19	*/
+	"Unsupported Request   ",	/* Bit Position 20	*/
+	"Unknown Error Bit 21  ", 	/* Bit Position 21 	*/
+	"Unknown Error Bit 22  ",	/* Bit Position 22	*/
+	"Unknown Error Bit 23  ", 	/* Bit Position 23	*/
+	"Unknown Error Bit 24  ",	/* Bit Position 24	*/
+	"Unknown Error Bit 25  ", 	/* Bit Position 25	*/
+	"Unknown Error Bit 26  ", 	/* Bit Position 26 	*/
+	"Unknown Error Bit 27  ",	/* Bit Position 27	*/
+	"Unknown Error Bit 28  ",	/* Bit Position 28	*/
+	"Unknown Error Bit 29  ", 	/* Bit Position 29	*/
+	"Unknown Error Bit 30  ", 	/* Bit Position 30 	*/
+	"Unknown Error Bit 31  "	/* Bit Position 31	*/
+};
+
+static char* aer_agent_string[] = {
+	"Receiver ID", 
+	"Requester ID", 
+	"Completer ID", 
+	"Transmitter ID" 
+};
+
+static char* aer_get_error_source_name(int severity, unsigned int status)
+{
+        int i;
+
+        for (i = 0; i < 32; i++) {
+                if (!(status & (1 << i)))
+                        continue;
+
+                if (severity == AER_CORRECTABLE)
+                        return aer_correctable_error_string[i];
+                else
+                        return aer_uncorrectable_error_string[i];
+        }
+
+        return NULL;
+}
+
+void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
+{
+	char * errmsg;
+	int err_layer, agent;
+
+	printk(KERN_ERR "+------ PCI-Express Device Error ------+\n");
+	printk(KERN_ERR "Error Severity\t\t: %s\n",
+		aer_error_severity_string[info->severity]);
+
+	if ( info->status == 0) {
+                printk(KERN_ERR "PCIE Bus Error type\t: (Unaccessible)\n");
+                printk(KERN_ERR "Unaccessible Received\t: %s\n",
+			info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+				"Multiple" : "First");
+                printk(KERN_ERR "Unregistered Agent ID\t: %04x\n",
+			(dev->bus->number << 8) | dev->devfn);
+	} else {
+		err_layer = AER_GET_LAYER_ERROR(info->severity, info->status);
+		printk(KERN_ERR "PCIE Bus Error type\t: %s\n",
+			aer_error_layer[err_layer]);
+
+		errmsg = aer_get_error_source_name(info->severity, info->status);
+		printk(KERN_ERR "%s\t: %s\n", errmsg,
+			info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+				"Multiple" : "First");
+
+		agent = AER_GET_AGENT(info->severity, info->status);
+		printk(KERN_ERR "%s\t\t: %04x\n",
+			aer_agent_string[agent],
+			(dev->bus->number << 8) | dev->devfn);
+
+		printk(KERN_ERR "VendorID=%04xh, DeviceID=%04xh,"
+			" Bus=%02xh, Device=%02xh, Function=%02xh\n",
+			dev->vendor,
+			dev->device,
+			dev->bus->number,
+			PCI_SLOT(dev->devfn),
+			PCI_FUNC(dev->devfn));
+
+		if (info->flags & AER_TLP_HEADER_VALID_FLAG) {
+			unsigned char *tlp = (unsigned char *) &info->tlp;
+			printk(KERN_ERR "TLP Header:\n");
+			printk(KERN_ERR "%02x%02x%02x%02x %02x%02x%02x%02x"
+				" %02x%02x%02x%02x %02x%02x%02x%02x\n",
+				*(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
+				*(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
+				*(tlp + 11), *(tlp + 10), *(tlp + 9),
+				*(tlp + 8), *(tlp + 15), *(tlp + 14),
+				*(tlp + 13), *(tlp + 12));
+		}
+	}
+}
+

* Re: [PATCH 5/5] PCI-Express AER implemetation: pcie_portdrv error handler
  2006-07-12  7:32     ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
@ 2006-07-12  7:38       ` Zhang, Yanmin
  2006-07-12  8:06       ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
  1 sibling, 0 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-12  7:38 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 5 implements error handlers for pcie_portdrv.
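
As a rough illustration only (not part of the patch), the forwarding pattern
used by these handlers can be sketched in plain user-space C: each callback
is forwarded to every service driver that implements it, and the individual
votes are folded into one result. The names enum ers_vote, struct service,
merge_vote and forward_error_detected are hypothetical stand-ins, and the
"most severe vote wins" rule is an assumption; the merge_result helper used
by the patch may apply a different rule.

```c
#include <stddef.h>

/* Hypothetical stand-in for pci_ers_result_t, ordered by severity. */
enum ers_vote {
	VOTE_RECOVERED,
	VOTE_CAN_RECOVER,
	VOTE_NEED_RESET,
	VOTE_DISCONNECT
};

typedef enum ers_vote (*service_cb)(void);

/* Hypothetical stand-in for a port service driver with one callback. */
struct service {
	service_cb error_detected;	/* may be NULL: not AER-aware */
};

/* Assumed merge rule: the most severe vote wins. */
static enum ers_vote merge_vote(enum ers_vote acc, enum ers_vote vote)
{
	return vote > acc ? vote : acc;
}

/* Forward the callback to every service that implements it. */
static enum ers_vote forward_error_detected(const struct service *svc,
					    size_t n, enum ers_vote initial)
{
	enum ers_vote result = initial;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!svc[i].error_detected)
			continue;	/* skip services without a handler */
		result = merge_vote(result, svc[i].error_detected());
	}
	return result;
}

/* Sample callbacks used only to exercise the sketch. */
static enum ers_vote vote_can_recover(void) { return VOTE_CAN_RECOVER; }
static enum ers_vote vote_need_reset(void) { return VOTE_NEED_RESET; }
```

With one service voting "can recover", one without a handler and one voting
"need reset", the merged result under this assumed rule is "need reset".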

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:27:35.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:46:29.000000000 +0800
@@ -14,8 +14,10 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/pcieport_if.h>
+#include <linux/aer.h>
 
 #include "portdrv.h"
+#include "aer/aerdrv.h"
 
 /*
  * Version Information
@@ -76,6 +78,8 @@ static int __devinit pcie_portdrv_probe 
 	if (pcie_port_device_register(dev)) 
 		return -ENOMEM;
 
+	pci_enable_pcie_error_reporting(dev);
+
 	return 0;
 }
 
@@ -102,6 +106,146 @@ static int pcie_portdrv_resume (struct p
 }
 #endif
 
+static int error_detected_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	struct aer_broadcast_data *result_data;
+	pci_ers_result_t status;
+
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (!driver ||
+			!driver->err_handler ||
+			!driver->err_handler->error_detected)
+			return 0;
+
+		pcie_device = to_pcie_device(device);
+
+		/* Forward error detected message to service drivers */
+		status = driver->err_handler->error_detected(
+			pcie_device->port,
+			result_data->state);
+		result_data->result =
+			merge_result(result_data->result, status);
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
+					enum pci_channel_state error)
+{
+	struct aer_broadcast_data result_data =
+			{error, PCI_ERS_RESULT_CAN_RECOVER};
+	
+	device_for_each_child(&dev->dev, &result_data, error_detected_iter);
+
+	/* If fatal, save cfg space for possible link reset at upstream */
+	if (error == pci_channel_io_frozen)
+		pcie_portdrv_save_config(dev);
+
+	return result_data.result;
+}
+
+static int mmio_enabled_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	pci_ers_result_t status, *result;
+
+	result = (pci_ers_result_t *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->mmio_enabled) {
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			status = driver->err_handler->mmio_enabled(
+					pcie_device->port);
+			*result = merge_result(*result, status);
+		}
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_mmio_enabled(struct pci_dev *dev)
+{
+	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
+
+	device_for_each_child(&dev->dev, &status, mmio_enabled_iter);
+	return status;
+}
+
+static int slot_reset_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	pci_ers_result_t status, *result;
+
+	result = (pci_ers_result_t *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->slot_reset) {
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			status = driver->err_handler->slot_reset(
+					pcie_device->port);
+			*result = merge_result(*result, status);
+		}
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
+{
+	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
+
+	/* If fatal, restore cfg space for possible link reset at upstream */
+	if (dev->error_state == pci_channel_io_frozen)
+		pcie_portdrv_restore_config(dev);
+
+	device_for_each_child(&dev->dev, &status, slot_reset_iter);
+
+	return status;
+}
+
+static int resume_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->resume) { 
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			driver->err_handler->resume(pcie_device->port);
+		}
+	}
+
+	return 0;
+}
+
+static void pcie_portdrv_err_resume(struct pci_dev *dev)
+{
+	device_for_each_child(&dev->dev, NULL, resume_iter);
+}
+
 /*
  * LINUX Device Driver Model
  */
@@ -112,6 +256,13 @@ static const struct pci_device_id port_p
 };
 MODULE_DEVICE_TABLE(pci, port_pci_ids);
 
+static struct pci_error_handlers pcie_portdrv_err_handler = {
+		.error_detected = pcie_portdrv_error_detected,
+		.mmio_enabled = pcie_portdrv_mmio_enabled,
+		.slot_reset = pcie_portdrv_slot_reset,
+		.resume = pcie_portdrv_err_resume,
+};
+
 static struct pci_driver pcie_portdrv = {
 	.name		= (char *)device_name,
 	.id_table	= &port_pci_ids[0],
@@ -123,6 +274,8 @@ static struct pci_driver pcie_portdrv = 
 	.suspend	= pcie_portdrv_suspend,
 	.resume		= pcie_portdrv_resume,
 #endif	/* PM */
+
+	.err_handler 	= &pcie_portdrv_err_handler,
 };
 
 static int __init pcie_portdrv_init(void)

* Re: [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type
  2006-07-12  7:22   ` [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type Zhang, Yanmin
  2006-07-12  7:32     ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
@ 2006-07-12  8:00     ` Zhang, Yanmin
  1 sibling, 0 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-12  8:00 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

Following Arjan's comments, I changed EXPORT_SYMBOL to EXPORT_SYMBOL_GPL.

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 3 exports pcie_port_bus_type. The AER driver can be built as a
module, in which case it needs access to pcie_port_bus_type.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/portdrv_bus.c	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/portdrv_bus.c	2006-07-12 15:39:14.000000000 +0800
@@ -76,3 +76,6 @@ static int pcie_port_bus_resume(struct d
 		driver->resume(pciedev);
 	return 0;
 }
+
+EXPORT_SYMBOL_GPL(pcie_port_bus_type);
+

* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12  7:32     ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
  2006-07-12  7:38       ` [PATCH 5/5] PCI-Express AER implemetation: pcie_portdrv error handler Zhang, Yanmin
@ 2006-07-12  8:06       ` Zhang, Yanmin
  2006-07-12 13:16         ` Arjan van de Ven
  2006-07-12 16:26         ` Andi Kleen
  1 sibling, 2 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-12  8:06 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

Following Arjan's comments, I changed EXPORT_SYMBOL to EXPORT_SYMBOL_GPL.
Sorry for flooding your mailbox again. :)

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 4 implements the core part of PCI-Express AER and the aerdrv
port service driver.

When a root port service device is probed, aerdrv calls request_irq to
register an irq handler for the AER error interrupt.

When a device sends a PCI-Express error message to the root port, the
root port triggers an interrupt, through either MSI or the IO-APIC, and
the kernel runs the irq handler. The handler reads the root error status
register and schedules a work item. The work item calls the core part to
process the error according to its type (correctable/non-fatal/fatal).
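
To make the hand-off between the irq handler and the work item concrete,
here is a minimal user-space sketch of the producer/consumer index scheme
that get_e_source in this patch consumes from. It is an illustration under
assumptions: SOURCES_MAX, struct err_ring, ring_put and ring_get are
hypothetical names, the producer side is my guess at what the irq handler
does, and the spinlock the driver takes around the indices is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define SOURCES_MAX 4	/* hypothetical; stands in for AER_ERROR_SOURCES_MAX */

struct err_source {
	uint32_t status;	/* snapshot of the root error status register */
	uint32_t id;		/* error source identification */
};

struct err_ring {
	struct err_source slots[SOURCES_MAX];
	unsigned int prod_idx;	/* next slot the irq handler fills */
	unsigned int cons_idx;	/* next slot the work item drains */
};

/* Producer side (irq handler in the driver): record one error source. */
static bool ring_put(struct err_ring *r, uint32_t status, uint32_t id)
{
	unsigned int next = (r->prod_idx + 1) % SOURCES_MAX;

	if (next == r->cons_idx)
		return false;	/* ring full: the new error is dropped */
	r->slots[r->prod_idx].status = status;
	r->slots[r->prod_idx].id = id;
	r->prod_idx = next;
	return true;
}

/* Consumer side (work item): fetch the oldest recorded error source. */
static bool ring_get(struct err_ring *r, struct err_source *out)
{
	if (r->prod_idx == r->cons_idx)
		return false;	/* ring empty */
	*out = r->slots[r->cons_idx];
	r->cons_idx = (r->cons_idx + 1) % SOURCES_MAX;
	return true;
}
```

One slot is always left unused so that equal indices unambiguously mean
"empty"; with SOURCES_MAX set to 4 the sketch holds at most three pending
errors.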

For correctable errors, the patch just clears the correctable error
status register of the device.

For non-fatal errors, the patch follows the generic PCI error handler
rules and calls the error callback functions of the endpoint's driver.
If the device is a bridge, the patch broadcasts the error to the
downstream devices.

For fatal errors, the patch resets the PCI-Express link and then follows
the generic PCI error handler rules to call the error callback functions
of the endpoint's driver. If the device is a bridge, the patch broadcasts
the error to the downstream devices.
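
The order of the recovery steps for uncorrectable errors can be sketched as
follows. This is a simplified user-space model of the sequence that
do_recovery in this patch walks through, not the driver code itself: the
enum names, struct recovery_ops, run_recovery and the demo_* callbacks are
all hypothetical, and each callback stands in for a whole broadcast to the
affected drivers. Correctable errors never enter this path.

```c
#include <stddef.h>
#include <string.h>

enum severity { SEV_NONFATAL, SEV_FATAL };
enum result { RES_CAN_RECOVER, RES_NEED_RESET, RES_DISCONNECT, RES_RECOVERED };

/* Hypothetical callbacks, one per recovery step. */
struct recovery_ops {
	enum result (*error_detected)(int frozen);
	enum result (*reset_link)(void);
	enum result (*mmio_enabled)(void);
	enum result (*slot_reset)(void);
	void (*resume)(void);
};

static enum result run_recovery(enum severity sev,
				const struct recovery_ops *ops)
{
	/* Fatal errors are reported with the channel frozen. */
	enum result status = ops->error_detected(sev == SEV_FATAL);

	/* Only fatal errors reset the link; give up if that fails. */
	if (sev == SEV_FATAL && ops->reset_link() != RES_RECOVERED)
		return RES_DISCONNECT;

	if (status == RES_CAN_RECOVER)
		status = ops->mmio_enabled();
	if (status == RES_NEED_RESET)
		status = ops->slot_reset();
	if (status == RES_RECOVERED)
		ops->resume();
	return status;
}

/* Demo callbacks that record the order in which steps run. */
static char trace[16];

static void mark(char c)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace)) {
		trace[n] = c;
		trace[n + 1] = '\0';
	}
}

static enum result demo_detected(int frozen)
{
	(void)frozen;
	mark('D');
	return RES_CAN_RECOVER;
}
static enum result demo_reset_link(void) { mark('L'); return RES_RECOVERED; }
static enum result demo_mmio(void) { mark('M'); return RES_RECOVERED; }
static enum result demo_slot_reset(void) { mark('S'); return RES_RECOVERED; }
static void demo_resume(void) { mark('R'); }

static const struct recovery_ops demo_ops = {
	demo_detected, demo_reset_link, demo_mmio, demo_slot_reset, demo_resume
};
```

With the demo callbacks, a non-fatal error runs error_detected,
mmio_enabled and resume, while a fatal error additionally runs reset_link
right after error_detected.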

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_acpi.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_acpi.c	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+/**
+ * aer_osc_setup - run ACPI _OSC method
+ *
+ * Return:
+ *	Zero on success, nonzero otherwise.
+ *
+ * Invoked when the PCIE bus loads the AER service driver. To avoid conflicts
+ * with BIOS AER support, the BIOS must yield AER control to the OS driver.
+ **/
+int aer_osc_setup(struct pci_dev *dev)
+{
+	int retval = OSC_METHOD_RUN_SUCCESS;
+	acpi_status status;
+	acpi_handle handle = DEVICE_ACPI_HANDLE(&dev->dev);
+	struct pci_dev *pdev = dev;
+	struct pci_bus *parent;
+
+	while (!handle) {
+		if (!pdev || !pdev->bus->parent)
+			break;
+		parent = pdev->bus->parent;
+		if (!parent->self)
+			/* Parent must be a host bridge */
+			handle = acpi_get_pci_rootbridge_handle(
+					pci_domain_nr(parent),
+					parent->number);
+		else
+			handle = DEVICE_ACPI_HANDLE(
+					&(parent->self->dev));
+		pdev = parent->self;
+	}
+
+	if (!handle)
+		return OSC_METHOD_NOT_SUPPORTED;
+
+	pci_osc_support_set(OSC_EXT_PCI_CONFIG_SUPPORT);
+	status = pci_osc_control_set(handle, OSC_PCI_EXPRESS_AER_CONTROL |
+		OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL);
+	if (ACPI_FAILURE(status)) {
+		if (status == AE_SUPPORT) 
+			retval = OSC_METHOD_NOT_SUPPORTED;
+	 	else
+			retval = OSC_METHOD_RUN_FAILURE;
+	}
+
+	return retval;
+}
+
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_core.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_core.c	2006-07-12 15:47:38.000000000 +0800
@@ -0,0 +1,737 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+static LIST_HEAD(rc_list);		/* Define Root Complex List */
+
+static int forceload;
+module_param(forceload, bool, 0);
+
+#define PCI_CFG_SPACE_SIZE	(0x100)
+int pci_find_aer_capability(struct pci_dev *dev)
+{
+	int pos;
+	u32 reg32 = 0;
+
+	/* Check if it's a pci-express device */
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return 0;
+
+	/* Check if it supports pci-express AER */
+	pos = PCI_CFG_SPACE_SIZE;
+	while (pos) {
+		if (pci_read_config_dword(dev, pos, &reg32))
+			return 0;
+
+		/* some broken boards return ~0 */
+		if (reg32 == 0xffffffff)
+			return 0;
+
+		if (PCI_EXT_CAP_ID(reg32) == PCI_EXT_CAP_ID_ERR)
+			break;
+
+		pos = reg32 >> 20;
+	}
+
+	return pos;
+}
+
+int pci_disable_pcie_error_reporting(struct pci_dev *dev)
+{
+	u16 reg16 = 0;
+	int pos;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 & ~(PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE);
+	pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+			reg16);
+	return 0;
+}
+
+int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, mask;
+
+	pos = pci_find_aer_capability(dev);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+	if (dev->error_state == pci_channel_io_normal)
+		status &= ~mask; /* Clear corresponding nonfatal bits */
+	else
+		status &= mask; /* Clear corresponding fatal bits */
+	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+
+	return 0;
+}
+
+static int find_device_iter(struct device *device, void *data)
+{
+	struct pci_dev *dev;
+	u16 id = *(unsigned long *)data;
+	u8 secondary, subordinate, d_bus = id >> 8;
+
+	if (device->bus == &pci_bus_type) {
+		dev = to_pci_dev(device);
+		if (id == ((dev->bus->number << 8) | dev->devfn)) {
+			/*
+			 * Device ID match
+			 */
+			*(unsigned long*)data = (unsigned long)device;
+			return 1;
+		}
+
+		/*
+		 * If device is a P2P bridge, is it upstream of the source?
+		 */
+		if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+			pci_read_config_byte(dev, PCI_SECONDARY_BUS,
+				&secondary);
+			pci_read_config_byte(dev, PCI_SUBORDINATE_BUS,
+				&subordinate);
+			if (d_bus >= secondary && d_bus <= subordinate) {
+				*(unsigned long*)data = (unsigned long)device;
+				return 1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * find_source_device - search through device hierarchy for source device
+ * @parent: pointer to Root Port pci_dev data structure
+ * @id: device ID of agent who sends an error message to this Root Port
+ *
+ * Invoked when error is detected at the Root Port.
+ **/
+static struct device* find_source_device(struct pci_dev *parent, u16 id)
+{
+	struct pci_dev *dev = parent;
+	struct device *device;
+	unsigned long device_addr;
+	int status;
+
+	/* Is Root Port an agent that sends error message? */
+	if (id == ((dev->bus->number << 8) | dev->devfn)) 
+		return &dev->dev;
+
+	do {
+		device_addr = id;
+		if ((status = device_for_each_child(&dev->dev,
+			&device_addr, find_device_iter))) {
+			device = (struct device*)device_addr;
+			dev = to_pci_dev(device);
+			if (id == ((dev->bus->number << 8) | dev->devfn))
+				return device;
+		}
+	} while (status);
+
+	return NULL;
+}
+
+static void report_error_detected(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	dev->error_state = result_data->state;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->error_detected) {
+		if (result_data->state == pci_channel_io_frozen &&
+			!(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
+			/*
+			 * In case of fatal recovery, if one of the
+			 * downstream devices has no driver, we might
+			 * be unable to recover because a later insmod
+			 * of a driver for this device is unaware of
+			 * its hw state.
+			 */
+			printk(KERN_DEBUG "Device ID[%s] has %s\n",
+					dev->dev.bus_id, (dev->driver) ?
+					"no AER-aware driver" : "no driver");
+		}
+		return;
+	}
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->error_detected(dev, result_data->state);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_mmio_enabled(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->mmio_enabled)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->mmio_enabled(dev);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_slot_reset(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->slot_reset)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->slot_reset(dev);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_resume(struct pci_dev *dev, void *data)
+{
+	struct pci_error_handlers *err_handler;
+
+	dev->error_state = pci_channel_io_normal;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->resume)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	err_handler->resume(dev);
+	return;
+}
+
+/**
+ * broadcast_error_message - handle message broadcast to downstream drivers
+ * @dev: pointer to the device in the hierarchy where the broadcast starts
+ * @state: error state
+ * @error_mesg: name of the message being broadcast, for logging
+ * @cb: callback to be invoked for each device
+ *
+ * Invoked during the error recovery process. The error state is broadcast
+ * to all downstream drivers in the hierarchy in question.
+ **/
+static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
+	enum pci_channel_state state,
+	char *error_mesg,
+	void (*cb)(struct pci_dev *, void *))
+{
+	struct aer_broadcast_data result_data;
+
+	printk(KERN_DEBUG "Broadcast %s message\n", error_mesg);
+	result_data.state = state;
+	if (cb == report_error_detected)
+		result_data.result = PCI_ERS_RESULT_CAN_RECOVER;
+	else
+		result_data.result = PCI_ERS_RESULT_RECOVERED;
+
+	if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+		/*
+		 * If the error is reported by a bridge, we think this error
+		 * is related to the downstream link of the bridge, so we
+		 * do error recovery on all subordinates of the bridge instead
+		 * of the bridge and clear the error status of the bridge.
+		 */
+		if (cb == report_error_detected)
+			dev->error_state = state;
+		pci_walk_bus(dev->subordinate, cb, &result_data);
+		if (cb == report_resume) {
+			pci_cleanup_aer_uncorrect_error_status(dev);
+			dev->error_state = pci_channel_io_normal;
+		}
+	}
+	else {
+		/*
+		 * If the error is reported by an end point, we think this
+		 * error is related to the upstream link of the end point.
+		 */
+		pci_walk_bus(dev->bus, cb, &result_data);
+	}
+
+	return result_data.result;
+}
+
+struct find_aer_service_data {
+	struct pcie_port_service_driver *aer_driver;
+	int is_downstream;
+};
+
+static int find_aer_service_iter(struct device *device, void *data)
+{
+	struct device_driver *driver;
+	struct pcie_port_service_driver *service_driver;
+	struct pcie_device *pcie_dev;
+	struct find_aer_service_data *result;
+
+	result = (struct find_aer_service_data *) data;
+
+	if (device->bus == &pcie_port_bus_type) {
+		pcie_dev = to_pcie_device(device);
+		if (pcie_dev->id.port_type == PCIE_SW_DOWNSTREAM_PORT)
+			result->is_downstream = 1;
+
+		driver = device->driver;
+		if (driver) {
+			service_driver = to_service_driver(driver);
+			if (service_driver->id_table->service_type ==
+					PCIE_PORT_SERVICE_AER) {
+				result->aer_driver = service_driver;
+				return 1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static void find_aer_service(struct pci_dev *dev,
+		struct find_aer_service_data *data)
+{
+	device_for_each_child(&dev->dev, data, find_aer_service_iter);
+}
+
+static pci_ers_result_t reset_link(struct pcie_device *aerdev,
+		struct pci_dev *dev)
+{
+	struct pci_dev *udev;
+	pci_ers_result_t status;
+	struct find_aer_service_data data;
+
+	if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)
+		udev = dev;
+	else
+		udev = dev->bus->self;
+
+	data.is_downstream = 0;
+	data.aer_driver = NULL;
+	find_aer_service(udev, &data);
+
+	/*
+	 * Use the AER driver of the error agent first.
+	 * If it has no AER driver, use the root port's.
+	 */
+	if (!data.aer_driver || !data.aer_driver->reset_link) {
+		if (data.is_downstream &&
+			aerdev->device.driver &&
+			to_service_driver(aerdev->device.driver)->reset_link) {
+			data.aer_driver =
+				to_service_driver(aerdev->device.driver);
+		} else {
+			printk(KERN_DEBUG "No link-reset support to Device ID"
+				"[%s]\n",
+				dev->dev.bus_id);
+			return PCI_ERS_RESULT_DISCONNECT;
+		}
+	}
+
+	status = data.aer_driver->reset_link(udev);
+	if (status != PCI_ERS_RESULT_RECOVERED) {
+		printk(KERN_DEBUG "Link reset at upstream Device ID"
+			"[%s] failed\n",
+			udev->dev.bus_id);
+		return PCI_ERS_RESULT_DISCONNECT;
+	}
+
+	return status;
+}
+
+/**
+ * do_recovery - handle nonfatal/fatal error recovery process
+ * @aerdev: pointer to a pcie_device data structure of root port
+ * @dev: pointer to a pci_dev data structure of agent detecting an error
+ * @severity: error severity type
+ *
+ * Invoked when an error is nonfatal/fatal. Broadcasts the error_detected
+ * message to all downstream drivers within the hierarchy in question,
+ * drives the remaining recovery steps, and returns the resulting status.
+ **/
+static pci_ers_result_t do_recovery(struct pcie_device *aerdev,
+		struct pci_dev *dev,
+		int severity)
+{
+	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
+	enum pci_channel_state state;
+
+	if (severity == AER_FATAL)
+		state = pci_channel_io_frozen;
+	else
+		state = pci_channel_io_normal;
+
+	status = broadcast_error_message(dev,
+			state,
+			"error_detected",
+			report_error_detected);
+
+	if (severity == AER_FATAL) {
+		result = reset_link(aerdev, dev);
+		if (result != PCI_ERS_RESULT_RECOVERED) {
+			/* TODO: Should panic here? */
+			return result;
+		}
+	}
+
+	if (status == PCI_ERS_RESULT_CAN_RECOVER)
+		status = broadcast_error_message(dev,
+				state,
+				"mmio_enabled",
+				report_mmio_enabled);
+
+	if (status == PCI_ERS_RESULT_NEED_RESET) {
+		/*
+		 * TODO: Should call platform-specific
+		 * functions to reset slot before calling
+		 * drivers' slot_reset callbacks?
+		 */
+		status = broadcast_error_message(dev,
+				state,
+				"slot_reset",
+				report_slot_reset);
+	}
+
+	if (status == PCI_ERS_RESULT_RECOVERED)
+		broadcast_error_message(dev,
+				state,
+				"resume",
+				report_resume);
+
+	return status;
+}
+
+/**
+ * handle_error_source - handle logging error into an event log
+ * @aerdev: pointer to pcie_device data structure of the root port
+ * @dev: pointer to pci_dev data structure of error source device
+ * @info: comprehensive error information
+ *
+ * Invoked when an error is detected by the Root Port.
+ **/
+static void handle_error_source(struct pcie_device * aerdev,
+	struct pci_dev *dev,
+	struct aer_err_info info)
+{
+	pci_ers_result_t status = 0;
+	int pos;
+
+	if (info.severity == AER_CORRECTABLE) {
+		/*
+		 * Correctable error does not need software intervention.
+		 * No need to go through error recovery process.
+		 */
+		pos = pci_find_aer_capability(dev);
+		if (pos)
+			pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+					info.status);
+	} else {
+		status = do_recovery(aerdev, dev, info.severity);
+		if (status == PCI_ERS_RESULT_RECOVERED) {
+			printk(KERN_DEBUG "AER driver successfully recovered\n");
+		} else {
+			/* TODO: Should kernel panic here? */ 
+			printk(KERN_DEBUG "AER driver didn't recover\n");
+		}
+	}
+}
+
+/**
+ * enable_root_aer - enable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus loads AER service driver.
+ **/
+static void enable_root_aer(struct aer_rpc *rpc)
+{
+	struct pci_dev *pdev = rpc->rpd->port;
+	int pos, aer_pos;
+	u16 reg16;
+	u32 reg32;
+
+	pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+	/* Clear PCIE Capability's Device Status */
+	pci_read_config_word(pdev, pos+PCI_EXP_DEVSTA, &reg16);
+	pci_write_config_word(pdev, pos+PCI_EXP_DEVSTA, reg16);
+
+	/* Disable system error generation in response to error messages */
+	pci_read_config_word(pdev, pos + PCI_EXP_RTCTL, &reg16);
+	reg16 &= ~(SYSTEM_ERROR_INTR_ON_MESG_MASK);
+	pci_write_config_word(pdev, pos + PCI_EXP_RTCTL, reg16);
+
+	aer_pos = pci_find_aer_capability(pdev);
+	/* Clear error status */
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, reg32);
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, reg32);
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
+
+	/* Enable Root Port device reporting error itself */
+	pci_read_config_word(pdev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 |
+		PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE;
+	pci_write_config_word(pdev, pos+PCI_EXP_DEVCTL,
+		reg16);
+
+	/* Enable Root Port's interrupt in response to error messages */
+	pci_write_config_dword(pdev,
+		aer_pos + PCI_ERR_ROOT_COMMAND,
+		ROOT_PORT_INTR_ON_MESG_MASK);
+}
+
+/**
+ * disable_root_aer - disable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus unloads AER service driver.
+ **/
+static void disable_root_aer(struct aer_rpc *rpc)
+{
+	struct pci_dev *pdev = rpc->rpd->port;
+	u32 reg32;
+	int pos;
+
+	pos = pci_find_aer_capability(pdev);
+	/* Disable Root's interrupt in response to error messages */
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+	/* Clear Root's error status reg */
+	pci_read_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, reg32);
+}
+
+/**
+ * get_e_source - retrieve an error source
+ * @rpc: pointer to the root port which holds an error
+ *
+ * Invoked by DPC handler to consume an error.
+ **/
+static struct aer_err_source* get_e_source(struct aer_rpc *rpc)
+{
+	struct aer_err_source *e_source;
+	unsigned long flags;
+
+	/* Lock access to Root error producer/consumer index */
+	spin_lock_irqsave(&rpc->e_lock, flags);
+	if (rpc->prod_idx == rpc->cons_idx) {
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return NULL;
+	}
+	e_source = &rpc->e_sources[rpc->cons_idx];
+	rpc->cons_idx++;
+	if (rpc->cons_idx == AER_ERROR_SOURCES_MAX)
+		rpc->cons_idx = 0;
+	spin_unlock_irqrestore(&rpc->e_lock, flags);
+	
+	return e_source;
+}
+
+static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
+{
+	int pos;
+
+	pos = pci_find_aer_capability(dev);
+
+	/* The device might not support AER */
+	if (!pos)
+		return AER_SUCCESS;
+
+	if (info->severity == AER_CORRECTABLE) {
+		pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+			&info->status);
+		if (!(info->status & ERR_CORRECTABLE_ERROR_MASK))
+			return AER_UNSUCCESS; 
+	} else if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE ||
+		info->severity == AER_NONFATAL) {
+
+		/* Link is still healthy for IO reads */
+		pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS,
+			&info->status);
+		if (!(info->status & ERR_UNCORRECTABLE_ERROR_MASK))
+			return AER_UNSUCCESS;
+
+		if (info->status & AER_LOG_TLP_MASKS) {
+			info->flags |= AER_TLP_HEADER_VALID_FLAG;
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG, &info->tlp.dw0);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 4, &info->tlp.dw1);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 8, &info->tlp.dw2);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 12, &info->tlp.dw3);
+		}
+	}
+
+	return AER_SUCCESS;
+}
+
+/**
+ * aer_isr - consume an error detected by root port
+ * @context: pointer to a private data of pcie device
+ *
+ * Invoked, as DPC, when the root port records a newly detected error
+ **/
+void aer_isr(void *context)
+{
+	struct pcie_device *p_device = (struct pcie_device *) context;
+	struct device *s_device;
+	struct aer_rpc *rpc = get_service_data(p_device);
+	struct aer_err_source *e_src;
+	struct aer_err_info e_info = {0, 0, 0,};
+	int i;
+	u16 id;
+
+	/* 
+	 * Lock access into an error buffer associated with this Root Port.
+	 * Process one error at a time.
+	 */
+	down(&rpc->rpc_sema);
+	if (!(e_src = get_e_source(rpc))) {
+		printk(KERN_DEBUG "%s->DPC fails to get an error source\n",
+			__FUNCTION__);
+		up(&rpc->rpc_sema);
+		return;
+	}
+
+	/*
+	 * Both a correctable error and an uncorrectable error may have
+	 * been logged. Report the correctable error first.
+	 */
+	for (i = 1; i & ROOT_ERR_STATUS_MASKS ; i <<= 2) {
+		if (i > 4)
+			break;
+		if (!(e_src->status & i))
+			continue;
+
+		/* Init comprehensive error information */
+		if (i & PCI_ERR_ROOT_COR_RCV) {
+			id = ERR_COR_ID(e_src->id);
+			e_info.severity = AER_CORRECTABLE;
+		} else {
+			id = ERR_UNCOR_ID(e_src->id);
+			e_info.severity = ((e_src->status >> 6) & 1);
+		}
+		if (e_src->status &
+			(PCI_ERR_ROOT_MULTI_COR_RCV |
+			 PCI_ERR_ROOT_MULTI_UNCOR_RCV))
+			e_info.flags |= AER_MULTI_ERROR_VALID_FLAG;
+		if (!(s_device = find_source_device(p_device->port, id))) {
+			printk(KERN_DEBUG "%s->can't find device of ID%04x\n",
+				__FUNCTION__, id);
+			continue;
+		}
+		if (get_device_error_info(to_pci_dev(s_device), &e_info) ==
+				AER_SUCCESS) {
+			aer_print_error(to_pci_dev(s_device), &e_info);
+			handle_error_source(p_device,
+				to_pci_dev(s_device),
+				e_info);
+		}
+	}
+	up(&rpc->rpc_sema);
+}
+
+/**
+ * aer_add_rootport - add a new root port into Root Complex's port hierarchy
+ * @rpc: pointer to a new root port device being added
+ *
+ * Invoked when the AER service is loaded on a new Root Port
+ **/
+void aer_add_rootport(struct aer_rpc *rpc)
+{
+	/* Add new Root Port into RC List */
+	list_add_tail(&rpc->node, &rc_list);
+
+	/* Enable root port AER itself */
+	enable_root_aer(rpc);
+}
+
+/**
+ * aer_delete_rootport - delete a root port from Root Complex's port hierarchy
+ * @rpc: pointer to a root port device being deleted
+ *
+ * Invoked when the AER service is unloaded on a specific Root Port
+ **/
+void aer_delete_rootport(struct aer_rpc *rpc)
+{
+	/* Disable root port AER itself */
+	disable_root_aer(rpc);
+	
+	/* Free all source nodes under this root port */
+	list_del(&rpc->node);
+	kfree(rpc);
+}
+
+/**
+ * aer_init - provide AER initialization
+ * @dev: pointer to AER pcie device
+ *
+ * Invoked when AER service driver is loaded.
+ **/
+int aer_init(struct pcie_device *dev)
+{
+	int status;
+
+	/* Run _OSC Method */
+	status = aer_osc_setup(dev->port);
+
+	if(status != OSC_METHOD_RUN_SUCCESS) {
+		printk(KERN_DEBUG "%s: AER service init fails - %s\n",
+		__FUNCTION__,
+		(status == OSC_METHOD_NOT_SUPPORTED) ?
+			"No ACPI _OSC support" : "Run ACPI _OSC fails");
+
+		if (!forceload)
+			return status;
+	}
+
+	return AER_SUCCESS;
+}
+
+EXPORT_SYMBOL_GPL(pci_find_aer_capability);
+EXPORT_SYMBOL_GPL(pci_disable_pcie_error_reporting);
+EXPORT_SYMBOL_GPL(pci_cleanup_aer_uncorrect_error_status);
+
--- linux-2.6.17/include/linux/aer.h	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/include/linux/aer.h	2006-07-12 14:37:17.000000000 +0800
@@ -0,0 +1,43 @@
+/*
+ * Copyright (C) 2006 Intel
+ *     Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *     Zhang Yanmin (yanmin.zhang@intel.com)
+ */
+
+#ifndef _AER_H_
+#define _AER_H_
+
+#if defined(CONFIG_PCIEAER) || defined(CONFIG_PCIEAER_MODULE)
+/* pci-e port driver needs this function to enable aer */
+static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
+{
+	u16 reg16 = 0;
+	int pos;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 |
+		PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE;
+	pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+			reg16);
+	return 0;
+}
+
+extern int pci_find_aer_capability(struct pci_dev *dev);
+extern int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+extern int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+#else
+#define pci_enable_pcie_error_reporting(dev)		do { } while (0)
+#define pci_find_aer_capability(dev)			do { } while (0)
+#define pci_disable_pcie_error_reporting(dev)		do { } while (0)
+#define pci_cleanup_aer_uncorrect_error_status(dev)	do { } while (0)
+#endif
+
+#endif //_AER_H_
+
--- linux-2.6.17/include/linux/pcieport_if.h	2006-06-22 16:26:32.000000000 +0800
+++ linux-2.6.17_aer/include/linux/pcieport_if.h	2006-06-22 16:46:29.000000000 +0800
@@ -61,6 +61,12 @@ struct pcie_port_service_driver {
 	void (*remove) (struct pcie_device *dev);
 	int (*suspend) (struct pcie_device *dev, pm_message_t state);
 	int (*resume) (struct pcie_device *dev);
+	
+	/* Service Error Recovery Handler */
+	struct pci_error_handlers *err_handler;
+
+	/* Link Reset Capability - AER service driver specific */
+	pci_ers_result_t (*reset_link) (struct pci_dev *dev);
 
 	const struct pcie_port_service_id *id_table;
 	struct device_driver driver;
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv.h	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv.h	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,137 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#ifndef _AERDRV_H_
+#define _AERDRV_H_
+
+#include <linux/pcieport_if.h>
+#include <linux/aer.h>
+
+#define AER_NONFATAL			0
+#define AER_FATAL			1
+#define AER_CORRECTABLE			2
+#define AER_UNCORRECTABLE		4
+#define AER_ERROR_MASK			0x001fffff
+#define AER_ERROR(d)			(d & AER_ERROR_MASK)
+
+#define VERBOSE_LIMIT_DISPLAY		1
+#define VERBOSE_FULL_DISPLAY		2
+#define VERBOSE_RAW_DISPLAY		3
+#define VERBOSE_MASK			0x3
+
+#define OSC_METHOD_RUN_SUCCESS		0
+#define OSC_METHOD_NOT_SUPPORTED	1
+#define OSC_METHOD_RUN_FAILURE		2
+
+/* Root Error Status Register Bits */
+#define ROOT_ERR_STATUS_MASKS			0x0f
+
+#define SYSTEM_ERROR_INTR_ON_MESG_MASK	(PCI_EXP_RTCTL_SECEE|	\
+					PCI_EXP_RTCTL_SENFEE|	\
+					PCI_EXP_RTCTL_SEFEE)
+#define ROOT_PORT_INTR_ON_MESG_MASK	(PCI_ERR_ROOT_CMD_COR_EN|	\
+					PCI_ERR_ROOT_CMD_NONFATAL_EN|	\
+					PCI_ERR_ROOT_CMD_FATAL_EN)
+#define ERR_COR_ID(d)			(d & 0xffff)
+#define ERR_UNCOR_ID(d)			(d >> 16)
+
+#define AER_SUCCESS			0
+#define AER_UNSUCCESS			1
+#define AER_ERROR_SOURCES_MAX		100
+
+#define AER_LOG_TLP_MASKS		(PCI_ERR_UNC_POISON_TLP|	\
+					PCI_ERR_UNC_ECRC|		\
+					PCI_ERR_UNC_UNSUP|		\
+					PCI_ERR_UNC_COMP_ABORT|		\
+					PCI_ERR_UNC_UNX_COMP|		\
+					PCI_ERR_UNC_MALF_TLP)
+
+/* AER Error Info Flags */
+#define AER_TLP_HEADER_VALID_FLAG	0x00000001
+#define AER_MULTI_ERROR_VALID_FLAG	0x00000002
+
+#define ERR_CORRECTABLE_ERROR_MASK	0x000031c1
+#define ERR_UNCORRECTABLE_ERROR_MASK	0x001ff010
+
+struct header_log_regs {
+	unsigned int dw0;
+	unsigned int dw1;
+	unsigned int dw2;
+	unsigned int dw3;
+};
+
+struct aer_err_info {
+	int severity;			/* 0:NONFATAL | 1:FATAL | 2:COR */
+	int flags;			
+	unsigned int status;		/* COR/UNCOR Error Status */
+	struct header_log_regs tlp; 	/* TLP Header */
+};
+
+struct aer_err_source {
+	unsigned int status;
+	unsigned int id;
+};
+
+struct aer_rpc {
+ 	struct list_head node;
+ 	struct list_head children;	/* AER children of this root port */
+	struct pcie_device *rpd;	/* Root Port device */
+	struct work_struct dpc_handler;
+	struct aer_err_source e_sources[AER_ERROR_SOURCES_MAX];
+	unsigned short prod_idx;	/* Error Producer Index */
+	unsigned short cons_idx;	/* Error Consumer Index */
+	int isr;
+	spinlock_t e_lock;		/* 
+					 * Lock access to Error Status/ID Regs
+					 * and error producer/consumer index
+					 */
+ 
+	struct semaphore rpc_sema;	/* 
+					 * Semaphore access required to
+					 * access, add, remove, or print AER
+				 	 * aware devices in this RPC hierarchy
+					 */
+};
+
+struct aer_broadcast_data {
+	enum pci_channel_state state;
+	enum pci_ers_result result;
+};
+
+static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
+		enum pci_ers_result new)
+{
+	switch (orig) {
+	case PCI_ERS_RESULT_CAN_RECOVER:
+	case PCI_ERS_RESULT_RECOVERED:
+		orig = new;
+		break;
+	case PCI_ERS_RESULT_DISCONNECT:
+		if (new == PCI_ERS_RESULT_NEED_RESET)
+			orig = new;
+		break;
+	default:
+		break;
+	}
+
+	return orig;
+}
+
+extern struct bus_type pcie_port_bus_type;
+extern void aer_add_rootport(struct aer_rpc *rpc);
+extern void aer_delete_rootport(struct aer_rpc *rpc);
+extern int aer_init(struct pcie_device *dev);
+extern void aer_isr(void *context);
+extern void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
+
+#ifdef CONFIG_ACPI
+extern int aer_osc_setup(struct pci_dev *dev);
+#else
+#define  aer_osc_setup(dev)		(OSC_METHOD_NOT_SUPPORTED)
+#endif
+
+#endif //_AERDRV_H_
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv.c	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,342 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/pcieport_if.h>
+
+#include "aerdrv.h"
+
+/*
+ * Version Information
+ */
+#define DRIVER_VERSION "v1.0"
+#define DRIVER_AUTHOR "tom.l.nguyen@intel.com"
+#define DRIVER_DESC "Root Port Advanced Error Reporting Driver"
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
+MODULE_LICENSE("GPL");
+
+static int __devinit aer_probe (struct pcie_device *dev,
+	const struct pcie_port_service_id *id );
+static void aer_remove(struct pcie_device *dev);
+static int aer_suspend(struct pcie_device *dev, pm_message_t state)
+{return 0;}
+static int aer_resume(struct pcie_device *dev) {return 0;}
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+	enum pci_channel_state error);
+static void aer_error_resume(struct pci_dev *dev);
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
+
+/*
+ * PCI Express bus's AER Root service driver data structure
+ */
+static struct pcie_port_service_id aer_id[] = {
+	{
+	.vendor 	= PCI_ANY_ID, 
+	.device 	= PCI_ANY_ID,
+	.port_type 	= PCIE_RC_PORT, 
+	.service_type 	= PCIE_PORT_SERVICE_AER,
+	},
+	{ /* end: all zeroes */ }
+};
+
+static struct pci_error_handlers aer_error_handlers = {
+	.error_detected = aer_error_detected,
+	.resume = aer_error_resume,
+};
+
+static struct pcie_port_service_driver aerdrv = {
+	.name		= "aer",
+	.id_table	= &aer_id[0],
+
+	.probe		= aer_probe,
+	.remove		= aer_remove,
+
+	.suspend	= aer_suspend,
+	.resume		= aer_resume,
+
+	.err_handler	= &aer_error_handlers,
+
+	.reset_link	= aer_root_reset,
+};
+
+/**
+ * aer_irq - Root Port's ISR
+ * @irq: IRQ assigned to Root Port
+ * @context: pointer to Root Port data structure
+ * @r: pointer to struct pt_regs
+ *
+ * Invoked when Root Port detects AER messages.
+ **/
+static irqreturn_t aer_irq(int irq, void *context, struct pt_regs * r)
+{
+	unsigned int status, id;
+	struct pcie_device *pdev = (struct pcie_device *)context;
+	struct aer_rpc *rpc = get_service_data(pdev);
+	int next_prod_idx;
+	unsigned long flags;
+	int pos;
+
+	pos = pci_find_aer_capability(pdev->port);
+	/* 
+	 * Must lock access to Root Error Status Reg, Root Error ID Reg, 
+	 * and Root error producer/consumer index 
+	 */
+	spin_lock_irqsave(&rpc->e_lock, flags);
+
+	/* Read error status */
+	pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, &status);
+	if (!(status & ROOT_ERR_STATUS_MASKS)) {
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return IRQ_NONE;
+	}
+
+	/* Read error source and clear error status */
+	pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_COR_SRC, &id);
+	pci_write_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, status);
+
+	/* Store error source for later DPC handler */
+	next_prod_idx = rpc->prod_idx + 1;
+	if (next_prod_idx == AER_ERROR_SOURCES_MAX)
+		next_prod_idx = 0;
+	if (next_prod_idx == rpc->cons_idx) {
+		/* 
+		 * Error Storm Condition - possibly the same error occurred.
+		 * Drop the error.
+		 */
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return IRQ_HANDLED;
+	}
+	rpc->e_sources[rpc->prod_idx].status =  status;
+	rpc->e_sources[rpc->prod_idx].id = id;
+	rpc->prod_idx = next_prod_idx;
+	spin_unlock_irqrestore(&rpc->e_lock, flags);
+
+	/*  Invoke DPC handler */
+	schedule_work(&rpc->dpc_handler);
+
+	return IRQ_HANDLED;
+}
+
+/**
+ * aer_alloc_rpc - allocate Root Port data structure
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when Root Port's AER service is loaded.
+ **/
+static struct aer_rpc* aer_alloc_rpc(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc;
+
+	if (!(rpc = (struct aer_rpc *)kmalloc(sizeof(struct aer_rpc), 
+		GFP_KERNEL)))
+		return NULL;
+
+	memset(rpc, 0, sizeof(struct aer_rpc));
+	/* 
+	 * Initialize Root lock access, e_lock, to Root Error Status Reg, 
+	 * Root Error ID Reg, and Root error producer/consumer index. 
+	 */
+	rpc->e_lock = SPIN_LOCK_UNLOCKED;
+
+	/* 
+	 * Initialize semaphore access required to access, add, remove,
+	 * or print AER aware devices in this RPC hierarchy 
+	 */
+	sema_init(&rpc->rpc_sema, 1);
+
+	INIT_LIST_HEAD(&rpc->node);
+	INIT_LIST_HEAD(&rpc->children);
+	rpc->rpd = dev;
+	INIT_WORK(&rpc->dpc_handler, aer_isr, (void *)dev);
+	rpc->prod_idx = rpc->cons_idx = 0;
+
+	/* Use PCIE bus function to store rpc into PCIE device */
+	set_service_data(dev, rpc);
+
+	return rpc;
+}
+
+/**
+ * aer_remove - clean up resources
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when PCI Express bus unloads or AER probe fails.
+ **/
+static void aer_remove(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc = get_service_data(dev);
+
+	if (rpc) {
+		/* If the interrupt service was registered, it must be freed. */
+		if (rpc->isr)
+			free_irq(dev->irq, dev);
+
+		/* Delete this node from a RC hierarchy */
+		aer_delete_rootport(rpc);
+		set_service_data(dev, NULL);
+	}
+}
+
+/**
+ * aer_probe - initialize resources
+ * @dev: pointer to the pcie_dev data structure
+ * @id: pointer to the service id data structure
+ *
+ * Invoked when PCI Express bus loads AER service driver.
+ **/
+static int __devinit aer_probe (struct pcie_device *dev, 
+				const struct pcie_port_service_id *id )
+{
+	int status;
+	struct aer_rpc *rpc;
+	struct device *device = &dev->device;
+
+	/* Init */
+	if ((status = aer_init(dev)))
+		return status;
+
+	/* Alloc rpc data structure */
+	if (!(rpc = aer_alloc_rpc(dev))) {
+		printk(KERN_DEBUG "%s: Alloc rpc fails on PCIE device[%s]\n",
+			__FUNCTION__, device->bus_id);
+		aer_remove(dev);
+		return -ENOMEM;
+	}
+
+	/* Request IRQ ISR */
+	if ((status = request_irq(dev->irq, aer_irq, SA_SHIRQ, "aerdrv", 
+				dev))) {
+		printk(KERN_DEBUG "%s: Request ISR fails on PCIE device[%s]\n", 
+			__FUNCTION__, device->bus_id);
+		aer_remove(dev);
+		return status;
+	}
+
+	rpc->isr = 1;
+
+	/* Add rpc into a RC hierarchy */
+	aer_add_rootport(rpc);
+
+	return status;
+}
+
+/**
+ * aer_root_reset - reset link on Root Port
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver when performing link reset at Root Port.
+ **/
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
+{
+	u16 p2p_ctrl;
+	u32 status;
+	int pos;
+
+	pos = pci_find_aer_capability(dev);
+
+	/* Disable Root's interrupt in response to error messages */ 
+	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+	/* Assert Secondary Bus Reset */
+	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &p2p_ctrl);
+	p2p_ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+	/* De-assert Secondary Bus Reset */
+	p2p_ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+	/* 
+	 * System software must wait for at least 100ms from the end 
+	 * of a reset of one or more device before it is permitted
+	 * to issue Configuration Requests to those devices.
+	 */
+	msleep(200);
+	printk(KERN_DEBUG "Complete link reset at Root[%s]\n", dev->dev.bus_id);
+
+	/* Enable Root Port's interrupt in response to error messages */ 
+	pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
+	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, status);
+	pci_write_config_dword(dev,
+		pos + PCI_ERR_ROOT_COMMAND,
+		ROOT_PORT_INTR_ON_MESG_MASK);
+
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
+/**
+ * aer_error_detected - update severity status
+ * @dev: pointer to Root Port's pci_dev data structure
+ * @error: error severity being notified by port bus
+ *
+ * Invoked by Port Bus driver during error recovery.
+ **/
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+			enum pci_channel_state error)
+{
+	/* Root Port has no impact. Always recovers. */
+	return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+/**
+ * aer_error_resume - clean up corresponding error status bits
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver during nonfatal recovery.
+ **/
+static void aer_error_resume(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, mask;
+	u16 reg16;
+
+	/* Clean up Root device status */
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	pci_read_config_word(dev, pos + PCI_EXP_DEVSTA, &reg16);
+	pci_write_config_word(dev, pos + PCI_EXP_DEVSTA, reg16);
+
+	/* Clean AER Root Error Status */
+	pos = pci_find_aer_capability(dev);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+	if (dev->error_state == pci_channel_io_normal)
+		status &= ~mask; /* Clear corresponding nonfatal bits */
+	else
+		status &= mask; /* Clear corresponding fatal bits */
+	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+}
+
+/**
+ * aer_service_init - register AER root service driver
+ *
+ * Invoked when AER root service driver is loaded.
+ **/
+static int __init aer_service_init(void)
+{
+	return pcie_port_service_register(&aerdrv);
+}
+
+/**
+ * aer_service_exit - unregister AER root service driver
+ *
+ * Invoked when AER root service driver is unloaded.
+ **/
+static void __exit aer_service_exit(void) 
+{
+	pcie_port_service_unregister(&aerdrv);
+}
+
+module_init(aer_service_init);
+module_exit(aer_service_exit);
--- linux-2.6.17/drivers/pci/pcie/aer/Kconfig	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/Kconfig	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,12 @@
+#
+# Root Port Device AER Configuration
+#
+
+config PCIEAER
+	tristate "Root Port Advanced Error Reporting support"
+	depends on PCIEPORTBUS 
+	default y
+	help
+	  This enables Root Port Advanced Error Reporting (AER) driver
+	  support. Error reporting messages sent to Root Port will be
+	  handled by PCI Express AER driver.
--- linux-2.6.17/drivers/pci/pcie/aer/Makefile	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/Makefile	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,10 @@
+#
+# Makefile for PCI-Express Root Port Advanced Error Reporting Driver
+#
+
+obj-$(CONFIG_PCIEAER)		+= aerdriver.o
+aerdrv_acpi-$(CONFIG_ACPI)	+= aerdrv_acpi.o
+
+aerdriver-objs		:= aerdrv_errprint.o aerdrv_core.o aerdrv.o
+aerdriver-objs		+= $(aerdrv_acpi-y)
+
--- linux-2.6.17/drivers/pci/pcie/Kconfig	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/Kconfig	2006-06-22 16:46:29.000000000 +0800
@@ -34,3 +34,4 @@ config HOTPLUG_PCI_PCIE_POLL_EVENT_MODE
 	   
 	  When in doubt, say N.
 
+source "drivers/pci/pcie/aer/Kconfig"
--- linux-2.6.17/drivers/pci/pcie/Makefile	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/Makefile	2006-06-22 16:46:29.000000000 +0800
@@ -5,3 +5,6 @@
 pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o
 
 obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
+
+# Build PCI Express AER if needed
+obj-$(CONFIG_PCIEAER)		+= aer/
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_errprint.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_errprint.c	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,216 @@
+/*
+ * Copyright (C) 2006 Intel
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+
+#include "aerdrv.h"
+
+#define AER_AGENT_RECEIVER		0
+#define AER_AGENT_REQUESTER		1
+#define AER_AGENT_COMPLETER		2
+#define AER_AGENT_TRANSMITTER		3		
+
+#define AER_AGENT_REQUESTER_MASK	(PCI_ERR_UNC_COMP_TIME|	\
+					PCI_ERR_UNC_UNSUP)
+
+#define AER_AGENT_COMPLETER_MASK	PCI_ERR_UNC_COMP_ABORT
+
+#define AER_AGENT_TRANSMITTER_MASK(t, e) (e & (PCI_ERR_COR_REP_ROLL| \
+	((t == AER_CORRECTABLE) ? PCI_ERR_COR_REP_TIMER: 0))) 
+
+#define AER_GET_AGENT(t, e)						\
+	((e & AER_AGENT_COMPLETER_MASK) ? AER_AGENT_COMPLETER :		\
+	(e & AER_AGENT_REQUESTER_MASK) ? AER_AGENT_REQUESTER :		\
+	(AER_AGENT_TRANSMITTER_MASK(t, e)) ? AER_AGENT_TRANSMITTER :	\
+	AER_AGENT_RECEIVER)
+
+#define AER_PHYSICAL_LAYER_ERROR_MASK	PCI_ERR_COR_RCVR
+#define AER_DATA_LINK_LAYER_ERROR_MASK(t, e)	\
+		(PCI_ERR_UNC_DLP|		\
+		PCI_ERR_COR_BAD_TLP| 		\
+		PCI_ERR_COR_BAD_DLLP|		\
+		PCI_ERR_COR_REP_ROLL| 		\
+		((t == AER_CORRECTABLE) ?	\
+		PCI_ERR_COR_REP_TIMER: 0))
+
+#define AER_PHYSICAL_LAYER_ERROR	0
+#define AER_DATA_LINK_LAYER_ERROR	1
+#define AER_TRANSACTION_LAYER_ERROR	2
+
+#define AER_GET_LAYER_ERROR(t, e)				\
+	((e & AER_PHYSICAL_LAYER_ERROR_MASK) ?			\
+	AER_PHYSICAL_LAYER_ERROR :				\
+	(e & AER_DATA_LINK_LAYER_ERROR_MASK(t, e)) ?		\
+		AER_DATA_LINK_LAYER_ERROR : 			\
+		AER_TRANSACTION_LAYER_ERROR)
+
+/* 
+ * AER error strings 
+ */
+static char* aer_error_severity_string[] = {
+	"Uncorrected (Non-Fatal)", 
+	"Uncorrected (Fatal)",
+	"Corrected"
+};
+
+static char* aer_error_layer[] = {
+	"Physical Layer",
+	"Data Link Layer",
+	"Transaction Layer" 
+};
+static char* aer_correctable_error_string[] = {
+	"Receiver Error        ",	/* Bit Position 0 	*/
+	"Unknown Error Bit 1   ", 	/* Bit Position 1	*/
+	"Unknown Error Bit 2   ",	/* Bit Position 2	*/
+	"Unknown Error Bit 3   ", 	/* Bit Position 3	*/
+	"Unknown Error Bit 4   ", 	/* Bit Position 4 	*/
+	"Unknown Error Bit 5   ",	/* Bit Position 5	*/
+	"Bad TLP               ",	/* Bit Position 6 	*/
+	"Bad DLLP              ",	/* Bit Position 7 	*/
+	"REPLAY_NUM Rollover   ",	/* Bit Position 8 	*/
+	"Unknown Error Bit 9   ", 	/* Bit Position 9	*/
+	"Unknown Error Bit 10  ",	/* Bit Position 10	*/
+	"Unknown Error Bit 11  ", 	/* Bit Position 11	*/
+	"Replay Timer Timeout  ",	/* Bit Position 12 	*/
+	"Advisory Non-Fatal    ", 	/* Bit Position 13	*/
+	"Unknown Error Bit 14  ",	/* Bit Position 14	*/
+	"Unknown Error Bit 15  ", 	/* Bit Position 15	*/
+	"Unknown Error Bit 16  ", 	/* Bit Position 16 	*/
+	"Unknown Error Bit 17  ",	/* Bit Position 17	*/
+	"Unknown Error Bit 18  ", 	/* Bit Position 18	*/
+	"Unknown Error Bit 19  ",	/* Bit Position 19	*/
+	"Unknown Error Bit 20  ", 	/* Bit Position 20	*/
+	"Unknown Error Bit 21  ", 	/* Bit Position 21 	*/
+	"Unknown Error Bit 22  ",	/* Bit Position 22	*/
+	"Unknown Error Bit 23  ", 	/* Bit Position 23	*/
+	"Unknown Error Bit 24  ",	/* Bit Position 24	*/
+	"Unknown Error Bit 25  ", 	/* Bit Position 25	*/
+	"Unknown Error Bit 26  ", 	/* Bit Position 26 	*/
+	"Unknown Error Bit 27  ",	/* Bit Position 27	*/
+	"Unknown Error Bit 28  ",	/* Bit Position 28	*/
+	"Unknown Error Bit 29  ", 	/* Bit Position 29	*/
+	"Unknown Error Bit 30  ", 	/* Bit Position 30 	*/
+	"Unknown Error Bit 31  "	/* Bit Position 31	*/
+};
+
+static char* aer_uncorrectable_error_string[] = {
+	"Unknown Error Bit 0   ", 	/* Bit Position 0	*/
+	"Unknown Error Bit 1   ", 	/* Bit Position 1	*/
+	"Unknown Error Bit 2   ",	/* Bit Position 2	*/
+	"Unknown Error Bit 3   ", 	/* Bit Position 3	*/
+	"Data Link Protocol    ",	/* Bit Position 4	*/
+	"Unknown Error Bit 5   ", 	/* Bit Position 5	*/
+	"Unknown Error Bit 6   ", 	/* Bit Position 6	*/
+	"Unknown Error Bit 7   ",	/* Bit Position 7	*/
+	"Unknown Error Bit 8   ", 	/* Bit Position 8	*/
+	"Unknown Error Bit 9   ", 	/* Bit Position 9	*/
+	"Unknown Error Bit 10  ",	/* Bit Position 10	*/
+	"Unknown Error Bit 11  ", 	/* Bit Position 11	*/
+	"Poisoned TLP          ",	/* Bit Position 12 	*/
+	"Flow Control Protocol ",	/* Bit Position 13	*/
+	"Completion Timeout    ",	/* Bit Position 14 	*/
+	"Completer Abort       ",	/* Bit Position 15 	*/
+	"Unexpected Completion ",	/* Bit Position 16	*/
+	"Receiver Overflow     ",	/* Bit Position 17	*/
+	"Malformed TLP         ",	/* Bit Position 18	*/
+	"ECRC                  ",	/* Bit Position 19	*/
+	"Unsupported Request   ",	/* Bit Position 20	*/
+	"Unknown Error Bit 21  ", 	/* Bit Position 21 	*/
+	"Unknown Error Bit 22  ",	/* Bit Position 22	*/
+	"Unknown Error Bit 23  ", 	/* Bit Position 23	*/
+	"Unknown Error Bit 24  ",	/* Bit Position 24	*/
+	"Unknown Error Bit 25  ", 	/* Bit Position 25	*/
+	"Unknown Error Bit 26  ", 	/* Bit Position 26 	*/
+	"Unknown Error Bit 27  ",	/* Bit Position 27	*/
+	"Unknown Error Bit 28  ",	/* Bit Position 28	*/
+	"Unknown Error Bit 29  ", 	/* Bit Position 29	*/
+	"Unknown Error Bit 30  ", 	/* Bit Position 30 	*/
+	"Unknown Error Bit 31  "	/* Bit Position 31	*/
+};
+
+static char* aer_agent_string[] = {
+	"Receiver ID", 
+	"Requester ID", 
+	"Completer ID", 
+	"Transmitter ID" 
+};
+
+static char* aer_get_error_source_name(int severity, unsigned int status)
+{
+        int i;
+
+        for (i = 0; i < 32; i++) {
+                if (!(status & (1 << i)))
+                        continue;
+
+                if (severity == AER_CORRECTABLE)
+                        return aer_correctable_error_string[i];
+                else
+                        return aer_uncorrectable_error_string[i];
+        }
+
+        return NULL;
+}
+
+void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
+{
+	char * errmsg;
+	int err_layer, agent;
+
+	printk(KERN_ERR "+------ PCI-Express Device Error ------+\n");
+	printk(KERN_ERR "Error Severity\t\t: %s\n",
+		aer_error_severity_string[info->severity]);
+
+	if ( info->status == 0) {
+                printk(KERN_ERR "PCIE Bus Error type\t: (Unaccessible)\n");
+                printk(KERN_ERR "Unaccessible Received\t: %s\n",
+			info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+				"Multiple" : "First");
+                printk(KERN_ERR "Unregistered Agent ID\t: %04x\n",
+			(dev->bus->number << 8) | dev->devfn);
+	} else {
+		err_layer = AER_GET_LAYER_ERROR(info->severity, info->status);
+		printk(KERN_ERR "PCIE Bus Error type\t: %s\n",
+			aer_error_layer[err_layer]);
+
+		errmsg = aer_get_error_source_name(info->severity, info->status);
+		printk(KERN_ERR "%s\t: %s\n", errmsg,
+			info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+				"Multiple" : "First");
+
+		agent = AER_GET_AGENT(info->severity, info->status);
+		printk(KERN_ERR "%s\t\t: %04x\n",
+			aer_agent_string[agent],
+			(dev->bus->number << 8) | dev->devfn);
+
+		printk(KERN_ERR "VendorID=%04xh, DeviceID=%04xh,"
+			" Bus=%02xh, Device=%02xh, Function=%02xh\n",
+			dev->vendor,
+			dev->device,
+			dev->bus->number,
+			PCI_SLOT(dev->devfn),
+			PCI_FUNC(dev->devfn));
+
+		if (info->flags & AER_TLP_HEADER_VALID_FLAG) {
+			unsigned char *tlp = (unsigned char *) &info->tlp;
+			printk(KERN_ERR "TLP Header:\n");
+			printk(KERN_ERR "%02x%02x%02x%02x %02x%02x%02x%02x"
+				" %02x%02x%02x%02x %02x%02x%02x%02x\n",
+				*(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
+				*(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
+				*(tlp + 11), *(tlp + 10), *(tlp + 9),
+				*(tlp + 8), *(tlp + 15), *(tlp + 14),
+				*(tlp + 13), *(tlp + 12));
+		}
+	}
+}
+
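
For readers following the thread: the producer/consumer indexing that
aer_irq() uses to queue error sources for the deferred handler can be
modelled in plain userspace C. This is only a sketch under assumptions.
The names err_ring, err_ring_put and err_ring_get are hypothetical, the
consumer side is a guess at what the work handler does (aerdrv_core.c is
not shown in this part of the thread), and the e_lock spinlock that
protects the indices in the driver is left out.

```c
#include <stdbool.h>

#define AER_ERROR_SOURCES_MAX	100	/* same value as in aerdrv.h */

struct aer_err_source {
	unsigned int status;
	unsigned int id;
};

/* Hypothetical stand-in for the ring-related fields of struct aer_rpc. */
struct err_ring {
	struct aer_err_source e_sources[AER_ERROR_SOURCES_MAX];
	unsigned short prod_idx;	/* next slot to write */
	unsigned short cons_idx;	/* next slot to read */
};

/*
 * Producer side, following the logic in aer_irq(): advance with
 * wrap-around and drop the entry when the ring is full.
 * Returns false when the entry was dropped.
 */
static bool err_ring_put(struct err_ring *r, unsigned int status,
			 unsigned int id)
{
	unsigned short next = r->prod_idx + 1;

	if (next == AER_ERROR_SOURCES_MAX)
		next = 0;
	if (next == r->cons_idx)
		return false;		/* error storm: drop */

	r->e_sources[r->prod_idx].status = status;
	r->e_sources[r->prod_idx].id = id;
	r->prod_idx = next;
	return true;
}

/*
 * Consumer side (assumed shape, not taken from the patch):
 * returns false when nothing is queued.
 */
static bool err_ring_get(struct err_ring *r, struct aer_err_source *out)
{
	if (r->cons_idx == r->prod_idx)
		return false;		/* empty */

	*out = r->e_sources[r->cons_idx];
	r->cons_idx++;
	if (r->cons_idx == AER_ERROR_SOURCES_MAX)
		r->cons_idx = 0;
	return true;
}
```

Because one slot is always kept free to tell "full" apart from "empty",
this scheme holds at most AER_ERROR_SOURCES_MAX - 1 pending entries.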


* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12  8:06       ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
@ 2006-07-12 13:16         ` Arjan van de Ven
  2006-07-13  2:08           ` Zhang, Yanmin
  2006-07-12 16:26         ` Andi Kleen
  1 sibling, 1 reply; 29+ messages in thread
From: Arjan van de Ven @ 2006-07-12 13:16 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: LKML, linux-pci maillist, Greg KH, Tom Long Nguyen

> + 
> +	struct semaphore rpc_sema;	/* 
> +					 * Semaphore access required to
> +					 * access, add, remove, or print AER
> +				 	 * aware devices in this RPC hierarchy
> +					 */


Hi, 

sorry to bug you again.. but is there a reason you're introducing a new
semaphore and not a mutex? From looking at the code it could/should be a
mutex...

Greetings,
   Arjan van de Ven



* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12  8:06       ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
  2006-07-12 13:16         ` Arjan van de Ven
@ 2006-07-12 16:26         ` Andi Kleen
  2006-07-13  2:16           ` Zhang, Yanmin
  1 sibling, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2006-07-12 16:26 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen, linux-kernel

"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> writes:

> With Arjan's comments, I changed EXPORT_SYMBOL to EXPORT_SYMBOL_GPL.
> Sorry for flooding your emailbox again. :)

This means that non-GPL drivers will reimplement these functions
on their own (which is possible, just ugly). The fallout of them getting that wrong
might be significant.

I would change it back. _GPL should be only for core services, not
for generic driver interfaces.

> --- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_core.c	1970-01-01 08:00:00.000000000 +0800
> +++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_core.c	2006-07-12 15:47:38.000000000 +0800
> @@ -0,0 +1,737 @@
> +/*
> + * Copyright (C) 2006 Intel
> + *	Tom Long Nguyen (tom.l.nguyen@intel.com)
> + *	Zhang Yanmin (yanmin.zhang@intel.com)

A comment describing what the file does is missing. At least one paragraph
of design rationale would be good.

> +
> +config PCIEAER
> +	tristate "Root Port Advanced Error Reporting support"
> +	depends on PCIEPORTBUS 
> +	default y
> +	help
> +	  This enables Root Port Advanced Error Reporting (AER) driver
> +	  support. Error reporting messages sent to Root Port will be
> +	  handled by PCI Express AER driver.

I hope it's clear from the context this is PCI-E specific?

> --- linux-2.6.17/drivers/pci/pcie/aer/Makefile	1970-01-01 08:00:00.000000000 +0800
> +++ linux-2.6.17_aer/drivers/pci/pcie/aer/Makefile	2006-06-22 16:46:29.000000000 +0800
> @@ -0,0 +1,10 @@
> +#
> +# Makefile for PCI-Express Root Port Advanced Error Reporting Driver
> +#
> +
> +obj-$(CONFIG_PCIEAER)		+= aerdriver.o
> +aerdrv_acpi-$(CONFIG_ACPI)	+= aerdrv_acpi.o
> +
> +aerdriver-objs		:= aerdrv_errprint.o aerdrv_core.o aerdrv.o
> +aerdriver-objs		+= $(aerdrv_acpi-y)
> +
> --- linux-2.6.17/drivers/pci/pcie/Kconfig	2006-06-22 16:26:43.000000000 +0800
> +++ linux-2.6.17_aer/drivers/pci/pcie/Kconfig	2006-06-22 16:46:29.000000000 +0800
> @@ -34,3 +34,4 @@ config HOTPLUG_PCI_PCIE_POLL_EVENT_MODE
>  	   
>  	  When in doubt, say N.
>  
> +source "drivers/pci/pcie/aer/Kconfig"
> --- linux-2.6.17/drivers/pci/pcie/Makefile	2006-06-22 16:26:43.000000000 +0800
> +++ linux-2.6.17_aer/drivers/pci/pcie/Makefile	2006-06-22 16:46:29.000000000 +0800
> @@ -5,3 +5,6 @@
>  pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o
>  
>  obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
> +
> +# Build PCI Express AER if needed
> +obj-$(CONFIG_PCIEAER)		+= aer/
> --- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_errprint.c	1970-01-01 08:00:00.000000000 +0800
> +++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_errprint.c	2006-06-22 16:46:29.000000000 +0800
> @@ -0,0 +1,216 @@
> +/*
> + * Copyright (C) 2006 Intel
> + *	Tom Long Nguyen (tom.l.nguyen@intel.com)
> + *	Zhang Yanmin (yanmin.zhang@intel.com)
> + *

A comment describing what the code does is missing.

At least one paragraph of design rationale would be good.

> +	"Unknown Error Bit 22  ",	/* Bit Position 22	*/
> +	"Unknown Error Bit 23  ", 	/* Bit Position 23	*/
> +	"Unknown Error Bit 24  ",	/* Bit Position 24	*/
> +	"Unknown Error Bit 25  ", 	/* Bit Position 25	*/
> +	"Unknown Error Bit 26  ", 	/* Bit Position 26 	*/
> +	"Unknown Error Bit 27  ",	/* Bit Position 27	*/
> +	"Unknown Error Bit 28  ",	/* Bit Position 28	*/
> +	"Unknown Error Bit 29  ", 	/* Bit Position 29	*/
> +	"Unknown Error Bit 30  ", 	/* Bit Position 30 	*/
> +	"Unknown Error Bit 31  "	/* Bit Position 31	*/

Make all the unknown error bits a NULL and use a sprintf in the 
decoder instead.

Similar for the following arrays.
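
A rough userspace sketch of that suggestion, not code from the patch:
the table keeps strings only for the defined bits and the decoder
formats the name of an unknown bit on demand. The helper name
aer_error_name and its buffer-passing signature are assumptions made
for illustration; in the kernel the same idea would use the kernel's
own string formatting helpers.

```c
#include <stdio.h>
#include <stddef.h>

/* Sparse table: only defined bits get a string, the rest stay NULL. */
static const char *const aer_correctable_error_string[32] = {
	[0]  = "Receiver Error",
	[6]  = "Bad TLP",
	[7]  = "Bad DLLP",
	[8]  = "REPLAY_NUM Rollover",
	[12] = "Replay Timer Timeout",
	[13] = "Advisory Non-Fatal",
};

/*
 * Write the name of the lowest set bit of status into buf.
 * Returns buf, or NULL when no bit is set.
 */
static const char *aer_error_name(const char *const *table,
				  unsigned int status,
				  char *buf, size_t len)
{
	int i;

	for (i = 0; i < 32; i++) {
		if (!(status & (1u << i)))
			continue;
		if (table[i])
			snprintf(buf, len, "%s", table[i]);
		else
			snprintf(buf, len, "Unknown Error Bit %d", i);
		return buf;
	}

	return NULL;
}
```

The same decoder can then serve the uncorrectable table as well, since
the table is passed in as a parameter.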
> +void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
> +{
> +	char * errmsg;
> +	int err_layer, agent;
> +
> +	printk(KERN_ERR "+------ PCI-Express Device Error ------+\n");
> +	printk(KERN_ERR "Error Severity\t\t: %s\n",
> +		aer_error_severity_string[info->severity]);
> +
> +	if ( info->status == 0) {
> +                printk(KERN_ERR "PCIE Bus Error type\t: (Unaccessible)\n");

KERN_ERR? This means it will appear on consoles, won't it?
And surely not all these errors are fatal enough to need user attention
immediately, and I bet there will be some devices that report these
errors unnecessarily. I would use a lower log level.
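
One possible shape for this, as a userspace sketch only: choose the log
prefix from the error severity instead of hard-coding KERN_ERR. The
particular mapping below (fatal to error, non-fatal to warning,
corrected to info) is an assumption for illustration and was not agreed
in this thread. The LVL_* macros stand in for the kernel's KERN_ERR,
KERN_WARNING and KERN_INFO prefixes, and aer_log_level is a
hypothetical helper name.

```c
#include <stdio.h>

/* Severity values as defined in aerdrv.h in the patch. */
#define AER_NONFATAL		0
#define AER_FATAL		1
#define AER_CORRECTABLE		2

/* Stand-ins for the 2.6 printk prefixes KERN_ERR/KERN_WARNING/KERN_INFO. */
#define LVL_ERR			"<3>"
#define LVL_WARNING		"<4>"
#define LVL_INFO		"<6>"

/* Pick a log prefix from the AER severity (assumed mapping). */
static const char *aer_log_level(int severity)
{
	switch (severity) {
	case AER_FATAL:
		return LVL_ERR;
	case AER_NONFATAL:
		return LVL_WARNING;
	default:
		return LVL_INFO;	/* corrected errors */
	}
}

/* Build one log line with the severity-dependent prefix. */
static int aer_format_line(char *buf, size_t len, int severity,
			   const char *msg)
{
	return snprintf(buf, len, "%s%s\n", aer_log_level(severity), msg);
}
```

With something like this, corrected errors would stay in the log buffer
without being pushed to the console at the usual default console log
level.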

Also I would suggest you add something in the documentation
on what the messages mean exactly and how to decode them. I'm sure that will be a FAQ.

-Andi


* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12 13:16         ` Arjan van de Ven
@ 2006-07-13  2:08           ` Zhang, Yanmin
  0 siblings, 0 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-13  2:08 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: LKML, linux-pci maillist, Greg KH, Tom Long Nguyen

On Wed, 2006-07-12 at 21:16, Arjan van de Ven wrote:
> > + 
> > +	struct semaphore rpc_sema;	/* 
> > +					 * Semaphore access required to
> > +					 * access, add, remove, or print AER
> > +				 	 * aware devices in this RPC hierarchy
> > +					 */
> 
> 
> Hi, 
> 
> sorry to bug you again..
Any comment is welcome.

>  but is there a reason you're introducing a new
> semaphore and not a mutex? From looking at the code it could/should be a
> mutex...
It could be a mutex, or it could even be deleted, because every root port has its
own rpc and the workqueue could guarantee that only one keventd services the work
at a time.


* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12 16:26         ` Andi Kleen
@ 2006-07-13  2:16           ` Zhang, Yanmin
  0 siblings, 0 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-13  2:16 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-pci maillist, Greg KH, Tom Long Nguyen, LKML, Arjan van de Ven

On Thu, 2006-07-13 at 00:26, Andi Kleen wrote:
> "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> writes:
> 
> > With Arjan's comments, I changed EXPORT_SYMBOL to EXPORT_SYMBOL_GPL.
> > Sorry for flooding your emailbox again. :)
> 
> This means that non-GPL drivers will reimplement these functions
> on their own (which is possible, just ugly). The fallout of them getting that wrong
> might be significant.
Yeah, it looks like a hard decision. 

> 
> I would change it back. _GPL should be only for core services, not
> for generic driver interfaces.
> 
> > --- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_core.c	1970-01-01 08:00:00.000000000 +0800
> > +++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_core.c	2006-07-12 15:47:38.000000000 +0800
> > @@ -0,0 +1,737 @@
> > +/*
> > + * Copyright (C) 2006 Intel
> > + *	Tom Long Nguyen (tom.l.nguyen@intel.com)
> > + *	Zhang Yanmin (yanmin.zhang@intel.com)
> 
> Comment describing what the file does missing. At least one paragraph
> of design rationale would be good 
Ok. I will add it.

> 
> > +
> > +config PCIEAER
> > +	tristate "Root Port Advanced Error Reporting support"
> > +	depends on PCIEPORTBUS 
> > +	default y
> > +	help
> > +	  This enables Root Port Advanced Error Reporting (AER) driver
> > +	  support. Error reporting messages sent to Root Port will be
> > +	  handled by PCI Express AER driver.
> 
> I hope it's clear from the context this is PCI-E specific?
The dependency on PCIEPORTBUS means it is PCI-E specific. I will add
more description.


> 
> > --- linux-2.6.17/drivers/pci/pcie/aer/Makefile	1970-01-01 08:00:00.000000000 +0800
> > +++ linux-2.6.17_aer/drivers/pci/pcie/aer/Makefile	2006-06-22 16:46:29.000000000 +0800
> > @@ -0,0 +1,10 @@
> > +#
> > +# Makefile for PCI-Express Root Port Advanced Error Reporting Driver
> > +#
> > +
> > +obj-$(CONFIG_PCIEAER)		+= aerdriver.o
> > +aerdrv_acpi-$(CONFIG_ACPI)	+= aerdrv_acpi.o
> > +
> > +aerdriver-objs		:= aerdrv_errprint.o aerdrv_core.o aerdrv.o
> > +aerdriver-objs		+= $(aerdrv_acpi-y)
> > +
> > --- linux-2.6.17/drivers/pci/pcie/Kconfig	2006-06-22 16:26:43.000000000 +0800
> > +++ linux-2.6.17_aer/drivers/pci/pcie/Kconfig	2006-06-22 16:46:29.000000000 +0800
> > @@ -34,3 +34,4 @@ config HOTPLUG_PCI_PCIE_POLL_EVENT_MODE
> >  	   
> >  	  When in doubt, say N.
> >  
> > +source "drivers/pci/pcie/aer/Kconfig"
> > --- linux-2.6.17/drivers/pci/pcie/Makefile	2006-06-22 16:26:43.000000000 +0800
> > +++ linux-2.6.17_aer/drivers/pci/pcie/Makefile	2006-06-22 16:46:29.000000000 +0800
> > @@ -5,3 +5,6 @@
> >  pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o
> >  
> >  obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
> > +
> > +# Build PCI Express AER if needed
> > +obj-$(CONFIG_PCIEAER)		+= aer/
> > --- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_errprint.c	1970-01-01 08:00:00.000000000 +0800
> > +++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_errprint.c	2006-06-22 16:46:29.000000000 +0800
> > @@ -0,0 +1,216 @@
> > +/*
> > + * Copyright (C) 2006 Intel
> > + *	Tom Long Nguyen (tom.l.nguyen@intel.com)
> > + *	Zhang Yanmin (yanmin.zhang@intel.com)
> > + *
> 
> Comment what the code does missing.
Ok.

> 
> At least one paragraph of design rationale would be good.
> 
> > +	"Unknown Error Bit 22  ",	/* Bit Position 22	*/
> > +	"Unknown Error Bit 23  ", 	/* Bit Position 23	*/
> > +	"Unknown Error Bit 24  ",	/* Bit Position 24	*/
> > +	"Unknown Error Bit 25  ", 	/* Bit Position 25	*/
> > +	"Unknown Error Bit 26  ", 	/* Bit Position 26 	*/
> > +	"Unknown Error Bit 27  ",	/* Bit Position 27	*/
> > +	"Unknown Error Bit 28  ",	/* Bit Position 28	*/
> > +	"Unknown Error Bit 29  ", 	/* Bit Position 29	*/
> > +	"Unknown Error Bit 30  ", 	/* Bit Position 30 	*/
> > +	"Unknown Error Bit 31  "	/* Bit Position 31	*/
> 
> Make all the unknown error bits a NULL and use a sprintf in the 
> decoder instead.
I will try.
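
A minimal user-space sketch of the suggestion above, assuming a sparse
table where unnamed bits stay NULL and are formatted on demand. The table
contents and the helper name err_bit_name are hypothetical and only
illustrate the idea; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table: only two bits are named, every other slot is NULL. */
static const char *const err_name[32] = {
	[4]  = "Data Link Protocol",
	[20] = "Unsupported Request",
};

/*
 * Write a printable name for error bit @bit into @buf and return @buf.
 * Bits without a table entry are formatted with snprintf instead of
 * being stored as one literal string per bit.
 */
static const char *err_bit_name(unsigned int bit, char *buf, size_t len)
{
	assert(buf && len > 0);

	if (bit < 32 && err_name[bit])
		snprintf(buf, len, "%s", err_name[bit]);
	else
		snprintf(buf, len, "Unknown Error Bit %u", bit);
	return buf;
}
```

With this layout, adding a newly defined bit is a one-line change to the
table, and the "Unknown Error Bit" strings no longer need to be listed.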

> 
> Similar for the following arrays.
> > +void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
> > +{
> > +	char * errmsg;
> > +	int err_layer, agent;
> > +
> > +	printk(KERN_ERR "+------ PCI-Express Device Error ------+\n");
> > +	printk(KERN_ERR "Error Severity\t\t: %s\n",
> > +		aer_error_severity_string[info->severity]);
> > +
> > +	if ( info->status == 0) {
> > +                printk(KERN_ERR "PCIE Bus Error type\t: (Unaccessible)\n");
> 
> KERN_ERR? THis means it will appear on consoles, won't it?
> And surely not all these errors are fatal enough to need user attention
> immediately and I bet there will be some devices who report these
> errors unnecessarily. I would use a lower log level.
It should be more elaborated. I will change it.

> 
> Also I would suggest you add something in the documentation
> on what the messages mean exactly and how to decode them. I'm sure that will be a FAQ.
I will add more description, but we can't expect it to be as detailed as
the PCI-E specs.

I really appreciate your comments.

Yanmin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/5] PCI-Express AER implemetation: aer howto document
  2006-07-12  7:10 [PATCH 1/5] PCI-Express AER implemetation: aer howto document Zhang, Yanmin
  2006-07-12  7:16 ` [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h Zhang, Yanmin
@ 2006-07-14  5:25 ` Zhang, Yanmin
  2006-07-14  5:27   ` [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h Zhang, Yanmin
                     ` (2 more replies)
  1 sibling, 3 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-14  5:25 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

Here are the updated patches. Thanks to Greg, Andi Kleen and Arjan for their comments.

From: Zhang, Yanmin <yanmin.zhang@intel.com>

PCI-Express AER (Advanced Error Reporting) provides more robust error reporting.
This series of patches enables kernel support for AER.

The initial patches were written by Tom Long Nguyen. I ported them to the kernel
2.6.17. Many thanks to Rajesh Shah and Narayanan Chandramouli for their great
review comments and testing help.

Patch 1 consists of the pcieaer-howto.txt document.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> 

---

--- linux-2.6.17/Documentation/pcieaer-howto.txt	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/Documentation/pcieaer-howto.txt	2006-07-14 11:09:37.000000000 +0800
@@ -0,0 +1,228 @@
+   The PCI Express Advanced Error Reporting Driver Guide HOWTO
+		T. Long Nguyen	<tom.l.nguyen@intel.com>
+		Yanmin Zhang	<yanmin.zhang@intel.com>
+				03/30/2006
+
+1. About this guide
+
+This guide describes the basics of the PCI Express Advanced Error
+Reporting (AER) driver and provides information on how to enable
+the drivers of endpoint devices to conform with the PCI Express AER
+driver.
+
+2. Copyright © Intel Corporation 2006.
+
+3. What is the PCI Express AER Driver?
+
+PCI Express error signaling can occur on the PCI Express link itself
+or on behalf of transactions initiated on the link. PCI Express
+defines two error reporting paradigms: the baseline capability and
+the Advanced Error Reporting capability. The baseline capability is
+required of all PCI Express components providing a minimum defined
+set of error reporting requirements. Advanced Error Reporting
+capability is implemented with a PCI Express advanced error reporting
+extended capability structure providing more robust error reporting.
+
+The PCI Express AER driver provides the infrastructure to support PCI
+Express Advanced Error Reporting capability. The PCI Express AER
+driver provides three basic functions:
+
+-	Gathers comprehensive error information when errors
+	occur.
+-	Performs error recovery actions.
+-	Reports errors to the users.
+
+The AER driver attaches only to root ports that support the
+PCI-Express AER capability.
+
+4. Why Use the PCI Express AER Driver?
+
+In a PCI Express-aware system, when AER is enabled, a PCI Express
+device will automatically send an error message to the PCIE root
+port above it when the device captures an error. The Root Port,
+upon receiving an error reporting message, internally processes
+and logs the error message in its PCI Express capability structure.
+Error information being logged includes storing the error reporting
+agent's requestor ID into the Error Source Identification Registers
+and setting the error bits of the Root Error Status Register
+accordingly. If AER error reporting is enabled in Root Error Command
+Register, the Root Port generates an interrupt if an error is
+detected.
+
+All kernels released before 2.6.18 have no root service driver
+available to manage the PCI Express advanced error reporting
+extended capability structure. BIOS could provide the baseline
+capability, but it is unable to coordinate with the downstream device
+drivers to determine more precisely which error and what severity,
+and unable to reset the downstream links while handling fatal error
+recovery.
+
+Solving these BIOS issues requires a PCI Express AER Root driver
+that provides:
+
+-	An infrastructure for the OS and application to determine if a
+	fatal error is fatal to the system, OS, or application, increasing
+	uptime.
+
+-	An infrastructure to notify the downstream device drivers if errors
+	occurred.
+
+-	An infrastructure to dynamically perform error recovery actions
+	based on configuration options.
+
+- 	Platform-specific independence.
+
+5. Including the PCI Express AER Root Driver into the Linux Kernel
+
+The PCI Express AER Root driver is a Root Port service driver attached
+to the PCI Express Port Bus driver. Its service must be registered
+with the PCI Express Port Bus driver and users are required to include
+the PCI Express Port Bus driver in the kernel (refer to
+PCIEBUS-HOWTO.txt). Once the kernel config CONFIG_PCIEPORTBUS is
+included, the PCI Express AER Root driver is automatically included
+as a kernel driver by default (CONFIG_PCIEAER = y). Users may disable
+the PCI Express AER driver by clearing CONFIG_PCIEAER.
+
+Note that some systems have AER support in the BIOS. Enabling the
+AER Root driver while the BIOS also handles AER may result in
+unpredictable behavior. To avoid this conflict, a successful load of
+the AER Root driver requires ACPI _OSC support in the BIOS, which
+allows the AER Root driver to request native control of AER. See the
+PCI FW 3.0 Specification for details regarding _OSC usage. Currently,
+much firmware does not provide _OSC support even though the platform
+uses PCI-Express. To support such firmware, forceload, a module
+parameter of type bool, lets AER initialization continue even when
+the firmware has no _OSC support. forceload=n by default.
+
+6. Enabling AER Aware Support in PCI Express Device Driver
+
+To enable AER aware support requires a software driver to configure
+the AER capability structure within its device and to provide its
+error-recovery callbacks as described below.
+
+6.1. Configuring the AER capability structure
+
+PCI Express errors are classified into two types: correctable errors
+and uncorrectable errors. This classification is based on the impacts
+of those errors, which may result in function failure or in degraded
+performance.
+
+Correctable errors pose no impacts on the functionality of the
+interface. The PCI Express protocol can recover without any software
+intervention or any loss of data. These errors are detected and
+corrected by hardware. Unlike correctable errors, uncorrectable
+errors impact functionality of the interface. Uncorrectable errors
+can cause a particular transaction or a particular PCI Express link
+to be unreliable. Depending on those error conditions, uncorrectable
+errors are further classified into fatal errors and non-fatal errors.
+Non-fatal errors cause the particular transaction to be unreliable,
+but the PCI Express link itself is fully functional. Fatal errors, on
+the other hand, cause the link to be unreliable.
+
+AER-aware drivers of PCI Express components need to change the device
+control registers to enable AER. They can also change AER registers,
+including mask and severity registers.
+
+Note that the errors as described above are related to the PCI Express
+hierarchy and links. These errors do not include any device specific
+errors because device specific errors will still get sent directly to
+the device driver.
+
+6.2. Provide PCI error-recovery callbacks
+
+The PCI Express AER Root driver uses callbacks to coordinate with
+downstream device drivers associated with a hierarchy in question
+when performing error recovery actions. AER driver follows the rules
+defined in pci-error-recovery.txt.
+
+Note that correctable errors pose no impacts on the functionality of
+the interface. The PCI Express protocol can recover without any
+software intervention or any loss of data. These errors do not
+require any recovery actions. The AER driver clears the device's
+correctable error status register accordingly and logs these errors.
+
+If an error message indicates a non-fatal error, performing link reset
+at upstream is not required. The AER driver calls error_detected(dev,
+pci_channel_io_normal) on all drivers within the hierarchy in
+question. A driver may return PCI_ERS_RESULT_CAN_RECOVER,
+PCI_ERS_RESULT_DISCONNECT, or PCI_ERS_RESULT_NEED_RESET, depending on
+whether it can recover; the AER driver may then call mmio_enabled next.
+ 
+If an error message indicates a fatal error, the kernel broadcasts
+error_detected(dev, pci_channel_io_frozen) to all drivers within
+the hierarchy in question. Then a link reset at upstream is
+necessary. As different kinds of devices might use different approaches
+to reset the link, the AER port service driver is required to provide
+the function to reset the link. First, the kernel checks whether the
+upstream component has an AER driver. If it has, the kernel uses the
+reset_link callback of that driver. If the upstream component has no
+AER driver and the port is a downstream port, the kernel uses the AER
+driver of the root port that reported the AER error. Upstream ports
+should provide their own AER service drivers with a reset_link
+function. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER and
+reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
+to mmio_enabled.
+
+6.2.5 helper functions
+
+6.2.5.1 int pci_find_aer_capability(struct pci_dev *dev);
+pci_find_aer_capability locates the PCI-Express AER capability
+in the device configuration space. If the device doesn't support
+PCI-Express AER, the function returns 0.
+
+6.2.5.2 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
+pci_enable_pcie_error_reporting enables the device to send error
+messages to the root port when an error is detected. Note that devices
+don't enable error reporting by default, so a device driver needs to
+call this function to enable it.
+
+6.2.5.3 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+pci_disable_pcie_error_reporting stops the device from sending error
+messages to the root port when an error is detected.
+
+6.2.5.4 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+pci_cleanup_aer_uncorrect_error_status clears the uncorrectable
+error status register.
+
+7. AER error output
+
+When an AER error is reported, the kernel calls printk to output
+error messages.
+
+An example is shown below.
++------ PCI-Express Device Error -----+
+Error Severity          : Uncorrected (Fatal)
+PCIE Bus Error type     : Transaction Layer
+Unsupported Request     : First
+Requester ID            : 0500
+VendorID=8086h, DeviceID=0329h, Bus=05h, Device=00h, Function=00h
+TLB Header:
+04000001 00200a03 05010000 00050100
+
+In the example, 'Requester ID' is the ID of the device that sent
+the error message to the root port. Please refer to the PCI Express
+specs for the other fields.
+
+8. Frequently Asked Questions
+
+Q: What happens if a PCI Express device driver does not provide an
+error recovery handler?
+
+A: The devices bound to the driver won't be recovered. If the
+error is fatal, the kernel prints warning messages. Please refer
+to section 6 for more information.
+
+Q: How does this infrastructure deal with a driver that is not PCI
+Express aware?
+
+A: This infrastructure calls the error callback functions of the
+driver when an error happens. But if the driver is not aware of
+PCI Express, the device might not report its own errors to the root
+port.
+
+Q: What modifications will that driver need to make it compatible
+with the PCI Express AER Root driver?
+
+A: It can call the helper functions to enable AER in devices and
+clean up the uncorrectable status register. Please refer to section 6.2.5.
+
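
As a worked example for section 7 of the howto, the 'Requester ID' value
0500 in the sample output can be split into the Bus/Device/Function
numbers printed on the following line. This is a minimal user-space
sketch, assuming the usual PCI layout of bus in bits 15:8, device in
bits 7:3 and function in bits 2:0 (the same layout find_device_iter
relies on when it compares against (bus << 8) | devfn). The struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the three fields of a requester ID. */
struct pcie_bdf {
	uint8_t bus;
	uint8_t dev;
	uint8_t fn;
};

/* Split a 16-bit requester ID into bus, device and function numbers. */
static void decode_requester_id(uint16_t id, struct pcie_bdf *out)
{
	assert(out);

	out->bus = id >> 8;		/* bits 15:8 */
	out->dev = (id >> 3) & 0x1f;	/* bits 7:3  */
	out->fn  = id & 0x07;		/* bits 2:0  */
}
```

For the ID 0x0500 from the example this gives bus 05h, device 00h and
function 00h, which matches the "Bus=05h, Device=00h, Function=00h" line.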

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h
  2006-07-14  5:25 ` [PATCH 1/5] PCI-Express AER implemetation: aer howto document Zhang, Yanmin
@ 2006-07-14  5:27   ` Zhang, Yanmin
  2006-07-14  5:28     ` [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type Zhang, Yanmin
  2006-07-14 12:40   ` [PATCH 1/5] PCI-Express AER implemetation: aer howto document Andi Kleen
  2006-07-24 20:48   ` Linas Vepstas
  2 siblings, 1 reply; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-14  5:27 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

Although Greg already accepted the second patch into his testing tree,
I am resending it to keep the patch series complete.

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 2 adds new defines of PCI-Express AER registers
and their bits into file include/linux/pci_regs.h.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/include/linux/pci_regs.h	2006-06-22 16:26:31.000000000 +0800
+++ linux-2.6.17_aer/include/linux/pci_regs.h	2006-06-22 16:46:29.000000000 +0800
@@ -421,7 +421,23 @@
 #define  PCI_ERR_CAP_ECRC_CHKE	0x00000100	/* ECRC Check Enable */
 #define PCI_ERR_HEADER_LOG	28	/* Header Log Register (16 bytes) */
 #define PCI_ERR_ROOT_COMMAND	44	/* Root Error Command */
+/* Correctable Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_COR_EN		0x00000001
+/* Non-fatal Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_NONFATAL_EN	0x00000002
+/* Fatal Err Reporting Enable */
+#define PCI_ERR_ROOT_CMD_FATAL_EN	0x00000004
 #define PCI_ERR_ROOT_STATUS	48
+#define PCI_ERR_ROOT_COR_RCV		0x00000001	/* ERR_COR Received */
+/* Multi ERR_COR Received */
+#define PCI_ERR_ROOT_MULTI_COR_RCV	0x00000002
+/* ERR_FATAL/NONFATAL Received */
+#define PCI_ERR_ROOT_UNCOR_RCV		0x00000004
+/* Multi ERR_FATAL/NONFATAL Received */
+#define PCI_ERR_ROOT_MULTI_UNCOR_RCV	0x00000008
+#define PCI_ERR_ROOT_FIRST_FATAL	0x00000010	/* First Fatal */
+#define PCI_ERR_ROOT_NONFATAL_RCV	0x00000020	/* Non-Fatal Received */
+#define PCI_ERR_ROOT_FATAL_RCV		0x00000040	/* Fatal Received */
 #define PCI_ERR_ROOT_COR_SRC	52
 #define PCI_ERR_ROOT_SRC	54
 

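To illustrate how the Root Error Status bits defined in the patch above
might be consumed, here is a minimal user-space sketch. The bit values
are copied from the patch; the summary struct and the function name are
hypothetical and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Root Error Status bits, values copied from the pci_regs.h patch. */
#define PCI_ERR_ROOT_COR_RCV		0x00000001
#define PCI_ERR_ROOT_MULTI_COR_RCV	0x00000002
#define PCI_ERR_ROOT_UNCOR_RCV		0x00000004
#define PCI_ERR_ROOT_MULTI_UNCOR_RCV	0x00000008
#define PCI_ERR_ROOT_FATAL_RCV		0x00000040

/* Hypothetical summary of what a root port has latched. */
struct root_err_summary {
	int correctable;	/* at least one ERR_COR received */
	int uncorrectable;	/* at least one ERR_FATAL/NONFATAL received */
	int fatal;		/* a fatal message was among them */
	int multiple;		/* more than one message of a class arrived */
};

/* Translate a raw Root Error Status value into the summary flags. */
static void summarize_root_status(uint32_t status,
				  struct root_err_summary *out)
{
	assert(out);

	out->correctable   = !!(status & PCI_ERR_ROOT_COR_RCV);
	out->uncorrectable = !!(status & PCI_ERR_ROOT_UNCOR_RCV);
	out->fatal         = !!(status & PCI_ERR_ROOT_FATAL_RCV);
	out->multiple      = !!(status & (PCI_ERR_ROOT_MULTI_COR_RCV |
					  PCI_ERR_ROOT_MULTI_UNCOR_RCV));
}
```

A caller would read the register at offset PCI_ERR_ROOT_STATUS of the
AER capability and pass the value in; the config-space read itself is
left out of the sketch.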
^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type
  2006-07-14  5:27   ` [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h Zhang, Yanmin
@ 2006-07-14  5:28     ` Zhang, Yanmin
  2006-07-14  5:30       ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
  0 siblings, 1 reply; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-14  5:28 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 3 exports pcie_port_bus_type. The AER driver can be compiled
as a module, and it needs access to pcie_port_bus_type.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/portdrv_bus.c	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/portdrv_bus.c	2006-07-13 09:21:38.000000000 +0800
@@ -76,3 +76,6 @@ static int pcie_port_bus_resume(struct d
 		driver->resume(pciedev);
 	return 0;
 }
+
+EXPORT_SYMBOL(pcie_port_bus_type);
+

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-14  5:28     ` [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type Zhang, Yanmin
@ 2006-07-14  5:30       ` Zhang, Yanmin
  2006-07-14  5:32         ` [PATCH 4/5] PCI-Express AER implemetation: pcie_portdrv error handler Zhang, Yanmin
  0 siblings, 1 reply; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-14  5:30 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 4 implements the core part of PCI-Express AER and the aerdrv
port service driver.

When a root port service device is probed, aerdrv calls
request_irq to register an irq handler for the AER error interrupt.

When a device sends a PCI-Express error message to the root port,
the root port triggers an interrupt, by either MSI or IO-APIC,
and the kernel runs the irq handler. The handler collects the root
error status register and schedules a work item. The work item calls
the core part to process the error based on its type
(correctable/non-fatal/fatal).

For correctable errors, the patch just clears the correctable
error status register of the device.

For non-fatal errors, the patch follows the generic PCI error handler
rules and calls the error callback functions of the endpoint's driver. If
the device is a bridge, the patch broadcasts the error to the
downstream devices.

For fatal errors, the patch resets the pci-express link and
follows the generic PCI error handler rules to call the error callback
functions of the endpoint's driver. If the device is a bridge, the patch
broadcasts the error to the downstream devices.
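
The three cases above can be summarized as a small decision table. The
following is only a user-space model of that description, with
hypothetical type and function names; it does not reproduce the code in
aerdrv_core.c. The frozen/normal channel states follow the howto in
patch 1.

```c
#include <assert.h>

enum aer_severity { AER_CORRECTABLE, AER_NONFATAL, AER_FATAL };
enum channel_state { CHANNEL_IO_NORMAL, CHANNEL_IO_FROZEN };

/* Hypothetical description of the actions taken for one error. */
struct recovery_plan {
	int clear_correctable_status;	/* only clear the status register */
	int call_error_detected;	/* run the drivers' error callbacks */
	enum channel_state state;	/* state passed to error_detected */
	int reset_link;			/* reset the link at upstream */
};

/* Fill @plan with the actions the description assigns to @sev. */
static void plan_recovery(enum aer_severity sev, struct recovery_plan *plan)
{
	assert(plan);

	plan->clear_correctable_status = 0;
	plan->call_error_detected = 0;
	plan->state = CHANNEL_IO_NORMAL;
	plan->reset_link = 0;

	switch (sev) {
	case AER_CORRECTABLE:
		/* Hardware already recovered; just clear the status. */
		plan->clear_correctable_status = 1;
		break;
	case AER_NONFATAL:
		/* Link is still functional, so no link reset is needed. */
		plan->call_error_detected = 1;
		break;
	case AER_FATAL:
		/* Link is unreliable: freeze I/O and reset the link. */
		plan->call_error_detected = 1;
		plan->state = CHANNEL_IO_FROZEN;
		plan->reset_link = 1;
		break;
	}
}
```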

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_acpi.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_acpi.c	2006-07-14 11:02:09.000000000 +0800
@@ -0,0 +1,68 @@
+/*
+ * Access ACPI _OSC method
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+/**
+ * aer_osc_setup - run ACPI _OSC method
+ * @dev: pci_dev of the port the AER service driver is bound to
+ *
+ * Return: zero on success, nonzero otherwise.
+ *
+ * Invoked when PCIE bus loads AER service driver. Avoiding a conflict with
+ * BIOS AER support requires BIOS to yield AER control to OS native driver.
+ **/
+int aer_osc_setup(struct pci_dev *dev)
+{
+	int retval = OSC_METHOD_RUN_SUCCESS;
+	acpi_status status;
+	acpi_handle handle = DEVICE_ACPI_HANDLE(&dev->dev);
+	struct pci_dev *pdev = dev;
+	struct pci_bus *parent;
+
+	while (!handle) {
+		if (!pdev || !pdev->bus->parent)
+			break;
+		parent = pdev->bus->parent;
+		if (!parent->self)
+			/* Parent must be a host bridge */
+			handle = acpi_get_pci_rootbridge_handle(
+					pci_domain_nr(parent),
+					parent->number);
+		else
+			handle = DEVICE_ACPI_HANDLE(
+					&(parent->self->dev));
+		pdev = parent->self;
+	}
+
+	if (!handle)
+		return OSC_METHOD_NOT_SUPPORTED;
+
+	pci_osc_support_set(OSC_EXT_PCI_CONFIG_SUPPORT);
+	status = pci_osc_control_set(handle, OSC_PCI_EXPRESS_AER_CONTROL |
+		OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL);
+	if (ACPI_FAILURE(status)) {
+		if (status == AE_SUPPORT)
+			retval = OSC_METHOD_NOT_SUPPORTED;
+		else
+			retval = OSC_METHOD_RUN_FAILURE;
+	}
+
+	return retval;
+}
+
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_core.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_core.c	2006-07-14 13:05:28.000000000 +0800
@@ -0,0 +1,734 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv_core.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * This file implements the core part of PCI-Express AER. When a pci-express
+ * error is delivered, an error message will be collected and printed to
+ * console, then, an error recovery procedure will be executed by following
+ * the pci error recovery rules.
+ * 
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+static int forceload;
+module_param(forceload, bool, 0);
+
+#define PCI_CFG_SPACE_SIZE	(0x100)
+int pci_find_aer_capability(struct pci_dev *dev)
+{
+	int pos;
+	u32 reg32 = 0;
+
+	/* Check if it's a pci-express device */
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return 0;
+
+	/* Check if it supports pci-express AER */
+	pos = PCI_CFG_SPACE_SIZE;
+	while (pos) {
+		if (pci_read_config_dword(dev, pos, &reg32))
+			return 0;
+
+		/* some broken boards return ~0 */
+		if (reg32 == 0xffffffff)
+			return 0;
+
+		if (PCI_EXT_CAP_ID(reg32) == PCI_EXT_CAP_ID_ERR)
+			break;
+
+		pos = reg32 >> 20;
+	}
+
+	return pos;
+}
+
+int pci_disable_pcie_error_reporting(struct pci_dev *dev)
+{
+	u16 reg16 = 0;
+	int pos;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 & ~(PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE);
+	pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+			reg16);
+	return 0;
+}
+
+int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, mask;
+
+	pos = pci_find_aer_capability(dev);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+	if (dev->error_state == pci_channel_io_normal)
+		status &= ~mask; /* Clear corresponding nonfatal bits */
+	else
+		status &= mask; /* Clear corresponding fatal bits */
+	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+
+	return 0;
+}
+
+static int find_device_iter(struct device *device, void *data)
+{
+	struct pci_dev *dev;
+	u16 id = *(unsigned long *)data;
+	u8 secondary, subordinate, d_bus = id >> 8;
+
+	if (device->bus == &pci_bus_type) {
+		dev = to_pci_dev(device);
+		if (id == ((dev->bus->number << 8) | dev->devfn)) {
+			/*
+			 * Device ID match
+			 */
+			*(unsigned long*)data = (unsigned long)device;
+			return 1;
+		}
+
+		/* 
+		 * If device is P2P, check if it is an upstream?
+		 */
+		if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+			pci_read_config_byte(dev, PCI_SECONDARY_BUS,
+				&secondary);
+			pci_read_config_byte(dev, PCI_SUBORDINATE_BUS,
+				&subordinate);
+			if (d_bus >= secondary && d_bus <= subordinate) {
+				*(unsigned long*)data = (unsigned long)device;
+				return 1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * find_source_device - search through device hierarchy for source device
+ * @parent: pointer to Root Port pci_dev data structure
+ * @id: device ID of the agent that sent an error message to this Root Port
+ *
+ * Invoked when error is detected at the Root Port.
+ **/
+static struct device* find_source_device(struct pci_dev *parent, u16 id)
+{
+	struct pci_dev *dev = parent;
+	struct device *device;
+	unsigned long device_addr;
+	int status;
+
+	/* Is Root Port an agent that sends error message? */
+	if (id == ((dev->bus->number << 8) | dev->devfn)) 
+		return &dev->dev;
+
+	do {
+		device_addr = id;
+		if ((status = device_for_each_child(&dev->dev,
+			&device_addr, find_device_iter))) {
+			device = (struct device*)device_addr;
+			dev = to_pci_dev(device);
+			if (id == ((dev->bus->number << 8) | dev->devfn))
+				return device;
+		}
+	} while (status);
+
+	return NULL;
+}
+
+static void report_error_detected(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	dev->error_state = result_data->state;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->error_detected) {
+		if (result_data->state == pci_channel_io_frozen &&
+			!(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
+			/*
+			 * In case of fatal recovery, if one of the
+			 * downstream devices has no driver, we might
+			 * be unable to recover because a later insmod
+			 * of a driver for this device is unaware of
+			 * its hw state.
+			 */
+			printk(KERN_DEBUG "Device ID[%s] has %s\n",
+					dev->dev.bus_id, (dev->driver) ?
+					"no AER-aware driver" : "no driver");
+		}
+		return;
+	}
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->error_detected(dev, result_data->state);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_mmio_enabled(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->mmio_enabled)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->mmio_enabled(dev);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_slot_reset(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->slot_reset)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->slot_reset(dev);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_resume(struct pci_dev *dev, void *data)
+{
+	struct pci_error_handlers *err_handler;
+
+	dev->error_state = pci_channel_io_normal;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->resume)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	err_handler->resume(dev);
+	return;
+}
+
+/**
+ * broadcast_error_message - handle message broadcast to downstream drivers
+ * @dev: pointer to from where in a hierarchy message is broadcasted down
+ * @state: error state
+ * @error_mesg: message to print
+ * @cb: callback to be broadcasted
+ *
+ * Invoked during the error recovery process. The error severity is broadcast
+ * to all downstream drivers in the hierarchy in question.
+ **/
+static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
+	enum pci_channel_state state,
+	char *error_mesg,
+	void (*cb)(struct pci_dev *, void *))
+{
+	struct aer_broadcast_data result_data;
+
+	printk(KERN_DEBUG "Broadcast %s message\n", error_mesg);
+	result_data.state = state;
+	if (cb == report_error_detected)
+		result_data.result = PCI_ERS_RESULT_CAN_RECOVER;
+	else
+		result_data.result = PCI_ERS_RESULT_RECOVERED;
+
+	if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+		/*
+		 * If the error is reported by a bridge, we think this error
+		 * is related to the downstream link of the bridge, so we
+		 * do error recovery on all subordinates of the bridge instead
+		 * of the bridge and clear the error status of the bridge.
+		 */
+		if (cb == report_error_detected)
+			dev->error_state = state;
+		pci_walk_bus(dev->subordinate, cb, &result_data);
+		if (cb == report_resume) {
+			pci_cleanup_aer_uncorrect_error_status(dev);
+			dev->error_state = pci_channel_io_normal;
+		}
+	}
+	else {
+		/*
+		 * If the error is reported by an end point, we think this
+		 * error is related to the upstream link of the end point.
+		 */
+		pci_walk_bus(dev->bus, cb, &result_data);
+	}
+
+	return result_data.result;
+}
+
+struct find_aer_service_data {
+	struct pcie_port_service_driver *aer_driver;
+	int is_downstream;
+};
+
+static int find_aer_service_iter(struct device *device, void *data)
+{
+	struct device_driver *driver;
+	struct pcie_port_service_driver *service_driver;
+	struct pcie_device *pcie_dev;
+	struct find_aer_service_data *result;
+
+	result = (struct find_aer_service_data *) data;
+
+	if (device->bus == &pcie_port_bus_type) {
+		pcie_dev = to_pcie_device(device);
+		if (pcie_dev->id.port_type == PCIE_SW_DOWNSTREAM_PORT)
+			result->is_downstream = 1;
+
+		driver = device->driver;
+		if (driver) {
+			service_driver = to_service_driver(driver);
+			if (service_driver->id_table->service_type ==
+					PCIE_PORT_SERVICE_AER) {
+				result->aer_driver = service_driver;
+				return 1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static void find_aer_service(struct pci_dev *dev,
+		struct find_aer_service_data *data)
+{
+	device_for_each_child(&dev->dev, data, find_aer_service_iter);
+}
+
+static pci_ers_result_t reset_link(struct pcie_device *aerdev,
+		struct pci_dev *dev)
+{
+	struct pci_dev *udev;
+	pci_ers_result_t status;
+	struct find_aer_service_data data;
+
+	if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)
+		udev = dev;
+	else
+		udev = dev->bus->self;
+
+	data.is_downstream = 0;
+	data.aer_driver = NULL;
+	find_aer_service(udev, &data);
+
+	/*
+	 * Use the AER driver of the error agent first.
+	 * If it has no AER driver, use the Root Port's.
+	 */
+	if (!data.aer_driver || !data.aer_driver->reset_link) {
+		if (data.is_downstream &&
+			aerdev->device.driver &&
+			to_service_driver(aerdev->device.driver)->reset_link) {
+			data.aer_driver =
+				to_service_driver(aerdev->device.driver);
+		} else {
+			printk(KERN_DEBUG "No link-reset support for Device ID"
+				"[%s]\n",
+				dev->dev.bus_id);
+			return PCI_ERS_RESULT_DISCONNECT;
+		}
+	}
+
+	status = data.aer_driver->reset_link(udev);
+	if (status != PCI_ERS_RESULT_RECOVERED) {
+		printk(KERN_DEBUG "Link reset at upstream Device ID"
+			"[%s] failed\n",
+			udev->dev.bus_id);
+		return PCI_ERS_RESULT_DISCONNECT;
+	}
+
+	return status;
+}
+
+/**
+ * do_recovery - handle nonfatal/fatal error recovery process
+ * @aerdev: pointer to a pcie_device data structure of root port
+ * @dev: pointer to a pci_dev data structure of agent detecting an error
+ * @severity: error severity type
+ *
+ * Invoked when an error is nonfatal/fatal. Broadcasts the error_detected
+ * message to all downstream drivers within the hierarchy in question,
+ * runs the remaining recovery steps, and returns the resulting status.
+ **/
+static pci_ers_result_t do_recovery(struct pcie_device *aerdev,
+		struct pci_dev *dev,
+		int severity)
+{
+	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
+	enum pci_channel_state state;
+
+	if (severity == AER_FATAL)
+		state = pci_channel_io_frozen;
+	else
+		state = pci_channel_io_normal;
+
+	status = broadcast_error_message(dev,
+			state,
+			"error_detected",
+			report_error_detected);
+
+	if (severity == AER_FATAL) {
+		result = reset_link(aerdev, dev);
+		if (result != PCI_ERS_RESULT_RECOVERED) {
+			/* TODO: Should panic here? */
+			return result;
+		}
+	}
+
+	if (status == PCI_ERS_RESULT_CAN_RECOVER)
+		status = broadcast_error_message(dev,
+				state,
+				"mmio_enabled",
+				report_mmio_enabled);
+
+	if (status == PCI_ERS_RESULT_NEED_RESET) {
+		/*
+		 * TODO: Should call platform-specific
+		 * functions to reset slot before calling
+		 * drivers' slot_reset callbacks?
+		 */
+		status = broadcast_error_message(dev,
+				state,
+				"slot_reset",
+				report_slot_reset);
+	}
+
+	if (status == PCI_ERS_RESULT_RECOVERED)
+		broadcast_error_message(dev,
+				state,
+				"resume",
+				report_resume);
+
+	return status;
+}
+
+/**
+ * handle_error_source - handle logging error into an event log
+ * @aerdev: pointer to pcie_device data structure of the root port
+ * @dev: pointer to pci_dev data structure of error source device
+ * @info: comprehensive error information
+ *
+ * Invoked when an error is detected by the Root Port.
+ **/
+static void handle_error_source(struct pcie_device * aerdev,
+	struct pci_dev *dev,
+	struct aer_err_info info)
+{
+	pci_ers_result_t status = 0;
+	int pos;
+
+	if (info.severity == AER_CORRECTABLE) {
+		/*
+		 * Correctable error does not need software intervention.
+		 * No need to go through error recovery process.
+		 */
+		pos = pci_find_aer_capability(dev);
+		if (pos)
+			pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+					info.status);
+	} else {
+		status = do_recovery(aerdev, dev, info.severity);
+		if (status == PCI_ERS_RESULT_RECOVERED) {
+			printk(KERN_DEBUG "AER driver successfully recovered\n");
+		} else {
+			/* TODO: Should kernel panic here? */
+			printk(KERN_DEBUG "AER driver didn't recover\n");
+		}
+	}
+}
+
+/**
+ * aer_enable_rootport - enable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus loads AER service driver.
+ **/
+void aer_enable_rootport(struct aer_rpc *rpc)
+{
+	struct pci_dev *pdev = rpc->rpd->port;
+	int pos, aer_pos;
+	u16 reg16;
+	u32 reg32;
+
+	pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+	/* Clear PCIE Capability's Device Status */
+	pci_read_config_word(pdev, pos+PCI_EXP_DEVSTA, &reg16);
+	pci_write_config_word(pdev, pos+PCI_EXP_DEVSTA, reg16);
+
+	/* Disable system error generation in response to error messages */
+	pci_read_config_word(pdev, pos + PCI_EXP_RTCTL, &reg16);
+	reg16 &= ~(SYSTEM_ERROR_INTR_ON_MESG_MASK);
+	pci_write_config_word(pdev, pos + PCI_EXP_RTCTL, reg16);
+
+	aer_pos = pci_find_aer_capability(pdev);
+	/* Clear error status */
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, reg32);
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, reg32);
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
+
+	/* Enable the Root Port itself to report errors */
+	pci_read_config_word(pdev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 |
+		PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE;
+	pci_write_config_word(pdev, pos+PCI_EXP_DEVCTL,
+		reg16);
+
+	/* Enable Root Port's interrupt in response to error messages */
+	pci_write_config_dword(pdev,
+		aer_pos + PCI_ERR_ROOT_COMMAND,
+		ROOT_PORT_INTR_ON_MESG_MASK);
+}
+
+/**
+ * disable_root_aer - disable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus unloads AER service driver.
+ **/
+static void disable_root_aer(struct aer_rpc *rpc)
+{
+	struct pci_dev *pdev = rpc->rpd->port;
+	u32 reg32;
+	int pos;
+
+	pos = pci_find_aer_capability(pdev);
+	/* Disable Root's interrupt in response to error messages */
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+	/* Clear Root's error status reg */
+	pci_read_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, reg32);
+}
+
+/**
+ * get_e_source - retrieve an error source
+ * @rpc: pointer to the root port which holds an error
+ *
+ * Invoked by DPC handler to consume an error.
+ **/
+static struct aer_err_source* get_e_source(struct aer_rpc *rpc)
+{
+	struct aer_err_source *e_source;
+	unsigned long flags;
+
+	/* Lock access to Root error producer/consumer index */
+	spin_lock_irqsave(&rpc->e_lock, flags);
+	if (rpc->prod_idx == rpc->cons_idx) {
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return NULL;
+	}
+	e_source = &rpc->e_sources[rpc->cons_idx];
+	rpc->cons_idx++;
+	if (rpc->cons_idx == AER_ERROR_SOURCES_MAX)
+		rpc->cons_idx = 0;
+	spin_unlock_irqrestore(&rpc->e_lock, flags);
+
+	return e_source;
+}
+
+static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
+{
+	int pos;
+
+	pos = pci_find_aer_capability(dev);
+
+	/* The device might not support AER */
+	if (!pos)
+		return AER_SUCCESS;
+
+	if (info->severity == AER_CORRECTABLE) {
+		pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+			&info->status);
+		if (!(info->status & ERR_CORRECTABLE_ERROR_MASK))
+			return AER_UNSUCCESS; 
+	} else if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE ||
+		info->severity == AER_NONFATAL) {
+
+		/* Link is still healthy for IO reads */
+		pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS,
+			&info->status);
+		if (!(info->status & ERR_UNCORRECTABLE_ERROR_MASK))
+			return AER_UNSUCCESS;
+
+		if (info->status & AER_LOG_TLP_MASKS) {
+			info->flags |= AER_TLP_HEADER_VALID_FLAG;
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG, &info->tlp.dw0);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 4, &info->tlp.dw1);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 8, &info->tlp.dw2);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 12, &info->tlp.dw3);
+		}
+	}
+
+	return AER_SUCCESS;
+}
+
+/**
+ * aer_isr_one_error - consume an error detected by root port
+ * @p_device: pointer to error root port service device
+ * @e_src: pointer to an error source
+ **/
+static void aer_isr_one_error(struct pcie_device *p_device,
+		struct aer_err_source *e_src)
+{
+	struct device *s_device;
+	struct aer_err_info e_info = {0, 0, 0,};
+	int i;
+	u16 id;
+
+	/*
+	 * Both a correctable and an uncorrectable error may have been
+	 * logged. Report the correctable error first.
+	 */
+	for (i = 1; i & ROOT_ERR_STATUS_MASKS ; i <<= 2) {
+		if (i > 4)
+			break;
+		if (!(e_src->status & i))
+			continue;
+
+		/* Init comprehensive error information */
+		if (i & PCI_ERR_ROOT_COR_RCV) {
+			id = ERR_COR_ID(e_src->id);
+			e_info.severity = AER_CORRECTABLE;
+		} else {
+			id = ERR_UNCOR_ID(e_src->id);
+			e_info.severity = ((e_src->status >> 6) & 1);
+		}
+		if (e_src->status &
+			(PCI_ERR_ROOT_MULTI_COR_RCV |
+			 PCI_ERR_ROOT_MULTI_UNCOR_RCV))
+			e_info.flags |= AER_MULTI_ERROR_VALID_FLAG;
+		if (!(s_device = find_source_device(p_device->port, id))) {
+			printk(KERN_DEBUG "%s->can't find device of ID%04x\n",
+				__FUNCTION__, id);
+			continue;
+		}
+		if (get_device_error_info(to_pci_dev(s_device), &e_info) ==
+				AER_SUCCESS) {
+			aer_print_error(to_pci_dev(s_device), &e_info);
+			handle_error_source(p_device,
+				to_pci_dev(s_device),
+				e_info);
+		}
+	}
+}
+
+/**
+ * aer_isr - consume errors detected by root port
+ * @context: pointer to a private data of pcie device
+ *
+ * Invoked as a DPC when the root port records a newly detected error.
+ **/
+void aer_isr(void *context)
+{
+	struct pcie_device *p_device = (struct pcie_device *) context;
+	struct aer_rpc *rpc = get_service_data(p_device);
+	struct aer_err_source *e_src;
+
+	e_src = get_e_source(rpc);
+	while (e_src) {
+		aer_isr_one_error(p_device, e_src);
+		e_src = get_e_source(rpc);
+	}
+
+	wake_up(&rpc->wait_release);
+}
+
+/**
+ * aer_delete_rootport - disable root port aer and delete service data 
+ * @rpc: pointer to a root port device being deleted
+ *
+ * Invoked when the AER service is unloaded on a specific Root Port.
+ **/
+void aer_delete_rootport(struct aer_rpc *rpc)
+{
+	/* Disable root port AER itself */
+	disable_root_aer(rpc);
+
+	kfree(rpc);
+}
+
+/**
+ * aer_init - provide AER initialization
+ * @dev: pointer to AER pcie device
+ *
+ * Invoked when AER service driver is loaded.
+ **/
+int aer_init(struct pcie_device *dev)
+{
+	int status;
+
+	/* Run _OSC Method */
+	status = aer_osc_setup(dev->port);
+
+	if (status != OSC_METHOD_RUN_SUCCESS) {
+		printk(KERN_DEBUG "%s: AER service init fails - %s\n",
+		__FUNCTION__,
+		(status == OSC_METHOD_NOT_SUPPORTED) ?
+			"No ACPI _OSC support" : "Run ACPI _OSC fails");
+
+		if (!forceload)
+			return status;
+	}
+
+	return AER_SUCCESS;
+}
+
+EXPORT_SYMBOL(pci_find_aer_capability);
+EXPORT_SYMBOL(pci_disable_pcie_error_reporting);
+EXPORT_SYMBOL(pci_cleanup_aer_uncorrect_error_status);
+
--- linux-2.6.17/include/linux/aer.h	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/include/linux/aer.h	2006-07-14 11:17:31.000000000 +0800
@@ -0,0 +1,43 @@
+/*
+ * Copyright (C) 2006 Intel Corp.
+ *     Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *     Zhang Yanmin (yanmin.zhang@intel.com)
+ */
+
+#ifndef _AER_H_
+#define _AER_H_
+
+#if defined(CONFIG_PCIEAER) || defined(CONFIG_PCIEAER_MODULE)
+/* pci-e port driver needs this function to enable aer */
+static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
+{
+	u16 reg16 = 0;
+	int pos;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 |
+		PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE;
+	pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+			reg16);
+	return 0;
+}
+
+extern int pci_find_aer_capability(struct pci_dev *dev);
+extern int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+extern int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+#else
+#define pci_enable_pcie_error_reporting(dev)		(-EINVAL)
+#define pci_find_aer_capability(dev)			(0)
+#define pci_disable_pcie_error_reporting(dev)		(-EINVAL)
+#define pci_cleanup_aer_uncorrect_error_status(dev)	(-EINVAL)
+#endif
+
+#endif /* _AER_H_ */
+
--- linux-2.6.17/include/linux/pcieport_if.h	2006-06-22 16:26:32.000000000 +0800
+++ linux-2.6.17_aer/include/linux/pcieport_if.h	2006-06-22 16:46:29.000000000 +0800
@@ -61,6 +61,12 @@ struct pcie_port_service_driver {
 	void (*remove) (struct pcie_device *dev);
 	int (*suspend) (struct pcie_device *dev, pm_message_t state);
 	int (*resume) (struct pcie_device *dev);
+
+	/* Service Error Recovery Handler */
+	struct pci_error_handlers *err_handler;
+
+	/* Link Reset Capability - AER service driver specific */
+	pci_ers_result_t (*reset_link) (struct pci_dev *dev);
 
 	const struct pcie_port_service_id *id_table;
 	struct device_driver driver;
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv.h	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv.h	2006-07-14 11:17:46.000000000 +0800
@@ -0,0 +1,130 @@
+/*
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#ifndef _AERDRV_H_
+#define _AERDRV_H_
+
+#include <linux/pcieport_if.h>
+#include <linux/aer.h>
+
+#define AER_NONFATAL			0
+#define AER_FATAL			1
+#define AER_CORRECTABLE			2
+#define AER_UNCORRECTABLE		4
+#define AER_ERROR_MASK			0x001fffff
+#define AER_ERROR(d)			((d) & AER_ERROR_MASK)
+
+#define VERBOSE_LIMIT_DISPLAY		1
+#define VERBOSE_FULL_DISPLAY		2
+#define VERBOSE_RAW_DISPLAY		3
+#define VERBOSE_MASK			0x3
+
+#define OSC_METHOD_RUN_SUCCESS		0
+#define OSC_METHOD_NOT_SUPPORTED	1
+#define OSC_METHOD_RUN_FAILURE		2
+
+/* Root Error Status Register Bits */
+#define ROOT_ERR_STATUS_MASKS			0x0f
+
+#define SYSTEM_ERROR_INTR_ON_MESG_MASK	(PCI_EXP_RTCTL_SECEE|	\
+					PCI_EXP_RTCTL_SENFEE|	\
+					PCI_EXP_RTCTL_SEFEE)
+#define ROOT_PORT_INTR_ON_MESG_MASK	(PCI_ERR_ROOT_CMD_COR_EN|	\
+					PCI_ERR_ROOT_CMD_NONFATAL_EN|	\
+					PCI_ERR_ROOT_CMD_FATAL_EN)
+#define ERR_COR_ID(d)			((d) & 0xffff)
+#define ERR_UNCOR_ID(d)			((d) >> 16)
+
+#define AER_SUCCESS			0
+#define AER_UNSUCCESS			1
+#define AER_ERROR_SOURCES_MAX		100
+
+#define AER_LOG_TLP_MASKS		(PCI_ERR_UNC_POISON_TLP|	\
+					PCI_ERR_UNC_ECRC|		\
+					PCI_ERR_UNC_UNSUP|		\
+					PCI_ERR_UNC_COMP_ABORT|		\
+					PCI_ERR_UNC_UNX_COMP|		\
+					PCI_ERR_UNC_MALF_TLP)
+
+/* AER Error Info Flags */
+#define AER_TLP_HEADER_VALID_FLAG	0x00000001
+#define AER_MULTI_ERROR_VALID_FLAG	0x00000002
+
+#define ERR_CORRECTABLE_ERROR_MASK	0x000031c1
+#define ERR_UNCORRECTABLE_ERROR_MASK	0x001ff010
+
+struct header_log_regs {
+	unsigned int dw0;
+	unsigned int dw1;
+	unsigned int dw2;
+	unsigned int dw3;
+};
+
+struct aer_err_info {
+	int severity;			/* 0:NONFATAL | 1:FATAL | 2:COR */
+	int flags;			
+	unsigned int status;		/* COR/UNCOR Error Status */
+	struct header_log_regs tlp; 	/* TLP Header */
+};
+
+struct aer_err_source {
+	unsigned int status;
+	unsigned int id;
+};
+
+struct aer_rpc {
+	struct pcie_device *rpd;	/* Root Port device */
+	struct work_struct dpc_handler;
+	struct aer_err_source e_sources[AER_ERROR_SOURCES_MAX];
+	unsigned short prod_idx;	/* Error Producer Index */
+	unsigned short cons_idx;	/* Error Consumer Index */
+	int isr;
+	spinlock_t e_lock;		/* 
+					 * Lock access to Error Status/ID Regs
+					 * and error producer/consumer index
+					 */
+	wait_queue_head_t wait_release;
+};
+
+struct aer_broadcast_data {
+	enum pci_channel_state state;
+	enum pci_ers_result result;
+};
+
+static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
+		enum pci_ers_result new)
+{
+	switch (orig) {
+	case PCI_ERS_RESULT_CAN_RECOVER:
+	case PCI_ERS_RESULT_RECOVERED:
+		orig = new;
+		break;
+	case PCI_ERS_RESULT_DISCONNECT:
+		if (new == PCI_ERS_RESULT_NEED_RESET)
+			orig = new;
+		break;
+	default:
+		break;
+	}
+
+	return orig;
+}
+
+extern struct bus_type pcie_port_bus_type;
+extern void aer_enable_rootport(struct aer_rpc *rpc);
+extern void aer_delete_rootport(struct aer_rpc *rpc);
+extern int aer_init(struct pcie_device *dev);
+extern void aer_isr(void *context);
+extern void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
+
+#ifdef CONFIG_ACPI
+extern int aer_osc_setup(struct pci_dev *dev);
+#else
+#define  aer_osc_setup(dev)		(OSC_METHOD_NOT_SUPPORTED)
+#endif
+
+#endif /* _AERDRV_H_ */
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv.c	2006-07-14 11:16:23.000000000 +0800
@@ -0,0 +1,345 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * This file implements the AER root port service driver. The driver will
+ * register an irq handler. When root port triggers an AER interrupt, the irq
+ * handler will collect root port status and schedule a work.
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/pcieport_if.h>
+
+#include "aerdrv.h"
+
+/*
+ * Version Information
+ */
+#define DRIVER_VERSION "v1.0"
+#define DRIVER_AUTHOR "tom.l.nguyen@intel.com"
+#define DRIVER_DESC "Root Port Advanced Error Reporting Driver"
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
+MODULE_LICENSE("GPL");
+
+static int __devinit aer_probe(struct pcie_device *dev,
+	const struct pcie_port_service_id *id);
+static void aer_remove(struct pcie_device *dev);
+static int aer_suspend(struct pcie_device *dev, pm_message_t state)
+{return 0;}
+static int aer_resume(struct pcie_device *dev) {return 0;}
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+	enum pci_channel_state error);
+static void aer_error_resume(struct pci_dev *dev);
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
+
+/*
+ * PCI Express bus's AER Root service driver data structure
+ */
+static struct pcie_port_service_id aer_id[] = {
+	{
+	.vendor 	= PCI_ANY_ID, 
+	.device 	= PCI_ANY_ID,
+	.port_type 	= PCIE_RC_PORT, 
+	.service_type 	= PCIE_PORT_SERVICE_AER,
+	},
+	{ /* end: all zeroes */ }
+};
+
+static struct pci_error_handlers aer_error_handlers = {
+	.error_detected = aer_error_detected,
+	.resume = aer_error_resume,
+};
+
+static struct pcie_port_service_driver aerdrv = {
+	.name		= "aer",
+	.id_table	= &aer_id[0],
+
+	.probe		= aer_probe,
+	.remove		= aer_remove,
+
+	.suspend	= aer_suspend,
+	.resume		= aer_resume,
+
+	.err_handler	= &aer_error_handlers,
+
+	.reset_link	= aer_root_reset,
+};
+
+/**
+ * aer_irq - Root Port's ISR
+ * @irq: IRQ assigned to Root Port
+ * @context: pointer to Root Port data structure
+ * @r: pointer to struct pt_regs
+ *
+ * Invoked when Root Port detects AER messages.
+ **/
+static irqreturn_t aer_irq(int irq, void *context, struct pt_regs * r)
+{
+	unsigned int status, id;
+	struct pcie_device *pdev = (struct pcie_device *)context;
+	struct aer_rpc *rpc = get_service_data(pdev);
+	int next_prod_idx;
+	unsigned long flags;
+	int pos;
+
+	pos = pci_find_aer_capability(pdev->port);
+	/* 
+	 * Must lock access to Root Error Status Reg, Root Error ID Reg, 
+	 * and Root error producer/consumer index 
+	 */
+	spin_lock_irqsave(&rpc->e_lock, flags);
+
+	/* Read error status */
+	pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, &status);
+	if (!(status & ROOT_ERR_STATUS_MASKS)) {
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return IRQ_NONE;
+	}
+
+	/* Read error source and clear error status */
+	pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_COR_SRC, &id);
+	pci_write_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, status);
+
+	/* Store error source for later DPC handler */
+	next_prod_idx = rpc->prod_idx + 1;
+	if (next_prod_idx == AER_ERROR_SOURCES_MAX)
+		next_prod_idx = 0;
+	if (next_prod_idx == rpc->cons_idx) {
+		/* 
+		 * Error Storm Condition - possibly the same error occurred.
+		 * Drop the error.
+		 */
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return IRQ_HANDLED;
+	}
+	rpc->e_sources[rpc->prod_idx].status = status;
+	rpc->e_sources[rpc->prod_idx].id = id;
+	rpc->prod_idx = next_prod_idx;
+	spin_unlock_irqrestore(&rpc->e_lock, flags);
+
+	/* Invoke DPC handler */
+	schedule_work(&rpc->dpc_handler);
+
+	return IRQ_HANDLED;
+}
+
+/**
+ * aer_alloc_rpc - allocate Root Port data structure
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when Root Port's AER service is loaded.
+ **/
+static struct aer_rpc* aer_alloc_rpc(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc;
+
+	/* kzalloc() returns zeroed memory, so no separate memset() is needed */
+	rpc = kzalloc(sizeof(struct aer_rpc), GFP_KERNEL);
+	if (!rpc)
+		return NULL;
+
+	/* 
+	 * Initialize Root lock access, e_lock, to Root Error Status Reg, 
+	 * Root Error ID Reg, and Root error producer/consumer index. 
+	 */
+	spin_lock_init(&rpc->e_lock);
+
+	rpc->rpd = dev;
+	INIT_WORK(&rpc->dpc_handler, aer_isr, (void *)dev);
+	rpc->prod_idx = rpc->cons_idx = 0;
+	init_waitqueue_head(&rpc->wait_release);
+
+	/* Use PCIE bus function to store rpc into PCIE device */
+	set_service_data(dev, rpc);
+
+	return rpc;
+}
+
+/**
+ * aer_remove - clean up resources
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when PCI Express bus unloads or AER probe fails.
+ **/
+static void aer_remove(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc = get_service_data(dev);
+
+	if (rpc) {
+		/* If the interrupt handler was registered, it must be freed. */
+		if (rpc->isr)
+			free_irq(dev->irq, dev);
+
+		wait_event(rpc->wait_release, rpc->prod_idx == rpc->cons_idx);
+
+		aer_delete_rootport(rpc);
+		set_service_data(dev, NULL);
+	}
+}
+
+/**
+ * aer_probe - initialize resources
+ * @dev: pointer to the pcie_dev data structure
+ * @id: pointer to the service id data structure
+ *
+ * Invoked when PCI Express bus loads AER service driver.
+ **/
+static int __devinit aer_probe(struct pcie_device *dev,
+				const struct pcie_port_service_id *id)
+{
+	int status;
+	struct aer_rpc *rpc;
+	struct device *device = &dev->device;
+
+	/* Init */
+	if ((status = aer_init(dev)))
+		return status;
+
+	/* Alloc rpc data structure */
+	if (!(rpc = aer_alloc_rpc(dev))) {
+		printk(KERN_DEBUG "%s: Alloc rpc fails on PCIE device[%s]\n",
+			__FUNCTION__, device->bus_id);
+		aer_remove(dev);
+		return -ENOMEM;
+	}
+
+	/* Request IRQ ISR */
+	if ((status = request_irq(dev->irq, aer_irq, SA_SHIRQ, "aerdrv", 
+				dev))) {
+		printk(KERN_DEBUG "%s: Request ISR fails on PCIE device[%s]\n", 
+			__FUNCTION__, device->bus_id);
+		aer_remove(dev);
+		return status;
+	}
+
+	rpc->isr = 1;
+
+	aer_enable_rootport(rpc);
+
+	return status;
+}
+
+/**
+ * aer_root_reset - reset link on Root Port
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver when performing link reset at Root Port.
+ **/
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
+{
+	u16 p2p_ctrl;
+	u32 status;
+	int pos;
+
+	pos = pci_find_aer_capability(dev);
+
+	/* Disable Root's interrupt in response to error messages */ 
+	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+	/* Assert Secondary Bus Reset */
+	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &p2p_ctrl);
+	p2p_ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+	/* De-assert Secondary Bus Reset */
+	p2p_ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+	/*
+	 * System software must wait for at least 100ms from the end
+	 * of a reset of one or more devices before it is permitted
+	 * to issue Configuration Requests to those devices.
+	 */
+	msleep(200);
+	printk(KERN_DEBUG "Complete link reset at Root[%s]\n", dev->dev.bus_id);
+
+	/* Clear Root Error Status, then re-enable error interrupts */
+	pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
+	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, status);
+	pci_write_config_dword(dev,
+		pos + PCI_ERR_ROOT_COMMAND,
+		ROOT_PORT_INTR_ON_MESG_MASK);
+
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
+/**
+ * aer_error_detected - update severity status
+ * @dev: pointer to Root Port's pci_dev data structure
+ * @error: error severity being notified by port bus
+ *
+ * Invoked by Port Bus driver during error recovery.
+ **/
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+			enum pci_channel_state error)
+{
+	/* Root Port has no impact. Always recovers. */
+	return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+/**
+ * aer_error_resume - clean up corresponding error status bits
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver during nonfatal recovery.
+ **/
+static void aer_error_resume(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, mask;
+	u16 reg16;
+
+	/* Clean up Root device status */
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	pci_read_config_word(dev, pos + PCI_EXP_DEVSTA, &reg16);
+	pci_write_config_word(dev, pos + PCI_EXP_DEVSTA, reg16);
+
+	/* Clean AER Uncorrectable Error Status */
+	pos = pci_find_aer_capability(dev);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+	if (dev->error_state == pci_channel_io_normal)
+		status &= ~mask; /* Clear corresponding nonfatal bits */
+	else
+		status &= mask; /* Clear corresponding fatal bits */
+	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+}
+
+/**
+ * aer_service_init - register AER root service driver
+ *
+ * Invoked when AER root service driver is loaded.
+ **/
+static int __init aer_service_init(void)
+{
+	return pcie_port_service_register(&aerdrv);
+}
+
+/**
+ * aer_service_exit - unregister AER root service driver
+ *
+ * Invoked when AER root service driver is unloaded.
+ **/
+static void __exit aer_service_exit(void)
+{
+	pcie_port_service_unregister(&aerdrv);
+}
+
+module_init(aer_service_init);
+module_exit(aer_service_exit);
--- linux-2.6.17/drivers/pci/pcie/aer/Kconfig	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/Kconfig	2006-07-14 09:59:32.000000000 +0800
@@ -0,0 +1,12 @@
+#
+# PCI Express Root Port Device AER Configuration
+#
+
+config PCIEAER
+	tristate "Root Port Advanced Error Reporting support"
+	depends on PCIEPORTBUS
+	default y
+	help
+	  This enables PCI Express Root Port Advanced Error Reporting
+	  (AER) driver support. Error reporting messages sent to Root
+	  Port will be handled by PCI Express AER driver.
--- linux-2.6.17/drivers/pci/pcie/aer/Makefile	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/Makefile	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,10 @@
+#
+# Makefile for PCI-Express Root Port Advanced Error Reporting Driver
+#
+
+obj-$(CONFIG_PCIEAER)		+= aerdriver.o
+aerdrv_acpi-$(CONFIG_ACPI)	+= aerdrv_acpi.o
+
+aerdriver-objs		:= aerdrv_errprint.o aerdrv_core.o aerdrv.o
+aerdriver-objs		+= $(aerdrv_acpi-y)
+
--- linux-2.6.17/drivers/pci/pcie/Kconfig	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/Kconfig	2006-06-22 16:46:29.000000000 +0800
@@ -34,3 +34,4 @@ config HOTPLUG_PCI_PCIE_POLL_EVENT_MODE
 	   
 	  When in doubt, say N.
 
+source "drivers/pci/pcie/aer/Kconfig"
--- linux-2.6.17/drivers/pci/pcie/Makefile	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/Makefile	2006-06-22 16:46:29.000000000 +0800
@@ -5,3 +5,6 @@
 pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o
 
 obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
+
+# Build PCI Express AER if needed
+obj-$(CONFIG_PCIEAER)		+= aer/
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_errprint.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_errprint.c	2006-07-14 10:49:41.000000000 +0800
@@ -0,0 +1,248 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv_errprint.c
+ * 
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Format error messages and print them to console.
+ * 
+ * Copyright (C) 2006 Intel Corp. 
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+
+#include "aerdrv.h"
+
+#define AER_AGENT_RECEIVER		0
+#define AER_AGENT_REQUESTER		1
+#define AER_AGENT_COMPLETER		2
+#define AER_AGENT_TRANSMITTER		3
+
+#define AER_AGENT_REQUESTER_MASK	(PCI_ERR_UNC_COMP_TIME|	\
+					PCI_ERR_UNC_UNSUP)
+
+#define AER_AGENT_COMPLETER_MASK	PCI_ERR_UNC_COMP_ABORT
+
+#define AER_AGENT_TRANSMITTER_MASK(t, e) (e & (PCI_ERR_COR_REP_ROLL| \
+	((t == AER_CORRECTABLE) ? PCI_ERR_COR_REP_TIMER: 0))) 
+
+#define AER_GET_AGENT(t, e)						\
+	((e & AER_AGENT_COMPLETER_MASK) ? AER_AGENT_COMPLETER :		\
+	(e & AER_AGENT_REQUESTER_MASK) ? AER_AGENT_REQUESTER :		\
+	(AER_AGENT_TRANSMITTER_MASK(t, e)) ? AER_AGENT_TRANSMITTER :	\
+	AER_AGENT_RECEIVER)
+
+#define AER_PHYSICAL_LAYER_ERROR_MASK	PCI_ERR_COR_RCVR
+#define AER_DATA_LINK_LAYER_ERROR_MASK(t, e)	\
+		(PCI_ERR_UNC_DLP|		\
+		PCI_ERR_COR_BAD_TLP| 		\
+		PCI_ERR_COR_BAD_DLLP|		\
+		PCI_ERR_COR_REP_ROLL| 		\
+		((t == AER_CORRECTABLE) ?	\
+		PCI_ERR_COR_REP_TIMER: 0))
+
+#define AER_PHYSICAL_LAYER_ERROR	0
+#define AER_DATA_LINK_LAYER_ERROR	1
+#define AER_TRANSACTION_LAYER_ERROR	2
+
+#define AER_GET_LAYER_ERROR(t, e)				\
+	((e & AER_PHYSICAL_LAYER_ERROR_MASK) ?			\
+	AER_PHYSICAL_LAYER_ERROR :				\
+	(e & AER_DATA_LINK_LAYER_ERROR_MASK(t, e)) ?		\
+		AER_DATA_LINK_LAYER_ERROR : 			\
+		AER_TRANSACTION_LAYER_ERROR)
+
+/* 
+ * AER error strings 
+ */
+static char* aer_error_severity_string[] = {
+	"Uncorrected (Non-Fatal)", 
+	"Uncorrected (Fatal)",
+	"Corrected"
+};
+
+static char* aer_error_layer[] = {
+	"Physical Layer",
+	"Data Link Layer",
+	"Transaction Layer" 
+};
+static char* aer_correctable_error_string[] = {
+	"Receiver Error        ",	/* Bit Position 0 	*/
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	"Bad TLP               ",	/* Bit Position 6 	*/
+	"Bad DLLP              ",	/* Bit Position 7 	*/
+	"REPLAY_NUM Rollover   ",	/* Bit Position 8 	*/
+	NULL,
+	NULL,
+	NULL,
+	"Replay Timer Timeout  ",	/* Bit Position 12 	*/
+	"Advisory Non-Fatal    ", 	/* Bit Position 13	*/
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+};
+
+static char* aer_uncorrectable_error_string[] = {
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	"Data Link Protocol    ",	/* Bit Position 4	*/
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	"Poisoned TLP          ",	/* Bit Position 12 	*/
+	"Flow Control Protocol ",	/* Bit Position 13	*/
+	"Completion Timeout    ",	/* Bit Position 14 	*/
+	"Completer Abort       ",	/* Bit Position 15 	*/
+	"Unexpected Completion ",	/* Bit Position 16	*/
+	"Receiver Overflow     ",	/* Bit Position 17	*/
+	"Malformed TLP         ",	/* Bit Position 18	*/
+	"ECRC                  ",	/* Bit Position 19	*/
+	"Unsupported Request   ",	/* Bit Position 20	*/
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+};
+
+static char* aer_agent_string[] = {
+	"Receiver ID",
+	"Requester ID",
+	"Completer ID",
+	"Transmitter ID"
+};
+
+static char * aer_get_error_source_name(int severity,
+			unsigned int status,
+			char errmsg_buff[])
+{
+	int i;
+	char * errmsg = NULL;
+
+	for (i = 0; i < 32; i++) {
+		if (!(status & (1U << i)))
+			continue;
+
+		if (severity == AER_CORRECTABLE)
+			errmsg = aer_correctable_error_string[i];
+		else
+			errmsg = aer_uncorrectable_error_string[i];
+
+		if (!errmsg) {
+			sprintf(errmsg_buff, "Unknown Error Bit %2d  ", i);
+			errmsg = errmsg_buff;
+		}
+
+		break;
+	}
+
+	return errmsg;
+}
+
+static DEFINE_SPINLOCK(logbuf_lock);
+static char errmsg_buff[100];
+void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
+{
+	char * errmsg;
+	int err_layer, agent;
+	char * loglevel;
+
+	if (info->severity == AER_CORRECTABLE)
+		loglevel = KERN_WARNING;
+	else
+		loglevel = KERN_ERR;
+
+	printk("%s+------ PCI-Express Device Error ------+\n", loglevel);
+	printk("%sError Severity\t\t: %s\n", loglevel,
+		aer_error_severity_string[info->severity]);
+
+	if (info->status == 0) {
+		printk("%sPCIE Bus Error type\t: (Unaccessible)\n", loglevel);
+		printk("%sUnaccessible Received\t: %s\n", loglevel,
+			info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+				"Multiple" : "First");
+		printk("%sUnregistered Agent ID\t: %04x\n", loglevel,
+			(dev->bus->number << 8) | dev->devfn);
+	} else {
+		err_layer = AER_GET_LAYER_ERROR(info->severity, info->status);
+		printk("%sPCIE Bus Error type\t: %s\n", loglevel,
+			aer_error_layer[err_layer]);
+
+		spin_lock(&logbuf_lock);
+		errmsg = aer_get_error_source_name(info->severity,
+				info->status,
+				errmsg_buff);
+		printk("%s%s\t: %s\n", loglevel, errmsg,
+			info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+				"Multiple" : "First");
+		spin_unlock(&logbuf_lock);
+
+		agent = AER_GET_AGENT(info->severity, info->status);
+		printk("%s%s\t\t: %04x\n", loglevel,
+			aer_agent_string[agent],
+			(dev->bus->number << 8) | dev->devfn);
+
+		printk("%sVendorID=%04xh, DeviceID=%04xh,"
+			" Bus=%02xh, Device=%02xh, Function=%02xh\n",
+			loglevel,
+			dev->vendor,
+			dev->device,
+			dev->bus->number,
+			PCI_SLOT(dev->devfn),
+			PCI_FUNC(dev->devfn));
+
+		if (info->flags & AER_TLP_HEADER_VALID_FLAG) {
+			unsigned char *tlp = (unsigned char *) &info->tlp;
+			printk("%sTLP Header:\n", loglevel);
+			printk("%s%02x%02x%02x%02x %02x%02x%02x%02x"
+				" %02x%02x%02x%02x %02x%02x%02x%02x\n",
+				loglevel,
+				*(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
+				*(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
+				*(tlp + 11), *(tlp + 10), *(tlp + 9),
+				*(tlp + 8), *(tlp + 15), *(tlp + 14),
+				*(tlp + 13), *(tlp + 12));
+		}
+	}
+}
+
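
The bit lookup performed by aer_get_error_source_name() above can be tried
outside the kernel with a sketch like the following. The table is shortened
and every demo_* name is hypothetical; the sketch uses an unsigned shift so
that bit 31 is handled without signed overflow, and snprintf() so that the
fallback message cannot overrun the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, shortened name table: index = bit position in the status word. */
static const char *const demo_correctable_names[32] = {
	[0]  = "Receiver Error",
	[6]  = "Bad TLP",
	[7]  = "Bad DLLP",
	[8]  = "REPLAY_NUM Rollover",
	[12] = "Replay Timer Timeout",
	[13] = "Advisory Non-Fatal",
};

/*
 * Return the name of the lowest set bit in status, or NULL if no bit is set.
 * A set bit without a table entry is formatted into buf (of size len).
 */
static const char *demo_error_name(uint32_t status, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	for (unsigned int i = 0; i < 32; i++) {
		/* UINT32_C(1) keeps the shift well defined for bit 31 */
		if (!(status & (UINT32_C(1) << i)))
			continue;
		if (demo_correctable_names[i])
			return demo_correctable_names[i];
		snprintf(buf, len, "Unknown Error Bit %2u", i);
		return buf;
	}
	return NULL;
}
```

As in the patch, only the lowest set bit is reported, so a status word with
several bits set yields a single name.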

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/5] PCI-Express AER implemetation: pcie_portdrv error handler
  2006-07-14  5:30       ` [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Zhang, Yanmin
@ 2006-07-14  5:32         ` Zhang, Yanmin
  2006-07-14  5:35           ` [PATCH 5/5] " Zhang, Yanmin
  0 siblings, 1 reply; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-14  5:32 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 5 implements error handlers for pcie_portdrv.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:27:35.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:46:29.000000000 +0800
@@ -14,8 +14,10 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/pcieport_if.h>
+#include <linux/aer.h>
 
 #include "portdrv.h"
+#include "aer/aerdrv.h"
 
 /*
  * Version Information
@@ -76,6 +78,8 @@ static int __devinit pcie_portdrv_probe 
 	if (pcie_port_device_register(dev)) 
 		return -ENOMEM;
 
+	pci_enable_pcie_error_reporting(dev);
+
 	return 0;
 }
 
@@ -102,6 +106,146 @@ static int pcie_portdrv_resume (struct p
 }
 #endif
 
+static int error_detected_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	struct aer_broadcast_data *result_data;
+	pci_ers_result_t status;
+
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (!driver ||
+			!driver->err_handler ||
+			!driver->err_handler->error_detected)
+			return 0;
+
+		pcie_device = to_pcie_device(device);
+
+		/* Forward error detected message to service drivers */
+		status = driver->err_handler->error_detected(
+			pcie_device->port,
+			result_data->state);
+		result_data->result =
+			merge_result(result_data->result, status);
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
+					enum pci_channel_state error)
+{
+	struct aer_broadcast_data result_data =
+			{error, PCI_ERS_RESULT_CAN_RECOVER};
+	
+	device_for_each_child(&dev->dev, &result_data, error_detected_iter);
+
+	/* If fatal, save cfg space for possible link reset at upstream */
+	if (error == pci_channel_io_frozen)
+		pcie_portdrv_save_config(dev);
+
+	return result_data.result;
+}
+
+static int mmio_enabled_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	pci_ers_result_t status, *result;
+
+	result = (pci_ers_result_t *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->mmio_enabled) {
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			status = driver->err_handler->mmio_enabled(
+					pcie_device->port);
+			*result = merge_result(*result, status);
+		}
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_mmio_enabled(struct pci_dev *dev)
+{
+	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
+
+	device_for_each_child(&dev->dev, &status, mmio_enabled_iter);
+	return status;
+}
+
+static int slot_reset_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	pci_ers_result_t status, *result;
+
+	result = (pci_ers_result_t *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->slot_reset) {
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			status = driver->err_handler->slot_reset(
+					pcie_device->port);
+			*result = merge_result(*result, status);
+		}
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
+{
+	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
+
+	/* If fatal, restore cfg space for possible link reset at upstream */
+	if (dev->error_state == pci_channel_io_frozen)
+		pcie_portdrv_restore_config(dev);
+
+	device_for_each_child(&dev->dev, &status, slot_reset_iter);
+
+	return status;
+}
+
+static int resume_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->resume) { 
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			driver->err_handler->resume(pcie_device->port);
+		}
+	}
+
+	return 0;
+}
+
+static void pcie_portdrv_err_resume(struct pci_dev *dev)
+{
+	device_for_each_child(&dev->dev, NULL, resume_iter);
+}
+
 /*
  * LINUX Device Driver Model
  */
@@ -112,6 +256,13 @@ static const struct pci_device_id port_p
 };
 MODULE_DEVICE_TABLE(pci, port_pci_ids);
 
+static struct pci_error_handlers pcie_portdrv_err_handler = {
+		.error_detected = pcie_portdrv_error_detected,
+		.mmio_enabled = pcie_portdrv_mmio_enabled,
+		.slot_reset = pcie_portdrv_slot_reset,
+		.resume = pcie_portdrv_err_resume,
+};
+
 static struct pci_driver pcie_portdrv = {
 	.name		= (char *)device_name,
 	.id_table	= &port_pci_ids[0],
@@ -123,6 +274,8 @@ static struct pci_driver pcie_portdrv = 
 	.suspend	= pcie_portdrv_suspend,
 	.resume		= pcie_portdrv_resume,
 #endif	/* PM */
+
+	.err_handler 	= &pcie_portdrv_err_handler,
 };
 
 static int __init pcie_portdrv_init(void)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/5] PCI-Express AER implemetation: pcie_portdrv error handler
  2006-07-14  5:32         ` [PATCH 4/5] PCI-Express AER implemetation: pcie_portdrv error handler Zhang, Yanmin
@ 2006-07-14  5:35           ` Zhang, Yanmin
  2006-07-24 19:37             ` Linas Vepstas
  0 siblings, 1 reply; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-14  5:35 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

Sorry, the patch number in the subject was incorrect. Resending the last patch.

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 5 implements error handlers for pcie_portdrv.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:27:35.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:46:29.000000000 +0800
@@ -14,8 +14,10 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/pcieport_if.h>
+#include <linux/aer.h>
 
 #include "portdrv.h"
+#include "aer/aerdrv.h"
 
 /*
  * Version Information
@@ -76,6 +78,8 @@ static int __devinit pcie_portdrv_probe 
 	if (pcie_port_device_register(dev)) 
 		return -ENOMEM;
 
+	pci_enable_pcie_error_reporting(dev);
+
 	return 0;
 }
 
@@ -102,6 +106,146 @@ static int pcie_portdrv_resume (struct p
 }
 #endif
 
+static int error_detected_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	struct aer_broadcast_data *result_data;
+	pci_ers_result_t status;
+
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (!driver ||
+			!driver->err_handler ||
+			!driver->err_handler->error_detected)
+			return 0;
+
+		pcie_device = to_pcie_device(device);
+
+		/* Forward error detected message to service drivers */
+		status = driver->err_handler->error_detected(
+			pcie_device->port,
+			result_data->state);
+		result_data->result =
+			merge_result(result_data->result, status);
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
+					enum pci_channel_state error)
+{
+	struct aer_broadcast_data result_data =
+			{error, PCI_ERS_RESULT_CAN_RECOVER};
+	
+	device_for_each_child(&dev->dev, &result_data, error_detected_iter);
+
+	/* If fatal, save cfg space for possible link reset at upstream */
+	if (error == pci_channel_io_frozen)
+		pcie_portdrv_save_config(dev);
+
+	return result_data.result;
+}
+
+static int mmio_enabled_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	pci_ers_result_t status, *result;
+
+	result = (pci_ers_result_t *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->mmio_enabled) {
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			status = driver->err_handler->mmio_enabled(
+					pcie_device->port);
+			*result = merge_result(*result, status);
+		}
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_mmio_enabled(struct pci_dev *dev)
+{
+	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
+
+	device_for_each_child(&dev->dev, &status, mmio_enabled_iter);
+	return status;
+}
+
+static int slot_reset_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+	pci_ers_result_t status, *result;
+
+	result = (pci_ers_result_t *) data;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->slot_reset) {
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			status = driver->err_handler->slot_reset(
+					pcie_device->port);
+			*result = merge_result(*result, status);
+		}
+	}
+
+	return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
+{
+	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
+
+	/* If fatal, restore cfg space for possible link reset at upstream */
+	if (dev->error_state == pci_channel_io_frozen)
+		pcie_portdrv_restore_config(dev);
+
+	device_for_each_child(&dev->dev, &status, slot_reset_iter);
+
+	return status;
+}
+
+static int resume_iter(struct device *device, void *data)
+{
+	struct pcie_device *pcie_device;
+	struct pcie_port_service_driver *driver;
+
+	if (device->bus == &pcie_port_bus_type && device->driver) {
+		driver = to_service_driver(device->driver);
+		if (driver &&
+			driver->err_handler &&
+			driver->err_handler->resume) { 
+			pcie_device = to_pcie_device(device);
+
+			/* Forward error message to service drivers */
+			driver->err_handler->resume(pcie_device->port);
+		}
+	}
+
+	return 0;
+}
+
+static void pcie_portdrv_err_resume(struct pci_dev *dev)
+{
+	device_for_each_child(&dev->dev, NULL, resume_iter);
+}
+
 /*
  * LINUX Device Driver Model
  */
@@ -112,6 +256,13 @@ static const struct pci_device_id port_p
 };
 MODULE_DEVICE_TABLE(pci, port_pci_ids);
 
+static struct pci_error_handlers pcie_portdrv_err_handler = {
+		.error_detected = pcie_portdrv_error_detected,
+		.mmio_enabled = pcie_portdrv_mmio_enabled,
+		.slot_reset = pcie_portdrv_slot_reset,
+		.resume = pcie_portdrv_err_resume,
+};
+
 static struct pci_driver pcie_portdrv = {
 	.name		= (char *)device_name,
 	.id_table	= &port_pci_ids[0],
@@ -123,6 +274,8 @@ static struct pci_driver pcie_portdrv = 
 	.suspend	= pcie_portdrv_suspend,
 	.resume		= pcie_portdrv_resume,
 #endif	/* PM */
+
+	.err_handler 	= &pcie_portdrv_err_handler,
 };
 
 static int __init pcie_portdrv_init(void)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/5] PCI-Express AER implemetation: aer howto document
  2006-07-14  5:25 ` [PATCH 1/5] PCI-Express AER implemetation: aer howto document Zhang, Yanmin
  2006-07-14  5:27   ` [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h Zhang, Yanmin
@ 2006-07-14 12:40   ` Andi Kleen
  2006-07-17  1:24     ` Zhang, Yanmin
  2006-07-24 20:48   ` Linas Vepstas
  2 siblings, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2006-07-14 12:40 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen, linux-kernel

"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> writes:
> 
> Patch 1 consists of the pciaer-howto.txt document.

The user documentation is still not good. Too hidden, too short.

Best you split it into a user and developer part and user needs to be
far more extensive.

-Andi

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/5] PCI-Express AER implemetation: aer howto document
  2006-07-14 12:40   ` [PATCH 1/5] PCI-Express AER implemetation: aer howto document Andi Kleen
@ 2006-07-17  1:24     ` Zhang, Yanmin
  0 siblings, 0 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-17  1:24 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen, LKML

On Fri, 2006-07-14 at 20:40, Andi Kleen wrote:
> "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> writes:
> > 
> > Patch 1 consists of the pciaer-howto.txt document.
> 
> The user documentation is still not good. Too hidden, too short.
> 
> Best you split it into a user and developer part and user needs to be
> far more extensive.
That's a good idea. I will change the doc.

Thanks.
Yanmin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/5] PCI-Express AER implemetation: pcie_portdrv error handler
  2006-07-14  5:35           ` [PATCH 5/5] " Zhang, Yanmin
@ 2006-07-24 19:37             ` Linas Vepstas
  2006-07-26  4:56               ` Zhang, Yanmin
  0 siblings, 1 reply; 29+ messages in thread
From: Linas Vepstas @ 2006-07-24 19:37 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: LKML, linux-pci maillist, Greg KH, Tom Long Nguyen


Hi,

Sorry for a late reply...

On Fri, Jul 14, 2006 at 01:35:38PM +0800, Zhang, Yanmin wrote:
> 
> --- linux-2.6.17/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:27:35.000000000 +0800
> +++ linux-2.6.17_aer/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:46:29.000000000 +0800
> +
> +static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
> +					enum pci_channel_state error)
> +{
> +	/* If fatal, save cfg space for possible link reset at upstream */
> +	if (error == pci_channel_io_frozen)
> +		pcie_portdrv_save_config(dev);

If the channel is frozen, is the config space still readable? 
In my case, I had to save config space data early on before
the bus error. 

What's more, I discovered that I had to save the pci config
space data before device drivers do their probe. During the probe,
device drivers will change the config. For example, they'll enable
interrupts and dma. If you turn these on, and then do the probe,
you'll get spectacular failures.

To be safe, I found the best thing to do was to save the pci
config space state as it was during boot, before the PCI probe 
routines ran.
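
The advice above (snapshot config space before any driver probe runs, then
restore it after a reset) can be sketched in user space as follows. This is
only an illustration: struct demo_dev and every demo_* function are
hypothetical names invented for the example, and a real implementation would
go through the kernel's PCI config accessors rather than memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CFG_DWORDS 16	/* first 64 bytes of config space */
#define DEMO_CMD_DWORD  1	/* dword that holds the command register */

/* Hypothetical stand-in for a device; not the kernel's struct pci_dev. */
struct demo_dev {
	uint32_t cfg[DEMO_CFG_DWORDS];		/* live config space */
	uint32_t boot_cfg[DEMO_CFG_DWORDS];	/* snapshot taken before probe */
	int have_snapshot;
};

/* Take the snapshot once, early, while the device is known to be healthy. */
static void demo_save_boot_config(struct demo_dev *d)
{
	assert(d != NULL);
	memcpy(d->boot_cfg, d->cfg, sizeof(d->cfg));
	d->have_snapshot = 1;
}

/* A driver probe changes config space, e.g. by enabling bus mastering. */
static void demo_driver_probe(struct demo_dev *d)
{
	d->cfg[DEMO_CMD_DWORD] |= 0x0004;	/* bus master enable bit */
}

/* Model a reset that wipes both firmware and driver settings. */
static void demo_reset(struct demo_dev *d)
{
	memset(d->cfg, 0, sizeof(d->cfg));
}

/* Restore the pre-probe state; returns 0 on success, -1 without a snapshot. */
static int demo_restore_boot_config(struct demo_dev *d)
{
	if (!d->have_snapshot)
		return -1;
	memcpy(d->cfg, d->boot_cfg, sizeof(d->cfg));
	return 0;
}
```

Restoring the boot-time snapshot brings the device back with bus mastering
still off, so the driver can redo its own setup in a controlled order.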

--linas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/5] PCI-Express AER implemetation: aer howto document
  2006-07-14  5:25 ` [PATCH 1/5] PCI-Express AER implemetation: aer howto document Zhang, Yanmin
  2006-07-14  5:27   ` [PATCH 2/5] PCI-Express AER implemetation: Add new defines to pci_regs.h Zhang, Yanmin
  2006-07-14 12:40   ` [PATCH 1/5] PCI-Express AER implemetation: aer howto document Andi Kleen
@ 2006-07-24 20:48   ` Linas Vepstas
  2006-07-26  5:48     ` Zhang, Yanmin
  2 siblings, 1 reply; 29+ messages in thread
From: Linas Vepstas @ 2006-07-24 20:48 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: LKML, linux-pci maillist, Greg KH, Tom Long Nguyen

Hi,

More late commentary ...

On Fri, Jul 14, 2006 at 01:25:49PM +0800, Zhang, Yanmin wrote:
> --- linux-2.6.17/Documentation/pcieaer-howto.txt	1970-01-01 08:00:00.000000000 +0800
> +++ linux-2.6.17_aer/Documentation/pcieaer-howto.txt	2006-07-14 11:09:37.000000000 +0800
> +6.1. Configuring the AER capability structure
> +
> +AER aware drivers of PCI Express component need change the device
> +control registers to enable AER. They also could change AER registers,
> +including mask and severity registers.

Hmm. Why not just enable error reporting for everything? Why make 
the device driver jump through this extra hoop?

If there is some really good reason not to enable reporting by default,
(which I cannot think of at the moment), then there's another
possibility: enable error reporting if and only if the device
driver has struct pci_driver -> err_handler != NULL.

> +6.2. Provide PCI error-recovery callbacks
> +
> +If an error message indicates a non-fatal error, performing link reset
> +at upstream is not required. The AER driver calls error_detected(dev,
> +pci_channel_io_normal) to all drivers associated within a hierarchy in

Hmm. I would rather extend enum pci_channel_state to include a non-fatal 
error notification. That is, add pci_channel_io_nonfatal_error=4; to the enum.  

> +If an error message indicates a fatal error, kernel will broadcast
> +error_detected(dev, pci_channel_io_frozen) to all drivers within
> +a hierarchy in question.

"The hierarchy in question" -- does that mean all drivers attached to
the root port?  Or only drivers that are using some particular link?
You don't want to notify/reset every PCI slot in the system (that would
hurt!); ideally one resets only the one PCI slot that was affected.

> +As different kinds of devices might use different approaches
> +to reset link, AER port service driver is required to provide the
> +function to reset link. Firstly, kernel looks for if the upstream
> +component has an aer driver. If it has, kernel uses the reset_link
> +callback of the aer driver. 

I don't yet entirely understand link reset. However, the original
pci error recovery spec was written by assuming that it would be 
the aer root port driver that performs the link reset. The callback
link_reset() was to notify the device driver that the link was reset.

> +8. Frequent Asked Questions
> +
> +Q: What happens if a PCI Express device driver does not provide an
> +error recovery handle?

What's an "error recovery handle"? Does this refer to the 
struct pci_driver {
   struct pci_error_handlers *err_handler;
}

pointer?  I think it does, but at first this is unclear. 

> +Q: How does this infrastructure deal with driver that is not PCI
> +Express aware?
> +
> +A: This infrastructure calls the error callback functions of the
> +driver when an error happens. But if the driver is not aware of
> +PCI Express, the device might not report its own errors to root
> +port.

Which is a good reason to enable error reporting by default, or at
least, to enable error reporting when 
struct pci_driver->err_handler != NULL


--linas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/5] PCI-Express AER implemetation: pcie_portdrv error handler
  2006-07-24 19:37             ` Linas Vepstas
@ 2006-07-26  4:56               ` Zhang, Yanmin
  0 siblings, 0 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-26  4:56 UTC (permalink / raw)
  To: Linas Vepstas; +Cc: LKML, linux-pci maillist, Greg KH, Tom Long Nguyen

On Tue, 2006-07-25 at 03:37, Linas Vepstas wrote:
> Hi,
> 
> Sorry for a late reply...
> 
> On Fri, Jul 14, 2006 at 01:35:38PM +0800, Zhang, Yanmin wrote:
> > 
> > --- linux-2.6.17/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:27:35.000000000 +0800
> > +++ linux-2.6.17_aer/drivers/pci/pcie/portdrv_pci.c	2006-06-22 16:46:29.000000000 +0800
> > +
> > +static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
> > +					enum pci_channel_state error)
> > +{
> > +	/* If fatal, save cfg space for possible link reset at upstream */
> > +	if (error == pci_channel_io_frozen)
> > +		pcie_portdrv_save_config(dev);
> 
> If the channel is frozen, is the config space still readable? 
> In my case, I had to save config space data early on before
> the bus error. 
You are right.

> 
> What's more, I discovered that I had to save the pci config
> space data before device drivers do their probe. During the probe,
> device drivers will change the config. For example, they'll enable
> interrupts and dma. If you turn these on, and then do the probe,
> you'll get spectacular failures.
> 
> To be safe, I found the best thing to do was to save the pci
> config space state as it was during boot, before the PCI probe 
> routines ran.
Thanks. I will try.

> 
> --linas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/5] PCI-Express AER implemetation: aer howto document
  2006-07-24 20:48   ` Linas Vepstas
@ 2006-07-26  5:48     ` Zhang, Yanmin
  0 siblings, 0 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-26  5:48 UTC (permalink / raw)
  To: Linas Vepstas; +Cc: LKML, linux-pci maillist, Greg KH, Tom Long Nguyen

On Tue, 2006-07-25 at 04:48, Linas Vepstas wrote:
> Hi,
> 
> More late commentary ...
> 
> On Fri, Jul 14, 2006 at 01:25:49PM +0800, Zhang, Yanmin wrote:
> > --- linux-2.6.17/Documentation/pcieaer-howto.txt	1970-01-01 08:00:00.000000000 +0800
> > +++ linux-2.6.17_aer/Documentation/pcieaer-howto.txt	2006-07-14 11:09:37.000000000 +0800
> > +6.1. Configuring the AER capability structure
> > +
> > +AER aware drivers of PCI Express component need change the device
> > +control registers to enable AER. They also could change AER registers,
> > +including mask and severity registers.
> 
> Hmm. Why not just enable error reporting for everything? Why make 
> the device driver jump through this extra hoop?
> If there is some really good reason not to enable reporting by default,
> (which I cannot think of at the moment), then there's another
> possibility: enable error reporting if and only if the device
> driver has struct pci_driver -> err_handler != NULL.
That could work. If we choose to enable AER for all devices by default, we
could do so in pci_enable_device(). The AER driver could then only be built
into the kernel, not as a module.

> 
> > +6.2. Provide PCI error-recovery callbacks
> > +
> > +If an error message indicates a non-fatal error, performing link reset
> > +at upstream is not required. The AER driver calls error_detected(dev,
> > +pci_channel_io_normal) to all drivers associated within a hierarchy in
> 
> Hmm. I would rather extend enum pci_channel_state to include a non-fatal 
> error notification. That is, add pci_channel_io_nonfatal_error=4; to the enum.
It looks good, but it makes things complicated. I once wrote the error handlers
for the tg3 PCI-E driver and found pci-error-recovery.txt hard to follow.

In the error handlers you wrote earlier for some NIC drivers, you chose
to do a slot reset for all errors. Is it really useful to add a new state?

>   
> 
> > +If an error message indicates a fatal error, kernel will broadcast
> > +error_detected(dev, pci_channel_io_frozen) to all drivers within
> > a hierarchy in question.
>
> "The hierarchy in question" -- does that mean all drivers attached to
> the root port?  Or only drivers that are using some particular link?
> You don't want to notify/reset every PCI slot in the system (that would
> hurt!); ideally one resets only the one PCI slot that was affected.
The hierarchy consists of all the devices below the device that reports
the error to the root port. For example, assume a tg3 NIC is connected
through a switch. The connection is like:
NIC<==>Downstream port B<==>Upstream port A<==>Root port.

If Upstream port A captures an AER error, the hierarchy consists of
Downstream port B and NIC.
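
As a rough illustration of the hierarchy described above, the following
user-space sketch notifies every device below the reporting device and
nothing above it. The tree type and demo_broadcast_below() are hypothetical
names made up for this example; the patch's own data structures differ.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CHILDREN 4

/* Hypothetical device node used only for this illustration. */
struct demo_node {
	const char *name;
	struct demo_node *child[DEMO_MAX_CHILDREN];
	int notified;
};

/*
 * Notify every device strictly below 'reporter' and return how many
 * devices were visited. The reporter itself is not notified here.
 */
static int demo_broadcast_below(struct demo_node *reporter)
{
	int count = 0;

	assert(reporter != NULL);
	for (size_t i = 0; i < DEMO_MAX_CHILDREN; i++) {
		struct demo_node *c = reporter->child[i];

		if (!c)
			continue;
		c->notified = 1;	/* stands in for calling error_detected() */
		count += 1 + demo_broadcast_below(c);
	}
	return count;
}
```

With the topology from the example, calling demo_broadcast_below() on
Upstream port A marks Downstream port B and the NIC, and leaves the root
port untouched.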

> 
> > +As different kinds of devices might use different approaches
> > +to reset link, AER port service driver is required to provide the
> > +function to reset link. Firstly, kernel looks for if the upstream
> > +component has an aer driver. If it has, kernel uses the reset_link
> > +callback of the aer driver. 
> 
> I don't yet entirely understand link reset. However, the original
> pci error recovery spec was written by assuming that it would be 
> the aer root port driver that performs the link reset.
The AER root port driver is just the AER port service driver. A specific
switch's upstream port might use a different link reset method.


>  The callback
> link_reset() was to notify the device driver that the link was reset.
The definition of the link_reset callback is confusing and makes driver error
handlers too complicated. A link reset only happens on a fatal error, so the
driver should do a full recovery.

> 
> > +8. Frequent Asked Questions
> > +
> > +Q: What happens if a PCI Express device driver does not provide an
> > +error recovery handle?
> 
> What's an "error recovery handle"? Does this refer to the 
> struct pci_driver {
>    struct pci_error_handlers *err_handler;
> }
> 
> pointer?  I think it does, but at first this is unclear. 
Yes. I will add more pointers.

> 
> > +Q: How does this infrastructure deal with driver that is not PCI
> > +Express aware?
> > +
> > +A: This infrastructure calls the error callback functions of the
> > +driver when an error happens. But if the driver is not aware of
> > +PCI Express, the device might not report its own errors to root
> > +port.
> 
> Which is a good reason to enable error reporting by default, or at
> least, to enable error reporting when 
> struct pci_driver->err_handler != NULL
Does that hold for bridges (upstream and downstream ports)? Most bridges
have no drivers, let alone error handlers.
Currently, I choose to enable AER for all ports (bridges) by default.
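
The two ideas in this exchange (enable reporting on an endpoint only when its
driver supplies err_handler, and always enable it on ports) could be combined
into one policy check. The sketch below is an assumption about how such a
check might look, not code from the patch; all demo_* types are hypothetical
stand-ins for the kernel structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the kernel's definitions. */
struct demo_err_handlers { int unused; };
struct demo_driver { const struct demo_err_handlers *err_handler; };
struct demo_device {
	bool is_port;				/* root, upstream or downstream port */
	const struct demo_driver *driver;	/* NULL if no driver is bound */
};

/* Ports: always on. Endpoints: only when the bound driver has error handlers. */
static bool demo_should_enable_aer(const struct demo_device *dev)
{
	assert(dev != NULL);
	if (dev->is_port)
		return true;
	return dev->driver != NULL && dev->driver->err_handler != NULL;
}
```

Under this policy a driverless bridge still reports errors, while an endpoint
whose driver has no error handlers stays quiet.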

Thanks,
Yanmin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-31  3:14   ` [PATCH 3/5] PCI-Express AER implemetation: export pcie_port_bus_type Zhang, Yanmin
@ 2006-07-31  3:22     ` Zhang, Yanmin
  0 siblings, 0 replies; 29+ messages in thread
From: Zhang, Yanmin @ 2006-07-31  3:22 UTC (permalink / raw)
  To: LKML; +Cc: linux-pci maillist, Greg KH, Tom Long Nguyen

From: Zhang, Yanmin <yanmin.zhang@intel.com>

Patch 4 implements the core part of PCI-Express AER and aerdrv
port service driver.

When a root port service device is probed, the aerdrv will call
request_irq to register irq handler for AER error interrupt.

When a device sends a PCI-Express error message to the root port,
the root port will trigger an interrupt, by either MSI or IO-APIC,
then kernel would run the irq handler. The handler collects root
error status register and schedules a work. The work will call
the core part to process the error based on its type
(Correctable/non-fatal/fatal).

As for Correctable errors, the patch chooses to just clear the correctable
error status register of the device.

As for the non-fatal error, the patch follows generic PCI error handler
rules to call the error callback functions of the endpoint's driver. If
the device is a bridge, the patch chooses to broadcast the error to
downstream devices.

As for the fatal error, the patch resets the pci-express link and
follows generic PCI error handler rules to call the error callback
functions of the endpoint's driver. If the device is a bridge, the patch
chooses to broadcast the error to downstream devices.
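
The handling described above can be summarised as a small decision function.
This is a user-space sketch under assumptions: the demo_* names and the
action flags are invented for illustration, and treating a bridge as
broadcast-only is one reading of the description, not a statement about the
patch's code.

```c
#include <assert.h>

/* Hypothetical severity values; the patch's own constants live in aerdrv.h. */
enum demo_severity { DEMO_CORRECTABLE, DEMO_NONFATAL, DEMO_FATAL };

/* Hypothetical action flags returned by demo_plan_recovery(). */
#define DEMO_CLEAR_STATUS  0x1u	/* clear the correctable error status */
#define DEMO_RESET_LINK    0x2u	/* reset the PCI Express link */
#define DEMO_RUN_CALLBACKS 0x4u	/* call the endpoint driver's callbacks */
#define DEMO_BROADCAST     0x8u	/* forward to devices below a bridge */

/* Return the set of recovery actions for one reported error. */
static unsigned int demo_plan_recovery(enum demo_severity sev, int is_bridge)
{
	unsigned int actions = 0;

	assert(sev == DEMO_CORRECTABLE || sev == DEMO_NONFATAL ||
	       sev == DEMO_FATAL);

	/* Correctable: nothing to recover, just clear the status register. */
	if (sev == DEMO_CORRECTABLE)
		return DEMO_CLEAR_STATUS;

	/* Fatal errors reset the link before drivers are told to recover. */
	if (sev == DEMO_FATAL)
		actions |= DEMO_RESET_LINK;

	/* Both uncorrectable cases then notify the affected drivers. */
	actions |= is_bridge ? DEMO_BROADCAST : DEMO_RUN_CALLBACKS;
	return actions;
}
```

The only difference between the non-fatal and fatal paths in this sketch is
the extra link reset.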

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_acpi.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_acpi.c	2006-07-14 11:02:09.000000000 +0800
@@ -0,0 +1,68 @@
+/*
+ * Access ACPI _OSC method
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+/**
+ * aer_osc_setup - run ACPI _OSC method
+ *
+ * Return: 
+ *	Zero on success, nonzero otherwise.
+ *
+ * Invoked when PCIE bus loads AER service driver. To avoid conflicts with
+ * BIOS AER support, the BIOS must yield AER control to the OS native driver.
+ **/
+int aer_osc_setup(struct pci_dev *dev)
+{
+	int retval = OSC_METHOD_RUN_SUCCESS;
+	acpi_status status;
+	acpi_handle handle = DEVICE_ACPI_HANDLE(&dev->dev);
+	struct pci_dev *pdev = dev;
+	struct pci_bus *parent;
+
+	while (!handle) {
+		if (!pdev || !pdev->bus->parent)
+			break;
+		parent = pdev->bus->parent;
+		if (!parent->self)
+			/* Parent must be a host bridge */
+			handle = acpi_get_pci_rootbridge_handle(
+					pci_domain_nr(parent),
+					parent->number);
+		else
+			handle = DEVICE_ACPI_HANDLE(
+					&(parent->self->dev));
+		pdev = parent->self;
+	}
+
+	if (!handle)
+		return OSC_METHOD_NOT_SUPPORTED;
+
+	pci_osc_support_set(OSC_EXT_PCI_CONFIG_SUPPORT);
+	status = pci_osc_control_set(handle, OSC_PCI_EXPRESS_AER_CONTROL |
+		OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL);
+	if (ACPI_FAILURE(status)) {
+		if (status == AE_SUPPORT)
+			retval = OSC_METHOD_NOT_SUPPORTED;
+		else
+			retval = OSC_METHOD_RUN_FAILURE;
+	}
+
+	return retval;
+}
+
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_core.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_core.c	2006-07-31 10:11:43.000000000 +0800
@@ -0,0 +1,757 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv_core.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * This file implements the core part of PCI-Express AER. When a pci-express
+ * error is delivered, an error message is collected and printed to the
+ * console, and then an error recovery procedure is executed by following
+ * the pci error recovery rules.
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+static int forceload;
+module_param(forceload, bool, 0);
+
+#define PCI_CFG_SPACE_SIZE	(0x100)
+int pci_find_aer_capability(struct pci_dev *dev)
+{
+	int pos;
+	u32 reg32 = 0;
+
+	/* Check if it's a pci-express device */
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return 0;
+
+	/* Check if it supports pci-express AER */
+	pos = PCI_CFG_SPACE_SIZE;
+	while (pos) {
+		if (pci_read_config_dword(dev, pos, &reg32))
+			return 0;
+
+		/* some broken boards return ~0 */
+		if (reg32 == 0xffffffff)
+			return 0;
+
+		if (PCI_EXT_CAP_ID(reg32) == PCI_EXT_CAP_ID_ERR)
+			break;
+
+		pos = reg32 >> 20;
+	}
+
+	return pos;
+}
+
+int pci_enable_pcie_error_reporting(struct pci_dev *dev)
+{
+	u16 reg16 = 0;
+	int pos;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 |
+		PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE;
+	pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+			reg16);
+	return 0;
+}
+
+int pci_disable_pcie_error_reporting(struct pci_dev *dev)
+{
+	u16 reg16 = 0;
+	int pos;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 & ~(PCI_EXP_DEVCTL_CERE |
+			PCI_EXP_DEVCTL_NFERE |
+			PCI_EXP_DEVCTL_FERE |
+			PCI_EXP_DEVCTL_URRE);
+	pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+			reg16);
+	return 0;
+}
+
+int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, mask;
+
+	pos = pci_find_aer_capability(dev);
+	if (!pos)
+		return -EIO;
+
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+	if (dev->error_state == pci_channel_io_normal)
+		status &= ~mask; /* Clear corresponding nonfatal bits */
+	else
+		status &= mask; /* Clear corresponding fatal bits */
+	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+
+	return 0;
+}
+
+static int find_device_iter(struct device *device, void *data)
+{
+	struct pci_dev *dev;
+	u16 id = *(unsigned long *)data;
+	u8 secondary, subordinate, d_bus = id >> 8;
+
+	if (device->bus == &pci_bus_type) {
+		dev = to_pci_dev(device);
+		if (id == ((dev->bus->number << 8) | dev->devfn)) {
+			/*
+			 * Device ID match
+			 */
+			*(unsigned long*)data = (unsigned long)device;
+			return 1;
+		}
+
+		/*
+		 * If device is a P2P bridge, is it upstream of the source?
+		 */
+		if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+			pci_read_config_byte(dev, PCI_SECONDARY_BUS,
+				&secondary);
+			pci_read_config_byte(dev, PCI_SUBORDINATE_BUS,
+				&subordinate);
+			if (d_bus >= secondary && d_bus <= subordinate) {
+				*(unsigned long*)data = (unsigned long)device;
+				return 1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * find_source_device - search through device hierarchy for source device
+ * @parent: pointer to Root Port pci_dev data structure
+ * @id: device ID of agent who sends an error message to this Root Port
+ *
+ * Invoked when error is detected at the Root Port.
+ **/
+static struct device* find_source_device(struct pci_dev *parent, u16 id)
+{
+	struct pci_dev *dev = parent;
+	struct device *device;
+	unsigned long device_addr;
+	int status;
+
+	/* Is Root Port an agent that sends error message? */
+	if (id == ((dev->bus->number << 8) | dev->devfn)) 
+		return &dev->dev;
+
+	do {
+		device_addr = id;
+		if ((status = device_for_each_child(&dev->dev,
+			&device_addr, find_device_iter))) {
+			device = (struct device*)device_addr;
+			dev = to_pci_dev(device);
+			if (id == ((dev->bus->number << 8) | dev->devfn))
+				return device;
+		}
+	} while (status);
+
+	return NULL;
+}
+
+static void report_error_detected(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	dev->error_state = result_data->state;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->error_detected) {
+		if (result_data->state == pci_channel_io_frozen &&
+			!(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
+			/*
+			 * In case of fatal recovery, if one of the
+			 * downstream devices has no driver, we might
+			 * be unable to recover because a later insmod
+			 * of a driver for this device is unaware of
+			 * its hw state.
+			 */
+			printk(KERN_DEBUG "Device ID[%s] has %s\n",
+					dev->dev.bus_id, (dev->driver) ?
+					"no AER-aware driver" : "no driver");
+		}
+		return;
+	}
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->error_detected(dev, result_data->state);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_mmio_enabled(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->mmio_enabled)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->mmio_enabled(dev);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_slot_reset(struct pci_dev *dev, void *data)
+{
+	pci_ers_result_t vote;
+	struct pci_error_handlers *err_handler;
+	struct aer_broadcast_data *result_data;
+	result_data = (struct aer_broadcast_data *) data;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->slot_reset)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	vote = err_handler->slot_reset(dev);
+	result_data->result = merge_result(result_data->result, vote);
+	return;
+}
+
+static void report_resume(struct pci_dev *dev, void *data)
+{
+	struct pci_error_handlers *err_handler;
+
+	dev->error_state = pci_channel_io_normal;
+
+	if (!dev->driver ||
+		!dev->driver->err_handler ||
+		!dev->driver->err_handler->resume)
+		return;
+
+	err_handler = dev->driver->err_handler;
+	err_handler->resume(dev);
+	return;
+}
+
+/**
+ * broadcast_error_message - handle message broadcast to downstream drivers
+ * @dev: pointer to where in the hierarchy the message is broadcast from
+ * @state: error state
+ * @error_mesg: name of the message being broadcast, used for logging
+ * @cb: callback to be broadcast
+ *
+ * Invoked during the error recovery process. The error state is broadcast
+ * to all downstream drivers in the hierarchy in question.
+ **/
+static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
+	enum pci_channel_state state,
+	char *error_mesg,
+	void (*cb)(struct pci_dev *, void *))
+{
+	struct aer_broadcast_data result_data;
+
+	printk(KERN_DEBUG "Broadcast %s message\n", error_mesg);
+	result_data.state = state;
+	if (cb == report_error_detected)
+		result_data.result = PCI_ERS_RESULT_CAN_RECOVER;
+	else
+		result_data.result = PCI_ERS_RESULT_RECOVERED;
+
+	if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+		/*
+		 * If the error is reported by a bridge, we think this error
+		 * is related to the downstream link of the bridge, so we
+		 * do error recovery on all subordinates of the bridge instead
+		 * of the bridge and clear the error status of the bridge.
+		 */
+		if (cb == report_error_detected)
+			dev->error_state = state;
+		pci_walk_bus(dev->subordinate, cb, &result_data);
+		if (cb == report_resume) {
+			pci_cleanup_aer_uncorrect_error_status(dev);
+			dev->error_state = pci_channel_io_normal;
+		}
+	}
+	else {
+		/*
+		 * If the error is reported by an end point, we think this
+		 * error is related to the upstream link of the end point.
+		 */
+		pci_walk_bus(dev->bus, cb, &result_data);
+	}
+
+	return result_data.result;
+}
+
+struct find_aer_service_data {
+	struct pcie_port_service_driver *aer_driver;
+	int is_downstream;
+};
+
+static int find_aer_service_iter(struct device *device, void *data)
+{
+	struct device_driver *driver;
+	struct pcie_port_service_driver *service_driver;
+	struct pcie_device *pcie_dev;
+	struct find_aer_service_data *result;
+
+	result = (struct find_aer_service_data *) data;
+
+	if (device->bus == &pcie_port_bus_type) {
+		pcie_dev = to_pcie_device(device);
+		if (pcie_dev->id.port_type == PCIE_SW_DOWNSTREAM_PORT)
+			result->is_downstream = 1;
+
+		driver = device->driver;
+		if (driver) {
+			service_driver = to_service_driver(driver);
+			if (service_driver->id_table->service_type ==
+					PCIE_PORT_SERVICE_AER) {
+				result->aer_driver = service_driver;
+				return 1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static void find_aer_service(struct pci_dev *dev,
+		struct find_aer_service_data *data)
+{
+	device_for_each_child(&dev->dev, data, find_aer_service_iter);
+}
+
+static pci_ers_result_t reset_link(struct pcie_device *aerdev,
+		struct pci_dev *dev)
+{
+	struct pci_dev *udev;
+	pci_ers_result_t status;
+	struct find_aer_service_data data;
+
+	if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)
+		udev = dev;
+	else
+		udev = dev->bus->self;
+
+	data.is_downstream = 0;
+	data.aer_driver = NULL;
+	find_aer_service(udev, &data);
+
+	/*
+	 * Use the AER driver of the error agent first.
+	 * If it has no AER driver, use the root port's.
+	 */
+	if (!data.aer_driver || !data.aer_driver->reset_link) {
+		if (data.is_downstream &&
+			aerdev->device.driver &&
+			to_service_driver(aerdev->device.driver)->reset_link) {
+			data.aer_driver =
+				to_service_driver(aerdev->device.driver);
+		} else {
+			printk(KERN_DEBUG "No link-reset support to Device ID"
+				"[%s]\n",
+				dev->dev.bus_id);
+			return PCI_ERS_RESULT_DISCONNECT;
+		}
+	}
+
+	status = data.aer_driver->reset_link(udev);
+	if (status != PCI_ERS_RESULT_RECOVERED) {
+		printk(KERN_DEBUG "Link reset at upstream Device ID"
+			"[%s] failed\n",
+			udev->dev.bus_id);
+		return PCI_ERS_RESULT_DISCONNECT;
+	}
+
+	return status;
+}
+
+/**
+ * do_recovery - handle nonfatal/fatal error recovery process
+ * @aerdev: pointer to a pcie_device data structure of root port
+ * @dev: pointer to a pci_dev data structure of agent detecting an error
+ * @severity: error severity type
+ *
+ * Invoked when an error is nonfatal/fatal. Broadcasts the error_detected
+ * message to all downstream drivers within the hierarchy in question and
+ * returns the resulting recovery status.
+ **/
+static pci_ers_result_t do_recovery(struct pcie_device *aerdev,
+		struct pci_dev *dev,
+		int severity)
+{
+	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
+	enum pci_channel_state state;
+
+	if (severity == AER_FATAL)
+		state = pci_channel_io_frozen;
+	else
+		state = pci_channel_io_normal;
+
+	status = broadcast_error_message(dev,
+			state,
+			"error_detected",
+			report_error_detected);
+
+	if (severity == AER_FATAL) {
+		result = reset_link(aerdev, dev);
+		if (result != PCI_ERS_RESULT_RECOVERED) {
+			/* TODO: Should panic here? */
+			return result;
+		}
+	}
+
+	if (status == PCI_ERS_RESULT_CAN_RECOVER)
+		status = broadcast_error_message(dev,
+				state,
+				"mmio_enabled",
+				report_mmio_enabled);
+
+	if (status == PCI_ERS_RESULT_NEED_RESET) {
+		/*
+		 * TODO: Should call platform-specific
+		 * functions to reset slot before calling
+		 * drivers' slot_reset callbacks?
+		 */
+		status = broadcast_error_message(dev,
+				state,
+				"slot_reset",
+				report_slot_reset);
+	}
+
+	if (status == PCI_ERS_RESULT_RECOVERED)
+		broadcast_error_message(dev,
+				state,
+				"resume",
+				report_resume);
+
+	return status;
+}
+
+/**
+ * handle_error_source - handle logging error into an event log
+ * @aerdev: pointer to pcie_device data structure of the root port
+ * @dev: pointer to pci_dev data structure of error source device
+ * @info: comprehensive error information
+ *
+ * Invoked when an error is detected by Root Port.
+ **/
+static void handle_error_source(struct pcie_device * aerdev,
+	struct pci_dev *dev,
+	struct aer_err_info info)
+{
+	pci_ers_result_t status = 0;
+	int pos;
+
+	if (info.severity == AER_CORRECTABLE) {
+		/*
+		 * Correctable error does not need software intervention.
+		 * No need to go through error recovery process.
+		 */
+		pos = pci_find_aer_capability(dev);
+		if (pos)
+			pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+					info.status);
+	} else {
+		status = do_recovery(aerdev, dev, info.severity);
+		if (status == PCI_ERS_RESULT_RECOVERED) {
+			printk(KERN_DEBUG "AER driver successfully recovered\n");
+		} else {
+			/* TODO: Should kernel panic here? */ 
+			printk(KERN_DEBUG "AER driver didn't recover\n");
+		}
+	}
+}
+
+/**
+ * aer_enable_rootport - enable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus loads AER service driver.
+ **/
+void aer_enable_rootport(struct aer_rpc *rpc)
+{
+	struct pci_dev *pdev = rpc->rpd->port;
+	int pos, aer_pos;
+	u16 reg16;
+	u32 reg32;
+
+	pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+	/* Clear PCIE Capability's Device Status */
+	pci_read_config_word(pdev, pos+PCI_EXP_DEVSTA, &reg16);
+	pci_write_config_word(pdev, pos+PCI_EXP_DEVSTA, reg16);
+
+	/* Disable system error generation in response to error messages */
+	pci_read_config_word(pdev, pos + PCI_EXP_RTCTL, &reg16);
+	reg16 &= ~(SYSTEM_ERROR_INTR_ON_MESG_MASK);
+	pci_write_config_word(pdev, pos + PCI_EXP_RTCTL, reg16);
+
+	aer_pos = pci_find_aer_capability(pdev);
+	/* Clear error status */
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, reg32);
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, reg32);
+	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
+	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
+
+	/* Enable Root Port device reporting error itself */
+	pci_read_config_word(pdev, pos+PCI_EXP_DEVCTL, &reg16);
+	reg16 = reg16 |
+		PCI_EXP_DEVCTL_CERE |
+		PCI_EXP_DEVCTL_NFERE |
+		PCI_EXP_DEVCTL_FERE |
+		PCI_EXP_DEVCTL_URRE;
+	pci_write_config_word(pdev, pos+PCI_EXP_DEVCTL,
+		reg16);
+
+	/* Enable Root Port's interrupt in response to error messages */
+	pci_write_config_dword(pdev,
+		aer_pos + PCI_ERR_ROOT_COMMAND,
+		ROOT_PORT_INTR_ON_MESG_MASK);
+}
+
+/**
+ * disable_root_aer - disable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus unloads AER service driver.
+ **/
+static void disable_root_aer(struct aer_rpc *rpc)
+{
+	struct pci_dev *pdev = rpc->rpd->port;
+	u32 reg32;
+	int pos;
+
+	pos = pci_find_aer_capability(pdev);
+	/* Disable Root's interrupt in response to error messages */
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+	/* Clear Root's error status reg */
+	pci_read_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, reg32);
+}
+
+/**
+ * get_e_source - retrieve an error source
+ * @rpc: pointer to the root port which holds an error
+ *
+ * Invoked by DPC handler to consume an error.
+ **/
+static struct aer_err_source* get_e_source(struct aer_rpc *rpc)
+{
+	struct aer_err_source *e_source;
+	unsigned long flags;
+
+	/* Lock access to Root error producer/consumer index */
+	spin_lock_irqsave(&rpc->e_lock, flags);
+	if (rpc->prod_idx == rpc->cons_idx) {
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return NULL;
+	}
+	e_source = &rpc->e_sources[rpc->cons_idx];
+	rpc->cons_idx++;
+	if (rpc->cons_idx == AER_ERROR_SOURCES_MAX)
+		rpc->cons_idx = 0;
+	spin_unlock_irqrestore(&rpc->e_lock, flags);
+	
+	return e_source;
+}
+
+static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
+{
+	int pos;
+
+	pos = pci_find_aer_capability(dev);
+
+	/* The device might not support AER */
+	if (!pos)
+		return AER_SUCCESS;
+
+	if (info->severity == AER_CORRECTABLE) {
+		pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+			&info->status);
+		if (!(info->status & ERR_CORRECTABLE_ERROR_MASK))
+			return AER_UNSUCCESS; 
+	} else if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE ||
+		info->severity == AER_NONFATAL) {
+
+		/* Link is still healthy for IO reads */
+		pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS,
+			&info->status);
+		if (!(info->status & ERR_UNCORRECTABLE_ERROR_MASK))
+			return AER_UNSUCCESS;
+
+		if (info->status & AER_LOG_TLP_MASKS) {
+			info->flags |= AER_TLP_HEADER_VALID_FLAG;
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG, &info->tlp.dw0);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 4, &info->tlp.dw1);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 8, &info->tlp.dw2);
+			pci_read_config_dword(dev, 
+				pos + PCI_ERR_HEADER_LOG + 12, &info->tlp.dw3);
+		}
+	}
+
+	return AER_SUCCESS;
+}
+
+/**
+ * aer_isr_one_error - consume an error detected by root port
+ * @p_device: pointer to error root port service device
+ * @e_src: pointer to an error source
+ **/
+static void aer_isr_one_error(struct pcie_device *p_device,
+		struct aer_err_source *e_src)
+{
+	struct device *s_device;
+	struct aer_err_info e_info = {0, 0, 0,};
+	int i;
+	u16 id;
+
+	/*
+	 * There is a possibility that both a correctable error and an
+	 * uncorrectable error are logged. Report the correctable error first.
+	 */
+	for (i = 1; i & ROOT_ERR_STATUS_MASKS ; i <<= 2) {
+		if (i > 4)
+			break;
+		if (!(e_src->status & i))
+			continue;
+
+		/* Init comprehensive error information */
+		if (i & PCI_ERR_ROOT_COR_RCV) {
+			id = ERR_COR_ID(e_src->id);
+			e_info.severity = AER_CORRECTABLE;
+		} else {
+			id = ERR_UNCOR_ID(e_src->id);
+			e_info.severity = ((e_src->status >> 6) & 1);
+		}
+		if (e_src->status &
+			(PCI_ERR_ROOT_MULTI_COR_RCV |
+			 PCI_ERR_ROOT_MULTI_UNCOR_RCV))
+			e_info.flags |= AER_MULTI_ERROR_VALID_FLAG;
+		if (!(s_device = find_source_device(p_device->port, id))) {
+			printk(KERN_DEBUG "%s->can't find device of ID%04x\n",
+				__FUNCTION__, id);
+			continue;
+		}
+		if (get_device_error_info(to_pci_dev(s_device), &e_info) ==
+				AER_SUCCESS) {
+			aer_print_error(to_pci_dev(s_device), &e_info);
+			handle_error_source(p_device,
+				to_pci_dev(s_device),
+				e_info);
+		}
+	}
+}
+
+/**
+ * aer_isr - consume errors detected by root port
+ * @context: pointer to a private data of pcie device
+ *
+ * Invoked, as DPC, when root port records new detected error
+ **/
+void aer_isr(void *context)
+{
+	struct pcie_device *p_device = (struct pcie_device *) context;
+	struct aer_rpc *rpc = get_service_data(p_device);
+	struct aer_err_source *e_src;
+
+	mutex_lock(&rpc->rpc_mutex);
+	e_src = get_e_source(rpc);
+	while (e_src) {
+		aer_isr_one_error(p_device, e_src);
+		e_src = get_e_source(rpc);
+	}
+	mutex_unlock(&rpc->rpc_mutex);
+
+	wake_up(&rpc->wait_release);
+}
+
+/**
+ * aer_delete_rootport - disable root port aer and delete service data 
+ * @rpc: pointer to a root port device being deleted
+ *
+ * Invoked when AER service unloaded on a specific Root Port
+ **/
+void aer_delete_rootport(struct aer_rpc *rpc)
+{
+	/* Disable root port AER itself */
+	disable_root_aer(rpc);
+	
+	kfree(rpc);
+}
+
+/**
+ * aer_init - provide AER initialization
+ * @dev: pointer to AER pcie device
+ *
+ * Invoked when AER service driver is loaded.
+ **/
+int aer_init(struct pcie_device *dev)
+{
+	int status;
+
+	/* Run _OSC Method */
+	status = aer_osc_setup(dev->port);
+
+	if (status != OSC_METHOD_RUN_SUCCESS) {
+		printk(KERN_DEBUG "%s: AER service init fails - %s\n",
+		__FUNCTION__,
+		(status == OSC_METHOD_NOT_SUPPORTED) ?
+			"No ACPI _OSC support" : "Run ACPI _OSC fails");
+
+		if (!forceload)
+			return status;
+	}
+
+	return AER_SUCCESS;
+}
+
+EXPORT_SYMBOL(pci_find_aer_capability);
+EXPORT_SYMBOL(pci_enable_pcie_error_reporting);
+EXPORT_SYMBOL(pci_disable_pcie_error_reporting);
+EXPORT_SYMBOL(pci_cleanup_aer_uncorrect_error_status);
+
--- linux-2.6.17/include/linux/aer.h	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/include/linux/aer.h	2006-07-31 10:08:51.000000000 +0800
@@ -0,0 +1,24 @@
+/*
+ * Copyright (C) 2006 Intel Corp.
+ *     Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *     Zhang Yanmin (yanmin.zhang@intel.com)
+ */
+
+#ifndef _AER_H_
+#define _AER_H_
+
+#if defined(CONFIG_PCIEAER)
+/* pci-e port driver needs this function to enable aer */
+extern int pci_enable_pcie_error_reporting(struct pci_dev *dev);
+extern int pci_find_aer_capability(struct pci_dev *dev);
+extern int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+extern int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+#else
+#define pci_enable_pcie_error_reporting(dev)		(-EINVAL)
+#define pci_find_aer_capability(dev)			(0)
+#define pci_disable_pcie_error_reporting(dev)		(-EINVAL)
+#define pci_cleanup_aer_uncorrect_error_status(dev)	(-EINVAL)
+#endif
+
+#endif /* _AER_H_ */
+
--- linux-2.6.17/include/linux/pcieport_if.h	2006-06-22 16:26:32.000000000 +0800
+++ linux-2.6.17_aer/include/linux/pcieport_if.h	2006-06-22 16:46:29.000000000 +0800
@@ -61,6 +61,12 @@ struct pcie_port_service_driver {
 	void (*remove) (struct pcie_device *dev);
 	int (*suspend) (struct pcie_device *dev, pm_message_t state);
 	int (*resume) (struct pcie_device *dev);
+
+	/* Service Error Recovery Handler */
+	struct pci_error_handlers *err_handler;
+
+	/* Link Reset Capability - AER service driver specific */
+	pci_ers_result_t (*reset_link) (struct pci_dev *dev);
 
 	const struct pcie_port_service_id *id_table;
 	struct device_driver driver;
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv.h	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv.h	2006-07-29 14:37:55.000000000 +0800
@@ -0,0 +1,135 @@
+/*
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#ifndef _AERDRV_H_
+#define _AERDRV_H_
+
+#include <linux/pcieport_if.h>
+#include <linux/aer.h>
+
+#define AER_NONFATAL			0
+#define AER_FATAL			1
+#define AER_CORRECTABLE			2
+#define AER_UNCORRECTABLE		4
+#define AER_ERROR_MASK			0x001fffff
+#define AER_ERROR(d)			((d) & AER_ERROR_MASK)
+
+#define VERBOSE_LIMIT_DISPLAY		1
+#define VERBOSE_FULL_DISPLAY		2
+#define VERBOSE_RAW_DISPLAY		3
+#define VERBOSE_MASK			0x3
+
+#define OSC_METHOD_RUN_SUCCESS		0
+#define OSC_METHOD_NOT_SUPPORTED	1
+#define OSC_METHOD_RUN_FAILURE		2
+
+/* Root Error Status Register Bits */
+#define ROOT_ERR_STATUS_MASKS			0x0f
+
+#define SYSTEM_ERROR_INTR_ON_MESG_MASK	(PCI_EXP_RTCTL_SECEE|	\
+					PCI_EXP_RTCTL_SENFEE|	\
+					PCI_EXP_RTCTL_SEFEE)
+#define ROOT_PORT_INTR_ON_MESG_MASK	(PCI_ERR_ROOT_CMD_COR_EN|	\
+					PCI_ERR_ROOT_CMD_NONFATAL_EN|	\
+					PCI_ERR_ROOT_CMD_FATAL_EN)
+#define ERR_COR_ID(d)			((d) & 0xffff)
+#define ERR_UNCOR_ID(d)			((d) >> 16)
+
+#define AER_SUCCESS			0
+#define AER_UNSUCCESS			1
+#define AER_ERROR_SOURCES_MAX		100
+
+#define AER_LOG_TLP_MASKS		(PCI_ERR_UNC_POISON_TLP|	\
+					PCI_ERR_UNC_ECRC|		\
+					PCI_ERR_UNC_UNSUP|		\
+					PCI_ERR_UNC_COMP_ABORT|		\
+					PCI_ERR_UNC_UNX_COMP|		\
+					PCI_ERR_UNC_MALF_TLP)
+
+/* AER Error Info Flags */
+#define AER_TLP_HEADER_VALID_FLAG	0x00000001
+#define AER_MULTI_ERROR_VALID_FLAG	0x00000002
+
+#define ERR_CORRECTABLE_ERROR_MASK	0x000031c1
+#define ERR_UNCORRECTABLE_ERROR_MASK	0x001ff010
+
+struct header_log_regs {
+	unsigned int dw0;
+	unsigned int dw1;
+	unsigned int dw2;
+	unsigned int dw3;
+};
+
+struct aer_err_info {
+	int severity;			/* 0:NONFATAL | 1:FATAL | 2:COR */
+	int flags;			
+	unsigned int status;		/* COR/UNCOR Error Status */
+	struct header_log_regs tlp; 	/* TLP Header */
+};
+
+struct aer_err_source {
+	unsigned int status;
+	unsigned int id;
+};
+
+struct aer_rpc {
+	struct pcie_device *rpd;	/* Root Port device */
+	struct work_struct dpc_handler;
+	struct aer_err_source e_sources[AER_ERROR_SOURCES_MAX];
+	unsigned short prod_idx;	/* Error Producer Index */
+	unsigned short cons_idx;	/* Error Consumer Index */
+	int isr;
+	spinlock_t e_lock;		/* 
+					 * Lock access to Error Status/ID Regs
+					 * and error producer/consumer index
+					 */
+	struct mutex rpc_mutex;		/* 
+					 * only one thread could do
+					 * recovery on the same
+					 * root port hierarchy
+					 */
+	wait_queue_head_t wait_release;
+};
+
+struct aer_broadcast_data {
+	enum pci_channel_state state;
+	enum pci_ers_result result;
+};
+
+static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
+		enum pci_ers_result new)
+{
+	switch (orig) {
+	case PCI_ERS_RESULT_CAN_RECOVER:
+	case PCI_ERS_RESULT_RECOVERED:
+		orig = new;
+		break;
+	case PCI_ERS_RESULT_DISCONNECT:
+		if (new == PCI_ERS_RESULT_NEED_RESET)
+			orig = new;
+		break;
+	default:
+		break;
+	}
+
+	return orig;
+}
+
+extern struct bus_type pcie_port_bus_type;
+extern void aer_enable_rootport(struct aer_rpc *rpc);
+extern void aer_delete_rootport(struct aer_rpc *rpc);
+extern int aer_init(struct pcie_device *dev);
+extern void aer_isr(void *context);
+extern void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
+
+#ifdef CONFIG_ACPI
+extern int aer_osc_setup(struct pci_dev *dev);
+#else
+#define  aer_osc_setup(dev)		(OSC_METHOD_NOT_SUPPORTED)
+#endif
+
+#endif /* _AERDRV_H_ */
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv.c	2006-07-31 09:25:12.000000000 +0800
@@ -0,0 +1,346 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * This file implements the AER root port service driver. The driver will
+ * register an irq handler. When root port triggers an AER interrupt, the irq
+ * handler will collect root port status and schedule a work.
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/pcieport_if.h>
+
+#include "aerdrv.h"
+
+/*
+ * Version Information
+ */
+#define DRIVER_VERSION "v1.0"
+#define DRIVER_AUTHOR "tom.l.nguyen@intel.com"
+#define DRIVER_DESC "Root Port Advanced Error Reporting Driver"
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
+MODULE_LICENSE("GPL");
+
+static int __devinit aer_probe(struct pcie_device *dev,
+	const struct pcie_port_service_id *id);
+static void aer_remove(struct pcie_device *dev);
+static int aer_suspend(struct pcie_device *dev, pm_message_t state)
+{return 0;}
+static int aer_resume(struct pcie_device *dev) {return 0;}
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+	enum pci_channel_state error);
+static void aer_error_resume(struct pci_dev *dev);
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
+
+/*
+ * PCI Express bus's AER Root service driver data structure
+ */
+static struct pcie_port_service_id aer_id[] = {
+	{
+	.vendor 	= PCI_ANY_ID, 
+	.device 	= PCI_ANY_ID,
+	.port_type 	= PCIE_RC_PORT, 
+	.service_type 	= PCIE_PORT_SERVICE_AER,
+	},
+	{ /* end: all zeroes */ }
+};
+
+static struct pci_error_handlers aer_error_handlers = {
+	.error_detected = aer_error_detected,
+	.resume = aer_error_resume,
+};
+
+static struct pcie_port_service_driver aerdrv = {
+	.name		= "aer",
+	.id_table	= &aer_id[0],
+
+	.probe		= aer_probe,
+	.remove		= aer_remove,
+
+	.suspend	= aer_suspend,
+	.resume		= aer_resume,
+
+	.err_handler	= &aer_error_handlers,
+
+	.reset_link	= aer_root_reset,
+};
+
+/**
+ * aer_irq - Root Port's ISR
+ * @irq: IRQ assigned to Root Port
+ * @context: pointer to Root Port data structure
+ * @r: pointer to struct pt_regs
+ *
+ * Invoked when Root Port detects AER messages.
+ **/
+static irqreturn_t aer_irq(int irq, void *context, struct pt_regs * r)
+{
+	unsigned int status, id;
+	struct pcie_device *pdev = (struct pcie_device *)context;
+	struct aer_rpc *rpc = get_service_data(pdev);
+	int next_prod_idx;
+	unsigned long flags;
+	int pos;
+
+	pos = pci_find_aer_capability(pdev->port);
+	/* 
+	 * Must lock access to Root Error Status Reg, Root Error ID Reg, 
+	 * and Root error producer/consumer index 
+	 */
+	spin_lock_irqsave(&rpc->e_lock, flags);
+
+	/* Read error status */
+	pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, &status);
+	if (!(status & ROOT_ERR_STATUS_MASKS)) {
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return IRQ_NONE;
+	}
+
+	/* Read error source and clear error status */
+	pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_COR_SRC, &id);
+	pci_write_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, status);
+
+	/* Store error source for later DPC handler */
+	next_prod_idx = rpc->prod_idx + 1;
+	if (next_prod_idx == AER_ERROR_SOURCES_MAX)
+		next_prod_idx = 0;
+	if (next_prod_idx == rpc->cons_idx) {
+		/* 
+		 * Error Storm Condition - possibly the same error occurred.
+		 * Drop the error.
+		 */
+		spin_unlock_irqrestore(&rpc->e_lock, flags);
+		return IRQ_HANDLED;
+	}
+	rpc->e_sources[rpc->prod_idx].status =  status;
+	rpc->e_sources[rpc->prod_idx].id = id;
+	rpc->prod_idx = next_prod_idx;
+	spin_unlock_irqrestore(&rpc->e_lock, flags);
+
+	/*  Invoke DPC handler */
+	schedule_work(&rpc->dpc_handler);
+
+	return IRQ_HANDLED;
+}
+
+/**
+ * aer_alloc_rpc - allocate Root Port data structure
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when Root Port's AER service is loaded.
+ **/
+static struct aer_rpc* aer_alloc_rpc(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc;
+
+	if (!(rpc = (struct aer_rpc *)kmalloc(sizeof(struct aer_rpc), 
+		GFP_KERNEL)))
+		return NULL;
+
+	memset(rpc, 0, sizeof(struct aer_rpc));
+	/* 
+	 * Initialize Root lock access, e_lock, to Root Error Status Reg, 
+	 * Root Error ID Reg, and Root error producer/consumer index. 
+	 */
+	spin_lock_init(&rpc->e_lock);
+
+	rpc->rpd = dev;
+	INIT_WORK(&rpc->dpc_handler, aer_isr, (void *)dev);
+	rpc->prod_idx = rpc->cons_idx = 0;
+	mutex_init(&rpc->rpc_mutex);
+	init_waitqueue_head(&rpc->wait_release);
+
+	/* Use PCIE bus function to store rpc into PCIE device */
+	set_service_data(dev, rpc);
+
+	return rpc;
+}
+
+/**
+ * aer_remove - clean up resources
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when PCI Express bus unloads or AER probe fails.
+ **/
+static void aer_remove(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc = get_service_data(dev);
+
+	if (rpc) {
+		/* If the interrupt service was registered, it must be freed. */
+		if (rpc->isr)
+			free_irq(dev->irq, dev);
+
+		wait_event(rpc->wait_release, rpc->prod_idx == rpc->cons_idx);
+
+		aer_delete_rootport(rpc);
+		set_service_data(dev, NULL);
+	}
+}
+
+/**
+ * aer_probe - initialize resources
+ * @dev: pointer to the pcie_dev data structure
+ * @id: pointer to the service id data structure
+ *
+ * Invoked when PCI Express bus loads AER service driver.
+ **/
+static int __devinit aer_probe(struct pcie_device *dev,
+				const struct pcie_port_service_id *id)
+{
+	int status;
+	struct aer_rpc *rpc;
+	struct device *device = &dev->device;
+
+	/* Init */
+	if ((status = aer_init(dev)))
+		return status;
+
+	/* Alloc rpc data structure */
+	if (!(rpc = aer_alloc_rpc(dev))) {
+		printk(KERN_DEBUG "%s: Alloc rpc fails on PCIE device[%s]\n",
+			__FUNCTION__, device->bus_id);
+		aer_remove(dev);
+		return -ENOMEM;
+	}
+
+	/* Request IRQ ISR */
+	if ((status = request_irq(dev->irq, aer_irq, SA_SHIRQ, "aerdrv", 
+				dev))) {
+		printk(KERN_DEBUG "%s: Request ISR fails on PCIE device[%s]\n", 
+			__FUNCTION__, device->bus_id);
+		aer_remove(dev);
+		return status;
+	}
+
+	rpc->isr = 1;
+
+	aer_enable_rootport(rpc);
+
+	return status;
+}
+
+/**
+ * aer_root_reset - reset link on Root Port
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver when performing link reset at Root Port.
+ **/
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
+{
+	u16 p2p_ctrl;
+	u32 status;
+	int pos;
+
+	pos = pci_find_aer_capability(dev);
+
+	/* Disable Root's interrupt in response to error messages */ 
+	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+	/* Assert Secondary Bus Reset */
+	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &p2p_ctrl);
+	p2p_ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+	/* De-assert Secondary Bus Reset */
+	p2p_ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+	/*
+	 * System software must wait for at least 100ms from the end
+	 * of a reset of one or more devices before it is permitted
+	 * to issue Configuration Requests to those devices.
+	 */
+	msleep(200);
+	printk(KERN_DEBUG "Complete link reset at Root[%s]\n", dev->dev.bus_id);
+
+	/* Enable Root Port's interrupt in response to error messages */ 
+	pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
+	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, status);
+	pci_write_config_dword(dev,
+		pos + PCI_ERR_ROOT_COMMAND,
+		ROOT_PORT_INTR_ON_MESG_MASK);
+
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
+/**
+ * aer_error_detected - update severity status
+ * @dev: pointer to Root Port's pci_dev data structure
+ * @error: error severity being notified by port bus
+ *
+ * Invoked by Port Bus driver during error recovery.
+ **/
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+			enum pci_channel_state error)
+{
+	/* Root Port has no impact. Always recovers. */
+	return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+/**
+ * aer_error_resume - clean up corresponding error status bits
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver during nonfatal recovery.
+ **/
+static void aer_error_resume(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, mask;
+	u16 reg16;
+
+	/* Clean up Root device status */
+	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	pci_read_config_word(dev, pos + PCI_EXP_DEVSTA, &reg16);
+	pci_write_config_word(dev, pos + PCI_EXP_DEVSTA, reg16);
+
+	/* Clear AER Uncorrectable Error Status */
+	pos = pci_find_aer_capability(dev);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+	if (dev->error_state == pci_channel_io_normal)
+		status &= ~mask; /* Clear corresponding nonfatal bits */
+	else
+		status &= mask; /* Clear corresponding fatal bits */
+	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+}
+
+/**
+ * aer_service_init - register AER root service driver
+ *
+ * Invoked when AER root service driver is loaded.
+ **/
+static int __init aer_service_init(void)
+{
+	return pcie_port_service_register(&aerdrv);
+}
+
+/**
+ * aer_service_exit - unregister AER root service driver
+ *
+ * Invoked when AER root service driver is unloaded.
+ **/
+static void __exit aer_service_exit(void) 
+{
+	pcie_port_service_unregister(&aerdrv);
+}
+
+module_init(aer_service_init);
+module_exit(aer_service_exit);
--- linux-2.6.17/drivers/pci/pcie/aer/Kconfig	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/Kconfig	2006-07-31 10:02:27.000000000 +0800
@@ -0,0 +1,12 @@
+#
+# PCI Express Root Port Device AER Configuration
+#
+
+config PCIEAER
+	boolean "Root Port Advanced Error Reporting support"
+	depends on PCIEPORTBUS 
+	default y
+	help
+	  This enables PCI Express Root Port Advanced Error Reporting
+	  (AER) driver support. Error reporting messages sent to Root
+	  Port will be handled by PCI Express AER driver.
--- linux-2.6.17/drivers/pci/pcie/aer/Makefile	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/Makefile	2006-06-22 16:46:29.000000000 +0800
@@ -0,0 +1,10 @@
+#
+# Makefile for PCI-Express Root Port Advanced Error Reporting Driver
+#
+
+obj-$(CONFIG_PCIEAER)		+= aerdriver.o
+aerdrv_acpi-$(CONFIG_ACPI)	+= aerdrv_acpi.o
+
+aerdriver-objs		:= aerdrv_errprint.o aerdrv_core.o aerdrv.o
+aerdriver-objs		+= $(aerdrv_acpi-y)
+
--- linux-2.6.17/drivers/pci/pcie/Kconfig	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/Kconfig	2006-06-22 16:46:29.000000000 +0800
@@ -34,3 +34,4 @@ config HOTPLUG_PCI_PCIE_POLL_EVENT_MODE
 	   
 	  When in doubt, say N.
 
+source "drivers/pci/pcie/aer/Kconfig"
--- linux-2.6.17/drivers/pci/pcie/Makefile	2006-06-22 16:26:43.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/Makefile	2006-06-22 16:46:29.000000000 +0800
@@ -5,3 +5,6 @@
 pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o
 
 obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
+
+# Build PCI Express AER if needed
+obj-$(CONFIG_PCIEAER)		+= aer/
--- linux-2.6.17/drivers/pci/pcie/aer/aerdrv_errprint.c	1970-01-01 08:00:00.000000000 +0800
+++ linux-2.6.17_aer/drivers/pci/pcie/aer/aerdrv_errprint.c	2006-07-14 10:49:41.000000000 +0800
@@ -0,0 +1,248 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv_errprint.c
+ * 
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Format error messages and print them to console.
+ * 
+ * Copyright (C) 2006 Intel Corp. 
+ *	Tom Long Nguyen (tom.l.nguyen@intel.com)
+ *	Zhang Yanmin (yanmin.zhang@intel.com)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+
+#include "aerdrv.h"
+
+#define AER_AGENT_RECEIVER		0
+#define AER_AGENT_REQUESTER		1
+#define AER_AGENT_COMPLETER		2
+#define AER_AGENT_TRANSMITTER		3		
+
+#define AER_AGENT_REQUESTER_MASK	(PCI_ERR_UNC_COMP_TIME|	\
+					PCI_ERR_UNC_UNSUP)
+
+#define AER_AGENT_COMPLETER_MASK	PCI_ERR_UNC_COMP_ABORT
+
+#define AER_AGENT_TRANSMITTER_MASK(t, e) ((e) & (PCI_ERR_COR_REP_ROLL | \
+	(((t) == AER_CORRECTABLE) ? PCI_ERR_COR_REP_TIMER : 0)))
+
+#define AER_GET_AGENT(t, e)						\
+	(((e) & AER_AGENT_COMPLETER_MASK) ? AER_AGENT_COMPLETER :	\
+	((e) & AER_AGENT_REQUESTER_MASK) ? AER_AGENT_REQUESTER :	\
+	(AER_AGENT_TRANSMITTER_MASK(t, e)) ? AER_AGENT_TRANSMITTER :	\
+	AER_AGENT_RECEIVER)
+
+#define AER_PHYSICAL_LAYER_ERROR_MASK	PCI_ERR_COR_RCVR
+#define AER_DATA_LINK_LAYER_ERROR_MASK(t, e)	\
+		(PCI_ERR_UNC_DLP |		\
+		PCI_ERR_COR_BAD_TLP |		\
+		PCI_ERR_COR_BAD_DLLP |		\
+		PCI_ERR_COR_REP_ROLL |		\
+		(((t) == AER_CORRECTABLE) ?	\
+		PCI_ERR_COR_REP_TIMER : 0))
+
+#define AER_PHYSICAL_LAYER_ERROR	0
+#define AER_DATA_LINK_LAYER_ERROR	1
+#define AER_TRANSACTION_LAYER_ERROR	2
+
+#define AER_GET_LAYER_ERROR(t, e)				\
+	(((e) & AER_PHYSICAL_LAYER_ERROR_MASK) ?		\
+	AER_PHYSICAL_LAYER_ERROR :				\
+	((e) & AER_DATA_LINK_LAYER_ERROR_MASK(t, e)) ?		\
+		AER_DATA_LINK_LAYER_ERROR :			\
+		AER_TRANSACTION_LAYER_ERROR)
+
+/* 
+ * AER error strings 
+ */
+static char* aer_error_severity_string[] = {
+	"Uncorrected (Non-Fatal)", 
+	"Uncorrected (Fatal)",
+	"Corrected"
+};
+
+static char* aer_error_layer[] = {
+	"Physical Layer",
+	"Data Link Layer",
+	"Transaction Layer" 
+};
+static char* aer_correctable_error_string[] = {
+	"Receiver Error        ",	/* Bit Position 0 	*/
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	"Bad TLP               ",	/* Bit Position 6 	*/
+	"Bad DLLP              ",	/* Bit Position 7 	*/
+	"REPLAY_NUM Rollover   ",	/* Bit Position 8 	*/
+	NULL,
+	NULL,
+	NULL,
+	"Replay Timer Timeout  ",	/* Bit Position 12 	*/
+	"Advisory Non-Fatal    ", 	/* Bit Position 13	*/
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+};
+
+static char* aer_uncorrectable_error_string[] = {
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	"Data Link Protocol    ",	/* Bit Position 4	*/
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	"Poisoned TLP          ",	/* Bit Position 12 	*/
+	"Flow Control Protocol ",	/* Bit Position 13	*/
+	"Completion Timeout    ",	/* Bit Position 14 	*/
+	"Completer Abort       ",	/* Bit Position 15 	*/
+	"Unexpected Completion ",	/* Bit Position 16	*/
+	"Receiver Overflow     ",	/* Bit Position 17	*/
+	"Malformed TLP         ",	/* Bit Position 18	*/
+	"ECRC                  ",	/* Bit Position 19	*/
+	"Unsupported Request   ",	/* Bit Position 20	*/
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+};
+
+static char* aer_agent_string[] = {
+	"Receiver ID",
+	"Requester ID",
+	"Completer ID",
+	"Transmitter ID"
+};
+
+static char * aer_get_error_source_name(int severity,
+			unsigned int status,
+			char errmsg_buff[])
+{
+	int i;
+	char * errmsg = NULL;
+
+	for (i = 0; i < 32; i++) {
+		if (!(status & (1U << i)))
+			continue;
+
+		if (severity == AER_CORRECTABLE)
+			errmsg = aer_correctable_error_string[i];
+		else
+			errmsg = aer_uncorrectable_error_string[i];
+
+		if (!errmsg) {
+			sprintf(errmsg_buff, "Unknown Error Bit %2d  ", i);
+			errmsg = errmsg_buff;
+		}
+
+		break;
+	}
+
+	return errmsg;
+}
+
+static DEFINE_SPINLOCK(logbuf_lock);
+static char errmsg_buff[100];
+void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
+{
+	char * errmsg;
+	int err_layer, agent;
+	char * loglevel;
+
+	if (info->severity == AER_CORRECTABLE)
+		loglevel = KERN_WARNING;
+	else
+		loglevel = KERN_ERR;
+
+	printk("%s+------ PCI-Express Device Error ------+\n", loglevel);
+	printk("%sError Severity\t\t: %s\n", loglevel,
+		aer_error_severity_string[info->severity]);
+
+	if (info->status == 0) {
+		printk("%sPCIE Bus Error type\t: (Inaccessible)\n", loglevel);
+		printk("%sInaccessible Received\t: %s\n", loglevel,
+			info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+				"Multiple" : "First");
+		printk("%sUnregistered Agent ID\t: %04x\n", loglevel,
+			(dev->bus->number << 8) | dev->devfn);
+	} else {
+		err_layer = AER_GET_LAYER_ERROR(info->severity, info->status);
+		printk("%sPCIE Bus Error type\t: %s\n", loglevel,
+			aer_error_layer[err_layer]);
+
+		spin_lock(&logbuf_lock);
+		errmsg = aer_get_error_source_name(info->severity,
+				info->status,
+				errmsg_buff);
+		printk("%s%s\t: %s\n", loglevel, errmsg,
+			info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+				"Multiple" : "First");
+		spin_unlock(&logbuf_lock);
+
+		agent = AER_GET_AGENT(info->severity, info->status);
+		printk("%s%s\t\t: %04x\n", loglevel,
+			aer_agent_string[agent],
+			(dev->bus->number << 8) | dev->devfn);
+
+		printk("%sVendorID=%04xh, DeviceID=%04xh,"
+			" Bus=%02xh, Device=%02xh, Function=%02xh\n",
+			loglevel,
+			dev->vendor,
+			dev->device,
+			dev->bus->number,
+			PCI_SLOT(dev->devfn),
+			PCI_FUNC(dev->devfn));
+
+		if (info->flags & AER_TLP_HEADER_VALID_FLAG) {
+			unsigned char *tlp = (unsigned char *) &info->tlp;
+			printk("%sTLP Header:\n", loglevel);
+			printk("%s%02x%02x%02x%02x %02x%02x%02x%02x"
+				" %02x%02x%02x%02x %02x%02x%02x%02x\n",
+				loglevel,
+				*(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
+				*(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
+				*(tlp + 11), *(tlp + 10), *(tlp + 9),
+				*(tlp + 8), *(tlp + 15), *(tlp + 14),
+				*(tlp + 13), *(tlp + 12));
+		}
+	}
+}
+
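
The error-source queue that aer_irq() fills above (prod_idx/cons_idx over
e_sources[], dropping the entry when the queue is full) can be sketched in
user space as follows. This is only an illustration under stated assumptions:
SRC_MAX, struct src_ring, ring_push() and ring_pop() are hypothetical names,
the consumer side is a guess at what the deferred handler does, and the
e_lock locking is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SRC_MAX 4	/* hypothetical capacity; stands in for AER_ERROR_SOURCES_MAX */

struct err_source {
	uint32_t status;
	uint32_t id;
};

struct src_ring {
	struct err_source slots[SRC_MAX];
	size_t prod_idx;	/* next slot the producer writes */
	size_t cons_idx;	/* next slot the consumer reads */
};

/* Producer side: returns false and drops the entry when the ring is full. */
bool ring_push(struct src_ring *r, uint32_t status, uint32_t id)
{
	size_t next = r->prod_idx + 1;

	if (next == SRC_MAX)
		next = 0;
	if (next == r->cons_idx)
		return false;	/* full: one slot is always left unused */

	r->slots[r->prod_idx].status = status;
	r->slots[r->prod_idx].id = id;
	r->prod_idx = next;
	return true;
}

/* Consumer side (assumed): returns false when the ring is empty. */
bool ring_pop(struct src_ring *r, struct err_source *out)
{
	if (r->cons_idx == r->prod_idx)
		return false;

	*out = r->slots[r->cons_idx];
	r->cons_idx++;
	if (r->cons_idx == SRC_MAX)
		r->cons_idx = 0;
	return true;
}
```

Because "full" is detected as next == cons_idx, the ring holds at most
SRC_MAX - 1 entries; that is the price of telling full and empty apart
without a separate counter.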

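The bit-position lookup done by aer_get_error_source_name() can also be
tried outside the kernel. The sketch below is not the patch's code: it
assumes C99 designated initializers instead of the long NULL runs, covers
only the correctable table, and demo_correctable_names and
demo_first_error_name() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Sparse name table indexed by bit position; unlisted entries are NULL. */
static const char *const demo_correctable_names[32] = {
	[0]  = "Receiver Error",
	[6]  = "Bad TLP",
	[7]  = "Bad DLLP",
	[8]  = "REPLAY_NUM Rollover",
	[12] = "Replay Timer Timeout",
	[13] = "Advisory Non-Fatal",
};

/*
 * Return the name of the lowest set bit in status, or NULL if status is 0.
 * Bits without a table entry are formatted into buf instead.
 */
const char *demo_first_error_name(uint32_t status, char *buf, size_t len)
{
	for (int i = 0; i < 32; i++) {
		/* Unsigned constant keeps the shift defined for bit 31. */
		if (!(status & (UINT32_C(1) << i)))
			continue;
		if (demo_correctable_names[i])
			return demo_correctable_names[i];
		snprintf(buf, len, "Unknown Error Bit %2d", i);
		return buf;
	}
	return NULL;
}
```

As in the patch, only the lowest set bit is reported; a status word with
several bits set yields a single name.
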

* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-13  8:45     ` Christoph Hellwig
@ 2006-07-13 11:26       ` Johnny Lever
  0 siblings, 0 replies; 29+ messages in thread
From: Johnny Lever @ 2006-07-13 11:26 UTC (permalink / raw)
  To: Christoph Hellwig, Johnny Lever, Arjan van de Ven, linux-kernel

On 7/13/06, Christoph Hellwig <hch@infradead.org> wrote:
> It is.  Unless you count the endless useless bug reports from users that
> we can't debug, and stupid whining trolls like you, as giving back.  While
> we definitely can't redefine your definition of "giving back", your version
> is at least not appreciated.  So please stop trolling here now and go somewhere
> else.  Don't expect me and others to support you violating the clearly defined
> copyright license (GPLv2) we gave everyone to our code.
>
Oh yeah dear GOD - rrright, those bugs - some greater GOD than you
mandated that you look at them and fix them.

Go sue all those producing binary drivers and demonstrate legally they
are doing something illegal before claiming junk. Or at least offer
some RE graphics drivers so people have a choice to begin with -
unless you do at least one of these - all you spoke is more troll than
anything.

Johnny


* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12 18:07   ` Johnny Lever
@ 2006-07-13  8:45     ` Christoph Hellwig
  2006-07-13 11:26       ` Johnny Lever
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Hellwig @ 2006-07-13  8:45 UTC (permalink / raw)
  To: Johnny Lever; +Cc: Arjan van de Ven, linux-kernel

On Wed, Jul 12, 2006 at 02:07:07PM -0400, Johnny Lever wrote:
> >Demanding that these
> >companies then also allow that same code to be used by other companies
> >who do not give anything back to Linux... is just really bad for Linux
> >both in the short and the long term.
> >
> This is a quite questionable theory and a narrow minded perspective.
> Saying companies that write binary drivers for Linux don't give back
> anything is not factual

It is.  Unless you count the endless useless bug reports from users that
we can't debug, and stupid whining trolls like you, as giving back.  While
we definitely can't redefine your definition of "giving back", your version
is at least not appreciated.  So please stop trolling here now and go somewhere
else.  Don't expect me and others to support you violating the clearly defined
copyright license (GPLv2) we gave everyone to our code.



* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12 17:32 ` Arjan van de Ven
@ 2006-07-12 18:07   ` Johnny Lever
  2006-07-13  8:45     ` Christoph Hellwig
  0 siblings, 1 reply; 29+ messages in thread
From: Johnny Lever @ 2006-07-12 18:07 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel

>Demanding that these
> companies then also allow that same code to be used by other companies
> who do not give anything back to Linux... is just really bad for Linux
> both in the short and the long term.
>
This is a quite questionable theory and a narrow minded perspective.
Saying companies that write binary drivers for Linux don't give back
anything is not factual - they offer many people a possibility to use
Linux where in the absence of those drivers they would have been
forced to use something else, 100% proprietary. In short they offer
Linux a percentage of its user base. Reducing possibilities is never a
good idea  - especially when it comes to users, it does hurt them as
much as it hurts you to see an EXPORT_SYMBOL without a GPL in it - and
there is no reason why it isn't ok to hurt you but the users can be
hurt freely.  If it were so impossible and questionable to write binary
drivers, they would all have disappeared by now. They are still
out there, people use them.

>
> good thing you're using a fake gmail account instead of your own name,
> it means it's just a very obvious troll attempt....

Yeah whatever, my real name is not as important as the topic at hand.
That's just your personal opinion/logic to translate/label it as a
troll.

And I do apologize for saying Dimwits.

Johnny


* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
  2006-07-12 17:21 [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver Johnny Lever
@ 2006-07-12 17:32 ` Arjan van de Ven
  2006-07-12 18:07   ` Johnny Lever
  0 siblings, 1 reply; 29+ messages in thread
From: Arjan van de Ven @ 2006-07-12 17:32 UTC (permalink / raw)
  To: Johnny Lever; +Cc: linux-kernel

On Wed, 2006-07-12 at 13:21 -0400, Johnny Lever wrote:
> >With Arjan's comments, I changed EXPORT_SYMBOL to EXPORT_SYMBOL_GPL.
> Sorry for flooding your emailbox again. :)
> 
> I think we should have a "moron-proof" review system on LKML. Simplest
> way to get started is to ignore the EXPORT_SYMBOL_GPL "missionaries"
> trying to 'convert' code without giving proper thought to its
> implications or without ascertaining the correctness of the
> conversion.

I don't know where you get the "convert" idea from; this is new code and
new, linux specific API/functionality.

Some companies and people contribute a lot to the Linux kernel, all
under the GPL. In my opinion it is entirely fair that those companies
can expect that their efforts, which are new and Linux specific
functionality/APIs, will be used in compliance with the license and in a
level playing field. By "level playing field" I mean that their
competitors ought to play by the same rules, e.g. comply with the letter
and spirit of the GPL. These companies make a sacrifice by giving their
code up under the GPL rather than charging for it as proprietary code, and
they do so gladly (and for their internal reasons). Demanding that these
companies then also allow that same code to be used by other companies
who do not give anything back to Linux... is just really bad for Linux
both in the short and the long term.

> Dimwits.

good thing you're using a fake gmail account instead of your own name,
it means it's just a very obvious troll attempt....



* Re: [PATCH 4/5] PCI-Express AER implemetation: AER core and aerdriver
@ 2006-07-12 17:21 Johnny Lever
  2006-07-12 17:32 ` Arjan van de Ven
  0 siblings, 1 reply; 29+ messages in thread
From: Johnny Lever @ 2006-07-12 17:21 UTC (permalink / raw)
  To: linux-kernel

>With Arjan's comments, I changed EXPORT_SYMBOL to EXPORT_SYMBOL_GPL.
Sorry for flooding your emailbox again. :)

I think we should have a "moron-proof" review system on LKML. Simplest
way to get started is to ignore the EXPORT_SYMBOL_GPL "missionaries"
trying to 'convert' code without giving proper thought to its
implications or without ascertaining the correctness of the
conversion.

Dimwits.

Johnny

