All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset
@ 2013-06-06 18:10 Betty Dall
  2013-06-06 18:10 ` [PATCH v3 1/6] PCI/AER: Don't parse HEST table for non-PCIe devices Betty Dall
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Betty Dall @ 2013-06-06 18:10 UTC (permalink / raw)
  To: rjw, bhelgaas, gong.chen, greg.pearson
  Cc: ying.huang, linux-acpi, linux-kernel, linux-pci, Betty Dall

This patch set fixes a bug on platforms that use firmware first AER.
Firmware can leave the root port in Secondary Bus Reset (SBR) and
communicate this to the OS through the "reset" bit in the flags field
of the HEST table and associated CPER records. Firmware wants to do this
so that the error is contained and the hardware is in a known state.

Without these patches, the root port stays in SBR and the device drivers
cannot recover. These patches recognize when the firmware first root port
is in SBR and bring the root port out of SBR so the devices under the root
port can recover.

The changes have been tested on systems with firmware first that set the
"reset" bit by injecting various hardware errors. The errors successfully
recover.

Changes since v1:
Fixed a typo in the comment of patch 2.
Removed incorrect setting of reset bit in patch 3.

Changes since v2: 
The v2 patch 1/3 was re-written by Bjorn Helgaas and is now patches 1/6
through 3/6.
The v2 patch 2/3 is now 5/6 and changed to directly use the AER_FATAL define
and introduced patch 4/6 to move the defines to a public header file.
The v2 patch 3/3 is now 6/6 and uses the same default reset link function for
both Downstream Ports and Root Ports.

Signed-off-by: Betty Dall <betty.dall@hp.com>
---
Betty Dall (6):
  PCI/AER: Don't parse HEST table for non-PCIe devices
  PCI/AER: Factor out HEST device type matching
  PCI/AER: Set dev->__aer_firmware_first only for matching devices
  PCI/ACPI: Move AER severity defines to aer.h
  ACPI/APEI: Force fatal AER severity when bus has been reset
  PCI/AER: Provide reset_link for firmware first root port
---
 drivers/acpi/apei/ghes.c           |   10 +++++++
 drivers/pci/pcie/aer/aerdrv.h      |    4 ---
 drivers/pci/pcie/aer/aerdrv_acpi.c |   47 ++++++++++++++++++-----------------
 drivers/pci/pcie/aer/aerdrv_core.c |   17 +++++++------
 include/linux/aer.h                |   16 +++++++----
 5 files changed, 53 insertions(+), 41 deletions(-)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/6] PCI/AER: Don't parse HEST table for non-PCIe devices
  2013-06-06 18:10 [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Betty Dall
@ 2013-06-06 18:10 ` Betty Dall
  2013-06-06 18:10 ` [PATCH v3 2/6] PCI/AER: Factor out HEST device type matching Betty Dall
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Betty Dall @ 2013-06-06 18:10 UTC (permalink / raw)
  To: rjw, bhelgaas, gong.chen, greg.pearson
  Cc: ying.huang, linux-acpi, linux-kernel, linux-pci, Betty Dall

AER is a PCIe-only capability, so there's no point in trying to match
a HEST PCIe structure with a non-PCIe device.

Previously, a HEST global AER bridge entry (type 8) could incorrectly
match *any* bridge, even a legacy PCI-PCI bridge, and a non-global
HEST entry could match a legacy PCI device.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Betty Dall <betty.dall@hp.com>
---

 drivers/pci/pcie/aer/aerdrv_acpi.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)


diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c
index 5194a7d..4f798ab 100644
--- a/drivers/pci/pcie/aer/aerdrv_acpi.c
+++ b/drivers/pci/pcie/aer/aerdrv_acpi.c
@@ -59,8 +59,7 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data)
 
 	p = (struct acpi_hest_aer_common *)(hest_hdr + 1);
 	if (p->flags & ACPI_HEST_GLOBAL) {
-		if ((pci_is_pcie(info->pci_dev) &&
-		     pci_pcie_type(info->pci_dev) == pcie_type) || bridge)
+		if ((pci_pcie_type(info->pci_dev) == pcie_type) || bridge)
 			ff = !!(p->flags & ACPI_HEST_FIRMWARE_FIRST);
 	} else
 		if (hest_match_pci(p, info->pci_dev))
@@ -89,6 +88,9 @@ static void aer_set_firmware_first(struct pci_dev *pci_dev)
 
 int pcie_aer_get_firmware_first(struct pci_dev *dev)
 {
+	if (!pci_is_pcie(dev))
+		return 0;
+
 	if (!dev->__aer_firmware_first_valid)
 		aer_set_firmware_first(dev);
 	return dev->__aer_firmware_first;

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/6] PCI/AER: Factor out HEST device type matching
  2013-06-06 18:10 [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Betty Dall
  2013-06-06 18:10 ` [PATCH v3 1/6] PCI/AER: Don't parse HEST table for non-PCIe devices Betty Dall
@ 2013-06-06 18:10 ` Betty Dall
  2013-06-06 18:10 ` [PATCH v3 3/6] PCI/AER: Set dev->__aer_firmware_first only for matching devices Betty Dall
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Betty Dall @ 2013-06-06 18:10 UTC (permalink / raw)
  To: rjw, bhelgaas, gong.chen, greg.pearson
  Cc: ying.huang, linux-acpi, linux-kernel, linux-pci, Betty Dall

This factors out the matching of HEST structure type and PCIe device type
to improve readability.  No functional change.
    
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Betty Dall <betty.dall@hp.com>
---

 drivers/pci/pcie/aer/aerdrv_acpi.c |   35 +++++++++++++++++------------------
 1 files changed, 17 insertions(+), 18 deletions(-)


diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c
index 4f798ab..e7ea573 100644
--- a/drivers/pci/pcie/aer/aerdrv_acpi.c
+++ b/drivers/pci/pcie/aer/aerdrv_acpi.c
@@ -29,6 +29,22 @@ static inline int hest_match_pci(struct acpi_hest_aer_common *p,
 		 p->function == PCI_FUNC(pci->devfn));
 }
 
+static inline bool hest_match_type(struct acpi_hest_header *hest_hdr,
+				struct pci_dev *dev)
+{
+	u16 hest_type = hest_hdr->type;
+	u8 pcie_type = pci_pcie_type(dev);
+
+	if ((hest_type == ACPI_HEST_TYPE_AER_ROOT_PORT &&
+		pcie_type == PCI_EXP_TYPE_ROOT_PORT) ||
+	    (hest_type == ACPI_HEST_TYPE_AER_ENDPOINT &&
+		pcie_type == PCI_EXP_TYPE_ENDPOINT) ||
+	    (hest_type == ACPI_HEST_TYPE_AER_BRIDGE &&
+		(dev->class >> 16) == PCI_BASE_CLASS_BRIDGE))
+		return true;
+	return false;
+}
+
 struct aer_hest_parse_info {
 	struct pci_dev *pci_dev;
 	int firmware_first;
@@ -38,28 +54,11 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data)
 {
 	struct aer_hest_parse_info *info = data;
 	struct acpi_hest_aer_common *p;
-	u8 pcie_type = 0;
-	u8 bridge = 0;
 	int ff = 0;
 
-	switch (hest_hdr->type) {
-	case ACPI_HEST_TYPE_AER_ROOT_PORT:
-		pcie_type = PCI_EXP_TYPE_ROOT_PORT;
-		break;
-	case ACPI_HEST_TYPE_AER_ENDPOINT:
-		pcie_type = PCI_EXP_TYPE_ENDPOINT;
-		break;
-	case ACPI_HEST_TYPE_AER_BRIDGE:
-		if ((info->pci_dev->class >> 16) == PCI_BASE_CLASS_BRIDGE)
-			bridge = 1;
-		break;
-	default:
-		return 0;
-	}
-
 	p = (struct acpi_hest_aer_common *)(hest_hdr + 1);
 	if (p->flags & ACPI_HEST_GLOBAL) {
-		if ((pci_pcie_type(info->pci_dev) == pcie_type) || bridge)
+		if (hest_match_type(hest_hdr, info->pci_dev))
 			ff = !!(p->flags & ACPI_HEST_FIRMWARE_FIRST);
 	} else
 		if (hest_match_pci(p, info->pci_dev))

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 3/6] PCI/AER: Set dev->__aer_firmware_first only for matching devices
  2013-06-06 18:10 [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Betty Dall
  2013-06-06 18:10 ` [PATCH v3 1/6] PCI/AER: Don't parse HEST table for non-PCIe devices Betty Dall
  2013-06-06 18:10 ` [PATCH v3 2/6] PCI/AER: Factor out HEST device type matching Betty Dall
@ 2013-06-06 18:10 ` Betty Dall
  2013-06-06 18:10 ` [PATCH v2 4/6] PCI/ACPI: Move AER severity defines to aer.h Betty Dall
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Betty Dall @ 2013-06-06 18:10 UTC (permalink / raw)
  To: rjw, bhelgaas, gong.chen, greg.pearson
  Cc: ying.huang, linux-acpi, linux-kernel, linux-pci, Betty Dall

Previously, we always updated info->firmware_first, even for HEST entries
that didn't match the device.  Therefore, if the last HEST descriptor was
a PCIe structure that didn't match the device, we always cleared
dev->__aer_firmware_first.
    
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Betty Dall <betty.dall@hp.com>
---

 drivers/pci/pcie/aer/aerdrv_acpi.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)


diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c
index e7ea573..cf611ab 100644
--- a/drivers/pci/pcie/aer/aerdrv_acpi.c
+++ b/drivers/pci/pcie/aer/aerdrv_acpi.c
@@ -54,16 +54,16 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data)
 {
 	struct aer_hest_parse_info *info = data;
 	struct acpi_hest_aer_common *p;
-	int ff = 0;
+	int ff;
 
 	p = (struct acpi_hest_aer_common *)(hest_hdr + 1);
+	ff = !!(p->flags & ACPI_HEST_FIRMWARE_FIRST);
 	if (p->flags & ACPI_HEST_GLOBAL) {
 		if (hest_match_type(hest_hdr, info->pci_dev))
-			ff = !!(p->flags & ACPI_HEST_FIRMWARE_FIRST);
+			info->firmware_first = ff;
 	} else
 		if (hest_match_pci(p, info->pci_dev))
-			ff = !!(p->flags & ACPI_HEST_FIRMWARE_FIRST);
-	info->firmware_first = ff;
+			info->firmware_first = ff;
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 4/6] PCI/ACPI: Move AER severity defines to aer.h
  2013-06-06 18:10 [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Betty Dall
                   ` (2 preceding siblings ...)
  2013-06-06 18:10 ` [PATCH v3 3/6] PCI/AER: Set dev->__aer_firmware_first only for matching devices Betty Dall
@ 2013-06-06 18:10 ` Betty Dall
  2013-06-06 18:23   ` Joe Perches
  2013-06-06 18:10 ` [PATCH v3 5/6] ACPI/APEI: Force fatal AER severity when bus has been reset Betty Dall
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Betty Dall @ 2013-06-06 18:10 UTC (permalink / raw)
  To: rjw, bhelgaas, gong.chen, greg.pearson
  Cc: ying.huang, linux-acpi, linux-kernel, linux-pci, Betty Dall

The function aer_recover_queue() is a public interface and the
severity argument uses #defines that are in the private header
pci/pcie/aer/aerdrv.h.

This patch moves the #defines from pci/pcie/aer/aerdrv.h to 
include/linux/aer.h.

Signed-off-by: Betty Dall <betty.dall@hp.com>
---

 drivers/pci/pcie/aer/aerdrv.h |    4 ----
 include/linux/aer.h           |   16 ++++++++++------
 2 files changed, 10 insertions(+), 10 deletions(-)


diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
index d12c77c..90ea3e8 100644
--- a/drivers/pci/pcie/aer/aerdrv.h
+++ b/drivers/pci/pcie/aer/aerdrv.h
@@ -13,10 +13,6 @@
 #include <linux/aer.h>
 #include <linux/interrupt.h>
 
-#define AER_NONFATAL			0
-#define AER_FATAL			1
-#define AER_CORRECTABLE			2
-
 #define SYSTEM_ERROR_INTR_ON_MESG_MASK	(PCI_EXP_RTCTL_SECEE|	\
 					PCI_EXP_RTCTL_SENFEE|	\
 					PCI_EXP_RTCTL_SEFEE)
diff --git a/include/linux/aer.h b/include/linux/aer.h
index ec10e1b..0f6a9e2 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -7,6 +7,10 @@
 #ifndef _AER_H_
 #define _AER_H_
 
+#define AER_NONFATAL			0
+#define AER_FATAL			1
+#define AER_CORRECTABLE			2
+
 struct aer_header_log_regs {
 	unsigned int dw0;
 	unsigned int dw1;
@@ -31,9 +35,9 @@ struct aer_capability_regs {
 
 #if defined(CONFIG_PCIEAER)
 /* pci-e port driver needs this function to enable aer */
-extern int pci_enable_pcie_error_reporting(struct pci_dev *dev);
-extern int pci_disable_pcie_error_reporting(struct pci_dev *dev);
-extern int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+int pci_enable_pcie_error_reporting(struct pci_dev *dev);
+int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
 #else
 static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
 {
@@ -49,10 +53,10 @@ static inline int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 }
 #endif
 
-extern void cper_print_aer(const char *prefix, struct pci_dev *dev,
+void cper_print_aer(const char *prefix, struct pci_dev *dev,
 			   int cper_severity, struct aer_capability_regs *aer);
-extern int cper_severity_to_aer(int cper_severity);
-extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
+int cper_severity_to_aer(int cper_severity);
+void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
 			      int severity);
 #endif //_AER_H_
 

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 5/6] ACPI/APEI: Force fatal AER severity when bus has been reset
  2013-06-06 18:10 [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Betty Dall
                   ` (3 preceding siblings ...)
  2013-06-06 18:10 ` [PATCH v2 4/6] PCI/ACPI: Move AER severity defines to aer.h Betty Dall
@ 2013-06-06 18:10 ` Betty Dall
  2013-06-06 18:10 ` [PATCH v3 6/6] PCI/AER: Provide reset_link for firmware first root port Betty Dall
  2013-06-06 21:04 ` [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Bjorn Helgaas
  6 siblings, 0 replies; 11+ messages in thread
From: Betty Dall @ 2013-06-06 18:10 UTC (permalink / raw)
  To: rjw, bhelgaas, gong.chen, greg.pearson
  Cc: ying.huang, linux-acpi, linux-kernel, linux-pci, Betty Dall

The CPER error record has a reset bit that indicates that the platform
has reset the bus. The reset bit can be set for any severity error
including recoverable.  From the AER code path's perspective,
any error is fatal if the bus has been reset. This patch upgrades the
severity of the AER recovery to AER_FATAL whenever the CPER error record
indicates that the bus has been reset.

Changes since v1:
Fixed a typo in comment.

Changes since v2:
Set the aer_severity to AER_FATAL rather than using cper_severity_to_aer().
Simplified the comment.

Signed-off-by: Betty Dall <betty.dall@hp.com>
---

 drivers/acpi/apei/ghes.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)


diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d668a8a..ab31551 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -449,9 +449,19 @@ static void ghes_do_proc(struct ghes *ghes,
 			    pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
 				unsigned int devfn;
 				int aer_severity;
+
 				devfn = PCI_DEVFN(pcie_err->device_id.device,
 						  pcie_err->device_id.function);
 				aer_severity = cper_severity_to_aer(sev);
+
+				/*
+				 * If firmware reset the component to contain
+				 * the error, we must reinitialize it before
+				 * use, so treat it as a fatal AER error.
+				 */
+				if (gdata->flags & CPER_SEC_RESET)
+					aer_severity = AER_FATAL;
+
 				aer_recover_queue(pcie_err->device_id.segment,
 						  pcie_err->device_id.bus,
 						  devfn, aer_severity);

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 6/6] PCI/AER: Provide reset_link for firmware first root port
  2013-06-06 18:10 [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Betty Dall
                   ` (4 preceding siblings ...)
  2013-06-06 18:10 ` [PATCH v3 5/6] ACPI/APEI: Force fatal AER severity when bus has been reset Betty Dall
@ 2013-06-06 18:10 ` Betty Dall
  2013-06-06 21:04 ` [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Bjorn Helgaas
  6 siblings, 0 replies; 11+ messages in thread
From: Betty Dall @ 2013-06-06 18:10 UTC (permalink / raw)
  To: rjw, bhelgaas, gong.chen, greg.pearson
  Cc: ying.huang, linux-acpi, linux-kernel, linux-pci, Betty Dall

The firmware first path does not install the aerdrv root port
service driver, so the firmware first root port does not have
a reset_link callback. When a firmware first root port does not have
a reset_link callback, use a new default reset_link similar to what
we already do for the default_downstream_reset_link(). The firmware
first default reset_link brings the root port out of SBR if firmware
left it in SBR.

Changes since v1:
Removed incorrect setting of p2p_ctrl after the read.

Changes since v2:
Treat the default case for a Root Port the same as the existing
default case for a Downstream Link.

Signed-off-by: Betty Dall <betty.dall@hp.com>
---

 drivers/pci/pcie/aer/aerdrv_core.c |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)


diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 8ec8b4f..cb29c04 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -400,16 +400,16 @@ void aer_do_secondary_bus_reset(struct pci_dev *dev)
 }
 
 /**
- * default_downstream_reset_link - default reset function for Downstream Port
- * @dev: pointer to downstream port's pci_dev data structure
+ * default_reset_link - default reset function
+ * @dev: pointer to pci_dev data structure
  *
- * Invoked when performing link reset at Downstream Port w/ no aer driver.
+ * Invoked when performing link reset on a Downstream Port or a
+ * Root Port with no aer driver.
  */
-static pci_ers_result_t default_downstream_reset_link(struct pci_dev *dev)
+static pci_ers_result_t default_reset_link(struct pci_dev *dev)
 {
 	aer_do_secondary_bus_reset(dev);
-	dev_printk(KERN_DEBUG, &dev->dev,
-		"Downstream Port link has been reset\n");
+	dev_dbg(&dev->dev, "Link has been reset\n");
 	return PCI_ERS_RESULT_RECOVERED;
 }
 
@@ -458,8 +458,9 @@ static pci_ers_result_t reset_link(struct pci_dev *dev)
 
 	if (driver && driver->reset_link) {
 		status = driver->reset_link(udev);
-	} else if (pci_pcie_type(udev) == PCI_EXP_TYPE_DOWNSTREAM) {
-		status = default_downstream_reset_link(udev);
+	} else if (pci_pcie_type(udev) == PCI_EXP_TYPE_DOWNSTREAM ||
+		pci_pcie_type(udev) == PCI_EXP_TYPE_ROOT_PORT) {
+		status = default_reset_link(udev);
 	} else {
 		dev_printk(KERN_DEBUG, &dev->dev,
 			"no link-reset support at upstream device %s\n",

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/6] PCI/ACPI: Move AER severity defines to aer.h
  2013-06-06 18:10 ` [PATCH v2 4/6] PCI/ACPI: Move AER severity defines to aer.h Betty Dall
@ 2013-06-06 18:23   ` Joe Perches
  2013-06-06 19:28     ` Bjorn Helgaas
  0 siblings, 1 reply; 11+ messages in thread
From: Joe Perches @ 2013-06-06 18:23 UTC (permalink / raw)
  To: Betty Dall
  Cc: rjw, bhelgaas, gong.chen, greg.pearson, ying.huang, linux-acpi,
	linux-kernel, linux-pci

On Thu, 2013-06-06 at 12:10 -0600, Betty Dall wrote:
> The function aer_recover_queue() is a public interface and the
> severity argument uses #defines that are in the private header
> pci/pcie/aer/aerdrv.h.
> 
> This patch moves the #defines from pci/pcie/aer/aerdrv.h to 
> include/linux/aer.h.
[]
> diff --git a/include/linux/aer.h b/include/linux/aer.h
[]
> -extern void cper_print_aer(const char *prefix, struct pci_dev *dev,
> +void cper_print_aer(const char *prefix, struct pci_dev *dev,
>  			   int cper_severity, struct aer_capability_regs *aer);

Can you please also realign the arguments on subsequent
lines to the open parenthesis of the first line and reflow
then to 80 cols when appropriate?

> -extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
> +void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
>  			      int severity);



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/6] PCI/ACPI: Move AER severity defines to aer.h
  2013-06-06 18:23   ` Joe Perches
@ 2013-06-06 19:28     ` Bjorn Helgaas
  2013-06-06 20:19       ` Rafael J. Wysocki
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2013-06-06 19:28 UTC (permalink / raw)
  To: Joe Perches
  Cc: Betty Dall, Rafael J. Wysocki, Chen Gong, Pearson, Greg,
	Huang Ying, linux-acpi, linux-kernel, linux-pci

On Thu, Jun 6, 2013 at 12:23 PM, Joe Perches <joe@perches.com> wrote:
> On Thu, 2013-06-06 at 12:10 -0600, Betty Dall wrote:
>> The function aer_recover_queue() is a public interface and the
>> severity argument uses #defines that are in the private header
>> pci/pcie/aer/aerdrv.h.
>>
>> This patch moves the #defines from pci/pcie/aer/aerdrv.h to
>> include/linux/aer.h.
> []
>> diff --git a/include/linux/aer.h b/include/linux/aer.h
> []
>> -extern void cper_print_aer(const char *prefix, struct pci_dev *dev,
>> +void cper_print_aer(const char *prefix, struct pci_dev *dev,
>>                          int cper_severity, struct aer_capability_regs *aer);
>
> Can you please also realign the arguments on subsequent
> lines to the open parenthesis of the first line and reflow
> then to 80 cols when appropriate?

I can do this when I apply them, so don't bother reposting unless you
have more substantive changes to make.

>> -extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
>> +void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
>>                             int severity);
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/6] PCI/ACPI: Move AER severity defines to aer.h
  2013-06-06 19:28     ` Bjorn Helgaas
@ 2013-06-06 20:19       ` Rafael J. Wysocki
  0 siblings, 0 replies; 11+ messages in thread
From: Rafael J. Wysocki @ 2013-06-06 20:19 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Joe Perches, Betty Dall, Chen Gong, Pearson, Greg, Huang Ying,
	linux-acpi, linux-kernel, linux-pci

On Thursday, June 06, 2013 01:28:28 PM Bjorn Helgaas wrote:
> On Thu, Jun 6, 2013 at 12:23 PM, Joe Perches <joe@perches.com> wrote:
> > On Thu, 2013-06-06 at 12:10 -0600, Betty Dall wrote:
> >> The function aer_recover_queue() is a public interface and the
> >> severity argument uses #defines that are in the private header
> >> pci/pcie/aer/aerdrv.h.
> >>
> >> This patch moves the #defines from pci/pcie/aer/aerdrv.h to
> >> include/linux/aer.h.
> > []
> >> diff --git a/include/linux/aer.h b/include/linux/aer.h
> > []
> >> -extern void cper_print_aer(const char *prefix, struct pci_dev *dev,
> >> +void cper_print_aer(const char *prefix, struct pci_dev *dev,
> >>                          int cper_severity, struct aer_capability_regs *aer);
> >
> > Can you please also realign the arguments on subsequent
> > lines to the open parenthesis of the first line and reflow
> > then to 80 cols when appropriate?
> 
> I can do this when I apply them, so don't bother reposting unless you
> have more substantive changes to make.

For the record, I have no objections against this patchset.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset
  2013-06-06 18:10 [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Betty Dall
                   ` (5 preceding siblings ...)
  2013-06-06 18:10 ` [PATCH v3 6/6] PCI/AER: Provide reset_link for firmware first root port Betty Dall
@ 2013-06-06 21:04 ` Bjorn Helgaas
  6 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2013-06-06 21:04 UTC (permalink / raw)
  To: Betty Dall
  Cc: Rafael J. Wysocki, Chen Gong, Pearson, Greg, Huang Ying,
	linux-acpi, linux-kernel, linux-pci

On Thu, Jun 6, 2013 at 12:10 PM, Betty Dall <betty.dall@hp.com> wrote:
> This patch set fixes a bug on platforms that use firmware first AER.
> Firmware can leave the root port in Secondary Bus Reset (SBR) and
> communicate this to the OS through the "reset" bit in the flags field
> of the HEST table and associated CPER records. Firmware wants to do this
> so that the error is contained and the hardware is in a known state.
>
> Without these patches, the root port stays in SBR and the device drivers
> cannot recover. These patches recognize when the firmware first root port
> is in SBR and bring the root port out of SBR so the devices under the root
> port can recover.
>
> The changes have been tested on systems with firmware first that set the
> "reset" bit by injecting various hardware errors. The errors successfully
> recover.
>
> Changes since v1:
> Fixed a typo in the comment of patch 2.
> Removed incorrect setting of reset bit in patch 3.
>
> Changes since v2:
> The v2 patch 1/3 was re-written by Bjorn Helgaas and is now patches 1/6
> through 3/6.
> The v2 patch 2/3 is now 5/6 and changed to directly use the AER_FATAL define
> and introduced patch 4/6 to move the defines to a public header file.
> The v2 patch 3/3 is now 6/6 and uses the same default reset link function for
> both Downstream Ports and Root Ports.
>
> Signed-off-by: Betty Dall <betty.dall@hp.com>
> ---
> Betty Dall (6):
>   PCI/AER: Don't parse HEST table for non-PCIe devices
>   PCI/AER: Factor out HEST device type matching
>   PCI/AER: Set dev->__aer_firmware_first only for matching devices
>   PCI/ACPI: Move AER severity defines to aer.h
>   ACPI/APEI: Force fatal AER severity when bus has been reset
>   PCI/AER: Provide reset_link for firmware first root port
> ---
>  drivers/acpi/apei/ghes.c           |   10 +++++++
>  drivers/pci/pcie/aer/aerdrv.h      |    4 ---
>  drivers/pci/pcie/aer/aerdrv_acpi.c |   47 ++++++++++++++++++-----------------
>  drivers/pci/pcie/aer/aerdrv_core.c |   17 +++++++------
>  include/linux/aer.h                |   16 +++++++----
>  5 files changed, 53 insertions(+), 41 deletions(-)

I put these on http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/betty-aer-v3

I'll merge them into -next for v3.11 soon.  Thanks, Betty!

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2013-06-06 21:04 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-06 18:10 [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Betty Dall
2013-06-06 18:10 ` [PATCH v3 1/6] PCI/AER: Don't parse HEST table for non-PCIe devices Betty Dall
2013-06-06 18:10 ` [PATCH v3 2/6] PCI/AER: Factor out HEST device type matching Betty Dall
2013-06-06 18:10 ` [PATCH v3 3/6] PCI/AER: Set dev->__aer_firmware_first only for matching devices Betty Dall
2013-06-06 18:10 ` [PATCH v2 4/6] PCI/ACPI: Move AER severity defines to aer.h Betty Dall
2013-06-06 18:23   ` Joe Perches
2013-06-06 19:28     ` Bjorn Helgaas
2013-06-06 20:19       ` Rafael J. Wysocki
2013-06-06 18:10 ` [PATCH v3 5/6] ACPI/APEI: Force fatal AER severity when bus has been reset Betty Dall
2013-06-06 18:10 ` [PATCH v3 6/6] PCI/AER: Provide reset_link for firmware first root port Betty Dall
2013-06-06 21:04 ` [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset Bjorn Helgaas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.