* [PATCH 0/6] powerpc/eeh: Refactor config accessors
@ 2014-10-01  7:07 Gavin Shan
  2014-10-01  7:07 ` [PATCH 1/6] powerpc/eeh: Fix condition for isolated state Gavin Shan
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Gavin Shan @ 2014-10-01  7:07 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

When EEH errors are detected on some particular PCI adapters, one of
which is shown as follows, the PCI config space of those PCI adapters
(PE) should be blocked. Otherwise, we will run into a fenced PHB when
collecting EEH logs (part of recovery). The patchset fixes this issue.
Also, EEH_PE_RESET is renamed to EEH_PE_CFG_BLOCKED to reflect its
usage. It's a bad idea to allow PCI config access while the
EEH_PE_CFG_BLOCKED flag is set for the corresponding PE because it can
trigger a recursive EEH error. The patchset also blocks config requests
from the EEH backend if necessary.

Gavin Shan (6):
  powerpc/eeh: Fix condition for isolated state
  powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED
  powerpc/powernv: Drop config requests in EEH accessors
  powerpc/pseries: Drop config requests in EEH accessors
  powerpc/eeh: Block PCI config access upon frozen PE
  powerpc/eeh: Don't collect logs on PE with blocked config space

 arch/powerpc/include/asm/eeh.h               |  3 +-
 arch/powerpc/kernel/eeh.c                    | 19 +++++++---
 arch/powerpc/kernel/eeh_driver.c             | 12 +++---
 arch/powerpc/kernel/eeh_pe.c                 | 10 ++++-
 arch/powerpc/kernel/rtas_pci.c               | 30 ++++++---------
 arch/powerpc/platforms/powernv/eeh-ioda.c    |  2 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c | 56 +++++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci.c         |  2 +-
 8 files changed, 97 insertions(+), 37 deletions(-)

-- 
1.8.3.2


* [PATCH 1/6] powerpc/eeh: Fix condition for isolated state
  2014-10-01  7:07 [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
@ 2014-10-01  7:07 ` Gavin Shan
  2014-10-01  7:07 ` [PATCH 2/6] powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED Gavin Shan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2014-10-01  7:07 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

Function eeh_pe_state_mark() can take a combination of multiple EEH PE
state flags as its argument. The patch fixes the condition used to
check whether EEH_PE_ISOLATED is included.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_pe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index eef08f0..8c4429b 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -525,7 +525,7 @@ static void *__eeh_pe_state_mark(void *data, void *flag)
 	pe->state |= state;
 
 	/* Offline PCI devices if applicable */
-	if (state != EEH_PE_ISOLATED)
+	if (!(state & EEH_PE_ISOLATED))
 		return NULL;
 
 	eeh_pe_for_each_dev(pe, edev, tmp) {
-- 
1.8.3.2
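
For illustration, the difference between the old and new conditions can
be reproduced in a small userspace sketch. The flag values are the ones
quoted from asm/eeh.h in this series; the two helper functions are
hypothetical and are not part of the patch.

```c
#include <stdbool.h>

/* Flag values as quoted from arch/powerpc/include/asm/eeh.h. */
#define EEH_PE_ISOLATED		(1 << 0)
#define EEH_PE_RECOVERING	(1 << 1)

/* Hypothetical helper: the old test, true only for an exact match. */
static bool isolated_old(int state)
{
	return state == EEH_PE_ISOLATED;
}

/* Hypothetical helper: the new test, true whenever the bit is present. */
static bool isolated_new(int state)
{
	return (state & EEH_PE_ISOLATED) != 0;
}
```

With an argument such as EEH_PE_ISOLATED | EEH_PE_RECOVERING, the old
test fails and the devices would not be taken offline, while the new
test still sees the isolated bit.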


* [PATCH 2/6] powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED
  2014-10-01  7:07 [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
  2014-10-01  7:07 ` [PATCH 1/6] powerpc/eeh: Fix condition for isolated state Gavin Shan
@ 2014-10-01  7:07 ` Gavin Shan
  2014-10-01  7:07 ` [PATCH 3/6] powerpc/powernv: Drop config requests in EEH accessors Gavin Shan
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2014-10-01  7:07 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

The flag EEH_PE_RESET indicates that the PE's config space is blocked
during reset. We may also need to block the PE's config space at times
other than reset, so it's reasonable to rename the flag to
EEH_PE_CFG_BLOCKED to reflect its usage.

There are no substantial code or logic changes in this patch.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h            |  2 +-
 arch/powerpc/kernel/eeh.c                 | 12 ++++++------
 arch/powerpc/kernel/eeh_driver.c          | 12 ++++++------
 arch/powerpc/kernel/rtas_pci.c            |  4 ++--
 arch/powerpc/platforms/powernv/eeh-ioda.c |  2 +-
 arch/powerpc/platforms/powernv/pci.c      |  2 +-
 6 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index ee54f01..e925a8e 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -71,7 +71,7 @@ struct device_node;
 
 #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
 #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
-#define EEH_PE_RESET		(1 << 2)	/* PE reset in progress	*/
+#define EEH_PE_CFG_BLOCKED	(1 << 2)	/* Block config access	*/
 
 #define EEH_PE_KEEP		(1 << 8)	/* Keep PE on hotplug	*/
 
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 519622f..c9d274e 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -673,18 +673,18 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state stat
 	switch (state) {
 	case pcie_deassert_reset:
 		eeh_ops->reset(pe, EEH_RESET_DEACTIVATE);
-		eeh_pe_state_clear(pe, EEH_PE_RESET);
+		eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 		break;
 	case pcie_hot_reset:
-		eeh_pe_state_mark(pe, EEH_PE_RESET);
+		eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
 		eeh_ops->reset(pe, EEH_RESET_HOT);
 		break;
 	case pcie_warm_reset:
-		eeh_pe_state_mark(pe, EEH_PE_RESET);
+		eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
 		eeh_ops->reset(pe, EEH_RESET_FUNDAMENTAL);
 		break;
 	default:
-		eeh_pe_state_clear(pe, EEH_PE_RESET);
+		eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 		return -EINVAL;
 	};
 
@@ -1523,7 +1523,7 @@ int eeh_pe_reset(struct eeh_pe *pe, int option)
 	switch (option) {
 	case EEH_RESET_DEACTIVATE:
 		ret = eeh_ops->reset(pe, option);
-		eeh_pe_state_clear(pe, EEH_PE_RESET);
+		eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 		if (ret)
 			break;
 
@@ -1538,7 +1538,7 @@ int eeh_pe_reset(struct eeh_pe *pe, int option)
 		 */
 		eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
 
-		eeh_pe_state_mark(pe, EEH_PE_RESET);
+		eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
 		ret = eeh_ops->reset(pe, option);
 		break;
 	default:
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 3fd514f..6535936 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -528,13 +528,13 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
 	eeh_pe_dev_traverse(pe, eeh_report_error, &result);
 
 	/* Issue reset */
-	eeh_pe_state_mark(pe, EEH_PE_RESET);
+	eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
 	ret = eeh_reset_pe(pe);
 	if (ret) {
-		eeh_pe_state_clear(pe, EEH_PE_RECOVERING | EEH_PE_RESET);
+		eeh_pe_state_clear(pe, EEH_PE_RECOVERING | EEH_PE_CFG_BLOCKED);
 		return ret;
 	}
-	eeh_pe_state_clear(pe, EEH_PE_RESET);
+	eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 
 	/* Unfreeze the PE */
 	ret = eeh_clear_pe_frozen_state(pe, true);
@@ -601,10 +601,10 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	 * config accesses. So we prefer to block them. However, controlled
 	 * PCI config accesses initiated from EEH itself are allowed.
 	 */
-	eeh_pe_state_mark(pe, EEH_PE_RESET);
+	eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
 	rc = eeh_reset_pe(pe);
 	if (rc) {
-		eeh_pe_state_clear(pe, EEH_PE_RESET);
+		eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 		return rc;
 	}
 
@@ -613,7 +613,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	/* Restore PE */
 	eeh_ops->configure_bridge(pe);
 	eeh_pe_restore_bars(pe);
-	eeh_pe_state_clear(pe, EEH_PE_RESET);
+	eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 
 	/* Clear frozen state */
 	rc = eeh_clear_pe_frozen_state(pe, false);
diff --git a/arch/powerpc/kernel/rtas_pci.c b/arch/powerpc/kernel/rtas_pci.c
index c168337..ce7c8b6 100644
--- a/arch/powerpc/kernel/rtas_pci.c
+++ b/arch/powerpc/kernel/rtas_pci.c
@@ -111,7 +111,7 @@ static int rtas_pci_read_config(struct pci_bus *bus,
 		return PCIBIOS_DEVICE_NOT_FOUND;
 #ifdef CONFIG_EEH
 	edev = of_node_to_eeh_dev(dn);
-	if (edev && edev->pe && edev->pe->state & EEH_PE_RESET)
+	if (edev && edev->pe && edev->pe->state & EEH_PE_CFG_BLOCKED)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 #endif
 
@@ -175,7 +175,7 @@ static int rtas_pci_write_config(struct pci_bus *bus,
 		return PCIBIOS_DEVICE_NOT_FOUND;
 #ifdef CONFIG_EEH
 	edev = of_node_to_eeh_dev(dn);
-	if (edev && edev->pe && (edev->pe->state & EEH_PE_RESET))
+	if (edev && edev->pe && (edev->pe->state & EEH_PE_CFG_BLOCKED))
 		return PCIBIOS_DEVICE_NOT_FOUND;
 #endif
 	ret = rtas_write_config(pdn, where, size, val);
diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
index 61ccb1e..5cd4226 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
@@ -373,7 +373,7 @@ static int ioda_eeh_get_pe_state(struct eeh_pe *pe)
 	 * moving forward, we have to return operational
 	 * state during PE reset.
 	 */
-	if (pe->state & EEH_PE_RESET) {
+	if (pe->state & EEH_PE_CFG_BLOCKED) {
 		result = (EEH_STATE_MMIO_ACTIVE  |
 			  EEH_STATE_DMA_ACTIVE   |
 			  EEH_STATE_MMIO_ENABLED |
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index e9f509b..b1b7ac2 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -513,7 +513,7 @@ static bool pnv_pci_cfg_check(struct pci_controller *hose,
 	edev = of_node_to_eeh_dev(dn);
 	if (edev) {
 		if (edev->pe &&
-		    (edev->pe->state & EEH_PE_RESET))
+		    (edev->pe->state & EEH_PE_CFG_BLOCKED))
 			return false;
 
 		if (edev->mode & EEH_DEV_REMOVED)
-- 
1.8.3.2


* [PATCH 3/6] powerpc/powernv: Drop config requests in EEH accessors
  2014-10-01  7:07 [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
  2014-10-01  7:07 ` [PATCH 1/6] powerpc/eeh: Fix condition for isolated state Gavin Shan
  2014-10-01  7:07 ` [PATCH 2/6] powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED Gavin Shan
@ 2014-10-01  7:07 ` Gavin Shan
  2014-10-01  7:07 ` [PATCH 4/6] powerpc/pseries: " Gavin Shan
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2014-10-01  7:07 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

It's a bad idea to access the PCI config registers of an adapter that
is being reset. Doing so leads to a recursive EEH error without
exception. The patch drops PCI config requests in the EEH accessors if
the PE has been marked to block PCI config requests, for example
during PE reset.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 37 ++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 3e89cbf..04e42f7 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -383,6 +383,39 @@ static int powernv_eeh_err_inject(struct eeh_pe *pe, int type, int func,
 	return ret;
 }
 
+static inline bool powernv_eeh_cfg_blocked(struct device_node *dn)
+{
+	struct eeh_dev *edev = of_node_to_eeh_dev(dn);
+
+	if (!edev || !edev->pe)
+		return false;
+
+	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
+		return true;
+
+	return false;
+}
+
+static int powernv_eeh_read_config(struct device_node *dn,
+				   int where, int size, u32 *val)
+{
+	if (powernv_eeh_cfg_blocked(dn)) {
+		*val = 0xFFFFFFFF;
+		return PCIBIOS_SET_FAILED;
+	}
+
+	return pnv_pci_cfg_read(dn, where, size, val);
+}
+
+static int powernv_eeh_write_config(struct device_node *dn,
+				    int where, int size, u32 val)
+{
+	if (powernv_eeh_cfg_blocked(dn))
+		return PCIBIOS_SET_FAILED;
+
+	return pnv_pci_cfg_write(dn, where, size, val);
+}
+
 /**
  * powernv_eeh_next_error - Retrieve next EEH error to handle
  * @pe: Affected PE
@@ -440,8 +473,8 @@ static struct eeh_ops powernv_eeh_ops = {
 	.get_log                = powernv_eeh_get_log,
 	.configure_bridge       = powernv_eeh_configure_bridge,
 	.err_inject		= powernv_eeh_err_inject,
-	.read_config            = pnv_pci_cfg_read,
-	.write_config           = pnv_pci_cfg_write,
+	.read_config            = powernv_eeh_read_config,
+	.write_config           = powernv_eeh_write_config,
 	.next_error		= powernv_eeh_next_error,
 	.restore_config		= powernv_eeh_restore_config
 };
-- 
1.8.3.2
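
The behaviour of the new accessors can be modelled outside the kernel.
The following is only a sketch under stated assumptions: the model_*
types and functions and the CFG_* status codes are hypothetical
stand-ins for struct eeh_dev, pnv_pci_cfg_{read,write}() and the
PCIBIOS_* return values. Only the EEH_PE_CFG_BLOCKED value is taken
from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EEH_PE_CFG_BLOCKED	(1 << 2)	/* value from asm/eeh.h */
#define CFG_OK			0		/* hypothetical status codes */
#define CFG_FAILED		(-1)

struct model_pe  { int state; };
struct model_dev { struct model_pe *pe; uint32_t reg; };

/* Mirrors powernv_eeh_cfg_blocked(): a device without a PE is never blocked. */
static bool model_cfg_blocked(const struct model_dev *dev)
{
	return dev->pe && (dev->pe->state & EEH_PE_CFG_BLOCKED);
}

/* Blocked reads return all ones and a failure code; the device is not touched. */
static int model_read_config(const struct model_dev *dev, uint32_t *val)
{
	assert(dev != NULL && val != NULL);
	if (model_cfg_blocked(dev)) {
		*val = 0xFFFFFFFF;
		return CFG_FAILED;
	}
	*val = dev->reg;
	return CFG_OK;
}

/* Blocked writes are dropped and the register keeps its old value. */
static int model_write_config(struct model_dev *dev, uint32_t val)
{
	assert(dev != NULL);
	if (model_cfg_blocked(dev))
		return CFG_FAILED;
	dev->reg = val;
	return CFG_OK;
}
```

The point of the wrapper is that the check sits in front of the real
accessor, so a request against a blocked PE never reaches the hardware.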


* [PATCH 4/6] powerpc/pseries: Drop config requests in EEH accessors
  2014-10-01  7:07 [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
                   ` (2 preceding siblings ...)
  2014-10-01  7:07 ` [PATCH 3/6] powerpc/powernv: Drop config requests in EEH accessors Gavin Shan
@ 2014-10-01  7:07 ` Gavin Shan
  2014-10-01  7:07 ` [PATCH 5/6] powerpc/eeh: Block PCI config access upon frozen PE Gavin Shan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2014-10-01  7:07 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

The pSeries EEH config accessors rely on rtas_{read, write}_config().
The check for whether the PE's config space is blocked should be moved
into those two functions so that config requests from the kernel,
userland, and the EEH core can all be dropped when necessary to avoid
recursive EEH errors.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/rtas_pci.c | 30 +++++++++++-------------------
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/rtas_pci.c b/arch/powerpc/kernel/rtas_pci.c
index ce7c8b6..7c55b86 100644
--- a/arch/powerpc/kernel/rtas_pci.c
+++ b/arch/powerpc/kernel/rtas_pci.c
@@ -66,6 +66,11 @@ int rtas_read_config(struct pci_dn *pdn, int where, int size, u32 *val)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 	if (!config_access_valid(pdn, where))
 		return PCIBIOS_BAD_REGISTER_NUMBER;
+#ifdef CONFIG_EEH
+	if (pdn->edev && pdn->edev->pe &&
+	    (pdn->edev->pe->state & EEH_PE_CFG_BLOCKED))
+		return PCIBIOS_SET_FAILED;
+#endif
 
 	addr = rtas_config_addr(pdn->busno, pdn->devfn, where);
 	buid = pdn->phb->buid;
@@ -90,9 +95,6 @@ static int rtas_pci_read_config(struct pci_bus *bus,
 	struct device_node *busdn, *dn;
 	struct pci_dn *pdn;
 	bool found = false;
-#ifdef CONFIG_EEH
-	struct eeh_dev *edev;
-#endif
 	int ret;
 
 	/* Search only direct children of the bus */
@@ -109,11 +111,6 @@ static int rtas_pci_read_config(struct pci_bus *bus,
 
 	if (!found)
 		return PCIBIOS_DEVICE_NOT_FOUND;
-#ifdef CONFIG_EEH
-	edev = of_node_to_eeh_dev(dn);
-	if (edev && edev->pe && edev->pe->state & EEH_PE_CFG_BLOCKED)
-		return PCIBIOS_DEVICE_NOT_FOUND;
-#endif
 
 	ret = rtas_read_config(pdn, where, size, val);
 	if (*val == EEH_IO_ERROR_VALUE(size) &&
@@ -132,6 +129,11 @@ int rtas_write_config(struct pci_dn *pdn, int where, int size, u32 val)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 	if (!config_access_valid(pdn, where))
 		return PCIBIOS_BAD_REGISTER_NUMBER;
+#ifdef CONFIG_EEH
+	if (pdn->edev && pdn->edev->pe &&
+	    (pdn->edev->pe->state & EEH_PE_CFG_BLOCKED))
+		return PCIBIOS_SET_FAILED;
+#endif
 
 	addr = rtas_config_addr(pdn->busno, pdn->devfn, where);
 	buid = pdn->phb->buid;
@@ -155,10 +157,6 @@ static int rtas_pci_write_config(struct pci_bus *bus,
 	struct device_node *busdn, *dn;
 	struct pci_dn *pdn;
 	bool found = false;
-#ifdef CONFIG_EEH
-	struct eeh_dev *edev;
-#endif
-	int ret;
 
 	/* Search only direct children of the bus */
 	busdn = pci_bus_to_OF_node(bus);
@@ -173,14 +171,8 @@ static int rtas_pci_write_config(struct pci_bus *bus,
 
 	if (!found)
 		return PCIBIOS_DEVICE_NOT_FOUND;
-#ifdef CONFIG_EEH
-	edev = of_node_to_eeh_dev(dn);
-	if (edev && edev->pe && (edev->pe->state & EEH_PE_CFG_BLOCKED))
-		return PCIBIOS_DEVICE_NOT_FOUND;
-#endif
-	ret = rtas_write_config(pdn, where, size, val);
 
-	return ret;
+	return rtas_write_config(pdn, where, size, val);
 }
 
 static struct pci_ops rtas_pci_ops = {
-- 
1.8.3.2


* [PATCH 5/6] powerpc/eeh: Block PCI config access upon frozen PE
  2014-10-01  7:07 [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
                   ` (3 preceding siblings ...)
  2014-10-01  7:07 ` [PATCH 4/6] powerpc/pseries: " Gavin Shan
@ 2014-10-01  7:07 ` Gavin Shan
  2014-10-01  7:07 ` [PATCH 6/6] powerpc/eeh: Don't collect logs on PE with blocked config space Gavin Shan
  2014-10-01  7:12 ` [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
  6 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2014-10-01  7:07 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

The problem was found when I tried to inject a PCI config error through
the PHB3 PAPR error injection registers into a Broadcom Austin 4-port
NIC adapter. The frozen PE was reported successfully and the EEH core
started to recover it. However, I ran into a fenced PHB when dumping
PCI config space as EEH logs. I was told that PCI config requests
should not be propagated to the adapter until PE reset is done
successfully. Otherwise, we would run out of PHB internal credits
and trigger PCT (PCIE Completion Timeout), which leads to the
fenced PHB.

The patch introduces another PE flag, EEH_PE_CFG_RESTRICTED, which
is set at PE initialization time if the PE includes specific PCI
devices that need PCI config access blocked until PE reset is done.
When the PE becomes frozen for the first time, EEH_PE_CFG_BLOCKED is
set if the PE has the EEH_PE_CFG_RESTRICTED flag. PCI config accesses
to the PE are then dropped by the platform PCI accessors until PE
reset is done successfully. The mechanism is shared by PEs owned by
the PowerNV platform and those owned by userland. It's not used on
the pSeries platform yet.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |  1 +
 arch/powerpc/kernel/eeh_pe.c                 |  8 ++++++++
 arch/powerpc/platforms/powernv/eeh-powernv.c | 19 +++++++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index e925a8e..9d7654c 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -74,6 +74,7 @@ struct device_node;
 #define EEH_PE_CFG_BLOCKED	(1 << 2)	/* Block config access	*/
 
 #define EEH_PE_KEEP		(1 << 8)	/* Keep PE on hotplug	*/
+#define EEH_PE_CFG_RESTRICTED	(1 << 9)	/* Block config on error */
 
 struct eeh_pe {
 	int type;			/* PE type: PHB/Bus/Device	*/
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 8c4429b..230ed5b 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -534,6 +534,10 @@ static void *__eeh_pe_state_mark(void *data, void *flag)
 			pdev->error_state = pci_channel_io_frozen;
 	}
 
+	/* Block PCI config access if required */
+	if (pe->state & EEH_PE_CFG_RESTRICTED)
+		pe->state |= EEH_PE_CFG_BLOCKED;
+
 	return NULL;
 }
 
@@ -611,6 +615,10 @@ static void *__eeh_pe_state_clear(void *data, void *flag)
 		pdev->error_state = pci_channel_io_normal;
 	}
 
+	/* Unblock PCI config access if required */
+	if (pe->state & EEH_PE_CFG_RESTRICTED)
+		pe->state &= ~EEH_PE_CFG_BLOCKED;
+
 	return NULL;
 }
 
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 04e42f7..443ce96 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -169,6 +169,25 @@ static int powernv_eeh_dev_probe(struct pci_dev *dev, void *flag)
 	}
 
 	/*
+	 * If the PE contains any one of following adapters, the
+	 * PCI config space can't be accessed when dumping EEH log.
+	 * Otherwise, we will run into fenced PHB caused by shortage
+	 * of outbound credits in the adapter. The PCI config access
+	 * should be blocked until PE reset. MMIO access is dropped
+	 * by hardware certainly. In order to drop PCI config requests,
+	 * one more flag (EEH_PE_CFG_RESTRICTED) is introduced, which
+	 * will be checked in the backend for PE state retrival. If
+	 * the PE becomes frozen for the first time and the flag has
+	 * been set for the PE, we will set EEH_PE_CFG_BLOCKED for
+	 * that PE to block its config space.
+	 *
+	 * Broadcom Austin 4-ports NICs (14e4:1657)
+	 */
+	if (dev->vendor == PCI_VENDOR_ID_BROADCOM &&
+	    dev->device == 0x1657)
+		edev->pe->state |= EEH_PE_CFG_RESTRICTED;
+
+	/*
 	 * Cache the PE primary bus, which can't be fetched when
 	 * full hotplug is in progress. In that case, all child
 	 * PCI devices of the PE are expected to be removed prior
-- 
1.8.3.2
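
The interaction between the two flags can be sketched as a simplified
userspace model. The flag values come from asm/eeh.h as changed in this
series; struct model_pe and the two model_* functions are hypothetical
and leave out the device walk that the real __eeh_pe_state_mark() and
__eeh_pe_state_clear() perform. The early return in the clear path is
an assumption made to mirror the mark path.

```c
#define EEH_PE_ISOLATED		(1 << 0)
#define EEH_PE_CFG_BLOCKED	(1 << 2)
#define EEH_PE_CFG_RESTRICTED	(1 << 9)

struct model_pe { int state; };	/* hypothetical stand-in for struct eeh_pe */

/* Marking a restricted PE as isolated also blocks its config space. */
static void model_state_mark(struct model_pe *pe, int state)
{
	pe->state |= state;
	if (!(state & EEH_PE_ISOLATED))
		return;
	if (pe->state & EEH_PE_CFG_RESTRICTED)
		pe->state |= EEH_PE_CFG_BLOCKED;
}

/* Clearing the isolated state unblocks config space again (assumed flow). */
static void model_state_clear(struct model_pe *pe, int state)
{
	pe->state &= ~state;
	if (!(state & EEH_PE_ISOLATED))
		return;
	if (pe->state & EEH_PE_CFG_RESTRICTED)
		pe->state &= ~EEH_PE_CFG_BLOCKED;
}
```

EEH_PE_CFG_RESTRICTED acts as a static property of the PE and survives
both transitions, while EEH_PE_CFG_BLOCKED follows the isolated state.
A PE without the restricted flag is never blocked by this path.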


* [PATCH 6/6] powerpc/eeh: Don't collect logs on PE with blocked config space
  2014-10-01  7:07 [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
                   ` (4 preceding siblings ...)
  2014-10-01  7:07 ` [PATCH 5/6] powerpc/eeh: Block PCI config access upon frozen PE Gavin Shan
@ 2014-10-01  7:07 ` Gavin Shan
  2014-10-01  7:12 ` [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
  6 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2014-10-01  7:07 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

When the PE's config space is marked as blocked, PCI config read
requests always return 0xFF's. It's pointless to collect logs in
this case.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index c9d274e..e6a718f 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -257,6 +257,13 @@ static void *eeh_dump_pe_log(void *data, void *flag)
 	struct eeh_dev *edev, *tmp;
 	size_t *plen = flag;
 
+	/* If the PE's config space is blocked, 0xFF's will be
+	 * returned. It's pointless to collect the log in this
+	 * case.
+	 */
+	if (pe->state & EEH_PE_CFG_BLOCKED)
+		return NULL;
+
 	eeh_pe_for_each_dev(pe, edev, tmp)
 		*plen += eeh_dump_dev_log(edev, pci_regs_buf + *plen,
 					  EEH_PCI_REGS_LOG_LEN - *plen);
-- 
1.8.3.2
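
A minimal userspace sketch of the early return is shown below. The
struct, the function name and the log line format are hypothetical;
only the EEH_PE_CFG_BLOCKED value is taken from this series. The real
eeh_dump_pe_log() walks the PE's devices rather than formatting a
single line.

```c
#include <stddef.h>
#include <stdio.h>

#define EEH_PE_CFG_BLOCKED	(1 << 2)	/* value from asm/eeh.h */

struct model_pe { int state; unsigned int addr; };	/* hypothetical */

/*
 * Append one log line for the PE to buf and return the number of
 * bytes written. A PE with blocked config space contributes nothing,
 * because every config read would only return 0xFF's.
 */
static size_t model_dump_pe_log(const struct model_pe *pe,
				char *buf, size_t len)
{
	int n;

	if (pe->state & EEH_PE_CFG_BLOCKED)
		return 0;

	n = snprintf(buf, len, "PE#%x: config space dump\n", pe->addr);
	if (n < 0 || len == 0)
		return 0;

	/* snprintf reports the untruncated length; clamp to what fits. */
	return (size_t)n < len ? (size_t)n : len - 1;
}
```

Skipping the PE before any read is issued keeps the log free of
meaningless all-ones data and avoids touching the adapter at all.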


* Re: [PATCH 0/6] powerpc/eeh: Refactor config accessors
  2014-10-01  7:07 [PATCH 0/6] powerpc/eeh: Refactor config accessors Gavin Shan
                   ` (5 preceding siblings ...)
  2014-10-01  7:07 ` [PATCH 6/6] powerpc/eeh: Don't collect logs on PE with blocked config space Gavin Shan
@ 2014-10-01  7:12 ` Gavin Shan
  6 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2014-10-01  7:12 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linuxppc-dev

On Wed, Oct 01, 2014 at 05:07:48PM +1000, Gavin Shan wrote:
>When EEH errors are detected on some particular PCI adapters, one of
>which is shown as follows, the PCI config space of those PCI adapters
>(PE) should be blocked. Otherwise, we will run into a fenced PHB when
>collecting EEH logs (part of recovery). The patchset fixes this issue.
>Also, EEH_PE_RESET is renamed to EEH_PE_CFG_BLOCKED to reflect its
>usage. It's a bad idea to allow PCI config access while the
>EEH_PE_CFG_BLOCKED flag is set for the corresponding PE because it can
>trigger a recursive EEH error. The patchset also blocks config requests
>from the EEH backend if necessary.
>

I forgot to attach the "lspci" output identifying the adapter we have
problems with:

# lspci -s 0003:09:00.0
0003:09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 \
	     Gigabit Ethernet PCIe (rev 01)
# lspci -n -s 0003:09:00.0
0003:09:00.0 0200: 14e4:1657 (rev 01)

Thanks,
Gavin

>Gavin Shan (6):
>  powerpc/eeh: Fix condition for isolated state
>  powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED
>  powerpc/powernv: Drop config requests in EEH accessors
>  powerpc/pseries: Drop config requests in EEH accessors
>  powerpc/eeh: Block PCI config access upon frozen PE
>  powerpc/eeh: Don't collect logs on PE with blocked config space
>
> arch/powerpc/include/asm/eeh.h               |  3 +-
> arch/powerpc/kernel/eeh.c                    | 19 +++++++---
> arch/powerpc/kernel/eeh_driver.c             | 12 +++---
> arch/powerpc/kernel/eeh_pe.c                 | 10 ++++-
> arch/powerpc/kernel/rtas_pci.c               | 30 ++++++---------
> arch/powerpc/platforms/powernv/eeh-ioda.c    |  2 +-
> arch/powerpc/platforms/powernv/eeh-powernv.c | 56 +++++++++++++++++++++++++++-
> arch/powerpc/platforms/powernv/pci.c         |  2 +-
> 8 files changed, 97 insertions(+), 37 deletions(-)
>
>-- 
>1.8.3.2
>

