linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* From: Gavin Shan <shangw@linux.vnet.ibm.com>
@ 2013-06-27  5:46 Gavin Shan
  2013-06-27  5:46 ` [PATCH 1/8] powerpc/eeh: Don't collect PCI-CFG data on PHB Gavin Shan
                   ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev

The series of patches are follow-up in order to make EEH workable for PowerNV
platform on Juno-IOC-L machine. Couple of issues have been fixed with help of
Ben:

	- Check PCIe link after PHB complete reset
	- Restore config space for bridges
	- The EEH address cache wasn't built successfully
	- Misc cleanup on output messages
	- Misc cleanup on EEH flags maintained by "struct pnv_phb"
	- Misc cleanup on properties of functions to avoid build warnings
	- Let PCI config accessors rely on device node
	- Do hotplug during reset for those devices whose drivers can't
	  support EEH
 
---

Trigger frozen PE:

        echo 0x0000000002000000 > /sys/kernel/debug/powerpc/PCI0000/err_injct
        sleep 1
        echo 0x0 > /sys/kernel/debug/powerpc/PCI0000/err_injct

Trigger fenced PHB:

	echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0000/err_injct


---

Changelog:
==========
v3 -> v4:
	* Add more output messages in EEH core to let users know what the EEH
	  core is doing.
	* Add one patch to use device node in the PCI config accessors since
	  the accessors used by EEH and it's not safe enough to refer PCI device
	  and bus. We instead fully utilize the information from PCI_DN.
	* Add one patch to remove those deivces whose drivers can't support EEH
	  before reset, and add them to the system after reset. 
v2 -> v3:
	* Fix overwritten buffer while collecting data from PCI config space.
v1 -> v2:
	* Remove the mechanism to block PCI-CFG and MMIO.
	* Add one patch to do cleanup on output messages.
	* Add one patch to avoid build warnings.
	* Split functions to restore BARs for PCI devices and bridges separately.

---

arch/powerpc/include/asm/eeh.h               |    8 +-
arch/powerpc/include/asm/pci.h               |    1 +
arch/powerpc/kernel/eeh.c                    |   43 +++++--
arch/powerpc/kernel/eeh_cache.c              |    4 +-
arch/powerpc/kernel/eeh_driver.c             |  157 +++++++++++++++++++++++-
arch/powerpc/kernel/eeh_pe.c                 |  166 ++++++++++++++++++++++++--
arch/powerpc/kernel/pci_hotplug.c            |    8 +-
arch/powerpc/platforms/powernv/eeh-ioda.c    |   33 +++--
arch/powerpc/platforms/powernv/eeh-powernv.c |   44 +-------
arch/powerpc/platforms/powernv/pci-ioda.c    |    1 +
arch/powerpc/platforms/powernv/pci.c         |  124 ++++++++++++--------
arch/powerpc/platforms/powernv/pci.h         |   11 ++-
drivers/pci/probe.c                          |    6 +-
13 files changed, 462 insertions(+), 144 deletions(-)

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/8] powerpc/eeh: Don't collect PCI-CFG data on PHB
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
@ 2013-06-27  5:46 ` Gavin Shan
  2013-06-27  5:46 ` [PATCH 2/8] powerpc/eeh: Check PCIe link after reset Gavin Shan
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

When the PHB is fenced or dead, it's pointless to collect the data
from PCI config space of subordinate PCI devices since it should
return 0xFF's. The patch also fixes overwritten buffer while getting
PCI config data.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh.c |   34 ++++++++++++++++++++++++----------
 1 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 951a632..2dd0bd1 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -232,16 +232,30 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity)
 {
 	size_t loglen = 0;
 	struct eeh_dev *edev;
+	bool valid_cfg_log = true;
 
-	eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
-	eeh_ops->configure_bridge(pe);
-	eeh_pe_restore_bars(pe);
-
-	pci_regs_buf[0] = 0;
-	eeh_pe_for_each_dev(pe, edev) {
-		loglen += eeh_gather_pci_data(edev, pci_regs_buf,
-				EEH_PCI_REGS_LOG_LEN);
-        }
+	/*
+	 * When the PHB is fenced or dead, it's pointless to collect
+	 * the data from PCI config space because it should return
+	 * 0xFF's. For ER, we still retrieve the data from the PCI
+	 * config space.
+	 */
+	if (eeh_probe_mode_dev() &&
+	    (pe->type & EEH_PE_PHB) &&
+	    (pe->state & (EEH_PE_ISOLATED | EEH_PE_PHB_DEAD)))
+		valid_cfg_log = false;
+
+	if (valid_cfg_log) {
+		eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
+		eeh_ops->configure_bridge(pe);
+		eeh_pe_restore_bars(pe);
+
+		pci_regs_buf[0] = 0;
+		eeh_pe_for_each_dev(pe, edev) {
+			loglen += eeh_gather_pci_data(edev, pci_regs_buf + loglen,
+						      EEH_PCI_REGS_LOG_LEN - loglen);
+		}
+	}
 
 	eeh_ops->get_log(pe, severity, pci_regs_buf, loglen);
 }
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/8] powerpc/eeh: Check PCIe link after reset
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
  2013-06-27  5:46 ` [PATCH 1/8] powerpc/eeh: Don't collect PCI-CFG data on PHB Gavin Shan
@ 2013-06-27  5:46 ` Gavin Shan
  2013-06-27  5:46 ` [PATCH 3/8] powerpc/powernv: Replace variables with flags Gavin Shan
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

After reset (e.g. complete reset) in order to bring the fenced PHB
back, the PCIe link might not be ready yet. The patch intends to
make sure the PCIe link is ready before accessing its subordinate
PCI devices. The patch also fixes that wrong values restored to
PCI_COMMAND register for PCI bridges.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_pe.c |  157 ++++++++++++++++++++++++++++++++++++++----
 1 files changed, 144 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 55943fc..016588a 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -22,6 +22,7 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  */
 
+#include <linux/delay.h>
 #include <linux/export.h>
 #include <linux/gfp.h>
 #include <linux/init.h>
@@ -567,30 +568,132 @@ void eeh_pe_state_clear(struct eeh_pe *pe, int state)
 	eeh_pe_traverse(pe, __eeh_pe_state_clear, &state);
 }
 
-/**
- * eeh_restore_one_device_bars - Restore the Base Address Registers for one device
- * @data: EEH device
- * @flag: Unused
+/*
+ * Some PCI bridges (e.g. PLX bridges) have primary/secondary
+ * buses assigned explicitly by firmware, and we probably have
+ * lost that after reset. So we have to delay the check until
+ * the PCI-CFG registers have been restored for the parent
+ * bridge.
  *
- * Loads the PCI configuration space base address registers,
- * the expansion ROM base address, the latency timer, and etc.
- * from the saved values in the device node.
+ * Don't use normal PCI-CFG accessors, which probably has been
+ * blocked on normal path during the stage. So we need utilize
+ * eeh operations, which is always permitted.
  */
-static void *eeh_restore_one_device_bars(void *data, void *flag)
+static void eeh_bridge_check_link(struct pci_dev *pdev,
+				  struct device_node *dn)
+{
+	int cap;
+	uint32_t val;
+	int timeout = 0;
+
+	/*
+	 * We only check root port and downstream ports of
+	 * PCIe switches
+	 */
+	if (!pci_is_pcie(pdev) ||
+	    (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT &&
+	     pci_pcie_type(pdev) != PCI_EXP_TYPE_DOWNSTREAM))
+		return;
+
+	pr_debug("%s: Check PCIe link for %s ...\n",
+		 __func__, pci_name(pdev));
+
+	/* Check slot status */
+	cap = pdev->pcie_cap;
+	eeh_ops->read_config(dn, cap + PCI_EXP_SLTSTA, 2, &val);
+	if (!(val & PCI_EXP_SLTSTA_PDS)) {
+		pr_debug("  No card in the slot (0x%04x) !\n", val);
+		return;
+	}
+
+	/* Check power status if we have the capability */
+	eeh_ops->read_config(dn, cap + PCI_EXP_SLTCAP, 2, &val);
+	if (val & PCI_EXP_SLTCAP_PCP) {
+		eeh_ops->read_config(dn, cap + PCI_EXP_SLTCTL, 2, &val);
+		if (val & PCI_EXP_SLTCTL_PCC) {
+			pr_debug("  In power-off state, power it on ...\n");
+			val &= ~(PCI_EXP_SLTCTL_PCC | PCI_EXP_SLTCTL_PIC);
+			val |= (0x0100 & PCI_EXP_SLTCTL_PIC);
+			eeh_ops->write_config(dn, cap + PCI_EXP_SLTCTL, 2, val);
+			msleep(2 * 1000);
+		}
+	}
+
+	/* Enable link */
+	eeh_ops->read_config(dn, cap + PCI_EXP_LNKCTL, 2, &val);
+	val &= ~PCI_EXP_LNKCTL_LD;
+	eeh_ops->write_config(dn, cap + PCI_EXP_LNKCTL, 2, val);
+
+	/* Check link */
+	eeh_ops->read_config(dn, cap + PCI_EXP_LNKCAP, 4, &val);
+	if (!(val & PCI_EXP_LNKCAP_DLLLARC)) {
+		pr_debug("  No link reporting capability (0x%08x) \n", val);
+		msleep(1000);
+		return;
+	}
+
+	/* Wait the link is up until timeout (5s) */
+	timeout = 0;
+	while (timeout < 5000) {
+		msleep(20);
+		timeout += 20;
+
+		eeh_ops->read_config(dn, cap + PCI_EXP_LNKSTA, 2, &val);
+		if (val & PCI_EXP_LNKSTA_DLLLA)
+			break;
+	}
+
+	if (val & PCI_EXP_LNKSTA_DLLLA)
+		pr_debug("  Link up (%s)\n",
+			 (val & PCI_EXP_LNKSTA_CLS_2_5GB) ? "2.5GB" : "5GB");
+	else
+		pr_debug("  Link not ready (0x%04x)\n", val);
+}
+
+#define BYTE_SWAP(OFF)	(8*((OFF)/4)+3-(OFF))
+#define SAVED_BYTE(OFF)	(((u8 *)(edev->config_space))[BYTE_SWAP(OFF)])
+
+static void eeh_restore_bridge_bars(struct pci_dev *pdev,
+				    struct eeh_dev *edev,
+				    struct device_node *dn)
+{
+	int i;
+
+	/*
+	 * Device BARs: 0x10 - 0x18
+	 * Bus numbers and windows: 0x18 - 0x30
+	 */
+	for (i = 4; i < 13; i++)
+		eeh_ops->write_config(dn, i*4, 4, edev->config_space[i]);
+	/* Rom: 0x38 */
+	eeh_ops->write_config(dn, 14*4, 4, edev->config_space[14]);
+
+	/* Cache line & Latency timer: 0xC 0xD */
+	eeh_ops->write_config(dn, PCI_CACHE_LINE_SIZE, 1,
+                SAVED_BYTE(PCI_CACHE_LINE_SIZE));
+        eeh_ops->write_config(dn, PCI_LATENCY_TIMER, 1,
+                SAVED_BYTE(PCI_LATENCY_TIMER));
+	/* Max latency, min grant, interrupt ping and line: 0x3C */
+	eeh_ops->write_config(dn, 15*4, 4, edev->config_space[15]);
+
+	/* PCI Command: 0x4 */
+	eeh_ops->write_config(dn, PCI_COMMAND, 4, edev->config_space[1]);
+
+	/* Check the PCIe link is ready */
+	eeh_bridge_check_link(pdev, dn);
+}
+
+static void eeh_restore_device_bars(struct eeh_dev *edev,
+				    struct device_node *dn)
 {
 	int i;
 	u32 cmd;
-	struct eeh_dev *edev = (struct eeh_dev *)data;
-	struct device_node *dn = eeh_dev_to_of_node(edev);
 
 	for (i = 4; i < 10; i++)
 		eeh_ops->write_config(dn, i*4, 4, edev->config_space[i]);
 	/* 12 == Expansion ROM Address */
 	eeh_ops->write_config(dn, 12*4, 4, edev->config_space[12]);
 
-#define BYTE_SWAP(OFF) (8*((OFF)/4)+3-(OFF))
-#define SAVED_BYTE(OFF) (((u8 *)(edev->config_space))[BYTE_SWAP(OFF)])
-
 	eeh_ops->write_config(dn, PCI_CACHE_LINE_SIZE, 1,
 		SAVED_BYTE(PCI_CACHE_LINE_SIZE));
 	eeh_ops->write_config(dn, PCI_LATENCY_TIMER, 1,
@@ -613,6 +716,34 @@ static void *eeh_restore_one_device_bars(void *data, void *flag)
 	else
 		cmd &= ~PCI_COMMAND_SERR;
 	eeh_ops->write_config(dn, PCI_COMMAND, 4, cmd);
+}
+
+/**
+ * eeh_restore_one_device_bars - Restore the Base Address Registers for one device
+ * @data: EEH device
+ * @flag: Unused
+ *
+ * Loads the PCI configuration space base address registers,
+ * the expansion ROM base address, the latency timer, and etc.
+ * from the saved values in the device node.
+ */
+static void *eeh_restore_one_device_bars(void *data, void *flag)
+{
+	struct pci_dev *pdev = NULL;
+	struct eeh_dev *edev = (struct eeh_dev *)data;
+	struct device_node *dn = eeh_dev_to_of_node(edev);
+
+	/* Trace the PCI bridge */
+	if (eeh_probe_mode_dev()) {
+		pdev = eeh_dev_to_pci_dev(edev);
+		if (pdev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
+                        pdev = NULL;
+        }
+
+	if (pdev)
+		eeh_restore_bridge_bars(pdev, edev, dn);
+	else
+		eeh_restore_device_bars(edev, dn);
 
 	return NULL;
 }
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/8] powerpc/powernv: Replace variables with flags
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
  2013-06-27  5:46 ` [PATCH 1/8] powerpc/eeh: Don't collect PCI-CFG data on PHB Gavin Shan
  2013-06-27  5:46 ` [PATCH 2/8] powerpc/eeh: Check PCIe link after reset Gavin Shan
@ 2013-06-27  5:46 ` Gavin Shan
  2013-06-27  5:46 ` [PATCH 4/8] powerpc/eeh: Fix address catch for PowerNV Gavin Shan
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

We have 2 fields in "struct pnv_phb" to trace the states. The patch
replace the fields with one and introduces flags for that. The patch
doesn't impact the logic.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-ioda.c |    8 ++++----
 arch/powerpc/platforms/powernv/pci.c      |    4 ++--
 arch/powerpc/platforms/powernv/pci.h      |    7 +++++--
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
index 84f3036..85025d7 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
@@ -132,7 +132,7 @@ static int ioda_eeh_post_init(struct pci_controller *hose)
 					    &ioda_eeh_dbgfs_ops);
 #endif
 
-		phb->eeh_enabled = 1;
+		phb->eeh_state |= PNV_EEH_STATE_ENABLED;
 	}
 
 	return 0;
@@ -815,7 +815,7 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 		 * removed, we needn't take care of it any more.
 		 */
 		phb = hose->private_data;
-		if (phb->removed)
+		if (phb->eeh_state & PNV_EEH_STATE_REMOVED)
 			continue;
 
 		rc = opal_pci_next_error(phb->opal_id,
@@ -850,7 +850,7 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 				list_for_each_entry_safe(hose, tmp,
 						&hose_list, list_node) {
 					phb = hose->private_data;
-					phb->removed = 1;
+					phb->eeh_state |= PNV_EEH_STATE_REMOVED;
 				}
 
 				WARN(1, "EEH: dead IOC detected\n");
@@ -867,7 +867,7 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 
 				WARN(1, "EEH: dead PHB#%x detected\n",
 				     hose->global_number);
-				phb->removed = 1;
+				phb->eeh_state |= PNV_EEH_STATE_REMOVED;
 				ret = 3;
 				goto out;
 			} else if (severity == OPAL_EEH_SEV_PHB_FENCED) {
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 6d9a506..1f31826 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -308,7 +308,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 	if (phb_pe && (phb_pe->state & EEH_PE_ISOLATED))
 		return PCIBIOS_SUCCESSFUL;
 
-	if (phb->eeh_enabled) {
+	if (phb->eeh_state & PNV_EEH_STATE_ENABLED) {
 		if (*val == EEH_IO_ERROR_VALUE(size)) {
 			busdn = pci_bus_to_OF_node(bus);
 			for (dn = busdn->child; dn; dn = dn->sibling) {
@@ -358,7 +358,7 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 
 	/* Check if the PHB got frozen due to an error (no response) */
 #ifdef CONFIG_EEH
-	if (!phb->eeh_enabled)
+	if (!(phb->eeh_state & PNV_EEH_STATE_ENABLED))
 		pnv_pci_config_check_eeh(phb, bus, bdfn);
 #else
 	pnv_pci_config_check_eeh(phb, bus, bdfn);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 43906e3..40bdf02 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -78,6 +78,10 @@ struct pnv_eeh_ops {
 	int (*configure_bridge)(struct eeh_pe *pe);
 	int (*next_error)(struct eeh_pe **pe);
 };
+
+#define PNV_EEH_STATE_ENABLED	(1 << 0)	/* EEH enabled	*/
+#define PNV_EEH_STATE_REMOVED	(1 << 1)	/* PHB removed	*/
+
 #endif /* CONFIG_EEH */
 
 struct pnv_phb {
@@ -92,8 +96,7 @@ struct pnv_phb {
 
 #ifdef CONFIG_EEH
 	struct pnv_eeh_ops	*eeh_ops;
-	int			eeh_enabled;
-	int			removed;
+	int			eeh_state;
 #endif
 
 #ifdef CONFIG_DEBUG_FS
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/8] powerpc/eeh: Fix address catch for PowerNV
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
                   ` (2 preceding siblings ...)
  2013-06-27  5:46 ` [PATCH 3/8] powerpc/powernv: Replace variables with flags Gavin Shan
@ 2013-06-27  5:46 ` Gavin Shan
  2013-06-27  5:46 ` [PATCH 5/8] powerpc/eeh: Refactor the output message Gavin Shan
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

On the PowerNV platform, the EEH address cache isn't built correctly
because we skipped the EEH devices without binding PE. The patch
fixes that.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_cache.c           |    2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |    1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index 1d5d9a6..858ebea 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -194,7 +194,7 @@ static void __eeh_addr_cache_insert_dev(struct pci_dev *dev)
 	}
 
 	/* Skip any devices for which EEH is not enabled. */
-	if (!edev->pe) {
+	if (!eeh_probe_mode_dev() && !edev->pe) {
 #ifdef DEBUG
 		pr_info("PCI: skip building address cache for=%s - %s\n",
 			pci_name(dev), dn->full_name);
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3e5c3d5..0ff9a3a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -998,6 +998,7 @@ static void pnv_pci_ioda_fixup(void)
 	pnv_pci_ioda_create_dbgfs();
 
 #ifdef CONFIG_EEH
+	eeh_probe_mode_set(EEH_PROBE_MODE_DEV);
 	eeh_addr_cache_build();
 	eeh_init();
 #endif
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/8] powerpc/eeh: Refactor the output message
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
                   ` (3 preceding siblings ...)
  2013-06-27  5:46 ` [PATCH 4/8] powerpc/eeh: Fix address catch for PowerNV Gavin Shan
@ 2013-06-27  5:46 ` Gavin Shan
  2013-06-27  5:46 ` [PATCH 6/8] powerpc/eeh: Avoid build warnings Gavin Shan
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

We needn't the the whole backtrace other than one-line message in
the error reporting interrupt handler. For errors triggered by
access PCI config space or MMIO, we replace "WARN(1, ...)" with
pr_err() and dump_stack(). The patch also adds more output messages
to indicate what EEH core is doing. Besides, some printk() are
replaced with pr_warning().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh.c                 |    9 +++++++--
 arch/powerpc/kernel/eeh_driver.c          |   23 ++++++++++++++++++-----
 arch/powerpc/platforms/powernv/eeh-ioda.c |   25 ++++++++++++++++---------
 3 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 2dd0bd1..f7f2775 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -324,7 +324,9 @@ static int eeh_phb_check_failure(struct eeh_pe *pe)
 	eeh_serialize_unlock(flags);
 	eeh_send_failure_event(phb_pe);
 
-	WARN(1, "EEH: PHB failure detected\n");
+	pr_err("EEH: PHB#%x failure detected\n",
+		phb_pe->phb->global_number);
+	dump_stack();
 
 	return 1;
 out:
@@ -453,7 +455,10 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
 	 * a stack trace will help the device-driver authors figure
 	 * out what happened.  So print that out.
 	 */
-	WARN(1, "EEH: failure detected\n");
+	pr_err("EEH: Frozen PE#%x detected on PHB#%x\n",
+		pe->addr, pe->phb->global_number);
+	dump_stack();
+
 	return 1;
 
 dn_unlock:
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 0974e13..2b1ce17 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -425,6 +425,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
 	 * status ... if any child can't handle the reset, then the entire
 	 * slot is dlpar removed and added.
 	 */
+	pr_info("EEH: Notify device drivers to shutdown\n");
 	eeh_pe_dev_traverse(pe, eeh_report_error, &result);
 
 	/* Get the current PCI slot state. This can take a long time,
@@ -432,7 +433,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
 	 */
 	rc = eeh_ops->wait_state(pe, MAX_WAIT_FOR_RECOVERY*1000);
 	if (rc < 0 || rc == EEH_STATE_NOT_SUPPORT) {
-		printk(KERN_WARNING "EEH: Permanent failure\n");
+		pr_warning("EEH: Permanent failure\n");
 		goto hard_fail;
 	}
 
@@ -440,6 +441,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
 	 * don't post the error log until after all dev drivers
 	 * have been informed.
 	 */
+	pr_info("EEH: Collect temporary log\n");
 	eeh_slot_error_detail(pe, EEH_LOG_TEMP);
 
 	/* If all device drivers were EEH-unaware, then shut
@@ -447,15 +449,18 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
 	 * go down willingly, without panicing the system.
 	 */
 	if (result == PCI_ERS_RESULT_NONE) {
+		pr_info("EEH: Reset with hotplug activity\n");
 		rc = eeh_reset_device(pe, frozen_bus);
 		if (rc) {
-			printk(KERN_WARNING "EEH: Unable to reset, rc=%d\n", rc);
+			pr_warning("%s: Unable to reset, err=%d\n",
+				   __func__, rc);
 			goto hard_fail;
 		}
 	}
 
 	/* If all devices reported they can proceed, then re-enable MMIO */
 	if (result == PCI_ERS_RESULT_CAN_RECOVER) {
+		pr_info("EEH: Enable I/O for affected devices\n");
 		rc = eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
 
 		if (rc < 0)
@@ -463,6 +468,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
 		if (rc) {
 			result = PCI_ERS_RESULT_NEED_RESET;
 		} else {
+			pr_info("EEH: Notify device drivers to resume I/O\n");
 			result = PCI_ERS_RESULT_NONE;
 			eeh_pe_dev_traverse(pe, eeh_report_mmio_enabled, &result);
 		}
@@ -470,6 +476,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
 
 	/* If all devices reported they can proceed, then re-enable DMA */
 	if (result == PCI_ERS_RESULT_CAN_RECOVER) {
+		pr_info("EEH: Enabled DMA for affected devices\n");
 		rc = eeh_pci_enable(pe, EEH_OPT_THAW_DMA);
 
 		if (rc < 0)
@@ -482,17 +489,22 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
 
 	/* If any device has a hard failure, then shut off everything. */
 	if (result == PCI_ERS_RESULT_DISCONNECT) {
-		printk(KERN_WARNING "EEH: Device driver gave up\n");
+		pr_warning("EEH: Device driver gave up\n");
 		goto hard_fail;
 	}
 
 	/* If any device called out for a reset, then reset the slot */
 	if (result == PCI_ERS_RESULT_NEED_RESET) {
+		pr_info("EEH: Reset without hotplug activity\n");
 		rc = eeh_reset_device(pe, NULL);
 		if (rc) {
-			printk(KERN_WARNING "EEH: Cannot reset, rc=%d\n", rc);
+			pr_warning("%s: Cannot reset, err=%d\n",
+				   __func__, rc);
 			goto hard_fail;
 		}
+
+		pr_info("EEH: Notify device drivers "
+			"the completion of reset\n");
 		result = PCI_ERS_RESULT_NONE;
 		eeh_pe_dev_traverse(pe, eeh_report_reset, &result);
 	}
@@ -500,11 +512,12 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
 	/* All devices should claim they have recovered by now. */
 	if ((result != PCI_ERS_RESULT_RECOVERED) &&
 	    (result != PCI_ERS_RESULT_NONE)) {
-		printk(KERN_WARNING "EEH: Not recovered\n");
+		pr_warning("EEH: Not recovered\n");
 		goto hard_fail;
 	}
 
 	/* Tell all device drivers that they can resume operations */
+	pr_info("EEH: Notify device driver to resume\n");
 	eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);
 
 	return;
diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
index 85025d7..0cd1c4a 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
@@ -853,11 +853,14 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 					phb->eeh_state |= PNV_EEH_STATE_REMOVED;
 				}
 
-				WARN(1, "EEH: dead IOC detected\n");
+				pr_err("EEH: dead IOC detected\n");
 				ret = 4;
 				goto out;
-			} else if (severity == OPAL_EEH_SEV_INF)
+			} else if (severity == OPAL_EEH_SEV_INF) {
+				pr_info("EEH: IOC informative error "
+					"detected\n");
 				ioda_eeh_hub_diag(hose);
+			}
 
 			break;
 		case OPAL_EEH_PHB_ERROR:
@@ -865,8 +868,8 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 				if (ioda_eeh_get_phb_pe(hose, pe))
 					break;
 
-				WARN(1, "EEH: dead PHB#%x detected\n",
-				     hose->global_number);
+				pr_err("EEH: dead PHB#%x detected\n",
+					hose->global_number);
 				phb->eeh_state |= PNV_EEH_STATE_REMOVED;
 				ret = 3;
 				goto out;
@@ -874,20 +877,24 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 				if (ioda_eeh_get_phb_pe(hose, pe))
 					break;
 
-				WARN(1, "EEH: fenced PHB#%x detected\n",
-				     hose->global_number);
+				pr_err("EEH: fenced PHB#%x detected\n",
+					hose->global_number);
 				ret = 2;
 				goto out;
-			} else if (severity == OPAL_EEH_SEV_INF)
+			} else if (severity == OPAL_EEH_SEV_INF) {
+				pr_info("EEH: PHB#%x informative error "
+					"detected\n",
+					hose->global_number);
 				ioda_eeh_phb_diag(hose);
+			}
 
 			break;
 		case OPAL_EEH_PE_ERROR:
 			if (ioda_eeh_get_pe(hose, frozen_pe_no, pe))
 				break;
 
-			WARN(1, "EEH: Frozen PE#%x on PHB#%x detected\n",
-			     (*pe)->addr, (*pe)->phb->global_number);
+			pr_err("EEH: Frozen PE#%x on PHB#%x detected\n",
+				(*pe)->addr, (*pe)->phb->global_number);
 			ret = 1;
 			goto out;
 		}
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 6/8] powerpc/eeh: Avoid build warnings
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
                   ` (4 preceding siblings ...)
  2013-06-27  5:46 ` [PATCH 5/8] powerpc/eeh: Refactor the output message Gavin Shan
@ 2013-06-27  5:46 ` Gavin Shan
  2013-06-27  5:46 ` [PATCH 7/8] powerpc/powernv: Use dev-node in PCI config accessors Gavin Shan
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

The patch is for avoiding following build warnings:

   The function .pnv_pci_ioda_fixup() references
   the function __init .eeh_init().
   This is often because .pnv_pci_ioda_fixup lacks a __init

   The function .pnv_pci_ioda_fixup() references
   the function __init .eeh_addr_cache_build().
   This is often because .pnv_pci_ioda_fixup lacks a __init

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h  |    4 ++--
 arch/powerpc/kernel/eeh.c       |    2 +-
 arch/powerpc/kernel/eeh_cache.c |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index dd65e31..09a8743 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -202,13 +202,13 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
 
 void *eeh_dev_init(struct device_node *dn, void *data);
 void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
-int __init eeh_init(void);
+int eeh_init(void);
 int __init eeh_ops_register(struct eeh_ops *ops);
 int __exit eeh_ops_unregister(const char *name);
 unsigned long eeh_check_failure(const volatile void __iomem *token,
 				unsigned long val);
 int eeh_dev_check_failure(struct eeh_dev *edev);
-void __init eeh_addr_cache_build(void);
+void eeh_addr_cache_build(void);
 void eeh_add_device_tree_early(struct device_node *);
 void eeh_add_device_tree_late(struct pci_bus *);
 void eeh_add_sysfs_files(struct pci_bus *);
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index f7f2775..14475f6 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -751,7 +751,7 @@ int __exit eeh_ops_unregister(const char *name)
  * Even if force-off is set, the EEH hardware is still enabled, so that
  * newer systems can boot.
  */
-int __init eeh_init(void)
+int eeh_init(void)
 {
 	struct pci_controller *hose, *tmp;
 	struct device_node *phb;
diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index 858ebea..ea9a94c 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -285,7 +285,7 @@ void eeh_addr_cache_rmv_dev(struct pci_dev *dev)
  * Must be run late in boot process, after the pci controllers
  * have been scanned for devices (after all device resources are known).
  */
-void __init eeh_addr_cache_build(void)
+void eeh_addr_cache_build(void)
 {
 	struct device_node *dn;
 	struct eeh_dev *edev;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 7/8] powerpc/powernv: Use dev-node in PCI config accessors
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
                   ` (5 preceding siblings ...)
  2013-06-27  5:46 ` [PATCH 6/8] powerpc/eeh: Avoid build warnings Gavin Shan
@ 2013-06-27  5:46 ` Gavin Shan
  2013-06-27  5:46 ` [PATCH 8/8] powernv/eeh: Do hotplug on devices without EEH aware driver Gavin Shan
  2013-06-27  5:51 ` From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
  8 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

Currently, we're using the combo (PCI bus + devfn) in the PCI
config accessors and PCI config accessors in EEH depends on them.
However, it's not safe to refer the PCI bus which might have been
removed during hotplug. So we're using device node in the PCI
config accessors and the corresponding backends just reuse them.

The patch also fix one potential risk: We possiblly have frozen
PE during the early PCI probe time, but we haven't setup the PE
mapping yet. So the errors should be counted to PE#0.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c |   44 +---------
 arch/powerpc/platforms/powernv/pci.c         |  120 ++++++++++++++++----------
 arch/powerpc/platforms/powernv/pci.h         |    4 +
 3 files changed, 79 insertions(+), 89 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 9559115..969cce7 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -315,46 +315,6 @@ static int powernv_eeh_configure_bridge(struct eeh_pe *pe)
 }
 
 /**
- * powernv_eeh_read_config - Read PCI config space
- * @dn: device node
- * @where: PCI address
- * @size: size to read
- * @val: return value
- *
- * Read config space from the speicifed device
- */
-static int powernv_eeh_read_config(struct device_node *dn, int where,
-				   int size, u32 *val)
-{
-	struct eeh_dev *edev = of_node_to_eeh_dev(dn);
-	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
-	struct pci_controller *hose = edev->phb;
-
-	return hose->ops->read(dev->bus, dev->devfn, where, size, val);
-}
-
-/**
- * powernv_eeh_write_config - Write PCI config space
- * @dn: device node
- * @where: PCI address
- * @size: size to write
- * @val: value to be written
- *
- * Write config space to the specified device
- */
-static int powernv_eeh_write_config(struct device_node *dn, int where,
-				    int size, u32 val)
-{
-	struct eeh_dev *edev = of_node_to_eeh_dev(dn);
-	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
-	struct pci_controller *hose = edev->phb;
-
-	hose = pci_bus_to_host(dev->bus);
-
-	return hose->ops->write(dev->bus, dev->devfn, where, size, val);
-}
-
-/**
  * powernv_eeh_next_error - Retrieve next EEH error to handle
  * @pe: Affected PE
  *
@@ -389,8 +349,8 @@ static struct eeh_ops powernv_eeh_ops = {
 	.wait_state             = powernv_eeh_wait_state,
 	.get_log                = powernv_eeh_get_log,
 	.configure_bridge       = powernv_eeh_configure_bridge,
-	.read_config            = powernv_eeh_read_config,
-	.write_config           = powernv_eeh_write_config,
+	.read_config            = pnv_pci_cfg_read,
+	.write_config           = pnv_pci_cfg_write,
 	.next_error		= powernv_eeh_next_error
 };
 
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 1f31826..229b034 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -230,47 +230,50 @@ static void pnv_pci_handle_eeh_config(struct pnv_phb *phb, u32 pe_no)
 	spin_unlock_irqrestore(&phb->lock, flags);
 }
 
-static void pnv_pci_config_check_eeh(struct pnv_phb *phb, struct pci_bus *bus,
-				     u32 bdfn)
+static void pnv_pci_config_check_eeh(struct pnv_phb *phb,
+				     struct device_node *dn)
 {
 	s64	rc;
 	u8	fstate;
 	u16	pcierr;
 	u32	pe_no;
 
-	/* Get PE# if we support IODA */
-	pe_no = phb->bdfn_to_pe ? phb->bdfn_to_pe(phb, bus, bdfn & 0xff) : 0;
+	/*
+	 * Get the PE#. During the PCI probe stage, we might not
+	 * setup that yet. So all ER errors should be mapped to
+	 * PE#0
+	 */
+	pe_no = PCI_DN(dn)->pe_number;
+	if (pe_no == IODA_INVALID_PE)
+		pe_no = 0;
 
 	/* Read freeze status */
 	rc = opal_pci_eeh_freeze_status(phb->opal_id, pe_no, &fstate, &pcierr,
 					NULL);
 	if (rc) {
-		pr_warning("PCI %d: Failed to read EEH status for PE#%d,"
-			   " err %lld\n", phb->hose->global_number, pe_no, rc);
+		pr_warning("%s: Can't read EEH status (PE#%d) for "
+			   "%s, err %lld\n",
+			   __func__, pe_no, dn->full_name, rc);
 		return;
 	}
-	cfg_dbg(" -> EEH check, bdfn=%04x PE%d fstate=%x\n",
-		bdfn, pe_no, fstate);
+	cfg_dbg(" -> EEH check, bdfn=%04x PE#%d fstate=%x\n",
+		(PCI_DN(dn)->busno << 8) | (PCI_DN(dn)->devfn),
+		pe_no, fstate);
 	if (fstate != 0)
 		pnv_pci_handle_eeh_config(phb, pe_no);
 }
 
-static int pnv_pci_read_config(struct pci_bus *bus,
-			       unsigned int devfn,
-			       int where, int size, u32 *val)
+int pnv_pci_cfg_read(struct device_node *dn,
+		     int where, int size, u32 *val)
 {
-	struct pci_controller *hose = pci_bus_to_host(bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = PCI_DN(dn);
+	struct pnv_phb *phb = pdn->phb->private_data;
+	u32 bdfn = (pdn->busno << 8) | pdn->devfn;
 #ifdef CONFIG_EEH
-	struct device_node *busdn, *dn;
 	struct eeh_pe *phb_pe = NULL;
 #endif
-	u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
 	s64 rc;
 
-	if (hose == NULL)
-		return PCIBIOS_DEVICE_NOT_FOUND;
-
 	switch (size) {
 	case 1: {
 		u8 v8;
@@ -294,8 +297,8 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 	default:
 		return PCIBIOS_FUNC_NOT_SUPPORTED;
 	}
-	cfg_dbg("pnv_pci_read_config bus: %x devfn: %x +%x/%x -> %08x\n",
-		bus->number, devfn, where, size, *val);
+	cfg_dbg("%s: bus: %x devfn: %x +%x/%x -> %08x\n",
+		__func__, pdn->busno, pdn->devfn, where, size, *val);
 
 	/*
 	 * Check if the specified PE has been put into frozen
@@ -304,44 +307,33 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 	 * PHB-fatal errors.
 	 */
 #ifdef CONFIG_EEH
-	phb_pe = eeh_phb_pe_get(hose);
+	phb_pe = eeh_phb_pe_get(pdn->phb);
 	if (phb_pe && (phb_pe->state & EEH_PE_ISOLATED))
 		return PCIBIOS_SUCCESSFUL;
 
 	if (phb->eeh_state & PNV_EEH_STATE_ENABLED) {
-		if (*val == EEH_IO_ERROR_VALUE(size)) {
-			busdn = pci_bus_to_OF_node(bus);
-			for (dn = busdn->child; dn; dn = dn->sibling) {
-				struct pci_dn *pdn = PCI_DN(dn);
-
-				if (pdn && pdn->devfn == devfn &&
-				    eeh_dev_check_failure(of_node_to_eeh_dev(dn)))
-					return PCIBIOS_DEVICE_NOT_FOUND;
-			}
-		}
+		if (*val == EEH_IO_ERROR_VALUE(size) &&
+		    eeh_dev_check_failure(of_node_to_eeh_dev(dn)))
+			return PCIBIOS_DEVICE_NOT_FOUND;
 	} else {
-		pnv_pci_config_check_eeh(phb, bus, bdfn);
+		pnv_pci_config_check_eeh(phb, dn);
 	}
 #else
-	pnv_pci_config_check_eeh(phb, bus, bdfn);
+	pnv_pci_config_check_eeh(phb, dn);
 #endif
 
 	return PCIBIOS_SUCCESSFUL;
 }
 
-static int pnv_pci_write_config(struct pci_bus *bus,
-				unsigned int devfn,
-				int where, int size, u32 val)
+int pnv_pci_cfg_write(struct device_node *dn,
+		      int where, int size, u32 val)
 {
-	struct pci_controller *hose = pci_bus_to_host(bus);
-	struct pnv_phb *phb = hose->private_data;
-	u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
+	struct pci_dn *pdn = PCI_DN(dn);
+	struct pnv_phb *phb = pdn->phb->private_data;
+	u32 bdfn = (pdn->busno << 8) | pdn->devfn;
 
-	if (hose == NULL)
-		return PCIBIOS_DEVICE_NOT_FOUND;
-
-	cfg_dbg("pnv_pci_write_config bus: %x devfn: %x +%x/%x -> %08x\n",
-		bus->number, devfn, where, size, val);
+	cfg_dbg("%s: bus: %x devfn: %x +%x/%x -> %08x\n",
+		pdn->busno, pdn->devfn, where, size, val);
 	switch (size) {
 	case 1:
 		opal_pci_config_write_byte(phb->opal_id, bdfn, where, val);
@@ -359,16 +351,50 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 	/* Check if the PHB got frozen due to an error (no response) */
 #ifdef CONFIG_EEH
 	if (!(phb->eeh_state & PNV_EEH_STATE_ENABLED))
-		pnv_pci_config_check_eeh(phb, bus, bdfn);
+		pnv_pci_config_check_eeh(phb, dn);
 #else
-	pnv_pci_config_check_eeh(phb, bus, bdfn);
+	pnv_pci_config_check_eeh(phb, dn);
 #endif
 
 	return PCIBIOS_SUCCESSFUL;
 }
 
+static int pnv_pci_read_config(struct pci_bus *bus,
+			       unsigned int devfn,
+			       int where, int size, u32 *val)
+{
+	struct device_node *dn, *busdn = pci_bus_to_OF_node(bus);
+	struct pci_dn *pdn;
+
+	for (dn = busdn->child; dn; dn = dn->sibling) {
+		pdn = PCI_DN(dn);
+		if (pdn && pdn->devfn == devfn)
+			return pnv_pci_cfg_read(dn, where, size, val);
+	}
+
+	*val = 0xFFFFFFFF;
+	return PCIBIOS_DEVICE_NOT_FOUND;
+
+}
+
+static int pnv_pci_write_config(struct pci_bus *bus,
+				unsigned int devfn,
+				int where, int size, u32 val)
+{
+	struct device_node *dn, *busdn = pci_bus_to_OF_node(bus);
+	struct pci_dn *pdn;
+
+	for (dn = busdn->child; dn; dn = dn->sibling) {
+		pdn = PCI_DN(dn);
+		if (pdn && pdn->devfn == devfn)
+			return pnv_pci_cfg_write(dn, where, size, val);
+	}
+
+	return PCIBIOS_DEVICE_NOT_FOUND;
+}
+
 struct pci_ops pnv_pci_ops = {
-	.read = pnv_pci_read_config,
+	.read  = pnv_pci_read_config,
 	.write = pnv_pci_write_config,
 };
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 40bdf02..d633c64 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -182,6 +182,10 @@ extern struct pci_ops pnv_pci_ops;
 extern struct pnv_eeh_ops ioda_eeh_ops;
 #endif
 
+int pnv_pci_cfg_read(struct device_node *dn,
+		     int where, int size, u32 *val);
+int pnv_pci_cfg_write(struct device_node *dn,
+		      int where, int size, u32 val);
 extern void pnv_pci_setup_iommu_table(struct iommu_table *tbl,
 				      void *tce_mem, u64 tce_size,
 				      u64 dma_offset);
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 8/8] powernv/eeh: Do hotplug on devices without EEH aware driver
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
                   ` (6 preceding siblings ...)
  2013-06-27  5:46 ` [PATCH 7/8] powerpc/powernv: Use dev-node in PCI config accessors Gavin Shan
@ 2013-06-27  5:46 ` Gavin Shan
  2013-06-30  6:25   ` Benjamin Herrenschmidt
  2013-06-27  5:51 ` From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
  8 siblings, 1 reply; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

During recovery for EEH errors, the device driver requires reset
explicitly (most of cases). The EEH core doesn't do hotplug during
reset. However, there might have some device drivers that can't
support EEH. So the deivce can't be put into quite state during
the reset and possibly requesting PCI config or MMIO access. That
would lead to the failure of the reset, and we don't expect that.

The patch intends to fix that by removing those devices whose drivers
can't support EEH before reset and added into the system after reset.
In the result, it would avoid the race condition mentioned as above.
The idea was proposed by Ben.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h   |    4 +-
 arch/powerpc/include/asm/pci.h   |    1 +
 arch/powerpc/kernel/eeh_driver.c |  134 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 137 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 09a8743..dd62006 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -82,7 +82,8 @@ struct eeh_pe {
  * another tree except the currently existing tree of PCI
  * buses and PCI devices
  */
-#define EEH_DEV_IRQ_DISABLED	(1<<0)	/* Interrupt disabled		*/
+#define EEH_DEV_IRQ_DISABLED	(1 << 0)	/* Interrupt disabled	*/
+#define EEH_DEV_REMOVED		(1 << 1)	/* PCI device removed	*/
 
 struct eeh_dev {
 	int mode;			/* EEH mode			*/
@@ -95,6 +96,7 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct device_node *dn;		/* Associated device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+	struct pci_bus *bus;		/* PCI bus used in hotplug	*/
 };
 
 static inline struct device_node *eeh_dev_to_of_node(struct eeh_dev *edev)
diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
index 6653f27..af10ec5 100644
--- a/arch/powerpc/include/asm/pci.h
+++ b/arch/powerpc/include/asm/pci.h
@@ -183,6 +183,7 @@ extern void pci_resource_to_user(const struct pci_dev *dev, int bar,
 				 resource_size_t *start, resource_size_t *end);
 
 extern resource_size_t pcibios_io_space_offset(struct pci_controller *hose);
+extern void pcibios_setup_device(struct pci_dev *dev);
 extern void pcibios_setup_bus_devices(struct pci_bus *bus);
 extern void pcibios_setup_bus_self(struct pci_bus *bus);
 extern void pcibios_setup_phb_io_space(struct pci_controller *hose);
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 2b1ce17..cb3baab 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -244,6 +244,7 @@ static void *eeh_report_reset(void *data, void *userdata)
 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
 	enum pci_ers_result rc, *res = userdata;
 	struct pci_driver *driver;
+	bool enable_irq = true;
 
 	if (!dev) return NULL;
 	dev->error_state = pci_channel_io_normal;
@@ -251,7 +252,21 @@ static void *eeh_report_reset(void *data, void *userdata)
 	driver = eeh_pcid_get(dev);
 	if (!driver) return NULL;
 
-	eeh_enable_irq(dev);
+	/*
+	 * For those PCI devices just added, we reloaded its driver
+	 * and needn't to enable the interrupt. The driver should
+	 * take care of that. Otherwise, complaint raised from IRQ
+	 * subsystem.
+	 */
+	if (eeh_probe_mode_dev() && (edev->mode & EEH_DEV_REMOVED)) {
+		edev->mode &= ~(EEH_DEV_REMOVED | EEH_DEV_IRQ_DISABLED);
+		edev->bus = NULL;
+		enable_irq = false;
+
+	}
+
+	if (enable_irq)
+		eeh_enable_irq(dev);
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->slot_reset) {
@@ -338,6 +353,115 @@ static void *eeh_report_failure(void *data, void *userdata)
 	return NULL;
 }
 
+static void *eeh_rmv_device(void *data, void *userdata)
+{
+	struct pci_driver *driver;
+	struct eeh_dev *edev = (struct eeh_dev *)data;
+	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
+	int *removed = (int *)userdata;
+
+	/*
+	 * Actually, we should remove the PCI bridges as well.
+	 * However, that's lots of complexity to do that,
+	 * particularly some of devices under the bridge might
+	 * support EEH. So we just care about PCI devices for
+	 * simplicity here.
+	 */
+	if (!dev || (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE))
+		return NULL;
+	driver = eeh_pcid_get(dev);
+	if (driver && driver->err_handler)
+		return NULL;
+
+	/* If the driver doesn't support EEH, remove it */
+	pr_info("EEH: Removing device %s without EEH support\n",
+		pci_name(dev));
+
+	/* Detach EEH device from PCI device */
+	edev->pdev = NULL;
+	dev->dev.archdata.edev = NULL;
+	pci_dev_put(dev);
+
+	/* Remove and address cache */
+	eeh_addr_cache_rmv_dev(dev);
+	eeh_sysfs_remove_device(dev);
+
+	/* Remove it from PCI subsystem */
+	edev->mode |= EEH_DEV_REMOVED;
+	edev->bus = dev->bus;
+	pci_stop_and_remove_bus_device(dev);
+	(*removed)++;
+
+	return NULL;
+}
+
+static void *eeh_add_device(void *data, void *userdata)
+{
+	struct eeh_dev *edev = (struct eeh_dev *)data;
+	struct pci_dev *dev;
+	struct pci_bus *bus;
+	struct resource *r;
+	int *removed = (int *)userdata;
+	int devfn, i;
+
+	if (!edev || !(edev->mode & EEH_DEV_REMOVED))
+		return NULL;
+	if (*removed <= 0)
+		return edev;
+
+	/*
+	 * We don't clear EEH_DEV_REMOVED flag here.
+	 * Instead, do that before enabling IRQ to
+	 * avoid complain from IRQ subsystem.
+	 */
+	*removed -= 1;
+	bus = edev->bus;
+	devfn = edev->config_addr & 0xFF;
+	pr_info("EEH: Adding PCI device %04x:%02x:%02x.%01x\n",
+		edev->phb->global_number, bus->number,
+		PCI_SLOT(devfn), PCI_FUNC(devfn));
+
+	/* Scan PCI function */
+	dev = pci_scan_single_device(bus, devfn);
+	if (!dev) {
+		pr_err("%s: Can't scan PCI function %02x:%02x.%01x\n",
+		       __func__, bus->number, PCI_SLOT(devfn),
+		       PCI_FUNC(devfn));
+		return NULL;
+	}
+
+	/*
+	 * Setup the PCI device. It's not enough to
+	 * claim the resource and we need assign or
+	 * reassign that.
+	 */
+	pcibios_setup_device(dev);
+	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+		r = &dev->resource[i];
+		if (r->parent || !r->flags)
+			continue;
+		if (pci_assign_resource(dev, i)) {
+			pr_err("%s: Can't allocate %pR for %s\n",
+			       __func__, r, pci_name(dev));
+			/* Clear it out */
+			r->start = 0;
+			r->end = 0;
+			r->flags = 0;
+		}
+	}
+
+	/* Associate EEH device with PCI device */
+	pci_dev_get(dev);
+	edev->pdev = dev;
+	dev->dev.archdata.edev = edev;
+	eeh_addr_cache_insert_dev(dev);
+
+	pci_bus_add_devices(bus);
+	eeh_sysfs_add_device(dev);
+
+	return NULL;
+}
+
 /**
  * eeh_reset_device - Perform actual reset of a pci slot
  * @pe: EEH PE
@@ -351,6 +475,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 {
 	struct timeval tstamp;
 	int cnt, rc;
+	int removed = 0;
 
 	/* pcibios will clear the counter; save the value */
 	cnt = pe->freeze_count;
@@ -364,6 +489,8 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	 */
 	if (bus)
 		__pcibios_remove_pci_devices(bus, 0);
+	else if (eeh_probe_mode_dev())
+		eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
 
 	/* Reset the pci controller. (Asserts RST#; resets config space).
 	 * Reconfigure bridges and devices. Don't try to bring the system
@@ -384,8 +511,13 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	 * potentially weird things happen.
 	 */
 	if (bus) {
+		pr_info("EEH: Hold for 5 seconds after reset\n");
 		ssleep(5);
 		pcibios_add_pci_devices(bus);
+	} else if (eeh_probe_mode_dev() && removed) {
+		pr_info("EEH: Hold for 5 seconds after reset\n");
+		ssleep(5);
+		eeh_pe_dev_traverse(pe, eeh_add_device, &removed);
 	}
 
 	pe->tstamp = tstamp;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: From: Gavin Shan <shangw@linux.vnet.ibm.com>
  2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
                   ` (7 preceding siblings ...)
  2013-06-27  5:46 ` [PATCH 8/8] powernv/eeh: Do hotplug on devices without EEH aware driver Gavin Shan
@ 2013-06-27  5:51 ` Gavin Shan
  8 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2013-06-27  5:51 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linuxppc-dev

On Thu, Jun 27, 2013 at 01:46:41PM +0800, Gavin Shan wrote:

The subject went wrong. I didn't make the format correct and here's
the makeup:

[PATCH v4 00/8] Follow-up fixes for EEH on PowerNV

>The series of patches are follow-up in order to make EEH workable for PowerNV
>platform on Juno-IOC-L machine. Couple of issues have been fixed with help of
>Ben:
>
>	- Check PCIe link after PHB complete reset
>	- Restore config space for bridges
>	- The EEH address cache wasn't built successfully
>	- Misc cleanup on output messages
>	- Misc cleanup on EEH flags maintained by "struct pnv_phb"
>	- Misc cleanup on properties of functions to avoid build warnings
>	- Let PCI config accessors rely on device node
>	- Do hotplug during reset for those devices whose drivers can't
>	  support EEH
>
>---
>
>Trigger frozen PE:
>
>        echo 0x0000000002000000 > /sys/kernel/debug/powerpc/PCI0000/err_injct
>        sleep 1
>        echo 0x0 > /sys/kernel/debug/powerpc/PCI0000/err_injct
>
>Trigger fenced PHB:
>
>	echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0000/err_injct
>
>
>---
>
>Changelog:
>==========
>v3 -> v4:
>	* Add more output messages in EEH core to let users know what the EEH
>	  core is doing.
>	* Add one patch to use device node in the PCI config accessors since
>	  the accessors used by EEH and it's not safe enough to refer PCI device
>	  and bus. We instead fully utilize the information from PCI_DN.
>	* Add one patch to remove those deivces whose drivers can't support EEH
>	  before reset, and add them to the system after reset. 
>v2 -> v3:
>	* Fix overwritten buffer while collecting data from PCI config space.
>v1 -> v2:
>	* Remove the mechanism to block PCI-CFG and MMIO.
>	* Add one patch to do cleanup on output messages.
>	* Add one patch to avoid build warnings.
>	* Split functions to restore BARs for PCI devices and bridges separately.
>
>---
>
>arch/powerpc/include/asm/eeh.h               |    8 +-
>arch/powerpc/include/asm/pci.h               |    1 +
>arch/powerpc/kernel/eeh.c                    |   43 +++++--
>arch/powerpc/kernel/eeh_cache.c              |    4 +-
>arch/powerpc/kernel/eeh_driver.c             |  157 +++++++++++++++++++++++-
>arch/powerpc/kernel/eeh_pe.c                 |  166 ++++++++++++++++++++++++--
>arch/powerpc/kernel/pci_hotplug.c            |    8 +-
>arch/powerpc/platforms/powernv/eeh-ioda.c    |   33 +++--
>arch/powerpc/platforms/powernv/eeh-powernv.c |   44 +-------
>arch/powerpc/platforms/powernv/pci-ioda.c    |    1 +
>arch/powerpc/platforms/powernv/pci.c         |  124 ++++++++++++--------
>arch/powerpc/platforms/powernv/pci.h         |   11 ++-
>drivers/pci/probe.c                          |    6 +-
>13 files changed, 462 insertions(+), 144 deletions(-)
>
>Thanks,
>Gavin
>
>_______________________________________________
>Linuxppc-dev mailing list
>Linuxppc-dev@lists.ozlabs.org
>https://lists.ozlabs.org/listinfo/linuxppc-dev
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 8/8] powernv/eeh: Do hotplug on devices without EEH aware driver
  2013-06-27  5:46 ` [PATCH 8/8] powernv/eeh: Do hotplug on devices without EEH aware driver Gavin Shan
@ 2013-06-30  6:25   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 11+ messages in thread
From: Benjamin Herrenschmidt @ 2013-06-30  6:25 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linuxppc-dev

On Thu, 2013-06-27 at 13:46 +0800, Gavin Shan wrote:
> During recovery for EEH errors, the device driver requires reset
> explicitly (most of cases). The EEH core doesn't do hotplug during
> reset. However, there might have some device drivers that can't
> support EEH. So the deivce can't be put into quite state during
> the reset and possibly requesting PCI config or MMIO access. That
> would lead to the failure of the reset, and we don't expect that.
> 
> The patch intends to fix that by removing those devices whose drivers
> can't support EEH before reset and added into the system after reset.
> In the result, it would avoid the race condition mentioned as above.
> The idea was proposed by Ben.

Ok so, while the basic idea is sane, I have some problems with the
implementation.

First, you add a lot of code conditional to eeh_probe_mode_dev() and
I don't see why. There is no reason not to do the exact same remove/add
operations under pHyp.

This is actually a problem I have with the "new" EEH code overall, there
are too much gratuitous differences between the two "modes" and it's
very tricky to figure out exactly what and why, meaning this is going to
be a source of bugs. I think it's worthwhile trying to reconcile some of
that.

But then, there is *already* code to do a similar unplug/replug in EEH,
except that it's done inside eeh_reset_device() in the "other" case
(and indeed not necessarily in the best way).

Let's look a bit more:

> +extern void pcibios_setup_device(struct pci_dev *dev);

Unrelated: The above is gone as an exported symbol from upstream
as of today. I'll merge upstream in before applying your patches
so don't add that. You shouldn't need anyway, see below.

>  extern void pcibios_setup_bus_devices(struct pci_bus *bus);
>  extern void pcibios_setup_bus_self(struct pci_bus *bus);
>  extern void pcibios_setup_phb_io_space(struct pci_controller *hose);
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 2b1ce17..cb3baab 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -244,6 +244,7 @@ static void *eeh_report_reset(void *data, void *userdata)
>  	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
>  	enum pci_ers_result rc, *res = userdata;
>  	struct pci_driver *driver;
> +	bool enable_irq = true;
>  
>  	if (!dev) return NULL;
>  	dev->error_state = pci_channel_io_normal;
> @@ -251,7 +252,21 @@ static void *eeh_report_reset(void *data, void *userdata)
>  	driver = eeh_pcid_get(dev);
>  	if (!driver) return NULL;
>  
> -	eeh_enable_irq(dev);

So I would very much like to understand a bit more that irq business,
for example, what does it do when MSIs are enabled, etc... but that
is also probably something to look at separately (you have a TODO
list ?)

> +	/*
> +	 * For those PCI devices just added, we reloaded its driver
> +	 * and needn't to enable the interrupt. The driver should
> +	 * take care of that. Otherwise, complaint raised from IRQ
> +	 * subsystem.
> +	 */
> +	if (eeh_probe_mode_dev() && (edev->mode & EEH_DEV_REMOVED)) {
> +		edev->mode &= ~(EEH_DEV_REMOVED | EEH_DEV_IRQ_DISABLED);
> +		edev->bus = NULL;
> +		enable_irq = false;
> +
> +	}

Ok, yet another eeh_probe_mode_dev() ... they irk me :-) There should be
no functional differences between EEH operations on powernv and pseries
at a high level and this is high level...

> +	if (enable_irq)
> +		eeh_enable_irq(dev);
>  
>  	if (!driver->err_handler ||
>  	    !driver->err_handler->slot_reset) {
> @@ -338,6 +353,115 @@ static void *eeh_report_failure(void *data, void *userdata)
>  	return NULL;
>  }
>  
> +static void *eeh_rmv_device(void *data, void *userdata)
> +{
> +	struct pci_driver *driver;
> +	struct eeh_dev *edev = (struct eeh_dev *)data;
> +	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
> +	int *removed = (int *)userdata;
> +
> +	/*
> +	 * Actually, we should remove the PCI bridges as well.
> +	 * However, that's lots of complexity to do that,
> +	 * particularly some of devices under the bridge might
> +	 * support EEH. So we just care about PCI devices for
> +	 * simplicity here.
> +	 */
> +	if (!dev || (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE))
> +		return NULL;
> +	driver = eeh_pcid_get(dev);
> +	if (driver && driver->err_handler)
> +		return NULL;

So here we end up basically re-inventing PCI hotplug:

> +	/* If the driver doesn't support EEH, remove it */
> +	pr_info("EEH: Removing device %s without EEH support\n",
> +		pci_name(dev));
> +
> +	/* Detach EEH device from PCI device */
> +	edev->pdev = NULL;
> +	dev->dev.archdata.edev = NULL;
> +	pci_dev_put(dev);
> +
> +	/* Remove and address cache */
> +	eeh_addr_cache_rmv_dev(dev);
> +	eeh_sysfs_remove_device(dev);
> +
> +	/* Remove it from PCI subsystem */
> +	edev->mode |= EEH_DEV_REMOVED;
> +	edev->bus = dev->bus;
> +	pci_stop_and_remove_bus_device(dev);
> +	(*removed)++;
> +
> +	return NULL;
> +}

And ...

> +static void *eeh_add_device(void *data, void *userdata)
> +{
> +	struct eeh_dev *edev = (struct eeh_dev *)data;
> +	struct pci_dev *dev;
> +	struct pci_bus *bus;
> +	struct resource *r;
> +	int *removed = (int *)userdata;
> +	int devfn, i;
> +
> +	if (!edev || !(edev->mode & EEH_DEV_REMOVED))
> +		return NULL;
> +	if (*removed <= 0)
> +		return edev;
> +
> +	/*
> +	 * We don't clear EEH_DEV_REMOVED flag here.
> +	 * Instead, do that before enabling IRQ to
> +	 * avoid complain from IRQ subsystem.
> +	 */
> +	*removed -= 1;
> +	bus = edev->bus;
> +	devfn = edev->config_addr & 0xFF;
> +	pr_info("EEH: Adding PCI device %04x:%02x:%02x.%01x\n",
> +		edev->phb->global_number, bus->number,
> +		PCI_SLOT(devfn), PCI_FUNC(devfn));
> +
> +	/* Scan PCI function */
> +	dev = pci_scan_single_device(bus, devfn);
> +	if (!dev) {
> +		pr_err("%s: Can't scan PCI function %02x:%02x.%01x\n",
> +		       __func__, bus->number, PCI_SLOT(devfn),
> +		       PCI_FUNC(devfn));
> +		return NULL;
> +	}

Wow, that's a hard scan ... we shouldn't do that on pseries (which you
don't since you don't use that code but still ...). See below for more
details.

> +	/*
> +	 * Setup the PCI device. It's not enough to
> +	 * claim the resource and we need assign or
> +	 * reassign that.
> +	 */
> +	pcibios_setup_device(dev);
> +	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
> +		r = &dev->resource[i];
> +		if (r->parent || !r->flags)
> +			continue;
> +		if (pci_assign_resource(dev, i)) {

Are we really supposed to play with resource assignment here ? We should
just restore the BARs to their former values which are normally cached
in the eeh dev...

> +			pr_err("%s: Can't allocate %pR for %s\n",
> +			       __func__, r, pci_name(dev));
> +			/* Clear it out */
> +			r->start = 0;
> +			r->end = 0;
> +			r->flags = 0;
> +		}
> +	}
> +
> +	/* Associate EEH device with PCI device */
> +	pci_dev_get(dev);
> +	edev->pdev = dev;
> +	dev->dev.archdata.edev = edev;
> +	eeh_addr_cache_insert_dev(dev);
> +
> +	pci_bus_add_devices(bus);
> +	eeh_sysfs_add_device(dev);

So all that is a duplicate of:

> +	return NULL;
> +}
> +
>  /**
>   * eeh_reset_device - Perform actual reset of a pci slot
>   * @pe: EEH PE
> @@ -351,6 +475,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
>  {
>  	struct timeval tstamp;
>  	int cnt, rc;
> +	int removed = 0;
>  
>  	/* pcibios will clear the counter; save the value */
>  	cnt = pe->freeze_count;
> @@ -364,6 +489,8 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
>  	 */
>  	if (bus)
>       		__pcibios_remove_pci_devices(bus, 0);
                  ^^^^^^^^^^ this

and ...

> +	else if (eeh_probe_mode_dev())
> +		eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
>  
>  	/* Reset the pci controller. (Asserts RST#; resets config space).
>  	 * Reconfigure bridges and devices. Don't try to bring the system
> @@ -384,8 +511,13 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
>  	 * potentially weird things happen.
>  	 */
>  	if (bus) {
> +		pr_info("EEH: Hold for 5 seconds after reset\n");
>  		ssleep(5);
>  		pcibios_add_pci_devices(bus);
                  ^^^^^^^^^^^^^ this

Except of course that the above are working on a "full bus" basis while
you try to deal with individual devices. More discussion about that
further down. Note also that the 2 above functions are in
arch/powerpc/platform/pseries. They should probably be moved (along with
their dependents) somewhere else such as in
arch/powerpc/kernel/pci-common.c as they are generally useful for
anything doing hotplug and as-is, it may not build with pseries enabled
in the .config

> +	} else if (eeh_probe_mode_dev() && removed) {
> +		pr_info("EEH: Hold for 5 seconds after reset\n");
> +		ssleep(5);
> +		eeh_pe_dev_traverse(pe, eeh_add_device, &removed);
>  	}
>  
>  	pe->tstamp = tstamp;

So, basically, the existing code tries to unplug the whole PE (which is
always a bus) under pseries, if *any* device in the PE doesn't do EEH.

Your new code tries to handle individual devices in the PE (does it ?)
and remove only those who don't handle EEH (am I correct ?) in case of
an EEH error, and re-plug them.

I don't see a reason why the latter shouldn't work for the former case.

IE Why can't we also do individual devices instead of removing the whole
lot on pseries ?

Also, the code to do the actual removal and adding which is currently
in pci_dlpar.c properly handles "device-tree" style probing *provided*
that you restore the BARs before you call it.

So at this point, I suggest you consider one of two approaches:

 - You can either break down the code in pci_dlpar.c to handle
individual devices instead of entire busses (and still provide a wrapper
for entire busses since that's used by the PCI hotplug code). Then you
use that for both pseries and powernv.

 - Or you can make powernv do like pseries and unplug the entire PE
which is always a bus for now (at least until we do SR-IOV) and use the
existing pci_dlpar.c functions more/less unmodified... but you probably
will have to switch to the first option when we do SR-IOV.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2013-06-30  6:26 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-27  5:46 From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan
2013-06-27  5:46 ` [PATCH 1/8] powerpc/eeh: Don't collect PCI-CFG data on PHB Gavin Shan
2013-06-27  5:46 ` [PATCH 2/8] powerpc/eeh: Check PCIe link after reset Gavin Shan
2013-06-27  5:46 ` [PATCH 3/8] powerpc/powernv: Replace variables with flags Gavin Shan
2013-06-27  5:46 ` [PATCH 4/8] powerpc/eeh: Fix address catch for PowerNV Gavin Shan
2013-06-27  5:46 ` [PATCH 5/8] powerpc/eeh: Refactor the output message Gavin Shan
2013-06-27  5:46 ` [PATCH 6/8] powerpc/eeh: Avoid build warnings Gavin Shan
2013-06-27  5:46 ` [PATCH 7/8] powerpc/powernv: Use dev-node in PCI config accessors Gavin Shan
2013-06-27  5:46 ` [PATCH 8/8] powernv/eeh: Do hotplug on devices without EEH aware driver Gavin Shan
2013-06-30  6:25   ` Benjamin Herrenschmidt
2013-06-27  5:51 ` From: Gavin Shan <shangw@linux.vnet.ibm.com> Gavin Shan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).