linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8]
@ 2019-03-20  2:58 Sam Bobroff
  2019-03-20  2:58 ` [PATCH 1/8] powerpc/64: Adjust order in pcibios_init() Sam Bobroff
                   ` (8 more replies)
  0 siblings, 9 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

Hi all,

This patch set adds support for EEH recovery of hot plugged devices on pSeries
machines. Specifically, devices discovered by PCI rescanning using
/sys/bus/pci/rescan, which includes devices hotplugged by QEMU's device_add
command. (pSeries doesn't currently use slot power control for hotplugging.)

As a side effect this also provides EEH support for devices removed by
/sys/bus/pci/devices/*/remove and re-discovered by writing to /sys/bus/pci/rescan,
on all platforms.

The approach I've taken is to use the fact that the existing
pcibios_bus_add_device() platform hooks (which are used to set up EEH on
Virtual Function devices (VFs)) are actually called for all devices, so I've
widened their scope and made other adjustments necessary to allow them to work
for hotplugged and boot-time devices as well.

Because some of the changes are in generic PowerPC code, it's
possible that I've disturbed something for another PowerPC platform. I've tried
to minimize this by leaving that code alone as much as possible and so there
are a few cases where eeh_add_device_{early,late}() or eeh_add_sysfs_files() is
called more than once. I think these can be looked at later, as duplicate calls
are not harmful.

The patch "Convert PNV_PHB_FLAG_EEH" isn't strictly necessary and I'm not sure
if it's better to keep it, because it simplifies the code or drop it, because
we may need a separate flag per PHB later on. Thoughts anyone?

The first patch is a rework of the pcibios_init reordering patch I posted
earlier, which I've included here because it's necessary for this set.

I have done some testing for PowerNV on Power9 using a modified pnv_php module
and some testing on pSeries with slot power control using a modified rpaphp
module, and the EEH-related parts seem to work.

Cheers,
Sam.

Sam Bobroff (8):
  powerpc/64: Adjust order in pcibios_init()
  powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
  powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
  powerpc/eeh: Improve debug messages around device addition
  powerpc/eeh: Add eeh_show_enabled()
  powerpc/eeh: Initialize EEH address cache earlier
  powerpc/eeh: EEH for pSeries hot plug
  powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build()

 arch/powerpc/include/asm/eeh.h               | 19 +++--
 arch/powerpc/kernel/eeh.c                    | 33 ++++-----
 arch/powerpc/kernel/eeh_cache.c              | 29 +-------
 arch/powerpc/kernel/eeh_driver.c             |  4 ++
 arch/powerpc/kernel/of_platform.c            |  3 +-
 arch/powerpc/kernel/pci-common.c             |  4 --
 arch/powerpc/kernel/pci_32.c                 |  4 ++
 arch/powerpc/kernel/pci_64.c                 | 12 +++-
 arch/powerpc/platforms/powernv/eeh-powernv.c | 41 +++++------
 arch/powerpc/platforms/powernv/pci.c         |  7 +-
 arch/powerpc/platforms/powernv/pci.h         |  2 -
 arch/powerpc/platforms/pseries/eeh_pseries.c | 75 +++++++++++---------
 arch/powerpc/platforms/pseries/pci.c         |  7 +-
 13 files changed, 122 insertions(+), 118 deletions(-)

-- 
2.19.0.2.gcad72f5712


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/8] powerpc/64: Adjust order in pcibios_init()
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
@ 2019-03-20  2:58 ` Sam Bobroff
  2019-03-20  6:03   ` Alexey Kardashevskiy
  2019-03-20  2:58 ` [PATCH 2/8] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag Sam Bobroff
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

The pcibios_init() function for 64 bit PowerPC currently calls
pci_bus_add_devices() before pcibios_resource_survey(), which seems
incorrect because it adds devices and attempts to bind their drivers
before allocating their resources (although no problems seem to be
apparent).

So move the call to pci_bus_add_devices() to after
pcibios_resource_survey(), while extracting call to the
pcibios_fixup() hook so that it remains in the same location.

This will also allow the ppc_md.pcibios_bus_add_device() hooks to
perform actions that depend on PCI resources, both during rescanning
(where this is already the case) and at boot time, to support future
work.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/pci-common.c |  4 ----
 arch/powerpc/kernel/pci_32.c     |  4 ++++
 arch/powerpc/kernel/pci_64.c     | 12 +++++++++---
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index ff4b7539cbdf..3146eb73e3b3 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1383,10 +1383,6 @@ void __init pcibios_resource_survey(void)
 		pr_debug("PCI: Assigning unassigned resources...\n");
 		pci_assign_unassigned_resources();
 	}
-
-	/* Call machine dependent fixup */
-	if (ppc_md.pcibios_fixup)
-		ppc_md.pcibios_fixup();
 }
 
 /* This is used by the PCI hotplug driver to allocate resource
diff --git a/arch/powerpc/kernel/pci_32.c b/arch/powerpc/kernel/pci_32.c
index d3f04f2d8249..40aaa1a6e193 100644
--- a/arch/powerpc/kernel/pci_32.c
+++ b/arch/powerpc/kernel/pci_32.c
@@ -259,6 +259,10 @@ static int __init pcibios_init(void)
 	/* Call common code to handle resource allocation */
 	pcibios_resource_survey();
 
+	/* Call machine dependent fixup */
+	if (ppc_md.pcibios_fixup)
+		ppc_md.pcibios_fixup();
+
 	/* Call machine dependent post-init code */
 	if (ppc_md.pcibios_after_init)
 		ppc_md.pcibios_after_init();
diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
index 9d8c10d55407..6f16f30031d7 100644
--- a/arch/powerpc/kernel/pci_64.c
+++ b/arch/powerpc/kernel/pci_64.c
@@ -58,14 +58,20 @@ static int __init pcibios_init(void)
 	pci_add_flags(PCI_ENABLE_PROC_DOMAINS | PCI_COMPAT_DOMAIN_0);
 
 	/* Scan all of the recorded PCI controllers.  */
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
+	list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
 		pcibios_scan_phb(hose);
-		pci_bus_add_devices(hose->bus);
-	}
 
 	/* Call common code to handle resource allocation */
 	pcibios_resource_survey();
 
+	/* Add devices. */
+	list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
+		pci_bus_add_devices(hose->bus);
+
+	/* Call machine dependent fixup */
+	if (ppc_md.pcibios_fixup)
+		ppc_md.pcibios_fixup();
+
 	printk(KERN_DEBUG "PCI: Probing PCI hardware done\n");
 
 	return 0;
-- 
2.19.0.2.gcad72f5712


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/8] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
  2019-03-20  2:58 ` [PATCH 1/8] powerpc/64: Adjust order in pcibios_init() Sam Bobroff
@ 2019-03-20  2:58 ` Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-03-20  2:58 ` [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag Sam Bobroff
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

The EEH_DEV_NO_HANDLER flag is used by the EEH system to prevent the
use of driver callbacks in drivers that have been bound part way
through the recovery process. This is necessary to prevent later stage
handlers from being called when the earlier stage handlers haven't,
which can be confusing for drivers.

However, the flag is set for all devices that are added after boot
time and only cleared at the end of the EEH recovery process. This
results in hot plugged devices erroneously having the flag set during
the first recovery after they are added (causing their driver's
handlers to be incorrectly ignored).

To remedy this, clear the flag at the beginning of recovery
processing. The flag is still cleared at the end of recovery
processing, although it is no longer really necessary.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh_driver.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 6f3ee30565dd..4c34b9901f15 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -819,6 +819,10 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		result = PCI_ERS_RESULT_DISCONNECT;
 	}
 
+	eeh_for_each_pe(pe, tmp_pe)
+		eeh_pe_for_each_dev(tmp_pe, edev, tmp)
+			edev->mode &= ~EEH_DEV_NO_HANDLER;
+
 	/* Walk the various device drivers attached to this slot through
 	 * a reset sequence, giving each an opportunity to do what it needs
 	 * to accomplish the reset.  Each child gets a report of the
-- 
2.19.0.2.gcad72f5712


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
  2019-03-20  2:58 ` [PATCH 1/8] powerpc/64: Adjust order in pcibios_init() Sam Bobroff
  2019-03-20  2:58 ` [PATCH 2/8] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag Sam Bobroff
@ 2019-03-20  2:58 ` Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-04-18  9:51   ` Oliver O'Halloran
  2019-03-20  2:58 ` [PATCH 4/8] powerpc/eeh: Improve debug messages around device addition Sam Bobroff
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

The PHB flag, PNV_PHB_FLAG_EEH, is set (on PowerNV) individually on
each PHB once the EEH subsystem is ready. It is the only use of the
flags member of the phb struct.

However there is no need to store this separately on each PHB, so
convert it to a global flag. For symmetry, the flag is now also set
for pSeries; although it is currently unused it may be useful in the
future.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               | 11 +++++++++++
 arch/powerpc/platforms/powernv/eeh-powernv.c | 14 +++-----------
 arch/powerpc/platforms/powernv/pci.c         |  7 +++----
 arch/powerpc/platforms/powernv/pci.h         |  2 --
 arch/powerpc/platforms/pseries/pci.c         |  4 ++++
 5 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 3613a56281f2..fe4cf7208890 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -43,6 +43,7 @@ struct pci_dn;
 #define EEH_VALID_PE_ZERO	0x10	/* PE#0 is valid		     */
 #define EEH_ENABLE_IO_FOR_LOG	0x20	/* Enable IO for log		     */
 #define EEH_EARLY_DUMP_LOG	0x40	/* Dump log immediately		     */
+#define EEH_PHB_ENABLED		0x80	/* PHB recovery uses EEH	     */
 
 /*
  * Delay for PE reset, all in ms
@@ -245,6 +246,11 @@ static inline bool eeh_enabled(void)
 	return eeh_has_flag(EEH_ENABLED) && !eeh_has_flag(EEH_FORCE_DISABLED);
 }
 
+static inline bool eeh_phb_enabled(void)
+{
+	return eeh_has_flag(EEH_PHB_ENABLED);
+}
+
 static inline void eeh_serialize_lock(unsigned long *flags)
 {
 	raw_spin_lock_irqsave(&confirm_error_lock, *flags);
@@ -332,6 +338,11 @@ static inline bool eeh_enabled(void)
         return false;
 }
 
+static inline bool eeh_phb_enabled(void)
+{
+	return false;
+}
+
 static inline void eeh_probe_devices(void) { }
 
 static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 6fc1a463b796..f0a95f663810 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -264,22 +264,14 @@ int pnv_eeh_post_init(void)
 		return ret;
 	}
 
-	if (!eeh_enabled())
+	if (eeh_enabled())
+		eeh_add_flag(EEH_PHB_ENABLED);
+	else
 		disable_irq(eeh_event_irq);
 
 	list_for_each_entry(hose, &hose_list, list_node) {
 		phb = hose->private_data;
 
-		/*
-		 * If EEH is enabled, we're going to rely on that.
-		 * Otherwise, we restore to conventional mechanism
-		 * to clear frozen PE during PCI config access.
-		 */
-		if (eeh_enabled())
-			phb->flags |= PNV_PHB_FLAG_EEH;
-		else
-			phb->flags &= ~PNV_PHB_FLAG_EEH;
-
 		/* Create debugfs entries */
 #ifdef CONFIG_DEBUG_FS
 		if (phb->has_dbgfs || !phb->dbgfs)
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 307181fd8a17..d2b50f3bf6b1 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -717,10 +717,9 @@ int pnv_pci_cfg_write(struct pci_dn *pdn,
 static bool pnv_pci_cfg_check(struct pci_dn *pdn)
 {
 	struct eeh_dev *edev = NULL;
-	struct pnv_phb *phb = pdn->phb->private_data;
 
 	/* EEH not enabled ? */
-	if (!(phb->flags & PNV_PHB_FLAG_EEH))
+	if (!eeh_phb_enabled())
 		return true;
 
 	/* PE reset or device removed ? */
@@ -761,7 +760,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 
 	ret = pnv_pci_cfg_read(pdn, where, size, val);
 	phb = pdn->phb->private_data;
-	if (phb->flags & PNV_PHB_FLAG_EEH && pdn->edev) {
+	if (eeh_phb_enabled() && pdn->edev) {
 		if (*val == EEH_IO_ERROR_VALUE(size) &&
 		    eeh_dev_check_failure(pdn->edev))
                         return PCIBIOS_DEVICE_NOT_FOUND;
@@ -789,7 +788,7 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 
 	ret = pnv_pci_cfg_write(pdn, where, size, val);
 	phb = pdn->phb->private_data;
-	if (!(phb->flags & PNV_PHB_FLAG_EEH))
+	if (!eeh_phb_enabled())
 		pnv_pci_config_check_eeh(pdn);
 
 	return ret;
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 8e36da379252..eb0add61397b 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -85,8 +85,6 @@ struct pnv_ioda_pe {
 	struct list_head	list;
 };
 
-#define PNV_PHB_FLAG_EEH	(1 << 0)
-
 struct pnv_phb {
 	struct pci_controller	*hose;
 	enum pnv_phb_type	type;
diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
index 37a77e57893e..7be80882c08d 100644
--- a/arch/powerpc/platforms/pseries/pci.c
+++ b/arch/powerpc/platforms/pseries/pci.c
@@ -244,6 +244,10 @@ void __init pSeries_final_fixup(void)
 
 	eeh_probe_devices();
 	eeh_addr_cache_build();
+#ifdef CONFIG_EEH
+	if (eeh_enabled())
+		eeh_add_flag(EEH_PHB_ENABLED);
+#endif
 
 #ifdef CONFIG_PCI_IOV
 	ppc_md.pcibios_sriov_enable = pseries_pcibios_sriov_enable;
-- 
2.19.0.2.gcad72f5712


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/8] powerpc/eeh: Improve debug messages around device addition
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
                   ` (2 preceding siblings ...)
  2019-03-20  2:58 ` [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag Sam Bobroff
@ 2019-03-20  2:58 ` Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-03-20  2:58 ` [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled() Sam Bobroff
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

Also remove useless comment.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh.c                    |  2 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c | 14 ++++++++----
 arch/powerpc/platforms/pseries/eeh_pseries.c | 23 +++++++++++++++-----
 3 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 8d3c36a1f194..b14d89547895 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1291,7 +1291,7 @@ void eeh_add_device_late(struct pci_dev *dev)
 	pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
 	edev = pdn_to_eeh_dev(pdn);
 	if (edev->pdev == dev) {
-		pr_debug("EEH: Already referenced !\n");
+		pr_debug("EEH: Device %s already referenced!\n", pci_name(dev));
 		return;
 	}
 
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index f0a95f663810..51c5b6bb9b0e 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -50,10 +50,7 @@ void pnv_pcibios_bus_add_device(struct pci_dev *pdev)
 	if (!pdev->is_virtfn)
 		return;
 
-	/*
-	 * The following operations will fail if VF's sysfs files
-	 * aren't created or its resources aren't finalized.
-	 */
+	pr_debug("%s: EEH: Setting up device %s.\n", __func__, pci_name(pdev));
 	eeh_add_device_early(pdn);
 	eeh_add_device_late(pdev);
 	eeh_sysfs_add_device(pdev);
@@ -389,6 +386,10 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	int ret;
 	int config_addr = (pdn->busno << 8) | (pdn->devfn);
 
+	pr_debug("%s: probing %04x:%02x:%02x.%01x\n",
+		__func__, hose->global_number, pdn->busno,
+		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+
 	/*
 	 * When probing the root bridge, which doesn't have any
 	 * subordinate PCI devices. We don't have OF node for
@@ -483,6 +484,11 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	/* Save memory bars */
 	eeh_save_bars(edev);
 
+	pr_debug("%s: EEH enabled on %02x:%02x.%01x PHB#%x-PE#%x\n",
+		__func__, pdn->busno, PCI_SLOT(pdn->devfn),
+		PCI_FUNC(pdn->devfn), edev->pe->phb->global_number,
+		edev->pe->addr);
+
 	return NULL;
 }
 
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 7aa50258dd42..ae06878fbdea 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -65,6 +65,8 @@ void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
 	if (!pdev->is_virtfn)
 		return;
 
+	pr_debug("%s: EEH: Setting up device %s.\n", __func__, pci_name(pdev));
+
 	pdn->device_id  =  pdev->device;
 	pdn->vendor_id  =  pdev->vendor;
 	pdn->class_code =  pdev->class;
@@ -251,6 +253,10 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
 	int enable = 0;
 	int ret;
 
+	pr_debug("%s: probing %04x:%02x:%02x.%01x\n",
+		__func__, pdn->phb->global_number, pdn->busno,
+		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+
 	/* Retrieve OF node and eeh device */
 	edev = pdn_to_eeh_dev(pdn);
 	if (!edev || edev->pe)
@@ -294,7 +300,12 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
 
 	/* Enable EEH on the device */
 	ret = eeh_ops->set_option(&pe, EEH_OPT_ENABLE);
-	if (!ret) {
+	if (ret) {
+		pr_debug("%s: EEH failed to enable on %02x:%02x.%01x PHB#%x-PE#%x (code %d)\n",
+			__func__, pdn->busno, PCI_SLOT(pdn->devfn),
+			PCI_FUNC(pdn->devfn), pe.phb->global_number,
+			pe.addr, ret);
+	} else {
 		/* Retrieve PE address */
 		edev->pe_config_addr = eeh_ops->get_pe_addr(&pe);
 		pe.addr = edev->pe_config_addr;
@@ -310,11 +321,6 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
 		if (enable) {
 			eeh_add_flag(EEH_ENABLED);
 			eeh_add_to_parent_pe(edev);
-
-			pr_debug("%s: EEH enabled on %02x:%02x.%01x PHB#%x-PE#%x\n",
-				__func__, pdn->busno, PCI_SLOT(pdn->devfn),
-				PCI_FUNC(pdn->devfn), pe.phb->global_number,
-				pe.addr);
 		} else if (pdn->parent && pdn_to_eeh_dev(pdn->parent) &&
 			   (pdn_to_eeh_dev(pdn->parent))->pe) {
 			/* This device doesn't support EEH, but it may have an
@@ -323,6 +329,11 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
 			edev->pe_config_addr = pdn_to_eeh_dev(pdn->parent)->pe_config_addr;
 			eeh_add_to_parent_pe(edev);
 		}
+		pr_debug("%s: EEH %s on %02x:%02x.%01x PHB#%x-PE#%x (code %d)\n",
+			__func__, (enable ? "enabled" : "unsupported"),
+			pdn->busno, PCI_SLOT(pdn->devfn),
+			PCI_FUNC(pdn->devfn), pe.phb->global_number,
+			pe.addr, ret);
 	}
 
 	/* Save memory bars */
-- 
2.19.0.2.gcad72f5712


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled()
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
                   ` (3 preceding siblings ...)
  2019-03-20  2:58 ` [PATCH 4/8] powerpc/eeh: Improve debug messages around device addition Sam Bobroff
@ 2019-03-20  2:58 ` Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-04-18 10:01   ` Oliver O'Halloran
  2019-03-20  2:58 ` [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier Sam Bobroff
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

Move the EEH enabled message into it's own function so that future
work can call it from multiple places.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h |  3 +++
 arch/powerpc/kernel/eeh.c      | 16 +++++++++++-----
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index fe4cf7208890..e217ccda55d0 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -289,6 +289,7 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
 
 struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
 void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
+void eeh_show_enabled(void);
 void eeh_probe_devices(void);
 int __init eeh_ops_register(struct eeh_ops *ops);
 int __exit eeh_ops_unregister(const char *name);
@@ -338,6 +339,8 @@ static inline bool eeh_enabled(void)
         return false;
 }
 
+static inline void eeh_show_enabled(void) { }
+
 static inline bool eeh_phb_enabled(void)
 {
 	return false;
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index b14d89547895..3dcff29cb9b3 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -163,6 +163,16 @@ static int __init eeh_setup(char *str)
 }
 __setup("eeh=", eeh_setup);
 
+void eeh_show_enabled(void)
+{
+	if (eeh_has_flag(EEH_FORCE_DISABLED))
+		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (by eeh=off)\n");
+	else if (eeh_enabled())
+		pr_info("EEH: PCI Enhanced I/O Error Handling ENABLED (capable adapter found)\n");
+	else
+		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (no capable adapter found)\n");
+}
+
 /*
  * This routine captures assorted PCI configuration space data
  * for the indicated PCI device, and puts them into a buffer
@@ -1166,11 +1176,7 @@ void eeh_probe_devices(void)
 		pdn = hose->pci_data;
 		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
 	}
-	if (eeh_enabled())
-		pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
-	else
-		pr_info("EEH: No capable adapters found\n");
-
+	eeh_show_enabled();
 }
 
 /**
-- 
2.19.0.2.gcad72f5712


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
                   ` (4 preceding siblings ...)
  2019-03-20  2:58 ` [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled() Sam Bobroff
@ 2019-03-20  2:58 ` Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-04-18 10:13   ` Oliver O'Halloran
  2019-03-20  2:58 ` [PATCH 7/8] powerpc/eeh: EEH for pSeries hot plug Sam Bobroff
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

The EEH address cache is currently initialized and populated by a
single function: eeh_addr_cache_build().  While the initial population
of the cache can only be done once resources are allocated,
initialization (just setting up a spinlock) could be done much
earlier.

So move the initialization step into a separate function and call it
from a core_initcall (rather than a subsys initcall).

This will allow future work to make use of the cache during boot time
PCI scanning.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h  |  3 +++
 arch/powerpc/kernel/eeh.c       |  2 ++
 arch/powerpc/kernel/eeh_cache.c | 13 +++++++++++--
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index e217ccda55d0..791b9e6fcc45 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -295,6 +295,7 @@ int __init eeh_ops_register(struct eeh_ops *ops);
 int __exit eeh_ops_unregister(const char *name);
 int eeh_check_failure(const volatile void __iomem *token);
 int eeh_dev_check_failure(struct eeh_dev *edev);
+void eeh_addr_cache_init(void);
 void eeh_addr_cache_build(void);
 void eeh_add_device_early(struct pci_dn *);
 void eeh_add_device_tree_early(struct pci_dn *);
@@ -362,6 +363,8 @@ static inline int eeh_check_failure(const volatile void __iomem *token)
 
 #define eeh_dev_check_failure(x) (0)
 
+static inline void eeh_addr_cache_init(void) { }
+
 static inline void eeh_addr_cache_build(void) { }
 
 static inline void eeh_add_device_early(struct pci_dn *pdn) { }
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 3dcff29cb9b3..7a406d58d2c0 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1219,6 +1219,8 @@ static int eeh_init(void)
 	list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
 		eeh_dev_phb_init_dynamic(hose);
 
+	eeh_addr_cache_init();
+
 	/* Initialize EEH event */
 	return eeh_event_init();
 }
diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index 9c68f0837385..f93dd5cf6a39 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -267,6 +267,17 @@ void eeh_addr_cache_rmv_dev(struct pci_dev *dev)
 	spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
 }
 
+/**
+ * eeh_addr_cache_init - Initialize a cache of I/O addresses
+ *
+ * Initialize a cache of pci i/o addresses.  This cache will be used to
+ * find the pci device that corresponds to a given address.
+ */
+void eeh_addr_cache_init(void)
+{
+	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
+}
+
 /**
  * eeh_addr_cache_build - Build a cache of I/O addresses
  *
@@ -282,8 +293,6 @@ void eeh_addr_cache_build(void)
 	struct eeh_dev *edev;
 	struct pci_dev *dev = NULL;
 
-	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
-
 	for_each_pci_dev(dev) {
 		pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
 		if (!pdn)
-- 
2.19.0.2.gcad72f5712


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 7/8] powerpc/eeh: EEH for pSeries hot plug
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
                   ` (5 preceding siblings ...)
  2019-03-20  2:58 ` [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier Sam Bobroff
@ 2019-03-20  2:58 ` Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-03-20  2:58 ` [PATCH 8/8] powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build() Sam Bobroff
  2019-04-12  0:55 ` [PATCH 0/8] Tyrel Datwyler
  8 siblings, 1 reply; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

On PowerNV and pSeries, devices currently acquire EEH support from
several different places: Boot-time devices from eeh_probe_devices()
and eeh_addr_cache_build(), Virtual Function devices from the pcibios
bus add device hooks and hot plugged devices from pci_hp_add_devices()
(with other platforms using other methods as well).  Unfortunately,
pSeries machines currently discover hot plugged devices using
pci_rescan_bus(), not pci_hp_add_devices(), and so those devices do
not receive EEH support.

Rather than adding another case for pci_rescan_bus(), this change
widens the scope of the pcibios bus add device hooks so that they can
handle all devices. As a side effect this also supports devices
discovered after manually rescanning via /sys/bus/pci/rescan.

Note that on PowerNV, this change allows the EEH subsystem to become
enabled after boot as long as it has not been forced off, which was
not previously possible (it was already possible on pSeries).

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh.c                    |  2 +-
 arch/powerpc/kernel/of_platform.c            |  3 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c |  8 ++-
 arch/powerpc/platforms/pseries/eeh_pseries.c | 54 ++++++++++----------
 4 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 7a406d58d2c0..217e14bb1fb6 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1291,7 +1291,7 @@ void eeh_add_device_late(struct pci_dev *dev)
 	struct pci_dn *pdn;
 	struct eeh_dev *edev;
 
-	if (!dev || !eeh_enabled())
+	if (!dev)
 		return;
 
 	pr_debug("EEH: Adding device %s\n", pci_name(dev));
diff --git a/arch/powerpc/kernel/of_platform.c b/arch/powerpc/kernel/of_platform.c
index becaec990140..d5818e9c4069 100644
--- a/arch/powerpc/kernel/of_platform.c
+++ b/arch/powerpc/kernel/of_platform.c
@@ -86,7 +86,8 @@ static int of_pci_phb_probe(struct platform_device *dev)
 	pcibios_claim_one_bus(phb->bus);
 
 	/* Finish EEH setup */
-	eeh_add_device_tree_late(phb->bus);
+	if (!eeh_has_flag(EEH_FORCE_DISABLED))
+		eeh_add_device_tree_late(phb->bus);
 
 	/* Add probed PCI devices to the device model */
 	pci_bus_add_devices(phb->bus);
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 51c5b6bb9b0e..81b0923cc55f 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -47,7 +47,7 @@ void pnv_pcibios_bus_add_device(struct pci_dev *pdev)
 {
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 
-	if (!pdev->is_virtfn)
+	if (eeh_has_flag(EEH_FORCE_DISABLED))
 		return;
 
 	pr_debug("%s: EEH: Setting up device %s.\n", __func__, pci_name(pdev));
@@ -479,7 +479,11 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	 * Enable EEH explicitly so that we will do EEH check
 	 * while accessing I/O stuff
 	 */
-	eeh_add_flag(EEH_ENABLED);
+	if (!eeh_has_flag(EEH_ENABLED)) {
+		enable_irq(eeh_event_irq);
+		eeh_add_flag(EEH_PHB_ENABLED);
+		eeh_add_flag(EEH_ENABLED);
+	}
 
 	/* Save memory bars */
 	eeh_save_bars(edev);
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index ae06878fbdea..e68c79164974 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -55,44 +55,44 @@ static int ibm_get_config_addr_info;
 static int ibm_get_config_addr_info2;
 static int ibm_configure_pe;
 
-#ifdef CONFIG_PCI_IOV
 void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
 {
 	struct pci_dn *pdn = pci_get_pdn(pdev);
-	struct pci_dn *physfn_pdn;
-	struct eeh_dev *edev;
 
-	if (!pdev->is_virtfn)
+	if (eeh_has_flag(EEH_FORCE_DISABLED))
 		return;
 
 	pr_debug("%s: EEH: Setting up device %s.\n", __func__, pci_name(pdev));
+#ifdef CONFIG_PCI_IOV
+	if (pdev->is_virtfn) {
+		struct pci_dn *physfn_pdn;
 
-	pdn->device_id  =  pdev->device;
-	pdn->vendor_id  =  pdev->vendor;
-	pdn->class_code =  pdev->class;
-	/*
-	 * Last allow unfreeze return code used for retrieval
-	 * by user space in eeh-sysfs to show the last command
-	 * completion from platform.
-	 */
-	pdn->last_allow_rc =  0;
-	physfn_pdn      =  pci_get_pdn(pdev->physfn);
-	pdn->pe_number  =  physfn_pdn->pe_num_map[pdn->vf_index];
-	edev = pdn_to_eeh_dev(pdn);
-
-	/*
-	 * The following operations will fail if VF's sysfs files
-	 * aren't created or its resources aren't finalized.
-	 */
+		pdn->device_id  =  pdev->device;
+		pdn->vendor_id  =  pdev->vendor;
+		pdn->class_code =  pdev->class;
+		/*
+		 * Last allow unfreeze return code used for retrieval
+		 * by user space in eeh-sysfs to show the last command
+		 * completion from platform.
+		 */
+		pdn->last_allow_rc =  0;
+		physfn_pdn      =  pci_get_pdn(pdev->physfn);
+		pdn->pe_number  =  physfn_pdn->pe_num_map[pdn->vf_index];
+	}
+#endif
 	eeh_add_device_early(pdn);
 	eeh_add_device_late(pdev);
-	edev->pe_config_addr =  (pdn->busno << 16) | (pdn->devfn << 8);
-	eeh_rmv_from_parent_pe(edev); /* Remove as it is adding to bus pe */
-	eeh_add_to_parent_pe(edev);   /* Add as VF PE type */
-	eeh_sysfs_add_device(pdev);
+#ifdef CONFIG_PCI_IOV
+	if (pdev->is_virtfn) {
+		struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
 
-}
+		edev->pe_config_addr =  (pdn->busno << 16) | (pdn->devfn << 8);
+		eeh_rmv_from_parent_pe(edev); /* Remove as it is adding to bus pe */
+		eeh_add_to_parent_pe(edev);   /* Add as VF PE type */
+	}
 #endif
+	eeh_sysfs_add_device(pdev);
+}
 
 /*
  * Buffer for reporting slot-error-detail rtas calls. Its here
@@ -159,10 +159,8 @@ static int pseries_eeh_init(void)
 	/* Set EEH probe mode */
 	eeh_add_flag(EEH_PROBE_MODE_DEVTREE | EEH_ENABLE_IO_FOR_LOG);
 
-#ifdef CONFIG_PCI_IOV
 	/* Set EEH machine dependent code */
 	ppc_md.pcibios_bus_add_device = pseries_pcibios_bus_add_device;
-#endif
 
 	return 0;
 }
-- 
2.19.0.2.gcad72f5712


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 8/8] powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build()
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
                   ` (6 preceding siblings ...)
  2019-03-20  2:58 ` [PATCH 7/8] powerpc/eeh: EEH for pSeries hot plug Sam Bobroff
@ 2019-03-20  2:58 ` Sam Bobroff
  2019-03-20  6:05   ` Alexey Kardashevskiy
  2019-04-12  0:55 ` [PATCH 0/8] Tyrel Datwyler
  8 siblings, 1 reply; 31+ messages in thread
From: Sam Bobroff @ 2019-03-20  2:58 UTC (permalink / raw)
  To: linuxppc-dev

Now that EEH support for all devices (on PowerNV and pSeries) is
provided by the pcibios bus add device hooks, eeh_probe_devices() and
eeh_addr_cache_build() are redundant and can be removed.

Note that previously on pSeries, useless EEH sysfs files were created
for some devices that did not have EEH support and this change
prevents them from being created.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |  6 ----
 arch/powerpc/kernel/eeh.c                    | 13 --------
 arch/powerpc/kernel/eeh_cache.c              | 32 --------------------
 arch/powerpc/platforms/powernv/eeh-powernv.c |  5 ++-
 arch/powerpc/platforms/pseries/pci.c         |  3 +-
 5 files changed, 3 insertions(+), 56 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 791b9e6fcc45..f1eca1757cbc 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -290,13 +290,11 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
 struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
 void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
 void eeh_show_enabled(void);
-void eeh_probe_devices(void);
 int __init eeh_ops_register(struct eeh_ops *ops);
 int __exit eeh_ops_unregister(const char *name);
 int eeh_check_failure(const volatile void __iomem *token);
 int eeh_dev_check_failure(struct eeh_dev *edev);
 void eeh_addr_cache_init(void);
-void eeh_addr_cache_build(void);
 void eeh_add_device_early(struct pci_dn *);
 void eeh_add_device_tree_early(struct pci_dn *);
 void eeh_add_device_late(struct pci_dev *);
@@ -347,8 +345,6 @@ static inline bool eeh_phb_enabled(void)
 	return false;
 }
 
-static inline void eeh_probe_devices(void) { }
-
 static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
 {
 	return NULL;
@@ -365,8 +361,6 @@ static inline int eeh_check_failure(const volatile void __iomem *token)
 
 static inline void eeh_addr_cache_init(void) { }
 
-static inline void eeh_addr_cache_build(void) { }
-
 static inline void eeh_add_device_early(struct pci_dn *pdn) { }
 
 static inline void eeh_add_device_tree_early(struct pci_dn *pdn) { }
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 217e14bb1fb6..cd2abbe41497 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1166,19 +1166,6 @@ static struct notifier_block eeh_reboot_nb = {
 	.notifier_call = eeh_reboot_notifier,
 };
 
-void eeh_probe_devices(void)
-{
-	struct pci_controller *hose, *tmp;
-	struct pci_dn *pdn;
-
-	/* Enable EEH for all adapters */
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		pdn = hose->pci_data;
-		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
-	}
-	eeh_show_enabled();
-}
-
 /**
  * eeh_init - EEH initialization
  *
diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index f93dd5cf6a39..c40078d036af 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -278,38 +278,6 @@ void eeh_addr_cache_init(void)
 	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
 }
 
-/**
- * eeh_addr_cache_build - Build a cache of I/O addresses
- *
- * Build a cache of pci i/o addresses.  This cache will be used to
- * find the pci device that corresponds to a given address.
- * This routine scans all pci busses to build the cache.
- * Must be run late in boot process, after the pci controllers
- * have been scanned for devices (after all device resources are known).
- */
-void eeh_addr_cache_build(void)
-{
-	struct pci_dn *pdn;
-	struct eeh_dev *edev;
-	struct pci_dev *dev = NULL;
-
-	for_each_pci_dev(dev) {
-		pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
-		if (!pdn)
-			continue;
-
-		edev = pdn_to_eeh_dev(pdn);
-		if (!edev)
-			continue;
-
-		dev->dev.archdata.edev = edev;
-		edev->pdev = dev;
-
-		eeh_addr_cache_insert_dev(dev);
-		eeh_sysfs_add_device(dev);
-	}
-}
-
 static int eeh_addr_cache_show(struct seq_file *s, void *v)
 {
 	struct pci_io_addr_range *piar;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 81b0923cc55f..6a08f4fab255 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -240,9 +240,7 @@ int pnv_eeh_post_init(void)
 	struct pnv_phb *phb;
 	int ret = 0;
 
-	/* Probe devices & build address cache */
-	eeh_probe_devices();
-	eeh_addr_cache_build();
+	eeh_show_enabled();
 
 	/* Register OPAL event notifier */
 	eeh_event_irq = opal_event_request(ilog2(OPAL_EVENT_PCI_ERROR));
@@ -360,6 +358,7 @@ static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
 	return 0;
 }
 
+
 /**
  * pnv_eeh_probe - Do probe on PCI device
  * @pdn: PCI device node
diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
index 7be80882c08d..7c781b04fb15 100644
--- a/arch/powerpc/platforms/pseries/pci.c
+++ b/arch/powerpc/platforms/pseries/pci.c
@@ -242,8 +242,7 @@ void __init pSeries_final_fixup(void)
 
 	pSeries_request_regions();
 
-	eeh_probe_devices();
-	eeh_addr_cache_build();
+	eeh_show_enabled();
 #ifdef CONFIG_EEH
 	if (eeh_enabled())
 		eeh_add_flag(EEH_PHB_ENABLED);
-- 
2.19.0.2.gcad72f5712


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/8] powerpc/eeh: EEH for pSeries hot plug
  2019-03-20  2:58 ` [PATCH 7/8] powerpc/eeh: EEH for pSeries hot plug Sam Bobroff
@ 2019-03-20  6:02   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Alexey Kardashevskiy @ 2019-03-20  6:02 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev



On 20/03/2019 13:58, Sam Bobroff wrote:
> On PowerNV and pSeries, devices currently acquire EEH support from
> several different places: Boot-time devices from eeh_probe_devices()
> and eeh_addr_cache_build(), Virtual Function devices from the pcibios
> bus add device hooks and hot plugged devices from pci_hp_add_devices()
> (with other platforms using other methods as well).  Unfortunately,
> pSeries machines currently discover hot plugged devices using
> pci_rescan_bus(), not pci_hp_add_devices(), and so those devices do
> not receive EEH support.
> 
> Rather than adding another case for pci_rescan_bus(), this change
> widens the scope of the pcibios bus add device hooks so that they can
> handle all devices. As a side effect this also supports devices
> discovered after manually rescanning via /sys/bus/pci/rescan.
> 
> Note that on PowerNV, this change allows the EEH subsystem to become
> enabled after boot as long as it has not been forced off, which was
> not previously possible (it was already possible on pSeries).
> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/kernel/eeh.c                    |  2 +-
>  arch/powerpc/kernel/of_platform.c            |  3 +-
>  arch/powerpc/platforms/powernv/eeh-powernv.c |  8 ++-
>  arch/powerpc/platforms/pseries/eeh_pseries.c | 54 ++++++++++----------
>  4 files changed, 35 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 7a406d58d2c0..217e14bb1fb6 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1291,7 +1291,7 @@ void eeh_add_device_late(struct pci_dev *dev)
>  	struct pci_dn *pdn;
>  	struct eeh_dev *edev;
>  
> -	if (!dev || !eeh_enabled())
> +	if (!dev)
>  		return;
>  
>  	pr_debug("EEH: Adding device %s\n", pci_name(dev));
> diff --git a/arch/powerpc/kernel/of_platform.c b/arch/powerpc/kernel/of_platform.c
> index becaec990140..d5818e9c4069 100644
> --- a/arch/powerpc/kernel/of_platform.c
> +++ b/arch/powerpc/kernel/of_platform.c
> @@ -86,7 +86,8 @@ static int of_pci_phb_probe(struct platform_device *dev)
>  	pcibios_claim_one_bus(phb->bus);
>  
>  	/* Finish EEH setup */
> -	eeh_add_device_tree_late(phb->bus);
> +	if (!eeh_has_flag(EEH_FORCE_DISABLED))
> +		eeh_add_device_tree_late(phb->bus);
>  
>  	/* Add probed PCI devices to the device model */
>  	pci_bus_add_devices(phb->bus);
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 51c5b6bb9b0e..81b0923cc55f 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -47,7 +47,7 @@ void pnv_pcibios_bus_add_device(struct pci_dev *pdev)
>  {
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
>  
> -	if (!pdev->is_virtfn)
> +	if (eeh_has_flag(EEH_FORCE_DISABLED))
>  		return;
>  
>  	pr_debug("%s: EEH: Setting up device %s.\n", __func__, pci_name(pdev));
> @@ -479,7 +479,11 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
>  	 * Enable EEH explicitly so that we will do EEH check
>  	 * while accessing I/O stuff
>  	 */
> -	eeh_add_flag(EEH_ENABLED);
> +	if (!eeh_has_flag(EEH_ENABLED)) {
> +		enable_irq(eeh_event_irq);
> +		eeh_add_flag(EEH_PHB_ENABLED);


Except that I do not think we need EEH_PHB_ENABLED (commented elsewhere),

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>




> +		eeh_add_flag(EEH_ENABLED);
> +	}
>  
>  	/* Save memory bars */
>  	eeh_save_bars(edev);
> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
> index ae06878fbdea..e68c79164974 100644
> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
> @@ -55,44 +55,44 @@ static int ibm_get_config_addr_info;
>  static int ibm_get_config_addr_info2;
>  static int ibm_configure_pe;
>  
> -#ifdef CONFIG_PCI_IOV
>  void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
>  {
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
> -	struct pci_dn *physfn_pdn;
> -	struct eeh_dev *edev;
>  
> -	if (!pdev->is_virtfn)
> +	if (eeh_has_flag(EEH_FORCE_DISABLED))
>  		return;
>  
>  	pr_debug("%s: EEH: Setting up device %s.\n", __func__, pci_name(pdev));
> +#ifdef CONFIG_PCI_IOV
> +	if (pdev->is_virtfn) {
> +		struct pci_dn *physfn_pdn;
>  
> -	pdn->device_id  =  pdev->device;
> -	pdn->vendor_id  =  pdev->vendor;
> -	pdn->class_code =  pdev->class;
> -	/*
> -	 * Last allow unfreeze return code used for retrieval
> -	 * by user space in eeh-sysfs to show the last command
> -	 * completion from platform.
> -	 */
> -	pdn->last_allow_rc =  0;
> -	physfn_pdn      =  pci_get_pdn(pdev->physfn);
> -	pdn->pe_number  =  physfn_pdn->pe_num_map[pdn->vf_index];
> -	edev = pdn_to_eeh_dev(pdn);
> -
> -	/*
> -	 * The following operations will fail if VF's sysfs files
> -	 * aren't created or its resources aren't finalized.
> -	 */
> +		pdn->device_id  =  pdev->device;
> +		pdn->vendor_id  =  pdev->vendor;
> +		pdn->class_code =  pdev->class;
> +		/*
> +		 * Last allow unfreeze return code used for retrieval
> +		 * by user space in eeh-sysfs to show the last command
> +		 * completion from platform.
> +		 */
> +		pdn->last_allow_rc =  0;
> +		physfn_pdn      =  pci_get_pdn(pdev->physfn);
> +		pdn->pe_number  =  physfn_pdn->pe_num_map[pdn->vf_index];
> +	}
> +#endif
>  	eeh_add_device_early(pdn);
>  	eeh_add_device_late(pdev);
> -	edev->pe_config_addr =  (pdn->busno << 16) | (pdn->devfn << 8);
> -	eeh_rmv_from_parent_pe(edev); /* Remove as it is adding to bus pe */
> -	eeh_add_to_parent_pe(edev);   /* Add as VF PE type */
> -	eeh_sysfs_add_device(pdev);
> +#ifdef CONFIG_PCI_IOV
> +	if (pdev->is_virtfn) {
> +		struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>  
> -}
> +		edev->pe_config_addr =  (pdn->busno << 16) | (pdn->devfn << 8);
> +		eeh_rmv_from_parent_pe(edev); /* Remove as it is adding to bus pe */
> +		eeh_add_to_parent_pe(edev);   /* Add as VF PE type */
> +	}
>  #endif
> +	eeh_sysfs_add_device(pdev);
> +}
>  
>  /*
>   * Buffer for reporting slot-error-detail rtas calls. Its here
> @@ -159,10 +159,8 @@ static int pseries_eeh_init(void)
>  	/* Set EEH probe mode */
>  	eeh_add_flag(EEH_PROBE_MODE_DEVTREE | EEH_ENABLE_IO_FOR_LOG);
>  
> -#ifdef CONFIG_PCI_IOV
>  	/* Set EEH machine dependent code */
>  	ppc_md.pcibios_bus_add_device = pseries_pcibios_bus_add_device;
> -#endif
>  
>  	return 0;
>  }
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier
  2019-03-20  2:58 ` [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier Sam Bobroff
@ 2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-04-18 10:13   ` Oliver O'Halloran
  1 sibling, 0 replies; 31+ messages in thread
From: Alexey Kardashevskiy @ 2019-03-20  6:02 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev



On 20/03/2019 13:58, Sam Bobroff wrote:
> The EEH address cache is currently initialized and populated by a
> single function: eeh_addr_cache_build().  While the initial population
> of the cache can only be done once resources are allocated,
> initialization (just setting up a spinlock) could be done much
> earlier.
> 
> So move the initialization step into a separate function and call it
> from a core_initcall (rather than a subsys initcall).
> 
> This will allow future work to make use of the cache during boot time
> PCI scanning.
> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

> ---
>  arch/powerpc/include/asm/eeh.h  |  3 +++
>  arch/powerpc/kernel/eeh.c       |  2 ++
>  arch/powerpc/kernel/eeh_cache.c | 13 +++++++++++--
>  3 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index e217ccda55d0..791b9e6fcc45 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -295,6 +295,7 @@ int __init eeh_ops_register(struct eeh_ops *ops);
>  int __exit eeh_ops_unregister(const char *name);
>  int eeh_check_failure(const volatile void __iomem *token);
>  int eeh_dev_check_failure(struct eeh_dev *edev);
> +void eeh_addr_cache_init(void);
>  void eeh_addr_cache_build(void);
>  void eeh_add_device_early(struct pci_dn *);
>  void eeh_add_device_tree_early(struct pci_dn *);
> @@ -362,6 +363,8 @@ static inline int eeh_check_failure(const volatile void __iomem *token)
>  
>  #define eeh_dev_check_failure(x) (0)
>  
> +static inline void eeh_addr_cache_init(void) { }
> +
>  static inline void eeh_addr_cache_build(void) { }
>  
>  static inline void eeh_add_device_early(struct pci_dn *pdn) { }
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 3dcff29cb9b3..7a406d58d2c0 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1219,6 +1219,8 @@ static int eeh_init(void)
>  	list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
>  		eeh_dev_phb_init_dynamic(hose);
>  
> +	eeh_addr_cache_init();
> +
>  	/* Initialize EEH event */
>  	return eeh_event_init();
>  }
> diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
> index 9c68f0837385..f93dd5cf6a39 100644
> --- a/arch/powerpc/kernel/eeh_cache.c
> +++ b/arch/powerpc/kernel/eeh_cache.c
> @@ -267,6 +267,17 @@ void eeh_addr_cache_rmv_dev(struct pci_dev *dev)
>  	spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
>  }
>  
> +/**
> + * eeh_addr_cache_init - Initialize a cache of I/O addresses
> + *
> + * Initialize a cache of pci i/o addresses.  This cache will be used to
> + * find the pci device that corresponds to a given address.
> + */
> +void eeh_addr_cache_init(void)
> +{
> +	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
> +}
> +
>  /**
>   * eeh_addr_cache_build - Build a cache of I/O addresses
>   *
> @@ -282,8 +293,6 @@ void eeh_addr_cache_build(void)
>  	struct eeh_dev *edev;
>  	struct pci_dev *dev = NULL;
>  
> -	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
> -
>  	for_each_pci_dev(dev) {
>  		pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
>  		if (!pdn)
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled()
  2019-03-20  2:58 ` [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled() Sam Bobroff
@ 2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-03-20  6:24     ` Oliver
  2019-04-09  3:30     ` Sam Bobroff
  2019-04-18 10:01   ` Oliver O'Halloran
  1 sibling, 2 replies; 31+ messages in thread
From: Alexey Kardashevskiy @ 2019-03-20  6:02 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev



On 20/03/2019 13:58, Sam Bobroff wrote:
> Move the EEH enabled message into it's own function so that future
> work can call it from multiple places.
> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/eeh.h |  3 +++
>  arch/powerpc/kernel/eeh.c      | 16 +++++++++++-----
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index fe4cf7208890..e217ccda55d0 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -289,6 +289,7 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
>  
>  struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
>  void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
> +void eeh_show_enabled(void);
>  void eeh_probe_devices(void);
>  int __init eeh_ops_register(struct eeh_ops *ops);
>  int __exit eeh_ops_unregister(const char *name);
> @@ -338,6 +339,8 @@ static inline bool eeh_enabled(void)
>          return false;
>  }
>  
> +static inline void eeh_show_enabled(void) { }
> +
>  static inline bool eeh_phb_enabled(void)
>  {
>  	return false;
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index b14d89547895..3dcff29cb9b3 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -163,6 +163,16 @@ static int __init eeh_setup(char *str)
>  }
>  __setup("eeh=", eeh_setup);
>  
> +void eeh_show_enabled(void)
> +{
> +	if (eeh_has_flag(EEH_FORCE_DISABLED))
> +		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (by eeh=off)\n");
> +	else if (eeh_enabled())


I'd make it eeh_has_flag(EEH_ENABLED) for clarity.


> +		pr_info("EEH: PCI Enhanced I/O Error Handling ENABLED (capable adapter found)\n");
> +	else
> +		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (no capable adapter found)\n");
> +}
> +
>  /*
>   * This routine captures assorted PCI configuration space data
>   * for the indicated PCI device, and puts them into a buffer
> @@ -1166,11 +1176,7 @@ void eeh_probe_devices(void)
>  		pdn = hose->pci_data;
>  		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
>  	}
> -	if (eeh_enabled())
> -		pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
> -	else
> -		pr_info("EEH: No capable adapters found\n");
> -
> +	eeh_show_enabled();


This line moves later in the series so I'd just merge this patch into
8/8 to reduce number of lines moving withing the patchset.

In general the whole point of the EEH_ENABLED flag is fading away. Its
meaning now is that "at least somewhere in the box for at least one
device with enabled EEH" which does not seem extremely useful as we have
a pci_dev or pe pretty much everywhere we look at eeh_enabled() and
pdev->dev.archdata.edev can tell if eeh is enabled for a device.
Although I am pretty sure this is in your list already :)


>  }
>  
>  /**
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] powerpc/eeh: Improve debug messages around device addition
  2019-03-20  2:58 ` [PATCH 4/8] powerpc/eeh: Improve debug messages around device addition Sam Bobroff
@ 2019-03-20  6:02   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Alexey Kardashevskiy @ 2019-03-20  6:02 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev



On 20/03/2019 13:58, Sam Bobroff wrote:
> Also remove useless comment.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/kernel/eeh.c                    |  2 +-
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 14 ++++++++----
>  arch/powerpc/platforms/pseries/eeh_pseries.c | 23 +++++++++++++++-----
>  3 files changed, 28 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 8d3c36a1f194..b14d89547895 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1291,7 +1291,7 @@ void eeh_add_device_late(struct pci_dev *dev)
>  	pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
>  	edev = pdn_to_eeh_dev(pdn);
>  	if (edev->pdev == dev) {
> -		pr_debug("EEH: Already referenced !\n");
> +		pr_debug("EEH: Device %s already referenced!\n", pci_name(dev));
>  		return;
>  	}
>  
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index f0a95f663810..51c5b6bb9b0e 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -50,10 +50,7 @@ void pnv_pcibios_bus_add_device(struct pci_dev *pdev)
>  	if (!pdev->is_virtfn)
>  		return;
>  
> -	/*
> -	 * The following operations will fail if VF's sysfs files
> -	 * aren't created or its resources aren't finalized.
> -	 */
> +	pr_debug("%s: EEH: Setting up device %s.\n", __func__, pci_name(pdev));
>  	eeh_add_device_early(pdn);
>  	eeh_add_device_late(pdev);
>  	eeh_sysfs_add_device(pdev);
> @@ -389,6 +386,10 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
>  	int ret;
>  	int config_addr = (pdn->busno << 8) | (pdn->devfn);
>  
> +	pr_debug("%s: probing %04x:%02x:%02x.%01x\n",
> +		__func__, hose->global_number, pdn->busno,
> +		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
> +
>  	/*
>  	 * When probing the root bridge, which doesn't have any
>  	 * subordinate PCI devices. We don't have OF node for
> @@ -483,6 +484,11 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
>  	/* Save memory bars */
>  	eeh_save_bars(edev);
>  
> +	pr_debug("%s: EEH enabled on %02x:%02x.%01x PHB#%x-PE#%x\n",
> +		__func__, pdn->busno, PCI_SLOT(pdn->devfn),
> +		PCI_FUNC(pdn->devfn), edev->pe->phb->global_number,
> +		edev->pe->addr);
> +
>  	return NULL;
>  }
>  
> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
> index 7aa50258dd42..ae06878fbdea 100644
> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
> @@ -65,6 +65,8 @@ void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
>  	if (!pdev->is_virtfn)
>  		return;
>  
> +	pr_debug("%s: EEH: Setting up device %s.\n", __func__, pci_name(pdev));
> +
>  	pdn->device_id  =  pdev->device;
>  	pdn->vendor_id  =  pdev->vendor;
>  	pdn->class_code =  pdev->class;
> @@ -251,6 +253,10 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
>  	int enable = 0;
>  	int ret;
>  
> +	pr_debug("%s: probing %04x:%02x:%02x.%01x\n",
> +		__func__, pdn->phb->global_number, pdn->busno,
> +		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
> +
>  	/* Retrieve OF node and eeh device */
>  	edev = pdn_to_eeh_dev(pdn);
>  	if (!edev || edev->pe)
> @@ -294,7 +300,12 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
>  
>  	/* Enable EEH on the device */
>  	ret = eeh_ops->set_option(&pe, EEH_OPT_ENABLE);
> -	if (!ret) {
> +	if (ret) {
> +		pr_debug("%s: EEH failed to enable on %02x:%02x.%01x PHB#%x-PE#%x (code %d)\n",
> +			__func__, pdn->busno, PCI_SLOT(pdn->devfn),
> +			PCI_FUNC(pdn->devfn), pe.phb->global_number,
> +			pe.addr, ret);
> +	} else {
>  		/* Retrieve PE address */
>  		edev->pe_config_addr = eeh_ops->get_pe_addr(&pe);
>  		pe.addr = edev->pe_config_addr;
> @@ -310,11 +321,6 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
>  		if (enable) {
>  			eeh_add_flag(EEH_ENABLED);
>  			eeh_add_to_parent_pe(edev);
> -
> -			pr_debug("%s: EEH enabled on %02x:%02x.%01x PHB#%x-PE#%x\n",
> -				__func__, pdn->busno, PCI_SLOT(pdn->devfn),
> -				PCI_FUNC(pdn->devfn), pe.phb->global_number,
> -				pe.addr);
>  		} else if (pdn->parent && pdn_to_eeh_dev(pdn->parent) &&
>  			   (pdn_to_eeh_dev(pdn->parent))->pe) {
>  			/* This device doesn't support EEH, but it may have an
> @@ -323,6 +329,11 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
>  			edev->pe_config_addr = pdn_to_eeh_dev(pdn->parent)->pe_config_addr;
>  			eeh_add_to_parent_pe(edev);
>  		}
> +		pr_debug("%s: EEH %s on %02x:%02x.%01x PHB#%x-PE#%x (code %d)\n",
> +			__func__, (enable ? "enabled" : "unsupported"),
> +			pdn->busno, PCI_SLOT(pdn->devfn),
> +			PCI_FUNC(pdn->devfn), pe.phb->global_number,
> +			pe.addr, ret);
>  	}
>  
>  	/* Save memory bars */
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
  2019-03-20  2:58 ` [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag Sam Bobroff
@ 2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-04-09  1:41     ` Sam Bobroff
  2019-04-18  9:51   ` Oliver O'Halloran
  1 sibling, 1 reply; 31+ messages in thread
From: Alexey Kardashevskiy @ 2019-03-20  6:02 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev



On 20/03/2019 13:58, Sam Bobroff wrote:
> The PHB flag, PNV_PHB_FLAG_EEH, is set (on PowerNV) individually on
> each PHB once the EEH subsystem is ready. It is the only use of the
> flags member of the phb struct.


Then why to keep pnv_phb::flags?

> However there is no need to store this separately on each PHB, so
> convert it to a global flag. For symmetry, the flag is now also set
> for pSeries; although it is currently unused it may be useful in the
> future.

Just using eeh_enabled() instead of (phb->flags & PNV_PHB_FLAG_EEH)
seems easier and cleaner; also pseries does not use it so there is no
point defining it there either.


> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/eeh.h               | 11 +++++++++++
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 14 +++-----------
>  arch/powerpc/platforms/powernv/pci.c         |  7 +++----
>  arch/powerpc/platforms/powernv/pci.h         |  2 --
>  arch/powerpc/platforms/pseries/pci.c         |  4 ++++
>  5 files changed, 21 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index 3613a56281f2..fe4cf7208890 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -43,6 +43,7 @@ struct pci_dn;
>  #define EEH_VALID_PE_ZERO	0x10	/* PE#0 is valid		     */
>  #define EEH_ENABLE_IO_FOR_LOG	0x20	/* Enable IO for log		     */
>  #define EEH_EARLY_DUMP_LOG	0x40	/* Dump log immediately		     */
> +#define EEH_PHB_ENABLED		0x80	/* PHB recovery uses EEH	     */
>  
>  /*
>   * Delay for PE reset, all in ms
> @@ -245,6 +246,11 @@ static inline bool eeh_enabled(void)
>  	return eeh_has_flag(EEH_ENABLED) && !eeh_has_flag(EEH_FORCE_DISABLED);
>  }
>  
> +static inline bool eeh_phb_enabled(void)
> +{
> +	return eeh_has_flag(EEH_PHB_ENABLED);
> +}
> +
>  static inline void eeh_serialize_lock(unsigned long *flags)
>  {
>  	raw_spin_lock_irqsave(&confirm_error_lock, *flags);
> @@ -332,6 +338,11 @@ static inline bool eeh_enabled(void)
>          return false;
>  }
>  
> +static inline bool eeh_phb_enabled(void)
> +{
> +	return false;
> +}
> +
>  static inline void eeh_probe_devices(void) { }
>  
>  static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 6fc1a463b796..f0a95f663810 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -264,22 +264,14 @@ int pnv_eeh_post_init(void)
>  		return ret;
>  	}
>  
> -	if (!eeh_enabled())
> +	if (eeh_enabled())
> +		eeh_add_flag(EEH_PHB_ENABLED);
> +	else
>  		disable_irq(eeh_event_irq);
>  
>  	list_for_each_entry(hose, &hose_list, list_node) {
>  		phb = hose->private_data;
>  
> -		/*
> -		 * If EEH is enabled, we're going to rely on that.
> -		 * Otherwise, we restore to conventional mechanism
> -		 * to clear frozen PE during PCI config access.
> -		 */
> -		if (eeh_enabled())
> -			phb->flags |= PNV_PHB_FLAG_EEH;
> -		else
> -			phb->flags &= ~PNV_PHB_FLAG_EEH;
> -
>  		/* Create debugfs entries */
>  #ifdef CONFIG_DEBUG_FS
>  		if (phb->has_dbgfs || !phb->dbgfs)
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 307181fd8a17..d2b50f3bf6b1 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -717,10 +717,9 @@ int pnv_pci_cfg_write(struct pci_dn *pdn,
>  static bool pnv_pci_cfg_check(struct pci_dn *pdn)
>  {
>  	struct eeh_dev *edev = NULL;
> -	struct pnv_phb *phb = pdn->phb->private_data;
>  
>  	/* EEH not enabled ? */
> -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> +	if (!eeh_phb_enabled())
>  		return true;
>  
>  	/* PE reset or device removed ? */
> @@ -761,7 +760,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  
>  	ret = pnv_pci_cfg_read(pdn, where, size, val);
>  	phb = pdn->phb->private_data;
> -	if (phb->flags & PNV_PHB_FLAG_EEH && pdn->edev) {
> +	if (eeh_phb_enabled() && pdn->edev) {
>  		if (*val == EEH_IO_ERROR_VALUE(size) &&
>  		    eeh_dev_check_failure(pdn->edev))
>                          return PCIBIOS_DEVICE_NOT_FOUND;
> @@ -789,7 +788,7 @@ static int pnv_pci_write_config(struct pci_bus *bus,
>  
>  	ret = pnv_pci_cfg_write(pdn, where, size, val);
>  	phb = pdn->phb->private_data;
> -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> +	if (!eeh_phb_enabled())
>  		pnv_pci_config_check_eeh(pdn);
>  
>  	return ret;
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 8e36da379252..eb0add61397b 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -85,8 +85,6 @@ struct pnv_ioda_pe {
>  	struct list_head	list;
>  };
>  
> -#define PNV_PHB_FLAG_EEH	(1 << 0)
> -
>  struct pnv_phb {
>  	struct pci_controller	*hose;
>  	enum pnv_phb_type	type;
> diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
> index 37a77e57893e..7be80882c08d 100644
> --- a/arch/powerpc/platforms/pseries/pci.c
> +++ b/arch/powerpc/platforms/pseries/pci.c
> @@ -244,6 +244,10 @@ void __init pSeries_final_fixup(void)
>  
>  	eeh_probe_devices();
>  	eeh_addr_cache_build();
> +#ifdef CONFIG_EEH
> +	if (eeh_enabled())
> +		eeh_add_flag(EEH_PHB_ENABLED);
> +#endif
>  
>  #ifdef CONFIG_PCI_IOV
>  	ppc_md.pcibios_sriov_enable = pseries_pcibios_sriov_enable;
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/8] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
  2019-03-20  2:58 ` [PATCH 2/8] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag Sam Bobroff
@ 2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-04-08  6:50     ` Sam Bobroff
  0 siblings, 1 reply; 31+ messages in thread
From: Alexey Kardashevskiy @ 2019-03-20  6:02 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev



On 20/03/2019 13:58, Sam Bobroff wrote:
> The EEH_DEV_NO_HANDLER flag is used by the EEH system to prevent the
> use of driver callbacks in drivers that have been bound part way
> through the recovery process. This is necessary to prevent later stage
> handlers from being called when the earlier stage handlers haven't,
> which can be confusing for drivers.

The flag is used from eeh_pe_report()->eeh_pe_report_edev which is
called many times from eeh_handle_normal_event() (and you clear the flag
here unconditionally) and once from eeh_handle_special_event() - so this
is actually the only case now when the flag matters. Is my understanding
correct? Also is not clearing the flag correct in that case? I do not
quite understand eeh_handle_normal_event vs. eeh_handle_special_event
business though.


> 
> However, the flag is set for all devices that are added after boot
> time and only cleared at the end of the EEH recovery process. This
> results in hot plugged devices erroneously having the flag set during
> the first recovery after they are added (causing their driver's
> handlers to be incorrectly ignored).
> 
> To remedy this, clear the flag at the beginning of recovery
> processing. The flag is still cleared at the end of recovery
> processing, although it is no longer really necessary.

Then may be remove that redundant clearing?

> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/kernel/eeh_driver.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 6f3ee30565dd..4c34b9901f15 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -819,6 +819,10 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
>  		result = PCI_ERS_RESULT_DISCONNECT;
>  	}
>  
> +	eeh_for_each_pe(pe, tmp_pe)
> +		eeh_pe_for_each_dev(tmp_pe, edev, tmp)
> +			edev->mode &= ~EEH_DEV_NO_HANDLER;
> +
>  	/* Walk the various device drivers attached to this slot through
>  	 * a reset sequence, giving each an opportunity to do what it needs
>  	 * to accomplish the reset.  Each child gets a report of the
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/8] powerpc/64: Adjust order in pcibios_init()
  2019-03-20  2:58 ` [PATCH 1/8] powerpc/64: Adjust order in pcibios_init() Sam Bobroff
@ 2019-03-20  6:03   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Alexey Kardashevskiy @ 2019-03-20  6:03 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev



On 20/03/2019 13:58, Sam Bobroff wrote:
> The pcibios_init() function for 64 bit PowerPC currently calls
> pci_bus_add_devices() before pcibios_resource_survey(), which seems
> incorrect because it adds devices and attempts to bind their drivers
> before allocating their resources (although no problems seem to be
> apparent).
> 
> So move the call to pci_bus_add_devices() to after
> pcibios_resource_survey(), while extracting call to the
> pcibios_fixup() hook so that it remains in the same location.
> 
> This will also allow the ppc_md.pcibios_bus_add_device() hooks to
> perform actions that depend on PCI resources, both during rescanning
> (where this is already the case) and at boot time, to support future
> work.
> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



> ---
>  arch/powerpc/kernel/pci-common.c |  4 ----
>  arch/powerpc/kernel/pci_32.c     |  4 ++++
>  arch/powerpc/kernel/pci_64.c     | 12 +++++++++---
>  3 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> index ff4b7539cbdf..3146eb73e3b3 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -1383,10 +1383,6 @@ void __init pcibios_resource_survey(void)
>  		pr_debug("PCI: Assigning unassigned resources...\n");
>  		pci_assign_unassigned_resources();
>  	}
> -
> -	/* Call machine dependent fixup */
> -	if (ppc_md.pcibios_fixup)
> -		ppc_md.pcibios_fixup();
>  }
>  
>  /* This is used by the PCI hotplug driver to allocate resource
> diff --git a/arch/powerpc/kernel/pci_32.c b/arch/powerpc/kernel/pci_32.c
> index d3f04f2d8249..40aaa1a6e193 100644
> --- a/arch/powerpc/kernel/pci_32.c
> +++ b/arch/powerpc/kernel/pci_32.c
> @@ -259,6 +259,10 @@ static int __init pcibios_init(void)
>  	/* Call common code to handle resource allocation */
>  	pcibios_resource_survey();
>  
> +	/* Call machine dependent fixup */
> +	if (ppc_md.pcibios_fixup)
> +		ppc_md.pcibios_fixup();
> +
>  	/* Call machine dependent post-init code */
>  	if (ppc_md.pcibios_after_init)
>  		ppc_md.pcibios_after_init();
> diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
> index 9d8c10d55407..6f16f30031d7 100644
> --- a/arch/powerpc/kernel/pci_64.c
> +++ b/arch/powerpc/kernel/pci_64.c
> @@ -58,14 +58,20 @@ static int __init pcibios_init(void)
>  	pci_add_flags(PCI_ENABLE_PROC_DOMAINS | PCI_COMPAT_DOMAIN_0);
>  
>  	/* Scan all of the recorded PCI controllers.  */
> -	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
> +	list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
>  		pcibios_scan_phb(hose);
> -		pci_bus_add_devices(hose->bus);
> -	}
>  
>  	/* Call common code to handle resource allocation */
>  	pcibios_resource_survey();
>  
> +	/* Add devices. */
> +	list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
> +		pci_bus_add_devices(hose->bus);
> +
> +	/* Call machine dependent fixup */
> +	if (ppc_md.pcibios_fixup)
> +		ppc_md.pcibios_fixup();
> +
>  	printk(KERN_DEBUG "PCI: Probing PCI hardware done\n");
>  
>  	return 0;
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build()
  2019-03-20  2:58 ` [PATCH 8/8] powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build() Sam Bobroff
@ 2019-03-20  6:05   ` Alexey Kardashevskiy
  2019-04-09  3:31     ` Sam Bobroff
  0 siblings, 1 reply; 31+ messages in thread
From: Alexey Kardashevskiy @ 2019-03-20  6:05 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev



On 20/03/2019 13:58, Sam Bobroff wrote:
> Now that EEH support for all devices (on PowerNV and pSeries) is
> provided by the pcibios bus add device hooks, eeh_probe_devices() and
> eeh_addr_cache_build() are redundant and can be removed.
> 
> Note that previously on pSeries, useless EEH sysfs files were created
> for some devices that did not have EEH support and this change
> prevents them from being created.
> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/eeh.h               |  6 ----
>  arch/powerpc/kernel/eeh.c                    | 13 --------
>  arch/powerpc/kernel/eeh_cache.c              | 32 --------------------
>  arch/powerpc/platforms/powernv/eeh-powernv.c |  5 ++-
>  arch/powerpc/platforms/pseries/pci.c         |  3 +-
>  5 files changed, 3 insertions(+), 56 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index 791b9e6fcc45..f1eca1757cbc 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -290,13 +290,11 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
>  struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
>  void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
>  void eeh_show_enabled(void);
> -void eeh_probe_devices(void);
>  int __init eeh_ops_register(struct eeh_ops *ops);
>  int __exit eeh_ops_unregister(const char *name);
>  int eeh_check_failure(const volatile void __iomem *token);
>  int eeh_dev_check_failure(struct eeh_dev *edev);
>  void eeh_addr_cache_init(void);
> -void eeh_addr_cache_build(void);
>  void eeh_add_device_early(struct pci_dn *);
>  void eeh_add_device_tree_early(struct pci_dn *);
>  void eeh_add_device_late(struct pci_dev *);
> @@ -347,8 +345,6 @@ static inline bool eeh_phb_enabled(void)
>  	return false;
>  }
>  
> -static inline void eeh_probe_devices(void) { }
> -
>  static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
>  {
>  	return NULL;
> @@ -365,8 +361,6 @@ static inline int eeh_check_failure(const volatile void __iomem *token)
>  
>  static inline void eeh_addr_cache_init(void) { }
>  
> -static inline void eeh_addr_cache_build(void) { }
> -
>  static inline void eeh_add_device_early(struct pci_dn *pdn) { }
>  
>  static inline void eeh_add_device_tree_early(struct pci_dn *pdn) { }
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 217e14bb1fb6..cd2abbe41497 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1166,19 +1166,6 @@ static struct notifier_block eeh_reboot_nb = {
>  	.notifier_call = eeh_reboot_notifier,
>  };
>  
> -void eeh_probe_devices(void)
> -{
> -	struct pci_controller *hose, *tmp;
> -	struct pci_dn *pdn;
> -
> -	/* Enable EEH for all adapters */
> -	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
> -		pdn = hose->pci_data;
> -		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
> -	}
> -	eeh_show_enabled();
> -}
> -
>  /**
>   * eeh_init - EEH initialization
>   *
> diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
> index f93dd5cf6a39..c40078d036af 100644
> --- a/arch/powerpc/kernel/eeh_cache.c
> +++ b/arch/powerpc/kernel/eeh_cache.c
> @@ -278,38 +278,6 @@ void eeh_addr_cache_init(void)
>  	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
>  }
>  
> -/**
> - * eeh_addr_cache_build - Build a cache of I/O addresses
> - *
> - * Build a cache of pci i/o addresses.  This cache will be used to
> - * find the pci device that corresponds to a given address.
> - * This routine scans all pci busses to build the cache.
> - * Must be run late in boot process, after the pci controllers
> - * have been scanned for devices (after all device resources are known).
> - */
> -void eeh_addr_cache_build(void)
> -{
> -	struct pci_dn *pdn;
> -	struct eeh_dev *edev;
> -	struct pci_dev *dev = NULL;
> -
> -	for_each_pci_dev(dev) {
> -		pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
> -		if (!pdn)
> -			continue;
> -
> -		edev = pdn_to_eeh_dev(pdn);
> -		if (!edev)
> -			continue;
> -
> -		dev->dev.archdata.edev = edev;
> -		edev->pdev = dev;
> -
> -		eeh_addr_cache_insert_dev(dev);
> -		eeh_sysfs_add_device(dev);
> -	}
> -}
> -
>  static int eeh_addr_cache_show(struct seq_file *s, void *v)
>  {
>  	struct pci_io_addr_range *piar;
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 81b0923cc55f..6a08f4fab255 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -240,9 +240,7 @@ int pnv_eeh_post_init(void)
>  	struct pnv_phb *phb;
>  	int ret = 0;
>  
> -	/* Probe devices & build address cache */
> -	eeh_probe_devices();
> -	eeh_addr_cache_build();
> +	eeh_show_enabled();
>  
>  	/* Register OPAL event notifier */
>  	eeh_event_irq = opal_event_request(ilog2(OPAL_EVENT_PCI_ERROR));
> @@ -360,6 +358,7 @@ static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
>  	return 0;
>  }
>  
> +

Unrelated new line. Otherwise


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>




>  /**
>   * pnv_eeh_probe - Do probe on PCI device
>   * @pdn: PCI device node
> diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
> index 7be80882c08d..7c781b04fb15 100644
> --- a/arch/powerpc/platforms/pseries/pci.c
> +++ b/arch/powerpc/platforms/pseries/pci.c
> @@ -242,8 +242,7 @@ void __init pSeries_final_fixup(void)
>  
>  	pSeries_request_regions();
>  
> -	eeh_probe_devices();
> -	eeh_addr_cache_build();
> +	eeh_show_enabled();
>  #ifdef CONFIG_EEH
>  	if (eeh_enabled())
>  		eeh_add_flag(EEH_PHB_ENABLED);
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled()
  2019-03-20  6:02   ` Alexey Kardashevskiy
@ 2019-03-20  6:24     ` Oliver
  2019-04-09  3:30     ` Sam Bobroff
  1 sibling, 0 replies; 31+ messages in thread
From: Oliver @ 2019-03-20  6:24 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Sam Bobroff, linuxppc-dev

On Wed, Mar 20, 2019 at 5:06 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 20/03/2019 13:58, Sam Bobroff wrote:
> > Move the EEH enabled message into it's own function so that future
> > work can call it from multiple places.
> >
> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/include/asm/eeh.h |  3 +++
> >  arch/powerpc/kernel/eeh.c      | 16 +++++++++++-----
> >  2 files changed, 14 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index fe4cf7208890..e217ccda55d0 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > @@ -289,6 +289,7 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
> >
> >  struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
> >  void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
> > +void eeh_show_enabled(void);
> >  void eeh_probe_devices(void);
> >  int __init eeh_ops_register(struct eeh_ops *ops);
> >  int __exit eeh_ops_unregister(const char *name);
> > @@ -338,6 +339,8 @@ static inline bool eeh_enabled(void)
> >          return false;
> >  }
> >
> > +static inline void eeh_show_enabled(void) { }
> > +
> >  static inline bool eeh_phb_enabled(void)
> >  {
> >       return false;
> > diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> > index b14d89547895..3dcff29cb9b3 100644
> > --- a/arch/powerpc/kernel/eeh.c
> > +++ b/arch/powerpc/kernel/eeh.c
> > @@ -163,6 +163,16 @@ static int __init eeh_setup(char *str)
> >  }
> >  __setup("eeh=", eeh_setup);
> >
> > +void eeh_show_enabled(void)
> > +{
> > +     if (eeh_has_flag(EEH_FORCE_DISABLED))
> > +             pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (by eeh=off)\n");
> > +     else if (eeh_enabled())
>
>
> I'd make it eeh_has_flag(EEH_ENABLED) for clarity.
>
>
> > +             pr_info("EEH: PCI Enhanced I/O Error Handling ENABLED (capable adapter found)\n");
> > +     else
> > +             pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (no capable adapter found)\n");
> > +}
> > +
> >  /*
> >   * This routine captures assorted PCI configuration space data
> >   * for the indicated PCI device, and puts them into a buffer
> > @@ -1166,11 +1176,7 @@ void eeh_probe_devices(void)
> >               pdn = hose->pci_data;
> >               traverse_pci_dn(pdn, eeh_ops->probe, NULL);
> >       }
> > -     if (eeh_enabled())
> > -             pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
> > -     else
> > -             pr_info("EEH: No capable adapters found\n");
> > -
> > +     eeh_show_enabled();
>
>
> This line moves later in the series so I'd just merge this patch into
> 8/8 to reduce number of lines moving withing the patchset.
>
> In general the whole point of the EEH_ENABLED flag is fading away. Its
> meaning now is that "at least somewhere in the box for at least one
> device with enabled EEH" which does not seem extremely useful as we have
> a pci_dev or pe pretty much everywhere we look at eeh_enabled() and
> pdev->dev.archdata.edev can tell if eeh is enabled for a device.
> Although I am pretty sure this is in your list already :)

The other function is to disable attempting to detect EEH errors when
we get 0xFFs from an MMIO load, but I don't think anyone ever disables
it.

> >  }
> >
> >  /**
> >
>
> --
> Alexey

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/8] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
  2019-03-20  6:02   ` Alexey Kardashevskiy
@ 2019-04-08  6:50     ` Sam Bobroff
  0 siblings, 0 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-04-08  6:50 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3918 bytes --]

On Wed, Mar 20, 2019 at 05:02:57PM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 20/03/2019 13:58, Sam Bobroff wrote:
> > The EEH_DEV_NO_HANDLER flag is used by the EEH system to prevent the
> > use of driver callbacks in drivers that have been bound part way
> > through the recovery process. This is necessary to prevent later stage
> > handlers from being called when the earlier stage handlers haven't,
> > which can be confusing for drivers.
> 
> The flag is used from eeh_pe_report()->eeh_pe_report_edev which is
> called many times from eeh_handle_normal_event() (and you clear the flag
> here unconditionally) and once from eeh_handle_special_event() - so this
> is actually the only case now when the flag matters. Is my understanding
> correct? Also is not clearing the flag correct in that case? I do not
> quite understand eeh_handle_normal_event vs. eeh_handle_special_event
> business though.

I'm not sure I fully understand your question, but here's the situation:

* EEH is detected on a PCI device that has no driver bound but there is
  a driver that COULD bind.
* eeh_handle_normal_event() follows the "EEH: Reset with hotplug
  activity" path because the device doesn't (currently) have a driver
  that supports EEH.
* eeh_reset_device() removes the device (pci_hp_remove_devices()).
* eeh_reset_device() re-discovers the device with pci_hp_add_devices().
* As part of re-discovery the PCI subsystem will bind the available driver.
* eeh_handle_normal_event() calls eeh_report_resume() (via eeh_pe_report()).

If the (newly bound) driver has a resume() handler, then
eeh_report_resume() will call it and AFAIK this will cause a problem for
some drivers because their error_detected() handler wasn't called first.

The fix for this is to have eeh_add_device_late() set EEH_DEV_NO_HANDLER
so that we can detect that the device has been added DURING recovery,
and avoid calling it's handlers later.

I see what you mean about the eeh_handle_special_event() case, where
EEH_DEV_NO_HANDLER isn't cleared before calling eeh_pe_report(), and I
think it's a bug! I'll fix it in the next version.

(Cleaning up that flag is on my list. I don't think it's a very good
solution.)

> > 
> > However, the flag is set for all devices that are added after boot
> > time and only cleared at the end of the EEH recovery process. This
> > results in hot plugged devices erroneously having the flag set during
> > the first recovery after they are added (causing their driver's
> > handlers to be incorrectly ignored).
> > 
> > To remedy this, clear the flag at the beginning of recovery
> > processing. The flag is still cleared at the end of recovery
> > processing, although it is no longer really necessary.
> 
> Then may be remove that redundant clearing?

I don't really mind either way; clearing it when we are finished with
recovery seems "cleaner" to me but it doesn't have any function. (In
any case, I think I will eventually want to remove it.)

> > 
> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/kernel/eeh_driver.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> > index 6f3ee30565dd..4c34b9901f15 100644
> > --- a/arch/powerpc/kernel/eeh_driver.c
> > +++ b/arch/powerpc/kernel/eeh_driver.c
> > @@ -819,6 +819,10 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
> >  		result = PCI_ERS_RESULT_DISCONNECT;
> >  	}
> >  
> > +	eeh_for_each_pe(pe, tmp_pe)
> > +		eeh_pe_for_each_dev(tmp_pe, edev, tmp)
> > +			edev->mode &= ~EEH_DEV_NO_HANDLER;
> > +
> >  	/* Walk the various device drivers attached to this slot through
> >  	 * a reset sequence, giving each an opportunity to do what it needs
> >  	 * to accomplish the reset.  Each child gets a report of the
> > 
> 
> -- 
> Alexey
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
  2019-03-20  6:02   ` Alexey Kardashevskiy
@ 2019-04-09  1:41     ` Sam Bobroff
  0 siblings, 0 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-04-09  1:41 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 6774 bytes --]

On Wed, Mar 20, 2019 at 05:02:44PM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 20/03/2019 13:58, Sam Bobroff wrote:
> > The PHB flag, PNV_PHB_FLAG_EEH, is set (on PowerNV) individually on
> > each PHB once the EEH subsystem is ready. It is the only use of the
> > flags member of the phb struct.
> 
> 
> Then why to keep pnv_phb::flags?

No reason. I'll remove it in the next version.

> > However there is no need to store this separately on each PHB, so
> > convert it to a global flag. For symmetry, the flag is now also set
> > for pSeries; although it is currently unused it may be useful in the
> > future.
> 
> Just using eeh_enabled() instead of (phb->flags & PNV_PHB_FLAG_EEH)
> seems easier and cleaner; also pseries does not use it so there is no
> point defining it there either.

I do want to do that. However, eeh_enabled() seems to be slightly
different from PNV_PHB_FLAG_EEH:

- eeh_enabled() is true as soon as the first device with EEH support is
  detected.
- eeh_phb_enabled() is true after EEH support has been enabled on every
  device that supports it.

So I was concerned that using eeh_enabled() would cause problems in
pnv_pci_config_check_eeh() if EEH was detected *during* the initial PCI
scanning phase when eeh_enabled() was true but EEH had not yet been set
up on the device or PHB where it was detected.

Does that make sense?

Would it be reasonable to keep this patch as it is for now and
investigate cleaning it up in a future patch?

> > 
> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/include/asm/eeh.h               | 11 +++++++++++
> >  arch/powerpc/platforms/powernv/eeh-powernv.c | 14 +++-----------
> >  arch/powerpc/platforms/powernv/pci.c         |  7 +++----
> >  arch/powerpc/platforms/powernv/pci.h         |  2 --
> >  arch/powerpc/platforms/pseries/pci.c         |  4 ++++
> >  5 files changed, 21 insertions(+), 17 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index 3613a56281f2..fe4cf7208890 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > @@ -43,6 +43,7 @@ struct pci_dn;
> >  #define EEH_VALID_PE_ZERO	0x10	/* PE#0 is valid		     */
> >  #define EEH_ENABLE_IO_FOR_LOG	0x20	/* Enable IO for log		     */
> >  #define EEH_EARLY_DUMP_LOG	0x40	/* Dump log immediately		     */
> > +#define EEH_PHB_ENABLED		0x80	/* PHB recovery uses EEH	     */
> >  
> >  /*
> >   * Delay for PE reset, all in ms
> > @@ -245,6 +246,11 @@ static inline bool eeh_enabled(void)
> >  	return eeh_has_flag(EEH_ENABLED) && !eeh_has_flag(EEH_FORCE_DISABLED);
> >  }
> >  
> > +static inline bool eeh_phb_enabled(void)
> > +{
> > +	return eeh_has_flag(EEH_PHB_ENABLED);
> > +}
> > +
> >  static inline void eeh_serialize_lock(unsigned long *flags)
> >  {
> >  	raw_spin_lock_irqsave(&confirm_error_lock, *flags);
> > @@ -332,6 +338,11 @@ static inline bool eeh_enabled(void)
> >          return false;
> >  }
> >  
> > +static inline bool eeh_phb_enabled(void)
> > +{
> > +	return false;
> > +}
> > +
> >  static inline void eeh_probe_devices(void) { }
> >  
> >  static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
> > diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > index 6fc1a463b796..f0a95f663810 100644
> > --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> > +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > @@ -264,22 +264,14 @@ int pnv_eeh_post_init(void)
> >  		return ret;
> >  	}
> >  
> > -	if (!eeh_enabled())
> > +	if (eeh_enabled())
> > +		eeh_add_flag(EEH_PHB_ENABLED);
> > +	else
> >  		disable_irq(eeh_event_irq);
> >  
> >  	list_for_each_entry(hose, &hose_list, list_node) {
> >  		phb = hose->private_data;
> >  
> > -		/*
> > -		 * If EEH is enabled, we're going to rely on that.
> > -		 * Otherwise, we restore to conventional mechanism
> > -		 * to clear frozen PE during PCI config access.
> > -		 */
> > -		if (eeh_enabled())
> > -			phb->flags |= PNV_PHB_FLAG_EEH;
> > -		else
> > -			phb->flags &= ~PNV_PHB_FLAG_EEH;
> > -
> >  		/* Create debugfs entries */
> >  #ifdef CONFIG_DEBUG_FS
> >  		if (phb->has_dbgfs || !phb->dbgfs)
> > diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> > index 307181fd8a17..d2b50f3bf6b1 100644
> > --- a/arch/powerpc/platforms/powernv/pci.c
> > +++ b/arch/powerpc/platforms/powernv/pci.c
> > @@ -717,10 +717,9 @@ int pnv_pci_cfg_write(struct pci_dn *pdn,
> >  static bool pnv_pci_cfg_check(struct pci_dn *pdn)
> >  {
> >  	struct eeh_dev *edev = NULL;
> > -	struct pnv_phb *phb = pdn->phb->private_data;
> >  
> >  	/* EEH not enabled ? */
> > -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> > +	if (!eeh_phb_enabled())
> >  		return true;
> >  
> >  	/* PE reset or device removed ? */
> > @@ -761,7 +760,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
> >  
> >  	ret = pnv_pci_cfg_read(pdn, where, size, val);
> >  	phb = pdn->phb->private_data;
> > -	if (phb->flags & PNV_PHB_FLAG_EEH && pdn->edev) {
> > +	if (eeh_phb_enabled() && pdn->edev) {
> >  		if (*val == EEH_IO_ERROR_VALUE(size) &&
> >  		    eeh_dev_check_failure(pdn->edev))
> >                          return PCIBIOS_DEVICE_NOT_FOUND;
> > @@ -789,7 +788,7 @@ static int pnv_pci_write_config(struct pci_bus *bus,
> >  
> >  	ret = pnv_pci_cfg_write(pdn, where, size, val);
> >  	phb = pdn->phb->private_data;
> > -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> > +	if (!eeh_phb_enabled())
> >  		pnv_pci_config_check_eeh(pdn);
> >  
> >  	return ret;
> > diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> > index 8e36da379252..eb0add61397b 100644
> > --- a/arch/powerpc/platforms/powernv/pci.h
> > +++ b/arch/powerpc/platforms/powernv/pci.h
> > @@ -85,8 +85,6 @@ struct pnv_ioda_pe {
> >  	struct list_head	list;
> >  };
> >  
> > -#define PNV_PHB_FLAG_EEH	(1 << 0)
> > -
> >  struct pnv_phb {
> >  	struct pci_controller	*hose;
> >  	enum pnv_phb_type	type;
> > diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
> > index 37a77e57893e..7be80882c08d 100644
> > --- a/arch/powerpc/platforms/pseries/pci.c
> > +++ b/arch/powerpc/platforms/pseries/pci.c
> > @@ -244,6 +244,10 @@ void __init pSeries_final_fixup(void)
> >  
> >  	eeh_probe_devices();
> >  	eeh_addr_cache_build();
> > +#ifdef CONFIG_EEH
> > +	if (eeh_enabled())
> > +		eeh_add_flag(EEH_PHB_ENABLED);
> > +#endif
> >  
> >  #ifdef CONFIG_PCI_IOV
> >  	ppc_md.pcibios_sriov_enable = pseries_pcibios_sriov_enable;
> > 
> 
> -- 
> Alexey
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled()
  2019-03-20  6:02   ` Alexey Kardashevskiy
  2019-03-20  6:24     ` Oliver
@ 2019-04-09  3:30     ` Sam Bobroff
  1 sibling, 0 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-04-09  3:30 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3274 bytes --]

On Wed, Mar 20, 2019 at 05:02:23PM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 20/03/2019 13:58, Sam Bobroff wrote:
> > Move the EEH enabled message into it's own function so that future
> > work can call it from multiple places.
> > 
> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/include/asm/eeh.h |  3 +++
> >  arch/powerpc/kernel/eeh.c      | 16 +++++++++++-----
> >  2 files changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index fe4cf7208890..e217ccda55d0 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > @@ -289,6 +289,7 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
> >  
> >  struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
> >  void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
> > +void eeh_show_enabled(void);
> >  void eeh_probe_devices(void);
> >  int __init eeh_ops_register(struct eeh_ops *ops);
> >  int __exit eeh_ops_unregister(const char *name);
> > @@ -338,6 +339,8 @@ static inline bool eeh_enabled(void)
> >          return false;
> >  }
> >  
> > +static inline void eeh_show_enabled(void) { }
> > +
> >  static inline bool eeh_phb_enabled(void)
> >  {
> >  	return false;
> > diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> > index b14d89547895..3dcff29cb9b3 100644
> > --- a/arch/powerpc/kernel/eeh.c
> > +++ b/arch/powerpc/kernel/eeh.c
> > @@ -163,6 +163,16 @@ static int __init eeh_setup(char *str)
> >  }
> >  __setup("eeh=", eeh_setup);
> >  
> > +void eeh_show_enabled(void)
> > +{
> > +	if (eeh_has_flag(EEH_FORCE_DISABLED))
> > +		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (by eeh=off)\n");
> > +	else if (eeh_enabled())
> 
> 
> I'd make it eeh_has_flag(EEH_ENABLED) for clarity.

OK, sounds good.

> 
> > +		pr_info("EEH: PCI Enhanced I/O Error Handling ENABLED (capable adapter found)\n");
> > +	else
> > +		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (no capable adapter found)\n");
> > +}
> > +
> >  /*
> >   * This routine captures assorted PCI configuration space data
> >   * for the indicated PCI device, and puts them into a buffer
> > @@ -1166,11 +1176,7 @@ void eeh_probe_devices(void)
> >  		pdn = hose->pci_data;
> >  		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
> >  	}
> > -	if (eeh_enabled())
> > -		pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
> > -	else
> > -		pr_info("EEH: No capable adapters found\n");
> > -
> > +	eeh_show_enabled();
> 
> 
> This line moves later in the series so I'd just merge this patch into
> 8/8 to reduce number of lines moving withing the patchset.

Oh, good idea. I'll do it.

> In general the whole point of the EEH_ENABLED flag is fading away. Its
> meaning now is that "at least somewhere in the box for at least one
> device with enabled EEH" which does not seem extremely useful as we have
> a pci_dev or pe pretty much everywhere we look at eeh_enabled() and
> pdev->dev.archdata.edev can tell if eeh is enabled for a device.
> Although I am pretty sure this is in your list already :)

Yes. :-)

> 
> >  }
> >  
> >  /**
> > 
> 
> -- 
> Alexey
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build()
  2019-03-20  6:05   ` Alexey Kardashevskiy
@ 2019-04-09  3:31     ` Sam Bobroff
  0 siblings, 0 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-04-09  3:31 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 6295 bytes --]

On Wed, Mar 20, 2019 at 05:05:49PM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 20/03/2019 13:58, Sam Bobroff wrote:
> > Now that EEH support for all devices (on PowerNV and pSeries) is
> > provided by the pcibios bus add device hooks, eeh_probe_devices() and
> > eeh_addr_cache_build() are redundant and can be removed.
> > 
> > Note that previously on pSeries, useless EEH sysfs files were created
> > for some devices that did not have EEH support and this change
> > prevents them from being created.
> > 
> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/include/asm/eeh.h               |  6 ----
> >  arch/powerpc/kernel/eeh.c                    | 13 --------
> >  arch/powerpc/kernel/eeh_cache.c              | 32 --------------------
> >  arch/powerpc/platforms/powernv/eeh-powernv.c |  5 ++-
> >  arch/powerpc/platforms/pseries/pci.c         |  3 +-
> >  5 files changed, 3 insertions(+), 56 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index 791b9e6fcc45..f1eca1757cbc 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > @@ -290,13 +290,11 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
> >  struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
> >  void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
> >  void eeh_show_enabled(void);
> > -void eeh_probe_devices(void);
> >  int __init eeh_ops_register(struct eeh_ops *ops);
> >  int __exit eeh_ops_unregister(const char *name);
> >  int eeh_check_failure(const volatile void __iomem *token);
> >  int eeh_dev_check_failure(struct eeh_dev *edev);
> >  void eeh_addr_cache_init(void);
> > -void eeh_addr_cache_build(void);
> >  void eeh_add_device_early(struct pci_dn *);
> >  void eeh_add_device_tree_early(struct pci_dn *);
> >  void eeh_add_device_late(struct pci_dev *);
> > @@ -347,8 +345,6 @@ static inline bool eeh_phb_enabled(void)
> >  	return false;
> >  }
> >  
> > -static inline void eeh_probe_devices(void) { }
> > -
> >  static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
> >  {
> >  	return NULL;
> > @@ -365,8 +361,6 @@ static inline int eeh_check_failure(const volatile void __iomem *token)
> >  
> >  static inline void eeh_addr_cache_init(void) { }
> >  
> > -static inline void eeh_addr_cache_build(void) { }
> > -
> >  static inline void eeh_add_device_early(struct pci_dn *pdn) { }
> >  
> >  static inline void eeh_add_device_tree_early(struct pci_dn *pdn) { }
> > diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> > index 217e14bb1fb6..cd2abbe41497 100644
> > --- a/arch/powerpc/kernel/eeh.c
> > +++ b/arch/powerpc/kernel/eeh.c
> > @@ -1166,19 +1166,6 @@ static struct notifier_block eeh_reboot_nb = {
> >  	.notifier_call = eeh_reboot_notifier,
> >  };
> >  
> > -void eeh_probe_devices(void)
> > -{
> > -	struct pci_controller *hose, *tmp;
> > -	struct pci_dn *pdn;
> > -
> > -	/* Enable EEH for all adapters */
> > -	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
> > -		pdn = hose->pci_data;
> > -		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
> > -	}
> > -	eeh_show_enabled();
> > -}
> > -
> >  /**
> >   * eeh_init - EEH initialization
> >   *
> > diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
> > index f93dd5cf6a39..c40078d036af 100644
> > --- a/arch/powerpc/kernel/eeh_cache.c
> > +++ b/arch/powerpc/kernel/eeh_cache.c
> > @@ -278,38 +278,6 @@ void eeh_addr_cache_init(void)
> >  	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
> >  }
> >  
> > -/**
> > - * eeh_addr_cache_build - Build a cache of I/O addresses
> > - *
> > - * Build a cache of pci i/o addresses.  This cache will be used to
> > - * find the pci device that corresponds to a given address.
> > - * This routine scans all pci busses to build the cache.
> > - * Must be run late in boot process, after the pci controllers
> > - * have been scanned for devices (after all device resources are known).
> > - */
> > -void eeh_addr_cache_build(void)
> > -{
> > -	struct pci_dn *pdn;
> > -	struct eeh_dev *edev;
> > -	struct pci_dev *dev = NULL;
> > -
> > -	for_each_pci_dev(dev) {
> > -		pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
> > -		if (!pdn)
> > -			continue;
> > -
> > -		edev = pdn_to_eeh_dev(pdn);
> > -		if (!edev)
> > -			continue;
> > -
> > -		dev->dev.archdata.edev = edev;
> > -		edev->pdev = dev;
> > -
> > -		eeh_addr_cache_insert_dev(dev);
> > -		eeh_sysfs_add_device(dev);
> > -	}
> > -}
> > -
> >  static int eeh_addr_cache_show(struct seq_file *s, void *v)
> >  {
> >  	struct pci_io_addr_range *piar;
> > diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > index 81b0923cc55f..6a08f4fab255 100644
> > --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> > +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > @@ -240,9 +240,7 @@ int pnv_eeh_post_init(void)
> >  	struct pnv_phb *phb;
> >  	int ret = 0;
> >  
> > -	/* Probe devices & build address cache */
> > -	eeh_probe_devices();
> > -	eeh_addr_cache_build();
> > +	eeh_show_enabled();
> >  
> >  	/* Register OPAL event notifier */
> >  	eeh_event_irq = opal_event_request(ilog2(OPAL_EVENT_PCI_ERROR));
> > @@ -360,6 +358,7 @@ static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
> >  	return 0;
> >  }
> >  
> > +
> 
> Unrelated new line. Otherwise

Thanks, I'll fix it as part of merging in patch #5.

> 
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> 
> 
> 
> >  /**
> >   * pnv_eeh_probe - Do probe on PCI device
> >   * @pdn: PCI device node
> > diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
> > index 7be80882c08d..7c781b04fb15 100644
> > --- a/arch/powerpc/platforms/pseries/pci.c
> > +++ b/arch/powerpc/platforms/pseries/pci.c
> > @@ -242,8 +242,7 @@ void __init pSeries_final_fixup(void)
> >  
> >  	pSeries_request_regions();
> >  
> > -	eeh_probe_devices();
> > -	eeh_addr_cache_build();
> > +	eeh_show_enabled();
> >  #ifdef CONFIG_EEH
> >  	if (eeh_enabled())
> >  		eeh_add_flag(EEH_PHB_ENABLED);
> > 
> 
> -- 
> Alexey
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/8]
  2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
                   ` (7 preceding siblings ...)
  2019-03-20  2:58 ` [PATCH 8/8] powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build() Sam Bobroff
@ 2019-04-12  0:55 ` Tyrel Datwyler
  2019-04-15  3:41   ` Sam Bobroff
  8 siblings, 1 reply; 31+ messages in thread
From: Tyrel Datwyler @ 2019-04-12  0:55 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev

On 03/19/2019 07:58 PM, Sam Bobroff wrote:
> Hi all,
> 
> This patch set adds support for EEH recovery of hot plugged devices on pSeries
> machines. Specifically, devices discovered by PCI rescanning using
> /sys/bus/pci/rescan, which includes devices hotplugged by QEMU's device_add
> command. (pSeries doesn't currently use slot power control for hotplugging.)

Slight nit that its not that pSeries doesn't support slot power control
hotplugging, its that QEMU pSeries guests don't support it. We most definitely
use the slot power control for hotplugging in PowerVM pSeries Linux guests. More
specifically we had to work around short comings in the rpaphp driver when
dealing with QEMU. This being that while at initial glance the design implies
that it had multiple devices per PHB in mind, it didn't, and only actually
supported a single slot per PHB. Further, when we developed the QEMU pci hotplug
feature we had to deal with only having a single PHB per QEMU guest and as a
result needed a way to plug multiple pci devices into a single PHB. Hence, came
the pci rescan work around in drmgr.

Mike Roth and I have had discussions over the years to get the slot power
control hotplugging working properly with QEMU, and while I did get the RPA
hotplug driver fixed to register all available slots associated with a PHB, EEH
remained an issue. So, I'm very happy to see this patchset get that working with
the rescan work around.

> 
> As a side effect this also provides EEH support for devices removed by
> /sys/bus/pci/devices/*/remove and re-discovered by writing to /sys/bus/pci/rescan,
> on all platforms.

+1, this seems like icing on the cake. ;)

> 
> The approach I've taken is to use the fact that the existing
> pcibios_bus_add_device() platform hooks (which are used to set up EEH on
> Virtual Function devices (VFs)) are actually called for all devices, so I've
> widened their scope and made other adjustments necessary to allow them to work
> for hotplugged and boot-time devices as well.
> 
> Because some of the changes are in generic PowerPC code, it's
> possible that I've disturbed something for another PowerPC platform. I've tried
> to minimize this by leaving that code alone as much as possible and so there
> are a few cases where eeh_add_device_{early,late}() or eeh_add_sysfs_files() is
> called more than once. I think these can be looked at later, as duplicate calls
> are not harmful.
> 
> The patch "Convert PNV_PHB_FLAG_EEH" isn't strictly necessary and I'm not sure
> if it's better to keep it, because it simplifies the code or drop it, because
> we may need a separate flag per PHB later on. Thoughts anyone?
> 
> The first patch is a rework of the pcibios_init reordering patch I posted
> earlier, which I've included here because it's necessary for this set.
> 
> I have done some testing for PowerNV on Power9 using a modified pnv_php module
> and some testing on pSeries with slot power control using a modified rpaphp
> module, and the EEH-related parts seem to work.

I'm interested in what modifications with rpaphp. Its unclear if you are saying
rpaphp modified so that slot power hotplug works with a QEMU pSeries guest? If
thats the case it would be optimal to get that upstream and remove the work
rescan workaround for guests that don't need it.

-Tyrel

> 
> Cheers,
> Sam.
> 
> Sam Bobroff (8):
>   powerpc/64: Adjust order in pcibios_init()
>   powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
>   powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
>   powerpc/eeh: Improve debug messages around device addition
>   powerpc/eeh: Add eeh_show_enabled()
>   powerpc/eeh: Initialize EEH address cache earlier
>   powerpc/eeh: EEH for pSeries hot plug
>   powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build()
> 
>  arch/powerpc/include/asm/eeh.h               | 19 +++--
>  arch/powerpc/kernel/eeh.c                    | 33 ++++-----
>  arch/powerpc/kernel/eeh_cache.c              | 29 +-------
>  arch/powerpc/kernel/eeh_driver.c             |  4 ++
>  arch/powerpc/kernel/of_platform.c            |  3 +-
>  arch/powerpc/kernel/pci-common.c             |  4 --
>  arch/powerpc/kernel/pci_32.c                 |  4 ++
>  arch/powerpc/kernel/pci_64.c                 | 12 +++-
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 41 +++++------
>  arch/powerpc/platforms/powernv/pci.c         |  7 +-
>  arch/powerpc/platforms/powernv/pci.h         |  2 -
>  arch/powerpc/platforms/pseries/eeh_pseries.c | 75 +++++++++++---------
>  arch/powerpc/platforms/pseries/pci.c         |  7 +-
>  13 files changed, 122 insertions(+), 118 deletions(-)
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/8]
  2019-04-12  0:55 ` [PATCH 0/8] Tyrel Datwyler
@ 2019-04-15  3:41   ` Sam Bobroff
  2019-04-19 22:36     ` Tyrel Datwyler
  0 siblings, 1 reply; 31+ messages in thread
From: Sam Bobroff @ 2019-04-15  3:41 UTC (permalink / raw)
  To: Tyrel Datwyler; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 6202 bytes --]

On Thu, Apr 11, 2019 at 05:55:33PM -0700, Tyrel Datwyler wrote:
> On 03/19/2019 07:58 PM, Sam Bobroff wrote:
> > Hi all,
> > 
> > This patch set adds support for EEH recovery of hot plugged devices on pSeries
> > machines. Specifically, devices discovered by PCI rescanning using
> > /sys/bus/pci/rescan, which includes devices hotplugged by QEMU's device_add
> > command. (pSeries doesn't currently use slot power control for hotplugging.)
> 
> Slight nit that its not that pSeries doesn't support slot power control
> hotplugging, its that QEMU pSeries guests don't support it. We most definitely
> use the slot power control for hotplugging in PowerVM pSeries Linux guests. More

Ah, I think I see what you mean: pSeries can (and does!) use slot power
control for hotplugging, it's just that Linux doesn't. Right :-) I'll
change it to "Linux on pSeries doesn't...." for the next version.

> specifically we had to work around short comings in the rpaphp driver when
> dealing with QEMU. This being that while at initial glance the design implies
> that it had multiple devices per PHB in mind, it didn't, and only actually
> supported a single slot per PHB. Further, when we developed the QEMU pci hotplug
> feature we had to deal with only having a single PHB per QEMU guest and as a
> result needed a way to plug multiple pci devices into a single PHB. Hence, came
> the pci rescan work around in drmgr.
> 
> Mike Roth and I have had discussions over the years to get the slot power
> control hotplugging working properly with QEMU, and while I did get the RPA
> hotplug driver fixed to register all available slots associated with a PHB, EEH
> remained an issue. So, I'm very happy to see this patchset get that working with
> the rescan work around.
> 
> > 
> > As a side effect this also provides EEH support for devices removed by
> > /sys/bus/pci/devices/*/remove and re-discovered by writing to /sys/bus/pci/rescan,
> > on all platforms.
> 
> +1, this seems like icing on the cake. ;)

Yes :-)

Although maybe I should mention that we can't really benefit from this
on PowerNV *yet* because there seem to be some other problems with
removing and re-scanning devices: in my tests devices are often unusable
after being rediscovered.

(I'm hoping to take a look at that soon.)

> > 
> > The approach I've taken is to use the fact that the existing
> > pcibios_bus_add_device() platform hooks (which are used to set up EEH on
> > Virtual Function devices (VFs)) are actually called for all devices, so I've
> > widened their scope and made other adjustments necessary to allow them to work
> > for hotplugged and boot-time devices as well.
> > 
> > Because some of the changes are in generic PowerPC code, it's
> > possible that I've disturbed something for another PowerPC platform. I've tried
> > to minimize this by leaving that code alone as much as possible and so there
> > are a few cases where eeh_add_device_{early,late}() or eeh_add_sysfs_files() is
> > called more than once. I think these can be looked at later, as duplicate calls
> > are not harmful.
> > 
> > The patch "Convert PNV_PHB_FLAG_EEH" isn't strictly necessary and I'm not sure
> > if it's better to keep it, because it simplifies the code or drop it, because
> > we may need a separate flag per PHB later on. Thoughts anyone?
> > 
> > The first patch is a rework of the pcibios_init reordering patch I posted
> > earlier, which I've included here because it's necessary for this set.
> > 
> > I have done some testing for PowerNV on Power9 using a modified pnv_php module
> > and some testing on pSeries with slot power control using a modified rpaphp
> > module, and the EEH-related parts seem to work.
> 
> I'm interested in what modifications with rpaphp. Its unclear if you are saying
> rpaphp modified so that slot power hotplug works with a QEMU pSeries guest? If
> thats the case it would be optimal to get that upstream and remove the work
> rescan workaround for guests that don't need it.

Unfortunately no, I didn't do enough work to really get it working.  I
just wanted to get an idea of how that code path interacted with the EEH
code I was changing, so that hopefully when we get to fixing it, the EEH
part will be easier to do.

The hack I tested with was:

- rtas_errd changed so that it doesn't pass -V to drmgr (-V seems to
  trigger drmgr to use the PCI rescan system rather that slot power
  control).
- of_pci_parse_addrs() changed so that if the assigned-addresses node
  is missing (which it is when the guest is running under QEMU/KVM) we
  call pci_setup_device() to configure the BARs.

It did look pretty good -- the EEH part may actually work fine once we get
the rest sorted out.

> 
> -Tyrel
> 
> > 
> > Cheers,
> > Sam.
> > 
> > Sam Bobroff (8):
> >   powerpc/64: Adjust order in pcibios_init()
> >   powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
> >   powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
> >   powerpc/eeh: Improve debug messages around device addition
> >   powerpc/eeh: Add eeh_show_enabled()
> >   powerpc/eeh: Initialize EEH address cache earlier
> >   powerpc/eeh: EEH for pSeries hot plug
> >   powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build()
> > 
> >  arch/powerpc/include/asm/eeh.h               | 19 +++--
> >  arch/powerpc/kernel/eeh.c                    | 33 ++++-----
> >  arch/powerpc/kernel/eeh_cache.c              | 29 +-------
> >  arch/powerpc/kernel/eeh_driver.c             |  4 ++
> >  arch/powerpc/kernel/of_platform.c            |  3 +-
> >  arch/powerpc/kernel/pci-common.c             |  4 --
> >  arch/powerpc/kernel/pci_32.c                 |  4 ++
> >  arch/powerpc/kernel/pci_64.c                 | 12 +++-
> >  arch/powerpc/platforms/powernv/eeh-powernv.c | 41 +++++------
> >  arch/powerpc/platforms/powernv/pci.c         |  7 +-
> >  arch/powerpc/platforms/powernv/pci.h         |  2 -
> >  arch/powerpc/platforms/pseries/eeh_pseries.c | 75 +++++++++++---------
> >  arch/powerpc/platforms/pseries/pci.c         |  7 +-
> >  13 files changed, 122 insertions(+), 118 deletions(-)
> > 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
  2019-03-20  2:58 ` [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
@ 2019-04-18  9:51   ` Oliver O'Halloran
  2019-04-30  5:30     ` Sam Bobroff
  1 sibling, 1 reply; 31+ messages in thread
From: Oliver O'Halloran @ 2019-04-18  9:51 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev

On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote:
> The PHB flag, PNV_PHB_FLAG_EEH, is set (on PowerNV) individually on
> each PHB once the EEH subsystem is ready. It is the only use of the
> flags member of the phb struct.
> 
> However there is no need to store this separately on each PHB, so
> convert it to a global flag. For symmetry, the flag is now also set
> for pSeries; although it is currently unused it may be useful in the
> future.

I'd drop this patch and keep it as a seperate flag. Making this a
global flag doesn't really buy us anything either. It's also worth
remembering that we do have virtual PHBs, like the NVLink ones, that
don't have real EEH support and we should be doing something more
intelligent about that.

If you are going to remove the PNV_PHB flag then i would look at moving
that flag into the generic pci_controller structure that we use for
tracking PHBs since that would give us a way to toggle EEH support on a
per PHB basis on pseries and powernv.

> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/eeh.h               | 11 +++++++++++
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 14 +++-----------
>  arch/powerpc/platforms/powernv/pci.c         |  7 +++----
>  arch/powerpc/platforms/powernv/pci.h         |  2 --
>  arch/powerpc/platforms/pseries/pci.c         |  4 ++++
>  5 files changed, 21 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index 3613a56281f2..fe4cf7208890 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -43,6 +43,7 @@ struct pci_dn;
>  #define EEH_VALID_PE_ZERO	0x10	/* PE#0 is valid		     */
>  #define EEH_ENABLE_IO_FOR_LOG	0x20	/* Enable IO for log		     */
>  #define EEH_EARLY_DUMP_LOG	0x40	/* Dump log immediately		     */
> +#define EEH_PHB_ENABLED		0x80	/* PHB recovery uses EEH	     */
>  
>  /*
>   * Delay for PE reset, all in ms
> @@ -245,6 +246,11 @@ static inline bool eeh_enabled(void)
>  	return eeh_has_flag(EEH_ENABLED) && !eeh_has_flag(EEH_FORCE_DISABLED);
>  }
>  
> +static inline bool eeh_phb_enabled(void)
> +{
> +	return eeh_has_flag(EEH_PHB_ENABLED);
> +}
> +
>  static inline void eeh_serialize_lock(unsigned long *flags)
>  {
>  	raw_spin_lock_irqsave(&confirm_error_lock, *flags);
> @@ -332,6 +338,11 @@ static inline bool eeh_enabled(void)
>          return false;
>  }
>  
> +static inline bool eeh_phb_enabled(void)
> +{
> +	return false;
> +}
> +
>  static inline void eeh_probe_devices(void) { }
>  
>  static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 6fc1a463b796..f0a95f663810 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -264,22 +264,14 @@ int pnv_eeh_post_init(void)
>  		return ret;
>  	}
>  
> -	if (!eeh_enabled())
> +	if (eeh_enabled())
> +		eeh_add_flag(EEH_PHB_ENABLED);
> +	else
>  		disable_irq(eeh_event_irq);
>  
>  	list_for_each_entry(hose, &hose_list, list_node) {
>  		phb = hose->private_data;
>  
> -		/*
> -		 * If EEH is enabled, we're going to rely on that.
> -		 * Otherwise, we restore to conventional mechanism
> -		 * to clear frozen PE during PCI config access.
> -		 */
> -		if (eeh_enabled())
> -			phb->flags |= PNV_PHB_FLAG_EEH;
> -		else
> -			phb->flags &= ~PNV_PHB_FLAG_EEH;
> -
>  		/* Create debugfs entries */
>  #ifdef CONFIG_DEBUG_FS
>  		if (phb->has_dbgfs || !phb->dbgfs)
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 307181fd8a17..d2b50f3bf6b1 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -717,10 +717,9 @@ int pnv_pci_cfg_write(struct pci_dn *pdn,
>  static bool pnv_pci_cfg_check(struct pci_dn *pdn)
>  {
>  	struct eeh_dev *edev = NULL;
> -	struct pnv_phb *phb = pdn->phb->private_data;
>  
>  	/* EEH not enabled ? */
> -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> +	if (!eeh_phb_enabled())
>  		return true;
>  
>  	/* PE reset or device removed ? */
> @@ -761,7 +760,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  
>  	ret = pnv_pci_cfg_read(pdn, where, size, val);
>  	phb = pdn->phb->private_data;
> -	if (phb->flags & PNV_PHB_FLAG_EEH && pdn->edev) {
> +	if (eeh_phb_enabled() && pdn->edev) {
>  		if (*val == EEH_IO_ERROR_VALUE(size) &&
>  		    eeh_dev_check_failure(pdn->edev))
>                          return PCIBIOS_DEVICE_NOT_FOUND;
> @@ -789,7 +788,7 @@ static int pnv_pci_write_config(struct pci_bus *bus,
>  
>  	ret = pnv_pci_cfg_write(pdn, where, size, val);
>  	phb = pdn->phb->private_data;
> -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> +	if (!eeh_phb_enabled())
>  		pnv_pci_config_check_eeh(pdn);
>  
>  	return ret;
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 8e36da379252..eb0add61397b 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -85,8 +85,6 @@ struct pnv_ioda_pe {
>  	struct list_head	list;
>  };
>  
> -#define PNV_PHB_FLAG_EEH	(1 << 0)
> -
>  struct pnv_phb {
>  	struct pci_controller	*hose;
>  	enum pnv_phb_type	type;
> diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
> index 37a77e57893e..7be80882c08d 100644
> --- a/arch/powerpc/platforms/pseries/pci.c
> +++ b/arch/powerpc/platforms/pseries/pci.c
> @@ -244,6 +244,10 @@ void __init pSeries_final_fixup(void)
>  
>  	eeh_probe_devices();
>  	eeh_addr_cache_build();
> +#ifdef CONFIG_EEH
> +	if (eeh_enabled())
> +		eeh_add_flag(EEH_PHB_ENABLED);
> +#endif
>  
>  #ifdef CONFIG_PCI_IOV
>  	ppc_md.pcibios_sriov_enable = pseries_pcibios_sriov_enable;


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled()
  2019-03-20  2:58 ` [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled() Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
@ 2019-04-18 10:01   ` Oliver O'Halloran
  2019-04-30  5:44     ` Sam Bobroff
  1 sibling, 1 reply; 31+ messages in thread
From: Oliver O'Halloran @ 2019-04-18 10:01 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev

On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote:
> Move the EEH enabled message into it's own function so that future
> work can call it from multiple places.
> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/eeh.h |  3 +++
>  arch/powerpc/kernel/eeh.c      | 16 +++++++++++-----
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index fe4cf7208890..e217ccda55d0 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -289,6 +289,7 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
>  
>  struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
>  void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
> +void eeh_show_enabled(void);
>  void eeh_probe_devices(void);
>  int __init eeh_ops_register(struct eeh_ops *ops);
>  int __exit eeh_ops_unregister(const char *name);
> @@ -338,6 +339,8 @@ static inline bool eeh_enabled(void)
>          return false;
>  }
>  
> +static inline void eeh_show_enabled(void) { }
> +
>  static inline bool eeh_phb_enabled(void)
>  {
>  	return false;
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index b14d89547895..3dcff29cb9b3 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -163,6 +163,16 @@ static int __init eeh_setup(char *str)
>  }
>  __setup("eeh=", eeh_setup);
>  
> +void eeh_show_enabled(void)
> +{
> +	if (eeh_has_flag(EEH_FORCE_DISABLED))
> +		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (by eeh=off)\n");

The allcaps looks kind of stupid.

> +	else if (eeh_enabled())
> +		pr_info("EEH: PCI Enhanced I/O Error Handling ENABLED (capable adapter found)\n");
> +	else
> +		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (no capable adapter found)\n");

If we really have to keep these messages then make it say "no EEH
capable buses found" or something. EEH support has absolutely nothing
to do with the adapters and I'm not even sure we can get the "DISABLED"
message on PowerNV since the root port will always be there.

> +}
> +
>  /*
>   * This routine captures assorted PCI configuration space data
>   * for the indicated PCI device, and puts them into a buffer
> @@ -1166,11 +1176,7 @@ void eeh_probe_devices(void)
>  		pdn = hose->pci_data;
>  		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
>  	}
> -	if (eeh_enabled())
> -		pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
> -	else
> -		pr_info("EEH: No capable adapters found\n");
> -
> +	eeh_show_enabled();
>  }
>  
>  /**


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier
  2019-03-20  2:58 ` [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier Sam Bobroff
  2019-03-20  6:02   ` Alexey Kardashevskiy
@ 2019-04-18 10:13   ` Oliver O'Halloran
  2019-04-30  5:54     ` Sam Bobroff
  1 sibling, 1 reply; 31+ messages in thread
From: Oliver O'Halloran @ 2019-04-18 10:13 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev

On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote:
> The EEH address cache is currently initialized and populated by a
> single function: eeh_addr_cache_build().  While the initial population
> of the cache can only be done once resources are allocated,
> initialization (just setting up a spinlock) could be done much
> earlier.
> 
> So move the initialization step into a separate function and call it
> from a core_initcall (rather than a subsys initcall).
> 

> This will allow future work to make use of the cache during boot time
> PCI scanning.

What's the idea there? Checking for overlaps in the BAR assignments?

> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/eeh.h  |  3 +++
>  arch/powerpc/kernel/eeh.c       |  2 ++
>  arch/powerpc/kernel/eeh_cache.c | 13 +++++++++++--
>  3 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index e217ccda55d0..791b9e6fcc45 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -295,6 +295,7 @@ int __init eeh_ops_register(struct eeh_ops *ops);
>  int __exit eeh_ops_unregister(const char *name);
>  int eeh_check_failure(const volatile void __iomem *token);
>  int eeh_dev_check_failure(struct eeh_dev *edev);
> +void eeh_addr_cache_init(void);
>  void eeh_addr_cache_build(void);
>  void eeh_add_device_early(struct pci_dn *);
>  void eeh_add_device_tree_early(struct pci_dn *);
> @@ -362,6 +363,8 @@ static inline int eeh_check_failure(const volatile void __iomem *token)
>  
>  #define eeh_dev_check_failure(x) (0)
>  
> +static inline void eeh_addr_cache_init(void) { }
> +
>  static inline void eeh_addr_cache_build(void) { }
>  
>  static inline void eeh_add_device_early(struct pci_dn *pdn) { }
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 3dcff29cb9b3..7a406d58d2c0 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1219,6 +1219,8 @@ static int eeh_init(void)
>  	list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
>  		eeh_dev_phb_init_dynamic(hose);
>  
> +	eeh_addr_cache_init();
> +
>  	/* Initialize EEH event */
>  	return eeh_event_init();
>  }
> diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
> index 9c68f0837385..f93dd5cf6a39 100644
> --- a/arch/powerpc/kernel/eeh_cache.c
> +++ b/arch/powerpc/kernel/eeh_cache.c
> @@ -267,6 +267,17 @@ void eeh_addr_cache_rmv_dev(struct pci_dev *dev)
>  	spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
>  }
>  
> +/**
> + * eeh_addr_cache_init - Initialize a cache of I/O addresses
> + *
> + * Initialize a cache of pci i/o addresses.  This cache will be used to
> + * find the pci device that corresponds to a given address.
> + */
> +void eeh_addr_cache_init(void)
> +{
> +	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
> +}

You could move this out of the pci_io_addr_cache structure and use
DEFINE_SPINLOCK() too. We might even be able to get rid of the hand-
rolled interval tree in eeh_cache.c in favour of the generic
implementation (see mm/interval_tree.c). I'm pretty sure the generic
one is a drop-in replacement, but I haven't had a chance to have a
detailed look to see if there's any differences in behaviour.

> +
>  /**
>   * eeh_addr_cache_build - Build a cache of I/O addresses
>   *
> @@ -282,8 +293,6 @@ void eeh_addr_cache_build(void)
>  	struct eeh_dev *edev;
>  	struct pci_dev *dev = NULL;
>  
> -	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
> -
>  	for_each_pci_dev(dev) {
>  		pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
>  		if (!pdn)


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/8]
  2019-04-15  3:41   ` Sam Bobroff
@ 2019-04-19 22:36     ` Tyrel Datwyler
  0 siblings, 0 replies; 31+ messages in thread
From: Tyrel Datwyler @ 2019-04-19 22:36 UTC (permalink / raw)
  To: Sam Bobroff, Tyrel Datwyler; +Cc: linuxppc-dev

On 04/14/2019 08:41 PM, Sam Bobroff wrote:
> On Thu, Apr 11, 2019 at 05:55:33PM -0700, Tyrel Datwyler wrote:
>> On 03/19/2019 07:58 PM, Sam Bobroff wrote:
>>> Hi all,
>>>
>>> This patch set adds support for EEH recovery of hot plugged devices on pSeries
>>> machines. Specifically, devices discovered by PCI rescanning using
>>> /sys/bus/pci/rescan, which includes devices hotplugged by QEMU's device_add
>>> command. (pSeries doesn't currently use slot power control for hotplugging.)
>>
>> Slight nit that its not that pSeries doesn't support slot power control
>> hotplugging, its that QEMU pSeries guests don't support it. We most definitely
>> use the slot power control for hotplugging in PowerVM pSeries Linux guests. More
> 
> Ah, I think I see what you mean: pSeries can (and does!) use slot power
> control for hotplugging, it's just that Linux doesn't. Right :-) I'll
> change it to "Linux on pSeries doesn't...." for the next version.

Not quite. A pSeries Linux PowerVM LPAR does use the slot power hotplugging. It
is only the pSeries Linux qemu guest that doesn't. The way that hotplug is
initiated is different for PowerVM and qemu. On PowerVM the command is sent from
the HMC down the RSCT stack which calls drmgr whereas with qemu hotplug
generates an interrupt which logs an event that is picked up by rtas_errd which
then calls drmgr with the "-V" flag. That flag was a special case that we added
to drmgr to work around the slot power hotplugging with rpaphp not working
correctly with qemu guests.

> 
>> specifically we had to work around short comings in the rpaphp driver when
>> dealing with QEMU. This being that while at initial glance the design implies
>> that it had multiple devices per PHB in mind, it didn't, and only actually
>> supported a single slot per PHB. Further, when we developed the QEMU pci hotplug
>> feature we had to deal with only having a single PHB per QEMU guest and as a
>> result needed a way to plug multiple pci devices into a single PHB. Hence, came
>> the pci rescan work around in drmgr.
>>
>> Mike Roth and I have had discussions over the years to get the slot power
>> control hotplugging working properly with QEMU, and while I did get the RPA
>> hotplug driver fixed to register all available slots associated with a PHB, EEH
>> remained an issue. So, I'm very happy to see this patchset get that working with
>> the rescan work around.
>>
>>>
>>> As a side effect this also provides EEH support for devices removed by
>>> /sys/bus/pci/devices/*/remove and re-discovered by writing to /sys/bus/pci/rescan,
>>> on all platforms.
>>
>> +1, this seems like icing on the cake. ;)
> 
> Yes :-)
> 
> Although maybe I should mention that we can't really benefit from this
> on PowerNV *yet* because there seem to be some other problems with
> removing and re-scanning devices: in my tests devices are often unusable
> after being rediscovered.
> 
> (I'm hoping to take a look at that soon.)

Interesting.

> 
>>>
>>> The approach I've taken is to use the fact that the existing
>>> pcibios_bus_add_device() platform hooks (which are used to set up EEH on
>>> Virtual Function devices (VFs)) are actually called for all devices, so I've
>>> widened their scope and made other adjustments necessary to allow them to work
>>> for hotplugged and boot-time devices as well.
>>>
>>> Because some of the changes are in generic PowerPC code, it's
>>> possible that I've disturbed something for another PowerPC platform. I've tried
>>> to minimize this by leaving that code alone as much as possible and so there
>>> are a few cases where eeh_add_device_{early,late}() or eeh_add_sysfs_files() is
>>> called more than once. I think these can be looked at later, as duplicate calls
>>> are not harmful.
>>>
>>> The patch "Convert PNV_PHB_FLAG_EEH" isn't strictly necessary and I'm not sure
>>> if it's better to keep it, because it simplifies the code or drop it, because
>>> we may need a separate flag per PHB later on. Thoughts anyone?
>>>
>>> The first patch is a rework of the pcibios_init reordering patch I posted
>>> earlier, which I've included here because it's necessary for this set.
>>>
>>> I have done some testing for PowerNV on Power9 using a modified pnv_php module
>>> and some testing on pSeries with slot power control using a modified rpaphp
>>> module, and the EEH-related parts seem to work.
>>
>> I'm interested in what modifications with rpaphp. Its unclear if you are saying
>> rpaphp modified so that slot power hotplug works with a QEMU pSeries guest? If
>> thats the case it would be optimal to get that upstream and remove the work
>> rescan workaround for guests that don't need it.
> 
> Unfortunately no, I didn't do enough work to really get it working.  I
> just wanted to get an idea of how that code path interacted with the EEH
> code I was changing, so that hopefully when we get to fixing it, the EEH
> part will be easier to do.
> 
> The hack I tested with was:
> 
> - rtas_errd changed so that it doesn't pass -V to drmgr (-V seems to
>   trigger drmgr to use the PCI rescan system rather that slot power
>   control).

Correct, as I mentioned above we did that on purpose to basically hack around
rpaphp not working with qemu guests.

-Tyrel

> - of_pci_parse_addrs() changed so that if the assigned-addresses node
>   is missing (which it is when the guest is running under QEMU/KVM) we
>   call pci_setup_device() to configure the BARs.
> 
> It did look pretty good -- the EEH part may actually work fine once we get
> the rest sorted out.
> 
>>
>> -Tyrel
>>
>>>
>>> Cheers,
>>> Sam.
>>>
>>> Sam Bobroff (8):
>>>   powerpc/64: Adjust order in pcibios_init()
>>>   powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
>>>   powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
>>>   powerpc/eeh: Improve debug messages around device addition
>>>   powerpc/eeh: Add eeh_show_enabled()
>>>   powerpc/eeh: Initialize EEH address cache earlier
>>>   powerpc/eeh: EEH for pSeries hot plug
>>>   powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build()
>>>
>>>  arch/powerpc/include/asm/eeh.h               | 19 +++--
>>>  arch/powerpc/kernel/eeh.c                    | 33 ++++-----
>>>  arch/powerpc/kernel/eeh_cache.c              | 29 +-------
>>>  arch/powerpc/kernel/eeh_driver.c             |  4 ++
>>>  arch/powerpc/kernel/of_platform.c            |  3 +-
>>>  arch/powerpc/kernel/pci-common.c             |  4 --
>>>  arch/powerpc/kernel/pci_32.c                 |  4 ++
>>>  arch/powerpc/kernel/pci_64.c                 | 12 +++-
>>>  arch/powerpc/platforms/powernv/eeh-powernv.c | 41 +++++------
>>>  arch/powerpc/platforms/powernv/pci.c         |  7 +-
>>>  arch/powerpc/platforms/powernv/pci.h         |  2 -
>>>  arch/powerpc/platforms/pseries/eeh_pseries.c | 75 +++++++++++---------
>>>  arch/powerpc/platforms/pseries/pci.c         |  7 +-
>>>  13 files changed, 122 insertions(+), 118 deletions(-)
>>>
>>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
  2019-04-18  9:51   ` Oliver O'Halloran
@ 2019-04-30  5:30     ` Sam Bobroff
  0 siblings, 0 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-04-30  5:30 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 6586 bytes --]

On Thu, Apr 18, 2019 at 07:51:40PM +1000, Oliver O'Halloran wrote:
> On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote:
> > The PHB flag, PNV_PHB_FLAG_EEH, is set (on PowerNV) individually on
> > each PHB once the EEH subsystem is ready. It is the only use of the
> > flags member of the phb struct.
> > 
> > However there is no need to store this separately on each PHB, so
> > convert it to a global flag. For symmetry, the flag is now also set
> > for pSeries; although it is currently unused it may be useful in the
> > future.
> 
> I'd drop this patch and keep it as a seperate flag. Making this a
> global flag doesn't really buy us anything either. It's also worth
> remembering that we do have virtual PHBs, like the NVLink ones, that
> don't have real EEH support and we should be doing something more
> intelligent about that.
> 
> If you are going to remove the PNV_PHB flag then i would look at moving
> that flag into the generic pci_controller structure that we use for
> tracking PHBs since that would give us a way to toggle EEH support on a
> per PHB basis on pseries and powernv.

OK. I want to keep these changes as small as possible, so I'll try a
simpler approach and use the exsiting flag. (I only touched this area
because it's now necessary to set the EEH_ENABLED flag both at boot time
and after it.)

> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/include/asm/eeh.h               | 11 +++++++++++
> >  arch/powerpc/platforms/powernv/eeh-powernv.c | 14 +++-----------
> >  arch/powerpc/platforms/powernv/pci.c         |  7 +++----
> >  arch/powerpc/platforms/powernv/pci.h         |  2 --
> >  arch/powerpc/platforms/pseries/pci.c         |  4 ++++
> >  5 files changed, 21 insertions(+), 17 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index 3613a56281f2..fe4cf7208890 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > @@ -43,6 +43,7 @@ struct pci_dn;
> >  #define EEH_VALID_PE_ZERO	0x10	/* PE#0 is valid		     */
> >  #define EEH_ENABLE_IO_FOR_LOG	0x20	/* Enable IO for log		     */
> >  #define EEH_EARLY_DUMP_LOG	0x40	/* Dump log immediately		     */
> > +#define EEH_PHB_ENABLED		0x80	/* PHB recovery uses EEH	     */
> >  
> >  /*
> >   * Delay for PE reset, all in ms
> > @@ -245,6 +246,11 @@ static inline bool eeh_enabled(void)
> >  	return eeh_has_flag(EEH_ENABLED) && !eeh_has_flag(EEH_FORCE_DISABLED);
> >  }
> >  
> > +static inline bool eeh_phb_enabled(void)
> > +{
> > +	return eeh_has_flag(EEH_PHB_ENABLED);
> > +}
> > +
> >  static inline void eeh_serialize_lock(unsigned long *flags)
> >  {
> >  	raw_spin_lock_irqsave(&confirm_error_lock, *flags);
> > @@ -332,6 +338,11 @@ static inline bool eeh_enabled(void)
> >          return false;
> >  }
> >  
> > +static inline bool eeh_phb_enabled(void)
> > +{
> > +	return false;
> > +}
> > +
> >  static inline void eeh_probe_devices(void) { }
> >  
> >  static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
> > diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > index 6fc1a463b796..f0a95f663810 100644
> > --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> > +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > @@ -264,22 +264,14 @@ int pnv_eeh_post_init(void)
> >  		return ret;
> >  	}
> >  
> > -	if (!eeh_enabled())
> > +	if (eeh_enabled())
> > +		eeh_add_flag(EEH_PHB_ENABLED);
> > +	else
> >  		disable_irq(eeh_event_irq);
> >  
> >  	list_for_each_entry(hose, &hose_list, list_node) {
> >  		phb = hose->private_data;
> >  
> > -		/*
> > -		 * If EEH is enabled, we're going to rely on that.
> > -		 * Otherwise, we restore to conventional mechanism
> > -		 * to clear frozen PE during PCI config access.
> > -		 */
> > -		if (eeh_enabled())
> > -			phb->flags |= PNV_PHB_FLAG_EEH;
> > -		else
> > -			phb->flags &= ~PNV_PHB_FLAG_EEH;
> > -
> >  		/* Create debugfs entries */
> >  #ifdef CONFIG_DEBUG_FS
> >  		if (phb->has_dbgfs || !phb->dbgfs)
> > diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> > index 307181fd8a17..d2b50f3bf6b1 100644
> > --- a/arch/powerpc/platforms/powernv/pci.c
> > +++ b/arch/powerpc/platforms/powernv/pci.c
> > @@ -717,10 +717,9 @@ int pnv_pci_cfg_write(struct pci_dn *pdn,
> >  static bool pnv_pci_cfg_check(struct pci_dn *pdn)
> >  {
> >  	struct eeh_dev *edev = NULL;
> > -	struct pnv_phb *phb = pdn->phb->private_data;
> >  
> >  	/* EEH not enabled ? */
> > -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> > +	if (!eeh_phb_enabled())
> >  		return true;
> >  
> >  	/* PE reset or device removed ? */
> > @@ -761,7 +760,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
> >  
> >  	ret = pnv_pci_cfg_read(pdn, where, size, val);
> >  	phb = pdn->phb->private_data;
> > -	if (phb->flags & PNV_PHB_FLAG_EEH && pdn->edev) {
> > +	if (eeh_phb_enabled() && pdn->edev) {
> >  		if (*val == EEH_IO_ERROR_VALUE(size) &&
> >  		    eeh_dev_check_failure(pdn->edev))
> >                          return PCIBIOS_DEVICE_NOT_FOUND;
> > @@ -789,7 +788,7 @@ static int pnv_pci_write_config(struct pci_bus *bus,
> >  
> >  	ret = pnv_pci_cfg_write(pdn, where, size, val);
> >  	phb = pdn->phb->private_data;
> > -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> > +	if (!eeh_phb_enabled())
> >  		pnv_pci_config_check_eeh(pdn);
> >  
> >  	return ret;
> > diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> > index 8e36da379252..eb0add61397b 100644
> > --- a/arch/powerpc/platforms/powernv/pci.h
> > +++ b/arch/powerpc/platforms/powernv/pci.h
> > @@ -85,8 +85,6 @@ struct pnv_ioda_pe {
> >  	struct list_head	list;
> >  };
> >  
> > -#define PNV_PHB_FLAG_EEH	(1 << 0)
> > -
> >  struct pnv_phb {
> >  	struct pci_controller	*hose;
> >  	enum pnv_phb_type	type;
> > diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
> > index 37a77e57893e..7be80882c08d 100644
> > --- a/arch/powerpc/platforms/pseries/pci.c
> > +++ b/arch/powerpc/platforms/pseries/pci.c
> > @@ -244,6 +244,10 @@ void __init pSeries_final_fixup(void)
> >  
> >  	eeh_probe_devices();
> >  	eeh_addr_cache_build();
> > +#ifdef CONFIG_EEH
> > +	if (eeh_enabled())
> > +		eeh_add_flag(EEH_PHB_ENABLED);
> > +#endif
> >  
> >  #ifdef CONFIG_PCI_IOV
> >  	ppc_md.pcibios_sriov_enable = pseries_pcibios_sriov_enable;
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled()
  2019-04-18 10:01   ` Oliver O'Halloran
@ 2019-04-30  5:44     ` Sam Bobroff
  0 siblings, 0 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-04-30  5:44 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3490 bytes --]

On Thu, Apr 18, 2019 at 08:01:36PM +1000, Oliver O'Halloran wrote:
> On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote:
> > Move the EEH enabled message into it's own function so that future
> > work can call it from multiple places.
> > 
> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/include/asm/eeh.h |  3 +++
> >  arch/powerpc/kernel/eeh.c      | 16 +++++++++++-----
> >  2 files changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index fe4cf7208890..e217ccda55d0 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > @@ -289,6 +289,7 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
> >  
> >  struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
> >  void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
> > +void eeh_show_enabled(void);
> >  void eeh_probe_devices(void);
> >  int __init eeh_ops_register(struct eeh_ops *ops);
> >  int __exit eeh_ops_unregister(const char *name);
> > @@ -338,6 +339,8 @@ static inline bool eeh_enabled(void)
> >          return false;
> >  }
> >  
> > +static inline void eeh_show_enabled(void) { }
> > +
> >  static inline bool eeh_phb_enabled(void)
> >  {
> >  	return false;
> > diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> > index b14d89547895..3dcff29cb9b3 100644
> > --- a/arch/powerpc/kernel/eeh.c
> > +++ b/arch/powerpc/kernel/eeh.c
> > @@ -163,6 +163,16 @@ static int __init eeh_setup(char *str)
> >  }
> >  __setup("eeh=", eeh_setup);
> >  
> > +void eeh_show_enabled(void)
> > +{
> > +	if (eeh_has_flag(EEH_FORCE_DISABLED))
> > +		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (by eeh=off)\n");
> 
> The allcaps looks kind of stupid.

Argh, true!

> > +	else if (eeh_enabled())
> > +		pr_info("EEH: PCI Enhanced I/O Error Handling ENABLED (capable adapter found)\n");
> > +	else
> > +		pr_info("EEH: PCI Enhanced I/O Error Handling DISABLED (no capable adapter found)\n");
> 
> If we really have to keep these messages then make it say "no EEH
> capable buses found" or something. EEH support has absolutely nothing

I don't think it's critical, but I'd like something that lets you
determine if EEH_ENABLED is set or not and why, since it's helpful for
debugging.

And I know what you mean about EEH support and adapters, but (before
these patches anyway) if you don't have an EEH capable adapter in the
machine at boot time, you won't ever get EEH recovery because
EEH_ENABLED won't be set. (I think we want to clean that up, so this
probably only matters until we do that.)

> to do with the adapters and I'm not even sure we can get the "DISABLED"
> message on PowerNV since the root port will always be there.

Yes, you'd have to force it off via the command line.

Anyway, I'll try to improve the messages (and no more caps) :-)

> > +}
> > +
> >  /*
> >   * This routine captures assorted PCI configuration space data
> >   * for the indicated PCI device, and puts them into a buffer
> > @@ -1166,11 +1176,7 @@ void eeh_probe_devices(void)
> >  		pdn = hose->pci_data;
> >  		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
> >  	}
> > -	if (eeh_enabled())
> > -		pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
> > -	else
> > -		pr_info("EEH: No capable adapters found\n");
> > -
> > +	eeh_show_enabled();
> >  }
> >  
> >  /**
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier
  2019-04-18 10:13   ` Oliver O'Halloran
@ 2019-04-30  5:54     ` Sam Bobroff
  0 siblings, 0 replies; 31+ messages in thread
From: Sam Bobroff @ 2019-04-30  5:54 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 4606 bytes --]

On Thu, Apr 18, 2019 at 08:13:53PM +1000, Oliver O'Halloran wrote:
> On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote:
> > The EEH address cache is currently initialized and populated by a
> > single function: eeh_addr_cache_build().  While the initial population
> > of the cache can only be done once resources are allocated,
> > initialization (just setting up a spinlock) could be done much
> > earlier.
> > 
> > So move the initialization step into a separate function and call it
> > from a core_initcall (rather than a subsys initcall).
> > 
> 
> > This will allow future work to make use of the cache during boot time
> > PCI scanning.
> 
> What's the idea there? Checking for overlaps in the BAR assignments?

The idea is just to be able to populate the cache earlier during boot
time, because that allows more code to be consolidated:

Before this set, the pcibios_add_device handlers were called during boot
but they couldn't populate the cache because it wasn't set up. Instead,
the handlers did nothing and a separate sweep was done after setting up
the cache. (This is annoying because when devices are added after boot,
the pcibios_add_device handlers are the only place where the cache can
be populated.)

> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/include/asm/eeh.h  |  3 +++
> >  arch/powerpc/kernel/eeh.c       |  2 ++
> >  arch/powerpc/kernel/eeh_cache.c | 13 +++++++++++--
> >  3 files changed, 16 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index e217ccda55d0..791b9e6fcc45 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > @@ -295,6 +295,7 @@ int __init eeh_ops_register(struct eeh_ops *ops);
> >  int __exit eeh_ops_unregister(const char *name);
> >  int eeh_check_failure(const volatile void __iomem *token);
> >  int eeh_dev_check_failure(struct eeh_dev *edev);
> > +void eeh_addr_cache_init(void);
> >  void eeh_addr_cache_build(void);
> >  void eeh_add_device_early(struct pci_dn *);
> >  void eeh_add_device_tree_early(struct pci_dn *);
> > @@ -362,6 +363,8 @@ static inline int eeh_check_failure(const volatile void __iomem *token)
> >  
> >  #define eeh_dev_check_failure(x) (0)
> >  
> > +static inline void eeh_addr_cache_init(void) { }
> > +
> >  static inline void eeh_addr_cache_build(void) { }
> >  
> >  static inline void eeh_add_device_early(struct pci_dn *pdn) { }
> > diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> > index 3dcff29cb9b3..7a406d58d2c0 100644
> > --- a/arch/powerpc/kernel/eeh.c
> > +++ b/arch/powerpc/kernel/eeh.c
> > @@ -1219,6 +1219,8 @@ static int eeh_init(void)
> >  	list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
> >  		eeh_dev_phb_init_dynamic(hose);
> >  
> > +	eeh_addr_cache_init();
> > +
> >  	/* Initialize EEH event */
> >  	return eeh_event_init();
> >  }
> > diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
> > index 9c68f0837385..f93dd5cf6a39 100644
> > --- a/arch/powerpc/kernel/eeh_cache.c
> > +++ b/arch/powerpc/kernel/eeh_cache.c
> > @@ -267,6 +267,17 @@ void eeh_addr_cache_rmv_dev(struct pci_dev *dev)
> >  	spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
> >  }
> >  
> > +/**
> > + * eeh_addr_cache_init - Initialize a cache of I/O addresses
> > + *
> > + * Initialize a cache of pci i/o addresses.  This cache will be used to
> > + * find the pci device that corresponds to a given address.
> > + */
> > +void eeh_addr_cache_init(void)
> > +{
> > +	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
> > +}
> 
> You could move this out of the pci_io_addr_cache structure and use
> DEFINE_SPINLOCK() too. We might even be able to get rid of the hand-
> rolled interval tree in eeh_cache.c in favour of the generic
> implementation (see mm/interval_tree.c). I'm pretty sure the generic
> one is a drop-in replacement, but I haven't had a chance to have a
> detailed look to see if there's any differences in behaviour.

Sounds good, I'll try to take a look at doing it in a future patch :-)

> > +
> >  /**
> >   * eeh_addr_cache_build - Build a cache of I/O addresses
> >   *
> > @@ -282,8 +293,6 @@ void eeh_addr_cache_build(void)
> >  	struct eeh_dev *edev;
> >  	struct pci_dev *dev = NULL;
> >  
> > -	spin_lock_init(&pci_io_addr_cache_root.piar_lock);
> > -
> >  	for_each_pci_dev(dev) {
> >  		pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
> >  		if (!pdn)
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2019-04-30  5:55 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
2019-03-20  2:58 ` [PATCH 1/8] powerpc/64: Adjust order in pcibios_init() Sam Bobroff
2019-03-20  6:03   ` Alexey Kardashevskiy
2019-03-20  2:58 ` [PATCH 2/8] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-04-08  6:50     ` Sam Bobroff
2019-03-20  2:58 ` [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-04-09  1:41     ` Sam Bobroff
2019-04-18  9:51   ` Oliver O'Halloran
2019-04-30  5:30     ` Sam Bobroff
2019-03-20  2:58 ` [PATCH 4/8] powerpc/eeh: Improve debug messages around device addition Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-03-20  2:58 ` [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled() Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-03-20  6:24     ` Oliver
2019-04-09  3:30     ` Sam Bobroff
2019-04-18 10:01   ` Oliver O'Halloran
2019-04-30  5:44     ` Sam Bobroff
2019-03-20  2:58 ` [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-04-18 10:13   ` Oliver O'Halloran
2019-04-30  5:54     ` Sam Bobroff
2019-03-20  2:58 ` [PATCH 7/8] powerpc/eeh: EEH for pSeries hot plug Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-03-20  2:58 ` [PATCH 8/8] powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build() Sam Bobroff
2019-03-20  6:05   ` Alexey Kardashevskiy
2019-04-09  3:31     ` Sam Bobroff
2019-04-12  0:55 ` [PATCH 0/8] Tyrel Datwyler
2019-04-15  3:41   ` Sam Bobroff
2019-04-19 22:36     ` Tyrel Datwyler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).