* PCIPOCALYPSE
@ 2019-11-20  1:28 Oliver O'Halloran
  2019-11-20  1:28 ` [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset Oliver O'Halloran
                   ` (45 more replies)
  0 siblings, 46 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, s.miroshnichenko

This series does a few things and probably needs to be split into two or
three smaller ones. I figured I'd post it as-is since I'm sick of sitting
on it and some people wanted to take a look at it. There are three
parts:

1) Reworking EEH to move the "pseudo-generic" code into the platform backend.
2) Moving the point where we do PE assignments for PCIe devices out of
   pcibios_setup_bridge() and into pcibios_bus_add_device().
3) Killing the use of pci_dn in powernv entirely.

It used to be a (much) longer series, but bits and pieces have been
upstreamed or at least posted to the list so I've omitted most of the
pre-reqs. Here is a tree you can build based on today's -next with
everything in it:

	https://github.com/oohal/linux/tree/eeh-no-pdn-working

Keep in mind this is all pretty raw and I've tested it on precisely one
P8 PowerNV system. Things not tested:

 -> pseries (not recently anyway)
 -> CAPI
 -> OpenCAPI
 -> Any kind of NVLink

The main TODO is to finish what was started in 2) so that we handle PE
assignments, IOMMU configuration, etc in the same place for each PHB
type. Right now there are three distinct paths:

 1) For normal IODA PHBs (PHB3 and 4) the PE we can assign a device to is
 pinned by the location of its MMIO BARs. How this is handled depends on
 whether the device is a VF or not, so the two sub cases are:

  a) For normal devices all the devices under a bridge are assigned to a
     PE in a walk done after configuring the bridge window. This causes a
     pile of weird edge cases when a PCI device is removed without also
     removing its parent bridge.

  b) For VFs, PEs (and MMIO BARs) are assigned when we call sriov_enable() on
     the PF and we "fix up" the software state later on. As a result there's
     some IOMMU group stuff that happens in a bus notifier which runs after
     adding the device to a bus.

 2) For bullshit IODA PHBs (OpenCAPI / NVLink) there is no MMIO pinning
    so we can assign a BDFN to an arbitrary PE. For devices under those
    PHBs, PEs are assigned in a per-PHB fixup that runs only once at boot,
    just after the PHB is probed. There doesn't seem to be a good
    reason for this and the lack of pinning means we should be able to
    do it whenever.

This series fixes 1a) by moving the PE assignment into
pcibios_bus_add_device(), which is run per-device. With that change
fixing the other two cases should be relatively straightforward. VFs
will probably still require some special casing since their setup
works differently to normal PCI devices, but we should be able to do
better than the current trainwreck of random hacks occurring in random
places.




* [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-21  3:38   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 02/46] powernv/pci: Add helper to find ioda_pe from BDFN Oliver O'Halloran
                   ` (44 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

After resetting a VF we call eeh_restore_vf_config() to restore several
registers in the VF's config space. For physical functions this is normally
handled by the pci_reinit_device() OPAL call which requests firmware to
re-program the device with whatever defaults were set at boot time. We
can't use that for VFs since OPAL (being firmware) doesn't know (or care)
about VFs.

However, the fields that are restored here are all marked as reserved for
VFs in the SR-IOV spec. In other words, eeh_restore_vf_config() doesn't
actually do anything.

There is an argument to be made that we should be saving and restoring
some of these fields since they are marked as "Reserved, but Preserve"
(ResvP) to allow these fields to be used in new versions of the SR-IOV spec.
However, the current code doesn't even do that properly since it assumes
they can be set to whatever the EEH core has assumed to be correct. If
the fields *are* used in future versions of the SR-IOV spec this code
is still broken since it doesn't take into account any changes made
by the driver, or the Linux IOV core.

Given the above, just delete the code. It's broken, it's misleading,
and it's getting in the way of doing useful cleanups.
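
For anyone unfamiliar with what the MPS restore being removed was doing,
here is a standalone sketch of the encoding it relied on: the Max Payload
Size lives in bits 7:5 of the PCIe Device Control register, encoded as
log2(bytes) - 7. The sketch_* names are made up for illustration and are
not kernel symbols. The removed code used ffs() for the same calculation.

```c
#include <assert.h>

#define SKETCH_DEVCTL_PAYLOAD 0x00e0u	/* Max Payload Size, bits 7:5 */

/*
 * Return devctl with its MPS field replaced by the encoding for
 * mps_bytes, which must be a power of two between 128 and 4096.
 */
static unsigned int sketch_set_mps(unsigned int devctl, unsigned int mps_bytes)
{
	unsigned int log2 = 0;

	assert(mps_bytes >= 128 && mps_bytes <= 4096);
	assert((mps_bytes & (mps_bytes - 1)) == 0);

	/* Count how many times we can halve before reaching 1 */
	while ((mps_bytes >> log2) > 1)
		log2++;

	devctl &= ~SKETCH_DEVCTL_PAYLOAD;
	devctl |= (log2 - 7) << 5;	/* 128 bytes encodes as 0 */
	return devctl;
}
```

Since that field is reserved in a VF's Device Control register, writing
it on a VF achieves nothing, which is the point made above.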

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/kernel/eeh.c                    | 59 --------------------
 arch/powerpc/platforms/powernv/eeh-powernv.c | 39 +++----------
 arch/powerpc/platforms/pseries/eeh_pseries.c | 26 +--------
 3 files changed, 8 insertions(+), 116 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index ae0a9c421d7b..a3b93db972fc 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -742,65 +742,6 @@ static void eeh_restore_dev_state(struct eeh_dev *edev, void *userdata)
 		pci_restore_state(pdev);
 }
 
-int eeh_restore_vf_config(struct pci_dn *pdn)
-{
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
-	u32 devctl, cmd, cap2, aer_capctl;
-	int old_mps;
-
-	if (edev->pcie_cap) {
-		/* Restore MPS */
-		old_mps = (ffs(pdn->mps) - 8) << 5;
-		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
-				     2, &devctl);
-		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
-		devctl |= old_mps;
-		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
-				      2, devctl);
-
-		/* Disable Completion Timeout if possible */
-		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP2,
-				     4, &cap2);
-		if (cap2 & PCI_EXP_DEVCAP2_COMP_TMOUT_DIS) {
-			eeh_ops->read_config(pdn,
-					     edev->pcie_cap + PCI_EXP_DEVCTL2,
-					     4, &cap2);
-			cap2 |= PCI_EXP_DEVCTL2_COMP_TMOUT_DIS;
-			eeh_ops->write_config(pdn,
-					      edev->pcie_cap + PCI_EXP_DEVCTL2,
-					      4, cap2);
-		}
-	}
-
-	/* Enable SERR and parity checking */
-	eeh_ops->read_config(pdn, PCI_COMMAND, 2, &cmd);
-	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
-	eeh_ops->write_config(pdn, PCI_COMMAND, 2, cmd);
-
-	/* Enable report various errors */
-	if (edev->pcie_cap) {
-		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
-				     2, &devctl);
-		devctl &= ~PCI_EXP_DEVCTL_CERE;
-		devctl |= (PCI_EXP_DEVCTL_NFERE |
-			   PCI_EXP_DEVCTL_FERE |
-			   PCI_EXP_DEVCTL_URRE);
-		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
-				      2, devctl);
-	}
-
-	/* Enable ECRC generation and check */
-	if (edev->pcie_cap && edev->aer_cap) {
-		eeh_ops->read_config(pdn, edev->aer_cap + PCI_ERR_CAP,
-				     4, &aer_capctl);
-		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
-		eeh_ops->write_config(pdn, edev->aer_cap + PCI_ERR_CAP,
-				      4, aer_capctl);
-	}
-
-	return 0;
-}
-
 /**
  * pcibios_set_pcie_reset_state - Set PCI-E reset state
  * @dev: pci device struct
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index ef727ecd99cd..b2ac4130fda7 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1649,20 +1649,13 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
 	if (!edev)
 		return -EEXIST;
 
-	/*
-	 * We have to restore the PCI config space after reset since the
-	 * firmware can't see SRIOV VFs.
-	 *
-	 * FIXME: The MPS, error routing rules, timeout setting are worthy
-	 * to be exported by firmware in extendible way.
-	 */
-	if (edev->physfn) {
-		ret = eeh_restore_vf_config(pdn);
-	} else {
-		phb = pdn->phb->private_data;
-		ret = opal_pci_reinit(phb->opal_id,
-				      OPAL_REINIT_PCI_DEV, config_addr);
-	}
+	/* Nothing to do for VFs */
+	if (edev->physfn)
+		return 0;
+
+	phb = pdn->phb->private_data;
+	ret = opal_pci_reinit(phb->opal_id,
+			      OPAL_REINIT_PCI_DEV, config_addr);
 
 	if (ret) {
 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
@@ -1691,24 +1684,6 @@ static struct eeh_ops pnv_eeh_ops = {
 	.notify_resume		= NULL
 };
 
-#ifdef CONFIG_PCI_IOV
-static void pnv_pci_fixup_vf_mps(struct pci_dev *pdev)
-{
-	struct pci_dn *pdn = pci_get_pdn(pdev);
-	int parent_mps;
-
-	if (!pdev->is_virtfn)
-		return;
-
-	/* Synchronize MPS for VF and PF */
-	parent_mps = pcie_get_mps(pdev->physfn);
-	if ((128 << pdev->pcie_mpss) >= parent_mps)
-		pcie_set_mps(pdev, parent_mps);
-	pdn->mps = pcie_get_mps(pdev);
-}
-DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_mps);
-#endif /* CONFIG_PCI_IOV */
-
 /**
  * eeh_powernv_init - Register platform dependent EEH operations
  *
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 95bbf9102584..fa704d7052ec 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -657,30 +657,6 @@ static int pseries_eeh_write_config(struct pci_dn *pdn, int where, int size, u32
 	return rtas_write_config(pdn, where, size, val);
 }
 
-static int pseries_eeh_restore_config(struct pci_dn *pdn)
-{
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
-	s64 ret = 0;
-
-	if (!edev)
-		return -EEXIST;
-
-	/*
-	 * FIXME: The MPS, error routing rules, timeout setting are worthy
-	 * to be exported by firmware in extendible way.
-	 */
-	if (edev->physfn)
-		ret = eeh_restore_vf_config(pdn);
-
-	if (ret) {
-		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
-			__func__, edev->pe_config_addr, ret);
-		return -EIO;
-	}
-
-	return ret;
-}
-
 #ifdef CONFIG_PCI_IOV
 int pseries_send_allow_unfreeze(struct pci_dn *pdn,
 				u16 *vf_pe_array, int cur_vfs)
@@ -786,7 +762,7 @@ static struct eeh_ops pseries_eeh_ops = {
 	.read_config		= pseries_eeh_read_config,
 	.write_config		= pseries_eeh_write_config,
 	.next_error		= NULL,
-	.restore_config		= pseries_eeh_restore_config,
+	.restore_config		= NULL,
 #ifdef CONFIG_PCI_IOV
 	.notify_resume		= pseries_notify_resume
 #endif
-- 
2.21.0



* [Very RFC 02/46] powernv/pci: Add helper to find ioda_pe from BDFN
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
  2019-11-20  1:28 ` [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-20  1:28 ` [Very RFC 03/46] powernv/pci: Remove dma_dev_setup() for NPU PHBs Oliver O'Halloran
                   ` (43 subsequent siblings)
  45 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Linux has a look-up table for mapping BDFNs to PEs which is updated when we
call into OPAL to update the PHB's internal BDFN<->PE mapping. We can use
this table to find the PE for a device without needing to use the cached
value inside the pci_dn.

We'd like to get rid of pci_dn eventually so this patch adds a helper
to find the ioda_pe of a BDFN based on the table. This is different to the
existing helper which takes a pci_dev directly because there are some
contexts (e.g. EEH recovery) where we need to check for an existing PE
assignment when no corresponding pci_dev exists.
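
As a rough illustration of the idea (not the kernel code itself), a
table-only lookup can be sketched in standalone C as below. The sketch_*
types, sizes and the invalid-PE sentinel are stand-ins invented for the
example. The real structures are pnv_phb and pnv_ioda_pe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_PE	(-1)	/* stand-in for IODA_INVALID_PE */
#define SKETCH_NR_BDFN		0x10000	/* 8-bit bus x 8-bit devfn */
#define SKETCH_NR_PE		8	/* arbitrary for the example */

struct sketch_pe {
	int pe_number;
};

struct sketch_phb {
	int pe_rmap[SKETCH_NR_BDFN];	/* BDFN -> PE number */
	struct sketch_pe pe_array[SKETCH_NR_PE];
};

/* A BDFN is the bus number in the high byte, devfn in the low byte */
static uint16_t sketch_bdfn(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

static void sketch_phb_init(struct sketch_phb *phb)
{
	for (size_t i = 0; i < SKETCH_NR_BDFN; i++)
		phb->pe_rmap[i] = SKETCH_INVALID_PE;
	for (int i = 0; i < SKETCH_NR_PE; i++)
		phb->pe_array[i].pe_number = i;
}

/* Record a mapping, as would happen alongside the firmware call */
static void sketch_map_bdfn(struct sketch_phb *phb, uint16_t bdfn, int pe)
{
	assert(pe >= 0 && pe < SKETCH_NR_PE);
	phb->pe_rmap[bdfn] = pe;
}

/* Table-only lookup: no pci_dev or pci_dn is needed */
static struct sketch_pe *sketch_get_pe(struct sketch_phb *phb, uint16_t bdfn)
{
	int pe = phb->pe_rmap[bdfn];

	if (pe == SKETCH_INVALID_PE)
		return NULL;
	return &phb->pe_array[pe];
}
```

A caller that only knows the bus and devfn can then do
sketch_get_pe(phb, sketch_bdfn(bus, devfn)) and treat NULL as "no PE
assigned yet".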

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 10 ++++++++++
 arch/powerpc/platforms/powernv/pci.h      |  1 +
 2 files changed, 11 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index fdacf98555e9..65b5b121ebad 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -660,6 +660,16 @@ static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
 	return state;
 }
 
+struct pnv_ioda_pe *__pnv_ioda_get_pe(struct pnv_phb *phb, u16 bdfn)
+{
+	int pe_number = phb->ioda.pe_rmap[bdfn];
+
+	if (pe_number == IODA_INVALID_PE)
+		return NULL;
+
+	return &phb->ioda.pe_array[pe_number];
+}
+
 struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev)
 {
 	struct pci_controller *hose = pci_bus_to_host(dev->bus);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index f914f0b14e4e..01a01739c03e 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -193,6 +193,7 @@ extern void pnv_pci_dma_dev_setup(struct pci_dev *pdev);
 extern void pnv_pci_dma_bus_setup(struct pci_bus *bus);
 extern int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type);
 extern void pnv_teardown_msi_irqs(struct pci_dev *pdev);
+extern struct pnv_ioda_pe *__pnv_ioda_get_pe(struct pnv_phb *phb, u16 bdfn);
 extern struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev);
 extern void pnv_set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq);
 extern unsigned long pnv_pci_ioda2_get_table_size(__u32 page_shift,
-- 
2.21.0



* [Very RFC 03/46] powernv/pci: Remove dma_dev_setup() for NPU PHBs
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
  2019-11-20  1:28 ` [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset Oliver O'Halloran
  2019-11-20  1:28 ` [Very RFC 02/46] powernv/pci: Add helper to find ioda_pe from BDFN Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-21  3:57   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c Oliver O'Halloran
                   ` (42 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

pnv_pci_dma_dev_setup() only does something when:

1) The PHB contains VFs, or
2) The PHB defines a dma_dev_setup() callback in the pnv_phb structure.

Neither is true for NPU PHBs, so don't set the callback in the pci_controller_ops.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 65b5b121ebad..099c0bb1a9b9 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3652,7 +3652,6 @@ static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
 };
 
 static const struct pci_controller_ops pnv_npu_ioda_controller_ops = {
-	.dma_dev_setup		= pnv_pci_dma_dev_setup,
 	.setup_msi_irqs		= pnv_setup_msi_irqs,
 	.teardown_msi_irqs	= pnv_teardown_msi_irqs,
 	.enable_device_hook	= pnv_pci_enable_device_hook,
-- 
2.21.0



* [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (2 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 03/46] powernv/pci: Remove dma_dev_setup() for NPU PHBs Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-21  4:02   ` Alexey Kardashevskiy
  2019-11-21  7:46   ` Christoph Hellwig
  2019-11-20  1:28 ` [Very RFC 05/46] powernv/pci: Remove the pnv_phb dma_dev_setup callback Oliver O'Halloran
                   ` (41 subsequent siblings)
  45 siblings, 2 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

These functions are only used from pci-ioda.c. Move them in there and remove
the prototypes from the header files.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 43 +++++++++++++++++++++++
 arch/powerpc/platforms/powernv/pci.c      | 43 -----------------------
 arch/powerpc/platforms/powernv/pci.h      |  2 --
 3 files changed, 43 insertions(+), 45 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 099c0bb1a9b9..c2b3a5a13004 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3637,6 +3637,49 @@ static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
 		       OPAL_ASSERT_RESET);
 }
 
+void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+#ifdef CONFIG_PCI_IOV
+	struct pnv_ioda_pe *pe;
+
+	/* Fix the VF pdn PE number */
+	if (pdev->is_virtfn) {
+		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
+			if (pe->rid == ((pdev->bus->number << 8) |
+			    (pdev->devfn & 0xff))) {
+				pe->pdev = pdev;
+				break;
+			}
+		}
+	}
+#endif /* CONFIG_PCI_IOV */
+
+	if (phb && phb->dma_dev_setup)
+		phb->dma_dev_setup(phb, pdev);
+}
+
+void pnv_pci_dma_bus_setup(struct pci_bus *bus)
+{
+	struct pci_controller *hose = bus->sysdata;
+	struct pnv_phb *phb = hose->private_data;
+	struct pnv_ioda_pe *pe;
+
+	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
+		if (!(pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
+			continue;
+
+		if (!pe->pbus)
+			continue;
+
+		if (bus->number == ((pe->rid >> 8) & 0xFF)) {
+			pe->pbus = bus;
+			break;
+		}
+	}
+}
+
 static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
 	.dma_dev_setup		= pnv_pci_dma_dev_setup,
 	.dma_bus_setup		= pnv_pci_dma_bus_setup,
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index b7761e2e06f8..8b9058b52575 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -810,49 +810,6 @@ struct iommu_table *pnv_pci_table_alloc(int nid)
 	return tbl;
 }
 
-void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
-{
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
-#ifdef CONFIG_PCI_IOV
-	struct pnv_ioda_pe *pe;
-
-	/* Fix the VF pdn PE number */
-	if (pdev->is_virtfn) {
-		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
-			if (pe->rid == ((pdev->bus->number << 8) |
-			    (pdev->devfn & 0xff))) {
-				pe->pdev = pdev;
-				break;
-			}
-		}
-	}
-#endif /* CONFIG_PCI_IOV */
-
-	if (phb && phb->dma_dev_setup)
-		phb->dma_dev_setup(phb, pdev);
-}
-
-void pnv_pci_dma_bus_setup(struct pci_bus *bus)
-{
-	struct pci_controller *hose = bus->sysdata;
-	struct pnv_phb *phb = hose->private_data;
-	struct pnv_ioda_pe *pe;
-
-	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
-		if (!(pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
-			continue;
-
-		if (!pe->pbus)
-			continue;
-
-		if (bus->number == ((pe->rid >> 8) & 0xFF)) {
-			pe->pbus = bus;
-			break;
-		}
-	}
-}
-
 struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev)
 {
 	struct pci_controller *hose = pci_bus_to_host(dev->bus);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 01a01739c03e..f23145575048 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -189,8 +189,6 @@ extern void pnv_npu2_map_lpar(struct pnv_ioda_pe *gpe, unsigned long msr);
 extern void pnv_pci_reset_secondary_bus(struct pci_dev *dev);
 extern int pnv_eeh_phb_reset(struct pci_controller *hose, int option);
 
-extern void pnv_pci_dma_dev_setup(struct pci_dev *pdev);
-extern void pnv_pci_dma_bus_setup(struct pci_bus *bus);
 extern int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type);
 extern void pnv_teardown_msi_irqs(struct pci_dev *pdev);
 extern struct pnv_ioda_pe *__pnv_ioda_get_pe(struct pnv_phb *phb, u16 bdfn);
-- 
2.21.0



* [Very RFC 05/46] powernv/pci: Remove the pnv_phb dma_dev_setup callback
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (3 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-21  4:03   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov() Oliver O'Halloran
                   ` (40 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

This is only ever set for IODA PHBs. The only call site is in
pnv_pci_dma_dev_setup(), which is also only used by normal IODA PHBs, so remove
the callback in favour of a direct call.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 4 +---
 arch/powerpc/platforms/powernv/pci.h      | 1 -
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index c2b3a5a13004..45f974258766 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3656,8 +3656,7 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 	}
 #endif /* CONFIG_PCI_IOV */
 
-	if (phb && phb->dma_dev_setup)
-		phb->dma_dev_setup(phb, pdev);
+	pnv_pci_ioda_dma_dev_setup(phb, pdev);
 }
 
 void pnv_pci_dma_bus_setup(struct pci_bus *bus)
@@ -3940,7 +3939,6 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 		hose->controller_ops = pnv_npu_ocapi_ioda_controller_ops;
 		break;
 	default:
-		phb->dma_dev_setup = pnv_pci_ioda_dma_dev_setup;
 		hose->controller_ops = pnv_pci_ioda_controller_ops;
 	}
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index f23145575048..3c33a0c91a69 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -108,7 +108,6 @@ struct pnv_phb {
 	int (*msi_setup)(struct pnv_phb *phb, struct pci_dev *dev,
 			 unsigned int hwirq, unsigned int virq,
 			 unsigned int is_64, struct msi_msg *msg);
-	void (*dma_dev_setup)(struct pnv_phb *phb, struct pci_dev *pdev);
 	int (*init_m64)(struct pnv_phb *phb);
 	int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
 	void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
-- 
2.21.0



* [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (4 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 05/46] powernv/pci: Remove the pnv_phb dma_dev_setup callback Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-21  4:34   ` Alexey Kardashevskiy
  2019-11-21  7:48   ` Christoph Hellwig
  2019-11-20  1:28 ` [Very RFC 07/46] powernv/pci: Rework IODA PE device accounting Oliver O'Halloran
                   ` (39 subsequent siblings)
  45 siblings, 2 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Move this out of the PHB's dma_dev_setup() callback and into the
ppc_md.pcibios_fixup_sriov callback. This ensures that the VF PE's
pdev pointer is always valid for the whole time the device is
added to the bus.

This isn't strictly required, but it's a slightly more logical
place to do the fixup and it makes dma_dev_setup a bit simpler.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 35 +++++++++++------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 45f974258766..c6ea7a504e04 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2910,9 +2910,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	struct pci_dn *pdn;
 	int mul, total_vfs;
 
-	if (!pdev->is_physfn || pci_dev_is_added(pdev))
-		return;
-
 	pdn = pci_get_pdn(pdev);
 	pdn->vfs_expanded = 0;
 	pdn->m64_single_mode = false;
@@ -2987,6 +2984,22 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		res->end = res->start - 1;
 	}
 }
+
+static void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
+{
+	if (WARN_ON(pci_dev_is_added(pdev)))
+		return;
+
+	if (pdev->is_virtfn) {
+		/* Fix the VF PE's pdev pointer */
+		struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
+		pe->pdev = pdev;
+
+		WARN_ON(!(pe->flags & PNV_IODA_PE_VF));
+	} else if (pdev->is_physfn) {
+		pnv_pci_ioda_fixup_iov_resources(pdev);
+	}
+}
 #endif /* CONFIG_PCI_IOV */
 
 static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe,
@@ -3641,20 +3654,6 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 {
 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	struct pnv_phb *phb = hose->private_data;
-#ifdef CONFIG_PCI_IOV
-	struct pnv_ioda_pe *pe;
-
-	/* Fix the VF pdn PE number */
-	if (pdev->is_virtfn) {
-		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
-			if (pe->rid == ((pdev->bus->number << 8) |
-			    (pdev->devfn & 0xff))) {
-				pe->pdev = pdev;
-				break;
-			}
-		}
-	}
-#endif /* CONFIG_PCI_IOV */
 
 	pnv_pci_ioda_dma_dev_setup(phb, pdev);
 }
@@ -3945,7 +3944,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	ppc_md.pcibios_default_alignment = pnv_pci_default_alignment;
 
 #ifdef CONFIG_PCI_IOV
-	ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov_resources;
+	ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov;
 	ppc_md.pcibios_iov_resource_alignment = pnv_pci_iov_resource_alignment;
 	ppc_md.pcibios_sriov_enable = pnv_pcibios_sriov_enable;
 	ppc_md.pcibios_sriov_disable = pnv_pcibios_sriov_disable;
-- 
2.21.0



* [Very RFC 07/46] powernv/pci: Rework IODA PE device accounting
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (5 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-21  5:48   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 08/46] powerpc/eeh: Calculate VF index rather than looking it up in pci_dn Oliver O'Halloran
                   ` (38 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

The current process for configuring the IODA PEs for normal PCI devices is
a bit stupid. After assigning MMIO resources for the devices on a bus the
actual PE assignment occurs when pcibios_setup_bridge() is called for the
parent bridge. In pcibios_setup_bridge() we:

1. Assign all 256 devfns for the subordinate bus to the PE corresponding
   to the bridge window.
2. Initialise the IOMMU tables for that PE.
3. Traverse each device on the bus below the bridge setting the IOMMU table
   pointer to point at the PE's table.
4. Finally, set pci_dn->pe_number to indicate that we've done the
   per-device setup and allow EEH and the platform code to look up
   the PE number.

Although mostly functional, there are a couple of issues with this approach.
The most glaring is that it mixes the per-bus (per-PE really) setup with
the per-device setup in a way that's completely asymmetric to what happens
when tearing down a device.

In step 4 the number of devices in the PE is counted and stored in the
ioda_pe structure. When releasing a pci_dev the device count is dropped
until it hits zero, at which point the ioda_pe itself is torn down. However,
the bus itself remains active and can be re-scanned to bring back the
devices that were removed. On a rescan we do not re-run the bridge setup so
the per-device setup is never re-done, which results in the re-scanned
devices being unusable.

There are a few other minor issues too. Associating all 256 devfns with
the PE means that config accesses to non-existent PCI devices result
in spurious PE freezes. We currently prevent this by only allowing config
accesses to a BDFN when there is a corresponding pci_dn structure. We
would like to eliminate that restriction in the future though.

All that said, the biggest issue is that the current behaviour is hard to
follow at the best of times. On top of that the behaviour is slightly (or
majorly) different across each PHB type (PCIe, OpenCAPI, NVLink) and the
behaviour for physical devices (described above) and virtual functions is
again different. To address this we want to merge the paths as much as
possible so that the PowerNV specific PCI initialisation steps all occur
at roughly the same point in the PCI device setup path.

We can address most of these problems by moving the PE setup out of
pcibios_setup_bridge() and into pcibios_bus_add_device(). The latter is
called on a per-device basis so we have some symmetry between the setup and
teardown paths. Moving the PE assignments to here should also allow us to
converge how PE assignment works on all PHB types so it's always done in
one place.
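
To make the symmetry argument concrete, here is a minimal standalone
sketch of per-device accounting. It is not the code in this patch: the
sketch_* names and the "configured" flag (standing in for the IOMMU table
and other PE state) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_pe {
	int device_count;	/* devices currently attached to the PE */
	bool configured;	/* stands in for IOMMU tables, etc. */
};

/* Per-device add: (re)configure the PE on first use, then count */
static void sketch_device_add(struct sketch_pe *pe)
{
	if (!pe->configured)
		pe->configured = true;
	pe->device_count++;
}

/* Per-device release: tear the PE down when the last device goes */
static void sketch_device_release(struct sketch_pe *pe)
{
	assert(pe->device_count > 0);
	if (--pe->device_count == 0)
		pe->configured = false;
}
```

Because the add path is the one that does the setup, removing every
device and then rescanning the bus ends up with a configured PE again,
rather than depending on a bridge setup that never re-runs.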

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 112 +++++++++++-----------
 1 file changed, 58 insertions(+), 54 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index c6ea7a504e04..c74521e5f3ab 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -51,6 +51,7 @@ static const char * const pnv_phb_names[] = { "IODA1", "IODA2", "NPU_NVLINK",
 					      "NPU_OCAPI" };
 
 static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable);
+static void pnv_pci_configure_bus(struct pci_bus *bus);
 
 void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
 			    const char *fmt, ...)
@@ -1104,34 +1105,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
 	return pe;
 }
 
-static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
-{
-	struct pci_dev *dev;
-
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		struct pci_dn *pdn = pci_get_pdn(dev);
-
-		if (pdn == NULL) {
-			pr_warn("%s: No device node associated with device !\n",
-				pci_name(dev));
-			continue;
-		}
-
-		/*
-		 * In partial hotplug case, the PCI device might be still
-		 * associated with the PE and needn't attach it to the PE
-		 * again.
-		 */
-		if (pdn->pe_number != IODA_INVALID_PE)
-			continue;
-
-		pe->device_count++;
-		pdn->pe_number = pe->pe_number;
-		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
-			pnv_ioda_setup_same_PE(dev->subordinate, pe);
-	}
-}
-
 /*
  * There're 2 types of PCI bus sensitive PEs: One that is compromised of
  * single PCI bus. Another one that contains the primary PCI bus and its
@@ -1152,7 +1125,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
 	pe_num = phb->ioda.pe_rmap[bus->number << 8];
 	if (pe_num != IODA_INVALID_PE) {
 		pe = &phb->ioda.pe_array[pe_num];
-		pnv_ioda_setup_same_PE(bus, pe);
 		return NULL;
 	}
 
@@ -1196,9 +1168,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
 		return NULL;
 	}
 
-	/* Associate it with all child devices */
-	pnv_ioda_setup_same_PE(bus, pe);
-
 	/* Put PE to the list */
 	list_add_tail(&pe->list, &phb->ioda.pe_list);
 
@@ -1758,23 +1727,20 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 	struct pnv_ioda_pe *pe;
 
-	/*
-	 * The function can be called while the PE#
-	 * hasn't been assigned. Do nothing for the
-	 * case.
-	 */
-	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
-		return;
-
 	pe = &phb->ioda.pe_array[pdn->pe_number];
 	WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
 	pdev->dev.archdata.dma_offset = pe->tce_bypass_base;
 	set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
+
+	pe->device_count++;
+
 	/*
 	 * Note: iommu_add_device() will fail here as
 	 * for physical PE: the device is already added by now;
 	 * for virtual PE: sysfs entries are not ready yet and
 	 * tce_iommu_bus_notifier will add the device to a group later.
+	 *
+	 * XXX: this is wrong since the re-ordering patch.
 	 */
 }
 
@@ -2288,9 +2254,6 @@ static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb,
 	pe->table_group.tce32_size = tbl->it_size << tbl->it_page_shift;
 	iommu_init_table(tbl, phb->hose->node, 0, 0);
 
-	if (pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL))
-		pnv_ioda_setup_bus_dma(pe, pe->pbus);
-
 	return;
  fail:
 	/* XXX Failure: Try to fallback to 64-bit only ? */
@@ -2626,9 +2589,9 @@ static void pnv_pci_ioda_setup_iommu_api(void)
 	/*
 	 * There are 4 types of PEs:
 	 * - PNV_IODA_PE_BUS: a downstream port with an adapter,
-	 *   created from pnv_pci_setup_bridge();
+	 *   created from pnv_pci_configure_bus();
 	 * - PNV_IODA_PE_BUS_ALL: a PCI-PCIX bridge with devices behind it,
-	 *   created from pnv_pci_setup_bridge();
+	 *   created from pnv_pci_configure_bus();
 	 * - PNV_IODA_PE_VF: a SRIOV virtual function,
 	 *   created from pnv_pcibios_sriov_enable();
 	 * - PNV_IODA_PE_DEV: an NPU or OCAPI device,
@@ -2748,8 +2711,10 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 	if (rc)
 		return;
 
-	if (pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL))
-		pnv_ioda_setup_bus_dma(pe, pe->pbus);
+	/*
+	 * The IOMMU table for the PE is associated with the device in
+	 * pnv_pcibios_bus_add_device()
+	 */
 }
 
 int64_t pnv_opal_pci_msi_eoi(struct irq_chip *chip, unsigned int hw_irq)
@@ -3324,16 +3289,13 @@ static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
 	}
 }
 
-static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
+static void pnv_pci_configure_bus(struct pci_bus *bus)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
 	struct pnv_phb *phb = hose->private_data;
 	struct pci_dev *bridge = bus->self;
 	struct pnv_ioda_pe *pe;
-	bool all = (pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
-
-	/* Extend bridge's windows if necessary */
-	pnv_pci_fixup_bridge_resources(bus, type);
+	bool all = (bridge && pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
 
 	/* The PE for root bus should be realized before any one else */
 	if (!phb->ioda.root_pe_populated) {
@@ -3342,12 +3304,21 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
 			phb->ioda.root_pe_idx = pe->pe_number;
 			phb->ioda.root_pe_populated = true;
 		}
+
+		/* no need to re-configure the root bus */
+		if (bus == phb->hose->bus)
+			return;
 	}
 
 	/* Don't assign PE to PCI bus, which doesn't have subordinate devices */
 	if (list_empty(&bus->devices))
 		return;
 
+	/* PE should never be re-configured */
+	pe = __pnv_ioda_get_pe(phb, bus->number << 8);
+	if (WARN_ON(pe))
+		return;
+
 	/* Reserve PEs according to used M64 resources */
 	pnv_ioda_reserve_m64_pe(bus, NULL, all);
 
@@ -3654,6 +3625,39 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 {
 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	struct pnv_ioda_pe *pe;
+
+	/* Check if the BDFN for this device is associated with a PE yet */
+	pe = __pnv_ioda_get_pe(phb, pdev->devfn | (pdev->bus->number << 8));
+	if (!pe) {
+		/*
+		 * We should only hit this path for "normal" PCI PHBs. The
+		 * special PHBs used for OpenCAPI and NVLink don't have to
+		 * deal with eeh-on-mmio so they assign PEs at probe time
+		 * rather than after resources are allocated.
+		 */
+		WARN_ON(phb->type != PNV_PHB_IODA2 && phb->type != PNV_PHB_IODA1);
+		/* PEs for VFs should have been assigned in sriov_enable() */
+		WARN_ON(pdev->is_virtfn);
+
+		pnv_pci_configure_bus(pdev->bus);
+		pe = __pnv_ioda_get_pe(phb, pdev->devfn | (pdev->bus->number << 8));
+		pci_err(pdev, "Configured new pe PE#%x\n", pe ? pe->pe_number : 0xfffff);
+
+
+		/*
+		 * If we can't setup the IODA PE something has gone horribly
+		 * wrong and we can't enable DMA for the device.
+		 */
+		if (WARN_ON(!pe))
+			return;
+	} else {
+		pci_err(pdev, "Added to existing PE#%x\n", pe->pe_number);
+	}
+
+	if (pdn)
+		pdn->pe_number = pe->pe_number;
 
 	pnv_pci_ioda_dma_dev_setup(phb, pdev);
 }
@@ -3680,14 +3684,14 @@ void pnv_pci_dma_bus_setup(struct pci_bus *bus)
 
 static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
 	.dma_dev_setup		= pnv_pci_dma_dev_setup,
-	.dma_bus_setup		= pnv_pci_dma_bus_setup,
+	.dma_bus_setup		= pnv_pci_dma_bus_setup, /* NB: DMA setup actually happens in dma_dev_setup */
 	.iommu_bypass_supported	= pnv_pci_ioda_iommu_bypass_supported,
 	.setup_msi_irqs		= pnv_setup_msi_irqs,
 	.teardown_msi_irqs	= pnv_teardown_msi_irqs,
 	.enable_device_hook	= pnv_pci_enable_device_hook,
 	.release_device		= pnv_pci_release_device,
 	.window_alignment	= pnv_pci_window_alignment,
-	.setup_bridge		= pnv_pci_setup_bridge,
+	.setup_bridge		= pnv_pci_fixup_bridge_resources,
 	.reset_secondary_bus	= pnv_pci_reset_secondary_bus,
 	.shutdown		= pnv_pci_ioda_shutdown,
 };
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [Very RFC 08/46] powerpc/eeh: Calculate VF index rather than looking it up in pci_dn
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (6 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 07/46] powernv/pci: Rework IODA PE device accounting Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-22  4:43   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 09/46] powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config() Oliver O'Halloran
                   ` (37 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Find the VF index based on the BDFN of the device rather than using a cached
value in the pci_dn. This is probably slower than looking up the cached value,
but it's done infrequently (only in the EEH recovery path) and it's just
arithmetic.

We need this here because the functions to remove a VF are slightly
different to those which remove a physical PCI device.
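
For readers unfamiliar with the arithmetic in question, here is a minimal
standalone sketch. It assumes the usual SR-IOV rule that VF i sits at
PF routing ID + First VF Offset + i * VF Stride. The struct and function
names below (sriov_params, vf_bdfn, find_vf_index) are made up for
illustration and are not kernel APIs; the patch itself uses
pci_iov_virtfn_bus() and pci_iov_virtfn_devfn().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical container for the values a real PF would expose through
 * its SR-IOV capability. Not a kernel structure.
 */
struct sriov_params {
	uint16_t pf_bdfn;	/* routing ID (bus << 8 | devfn) of the PF */
	uint16_t offset;	/* First VF Offset */
	uint16_t stride;	/* VF Stride */
	uint16_t num_vfs;	/* number of enabled VFs */
};

/* Routing ID of VF 'i': PF RID + offset + i * stride. */
static uint16_t vf_bdfn(const struct sriov_params *p, int i)
{
	assert(i >= 0 && i < p->num_vfs);
	return (uint16_t)(p->pf_bdfn + p->offset + i * p->stride);
}

/*
 * Linear search in the same spirit as eeh_find_vf_index(): walk the
 * enabled VFs and compare routing IDs. Returns -1 when nothing matches.
 */
static int find_vf_index(const struct sriov_params *p, uint16_t bdfn)
{
	int i;

	for (i = 0; i < p->num_vfs; i++)
		if (vf_bdfn(p, i) == bdfn)
			return i;

	return -1;
}
```

The loop is O(num_vfs), which is fine for something that only runs during
recovery. A closed-form inverse is possible too, but the search handles the
"not a VF of this PF" case without extra checks.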

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/kernel/eeh_driver.c | 44 +++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index a1eaffe868de..1cdeed464aed 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -457,12 +457,35 @@ static enum pci_ers_result eeh_report_failure(struct eeh_dev *edev,
 	return rc;
 }
 
+#ifdef CONFIG_PCI_IOV
+/* FIXME: this should probably go in drivers/pci/iov.c */
+static int eeh_find_vf_index(struct pci_dev *physfn, u16 vf_bdfn)
+{
+	u16 vf_bus, vf_devfn;
+	int i;
+
+	vf_bus = vf_bdfn >> 8;
+	vf_devfn = vf_bdfn & 0xff;
+
+	for (i = 0; i < pci_num_vf(physfn); i++) {
+		if (pci_iov_virtfn_bus(physfn, i) != vf_bus)
+			continue;
+		if (pci_iov_virtfn_devfn(physfn, i) != vf_devfn)
+			continue;
+		return i;
+	}
+
+	WARN_ON(1);
+	return -1;
+}
+
 static void *eeh_add_virt_device(struct eeh_dev *edev)
 {
-	struct pci_driver *driver;
 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
+	struct pci_driver *driver;
+	int vf_index;
 
-	if (!(edev->physfn)) {
+	if (!edev->physfn) {
 		eeh_edev_warn(edev, "Not for VF\n");
 		return NULL;
 	}
@@ -476,11 +499,18 @@ static void *eeh_add_virt_device(struct eeh_dev *edev)
 		eeh_pcid_put(dev);
 	}
 
-#ifdef CONFIG_PCI_IOV
-	pci_iov_add_virtfn(edev->physfn, eeh_dev_to_pdn(edev)->vf_index);
-#endif
+	vf_index = eeh_find_vf_index(edev->physfn, edev->bdfn);
+	pci_iov_add_virtfn(edev->physfn, vf_index);
+
 	return NULL;
 }
+#else
+static void *eeh_add_virt_device(struct eeh_dev *edev)
+{
+	WARN_ON(1);
+	return NULL;
+}
+#endif
 
 static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 {
@@ -521,9 +551,9 @@ static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 
 	if (edev->physfn) {
 #ifdef CONFIG_PCI_IOV
-		struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+		int vf_index = eeh_find_vf_index(edev->physfn, edev->bdfn);
 
-		pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
+		pci_iov_remove_virtfn(edev->physfn, vf_index);
 		edev->pdev = NULL;
 #endif
 		if (rmv_data)
-- 
2.21.0



* [Very RFC 09/46] powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (7 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 08/46] powerpc/eeh: Calculate VF index rather than looking it up in pci_dn Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-22  4:52   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 10/46] powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config() Oliver O'Halloran
                   ` (36 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Switch the eeh_ops->{read|write}_config methods to take an eeh_dev structure
rather than a pci_dn structure to specify the target device. This removes a
lot of the uses of pci_dn in both the EEH core and in the platform EEH
support.
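
The shape of the change can be sketched outside the kernel as follows. This
is only an illustration under simplified assumptions: the structures are
cut-down stand-ins with made-up fields, demo_read_config() only handles
aligned 4-byte reads, and the DEMO_* return codes are placeholders rather
than the real PCIBIOS_* values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for the kernel structures (illustrative fields only). */
struct pci_dn {
	uint32_t cfg[16];	/* fake 64-byte config header */
};

struct eeh_dev {
	uint16_t bdfn;
	struct pci_dn *pdn;	/* may be NULL once pci_dn goes away */
};

/* The ops table now names the target by eeh_dev, not pci_dn. */
struct eeh_ops {
	int (*read_config)(struct eeh_dev *edev, int where, int size,
			   uint32_t *val);
};

#define DEMO_OK			0
#define DEMO_DEVICE_NOT_FOUND	(-1)
#define DEMO_BAD_REGISTER	(-2)

/*
 * Backend: any pci_dn lookup is now a private detail of the platform
 * code, so callers in the EEH core never touch a pci_dn.
 */
static int demo_read_config(struct eeh_dev *edev, int where, int size,
			    uint32_t *val)
{
	struct pci_dn *pdn = edev ? edev->pdn : NULL;

	assert(val != NULL);

	if (!pdn)
		return DEMO_DEVICE_NOT_FOUND;
	if (size != 4 || where < 0 || where > 60 || (where & 3))
		return DEMO_BAD_REGISTER;

	*val = pdn->cfg[where / 4];
	return DEMO_OK;
}

static const struct eeh_ops demo_ops = {
	.read_config = demo_read_config,
};
```

Keeping the pci_dn lookup inside the backend means a later patch can drop
it from one platform without touching the callers again.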

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/include/asm/eeh.h               |  4 +-
 arch/powerpc/kernel/eeh.c                    | 22 +++++-----
 arch/powerpc/kernel/eeh_pe.c                 | 44 ++++++++++----------
 arch/powerpc/platforms/powernv/eeh-powernv.c | 43 ++++++++++---------
 arch/powerpc/platforms/pseries/eeh_pseries.c | 16 ++++---
 5 files changed, 67 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index e11deb284631..62c4ee44ad2c 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -224,8 +224,8 @@ struct eeh_ops {
 	int (*configure_bridge)(struct eeh_pe *pe);
 	int (*err_inject)(struct eeh_pe *pe, int type, int func,
 			  unsigned long addr, unsigned long mask);
-	int (*read_config)(struct pci_dn *pdn, int where, int size, u32 *val);
-	int (*write_config)(struct pci_dn *pdn, int where, int size, u32 val);
+	int (*read_config)(struct eeh_dev *edev, int where, int size, u32 *val);
+	int (*write_config)(struct eeh_dev *edev, int where, int size, u32 val);
 	int (*next_error)(struct eeh_pe **pe);
 	int (*restore_config)(struct pci_dn *pdn);
 	int (*notify_resume)(struct pci_dn *pdn);
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index a3b93db972fc..7258fa04176d 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -185,21 +185,21 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
 		pdn->phb->global_number, pdn->busno,
 		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
 
-	eeh_ops->read_config(pdn, PCI_VENDOR_ID, 4, &cfg);
+	eeh_ops->read_config(edev, PCI_VENDOR_ID, 4, &cfg);
 	n += scnprintf(buf+n, len-n, "dev/vend:%08x\n", cfg);
 	pr_warn("EEH: PCI device/vendor: %08x\n", cfg);
 
-	eeh_ops->read_config(pdn, PCI_COMMAND, 4, &cfg);
+	eeh_ops->read_config(edev, PCI_COMMAND, 4, &cfg);
 	n += scnprintf(buf+n, len-n, "cmd/stat:%x\n", cfg);
 	pr_warn("EEH: PCI cmd/status register: %08x\n", cfg);
 
 	/* Gather bridge-specific registers */
 	if (edev->mode & EEH_DEV_BRIDGE) {
-		eeh_ops->read_config(pdn, PCI_SEC_STATUS, 2, &cfg);
+		eeh_ops->read_config(edev, PCI_SEC_STATUS, 2, &cfg);
 		n += scnprintf(buf+n, len-n, "sec stat:%x\n", cfg);
 		pr_warn("EEH: Bridge secondary status: %04x\n", cfg);
 
-		eeh_ops->read_config(pdn, PCI_BRIDGE_CONTROL, 2, &cfg);
+		eeh_ops->read_config(edev, PCI_BRIDGE_CONTROL, 2, &cfg);
 		n += scnprintf(buf+n, len-n, "brdg ctl:%x\n", cfg);
 		pr_warn("EEH: Bridge control: %04x\n", cfg);
 	}
@@ -207,11 +207,11 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
 	/* Dump out the PCI-X command and status regs */
 	cap = edev->pcix_cap;
 	if (cap) {
-		eeh_ops->read_config(pdn, cap, 4, &cfg);
+		eeh_ops->read_config(edev, cap, 4, &cfg);
 		n += scnprintf(buf+n, len-n, "pcix-cmd:%x\n", cfg);
 		pr_warn("EEH: PCI-X cmd: %08x\n", cfg);
 
-		eeh_ops->read_config(pdn, cap+4, 4, &cfg);
+		eeh_ops->read_config(edev, cap+4, 4, &cfg);
 		n += scnprintf(buf+n, len-n, "pcix-stat:%x\n", cfg);
 		pr_warn("EEH: PCI-X status: %08x\n", cfg);
 	}
@@ -223,7 +223,7 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
 		pr_warn("EEH: PCI-E capabilities and status follow:\n");
 
 		for (i=0; i<=8; i++) {
-			eeh_ops->read_config(pdn, cap+4*i, 4, &cfg);
+			eeh_ops->read_config(edev, cap+4*i, 4, &cfg);
 			n += scnprintf(buf+n, len-n, "%02x:%x\n", 4*i, cfg);
 
 			if ((i % 4) == 0) {
@@ -250,7 +250,7 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
 		pr_warn("EEH: PCI-E AER capability register set follows:\n");
 
 		for (i=0; i<=13; i++) {
-			eeh_ops->read_config(pdn, cap+4*i, 4, &cfg);
+			eeh_ops->read_config(edev, cap+4*i, 4, &cfg);
 			n += scnprintf(buf+n, len-n, "%02x:%x\n", 4*i, cfg);
 
 			if ((i % 4) == 0) {
@@ -918,15 +918,13 @@ int eeh_pe_reset_full(struct eeh_pe *pe, bool include_passed)
  */
 void eeh_save_bars(struct eeh_dev *edev)
 {
-	struct pci_dn *pdn;
 	int i;
 
-	pdn = eeh_dev_to_pdn(edev);
-	if (!pdn)
+	if (!edev)
 		return;
 
 	for (i = 0; i < 16; i++)
-		eeh_ops->read_config(pdn, i * 4, 4, &edev->config_space[i]);
+		eeh_ops->read_config(edev, i * 4, 4, &edev->config_space[i]);
 
 	/*
 	 * For PCI bridges including root port, we need enable bus
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 177852e39a25..e11e0830f125 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -714,32 +714,32 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
 
 	/* Check slot status */
 	cap = edev->pcie_cap;
-	eeh_ops->read_config(pdn, cap + PCI_EXP_SLTSTA, 2, &val);
+	eeh_ops->read_config(edev, cap + PCI_EXP_SLTSTA, 2, &val);
 	if (!(val & PCI_EXP_SLTSTA_PDS)) {
 		eeh_edev_dbg(edev, "No card in the slot (0x%04x) !\n", val);
 		return;
 	}
 
 	/* Check power status if we have the capability */
-	eeh_ops->read_config(pdn, cap + PCI_EXP_SLTCAP, 2, &val);
+	eeh_ops->read_config(edev, cap + PCI_EXP_SLTCAP, 2, &val);
 	if (val & PCI_EXP_SLTCAP_PCP) {
-		eeh_ops->read_config(pdn, cap + PCI_EXP_SLTCTL, 2, &val);
+		eeh_ops->read_config(edev, cap + PCI_EXP_SLTCTL, 2, &val);
 		if (val & PCI_EXP_SLTCTL_PCC) {
 			eeh_edev_dbg(edev, "In power-off state, power it on ...\n");
 			val &= ~(PCI_EXP_SLTCTL_PCC | PCI_EXP_SLTCTL_PIC);
 			val |= (0x0100 & PCI_EXP_SLTCTL_PIC);
-			eeh_ops->write_config(pdn, cap + PCI_EXP_SLTCTL, 2, val);
+			eeh_ops->write_config(edev, cap + PCI_EXP_SLTCTL, 2, val);
 			msleep(2 * 1000);
 		}
 	}
 
 	/* Enable link */
-	eeh_ops->read_config(pdn, cap + PCI_EXP_LNKCTL, 2, &val);
+	eeh_ops->read_config(edev, cap + PCI_EXP_LNKCTL, 2, &val);
 	val &= ~PCI_EXP_LNKCTL_LD;
-	eeh_ops->write_config(pdn, cap + PCI_EXP_LNKCTL, 2, val);
+	eeh_ops->write_config(edev, cap + PCI_EXP_LNKCTL, 2, val);
 
 	/* Check link */
-	eeh_ops->read_config(pdn, cap + PCI_EXP_LNKCAP, 4, &val);
+	eeh_ops->read_config(edev, cap + PCI_EXP_LNKCAP, 4, &val);
 	if (!(val & PCI_EXP_LNKCAP_DLLLARC)) {
 		eeh_edev_dbg(edev, "No link reporting capability (0x%08x) \n", val);
 		msleep(1000);
@@ -752,7 +752,7 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
 		msleep(20);
 		timeout += 20;
 
-		eeh_ops->read_config(pdn, cap + PCI_EXP_LNKSTA, 2, &val);
+		eeh_ops->read_config(edev, cap + PCI_EXP_LNKSTA, 2, &val);
 		if (val & PCI_EXP_LNKSTA_DLLLA)
 			break;
 	}
@@ -769,7 +769,6 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
 
 static void eeh_restore_bridge_bars(struct eeh_dev *edev)
 {
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 	int i;
 
 	/*
@@ -777,20 +776,20 @@ static void eeh_restore_bridge_bars(struct eeh_dev *edev)
 	 * Bus numbers and windows: 0x18 - 0x30
 	 */
 	for (i = 4; i < 13; i++)
-		eeh_ops->write_config(pdn, i*4, 4, edev->config_space[i]);
+		eeh_ops->write_config(edev, i*4, 4, edev->config_space[i]);
 	/* Rom: 0x38 */
-	eeh_ops->write_config(pdn, 14*4, 4, edev->config_space[14]);
+	eeh_ops->write_config(edev, 14*4, 4, edev->config_space[14]);
 
 	/* Cache line & Latency timer: 0xC 0xD */
-	eeh_ops->write_config(pdn, PCI_CACHE_LINE_SIZE, 1,
+	eeh_ops->write_config(edev, PCI_CACHE_LINE_SIZE, 1,
                 SAVED_BYTE(PCI_CACHE_LINE_SIZE));
-        eeh_ops->write_config(pdn, PCI_LATENCY_TIMER, 1,
+        eeh_ops->write_config(edev, PCI_LATENCY_TIMER, 1,
                 SAVED_BYTE(PCI_LATENCY_TIMER));
 	/* Max latency, min grant, interrupt ping and line: 0x3C */
-	eeh_ops->write_config(pdn, 15*4, 4, edev->config_space[15]);
+	eeh_ops->write_config(edev, 15*4, 4, edev->config_space[15]);
 
 	/* PCI Command: 0x4 */
-	eeh_ops->write_config(pdn, PCI_COMMAND, 4, edev->config_space[1] |
+	eeh_ops->write_config(edev, PCI_COMMAND, 4, edev->config_space[1] |
 			      PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
 
 	/* Check the PCIe link is ready */
@@ -799,28 +798,27 @@ static void eeh_restore_bridge_bars(struct eeh_dev *edev)
 
 static void eeh_restore_device_bars(struct eeh_dev *edev)
 {
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 	int i;
 	u32 cmd;
 
 	for (i = 4; i < 10; i++)
-		eeh_ops->write_config(pdn, i*4, 4, edev->config_space[i]);
+		eeh_ops->write_config(edev, i*4, 4, edev->config_space[i]);
 	/* 12 == Expansion ROM Address */
-	eeh_ops->write_config(pdn, 12*4, 4, edev->config_space[12]);
+	eeh_ops->write_config(edev, 12*4, 4, edev->config_space[12]);
 
-	eeh_ops->write_config(pdn, PCI_CACHE_LINE_SIZE, 1,
+	eeh_ops->write_config(edev, PCI_CACHE_LINE_SIZE, 1,
 		SAVED_BYTE(PCI_CACHE_LINE_SIZE));
-	eeh_ops->write_config(pdn, PCI_LATENCY_TIMER, 1,
+	eeh_ops->write_config(edev, PCI_LATENCY_TIMER, 1,
 		SAVED_BYTE(PCI_LATENCY_TIMER));
 
 	/* max latency, min grant, interrupt pin and line */
-	eeh_ops->write_config(pdn, 15*4, 4, edev->config_space[15]);
+	eeh_ops->write_config(edev, 15*4, 4, edev->config_space[15]);
 
 	/*
 	 * Restore PERR & SERR bits, some devices require it,
 	 * don't touch the other command bits
 	 */
-	eeh_ops->read_config(pdn, PCI_COMMAND, 4, &cmd);
+	eeh_ops->read_config(edev, PCI_COMMAND, 4, &cmd);
 	if (edev->config_space[1] & PCI_COMMAND_PARITY)
 		cmd |= PCI_COMMAND_PARITY;
 	else
@@ -829,7 +827,7 @@ static void eeh_restore_device_bars(struct eeh_dev *edev)
 		cmd |= PCI_COMMAND_SERR;
 	else
 		cmd &= ~PCI_COMMAND_SERR;
-	eeh_ops->write_config(pdn, PCI_COMMAND, 4, cmd);
+	eeh_ops->write_config(edev, PCI_COMMAND, 4, cmd);
 }
 
 /**
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index b2ac4130fda7..54d8ec77aef2 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -858,32 +858,32 @@ static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	case EEH_RESET_HOT:
 		/* Don't report linkDown event */
 		if (aer) {
-			eeh_ops->read_config(pdn, aer + PCI_ERR_UNCOR_MASK,
+			eeh_ops->read_config(edev, aer + PCI_ERR_UNCOR_MASK,
 					     4, &ctrl);
 			ctrl |= PCI_ERR_UNC_SURPDN;
-			eeh_ops->write_config(pdn, aer + PCI_ERR_UNCOR_MASK,
+			eeh_ops->write_config(edev, aer + PCI_ERR_UNCOR_MASK,
 					      4, ctrl);
 		}
 
-		eeh_ops->read_config(pdn, PCI_BRIDGE_CONTROL, 2, &ctrl);
+		eeh_ops->read_config(edev, PCI_BRIDGE_CONTROL, 2, &ctrl);
 		ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
-		eeh_ops->write_config(pdn, PCI_BRIDGE_CONTROL, 2, ctrl);
+		eeh_ops->write_config(edev, PCI_BRIDGE_CONTROL, 2, ctrl);
 
 		msleep(EEH_PE_RST_HOLD_TIME);
 		break;
 	case EEH_RESET_DEACTIVATE:
-		eeh_ops->read_config(pdn, PCI_BRIDGE_CONTROL, 2, &ctrl);
+		eeh_ops->read_config(edev, PCI_BRIDGE_CONTROL, 2, &ctrl);
 		ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
-		eeh_ops->write_config(pdn, PCI_BRIDGE_CONTROL, 2, ctrl);
+		eeh_ops->write_config(edev, PCI_BRIDGE_CONTROL, 2, ctrl);
 
 		msleep(EEH_PE_RST_SETTLE_TIME);
 
 		/* Continue reporting linkDown event */
 		if (aer) {
-			eeh_ops->read_config(pdn, aer + PCI_ERR_UNCOR_MASK,
+			eeh_ops->read_config(edev, aer + PCI_ERR_UNCOR_MASK,
 					     4, &ctrl);
 			ctrl &= ~PCI_ERR_UNC_SURPDN;
-			eeh_ops->write_config(pdn, aer + PCI_ERR_UNCOR_MASK,
+			eeh_ops->write_config(edev, aer + PCI_ERR_UNCOR_MASK,
 					      4, ctrl);
 		}
 
@@ -952,11 +952,12 @@ void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, const char *type,
 				     int pos, u16 mask)
 {
+	struct eeh_dev *edev = pdn->edev;
 	int i, status = 0;
 
 	/* Wait for Transaction Pending bit to be cleared */
 	for (i = 0; i < 4; i++) {
-		eeh_ops->read_config(pdn, pos, 2, &status);
+		eeh_ops->read_config(edev, pos, 2, &status);
 		if (!(status & mask))
 			return;
 
@@ -977,7 +978,7 @@ static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
 	if (WARN_ON(!edev->pcie_cap))
 		return -ENOTTY;
 
-	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &reg);
+	eeh_ops->read_config(edev, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &reg);
 	if (!(reg & PCI_EXP_DEVCAP_FLR))
 		return -ENOTTY;
 
@@ -987,18 +988,18 @@ static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
 		pnv_eeh_wait_for_pending(pdn, "",
 					 edev->pcie_cap + PCI_EXP_DEVSTA,
 					 PCI_EXP_DEVSTA_TRPND);
-		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+		eeh_ops->read_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
 				     4, &reg);
 		reg |= PCI_EXP_DEVCTL_BCR_FLR;
-		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+		eeh_ops->write_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
 				      4, reg);
 		msleep(EEH_PE_RST_HOLD_TIME);
 		break;
 	case EEH_RESET_DEACTIVATE:
-		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+		eeh_ops->read_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
 				     4, &reg);
 		reg &= ~PCI_EXP_DEVCTL_BCR_FLR;
-		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+		eeh_ops->write_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
 				      4, reg);
 		msleep(EEH_PE_RST_SETTLE_TIME);
 		break;
@@ -1015,7 +1016,7 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
 	if (WARN_ON(!edev->af_cap))
 		return -ENOTTY;
 
-	eeh_ops->read_config(pdn, edev->af_cap + PCI_AF_CAP, 1, &cap);
+	eeh_ops->read_config(edev, edev->af_cap + PCI_AF_CAP, 1, &cap);
 	if (!(cap & PCI_AF_CAP_TP) || !(cap & PCI_AF_CAP_FLR))
 		return -ENOTTY;
 
@@ -1030,12 +1031,12 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
 		pnv_eeh_wait_for_pending(pdn, "AF",
 					 edev->af_cap + PCI_AF_CTRL,
 					 PCI_AF_STATUS_TP << 8);
-		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL,
+		eeh_ops->write_config(edev, edev->af_cap + PCI_AF_CTRL,
 				      1, PCI_AF_CTRL_FLR);
 		msleep(EEH_PE_RST_HOLD_TIME);
 		break;
 	case EEH_RESET_DEACTIVATE:
-		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1, 0);
+		eeh_ops->write_config(edev, edev->af_cap + PCI_AF_CTRL, 1, 0);
 		msleep(EEH_PE_RST_SETTLE_TIME);
 		break;
 	}
@@ -1269,9 +1270,11 @@ static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
 	return false;
 }
 
-static int pnv_eeh_read_config(struct pci_dn *pdn,
+static int pnv_eeh_read_config(struct eeh_dev *edev,
 			       int where, int size, u32 *val)
 {
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
 	if (!pdn)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
@@ -1283,9 +1286,11 @@ static int pnv_eeh_read_config(struct pci_dn *pdn,
 	return pnv_pci_cfg_read(pdn, where, size, val);
 }
 
-static int pnv_eeh_write_config(struct pci_dn *pdn,
+static int pnv_eeh_write_config(struct eeh_dev *edev,
 				int where, int size, u32 val)
 {
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
 	if (!pdn)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index fa704d7052ec..6f911a048339 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -631,29 +631,33 @@ static int pseries_eeh_configure_bridge(struct eeh_pe *pe)
 
 /**
  * pseries_eeh_read_config - Read PCI config space
- * @pdn: PCI device node
- * @where: PCI address
+ * @edev: EEH device handle
+ * @where: PCI config space offset
  * @size: size to read
  * @val: return value
  *
  * Read config space from the speicifed device
  */
-static int pseries_eeh_read_config(struct pci_dn *pdn, int where, int size, u32 *val)
+static int pseries_eeh_read_config(struct eeh_dev *edev, int where, int size, u32 *val)
 {
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
 	return rtas_read_config(pdn, where, size, val);
 }
 
 /**
  * pseries_eeh_write_config - Write PCI config space
- * @pdn: PCI device node
- * @where: PCI address
+ * @edev: EEH device handle
+ * @where: PCI config space offset
  * @size: size to write
  * @val: value to be written
  *
  * Write config space to the specified device
  */
-static int pseries_eeh_write_config(struct pci_dn *pdn, int where, int size, u32 val)
+static int pseries_eeh_write_config(struct eeh_dev *edev, int where, int size, u32 val)
 {
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
 	return rtas_write_config(pdn, where, size, val);
 }
 
-- 
2.21.0



* [Very RFC 10/46] powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (8 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 09/46] powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-20  1:28 ` [Very RFC 11/46] powerpc/eeh: Convert various printfs to use edev, not pci_dn Oliver O'Halloran
                   ` (35 subsequent siblings)
  45 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

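Switch eeh_ops->restore_config() to take an eeh_dev rather than a pci_dn,
removing another pci_dn user. On PowerNV the config address passed to
opal_pci_reinit() now comes from edev->bdfn and the PHB is found via
edev->pe->phb.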

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/include/asm/eeh.h               |  2 +-
 arch/powerpc/kernel/eeh.c                    |  5 ++---
 arch/powerpc/kernel/eeh_pe.c                 |  6 ++----
 arch/powerpc/platforms/powernv/eeh-powernv.c | 11 +++++------
 4 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 62c4ee44ad2c..67847f8dfe71 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -227,7 +227,7 @@ struct eeh_ops {
 	int (*read_config)(struct eeh_dev *edev, int where, int size, u32 *val);
 	int (*write_config)(struct eeh_dev *edev, int where, int size, u32 val);
 	int (*next_error)(struct eeh_pe **pe);
-	int (*restore_config)(struct pci_dn *pdn);
+	int (*restore_config)(struct eeh_dev *edev);
 	int (*notify_resume)(struct pci_dn *pdn);
 };
 
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 7258fa04176d..63500e34e329 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -726,7 +726,6 @@ static void eeh_disable_and_save_dev_state(struct eeh_dev *edev,
 
 static void eeh_restore_dev_state(struct eeh_dev *edev, void *userdata)
 {
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 	struct pci_dev *pdev = eeh_dev_to_pci_dev(edev);
 	struct pci_dev *dev = userdata;
 
@@ -734,8 +733,8 @@ static void eeh_restore_dev_state(struct eeh_dev *edev, void *userdata)
 		return;
 
 	/* Apply customization from firmware */
-	if (pdn && eeh_ops->restore_config)
-		eeh_ops->restore_config(pdn);
+	if (eeh_ops->restore_config)
+		eeh_ops->restore_config(edev);
 
 	/* The caller should restore state for the specified device */
 	if (pdev != dev)
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index e11e0830f125..634963aa4a77 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -841,16 +841,14 @@ static void eeh_restore_device_bars(struct eeh_dev *edev)
  */
 static void eeh_restore_one_device_bars(struct eeh_dev *edev, void *flag)
 {
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
-
 	/* Do special restore for bridges */
 	if (edev->mode & EEH_DEV_BRIDGE)
 		eeh_restore_bridge_bars(edev);
 	else
 		eeh_restore_device_bars(edev);
 
-	if (eeh_ops->restore_config && pdn)
-		eeh_ops->restore_config(pdn);
+	if (eeh_ops->restore_config)
+		eeh_ops->restore_config(edev);
 }
 
 /**
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 54d8ec77aef2..6c5d9f1bc378 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1644,12 +1644,10 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
 	return ret;
 }
 
-static int pnv_eeh_restore_config(struct pci_dn *pdn)
+static int pnv_eeh_restore_config(struct eeh_dev *edev)
 {
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
 	struct pnv_phb *phb;
 	s64 ret = 0;
-	int config_addr = (pdn->busno << 8) | (pdn->devfn);
 
 	if (!edev)
 		return -EEXIST;
@@ -1658,13 +1656,14 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
 	if (edev->physfn)
 		return 0;
 
-	phb = pdn->phb->private_data;
+	phb = edev->pe->phb->private_data;
 	ret = opal_pci_reinit(phb->opal_id,
-			      OPAL_REINIT_PCI_DEV, config_addr);
+			      OPAL_REINIT_PCI_DEV, edev->bdfn);
 
+	ret = opal_pci_reinit(phb->opal_id, OPAL_REINIT_PCI_DEV, edev->bdfn);
 	if (ret) {
 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
-			__func__, config_addr, ret);
+			__func__, edev->bdfn, ret);
 		return -EIO;
 	}
 
-- 
2.21.0



* [Very RFC 11/46] powerpc/eeh: Convert various printfs to use edev, not pci_dn
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (9 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 10/46] powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-22  4:55   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 12/46] powerpc/eeh: Split eeh_probe into probe_pdn and probe_pdev Oliver O'Halloran
                   ` (34 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

We use the pci_dn to retrieve the domain, bus, device, and function numbers for
an EEH device. We now have that in the eeh_dev, so convert the various printk()s
we have around the place to source that information from the eeh_dev.
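
As a rough standalone illustration of why a single 16-bit BDFN is enough for
these messages: the bus number is the top byte, and the slot and function
come out of the low byte. The DEMO_* macros below follow the same bit layout
as the kernel's PCI_SLOT() and PCI_FUNC(); format_bdfn() is a hypothetical
helper, not something the patch adds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Same bit layout as PCI_SLOT()/PCI_FUNC(): 5 bits of slot, 3 of function. */
#define DEMO_PCI_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define DEMO_PCI_FUNC(devfn)	((devfn) & 0x07)

/*
 * Format "dddd:bb:ss.f" from a domain number and a 16-bit BDFN.
 * The masks in the macros discard the bus byte, so the full BDFN can be
 * passed straight in, as the patch does with edev->bdfn.
 */
static int format_bdfn(char *buf, size_t len, unsigned int domain,
		       uint16_t bdfn)
{
	assert(buf != NULL);

	return snprintf(buf, len, "%04x:%02x:%02x.%01x",
			domain, (unsigned int)(bdfn >> 8),
			(unsigned int)DEMO_PCI_SLOT(bdfn),
			(unsigned int)DEMO_PCI_FUNC(bdfn));
}
```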

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/kernel/eeh.c    | 14 ++++----------
 arch/powerpc/kernel/eeh_pe.c | 14 ++++++--------
 2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 63500e34e329..c8039fdb23ba 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -167,23 +167,17 @@ void eeh_show_enabled(void)
  */
 static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
 {
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 	u32 cfg;
 	int cap, i;
 	int n = 0, l = 0;
 	char buffer[128];
 
-	if (!pdn) {
-		pr_warn("EEH: Note: No error log for absent device.\n");
-		return 0;
-	}
-
 	n += scnprintf(buf+n, len-n, "%04x:%02x:%02x.%01x\n",
-		       pdn->phb->global_number, pdn->busno,
-		       PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+			edev->pe->phb->global_number, edev->bdfn >> 8,
+			PCI_SLOT(edev->bdfn), PCI_FUNC(edev->bdfn));
 	pr_warn("EEH: of node=%04x:%02x:%02x.%01x\n",
-		pdn->phb->global_number, pdn->busno,
-		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+		edev->pe->phb->global_number, edev->bdfn >> 8,
+		PCI_SLOT(edev->bdfn), PCI_FUNC(edev->bdfn));
 
 	eeh_ops->read_config(edev, PCI_VENDOR_ID, 4, &cfg);
 	n += scnprintf(buf+n, len-n, "dev/vend:%08x\n", cfg);
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 634963aa4a77..831f363f1732 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -366,9 +366,8 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
  */
 int eeh_add_to_parent_pe(struct eeh_dev *edev)
 {
+	int config_addr = edev->bdfn;
 	struct eeh_pe *pe, *parent;
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
-	int config_addr = (pdn->busno << 8) | (pdn->devfn);
 
 	/* Check if the PE number is valid */
 	if (!eeh_has_flag(EEH_VALID_PE_ZERO) && !edev->pe_config_addr) {
@@ -382,7 +381,7 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 	 * PE should be composed of PCI bus and its subordinate
 	 * components.
 	 */
-	pe = eeh_pe_get(pdn->phb, edev->pe_config_addr, config_addr);
+	pe = eeh_pe_get(edev->controller, edev->pe_config_addr, config_addr);
 	if (pe) {
 		if (pe->type & EEH_PE_INVALID) {
 			list_add_tail(&edev->entry, &pe->edevs);
@@ -416,9 +415,9 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 
 	/* Create a new EEH PE */
 	if (edev->physfn)
-		pe = eeh_pe_alloc(pdn->phb, EEH_PE_VF);
+		pe = eeh_pe_alloc(edev->controller, EEH_PE_VF);
 	else
-		pe = eeh_pe_alloc(pdn->phb, EEH_PE_DEVICE);
+		pe = eeh_pe_alloc(edev->controller, EEH_PE_DEVICE);
 	if (!pe) {
 		pr_err("%s: out of memory!\n", __func__);
 		return -ENOMEM;
@@ -434,10 +433,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 	 */
 	parent = eeh_pe_get_parent(edev);
 	if (!parent) {
-		parent = eeh_phb_pe_get(pdn->phb);
+		parent = eeh_phb_pe_get(edev->controller);
 		if (!parent) {
 			pr_err("%s: No PHB PE is found (PHB Domain=%d)\n",
-				__func__, pdn->phb->global_number);
+				__func__, edev->controller->global_number);
 			edev->pe = NULL;
 			kfree(pe);
 			return -EEXIST;
@@ -698,7 +697,6 @@ void eeh_pe_state_clear(struct eeh_pe *root, int state, bool include_passed)
  */
 static void eeh_bridge_check_link(struct eeh_dev *edev)
 {
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 	int cap;
 	uint32_t val;
 	int timeout = 0;
-- 
2.21.0



* [Very RFC 12/46] powerpc/eeh: Split eeh_probe into probe_pdn and probe_pdev
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (10 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 11/46] powerpc/eeh: Convert various printfs to use edev, not pci_dn Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-22  5:45   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 13/46] powerpc/eeh: Rework how pdev_probe() is used Oliver O'Halloran
                   ` (33 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

The EEH core has a concept of "early probe" and "late probe." When the
EEH_PROBE_MODE_DEVTREE flag is set (i.e. pseries) we call the eeh_ops->probe()
function in eeh_add_device_early() so the eeh_dev state is initialised based on
the pci_dn. It's important to realise that this happens *long* before the PCI
device has been probed and a pci_dev structure created. This is necessary due
to a PAPR requirement that EEH be enabled before the OS starts interacting
with the device.

The late probe is done in eeh_add_device_late() when the EEH_PROBE_MODE_DEV
flag is set (i.e. PowerNV). The main difference is the late probe happens
after the pci_dev has been created. As a result there is no actual dependency
on a pci_dn in the late probe case. Splitting the single eeh_ops->probe()
function into separate functions allows us to simplify the late probe case
since we have access to a pci_dev at that point. Having access to a pci_dev
means that we can use the functions provided by the PCI core for finding
capabilities, etc rather than doing it manually.

It also changes the prototype for the probe functions to be void. Currently
they return a void *, but both implementations always return NULL so there's
not much point to it.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
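
Note for readers: the split described above boils down to an ops table with
two optional callbacks, each invoked from a different point depending on the
probe mode. The following is a minimal userspace sketch under stated
assumptions: the struct layouts, the flag values and the example_* callbacks
are hypothetical stand-ins, not the kernel definitions, and the NULL check on
the early path is an addition of this sketch rather than something the patch
does.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct pci_dn  { int probed; };
struct pci_dev { int probed; };

/* Illustrative flag values; the kernel's EEH_PROBE_MODE_* values may differ. */
enum {
	PROBE_MODE_DEVTREE = 1 << 0,	/* early probe, as on pseries */
	PROBE_MODE_DEV     = 1 << 1,	/* late probe, as on PowerNV */
};

struct probe_ops {
	void (*probe_pdn)(struct pci_dn *pdn);		/* may be NULL */
	void (*probe_pdev)(struct pci_dev *pdev);	/* may be NULL */
};

/* Example backend callbacks that only record that they ran. */
static void example_probe_pdn(struct pci_dn *pdn)
{
	pdn->probed = 1;
}

static void example_probe_pdev(struct pci_dev *pdev)
{
	pdev->probed = 1;
}

/* Early path: runs before any pci_dev exists, so it only sees the pci_dn. */
static void add_device_early(const struct probe_ops *ops, int flags,
			     struct pci_dn *pdn)
{
	if (ops->probe_pdn && (flags & PROBE_MODE_DEVTREE))
		ops->probe_pdn(pdn);
}

/* Late path: the pci_dev has been created and can be handed to the backend. */
static void add_device_late(const struct probe_ops *ops, int flags,
			    struct pci_dev *pdev)
{
	if (ops->probe_pdev && (flags & PROBE_MODE_DEV))
		ops->probe_pdev(pdev);
}
```

A PowerNV-like table would set only probe_pdev and a pseries-like table only
probe_pdn, matching the two eeh_ops initialisers in the diff below.
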
 arch/powerpc/include/asm/eeh.h               |  3 +-
 arch/powerpc/kernel/eeh.c                    |  6 ++--
 arch/powerpc/platforms/powernv/eeh-powernv.c | 29 ++++++--------------
 arch/powerpc/platforms/pseries/eeh_pseries.c | 13 ++++-----
 4 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 67847f8dfe71..466b0165fbcf 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -215,7 +215,8 @@ enum {
 struct eeh_ops {
 	char *name;
 	int (*init)(void);
-	void* (*probe)(struct pci_dn *pdn, void *data);
+	void (*probe_pdn)(struct pci_dn *pdn);    /* used on pseries */
+	void (*probe_pdev)(struct pci_dev *pdev); /* used on powernv */
 	int (*set_option)(struct eeh_pe *pe, int option);
 	int (*get_pe_addr)(struct eeh_pe *pe);
 	int (*get_state)(struct eeh_pe *pe, int *delay);
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index c8039fdb23ba..087a98b42a8c 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1066,7 +1066,7 @@ void eeh_add_device_early(struct pci_dn *pdn)
 	    (eeh_has_flag(EEH_PROBE_MODE_DEVTREE) && 0 == phb->buid))
 		return;
 
-	eeh_ops->probe(pdn, NULL);
+	eeh_ops->probe_pdn(pdn);
 }
 
 /**
@@ -1135,8 +1135,8 @@ void eeh_add_device_late(struct pci_dev *dev)
 		dev->dev.archdata.edev = NULL;
 	}
 
-	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
-		eeh_ops->probe(pdn, NULL);
+	if (eeh_ops->probe_pdev && eeh_has_flag(EEH_PROBE_MODE_DEV))
+		eeh_ops->probe_pdev(dev);
 
 	edev->pdev = dev;
 	dev->dev.archdata.edev = edev;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 6c5d9f1bc378..8bd5317aa878 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -346,23 +346,13 @@ static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
 
 /**
  * pnv_eeh_probe - Do probe on PCI device
- * @pdn: PCI device node
- * @data: unused
+ * @pdev: pci_dev to probe
  *
- * When EEH module is installed during system boot, all PCI devices
- * are checked one by one to see if it supports EEH. The function
- * is introduced for the purpose. By default, EEH has been enabled
- * on all PCI devices. That's to say, we only need do necessary
- * initialization on the corresponding eeh device and create PE
- * accordingly.
- *
- * It's notable that's unsafe to retrieve the EEH device through
- * the corresponding PCI device. During the PCI device hotplug, which
- * was possiblly triggered by EEH core, the binding between EEH device
- * and the PCI device isn't built yet.
+ * Creates (or finds an existing) edev for this pci_dev.
  */
-static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
+static void pnv_eeh_probe_pdev(struct pci_dev *pdev)
 {
+	struct pci_dn *pdn = pci_get_pdn(pdev);
 	struct pci_controller *hose = pdn->phb;
 	struct pnv_phb *phb = hose->private_data;
 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
@@ -377,11 +367,11 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	 * the probing.
 	 */
 	if (!edev || edev->pe)
-		return NULL;
+		return;
 
 	/* Skip for PCI-ISA bridge */
 	if ((pdn->class_code >> 8) == PCI_CLASS_BRIDGE_ISA)
-		return NULL;
+		return;
 
 	eeh_edev_dbg(edev, "Probing device\n");
 
@@ -411,7 +401,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	ret = eeh_add_to_parent_pe(edev);
 	if (ret) {
 		eeh_edev_warn(edev, "Failed to add device to PE (code %d)\n", ret);
-		return NULL;
+		return;
 	}
 
 	/*
@@ -469,8 +459,6 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	eeh_save_bars(edev);
 
 	eeh_edev_dbg(edev, "EEH enabled on device\n");
-
-	return NULL;
 }
 
 /**
@@ -1673,7 +1661,8 @@ static int pnv_eeh_restore_config(struct eeh_dev *edev)
 static struct eeh_ops pnv_eeh_ops = {
 	.name                   = "powernv",
 	.init                   = pnv_eeh_init,
-	.probe			= pnv_eeh_probe,
+	.probe_pdn		= NULL,
+	.probe_pdev		= pnv_eeh_probe_pdev,
 	.set_option             = pnv_eeh_set_option,
 	.get_pe_addr            = pnv_eeh_get_pe_addr,
 	.get_state              = pnv_eeh_get_state,
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 6f911a048339..3ac23c884f4e 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -229,7 +229,7 @@ static int pseries_eeh_find_ecap(struct pci_dn *pdn, int cap)
  * are checked one by one to see if it supports EEH. The function
  * is introduced for the purpose.
  */
-static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
+static void pseries_eeh_probe_pdn(struct pci_dn *pdn)
 {
 	struct eeh_dev *edev;
 	struct eeh_pe pe;
@@ -240,15 +240,15 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
 	/* Retrieve OF node and eeh device */
 	edev = pdn_to_eeh_dev(pdn);
 	if (!edev || edev->pe)
-		return NULL;
+		return;
 
 	/* Check class/vendor/device IDs */
 	if (!pdn->vendor_id || !pdn->device_id || !pdn->class_code)
-		return NULL;
+		return;
 
 	/* Skip for PCI-ISA bridge */
         if ((pdn->class_code >> 8) == PCI_CLASS_BRIDGE_ISA)
-		return NULL;
+		return;
 
 	eeh_edev_dbg(edev, "Probing device\n");
 
@@ -315,8 +315,6 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
 
 	/* Save memory bars */
 	eeh_save_bars(edev);
-
-	return NULL;
 }
 
 /**
@@ -755,7 +753,8 @@ static int pseries_notify_resume(struct pci_dn *pdn)
 static struct eeh_ops pseries_eeh_ops = {
 	.name			= "pseries",
 	.init			= pseries_eeh_init,
-	.probe			= pseries_eeh_probe,
+	.probe_pdn		= pseries_eeh_probe_pdn,
+	.probe_pdev 		= NULL,
 	.set_option		= pseries_eeh_set_option,
 	.get_pe_addr		= pseries_eeh_get_pe_addr,
 	.get_state		= pseries_eeh_get_state,
-- 
2.21.0



* [Very RFC 13/46] powerpc/eeh: Rework how pdev_probe() is used
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (11 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 12/46] powerpc/eeh: Split eeh_probe into probe_pdn and probe_pdev Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-20  1:28 ` [Very RFC 14/46] powernv/eeh: Remove un-necessary call to eeh_add_device_early() Oliver O'Halloran
                   ` (32 subsequent siblings)
  45 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Adjust how the EEH core uses eeh_ops->probe_pdev() so that it returns
the eeh_dev for the passed-in pci_dev. Currently mapping a pci_dev to an
eeh_dev is done by finding the pci_dn for the pci_dev, then using the
back-pointer to the eeh_dev stashed in the pci_dn.

We want to move away from using pci_dn on PowerNV and moving the eeh_dev
lookup into probe_pdev() allows the EEH core to be oblivious of how the
mapping is actually done.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
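
Note for readers: the reworked flow is "ask the backend for the eeh_dev, then
let the core bind it to the pci_dev". A minimal userspace sketch of that
contract follows, assuming simplified hypothetical structures: the single
edev pointer in struct pci_dev stands in for dev.archdata.edev, and
probe_found() / probe_missing() are made-up backends. It leaves out the
stale-binding cleanup that the real eeh_add_device_late() performs.

```c
#include <stddef.h>

struct pci_dev;

/* Hypothetical, heavily simplified versions of the kernel structures. */
struct eeh_dev { struct pci_dev *pdev; };
struct pci_dev { struct eeh_dev *edev; /* stands in for dev.archdata.edev */ };

typedef struct eeh_dev *(*probe_pdev_fn)(struct pci_dev *pdev);

/* Example backends: one that always finds an eeh_dev, one that never does. */
static struct eeh_dev example_edev;

static struct eeh_dev *probe_found(struct pci_dev *pdev)
{
	(void)pdev;
	return &example_edev;
}

static struct eeh_dev *probe_missing(struct pci_dev *pdev)
{
	(void)pdev;
	return NULL;
}

/*
 * The core does not know how the backend maps a pci_dev to an eeh_dev.
 * It only binds whatever the backend returns.
 * Returns 0 on a new binding, -1 if already bound or if the probe failed.
 */
static int add_device_late(probe_pdev_fn probe, struct pci_dev *dev)
{
	struct eeh_dev *edev;

	if (!dev || dev->edev)
		return -1;

	edev = probe(dev);
	if (!edev)
		return -1;

	/* bind the pdev and the edev together */
	edev->pdev = dev;
	dev->edev = edev;
	return 0;
}
```

The point of the design is that only the backend callback needs to know about
pci_dn, so a backend can later switch to a different lookup without touching
the core.
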
 arch/powerpc/include/asm/eeh.h               | 16 +++++++--
 arch/powerpc/kernel/eeh.c                    | 34 ++++++++++++--------
 arch/powerpc/platforms/powernv/eeh-powernv.c | 20 +++++++++---
 arch/powerpc/platforms/pseries/eeh_pseries.c | 19 ++++++++++-
 4 files changed, 67 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 466b0165fbcf..e109bfd3dd57 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -215,8 +215,20 @@ enum {
 struct eeh_ops {
 	char *name;
 	int (*init)(void);
-	void (*probe_pdn)(struct pci_dn *pdn);    /* used on pseries */
-	void (*probe_pdev)(struct pci_dev *pdev); /* used on powernv */
+
+	/*
+	 * on pseries the eeh_dev is initialised before the pci_dev exists
+	 * using the contents of the pci_dn.
+	 */
+	void (*probe_pdn)(struct pci_dn *pdn);
+
+	/*
+	 * probe_pdev() is used to find, and possibly create, an eeh_dev
+	 * for a pci_dev. The EEH core binds the returned device to the
+	 * pci_dev.
+	 */
+	struct eeh_dev *(*probe_pdev)(struct pci_dev *pdev);
+
 	int (*set_option)(struct eeh_pe *pe, int option);
 	int (*get_pe_addr)(struct eeh_pe *pe);
 	int (*get_state)(struct eeh_pe *pe, int *delay);
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 087a98b42a8c..58a8299ac417 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1099,17 +1099,24 @@ EXPORT_SYMBOL_GPL(eeh_add_device_tree_early);
  */
 void eeh_add_device_late(struct pci_dev *dev)
 {
-	struct pci_dn *pdn;
 	struct eeh_dev *edev;
 
 	if (!dev)
 		return;
 
-	pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
-	edev = pdn_to_eeh_dev(pdn);
-	eeh_edev_dbg(edev, "Adding device\n");
-	if (edev->pdev == dev) {
-		eeh_edev_dbg(edev, "Device already referenced!\n");
+	pr_debug("EEH: Adding device %s\n", pci_name(dev));
+
+	/* pci_dev_to_eeh_dev() can only work if archdata.edev is already set */
+	edev = pci_dev_to_eeh_dev(dev);
+	if (edev) {
+		/* FIXME: I don't remember why this isn't an error, but it's not */
+		eeh_edev_dbg(edev, "Already bound to an eeh_dev!\n");
+		return;
+	}
+
+	edev = eeh_ops->probe_pdev(dev);
+	if (!edev) {
+		pr_debug("EEH: Adding device failed\n");
 		return;
 	}
 
@@ -1118,8 +1125,13 @@ void eeh_add_device_late(struct pci_dev *dev)
 	 * unbalanced kref to the device during unplug time, which
 	 * relies on pcibios_release_device(). So we have to remove
 	 * that here explicitly.
+	 *
+	 * FIXME: This really shouldn't be necessary. We should probably
+	 * tear down the EEH state when we detach the pci_dev from the
+	 * bus. We might need to move the bus notifiers out of the platforms
+	 * first.
 	 */
-	if (edev->pdev) {
+	if (edev->pdev && edev->pdev != dev) {
 		eeh_rmv_from_parent_pe(edev);
 		eeh_addr_cache_rmv_dev(edev->pdev);
 		eeh_sysfs_remove_device(edev->pdev);
@@ -1130,17 +1142,11 @@ void eeh_add_device_late(struct pci_dev *dev)
 		 * into error handler afterwards.
 		 */
 		edev->mode |= EEH_DEV_NO_HANDLER;
-
-		edev->pdev = NULL;
-		dev->dev.archdata.edev = NULL;
 	}
 
-	if (eeh_ops->probe_pdev && eeh_has_flag(EEH_PROBE_MODE_DEV))
-		eeh_ops->probe_pdev(dev);
-
+	/* bind the pdev and the edev together */
 	edev->pdev = dev;
 	dev->dev.archdata.edev = edev;
-
 	eeh_addr_cache_insert_dev(dev);
 	eeh_sysfs_add_device(dev);
 }
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 8bd5317aa878..5250c4525544 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -348,9 +348,9 @@ static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
  * pnv_eeh_probe - Do probe on PCI device
  * @pdev: pci_dev to probe
  *
- * Creates (or finds an existing) edev for this pci_dev.
+ * Create, or find the existing, eeh_dev for this pci_dev.
  */
-static void pnv_eeh_probe_pdev(struct pci_dev *pdev)
+static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 {
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 	struct pci_controller *hose = pdn->phb;
@@ -367,11 +367,19 @@ static void pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	 * the probing.
 	 */
 	if (!edev || edev->pe)
-		return;
+		return NULL;
+
+	/* already configured? */
+	if (edev->pdev) {
+		pr_debug("%s: found existing edev for %04x:%02x:%02x.%01x\n",
+			__func__, hose->global_number, config_addr >> 8,
+			PCI_SLOT(config_addr), PCI_FUNC(config_addr));
+		return edev;
+	}
 
 	/* Skip for PCI-ISA bridge */
 	if ((pdn->class_code >> 8) == PCI_CLASS_BRIDGE_ISA)
-		return;
+		return NULL;
 
 	eeh_edev_dbg(edev, "Probing device\n");
 
@@ -401,7 +409,7 @@ static void pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	ret = eeh_add_to_parent_pe(edev);
 	if (ret) {
 		eeh_edev_warn(edev, "Failed to add device to PE (code %d)\n", ret);
-		return;
+		return NULL;
 	}
 
 	/*
@@ -459,6 +467,8 @@ static void pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	eeh_save_bars(edev);
 
 	eeh_edev_dbg(edev, "EEH enabled on device\n");
+
+	return edev;
 }
 
 /**
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 3ac23c884f4e..13a8c274554a 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -317,6 +317,23 @@ static void pseries_eeh_probe_pdn(struct pci_dn *pdn)
 	eeh_save_bars(edev);
 }
 
+/* Platform specific method to retrieve the eeh_dev for this pci_dev */
+static struct eeh_dev *pseries_eeh_probe_pdev(struct pci_dev *pdev)
+{
+	struct eeh_dev *edev;
+	struct pci_dn *pdn;
+
+	pdn = pci_get_pdn_by_devfn(pdev->bus, pdev->devfn);
+	if (!pdn)
+		return NULL;
+
+	edev = pdn_to_eeh_dev(pdn);
+	if (!edev || !edev->pe)
+		return NULL;
+
+	return edev;
+}
+
 /**
  * pseries_eeh_set_option - Initialize EEH or MMIO/DMA reenable
  * @pe: EEH PE
@@ -754,7 +771,7 @@ static struct eeh_ops pseries_eeh_ops = {
 	.name			= "pseries",
 	.init			= pseries_eeh_init,
 	.probe_pdn		= pseries_eeh_probe_pdn,
-	.probe_pdev 		= NULL,
+	.probe_pdev 		= pseries_eeh_probe_pdev,
 	.set_option		= pseries_eeh_set_option,
 	.get_pe_addr		= pseries_eeh_get_pe_addr,
 	.get_state		= pseries_eeh_get_state,
-- 
2.21.0



* [Very RFC 14/46] powernv/eeh: Remove un-necessary call to eeh_add_device_early()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (12 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 13/46] powerpc/eeh: Rework how pdev_probe() is used Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-22  6:01   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 15/46] powernv/eeh: Use pnv_eeh_*_config() for internal config ops Oliver O'Halloran
                   ` (31 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

eeh_add_device_early() is used to initialise the EEH state for a PCI device
based on the contents of its devicetree node. It doesn't do anything
unless EEH_PROBE_MODE_DEVTREE is set and that only happens on pseries.

Remove the call to eeh_add_device_early() in the powernv code to squash
another pci_dn usage.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 5250c4525544..aa2935a08464 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -40,13 +40,10 @@ static int eeh_event_irq = -EINVAL;
 
 void pnv_pcibios_bus_add_device(struct pci_dev *pdev)
 {
-	struct pci_dn *pdn = pci_get_pdn(pdev);
-
-	if (!pdn || eeh_has_flag(EEH_FORCE_DISABLED))
+	if (eeh_has_flag(EEH_FORCE_DISABLED))
 		return;
 
 	dev_dbg(&pdev->dev, "EEH: Setting up device\n");
-	eeh_add_device_early(pdn);
 	eeh_add_device_late(pdev);
 }
 
-- 
2.21.0



* [Very RFC 15/46] powernv/eeh: Use pnv_eeh_*_config() for internal config ops
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (13 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 14/46] powernv/eeh: Remove un-necessary call to eeh_add_device_early() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-22  6:15   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 16/46] powernv/eeh: Use eeh_edev_warn() rather than open-coding a BDFN print Oliver O'Halloran
                   ` (30 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Use the pnv_eeh_{read|write}_config() functions that take an edev rather
than a pci_dn. This allows us to remove most of the explicit uses of pci_dn
in the PowerNV EEH backend and localises them into a few functions which we
can fix later.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
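
Note for readers: the edev-based accessors differ from the raw
pnv_pci_cfg_{read|write}() calls in that they honour the PE's "config
blocked" state. A minimal userspace sketch of that gate follows. Everything
in it is a hypothetical stand-in: the flag values, the CFG_* return codes,
and the 16-dword fake config space are assumptions made for illustration and
do not match the kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative state bits; not the kernel's EEH_PE_* values. */
#define PE_RESET	(1u << 0)
#define PE_CFG_BLOCKED	(1u << 1)

/* Illustrative return codes; not the kernel's PCIBIOS_* values. */
enum { CFG_OK = 0, CFG_NOT_FOUND = -1, CFG_BLOCKED = -2 };

#define CFG_DWORDS 16

struct pe  { unsigned int state; };
struct dev {
	struct pe *pe;
	bool is_vf;			/* stands in for edev->physfn != NULL */
	uint32_t cfg[CFG_DWORDS];	/* fake config space */
};

static bool cfg_blocked(const struct dev *d)
{
	if (!d || !d->pe)
		return false;

	/* VFs are reset via config space, so don't block them during reset. */
	if (d->is_vf && (d->pe->state & PE_RESET))
		return false;

	return (d->pe->state & PE_CFG_BLOCKED) != 0;
}

/* Read one dword at byte offset 'where' from the fake config space. */
static int read_config_dword(const struct dev *d, unsigned int where,
			     uint32_t *val)
{
	if (!d || where / 4 >= CFG_DWORDS)
		return CFG_NOT_FOUND;

	if (cfg_blocked(d)) {
		*val = 0xFFFFFFFF;	/* what a failed PCI read looks like */
		return CFG_BLOCKED;
	}

	*val = d->cfg[where / 4];
	return CFG_OK;
}
```

One consequence worth keeping in mind when reading the diff: once the
capability walkers go through the gated accessor, a blocked PE makes them see
all-ones data instead of the real registers.
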
 arch/powerpc/platforms/powernv/eeh-powernv.c | 153 +++++++++----------
 1 file changed, 70 insertions(+), 83 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index aa2935a08464..aaccb3768393 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -278,27 +278,73 @@ int pnv_eeh_post_init(void)
 	return ret;
 }
 
-static int pnv_eeh_find_cap(struct pci_dn *pdn, int cap)
+static inline bool pnv_eeh_cfg_blocked(struct eeh_dev *edev)
+{
+	if (!edev || !edev->pe)
+		return false;
+
+	/*
+	 * We will issue FLR or AF FLR to all VFs, which are contained
+	 * in VF PE. It relies on the EEH PCI config accessors. So we
+	 * can't block them during the window.
+	 */
+	if (edev->physfn && (edev->pe->state & EEH_PE_RESET))
+		return false;
+
+	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
+		return true;
+
+	return false;
+}
+
+static int pnv_eeh_read_config(struct eeh_dev *edev,
+			       int where, int size, u32 *val)
+{
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
+	if (!pdn)
+		return PCIBIOS_DEVICE_NOT_FOUND;
+
+	if (pnv_eeh_cfg_blocked(edev)) {
+		*val = 0xFFFFFFFF;
+		return PCIBIOS_SET_FAILED;
+	}
+
+	return pnv_pci_cfg_read(pdn, where, size, val);
+}
+
+static int pnv_eeh_write_config(struct eeh_dev *edev,
+				int where, int size, u32 val)
+{
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
+	if (!pdn)
+		return PCIBIOS_DEVICE_NOT_FOUND;
+
+	if (pnv_eeh_cfg_blocked(edev))
+		return PCIBIOS_SET_FAILED;
+
+	return pnv_pci_cfg_write(pdn, where, size, val);
+}
+
+static int pnv_eeh_find_cap(struct eeh_dev *edev, int cap)
 {
 	int pos = PCI_CAPABILITY_LIST;
 	int cnt = 48;   /* Maximal number of capabilities */
 	u32 status, id;
 
-	if (!pdn)
-		return 0;
-
 	/* Check if the device supports capabilities */
-	pnv_pci_cfg_read(pdn, PCI_STATUS, 2, &status);
+	pnv_eeh_read_config(edev, PCI_STATUS, 2, &status);
 	if (!(status & PCI_STATUS_CAP_LIST))
 		return 0;
 
 	while (cnt--) {
-		pnv_pci_cfg_read(pdn, pos, 1, &pos);
+		pnv_eeh_read_config(edev, pos, 1, &pos);
 		if (pos < 0x40)
 			break;
 
 		pos &= ~3;
-		pnv_pci_cfg_read(pdn, pos + PCI_CAP_LIST_ID, 1, &id);
+		pnv_eeh_read_config(edev, pos + PCI_CAP_LIST_ID, 1, &id);
 		if (id == 0xff)
 			break;
 
@@ -313,15 +359,14 @@ static int pnv_eeh_find_cap(struct pci_dn *pdn, int cap)
 	return 0;
 }
 
-static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
+static int pnv_eeh_find_ecap(struct eeh_dev *edev, int cap)
 {
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
 	u32 header;
 	int pos = 256, ttl = (4096 - 256) / 8;
 
 	if (!edev || !edev->pcie_cap)
 		return 0;
-	if (pnv_pci_cfg_read(pdn, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
+	if (pnv_eeh_read_config(edev, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
 		return 0;
 	else if (!header)
 		return 0;
@@ -334,7 +379,7 @@ static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
 		if (pos < 256)
 			break;
 
-		if (pnv_pci_cfg_read(pdn, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
+		if (pnv_eeh_read_config(edev, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
 			break;
 	}
 
@@ -382,15 +427,14 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 
 	/* Initialize eeh device */
 	edev->class_code = pdn->class_code;
-	edev->mode	&= 0xFFFFFF00;
-	edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX);
-	edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
-	edev->af_cap   = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF);
-	edev->aer_cap  = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
+	edev->pcix_cap = pnv_eeh_find_cap(edev, PCI_CAP_ID_PCIX);
+	edev->pcie_cap = pnv_eeh_find_cap(edev, PCI_CAP_ID_EXP);
+	edev->af_cap   = pnv_eeh_find_cap(edev, PCI_CAP_ID_AF);
+	edev->aer_cap  = pnv_eeh_find_ecap(edev, PCI_EXT_CAP_ID_ERR);
 	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
 		edev->mode |= EEH_DEV_BRIDGE;
 		if (edev->pcie_cap) {
-			pnv_pci_cfg_read(pdn, edev->pcie_cap + PCI_EXP_FLAGS,
+			pnv_eeh_read_config(edev, edev->pcie_cap + PCI_EXP_FLAGS,
 					 2, &pcie_flags);
 			pcie_flags = (pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4;
 			if (pcie_flags == PCI_EXP_TYPE_ROOT_PORT)
@@ -839,8 +883,7 @@ static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
 
 static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 {
-	struct pci_dn *pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+	struct eeh_dev *edev = pci_dev_to_eeh_dev(dev);
 	int aer = edev ? edev->aer_cap : 0;
 	u32 ctrl;
 
@@ -944,10 +987,9 @@ void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 	}
 }
 
-static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, const char *type,
+static void pnv_eeh_wait_for_pending(struct eeh_dev *edev, const char *type,
 				     int pos, u16 mask)
 {
-	struct eeh_dev *edev = pdn->edev;
 	int i, status = 0;
 
 	/* Wait for Transaction Pending bit to be cleared */
@@ -965,9 +1007,8 @@ static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, const char *type,
 		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
 }
 
-static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
+static int pnv_eeh_do_flr(struct eeh_dev *edev, int option)
 {
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
 	u32 reg = 0;
 
 	if (WARN_ON(!edev->pcie_cap))
@@ -980,7 +1021,7 @@ static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
 	switch (option) {
 	case EEH_RESET_HOT:
 	case EEH_RESET_FUNDAMENTAL:
-		pnv_eeh_wait_for_pending(pdn, "",
+		pnv_eeh_wait_for_pending(edev, "",
 					 edev->pcie_cap + PCI_EXP_DEVSTA,
 					 PCI_EXP_DEVSTA_TRPND);
 		eeh_ops->read_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
@@ -1003,9 +1044,8 @@ static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
 	return 0;
 }
 
-static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
+static int pnv_eeh_do_af_flr(struct eeh_dev *edev, int option)
 {
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
 	u32 cap = 0;
 
 	if (WARN_ON(!edev->af_cap))
@@ -1023,7 +1063,7 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
 		 * test is used, so we use the conrol offset rather than status
 		 * and shift the test bit to match.
 		 */
-		pnv_eeh_wait_for_pending(pdn, "AF",
+		pnv_eeh_wait_for_pending(edev, "AF",
 					 edev->af_cap + PCI_AF_CTRL,
 					 PCI_AF_STATUS_TP << 8);
 		eeh_ops->write_config(edev, edev->af_cap + PCI_AF_CTRL,
@@ -1042,20 +1082,18 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
 static int pnv_eeh_reset_vf_pe(struct eeh_pe *pe, int option)
 {
 	struct eeh_dev *edev;
-	struct pci_dn *pdn;
 	int ret;
 
 	/* The VF PE should have only one child device */
 	edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, entry);
-	pdn = eeh_dev_to_pdn(edev);
-	if (!pdn)
+	if (!edev)
 		return -ENXIO;
 
-	ret = pnv_eeh_do_flr(pdn, option);
+	ret = pnv_eeh_do_flr(edev, option);
 	if (!ret)
 		return ret;
 
-	return pnv_eeh_do_af_flr(pdn, option);
+	return pnv_eeh_do_af_flr(edev, option);
 }
 
 /**
@@ -1244,57 +1282,6 @@ static int pnv_eeh_err_inject(struct eeh_pe *pe, int type, int func,
 	return 0;
 }
 
-static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
-{
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
-
-	if (!edev || !edev->pe)
-		return false;
-
-	/*
-	 * We will issue FLR or AF FLR to all VFs, which are contained
-	 * in VF PE. It relies on the EEH PCI config accessors. So we
-	 * can't block them during the window.
-	 */
-	if (edev->physfn && (edev->pe->state & EEH_PE_RESET))
-		return false;
-
-	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
-		return true;
-
-	return false;
-}
-
-static int pnv_eeh_read_config(struct eeh_dev *edev,
-			       int where, int size, u32 *val)
-{
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
-
-	if (!pdn)
-		return PCIBIOS_DEVICE_NOT_FOUND;
-
-	if (pnv_eeh_cfg_blocked(pdn)) {
-		*val = 0xFFFFFFFF;
-		return PCIBIOS_SET_FAILED;
-	}
-
-	return pnv_pci_cfg_read(pdn, where, size, val);
-}
-
-static int pnv_eeh_write_config(struct eeh_dev *edev,
-				int where, int size, u32 val)
-{
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
-
-	if (!pdn)
-		return PCIBIOS_DEVICE_NOT_FOUND;
-
-	if (pnv_eeh_cfg_blocked(pdn))
-		return PCIBIOS_SET_FAILED;
-
-	return pnv_pci_cfg_write(pdn, where, size, val);
-}
-
 static void pnv_eeh_dump_hub_diag_common(struct OpalIoP7IOCErrorData *data)
 {
 	/* GEM */
-- 
2.21.0



* [Very RFC 16/46] powernv/eeh: Use eeh_edev_warn() rather than open-coding a BDFN print
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (14 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 15/46] powernv/eeh: Use pnv_eeh_*_config() for internal config ops Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-22  6:17   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 17/46] powernv/eeh: add pnv_eeh_find_edev() Oliver O'Halloran
                   ` (29 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Neaten things up a bit and remove a pci_dn use.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index aaccb3768393..f58fe6bda46e 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1001,10 +1001,8 @@ static void pnv_eeh_wait_for_pending(struct eeh_dev *edev, const char *type,
 		msleep((1 << i) * 100);
 	}
 
-	pr_warn("%s: Pending transaction while issuing %sFLR to %04x:%02x:%02x.%01x\n",
-		__func__, type,
-		pdn->phb->global_number, pdn->busno,
-		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+	eeh_edev_warn(edev, "%s: Pending transaction while issuing %sFLR\n",
+		__func__, type);
 }
 
 static int pnv_eeh_do_flr(struct eeh_dev *edev, int option)
-- 
2.21.0



* [Very RFC 17/46] powernv/eeh: add pnv_eeh_find_edev()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (15 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 16/46] powernv/eeh: Use eeh_edev_warn() rather than open-coding a BDFN print Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-25  0:30   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 18/46] powernv/pci: Add pci_bus_to_pnvhb() helper Oliver O'Halloran
                   ` (28 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

To get away from using pci_dn we need a way to find the edev for a given
bdfn. The easiest way to do this is to find the ioda_pe for that BDFN in
the PHB's reverse mapping table and scan the device list of the
corresponding eeh_pe.

Is this slow? Yeah probably. Is it slower than the existing "traverse the
pdn tree" method? Probably not.
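
For anyone who wants the shape of the lookup without reading the diff, here
is a rough userspace sketch. Every name in it (toy_phb, toy_pe, toy_edev,
rmap, TOY_NO_PE) is made up for illustration and is not the kernel's real
structure layout; it only models "reverse map to a PE, then scan that PE's
device list".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; none of these are the kernel's real structures. */
struct toy_edev {
	uint16_t bdfn;
	struct toy_edev *next;	/* next device in the same PE */
};

struct toy_pe {
	int pe_number;
	struct toy_edev *devs;	/* head of the PE's device list */
};

#define TOY_NO_PE	(-1)

struct toy_phb {
	int rmap[1 << 16];	/* bdfn -> PE number, TOY_NO_PE if unassigned */
	struct toy_pe *pes;	/* indexed by PE number */
	int nr_pes;
};

static struct toy_edev *toy_find_edev(struct toy_phb *phb, uint16_t bdfn)
{
	struct toy_edev *edev;
	int pe_no;

	assert(phb != NULL);

	/* Step 1: the reverse map gives the PE that owns this bdfn. */
	pe_no = phb->rmap[bdfn];
	if (pe_no < 0 || pe_no >= phb->nr_pes)
		return NULL;

	/* Step 2: linear scan of that PE's device list. */
	for (edev = phb->pes[pe_no].devs; edev; edev = edev->next)
		if (edev->bdfn == bdfn)
			return edev;

	return NULL;
}
```

The cost is one table lookup plus a walk over the devices in a single PE,
which is usually a short list.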

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 31 ++++++++++++++++++++
 arch/powerpc/platforms/powernv/pci.h         |  2 ++
 2 files changed, 33 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index f58fe6bda46e..a974822c5097 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -278,6 +278,37 @@ int pnv_eeh_post_init(void)
 	return ret;
 }
 
+struct eeh_dev *pnv_eeh_find_edev(struct pnv_phb *phb, u16 bdfn)
+{
+	struct pnv_ioda_pe *ioda_pe;
+	struct eeh_dev *tmp, *edev;
+	struct eeh_pe *pe;
+
+	/* EEH not enabled ? */
+	if (!(phb->flags & PNV_PHB_FLAG_EEH))
+		return NULL;
+
+	/* Fish the EEH PE from the IODA PE */
+	ioda_pe = __pnv_ioda_get_pe(phb, bdfn);
+	if (!ioda_pe)
+		return NULL;
+
+	/*
+	 * FIXME: Doing a tree-traversal followed by a list traversal
+	 * on every config access is dumb. Not much dumber than the pci_dn
+	 * tree traversal we did before, but still quite dumb.
+	 */
+	pe = eeh_pe_get(phb->hose, ioda_pe->pe_number, 0);
+	if (!pe)
+		return NULL;
+
+	eeh_pe_for_each_dev(pe, edev, tmp)
+		if (edev->bdfn == bdfn)
+			return edev;
+
+	return NULL;
+}
+
 static inline bool pnv_eeh_cfg_blocked(struct eeh_dev *edev)
 {
 	if (!edev || !edev->pe)
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 3c33a0c91a69..a343f3c8e65c 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -196,6 +196,8 @@ extern void pnv_set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq);
 extern unsigned long pnv_pci_ioda2_get_table_size(__u32 page_shift,
 		__u64 window_size, __u32 levels);
 extern int pnv_eeh_post_init(void);
+struct eeh_dev;
+struct eeh_dev *pnv_eeh_find_edev(struct pnv_phb *phb, u16 bdfn);
 
 __printf(3, 4)
 extern void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [Very RFC 18/46] powernv/pci: Add pci_bus_to_pnvhb() helper
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (16 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 17/46] powernv/eeh: add pnv_eeh_find_edev() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-25  0:42   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 19/46] powernv/eeh: Use standard PCI capability lookup functions Oliver O'Halloran
                   ` (27 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Add a helper to go from a pci_bus structure to the pnv_phb that hosts that
bus. There are a lot of instances of the following pattern:

	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
	struct pnv_phb *phb = hose->private_data;

with no other use of the pci_controller inside the function. This is
hard to read since it requires you to memorise the contents of the
private data fields, and it's error prone since it involves blindly
assigning a void pointer. Add a helper to make it more concise and
explicit.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 88 +++++++----------------
 arch/powerpc/platforms/powernv/pci.c      | 18 ++---
 arch/powerpc/platforms/powernv/pci.h      | 10 +++
 3 files changed, 39 insertions(+), 77 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index c74521e5f3ab..a1c9315f3208 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -252,8 +252,7 @@ static int pnv_ioda2_init_m64(struct pnv_phb *phb)
 static void pnv_ioda_reserve_dev_m64_pe(struct pci_dev *pdev,
 					 unsigned long *pe_bitmap)
 {
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct resource *r;
 	resource_size_t base, sgsz, start, end;
 	int segno, i;
@@ -351,8 +350,7 @@ static void pnv_ioda_reserve_m64_pe(struct pci_bus *bus,
 
 static struct pnv_ioda_pe *pnv_ioda_pick_m64_pe(struct pci_bus *bus, bool all)
 {
-	struct pci_controller *hose = pci_bus_to_host(bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
 	struct pnv_ioda_pe *master_pe, *pe;
 	unsigned long size, *pe_alloc;
 	int i;
@@ -673,8 +671,7 @@ struct pnv_ioda_pe *__pnv_ioda_get_pe(struct pnv_phb *phb, u16 bdfn)
 
 struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev)
 {
-	struct pci_controller *hose = pci_bus_to_host(dev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
 	struct pci_dn *pdn = pci_get_pdn(dev);
 
 	if (!pdn)
@@ -1053,8 +1050,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 
 static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
 {
-	struct pci_controller *hose = pci_bus_to_host(dev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
 	struct pci_dn *pdn = pci_get_pdn(dev);
 	struct pnv_ioda_pe *pe;
 
@@ -1113,8 +1109,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
  */
 static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
 {
-	struct pci_controller *hose = pci_bus_to_host(bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
 	struct pnv_ioda_pe *pe = NULL;
 	unsigned int pe_num;
 
@@ -1181,8 +1176,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
 	struct pnv_ioda_pe *pe;
 	struct pci_dev *gpu_pdev;
 	struct pci_dn *npu_pdn;
-	struct pci_controller *hose = pci_bus_to_host(npu_pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(npu_pdev->bus);
 
 	/*
 	 * Due to a hardware errata PE#0 on the NPU is reserved for
@@ -1279,16 +1273,12 @@ static void pnv_pci_ioda_setup_PEs(void)
 #ifdef CONFIG_PCI_IOV
 static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
 {
-	struct pci_bus        *bus;
-	struct pci_controller *hose;
 	struct pnv_phb        *phb;
 	struct pci_dn         *pdn;
 	int                    i, j;
 	int                    m64_bars;
 
-	bus = pdev->bus;
-	hose = pci_bus_to_host(bus);
-	phb = hose->private_data;
+	phb = pci_bus_to_pnvhb(pdev->bus);
 	pdn = pci_get_pdn(pdev);
 
 	if (pdn->m64_single_mode)
@@ -1312,8 +1302,6 @@ static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
 
 static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 {
-	struct pci_bus        *bus;
-	struct pci_controller *hose;
 	struct pnv_phb        *phb;
 	struct pci_dn         *pdn;
 	unsigned int           win;
@@ -1325,9 +1313,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 	int                    pe_num;
 	int                    m64_bars;
 
-	bus = pdev->bus;
-	hose = pci_bus_to_host(bus);
-	phb = hose->private_data;
+	phb = pci_bus_to_pnvhb(pdev->bus);
 	pdn = pci_get_pdn(pdev);
 	total_vfs = pci_sriov_get_totalvfs(pdev);
 
@@ -1438,15 +1424,11 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe
 
 static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
 {
-	struct pci_bus        *bus;
-	struct pci_controller *hose;
 	struct pnv_phb        *phb;
 	struct pnv_ioda_pe    *pe, *pe_n;
 	struct pci_dn         *pdn;
 
-	bus = pdev->bus;
-	hose = pci_bus_to_host(bus);
-	phb = hose->private_data;
+	phb = pci_bus_to_pnvhb(pdev->bus);
 	pdn = pci_get_pdn(pdev);
 
 	if (!pdev->is_physfn)
@@ -1471,16 +1453,12 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
 
 void pnv_pci_sriov_disable(struct pci_dev *pdev)
 {
-	struct pci_bus        *bus;
-	struct pci_controller *hose;
 	struct pnv_phb        *phb;
 	struct pnv_ioda_pe    *pe;
 	struct pci_dn         *pdn;
 	u16                    num_vfs, i;
 
-	bus = pdev->bus;
-	hose = pci_bus_to_host(bus);
-	phb = hose->private_data;
+	phb = pci_bus_to_pnvhb(pdev->bus);
 	pdn = pci_get_pdn(pdev);
 	num_vfs = pdn->num_vfs;
 
@@ -1519,17 +1497,13 @@ static void pnv_ioda_setup_bus_iommu_group(struct pnv_ioda_pe *pe,
 #endif
 static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 {
-	struct pci_bus        *bus;
-	struct pci_controller *hose;
 	struct pnv_phb        *phb;
 	struct pnv_ioda_pe    *pe;
 	int                    pe_num;
 	u16                    vf_index;
 	struct pci_dn         *pdn;
 
-	bus = pdev->bus;
-	hose = pci_bus_to_host(bus);
-	phb = hose->private_data;
+	phb = pci_bus_to_pnvhb(pdev->bus);
 	pdn = pci_get_pdn(pdev);
 
 	if (!pdev->is_physfn)
@@ -1556,7 +1530,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 		pe->rid = (vf_bus << 8) | vf_devfn;
 
 		pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",
-			hose->global_number, pdev->bus->number,
+			pci_domain_nr(pdev->bus), pdev->bus->number,
 			PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
 
 		if (pnv_ioda_configure_pe(phb, pe)) {
@@ -1591,17 +1565,13 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 
 int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 {
-	struct pci_bus        *bus;
-	struct pci_controller *hose;
 	struct pnv_phb        *phb;
 	struct pnv_ioda_pe    *pe;
 	struct pci_dn         *pdn;
 	int                    ret;
 	u16                    i;
 
-	bus = pdev->bus;
-	hose = pci_bus_to_host(bus);
-	phb = hose->private_data;
+	phb = pci_bus_to_pnvhb(pdev->bus);
 	pdn = pci_get_pdn(pdev);
 
 	if (phb->type == PNV_PHB_IODA2) {
@@ -1816,8 +1786,7 @@ static int pnv_pci_ioda_dma_64bit_bypass(struct pnv_ioda_pe *pe)
 static bool pnv_pci_ioda_iommu_bypass_supported(struct pci_dev *pdev,
 		u64 dma_mask)
 {
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 	struct pnv_ioda_pe *pe;
 
@@ -2866,8 +2835,7 @@ static void pnv_pci_init_ioda_msis(struct pnv_phb *phb)
 #ifdef CONFIG_PCI_IOV
 static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 {
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	const resource_size_t gate = phb->ioda.m64_segsize >> 2;
 	struct resource *res;
 	int i;
@@ -3202,10 +3170,9 @@ static void pnv_pci_ioda_fixup(void)
 static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
 						unsigned long type)
 {
-	struct pci_dev *bridge;
-	struct pci_controller *hose = pci_bus_to_host(bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
 	int num_pci_bridges = 0;
+	struct pci_dev *bridge;
 
 	bridge = bus->self;
 	while (bridge) {
@@ -3291,8 +3258,7 @@ static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
 
 static void pnv_pci_configure_bus(struct pci_bus *bus)
 {
-	struct pci_controller *hose = pci_bus_to_host(bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
 	struct pci_dev *bridge = bus->self;
 	struct pnv_ioda_pe *pe;
 	bool all = (bridge && pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
@@ -3354,8 +3320,7 @@ static resource_size_t pnv_pci_default_alignment(void)
 static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
 {
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 	resource_size_t align;
 
@@ -3391,8 +3356,7 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
  */
 static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
 {
-	struct pci_controller *hose = pci_bus_to_host(dev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
 	struct pci_dn *pdn;
 
 	/* The function is probably called while the PEs have
@@ -3577,8 +3541,7 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
 
 static void pnv_pci_release_device(struct pci_dev *pdev)
 {
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 	struct pnv_ioda_pe *pe;
 
@@ -3623,8 +3586,7 @@ static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
 
 void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 {
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 	struct pnv_ioda_pe *pe;
 
@@ -3664,8 +3626,7 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 
 void pnv_pci_dma_bus_setup(struct pci_bus *bus)
 {
-	struct pci_controller *hose = bus->sysdata;
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
 	struct pnv_ioda_pe *pe;
 
 	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
@@ -3999,8 +3960,7 @@ void __init pnv_pci_init_npu2_opencapi_phb(struct device_node *np)
 
 static void pnv_npu2_opencapi_cfg_size_fixup(struct pci_dev *dev)
 {
-	struct pci_controller *hose = pci_bus_to_host(dev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
 
 	if (!machine_is(powernv))
 		return;
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 8b9058b52575..d36dde9777aa 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -158,8 +158,7 @@ EXPORT_SYMBOL_GPL(pnv_pci_set_power_state);
 
 int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 {
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct msi_desc *entry;
 	struct msi_msg msg;
 	int hwirq;
@@ -207,8 +206,7 @@ int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 
 void pnv_teardown_msi_irqs(struct pci_dev *pdev)
 {
-	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct msi_desc *entry;
 	irq_hw_number_t hwirq;
 
@@ -820,10 +818,9 @@ EXPORT_SYMBOL(pnv_pci_get_phb_node);
 
 int pnv_pci_set_tunnel_bar(struct pci_dev *dev, u64 addr, int enable)
 {
-	__be64 val;
-	struct pci_controller *hose;
-	struct pnv_phb *phb;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
 	u64 tunnel_bar;
+	__be64 val;
 	int rc;
 
 	if (!opal_check_token(OPAL_PCI_GET_PBCQ_TUNNEL_BAR))
@@ -831,9 +828,6 @@ int pnv_pci_set_tunnel_bar(struct pci_dev *dev, u64 addr, int enable)
 	if (!opal_check_token(OPAL_PCI_SET_PBCQ_TUNNEL_BAR))
 		return -ENXIO;
 
-	hose = pci_bus_to_host(dev->bus);
-	phb = hose->private_data;
-
 	mutex_lock(&tunnel_mutex);
 	rc = opal_pci_get_pbcq_tunnel_bar(phb->opal_id, &val);
 	if (rc != OPAL_SUCCESS) {
@@ -937,15 +931,13 @@ static int pnv_tce_iommu_bus_notifier(struct notifier_block *nb,
 	struct pci_dev *pdev;
 	struct pci_dn *pdn;
 	struct pnv_ioda_pe *pe;
-	struct pci_controller *hose;
 	struct pnv_phb *phb;
 
 	switch (action) {
 	case BUS_NOTIFY_ADD_DEVICE:
 		pdev = to_pci_dev(dev);
 		pdn = pci_get_pdn(pdev);
-		hose = pci_bus_to_host(pdev->bus);
-		phb = hose->private_data;
+		phb = pci_bus_to_pnvhb(pdev->bus);
 
 		WARN_ON_ONCE(!phb);
 		if (!pdn || pdn->pe_number == IODA_INVALID_PE || !phb)
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index a343f3c8e65c..be435a810d19 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -247,4 +247,14 @@ extern void pnv_pci_setup_iommu_table(struct iommu_table *tbl,
 		void *tce_mem, u64 tce_size,
 		u64 dma_offset, unsigned int page_shift);
 
+static inline struct pnv_phb *pci_bus_to_pnvhb(struct pci_bus *bus)
+{
+	struct pci_controller *hose = bus->sysdata;
+
+	if (hose)
+		return hose->private_data;
+
+	return NULL;
+}
+
 #endif /* __POWERNV_PCI_H */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [Very RFC 19/46] powernv/eeh: Use standard PCI capability lookup functions
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (17 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 18/46] powernv/pci: Add pci_bus_to_pnvhb() helper Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-25  1:02   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 20/46] powernv/eeh: Look up device info from pci_dev Oliver O'Halloran
                   ` (26 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

We have a pci_dev so we can use the functions provided by the PCI core for
looking up capabilities. This should be safe since these are only called
when initialising the eeh_dev when the device is first probed and not in
the EEH recovery path where config accesses are blocked.

This might cause a problem if an EEH event occurred while probing the device,
but I'm pretty sure that's going to be broken anyway.
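
As a reminder of what a classic capability lookup involves, here is a
userspace sketch that walks the capability list of a fake 256-byte config
space held in a plain array. It follows the same steps as the open-coded
helper being removed; the PCI core's own implementation differs in detail.
toy_find_cap() and CFG_SIZE are made-up names, and the register offsets are
written out by hand to match the usual PCI definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_SIZE		256	/* conventional config space only */
#define PCI_STATUS		0x06
#define PCI_STATUS_CAP_LIST	0x10
#define PCI_CAPABILITY_LIST	0x34
#define PCI_CAP_LIST_ID		0
#define PCI_CAP_LIST_NEXT	1

/* Return the offset of capability 'cap', or 0 if it is not present. */
static int toy_find_cap(const uint8_t *cfg, int cap)
{
	int ttl = 48;		/* bound the walk so a looped list terminates */
	uint16_t status;
	uint8_t pos, id;

	assert(cfg != NULL);

	/* No capability list at all? */
	status = (uint16_t)(cfg[PCI_STATUS] | (cfg[PCI_STATUS + 1] << 8));
	if (!(status & PCI_STATUS_CAP_LIST))
		return 0;

	pos = cfg[PCI_CAPABILITY_LIST];
	while (ttl--) {
		/* Pointers below 0x40 land in the standard header: stop. */
		if (pos < 0x40)
			break;

		pos &= 0xfc;	/* the low two bits are reserved */
		id = cfg[pos + PCI_CAP_LIST_ID];
		if (id == 0xff)
			break;
		if (id == cap)
			return pos;

		pos = cfg[pos + PCI_CAP_LIST_NEXT];
	}

	return 0;
}
```

The ttl bound matters: a broken device can present a list that points back
at itself, and the walk has to give up rather than spin.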

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 67 ++------------------
 1 file changed, 4 insertions(+), 63 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index a974822c5097..b79aca8368c6 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -358,65 +358,6 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
 	return pnv_pci_cfg_write(pdn, where, size, val);
 }
 
-static int pnv_eeh_find_cap(struct eeh_dev *edev, int cap)
-{
-	int pos = PCI_CAPABILITY_LIST;
-	int cnt = 48;   /* Maximal number of capabilities */
-	u32 status, id;
-
-	/* Check if the device supports capabilities */
-	pnv_eeh_read_config(edev, PCI_STATUS, 2, &status);
-	if (!(status & PCI_STATUS_CAP_LIST))
-		return 0;
-
-	while (cnt--) {
-		pnv_eeh_read_config(edev, pos, 1, &pos);
-		if (pos < 0x40)
-			break;
-
-		pos &= ~3;
-		pnv_eeh_read_config(edev, pos + PCI_CAP_LIST_ID, 1, &id);
-		if (id == 0xff)
-			break;
-
-		/* Found */
-		if (id == cap)
-			return pos;
-
-		/* Next one */
-		pos += PCI_CAP_LIST_NEXT;
-	}
-
-	return 0;
-}
-
-static int pnv_eeh_find_ecap(struct eeh_dev *edev, int cap)
-{
-	u32 header;
-	int pos = 256, ttl = (4096 - 256) / 8;
-
-	if (!edev || !edev->pcie_cap)
-		return 0;
-	if (pnv_eeh_read_config(edev, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
-		return 0;
-	else if (!header)
-		return 0;
-
-	while (ttl-- > 0) {
-		if (PCI_EXT_CAP_ID(header) == cap && pos)
-			return pos;
-
-		pos = PCI_EXT_CAP_NEXT(header);
-		if (pos < 256)
-			break;
-
-		if (pnv_eeh_read_config(edev, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
-			break;
-	}
-
-	return 0;
-}
-
 /**
  * pnv_eeh_probe - Do probe on PCI device
  * @pdev: pci_dev to probe
@@ -458,10 +399,10 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 
 	/* Initialize eeh device */
 	edev->class_code = pdn->class_code;
-	edev->pcix_cap = pnv_eeh_find_cap(edev, PCI_CAP_ID_PCIX);
-	edev->pcie_cap = pnv_eeh_find_cap(edev, PCI_CAP_ID_EXP);
-	edev->af_cap   = pnv_eeh_find_cap(edev, PCI_CAP_ID_AF);
-	edev->aer_cap  = pnv_eeh_find_ecap(edev, PCI_EXT_CAP_ID_ERR);
+	edev->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
+	edev->pcie_cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+	edev->af_cap   = pci_find_capability(pdev, PCI_CAP_ID_AF);
+	edev->aer_cap  = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
 	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
 		edev->mode |= EEH_DEV_BRIDGE;
 		if (edev->pcie_cap) {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [Very RFC 20/46] powernv/eeh: Look up device info from pci_dev
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (18 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 19/46] powernv/eeh: Use standard PCI capability lookup functions Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-25  1:26   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 21/46] powernv/eeh: Rework finding an existing edev in probe_pdev() Oliver O'Halloran
                   ` (25 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Most of what we fetch from the pci_dn is also in the pci_dev structure.
Convert pnv_eeh_probe_pdev() to use the pdev fields rather than the pci_dn
so we can get rid of pci_dn eventually.
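
The config_addr built in the probe function is just the bus number in the
high byte and the devfn in the low byte. A small standalone sketch of that
packing, with made-up TOY_* names that use the same bit layout as the
kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC macros:

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC macros. */
#define TOY_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define TOY_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define TOY_FUNC(devfn)		((devfn) & 0x07)

/* Pack bus, slot and function into a 16-bit bus/devfn address. */
static uint16_t toy_config_addr(uint8_t busno, uint8_t slot, uint8_t func)
{
	assert(slot < 32 && func < 8);
	return (uint16_t)((busno << 8) | TOY_DEVFN(slot, func));
}

/* The bus number is the high byte of the packed address. */
static uint8_t toy_busno(uint16_t config_addr)
{
	return (uint8_t)(config_addr >> 8);
}
```

Since the slot and function macros mask their input, they can be applied
to the full 16-bit address as well as to a bare devfn.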

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 26 ++++++++++----------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index b79aca8368c6..6ba74836a9f8 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -372,7 +372,7 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
 	uint32_t pcie_flags;
 	int ret;
-	int config_addr = (pdn->busno << 8) | (pdn->devfn);
+	int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
 
 	/*
 	 * When probing the root bridge, which doesn't have any
@@ -392,18 +392,18 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	}
 
 	/* Skip for PCI-ISA bridge */
-	if ((pdn->class_code >> 8) == PCI_CLASS_BRIDGE_ISA)
+	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
 		return NULL;
 
 	eeh_edev_dbg(edev, "Probing device\n");
 
 	/* Initialize eeh device */
-	edev->class_code = pdn->class_code;
+	edev->class_code = pdev->class;
 	edev->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
 	edev->pcie_cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
 	edev->af_cap   = pci_find_capability(pdev, PCI_CAP_ID_AF);
 	edev->aer_cap  = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
-	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
+	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_PCI) {
 		edev->mode |= EEH_DEV_BRIDGE;
 		if (edev->pcie_cap) {
 			pnv_eeh_read_config(edev, edev->pcie_cap + PCI_EXP_FLAGS,
@@ -443,14 +443,14 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	 * Broadcom Shiner 4-ports 1G NICs (14e4:168a)
 	 * Broadcom Shiner 2-ports 10G NICs (14e4:168e)
 	 */
-	if ((pdn->vendor_id == PCI_VENDOR_ID_BROADCOM &&
-	     pdn->device_id == 0x1656) ||
-	    (pdn->vendor_id == PCI_VENDOR_ID_BROADCOM &&
-	     pdn->device_id == 0x1657) ||
-	    (pdn->vendor_id == PCI_VENDOR_ID_BROADCOM &&
-	     pdn->device_id == 0x168a) ||
-	    (pdn->vendor_id == PCI_VENDOR_ID_BROADCOM &&
-	     pdn->device_id == 0x168e))
+	if ((pdev->vendor == PCI_VENDOR_ID_BROADCOM &&
+	     pdev->device == 0x1656) ||
+	    (pdev->vendor == PCI_VENDOR_ID_BROADCOM &&
+	     pdev->device == 0x1657) ||
+	    (pdev->vendor == PCI_VENDOR_ID_BROADCOM &&
+	     pdev->device == 0x168a) ||
+	    (pdev->vendor == PCI_VENDOR_ID_BROADCOM &&
+	     pdev->device == 0x168e))
 		edev->pe->state |= EEH_PE_CFG_RESTRICTED;
 
 	/*
@@ -461,7 +461,7 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	 */
 	if (!(edev->pe->state & EEH_PE_PRI_BUS)) {
 		edev->pe->bus = pci_find_bus(hose->global_number,
-					     pdn->busno);
+					     pdev->bus->number);
 		if (edev->pe->bus)
 			edev->pe->state |= EEH_PE_PRI_BUS;
 	}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [Very RFC 21/46] powernv/eeh: Rework finding an existing edev in probe_pdev()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (19 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 20/46] powernv/eeh: Look up device info from pci_dev Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-25  3:20   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed Oliver O'Halloran
                   ` (24 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Use the pnv_eeh_find_edev() helper to look up the eeh_dev for a device
rather than doing it via the pci_dn.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 44 ++++++++++++++------
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 6ba74836a9f8..1cd80b399995 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -374,20 +374,40 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	int ret;
 	int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
 
+	pci_dbg(pdev, "%s: probing\n", __func__);
+
 	/*
-	 * When probing the root bridge, which doesn't have any
-	 * subordinate PCI devices. We don't have OF node for
-	 * the root bridge. So it's not reasonable to continue
-	 * the probing.
+	 * EEH keeps the eeh_dev alive over a recovery pass even when the
+	 * corresponding pci_dev has been torn down. In that case we need
+	 * to find the existing eeh_dev and re-bind the two.
 	 */
-	if (!edev || edev->pe)
-		return NULL;
+	edev = pnv_eeh_find_edev(phb, config_addr);
+	if (edev) {
+		eeh_edev_dbg(edev, "Found existing edev!\n");
+
+		/*
+		 * XXX: eeh_remove_device() clears pdev so we shouldn't hit this
+		 * normally. I've found that screwing around with the pci probe
+		 * path can result in eeh_probe_pdev() being called twice. This
+		 * is harmless at the moment, but it's pretty strange so emit a
+		 * warning to be on the safe side.
+		 */
+		if (WARN_ON(edev->pdev))
+			eeh_edev_dbg(edev, "%s: already bound to a pdev!\n", __func__);
+
+		edev->pdev = pdev;
+
+		/* should we be doing something with REMOVED too? */
+		edev->mode &= ~EEH_DEV_DISCONNECTED;
+
+		/* update the primary bus if we need to */
+		// XXX: why do we need to do this? is the pci_bus going away? what cleared the flag?
+		if (!(edev->pe->state & EEH_PE_PRI_BUS)) {
+			edev->pe->bus = pdev->bus;
+			if (edev->pe->bus)
+				edev->pe->state |= EEH_PE_PRI_BUS;
+		}
 
-	/* already configured? */
-	if (edev->pdev) {
-		pr_debug("%s: found existing edev for %04x:%02x:%02x.%01x\n",
-			__func__, hose->global_number, config_addr >> 8,
-			PCI_SLOT(config_addr), PCI_FUNC(config_addr));
 		return edev;
 	}
 
@@ -395,8 +415,6 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
 		return NULL;
 
-	eeh_edev_dbg(edev, "Probing device\n");
-
 	/* Initialize eeh device */
 	edev->class_code = pdev->class;
 	edev->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (20 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 21/46] powernv/eeh: Rework finding an existing edev in probe_pdev() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-25  3:27   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 23/46] powerpc/eeh: Moving finding the parent PE into the platform Oliver O'Halloran
                   ` (23 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Have the PowerNV EEH backend allocate the eeh_dev if needed rather than using
the one attached to the pci_dn. This gets us most of the way towards decoupling
pci_dn from the PowerNV EEH code.
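
The resulting probe flow is "reuse the existing eeh_dev if one survived,
otherwise allocate a fresh one". A userspace sketch of that flow follows.
All of the toy_* names, the fixed-size table and the flag value are
assumptions made for illustration; the real code looks the edev up through
the PE and does considerably more initialisation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_DEVS		8
#define TOY_DEV_DISCONNECTED	(1 << 0)

struct toy_pdev {
	uint8_t busno;
	uint8_t devfn;
};

struct toy_edev {
	uint16_t bdfn;
	int mode;
	struct toy_pdev *pdev;	/* NULL while the pci device is gone */
};

struct toy_phb {
	struct toy_edev *edevs[TOY_MAX_DEVS];
	int nr_edevs;
};

static struct toy_edev *toy_find_edev(struct toy_phb *phb, uint16_t bdfn)
{
	for (int i = 0; i < phb->nr_edevs; i++)
		if (phb->edevs[i]->bdfn == bdfn)
			return phb->edevs[i];
	return NULL;
}

static struct toy_edev *toy_probe(struct toy_phb *phb, struct toy_pdev *pdev)
{
	struct toy_edev *edev;
	uint16_t bdfn;

	assert(phb != NULL && pdev != NULL);
	bdfn = (uint16_t)((pdev->busno << 8) | pdev->devfn);

	/* An edev that outlived its pdev is re-bound rather than replaced. */
	edev = toy_find_edev(phb, bdfn);
	if (edev) {
		edev->pdev = pdev;
		edev->mode &= ~TOY_DEV_DISCONNECTED;
		return edev;
	}

	/* Otherwise allocate and initialise a new one. */
	if (phb->nr_edevs == TOY_MAX_DEVS)
		return NULL;
	edev = calloc(1, sizeof(*edev));
	if (!edev)
		return NULL;

	edev->bdfn = bdfn;
	edev->pdev = pdev;
	phb->edevs[phb->nr_edevs++] = edev;
	return edev;
}
```

Note the sketch never frees anything on its own, so the caller owns the
allocation and has to release it.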

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
We should probably be free()ing the eeh_dev somewhere. The pci_dev release
function is the right place for it.
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 22 ++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 1cd80b399995..7aba18e08996 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -366,10 +366,9 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
  */
 static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 {
-	struct pci_dn *pdn = pci_get_pdn(pdev);
-	struct pci_controller *hose = pdn->phb;
-	struct pnv_phb *phb = hose->private_data;
-	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
+	struct pci_controller *hose = phb->hose;
+	struct eeh_dev *edev;
 	uint32_t pcie_flags;
 	int ret;
 	int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
@@ -415,12 +414,27 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
 		return NULL;
 
+	/* otherwise allocate and initialise a new eeh_dev */
+	edev = kzalloc(sizeof(*edev), GFP_KERNEL);
+	if (!edev) {
+		pr_err("%s: out of memory lol\n", __func__);
+		return NULL;
+	}
+
 	/* Initialize eeh device */
+	edev->bdfn       = config_addr;
+	edev->controller = phb->hose;
+
 	edev->class_code = pdev->class;
 	edev->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
 	edev->pcie_cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
 	edev->af_cap   = pci_find_capability(pdev, PCI_CAP_ID_AF);
 	edev->aer_cap  = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
+
+	/* TODO: stash the vf_index in here? */
+	if (pdev->is_virtfn)
+		edev->physfn = pdev->physfn;
+
 	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_PCI) {
 		edev->mode |= EEH_DEV_BRIDGE;
 		if (edev->pcie_cap) {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [Very RFC 23/46] powerpc/eeh: Moving finding the parent PE into the platform
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (21 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-25  5:00   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 24/46] powernv/pci: Make the pre-cfg EEH freeze check use eeh_dev rather than pci_dn Oliver O'Halloran
                   ` (22 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Currently the generic EEH code uses the pci_dn of a device to look up the
PE of the device's parent bridge, or physical function. The generic
function to insert the edev (and possibly create the eeh_pe) is called
from the probe functions already so this is a relatively minor change.

The existing lookup method moves into the pseries platform and PowerNV
can choose the PE based on the bus hierarchy instead.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
"parent" meaning "parent of the PE that actually contains this edev"
is stupid, but it's stupid consistent with what's there already. Also
I couldn't think of a way to fix it without adding a bunch of boring
boilerplate at the call sites.

FIXME: I think I introduced a bug here. Currently we coalesce a switch's
upstream port bus and the downstream port bus into a single PE since
they're a single failure domain. That seems to have been broken by
this patch, but whatever.
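
For reference, the lookup that moves into pseries boils down to the walk
below. This is a standalone sketch, not kernel code: struct model_node,
struct model_pe and model_pe_get_parent() are hypothetical stand-ins for
pci_dn, eeh_pe and pseries_eeh_pe_get_parent(), and the VF/physfn special
case is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct eeh_pe. */
struct model_pe {
	int addr;
};

/* Hypothetical stand-in for a pci_dn with its attached eeh_dev. */
struct model_node {
	struct model_node *parent;	/* NULL at the top of the tree */
	int has_edev;			/* 0 models pdn_to_eeh_dev() == NULL */
	struct model_pe *pe;		/* NULL until a PE has been assigned */
};

/*
 * Walk upwards starting at the node's parent and return the first PE
 * found. NULL means we either ran off the top of the tree or hit a
 * node without an EEH device; the caller then falls back to the PHB PE.
 */
static struct model_pe *model_pe_get_parent(const struct model_node *node)
{
	const struct model_node *cur = node ? node->parent : NULL;

	while (cur) {
		if (!cur->has_edev)
			return NULL;
		if (cur->pe)
			return cur->pe;
		cur = cur->parent;
	}

	return NULL;
}
```

A NULL result here corresponds to passing a NULL @parent to
eeh_add_to_parent_pe(), which then inserts the new PE under the PHB PE.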
---
 arch/powerpc/include/asm/eeh.h               |  2 +-
 arch/powerpc/kernel/eeh_pe.c                 | 54 ++++-------------
 arch/powerpc/platforms/powernv/eeh-powernv.c | 25 +++++++-
 arch/powerpc/platforms/pseries/eeh_pseries.c | 61 ++++++++++++++++----
 4 files changed, 86 insertions(+), 56 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index e109bfd3dd57..70d3e01dbe9d 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -295,7 +295,7 @@ struct eeh_pe *eeh_phb_pe_get(struct pci_controller *phb);
 struct eeh_pe *eeh_pe_next(struct eeh_pe *pe, struct eeh_pe *root);
 struct eeh_pe *eeh_pe_get(struct pci_controller *phb,
 			  int pe_no, int config_addr);
-int eeh_add_to_parent_pe(struct eeh_dev *edev);
+int eeh_add_to_parent_pe(struct eeh_pe *parent, struct eeh_dev *edev);
 int eeh_rmv_from_parent_pe(struct eeh_dev *edev);
 void eeh_pe_update_time_stamp(struct eeh_pe *pe);
 void *eeh_pe_traverse(struct eeh_pe *root,
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 831f363f1732..520c249f19d3 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -318,56 +318,23 @@ struct eeh_pe *eeh_pe_get(struct pci_controller *phb,
 	return pe;
 }
 
-/**
- * eeh_pe_get_parent - Retrieve the parent PE
- * @edev: EEH device
- *
- * The whole PEs existing in the system are organized as hierarchy
- * tree. The function is used to retrieve the parent PE according
- * to the parent EEH device.
- */
-static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
-{
-	struct eeh_dev *parent;
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
-
-	/*
-	 * It might have the case for the indirect parent
-	 * EEH device already having associated PE, but
-	 * the direct parent EEH device doesn't have yet.
-	 */
-	if (edev->physfn)
-		pdn = pci_get_pdn(edev->physfn);
-	else
-		pdn = pdn ? pdn->parent : NULL;
-	while (pdn) {
-		/* We're poking out of PCI territory */
-		parent = pdn_to_eeh_dev(pdn);
-		if (!parent)
-			return NULL;
-
-		if (parent->pe)
-			return parent->pe;
-
-		pdn = pdn->parent;
-	}
-
-	return NULL;
-}
-
 /**
  * eeh_add_to_parent_pe - Add EEH device to parent PE
+ * @parent: PE to create additional PEs under
  * @edev: EEH device
  *
- * Add EEH device to the parent PE. If the parent PE already
- * exists, the PE type will be changed to EEH_PE_BUS. Otherwise,
- * we have to create new PE to hold the EEH device and the new
- * PE will be linked to its parent PE as well.
+ * Add EEH device to the PE in edev->pe_config_addr. If the PE
+ * already exists then we'll add it to that. Otherwise a new
+ * PE is created, and inserted into the PE tree below @parent.
+ * If @parent is NULL, then it will be inserted under the PHB
+ * PE for edev->controller.
+ *
+ * In either case @edev is added to the PE's device list.
  */
-int eeh_add_to_parent_pe(struct eeh_dev *edev)
+int eeh_add_to_parent_pe(struct eeh_pe *parent, struct eeh_dev *edev)
 {
 	int config_addr = edev->bdfn;
-	struct eeh_pe *pe, *parent;
+	struct eeh_pe *pe;
 
 	/* Check if the PE number is valid */
 	if (!eeh_has_flag(EEH_VALID_PE_ZERO) && !edev->pe_config_addr) {
@@ -431,7 +398,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 	 * to PHB directly. Otherwise, we have to associate the
 	 * PE with its parent.
 	 */
-	parent = eeh_pe_get_parent(edev);
 	if (!parent) {
 		parent = eeh_phb_pe_get(edev->controller);
 		if (!parent) {
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 7aba18e08996..49a932ff092a 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -358,6 +358,25 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
 	return pnv_pci_cfg_write(pdn, where, size, val);
 }
 
+static struct eeh_pe *pnv_eeh_pe_get_parent(struct pci_dev *pdev)
+{
+	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
+	struct pci_dev *parent = pdev->bus->self;
+
+#ifdef CONFIG_PCI_IOV
+	if (pdev->is_virtfn)
+		parent = pdev->physfn;
+#endif
+
+	if (parent) {
+		struct pnv_ioda_pe *ioda_pe = pnv_ioda_get_pe(parent);
+
+		return eeh_pe_get(phb->hose, ioda_pe->pe_number, 0);
+	}
+
+	return NULL;
+}
+
 /**
  * pnv_eeh_probe - Do probe on PCI device
  * @pdev: pci_dev to probe
@@ -368,6 +387,7 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 {
 	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct pci_controller *hose = phb->hose;
+	struct eeh_pe *parent_pe;
 	struct eeh_dev *edev;
 	uint32_t pcie_flags;
 	int ret;
@@ -450,8 +470,11 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 
 	edev->pe_config_addr = phb->ioda.pe_rmap[config_addr];
 
+	/* find the PE that contains this PE, might be NULL */
+	parent_pe = pnv_eeh_pe_get_parent(pdev);
+
 	/* Create PE */
-	ret = eeh_add_to_parent_pe(edev);
+	ret = eeh_add_to_parent_pe(parent_pe, edev);
 	if (ret) {
 		eeh_edev_warn(edev, "Failed to add device to PE (code %d)\n", ret);
 		return NULL;
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 13a8c274554a..b4a92c24fd45 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -70,11 +70,12 @@ void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
 	eeh_add_device_early(pdn);
 #ifdef CONFIG_PCI_IOV
 	if (pdev->is_virtfn) {
+		struct eeh_pe *physfn_pe = pci_dev_to_eeh_dev(pdev->physfn)->pe;
 		struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
 
 		edev->pe_config_addr =  (pdn->busno << 16) | (pdn->devfn << 8);
 		eeh_rmv_from_parent_pe(edev); /* Remove as it is adding to bus pe */
-		eeh_add_to_parent_pe(edev);   /* Add as VF PE type */
+		eeh_add_to_parent_pe(physfn_pe, edev); /* Add as VF PE type */
 	}
 #endif
 	eeh_add_device_late(pdev);
@@ -220,6 +221,43 @@ static int pseries_eeh_find_ecap(struct pci_dn *pdn, int cap)
 	return 0;
 }
 
+/**
+ * pseries_eeh_pe_get_parent - Retrieve the parent PE
+ * @edev: EEH device
+ *
+ * The whole PEs existing in the system are organized as hierarchy
+ * tree. The function is used to retrieve the parent PE according
+ * to the parent EEH device.
+ */
+static struct eeh_pe *pseries_eeh_pe_get_parent(struct eeh_dev *edev)
+{
+	struct eeh_dev *parent;
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
+	/*
+	 * It might have the case for the indirect parent
+	 * EEH device already having associated PE, but
+	 * the direct parent EEH device doesn't have yet.
+	 */
+	if (edev->physfn)
+		pdn = pci_get_pdn(edev->physfn);
+	else
+		pdn = pdn ? pdn->parent : NULL;
+	while (pdn) {
+		/* We're poking out of PCI territory */
+		parent = pdn_to_eeh_dev(pdn);
+		if (!parent)
+			return NULL;
+
+		if (parent->pe)
+			return parent->pe;
+
+		pdn = pdn->parent;
+	}
+
+	return NULL;
+}
+
 /**
  * pseries_eeh_probe - EEH probe on the given device
  * @pdn: PCI device node
@@ -286,10 +324,14 @@ static void pseries_eeh_probe_pdn(struct pci_dn *pdn)
 	if (ret) {
 		eeh_edev_dbg(edev, "EEH failed to enable on device (code %d)\n", ret);
 	} else {
+		struct eeh_pe *parent;
+
 		/* Retrieve PE address */
 		edev->pe_config_addr = eeh_ops->get_pe_addr(&pe);
 		pe.addr = edev->pe_config_addr;
 
+		parent = pseries_eeh_pe_get_parent(edev);
+
 		/* Some older systems (Power4) allow the ibm,set-eeh-option
 		 * call to succeed even on nodes where EEH is not supported.
 		 * Verify support explicitly.
@@ -298,16 +340,15 @@ static void pseries_eeh_probe_pdn(struct pci_dn *pdn)
 		if (ret > 0 && ret != EEH_STATE_NOT_SUPPORT)
 			enable = 1;
 
-		if (enable) {
+		/* This device doesn't support EEH, but it may have an
+		 * EEH parent, in which case we mark it as supported.
+		 */
+		if (parent && !enable)
+			edev->pe_config_addr = parent->addr;
+
+		if (enable || parent) {
 			eeh_add_flag(EEH_ENABLED);
-			eeh_add_to_parent_pe(edev);
-		} else if (pdn->parent && pdn_to_eeh_dev(pdn->parent) &&
-			   (pdn_to_eeh_dev(pdn->parent))->pe) {
-			/* This device doesn't support EEH, but it may have an
-			 * EEH parent, in which case we mark it as supported.
-			 */
-			edev->pe_config_addr = pdn_to_eeh_dev(pdn->parent)->pe_config_addr;
-			eeh_add_to_parent_pe(edev);
+			eeh_add_to_parent_pe(parent, edev);
 		}
 		eeh_edev_dbg(edev, "EEH is %s on device (code %d)\n",
 			     (enable ? "enabled" : "unsupported"), ret);
-- 
2.21.0



* [Very RFC 24/46] powernv/pci: Make the pre-cfg EEH freeze check use eeh_dev rather than pci_dn
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (22 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 23/46] powerpc/eeh: Moving finding the parent PE into the platform Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  0:21   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 25/46] powernv/pci: Remove pdn from pnv_pci_config_check_eeh() Oliver O'Halloran
                   ` (21 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Squash another usage in preparation for making the config accessors pci_dn-free.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
We might want to move this into eeh-powernv.c
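
For reviewers, the decision the new check makes can be modelled in
isolation as below. This is only a sketch: the MODEL_* flag values and the
model_* types are made up for illustration and are not the real
EEH_PE_CFG_BLOCKED / EEH_DEV_REMOVED definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; the real ones live in asm/eeh.h. */
#define MODEL_PE_CFG_BLOCKED	(1u << 0)
#define MODEL_DEV_REMOVED	(1u << 1)

struct model_pe {
	unsigned int state;
};

struct model_edev {
	unsigned int mode;
	struct model_pe *pe;
};

/* Returns true when a config access may go ahead. */
static bool model_pre_cfg_check(const struct model_edev *edev)
{
	/* No EEH device, or no PE yet: nothing can be blocking us. */
	if (!edev || !edev->pe)
		return true;

	/* PE in reset? */
	if (edev->pe->state & MODEL_PE_CFG_BLOCKED)
		return false;

	/* Device removed? */
	if (edev->mode & MODEL_DEV_REMOVED)
		return false;

	return true;
}
```

One thing the model makes visible: with the early return, an edev that has
no PE is let through before the removed flag is looked at, whereas the old
code tested EEH_DEV_REMOVED whenever an edev existed.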
---
 arch/powerpc/platforms/powernv/pci.c | 37 +++++++++++++---------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index d36dde9777aa..6170677bfdc7 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -708,30 +708,23 @@ int pnv_pci_cfg_write(struct pci_dn *pdn,
 }
 
 #if CONFIG_EEH
-static bool pnv_pci_cfg_check(struct pci_dn *pdn)
+bool pnv_eeh_pre_cfg_check(struct eeh_dev *edev)
 {
-	struct eeh_dev *edev = NULL;
-	struct pnv_phb *phb = pdn->phb->private_data;
-
-	/* EEH not enabled ? */
-	if (!(phb->flags & PNV_PHB_FLAG_EEH))
+	if (!edev || !edev->pe)
 		return true;
 
-	/* PE reset or device removed ? */
-	edev = pdn->edev;
-	if (edev) {
-		if (edev->pe &&
-		    (edev->pe->state & EEH_PE_CFG_BLOCKED))
-			return false;
+	/* PE in reset? */
+	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
+		return false;
 
-		if (edev->mode & EEH_DEV_REMOVED)
-			return false;
-	}
+	/* Device removed? */
+	if (edev->mode & EEH_DEV_REMOVED)
+		return false;
 
 	return true;
 }
 #else
-static inline pnv_pci_cfg_check(struct pci_dn *pdn)
+static inline bool pnv_eeh_pre_cfg_check(struct eeh_dev *edev)
 {
 	return true;
 }
@@ -743,6 +736,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 {
 	struct pci_dn *pdn;
 	struct pnv_phb *phb;
+	struct eeh_dev *edev;
 	int ret;
 
 	*val = 0xFFFFFFFF;
@@ -750,14 +744,15 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 	if (!pdn)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
-	if (!pnv_pci_cfg_check(pdn))
+	edev = pdn_to_eeh_dev(pdn);
+	if (!pnv_eeh_pre_cfg_check(edev))
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
 	ret = pnv_pci_cfg_read(pdn, where, size, val);
 	phb = pdn->phb->private_data;
-	if (phb->flags & PNV_PHB_FLAG_EEH && pdn->edev) {
+	if (phb->flags & PNV_PHB_FLAG_EEH && edev) {
 		if (*val == EEH_IO_ERROR_VALUE(size) &&
-		    eeh_dev_check_failure(pdn->edev))
+		    eeh_dev_check_failure(edev))
                         return PCIBIOS_DEVICE_NOT_FOUND;
 	} else {
 		pnv_pci_config_check_eeh(pdn);
@@ -772,13 +767,15 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 {
 	struct pci_dn *pdn;
 	struct pnv_phb *phb;
+	struct eeh_dev *edev;
 	int ret;
 
 	pdn = pci_get_pdn_by_devfn(bus, devfn);
 	if (!pdn)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
-	if (!pnv_pci_cfg_check(pdn))
+	edev = pdn_to_eeh_dev(pdn);
+	if (!pnv_eeh_pre_cfg_check(edev))
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
 	ret = pnv_pci_cfg_write(pdn, where, size, val);
-- 
2.21.0



* [Very RFC 25/46] powernv/pci: Remove pdn from pnv_pci_config_check_eeh()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (23 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 24/46] powernv/pci: Make the pre-cfg EEH freeze check use eeh_dev rather than pci_dn Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  1:05   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 26/46] powernv/pci: Remove pdn from pnv_pci_cfg_{read|write} Oliver O'Halloran
                   ` (20 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Despite the name this function is generic PowerNV PCI code rather than anything
EEH specific. Convert to take a phb and bdfn rather than a pci_dn.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
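The bdfn passed around here is the usual 16-bit bus/device/function
encoding. A standalone sketch of how it is packed and unpacked, assuming
the standard PCI layout (8-bit bus, 5-bit device, 3-bit function); the
model_* helpers are hypothetical names and not part of this patch:

```c
#include <stdint.h>

/* Same packing as the kernel's PCI_DEVFN(): 5-bit slot, 3-bit function. */
static uint8_t model_devfn(uint8_t slot, uint8_t func)
{
	return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* Mirrors "bus->number << 8 | devfn" from the config accessors. */
static uint16_t model_bdfn(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

static uint8_t model_bdfn_bus(uint16_t bdfn)
{
	return (uint8_t)(bdfn >> 8);
}

static uint8_t model_bdfn_slot(uint16_t bdfn)
{
	return (uint8_t)((bdfn >> 3) & 0x1f);
}

static uint8_t model_bdfn_func(uint16_t bdfn)
{
	return (uint8_t)(bdfn & 0x07);
}
```

For example, bus 0x01, device 2, function 3 packs to bdfn 0x0113, which is
the value the pr_devel() in this function would print for that device.
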
 arch/powerpc/platforms/powernv/pci.c | 32 ++++++++++++++++++----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 6170677bfdc7..50142ff045ac 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -591,9 +591,15 @@ static void pnv_pci_handle_eeh_config(struct pnv_phb *phb, u32 pe_no)
 	spin_unlock_irqrestore(&phb->lock, flags);
 }
 
-static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
+/*
+ * This, very strangely named, function checks if a config access
+ * caused an EEH and un-freezes the PE if it did. This is mainly
+ * for the !CONFIG_EEH case where nothing is going to un-freeze
+ * it for us.
+ */
+static void pnv_pci_config_check_eeh(struct pnv_phb *phb, u16 bdfn)
 {
-	struct pnv_phb *phb = pdn->phb->private_data;
+	struct pnv_ioda_pe *ioda_pe;
 	u8	fstate = 0;
 	__be16	pcierr = 0;
 	unsigned int pe_no;
@@ -604,10 +610,11 @@ static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
 	 * setup that yet. So all ER errors should be mapped to
 	 * reserved PE.
 	 */
-	pe_no = pdn->pe_number;
-	if (pe_no == IODA_INVALID_PE) {
+	ioda_pe = __pnv_ioda_get_pe(phb, bdfn);
+	if (ioda_pe)
+		pe_no = ioda_pe->pe_number;
+	else
 		pe_no = phb->ioda.reserved_pe_idx;
-	}
 
 	/*
 	 * Fetch frozen state. If the PHB support compound PE,
@@ -629,7 +636,7 @@ static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
 	}
 
 	pr_devel(" -> EEH check, bdfn=%04x PE#%x fstate=%x\n",
-		 (pdn->busno << 8) | (pdn->devfn), pe_no, fstate);
+		 bdfn, pe_no, fstate);
 
 	/* Clear the frozen state if applicable */
 	if (fstate == OPAL_EEH_STOPPED_MMIO_FREEZE ||
@@ -642,6 +649,7 @@ static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
 		if (phb->freeze_pe)
 			phb->freeze_pe(phb, pe_no);
 
+		/* fish out the EEH log and send an EEH event. */
 		pnv_pci_handle_eeh_config(phb, pe_no);
 	}
 }
@@ -735,7 +743,8 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 			       int where, int size, u32 *val)
 {
 	struct pci_dn *pdn;
-	struct pnv_phb *phb;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
+	u16 bdfn = bus->number << 8 | devfn;
 	struct eeh_dev *edev;
 	int ret;
 
@@ -755,7 +764,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 		    eeh_dev_check_failure(edev))
                         return PCIBIOS_DEVICE_NOT_FOUND;
 	} else {
-		pnv_pci_config_check_eeh(pdn);
+		pnv_pci_config_check_eeh(phb, bdfn);
 	}
 
 	return ret;
@@ -766,7 +775,8 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 				int where, int size, u32 val)
 {
 	struct pci_dn *pdn;
-	struct pnv_phb *phb;
+	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
+	u16 bdfn = bus->number << 8 | devfn;
 	struct eeh_dev *edev;
 	int ret;
 
@@ -779,9 +789,9 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
 	ret = pnv_pci_cfg_write(pdn, where, size, val);
-	phb = pdn->phb->private_data;
+
 	if (!(phb->flags & PNV_PHB_FLAG_EEH))
-		pnv_pci_config_check_eeh(pdn);
+		pnv_pci_config_check_eeh(phb, bdfn);
 
 	return ret;
 }
-- 
2.21.0



* [Very RFC 26/46] powernv/pci: Remove pdn from pnv_pci_cfg_{read|write}
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (24 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 25/46] powernv/pci: Remove pdn from pnv_pci_config_check_eeh() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  2:16   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 27/46] powernv/pci: Clear reserved PE freezes Oliver O'Halloran
                   ` (19 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Remove the use of pci_dn from the low-level config space access functions.
These are used by the eeh's config ops and the bus config ops that we
provide to the PCI core.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 14 +++--------
 arch/powerpc/platforms/powernv/pci.c         | 26 ++++++++------------
 arch/powerpc/platforms/powernv/pci.h         |  6 ++---
 3 files changed, 16 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 49a932ff092a..8a73bc7517c5 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -331,31 +331,25 @@ static inline bool pnv_eeh_cfg_blocked(struct eeh_dev *edev)
 static int pnv_eeh_read_config(struct eeh_dev *edev,
 			       int where, int size, u32 *val)
 {
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
-
-	if (!pdn)
-		return PCIBIOS_DEVICE_NOT_FOUND;
+	struct pnv_phb *phb = edev->controller->private_data;
 
 	if (pnv_eeh_cfg_blocked(edev)) {
 		*val = 0xFFFFFFFF;
 		return PCIBIOS_SET_FAILED;
 	}
 
-	return pnv_pci_cfg_read(pdn, where, size, val);
+	return pnv_pci_cfg_read(phb, edev->bdfn, where, size, val);
 }
 
 static int pnv_eeh_write_config(struct eeh_dev *edev,
 				int where, int size, u32 val)
 {
-	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
-
-	if (!pdn)
-		return PCIBIOS_DEVICE_NOT_FOUND;
+	struct pnv_phb *phb = edev->controller->private_data;
 
 	if (pnv_eeh_cfg_blocked(edev))
 		return PCIBIOS_SET_FAILED;
 
-	return pnv_pci_cfg_write(pdn, where, size, val);
+	return pnv_pci_cfg_write(phb, edev->bdfn, where, size, val);
 }
 
 static struct eeh_pe *pnv_eeh_pe_get_parent(struct pci_dev *pdev)
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 50142ff045ac..36eea4bb514c 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -654,11 +654,9 @@ static void pnv_pci_config_check_eeh(struct pnv_phb *phb, u16 bdfn)
 	}
 }
 
-int pnv_pci_cfg_read(struct pci_dn *pdn,
+int pnv_pci_cfg_read(struct pnv_phb *phb, u16 bdfn,
 		     int where, int size, u32 *val)
 {
-	struct pnv_phb *phb = pdn->phb->private_data;
-	u32 bdfn = (pdn->busno << 8) | pdn->devfn;
 	s64 rc;
 
 	switch (size) {
@@ -685,19 +683,16 @@ int pnv_pci_cfg_read(struct pci_dn *pdn,
 		return PCIBIOS_FUNC_NOT_SUPPORTED;
 	}
 
-	pr_devel("%s: bus: %x devfn: %x +%x/%x -> %08x\n",
-		 __func__, pdn->busno, pdn->devfn, where, size, *val);
+	pr_devel("%s: bdfn: %x  +%x/%x -> %08x\n",
+		 __func__, bdfn, where, size, *val);
 	return PCIBIOS_SUCCESSFUL;
 }
 
-int pnv_pci_cfg_write(struct pci_dn *pdn,
+int pnv_pci_cfg_write(struct pnv_phb *phb, u16 bdfn,
 		      int where, int size, u32 val)
 {
-	struct pnv_phb *phb = pdn->phb->private_data;
-	u32 bdfn = (pdn->busno << 8) | pdn->devfn;
-
-	pr_devel("%s: bus: %x devfn: %x +%x/%x -> %08x\n",
-		 __func__, pdn->busno, pdn->devfn, where, size, val);
+	pr_devel("%s: bdfn: %x +%x/%x -> %08x\n",
+		 __func__, bdfn, where, size, val);
 	switch (size) {
 	case 1:
 		opal_pci_config_write_byte(phb->opal_id, bdfn, where, val);
@@ -753,12 +748,11 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 	if (!pdn)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
-	edev = pdn_to_eeh_dev(pdn);
+	edev = pnv_eeh_find_edev(phb, bdfn);
 	if (!pnv_eeh_pre_cfg_check(edev))
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
-	ret = pnv_pci_cfg_read(pdn, where, size, val);
-	phb = pdn->phb->private_data;
+	ret = pnv_pci_cfg_read(phb, bdfn, where, size, val);
 	if (phb->flags & PNV_PHB_FLAG_EEH && edev) {
 		if (*val == EEH_IO_ERROR_VALUE(size) &&
 		    eeh_dev_check_failure(edev))
@@ -784,11 +778,11 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 	if (!pdn)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
-	edev = pdn_to_eeh_dev(pdn);
+	edev = pnv_eeh_find_edev(phb, bdfn);
 	if (!pnv_eeh_pre_cfg_check(edev))
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
-	ret = pnv_pci_cfg_write(pdn, where, size, val);
+	ret = pnv_pci_cfg_write(phb, bdfn, where, size, val);
 
 	if (!(phb->flags & PNV_PHB_FLAG_EEH))
 		pnv_pci_config_check_eeh(phb, bdfn);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index be435a810d19..52dc4d05eaca 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -7,8 +7,6 @@
 #include <asm/iommu.h>
 #include <asm/msi_bitmap.h>
 
-struct pci_dn;
-
 enum pnv_phb_type {
 	PNV_PHB_IODA1		= 0,
 	PNV_PHB_IODA2		= 1,
@@ -174,9 +172,9 @@ extern struct pci_ops pnv_pci_ops;
 
 void pnv_pci_dump_phb_diag_data(struct pci_controller *hose,
 				unsigned char *log_buff);
-int pnv_pci_cfg_read(struct pci_dn *pdn,
+int pnv_pci_cfg_read(struct pnv_phb *phb, u16 bdfn,
 		     int where, int size, u32 *val);
-int pnv_pci_cfg_write(struct pci_dn *pdn,
+int pnv_pci_cfg_write(struct pnv_phb *phb, u16 bdfn,
 		      int where, int size, u32 val);
 extern struct iommu_table *pnv_pci_table_alloc(int nid);
 
-- 
2.21.0



* [Very RFC 27/46] powernv/pci: Clear reserved PE freezes
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (25 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 26/46] powernv/pci: Remove pdn from pnv_pci_cfg_{read|write} Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  3:00   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 28/46] powernv/iov: Move SR-IOV PF state out of pci_dn Oliver O'Halloran
                   ` (18 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

When we scan an empty slot the PHB gets an Unsupported Request from the
downstream bridge when there's no device present at that BDFN.  Some older
PHBs (p7-IOC) don't allow further config space accesses while the PE is
frozen, so clear it here without bothering with the diagnostic log.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
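The resulting control flow in pnv_pci_config_check_eeh() can be summarised
by the model below. It is a sketch under assumptions: MODEL_NO_PE, the
model_* names and the action enum are invented for illustration, and the
OPAL calls are reduced to a returned action code.

```c
#include <stdbool.h>

#define MODEL_NO_PE	(-1)	/* models a failed PE lookup for the bdfn */

enum model_action {
	MODEL_NOTHING,		/* PE is not frozen, nothing to do */
	MODEL_CLEAR_ONLY,	/* reserved PE: unfreeze, skip the log */
	MODEL_FREEZE_AND_LOG,	/* real PE: freeze compound PE, fetch log */
};

/* Accesses to a bdfn with no PE assigned are accounted to the reserved PE. */
static unsigned int model_pick_pe(int looked_up_pe, unsigned int reserved_pe)
{
	return looked_up_pe == MODEL_NO_PE ? reserved_pe
					   : (unsigned int)looked_up_pe;
}

static enum model_action model_check_eeh(bool frozen, unsigned int pe_no,
					 unsigned int reserved_pe)
{
	if (!frozen)
		return MODEL_NOTHING;

	/* Empty-slot scan: just clear the freeze and get out. */
	if (pe_no == reserved_pe)
		return MODEL_CLEAR_ONLY;

	return MODEL_FREEZE_AND_LOG;
}
```
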
 arch/powerpc/platforms/powernv/pci.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 36eea4bb514c..5b1f4677cdce 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -642,6 +642,19 @@ static void pnv_pci_config_check_eeh(struct pnv_phb *phb, u16 bdfn)
 	if (fstate == OPAL_EEH_STOPPED_MMIO_FREEZE ||
 	    fstate == OPAL_EEH_STOPPED_DMA_FREEZE  ||
 	    fstate == OPAL_EEH_STOPPED_MMIO_DMA_FREEZE) {
+
+		/*
+		 * Scanning an empty slot will result in a freeze on the reserved PE.
+		 *
+		 * Some old and bad PHBs block config space access to frozen PEs in
+		 * addition to MMIOs, so unfreeze it here.
+		 */
+		if (pe_no == phb->ioda.reserved_pe_idx) {
+			phb->unfreeze_pe(phb, phb->ioda.reserved_pe_idx,
+					 OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+			return;
+		}
+
 		/*
 		 * If PHB supports compound PE, freeze it for
 		 * consistency.
-- 
2.21.0



* [Very RFC 28/46] powernv/iov: Move SR-IOV PF state out of pci_dn
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (26 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 27/46] powernv/pci: Clear reserved PE freezes Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  4:09   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 29/46] powernv/pci: Remove open-coded PE lookup in PELT-V setup Oliver O'Halloran
                   ` (17 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Move the SR-IOV state into a platform specific structure. I'm sure stashing all the
SR-IOV state in pci_dn seemed like a good idea at the time, but it results in a
lot of powernv specifics being leaked out of the platform directory.

Moving all the PHB3/4 specific M64 BAR wrangling into a PowerNV specific
structure helps to clarify the role of pci_dn and ensures that the platform
specifics stay that way.

This will make the code easier to understand and modify since we don't need
to worry so much about PowerNV changes breaking pseries and EEH, and vice versa.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
TODO: Remove all the sriov stuff from pci_dn. We can't do that yet because
the pseries SRIOV support was a giant hack that re-used some of the
previously powernv specific fields.
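
The pattern used here is an opaque per-device pointer that only the
platform code knows how to interpret. A standalone sketch of that idea
follows. Everything in it is hypothetical: the model_* types stand in for
pci_dev, dev_archdata and pnv_iov_data, and model_iov_get() only
illustrates what an accessor like pnv_iov_get() could look like; it is not
the implementation in this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Platform-private SR-IOV state, invisible to generic code. */
struct model_iov {
	uint16_t num_vfs;
	int m64_single_mode;
};

/* Generic code only ever sees the opaque pointer. */
struct model_archdata {
	void *iov_data;
};

struct model_pci_dev {
	int is_physfn;
	struct model_archdata archdata;
};

/* Accessor: only physical functions carry IOV state. */
static struct model_iov *model_iov_get(struct model_pci_dev *pdev)
{
	if (!pdev || !pdev->is_physfn)
		return NULL;

	return pdev->archdata.iov_data;
}

/* Allocate and attach the state; returns 0 on success, -1 on failure. */
static int model_iov_attach(struct model_pci_dev *pdev, uint16_t num_vfs)
{
	struct model_iov *iov;

	if (!pdev || !pdev->is_physfn)
		return -1;

	iov = calloc(1, sizeof(*iov));
	if (!iov)
		return -1;

	iov->num_vfs = num_vfs;
	pdev->archdata.iov_data = iov;
	return 0;
}

/* Free the state and clear the pointer so later lookups return NULL. */
static void model_iov_detach(struct model_pci_dev *pdev)
{
	if (!pdev)
		return;

	free(pdev->archdata.iov_data);
	pdev->archdata.iov_data = NULL;
}
```

Keeping the pointer void * in the shared header is what lets the structure
definition stay inside the platform directory.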
---
 arch/powerpc/include/asm/device.h         |   3 +
 arch/powerpc/platforms/powernv/pci-ioda.c | 199 ++++++++++++----------
 arch/powerpc/platforms/powernv/pci.h      |  36 ++++
 3 files changed, 148 insertions(+), 90 deletions(-)

diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
index 266542769e4b..4d8934db7ef5 100644
--- a/arch/powerpc/include/asm/device.h
+++ b/arch/powerpc/include/asm/device.h
@@ -49,6 +49,9 @@ struct dev_archdata {
 #ifdef CONFIG_CXL_BASE
 	struct cxl_context	*cxl_ctx;
 #endif
+#ifdef CONFIG_PCI_IOV
+	void *iov_data;
+#endif
 };
 
 struct pdev_archdata {
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index a1c9315f3208..1c90feed233d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -966,14 +966,15 @@ static int pnv_ioda_configure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 #ifdef CONFIG_PCI_IOV
 static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 {
-	struct pci_dn *pdn = pci_get_pdn(dev);
-	int i;
 	struct resource *res, res2;
+	struct pnv_iov_data *iov;
 	resource_size_t size;
 	u16 num_vfs;
+	int i;
 
 	if (!dev->is_physfn)
 		return -EINVAL;
+	iov = pnv_iov_get(dev);
 
 	/*
 	 * "offset" is in VFs.  The M64 windows are sized so that when they
@@ -983,7 +984,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 	 * separate PE, and changing the IOV BAR start address changes the
 	 * range of PEs the VFs are in.
 	 */
-	num_vfs = pdn->num_vfs;
+	num_vfs = iov->num_vfs;
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
 		res = &dev->resource[i + PCI_IOV_RESOURCES];
 		if (!res->flags || !res->parent)
@@ -1029,19 +1030,19 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 			 num_vfs, offset);
 
 		if (offset < 0) {
-			devm_release_resource(&dev->dev, &pdn->holes[i]);
-			memset(&pdn->holes[i], 0, sizeof(pdn->holes[i]));
+			devm_release_resource(&dev->dev, &iov->holes[i]);
+			memset(&iov->holes[i], 0, sizeof(iov->holes[i]));
 		}
 
 		pci_update_resource(dev, i + PCI_IOV_RESOURCES);
 
 		if (offset > 0) {
-			pdn->holes[i].start = res2.start;
-			pdn->holes[i].end = res2.start + size * offset - 1;
-			pdn->holes[i].flags = IORESOURCE_BUS;
-			pdn->holes[i].name = "pnv_iov_reserved";
+			iov->holes[i].start = res2.start;
+			iov->holes[i].end = res2.start + size * offset - 1;
+			iov->holes[i].flags = IORESOURCE_BUS;
+			iov->holes[i].name = "pnv_iov_reserved";
 			devm_request_resource(&dev->dev, res->parent,
-					&pdn->holes[i]);
+					&iov->holes[i]);
 		}
 	}
 	return 0;
@@ -1273,37 +1274,37 @@ static void pnv_pci_ioda_setup_PEs(void)
 #ifdef CONFIG_PCI_IOV
 static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
 {
+	struct pnv_iov_data   *iov;
 	struct pnv_phb        *phb;
-	struct pci_dn         *pdn;
 	int                    i, j;
 	int                    m64_bars;
 
 	phb = pci_bus_to_pnvhb(pdev->bus);
-	pdn = pci_get_pdn(pdev);
+	iov = pnv_iov_get(pdev);
 
-	if (pdn->m64_single_mode)
+	if (iov->m64_single_mode)
 		m64_bars = num_vfs;
 	else
 		m64_bars = 1;
 
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
 		for (j = 0; j < m64_bars; j++) {
-			if (pdn->m64_map[j][i] == IODA_INVALID_M64)
+			if (iov->m64_map[j][i] == IODA_INVALID_M64)
 				continue;
 			opal_pci_phb_mmio_enable(phb->opal_id,
-				OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 0);
-			clear_bit(pdn->m64_map[j][i], &phb->ioda.m64_bar_alloc);
-			pdn->m64_map[j][i] = IODA_INVALID_M64;
+				OPAL_M64_WINDOW_TYPE, iov->m64_map[j][i], 0);
+			clear_bit(iov->m64_map[j][i], &phb->ioda.m64_bar_alloc);
+			iov->m64_map[j][i] = IODA_INVALID_M64;
 		}
 
-	kfree(pdn->m64_map);
+	kfree(iov->m64_map);
 	return 0;
 }
 
 static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 {
+	struct pnv_iov_data   *iov;
 	struct pnv_phb        *phb;
-	struct pci_dn         *pdn;
 	unsigned int           win;
 	struct resource       *res;
 	int                    i, j;
@@ -1314,23 +1315,23 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 	int                    m64_bars;
 
 	phb = pci_bus_to_pnvhb(pdev->bus);
-	pdn = pci_get_pdn(pdev);
+	iov = pnv_iov_get(pdev);
 	total_vfs = pci_sriov_get_totalvfs(pdev);
 
-	if (pdn->m64_single_mode)
+	if (iov->m64_single_mode)
 		m64_bars = num_vfs;
 	else
 		m64_bars = 1;
 
-	pdn->m64_map = kmalloc_array(m64_bars,
-				     sizeof(*pdn->m64_map),
+	iov->m64_map = kmalloc_array(m64_bars,
+				     sizeof(*iov->m64_map),
 				     GFP_KERNEL);
-	if (!pdn->m64_map)
+	if (!iov->m64_map)
 		return -ENOMEM;
 	/* Initialize the m64_map to IODA_INVALID_M64 */
 	for (i = 0; i < m64_bars ; i++)
 		for (j = 0; j < PCI_SRIOV_NUM_BARS; j++)
-			pdn->m64_map[i][j] = IODA_INVALID_M64;
+			iov->m64_map[i][j] = IODA_INVALID_M64;
 
 
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
@@ -1347,9 +1348,9 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 					goto m64_failed;
 			} while (test_and_set_bit(win, &phb->ioda.m64_bar_alloc));
 
-			pdn->m64_map[j][i] = win;
+			iov->m64_map[j][i] = win;
 
-			if (pdn->m64_single_mode) {
+			if (iov->m64_single_mode) {
 				size = pci_iov_resource_size(pdev,
 							PCI_IOV_RESOURCES + i);
 				start = res->start + size * j;
@@ -1359,16 +1360,16 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 			}
 
 			/* Map the M64 here */
-			if (pdn->m64_single_mode) {
-				pe_num = pdn->pe_num_map[j];
+			if (iov->m64_single_mode) {
+				pe_num = iov->pe_num_map[j];
 				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
 						pe_num, OPAL_M64_WINDOW_TYPE,
-						pdn->m64_map[j][i], 0);
+						iov->m64_map[j][i], 0);
 			}
 
 			rc = opal_pci_set_phb_mem_window(phb->opal_id,
 						 OPAL_M64_WINDOW_TYPE,
-						 pdn->m64_map[j][i],
+						 iov->m64_map[j][i],
 						 start,
 						 0, /* unused */
 						 size);
@@ -1380,12 +1381,12 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 				goto m64_failed;
 			}
 
-			if (pdn->m64_single_mode)
+			if (iov->m64_single_mode)
 				rc = opal_pci_phb_mmio_enable(phb->opal_id,
-				     OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 2);
+				     OPAL_M64_WINDOW_TYPE, iov->m64_map[j][i], 2);
 			else
 				rc = opal_pci_phb_mmio_enable(phb->opal_id,
-				     OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 1);
+				     OPAL_M64_WINDOW_TYPE, iov->m64_map[j][i], 1);
 
 			if (rc != OPAL_SUCCESS) {
 				dev_err(&pdev->dev, "Failed to enable M64 window #%d: %llx\n",
@@ -1426,10 +1427,8 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
 {
 	struct pnv_phb        *phb;
 	struct pnv_ioda_pe    *pe, *pe_n;
-	struct pci_dn         *pdn;
 
 	phb = pci_bus_to_pnvhb(pdev->bus);
-	pdn = pci_get_pdn(pdev);
 
 	if (!pdev->is_physfn)
 		return;
@@ -1455,36 +1454,36 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)
 {
 	struct pnv_phb        *phb;
 	struct pnv_ioda_pe    *pe;
-	struct pci_dn         *pdn;
+	struct pnv_iov_data   *iov;
 	u16                    num_vfs, i;
 
 	phb = pci_bus_to_pnvhb(pdev->bus);
-	pdn = pci_get_pdn(pdev);
-	num_vfs = pdn->num_vfs;
+	iov = pnv_iov_get(pdev);
+	num_vfs = iov->num_vfs;
 
 	/* Release VF PEs */
 	pnv_ioda_release_vf_PE(pdev);
 
 	if (phb->type == PNV_PHB_IODA2) {
-		if (!pdn->m64_single_mode)
-			pnv_pci_vf_resource_shift(pdev, -*pdn->pe_num_map);
+		if (!iov->m64_single_mode)
+			pnv_pci_vf_resource_shift(pdev, -*iov->pe_num_map);
 
 		/* Release M64 windows */
 		pnv_pci_vf_release_m64(pdev, num_vfs);
 
 		/* Release PE numbers */
-		if (pdn->m64_single_mode) {
+		if (iov->m64_single_mode) {
 			for (i = 0; i < num_vfs; i++) {
-				if (pdn->pe_num_map[i] == IODA_INVALID_PE)
+				if (iov->pe_num_map[i] == IODA_INVALID_PE)
 					continue;
 
-				pe = &phb->ioda.pe_array[pdn->pe_num_map[i]];
+				pe = &phb->ioda.pe_array[iov->pe_num_map[i]];
 				pnv_ioda_free_pe(pe);
 			}
 		} else
-			bitmap_clear(phb->ioda.pe_alloc, *pdn->pe_num_map, num_vfs);
+			bitmap_clear(phb->ioda.pe_alloc, *iov->pe_num_map, num_vfs);
 		/* Releasing pe_num_map */
-		kfree(pdn->pe_num_map);
+		kfree(iov->pe_num_map);
 	}
 }
 
@@ -1501,24 +1500,24 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 	struct pnv_ioda_pe    *pe;
 	int                    pe_num;
 	u16                    vf_index;
-	struct pci_dn         *pdn;
-
-	phb = pci_bus_to_pnvhb(pdev->bus);
-	pdn = pci_get_pdn(pdev);
+	struct pnv_iov_data   *iov;
 
 	if (!pdev->is_physfn)
 		return;
 
+	phb = pci_bus_to_pnvhb(pdev->bus);
+	iov = pnv_iov_get(pdev);
+
 	/* Reserve PE for each VF */
 	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
 		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
 		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
 		struct pci_dn *vf_pdn;
 
-		if (pdn->m64_single_mode)
-			pe_num = pdn->pe_num_map[vf_index];
+		if (iov->m64_single_mode)
+			pe_num = iov->pe_num_map[vf_index];
 		else
-			pe_num = *pdn->pe_num_map + vf_index;
+			pe_num = *iov->pe_num_map + vf_index;
 
 		pe = &phb->ioda.pe_array[pe_num];
 		pe->pe_number = pe_num;
@@ -1565,17 +1564,17 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 
 int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 {
+	struct pnv_iov_data   *iov;
 	struct pnv_phb        *phb;
 	struct pnv_ioda_pe    *pe;
-	struct pci_dn         *pdn;
 	int                    ret;
 	u16                    i;
 
 	phb = pci_bus_to_pnvhb(pdev->bus);
-	pdn = pci_get_pdn(pdev);
+	iov = pnv_iov_get(pdev);
 
 	if (phb->type == PNV_PHB_IODA2) {
-		if (!pdn->vfs_expanded) {
+		if (!iov->vfs_expanded) {
 			dev_info(&pdev->dev, "don't support this SRIOV device"
 				" with non 64bit-prefetchable IOV BAR\n");
 			return -ENOSPC;
@@ -1585,28 +1584,26 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 		 * When M64 BARs functions in Single PE mode, the number of VFs
 		 * could be enabled must be less than the number of M64 BARs.
 		 */
-		if (pdn->m64_single_mode && num_vfs > phb->ioda.m64_bar_idx) {
+		if (iov->m64_single_mode && num_vfs > phb->ioda.m64_bar_idx) {
 			dev_info(&pdev->dev, "Not enough M64 BAR for VFs\n");
 			return -EBUSY;
 		}
 
 		/* Allocating pe_num_map */
-		if (pdn->m64_single_mode)
-			pdn->pe_num_map = kmalloc_array(num_vfs,
-							sizeof(*pdn->pe_num_map),
-							GFP_KERNEL);
+		if (iov->m64_single_mode)
+			iov->pe_num_map = kmalloc_array(num_vfs, sizeof(*iov->pe_num_map), GFP_KERNEL);
 		else
-			pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map), GFP_KERNEL);
+			iov->pe_num_map = kmalloc(sizeof(*iov->pe_num_map), GFP_KERNEL);
 
-		if (!pdn->pe_num_map)
+		if (!iov->pe_num_map)
 			return -ENOMEM;
 
-		if (pdn->m64_single_mode)
+		if (iov->m64_single_mode)
 			for (i = 0; i < num_vfs; i++)
-				pdn->pe_num_map[i] = IODA_INVALID_PE;
+				iov->pe_num_map[i] = IODA_INVALID_PE;
 
 		/* Calculate available PE for required VFs */
-		if (pdn->m64_single_mode) {
+		if (iov->m64_single_mode) {
 			for (i = 0; i < num_vfs; i++) {
 				pe = pnv_ioda_alloc_pe(phb);
 				if (!pe) {
@@ -1614,23 +1611,23 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 					goto m64_failed;
 				}
 
-				pdn->pe_num_map[i] = pe->pe_number;
+				iov->pe_num_map[i] = pe->pe_number;
 			}
 		} else {
 			mutex_lock(&phb->ioda.pe_alloc_mutex);
-			*pdn->pe_num_map = bitmap_find_next_zero_area(
+			*iov->pe_num_map = bitmap_find_next_zero_area(
 				phb->ioda.pe_alloc, phb->ioda.total_pe_num,
 				0, num_vfs, 0);
-			if (*pdn->pe_num_map >= phb->ioda.total_pe_num) {
+			if (*iov->pe_num_map >= phb->ioda.total_pe_num) {
 				mutex_unlock(&phb->ioda.pe_alloc_mutex);
 				dev_info(&pdev->dev, "Failed to enable VF%d\n", num_vfs);
-				kfree(pdn->pe_num_map);
+				kfree(iov->pe_num_map);
 				return -EBUSY;
 			}
-			bitmap_set(phb->ioda.pe_alloc, *pdn->pe_num_map, num_vfs);
+			bitmap_set(phb->ioda.pe_alloc, *iov->pe_num_map, num_vfs);
 			mutex_unlock(&phb->ioda.pe_alloc_mutex);
 		}
-		pdn->num_vfs = num_vfs;
+		iov->num_vfs = num_vfs;
 
 		/* Assign M64 window accordingly */
 		ret = pnv_pci_vf_assign_m64(pdev, num_vfs);
@@ -1644,8 +1641,8 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 		 * the IOV BAR according to the PE# allocated to the VFs.
 		 * Otherwise, the PE# for the VF will conflict with others.
 		 */
-		if (!pdn->m64_single_mode) {
-			ret = pnv_pci_vf_resource_shift(pdev, *pdn->pe_num_map);
+		if (!iov->m64_single_mode) {
+			ret = pnv_pci_vf_resource_shift(pdev, *iov->pe_num_map);
 			if (ret)
 				goto m64_failed;
 		}
@@ -1657,19 +1654,19 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 	return 0;
 
 m64_failed:
-	if (pdn->m64_single_mode) {
+	if (iov->m64_single_mode) {
 		for (i = 0; i < num_vfs; i++) {
-			if (pdn->pe_num_map[i] == IODA_INVALID_PE)
+			if (iov->pe_num_map[i] == IODA_INVALID_PE)
 				continue;
 
-			pe = &phb->ioda.pe_array[pdn->pe_num_map[i]];
+			pe = &phb->ioda.pe_array[iov->pe_num_map[i]];
 			pnv_ioda_free_pe(pe);
 		}
 	} else
-		bitmap_clear(phb->ioda.pe_alloc, *pdn->pe_num_map, num_vfs);
+		bitmap_clear(phb->ioda.pe_alloc, *iov->pe_num_map, num_vfs);
 
 	/* Releasing pe_num_map */
-	kfree(pdn->pe_num_map);
+	kfree(iov->pe_num_map);
 
 	return ret;
 }
@@ -2840,12 +2837,13 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	struct resource *res;
 	int i;
 	resource_size_t size, total_vf_bar_sz;
-	struct pci_dn *pdn;
+	struct pnv_iov_data *iov;
 	int mul, total_vfs;
 
-	pdn = pci_get_pdn(pdev);
-	pdn->vfs_expanded = 0;
-	pdn->m64_single_mode = false;
+	iov = kzalloc(sizeof(*iov), GFP_KERNEL);
+	if (!iov)
+		goto truncate_iov;
+	pdev->dev.archdata.iov_data = iov;
 
 	total_vfs = pci_sriov_get_totalvfs(pdev);
 	mul = phb->ioda.total_pe_num;
@@ -2882,7 +2880,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 			dev_info(&pdev->dev,
 				"VF BAR Total IOV size %llx > %llx, roundup to %d VFs\n",
 				total_vf_bar_sz, gate, mul);
-			pdn->m64_single_mode = true;
+			iov->m64_single_mode = true;
 			break;
 		}
 	}
@@ -2897,7 +2895,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		 * On PHB3, the minimum size alignment of M64 BAR in single
 		 * mode is 32MB.
 		 */
-		if (pdn->m64_single_mode && (size < SZ_32M))
+		if (iov->m64_single_mode && (size < SZ_32M))
 			goto truncate_iov;
 		dev_dbg(&pdev->dev, " Fixing VF BAR%d: %pR to\n", i, res);
 		res->end = res->start + size * mul - 1;
@@ -2905,7 +2903,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		dev_info(&pdev->dev, "VF BAR%d: %pR (expanded to %d VFs for PE alignment)",
 			 i, res, mul);
 	}
-	pdn->vfs_expanded = mul;
+	iov->vfs_expanded = mul;
 
 	return;
 
@@ -2916,6 +2914,9 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		res->flags = 0;
 		res->end = res->start - 1;
 	}
+
+	pdev->dev.archdata.iov_data = NULL;
+	kfree(iov);
 }
 
 static void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
@@ -3321,7 +3322,7 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
 {
 	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
-	struct pci_dn *pdn = pci_get_pdn(pdev);
+	struct pnv_iov_data *iov = pnv_iov_get(pdev);
 	resource_size_t align;
 
 	/*
@@ -3342,12 +3343,21 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 	 * M64 segment size if IOV BAR size is less.
 	 */
 	align = pci_iov_resource_size(pdev, resno);
-	if (!pdn->vfs_expanded)
+
+	/*
+	 * iov can be null if we have an SR-IOV device with IOV BAR that can't
+	 * be placed in the m64 space (i.e. The BAR is 32bit or non-prefetch).
+	 * In that case we don't allow VFs to be enabled so just return the
+	 * default alignment.
+	 */
+	if (!iov)
 		return align;
-	if (pdn->m64_single_mode)
+	if (!iov->vfs_expanded)
+		return align;
+	if (iov->m64_single_mode)
 		return max(align, (resource_size_t)phb->ioda.m64_segsize);
 
-	return pdn->vfs_expanded * align;
+	return iov->vfs_expanded * align;
 }
 #endif /* CONFIG_PCI_IOV */
 
@@ -3545,12 +3555,21 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 	struct pnv_ioda_pe *pe;
 
+	/* The VF PE state is torn down when sriov_disable() is called */
 	if (pdev->is_virtfn)
 		return;
 
 	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
 		return;
 
+	/*
+	 * FIXME: Try move this to sriov_disable(). It's here since we allocate
+	 * the iov state at probe time since we need to fiddle with the IOV
+	 * resources.
+	 */
+	if (pdev->is_physfn)
+		kfree(pdev->dev.archdata.iov_data);
+
 	/*
 	 * PCI hotplug can happen as part of EEH error recovery. The @pdn
 	 * isn't removed and added afterwards in this scenario. We should
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 52dc4d05eaca..0e875f714911 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -168,6 +168,42 @@ struct pnv_phb {
 	u8			*diag_data;
 };
 
+#ifdef CONFIG_PCI_IOV
+/*
+ * For SR-IOV we want to put each VF's MMIO resource into a separate PE.
+ * This requires a bit of acrobatics with the MMIO -> PE configuration
+ * and this structure is used to keep track of it all.
+ */
+struct pnv_iov_data {
+	/* number of VFs IOV BAR expanded. FIXME: rename this to something less bad */
+	u16     vfs_expanded;
+
+	/* number of VFs enabled */
+	u16     num_vfs;
+	unsigned int *pe_num_map;	/* PE# for the first VF PE or array */
+
+	/* Did we map the VF BARs with single-PE IODA BARs? */
+	bool    m64_single_mode;
+
+	int     (*m64_map)[PCI_SRIOV_NUM_BARS];
+#define IODA_INVALID_M64        (-1)
+
+	/*
+	 * If we map the SR-IOV BARs with a segmented window then
+	 * parts of that window will be "claimed" by other PEs.
+	 *
+	 * "holes" here is used to reserve the leading portion
+	 * of the window that is used by other (non VF) PEs.
+	 */
+	struct resource holes[PCI_SRIOV_NUM_BARS];
+};
+
+static inline struct pnv_iov_data *pnv_iov_get(struct pci_dev *pdev)
+{
+	return pdev->dev.archdata.iov_data;
+}
+#endif
+
 extern struct pci_ops pnv_pci_ops;
 
 void pnv_pci_dump_phb_diag_data(struct pci_controller *hose,
-- 
2.21.0



* [Very RFC 29/46] powernv/pci: Remove open-coded PE lookup in PELT-V setup
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (27 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 28/46] powernv/iov: Move SR-IOV PF state out of pci_dn Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  4:26   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 30/46] powernv/pci: Remove open-coded PE lookup in PELT-V teardown Oliver O'Halloran
                   ` (16 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 32 +++++++++++++++++------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 1c90feed233d..5bd7c1b058da 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -760,6 +760,11 @@ static int pnv_ioda_set_peltv(struct pnv_phb *phb,
 		}
 	}
 
+	/*
+	 * Walk the bridges up to the root. Along the way mark this PE as
+	 * downstream of the bridge PE(s) so that upstream errors
+	 * also cause this PE to be frozen.
+	 */
 	if (pe->flags & (PNV_IODA_PE_BUS_ALL | PNV_IODA_PE_BUS))
 		pdev = pe->pbus->self;
 	else if (pe->flags & PNV_IODA_PE_DEV)
@@ -768,16 +773,27 @@ static int pnv_ioda_set_peltv(struct pnv_phb *phb,
 	else if (pe->flags & PNV_IODA_PE_VF)
 		pdev = pe->parent_dev;
 #endif /* CONFIG_PCI_IOV */
+
 	while (pdev) {
-		struct pci_dn *pdn = pci_get_pdn(pdev);
-		struct pnv_ioda_pe *parent;
+		struct pnv_ioda_pe *parent = pnv_ioda_get_pe(pdev);
 
-		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
-			parent = &phb->ioda.pe_array[pdn->pe_number];
-			ret = pnv_ioda_set_one_peltv(phb, parent, pe, is_add);
-			if (ret)
-				return ret;
-		}
+		/*
+		 * FIXME: This is called from pcibios_setup_bridge(), which is called
+		 * from the bottom (leaf) bridge to the root. This means that this
+		 * doesn't actually set up the PELT-V entries since the PEs for
+		 * the bridges above are assigned after this is run for the leaf.
+		 *
+		 * FIXMEFIXME: might not be true since moving PE configuration
+		 * into pcibios_bus_add_device().
+		 */
+		if (!parent)
+			break;
+
+		WARN_ON(!parent || parent->pe_number == IODA_INVALID_PE);
+
+		ret = pnv_ioda_set_one_peltv(phb, parent, pe, is_add);
+		if (ret)
+			return ret;
 
 		pdev = pdev->bus->self;
 	}
-- 
2.21.0



* [Very RFC 30/46] powernv/pci: Remove open-coded PE lookup in PELT-V teardown
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (28 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 29/46] powernv/pci: Remove open-coded PE lookup in PELT-V setup Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  4:50   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup() Oliver O'Halloran
                   ` (15 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5bd7c1b058da..d4b5ee926222 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -853,11 +853,13 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 
 	/* Release from all parents PELT-V */
 	while (parent) {
-		struct pci_dn *pdn = pci_get_pdn(parent);
-		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
-			rc = opal_pci_set_peltv(phb->opal_id, pdn->pe_number,
-						pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN);
-			/* XXX What to do in case of error ? */
+		struct pnv_ioda_pe *parent_pe = pnv_ioda_get_pe(parent);
+
+		if (parent_pe) {
+			rc = opal_pci_set_peltv(phb->opal_id,
+						parent_pe->pe_number,
+						pe->pe_number,
+						OPAL_REMOVE_PE_FROM_DOMAIN);
 		}
 		parent = parent->bus->self;
 	}
-- 
2.21.0



* [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (29 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 30/46] powernv/pci: Remove open-coded PE lookup in PELT-V teardown Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-21  7:52   ` Christoph Hellwig
  2019-11-27  4:53   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 32/46] powernv/pci: Remove open-coded PE lookup in iommu_bypass_supported() Oliver O'Halloran
                   ` (14 subsequent siblings)
  45 siblings, 2 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Use the helper to look up the pnv_ioda_pe for the device we're configuring DMA
for. In the VF case there's no need to set pdn->pe_number since nothing looks
at it any more.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d4b5ee926222..98d858999a2d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1709,10 +1709,9 @@ int pnv_pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 
 static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev)
 {
-	struct pci_dn *pdn = pci_get_pdn(pdev);
 	struct pnv_ioda_pe *pe;
 
-	pe = &phb->ioda.pe_array[pdn->pe_number];
+	pe = pnv_ioda_get_pe(pdev);
 	WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
 	pdev->dev.archdata.dma_offset = pe->tce_bypass_base;
 	set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
-- 
2.21.0



* [Very RFC 32/46] powernv/pci: Remove open-coded PE lookup in iommu_bypass_supported()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (30 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  5:09   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 33/46] powernv/pci: Remove open-coded PE lookup in iommu notifier Oliver O'Halloran
                   ` (13 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 98d858999a2d..7e88de18ead6 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1801,13 +1801,11 @@ static bool pnv_pci_ioda_iommu_bypass_supported(struct pci_dev *pdev,
 		u64 dma_mask)
 {
 	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
-	struct pci_dn *pdn = pci_get_pdn(pdev);
-	struct pnv_ioda_pe *pe;
+	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
 
-	if (WARN_ON(!pdn || pdn->pe_number == IODA_INVALID_PE))
+	if (WARN_ON(!pe))
 		return false;
 
-	pe = &phb->ioda.pe_array[pdn->pe_number];
 	if (pe->tce_bypass_enabled) {
 		u64 top = pe->tce_bypass_base + memblock_end_of_DRAM() - 1;
 		if (dma_mask >= top)
-- 
2.21.0



* [Very RFC 33/46] powernv/pci: Remove open-coded PE lookup in iommu notifier
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (31 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 32/46] powernv/pci: Remove open-coded PE lookup in iommu_bypass_supported() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  5:09   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 34/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_enable_device_hook() Oliver O'Halloran
                   ` (12 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 5b1f4677cdce..0eeea8652426 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -943,23 +943,22 @@ static int pnv_tce_iommu_bus_notifier(struct notifier_block *nb,
 {
 	struct device *dev = data;
 	struct pci_dev *pdev;
-	struct pci_dn *pdn;
 	struct pnv_ioda_pe *pe;
 	struct pnv_phb *phb;
 
 	switch (action) {
 	case BUS_NOTIFY_ADD_DEVICE:
 		pdev = to_pci_dev(dev);
-		pdn = pci_get_pdn(pdev);
 		phb = pci_bus_to_pnvhb(pdev->bus);
 
 		WARN_ON_ONCE(!phb);
-		if (!pdn || pdn->pe_number == IODA_INVALID_PE || !phb)
+		if (!phb)
 			return 0;
 
-		pe = &phb->ioda.pe_array[pdn->pe_number];
-		if (!pe->table_group.group)
+		pe = pnv_ioda_get_pe(pdev);
+		if (!pe || !pe->table_group.group)
 			return 0;
+
 		iommu_add_device(&pe->table_group, dev);
 		return 0;
 	case BUS_NOTIFY_DEL_DEVICE:
-- 
2.21.0



* [Very RFC 34/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_enable_device_hook()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (32 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 33/46] powernv/pci: Remove open-coded PE lookup in iommu notifier Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  5:14   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device Oliver O'Halloran
                   ` (11 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 7e88de18ead6..4f38652c7cd7 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3382,7 +3382,6 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
 {
 	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
-	struct pci_dn *pdn;
 
 	/* The function is probably called while the PEs have
 	 * not be created yet. For example, resource reassignment
@@ -3392,11 +3391,7 @@ static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
 	if (!phb->initialized)
 		return true;
 
-	pdn = pci_get_pdn(dev);
-	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
-		return false;
-
-	return true;
+	return !!pnv_ioda_get_pe(dev);
 }
 
 static long pnv_pci_ioda1_unset_window(struct iommu_table_group *table_group,
-- 
2.21.0



* [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (33 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 34/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_enable_device_hook() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  5:24   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 36/46] powernv/npu: Remove open-coded PE lookup for GPU device Oliver O'Halloran
                   ` (10 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 4f38652c7cd7..8525642b1256 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3562,14 +3562,14 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
 static void pnv_pci_release_device(struct pci_dev *pdev)
 {
 	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
+	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
 	struct pci_dn *pdn = pci_get_pdn(pdev);
-	struct pnv_ioda_pe *pe;
 
 	/* The VF PE state is torn down when sriov_disable() is called */
 	if (pdev->is_virtfn)
 		return;
 
-	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
+	if (WARN_ON(!pe))
 		return;
 
 	/*
@@ -3588,7 +3588,6 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
 	 * be increased on adding devices. It leads to unbalanced PE's device
 	 * count and eventually make normal PCI hotplug path broken.
 	 */
-	pe = &phb->ioda.pe_array[pdn->pe_number];
 	pdn->pe_number = IODA_INVALID_PE;
 
 	WARN_ON(--pe->device_count < 0);
-- 
2.21.0



* [Very RFC 36/46] powernv/npu: Remove open-coded PE lookup for GPU device
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (34 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  5:45   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 37/46] powernv/pci: Use the PHB's rmap for pnv_ioda_to_pe() Oliver O'Halloran
                   ` (9 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/npu-dma.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index b95b9e3c4c98..68bfaef44862 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -97,25 +97,16 @@ EXPORT_SYMBOL(pnv_pci_get_npu_dev);
 static struct pnv_ioda_pe *get_gpu_pci_dev_and_pe(struct pnv_ioda_pe *npe,
 						  struct pci_dev **gpdev)
 {
-	struct pnv_phb *phb;
-	struct pci_controller *hose;
 	struct pci_dev *pdev;
 	struct pnv_ioda_pe *pe;
-	struct pci_dn *pdn;
 
 	pdev = pnv_pci_get_gpu_dev(npe->pdev);
 	if (!pdev)
 		return NULL;
 
-	pdn = pci_get_pdn(pdev);
-	if (WARN_ON(!pdn || pdn->pe_number == IODA_INVALID_PE))
-		return NULL;
-
-	hose = pci_bus_to_host(pdev->bus);
-	phb = hose->private_data;
-	pe = &phb->ioda.pe_array[pdn->pe_number];
+	pe = pnv_ioda_get_pe(pdev);
 
-	if (gpdev)
+	if (pe && gpdev)
 		*gpdev = pdev;
 
 	return pe;
-- 
2.21.0



* [Very RFC 37/46] powernv/pci: Use the PHB's rmap for pnv_ioda_to_pe()
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (35 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 36/46] powernv/npu: Remove open-coded PE lookup for GPU device Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-21  3:50   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 38/46] powerpc/pci-hotplug: Scan the whole bus when using PCI_PROBE_NORMAL Oliver O'Halloran
                   ` (8 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Rather than using the pdn->pe_number for a device as an index into the
IODA PE array we can use the reverse map. This maps the RID (i.e. bdfn)
to the PE number associated with it. Firmware maintains a copy of the
rmap which is used by the hardware for determining which PE to use
when handling a DMA so this gets us a bit closer to the model used
by the HW, which is comprehensible by mortals, rather than... whatever
the hell is going on currently.
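
To illustrate the idea, here is a minimal user-space sketch of a reverse-map
lookup. It is not the kernel code: struct phb_model, INVALID_PE, make_bdfn(),
assign_pe() and lookup_pe() are hypothetical names invented for this example,
and the table sizes are assumptions. Only the bdfn encoding (bus number in the
high byte, devfn in the low byte) is taken from the patch below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RIDS   65536      /* one rmap entry per 16-bit RID (bdfn) */
#define NUM_PES    256        /* assumed PE count for this model only */
#define INVALID_PE 0xffffu    /* hypothetical "no PE assigned" sentinel */

struct pe {
	unsigned int pe_number;
};

/* Hypothetical stand-in for the per-PHB state: RID -> PE number table. */
struct phb_model {
	uint16_t rmap[NUM_RIDS];
	struct pe pe_array[NUM_PES];
};

/* Build the RID the same way the patch does: (bus << 8) | devfn. */
static uint16_t make_bdfn(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((unsigned int)bus << 8) | devfn);
}

/* Mark every RID as unassigned and number the PEs. */
static void phb_model_init(struct phb_model *phb)
{
	size_t i;

	assert(phb != NULL);
	for (i = 0; i < NUM_RIDS; i++)
		phb->rmap[i] = INVALID_PE;
	for (i = 0; i < NUM_PES; i++)
		phb->pe_array[i].pe_number = (unsigned int)i;
}

/* Record that a RID belongs to a PE. */
static void assign_pe(struct phb_model *phb, uint16_t bdfn, uint16_t pe_number)
{
	assert(pe_number < NUM_PES);
	phb->rmap[bdfn] = pe_number;
}

/* Look up the PE for a RID; returns NULL when nothing is assigned. */
static struct pe *lookup_pe(struct phb_model *phb, uint16_t bdfn)
{
	uint16_t pe_number = phb->rmap[bdfn];

	if (pe_number == INVALID_PE)
		return NULL;
	return &phb->pe_array[pe_number];
}
```

In this model the lookup needs nothing but the RID, so no per-device side
structure has to carry a PE number around.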

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8525642b1256..d111a50fbe68 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -672,13 +672,9 @@ struct pnv_ioda_pe *__pnv_ioda_get_pe(struct pnv_phb *phb, u16 bdfn)
 struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev)
 {
 	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
-	struct pci_dn *pdn = pci_get_pdn(dev);
+	u16 bdfn = (dev->bus->number << 8) | dev->devfn;
 
-	if (!pdn)
-		return NULL;
-	if (pdn->pe_number == IODA_INVALID_PE)
-		return NULL;
-	return &phb->ioda.pe_array[pdn->pe_number];
+	return __pnv_ioda_get_pe(phb, bdfn);
 }
 
 static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
-- 
2.21.0



* [Very RFC 38/46] powerpc/pci-hotplug: Scan the whole bus when using PCI_PROBE_NORMAL
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (36 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 37/46] powernv/pci: Use the PHB's rmap for pnv_ioda_to_pe() Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  6:27   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 39/46] powernv/npu: Avoid pci_dn when mapping device_node to a pci_dev Oliver O'Halloran
                   ` (7 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Currently when using the normal (i.e. not building pci_dev's from the DT
node) probe method we only scan the devfn corresponding to the first child
of the bridge's DT node. This doesn't make much sense to me, but it seems
to have worked so far. At a guess it seems to work because in a PCIe
environment the first downstream child will be at devfn 00.0.

In any case it's completely broken when no pci_dn is available. Remove
the PCI_DN checking and scan each device number that might be on the
downstream bus.
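
As a rough sketch of what "each device number" means here: a devfn packs a
5-bit device (slot) number and a 3-bit function number, so stepping devfn by 8
visits function 0 of every slot. The helper below is a user-space model, not
the hotplug code; EX_PCI_DEVFN, EX_PCI_SLOT and enumerate_slot_devfns() are
names made up for this example, with the two macros written to mirror the
usual Linux PCI_DEVFN/PCI_SLOT encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the usual devfn encoding: slot in bits 7:3, function in 2:0. */
#define EX_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define EX_PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)

#define EX_NUM_SLOTS 32

/*
 * Fill out[] with the devfn of function 0 for every slot on a bus and
 * return how many entries were written. A real scan would probe each
 * of these rather than just record them.
 */
static size_t enumerate_slot_devfns(unsigned int out[EX_NUM_SLOTS])
{
	unsigned int devfn;
	size_t n = 0;

	assert(out != NULL);
	for (devfn = 0; devfn < 256; devfn += 8)
		out[n++] = devfn;

	return n;
}
```

The loop in the patch uses "slotno < 255" as its bound; since the last value
visited is 248 either way, that should cover the same 32 slots as the bound
used in this sketch.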

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
I'm not sure we should be using pci_scan_slot() directly here. Maybe
there's some insane legacy reason for it.
---
 arch/powerpc/kernel/pci-hotplug.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index d6a67f814983..85299c769768 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -123,17 +123,10 @@ void pci_hp_add_devices(struct pci_bus *bus)
 	if (mode == PCI_PROBE_DEVTREE) {
 		/* use ofdt-based probe */
 		of_rescan_bus(dn, bus);
-	} else if (mode == PCI_PROBE_NORMAL &&
-		   dn->child && PCI_DN(dn->child)) {
-		/*
-		 * Use legacy probe. In the partial hotplug case, we
-		 * probably have grandchildren devices unplugged. So
-		 * we don't check the return value from pci_scan_slot() in
-		 * order for fully rescan all the way down to pick them up.
-		 * They can have been removed during partial hotplug.
-		 */
-		slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
-		pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
+	} else if (mode == PCI_PROBE_NORMAL) {
+		for (slotno = 0; slotno < 255; slotno += 8)
+			pci_scan_slot(bus, slotno);
+
 		max = bus->busn_res.start;
 		/*
 		 * Scan bridges that are already configured. We don't touch
-- 
2.21.0



* [Very RFC 39/46] powernv/npu: Avoid pci_dn when mapping device_node to a pci_dev
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (37 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 38/46] powerpc/pci-hotplug: Scan the whole bus when using PCI_PROBE_NORMAL Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  6:58   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs Oliver O'Halloran
                   ` (6 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

There's no need to use the pci_dn to find a device_node from a pci_dev.
Just search for the node pointed to by the pci_dev's of_node pointer.
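
For illustration only (not part of the patch): a toy model of the lookup
described above. All of the toy_* names are hypothetical stand-ins. The
real code walks devices with for_each_pci_dev(), which hands back the
matching device with its reference count raised; the sketch mimics that
with a plain counter.

```c
#include <stddef.h>

struct toy_node {
	const char *name;		/* stand-in for struct device_node */
};

struct toy_pci_dev {
	struct toy_node *of_node;	/* stand-in for pdev->dev.of_node */
	int refcount;
};

/* Linear search for the device whose of_node pointer matches dn.
 * The match is returned with one extra reference held, so the caller
 * is expected to drop it when done. Returns NULL if nothing matches. */
struct toy_pci_dev *toy_find_by_node(struct toy_pci_dev *devs, size_t n,
				     const struct toy_node *dn)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].of_node == dn) {
			devs[i].refcount++;
			return &devs[i];
		}
	}
	return NULL;
}
```

The comparison is on pointer identity, not on the node's name or path,
which is all the patch relies on.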

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/npu-dma.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 68bfaef44862..72d3749da02c 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -21,11 +21,11 @@
 
 static struct pci_dev *get_pci_dev(struct device_node *dn)
 {
-	struct pci_dn *pdn = PCI_DN(dn);
-	struct pci_dev *pdev;
+	struct pci_dev *pdev = NULL;
 
-	pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
-					   pdn->busno, pdn->devfn);
+	for_each_pci_dev(pdev)
+		if (pdev->dev.of_node == dn)
+			break;
 
 	/*
 	 * pci_get_domain_bus_and_slot() increased the reference count of
-- 
2.21.0



* [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (38 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 39/46] powernv/npu: Avoid pci_dn when mapping device_node to a pci_dev Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  7:09   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 41/46] powernv/eeh: Remove pdn setup for SR-IOV VFs Oliver O'Halloran
                   ` (5 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

The comment here implies that we don't need to take a ref to the pci_dev
because the ioda_pe will always have one. This implies that the current
expectation is that the pci_dev for an NPU device will *never* be torn
down since the ioda_pe having a ref to the device will prevent the
release function from being called.

In other words, the desired behaviour here appears to be leaking a ref.

Nice!
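
For illustration only (not part of the patch): a toy refcount model of
the problem described above. The toy_* names are hypothetical; the only
assumption carried over from the kernel is that the release function
runs when the last reference is dropped.

```c
#include <stdbool.h>

struct toy_dev {
	int refcount;
	bool released;	/* set when the (toy) release function has run */
};

void toy_dev_get(struct toy_dev *d)
{
	d->refcount++;
}

void toy_dev_put(struct toy_dev *d)
{
	/* The release function only runs on the final put. */
	if (--d->refcount == 0)
		d->released = true;
}

/* Model a hot-unplug: the core holds one reference and drops it.
 * If the PE also took a reference that is only dropped from the
 * release function, the count never reaches zero. */
bool toy_released_after_removal(bool pe_takes_ref)
{
	struct toy_dev d = { .refcount = 1, .released = false };

	if (pe_takes_ref)
		toy_dev_get(&d);
	toy_dev_put(&d);	/* the core's reference goes away */

	return d.released;
}
```

toy_released_after_removal(true) returns false: the device is never
released, which is the leak. With pe_takes_ref set to false it returns
true.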

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 72d3749da02c..2eb6e6d45a98 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
 			break;
 
 	/*
-	 * pci_get_domain_bus_and_slot() increased the reference count of
-	 * the PCI device, but callers don't need that actually as the PE
-	 * already holds a reference to the device. Since callers aren't
-	 * aware of the reference count change, call pci_dev_put() now to
-	 * avoid leaks.
+	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
+	 * Caller is responsible for dropping the ref when it's
+	 * finished with it.
 	 */
-	if (pdev)
-		pci_dev_put(pdev);
-
 	return pdev;
 }
 
-- 
2.21.0



* [Very RFC 41/46] powernv/eeh: Remove pdn setup for SR-IOV VFs
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (39 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  7:14   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 42/46] powernv/pci: Don't clear pdn->pe_number in pnv_pci_release_device Oliver O'Halloran
                   ` (4 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

We don't need a pci_dn for the VF any more, so we can skip adding them.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d111a50fbe68..d3e375d71cdc 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1526,7 +1526,6 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
 		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
 		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
-		struct pci_dn *vf_pdn;
 
 		if (iov->m64_single_mode)
 			pe_num = iov->pe_num_map[vf_index];
@@ -1558,15 +1557,6 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 		list_add_tail(&pe->list, &phb->ioda.pe_list);
 		mutex_unlock(&phb->ioda.pe_list_mutex);
 
-		/* associate this pe to it's pdn */
-		list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
-			if (vf_pdn->busno == vf_bus &&
-			    vf_pdn->devfn == vf_devfn) {
-				vf_pdn->pe_number = pe_num;
-				break;
-			}
-		}
-
 		pnv_pci_ioda2_setup_dma_pe(phb, pe);
 #ifdef CONFIG_IOMMU_API
 		iommu_register_group(&pe->table_group,
@@ -1688,17 +1678,11 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 int pnv_pcibios_sriov_disable(struct pci_dev *pdev)
 {
 	pnv_pci_sriov_disable(pdev);
-
-	/* Release PCI data */
-	remove_sriov_vf_pdns(pdev);
 	return 0;
 }
 
 int pnv_pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 {
-	/* Allocate PCI data */
-	add_sriov_vf_pdns(pdev);
-
 	return pnv_pci_sriov_enable(pdev, num_vfs);
 }
 #endif /* CONFIG_PCI_IOV */
-- 
2.21.0



* [Very RFC 42/46] powernv/pci: Don't clear pdn->pe_number in pnv_pci_release_device
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (40 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 41/46] powernv/eeh: Remove pdn setup for SR-IOV VFs Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27  7:30   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 43/46] powernv/pci: Do not set pdn->pe_number for NPU/CAPI devices Oliver O'Halloran
                   ` (3 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Nothing looks at it anymore.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d3e375d71cdc..45d940730c30 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3541,9 +3541,7 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
 
 static void pnv_pci_release_device(struct pci_dev *pdev)
 {
-	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
 	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
-	struct pci_dn *pdn = pci_get_pdn(pdev);
 
 	/* The VF PE state is torn down when sriov_disable() is called */
 	if (pdev->is_virtfn)
@@ -3560,16 +3558,6 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
 	if (pdev->is_physfn)
 		kfree(pdev->dev.archdata.iov_data);
 
-	/*
-	 * PCI hotplug can happen as part of EEH error recovery. The @pdn
-	 * isn't removed and added afterwards in this scenario. We should
-	 * set the PE number in @pdn to an invalid one. Otherwise, the PE's
-	 * device count is decreased on removing devices while failing to
-	 * be increased on adding devices. It leads to unbalanced PE's device
-	 * count and eventually make normal PCI hotplug path broken.
-	 */
-	pdn->pe_number = IODA_INVALID_PE;
-
 	WARN_ON(--pe->device_count < 0);
 	if (pe->device_count == 0)
 		pnv_ioda_release_pe(pe);
-- 
2.21.0



* [Very RFC 43/46] powernv/pci: Do not set pdn->pe_number for NPU/CAPI devices
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (41 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 42/46] powernv/pci: Don't clear pdn->pe_number in pnv_pci_release_device Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27 22:49   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 44/46] powerpc/pci: Don't set pdn->pe_number when applying the weird P8 NVLink PE hack Oliver O'Halloran
                   ` (2 subsequent siblings)
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

The only thing we need the pdn for in this function is setting the pe_number
field, which we don't use anymore. Fix the weird refcounting behaviour while
we're here.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
Either Fred or Reza also fixed this in a recent patch, and that'll probably get
merged before this one does.
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 27 +++++++++--------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 45d940730c30..2a9201306543 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1066,16 +1066,13 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
 {
 	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
-	struct pci_dn *pdn = pci_get_pdn(dev);
-	struct pnv_ioda_pe *pe;
+	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(dev);
 
-	if (!pdn) {
-		pr_err("%s: Device tree node not associated properly\n",
-			   pci_name(dev));
+	/* Already has a PE assigned? huh? */
+	if (pe) {
+		WARN_ON(1);
 		return NULL;
 	}
-	if (pdn->pe_number != IODA_INVALID_PE)
-		return NULL;
 
 	pe = pnv_ioda_alloc_pe(phb);
 	if (!pe) {
@@ -1084,29 +1081,25 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
 		return NULL;
 	}
 
-	/* NOTE: We get only one ref to the pci_dev for the pdn, not for the
-	 * pointer in the PE data structure, both should be destroyed at the
-	 * same time. However, this needs to be looked at more closely again
-	 * once we actually start removing things (Hotplug, SR-IOV, ...)
+	/*
+	 * NB: We **do not** hold a pci_dev ref for pe->pdev.
 	 *
-	 * At some point we want to remove the PDN completely anyways
+	 * The pci_dev's release function cleans up the ioda_pe state, so:
+	 *  a) We can't take a ref otherwise the release function is never called
+	 *  b) The pe->pdev pointer will always point to valid pci_dev (or NULL)
 	 */
-	pci_dev_get(dev);
-	pdn->pe_number = pe->pe_number;
 	pe->flags = PNV_IODA_PE_DEV;
 	pe->pdev = dev;
 	pe->pbus = NULL;
 	pe->mve_number = -1;
-	pe->rid = dev->bus->number << 8 | pdn->devfn;
+	pe->rid = dev->bus->number << 8 | dev->devfn;
 
 	pe_info(pe, "Associated device to PE\n");
 
 	if (pnv_ioda_configure_pe(phb, pe)) {
 		/* XXX What do we do here ? */
 		pnv_ioda_free_pe(pe);
-		pdn->pe_number = IODA_INVALID_PE;
 		pe->pdev = NULL;
-		pci_dev_put(dev);
 		return NULL;
 	}
 
-- 
2.21.0



* [Very RFC 44/46] powerpc/pci: Don't set pdn->pe_number when applying the weird P8 NVLink PE hack
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (42 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 43/46] powernv/pci: Do not set pdn->pe_number for NPU/CAPI devices Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27 22:54   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 45/46] powernv/pci: Remove requirement for a pdn in config accessors Oliver O'Halloran
  2019-11-20  1:28 ` [Very RFC 46/46] HACK: prevent pdn's from being created Oliver O'Halloran
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

P8 needs to shove four GPUs into three PEs for $reasons. Remove the
pdn->pe_number assignment done there since we just use the pe_rmap[] now.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2a9201306543..eceff27357e5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1183,7 +1183,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
 	long rid;
 	struct pnv_ioda_pe *pe;
 	struct pci_dev *gpu_pdev;
-	struct pci_dn *npu_pdn;
 	struct pnv_phb *phb = pci_bus_to_pnvhb(npu_pdev->bus);
 
 	/*
@@ -1210,9 +1209,8 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
 			dev_info(&npu_pdev->dev,
 				"Associating to existing PE %x\n", pe_num);
 			pci_dev_get(npu_pdev);
-			npu_pdn = pci_get_pdn(npu_pdev);
-			rid = npu_pdev->bus->number << 8 | npu_pdn->devfn;
-			npu_pdn->pe_number = pe_num;
+
+			rid = npu_pdev->bus->number << 8 | npu_pdev->devfn;
 			phb->ioda.pe_rmap[rid] = pe->pe_number;
 
 			/* Map the PE to this link */
-- 
2.21.0



* [Very RFC 45/46] powernv/pci: Remove requirement for a pdn in config accessors
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (43 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 44/46] powerpc/pci: Don't set pdn->pe_number when applying the weird P8 NVLink PE hack Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  2019-11-27 23:00   ` Alexey Kardashevskiy
  2019-11-20  1:28 ` [Very RFC 46/46] HACK: prevent pdn's from being created Oliver O'Halloran
  45 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

:toot:

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/powernv/pci.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 0eeea8652426..6383dcfec606 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -750,17 +750,12 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 			       unsigned int devfn,
 			       int where, int size, u32 *val)
 {
-	struct pci_dn *pdn;
 	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
 	u16 bdfn = bus->number << 8 | devfn;
 	struct eeh_dev *edev;
 	int ret;
 
 	*val = 0xFFFFFFFF;
-	pdn = pci_get_pdn_by_devfn(bus, devfn);
-	if (!pdn)
-		return PCIBIOS_DEVICE_NOT_FOUND;
-
 	edev = pnv_eeh_find_edev(phb, bdfn);
 	if (!pnv_eeh_pre_cfg_check(edev))
 		return PCIBIOS_DEVICE_NOT_FOUND;
@@ -781,16 +776,11 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 				unsigned int devfn,
 				int where, int size, u32 val)
 {
-	struct pci_dn *pdn;
 	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
 	u16 bdfn = bus->number << 8 | devfn;
 	struct eeh_dev *edev;
 	int ret;
 
-	pdn = pci_get_pdn_by_devfn(bus, devfn);
-	if (!pdn)
-		return PCIBIOS_DEVICE_NOT_FOUND;
-
 	edev = pnv_eeh_find_edev(phb, bdfn);
 	if (!pnv_eeh_pre_cfg_check(edev))
 		return PCIBIOS_DEVICE_NOT_FOUND;
-- 
2.21.0



* [Very RFC 46/46] HACK: prevent pdn's from being created
  2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
                   ` (44 preceding siblings ...)
  2019-11-20  1:28 ` [Very RFC 45/46] powernv/pci: Remove requirement for a pdn in config accessors Oliver O'Halloran
@ 2019-11-20  1:28 ` Oliver O'Halloran
  45 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-20  1:28 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Oliver O'Halloran, s.miroshnichenko

Not-Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/kernel/pci_dn.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index f790a8d06f50..0e05c1d7633a 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -289,6 +289,9 @@ struct pci_dn *pci_add_device_node_info(struct pci_controller *hose,
 	struct eeh_dev *edev;
 #endif
 
+	pr_err("skipping adding pdn for %pOF\n", dn);
+	return NULL;
+
 	pdn = kzalloc(sizeof(*pdn), GFP_KERNEL);
 	if (pdn == NULL)
 		return NULL;
-- 
2.21.0



* Re: [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset
  2019-11-20  1:28 ` [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset Oliver O'Halloran
@ 2019-11-21  3:38   ` Alexey Kardashevskiy
  2019-11-21  4:34     ` Oliver O'Halloran
  0 siblings, 1 reply; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-21  3:38 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> After resetting a VF we call eeh_restore_vf_config() to restore several
> registers in the VFs config space. For physical functions this is normally
> handled by the pci_reinit_device() OPAL call which requests firmware to
> re-program the device with whatever defaults were set at boot time. We
> can't use that for VFs since OPAL (being firmware) doesn't know (or care)
> about VFs.
> 
> However, the fields that are restored here are all marked as reserved for
> VFs in the SR-IOV spec. In other words, eeh_restore_vf_config() doesn't
> actually do anything.
> 
> There is an argument to be made that we should be saving and restoring
> some of these fields since they are marked as "Reserved, but Preserve"
> (ResvP) to allow these fields to be used in new versions of the SR-IOV spec.
> However, the current code doesn't even do that properly since it assumes
> they can be set to whatever the EEH core has assumed to be correct. If
> the fields *are* used in future versions of the SR-IOV spec this code
> is still broken since it doesn't take into account any changes made
> by the driver, or the Linux IOV core.
> Given the above, just delete the code. It's broken, it's misleading,
> and it's getting in the way of doing useful cleanups.


There is still a prototype for this in arch/powerpc/include/asm/eeh.h,
and pci_dn::mps as well.


With the history of this explained offline,

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>





> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/kernel/eeh.c                    | 59 --------------------
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 39 +++----------
>  arch/powerpc/platforms/pseries/eeh_pseries.c | 26 +--------
>  3 files changed, 8 insertions(+), 116 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index ae0a9c421d7b..a3b93db972fc 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -742,65 +742,6 @@ static void eeh_restore_dev_state(struct eeh_dev *edev, void *userdata)
>  		pci_restore_state(pdev);
>  }
>  
> -int eeh_restore_vf_config(struct pci_dn *pdn)
> -{
> -	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
> -	u32 devctl, cmd, cap2, aer_capctl;
> -	int old_mps;
> -
> -	if (edev->pcie_cap) {
> -		/* Restore MPS */
> -		old_mps = (ffs(pdn->mps) - 8) << 5;
> -		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
> -				     2, &devctl);
> -		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
> -		devctl |= old_mps;
> -		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
> -				      2, devctl);
> -
> -		/* Disable Completion Timeout if possible */
> -		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP2,
> -				     4, &cap2);
> -		if (cap2 & PCI_EXP_DEVCAP2_COMP_TMOUT_DIS) {
> -			eeh_ops->read_config(pdn,
> -					     edev->pcie_cap + PCI_EXP_DEVCTL2,
> -					     4, &cap2);
> -			cap2 |= PCI_EXP_DEVCTL2_COMP_TMOUT_DIS;
> -			eeh_ops->write_config(pdn,
> -					      edev->pcie_cap + PCI_EXP_DEVCTL2,
> -					      4, cap2);
> -		}
> -	}
> -
> -	/* Enable SERR and parity checking */
> -	eeh_ops->read_config(pdn, PCI_COMMAND, 2, &cmd);
> -	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
> -	eeh_ops->write_config(pdn, PCI_COMMAND, 2, cmd);
> -
> -	/* Enable report various errors */
> -	if (edev->pcie_cap) {
> -		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
> -				     2, &devctl);
> -		devctl &= ~PCI_EXP_DEVCTL_CERE;
> -		devctl |= (PCI_EXP_DEVCTL_NFERE |
> -			   PCI_EXP_DEVCTL_FERE |
> -			   PCI_EXP_DEVCTL_URRE);
> -		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
> -				      2, devctl);
> -	}
> -
> -	/* Enable ECRC generation and check */
> -	if (edev->pcie_cap && edev->aer_cap) {
> -		eeh_ops->read_config(pdn, edev->aer_cap + PCI_ERR_CAP,
> -				     4, &aer_capctl);
> -		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
> -		eeh_ops->write_config(pdn, edev->aer_cap + PCI_ERR_CAP,
> -				      4, aer_capctl);
> -	}
> -
> -	return 0;
> -}
> -
>  /**
>   * pcibios_set_pcie_reset_state - Set PCI-E reset state
>   * @dev: pci device struct
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index ef727ecd99cd..b2ac4130fda7 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -1649,20 +1649,13 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
>  	if (!edev)
>  		return -EEXIST;
>  
> -	/*
> -	 * We have to restore the PCI config space after reset since the
> -	 * firmware can't see SRIOV VFs.
> -	 *
> -	 * FIXME: The MPS, error routing rules, timeout setting are worthy
> -	 * to be exported by firmware in extendible way.
> -	 */
> -	if (edev->physfn) {
> -		ret = eeh_restore_vf_config(pdn);
> -	} else {
> -		phb = pdn->phb->private_data;
> -		ret = opal_pci_reinit(phb->opal_id,
> -				      OPAL_REINIT_PCI_DEV, config_addr);
> -	}
> +	/* Nothing to do for VFs */
> +	if (edev->physfn)
> +		return 0;
> +
> +	phb = pdn->phb->private_data;
> +	ret = opal_pci_reinit(phb->opal_id,
> +			      OPAL_REINIT_PCI_DEV, config_addr);
>  
>  	if (ret) {
>  		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
> @@ -1691,24 +1684,6 @@ static struct eeh_ops pnv_eeh_ops = {
>  	.notify_resume		= NULL
>  };
>  
> -#ifdef CONFIG_PCI_IOV
> -static void pnv_pci_fixup_vf_mps(struct pci_dev *pdev)
> -{
> -	struct pci_dn *pdn = pci_get_pdn(pdev);
> -	int parent_mps;
> -
> -	if (!pdev->is_virtfn)
> -		return;
> -
> -	/* Synchronize MPS for VF and PF */
> -	parent_mps = pcie_get_mps(pdev->physfn);
> -	if ((128 << pdev->pcie_mpss) >= parent_mps)
> -		pcie_set_mps(pdev, parent_mps);
> -	pdn->mps = pcie_get_mps(pdev);
> -}
> -DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_mps);
> -#endif /* CONFIG_PCI_IOV */
> -
>  /**
>   * eeh_powernv_init - Register platform dependent EEH operations
>   *
> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
> index 95bbf9102584..fa704d7052ec 100644
> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
> @@ -657,30 +657,6 @@ static int pseries_eeh_write_config(struct pci_dn *pdn, int where, int size, u32
>  	return rtas_write_config(pdn, where, size, val);
>  }
>  
> -static int pseries_eeh_restore_config(struct pci_dn *pdn)
> -{
> -	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
> -	s64 ret = 0;
> -
> -	if (!edev)
> -		return -EEXIST;
> -
> -	/*
> -	 * FIXME: The MPS, error routing rules, timeout setting are worthy
> -	 * to be exported by firmware in extendible way.
> -	 */
> -	if (edev->physfn)
> -		ret = eeh_restore_vf_config(pdn);
> -
> -	if (ret) {
> -		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
> -			__func__, edev->pe_config_addr, ret);
> -		return -EIO;
> -	}
> -
> -	return ret;
> -}
> -
>  #ifdef CONFIG_PCI_IOV
>  int pseries_send_allow_unfreeze(struct pci_dn *pdn,
>  				u16 *vf_pe_array, int cur_vfs)
> @@ -786,7 +762,7 @@ static struct eeh_ops pseries_eeh_ops = {
>  	.read_config		= pseries_eeh_read_config,
>  	.write_config		= pseries_eeh_write_config,
>  	.next_error		= NULL,
> -	.restore_config		= pseries_eeh_restore_config,
> +	.restore_config		= NULL,
>  #ifdef CONFIG_PCI_IOV
>  	.notify_resume		= pseries_notify_resume
>  #endif
> 

-- 
Alexey


* Re: [Very RFC 37/46] powernv/pci: Use the PHB's rmap for pnv_ioda_to_pe()
  2019-11-20  1:28 ` [Very RFC 37/46] powernv/pci: Use the PHB's rmap for pnv_ioda_to_pe() Oliver O'Halloran
@ 2019-11-21  3:50   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-21  3:50 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Rather than using the pdn->pe_number for a device as an index into the
> IODA PE array we can use the reverse map. This maps the RID (i.e. bdfn)
> to the PE number associated with it. Firmware maintains a copy of the
> rmap which is used by the hardware for determining which PE to use
> when handling a DMA so this gets us a bit closer to the model used
> by the HW, which is comprehensible by mortals, rather than... whatever

s/comprensible/comprehensible/ ?

> the hell is going on currently.
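
For illustration only (not from the patch): a toy model of a reverse map
lookup as described in the quoted text. The toy_* names, the array sizes
and TOY_INVALID_PE are assumptions made for the sketch; the body of
__pnv_ioda_get_pe() is not shown in this thread, so this is not a copy of
it.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_INVALID_PE	(-1)	/* stand-in for IODA_INVALID_PE */
#define TOY_RMAP_SIZE	0x10000	/* one entry per possible 16-bit RID */
#define TOY_NUM_PES	4

struct toy_pe {
	int pe_number;
};

struct toy_phb {
	int pe_rmap[TOY_RMAP_SIZE];	/* RID (bdfn) -> PE number */
	struct toy_pe pe_array[TOY_NUM_PES];
};

/* RID layout used in the patch: bus number in bits 15:8, devfn in 7:0 */
uint16_t toy_bdfn(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

/* Start with every RID unassigned. */
void toy_phb_init(struct toy_phb *phb)
{
	size_t i;

	for (i = 0; i < TOY_RMAP_SIZE; i++)
		phb->pe_rmap[i] = TOY_INVALID_PE;
	for (i = 0; i < TOY_NUM_PES; i++)
		phb->pe_array[i].pe_number = (int)i;
}

/* Look up the PE for a RID; NULL means no PE is assigned to it. */
struct toy_pe *toy_get_pe(struct toy_phb *phb, uint16_t bdfn)
{
	int pe_num = phb->pe_rmap[bdfn];

	if (pe_num == TOY_INVALID_PE)
		return NULL;
	return &phb->pe_array[pe_num];
}
```

The point of the sketch is that the lookup needs only the PHB and the
RID, so no per-device side structure has to carry a PE number.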


Merge this into 02/46?


> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 8525642b1256..d111a50fbe68 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -672,13 +672,9 @@ struct pnv_ioda_pe *__pnv_ioda_get_pe(struct pnv_phb *phb, u16 bdfn)
>  struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev)
>  {
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
> -	struct pci_dn *pdn = pci_get_pdn(dev);
> +	u16 bdfn = (dev->bus->number << 8) | dev->devfn;
>  
> -	if (!pdn)
> -		return NULL;
> -	if (pdn->pe_number == IODA_INVALID_PE)
> -		return NULL;
> -	return &phb->ioda.pe_array[pdn->pe_number];
> +	return __pnv_ioda_get_pe(phb, bdfn);
>  }
>  
>  static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
> 

-- 
Alexey


* Re: [Very RFC 03/46] powernv/pci: Remove dma_dev_setup() for NPU PHBs
  2019-11-20  1:28 ` [Very RFC 03/46] powernv/pci: Remove dma_dev_setup() for NPU PHBs Oliver O'Halloran
@ 2019-11-21  3:57   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-21  3:57 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> The pnv_pci_dma_dev_setup() only does something when:
> 
> 1) The PHB contains VFs, or
> 2) The PHB defines a dma_dev_setup() callback in the pnv_phb structure.
> 
> Neither is true for NPU PHBs, so don't set the callback in the pci_controller_ops.

True.


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>




> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 65b5b121ebad..099c0bb1a9b9 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3652,7 +3652,6 @@ static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
>  };
>  
>  static const struct pci_controller_ops pnv_npu_ioda_controller_ops = {
> -	.dma_dev_setup		= pnv_pci_dma_dev_setup,
>  	.setup_msi_irqs		= pnv_setup_msi_irqs,
>  	.teardown_msi_irqs	= pnv_teardown_msi_irqs,
>  	.enable_device_hook	= pnv_pci_enable_device_hook,
> 

-- 
Alexey


* Re: [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c
  2019-11-20  1:28 ` [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c Oliver O'Halloran
@ 2019-11-21  4:02   ` Alexey Kardashevskiy
  2019-11-21  4:33     ` Oliver O'Halloran
  2019-11-21  7:46   ` Christoph Hellwig
  1 sibling, 1 reply; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-21  4:02 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> These functions are only used from pci-ioda.c. Move them in there and remove
> the prototypes from the header files.


Make them static then, or am I missing the point?

With that fixed,


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



Also, pci-ioda.c is around 4000 lines long, if anything I'd rather be
moving things to a separate new file (pci-ioda-dma.c or whatever). OTOH
after everything you've done, pci-ioda.c is 77 lines shorter so I'll
shut up now :)


> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 43 +++++++++++++++++++++++
>  arch/powerpc/platforms/powernv/pci.c      | 43 -----------------------
>  arch/powerpc/platforms/powernv/pci.h      |  2 --
>  3 files changed, 43 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 099c0bb1a9b9..c2b3a5a13004 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3637,6 +3637,49 @@ static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
>  		       OPAL_ASSERT_RESET);
>  }
>  
> +void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
> +{
> +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> +	struct pnv_phb *phb = hose->private_data;
> +#ifdef CONFIG_PCI_IOV
> +	struct pnv_ioda_pe *pe;
> +
> +	/* Fix the VF pdn PE number */
> +	if (pdev->is_virtfn) {
> +		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> +			if (pe->rid == ((pdev->bus->number << 8) |
> +			    (pdev->devfn & 0xff))) {
> +				pe->pdev = pdev;
> +				break;
> +			}
> +		}
> +	}
> +#endif /* CONFIG_PCI_IOV */
> +
> +	if (phb && phb->dma_dev_setup)
> +		phb->dma_dev_setup(phb, pdev);
> +}
> +
> +void pnv_pci_dma_bus_setup(struct pci_bus *bus)
> +{
> +	struct pci_controller *hose = bus->sysdata;
> +	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_ioda_pe *pe;
> +
> +	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> +		if (!(pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
> +			continue;
> +
> +		if (!pe->pbus)
> +			continue;
> +
> +		if (bus->number == ((pe->rid >> 8) & 0xFF)) {
> +			pe->pbus = bus;
> +			break;
> +		}
> +	}
> +}
> +
>  static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
>  	.dma_dev_setup		= pnv_pci_dma_dev_setup,
>  	.dma_bus_setup		= pnv_pci_dma_bus_setup,
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index b7761e2e06f8..8b9058b52575 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -810,49 +810,6 @@ struct iommu_table *pnv_pci_table_alloc(int nid)
>  	return tbl;
>  }
>  
> -void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
> -{
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> -#ifdef CONFIG_PCI_IOV
> -	struct pnv_ioda_pe *pe;
> -
> -	/* Fix the VF pdn PE number */
> -	if (pdev->is_virtfn) {
> -		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> -			if (pe->rid == ((pdev->bus->number << 8) |
> -			    (pdev->devfn & 0xff))) {
> -				pe->pdev = pdev;
> -				break;
> -			}
> -		}
> -	}
> -#endif /* CONFIG_PCI_IOV */
> -
> -	if (phb && phb->dma_dev_setup)
> -		phb->dma_dev_setup(phb, pdev);
> -}
> -
> -void pnv_pci_dma_bus_setup(struct pci_bus *bus)
> -{
> -	struct pci_controller *hose = bus->sysdata;
> -	struct pnv_phb *phb = hose->private_data;
> -	struct pnv_ioda_pe *pe;
> -
> -	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> -		if (!(pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
> -			continue;
> -
> -		if (!pe->pbus)
> -			continue;
> -
> -		if (bus->number == ((pe->rid >> 8) & 0xFF)) {
> -			pe->pbus = bus;
> -			break;
> -		}
> -	}
> -}
> -
>  struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev)
>  {
>  	struct pci_controller *hose = pci_bus_to_host(dev->bus);
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 01a01739c03e..f23145575048 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -189,8 +189,6 @@ extern void pnv_npu2_map_lpar(struct pnv_ioda_pe *gpe, unsigned long msr);
>  extern void pnv_pci_reset_secondary_bus(struct pci_dev *dev);
>  extern int pnv_eeh_phb_reset(struct pci_controller *hose, int option);
>  
> -extern void pnv_pci_dma_dev_setup(struct pci_dev *pdev);
> -extern void pnv_pci_dma_bus_setup(struct pci_bus *bus);
>  extern int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type);
>  extern void pnv_teardown_msi_irqs(struct pci_dev *pdev);
>  extern struct pnv_ioda_pe *__pnv_ioda_get_pe(struct pnv_phb *phb, u16 bdfn);
> 
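Both moved functions match a PE by its routing ID: pnv_pci_dma_dev_setup() compares pe->rid against (bus << 8) | devfn, and pnv_pci_dma_bus_setup() recovers the bus number with (rid >> 8) & 0xFF. A minimal userspace sketch of that packing follows. The example_* helper names are hypothetical and are not part of the patch or of any kernel API.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not in the patch): a 16-bit PCI routing ID
 * carries the bus number in bits 15:8 and the devfn in bits 7:0.
 */
static inline uint16_t example_make_rid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/* Recover the bus number, as the bus-setup loop does with pe->rid. */
static inline uint8_t example_rid_bus(uint16_t rid)
{
	return (uint8_t)((rid >> 8) & 0xff);
}

/* Recover the devfn (device in bits 7:3, function in bits 2:0). */
static inline uint8_t example_rid_devfn(uint16_t rid)
{
	return (uint8_t)(rid & 0xff);
}
```

With these helpers, a device on bus 0x01 with devfn 0x08 would have the routing ID 0x0108, which is the value the VF lookup loop compares against pe->rid.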

-- 
Alexey


* Re: [Very RFC 05/46] powernv/pci: Remove the pnv_phb dma_dev_setup callback
  2019-11-20  1:28 ` [Very RFC 05/46] powernv/pci: Remove the pnv_phb dma_dev_setup callback Oliver O'Halloran
@ 2019-11-21  4:03   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-21  4:03 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> This is only ever set for IODA PHBs. The only call site is in
> pnv_pci_dma_dev_setup(), which is also only used by normal IODA PHBs, so remove
> the callback in favour of a direct call.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 4 +---
>  arch/powerpc/platforms/powernv/pci.h      | 1 -
>  2 files changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index c2b3a5a13004..45f974258766 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3656,8 +3656,7 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  	}
>  #endif /* CONFIG_PCI_IOV */
>  
> -	if (phb && phb->dma_dev_setup)
> -		phb->dma_dev_setup(phb, pdev);
> +	pnv_pci_ioda_dma_dev_setup(phb, pdev);
>  }
>  
>  void pnv_pci_dma_bus_setup(struct pci_bus *bus)
> @@ -3940,7 +3939,6 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>  		hose->controller_ops = pnv_npu_ocapi_ioda_controller_ops;
>  		break;
>  	default:
> -		phb->dma_dev_setup = pnv_pci_ioda_dma_dev_setup;
>  		hose->controller_ops = pnv_pci_ioda_controller_ops;
>  	}
>  
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index f23145575048..3c33a0c91a69 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -108,7 +108,6 @@ struct pnv_phb {
>  	int (*msi_setup)(struct pnv_phb *phb, struct pci_dev *dev,
>  			 unsigned int hwirq, unsigned int virq,
>  			 unsigned int is_64, struct msi_msg *msg);
> -	void (*dma_dev_setup)(struct pnv_phb *phb, struct pci_dev *pdev);
>  	int (*init_m64)(struct pnv_phb *phb);
>  	int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
>  	void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
> 

-- 
Alexey


* Re: [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c
  2019-11-21  4:02   ` Alexey Kardashevskiy
@ 2019-11-21  4:33     ` Oliver O'Halloran
  0 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-21  4:33 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alistair Popple, linuxppc-dev, Sergey Miroshnichenko

On Thu, Nov 21, 2019 at 3:02 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> > These functions are only used from pci-ioda.c. Move them in there and remove
> > the prototypes from the header files.
>
>
> Make them static then, or am I missing the point?

I forgot.

>
> With that fixed,
>
>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
>
>
> Also, pci-ioda.c is around 4000 lines long, if anything I'd rather be
> moving things to a separate new file (pci-ioda-dma.c or whatever). OTOH
> after everything you've done, pci-ioda.c is 77 lines shorter so I'll
> shut up now :)

IMO pci-ioda.c and pci.c should probably be folded together since
there's no clear distinction between the two. We might be able to
split off the PCI/NPU/OpenCAPI PHB specific functions into separate
files to slim things down a bit, but I still need to re-work some of
the PHB init code to make that possible.

Oliver


* Re: [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov()
  2019-11-20  1:28 ` [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov() Oliver O'Halloran
@ 2019-11-21  4:34   ` Alexey Kardashevskiy
  2019-11-25  4:41     ` Oliver O'Halloran
  2019-11-21  7:48   ` Christoph Hellwig
  1 sibling, 1 reply; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-21  4:34 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Move this out of the PHB's dma_dev_setup() callback and into the
> ppc_md.pcibios_fixup_iov callback. This ensures that the VF PE's
> pdev pointer is always valid for the whole time the device is
> added to the bus.

Yeah it would be nice if dma setup did just dma stuff.

> This isn't strictly required, but it's slightly a slightly more logical

s/slightly a slightly/slightly (slightly)/ ? :)


> place to do the fixup and it makes dma_dev_setup a bit simpler.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 35 +++++++++++------------
>  1 file changed, 17 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 45f974258766..c6ea7a504e04 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2910,9 +2910,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  	struct pci_dn *pdn;
>  	int mul, total_vfs;
>  
> -	if (!pdev->is_physfn || pci_dev_is_added(pdev))
> -		return;
> -
>  	pdn = pci_get_pdn(pdev);
>  	pdn->vfs_expanded = 0;
>  	pdn->m64_single_mode = false;
> @@ -2987,6 +2984,22 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  		res->end = res->start - 1;
>  	}
>  }
> +
> +static void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
> +{
> +	if (WARN_ON(pci_dev_is_added(pdev)))
> +		return;
> +
> +	if (pdev->is_virtfn) {
> +		/* Fix the VF PE's pdev pointer */
> +		struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
> +		pe->pdev = pdev;
> +
> +		WARN_ON(!(pe->flags & PNV_IODA_PE_VF));


return;


> +	} else if (pdev->is_physfn) {



> +		pnv_pci_ioda_fixup_iov_resources(pdev);


and open code pnv_pci_ioda_fixup_iov_resources() right here?

Either way


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>





> +	}
> +}
>  #endif /* CONFIG_PCI_IOV */
>  
>  static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe,
> @@ -3641,20 +3654,6 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  {
>  	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>  	struct pnv_phb *phb = hose->private_data;
> -#ifdef CONFIG_PCI_IOV
> -	struct pnv_ioda_pe *pe;
> -
> -	/* Fix the VF pdn PE number */
> -	if (pdev->is_virtfn) {
> -		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> -			if (pe->rid == ((pdev->bus->number << 8) |
> -			    (pdev->devfn & 0xff))) {
> -				pe->pdev = pdev;
> -				break;
> -			}
> -		}
> -	}
> -#endif /* CONFIG_PCI_IOV */
>  
>  	pnv_pci_ioda_dma_dev_setup(phb, pdev);
>  }
> @@ -3945,7 +3944,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>  	ppc_md.pcibios_default_alignment = pnv_pci_default_alignment;
>  
>  #ifdef CONFIG_PCI_IOV
> -	ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov_resources;
> +	ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov;
>  	ppc_md.pcibios_iov_resource_alignment = pnv_pci_iov_resource_alignment;
>  	ppc_md.pcibios_sriov_enable = pnv_pcibios_sriov_enable;
>  	ppc_md.pcibios_sriov_disable = pnv_pcibios_sriov_disable;
> 

-- 
Alexey


* Re: [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset
  2019-11-21  3:38   ` Alexey Kardashevskiy
@ 2019-11-21  4:34     ` Oliver O'Halloran
  0 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-21  4:34 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alistair Popple, linuxppc-dev, Sergey Miroshnichenko

On Thu, Nov 21, 2019 at 2:38 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> > After resetting a VF we call eeh_restore_vf_config() to restore several
> > registers in the VFs config space. For physical functions this is normally
> > handled by the pci_reinit_device() OPAL call which requests firmware to
> > re-program the device with whatever defaults were set at boot time. We
> > can't use that for VFs since OPAL (being firmware) doesn't know (or care)
> > about VFs.
> >
> > However, the fields that are restored here are all marked as reserved for
> > VFs in the SR-IOV spec. In other words, eeh_restore_vf_config() doesn't
> > actually do anything.
> >
> > There is an argument to be made that we should be saving and restoring
> > some of these fields since they are marked as "Reserved, but Preserve"
> > (ResvP) to allow these fields to be used in new versions of the SR-IOV.
> > However, the current code doesn't even do that properly since it assumes
> > they can be set to whatever the EEH core has assumed to be correct. If
> > the fields *are* used in future versions of the SR-IOV spec this code
> > is still broken since it doesn't take into account any changes made
> > by the driver, or the Linux IOV core.
> > Given the above, just delete the code. It's broken, it's mis-leading,
> > and it's getting in the way of doing useful cleanups.
>
>
> There is still a prototype for this in arch/powerpc/include/asm/eeh.h,
> and pci_dn::mps as well.
>
>
> With the history of this explained offline,
>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>


The commit message used to have some of the background, but I thought
it was too long already and took it out. I'll add it back in I guess.


* Re: [Very RFC 07/46] powernv/pci: Rework IODA PE device accounting
  2019-11-20  1:28 ` [Very RFC 07/46] powernv/pci: Rework IODA PE device accounting Oliver O'Halloran
@ 2019-11-21  5:48   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-21  5:48 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> The current process for configuring the IODA PEs for normal PCI devices is
> abit stupid.

This phrase should go.


> After assigning MMIO resources for the devices on a bus the
> actual PE assignment occurs when pcibios_setup_bridge() is called for the
> parent bridge. In pcibios_setup_bridge() we:
> 
> 1. Assign all 256 devfn's for the subordinate bus to the PE corresponding
>    to the bridge window.
> 2. Initialise the IOMMU tables for that PE.
> 3. Traverse each device on the bus below the bridge setting the IOMMU table
>    pointer to point at the PE's table.
> 4. Finally, set pci_dn->pe_number to indicate that we've done the
>    per-device setup and allow EEH and the platform code to look up
>    the PE number.
> 
> Although mostly functional, there's a couple of issues with this approach.
> The most glaring is that it mixes the per-bus (per-PE really) setup with
> the per-device setup in a way that's completely asymmetric to what happens
> when tearing down a device.
> 
> In step 4. the number of devices in the PE is counted and stored in the
> ioda_pe structure. When releasing a pci_dev the device count is dropped
> until it hits zero where the ioda_pe itself is torn down. However, the bus
> itself remains active and can be re-scanned to bring back the devices that
> were removed. On a rescan we do not re-run the bridge setup so the
> per-device setup is never re-done, which results in the re-scanned devices
> being unusable.
> 
> There are a few other minor issues too. Associating all 256 devfns with
> the PE means that config accesses to non-existent PCI devices result
> in spurious PE freezes. We currently prevent this by only allowing config
> accesses to a BDFN when there is a corresponding pci_dn structure. We
> would like to eliminate that restriction in the future though.
> 
> That all said the biggest issue is that the current behaviour is hard to
> follow at the best of times. On top of that the behaviour is slightly (or
> majorly) different across each PHB type (PCIe, OpenCAPI, NVLink) and the
> behaviour for physical devices (described above) and virtual functions is
> again different. To address this we want to merge the paths as much as
> possible so that the PowerNV specific PCI initialisation steps all occur
> at roughly the same point in the PCI device setup path.
> 
> We can address most of these problems by moving the PE setup out of
> pcibios_setup_bridge() and into pcibios_bus_add_device(). The latter is
> called on a per-device basis so we have some symmetry between the setup and
> teardown paths. Moving the PE assignments to here should also allow us to
> converge how PE assignment works on all PHB types so it's always done in
> one place.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 112 +++++++++++-----------
>  1 file changed, 58 insertions(+), 54 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index c6ea7a504e04..c74521e5f3ab 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -51,6 +51,7 @@ static const char * const pnv_phb_names[] = { "IODA1", "IODA2", "NPU_NVLINK",
>  					      "NPU_OCAPI" };
>  
>  static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable);
> +static void pnv_pci_configure_bus(struct pci_bus *bus);


Do not need this, at least, after 46/46.

>  
>  void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
>  			    const char *fmt, ...)
> @@ -1104,34 +1105,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  	return pe;
>  }
>  
> -static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
> -{
> -	struct pci_dev *dev;
> -
> -	list_for_each_entry(dev, &bus->devices, bus_list) {
> -		struct pci_dn *pdn = pci_get_pdn(dev);
> -
> -		if (pdn == NULL) {
> -			pr_warn("%s: No device node associated with device !\n",
> -				pci_name(dev));
> -			continue;
> -		}
> -
> -		/*
> -		 * In partial hotplug case, the PCI device might be still
> -		 * associated with the PE and needn't attach it to the PE
> -		 * again.
> -		 */
> -		if (pdn->pe_number != IODA_INVALID_PE)
> -			continue;
> -
> -		pe->device_count++;
> -		pdn->pe_number = pe->pe_number;
> -		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
> -			pnv_ioda_setup_same_PE(dev->subordinate, pe);
> -	}
> -}
> -
>  /*
>   * There're 2 types of PCI bus sensitive PEs: One that is compromised of
>   * single PCI bus. Another one that contains the primary PCI bus and its
> @@ -1152,7 +1125,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
>  	pe_num = phb->ioda.pe_rmap[bus->number << 8];
>  	if (pe_num != IODA_INVALID_PE) {
>  		pe = &phb->ioda.pe_array[pe_num];
> -		pnv_ioda_setup_same_PE(bus, pe);
>  		return NULL;
>  	}
>  
> @@ -1196,9 +1168,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
>  		return NULL;
>  	}
>  
> -	/* Associate it with all child devices */
> -	pnv_ioda_setup_same_PE(bus, pe);
> -
>  	/* Put PE to the list */
>  	list_add_tail(&pe->list, &phb->ioda.pe_list);
>  
> @@ -1758,23 +1727,20 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	struct pnv_ioda_pe *pe;
>  
> -	/*
> -	 * The function can be called while the PE#
> -	 * hasn't been assigned. Do nothing for the
> -	 * case.
> -	 */
> -	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
> -		return;
> -
>  	pe = &phb->ioda.pe_array[pdn->pe_number];
>  	WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
>  	pdev->dev.archdata.dma_offset = pe->tce_bypass_base;
>  	set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
> +
> +	pe->device_count++;
> +
>  	/*
>  	 * Note: iommu_add_device() will fail here as
>  	 * for physical PE: the device is already added by now;
>  	 * for virtual PE: sysfs entries are not ready yet and
>  	 * tce_iommu_bus_notifier will add the device to a group later.
> +	 *
> +	 * XXX: this is wrong since the re-ordering patch.


What is that re-ordering patch, and why did it not remove this comment then?



>  	 */
>  }
>  
> @@ -2288,9 +2254,6 @@ static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb,
>  	pe->table_group.tce32_size = tbl->it_size << tbl->it_page_shift;
>  	iommu_init_table(tbl, phb->hose->node, 0, 0);
>  
> -	if (pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL))
> -		pnv_ioda_setup_bus_dma(pe, pe->pbus);
> -
>  	return;
>   fail:
>  	/* XXX Failure: Try to fallback to 64-bit only ? */
> @@ -2626,9 +2589,9 @@ static void pnv_pci_ioda_setup_iommu_api(void)
>  	/*
>  	 * There are 4 types of PEs:
>  	 * - PNV_IODA_PE_BUS: a downstream port with an adapter,
> -	 *   created from pnv_pci_setup_bridge();
> +	 *   created from pnv_pci_configure_bus();
>  	 * - PNV_IODA_PE_BUS_ALL: a PCI-PCIX bridge with devices behind it,
> -	 *   created from pnv_pci_setup_bridge();
> +	 *   created from pnv_pci_configure_bus();
>  	 * - PNV_IODA_PE_VF: a SRIOV virtual function,
>  	 *   created from pnv_pcibios_sriov_enable();
>  	 * - PNV_IODA_PE_DEV: an NPU or OCAPI device,
> @@ -2748,8 +2711,10 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>  	if (rc)
>  		return;
>  
> -	if (pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL))
> -		pnv_ioda_setup_bus_dma(pe, pe->pbus);
> +	/*
> +	 * The IOMMU table for the PE is associated with the device in
> +	 * pnv_pcibios_bus_add_device()
> +	 */


Is it really? eeh_add_device_late or pnv_eeh_probe_pdev do not seem to
be doing this.


>  }
>  
>  int64_t pnv_opal_pci_msi_eoi(struct irq_chip *chip, unsigned int hw_irq)
> @@ -3324,16 +3289,13 @@ static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
>  	}
>  }
>  
> -static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
> +static void pnv_pci_configure_bus(struct pci_bus *bus)
>  {
>  	struct pci_controller *hose = pci_bus_to_host(bus);
>  	struct pnv_phb *phb = hose->private_data;
>  	struct pci_dev *bridge = bus->self;
>  	struct pnv_ioda_pe *pe;
> -	bool all = (pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
> -
> -	/* Extend bridge's windows if necessary */
> -	pnv_pci_fixup_bridge_resources(bus, type);
> +	bool all = (bridge && pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
>  
>  	/* The PE for root bus should be realized before any one else */
>  	if (!phb->ioda.root_pe_populated) {
> @@ -3342,12 +3304,21 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>  			phb->ioda.root_pe_idx = pe->pe_number;
>  			phb->ioda.root_pe_populated = true;
>  		}
> +
> +		/* no need to re-configure the root bus */
> +		if (bus == phb->hose->bus)
> +			return;
>  	}
>  
>  	/* Don't assign PE to PCI bus, which doesn't have subordinate devices */
>  	if (list_empty(&bus->devices))
>  		return;
>  
> +	/* PE should never be re-configured */
> +	pe = __pnv_ioda_get_pe(phb, bus->number << 8);
> +	if (WARN_ON(pe))
> +		return;
> +
>  	/* Reserve PEs according to used M64 resources */
>  	pnv_ioda_reserve_m64_pe(bus, NULL, all);
>  
> @@ -3654,6 +3625,39 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  {
>  	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>  	struct pnv_phb *phb = hose->private_data;
> +	struct pci_dn *pdn = pci_get_pdn(pdev);
> +	struct pnv_ioda_pe *pe;
> +
> +	/* Check if the BDFN for this device is associated with a PE yet */
> +	pe = __pnv_ioda_get_pe(phb, pdev->devfn | (pdev->bus->number << 8));
> +	if (!pe) {
> +		/*
> +		 * We should only hit this path for "normal" PCI PHBs. The


We should hit this path only for PCIe-PCIx bridges, right? If yes,
shouldn't this be in pnv_pci_dma_bus_setup()? If no, I am lost :(


> +		 * special PHBs used for OpenCAPI and NVLink don't have to
> +		 * deal with eeh-on-mmio so they assign PEs at probe time
> +		 * rather than after resources are allocated.
> +		 */
> +		WARN_ON(phb->type != PNV_PHB_IODA2 && phb->type != PNV_PHB_IODA1);
> +		/* PEs for VFs should have been assigned in sriov_enable() */
> +		WARN_ON(pdev->is_virtfn);
> +
> +		pnv_pci_configure_bus(pdev->bus);
> +		pe = __pnv_ioda_get_pe(phb, pdev->devfn | (pdev->bus->number << 8));
> +		pci_err(pdev, "Configured new pe PE#%x\n", pe ? pe->pe_number : 0xfffff);
> +
> +


An extra empty line.

> +		/*
> +		 * If we can't setup the IODA PE something has gone horribly
> +		 * wrong and we can't enable DMA for the device.
> +		 */
> +		if (WARN_ON(!pe))
> +			return;
> +	} else {
> +		pci_err(pdev, "Added to existing PE#%x\n", pe->pe_number);
> +	}
> +
> +	if (pdn)
> +		pdn->pe_number = pe->pe_number;
>  
>  	pnv_pci_ioda_dma_dev_setup(phb, pdev);

Open code pnv_pci_ioda_dma_dev_setup()?

Or at least do pe->device_count++ here?


>  }
> @@ -3680,14 +3684,14 @@ void pnv_pci_dma_bus_setup(struct pci_bus *bus)
>  
>  static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
>  	.dma_dev_setup		= pnv_pci_dma_dev_setup,
> -	.dma_bus_setup		= pnv_pci_dma_bus_setup,
> +	.dma_bus_setup		= pnv_pci_dma_bus_setup, /* NB: DMA setup actually happens in dma_dev_setup */

The comment does not help much imho.


>  	.iommu_bypass_supported	= pnv_pci_ioda_iommu_bypass_supported,
>  	.setup_msi_irqs		= pnv_setup_msi_irqs,
>  	.teardown_msi_irqs	= pnv_teardown_msi_irqs,
>  	.enable_device_hook	= pnv_pci_enable_device_hook,
>  	.release_device		= pnv_pci_release_device,
>  	.window_alignment	= pnv_pci_window_alignment,
> -	.setup_bridge		= pnv_pci_setup_bridge,
> +	.setup_bridge		= pnv_pci_fixup_bridge_resources,


Using "fixup" in the name of a "setup" hook is ... mmmm.... contemporary :)



>  	.reset_secondary_bus	= pnv_pci_reset_secondary_bus,
>  	.shutdown		= pnv_pci_ioda_shutdown,
>  };
> 
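The setup/teardown symmetry the commit message argues for can be sketched in plain C. This is only an illustration under simplifying assumptions: struct example_pe and the example_* functions are hypothetical stand-ins, not the real pnv_ioda_pe or any kernel hook. The point is that doing the accounting in the per-device add path means a rescan after the count has dropped to zero goes through the same setup again.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for struct pnv_ioda_pe. */
struct example_pe {
	int device_count;	/* devices currently attached to this PE */
	bool configured;	/* true once the per-PE setup has been done */
};

/* Per-PE setup, done lazily the first time a device needs the PE. */
static void example_pe_configure(struct example_pe *pe)
{
	pe->configured = true;
}

/* Per-device add path: configure the PE if needed, then count the device. */
static void example_dev_add(struct example_pe *pe)
{
	if (!pe->configured)
		example_pe_configure(pe);
	pe->device_count++;
}

/* Per-device release path: drop the count, tear the PE down at zero. */
static void example_dev_release(struct example_pe *pe)
{
	if (pe->device_count > 0 && --pe->device_count == 0)
		pe->configured = false;
}
```

In this model, removing every device and then re-adding one leaves the PE configured with a count of one, which is the rescan case the old bridge-time setup could not handle.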

-- 
Alexey


* Re: [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c
  2019-11-20  1:28 ` [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c Oliver O'Halloran
  2019-11-21  4:02   ` Alexey Kardashevskiy
@ 2019-11-21  7:46   ` Christoph Hellwig
  1 sibling, 0 replies; 107+ messages in thread
From: Christoph Hellwig @ 2019-11-21  7:46 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: alistair, linuxppc-dev, s.miroshnichenko

> +#ifdef CONFIG_PCI_IOV
> +	struct pnv_ioda_pe *pe;
> +
> +	/* Fix the VF pdn PE number */
> +	if (pdev->is_virtfn) {
> +		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> +			if (pe->rid == ((pdev->bus->number << 8) |
> +			    (pdev->devfn & 0xff))) {
> +				pe->pdev = pdev;
> +				break;
> +			}
> +		}
> +	}
> +#endif /* CONFIG_PCI_IOV */

It would be nice to split this into a helper.  And I think using
IS_ENABLED we might not even need ifdefs:

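static void pnv_pci_dma_fixup_vfs(struct pci_dev *pdev)
{
	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
	struct pnv_phb *phb = hose->private_data;
	struct pnv_ioda_pe *pe;

	/* Point the PE that matches this VF's routing ID at its pci_dev */
	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
		if (pe->rid ==
		    ((pdev->bus->number << 8) | (pdev->devfn & 0xff))) {
			pe->pdev = pdev;
			break;
		}
	}
}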


...

	if (IS_ENABLED(CONFIG_PCI_IOV) && pdev->is_virtfn)
		pnv_pci_dma_fixup_vfs(pdev);
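
The IS_ENABLED() pattern suggested above can be tried outside the kernel with a constant standing in for the Kconfig option. This is a rough sketch only: EXAMPLE_IOV_ENABLED, struct example_dev and the example_* functions are made-up names, and the real IS_ENABLED() macro is more involved than a plain 0/1 define. What it shows is that the guarded call is always parsed and type-checked, while a constant-false condition lets the compiler discard it.

```c
#include <stdbool.h>

/*
 * Stand-in for IS_ENABLED(CONFIG_PCI_IOV). The value 1 is an assumption
 * for illustration; set it to 0 to model a kernel built without IOV.
 */
#define EXAMPLE_IOV_ENABLED 1

/* Hypothetical, simplified stand-in for struct pci_dev. */
struct example_dev {
	bool is_virtfn;
	int fixups;	/* counts how often the VF fixup ran */
};

/* Plays the role of the VF fixup helper. */
static void example_fixup_vf(struct example_dev *dev)
{
	dev->fixups++;
}

static void example_dma_dev_setup(struct example_dev *dev)
{
	/*
	 * No #ifdef needed: the call is compiled in every configuration,
	 * and is dead code when EXAMPLE_IOV_ENABLED is 0.
	 */
	if (EXAMPLE_IOV_ENABLED && dev->is_virtfn)
		example_fixup_vf(dev);
}
```

One caveat with this style is that everything referenced inside the branch must still be declared in all configurations, otherwise the build breaks when the option is off.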
		


* Re: [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov()
  2019-11-20  1:28 ` [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov() Oliver O'Halloran
  2019-11-21  4:34   ` Alexey Kardashevskiy
@ 2019-11-21  7:48   ` Christoph Hellwig
  2019-11-25  4:39     ` Oliver O'Halloran
  1 sibling, 1 reply; 107+ messages in thread
From: Christoph Hellwig @ 2019-11-21  7:48 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: alistair, linuxppc-dev, s.miroshnichenko

On Wed, Nov 20, 2019 at 12:28:19PM +1100, Oliver O'Halloran wrote:
> Move this out of the PHB's dma_dev_setup() callback and into the
> ppc_md.pcibios_fixup_iov callback. This ensures that the VF PE's
> pdev pointer is always valid for the whole time the device is
> added to the bus.
> 
> This isn't strictly required, but it's slightly a slightly more logical
> place to do the fixup and it makes dma_dev_setup a bit simpler.

Ok, this removes the code I commented on earlier, so I take my
comment there back.

> +	if (pdev->is_virtfn) {
> +		/* Fix the VF PE's pdev pointer */
> +		struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
> +		pe->pdev = pdev;

Maybe add an empty line after the variable declaration?

> @@ -3641,20 +3654,6 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  {
>  	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>  	struct pnv_phb *phb = hose->private_data;
>  
>  	pnv_pci_ioda_dma_dev_setup(phb, pdev);
>  }

Can you just merge pnv_pci_dma_dev_setup and pnv_pci_ioda_dma_dev_setup
now?



* Re: [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup()
  2019-11-20  1:28 ` [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup() Oliver O'Halloran
@ 2019-11-21  7:52   ` Christoph Hellwig
  2019-11-27  4:53   ` Alexey Kardashevskiy
  1 sibling, 0 replies; 107+ messages in thread
From: Christoph Hellwig @ 2019-11-21  7:52 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: alistair, linuxppc-dev, s.miroshnichenko

On Wed, Nov 20, 2019 at 12:28:44PM +1100, Oliver O'Halloran wrote:
> Use the helper to look up the pnv_ioda_pe for the device we're configuring DMA
> for. In the VF case there's no need set pdn->pe_number since nothing looks at
> it any more.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index d4b5ee926222..98d858999a2d 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1709,10 +1709,9 @@ int pnv_pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  
>  static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev)
>  {
> -	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	struct pnv_ioda_pe *pe;
>  
> -	pe = &phb->ioda.pe_array[pdn->pe_number];
> +	pe = pnv_ioda_get_pe(pdev);

Nit: why not move the pe initialization to the line declaring the
variable?


* Re: [Very RFC 08/46] powerpc/eeh: Calculate VF index rather than looking it up in pci_dn
  2019-11-20  1:28 ` [Very RFC 08/46] powerpc/eeh: Calculate VF index rather than looking it up in pci_dn Oliver O'Halloran
@ 2019-11-22  4:43   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-22  4:43 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Find the VF index based on the BDFN of the device rather than using a cached
> value in the pci_dn. This is probably slower than looking up the cached value
> in the pci_dn, but it's done infrequently (only in the EEH recovery path) and
> it's just arithmetic.
> 
> We need this here because the functions to remove a VF are slightly
> different to those which remove a physical PCI device.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



> ---
>  arch/powerpc/kernel/eeh_driver.c | 44 +++++++++++++++++++++++++++-----
>  1 file changed, 37 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index a1eaffe868de..1cdeed464aed 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -457,12 +457,35 @@ static enum pci_ers_result eeh_report_failure(struct eeh_dev *edev,
>  	return rc;
>  }
>  
> +#ifdef CONFIG_PCI_IOV
> +/* FIXME: this should probably go in drivers/pci/iov.c */
> +static int eeh_find_vf_index(struct pci_dev *physfn, u16 vf_bdfn)
> +{
> +	u16 vf_bus, vf_devfn;
> +	int i;
> +
> +	vf_bus = vf_bdfn >> 8;
> +	vf_devfn = vf_bdfn & 0xff;
> +
> +	for (i = 0; i < pci_num_vf(physfn); i++) {
> +		if (pci_iov_virtfn_bus(physfn, i) != vf_bus)
> +			continue;
> +		if (pci_iov_virtfn_devfn(physfn, i) != vf_devfn)
> +			continue;
> +		return i;
> +	}
> +
> +	WARN_ON(1);
> +	return -1;
> +}
> +
>  static void *eeh_add_virt_device(struct eeh_dev *edev)
>  {
> -	struct pci_driver *driver;
>  	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
> +	struct pci_driver *driver;
> +	int vf_index;
>  
> -	if (!(edev->physfn)) {
> +	if (!edev->physfn) {
>  		eeh_edev_warn(edev, "Not for VF\n");
>  		return NULL;
>  	}
> @@ -476,11 +499,18 @@ static void *eeh_add_virt_device(struct eeh_dev *edev)
>  		eeh_pcid_put(dev);
>  	}
>  
> -#ifdef CONFIG_PCI_IOV
> -	pci_iov_add_virtfn(edev->physfn, eeh_dev_to_pdn(edev)->vf_index);
> -#endif
> +	vf_index = eeh_find_vf_index(edev->physfn, edev->bdfn);
> +	pci_iov_add_virtfn(edev->physfn, vf_index);
> +
>  	return NULL;
>  }
> +#else
> +static void *eeh_add_virt_device(struct eeh_dev *edev)
> +{
> +	WARN_ON(1);
> +	return NULL;
> +}
> +#endif
>  
>  static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
>  {
> @@ -521,9 +551,9 @@ static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
>  
>  	if (edev->physfn) {
>  #ifdef CONFIG_PCI_IOV
> -		struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> +		int vf_index = eeh_find_vf_index(edev->physfn, edev->bdfn);
>  
> -		pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
> +		pci_iov_remove_virtfn(edev->physfn, vf_index);
>  		edev->pdev = NULL;
>  #endif
>  		if (rmv_data)
> 
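
For reference, the arithmetic behind the lookup can be sketched outside the
kernel. This is a minimal userspace model, not the kernel code: struct
pf_model and both helpers are hypothetical names, and the fields stand in for
the First VF Offset and VF Stride values that pci_iov_virtfn_bus() and
pci_iov_virtfn_devfn() are assumed to combine with the PF's own address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the SR-IOV state kept for a physical function. */
struct pf_model {
	uint16_t bdfn;      /* PF routing ID: (bus << 8) | devfn */
	uint16_t vf_offset; /* First VF Offset from the SR-IOV capability */
	uint16_t vf_stride; /* VF Stride from the SR-IOV capability */
	uint16_t num_vfs;   /* number of VFs currently enabled */
};

/* Routing ID of VF number i: PF routing ID + offset + i * stride. */
static uint16_t vf_bdfn(const struct pf_model *pf, int i)
{
	return (uint16_t)(pf->bdfn + pf->vf_offset + pf->vf_stride * i);
}

/* Linear search in the style of eeh_find_vf_index(); -1 if nothing matches. */
static int find_vf_index(const struct pf_model *pf, uint16_t bdfn)
{
	int i;

	assert(pf);
	for (i = 0; i < pf->num_vfs; i++)
		if (vf_bdfn(pf, i) == bdfn)
			return i;
	return -1;
}
```

The search is linear in the number of enabled VFs, which fits the commit
message's point that this only runs in the EEH recovery path.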

-- 
Alexey


* Re: [Very RFC 09/46] powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config()
  2019-11-20  1:28 ` [Very RFC 09/46] powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config() Oliver O'Halloran
@ 2019-11-22  4:52   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-22  4:52 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Switch the eeh_ops->{read|write}_config methods to take an eeh_dev structure
> rather than a pci_dn structure to specify the target device. This removes a
> lot of the uses of pci_dn in both the EEH core and in the platform EEH
> support.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>


09/46, 10/46 and 11/46 can actually be merged into one patch. Oh, well,
maybe half of 11/46, without the s/pdn->phb/edev->controller/ part.



> ---
>  arch/powerpc/include/asm/eeh.h               |  4 +-
>  arch/powerpc/kernel/eeh.c                    | 22 +++++-----
>  arch/powerpc/kernel/eeh_pe.c                 | 44 ++++++++++----------
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 43 ++++++++++---------
>  arch/powerpc/platforms/pseries/eeh_pseries.c | 16 ++++---
>  5 files changed, 67 insertions(+), 62 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index e11deb284631..62c4ee44ad2c 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -224,8 +224,8 @@ struct eeh_ops {
>  	int (*configure_bridge)(struct eeh_pe *pe);
>  	int (*err_inject)(struct eeh_pe *pe, int type, int func,
>  			  unsigned long addr, unsigned long mask);
> -	int (*read_config)(struct pci_dn *pdn, int where, int size, u32 *val);
> -	int (*write_config)(struct pci_dn *pdn, int where, int size, u32 val);
> +	int (*read_config)(struct eeh_dev *edev, int where, int size, u32 *val);
> +	int (*write_config)(struct eeh_dev *edev, int where, int size, u32 val);
>  	int (*next_error)(struct eeh_pe **pe);
>  	int (*restore_config)(struct pci_dn *pdn);
>  	int (*notify_resume)(struct pci_dn *pdn);
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index a3b93db972fc..7258fa04176d 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -185,21 +185,21 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
>  		pdn->phb->global_number, pdn->busno,
>  		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
>  
> -	eeh_ops->read_config(pdn, PCI_VENDOR_ID, 4, &cfg);
> +	eeh_ops->read_config(edev, PCI_VENDOR_ID, 4, &cfg);
>  	n += scnprintf(buf+n, len-n, "dev/vend:%08x\n", cfg);
>  	pr_warn("EEH: PCI device/vendor: %08x\n", cfg);
>  
> -	eeh_ops->read_config(pdn, PCI_COMMAND, 4, &cfg);
> +	eeh_ops->read_config(edev, PCI_COMMAND, 4, &cfg);
>  	n += scnprintf(buf+n, len-n, "cmd/stat:%x\n", cfg);
>  	pr_warn("EEH: PCI cmd/status register: %08x\n", cfg);
>  
>  	/* Gather bridge-specific registers */
>  	if (edev->mode & EEH_DEV_BRIDGE) {
> -		eeh_ops->read_config(pdn, PCI_SEC_STATUS, 2, &cfg);
> +		eeh_ops->read_config(edev, PCI_SEC_STATUS, 2, &cfg);
>  		n += scnprintf(buf+n, len-n, "sec stat:%x\n", cfg);
>  		pr_warn("EEH: Bridge secondary status: %04x\n", cfg);
>  
> -		eeh_ops->read_config(pdn, PCI_BRIDGE_CONTROL, 2, &cfg);
> +		eeh_ops->read_config(edev, PCI_BRIDGE_CONTROL, 2, &cfg);
>  		n += scnprintf(buf+n, len-n, "brdg ctl:%x\n", cfg);
>  		pr_warn("EEH: Bridge control: %04x\n", cfg);
>  	}
> @@ -207,11 +207,11 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
>  	/* Dump out the PCI-X command and status regs */
>  	cap = edev->pcix_cap;
>  	if (cap) {
> -		eeh_ops->read_config(pdn, cap, 4, &cfg);
> +		eeh_ops->read_config(edev, cap, 4, &cfg);
>  		n += scnprintf(buf+n, len-n, "pcix-cmd:%x\n", cfg);
>  		pr_warn("EEH: PCI-X cmd: %08x\n", cfg);
>  
> -		eeh_ops->read_config(pdn, cap+4, 4, &cfg);
> +		eeh_ops->read_config(edev, cap+4, 4, &cfg);
>  		n += scnprintf(buf+n, len-n, "pcix-stat:%x\n", cfg);
>  		pr_warn("EEH: PCI-X status: %08x\n", cfg);
>  	}
> @@ -223,7 +223,7 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
>  		pr_warn("EEH: PCI-E capabilities and status follow:\n");
>  
>  		for (i=0; i<=8; i++) {
> -			eeh_ops->read_config(pdn, cap+4*i, 4, &cfg);
> +			eeh_ops->read_config(edev, cap+4*i, 4, &cfg);
>  			n += scnprintf(buf+n, len-n, "%02x:%x\n", 4*i, cfg);
>  
>  			if ((i % 4) == 0) {
> @@ -250,7 +250,7 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
>  		pr_warn("EEH: PCI-E AER capability register set follows:\n");
>  
>  		for (i=0; i<=13; i++) {
> -			eeh_ops->read_config(pdn, cap+4*i, 4, &cfg);
> +			eeh_ops->read_config(edev, cap+4*i, 4, &cfg);
>  			n += scnprintf(buf+n, len-n, "%02x:%x\n", 4*i, cfg);
>  
>  			if ((i % 4) == 0) {
> @@ -918,15 +918,13 @@ int eeh_pe_reset_full(struct eeh_pe *pe, bool include_passed)
>   */
>  void eeh_save_bars(struct eeh_dev *edev)
>  {
> -	struct pci_dn *pdn;
>  	int i;
>  
> -	pdn = eeh_dev_to_pdn(edev);
> -	if (!pdn)
> +	if (!edev)
>  		return;
>  
>  	for (i = 0; i < 16; i++)
> -		eeh_ops->read_config(pdn, i * 4, 4, &edev->config_space[i]);
> +		eeh_ops->read_config(edev, i * 4, 4, &edev->config_space[i]);
>  
>  	/*
>  	 * For PCI bridges including root port, we need enable bus
> diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
> index 177852e39a25..e11e0830f125 100644
> --- a/arch/powerpc/kernel/eeh_pe.c
> +++ b/arch/powerpc/kernel/eeh_pe.c
> @@ -714,32 +714,32 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
>  
>  	/* Check slot status */
>  	cap = edev->pcie_cap;
> -	eeh_ops->read_config(pdn, cap + PCI_EXP_SLTSTA, 2, &val);
> +	eeh_ops->read_config(edev, cap + PCI_EXP_SLTSTA, 2, &val);
>  	if (!(val & PCI_EXP_SLTSTA_PDS)) {
>  		eeh_edev_dbg(edev, "No card in the slot (0x%04x) !\n", val);
>  		return;
>  	}
>  
>  	/* Check power status if we have the capability */
> -	eeh_ops->read_config(pdn, cap + PCI_EXP_SLTCAP, 2, &val);
> +	eeh_ops->read_config(edev, cap + PCI_EXP_SLTCAP, 2, &val);
>  	if (val & PCI_EXP_SLTCAP_PCP) {
> -		eeh_ops->read_config(pdn, cap + PCI_EXP_SLTCTL, 2, &val);
> +		eeh_ops->read_config(edev, cap + PCI_EXP_SLTCTL, 2, &val);
>  		if (val & PCI_EXP_SLTCTL_PCC) {
>  			eeh_edev_dbg(edev, "In power-off state, power it on ...\n");
>  			val &= ~(PCI_EXP_SLTCTL_PCC | PCI_EXP_SLTCTL_PIC);
>  			val |= (0x0100 & PCI_EXP_SLTCTL_PIC);
> -			eeh_ops->write_config(pdn, cap + PCI_EXP_SLTCTL, 2, val);
> +			eeh_ops->write_config(edev, cap + PCI_EXP_SLTCTL, 2, val);
>  			msleep(2 * 1000);
>  		}
>  	}
>  
>  	/* Enable link */
> -	eeh_ops->read_config(pdn, cap + PCI_EXP_LNKCTL, 2, &val);
> +	eeh_ops->read_config(edev, cap + PCI_EXP_LNKCTL, 2, &val);
>  	val &= ~PCI_EXP_LNKCTL_LD;
> -	eeh_ops->write_config(pdn, cap + PCI_EXP_LNKCTL, 2, val);
> +	eeh_ops->write_config(edev, cap + PCI_EXP_LNKCTL, 2, val);
>  
>  	/* Check link */
> -	eeh_ops->read_config(pdn, cap + PCI_EXP_LNKCAP, 4, &val);
> +	eeh_ops->read_config(edev, cap + PCI_EXP_LNKCAP, 4, &val);
>  	if (!(val & PCI_EXP_LNKCAP_DLLLARC)) {
>  		eeh_edev_dbg(edev, "No link reporting capability (0x%08x) \n", val);
>  		msleep(1000);
> @@ -752,7 +752,7 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
>  		msleep(20);
>  		timeout += 20;
>  
> -		eeh_ops->read_config(pdn, cap + PCI_EXP_LNKSTA, 2, &val);
> +		eeh_ops->read_config(edev, cap + PCI_EXP_LNKSTA, 2, &val);
>  		if (val & PCI_EXP_LNKSTA_DLLLA)
>  			break;
>  	}
> @@ -769,7 +769,6 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
>  
>  static void eeh_restore_bridge_bars(struct eeh_dev *edev)
>  {
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
>  	int i;
>  
>  	/*
> @@ -777,20 +776,20 @@ static void eeh_restore_bridge_bars(struct eeh_dev *edev)
>  	 * Bus numbers and windows: 0x18 - 0x30
>  	 */
>  	for (i = 4; i < 13; i++)
> -		eeh_ops->write_config(pdn, i*4, 4, edev->config_space[i]);
> +		eeh_ops->write_config(edev, i*4, 4, edev->config_space[i]);
>  	/* Rom: 0x38 */
> -	eeh_ops->write_config(pdn, 14*4, 4, edev->config_space[14]);
> +	eeh_ops->write_config(edev, 14*4, 4, edev->config_space[14]);
>  
>  	/* Cache line & Latency timer: 0xC 0xD */
> -	eeh_ops->write_config(pdn, PCI_CACHE_LINE_SIZE, 1,
> +	eeh_ops->write_config(edev, PCI_CACHE_LINE_SIZE, 1,
>                  SAVED_BYTE(PCI_CACHE_LINE_SIZE));
> -        eeh_ops->write_config(pdn, PCI_LATENCY_TIMER, 1,
> +        eeh_ops->write_config(edev, PCI_LATENCY_TIMER, 1,
>                  SAVED_BYTE(PCI_LATENCY_TIMER));
>  	/* Max latency, min grant, interrupt ping and line: 0x3C */
> -	eeh_ops->write_config(pdn, 15*4, 4, edev->config_space[15]);
> +	eeh_ops->write_config(edev, 15*4, 4, edev->config_space[15]);
>  
>  	/* PCI Command: 0x4 */
> -	eeh_ops->write_config(pdn, PCI_COMMAND, 4, edev->config_space[1] |
> +	eeh_ops->write_config(edev, PCI_COMMAND, 4, edev->config_space[1] |
>  			      PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
>  
>  	/* Check the PCIe link is ready */
> @@ -799,28 +798,27 @@ static void eeh_restore_bridge_bars(struct eeh_dev *edev)
>  
>  static void eeh_restore_device_bars(struct eeh_dev *edev)
>  {
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
>  	int i;
>  	u32 cmd;
>  
>  	for (i = 4; i < 10; i++)
> -		eeh_ops->write_config(pdn, i*4, 4, edev->config_space[i]);
> +		eeh_ops->write_config(edev, i*4, 4, edev->config_space[i]);
>  	/* 12 == Expansion ROM Address */
> -	eeh_ops->write_config(pdn, 12*4, 4, edev->config_space[12]);
> +	eeh_ops->write_config(edev, 12*4, 4, edev->config_space[12]);
>  
> -	eeh_ops->write_config(pdn, PCI_CACHE_LINE_SIZE, 1,
> +	eeh_ops->write_config(edev, PCI_CACHE_LINE_SIZE, 1,
>  		SAVED_BYTE(PCI_CACHE_LINE_SIZE));
> -	eeh_ops->write_config(pdn, PCI_LATENCY_TIMER, 1,
> +	eeh_ops->write_config(edev, PCI_LATENCY_TIMER, 1,
>  		SAVED_BYTE(PCI_LATENCY_TIMER));
>  
>  	/* max latency, min grant, interrupt pin and line */
> -	eeh_ops->write_config(pdn, 15*4, 4, edev->config_space[15]);
> +	eeh_ops->write_config(edev, 15*4, 4, edev->config_space[15]);
>  
>  	/*
>  	 * Restore PERR & SERR bits, some devices require it,
>  	 * don't touch the other command bits
>  	 */
> -	eeh_ops->read_config(pdn, PCI_COMMAND, 4, &cmd);
> +	eeh_ops->read_config(edev, PCI_COMMAND, 4, &cmd);
>  	if (edev->config_space[1] & PCI_COMMAND_PARITY)
>  		cmd |= PCI_COMMAND_PARITY;
>  	else
> @@ -829,7 +827,7 @@ static void eeh_restore_device_bars(struct eeh_dev *edev)
>  		cmd |= PCI_COMMAND_SERR;
>  	else
>  		cmd &= ~PCI_COMMAND_SERR;
> -	eeh_ops->write_config(pdn, PCI_COMMAND, 4, cmd);
> +	eeh_ops->write_config(edev, PCI_COMMAND, 4, cmd);
>  }
>  
>  /**
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index b2ac4130fda7..54d8ec77aef2 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -858,32 +858,32 @@ static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>  	case EEH_RESET_HOT:
>  		/* Don't report linkDown event */
>  		if (aer) {
> -			eeh_ops->read_config(pdn, aer + PCI_ERR_UNCOR_MASK,
> +			eeh_ops->read_config(edev, aer + PCI_ERR_UNCOR_MASK,
>  					     4, &ctrl);
>  			ctrl |= PCI_ERR_UNC_SURPDN;
> -			eeh_ops->write_config(pdn, aer + PCI_ERR_UNCOR_MASK,
> +			eeh_ops->write_config(edev, aer + PCI_ERR_UNCOR_MASK,
>  					      4, ctrl);
>  		}
>  
> -		eeh_ops->read_config(pdn, PCI_BRIDGE_CONTROL, 2, &ctrl);
> +		eeh_ops->read_config(edev, PCI_BRIDGE_CONTROL, 2, &ctrl);
>  		ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
> -		eeh_ops->write_config(pdn, PCI_BRIDGE_CONTROL, 2, ctrl);
> +		eeh_ops->write_config(edev, PCI_BRIDGE_CONTROL, 2, ctrl);
>  
>  		msleep(EEH_PE_RST_HOLD_TIME);
>  		break;
>  	case EEH_RESET_DEACTIVATE:
> -		eeh_ops->read_config(pdn, PCI_BRIDGE_CONTROL, 2, &ctrl);
> +		eeh_ops->read_config(edev, PCI_BRIDGE_CONTROL, 2, &ctrl);
>  		ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> -		eeh_ops->write_config(pdn, PCI_BRIDGE_CONTROL, 2, ctrl);
> +		eeh_ops->write_config(edev, PCI_BRIDGE_CONTROL, 2, ctrl);
>  
>  		msleep(EEH_PE_RST_SETTLE_TIME);
>  
>  		/* Continue reporting linkDown event */
>  		if (aer) {
> -			eeh_ops->read_config(pdn, aer + PCI_ERR_UNCOR_MASK,
> +			eeh_ops->read_config(edev, aer + PCI_ERR_UNCOR_MASK,
>  					     4, &ctrl);
>  			ctrl &= ~PCI_ERR_UNC_SURPDN;
> -			eeh_ops->write_config(pdn, aer + PCI_ERR_UNCOR_MASK,
> +			eeh_ops->write_config(edev, aer + PCI_ERR_UNCOR_MASK,
>  					      4, ctrl);
>  		}
>  
> @@ -952,11 +952,12 @@ void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
>  static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, const char *type,
>  				     int pos, u16 mask)
>  {
> +	struct eeh_dev *edev = pdn->edev;
>  	int i, status = 0;
>  
>  	/* Wait for Transaction Pending bit to be cleared */
>  	for (i = 0; i < 4; i++) {
> -		eeh_ops->read_config(pdn, pos, 2, &status);
> +		eeh_ops->read_config(edev, pos, 2, &status);
>  		if (!(status & mask))
>  			return;
>  
> @@ -977,7 +978,7 @@ static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
>  	if (WARN_ON(!edev->pcie_cap))
>  		return -ENOTTY;
>  
> -	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &reg);
> +	eeh_ops->read_config(edev, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &reg);
>  	if (!(reg & PCI_EXP_DEVCAP_FLR))
>  		return -ENOTTY;
>  
> @@ -987,18 +988,18 @@ static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
>  		pnv_eeh_wait_for_pending(pdn, "",
>  					 edev->pcie_cap + PCI_EXP_DEVSTA,
>  					 PCI_EXP_DEVSTA_TRPND);
> -		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
> +		eeh_ops->read_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
>  				     4, &reg);
>  		reg |= PCI_EXP_DEVCTL_BCR_FLR;
> -		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
> +		eeh_ops->write_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
>  				      4, reg);
>  		msleep(EEH_PE_RST_HOLD_TIME);
>  		break;
>  	case EEH_RESET_DEACTIVATE:
> -		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
> +		eeh_ops->read_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
>  				     4, &reg);
>  		reg &= ~PCI_EXP_DEVCTL_BCR_FLR;
> -		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
> +		eeh_ops->write_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
>  				      4, reg);
>  		msleep(EEH_PE_RST_SETTLE_TIME);
>  		break;
> @@ -1015,7 +1016,7 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
>  	if (WARN_ON(!edev->af_cap))
>  		return -ENOTTY;
>  
> -	eeh_ops->read_config(pdn, edev->af_cap + PCI_AF_CAP, 1, &cap);
> +	eeh_ops->read_config(edev, edev->af_cap + PCI_AF_CAP, 1, &cap);
>  	if (!(cap & PCI_AF_CAP_TP) || !(cap & PCI_AF_CAP_FLR))
>  		return -ENOTTY;
>  
> @@ -1030,12 +1031,12 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
>  		pnv_eeh_wait_for_pending(pdn, "AF",
>  					 edev->af_cap + PCI_AF_CTRL,
>  					 PCI_AF_STATUS_TP << 8);
> -		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL,
> +		eeh_ops->write_config(edev, edev->af_cap + PCI_AF_CTRL,
>  				      1, PCI_AF_CTRL_FLR);
>  		msleep(EEH_PE_RST_HOLD_TIME);
>  		break;
>  	case EEH_RESET_DEACTIVATE:
> -		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1, 0);
> +		eeh_ops->write_config(edev, edev->af_cap + PCI_AF_CTRL, 1, 0);
>  		msleep(EEH_PE_RST_SETTLE_TIME);
>  		break;
>  	}
> @@ -1269,9 +1270,11 @@ static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
>  	return false;
>  }
>  
> -static int pnv_eeh_read_config(struct pci_dn *pdn,
> +static int pnv_eeh_read_config(struct eeh_dev *edev,
>  			       int where, int size, u32 *val)
>  {
> +	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> +
>  	if (!pdn)
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> @@ -1283,9 +1286,11 @@ static int pnv_eeh_read_config(struct pci_dn *pdn,
>  	return pnv_pci_cfg_read(pdn, where, size, val);
>  }
>  
> -static int pnv_eeh_write_config(struct pci_dn *pdn,
> +static int pnv_eeh_write_config(struct eeh_dev *edev,
>  				int where, int size, u32 val)
>  {
> +	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> +
>  	if (!pdn)
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
> index fa704d7052ec..6f911a048339 100644
> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
> @@ -631,29 +631,33 @@ static int pseries_eeh_configure_bridge(struct eeh_pe *pe)
>  
>  /**
>   * pseries_eeh_read_config - Read PCI config space
> - * @pdn: PCI device node
> - * @where: PCI address
> + * @edev: EEH device handle
> + * @where: PCI config space offset
>   * @size: size to read
>   * @val: return value
>   *
>   * Read config space from the speicifed device
>   */
> -static int pseries_eeh_read_config(struct pci_dn *pdn, int where, int size, u32 *val)
> +static int pseries_eeh_read_config(struct eeh_dev *edev, int where, int size, u32 *val)
>  {
> +	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> +
>  	return rtas_read_config(pdn, where, size, val);
>  }
>  
>  /**
>   * pseries_eeh_write_config - Write PCI config space
> - * @pdn: PCI device node
> - * @where: PCI address
> + * @edev: EEH device handle
> + * @where: PCI config space offset
>   * @size: size to write
>   * @val: value to be written
>   *
>   * Write config space to the specified device
>   */
> -static int pseries_eeh_write_config(struct pci_dn *pdn, int where, int size, u32 val)
> +static int pseries_eeh_write_config(struct eeh_dev *edev, int where, int size, u32 val)
>  {
> +	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> +
>  	return rtas_write_config(pdn, where, size, val);
>  }
>  
> 

-- 
Alexey


* Re: [Very RFC 11/46] powerpc/eeh: Convert various printfs to use edev, not pci_dn
  2019-11-20  1:28 ` [Very RFC 11/46] powerpc/eeh: Convert various printfs to use edev, not pci_dn Oliver O'Halloran
@ 2019-11-22  4:55   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-22  4:55 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> We use the pci_dn to retrieve the domain, bus, device, and function numbers for
> an EEH device. We now have that in the eeh_dev so convert the various printk()s
> we have around the place to source that information from the eeh_dev.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/kernel/eeh.c    | 14 ++++----------
>  arch/powerpc/kernel/eeh_pe.c | 14 ++++++--------
>  2 files changed, 10 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 63500e34e329..c8039fdb23ba 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -167,23 +167,17 @@ void eeh_show_enabled(void)
>   */
>  static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
>  {
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
>  	u32 cfg;
>  	int cap, i;
>  	int n = 0, l = 0;
>  	char buffer[128];
>  
> -	if (!pdn) {
> -		pr_warn("EEH: Note: No error log for absent device.\n");
> -		return 0;
> -	}
> -
>  	n += scnprintf(buf+n, len-n, "%04x:%02x:%02x.%01x\n",
> -		       pdn->phb->global_number, pdn->busno,
> -		       PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
> +			edev->pe->phb->global_number, edev->bdfn >> 8,
> +			PCI_SLOT(edev->bdfn), PCI_FUNC(edev->bdfn));
>  	pr_warn("EEH: of node=%04x:%02x:%02x.%01x\n",
> -		pdn->phb->global_number, pdn->busno,
> -		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
> +		edev->pe->phb->global_number, edev->bdfn >> 8,
> +		PCI_SLOT(edev->bdfn), PCI_FUNC(edev->bdfn));
>  
>  	eeh_ops->read_config(edev, PCI_VENDOR_ID, 4, &cfg);
>  	n += scnprintf(buf+n, len-n, "dev/vend:%08x\n", cfg);
> diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
> index 634963aa4a77..831f363f1732 100644
> --- a/arch/powerpc/kernel/eeh_pe.c
> +++ b/arch/powerpc/kernel/eeh_pe.c
> @@ -366,9 +366,8 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
>   */
>  int eeh_add_to_parent_pe(struct eeh_dev *edev)
>  {
> +	int config_addr = edev->bdfn;
>  	struct eeh_pe *pe, *parent;
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> -	int config_addr = (pdn->busno << 8) | (pdn->devfn);
>  
>  	/* Check if the PE number is valid */
>  	if (!eeh_has_flag(EEH_VALID_PE_ZERO) && !edev->pe_config_addr) {
> @@ -382,7 +381,7 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
>  	 * PE should be composed of PCI bus and its subordinate
>  	 * components.
>  	 */
> -	pe = eeh_pe_get(pdn->phb, edev->pe_config_addr, config_addr);
> +	pe = eeh_pe_get(edev->controller, edev->pe_config_addr, config_addr);
>  	if (pe) {
>  		if (pe->type & EEH_PE_INVALID) {
>  			list_add_tail(&edev->entry, &pe->edevs);
> @@ -416,9 +415,9 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
>  
>  	/* Create a new EEH PE */
>  	if (edev->physfn)
> -		pe = eeh_pe_alloc(pdn->phb, EEH_PE_VF);
> +		pe = eeh_pe_alloc(edev->controller, EEH_PE_VF);
>  	else
> -		pe = eeh_pe_alloc(pdn->phb, EEH_PE_DEVICE);
> +		pe = eeh_pe_alloc(edev->controller, EEH_PE_DEVICE);
>  	if (!pe) {
>  		pr_err("%s: out of memory!\n", __func__);
>  		return -ENOMEM;
> @@ -434,10 +433,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
>  	 */
>  	parent = eeh_pe_get_parent(edev);
>  	if (!parent) {
> -		parent = eeh_phb_pe_get(pdn->phb);
> +		parent = eeh_phb_pe_get(edev->controller);
>  		if (!parent) {
>  			pr_err("%s: No PHB PE is found (PHB Domain=%d)\n",
> -				__func__, pdn->phb->global_number);
> +				__func__, edev->controller->global_number);
>  			edev->pe = NULL;
>  			kfree(pe);
>  			return -EEXIST;
> @@ -698,7 +697,6 @@ void eeh_pe_state_clear(struct eeh_pe *root, int state, bool include_passed)
>   */
>  static void eeh_bridge_check_link(struct eeh_dev *edev)
>  {
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);


This belongs to 09/46. Or just merge them.


>  	int cap;
>  	uint32_t val;
>  	int timeout = 0;
> 
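
As a small worked example of why passing the whole bdfn to PCI_SLOT() and
PCI_FUNC() is fine: both macros mask their argument, so the bus bits in the
upper byte are ignored. The sketch below is userspace C; the two macros are
written out to match the kernel's definitions, while format_bdfn() is a
hypothetical helper that produces the same layout as the scnprintf() call
in the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Written out here to match the kernel's PCI_SLOT()/PCI_FUNC() macros. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn) ((devfn) & 0x07)

/*
 * Format domain:bus:slot.func from a 16-bit bdfn, (bus << 8) | devfn.
 * Returns the number of characters snprintf() would have written.
 */
static int format_bdfn(char *buf, size_t len, unsigned int domain,
		       unsigned int bdfn)
{
	assert(buf && len);
	return snprintf(buf, len, "%04x:%02x:%02x.%01x",
			domain, (bdfn >> 8) & 0xff,
			PCI_SLOT(bdfn), PCI_FUNC(bdfn));
}
```

For bdfn 0x0a19 the bus is 0x0a and the devfn is 0x19, which decodes to
slot 3, function 1.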

-- 
Alexey


* Re: [Very RFC 12/46] powerpc/eeh: Split eeh_probe into probe_pdn and probe_pdev
  2019-11-20  1:28 ` [Very RFC 12/46] powerpc/eeh: Split eeh_probe into probe_pdn and probe_pdev Oliver O'Halloran
@ 2019-11-22  5:45   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-22  5:45 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> The EEH core has a concept of "early probe" and "late probe." When the
> EEH_PROBE_MODE_DEVTREE flag is set (i.e. pseries) we call the eeh_ops->probe()
> function in eeh_add_device_early() so the eeh_dev state is initialised based on
> the pci_dn. It's important to realise that this happens *long* before the PCI
> device has been probed and a pci_dev structure created. This is necessary due
> to a PAPR requirement that EEH be enabled before the OS starts interacting
> with the device.
> 
> The late probe is done in eeh_add_device_late() when the EEH_PROBE_MODE_DEV
> flag is set (i.e. PowerNV). The main difference is the late probe happens
> after the pci_dev has been created. As a result there is no actual dependency
> on a pci_dn in the late probe case. Splitting the single eeh_ops->probe()
> function into separate functions allows us to simplify the late probe case
> since we have access to a pci_dev at that point. Having access to a pci_dev
> means that we can use the functions provided by the PCI core for finding
> capabilities, etc rather than doing it manually.
> 
> It also changes the prototype for the probe functions to be void. Currently
> they return a void *, but both implementations always return NULL so there's
> not much point to it.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/include/asm/eeh.h               |  3 +-
>  arch/powerpc/kernel/eeh.c                    |  6 ++--
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 29 ++++++--------------
>  arch/powerpc/platforms/pseries/eeh_pseries.c | 13 ++++-----
>  4 files changed, 20 insertions(+), 31 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index 67847f8dfe71..466b0165fbcf 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -215,7 +215,8 @@ enum {
>  struct eeh_ops {
>  	char *name;
>  	int (*init)(void);
> -	void* (*probe)(struct pci_dn *pdn, void *data);
> +	void (*probe_pdn)(struct pci_dn *pdn);    /* used on pseries */
> +	void (*probe_pdev)(struct pci_dev *pdev); /* used on powernv */
>  	int (*set_option)(struct eeh_pe *pe, int option);
>  	int (*get_pe_addr)(struct eeh_pe *pe);
>  	int (*get_state)(struct eeh_pe *pe, int *delay);
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index c8039fdb23ba..087a98b42a8c 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1066,7 +1066,7 @@ void eeh_add_device_early(struct pci_dn *pdn)
>  	    (eeh_has_flag(EEH_PROBE_MODE_DEVTREE) && 0 == phb->buid))
>  		return;
>  
> -	eeh_ops->probe(pdn, NULL);
> +	eeh_ops->probe_pdn(pdn);
>  }
>  
>  /**
> @@ -1135,8 +1135,8 @@ void eeh_add_device_late(struct pci_dev *dev)


This guy is called directly from pseries and powernv so it feels like
you do not really need these probe_pdn()/probe_pdev() as eeh_ops hooks and
can just call them directly.

eeh_add_device_early() is even simpler and only used for pseries (not
now but after 14/46), unless I missed something. Thanks,



>  		dev->dev.archdata.edev = NULL;
>  	}
>  
> -	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
> -		eeh_ops->probe(pdn, NULL);
> +	if (eeh_ops->probe_pdev && eeh_has_flag(EEH_PROBE_MODE_DEV))
> +		eeh_ops->probe_pdev(dev);
>  
>  	edev->pdev = dev;
>  	dev->dev.archdata.edev = edev;
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 6c5d9f1bc378..8bd5317aa878 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -346,23 +346,13 @@ static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
>  
>  /**
>   * pnv_eeh_probe - Do probe on PCI device
> - * @pdn: PCI device node
> - * @data: unused
> + * @pdev: pci_dev to probe
>   *
> - * When EEH module is installed during system boot, all PCI devices
> - * are checked one by one to see if it supports EEH. The function
> - * is introduced for the purpose. By default, EEH has been enabled
> - * on all PCI devices. That's to say, we only need do necessary
> - * initialization on the corresponding eeh device and create PE
> - * accordingly.
> - *
> - * It's notable that's unsafe to retrieve the EEH device through
> - * the corresponding PCI device. During the PCI device hotplug, which
> - * was possiblly triggered by EEH core, the binding between EEH device
> - * and the PCI device isn't built yet.
> + * Creates (or finds an existing) edev for this pci_dev.
>   */
> -static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
> +static void pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  {
> +	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	struct pci_controller *hose = pdn->phb;
>  	struct pnv_phb *phb = hose->private_data;
>  	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
> @@ -377,11 +367,11 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
>  	 * the probing.
>  	 */
>  	if (!edev || edev->pe)
> -		return NULL;
> +		return;
>  
>  	/* Skip for PCI-ISA bridge */
>  	if ((pdn->class_code >> 8) == PCI_CLASS_BRIDGE_ISA)
> -		return NULL;
> +		return;
>  
>  	eeh_edev_dbg(edev, "Probing device\n");
>  
> @@ -411,7 +401,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
>  	ret = eeh_add_to_parent_pe(edev);
>  	if (ret) {
>  		eeh_edev_warn(edev, "Failed to add device to PE (code %d)\n", ret);
> -		return NULL;
> +		return;
>  	}
>  
>  	/*
> @@ -469,8 +459,6 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
>  	eeh_save_bars(edev);
>  
>  	eeh_edev_dbg(edev, "EEH enabled on device\n");
> -
> -	return NULL;
>  }
>  
>  /**
> @@ -1673,7 +1661,8 @@ static int pnv_eeh_restore_config(struct eeh_dev *edev)
>  static struct eeh_ops pnv_eeh_ops = {
>  	.name                   = "powernv",
>  	.init                   = pnv_eeh_init,
> -	.probe			= pnv_eeh_probe,
> +	.probe_pdn		= NULL,
> +	.probe_pdev		= pnv_eeh_probe_pdev,
>  	.set_option             = pnv_eeh_set_option,
>  	.get_pe_addr            = pnv_eeh_get_pe_addr,
>  	.get_state              = pnv_eeh_get_state,
> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
> index 6f911a048339..3ac23c884f4e 100644
> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
> @@ -229,7 +229,7 @@ static int pseries_eeh_find_ecap(struct pci_dn *pdn, int cap)
>   * are checked one by one to see if it supports EEH. The function
>   * is introduced for the purpose.
>   */
> -static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
> +static void pseries_eeh_probe_pdn(struct pci_dn *pdn)
>  {
>  	struct eeh_dev *edev;
>  	struct eeh_pe pe;
> @@ -240,15 +240,15 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
>  	/* Retrieve OF node and eeh device */
>  	edev = pdn_to_eeh_dev(pdn);
>  	if (!edev || edev->pe)
> -		return NULL;
> +		return;
>  
>  	/* Check class/vendor/device IDs */
>  	if (!pdn->vendor_id || !pdn->device_id || !pdn->class_code)
> -		return NULL;
> +		return;
>  
>  	/* Skip for PCI-ISA bridge */
>          if ((pdn->class_code >> 8) == PCI_CLASS_BRIDGE_ISA)
> -		return NULL;
> +		return;
>  
>  	eeh_edev_dbg(edev, "Probing device\n");
>  
> @@ -315,8 +315,6 @@ static void *pseries_eeh_probe(struct pci_dn *pdn, void *data)
>  
>  	/* Save memory bars */
>  	eeh_save_bars(edev);
> -
> -	return NULL;
>  }
>  
>  /**
> @@ -755,7 +753,8 @@ static int pseries_notify_resume(struct pci_dn *pdn)
>  static struct eeh_ops pseries_eeh_ops = {
>  	.name			= "pseries",
>  	.init			= pseries_eeh_init,
> -	.probe			= pseries_eeh_probe,
> +	.probe_pdn		= pseries_eeh_probe_pdn,
> +	.probe_pdev 		= NULL,
>  	.set_option		= pseries_eeh_set_option,
>  	.get_pe_addr		= pseries_eeh_get_pe_addr,
>  	.get_state		= pseries_eeh_get_state,
> 
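For anyone skimming the ops change above: the split only works if the generic code treats each probe hook as optional. A minimal sketch of that idea follows. Every demo_* name is made up for illustration, and the helper bodies are an assumption about how a caller could guard the hooks, not the code from this series.

```c
#include <stddef.h>

/* Hypothetical stand-in for an eeh_dev; it only counts probe calls. */
struct demo_dev {
	int probed_by_pdn;
	int probed_by_pdev;
};

/* Hypothetical stand-in for struct eeh_ops with the two probe hooks. */
struct demo_eeh_ops {
	const char *name;
	void (*probe_pdn)(struct demo_dev *dev);
	void (*probe_pdev)(struct demo_dev *dev);
};

static void demo_probe_pdn(struct demo_dev *dev)
{
	dev->probed_by_pdn++;
}

/* A pseries-like backend: devicetree probe only, no pci_dev probe. */
static const struct demo_eeh_ops demo_pseries_ops = {
	.name       = "pseries",
	.probe_pdn  = demo_probe_pdn,
	.probe_pdev = NULL,
};

/* Early path: only runs a hook if the backend provides one. */
static void demo_add_device_early(const struct demo_eeh_ops *ops,
				  struct demo_dev *dev)
{
	if (ops->probe_pdn)
		ops->probe_pdn(dev);
}

/* Late path: same guard for the pci_dev based hook. */
static void demo_add_device_late(const struct demo_eeh_ops *ops,
				 struct demo_dev *dev)
{
	if (ops->probe_pdev)
		ops->probe_pdev(dev);
}
```

With a table like demo_pseries_ops the late path is a no-op, which mirrors the `.probe_pdev = NULL` entry in the quoted hunk.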

-- 
Alexey


* Re: [Very RFC 14/46] powernv/eeh: Remove un-necessary call to eeh_add_device_early()
  2019-11-20  1:28 ` [Very RFC 14/46] powernv/eeh: Remove un-necessary call to eeh_add_device_early() Oliver O'Halloran
@ 2019-11-22  6:01   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-22  6:01 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> eeh_add_device_early() is used to initialise the EEH state for a PCI device
> based on the contents of its devicetree node. It doesn't do anything
> unless EEH_FLAG_PROBE_MODE_DEVTREE is set and that only happens on pseries.
> 
> Remove the call to eeh_add_device_early() in the powernv code to squash
> another pci_dn usage.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>


> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 5250c4525544..aa2935a08464 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -40,13 +40,10 @@ static int eeh_event_irq = -EINVAL;
>  
>  void pnv_pcibios_bus_add_device(struct pci_dev *pdev)
>  {
> -	struct pci_dn *pdn = pci_get_pdn(pdev);
> -
> -	if (!pdn || eeh_has_flag(EEH_FORCE_DISABLED))
> +	if (eeh_has_flag(EEH_FORCE_DISABLED))
>  		return;
>  
>  	dev_dbg(&pdev->dev, "EEH: Setting up device\n");
> -	eeh_add_device_early(pdn);
>  	eeh_add_device_late(pdev);
>  }
>  
> 

-- 
Alexey


* Re: [Very RFC 15/46] powernv/eeh: Use pnv_eeh_*_config() for internal config ops
  2019-11-20  1:28 ` [Very RFC 15/46] powernv/eeh: Use pnv_eeh_*_config() for internal config ops Oliver O'Halloran
@ 2019-11-22  6:15   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-22  6:15 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Use the pnv_eeh_{read|write}_config() functions that take an edev rather
> than a pci_dn. This allows us to remove most of the explict uses of pci_dn
> in the PowerNV EEH backend and localises them into a few functions which we
> can fix later.


-ESPELL :)

> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 153 +++++++++----------
>  1 file changed, 70 insertions(+), 83 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index aa2935a08464..aaccb3768393 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -278,27 +278,73 @@ int pnv_eeh_post_init(void)
>  	return ret;
>  }
>  
> -static int pnv_eeh_find_cap(struct pci_dn *pdn, int cap)
> +static inline bool pnv_eeh_cfg_blocked(struct eeh_dev *edev)
> +{
> +	if (!edev || !edev->pe)
> +		return false;
> +
> +	/*
> +	 * We will issue FLR or AF FLR to all VFs, which are contained
> +	 * in VF PE. It relies on the EEH PCI config accessors. So we
> +	 * can't block them during the window.
> +	 */
> +	if (edev->physfn && (edev->pe->state & EEH_PE_RESET))
> +		return false;
> +
> +	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
> +		return true;
> +
> +	return false;
> +}
> +
> +static int pnv_eeh_read_config(struct eeh_dev *edev,
> +			       int where, int size, u32 *val)
> +{
> +	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> +
> +	if (!pdn)
> +		return PCIBIOS_DEVICE_NOT_FOUND;
> +
> +	if (pnv_eeh_cfg_blocked(edev)) {
> +		*val = 0xFFFFFFFF;
> +		return PCIBIOS_SET_FAILED;
> +	}
> +
> +	return pnv_pci_cfg_read(pdn, where, size, val);
> +}
> +
> +static int pnv_eeh_write_config(struct eeh_dev *edev,
> +				int where, int size, u32 val)
> +{
> +	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> +
> +	if (!pdn)
> +		return PCIBIOS_DEVICE_NOT_FOUND;
> +
> +	if (pnv_eeh_cfg_blocked(edev))
> +		return PCIBIOS_SET_FAILED;
> +
> +	return pnv_pci_cfg_write(pdn, where, size, val);
> +}
> +
> +static int pnv_eeh_find_cap(struct eeh_dev *edev, int cap)
>  {
>  	int pos = PCI_CAPABILITY_LIST;
>  	int cnt = 48;   /* Maximal number of capabilities */
>  	u32 status, id;
>  
> -	if (!pdn)
> -		return 0;
> -
>  	/* Check if the device supports capabilities */
> -	pnv_pci_cfg_read(pdn, PCI_STATUS, 2, &status);
> +	pnv_eeh_read_config(edev, PCI_STATUS, 2, &status);
>  	if (!(status & PCI_STATUS_CAP_LIST))
>  		return 0;
>  
>  	while (cnt--) {
> -		pnv_pci_cfg_read(pdn, pos, 1, &pos);
> +		pnv_eeh_read_config(edev, pos, 1, &pos);
>  		if (pos < 0x40)
>  			break;
>  
>  		pos &= ~3;
> -		pnv_pci_cfg_read(pdn, pos + PCI_CAP_LIST_ID, 1, &id);
> +		pnv_eeh_read_config(edev, pos + PCI_CAP_LIST_ID, 1, &id);
>  		if (id == 0xff)
>  			break;
>  
> @@ -313,15 +359,14 @@ static int pnv_eeh_find_cap(struct pci_dn *pdn, int cap)
>  	return 0;
>  }
>  
> -static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
> +static int pnv_eeh_find_ecap(struct eeh_dev *edev, int cap)
>  {
> -	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>  	u32 header;
>  	int pos = 256, ttl = (4096 - 256) / 8;
>  
>  	if (!edev || !edev->pcie_cap)
>  		return 0;
> -	if (pnv_pci_cfg_read(pdn, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
> +	if (pnv_eeh_read_config(edev, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
>  		return 0;
>  	else if (!header)
>  		return 0;
> @@ -334,7 +379,7 @@ static int pnv_eeh_find_ecap(struct pci_dn *pdn, int cap)
>  		if (pos < 256)
>  			break;
>  
> -		if (pnv_pci_cfg_read(pdn, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
> +		if (pnv_eeh_read_config(edev, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
>  			break;
>  	}
>  
> @@ -382,15 +427,14 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  
>  	/* Initialize eeh device */
>  	edev->class_code = pdn->class_code;
> -	edev->mode	&= 0xFFFFFF00;


It seems that this should go to 22/46. Thanks,


> -	edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX);
> -	edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
> -	edev->af_cap   = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF);
> -	edev->aer_cap  = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
> +	edev->pcix_cap = pnv_eeh_find_cap(edev, PCI_CAP_ID_PCIX);
> +	edev->pcie_cap = pnv_eeh_find_cap(edev, PCI_CAP_ID_EXP);
> +	edev->af_cap   = pnv_eeh_find_cap(edev, PCI_CAP_ID_AF);
> +	edev->aer_cap  = pnv_eeh_find_ecap(edev, PCI_EXT_CAP_ID_ERR);
>  	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
>  		edev->mode |= EEH_DEV_BRIDGE;
>  		if (edev->pcie_cap) {
> -			pnv_pci_cfg_read(pdn, edev->pcie_cap + PCI_EXP_FLAGS,
> +			pnv_eeh_read_config(edev, edev->pcie_cap + PCI_EXP_FLAGS,
>  					 2, &pcie_flags);
>  			pcie_flags = (pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4;
>  			if (pcie_flags == PCI_EXP_TYPE_ROOT_PORT)
> @@ -839,8 +883,7 @@ static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
>  
>  static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
>  {
> -	struct pci_dn *pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
> -	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
> +	struct eeh_dev *edev = pci_dev_to_eeh_dev(dev);
>  	int aer = edev ? edev->aer_cap : 0;
>  	u32 ctrl;
>  
> @@ -944,10 +987,9 @@ void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
>  	}
>  }
>  
> -static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, const char *type,
> +static void pnv_eeh_wait_for_pending(struct eeh_dev *edev, const char *type,
>  				     int pos, u16 mask)
>  {
> -	struct eeh_dev *edev = pdn->edev;
>  	int i, status = 0;
>  
>  	/* Wait for Transaction Pending bit to be cleared */
> @@ -965,9 +1007,8 @@ static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, const char *type,
>  		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
>  }
>  
> -static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
> +static int pnv_eeh_do_flr(struct eeh_dev *edev, int option)
>  {
> -	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>  	u32 reg = 0;
>  
>  	if (WARN_ON(!edev->pcie_cap))
> @@ -980,7 +1021,7 @@ static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
>  	switch (option) {
>  	case EEH_RESET_HOT:
>  	case EEH_RESET_FUNDAMENTAL:
> -		pnv_eeh_wait_for_pending(pdn, "",
> +		pnv_eeh_wait_for_pending(edev, "",
>  					 edev->pcie_cap + PCI_EXP_DEVSTA,
>  					 PCI_EXP_DEVSTA_TRPND);
>  		eeh_ops->read_config(edev, edev->pcie_cap + PCI_EXP_DEVCTL,
> @@ -1003,9 +1044,8 @@ static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
>  	return 0;
>  }
>  
> -static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
> +static int pnv_eeh_do_af_flr(struct eeh_dev *edev, int option)
>  {
> -	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>  	u32 cap = 0;
>  
>  	if (WARN_ON(!edev->af_cap))
> @@ -1023,7 +1063,7 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
>  		 * test is used, so we use the conrol offset rather than status
>  		 * and shift the test bit to match.
>  		 */
> -		pnv_eeh_wait_for_pending(pdn, "AF",
> +		pnv_eeh_wait_for_pending(edev, "AF",
>  					 edev->af_cap + PCI_AF_CTRL,
>  					 PCI_AF_STATUS_TP << 8);
>  		eeh_ops->write_config(edev, edev->af_cap + PCI_AF_CTRL,
> @@ -1042,20 +1082,18 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
>  static int pnv_eeh_reset_vf_pe(struct eeh_pe *pe, int option)
>  {
>  	struct eeh_dev *edev;
> -	struct pci_dn *pdn;
>  	int ret;
>  
>  	/* The VF PE should have only one child device */
>  	edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, entry);
> -	pdn = eeh_dev_to_pdn(edev);
> -	if (!pdn)
> +	if (!edev)
>  		return -ENXIO;
>  
> -	ret = pnv_eeh_do_flr(pdn, option);
> +	ret = pnv_eeh_do_flr(edev, option);
>  	if (!ret)
>  		return ret;
>  
> -	return pnv_eeh_do_af_flr(pdn, option);
> +	return pnv_eeh_do_af_flr(edev, option);
>  }
>  
>  /**
> @@ -1244,57 +1282,6 @@ static int pnv_eeh_err_inject(struct eeh_pe *pe, int type, int func,
>  	return 0;
>  }
>  
> -static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
> -{
> -	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
> -
> -	if (!edev || !edev->pe)
> -		return false;
> -
> -	/*
> -	 * We will issue FLR or AF FLR to all VFs, which are contained
> -	 * in VF PE. It relies on the EEH PCI config accessors. So we
> -	 * can't block them during the window.
> -	 */
> -	if (edev->physfn && (edev->pe->state & EEH_PE_RESET))
> -		return false;
> -
> -	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
> -		return true;
> -
> -	return false;
> -}
> -
> -static int pnv_eeh_read_config(struct eeh_dev *edev,
> -			       int where, int size, u32 *val)
> -{
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> -
> -	if (!pdn)
> -		return PCIBIOS_DEVICE_NOT_FOUND;
> -
> -	if (pnv_eeh_cfg_blocked(pdn)) {
> -		*val = 0xFFFFFFFF;
> -		return PCIBIOS_SET_FAILED;
> -	}
> -
> -	return pnv_pci_cfg_read(pdn, where, size, val);
> -}
> -
> -static int pnv_eeh_write_config(struct eeh_dev *edev,
> -				int where, int size, u32 val)
> -{
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> -
> -	if (!pdn)
> -		return PCIBIOS_DEVICE_NOT_FOUND;
> -
> -	if (pnv_eeh_cfg_blocked(pdn))
> -		return PCIBIOS_SET_FAILED;
> -
> -	return pnv_pci_cfg_write(pdn, where, size, val);
> -}
> -
>  static void pnv_eeh_dump_hub_diag_common(struct OpalIoP7IOCErrorData *data)
>  {
>  	/* GEM */
> 
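To spell out the gating logic that the moved accessors implement, here is a rough userspace sketch. All demo_* names and flag values are placeholders I made up; only the control flow (VFs stay accessible during reset, blocked reads return all-ones) follows the quoted code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag and return values; not the kernel's definitions. */
#define DEMO_PE_RESET          (1u << 0)
#define DEMO_PE_CFG_BLOCKED    (1u << 1)
#define DEMO_SUCCESSFUL        0
#define DEMO_DEVICE_NOT_FOUND  1
#define DEMO_SET_FAILED        2

struct demo_pe {
	unsigned int state;
};

struct demo_dev {
	struct demo_pe *pe;
	bool is_vf;          /* plays the role of edev->physfn being set */
	uint32_t cfg[64];    /* fake config space, one dword per slot */
};

static bool demo_cfg_blocked(const struct demo_dev *dev)
{
	if (!dev || !dev->pe)
		return false;

	/* VFs are reset with FLR, which itself needs config access. */
	if (dev->is_vf && (dev->pe->state & DEMO_PE_RESET))
		return false;

	return (dev->pe->state & DEMO_PE_CFG_BLOCKED) != 0;
}

static int demo_read_config(const struct demo_dev *dev,
			    unsigned int where, uint32_t *val)
{
	if (!dev)
		return DEMO_DEVICE_NOT_FOUND;

	if (demo_cfg_blocked(dev)) {
		*val = 0xFFFFFFFF;   /* same all-ones pattern as the patch */
		return DEMO_SET_FAILED;
	}

	*val = dev->cfg[(where / 4) % 64];
	return DEMO_SUCCESSFUL;
}
```

The point of keying everything off the device structure is that callers no longer need a pci_dn just to ask whether config space is blocked.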

-- 
Alexey


* Re: [Very RFC 16/46] powernv/eeh: Use eeh_edev_warn() rather than open-coding a BDFN print
  2019-11-20  1:28 ` [Very RFC 16/46] powernv/eeh: Use eeh_edev_warn() rather than open-coding a BDFN print Oliver O'Halloran
@ 2019-11-22  6:17   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-22  6:17 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Neaten things up a bit and remove a pci_dn use.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

That said, it could also be merged into a bigger patch; it is hardly
useful on its own.
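For reference, the open-coded print being removed boils down to splitting a 16-bit BDFN into bus, slot and function. A small sketch of that formatting, assuming the usual 8/5/3 bit layout; demo_format_bdfn() and the DEMO_* macros are invented for illustration and are not the eeh_edev_warn() implementation:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local copies of the usual bus/slot/function split of a 16-bit BDFN. */
#define DEMO_BUSNO(bdfn) ((unsigned int)(((bdfn) >> 8) & 0xff))
#define DEMO_SLOT(bdfn)  ((unsigned int)(((bdfn) >> 3) & 0x1f))
#define DEMO_FUNC(bdfn)  ((unsigned int)((bdfn) & 0x07))

/* Format "domain:bus:slot.func" the way the removed pr_warn() did. */
static int demo_format_bdfn(char *buf, size_t len,
			    unsigned int domain, uint16_t bdfn)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%01x",
			domain, DEMO_BUSNO(bdfn),
			DEMO_SLOT(bdfn), DEMO_FUNC(bdfn));
}
```

Having one helper own this format is the main reason the open-coded version is worth dropping.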


> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index aaccb3768393..f58fe6bda46e 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -1001,10 +1001,8 @@ static void pnv_eeh_wait_for_pending(struct eeh_dev *edev, const char *type,
>  		msleep((1 << i) * 100);
>  	}
>  
> -	pr_warn("%s: Pending transaction while issuing %sFLR to %04x:%02x:%02x.%01x\n",
> -		__func__, type,
> -		pdn->phb->global_number, pdn->busno,
> -		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
> +	eeh_edev_warn(edev, "%s: Pending transaction while issuing %sFLR\n",
> +		__func__, type);
>  }
>  
>  static int pnv_eeh_do_flr(struct eeh_dev *edev, int option)
> 

-- 
Alexey


* Re: [Very RFC 17/46] powernv/eeh: add pnv_eeh_find_edev()
  2019-11-20  1:28 ` [Very RFC 17/46] powernv/eeh: add pnv_eeh_find_edev() Oliver O'Halloran
@ 2019-11-25  0:30   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-25  0:30 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> To get away from using pci_dn we need a way to find the edev for a given
> bdfn. The easiest way to do this is to find the ioda_pe for that BDFN in
> the PHB's reverse mapping table and scan the device list of the
> corresponding eeh_pe.
> 
> Is this slow? Yeah probably. Is it slower than the existing "traverse the
> pdn tree" method? Probably not.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 31 ++++++++++++++++++++
>  arch/powerpc/platforms/powernv/pci.h         |  2 ++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index f58fe6bda46e..a974822c5097 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -278,6 +278,37 @@ int pnv_eeh_post_init(void)
>  	return ret;
>  }
>  
> +struct eeh_dev *pnv_eeh_find_edev(struct pnv_phb *phb, u16 bdfn)
> +{
> +	struct pnv_ioda_pe *ioda_pe;
> +	struct eeh_dev *tmp, *edev;
> +	struct eeh_pe *pe;
> +
> +	/* EEH not enabled ? */
> +	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> +		return NULL;
> +
> +	/* Fish the EEH PE from the IODA PE */
> +	ioda_pe = __pnv_ioda_get_pe(phb, bdfn);
> +	if (!ioda_pe)
> +		return NULL;
> +
> +	/*
> +	 * FIXME: Doing a tree-traversal followed by a list traversal
> +	 * on every config access is dumb. Not much dumber than the pci_dn
> +	 * tree traversal we did before, but still quite dumb.


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>


Although I would reduce the comment above to "FIXME: replace 3
traversals with something better".



> +	 */
> +	pe = eeh_pe_get(phb->hose, ioda_pe->pe_number, 0);
> +	if (!pe)
> +		return NULL;
> +
> +	eeh_pe_for_each_dev(pe, edev, tmp)
> +		if (edev->bdfn == bdfn)
> +			return edev;
> +
> +	return NULL;
> +}
> +
>  static inline bool pnv_eeh_cfg_blocked(struct eeh_dev *edev)
>  {
>  	if (!edev || !edev->pe)
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 3c33a0c91a69..a343f3c8e65c 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -196,6 +196,8 @@ extern void pnv_set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq);
>  extern unsigned long pnv_pci_ioda2_get_table_size(__u32 page_shift,
>  		__u64 window_size, __u32 levels);
>  extern int pnv_eeh_post_init(void);
> +struct eeh_dev;
> +struct eeh_dev *pnv_eeh_find_edev(struct pnv_phb *phb, u16 bdfn);
>  
>  __printf(3, 4)
>  extern void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
> 
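The lookup described in the commit message (reverse map from BDFN to PE, then a scan of that PE's devices) can be sketched in plain C as below. The structures are heavily simplified stand-ins with made-up demo_* names, and the flat arrays are an assumption for the sake of a small example; the real code walks the eeh_pe tree and a linked list.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_INVALID_PE  (-1)
#define DEMO_MAX_DEVS    4
#define DEMO_MAX_PES     4

/* Hypothetical, flattened stand-ins for eeh_dev, eeh_pe and pnv_phb. */
struct demo_edev {
	uint16_t bdfn;
};

struct demo_pe {
	size_t ndevs;
	struct demo_edev devs[DEMO_MAX_DEVS];
};

struct demo_phb {
	int pe_rmap[0x10000];               /* bdfn -> PE number */
	struct demo_pe pes[DEMO_MAX_PES];
};

static struct demo_edev *demo_find_edev(struct demo_phb *phb, uint16_t bdfn)
{
	int pe_no = phb->pe_rmap[bdfn];
	struct demo_pe *pe;
	size_t i;

	/* Step 1: reverse map lookup, O(1). */
	if (pe_no < 0 || pe_no >= DEMO_MAX_PES)
		return NULL;

	/* Step 2: linear scan of the devices in that PE. */
	pe = &phb->pes[pe_no];
	for (i = 0; i < pe->ndevs; i++)
		if (pe->devs[i].bdfn == bdfn)
			return &pe->devs[i];

	return NULL;
}
```

The scan in step 2 is the part the FIXME is about: it is cheap for a VF PE with one device and less so for a bus PE with many.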

-- 
Alexey


* Re: [Very RFC 18/46] powernv/pci: Add pci_bus_to_pnvhb() helper
  2019-11-20  1:28 ` [Very RFC 18/46] powernv/pci: Add pci_bus_to_pnvhb() helper Oliver O'Halloran
@ 2019-11-25  0:42   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-25  0:42 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Add a helper to go from a pci_bus structure to the pnv_phb that hosts that
> bus. There's a lot of instances of the following pattern:
> 
> 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> 	struct pnv_phb *phb = hose->private_data;
> 
> with no other uses of the pci_controller inside the function. This is
> hard to read, since it requires you to memorise the contents of the
> private data fields, and somewhat error prone, since it involves blindly
> assigning a void pointer. Add a helper to make it more concise and
> explicit.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 88 +++++++----------------
>  arch/powerpc/platforms/powernv/pci.c      | 18 ++---
>  arch/powerpc/platforms/powernv/pci.h      | 10 +++
>  3 files changed, 39 insertions(+), 77 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index c74521e5f3ab..a1c9315f3208 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -252,8 +252,7 @@ static int pnv_ioda2_init_m64(struct pnv_phb *phb)
>  static void pnv_ioda_reserve_dev_m64_pe(struct pci_dev *pdev,
>  					 unsigned long *pe_bitmap)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct resource *r;
>  	resource_size_t base, sgsz, start, end;
>  	int segno, i;
> @@ -351,8 +350,7 @@ static void pnv_ioda_reserve_m64_pe(struct pci_bus *bus,
>  
>  static struct pnv_ioda_pe *pnv_ioda_pick_m64_pe(struct pci_bus *bus, bool all)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	struct pnv_ioda_pe *master_pe, *pe;
>  	unsigned long size, *pe_alloc;
>  	int i;
> @@ -673,8 +671,7 @@ struct pnv_ioda_pe *__pnv_ioda_get_pe(struct pnv_phb *phb, u16 bdfn)
>  
>  struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(dev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
>  	struct pci_dn *pdn = pci_get_pdn(dev);
>  
>  	if (!pdn)
> @@ -1053,8 +1050,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>  
>  static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(dev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
>  	struct pci_dn *pdn = pci_get_pdn(dev);
>  	struct pnv_ioda_pe *pe;
>  
> @@ -1113,8 +1109,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>   */
>  static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	struct pnv_ioda_pe *pe = NULL;
>  	unsigned int pe_num;
>  
> @@ -1181,8 +1176,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
>  	struct pnv_ioda_pe *pe;
>  	struct pci_dev *gpu_pdev;
>  	struct pci_dn *npu_pdn;
> -	struct pci_controller *hose = pci_bus_to_host(npu_pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(npu_pdev->bus);
>  
>  	/*
>  	 * Due to a hardware errata PE#0 on the NPU is reserved for
> @@ -1279,16 +1273,12 @@ static void pnv_pci_ioda_setup_PEs(void)
>  #ifdef CONFIG_PCI_IOV
>  static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
>  {
> -	struct pci_bus        *bus;
> -	struct pci_controller *hose;
>  	struct pnv_phb        *phb;
>  	struct pci_dn         *pdn;
>  	int                    i, j;
>  	int                    m64_bars;
>  
> -	bus = pdev->bus;
> -	hose = pci_bus_to_host(bus);
> -	phb = hose->private_data;
> +	phb = pci_bus_to_pnvhb(pdev->bus);
>  	pdn = pci_get_pdn(pdev);
>  
>  	if (pdn->m64_single_mode)
> @@ -1312,8 +1302,6 @@ static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
>  
>  static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>  {
> -	struct pci_bus        *bus;
> -	struct pci_controller *hose;
>  	struct pnv_phb        *phb;
>  	struct pci_dn         *pdn;
>  	unsigned int           win;
> @@ -1325,9 +1313,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>  	int                    pe_num;
>  	int                    m64_bars;
>  
> -	bus = pdev->bus;
> -	hose = pci_bus_to_host(bus);
> -	phb = hose->private_data;
> +	phb = pci_bus_to_pnvhb(pdev->bus);
>  	pdn = pci_get_pdn(pdev);
>  	total_vfs = pci_sriov_get_totalvfs(pdev);
>  
> @@ -1438,15 +1424,11 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe
>  
>  static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
>  {
> -	struct pci_bus        *bus;
> -	struct pci_controller *hose;
>  	struct pnv_phb        *phb;
>  	struct pnv_ioda_pe    *pe, *pe_n;
>  	struct pci_dn         *pdn;
>  
> -	bus = pdev->bus;
> -	hose = pci_bus_to_host(bus);
> -	phb = hose->private_data;
> +	phb = pci_bus_to_pnvhb(pdev->bus);
>  	pdn = pci_get_pdn(pdev);
>  
>  	if (!pdev->is_physfn)
> @@ -1471,16 +1453,12 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
>  
>  void pnv_pci_sriov_disable(struct pci_dev *pdev)
>  {
> -	struct pci_bus        *bus;
> -	struct pci_controller *hose;
>  	struct pnv_phb        *phb;
>  	struct pnv_ioda_pe    *pe;
>  	struct pci_dn         *pdn;
>  	u16                    num_vfs, i;
>  
> -	bus = pdev->bus;
> -	hose = pci_bus_to_host(bus);
> -	phb = hose->private_data;
> +	phb = pci_bus_to_pnvhb(pdev->bus);
>  	pdn = pci_get_pdn(pdev);
>  	num_vfs = pdn->num_vfs;
>  
> @@ -1519,17 +1497,13 @@ static void pnv_ioda_setup_bus_iommu_group(struct pnv_ioda_pe *pe,
>  #endif
>  static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  {
> -	struct pci_bus        *bus;
> -	struct pci_controller *hose;
>  	struct pnv_phb        *phb;
>  	struct pnv_ioda_pe    *pe;
>  	int                    pe_num;
>  	u16                    vf_index;
>  	struct pci_dn         *pdn;
>  
> -	bus = pdev->bus;
> -	hose = pci_bus_to_host(bus);
> -	phb = hose->private_data;
> +	phb = pci_bus_to_pnvhb(pdev->bus);
>  	pdn = pci_get_pdn(pdev);
>  
>  	if (!pdev->is_physfn)
> @@ -1556,7 +1530,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		pe->rid = (vf_bus << 8) | vf_devfn;
>  
>  		pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",
> -			hose->global_number, pdev->bus->number,
> +			pci_domain_nr(pdev->bus), pdev->bus->number,
>  			PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
>  
>  		if (pnv_ioda_configure_pe(phb, pe)) {
> @@ -1591,17 +1565,13 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  
>  int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  {
> -	struct pci_bus        *bus;
> -	struct pci_controller *hose;
>  	struct pnv_phb        *phb;
>  	struct pnv_ioda_pe    *pe;
>  	struct pci_dn         *pdn;
>  	int                    ret;
>  	u16                    i;
>  
> -	bus = pdev->bus;
> -	hose = pci_bus_to_host(bus);
> -	phb = hose->private_data;
> +	phb = pci_bus_to_pnvhb(pdev->bus);
>  	pdn = pci_get_pdn(pdev);
>  
>  	if (phb->type == PNV_PHB_IODA2) {
> @@ -1816,8 +1786,7 @@ static int pnv_pci_ioda_dma_64bit_bypass(struct pnv_ioda_pe *pe)
>  static bool pnv_pci_ioda_iommu_bypass_supported(struct pci_dev *pdev,
>  		u64 dma_mask)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	struct pnv_ioda_pe *pe;
>  
> @@ -2866,8 +2835,7 @@ static void pnv_pci_init_ioda_msis(struct pnv_phb *phb)
>  #ifdef CONFIG_PCI_IOV
>  static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	const resource_size_t gate = phb->ioda.m64_segsize >> 2;
>  	struct resource *res;
>  	int i;
> @@ -3202,10 +3170,9 @@ static void pnv_pci_ioda_fixup(void)
>  static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
>  						unsigned long type)
>  {
> -	struct pci_dev *bridge;
> -	struct pci_controller *hose = pci_bus_to_host(bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	int num_pci_bridges = 0;
> +	struct pci_dev *bridge;


Is this definition movement an oversight or christmastreefication? This
is not skiboot though ;)

This looks like an unrelated change.


>  
>  	bridge = bus->self;
>  	while (bridge) {
> @@ -3291,8 +3258,7 @@ static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
>  
>  static void pnv_pci_configure_bus(struct pci_bus *bus)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	struct pci_dev *bridge = bus->self;
>  	struct pnv_ioda_pe *pe;
>  	bool all = (bridge && pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
> @@ -3354,8 +3320,7 @@ static resource_size_t pnv_pci_default_alignment(void)
>  static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
>  						      int resno)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	resource_size_t align;
>  
> @@ -3391,8 +3356,7 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
>   */
>  static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(dev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
>  	struct pci_dn *pdn;
>  
>  	/* The function is probably called while the PEs have
> @@ -3577,8 +3541,7 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
>  
>  static void pnv_pci_release_device(struct pci_dev *pdev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	struct pnv_ioda_pe *pe;
>  
> @@ -3623,8 +3586,7 @@ static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
>  
>  void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	struct pnv_ioda_pe *pe;
>  
> @@ -3664,8 +3626,7 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  
>  void pnv_pci_dma_bus_setup(struct pci_bus *bus)
>  {
> -	struct pci_controller *hose = bus->sysdata;
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	struct pnv_ioda_pe *pe;
>  
>  	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> @@ -3999,8 +3960,7 @@ void __init pnv_pci_init_npu2_opencapi_phb(struct device_node *np)
>  
>  static void pnv_npu2_opencapi_cfg_size_fixup(struct pci_dev *dev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(dev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
>  
>  	if (!machine_is(powernv))
>  		return;
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 8b9058b52575..d36dde9777aa 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -158,8 +158,7 @@ EXPORT_SYMBOL_GPL(pnv_pci_set_power_state);
>  
>  int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct msi_desc *entry;
>  	struct msi_msg msg;
>  	int hwirq;
> @@ -207,8 +206,7 @@ int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
>  
>  void pnv_teardown_msi_irqs(struct pci_dev *pdev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct msi_desc *entry;
>  	irq_hw_number_t hwirq;
>  
> @@ -820,10 +818,9 @@ EXPORT_SYMBOL(pnv_pci_get_phb_node);
>  
>  int pnv_pci_set_tunnel_bar(struct pci_dev *dev, u64 addr, int enable)
>  {
> -	__be64 val;
> -	struct pci_controller *hose;
> -	struct pnv_phb *phb;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
>  	u64 tunnel_bar;
> +	__be64 val;



or here...

>  	int rc;
>  
>  	if (!opal_check_token(OPAL_PCI_GET_PBCQ_TUNNEL_BAR))
> @@ -831,9 +828,6 @@ int pnv_pci_set_tunnel_bar(struct pci_dev *dev, u64 addr, int enable)
>  	if (!opal_check_token(OPAL_PCI_SET_PBCQ_TUNNEL_BAR))
>  		return -ENXIO;
>  
> -	hose = pci_bus_to_host(dev->bus);
> -	phb = hose->private_data;
> -
>  	mutex_lock(&tunnel_mutex);
>  	rc = opal_pci_get_pbcq_tunnel_bar(phb->opal_id, &val);
>  	if (rc != OPAL_SUCCESS) {
> @@ -937,15 +931,13 @@ static int pnv_tce_iommu_bus_notifier(struct notifier_block *nb,
>  	struct pci_dev *pdev;
>  	struct pci_dn *pdn;
>  	struct pnv_ioda_pe *pe;
> -	struct pci_controller *hose;
>  	struct pnv_phb *phb;
>  
>  	switch (action) {
>  	case BUS_NOTIFY_ADD_DEVICE:
>  		pdev = to_pci_dev(dev);
>  		pdn = pci_get_pdn(pdev);
> -		hose = pci_bus_to_host(pdev->bus);
> -		phb = hose->private_data;
> +		phb = pci_bus_to_pnvhb(pdev->bus);
>  
>  		WARN_ON_ONCE(!phb);
>  		if (!pdn || pdn->pe_number == IODA_INVALID_PE || !phb)
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index a343f3c8e65c..be435a810d19 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -247,4 +247,14 @@ extern void pnv_pci_setup_iommu_table(struct iommu_table *tbl,
>  		void *tce_mem, u64 tce_size,
>  		u64 dma_offset, unsigned int page_shift);
>  
> +static inline struct pnv_phb *pci_bus_to_pnvhb(struct pci_bus *bus)
> +{
> +	struct pci_controller *hose = bus->sysdata;
> +
> +	if (hose)
> +		return hose->private_data;


And since I am commenting on this, usually it is the other way around,
like "if (!hose) return NULL" but I do not insist.
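To be concrete about "the other way around", this is the early-return shape I mean. It is a standalone sketch with invented demo_* structures standing in for pci_bus, pci_controller and pnv_phb, so take it as an illustration of the style only:

```c
#include <stddef.h>

/* Hypothetical stand-ins for pnv_phb, pci_controller and pci_bus. */
struct demo_phb {
	int opal_id;
};

struct demo_controller {
	void *private_data;
};

struct demo_bus {
	void *sysdata;
};

static inline struct demo_phb *demo_bus_to_phb(struct demo_bus *bus)
{
	struct demo_controller *hose = bus->sysdata;

	/* Bail out first, so the normal path is the last statement. */
	if (!hose)
		return NULL;

	return hose->private_data;
}
```

Behaviour is identical to the version in the patch; only the reading order changes.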


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



> +
> +	return NULL;
> +}
> +
>  #endif /* __POWERNV_PCI_H */
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [Very RFC 19/46] powernv/eeh: Use standard PCI capability lookup functions
  2019-11-20  1:28 ` [Very RFC 19/46] powernv/eeh: Use standard PCI capability lookup functions Oliver O'Halloran
@ 2019-11-25  1:02   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-25  1:02 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> We have a pci_dev so we can use the functions provided by the PCI core for
> looking up capabilities. This should be safe since these are only called
> when initialising the eeh_dev when the device is first probed and not in
> the EEH recovery path where config accesses are blocked.
> 
> This might cause a problem if an EEH event occurred while probing the device,
> but I'm pretty sure that's going to be broken anyway.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 67 ++------------------

I like this diffstat :)

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
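
As a reminder of what is being replaced, here is a rough, self-contained
sketch of a legacy capability-list walk, in the spirit of the removed
pnv_eeh_find_cap(). It is an illustration only and not the PCI core's
implementation: it reads from a 256-byte config space image in memory rather
than issuing config cycles, and find_cap() is a made-up name. The register
offsets and capability IDs are the standard ones from pci_regs.h.

```c
#include <stdint.h>

#define PCI_STATUS		0x06	/* 16-bit status register */
#define PCI_STATUS_CAP_LIST	0x10	/* device implements a capability list */
#define PCI_CAPABILITY_LIST	0x34	/* offset of the first capability */
#define PCI_CAP_LIST_ID		0	/* capability ID byte */
#define PCI_CAP_LIST_NEXT	1	/* pointer to the next capability */
#define PCI_CAP_ID_PCIX		0x07
#define PCI_CAP_ID_EXP		0x10

/*
 * Walk the capability list in a config space image.
 * Returns the offset of the capability, or 0 if it is not present.
 */
static int find_cap(const uint8_t *cfg, int cap)
{
	uint16_t status = cfg[PCI_STATUS] | (cfg[PCI_STATUS + 1] << 8);
	int ttl = 48;	/* bound the walk in case the list loops */
	int pos;
	uint8_t id;

	if (!(status & PCI_STATUS_CAP_LIST))
		return 0;

	pos = cfg[PCI_CAPABILITY_LIST];
	while (ttl--) {
		/* Capabilities live above the standard header. */
		if (pos < 0x40)
			break;

		pos &= ~3;	/* entries are dword aligned */
		id = cfg[pos + PCI_CAP_LIST_ID];
		if (id == 0xff)
			break;
		if (id == cap)
			return pos;

		pos = cfg[pos + PCI_CAP_LIST_NEXT];
	}

	return 0;
}
```

The patch drops this kind of hand-rolled walk in favour of
pci_find_capability() and pci_find_ext_capability(), which is where the
63 removed lines come from.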



>  1 file changed, 4 insertions(+), 63 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index a974822c5097..b79aca8368c6 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -358,65 +358,6 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
>  	return pnv_pci_cfg_write(pdn, where, size, val);
>  }
>  
> -static int pnv_eeh_find_cap(struct eeh_dev *edev, int cap)
> -{
> -	int pos = PCI_CAPABILITY_LIST;
> -	int cnt = 48;   /* Maximal number of capabilities */
> -	u32 status, id;
> -
> -	/* Check if the device supports capabilities */
> -	pnv_eeh_read_config(edev, PCI_STATUS, 2, &status);
> -	if (!(status & PCI_STATUS_CAP_LIST))
> -		return 0;
> -
> -	while (cnt--) {
> -		pnv_eeh_read_config(edev, pos, 1, &pos);
> -		if (pos < 0x40)
> -			break;
> -
> -		pos &= ~3;
> -		pnv_eeh_read_config(edev, pos + PCI_CAP_LIST_ID, 1, &id);
> -		if (id == 0xff)
> -			break;
> -
> -		/* Found */
> -		if (id == cap)
> -			return pos;
> -
> -		/* Next one */
> -		pos += PCI_CAP_LIST_NEXT;
> -	}
> -
> -	return 0;
> -}
> -
> -static int pnv_eeh_find_ecap(struct eeh_dev *edev, int cap)
> -{
> -	u32 header;
> -	int pos = 256, ttl = (4096 - 256) / 8;
> -
> -	if (!edev || !edev->pcie_cap)
> -		return 0;
> -	if (pnv_eeh_read_config(edev, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
> -		return 0;
> -	else if (!header)
> -		return 0;
> -
> -	while (ttl-- > 0) {
> -		if (PCI_EXT_CAP_ID(header) == cap && pos)
> -			return pos;
> -
> -		pos = PCI_EXT_CAP_NEXT(header);
> -		if (pos < 256)
> -			break;
> -
> -		if (pnv_eeh_read_config(edev, pos, 4, &header) != PCIBIOS_SUCCESSFUL)
> -			break;
> -	}
> -
> -	return 0;
> -}
> -
>  /**
>   * pnv_eeh_probe - Do probe on PCI device
>   * @pdev: pci_dev to probe
> @@ -458,10 +399,10 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  
>  	/* Initialize eeh device */
>  	edev->class_code = pdn->class_code;
> -	edev->pcix_cap = pnv_eeh_find_cap(edev, PCI_CAP_ID_PCIX);
> -	edev->pcie_cap = pnv_eeh_find_cap(edev, PCI_CAP_ID_EXP);
> -	edev->af_cap   = pnv_eeh_find_cap(edev, PCI_CAP_ID_AF);
> -	edev->aer_cap  = pnv_eeh_find_ecap(edev, PCI_EXT_CAP_ID_ERR);
> +	edev->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
> +	edev->pcie_cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
> +	edev->af_cap   = pci_find_capability(pdev, PCI_CAP_ID_AF);
> +	edev->aer_cap  = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
>  	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
>  		edev->mode |= EEH_DEV_BRIDGE;
>  		if (edev->pcie_cap) {
> 

-- 
Alexey


* Re: [Very RFC 20/46] powernv/eeh: Look up device info from pci_dev
  2019-11-20  1:28 ` [Very RFC 20/46] powernv/eeh: Look up device info from pci_dev Oliver O'Halloran
@ 2019-11-25  1:26   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-25  1:26 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Most of what we fetch from the pci_dn is also in the pci_dev structure. Convert
> the pnv_eeh_probe_pdev() to use the pdev fields rather than the pci_dn so we can
> get rid of pci_dn eventually.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 26 ++++++++++----------
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index b79aca8368c6..6ba74836a9f8 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -372,7 +372,7 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>  	uint32_t pcie_flags;
>  	int ret;
> -	int config_addr = (pdn->busno << 8) | (pdn->devfn);
> +	int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
>  
>  	/*
>  	 * When probing the root bridge, which doesn't have any
> @@ -392,18 +392,18 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  	}
>  
>  	/* Skip for PCI-ISA bridge */
> -	if ((pdn->class_code >> 8) == PCI_CLASS_BRIDGE_ISA)
> +	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
>  		return NULL;
>  
>  	eeh_edev_dbg(edev, "Probing device\n");
>  
>  	/* Initialize eeh device */
> -	edev->class_code = pdn->class_code;
> +	edev->class_code = pdev->class;
>  	edev->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
>  	edev->pcie_cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
>  	edev->af_cap   = pci_find_capability(pdev, PCI_CAP_ID_AF);
>  	edev->aer_cap  = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
> -	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
> +	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_PCI) {
>  		edev->mode |= EEH_DEV_BRIDGE;
>  		if (edev->pcie_cap) {
>  			pnv_eeh_read_config(edev, edev->pcie_cap + PCI_EXP_FLAGS,
> @@ -443,14 +443,14 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  	 * Broadcom Shiner 4-ports 1G NICs (14e4:168a)
>  	 * Broadcom Shiner 2-ports 10G NICs (14e4:168e)
>  	 */
> -	if ((pdn->vendor_id == PCI_VENDOR_ID_BROADCOM &&


This very much looks like you can get rid of
pci_dn::vendor_id/device_id/class now.


Anyway

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
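
Not part of the patch, but since the quirk check now only needs the IDs
cached in the pci_dev, the four Broadcom comparisons in this hunk could also
be written as a small table lookup. A minimal standalone sketch, assuming
nothing beyond the vendor/device IDs quoted in the hunk; struct quirk_id and
needs_cfg_restricted() are made-up names for illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_BROADCOM	0x14e4

/* Hypothetical helper type: one vendor/device pair per entry. */
struct quirk_id {
	uint16_t vendor;
	uint16_t device;
};

/* The four device IDs checked in the quoted hunk. */
static const struct quirk_id cfg_restricted_ids[] = {
	{ PCI_VENDOR_ID_BROADCOM, 0x1656 },
	{ PCI_VENDOR_ID_BROADCOM, 0x1657 },
	{ PCI_VENDOR_ID_BROADCOM, 0x168a },
	{ PCI_VENDOR_ID_BROADCOM, 0x168e },
};

/* True if the device is one that needs EEH_PE_CFG_RESTRICTED set. */
static bool needs_cfg_restricted(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(cfg_restricted_ids) / sizeof(cfg_restricted_ids[0]); i++) {
		if (cfg_restricted_ids[i].vendor == vendor &&
		    cfg_restricted_ids[i].device == device)
			return true;
	}

	return false;
}
```

In the kernel proper this would more likely be a pci_device_id table, but
the idea is the same: the pci_dn copies of the IDs are no longer needed.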




> -	     pdn->device_id == 0x1656) ||
> -	    (pdn->vendor_id == PCI_VENDOR_ID_BROADCOM &&
> -	     pdn->device_id == 0x1657) ||
> -	    (pdn->vendor_id == PCI_VENDOR_ID_BROADCOM &&
> -	     pdn->device_id == 0x168a) ||
> -	    (pdn->vendor_id == PCI_VENDOR_ID_BROADCOM &&
> -	     pdn->device_id == 0x168e))
> +	if ((pdev->vendor == PCI_VENDOR_ID_BROADCOM &&
> +	     pdev->device == 0x1656) ||
> +	    (pdev->vendor == PCI_VENDOR_ID_BROADCOM &&
> +	     pdev->device == 0x1657) ||
> +	    (pdev->vendor == PCI_VENDOR_ID_BROADCOM &&
> +	     pdev->device == 0x168a) ||
> +	    (pdev->vendor == PCI_VENDOR_ID_BROADCOM &&
> +	     pdev->device == 0x168e))
>  		edev->pe->state |= EEH_PE_CFG_RESTRICTED;
>  
>  	/*
> @@ -461,7 +461,7 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  	 */
>  	if (!(edev->pe->state & EEH_PE_PRI_BUS)) {
>  		edev->pe->bus = pci_find_bus(hose->global_number,
> -					     pdn->busno);
> +					     pdev->bus->number);
>  		if (edev->pe->bus)
>  			edev->pe->state |= EEH_PE_PRI_BUS;
>  	}
> 

-- 
Alexey


* Re: [Very RFC 21/46] powernv/eeh: Rework finding an existing edev in probe_pdev()
  2019-11-20  1:28 ` [Very RFC 21/46] powernv/eeh: Rework finding an existing edev in probe_pdev() Oliver O'Halloran
@ 2019-11-25  3:20   ` Alexey Kardashevskiy
  2019-11-25  4:17     ` Oliver O'Halloran
  0 siblings, 1 reply; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-25  3:20 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Use the pnv_eeh_find_edev() helper to look up the eeh_dev for a device
> rather than doing it via the pci_dn.

This is not what the patch does. I struggle to see what it actually
does.


> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 44 ++++++++++++++------
>  1 file changed, 31 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 6ba74836a9f8..1cd80b399995 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -374,20 +374,40 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  	int ret;
>  	int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
>  
> +	pci_dbg(pdev, "%s: probing\n", __func__);
> +
>  	/*
> -	 * When probing the root bridge, which doesn't have any
> -	 * subordinate PCI devices. We don't have OF node for
> -	 * the root bridge. So it's not reasonable to continue
> -	 * the probing.
> +	 * EEH keeps the eeh_dev alive over a recovery pass even when the
> +	 * corresponding pci_dev has been torn down. In that case we need
> +	 * to find the existing eeh_dev and re-bind the two.
>  	 */
> -	if (!edev || edev->pe)
> -		return NULL;
> +	edev = pnv_eeh_find_edev(phb, config_addr);


What was @edev before this line?


> +	if (edev) {
> +		eeh_edev_dbg(edev, "Found existing edev!\n");
> +
> +		/*
> +		 * XXX: eeh_remove_device() clears pdev so we shouldn't hit this
> +		 * normally. I've found that screwing around with the pci probe
> +		 * path can result in eeh_probe_pdev() being called twice. This
> +		 * is harmless at the moment, but it's pretty strange so emit a
> +		 * warning to be on the safe side.
> +		 */
> +		if (WARN_ON(edev->pdev))
> +			eeh_edev_dbg(edev, "%s: already bound to a pdev!\n", __func__);
> +
> +		edev->pdev = pdev;
> +
> +		/* should we be doing something with REMOVED too? */
> +		edev->mode &= EEH_DEV_DISCONNECTED;
> +
> +		/* update the primary bus if we need to */
> +		// XXX: why do we need to do this? is the pci_bus going away? what cleared the flag?

From just reading this patch alone: if you do not know why we need it,
then why did you add it here (it is not cut-n-paste)? Thanks,



> +		if (!(edev->pe->state & EEH_PE_PRI_BUS)) {
> +			edev->pe->bus = pdev->bus;
> +			if (edev->pe->bus)
> +				edev->pe->state |= EEH_PE_PRI_BUS;
> +		}
>  
> -	/* already configured? */
> -	if (edev->pdev) {
> -		pr_debug("%s: found existing edev for %04x:%02x:%02x.%01x\n",
> -			__func__, hose->global_number, config_addr >> 8,
> -			PCI_SLOT(config_addr), PCI_FUNC(config_addr));
>  		return edev;
>  	}
>  
> @@ -395,8 +415,6 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
>  		return NULL;
>  
> -	eeh_edev_dbg(edev, "Probing device\n");
> -
>  	/* Initialize eeh device */
>  	edev->class_code = pdev->class;
>  	edev->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
> 

-- 
Alexey


* Re: [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed
  2019-11-20  1:28 ` [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed Oliver O'Halloran
@ 2019-11-25  3:27   ` Alexey Kardashevskiy
  2019-11-25  4:26     ` Oliver O'Halloran
  0 siblings, 1 reply; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-25  3:27 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Have the PowerNV EEH backend allocate the eeh_dev if needed rather than using
> the one attached to the pci_dn. 

So that pci_dn attached one is leaked then?


> This gets us most of the way towards decoupling
> pci_dn from the PowerNV EEH code.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> We should probably be free()ing the eeh_dev somewhere. The pci_dev release
> function is the right place for it.
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 22 ++++++++++++++++----
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 1cd80b399995..7aba18e08996 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -366,10 +366,9 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
>   */
>  static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  {
> -	struct pci_dn *pdn = pci_get_pdn(pdev);
> -	struct pci_controller *hose = pdn->phb;
> -	struct pnv_phb *phb = hose->private_data;
> -	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
> +	struct pci_controller *hose = phb->hose;
> +	struct eeh_dev *edev;
>  	uint32_t pcie_flags;
>  	int ret;
>  	int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
> @@ -415,12 +414,27 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
>  		return NULL;
>  
> +	/* otherwise allocate and initialise a new eeh_dev */
> +	edev = kzalloc(sizeof(*edev), GFP_KERNEL);
> +	if (!edev) {
> +		pr_err("%s: out of memory lol\n", __func__);

"lol"? I am pretty sure we do not have to print anything if alloc failed
as alloc prints an error anyway. Thanks,
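
To make the suggestion concrete, here is a small userspace sketch of the
allocation path without the message. It is only an analogy: calloc() stands
in for kzalloc(), and struct eeh_dev_sketch and alloc_edev() are made-up
names with a couple of fields borrowed from the hunk below.

```c
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for struct eeh_dev. */
struct eeh_dev_sketch {
	int bdfn;
	unsigned int class_code;
};

/* Allocate and initialise a device record; returns NULL on failure. */
static struct eeh_dev_sketch *alloc_edev(int bdfn, unsigned int class_code)
{
	struct eeh_dev_sketch *edev = calloc(1, sizeof(*edev));

	/*
	 * No message here: in the kernel the allocator already reports
	 * the failure, so the caller only needs to propagate it.
	 */
	if (!edev)
		return NULL;

	edev->bdfn = bdfn;
	edev->class_code = class_code;

	return edev;
}
```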


> +		return NULL;
> +	}
> +
>  	/* Initialize eeh device */
> +	edev->bdfn       = config_addr;
> +	edev->controller = phb->hose;
> +
>  	edev->class_code = pdev->class;
>  	edev->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
>  	edev->pcie_cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
>  	edev->af_cap   = pci_find_capability(pdev, PCI_CAP_ID_AF);
>  	edev->aer_cap  = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
> +
> +	/* TODO: stash the vf_index in here? */
> +	if (pdev->is_virtfn)
> +		edev->physfn = pdev->physfn;
> +
>  	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_PCI) {
>  		edev->mode |= EEH_DEV_BRIDGE;
>  		if (edev->pcie_cap) {
> 

-- 
Alexey


* Re: [Very RFC 21/46] powernv/eeh: Rework finding an existing edev in probe_pdev()
  2019-11-25  3:20   ` Alexey Kardashevskiy
@ 2019-11-25  4:17     ` Oliver O'Halloran
  0 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-25  4:17 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alistair Popple, linuxppc-dev, Sergey Miroshnichenko

On Mon, Nov 25, 2019 at 2:20 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> > Use the pnv_eeh_find_edev() helper to look up the eeh_dev for a device
> > rather than doing it via the pci_dn.
>
> This is not what the patch does. I struggle to see what it actually
> does.

Hmm, looks like a rebase screw up. This patch and the following one
(22/46) used to be one patch, but I thought it was getting a bit large
and split them.

> > Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> > ---
> >  arch/powerpc/platforms/powernv/eeh-powernv.c | 44 ++++++++++++++------
> >  1 file changed, 31 insertions(+), 13 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > index 6ba74836a9f8..1cd80b399995 100644
> > --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> > +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > @@ -374,20 +374,40 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
> >       int ret;
> >       int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
> >
> > +     pci_dbg(pdev, "%s: probing\n", __func__);
> > +
> >       /*
> > -      * When probing the root bridge, which doesn't have any
> > -      * subordinate PCI devices. We don't have OF node for
> > -      * the root bridge. So it's not reasonable to continue
> > -      * the probing.
> > +      * EEH keeps the eeh_dev alive over a recovery pass even when the
> > +      * corresponding pci_dev has been torn down. In that case we need
> > +      * to find the existing eeh_dev and re-bind the two.
> >        */
> > -     if (!edev || edev->pe)
> > -             return NULL;
> > +     edev = pnv_eeh_find_edev(phb, config_addr);
>
>
> What was @edev before this line?

22/46 has the following hunk which should probably be in this patch:

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c
b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 1cd80b3..7aba18e 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -366,10 +366,9 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
  */
 static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
 {
-       struct pci_dn *pdn = pci_get_pdn(pdev);
-       struct pci_controller *hose = pdn->phb;
-       struct pnv_phb *phb = hose->private_data;
-       struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+       struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
+       struct pci_controller *hose = phb->hose;
+       struct eeh_dev *edev;

> > +     if (edev) {
> > +             eeh_edev_dbg(edev, "Found existing edev!\n");
> > +
> > +             /*
> > +              * XXX: eeh_remove_device() clears pdev so we shouldn't hit this
> > +              * normally. I've found that screwing around with the pci probe
> > +              * path can result in eeh_probe_pdev() being called twice. This
> > +              * is harmless at the moment, but it's pretty strange so emit a
> > +              * warning to be on the safe side.
> > +              */
> > +             if (WARN_ON(edev->pdev))
> > +                     eeh_edev_dbg(edev, "%s: already bound to a pdev!\n", __func__);
> > +
> > +             edev->pdev = pdev;
> > +
> > +             /* should we be doing something with REMOVED too? */
> > +             edev->mode &= EEH_DEV_DISCONNECTED;
> > +
> > +             /* update the primary bus if we need to */
> > +             // XXX: why do we need to do this? is the pci_bus going away? what cleared the flag?
>
> From just reading this patch alone: if you do not know why we need it,

There's a few comments in here that are essentially notes to myself
that I thought other people might be able to shed light on. The series
is tagged "Very RFC" for a reason ;)

> then why did you add it here (it is not cut-n-paste)? Thanks,

dunno lol


* Re: [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed
  2019-11-25  3:27   ` Alexey Kardashevskiy
@ 2019-11-25  4:26     ` Oliver O'Halloran
  2019-11-27  1:50       ` Alexey Kardashevskiy
  0 siblings, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-25  4:26 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alistair Popple, linuxppc-dev, Sergey Miroshnichenko

On Mon, Nov 25, 2019 at 2:27 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> > Have the PowerNV EEH backend allocate the eeh_dev if needed rather than using
> > the one attached to the pci_dn.
>
> So that pci_dn attached one is leaked then?

Sorta, the eeh_dev attached to the pci_dn is supposed to have the same
lifetime as the pci_dn it's attached to. Whatever frees the pci_dn
should also be freeing the eeh_dev, but I'm pretty sure the only
situation where that actually happens is when removing the pci_dn for
VFs. It's bad.

> > This gets us most of the way towards decoupling
> > pci_dn from the PowerNV EEH code.
> >
> > Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> > ---
> > We should probably be free()ing the eeh_dev somewhere. The pci_dev release
> > function is the right place for it.
> > ---
> >  arch/powerpc/platforms/powernv/eeh-powernv.c | 22 ++++++++++++++++----
> >  1 file changed, 18 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > index 1cd80b399995..7aba18e08996 100644
> > --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> > +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> > @@ -366,10 +366,9 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
> >   */
> >  static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
> >  {
> > -     struct pci_dn *pdn = pci_get_pdn(pdev);
> > -     struct pci_controller *hose = pdn->phb;
> > -     struct pnv_phb *phb = hose->private_data;
> > -     struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
> > +     struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
> > +     struct pci_controller *hose = phb->hose;
> > +     struct eeh_dev *edev;
> >       uint32_t pcie_flags;
> >       int ret;
> >       int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
> > @@ -415,12 +414,27 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
> >       if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
> >               return NULL;
> >
> > +     /* otherwise allocate and initialise a new eeh_dev */
> > +     edev = kzalloc(sizeof(*edev), GFP_KERNEL);
> > +     if (!edev) {
> > +             pr_err("%s: out of memory lol\n", __func__);
>
> "lol"?

yeah lol

> I am pretty sure we do not have to print anything if alloc failed
> as alloc prints an error anyway. Thanks,

It does? Neat.


* Re: [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov()
  2019-11-21  7:48   ` Christoph Hellwig
@ 2019-11-25  4:39     ` Oliver O'Halloran
  0 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-25  4:39 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Alistair Popple, linuxppc-dev, Sergey Miroshnichenko

On Thu, Nov 21, 2019 at 6:48 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Nov 20, 2019 at 12:28:19PM +1100, Oliver O'Halloran wrote:
> > Move this out of the PHB's dma_dev_setup() callback and into the
> > ppc_md.pcibios_fixup_iov callback. This ensures that the VF PE's
> > pdev pointer is always valid for the whole time the device is
> > added to the bus.
> >
> > This isn't strictly required, but it's slightly a slightly more logical
> > place to do the fixup and it makes dma_dev_setup a bit simpler.
>
> Ok, this removes the code I commented on earlier, so I take my
> comment there back.

It is a bit weird. I'll re-order the two patches so we're not
shovelling around the fixup junk.

> > +     if (pdev->is_virtfn) {
> > +             /* Fix the VF PE's pdev pointer */
> > +             struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
> > +             pe->pdev = pdev;
>
> Maybe add an empty line after the variable declaration?

ok

> > @@ -3641,20 +3654,6 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
> >  {
> >       struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> >       struct pnv_phb *phb = hose->private_data;
> >
> >       pnv_pci_ioda_dma_dev_setup(phb, pdev);
> >  }
>
> Can you just merge pnv_pci_dma_dev_setup and pnv_pci_ioda_dma_dev_setup
> now?

Oh cool, looks like we can. I'll do that.


* Re: [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov()
  2019-11-21  4:34   ` Alexey Kardashevskiy
@ 2019-11-25  4:41     ` Oliver O'Halloran
  0 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-25  4:41 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alistair Popple, linuxppc-dev, Sergey Miroshnichenko

On Thu, Nov 21, 2019 at 3:34 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> > Move this out of the PHB's dma_dev_setup() callback and into the
> > ppc_md.pcibios_fixup_iov callback. This ensures that the VF PE's
> > pdev pointer is always valid for the whole time the device is
> > added to the bus.
>
> Yeah it would be nice if dma setup did just dma stuff.
>
> > This isn't strictly required, but it's slightly a slightly more logical
>
> s/slightly a slightly/slightly (slightly)/ ? :)
>
>
> > place to do the fixup and it makes dma_dev_setup a bit simpler.
> >
> > Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> > ---
> >  arch/powerpc/platforms/powernv/pci-ioda.c | 35 +++++++++++------------
> >  1 file changed, 17 insertions(+), 18 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 45f974258766..c6ea7a504e04 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -2910,9 +2910,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
> >       struct pci_dn *pdn;
> >       int mul, total_vfs;
> >
> > -     if (!pdev->is_physfn || pci_dev_is_added(pdev))
> > -             return;
> > -
> >       pdn = pci_get_pdn(pdev);
> >       pdn->vfs_expanded = 0;
> >       pdn->m64_single_mode = false;
> > @@ -2987,6 +2984,22 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
> >               res->end = res->start - 1;
> >       }
> >  }
> > +
> > +static void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
> > +{
> > +     if (WARN_ON(pci_dev_is_added(pdev)))
> > +             return;
> > +
> > +     if (pdev->is_virtfn) {
> > +             /* Fix the VF PE's pdev pointer */
> > +             struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
> > +             pe->pdev = pdev;
> > +
> > +             WARN_ON(!(pe->flags & PNV_IODA_PE_VF));
>
>
> return;
>
> > +     } else if (pdev->is_physfn) {
>
>
>
> > +             pnv_pci_ioda_fixup_iov_resources(pdev);
>
>
> and open code pnv_pci_ioda_fixup_iov_resources() right here?

pnv_pci_ioda_fixup_iov_resources() is pretty hairy so I'd rather keep
it as a separate function. I'd like to get rid of it entirely at some
point, but that's a problem for another day.


* Re: [Very RFC 23/46] powerpc/eeh: Moving finding the parent PE into the platform
  2019-11-20  1:28 ` [Very RFC 23/46] powerpc/eeh: Moving finding the parent PE into the platform Oliver O'Halloran
@ 2019-11-25  5:00   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-25  5:00 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Currently the generic EEH code uses the pci_dn of a device to look up the
> PE of the device's parent bridge, or physical function. The generic
> function to insert the edev (and possibly create the eeh_pe) is called
> from the probe functions already so this is a relatively minor change.
> 
> The existing lookup method moves into the pseries platform and PowerNV
> can choose the PE based on the bus hierarchy instead.


The pseries search is also based on sort of bus hierarchy (the pci_dn
tree is that). A short essay about the difference between PCI trees and
pci_dn trees would help (reminder: draw that picture of a set of trees
and lists we got there).
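
To illustrate the point about both lookups being tree walks, here is a
minimal sketch of the "first ancestor that already has a PE" search, roughly
the shape of the eeh_pe_get_parent() being removed below, minus the physfn
special case. The struct names are made up; in the pseries case the nodes
would be pci_dn entries, while PowerNV looks at the parent pci_dev instead.

```c
#include <stddef.h>

/* Hypothetical stand-ins: a PE, and a tree node that may belong to one. */
struct pe_sketch {
	int number;
};

struct node_sketch {
	struct node_sketch *parent;	/* NULL at the top of the tree */
	struct pe_sketch *pe;		/* NULL if no PE assigned yet */
};

/*
 * Walk up from the device's parent and return the first PE found.
 * Returns NULL if no ancestor has one, in which case the caller falls
 * back to the PHB PE.
 */
static struct pe_sketch *find_parent_pe(const struct node_sketch *dev)
{
	const struct node_sketch *n;

	for (n = dev->parent; n; n = n->parent) {
		if (n->pe)
			return n->pe;
	}

	return NULL;
}
```

The walk itself is identical whichever tree it runs over; what differs
between the platforms is which tree supplies the parent pointers.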


> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> "parent" meaning "parent of the PE that actually contains this edev"
> is stupid, but it's stupid consistent with what's there already. Also
> I couldn't think of a way to fix it without adding a bunch of boring
> boilerplate at the call sites.
> 
> FIXME: I think I introduced a bug here. Currently we coalesce a switch's
> upstream port bus and the downstream port bus into a single PE since
> they're a single failure domain. That seems to have been broken by
> this patch, but whatever.


Not clear at all where/how this got broken, pointers would help,
preferably as a comment in the code.

Apart from that,

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



Thanks,


> ---
>  arch/powerpc/include/asm/eeh.h               |  2 +-
>  arch/powerpc/kernel/eeh_pe.c                 | 54 ++++-------------
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 25 +++++++-
>  arch/powerpc/platforms/pseries/eeh_pseries.c | 61 ++++++++++++++++----
>  4 files changed, 86 insertions(+), 56 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index e109bfd3dd57..70d3e01dbe9d 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -295,7 +295,7 @@ struct eeh_pe *eeh_phb_pe_get(struct pci_controller *phb);
>  struct eeh_pe *eeh_pe_next(struct eeh_pe *pe, struct eeh_pe *root);
>  struct eeh_pe *eeh_pe_get(struct pci_controller *phb,
>  			  int pe_no, int config_addr);
> -int eeh_add_to_parent_pe(struct eeh_dev *edev);
> +int eeh_add_to_parent_pe(struct eeh_pe *parent, struct eeh_dev *edev);
>  int eeh_rmv_from_parent_pe(struct eeh_dev *edev);
>  void eeh_pe_update_time_stamp(struct eeh_pe *pe);
>  void *eeh_pe_traverse(struct eeh_pe *root,
> diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
> index 831f363f1732..520c249f19d3 100644
> --- a/arch/powerpc/kernel/eeh_pe.c
> +++ b/arch/powerpc/kernel/eeh_pe.c
> @@ -318,56 +318,23 @@ struct eeh_pe *eeh_pe_get(struct pci_controller *phb,
>  	return pe;
>  }
>  
> -/**
> - * eeh_pe_get_parent - Retrieve the parent PE
> - * @edev: EEH device
> - *
> - * The whole PEs existing in the system are organized as hierarchy
> - * tree. The function is used to retrieve the parent PE according
> - * to the parent EEH device.
> - */
> -static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
> -{
> -	struct eeh_dev *parent;
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> -
> -	/*
> -	 * It might have the case for the indirect parent
> -	 * EEH device already having associated PE, but
> -	 * the direct parent EEH device doesn't have yet.
> -	 */
> -	if (edev->physfn)
> -		pdn = pci_get_pdn(edev->physfn);
> -	else
> -		pdn = pdn ? pdn->parent : NULL;
> -	while (pdn) {
> -		/* We're poking out of PCI territory */
> -		parent = pdn_to_eeh_dev(pdn);
> -		if (!parent)
> -			return NULL;
> -
> -		if (parent->pe)
> -			return parent->pe;
> -
> -		pdn = pdn->parent;
> -	}
> -
> -	return NULL;
> -}
> -
>  /**
>   * eeh_add_to_parent_pe - Add EEH device to parent PE
> + * @parent: PE to create additional PEs under
>   * @edev: EEH device
>   *
> - * Add EEH device to the parent PE. If the parent PE already
> - * exists, the PE type will be changed to EEH_PE_BUS. Otherwise,
> - * we have to create new PE to hold the EEH device and the new
> - * PE will be linked to its parent PE as well.
> + * Add EEH device to the PE in edev->pe_config_addr. If the PE
> + * already exists then we'll add it to that. Otherwise a new
> + * PE is created, and inserted into the PE tree below @parent.
> + * If @parent is NULL, then it will be inserted under the PHB
> + * PE for edev->controller.
> + *
> + * In either case @edev is added to the PE's device list.
>   */
> -int eeh_add_to_parent_pe(struct eeh_dev *edev)
> +int eeh_add_to_parent_pe(struct eeh_pe *parent, struct eeh_dev *edev)
>  {
>  	int config_addr = edev->bdfn;
> -	struct eeh_pe *pe, *parent;
> +	struct eeh_pe *pe;
>  
>  	/* Check if the PE number is valid */
>  	if (!eeh_has_flag(EEH_VALID_PE_ZERO) && !edev->pe_config_addr) {
> @@ -431,7 +398,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
>  	 * to PHB directly. Otherwise, we have to associate the
>  	 * PE with its parent.
>  	 */
> -	parent = eeh_pe_get_parent(edev);
>  	if (!parent) {
>  		parent = eeh_phb_pe_get(edev->controller);
>  		if (!parent) {
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 7aba18e08996..49a932ff092a 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -358,6 +358,25 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
>  	return pnv_pci_cfg_write(pdn, where, size, val);
>  }
>  
> +static struct eeh_pe *pnv_eeh_pe_get_parent(struct pci_dev *pdev)
> +{
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
> +	struct pci_dev *parent = pdev->bus->self;
> +
> +#ifdef CONFIG_PCI_IOV
> +	if (pdev->is_virtfn)
> +		parent = pdev->physfn;
> +#endif
> +
> +	if (parent) {
> +		struct pnv_ioda_pe *ioda_pe = pnv_ioda_get_pe(parent);
> +
> +		return eeh_pe_get(phb->hose, ioda_pe->pe_number, 0);
> +	}
> +
> +	return NULL;
> +}
> +
>  /**
>   * pnv_eeh_probe - Do probe on PCI device
>   * @pdev: pci_dev to probe
> @@ -368,6 +387,7 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  {
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct pci_controller *hose = phb->hose;
> +	struct eeh_pe *parent_pe;
>  	struct eeh_dev *edev;
>  	uint32_t pcie_flags;
>  	int ret;
> @@ -450,8 +470,11 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>  
>  	edev->pe_config_addr = phb->ioda.pe_rmap[config_addr];
>  
> +	/* find the PE that contains this PE, might be NULL */
> +	parent_pe = pnv_eeh_pe_get_parent(pdev);
> +
>  	/* Create PE */
> -	ret = eeh_add_to_parent_pe(edev);
> +	ret = eeh_add_to_parent_pe(parent_pe, edev);
>  	if (ret) {
>  		eeh_edev_warn(edev, "Failed to add device to PE (code %d)\n", ret);
>  		return NULL;
> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
> index 13a8c274554a..b4a92c24fd45 100644
> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
> @@ -70,11 +70,12 @@ void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
>  	eeh_add_device_early(pdn);
>  #ifdef CONFIG_PCI_IOV
>  	if (pdev->is_virtfn) {
> +		struct eeh_pe *physfn_pe = pci_dev_to_eeh_dev(pdev->physfn)->pe;
>  		struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>  
>  		edev->pe_config_addr =  (pdn->busno << 16) | (pdn->devfn << 8);
>  		eeh_rmv_from_parent_pe(edev); /* Remove as it is adding to bus pe */
> -		eeh_add_to_parent_pe(edev);   /* Add as VF PE type */
> +		eeh_add_to_parent_pe(physfn_pe, edev); /* Add as VF PE type */
>  	}
>  #endif
>  	eeh_add_device_late(pdev);
> @@ -220,6 +221,43 @@ static int pseries_eeh_find_ecap(struct pci_dn *pdn, int cap)
>  	return 0;
>  }
>  
> +/**
> + * pseries_eeh_pe_get_parent - Retrieve the parent PE
> + * @edev: EEH device
> + *
> + * The whole PEs existing in the system are organized as hierarchy
> + * tree. The function is used to retrieve the parent PE according
> + * to the parent EEH device.
> + */
> +static struct eeh_pe *pseries_eeh_pe_get_parent(struct eeh_dev *edev)
> +{
> +	struct eeh_dev *parent;
> +	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> +
> +	/*
> +	 * It might have the case for the indirect parent
> +	 * EEH device already having associated PE, but
> +	 * the direct parent EEH device doesn't have yet.
> +	 */
> +	if (edev->physfn)
> +		pdn = pci_get_pdn(edev->physfn);
> +	else
> +		pdn = pdn ? pdn->parent : NULL;
> +	while (pdn) {
> +		/* We're poking out of PCI territory */
> +		parent = pdn_to_eeh_dev(pdn);
> +		if (!parent)
> +			return NULL;
> +
> +		if (parent->pe)
> +			return parent->pe;
> +
> +		pdn = pdn->parent;
> +	}
> +
> +	return NULL;
> +}
> +
>  /**
>   * pseries_eeh_probe - EEH probe on the given device
>   * @pdn: PCI device node
> @@ -286,10 +324,14 @@ static void pseries_eeh_probe_pdn(struct pci_dn *pdn)
>  	if (ret) {
>  		eeh_edev_dbg(edev, "EEH failed to enable on device (code %d)\n", ret);
>  	} else {
> +		struct eeh_pe *parent;
> +
>  		/* Retrieve PE address */
>  		edev->pe_config_addr = eeh_ops->get_pe_addr(&pe);
>  		pe.addr = edev->pe_config_addr;
>  
> +		parent = pseries_eeh_pe_get_parent(edev);
> +
>  		/* Some older systems (Power4) allow the ibm,set-eeh-option
>  		 * call to succeed even on nodes where EEH is not supported.
>  		 * Verify support explicitly.
> @@ -298,16 +340,15 @@ static void pseries_eeh_probe_pdn(struct pci_dn *pdn)
>  		if (ret > 0 && ret != EEH_STATE_NOT_SUPPORT)
>  			enable = 1;
>  
> -		if (enable) {
> +		/* This device doesn't support EEH, but it may have an
> +		 * EEH parent, in which case we mark it as supported.
> +		 */
> +		if (parent && !enable)
> +			edev->pe_config_addr = parent->addr;
> +
> +		if (enable || parent) {
>  			eeh_add_flag(EEH_ENABLED);
> -			eeh_add_to_parent_pe(edev);
> -		} else if (pdn->parent && pdn_to_eeh_dev(pdn->parent) &&
> -			   (pdn_to_eeh_dev(pdn->parent))->pe) {
> -			/* This device doesn't support EEH, but it may have an
> -			 * EEH parent, in which case we mark it as supported.
> -			 */
> -			edev->pe_config_addr = pdn_to_eeh_dev(pdn->parent)->pe_config_addr;
> -			eeh_add_to_parent_pe(edev);
> +			eeh_add_to_parent_pe(parent, edev);
>  		}
>  		eeh_edev_dbg(edev, "EEH is %s on device (code %d)\n",
>  			     (enable ? "enabled" : "unsupported"), ret);
> 
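
For anyone following along: the parent lookup that moves into the pseries
backend is just a walk up the device-node chain until some ancestor already
has a PE. A minimal userspace sketch of that walk, assuming simplified
stand-in structs (struct node, struct pe and find_parent_pe() are
hypothetical names for illustration, not the kernel types):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct eeh_pe. */
struct pe { int addr; };

/* Hypothetical stand-in for a pci_dn with its attached eeh_dev. */
struct node {
	struct node *parent;	/* upstream node, NULL at the root */
	struct node *physfn;	/* set only for virtual functions */
	int has_edev;		/* 0 once we are outside PCI territory */
	struct pe *pe;		/* PE already associated, if any */
};

/*
 * Start at the physfn for a VF, or at the direct parent otherwise,
 * then walk upwards and return the first PE found.
 */
static struct pe *find_parent_pe(const struct node *dev)
{
	const struct node *n = dev->physfn ? dev->physfn : dev->parent;

	while (n) {
		if (!n->has_edev)	/* no EEH device: left the PCI tree */
			return NULL;
		if (n->pe)		/* nearest ancestor with a PE wins */
			return n->pe;
		n = n->parent;
	}
	return NULL;
}
```

The indirect-parent case from the comment in the patch is the one where the
direct parent has no PE yet, so the walk continues to the grandparent.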

-- 
Alexey

* Re: [Very RFC 24/46] powernv/pci: Make the pre-cfg EEH freeze check use eeh_dev rather than pci_dn
  2019-11-20  1:28 ` [Very RFC 24/46] powernv/pci: Make the pre-cfg EEH freeze check use eeh_dev rather than pci_dn Oliver O'Halloran
@ 2019-11-27  0:21   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  0:21 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Squash another usage in preparation for making the config accessors pci_dn-free.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>


> ---
> We might want to move this into eeh-powernv.c
> ---
>  arch/powerpc/platforms/powernv/pci.c | 37 +++++++++++++---------------
>  1 file changed, 17 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index d36dde9777aa..6170677bfdc7 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -708,30 +708,23 @@ int pnv_pci_cfg_write(struct pci_dn *pdn,
>  }
>  
>  #if CONFIG_EEH
> -static bool pnv_pci_cfg_check(struct pci_dn *pdn)
> +bool pnv_eeh_pre_cfg_check(struct eeh_dev *edev)
>  {
> -	struct eeh_dev *edev = NULL;
> -	struct pnv_phb *phb = pdn->phb->private_data;
> -
> -	/* EEH not enabled ? */
> -	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> +	if (!edev || !edev->pe)
>  		return true;
>  
> -	/* PE reset or device removed ? */
> -	edev = pdn->edev;
> -	if (edev) {
> -		if (edev->pe &&
> -		    (edev->pe->state & EEH_PE_CFG_BLOCKED))
> -			return false;
> +	/* PE in reset? */
> +	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
> +		return false;
>  
> -		if (edev->mode & EEH_DEV_REMOVED)
> -			return false;
> -	}
> +	/* Device removed? */
> +	if (edev->mode & EEH_DEV_REMOVED)
> +		return false;
>  
>  	return true;
>  }
>  #else
> -static inline pnv_pci_cfg_check(struct pci_dn *pdn)
> +static inline pnv_pci_cfg_check(struct eeh_dev *edev)
>  {
>  	return true;
>  }
> @@ -743,6 +736,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  {
>  	struct pci_dn *pdn;
>  	struct pnv_phb *phb;
> +	struct eeh_dev *edev;
>  	int ret;
>  
>  	*val = 0xFFFFFFFF;
> @@ -750,14 +744,15 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  	if (!pdn)
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> -	if (!pnv_pci_cfg_check(pdn))
> +	edev = pdn_to_eeh_dev(pdn);
> +	if (!pnv_eeh_pre_cfg_check(edev))
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
>  	ret = pnv_pci_cfg_read(pdn, where, size, val);
>  	phb = pdn->phb->private_data;
> -	if (phb->flags & PNV_PHB_FLAG_EEH && pdn->edev) {
> +	if (phb->flags & PNV_PHB_FLAG_EEH && edev) {
>  		if (*val == EEH_IO_ERROR_VALUE(size) &&
> -		    eeh_dev_check_failure(pdn->edev))
> +		    eeh_dev_check_failure(edev))
>                          return PCIBIOS_DEVICE_NOT_FOUND;
>  	} else {
>  		pnv_pci_config_check_eeh(pdn);
> @@ -772,13 +767,15 @@ static int pnv_pci_write_config(struct pci_bus *bus,
>  {
>  	struct pci_dn *pdn;
>  	struct pnv_phb *phb;
> +	struct eeh_dev *edev;
>  	int ret;
>  
>  	pdn = pci_get_pdn_by_devfn(bus, devfn);
>  	if (!pdn)
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> -	if (!pnv_pci_cfg_check(pdn))
> +	edev = pdn_to_eeh_dev(pdn);
> +	if (!pnv_eeh_pre_cfg_check(edev))
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
>  	ret = pnv_pci_cfg_write(pdn, where, size, val);
> 
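
For reference, the reworked check boils down to three tests on the eeh_dev.
A standalone sketch of that logic, assuming placeholder flag values and
simplified structs (the names below are hypothetical, only the order of the
tests is taken from the hunk above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for this sketch, not the kernel's. */
#define PE_CFG_BLOCKED	(1 << 0)
#define DEV_REMOVED	(1 << 0)

struct pe { int state; };
struct edev { struct pe *pe; int mode; };

/* Returns true when a config access may proceed. */
static bool pre_cfg_check(const struct edev *edev)
{
	/* No EEH device or no PE yet: nothing to block on. */
	if (!edev || !edev->pe)
		return true;

	/* PE in reset? */
	if (edev->pe->state & PE_CFG_BLOCKED)
		return false;

	/* Device removed? */
	if (edev->mode & DEV_REMOVED)
		return false;

	return true;
}
```

One thing the sketch makes visible: a device with no PE now passes even if
it is marked removed, whereas the old version tested the removed flag
regardless of the PE. Whether that can matter in practice is not clear from
the patch alone.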

-- 
Alexey

* Re: [Very RFC 25/46] powernv/pci: Remove pdn from pnv_pci_config_check_eeh()
  2019-11-20  1:28 ` [Very RFC 25/46] powernv/pci: Remove pdn from pnv_pci_config_check_eeh() Oliver O'Halloran
@ 2019-11-27  1:05   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  1:05 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Despite the name this function is generic PowerNV PCI code rather than anything
> EEH specific. Convert to take a phb and bdfn rather than a pci_dn.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci.c | 32 ++++++++++++++++++----------
>  1 file changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 6170677bfdc7..50142ff045ac 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -591,9 +591,15 @@ static void pnv_pci_handle_eeh_config(struct pnv_phb *phb, u32 pe_no)
>  	spin_unlock_irqrestore(&phb->lock, flags);
>  }
>  
> -static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
> +/*
> + * This, very strangely named, function checks if a config access
> + * caused an EEH and un-freezes the PE if it did. This is mainly
> + * for the !CONFIG_EEH case where nothing is going to un-freeze
> + * it for us.
> + */

Rather than writing a comment like this, simply rename it to
pnv_pci_cfg_check_and_unfreeze() or similar as you are changing
callsites anyway. Thanks,



> +static void pnv_pci_config_check_eeh(struct pnv_phb *phb, u16 bdfn)


>  {
> -	struct pnv_phb *phb = pdn->phb->private_data;
> +	struct pnv_ioda_pe *ioda_pe;
>  	u8	fstate = 0;
>  	__be16	pcierr = 0;
>  	unsigned int pe_no;
> @@ -604,10 +610,11 @@ static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
>  	 * setup that yet. So all ER errors should be mapped to
>  	 * reserved PE.
>  	 */
> -	pe_no = pdn->pe_number;
> -	if (pe_no == IODA_INVALID_PE) {
> +	ioda_pe = __pnv_ioda_get_pe(phb, bdfn);
> +	if (ioda_pe)
> +		pe_no = ioda_pe->pe_number;
> +	else
>  		pe_no = phb->ioda.reserved_pe_idx;
> -	}
>  
>  	/*
>  	 * Fetch frozen state. If the PHB support compound PE,
> @@ -629,7 +636,7 @@ static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
>  	}
>  
>  	pr_devel(" -> EEH check, bdfn=%04x PE#%x fstate=%x\n",
> -		 (pdn->busno << 8) | (pdn->devfn), pe_no, fstate);
> +		 bdfn, pe_no, fstate);
>  
>  	/* Clear the frozen state if applicable */
>  	if (fstate == OPAL_EEH_STOPPED_MMIO_FREEZE ||
> @@ -642,6 +649,7 @@ static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
>  		if (phb->freeze_pe)
>  			phb->freeze_pe(phb, pe_no);
>  
> +		/* fish out the EEH log and send an EEH event. */
>  		pnv_pci_handle_eeh_config(phb, pe_no);
>  	}
>  }
> @@ -735,7 +743,8 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  			       int where, int size, u32 *val)
>  {
>  	struct pci_dn *pdn;
> -	struct pnv_phb *phb;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
> +	u16 bdfn = bus->number << 8 | devfn;
>  	struct eeh_dev *edev;
>  	int ret;
>  
> @@ -755,7 +764,7 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  		    eeh_dev_check_failure(edev))
>                          return PCIBIOS_DEVICE_NOT_FOUND;
>  	} else {
> -		pnv_pci_config_check_eeh(pdn);
> +		pnv_pci_config_check_eeh(phb, bdfn);
>  	}
>  
>  	return ret;
> @@ -766,7 +775,8 @@ static int pnv_pci_write_config(struct pci_bus *bus,
>  				int where, int size, u32 val)
>  {
>  	struct pci_dn *pdn;
> -	struct pnv_phb *phb;
> +	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
> +	u16 bdfn = bus->number << 8 | devfn;
>  	struct eeh_dev *edev;
>  	int ret;
>  
> @@ -779,9 +789,9 @@ static int pnv_pci_write_config(struct pci_bus *bus,
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
>  	ret = pnv_pci_cfg_write(pdn, where, size, val);
> -	phb = pdn->phb->private_data;
> +
>  	if (!(phb->flags & PNV_PHB_FLAG_EEH))
> -		pnv_pci_config_check_eeh(pdn);
> +		pnv_pci_config_check_eeh(phb, bdfn);
>  
>  	return ret;
>  }
> 
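
The bdfn that replaces the pci_dn here is the usual 16-bit packing of bus
number and devfn (bus->number << 8 | devfn in the hunks above). A small
sketch of the packing and its inverse; the helper names are hypothetical
and only illustrate the layout:

```c
#include <assert.h>
#include <stdint.h>

/* Pack an 8-bit bus number and an 8-bit devfn into a 16-bit BDFN. */
static uint16_t make_bdfn(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((unsigned int)bus << 8) | devfn);
}

/* Upper byte is the bus number. */
static uint8_t bdfn_bus(uint16_t bdfn)
{
	return (uint8_t)(bdfn >> 8);
}

/* Lower byte is devfn: 5 bits of device (slot), 3 bits of function. */
static uint8_t bdfn_devfn(uint16_t bdfn)
{
	return (uint8_t)(bdfn & 0xff);
}

static uint8_t devfn_slot(uint8_t devfn)
{
	return devfn >> 3;
}

static uint8_t devfn_func(uint8_t devfn)
{
	return devfn & 0x7;
}
```

Since both config accessors now build the same value, a shared helper along
these lines might be worth considering, but that is a matter of taste.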

-- 
Alexey

* Re: [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed
  2019-11-25  4:26     ` Oliver O'Halloran
@ 2019-11-27  1:50       ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  1:50 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Alistair Popple, linuxppc-dev, Sergey Miroshnichenko



On 25/11/2019 15:26, Oliver O'Halloran wrote:
> On Mon, Nov 25, 2019 at 2:27 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>
>>
>>
>> On 20/11/2019 12:28, Oliver O'Halloran wrote:
>>> Have the PowerNV EEH backend allocate the eeh_dev if needed rather than using
>>> the one attached to the pci_dn.
>>
>> So that pci_dn attached one is leaked then?
> 
> Sorta, the eeh_dev attached to the pci_dn is supposed to have the same
> lifetime as the pci_dn it's attached to. Whatever frees the pci_dn
> should also be freeing the eeh_dev, but I'm pretty sure the only
> situation where that actually happens is when removing the pci_dn for
> VFs.


Oh, that's lovely. add_sriov_vf_pdns() calls eeh_dev_init() to allocate
@edev but remove_sriov_vf_pdns() does kfree(edev) by itself.
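
A less surprising shape would be to pair the allocation with a matching
release helper so that callers never kfree() the edev directly. A userspace
sketch of that idea, assuming stand-in structs and hypothetical helper names
(edev_attach() and edev_detach() do not exist in the kernel):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct eeh_dev and struct pci_dn. */
struct edev { int bdfn; };
struct pdn { struct edev *edev; };

/*
 * Allocate an edev and attach it to its owner. Returns NULL on failure
 * without printing anything; the caller decides how to handle it.
 */
static struct edev *edev_attach(struct pdn *pdn)
{
	struct edev *edev = calloc(1, sizeof(*edev));

	if (!edev)
		return NULL;

	pdn->edev = edev;
	return edev;
}

/* The only place that frees an edev; safe to call more than once. */
static void edev_detach(struct pdn *pdn)
{
	free(pdn->edev);
	pdn->edev = NULL;
}
```

With both ends in one place the lifetime rule is visible in the code rather
than implied by whoever happens to free the owner.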


> It's bad.

No sh*t :)

> 
>>> This gets us most of the way towards decoupling
>>> pci_dn from the PowerNV EEH code.
>>>
>>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
>>> ---
>>> We should probably be free()ing the eeh_dev somewhere. The pci_dev release
>>> function is the right place for it.
>>> ---
>>>  arch/powerpc/platforms/powernv/eeh-powernv.c | 22 ++++++++++++++++----
>>>  1 file changed, 18 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>> index 1cd80b399995..7aba18e08996 100644
>>> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>> @@ -366,10 +366,9 @@ static int pnv_eeh_write_config(struct eeh_dev *edev,
>>>   */
>>>  static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>>>  {
>>> -     struct pci_dn *pdn = pci_get_pdn(pdev);
>>> -     struct pci_controller *hose = pdn->phb;
>>> -     struct pnv_phb *phb = hose->private_data;
>>> -     struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>> +     struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>>> +     struct pci_controller *hose = phb->hose;
>>> +     struct eeh_dev *edev;
>>>       uint32_t pcie_flags;
>>>       int ret;
>>>       int config_addr = (pdev->bus->number << 8) | (pdev->devfn);
>>> @@ -415,12 +414,27 @@ static struct eeh_dev *pnv_eeh_probe_pdev(struct pci_dev *pdev)
>>>       if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
>>>               return NULL;
>>>
>>> +     /* otherwise allocate and initialise a new eeh_dev */
>>> +     edev = kzalloc(sizeof(*edev), GFP_KERNEL);
>>> +     if (!edev) {
>>> +             pr_err("%s: out of memory lol\n", __func__);
>>
>> "lol"?
> 
> yeah lol

"unprofessional" is the word for this ;)


> 
>> I am pretty sure we do not have to print anything if alloc failed
>> as alloc prints an error anyway. Thanks,
> 
> It does? Neat.

Well, it is this:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst#n878

===
These generic allocation functions all emit a stack dump on failure when
used without __GFP_NOWARN so there is no use in emitting an additional
failure message when NULL is returned.
===

More than a printk. A small detail though.


-- 
Alexey

* Re: [Very RFC 26/46] powernv/pci: Remove pdn from pnv_pci_cfg_{read|write}
  2019-11-20  1:28 ` [Very RFC 26/46] powernv/pci: Remove pdn from pnv_pci_cfg_{read|write} Oliver O'Halloran
@ 2019-11-27  2:16   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  2:16 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Remove the use of pci_dn from the low-level config space access functions.
> These are used by the eeh's config ops and the bus config ops that we
> provide to the PCI core.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 14 +++--------
>  arch/powerpc/platforms/powernv/pci.c         | 26 ++++++++------------
>  arch/powerpc/platforms/powernv/pci.h         |  6 ++---
>  3 files changed, 16 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 49a932ff092a..8a73bc7517c5 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -331,31 +331,25 @@ static inline bool pnv_eeh_cfg_blocked(struct eeh_dev *edev)
>  static int pnv_eeh_read_config(struct eeh_dev *edev,
>  			       int where, int size, u32 *val)
>  {
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> -
> -	if (!pdn)
> -		return PCIBIOS_DEVICE_NOT_FOUND;
> +	struct pnv_phb *phb = edev->controller->private_data;
>  
>  	if (pnv_eeh_cfg_blocked(edev)) {
>  		*val = 0xFFFFFFFF;
>  		return PCIBIOS_SET_FAILED;
>  	}
>  
> -	return pnv_pci_cfg_read(pdn, where, size, val);
> +	return pnv_pci_cfg_read(phb, edev->bdfn, where, size, val);
>  }
>  
>  static int pnv_eeh_write_config(struct eeh_dev *edev,
>  				int where, int size, u32 val)
>  {
> -	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
> -
> -	if (!pdn)
> -		return PCIBIOS_DEVICE_NOT_FOUND;
> +	struct pnv_phb *phb = edev->controller->private_data;
>  
>  	if (pnv_eeh_cfg_blocked(edev))
>  		return PCIBIOS_SET_FAILED;
>  
> -	return pnv_pci_cfg_write(pdn, where, size, val);
> +	return pnv_pci_cfg_write(phb, edev->bdfn, where, size, val);
>  }
>  
>  static struct eeh_pe *pnv_eeh_pe_get_parent(struct pci_dev *pdev)
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 50142ff045ac..36eea4bb514c 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -654,11 +654,9 @@ static void pnv_pci_config_check_eeh(struct pnv_phb *phb, u16 bdfn)
>  	}
>  }
>  
> -int pnv_pci_cfg_read(struct pci_dn *pdn,
> +int pnv_pci_cfg_read(struct pnv_phb *phb, u16 bdfn,
>  		     int where, int size, u32 *val)
>  {
> -	struct pnv_phb *phb = pdn->phb->private_data;
> -	u32 bdfn = (pdn->busno << 8) | pdn->devfn;
>  	s64 rc;
>  
>  	switch (size) {
> @@ -685,19 +683,16 @@ int pnv_pci_cfg_read(struct pci_dn *pdn,
>  		return PCIBIOS_FUNC_NOT_SUPPORTED;
>  	}
>  
> -	pr_devel("%s: bus: %x devfn: %x +%x/%x -> %08x\n",
> -		 __func__, pdn->busno, pdn->devfn, where, size, *val);
> +	pr_devel("%s: bdfn: %x  +%x/%x -> %08x\n",
> +		 __func__, bdfn, where, size, *val);
>  	return PCIBIOS_SUCCESSFUL;
>  }
>  
> -int pnv_pci_cfg_write(struct pci_dn *pdn,
> +int pnv_pci_cfg_write(struct pnv_phb *phb, u16 bdfn,
>  		      int where, int size, u32 val)
>  {
> -	struct pnv_phb *phb = pdn->phb->private_data;
> -	u32 bdfn = (pdn->busno << 8) | pdn->devfn;
> -
> -	pr_devel("%s: bus: %x devfn: %x +%x/%x -> %08x\n",
> -		 __func__, pdn->busno, pdn->devfn, where, size, val);
> +	pr_devel("%s: bdfn: %x +%x/%x -> %08x\n",
> +		 __func__, bdfn, where, size, val);
>  	switch (size) {
>  	case 1:
>  		opal_pci_config_write_byte(phb->opal_id, bdfn, where, val);
> @@ -753,12 +748,11 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  	if (!pdn)
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> -	edev = pdn_to_eeh_dev(pdn);
> +	edev = pnv_eeh_find_edev(phb, bdfn);
>  	if (!pnv_eeh_pre_cfg_check(edev))
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> -	ret = pnv_pci_cfg_read(pdn, where, size, val);
> -	phb = pdn->phb->private_data;
> +	ret = pnv_pci_cfg_read(phb, bdfn, where, size, val);
>  	if (phb->flags & PNV_PHB_FLAG_EEH && edev) {
>  		if (*val == EEH_IO_ERROR_VALUE(size) &&
>  		    eeh_dev_check_failure(edev))
> @@ -784,11 +778,11 @@ static int pnv_pci_write_config(struct pci_bus *bus,
>  	if (!pdn)
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> -	edev = pdn_to_eeh_dev(pdn);
> +	edev = pnv_eeh_find_edev(phb, bdfn);
>  	if (!pnv_eeh_pre_cfg_check(edev))
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> -	ret = pnv_pci_cfg_write(pdn, where, size, val);
> +	ret = pnv_pci_cfg_write(phb, bdfn, where, size, val);
>  
>  	if (!(phb->flags & PNV_PHB_FLAG_EEH))
>  		pnv_pci_config_check_eeh(phb, bdfn);
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index be435a810d19..52dc4d05eaca 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -7,8 +7,6 @@
>  #include <asm/iommu.h>
>  #include <asm/msi_bitmap.h>
>  
> -struct pci_dn;
> -


This is the best bit :)


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>




>  enum pnv_phb_type {
>  	PNV_PHB_IODA1		= 0,
>  	PNV_PHB_IODA2		= 1,
> @@ -174,9 +172,9 @@ extern struct pci_ops pnv_pci_ops;
>  
>  void pnv_pci_dump_phb_diag_data(struct pci_controller *hose,
>  				unsigned char *log_buff);
> -int pnv_pci_cfg_read(struct pci_dn *pdn,
> +int pnv_pci_cfg_read(struct pnv_phb *phb, u16 bdfn,
>  		     int where, int size, u32 *val);
> -int pnv_pci_cfg_write(struct pci_dn *pdn,
> +int pnv_pci_cfg_write(struct pnv_phb *phb, u16 bdfn,
>  		      int where, int size, u32 val);
>  extern struct iommu_table *pnv_pci_table_alloc(int nid);
>  
> 

-- 
Alexey

* Re: [Very RFC 27/46] powernv/pci: Clear reserved PE freezes
  2019-11-20  1:28 ` [Very RFC 27/46] powernv/pci: Clear reserved PE freezes Oliver O'Halloran
@ 2019-11-27  3:00   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  3:00 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> When we scan an empty slot the PHB gets an Unsupported Request from the
> downstream bridge when there's no device present at that BDFN.  Some older
> PHBs (p7-IOC) don't allow further config space accesses while the PE is
> frozen, so clear it here without bothering with the diagnostic log.


This executes only when EEH is not enabled (a rather unsupported case), and
the patch allegedly extends support to some P7 PHBs, none of which was ever
supported by the powernv platform. Or was/is it? Thanks,


> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 36eea4bb514c..5b1f4677cdce 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -642,6 +642,19 @@ static void pnv_pci_config_check_eeh(struct pnv_phb *phb, u16 bdfn)
>  	if (fstate == OPAL_EEH_STOPPED_MMIO_FREEZE ||
>  	    fstate == OPAL_EEH_STOPPED_DMA_FREEZE  ||
>  	    fstate == OPAL_EEH_STOPPED_MMIO_DMA_FREEZE) {
> +
> +		/*
> +		 * Scanning an empty slot will result in a freeze on the reserved PE.
> +		 *
> +		 * Some old and bad PHBs block config space access to frozen PEs in
> +		 * addition to MMIOs, so unfreeze it here.
> +		 */
> +		if (pe_no == phb->ioda.reserved_pe_idx) {
> +			phb->unfreeze_pe(phb, phb->ioda.reserved_pe_idx,
> +					 OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
> +			return;
> +		}
> +
>  		/*
>  		 * If PHB supports compound PE, freeze it for
>  		 * consistency.
> 
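
To make sure I read the flow right, here is the decision as I understand it
from this hunk and the previous patch: a BDFN without a PE maps to the
reserved PE, and a freeze on the reserved PE is cleared without going
through the log path. A standalone sketch, assuming hypothetical names and
an enum that only exists for this illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* What the config-access check ends up doing; illustration only. */
enum cfg_eeh_action {
	CFG_EEH_NOTHING,	/* PE not frozen */
	CFG_EEH_CLEAR_RESERVED,	/* empty-slot scan: just unfreeze */
	CFG_EEH_FREEZE_AND_LOG,	/* real PE: freeze compound PE, fetch log */
};

/* mapped_pe < 0 means no PE has been set up for this BDFN yet. */
static unsigned int pick_pe(int mapped_pe, unsigned int reserved_pe)
{
	return mapped_pe >= 0 ? (unsigned int)mapped_pe : reserved_pe;
}

static enum cfg_eeh_action cfg_eeh_decide(bool frozen, unsigned int pe_no,
					  unsigned int reserved_pe)
{
	if (!frozen)
		return CFG_EEH_NOTHING;

	if (pe_no == reserved_pe)
		return CFG_EEH_CLEAR_RESERVED;

	return CFG_EEH_FREEZE_AND_LOG;
}
```

If that matches the intent, the early return means the diagnostic log is
never fetched for reserved-PE freezes, which seems to be the point of the
commit message.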

-- 
Alexey

* Re: [Very RFC 28/46] powernv/iov: Move SR-IOV PF state out of pci_dn
  2019-11-20  1:28 ` [Very RFC 28/46] powernv/iov: Move SR-IOV PF state out of pci_dn Oliver O'Halloran
@ 2019-11-27  4:09   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  4:09 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Move the SR-IOV state into a platform-specific structure. I'm sure stashing all the
> SR-IOV state in pci_dn seemed like a good idea at the time, but it results in a
> lot of powernv specifics being leaked out of the platform directory.
> 
> Moving all the PHB3/4 specific M64 BAR wrangling into a PowerNV specific
> structure helps to clarify the role of pci_dn and ensures that the platform
> specifics stay that way.
> 
> This will make the code easier to understand and modify since we don't need
> to worry so much about PowerNV changes breaking pseries and EEH, and vice versa.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> TODO: Remove all the sriov stuff from pci_dn. We can't do that yet because
> the pseries SRIOV support was a giant hack that re-used some of the
> previously powernv specific fields.
> ---
>  arch/powerpc/include/asm/device.h         |   3 +
>  arch/powerpc/platforms/powernv/pci-ioda.c | 199 ++++++++++++----------
>  arch/powerpc/platforms/powernv/pci.h      |  36 ++++
>  3 files changed, 148 insertions(+), 90 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
> index 266542769e4b..4d8934db7ef5 100644
> --- a/arch/powerpc/include/asm/device.h
> +++ b/arch/powerpc/include/asm/device.h
> @@ -49,6 +49,9 @@ struct dev_archdata {
>  #ifdef CONFIG_CXL_BASE
>  	struct cxl_context	*cxl_ctx;
>  #endif
> +#ifdef CONFIG_PCI_IOV
> +	void *iov_data;
> +#endif
>  };
>  
>  struct pdev_archdata {
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index a1c9315f3208..1c90feed233d 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -966,14 +966,15 @@ static int pnv_ioda_configure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
>  #ifdef CONFIG_PCI_IOV
>  static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>  {
> -	struct pci_dn *pdn = pci_get_pdn(dev);
> -	int i;
>  	struct resource *res, res2;
> +	struct pnv_iov_data *iov;
>  	resource_size_t size;
>  	u16 num_vfs;
> +	int i;
>  
>  	if (!dev->is_physfn)
>  		return -EINVAL;
> +	iov = pnv_iov_get(dev);
>  
>  	/*
>  	 * "offset" is in VFs.  The M64 windows are sized so that when they
> @@ -983,7 +984,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>  	 * separate PE, and changing the IOV BAR start address changes the
>  	 * range of PEs the VFs are in.
>  	 */
> -	num_vfs = pdn->num_vfs;
> +	num_vfs = iov->num_vfs;
>  	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
>  		res = &dev->resource[i + PCI_IOV_RESOURCES];
>  		if (!res->flags || !res->parent)
> @@ -1029,19 +1030,19 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>  			 num_vfs, offset);
>  
>  		if (offset < 0) {
> -			devm_release_resource(&dev->dev, &pdn->holes[i]);
> -			memset(&pdn->holes[i], 0, sizeof(pdn->holes[i]));
> +			devm_release_resource(&dev->dev, &iov->holes[i]);
> +			memset(&iov->holes[i], 0, sizeof(iov->holes[i]));
>  		}
>  
>  		pci_update_resource(dev, i + PCI_IOV_RESOURCES);
>  
>  		if (offset > 0) {
> -			pdn->holes[i].start = res2.start;
> -			pdn->holes[i].end = res2.start + size * offset - 1;
> -			pdn->holes[i].flags = IORESOURCE_BUS;
> -			pdn->holes[i].name = "pnv_iov_reserved";
> +			iov->holes[i].start = res2.start;
> +			iov->holes[i].end = res2.start + size * offset - 1;
> +			iov->holes[i].flags = IORESOURCE_BUS;
> +			iov->holes[i].name = "pnv_iov_reserved";
>  			devm_request_resource(&dev->dev, res->parent,
> -					&pdn->holes[i]);
> +					&iov->holes[i]);
>  		}
>  	}
>  	return 0;
> @@ -1273,37 +1274,37 @@ static void pnv_pci_ioda_setup_PEs(void)
>  #ifdef CONFIG_PCI_IOV
>  static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
>  {
> +	struct pnv_iov_data   *iov;
>  	struct pnv_phb        *phb;
> -	struct pci_dn         *pdn;
>  	int                    i, j;
>  	int                    m64_bars;
>  
>  	phb = pci_bus_to_pnvhb(pdev->bus);
> -	pdn = pci_get_pdn(pdev);
> +	iov = pnv_iov_get(pdev);
>  
> -	if (pdn->m64_single_mode)
> +	if (iov->m64_single_mode)
>  		m64_bars = num_vfs;
>  	else
>  		m64_bars = 1;
>  
>  	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
>  		for (j = 0; j < m64_bars; j++) {
> -			if (pdn->m64_map[j][i] == IODA_INVALID_M64)
> +			if (iov->m64_map[j][i] == IODA_INVALID_M64)
>  				continue;
>  			opal_pci_phb_mmio_enable(phb->opal_id,
> -				OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 0);
> -			clear_bit(pdn->m64_map[j][i], &phb->ioda.m64_bar_alloc);
> -			pdn->m64_map[j][i] = IODA_INVALID_M64;
> +				OPAL_M64_WINDOW_TYPE, iov->m64_map[j][i], 0);
> +			clear_bit(iov->m64_map[j][i], &phb->ioda.m64_bar_alloc);
> +			iov->m64_map[j][i] = IODA_INVALID_M64;
>  		}
>  
> -	kfree(pdn->m64_map);
> +	kfree(iov->m64_map);
>  	return 0;
>  }
>  
>  static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>  {
> +	struct pnv_iov_data   *iov;
>  	struct pnv_phb        *phb;
> -	struct pci_dn         *pdn;
>  	unsigned int           win;
>  	struct resource       *res;
>  	int                    i, j;
> @@ -1314,23 +1315,23 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>  	int                    m64_bars;
>  
>  	phb = pci_bus_to_pnvhb(pdev->bus);
> -	pdn = pci_get_pdn(pdev);
> +	iov = pnv_iov_get(pdev);
>  	total_vfs = pci_sriov_get_totalvfs(pdev);
>  
> -	if (pdn->m64_single_mode)
> +	if (iov->m64_single_mode)
>  		m64_bars = num_vfs;
>  	else
>  		m64_bars = 1;
>  
> -	pdn->m64_map = kmalloc_array(m64_bars,
> -				     sizeof(*pdn->m64_map),
> +	iov->m64_map = kmalloc_array(m64_bars,
> +				     sizeof(*iov->m64_map),
>  				     GFP_KERNEL);
> -	if (!pdn->m64_map)
> +	if (!iov->m64_map)
>  		return -ENOMEM;
>  	/* Initialize the m64_map to IODA_INVALID_M64 */
>  	for (i = 0; i < m64_bars ; i++)
>  		for (j = 0; j < PCI_SRIOV_NUM_BARS; j++)
> -			pdn->m64_map[i][j] = IODA_INVALID_M64;
> +			iov->m64_map[i][j] = IODA_INVALID_M64;
>  
>  
>  	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
> @@ -1347,9 +1348,9 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>  					goto m64_failed;
>  			} while (test_and_set_bit(win, &phb->ioda.m64_bar_alloc));
>  
> -			pdn->m64_map[j][i] = win;
> +			iov->m64_map[j][i] = win;
>  
> -			if (pdn->m64_single_mode) {
> +			if (iov->m64_single_mode) {
>  				size = pci_iov_resource_size(pdev,
>  							PCI_IOV_RESOURCES + i);
>  				start = res->start + size * j;
> @@ -1359,16 +1360,16 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>  			}
>  
>  			/* Map the M64 here */
> -			if (pdn->m64_single_mode) {
> -				pe_num = pdn->pe_num_map[j];
> +			if (iov->m64_single_mode) {
> +				pe_num = iov->pe_num_map[j];
>  				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
>  						pe_num, OPAL_M64_WINDOW_TYPE,
> -						pdn->m64_map[j][i], 0);
> +						iov->m64_map[j][i], 0);
>  			}
>  
>  			rc = opal_pci_set_phb_mem_window(phb->opal_id,
>  						 OPAL_M64_WINDOW_TYPE,
> -						 pdn->m64_map[j][i],
> +						 iov->m64_map[j][i],
>  						 start,
>  						 0, /* unused */
>  						 size);
> @@ -1380,12 +1381,12 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>  				goto m64_failed;
>  			}
>  
> -			if (pdn->m64_single_mode)
> +			if (iov->m64_single_mode)
>  				rc = opal_pci_phb_mmio_enable(phb->opal_id,
> -				     OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 2);
> +				     OPAL_M64_WINDOW_TYPE, iov->m64_map[j][i], 2);
>  			else
>  				rc = opal_pci_phb_mmio_enable(phb->opal_id,
> -				     OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 1);
> +				     OPAL_M64_WINDOW_TYPE, iov->m64_map[j][i], 1);
>  
>  			if (rc != OPAL_SUCCESS) {
>  				dev_err(&pdev->dev, "Failed to enable M64 window #%d: %llx\n",
> @@ -1426,10 +1427,8 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
>  {
>  	struct pnv_phb        *phb;
>  	struct pnv_ioda_pe    *pe, *pe_n;
> -	struct pci_dn         *pdn;
>  
>  	phb = pci_bus_to_pnvhb(pdev->bus);
> -	pdn = pci_get_pdn(pdev);


Looks like an unrelated cleanup.


>  
>  	if (!pdev->is_physfn)
>  		return;
> @@ -1455,36 +1454,36 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)
>  {
>  	struct pnv_phb        *phb;
>  	struct pnv_ioda_pe    *pe;
> -	struct pci_dn         *pdn;
> +	struct pnv_iov_data   *iov;
>  	u16                    num_vfs, i;
>  
>  	phb = pci_bus_to_pnvhb(pdev->bus);
> -	pdn = pci_get_pdn(pdev);
> -	num_vfs = pdn->num_vfs;
> +	iov = pnv_iov_get(pdev);
> +	num_vfs = iov->num_vfs;
>  
>  	/* Release VF PEs */
>  	pnv_ioda_release_vf_PE(pdev);
>  
>  	if (phb->type == PNV_PHB_IODA2) {
> -		if (!pdn->m64_single_mode)
> -			pnv_pci_vf_resource_shift(pdev, -*pdn->pe_num_map);
> +		if (!iov->m64_single_mode)
> +			pnv_pci_vf_resource_shift(pdev, -*iov->pe_num_map);
>  
>  		/* Release M64 windows */
>  		pnv_pci_vf_release_m64(pdev, num_vfs);
>  
>  		/* Release PE numbers */
> -		if (pdn->m64_single_mode) {
> +		if (iov->m64_single_mode) {
>  			for (i = 0; i < num_vfs; i++) {
> -				if (pdn->pe_num_map[i] == IODA_INVALID_PE)
> +				if (iov->pe_num_map[i] == IODA_INVALID_PE)
>  					continue;
>  
> -				pe = &phb->ioda.pe_array[pdn->pe_num_map[i]];
> +				pe = &phb->ioda.pe_array[iov->pe_num_map[i]];
>  				pnv_ioda_free_pe(pe);
>  			}
>  		} else
> -			bitmap_clear(phb->ioda.pe_alloc, *pdn->pe_num_map, num_vfs);
> +			bitmap_clear(phb->ioda.pe_alloc, *iov->pe_num_map, num_vfs);
>  		/* Releasing pe_num_map */
> -		kfree(pdn->pe_num_map);
> +		kfree(iov->pe_num_map);
>  	}
>  }
>  
> @@ -1501,24 +1500,24 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  	struct pnv_ioda_pe    *pe;
>  	int                    pe_num;
>  	u16                    vf_index;
> -	struct pci_dn         *pdn;
> -
> -	phb = pci_bus_to_pnvhb(pdev->bus);
> -	pdn = pci_get_pdn(pdev);
> +	struct pnv_iov_data   *iov;
>  
>  	if (!pdev->is_physfn)
>  		return;
>  
> +	phb = pci_bus_to_pnvhb(pdev->bus);
> +	iov = pnv_iov_get(pdev);
> +
>  	/* Reserve PE for each VF */
>  	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
>  		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
>  		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
>  		struct pci_dn *vf_pdn;
>  
> -		if (pdn->m64_single_mode)
> -			pe_num = pdn->pe_num_map[vf_index];
> +		if (iov->m64_single_mode)
> +			pe_num = iov->pe_num_map[vf_index];
>  		else
> -			pe_num = *pdn->pe_num_map + vf_index;
> +			pe_num = *iov->pe_num_map + vf_index;
>  
>  		pe = &phb->ioda.pe_array[pe_num];
>  		pe->pe_number = pe_num;
> @@ -1565,17 +1564,17 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  
>  int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  {
> +	struct pnv_iov_data   *iov;
>  	struct pnv_phb        *phb;
>  	struct pnv_ioda_pe    *pe;
> -	struct pci_dn         *pdn;
>  	int                    ret;
>  	u16                    i;
>  
>  	phb = pci_bus_to_pnvhb(pdev->bus);
> -	pdn = pci_get_pdn(pdev);
> +	iov = pnv_iov_get(pdev);
>  
>  	if (phb->type == PNV_PHB_IODA2) {
> -		if (!pdn->vfs_expanded) {
> +		if (!iov->vfs_expanded) {
>  			dev_info(&pdev->dev, "don't support this SRIOV device"
>  				" with non 64bit-prefetchable IOV BAR\n");
>  			return -ENOSPC;
> @@ -1585,28 +1584,26 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  		 * When M64 BARs functions in Single PE mode, the number of VFs
>  		 * could be enabled must be less than the number of M64 BARs.
>  		 */
> -		if (pdn->m64_single_mode && num_vfs > phb->ioda.m64_bar_idx) {
> +		if (iov->m64_single_mode && num_vfs > phb->ioda.m64_bar_idx) {
>  			dev_info(&pdev->dev, "Not enough M64 BAR for VFs\n");
>  			return -EBUSY;
>  		}
>  
>  		/* Allocating pe_num_map */
> -		if (pdn->m64_single_mode)
> -			pdn->pe_num_map = kmalloc_array(num_vfs,
> -							sizeof(*pdn->pe_num_map),
> -							GFP_KERNEL);
> +		if (iov->m64_single_mode)
> +			iov->pe_num_map = kmalloc_array(num_vfs, sizeof(*iov->pe_num_map), GFP_KERNEL);
>  		else
> -			pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map), GFP_KERNEL);
> +			iov->pe_num_map = kmalloc(sizeof(*iov->pe_num_map), GFP_KERNEL);
>  
> -		if (!pdn->pe_num_map)
> +		if (!iov->pe_num_map)
>  			return -ENOMEM;
>  
> -		if (pdn->m64_single_mode)
> +		if (iov->m64_single_mode)
>  			for (i = 0; i < num_vfs; i++)
> -				pdn->pe_num_map[i] = IODA_INVALID_PE;
> +				iov->pe_num_map[i] = IODA_INVALID_PE;
>  
>  		/* Calculate available PE for required VFs */
> -		if (pdn->m64_single_mode) {
> +		if (iov->m64_single_mode) {
>  			for (i = 0; i < num_vfs; i++) {
>  				pe = pnv_ioda_alloc_pe(phb);
>  				if (!pe) {
> @@ -1614,23 +1611,23 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  					goto m64_failed;
>  				}
>  
> -				pdn->pe_num_map[i] = pe->pe_number;
> +				iov->pe_num_map[i] = pe->pe_number;
>  			}
>  		} else {
>  			mutex_lock(&phb->ioda.pe_alloc_mutex);
> -			*pdn->pe_num_map = bitmap_find_next_zero_area(
> +			*iov->pe_num_map = bitmap_find_next_zero_area(
>  				phb->ioda.pe_alloc, phb->ioda.total_pe_num,
>  				0, num_vfs, 0);
> -			if (*pdn->pe_num_map >= phb->ioda.total_pe_num) {
> +			if (*iov->pe_num_map >= phb->ioda.total_pe_num) {
>  				mutex_unlock(&phb->ioda.pe_alloc_mutex);
>  				dev_info(&pdev->dev, "Failed to enable VF%d\n", num_vfs);
> -				kfree(pdn->pe_num_map);
> +				kfree(iov->pe_num_map);
>  				return -EBUSY;
>  			}
> -			bitmap_set(phb->ioda.pe_alloc, *pdn->pe_num_map, num_vfs);
> +			bitmap_set(phb->ioda.pe_alloc, *iov->pe_num_map, num_vfs);
>  			mutex_unlock(&phb->ioda.pe_alloc_mutex);
>  		}
> -		pdn->num_vfs = num_vfs;
> +		iov->num_vfs = num_vfs;
>  
>  		/* Assign M64 window accordingly */
>  		ret = pnv_pci_vf_assign_m64(pdev, num_vfs);
> @@ -1644,8 +1641,8 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  		 * the IOV BAR according to the PE# allocated to the VFs.
>  		 * Otherwise, the PE# for the VF will conflict with others.
>  		 */
> -		if (!pdn->m64_single_mode) {
> -			ret = pnv_pci_vf_resource_shift(pdev, *pdn->pe_num_map);
> +		if (!iov->m64_single_mode) {
> +			ret = pnv_pci_vf_resource_shift(pdev, *iov->pe_num_map);
>  			if (ret)
>  				goto m64_failed;
>  		}
> @@ -1657,19 +1654,19 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  	return 0;
>  
>  m64_failed:
> -	if (pdn->m64_single_mode) {
> +	if (iov->m64_single_mode) {
>  		for (i = 0; i < num_vfs; i++) {
> -			if (pdn->pe_num_map[i] == IODA_INVALID_PE)
> +			if (iov->pe_num_map[i] == IODA_INVALID_PE)
>  				continue;
>  
> -			pe = &phb->ioda.pe_array[pdn->pe_num_map[i]];
> +			pe = &phb->ioda.pe_array[iov->pe_num_map[i]];
>  			pnv_ioda_free_pe(pe);
>  		}
>  	} else
> -		bitmap_clear(phb->ioda.pe_alloc, *pdn->pe_num_map, num_vfs);
> +		bitmap_clear(phb->ioda.pe_alloc, *iov->pe_num_map, num_vfs);
>  
>  	/* Releasing pe_num_map */
> -	kfree(pdn->pe_num_map);
> +	kfree(iov->pe_num_map);
>  
>  	return ret;
>  }
> @@ -2840,12 +2837,13 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  	struct resource *res;
>  	int i;
>  	resource_size_t size, total_vf_bar_sz;
> -	struct pci_dn *pdn;
> +	struct pnv_iov_data *iov;
>  	int mul, total_vfs;
>  
> -	pdn = pci_get_pdn(pdev);
> -	pdn->vfs_expanded = 0;
> -	pdn->m64_single_mode = false;
> +	iov = kzalloc(sizeof(*iov), GFP_KERNEL);
> +	if (!iov)
> +		goto truncate_iov;
> +	pdev->dev.archdata.iov_data = iov;
>  
>  	total_vfs = pci_sriov_get_totalvfs(pdev);
>  	mul = phb->ioda.total_pe_num;
> @@ -2882,7 +2880,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  			dev_info(&pdev->dev,
>  				"VF BAR Total IOV size %llx > %llx, roundup to %d VFs\n",
>  				total_vf_bar_sz, gate, mul);
> -			pdn->m64_single_mode = true;
> +			iov->m64_single_mode = true;
>  			break;
>  		}
>  	}
> @@ -2897,7 +2895,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  		 * On PHB3, the minimum size alignment of M64 BAR in single
>  		 * mode is 32MB.
>  		 */
> -		if (pdn->m64_single_mode && (size < SZ_32M))
> +		if (iov->m64_single_mode && (size < SZ_32M))
>  			goto truncate_iov;
>  		dev_dbg(&pdev->dev, " Fixing VF BAR%d: %pR to\n", i, res);
>  		res->end = res->start + size * mul - 1;
> @@ -2905,7 +2903,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  		dev_info(&pdev->dev, "VF BAR%d: %pR (expanded to %d VFs for PE alignment)",
>  			 i, res, mul);
>  	}
> -	pdn->vfs_expanded = mul;
> +	iov->vfs_expanded = mul;
>  
>  	return;
>  
> @@ -2916,6 +2914,9 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>  		res->flags = 0;
>  		res->end = res->start - 1;
>  	}
> +
> +	pdev->dev.archdata.iov_data = NULL;
> +	kfree(iov);
>  }
>  
>  static void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
> @@ -3321,7 +3322,7 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
>  						      int resno)
>  {
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
> -	struct pci_dn *pdn = pci_get_pdn(pdev);
> +	struct pnv_iov_data *iov = pnv_iov_get(pdev);
>  	resource_size_t align;
>  
>  	/*
> @@ -3342,12 +3343,21 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
>  	 * M64 segment size if IOV BAR size is less.
>  	 */
>  	align = pci_iov_resource_size(pdev, resno);
> -	if (!pdn->vfs_expanded)
> +
> +	/*
> +	 * iov can be null if we have an SR-IOV device with IOV BAR that can't
> +	 * be placed in the m64 space (i.e. The BAR is 32bit or non-prefetch).
> +	 * In that case we don't allow VFs to be enabled so just return the
> +	 * default alignment.
> +	 */
> +	if (!iov)
>  		return align;
> -	if (pdn->m64_single_mode)
> +	if (!iov->vfs_expanded)
> +		return align;
> +	if (iov->m64_single_mode)
>  		return max(align, (resource_size_t)phb->ioda.m64_segsize);
>  
> -	return pdn->vfs_expanded * align;
> +	return iov->vfs_expanded * align;
>  }
>  #endif /* CONFIG_PCI_IOV */
>  
> @@ -3545,12 +3555,21 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	struct pnv_ioda_pe *pe;
>  
> +	/* The VF PE state is torn down when sriov_disable() is called */
>  	if (pdev->is_virtfn)
>  		return;
>  
>  	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
>  		return;
>  
> +	/*
> +	 * FIXME: Try move this to sriov_disable(). It's here since we allocate
> +	 * the iov state at probe time since we need to fiddle with the IOV
> +	 * resources.
> +	 */
> +	if (pdev->is_physfn)
> +		kfree(pdev->dev.archdata.iov_data);


Please also set pdev->dev.archdata.iov_data = NULL after the kfree(),
just in case some other device shutdown code tries to access it.
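
To illustrate the free-and-clear pattern, here is a minimal userspace
sketch. Everything in it is a stand-in: fake_dev, fake_iov and
fake_release_iov() are hypothetical names, and free() plays the role of
kfree(). It is not meant as the actual fix, only as the shape of it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pnv_iov_data. */
struct fake_iov {
	unsigned short num_vfs;
};

/* Hypothetical stand-in for the pci_dev and its archdata pointer. */
struct fake_dev {
	bool is_physfn;
	struct fake_iov *iov_data;
};

/* Free the per-device IOV state and clear the pointer afterwards. */
void fake_release_iov(struct fake_dev *dev)
{
	assert(dev != NULL);

	if (!dev->is_physfn)
		return;

	free(dev->iov_data);	/* free(NULL) is a no-op, like kfree(NULL) */
	dev->iov_data = NULL;	/* later readers see NULL, not a stale pointer */
}
```

With the pointer cleared, a second call (or any later NULL-checking
reader) is harmless instead of being a use-after-free.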


> +
>  	/*
>  	 * PCI hotplug can happen as part of EEH error recovery. The @pdn
>  	 * isn't removed and added afterwards in this scenario. We should
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 52dc4d05eaca..0e875f714911 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -168,6 +168,42 @@ struct pnv_phb {
>  	u8			*diag_data;
>  };
>  
> +#ifdef CONFIG_PCI_IOV
> +/*
> + * For SR-IOV we want to put each VF's MMIO resource in to a seperate PE.
> + * This requires a bit of acrobatics with the MMIO -> PE configuration
> + * and this structure is used to keep track of it all.
> + */
> +struct pnv_iov_data {
> +	/* number of VFs IOV BAR expanded. FIXME: rename this to something less bad */
> +	u16     vfs_expanded;


Can this one be removed from pci_dn now? It looks like it can be, at
least after 46/46. The others probably can be too, as they do not seem
to be used by pseries anyway. Thanks,


> +
> +	/* number of VFs enabled */
> +	u16     num_vfs;
> +	unsigned int *pe_num_map;	/* PE# for the first VF PE or array */
> +
> +	/* Did we map the VF BARs with single-PE IODA BARs? */
> +	bool    m64_single_mode;
> +
> +	int     (*m64_map)[PCI_SRIOV_NUM_BARS];
> +#define IODA_INVALID_M64        (-1)
> +
> +	/*
> +	 * If we map the SR-IOV BARs with a segmented window then
> +	 * parts of that window will be "claimed" by other PEs.
> +	 *
> +	 * "holes" here is used to reserve the leading portion
> +	 * of the window that is used by other (non VF) PEs.
> +	 */
> +	struct resource holes[PCI_SRIOV_NUM_BARS];
> +};
> +
> +static inline struct pnv_iov_data *pnv_iov_get(struct pci_dev *pdev)
> +{
> +	return pdev->dev.archdata.iov_data;
> +}
> +#endif
> +
>  extern struct pci_ops pnv_pci_ops;
>  
>  void pnv_pci_dump_phb_diag_data(struct pci_controller *hose,
> 

-- 
Alexey


* Re: [Very RFC 29/46] powernv/pci: Remove open-coded PE lookup in PELT-V setup
  2019-11-20  1:28 ` [Very RFC 29/46] powernv/pci: Remove open-coded PE lookup in PELT-V setup Oliver O'Halloran
@ 2019-11-27  4:26   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  4:26 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 32 +++++++++++++++++------
>  1 file changed, 24 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 1c90feed233d..5bd7c1b058da 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -760,6 +760,11 @@ static int pnv_ioda_set_peltv(struct pnv_phb *phb,
>  		}
>  	}
>  
> +	/*
> +	 * Walk the bridges up to the root. Along the way mark this PE as
> +	 * downstream of the bridge PE(s) so that errors upstream errors


Too many "errors" in "errors upstream errors".

Otherwise

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>




> +	 * also cause this PE to be frozen.
> +	 */
>  	if (pe->flags & (PNV_IODA_PE_BUS_ALL | PNV_IODA_PE_BUS))
>  		pdev = pe->pbus->self;
>  	else if (pe->flags & PNV_IODA_PE_DEV)
> @@ -768,16 +773,27 @@ static int pnv_ioda_set_peltv(struct pnv_phb *phb,
>  	else if (pe->flags & PNV_IODA_PE_VF)
>  		pdev = pe->parent_dev;
>  #endif /* CONFIG_PCI_IOV */
> +
>  	while (pdev) {
> -		struct pci_dn *pdn = pci_get_pdn(pdev);
> -		struct pnv_ioda_pe *parent;
> +		struct pnv_ioda_pe *parent = pnv_ioda_get_pe(pdev);
>  
> -		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
> -			parent = &phb->ioda.pe_array[pdn->pe_number];
> -			ret = pnv_ioda_set_one_peltv(phb, parent, pe, is_add);
> -			if (ret)
> -				return ret;
> -		}
> +		/*
> +		 * FIXME: This is called from pcibios_setup_bridge(), which is called
> +		 * from the bottom (leaf) bridge to the root. This means that this
> +		 * doesn't actually setup the PELT-V entries since the PEs for
> +		 * the bridges above assigned after this is run for the leaf.
> +		 *
> +		 * FIXMEFIXME: might not be true since moving PE configuration
> +		 * into pcibios_bus_add_device().
> +		 */
> +		if (!parent)
> +			break;
> +
> +		WARN_ON(!parent || parent->pe_number == IODA_INVALID_PE);
> +
> +		ret = pnv_ioda_set_one_peltv(phb, parent, pe, is_add);
> +		if (ret)
> +			return ret;
>  
>  		pdev = pdev->bus->self;
>  	}
> 

-- 
Alexey


* Re: [Very RFC 30/46] powernv/pci: Remove open-coded PE lookup in PELT-V teardown
  2019-11-20  1:28 ` [Very RFC 30/46] powernv/pci: Remove open-coded PE lookup in PELT-V teardown Oliver O'Halloran
@ 2019-11-27  4:50   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  4:50 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 5bd7c1b058da..d4b5ee926222 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -853,11 +853,13 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
>  
>  	/* Release from all parents PELT-V */
>  	while (parent) {
> -		struct pci_dn *pdn = pci_get_pdn(parent);
> -		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
> -			rc = opal_pci_set_peltv(phb->opal_id, pdn->pe_number,
> -						pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN);
> -			/* XXX What to do in case of error ? */

Maybe print a warning, like a few lines below (in the code, not in the
patch). Not important though, as long as gcc does not complain about an
unused return value.
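
Roughly what I have in mind, as a self-contained userspace sketch.
fake_set_peltv() is a hypothetical stand-in for opal_pci_set_peltv()
that fails for one parent so the error path gets exercised, and
fprintf() stands in for pr_warn(); none of these names are from the
patch:

```c
#include <assert.h>
#include <stdio.h>

#define FAKE_OPAL_SUCCESS 0	/* hypothetical stand-in for OPAL_SUCCESS */

/*
 * Hypothetical stand-in for opal_pci_set_peltv(): fails for parent
 * PE#3 so that the error path can be exercised.
 */
long fake_set_peltv(int parent_pe, int child_pe)
{
	(void)child_pe;
	return parent_pe == 3 ? -1 : FAKE_OPAL_SUCCESS;
}

/*
 * Detach child_pe from every parent in parents[]. A failure is
 * reported but the walk continues, since teardown should still try
 * the remaining parents. Returns the number of failed calls.
 */
int fake_release_from_parents(const int *parents, int nr, int child_pe)
{
	int i, failures = 0;

	assert(parents != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		long rc = fake_set_peltv(parents[i], child_pe);

		if (rc != FAKE_OPAL_SUCCESS) {
			fprintf(stderr, "PELT-V removal failed: parent PE#%d rc=%ld\n",
				parents[i], rc);
			failures++;
		}
	}
	return failures;
}
```

The point is only that rc gets looked at; whether the walk should stop
or carry on after a failure is a separate question.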

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>




> +		struct pnv_ioda_pe *parent_pe = pnv_ioda_get_pe(parent);
> +
> +		if (parent_pe) {
> +			rc = opal_pci_set_peltv(phb->opal_id,
> +						parent_pe->pe_number,
> +						pe->pe_number,
> +						OPAL_REMOVE_PE_FROM_DOMAIN);
>  		}
>  		parent = parent->bus->self;
>  	}
> 

-- 
Alexey


* Re: [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup()
  2019-11-20  1:28 ` [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup() Oliver O'Halloran
  2019-11-21  7:52   ` Christoph Hellwig
@ 2019-11-27  4:53   ` Alexey Kardashevskiy
  1 sibling, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  4:53 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Use the helper to look up the pnv_ioda_pe for the device we're configuring DMA
> for. In the VF case there's no need set pdn->pe_number since nothing looks at
> it any more.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index d4b5ee926222..98d858999a2d 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1709,10 +1709,9 @@ int pnv_pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  
>  static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev)
>  {
> -	struct pci_dn *pdn = pci_get_pdn(pdev);
>  	struct pnv_ioda_pe *pe;
>  
> -	pe = &phb->ioda.pe_array[pdn->pe_number];
> +	pe = pnv_ioda_get_pe(pdev);
>  	WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
>  	pdev->dev.archdata.dma_offset = pe->tce_bypass_base;
>  	set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
> 

-- 
Alexey


* Re: [Very RFC 32/46] powernv/pci: Remove open-coded PE lookup in iommu_bypass_supported()
  2019-11-20  1:28 ` [Very RFC 32/46] powernv/pci: Remove open-coded PE lookup in iommu_bypass_supported() Oliver O'Halloran
@ 2019-11-27  5:09   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  5:09 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

but honestly this can be squashed into 31/46 and/or 33/46, or other
similar patches.

> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 98d858999a2d..7e88de18ead6 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1801,13 +1801,11 @@ static bool pnv_pci_ioda_iommu_bypass_supported(struct pci_dev *pdev,
>  		u64 dma_mask)
>  {
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
> -	struct pci_dn *pdn = pci_get_pdn(pdev);
> -	struct pnv_ioda_pe *pe;
> +	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
>  
> -	if (WARN_ON(!pdn || pdn->pe_number == IODA_INVALID_PE))
> +	if (WARN_ON(!pe))
>  		return false;
>  
> -	pe = &phb->ioda.pe_array[pdn->pe_number];
>  	if (pe->tce_bypass_enabled) {
>  		u64 top = pe->tce_bypass_base + memblock_end_of_DRAM() - 1;
>  		if (dma_mask >= top)
> 

-- 
Alexey


* Re: [Very RFC 33/46] powernv/pci: Remove open-coded PE lookup in iommu notifier
  2019-11-20  1:28 ` [Very RFC 33/46] powernv/pci: Remove open-coded PE lookup in iommu notifier Oliver O'Halloran
@ 2019-11-27  5:09   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  5:09 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 5b1f4677cdce..0eeea8652426 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -943,23 +943,22 @@ static int pnv_tce_iommu_bus_notifier(struct notifier_block *nb,
>  {
>  	struct device *dev = data;
>  	struct pci_dev *pdev;
> -	struct pci_dn *pdn;
>  	struct pnv_ioda_pe *pe;
>  	struct pnv_phb *phb;
>  
>  	switch (action) {
>  	case BUS_NOTIFY_ADD_DEVICE:
>  		pdev = to_pci_dev(dev);
> -		pdn = pci_get_pdn(pdev);
>  		phb = pci_bus_to_pnvhb(pdev->bus);
>  
>  		WARN_ON_ONCE(!phb);
> -		if (!pdn || pdn->pe_number == IODA_INVALID_PE || !phb)
> +		if (!phb)
>  			return 0;

This check is weird: the function does not use @phb anymore, so it would
make more sense if pnv_ioda_get_pe() checked phb != NULL itself.
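
A sketch of the idea, with the NULL check folded into the lookup helper.
This is a userspace mock-up under my own assumptions: fake_get_pe(),
fake_phb, fake_dev and FAKE_INVALID_PE are hypothetical, and I am not
claiming this is how pnv_ioda_get_pe() is implemented in the series:

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_INVALID_PE (-1)	/* hypothetical stand-in for IODA_INVALID_PE */

struct fake_pe {
	int pe_number;
};

struct fake_phb {
	struct fake_pe *pe_array;
	int total_pe_num;
};

struct fake_dev {
	struct fake_phb *phb;	/* may be NULL */
	int pe_number;
};

/* Return the PE for dev, or NULL if there is no PHB or no PE assigned. */
struct fake_pe *fake_get_pe(const struct fake_dev *dev)
{
	assert(dev != NULL);

	if (!dev->phb)
		return NULL;
	if (dev->pe_number == FAKE_INVALID_PE)
		return NULL;
	if (dev->pe_number < 0 || dev->pe_number >= dev->phb->total_pe_num)
		return NULL;

	return &dev->phb->pe_array[dev->pe_number];
}
```

Callers then only need a single "if (!pe)" test and never touch the PHB
themselves.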


>  
> -		pe = &phb->ioda.pe_array[pdn->pe_number];
> -		if (!pe->table_group.group)
> +		pe = pnv_ioda_get_pe(pdev);
> +		if (!pe || !pe->table_group.group)
>  			return 0;
> +
>  		iommu_add_device(&pe->table_group, dev);
>  		return 0;
>  	case BUS_NOTIFY_DEL_DEVICE:
> 

-- 
Alexey


* Re: [Very RFC 34/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_enable_device_hook()
  2019-11-20  1:28 ` [Very RFC 34/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_enable_device_hook() Oliver O'Halloran
@ 2019-11-27  5:14   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  5:14 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

but better squash it.


> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 7e88de18ead6..4f38652c7cd7 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3382,7 +3382,6 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
>  static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
>  {
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
> -	struct pci_dn *pdn;
>  
>  	/* The function is probably called while the PEs have
>  	 * not be created yet. For example, resource reassignment
> @@ -3392,11 +3391,7 @@ static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
>  	if (!phb->initialized)
>  		return true;
>  
> -	pdn = pci_get_pdn(dev);
> -	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
> -		return false;
> -
> -	return true;
> +	return !!pnv_ioda_get_pe(dev);
>  }
>  
>  static long pnv_pci_ioda1_unset_window(struct iommu_table_group *table_group,
> 

-- 
Alexey


* Re: [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device
  2019-11-20  1:28 ` [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device Oliver O'Halloran
@ 2019-11-27  5:24   ` Alexey Kardashevskiy
  2019-11-27  9:51     ` Oliver O'Halloran
  0 siblings, 1 reply; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  5:24 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 4f38652c7cd7..8525642b1256 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3562,14 +3562,14 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
>  static void pnv_pci_release_device(struct pci_dev *pdev)
>  {
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
> +	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
>  	struct pci_dn *pdn = pci_get_pdn(pdev);
> -	struct pnv_ioda_pe *pe;
>  
>  	/* The VF PE state is torn down when sriov_disable() is called */
>  	if (pdev->is_virtfn)
>  		return;
>  
> -	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
> +	if (WARN_ON(!pe))


Is that a WARN_ON because there is always a PE, either from the upstream
bridge or a reserved one?



>  		return;
>  
>  	/*
> @@ -3588,7 +3588,6 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
>  	 * be increased on adding devices. It leads to unbalanced PE's device
>  	 * count and eventually make normal PCI hotplug path broken.
>  	 */
> -	pe = &phb->ioda.pe_array[pdn->pe_number];
>  	pdn->pe_number = IODA_INVALID_PE;
>  
>  	WARN_ON(--pe->device_count < 0);
> 

-- 
Alexey


* Re: [Very RFC 36/46] powernv/npu: Remove open-coded PE lookup for GPU device
  2019-11-20  1:28 ` [Very RFC 36/46] powernv/npu: Remove open-coded PE lookup for GPU device Oliver O'Halloran
@ 2019-11-27  5:45   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  5:45 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 13 ++-----------
>  1 file changed, 2 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index b95b9e3c4c98..68bfaef44862 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -97,25 +97,16 @@ EXPORT_SYMBOL(pnv_pci_get_npu_dev);
>  static struct pnv_ioda_pe *get_gpu_pci_dev_and_pe(struct pnv_ioda_pe *npe,
>  						  struct pci_dev **gpdev)
>  {
> -	struct pnv_phb *phb;
> -	struct pci_controller *hose;
>  	struct pci_dev *pdev;
>  	struct pnv_ioda_pe *pe;
> -	struct pci_dn *pdn;
>  
>  	pdev = pnv_pci_get_gpu_dev(npe->pdev);
>  	if (!pdev)
>  		return NULL;
>  
> -	pdn = pci_get_pdn(pdev);
> -	if (WARN_ON(!pdn || pdn->pe_number == IODA_INVALID_PE))
> -		return NULL;
> -
> -	hose = pci_bus_to_host(pdev->bus);
> -	phb = hose->private_data;
> -	pe = &phb->ioda.pe_array[pdn->pe_number];
> +	pe = pnv_ioda_get_pe(pdev);
>  
> -	if (gpdev)
> +	if (pe && pdev)


s/pdev/gpdev/
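
The reason: gpdev is an optional out-parameter and callers may pass
NULL, while pdev is already known to be non-NULL at this point, so
testing pdev protects nothing. A small userspace sketch of the intended
check, where fake_get_dev_and_pe() and the fake_* types are hypothetical
stand-ins rather than the real NPU code:

```c
#include <stddef.h>

struct fake_pe {
	int pe_number;
};

struct fake_dev {
	struct fake_pe *pe;	/* NULL if no PE is assigned */
};

/*
 * Look up the PE of dev. out_dev is an optional out-parameter:
 * callers that do not need the device pass NULL, so it is the out
 * pointer that has to be tested before the store.
 */
struct fake_pe *fake_get_dev_and_pe(struct fake_dev *dev, struct fake_dev **out_dev)
{
	if (!dev)
		return NULL;

	if (dev->pe && out_dev)
		*out_dev = dev;

	return dev->pe;
}
```

Calling it with a NULL second argument is then safe, which is the case
the "pe && pdev" test would crash on.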



>  		*gpdev = pdev;
>  
>  	return pe;
> 

-- 
Alexey


* Re: [Very RFC 38/46] powerpc/pci-hotplug: Scan the whole bus when using PCI_PROBE_NORMAL
  2019-11-20  1:28 ` [Very RFC 38/46] powerpc/pci-hotplug: Scan the whole bus when using PCI_PROBE_NORMAL Oliver O'Halloran
@ 2019-11-27  6:27   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  6:27 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Currently when using the normal (i.e not building pci_dev's from the DT
> node) probe method we only scan the devfn corresponding to the first child
> of the bridge's DT node. This doesn't make much sense to me, but it seems
> to have worked so far. At a guess it seems to work because in a PCIe
> environment the first downstream child will be at devfn 00.0.
> 
> In any case it's completely broken when no pci_dn is available. Remove
> the PCI_DN checking and scan each of the device number that might be on
> the downstream bus.


Then why not just use pci_scan_child_bus()? Thanks,
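
For reference, the new loop visits function 0 of each of the 32 device
numbers; with a step of 8 the last devfn is 248, so a bound of 255 and a
bound of 256 give the same walk. As far as I remember the generic child
bus scan does the same per-slot walk. A userspace sketch of the
arithmetic, where DEVFN()/SLOT() are local copies of the usual devfn
encoding and walk_slots() is a hypothetical helper:

```c
#include <stddef.h>

/* Local copies of the devfn encoding: 5 bits of slot, 3 bits of function. */
#define DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define SLOT(devfn)		(((devfn) >> 3) & 0x1f)

/*
 * Walk function 0 of every device number on a bus and record each
 * slot visited in slots[] (up to len entries). Returns the number of
 * slots walked.
 */
int walk_slots(unsigned char *slots, size_t len)
{
	unsigned int devfn;
	int n = 0;

	for (devfn = 0; devfn < 256; devfn += 8) {
		if ((size_t)n < len)
			slots[n] = SLOT(devfn);
		n++;
	}
	return n;
}
```

So nothing here is specific to powerpc hotplug, which is why an
open-coded loop looks unnecessary to me.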


> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> I'm not sure we should be using pci_scan_slot() directly here. Maybe
> there's some insane legacy reason for it.
> ---
>  arch/powerpc/kernel/pci-hotplug.c | 15 ++++-----------
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> index d6a67f814983..85299c769768 100644
> --- a/arch/powerpc/kernel/pci-hotplug.c
> +++ b/arch/powerpc/kernel/pci-hotplug.c
> @@ -123,17 +123,10 @@ void pci_hp_add_devices(struct pci_bus *bus)
>  	if (mode == PCI_PROBE_DEVTREE) {
>  		/* use ofdt-based probe */
>  		of_rescan_bus(dn, bus);
> -	} else if (mode == PCI_PROBE_NORMAL &&
> -		   dn->child && PCI_DN(dn->child)) {
> -		/*
> -		 * Use legacy probe. In the partial hotplug case, we
> -		 * probably have grandchildren devices unplugged. So
> -		 * we don't check the return value from pci_scan_slot() in
> -		 * order for fully rescan all the way down to pick them up.
> -		 * They can have been removed during partial hotplug.
> -		 */
> -		slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
> -		pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
> +	} else if (mode == PCI_PROBE_NORMAL) {
> +		for (slotno = 0; slotno < 255; slotno += 8)
> +			pci_scan_slot(bus, slotno);
> +
>  		max = bus->busn_res.start;
>  		/*
>  		 * Scan bridges that are already configured. We don't touch
> 

-- 
Alexey


* Re: [Very RFC 39/46] powernv/npu: Avoid pci_dn when mapping device_node to a pci_dev
  2019-11-20  1:28 ` [Very RFC 39/46] powernv/npu: Avoid pci_dn when mapping device_node to a pci_dev Oliver O'Halloran
@ 2019-11-27  6:58   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  6:58 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> There's no need to use the pci_dn to find a device_node from a pci_dev.
> Just search for the node pointed to by the pci_dev's of_node pointer.



Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 68bfaef44862..72d3749da02c 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -21,11 +21,11 @@
>  
>  static struct pci_dev *get_pci_dev(struct device_node *dn)
>  {
> -	struct pci_dn *pdn = PCI_DN(dn);
> -	struct pci_dev *pdev;
> +	struct pci_dev *pdev = NULL;
>  
> -	pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> -					   pdn->busno, pdn->devfn);
> +	for_each_pci_dev(pdev)
> +		if (pdev->dev.of_node == dn)
> +			break;
>  
>  	/*
>  	 * pci_get_domain_bus_and_slot() increased the reference count of
> 

-- 
Alexey


* Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-20  1:28 ` [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs Oliver O'Halloran
@ 2019-11-27  7:09   ` Alexey Kardashevskiy
  2019-11-27  8:24     ` Greg Kurz
  0 siblings, 1 reply; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  7:09 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko, Greg Kurz



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> The comment here implies that we don't need to take a ref to the pci_dev
> because the ioda_pe will always have one. This implies that the current
> expectation is that the pci_dev for an NPU device will *never* be torn
> down since the ioda_pe having a ref to the device will prevent the
> release function from being called.
> 
> In other words, the desired behaviour here appears to be leaking a ref.
> 
> Nice!


There is a history: https://patchwork.ozlabs.org/patch/1088078/

We did not fix anything in particular then, we do not seem to be fixing
anything now (in other words - we cannot test it in a normal natural
way). I'd drop this one.



> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 72d3749da02c..2eb6e6d45a98 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
>  			break;
>  
>  	/*
> -	 * pci_get_domain_bus_and_slot() increased the reference count of
> -	 * the PCI device, but callers don't need that actually as the PE
> -	 * already holds a reference to the device. Since callers aren't
> -	 * aware of the reference count change, call pci_dev_put() now to
> -	 * avoid leaks.
> +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
> +	 * Caller is responsible for dropping the ref when it's
> +	 * finished with it.
>  	 */
> -	if (pdev)
> -		pci_dev_put(pdev);
> -
>  	return pdev;
>  }
>  
> 

-- 
Alexey


* Re: [Very RFC 41/46] powernv/eeh: Remove pdn setup for SR-IOV VFs
  2019-11-20  1:28 ` [Very RFC 41/46] powernv/eeh: Remove pdn setup for SR-IOV VFs Oliver O'Halloran
@ 2019-11-27  7:14   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  7:14 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> We don't need a pci_dn for the VF any more, so we can skip adding them.

Excellent!

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>




> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 16 ----------------
>  1 file changed, 16 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index d111a50fbe68..d3e375d71cdc 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1526,7 +1526,6 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
>  		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
>  		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
> -		struct pci_dn *vf_pdn;
>  
>  		if (iov->m64_single_mode)
>  			pe_num = iov->pe_num_map[vf_index];
> @@ -1558,15 +1557,6 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		list_add_tail(&pe->list, &phb->ioda.pe_list);
>  		mutex_unlock(&phb->ioda.pe_list_mutex);
>  
> -		/* associate this pe to it's pdn */
> -		list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
> -			if (vf_pdn->busno == vf_bus &&
> -			    vf_pdn->devfn == vf_devfn) {
> -				vf_pdn->pe_number = pe_num;
> -				break;
> -			}
> -		}
> -
>  		pnv_pci_ioda2_setup_dma_pe(phb, pe);
>  #ifdef CONFIG_IOMMU_API
>  		iommu_register_group(&pe->table_group,
> @@ -1688,17 +1678,11 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  int pnv_pcibios_sriov_disable(struct pci_dev *pdev)
>  {
>  	pnv_pci_sriov_disable(pdev);
> -
> -	/* Release PCI data */
> -	remove_sriov_vf_pdns(pdev);
>  	return 0;
>  }
>  
>  int pnv_pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
>  {
> -	/* Allocate PCI data */
> -	add_sriov_vf_pdns(pdev);
> -
>  	return pnv_pci_sriov_enable(pdev, num_vfs);
>  }
>  #endif /* CONFIG_PCI_IOV */
> 

-- 
Alexey


* Re: [Very RFC 42/46] powernv/pci: Don't clear pdn->pe_number in pnv_pci_release_device
  2019-11-20  1:28 ` [Very RFC 42/46] powernv/pci: Don't clear pdn->pe_number in pnv_pci_release_device Oliver O'Halloran
@ 2019-11-27  7:30   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27  7:30 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> Nothing looks at it anymore.

With a small extra step we can ditch it (compile tested):

https://github.com/aik/linux/commit/14db7061d48220354e83f8e100ab0cc1b7181da4



> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 12 ------------
>  1 file changed, 12 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index d3e375d71cdc..45d940730c30 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3541,9 +3541,7 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
>  
>  static void pnv_pci_release_device(struct pci_dev *pdev)
>  {
> -	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>  	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
> -	struct pci_dn *pdn = pci_get_pdn(pdev);
>  
>  	/* The VF PE state is torn down when sriov_disable() is called */
>  	if (pdev->is_virtfn)
> @@ -3560,16 +3558,6 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
>  	if (pdev->is_physfn)
>  		kfree(pdev->dev.archdata.iov_data);
>  
> -	/*
> -	 * PCI hotplug can happen as part of EEH error recovery. The @pdn
> -	 * isn't removed and added afterwards in this scenario. We should
> -	 * set the PE number in @pdn to an invalid one. Otherwise, the PE's
> -	 * device count is decreased on removing devices while failing to
> -	 * be increased on adding devices. It leads to unbalanced PE's device
> -	 * count and eventually make normal PCI hotplug path broken.
> -	 */
> -	pdn->pe_number = IODA_INVALID_PE;
> -
>  	WARN_ON(--pe->device_count < 0);
>  	if (pe->device_count == 0)
>  		pnv_ioda_release_pe(pe);
> 






-- 
Alexey


* Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-27  7:09   ` Alexey Kardashevskiy
@ 2019-11-27  8:24     ` Greg Kurz
  2019-11-27  9:10       ` Frederic Barrat
  0 siblings, 1 reply; 107+ messages in thread
From: Greg Kurz @ 2019-11-27  8:24 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Frederic Barrat, linuxppc-dev, Oliver O'Halloran,
	s.miroshnichenko, alistair

On Wed, 27 Nov 2019 18:09:40 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> 
> 
> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> > The comment here implies that we don't need to take a ref to the pci_dev
> > because the ioda_pe will always have one. This implies that the current
> > expectation is that the pci_dev for an NPU device will *never* be torn
> > down since the ioda_pe having a ref to the device will prevent the
> > release function from being called.
> > 
> > In other words, the desired behaviour here appears to be leaking a ref.
> > 
> > Nice!
> 
> 
> There is a history: https://patchwork.ozlabs.org/patch/1088078/
> 
> We did not fix anything in particular then, we do not seem to be fixing
> anything now (in other words - we cannot test it in a normal natural
> way). I'd drop this one.
> 

Yeah, I didn't fix anything at the time. Just reverted to the ref
count behavior we had before:

https://patchwork.ozlabs.org/patch/829172/

Frederic recently posted his take on the same topic from the OpenCAPI
point of view:

http://patchwork.ozlabs.org/patch/1198947/

He seems to indicate the NPU devices as the real culprit because
nobody ever cared for them to be removable. Fixing that seems to be
a chore nobody really wants to address obviously... :-\

> 
> 
> > 
> > Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> > ---
> >  arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
> >  1 file changed, 3 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> > index 72d3749da02c..2eb6e6d45a98 100644
> > --- a/arch/powerpc/platforms/powernv/npu-dma.c
> > +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> > @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
> >  			break;
> >  
> >  	/*
> > -	 * pci_get_domain_bus_and_slot() increased the reference count of
> > -	 * the PCI device, but callers don't need that actually as the PE
> > -	 * already holds a reference to the device. Since callers aren't
> > -	 * aware of the reference count change, call pci_dev_put() now to
> > -	 * avoid leaks.
> > +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
> > +	 * Caller is responsible for dropping the ref when it's
> > +	 * finished with it.
> >  	 */
> > -	if (pdev)
> > -		pci_dev_put(pdev);
> > -
> >  	return pdev;
> >  }
> >  
> > 
> 



* Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-27  8:24     ` Greg Kurz
@ 2019-11-27  9:10       ` Frederic Barrat
  2019-11-27  9:33         ` Greg Kurz
  0 siblings, 1 reply; 107+ messages in thread
From: Frederic Barrat @ 2019-11-27  9:10 UTC (permalink / raw)
  To: Greg Kurz, Alexey Kardashevskiy
  Cc: linuxppc-dev, Oliver O'Halloran, s.miroshnichenko, alistair



Le 27/11/2019 à 09:24, Greg Kurz a écrit :
> On Wed, 27 Nov 2019 18:09:40 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>>
>>
>> On 20/11/2019 12:28, Oliver O'Halloran wrote:
>>> The comment here implies that we don't need to take a ref to the pci_dev
>>> because the ioda_pe will always have one. This implies that the current
>>> expectation is that the pci_dev for an NPU device will *never* be torn
>>> down since the ioda_pe having a ref to the device will prevent the
>>> release function from being called.
>>>
>>> In other words, the desired behaviour here appears to be leaking a ref.
>>>
>>> Nice!
>>
>>
>> There is a history: https://patchwork.ozlabs.org/patch/1088078/
>>
>> We did not fix anything in particular then, we do not seem to be fixing
>> anything now (in other words - we cannot test it in a normal natural
>> way). I'd drop this one.
>>
> 
> Yeah, I didn't fix anything at the time. Just reverted to the ref
> count behavior we had before:
> 
> https://patchwork.ozlabs.org/patch/829172/
> 
> Frederic recently posted his take on the same topic from the OpenCAPI
> point of view:
> 
> http://patchwork.ozlabs.org/patch/1198947/
> 
> He seems to indicate the NPU devices as the real culprit because
> nobody ever cared for them to be removable. Fixing that seems to be
> a chore nobody really wants to address obviously... :-\


I had taken a stab at not leaking a ref for the nvlink devices and do 
the proper thing regarding ref counting (i.e. fixing all the callers of 
get_pci_dev() to drop the reference when they were done). With that, I 
could see that the ref count of the nvlink devices could drop to 0 
(calling remove for the device in /sys) and that the devices could go away.

But then, I realized it's not necessarily desirable at this point. There 
are several comments in the code saying the npu devices (for nvlink) 
don't go away, there's no device release callback defined when it seems 
there should be, at least to handle releasing PEs.... All in all, it 
seems that some work would be needed. And if it hasn't been required by 
now...

   Fred


>>
>>
>>>
>>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
>>> ---
>>>   arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
>>>   1 file changed, 3 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
>>> index 72d3749da02c..2eb6e6d45a98 100644
>>> --- a/arch/powerpc/platforms/powernv/npu-dma.c
>>> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
>>> @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
>>>   			break;
>>>   
>>>   	/*
>>> -	 * pci_get_domain_bus_and_slot() increased the reference count of
>>> -	 * the PCI device, but callers don't need that actually as the PE
>>> -	 * already holds a reference to the device. Since callers aren't
>>> -	 * aware of the reference count change, call pci_dev_put() now to
>>> -	 * avoid leaks.
>>> +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
>>> +	 * Caller is responsible for dropping the ref when it's
>>> +	 * finished with it.
>>>   	 */
>>> -	if (pdev)
>>> -		pci_dev_put(pdev);
>>> -
>>>   	return pdev;
>>>   }
>>>   
>>>
>>
> 



* Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-27  9:10       ` Frederic Barrat
@ 2019-11-27  9:33         ` Greg Kurz
  2019-11-27  9:40           ` Oliver O'Halloran
  2019-11-27  9:47           ` Frederic Barrat
  0 siblings, 2 replies; 107+ messages in thread
From: Greg Kurz @ 2019-11-27  9:33 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Alexey Kardashevskiy, linuxppc-dev, Oliver O'Halloran,
	s.miroshnichenko, alistair

On Wed, 27 Nov 2019 10:10:13 +0100
Frederic Barrat <fbarrat@linux.ibm.com> wrote:

> 
> 
> Le 27/11/2019 à 09:24, Greg Kurz a écrit :
> > On Wed, 27 Nov 2019 18:09:40 +1100
> > Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> > 
> >>
> >>
> >> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> >>> The comment here implies that we don't need to take a ref to the pci_dev
> >>> because the ioda_pe will always have one. This implies that the current
> >>> expectation is that the pci_dev for an NPU device will *never* be torn
> >>> down since the ioda_pe having a ref to the device will prevent the
> >>> release function from being called.
> >>>
> >>> In other words, the desired behaviour here appears to be leaking a ref.
> >>>
> >>> Nice!
> >>
> >>
> >> There is a history: https://patchwork.ozlabs.org/patch/1088078/
> >>
> >> We did not fix anything in particular then, we do not seem to be fixing
> >> anything now (in other words - we cannot test it in a normal natural
> >> way). I'd drop this one.
> >>
> > 
> > Yeah, I didn't fix anything at the time. Just reverted to the ref
> > count behavior we had before:
> > 
> > https://patchwork.ozlabs.org/patch/829172/
> > 
> > Frederic recently posted his take on the same topic from the OpenCAPI
> > point of view:
> > 
> > http://patchwork.ozlabs.org/patch/1198947/
> > 
> > He seems to indicate the NPU devices as the real culprit because
> > nobody ever cared for them to be removable. Fixing that seems to be
> > a chore nobody really wants to address obviously... :-\
> 
> 
> I had taken a stab at not leaking a ref for the nvlink devices and do 
> the proper thing regarding ref counting (i.e. fixing all the callers of 
> get_pci_dev() to drop the reference when they were done). With that, I 
> could see that the ref count of the nvlink devices could drop to 0 
> (calling remove for the device in /sys) and that the devices could go away.
> 
> But then, I realized it's not necessarily desirable at this point. There 
> are several comments in the code saying the npu devices (for nvlink) 
> don't go away, there's no device release callback defined when it seems 
> there should be, at least to handle releasing PEs.... All in all, it 
> seems that some work would be needed. And if it hasn't been required by 
> now...
> 

If everyone is ok with leaking a reference in the NPU case, I guess
this isn't a problem. But if we move forward with Oliver's patch, a
pci_dev_put() would be needed for OpenCAPI, correct ?

>    Fred
> 
> 
> >>
> >>
> >>>
> >>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> >>> ---
> >>>   arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
> >>>   1 file changed, 3 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> >>> index 72d3749da02c..2eb6e6d45a98 100644
> >>> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> >>> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> >>> @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
> >>>   			break;
> >>>   
> >>>   	/*
> >>> -	 * pci_get_domain_bus_and_slot() increased the reference count of
> >>> -	 * the PCI device, but callers don't need that actually as the PE
> >>> -	 * already holds a reference to the device. Since callers aren't
> >>> -	 * aware of the reference count change, call pci_dev_put() now to
> >>> -	 * avoid leaks.
> >>> +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
> >>> +	 * Caller is responsible for dropping the ref when it's
> >>> +	 * finished with it.
> >>>   	 */
> >>> -	if (pdev)
> >>> -		pci_dev_put(pdev);
> >>> -
> >>>   	return pdev;
> >>>   }
> >>>   
> >>>
> >>
> > 
> 



* Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-27  9:33         ` Greg Kurz
@ 2019-11-27  9:40           ` Oliver O'Halloran
  2019-11-27 12:00             ` Greg Kurz
  2019-11-27  9:47           ` Frederic Barrat
  1 sibling, 1 reply; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-27  9:40 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Frederic Barrat, Alexey Kardashevskiy, linuxppc-dev,
	Sergey Miroshnichenko, Alistair Popple

On Wed, Nov 27, 2019 at 8:34 PM Greg Kurz <groug@kaod.org> wrote:
>
>
> If everyone is ok with leaking a reference in the NPU case, I guess
> this isn't a problem. But if we move forward with Oliver's patch, a
> pci_dev_put() would be needed for OpenCAPI, correct ?

Yes, but I think that's fair enough. By convention it's the caller's
responsibility to drop the ref when it calls a function that returns a
refcounted object. Doing anything else creates a race condition since
the object's count could drop to zero before the caller starts using
it.
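
A user-space sketch of that convention, assuming a plain int counter instead
of the kref/atomic machinery the kernel uses; struct obj, obj_get(),
obj_put() and obj_find() are made-up names and not part of any kernel API:

```c
#include <assert.h>
#include <stddef.h>

struct obj {
	int id;
	int refcount;	/* single-threaded sketch, so no atomics */
};

/* Take a reference; returns the object for convenience. */
static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

/* Drop a reference previously taken with obj_get() or obj_find(). */
static void obj_put(struct obj *o)
{
	if (o) {
		assert(o->refcount > 0);
		o->refcount--;
	}
}

/*
 * Look up an object by id. On success the object is returned with its
 * refcount elevated, and the caller must call obj_put() when finished.
 * Returns NULL (and takes no reference) if nothing matches.
 */
static struct obj *obj_find(struct obj *table, size_t n, int id)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i].id == id)
			return obj_get(&table[i]);

	return NULL;
}
```

If obj_find() dropped the reference itself before returning, nothing would
stop another path from releasing the object between the lookup and the
caller's first use, which is the race described above.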

Oliver


* Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-27  9:33         ` Greg Kurz
  2019-11-27  9:40           ` Oliver O'Halloran
@ 2019-11-27  9:47           ` Frederic Barrat
  2019-11-27 12:03             ` Greg Kurz
  1 sibling, 1 reply; 107+ messages in thread
From: Frederic Barrat @ 2019-11-27  9:47 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Alexey Kardashevskiy, linuxppc-dev, Oliver O'Halloran,
	s.miroshnichenko, alistair



Le 27/11/2019 à 10:33, Greg Kurz a écrit :
> On Wed, 27 Nov 2019 10:10:13 +0100
> Frederic Barrat <fbarrat@linux.ibm.com> wrote:
> 
>>
>>
>> Le 27/11/2019 à 09:24, Greg Kurz a écrit :
>>> On Wed, 27 Nov 2019 18:09:40 +1100
>>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>
>>>>
>>>>
>>>> On 20/11/2019 12:28, Oliver O'Halloran wrote:
>>>>> The comment here implies that we don't need to take a ref to the pci_dev
>>>>> because the ioda_pe will always have one. This implies that the current
>>>>> expectation is that the pci_dev for an NPU device will *never* be torn
>>>>> down since the ioda_pe having a ref to the device will prevent the
>>>>> release function from being called.
>>>>>
>>>>> In other words, the desired behaviour here appears to be leaking a ref.
>>>>>
>>>>> Nice!
>>>>
>>>>
>>>> There is a history: https://patchwork.ozlabs.org/patch/1088078/
>>>>
>>>> We did not fix anything in particular then, we do not seem to be fixing
>>>> anything now (in other words - we cannot test it in a normal natural
>>>> way). I'd drop this one.
>>>>
>>>
>>> Yeah, I didn't fix anything at the time. Just reverted to the ref
>>> count behavior we had before:
>>>
>>> https://patchwork.ozlabs.org/patch/829172/
>>>
>>> Frederic recently posted his take on the same topic from the OpenCAPI
>>> point of view:
>>>
>>> http://patchwork.ozlabs.org/patch/1198947/
>>>
>>> He seems to indicate the NPU devices as the real culprit because
>>> nobody ever cared for them to be removable. Fixing that seems to be
>>> a chore nobody really wants to address obviously... :-\
>>
>>
>> I had taken a stab at not leaking a ref for the nvlink devices and do
>> the proper thing regarding ref counting (i.e. fixing all the callers of
>> get_pci_dev() to drop the reference when they were done). With that, I
>> could see that the ref count of the nvlink devices could drop to 0
>> (calling remove for the device in /sys) and that the devices could go away.
>>
>> But then, I realized it's not necessarily desirable at this point. There
>> are several comments in the code saying the npu devices (for nvlink)
>> don't go away, there's no device release callback defined when it seems
>> there should be, at least to handle releasing PEs.... All in all, it
>> seems that some work would be needed. And if it hasn't been required by
>> now...
>>
> 
> If everyone is ok with leaking a reference in the NPU case, I guess
> this isn't a problem. But if we move forward with Oliver's patch, a
> pci_dev_put() would be needed for OpenCAPI, correct ?


No, these code paths are nvlink-only.

   Fred



>>     Fred
>>
>>
>>>>
>>>>
>>>>>
>>>>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
>>>>> ---
>>>>>    arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
>>>>>    1 file changed, 3 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
>>>>> index 72d3749da02c..2eb6e6d45a98 100644
>>>>> --- a/arch/powerpc/platforms/powernv/npu-dma.c
>>>>> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
>>>>> @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
>>>>>    			break;
>>>>>    
>>>>>    	/*
>>>>> -	 * pci_get_domain_bus_and_slot() increased the reference count of
>>>>> -	 * the PCI device, but callers don't need that actually as the PE
>>>>> -	 * already holds a reference to the device. Since callers aren't
>>>>> -	 * aware of the reference count change, call pci_dev_put() now to
>>>>> -	 * avoid leaks.
>>>>> +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
>>>>> +	 * Caller is responsible for dropping the ref when it's
>>>>> +	 * finished with it.
>>>>>    	 */
>>>>> -	if (pdev)
>>>>> -		pci_dev_put(pdev);
>>>>> -
>>>>>    	return pdev;
>>>>>    }
>>>>>    
>>>>>
>>>>
>>>
>>
> 



* Re: [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device
  2019-11-27  5:24   ` Alexey Kardashevskiy
@ 2019-11-27  9:51     ` Oliver O'Halloran
  0 siblings, 0 replies; 107+ messages in thread
From: Oliver O'Halloran @ 2019-11-27  9:51 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alistair Popple, linuxppc-dev, Sergey Miroshnichenko

On Wed, Nov 27, 2019 at 4:24 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> > Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> > ---
> >  arch/powerpc/platforms/powernv/pci-ioda.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 4f38652c7cd7..8525642b1256 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -3562,14 +3562,14 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
> >  static void pnv_pci_release_device(struct pci_dev *pdev)
> >  {
> >       struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
> > +     struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
> >       struct pci_dn *pdn = pci_get_pdn(pdev);
> > -     struct pnv_ioda_pe *pe;
> >
> >       /* The VF PE state is torn down when sriov_disable() is called */
> >       if (pdev->is_virtfn)
> >               return;
> >
> > -     if (!pdn || pdn->pe_number == IODA_INVALID_PE)
> > +     if (WARN_ON(!pe))
>
>
> Is that WARN_ON because there is always a PE - from upstream bridge or a

The device should always belong to a PE. If it doesn't (at this point)
then something deeply strange has happened.

> reserved one?

If it's associated with the reserved PE the rmap is set to
IODA_INVALID_PE, so the lookup would return NULL and we'd hit the WARN_ON(). I
think that's ok though since PE assignment should always succeed. If
it failed, or we're tearing down the device before we got to the point
of assigning a PE then there's probably a bug.
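
Roughly, and only as a user-space sketch with simplified, hypothetical names
(struct phb, get_pe() and release_device() here are stand-ins, and INVALID_PE
plays the role of IODA_INVALID_PE), the lookup and the check behave like this:

```c
#include <assert.h>
#include <stddef.h>

#define INVALID_PE	(-1)	/* stand-in for IODA_INVALID_PE */
#define NR_DEVFN	256
#define NR_PE		8

struct pe {
	int pe_number;
	int device_count;
};

struct phb {
	int pe_rmap[NR_DEVFN];		/* devfn -> PE number, or INVALID_PE */
	struct pe pe_array[NR_PE];
};

/* Reverse-map lookup: NULL means no usable PE is assigned to the device. */
static struct pe *get_pe(struct phb *phb, unsigned int devfn)
{
	int pe_number;

	assert(devfn < NR_DEVFN);
	pe_number = phb->pe_rmap[devfn];
	if (pe_number == INVALID_PE)
		return NULL;

	assert(pe_number >= 0 && pe_number < NR_PE);
	return &phb->pe_array[pe_number];
}

/*
 * Returns 0 on success, or -1 at the point where the kernel code would
 * WARN_ON() and bail out because the device has no PE.
 */
static int release_device(struct phb *phb, unsigned int devfn)
{
	struct pe *pe = get_pe(phb, devfn);

	if (!pe)
		return -1;

	pe->device_count--;
	return 0;
}
```

A device whose rmap entry was never filled in (or was left pointing at the
invalid marker) takes the -1 path, which is the case that should only show up
if there is a bug elsewhere.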


* Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-27  9:40           ` Oliver O'Halloran
@ 2019-11-27 12:00             ` Greg Kurz
  0 siblings, 0 replies; 107+ messages in thread
From: Greg Kurz @ 2019-11-27 12:00 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Frederic Barrat, Alexey Kardashevskiy, linuxppc-dev,
	Sergey Miroshnichenko, Alistair Popple

On Wed, 27 Nov 2019 20:40:00 +1100
"Oliver O'Halloran" <oohall@gmail.com> wrote:

> On Wed, Nov 27, 2019 at 8:34 PM Greg Kurz <groug@kaod.org> wrote:
> >
> >
> > If everyone is ok with leaking a reference in the NPU case, I guess
> > this isn't a problem. But if we move forward with Oliver's patch, a
> > pci_dev_put() would be needed for OpenCAPI, correct ?
> 
> Yes, but I think that's fair enough. By convention it's the caller's
> responsibility to drop the ref when it calls a function that returns a
> refcounted object. Doing anything else creates a race condition since
> the object's count could drop to zero before the caller starts using
> it.
> 

Sure, you're right, especially with Frederic's patch that drops
the pci_dev_get(dev) in pnv_ioda_setup_dev_PE().

> Oliver



* Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
  2019-11-27  9:47           ` Frederic Barrat
@ 2019-11-27 12:03             ` Greg Kurz
  0 siblings, 0 replies; 107+ messages in thread
From: Greg Kurz @ 2019-11-27 12:03 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Alexey Kardashevskiy, linuxppc-dev, Oliver O'Halloran,
	s.miroshnichenko, alistair

On Wed, 27 Nov 2019 10:47:45 +0100
Frederic Barrat <fbarrat@linux.ibm.com> wrote:

> 
> 
> Le 27/11/2019 à 10:33, Greg Kurz a écrit :
> > On Wed, 27 Nov 2019 10:10:13 +0100
> > Frederic Barrat <fbarrat@linux.ibm.com> wrote:
> > 
> >>
> >>
> >> Le 27/11/2019 à 09:24, Greg Kurz a écrit :
> >>> On Wed, 27 Nov 2019 18:09:40 +1100
> >>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> >>>>> The comment here implies that we don't need to take a ref to the pci_dev
> >>>>> because the ioda_pe will always have one. This implies that the current
> >>>>> expectation is that the pci_dev for an NPU device will *never* be torn
> >>>>> down since the ioda_pe having a ref to the device will prevent the
> >>>>> release function from being called.
> >>>>>
> >>>>> In other words, the desired behaviour here appears to be leaking a ref.
> >>>>>
> >>>>> Nice!
> >>>>
> >>>>
> >>>> There is a history: https://patchwork.ozlabs.org/patch/1088078/
> >>>>
> >>>> We did not fix anything in particular then, we do not seem to be fixing
> >>>> anything now (in other words - we cannot test it in a normal natural
> >>>> way). I'd drop this one.
> >>>>
> >>>
> >>> Yeah, I didn't fix anything at the time. Just reverted to the ref
> >>> count behavior we had before:
> >>>
> >>> https://patchwork.ozlabs.org/patch/829172/
> >>>
> >>> Frederic recently posted his take on the same topic from the OpenCAPI
> >>> point of view:
> >>>
> >>> http://patchwork.ozlabs.org/patch/1198947/
> >>>
> >>> He seems to indicate the NPU devices as the real culprit because
> >>> nobody ever cared for them to be removable. Fixing that seems to be
> >>> a chore nobody really wants to address obviously... :-\
> >>
> >>
> >> I had taken a stab at not leaking a ref for the nvlink devices and do
> >> the proper thing regarding ref counting (i.e. fixing all the callers of
> >> get_pci_dev() to drop the reference when they were done). With that, I
> >> could see that the ref count of the nvlink devices could drop to 0
> >> (calling remove for the device in /sys) and that the devices could go away.
> >>
> >> But then, I realized it's not necessarily desirable at this point. There
> >> are several comments in the code saying the npu devices (for nvlink)
> >> don't go away, there's no device release callback defined when it seems
> >> there should be, at least to handle releasing PEs.... All in all, it
> >> seems that some work would be needed. And if it hasn't been required by
> >> now...
> >>
> > 
> > If everyone is ok with leaking a reference in the NPU case, I guess
> > this isn't a problem. But if we move forward with Oliver's patch, a
> > pci_dev_put() would be needed for OpenCAPI, correct ?
> 
> 
> No, these code paths are nvlink-only.
> 

Oh yes indeed. Then this patch and yours fit well together :)

>    Fred
> 
> 
> 
> >>     Fred
> >>
> >>
> >>>>
> >>>>
> >>>>>
> >>>>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> >>>>> ---
> >>>>>    arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
> >>>>>    1 file changed, 3 insertions(+), 8 deletions(-)
> >>>>>
> >>>>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> index 72d3749da02c..2eb6e6d45a98 100644
> >>>>> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
> >>>>>    			break;
> >>>>>    
> >>>>>    	/*
> >>>>> -	 * pci_get_domain_bus_and_slot() increased the reference count of
> >>>>> -	 * the PCI device, but callers don't need that actually as the PE
> >>>>> -	 * already holds a reference to the device. Since callers aren't
> >>>>> -	 * aware of the reference count change, call pci_dev_put() now to
> >>>>> -	 * avoid leaks.
> >>>>> +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
> >>>>> +	 * Caller is responsible for dropping the ref when it's
> >>>>> +	 * finished with it.
> >>>>>    	 */
> >>>>> -	if (pdev)
> >>>>> -		pci_dev_put(pdev);
> >>>>> -
> >>>>>    	return pdev;
> >>>>>    }
> >>>>>    
> >>>>>
> >>>>
> >>>
> >>
> > 
> 
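
To make the ownership rule from the new comment concrete, here is a minimal
sketch of a lookup helper that hands back an elevated reference which the
caller must drop. It is a toy model only: struct toy_dev and the toy_*
functions are hypothetical stand-ins and are not the kernel's struct pci_dev,
pci_dev_get() or pci_dev_put().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a refcounted device; not the kernel's struct pci_dev. */
struct toy_dev {
	int refcount;
	int id;
};

/* Take a reference (loosely analogous to pci_dev_get()). */
static struct toy_dev *toy_dev_get(struct toy_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/* Drop a reference (loosely analogous to pci_dev_put()). NULL is a no-op. */
static void toy_dev_put(struct toy_dev *dev)
{
	if (!dev)
		return;
	assert(dev->refcount > 0);
	dev->refcount--;
}

/*
 * Lookup helper: returns the matching device with its refcount elevated,
 * or NULL. The caller owns the returned reference and must drop it.
 */
static struct toy_dev *toy_lookup(struct toy_dev *devs, size_t n, int id)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].id == id)
			return toy_dev_get(&devs[i]);
	return NULL;
}

/* A well-behaved caller: use the device, then drop the reference. */
static int toy_use_device(struct toy_dev *devs, size_t n, int id)
{
	struct toy_dev *dev = toy_lookup(devs, n, id);
	int found = dev != NULL;

	/* ... work with dev here ... */

	toy_dev_put(dev);
	return found;
}
```

In this model a caller that forgets toy_dev_put() leaves the count elevated
forever, which is the kind of leak discussed above for the nvlink paths.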



* Re: [Very RFC 43/46] powernv/pci: Do not set pdn->pe_number for NPU/CAPI devices
  2019-11-20  1:28 ` [Very RFC 43/46] powernv/pci: Do not set pdn->pe_number for NPU/CAPI devices Oliver O'Halloran
@ 2019-11-27 22:49   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27 22:49 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko, Greg Kurz

cc: Greg.


On 20/11/2019 12:28, Oliver O'Halloran wrote:
> The only thing we need the pdn for in this function is setting the pe_number
> field, which we don't use anymore. Fix the weird refcounting behaviour while
> we're here.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> Either Fred or Reza also fixed this in a recent patch, and that will probably
> get merged before this one does.
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 27 +++++++++--------------
>  1 file changed, 10 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 45d940730c30..2a9201306543 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1066,16 +1066,13 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>  static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  {
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
> -	struct pci_dn *pdn = pci_get_pdn(dev);
> -	struct pnv_ioda_pe *pe;
> +	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(dev);
>  
> -	if (!pdn) {
> -		pr_err("%s: Device tree node not associated properly\n",
> -			   pci_name(dev));
> +	/* Already has a PE assigned? huh? */
> +	if (pe) {
> +		WARN_ON(1);
>  		return NULL;
>  	}
> -	if (pdn->pe_number != IODA_INVALID_PE)
> -		return NULL;
>  
>  	pe = pnv_ioda_alloc_pe(phb);
>  	if (!pe) {
> @@ -1084,29 +1081,25 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  		return NULL;
>  	}
>  
> -	/* NOTE: We get only one ref to the pci_dev for the pdn, not for the
> -	 * pointer in the PE data structure, both should be destroyed at the
> -	 * same time. However, this needs to be looked at more closely again
> -	 * once we actually start removing things (Hotplug, SR-IOV, ...)
> +	/*
> +	 * NB: We **do not** hold a pci_dev ref for pe->pdev.
>  	 *
> -	 * At some point we want to remove the PDN completely anyways
> +	 * The pci_dev's release function cleans up the ioda_pe state, so:
> +	 *  a) We can't take a ref otherwise the release function is never called
> +	 *  b) The pe->pdev pointer will always point to valid pci_dev (or NULL)
>  	 */
> -	pci_dev_get(dev);
> -	pdn->pe_number = pe->pe_number;
>  	pe->flags = PNV_IODA_PE_DEV;
>  	pe->pdev = dev;
>  	pe->pbus = NULL;
>  	pe->mve_number = -1;
> -	pe->rid = dev->bus->number << 8 | pdn->devfn;
> +	pe->rid = dev->bus->number << 8 | dev->devfn;
>  
>  	pe_info(pe, "Associated device to PE\n");
>  
>  	if (pnv_ioda_configure_pe(phb, pe)) {
>  		/* XXX What do we do here ? */
>  		pnv_ioda_free_pe(pe);
> -		pdn->pe_number = IODA_INVALID_PE;
>  		pe->pdev = NULL;
> -		pci_dev_put(dev);
>  		return NULL;
>  	}
>  
> 

-- 
Alexey
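
The reasoning in the new comment (the PE keeps a borrowed pointer, and the
device's release function clears it) can be sketched with a toy model. All
names below are hypothetical and only assume that release runs when the last
reference is dropped; this is not the powernv code.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pe;

/* Toy device: refcount plus a link to the PE that tracks it. */
struct toy_dev {
	int refcount;
	struct toy_pe *pe;
};

/* Toy PE: pdev is a borrowed (non-owning) pointer, or NULL. */
struct toy_pe {
	struct toy_dev *pdev;
};

/* Attach without taking a reference, as the quoted comment describes. */
static void toy_pe_attach(struct toy_pe *pe, struct toy_dev *dev)
{
	pe->pdev = dev;
	dev->pe = pe;
}

/* Release hook: runs only when the last reference goes away. */
static void toy_dev_release(struct toy_dev *dev)
{
	if (dev->pe) {
		dev->pe->pdev = NULL;	/* PE never sees a dangling pointer */
		dev->pe = NULL;
	}
}

/* Drop one reference; call the release hook when the count hits zero. */
static void toy_dev_put(struct toy_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		toy_dev_release(dev);
}
```

If toy_pe_attach() incremented the refcount instead, the count could never
reach zero while the PE existed, so the release hook that tears the PE down
would never run. That is point a) of the comment in this toy form.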


* Re: [Very RFC 44/46] powerpc/pci: Don't set pdn->pe_number when applying the weird P8 NVLink PE hack
  2019-11-20  1:28 ` [Very RFC 44/46] powerpc/pci: Don't set pdn->pe_number when applying the weird P8 NVLink PE hack Oliver O'Halloran
@ 2019-11-27 22:54   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27 22:54 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> P8 needs to shove four GPUs into three PEs for $reasons. Remove the
> pdn->pe_number assignment done there since we just use the pe_rmap[] now.


Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>




> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 2a9201306543..eceff27357e5 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1183,7 +1183,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
>  	long rid;
>  	struct pnv_ioda_pe *pe;
>  	struct pci_dev *gpu_pdev;
> -	struct pci_dn *npu_pdn;
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(npu_pdev->bus);
>  
>  	/*
> @@ -1210,9 +1209,8 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
>  			dev_info(&npu_pdev->dev,
>  				"Associating to existing PE %x\n", pe_num);
>  			pci_dev_get(npu_pdev);
> -			npu_pdn = pci_get_pdn(npu_pdev);
> -			rid = npu_pdev->bus->number << 8 | npu_pdn->devfn;
> -			npu_pdn->pe_number = pe_num;
> +
> +			rid = npu_pdev->bus->number << 8 | npu_pdev->devfn;
>  			phb->ioda.pe_rmap[rid] = pe->pe_number;
>  
>  			/* Map the PE to this link */
> 

-- 
Alexey
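
As a rough illustration of why a reverse map keyed by RID can replace a
per-device pe_number field, the sketch below builds the RID as
(bus << 8 | devfn), matching the diff, and lets several devices share one PE.
The toy_* names and the TOY_INVALID_PE marker are assumptions made for this
example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INVALID_PE 0xFFFFu   /* hypothetical "no PE assigned" marker */
#define TOY_RID_COUNT  0x10000u  /* 8-bit bus number + 8-bit devfn */

/* Toy PHB holding only the RID -> PE number reverse map. */
struct toy_phb {
	uint16_t pe_rmap[TOY_RID_COUNT];
};

/* Build the RID the same way the diff does: bus << 8 | devfn. */
static uint16_t toy_rid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

/* Mark every RID as unassigned. */
static void toy_phb_init(struct toy_phb *phb)
{
	for (uint32_t i = 0; i < TOY_RID_COUNT; i++)
		phb->pe_rmap[i] = TOY_INVALID_PE;
}

/* Record which PE a device belongs to; several RIDs may share one PE. */
static void toy_assign_pe(struct toy_phb *phb, uint8_t bus, uint8_t devfn,
			  uint16_t pe)
{
	phb->pe_rmap[toy_rid(bus, devfn)] = pe;
}

/* Look the PE up from bus/devfn alone; no per-device field is needed. */
static uint16_t toy_lookup_pe(const struct toy_phb *phb, uint8_t bus,
			      uint8_t devfn)
{
	return phb->pe_rmap[toy_rid(bus, devfn)];
}
```

Calling toy_assign_pe() twice with the same PE number for two different devfn
values models the "associating to existing PE" branch in the quoted hunk.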


* Re: [Very RFC 45/46] powernv/pci: Remove requirement for a pdn in config accessors
  2019-11-20  1:28 ` [Very RFC 45/46] powernv/pci: Remove requirement for a pdn in config accessors Oliver O'Halloran
@ 2019-11-27 23:00   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 107+ messages in thread
From: Alexey Kardashevskiy @ 2019-11-27 23:00 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: alistair, s.miroshnichenko



On 20/11/2019 12:28, Oliver O'Halloran wrote:
> :toot:
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>


Squash it into 26/46 "powernv/pci: Remove pdn from
pnv_pci_cfg_{read|write}". Thanks,


> ---
>  arch/powerpc/platforms/powernv/pci.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 0eeea8652426..6383dcfec606 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -750,17 +750,12 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  			       unsigned int devfn,
>  			       int where, int size, u32 *val)
>  {
> -	struct pci_dn *pdn;
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	u16 bdfn = bus->number << 8 | devfn;
>  	struct eeh_dev *edev;
>  	int ret;
>  
>  	*val = 0xFFFFFFFF;
> -	pdn = pci_get_pdn_by_devfn(bus, devfn);
> -	if (!pdn)
> -		return PCIBIOS_DEVICE_NOT_FOUND;
> -
>  	edev = pnv_eeh_find_edev(phb, bdfn);
>  	if (!pnv_eeh_pre_cfg_check(edev))
>  		return PCIBIOS_DEVICE_NOT_FOUND;
> @@ -781,16 +776,11 @@ static int pnv_pci_write_config(struct pci_bus *bus,
>  				unsigned int devfn,
>  				int where, int size, u32 val)
>  {
> -	struct pci_dn *pdn;
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	u16 bdfn = bus->number << 8 | devfn;
>  	struct eeh_dev *edev;
>  	int ret;
>  
> -	pdn = pci_get_pdn_by_devfn(bus, devfn);
> -	if (!pdn)
> -		return PCIBIOS_DEVICE_NOT_FOUND;
> -
>  	edev = pnv_eeh_find_edev(phb, bdfn);
>  	if (!pnv_eeh_pre_cfg_check(edev))
>  		return PCIBIOS_DEVICE_NOT_FOUND;
> 

-- 
Alexey
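
The shape of the accessor after this change (derive the BDFN from the bus
number and devfn, default the result to all ones, then look up per-device
state by BDFN) can be sketched as below. This is a toy model under loud
assumptions: the status codes, struct toy_edev, its "blocked" flag and the
treatment of a missing entry as "not found" are all invented for illustration
and do not describe what pnv_eeh_pre_cfg_check() actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; the kernel's PCIBIOS_* values are not used. */
enum { TOY_OK = 0, TOY_DEVICE_NOT_FOUND = 1 };

/* Toy per-device state standing in for an eeh_dev. */
struct toy_edev {
	uint16_t bdfn;
	int blocked;      /* assumption: nonzero means config access is refused */
	uint32_t cfg_val; /* value a successful read returns in this model */
};

/* Find per-device state from the BDFN alone. */
static const struct toy_edev *toy_find_edev(const struct toy_edev *tbl,
					    size_t n, uint16_t bdfn)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].bdfn == bdfn)
			return &tbl[i];
	return NULL;
}

/* Read path: no device-tree-derived node is consulted at any point. */
static int toy_read_config(const struct toy_edev *tbl, size_t n,
			   uint8_t bus, uint8_t devfn, uint32_t *val)
{
	uint16_t bdfn = (uint16_t)((bus << 8) | devfn);
	const struct toy_edev *edev;

	*val = 0xFFFFFFFF;	/* default result: all ones */

	edev = toy_find_edev(tbl, n, bdfn);
	if (!edev || edev->blocked)
		return TOY_DEVICE_NOT_FOUND;

	*val = edev->cfg_val;
	return TOY_OK;
}
```

The point of the sketch is only that everything the accessor needs can be
keyed off (bus, devfn), which is why the pdn lookup could be dropped.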


end of thread, other threads:[~2019-11-27 23:02 UTC | newest]

Thread overview: 107+ messages
2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset Oliver O'Halloran
2019-11-21  3:38   ` Alexey Kardashevskiy
2019-11-21  4:34     ` Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 02/46] powernv/pci: Add helper to find ioda_pe from BDFN Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 03/46] powernv/pci: Remove dma_dev_setup() for NPU PHBs Oliver O'Halloran
2019-11-21  3:57   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c Oliver O'Halloran
2019-11-21  4:02   ` Alexey Kardashevskiy
2019-11-21  4:33     ` Oliver O'Halloran
2019-11-21  7:46   ` Christoph Hellwig
2019-11-20  1:28 ` [Very RFC 05/46] powernv/pci: Remove the pnv_phb dma_dev_setup callback Oliver O'Halloran
2019-11-21  4:03   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov() Oliver O'Halloran
2019-11-21  4:34   ` Alexey Kardashevskiy
2019-11-25  4:41     ` Oliver O'Halloran
2019-11-21  7:48   ` Christoph Hellwig
2019-11-25  4:39     ` Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 07/46] powernv/pci: Rework IODA PE device accounting Oliver O'Halloran
2019-11-21  5:48   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 08/46] powerpc/eeh: Calculate VF index rather than looking it up in pci_dn Oliver O'Halloran
2019-11-22  4:43   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 09/46] powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config() Oliver O'Halloran
2019-11-22  4:52   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 10/46] powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config() Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 11/46] powerpc/eeh: Convert various printfs to use edev, not pci_dn Oliver O'Halloran
2019-11-22  4:55   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 12/46] powerpc/eeh: Split eeh_probe into probe_pdn and probe_pdev Oliver O'Halloran
2019-11-22  5:45   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 13/46] powerpc/eeh: Rework how pdev_probe() is used Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 14/46] powernv/eeh: Remove un-necessary call to eeh_add_device_early() Oliver O'Halloran
2019-11-22  6:01   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 15/46] powernv/eeh: Use pnv_eeh_*_config() for internal config ops Oliver O'Halloran
2019-11-22  6:15   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 16/46] powernv/eeh: Use eeh_edev_warn() rather than open-coding a BDFN print Oliver O'Halloran
2019-11-22  6:17   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 17/46] powernv/eeh: add pnv_eeh_find_edev() Oliver O'Halloran
2019-11-25  0:30   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 18/46] powernv/pci: Add pci_bus_to_pnvhb() helper Oliver O'Halloran
2019-11-25  0:42   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 19/46] powernv/eeh: Use standard PCI capability lookup functions Oliver O'Halloran
2019-11-25  1:02   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 20/46] powernv/eeh: Look up device info from pci_dev Oliver O'Halloran
2019-11-25  1:26   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 21/46] powernv/eeh: Rework finding an existing edev in probe_pdev() Oliver O'Halloran
2019-11-25  3:20   ` Alexey Kardashevskiy
2019-11-25  4:17     ` Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed Oliver O'Halloran
2019-11-25  3:27   ` Alexey Kardashevskiy
2019-11-25  4:26     ` Oliver O'Halloran
2019-11-27  1:50       ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 23/46] powerpc/eeh: Moving finding the parent PE into the platform Oliver O'Halloran
2019-11-25  5:00   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 24/46] powernv/pci: Make the pre-cfg EEH freeze check use eeh_dev rather than pci_dn Oliver O'Halloran
2019-11-27  0:21   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 25/46] powernv/pci: Remove pdn from pnv_pci_config_check_eeh() Oliver O'Halloran
2019-11-27  1:05   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 26/46] powernv/pci: Remove pdn from pnv_pci_cfg_{read|write} Oliver O'Halloran
2019-11-27  2:16   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 27/46] powernv/pci: Clear reserved PE freezes Oliver O'Halloran
2019-11-27  3:00   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 28/46] powernv/iov: Move SR-IOV PF state out of pci_dn Oliver O'Halloran
2019-11-27  4:09   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 29/46] powernv/pci: Remove open-coded PE lookup in PELT-V setup Oliver O'Halloran
2019-11-27  4:26   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 30/46] powernv/pci: Remove open-coded PE lookup in PELT-V teardown Oliver O'Halloran
2019-11-27  4:50   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup() Oliver O'Halloran
2019-11-21  7:52   ` Christoph Hellwig
2019-11-27  4:53   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 32/46] powernv/pci: Remove open-coded PE lookup in iommu_bypass_supported() Oliver O'Halloran
2019-11-27  5:09   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 33/46] powernv/pci: Remove open-coded PE lookup in iommu notifier Oliver O'Halloran
2019-11-27  5:09   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 34/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_enable_device_hook() Oliver O'Halloran
2019-11-27  5:14   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device Oliver O'Halloran
2019-11-27  5:24   ` Alexey Kardashevskiy
2019-11-27  9:51     ` Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 36/46] powernv/npu: Remove open-coded PE lookup for GPU device Oliver O'Halloran
2019-11-27  5:45   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 37/46] powernv/pci: Use the PHB's rmap for pnv_ioda_to_pe() Oliver O'Halloran
2019-11-21  3:50   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 38/46] powerpc/pci-hotplug: Scan the whole bus when using PCI_PROBE_NORMAL Oliver O'Halloran
2019-11-27  6:27   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 39/46] powernv/npu: Avoid pci_dn when mapping device_node to a pci_dev Oliver O'Halloran
2019-11-27  6:58   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs Oliver O'Halloran
2019-11-27  7:09   ` Alexey Kardashevskiy
2019-11-27  8:24     ` Greg Kurz
2019-11-27  9:10       ` Frederic Barrat
2019-11-27  9:33         ` Greg Kurz
2019-11-27  9:40           ` Oliver O'Halloran
2019-11-27 12:00             ` Greg Kurz
2019-11-27  9:47           ` Frederic Barrat
2019-11-27 12:03             ` Greg Kurz
2019-11-20  1:28 ` [Very RFC 41/46] powernv/eeh: Remove pdn setup for SR-IOV VFs Oliver O'Halloran
2019-11-27  7:14   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 42/46] powernv/pci: Don't clear pdn->pe_number in pnv_pci_release_device Oliver O'Halloran
2019-11-27  7:30   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 43/46] powernv/pci: Do not set pdn->pe_number for NPU/CAPI devices Oliver O'Halloran
2019-11-27 22:49   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 44/46] powerpc/pci: Don't set pdn->pe_number when applying the weird P8 NVLink PE hack Oliver O'Halloran
2019-11-27 22:54   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 45/46] powernv/pci: Remove requirement for a pdn in config accessors Oliver O'Halloran
2019-11-27 23:00   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 46/46] HACK: prevent pdn's from being created Oliver O'Halloran
