* [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management
@ 2015-06-04  6:41 Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 01/42] PCI: Add pcibios_setup_bridge() Gavin Shan
                   ` (29 more replies)
  0 siblings, 30 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

This series of patches adds PCI slot support for the PowerPC PowerNV platform,
which runs on top of skiboot firmware. The patchset requires corresponding
changes to skiboot firmware, which have been sent to skiboot@lists.ozlabs.org
for review. The PCI slots are exposed by skiboot through device node properties,
and the kernel uses those properties to populate PCI slots accordingly.

The original PCI infrastructure on the PowerNV platform can't support hotplug
because the PE is assigned at PHB fixup time, which happens only once during
system boot. For this reason, the PCI infrastructure on the PowerNV platform
has been reworked substantially. After that, the PE and its corresponding
resources (IODT, M32DT, M64 segments, DMA32 and bypass window) are assigned upon
updating the PCI bridge's resources, which might decide the PE# assigned to the
PE (e.g. M64 resources, on P8 strictly speaking). Each PE maintains a reference
count, which is (number of child PCI devices + 1). That means when the last
child PCI device leaves the PE, the PE and its included resources will be
released and put back into the free pool again. With this design, the PE will be
released when the EEH PE is released. PATCH[1 - 24] are related to this part.

From the skiboot perspective, a PCI slot provides (hot/fundamental/complete)
resets to EEH. The kernel learns whether skiboot supports the various resets on
a particular PCI slot through the device-tree node. If it does, EEH will utilize
the functionality provided by skiboot. Besides, the device-tree nodes have to
change in order to support PCI hotplug. For example, when a PCI adapter is
inserted into a slot, its device-tree node should be added to the system
dynamically. Conversely, the device-tree node should be removed from the system
when the PCI adapter is going offline. Since pci_dn and eeh_dev have the same
life cycle as PCI device nodes, they should be added/removed accordingly during
PCI hotplug. PATCH[25 - 38] do the related work.
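
For reviewers who want the PE reference-count rule from the rework paragraph
above in concrete form, here is a rough user-space sketch. It is not code from
this series: struct demo_pe and the demo_pe_*() helpers are hypothetical names,
and releasing the PE as soon as the count falls back to 1 is only my reading of
"number of child PCI devices + 1".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified PE; not the kernel's struct pnv_ioda_pe. */
struct demo_pe {
	int  refcount;   /* number of child PCI devices + 1 */
	bool allocated;  /* true while the PE# and its resources are held */
};

static void demo_pe_init(struct demo_pe *pe)
{
	pe->refcount = 1;    /* the "+ 1" base reference */
	pe->allocated = true;
}

static void demo_pe_add_device(struct demo_pe *pe)
{
	assert(pe->allocated);
	pe->refcount++;
}

/* Returns true if removing this device released the PE. */
static bool demo_pe_remove_device(struct demo_pe *pe)
{
	assert(pe->allocated && pe->refcount > 1);
	if (--pe->refcount == 1) {
		/* Last child gone: drop the base reference and free the PE. */
		pe->refcount = 0;
		pe->allocated = false;
		return true;
	}
	return false;
}
```

With two devices added, the first removal leaves the PE in place and the second
one releases it.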

The OF driver is changed to support unflattening an FDT blob for a sub-tree,
which is covered by PATCH[39 - 41].

The last patch is the standalone PCI hotplug driver for the PowerNV platform.
When a PCI adapter is removed from a PCI slot, which is triggered by a command
in userland, skiboot will power off the slot to save power and remove all
device-tree nodes for all PCI devices behind the slot. Conversely, when a PCI
adapter is added, power to the slot is turned on, the PCI devices behind the
slot are rescanned, and the device-tree nodes for the newly detected PCI devices
are built in skiboot. In both cases, a message is sent to the kernel by skiboot
so that the kernel can adjust the device-tree accordingly. At the same time, the
kernel also has to deallocate or allocate the PE# and its related resources for
the removed/added PCI devices.

Changelog
=========
v5:
   * Rebased to 4.1.rc6 and some unmerged patches as below:
     Alexey's DDW patchset (v11);
     Gavin's EEH error injection support (in mpe's next branch);
     Richard's EEH cleanup patches (in mpe's next branch);
     Richard's EEH support for VF (v7);
     Gavin's misc EEH fixes for 4.2;
   * This revision is based on the corresponding skiboot patches (v7):
     https://patchwork.ozlabs.org/patch/480437/
   * Utilize OF overlay to update device-tree with help of newly introduced
     OPAL API opal_get_overlay_dt().
   * Split patches for easy review according to aik's comments.
   * Fix coding style issues from checkpatch.pl as pointed out by aik.
   * Code cleanup and misc fixup according to aik's input.
v4:
   * Rebased to 4.1.RC1
   * Added API to unflatten FDT blob to device node sub-tree, which is attached
     to the indicated parent device node. The original mechanism based on
     formatted string stream has been dropped.
   * The PATCH[v3 09/21] ("powerpc/eeh: Delay probing EEH device during hotplug")
     was picked up and sent to linux-ppc@ separately for review as Richard's
     "VF EEH Support" depends on that.
v3:
   * Rebased to 4.1.RC0
   * PowerNV PCI infrastructure is totally refactored in order to support PCI
     hotplug. The PowerNV hotplug driver is also reworked a lot because of
     the changes in skiboot in order to support PCI hotplug.


Gavin Shan (42):
  PCI: Add pcibios_setup_bridge()
  powerpc/powernv: Enable M64 on P7IOC
  powerpc/powernv: M64 support improvement
  powerpc/powernv: Trace consumed IO and M32 segments by PE
  powerpc/powernv: Simplify pnv_ioda_setup_pe_seg()
  powerpc/powernv: Improve IO and M32 mapping
  powerpc/powernv: Calculate PHB's DMA weight dynamically
  powerpc/powernv: DMA32 cleanup
  powerpc/powernv: pnv_ioda_setup_dma() configure one PE only
  powerpc/powernv: Trace DMA32 segments consumed by PE
  powerpc/powernv: Increase PE# capacity
  powerpc/pci: Cleanup on pci_controller_ops
  powerpc/pci: Override pcibios_setup_bridge()
  powerpc/powernv: Allocate PE# in deasending order
  powerpc/powernv: Reserve PE# for root bus
  powerpc/powernv: Create PEs dynamically
  powerpc/powernv: PE oriented during configuration
  powerpc/powernv: Helper function pnv_ioda_init_pe()
  powerpc/powernv: Remove DMA32 list of PEs
  powerpc/powernv: Rename pnv_ioda_get_pe() to pnv_ioda_dev_to_pe()
  powerpc/powernv: Drop pnv_ioda_setup_dev_PE()
  powerpc/powernv: Move functions around
  powerpc/powernv: Cleanup on pnv_pci_ioda2_release_dma_pe()
  powerpc/powernv: Release PEs dynamically
  powerpc/powernv: Supports slot ID
  powerpc/powernv: Use PCI slot reset infrastructure
  powerpc/powernv: Simplify pnv_eeh_reset()
  powerpc/powernv: Don't cover root bus in pnv_pci_reset_secondary_bus()
  powerpc/powernv: Issue fundamental reset in
    pnv_pci_reset_secondary_bus()
  powerpc/pci: Don't scan empty slot
  powerpc/pci: Move pcibios_find_pci_bus() around
  powerpc/powernv: Introduce pnv_pci_poll()
  powerpc/powernv: Functions to get/reset PCI slot status
  powerpc/pci: Delay creating pci_dn
  powerpc/pci: Create eeh_dev while creating pci_dn
  powerpc/pci: Export traverse_pci_device_nodes()
  powerpc/pci: Update bridge windows on PCI plugging
  powerpc/powernv: Select OF_OVERLAY
  drivers/of: Unflatten nodes equal or deeper than specified level
  drivers/of: Allow to specify root node in of_fdt_unflatten_tree()
  drivers/of: Return allocated memory chunk from of_fdt_unflatten_tree()
  pci/hotplug: PowerPC PowerNV PCI hotplug driver

 MAINTAINERS                                    |    6 +
 arch/powerpc/include/asm/eeh.h                 |    6 +-
 arch/powerpc/include/asm/opal-api.h            |    8 +-
 arch/powerpc/include/asm/opal.h                |    8 +-
 arch/powerpc/include/asm/pci-bridge.h          |   14 +-
 arch/powerpc/include/asm/pnv-pci.h             |    7 +
 arch/powerpc/include/asm/ppc-pci.h             |    8 +-
 arch/powerpc/kernel/eeh_dev.c                  |   20 +-
 arch/powerpc/kernel/pci-common.c               |   16 +-
 arch/powerpc/kernel/pci-hotplug.c              |   44 +-
 arch/powerpc/kernel/pci_dn.c                   |   91 +-
 arch/powerpc/platforms/maple/pci.c             |   35 +-
 arch/powerpc/platforms/pasemi/pci.c            |    3 -
 arch/powerpc/platforms/powermac/pci.c          |   39 +-
 arch/powerpc/platforms/powernv/Kconfig         |    1 +
 arch/powerpc/platforms/powernv/eeh-powernv.c   |  180 +--
 arch/powerpc/platforms/powernv/opal-wrappers.S |    4 +
 arch/powerpc/platforms/powernv/pci-ioda.c      | 1770 ++++++++++++++----------
 arch/powerpc/platforms/powernv/pci.c           |   90 +-
 arch/powerpc/platforms/powernv/pci.h           |   59 +-
 arch/powerpc/platforms/pseries/msi.c           |    4 +-
 arch/powerpc/platforms/pseries/pci_dlpar.c     |   32 -
 arch/powerpc/platforms/pseries/setup.c         |    9 +-
 drivers/of/fdt.c                               |   85 +-
 drivers/of/unittest.c                          |    2 +-
 drivers/pci/hotplug/Kconfig                    |   12 +
 drivers/pci/hotplug/Makefile                   |    4 +
 drivers/pci/hotplug/powernv_php.c              |  140 ++
 drivers/pci/hotplug/powernv_php.h              |   90 ++
 drivers/pci/hotplug/powernv_php_slot.c         |  732 ++++++++++
 drivers/pci/setup-bus.c                        |    5 +
 include/linux/of_fdt.h                         |    5 +-
 include/linux/pci.h                            |    1 +
 33 files changed, 2559 insertions(+), 971 deletions(-)
 create mode 100644 drivers/pci/hotplug/powernv_php.c
 create mode 100644 drivers/pci/hotplug/powernv_php.h
 create mode 100644 drivers/pci/hotplug/powernv_php_slot.c

-- 
2.1.0


* [PATCH v5 01/42] PCI: Add pcibios_setup_bridge()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-05 19:44   ` Bjorn Helgaas
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

Currently, the PowerPC PowerNV platform utilizes ppc_md.pcibios_fixup(),
which is called once after PCI probing and resource assignment
are completed, to allocate platform-required resources for PCI devices:
PE#, IO and MMIO mapping, DMA address translation (TCE) table etc.
Obviously, it's not hotplug friendly.

The patch adds the weak function pcibios_setup_bridge(), which is called
by pci_setup_bridge(). The PowerPC PowerNV platform will override the
function to assign the above platform-required resources to newly added
PCI devices, in order to support PCI hotplug in subsequent patches.
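
As an illustration of the call order described above, here is a small
user-space sketch. All demo_* names and the DEMO_RES_* flags are hypothetical
stand-ins, and the weak attribute uses GCC/Clang syntax rather than the
kernel's __weak macro. The default in this patch is an empty function; the
demo default records that it ran only so the ordering can be observed.

```c
#include <assert.h>

/* Hypothetical stand-ins for the IORESOURCE_* flags; values are arbitrary. */
#define DEMO_RES_IO   0x1UL
#define DEMO_RES_MEM  0x2UL

static int step;         /* incremented on every recorded call */
static int hook_step;    /* when the platform hook ran (0 = never) */
static int program_step; /* when the generic bridge programming ran */

/*
 * Weak default: the linker picks it only when no other object file
 * supplies a strong definition with the same name.
 */
__attribute__((weak)) void demo_pcibios_setup_bridge(int bus, unsigned long type)
{
	(void)bus;
	(void)type;
	hook_step = ++step;
}

/* Stand-in for the generic code that programs the bridge windows. */
static void demo_program_bridge(int bus, unsigned long type)
{
	(void)bus;
	(void)type;
	program_step = ++step;
}

void demo_pci_setup_bridge(int bus)
{
	unsigned long type = DEMO_RES_IO | DEMO_RES_MEM;

	demo_pcibios_setup_bridge(bus, type); /* platform hook runs first */
	demo_program_bridge(bus, type);       /* then the generic programming */
}
```

A platform that needs per-bridge setup would supply a non-weak definition of
the hook in its own object file, and the call site stays unchanged.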

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Corrected subject as Bjorn suggested
  * pci_setup_bridge() calls pcibios_setup_bridge() and __pci_setup_bridge()
---
 drivers/pci/setup-bus.c | 5 +++++
 include/linux/pci.h     | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 4fd0cac..623dee3 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -693,11 +693,16 @@ static void __pci_setup_bridge(struct pci_bus *bus, unsigned long type)
 	pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bus->bridge_ctl);
 }
 
+void __weak pcibios_setup_bridge(struct pci_bus *bus, unsigned long type)
+{
+}
+
 void pci_setup_bridge(struct pci_bus *bus)
 {
 	unsigned long type = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 
+	pcibios_setup_bridge(bus, type);
 	__pci_setup_bridge(bus, type);
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 94bacfa..5aacd0a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -811,6 +811,7 @@ void pci_stop_and_remove_bus_device_locked(struct pci_dev *dev);
 void pci_stop_root_bus(struct pci_bus *bus);
 void pci_remove_root_bus(struct pci_bus *bus);
 void pci_setup_cardbus(struct pci_bus *bus);
+void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
 void pci_sort_breadthfirst(void);
 #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
 #define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
-- 
2.1.0


* [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 01/42] PCI: Add pcibios_setup_bridge() Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 03/42] powerpc/powernv: M64 support improvement Gavin Shan
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch enables the M64 window on P7IOC, which has already been enabled
on PHB3. Unlike PHB3, where 16 M64 BARs are supported and each
of them can be owned by one particular PE# exclusively or divided
evenly into 256 segments, each P7IOC PHB has 16 M64 BARs and each
of them is divided into 8 segments. So each P7IOC PHB can support
only 128 M64 segments. Also, P7IOC has M64DT, which helps map
one particular M64 segment# to an arbitrary PE#. PHB3 doesn't have
M64DT, meaning that one M64 segment can only be pinned to a
fixed PE#. In order to have similar logic to support M64 for PHB3
and P7IOC, we just provide 128 M64 segments (16 BARs) and a fixed
mapping between PE# and M64 segment# on P7IOC. In turn, we just
need different phb->init_m64() hooks for P7IOC and PHB3 to support
M64.
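
The arithmetic behind the fixed mapping can be sketched in a few lines of
user-space C. The DEMO_* constants, struct demo_m64_slot and demo_pe_to_m64()
are hypothetical names; only the divide and modulo by 8 mirror what the diff
below passes to opal_pci_map_pe_mmio_window().

```c
#include <assert.h>

enum {
	DEMO_M64_BARS     = 16, /* M64 BARs per P7IOC PHB */
	DEMO_SEGS_PER_BAR = 8,  /* segments per M64 BAR on P7IOC */
	DEMO_TOTAL_SEGS   = DEMO_M64_BARS * DEMO_SEGS_PER_BAR /* 128 */
};

/* Which BAR, and which segment inside that BAR, a PE# is pinned to. */
struct demo_m64_slot {
	int bar;
	int seg;
};

static struct demo_m64_slot demo_pe_to_m64(int pe_number)
{
	struct demo_m64_slot slot;

	assert(pe_number >= 0 && pe_number < DEMO_TOTAL_SEGS);
	slot.bar = pe_number / DEMO_SEGS_PER_BAR;
	slot.seg = pe_number % DEMO_SEGS_PER_BAR;
	return slot;
}
```

For instance, PE#10 lands on BAR#1, segment 2, and PE#127 on BAR#15, segment 7.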

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Pinned OPAL API return value type to "int64_t"
  * Don't initialize M64 callbacks for unknown PHB type
  * Fixed comments as suggested by aik
  * Fixed coding style issues reported by checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 110 ++++++++++++++++++++++++++----
 1 file changed, 98 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 573b07a..245ef81 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -174,6 +174,69 @@ static void pnv_ioda_free_pe(struct pnv_phb *phb, int pe)
 	clear_bit(pe, phb->ioda.pe_alloc);
 }
 
+static int pnv_ioda1_init_m64(struct pnv_phb *phb)
+{
+	struct resource *r;
+	int seg;
+
+	/* There are as many M64 segments as the maximum number
+	 * of PEs, which is 128.
+	 */
+	for (seg = 0; seg < phb->ioda.total_pe; seg += 8) {
+		unsigned long base;
+		int64_t rc;
+
+		base = phb->ioda.m64_base + seg * phb->ioda.m64_segsize;
+		rc = opal_pci_set_phb_mem_window(phb->opal_id,
+						 OPAL_M64_WINDOW_TYPE,
+						 seg / 8,
+						 base,
+						 0, /* unused */
+						 8 * phb->ioda.m64_segsize);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("  Error %lld setting M64 PHB#%d-BAR#%d\n",
+				rc, phb->hose->global_number, seg / 8);
+			goto fail;
+		}
+
+		rc = opal_pci_phb_mmio_enable(phb->opal_id,
+					      OPAL_M64_WINDOW_TYPE,
+					      seg / 8,
+					      OPAL_ENABLE_M64_SPLIT);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("  Error %lld enabling M64 PHB#%d-BAR#%d\n",
+				rc, phb->hose->global_number, seg / 8);
+			goto fail;
+		}
+	}
+
+	/* Strip off the segment used by the reserved PE, which
+	 * is expected to be 0 or last supported PE#. The PHB's
+	 * first memory window traces the 32-bits MMIO range
+	 * while the second one traces the 64-bits prefetchable
+	 * MMIO range that the PHB supports.
+	 */
+	r = &phb->hose->mem_resources[1];
+	if (phb->ioda.reserved_pe == 0)
+		r->start += phb->ioda.m64_segsize;
+	else if (phb->ioda.reserved_pe == (phb->ioda.total_pe - 1))
+		r->end -= phb->ioda.m64_segsize;
+	else
+		pr_warn("  Cannot strip M64 segment for reserved PE#%d\n",
+			phb->ioda.reserved_pe);
+
+	return 0;
+
+fail:
+	for ( ; seg >= 0; seg -= 8)
+		opal_pci_phb_mmio_enable(phb->opal_id,
+					 OPAL_M64_WINDOW_TYPE,
+					 seg / 8,
+					 OPAL_DISABLE_M64);
+
+	return -EIO;
+}
+
 /* The default M64 BAR is shared by all PEs */
 static int pnv_ioda2_init_m64(struct pnv_phb *phb)
 {
@@ -231,7 +294,7 @@ fail:
 	return -EIO;
 }
 
-static void pnv_ioda2_reserve_m64_pe(struct pnv_phb *phb)
+static void pnv_ioda_reserve_m64_pe(struct pnv_phb *phb)
 {
 	resource_size_t sgsz = phb->ioda.m64_segsize;
 	struct pci_dev *pdev;
@@ -257,8 +320,8 @@ static void pnv_ioda2_reserve_m64_pe(struct pnv_phb *phb)
 	}
 }
 
-static int pnv_ioda2_pick_m64_pe(struct pnv_phb *phb,
-				 struct pci_bus *bus, int all)
+static int pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
+				struct pci_bus *bus, int all)
 {
 	resource_size_t segsz = phb->ioda.m64_segsize;
 	struct pci_dev *pdev;
@@ -355,6 +418,26 @@ done:
 			pe->master = master_pe;
 			list_add_tail(&pe->list, &master_pe->slaves);
 		}
+
+		/* P7IOC supports M64DT, which helps mapping M64 segment
+		 * to one particular PE#. However, PHB3 has fixed mapping
+		 * between M64 segment and PE#. In order to have same logic
+		 * for P7IOC and PHB3, we enforce fixed mapping between M64
+		 * segment and PE# on P7IOC.
+		 */
+		if (phb->type == PNV_PHB_IODA1) {
+			int64_t rc;
+
+			rc = opal_pci_map_pe_mmio_window(phb->opal_id,
+							 pe->pe_number,
+							 OPAL_M64_WINDOW_TYPE,
+							 pe->pe_number / 8,
+							 pe->pe_number % 8);
+			if (rc != OPAL_SUCCESS)
+				pr_warn("%s: Error %lld mapping M64 for PHB#%d-PE#%d\n",
+					__func__, rc, phb->hose->global_number,
+					pe->pe_number);
+		}
 	}
 
 	kfree(pe_alloc);
@@ -369,12 +452,6 @@ static void __init pnv_ioda_parse_m64_window(struct pnv_phb *phb)
 	const u32 *r;
 	u64 pci_addr;
 
-	/* FIXME: Support M64 for P7IOC */
-	if (phb->type != PNV_PHB_IODA2) {
-		pr_info("  Not support M64 window\n");
-		return;
-	}
-
 	if (!firmware_has_feature(FW_FEATURE_OPALv3)) {
 		pr_info("  Firmware too old to support M64 window\n");
 		return;
@@ -403,9 +480,18 @@ static void __init pnv_ioda_parse_m64_window(struct pnv_phb *phb)
 
 	/* Use last M64 BAR to cover M64 window */
 	phb->ioda.m64_bar_idx = 15;
-	phb->init_m64 = pnv_ioda2_init_m64;
-	phb->reserve_m64_pe = pnv_ioda2_reserve_m64_pe;
-	phb->pick_m64_pe = pnv_ioda2_pick_m64_pe;
+	phb->reserve_m64_pe = pnv_ioda_reserve_m64_pe;
+	phb->pick_m64_pe = pnv_ioda_pick_m64_pe;
+	switch (phb->type) {
+	case PNV_PHB_IODA1:
+		phb->init_m64 = pnv_ioda1_init_m64;
+		break;
+	case PNV_PHB_IODA2:
+		phb->init_m64 = pnv_ioda2_init_m64;
+		break;
+	default:
+		pr_debug(" Cannot support M64 for unknown type of PHB\n");
+	}
 }
 
 static void pnv_ioda_freeze_pe(struct pnv_phb *phb, int pe_no)
-- 
2.1.0


* [PATCH v5 03/42] powerpc/powernv: M64 support improvement
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 01/42] PCI: Add pcibios_setup_bridge() Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 07/42] powerpc/powernv: Calculate PHB's DMA weight dynamically Gavin Shan
                   ` (26 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

We have a hardware-enforced (on PHB3) or software-enforced (on P7IOC)
limitation: M64 segment#x can only be assigned to PE#x. IO and M32
segments can be mapped to an arbitrary PE# via IODT and M32DT. It means
the PE number should be x if M64 segment#x has been assigned to the
PE. Also, each PE owns one M64 segment at most. Currently, we
reserve PE# according to the root port's M64 window. That won't be
reliable once we extend the M64 window of the root port, or of the
upstream port of the PCIE switch behind the root port, to the PHB's M64
window in order to support PCI hotplug in future.

The patch reserves PE# for M64 segments according to the M64 resources
of the PCI devices (not bridges) contained in the PE. Besides, it's
always worthwhile to trace the M64 segments consumed by the PE, so that
they can be released at PCI unplugging time.
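
To make the resource-to-PE# calculation easier to check, here is a user-space
sketch of the same arithmetic the diff below uses for pe_no and limit. The
demo_* names and DEMO_ALIGN_UP are hypothetical; the macro assumes a
power-of-two segment size, and "end" is the inclusive last address, as with
struct resource.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of a; a must be a power of two. */
#define DEMO_ALIGN_UP(x, a) \
	(((uint64_t)(x) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))

/* Half-open range [first, limit) of M64 segments, i.e. of PE numbers. */
struct demo_seg_range {
	uint64_t first;
	uint64_t limit;
};

static struct demo_seg_range demo_m64_segments(uint64_t m64_base,
					       uint64_t segsz,
					       uint64_t start,
					       uint64_t end)
{
	struct demo_seg_range r;

	assert(segsz != 0 && (segsz & (segsz - 1)) == 0);
	assert(start >= m64_base && end >= start);

	r.first = (start - m64_base) / segsz;
	r.limit = DEMO_ALIGN_UP(end - m64_base, segsz) / segsz;
	return r;
}
```

With a 256MB segment size, a 384MB resource that starts two segments into the
M64 window yields first = 2 and limit = 4, so PE#2 and PE#3 are reserved.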

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Made the changelog more descriptive on the fixed M64 seg# mapping
  * Dropped unnecessary comments and corrected others as pointed out by aik
  * Replace "pe_bitsmap" with "pe_bitmap"
  * Fixed coding style issues reported by checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 189 ++++++++++++++++++------------
 arch/powerpc/platforms/powernv/pci.h      |  10 +-
 2 files changed, 121 insertions(+), 78 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 245ef81..71afb38 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -294,28 +294,78 @@ fail:
 	return -EIO;
 }
 
-static void pnv_ioda_reserve_m64_pe(struct pnv_phb *phb)
+/* We extend the M64 window of root port, or the upstream bridge port
+ * of the PCIE switch behind root port. So we shouldn't reserve PEs
+ * for M64 resources because there are no (normal) PCI devices consuming
+ * M64 resources on the PCI buses leading from root port, or the upstream
+ * bridge port. The function returns true if the indicated PCI bus needs
+ * reserved PEs because of M64 resources in advance. Otherwise, the
+ * function returns false.
+ */
+static bool pnv_ioda_need_m64_pe(struct pnv_phb *phb,
+				 struct pci_bus *bus)
 {
-	resource_size_t sgsz = phb->ioda.m64_segsize;
+	if (!bus || pci_is_root_bus(bus))
+		return false;
+
+	/* Bus leading from root port. We need check what types of PCI
+	 * devices on the bus. If it's connecting PCI bridge, we don't
+	 * need reserve M64 PEs for it. Otherwise, we still need to do
+	 * that.
+	 */
+	if (pci_is_root_bus(bus->self->bus)) {
+		struct pci_dev *pdev;
+
+		list_for_each_entry(pdev, &bus->devices, bus_list) {
+			if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL)
+				return true;
+		}
+
+		return false;
+	}
+
+	/* Bus leading from the upstream bridge port on top level */
+	if (pci_is_root_bus(bus->self->bus->self->bus))
+		return false;
+
+	return true;
+}
+
+static void pnv_ioda_reserve_m64_pe(struct pnv_phb *phb,
+				    struct pci_bus *bus)
+{
+	resource_size_t segsz = phb->ioda.m64_segsize;
 	struct pci_dev *pdev;
 	struct resource *r;
-	int base, step, i;
+	unsigned long pe_no, limit;
+	int i;
 
-	/*
-	 * Root bus always has full M64 range and root port has
-	 * M64 range used in reality. So we're checking root port
-	 * instead of root bus.
+	if (!pnv_ioda_need_m64_pe(phb, bus))
+		return;
+
+	/* The bridge's M64 window might have been extended to the
+	 * PHB's M64 window in order to support PCI hotplug. So the
+	 * bridge's M64 window isn't reliable to be used for picking
+	 * PE# for its leading PCI bus. We have to check the M64
+	 * resources consumed by the PCI devices, which seat on the
+	 * PCI bus.
 	 */
-	list_for_each_entry(pdev, &phb->hose->bus->devices, bus_list) {
-		for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
-			r = &pdev->resource[PCI_BRIDGE_RESOURCES + i];
-			if (!r->parent ||
-			    !pnv_pci_is_mem_pref_64(r->flags))
+	list_for_each_entry(pdev, &bus->devices, bus_list) {
+		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+#ifdef CONFIG_PCI_IOV
+			if (i >= PCI_IOV_RESOURCES && i <= PCI_IOV_RESOURCE_END)
+				continue;
+#endif
+			r = &pdev->resource[i];
+			if (!r->flags || r->start >= r->end ||
+			    !r->parent || !pnv_pci_is_mem_pref_64(r->flags))
 				continue;
 
-			base = (r->start - phb->ioda.m64_base) / sgsz;
-			for (step = 0; step < resource_size(r) / sgsz; step++)
-				pnv_ioda_reserve_pe(phb, base + step);
+			pe_no = (r->start - phb->ioda.m64_base) / segsz;
+			limit = ALIGN(r->end - phb->ioda.m64_base, segsz) /
+				segsz;
+			for (; pe_no < limit; pe_no++)
+				pnv_ioda_reserve_pe(phb, pe_no);
 		}
 	}
 }
@@ -327,85 +377,63 @@ static int pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
 	struct pci_dev *pdev;
 	struct resource *r;
 	struct pnv_ioda_pe *master_pe, *pe;
-	unsigned long size, *pe_alloc;
-	bool found;
-	int start, i, j;
-
-	/* Root bus shouldn't use M64 */
-	if (pci_is_root_bus(bus))
-		return IODA_INVALID_PE;
-
-	/* We support only one M64 window on each bus */
-	found = false;
-	pci_bus_for_each_resource(bus, r, i) {
-		if (r && r->parent &&
-		    pnv_pci_is_mem_pref_64(r->flags)) {
-			found = true;
-			break;
-		}
-	}
+	unsigned long size, *pe_bitmap;
+	unsigned long pe_no, limit;
+	int i;
 
-	/* No M64 window found ? */
-	if (!found)
+	if (!pnv_ioda_need_m64_pe(phb, bus))
 		return IODA_INVALID_PE;
 
 	/* Allocate bitmap */
 	size = _ALIGN_UP(phb->ioda.total_pe / 8, sizeof(unsigned long));
-	pe_alloc = kzalloc(size, GFP_KERNEL);
-	if (!pe_alloc) {
-		pr_warn("%s: Out of memory !\n",
-			__func__);
+	pe_bitmap = kzalloc(size, GFP_KERNEL);
+	if (!pe_bitmap)
 		return IODA_INVALID_PE;
-	}
 
-	/*
-	 * Figure out reserved PE numbers by the PE
-	 * the its child PEs.
-	 */
-	start = (r->start - phb->ioda.m64_base) / segsz;
-	for (i = 0; i < resource_size(r) / segsz; i++)
-		set_bit(start + i, pe_alloc);
-
-	if (all)
-		goto done;
-
-	/*
-	 * If the PE doesn't cover all subordinate buses,
-	 * we need subtract from reserved PEs for children.
+	/* The bridge's M64 window might be extended to PHB's M64
+	 * window by intention to support PCI hotplug. So we have
+	 * to check the M64 resources consumed by the PCI devices
+	 * on the PCI bus.
 	 */
 	list_for_each_entry(pdev, &bus->devices, bus_list) {
-		if (!pdev->subordinate)
-			continue;
+		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+#ifdef CONFIG_PCI_IOV
+			if (i >= PCI_IOV_RESOURCES &&
+			    i <= PCI_IOV_RESOURCE_END)
+				continue;
+#endif
+			/* Don't scan bridge's window if the PE
+			 * doesn't contain its subordinate bus.
+			 */
+			if (!all && i >= PCI_BRIDGE_RESOURCES &&
+			    i <= PCI_BRIDGE_RESOURCE_END)
+				continue;
 
-		pci_bus_for_each_resource(pdev->subordinate, r, i) {
-			if (!r || !r->parent ||
-			    !pnv_pci_is_mem_pref_64(r->flags))
+			r = &pdev->resource[i];
+			if (!r->flags || r->start >= r->end ||
+			    !r->parent || !pnv_pci_is_mem_pref_64(r->flags))
 				continue;
 
-			start = (r->start - phb->ioda.m64_base) / segsz;
-			for (j = 0; j < resource_size(r) / segsz ; j++)
-				clear_bit(start + j, pe_alloc);
-                }
-        }
+			pe_no = (r->start - phb->ioda.m64_base) / segsz;
+			limit = ALIGN(r->end - phb->ioda.m64_base, segsz) /
+				segsz;
+			for (; pe_no < limit; pe_no++)
+				set_bit(pe_no, pe_bitmap);
+		}
+	}
 
-	/*
-	 * the current bus might not own M64 window and that's all
-	 * contributed by its child buses. For the case, we needn't
-	 * pick M64 dependent PE#.
-	 */
-	if (bitmap_empty(pe_alloc, phb->ioda.total_pe)) {
-		kfree(pe_alloc);
+	/* No M64 window found ? */
+	if (bitmap_empty(pe_bitmap, phb->ioda.total_pe)) {
+		kfree(pe_bitmap);
 		return IODA_INVALID_PE;
 	}
 
-	/*
-	 * Figure out the master PE and put all slave PEs to master
-	 * PE's list to form compound PE.
+	/* Figure out the master PE and put all slave PEs
+	 * to master PE's list to form compound PE.
 	 */
-done:
 	master_pe = NULL;
 	i = -1;
-	while ((i = find_next_bit(pe_alloc, phb->ioda.total_pe, i + 1)) <
+	while ((i = find_next_bit(pe_bitmap, phb->ioda.total_pe, i + 1)) <
 		phb->ioda.total_pe) {
 		pe = &phb->ioda.pe_array[i];
 
@@ -419,6 +447,13 @@ done:
 			list_add_tail(&pe->list, &master_pe->slaves);
 		}
 
+		/* Reserve the M64 segment, which should be available. Also,
+		 * those M64 segments consumed by slave PEs are contributed
+		 * to the master PE.
+		 */
+		BUG_ON(test_and_set_bit(pe->pe_number, phb->ioda.m64_segmap));
+		BUG_ON(test_and_set_bit(pe->pe_number, master_pe->m64_segmap));
+
 		/* P7IOC supports M64DT, which helps mapping M64 segment
 		 * to one particular PE#. However, PHB3 has fixed mapping
 		 * between M64 segment and PE#. In order to have same logic
@@ -440,7 +475,7 @@ done:
 		}
 	}
 
-	kfree(pe_alloc);
+	kfree(pe_bitmap);
 	return master_pe->pe_number;
 }
 
@@ -1233,7 +1268,7 @@ static void pnv_pci_ioda_setup_PEs(void)
 
 		/* M64 layout might affect PE allocation */
 		if (phb->reserve_m64_pe)
-			phb->reserve_m64_pe(phb);
+			phb->reserve_m64_pe(phb, phb->hose->bus);
 
 		pnv_ioda_setup_PEs(hose->bus);
 	}
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index fc6be02..54657f4 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -49,6 +49,13 @@ struct pnv_ioda_pe {
 	/* PE number */
 	unsigned int		pe_number;
 
+	/* IO/M32/M64 segments consumed by the PE. Each PE can
+	 * have one M64 segment at most, but M64 segments consumed
+	 * by slave PEs will be contributed to the master PE. One
+	 * PE can own multiple IO and M32 segments.
+	 */
+	unsigned long		m64_segmap[8];
+
 	/* "Weight" assigned to the PE for the sake of DMA resource
 	 * allocations
 	 */
@@ -113,7 +120,7 @@ struct pnv_phb {
 	u32 (*bdfn_to_pe)(struct pnv_phb *phb, struct pci_bus *bus, u32 devfn);
 	void (*shutdown)(struct pnv_phb *phb);
 	int (*init_m64)(struct pnv_phb *phb);
-	void (*reserve_m64_pe)(struct pnv_phb *phb);
+	void (*reserve_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus);
 	int (*pick_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus, int all);
 	int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
 	void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
@@ -153,6 +160,7 @@ struct pnv_phb {
 			struct mutex		pe_alloc_mutex;
 
 			/* M32 & IO segment maps */
+			unsigned long		m64_segmap[8];
 			unsigned int		*m32_segmap;
 			unsigned int		*io_segmap;
 			struct pnv_ioda_pe	*pe_array;
-- 
2.1.0


* [PATCH v5 04/42] powerpc/powernv: Trace consumed IO and M32 segments by PE
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch introduces two bitmaps to trace the IO and M32 segments
consumed by one particular PE, which can be released once the PE
is destroyed during PCI unplugging time. Also, we use a fixed
number of bits to trace the IO and M32 segments used by PEs in
one particular PHB. Besides, @pe_array is moved next to @pe_alloc
because the two are closely related.
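
The bookkeeping can be pictured with the user-space sketch below. It is not
part of the patch: struct demo_segmap and the demo_* helpers are hypothetical,
and the release step is only my guess at how the per-PE bitmap would be
consumed at unplug time, since the release path comes in a later patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MAP_LONGS     8
#define DEMO_MAP_BITS      (DEMO_MAP_LONGS * DEMO_BITS_PER_LONG)

/* Fixed-size bitmap, one bit per segment (mirrors "unsigned long [8]"). */
struct demo_segmap {
	unsigned long bits[DEMO_MAP_LONGS];
};

static void demo_set_bit(struct demo_segmap *m, unsigned int n)
{
	assert(n < DEMO_MAP_BITS);
	m->bits[n / DEMO_BITS_PER_LONG] |= 1UL << (n % DEMO_BITS_PER_LONG);
}

static bool demo_test_bit(const struct demo_segmap *m, unsigned int n)
{
	assert(n < DEMO_MAP_BITS);
	return (m->bits[n / DEMO_BITS_PER_LONG] >> (n % DEMO_BITS_PER_LONG)) & 1UL;
}

/* Record a segment as used in the PHB-wide map and in the owning PE's map. */
static void demo_claim_segment(struct demo_segmap *phb,
			       struct demo_segmap *pe, unsigned int idx)
{
	demo_set_bit(phb, idx);
	demo_set_bit(pe, idx);
}

/* Give back every segment the PE owns, then empty the PE's own map. */
static void demo_release_pe_segments(struct demo_segmap *phb,
				     struct demo_segmap *pe)
{
	unsigned int i;

	for (i = 0; i < DEMO_MAP_LONGS; i++) {
		phb->bits[i] &= ~pe->bits[i];
		pe->bits[i] = 0;
	}
}
```

Because each PE remembers its own segments, releasing one PE leaves the
segments claimed by other PEs on the same PHB untouched.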

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 04/21]
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 17 +++++------------
 arch/powerpc/platforms/powernv/pci.h      | 11 ++++++-----
 2 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 71afb38..53d0efd 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2992,7 +2992,8 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
 
 			while (index < phb->ioda.total_pe &&
 			       region.start <= region.end) {
-				phb->ioda.io_segmap[index] = pe->pe_number;
+				set_bit(index, phb->ioda.io_segmap);
+				set_bit(index, pe->io_segmap);
 				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
 					pe->pe_number, OPAL_IO_WINDOW_TYPE, 0, index);
 				if (rc != OPAL_SUCCESS) {
@@ -3017,7 +3018,8 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
 
 			while (index < phb->ioda.total_pe &&
 			       region.start <= region.end) {
-				phb->ioda.m32_segmap[index] = pe->pe_number;
+				set_bit(index, phb->ioda.m32_segmap);
+				set_bit(index, pe->m32_segmap);
 				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
 					pe->pe_number, OPAL_M32_WINDOW_TYPE, 0, index);
 				if (rc != OPAL_SUCCESS) {
@@ -3196,7 +3198,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 {
 	struct pci_controller *hose;
 	struct pnv_phb *phb;
-	unsigned long size, m32map_off, pemap_off, iomap_off = 0;
+	unsigned long size, pemap_off;
 	const __be64 *prop64;
 	const __be32 *prop32;
 	int len;
@@ -3281,19 +3283,10 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 
 	/* Allocate aux data & arrays. We don't have IO ports on PHB3 */
 	size = _ALIGN_UP(phb->ioda.total_pe / 8, sizeof(unsigned long));
-	m32map_off = size;
-	size += phb->ioda.total_pe * sizeof(phb->ioda.m32_segmap[0]);
-	if (phb->type == PNV_PHB_IODA1) {
-		iomap_off = size;
-		size += phb->ioda.total_pe * sizeof(phb->ioda.io_segmap[0]);
-	}
 	pemap_off = size;
 	size += phb->ioda.total_pe * sizeof(struct pnv_ioda_pe);
 	aux = memblock_virt_alloc(size, 0);
 	phb->ioda.pe_alloc = aux;
-	phb->ioda.m32_segmap = aux + m32map_off;
-	if (phb->type == PNV_PHB_IODA1)
-		phb->ioda.io_segmap = aux + iomap_off;
 	phb->ioda.pe_array = aux + pemap_off;
 	set_bit(phb->ioda.reserved_pe, phb->ioda.pe_alloc);
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 54657f4..0a8cecb 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -54,6 +54,8 @@ struct pnv_ioda_pe {
 	 * by slave PEs will be contributed to the master PE. One
 	 * PE can own multiple IO and M32 segments.
 	 */
+	unsigned long		io_segmap[8];
+	unsigned long		m32_segmap[8];
 	unsigned long		m64_segmap[8];
 
 	/* "Weight" assigned to the PE for the sake of DMA resource
@@ -154,16 +156,15 @@ struct pnv_phb {
 			unsigned int		io_segsize;
 			unsigned int		io_pci_base;
 
-			/* PE allocation bitmap */
+			/* PE allocation */
 			unsigned long		*pe_alloc;
-			/* PE allocation mutex */
+			struct pnv_ioda_pe	*pe_array;
 			struct mutex		pe_alloc_mutex;
 
 			/* M32 & IO segment maps */
+			unsigned long		io_segmap[8];
+			unsigned long		m32_segmap[8];
 			unsigned long		m64_segmap[8];
-			unsigned int		*m32_segmap;
-			unsigned int		*io_segmap;
-			struct pnv_ioda_pe	*pe_array;
 
 			/* IRQ chip */
 			int			irq_chip_init;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 05/42] powerpc/powernv: Simplify pnv_ioda_setup_pe_seg()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The original implementation of pnv_ioda_setup_pe_seg() configures
IO and M32 segments with separate but nearly identical code paths.
They can be merged by caching @segmap, @pe_segmap, @segsize and @win
in advance. The patch shouldn't cause any behavioural changes.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 04/21]
  * Fixed coding style complained by checkpatch.pl
---
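The merge described above boils down to choosing the per-window
parameters first and then running a single loop. A minimal userspace
sketch follows, assuming segment-aligned ranges; struct window,
struct phb_model, enum win_type and map_range() are hypothetical
stand-ins, not kernel or OPAL names. Note that the starting index has to
be computed from the segment size of the selected window.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the IO and M32 window types. */
enum win_type { WIN_IO, WIN_M32 };

struct window {
	uint64_t pci_base;	/* PCI address where the window starts */
	uint64_t segsize;	/* size of one segment in bytes */
};

struct phb_model {
	struct window io;
	struct window m32;
	unsigned int total_pe;	/* number of segments in each window */
};

/*
 * Mark every segment covered by [start, end] in owned[] and return how
 * many segments were marked. Both window types share the same loop;
 * only the cached parameters differ.
 */
static unsigned int map_range(const struct phb_model *phb, enum win_type type,
			      uint64_t start, uint64_t end,
			      unsigned char *owned)
{
	const struct window *win = (type == WIN_IO) ? &phb->io : &phb->m32;
	uint64_t pos, last;
	unsigned int index, count = 0;

	assert(win->segsize != 0);
	if (start < win->pci_base || end < start)
		return 0;

	pos = start - win->pci_base;
	last = end - win->pci_base;
	/* Use the selected window's segment size, not a fixed one. */
	index = (unsigned int)(pos / win->segsize);

	while (index < phb->total_pe && pos <= last) {
		owned[index] = 1;
		count++;
		pos += win->segsize;
		index++;
	}

	return count;
}
```

With a 4KB IO segment and a 1MB M32 segment, the same call maps
0x2000-0x3fff to IO segments 2 and 3, and 0x80200000-0x804fffff to M32
segments 2 through 4 when the M32 window starts at 0x80000000.
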
 arch/powerpc/platforms/powernv/pci-ioda.c | 67 +++++++++++++++----------------
 1 file changed, 32 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 53d0efd..3bb4ce8 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2971,7 +2971,10 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
 	struct pci_bus_region region;
 	struct resource *res;
 	int i, index;
-	int rc;
+	unsigned int segsize;
+	unsigned long *segmap, *pe_segmap;
+	uint16_t win;
+	int64_t rc;
 
 	/*
 	 * NOTE: We only care PCI bus based PE for now. For PCI
@@ -2988,50 +2991,44 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
 		if (res->flags & IORESOURCE_IO) {
 			region.start = res->start - phb->ioda.io_pci_base;
 			region.end   = res->end - phb->ioda.io_pci_base;
-			index = region.start / phb->ioda.io_segsize;
-
-			while (index < phb->ioda.total_pe &&
-			       region.start <= region.end) {
-				set_bit(index, phb->ioda.io_segmap);
-				set_bit(index, pe->io_segmap);
-				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
-					pe->pe_number, OPAL_IO_WINDOW_TYPE, 0, index);
-				if (rc != OPAL_SUCCESS) {
-					pr_err("%s: OPAL error %d when mapping IO "
-					       "segment #%d to PE#%d\n",
-					       __func__, rc, index, pe->pe_number);
-					break;
-				}
-
-				region.start += phb->ioda.io_segsize;
-				index++;
-			}
+			segsize      = phb->ioda.io_segsize;
+			segmap       = phb->ioda.io_segmap;
+			pe_segmap    = pe->io_segmap;
+			win          = OPAL_IO_WINDOW_TYPE;
 		} else if ((res->flags & IORESOURCE_MEM) &&
-			   !pnv_pci_is_mem_pref_64(res->flags)) {
+			    !pnv_pci_is_mem_pref_64(res->flags)) {
 			region.start = res->start -
 				       hose->mem_offset[0] -
 				       phb->ioda.m32_pci_base;
 			region.end   = res->end -
 				       hose->mem_offset[0] -
 				       phb->ioda.m32_pci_base;
-			index = region.start / phb->ioda.m32_segsize;
+			segsize      = phb->ioda.m32_segsize;
+			segmap       = phb->ioda.m32_segmap;
+			pe_segmap    = pe->m32_segmap;
+			win          = OPAL_M32_WINDOW_TYPE;
+		} else {
+			continue;
+		}
 
-			while (index < phb->ioda.total_pe &&
-			       region.start <= region.end) {
-				set_bit(index, phb->ioda.m32_segmap);
-				set_bit(index, pe->m32_segmap);
-				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
-					pe->pe_number, OPAL_M32_WINDOW_TYPE, 0, index);
-				if (rc != OPAL_SUCCESS) {
-					pr_err("%s: OPAL error %d when mapping M32 "
-					       "segment#%d to PE#%d",
-					       __func__, rc, index, pe->pe_number);
-					break;
-				}
+		index = region.start / segsize;
+		while (index < phb->ioda.total_pe &&
+		       region.start <= region.end) {
+			set_bit(index, segmap);
+			set_bit(index, pe_segmap);
 
-				region.start += phb->ioda.m32_segsize;
-				index++;
+			rc = opal_pci_map_pe_mmio_window(phb->opal_id,
+					pe->pe_number, win, 0, index);
+			if (rc != OPAL_SUCCESS) {
+				pr_warn("%s: Error %lld mapping (%d) seg#%d to PHB#%d-PE#%d\n",
+					__func__, rc, win, index,
+					pe->phb->hose->global_number,
+					pe->pe_number);
+				break;
 			}
+
+			region.start += segsize;
+			index++;
 		}
 	}
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 06/42] powerpc/powernv: Improve IO and M32 mapping
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The PHB's IO or M32 window is divided evenly into segments, each of
which can be mapped to an arbitrary PE# through the IODT or M32DT.
The current code derives the IO and M32 segments consumed by a
particular PE from the windows of the PE's upstream bridge. That
won't be reliable once we extend the M64 windows of the root port, or
of the upstream port of the PCIe switch behind the root port, to the
PHB's IO or M32 window in order to support PCI hotplug in the future.

The patch improves the situation by calculating the IO or M32
segments consumed by a PE from the devices it contains. PCI bridge
windows are not involved if the PE doesn't contain all the
subordinate PCI buses. Otherwise, the PCI bridge windows still
contribute to the PE's consumed IO or M32 segments.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
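The rule in the commit message (device resources always count, bridge
windows count only when the PE covers all subordinate buses) can be
sketched in userspace C as follows. The types and helpers here
(struct res, struct dev, mark_res(), pe_collect_segs()) are hypothetical
simplifications: addresses are taken as offsets from the window base,
each device has only two BARs and one bridge window, and a plain byte
array replaces the bitmaps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SEGS		16
#define NR_BARS		2	/* simplification; real devices have more */

struct res {
	uint64_t start, end;	/* offsets from the window base */
	int valid;		/* 0 if the resource is unassigned */
};

struct dev {
	struct res bar[NR_BARS];	/* resources of the device itself */
	struct res bridge_win;		/* only meaningful for bridges */
	int is_bridge;
};

/* Mark every segment that the resource touches. */
static void mark_res(const struct res *r, uint64_t segsize,
		     unsigned char *segs)
{
	assert(segsize != 0);
	if (!r->valid || r->start > r->end)
		return;

	for (uint64_t i = r->start / segsize;
	     i <= r->end / segsize && i < NR_SEGS; i++)
		segs[i] = 1;
}

/*
 * Collect the segments consumed by a PE from the devices on its bus.
 * Bridge windows contribute only if the PE includes all subordinate
 * buses (pe_bus_all != 0).
 */
static void pe_collect_segs(const struct dev *devs, size_t ndevs,
			    int pe_bus_all, uint64_t segsize,
			    unsigned char *segs)
{
	for (size_t d = 0; d < ndevs; d++) {
		for (size_t b = 0; b < NR_BARS; b++)
			mark_res(&devs[d].bar[b], segsize, segs);

		if (pe_bus_all && devs[d].is_bridge)
			mark_res(&devs[d].bridge_win, segsize, segs);
	}
}
```

For the same pair of devices, a PE limited to one bus ends up with only
the segments behind the BARs, while a PE spanning all subordinate buses
also picks up the segments behind the bridge window.
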
 arch/powerpc/platforms/powernv/pci-ioda.c | 136 ++++++++++++++++++------------
 1 file changed, 80 insertions(+), 56 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3bb4ce8..46a5e10 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2959,76 +2959,100 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 }
 #endif /* CONFIG_PCI_IOV */
 
-/*
- * This function is supposed to be called on basis of PE from top
- * to bottom style. So the the I/O or MMIO segment assigned to
- * parent PE could be overrided by its child PEs if necessary.
- */
-static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
-				  struct pnv_ioda_pe *pe)
+static int pnv_ioda_map_pe_one_res(struct pci_controller *hose,
+				   struct pnv_ioda_pe *pe,
+				   struct resource *res)
 {
 	struct pnv_phb *phb = hose->private_data;
 	struct pci_bus_region region;
-	struct resource *res;
-	int i, index;
+	int index;
 	unsigned int segsize;
 	unsigned long *segmap, *pe_segmap;
 	uint16_t win;
 	int64_t rc;
 
-	/*
-	 * NOTE: We only care PCI bus based PE for now. For PCI
-	 * device based PE, for example SRIOV sensitive VF should
-	 * be figured out later.
-	 */
-	BUG_ON(!(pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)));
+	/* Check if we need map the resource */
+	if (!res->parent ||
+	    !res->flags ||
+	    res->start > res->end ||
+	    pnv_pci_is_mem_pref_64(res->flags))
+		return 0;
 
-	pci_bus_for_each_resource(pe->pbus, res, i) {
-		if (!res || !res->flags ||
-		    res->start > res->end)
-			continue;
+	if (res->flags & IORESOURCE_IO) {
+		region.start = res->start - phb->ioda.io_pci_base;
+		region.end   = res->end - phb->ioda.io_pci_base;
+		segsize      = phb->ioda.io_segsize;
+		segmap       = phb->ioda.io_segmap;
+		pe_segmap    = pe->io_segmap;
+		win          = OPAL_IO_WINDOW_TYPE;
+	} else if ((res->flags & IORESOURCE_MEM) &&
+		   !pnv_pci_is_mem_pref_64(res->flags)) {
+		region.start = res->start -
+			       hose->mem_offset[0] -
+			       phb->ioda.m32_pci_base;
+		region.end   = res->end -
+			       hose->mem_offset[0] -
+			       phb->ioda.m32_pci_base;
+		segsize      = phb->ioda.m32_segsize;
+		segmap       = phb->ioda.m32_segmap;
+		pe_segmap    = pe->m32_segmap;
+		win          = OPAL_M32_WINDOW_TYPE;
+	} else {
+		return 0;
+	}
 
-		if (res->flags & IORESOURCE_IO) {
-			region.start = res->start - phb->ioda.io_pci_base;
-			region.end   = res->end - phb->ioda.io_pci_base;
-			segsize      = phb->ioda.io_segsize;
-			segmap       = phb->ioda.io_segmap;
-			pe_segmap    = pe->io_segmap;
-			win          = OPAL_IO_WINDOW_TYPE;
-		} else if ((res->flags & IORESOURCE_MEM) &&
-			    !pnv_pci_is_mem_pref_64(res->flags)) {
-			region.start = res->start -
-				       hose->mem_offset[0] -
-				       phb->ioda.m32_pci_base;
-			region.end   = res->end -
-				       hose->mem_offset[0] -
-				       phb->ioda.m32_pci_base;
-			segsize      = phb->ioda.m32_segsize;
-			segmap       = phb->ioda.m32_segmap;
-			pe_segmap    = pe->m32_segmap;
-			win          = OPAL_M32_WINDOW_TYPE;
-		} else {
-			continue;
+	index = region.start / segsize;
+	while (index < phb->ioda.total_pe &&
+	       region.start <= region.end) {
+		set_bit(index, segmap);
+		set_bit(index, pe_segmap);
+
+		rc = opal_pci_map_pe_mmio_window(phb->opal_id,
+				pe->pe_number, win, 0, index);
+		if (rc != OPAL_SUCCESS) {
+			pr_err("%s: Error %lld mapping (%d) seg#%d to PHB#%d-PE#%d\n",
+				__func__, rc, win, index,
+				pe->phb->hose->global_number,
+				pe->pe_number);
+			return -EIO;
 		}
 
-		index = region.start / segsize;
-		while (index < phb->ioda.total_pe &&
-		       region.start <= region.end) {
-			set_bit(index, segmap);
-			set_bit(index, pe_segmap);
+		region.start += segsize;
+		index++;
+	}
 
-			rc = opal_pci_map_pe_mmio_window(phb->opal_id,
-					pe->pe_number, win, 0, index);
-			if (rc != OPAL_SUCCESS) {
-				pr_warn("%s: Error %lld mapping (%d) seg#%d to PHB#%d-PE#%d\n",
-					__func__, rc, win, index,
-					pe->phb->hose->global_number,
-					pe->pe_number);
-				break;
-			}
+	return 0;
+}
+
+static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
+				  struct pnv_ioda_pe *pe)
+{
+	struct pci_dev *pdev;
+	struct resource *res;
+	int i;
+
+	/* This function only works for bus dependent PE */
+	BUG_ON(!(pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)));
+
+	list_for_each_entry(pdev, &pe->pbus->devices, bus_list) {
+		for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
+			res = &pdev->resource[i];
+			if (pnv_ioda_map_pe_one_res(hose, pe, res))
+				return;
+		}
+
+		/* If the PE contains all subordinate PCI buses, the
+		 * resources of the child bridges should be mapped
+		 * to the PE as well.
+		 */
+		if (!(pe->flags & PNV_IODA_PE_BUS_ALL) ||
+		    (pdev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
+			continue;
 
-			region.start += segsize;
-			index++;
+		for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
+			res = &pdev->resource[PCI_BRIDGE_RESOURCES + i];
+			if (pnv_ioda_map_pe_one_res(hose, pe, res))
+				return;
 		}
 	}
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 07/42] powerpc/powernv: Calculate PHB's DMA weight dynamically
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (2 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 03/42] powerpc/powernv: M64 support improvement Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 08/42] powerpc/powernv: DMA32 cleanup Gavin Shan
                   ` (25 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

For P7IOC, the whole available DMA32 space, which is below the
MEM32 space, is divided evenly into 256MB segments. The number of
contiguous segments assigned to a particular PE depends on two
values: the PE's DMA weight, which is derived from the type of each
PCI device contained in the PE, and the PHB's DMA weight, which is
the accumulated DMA weight of the PEs contained in the PHB. This
means that the PHB's DMA weight calculation depends on the existing
PEs, which works today but isn't hotplug friendly. As the whole
available DMA32 space can be assigned to one PE on PHB3, the issue
doesn't exist on PHB3.

The patch calculates the PHB's DMA weight dynamically from the PCI
devices contained in the PHB, so that it's hotplug friendly.
Meanwhile, the patch removes the code handling DMA weight
for PHB3 in pnv_ioda_setup_dma().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 5/21]
  * Fixed line over 80 characters reported from checkpatch.pl
---
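The dynamic calculation amounts to walking every device under the PHB
and summing a per-device weight through a callback. The userspace sketch
below illustrates that pattern only. enum dev_kind, walk_devs() and
struct weight_ctx are hypothetical, the flat array stands in for the bus
hierarchy, and the zero weight for bridges is an assumption; the
3/15/10 weights scaled by the TCE32 segment count follow the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device classification for the sketch. */
enum dev_kind { DEV_BRIDGE, DEV_USB, DEV_RAID, DEV_OTHER };

struct dev {
	enum dev_kind kind;
};

/* Per-device weight, scaled by the number of TCE32 segments. */
static unsigned int dev_dma_weight(const struct dev *d,
				   unsigned int tce32_count)
{
	switch (d->kind) {
	case DEV_BRIDGE:
		return 0;			/* assumption: no DMA accounted */
	case DEV_USB:
		return 3 * tce32_count;
	case DEV_RAID:
		return 15 * tce32_count;
	default:
		return 10 * tce32_count;
	}
}

typedef int (*walk_fn)(const struct dev *d, void *data);

/* Call fn for each device; a non-zero return stops the walk. */
static void walk_devs(const struct dev *devs, size_t n, walk_fn fn,
		      void *data)
{
	assert(fn != NULL);
	for (size_t i = 0; i < n; i++)
		if (fn(&devs[i], data))
			break;
}

struct weight_ctx {
	unsigned int tce32_count;
	unsigned int total;
};

/* Walk callback: accumulate the weight of one device. */
static int add_weight(const struct dev *d, void *data)
{
	struct weight_ctx *ctx = data;

	ctx->total += dev_dma_weight(d, ctx->tce32_count);
	return 0;
}

/* PHB weight is recomputed from the devices present at call time. */
static unsigned int phb_dma_weight(const struct dev *devs, size_t n,
				   unsigned int tce32_count)
{
	struct weight_ctx ctx = { .tce32_count = tce32_count, .total = 0 };

	walk_devs(devs, n, add_weight, &ctx);
	return ctx.total;
}
```

Since nothing is cached on the PHB, a device that appears or disappears
is reflected the next time the weight is computed, which is the property
hotplug needs.
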
 arch/powerpc/platforms/powernv/pci-ioda.c | 90 +++++++++++++++----------------
 arch/powerpc/platforms/powernv/pci.h      |  6 ---
 2 files changed, 44 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 46a5e10..d9ff739 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -979,8 +979,11 @@ static void pnv_ioda_link_pe_by_weight(struct pnv_phb *phb,
 	list_add_tail(&pe->dma_link, &phb->ioda.pe_dma_list);
 }
 
-static unsigned int pnv_ioda_dma_weight(struct pci_dev *dev)
+static unsigned int pnv_ioda_dev_dma_weight(struct pci_dev *dev)
 {
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+
 	/* This is quite simplistic. The "base" weight of a device
 	 * is 10. 0 means no DMA is to be accounted for it.
 	 */
@@ -993,14 +996,34 @@ static unsigned int pnv_ioda_dma_weight(struct pci_dev *dev)
 	if (dev->class == PCI_CLASS_SERIAL_USB_UHCI ||
 	    dev->class == PCI_CLASS_SERIAL_USB_OHCI ||
 	    dev->class == PCI_CLASS_SERIAL_USB_EHCI)
-		return 3;
+		return 3 * phb->ioda.tce32_count;
 
 	/* Increase the weight of RAID (includes Obsidian) */
 	if ((dev->class >> 8) == PCI_CLASS_STORAGE_RAID)
-		return 15;
+		return 15 * phb->ioda.tce32_count;
 
 	/* Default */
-	return 10;
+	return 10 * phb->ioda.tce32_count;
+}
+
+static int __pnv_ioda_phb_dma_weight(struct pci_dev *pdev, void *data)
+{
+	unsigned int *dma_weight = data;
+
+	*dma_weight += pnv_ioda_dev_dma_weight(pdev);
+	return 0;
+}
+
+static unsigned int pnv_ioda_phb_dma_weight(struct pnv_phb *phb)
+{
+	unsigned int dma_weight = 0;
+
+	if (!phb->hose->bus)
+		return dma_weight;
+
+	pci_walk_bus(phb->hose->bus,
+		     __pnv_ioda_phb_dma_weight, &dma_weight);
+	return dma_weight;
 }
 
 #ifdef CONFIG_PCI_IOV
@@ -1159,7 +1182,7 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 			continue;
 		}
 		pdn->pe_number = pe->pe_number;
-		pe->dma_weight += pnv_ioda_dma_weight(dev);
+		pe->dma_weight += pnv_ioda_dev_dma_weight(dev);
 		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
 			pnv_ioda_setup_same_PE(dev->subordinate, pe);
 	}
@@ -1222,14 +1245,6 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 	/* Put PE to the list */
 	list_add_tail(&pe->list, &phb->ioda.pe_list);
 
-	/* Account for one DMA PE if at least one DMA capable device exist
-	 * below the bridge
-	 */
-	if (pe->dma_weight != 0) {
-		phb->ioda.dma_weight += pe->dma_weight;
-		phb->ioda.dma_pe_count++;
-	}
-
 	/* Link the PE */
 	pnv_ioda_link_pe_by_weight(phb, pe);
 }
@@ -2546,24 +2561,13 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 {
 	struct pci_controller *hose = phb->hose;
-	unsigned int residual, remaining, segs, tw, base;
 	struct pnv_ioda_pe *pe;
+	unsigned int dma_weight;
 
-	/* If we have more PE# than segments available, hand out one
-	 * per PE until we run out and let the rest fail. If not,
-	 * then we assign at least one segment per PE, plus more based
-	 * on the amount of devices under that PE
-	 */
-	if (phb->ioda.dma_pe_count > phb->ioda.tce32_count)
-		residual = 0;
-	else
-		residual = phb->ioda.tce32_count -
-			phb->ioda.dma_pe_count;
-
-	pr_info("PCI: Domain %04x has %ld available 32-bit DMA segments\n",
-		hose->global_number, phb->ioda.tce32_count);
-	pr_info("PCI: %d PE# for a total weight of %d\n",
-		phb->ioda.dma_pe_count, phb->ioda.dma_weight);
+	/* Calculate the PHB's DMA weight */
+	dma_weight = pnv_ioda_phb_dma_weight(phb);
+	pr_info("PCI%04x has %ld DMA32 segments, total weight %d\n",
+		hose->global_number, phb->ioda.tce32_count, dma_weight);
 
 	pnv_pci_ioda_setup_opal_tce_kill(phb);
 
@@ -2571,22 +2575,9 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 	 * out one base segment plus any residual segments based on
 	 * weight
 	 */
-	remaining = phb->ioda.tce32_count;
-	tw = phb->ioda.dma_weight;
-	base = 0;
 	list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
 		if (!pe->dma_weight)
 			continue;
-		if (!remaining) {
-			pe_warn(pe, "No DMA32 resources available\n");
-			continue;
-		}
-		segs = 1;
-		if (residual) {
-			segs += ((pe->dma_weight * residual)  + (tw / 2)) / tw;
-			if (segs > remaining)
-				segs = remaining;
-		}
 
 		/*
 		 * For IODA2 compliant PHB3, we needn't care about the weight.
@@ -2594,17 +2585,24 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 		 * the specific PE.
 		 */
 		if (phb->type == PNV_PHB_IODA1) {
+			unsigned int segs, base = 0;
+
+			if (pe->dma_weight <
+			    dma_weight / phb->ioda.tce32_count)
+				segs = 1;
+			else
+				segs = (pe->dma_weight *
+					phb->ioda.tce32_count) / dma_weight;
+
 			pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
 				pe->dma_weight, segs);
 			pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
+
+			base += segs;
 		} else {
 			pe_info(pe, "Assign DMA32 space\n");
-			segs = 0;
 			pnv_pci_ioda2_setup_dma_pe(phb, pe);
 		}
-
-		remaining -= segs;
-		base += segs;
 	}
 }
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 0a8cecb..38d8616 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -185,12 +185,6 @@ struct pnv_phb {
 			/* 32-bit TCE tables allocation */
 			unsigned long		tce32_count;
 
-			/* Total "weight" for the sake of DMA resources
-			 * allocation
-			 */
-			unsigned int		dma_weight;
-			unsigned int		dma_pe_count;
-
 			/* Sorted list of used PE's, sorted at
 			 * boot for resource allocation purposes
 			 */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 08/42] powerpc/powernv: DMA32 cleanup
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (3 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 07/42] powerpc/powernv: Calculate PHB's DMA weight dynamically Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-10  4:17   ` Alexey Kardashevskiy
  2015-06-04  6:41 ` [PATCH v5 09/42] powerpc/powernv: pnv_ioda_setup_dma() configure one PE only Gavin Shan
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch cleans up the DMA32 code in pci-ioda.c. It shouldn't
introduce behavioural changes:

   * Rename various fields in "struct pnv_phb" and "struct pnv_ioda_pe",
     as 32-bit DMA fields should be named after "DMA", not "TCE", and
     move them around to reflect their relationship and their relative
     importance.
   * Remove struct pnv_ioda_pe::tce32_segcount.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 5/21]
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 48 +++++++++++++++----------------
 arch/powerpc/platforms/powernv/pci.h      | 13 +++------
 2 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d9ff739..4af3d06 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -971,7 +971,7 @@ static void pnv_ioda_link_pe_by_weight(struct pnv_phb *phb,
 	struct pnv_ioda_pe *lpe;
 
 	list_for_each_entry(lpe, &phb->ioda.pe_dma_list, dma_link) {
-		if (lpe->dma_weight < pe->dma_weight) {
+		if (lpe->dma32_weight < pe->dma32_weight) {
 			list_add_tail(&pe->dma_link, &lpe->dma_link);
 			return;
 		}
@@ -996,14 +996,14 @@ static unsigned int pnv_ioda_dev_dma_weight(struct pci_dev *dev)
 	if (dev->class == PCI_CLASS_SERIAL_USB_UHCI ||
 	    dev->class == PCI_CLASS_SERIAL_USB_OHCI ||
 	    dev->class == PCI_CLASS_SERIAL_USB_EHCI)
-		return 3 * phb->ioda.tce32_count;
+		return 3 * phb->ioda.dma32_segcount;
 
 	/* Increase the weight of RAID (includes Obsidian) */
 	if ((dev->class >> 8) == PCI_CLASS_STORAGE_RAID)
-		return 15 * phb->ioda.tce32_count;
+		return 15 * phb->ioda.dma32_segcount;
 
 	/* Default */
-	return 10 * phb->ioda.tce32_count;
+	return 10 * phb->ioda.dma32_segcount;
 }
 
 static int __pnv_ioda_phb_dma_weight(struct pci_dev *pdev, void *data)
@@ -1182,7 +1182,7 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 			continue;
 		}
 		pdn->pe_number = pe->pe_number;
-		pe->dma_weight += pnv_ioda_dev_dma_weight(dev);
+		pe->dma32_weight += pnv_ioda_dev_dma_weight(dev);
 		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
 			pnv_ioda_setup_same_PE(dev->subordinate, pe);
 	}
@@ -1219,10 +1219,10 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 	pe->flags |= (all ? PNV_IODA_PE_BUS_ALL : PNV_IODA_PE_BUS);
 	pe->pbus = bus;
 	pe->pdev = NULL;
-	pe->tce32_seg = -1;
+	pe->dma32_seg = -1;
 	pe->mve_number = -1;
 	pe->rid = bus->busn_res.start << 8;
-	pe->dma_weight = 0;
+	pe->dma32_weight = 0;
 
 	if (all)
 		pe_info(pe, "Secondary bus %d..%d associated with PE#%d\n",
@@ -1585,7 +1585,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 		pe->flags = PNV_IODA_PE_VF;
 		pe->pbus = NULL;
 		pe->parent_dev = pdev;
-		pe->tce32_seg = -1;
+		pe->dma32_seg = -1;
 		pe->mve_number = -1;
 		pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
 			   pci_iov_virtfn_devfn(pdev, vf_index);
@@ -2061,7 +2061,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 	/* XXX FIXME: Allocate multi-level tables on PHB3 */
 
 	/* We shouldn't already have a 32-bit DMA associated */
-	if (WARN_ON(pe->tce32_seg >= 0))
+	if (WARN_ON(pe->dma32_seg >= 0))
 		return;
 
 	tbl = pnv_pci_table_alloc(phb->hose->node);
@@ -2070,7 +2070,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 	pnv_pci_link_table_and_group(phb->hose->node, 0, tbl, &pe->table_group);
 
 	/* Grab a 32-bit TCE table */
-	pe->tce32_seg = base;
+	pe->dma32_seg = base;
 	pe_info(pe, " Setting up 32-bit TCE table at %08x..%08x\n",
 		(base << 28), ((base + segs) << 28) - 1);
 
@@ -2131,8 +2131,8 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 	return;
  fail:
 	/* XXX Failure: Try to fallback to 64-bit only ? */
-	if (pe->tce32_seg >= 0)
-		pe->tce32_seg = -1;
+	if (pe->dma32_seg >= 0)
+		pe->dma32_seg = -1;
 	if (tce_mem)
 		__free_pages(tce_mem, get_order(TCE32_TABLE_SIZE * segs));
 	if (tbl) {
@@ -2520,7 +2520,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 	int64_t rc;
 
 	/* We shouldn't already have a 32-bit DMA associated */
-	if (WARN_ON(pe->tce32_seg >= 0))
+	if (WARN_ON(pe->dma32_seg >= 0))
 		return;
 
 	/* TVE #1 is selected by PCI address bit 59 */
@@ -2530,7 +2530,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 			pe->pe_number);
 
 	/* The PE will reserve all possible 32-bits space */
-	pe->tce32_seg = 0;
+	pe->dma32_seg = 0;
 	pe_info(pe, "Setting up 32-bit TCE table at 0..%08x\n",
 		phb->ioda.m32_pci_base);
 
@@ -2547,8 +2547,8 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 
 	rc = pnv_pci_ioda2_setup_default_config(pe);
 	if (rc) {
-		if (pe->tce32_seg >= 0)
-			pe->tce32_seg = -1;
+		if (pe->dma32_seg >= 0)
+			pe->dma32_seg = -1;
 		return;
 	}
 
@@ -2567,7 +2567,7 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 	/* Calculate the PHB's DMA weight */
 	dma_weight = pnv_ioda_phb_dma_weight(phb);
 	pr_info("PCI%04x has %ld DMA32 segments, total weight %d\n",
-		hose->global_number, phb->ioda.tce32_count, dma_weight);
+		hose->global_number, phb->ioda.dma32_segcount, dma_weight);
 
 	pnv_pci_ioda_setup_opal_tce_kill(phb);
 
@@ -2576,7 +2576,7 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 	 * weight
 	 */
 	list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
-		if (!pe->dma_weight)
+		if (!pe->dma32_weight)
 			continue;
 
 		/*
@@ -2587,15 +2587,15 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 		if (phb->type == PNV_PHB_IODA1) {
 			unsigned int segs, base = 0;
 
-			if (pe->dma_weight <
-			    dma_weight / phb->ioda.tce32_count)
+			if (pe->dma32_weight <
+			    dma_weight / phb->ioda.dma32_segcount)
 				segs = 1;
 			else
-				segs = (pe->dma_weight *
-					phb->ioda.tce32_count) / dma_weight;
+				segs = (pe->dma32_weight *
+					phb->ioda.dma32_segcount) / dma_weight;
 
 			pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
-				pe->dma_weight, segs);
+				pe->dma32_weight, segs);
 			pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
 
 			base += segs;
@@ -3314,7 +3314,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	mutex_init(&phb->ioda.pe_list_mutex);
 
 	/* Calculate how many 32-bit TCE segments we have */
-	phb->ioda.tce32_count = phb->ioda.m32_pci_base >> 28;
+	phb->ioda.dma32_segcount = phb->ioda.m32_pci_base >> 28;
 
 #if 0 /* We should really do that ... */
 	rc = opal_pci_set_phb_mem_window(opal->phb_id,
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 38d8616..5ea33ca 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -58,15 +58,10 @@ struct pnv_ioda_pe {
 	unsigned long		m32_segmap[8];
 	unsigned long		m64_segmap[8];
 
-	/* "Weight" assigned to the PE for the sake of DMA resource
-	 * allocations
-	 */
-	unsigned int		dma_weight;
-
 	/* "Base" iommu table, ie, 4K TCEs, 32-bit DMA */
-	int			tce32_seg;
-	int			tce32_segcount;
 	struct iommu_table_group table_group;
+	int			dma32_seg;
+	unsigned int		dma32_weight;
 
 	/* 64-bit TCE bypass region */
 	bool			tce_bypass_enabled;
@@ -182,8 +177,8 @@ struct pnv_phb {
 			 */
 			unsigned char		pe_rmap[0x10000];
 
-			/* 32-bit TCE tables allocation */
-			unsigned long		tce32_count;
+			/* Number of 32-bit DMA segments */
+			unsigned long		dma32_segcount;
 
 			/* Sorted list of used PE's, sorted at
 			 * boot for resource allocation purposes
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 09/42] powerpc/powernv: pnv_ioda_setup_dma() configure one PE only
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (4 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 08/42] powerpc/powernv: DMA32 cleanup Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 11/42] powerpc/powernv: Increase PE# capacity Gavin Shan
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The original implementation of pnv_ioda_setup_dma() iterates over the
list of PEs and configures the DMA32 space for them one by one.
The function was designed to be called at PHB fixup time. To
support PCI hotplug, a PE's DMA32 space has to be configured in
pcibios_setup_bridge(), so the function must be PE oriented.

The patch introduces one more argument "struct pnv_ioda_pe *pe"
to pnv_ioda_setup_dma(). The caller, pnv_pci_ioda_setup_DMA(),
gets each PE from the list and passes it in. The patch shouldn't
cause logic changes.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 06/21]
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 60 ++++++++++++++-----------------
 1 file changed, 27 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 4af3d06..63fad4d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2558,12 +2558,14 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 		pnv_ioda_setup_bus_dma(pe, pe->pbus);
 }
 
-static void pnv_ioda_setup_dma(struct pnv_phb *phb)
+static void pnv_ioda_setup_dma(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 {
 	struct pci_controller *hose = phb->hose;
-	struct pnv_ioda_pe *pe;
 	unsigned int dma_weight;
 
+	if (!pe->dma32_weight)
+		return;
+
 	/* Calculate the PHB's DMA weight */
 	dma_weight = pnv_ioda_phb_dma_weight(phb);
 	pr_info("PCI%04x has %ld DMA32 segments, total weight %d\n",
@@ -2571,38 +2573,28 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 
 	pnv_pci_ioda_setup_opal_tce_kill(phb);
 
-	/* Walk our PE list and configure their DMA segments, hand them
-	 * out one base segment plus any residual segments based on
-	 * weight
+	/*
+	 * For IODA2 compliant PHB3, we needn't care about the weight.
+	 * The all available 32-bits DMA space will be assigned to
+	 * the specific PE.
 	 */
-	list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
-		if (!pe->dma32_weight)
-			continue;
+	if (phb->type == PNV_PHB_IODA1) {
+		unsigned int segs, base = 0;
 
-		/*
-		 * For IODA2 compliant PHB3, we needn't care about the weight.
-		 * The all available 32-bits DMA space will be assigned to
-		 * the specific PE.
-		 */
-		if (phb->type == PNV_PHB_IODA1) {
-			unsigned int segs, base = 0;
-
-			if (pe->dma32_weight <
-			    dma_weight / phb->ioda.dma32_segcount)
-				segs = 1;
-			else
-				segs = (pe->dma32_weight *
-					phb->ioda.dma32_segcount) / dma_weight;
-
-			pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
-				pe->dma32_weight, segs);
-			pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
+		if (pe->dma32_weight <
+		    dma_weight / phb->ioda.dma32_segcount)
+			segs = 1;
+		else
+			segs = (pe->dma32_weight *
+				phb->ioda.dma32_segcount) / dma_weight;
 
-			base += segs;
-		} else {
-			pe_info(pe, "Assign DMA32 space\n");
-			pnv_pci_ioda2_setup_dma_pe(phb, pe);
-		}
+		pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
+			pe->dma32_weight, segs);
+		pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
+		base += segs;
+	} else {
+		pe_info(pe, "Assign DMA32 space\n");
+		pnv_pci_ioda2_setup_dma_pe(phb, pe);
 	}
 }
 
@@ -3073,12 +3065,14 @@ static void pnv_pci_ioda_setup_DMA(void)
 {
 	struct pci_controller *hose, *tmp;
 	struct pnv_phb *phb;
+	struct pnv_ioda_pe *pe;
 
 	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		pnv_ioda_setup_dma(hose->private_data);
+		phb = hose->private_data;
+		list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link)
+			pnv_ioda_setup_dma(phb, pe);
 
 		/* Mark the PHB initialization done */
-		phb = hose->private_data;
 		phb->initialized = 1;
 	}
 }
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 10/42] powerpc/powernv: Trace DMA32 segments consumed by PE
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

On P7IOC, the whole DMA32 space is divided evenly into 256MB segments.
Each PE can consume one or multiple DMA32 segments. The current code
tracks neither the available DMA32 segments nor those consumed by
one particular PE, which conflicts with PCI hotplug.

The patch adds one bitmap to the PHB to track the DMA32 segments
available for allocation, and a dma32_segcount field to
"struct pnv_ioda_pe" to track the DMA32 segments consumed by the PE,
which are going to be released when the PE is destroyed at PCI
unplug time.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 07/21]
  * Added space before open parenthesis reported by checkpatch.pl
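
The allocate-with-fallback loop added to pnv_ioda_setup_dma() can be
sketched in userspace C as below. This is a rough illustration, not the
kernel code: find_zero_run() and dma32_reserve() are hypothetical
helpers, a single 64-bit word stands in for phb->ioda.dma32_segmap, and
find_zero_run() returns nbits on failure as a simplification of what
bitmap_find_next_zero_area() does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the start of the first run of `count` clear bits within the
 * low `nbits` bits of `map`, or `nbits` if no such run exists.
 */
static unsigned int find_zero_run(uint64_t map, unsigned int nbits,
				  unsigned int count)
{
	unsigned int start, i;

	for (start = 0; start + count <= nbits; start++) {
		for (i = 0; i < count; i++)
			if (map & (1ULL << (start + i)))
				break;
		if (i == count)
			return start;
	}

	return nbits;
}

/*
 * Try to reserve `want` contiguous segments, shrinking the request by
 * one segment at a time until it fits. Returns the number of segments
 * reserved (0 if none are left) and stores the first one in *base.
 */
static unsigned int dma32_reserve(uint64_t *map, unsigned int nbits,
				  unsigned int want, unsigned int *base)
{
	unsigned int segs, i, b;

	assert(nbits <= 64);	/* the sketch uses a single word */

	for (segs = want; segs > 0; segs--) {
		b = find_zero_run(*map, nbits, segs);
		if (b < nbits) {
			for (i = 0; i < segs; i++)
				*map |= 1ULL << (b + i);
			*base = b;
			return segs;
		}
	}

	return 0;
}
```

Starting from an empty 8-segment map, requests for 4, 3 and 2 segments
end up with 4, 3 and 1 segments: the last request is trimmed because
only one segment is still free.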
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 24 +++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci.h      |  4 ++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 63fad4d..2087c5c 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2071,6 +2071,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 
 	/* Grab a 32-bit TCE table */
 	pe->dma32_seg = base;
+	pe->dma32_segcount = segs;
 	pe_info(pe, " Setting up 32-bit TCE table at %08x..%08x\n",
 		(base << 28), ((base + segs) << 28) - 1);
 
@@ -2131,8 +2132,10 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 	return;
  fail:
 	/* XXX Failure: Try to fallback to 64-bit only ? */
-	if (pe->dma32_seg >= 0)
+	if (pe->dma32_seg >= 0) {
+		bitmap_clear(phb->ioda.dma32_segmap, base, segs);
 		pe->dma32_seg = -1;
+	}
 	if (tce_mem)
 		__free_pages(tce_mem, get_order(TCE32_TABLE_SIZE * segs));
 	if (tbl) {
@@ -2531,6 +2534,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 
 	/* The PE will reserve all possible 32-bits space */
 	pe->dma32_seg = 0;
+	pe->dma32_segcount = 1;
 	pe_info(pe, "Setting up 32-bit TCE table at 0..%08x\n",
 		phb->ioda.m32_pci_base);
 
@@ -2588,6 +2592,24 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 			segs = (pe->dma32_weight *
 				phb->ioda.dma32_segcount) / dma_weight;
 
+		/* Allocate DMA32 segments as required. We might not have
+		 * enough available resource. However, we expect at least
+		 * one segment is allocated.
+		 */
+		do {
+			base = bitmap_find_next_zero_area(
+					phb->ioda.dma32_segmap,
+					phb->ioda.dma32_segcount,
+					0, segs, 0);
+			if (base < phb->ioda.dma32_segcount) {
+				bitmap_set(phb->ioda.dma32_segmap, base, segs);
+				break;
+			}
+		} while (--segs);
+
+		if (!segs)
+			return;
+
 		pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
 			pe->dma32_weight, segs);
 		pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 5ea33ca..94ef1df 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -61,6 +61,7 @@ struct pnv_ioda_pe {
 	/* "Base" iommu table, ie, 4K TCEs, 32-bit DMA */
 	struct iommu_table_group table_group;
 	int			dma32_seg;
+	int			dma32_segcount;
 	unsigned int		dma32_weight;
 
 	/* 64-bit TCE bypass region */
@@ -161,6 +162,9 @@ struct pnv_phb {
 			unsigned long		m32_segmap[8];
 			unsigned long		m64_segmap[8];
 
+			/* DMA32 segment maps */
+			unsigned long		dma32_segmap[8];
+
 			/* IRQ chip */
 			int			irq_chip_init;
 			struct irq_chip		irq_chip;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 10/42] powerpc/powernv: Trace DMA32 segments consumed by PE
@ 2015-06-04  6:41     ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

On P7IOC, the whole DMA32 space is divided evenly into 256MB segments.
Each PE can consume one or multiple DMA32 segments. The current code
tracks neither the available DMA32 segments nor those consumed by
one particular PE, which conflicts with PCI hotplug.

The patch adds one bitmap to the PHB to track the DMA32 segments
available for allocation, and a dma32_segcount field to
"struct pnv_ioda_pe" to track the DMA32 segments consumed by the PE,
which are going to be released when the PE is destroyed at PCI
unplug time.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 07/21]
  * Added space before open parenthesis reported by checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 24 +++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci.h      |  4 ++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 63fad4d..2087c5c 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2071,6 +2071,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 
 	/* Grab a 32-bit TCE table */
 	pe->dma32_seg = base;
+	pe->dma32_segcount = segs;
 	pe_info(pe, " Setting up 32-bit TCE table at %08x..%08x\n",
 		(base << 28), ((base + segs) << 28) - 1);
 
@@ -2131,8 +2132,10 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 	return;
  fail:
 	/* XXX Failure: Try to fallback to 64-bit only ? */
-	if (pe->dma32_seg >= 0)
+	if (pe->dma32_seg >= 0) {
+		bitmap_clear(phb->ioda.dma32_segmap, base, segs);
 		pe->dma32_seg = -1;
+	}
 	if (tce_mem)
 		__free_pages(tce_mem, get_order(TCE32_TABLE_SIZE * segs));
 	if (tbl) {
@@ -2531,6 +2534,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 
 	/* The PE will reserve all possible 32-bits space */
 	pe->dma32_seg = 0;
+	pe->dma32_segcount = 1;
 	pe_info(pe, "Setting up 32-bit TCE table at 0..%08x\n",
 		phb->ioda.m32_pci_base);
 
@@ -2588,6 +2592,24 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 			segs = (pe->dma32_weight *
 				phb->ioda.dma32_segcount) / dma_weight;
 
+		/* Allocate DMA32 segments as required. We might not have
+		 * enough available resource. However, we expect at least
+		 * one segment is allocated.
+		 */
+		do {
+			base = bitmap_find_next_zero_area(
+					phb->ioda.dma32_segmap,
+					phb->ioda.dma32_segcount,
+					0, segs, 0);
+			if (base < phb->ioda.dma32_segcount) {
+				bitmap_set(phb->ioda.dma32_segmap, base, segs);
+				break;
+			}
+		} while (--segs);
+
+		if (!segs)
+			return;
+
 		pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
 			pe->dma32_weight, segs);
 		pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 5ea33ca..94ef1df 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -61,6 +61,7 @@ struct pnv_ioda_pe {
 	/* "Base" iommu table, ie, 4K TCEs, 32-bit DMA */
 	struct iommu_table_group table_group;
 	int			dma32_seg;
+	int			dma32_segcount;
 	unsigned int		dma32_weight;
 
 	/* 64-bit TCE bypass region */
@@ -161,6 +162,9 @@ struct pnv_phb {
 			unsigned long		m32_segmap[8];
 			unsigned long		m64_segmap[8];
 
+			/* DMA32 segment maps */
+			unsigned long		dma32_segmap[8];
+
 			/* IRQ chip */
 			int			irq_chip_init;
 			struct irq_chip		irq_chip;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 11/42] powerpc/powernv: Increase PE# capacity
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (5 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 09/42] powerpc/powernv: pnv_ioda_setup_dma() configure one PE only Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-10  4:41   ` Alexey Kardashevskiy
  2015-06-04  6:41 ` [PATCH v5 12/42] powerpc/pci: Cleanup on pci_controller_ops Gavin Shan
                   ` (22 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

Each PHB maintains an array that translates a RID (Requester ID)
to a PE#, with the assumption that the PE# takes 8 bits, meaning
that we can't have more than 256 PEs. However, pci_dn->pe_number
already uses 4 bytes for the PE#.

The patch extends the PE# capacity so that each array entry is
4 bytes long. Then we can use IODA_INVALID_PE to check whether an
entry in phb->ioda.pe_rmap[] is valid or not.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 06/21]
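
As a side note on why memset() with 0xff is enough to invalidate the
widened map, here is a small userspace sketch. It assumes that
IODA_INVALID_PE is -1 and that int is two's complement, so filling
every byte with 0xff leaves -1 in every entry. INVALID_PE,
rmap_invalidate_all(), rmap_set() and rid_to_pe() are hypothetical
names used only for the illustration.

```c
#include <assert.h>
#include <string.h>

#define INVALID_PE	(-1)	/* assumed value of IODA_INVALID_PE */
#define RID_COUNT	0x10000	/* 8-bit bus number, 8-bit devfn */

static int pe_rmap[RID_COUNT];

/* Every byte becomes 0xff, so every int entry reads back as -1 */
static void rmap_invalidate_all(void)
{
	memset(pe_rmap, 0xff, sizeof(pe_rmap));
}

/* Record the PE# owning the given { bus, devfn } pair */
static void rmap_set(unsigned int bus, unsigned int devfn, int pe)
{
	pe_rmap[((bus & 0xff) << 8) | (devfn & 0xff)] = pe;
}

/* Look up the PE# for { bus, devfn }, INVALID_PE if unassigned */
static int rid_to_pe(unsigned int bus, unsigned int devfn)
{
	return pe_rmap[((bus & 0xff) << 8) | (devfn & 0xff)];
}
```

With int entries, a PE# such as 300 fits in the map, which an
unsigned char entry could not hold.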
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 5 ++++-
 arch/powerpc/platforms/powernv/pci.h      | 5 ++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2087c5c..d8b0ef5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -840,7 +840,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 
 	/* Clear the reverse map */
 	for (rid = pe->rid; rid < rid_end; rid++)
-		phb->ioda.pe_rmap[rid] = 0;
+		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
 
 	/* Release from all parents PELT-V */
 	while (parent) {
@@ -3303,6 +3303,9 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	if (prop32)
 		phb->ioda.reserved_pe = be32_to_cpup(prop32);
 
+	/* Invalidate RID to PE# mapping */
+	memset(phb->ioda.pe_rmap, 0xff, sizeof(phb->ioda.pe_rmap));
+
 	/* Parse 64-bit MMIO range */
 	pnv_ioda_parse_m64_window(phb);
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 94ef1df..590f778 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -175,11 +175,10 @@ struct pnv_phb {
 			struct list_head	pe_list;
 			struct mutex            pe_list_mutex;
 
-			/* Reverse map of PEs, will have to extend if
-			 * we are to support more than 256 PEs, indexed
+			/* Reverse map of PEs, indexed by
 			 * bus { bus, devfn }
 			 */
-			unsigned char		pe_rmap[0x10000];
+			int			pe_rmap[0x10000];
 
 			/* Number of 32-bit DMA segments */
 			unsigned long		dma32_segcount;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 12/42] powerpc/pci: Cleanup on pci_controller_ops
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (6 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 11/42] powerpc/powernv: Increase PE# capacity Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-10  4:43   ` Alexey Kardashevskiy
  2015-06-04  6:41 ` [PATCH v5 14/42] powerpc/powernv: Allocate PE# in deasending order Gavin Shan
                   ` (21 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan, Daniel Axtens

Each PHB maintains one instance of "struct pci_controller_ops",
which includes various callbacks called by the PCI subsystem. In
the definition of this struct, some callbacks have explicit names
for their arguments, but the rest don't.

The patch removes all explicit argument names from the callbacks
in "struct pci_controller_ops" to keep the code consistent.

Cc: Daniel Axtens <dja@axtens.net>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Newly introduced
---
 arch/powerpc/include/asm/pci-bridge.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 744884b..1252cd5 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -18,8 +18,8 @@ struct device_node;
  * PCI controller operations
  */
 struct pci_controller_ops {
-	void		(*dma_dev_setup)(struct pci_dev *dev);
-	void		(*dma_bus_setup)(struct pci_bus *bus);
+	void		(*dma_dev_setup)(struct pci_dev *);
+	void		(*dma_bus_setup)(struct pci_bus *);
 
 	int		(*probe_mode)(struct pci_bus *);
 
@@ -28,8 +28,8 @@ struct pci_controller_ops {
 	bool		(*enable_device_hook)(struct pci_dev *);
 
 	/* Called during PCI resource reassignment */
-	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long type);
-	void		(*reset_secondary_bus)(struct pci_dev *dev);
+	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long);
+	void		(*reset_secondary_bus)(struct pci_dev *);
 };
 
 /*
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 13/42] powerpc/pci: Override pcibios_setup_bridge()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch overrides pcibios_setup_bridge(), which is called to update
PCI bridge windows on completion of PCI resource assignment. The next
patch uses it to assign PEs and set up the various resource mappings.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from [PATCH v5 v4 06/21]
---
 arch/powerpc/include/asm/pci-bridge.h | 1 +
 arch/powerpc/kernel/pci-common.c      | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 1252cd5..1f39ca7 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -29,6 +29,7 @@ struct pci_controller_ops {
 
 	/* Called during PCI resource reassignment */
 	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long);
+	void		(*setup_bridge)(struct pci_bus *, unsigned long);
 	void		(*reset_secondary_bus)(struct pci_dev *);
 };
 
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0d05406..0358f24 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -122,6 +122,14 @@ resource_size_t pcibios_window_alignment(struct pci_bus *bus,
 	return 1;
 }
 
+void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+
+	if (hose->controller_ops.setup_bridge)
+		hose->controller_ops.setup_bridge(bus, type);
+}
+
 void pcibios_reset_secondary_bus(struct pci_dev *dev)
 {
 	struct pci_controller *phb = pci_bus_to_host(dev->bus);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 14/42] powerpc/powernv: Allocate PE# in descending order
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (7 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 12/42] powerpc/pci: Cleanup on pci_controller_ops Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 17/42] powerpc/powernv: PE oriented during configuration Gavin Shan
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The available PE#, represented by a bitmap in the PHB, is allocated
in ascending order. That conflicts with the fact that M64 segments are
assigned in the same order. In order to avoid the conflict, the patch
allocates PE# in descending order.
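
A minimal userspace sketch of descending allocation from a bitmap is
shown here for illustration. It is not the kernel code: the helpers
(test_bit_ul, set_bit_ul, alloc_pe_descending) and INVALID_PE are
hypothetical, and the bit update is not atomic, whereas the patch
relies on test_and_set_bit().

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define INVALID_PE	(-1)	/* hypothetical "no PE" marker */

static int test_bit_ul(const unsigned long *map, size_t nr)
{
	return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}

static void set_bit_ul(unsigned long *map, size_t nr)
{
	map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/* Return the highest free PE# below total_pe and mark it as used. */
static int alloc_pe_descending(unsigned long *map, size_t total_pe)
{
	size_t pe;

	/* Walk from total_pe - 1 down to 0 without unsigned underflow. */
	for (pe = total_pe; pe-- > 0; ) {
		if (!test_bit_ul(map, pe)) {
			set_bit_ul(map, pe);
			return (int)pe;
		}
	}

	return INVALID_PE;
}
```

With four PEs and an empty bitmap, successive calls hand out 3, 2, 1
and 0, leaving the low-numbered PEs free for as long as possible.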

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from [PATCH v5 v4 06/21]
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d8b0ef5..0d6539a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -152,18 +152,23 @@ static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
 
 static int pnv_ioda_alloc_pe(struct pnv_phb *phb)
 {
-	unsigned long pe;
+	unsigned long pe_no;
+	unsigned long limit = phb->ioda.total_pe - 1;
 
 	do {
-		pe = find_next_zero_bit(phb->ioda.pe_alloc,
-					phb->ioda.total_pe, 0);
-		if (pe >= phb->ioda.total_pe)
+		pe_no = find_next_zero_bit(phb->ioda.pe_alloc,
+					   phb->ioda.total_pe, limit);
+		if (pe_no < phb->ioda.total_pe &&
+		    !test_and_set_bit(pe_no, phb->ioda.pe_alloc))
+			break;
+
+		if (--limit >= phb->ioda.total_pe)
 			return IODA_INVALID_PE;
-	} while(test_and_set_bit(pe, phb->ioda.pe_alloc));
+	} while (1);
 
-	phb->ioda.pe_array[pe].phb = phb;
-	phb->ioda.pe_array[pe].pe_number = pe;
-	return pe;
+	phb->ioda.pe_array[pe_no].phb = phb;
+	phb->ioda.pe_array[pe_no].pe_number = pe_no;
+	return pe_no;
 }
 
 static void pnv_ioda_free_pe(struct pnv_phb *phb, int pe)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 15/42] powerpc/powernv: Reserve PE# for root bus
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
  Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, aik-sLpHqDYs0B2HXe+LvDLADg,
	panto-wVdstyuyKrO8r51toPun2/C9HSW9iNxf,
	robherring2-Re5JQEeQqe8AvxtiuMwx3w,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A, Gavin Shan

pcibios_setup_bridge(), called to update PCI bridge windows, will
allocate PEs for PCI buses. The function isn't called for the root
bus, which doesn't have an upstream bridge. The patch reserves the
PE# for the root bus in advance so that it can be set up in the next
patch.
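
The selection rule for the root bus PE# can be restated as a small
standalone function, given here as a sketch only. choose_root_pe and
ROOT_PE_NONE are hypothetical names; ROOT_PE_NONE merely plays the
role that IODA_INVALID_PE plays in the patch and its value is an
assumption.

```c
#include <limits.h>

#define ROOT_PE_NONE	UINT_MAX	/* hypothetical "not reserved" marker */

/*
 * Pick a PE# for the root bus at the high end of the PE space:
 *  - reserved PE is 0:             use the last PE#
 *  - reserved PE is the last PE#:  use the one just below it
 *  - otherwise:                    reserve nothing in advance
 */
static unsigned int choose_root_pe(unsigned int reserved_pe,
				   unsigned int total_pe)
{
	if (reserved_pe == 0)
		return total_pe - 1;
	if (reserved_pe == total_pe - 1)
		return reserved_pe - 1;

	return ROOT_PE_NONE;
}
```

For example, with 256 PEs a reserved PE# of 0 gives root PE# 255, and
a reserved PE# of 255 gives root PE# 254.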

Signed-off-by: Gavin Shan <gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
v5:
  * Split from [PATCH v5 v4 06/21]
  * Replace "strip of" with "strip off" in comments
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 31 ++++++++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci.h      |  1 +
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 0d6539a..2eb8baa 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -230,6 +230,13 @@ static int pnv_ioda1_init_m64(struct pnv_phb *phb)
 		pr_warn("  Cannot strip M64 segment for reserved PE#%d\n",
 			phb->ioda.reserved_pe);
 
+	/* Strip off the segment used by PE for PCI root bus,
+	 * which is last supported PE#, or one next to the
+	 * reserved PE#
+	 */
+	if (phb->ioda.root_pe != IODA_INVALID_PE)
+		r->end -= phb->ioda.m64_segsize;
+
 	return 0;
 
 fail:
@@ -287,6 +294,13 @@ static int pnv_ioda2_init_m64(struct pnv_phb *phb)
 		pr_warn("  Cannot strip M64 segment for reserved PE#%d\n",
 			phb->ioda.reserved_pe);
 
+	/* Strip off the segment used by PE for PCI root bus,
+	 * which is last supported PE#, or one next to the
+	 * reserved PE#
+	 */
+	if (phb->ioda.root_pe != IODA_INVALID_PE)
+		r->end -= phb->ioda.m64_segsize;
+
 	return 0;
 
 fail:
@@ -3331,7 +3345,22 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	aux = memblock_virt_alloc(size, 0);
 	phb->ioda.pe_alloc = aux;
 	phb->ioda.pe_array = aux + pemap_off;
-	set_bit(phb->ioda.reserved_pe, phb->ioda.pe_alloc);
+
+	/* Choose the PE number for the root bus, which shouldn't consume
+	 * any M64 resource. So we avoid picking a low-end PE#, which
+	 * is usually bound closely to 64-bit prefetchable memory
+	 * resources.
+	 */
+	pnv_ioda_reserve_pe(phb, phb->ioda.reserved_pe);
+	if (phb->ioda.reserved_pe == 0) {
+		phb->ioda.root_pe = phb->ioda.total_pe - 1;
+		pnv_ioda_reserve_pe(phb, phb->ioda.root_pe);
+	} else if (phb->ioda.reserved_pe == (phb->ioda.total_pe - 1)) {
+		phb->ioda.root_pe = phb->ioda.reserved_pe - 1;
+		pnv_ioda_reserve_pe(phb, phb->ioda.root_pe);
+	} else {
+		phb->ioda.root_pe = IODA_INVALID_PE;
+	}
 
 	INIT_LIST_HEAD(&phb->ioda.pe_dma_list);
 	INIT_LIST_HEAD(&phb->ioda.pe_list);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 590f778..e372b9f 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -133,6 +133,7 @@ struct pnv_phb {
 		struct {
 			/* Global bridge info */
 			unsigned int		total_pe;
+			unsigned int		root_pe;
 			unsigned int		reserved_pe;
 
 			/* 32-bit MMIO window */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 16/42] powerpc/powernv: Create PEs dynamically
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
  Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, aik-sLpHqDYs0B2HXe+LvDLADg,
	panto-wVdstyuyKrO8r51toPun2/C9HSW9iNxf,
	robherring2-Re5JQEeQqe8AvxtiuMwx3w,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A, Gavin Shan

Currently, the PEs and their associated resources are assigned
in ppc_md.pcibios_fixup(), except those consumed by SRIOV VFs.
The function is called only once, after PCI probing and resource
assignment are finished, so it's not hotplug friendly.

The patch creates PEs dynamically from the setup_bridge() hook of
pci_controller_ops, which pcibios_setup_bridge() invokes during
system bootup and PCI hotplug whenever a PCI bridge's windows are
updated after resource assignment/reassignment finishes. For the
partial hotplug case, where not all PCI devices belonging to the PE
are unplugged and plugged again, we just need to unbind/bind the
affected PCI devices with the corresponding PE without creating a
new one.
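
The reuse-or-create decision described above can be sketched as
follows. This is a simplified model, not the kernel implementation:
phb_stub, bus_to_pe, get_bus_pe and NO_PE are hypothetical, with
bus_to_pe standing in for the PHB's reverse map. The "created" flag
plays the role of the NULL return in the patch, which tells the caller
to skip resource setup for a PE that already exists.

```c
#define NO_PE	(-1)	/* hypothetical "no PE" marker */
#define MAX_BUS	256

struct phb_stub {
	int bus_to_pe[MAX_BUS];	/* bus number -> PE#, NO_PE if unmapped */
	int next_pe;		/* next PE# to hand out, counting down */
};

static void phb_stub_init(struct phb_stub *phb, int total_pe)
{
	int i;

	for (i = 0; i < MAX_BUS; i++)
		phb->bus_to_pe[i] = NO_PE;
	phb->next_pe = total_pe - 1;
}

/* Return the PE# for the bus; *created is 1 only for a new PE. */
static int get_bus_pe(struct phb_stub *phb, unsigned int bus, int *created)
{
	*created = 0;
	if (bus >= MAX_BUS)
		return NO_PE;

	/* Partial hotplug: the PE still exists, so reuse it. */
	if (phb->bus_to_pe[bus] != NO_PE)
		return phb->bus_to_pe[bus];

	/* Out of PE numbers. */
	if (phb->next_pe < 0)
		return NO_PE;

	phb->bus_to_pe[bus] = phb->next_pe--;
	*created = 1;
	return phb->bus_to_pe[bus];
}
```

A caller would set up MMIO segments and DMA only when "created" is
set, and would just rebind the devices otherwise.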

Besides, additional resources (e.g. M32) might be required in the
windows of the PCI bridge when the current adapter is unplugged and
a different adapter is inserted into the same PCI slot, which is
assumed to be behind the root port or a downstream port of the PCIe
switch behind the root port. The parent bridge of the newly plugged
adapter would reject the request for more resources, leading to
hotplug failure. To avoid that, the patch extends the windows of the
root port, or of the upstream port of the PCIe switch behind the
root port, to the PHB's windows when the setup_bridge() hook is
called.
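
The window extension can be modelled in standalone C as below. This is
a sketch under stated assumptions, not the code in this patch: the
flag values, struct window, struct phb_windows, pick_window and
extend_bridge_window are all hypothetical, and the 64-bit prefetchable
test is reduced to a single flag. The guard for the case where no PHB
window matches is a defensive choice made for the sketch.

```c
#include <stddef.h>

/* Hypothetical resource type flags. */
#define RES_IO		0x1UL
#define RES_MEM		0x2UL
#define RES_PREFETCH	0x4UL

struct window {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

struct phb_windows {
	struct window io;	/* I/O window */
	struct window mem32;	/* 32-bit MMIO window */
	struct window mem64;	/* 64-bit prefetchable window */
};

/* Select the PHB window matching the bridge resource, or NULL. */
static const struct window *pick_window(const struct phb_windows *phb,
					unsigned long flags,
					unsigned long type)
{
	if (flags & type & RES_IO)
		return &phb->io;
	if ((flags & RES_PREFETCH) && (type & RES_PREFETCH))
		return &phb->mem64;
	if (flags & type & RES_MEM)
		return &phb->mem32;

	return NULL;
}

/* Grow bridge window r to the matching PHB window; 1 if changed. */
static int extend_bridge_window(struct window *r,
				const struct phb_windows *phb,
				unsigned long type)
{
	const struct window *w = pick_window(phb, r->flags, type);

	if (!w)
		return 0;	/* nothing matches: leave r untouched */

	r->start = w->start;
	r->end = w->end;
	return 1;
}
```

Because the bridge window then spans the whole PHB window, a later
adapter that needs more space than its predecessor still fits.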

There is no upstream bridge for the root bus, so we have to set up
its PE before any other PE is created, because the root bus PE is
the ancestor of all the others.

Signed-off-by: Gavin Shan <gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
v5:
  * Derived from [PATCH v5 v4 06/21]
  * Correct "accommodate" reported by checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 203 +++++++++++++++++++-----------
 arch/powerpc/platforms/powernv/pci.h      |   1 +
 2 files changed, 128 insertions(+), 76 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2eb8baa..fd2f898 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1200,6 +1200,13 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 				pci_name(dev));
 			continue;
 		}
+
+		/* The PCI device might have been associated with the PE
+		 * in case of partial hotplug.
+		 */
+		if (pdn->pe_number != IODA_INVALID_PE)
+			continue;
+
 		pdn->pe_number = pe->pe_number;
 		pe->dma32_weight += pnv_ioda_dev_dma_weight(dev);
 		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
@@ -1213,15 +1220,31 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
  * subordinate PCI devices and buses. The second type of PE is normally
  * orgiriated by PCIe-to-PCI bridge or PLX switch downstream ports.
  */
-static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
+static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
 	struct pnv_phb *phb = hose->private_data;
 	struct pnv_ioda_pe *pe;
 	int pe_num = IODA_INVALID_PE;
 
+	/* For partial hotplug case, the PE instance hasn't been destroyed
+	 * yet. We shouldn't allocate a new one and assign resources to
+	 * it. The existing PE instance should be reused, and we only need
+	 * to associate the devices with the PE.
+	 */
+	pe_num = phb->ioda.pe_rmap[bus->number << 8];
+	if (pe_num != IODA_INVALID_PE) {
+		pe = &phb->ioda.pe_array[pe_num];
+		pnv_ioda_setup_same_PE(bus, pe);
+		return NULL;
+	}
+
+	/* PE number for root bus should have been reserved */
+	if (pci_is_root_bus(bus))
+		pe_num = phb->ioda.root_pe;
+
 	/* Check if PE is determined by M64 */
-	if (phb->pick_m64_pe)
+	if (pe_num == IODA_INVALID_PE && phb->pick_m64_pe)
 		pe_num = phb->pick_m64_pe(phb, bus, all);
 
 	/* The PE number isn't pinned by M64 */
@@ -1231,7 +1254,7 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 	if (pe_num == IODA_INVALID_PE) {
 		pr_warning("%s: Not enough PE# available for PCI bus %04x:%02x\n",
 			__func__, pci_domain_nr(bus), bus->number);
-		return;
+		return NULL;
 	}
 
 	pe = &phb->ioda.pe_array[pe_num];
@@ -1255,7 +1278,7 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 		if (pe_num)
 			pnv_ioda_free_pe(phb, pe_num);
 		pe->pbus = NULL;
-		return;
+		return NULL;
 	}
 
 	/* Associate it with all child devices */
@@ -1266,46 +1289,8 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 
 	/* Link the PE */
 	pnv_ioda_link_pe_by_weight(phb, pe);
-}
-
-static void pnv_ioda_setup_PEs(struct pci_bus *bus)
-{
-	struct pci_dev *dev;
-
-	pnv_ioda_setup_bus_PE(bus, 0);
-
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		if (dev->subordinate) {
-			if (pci_pcie_type(dev) == PCI_EXP_TYPE_PCI_BRIDGE)
-				pnv_ioda_setup_bus_PE(dev->subordinate, 1);
-			else
-				pnv_ioda_setup_PEs(dev->subordinate);
-		}
-	}
-}
-
-/*
- * Configure PEs so that the downstream PCI buses and devices
- * could have their associated PE#. Unfortunately, we didn't
- * figure out the way to identify the PLX bridge yet. So we
- * simply put the PCI bus and the subordinate behind the root
- * port to PE# here. The game rule here is expected to be changed
- * as soon as we can detected PLX bridge correctly.
- */
-static void pnv_pci_ioda_setup_PEs(void)
-{
-	struct pci_controller *hose, *tmp;
-	struct pnv_phb *phb;
 
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		phb = hose->private_data;
-
-		/* M64 layout might affect PE allocation */
-		if (phb->reserve_m64_pe)
-			phb->reserve_m64_pe(phb, phb->hose->bus);
-
-		pnv_ioda_setup_PEs(hose->bus);
-	}
+	return pe;
 }
 
 #ifdef CONFIG_PCI_IOV
@@ -3088,36 +3073,6 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
 	}
 }
 
-static void pnv_pci_ioda_setup_seg(void)
-{
-	struct pci_controller *tmp, *hose;
-	struct pnv_phb *phb;
-	struct pnv_ioda_pe *pe;
-
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		phb = hose->private_data;
-		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
-			pnv_ioda_setup_pe_seg(hose, pe);
-		}
-	}
-}
-
-static void pnv_pci_ioda_setup_DMA(void)
-{
-	struct pci_controller *hose, *tmp;
-	struct pnv_phb *phb;
-	struct pnv_ioda_pe *pe;
-
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		phb = hose->private_data;
-		list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link)
-			pnv_ioda_setup_dma(phb, pe);
-
-		/* Mark the PHB initialization done */
-		phb->initialized = 1;
-	}
-}
-
 static void pnv_pci_ioda_create_dbgfs(void)
 {
 #ifdef CONFIG_DEBUG_FS
@@ -3139,9 +3094,8 @@ static void pnv_pci_ioda_create_dbgfs(void)
 
 static void pnv_pci_ioda_fixup(void)
 {
-	pnv_pci_ioda_setup_PEs();
-	pnv_pci_ioda_setup_seg();
-	pnv_pci_ioda_setup_DMA();
+	struct pci_controller *tmp, *hose;
+	struct pnv_phb *phb;
 
 	pnv_pci_ioda_create_dbgfs();
 
@@ -3149,6 +3103,12 @@ static void pnv_pci_ioda_fixup(void)
 	eeh_init();
 	eeh_addr_cache_build();
 #endif
+
+	/* Notify initialization of PHB done */
+	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
+		phb = hose->private_data;
+		phb->initialized = 1;
+	}
 }
 
 /*
@@ -3192,6 +3152,96 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
 	return phb->ioda.io_segsize;
 }
 
+/*
+ * We are updating root port or the upstream bridge behind the root
+ * port with PHB's various windows in order to accommodate the changes
+ * on required resources during PCI (slot) hotplug, which is connected
+ * to either root port, or the downstream ports of PCIe switch behind
+ * the root port.
+ */
+static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
+					   unsigned long type)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dev *bridge = bus->self;
+	struct resource *r, *w;
+	int i;
+
+	/* Check if we need to apply the fixup to the bridge's windows */
+	if (!pci_is_root_bus(bridge->bus) &&
+	    !pci_is_root_bus(bridge->bus->self->bus))
+		return;
+
+	/* Fix up the resources */
+	for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
+		r = &bridge->resource[PCI_BRIDGE_RESOURCES + i];
+		if (!r->flags || !r->parent)
+			continue;
+
+		w = NULL;
+		if (r->flags & type & IORESOURCE_IO)
+			w = &hose->io_resource;
+		else if (pnv_pci_is_mem_pref_64(r->flags) &&
+			(type & IORESOURCE_PREFETCH) &&
+			phb->ioda.m64_segsize)
+			w = &hose->mem_resources[1];
+		else if (r->flags & type & IORESOURCE_MEM)
+			w = &hose->mem_resources[0];
+
+		r->start = w->start;
+		r->end = w->end;
+	}
+
+}
+
+static void pnv_pci_setup_bridge(struct pci_bus *bus,
+				 unsigned long type)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dev *bridge = bus->self;
+	struct pnv_ioda_pe *pe;
+
+	/* The root bus (ancestor PE) should be finalized before anyone else */
+	if (!phb->ioda.root_pe_populated) {
+		pe = pnv_ioda_setup_bus_PE(phb->hose->bus, 0);
+		if (pe && phb->ioda.root_pe == IODA_INVALID_PE)
+			phb->ioda.root_pe = pe->pe_number;
+		phb->ioda.root_pe_populated = 1;
+	}
+
+	/* Extend bridge's windows if necessary */
+	pnv_pci_fixup_bridge_resources(bus, type);
+
+	/* Don't assign PE to bus, which doesn't have any subordinate
+	 * PCI devices on it.
+	 */
+	if (list_empty(&bus->devices))
+		return;
+
+	/* Reserve PEs for M64 resource */
+	if (phb->reserve_m64_pe)
+		phb->reserve_m64_pe(phb, bus);
+
+	/* Assign PE. We might run here because of partial hotplug.
+	 * For the case, we just pick up the existing PE and should
+	 * not allocate resources again.
+	 */
+	if (pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE)
+		pe = pnv_ioda_setup_bus_PE(bus, 1);
+	else
+		pe = pnv_ioda_setup_bus_PE(bus, 0);
+	if (!pe)
+		return;
+
+	/* Setup MMIO mapping */
+	pnv_ioda_setup_pe_seg(hose, pe);
+
+	/* Setup DMA */
+	pnv_ioda_setup_dma(phb, pe);
+}
+
 #ifdef CONFIG_PCI_IOV
 static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
@@ -3418,6 +3468,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
 	pnv_pci_controller_ops.enable_device_hook = pnv_pci_enable_device_hook;
 	pnv_pci_controller_ops.window_alignment = pnv_pci_window_alignment;
+	pnv_pci_controller_ops.setup_bridge = pnv_pci_setup_bridge;
 	pnv_pci_controller_ops.reset_secondary_bus = pnv_pci_reset_secondary_bus;
 	hose->controller_ops = pnv_pci_controller_ops;
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index e372b9f..45a6450 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -134,6 +134,7 @@ struct pnv_phb {
 			/* Global bridge info */
 			unsigned int		total_pe;
 			unsigned int		root_pe;
+			unsigned int		root_pe_populated;
 			unsigned int		reserved_pe;
 
 			/* 32-bit MMIO window */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 16/42] powerpc/powernv: Create PEs dynamically
@ 2015-06-04  6:41     ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

Currently, the PEs and their associated resources are assigned
in ppc_md.pcibios_fixup(), except those consumed by SRIOV VFs.
The function is called only once, after PCI probing and resource
assignment are finished, so it's not hotplug friendly.

The patch creates PEs dynamically from the setup_bridge() hook of
pci_controller_ops, which pcibios_setup_bridge() invokes during
system bootup and PCI hotplug whenever a PCI bridge's windows are
updated after resource assignment/reassignment finishes. For the
partial hotplug case, where not all PCI devices belonging to the PE
are unplugged and plugged again, we just need to unbind/bind the
affected PCI devices with the corresponding PE without creating a
new one.

Besides, additional resources (e.g. M32) might be required in the
windows of the PCI bridge when the current adapter is unplugged and
a different adapter is inserted into the same PCI slot, which is
assumed to be behind the root port or a downstream port of the PCIe
switch behind the root port. The parent bridge of the newly plugged
adapter would reject the request for more resources, leading to
hotplug failure. To avoid that, the patch extends the windows of the
root port, or of the upstream port of the PCIe switch behind the
root port, to the PHB's windows when the setup_bridge() hook is
called.

There is no upstream bridge for the root bus, so we have to set up
its PE before any other PE is created, because the root bus PE is
the ancestor of all the others.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from [PATCH v5 v4 06/21]
  * Correct "accommodate" reported by checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 203 +++++++++++++++++++-----------
 arch/powerpc/platforms/powernv/pci.h      |   1 +
 2 files changed, 128 insertions(+), 76 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2eb8baa..fd2f898 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1200,6 +1200,13 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 				pci_name(dev));
 			continue;
 		}
+
+		/* The PCI device might have been associated with the PE
+		 * in case of partial hotplug.
+		 */
+		if (pdn->pe_number != IODA_INVALID_PE)
+			continue;
+
 		pdn->pe_number = pe->pe_number;
 		pe->dma32_weight += pnv_ioda_dev_dma_weight(dev);
 		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
@@ -1213,15 +1220,31 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
  * subordinate PCI devices and buses. The second type of PE is normally
  * orgiriated by PCIe-to-PCI bridge or PLX switch downstream ports.
  */
-static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
+static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
 	struct pnv_phb *phb = hose->private_data;
 	struct pnv_ioda_pe *pe;
 	int pe_num = IODA_INVALID_PE;
 
+	/* For partial hotplug case, the PE instance hasn't been destroyed
+	 * yet. We shouldn't allocate a new one and assign resources to
+	 * it. The existing PE instance should be reused, and we only need
+	 * to associate the devices with the PE.
+	 */
+	pe_num = phb->ioda.pe_rmap[bus->number << 8];
+	if (pe_num != IODA_INVALID_PE) {
+		pe = &phb->ioda.pe_array[pe_num];
+		pnv_ioda_setup_same_PE(bus, pe);
+		return NULL;
+	}
+
+	/* PE number for root bus should have been reserved */
+	if (pci_is_root_bus(bus))
+		pe_num = phb->ioda.root_pe;
+
 	/* Check if PE is determined by M64 */
-	if (phb->pick_m64_pe)
+	if (pe_num == IODA_INVALID_PE && phb->pick_m64_pe)
 		pe_num = phb->pick_m64_pe(phb, bus, all);
 
 	/* The PE number isn't pinned by M64 */
@@ -1231,7 +1254,7 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 	if (pe_num == IODA_INVALID_PE) {
 		pr_warning("%s: Not enough PE# available for PCI bus %04x:%02x\n",
 			__func__, pci_domain_nr(bus), bus->number);
-		return;
+		return NULL;
 	}
 
 	pe = &phb->ioda.pe_array[pe_num];
@@ -1255,7 +1278,7 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 		if (pe_num)
 			pnv_ioda_free_pe(phb, pe_num);
 		pe->pbus = NULL;
-		return;
+		return NULL;
 	}
 
 	/* Associate it with all child devices */
@@ -1266,46 +1289,8 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 
 	/* Link the PE */
 	pnv_ioda_link_pe_by_weight(phb, pe);
-}
-
-static void pnv_ioda_setup_PEs(struct pci_bus *bus)
-{
-	struct pci_dev *dev;
-
-	pnv_ioda_setup_bus_PE(bus, 0);
-
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		if (dev->subordinate) {
-			if (pci_pcie_type(dev) == PCI_EXP_TYPE_PCI_BRIDGE)
-				pnv_ioda_setup_bus_PE(dev->subordinate, 1);
-			else
-				pnv_ioda_setup_PEs(dev->subordinate);
-		}
-	}
-}
-
-/*
- * Configure PEs so that the downstream PCI buses and devices
- * could have their associated PE#. Unfortunately, we didn't
- * figure out the way to identify the PLX bridge yet. So we
- * simply put the PCI bus and the subordinate behind the root
- * port to PE# here. The game rule here is expected to be changed
- * as soon as we can detected PLX bridge correctly.
- */
-static void pnv_pci_ioda_setup_PEs(void)
-{
-	struct pci_controller *hose, *tmp;
-	struct pnv_phb *phb;
 
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		phb = hose->private_data;
-
-		/* M64 layout might affect PE allocation */
-		if (phb->reserve_m64_pe)
-			phb->reserve_m64_pe(phb, phb->hose->bus);
-
-		pnv_ioda_setup_PEs(hose->bus);
-	}
+	return pe;
 }
 
 #ifdef CONFIG_PCI_IOV
@@ -3088,36 +3073,6 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
 	}
 }
 
-static void pnv_pci_ioda_setup_seg(void)
-{
-	struct pci_controller *tmp, *hose;
-	struct pnv_phb *phb;
-	struct pnv_ioda_pe *pe;
-
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		phb = hose->private_data;
-		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
-			pnv_ioda_setup_pe_seg(hose, pe);
-		}
-	}
-}
-
-static void pnv_pci_ioda_setup_DMA(void)
-{
-	struct pci_controller *hose, *tmp;
-	struct pnv_phb *phb;
-	struct pnv_ioda_pe *pe;
-
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		phb = hose->private_data;
-		list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link)
-			pnv_ioda_setup_dma(phb, pe);
-
-		/* Mark the PHB initialization done */
-		phb->initialized = 1;
-	}
-}
-
 static void pnv_pci_ioda_create_dbgfs(void)
 {
 #ifdef CONFIG_DEBUG_FS
@@ -3139,9 +3094,8 @@ static void pnv_pci_ioda_create_dbgfs(void)
 
 static void pnv_pci_ioda_fixup(void)
 {
-	pnv_pci_ioda_setup_PEs();
-	pnv_pci_ioda_setup_seg();
-	pnv_pci_ioda_setup_DMA();
+	struct pci_controller *tmp, *hose;
+	struct pnv_phb *phb;
 
 	pnv_pci_ioda_create_dbgfs();
 
@@ -3149,6 +3103,12 @@ static void pnv_pci_ioda_fixup(void)
 	eeh_init();
 	eeh_addr_cache_build();
 #endif
+
+	/* Notify initialization of PHB done */
+	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
+		phb = hose->private_data;
+		phb->initialized = 1;
+	}
 }
 
 /*
@@ -3192,6 +3152,96 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
 	return phb->ioda.io_segsize;
 }
 
+/*
+ * We are updating root port or the upstream bridge behind the root
+ * port with PHB's various windows in order to accommodate the changes
+ * on required resources during PCI (slot) hotplug, which is connected
+ * to either root port, or the downstream ports of PCIe switch behind
+ * the root port.
+ */
+static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
+					   unsigned long type)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dev *bridge = bus->self;
+	struct resource *r, *w;
+	int i;
+
+	/* Check if we need apply fixup to the bridge's windows */
+	if (!pci_is_root_bus(bridge->bus) &&
+	    !pci_is_root_bus(bridge->bus->self->bus))
+		return;
+
+	/* Fixup the resources */
+	for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
+		r = &bridge->resource[PCI_BRIDGE_RESOURCES + i];
+		if (!r->flags || !r->parent)
+			continue;
+
+		w = NULL;
+		if (r->flags & type & IORESOURCE_IO)
+			w = &hose->io_resource;
+		else if (pnv_pci_is_mem_pref_64(r->flags) &&
+			(type & IORESOURCE_PREFETCH) &&
+			phb->ioda.m64_segsize)
+			w = &hose->mem_resources[1];
+		else if (r->flags & type & IORESOURCE_MEM)
+			w = &hose->mem_resources[0];
+
+		r->start = w->start;
+		r->end = w->end;
+	}
+
+}
+
+static void pnv_pci_setup_bridge(struct pci_bus *bus,
+				 unsigned long type)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dev *bridge = bus->self;
+	struct pnv_ioda_pe *pe;
+
+	/* The root bus (ancestor PE) should be finalized before anyone else */
+	if (!phb->ioda.root_pe_populated) {
+		pe = pnv_ioda_setup_bus_PE(phb->hose->bus, 0);
+		if (pe && phb->ioda.root_pe == IODA_INVALID_PE)
+			phb->ioda.root_pe = pe->pe_number;
+		phb->ioda.root_pe_populated = 1;
+	}
+
+	/* Extend bridge's windows if necessary */
+	pnv_pci_fixup_bridge_resources(bus, type);
+
+	/* Don't assign PE to bus, which doesn't have any subordinate
+	 * PCI devices on it.
+	 */
+	if (list_empty(&bus->devices))
+		return;
+
+	/* Reserve PEs for M64 resource */
+	if (phb->reserve_m64_pe)
+		phb->reserve_m64_pe(phb, bus);
+
+	/* Assign PE. We might run here because of partial hotplug.
+	 * For the case, we just pick up the existing PE and should
+	 * not allocate resources again.
+	 */
+	if (pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE)
+		pe = pnv_ioda_setup_bus_PE(bus, 1);
+	else
+		pe = pnv_ioda_setup_bus_PE(bus, 0);
+	if (!pe)
+		return;
+
+	/* Setup MMIO mapping */
+	pnv_ioda_setup_pe_seg(hose, pe);
+
+	/* Setup DMA */
+	pnv_ioda_setup_dma(phb, pe);
+}
+
 #ifdef CONFIG_PCI_IOV
 static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
@@ -3418,6 +3468,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
 	pnv_pci_controller_ops.enable_device_hook = pnv_pci_enable_device_hook;
 	pnv_pci_controller_ops.window_alignment = pnv_pci_window_alignment;
+	pnv_pci_controller_ops.setup_bridge = pnv_pci_setup_bridge;
 	pnv_pci_controller_ops.reset_secondary_bus = pnv_pci_reset_secondary_bus;
 	hose->controller_ops = pnv_pci_controller_ops;
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index e372b9f..45a6450 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -134,6 +134,7 @@ struct pnv_phb {
 			/* Global bridge info */
 			unsigned int		total_pe;
 			unsigned int		root_pe;
+			unsigned int		root_pe_populated;
 			unsigned int		reserved_pe;
 
 			/* 32-bit MMIO window */
-- 
2.1.0
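
The window selection done by pnv_pci_fixup_bridge_resources() in the
diff above can be sketched in user space as follows. This is a minimal
sketch, not the kernel code: struct res, struct host, the RES_* flag
values, pick_window() and widen_window() are hypothetical stand-ins.
The sketch also leaves a resource alone when no PHB window matches it;
that guard is an assumption of the sketch and is not in the hunk above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; the kernel's IORESOURCE_* constants differ. */
#define RES_IO		0x1UL
#define RES_MEM		0x2UL
#define RES_PREFETCH	0x4UL
#define RES_MEM_64	0x8UL

struct res {
	unsigned long start, end, flags;
};

struct host {
	struct res io;		/* PHB I/O window */
	struct res mem32;	/* PHB 32-bit MMIO window */
	struct res mem64;	/* PHB 64-bit prefetchable window */
	int has_m64;		/* non-zero when M64 segments exist */
};

/* Pick the PHB window matching a bridge resource, or NULL if none does. */
static const struct res *pick_window(const struct host *h,
				     const struct res *r, unsigned long type)
{
	unsigned long pref64 = RES_MEM_64 | RES_PREFETCH;

	if (r->flags & type & RES_IO)
		return &h->io;
	if ((r->flags & pref64) == pref64 &&
	    (type & RES_PREFETCH) && h->has_m64)
		return &h->mem64;
	if (r->flags & type & RES_MEM)
		return &h->mem32;

	return NULL;
}

/* Widen one bridge window to the matching PHB window; 0 if left alone. */
static int widen_window(const struct host *h, struct res *r,
			unsigned long type)
{
	const struct res *w;

	assert(h != NULL && r != NULL);

	if (!r->flags)
		return 0;

	w = pick_window(h, r, type);
	if (!w)
		return 0;	/* guard added in this sketch only */

	r->start = w->start;
	r->end = w->end;
	return 1;
}
```

A caller would invoke widen_window() once per bridge resource, passing
the same type mask that the bridge setup hook received.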


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 17/42] powerpc/powernv: PE oriented during configuration
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (8 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 14/42] powerpc/powernv: Allocate PE# in deasending order Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 19/42] powerpc/powernv: Remove DMA32 list of PEs Gavin Shan
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

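Except for pnv_ioda_configure_pe(), all PE configuration functions are
already PE oriented. To complete that, the patch changes the PE
reservation and allocation helpers (pnv_ioda_reserve_pe(),
pnv_ioda_alloc_pe() and pnv_ioda_pick_m64_pe()) to return the PE
instance, or NULL on failure, instead of the PE number.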
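
The effect on callers can be illustrated with a minimal user-space
sketch. It is not the kernel code: struct pe, struct phb, TOTAL_PE,
INVALID_PE, alloc_pe() and choose_bus_pe() are simplified stand-ins
assumed for illustration. Because every source hands back an instance,
the fallback order (reserved root PE, then the M64 hook, then the
general allocator) becomes a chain of NULL checks.

```c
#include <assert.h>
#include <stddef.h>

#define TOTAL_PE	4	/* hypothetical PE count */
#define INVALID_PE	(-1)	/* stand-in for IODA_INVALID_PE */

struct pe {
	int pe_number;
	int in_use;
};

struct phb {
	int root_pe;		/* reserved root PE#, or INVALID_PE */
	struct pe pe_array[TOTAL_PE];
	/* Optional hook; NULL when M64 does not pin the PE# */
	struct pe *(*pick_m64_pe)(struct phb *phb);
};

/* Hand out the highest free PE as an instance; NULL when exhausted. */
static struct pe *alloc_pe(struct phb *phb)
{
	for (int i = TOTAL_PE - 1; i >= 0; i--) {
		struct pe *pe = &phb->pe_array[i];

		if (pe->in_use)
			continue;

		pe->in_use = 1;
		pe->pe_number = i;
		return pe;
	}

	return NULL;
}

/* Try each source in turn; the first non-NULL instance wins. */
static struct pe *choose_bus_pe(struct phb *phb, int is_root_bus)
{
	struct pe *pe = NULL;

	assert(phb != NULL);

	if (is_root_bus && phb->root_pe != INVALID_PE)
		pe = &phb->pe_array[phb->root_pe];
	if (!pe && phb->pick_m64_pe)
		pe = phb->pick_m64_pe(phb);
	if (!pe)
		pe = alloc_pe(phb);

	return pe;
}
```

With the number-based convention, each of these steps would instead
compare against the invalid-PE sentinel and the caller would index the
PE array itself at the end.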

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 07/21]
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 44 ++++++++++++++++---------------
 arch/powerpc/platforms/powernv/pci.h      |  3 ++-
 2 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index fd2f898..6187f84 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -132,25 +132,26 @@ static inline bool pnv_pci_is_mem_pref_64(unsigned long flags)
 		(IORESOURCE_MEM_64 | IORESOURCE_PREFETCH));
 }
 
-static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
+static struct pnv_ioda_pe *pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
 {
 	if (!(pe_no >= 0 && pe_no < phb->ioda.total_pe)) {
 		pr_warn("%s: Invalid PE %d on PHB#%x\n",
 			__func__, pe_no, phb->hose->global_number);
-		return;
+		return NULL;
 	}
 
 	if (test_and_set_bit(pe_no, phb->ioda.pe_alloc)) {
 		pr_warn("%s: PE %d was assigned on PHB#%x\n",
 			__func__, pe_no, phb->hose->global_number);
-		return;
+		return NULL;
 	}
 
 	phb->ioda.pe_array[pe_no].phb = phb;
 	phb->ioda.pe_array[pe_no].pe_number = pe_no;
+	return &phb->ioda.pe_array[pe_no];
 }
 
-static int pnv_ioda_alloc_pe(struct pnv_phb *phb)
+static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
 {
 	unsigned long pe_no;
 	unsigned long limit = phb->ioda.total_pe - 1;
@@ -163,12 +164,12 @@ static int pnv_ioda_alloc_pe(struct pnv_phb *phb)
 			break;
 
 		if (--limit >= phb->ioda.total_pe)
-			return IODA_INVALID_PE;
+			return NULL;
 	} while (1);
 
 	phb->ioda.pe_array[pe_no].phb = phb;
 	phb->ioda.pe_array[pe_no].pe_number = pe_no;
-	return pe_no;
+	return &phb->ioda.pe_array[pe_no];
 }
 
 static void pnv_ioda_free_pe(struct pnv_phb *phb, int pe)
@@ -389,8 +390,8 @@ static void pnv_ioda_reserve_m64_pe(struct pnv_phb *phb,
 	}
 }
 
-static int pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
-				struct pci_bus *bus, int all)
+static struct pnv_ioda_pe *pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
+						struct pci_bus *bus, int all)
 {
 	resource_size_t segsz = phb->ioda.m64_segsize;
 	struct pci_dev *pdev;
@@ -401,13 +402,13 @@ static int pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
 	int i;
 
 	if (!pnv_ioda_need_m64_pe(phb, bus))
-		return IODA_INVALID_PE;
+		return NULL;
 
 	/* Allocate bitmap */
 	size = _ALIGN_UP(phb->ioda.total_pe / 8, sizeof(unsigned long));
 	pe_bitmap = kzalloc(size, GFP_KERNEL);
 	if (!pe_bitmap)
-		return IODA_INVALID_PE;
+		return NULL;
 
 	/* The bridge's M64 window might be extended to PHB's M64
 	 * window by intention to support PCI hotplug. So we have
@@ -444,7 +445,7 @@ static int pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
 	/* No M64 window found ? */
 	if (bitmap_empty(pe_bitmap, phb->ioda.total_pe)) {
 		kfree(pe_bitmap);
-		return IODA_INVALID_PE;
+		return NULL;
 	}
 
 	/* Figure out the master PE and put all slave PEs
@@ -495,7 +496,7 @@ static int pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
 	}
 
 	kfree(pe_bitmap);
-	return master_pe->pe_number;
+	return master_pe;
 }
 
 static void __init pnv_ioda_parse_m64_window(struct pnv_phb *phb)
@@ -1224,7 +1225,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
 	struct pnv_phb *phb = hose->private_data;
-	struct pnv_ioda_pe *pe;
+	struct pnv_ioda_pe *pe = NULL;
 	int pe_num = IODA_INVALID_PE;
 
 	/* For partial hotplug case, the PE instance hasn't been destroyed
@@ -1240,24 +1241,25 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 	}
 
 	/* PE number for root bus should have been reserved */
-	if (pci_is_root_bus(bus))
-		pe_num = phb->ioda.root_pe;
+	if (pci_is_root_bus(bus) &&
+	    phb->ioda.root_pe != IODA_INVALID_PE)
+		pe = &phb->ioda.pe_array[phb->ioda.root_pe];
 
 	/* Check if PE is determined by M64 */
-	if (pe_num == IODA_INVALID_PE && phb->pick_m64_pe)
-		pe_num = phb->pick_m64_pe(phb, bus, all);
+	if (!pe && phb->pick_m64_pe)
+		pe = phb->pick_m64_pe(phb, bus, all);
 
 	/* The PE number isn't pinned by M64 */
-	if (pe_num == IODA_INVALID_PE)
-		pe_num = pnv_ioda_alloc_pe(phb);
+	if (!pe)
+		pe = pnv_ioda_alloc_pe(phb);
 
-	if (pe_num == IODA_INVALID_PE) {
+	if (!pe) {
 		pr_warning("%s: Not enough PE# available for PCI bus %04x:%02x\n",
 			__func__, pci_domain_nr(bus), bus->number);
 		return NULL;
 	}
 
-	pe = &phb->ioda.pe_array[pe_num];
+	pe_num = pe->pe_number;
 	pe->flags |= (all ? PNV_IODA_PE_BUS_ALL : PNV_IODA_PE_BUS);
 	pe->pbus = bus;
 	pe->pdev = NULL;
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 45a6450..64c7f03 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -119,7 +119,8 @@ struct pnv_phb {
 	void (*shutdown)(struct pnv_phb *phb);
 	int (*init_m64)(struct pnv_phb *phb);
 	void (*reserve_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus);
-	int (*pick_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus, int all);
+	struct pnv_ioda_pe * (*pick_m64_pe)(struct pnv_phb *phb,
+					    struct pci_bus *bus, int all);
 	int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
 	void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
 	int (*unfreeze_pe)(struct pnv_phb *phb, int pe_no, int opt);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 18/42] powerpc/powernv: Helper function pnv_ioda_init_pe()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch introduces the helper function pnv_ioda_init_pe(), which
initializes the PE instance after a PE# has been reserved or
allocated, to simplify the code. The patch doesn't introduce
behavioural changes.
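
The refactoring pattern can be sketched in user space as follows. This
is a minimal sketch under stated assumptions, not the kernel code:
struct pe, struct phb, struct list_node (standing in for the kernel's
struct list_head), TOTAL_PE, init_pe(), reserve_pe() and alloc_pe() are
hypothetical names, and a bool array replaces the pe_alloc bitmap. Both
ways of claiming a PE# end in the same initialization helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOTAL_PE	8	/* hypothetical PE count */

struct list_node {		/* stand-in for struct list_head */
	struct list_node *next, *prev;
};

struct phb;

struct pe {
	struct phb *phb;	/* owning PHB */
	int pe_number;		/* index into phb->pe_array */
	struct list_node list;
};

struct phb {
	bool pe_alloc[TOTAL_PE];	/* stand-in for the bitmap */
	struct pe pe_array[TOTAL_PE];
};

/* Common initialization, run once a PE# has been claimed. */
static struct pe *init_pe(struct phb *phb, int pe_no)
{
	struct pe *pe;

	assert(pe_no >= 0 && pe_no < TOTAL_PE);

	pe = &phb->pe_array[pe_no];
	pe->phb = phb;
	pe->pe_number = pe_no;
	pe->list.next = &pe->list;	/* an empty list points at itself */
	pe->list.prev = &pe->list;

	return pe;
}

/* Claim a specific PE#; NULL if out of range or already taken. */
static struct pe *reserve_pe(struct phb *phb, int pe_no)
{
	if (pe_no < 0 || pe_no >= TOTAL_PE || phb->pe_alloc[pe_no])
		return NULL;

	phb->pe_alloc[pe_no] = true;
	return init_pe(phb, pe_no);
}

/* Claim any free PE#, highest first; NULL if none is left. */
static struct pe *alloc_pe(struct phb *phb)
{
	for (int pe_no = TOTAL_PE - 1; pe_no >= 0; pe_no--) {
		if (!phb->pe_alloc[pe_no]) {
			phb->pe_alloc[pe_no] = true;
			return init_pe(phb, pe_no);
		}
	}

	return NULL;
}
```

Keeping the field setup in one place means a later field only has to be
initialized in the helper, not in every function that claims a PE#.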

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 07/21]
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 6187f84..f0b54ab 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -132,6 +132,17 @@ static inline bool pnv_pci_is_mem_pref_64(unsigned long flags)
 		(IORESOURCE_MEM_64 | IORESOURCE_PREFETCH));
 }
 
+static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int pe_no)
+{
+	struct pnv_ioda_pe *pe = &phb->ioda.pe_array[pe_no];
+
+	pe->phb = phb;
+	pe->pe_number = pe_no;
+	INIT_LIST_HEAD(&pe->list);
+
+	return pe;
+}
+
 static struct pnv_ioda_pe *pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
 {
 	if (!(pe_no >= 0 && pe_no < phb->ioda.total_pe)) {
@@ -146,9 +157,7 @@ static struct pnv_ioda_pe *pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
 		return NULL;
 	}
 
-	phb->ioda.pe_array[pe_no].phb = phb;
-	phb->ioda.pe_array[pe_no].pe_number = pe_no;
-	return &phb->ioda.pe_array[pe_no];
+	return pnv_ioda_init_pe(phb, pe_no);
 }
 
 static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
@@ -167,9 +176,7 @@ static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
 			return NULL;
 	} while (1);
 
-	phb->ioda.pe_array[pe_no].phb = phb;
-	phb->ioda.pe_array[pe_no].pe_number = pe_no;
-	return &phb->ioda.pe_array[pe_no];
+	return pnv_ioda_init_pe(phb, pe_no);
 }
 
 static void pnv_ioda_free_pe(struct pnv_phb *phb, int pe)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 19/42] powerpc/powernv: Remove DMA32 list of PEs
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (9 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 17/42] powerpc/powernv: PE oriented during configuration Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 20/42] powerpc/powernv: Rename pnv_ioda_get_pe() to pnv_ioda_dev_to_pe() Gavin Shan
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

PEs were put into a list, maintained by the PHB, ordered by their DMA32
weight. The list was then iterated to initialize each PE's DMA
capability. Now that a PE's DMA capability is initialized right away
when the PE is created, the list is no longer needed, so the patch
removes it.
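
For reference, the ordering that the removed
pnv_ioda_link_pe_by_weight() maintained can be sketched in user space
as follows. This is only an illustration: struct pe here is a
hypothetical stand-in, and a singly linked list replaces the kernel's
struct list_head. A new PE is inserted in front of the first entry with
a smaller weight, so the list stays in descending weight order and
entries of equal weight keep their insertion order.

```c
#include <assert.h>
#include <stddef.h>

struct pe {
	int pe_number;
	unsigned int dma32_weight;
	struct pe *next;	/* stand-in for the dma_link list node */
};

/* Insert pe before the first entry whose weight is smaller than its own. */
static void link_pe_by_weight(struct pe **head, struct pe *pe)
{
	struct pe **pp = head;

	assert(head != NULL && pe != NULL);

	while (*pp && (*pp)->dma32_weight >= pe->dma32_weight)
		pp = &(*pp)->next;

	pe->next = *pp;
	*pp = pe;
}
```

Once DMA is set up at PE creation time, nothing walks this list any
more, which is why the list head and the per-PE link can go.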

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Newly introduced
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 18 ------------------
 arch/powerpc/platforms/powernv/pci.h      |  6 ------
 2 files changed, 24 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index f0b54ab..0447534 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -992,20 +992,6 @@ out:
 	return 0;
 }
 
-static void pnv_ioda_link_pe_by_weight(struct pnv_phb *phb,
-				       struct pnv_ioda_pe *pe)
-{
-	struct pnv_ioda_pe *lpe;
-
-	list_for_each_entry(lpe, &phb->ioda.pe_dma_list, dma_link) {
-		if (lpe->dma32_weight < pe->dma32_weight) {
-			list_add_tail(&pe->dma_link, &lpe->dma_link);
-			return;
-		}
-	}
-	list_add_tail(&pe->dma_link, &phb->ioda.pe_dma_list);
-}
-
 static unsigned int pnv_ioda_dev_dma_weight(struct pci_dev *dev)
 {
 	struct pci_controller *hose = pci_bus_to_host(dev->bus);
@@ -1296,9 +1282,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 	/* Put PE to the list */
 	list_add_tail(&pe->list, &phb->ioda.pe_list);
 
-	/* Link the PE */
-	pnv_ioda_link_pe_by_weight(phb, pe);
-
 	return pe;
 }
 
@@ -3421,7 +3404,6 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 		phb->ioda.root_pe = IODA_INVALID_PE;
 	}
 
-	INIT_LIST_HEAD(&phb->ioda.pe_dma_list);
 	INIT_LIST_HEAD(&phb->ioda.pe_list);
 	mutex_init(&phb->ioda.pe_list_mutex);
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 64c7f03..bf63481 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -79,7 +79,6 @@ struct pnv_ioda_pe {
 	struct list_head	slaves;
 
 	/* Link in list of PE#s */
-	struct list_head	dma_link;
 	struct list_head	list;
 };
 
@@ -186,11 +185,6 @@ struct pnv_phb {
 			/* Number of 32-bit DMA segments */
 			unsigned long		dma32_segcount;
 
-			/* Sorted list of used PE's, sorted at
-			 * boot for resource allocation purposes
-			 */
-			struct list_head	pe_dma_list;
-
 			/* TCE cache invalidate registers (physical and
 			 * remapped)
 			 */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 20/42] powerpc/powernv: Rename pnv_ioda_get_pe() to pnv_ioda_dev_to_pe()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (10 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 19/42] powerpc/powernv: Remove DMA32 list of PEs Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 21/42] powerpc/powernv: Drop pnv_ioda_setup_dev_PE() Gavin Shan
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The name pnv_ioda_get_pe() suggests that it increases the reference
count of the given PE instance. However, it only looks up the PE
instance that contains the given PCI device. The patch renames it
to pnv_ioda_dev_to_pe() to reflect its purpose.
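
The naming distinction can be shown with a small user-space sketch. All
names in it (struct pe, struct dev, struct phb, dev_to_pe(), pe_get(),
TOTAL_PE, INVALID_PE) are hypothetical stand-ins rather than the kernel
definitions, and the refcount field exists only to make the contrast
visible. A "to" style helper is a pure lookup, while a "get" style
helper conventionally takes a reference.

```c
#include <assert.h>
#include <stddef.h>

#define TOTAL_PE	4	/* hypothetical PE count */
#define INVALID_PE	(-1)	/* stand-in for IODA_INVALID_PE */

struct pe {
	int pe_number;
	int refcount;
};

struct dev {
	int pe_number;	/* PE# recorded for the device, or INVALID_PE */
};

struct phb {
	struct pe pe_array[TOTAL_PE];
};

/* Pure lookup: map a device to its PE; no reference is taken. */
static struct pe *dev_to_pe(struct phb *phb, const struct dev *dev)
{
	if (!dev || dev->pe_number == INVALID_PE)
		return NULL;

	assert(dev->pe_number >= 0 && dev->pe_number < TOTAL_PE);
	return &phb->pe_array[dev->pe_number];
}

/* What a "get" name conventionally implies: take a reference. */
static struct pe *pe_get(struct pe *pe)
{
	if (pe)
		pe->refcount++;

	return pe;
}
```

A reader who sees dev_to_pe() does not expect to owe a matching "put"
call, which is the point of the rename.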

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 07/21]
  * Fixed "do not use assignment in if condition" from checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 0447534..e9165fa 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -702,7 +702,7 @@ static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
  * but in the meantime, we need to protect them to avoid warnings
  */
 #ifdef CONFIG_PCI_MSI
-static struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev)
+static struct pnv_ioda_pe *pnv_ioda_dev_to_pe(struct pci_dev *dev)
 {
 	struct pci_controller *hose = pci_bus_to_host(dev->bus);
 	struct pnv_phb *phb = hose->private_data;
@@ -2671,7 +2671,7 @@ int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode)
 	struct pnv_ioda_pe *pe;
 	int rc;
 
-	pe = pnv_ioda_get_pe(dev);
+	pe = pnv_ioda_dev_to_pe(dev);
 	if (!pe)
 		return -ENODEV;
 
@@ -2787,7 +2787,8 @@ int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 	struct pnv_ioda_pe *pe;
 	int rc;
 
-	if (!(pe = pnv_ioda_get_pe(dev)))
+	pe = pnv_ioda_dev_to_pe(dev);
+	if (!pe)
 		return -ENODEV;
 
 	/* Assign XIVE to PE */
@@ -2809,7 +2810,7 @@ static int pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
 				  unsigned int hwirq, unsigned int virq,
 				  unsigned int is_64, struct msi_msg *msg)
 {
-	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(dev);
+	struct pnv_ioda_pe *pe = pnv_ioda_dev_to_pe(dev);
 	unsigned int xive_num = hwirq - phb->msi_base;
 	__be32 data;
 	int rc;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 21/42] powerpc/powernv: Drop pnv_ioda_setup_dev_PE()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (11 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 20/42] powerpc/powernv: Rename pnv_ioda_get_pe() to pnv_ioda_dev_to_pe() Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
       [not found] ` <1433400131-18429-1-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

Nobody is using this function, so the patch drops it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
v5:
  * Derived from PATCH[v4 08/21]
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 71 -------------------------------
 1 file changed, 71 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e9165fa..8a79403 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1111,77 +1111,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 }
 #endif /* CONFIG_PCI_IOV */
 
-#if 0
-static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
-{
-	struct pci_controller *hose = pci_bus_to_host(dev->bus);
-	struct pnv_phb *phb = hose->private_data;
-	struct pci_dn *pdn = pci_get_pdn(dev);
-	struct pnv_ioda_pe *pe;
-	int pe_num;
-
-	if (!pdn) {
-		pr_err("%s: Device tree node not associated properly\n",
-			   pci_name(dev));
-		return NULL;
-	}
-	if (pdn->pe_number != IODA_INVALID_PE)
-		return NULL;
-
-	/* PE#0 has been pre-set */
-	if (dev->bus->number == 0)
-		pe_num = 0;
-	else
-		pe_num = pnv_ioda_alloc_pe(phb);
-	if (pe_num == IODA_INVALID_PE) {
-		pr_warning("%s: Not enough PE# available, disabling device\n",
-			   pci_name(dev));
-		return NULL;
-	}
-
-	/* NOTE: We get only one ref to the pci_dev for the pdn, not for the
-	 * pointer in the PE data structure, both should be destroyed at the
-	 * same time. However, this needs to be looked at more closely again
-	 * once we actually start removing things (Hotplug, SR-IOV, ...)
-	 *
-	 * At some point we want to remove the PDN completely anyways
-	 */
-	pe = &phb->ioda.pe_array[pe_num];
-	pci_dev_get(dev);
-	pdn->pcidev = dev;
-	pdn->pe_number = pe_num;
-	pe->pdev = dev;
-	pe->pbus = NULL;
-	pe->tce32_seg = -1;
-	pe->mve_number = -1;
-	pe->rid = dev->bus->number << 8 | pdn->devfn;
-
-	pe_info(pe, "Associated device to PE\n");
-
-	if (pnv_ioda_configure_pe(phb, pe)) {
-		/* XXX What do we do here ? */
-		if (pe_num)
-			pnv_ioda_free_pe(phb, pe_num);
-		pdn->pe_number = IODA_INVALID_PE;
-		pe->pdev = NULL;
-		pci_dev_put(dev);
-		return NULL;
-	}
-
-	/* Assign a DMA weight to the device */
-	pe->dma_weight = pnv_ioda_dma_weight(dev);
-	if (pe->dma_weight != 0) {
-		phb->ioda.dma_weight += pe->dma_weight;
-		phb->ioda.dma_pe_count++;
-	}
-
-	/* Link the PE */
-	pnv_ioda_link_pe_by_weight(phb, pe);
-
-	return pe;
-}
-#endif /* Useful for SRIOV case */
-
 static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 {
 	struct pci_dev *dev;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 22/42] powerpc/powernv: Move functions around
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch moves the functions related to releasing a PE so that
subsequent patches don't need extra declarations for them.
It doesn't introduce any behavioural changes.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 07/21]
  * Fixed coding style complained by checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 735 +++++++++++++++---------------
 1 file changed, 369 insertions(+), 366 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8a79403..3d5aec8d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -132,6 +132,285 @@ static inline bool pnv_pci_is_mem_pref_64(unsigned long flags)
 		(IORESOURCE_MEM_64 | IORESOURCE_PREFETCH));
 }
 
+static inline void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_ioda_pe *pe)
+{
+	/* 01xb - invalidate TCEs that match the specified PE# */
+	unsigned long val = (0x4ull << 60) | (pe->pe_number & 0xFF);
+	struct pnv_phb *phb = pe->phb;
+
+	if (!phb->ioda.tce_inval_reg)
+		return;
+
+	mb(); /* Ensure above stores are visible */
+	__raw_writeq(cpu_to_be64(val), phb->ioda.tce_inval_reg);
+}
+
+static long pnv_pci_ioda2_unset_window(struct iommu_table_group *table_group,
+		int num)
+{
+	struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
+			table_group);
+	struct pnv_phb *phb = pe->phb;
+	long ret;
+
+	pe_info(pe, "Removing DMA window #%d\n", num);
+
+	ret = opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
+			(pe->pe_number << 1) + num,
+			0/* levels */, 0/* table address */,
+			0/* table size */, 0/* page size */);
+	if (ret)
+		pe_warn(pe, "Unmapping failed, ret = %ld\n", ret);
+	else
+		pnv_pci_ioda2_tce_invalidate_entire(pe);
+
+	pnv_pci_unlink_table_and_group(table_group->tables[num], table_group);
+
+	return ret;
+}
+
+static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable)
+{
+	uint16_t window_id = (pe->pe_number << 1) + 1;
+	int64_t rc;
+
+	pe_info(pe, "%sabling 64-bit DMA bypass\n", enable ? "En" : "Dis");
+	if (enable) {
+		phys_addr_t top = memblock_end_of_DRAM();
+
+		top = roundup_pow_of_two(top);
+		rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id,
+						     pe->pe_number,
+						     window_id,
+						     pe->tce_bypass_base,
+						     top);
+	} else {
+		rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id,
+						     pe->pe_number,
+						     window_id,
+						     pe->tce_bypass_base,
+						     0);
+	}
+	if (rc)
+		pe_err(pe, "OPAL error %lld configuring bypass window\n", rc);
+	else
+		pe->tce_bypass_enabled = enable;
+}
+
+static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev,
+					 struct pnv_ioda_pe *pe)
+{
+	struct iommu_table    *tbl;
+	int64_t               rc;
+
+	tbl = pe->table_group.tables[0];
+	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
+	if (rc)
+		pe_warn(pe, "OPAL error %ld release DMA window\n", rc);
+
+	pnv_pci_ioda2_set_bypass(pe, false);
+	if (pe->table_group.group) {
+		iommu_group_put(pe->table_group.group);
+		BUG_ON(pe->table_group.group);
+	}
+	pnv_pci_ioda2_table_free_pages(tbl);
+	iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
+}
+
+static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
+				  struct pnv_ioda_pe *parent,
+				  struct pnv_ioda_pe *child,
+				  bool is_add)
+{
+	const char *desc = is_add ? "adding" : "removing";
+	uint8_t op = is_add ? OPAL_ADD_PE_TO_DOMAIN :
+			      OPAL_REMOVE_PE_FROM_DOMAIN;
+	struct pnv_ioda_pe *slave;
+	long rc;
+
+	/* Parent PE affects child PE */
+	rc = opal_pci_set_peltv(phb->opal_id, parent->pe_number,
+				child->pe_number, op);
+	if (rc != OPAL_SUCCESS) {
+		pe_warn(child, "OPAL error %ld %s to parent PELTV\n",
+			rc, desc);
+		return -ENXIO;
+	}
+
+	if (!(child->flags & PNV_IODA_PE_MASTER))
+		return 0;
+
+	/* Compound case: parent PE affects slave PEs */
+	list_for_each_entry(slave, &child->slaves, list) {
+		rc = opal_pci_set_peltv(phb->opal_id, parent->pe_number,
+					slave->pe_number, op);
+		if (rc != OPAL_SUCCESS) {
+			pe_warn(slave, "OPAL error %ld %s to parent PELTV\n",
+				rc, desc);
+			return -ENXIO;
+		}
+	}
+
+	return 0;
+}
+
+static int pnv_ioda_set_peltv(struct pnv_phb *phb,
+			      struct pnv_ioda_pe *pe,
+			      bool is_add)
+{
+	struct pnv_ioda_pe *slave;
+	struct pci_dev *pdev = NULL;
+	int ret;
+
+	/*
+	 * Clear PE frozen state. If it's master PE, we need
+	 * clear slave PE frozen state as well.
+	 */
+	if (is_add) {
+		opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number,
+					  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+		if (pe->flags & PNV_IODA_PE_MASTER) {
+			list_for_each_entry(slave, &pe->slaves, list)
+				opal_pci_eeh_freeze_clear(phb->opal_id,
+					slave->pe_number,
+					OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+		}
+	}
+
+	/*
+	 * Associate PE in PELT. We need add the PE into the
+	 * corresponding PELT-V as well. Otherwise, the error
+	 * originated from the PE might contribute to other
+	 * PEs.
+	 */
+	ret = pnv_ioda_set_one_peltv(phb, pe, pe, is_add);
+	if (ret)
+		return ret;
+
+	/* For compound PEs, any one affects all of them */
+	if (pe->flags & PNV_IODA_PE_MASTER) {
+		list_for_each_entry(slave, &pe->slaves, list) {
+			ret = pnv_ioda_set_one_peltv(phb, slave, pe, is_add);
+			if (ret)
+				return ret;
+		}
+	}
+
+	if (pe->flags & (PNV_IODA_PE_BUS_ALL | PNV_IODA_PE_BUS))
+		pdev = pe->pbus->self;
+	else if (pe->flags & PNV_IODA_PE_DEV)
+		pdev = pe->pdev->bus->self;
+#ifdef CONFIG_PCI_IOV
+	else if (pe->flags & PNV_IODA_PE_VF)
+		pdev = pe->parent_dev->bus->self;
+#endif /* CONFIG_PCI_IOV */
+	while (pdev) {
+		struct pci_dn *pdn = pci_get_pdn(pdev);
+		struct pnv_ioda_pe *parent;
+
+		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
+			parent = &phb->ioda.pe_array[pdn->pe_number];
+			ret = pnv_ioda_set_one_peltv(phb, parent, pe, is_add);
+			if (ret)
+				return ret;
+		}
+
+		pdev = pdev->bus->self;
+	}
+
+	return 0;
+}
+
+#ifdef CONFIG_PCI_IOV
+static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
+{
+	struct pci_dev *parent;
+	uint8_t bcomp, dcomp, fcomp;
+	int64_t rc;
+	long rid_end, rid;
+
+	/* Currently, we just deconfigure VF PE. Bus PE will always there.*/
+	if (pe->pbus) {
+		int count;
+
+		dcomp = OPAL_IGNORE_RID_DEVICE_NUMBER;
+		fcomp = OPAL_IGNORE_RID_FUNCTION_NUMBER;
+		parent = pe->pbus->self;
+		if (pe->flags & PNV_IODA_PE_BUS_ALL)
+			count = pe->pbus->busn_res.end -
+				pe->pbus->busn_res.start + 1;
+		else
+			count = 1;
+
+		switch (count) {
+		case  1:
+			bcomp = OpalPciBusAll;   break;
+		case  2:
+			bcomp = OpalPciBus7Bits; break;
+		case  4:
+			bcomp = OpalPciBus6Bits; break;
+		case  8:
+			bcomp = OpalPciBus5Bits; break;
+		case 16:
+			bcomp = OpalPciBus4Bits; break;
+		case 32:
+			bcomp = OpalPciBus3Bits; break;
+		default:
+			dev_err(&pe->pbus->dev, "Number of subordinate buses %d unsupported\n",
+				count);
+			/* Do an exact match only */
+			bcomp = OpalPciBusAll;
+		}
+		rid_end = pe->rid + (count << 8);
+	} else {
+		if (pe->flags & PNV_IODA_PE_VF)
+			parent = pe->parent_dev;
+		else
+			parent = pe->pdev->bus->self;
+		bcomp = OpalPciBusAll;
+		dcomp = OPAL_COMPARE_RID_DEVICE_NUMBER;
+		fcomp = OPAL_COMPARE_RID_FUNCTION_NUMBER;
+		rid_end = pe->rid + 1;
+	}
+
+	/* Clear the reverse map */
+	for (rid = pe->rid; rid < rid_end; rid++)
+		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
+
+	/* Release from all parents PELT-V */
+	while (parent) {
+		struct pci_dn *pdn = pci_get_pdn(parent);
+
+		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
+			rc = opal_pci_set_peltv(phb->opal_id,
+					pdn->pe_number, pe->pe_number,
+					OPAL_REMOVE_PE_FROM_DOMAIN);
+			/* XXX What to do in case of error ? */
+		}
+		parent = parent->bus->self;
+	}
+
+	opal_pci_eeh_freeze_set(phb->opal_id, pe->pe_number,
+				  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+
+	/* Disassociate PE in PELT */
+	rc = opal_pci_set_peltv(phb->opal_id, pe->pe_number,
+				pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN);
+	if (rc)
+		pe_warn(pe, "OPAL error %ld remove self from PELTV\n", rc);
+	rc = opal_pci_set_pe(phb->opal_id, pe->pe_number, pe->rid,
+			     bcomp, dcomp, fcomp, OPAL_UNMAP_PE);
+	if (rc)
+		pe_err(pe, "OPAL error %ld trying to setup PELT table\n", rc);
+
+	pe->pbus = NULL;
+	pe->pdev = NULL;
+	pe->parent_dev = NULL;
+
+	return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int pe_no)
 {
 	struct pnv_ioda_pe *pe = &phb->ioda.pe_array[pe_no];
@@ -599,307 +878,119 @@ static void pnv_ioda_freeze_pe(struct pnv_phb *phb, int pe_no)
 static int pnv_ioda_unfreeze_pe(struct pnv_phb *phb, int pe_no, int opt)
 {
 	struct pnv_ioda_pe *pe, *slave;
-	s64 rc;
-
-	/* Find master PE */
-	pe = &phb->ioda.pe_array[pe_no];
-	if (pe->flags & PNV_IODA_PE_SLAVE) {
-		pe = pe->master;
-		WARN_ON(!pe || !(pe->flags & PNV_IODA_PE_MASTER));
-		pe_no = pe->pe_number;
-	}
-
-	/* Clear frozen state for master PE */
-	rc = opal_pci_eeh_freeze_clear(phb->opal_id, pe_no, opt);
-	if (rc != OPAL_SUCCESS) {
-		pr_warn("%s: Failure %lld clear %d on PHB#%x-PE#%x\n",
-			__func__, rc, opt, phb->hose->global_number, pe_no);
-		return -EIO;
-	}
-
-	if (!(pe->flags & PNV_IODA_PE_MASTER))
-		return 0;
-
-	/* Clear frozen state for slave PEs */
-	list_for_each_entry(slave, &pe->slaves, list) {
-		rc = opal_pci_eeh_freeze_clear(phb->opal_id,
-					     slave->pe_number,
-					     opt);
-		if (rc != OPAL_SUCCESS) {
-			pr_warn("%s: Failure %lld clear %d on PHB#%x-PE#%x\n",
-				__func__, rc, opt, phb->hose->global_number,
-				slave->pe_number);
-			return -EIO;
-		}
-	}
-
-	return 0;
-}
-
-static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
-{
-	struct pnv_ioda_pe *slave, *pe;
-	u8 fstate, state;
-	__be16 pcierr;
-	s64 rc;
-
-	/* Sanity check on PE number */
-	if (pe_no < 0 || pe_no >= phb->ioda.total_pe)
-		return OPAL_EEH_STOPPED_PERM_UNAVAIL;
-
-	/*
-	 * Fetch the master PE and the PE instance might be
-	 * not initialized yet.
-	 */
-	pe = &phb->ioda.pe_array[pe_no];
-	if (pe->flags & PNV_IODA_PE_SLAVE) {
-		pe = pe->master;
-		WARN_ON(!pe || !(pe->flags & PNV_IODA_PE_MASTER));
-		pe_no = pe->pe_number;
-	}
-
-	/* Check the master PE */
-	rc = opal_pci_eeh_freeze_status(phb->opal_id, pe_no,
-					&state, &pcierr, NULL);
-	if (rc != OPAL_SUCCESS) {
-		pr_warn("%s: Failure %lld getting "
-			"PHB#%x-PE#%x state\n",
-			__func__, rc,
-			phb->hose->global_number, pe_no);
-		return OPAL_EEH_STOPPED_TEMP_UNAVAIL;
-	}
-
-	/* Check the slave PE */
-	if (!(pe->flags & PNV_IODA_PE_MASTER))
-		return state;
-
-	list_for_each_entry(slave, &pe->slaves, list) {
-		rc = opal_pci_eeh_freeze_status(phb->opal_id,
-						slave->pe_number,
-						&fstate,
-						&pcierr,
-						NULL);
-		if (rc != OPAL_SUCCESS) {
-			pr_warn("%s: Failure %lld getting "
-				"PHB#%x-PE#%x state\n",
-				__func__, rc,
-				phb->hose->global_number, slave->pe_number);
-			return OPAL_EEH_STOPPED_TEMP_UNAVAIL;
-		}
-
-		/*
-		 * Override the result based on the ascending
-		 * priority.
-		 */
-		if (fstate > state)
-			state = fstate;
-	}
-
-	return state;
-}
-
-/* Currently those 2 are only used when MSIs are enabled, this will change
- * but in the meantime, we need to protect them to avoid warnings
- */
-#ifdef CONFIG_PCI_MSI
-static struct pnv_ioda_pe *pnv_ioda_dev_to_pe(struct pci_dev *dev)
-{
-	struct pci_controller *hose = pci_bus_to_host(dev->bus);
-	struct pnv_phb *phb = hose->private_data;
-	struct pci_dn *pdn = pci_get_pdn(dev);
-
-	if (!pdn)
-		return NULL;
-	if (pdn->pe_number == IODA_INVALID_PE)
-		return NULL;
-	return &phb->ioda.pe_array[pdn->pe_number];
-}
-#endif /* CONFIG_PCI_MSI */
-
-static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
-				  struct pnv_ioda_pe *parent,
-				  struct pnv_ioda_pe *child,
-				  bool is_add)
-{
-	const char *desc = is_add ? "adding" : "removing";
-	uint8_t op = is_add ? OPAL_ADD_PE_TO_DOMAIN :
-			      OPAL_REMOVE_PE_FROM_DOMAIN;
-	struct pnv_ioda_pe *slave;
-	long rc;
-
-	/* Parent PE affects child PE */
-	rc = opal_pci_set_peltv(phb->opal_id, parent->pe_number,
-				child->pe_number, op);
-	if (rc != OPAL_SUCCESS) {
-		pe_warn(child, "OPAL error %ld %s to parent PELTV\n",
-			rc, desc);
-		return -ENXIO;
-	}
-
-	if (!(child->flags & PNV_IODA_PE_MASTER))
-		return 0;
-
-	/* Compound case: parent PE affects slave PEs */
-	list_for_each_entry(slave, &child->slaves, list) {
-		rc = opal_pci_set_peltv(phb->opal_id, parent->pe_number,
-					slave->pe_number, op);
-		if (rc != OPAL_SUCCESS) {
-			pe_warn(slave, "OPAL error %ld %s to parent PELTV\n",
-				rc, desc);
-			return -ENXIO;
-		}
-	}
-
-	return 0;
-}
-
-static int pnv_ioda_set_peltv(struct pnv_phb *phb,
-			      struct pnv_ioda_pe *pe,
-			      bool is_add)
-{
-	struct pnv_ioda_pe *slave;
-	struct pci_dev *pdev = NULL;
-	int ret;
-
-	/*
-	 * Clear PE frozen state. If it's master PE, we need
-	 * clear slave PE frozen state as well.
-	 */
-	if (is_add) {
-		opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number,
-					  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
-		if (pe->flags & PNV_IODA_PE_MASTER) {
-			list_for_each_entry(slave, &pe->slaves, list)
-				opal_pci_eeh_freeze_clear(phb->opal_id,
-							  slave->pe_number,
-							  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
-		}
-	}
-
-	/*
-	 * Associate PE in PELT. We need add the PE into the
-	 * corresponding PELT-V as well. Otherwise, the error
-	 * originated from the PE might contribute to other
-	 * PEs.
-	 */
-	ret = pnv_ioda_set_one_peltv(phb, pe, pe, is_add);
-	if (ret)
-		return ret;
+	s64 rc;
 
-	/* For compound PEs, any one affects all of them */
-	if (pe->flags & PNV_IODA_PE_MASTER) {
-		list_for_each_entry(slave, &pe->slaves, list) {
-			ret = pnv_ioda_set_one_peltv(phb, slave, pe, is_add);
-			if (ret)
-				return ret;
-		}
+	/* Find master PE */
+	pe = &phb->ioda.pe_array[pe_no];
+	if (pe->flags & PNV_IODA_PE_SLAVE) {
+		pe = pe->master;
+		WARN_ON(!pe || !(pe->flags & PNV_IODA_PE_MASTER));
+		pe_no = pe->pe_number;
 	}
 
-	if (pe->flags & (PNV_IODA_PE_BUS_ALL | PNV_IODA_PE_BUS))
-		pdev = pe->pbus->self;
-	else if (pe->flags & PNV_IODA_PE_DEV)
-		pdev = pe->pdev->bus->self;
-#ifdef CONFIG_PCI_IOV
-	else if (pe->flags & PNV_IODA_PE_VF)
-		pdev = pe->parent_dev->bus->self;
-#endif /* CONFIG_PCI_IOV */
-	while (pdev) {
-		struct pci_dn *pdn = pci_get_pdn(pdev);
-		struct pnv_ioda_pe *parent;
+	/* Clear frozen state for master PE */
+	rc = opal_pci_eeh_freeze_clear(phb->opal_id, pe_no, opt);
+	if (rc != OPAL_SUCCESS) {
+		pr_warn("%s: Failure %lld clear %d on PHB#%x-PE#%x\n",
+			__func__, rc, opt, phb->hose->global_number, pe_no);
+		return -EIO;
+	}
 
-		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
-			parent = &phb->ioda.pe_array[pdn->pe_number];
-			ret = pnv_ioda_set_one_peltv(phb, parent, pe, is_add);
-			if (ret)
-				return ret;
-		}
+	if (!(pe->flags & PNV_IODA_PE_MASTER))
+		return 0;
 
-		pdev = pdev->bus->self;
+	/* Clear frozen state for slave PEs */
+	list_for_each_entry(slave, &pe->slaves, list) {
+		rc = opal_pci_eeh_freeze_clear(phb->opal_id,
+					     slave->pe_number,
+					     opt);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("%s: Failure %lld clear %d on PHB#%x-PE#%x\n",
+				__func__, rc, opt, phb->hose->global_number,
+				slave->pe_number);
+			return -EIO;
+		}
 	}
 
 	return 0;
 }
 
-#ifdef CONFIG_PCI_IOV
-static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
+static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
 {
-	struct pci_dev *parent;
-	uint8_t bcomp, dcomp, fcomp;
-	int64_t rc;
-	long rid_end, rid;
+	struct pnv_ioda_pe *slave, *pe;
+	u8 fstate, state;
+	__be16 pcierr;
+	s64 rc;
 
-	/* Currently, we just deconfigure VF PE. Bus PE will always there.*/
-	if (pe->pbus) {
-		int count;
+	/* Sanity check on PE number */
+	if (pe_no < 0 || pe_no >= phb->ioda.total_pe)
+		return OPAL_EEH_STOPPED_PERM_UNAVAIL;
 
-		dcomp = OPAL_IGNORE_RID_DEVICE_NUMBER;
-		fcomp = OPAL_IGNORE_RID_FUNCTION_NUMBER;
-		parent = pe->pbus->self;
-		if (pe->flags & PNV_IODA_PE_BUS_ALL)
-			count = pe->pbus->busn_res.end - pe->pbus->busn_res.start + 1;
-		else
-			count = 1;
+	/*
+	 * Fetch the master PE and the PE instance might be
+	 * not initialized yet.
+	 */
+	pe = &phb->ioda.pe_array[pe_no];
+	if (pe->flags & PNV_IODA_PE_SLAVE) {
+		pe = pe->master;
+		WARN_ON(!pe || !(pe->flags & PNV_IODA_PE_MASTER));
+		pe_no = pe->pe_number;
+	}
 
-		switch(count) {
-		case  1: bcomp = OpalPciBusAll;         break;
-		case  2: bcomp = OpalPciBus7Bits;       break;
-		case  4: bcomp = OpalPciBus6Bits;       break;
-		case  8: bcomp = OpalPciBus5Bits;       break;
-		case 16: bcomp = OpalPciBus4Bits;       break;
-		case 32: bcomp = OpalPciBus3Bits;       break;
-		default:
-			dev_err(&pe->pbus->dev, "Number of subordinate buses %d unsupported\n",
-			        count);
-			/* Do an exact match only */
-			bcomp = OpalPciBusAll;
-		}
-		rid_end = pe->rid + (count << 8);
-	} else {
-		if (pe->flags & PNV_IODA_PE_VF)
-			parent = pe->parent_dev;
-		else
-			parent = pe->pdev->bus->self;
-		bcomp = OpalPciBusAll;
-		dcomp = OPAL_COMPARE_RID_DEVICE_NUMBER;
-		fcomp = OPAL_COMPARE_RID_FUNCTION_NUMBER;
-		rid_end = pe->rid + 1;
+	/* Check the master PE */
+	rc = opal_pci_eeh_freeze_status(phb->opal_id, pe_no,
+					&state, &pcierr, NULL);
+	if (rc != OPAL_SUCCESS) {
+		pr_warn("%s: Error %lld getting PHB#%x-PE#%x state\n",
+			__func__, rc, phb->hose->global_number, pe_no);
+		return OPAL_EEH_STOPPED_TEMP_UNAVAIL;
 	}
 
-	/* Clear the reverse map */
-	for (rid = pe->rid; rid < rid_end; rid++)
-		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
+	/* Check the slave PE */
+	if (!(pe->flags & PNV_IODA_PE_MASTER))
+		return state;
 
-	/* Release from all parents PELT-V */
-	while (parent) {
-		struct pci_dn *pdn = pci_get_pdn(parent);
-		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
-			rc = opal_pci_set_peltv(phb->opal_id, pdn->pe_number,
-						pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN);
-			/* XXX What to do in case of error ? */
+	list_for_each_entry(slave, &pe->slaves, list) {
+		rc = opal_pci_eeh_freeze_status(phb->opal_id,
+						slave->pe_number,
+						&fstate,
+						&pcierr,
+						NULL);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("%s: Error %lld getting PHB#%x-PE#%x state\n",
+				__func__, rc, phb->hose->global_number,
+				slave->pe_number);
+			return OPAL_EEH_STOPPED_TEMP_UNAVAIL;
 		}
-		parent = parent->bus->self;
-	}
 
-	opal_pci_eeh_freeze_set(phb->opal_id, pe->pe_number,
-				  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+		/*
+		 * Override the result based on the ascending
+		 * priority.
+		 */
+		if (fstate > state)
+			state = fstate;
+	}
 
-	/* Disassociate PE in PELT */
-	rc = opal_pci_set_peltv(phb->opal_id, pe->pe_number,
-				pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN);
-	if (rc)
-		pe_warn(pe, "OPAL error %ld remove self from PELTV\n", rc);
-	rc = opal_pci_set_pe(phb->opal_id, pe->pe_number, pe->rid,
-			     bcomp, dcomp, fcomp, OPAL_UNMAP_PE);
-	if (rc)
-		pe_err(pe, "OPAL error %ld trying to setup PELT table\n", rc);
+	return state;
+}
 
-	pe->pbus = NULL;
-	pe->pdev = NULL;
-	pe->parent_dev = NULL;
+/* Currently those 2 are only used when MSIs are enabled, this will change
+ * but in the meantime, we need to protect them to avoid warnings
+ */
+#ifdef CONFIG_PCI_MSI
+static struct pnv_ioda_pe *pnv_ioda_dev_to_pe(struct pci_dev *dev)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = pci_get_pdn(dev);
 
-	return 0;
+	if (!pdn)
+		return NULL;
+	if (pdn->pe_number == IODA_INVALID_PE)
+		return NULL;
+	return &phb->ioda.pe_array[pdn->pe_number];
 }
-#endif /* CONFIG_PCI_IOV */
+#endif /* CONFIG_PCI_MSI */
 
 static int pnv_ioda_configure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 {
@@ -1349,29 +1440,6 @@ m64_failed:
 	return -EBUSY;
 }
 
-static long pnv_pci_ioda2_unset_window(struct iommu_table_group *table_group,
-		int num);
-static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable);
-
-static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe *pe)
-{
-	struct iommu_table    *tbl;
-	int64_t               rc;
-
-	tbl = pe->table_group.tables[0];
-	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
-	if (rc)
-		pe_warn(pe, "OPAL error %ld release DMA window\n", rc);
-
-	pnv_pci_ioda2_set_bypass(pe, false);
-	if (pe->table_group.group) {
-		iommu_group_put(pe->table_group.group);
-		BUG_ON(pe->table_group.group);
-	}
-	pnv_pci_ioda2_table_free_pages(tbl);
-	iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
-}
-
 static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 {
 	struct pci_bus        *bus;
@@ -1860,19 +1928,6 @@ static struct iommu_table_ops pnv_ioda1_iommu_ops = {
 	.get = pnv_tce_get,
 };
 
-static inline void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_ioda_pe *pe)
-{
-	/* 01xb - invalidate TCEs that match the specified PE# */
-	unsigned long val = (0x4ull << 60) | (pe->pe_number & 0xFF);
-	struct pnv_phb *phb = pe->phb;
-
-	if (!phb->ioda.tce_inval_reg)
-		return;
-
-	mb(); /* Ensure above stores are visible */
-	__raw_writeq(cpu_to_be64(val), phb->ioda.tce_inval_reg);
-}
-
 static void pnv_pci_ioda2_tce_do_invalidate(unsigned pe_number, bool rm,
 		__be64 __iomem *invalidate, unsigned shift,
 		unsigned long index, unsigned long npages)
@@ -2108,34 +2163,6 @@ static long pnv_pci_ioda2_set_window(struct iommu_table_group *table_group,
 	return 0;
 }
 
-static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable)
-{
-	uint16_t window_id = (pe->pe_number << 1 ) + 1;
-	int64_t rc;
-
-	pe_info(pe, "%sabling 64-bit DMA bypass\n", enable ? "En" : "Dis");
-	if (enable) {
-		phys_addr_t top = memblock_end_of_DRAM();
-
-		top = roundup_pow_of_two(top);
-		rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id,
-						     pe->pe_number,
-						     window_id,
-						     pe->tce_bypass_base,
-						     top);
-	} else {
-		rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id,
-						     pe->pe_number,
-						     window_id,
-						     pe->tce_bypass_base,
-						     0);
-	}
-	if (rc)
-		pe_err(pe, "OPAL error %lld configuring bypass window\n", rc);
-	else
-		pe->tce_bypass_enabled = enable;
-}
-
 static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset,
 		__u32 page_shift, __u64 window_size, __u32 levels,
 		struct iommu_table *tbl);
@@ -2248,30 +2275,6 @@ static unsigned long pnv_pci_ioda2_get_table_size(__u32 page_shift,
 	return bytes;
 }
 
-static long pnv_pci_ioda2_unset_window(struct iommu_table_group *table_group,
-		int num)
-{
-	struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
-			table_group);
-	struct pnv_phb *phb = pe->phb;
-	long ret;
-
-	pe_info(pe, "Removing DMA window #%d\n", num);
-
-	ret = opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
-			(pe->pe_number << 1) + num,
-			0/* levels */, 0/* table address */,
-			0/* table size */, 0/* page size */);
-	if (ret)
-		pe_warn(pe, "Unmapping failed, ret = %ld\n", ret);
-	else
-		pnv_pci_ioda2_tce_invalidate_entire(pe);
-
-	pnv_pci_unlink_table_and_group(table_group->tables[num], table_group);
-
-	return ret;
-}
-
 static void pnv_ioda2_take_ownership(struct iommu_table_group *table_group)
 {
 	struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
-- 
2.1.0
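
For readers following the bus-PE branch of pnv_ioda_deconfigure_pe() in
the diff above, here is a minimal standalone sketch of how the bus count
selects a compare mode and how the end of the RID range is derived. It
assumes the usual PCI RID layout (bus number in the upper 8 bits, devfn
in the lower 8), so each bus covers 256 RIDs. The demo_* names and enum
values are hypothetical stand-ins, not the real OpalPciBus* constants,
and the dev_err() call for unsupported counts is left out.

```c
#include <assert.h>

/* Hypothetical stand-ins for the OpalPciBus* compare modes. */
enum demo_bus_compare {
	DEMO_BUS_3BITS,
	DEMO_BUS_4BITS,
	DEMO_BUS_5BITS,
	DEMO_BUS_6BITS,
	DEMO_BUS_7BITS,
	DEMO_BUS_ALL,	/* exact match on all 8 bus-number bits */
};

/* Map a power-of-two bus count to the number of bus bits compared. */
static enum demo_bus_compare demo_bus_compare_for(int count)
{
	switch (count) {
	case  1: return DEMO_BUS_ALL;
	case  2: return DEMO_BUS_7BITS;
	case  4: return DEMO_BUS_6BITS;
	case  8: return DEMO_BUS_5BITS;
	case 16: return DEMO_BUS_4BITS;
	case 32: return DEMO_BUS_3BITS;
	default: return DEMO_BUS_ALL;	/* unsupported: exact match only */
	}
}

/* One past the last RID owned by a bus PE covering 'count' buses. */
static long demo_rid_end(long rid, int count)
{
	assert(count >= 1);
	return rid + ((long)count << 8);
}
```

With these definitions, a PE starting at RID 0x0100 and spanning four
buses would compare 6 bus bits and clear reverse-map entries for RIDs
0x0100 through 0x04ff.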


* [PATCH v5 22/42] powerpc/powernv: Move functions around
@ 2015-06-04  6:41     ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

This patch moves the functions related to releasing a PE earlier in the
file so that subsequent patches don't need extra forward declarations
for them. It doesn't introduce any behavioural changes.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 07/21]
  * Fixed coding style issues reported by checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 735 +++++++++++++++---------------
 1 file changed, 369 insertions(+), 366 deletions(-)
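
As an illustration of why reordering alone is enough, here is a minimal
standalone sketch. It is not taken from pci-ioda.c: struct demo_pe and
the demo_* functions are hypothetical stand-ins. In C, a static function
that is defined before its first caller needs no separate prototype, so
moving the helpers above their users removes the forward declarations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PE; not the kernel's struct pnv_ioda_pe. */
struct demo_pe {
	int  window_mapped;
	bool bypass_enabled;
};

/* The helpers are defined first ... */
static int demo_unset_window(struct demo_pe *pe)
{
	pe->window_mapped = 0;
	return 0;
}

static void demo_set_bypass(struct demo_pe *pe, bool enable)
{
	pe->bypass_enabled = enable;
}

/*
 * ... so the caller below sees complete definitions and needs no
 * "static int demo_unset_window(struct demo_pe *pe);" line above it.
 */
static int demo_release_dma_pe(struct demo_pe *pe)
{
	int rc;

	assert(pe);
	rc = demo_unset_window(pe);
	demo_set_bypass(pe, false);
	return rc;
}
```

Had demo_release_dma_pe() been placed above the two helpers, the file
would need a prototype for each of them ahead of the call, which is the
kind of declaration this patch deletes from the old location.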

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8a79403..3d5aec8d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -132,6 +132,285 @@ static inline bool pnv_pci_is_mem_pref_64(unsigned long flags)
 		(IORESOURCE_MEM_64 | IORESOURCE_PREFETCH));
 }
 
+static inline void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_ioda_pe *pe)
+{
+	/* 01xb - invalidate TCEs that match the specified PE# */
+	unsigned long val = (0x4ull << 60) | (pe->pe_number & 0xFF);
+	struct pnv_phb *phb = pe->phb;
+
+	if (!phb->ioda.tce_inval_reg)
+		return;
+
+	mb(); /* Ensure above stores are visible */
+	__raw_writeq(cpu_to_be64(val), phb->ioda.tce_inval_reg);
+}
+
+static long pnv_pci_ioda2_unset_window(struct iommu_table_group *table_group,
+		int num)
+{
+	struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
+			table_group);
+	struct pnv_phb *phb = pe->phb;
+	long ret;
+
+	pe_info(pe, "Removing DMA window #%d\n", num);
+
+	ret = opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
+			(pe->pe_number << 1) + num,
+			0/* levels */, 0/* table address */,
+			0/* table size */, 0/* page size */);
+	if (ret)
+		pe_warn(pe, "Unmapping failed, ret = %ld\n", ret);
+	else
+		pnv_pci_ioda2_tce_invalidate_entire(pe);
+
+	pnv_pci_unlink_table_and_group(table_group->tables[num], table_group);
+
+	return ret;
+}
+
+static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable)
+{
+	uint16_t window_id = (pe->pe_number << 1) + 1;
+	int64_t rc;
+
+	pe_info(pe, "%sabling 64-bit DMA bypass\n", enable ? "En" : "Dis");
+	if (enable) {
+		phys_addr_t top = memblock_end_of_DRAM();
+
+		top = roundup_pow_of_two(top);
+		rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id,
+						     pe->pe_number,
+						     window_id,
+						     pe->tce_bypass_base,
+						     top);
+	} else {
+		rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id,
+						     pe->pe_number,
+						     window_id,
+						     pe->tce_bypass_base,
+						     0);
+	}
+	if (rc)
+		pe_err(pe, "OPAL error %lld configuring bypass window\n", rc);
+	else
+		pe->tce_bypass_enabled = enable;
+}
+
+static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev,
+					 struct pnv_ioda_pe *pe)
+{
+	struct iommu_table    *tbl;
+	int64_t               rc;
+
+	tbl = pe->table_group.tables[0];
+	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
+	if (rc)
+		pe_warn(pe, "OPAL error %ld release DMA window\n", rc);
+
+	pnv_pci_ioda2_set_bypass(pe, false);
+	if (pe->table_group.group) {
+		iommu_group_put(pe->table_group.group);
+		BUG_ON(pe->table_group.group);
+	}
+	pnv_pci_ioda2_table_free_pages(tbl);
+	iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
+}
+
+static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
+				  struct pnv_ioda_pe *parent,
+				  struct pnv_ioda_pe *child,
+				  bool is_add)
+{
+	const char *desc = is_add ? "adding" : "removing";
+	uint8_t op = is_add ? OPAL_ADD_PE_TO_DOMAIN :
+			      OPAL_REMOVE_PE_FROM_DOMAIN;
+	struct pnv_ioda_pe *slave;
+	long rc;
+
+	/* Parent PE affects child PE */
+	rc = opal_pci_set_peltv(phb->opal_id, parent->pe_number,
+				child->pe_number, op);
+	if (rc != OPAL_SUCCESS) {
+		pe_warn(child, "OPAL error %ld %s to parent PELTV\n",
+			rc, desc);
+		return -ENXIO;
+	}
+
+	if (!(child->flags & PNV_IODA_PE_MASTER))
+		return 0;
+
+	/* Compound case: parent PE affects slave PEs */
+	list_for_each_entry(slave, &child->slaves, list) {
+		rc = opal_pci_set_peltv(phb->opal_id, parent->pe_number,
+					slave->pe_number, op);
+		if (rc != OPAL_SUCCESS) {
+			pe_warn(slave, "OPAL error %ld %s to parent PELTV\n",
+				rc, desc);
+			return -ENXIO;
+		}
+	}
+
+	return 0;
+}
+
+static int pnv_ioda_set_peltv(struct pnv_phb *phb,
+			      struct pnv_ioda_pe *pe,
+			      bool is_add)
+{
+	struct pnv_ioda_pe *slave;
+	struct pci_dev *pdev = NULL;
+	int ret;
+
+	/*
+	 * Clear PE frozen state. If it's master PE, we need
+	 * clear slave PE frozen state as well.
+	 */
+	if (is_add) {
+		opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number,
+					  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+		if (pe->flags & PNV_IODA_PE_MASTER) {
+			list_for_each_entry(slave, &pe->slaves, list)
+				opal_pci_eeh_freeze_clear(phb->opal_id,
+					slave->pe_number,
+					OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+		}
+	}
+
+	/*
+	 * Associate PE in PELT. We need add the PE into the
+	 * corresponding PELT-V as well. Otherwise, the error
+	 * originated from the PE might contribute to other
+	 * PEs.
+	 */
+	ret = pnv_ioda_set_one_peltv(phb, pe, pe, is_add);
+	if (ret)
+		return ret;
+
+	/* For compound PEs, any one affects all of them */
+	if (pe->flags & PNV_IODA_PE_MASTER) {
+		list_for_each_entry(slave, &pe->slaves, list) {
+			ret = pnv_ioda_set_one_peltv(phb, slave, pe, is_add);
+			if (ret)
+				return ret;
+		}
+	}
+
+	if (pe->flags & (PNV_IODA_PE_BUS_ALL | PNV_IODA_PE_BUS))
+		pdev = pe->pbus->self;
+	else if (pe->flags & PNV_IODA_PE_DEV)
+		pdev = pe->pdev->bus->self;
+#ifdef CONFIG_PCI_IOV
+	else if (pe->flags & PNV_IODA_PE_VF)
+		pdev = pe->parent_dev->bus->self;
+#endif /* CONFIG_PCI_IOV */
+	while (pdev) {
+		struct pci_dn *pdn = pci_get_pdn(pdev);
+		struct pnv_ioda_pe *parent;
+
+		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
+			parent = &phb->ioda.pe_array[pdn->pe_number];
+			ret = pnv_ioda_set_one_peltv(phb, parent, pe, is_add);
+			if (ret)
+				return ret;
+		}
+
+		pdev = pdev->bus->self;
+	}
+
+	return 0;
+}
+
+#ifdef CONFIG_PCI_IOV
+static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
+{
+	struct pci_dev *parent;
+	uint8_t bcomp, dcomp, fcomp;
+	int64_t rc;
+	long rid_end, rid;
+
+	/* Currently, we just deconfigure VF PE. Bus PE will always there.*/
+	if (pe->pbus) {
+		int count;
+
+		dcomp = OPAL_IGNORE_RID_DEVICE_NUMBER;
+		fcomp = OPAL_IGNORE_RID_FUNCTION_NUMBER;
+		parent = pe->pbus->self;
+		if (pe->flags & PNV_IODA_PE_BUS_ALL)
+			count = pe->pbus->busn_res.end -
+				pe->pbus->busn_res.start + 1;
+		else
+			count = 1;
+
+		switch (count) {
+		case  1:
+			bcomp = OpalPciBusAll;   break;
+		case  2:
+			bcomp = OpalPciBus7Bits; break;
+		case  4:
+			bcomp = OpalPciBus6Bits; break;
+		case  8:
+			bcomp = OpalPciBus5Bits; break;
+		case 16:
+			bcomp = OpalPciBus4Bits; break;
+		case 32:
+			bcomp = OpalPciBus3Bits; break;
+		default:
+			dev_err(&pe->pbus->dev, "Number of subordinate buses %d unsupported\n",
+				count);
+			/* Do an exact match only */
+			bcomp = OpalPciBusAll;
+		}
+		rid_end = pe->rid + (count << 8);
+	} else {
+		if (pe->flags & PNV_IODA_PE_VF)
+			parent = pe->parent_dev;
+		else
+			parent = pe->pdev->bus->self;
+		bcomp = OpalPciBusAll;
+		dcomp = OPAL_COMPARE_RID_DEVICE_NUMBER;
+		fcomp = OPAL_COMPARE_RID_FUNCTION_NUMBER;
+		rid_end = pe->rid + 1;
+	}
+
+	/* Clear the reverse map */
+	for (rid = pe->rid; rid < rid_end; rid++)
+		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
+
+	/* Release from all parents PELT-V */
+	while (parent) {
+		struct pci_dn *pdn = pci_get_pdn(parent);
+
+		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
+			rc = opal_pci_set_peltv(phb->opal_id,
+					pdn->pe_number, pe->pe_number,
+					OPAL_REMOVE_PE_FROM_DOMAIN);
+			/* XXX What to do in case of error ? */
+		}
+		parent = parent->bus->self;
+	}
+
+	opal_pci_eeh_freeze_set(phb->opal_id, pe->pe_number,
+				  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+
+	/* Disassociate PE in PELT */
+	rc = opal_pci_set_peltv(phb->opal_id, pe->pe_number,
+				pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN);
+	if (rc)
+		pe_warn(pe, "OPAL error %ld remove self from PELTV\n", rc);
+	rc = opal_pci_set_pe(phb->opal_id, pe->pe_number, pe->rid,
+			     bcomp, dcomp, fcomp, OPAL_UNMAP_PE);
+	if (rc)
+		pe_err(pe, "OPAL error %ld trying to setup PELT table\n", rc);
+
+	pe->pbus = NULL;
+	pe->pdev = NULL;
+	pe->parent_dev = NULL;
+
+	return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int pe_no)
 {
 	struct pnv_ioda_pe *pe = &phb->ioda.pe_array[pe_no];
@@ -599,307 +878,119 @@ static void pnv_ioda_freeze_pe(struct pnv_phb *phb, int pe_no)
 static int pnv_ioda_unfreeze_pe(struct pnv_phb *phb, int pe_no, int opt)
 {
 	struct pnv_ioda_pe *pe, *slave;
-	s64 rc;
-
-	/* Find master PE */
-	pe = &phb->ioda.pe_array[pe_no];
-	if (pe->flags & PNV_IODA_PE_SLAVE) {
-		pe = pe->master;
-		WARN_ON(!pe || !(pe->flags & PNV_IODA_PE_MASTER));
-		pe_no = pe->pe_number;
-	}
-
-	/* Clear frozen state for master PE */
-	rc = opal_pci_eeh_freeze_clear(phb->opal_id, pe_no, opt);
-	if (rc != OPAL_SUCCESS) {
-		pr_warn("%s: Failure %lld clear %d on PHB#%x-PE#%x\n",
-			__func__, rc, opt, phb->hose->global_number, pe_no);
-		return -EIO;
-	}
-
-	if (!(pe->flags & PNV_IODA_PE_MASTER))
-		return 0;
-
-	/* Clear frozen state for slave PEs */
-	list_for_each_entry(slave, &pe->slaves, list) {
-		rc = opal_pci_eeh_freeze_clear(phb->opal_id,
-					     slave->pe_number,
-					     opt);
-		if (rc != OPAL_SUCCESS) {
-			pr_warn("%s: Failure %lld clear %d on PHB#%x-PE#%x\n",
-				__func__, rc, opt, phb->hose->global_number,
-				slave->pe_number);
-			return -EIO;
-		}
-	}
-
-	return 0;
-}
-
-static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
-{
-	struct pnv_ioda_pe *slave, *pe;
-	u8 fstate, state;
-	__be16 pcierr;
-	s64 rc;
-
-	/* Sanity check on PE number */
-	if (pe_no < 0 || pe_no >= phb->ioda.total_pe)
-		return OPAL_EEH_STOPPED_PERM_UNAVAIL;
-
-	/*
-	 * Fetch the master PE and the PE instance might be
-	 * not initialized yet.
-	 */
-	pe = &phb->ioda.pe_array[pe_no];
-	if (pe->flags & PNV_IODA_PE_SLAVE) {
-		pe = pe->master;
-		WARN_ON(!pe || !(pe->flags & PNV_IODA_PE_MASTER));
-		pe_no = pe->pe_number;
-	}
-
-	/* Check the master PE */
-	rc = opal_pci_eeh_freeze_status(phb->opal_id, pe_no,
-					&state, &pcierr, NULL);
-	if (rc != OPAL_SUCCESS) {
-		pr_warn("%s: Failure %lld getting "
-			"PHB#%x-PE#%x state\n",
-			__func__, rc,
-			phb->hose->global_number, pe_no);
-		return OPAL_EEH_STOPPED_TEMP_UNAVAIL;
-	}
-
-	/* Check the slave PE */
-	if (!(pe->flags & PNV_IODA_PE_MASTER))
-		return state;
-
-	list_for_each_entry(slave, &pe->slaves, list) {
-		rc = opal_pci_eeh_freeze_status(phb->opal_id,
-						slave->pe_number,
-						&fstate,
-						&pcierr,
-						NULL);
-		if (rc != OPAL_SUCCESS) {
-			pr_warn("%s: Failure %lld getting "
-				"PHB#%x-PE#%x state\n",
-				__func__, rc,
-				phb->hose->global_number, slave->pe_number);
-			return OPAL_EEH_STOPPED_TEMP_UNAVAIL;
-		}
-
-		/*
-		 * Override the result based on the ascending
-		 * priority.
-		 */
-		if (fstate > state)
-			state = fstate;
-	}
-
-	return state;
-}
-
-/* Currently those 2 are only used when MSIs are enabled, this will change
- * but in the meantime, we need to protect them to avoid warnings
- */
-#ifdef CONFIG_PCI_MSI
-static struct pnv_ioda_pe *pnv_ioda_dev_to_pe(struct pci_dev *dev)
-{
-	struct pci_controller *hose = pci_bus_to_host(dev->bus);
-	struct pnv_phb *phb = hose->private_data;
-	struct pci_dn *pdn = pci_get_pdn(dev);
-
-	if (!pdn)
-		return NULL;
-	if (pdn->pe_number == IODA_INVALID_PE)
-		return NULL;
-	return &phb->ioda.pe_array[pdn->pe_number];
-}
-#endif /* CONFIG_PCI_MSI */
-
-static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
-				  struct pnv_ioda_pe *parent,
-				  struct pnv_ioda_pe *child,
-				  bool is_add)
-{
-	const char *desc = is_add ? "adding" : "removing";
-	uint8_t op = is_add ? OPAL_ADD_PE_TO_DOMAIN :
-			      OPAL_REMOVE_PE_FROM_DOMAIN;
-	struct pnv_ioda_pe *slave;
-	long rc;
-
-	/* Parent PE affects child PE */
-	rc = opal_pci_set_peltv(phb->opal_id, parent->pe_number,
-				child->pe_number, op);
-	if (rc != OPAL_SUCCESS) {
-		pe_warn(child, "OPAL error %ld %s to parent PELTV\n",
-			rc, desc);
-		return -ENXIO;
-	}
-
-	if (!(child->flags & PNV_IODA_PE_MASTER))
-		return 0;
-
-	/* Compound case: parent PE affects slave PEs */
-	list_for_each_entry(slave, &child->slaves, list) {
-		rc = opal_pci_set_peltv(phb->opal_id, parent->pe_number,
-					slave->pe_number, op);
-		if (rc != OPAL_SUCCESS) {
-			pe_warn(slave, "OPAL error %ld %s to parent PELTV\n",
-				rc, desc);
-			return -ENXIO;
-		}
-	}
-
-	return 0;
-}
-
-static int pnv_ioda_set_peltv(struct pnv_phb *phb,
-			      struct pnv_ioda_pe *pe,
-			      bool is_add)
-{
-	struct pnv_ioda_pe *slave;
-	struct pci_dev *pdev = NULL;
-	int ret;
-
-	/*
-	 * Clear PE frozen state. If it's master PE, we need
-	 * clear slave PE frozen state as well.
-	 */
-	if (is_add) {
-		opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number,
-					  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
-		if (pe->flags & PNV_IODA_PE_MASTER) {
-			list_for_each_entry(slave, &pe->slaves, list)
-				opal_pci_eeh_freeze_clear(phb->opal_id,
-							  slave->pe_number,
-							  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
-		}
-	}
-
-	/*
-	 * Associate PE in PELT. We need add the PE into the
-	 * corresponding PELT-V as well. Otherwise, the error
-	 * originated from the PE might contribute to other
-	 * PEs.
-	 */
-	ret = pnv_ioda_set_one_peltv(phb, pe, pe, is_add);
-	if (ret)
-		return ret;
+	s64 rc;
 
-	/* For compound PEs, any one affects all of them */
-	if (pe->flags & PNV_IODA_PE_MASTER) {
-		list_for_each_entry(slave, &pe->slaves, list) {
-			ret = pnv_ioda_set_one_peltv(phb, slave, pe, is_add);
-			if (ret)
-				return ret;
-		}
+	/* Find master PE */
+	pe = &phb->ioda.pe_array[pe_no];
+	if (pe->flags & PNV_IODA_PE_SLAVE) {
+		pe = pe->master;
+		WARN_ON(!pe || !(pe->flags & PNV_IODA_PE_MASTER));
+		pe_no = pe->pe_number;
 	}
 
-	if (pe->flags & (PNV_IODA_PE_BUS_ALL | PNV_IODA_PE_BUS))
-		pdev = pe->pbus->self;
-	else if (pe->flags & PNV_IODA_PE_DEV)
-		pdev = pe->pdev->bus->self;
-#ifdef CONFIG_PCI_IOV
-	else if (pe->flags & PNV_IODA_PE_VF)
-		pdev = pe->parent_dev->bus->self;
-#endif /* CONFIG_PCI_IOV */
-	while (pdev) {
-		struct pci_dn *pdn = pci_get_pdn(pdev);
-		struct pnv_ioda_pe *parent;
+	/* Clear frozen state for master PE */
+	rc = opal_pci_eeh_freeze_clear(phb->opal_id, pe_no, opt);
+	if (rc != OPAL_SUCCESS) {
+		pr_warn("%s: Failure %lld clear %d on PHB#%x-PE#%x\n",
+			__func__, rc, opt, phb->hose->global_number, pe_no);
+		return -EIO;
+	}
 
-		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
-			parent = &phb->ioda.pe_array[pdn->pe_number];
-			ret = pnv_ioda_set_one_peltv(phb, parent, pe, is_add);
-			if (ret)
-				return ret;
-		}
+	if (!(pe->flags & PNV_IODA_PE_MASTER))
+		return 0;
 
-		pdev = pdev->bus->self;
+	/* Clear frozen state for slave PEs */
+	list_for_each_entry(slave, &pe->slaves, list) {
+		rc = opal_pci_eeh_freeze_clear(phb->opal_id,
+					     slave->pe_number,
+					     opt);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("%s: Failure %lld clear %d on PHB#%x-PE#%x\n",
+				__func__, rc, opt, phb->hose->global_number,
+				slave->pe_number);
+			return -EIO;
+		}
 	}
 
 	return 0;
 }
 
-#ifdef CONFIG_PCI_IOV
-static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
+static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
 {
-	struct pci_dev *parent;
-	uint8_t bcomp, dcomp, fcomp;
-	int64_t rc;
-	long rid_end, rid;
+	struct pnv_ioda_pe *slave, *pe;
+	u8 fstate, state;
+	__be16 pcierr;
+	s64 rc;
 
-	/* Currently, we just deconfigure VF PE. Bus PE will always there.*/
-	if (pe->pbus) {
-		int count;
+	/* Sanity check on PE number */
+	if (pe_no < 0 || pe_no >= phb->ioda.total_pe)
+		return OPAL_EEH_STOPPED_PERM_UNAVAIL;
 
-		dcomp = OPAL_IGNORE_RID_DEVICE_NUMBER;
-		fcomp = OPAL_IGNORE_RID_FUNCTION_NUMBER;
-		parent = pe->pbus->self;
-		if (pe->flags & PNV_IODA_PE_BUS_ALL)
-			count = pe->pbus->busn_res.end - pe->pbus->busn_res.start + 1;
-		else
-			count = 1;
+	/*
+	 * Fetch the master PE. The PE instance might not be
+	 * initialized yet.
+	 */
+	pe = &phb->ioda.pe_array[pe_no];
+	if (pe->flags & PNV_IODA_PE_SLAVE) {
+		pe = pe->master;
+		WARN_ON(!pe || !(pe->flags & PNV_IODA_PE_MASTER));
+		pe_no = pe->pe_number;
+	}
 
-		switch(count) {
-		case  1: bcomp = OpalPciBusAll;         break;
-		case  2: bcomp = OpalPciBus7Bits;       break;
-		case  4: bcomp = OpalPciBus6Bits;       break;
-		case  8: bcomp = OpalPciBus5Bits;       break;
-		case 16: bcomp = OpalPciBus4Bits;       break;
-		case 32: bcomp = OpalPciBus3Bits;       break;
-		default:
-			dev_err(&pe->pbus->dev, "Number of subordinate buses %d unsupported\n",
-			        count);
-			/* Do an exact match only */
-			bcomp = OpalPciBusAll;
-		}
-		rid_end = pe->rid + (count << 8);
-	} else {
-		if (pe->flags & PNV_IODA_PE_VF)
-			parent = pe->parent_dev;
-		else
-			parent = pe->pdev->bus->self;
-		bcomp = OpalPciBusAll;
-		dcomp = OPAL_COMPARE_RID_DEVICE_NUMBER;
-		fcomp = OPAL_COMPARE_RID_FUNCTION_NUMBER;
-		rid_end = pe->rid + 1;
+	/* Check the master PE */
+	rc = opal_pci_eeh_freeze_status(phb->opal_id, pe_no,
+					&state, &pcierr, NULL);
+	if (rc != OPAL_SUCCESS) {
+		pr_warn("%s: Error %lld getting PHB#%x-PE#%x state\n",
+			__func__, rc, phb->hose->global_number, pe_no);
+		return OPAL_EEH_STOPPED_TEMP_UNAVAIL;
 	}
 
-	/* Clear the reverse map */
-	for (rid = pe->rid; rid < rid_end; rid++)
-		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
+	/* Check the slave PE */
+	if (!(pe->flags & PNV_IODA_PE_MASTER))
+		return state;
 
-	/* Release from all parents PELT-V */
-	while (parent) {
-		struct pci_dn *pdn = pci_get_pdn(parent);
-		if (pdn && pdn->pe_number != IODA_INVALID_PE) {
-			rc = opal_pci_set_peltv(phb->opal_id, pdn->pe_number,
-						pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN);
-			/* XXX What to do in case of error ? */
+	list_for_each_entry(slave, &pe->slaves, list) {
+		rc = opal_pci_eeh_freeze_status(phb->opal_id,
+						slave->pe_number,
+						&fstate,
+						&pcierr,
+						NULL);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("%s: Error %lld getting PHB#%x-PE#%x state\n",
+				__func__, rc, phb->hose->global_number,
+				slave->pe_number);
+			return OPAL_EEH_STOPPED_TEMP_UNAVAIL;
 		}
-		parent = parent->bus->self;
-	}
 
-	opal_pci_eeh_freeze_set(phb->opal_id, pe->pe_number,
-				  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
+		/*
+		 * Override the result based on the ascending
+		 * priority.
+		 */
+		if (fstate > state)
+			state = fstate;
+	}
 
-	/* Disassociate PE in PELT */
-	rc = opal_pci_set_peltv(phb->opal_id, pe->pe_number,
-				pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN);
-	if (rc)
-		pe_warn(pe, "OPAL error %ld remove self from PELTV\n", rc);
-	rc = opal_pci_set_pe(phb->opal_id, pe->pe_number, pe->rid,
-			     bcomp, dcomp, fcomp, OPAL_UNMAP_PE);
-	if (rc)
-		pe_err(pe, "OPAL error %ld trying to setup PELT table\n", rc);
+	return state;
+}
 
-	pe->pbus = NULL;
-	pe->pdev = NULL;
-	pe->parent_dev = NULL;
+/* Currently those 2 are only used when MSIs are enabled, this will change
+ * but in the meantime, we need to protect them to avoid warnings
+ */
+#ifdef CONFIG_PCI_MSI
+static struct pnv_ioda_pe *pnv_ioda_dev_to_pe(struct pci_dev *dev)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = pci_get_pdn(dev);
 
-	return 0;
+	if (!pdn)
+		return NULL;
+	if (pdn->pe_number == IODA_INVALID_PE)
+		return NULL;
+	return &phb->ioda.pe_array[pdn->pe_number];
 }
-#endif /* CONFIG_PCI_IOV */
+#endif /* CONFIG_PCI_MSI */
 
 static int pnv_ioda_configure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 {
@@ -1349,29 +1440,6 @@ m64_failed:
 	return -EBUSY;
 }
 
-static long pnv_pci_ioda2_unset_window(struct iommu_table_group *table_group,
-		int num);
-static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable);
-
-static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe *pe)
-{
-	struct iommu_table    *tbl;
-	int64_t               rc;
-
-	tbl = pe->table_group.tables[0];
-	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
-	if (rc)
-		pe_warn(pe, "OPAL error %ld release DMA window\n", rc);
-
-	pnv_pci_ioda2_set_bypass(pe, false);
-	if (pe->table_group.group) {
-		iommu_group_put(pe->table_group.group);
-		BUG_ON(pe->table_group.group);
-	}
-	pnv_pci_ioda2_table_free_pages(tbl);
-	iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
-}
-
 static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 {
 	struct pci_bus        *bus;
@@ -1860,19 +1928,6 @@ static struct iommu_table_ops pnv_ioda1_iommu_ops = {
 	.get = pnv_tce_get,
 };
 
-static inline void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_ioda_pe *pe)
-{
-	/* 01xb - invalidate TCEs that match the specified PE# */
-	unsigned long val = (0x4ull << 60) | (pe->pe_number & 0xFF);
-	struct pnv_phb *phb = pe->phb;
-
-	if (!phb->ioda.tce_inval_reg)
-		return;
-
-	mb(); /* Ensure above stores are visible */
-	__raw_writeq(cpu_to_be64(val), phb->ioda.tce_inval_reg);
-}
-
 static void pnv_pci_ioda2_tce_do_invalidate(unsigned pe_number, bool rm,
 		__be64 __iomem *invalidate, unsigned shift,
 		unsigned long index, unsigned long npages)
@@ -2108,34 +2163,6 @@ static long pnv_pci_ioda2_set_window(struct iommu_table_group *table_group,
 	return 0;
 }
 
-static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable)
-{
-	uint16_t window_id = (pe->pe_number << 1 ) + 1;
-	int64_t rc;
-
-	pe_info(pe, "%sabling 64-bit DMA bypass\n", enable ? "En" : "Dis");
-	if (enable) {
-		phys_addr_t top = memblock_end_of_DRAM();
-
-		top = roundup_pow_of_two(top);
-		rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id,
-						     pe->pe_number,
-						     window_id,
-						     pe->tce_bypass_base,
-						     top);
-	} else {
-		rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id,
-						     pe->pe_number,
-						     window_id,
-						     pe->tce_bypass_base,
-						     0);
-	}
-	if (rc)
-		pe_err(pe, "OPAL error %lld configuring bypass window\n", rc);
-	else
-		pe->tce_bypass_enabled = enable;
-}
-
 static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset,
 		__u32 page_shift, __u64 window_size, __u32 levels,
 		struct iommu_table *tbl);
@@ -2248,30 +2275,6 @@ static unsigned long pnv_pci_ioda2_get_table_size(__u32 page_shift,
 	return bytes;
 }
 
-static long pnv_pci_ioda2_unset_window(struct iommu_table_group *table_group,
-		int num)
-{
-	struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
-			table_group);
-	struct pnv_phb *phb = pe->phb;
-	long ret;
-
-	pe_info(pe, "Removing DMA window #%d\n", num);
-
-	ret = opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
-			(pe->pe_number << 1) + num,
-			0/* levels */, 0/* table address */,
-			0/* table size */, 0/* page size */);
-	if (ret)
-		pe_warn(pe, "Unmapping failed, ret = %ld\n", ret);
-	else
-		pnv_pci_ioda2_tce_invalidate_entire(pe);
-
-	pnv_pci_unlink_table_and_group(table_group->tables[num], table_group);
-
-	return ret;
-}
-
 static void pnv_ioda2_take_ownership(struct iommu_table_group *table_group)
 {
 	struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
-- 
2.1.0
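
As a standalone illustration of the compound-PE handling in
pnv_ioda_get_pe_state() above: the state reported for a compound PE is
the highest-valued state found across the master and its slaves. The
sketch below is a user-space model, not kernel code. The enum values and
the names model_state and compound_state() are hypothetical stand-ins,
and it assumes, as the patch's comparison does, that a larger value
means a more severe state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical severity-ordered states; not the real OPAL constants. */
enum model_state {
	MODEL_NOT_FROZEN = 0,
	MODEL_MMIO_FROZEN = 1,
	MODEL_DMA_FROZEN = 2,
	MODEL_MMIO_DMA_FROZEN = 3,
};

/*
 * Return the most severe state among the master and its slaves,
 * mirroring the "if (fstate > state) state = fstate;" step.
 */
enum model_state compound_state(enum model_state master,
				const enum model_state *slaves,
				size_t nr_slaves)
{
	enum model_state state = master;
	size_t i;

	assert(nr_slaves == 0 || slaves != NULL);

	for (i = 0; i < nr_slaves; i++) {
		if (slaves[i] > state)
			state = slaves[i];
	}

	return state;
}
```

A PE that is not a master has no slaves to scan, so passing
nr_slaves == 0 simply returns the master's own state, which matches the
early return in the patch.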


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 23/42] powerpc/powernv: Cleanup on pnv_pci_ioda2_release_dma_pe()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (13 preceding siblings ...)
       [not found] ` <1433400131-18429-1-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 25/42] powerpc/powernv: Supports slot ID Gavin Shan
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

This patch cleans up pnv_pci_ioda2_release_dma_pe():

  * Rename it to pnv_pci_ioda2_release_pe_dma() to match the names
    of the functions that release a PE's resources in the
    subsequent patches.
  * Remove the PCI device parameter, which was used only to look
    up the device node. VFs have no associated device node in the
    SRIOV case. In the other cases, the device node can be derived
    from the PCI bus or device the PE was allocated for.
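
The device node lookup described above can be sketched as a small
user-space model. It is only an illustration under assumptions: the
MODEL_PE_* flag bits, struct model_pe and model_pe_node() are
hypothetical names, and plain strings stand in for device nodes. For a
VF PE the model falls back to the parent device, as the diff below does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; the kernel's PNV_IODA_PE_* values may differ. */
#define MODEL_PE_DEV		(1u << 0)
#define MODEL_PE_BUS		(1u << 1)
#define MODEL_PE_BUS_ALL	(1u << 2)
#define MODEL_PE_VF		(1u << 3)

struct model_pe {
	unsigned int flags;
	const char *bus_node;		/* node of the bus the PE covers */
	const char *dev_node;		/* node of the single device */
	const char *parent_node;	/* node of the PF that owns a VF */
};

/* Pick the node from whatever object the PE was allocated for. */
const char *model_pe_node(const struct model_pe *pe)
{
	assert(pe != NULL);

	if (pe->flags & (MODEL_PE_BUS | MODEL_PE_BUS_ALL))
		return pe->bus_node;
	if (pe->flags & MODEL_PE_DEV)
		return pe->dev_node;
	if (pe->flags & MODEL_PE_VF)
		return pe->parent_node;

	return NULL;	/* unknown PE type: no node available */
}
```

Because the PE itself carries enough information to find the node, the
caller no longer needs to pass a PCI device alongside it.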

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Newly introduced
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3d5aec8d..2e31472 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -197,11 +197,11 @@ static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable)
 		pe->tce_bypass_enabled = enable;
 }
 
-static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev,
-					 struct pnv_ioda_pe *pe)
+static void pnv_pci_ioda2_release_pe_dma(struct pnv_ioda_pe *pe)
 {
-	struct iommu_table    *tbl;
-	int64_t               rc;
+	struct iommu_table *tbl;
+	struct device_node *dn;
+	int64_t rc;
 
 	tbl = pe->table_group.tables[0];
 	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
@@ -213,8 +213,20 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev,
 		iommu_group_put(pe->table_group.group);
 		BUG_ON(pe->table_group.group);
 	}
+
+	if (pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL))
+		dn = pci_bus_to_OF_node(pe->pbus);
+	else if (pe->flags & PNV_IODA_PE_DEV)
+		dn = pci_device_to_OF_node(pe->pdev);
+#ifdef CONFIG_PCI_IOV
+	else if (pe->flags & PNV_IODA_PE_VF)
+		dn = pci_device_to_OF_node(pe->parent_dev);
+#endif
+	else
+		dn = NULL;
+
 	pnv_pci_ioda2_table_free_pages(tbl);
-	iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
+	iommu_free_table(tbl, of_node_full_name(dn));
 }
 
 static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
@@ -1495,14 +1507,14 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 		if ((pe->flags & PNV_IODA_PE_MASTER) &&
 		    (pe->flags & PNV_IODA_PE_VF)) {
 			list_for_each_entry_safe(s, sn, &pe->slaves, list) {
-				pnv_pci_ioda2_release_dma_pe(pdev, s);
+				pnv_pci_ioda2_release_pe_dma(s);
 				list_del(&s->list);
 				pnv_ioda_deconfigure_pe(phb, s);
 				pnv_ioda_free_pe(phb, s->pe_number);
 			}
 		}
 
-		pnv_pci_ioda2_release_dma_pe(pdev, pe);
+		pnv_pci_ioda2_release_pe_dma(pe);
 
 		/* Remove from list */
 		mutex_lock(&phb->ioda.pe_list_mutex);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 24/42] powerpc/powernv: Release PEs dynamically
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:41     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

This patch adds a reference count to each PE, which tracks the number
of PCI devices the PE contains. When the last device leaves the PE,
the PE and the resources it consumes (IO, DMA, PELTM/PELTV) are
released, in order to support PCI hotplug.
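
The lifecycle described above can be sketched with a minimal user-space
model. It is an illustration only: struct model_ref_pe and the
model_pe_*() helpers are hypothetical names, a plain integer replaces
the kernel's struct kref, and the model is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>

struct model_ref_pe {
	int refcount;	/* number of devices currently in the PE */
	bool released;	/* set once the PE's resources are torn down */
};

/* Stand-in for releasing the IO, DMA and PELTM/PELTV resources. */
void model_pe_release(struct model_ref_pe *pe)
{
	pe->released = true;
}

/* A device joins the PE. */
void model_pe_get(struct model_ref_pe *pe)
{
	pe->refcount++;
}

/* A device leaves the PE; the last one triggers the release. */
void model_pe_put(struct model_ref_pe *pe)
{
	assert(pe->refcount > 0);

	if (--pe->refcount == 0)
		model_pe_release(pe);
}
```

With two devices in a PE, the first model_pe_put() leaves the PE intact
and the second one releases it, which is the behaviour hotplug removal
relies on.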

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 07/21]
---
 arch/powerpc/include/asm/pci-bridge.h     |   1 +
 arch/powerpc/kernel/pci-hotplug.c         |   5 +
 arch/powerpc/platforms/powernv/pci-ioda.c | 181 +++++++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci.h      |   2 +
 4 files changed, 183 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 1f39ca7..9a83cdb 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -26,6 +26,7 @@ struct pci_controller_ops {
 	/* Called when pci_enable_device() is called. Returns true to
 	 * allow assignment/enabling of the device. */
 	bool		(*enable_device_hook)(struct pci_dev *);
+	void		(*release_device)(struct pci_dev *);
 
 	/* Called during PCI resource reassignment */
 	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long);
diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 98f84ed..21973e7 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -29,6 +29,11 @@
  */
 void pcibios_release_device(struct pci_dev *dev)
 {
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+
+	if (hose->controller_ops.release_device)
+		hose->controller_ops.release_device(dev);
+
 	eeh_remove_device(dev);
 }
 
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2e31472..17ba55c 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -132,6 +132,50 @@ static inline bool pnv_pci_is_mem_pref_64(unsigned long flags)
 		(IORESOURCE_MEM_64 | IORESOURCE_PREFETCH));
 }
 
+static void pnv_pci_ioda_release_pe_dma(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+	struct iommu_table *tbl;
+	int seg;
+	int64_t rc;
+
+	/* No DMA32 segments allocated */
+	if (pe->dma32_seg < 0 ||
+	    pe->dma32_segcount <= 0)
+		return;
+
+	/* Unlink IOMMU table from group */
+	tbl = pe->table_group.tables[0];
+	pnv_pci_unlink_table_and_group(tbl, &pe->table_group);
+	if (pe->table_group.group) {
+		iommu_group_put(pe->table_group.group);
+		BUG_ON(pe->table_group.group);
+	}
+
+	/* Release IOMMU table */
+	free_pages(tbl->it_base,
+		get_order(TCE32_TABLE_SIZE * pe->dma32_segcount));
+	iommu_free_table(tbl,
+		of_node_full_name(pci_bus_to_OF_node(pe->pbus)));
+
+	/* Disable TVE */
+	for (seg = pe->dma32_seg;
+	     seg < pe->dma32_seg + pe->dma32_segcount;
+	     seg++) {
+		rc = opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
+				seg, 0, 0ul, 0ul, 0ul);
+		if (rc)
+			pe_warn(pe, "Error %ld unmapping DMA32 seg#%d\n",
+				rc, seg);
+	}
+
+	/* Free the DMA32 segments */
+	bitmap_clear(phb->ioda.dma32_segmap,
+		pe->dma32_seg, pe->dma32_segcount);
+	pe->dma32_seg = -1;
+	pe->dma32_segcount = 0;
+}
+
 static inline void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_ioda_pe *pe)
 {
 	/* 01xb - invalidate TCEs that match the specified PE# */
@@ -203,6 +247,10 @@ static void pnv_pci_ioda2_release_pe_dma(struct pnv_ioda_pe *pe)
 	struct device_node *dn;
 	int64_t rc;
 
+	if (pe->dma32_seg < 0 ||
+	    pe->dma32_segcount <= 0)
+		return;
+
 	tbl = pe->table_group.tables[0];
 	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
 	if (rc)
@@ -227,6 +275,61 @@ static void pnv_pci_ioda2_release_pe_dma(struct pnv_ioda_pe *pe)
 
 	pnv_pci_ioda2_table_free_pages(tbl);
 	iommu_free_table(tbl, of_node_full_name(dn));
+	pe->dma32_seg = -1;
+	pe->dma32_segcount = 0;
+}
+
+static void pnv_ioda_release_pe_dma(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+
+	if (phb->type == PNV_PHB_IODA1)
+		pnv_pci_ioda_release_pe_dma(pe);
+	else if (phb->type == PNV_PHB_IODA2)
+		pnv_pci_ioda2_release_pe_dma(pe);
+}
+
+static void pnv_ioda_release_pe_seg(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+	unsigned long *segmap = NULL;
+	unsigned long *pe_segmap = NULL;
+	uint16_t win;
+	int segno;
+
+	for (win = OPAL_M32_WINDOW_TYPE; win <= OPAL_IO_WINDOW_TYPE; win++) {
+		switch (win) {
+		case OPAL_IO_WINDOW_TYPE:
+			segmap = phb->ioda.io_segmap;
+			pe_segmap = pe->io_segmap;
+			break;
+		case OPAL_M32_WINDOW_TYPE:
+			segmap = phb->ioda.m32_segmap;
+			pe_segmap = pe->m32_segmap;
+			break;
+		case OPAL_M64_WINDOW_TYPE:
+			segmap = phb->ioda.m64_segmap;
+			pe_segmap = pe->m64_segmap;
+			break;
+		}
+
+		segno = -1;
+		while ((segno = find_next_bit(pe_segmap,
+			phb->ioda.total_pe, segno + 1))
+			< phb->ioda.total_pe) {
+			if (win == OPAL_IO_WINDOW_TYPE ||
+			    win == OPAL_M32_WINDOW_TYPE)
+				opal_pci_map_pe_mmio_window(phb->opal_id,
+					phb->ioda.reserved_pe, win, 0, segno);
+			else if (phb->type == PNV_PHB_IODA1)
+				opal_pci_map_pe_mmio_window(phb->opal_id,
+					phb->ioda.reserved_pe, win,
+					segno / 8, segno % 8);
+
+			clear_bit(segno, pe_segmap);
+			clear_bit(segno, segmap);
+		}
+	}
 }
 
 static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
@@ -333,7 +436,6 @@ static int pnv_ioda_set_peltv(struct pnv_phb *phb,
 	return 0;
 }
 
-#ifdef CONFIG_PCI_IOV
 static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 {
 	struct pci_dev *parent;
@@ -421,7 +523,74 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 
 	return 0;
 }
-#endif /* CONFIG_PCI_IOV */
+
+static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+	struct pnv_ioda_pe *tmp, *slave;
+
+	/* Release slave PEs in compound PE */
+	if (pe->flags & PNV_IODA_PE_MASTER) {
+		list_for_each_entry_safe(slave, tmp, &pe->slaves, list)
+			pnv_ioda_release_pe(slave);
+	}
+
+	/* Remove the PE from the list */
+	list_del(&pe->list);
+
+	/* Release resources */
+	pnv_ioda_release_pe_dma(pe);
+	pnv_ioda_release_pe_seg(pe);
+	pnv_ioda_deconfigure_pe(pe->phb, pe);
+
+	/* Release PE number */
+	clear_bit(pe->pe_number, phb->ioda.pe_alloc);
+}
+
+static void pnv_ioda_destroy_pe(struct kref *kref)
+{
+	struct pnv_ioda_pe *pe = container_of(kref, struct pnv_ioda_pe, kref);
+
+	pnv_ioda_release_pe(pe);
+}
+
+static inline struct pnv_ioda_pe *pnv_ioda_get_pe(struct pnv_ioda_pe *pe)
+{
+	if (!pe)
+		return NULL;
+
+	if (!pe->kref_init) {
+		pe->kref_init = true;
+		kref_init(&pe->kref);
+	} else {
+		kref_get(&pe->kref);
+	}
+
+	return pe;
+}
+
+static inline void pnv_ioda_put_pe(struct pnv_ioda_pe *pe)
+{
+	if (pe)
+		kref_put(&pe->kref, pnv_ioda_destroy_pe);
+}
+
+static void pnv_pci_release_device(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	struct pnv_ioda_pe *pe;
+
+	if (pdev->is_virtfn)
+		return;
+
+	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
+		return;
+
+	pe = &phb->ioda.pe_array[pdn->pe_number];
+	pnv_ioda_put_pe(pe);
+}
 
 static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int pe_no)
 {
@@ -429,6 +598,7 @@ static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int pe_no)
 
 	pe->phb = phb;
 	pe->pe_number = pe_no;
+	pe->kref_init = false;
 	INIT_LIST_HEAD(&pe->list);
 
 	return pe;
@@ -1233,6 +1403,7 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 		if (pdn->pe_number != IODA_INVALID_PE)
 			continue;
 
+		pnv_ioda_get_pe(pe);
 		pdn->pe_number = pe->pe_number;
 		pe->dma32_weight += pnv_ioda_dev_dma_weight(dev);
 		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
@@ -1301,10 +1472,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 			bus->busn_res.start, pe_num);
 
 	if (pnv_ioda_configure_pe(phb, pe)) {
-		/* XXX What do we do here ? */
-		if (pe_num)
-			pnv_ioda_free_pe(phb, pe_num);
-		pe->pbus = NULL;
+		pnv_ioda_release_pe(pe);
 		return NULL;
 	}
 
@@ -3403,6 +3571,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	 */
 	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
 	pnv_pci_controller_ops.enable_device_hook = pnv_pci_enable_device_hook;
+	pnv_pci_controller_ops.release_device = pnv_pci_release_device;
 	pnv_pci_controller_ops.window_alignment = pnv_pci_window_alignment;
 	pnv_pci_controller_ops.setup_bridge = pnv_pci_setup_bridge;
 	pnv_pci_controller_ops.reset_secondary_bus = pnv_pci_reset_secondary_bus;
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index bf63481..f68e036 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -30,6 +30,8 @@ struct pnv_phb;
 struct pnv_ioda_pe {
 	unsigned long		flags;
 	struct pnv_phb		*phb;
+	struct kref		kref;
+	bool			kref_init;
 
 	/* A PE can be associated with a single device or an
 	 * entire bus (& children). In the former case, pdev
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 24/42] powerpc/powernv: Release PEs dynamically
@ 2015-06-04  6:41     ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch adds refcount to PE, which counts number of PCI devices
included in the PE. When last device leaves from the PE, the PE
together with its consumed resources (IO, DMA, PELTM/PELTV) are
released, in order to support PCI hotplug.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 07/21]
---
 arch/powerpc/include/asm/pci-bridge.h     |   1 +
 arch/powerpc/kernel/pci-hotplug.c         |   5 +
 arch/powerpc/platforms/powernv/pci-ioda.c | 181 +++++++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci.h      |   2 +
 4 files changed, 183 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 1f39ca7..9a83cdb 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -26,6 +26,7 @@ struct pci_controller_ops {
 	/* Called when pci_enable_device() is called. Returns true to
 	 * allow assignment/enabling of the device. */
 	bool		(*enable_device_hook)(struct pci_dev *);
+	void		(*release_device)(struct pci_dev *);
 
 	/* Called during PCI resource reassignment */
 	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long);
diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 98f84ed..21973e7 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -29,6 +29,11 @@
  */
 void pcibios_release_device(struct pci_dev *dev)
 {
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+
+	if (hose->controller_ops.release_device)
+		hose->controller_ops.release_device(dev);
+
 	eeh_remove_device(dev);
 }
 
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2e31472..17ba55c 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -132,6 +132,50 @@ static inline bool pnv_pci_is_mem_pref_64(unsigned long flags)
 		(IORESOURCE_MEM_64 | IORESOURCE_PREFETCH));
 }
 
+static void pnv_pci_ioda_release_pe_dma(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+	struct iommu_table *tbl;
+	int seg;
+	int64_t rc;
+
+	/* No DMA32 segments allocated */
+	if (pe->dma32_seg < 0 ||
+	    pe->dma32_segcount <= 0)
+		return;
+
+	/* Unlink IOMMU table from group */
+	tbl = pe->table_group.tables[0];
+	pnv_pci_unlink_table_and_group(tbl, &pe->table_group);
+	if (pe->table_group.group) {
+		iommu_group_put(pe->table_group.group);
+		BUG_ON(pe->table_group.group);
+	}
+
+	/* Release IOMMU table */
+	free_pages(tbl->it_base,
+		get_order(TCE32_TABLE_SIZE * pe->dma32_segcount));
+	iommu_free_table(tbl,
+		of_node_full_name(pci_bus_to_OF_node(pe->pbus)));
+
+	/* Disable TVE */
+	for (seg = pe->dma32_seg;
+	     seg < pe->dma32_seg + pe->dma32_segcount;
+	     seg++) {
+		rc = opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
+				seg, 0, 0ul, 0ul, 0ul);
+		if (rc)
+			pe_warn(pe, "Error %ld unmapping DMA32 seg#%d\n",
+				rc, seg);
+	}
+
+	/* Free the DMA32 segments */
+	bitmap_clear(phb->ioda.dma32_segmap,
+		pe->dma32_seg, pe->dma32_segcount);
+	pe->dma32_seg = -1;
+	pe->dma32_segcount = 0;
+}
+
 static inline void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_ioda_pe *pe)
 {
 	/* 01xb - invalidate TCEs that match the specified PE# */
@@ -203,6 +247,10 @@ static void pnv_pci_ioda2_release_pe_dma(struct pnv_ioda_pe *pe)
 	struct device_node *dn;
 	int64_t rc;
 
+	if (pe->dma32_seg < 0 ||
+	    pe->dma32_segcount <= 0)
+		return;
+
 	tbl = pe->table_group.tables[0];
 	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
 	if (rc)
@@ -227,6 +275,61 @@ static void pnv_pci_ioda2_release_pe_dma(struct pnv_ioda_pe *pe)
 
 	pnv_pci_ioda2_table_free_pages(tbl);
 	iommu_free_table(tbl, of_node_full_name(dn));
+	pe->dma32_seg = -1;
+	pe->dma32_segcount = 0;
+}
+
+static void pnv_ioda_release_pe_dma(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+
+	if (phb->type == PNV_PHB_IODA1)
+		pnv_pci_ioda_release_pe_dma(pe);
+	else if (phb->type == PNV_PHB_IODA2)
+		pnv_pci_ioda2_release_pe_dma(pe);
+}
+
+static void pnv_ioda_release_pe_seg(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+	unsigned long *segmap = NULL;
+	unsigned long *pe_segmap = NULL;
+	uint16_t win;
+	int segno;
+
+	for (win = OPAL_M32_WINDOW_TYPE; win <= OPAL_IO_WINDOW_TYPE; win++) {
+		switch (win) {
+		case OPAL_IO_WINDOW_TYPE:
+			segmap = phb->ioda.io_segmap;
+			pe_segmap = pe->io_segmap;
+			break;
+		case OPAL_M32_WINDOW_TYPE:
+			segmap = phb->ioda.m32_segmap;
+			pe_segmap = pe->m32_segmap;
+			break;
+		case OPAL_M64_WINDOW_TYPE:
+			segmap = phb->ioda.m64_segmap;
+			pe_segmap = pe->m64_segmap;
+			break;
+		}
+
+		segno = -1;
+		while ((segno = find_next_bit(pe_segmap,
+			phb->ioda.total_pe, segno + 1))
+			< phb->ioda.total_pe) {
+			if (win == OPAL_IO_WINDOW_TYPE ||
+			    win == OPAL_M32_WINDOW_TYPE)
+				opal_pci_map_pe_mmio_window(phb->opal_id,
+					phb->ioda.reserved_pe, win, 0, segno);
+			else if (phb->type == PNV_PHB_IODA1)
+				opal_pci_map_pe_mmio_window(phb->opal_id,
+					phb->ioda.reserved_pe, win,
+					segno / 8, segno % 8);
+
+			clear_bit(segno, pe_segmap);
+			clear_bit(segno, segmap);
+		}
+	}
 }
 
 static int pnv_ioda_set_one_peltv(struct pnv_phb *phb,
@@ -333,7 +436,6 @@ static int pnv_ioda_set_peltv(struct pnv_phb *phb,
 	return 0;
 }
 
-#ifdef CONFIG_PCI_IOV
 static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 {
 	struct pci_dev *parent;
@@ -421,7 +523,74 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 
 	return 0;
 }
-#endif /* CONFIG_PCI_IOV */
+
+static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+	struct pnv_ioda_pe *tmp, *slave;
+
+	/* Release slave PEs in compound PE */
+	if (pe->flags & PNV_IODA_PE_MASTER) {
+		list_for_each_entry_safe(slave, tmp, &pe->slaves, list)
+			pnv_ioda_release_pe(pe);
+	}
+
+	/* Remove the PE from the list */
+	list_del(&pe->list);
+
+	/* Release resources */
+	pnv_ioda_release_pe_dma(pe);
+	pnv_ioda_release_pe_seg(pe);
+	pnv_ioda_deconfigure_pe(pe->phb, pe);
+
+	/* Release PE number */
+	clear_bit(pe->pe_number, phb->ioda.pe_alloc);
+}
+
+static void pnv_ioda_destroy_pe(struct kref *kref)
+{
+	struct pnv_ioda_pe *pe = container_of(kref, struct pnv_ioda_pe, kref);
+
+	pnv_ioda_release_pe(pe);
+}
+
+static inline struct pnv_ioda_pe *pnv_ioda_get_pe(struct pnv_ioda_pe *pe)
+{
+	if (!pe)
+		return NULL;
+
+	if (!pe->kref_init) {
+		pe->kref_init = true;
+		kref_init(&pe->kref);
+	} else {
+		kref_get(&pe->kref);
+	}
+
+	return pe;
+}
+
+static inline void pnv_ioda_put_pe(struct pnv_ioda_pe *pe)
+{
+	if (pe)
+		kref_put(&pe->kref, pnv_ioda_destroy_pe);
+}
+
+static void pnv_pci_release_device(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	struct pnv_ioda_pe *pe;
+
+	if (pdev->is_virtfn)
+		return;
+
+	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
+		return;
+
+	pe = &phb->ioda.pe_array[pdn->pe_number];
+	pnv_ioda_put_pe(pe);
+}
 
 static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int pe_no)
 {
@@ -429,6 +598,7 @@ static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int pe_no)
 
 	pe->phb = phb;
 	pe->pe_number = pe_no;
+	pe->kref_init = false;
 	INIT_LIST_HEAD(&pe->list);
 
 	return pe;
@@ -1233,6 +1403,7 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 		if (pdn->pe_number != IODA_INVALID_PE)
 			continue;
 
+		pnv_ioda_get_pe(pe);
 		pdn->pe_number = pe->pe_number;
 		pe->dma32_weight += pnv_ioda_dev_dma_weight(dev);
 		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
@@ -1301,10 +1472,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
 			bus->busn_res.start, pe_num);
 
 	if (pnv_ioda_configure_pe(phb, pe)) {
-		/* XXX What do we do here ? */
-		if (pe_num)
-			pnv_ioda_free_pe(phb, pe_num);
-		pe->pbus = NULL;
+		pnv_ioda_release_pe(pe);
 		return NULL;
 	}
 
@@ -3403,6 +3571,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	 */
 	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
 	pnv_pci_controller_ops.enable_device_hook = pnv_pci_enable_device_hook;
+	pnv_pci_controller_ops.release_device = pnv_pci_release_device;
 	pnv_pci_controller_ops.window_alignment = pnv_pci_window_alignment;
 	pnv_pci_controller_ops.setup_bridge = pnv_pci_setup_bridge;
 	pnv_pci_controller_ops.reset_secondary_bus = pnv_pci_reset_secondary_bus;
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index bf63481..f68e036 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -30,6 +30,8 @@ struct pnv_phb;
 struct pnv_ioda_pe {
 	unsigned long		flags;
 	struct pnv_phb		*phb;
+	struct kref		kref;
+	bool			kref_init;
 
 	/* A PE can be associated with a single device or an
 	 * entire bus (& children). In the former case, pdev
-- 
2.1.0



* [PATCH v5 25/42] powerpc/powernv: Supports slot ID
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (14 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 23/42] powerpc/powernv: Cleanup on pnv_pci_ioda2_release_dma_pe() Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 26/42] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The PowerNV platform runs on top of skiboot firmware, which has been
changed to support PCI slots. A PCI slot is identified either by the
PHB's OPAL ID (PHB slot) or by the combination of that ID and the PCI
slot ID. The patch renames the first argument of opal_pci_reset() and
opal_pci_poll() from phb_id to id, and adds a second argument (val)
to opal_pci_poll(), to reflect the firmware's change. For the same
reason, pnv_eeh_phb_poll() is renamed to pnv_eeh_poll() and takes the
ID instead of the PHB.
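
As an illustration of the two kinds of identifier, here is a minimal
sketch that follows the encoding used by pnv_eeh_bridge_reset() in
PATCH 26/42 of this series. The helper names and the SKETCH_ macro are
hypothetical and are not part of the patch. The 16-bit limit on the PHB
ID is an assumption made so that the fields don't overlap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: bit 60 marks the ID as a PCI slot ID */
#define SKETCH_PCI_SLOT_FLAG	(UINT64_C(1) << 60)

/* A PHB slot is identified by the PHB's OPAL ID alone */
static uint64_t sketch_phb_slot_id(uint64_t phb_opal_id)
{
	return phb_opal_id;
}

/* A PCI slot combines the flag, bus number, devfn and PHB OPAL ID */
static uint64_t sketch_pci_slot_id(uint64_t phb_opal_id,
				   uint8_t bus, uint8_t devfn)
{
	/* Assumption: the PHB ID stays below bit 16, where devfn starts */
	assert(phb_opal_id <= 0xffff);

	/* Widen before shifting so that bus numbers >= 0x80 stay positive */
	return SKETCH_PCI_SLOT_FLAG |
	       ((uint64_t)bus << 24) |
	       ((uint64_t)devfn << 16) |
	       phb_opal_id;
}
```

With these helpers, sketch_pci_slot_id(1, 0x80, 0x08) yields
0x1000000080080001, while sketch_phb_slot_id(1) is simply 1.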

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 09/21]
---
 arch/powerpc/include/asm/opal.h              | 4 ++--
 arch/powerpc/platforms/powernv/eeh-powernv.c | 8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 042af1a..6d467df 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -129,7 +129,7 @@ int64_t opal_pci_map_pe_dma_window(uint64_t phb_id, uint16_t pe_number, uint16_t
 int64_t opal_pci_map_pe_dma_window_real(uint64_t phb_id, uint16_t pe_number,
 					uint16_t dma_window_number, uint64_t pci_start_addr,
 					uint64_t pci_mem_size);
-int64_t opal_pci_reset(uint64_t phb_id, uint8_t reset_scope, uint8_t assert_state);
+int64_t opal_pci_reset(uint64_t id, uint8_t reset_scope, uint8_t assert_state);
 
 int64_t opal_pci_get_hub_diag_data(uint64_t hub_id, void *diag_buffer,
 				   uint64_t diag_buffer_len);
@@ -145,7 +145,7 @@ int64_t opal_get_epow_status(__be64 *status);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
 			    __be16 *pci_error_type, __be16 *severity);
-int64_t opal_pci_poll(uint64_t phb_id);
+int64_t opal_pci_poll(uint64_t id, uint8_t *val);
 int64_t opal_return_cpu(void);
 int64_t opal_check_token(uint64_t token);
 int64_t opal_reinit_cpus(uint64_t flags);
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index dfdb31f..4fd8f15 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -743,12 +743,12 @@ static int pnv_eeh_get_state(struct eeh_pe *pe, int *delay)
 	return ret;
 }
 
-static s64 pnv_eeh_phb_poll(struct pnv_phb *phb)
+static s64 pnv_eeh_poll(uint64_t id)
 {
 	s64 rc = OPAL_HARDWARE;
 
 	while (1) {
-		rc = opal_pci_poll(phb->opal_id);
+		rc = opal_pci_poll(id, NULL);
 		if (rc <= 0)
 			break;
 
@@ -788,7 +788,7 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int option)
 	 * reset followed by hot reset on root bus. So we also
 	 * need the PCI bus settlement delay.
 	 */
-	rc = pnv_eeh_phb_poll(phb);
+	rc = pnv_eeh_poll(phb->opal_id);
 	if (option == EEH_RESET_DEACTIVATE) {
 		if (system_state < SYSTEM_RUNNING)
 			udelay(1000 * EEH_PE_RST_SETTLE_TIME);
@@ -831,7 +831,7 @@ static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
 		goto out;
 
 	/* Poll state of the PHB until the request is done */
-	rc = pnv_eeh_phb_poll(phb);
+	rc = pnv_eeh_poll(phb->opal_id);
 	if (option == EEH_RESET_DEACTIVATE)
 		msleep(EEH_PE_RST_SETTLE_TIME);
 out:
-- 
2.1.0


* [PATCH v5 26/42] powerpc/powernv: Use PCI slot reset infrastructure
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (15 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 25/42] powerpc/powernv: Supports slot ID Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 27/42] powerpc/powernv: Simplify pnv_eeh_reset() Gavin Shan
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The skiboot firmware may advertise its ability to reset a PCI slot
through the "ibm,reset-by-firmware" property on the device node
associated with the PCI slot. The patch checks for the property and
routes the reset to firmware if the property exists. Otherwise, we
fall back to the old path as before.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 09/21]
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 44 +++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 4fd8f15..4feb533 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -841,7 +841,7 @@ out:
 	return 0;
 }
 
-static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
+static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 {
 	struct pci_dn *pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
@@ -892,6 +892,48 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	return 0;
 }
 
+static int pnv_eeh_bridge_reset(struct pci_dev *pdev, int option)
+{
+	struct pci_controller *hose;
+	struct pnv_phb *phb;
+	struct device_node *dn = pdev ? pci_device_to_OF_node(pdev) : NULL;
+	uint64_t id = (0x1ul << 60);
+	uint8_t scope;
+	int64_t rc;
+
+	/*
+	 * If the firmware can't handle it, we will issue hot reset
+	 * on the secondary bus despite the requested reset type.
+	 */
+	if (!dn || !of_get_property(dn, "ibm,reset-by-firmware", NULL))
+		return __pnv_eeh_bridge_reset(pdev, option);
+
+	/* The firmware can handle the request */
+	switch (option) {
+	case EEH_RESET_HOT:
+		scope = OPAL_RESET_PCI_HOT;
+		break;
+	case EEH_RESET_FUNDAMENTAL:
+		scope = OPAL_RESET_PCI_FUNDAMENTAL;
+		break;
+	case EEH_RESET_DEACTIVATE:
+		return 0;
+	default:
+		dev_warn(&pdev->dev, "%s: Unsupported reset %d\n",
+			 __func__, option);
+		return -EINVAL;
+	}
+
+	hose = pci_bus_to_host(pdev->bus);
+	phb = hose->private_data;
+	id |= (pdev->bus->number << 24) | (pdev->devfn << 16) | phb->opal_id;
+	rc = opal_pci_reset(id, scope, OPAL_ASSERT_RESET);
+	if (rc > 0)
+		rc = pnv_eeh_poll(id);
+
+	return (rc == OPAL_SUCCESS) ? 0 : -EIO;
+}
+
 static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, int pos,
 				     u16 mask, bool af_flr_rst)
 {
-- 
2.1.0


* [PATCH v5 27/42] powerpc/powernv: Simplify pnv_eeh_reset()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (16 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 26/42] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 28/42] powerpc/powernv: Don't cover root bus in pnv_pci_reset_secondary_bus() Gavin Shan
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch simplifies pnv_eeh_reset() by dropping an unnecessary nested
if statement. No logic is changed by the patch.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 09/21]
  * Fixed "quoted string split across lines" from checkpatch.pl
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 65 +++++++++++++---------------
 1 file changed, 31 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 4feb533..4669122 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1084,7 +1084,9 @@ void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 static int pnv_eeh_reset(struct eeh_pe *pe, int option)
 {
 	struct pci_controller *hose = pe->phb;
+	struct pnv_phb *phb = hose->private_data;
 	struct pci_bus *bus;
+	int64_t rc;
 	int ret;
 
 	/*
@@ -1101,44 +1103,39 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
 	 * reset. The side effect is that EEH core has to clear the frozen
 	 * state explicitly after BAR restore.
 	 */
-	if (pe->type & EEH_PE_PHB) {
-		ret = pnv_eeh_phb_reset(hose, option);
-	} else {
-		struct pnv_phb *phb;
-		s64 rc;
+	if (pe->type & EEH_PE_PHB)
+		return pnv_eeh_phb_reset(hose, option);
 
-		/*
-		 * The frozen PE might be caused by PAPR error injection
-		 * registers, which are expected to be cleared after hitting
-		 * frozen PE as stated in the hardware spec. Unfortunately,
-		 * that's not true on P7IOC. So we have to clear it manually
-		 * to avoid recursive EEH errors during recovery.
-		 */
-		phb = hose->private_data;
-		if (phb->model == PNV_PHB_MODEL_P7IOC &&
-		    (option == EEH_RESET_HOT ||
-		    option == EEH_RESET_FUNDAMENTAL)) {
-			rc = opal_pci_reset(phb->opal_id,
-					    OPAL_RESET_PHB_ERROR,
-					    OPAL_ASSERT_RESET);
-			if (rc != OPAL_SUCCESS) {
-				pr_warn("%s: Failure %lld clearing "
-					"error injection registers\n",
-					__func__, rc);
-				return -EIO;
-			}
+	/*
+	 * The frozen PE might be caused by PAPR error injection
+	 * registers, which are expected to be cleared after hitting
+	 * frozen PE as stated in the hardware spec. Unfortunately,
+	 * that's not true on P7IOC. So we have to clear it manually
+	 * to avoid recursive EEH errors during recovery.
+	 */
+	phb = hose->private_data;
+	if (phb->model == PNV_PHB_MODEL_P7IOC &&
+	    (option == EEH_RESET_HOT ||
+	    option == EEH_RESET_FUNDAMENTAL)) {
+		rc = opal_pci_reset(phb->opal_id,
+				    OPAL_RESET_PHB_ERROR,
+				    OPAL_ASSERT_RESET);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("%s: Error %lld clearing errinjct registers\n",
+				__func__, rc);
+			return -EIO;
 		}
-
-		bus = eeh_pe_bus_get(pe);
-		if (pe->type & EEH_PE_VF)
-			ret = pnv_eeh_vf_pe_reset(pe, option);
-		else if (pci_is_root_bus(bus) ||
-			pci_is_root_bus(bus->parent))
-			ret = pnv_eeh_root_reset(hose, option);
-		else
-			ret = pnv_eeh_bridge_reset(bus->self, option);
 	}
 
+	bus = eeh_pe_bus_get(pe);
+	if (pe->type & EEH_PE_VF)
+		ret = pnv_eeh_vf_pe_reset(pe, option);
+	else if (pci_is_root_bus(bus) ||
+		pci_is_root_bus(bus->parent))
+		ret = pnv_eeh_root_reset(hose, option);
+	else
+		ret = pnv_eeh_bridge_reset(bus->self, option);
+
 	return ret;
 }
 
-- 
2.1.0


* [PATCH v5 28/42] powerpc/powernv: Don't cover root bus in pnv_pci_reset_secondary_bus()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (17 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 27/42] powerpc/powernv: Simplify pnv_eeh_reset() Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 29/42] powerpc/powernv: Issue fundamental reset " Gavin Shan
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

There should be an upstream bridge for any PCI bus for which
pnv_pci_reset_secondary_bus() is called, so the function can never
be called for root buses. We therefore needn't reset root buses
in pnv_pci_reset_secondary_bus() and simply drop that logic.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 10/21]
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 4669122..18167c5 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1057,16 +1057,8 @@ static int pnv_eeh_vf_pe_reset(struct eeh_pe *pe, int option)
 
 void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 {
-	struct pci_controller *hose;
-
-	if (pci_is_root_bus(dev->bus)) {
-		hose = pci_bus_to_host(dev->bus);
-		pnv_eeh_root_reset(hose, EEH_RESET_HOT);
-		pnv_eeh_root_reset(hose, EEH_RESET_DEACTIVATE);
-	} else {
-		pnv_eeh_bridge_reset(dev, EEH_RESET_HOT);
-		pnv_eeh_bridge_reset(dev, EEH_RESET_DEACTIVATE);
-	}
+	pnv_eeh_bridge_reset(dev, EEH_RESET_HOT);
+	pnv_eeh_bridge_reset(dev, EEH_RESET_DEACTIVATE);
 }
 
 /**
-- 
2.1.0


* [PATCH v5 29/42] powerpc/powernv: Issue fundamental reset in pnv_pci_reset_secondary_bus()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (18 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 28/42] powerpc/powernv: Don't cover root bus in pnv_pci_reset_secondary_bus() Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 30/42] powerpc/pci: Don't scan empty slot Gavin Shan
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

There might be PCI devices under the specified PCI bus that ask
for fundamental reset. The patch iterates over all PCI devices under
the specified PCI bus and issues fundamental reset to the PCI bus
if any PCI device asks for it. Otherwise, hot reset is issued to
the PCI bus.
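
The selection logic can be sketched outside the kernel as follows. This
is only an illustration: the sketch_ types and functions are
hypothetical, and a flat array stands in for the devices that
pci_walk_bus() would visit. As with pci_walk_bus(), a non-zero return
from the callback stops the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_reset { SKETCH_RESET_HOT, SKETCH_RESET_FUNDAMENTAL };

/* Hypothetical stand-in for struct pci_dev */
struct sketch_dev {
	bool needs_freset;
};

typedef int (*sketch_walk_cb)(const struct sketch_dev *dev, void *data);

/* Callback: record the request and stop the walk once one is seen */
static int sketch_dev_reset_type(const struct sketch_dev *dev, void *data)
{
	int *freset = data;

	*freset |= dev->needs_freset;
	return *freset;
}

/* Stand-in for pci_walk_bus(): visit devices until cb returns non-zero */
static void sketch_walk(const struct sketch_dev *devs, size_t n,
			sketch_walk_cb cb, void *data)
{
	assert(cb != NULL);

	for (size_t i = 0; i < n; i++)
		if (cb(&devs[i], data))
			break;
}

/* Fundamental reset if any device asks for it, hot reset otherwise */
static enum sketch_reset sketch_pick_reset(const struct sketch_dev *devs,
					   size_t n)
{
	int freset = 0;

	sketch_walk(devs, n, sketch_dev_reset_type, &freset);
	return freset ? SKETCH_RESET_FUNDAMENTAL : SKETCH_RESET_HOT;
}
```

A bus with no devices, or with none requesting fundamental reset, gets
hot reset in this sketch, which matches the default described above.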

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 10/21]
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 18167c5..4eb53ed 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1055,9 +1055,32 @@ static int pnv_eeh_vf_pe_reset(struct eeh_pe *pe, int option)
 	return 0;
 }
 
+static int pnv_pci_dev_reset_type(struct pci_dev *pdev, void *data)
+{
+	int *freset = data;
+
+	/*
+	 * Stop the iteration immediately if there is any
+	 * one PCI device requesting fundamental reset
+	 */
+	*freset |= pdev->needs_freset;
+	return *freset;
+}
+
 void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 {
-	pnv_eeh_bridge_reset(dev, EEH_RESET_HOT);
+	int option = EEH_RESET_HOT;
+
+	if (dev->subordinate) {
+		int freset = 0;
+
+		pci_walk_bus(dev->subordinate,
+			     pnv_pci_dev_reset_type,
+			     &freset);
+		option = freset ? EEH_RESET_FUNDAMENTAL : EEH_RESET_HOT;
+	}
+
+	pnv_eeh_bridge_reset(dev, option);
 	pnv_eeh_bridge_reset(dev, EEH_RESET_DEACTIVATE);
 }
 
-- 
2.1.0


* [PATCH v5 30/42] powerpc/pci: Don't scan empty slot
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (19 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 29/42] powerpc/powernv: Issue fundamental reset " Gavin Shan
@ 2015-06-04  6:41 ` Gavin Shan
  2015-06-04  6:42 ` [PATCH v5 31/42] powerpc/pci: Move pcibios_find_pci_bus() around Gavin Shan
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:41 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

In the hotplug case, pcibios_add_pci_devices() is called to rescan
the specified PCI bus, which might not have any child devices.
Accessing the child device node of such a PCI bus invariably crashes
the kernel. The patch adds a condition to skip scanning a PCI bus
without child devices, in order to avoid the kernel crash.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 11/21]
---
 arch/powerpc/kernel/pci-hotplug.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 21973e7..ca392fc 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -92,7 +92,8 @@ void pcibios_add_pci_devices(struct pci_bus * bus)
 	if (mode == PCI_PROBE_DEVTREE) {
 		/* use ofdt-based probe */
 		of_rescan_bus(dn, bus);
-	} else if (mode == PCI_PROBE_NORMAL) {
+	} else if (mode == PCI_PROBE_NORMAL &&
+		   dn->child && PCI_DN(dn->child)) {
 		/*
 		 * Use legacy probe. In the partial hotplug case, we
 		 * probably have grandchildren devices unplugged. So
-- 
2.1.0


* [PATCH v5 31/42] powerpc/pci: Move pcibios_find_pci_bus() around
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (20 preceding siblings ...)
  2015-06-04  6:41 ` [PATCH v5 30/42] powerpc/pci: Don't scan empty slot Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
  2015-06-05 19:47   ` Bjorn Helgaas
  2015-06-04  6:42 ` [PATCH v5 32/42] powerpc/powernv: Introduce pnv_pci_poll() Gavin Shan
                   ` (7 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch moves pcibios_find_pci_bus() to the PPC kernel directory so
that it can be reused by the hotplug code for both the pSeries and
PowerNV platforms.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v5:
  * Derived from PATCH[v4 12/21]
---
 arch/powerpc/kernel/pci-hotplug.c          | 36 ++++++++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/pci_dlpar.c | 32 --------------------------
 2 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index ca392fc..1482bc1 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -21,6 +21,42 @@
 #include <asm/firmware.h>
 #include <asm/eeh.h>
 
+static struct pci_bus *find_pci_bus(struct pci_bus *bus,
+				    struct device_node *dn)
+{
+	struct pci_bus *tmp, *child = NULL;
+	struct device_node *busdn;
+
+	busdn = pci_bus_to_OF_node(bus);
+	if (busdn == dn)
+		return bus;
+
+	list_for_each_entry(tmp, &bus->children, node) {
+		child = find_pci_bus(tmp, dn);
+		if (child)
+			break;
+	}
+
+	return child;
+}
+
+/**
+ * pcibios_find_pci_bus - find PCI bus according to the given device node
+ * @dn: Device node
+ *
+ * Find the corresponding PCI bus according to the given device node.
+ */
+struct pci_bus *pcibios_find_pci_bus(struct device_node *dn)
+{
+	struct pci_dn *pdn = PCI_DN(dn);
+
+	if (!pdn  || !pdn->phb || !pdn->phb->bus)
+		return NULL;
+
+	return find_pci_bus(pdn->phb->bus, dn);
+}
+EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
+
 /**
  * pcibios_release_device - release PCI device
  * @dev: PCI device
diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c
index 5d4a3df..906dbaa 100644
--- a/arch/powerpc/platforms/pseries/pci_dlpar.c
+++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
@@ -34,38 +34,6 @@
 
 #include "pseries.h"
 
-static struct pci_bus *
-find_bus_among_children(struct pci_bus *bus,
-                        struct device_node *dn)
-{
-	struct pci_bus *child = NULL;
-	struct pci_bus *tmp;
-	struct device_node *busdn;
-
-	busdn = pci_bus_to_OF_node(bus);
-	if (busdn == dn)
-		return bus;
-
-	list_for_each_entry(tmp, &bus->children, node) {
-		child = find_bus_among_children(tmp, dn);
-		if (child)
-			break;
-	};
-	return child;
-}
-
-struct pci_bus *
-pcibios_find_pci_bus(struct device_node *dn)
-{
-	struct pci_dn *pdn = dn->data;
-
-	if (!pdn  || !pdn->phb || !pdn->phb->bus)
-		return NULL;
-
-	return find_bus_among_children(pdn->phb->bus, dn);
-}
-EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
-
 struct pci_controller *init_phb_dynamic(struct device_node *dn)
 {
 	struct pci_controller *phb;
-- 
2.1.0


* [PATCH v5 32/42] powerpc/powernv: Introduce pnv_pci_poll()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (21 preceding siblings ...)
  2015-06-04  6:42 ` [PATCH v5 31/42] powerpc/pci: Move pcibios_find_pci_bus() around Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
  2015-06-04  6:42 ` [PATCH v5 33/42] powerpc/powernv: Functions to get/reset PCI slot status Gavin Shan
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch replaces pnv_eeh_poll() with pnv_pci_poll():

   * The return value from the last OPAL API call is passed to
     pnv_pci_poll() and handled there.
   * More information (e.g. PCI slot power status) is retrieved
     if the last argument is valid.
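
The two points above can be sketched in user space as follows. This
mirrors the loop in pnv_pci_poll() from the diff below, but the sketch_
names, the function-pointer parameters and the fake firmware are
hypothetical and exist only so the flow can be exercised without OPAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_OPAL_SUCCESS	0

typedef int64_t (*sketch_poll_fn)(uint64_t id, uint8_t *val);
typedef void (*sketch_delay_fn)(int64_t ms);

/*
 * rval is the return value of the previous firmware call: a positive
 * value means "poll again after that many milliseconds", zero means
 * success and a negative value means failure.
 */
static int sketch_pci_poll(uint64_t id, int64_t rval, uint8_t *pval,
			   sketch_poll_fn poll, sketch_delay_fn delay)
{
	assert(poll != NULL && delay != NULL);

	while (rval > 0) {
		delay(rval);
		rval = poll(id, pval);
	}

	/* Fetch the extra information only if the caller asked for it */
	if (rval == SKETCH_OPAL_SUCCESS && pval)
		rval = poll(id, pval);

	return rval ? -EIO : 0;
}

/* Fake firmware: asks for a 10 ms retry twice, then reports success */
static int sketch_fw_calls;
static int64_t sketch_total_delay;

static int64_t sketch_fake_poll(uint64_t id, uint8_t *val)
{
	(void)id;
	if (val)
		*val = 1;
	return (++sketch_fw_calls < 3) ? 10 : SKETCH_OPAL_SUCCESS;
}

static void sketch_fake_delay(int64_t ms)
{
	sketch_total_delay += ms;
}
```

Passing the previous return value in lets a caller write
"rc = opal_pci_reset(...); return pnv_pci_poll(id, rc, NULL);" without
checking rc first, which is how the call sites in the diff use it.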

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 13/21]
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 46 ++++++----------------------
 arch/powerpc/platforms/powernv/pci.c         | 21 +++++++++++++
 arch/powerpc/platforms/powernv/pci.h         |  1 +
 3 files changed, 31 insertions(+), 37 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 4eb53ed..7ee328b 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -743,28 +743,11 @@ static int pnv_eeh_get_state(struct eeh_pe *pe, int *delay)
 	return ret;
 }
 
-static s64 pnv_eeh_poll(uint64_t id)
-{
-	s64 rc = OPAL_HARDWARE;
-
-	while (1) {
-		rc = opal_pci_poll(id, NULL);
-		if (rc <= 0)
-			break;
-
-		if (system_state < SYSTEM_RUNNING)
-			udelay(1000 * rc);
-		else
-			msleep(rc);
-	}
-
-	return rc;
-}
-
 int pnv_eeh_phb_reset(struct pci_controller *hose, int option)
 {
 	struct pnv_phb *phb = hose->private_data;
 	s64 rc = OPAL_HARDWARE;
+	int ret;
 
 	pr_debug("%s: Reset PHB#%x, option=%d\n",
 		 __func__, hose->global_number, option);
@@ -779,8 +762,6 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int option)
 		rc = opal_pci_reset(phb->opal_id,
 				    OPAL_RESET_PHB_COMPLETE,
 				    OPAL_DEASSERT_RESET);
-	if (rc < 0)
-		goto out;
 
 	/*
 	 * Poll state of the PHB until the request is done
@@ -788,24 +769,22 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int option)
 	 * reset followed by hot reset on root bus. So we also
 	 * need the PCI bus settlement delay.
 	 */
-	rc = pnv_eeh_poll(phb->opal_id);
-	if (option == EEH_RESET_DEACTIVATE) {
+	ret = pnv_pci_poll(phb->opal_id, rc, NULL);
+	if (option == EEH_RESET_DEACTIVATE && !ret) {
 		if (system_state < SYSTEM_RUNNING)
 			udelay(1000 * EEH_PE_RST_SETTLE_TIME);
 		else
 			msleep(EEH_PE_RST_SETTLE_TIME);
 	}
-out:
-	if (rc != OPAL_SUCCESS)
-		return -EIO;
 
-	return 0;
+	return ret;
 }
 
 static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
 {
 	struct pnv_phb *phb = hose->private_data;
 	s64 rc = OPAL_HARDWARE;
+	int ret;
 
 	pr_debug("%s: Reset PHB#%x, option=%d\n",
 		 __func__, hose->global_number, option);
@@ -827,18 +806,13 @@ static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
 		rc = opal_pci_reset(phb->opal_id,
 				    OPAL_RESET_PCI_HOT,
 				    OPAL_DEASSERT_RESET);
-	if (rc < 0)
-		goto out;
 
 	/* Poll state of the PHB until the request is done */
-	rc = pnv_eeh_poll(phb->opal_id);
-	if (option == EEH_RESET_DEACTIVATE)
+	ret = pnv_pci_poll(phb->opal_id, rc, NULL);
+	if (option == EEH_RESET_DEACTIVATE && !ret)
 		msleep(EEH_PE_RST_SETTLE_TIME);
-out:
-	if (rc != OPAL_SUCCESS)
-		return -EIO;
 
-	return 0;
+	return ret;
 }
 
 static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
@@ -928,10 +902,8 @@ static int pnv_eeh_bridge_reset(struct pci_dev *pdev, int option)
 	phb = hose->private_data;
 	id |= (pdev->bus->number << 24) | (pdev->devfn << 16) | phb->opal_id;
 	rc = opal_pci_reset(id, scope, OPAL_ASSERT_RESET);
-	if (rc > 0)
-		rc = pnv_eeh_poll(id);
 
-	return (rc == OPAL_SUCCESS) ? 0 : -EIO;
+	return pnv_pci_poll(id, rc, NULL);
 }
 
 static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, int pos,
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 678eb24..bf5df04 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -44,6 +44,27 @@
 #define cfg_dbg(fmt...)	do { } while(0)
 //#define cfg_dbg(fmt...)	printk(fmt)
 
+int pnv_pci_poll(uint64_t id, int64_t rval, uint8_t *pval)
+{
+	while (rval > 0) {
+		if (system_state < SYSTEM_RUNNING)
+			udelay(1000 * rval);
+		else
+			msleep(rval);
+
+		rval = opal_pci_poll(id, pval);
+	}
+
+	/*
+	 * The caller expects to retrieve additional information
+	 * if the last argument is valid.
+	 */
+	if (rval == OPAL_SUCCESS && pval)
+		rval = opal_pci_poll(id, pval);
+
+	return rval ? -EIO : 0;
+}
+
 #ifdef CONFIG_PCI_MSI
 static int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 {
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index f68e036..510e781 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -214,6 +214,7 @@ extern int pnv_tce_xchg(struct iommu_table *tbl, long index,
 		unsigned long *hpa, enum dma_data_direction *direction);
 extern unsigned long pnv_tce_get(struct iommu_table *tbl, long index);
 
+int pnv_pci_poll(uint64_t id, int64_t rval, uint8_t *pval);
 void pnv_pci_dump_phb_diag_data(struct pci_controller *hose,
 				unsigned char *log_buff);
 int pnv_pci_cfg_read(struct pci_dn *pdn,
-- 
2.1.0


* [PATCH v5 33/42] powerpc/powernv: Functions to get/reset PCI slot status
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (22 preceding siblings ...)
  2015-06-04  6:42 ` [PATCH v5 32/42] powerpc/powernv: Introduce pnv_pci_poll() Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
  2015-06-04  6:42 ` [PATCH v5 35/42] powerpc/pci: Create eeh_dev while creating pci_dn Gavin Shan
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch exports 4 functions, which are based on the corresponding
OPAL APIs, to get or set PCI slot status. Those functions are going to
be used by the PCI hotplug module in subsequent patches:

   pnv_pci_get_overlay_dt()       opal_get_overlay_dt()
   pnv_pci_get_presence_status()  opal_pci_get_presence_status()
   pnv_pci_get_power_status()     opal_pci_get_power_status()
   pnv_pci_set_power_status()     opal_pci_set_power_status()

Besides, the patch also exports pnv_pci_hotplug_notifier_{register,
unregister}() to allow registration and unregistration of a PCI hotplug
notifier, which will be used to receive PCI hotplug messages from
skiboot firmware.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 14/21]
  * No polling required for pnv_pci_get_presence_status()
  * Separate functions for registration and unregistration of PCI
    hotplug notifier
  * int64_t for value returned from OPAL API
---
 arch/powerpc/include/asm/opal-api.h            |  8 +++-
 arch/powerpc/include/asm/opal.h                |  4 ++
 arch/powerpc/include/asm/pnv-pci.h             |  7 +++
 arch/powerpc/platforms/powernv/opal-wrappers.S |  4 ++
 arch/powerpc/platforms/powernv/pci.c           | 66 ++++++++++++++++++++++++++
 5 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 0321a90..c534dd8 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -153,7 +153,11 @@
 #define OPAL_FLASH_READ				110
 #define OPAL_FLASH_WRITE			111
 #define OPAL_FLASH_ERASE			112
-#define OPAL_LAST				112
+#define OPAL_GET_OVERLAY_DT			116
+#define OPAL_PCI_GET_PRESENCE_STATUS		117
+#define OPAL_PCI_GET_POWER_STATUS		118
+#define OPAL_PCI_SET_POWER_STATUS		119
+#define OPAL_LAST				119
 
 /* Device tree flags */
 
@@ -352,6 +356,8 @@ enum opal_msg_type {
 	OPAL_MSG_SHUTDOWN,		/* params[0] = 1 reboot, 0 shutdown */
 	OPAL_MSG_HMI_EVT,
 	OPAL_MSG_DPO,
+	OPAL_MSG_PRD,
+	OPAL_MSG_PCI_HOTPLUG,
 	OPAL_MSG_TYPE_MAX,
 };
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 6d467df..2d1c825 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -200,6 +200,10 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf,
 		uint64_t size, uint64_t token);
 int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size,
 		uint64_t token);
+int64_t opal_get_overlay_dt(uint64_t *counter, void *buf, uint64_t len);
+int64_t opal_pci_get_presence_status(uint64_t id, uint8_t *status);
+int64_t opal_pci_get_power_status(uint64_t id, uint8_t *status);
+int64_t opal_pci_set_power_status(uint64_t id, uint8_t status);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
index f9b4982..9f63375 100644
--- a/arch/powerpc/include/asm/pnv-pci.h
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -13,6 +13,13 @@
 #include <linux/pci.h>
 #include <misc/cxl.h>
 
+extern int pnv_pci_get_overlay_dt(uint64_t *counter, void *buf, uint64_t len);
+extern int pnv_pci_get_presence_status(uint64_t id, uint8_t *status);
+extern int pnv_pci_get_power_status(uint64_t id, uint8_t *status);
+extern int pnv_pci_set_power_status(uint64_t id, uint8_t status);
+extern int pnv_pci_hotplug_notifier_register(struct notifier_block *nb);
+extern int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb);
+
 int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode);
 int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 			   unsigned int virq);
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index a7ade94..1d87c30 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -295,3 +295,7 @@ OPAL_CALL(opal_i2c_request,			OPAL_I2C_REQUEST);
 OPAL_CALL(opal_flash_read,			OPAL_FLASH_READ);
 OPAL_CALL(opal_flash_write,			OPAL_FLASH_WRITE);
 OPAL_CALL(opal_flash_erase,			OPAL_FLASH_ERASE);
+OPAL_CALL(opal_get_overlay_dt,			OPAL_GET_OVERLAY_DT);
+OPAL_CALL(opal_pci_get_presence_status,		OPAL_PCI_GET_PRESENCE_STATUS);
+OPAL_CALL(opal_pci_get_power_status,		OPAL_PCI_GET_POWER_STATUS);
+OPAL_CALL(opal_pci_set_power_status,		OPAL_PCI_SET_POWER_STATUS);
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index bf5df04..c332ea7 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -65,6 +65,72 @@ int pnv_pci_poll(uint64_t id, int64_t rval, uint8_t *pval)
 	return rval ? -EIO : 0;
 }
 
+int pnv_pci_get_overlay_dt(uint64_t *counter, void *buf, uint64_t len)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_GET_OVERLAY_DT))
+		return -ENXIO;
+
+	rc = opal_get_overlay_dt(counter, buf, len);
+	if (rc != OPAL_SUCCESS)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pnv_pci_get_overlay_dt);
+
+int pnv_pci_get_presence_status(uint64_t id, uint8_t *status)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_PCI_GET_PRESENCE_STATUS))
+		return -ENXIO;
+
+	rc = opal_pci_get_presence_status(id, status);
+	if (rc != OPAL_SUCCESS)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pnv_pci_get_presence_status);
+
+int pnv_pci_get_power_status(uint64_t id, uint8_t *status)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_PCI_GET_POWER_STATUS))
+		return -ENXIO;
+
+	rc = opal_pci_get_power_status(id, status);
+	return pnv_pci_poll(id, rc, status);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_get_power_status);
+
+int pnv_pci_set_power_status(uint64_t id, uint8_t status)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_PCI_SET_POWER_STATUS))
+		return -ENXIO;
+
+	rc = opal_pci_set_power_status(id, status);
+	return pnv_pci_poll(id, rc, NULL);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_set_power_status);
+
+int pnv_pci_hotplug_notifier_register(struct notifier_block *nb)
+{
+	return opal_message_notifier_register(OPAL_MSG_PCI_HOTPLUG, nb);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_hotplug_notifier_register);
+
+int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb)
+{
+	return opal_message_notifier_unregister(OPAL_MSG_PCI_HOTPLUG, nb);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_hotplug_notifier_unregister);
+
 #ifdef CONFIG_PCI_MSI
 static int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 34/42] powerpc/pci: Delay creating pci_dn
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:42     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The pci_dn instances are allocated from memblock or bootmem when the
PCI controllers (hoses) are created in setup_arch(). PCI hotplug,
which will be supported by subsequent patches, releases PCI device
nodes and their corresponding pci_dn instances on unplug events.
Memory chunks allocated from memblock or bootmem are hard to reuse
after being released.

This patch delays creating the pci_dn instances so that they can be
allocated from slab, which allows their memory to be reused after
release. The creation of eeh_dev instances, which depends on pci_dn,
is delayed as well: pci_devs_phb_init() now runs as a core_initcall()
and eeh_dev_phb_init() moves to core_initcall_sync().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 15/21]
  * Dropped unrelated changes moving pci_dev_pdn_setup() around
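
Note on ordering: the change relies on core_initcall() entries running
before core_initcall_sync() entries (initcall levels "1" and "1s"), so
the pci_dn pass completes before the EEH pass. Below is a minimal
userspace model of that ordering, not kernel code. All model_* names
are hypothetical, and the sorted table is an assumption made for
illustration; the kernel orders initcalls through linker sections.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only. Lower value runs first, like levels "1" and "1s". */
enum model_level { MODEL_CORE = 0, MODEL_CORE_SYNC = 1 };

struct model_initcall {
	enum model_level level;
	int (*fn)(void);
};

static int model_pci_dn_ready;		/* set by the pci_dn pass */
static int model_eeh_saw_pci_dn;	/* what the EEH pass observed */

/* Stands in for pci_devs_phb_init(): creates the pci_dn instances. */
static int model_pci_devs_phb_init(void)
{
	model_pci_dn_ready = 1;
	return 0;
}

/* Stands in for eeh_dev_phb_init(): needs the pci_dn instances. */
static int model_eeh_dev_phb_init(void)
{
	model_eeh_saw_pci_dn = model_pci_dn_ready;
	return 0;
}

static int model_cmp_level(const void *a, const void *b)
{
	const struct model_initcall *x = a, *y = b;

	return (int)x->level - (int)y->level;
}

/* Run the calls in level order; return 1 if EEH found pci_dn ready. */
int model_run_initcalls(void)
{
	/* Deliberately registered in the "wrong" order. */
	struct model_initcall calls[] = {
		{ MODEL_CORE_SYNC, model_eeh_dev_phb_init },
		{ MODEL_CORE, model_pci_devs_phb_init },
	};
	size_t i, n = sizeof(calls) / sizeof(calls[0]);

	model_pci_dn_ready = 0;
	model_eeh_saw_pci_dn = 0;
	qsort(calls, n, sizeof(calls[0]), model_cmp_level);
	for (i = 0; i < n; i++)
		calls[i].fn();

	return model_eeh_saw_pci_dn;
}
```

Calling model_run_initcalls() returns 1 because the level sort puts the
pci_dn pass ahead of the EEH pass regardless of registration order.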
---
 arch/powerpc/include/asm/ppc-pci.h     |  1 -
 arch/powerpc/kernel/eeh_dev.c          |  2 +-
 arch/powerpc/kernel/pci_dn.c           |  8 +++++--
 arch/powerpc/platforms/maple/pci.c     | 35 ++++++++++++++++++------------
 arch/powerpc/platforms/pasemi/pci.c    |  3 ---
 arch/powerpc/platforms/powermac/pci.c  | 39 +++++++++++++++++++++-------------
 arch/powerpc/platforms/powernv/pci.c   |  3 ---
 arch/powerpc/platforms/pseries/setup.c |  1 -
 8 files changed, 52 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 4122a86..7388316 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -40,7 +40,6 @@ void *traverse_pci_dn(struct pci_dn *root,
 		      void *(*fn)(struct pci_dn *, void *),
 		      void *data);
 
-extern void pci_devs_phb_init(void);
 extern void pci_devs_phb_init_dynamic(struct pci_controller *phb);
 
 /* From rtas_pci.h */
diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
index aabba94..f33ce5b 100644
--- a/arch/powerpc/kernel/eeh_dev.c
+++ b/arch/powerpc/kernel/eeh_dev.c
@@ -110,4 +110,4 @@ static int __init eeh_dev_phb_init(void)
 	return 0;
 }
 
-core_initcall(eeh_dev_phb_init);
+core_initcall_sync(eeh_dev_phb_init);
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index 0469247..35554c2 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -288,7 +288,7 @@ void *update_dn_pci_info(struct device_node *dn, void *data)
 	struct device_node *parent;
 	struct pci_dn *pdn;
 
-	pdn = zalloc_maybe_bootmem(sizeof(*pdn), GFP_KERNEL);
+	pdn = kzalloc(sizeof(*pdn), GFP_KERNEL);
 	if (pdn == NULL)
 		return NULL;
 	dn->data = pdn;
@@ -462,15 +462,19 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb)
  * pci device found underneath.  This routine runs once,
  * early in the boot sequence.
  */
-void __init pci_devs_phb_init(void)
+static int __init pci_devs_phb_init(void)
 {
 	struct pci_controller *phb, *tmp;
 
 	/* This must be done first so the device nodes have valid pci info! */
 	list_for_each_entry_safe(phb, tmp, &hose_list, list_node)
 		pci_devs_phb_init_dynamic(phb);
+
+	return 0;
 }
 
+core_initcall(pci_devs_phb_init);
+
 static void pci_dev_pdn_setup(struct pci_dev *pdev)
 {
 	struct pci_dn *pdn;
diff --git a/arch/powerpc/platforms/maple/pci.c b/arch/powerpc/platforms/maple/pci.c
index a923230..04a69a8 100644
--- a/arch/powerpc/platforms/maple/pci.c
+++ b/arch/powerpc/platforms/maple/pci.c
@@ -568,6 +568,26 @@ void maple_pci_irq_fixup(struct pci_dev *dev)
 	DBG(" <- maple_pci_irq_fixup\n");
 }
 
+static int maple_pci_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+	struct pci_controller *hose = pci_bus_to_host(bridge->bus);
+	struct device_node *np, *child;
+
+	if (hose != u3_agp)
+		return 0;
+
+	/* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We
+	 * assume there is no P2P bridge on the AGP bus, which should be a
+	 * safe assumptions hopefully.
+	 */
+	np = hose->dn;
+	PCI_DN(np)->busno = 0xf0;
+	for_each_child_of_node(np, child)
+		PCI_DN(child)->busno = 0xf0;
+
+	return 0;
+}
+
 void __init maple_pci_init(void)
 {
 	struct device_node *np, *root;
@@ -605,20 +625,7 @@ void __init maple_pci_init(void)
 	if (ht && maple_add_bridge(ht) != 0)
 		of_node_put(ht);
 
-	/* Setup the linkage between OF nodes and PHBs */ 
-	pci_devs_phb_init();
-
-	/* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We
-	 * assume there is no P2P bridge on the AGP bus, which should be a
-	 * safe assumptions hopefully.
-	 */
-	if (u3_agp) {
-		struct device_node *np = u3_agp->dn;
-		PCI_DN(np)->busno = 0xf0;
-		for (np = np->child; np; np = np->sibling)
-			PCI_DN(np)->busno = 0xf0;
-	}
-
+	ppc_md.pcibios_root_bridge_prepare = maple_pci_root_bridge_prepare;
 	/* Tell pci.c to not change any resource allocations.  */
 	pci_add_flags(PCI_PROBE_ONLY);
 }
diff --git a/arch/powerpc/platforms/pasemi/pci.c b/arch/powerpc/platforms/pasemi/pci.c
index f3a68a0..10c4e8f 100644
--- a/arch/powerpc/platforms/pasemi/pci.c
+++ b/arch/powerpc/platforms/pasemi/pci.c
@@ -229,9 +229,6 @@ void __init pas_pci_init(void)
 			of_node_get(np);
 
 	of_node_put(root);
-
-	/* Setup the linkage between OF nodes and PHBs */
-	pci_devs_phb_init();
 }
 
 void __iomem *pasemi_pci_getcfgaddr(struct pci_dev *dev, int offset)
diff --git a/arch/powerpc/platforms/powermac/pci.c b/arch/powerpc/platforms/powermac/pci.c
index 59ab16f..368716f 100644
--- a/arch/powerpc/platforms/powermac/pci.c
+++ b/arch/powerpc/platforms/powermac/pci.c
@@ -878,6 +878,29 @@ void pmac_pci_irq_fixup(struct pci_dev *dev)
 #endif /* CONFIG_PPC32 */
 }
 
+#ifdef CONFIG_PPC64
+static int pmac_pci_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+	struct pci_controller *hose = pci_bus_to_host(bridge->bus);
+	struct device_node *np, *child;
+
+	if (hose != u3_agp)
+		return 0;
+
+	/* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We
+	 * assume there is no P2P bridge on the AGP bus, which should be a
+	 * safe assumptions for now. We should do something better in the
+	 * future though
+	 */
+	np = hose->dn;
+	PCI_DN(np)->busno = 0xf0;
+	for_each_child_of_node(np, child)
+		PCI_DN(child)->busno = 0xf0;
+
+	return 0;
+}
+#endif /* CONFIG_PPC64 */
+
 void __init pmac_pci_init(void)
 {
 	struct device_node *np, *root;
@@ -914,22 +937,8 @@ void __init pmac_pci_init(void)
 	if (ht && pmac_add_bridge(ht) != 0)
 		of_node_put(ht);
 
-	/* Setup the linkage between OF nodes and PHBs */
-	pci_devs_phb_init();
-
-	/* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We
-	 * assume there is no P2P bridge on the AGP bus, which should be a
-	 * safe assumptions for now. We should do something better in the
-	 * future though
-	 */
-	if (u3_agp) {
-		struct device_node *np = u3_agp->dn;
-		PCI_DN(np)->busno = 0xf0;
-		for (np = np->child; np; np = np->sibling)
-			PCI_DN(np)->busno = 0xf0;
-	}
 	/* pmac_check_ht_link(); */
-
+	ppc_md.pcibios_root_bridge_prepare = pmac_pci_root_bridge_prepare;
 #else /* CONFIG_PPC64 */
 	init_p2pbridge();
 	init_second_ohare();
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index c332ea7..9fd1c0d 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -949,9 +949,6 @@ void __init pnv_pci_init(void)
 	for_each_compatible_node(np, NULL, "ibm,ioda2-phb")
 		pnv_pci_init_ioda2_phb(np);
 
-	/* Setup the linkage between OF nodes and PHBs */
-	pci_devs_phb_init();
-
 	/* Configure IOMMU DMA hooks */
 	set_pci_dma_ops(&dma_iommu_ops);
 
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index df6a704..5f80758 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -482,7 +482,6 @@ static void __init find_and_init_phbs(void)
 	}
 
 	of_node_put(root);
-	pci_devs_phb_init();
 
 	/*
 	 * PCI_PROBE_ONLY and PCI_REASSIGN_ALL_BUS can be set via properties
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 35/42] powerpc/pci: Create eeh_dev while creating pci_dn
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (23 preceding siblings ...)
  2015-06-04  6:42 ` [PATCH v5 33/42] powerpc/powernv: Functions to get/reset PCI slot status Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
  2015-06-04  6:42 ` [PATCH v5 37/42] powerpc/pci: Update bridge windows on PCI plugging Gavin Shan
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

An eeh_dev is always created from a pci_dn, but so far in a separate
pass run from core_initcall_sync(). This patch creates the eeh_dev
when its pci_dn is created, so the two share the same life cycle.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 16/21]
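
Note on the shared life cycle: a minimal userspace sketch of the
"create both or neither" pattern, not the kernel implementation. All
model_* names are hypothetical, calloc()/free() stand in for
kzalloc()/kfree(), and the removal helper is included only to complete
the picture. As an assumption of this sketch, the owner pointer is
published only after both allocations succeed, so a failed second
allocation leaves no stale pointer behind.

```c
#include <assert.h>
#include <stdlib.h>

struct model_pdn;

struct model_edev {
	struct model_pdn *pdn;	/* back pointer to the owner */
};

struct model_pdn {
	struct model_edev *edev;
};

struct model_node {
	struct model_pdn *data;	/* plays the role of dn->data */
};

/*
 * Create both objects or neither. fail_edev simulates an allocation
 * failure for the second object.
 */
struct model_pdn *model_add_info(struct model_node *node, int fail_edev)
{
	struct model_pdn *pdn = calloc(1, sizeof(*pdn));

	if (!pdn)
		return NULL;

	pdn->edev = fail_edev ? NULL : calloc(1, sizeof(*pdn->edev));
	if (!pdn->edev) {
		free(pdn);
		return NULL;
	}

	pdn->edev->pdn = pdn;
	node->data = pdn;	/* publish only after both allocations worked */
	return pdn;
}

/* Release both objects together and clear the node's pointer. */
void model_remove_info(struct model_node *node)
{
	struct model_pdn *pdn = node ? node->data : NULL;

	if (!pdn)
		return;

	free(pdn->edev);
	node->data = NULL;
	free(pdn);
}

/* Walk through failure, success and removal; 0 means all checks held. */
int model_lifecycle_check(void)
{
	struct model_node node = { NULL };

	if (model_add_info(&node, 1) || node.data)
		return -1;
	if (!model_add_info(&node, 0) || !node.data || !node.data->edev)
		return -2;
	if (node.data->edev->pdn != node.data)
		return -3;
	model_remove_info(&node);
	return node.data ? -4 : 0;
}
```

model_lifecycle_check() returns 0 when the failed creation leaves the
node untouched and removal frees both objects.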
---
 arch/powerpc/include/asm/eeh.h         |  6 ++++--
 arch/powerpc/kernel/eeh_dev.c          | 18 ++++--------------
 arch/powerpc/kernel/pci_dn.c           | 12 ++++++++++++
 arch/powerpc/platforms/pseries/setup.c |  6 +-----
 4 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index ea1f13c4..c0236a6 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -272,7 +272,8 @@ void eeh_pe_restore_bars(struct eeh_pe *pe);
 const char *eeh_pe_loc_get(struct eeh_pe *pe);
 struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
 
-void *eeh_dev_init(struct pci_dn *pdn, void *data);
+struct eeh_dev *eeh_dev_init(struct pci_dn *pdn,
+			     struct pci_controller *phb);
 void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
 int eeh_init(void);
 int __init eeh_ops_register(struct eeh_ops *ops);
@@ -325,7 +326,8 @@ static inline int eeh_init(void)
 	return 0;
 }
 
-static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
+static inline struct eeh_dev *eeh_dev_init(struct pci_dn *pdn,
+					   struct pci_controller *phb)
 {
 	return NULL;
 }
diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
index f33ce5b..7486932 100644
--- a/arch/powerpc/kernel/eeh_dev.c
+++ b/arch/powerpc/kernel/eeh_dev.c
@@ -44,14 +44,14 @@
 /**
  * eeh_dev_init - Create EEH device according to OF node
  * @pdn: PCI device node
- * @data: PHB
+ * @phb: PCI controller
  *
  * It will create EEH device according to the given OF node. The function
  * might be called by PCI emunation, DR, PHB hotplug.
  */
-void *eeh_dev_init(struct pci_dn *pdn, void *data)
+struct eeh_dev *eeh_dev_init(struct pci_dn *pdn,
+			     struct pci_controller *phb)
 {
-	struct pci_controller *phb = data;
 	struct eeh_dev *edev;
 
 	/* Allocate EEH device */
@@ -68,7 +68,7 @@ void *eeh_dev_init(struct pci_dn *pdn, void *data)
 	edev->phb = phb;
 	INIT_LIST_HEAD(&edev->list);
 
-	return NULL;
+	return edev;
 }
 
 /**
@@ -80,16 +80,8 @@ void *eeh_dev_init(struct pci_dn *pdn, void *data)
  */
 void eeh_dev_phb_init_dynamic(struct pci_controller *phb)
 {
-	struct pci_dn *root = phb->pci_data;
-
 	/* EEH PE for PHB */
 	eeh_phb_pe_create(phb);
-
-	/* EEH device for PHB */
-	eeh_dev_init(root, phb);
-
-	/* EEH devices for children OF nodes */
-	traverse_pci_dn(root, eeh_dev_init, phb);
 }
 
 /**
@@ -105,8 +97,6 @@ static int __init eeh_dev_phb_init(void)
 	list_for_each_entry_safe(phb, tmp, &hose_list, list_node)
 		eeh_dev_phb_init_dynamic(phb);
 
-	pr_info("EEH: devices created\n");
-
 	return 0;
 }
 
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index 35554c2..d4330d2 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -287,6 +287,9 @@ void *update_dn_pci_info(struct device_node *dn, void *data)
 	const __be32 *regs;
 	struct device_node *parent;
 	struct pci_dn *pdn;
+#ifdef CONFIG_EEH
+	struct eeh_dev *edev;
+#endif
 
 	pdn = kzalloc(sizeof(*pdn), GFP_KERNEL);
 	if (pdn == NULL)
@@ -317,6 +320,15 @@ void *update_dn_pci_info(struct device_node *dn, void *data)
 	/* Extended config space */
 	pdn->pci_ext_config_space = (type && of_read_number(type, 1) == 1);
 
+	/* Initialize EEH device */
+#ifdef CONFIG_EEH
+	edev = eeh_dev_init(pdn, phb);
+	if (!edev) {
+		kfree(pdn);
+		return NULL;
+	}
+#endif
+
 	/* Attach to parent node */
 	INIT_LIST_HEAD(&pdn->child_list);
 	INIT_LIST_HEAD(&pdn->list);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 5f80758..92974aa 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -261,12 +261,8 @@ static int pci_dn_reconfig_notifier(struct notifier_block *nb, unsigned long act
 	switch (action) {
 	case OF_RECONFIG_ATTACH_NODE:
 		pci = np->parent->data;
-		if (pci) {
+		if (pci)
 			update_dn_pci_info(np, pci->phb);
-
-			/* Create EEH device for the OF node */
-			eeh_dev_init(PCI_DN(np), pci->phb);
-		}
 		break;
 	default:
 		err = NOTIFY_DONE;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 36/42] powerpc/pci: Export traverse_pci_device_nodes()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:42     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

This patch exports the following functions, which are derived from
existing implementations, so that the PCI hotplug logic can reuse
them to add or remove the pci_dn of every device node under a
specified PCI slot.

   Exported function               Derived from
   traverse_pci_device_nodes()     traverse_pci_devices()
   add_pci_device_node_info()      update_dn_pci_info()
   remove_pci_device_node_info()   newly added

The patch also releases the eeh_dev when its corresponding pci_dn
is released, so the two share the same life cycle.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 17/21]
  * Fixed "assignment in if condition" from checkpatch.pl
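
Note on the traversal: a minimal userspace sketch of the depth-first
walk that traverse_pci_device_nodes() performs, for readers who want
to see the visiting order. It is a model under stated assumptions, not
the kernel code: struct model_node and every model_* name are
hypothetical, the is_bridge flag replaces the "class-code" property
check, and the callback takes a plain void pointer.

```c
#include <assert.h>
#include <stddef.h>

struct model_node {
	int id;
	int is_bridge;	/* stands in for the PCI bridge class check */
	struct model_node *parent, *child, *sibling;
};

typedef void *(*model_visit_fn)(struct model_node *dn, void *data);

/*
 * Visit every node below start, depth first. Children are entered
 * only through bridges. A non-NULL callback result stops the walk.
 */
void *model_traverse(struct model_node *start, model_visit_fn fn, void *data)
{
	struct model_node *dn, *next;
	void *ret;

	for (dn = start->child; dn; dn = next) {
		ret = fn ? fn(dn, data) : NULL;
		if (ret)
			return ret;

		if (dn->child && dn->is_bridge) {
			next = dn->child;
		} else if (dn->sibling) {
			next = dn->sibling;
		} else {
			/* Walk up to the next ancestor with a sibling. */
			do {
				dn = dn->parent;
				if (dn == start)
					return NULL;
			} while (!dn->sibling);
			next = dn->sibling;
		}
	}

	return NULL;
}

int model_visited[8];
int model_visited_count;

/* Callback that records the visiting order and never stops the walk. */
static void *model_record(struct model_node *dn, void *data)
{
	(void)data;
	model_visited[model_visited_count++] = dn->id;
	return NULL;
}

/* Build root -> {1 (bridge) -> {2, 3}, 4 (no bridge) -> {5}, 6} and walk it. */
int model_demo(void)
{
	struct model_node root = { 0 }, a = { 0 }, a1 = { 0 }, a2 = { 0 };
	struct model_node b = { 0 }, b1 = { 0 }, c = { 0 };

	root.child = &a;
	a.id = 1; a.is_bridge = 1; a.parent = &root; a.child = &a1; a.sibling = &b;
	a1.id = 2; a1.parent = &a; a1.sibling = &a2;
	a2.id = 3; a2.parent = &a;
	b.id = 4; b.parent = &root; b.child = &b1; b.sibling = &c;
	b1.id = 5; b1.parent = &b;
	c.id = 6; c.parent = &root;

	model_visited_count = 0;
	model_traverse(&root, model_record, NULL);
	return model_visited_count;
}
```

In this model, model_demo() visits nodes 1, 2, 3, 4 and 6. Node 5 is
skipped because its parent is not flagged as a bridge.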
---
 arch/powerpc/include/asm/pci-bridge.h  |  4 +-
 arch/powerpc/include/asm/ppc-pci.h     |  7 ++--
 arch/powerpc/kernel/pci_dn.c           | 71 ++++++++++++++++++++++++++++------
 arch/powerpc/platforms/pseries/msi.c   |  4 +-
 arch/powerpc/platforms/pseries/setup.c |  2 +-
 5 files changed, 70 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 9a83cdb..d0b4b1a 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -222,7 +222,9 @@ extern struct pci_dn *pci_get_pdn_by_devfn(struct pci_bus *bus,
 extern struct pci_dn *pci_get_pdn(struct pci_dev *pdev);
 extern struct pci_dn *add_dev_pci_data(struct pci_dev *pdev);
 extern void remove_dev_pci_data(struct pci_dev *pdev);
-extern void *update_dn_pci_info(struct device_node *dn, void *data);
+extern void *add_pci_device_node_info(struct device_node *dn,
+				      struct pci_controller *phb);
+extern void remove_pci_device_node_info(struct device_node *dn);
 
 static inline int pci_device_from_OF_node(struct device_node *np,
 					  u8 *bus, u8 *devfn)
diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 7388316..a5b0ea0 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -33,9 +33,10 @@ extern struct pci_dev *isa_bridge_pcidev;	/* may be NULL if no ISA bus */
 struct device_node;
 struct pci_dn;
 
-typedef void *(*traverse_func)(struct device_node *me, void *data);
-void *traverse_pci_devices(struct device_node *start, traverse_func pre,
-		void *data);
+void *traverse_pci_device_nodes(struct device_node *start,
+				void *(*fn)(struct device_node *,
+					    struct pci_controller *),
+				void *data);
 void *traverse_pci_dn(struct pci_dn *root,
 		      void *(*fn)(struct pci_dn *, void *),
 		      void *data);
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index d4330d2..f821e96 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -276,13 +276,17 @@ void remove_dev_pci_data(struct pci_dev *pdev)
 #endif /* CONFIG_PCI_IOV */
 }
 
-/*
- * Traverse_func that inits the PCI fields of the device node.
- * NOTE: this *must* be done before read/write config to the device.
+/**
+ * add_pci_device_node_info - Add pci_dn for PCI device node
+ * @dn: PCI device node
+ * @phb: PHB
+ *
+ * Add pci_dn for the indicated PCI device node. The newly created
+ * pci_dn will be put into the child list of the parent device node.
  */
-void *update_dn_pci_info(struct device_node *dn, void *data)
+void *add_pci_device_node_info(struct device_node *dn,
+			       struct pci_controller *phb)
 {
-	struct pci_controller *phb = data;
 	const __be32 *type = of_get_property(dn, "ibm,pci-config-space-type", NULL);
 	const __be32 *regs;
 	struct device_node *parent;
@@ -339,8 +343,48 @@ void *update_dn_pci_info(struct device_node *dn, void *data)
 
 	return NULL;
 }
+EXPORT_SYMBOL(add_pci_device_node_info);
 
-/*
+/**
+ * remove_pci_device_node_info - Remove pci_dn from PCI device node
+ * @np: PCI device node
+ *
+ * Remove pci_dn from PCI device node. The pci_dn is also removed
+ * from the child list of the parent pci_dn.
+ */
+void remove_pci_device_node_info(struct device_node *np)
+{
+	struct pci_dn *pdn = np ? PCI_DN(np) : NULL;
+#ifdef CONFIG_EEH
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+#endif
+
+	if (!pdn)
+		return;
+
+#ifdef CONFIG_EEH
+	if (edev) {
+		pdn->edev = NULL;
+		kfree(edev);
+	}
+#endif
+
+	BUG_ON(!list_empty(&pdn->child_list));
+	list_del(&pdn->list);
+	if (pdn->parent)
+		of_node_put(pdn->parent->node);
+
+	np->data = NULL;
+	kfree(pdn);
+}
+EXPORT_SYMBOL(remove_pci_device_node_info);
+
+/**
+ * traverse_pci_device_nodes - Traverse children of indicated device node
+ * @start: indicated device node
+ * @fn: callback
+ * @data: additional parameter to the callback
+ *
  * Traverse a device tree stopping each PCI device in the tree.
  * This is done depth first.  As each node is processed, a "pre"
  * function is called and the children are processed recursively.
@@ -358,8 +402,10 @@ void *update_dn_pci_info(struct device_node *dn, void *data)
  * one of these nodes we also assume its siblings are non-pci for
  * performance.
  */
-void *traverse_pci_devices(struct device_node *start, traverse_func pre,
-		void *data)
+void *traverse_pci_device_nodes(struct device_node *start,
+				void *(*fn)(struct device_node *,
+					    struct pci_controller *phb),
+				void *data)
 {
 	struct device_node *dn, *nextdn;
 	void *ret;
@@ -374,7 +420,8 @@ void *traverse_pci_devices(struct device_node *start, traverse_func pre,
 		if (classp)
 			class = of_read_number(classp, 1);
 
-		if (pre && ((ret = pre(dn, data)) != NULL))
+		ret = fn ? fn(dn, data) : NULL;
+		if (ret != NULL)
 			return ret;
 
 		/* If we are a PCI bridge, go down */
@@ -395,8 +442,10 @@ void *traverse_pci_devices(struct device_node *start, traverse_func pre,
 			nextdn = dn->sibling;
 		}
 	}
+
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(traverse_pci_device_nodes);
 
 static struct pci_dn *pci_dn_next_one(struct pci_dn *root,
 				      struct pci_dn *pdn)
@@ -452,7 +501,7 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb)
 	struct pci_dn *pdn;
 
 	/* PHB nodes themselves must not match */
-	update_dn_pci_info(dn, phb);
+	add_pci_device_node_info(dn, phb);
 	pdn = dn->data;
 	if (pdn) {
 		pdn->devfn = pdn->busno = -1;
@@ -462,7 +511,7 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb)
 	}
 
 	/* Update dn->phb ptrs for new phb and children devices */
-	traverse_pci_devices(dn, update_dn_pci_info, phb);
+	traverse_pci_device_nodes(dn, add_pci_device_node_info, phb);
 }
 
 /** 
diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index c8d24f9..9ebbd19 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -303,7 +303,7 @@ static int msi_quota_for_device(struct pci_dev *dev, int request)
 	memset(&counts, 0, sizeof(struct msi_counts));
 
 	/* Work out how many devices we have below this PE */
-	traverse_pci_devices(pe_dn, count_non_bridge_devices, &counts);
+	traverse_pci_device_nodes(pe_dn, count_non_bridge_devices, &counts);
 
 	if (counts.num_devices == 0) {
 		pr_err("rtas_msi: found 0 devices under PE for %s\n",
@@ -318,7 +318,7 @@ static int msi_quota_for_device(struct pci_dev *dev, int request)
 	/* else, we have some more calculating to do */
 	counts.requestor = pci_device_to_OF_node(dev);
 	counts.request = request;
-	traverse_pci_devices(pe_dn, count_spare_msis, &counts);
+	traverse_pci_device_nodes(pe_dn, count_spare_msis, &counts);
 
 	/* If the quota isn't an integer multiple of the total, we can
 	 * use the remainder as spare MSIs for anyone that wants them. */
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 92974aa..ed8c894 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -262,7 +262,7 @@ static int pci_dn_reconfig_notifier(struct notifier_block *nb, unsigned long act
 	case OF_RECONFIG_ATTACH_NODE:
 		pci = np->parent->data;
 		if (pci)
-			update_dn_pci_info(np, pci->phb);
+			add_pci_device_node_info(np, pci->phb);
 		break;
 	default:
 		err = NOTIFY_DONE;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 37/42] powerpc/pci: Update bridge windows on PCI plugging
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (24 preceding siblings ...)
  2015-06-04  6:42 ` [PATCH v5 35/42] powerpc/pci: Create eeh_dev while creating pci_dn Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
  2015-06-04  6:42 ` [PATCH v5 38/42] powerpc/powernv: Select OF_OVERLAY Gavin Shan
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

During a PCI plugging event, the PCI devices are rescanned and
their IO and MMIO resources are reassigned. However, the PowerNV
platform assigns the PE# only when the windows of the bridge of the
PE's primary bus are updated.

The patch updates the windows of the bridge of the PE's primary bus
if we have a valid bridge. Otherwise, we assume it's a root bus or an
SRIOV virtual bus, and no PE is assigned during PCI plugging time.
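
The choice described above can be modelled in user space. This is only
a sketch under stated assumptions: struct model_dev, struct model_bus,
enum assign_path and pick_assign_path() are hypothetical names that do
not exist in the kernel; they stand in for bus->self and for the two
pci_assign_unassigned_*() calls used in the diff below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct pci_dev and struct pci_bus. */
struct model_dev {
	int id;
};

struct model_bus {
	/* Upstream bridge; NULL models a root bus or an SRIOV virtual bus. */
	struct model_dev *self;
};

enum assign_path {
	ASSIGN_NONE,	/* probe-only: leave the firmware assignments alone */
	ASSIGN_BRIDGE,	/* reassign through the bridge, updating its windows */
	ASSIGN_BUS	/* no bridge: assign resources on the bus only */
};

/* Pick the resource assignment path for a newly plugged bus. */
static enum assign_path pick_assign_path(const struct model_bus *bus,
					 int probe_only)
{
	assert(bus != NULL);

	if (probe_only)
		return ASSIGN_NONE;

	return bus->self ? ASSIGN_BRIDGE : ASSIGN_BUS;
}
```

Only the bridge path touches the bridge windows, which is the event
the PE# assignment depends on.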

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 18/21]
---
 arch/powerpc/kernel/pci-common.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0358f24..811eb4d 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1471,8 +1471,12 @@ void pcibios_finish_adding_to_bus(struct pci_bus *bus)
 	/* Allocate bus and devices resources */
 	pcibios_allocate_bus_resources(bus);
 	pcibios_claim_one_bus(bus);
-	if (!pci_has_flag(PCI_PROBE_ONLY))
-		pci_assign_unassigned_bus_resources(bus);
+	if (!pci_has_flag(PCI_PROBE_ONLY)) {
+		if (bus->self)
+			pci_assign_unassigned_bridge_resources(bus->self);
+		else
+			pci_assign_unassigned_bus_resources(bus);
+	}
 
 	/* Fixup EEH */
 	eeh_add_device_tree_late(bus);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 38/42] powerpc/powernv: Select OF_OVERLAY
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (25 preceding siblings ...)
  2015-06-04  6:42 ` [PATCH v5 37/42] powerpc/pci: Update bridge windows on PCI plugging Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
  2015-06-04  6:42 ` [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree() Gavin Shan
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

On the PowerNV platform, device tree nodes are changed dynamically on
PCI hotplug events with the help of the overlay mechanism. The patch
selects CONFIG_OF_OVERLAY on the PowerNV platform to support that.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Derived from PATCH[v4 20/21]
  * Enables OF_OVERLAY instead of OF_DYNAMIC
---
 arch/powerpc/platforms/powernv/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 4b044d8..97d481b 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -18,4 +18,5 @@ config PPC_POWERNV
 	select CPU_FREQ_GOV_ONDEMAND
 	select CPU_FREQ_GOV_CONSERVATIVE
 	select PPC_DOORBELL
+	select OF_OVERLAY
 	default y
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 39/42] drivers/of: Unflatten nodes equal or deeper than specified level
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
@ 2015-06-04  6:42     ` Gavin Shan
  2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
                       ` (28 subsequent siblings)
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

unflatten_dt_node() is called recursively to unflatten FDT nodes
on the assumption that the FDT blob has only one root node, which
isn't true when the FDT blob represents a device sub-tree. The
patch extends the function to support device sub-trees that have
multiple root nodes:

   * Rename the original unflatten_dt_node() to __unflatten_dt_node().
   * The wrapper unflatten_dt_node() calls __unflatten_dt_node() with
     the current node depth set to 1 to avoid underflow.
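
The depth handling can be modelled in user space. This is only a
sketch: struct flat_node, link_nodes() and the parent[] array are
hypothetical names, and the depth is stored with each record instead
of being reported by fdt_next_node() as in the patch. A node at the
same depth becomes a sibling (same parent), a deeper node becomes a
child, and a shallower node ends the current scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat record: a node name plus its depth in the blob. */
struct flat_node {
	const char *name;
	int depth;
};

/*
 * Consume nodes starting at *pos and record each node's parent index
 * in parent[]. "dad" is the parent index of the node at *pos, or -1
 * when it has no parent inside this blob.
 */
static void link_nodes(const struct flat_node *nodes, size_t count,
		       size_t *pos, int dad, int *parent)
{
	size_t me;
	int my_depth;

	assert(*pos < count);
	me = *pos;
	my_depth = nodes[me].depth;
	parent[me] = dad;
	(*pos)++;

	while (*pos < count) {
		int next_depth = nodes[*pos].depth;

		if (next_depth < my_depth)
			break;	/* belongs to an ancestor's scope */

		if (next_depth == my_depth)
			/* Sibling: shares this node's parent. */
			link_nodes(nodes, count, pos, dad, parent);
		else
			/* Deeper: child of this node. */
			link_nodes(nodes, count, pos, (int)me, parent);
	}
}
```

With two records at the top depth, both end up with parent -1, which
is the multiple-root case the single-root assumption cannot handle.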

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Split from PATCH[v4 19/21]
  * Fixed "line over 80 characters" from checkpatch.pl
---
 drivers/of/fdt.c | 56 ++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index cde35c5d01..b87c157 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -28,6 +28,8 @@
 #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
 #include <asm/page.h>
 
+static int cur_node_depth;
+
 /*
  * of_fdt_limit_memory - limit the number of regions in the /memory node
  * @limit: maximum entries
@@ -161,27 +163,26 @@ static void *unflatten_dt_alloc(void **mem, unsigned long size,
 }
 
 /**
- * unflatten_dt_node - Alloc and populate a device_node from the flat tree
+ * __unflatten_dt_node - Alloc and populate a device_node from the flat tree
  * @blob: The parent device tree blob
  * @mem: Memory chunk to use for allocating device nodes and properties
  * @p: pointer to node in flat tree
  * @dad: Parent struct device_node
  * @fpsize: Size of the node path up at the current depth.
  */
-static void * unflatten_dt_node(void *blob,
-				void *mem,
-				int *poffset,
-				struct device_node *dad,
-				struct device_node **nodepp,
-				unsigned long fpsize,
-				bool dryrun)
+static void *__unflatten_dt_node(void *blob,
+				 void *mem,
+				 int *poffset,
+				 struct device_node *dad,
+				 struct device_node **nodepp,
+				 unsigned long fpsize,
+				 bool dryrun)
 {
 	const __be32 *p;
 	struct device_node *np;
 	struct property *pp, **prev_pp = NULL;
 	const char *pathp;
 	unsigned int l, allocl;
-	static int depth = 0;
 	int old_depth;
 	int offset;
 	int has_name = 0;
@@ -334,13 +335,19 @@ static void * unflatten_dt_node(void *blob,
 			np->type = "<NULL>";
 	}
 
-	old_depth = depth;
-	*poffset = fdt_next_node(blob, *poffset, &depth);
-	if (depth < 0)
-		depth = 0;
-	while (*poffset > 0 && depth > old_depth)
-		mem = unflatten_dt_node(blob, mem, poffset, np, NULL,
-					fpsize, dryrun);
+	old_depth = cur_node_depth;
+	*poffset = fdt_next_node(blob, *poffset, &cur_node_depth);
+	while (*poffset > 0) {
+		if (cur_node_depth < old_depth)
+			break;
+
+		if (cur_node_depth == old_depth)
+			mem = __unflatten_dt_node(blob, mem, poffset,
+						  dad, NULL, fpsize, dryrun);
+		else if (cur_node_depth > old_depth)
+			mem = __unflatten_dt_node(blob, mem, poffset,
+						  np, NULL, fpsize, dryrun);
+	}
 
 	if (*poffset < 0 && *poffset != -FDT_ERR_NOTFOUND)
 		pr_err("unflatten: error %d processing FDT\n", *poffset);
@@ -366,6 +373,18 @@ static void * unflatten_dt_node(void *blob,
 	return mem;
 }
 
+static void *unflatten_dt_node(void *blob,
+			       void *mem,
+			       int *poffset,
+			       struct device_node *dad,
+			       struct device_node **nodepp,
+			       bool dryrun)
+{
+	cur_node_depth = 1;
+	return __unflatten_dt_node(blob, mem, poffset,
+				   dad, nodepp, 0, dryrun);
+}
+
 /**
  * __unflatten_device_tree - create tree of device_nodes from flat blob
  *
@@ -405,7 +424,8 @@ static void __unflatten_device_tree(void *blob,
 
 	/* First pass, scan for size */
 	start = 0;
-	size = (unsigned long)unflatten_dt_node(blob, NULL, &start, NULL, NULL, 0, true);
+	size = (unsigned long)unflatten_dt_node(blob, NULL, &start,
+						NULL, NULL, true);
 	size = ALIGN(size, 4);
 
 	pr_debug("  size is %lx, allocating...\n", size);
@@ -420,7 +440,7 @@ static void __unflatten_device_tree(void *blob,
 
 	/* Second pass, do actual unflattening */
 	start = 0;
-	unflatten_dt_node(blob, mem, &start, NULL, mynodes, 0, false);
+	unflatten_dt_node(blob, mem, &start, NULL, mynodes, false);
 	if (be32_to_cpup(mem + size) != 0xdeadbeef)
 		pr_warning("End of tree marker overwritten: %08x\n",
 			   be32_to_cpup(mem + size));
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (26 preceding siblings ...)
  2015-06-04  6:42 ` [PATCH v5 38/42] powerpc/powernv: Select OF_OVERLAY Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
  2015-06-04 22:10   ` Rob Herring
       [not found]   ` <1433400131-18429-41-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2015-06-04  6:42 ` [PATCH v5 41/42] drivers/of: Return allocated memory chunk from of_fdt_unflatten_tree() Gavin Shan
  2015-06-04  6:42 ` [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
  29 siblings, 2 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch adds an argument to of_fdt_unflatten_tree() that specifies
the root node under which the FDT blob is unflattened. As a result,
subsequent patches can use the function to unflatten an FDT blob
that represents a device sub-tree.
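
Why the parent matters for path sizing can be shown with a small
user-space model. This is only a sketch: prefix_len() and
build_child_path() are hypothetical helpers, and the exact size
accounting inside __unflatten_dt_node() is not reproduced here. The
point is that a node unflattened under an existing parent starts with
the parent's full name as its path prefix, while a node without a
parent starts from zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: path bytes contributed by the parent, 0 if none. */
static size_t prefix_len(const char *parent_path)
{
	return parent_path ? strlen(parent_path) : 0;
}

/*
 * Build "<parent>/<name>" into buf. Returns the number of bytes needed,
 * including the terminating NUL. Nothing is written when buf is too small.
 */
static size_t build_child_path(char *buf, size_t buflen,
			       const char *parent_path, const char *name)
{
	/* prefix + '/' + name + NUL */
	size_t need = prefix_len(parent_path) + 1 + strlen(name) + 1;

	assert(buf != NULL);
	if (need > buflen)
		return need;

	snprintf(buf, buflen, "%s/%s", parent_path ? parent_path : "", name);
	return need;
}
```

Passing NULL as the parent models the existing callers, which keep the
old behaviour of starting with an empty prefix.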

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Newly introduced
---
 drivers/of/fdt.c       | 26 ++++++++++++++++++--------
 drivers/of/unittest.c  |  2 +-
 include/linux/of_fdt.h |  3 ++-
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index b87c157..b6a6c59 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -380,9 +380,16 @@ static void *unflatten_dt_node(void *blob,
 			       struct device_node **nodepp,
 			       bool dryrun)
 {
+	unsigned long fpsize = 0;
+
+	if (dad)
+		fpsize = strlen(of_node_full_name(dad));
+	else
+		fpsize = 0;
+
 	cur_node_depth = 1;
 	return __unflatten_dt_node(blob, mem, poffset,
-				   dad, nodepp, 0, dryrun);
+				   dad, nodepp, fpsize, dryrun);
 }
 
 /**
@@ -393,13 +400,15 @@ static void *unflatten_dt_node(void *blob,
  * pointers of the nodes so the normal device-tree walking functions
  * can be used.
  * @blob: The blob to expand
+ * @dad: The root node of the created device_node tree
  * @mynodes: The device_node tree created by the call
  * @dt_alloc: An allocator that provides a virtual address to memory
  * for the resulting tree
  */
 static void __unflatten_device_tree(void *blob,
-			     struct device_node **mynodes,
-			     void * (*dt_alloc)(u64 size, u64 align))
+				    struct device_node *dad,
+				    struct device_node **mynodes,
+				    void * (*dt_alloc)(u64 size, u64 align))
 {
 	unsigned long size;
 	int start;
@@ -425,7 +434,7 @@ static void __unflatten_device_tree(void *blob,
 	/* First pass, scan for size */
 	start = 0;
 	size = (unsigned long)unflatten_dt_node(blob, NULL, &start,
-						NULL, NULL, true);
+						dad, NULL, true);
 	size = ALIGN(size, 4);
 
 	pr_debug("  size is %lx, allocating...\n", size);
@@ -440,7 +449,7 @@ static void __unflatten_device_tree(void *blob,
 
 	/* Second pass, do actual unflattening */
 	start = 0;
-	unflatten_dt_node(blob, mem, &start, NULL, mynodes, false);
+	unflatten_dt_node(blob, mem, &start, dad, mynodes, false);
 	if (be32_to_cpup(mem + size) != 0xdeadbeef)
 		pr_warning("End of tree marker overwritten: %08x\n",
 			   be32_to_cpup(mem + size));
@@ -462,9 +471,10 @@ static void *kernel_tree_alloc(u64 size, u64 align)
  * can be used.
  */
 void of_fdt_unflatten_tree(unsigned long *blob,
-			struct device_node **mynodes)
+			   struct device_node *dad,
+			   struct device_node **mynodes)
 {
-	__unflatten_device_tree(blob, mynodes, &kernel_tree_alloc);
+	__unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
 }
 EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
 
@@ -1095,7 +1105,7 @@ bool __init early_init_dt_scan(void *params)
  */
 void __init unflatten_device_tree(void)
 {
-	__unflatten_device_tree(initial_boot_params, &of_root,
+	__unflatten_device_tree(initial_boot_params, NULL, &of_root,
 				early_init_dt_alloc_memory_arch);
 
 	/* Get pointer to "/chosen" and "/aliases" nodes for use everywhere */
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 1801634..2270830 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -907,7 +907,7 @@ static int __init unittest_data_add(void)
 			"not running tests\n", __func__);
 		return -ENOMEM;
 	}
-	of_fdt_unflatten_tree(unittest_data, &unittest_data_node);
+	of_fdt_unflatten_tree(unittest_data, NULL, &unittest_data_node);
 	if (!unittest_data_node) {
 		pr_warn("%s: No tree to attach; not running tests\n", __func__);
 		return -ENODATA;
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index 587ee50..8882640 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -38,7 +38,8 @@ extern bool of_fdt_is_big_endian(const void *blob,
 extern int of_fdt_match(const void *blob, unsigned long node,
 			const char *const *compat);
 extern void of_fdt_unflatten_tree(unsigned long *blob,
-			       struct device_node **mynodes);
+				  struct device_node *dad,
+				  struct device_node **mynodes);
 
 /* TBD: Temporary export of fdt globals - remove when code fully merged */
 extern int __initdata dt_root_addr_cells;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 41/42] drivers/of: Return allocated memory chunk from of_fdt_unflatten_tree()
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (27 preceding siblings ...)
  2015-06-04  6:42 ` [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree() Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
  2015-06-04  6:42 ` [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
  29 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

The patch changes of_fdt_unflatten_tree() so that it returns the
memory chunk allocated for the unflattened device tree, so that the
caller can release it once the tree is no longer needed.
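
The ownership pattern can be illustrated in user space. This is only
a sketch: struct model_node and build_tree() are hypothetical, and
calloc()/free() stand in for whatever allocator the caller supplies.
The function reports the root through an output pointer and returns
the backing chunk, so the caller holds the pointer it needs to free
later.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type living inside the allocated chunk. */
struct model_node {
	char name[16];
};

/*
 * Build a one-node "tree". *root receives the root node and the return
 * value is the chunk that backs it, or NULL on failure.
 */
static void *build_tree(const char *root_name, struct model_node **root)
{
	struct model_node *chunk;

	assert(root != NULL);
	*root = NULL;

	if (!root_name)
		return NULL;	/* mirrors the early "no blob" return */

	chunk = calloc(1, sizeof(*chunk));
	if (!chunk)
		return NULL;

	/* calloc() zeroed the buffer, so the name stays NUL-terminated. */
	strncpy(chunk->name, root_name, sizeof(chunk->name) - 1);
	*root = chunk;
	return chunk;
}
```

A caller keeps the returned pointer next to the root node and frees it
when the tree is torn down. Without the return value the chunk could
not be released.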

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Newly introduced
---
 drivers/of/fdt.c       | 21 +++++++++++----------
 include/linux/of_fdt.h |  6 +++---
 2 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index b6a6c59..a954279 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -405,10 +405,10 @@ static void *unflatten_dt_node(void *blob,
  * @dt_alloc: An allocator that provides a virtual address to memory
  * for the resulting tree
  */
-static void __unflatten_device_tree(void *blob,
-				    struct device_node *dad,
-				    struct device_node **mynodes,
-				    void * (*dt_alloc)(u64 size, u64 align))
+static void *__unflatten_device_tree(void *blob,
+				     struct device_node *dad,
+				     struct device_node **mynodes,
+				     void * (*dt_alloc)(u64 size, u64 align))
 {
 	unsigned long size;
 	int start;
@@ -418,7 +418,7 @@ static void __unflatten_device_tree(void *blob,
 
 	if (!blob) {
 		pr_debug("No device tree pointer\n");
-		return;
+		return NULL;
 	}
 
 	pr_debug("Unflattening device tree:\n");
@@ -428,7 +428,7 @@ static void __unflatten_device_tree(void *blob,
 
 	if (fdt_check_header(blob)) {
 		pr_err("Invalid device tree blob header\n");
-		return;
+		return NULL;
 	}
 
 	/* First pass, scan for size */
@@ -455,6 +455,7 @@ static void __unflatten_device_tree(void *blob,
 			   be32_to_cpup(mem + size));
 
 	pr_debug(" <- unflatten_device_tree()\n");
+	return mem;
 }
 
 static void *kernel_tree_alloc(u64 size, u64 align)
@@ -470,11 +471,11 @@ static void *kernel_tree_alloc(u64 size, u64 align)
  * pointers of the nodes so the normal device-tree walking functions
  * can be used.
  */
-void of_fdt_unflatten_tree(unsigned long *blob,
-			   struct device_node *dad,
-			   struct device_node **mynodes)
+void *of_fdt_unflatten_tree(unsigned long *blob,
+			    struct device_node *dad,
+			    struct device_node **mynodes)
 {
-	__unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
+	return __unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
 }
 EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
 
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index 8882640..8a38c6a 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -37,9 +37,9 @@ extern bool of_fdt_is_big_endian(const void *blob,
 				 unsigned long node);
 extern int of_fdt_match(const void *blob, unsigned long node,
 			const char *const *compat);
-extern void of_fdt_unflatten_tree(unsigned long *blob,
-				  struct device_node *dad,
-				  struct device_node **mynodes);
+extern void *of_fdt_unflatten_tree(unsigned long *blob,
+				   struct device_node *dad,
+				   struct device_node **mynodes);
 
 /* TBD: Temporary export of fdt globals - remove when code fully merged */
 extern int __initdata dt_root_addr_cells;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
                   ` (28 preceding siblings ...)
  2015-06-04  6:42 ` [PATCH v5 41/42] drivers/of: Return allocated memory chunk from of_fdt_unflatten_tree() Gavin Shan
@ 2015-06-04  6:42 ` Gavin Shan
       [not found]   ` <1433400131-18429-43-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2015-06-30 18:18     ` Grant Likely
  29 siblings, 2 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-04  6:42 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	grant.likely, Gavin Shan

This patch adds a standalone driver to support PCI hotplug on the
PowerPC PowerNV platform, which runs on top of skiboot firmware.
The firmware identifies hotpluggable slots and marks their device-tree
nodes with the "ibm,slot-pluggable" and "ibm,reset-by-firmware"
properties. The driver simply scans the device tree and creates and
registers PCI hotplug slots accordingly.
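
The detection rule described above can be sketched in plain userspace
C. The types and helpers here (struct mock_prop, mock_read_cell(),
mock_prop_is_set(), mock_slot_is_hotpluggable()) are hypothetical and
only mimic what of_get_property() and of_read_number() do in the
driver: a node counts as a hotpluggable slot only when both properties
exist and their first cell is non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified property: a name plus one big-endian cell */
struct mock_prop {
	const char *name;
	uint8_t cell[4];
};

/* Decode one big-endian 32-bit cell */
uint32_t mock_read_cell(const uint8_t *cell)
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
}

/* Return 1 if the named property exists and its first cell is non-zero */
int mock_prop_is_set(const struct mock_prop *props, size_t n,
		     const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(props[i].name, name))
			return mock_read_cell(props[i].cell) != 0;
	return 0;
}

/* A slot qualifies only when both firmware markers are present and set */
int mock_slot_is_hotpluggable(const struct mock_prop *props, size_t n)
{
	return mock_prop_is_set(props, n, "ibm,slot-pluggable") &&
	       mock_prop_is_set(props, n, "ibm,reset-by-firmware");
}
```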

If the skiboot firmware doesn't support slot status retrieval, the PCI
slot device node shouldn't have the "ibm,reset-by-firmware" property. In
that case, no valid PCI slots will be detected from the device tree.
The skiboot firmware doesn't yet export the capability to access
attention LEDs; that support is still to be done.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Use OF OVERLAY to update the device-tree
  * Removed unnecessary header files
  * More meaningful return value from powernv_php_register_one()
  * Use pnv_pci_hotplug_notifier_{register, unregister}()
  * Decimal values for slot's states
  * Removed struct powernv_php_slot::release()
  * Merged two bool arguments to one for powernv_php_slot_enable()
  * Rename release_device_nodes_info() to remove_device_nodes_info()
  * Don't check on "!len" in slot_power_on_handler()
  * Handle return value in get_adapter_status() as suggested by aik
  * Drop invalid attention status in set_attention_status()
  * Renaming functions
  * Fixed coding style and added entry in MAINTAINERS reported by
    checkpatch.pl
---
 MAINTAINERS                            |   6 +
 drivers/pci/hotplug/Kconfig            |  12 +
 drivers/pci/hotplug/Makefile           |   4 +
 drivers/pci/hotplug/powernv_php.c      | 140 +++++++
 drivers/pci/hotplug/powernv_php.h      |  90 ++++
 drivers/pci/hotplug/powernv_php_slot.c | 732 +++++++++++++++++++++++++++++++++
 6 files changed, 984 insertions(+)
 create mode 100644 drivers/pci/hotplug/powernv_php.c
 create mode 100644 drivers/pci/hotplug/powernv_php.h
 create mode 100644 drivers/pci/hotplug/powernv_php_slot.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e308718..f5e1dce 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7481,6 +7481,12 @@ L:	linux-pci@vger.kernel.org
 S:	Supported
 F:	Documentation/PCI/pci-error-recovery.txt
 
+PCI HOTPLUG DRIVER FOR POWERNV PLATFORM
+M:	Gavin Shan <gwshan@linux.vnet.ibm.com>
+L:	linux-pci@vger.kernel.org
+S:	Supported
+F:	drivers/pci/hotplug/powernv_php*
+
 PCI SUBSYSTEM
 M:	Bjorn Helgaas <bhelgaas@google.com>
 L:	linux-pci@vger.kernel.org
diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
index df8caec..ef55dae 100644
--- a/drivers/pci/hotplug/Kconfig
+++ b/drivers/pci/hotplug/Kconfig
@@ -113,6 +113,18 @@ config HOTPLUG_PCI_SHPC
 
 	  When in doubt, say N.
 
+config HOTPLUG_PCI_POWERNV
+	tristate "PowerPC PowerNV PCI Hotplug driver"
+	depends on PPC_POWERNV && EEH
+	help
+	  Say Y here if you run a PowerPC PowerNV platform that supports
+	  PCI hotplug.
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called powernv-php.
+
+	  When in doubt, say N.
+
 config HOTPLUG_PCI_RPA
 	tristate "RPA PCI Hotplug driver"
 	depends on PPC_PSERIES && EEH
diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
index 4a9aa08..a69665e 100644
--- a/drivers/pci/hotplug/Makefile
+++ b/drivers/pci/hotplug/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_HOTPLUG_PCI_PCIE)		+= pciehp.o
 obj-$(CONFIG_HOTPLUG_PCI_CPCI_ZT5550)	+= cpcihp_zt5550.o
 obj-$(CONFIG_HOTPLUG_PCI_CPCI_GENERIC)	+= cpcihp_generic.o
 obj-$(CONFIG_HOTPLUG_PCI_SHPC)		+= shpchp.o
+obj-$(CONFIG_HOTPLUG_PCI_POWERNV)	+= powernv-php.o
 obj-$(CONFIG_HOTPLUG_PCI_RPA)		+= rpaphp.o
 obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR)	+= rpadlpar_io.o
 obj-$(CONFIG_HOTPLUG_PCI_SGI)		+= sgi_hotplug.o
@@ -50,6 +51,9 @@ ibmphp-objs		:=	ibmphp_core.o	\
 acpiphp-objs		:=	acpiphp_core.o	\
 				acpiphp_glue.o
 
+powernv-php-objs	:=	powernv_php.o	\
+				powernv_php_slot.o
+
 rpaphp-objs		:=	rpaphp_core.o	\
 				rpaphp_pci.o	\
 				rpaphp_slot.o
diff --git a/drivers/pci/hotplug/powernv_php.c b/drivers/pci/hotplug/powernv_php.c
new file mode 100644
index 0000000..4cbff7a
--- /dev/null
+++ b/drivers/pci/hotplug/powernv_php.c
@@ -0,0 +1,140 @@
+/*
+ * PCI Hotplug Driver for PowerPC PowerNV platform.
+ *
+ * Copyright Gavin Shan, IBM Corporation 2015.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+
+#include <asm/opal.h>
+#include <asm/pnv-pci.h>
+
+#include "powernv_php.h"
+
+#define DRIVER_VERSION	"0.1"
+#define DRIVER_AUTHOR	"Gavin Shan, IBM Corporation"
+#define DRIVER_DESC	"PowerPC PowerNV PCI Hotplug Driver"
+
+static struct notifier_block php_msg_nb = {
+	.notifier_call	= powernv_php_msg_handler,
+	.next		= NULL,
+	.priority	= 0,
+};
+
+static int powernv_php_register_one(struct device_node *dn)
+{
+	struct powernv_php_slot *slot;
+	const __be32 *prop32;
+	int ret;
+
+	/* Check if it's hotpluggable slot */
+	prop32 = of_get_property(dn, "ibm,slot-pluggable", NULL);
+	if (!prop32 || !of_read_number(prop32, 1))
+		return -ENXIO;
+
+	prop32 = of_get_property(dn, "ibm,reset-by-firmware", NULL);
+	if (!prop32 || !of_read_number(prop32, 1))
+		return -ENXIO;
+
+	/* Allocate slot */
+	slot = powernv_php_slot_alloc(dn);
+	if (!slot)
+		return -ENODEV;
+
+	/* Register it */
+	ret = powernv_php_slot_register(slot);
+	if (ret) {
+		powernv_php_slot_put(slot);
+		return ret;
+	}
+
+	return powernv_php_slot_enable(slot->php_slot, false);
+}
+
+int powernv_php_register(struct device_node *dn)
+{
+	struct device_node *child;
+	int ret = 0;
+
+	/*
+	 * The parent slots should be registered before their
+	 * child slots.
+	 */
+	for_each_child_of_node(dn, child) {
+		powernv_php_register_one(child);
+		powernv_php_register(child);
+	}
+
+	return ret;
+}
+
+static void powernv_php_unregister_one(struct device_node *dn)
+{
+	struct powernv_php_slot *slot;
+
+	slot = powernv_php_slot_find(dn);
+	if (!slot)
+		return;
+
+	pci_hp_deregister(slot->php_slot);
+}
+
+void powernv_php_unregister(struct device_node *dn)
+{
+	struct device_node *child;
+
+	/* The child slots should go before their parent slots */
+	for_each_child_of_node(dn, child) {
+		powernv_php_unregister(child);
+		powernv_php_unregister_one(child);
+	}
+}
+
+static int __init powernv_php_init(void)
+{
+	struct device_node *dn;
+	int ret;
+
+	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
+
+	/* Register hotplug message handler */
+	ret = pnv_pci_hotplug_notifier_register(&php_msg_nb);
+	if (ret) {
+		pr_warn("%s: Error %d registering hotplug notifier\n",
+			__func__, ret);
+		return ret;
+	}
+
+	/* Scan PHB nodes and their children */
+	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
+		powernv_php_register(dn);
+	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
+		powernv_php_register(dn);
+
+	return 0;
+}
+
+static void __exit powernv_php_exit(void)
+{
+	struct device_node *dn;
+
+	pnv_pci_hotplug_notifier_unregister(&php_msg_nb);
+
+	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
+		powernv_php_unregister(dn);
+	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
+		powernv_php_unregister(dn);
+}
+
+module_init(powernv_php_init);
+module_exit(powernv_php_exit);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/drivers/pci/hotplug/powernv_php.h b/drivers/pci/hotplug/powernv_php.h
new file mode 100644
index 0000000..5e14a65
--- /dev/null
+++ b/drivers/pci/hotplug/powernv_php.h
@@ -0,0 +1,90 @@
+/*
+ * PCI Hotplug Driver for PowerPC PowerNV platform.
+ *
+ * Copyright Gavin Shan, IBM Corporation 2015.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _POWERNV_PHP_H
+#define _POWERNV_PHP_H
+
+#include <linux/list.h>
+#include <linux/kref.h>
+#include <linux/of.h>
+#include <linux/pci.h>
+#include <linux/pci_hotplug.h>
+#include <linux/wait.h>
+#include <linux/workqueue.h>
+
+#include <asm/opal-api.h>
+
+/* Slot power status */
+#define POWERNV_PHP_SLOT_POWER_OFF	0
+#define POWERNV_PHP_SLOT_POWER_ON	1
+
+/* Slot presence status */
+#define POWERNV_PHP_SLOT_EMPTY		0
+#define POWERNV_PHP_SLOT_PRESENT	1
+
+/* Slot attention status */
+#define POWERNV_PHP_SLOT_ATTEN_OFF	0
+#define POWERNV_PHP_SLOT_ATTEN_ON	1
+#define POWERNV_PHP_SLOT_ATTEN_IND	2
+#define POWERNV_PHP_SLOT_ATTEN_ACT	3
+
+struct powernv_php_slot {
+	char			*name;
+	struct device_node	*dn;
+	struct pci_bus		*bus;
+	uint64_t		id;
+	int			slot_no;
+	struct kref		kref;
+#define POWERNV_PHP_SLOT_STATE_INIT		0
+#define POWERNV_PHP_SLOT_STATE_REGISTER		1
+#define POWERNV_PHP_SLOT_STATE_POPULATED	2
+	int			state;
+	int			check_power_status;
+	int			status_confirmed;
+	struct opal_msg		*msg;
+	uint64_t		dt_counter;
+	int			overlay_id;
+	struct work_struct	work;
+	wait_queue_head_t	queue;
+	struct hotplug_slot	*php_slot;
+	struct powernv_php_slot	*parent;
+	struct list_head	children;
+	struct list_head	link;
+};
+
+int powernv_php_msg_handler(struct notifier_block *nb,
+			    unsigned long type, void *message);
+struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn);
+void powernv_php_slot_free(struct kref *kref);
+struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn);
+int powernv_php_slot_register(struct powernv_php_slot *slot);
+int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan);
+int powernv_php_register(struct device_node *dn);
+void powernv_php_unregister(struct device_node *dn);
+
+#define to_powernv_php_slot(kref) \
+	container_of(kref, struct powernv_php_slot, kref)
+
+static inline void powernv_php_slot_get(struct powernv_php_slot *slot)
+{
+	if (slot)
+		kref_get(&slot->kref);
+}
+
+static inline int powernv_php_slot_put(struct powernv_php_slot *slot)
+{
+	if (slot)
+		return kref_put(&slot->kref, powernv_php_slot_free);
+
+	return 0;
+}
+
+#endif /* !_POWERNV_PHP_H */
diff --git a/drivers/pci/hotplug/powernv_php_slot.c b/drivers/pci/hotplug/powernv_php_slot.c
new file mode 100644
index 0000000..6c56455
--- /dev/null
+++ b/drivers/pci/hotplug/powernv_php_slot.c
@@ -0,0 +1,732 @@
+/*
+ * PCI Hotplug Driver for PowerPC PowerNV platform.
+ *
+ * Copyright Gavin Shan, IBM Corporation 2015.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+
+#include <asm/opal.h>
+#include <asm/pnv-pci.h>
+#include <asm/ppc-pci.h>
+
+#include "powernv_php.h"
+
+static LIST_HEAD(php_slot_list);
+static DEFINE_SPINLOCK(php_slot_lock);
+
+/*
+ * Remove firmware data for all child device nodes of the
+ * indicated one.
+ */
+static void remove_child_pdn(struct device_node *np)
+{
+	struct device_node *child;
+
+	for_each_child_of_node(np, child) {
+		/* In depth first */
+		remove_child_pdn(child);
+
+		remove_pci_device_node_info(child);
+	}
+}
+
+/*
+ * Remove all subordinate device nodes of the indicated one.
+ * The device nodes on the deepest path should be released first.
+ */
+static int remove_child_device_nodes(struct device_node *parent)
+{
+	struct device_node *np, *child;
+	int ret = 0;
+
+	/* If the device node has children, remove them first */
+	for_each_child_of_node(parent, np) {
+		ret = remove_child_device_nodes(np);
+		if (ret)
+			return ret;
+
+		/* The device shouldn't have alive children */
+		child = of_get_next_child(np, NULL);
+		if (child) {
+			of_node_put(child);
+			of_node_put(np);
+			pr_err("%s: Alive children of node <%s>\n",
+			       __func__, of_node_full_name(np));
+			return -EBUSY;
+		}
+
+		/* Detach the device node */
+		of_detach_node(np);
+		of_node_put(np);
+	}
+
+	return 0;
+}
+
+/*
+ * The function processes the message sent by firmware
+ * to remove all device tree nodes beneath the slot's
+ * node, and the associated auxiliary data.
+ */
+static void slot_power_off_handler(struct powernv_php_slot *slot)
+{
+	int ret;
+
+	/* Release the firmware data for the child device nodes */
+	remove_child_pdn(slot->dn);
+
+	/*
+	 * Release the child device nodes. If the sub-tree was
+	 * built with the help of an overlay, we just need to revert
+	 * the changes introduced by the overlay.
+	 */
+	if (slot->overlay_id >= 0) {
+		ret = of_overlay_destroy(slot->overlay_id);
+		if (ret)
+			pr_warn("%s: Error %d destroying overlay %d\n",
+				__func__, ret, slot->overlay_id);
+		slot->overlay_id = -1;
+	} else {
+		ret = remove_child_device_nodes(slot->dn);
+		if (ret)
+			pr_warn("%s: Error %d releasing children of <%s>\n",
+				__func__, ret, of_node_full_name(slot->dn));
+	}
+
+	/* Confirm status change */
+	slot->status_confirmed = 1;
+	wake_up(&slot->queue);
+}
+
+static void slot_power_on_handler(struct powernv_php_slot *slot)
+{
+	struct device_node *nodes[3] = {NULL, NULL, NULL};
+	struct property *prop = NULL;
+	void *fdt = NULL, *dt = NULL;
+	phandle handle;
+	uint64_t len;
+	int i, ret;
+
+	/* Build overlay sub-tree */
+	for (i = 0; i < ARRAY_SIZE(nodes); i++) {
+		nodes[i] = kzalloc(sizeof(struct device_node), GFP_KERNEL);
+		if (!nodes[i])
+			goto out;
+
+		of_node_init(nodes[i]);
+		if (i > 0) {
+			nodes[i - 1]->child = nodes[i];
+			nodes[i]->parent = nodes[i - 1];
+		}
+	}
+
+	/* Target property for parent node */
+	prop = kzalloc(sizeof(struct property), GFP_KERNEL);
+	if (!prop)
+		goto out;
+	prop->name = kstrdup("target", GFP_KERNEL);
+	if (!prop->name)
+		goto out;
+	prop->value = kzalloc(sizeof(phandle), GFP_KERNEL);
+	if (!prop->value)
+		goto out;
+	handle = cpu_to_be32(slot->dn->phandle);
+	memcpy(prop->value, &handle, sizeof(phandle));
+	prop->length = sizeof(phandle);
+	nodes[1]->properties = prop;
+
+	/* Names for overlay node */
+	nodes[2]->name = kstrdup("__overlay__", GFP_KERNEL);
+	if (!nodes[2]->name)
+		goto out;
+	nodes[2]->full_name = kstrdup(of_node_full_name(slot->dn), GFP_KERNEL);
+	if (!nodes[2]->full_name)
+		goto out;
+
+	/* Get FDT blob */
+	slot->dt_counter += 1;
+	fdt = NULL;
+	len = 0x2000;
+	while (len <= 0x10000) {
+		fdt = kzalloc(len, GFP_KERNEL);
+		if (!fdt)
+			break;
+
+		ret = pnv_pci_get_overlay_dt(&slot->dt_counter, fdt, len);
+		if (!ret)
+			break;
+
+		kfree(fdt);
+		fdt = NULL;
+		len *= 2;
+	}
+
+	if (!fdt)
+		goto out;
+
+	/* Unflatten device tree blob */
+	dt = of_fdt_unflatten_tree(fdt, nodes[2], NULL);
+
+	/* Apply the overlay tree */
+	slot->overlay_id = of_overlay_create(nodes[0]);
+	if (slot->overlay_id < 0)
+		goto out;
+
+	/* Add device node firmware data */
+	traverse_pci_device_nodes(slot->dn,
+				  add_pci_device_node_info,
+				  pci_bus_to_host(slot->bus));
+
+out:
+	kfree(dt);
+	kfree(fdt);
+	if (nodes[2]) {
+		kfree(nodes[2]->name);
+		kfree(nodes[2]->full_name);
+	}
+	if (prop) {
+		kfree(prop->value);
+		kfree(prop->name);
+	}
+
+	kfree(prop);
+	for (i = 0; i < ARRAY_SIZE(nodes); i++)
+		kfree(nodes[i]);
+
+	/* Confirm status change */
+	slot->status_confirmed = 1;
+	wake_up(&slot->queue);
+}
+
+static void powernv_php_slot_work(struct work_struct *data)
+{
+	struct powernv_php_slot *slot = container_of(data,
+						     struct powernv_php_slot,
+						     work);
+	uint64_t php_event = be64_to_cpu(slot->msg->params[0]);
+
+	switch (php_event) {
+	case 0: /* Slot power off */
+		slot_power_off_handler(slot);
+		break;
+	case 1: /* Slot power on */
+		slot_power_on_handler(slot);
+		break;
+	default:
+		pr_warn("%s: Unsupported hotplug event %lld\n",
+			__func__, php_event);
+	}
+
+	of_node_put(slot->dn);
+}
+
+int powernv_php_msg_handler(struct notifier_block *nb,
+			    unsigned long type, void *message)
+{
+	phandle h;
+	struct device_node *np;
+	struct powernv_php_slot *slot;
+	struct opal_msg *msg = message;
+
+	/* Check the message type */
+	if (type != OPAL_MSG_PCI_HOTPLUG) {
+		pr_warn("%s: Wrong message type %ld received!\n",
+			__func__, type);
+		return NOTIFY_DONE;
+	}
+
+	/* Find the device node */
+	h = (phandle)be64_to_cpu(msg->params[1]);
+	np = of_find_node_by_phandle(h);
+	if (!np) {
+		pr_warn("%s: No device node for phandle 0x%08x\n",
+			__func__, h);
+		return NOTIFY_DONE;
+	}
+
+	/* Find the slot */
+	slot = powernv_php_slot_find(np);
+	if (!slot) {
+		pr_warn("%s: No slot found for node <%s>\n",
+			__func__, of_node_full_name(np));
+		of_node_put(np);
+		return NOTIFY_DONE;
+	}
+
+	/* Schedule the work */
+	slot->msg = msg;
+	schedule_work(&slot->work);
+	return NOTIFY_OK;
+}
+
+static int set_power_status(struct hotplug_slot *php_slot, u8 val)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+	int ret;
+
+	/* Retrieve the counter of device tree */
+	ret = pnv_pci_get_overlay_dt(&slot->dt_counter, NULL, 0);
+	if (ret) {
+		pr_warn("%s: Error %d getting DT counter for slot %016llx\n",
+			__func__, ret, slot->id);
+		return ret;
+	}
+
+	/* Set power status */
+	slot->status_confirmed = 0;
+	ret = pnv_pci_set_power_status(slot->id, val);
+	if (ret) {
+		pr_warn("%s: Error %d powering %s slot %016llx\n",
+			__func__, ret, val ? "on" : "off", slot->id);
+		return ret;
+	}
+
+	/* Wait until the handler confirms the device tree is updated */
+	ret = wait_event_timeout(slot->queue,
+				 slot->status_confirmed,
+				 10 * HZ);
+	if (!ret) {
+		pr_warn("%s: Timeout completing power-%s slot %016llx\n",
+			__func__, val ? "on" : "off", slot->id);
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int get_power_status(struct hotplug_slot *php_slot, u8 *val)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+	uint8_t state;
+	int ret;
+
+	/*
+	 * Retrieve power status from firmware. If we fail
+	 * to get that, the power status falls back to
+	 * being on.
+	 */
+	ret = pnv_pci_get_power_status(slot->id, &state);
+	if (ret) {
+		*val = POWERNV_PHP_SLOT_POWER_ON;
+		pr_warn("%s: Error %d getting power status of slot %016llx\n",
+			__func__, ret, slot->id);
+	} else {
+		*val = state ? POWERNV_PHP_SLOT_POWER_ON :
+			       POWERNV_PHP_SLOT_POWER_OFF;
+		php_slot->info->power_status = *val;
+	}
+
+	return 0;
+}
+
+static int get_adapter_status(struct hotplug_slot *php_slot, u8 *val)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+	uint8_t state;
+	int ret;
+
+	/*
+	 * Retrieve presence status from firmware. If we can't
+	 * get that, it will fall back to being empty.
+	 */
+	ret = pnv_pci_get_presence_status(slot->id, &state);
+	if (ret >= 0) {
+		ret = 0;
+		*val = state ? POWERNV_PHP_SLOT_PRESENT :
+			       POWERNV_PHP_SLOT_EMPTY;
+		php_slot->info->adapter_status = *val;
+		ret = 0;
+	} else {
+		*val = POWERNV_PHP_SLOT_EMPTY;
+		pr_warn("%s: Error %d getting presence of slot %016llx\n",
+			__func__, ret, slot->id);
+	}
+
+	return ret;
+}
+
+static int set_attention_status(struct hotplug_slot *php_slot, u8 val)
+{
+	/* The default operation would be to turn on the attention */
+	switch (val) {
+	case POWERNV_PHP_SLOT_ATTEN_OFF:
+	case POWERNV_PHP_SLOT_ATTEN_ON:
+	case POWERNV_PHP_SLOT_ATTEN_IND:
+	case POWERNV_PHP_SLOT_ATTEN_ACT:
+		break;
+	default:
+		pr_warn("%s: Invalid attention status 0x%02x\n",
+			__func__, val);
+		return -EINVAL;
+	}
+
+	/* FIXME: Make it real once firmware supports it */
+	php_slot->info->attention_status = val;
+
+	return 0;
+}
+
+int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+	uint8_t presence, power_status;
+	int ret;
+
+	/* Check if the slot has been configured */
+	if (slot->state != POWERNV_PHP_SLOT_STATE_REGISTER)
+		return 0;
+
+	/* Retrieve slot presence status */
+	ret = php_slot->ops->get_adapter_status(php_slot, &presence);
+	if (ret) {
+		pr_warn("%s: Error %d getting presence of slot %016llx\n",
+			__func__, ret, slot->id);
+		return ret;
+	}
+
+	/* Proceed if there is nothing behind the slot */
+	if (presence == POWERNV_PHP_SLOT_EMPTY)
+		goto scan;
+
+	/*
+	 * If we detect something behind the slot, we need to
+	 * make sure the power supply to the slot is on. Otherwise,
+	 * the slot's downstream PCIe link should be down.
+	 *
+	 * On the first time, we don't change the power status to
+	 * speed up system boot, assuming that the firmware
+	 * supplies consistent slot power status: empty slot always
+	 * has its power off and non-empty slot has its power on.
+	 */
+	if (!slot->check_power_status) {
+		slot->check_power_status = 1;
+		goto scan;
+	}
+
+	/* Check the power status. Scan the slot if that's already on */
+	ret = php_slot->ops->get_power_status(php_slot, &power_status);
+	if (ret) {
+		pr_warn("%s: Error %d getting power status of slot %016llx\n",
+			__func__, ret, slot->id);
+		return ret;
+	}
+	if (power_status == POWERNV_PHP_SLOT_POWER_ON)
+		goto scan;
+
+	/* Power is off, turn it on and then scan the slot */
+	ret = set_power_status(php_slot, POWERNV_PHP_SLOT_POWER_ON);
+	if (ret) {
+		pr_warn("%s: Error %d powering on slot %016llx\n",
+			__func__, ret, slot->id);
+		return ret;
+	}
+
+scan:
+	switch (presence) {
+	case POWERNV_PHP_SLOT_PRESENT:
+		if (rescan) {
+			pci_lock_rescan_remove();
+			pcibios_add_pci_devices(slot->bus);
+			pci_unlock_rescan_remove();
+		}
+
+		/* Rescan for child hotpluggable slots */
+		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
+		if (rescan)
+			powernv_php_register(slot->dn);
+		break;
+	case POWERNV_PHP_SLOT_EMPTY:
+		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
+		break;
+	default:
+		pr_warn("%s: Invalid presence status %d of slot %016llx\n",
+			__func__, presence, slot->id);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int enable_slot(struct hotplug_slot *php_slot)
+{
+	return powernv_php_slot_enable(php_slot, true);
+}
+
+static int disable_slot(struct hotplug_slot *php_slot)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+	uint8_t power_status;
+	int ret;
+
+	if (slot->state != POWERNV_PHP_SLOT_STATE_POPULATED)
+		return 0;
+
+	/* Remove all devices behind the slot */
+	pci_lock_rescan_remove();
+	pcibios_remove_pci_devices(slot->bus);
+	pci_unlock_rescan_remove();
+
+	/* Detach the child hotpluggable slots */
+	powernv_php_unregister(slot->dn);
+
+	/*
+	 * Check the power status and turn it off if necessary. If we
+	 * fail to get the power status, the power will be forced to
+	 * be off.
+	 */
+	ret = php_slot->ops->get_power_status(php_slot, &power_status);
+	if (ret || power_status == POWERNV_PHP_SLOT_POWER_ON) {
+		ret = set_power_status(php_slot, POWERNV_PHP_SLOT_POWER_OFF);
+		if (ret)
+			pr_warn("%s: Error %d powering off slot %016llx\n",
+				__func__, ret, slot->id);
+	}
+
+	/* Update slot state */
+	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
+	return 0;
+}
+
+static struct hotplug_slot_ops php_slot_ops = {
+	.get_power_status	= get_power_status,
+	.get_adapter_status	= get_adapter_status,
+	.set_attention_status	= set_attention_status,
+	.enable_slot		= enable_slot,
+	.disable_slot		= disable_slot,
+};
+
+static struct powernv_php_slot *php_slot_match(struct device_node *dn,
+					       struct powernv_php_slot *slot)
+{
+	struct powernv_php_slot *target, *tmp;
+
+	if (slot->dn == dn)
+		return slot;
+
+	list_for_each_entry(tmp, &slot->children, link) {
+		target = php_slot_match(dn, tmp);
+		if (target)
+			return target;
+	}
+
+	return NULL;
+}
+
+struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn)
+{
+	struct powernv_php_slot *slot, *tmp;
+	unsigned long flags;
+
+	spin_lock_irqsave(&php_slot_lock, flags);
+	list_for_each_entry(tmp, &php_slot_list, link) {
+		slot = php_slot_match(dn, tmp);
+		if (slot) {
+			spin_unlock_irqrestore(&php_slot_lock, flags);
+			return slot;
+		}
+	}
+	spin_unlock_irqrestore(&php_slot_lock, flags);
+
+	return NULL;
+}
+
+void powernv_php_slot_free(struct kref *kref)
+{
+	struct powernv_php_slot *slot = to_powernv_php_slot(kref);
+
+	WARN_ON(!list_empty(&slot->children));
+	kfree(slot->name);
+	kfree(slot);
+}
+
+static void php_slot_release(struct hotplug_slot *hp_slot)
+{
+	struct powernv_php_slot *slot = hp_slot->private;
+	unsigned long flags;
+
+	/* Remove from global or child list */
+	spin_lock_irqsave(&php_slot_lock, flags);
+	list_del(&slot->link);
+	spin_unlock_irqrestore(&php_slot_lock, flags);
+
+	/* Detach from parent before dropping our own reference */
+	powernv_php_slot_put(slot->parent);
+	powernv_php_slot_put(slot);
+}
+
+static bool php_slot_get_id(struct device_node *dn,
+			    uint64_t *id)
+{
+	struct device_node *parent = dn;
+	const __be64 *prop64;
+	const __be32 *prop32;
+
+	/*
+	 * The hotpluggable slot always has a compound Id, which
+	 * consists of 16-bits PHB Id, 16 bits bus/slot/function
+	 * number, and compound indicator
+	 */
+	*id = (0x1ul << 63);
+
+	/* Bus/Slot/Function number */
+	prop32 = of_get_property(dn, "reg", NULL);
+	if (!prop32)
+		return false;
+	*id |= ((of_read_number(prop32, 1) & 0x00ffff00) << 8);
+
+	/* PHB Id */
+	while ((parent = of_get_parent(parent))) {
+		if (!PCI_DN(parent)) {
+			of_node_put(parent);
+			break;
+		}
+
+		if (!of_device_is_compatible(parent, "ibm,ioda2-phb") &&
+		    !of_device_is_compatible(parent, "ibm,ioda-phb")) {
+			of_node_put(parent);
+			continue;
+		}
+
+		prop64 = of_get_property(parent, "ibm,opal-phbid", NULL);
+		if (!prop64) {
+			of_node_put(parent);
+			return false;
+		}
+
+		*id |= be64_to_cpup(prop64);
+		of_node_put(parent);
+		return true;
+	}
+
+	return false;
+}
+
+struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn)
+{
+	struct pci_bus *bus;
+	struct powernv_php_slot *slot;
+	const char *label;
+	uint64_t id;
+	int slot_no;
+	size_t size;
+	void *pmem;
+
+	/* Slot name */
+	label = of_get_property(dn, "ibm,slot-label", NULL);
+	if (!label)
+		return NULL;
+
+	/* Slot identifier */
+	if (!php_slot_get_id(dn, &id))
+		return NULL;
+
+	/* PCI bus */
+	bus = pcibios_find_pci_bus(dn);
+	if (!bus)
+		return NULL;
+
+	/* Slot number */
+	if (dn->child && PCI_DN(dn->child))
+		slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
+	else
+		slot_no = -1;
+
+	/* Allocate slot */
+	size = sizeof(struct powernv_php_slot) +
+	       sizeof(struct hotplug_slot) +
+	       sizeof(struct hotplug_slot_info);
+	pmem = kzalloc(size, GFP_KERNEL);
+	if (!pmem) {
+		pr_warn("%s: Cannot allocate slot for node %s\n",
+			__func__, dn->full_name);
+		return NULL;
+	}
+
+	/* Assign memory blocks */
+	slot = pmem;
+	slot->php_slot = pmem + sizeof(struct powernv_php_slot);
+	slot->php_slot->info = pmem + sizeof(struct powernv_php_slot) +
+			      sizeof(struct hotplug_slot);
+	slot->name = kstrdup(label, GFP_KERNEL);
+	if (!slot->name) {
+		pr_warn("%s: Cannot populate name for node %s\n",
+			__func__, dn->full_name);
+		kfree(pmem);
+		return NULL;
+	}
+
+	/* Initialize slot */
+	kref_init(&slot->kref);
+	slot->state = POWERNV_PHP_SLOT_STATE_INIT;
+	slot->dn = dn;
+	slot->bus = bus;
+	slot->id = id;
+	slot->slot_no = slot_no;
+	slot->overlay_id = -1;
+	INIT_WORK(&slot->work, powernv_php_slot_work);
+	init_waitqueue_head(&slot->queue);
+	slot->check_power_status = 0;
+	slot->status_confirmed = 0;
+	slot->php_slot->ops = &php_slot_ops;
+	slot->php_slot->release = php_slot_release;
+	slot->php_slot->private = slot;
+	INIT_LIST_HEAD(&slot->children);
+	INIT_LIST_HEAD(&slot->link);
+
+	return slot;
+}
+
+int powernv_php_slot_register(struct powernv_php_slot *slot)
+{
+	struct powernv_php_slot *parent = NULL;
+	struct device_node *dn = slot->dn;
+	unsigned long flags;
+	int ret;
+
+	/* Avoid registering the same slot twice */
+	if (powernv_php_slot_find(slot->dn))
+		return -EEXIST;
+
+	/* Register slot */
+	ret = pci_hp_register(slot->php_slot, slot->bus,
+			      slot->slot_no, slot->name);
+	if (ret) {
+		pr_warn("%s: Cannot register slot %s (%d)\n",
+			__func__, slot->name, ret);
+		return ret;
+	}
+
+	/* Put into global or parent list */
+	while ((dn = of_get_parent(dn))) {
+		if (!PCI_DN(dn)) {
+			of_node_put(dn);
+			break;
+		}
+
+		parent = powernv_php_slot_find(dn);
+		if (parent) {
+			of_node_put(dn);
+			break;
+		}
+	}
+
+	spin_lock_irqsave(&php_slot_lock, flags);
+	if (parent) {
+		powernv_php_slot_get(parent);
+		slot->parent = parent;
+		list_add_tail(&slot->link, &parent->children);
+	} else {
+		list_add_tail(&slot->link, &php_slot_list);
+	}
+	spin_unlock_irqrestore(&php_slot_lock, flags);
+
+	/* Update slot state */
+	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
+	return 0;
+}
-- 
2.1.0
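
For reference, the compound slot identifier that php_slot_get_id()
builds in the patch above can be sketched as a standalone userspace
function. The names mock_slot_id() and MOCK_SLOT_ID_COMPOUND are
hypothetical; the bit layout is read off the patch (bit 63 as the
compound indicator, the bus/device/function bits of the first "reg"
cell shifted up by 8, and the "ibm,opal-phbid" value OR-ed in), and
both inputs are assumed to be already converted to CPU endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 63 marks the identifier as a compound (PHB + slot) identifier */
#define MOCK_SLOT_ID_COMPOUND	(UINT64_C(1) << 63)

/*
 * reg_cell0: first cell of the slot node's "reg" property, which
 *            carries bus/device/function in bits 8-23.
 * phb_id:    value of the PHB node's "ibm,opal-phbid" property.
 */
uint64_t mock_slot_id(uint32_t reg_cell0, uint64_t phb_id)
{
	uint64_t id = MOCK_SLOT_ID_COMPOUND;

	/* Move bus/device/function from bits 8-23 up to bits 16-31 */
	id |= (uint64_t)(reg_cell0 & 0x00ffff00u) << 8;
	id |= phb_id;
	return id;
}
```

With this layout the PHB Id is expected to fit in the low 16 bits so
that it does not overlap the bus/device/function field.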

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree()
  2015-06-04  6:42 ` [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree() Gavin Shan
@ 2015-06-04 22:10   ` Rob Herring
       [not found]   ` <1433400131-18429-41-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  1 sibling, 0 replies; 83+ messages in thread
From: Rob Herring @ 2015-06-04 22:10 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linuxppc-dev, linux-pci, devicetree, Benjamin Herrenschmidt,
	Bjorn Helgaas, aik, Pantelis Antoniou, Grant Likely

On Thu, Jun 4, 2015 at 1:42 AM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> The patch introduces one more argument to of_fdt_unflatten_tree()
> to specify the root node for the FDT blob, which is going to be
> unflattened. In the result, the function can be used to unflatten
> FDT blob, which represents device sub-tree in subsequent patches.
>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
> v5:
>   * Newly introduced
> ---
>  drivers/of/fdt.c       | 26 ++++++++++++++++++--------
>  drivers/of/unittest.c  |  2 +-
>  include/linux/of_fdt.h |  3 ++-
>  3 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index b87c157..b6a6c59 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -380,9 +380,16 @@ static void *unflatten_dt_node(void *blob,
>                                struct device_node **nodepp,
>                                bool dryrun)
>  {
> +       unsigned long fpsize = 0;
> +
> +       if (dad)
> +               fpsize = strlen(of_node_full_name(dad));
> +       else
> +               fpsize = 0;
> +
>         cur_node_depth = 1;
>         return __unflatten_dt_node(blob, mem, poffset,
> -                                  dad, nodepp, 0, dryrun);
> +                                  dad, nodepp, fpsize, dryrun);
>  }
>
>  /**
> @@ -393,13 +400,15 @@ static void *unflatten_dt_node(void *blob,
>   * pointers of the nodes so the normal device-tree walking functions
>   * can be used.
>   * @blob: The blob to expand
> + * @dad: The root node of the created device_node tree
>   * @mynodes: The device_node tree created by the call
>   * @dt_alloc: An allocator that provides a virtual address to memory
>   * for the resulting tree
>   */
>  static void __unflatten_device_tree(void *blob,
> -                            struct device_node **mynodes,
> -                            void * (*dt_alloc)(u64 size, u64 align))
> +                                   struct device_node *dad,
> +                                   struct device_node **mynodes,
> +                                   void * (*dt_alloc)(u64 size, u64 align))
>  {
>         unsigned long size;
>         int start;
> @@ -425,7 +434,7 @@ static void __unflatten_device_tree(void *blob,
>         /* First pass, scan for size */
>         start = 0;
>         size = (unsigned long)unflatten_dt_node(blob, NULL, &start,
> -                                               NULL, NULL, true);
> +                                               dad, NULL, true);
>         size = ALIGN(size, 4);
>
>         pr_debug("  size is %lx, allocating...\n", size);
> @@ -440,7 +449,7 @@ static void __unflatten_device_tree(void *blob,
>
>         /* Second pass, do actual unflattening */
>         start = 0;
> -       unflatten_dt_node(blob, mem, &start, NULL, mynodes, false);
> +       unflatten_dt_node(blob, mem, &start, dad, mynodes, false);
>         if (be32_to_cpup(mem + size) != 0xdeadbeef)
>                 pr_warning("End of tree marker overwritten: %08x\n",
>                            be32_to_cpup(mem + size));
> @@ -462,9 +471,10 @@ static void *kernel_tree_alloc(u64 size, u64 align)
>   * can be used.
>   */
>  void of_fdt_unflatten_tree(unsigned long *blob,
> -                       struct device_node **mynodes)
> +                          struct device_node *dad,
> +                          struct device_node **mynodes)
>  {
> -       __unflatten_device_tree(blob, mynodes, &kernel_tree_alloc);
> +       __unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
>  }
>  EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
>
> @@ -1095,7 +1105,7 @@ bool __init early_init_dt_scan(void *params)
>   */
>  void __init unflatten_device_tree(void)
>  {
> -       __unflatten_device_tree(initial_boot_params, &of_root,
> +       __unflatten_device_tree(initial_boot_params, NULL, &of_root,
>                                 early_init_dt_alloc_memory_arch);
>
>         /* Get pointer to "/chosen" and "/aliases" nodes for use everywhere */
> diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
> index 1801634..2270830 100644
> --- a/drivers/of/unittest.c
> +++ b/drivers/of/unittest.c
> @@ -907,7 +907,7 @@ static int __init unittest_data_add(void)
>                         "not running tests\n", __func__);
>                 return -ENOMEM;
>         }
> -       of_fdt_unflatten_tree(unittest_data, &unittest_data_node);
> +       of_fdt_unflatten_tree(unittest_data, NULL, &unittest_data_node);
>         if (!unittest_data_node) {
>                 pr_warn("%s: No tree to attach; not running tests\n", __func__);
>                 return -ENODATA;
> diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
> index 587ee50..8882640 100644
> --- a/include/linux/of_fdt.h
> +++ b/include/linux/of_fdt.h
> @@ -38,7 +38,8 @@ extern bool of_fdt_is_big_endian(const void *blob,
>  extern int of_fdt_match(const void *blob, unsigned long node,
>                         const char *const *compat);
>  extern void of_fdt_unflatten_tree(unsigned long *blob,
> -                              struct device_node **mynodes);
> +                                 struct device_node *dad,
> +                                 struct device_node **mynodes);
>
>  /* TBD: Temporary export of fdt globals - remove when code fully merged */
>  extern int __initdata dt_root_addr_cells;
> --
> 2.1.0
>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 01/42] PCI: Add pcibios_setup_bridge()
  2015-06-04  6:41 ` [PATCH v5 01/42] PCI: Add pcibios_setup_bridge() Gavin Shan
@ 2015-06-05 19:44   ` Bjorn Helgaas
  2015-06-09  5:49     ` Gavin Shan
  0 siblings, 1 reply; 83+ messages in thread
From: Bjorn Helgaas @ 2015-06-05 19:44 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linuxppc-dev, linux-pci, devicetree, benh, aik, panto,
	robherring2, grant.likely

On Thu, Jun 04, 2015 at 04:41:30PM +1000, Gavin Shan wrote:
> Currently, PowerPC PowerNV platform utilizes ppc_md.pcibios_fixup(),
> which is called for once after PCI probing and resource assignment
> are completed, to allocate platform required resources for PCI devices:
> PE#, IO and MMIO mapping, DMA address translation (TCE) table etc.
> Obviously, it's not hotplug friendly.
> 
> The patch adds weak function pcibios_setup_bridge(), which is called
> by pci_setup_bridge(). PowerPC PowerNV platform will reuse the function
> to assign above platform required resources to newly added PCI devices,
> in order to support PCI hotplug in subsequent patches.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
> v5:
>   * Corrected subject as Bjorn suggested
>   * pci_setup_bridge() calls pcibios_setup_bridge() and __pci_setup_bridge()
> ---
>  drivers/pci/setup-bus.c | 5 +++++
>  include/linux/pci.h     | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 4fd0cac..623dee3 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -693,11 +693,16 @@ static void __pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>  	pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bus->bridge_ctl);
>  }
>  
> +void __weak pcibios_setup_bridge(struct pci_bus *bus, unsigned long type)
> +{
> +}
> +
>  void pci_setup_bridge(struct pci_bus *bus)
>  {
>  	unsigned long type = IORESOURCE_IO | IORESOURCE_MEM |
>  				  IORESOURCE_PREFETCH;
>  
> +	pcibios_setup_bridge(bus, type);
>  	__pci_setup_bridge(bus, type);
>  }
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 94bacfa..5aacd0a 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -811,6 +811,7 @@ void pci_stop_and_remove_bus_device_locked(struct pci_dev *dev);
>  void pci_stop_root_bus(struct pci_bus *bus);
>  void pci_remove_root_bus(struct pci_bus *bus);
>  void pci_setup_cardbus(struct pci_bus *bus);
> +void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
>  void pci_sort_breadthfirst(void);
>  #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
>  #define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
> -- 
> 2.1.0
> 
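
For readers new to the weak-hook pattern this patch relies on, the following is a minimal user-space sketch, not kernel code. It assumes a GCC- or Clang-compatible toolchain for `__attribute__((weak))`, and every `demo_` name is hypothetical. It only illustrates the ordering described above: an empty default hook runs first, then the generic step.

```c
/* User-space sketch of a weak default hook; all demo_* names are hypothetical. */

/* Counts how often the generic step ran, so callers can observe it. */
static int demo_core_calls;

/*
 * Weak default: an empty hook. A platform that needs extra work can
 * provide a strong definition with the same name in another object
 * file, and the linker will pick that one instead.
 */
__attribute__((weak)) void demo_platform_setup_bridge(int bus_nr,
						      unsigned long type)
{
	(void)bus_nr;
	(void)type;
}

/* Stand-in for the generic bridge programming step. */
static void demo_core_setup_bridge(int bus_nr, unsigned long type)
{
	(void)bus_nr;
	(void)type;
	demo_core_calls++;
}

/* Mirrors the ordering in the patch: platform hook first, generic step second. */
void demo_setup_bridge(int bus_nr)
{
	unsigned long type = 0x7UL; /* arbitrary placeholder for resource flags */

	demo_platform_setup_bridge(bus_nr, type);
	demo_core_setup_bridge(bus_nr, type);
}
```

With no strong override linked in, calling demo_setup_bridge() simply runs the generic step, which is presumably the behaviour every non-PowerNV platform keeps after this patch.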

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 31/42] powerpc/pci: Move pcibios_find_pci_bus() around
  2015-06-04  6:42 ` [PATCH v5 31/42] powerpc/pci: Move pcibios_find_pci_bus() around Gavin Shan
@ 2015-06-05 19:47   ` Bjorn Helgaas
  2015-06-09  6:10     ` Gavin Shan
  0 siblings, 1 reply; 83+ messages in thread
From: Bjorn Helgaas @ 2015-06-05 19:47 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linuxppc-dev, linux-pci, devicetree, benh, aik, panto,
	robherring2, grant.likely

"Move pcibios_find_pci_bus() from pSeries to generic powerpc code"?

On Thu, Jun 04, 2015 at 04:42:00PM +1000, Gavin Shan wrote:
> The patch moves pcibios_find_pci_bus() to PPC kerenl directory so

s/kerenl/kernel/

> that it can be reused by hotplug code for pSeries and PowerNV
> platform at the same time.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
> v5:
>   * Derived from PATCH[v4 12/21]
> ---
>  arch/powerpc/kernel/pci-hotplug.c          | 36 ++++++++++++++++++++++++++++++
>  arch/powerpc/platforms/pseries/pci_dlpar.c | 32 --------------------------
>  2 files changed, 36 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> index ca392fc..1482bc1 100644
> --- a/arch/powerpc/kernel/pci-hotplug.c
> +++ b/arch/powerpc/kernel/pci-hotplug.c
> @@ -21,6 +21,42 @@
>  #include <asm/firmware.h>
>  #include <asm/eeh.h>
>  
> +static struct pci_bus *find_pci_bus(struct pci_bus *bus,
> +				    struct device_node *dn)
> +{
> +	struct pci_bus *tmp, *child = NULL;
> +	struct device_node *busdn;
> +
> +	busdn = pci_bus_to_OF_node(bus);
> +	if (busdn == dn)
> +		return bus;
> +
> +	list_for_each_entry(tmp, &bus->children, node) {
> +		child = find_pci_bus(tmp, dn);
> +		if (child)
> +			break;
> +	}
> +
> +	return child;
> +}
> +
> +/**
> + * pcibios_find_pci_bus - find PCI bus according to the given device node
> + * @dn: Device node
> + *
> + * Find the corresponding PCI bus according to the given device node.
> + */
> +struct pci_bus *pcibios_find_pci_bus(struct device_node *dn)
> +{
> +	struct pci_dn *pdn = PCI_DN(dn);
> +
> +	if (!pdn  || !pdn->phb || !pdn->phb->bus)
> +		return NULL;
> +
> +	return find_pci_bus(pdn->phb->bus, dn);
> +}
> +EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
> +
>  /**
>   * pcibios_release_device - release PCI device
>   * @dev: PCI device
> diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c
> index 5d4a3df..906dbaa 100644
> --- a/arch/powerpc/platforms/pseries/pci_dlpar.c
> +++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
> @@ -34,38 +34,6 @@
>  
>  #include "pseries.h"
>  
> -static struct pci_bus *
> -find_bus_among_children(struct pci_bus *bus,
> -                        struct device_node *dn)
> -{
> -	struct pci_bus *child = NULL;
> -	struct pci_bus *tmp;
> -	struct device_node *busdn;
> -
> -	busdn = pci_bus_to_OF_node(bus);
> -	if (busdn == dn)
> -		return bus;
> -
> -	list_for_each_entry(tmp, &bus->children, node) {
> -		child = find_bus_among_children(tmp, dn);
> -		if (child)
> -			break;
> -	};
> -	return child;
> -}
> -
> -struct pci_bus *
> -pcibios_find_pci_bus(struct device_node *dn)
> -{
> -	struct pci_dn *pdn = dn->data;
> -
> -	if (!pdn  || !pdn->phb || !pdn->phb->bus)
> -		return NULL;
> -
> -	return find_bus_among_children(pdn->phb->bus, dn);
> -}
> -EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
> -
>  struct pci_controller *init_phb_dynamic(struct device_node *dn)
>  {
>  	struct pci_controller *phb;
> -- 
> 2.1.0
> 
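
The recursive lookup being moved here can be sketched outside the kernel as a plain depth-first search. This is only an illustration: `struct demo_bus` and `demo_find_bus()` are hypothetical names, and the child list is a simple singly linked list rather than the kernel's `list_head`.

```c
#include <stddef.h>

/* Hypothetical stand-in for a bus: an identity key plus a child list. */
struct demo_bus {
	const void *of_node;           /* identity of the matching device node */
	struct demo_bus *first_child;  /* head of the child list */
	struct demo_bus *next_sibling; /* next bus at the same level */
};

/* Depth-first search: return the bus whose of_node equals dn, or NULL. */
struct demo_bus *demo_find_bus(struct demo_bus *bus, const void *dn)
{
	struct demo_bus *tmp, *found;

	if (!bus)
		return NULL;
	if (bus->of_node == dn)
		return bus;

	/* Descend into each child; stop at the first match. */
	for (tmp = bus->first_child; tmp; tmp = tmp->next_sibling) {
		found = demo_find_bus(tmp, dn);
		if (found)
			return found;
	}

	return NULL;
}
```

As in the patch, the search starts from a root bus and compares node identity by pointer, so a node that is not attached to any bus under that root yields NULL.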

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-06-04  6:42 ` [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
@ 2015-06-05 20:11       ` Bjorn Helgaas
  2015-06-30 18:18     ` Grant Likely
  1 sibling, 0 replies; 83+ messages in thread
From: Bjorn Helgaas @ 2015-06-05 20:11 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linuxppc-dev, linux-pci, devicetree, benh, aik, panto,
	robherring2, grant.likely

On Thu, Jun 04, 2015 at 04:42:11PM +1000, Gavin Shan wrote:
> The patch intends to add standalone driver to support PCI hotplug
> for PowerPC PowerNV platform, which runs on top of skiboot firmware.
> The firmware identified hotpluggable slots and marked their device
> tree node with proper "ibm,slot-pluggable" and "ibm,reset-by-firmware".
> The driver simply scans device-tree to create/register PCI hotplug slot
> accordingly.
> 
> If the skiboot firmware doesn't support slot status retrieval, the PCI
> slot device node shouldn't have property "ibm,reset-by-firmware". In
> that case, none of valid PCI slots will be detected from device tree.
> The skiboot firmware doesn't export the capability to access attention
> LEDs yet and it's something for TBD.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

But I do have a few comments (my ack is valid whether you do anything with
them or not):

> +static void slot_power_off_handler(struct powernv_php_slot *slot)
> +{
> +	int ret;
> +
> +	/* Release the firmware data for the child device nodes */
> +	remove_child_pdn(slot->dn);
> +
> +	/*
> +	 * Release the child device nodes. If the sub-tree was
> +	 * built with the help of overlay, we just need revert
> +	 * the changes introduced by the overlay
> +	 */
> +	if (slot->overlay_id >= 0) {
> +		ret = of_overlay_destroy(slot->overlay_id);
> +		if (ret)
> +			pr_warn("%s: Error %d destroying overlay %d\n",
> +				__func__, ret, slot->overlay_id);

For this and similar messages: isn't there a device you can use with
dev_warn() here?  I think a device name would be much better than a
function name.
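
To make the difference concrete, here is a small user-space sketch of a message prefixed with a device name instead of a function name. It is not the kernel's dev_warn(); `struct demo_device` and `demo_dev_warn()` are hypothetical, and the device name used in the usage note is made up.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical device: only the name matters for this sketch. */
struct demo_device {
	const char *name;
};

/*
 * Format "<device name>: <message>" into buf. Returns the number of
 * characters that the full message needs (excluding the terminator),
 * or a negative value on a formatting error.
 */
int demo_dev_warn(char *buf, size_t len, const struct demo_device *dev,
		  const char *fmt, ...)
{
	va_list ap;
	int n, m;

	n = snprintf(buf, len, "%s: ", dev->name);
	if (n < 0 || (size_t)n >= len)
		return n;

	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}
```

For a device named "0000:01:00.0", a call with the format "Error %d destroying overlay %d\n" produces a line that starts with the device name, which tells the reader which slot failed rather than which function printed the message.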

> +scan:
> +	switch (presence) {
> +	case POWERNV_PHP_SLOT_PRESENT:
> +		if (rescan) {
> +			pci_lock_rescan_remove();
> +			pcibios_add_pci_devices(slot->bus);

You didn't add this, but "pcibios_add_pci_devices" doesn't seem like the
right name.  "pcibios" generally refers to an arch-specific hook that's
called by the generic PCI core.  In this case, pcibios_add_pci_devices()
contains powerpc-specific code, and it's only called from powerpc code, so
I think using "pcibios_" in the name is a bit misleading.

> +	/* Remove all devices behind the slot */
> +	pci_lock_rescan_remove();
> +	pcibios_remove_pci_devices(slot->bus);

Same comment for pcibios_remove_pci_devices().  It would be better if the
name didn't suggest that this was part of the pcibios_ interface between
the PCI core and the arch code, because it's not.

> +	/* Slot indentifier */

s/indentifier/identifier/

> +	if (!php_slot_get_id(dn, &id))
> +		return NULL;
> +

> +	/* PCI bus */
> +	bus = pcibios_find_pci_bus(dn);

And pcibios_find_pci_bus() (it's also powerpc-specific).

Bjorn

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-06-05 20:11       ` Bjorn Helgaas
@ 2015-06-05 20:18           ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-06-05 20:18 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, aik, panto,
	robherring2, grant.likely

On Fri, 2015-06-05 at 15:11 -0500, Bjorn Helgaas wrote:

> You didn't add this, but "pcibios_add_pci_devices" doesn't seem like the
> right name.  "pcibios" generally refers to an arch-specific hook that's
> called by the generic PCI core.  In this case, pcibios_add_pci_devices()
> contains powerpc-specific code, and it's only called from powerpc code, so
> I think using "pcibios_" in the name is a bit misleading.

Maybe but just calling it pci_add_* makes it easy to confuse with a core
function and ppc_add_* is gross :-)

> > +	/* Remove all devices behind the slot */
> > +	pci_lock_rescan_remove();
> > +	pcibios_remove_pci_devices(slot->bus);
> 
> Same comment for pcibios_remove_pci_devices().  It would be better if the
> name didn't suggest that this was part of the pcibios_ interface between
> the PCI core and the arch code, because it's not.
> 
> > +	/* Slot indentifier */
> 
> s/indentifier/identifier/
> 
> > +	if (!php_slot_get_id(dn, &id))
> > +		return NULL;
> > +
> 
> > +	/* PCI bus */
> > +	bus = pcibios_find_pci_bus(dn);
> 
> And pcibios_find_pci_bus() (it's also powerpc-specific).

This one could actually move to of_pci.c and be generic, something like
of_pci_node_to_bus()

Ben.



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 01/42] PCI: Add pcibios_setup_bridge()
  2015-06-05 19:44   ` Bjorn Helgaas
@ 2015-06-09  5:49     ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-09  5:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, benh, aik,
	panto, robherring2, grant.likely

On Fri, Jun 05, 2015 at 02:44:32PM -0500, Bjorn Helgaas wrote:
>On Thu, Jun 04, 2015 at 04:41:30PM +1000, Gavin Shan wrote:
>> Currently, PowerPC PowerNV platform utilizes ppc_md.pcibios_fixup(),
>> which is called for once after PCI probing and resource assignment
>> are completed, to allocate platform required resources for PCI devices:
>> PE#, IO and MMIO mapping, DMA address translation (TCE) table etc.
>> Obviously, it's not hotplug friendly.
>> 
>> The patch adds weak function pcibios_setup_bridge(), which is called
>> by pci_setup_bridge(). PowerPC PowerNV platform will reuse the function
>> to assign above platform required resources to newly added PCI devices,
>> in order to support PCI hotplug in subsequent patches.
>> 
>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>
>Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>

Thanks for your review, Bjorn.

Thanks,
Gavin

>> ---
>> v5:
>>   * Corrected subject as Bjorn suggested
>>   * pci_setup_bridge() calls pcibios_setup_bridge() and __pci_setup_bridge()
>> ---
>>  drivers/pci/setup-bus.c | 5 +++++
>>  include/linux/pci.h     | 1 +
>>  2 files changed, 6 insertions(+)
>> 
>> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
>> index 4fd0cac..623dee3 100644
>> --- a/drivers/pci/setup-bus.c
>> +++ b/drivers/pci/setup-bus.c
>> @@ -693,11 +693,16 @@ static void __pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>>  	pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bus->bridge_ctl);
>>  }
>>  
>> +void __weak pcibios_setup_bridge(struct pci_bus *bus, unsigned long type)
>> +{
>> +}
>> +
>>  void pci_setup_bridge(struct pci_bus *bus)
>>  {
>>  	unsigned long type = IORESOURCE_IO | IORESOURCE_MEM |
>>  				  IORESOURCE_PREFETCH;
>>  
>> +	pcibios_setup_bridge(bus, type);
>>  	__pci_setup_bridge(bus, type);
>>  }
>>  
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 94bacfa..5aacd0a 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -811,6 +811,7 @@ void pci_stop_and_remove_bus_device_locked(struct pci_dev *dev);
>>  void pci_stop_root_bus(struct pci_bus *bus);
>>  void pci_remove_root_bus(struct pci_bus *bus);
>>  void pci_setup_cardbus(struct pci_bus *bus);
>> +void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
>>  void pci_sort_breadthfirst(void);
>>  #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
>>  #define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
>> -- 
>> 2.1.0
>> 
>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-06-05 20:11       ` Bjorn Helgaas
  (?)
  (?)
@ 2015-06-09  6:08       ` Gavin Shan
  -1 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-09  6:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, benh, aik,
	panto, robherring2, grant.likely

On Fri, Jun 05, 2015 at 03:11:10PM -0500, Bjorn Helgaas wrote:
>On Thu, Jun 04, 2015 at 04:42:11PM +1000, Gavin Shan wrote:
>> The patch intends to add standalone driver to support PCI hotplug
>> for PowerPC PowerNV platform, which runs on top of skiboot firmware.
>> The firmware identified hotpluggable slots and marked their device
>> tree node with proper "ibm,slot-pluggable" and "ibm,reset-by-firmware".
>> The driver simply scans device-tree to create/register PCI hotplug slot
>> accordingly.
>> 
>> If the skiboot firmware doesn't support slot status retrieval, the PCI
>> slot device node shouldn't have property "ibm,reset-by-firmware". In
>> that case, none of valid PCI slots will be detected from device tree.
>> The skiboot firmware doesn't export the capability to access attention
>> LEDs yet and it's something for TBD.
>> 
>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>
>Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>
>But I do have a few comments (my ack is valid whether you do anything with
>them or not):
>

Thanks for your review, Bjorn.

>> +static void slot_power_off_handler(struct powernv_php_slot *slot)
>> +{
>> +	int ret;
>> +
>> +	/* Release the firmware data for the child device nodes */
>> +	remove_child_pdn(slot->dn);
>> +
>> +	/*
>> +	 * Release the child device nodes. If the sub-tree was
>> +	 * built with the help of overlay, we just need revert
>> +	 * the changes introduced by the overlay
>> +	 */
>> +	if (slot->overlay_id >= 0) {
>> +		ret = of_overlay_destroy(slot->overlay_id);
>> +		if (ret)
>> +			pr_warn("%s: Error %d destroying overlay %d\n",
>> +				__func__, ret, slot->overlay_id);
>
>For this and similar messages: isn't there a device you can use with
>dev_warn() here?  I think a device name would be much better than a
>function name.
>

The slot does hold a PCI bus (struct powernv_php_slot::bus), but it's
not always valid. So I'll add one more field, "struct pci_dev *pdev",
initialized to the parent PCI device of the slot, and print those
messages with dev_warn().

>> +scan:
>> +	switch (presence) {
>> +	case POWERNV_PHP_SLOT_PRESENT:
>> +		if (rescan) {
>> +			pci_lock_rescan_remove();
>> +			pcibios_add_pci_devices(slot->bus);
>
>You didn't add this, but "pcibios_add_pci_devices" doesn't seem like the
>right name.  "pcibios" generally refers to an arch-specific hook that's
>called by the generic PCI core.  In this case, pcibios_add_pci_devices()
>contains powerpc-specific code, and it's only called from powerpc code, so
>I think using "pcibios_" in the name is a bit misleading.
>

Ben already suggested some better names in another reply. I'll pick
one of them if you agree: pci_add_pci_devices().

>> +	/* Remove all devices behind the slot */
>> +	pci_lock_rescan_remove();
>> +	pcibios_remove_pci_devices(slot->bus);
>
>Same comment for pcibios_remove_pci_devices().  It would be better if the
>name didn't suggest that this was part of the pcibios_ interface between
>the PCI core and the arch code, because it's not.
>

Following Ben's suggestion in another reply, it would become
pci_remove_pci_devices(), if you agree :-)

>> +	/* Slot indentifier */
>
>s/indentifier/identifier/
>

Thanks for pointing it out. I'll fix it up in the next revision.

>> +	if (!php_slot_get_id(dn, &id))
>> +		return NULL;
>> +
>
>> +	/* PCI bus */
>> +	bus = pcibios_find_pci_bus(dn);
>
>And pcibios_find_pci_bus() (it's also powerpc-specific).
>

I'll pick Ben's suggested name if you agree: of_pci_node_to_bus().

Thanks,
Gavin


>Bjorn
>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-06-05 20:18           ` Benjamin Herrenschmidt
  (?)
@ 2015-06-09  6:10           ` Gavin Shan
  -1 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-09  6:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Bjorn Helgaas, Gavin Shan, linuxppc-dev, linux-pci, devicetree,
	aik, panto, robherring2, grant.likely

On Sat, Jun 06, 2015 at 06:18:15AM +1000, Benjamin Herrenschmidt wrote:
>On Fri, 2015-06-05 at 15:11 -0500, Bjorn Helgaas wrote:
>
>> You didn't add this, but "pcibios_add_pci_devices" doesn't seem like the
>> right name.  "pcibios" generally refers to an arch-specific hook that's
>> called by the generic PCI core.  In this case, pcibios_add_pci_devices()
>> contains powerpc-specific code, and it's only called from powerpc code, so
>> I think using "pcibios_" in the name is a bit misleading.
>
>Maybe but just calling it pci_add_* makes it easy to confuse with a core
>function and ppc_add_* is gross :-)
>
>> > +	/* Remove all devices behind the slot */
>> > +	pci_lock_rescan_remove();
>> > +	pcibios_remove_pci_devices(slot->bus);
>> 
>> Same comment for pcibios_remove_pci_devices().  It would be better if the
>> name didn't suggest that this was part of the pcibios_ interface between
>> the PCI core and the arch code, because it's not.
>> 
>> > +	/* Slot indentifier */
>> 
>> s/indentifier/identifier/
>> 
>> > +	if (!php_slot_get_id(dn, &id))
>> > +		return NULL;
>> > +
>> 
>> > +	/* PCI bus */
>> > +	bus = pcibios_find_pci_bus(dn);
>> 
>> And pcibios_find_pci_bus() (it's also powerpc-specific).
>
>This one could actually move to of_pci.c and be generic, something like
>of_pci_node_to_bus()
>

Thanks, Ben. I'll rename those functions as below if Bjorn doesn't object:

pcibios_add_pci_devices()      ->  pci_add_pci_devices()
pcibios_remove_pci_devices()   ->  pci_remove_pci_devices()
pcibios_find_pci_bus()         ->  of_node_to_pci_bus()

Thanks,
Gavin

>Ben.
>
>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 31/42] powerpc/pci: Move pcibios_find_pci_bus() around
  2015-06-05 19:47   ` Bjorn Helgaas
@ 2015-06-09  6:10     ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-09  6:10 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, benh, aik,
	panto, robherring2, grant.likely

On Fri, Jun 05, 2015 at 02:47:30PM -0500, Bjorn Helgaas wrote:
>"Move pcibios_find_pci_bus() from pSeries to generic powerpc code"?
>
>On Thu, Jun 04, 2015 at 04:42:00PM +1000, Gavin Shan wrote:
>> The patch moves pcibios_find_pci_bus() to PPC kerenl directory so
>
>s/kerenl/kernel/
>

Thanks. I'll fix it in the next revision.

Thanks,
Gavin

>> that it can be reused by hotplug code for pSeries and PowerNV
>> platform at the same time.
>> 
>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> ---
>> v5:
>>   * Derived from PATCH[v4 12/21]
>> ---
>>  arch/powerpc/kernel/pci-hotplug.c          | 36 ++++++++++++++++++++++++++++++
>>  arch/powerpc/platforms/pseries/pci_dlpar.c | 32 --------------------------
>>  2 files changed, 36 insertions(+), 32 deletions(-)
>> 
>> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
>> index ca392fc..1482bc1 100644
>> --- a/arch/powerpc/kernel/pci-hotplug.c
>> +++ b/arch/powerpc/kernel/pci-hotplug.c
>> @@ -21,6 +21,42 @@
>>  #include <asm/firmware.h>
>>  #include <asm/eeh.h>
>>  
>> +static struct pci_bus *find_pci_bus(struct pci_bus *bus,
>> +				    struct device_node *dn)
>> +{
>> +	struct pci_bus *tmp, *child = NULL;
>> +	struct device_node *busdn;
>> +
>> +	busdn = pci_bus_to_OF_node(bus);
>> +	if (busdn == dn)
>> +		return bus;
>> +
>> +	list_for_each_entry(tmp, &bus->children, node) {
>> +		child = find_pci_bus(tmp, dn);
>> +		if (child)
>> +			break;
>> +	}
>> +
>> +	return child;
>> +}
>> +
>> +/**
>> + * pcibios_find_pci_bus - find PCI bus according to the given device node
>> + * @dn: Device node
>> + *
>> + * Find the corresponding PCI bus according to the given device node.
>> + */
>> +struct pci_bus *pcibios_find_pci_bus(struct device_node *dn)
>> +{
>> +	struct pci_dn *pdn = PCI_DN(dn);
>> +
>> +	if (!pdn  || !pdn->phb || !pdn->phb->bus)
>> +		return NULL;
>> +
>> +	return find_pci_bus(pdn->phb->bus, dn);
>> +}
>> +EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
>> +
>>  /**
>>   * pcibios_release_device - release PCI device
>>   * @dev: PCI device
>> diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c
>> index 5d4a3df..906dbaa 100644
>> --- a/arch/powerpc/platforms/pseries/pci_dlpar.c
>> +++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
>> @@ -34,38 +34,6 @@
>>  
>>  #include "pseries.h"
>>  
>> -static struct pci_bus *
>> -find_bus_among_children(struct pci_bus *bus,
>> -                        struct device_node *dn)
>> -{
>> -	struct pci_bus *child = NULL;
>> -	struct pci_bus *tmp;
>> -	struct device_node *busdn;
>> -
>> -	busdn = pci_bus_to_OF_node(bus);
>> -	if (busdn == dn)
>> -		return bus;
>> -
>> -	list_for_each_entry(tmp, &bus->children, node) {
>> -		child = find_bus_among_children(tmp, dn);
>> -		if (child)
>> -			break;
>> -	};
>> -	return child;
>> -}
>> -
>> -struct pci_bus *
>> -pcibios_find_pci_bus(struct device_node *dn)
>> -{
>> -	struct pci_dn *pdn = dn->data;
>> -
>> -	if (!pdn  || !pdn->phb || !pdn->phb->bus)
>> -		return NULL;
>> -
>> -	return find_bus_among_children(pdn->phb->bus, dn);
>> -}
>> -EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
>> -
>>  struct pci_controller *init_phb_dynamic(struct device_node *dn)
>>  {
>>  	struct pci_controller *phb;
>> -- 
>> 2.1.0
>> 
>
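
For readers following the move, the lookup in the quoted hunk is a plain depth-first walk over the bus tree. Below is a minimal userspace sketch of that idea. The demo_* types and the sibling-pointer child list are hypothetical stand-ins for struct pci_bus, struct device_node and the kernel list helpers, not the kernel definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct device_node and struct pci_bus. */
struct demo_node {
	const char *name;
};

struct demo_bus {
	struct demo_node *of_node;	/* device node backing this bus */
	struct demo_bus *first_child;	/* head of the child bus list */
	struct demo_bus *next_sibling;	/* next bus under the same parent */
};

/* Depth-first search: return the bus whose device node is dn, or NULL. */
static struct demo_bus *demo_find_bus(struct demo_bus *bus,
				      const struct demo_node *dn)
{
	struct demo_bus *tmp, *found;

	if (!bus)
		return NULL;
	if (bus->of_node == dn)
		return bus;

	for (tmp = bus->first_child; tmp; tmp = tmp->next_sibling) {
		found = demo_find_bus(tmp, dn);
		if (found)
			return found;
	}

	return NULL;
}
```

As in the patch, the search starts at the root bus of the PHB, so a node that lives under a different PHB is simply not found.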

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 08/42] powerpc/powernv: DMA32 cleanup
  2015-06-04  6:41 ` [PATCH v5 08/42] powerpc/powernv: DMA32 cleanup Gavin Shan
@ 2015-06-10  4:17   ` Alexey Kardashevskiy
  2015-06-10  6:12     ` Gavin Shan
  0 siblings, 1 reply; 83+ messages in thread
From: Alexey Kardashevskiy @ 2015-06-10  4:17 UTC (permalink / raw)
  To: Gavin Shan, linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, panto, robherring2, grant.likely

On 06/04/2015 04:41 PM, Gavin Shan wrote:
> The patch cleans up DMA32 in pci-ioda.c. It shouldn't introduce
> behavioural changes:
>
>     * Rename various fields in "struct pnv_phb" and "struct pnv_ioda_pe"
>       as 32-bits DMA should be related to "DMA", not "TCE", and move
>       them around to reflect their relationship and their relative
>       importance.
>     * Removed struct pnv_ioda_pe::tce32_segcount.
>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
> v5:
>    * Split from PATCH[v4 5/21]
> ---
>   arch/powerpc/platforms/powernv/pci-ioda.c | 48 +++++++++++++++----------------
>   arch/powerpc/platforms/powernv/pci.h      | 13 +++------
>   2 files changed, 28 insertions(+), 33 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index d9ff739..4af3d06 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -971,7 +971,7 @@ static void pnv_ioda_link_pe_by_weight(struct pnv_phb *phb,
>   	struct pnv_ioda_pe *lpe;
>
>   	list_for_each_entry(lpe, &phb->ioda.pe_dma_list, dma_link) {
> -		if (lpe->dma_weight < pe->dma_weight) {
> +		if (lpe->dma32_weight < pe->dma32_weight) {
>   			list_add_tail(&pe->dma_link, &lpe->dma_link);
>   			return;
>   		}
> @@ -996,14 +996,14 @@ static unsigned int pnv_ioda_dev_dma_weight(struct pci_dev *dev)
>   	if (dev->class == PCI_CLASS_SERIAL_USB_UHCI ||
>   	    dev->class == PCI_CLASS_SERIAL_USB_OHCI ||
>   	    dev->class == PCI_CLASS_SERIAL_USB_EHCI)
> -		return 3 * phb->ioda.tce32_count;
> +		return 3 * phb->ioda.dma32_segcount;
>
>   	/* Increase the weight of RAID (includes Obsidian) */
>   	if ((dev->class >> 8) == PCI_CLASS_STORAGE_RAID)
> -		return 15 * phb->ioda.tce32_count;
> +		return 15 * phb->ioda.dma32_segcount;
>
>   	/* Default */
> -	return 10 * phb->ioda.tce32_count;
> +	return 10 * phb->ioda.dma32_segcount;
>   }
>
>   static int __pnv_ioda_phb_dma_weight(struct pci_dev *pdev, void *data)
> @@ -1182,7 +1182,7 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
>   			continue;
>   		}
>   		pdn->pe_number = pe->pe_number;
> -		pe->dma_weight += pnv_ioda_dev_dma_weight(dev);
> +		pe->dma32_weight += pnv_ioda_dev_dma_weight(dev);
>   		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
>   			pnv_ioda_setup_same_PE(dev->subordinate, pe);
>   	}
> @@ -1219,10 +1219,10 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
>   	pe->flags |= (all ? PNV_IODA_PE_BUS_ALL : PNV_IODA_PE_BUS);
>   	pe->pbus = bus;
>   	pe->pdev = NULL;
> -	pe->tce32_seg = -1;
> +	pe->dma32_seg = -1;
>   	pe->mve_number = -1;
>   	pe->rid = bus->busn_res.start << 8;
> -	pe->dma_weight = 0;
> +	pe->dma32_weight = 0;
>
>   	if (all)
>   		pe_info(pe, "Secondary bus %d..%d associated with PE#%d\n",
> @@ -1585,7 +1585,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>   		pe->flags = PNV_IODA_PE_VF;
>   		pe->pbus = NULL;
>   		pe->parent_dev = pdev;
> -		pe->tce32_seg = -1;
> +		pe->dma32_seg = -1;
>   		pe->mve_number = -1;
>   		pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
>   			   pci_iov_virtfn_devfn(pdev, vf_index);
> @@ -2061,7 +2061,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>   	/* XXX FIXME: Allocate multi-level tables on PHB3 */
>
>   	/* We shouldn't already have a 32-bit DMA associated */
> -	if (WARN_ON(pe->tce32_seg >= 0))
> +	if (WARN_ON(pe->dma32_seg >= 0))
>   		return;
>
>   	tbl = pnv_pci_table_alloc(phb->hose->node);
> @@ -2070,7 +2070,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>   	pnv_pci_link_table_and_group(phb->hose->node, 0, tbl, &pe->table_group);
>
>   	/* Grab a 32-bit TCE table */
> -	pe->tce32_seg = base;
> +	pe->dma32_seg = base;
>   	pe_info(pe, " Setting up 32-bit TCE table at %08x..%08x\n",
>   		(base << 28), ((base + segs) << 28) - 1);
>
> @@ -2131,8 +2131,8 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>   	return;
>    fail:
>   	/* XXX Failure: Try to fallback to 64-bit only ? */
> -	if (pe->tce32_seg >= 0)
> -		pe->tce32_seg = -1;
> +	if (pe->dma32_seg >= 0)
> +		pe->dma32_seg = -1;
>   	if (tce_mem)
>   		__free_pages(tce_mem, get_order(TCE32_TABLE_SIZE * segs));
>   	if (tbl) {
> @@ -2520,7 +2520,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>   	int64_t rc;
>
>   	/* We shouldn't already have a 32-bit DMA associated */
> -	if (WARN_ON(pe->tce32_seg >= 0))
> +	if (WARN_ON(pe->dma32_seg >= 0))
>   		return;
>
>   	/* TVE #1 is selected by PCI address bit 59 */
> @@ -2530,7 +2530,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>   			pe->pe_number);
>
>   	/* The PE will reserve all possible 32-bits space */
> -	pe->tce32_seg = 0;
> +	pe->dma32_seg = 0;
>   	pe_info(pe, "Setting up 32-bit TCE table at 0..%08x\n",
>   		phb->ioda.m32_pci_base);
>
> @@ -2547,8 +2547,8 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>
>   	rc = pnv_pci_ioda2_setup_default_config(pe);
>   	if (rc) {
> -		if (pe->tce32_seg >= 0)
> -			pe->tce32_seg = -1;
> +		if (pe->dma32_seg >= 0)
> +			pe->dma32_seg = -1;
>   		return;
>   	}
>
> @@ -2567,7 +2567,7 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
>   	/* Calculate the PHB's DMA weight */
>   	dma_weight = pnv_ioda_phb_dma_weight(phb);
>   	pr_info("PCI%04x has %ld DMA32 segments, total weight %d\n",
> -		hose->global_number, phb->ioda.tce32_count, dma_weight);
> +		hose->global_number, phb->ioda.dma32_segcount, dma_weight);
>
>   	pnv_pci_ioda_setup_opal_tce_kill(phb);
>
> @@ -2576,7 +2576,7 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
>   	 * weight
>   	 */
>   	list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
> -		if (!pe->dma_weight)
> +		if (!pe->dma32_weight)
>   			continue;
>
>   		/*
> @@ -2587,15 +2587,15 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
>   		if (phb->type == PNV_PHB_IODA1) {
>   			unsigned int segs, base = 0;
>
> -			if (pe->dma_weight <
> -			    dma_weight / phb->ioda.tce32_count)
> +			if (pe->dma32_weight <
> +			    dma_weight / phb->ioda.dma32_segcount)
>   				segs = 1;
>   			else
> -				segs = (pe->dma_weight *
> -					phb->ioda.tce32_count) / dma_weight;
> +				segs = (pe->dma32_weight *
> +					phb->ioda.dma32_segcount) / dma_weight;
>
>   			pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
> -				pe->dma_weight, segs);
> +				pe->dma32_weight, segs);
>   			pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
>
>   			base += segs;
> @@ -3314,7 +3314,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>   	mutex_init(&phb->ioda.pe_list_mutex);
>
>   	/* Calculate how many 32-bit TCE segments we have */
> -	phb->ioda.tce32_count = phb->ioda.m32_pci_base >> 28;
> +	phb->ioda.dma32_segcount = phb->ioda.m32_pci_base >> 28;
>
>   #if 0 /* We should really do that ... */
>   	rc = opal_pci_set_phb_mem_window(opal->phb_id,
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 38d8616..5ea33ca 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -58,15 +58,10 @@ struct pnv_ioda_pe {
>   	unsigned long		m32_segmap[8];
>   	unsigned long		m64_segmap[8];
>
> -	/* "Weight" assigned to the PE for the sake of DMA resource
> -	 * allocations
> -	 */
> -	unsigned int		dma_weight;


This belongs to the previous patch, more precisely to the part of the 
previous patch which changes stuff for PHB3 and which you want to move to a 
separate patch (or to this one, up to you).


> -
>   	/* "Base" iommu table, ie, 4K TCEs, 32-bit DMA */
> -	int			tce32_seg;
> -	int			tce32_segcount;
>   	struct iommu_table_group table_group;
> +	int			dma32_seg;
> +	unsigned int		dma32_weight;

Tiny comment - you not only renamed the fields but also moved them :)


>
>   	/* 64-bit TCE bypass region */
>   	bool			tce_bypass_enabled;
> @@ -182,8 +177,8 @@ struct pnv_phb {
>   			 */
>   			unsigned char		pe_rmap[0x10000];
>
> -			/* 32-bit TCE tables allocation */
> -			unsigned long		tce32_count;
> +			/* Number of 32-bit DMA segments */
> +			unsigned long		dma32_segcount;
>
>   			/* Sorted list of used PE's, sorted at
>   			 * boot for resource allocation purposes
>
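
As a side note on what the renamed fields feed into, here is a small userspace sketch of the IODA1 segment arithmetic visible in the quoted hunks: the segment count is the 32-bit MMIO base shifted right by 28 (256MB segments), and each PE gets a share proportional to its weight. The demo_* names and the zero guard are additions for this sketch, not kernel code.

```c
/* Number of 256MB DMA32 segments below the 32-bit MMIO base (base >> 28). */
static unsigned long demo_dma32_segcount(unsigned long m32_pci_base)
{
	return m32_pci_base >> 28;
}

/* Segments assigned to one PE, mirroring the IODA1 branch of the hunk. */
static unsigned int demo_dma32_segs(unsigned int pe_weight,
				    unsigned int total_weight,
				    unsigned long segcount)
{
	/* Guard added for the sketch only; the quoted hunk has no such check. */
	if (!total_weight || !segcount)
		return 0;

	/* A PE lighter than one segment's worth still gets one segment. */
	if (pe_weight < total_weight / segcount)
		return 1;

	return (unsigned int)((pe_weight * segcount) / total_weight);
}
```

With a 2GB base there are 8 segments, so a PE carrying half of the total weight ends up with 4 of them.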


-- 
Alexey

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 11/42] powerpc/powernv: Increase PE# capacity
  2015-06-04  6:41 ` [PATCH v5 11/42] powerpc/powernv: Increase PE# capacity Gavin Shan
@ 2015-06-10  4:41   ` Alexey Kardashevskiy
  2015-06-10  6:18     ` Gavin Shan
  0 siblings, 1 reply; 83+ messages in thread
From: Alexey Kardashevskiy @ 2015-06-10  4:41 UTC (permalink / raw)
  To: Gavin Shan, linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, panto, robherring2, grant.likely

On 06/04/2015 04:41 PM, Gavin Shan wrote:
> Each PHB maintains an array helping to translate RID (Request
> ID) to PE# with the assumption that PE# takes 8 bits, indicating
> that we can't have more than 256 PEs. However, pci_dn->pe_number
> already had 4-bytes for the PE#.
>
> The patch extends the PE# capacity so that each of them will be
> 4-bytes long. Then we can use IODA_INVALID_PE to check one entry
> in phb->pe_rmap[] is valid or not.
>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
> v5:
>    * Split from [PATCH v5 v4 06/21]
> ---
>   arch/powerpc/platforms/powernv/pci-ioda.c | 5 ++++-
>   arch/powerpc/platforms/powernv/pci.h      | 5 ++---
>   2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 2087c5c..d8b0ef5 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -840,7 +840,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
>
>   	/* Clear the reverse map */
>   	for (rid = pe->rid; rid < rid_end; rid++)
> -		phb->ioda.pe_rmap[rid] = 0;
> +		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
>
>   	/* Release from all parents PELT-V */
>   	while (parent) {
> @@ -3303,6 +3303,9 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>   	if (prop32)
>   		phb->ioda.reserved_pe = be32_to_cpup(prop32);
>
> +	/* Invalidate RID to PE# mapping */
> +	memset(phb->ioda.pe_rmap, 0xff, sizeof(phb->ioda.pe_rmap));


Above you assign IODA_INVALID_PE in a loop and here you just do 0xff for 
the entire array. Have a loop here too and assign IODA_INVALID_PE to every 
entry:
for (i = 0; i < ARRAY_SIZE(phb->ioda.pe_rmap); ++i)
	phb->ioda.pe_rmap[i] = IODA_INVALID_PE;
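
To make the difference concrete, a minimal userspace sketch of the two initialisation styles follows. It assumes IODA_INVALID_PE is -1 (all bits set), which should be checked against pci.h; the array and the helper names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: the invalid-PE sentinel is -1, i.e. all bits set. */
#define IODA_INVALID_PE	(-1)
#define RMAP_ENTRIES	0x10000

#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

static int pe_rmap[RMAP_ENTRIES];

/* Byte-wise fill: matches the sentinel only because -1 is all 0xff bytes. */
static void rmap_init_memset(void)
{
	memset(pe_rmap, 0xff, sizeof(pe_rmap));
}

/* Explicit fill: stays correct for any sentinel value. */
static void rmap_init_loop(void)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(pe_rmap); i++)
		pe_rmap[i] = IODA_INVALID_PE;
}

/* Returns 1 if every entry holds the sentinel, 0 otherwise. */
static int rmap_all_invalid(void)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(pe_rmap); i++)
		if (pe_rmap[i] != IODA_INVALID_PE)
			return 0;
	return 1;
}
```

Under that assumption both variants leave the array in the same state; the loop just does not depend on the byte pattern of the sentinel.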



> +
>   	/* Parse 64-bit MMIO range */
>   	pnv_ioda_parse_m64_window(phb);
>
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 94ef1df..590f778 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -175,11 +175,10 @@ struct pnv_phb {
>   			struct list_head	pe_list;
>   			struct mutex            pe_list_mutex;
>
> -			/* Reverse map of PEs, will have to extend if
> -			 * we are to support more than 256 PEs, indexed
> +			/* Reverse map of PEs, indexed by
>   			 * bus { bus, devfn }
>   			 */
> -			unsigned char		pe_rmap[0x10000];
> +			int			pe_rmap[0x10000];


Most of the time most of the array will be empty, and it is 256K per PHB... I
understand we have quite a lot of RAM but still.
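
The arithmetic behind that figure, as a small sketch: the map has one entry per 16-bit RID (bus number in the high byte, devfn in the low byte, as in the pe->rid assignments elsewhere in the series), so 4-byte entries give 0x10000 * 4 = 256K, against 64K for the old unsigned char entries. The helper names are hypothetical.

```c
#include <stddef.h>

/* One entry per 16-bit RID: bus (8 bits) and devfn (8 bits). */
#define RMAP_ENTRIES	0x10000

/* Size in bytes of a reverse map with the given per-entry size. */
static size_t rmap_bytes(size_t entry_size)
{
	return RMAP_ENTRIES * entry_size;
}

/* RID used as the array index: bus in the high byte, devfn in the low byte. */
static unsigned int rid_of(unsigned int bus, unsigned int devfn)
{
	return ((bus & 0xff) << 8) | (devfn & 0xff);
}
```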


>
>   			/* Number of 32-bit DMA segments */
>   			unsigned long		dma32_segcount;
>


-- 
Alexey

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 12/42] powerpc/pci: Cleanup on pci_controller_ops
  2015-06-04  6:41 ` [PATCH v5 12/42] powerpc/pci: Cleanup on pci_controller_ops Gavin Shan
@ 2015-06-10  4:43   ` Alexey Kardashevskiy
  2015-06-10  6:20       ` Gavin Shan
  0 siblings, 1 reply; 83+ messages in thread
From: Alexey Kardashevskiy @ 2015-06-10  4:43 UTC (permalink / raw)
  To: Gavin Shan, linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, panto, robherring2,
	grant.likely, Daniel Axtens

On 06/04/2015 04:41 PM, Gavin Shan wrote:
> Each PHB maintains one instance of "struct pci_controller_ops",
> which includes various callbacks called by PCI subsystem. In the
> definition of this struct, some callbacks have explicit names for
> its arguments, but the left don't have.
>
> The patch removes all explicit names of the arguments to the
> callbacks in "struct pci_controller_ops" to keep the code look
> consistent.

imho it is a bad idea. Self-documented code gets less self-documented - how 
do I know what "unsigned long" parameters are for without grepping?
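
A small illustration of the point, using made-up types rather than the real struct pci_controller_ops: both declarations are equivalent to the compiler, but only the named one tells the reader what the second argument is. The alignment values in the demo callback are invented for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_bus. */
struct demo_bus {
	int number;
};

/* Unnamed: the reader cannot tell what the second argument means. */
struct demo_ops_unnamed {
	size_t (*window_alignment)(struct demo_bus *, unsigned long);
};

/* Named: the declaration documents itself. */
struct demo_ops_named {
	size_t (*window_alignment)(struct demo_bus *bus, unsigned long type);
};

/* Demo callback; the rule below is invented for illustration only. */
static size_t demo_window_alignment(struct demo_bus *bus, unsigned long type)
{
	(void)bus;
	/* Hypothetical: type 1 windows align to 1MB, everything else to 4KB. */
	return type == 1 ? 0x100000 : 0x1000;
}
```

The same callback can be assigned to either struct, since parameter names are not part of the function pointer type.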


>
> Cc: Daniel Axtens <dja@axtens.net>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
> v5:
>    * Newly introduced
> ---
>   arch/powerpc/include/asm/pci-bridge.h | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
> index 744884b..1252cd5 100644
> --- a/arch/powerpc/include/asm/pci-bridge.h
> +++ b/arch/powerpc/include/asm/pci-bridge.h
> @@ -18,8 +18,8 @@ struct device_node;
>    * PCI controller operations
>    */
>   struct pci_controller_ops {
> -	void		(*dma_dev_setup)(struct pci_dev *dev);
> -	void		(*dma_bus_setup)(struct pci_bus *bus);
> +	void		(*dma_dev_setup)(struct pci_dev *);
> +	void		(*dma_bus_setup)(struct pci_bus *);
>
>   	int		(*probe_mode)(struct pci_bus *);
>
> @@ -28,8 +28,8 @@ struct pci_controller_ops {
>   	bool		(*enable_device_hook)(struct pci_dev *);
>
>   	/* Called during PCI resource reassignment */
> -	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long type);
> -	void		(*reset_secondary_bus)(struct pci_dev *dev);
> +	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long);
> +	void		(*reset_secondary_bus)(struct pci_dev *);
>   };
>
>   /*
>


-- 
Alexey

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 08/42] powerpc/powernv: DMA32 cleanup
  2015-06-10  4:17   ` Alexey Kardashevskiy
@ 2015-06-10  6:12     ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-10  6:12 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, benh, bhelgaas,
	panto, robherring2, grant.likely

On Wed, Jun 10, 2015 at 02:17:26PM +1000, Alexey Kardashevskiy wrote:
>On 06/04/2015 04:41 PM, Gavin Shan wrote:
>>The patch cleans up DMA32 in pci-ioda.c. It shouldn't introduce
>>behavioural changes:
>>
>>    * Rename various fields in "struct pnv_phb" and "struct pnv_ioda_pe"
>>      as 32-bits DMA should be related to "DMA", not "TCE", and move
>>      them around to reflect their relationship and their relative
>>      importance.
>>    * Removed struct pnv_ioda_pe::tce32_segcount.
>>
>>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>---
>>v5:
>>   * Split from PATCH[v4 5/21]
>>---
>>  arch/powerpc/platforms/powernv/pci-ioda.c | 48 +++++++++++++++----------------
>>  arch/powerpc/platforms/powernv/pci.h      | 13 +++------
>>  2 files changed, 28 insertions(+), 33 deletions(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>index d9ff739..4af3d06 100644
>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>@@ -971,7 +971,7 @@ static void pnv_ioda_link_pe_by_weight(struct pnv_phb *phb,
>>  	struct pnv_ioda_pe *lpe;
>>
>>  	list_for_each_entry(lpe, &phb->ioda.pe_dma_list, dma_link) {
>>-		if (lpe->dma_weight < pe->dma_weight) {
>>+		if (lpe->dma32_weight < pe->dma32_weight) {
>>  			list_add_tail(&pe->dma_link, &lpe->dma_link);
>>  			return;
>>  		}
>>@@ -996,14 +996,14 @@ static unsigned int pnv_ioda_dev_dma_weight(struct pci_dev *dev)
>>  	if (dev->class == PCI_CLASS_SERIAL_USB_UHCI ||
>>  	    dev->class == PCI_CLASS_SERIAL_USB_OHCI ||
>>  	    dev->class == PCI_CLASS_SERIAL_USB_EHCI)
>>-		return 3 * phb->ioda.tce32_count;
>>+		return 3 * phb->ioda.dma32_segcount;
>>
>>  	/* Increase the weight of RAID (includes Obsidian) */
>>  	if ((dev->class >> 8) == PCI_CLASS_STORAGE_RAID)
>>-		return 15 * phb->ioda.tce32_count;
>>+		return 15 * phb->ioda.dma32_segcount;
>>
>>  	/* Default */
>>-	return 10 * phb->ioda.tce32_count;
>>+	return 10 * phb->ioda.dma32_segcount;
>>  }
>>
>>  static int __pnv_ioda_phb_dma_weight(struct pci_dev *pdev, void *data)
>>@@ -1182,7 +1182,7 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
>>  			continue;
>>  		}
>>  		pdn->pe_number = pe->pe_number;
>>-		pe->dma_weight += pnv_ioda_dev_dma_weight(dev);
>>+		pe->dma32_weight += pnv_ioda_dev_dma_weight(dev);
>>  		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
>>  			pnv_ioda_setup_same_PE(dev->subordinate, pe);
>>  	}
>>@@ -1219,10 +1219,10 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
>>  	pe->flags |= (all ? PNV_IODA_PE_BUS_ALL : PNV_IODA_PE_BUS);
>>  	pe->pbus = bus;
>>  	pe->pdev = NULL;
>>-	pe->tce32_seg = -1;
>>+	pe->dma32_seg = -1;
>>  	pe->mve_number = -1;
>>  	pe->rid = bus->busn_res.start << 8;
>>-	pe->dma_weight = 0;
>>+	pe->dma32_weight = 0;
>>
>>  	if (all)
>>  		pe_info(pe, "Secondary bus %d..%d associated with PE#%d\n",
>>@@ -1585,7 +1585,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>  		pe->flags = PNV_IODA_PE_VF;
>>  		pe->pbus = NULL;
>>  		pe->parent_dev = pdev;
>>-		pe->tce32_seg = -1;
>>+		pe->dma32_seg = -1;
>>  		pe->mve_number = -1;
>>  		pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
>>  			   pci_iov_virtfn_devfn(pdev, vf_index);
>>@@ -2061,7 +2061,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>>  	/* XXX FIXME: Allocate multi-level tables on PHB3 */
>>
>>  	/* We shouldn't already have a 32-bit DMA associated */
>>-	if (WARN_ON(pe->tce32_seg >= 0))
>>+	if (WARN_ON(pe->dma32_seg >= 0))
>>  		return;
>>
>>  	tbl = pnv_pci_table_alloc(phb->hose->node);
>>@@ -2070,7 +2070,7 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>>  	pnv_pci_link_table_and_group(phb->hose->node, 0, tbl, &pe->table_group);
>>
>>  	/* Grab a 32-bit TCE table */
>>-	pe->tce32_seg = base;
>>+	pe->dma32_seg = base;
>>  	pe_info(pe, " Setting up 32-bit TCE table at %08x..%08x\n",
>>  		(base << 28), ((base + segs) << 28) - 1);
>>
>>@@ -2131,8 +2131,8 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>>  	return;
>>   fail:
>>  	/* XXX Failure: Try to fallback to 64-bit only ? */
>>-	if (pe->tce32_seg >= 0)
>>-		pe->tce32_seg = -1;
>>+	if (pe->dma32_seg >= 0)
>>+		pe->dma32_seg = -1;
>>  	if (tce_mem)
>>  		__free_pages(tce_mem, get_order(TCE32_TABLE_SIZE * segs));
>>  	if (tbl) {
>>@@ -2520,7 +2520,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>>  	int64_t rc;
>>
>>  	/* We shouldn't already have a 32-bit DMA associated */
>>-	if (WARN_ON(pe->tce32_seg >= 0))
>>+	if (WARN_ON(pe->dma32_seg >= 0))
>>  		return;
>>
>>  	/* TVE #1 is selected by PCI address bit 59 */
>>@@ -2530,7 +2530,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>>  			pe->pe_number);
>>
>>  	/* The PE will reserve all possible 32-bits space */
>>-	pe->tce32_seg = 0;
>>+	pe->dma32_seg = 0;
>>  	pe_info(pe, "Setting up 32-bit TCE table at 0..%08x\n",
>>  		phb->ioda.m32_pci_base);
>>
>>@@ -2547,8 +2547,8 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>>
>>  	rc = pnv_pci_ioda2_setup_default_config(pe);
>>  	if (rc) {
>>-		if (pe->tce32_seg >= 0)
>>-			pe->tce32_seg = -1;
>>+		if (pe->dma32_seg >= 0)
>>+			pe->dma32_seg = -1;
>>  		return;
>>  	}
>>
>>@@ -2567,7 +2567,7 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
>>  	/* Calculate the PHB's DMA weight */
>>  	dma_weight = pnv_ioda_phb_dma_weight(phb);
>>  	pr_info("PCI%04x has %ld DMA32 segments, total weight %d\n",
>>-		hose->global_number, phb->ioda.tce32_count, dma_weight);
>>+		hose->global_number, phb->ioda.dma32_segcount, dma_weight);
>>
>>  	pnv_pci_ioda_setup_opal_tce_kill(phb);
>>
>>@@ -2576,7 +2576,7 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
>>  	 * weight
>>  	 */
>>  	list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link) {
>>-		if (!pe->dma_weight)
>>+		if (!pe->dma32_weight)
>>  			continue;
>>
>>  		/*
>>@@ -2587,15 +2587,15 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
>>  		if (phb->type == PNV_PHB_IODA1) {
>>  			unsigned int segs, base = 0;
>>
>>-			if (pe->dma_weight <
>>-			    dma_weight / phb->ioda.tce32_count)
>>+			if (pe->dma32_weight <
>>+			    dma_weight / phb->ioda.dma32_segcount)
>>  				segs = 1;
>>  			else
>>-				segs = (pe->dma_weight *
>>-					phb->ioda.tce32_count) / dma_weight;
>>+				segs = (pe->dma32_weight *
>>+					phb->ioda.dma32_segcount) / dma_weight;
>>
>>  			pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
>>-				pe->dma_weight, segs);
>>+				pe->dma32_weight, segs);
>>  			pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
>>
>>  			base += segs;
>>@@ -3314,7 +3314,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>>  	mutex_init(&phb->ioda.pe_list_mutex);
>>
>>  	/* Calculate how many 32-bit TCE segments we have */
>>-	phb->ioda.tce32_count = phb->ioda.m32_pci_base >> 28;
>>+	phb->ioda.dma32_segcount = phb->ioda.m32_pci_base >> 28;
>>
>>  #if 0 /* We should really do that ... */
>>  	rc = opal_pci_set_phb_mem_window(opal->phb_id,
>>diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
>>index 38d8616..5ea33ca 100644
>>--- a/arch/powerpc/platforms/powernv/pci.h
>>+++ b/arch/powerpc/platforms/powernv/pci.h
>>@@ -58,15 +58,10 @@ struct pnv_ioda_pe {
>>  	unsigned long		m32_segmap[8];
>>  	unsigned long		m64_segmap[8];
>>
>>-	/* "Weight" assigned to the PE for the sake of DMA resource
>>-	 * allocations
>>-	 */
>>-	unsigned int		dma_weight;
>
>
>This belongs to the previous patch, more precisely to the part of the
>previous patch which changes stuff for PHB3 and which you want to move to a
>separate patch (or to this one, up to you).
>

Ok. I'll try to change the code according to your comments.

>>-
>>  	/* "Base" iommu table, ie, 4K TCEs, 32-bit DMA */
>>-	int			tce32_seg;
>>-	int			tce32_segcount;
>>  	struct iommu_table_group table_group;
>>+	int			dma32_seg;
>>+	unsigned int		dma32_weight;
>
>Tiny comment - you not only renamed the fields but also moved them :)
>

Yes, as I explained in the changelog :-)

	* Rename various fields in "struct pnv_phb" and "struct pnv_ioda_pe"
	  as 32-bits DMA should be related to "DMA", not "TCE", and move
	  them around to reflect their relationship and their relative
	  importance.

>>
>>  	/* 64-bit TCE bypass region */
>>  	bool			tce_bypass_enabled;
>>@@ -182,8 +177,8 @@ struct pnv_phb {
>>  			 */
>>  			unsigned char		pe_rmap[0x10000];
>>
>>-			/* 32-bit TCE tables allocation */
>>-			unsigned long		tce32_count;
>>+			/* Number of 32-bit DMA segments */
>>+			unsigned long		dma32_segcount;
>>
>>  			/* Sorted list of used PE's, sorted at
>>  			 * boot for resource allocation purposes
>>

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 11/42] powerpc/powernv: Increase PE# capacity
  2015-06-10  4:41   ` Alexey Kardashevskiy
@ 2015-06-10  6:18     ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-10  6:18 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, benh, bhelgaas,
	panto, robherring2, grant.likely

On Wed, Jun 10, 2015 at 02:41:13PM +1000, Alexey Kardashevskiy wrote:
>On 06/04/2015 04:41 PM, Gavin Shan wrote:
>>Each PHB maintains an array helping to translate RID (Request
>>ID) to PE# with the assumption that PE# takes 8 bits, indicating
>>that we can't have more than 256 PEs. However, pci_dn->pe_number
>>already had 4-bytes for the PE#.
>>
>>The patch extends the PE# capacity so that each of them will be
>>4-bytes long. Then we can use IODA_INVALID_PE to check one entry
>>in phb->pe_rmap[] is valid or not.
>>
>>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>---
>>v5:
>>   * Split from [PATCH v5 v4 06/21]
>>---
>>  arch/powerpc/platforms/powernv/pci-ioda.c | 5 ++++-
>>  arch/powerpc/platforms/powernv/pci.h      | 5 ++---
>>  2 files changed, 6 insertions(+), 4 deletions(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>index 2087c5c..d8b0ef5 100644
>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>@@ -840,7 +840,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
>>
>>  	/* Clear the reverse map */
>>  	for (rid = pe->rid; rid < rid_end; rid++)
>>-		phb->ioda.pe_rmap[rid] = 0;
>>+		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
>>
>>  	/* Release from all parents PELT-V */
>>  	while (parent) {
>>@@ -3303,6 +3303,9 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>>  	if (prop32)
>>  		phb->ioda.reserved_pe = be32_to_cpup(prop32);
>>
>>+	/* Invalidate RID to PE# mapping */
>>+	memset(phb->ioda.pe_rmap, 0xff, sizeof(phb->ioda.pe_rmap));
>
>
>Above you assign IODA_INVALID_PE in a loop and here you just do 0xff for the
>entire array. Have a loop here too and assign IODA_INVALID_PE to every entry:
>for (i = 0; i < ARRAY_SIZE(phb->ioda.pe_rmap); ++i)
>	phb->ioda.pe_rmap[i] = IODA_INVALID_PE;
>

Yeah, will change accordingly.

>>+
>>  	/* Parse 64-bit MMIO range */
>>  	pnv_ioda_parse_m64_window(phb);
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
>>index 94ef1df..590f778 100644
>>--- a/arch/powerpc/platforms/powernv/pci.h
>>+++ b/arch/powerpc/platforms/powernv/pci.h
>>@@ -175,11 +175,10 @@ struct pnv_phb {
>>  			struct list_head	pe_list;
>>  			struct mutex            pe_list_mutex;
>>
>>-			/* Reverse map of PEs, will have to extend if
>>-			 * we are to support more than 256 PEs, indexed
>>+			/* Reverse map of PEs, indexed by
>>  			 * bus { bus, devfn }
>>  			 */
>>-			unsigned char		pe_rmap[0x10000];
>>+			int			pe_rmap[0x10000];
>
>
>Most of the time most of the array will be empty, and it is 256K per PHB... I
>understand we have quite a lot of RAM but still.
>

Indeed, I'll think about how to save memory here, but not in this
patchset.

>>
>>  			/* Number of 32-bit DMA segments */
>>  			unsigned long		dma32_segcount;
>>

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 12/42] powerpc/pci: Cleanup on pci_controller_ops
  2015-06-10  4:43   ` Alexey Kardashevskiy
@ 2015-06-10  6:20       ` Gavin Shan
  0 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-06-10  6:20 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: devicetree, linux-pci, panto, Gavin Shan, grant.likely,
	robherring2, bhelgaas, linuxppc-dev, Daniel Axtens

On Wed, Jun 10, 2015 at 02:43:57PM +1000, Alexey Kardashevskiy wrote:
>On 06/04/2015 04:41 PM, Gavin Shan wrote:
>>Each PHB maintains one instance of "struct pci_controller_ops",
>>which includes various callbacks called by PCI subsystem. In the
>>definition of this struct, some callbacks have explicit names for
>>its arguments, but the left don't have.
>>
>>The patch removes all explicit names of the arguments to the
>>callbacks in "struct pci_controller_ops" to keep the code look
>>consistent.
>
>imho it is a bad idea. Self-documented code gets less self-documented - how do
>I know what "unsigned long" parameters are for without grepping?
>

Ok. I'll change the function definitions to always have explicit
argument names.

>>
>>Cc: Daniel Axtens <dja@axtens.net>
>>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>---
>>v5:
>>   * Newly introduced
>>---
>>  arch/powerpc/include/asm/pci-bridge.h | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>index 744884b..1252cd5 100644
>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>@@ -18,8 +18,8 @@ struct device_node;
>>   * PCI controller operations
>>   */
>>  struct pci_controller_ops {
>>-	void		(*dma_dev_setup)(struct pci_dev *dev);
>>-	void		(*dma_bus_setup)(struct pci_bus *bus);
>>+	void		(*dma_dev_setup)(struct pci_dev *);
>>+	void		(*dma_bus_setup)(struct pci_bus *);
>>
>>  	int		(*probe_mode)(struct pci_bus *);
>>
>>@@ -28,8 +28,8 @@ struct pci_controller_ops {
>>  	bool		(*enable_device_hook)(struct pci_dev *);
>>
>>  	/* Called during PCI resource reassignment */
>>-	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long type);
>>-	void		(*reset_secondary_bus)(struct pci_dev *dev);
>>+	resource_size_t (*window_alignment)(struct pci_bus *, unsigned long);
>>+	void		(*reset_secondary_bus)(struct pci_dev *);
>>  };
>>
>>  /*
>>

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 39/42] drivers/of: Unflatten nodes equal or deeper than specified level
  2015-06-04  6:42     ` Gavin Shan
@ 2015-06-30 17:47         ` Grant Likely
  -1 siblings, 0 replies; 83+ messages in thread
From: Grant Likely @ 2015-06-30 17:47 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	Gavin Shan

On Thu,  4 Jun 2015 16:42:08 +1000, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> unflatten_dt_node() is called recursively to unflatten FDT nodes
> with the assumption that the FDT blob has only one root node, which
> isn't true when the FDT blob represents a device sub-tree. The
> patch improves the function to support device sub-trees that
> have multiple root nodes:
> 
>    * Rename original unflatten_dt_node() to __unflatten_dt_node().
>    * Wrapper unflatten_dt_node() calls __unflatten_dt_node() with
>      adjusted current node depth to 1 to avoid underflow.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
> v5:
>   * Split from PATCH[v4 19/21]
>   * Fixed "line over 80 characters" from checkpatch.pl
> ---
>  drivers/of/fdt.c | 56 ++++++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 38 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index cde35c5d01..b87c157 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -28,6 +28,8 @@
>  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
>  #include <asm/page.h>
>  
> +static int cur_node_depth;
> +

eeeek! We'll never be able to call this concurrently this way. That will
create theoretical race conditions in the overlay code. (actually, you
didn't introduce this problem, see below...)

>  /*
>   * of_fdt_limit_memory - limit the number of regions in the /memory node
>   * @limit: maximum entries
> @@ -161,27 +163,26 @@ static void *unflatten_dt_alloc(void **mem, unsigned long size,
>  }
>  
>  /**
> - * unflatten_dt_node - Alloc and populate a device_node from the flat tree
> + * __unflatten_dt_node - Alloc and populate a device_node from the flat tree
>   * @blob: The parent device tree blob
>   * @mem: Memory chunk to use for allocating device nodes and properties
>   * @p: pointer to node in flat tree
>   * @dad: Parent struct device_node
>   * @fpsize: Size of the node path up at the current depth.
>   */
> -static void * unflatten_dt_node(void *blob,
> -				void *mem,
> -				int *poffset,
> -				struct device_node *dad,
> -				struct device_node **nodepp,
> -				unsigned long fpsize,
> -				bool dryrun)
> +static void *__unflatten_dt_node(void *blob,
> +				 void *mem,
> +				 int *poffset,
> +				 struct device_node *dad,
> +				 struct device_node **nodepp,
> +				 unsigned long fpsize,
> +				 bool dryrun)

nitpick: If you resist the temptation to reflow indentation, then the
diffstat is smaller.

>  {
>  	const __be32 *p;
>  	struct device_node *np;
>  	struct property *pp, **prev_pp = NULL;
>  	const char *pathp;
>  	unsigned int l, allocl;
> -	static int depth = 0;

Hmmmm.. looks like the race condition is already there. Well that's no
good. If you move *depth into the parameters to unflatten_dt_node(), then
you can solve both problems at once without having to create a __
version of the function. That will be a cleaner solution overall.
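
To illustrate the suggestion, here is a rough user-space sketch in which
the depth counter is owned by the caller and passed down by pointer. A toy
token stream stands in for the FDT blob; toy_next_node(), toy_walk() and
toy_count_nodes() are made-up names that only loosely mirror
fdt_next_node() and unflatten_dt_node(), not real libfdt or drivers/of
functions:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a flattened tree: a stream of begin/end tokens. */
enum tok { TOK_BEGIN, TOK_END, TOK_STOP };

/*
 * Loose analogue of fdt_next_node(): starting at the BEGIN token at
 * 'offset', find the next BEGIN token and adjust *depth for every token
 * crossed. Returns the new offset, or -1 when the stream ends.
 */
static int toy_next_node(const enum tok *blob, int offset, int *depth)
{
	/* Skip the BEGIN token of the current node, then scan forward. */
	for (offset++; blob[offset] != TOK_STOP; offset++) {
		if (blob[offset] == TOK_BEGIN) {
			(*depth)++;
			return offset;
		}
		(*depth)--;	/* TOK_END closes one level */
	}
	return -1;
}

/*
 * Visit the node at *poffset and all of its descendants, returning how
 * many nodes were seen. The depth lives in the caller's frame, so there
 * is no static state shared between two walks.
 */
static int toy_walk(const enum tok *blob, int *poffset, int *depth)
{
	int old_depth = *depth;
	int count = 1;		/* this node */

	*poffset = toy_next_node(blob, *poffset, depth);
	while (*poffset > 0 && *depth > old_depth)
		count += toy_walk(blob, poffset, depth);

	return count;
}

/* Entry point: every call starts with its own offset and depth. */
int toy_count_nodes(const enum tok *blob)
{
	int offset = 0;
	int depth = 0;

	assert(blob != NULL && blob[0] == TOK_BEGIN);
	return toy_walk(blob, &offset, &depth);
}
```

Because nothing is kept in a static variable, calling toy_count_nodes()
twice on the same stream, or from two threads at once, gives the same
answer each time, and no __ wrapper is needed to reset the counter.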

>  	int old_depth;
>  	int offset;
>  	int has_name = 0;
> @@ -334,13 +335,19 @@ static void * unflatten_dt_node(void *blob,
>  			np->type = "<NULL>";
>  	}
>  
> -	old_depth = depth;
> -	*poffset = fdt_next_node(blob, *poffset, &depth);
> -	if (depth < 0)
> -		depth = 0;
> -	while (*poffset > 0 && depth > old_depth)
> -		mem = unflatten_dt_node(blob, mem, poffset, np, NULL,
> -					fpsize, dryrun);
> +	old_depth = cur_node_depth;
> +	*poffset = fdt_next_node(blob, *poffset, &cur_node_depth);
> +	while (*poffset > 0) {

What is the reasoning here? Why change to looking for poffset > 0?

> +		if (cur_node_depth < old_depth)
> +			break;
> +
> +		if (cur_node_depth == old_depth)
> +			mem = __unflatten_dt_node(blob, mem, poffset,
> +						  dad, NULL, fpsize, dryrun);
> +		else if (cur_node_depth > old_depth)
> +			mem = __unflatten_dt_node(blob, mem, poffset,
> +						  np, NULL, fpsize, dryrun);

Ditto here, please describe the purpose of the new logic.

> +	}
>  
>  	if (*poffset < 0 && *poffset != -FDT_ERR_NOTFOUND)
>  		pr_err("unflatten: error %d processing FDT\n", *poffset);
> @@ -366,6 +373,18 @@ static void * unflatten_dt_node(void *blob,
>  	return mem;
>  }
>  
> +static void *unflatten_dt_node(void *blob,
> +			       void *mem,
> +			       int *poffset,
> +			       struct device_node *dad,
> +			       struct device_node **nodepp,
> +			       bool dryrun)
> +{
> +	cur_node_depth = 1;
> +	return __unflatten_dt_node(blob, mem, poffset,
> +				   dad, nodepp, 0, dryrun);
> +}
> +
>  /**
>   * __unflatten_device_tree - create tree of device_nodes from flat blob
>   *
> @@ -405,7 +424,8 @@ static void __unflatten_device_tree(void *blob,
>  
>  	/* First pass, scan for size */
>  	start = 0;
> -	size = (unsigned long)unflatten_dt_node(blob, NULL, &start, NULL, NULL, 0, true);
> +	size = (unsigned long)unflatten_dt_node(blob, NULL, &start,
> +						NULL, NULL, true);
>  	size = ALIGN(size, 4);
>  
>  	pr_debug("  size is %lx, allocating...\n", size);
> @@ -420,7 +440,7 @@ static void __unflatten_device_tree(void *blob,
>  
>  	/* Second pass, do actual unflattening */
>  	start = 0;
> -	unflatten_dt_node(blob, mem, &start, NULL, mynodes, 0, false);
> +	unflatten_dt_node(blob, mem, &start, NULL, mynodes, false);
>  	if (be32_to_cpup(mem + size) != 0xdeadbeef)
>  		pr_warning("End of tree marker overwritten: %08x\n",
>  			   be32_to_cpup(mem + size));
> -- 
> 2.1.0
> 

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree()
  2015-06-04  6:42 ` [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree() Gavin Shan
@ 2015-06-30 18:06       ` Grant Likely
       [not found]   ` <1433400131-18429-41-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  1 sibling, 0 replies; 83+ messages in thread
From: Grant Likely @ 2015-06-30 18:06 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	Gavin Shan

On Thu,  4 Jun 2015 16:42:09 +1000, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> The patch introduces one more argument to of_fdt_unflatten_tree()
> to specify the root node for the FDT blob that is going to be
> unflattened. As a result, subsequent patches can use the function
> to unflatten an FDT blob that represents a device sub-tree.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

In principle, looks okay. There are going to be lifecycle issues though:
nodes allocated from unflatten_dt_node cannot be cleanly freed, because
they aren't allocated in the same way as OF_DYNAMIC nodes are
allocated.

It may be time to dump the special allocation of fdt.c entirely and
treat all nodes the same way, with name and properties all allocated
with normal kmallocs.... Investigation is needed to figure out if this
is feasible.
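
For context, a rough user-space sketch of why the chunk-based allocation
is hard to free piecemeal. It imitates the cursor-bumping style of
unflatten_dt_alloc() and the "scan for size" dry run; struct toy_node and
all toy_* names are made up for illustration and are not the kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bump allocator: align the cursor, hand out 'size' bytes, advance the
 * cursor. 'align' must be a power of two. There is no per-object
 * bookkeeping, so a single object cannot be handed back on its own.
 */
static void *toy_chunk_alloc(uintptr_t *cursor, size_t size, size_t align)
{
	uintptr_t res;

	assert(align && (align & (align - 1)) == 0);
	res = (*cursor + align - 1) & ~((uintptr_t)align - 1);
	*cursor = res + size;
	return (void *)res;
}

/* Toy node; the real struct device_node carries much more. */
struct toy_node {
	const char *name;
	struct toy_node *parent;
};

/*
 * Dry run: start the cursor at 0 and never dereference the results.
 * The final cursor value is the chunk size needed for 'count' nodes.
 */
size_t toy_size_for_nodes(size_t count)
{
	uintptr_t cursor = 0;
	size_t i;

	for (i = 0; i < count; i++)
		(void)toy_chunk_alloc(&cursor, sizeof(struct toy_node),
				      _Alignof(struct toy_node));
	return (size_t)cursor;
}

/* Real pass: carve 'count' chained nodes out of one caller-owned chunk. */
struct toy_node *toy_carve_nodes(void *chunk, size_t count)
{
	uintptr_t cursor = (uintptr_t)chunk;
	struct toy_node *first = NULL;
	struct toy_node *prev = NULL;
	size_t i;

	for (i = 0; i < count; i++) {
		struct toy_node *np = toy_chunk_alloc(&cursor, sizeof(*np),
						_Alignof(struct toy_node));

		np->name = "node";
		np->parent = prev;
		if (!first)
			first = np;
		prev = np;
	}
	return first;
}
```

Every node here is just an offset into one chunk, so releasing one node
would mean releasing the chunk that all the others live in. Nodes that
are each allocated separately do not have that restriction, which is the
appeal of allocating everything with normal kmallocs.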

Comments below.

> ---
> v5:
>   * Newly introduced
> ---
>  drivers/of/fdt.c       | 26 ++++++++++++++++++--------
>  drivers/of/unittest.c  |  2 +-
>  include/linux/of_fdt.h |  3 ++-
>  3 files changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index b87c157..b6a6c59 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -380,9 +380,16 @@ static void *unflatten_dt_node(void *blob,
>  			       struct device_node **nodepp,
>  			       bool dryrun)
>  {
> +	unsigned long fpsize = 0;
> +
> +	if (dad)
> +		fpsize = strlen(of_node_full_name(dad));
> +	else
> +		fpsize = 0;

The 'else' is redundant. Better yet:

	unsigned long fpsize = dad ? strlen(of_node_full_name(dad)) : 0;

>  	cur_node_depth = 1;
>  	return __unflatten_dt_node(blob, mem, poffset,
> -				   dad, nodepp, 0, dryrun);
> +				   dad, nodepp, fpsize, dryrun);
>  }
>  
>  /**
> @@ -393,13 +400,15 @@ static void *unflatten_dt_node(void *blob,
>   * pointers of the nodes so the normal device-tree walking functions
>   * can be used.
>   * @blob: The blob to expand
> + * @dad: The root node of the created device_node tree
>   * @mynodes: The device_node tree created by the call
>   * @dt_alloc: An allocator that provides a virtual address to memory
>   * for the resulting tree
>   */
>  static void __unflatten_device_tree(void *blob,
> -			     struct device_node **mynodes,
> -			     void * (*dt_alloc)(u64 size, u64 align))
> +				    struct device_node *dad,
> +				    struct device_node **mynodes,
> +				    void * (*dt_alloc)(u64 size, u64 align))

Same comment as before, don't reflow the indentation unless you really
need to.

>  {
>  	unsigned long size;
>  	int start;
> @@ -425,7 +434,7 @@ static void __unflatten_device_tree(void *blob,
>  	/* First pass, scan for size */
>  	start = 0;
>  	size = (unsigned long)unflatten_dt_node(blob, NULL, &start,
> -						NULL, NULL, true);
> +						dad, NULL, true);
>  	size = ALIGN(size, 4);
>  
>  	pr_debug("  size is %lx, allocating...\n", size);
> @@ -440,7 +449,7 @@ static void __unflatten_device_tree(void *blob,
>  
>  	/* Second pass, do actual unflattening */
>  	start = 0;
> -	unflatten_dt_node(blob, mem, &start, NULL, mynodes, false);
> +	unflatten_dt_node(blob, mem, &start, dad, mynodes, false);
>  	if (be32_to_cpup(mem + size) != 0xdeadbeef)
>  		pr_warning("End of tree marker overwritten: %08x\n",
>  			   be32_to_cpup(mem + size));
> @@ -462,9 +471,10 @@ static void *kernel_tree_alloc(u64 size, u64 align)
>   * can be used.
>   */
>  void of_fdt_unflatten_tree(unsigned long *blob,
> -			struct device_node **mynodes)
> +			   struct device_node *dad,
> +			   struct device_node **mynodes)
>  {
> -	__unflatten_device_tree(blob, mynodes, &kernel_tree_alloc);
> +	__unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
>  }
>  EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
>  
> @@ -1095,7 +1105,7 @@ bool __init early_init_dt_scan(void *params)
>   */
>  void __init unflatten_device_tree(void)
>  {
> -	__unflatten_device_tree(initial_boot_params, &of_root,
> +	__unflatten_device_tree(initial_boot_params, NULL, &of_root,
>  				early_init_dt_alloc_memory_arch);
>  
>  	/* Get pointer to "/chosen" and "/aliases" nodes for use everywhere */
> diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
> index 1801634..2270830 100644
> --- a/drivers/of/unittest.c
> +++ b/drivers/of/unittest.c
> @@ -907,7 +907,7 @@ static int __init unittest_data_add(void)
>  			"not running tests\n", __func__);
>  		return -ENOMEM;
>  	}
> -	of_fdt_unflatten_tree(unittest_data, &unittest_data_node);
> +	of_fdt_unflatten_tree(unittest_data, NULL, &unittest_data_node);
>  	if (!unittest_data_node) {
>  		pr_warn("%s: No tree to attach; not running tests\n", __func__);
>  		return -ENODATA;
> diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
> index 587ee50..8882640 100644
> --- a/include/linux/of_fdt.h
> +++ b/include/linux/of_fdt.h
> @@ -38,7 +38,8 @@ extern bool of_fdt_is_big_endian(const void *blob,
>  extern int of_fdt_match(const void *blob, unsigned long node,
>  			const char *const *compat);
>  extern void of_fdt_unflatten_tree(unsigned long *blob,
> -			       struct device_node **mynodes);
> +				  struct device_node *dad,
> +				  struct device_node **mynodes);
>  
>  /* TBD: Temporary export of fdt globals - remove when code fully merged */
>  extern int __initdata dt_root_addr_cells;
> -- 
> 2.1.0
> 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-06-04  6:42 ` [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
@ 2015-06-30 18:18     ` Grant Likely
  2015-06-30 18:18     ` Grant Likely
  1 sibling, 0 replies; 83+ messages in thread
From: Grant Likely @ 2015-06-30 18:18 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: devicetree, aik, linux-pci, panto, Gavin Shan, robherring2, bhelgaas

On Thu,  4 Jun 2015 16:42:11 +1000, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> The patch adds a standalone driver to support PCI hotplug on the
> PowerPC PowerNV platform, which runs on top of skiboot firmware.
> The firmware identifies hotpluggable slots and marks their device
> tree nodes with the "ibm,slot-pluggable" and "ibm,reset-by-firmware"
> properties. The driver simply scans the device tree to create and
> register PCI hotplug slots accordingly.
> 
> If the skiboot firmware doesn't support slot status retrieval, the PCI
> slot device node shouldn't have the "ibm,reset-by-firmware" property.
> In that case, no valid PCI slots will be detected from the device tree.
> The skiboot firmware doesn't export the capability to access attention
> LEDs yet; that support is still to be done.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
> v5:
>   * Use OF OVERLAY to update the device-tree
>   * Removed unnecessary header files
>   * More meaningful return value from powernv_php_register_one()
>   * Use pnv_pci_hotplug_notifier_{register, unregister}()
>   * Decimal values for slot's states
>   * Removed struct powernv_php_slot::release()
>   * Merged two bool arguments to one for powernv_php_slot_enable()
>   * Rename release_device_nodes_info() to remove_device_nodes_info()
>   * Don't check on "!len" in slot_power_on_handler()
>   * Handle return value in get_adapter_status() as suggested by aik
>   * Drop invalid attention status in set_attention_status()
>   * Renaming functions
>   * Fixed coding style and added entry in MAINTAINERS reported by
>     checkpatch.pl
> ---
>  MAINTAINERS                            |   6 +
>  drivers/pci/hotplug/Kconfig            |  12 +
>  drivers/pci/hotplug/Makefile           |   4 +
>  drivers/pci/hotplug/powernv_php.c      | 140 +++++++
>  drivers/pci/hotplug/powernv_php.h      |  90 ++++
>  drivers/pci/hotplug/powernv_php_slot.c | 732 +++++++++++++++++++++++++++++++++
>  6 files changed, 984 insertions(+)
>  create mode 100644 drivers/pci/hotplug/powernv_php.c
>  create mode 100644 drivers/pci/hotplug/powernv_php.h
>  create mode 100644 drivers/pci/hotplug/powernv_php_slot.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e308718..f5e1dce 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7481,6 +7481,12 @@ L:	linux-pci@vger.kernel.org
>  S:	Supported
>  F:	Documentation/PCI/pci-error-recovery.txt
>  
> +PCI HOTPLUG DRIVER FOR POWERNV PLATFORM
> +M:	Gavin Shan <gwshan@linux.vnet.ibm.com>
> +L:	linux-pci@vger.kernel.org
> +S:	Supported
> +F:	drivers/pci/hotplug/powernv_php*
> +
>  PCI SUBSYSTEM
>  M:	Bjorn Helgaas <bhelgaas@google.com>
>  L:	linux-pci@vger.kernel.org
> diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
> index df8caec..ef55dae 100644
> --- a/drivers/pci/hotplug/Kconfig
> +++ b/drivers/pci/hotplug/Kconfig
> @@ -113,6 +113,18 @@ config HOTPLUG_PCI_SHPC
>  
>  	  When in doubt, say N.
>  
> +config HOTPLUG_PCI_POWERNV
> +	tristate "PowerPC PowerNV PCI Hotplug driver"
> +	depends on PPC_POWERNV && EEH
> +	help
> +	  Say Y here if you run PowerPC PowerNV platform that supports
> +          PCI Hotplug
> +
> +	  To compile this driver as a module, choose M here: the
> +	  module will be called powernv-php.
> +
> +	  When in doubt, say N.
> +
>  config HOTPLUG_PCI_RPA
>  	tristate "RPA PCI Hotplug driver"
>  	depends on PPC_PSERIES && EEH
> diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
> index 4a9aa08..a69665e 100644
> --- a/drivers/pci/hotplug/Makefile
> +++ b/drivers/pci/hotplug/Makefile
> @@ -14,6 +14,7 @@ obj-$(CONFIG_HOTPLUG_PCI_PCIE)		+= pciehp.o
>  obj-$(CONFIG_HOTPLUG_PCI_CPCI_ZT5550)	+= cpcihp_zt5550.o
>  obj-$(CONFIG_HOTPLUG_PCI_CPCI_GENERIC)	+= cpcihp_generic.o
>  obj-$(CONFIG_HOTPLUG_PCI_SHPC)		+= shpchp.o
> +obj-$(CONFIG_HOTPLUG_PCI_POWERNV)	+= powernv-php.o
>  obj-$(CONFIG_HOTPLUG_PCI_RPA)		+= rpaphp.o
>  obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR)	+= rpadlpar_io.o
>  obj-$(CONFIG_HOTPLUG_PCI_SGI)		+= sgi_hotplug.o
> @@ -50,6 +51,9 @@ ibmphp-objs		:=	ibmphp_core.o	\
>  acpiphp-objs		:=	acpiphp_core.o	\
>  				acpiphp_glue.o
>  
> +powernv-php-objs	:=	powernv_php.o	\
> +				powernv_php_slot.o
> +
>  rpaphp-objs		:=	rpaphp_core.o	\
>  				rpaphp_pci.o	\
>  				rpaphp_slot.o
> diff --git a/drivers/pci/hotplug/powernv_php.c b/drivers/pci/hotplug/powernv_php.c
> new file mode 100644
> index 0000000..4cbff7a
> --- /dev/null
> +++ b/drivers/pci/hotplug/powernv_php.c
> @@ -0,0 +1,140 @@
> +/*
> + * PCI Hotplug Driver for PowerPC PowerNV platform.
> + *
> + * Copyright Gavin Shan, IBM Corporation 2015.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/module.h>
> +
> +#include <asm/opal.h>
> +#include <asm/pnv-pci.h>
> +
> +#include "powernv_php.h"
> +
> +#define DRIVER_VERSION	"0.1"
> +#define DRIVER_AUTHOR	"Gavin Shan, IBM Corporation"
> +#define DRIVER_DESC	"PowerPC PowerNV PCI Hotplug Driver"
> +
> +static struct notifier_block php_msg_nb = {
> +	.notifier_call	= powernv_php_msg_handler,
> +	.next		= NULL,
> +	.priority	= 0,
> +};
> +
> +static int powernv_php_register_one(struct device_node *dn)
> +{
> +	struct powernv_php_slot *slot;
> +	const __be32 *prop32;
> +	int ret;
> +
> +	/* Check if it's hotpluggable slot */
> +	prop32 = of_get_property(dn, "ibm,slot-pluggable", NULL);
> +	if (!prop32 || !of_read_number(prop32, 1))
> +		return -ENXIO;
> +
> +	prop32 = of_get_property(dn, "ibm,reset-by-firmware", NULL);
> +	if (!prop32 || !of_read_number(prop32, 1))
> +		return -ENXIO;
> +
> +	/* Allocate slot */
> +	slot = powernv_php_slot_alloc(dn);
> +	if (!slot)
> +		return -ENODEV;
> +
> +	/* Register it */
> +	ret = powernv_php_slot_register(slot);
> +	if (ret) {
> +		powernv_php_slot_put(slot);
> +		return ret;
> +	}
> +
> +	return powernv_php_slot_enable(slot->php_slot, false);
> +}
> +
> +int powernv_php_register(struct device_node *dn)
> +{
> +	struct device_node *child;
> +	int ret = 0;
> +
> +	/*
> +	 * The parent slots should be registered before their
> +	 * child slots.
> +	 */
> +	for_each_child_of_node(dn, child) {
> +		powernv_php_register_one(child);
> +		powernv_php_register(child);
> +	}
> +
> +	return ret;
> +}
> +
> +static void powernv_php_unregister_one(struct device_node *dn)
> +{
> +	struct powernv_php_slot *slot;
> +
> +	slot = powernv_php_slot_find(dn);
> +	if (!slot)
> +		return;
> +
> +	pci_hp_deregister(slot->php_slot);
> +}
> +
> +void powernv_php_unregister(struct device_node *dn)
> +{
> +	struct device_node *child;
> +
> +	/* The child slots should go before their parent slots */
> +	for_each_child_of_node(dn, child) {
> +		powernv_php_unregister(child);
> +		powernv_php_unregister_one(child);
> +	}
> +}
> +
> +static int __init powernv_php_init(void)
> +{
> +	struct device_node *dn;
> +	int ret;
> +
> +	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
> +
> +	/* Register hotplug message handler */
> +	ret = pnv_pci_hotplug_notifier_register(&php_msg_nb);
> +	if (ret) {
> +		pr_warn("%s: Error %d registering hotplug notifier\n",
> +			__func__, ret);
> +		return ret;
> +	}
> +
> +	/* Scan PHB nodes and their children */
> +	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
> +		powernv_php_register(dn);
> +	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
> +		powernv_php_register(dn);
> +
> +	return 0;
> +}
> +
> +static void __exit powernv_php_exit(void)
> +{
> +	struct device_node *dn;
> +
> +	pnv_pci_hotplug_notifier_unregister(&php_msg_nb);
> +
> +	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
> +		powernv_php_unregister(dn);
> +	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
> +		powernv_php_unregister(dn);
> +}
> +
> +module_init(powernv_php_init);
> +module_exit(powernv_php_exit);
> +
> +MODULE_VERSION(DRIVER_VERSION);
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR(DRIVER_AUTHOR);
> +MODULE_DESCRIPTION(DRIVER_DESC);
> diff --git a/drivers/pci/hotplug/powernv_php.h b/drivers/pci/hotplug/powernv_php.h
> new file mode 100644
> index 0000000..5e14a65
> --- /dev/null
> +++ b/drivers/pci/hotplug/powernv_php.h
> @@ -0,0 +1,90 @@
> +/*
> + * PCI Hotplug Driver for PowerPC PowerNV platform.
> + *
> + * Copyright Gavin Shan, IBM Corporation 2015.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#ifndef _POWERNV_PHP_H
> +#define _POWERNV_PHP_H
> +
> +#include <linux/list.h>
> +#include <linux/kref.h>
> +#include <linux/of.h>
> +#include <linux/pci.h>
> +#include <linux/pci_hotplug.h>
> +#include <linux/wait.h>
> +#include <linux/workqueue.h>
> +
> +#include <asm/opal-api.h>
> +
> +/* Slot power status */
> +#define POWERNV_PHP_SLOT_POWER_OFF	0
> +#define POWERNV_PHP_SLOT_POWER_ON	1
> +
> +/* Slot presence status */
> +#define POWERNV_PHP_SLOT_EMPTY		0
> +#define POWERNV_PHP_SLOT_PRESENT	1
> +
> +/* Slot attention status */
> +#define POWERNV_PHP_SLOT_ATTEN_OFF	0
> +#define POWERNV_PHP_SLOT_ATTEN_ON	1
> +#define POWERNV_PHP_SLOT_ATTEN_IND	2
> +#define POWERNV_PHP_SLOT_ATTEN_ACT	3
> +
> +struct powernv_php_slot {
> +	char			*name;
> +	struct device_node	*dn;
> +	struct pci_bus		*bus;
> +	uint64_t		id;
> +	int			slot_no;
> +	struct kref		kref;
> +#define POWERNV_PHP_SLOT_STATE_INIT		0
> +#define POWERNV_PHP_SLOT_STATE_REGISTER		1
> +#define POWERNV_PHP_SLOT_STATE_POPULATED	2
> +	int			state;
> +	int			check_power_status;
> +	int			status_confirmed;
> +	struct opal_msg		*msg;
> +	uint64_t		dt_counter;
> +	int			overlay_id;
> +	struct work_struct	work;
> +	wait_queue_head_t	queue;
> +	struct hotplug_slot	*php_slot;
> +	struct powernv_php_slot	*parent;
> +	struct list_head	children;
> +	struct list_head	link;
> +};
> +
> +int powernv_php_msg_handler(struct notifier_block *nb,
> +			    unsigned long type, void *message);
> +struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn);
> +void powernv_php_slot_free(struct kref *kref);
> +struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn);
> +int powernv_php_slot_register(struct powernv_php_slot *slot);
> +int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan);
> +int powernv_php_register(struct device_node *dn);
> +void powernv_php_unregister(struct device_node *dn);
> +
> +#define to_powernv_php_slot(kref) \
> +	container_of(kref, struct powernv_php_slot, kref)
> +
> +static inline void powernv_php_slot_get(struct powernv_php_slot *slot)
> +{
> +	if (slot)
> +		kref_get(&slot->kref);
> +}
> +
> +static inline int powernv_php_slot_put(struct powernv_php_slot *slot)
> +{
> +	if (slot)
> +		return kref_put(&slot->kref, powernv_php_slot_free);
> +
> +	return 0;
> +}
> +
> +#endif /* !_POWERNV_PHP_H */
> diff --git a/drivers/pci/hotplug/powernv_php_slot.c b/drivers/pci/hotplug/powernv_php_slot.c
> new file mode 100644
> index 0000000..6c56455
> --- /dev/null
> +++ b/drivers/pci/hotplug/powernv_php_slot.c
> @@ -0,0 +1,732 @@
> +/*
> + * PCI Hotplug Driver for PowerPC PowerNV platform.
> + *
> + * Copyright Gavin Shan, IBM Corporation 2015.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/module.h>
> +
> +#include <asm/opal.h>
> +#include <asm/pnv-pci.h>
> +#include <asm/ppc-pci.h>
> +
> +#include "powernv_php.h"
> +
> +static LIST_HEAD(php_slot_list);
> +static DEFINE_SPINLOCK(php_slot_lock);
> +
> +/*
> + * Remove firmware data for all child device nodes of the
> + * indicated one.
> + */
> +static void remove_child_pdn(struct device_node *np)
> +{
> +	struct device_node *child;
> +
> +	for_each_child_of_node(np, child) {
> +		/* In depth first */
> +		remove_child_pdn(child);
> +
> +		remove_pci_device_node_info(child);
> +	}
> +}
> +
> +/*
> + * Remove all subordinate device nodes of the indicated one.
> + * Those device nodes in deepest path should be released firstly.
> + */
> +static int remove_child_device_nodes(struct device_node *parent)
> +{
> +	struct device_node *np, *child;
> +	int ret = 0;
> +
> +	/* If the device node has children, remove them firstly */
> +	for_each_child_of_node(parent, np) {
> +		ret = remove_child_device_nodes(np);
> +		if (ret)
> +			return ret;
> +
> +		/* The device shouldn't have alive children */
> +		child = of_get_next_child(np, NULL);
> +		if (child) {
> +			of_node_put(child);
> +			of_node_put(np);
> +			pr_err("%s: Alive children of node <%s>\n",
> +			       __func__, of_node_full_name(np));
> +			return -EBUSY;
> +		}
> +
> +		/* Detach the device node */
> +		of_detach_node(np);
> +		of_node_put(np);
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * The function processes the message sent by firmware
> + * to remove all device tree nodes beneath the slot's
> + * nodes, and the associated auxillary data.
> + */
> +static void slot_power_off_handler(struct powernv_php_slot *slot)
> +{
> +	int ret;
> +
> +	/* Release the firmware data for the child device nodes */
> +	remove_child_pdn(slot->dn);
> +
> +	/*
> +	 * Release the child device nodes. If the sub-tree was
> +	 * built with the help of overlay, we just need revert
> +	 * the changes introduced by the overlay
> +	 */
> +	if (slot->overlay_id >= 0) {
> +		ret = of_overlay_destroy(slot->overlay_id);
> +		if (ret)
> +			pr_warn("%s: Error %d destroying overlay %d\n",
> +				__func__, ret, slot->overlay_id);
> +		slot->overlay_id = -1;
> +	} else {
> +		ret = remove_child_device_nodes(slot->dn);
> +		if (ret)
> +			pr_warn("%s: Error %d releasing children of <%s>\n",
> +				__func__, ret, of_node_full_name(slot->dn));
> +	}
> +
> +	/* Confirm status change */
> +	slot->status_confirmed = 1;
> +	wake_up_interruptible(&slot->queue);
> +}
> +
> +static void slot_power_on_handler(struct powernv_php_slot *slot)
> +{
> +	struct device_node *nodes[3] = {NULL, NULL, NULL};
> +	struct property *prop = NULL;
> +	void *fdt = NULL, *dt = NULL;
> +	phandle handle;
> +	uint64_t len;
> +	int i, ret;
> +
> +	/* Build overlay sub-tree */
> +	for (i = 0; i < ARRAY_SIZE(nodes); i++) {
> +		nodes[i] = kzalloc(sizeof(struct device_node), GFP_KERNEL);
> +		if (!nodes[i])
> +			goto out;
> +
> +		of_node_init(nodes[i]);
> +		if (i > 0) {
> +			nodes[i - 1]->child = nodes[i];
> +			nodes[i]->parent = nodes[i - 1];
> +		}
> +	}
> +
> +	/* Target property for parent node */
> +	prop = kzalloc(sizeof(struct property), GFP_KERNEL);
> +	if (!prop)
> +		goto out;
> +	prop->name = kstrdup("target", GFP_KERNEL);
> +	if (!prop->name)
> +		goto out;
> +	prop->value = kzalloc(sizeof(phandle), GFP_KERNEL);
> +	if (!prop->value)
> +		goto out;
> +	handle = cpu_to_be32(slot->dn->phandle);
> +	memcpy(prop->value, &handle, sizeof(phandle));
> +	prop->length = sizeof(phandle);
> +	nodes[1]->properties = prop;
> +
> +	/* Names for overlay node */
> +	nodes[2]->name = kstrdup("__overlay__", GFP_KERNEL);
> +	if (!nodes[2]->name)
> +		goto out;
> +	nodes[2]->full_name = kstrdup(of_node_full_name(slot->dn), GFP_KERNEL);
> +	if (!nodes[2]->full_name)
> +		goto out;

I think you can simplify this driver by using the of_changeset api
instead of of_overlay. of_overlay is a particular data format passed
into the kernel, but it uses of_changeset in the back end. In this case,
you would allocate an of_changeset structure and then do:

of_changeset_init()
of_changeset_attach_node()
	/* you might need to create an
	 * of_changeset_attach_node_subtree() variant */
of_changeset_attach_node()
of_changeset_attach_node()
of_changeset_attach_node()
of_changeset_apply()
of_changeset_destroy() /* frees the structure */

Then you don't have to muck about with creating a DT in the structure
expected by the of_overlay code.
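
To make the shape of that sequence concrete, here is a small userspace
model of the changeset idea: queue the attach operations first, then
apply them all or none. This is only a sketch. Every toy_* name below is
hypothetical, a stand-in for struct device_node, struct of_changeset and
the of_changeset_*() calls; it does not touch the real device tree and it
ignores locking and notifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ENTRIES 8

/* Hypothetical stand-in for struct device_node: a name and a flag. */
struct toy_node {
	const char *name;
	bool attached;
};

/* Hypothetical stand-in for struct of_changeset: queued attaches. */
struct toy_changeset {
	struct toy_node *entries[TOY_MAX_ENTRIES];
	size_t count;
};

static void toy_changeset_init(struct toy_changeset *cs)
{
	cs->count = 0;
}

/* Queue an attach; the "tree" is not modified yet. */
static int toy_changeset_attach_node(struct toy_changeset *cs,
				     struct toy_node *np)
{
	assert(cs && np);
	if (cs->count >= TOY_MAX_ENTRIES)
		return -1;
	cs->entries[cs->count++] = np;
	return 0;
}

/* Apply every queued attach, or undo the ones already done and fail. */
static int toy_changeset_apply(struct toy_changeset *cs)
{
	size_t i;

	for (i = 0; i < cs->count; i++) {
		if (cs->entries[i]->attached) {
			/* Conflict: roll back what this call attached. */
			while (i-- > 0)
				cs->entries[i]->attached = false;
			return -1;
		}
		cs->entries[i]->attached = true;
	}
	return 0;
}

/* Forget the queued entries; attached nodes stay attached. */
static void toy_changeset_destroy(struct toy_changeset *cs)
{
	cs->count = 0;
}

/* The init/attach/apply/destroy sequence from above, in one helper. */
static int toy_attach_all(struct toy_node **nodes, size_t n)
{
	struct toy_changeset cs;
	size_t i;
	int ret = 0;

	toy_changeset_init(&cs);
	for (i = 0; i < n && !ret; i++)
		ret = toy_changeset_attach_node(&cs, nodes[i]);
	if (!ret)
		ret = toy_changeset_apply(&cs);
	toy_changeset_destroy(&cs);
	return ret;
}
```

In the driver itself, it may be worth keeping the changeset in the slot
structure instead of destroying it right after apply, so that the
power-off path could undo it with of_changeset_revert(). That is a
suggestion to evaluate, not something I have tested.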

> +
> +	/* Get FDT blob */
> +	slot->dt_counter += 1;
> +	fdt = NULL;
> +	len = 0x2000;
> +	while (len <= 0x10000) {
> +		fdt = kzalloc(len, GFP_KERNEL);
> +		if (!fdt)
> +			break;
> +
> +		ret = pnv_pci_get_overlay_dt(&slot->dt_counter, fdt, len);
> +		if (!ret)
> +			break;
> +
> +		kfree(fdt);
> +		fdt = NULL;
> +		len *= 2;
> +	}
> +
> +	if (!fdt)
> +		goto out;
> +
> +	/* Unflatten device tree blob */
> +	dt = of_fdt_unflatten_tree(fdt, nodes[2], NULL);
> +
> +	/* Apply the overlay tree */
> +	slot->overlay_id = of_overlay_create(nodes[0]);
> +	if (slot->overlay_id < 0)
> +		goto out;
> +
> +	/* Add device node firmware data */
> +	traverse_pci_device_nodes(slot->dn,
> +				  add_pci_device_node_info,
> +				  pci_bus_to_host(slot->bus));
> +
> +out:
> +	kfree(dt);
> +	kfree(fdt);
> +	if (nodes[2]) {
> +		kfree(nodes[2]->name);
> +		kfree(nodes[2]->full_name);
> +	}
> +	if (prop) {
> +		kfree(prop->value);
> +		kfree(prop->name);
> +	}
> +
> +	kfree(prop);
> +	for (i = 0; i < ARRAY_SIZE(nodes); i++)
> +		kfree(nodes[i]);
> +
> +	/* Confirm status change */
> +	slot->status_confirmed = 1;
> +	wake_up_interruptible(&slot->queue);
> +}
> +
> +static void powernv_php_slot_work(struct work_struct *data)
> +{
> +	struct powernv_php_slot *slot = container_of(data,
> +						     struct powernv_php_slot,
> +						     work);
> +	uint64_t php_event = be64_to_cpu(slot->msg->params[0]);
> +
> +	switch (php_event) {
> +	case 0: /* Slot power off */
> +		slot_power_off_handler(slot);
> +		break;
> +	case 1: /* Slot power on */
> +		slot_power_on_handler(slot);
> +		break;
> +	default:
> +		pr_warn("%s: Unsupported hotplug event %lld\n",
> +			__func__, php_event);
> +	}
> +
> +	of_node_put(slot->dn);
> +}
> +
> +int powernv_php_msg_handler(struct notifier_block *nb,
> +			    unsigned long type, void *message)
> +{
> +	phandle h;
> +	struct device_node *np;
> +	struct powernv_php_slot *slot;
> +	struct opal_msg *msg = message;
> +
> +	/* Check the message type */
> +	if (type != OPAL_MSG_PCI_HOTPLUG) {
> +		pr_warn("%s: Wrong message type %ld received!\n",
> +			__func__, type);
> +		return NOTIFY_DONE;
> +	}
> +
> +	/* Find the device node */
> +	h = (phandle)be64_to_cpu(msg->params[1]);
> +	np = of_find_node_by_phandle(h);
> +	if (!np) {
> +		pr_warn("%s: No device node for phandle 0x%08x\n",
> +			__func__, h);
> +		return NOTIFY_DONE;
> +	}
> +
> +	/* Find the slot */
> +	slot = powernv_php_slot_find(np);
> +	if (!slot) {
> +		pr_warn("%s: No slot found for node <%s>\n",
> +			__func__, of_node_full_name(np));
> +		of_node_put(np);
> +		return NOTIFY_DONE;
> +	}
> +
> +	/* Schedule the work */
> +	slot->msg = msg;
> +	schedule_work(&slot->work);
> +	return NOTIFY_OK;
> +}
> +
> +static int set_power_status(struct hotplug_slot *php_slot, u8 val)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	int ret;
> +
> +	/* Retrieve the counter of device tree */
> +	ret = pnv_pci_get_overlay_dt(&slot->dt_counter, NULL, 0);
> +	if (ret) {
> +		pr_warn("%s: Error %d getting DT counter for slot %016llx\n",
> +			__func__, ret, slot->id);
> +		return ret;
> +	}
> +
> +	/* Set power status */
> +	slot->status_confirmed = 0;
> +	ret = pnv_pci_set_power_status(slot->id, val);
> +	if (ret) {
> +		pr_warn("%s: Error %d powering %s slot %016llx\n",
> +			__func__, ret, val ? "on" : "off", slot->id);
> +		return ret;
> +	}
> +
> +	/* Waiting until the device tree is updated */
> +	ret = wait_event_timeout(slot->queue,
> +				 !slot->status_confirmed,
> +				 10 * HZ);
> +	if (ret) {
> +		pr_warn("%s: Error %d completing power-%s slot %016llx\n",
> +			__func__, ret, val ? "on" : "off", slot->id);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int get_power_status(struct hotplug_slot *php_slot, u8 *val)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	uint8_t state;
> +	int ret;
> +
> +	/*
> +	 * Retrieve power status from firmware. If we fail
> +	 * getting that, the power status fails back to
> +	 * be on.
> +	 */
> +	ret = pnv_pci_get_power_status(slot->id, &state);
> +	if (ret) {
> +		*val = POWERNV_PHP_SLOT_POWER_ON;
> +		pr_warn("%s: Error %d getting power status of slot %016llx\n",
> +			__func__, ret, slot->id);
> +	} else {
> +		*val = state ? POWERNV_PHP_SLOT_POWER_ON :
> +			       POWERNV_PHP_SLOT_POWER_OFF;
> +		php_slot->info->power_status = *val;
> +	}
> +
> +	return 0;
> +}
> +
> +static int get_adapter_status(struct hotplug_slot *php_slot, u8 *val)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	uint8_t state;
> +	int ret;
> +
> +	/*
> +	 * Retrieve presence status from firmware. If we can't
> +	 * get that, it will fail back to be empty.
> +	 */
> +	ret = pnv_pci_get_presence_status(slot->id, &state);
> +	if (ret >= 0) {
> +		ret = 0;
> +		*val = state ? POWERNV_PHP_SLOT_PRESENT :
> +			       POWERNV_PHP_SLOT_EMPTY;
> +		php_slot->info->adapter_status = *val;
> +		ret = 0;
> +	} else {
> +		*val = POWERNV_PHP_SLOT_EMPTY;
> +		pr_warn("%s: Error %d getting presence of slot %016llx\n",
> +			__func__, ret, slot->id);
> +	}
> +
> +	return ret;
> +}
> +
> +static int set_attention_status(struct hotplug_slot *php_slot, u8 val)
> +{
> +	/* The default operation would to turn on the attention */
> +	switch (val) {
> +	case POWERNV_PHP_SLOT_ATTEN_OFF:
> +	case POWERNV_PHP_SLOT_ATTEN_ON:
> +	case POWERNV_PHP_SLOT_ATTEN_IND:
> +	case POWERNV_PHP_SLOT_ATTEN_ACT:
> +		break;
> +	default:
> +		pr_warn("%s: Invalid attention status 0x%02x\n",
> +			__func__, val);
> +		return -EINVAL;
> +	}
> +
> +	/* FIXME: Make it real once firmware supports it */
> +	php_slot->info->attention_status = val;
> +
> +	return 0;
> +}
> +
> +int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	uint8_t presence, power_status;
> +	int ret;
> +
> +	/* Check if the slot has been configured */
> +	if (slot->state != POWERNV_PHP_SLOT_STATE_REGISTER)
> +		return 0;
> +
> +	/* Retrieve slot presence status */
> +	ret = php_slot->ops->get_adapter_status(php_slot, &presence);
> +	if (ret) {
> +		pr_warn("%s: Error %d getting presence of slot %016llx\n",
> +			__func__, ret, slot->id);
> +		return ret;
> +	}
> +
> +	/* Proceed if there have nothing behind the slot */
> +	if (presence == POWERNV_PHP_SLOT_EMPTY)
> +		goto scan;
> +
> +	/*
> +	 * If we don't detect something behind the slot, we need
> +	 * make sure the power suply to the slot is on. Otherwise,
> +	 * the slot downstream PCIe linkturn should be down.
> +	 *
> +	 * On the first time, we don't change the power status to
> +	 * boost system boot with assumption that the firmware
> +	 * supplies consistent slot power status: empty slot always
> +	 * has its power off and non-empty slot has its power on.
> +	 */
> +	if (!slot->check_power_status) {
> +		slot->check_power_status = 1;
> +		goto scan;
> +	}
> +
> +	/* Check the power status. Scan the slot if that's already on */
> +	ret = php_slot->ops->get_power_status(php_slot, &power_status);
> +	if (ret) {
> +		pr_warn("%s: Error %d getting power status of slot %016llx\n",
> +			__func__, ret, slot->id);
> +		return ret;
> +	}
> +	if (power_status == POWERNV_PHP_SLOT_POWER_ON)
> +		goto scan;
> +
> +	/* Power is off, turn it on and then scan the slot */
> +	ret = set_power_status(php_slot, POWERNV_PHP_SLOT_POWER_ON);
> +	if (ret) {
> +		pr_warn("%s: Error %d powering on slot %016llx\n",
> +			__func__, ret, slot->id);
> +		return ret;
> +	}
> +
> +scan:
> +	switch (presence) {
> +	case POWERNV_PHP_SLOT_PRESENT:
> +		if (rescan) {
> +			pci_lock_rescan_remove();
> +			pcibios_add_pci_devices(slot->bus);
> +			pci_unlock_rescan_remove();
> +		}
> +
> +		/* Rescan for child hotpluggable slots */
> +		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
> +		if (rescan)
> +			powernv_php_register(slot->dn);
> +		break;
> +	case POWERNV_PHP_SLOT_EMPTY:
> +		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
> +		break;
> +	default:
> +		pr_warn("%s: Invalid presence status %d of slot %016llx\n",
> +			__func__, presence, slot->id);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int enable_slot(struct hotplug_slot *php_slot)
> +{
> +	return powernv_php_slot_enable(php_slot, true);
> +}
> +
> +static int disable_slot(struct hotplug_slot *php_slot)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	uint8_t power_status;
> +	int ret;
> +
> +	if (slot->state != POWERNV_PHP_SLOT_STATE_POPULATED)
> +		return 0;
> +
> +	/* Remove all devices behind the slot */
> +	pci_lock_rescan_remove();
> +	pcibios_remove_pci_devices(slot->bus);
> +	pci_unlock_rescan_remove();
> +
> +	/* Detach the child hotpluggable slots */
> +	powernv_php_unregister(slot->dn);
> +
> +	/*
> +	 * Check the power status and turn it off if necessary. If we
> +	 * fail to get the power status, the power will be forced to
> +	 * be off.
> +	 */
> +	ret = php_slot->ops->get_power_status(php_slot, &power_status);
> +	if (ret || power_status == POWERNV_PHP_SLOT_POWER_ON) {
> +		ret = set_power_status(php_slot, POWERNV_PHP_SLOT_POWER_OFF);
> +		if (ret)
> +			pr_warn("%s: Error %d powering off slot %016llx\n",
> +				__func__, ret, slot->id);
> +	}
> +
> +	/* Update slot state */
> +	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
> +	return 0;
> +}
> +
> +static struct hotplug_slot_ops php_slot_ops = {
> +	.get_power_status	= get_power_status,
> +	.get_adapter_status	= get_adapter_status,
> +	.set_attention_status	= set_attention_status,
> +	.enable_slot		= enable_slot,
> +	.disable_slot		= disable_slot,
> +};
> +
> +static struct powernv_php_slot *php_slot_match(struct device_node *dn,
> +					       struct powernv_php_slot *slot)
> +{
> +	struct powernv_php_slot *target, *tmp;
> +
> +	if (slot->dn == dn)
> +		return slot;
> +
> +	list_for_each_entry(tmp, &slot->children, link) {
> +		target = php_slot_match(dn, tmp);
> +		if (target)
> +			return target;
> +	}
> +
> +	return NULL;
> +}
> +
> +struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn)
> +{
> +	struct powernv_php_slot *slot, *tmp;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&php_slot_lock, flags);
> +	list_for_each_entry(tmp, &php_slot_list, link) {
> +		slot = php_slot_match(dn, tmp);
> +		if (slot) {
> +			spin_unlock_irqrestore(&php_slot_lock, flags);
> +			return slot;
> +		}
> +	}
> +	spin_unlock_irqrestore(&php_slot_lock, flags);
> +
> +	return NULL;
> +}
> +
> +void powernv_php_slot_free(struct kref *kref)
> +{
> +	struct powernv_php_slot *slot = to_powernv_php_slot(kref);
> +
> +	WARN_ON(!list_empty(&slot->children));
> +	kfree(slot->name);
> +	kfree(slot);
> +}
> +
> +static void php_slot_release(struct hotplug_slot *hp_slot)
> +{
> +	struct powernv_php_slot *slot = hp_slot->private;
> +	unsigned long flags;
> +
> +	/* Remove from global or child list */
> +	spin_lock_irqsave(&php_slot_lock, flags);
> +	list_del(&slot->link);
> +	spin_unlock_irqrestore(&php_slot_lock, flags);
> +
> +	/* Detach from parent */
> +	powernv_php_slot_put(slot);
> +	powernv_php_slot_put(slot->parent);
> +}
> +
> +static bool php_slot_get_id(struct device_node *dn,
> +			    uint64_t *id)
> +{
> +	struct device_node *parent = dn;
> +	const __be64 *prop64;
> +	const __be32 *prop32;
> +
> +	/*
> +	 * The hotpluggable slot always has a compound Id, which
> +	 * consists of 16-bits PHB Id, 16 bits bus/slot/function
> +	 * number, and compound indicator
> +	 */
> +	*id = (0x1ul << 63);
> +
> +	/* Bus/Slot/Function number */
> +	prop32 = of_get_property(dn, "reg", NULL);
> +	if (!prop32)
> +		return false;
> +	*id |= ((of_read_number(prop32, 1) & 0x00ffff00) << 8);
> +
> +	/* PHB Id */
> +	while ((parent = of_get_parent(parent))) {
> +		if (!PCI_DN(parent)) {
> +			of_node_put(parent);
> +			break;
> +		}
> +
> +		if (!of_device_is_compatible(parent, "ibm,ioda2-phb") &&
> +		    !of_device_is_compatible(parent, "ibm,ioda-phb")) {
> +			of_node_put(parent);
> +			continue;
> +		}
> +
> +		prop64 = of_get_property(parent, "ibm,opal-phbid", NULL);
> +		if (!prop64) {
> +			of_node_put(parent);
> +			return false;
> +		}
> +
> +		*id |= be64_to_cpup(prop64);
> +		of_node_put(parent);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn)
> +{
> +	struct pci_bus *bus;
> +	struct powernv_php_slot *slot;
> +	const char *label;
> +	uint64_t id;
> +	int slot_no;
> +	size_t size;
> +	void *pmem;
> +
> +	/* Slot name */
> +	label = of_get_property(dn, "ibm,slot-label", NULL);
> +	if (!label)
> +		return NULL;
> +
> +	/* Slot indentifier */
> +	if (!php_slot_get_id(dn, &id))
> +		return NULL;
> +
> +	/* PCI bus */
> +	bus = pcibios_find_pci_bus(dn);
> +	if (!bus)
> +		return NULL;
> +
> +	/* Slot number */
> +	if (dn->child && PCI_DN(dn->child))
> +		slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
> +	else
> +		slot_no = -1;
> +
> +	/* Allocate slot */
> +	size = sizeof(struct powernv_php_slot) +
> +	       sizeof(struct hotplug_slot) +
> +	       sizeof(struct hotplug_slot_info);
> +	pmem = kzalloc(size, GFP_KERNEL);
> +	if (!pmem) {
> +		pr_warn("%s: Cannot allocate slot for node %s\n",
> +			__func__, dn->full_name);
> +		return NULL;
> +	}
> +
> +	/* Assign memory blocks */
> +	slot = pmem;
> +	slot->php_slot = pmem + sizeof(struct powernv_php_slot);
> +	slot->php_slot->info = pmem + sizeof(struct powernv_php_slot) +
> +			      sizeof(struct hotplug_slot);
> +	slot->name = kstrdup(label, GFP_KERNEL);
> +	if (!slot->name) {
> +		pr_warn("%s: Cannot populate name for node %s\n",
> +			__func__, dn->full_name);
> +		kfree(pmem);
> +		return NULL;
> +	}
> +
> +	/* Initialize slot */
> +	kref_init(&slot->kref);
> +	slot->state = POWERNV_PHP_SLOT_STATE_INIT;
> +	slot->dn = dn;
> +	slot->bus = bus;
> +	slot->id = id;
> +	slot->slot_no = slot_no;
> +	slot->overlay_id = -1;
> +	INIT_WORK(&slot->work, powernv_php_slot_work);
> +	init_waitqueue_head(&slot->queue);
> +	slot->check_power_status = 0;
> +	slot->status_confirmed = 0;
> +	slot->php_slot->ops = &php_slot_ops;
> +	slot->php_slot->release = php_slot_release;
> +	slot->php_slot->private = slot;
> +	INIT_LIST_HEAD(&slot->children);
> +	INIT_LIST_HEAD(&slot->link);
> +
> +	return slot;
> +}
> +
> +int powernv_php_slot_register(struct powernv_php_slot *slot)
> +{
> +	struct powernv_php_slot *parent;
> +	struct device_node *dn = slot->dn;
> +	unsigned long flags;
> +	int ret;
> +
> +	/* Avoid register same slot for twice */
> +	if (powernv_php_slot_find(slot->dn))
> +		return -EEXIST;
> +
> +	/* Register slot */
> +	ret = pci_hp_register(slot->php_slot, slot->bus,
> +			      slot->slot_no, slot->name);
> +	if (ret) {
> +		pr_warn("%s: Cannot register slot %s (%d)\n",
> +			__func__, slot->name, ret);
> +		return ret;
> +	}
> +
> +	/* Put into global or parent list */
> +	while ((dn = of_get_parent(dn))) {
> +		if (!PCI_DN(dn)) {
> +			of_node_put(dn);
> +			break;
> +		}
> +
> +		parent = powernv_php_slot_find(dn);
> +		if (parent) {
> +			of_node_put(dn);
> +			break;
> +		}
> +	}
> +
> +	spin_lock_irqsave(&php_slot_lock, flags);
> +	if (parent) {
> +		powernv_php_slot_get(parent);
> +		slot->parent = parent;
> +		list_add_tail(&slot->link, &parent->children);
> +	} else {
> +		list_add_tail(&slot->link, &php_slot_list);
> +	}
> +	spin_unlock_irqrestore(&php_slot_lock, flags);
> +
> +	/* Update slot state */
> +	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
> +	return 0;
> +}
> -- 
> 2.1.0
> 


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver
@ 2015-06-30 18:18     ` Grant Likely
  0 siblings, 0 replies; 83+ messages in thread
From: Grant Likely @ 2015-06-30 18:18 UTC (permalink / raw)
  To: Gavin Shan, linuxppc-dev
  Cc: linux-pci, devicetree, benh, bhelgaas, aik, panto, robherring2,
	Gavin Shan

On Thu, 4 Jun 2015 16:42:11 +1000, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> The patch intends to add standalone driver to support PCI hotplug
> for PowerPC PowerNV platform, which runs on top of skiboot firmware.
> The firmware identified hotpluggable slots and marked their device
> tree node with proper "ibm,slot-pluggable" and "ibm,reset-by-firmware".
> The driver simply scans device-tree to create/register PCI hotplug slot
> accordingly.
> 
> If the skiboot firmware doesn't support slot status retrieval, the PCI
> slot device node shouldn't have property "ibm,reset-by-firmware". In
> that case, none of valid PCI slots will be detected from device tree.
> The skiboot firmware doesn't export the capability to access attention
> LEDs yet and it's something for TBD.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
> v5:
>   * Use OF OVERLAY to update the device-tree
>   * Removed unnecessary header files
>   * More meaningful return value from powernv_php_register_one()
>   * Use pnv_pci_hotplug_notifier_{register, unregister}()
>   * Decimal values for slot's states
>   * Removed struct powernv_php_slot::release()
>   * Merged two bool arguments to one for powernv_php_slot_enable()
>   * Rename release_device_nodes_info() to remove_device_nodes_info()
>   * Don't check on "!len" in slot_power_on_handler()
>   * Handle return value in get_adapter_status() as suggested by aik
>   * Drop invalid attention status in set_attention_status()
>   * Renaming functions
>   * Fixed coding style and added entry in MAINTAINERS reported by
>     checkpatch.pl
> ---
>  MAINTAINERS                            |   6 +
>  drivers/pci/hotplug/Kconfig            |  12 +
>  drivers/pci/hotplug/Makefile           |   4 +
>  drivers/pci/hotplug/powernv_php.c      | 140 +++++++
>  drivers/pci/hotplug/powernv_php.h      |  90 ++++
>  drivers/pci/hotplug/powernv_php_slot.c | 732 +++++++++++++++++++++++++++++++++
>  6 files changed, 984 insertions(+)
>  create mode 100644 drivers/pci/hotplug/powernv_php.c
>  create mode 100644 drivers/pci/hotplug/powernv_php.h
>  create mode 100644 drivers/pci/hotplug/powernv_php_slot.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e308718..f5e1dce 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7481,6 +7481,12 @@ L:	linux-pci@vger.kernel.org
>  S:	Supported
>  F:	Documentation/PCI/pci-error-recovery.txt
>  
> +PCI HOTPLUG DRIVER FOR POWERNV PLATFORM
> +M:	Gavin Shan <gwshan@linux.vnet.ibm.com>
> +L:	linux-pci@vger.kernel.org
> +S:	Supported
> +F:	drivers/pci/hotplug/powernv_php*
> +
>  PCI SUBSYSTEM
>  M:	Bjorn Helgaas <bhelgaas@google.com>
>  L:	linux-pci@vger.kernel.org
> diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
> index df8caec..ef55dae 100644
> --- a/drivers/pci/hotplug/Kconfig
> +++ b/drivers/pci/hotplug/Kconfig
> @@ -113,6 +113,18 @@ config HOTPLUG_PCI_SHPC
>  
>  	  When in doubt, say N.
>  
> +config HOTPLUG_PCI_POWERNV
> +	tristate "PowerPC PowerNV PCI Hotplug driver"
> +	depends on PPC_POWERNV && EEH
> +	help
> +	  Say Y here if you run PowerPC PowerNV platform that supports
> +          PCI Hotplug
> +
> +	  To compile this driver as a module, choose M here: the
> +	  module will be called powernv-php.
> +
> +	  When in doubt, say N.
> +
>  config HOTPLUG_PCI_RPA
>  	tristate "RPA PCI Hotplug driver"
>  	depends on PPC_PSERIES && EEH
> diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
> index 4a9aa08..a69665e 100644
> --- a/drivers/pci/hotplug/Makefile
> +++ b/drivers/pci/hotplug/Makefile
> @@ -14,6 +14,7 @@ obj-$(CONFIG_HOTPLUG_PCI_PCIE)		+= pciehp.o
>  obj-$(CONFIG_HOTPLUG_PCI_CPCI_ZT5550)	+= cpcihp_zt5550.o
>  obj-$(CONFIG_HOTPLUG_PCI_CPCI_GENERIC)	+= cpcihp_generic.o
>  obj-$(CONFIG_HOTPLUG_PCI_SHPC)		+= shpchp.o
> +obj-$(CONFIG_HOTPLUG_PCI_POWERNV)	+= powernv-php.o
>  obj-$(CONFIG_HOTPLUG_PCI_RPA)		+= rpaphp.o
>  obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR)	+= rpadlpar_io.o
>  obj-$(CONFIG_HOTPLUG_PCI_SGI)		+= sgi_hotplug.o
> @@ -50,6 +51,9 @@ ibmphp-objs		:=	ibmphp_core.o	\
>  acpiphp-objs		:=	acpiphp_core.o	\
>  				acpiphp_glue.o
>  
> +powernv-php-objs	:=	powernv_php.o	\
> +				powernv_php_slot.o
> +
>  rpaphp-objs		:=	rpaphp_core.o	\
>  				rpaphp_pci.o	\
>  				rpaphp_slot.o
> diff --git a/drivers/pci/hotplug/powernv_php.c b/drivers/pci/hotplug/powernv_php.c
> new file mode 100644
> index 0000000..4cbff7a
> --- /dev/null
> +++ b/drivers/pci/hotplug/powernv_php.c
> @@ -0,0 +1,140 @@
> +/*
> + * PCI Hotplug Driver for PowerPC PowerNV platform.
> + *
> + * Copyright Gavin Shan, IBM Corporation 2015.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/module.h>
> +
> +#include <asm/opal.h>
> +#include <asm/pnv-pci.h>
> +
> +#include "powernv_php.h"
> +
> +#define DRIVER_VERSION	"0.1"
> +#define DRIVER_AUTHOR	"Gavin Shan, IBM Corporation"
> +#define DRIVER_DESC	"PowerPC PowerNV PCI Hotplug Driver"
> +
> +static struct notifier_block php_msg_nb = {
> +	.notifier_call	= powernv_php_msg_handler,
> +	.next		= NULL,
> +	.priority	= 0,
> +};
> +
> +static int powernv_php_register_one(struct device_node *dn)
> +{
> +	struct powernv_php_slot *slot;
> +	const __be32 *prop32;
> +	int ret;
> +
> +	/* Check if it's hotpluggable slot */
> +	prop32 = of_get_property(dn, "ibm,slot-pluggable", NULL);
> +	if (!prop32 || !of_read_number(prop32, 1))
> +		return -ENXIO;
> +
> +	prop32 = of_get_property(dn, "ibm,reset-by-firmware", NULL);
> +	if (!prop32 || !of_read_number(prop32, 1))
> +		return -ENXIO;
> +
> +	/* Allocate slot */
> +	slot = powernv_php_slot_alloc(dn);
> +	if (!slot)
> +		return -ENODEV;
> +
> +	/* Register it */
> +	ret = powernv_php_slot_register(slot);
> +	if (ret) {
> +		powernv_php_slot_put(slot);
> +		return ret;
> +	}
> +
> +	return powernv_php_slot_enable(slot->php_slot, false);
> +}
> +
> +int powernv_php_register(struct device_node *dn)
> +{
> +	struct device_node *child;
> +	int ret = 0;
> +
> +	/*
> +	 * The parent slots should be registered before their
> +	 * child slots.
> +	 */
> +	for_each_child_of_node(dn, child) {
> +		powernv_php_register_one(child);
> +		powernv_php_register(child);
> +	}
> +
> +	return ret;
> +}
> +
> +static void powernv_php_unregister_one(struct device_node *dn)
> +{
> +	struct powernv_php_slot *slot;
> +
> +	slot = powernv_php_slot_find(dn);
> +	if (!slot)
> +		return;
> +
> +	pci_hp_deregister(slot->php_slot);
> +}
> +
> +void powernv_php_unregister(struct device_node *dn)
> +{
> +	struct device_node *child;
> +
> +	/* The child slots should go before their parent slots */
> +	for_each_child_of_node(dn, child) {
> +		powernv_php_unregister(child);
> +		powernv_php_unregister_one(child);
> +	}
> +}
> +
> +static int __init powernv_php_init(void)
> +{
> +	struct device_node *dn;
> +	int ret;
> +
> +	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
> +
> +	/* Register hotplug message handler */
> +	ret = pnv_pci_hotplug_notifier_register(&php_msg_nb);
> +	if (ret) {
> +		pr_warn("%s: Error %d registering hotplug notifier\n",
> +			__func__, ret);
> +		return ret;
> +	}
> +
> +	/* Scan PHB nodes and their children */
> +	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
> +		powernv_php_register(dn);
> +	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
> +		powernv_php_register(dn);
> +
> +	return 0;
> +}
> +
> +static void __exit powernv_php_exit(void)
> +{
> +	struct device_node *dn;
> +
> +	pnv_pci_hotplug_notifier_unregister(&php_msg_nb);
> +
> +	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
> +		powernv_php_unregister(dn);
> +	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
> +		powernv_php_unregister(dn);
> +}
> +
> +module_init(powernv_php_init);
> +module_exit(powernv_php_exit);
> +
> +MODULE_VERSION(DRIVER_VERSION);
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR(DRIVER_AUTHOR);
> +MODULE_DESCRIPTION(DRIVER_DESC);
> diff --git a/drivers/pci/hotplug/powernv_php.h b/drivers/pci/hotplug/powernv_php.h
> new file mode 100644
> index 0000000..5e14a65
> --- /dev/null
> +++ b/drivers/pci/hotplug/powernv_php.h
> @@ -0,0 +1,90 @@
> +/*
> + * PCI Hotplug Driver for PowerPC PowerNV platform.
> + *
> + * Copyright Gavin Shan, IBM Corporation 2015.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#ifndef _POWERNV_PHP_H
> +#define _POWERNV_PHP_H
> +
> +#include <linux/list.h>
> +#include <linux/kref.h>
> +#include <linux/of.h>
> +#include <linux/pci.h>
> +#include <linux/pci_hotplug.h>
> +#include <linux/wait.h>
> +#include <linux/workqueue.h>
> +
> +#include <asm/opal-api.h>
> +
> +/* Slot power status */
> +#define POWERNV_PHP_SLOT_POWER_OFF	0
> +#define POWERNV_PHP_SLOT_POWER_ON	1
> +
> +/* Slot presence status */
> +#define POWERNV_PHP_SLOT_EMPTY		0
> +#define POWERNV_PHP_SLOT_PRESENT	1
> +
> +/* Slot attention status */
> +#define POWERNV_PHP_SLOT_ATTEN_OFF	0
> +#define POWERNV_PHP_SLOT_ATTEN_ON	1
> +#define POWERNV_PHP_SLOT_ATTEN_IND	2
> +#define POWERNV_PHP_SLOT_ATTEN_ACT	3
> +
> +struct powernv_php_slot {
> +	char			*name;
> +	struct device_node	*dn;
> +	struct pci_bus		*bus;
> +	uint64_t		id;
> +	int			slot_no;
> +	struct kref		kref;
> +#define POWERNV_PHP_SLOT_STATE_INIT		0
> +#define POWERNV_PHP_SLOT_STATE_REGISTER		1
> +#define POWERNV_PHP_SLOT_STATE_POPULATED	2
> +	int			state;
> +	int			check_power_status;
> +	int			status_confirmed;
> +	struct opal_msg		*msg;
> +	uint64_t		dt_counter;
> +	int			overlay_id;
> +	struct work_struct	work;
> +	wait_queue_head_t	queue;
> +	struct hotplug_slot	*php_slot;
> +	struct powernv_php_slot	*parent;
> +	struct list_head	children;
> +	struct list_head	link;
> +};
> +
> +int powernv_php_msg_handler(struct notifier_block *nb,
> +			    unsigned long type, void *message);
> +struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn);
> +void powernv_php_slot_free(struct kref *kref);
> +struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn);
> +int powernv_php_slot_register(struct powernv_php_slot *slot);
> +int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan);
> +int powernv_php_register(struct device_node *dn);
> +void powernv_php_unregister(struct device_node *dn);
> +
> +#define to_powernv_php_slot(kref) \
> +	container_of(kref, struct powernv_php_slot, kref)
> +
> +static inline void powernv_php_slot_get(struct powernv_php_slot *slot)
> +{
> +	if (slot)
> +		kref_get(&slot->kref);
> +}
> +
> +static inline int powernv_php_slot_put(struct powernv_php_slot *slot)
> +{
> +	if (slot)
> +		return kref_put(&slot->kref, powernv_php_slot_free);
> +
> +	return 0;
> +}
> +
> +#endif /* !_POWERNV_PHP_H */
> diff --git a/drivers/pci/hotplug/powernv_php_slot.c b/drivers/pci/hotplug/powernv_php_slot.c
> new file mode 100644
> index 0000000..6c56455
> --- /dev/null
> +++ b/drivers/pci/hotplug/powernv_php_slot.c
> @@ -0,0 +1,732 @@
> +/*
> + * PCI Hotplug Driver for PowerPC PowerNV platform.
> + *
> + * Copyright Gavin Shan, IBM Corporation 2015.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/module.h>
> +
> +#include <asm/opal.h>
> +#include <asm/pnv-pci.h>
> +#include <asm/ppc-pci.h>
> +
> +#include "powernv_php.h"
> +
> +static LIST_HEAD(php_slot_list);
> +static DEFINE_SPINLOCK(php_slot_lock);
> +
> +/*
> + * Remove firmware data for all child device nodes of the
> + * indicated one.
> + */
> +static void remove_child_pdn(struct device_node *np)
> +{
> +	struct device_node *child;
> +
> +	for_each_child_of_node(np, child) {
> +		/* In depth first */
> +		remove_child_pdn(child);
> +
> +		remove_pci_device_node_info(child);
> +	}
> +}
> +
> +/*
> + * Remove all subordinate device nodes of the indicated one.
> + * The device nodes on the deepest path should be released first.
> + */
> +static int remove_child_device_nodes(struct device_node *parent)
> +{
> +	struct device_node *np, *child;
> +	int ret = 0;
> +
> +	/* If the device node has children, remove them first */
> +	for_each_child_of_node(parent, np) {
> +		ret = remove_child_device_nodes(np);
> +		if (ret)
> +			return ret;
> +
> +		/* The device shouldn't have alive children */
> +		child = of_get_next_child(np, NULL);
> +		if (child) {
> +			of_node_put(child);
> +			of_node_put(np);
> +			pr_err("%s: Alive children of node <%s>\n",
> +			       __func__, of_node_full_name(np));
> +			return -EBUSY;
> +		}
> +
> +		/* Detach the device node */
> +		of_detach_node(np);
> +		of_node_put(np);
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * The function processes the message sent by firmware
> + * to remove all device tree nodes beneath the slot's
> + * node, and the associated auxiliary data.
> + */
> +static void slot_power_off_handler(struct powernv_php_slot *slot)
> +{
> +	int ret;
> +
> +	/* Release the firmware data for the child device nodes */
> +	remove_child_pdn(slot->dn);
> +
> +	/*
> +	 * Release the child device nodes. If the sub-tree was
> +	 * built with the help of an overlay, we just need to
> +	 * revert the changes introduced by the overlay.
> +	 */
> +	if (slot->overlay_id >= 0) {
> +		ret = of_overlay_destroy(slot->overlay_id);
> +		if (ret)
> +			pr_warn("%s: Error %d destroying overlay %d\n",
> +				__func__, ret, slot->overlay_id);
> +		slot->overlay_id = -1;
> +	} else {
> +		ret = remove_child_device_nodes(slot->dn);
> +		if (ret)
> +			pr_warn("%s: Error %d releasing children of <%s>\n",
> +				__func__, ret, of_node_full_name(slot->dn));
> +	}
> +
> +	/* Confirm status change */
> +	slot->status_confirmed = 1;
> +	wake_up_interruptible(&slot->queue);
> +}
> +
> +static void slot_power_on_handler(struct powernv_php_slot *slot)
> +{
> +	struct device_node *nodes[3] = {NULL, NULL, NULL};
> +	struct property *prop = NULL;
> +	void *fdt = NULL, *dt = NULL;
> +	phandle handle;
> +	uint64_t len;
> +	int i, ret;
> +
> +	/* Build overlay sub-tree */
> +	for (i = 0; i < ARRAY_SIZE(nodes); i++) {
> +		nodes[i] = kzalloc(sizeof(struct device_node), GFP_KERNEL);
> +		if (!nodes[i])
> +			goto out;
> +
> +		of_node_init(nodes[i]);
> +		if (i > 0) {
> +			nodes[i - 1]->child = nodes[i];
> +			nodes[i]->parent = nodes[i - 1];
> +		}
> +	}
> +
> +	/* Target property for parent node */
> +	prop = kzalloc(sizeof(struct property), GFP_KERNEL);
> +	if (!prop)
> +		goto out;
> +	prop->name = kstrdup("target", GFP_KERNEL);
> +	if (!prop->name)
> +		goto out;
> +	prop->value = kzalloc(sizeof(phandle), GFP_KERNEL);
> +	if (!prop->value)
> +		goto out;
> +	handle = cpu_to_be32(slot->dn->phandle);
> +	memcpy(prop->value, &handle, sizeof(phandle));
> +	prop->length = sizeof(phandle);
> +	nodes[1]->properties = prop;
> +
> +	/* Names for overlay node */
> +	nodes[2]->name = kstrdup("__overlay__", GFP_KERNEL);
> +	if (!nodes[2]->name)
> +		goto out;
> +	nodes[2]->full_name = kstrdup(of_node_full_name(slot->dn), GFP_KERNEL);
> +	if (!nodes[2]->full_name)
> +		goto out;

I think you can simplify this driver by using the of_changeset api
instead of of_overlay. of_overlay is a particular data format passed
into the kernel, but it uses of_changeset in the back end. In this case,
you would allocate an of_changeset structure and then do:

of_changeset_init()
of_changeset_attach_node()
	/* you might need to create an
	 * of_changeset_attach_node_subtree() variant */
of_changeset_attach_node()
of_changeset_attach_node()
of_changeset_attach_node()
of_changeset_apply()
of_changeset_destroy() /* frees the structure */

Then you don't have to muck about with creating a DT in the structure
expected by the of_overlay code.
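
A rough, untested sketch of that sequence is below. The helper name
powernv_php_attach_nodes() and its nodes/count arguments are made up for
illustration. It assumes CONFIG_OF_DYNAMIC is enabled, that every node
already has ->parent set, and that parents come before their children in
the array. Locking rules around of_changeset_apply() have differed
between kernel versions, so check drivers/of/dynamic.c in the tree you
are targeting.

```c
#include <linux/of.h>

/*
 * Hypothetical helper: attach a set of detached device nodes to the
 * live tree through one changeset.
 */
static int powernv_php_attach_nodes(struct device_node **nodes, int count)
{
	struct of_changeset ocs;
	int i, ret;

	of_changeset_init(&ocs);

	/* Queue one attach action per node, parents first */
	for (i = 0; i < count; i++) {
		ret = of_changeset_attach_node(&ocs, nodes[i]);
		if (ret)
			goto out;
	}

	/* Apply all queued actions to the live tree */
	ret = of_changeset_apply(&ocs);
out:
	/* Free the queued entries; the applied changes stay in place */
	of_changeset_destroy(&ocs);
	return ret;
}
```

If the changeset is kept around instead of being destroyed right away,
the power-off path could undo the same set of changes with
of_changeset_revert().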

> +
> +	/* Get FDT blob */
> +	slot->dt_counter += 1;
> +	fdt = NULL;
> +	len = 0x2000;
> +	while (len <= 0x10000) {
> +		fdt = kzalloc(len, GFP_KERNEL);
> +		if (!fdt)
> +			break;
> +
> +		ret = pnv_pci_get_overlay_dt(&slot->dt_counter, fdt, len);
> +		if (!ret)
> +			break;
> +
> +		kfree(fdt);
> +		fdt = NULL;
> +		len *= 2;
> +	}
> +
> +	if (!fdt)
> +		goto out;
> +
> +	/* Unflatten device tree blob */
> +	dt = of_fdt_unflatten_tree(fdt, nodes[2], NULL);
> +
> +	/* Apply the overlay tree */
> +	slot->overlay_id = of_overlay_create(nodes[0]);
> +	if (slot->overlay_id < 0)
> +		goto out;
> +
> +	/* Add device node firmware data */
> +	traverse_pci_device_nodes(slot->dn,
> +				  add_pci_device_node_info,
> +				  pci_bus_to_host(slot->bus));
> +
> +out:
> +	kfree(dt);
> +	kfree(fdt);
> +	if (nodes[2]) {
> +		kfree(nodes[2]->name);
> +		kfree(nodes[2]->full_name);
> +	}
> +	if (prop) {
> +		kfree(prop->value);
> +		kfree(prop->name);
> +	}
> +
> +	kfree(prop);
> +	for (i = 0; i < ARRAY_SIZE(nodes); i++)
> +		kfree(nodes[i]);
> +
> +	/* Confirm status change */
> +	slot->status_confirmed = 1;
> +	wake_up_interruptible(&slot->queue);
> +}
> +
> +static void powernv_php_slot_work(struct work_struct *data)
> +{
> +	struct powernv_php_slot *slot = container_of(data,
> +						     struct powernv_php_slot,
> +						     work);
> +	uint64_t php_event = be64_to_cpu(slot->msg->params[0]);
> +
> +	switch (php_event) {
> +	case 0: /* Slot power off */
> +		slot_power_off_handler(slot);
> +		break;
> +	case 1: /* Slot power on */
> +		slot_power_on_handler(slot);
> +		break;
> +	default:
> +		pr_warn("%s: Unsupported hotplug event %lld\n",
> +			__func__, php_event);
> +	}
> +
> +	of_node_put(slot->dn);
> +}
> +
> +int powernv_php_msg_handler(struct notifier_block *nb,
> +			    unsigned long type, void *message)
> +{
> +	phandle h;
> +	struct device_node *np;
> +	struct powernv_php_slot *slot;
> +	struct opal_msg *msg = message;
> +
> +	/* Check the message type */
> +	if (type != OPAL_MSG_PCI_HOTPLUG) {
> +		pr_warn("%s: Wrong message type %ld received!\n",
> +			__func__, type);
> +		return NOTIFY_DONE;
> +	}
> +
> +	/* Find the device node */
> +	h = (phandle)be64_to_cpu(msg->params[1]);
> +	np = of_find_node_by_phandle(h);
> +	if (!np) {
> +		pr_warn("%s: No device node for phandle 0x%08x\n",
> +			__func__, h);
> +		return NOTIFY_DONE;
> +	}
> +
> +	/* Find the slot */
> +	slot = powernv_php_slot_find(np);
> +	if (!slot) {
> +		pr_warn("%s: No slot found for node <%s>\n",
> +			__func__, of_node_full_name(np));
> +		of_node_put(np);
> +		return NOTIFY_DONE;
> +	}
> +
> +	/* Schedule the work */
> +	slot->msg = msg;
> +	schedule_work(&slot->work);
> +	return NOTIFY_OK;
> +}
> +
> +static int set_power_status(struct hotplug_slot *php_slot, u8 val)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	int ret;
> +
> +	/* Retrieve the counter of device tree */
> +	ret = pnv_pci_get_overlay_dt(&slot->dt_counter, NULL, 0);
> +	if (ret) {
> +		pr_warn("%s: Error %d getting DT counter for slot %016llx\n",
> +			__func__, ret, slot->id);
> +		return ret;
> +	}
> +
> +	/* Set power status */
> +	slot->status_confirmed = 0;
> +	ret = pnv_pci_set_power_status(slot->id, val);
> +	if (ret) {
> +		pr_warn("%s: Error %d powering %s slot %016llx\n",
> +			__func__, ret, val ? "on" : "off", slot->id);
> +		return ret;
> +	}
> +
> +	/* Wait until the device tree is updated; <= 0 is timeout/signal */
> +	ret = wait_event_interruptible_timeout(slot->queue,
> +					       slot->status_confirmed,
> +					       10 * HZ);
> +	if (ret <= 0) {
> +		pr_warn("%s: Error %d completing power-%s slot %016llx\n",
> +			__func__, ret, val ? "on" : "off", slot->id);
> +		return ret ? ret : -ETIMEDOUT;
> +	}
> +
> +	return 0;
> +}
> +
> +static int get_power_status(struct hotplug_slot *php_slot, u8 *val)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	uint8_t state;
> +	int ret;
> +
> +	/*
> +	 * Retrieve power status from firmware. If we fail
> +	 * to get it, the power status falls back to
> +	 * being on.
> +	 */
> +	ret = pnv_pci_get_power_status(slot->id, &state);
> +	if (ret) {
> +		*val = POWERNV_PHP_SLOT_POWER_ON;
> +		pr_warn("%s: Error %d getting power status of slot %016llx\n",
> +			__func__, ret, slot->id);
> +	} else {
> +		*val = state ? POWERNV_PHP_SLOT_POWER_ON :
> +			       POWERNV_PHP_SLOT_POWER_OFF;
> +		php_slot->info->power_status = *val;
> +	}
> +
> +	return 0;
> +}
> +
> +static int get_adapter_status(struct hotplug_slot *php_slot, u8 *val)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	uint8_t state;
> +	int ret;
> +
> +	/*
> +	 * Retrieve presence status from firmware. If we can't
> +	 * get that, it will fall back to being empty.
> +	 */
> +	ret = pnv_pci_get_presence_status(slot->id, &state);
> +	if (ret >= 0) {
> +		ret = 0;
> +		*val = state ? POWERNV_PHP_SLOT_PRESENT :
> +			       POWERNV_PHP_SLOT_EMPTY;
> +		php_slot->info->adapter_status = *val;
> +		ret = 0;
> +	} else {
> +		*val = POWERNV_PHP_SLOT_EMPTY;
> +		pr_warn("%s: Error %d getting presence of slot %016llx\n",
> +			__func__, ret, slot->id);
> +	}
> +
> +	return ret;
> +}
> +
> +static int set_attention_status(struct hotplug_slot *php_slot, u8 val)
> +{
> +	/* The default operation would be to turn on the attention */
> +	switch (val) {
> +	case POWERNV_PHP_SLOT_ATTEN_OFF:
> +	case POWERNV_PHP_SLOT_ATTEN_ON:
> +	case POWERNV_PHP_SLOT_ATTEN_IND:
> +	case POWERNV_PHP_SLOT_ATTEN_ACT:
> +		break;
> +	default:
> +		pr_warn("%s: Invalid attention status 0x%02x\n",
> +			__func__, val);
> +		return -EINVAL;
> +	}
> +
> +	/* FIXME: Make it real once firmware supports it */
> +	php_slot->info->attention_status = val;
> +
> +	return 0;
> +}
> +
> +int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	uint8_t presence, power_status;
> +	int ret;
> +
> +	/* Check if the slot has been configured */
> +	if (slot->state != POWERNV_PHP_SLOT_STATE_REGISTER)
> +		return 0;
> +
> +	/* Retrieve slot presence status */
> +	ret = php_slot->ops->get_adapter_status(php_slot, &presence);
> +	if (ret) {
> +		pr_warn("%s: Error %d getting presence of slot %016llx\n",
> +			__func__, ret, slot->id);
> +		return ret;
> +	}
> +
> +	/* Proceed if there is nothing behind the slot */
> +	if (presence == POWERNV_PHP_SLOT_EMPTY)
> +		goto scan;
> +
> +	/*
> +	 * If we detect something behind the slot, we need to
> +	 * make sure the power supply to the slot is on. Otherwise,
> +	 * the slot's downstream PCIe link would be down.
> +	 *
> +	 * On the first time, we don't change the power status to
> +	 * boost system boot with assumption that the firmware
> +	 * supplies consistent slot power status: empty slot always
> +	 * has its power off and non-empty slot has its power on.
> +	 */
> +	if (!slot->check_power_status) {
> +		slot->check_power_status = 1;
> +		goto scan;
> +	}
> +
> +	/* Check the power status. Scan the slot if that's already on */
> +	ret = php_slot->ops->get_power_status(php_slot, &power_status);
> +	if (ret) {
> +		pr_warn("%s: Error %d getting power status of slot %016llx\n",
> +			__func__, ret, slot->id);
> +		return ret;
> +	}
> +	if (power_status == POWERNV_PHP_SLOT_POWER_ON)
> +		goto scan;
> +
> +	/* Power is off, turn it on and then scan the slot */
> +	ret = set_power_status(php_slot, POWERNV_PHP_SLOT_POWER_ON);
> +	if (ret) {
> +		pr_warn("%s: Error %d powering on slot %016llx\n",
> +			__func__, ret, slot->id);
> +		return ret;
> +	}
> +
> +scan:
> +	switch (presence) {
> +	case POWERNV_PHP_SLOT_PRESENT:
> +		if (rescan) {
> +			pci_lock_rescan_remove();
> +			pcibios_add_pci_devices(slot->bus);
> +			pci_unlock_rescan_remove();
> +		}
> +
> +		/* Rescan for child hotpluggable slots */
> +		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
> +		if (rescan)
> +			powernv_php_register(slot->dn);
> +		break;
> +	case POWERNV_PHP_SLOT_EMPTY:
> +		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
> +		break;
> +	default:
> +		pr_warn("%s: Invalid presence status %d of slot %016llx\n",
> +			__func__, presence, slot->id);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int enable_slot(struct hotplug_slot *php_slot)
> +{
> +	return powernv_php_slot_enable(php_slot, true);
> +}
> +
> +static int disable_slot(struct hotplug_slot *php_slot)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +	uint8_t power_status;
> +	int ret;
> +
> +	if (slot->state != POWERNV_PHP_SLOT_STATE_POPULATED)
> +		return 0;
> +
> +	/* Remove all devices behind the slot */
> +	pci_lock_rescan_remove();
> +	pcibios_remove_pci_devices(slot->bus);
> +	pci_unlock_rescan_remove();
> +
> +	/* Detach the child hotpluggable slots */
> +	powernv_php_unregister(slot->dn);
> +
> +	/*
> +	 * Check the power status and turn it off if necessary. If we
> +	 * fail to get the power status, the power will be forced to
> +	 * be off.
> +	 */
> +	ret = php_slot->ops->get_power_status(php_slot, &power_status);
> +	if (ret || power_status == POWERNV_PHP_SLOT_POWER_ON) {
> +		ret = set_power_status(php_slot, POWERNV_PHP_SLOT_POWER_OFF);
> +		if (ret)
> +			pr_warn("%s: Error %d powering off slot %016llx\n",
> +				__func__, ret, slot->id);
> +	}
> +
> +	/* Update slot state */
> +	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
> +	return 0;
> +}
> +
> +static struct hotplug_slot_ops php_slot_ops = {
> +	.get_power_status	= get_power_status,
> +	.get_adapter_status	= get_adapter_status,
> +	.set_attention_status	= set_attention_status,
> +	.enable_slot		= enable_slot,
> +	.disable_slot		= disable_slot,
> +};
> +
> +static struct powernv_php_slot *php_slot_match(struct device_node *dn,
> +					       struct powernv_php_slot *slot)
> +{
> +	struct powernv_php_slot *target, *tmp;
> +
> +	if (slot->dn == dn)
> +		return slot;
> +
> +	list_for_each_entry(tmp, &slot->children, link) {
> +		target = php_slot_match(dn, tmp);
> +		if (target)
> +			return target;
> +	}
> +
> +	return NULL;
> +}
> +
> +struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn)
> +{
> +	struct powernv_php_slot *slot, *tmp;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&php_slot_lock, flags);
> +	list_for_each_entry(tmp, &php_slot_list, link) {
> +		slot = php_slot_match(dn, tmp);
> +		if (slot) {
> +			spin_unlock_irqrestore(&php_slot_lock, flags);
> +			return slot;
> +		}
> +	}
> +	spin_unlock_irqrestore(&php_slot_lock, flags);
> +
> +	return NULL;
> +}
> +
> +void powernv_php_slot_free(struct kref *kref)
> +{
> +	struct powernv_php_slot *slot = to_powernv_php_slot(kref);
> +
> +	WARN_ON(!list_empty(&slot->children));
> +	kfree(slot->name);
> +	kfree(slot);
> +}
> +
> +static void php_slot_release(struct hotplug_slot *hp_slot)
> +{
> +	struct powernv_php_slot *slot = hp_slot->private;
> +	unsigned long flags;
> +
> +	/* Remove from global or child list */
> +	spin_lock_irqsave(&php_slot_lock, flags);
> +	list_del(&slot->link);
> +	spin_unlock_irqrestore(&php_slot_lock, flags);
> +
> +	/* Detach from parent */
> +	powernv_php_slot_put(slot->parent);
> +	powernv_php_slot_put(slot);
> +}
> +
> +static bool php_slot_get_id(struct device_node *dn,
> +			    uint64_t *id)
> +{
> +	struct device_node *parent = dn;
> +	const __be64 *prop64;
> +	const __be32 *prop32;
> +
> +	/*
> +	 * The hotpluggable slot always has a compound Id, which
> +	 * consists of 16-bits PHB Id, 16 bits bus/slot/function
> +	 * number, and compound indicator
> +	 */
> +	*id = (0x1ul << 63);
> +
> +	/* Bus/Slot/Function number */
> +	prop32 = of_get_property(dn, "reg", NULL);
> +	if (!prop32)
> +		return false;
> +	*id |= ((of_read_number(prop32, 1) & 0x00ffff00) << 8);
> +
> +	/* PHB Id */
> +	while ((parent = of_get_parent(parent))) {
> +		if (!PCI_DN(parent)) {
> +			of_node_put(parent);
> +			break;
> +		}
> +
> +		if (!of_device_is_compatible(parent, "ibm,ioda2-phb") &&
> +		    !of_device_is_compatible(parent, "ibm,ioda-phb")) {
> +			of_node_put(parent);
> +			continue;
> +		}
> +
> +		prop64 = of_get_property(parent, "ibm,opal-phbid", NULL);
> +		if (!prop64) {
> +			of_node_put(parent);
> +			return false;
> +		}
> +
> +		*id |= be64_to_cpup(prop64);
> +		of_node_put(parent);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn)
> +{
> +	struct pci_bus *bus;
> +	struct powernv_php_slot *slot;
> +	const char *label;
> +	uint64_t id;
> +	int slot_no;
> +	size_t size;
> +	void *pmem;
> +
> +	/* Slot name */
> +	label = of_get_property(dn, "ibm,slot-label", NULL);
> +	if (!label)
> +		return NULL;
> +
> +	/* Slot identifier */
> +	if (!php_slot_get_id(dn, &id))
> +		return NULL;
> +
> +	/* PCI bus */
> +	bus = pcibios_find_pci_bus(dn);
> +	if (!bus)
> +		return NULL;
> +
> +	/* Slot number */
> +	if (dn->child && PCI_DN(dn->child))
> +		slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
> +	else
> +		slot_no = -1;
> +
> +	/* Allocate slot */
> +	size = sizeof(struct powernv_php_slot) +
> +	       sizeof(struct hotplug_slot) +
> +	       sizeof(struct hotplug_slot_info);
> +	pmem = kzalloc(size, GFP_KERNEL);
> +	if (!pmem) {
> +		pr_warn("%s: Cannot allocate slot for node %s\n",
> +			__func__, dn->full_name);
> +		return NULL;
> +	}
> +
> +	/* Assign memory blocks */
> +	slot = pmem;
> +	slot->php_slot = pmem + sizeof(struct powernv_php_slot);
> +	slot->php_slot->info = pmem + sizeof(struct powernv_php_slot) +
> +			      sizeof(struct hotplug_slot);
> +	slot->name = kstrdup(label, GFP_KERNEL);
> +	if (!slot->name) {
> +		pr_warn("%s: Cannot populate name for node %s\n",
> +			__func__, dn->full_name);
> +		kfree(pmem);
> +		return NULL;
> +	}
> +
> +	/* Initialize slot */
> +	kref_init(&slot->kref);
> +	slot->state = POWERNV_PHP_SLOT_STATE_INIT;
> +	slot->dn = dn;
> +	slot->bus = bus;
> +	slot->id = id;
> +	slot->slot_no = slot_no;
> +	slot->overlay_id = -1;
> +	INIT_WORK(&slot->work, powernv_php_slot_work);
> +	init_waitqueue_head(&slot->queue);
> +	slot->check_power_status = 0;
> +	slot->status_confirmed = 0;
> +	slot->php_slot->ops = &php_slot_ops;
> +	slot->php_slot->release = php_slot_release;
> +	slot->php_slot->private = slot;
> +	INIT_LIST_HEAD(&slot->children);
> +	INIT_LIST_HEAD(&slot->link);
> +
> +	return slot;
> +}
> +
> +int powernv_php_slot_register(struct powernv_php_slot *slot)
> +{
> +	struct powernv_php_slot *parent = NULL;
> +	struct device_node *dn = slot->dn;
> +	unsigned long flags;
> +	int ret;
> +
> +	/* Avoid registering the same slot twice */
> +	if (powernv_php_slot_find(slot->dn))
> +		return -EEXIST;
> +
> +	/* Register slot */
> +	ret = pci_hp_register(slot->php_slot, slot->bus,
> +			      slot->slot_no, slot->name);
> +	if (ret) {
> +		pr_warn("%s: Cannot register slot %s (%d)\n",
> +			__func__, slot->name, ret);
> +		return ret;
> +	}
> +
> +	/* Put into global or parent list */
> +	while ((dn = of_get_parent(dn))) {
> +		if (!PCI_DN(dn)) {
> +			of_node_put(dn);
> +			break;
> +		}
> +
> +		parent = powernv_php_slot_find(dn);
> +		if (parent) {
> +			of_node_put(dn);
> +			break;
> +		}
> +	}
> +
> +	spin_lock_irqsave(&php_slot_lock, flags);
> +	if (parent) {
> +		powernv_php_slot_get(parent);
> +		slot->parent = parent;
> +		list_add_tail(&slot->link, &parent->children);
> +	} else {
> +		list_add_tail(&slot->link, &php_slot_list);
> +	}
> +	spin_unlock_irqrestore(&php_slot_lock, flags);
> +
> +	/* Update slot state */
> +	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
> +	return 0;
> +}
> -- 
> 2.1.0
> 
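
For anyone following the id layout in php_slot_get_id(), here is a
minimal userspace sketch of the same arithmetic. The function name
compose_slot_id(), the SLOT_ID_COMPOUND macro and the sample values are
invented for illustration; the mask and shifts are the ones used in the
patch above.

```c
#include <stdint.h>

/* Bit 63 flags the id as a compound (PHB + bus/devfn) slot id. */
#define SLOT_ID_COMPOUND	(UINT64_C(1) << 63)

/*
 * reg:    first cell of the node's "reg" property; bits 8-23 hold
 *         the bus number and devfn.
 * phb_id: value of the PHB's "ibm,opal-phbid" property.
 */
static uint64_t compose_slot_id(uint32_t reg, uint64_t phb_id)
{
	uint64_t id = SLOT_ID_COMPOUND;

	/* Move bus/devfn from bits 8-23 up to bits 16-31. */
	id |= ((uint64_t)(reg & 0x00ffff00)) << 8;
	/* The PHB id occupies the low bits. */
	id |= phb_id;

	return id;
}
```

With reg = 0x00010800 (bus 1, device 1, function 0) and PHB id 2, this
gives 0x8000000001080002.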


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree()
  2015-06-30 18:06       ` Grant Likely
@ 2015-06-30 21:46           ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-06-30 21:46 UTC (permalink / raw)
  To: Grant Likely
  Cc: Gavin Shan, linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, aik-sLpHqDYs0B2HXe+LvDLADg,
	panto-wVdstyuyKrO8r51toPun2/C9HSW9iNxf,
	robherring2-Re5JQEeQqe8AvxtiuMwx3w

On Tue, 2015-06-30 at 19:06 +0100, Grant Likely wrote:
> It may be time to dump the special allocation of fdt.c entirely and
> treat all nodes the same way, with name and properties all allocated
> with normal kmallocs.... Investigation is needed to figure out if this
> is feasible.

kmalloc isn't available early enough

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-06-30 18:18     ` Grant Likely
  (?)
@ 2015-07-01  0:51     ` Gavin Shan
  -1 siblings, 0 replies; 83+ messages in thread
From: Gavin Shan @ 2015-07-01  0:51 UTC (permalink / raw)
  To: Grant Likely
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, benh, bhelgaas,
	aik, panto, robherring2

On Tue, Jun 30, 2015 at 07:18:04PM +0100, Grant Likely wrote:
>On Thu,  4 Jun 2015 16:42:11 +1000, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
>> The patch adds a standalone driver to support PCI hotplug on the
>> PowerPC PowerNV platform, which runs on top of skiboot firmware.
>> The firmware identifies hotpluggable slots and marks their device
>> tree nodes with the "ibm,slot-pluggable" and "ibm,reset-by-firmware"
>> properties. The driver simply scans the device tree to create and
>> register PCI hotplug slots accordingly.
>> 
>> If the skiboot firmware doesn't support slot status retrieval, the PCI
>> slot device node shouldn't have the "ibm,reset-by-firmware" property. In
>> that case, no valid PCI slots will be detected from the device tree.
>> The skiboot firmware doesn't export the capability to access attention
>> LEDs yet; that is still to be done.
>> 
>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>> ---
>> v5:
>>   * Use OF OVERLAY to update the device-tree
>>   * Removed unnecessary header files
>>   * More meaningful return value from powernv_php_register_one()
>>   * Use pnv_pci_hotplug_notifier_{register, unregister}()
>>   * Decimal values for slot's states
>>   * Removed struct powernv_php_slot::release()
>>   * Merged two bool arguments to one for powernv_php_slot_enable()
>>   * Rename release_device_nodes_info() to remove_device_nodes_info()
>>   * Don't check on "!len" in slot_power_on_handler()
>>   * Handle return value in get_adapter_status() as suggested by aik
>>   * Drop invalid attention status in set_attention_status()
>>   * Renaming functions
>>   * Fixed coding style and added entry in MAINTAINERS reported by
>>     checkpatch.pl
>> ---
>>  MAINTAINERS                            |   6 +
>>  drivers/pci/hotplug/Kconfig            |  12 +
>>  drivers/pci/hotplug/Makefile           |   4 +
>>  drivers/pci/hotplug/powernv_php.c      | 140 +++++++
>>  drivers/pci/hotplug/powernv_php.h      |  90 ++++
>>  drivers/pci/hotplug/powernv_php_slot.c | 732 +++++++++++++++++++++++++++++++++
>>  6 files changed, 984 insertions(+)
>>  create mode 100644 drivers/pci/hotplug/powernv_php.c
>>  create mode 100644 drivers/pci/hotplug/powernv_php.h
>>  create mode 100644 drivers/pci/hotplug/powernv_php_slot.c
>> 
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index e308718..f5e1dce 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -7481,6 +7481,12 @@ L:	linux-pci@vger.kernel.org
>>  S:	Supported
>>  F:	Documentation/PCI/pci-error-recovery.txt
>>  
>> +PCI HOTPLUG DRIVER FOR POWERNV PLATFORM
>> +M:	Gavin Shan <gwshan@linux.vnet.ibm.com>
>> +L:	linux-pci@vger.kernel.org
>> +S:	Supported
>> +F:	drivers/pci/hotplug/powernv_php*
>> +
>>  PCI SUBSYSTEM
>>  M:	Bjorn Helgaas <bhelgaas@google.com>
>>  L:	linux-pci@vger.kernel.org
>> diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
>> index df8caec..ef55dae 100644
>> --- a/drivers/pci/hotplug/Kconfig
>> +++ b/drivers/pci/hotplug/Kconfig
>> @@ -113,6 +113,18 @@ config HOTPLUG_PCI_SHPC
>>  
>>  	  When in doubt, say N.
>>  
>> +config HOTPLUG_PCI_POWERNV
>> +	tristate "PowerPC PowerNV PCI Hotplug driver"
>> +	depends on PPC_POWERNV && EEH
>> +	help
>> +	  Say Y here if you run a PowerPC PowerNV platform that supports
>> +	  PCI hotplug.
>> +
>> +	  To compile this driver as a module, choose M here: the
>> +	  module will be called powernv-php.
>> +
>> +	  When in doubt, say N.
>> +
>>  config HOTPLUG_PCI_RPA
>>  	tristate "RPA PCI Hotplug driver"
>>  	depends on PPC_PSERIES && EEH
>> diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
>> index 4a9aa08..a69665e 100644
>> --- a/drivers/pci/hotplug/Makefile
>> +++ b/drivers/pci/hotplug/Makefile
>> @@ -14,6 +14,7 @@ obj-$(CONFIG_HOTPLUG_PCI_PCIE)		+= pciehp.o
>>  obj-$(CONFIG_HOTPLUG_PCI_CPCI_ZT5550)	+= cpcihp_zt5550.o
>>  obj-$(CONFIG_HOTPLUG_PCI_CPCI_GENERIC)	+= cpcihp_generic.o
>>  obj-$(CONFIG_HOTPLUG_PCI_SHPC)		+= shpchp.o
>> +obj-$(CONFIG_HOTPLUG_PCI_POWERNV)	+= powernv-php.o
>>  obj-$(CONFIG_HOTPLUG_PCI_RPA)		+= rpaphp.o
>>  obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR)	+= rpadlpar_io.o
>>  obj-$(CONFIG_HOTPLUG_PCI_SGI)		+= sgi_hotplug.o
>> @@ -50,6 +51,9 @@ ibmphp-objs		:=	ibmphp_core.o	\
>>  acpiphp-objs		:=	acpiphp_core.o	\
>>  				acpiphp_glue.o
>>  
>> +powernv-php-objs	:=	powernv_php.o	\
>> +				powernv_php_slot.o
>> +
>>  rpaphp-objs		:=	rpaphp_core.o	\
>>  				rpaphp_pci.o	\
>>  				rpaphp_slot.o
>> diff --git a/drivers/pci/hotplug/powernv_php.c b/drivers/pci/hotplug/powernv_php.c
>> new file mode 100644
>> index 0000000..4cbff7a
>> --- /dev/null
>> +++ b/drivers/pci/hotplug/powernv_php.c
>> @@ -0,0 +1,140 @@
>> +/*
>> + * PCI Hotplug Driver for PowerPC PowerNV platform.
>> + *
>> + * Copyright Gavin Shan, IBM Corporation 2015.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + */
>> +
>> +#include <linux/module.h>
>> +
>> +#include <asm/opal.h>
>> +#include <asm/pnv-pci.h>
>> +
>> +#include "powernv_php.h"
>> +
>> +#define DRIVER_VERSION	"0.1"
>> +#define DRIVER_AUTHOR	"Gavin Shan, IBM Corporation"
>> +#define DRIVER_DESC	"PowerPC PowerNV PCI Hotplug Driver"
>> +
>> +static struct notifier_block php_msg_nb = {
>> +	.notifier_call	= powernv_php_msg_handler,
>> +	.next		= NULL,
>> +	.priority	= 0,
>> +};
>> +
>> +static int powernv_php_register_one(struct device_node *dn)
>> +{
>> +	struct powernv_php_slot *slot;
>> +	const __be32 *prop32;
>> +	int ret;
>> +
>> +	/* Check if it's hotpluggable slot */
>> +	prop32 = of_get_property(dn, "ibm,slot-pluggable", NULL);
>> +	if (!prop32 || !of_read_number(prop32, 1))
>> +		return -ENXIO;
>> +
>> +	prop32 = of_get_property(dn, "ibm,reset-by-firmware", NULL);
>> +	if (!prop32 || !of_read_number(prop32, 1))
>> +		return -ENXIO;
>> +
>> +	/* Allocate slot */
>> +	slot = powernv_php_slot_alloc(dn);
>> +	if (!slot)
>> +		return -ENODEV;
>> +
>> +	/* Register it */
>> +	ret = powernv_php_slot_register(slot);
>> +	if (ret) {
>> +		powernv_php_slot_put(slot);
>> +		return ret;
>> +	}
>> +
>> +	return powernv_php_slot_enable(slot->php_slot, false);
>> +}
>> +
>> +int powernv_php_register(struct device_node *dn)
>> +{
>> +	struct device_node *child;
>> +	int ret = 0;
>> +
>> +	/*
>> +	 * The parent slots should be registered before their
>> +	 * child slots.
>> +	 */
>> +	for_each_child_of_node(dn, child) {
>> +		powernv_php_register_one(child);
>> +		powernv_php_register(child);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static void powernv_php_unregister_one(struct device_node *dn)
>> +{
>> +	struct powernv_php_slot *slot;
>> +
>> +	slot = powernv_php_slot_find(dn);
>> +	if (!slot)
>> +		return;
>> +
>> +	pci_hp_deregister(slot->php_slot);
>> +}
>> +
>> +void powernv_php_unregister(struct device_node *dn)
>> +{
>> +	struct device_node *child;
>> +
>> +	/* The child slots should go before their parent slots */
>> +	for_each_child_of_node(dn, child) {
>> +		powernv_php_unregister(child);
>> +		powernv_php_unregister_one(child);
>> +	}
>> +}
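
The two walkers above differ only in whether the per-node call comes
before or after the recursion: registration is a pre-order walk (parents
first), unregistration a post-order walk (children first). A minimal
userspace sketch of that ordering follows. It assumes a hypothetical
struct node in place of struct device_node, and none of the names below
come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: first child + next sibling. */
struct node {
	int id;
	struct node *child;
	struct node *sibling;
};

#define MAX_VISITS 16

/* Records the order in which nodes are handled. */
struct trace {
	int ids[MAX_VISITS];
	size_t len;
};

static void visit(struct trace *t, const struct node *n)
{
	assert(t->len < MAX_VISITS);
	t->ids[t->len++] = n->id;
}

/* Pre-order: handle each child, then descend into it (parents first). */
static void register_below(struct trace *t, const struct node *parent)
{
	const struct node *c;

	for (c = parent->child; c; c = c->sibling) {
		visit(t, c);
		register_below(t, c);
	}
}

/* Post-order: descend first, then handle the child (children first). */
static void unregister_below(struct trace *t, const struct node *parent)
{
	const struct node *c;

	for (c = parent->child; c; c = c->sibling) {
		unregister_below(t, c);
		visit(t, c);
	}
}
```

For a tree root -> {1 -> {2}, 3}, register_below() handles the nodes in
the order 1, 2, 3 and unregister_below() in the order 2, 1, 3.
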
>> +
>> +static int __init powernv_php_init(void)
>> +{
>> +	struct device_node *dn;
>> +	int ret;
>> +
>> +	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
>> +
>> +	/* Register hotplug message handler */
>> +	ret = pnv_pci_hotplug_notifier_register(&php_msg_nb);
>> +	if (ret) {
>> +		pr_warn("%s: Error %d registering hotplug notifier\n",
>> +			__func__, ret);
>> +		return ret;
>> +	}
>> +
>> +	/* Scan PHB nodes and their children */
>> +	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
>> +		powernv_php_register(dn);
>> +	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
>> +		powernv_php_register(dn);
>> +
>> +	return 0;
>> +}
>> +
>> +static void __exit powernv_php_exit(void)
>> +{
>> +	struct device_node *dn;
>> +
>> +	pnv_pci_hotplug_notifier_unregister(&php_msg_nb);
>> +
>> +	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
>> +		powernv_php_unregister(dn);
>> +	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
>> +		powernv_php_unregister(dn);
>> +}
>> +
>> +module_init(powernv_php_init);
>> +module_exit(powernv_php_exit);
>> +
>> +MODULE_VERSION(DRIVER_VERSION);
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_AUTHOR(DRIVER_AUTHOR);
>> +MODULE_DESCRIPTION(DRIVER_DESC);
>> diff --git a/drivers/pci/hotplug/powernv_php.h b/drivers/pci/hotplug/powernv_php.h
>> new file mode 100644
>> index 0000000..5e14a65
>> --- /dev/null
>> +++ b/drivers/pci/hotplug/powernv_php.h
>> @@ -0,0 +1,90 @@
>> +/*
>> + * PCI Hotplug Driver for PowerPC PowerNV platform.
>> + *
>> + * Copyright Gavin Shan, IBM Corporation 2015.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + */
>> +
>> +#ifndef _POWERNV_PHP_H
>> +#define _POWERNV_PHP_H
>> +
>> +#include <linux/list.h>
>> +#include <linux/kref.h>
>> +#include <linux/of.h>
>> +#include <linux/pci.h>
>> +#include <linux/pci_hotplug.h>
>> +#include <linux/wait.h>
>> +#include <linux/workqueue.h>
>> +
>> +#include <asm/opal-api.h>
>> +
>> +/* Slot power status */
>> +#define POWERNV_PHP_SLOT_POWER_OFF	0
>> +#define POWERNV_PHP_SLOT_POWER_ON	1
>> +
>> +/* Slot presence status */
>> +#define POWERNV_PHP_SLOT_EMPTY		0
>> +#define POWERNV_PHP_SLOT_PRESENT	1
>> +
>> +/* Slot attention status */
>> +#define POWERNV_PHP_SLOT_ATTEN_OFF	0
>> +#define POWERNV_PHP_SLOT_ATTEN_ON	1
>> +#define POWERNV_PHP_SLOT_ATTEN_IND	2
>> +#define POWERNV_PHP_SLOT_ATTEN_ACT	3
>> +
>> +struct powernv_php_slot {
>> +	char			*name;
>> +	struct device_node	*dn;
>> +	struct pci_bus		*bus;
>> +	uint64_t		id;
>> +	int			slot_no;
>> +	struct kref		kref;
>> +#define POWERNV_PHP_SLOT_STATE_INIT		0
>> +#define POWERNV_PHP_SLOT_STATE_REGISTER		1
>> +#define POWERNV_PHP_SLOT_STATE_POPULATED	2
>> +	int			state;
>> +	int			check_power_status;
>> +	int			status_confirmed;
>> +	struct opal_msg		*msg;
>> +	uint64_t		dt_counter;
>> +	int			overlay_id;
>> +	struct work_struct	work;
>> +	wait_queue_head_t	queue;
>> +	struct hotplug_slot	*php_slot;
>> +	struct powernv_php_slot	*parent;
>> +	struct list_head	children;
>> +	struct list_head	link;
>> +};
>> +
>> +int powernv_php_msg_handler(struct notifier_block *nb,
>> +			    unsigned long type, void *message);
>> +struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn);
>> +void powernv_php_slot_free(struct kref *kref);
>> +struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn);
>> +int powernv_php_slot_register(struct powernv_php_slot *slot);
>> +int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan);
>> +int powernv_php_register(struct device_node *dn);
>> +void powernv_php_unregister(struct device_node *dn);
>> +
>> +#define to_powernv_php_slot(kref) \
>> +	container_of(kref, struct powernv_php_slot, kref)
>> +
>> +static inline void powernv_php_slot_get(struct powernv_php_slot *slot)
>> +{
>> +	if (slot)
>> +		kref_get(&slot->kref);
>> +}
>> +
>> +static inline int powernv_php_slot_put(struct powernv_php_slot *slot)
>> +{
>> +	if (slot)
>> +		return kref_put(&slot->kref, powernv_php_slot_free);
>> +
>> +	return 0;
>> +}
>> +
>> +#endif /* !_POWERNV_PHP_H */
>> diff --git a/drivers/pci/hotplug/powernv_php_slot.c b/drivers/pci/hotplug/powernv_php_slot.c
>> new file mode 100644
>> index 0000000..6c56455
>> --- /dev/null
>> +++ b/drivers/pci/hotplug/powernv_php_slot.c
>> @@ -0,0 +1,732 @@
>> +/*
>> + * PCI Hotplug Driver for PowerPC PowerNV platform.
>> + *
>> + * Copyright Gavin Shan, IBM Corporation 2015.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + */
>> +
>> +#include <linux/module.h>
>> +
>> +#include <asm/opal.h>
>> +#include <asm/pnv-pci.h>
>> +#include <asm/ppc-pci.h>
>> +
>> +#include "powernv_php.h"
>> +
>> +static LIST_HEAD(php_slot_list);
>> +static DEFINE_SPINLOCK(php_slot_lock);
>> +
>> +/*
>> + * Remove firmware data for all child device nodes of the
>> + * indicated one.
>> + */
>> +static void remove_child_pdn(struct device_node *np)
>> +{
>> +	struct device_node *child;
>> +
>> +	for_each_child_of_node(np, child) {
>> +		/* In depth first */
>> +		remove_child_pdn(child);
>> +
>> +		remove_pci_device_node_info(child);
>> +	}
>> +}
>> +
>> +/*
>> + * Remove all subordinate device nodes of the indicated one.
>> + * The device nodes on the deepest paths are released first.
>> + */
>> +static int remove_child_device_nodes(struct device_node *parent)
>> +{
>> +	struct device_node *np, *child;
>> +	int ret = 0;
>> +
>> +	/* If the device node has children, remove them first */
>> +	for_each_child_of_node(parent, np) {
>> +		ret = remove_child_device_nodes(np);
>> +		if (ret)
>> +			return ret;
>> +
>> +		/* The device shouldn't have live children */
>> +		child = of_get_next_child(np, NULL);
>> +		if (child) {
>> +			pr_err("%s: Alive children of node <%s>\n",
>> +			       __func__, of_node_full_name(np));
>> +			of_node_put(child);
>> +			of_node_put(np);
>> +			return -EBUSY;
>> +		}
>> +
>> +		/* Detach the device node */
>> +		of_detach_node(np);
>> +		of_node_put(np);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + * The function processes the message sent by firmware
>> + * to remove all device tree nodes beneath the slot's
>> + * nodes, and the associated auxiliary data.
>> + */
>> +static void slot_power_off_handler(struct powernv_php_slot *slot)
>> +{
>> +	int ret;
>> +
>> +	/* Release the firmware data for the child device nodes */
>> +	remove_child_pdn(slot->dn);
>> +
>> +	/*
>> +	 * Release the child device nodes. If the sub-tree was
>> +	 * built with the help of an overlay, we just need to revert
>> +	 * the changes introduced by the overlay.
>> +	 */
>> +	if (slot->overlay_id >= 0) {
>> +		ret = of_overlay_destroy(slot->overlay_id);
>> +		if (ret)
>> +			pr_warn("%s: Error %d destroying overlay %d\n",
>> +				__func__, ret, slot->overlay_id);
>> +		slot->overlay_id = -1;
>> +	} else {
>> +		ret = remove_child_device_nodes(slot->dn);
>> +		if (ret)
>> +			pr_warn("%s: Error %d releasing children of <%s>\n",
>> +				__func__, ret, of_node_full_name(slot->dn));
>> +	}
>> +
>> +	/* Confirm status change */
>> +	slot->status_confirmed = 1;
>> +	wake_up(&slot->queue);
>> +}
>> +
>> +static void slot_power_on_handler(struct powernv_php_slot *slot)
>> +{
>> +	struct device_node *nodes[3] = {NULL, NULL, NULL};
>> +	struct property *prop = NULL;
>> +	void *fdt = NULL, *dt = NULL;
>> +	phandle handle;
>> +	uint64_t len;
>> +	int i, ret;
>> +
>> +	/* Build overlay sub-tree */
>> +	for (i = 0; i < ARRAY_SIZE(nodes); i++) {
>> +		nodes[i] = kzalloc(sizeof(struct device_node), GFP_KERNEL);
>> +		if (!nodes[i])
>> +			goto out;
>> +
>> +		of_node_init(nodes[i]);
>> +		if (i > 0) {
>> +			nodes[i - 1]->child = nodes[i];
>> +			nodes[i]->parent = nodes[i - 1];
>> +		}
>> +	}
>> +
>> +	/* Target property for parent node */
>> +	prop = kzalloc(sizeof(struct property), GFP_KERNEL);
>> +	if (!prop)
>> +		goto out;
>> +	prop->name = kstrdup("target", GFP_KERNEL);
>> +	if (!prop->name)
>> +		goto out;
>> +	prop->value = kzalloc(sizeof(phandle), GFP_KERNEL);
>> +	if (!prop->value)
>> +		goto out;
>> +	handle = cpu_to_be32(slot->dn->phandle);
>> +	memcpy(prop->value, &handle, sizeof(phandle));
>> +	prop->length = sizeof(phandle);
>> +	nodes[1]->properties = prop;
>> +
>> +	/* Names for overlay node */
>> +	nodes[2]->name = kstrdup("__overlay__", GFP_KERNEL);
>> +	if (!nodes[2]->name)
>> +		goto out;
>> +	nodes[2]->full_name = kstrdup(of_node_full_name(slot->dn), GFP_KERNEL);
>> +	if (!nodes[2]->full_name)
>> +		goto out;
>
>I think you can simplify this driver by using the of_changeset api
>instead of of_overlay. of_overlay is a particular data format passed
>into the kernel, but it uses of_changeset in the back end. In this case,
>you would allocate an of_changeset structure and then do:
>
>of_changeset_init()
>of_changeset_attach_node()
>	/* you might need to create an
>	 * of_changeset_attach_node_subtree() variant */
>of_changeset_attach_node()
>of_changeset_attach_node()
>of_changeset_attach_node()
>of_changeset_apply()
>of_changeset_destroy() /* frees the structure */
>
>Then you don't have to muck about with creating a DT in the structure
>expected by the of_overlay code.
>

Yeah, thanks for the suggestion, Grant. I'm waiting for a usable
4.2-rc1. Once it is out, I'll integrate the comments I've received
and post a new revision. The switch to the changeset API will be
included in that revision.

Thanks,
Gavin
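
A toy userspace model of the queue-then-apply flow Grant outlines above
may help when comparing it with the overlay approach. This is only a
sketch: it is not the kernel's of_changeset implementation, and the
toy_* names, the fixed-size queue and the rollback rule are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS 8

/* Hypothetical stand-in for a device node. */
struct toy_node {
	const char *name;
	struct toy_node *parent;
	int attached;		/* 1 once linked into the "live" tree */
};

/* Hypothetical changeset: an ordered queue of pending attach operations. */
struct toy_changeset {
	struct toy_node *pending[MAX_OPS];
	size_t count;
};

static void toy_changeset_init(struct toy_changeset *cs)
{
	cs->count = 0;
}

/* Queue an attach; the live tree is not modified yet. */
static int toy_changeset_attach_node(struct toy_changeset *cs,
				     struct toy_node *np)
{
	if (cs->count == MAX_OPS)
		return -1;
	cs->pending[cs->count++] = np;
	return 0;
}

/*
 * Apply the queued attaches in order. A node can only be attached under
 * a parent that is already attached. On failure, undo the attaches made
 * so far so the tree is left as it was.
 */
static int toy_changeset_apply(struct toy_changeset *cs)
{
	size_t i;

	for (i = 0; i < cs->count; i++) {
		struct toy_node *np = cs->pending[i];

		if (np->parent && !np->parent->attached) {
			while (i-- > 0)
				cs->pending[i]->attached = 0;
			return -1;
		}
		np->attached = 1;
	}
	return 0;
}

/* Forget the queued operations; the nodes themselves are not freed. */
static void toy_changeset_destroy(struct toy_changeset *cs)
{
	cs->count = 0;
}
```

The point of the model is the shape of the calling code: the caller
queues one attach per new node, parents before children, and the tree
only changes at apply time, all or nothing.
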

>> +
>> +	/* Get FDT blob */
>> +	slot->dt_counter += 1;
>> +	fdt = NULL;
>> +	len = 0x2000;
>> +	while (len <= 0x10000) {
>> +		fdt = kzalloc(len, GFP_KERNEL);
>> +		if (!fdt)
>> +			break;
>> +
>> +		ret = pnv_pci_get_overlay_dt(&slot->dt_counter, fdt, len);
>> +		if (!ret)
>> +			break;
>> +
>> +		kfree(fdt);
>> +		fdt = NULL;
>> +		len *= 2;
>> +	}
>> +
>> +	if (!fdt)
>> +		goto out;
>> +
>> +	/* Unflatten device tree blob */
>> +	dt = of_fdt_unflatten_tree(fdt, nodes[2], NULL);
>> +
>> +	/* Apply the overlay tree */
>> +	slot->overlay_id = of_overlay_create(nodes[0]);
>> +	if (slot->overlay_id < 0)
>> +		goto out;
>> +
>> +	/* Add device node firmware data */
>> +	traverse_pci_device_nodes(slot->dn,
>> +				  add_pci_device_node_info,
>> +				  pci_bus_to_host(slot->bus));
>> +
>> +out:
>> +	kfree(dt);
>> +	kfree(fdt);
>> +	if (nodes[2]) {
>> +		kfree(nodes[2]->name);
>> +		kfree(nodes[2]->full_name);
>> +	}
>> +	if (prop) {
>> +		kfree(prop->value);
>> +		kfree(prop->name);
>> +	}
>> +
>> +	kfree(prop);
>> +	for (i = 0; i < ARRAY_SIZE(nodes); i++)
>> +		kfree(nodes[i]);
>> +
>> +	/* Confirm status change */
>> +	slot->status_confirmed = 1;
>> +	wake_up(&slot->queue);
>> +}
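
The FDT fetch in slot_power_on_handler() is a retry loop that doubles
the buffer from 0x2000 up to 0x10000 bytes. A minimal userspace sketch
of that pattern follows, assuming a hypothetical callback that returns
0 on success and nonzero when the buffer is too small. Note that the
loop in the patch retries on any nonzero return, not only on a
"too small" error. fetch_fn, fetch_with_growing_buffer() and
fetch_needs() are illustrative names, not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fetch callback: 0 on success, nonzero if buf is too small. */
typedef int (*fetch_fn)(void *buf, size_t len, void *ctx);

/*
 * Try buffers of size first, 2*first, 4*first, ... up to max.
 * Returns a heap buffer the caller must free(), or NULL on failure.
 * Assumes max is small enough that doubling cannot overflow size_t.
 */
static void *fetch_with_growing_buffer(fetch_fn fetch, void *ctx,
				       size_t first, size_t max,
				       size_t *out_len)
{
	size_t len;

	assert(first > 0 && first <= max);

	for (len = first; len <= max; len *= 2) {
		void *buf = calloc(1, len);

		if (!buf)
			return NULL;
		if (fetch(buf, len, ctx) == 0) {
			*out_len = len;
			return buf;
		}
		free(buf);
	}

	return NULL;
}

/* Example callback: succeeds once the buffer can hold *ctx bytes. */
static int fetch_needs(void *buf, size_t len, void *ctx)
{
	size_t needed = *(size_t *)ctx;

	(void)buf;
	return len >= needed ? 0 : -1;
}
```

With first = 0x2000 and max = 0x10000, a payload of 0x5000 bytes is
satisfied on the third attempt with a 0x8000-byte buffer, and a payload
larger than 0x10000 bytes makes the helper return NULL.
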
>> +
>> +static void powernv_php_slot_work(struct work_struct *data)
>> +{
>> +	struct powernv_php_slot *slot = container_of(data,
>> +						     struct powernv_php_slot,
>> +						     work);
>> +	uint64_t php_event = be64_to_cpu(slot->msg->params[0]);
>> +
>> +	switch (php_event) {
>> +	case 0: /* Slot power off */
>> +		slot_power_off_handler(slot);
>> +		break;
>> +	case 1: /* Slot power on */
>> +		slot_power_on_handler(slot);
>> +		break;
>> +	default:
>> +		pr_warn("%s: Unsupported hotplug event %lld\n",
>> +			__func__, php_event);
>> +	}
>> +
>> +	of_node_put(slot->dn);
>> +}
>> +
>> +int powernv_php_msg_handler(struct notifier_block *nb,
>> +			    unsigned long type, void *message)
>> +{
>> +	phandle h;
>> +	struct device_node *np;
>> +	struct powernv_php_slot *slot;
>> +	struct opal_msg *msg = message;
>> +
>> +	/* Check the message type */
>> +	if (type != OPAL_MSG_PCI_HOTPLUG) {
>> +		pr_warn("%s: Wrong message type %ld received!\n",
>> +			__func__, type);
>> +		return NOTIFY_DONE;
>> +	}
>> +
>> +	/* Find the device node */
>> +	h = (phandle)be64_to_cpu(msg->params[1]);
>> +	np = of_find_node_by_phandle(h);
>> +	if (!np) {
>> +		pr_warn("%s: No device node for phandle 0x%08x\n",
>> +			__func__, h);
>> +		return NOTIFY_DONE;
>> +	}
>> +
>> +	/* Find the slot */
>> +	slot = powernv_php_slot_find(np);
>> +	if (!slot) {
>> +		pr_warn("%s: No slot found for node <%s>\n",
>> +			__func__, of_node_full_name(np));
>> +		of_node_put(np);
>> +		return NOTIFY_DONE;
>> +	}
>> +
>> +	/* Schedule the work */
>> +	slot->msg = msg;
>> +	schedule_work(&slot->work);
>> +	return NOTIFY_OK;
>> +}
>> +
>> +static int set_power_status(struct hotplug_slot *php_slot, u8 val)
>> +{
>> +	struct powernv_php_slot *slot = php_slot->private;
>> +	int ret;
>> +
>> +	/* Retrieve the counter of device tree */
>> +	ret = pnv_pci_get_overlay_dt(&slot->dt_counter, NULL, 0);
>> +	if (ret) {
>> +		pr_warn("%s: Error %d getting DT counter for slot %016llx\n",
>> +			__func__, ret, slot->id);
>> +		return ret;
>> +	}
>> +
>> +	/* Set power status */
>> +	slot->status_confirmed = 0;
>> +	ret = pnv_pci_set_power_status(slot->id, val);
>> +	if (ret) {
>> +		pr_warn("%s: Error %d powering %s slot %016llx\n",
>> +			__func__, ret, val ? "on" : "off", slot->id);
>> +		return ret;
>> +	}
>> +
>> +	/* Wait until the device tree is updated */
>> +	ret = wait_event_timeout(slot->queue,
>> +				 slot->status_confirmed,
>> +				 10 * HZ);
>> +	if (!ret) {
>> +		pr_warn("%s: Timeout completing power-%s slot %016llx\n",
>> +			__func__, val ? "on" : "off", slot->id);
>> +		return -ETIMEDOUT;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int get_power_status(struct hotplug_slot *php_slot, u8 *val)
>> +{
>> +	struct powernv_php_slot *slot = php_slot->private;
>> +	uint8_t state;
>> +	int ret;
>> +
>> +	/*
>> +	 * Retrieve power status from firmware. If we fail
>> +	 * getting that, the power status falls back to
>> +	 * being on.
>> +	 */
>> +	ret = pnv_pci_get_power_status(slot->id, &state);
>> +	if (ret) {
>> +		*val = POWERNV_PHP_SLOT_POWER_ON;
>> +		pr_warn("%s: Error %d getting power status of slot %016llx\n",
>> +			__func__, ret, slot->id);
>> +	} else {
>> +		*val = state ? POWERNV_PHP_SLOT_POWER_ON :
>> +			       POWERNV_PHP_SLOT_POWER_OFF;
>> +		php_slot->info->power_status = *val;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int get_adapter_status(struct hotplug_slot *php_slot, u8 *val)
>> +{
>> +	struct powernv_php_slot *slot = php_slot->private;
>> +	uint8_t state;
>> +	int ret;
>> +
>> +	/*
>> +	 * Retrieve presence status from firmware. If we can't
>> +	 * get that, it falls back to empty.
>> +	 */
>> +	ret = pnv_pci_get_presence_status(slot->id, &state);
>> +	if (ret >= 0) {
>> +		ret = 0;
>> +		*val = state ? POWERNV_PHP_SLOT_PRESENT :
>> +			       POWERNV_PHP_SLOT_EMPTY;
>> +		php_slot->info->adapter_status = *val;
>> +		ret = 0;
>> +	} else {
>> +		*val = POWERNV_PHP_SLOT_EMPTY;
>> +		pr_warn("%s: Error %d getting presence of slot %016llx\n",
>> +			__func__, ret, slot->id);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static int set_attention_status(struct hotplug_slot *php_slot, u8 val)
>> +{
>> +	/* The default operation would be to turn on the attention */
>> +	switch (val) {
>> +	case POWERNV_PHP_SLOT_ATTEN_OFF:
>> +	case POWERNV_PHP_SLOT_ATTEN_ON:
>> +	case POWERNV_PHP_SLOT_ATTEN_IND:
>> +	case POWERNV_PHP_SLOT_ATTEN_ACT:
>> +		break;
>> +	default:
>> +		pr_warn("%s: Invalid attention status 0x%02x\n",
>> +			__func__, val);
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* FIXME: Make it real once firmware supports it */
>> +	php_slot->info->attention_status = val;
>> +
>> +	return 0;
>> +}
>> +
>> +int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan)
>> +{
>> +	struct powernv_php_slot *slot = php_slot->private;
>> +	uint8_t presence, power_status;
>> +	int ret;
>> +
>> +	/* Check if the slot has been configured */
>> +	if (slot->state != POWERNV_PHP_SLOT_STATE_REGISTER)
>> +		return 0;
>> +
>> +	/* Retrieve slot presence status */
>> +	ret = php_slot->ops->get_adapter_status(php_slot, &presence);
>> +	if (ret) {
>> +		pr_warn("%s: Error %d getting presence of slot %016llx\n",
>> +			__func__, ret, slot->id);
>> +		return ret;
>> +	}
>> +
>> +	/* Proceed if there is nothing behind the slot */
>> +	if (presence == POWERNV_PHP_SLOT_EMPTY)
>> +		goto scan;
>> +
>> +	/*
>> +	 * If we detect something behind the slot, we need to make
>> +	 * sure the power supply to the slot is on. Otherwise, the
>> +	 * slot's downstream PCIe link would be down.
>> +	 *
>> +	 * The first time through, we don't touch the power status,
>> +	 * to speed up boot, on the assumption that the firmware
>> +	 * supplies consistent slot power status: an empty slot always
>> +	 * has its power off and a non-empty slot has its power on.
>> +	 */
>> +	if (!slot->check_power_status) {
>> +		slot->check_power_status = 1;
>> +		goto scan;
>> +	}
>> +
>> +	/* Check the power status. Scan the slot if that's already on */
>> +	ret = php_slot->ops->get_power_status(php_slot, &power_status);
>> +	if (ret) {
>> +		pr_warn("%s: Error %d getting power status of slot %016llx\n",
>> +			__func__, ret, slot->id);
>> +		return ret;
>> +	}
>> +	if (power_status == POWERNV_PHP_SLOT_POWER_ON)
>> +		goto scan;
>> +
>> +	/* Power is off, turn it on and then scan the slot */
>> +	ret = set_power_status(php_slot, POWERNV_PHP_SLOT_POWER_ON);
>> +	if (ret) {
>> +		pr_warn("%s: Error %d powering on slot %016llx\n",
>> +			__func__, ret, slot->id);
>> +		return ret;
>> +	}
>> +
>> +scan:
>> +	switch (presence) {
>> +	case POWERNV_PHP_SLOT_PRESENT:
>> +		if (rescan) {
>> +			pci_lock_rescan_remove();
>> +			pcibios_add_pci_devices(slot->bus);
>> +			pci_unlock_rescan_remove();
>> +		}
>> +
>> +		/* Rescan for child hotpluggable slots */
>> +		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
>> +		if (rescan)
>> +			powernv_php_register(slot->dn);
>> +		break;
>> +	case POWERNV_PHP_SLOT_EMPTY:
>> +		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
>> +		break;
>> +	default:
>> +		pr_warn("%s: Invalid presence status %d of slot %016llx\n",
>> +			__func__, presence, slot->id);
>> +		return -EINVAL;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int enable_slot(struct hotplug_slot *php_slot)
>> +{
>> +	return powernv_php_slot_enable(php_slot, true);
>> +}
>> +
>> +static int disable_slot(struct hotplug_slot *php_slot)
>> +{
>> +	struct powernv_php_slot *slot = php_slot->private;
>> +	uint8_t power_status;
>> +	int ret;
>> +
>> +	if (slot->state != POWERNV_PHP_SLOT_STATE_POPULATED)
>> +		return 0;
>> +
>> +	/* Remove all devices behind the slot */
>> +	pci_lock_rescan_remove();
>> +	pcibios_remove_pci_devices(slot->bus);
>> +	pci_unlock_rescan_remove();
>> +
>> +	/* Detach the child hotpluggable slots */
>> +	powernv_php_unregister(slot->dn);
>> +
>> +	/*
>> +	 * Check the power status and turn it off if necessary. If we
>> +	 * fail to get the power status, the power will be forced to
>> +	 * be off.
>> +	 */
>> +	ret = php_slot->ops->get_power_status(php_slot, &power_status);
>> +	if (ret || power_status == POWERNV_PHP_SLOT_POWER_ON) {
>> +		ret = set_power_status(php_slot, POWERNV_PHP_SLOT_POWER_OFF);
>> +		if (ret)
>> +			pr_warn("%s: Error %d powering off slot %016llx\n",
>> +				__func__, ret, slot->id);
>> +	}
>> +
>> +	/* Update slot state */
>> +	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
>> +	return 0;
>> +}
>> +
>> +static struct hotplug_slot_ops php_slot_ops = {
>> +	.get_power_status	= get_power_status,
>> +	.get_adapter_status	= get_adapter_status,
>> +	.set_attention_status	= set_attention_status,
>> +	.enable_slot		= enable_slot,
>> +	.disable_slot		= disable_slot,
>> +};
>> +
>> +static struct powernv_php_slot *php_slot_match(struct device_node *dn,
>> +					       struct powernv_php_slot *slot)
>> +{
>> +	struct powernv_php_slot *target, *tmp;
>> +
>> +	if (slot->dn == dn)
>> +		return slot;
>> +
>> +	list_for_each_entry(tmp, &slot->children, link) {
>> +		target = php_slot_match(dn, tmp);
>> +		if (target)
>> +			return target;
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn)
>> +{
>> +	struct powernv_php_slot *slot, *tmp;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&php_slot_lock, flags);
>> +	list_for_each_entry(tmp, &php_slot_list, link) {
>> +		slot = php_slot_match(dn, tmp);
>> +		if (slot) {
>> +			spin_unlock_irqrestore(&php_slot_lock, flags);
>> +			return slot;
>> +		}
>> +	}
>> +	spin_unlock_irqrestore(&php_slot_lock, flags);
>> +
>> +	return NULL;
>> +}
>> +
>> +void powernv_php_slot_free(struct kref *kref)
>> +{
>> +	struct powernv_php_slot *slot = to_powernv_php_slot(kref);
>> +
>> +	WARN_ON(!list_empty(&slot->children));
>> +	kfree(slot->name);
>> +	kfree(slot);
>> +}
>> +
>> +static void php_slot_release(struct hotplug_slot *hp_slot)
>> +{
>> +	struct powernv_php_slot *slot = hp_slot->private;
>> +	unsigned long flags;
>> +
>> +	/* Remove from global or child list */
>> +	spin_lock_irqsave(&php_slot_lock, flags);
>> +	list_del(&slot->link);
>> +	spin_unlock_irqrestore(&php_slot_lock, flags);
>> +
>> +	/* Drop the parent's reference first: the last put frees slot */
>> +	powernv_php_slot_put(slot->parent);
>> +	powernv_php_slot_put(slot);
>> +}
>> +
>> +static bool php_slot_get_id(struct device_node *dn,
>> +			    uint64_t *id)
>> +{
>> +	struct device_node *parent = dn;
>> +	const __be64 *prop64;
>> +	const __be32 *prop32;
>> +
>> +	/*
>> +	 * The hotpluggable slot always has a compound Id, which
>> +	 * consists of a 16-bit PHB Id, a 16-bit bus/slot/function
>> +	 * number, and the compound indicator (bit 63).
>> +	 */
>> +	*id = (0x1ul << 63);
>> +
>> +	/* Bus/Slot/Function number */
>> +	prop32 = of_get_property(dn, "reg", NULL);
>> +	if (!prop32)
>> +		return false;
>> +	*id |= ((of_read_number(prop32, 1) & 0x00ffff00) << 8);
>> +
>> +	/* PHB Id */
>> +	while ((parent = of_get_parent(parent))) {
>> +		if (!PCI_DN(parent)) {
>> +			of_node_put(parent);
>> +			break;
>> +		}
>> +
>> +		if (!of_device_is_compatible(parent, "ibm,ioda2-phb") &&
>> +		    !of_device_is_compatible(parent, "ibm,ioda-phb")) {
>> +			of_node_put(parent);
>> +			continue;
>> +		}
>> +
>> +		prop64 = of_get_property(parent, "ibm,opal-phbid", NULL);
>> +		if (!prop64) {
>> +			of_node_put(parent);
>> +			return false;
>> +		}
>> +
>> +		*id |= be64_to_cpup(prop64);
>> +		of_node_put(parent);
>> +		return true;
>> +	}
>> +
>> +	return false;
>> +}
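
The bit layout that php_slot_get_id() builds can be checked in
userspace. A minimal sketch follows, assuming the PHB Id fits in the low
16 bits as the comment in the patch states. make_slot_id() and
SLOT_ID_COMPOUND are illustrative names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 63 marks the Id as a compound slot Id. */
#define SLOT_ID_COMPOUND	(UINT64_C(1) << 63)

/*
 * Compose the slot Id from the PHB Id and the first cell of the "reg"
 * property. The bus/device/function number sits in bits 8..23 of that
 * cell and is moved to bits 16..31 of the Id. The PHB Id occupies
 * bits 0..15.
 */
static uint64_t make_slot_id(uint64_t phb_id, uint32_t reg)
{
	uint64_t bdfn = (uint64_t)(reg & 0x00ffff00u) << 8;

	assert(phb_id <= 0xffff);	/* assumed 16-bit PHB Id */

	return SLOT_ID_COMPOUND | bdfn | phb_id;
}
```

For PHB Id 1 and a "reg" cell of 0x00010800 (bus 1, device 1,
function 0), this yields 0x8000000001080001. Bits of the cell outside
0x00ffff00 are masked off and do not affect the result.
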
>> +
>> +struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn)
>> +{
>> +	struct pci_bus *bus;
>> +	struct powernv_php_slot *slot;
>> +	const char *label;
>> +	uint64_t id;
>> +	int slot_no;
>> +	size_t size;
>> +	void *pmem;
>> +
>> +	/* Slot name */
>> +	label = of_get_property(dn, "ibm,slot-label", NULL);
>> +	if (!label)
>> +		return NULL;
>> +
>> +	/* Slot identifier */
>> +	if (!php_slot_get_id(dn, &id))
>> +		return NULL;
>> +
>> +	/* PCI bus */
>> +	bus = pcibios_find_pci_bus(dn);
>> +	if (!bus)
>> +		return NULL;
>> +
>> +	/* Slot number */
>> +	if (dn->child && PCI_DN(dn->child))
>> +		slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
>> +	else
>> +		slot_no = -1;
>> +
>> +	/* Allocate slot */
>> +	size = sizeof(struct powernv_php_slot) +
>> +	       sizeof(struct hotplug_slot) +
>> +	       sizeof(struct hotplug_slot_info);
>> +	pmem = kzalloc(size, GFP_KERNEL);
>> +	if (!pmem) {
>> +		pr_warn("%s: Cannot allocate slot for node %s\n",
>> +			__func__, dn->full_name);
>> +		return NULL;
>> +	}
>> +
>> +	/* Assign memory blocks */
>> +	slot = pmem;
>> +	slot->php_slot = pmem + sizeof(struct powernv_php_slot);
>> +	slot->php_slot->info = pmem + sizeof(struct powernv_php_slot) +
>> +			      sizeof(struct hotplug_slot);
>> +	slot->name = kstrdup(label, GFP_KERNEL);
>> +	if (!slot->name) {
>> +		pr_warn("%s: Cannot populate name for node %s\n",
>> +			__func__, dn->full_name);
>> +		kfree(pmem);
>> +		return NULL;
>> +	}
>> +
>> +	/* Initialize slot */
>> +	kref_init(&slot->kref);
>> +	slot->state = POWERNV_PHP_SLOT_STATE_INIT;
>> +	slot->dn = dn;
>> +	slot->bus = bus;
>> +	slot->id = id;
>> +	slot->slot_no = slot_no;
>> +	slot->overlay_id = -1;
>> +	INIT_WORK(&slot->work, powernv_php_slot_work);
>> +	init_waitqueue_head(&slot->queue);
>> +	slot->check_power_status = 0;
>> +	slot->status_confirmed = 0;
>> +	slot->php_slot->ops = &php_slot_ops;
>> +	slot->php_slot->release = php_slot_release;
>> +	slot->php_slot->private = slot;
>> +	INIT_LIST_HEAD(&slot->children);
>> +	INIT_LIST_HEAD(&slot->link);
>> +
>> +	return slot;
>> +}
>> +
>> +int powernv_php_slot_register(struct powernv_php_slot *slot)
>> +{
>> +	struct powernv_php_slot *parent = NULL;
>> +	struct device_node *dn = slot->dn;
>> +	unsigned long flags;
>> +	int ret;
>> +
>> +	/* Avoid registering the same slot twice */
>> +	if (powernv_php_slot_find(slot->dn))
>> +		return -EEXIST;
>> +
>> +	/* Register slot */
>> +	ret = pci_hp_register(slot->php_slot, slot->bus,
>> +			      slot->slot_no, slot->name);
>> +	if (ret) {
>> +		pr_warn("%s: Cannot register slot %s (%d)\n",
>> +			__func__, slot->name, ret);
>> +		return ret;
>> +	}
>> +
>> +	/* Put into global or parent list */
>> +	while ((dn = of_get_parent(dn))) {
>> +		if (!PCI_DN(dn)) {
>> +			of_node_put(dn);
>> +			break;
>> +		}
>> +
>> +		parent = powernv_php_slot_find(dn);
>> +		if (parent) {
>> +			of_node_put(dn);
>> +			break;
>> +		}
>> +	}
>> +
>> +	spin_lock_irqsave(&php_slot_lock, flags);
>> +	if (parent) {
>> +		powernv_php_slot_get(parent);
>> +		slot->parent = parent;
>> +		list_add_tail(&slot->link, &parent->children);
>> +	} else {
>> +		list_add_tail(&slot->link, &php_slot_list);
>> +	}
>> +	spin_unlock_irqrestore(&php_slot_lock, flags);
>> +
>> +	/* Update slot state */
>> +	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
>> +	return 0;
>> +}
>> -- 
>> 2.1.0
>> 
>

end of thread, other threads:[~2015-07-01  0:51 UTC | newest]

Thread overview: 83+ messages
2015-06-04  6:41 [PATCH v5 00/42] PowerPC/PowerNV: PCI Slot Management Gavin Shan
2015-06-04  6:41 ` [PATCH v5 01/42] PCI: Add pcibios_setup_bridge() Gavin Shan
2015-06-05 19:44   ` Bjorn Helgaas
2015-06-09  5:49     ` Gavin Shan
2015-06-04  6:41 ` [PATCH v5 02/42] powerpc/powernv: Enable M64 on P7IOC Gavin Shan
2015-06-04  6:41 ` [PATCH v5 03/42] powerpc/powernv: M64 support improvement Gavin Shan
2015-06-04  6:41 ` [PATCH v5 07/42] powerpc/powernv: Calculate PHB's DMA weight dynamically Gavin Shan
2015-06-04  6:41 ` [PATCH v5 08/42] powerpc/powernv: DMA32 cleanup Gavin Shan
2015-06-10  4:17   ` Alexey Kardashevskiy
2015-06-10  6:12     ` Gavin Shan
2015-06-04  6:41 ` [PATCH v5 09/42] powerpc/powernv: pnv_ioda_setup_dma() configure one PE only Gavin Shan
2015-06-04  6:41 ` [PATCH v5 11/42] powerpc/powernv: Increase PE# capacity Gavin Shan
2015-06-10  4:41   ` Alexey Kardashevskiy
2015-06-10  6:18     ` Gavin Shan
2015-06-04  6:41 ` [PATCH v5 12/42] powerpc/pci: Cleanup on pci_controller_ops Gavin Shan
2015-06-10  4:43   ` Alexey Kardashevskiy
2015-06-10  6:20     ` Gavin Shan
2015-06-10  6:20       ` Gavin Shan
2015-06-04  6:41 ` [PATCH v5 14/42] powerpc/powernv: Allocate PE# in deasending order Gavin Shan
2015-06-04  6:41 ` [PATCH v5 17/42] powerpc/powernv: PE oriented during configuration Gavin Shan
2015-06-04  6:41 ` [PATCH v5 19/42] powerpc/powernv: Remove DMA32 list of PEs Gavin Shan
2015-06-04  6:41 ` [PATCH v5 20/42] powerpc/powernv: Rename pnv_ioda_get_pe() to pnv_ioda_dev_to_pe() Gavin Shan
2015-06-04  6:41 ` [PATCH v5 21/42] powerpc/powernv: Drop pnv_ioda_setup_dev_PE() Gavin Shan
     [not found] ` <1433400131-18429-1-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2015-06-04  6:41   ` [PATCH v5 04/42] powerpc/powernv: Trace consumed IO and M32 segments by PE Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 05/42] powerpc/powernv: Simplify pnv_ioda_setup_pe_seg() Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 06/42] powerpc/powernv: Improve IO and M32 mapping Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 10/42] powerpc/powernv: Trace DMA32 segments consumed by PE Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 13/42] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 15/42] powerpc/powernv: Reserve PE# for root bus Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 16/42] powerpc/powernv: Create PEs dynamically Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 18/42] powerpc/powernv: Helper function pnv_ioda_init_pe() Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 22/42] powerpc/powernv: Move functions around Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:41   ` [PATCH v5 24/42] powerpc/powernv: Release PEs dynamically Gavin Shan
2015-06-04  6:41     ` Gavin Shan
2015-06-04  6:42   ` [PATCH v5 34/42] powerpc/pci: Delay creating pci_dn Gavin Shan
2015-06-04  6:42     ` Gavin Shan
2015-06-04  6:42   ` [PATCH v5 36/42] powerpc/pci: Export traverse_pci_device_nodes() Gavin Shan
2015-06-04  6:42     ` Gavin Shan
2015-06-04  6:42   ` [PATCH v5 39/42] drivers/of: Unflatten nodes equal or deeper than specified level Gavin Shan
2015-06-04  6:42     ` Gavin Shan
     [not found]     ` <1433400131-18429-40-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2015-06-30 17:47       ` Grant Likely
2015-06-30 17:47         ` Grant Likely
2015-06-04  6:41 ` [PATCH v5 23/42] powerpc/powernv: Cleanup on pnv_pci_ioda2_release_dma_pe() Gavin Shan
2015-06-04  6:41 ` [PATCH v5 25/42] powerpc/powernv: Supports slot ID Gavin Shan
2015-06-04  6:41 ` [PATCH v5 26/42] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
2015-06-04  6:41 ` [PATCH v5 27/42] powerpc/powernv: Simplify pnv_eeh_reset() Gavin Shan
2015-06-04  6:41 ` [PATCH v5 28/42] powerpc/powernv: Don't cover root bus in pnv_pci_reset_secondary_bus() Gavin Shan
2015-06-04  6:41 ` [PATCH v5 29/42] powerpc/powernv: Issue fundamental reset " Gavin Shan
2015-06-04  6:41 ` [PATCH v5 30/42] powerpc/pci: Don't scan empty slot Gavin Shan
2015-06-04  6:42 ` [PATCH v5 31/42] powerpc/pci: Move pcibios_find_pci_bus() around Gavin Shan
2015-06-05 19:47   ` Bjorn Helgaas
2015-06-09  6:10     ` Gavin Shan
2015-06-04  6:42 ` [PATCH v5 32/42] powerpc/powernv: Introduce pnv_pci_poll() Gavin Shan
2015-06-04  6:42 ` [PATCH v5 33/42] powerpc/powernv: Functions to get/reset PCI slot status Gavin Shan
2015-06-04  6:42 ` [PATCH v5 35/42] powerpc/pci: Create eeh_dev while creating pci_dn Gavin Shan
2015-06-04  6:42 ` [PATCH v5 37/42] powerpc/pci: Update bridge windows on PCI plugging Gavin Shan
2015-06-04  6:42 ` [PATCH v5 38/42] powerpc/powernv: Select OF_OVERLAY Gavin Shan
2015-06-04  6:42 ` [PATCH v5 40/42] drivers/of: Allow to specify root node in of_fdt_unflatten_tree() Gavin Shan
2015-06-04 22:10   ` Rob Herring
     [not found]   ` <1433400131-18429-41-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2015-06-30 18:06     ` Grant Likely
2015-06-30 18:06       ` Grant Likely
     [not found]       ` <20150630180652.198E2C4063C-WNowdnHR2B42iJbIjFUEsiwD8/FfD2ys@public.gmane.org>
2015-06-30 21:46         ` Benjamin Herrenschmidt
2015-06-30 21:46           ` Benjamin Herrenschmidt
2015-06-04  6:42 ` [PATCH v5 41/42] drivers/of: Return allocated memory chunk from of_fdt_unflatten_tree() Gavin Shan
2015-06-04  6:42 ` [PATCH v5 42/42] pci/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
     [not found]   ` <1433400131-18429-43-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2015-06-05 20:11     ` Bjorn Helgaas
2015-06-05 20:11       ` Bjorn Helgaas
     [not found]       ` <20150605201110.GP3631-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2015-06-05 20:18         ` Benjamin Herrenschmidt
2015-06-05 20:18           ` Benjamin Herrenschmidt
2015-06-09  6:10           ` Gavin Shan
2015-06-09  6:08       ` Gavin Shan
2015-06-30 18:18   ` Grant Likely
2015-06-30 18:18     ` Grant Likely
2015-07-01  0:51     ` Gavin Shan
