* [PATCH v9 00/22] powerpc/powernv: PCI hotplug support
@ 2016-05-03 13:22 Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 01/22] PCI: Add pcibios_setup_bridge() Gavin Shan
                   ` (21 more replies)
  0 siblings, 22 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

The series is split from "[PATCH v8 00/45] powerpc/powernv: PCI hotplug
support". Another series (A) was sent to the linuxppc-dev mailing list as
it's only related to PowerPC. Besides, this series needs the firmware
patches (B) to work. Without the firmware patches, the PCI hotplug driver
won't detect or populate any PCI slots. So this series still works on old
firmware.

 (A): https://patchwork.ozlabs.org/patch/617768/
 (B): https://patchwork.ozlabs.org/patch/617749/

The highlights of the series are as below:

   * In order to create PEs during PCI hotplug, pcibios_setup_bridge(),
     which is called to update a bridge's windows, populates the PE together
     with the associated resources like IO/M32/M64 segments, DMA windows etc.
   * One refcount is maintained by each PE to track the number of PCI devices
     that are associated with the PE. The refcount is increased by one when
     a new PCI device joins the PE. It is decreased by one when a PCI device
     is released (pcibios_release_device()). The PE together with the used
     resources will be destroyed when the refcount reaches 0, meaning no PCI
     device needs the PE any more.
   * If the firmware has the capability to support PCI slot and reset
     functionality, the reset required by EEH recovery is routed to firmware.
     Otherwise, it is done in kernel as before.
   * Changes to drivers/of/fdt.c in order to use OF changeset in hotplug driver.
   * PCI hotplug driver for PowerNV platform. The PCI slots are identified by
     firmware and exposed to kernel through device tree. Firmware provides APIs
     to get the presence/power state of a PCI slot or to set its power state.
     The PCI slot hotplug state is synchronized with its power state. When user
     changes PCI slot power state from off to on through sysfs file, the PCI
     devices behind the PCI slot will be brought online. When the state changes
     from on to off, the PCI slot's subordinate devices will be removed from
     the system.
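
The refcount rule in the second item can be pictured with a small user-space
sketch. This is only a model under stated assumptions: struct demo_pe,
demo_pe_get() and demo_pe_put() are hypothetical names, and the sketch sets up
resources on the first attach purely for brevity, whereas the series does that
from pcibios_setup_bridge().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a PE; not struct pnv_ioda_pe. */
struct demo_pe {
	unsigned int device_count;	/* PCI devices currently attached */
	bool resources_live;		/* models IO/M32/M64 segments, DMA windows */
};

/* A new PCI device joins the PE: bump the count. */
static void demo_pe_get(struct demo_pe *pe)
{
	if (pe->device_count == 0)
		pe->resources_live = true;	/* simplification, see lead-in */
	pe->device_count++;
}

/* A PCI device is released: drop the count, destroy the PE at zero. */
static void demo_pe_put(struct demo_pe *pe)
{
	assert(pe->device_count > 0);
	if (--pe->device_count == 0)
		pe->resources_live = false;
}
```

With two devices attached, the first demo_pe_put() leaves the resources in
place and only the second one tears them down.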

Changelog
=========
v9:
   * Rebased to linux-powerpc next branch + (A).
   * Patch order, split and merge (Alexey / Alistair)
   * Lots of misc comments addressed; I won't list them one by one (Alexey)
   * One more patch to export of_detach_node()
   * Fixed uninitialized variables, memory leak on @fdt1. Added flush_work()
     and other misc comments (Alexey / Alistair)
   * Same testing scenarios carried out as for v8. More will be carried out
     later.
   * The confusing function names aren't changed. Will check with Alexey and
     Alistair, or have a separate patch to address it later.
v8:
   * Rebased to linux-powerpc next branch.
   * Resolve comments from Alexey and Daniel on PCI part
   * Resolve comments from Rob on fdt.c
   * Retested (refer to the "Testing section")
v7:
   * Reworked revision to some extent.
   * Rebased to powerpc/next repository.
   * Reorder/split/merge/drop patches accordingly - Alexey.
   * Defined macros and use array to track IO/M32/M64/DMA32 segments - Alexey.
   * Merged 3 files to one for the hotplug driver - Alexey.
   * As part of OPAL API, defined macros for PCI slot power state, hotplug
     message type. Defined macros for PCI slot power confirmed state in
     hotplug driver.
   * Misc comments from Alexey.
   * Reworked unflatten_dt_node() to avoid recursive function calls.
   * Use EXPORT_SYMBOL_GPL() and document function's input/output - Rob/Frank.
v6:
   * Patch reorder, split, squash - Alexey.
   * Minor coding style - Alexey.
   * Better function names for pcibios_{add,remove}_pci_devices - Bjorn
   * Replace pr_warn() with dev_warn() in PowerNV hotplug driver - Bjorn
   * Concurrent depth as parameter passed to __unflatten_dt_node() - Grant / Alexey
   * Replace overlay with of_changeset - Grant
v5:
   * Rebased to 4.1.rc6 and some unmerged patches as below:
     Alexey's DDW patchset (v11);
     Gavin's EEH error injection support (in mpe's next branch);
     Richard's EEH cleanup patches (in mpe's next branch);
     Richard's EEH support for VF (v7);
     Gavin's misc EEH fixes for 4.2;
   * The revision bases on skiboot corresponding patches (v7):
     https://patchwork.ozlabs.org/patch/480437/
   * Utilize OF overlay to update device-tree with help of newly introduced
     OPAL API opal_get_overlay_dt().
   * Split patches for easy review according to aik's comments.
   * Fix coding style issues from checkpatch.pl as pointed out by aik.
   * Code cleanup and misc fixup according to aik's input.
v4:
   * Rebased to 4.1.RC1
   * Added API to unflatten FDT blob to device node sub-tree, which is attached
     to the indicated parent device node. The original mechanism based on
     formatted string stream has been dropped.
   * The PATCH[v3 09/21] ("powerpc/eeh: Delay probing EEH device during hotplug")
     was picked up and sent to linux-ppc@ separately for review as Richard's
     "VF EEH Support" depends on that.
v3:
   * Rebased to 4.1.RC0
   * PowerNV PCI infrastructure is totally refactored in order to support PCI
     hotplug. The PowerNV hotplug driver is also reworked a lot because of
     the changes in skiboot in order to support PCI hotplug.

Gavin Shan (22):
  PCI: Add pcibios_setup_bridge()
  powerpc/pci: Override pcibios_setup_bridge()
  powerpc/powernv: Move pnv_pci_ioda_setup_opal_tce_kill() around
  powerpc/powernv: Increase PE# capacity
  powerpc/powernv: Allocate PE# in reverse order
  powerpc/powernv: Create PEs in pcibios_setup_bridge()
  powerpc/powernv: Setup PE for root bus
  powerpc/powernv: Extend PCI bridge resources
  powerpc/powernv: Make pnv_ioda_deconfigure_pe() visible
  powerpc/powernv: Dynamically release PE
  powerpc/pci: Update bridge windows on PCI plug
  powerpc/pci: Delay populating pdn
  powerpc/powernv: Support PCI slot ID
  powerpc/powernv: Use PCI slot reset infrastructure
  powerpc/powernv: Functions to get/set PCI slot state
  drivers/of: Split unflatten_dt_node()
  drivers/of: Avoid recursively calling unflatten_dt_node()
  drivers/of: Rename unflatten_dt_node()
  drivers/of: Specify parent node in of_fdt_unflatten_tree()
  drivers/of: Return allocated memory from of_fdt_unflatten_tree()
  drivers/of: Export of_detach_node()
  PCI/hotplug: PowerPC PowerNV PCI hotplug driver

 arch/powerpc/include/asm/eeh.h                 |   2 +-
 arch/powerpc/include/asm/opal-api.h            |  18 +-
 arch/powerpc/include/asm/opal.h                |   9 +-
 arch/powerpc/include/asm/pci-bridge.h          |   2 +
 arch/powerpc/include/asm/pnv-pci.h             |  11 +
 arch/powerpc/include/asm/ppc-pci.h             |   2 -
 arch/powerpc/kernel/eeh_dev.c                  |  17 +-
 arch/powerpc/kernel/pci-common.c               |  16 +-
 arch/powerpc/kernel/pci_dn.c                   |  23 +-
 arch/powerpc/platforms/maple/pci.c             |  34 +-
 arch/powerpc/platforms/pasemi/pci.c            |   3 -
 arch/powerpc/platforms/powermac/pci.c          |  38 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c   |  49 +-
 arch/powerpc/platforms/powernv/opal-wrappers.S |   5 +
 arch/powerpc/platforms/powernv/pci-ioda.c      | 481 ++++++++++----
 arch/powerpc/platforms/powernv/pci.c           | 105 ++-
 arch/powerpc/platforms/powernv/pci.h           |  10 +-
 arch/powerpc/platforms/pseries/setup.c         |   6 +-
 drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c   |   2 +-
 drivers/of/dynamic.c                           |   1 +
 drivers/of/fdt.c                               | 372 +++++++----
 drivers/of/unittest.c                          |   2 +-
 drivers/pci/hotplug/Kconfig                    |  13 +
 drivers/pci/hotplug/Makefile                   |   3 +
 drivers/pci/hotplug/pnv_php.c                  | 869 +++++++++++++++++++++++++
 drivers/pci/setup-bus.c                        |   5 +
 include/linux/of_fdt.h                         |   5 +-
 include/linux/pci.h                            |   1 +
 28 files changed, 1748 insertions(+), 356 deletions(-)
 create mode 100644 drivers/pci/hotplug/pnv_php.c

-- 
2.1.0

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v9 01/22] PCI: Add pcibios_setup_bridge()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

Currently, the PowerPC PowerNV platform utilizes ppc_md.pcibios_fixup(),
which is called once after PCI probing and resource assignment
are completed, to allocate platform required resources for PCI devices:
PE#, IO and MMIO mapping, DMA address translation (TCE) table etc.
Obviously, it's not hotplug friendly.

This adds weak function pcibios_setup_bridge(), which is called by
pci_setup_bridge(). PowerPC PowerNV platform will reuse the function
to assign above platform required resources to newly plugged PCI devices
during PCI hotplug in subsequent patches.
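
As a rough illustration of the weak-hook pattern, here is a user-space sketch.
It assumes a GCC/Clang toolchain for __attribute__((weak)) (the kernel spells
it __weak), and every demo_* name is hypothetical. A strong definition of the
hook in another object file would replace the empty default at link time.

```c
#include <assert.h>

static int demo_window_writes;	/* counts generic bridge-window updates */

/* Weak default: a no-op unless another object file overrides it. */
__attribute__((weak)) void demo_pcibios_setup_bridge(int bus_nr,
						     unsigned long type)
{
	(void)bus_nr;
	(void)type;
}

/* Stand-in for the generic code that programs the bridge windows. */
static void demo_write_bridge_windows(int bus_nr, unsigned long type)
{
	(void)bus_nr;
	(void)type;
	demo_window_writes++;
}

void demo_pci_setup_bridge(int bus_nr)
{
	unsigned long type = 0x7;	/* placeholder resource-type mask */

	assert(bus_nr >= 0);
	demo_pcibios_setup_bridge(bus_nr, type);	/* platform hook first */
	demo_write_bridge_windows(bus_nr, type);
}
```

With only the weak default linked in, demo_pci_setup_bridge() behaves exactly
like the generic path alone.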

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/setup-bus.c | 5 +++++
 include/linux/pci.h     | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 55641a3..d678c46 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -695,11 +695,16 @@ static void __pci_setup_bridge(struct pci_bus *bus, unsigned long type)
 	pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bus->bridge_ctl);
 }
 
+void __weak pcibios_setup_bridge(struct pci_bus *bus, unsigned long type)
+{
+}
+
 void pci_setup_bridge(struct pci_bus *bus)
 {
 	unsigned long type = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 
+	pcibios_setup_bridge(bus, type);
 	__pci_setup_bridge(bus, type);
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 004b813..a17f0e8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -856,6 +856,7 @@ void pci_stop_and_remove_bus_device_locked(struct pci_dev *dev);
 void pci_stop_root_bus(struct pci_bus *bus);
 void pci_remove_root_bus(struct pci_bus *bus);
 void pci_setup_cardbus(struct pci_bus *bus);
+void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
 void pci_sort_breadthfirst(void);
 #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
 #define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 01/22] PCI: Add pcibios_setup_bridge() Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 03/22] powerpc/powernv: Move pnv_pci_ioda_setup_opal_tce_kill() around Gavin Shan
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

This overrides pcibios_setup_bridge(), which is called to update PCI
bridge windows when PCI resource assignment is completed. Subsequent
patches use it to assign the PE and set up various (resource) mappings
for the PE.
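
The dispatch through an optional per-controller callback can be sketched in
user space as below. The demo_* types and functions are hypothetical stand-ins
for struct pci_controller, its controller_ops and struct pci_bus; only the
NULL check before the indirect call mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

struct demo_bus;

/* Hypothetical subset of the per-controller operations. */
struct demo_controller_ops {
	void (*setup_bridge)(struct demo_bus *bus, unsigned long type);
};

struct demo_controller {
	struct demo_controller_ops ops;
};

struct demo_bus {
	struct demo_controller *host;
	int setup_calls;	/* how often the platform hook ran */
};

/* Example platform implementation of the hook. */
static void demo_platform_setup_bridge(struct demo_bus *bus,
				       unsigned long type)
{
	(void)type;
	bus->setup_calls++;
}

/* Arch-level override: forward to the platform hook only if one is set. */
static void demo_pcibios_setup_bridge(struct demo_bus *bus,
				      unsigned long type)
{
	struct demo_controller *hose = bus->host;

	assert(hose != NULL);
	if (hose->ops.setup_bridge)
		hose->ops.setup_bridge(bus, type);
}
```

Controllers that leave the callback unset are unaffected, which is why other
PowerPC platforms need no change here.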

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 arch/powerpc/include/asm/pci-bridge.h | 2 ++
 arch/powerpc/kernel/pci-common.c      | 8 ++++++++
 2 files changed, 10 insertions(+)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 467c0b0..b5e88e4 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -33,6 +33,8 @@ struct pci_controller_ops {
 	/* Called during PCI resource reassignment */
 	resource_size_t (*window_alignment)(struct pci_bus *bus,
 					    unsigned long type);
+	void		(*setup_bridge)(struct pci_bus *bus,
+					unsigned long type);
 	void		(*reset_secondary_bus)(struct pci_dev *pdev);
 
 #ifdef CONFIG_PCI_MSI
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..40df3a5 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -124,6 +124,14 @@ resource_size_t pcibios_window_alignment(struct pci_bus *bus,
 	return 1;
 }
 
+void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+
+	if (hose->controller_ops.setup_bridge)
+		hose->controller_ops.setup_bridge(bus, type);
+}
+
 void pcibios_reset_secondary_bus(struct pci_dev *dev)
 {
 	struct pci_controller *phb = pci_bus_to_host(dev->bus);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 03/22] powerpc/powernv: Move pnv_pci_ioda_setup_opal_tce_kill() around
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 01/22] PCI: Add pcibios_setup_bridge() Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-06  6:36   ` Alexey Kardashevskiy
  2016-05-03 13:22 ` [PATCH v9 04/22] powerpc/powernv: Increase PE# capacity Gavin Shan
                   ` (18 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

pnv_pci_ioda_setup_opal_tce_kill() is called by pnv_ioda_setup_dma()
to remap the TCE kill register. What's done in pnv_ioda_setup_dma()
will be covered in pcibios_setup_bridge(), which is invoked on each
PCI bridge. It means we would possibly remap the TCE kill register
multiple times, which is unnecessary.

This moves pnv_pci_ioda_setup_opal_tce_kill() to where the PHB is
initialized (pnv_pci_init_ioda_phb()) to avoid the above issue.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5ee8a57..cbd4c0b 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2599,8 +2599,6 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 	pr_info("PCI: Domain %04x has %d available 32-bit DMA segments\n",
 		hose->global_number, phb->ioda.dma32_count);
 
-	pnv_pci_ioda_setup_opal_tce_kill(phb);
-
 	/* Walk our PE list and configure their DMA segments */
 	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
 		weight = pnv_pci_ioda_pe_dma_weight(pe);
@@ -3396,6 +3394,9 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	if (phb->regs == NULL)
 		pr_err("  Failed to map registers !\n");
 
+	/* Initialize TCE kill register */
+	pnv_pci_ioda_setup_opal_tce_kill(phb);
+
 	/* Initialize more IODA stuff */
 	phb->ioda.total_pe_num = 1;
 	prop32 = of_get_property(np, "ibm,opal-num-pes", NULL);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 04/22] powerpc/powernv: Increase PE# capacity
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (2 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 03/22] powerpc/powernv: Move pnv_pci_ioda_setup_opal_tce_kill() around Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-06  7:17   ` Alexey Kardashevskiy
  2016-05-03 13:22   ` Gavin Shan
                   ` (17 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

Each PHB maintains an array that helps translate the 2-byte Request
ID (RID) to PE#, with the assumption that PE# takes one byte, meaning
that we can't have more than 256 PEs. However, pci_dn->pe_number
already has 4 bytes for the PE#.

This extends the PE# capacity for every PHB. After that, the PE number
is represented by a 4-byte value. Then we can reuse IODA_INVALID_PE to
check whether the PE# in phb->ioda.pe_rmap[] is valid or not.
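
A user-space sketch of the widened reverse map follows. The sentinel value
DEMO_INVALID_PE is an assumption made for this sketch (the real value of
IODA_INVALID_PE is not shown in this message), and all demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_INVALID_PE	0xFFFFFFFFu	/* assumed sentinel, not a real PE# */
#define DEMO_RID_COUNT	0x10000		/* one slot per 16-bit {bus, devfn} */

static uint32_t demo_pe_rmap[DEMO_RID_COUNT];

/* Mark every RID as unmapped, as done once per PHB at init time. */
static void demo_rmap_init(void)
{
	uint32_t rid;

	for (rid = 0; rid < DEMO_RID_COUNT; rid++)
		demo_pe_rmap[rid] = DEMO_INVALID_PE;
}

/* Build the 16-bit RID from bus number and devfn. */
static uint16_t demo_rid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((unsigned int)bus << 8) | devfn);
}

static void demo_rmap_set(uint16_t rid, uint32_t pe)
{
	assert(pe != DEMO_INVALID_PE);
	demo_pe_rmap[rid] = pe;
}

static int demo_rmap_valid(uint16_t rid)
{
	return demo_pe_rmap[rid] != DEMO_INVALID_PE;
}
```

With a one-byte entry, PE#0 and "no PE" would both read as 0; the wider entry
leaves room for a sentinel and for PE numbers above 255.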

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 6 +++++-
 arch/powerpc/platforms/powernv/pci.h      | 7 ++-----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index cbd4c0b..cf96cb5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -768,7 +768,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 
 	/* Clear the reverse map */
 	for (rid = pe->rid; rid < rid_end; rid++)
-		phb->ioda.pe_rmap[rid] = 0;
+		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
 
 	/* Release from all parents PELT-V */
 	while (parent) {
@@ -3406,6 +3406,10 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	if (prop32)
 		phb->ioda.reserved_pe_idx = be32_to_cpup(prop32);
 
+	/* Invalidate RID to PE# mapping */
+	for (segno = 0; segno < ARRAY_SIZE(phb->ioda.pe_rmap); segno++)
+		phb->ioda.pe_rmap[segno] = IODA_INVALID_PE;
+
 	/* Parse 64-bit MMIO range */
 	pnv_ioda_parse_m64_window(phb);
 
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 904f60b..80f5326 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -156,11 +156,8 @@ struct pnv_phb {
 		struct list_head	pe_list;
 		struct mutex            pe_list_mutex;
 
-		/* Reverse map of PEs, will have to extend if
-		 * we are to support more than 256 PEs, indexed
-		 * bus { bus, devfn }
-		 */
-		unsigned char		pe_rmap[0x10000];
+		/* Reverse map of PEs, indexed by {bus, devfn} */
+		unsigned int		pe_rmap[0x10000];
 
 		/* TCE cache invalidate registers (physical and
 		 * remapped)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 05/22] powerpc/powernv: Allocate PE# in reverse order
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
@ 2016-05-03 13:22   ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: devicetree, alistair, aik, linux-pci, Gavin Shan, robherring2,
	bhelgaas, dja

The PE number for one particular PE can be allocated dynamically or
reserved according to the consumed M64 (64-bit prefetchable)
segments of the PE. The M64 segment can't be remapped to an arbitrary
PE, meaning the PE number is determined by the index of the consumed
M64 segment. As the figure below shows, the M64 resource grows from
the low to the high end, meaning the PE numbers reserved according to
M64 segments grow from low to high as well, and so do the dynamically
allocated PE numbers. This leads to a conflict: a PE number (M64
segment) taken by dynamic allocation is required by a hot-added PCI
adapter at a later point. The PCI hotplug then fails because the PE
number can't be reserved based on the index of the consumed M64
segment.

  +---+---+---+---+---+--------------------------------+-----+
  | 0 | 1 | 2 | 3 | 4 |      .......                   | 255 |
  +---+---+---+---+---+--------------------------------+-----+

  PE number for dynamic allocation          ----------------->
  PE number reserved for M64 segment        ----------------->

To resolve the above conflict, this forces PE numbers to be
allocated dynamically in reverse order. With this patch applied,
PE numbers are reserved in ascending order, but allocated
dynamically in descending order.
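
The two allocation directions can be modelled with a small user-space sketch.
It uses a plain bool array instead of the kernel bitmap helpers, an 8-entry PE
space, and hypothetical demo_* names. Note the signed loop index: with an
unsigned one, the `pe >= 0` condition could never become false.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_TOTAL_PE	8

struct demo_phb {
	bool pe_alloc[DEMO_TOTAL_PE];	/* true when the PE# is in use */
};

/* Reserve the PE# dictated by an M64 segment index; false if taken. */
static bool demo_reserve_pe(struct demo_phb *phb, int pe_no)
{
	assert(pe_no >= 0 && pe_no < DEMO_TOTAL_PE);
	if (phb->pe_alloc[pe_no])
		return false;
	phb->pe_alloc[pe_no] = true;
	return true;
}

/* Dynamic allocation walks down from the highest PE#; -1 if none free. */
static int demo_alloc_pe(struct demo_phb *phb)
{
	int pe;

	for (pe = DEMO_TOTAL_PE - 1; pe >= 0; pe--) {
		if (!phb->pe_alloc[pe]) {
			phb->pe_alloc[pe] = true;
			return pe;
		}
	}

	return -1;
}
```

Dynamic allocations consume 7, 6, ... while the low numbers stay free for
later M64-based reservations, so the two only collide once the space is nearly
full.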

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index cf96cb5..679e279 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -148,16 +148,14 @@ static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
 
 static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
 {
-	unsigned long pe;
+	long pe;
 
-	do {
-		pe = find_next_zero_bit(phb->ioda.pe_alloc,
-					phb->ioda.total_pe_num, 0);
-		if (pe >= phb->ioda.total_pe_num)
-			return NULL;
-	} while(test_and_set_bit(pe, phb->ioda.pe_alloc));
+	for (pe = phb->ioda.total_pe_num - 1; pe >= 0; pe--) {
+		if (!test_and_set_bit(pe, phb->ioda.pe_alloc))
+			return pnv_ioda_init_pe(phb, pe);
+	}
 
-	return pnv_ioda_init_pe(phb, pe);
+	return NULL;
 }
 
 static void pnv_ioda_free_pe(struct pnv_ioda_pe *pe)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 06/22] powerpc/powernv: Create PEs in pcibios_setup_bridge()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (4 preceding siblings ...)
  2016-05-03 13:22   ` Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 07/22] powerpc/powernv: Setup PE for root bus Gavin Shan
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

Currently, the PEs and their associated resources are assigned in
ppc_md.pcibios_fixup(), except those used by SRIOV VFs. The function
is called once after PCI probing and resource assignment are
completed. So it's obviously not hotplug friendly.

This creates PEs dynamically in pcibios_setup_bridge(), which is
called both during system bootup and on PCI hotplug to update the
PCI bridge's windows after resource assignment/reassignment is done.
In the partial hotplug case, where not all PCI devices included in one
particular PE are unplugged and plugged again, we just need to
unbind/bind the hot-added PCI devices with the corresponding PE without
creating a new one. The change is applied to IODA1 and IODA2 PHBs only.
The behaviour on NPU PHBs isn't changed. There are no PCI bridges on NPU
PHBs, meaning pcibios_setup_bridge() won't be invoked there. We have to
use the old path (pnv_pci_ioda_fixup()) to set up PEs on NPU PHBs.
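
The reuse-or-create decision for the partial hotplug case can be sketched in
user space as follows. For brevity the sketch keys the lookup by bus number
only, whereas the patch indexes pe_rmap[] with bus->number << 8. The sentinel
value, the 4-entry PE array and all demo_* names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_INVALID_PE	0xFFFFFFFFu	/* assumed sentinel value */
#define DEMO_MAX_PE	4
#define DEMO_MAX_BUS	256

struct demo_pe {
	uint32_t pe_number;
};

struct demo_phb {
	uint32_t bus_to_pe[DEMO_MAX_BUS];	/* simplified reverse map */
	struct demo_pe pe_array[DEMO_MAX_PE];
	uint32_t next_free;			/* trivial allocator */
};

static void demo_phb_init(struct demo_phb *phb)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_BUS; i++)
		phb->bus_to_pe[i] = DEMO_INVALID_PE;
	phb->next_free = 0;
}

/*
 * Returns a newly created PE that still needs its resources set up,
 * or NULL when an existing PE was reused (or no PE is left).
 */
static struct demo_pe *demo_setup_bus_pe(struct demo_phb *phb, uint8_t bus)
{
	uint32_t pe_num = phb->bus_to_pe[bus];
	struct demo_pe *pe;

	if (pe_num != DEMO_INVALID_PE) {
		assert(pe_num < DEMO_MAX_PE);
		return NULL;	/* partial hotplug: the PE is still alive */
	}

	if (phb->next_free >= DEMO_MAX_PE)
		return NULL;

	pe_num = phb->next_free++;
	pe = &phb->pe_array[pe_num];
	pe->pe_number = pe_num;
	phb->bus_to_pe[bus] = pe_num;
	return pe;
}
```

A caller that gets NULL back skips segment and DMA setup, so resources are
not allocated a second time for a PE that survived the unplug.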

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 181 +++++++++++-------------------
 1 file changed, 68 insertions(+), 113 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 679e279..f63902a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1026,6 +1026,15 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 				pci_name(dev));
 			continue;
 		}
+
+		/*
+		 * In partial hotplug case, the PCI device might be still
+		 * associated with the PE and needn't attach it to the PE
+		 * again.
+		 */
+		if (pdn->pe_number != IODA_INVALID_PE)
+			continue;
+
 		pdn->pcidev = dev;
 		pdn->pe_number = pe->pe_number;
 		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
@@ -1044,6 +1053,18 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
 	struct pci_controller *hose = pci_bus_to_host(bus);
 	struct pnv_phb *phb = hose->private_data;
 	struct pnv_ioda_pe *pe = NULL;
+	unsigned int pe_num;
+
+	/*
+	 * In partial hotplug case, the PE instance might be still alive.
+	 * We should reuse it instead of allocating a new one.
+	 */
+	pe_num = phb->ioda.pe_rmap[bus->number << 8];
+	if (pe_num != IODA_INVALID_PE) {
+		pe = &phb->ioda.pe_array[pe_num];
+		pnv_ioda_setup_same_PE(bus, pe);
+		return NULL;
+	}
 
 	/* Check if PE is determined by M64 */
 	if (phb->pick_m64_pe)
@@ -1158,30 +1179,6 @@ static void pnv_ioda_setup_npu_PEs(struct pci_bus *bus)
 		pnv_ioda_setup_npu_PE(pdev);
 }
 
-static void pnv_ioda_setup_PEs(struct pci_bus *bus)
-{
-	struct pci_dev *dev;
-
-	pnv_ioda_setup_bus_PE(bus, false);
-
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		if (dev->subordinate) {
-			if (pci_pcie_type(dev) == PCI_EXP_TYPE_PCI_BRIDGE)
-				pnv_ioda_setup_bus_PE(dev->subordinate, true);
-			else
-				pnv_ioda_setup_PEs(dev->subordinate);
-		}
-	}
-}
-
-/*
- * Configure PEs so that the downstream PCI buses and devices
- * could have their associated PE#. Unfortunately, we didn't
- * figure out the way to identify the PLX bridge yet. So we
- * simply put the PCI bus and the subordinate behind the root
- * port to PE# here. The game rule here is expected to be changed
- * as soon as we can detected PLX bridge correctly.
- */
 static void pnv_pci_ioda_setup_PEs(void)
 {
 	struct pci_controller *hose, *tmp;
@@ -1189,22 +1186,11 @@ static void pnv_pci_ioda_setup_PEs(void)
 
 	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
 		phb = hose->private_data;
-
-		/* M64 layout might affect PE allocation */
-		if (phb->reserve_m64_pe)
-			phb->reserve_m64_pe(hose->bus, NULL, true);
-
-		/*
-		 * On NPU PHB, we expect separate PEs for individual PCI
-		 * functions. PCI bus dependent PEs are required for the
-		 * remaining types of PHBs.
-		 */
 		if (phb->type == PNV_PHB_NPU) {
 			/* PE#0 is needed for error reporting */
 			pnv_ioda_reserve_pe(phb, 0);
 			pnv_ioda_setup_npu_PEs(hose->bus);
-		} else
-			pnv_ioda_setup_PEs(hose->bus);
+		}
 	}
 }
 
@@ -2552,6 +2538,9 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 {
 	int64_t rc;
 
+	if (!pnv_pci_ioda_pe_dma_weight(pe))
+		return;
+
 	/* TVE #1 is selected by PCI address bit 59 */
 	pe->tce_bypass_base = 1ull << 59;
 
@@ -2583,47 +2572,6 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 		pnv_ioda_setup_bus_dma(pe, pe->pbus);
 }
 
-static void pnv_ioda_setup_dma(struct pnv_phb *phb)
-{
-	struct pci_controller *hose = phb->hose;
-	struct pnv_ioda_pe *pe;
-	unsigned int weight;
-
-	/* If we have more PE# than segments available, hand out one
-	 * per PE until we run out and let the rest fail. If not,
-	 * then we assign at least one segment per PE, plus more based
-	 * on the amount of devices under that PE
-	 */
-	pr_info("PCI: Domain %04x has %d available 32-bit DMA segments\n",
-		hose->global_number, phb->ioda.dma32_count);
-
-	/* Walk our PE list and configure their DMA segments */
-	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
-		weight = pnv_pci_ioda_pe_dma_weight(pe);
-		if (!weight)
-			continue;
-
-		/*
-		 * For IODA2 compliant PHB3, we needn't care about the weight.
-		 * The all available 32-bits DMA space will be assigned to
-		 * the specific PE.
-		 */
-		if (phb->type == PNV_PHB_IODA1) {
-			pnv_pci_ioda1_setup_dma_pe(phb, pe);
-		} else if (phb->type == PNV_PHB_IODA2) {
-			pe_info(pe, "Assign DMA32 space\n");
-			pnv_pci_ioda2_setup_dma_pe(phb, pe);
-		} else if (phb->type == PNV_PHB_NPU) {
-			/*
-			 * We initialise the DMA space for an NPU PHB
-			 * after setup of the PHB is complete as we
-			 * point the NPU TVT to the the same location
-			 * as the PHB3 TVT.
-			 */
-		}
-	}
-}
-
 #ifdef CONFIG_PCI_MSI
 static void pnv_ioda2_msi_eoi(struct irq_data *d)
 {
@@ -3090,39 +3038,6 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe *pe)
 	}
 }
 
-static void pnv_pci_ioda_setup_seg(void)
-{
-	struct pci_controller *tmp, *hose;
-	struct pnv_phb *phb;
-	struct pnv_ioda_pe *pe;
-
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		phb = hose->private_data;
-
-		/* NPU PHB does not support IO or MMIO segmentation */
-		if (phb->type == PNV_PHB_NPU)
-			continue;
-
-		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
-			pnv_ioda_setup_pe_seg(pe);
-		}
-	}
-}
-
-static void pnv_pci_ioda_setup_DMA(void)
-{
-	struct pci_controller *hose, *tmp;
-	struct pnv_phb *phb;
-
-	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
-		pnv_ioda_setup_dma(hose->private_data);
-
-		/* Mark the PHB initialization done */
-		phb = hose->private_data;
-		phb->initialized = 1;
-	}
-}
-
 static void pnv_pci_ioda_create_dbgfs(void)
 {
 #ifdef CONFIG_DEBUG_FS
@@ -3133,6 +3048,9 @@ static void pnv_pci_ioda_create_dbgfs(void)
 	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
 		phb = hose->private_data;
 
+		/* Notify initialization of PHB done */
+		phb->initialized = 1;
+
 		sprintf(name, "PCI%04x", hose->global_number);
 		phb->dbgfs = debugfs_create_dir(name, powerpc_debugfs_root);
 		if (!phb->dbgfs)
@@ -3171,9 +3089,6 @@ static void pnv_npu_ioda_fixup(void)
 static void pnv_pci_ioda_fixup(void)
 {
 	pnv_pci_ioda_setup_PEs();
-	pnv_pci_ioda_setup_seg();
-	pnv_pci_ioda_setup_DMA();
-
 	pnv_pci_ioda_create_dbgfs();
 
 #ifdef CONFIG_EEH
@@ -3226,6 +3141,45 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
 	return phb->ioda.io_segsize;
 }
 
+static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dev *bridge = bus->self;
+	struct pnv_ioda_pe *pe;
+	bool all = (pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
+
+	/* Don't assign PE to PCI bus, which doesn't have subordinate devices */
+	if (list_empty(&bus->devices))
+		return;
+
+	/* Reserve PEs according to used M64 resources */
+	if (phb->reserve_m64_pe)
+		phb->reserve_m64_pe(bus, NULL, all);
+
+	/*
+	 * Assign PE. We might run here because of partial hotplug.
+	 * For the case, we just pick up the existing PE and should
+	 * not allocate resources again.
+	 */
+	pe = pnv_ioda_setup_bus_PE(bus, all);
+	if (!pe)
+		return;
+
+	pnv_ioda_setup_pe_seg(pe);
+	switch (phb->type) {
+	case PNV_PHB_IODA1:
+		pnv_pci_ioda1_setup_dma_pe(phb, pe);
+		break;
+	case PNV_PHB_IODA2:
+		pnv_pci_ioda2_setup_dma_pe(phb, pe);
+		break;
+	default:
+		pr_warn("%s: No DMA for PHB#%d (type %d)\n",
+			__func__, phb->hose->global_number, phb->type);
+	}
+}
+
 #ifdef CONFIG_PCI_IOV
 static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
@@ -3303,6 +3257,7 @@ static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
 #endif
 	.enable_device_hook	= pnv_pci_enable_device_hook,
 	.window_alignment	= pnv_pci_window_alignment,
+	.setup_bridge		= pnv_pci_setup_bridge,
 	.reset_secondary_bus	= pnv_pci_reset_secondary_bus,
 	.dma_set_mask		= pnv_pci_ioda_dma_set_mask,
 	.dma_get_required_mask	= pnv_pci_ioda_dma_get_required_mask,
-- 
2.1.0

* [PATCH v9 07/22] powerpc/powernv: Setup PE for root bus
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (5 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 06/22] powerpc/powernv: Create PEs in pcibios_setup_bridge() Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 08/22] powerpc/powernv: Extend PCI bridge resources Gavin Shan
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

The root bus has no parent bridge, so pcibios_setup_bridge() is never
invoked for it. The PE for the root bus is the ancestor of all other
PEs in PELTV, which means it must be populated before any of them.

This populates the PE for the root bus in the pcibios_setup_bridge()
path if it doesn't exist yet. The PE number adjacent to the reserved
one is used as the PE# to avoid holes in the contiguous M64 space.
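
The selection rule described above can be sketched in user space as
follows. This is only an illustration under stated assumptions, not the
kernel code: SKETCH_INVALID_PE, struct sketch_range and both function
names are hypothetical stand-ins for IODA_INVALID_PE, struct resource
and the logic in pnv_pci_init_ioda_phb() and pnv_ioda*_init_m64().

```c
#include <assert.h>

/* Hypothetical stand-in for IODA_INVALID_PE; the value is an assumption. */
#define SKETCH_INVALID_PE (~0u)

/* Hypothetical stand-in for the start/end pair of a struct resource. */
struct sketch_range {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Pick the PE number for the root bus: the one adjacent to the
 * reserved PE when the reserved PE sits at either end of the PE
 * range, otherwise no fixed number.
 */
unsigned int sketch_pick_root_pe(unsigned int reserved_pe_idx,
				 unsigned int total_pe_num)
{
	assert(total_pe_num >= 2 && reserved_pe_idx < total_pe_num);

	if (reserved_pe_idx == 0)
		return 1;
	if (reserved_pe_idx == total_pe_num - 1)
		return reserved_pe_idx - 1;
	return SKETCH_INVALID_PE;
}

/*
 * Exclude the two M64 segments that belong to the reserved PE and
 * the root bus PE from the usable M64 range.
 */
void sketch_trim_m64(struct sketch_range *r, unsigned int reserved_pe_idx,
		     unsigned int total_pe_num, unsigned long long segsize)
{
	if (reserved_pe_idx == 0)
		r->start += 2 * segsize;
	else if (reserved_pe_idx == total_pe_num - 1)
		r->end -= 2 * segsize;
	/* Otherwise the range is left alone, as in the pr_warn() branch. */
}
```

With 256 PEs and the reserved PE at index 0, the sketch picks PE#1 for
the root bus and moves the start of the M64 range up by two segments.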

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 49 ++++++++++++++++++++++++-------
 arch/powerpc/platforms/powernv/pci.h      |  2 ++
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index f63902a..de3f292 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -201,14 +201,14 @@ static int pnv_ioda2_init_m64(struct pnv_phb *phb)
 	set_bit(phb->ioda.m64_bar_idx, &phb->ioda.m64_bar_alloc);
 
 	/*
-	 * Strip off the segment used by the reserved PE, which is
-	 * expected to be 0 or last one of PE capabicity.
+	 * Exclude the segments for reserved and root bus PE, which
+	 * are first or last two PEs.
 	 */
 	r = &phb->hose->mem_resources[1];
 	if (phb->ioda.reserved_pe_idx == 0)
-		r->start += phb->ioda.m64_segsize;
+		r->start += (2 * phb->ioda.m64_segsize);
 	else if (phb->ioda.reserved_pe_idx == (phb->ioda.total_pe_num - 1))
-		r->end -= phb->ioda.m64_segsize;
+		r->end -= (2 * phb->ioda.m64_segsize);
 	else
 		pr_warn("  Cannot strip M64 segment for reserved PE#%d\n",
 			phb->ioda.reserved_pe_idx);
@@ -288,14 +288,14 @@ static int pnv_ioda1_init_m64(struct pnv_phb *phb)
 	}
 
 	/*
-	 * Exclude the segment used by the reserved PE, which
-	 * is expected to be 0 or last supported PE#.
+	 * Exclude the segments for reserved and root bus PE, which
+	 * are first or last two PEs.
 	 */
 	r = &phb->hose->mem_resources[1];
 	if (phb->ioda.reserved_pe_idx == 0)
-		r->start += phb->ioda.m64_segsize;
+		r->start += (2 * phb->ioda.m64_segsize);
 	else if (phb->ioda.reserved_pe_idx == (phb->ioda.total_pe_num - 1))
-		r->end -= phb->ioda.m64_segsize;
+		r->end -= (2 * phb->ioda.m64_segsize);
 	else
 		pr_warn("  Cannot cut M64 segment for reserved PE#%d\n",
 			phb->ioda.reserved_pe_idx);
@@ -1066,8 +1066,13 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
 		return NULL;
 	}
 
+	/* PE number for root bus should have been reserved */
+	if (pci_is_root_bus(bus) &&
+	    phb->ioda.root_pe_idx != IODA_INVALID_PE)
+		pe = &phb->ioda.pe_array[phb->ioda.root_pe_idx];
+
 	/* Check if PE is determined by M64 */
-	if (phb->pick_m64_pe)
+	if (!pe && phb->pick_m64_pe)
 		pe = phb->pick_m64_pe(bus, all);
 
 	/* The PE number isn't pinned by M64 */
@@ -3149,6 +3154,15 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
 	struct pnv_ioda_pe *pe;
 	bool all = (pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
 
+	/* The PE for root bus should be realized before any one else */
+	if (!phb->ioda.root_pe_populated) {
+		pe = pnv_ioda_setup_bus_PE(phb->hose->bus, false);
+		if (pe) {
+			phb->ioda.root_pe_idx = pe->pe_number;
+			phb->ioda.root_pe_populated = true;
+		}
+	}
+
 	/* Don't assign PE to PCI bus, which doesn't have subordinate devices */
 	if (list_empty(&bus->devices))
 		return;
@@ -3413,7 +3427,22 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 			phb->ioda.dma32_segmap[segno] = IODA_INVALID_PE;
 	}
 	phb->ioda.pe_array = aux + pemap_off;
-	set_bit(phb->ioda.reserved_pe_idx, phb->ioda.pe_alloc);
+
+	/*
+	 * Choose PE number for root bus, which shouldn't have
+	 * M64 resources consumed by its child devices. Pick the
+	 * PE number adjacent to the reserved one if possible.
+	 */
+	pnv_ioda_reserve_pe(phb, phb->ioda.reserved_pe_idx);
+	if (phb->ioda.reserved_pe_idx == 0) {
+		phb->ioda.root_pe_idx = 1;
+		pnv_ioda_reserve_pe(phb, phb->ioda.root_pe_idx);
+	} else if (phb->ioda.reserved_pe_idx == (phb->ioda.total_pe_num - 1)) {
+		phb->ioda.root_pe_idx = phb->ioda.reserved_pe_idx - 1;
+		pnv_ioda_reserve_pe(phb, phb->ioda.root_pe_idx);
+	} else {
+		phb->ioda.root_pe_idx = IODA_INVALID_PE;
+	}
 
 	INIT_LIST_HEAD(&phb->ioda.pe_list);
 	mutex_init(&phb->ioda.pe_list_mutex);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 80f5326..a81bf01 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -114,6 +114,8 @@ struct pnv_phb {
 		/* Global bridge info */
 		unsigned int		total_pe_num;
 		unsigned int		reserved_pe_idx;
+		unsigned int		root_pe_idx;
+		bool			root_pe_populated;
 
 		/* 32-bit MMIO window */
 		unsigned int		m32_size;
-- 
2.1.0

* [PATCH v9 08/22] powerpc/powernv: Extend PCI bridge resources
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (6 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 07/22] powerpc/powernv: Setup PE for root bus Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 09/22] powerpc/powernv: Make pnv_ioda_deconfigure_pe() visible Gavin Shan
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

The PCI slots are associated with the root port or with the downstream
ports of the PCIe switch connected to the root port. When an adapter
is hot added to a PCI slot, it usually requests more IO or memory
resources from the directly connected parent bridge (port) and
updates that bridge's windows accordingly. The resource windows
of upstream bridges can't be updated automatically. This can lead
to unbalanced resources across the bridges: the window of the
downstream bridge overruns that of the upstream bridge, and the
IO or MMIO path won't work.

This resolves the above issue by extending the bridge windows of
the root port, and of the upstream port of the PCIe switch connected
to the root port, to the PHB's windows.
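
The window extension can be sketched in user space as follows. This is
a simplified illustration, not the kernel implementation: the SK_*
flags, struct sk_res and struct sk_phb are hypothetical stand-ins for
the IORESOURCE_* flags, struct resource and the PHB's IO/M32/M64
windows. The sketch leaves a resource untouched when no PHB window
matches, which is an assumption about the desired behaviour.

```c
#include <stddef.h>

/* Hypothetical flag values standing in for IORESOURCE_* bits. */
#define SK_IO       0x1u
#define SK_MEM      0x2u
#define SK_PREFETCH 0x4u
#define SK_MEM_64   0x8u

struct sk_res {
	unsigned long long start;
	unsigned long long end;
	unsigned int flags;
};

struct sk_phb {
	struct sk_res io;	/* PHB IO window */
	struct sk_res m32;	/* PHB 32-bit MMIO window */
	struct sk_res m64;	/* PHB 64-bit prefetchable window */
	unsigned long long m64_segsize;
};

/* Return the PHB window that backs bridge resource r, or NULL if none. */
const struct sk_res *sk_pick_window(const struct sk_phb *phb,
				    const struct sk_res *r,
				    unsigned int type)
{
	if (r->flags & type & SK_IO)
		return &phb->io;
	if ((r->flags & SK_MEM_64) && (r->flags & SK_PREFETCH) &&
	    (type & SK_PREFETCH) && phb->m64_segsize)
		return &phb->m64;
	if (r->flags & type & SK_MEM)
		return &phb->m32;
	return NULL;
}

/* Extend r to the matching PHB window; return 1 if it was extended. */
int sk_extend(const struct sk_phb *phb, struct sk_res *r, unsigned int type)
{
	const struct sk_res *w = sk_pick_window(phb, r, type);

	if (!w)
		return 0;

	r->start = w->start;
	r->end = w->end;
	return 1;
}
```

Returning the window from a separate helper keeps the "no match" case
explicit, so the caller never dereferences a missing window.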

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 46 +++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index de3f292..b518364 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3146,6 +3146,49 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
 	return phb->ioda.io_segsize;
 }
 
+/*
+ * We are updating root port or the upstream port of the
+ * bridge behind the root port with PHB's windows in order
+ * to accommodate the changes on required resources during
+ * PCI (slot) hotplug, which is connected to either root
+ * port or the downstream ports of PCIe switch behind the
+ * root port.
+ */
+static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
+					   unsigned long type)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dev *bridge = bus->self;
+	struct resource *r, *w;
+	int i;
+
+	/* Check if we need to apply fixup to the bridge's windows */
+	if (!pci_is_root_bus(bridge->bus) &&
+	    !pci_is_root_bus(bridge->bus->self->bus))
+		return;
+
+	/* Fixup the resources */
+	for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
+		r = &bridge->resource[PCI_BRIDGE_RESOURCES + i];
+		if (!r->flags || !r->parent)
+			continue;
+
+		w = NULL;
+		if (r->flags & type & IORESOURCE_IO)
+			w = &hose->io_resource;
+		else if (pnv_pci_is_mem_pref_64(r->flags) &&
+			 (type & IORESOURCE_PREFETCH) &&
+			 phb->ioda.m64_segsize)
+			w = &hose->mem_resources[1];
+		else if (r->flags & type & IORESOURCE_MEM)
+			w = &hose->mem_resources[0];
+
+		r->start = w->start;
+		r->end = w->end;
+	}
+}
+
 static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
@@ -3154,6 +3197,9 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
 	struct pnv_ioda_pe *pe;
 	bool all = (pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
 
+	/* Extend bridge's windows if necessary */
+	pnv_pci_fixup_bridge_resources(bus, type);
+
 	/* The PE for root bus should be realized before any one else */
 	if (!phb->ioda.root_pe_populated) {
 		pe = pnv_ioda_setup_bus_PE(phb->hose->bus, false);
-- 
2.1.0

* [PATCH v9 09/22] powerpc/powernv: Make pnv_ioda_deconfigure_pe() visible
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (7 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 08/22] powerpc/powernv: Extend PCI bridge resources Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22   ` Gavin Shan
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

pnv_ioda_deconfigure_pe() is visible only when CONFIG_PCI_IOV is
enabled. The function will be used to tear down PE's associated
mapping in PCI hotplug path that doesn't depend on CONFIG_PCI_IOV.

This makes pnv_ioda_deconfigure_pe() visible and not depend on
CONFIG_PCI_IOV.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index b518364..b32021b 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -719,7 +719,6 @@ static int pnv_ioda_set_peltv(struct pnv_phb *phb,
 	return 0;
 }
 
-#ifdef CONFIG_PCI_IOV
 static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 {
 	struct pci_dev *parent;
@@ -754,9 +753,11 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 		}
 		rid_end = pe->rid + (count << 8);
 	} else {
+#ifdef CONFIG_PCI_IOV
 		if (pe->flags & PNV_IODA_PE_VF)
 			parent = pe->parent_dev;
 		else
+#endif
 			parent = pe->pdev->bus->self;
 		bcomp = OpalPciBusAll;
 		dcomp = OPAL_COMPARE_RID_DEVICE_NUMBER;
@@ -794,11 +795,12 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 
 	pe->pbus = NULL;
 	pe->pdev = NULL;
+#ifdef CONFIG_PCI_IOV
 	pe->parent_dev = NULL;
+#endif
 
 	return 0;
 }
-#endif /* CONFIG_PCI_IOV */
 
 static int pnv_ioda_configure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
 {
-- 
2.1.0

* [PATCH v9 10/22] powerpc/powernv: Dynamically release PE
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
@ 2016-05-03 13:22   ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: devicetree, alistair, aik, linux-pci, Gavin Shan, robherring2,
	bhelgaas, dja

This supports releasing PEs dynamically. A reference count is
introduced to the PE, representing the number of PCI devices
associated with it. The reference count is increased when a PCI
device joins the PE and decreased when a PCI device leaves the PE
in pnv_pci_release_device(). When the count becomes zero, the PE
and its consumed resources are released. Note that the count
isn't accessed concurrently, so a counter of "int" type is enough
here.

In order to release the resources consumed by the PE, a couple of
helper functions are introduced as below:

   * pnv_pci_ioda1_unset_window() - Unset IODA1 DMA32 window
   * pnv_pci_ioda1_release_dma_pe() - Release IODA1 DMA32 segments
   * pnv_pci_ioda2_release_dma_pe() - Release IODA2 DMA resource
   * pnv_ioda_release_pe_seg() - Unmap IO/M32/M64 segments
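
The counting scheme described above can be sketched in user space as
follows. This is a minimal illustration under stated assumptions:
struct sk_pe and both functions are hypothetical, the "released" flag
stands in for the real teardown done by pnv_ioda_release_pe(), and no
locking is shown because the count is assumed to be touched from one
context at a time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced view of a PE: only the fields the sketch needs. */
struct sk_pe {
	int device_count;	/* number of PCI devices attached to the PE */
	bool released;		/* set once the PE's resources are torn down */
};

/* A device joins the PE. */
void sk_pe_add_device(struct sk_pe *pe)
{
	pe->device_count++;
}

/*
 * A device leaves the PE. When the last device is gone, the PE is
 * released. Returns true if this call released the PE.
 */
bool sk_pe_remove_device(struct sk_pe *pe)
{
	assert(pe->device_count > 0);

	if (--pe->device_count == 0) {
		pe->released = true;	/* stand-in for the real teardown */
		return true;
	}
	return false;
}
```

A PE shared by two devices survives the first removal and is released
by the second one.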

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 174 ++++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/pci.h      |   1 +
 2 files changed, 175 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index b32021b..ee56ed2 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1037,6 +1037,7 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 		if (pdn->pe_number != IODA_INVALID_PE)
 			continue;
 
+		pe->device_count++;
 		pdn->pcidev = dev;
 		pdn->pe_number = pe->pe_number;
 		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
@@ -3302,6 +3303,178 @@ static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
 	return true;
 }
 
+static long pnv_pci_ioda1_unset_window(struct iommu_table_group *table_group,
+				       int num)
+{
+	struct pnv_ioda_pe *pe = container_of(table_group,
+					      struct pnv_ioda_pe, table_group);
+	struct pnv_phb *phb = pe->phb;
+	unsigned int idx;
+	long rc;
+
+	pe_info(pe, "Removing DMA window #%d\n", num);
+	for (idx = 0; idx < phb->ioda.dma32_count; idx++) {
+		if (phb->ioda.dma32_segmap[idx] != pe->pe_number)
+			continue;
+
+		rc = opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
+						idx, 0, 0ul, 0ul, 0ul);
+		if (rc != OPAL_SUCCESS) {
+			pe_warn(pe, "Failure %ld unmapping DMA32 segment#%d\n",
+				rc, idx);
+			return rc;
+		}
+
+		phb->ioda.dma32_segmap[idx] = IODA_INVALID_PE;
+	}
+
+	pnv_pci_unlink_table_and_group(table_group->tables[num], table_group);
+	return OPAL_SUCCESS;
+}
+
+static void pnv_pci_ioda1_release_dma_pe(struct pnv_ioda_pe *pe)
+{
+	unsigned int weight = pnv_pci_ioda_pe_dma_weight(pe);
+	struct iommu_table *tbl = pe->table_group.tables[0];
+	int64_t rc;
+
+	if (!weight)
+		return;
+
+	rc = pnv_pci_ioda1_unset_window(&pe->table_group, 0);
+	if (rc != OPAL_SUCCESS)
+		return;
+
+	pnv_pci_ioda1_tce_invalidate(tbl, tbl->it_offset, tbl->it_size, false);
+	if (pe->table_group.group) {
+		iommu_group_put(pe->table_group.group);
+		WARN_ON(pe->table_group.group);
+	}
+
+	free_pages(tbl->it_base, get_order(tbl->it_size << 3));
+	iommu_free_table(tbl, "pnv");
+}
+
+static void pnv_pci_ioda2_release_dma_pe(struct pnv_ioda_pe *pe)
+{
+	struct iommu_table *tbl = pe->table_group.tables[0];
+	unsigned int weight = pnv_pci_ioda_pe_dma_weight(pe);
+#ifdef CONFIG_IOMMU_API
+	int64_t rc;
+#endif
+
+	if (!weight)
+		return;
+
+#ifdef CONFIG_IOMMU_API
+	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
+	if (rc)
+		pe_warn(pe, "OPAL error %ld release DMA window\n", rc);
+#endif
+
+	pnv_pci_ioda2_set_bypass(pe, false);
+	if (pe->table_group.group) {
+		iommu_group_put(pe->table_group.group);
+		WARN_ON(pe->table_group.group);
+	}
+
+	pnv_pci_ioda2_table_free_pages(tbl);
+	iommu_free_table(tbl, "pnv");
+}
+
+static void pnv_ioda_free_pe_seg(struct pnv_ioda_pe *pe,
+				 unsigned short win,
+				 unsigned int *map)
+{
+	struct pnv_phb *phb = pe->phb;
+	int idx;
+	int64_t rc;
+
+	for (idx = 0; idx < phb->ioda.total_pe_num; idx++) {
+		if (map[idx] != pe->pe_number)
+			continue;
+
+		if (win == OPAL_M64_WINDOW_TYPE)
+			rc = opal_pci_map_pe_mmio_window(phb->opal_id,
+					phb->ioda.reserved_pe_idx, win,
+					idx / PNV_IODA1_M64_SEGS,
+					idx % PNV_IODA1_M64_SEGS);
+		else
+			rc = opal_pci_map_pe_mmio_window(phb->opal_id,
+					phb->ioda.reserved_pe_idx, win, 0, idx);
+
+		if (rc != OPAL_SUCCESS)
+			pe_warn(pe, "Error %ld unmapping (%d) segment#%d\n",
+				rc, win, idx);
+
+		map[idx] = IODA_INVALID_PE;
+	}
+}
+
+static void pnv_ioda_release_pe_seg(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+
+	if (phb->type == PNV_PHB_IODA1) {
+		pnv_ioda_free_pe_seg(pe, OPAL_IO_WINDOW_TYPE,
+				     phb->ioda.io_segmap);
+		pnv_ioda_free_pe_seg(pe, OPAL_M32_WINDOW_TYPE,
+				     phb->ioda.m32_segmap);
+		pnv_ioda_free_pe_seg(pe, OPAL_M64_WINDOW_TYPE,
+				     phb->ioda.m64_segmap);
+	} else if (phb->type == PNV_PHB_IODA2) {
+		pnv_ioda_free_pe_seg(pe, OPAL_M32_WINDOW_TYPE,
+				     phb->ioda.m32_segmap);
+	}
+}
+
+static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+	struct pnv_ioda_pe *slave, *tmp;
+
+	/* Release slave PEs in compound PE */
+	if (pe->flags & PNV_IODA_PE_MASTER) {
+		list_for_each_entry_safe(slave, tmp, &pe->slaves, list)
+			pnv_ioda_release_pe(slave);
+	}
+
+	list_del(&pe->list);
+	switch (phb->type) {
+	case PNV_PHB_IODA1:
+		pnv_pci_ioda1_release_dma_pe(pe);
+		break;
+	case PNV_PHB_IODA2:
+		pnv_pci_ioda2_release_dma_pe(pe);
+		break;
+	default:
+		WARN_ON(1);
+	}
+
+	pnv_ioda_release_pe_seg(pe);
+	pnv_ioda_deconfigure_pe(pe->phb, pe);
+	pnv_ioda_free_pe(pe);
+}
+
+static void pnv_pci_release_device(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	struct pnv_ioda_pe *pe;
+
+	if (pdev->is_virtfn)
+		return;
+
+	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
+		return;
+
+	pe = &phb->ioda.pe_array[pdn->pe_number];
+	WARN_ON(--pe->device_count < 0);
+	if (pe->device_count == 0)
+		pnv_ioda_release_pe(pe);
+}
+
 static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
 {
 	struct pnv_phb *phb = hose->private_data;
@@ -3318,6 +3491,7 @@ static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
 	.teardown_msi_irqs	= pnv_teardown_msi_irqs,
 #endif
 	.enable_device_hook	= pnv_pci_enable_device_hook,
+	.release_device		= pnv_pci_release_device,
 	.window_alignment	= pnv_pci_window_alignment,
 	.setup_bridge		= pnv_pci_setup_bridge,
 	.reset_secondary_bus	= pnv_pci_reset_secondary_bus,
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index a81bf01..dda5fa7 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -31,6 +31,7 @@ struct pnv_phb;
 struct pnv_ioda_pe {
 	unsigned long		flags;
 	struct pnv_phb		*phb;
+	int			device_count;
 
 #define PNV_IODA_MAX_PEER_PES	8
 	struct pnv_ioda_pe	*peers[PNV_IODA_MAX_PEER_PES];
-- 
2.1.0

* [PATCH v9 11/22] powerpc/pci: Update bridge windows on PCI plug
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (9 preceding siblings ...)
  2016-05-03 13:22   ` Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 12/22] powerpc/pci: Delay populating pdn Gavin Shan
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

On a PCI plugging event, the PCI slot's subordinate devices are
scanned and their (IO and MMIO) resources are assigned. Platform
dependent resources (PE#, IO/MMIO/DMA windows) are allocated or
created when the windows of the slot's upstream bridge are updated.

This updates the windows of the hot plugged slot's upstream bridge
in pcibios_finish_adding_to_bus() so that the platform resources
(PE#, IO/MMIO/DMA segments) are allocated or created accordingly.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 arch/powerpc/kernel/pci-common.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 40df3a5..be9e515 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1444,8 +1444,12 @@ void pcibios_finish_adding_to_bus(struct pci_bus *bus)
 	/* Allocate bus and devices resources */
 	pcibios_allocate_bus_resources(bus);
 	pcibios_claim_one_bus(bus);
-	if (!pci_has_flag(PCI_PROBE_ONLY))
-		pci_assign_unassigned_bus_resources(bus);
+	if (!pci_has_flag(PCI_PROBE_ONLY)) {
+		if (bus->self)
+			pci_assign_unassigned_bridge_resources(bus->self);
+		else
+			pci_assign_unassigned_bus_resources(bus);
+	}
 
 	/* Fixup EEH */
 	eeh_add_device_tree_late(bus);
-- 
2.1.0

* [PATCH v9 12/22] powerpc/pci: Delay populating pdn
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (10 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 11/22] powerpc/pci: Update bridge windows on PCI plug Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 13/22] powerpc/powernv: Support PCI slot ID Gavin Shan
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

The pdn (struct pci_dn) instances are allocated from memblock or
bootmem when creating PCI controllers (hoses) in setup_arch(). PCI
hotplug, which will be supported by subsequent patches, releases
PCI device nodes and their corresponding pdn on unplugging events.
The memory chunks for pdn instances allocated from memblock or
bootmem are hard to reuse after being released.

This delays creating pdn by pci_devs_phb_init() from setup_arch()
to core_initcall() so that they are allocated from slab. The memory
consumed by pdn can then be released to the system without problem
at PCI unplugging time. It also means that pci_dn is unavailable in
setup_arch() and the fixup on pdn (like AGP's) can't be carried out
at that time. We have to do that in pcibios_root_bridge_prepare()
on maple/pasemi/powermac platforms, where the pdn is available.
pcibios_root_bridge_prepare() is called from subsys_initcall(), which
is executed after core_initcall(), so the code flow does not change.

Meanwhile, the EEH device is created when the pdn is populated,
meaning pdn and EEH device have the same life cycle. In turn, we
needn't call eeh_dev_init() to create the EEH device explicitly.
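
The shared life cycle described above can be sketched in standalone
C. This is only a userspace model, not the kernel code: the demo_*
types and helpers are hypothetical, and calloc()/free() stand in for
kzalloc()/kfree(). It shows the EEH device being created together
with the pdn, and the pdn being dropped again if that step fails.

```c
#include <stdlib.h>

struct demo_edev;

/* Hypothetical stand-in for struct pci_dn. */
struct demo_pdn {
	struct demo_edev *edev;
	int busno;
};

/* Hypothetical stand-in for struct eeh_dev. */
struct demo_edev {
	struct demo_pdn *pdn;
};

/* Create the EEH device and link it to its pdn in both directions. */
static struct demo_edev *demo_edev_init(struct demo_pdn *pdn)
{
	struct demo_edev *edev = calloc(1, sizeof(*edev));

	if (!edev)
		return NULL;

	pdn->edev = edev;
	edev->pdn = pdn;
	return edev;
}

/* Allocate a pdn; the EEH device is created as part of the same step. */
static struct demo_pdn *demo_pdn_add(int busno)
{
	struct demo_pdn *pdn = calloc(1, sizeof(*pdn));

	if (!pdn)
		return NULL;

	pdn->busno = busno;
	if (!demo_edev_init(pdn)) {
		free(pdn);	/* no pdn without its EEH device */
		return NULL;
	}

	return pdn;
}

/* Release both objects together, as happens on unplug. */
static void demo_pdn_remove(struct demo_pdn *pdn)
{
	if (!pdn)
		return;

	free(pdn->edev);
	free(pdn);
}
```

A caller would pair demo_pdn_add() with demo_pdn_remove() and never
touch the EEH device on its own.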

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 arch/powerpc/include/asm/eeh.h         |  2 +-
 arch/powerpc/include/asm/ppc-pci.h     |  2 --
 arch/powerpc/kernel/eeh_dev.c          | 17 +++------------
 arch/powerpc/kernel/pci_dn.c           | 23 ++++++++++++++++----
 arch/powerpc/platforms/maple/pci.c     | 34 ++++++++++++++++++------------
 arch/powerpc/platforms/pasemi/pci.c    |  3 ---
 arch/powerpc/platforms/powermac/pci.c  | 38 +++++++++++++++++++++-------------
 arch/powerpc/platforms/powernv/pci.c   |  3 ---
 arch/powerpc/platforms/pseries/setup.c |  6 +-----
 9 files changed, 69 insertions(+), 59 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index fb9f376..8721580 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -274,7 +274,7 @@ void eeh_pe_restore_bars(struct eeh_pe *pe);
 const char *eeh_pe_loc_get(struct eeh_pe *pe);
 struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
 
-void *eeh_dev_init(struct pci_dn *pdn, void *data);
+struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
 void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
 int eeh_init(void);
 int __init eeh_ops_register(struct eeh_ops *ops);
diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 8753e4e..0f73de0 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -39,8 +39,6 @@ void *pci_traverse_device_nodes(struct device_node *start,
 void *traverse_pci_dn(struct pci_dn *root,
 		      void *(*fn)(struct pci_dn *, void *),
 		      void *data);
-
-extern void pci_devs_phb_init(void);
 extern void pci_devs_phb_init_dynamic(struct pci_controller *phb);
 
 /* From rtas_pci.h */
diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
index 7815095..d6b2ca7 100644
--- a/arch/powerpc/kernel/eeh_dev.c
+++ b/arch/powerpc/kernel/eeh_dev.c
@@ -44,14 +44,13 @@
 /**
  * eeh_dev_init - Create EEH device according to OF node
  * @pdn: PCI device node
- * @data: PHB
  *
  * It will create EEH device according to the given OF node. The function
  * might be called by PCI emunation, DR, PHB hotplug.
  */
-void *eeh_dev_init(struct pci_dn *pdn, void *data)
+struct eeh_dev *eeh_dev_init(struct pci_dn *pdn)
 {
-	struct pci_controller *phb = data;
+	struct pci_controller *phb = pdn->phb;
 	struct eeh_dev *edev;
 
 	/* Allocate EEH device */
@@ -69,7 +68,7 @@ void *eeh_dev_init(struct pci_dn *pdn, void *data)
 	INIT_LIST_HEAD(&edev->list);
 	INIT_LIST_HEAD(&edev->rmv_list);
 
-	return NULL;
+	return edev;
 }
 
 /**
@@ -81,16 +80,8 @@ void *eeh_dev_init(struct pci_dn *pdn, void *data)
  */
 void eeh_dev_phb_init_dynamic(struct pci_controller *phb)
 {
-	struct pci_dn *root = phb->pci_data;
-
 	/* EEH PE for PHB */
 	eeh_phb_pe_create(phb);
-
-	/* EEH device for PHB */
-	eeh_dev_init(root, phb);
-
-	/* EEH devices for children OF nodes */
-	traverse_pci_dn(root, eeh_dev_init, phb);
 }
 
 /**
@@ -106,8 +97,6 @@ static int __init eeh_dev_phb_init(void)
 	list_for_each_entry_safe(phb, tmp, &hose_list, list_node)
 		eeh_dev_phb_init_dynamic(phb);
 
-	pr_info("EEH: devices created\n");
-
 	return 0;
 }
 
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index ecdccce..9cbf95a 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -209,8 +209,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 		}
 
 		/* Create the EEH device for the VF */
-		eeh_dev_init(pdn, pci_bus_to_host(pdev->bus));
-		edev = pdn_to_eeh_dev(pdn);
+		edev = eeh_dev_init(pdn);
 		BUG_ON(!edev);
 		edev->physfn = pdev;
 	}
@@ -289,8 +288,11 @@ struct pci_dn *pci_add_device_node_info(struct pci_controller *hose,
 	const __be32 *regs;
 	struct device_node *parent;
 	struct pci_dn *pdn;
+#ifdef CONFIG_EEH
+	struct eeh_dev *edev;
+#endif
 
-	pdn = zalloc_maybe_bootmem(sizeof(*pdn), GFP_KERNEL);
+	pdn = kzalloc(sizeof(*pdn), GFP_KERNEL);
 	if (pdn == NULL)
 		return NULL;
 	dn->data = pdn;
@@ -319,6 +321,15 @@ struct pci_dn *pci_add_device_node_info(struct pci_controller *hose,
 	/* Extended config space */
 	pdn->pci_ext_config_space = (type && of_read_number(type, 1) == 1);
 
+	/* Create EEH device */
+#ifdef CONFIG_EEH
+	edev = eeh_dev_init(pdn);
+	if (!edev) {
+		kfree(pdn);
+		return NULL;
+	}
+#endif
+
 	/* Attach to parent node */
 	INIT_LIST_HEAD(&pdn->child_list);
 	INIT_LIST_HEAD(&pdn->list);
@@ -504,15 +515,19 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb)
  * pci device found underneath.  This routine runs once,
  * early in the boot sequence.
  */
-void __init pci_devs_phb_init(void)
+static int __init pci_devs_phb_init(void)
 {
 	struct pci_controller *phb, *tmp;
 
 	/* This must be done first so the device nodes have valid pci info! */
 	list_for_each_entry_safe(phb, tmp, &hose_list, list_node)
 		pci_devs_phb_init_dynamic(phb);
+
+	return 0;
 }
 
+core_initcall(pci_devs_phb_init);
+
 static void pci_dev_pdn_setup(struct pci_dev *pdev)
 {
 	struct pci_dn *pdn;
diff --git a/arch/powerpc/platforms/maple/pci.c b/arch/powerpc/platforms/maple/pci.c
index a923230..a2f89e6 100644
--- a/arch/powerpc/platforms/maple/pci.c
+++ b/arch/powerpc/platforms/maple/pci.c
@@ -568,6 +568,26 @@ void maple_pci_irq_fixup(struct pci_dev *dev)
 	DBG(" <- maple_pci_irq_fixup\n");
 }
 
+static int maple_pci_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+	struct pci_controller *hose = pci_bus_to_host(bridge->bus);
+	struct device_node *np, *child;
+
+	if (hose != u3_agp)
+		return 0;
+
+	/* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We
+	 * assume there is no P2P bridge on the AGP bus, which should be a
+	 * safe assumptions hopefully.
+	 */
+	np = hose->dn;
+	PCI_DN(np)->busno = 0xf0;
+	for_each_child_of_node(np, child)
+		PCI_DN(child)->busno = 0xf0;
+
+	return 0;
+}
+
 void __init maple_pci_init(void)
 {
 	struct device_node *np, *root;
@@ -605,19 +625,7 @@ void __init maple_pci_init(void)
 	if (ht && maple_add_bridge(ht) != 0)
 		of_node_put(ht);
 
-	/* Setup the linkage between OF nodes and PHBs */ 
-	pci_devs_phb_init();
-
-	/* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We
-	 * assume there is no P2P bridge on the AGP bus, which should be a
-	 * safe assumptions hopefully.
-	 */
-	if (u3_agp) {
-		struct device_node *np = u3_agp->dn;
-		PCI_DN(np)->busno = 0xf0;
-		for (np = np->child; np; np = np->sibling)
-			PCI_DN(np)->busno = 0xf0;
-	}
+	ppc_md.pcibios_root_bridge_prepare = maple_pci_root_bridge_prepare;
 
 	/* Tell pci.c to not change any resource allocations.  */
 	pci_add_flags(PCI_PROBE_ONLY);
diff --git a/arch/powerpc/platforms/pasemi/pci.c b/arch/powerpc/platforms/pasemi/pci.c
index f3a68a0..10c4e8f 100644
--- a/arch/powerpc/platforms/pasemi/pci.c
+++ b/arch/powerpc/platforms/pasemi/pci.c
@@ -229,9 +229,6 @@ void __init pas_pci_init(void)
 			of_node_get(np);
 
 	of_node_put(root);
-
-	/* Setup the linkage between OF nodes and PHBs */
-	pci_devs_phb_init();
 }
 
 void __iomem *pasemi_pci_getcfgaddr(struct pci_dev *dev, int offset)
diff --git a/arch/powerpc/platforms/powermac/pci.c b/arch/powerpc/platforms/powermac/pci.c
index 59ab16f..6e06c3b 100644
--- a/arch/powerpc/platforms/powermac/pci.c
+++ b/arch/powerpc/platforms/powermac/pci.c
@@ -878,6 +878,29 @@ void pmac_pci_irq_fixup(struct pci_dev *dev)
 #endif /* CONFIG_PPC32 */
 }
 
+#ifdef CONFIG_PPC64
+static int pmac_pci_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+	struct pci_controller *hose = pci_bus_to_host(bridge->bus);
+	struct device_node *np, *child;
+
+	if (hose != u3_agp)
+		return 0;
+
+	/* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We
+	 * assume there is no P2P bridge on the AGP bus, which should be a
+	 * safe assumptions for now. We should do something better in the
+	 * future though
+	 */
+	np = hose->dn;
+	PCI_DN(np)->busno = 0xf0;
+	for_each_child_of_node(np, child)
+		PCI_DN(child)->busno = 0xf0;
+
+	return 0;
+}
+#endif /* CONFIG_PPC64 */
+
 void __init pmac_pci_init(void)
 {
 	struct device_node *np, *root;
@@ -914,20 +937,7 @@ void __init pmac_pci_init(void)
 	if (ht && pmac_add_bridge(ht) != 0)
 		of_node_put(ht);
 
-	/* Setup the linkage between OF nodes and PHBs */
-	pci_devs_phb_init();
-
-	/* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We
-	 * assume there is no P2P bridge on the AGP bus, which should be a
-	 * safe assumptions for now. We should do something better in the
-	 * future though
-	 */
-	if (u3_agp) {
-		struct device_node *np = u3_agp->dn;
-		PCI_DN(np)->busno = 0xf0;
-		for (np = np->child; np; np = np->sibling)
-			PCI_DN(np)->busno = 0xf0;
-	}
+	ppc_md.pcibios_root_bridge_prepare = pmac_pci_root_bridge_prepare;
 	/* pmac_check_ht_link(); */
 
 #else /* CONFIG_PPC64 */
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 8827461..67a33e9 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -822,9 +822,6 @@ void __init pnv_pci_init(void)
 	for_each_compatible_node(np, NULL, "ibm,ioda2-npu-phb")
 		pnv_pci_init_npu_phb(np);
 
-	/* Setup the linkage between OF nodes and PHBs */
-	pci_devs_phb_init();
-
 	/* Configure IOMMU DMA hooks */
 	set_pci_dma_ops(&dma_iommu_ops);
 }
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index cd739da..62041a3 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -263,11 +263,8 @@ static int pci_dn_reconfig_notifier(struct notifier_block *nb, unsigned long act
 	case OF_RECONFIG_ATTACH_NODE:
 		parent = of_get_parent(np);
 		pdn = parent ? PCI_DN(parent) : NULL;
-		if (pdn) {
-			/* Create pdn and EEH device */
+		if (pdn)
 			pci_add_device_node_info(pdn->phb, np);
-			eeh_dev_init(PCI_DN(np), pdn->phb);
-		}
 
 		of_node_put(parent);
 		break;
@@ -490,7 +487,6 @@ static void __init find_and_init_phbs(void)
 	}
 
 	of_node_put(root);
-	pci_devs_phb_init();
 
 	/*
 	 * PCI_PROBE_ONLY and PCI_REASSIGN_ALL_BUS can be set via properties
-- 
2.1.0

* [PATCH v9 13/22] powerpc/powernv: Support PCI slot ID
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (11 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 12/22] powerpc/pci: Delay populating pdn Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 14/22] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

The reset and poll functionality from (OPAL) firmware supports
both PHBs and PCI slots, which are identified by ID. This adds
support for PCI slot IDs by:

   * Renaming the argument of opal_pci_reset() and opal_pci_poll()
     accordingly.
   * Renaming pnv_eeh_phb_poll() to pnv_eeh_poll() and adjusting its
     argument.
   * Adding one macro to produce the PCI slot ID.
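
As a worked illustration of the slot ID encoding, the standalone
sketch below reproduces the macro from this patch. The helper
slot_id_for() and the sample values are assumptions made for the
example, and the ULL suffix is added here only to keep the constant
portable outside the kernel; the bit layout itself comes from the
patch.

```c
#include <stdint.h>

/*
 * Same layout as the macro added to asm/pnv-pci.h: bit 63 marks a
 * slot ID, the bus/device/function number is shifted up by 16 bits
 * and the PHB ID occupies the low bits.
 */
#define PCI_SLOT_ID_PREFIX	0x8000000000000000ULL
#define PCI_SLOT_ID(phb_id, bdfn)	\
	(PCI_SLOT_ID_PREFIX | ((uint64_t)(bdfn) << 16) | (phb_id))

/* Hypothetical helper: build the ID from a bus number and devfn. */
static uint64_t slot_id_for(uint64_t phb_id, uint8_t bus, uint8_t devfn)
{
	uint16_t bdfn = (uint16_t)((bus << 8) | devfn);

	return PCI_SLOT_ID(phb_id, bdfn);
}
```

For PHB 1, bus 0x02 and devfn 0x08 the bdfn is 0x0208, so the
resulting ID is 0x8000000002080001.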

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal.h              | 4 ++--
 arch/powerpc/include/asm/pnv-pci.h           | 4 ++++
 arch/powerpc/platforms/powernv/eeh-powernv.c | 8 ++++----
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9d86c66..348132c 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -131,7 +131,7 @@ int64_t opal_pci_map_pe_dma_window(uint64_t phb_id, uint16_t pe_number, uint16_t
 int64_t opal_pci_map_pe_dma_window_real(uint64_t phb_id, uint16_t pe_number,
 					uint16_t dma_window_number, uint64_t pci_start_addr,
 					uint64_t pci_mem_size);
-int64_t opal_pci_reset(uint64_t phb_id, uint8_t reset_scope, uint8_t assert_state);
+int64_t opal_pci_reset(uint64_t id, uint8_t reset_scope, uint8_t assert_state);
 
 int64_t opal_pci_get_hub_diag_data(uint64_t hub_id, void *diag_buffer,
 				   uint64_t diag_buffer_len);
@@ -148,7 +148,7 @@ int64_t opal_get_dpo_status(__be64 *dpo_timeout);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
 			    __be16 *pci_error_type, __be16 *severity);
-int64_t opal_pci_poll(uint64_t phb_id);
+int64_t opal_pci_poll(uint64_t id);
 int64_t opal_return_cpu(void);
 int64_t opal_check_token(uint64_t token);
 int64_t opal_reinit_cpus(uint64_t flags);
diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
index 6f77f71..c607902 100644
--- a/arch/powerpc/include/asm/pnv-pci.h
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -13,6 +13,10 @@
 #include <linux/pci.h>
 #include <misc/cxl-base.h>
 
+#define PCI_SLOT_ID_PREFIX	0x8000000000000000
+#define PCI_SLOT_ID(phb_id, bdfn)	\
+	(PCI_SLOT_ID_PREFIX | ((uint64_t)(bdfn) << 16) | (phb_id))
+
 int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode);
 int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 			   unsigned int virq);
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 593b8dc..3b17033 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -717,12 +717,12 @@ static int pnv_eeh_get_state(struct eeh_pe *pe, int *delay)
 	return ret;
 }
 
-static s64 pnv_eeh_phb_poll(struct pnv_phb *phb)
+static s64 pnv_eeh_poll(unsigned long id)
 {
 	s64 rc = OPAL_HARDWARE;
 
 	while (1) {
-		rc = opal_pci_poll(phb->opal_id);
+		rc = opal_pci_poll(id);
 		if (rc <= 0)
 			break;
 
@@ -762,7 +762,7 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int option)
 	 * reset followed by hot reset on root bus. So we also
 	 * need the PCI bus settlement delay.
 	 */
-	rc = pnv_eeh_phb_poll(phb);
+	rc = pnv_eeh_poll(phb->opal_id);
 	if (option == EEH_RESET_DEACTIVATE) {
 		if (system_state < SYSTEM_RUNNING)
 			udelay(1000 * EEH_PE_RST_SETTLE_TIME);
@@ -805,7 +805,7 @@ static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
 		goto out;
 
 	/* Poll state of the PHB until the request is done */
-	rc = pnv_eeh_phb_poll(phb);
+	rc = pnv_eeh_poll(phb->opal_id);
 	if (option == EEH_RESET_DEACTIVATE)
 		msleep(EEH_PE_RST_SETTLE_TIME);
 out:
-- 
2.1.0

* [PATCH v9 14/22] powerpc/powernv: Use PCI slot reset infrastructure
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (12 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 13/22] powerpc/powernv: Support PCI slot ID Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 15/22] powerpc/powernv: Functions to get/set PCI slot state Gavin Shan
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

The (OPAL) firmware might provide the PCI slot reset capability,
which is identified by the property "ibm,reset-by-firmware" on the
device node associated with the PCI slot.

This routes the reset request to firmware if "ibm,reset-by-firmware"
exists in the PCI slot device node. Otherwise, the reset is done
inside the kernel as before.
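
The routing decision can be summarised with the small standalone
model below. All demo_* names and enum values are hypothetical
stand-ins (the real constants live in asm/eeh.h and asm/opal-api.h);
the sketch only mirrors which path a request takes, not the reset
itself.

```c
#include <stdbool.h>

/* Stand-in reset options; values are illustrative only. */
enum demo_reset_option {
	DEMO_RESET_DEACTIVATE,
	DEMO_RESET_HOT,
	DEMO_RESET_FUNDAMENTAL,
};

/* Where a reset request ends up. */
enum demo_reset_path {
	DEMO_PATH_KERNEL,	/* in-kernel bridge reset, as before */
	DEMO_PATH_FIRMWARE,	/* reset call into firmware, then poll */
	DEMO_PATH_NONE,		/* nothing to do for deactivate */
	DEMO_PATH_INVALID,	/* unsupported option */
};

static enum demo_reset_path
demo_pick_reset_path(bool has_reset_by_firmware, int option)
{
	/* Without the property the kernel handles every request. */
	if (!has_reset_by_firmware)
		return DEMO_PATH_KERNEL;

	switch (option) {
	case DEMO_RESET_FUNDAMENTAL:
	case DEMO_RESET_HOT:
		return DEMO_PATH_FIRMWARE;
	case DEMO_RESET_DEACTIVATE:
		return DEMO_PATH_NONE;
	default:
		return DEMO_PATH_INVALID;
	}
}
```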

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 41 +++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 3b17033..1ea78e5 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -36,6 +36,7 @@
 #include <asm/msi_bitmap.h>
 #include <asm/opal.h>
 #include <asm/ppc-pci.h>
+#include <asm/pnv-pci.h>
 
 #include "powernv.h"
 #include "pci.h"
@@ -815,7 +816,7 @@ out:
 	return 0;
 }
 
-static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
+static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 {
 	struct pci_dn *pdn = pci_get_pdn_by_devfn(dev->bus, dev->devfn);
 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
@@ -866,6 +867,44 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	return 0;
 }
 
+static int pnv_eeh_bridge_reset(struct pci_dev *pdev, int option)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct device_node *dn = pci_device_to_OF_node(pdev);
+	uint64_t id = PCI_SLOT_ID(phb->opal_id,
+				  (pdev->bus->number << 8) | pdev->devfn);
+	uint8_t scope;
+	int64_t rc;
+
+	/* Hot reset to the bus if firmware cannot handle */
+	if (!dn || !of_get_property(dn, "ibm,reset-by-firmware", NULL))
+		return __pnv_eeh_bridge_reset(pdev, option);
+
+	switch (option) {
+	case EEH_RESET_FUNDAMENTAL:
+		scope = OPAL_RESET_PCI_FUNDAMENTAL;
+		break;
+	case EEH_RESET_HOT:
+		scope = OPAL_RESET_PCI_HOT;
+		break;
+	case EEH_RESET_DEACTIVATE:
+		return 0;
+	default:
+		dev_dbg(&pdev->dev, "%s: Unsupported reset %d\n",
+			__func__, option);
+		return -EINVAL;
+	}
+
+	rc = opal_pci_reset(id, scope, OPAL_ASSERT_RESET);
+	if (rc <= OPAL_SUCCESS)
+		goto out;
+
+	rc = pnv_eeh_poll(id);
+out:
+	return (rc == OPAL_SUCCESS) ? 0 : -EIO;
+}
+
 void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	pnv_eeh_bridge_reset(dev, EEH_RESET_HOT);
-- 
2.1.0

* [PATCH v9 15/22] powerpc/powernv: Functions to get/set PCI slot state
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (13 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 14/22] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-11  3:28   ` Alistair Popple
  2016-05-03 13:22 ` [PATCH v9 16/22] drivers/of: Split unflatten_dt_node() Gavin Shan
                   ` (6 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

This exports 4 functions, which are based on the corresponding OPAL
APIs, to get/set PCI slot status. Those functions are going to
be used by the PowerNV PCI hotplug driver:

   pnv_pci_get_device_tree()    opal_get_device_tree()
   pnv_pci_get_presence_state() opal_pci_get_presence_state()
   pnv_pci_get_power_state()    opal_pci_get_power_state()
   pnv_pci_set_power_state()    opal_pci_set_power_state()

Besides, the patch also exports pnv_pci_hotplug_notifier_{register,
unregister}() to allow registration and unregistration of a PCI
hotplug notifier, which will be used by the PowerNV PCI hotplug
driver to receive PCI hotplug messages from skiboot firmware.
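
The get/set wrappers in this patch share one return convention with
the poll helper: zero means done, a negative value is an error, and
a positive value is a delay in milliseconds before polling again.
The standalone sketch below models that loop. struct demo_fw and its
helpers are hypothetical and replace the real firmware calls, and
the delay is only accumulated instead of actually sleeping.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical fake firmware used to drive the loop. */
struct demo_fw {
	int busy_polls;		/* polls that still report "busy" */
	int64_t final_rc;	/* 0 on success, negative on error */
	int64_t waited_ms;	/* total delay the caller was asked for */
};

/* Fake poll call: ask for 10ms more until the work is finished. */
static int64_t demo_fw_poll(struct demo_fw *fw)
{
	if (fw->busy_polls > 0) {
		fw->busy_polls--;
		return 10;
	}

	return fw->final_rc;
}

/* Poll until firmware reports completion or an error. */
static int demo_poll_until_done(struct demo_fw *fw)
{
	int64_t rc;

	for (;;) {
		rc = demo_fw_poll(fw);
		if (rc <= 0)
			break;

		/* Kernel code would msleep(rc) or udelay() here. */
		fw->waited_ms += rc;
	}

	return rc == 0 ? 0 : -EIO;
}
```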

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 arch/powerpc/include/asm/opal-api.h            |  18 ++++-
 arch/powerpc/include/asm/opal.h                |   5 ++
 arch/powerpc/include/asm/pnv-pci.h             |   7 ++
 arch/powerpc/platforms/powernv/opal-wrappers.S |   5 ++
 arch/powerpc/platforms/powernv/pci.c           | 102 +++++++++++++++++++++++++
 5 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 9bb8ddf..728e04e 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -158,7 +158,12 @@
 #define OPAL_LEDS_SET_INDICATOR			115
 #define OPAL_CEC_REBOOT2			116
 #define OPAL_CONSOLE_FLUSH			117
-#define OPAL_LAST				117
+#define OPAL_GET_DEVICE_TREE			118
+#define OPAL_PCI_GET_PRESENCE_STATE		119
+#define OPAL_PCI_GET_POWER_STATE		120
+#define OPAL_PCI_SET_POWER_STATE		121
+#define OPAL_PCI_POLL2				122
+#define OPAL_LAST				122
 
 /* Device tree flags */
 
@@ -344,6 +349,16 @@ enum OpalPciResetState {
 	OPAL_ASSERT_RESET   = 1
 };
 
+enum OpalPciSlotPresentenceState {
+	OPAL_PCI_SLOT_EMPTY	= 0,
+	OPAL_PCI_SLOT_PRESENT	= 1
+};
+
+enum OpalPciSlotPowerState {
+	OPAL_PCI_SLOT_POWER_OFF	= 0,
+	OPAL_PCI_SLOT_POWER_ON	= 1
+};
+
 enum OpalSlotLedType {
 	OPAL_SLOT_LED_TYPE_ID = 0,	/* IDENTIFY LED */
 	OPAL_SLOT_LED_TYPE_FAULT = 1,	/* FAULT LED */
@@ -378,6 +393,7 @@ enum opal_msg_type {
 	OPAL_MSG_DPO		= 5,
 	OPAL_MSG_PRD		= 6,
 	OPAL_MSG_OCC		= 7,
+	OPAL_MSG_PCI_HOTPLUG	= 8,
 	OPAL_MSG_TYPE_MAX,
 };
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 348132c..1a83c80 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -209,6 +209,11 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf,
 		uint64_t size, uint64_t token);
 int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size,
 		uint64_t token);
+int64_t opal_get_device_tree(uint32_t phandle, uint64_t buf, uint64_t len);
+int64_t opal_pci_get_presence_state(uint64_t id, uint64_t data);
+int64_t opal_pci_get_power_state(uint64_t id, uint64_t data);
+int64_t opal_pci_set_power_state(uint64_t id, uint64_t data);
+int64_t opal_pci_poll2(uint64_t id, uint64_t data);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
index c607902..8db7439 100644
--- a/arch/powerpc/include/asm/pnv-pci.h
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -17,6 +17,13 @@
 #define PCI_SLOT_ID(phb_id, bdfn)	\
 	(PCI_SLOT_ID_PREFIX | ((uint64_t)(bdfn) << 16) | (phb_id))
 
+extern int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t len);
+extern int pnv_pci_get_presence_state(uint64_t id, uint8_t *state);
+extern int pnv_pci_get_power_state(uint64_t id, uint8_t *state);
+extern int pnv_pci_set_power_state(uint64_t id, uint8_t state);
+extern int pnv_pci_hotplug_notifier_register(struct notifier_block *nb);
+extern int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb);
+
 int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode);
 int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 			   unsigned int virq);
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index e45b88a..60397d2 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -302,3 +302,8 @@ OPAL_CALL(opal_prd_msg,				OPAL_PRD_MSG);
 OPAL_CALL(opal_leds_get_ind,			OPAL_LEDS_GET_INDICATOR);
 OPAL_CALL(opal_leds_set_ind,			OPAL_LEDS_SET_INDICATOR);
 OPAL_CALL(opal_console_flush,			OPAL_CONSOLE_FLUSH);
+OPAL_CALL(opal_get_device_tree,			OPAL_GET_DEVICE_TREE);
+OPAL_CALL(opal_pci_get_presence_state,		OPAL_PCI_GET_PRESENCE_STATE);
+OPAL_CALL(opal_pci_get_power_state,		OPAL_PCI_GET_POWER_STATE);
+OPAL_CALL(opal_pci_set_power_state,		OPAL_PCI_SET_POWER_STATE);
+OPAL_CALL(opal_pci_poll2,			OPAL_PCI_POLL2);
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 67a33e9..6e10ac4 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -42,6 +42,108 @@
 #define cfg_dbg(fmt...)	do { } while(0)
 //#define cfg_dbg(fmt...)	printk(fmt)
 
+int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t len)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_GET_DEVICE_TREE))
+		return -ENXIO;
+
+	rc = opal_get_device_tree(phandle, (uint64_t)buf, len);
+	if (rc != OPAL_SUCCESS)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pnv_pci_get_device_tree);
+
+static int pnv_pci_poll2(uint64_t id, uint8_t *state)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_PCI_POLL2))
+		return -ENXIO;
+
+	while (1) {
+		rc = opal_pci_poll2(id, (uint64_t)state);
+		if (rc <= OPAL_SUCCESS)
+			break;
+
+		if (system_state < SYSTEM_RUNNING)
+			udelay(1000 * rc);
+		else
+			msleep(rc);
+	}
+
+	if (rc != OPAL_SUCCESS)
+		return -EIO;
+
+	return 0;
+}
+
+int pnv_pci_get_presence_state(uint64_t id, uint8_t *state)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_PCI_GET_PRESENCE_STATE))
+		return -ENXIO;
+
+	rc = opal_pci_get_presence_state(id, (uint64_t)state);
+	if (rc == OPAL_SUCCESS)
+		return 0;
+	else if (rc < OPAL_SUCCESS)
+		return -EIO;
+
+	return pnv_pci_poll2(id, state);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_get_presence_state);
+
+int pnv_pci_get_power_state(uint64_t id, uint8_t *state)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_PCI_GET_POWER_STATE))
+		return -ENXIO;
+
+	rc = opal_pci_get_power_state(id, (uint64_t)state);
+	if (rc == OPAL_SUCCESS)
+		return 0;
+	else if (rc < OPAL_SUCCESS)
+		return -EIO;
+
+	return pnv_pci_poll2(id, state);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_get_power_state);
+
+int pnv_pci_set_power_state(uint64_t id, uint8_t state)
+{
+	int64_t rc;
+
+	if (!opal_check_token(OPAL_PCI_SET_POWER_STATE))
+		return -ENXIO;
+
+	rc = opal_pci_set_power_state(id, (uint64_t)&state);
+	if (rc == OPAL_SUCCESS)
+		return 0;
+	else if (rc < OPAL_SUCCESS)
+		return -EIO;
+
+	return pnv_pci_poll2(id, &state);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_set_power_state);
+
+int pnv_pci_hotplug_notifier_register(struct notifier_block *nb)
+{
+	return opal_message_notifier_register(OPAL_MSG_PCI_HOTPLUG, nb);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_hotplug_notifier_register);
+
+int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb)
+{
+	return opal_message_notifier_unregister(OPAL_MSG_PCI_HOTPLUG, nb);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_hotplug_notifier_unregister);
+
 #ifdef CONFIG_PCI_MSI
 int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 {
-- 
2.1.0

* [PATCH v9 16/22] drivers/of: Split unflatten_dt_node()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (14 preceding siblings ...)
  2016-05-03 13:22 ` [PATCH v9 15/22] powerpc/powernv: Functions to get/set PCI slot state Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
  2016-05-03 13:22   ` Gavin Shan
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

The function unflatten_dt_node() is called recursively to unflatten
device nodes and properties in the FDT blob. It looks complicated
and is hard to understand.

This splits the function into 3 functions: populate_properties(),
populate_node() and unflatten_dt_node(). populate_properties(),
which is called by populate_node(), creates properties for the
indicated device node. populate_node() creates the device node
from the FDT blob. unflatten_dt_node() gets the offset in the FDT
blob for the next device node and then calls populate_node(). No
logical changes introduced.
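
The division of labour between the three functions can be pictured
with the toy model below. It is not the code from this patch: the
demo_* names and the token array are made up, libfdt is replaced by
a plain array, and the walker tracks depth with a counter, which is
a simplification chosen for the example. Only the layering (walker
calls the node helper, node helper calls the property helper) is
meant to match.

```c
#include <stddef.h>

enum demo_tok { DEMO_BEGIN, DEMO_PROP, DEMO_END };

/* One entry of a made-up flat tree. */
struct demo_token {
	enum demo_tok type;
	const char *name;
};

struct demo_node {
	const char *name;
	int depth;
	int nprops;
};

/* Consume the property tokens directly following a node's BEGIN. */
static size_t demo_populate_properties(const struct demo_token *t,
				       size_t pos, size_t n,
				       struct demo_node *np)
{
	while (pos < n && t[pos].type == DEMO_PROP) {
		np->nprops++;
		pos++;
	}

	return pos;
}

/* Fill in one node, then hand over to the property helper. */
static size_t demo_populate_node(const struct demo_token *t,
				 size_t pos, size_t n, int depth,
				 struct demo_node *np)
{
	np->name = t[pos].name;
	np->depth = depth;
	np->nprops = 0;

	return demo_populate_properties(t, pos + 1, n, np);
}

/* Find each node in turn and call the node helper on it. */
static size_t demo_unflatten(const struct demo_token *t, size_t n,
			     struct demo_node *out, size_t max)
{
	size_t pos = 0, count = 0;
	int depth = 0;

	while (pos < n) {
		if (t[pos].type == DEMO_BEGIN) {
			if (count == max)
				break;
			pos = demo_populate_node(t, pos, n, depth,
						 &out[count++]);
			depth++;
		} else {
			if (t[pos].type == DEMO_END)
				depth--;
			pos++;
		}
	}

	return count;
}
```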

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
---
 drivers/of/fdt.c | 249 ++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 147 insertions(+), 102 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 3349d2a..d031c78 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -161,39 +161,127 @@ static void *unflatten_dt_alloc(void **mem, unsigned long size,
 	return res;
 }
 
-/**
- * unflatten_dt_node - Alloc and populate a device_node from the flat tree
- * @blob: The parent device tree blob
- * @mem: Memory chunk to use for allocating device nodes and properties
- * @poffset: pointer to node in flat tree
- * @dad: Parent struct device_node
- * @nodepp: The device_node tree created by the call
- * @fpsize: Size of the node path up at the current depth.
- * @dryrun: If true, do not allocate device nodes but still calculate needed
- * memory size
- */
-static void * unflatten_dt_node(const void *blob,
-				void *mem,
-				int *poffset,
-				struct device_node *dad,
-				struct device_node **nodepp,
-				unsigned long fpsize,
+static void populate_properties(const void *blob,
+				int offset,
+				void **mem,
+				struct device_node *np,
+				const char *nodename,
 				bool dryrun)
 {
-	const __be32 *p;
+	struct property *pp, **pprev = NULL;
+	int cur;
+	bool has_name = false;
+
+	pprev = &np->properties;
+	for (cur = fdt_first_property_offset(blob, offset);
+	     cur >= 0;
+	     cur = fdt_next_property_offset(blob, cur)) {
+		const __be32 *val;
+		const char *pname;
+		u32 sz;
+
+		val = fdt_getprop_by_offset(blob, cur, &pname, &sz);
+		if (!val) {
+			pr_warn("%s: Cannot locate property at 0x%x\n",
+				__func__, cur);
+			continue;
+		}
+
+		if (!pname) {
+			pr_warn("%s: Cannot find property name at 0x%x\n",
+				__func__, cur);
+			continue;
+		}
+
+		if (!strcmp(pname, "name"))
+			has_name = true;
+
+		pp = unflatten_dt_alloc(mem, sizeof(struct property),
+					__alignof__(struct property));
+		if (dryrun)
+			continue;
+
+		/* We accept flattened tree phandles either in
+		 * ePAPR-style "phandle" properties, or the
+		 * legacy "linux,phandle" properties.  If both
+		 * appear and have different values, things
+		 * will get weird. Don't do that.
+		 */
+		if (!strcmp(pname, "phandle") ||
+		    !strcmp(pname, "linux,phandle")) {
+			if (!np->phandle)
+				np->phandle = be32_to_cpup(val);
+		}
+
+		/* And we process the "ibm,phandle" property
+		 * used in pSeries dynamic device tree
+		 * stuff
+		 */
+		if (!strcmp(pname, "ibm,phandle"))
+			np->phandle = be32_to_cpup(val);
+
+		pp->name   = (char *)pname;
+		pp->length = sz;
+		pp->value  = (__be32 *)val;
+		*pprev     = pp;
+		pprev      = &pp->next;
+	}
+
+	/* With version 0x10 we may not have the name property,
+	 * recreate it here from the unit name if absent
+	 */
+	if (!has_name) {
+		const char *p = nodename, *ps = p, *pa = NULL;
+		int len;
+
+		while (*p) {
+			if ((*p) == '@')
+				pa = p;
+			else if ((*p) == '/')
+				ps = p + 1;
+			p++;
+		}
+
+		if (pa < ps)
+			pa = p;
+		len = (pa - ps) + 1;
+		pp = unflatten_dt_alloc(mem, sizeof(struct property) + len,
+					__alignof__(struct property));
+		if (!dryrun) {
+			pp->name   = "name";
+			pp->length = len;
+			pp->value  = pp + 1;
+			*pprev     = pp;
+			pprev      = &pp->next;
+			memcpy(pp->value, ps, len - 1);
+			((char *)pp->value)[len - 1] = 0;
+			pr_debug("fixed up name for %s -> %s\n",
+				 nodename, (char *)pp->value);
+		}
+	}
+
+	if (!dryrun)
+		*pprev = NULL;
+}
+
+static unsigned long populate_node(const void *blob,
+				   int offset,
+				   void **mem,
+				   struct device_node *dad,
+				   unsigned long fpsize,
+				   struct device_node **pnp,
+				   bool dryrun)
+{
 	struct device_node *np;
-	struct property *pp, **prev_pp = NULL;
 	const char *pathp;
 	unsigned int l, allocl;
-	static int depth;
-	int old_depth;
-	int offset;
-	int has_name = 0;
 	int new_format = 0;
 
-	pathp = fdt_get_name(blob, *poffset, &l);
-	if (!pathp)
-		return mem;
+	pathp = fdt_get_name(blob, offset, &l);
+	if (!pathp) {
+		*pnp = NULL;
+		return 0;
+	}
 
 	allocl = ++l;
 
@@ -223,7 +311,7 @@ static void * unflatten_dt_node(const void *blob,
 		}
 	}
 
-	np = unflatten_dt_alloc(&mem, sizeof(struct device_node) + allocl,
+	np = unflatten_dt_alloc(mem, sizeof(struct device_node) + allocl,
 				__alignof__(struct device_node));
 	if (!dryrun) {
 		char *fn;
@@ -246,89 +334,15 @@ static void * unflatten_dt_node(const void *blob,
 		}
 		memcpy(fn, pathp, l);
 
-		prev_pp = &np->properties;
 		if (dad != NULL) {
 			np->parent = dad;
 			np->sibling = dad->child;
 			dad->child = np;
 		}
 	}
-	/* process properties */
-	for (offset = fdt_first_property_offset(blob, *poffset);
-	     (offset >= 0);
-	     (offset = fdt_next_property_offset(blob, offset))) {
-		const char *pname;
-		u32 sz;
 
-		if (!(p = fdt_getprop_by_offset(blob, offset, &pname, &sz))) {
-			offset = -FDT_ERR_INTERNAL;
-			break;
-		}
-
-		if (pname == NULL) {
-			pr_info("Can't find property name in list !\n");
-			break;
-		}
-		if (strcmp(pname, "name") == 0)
-			has_name = 1;
-		pp = unflatten_dt_alloc(&mem, sizeof(struct property),
-					__alignof__(struct property));
-		if (!dryrun) {
-			/* We accept flattened tree phandles either in
-			 * ePAPR-style "phandle" properties, or the
-			 * legacy "linux,phandle" properties.  If both
-			 * appear and have different values, things
-			 * will get weird.  Don't do that. */
-			if ((strcmp(pname, "phandle") == 0) ||
-			    (strcmp(pname, "linux,phandle") == 0)) {
-				if (np->phandle == 0)
-					np->phandle = be32_to_cpup(p);
-			}
-			/* And we process the "ibm,phandle" property
-			 * used in pSeries dynamic device tree
-			 * stuff */
-			if (strcmp(pname, "ibm,phandle") == 0)
-				np->phandle = be32_to_cpup(p);
-			pp->name = (char *)pname;
-			pp->length = sz;
-			pp->value = (__be32 *)p;
-			*prev_pp = pp;
-			prev_pp = &pp->next;
-		}
-	}
-	/* with version 0x10 we may not have the name property, recreate
-	 * it here from the unit name if absent
-	 */
-	if (!has_name) {
-		const char *p1 = pathp, *ps = pathp, *pa = NULL;
-		int sz;
-
-		while (*p1) {
-			if ((*p1) == '@')
-				pa = p1;
-			if ((*p1) == '/')
-				ps = p1 + 1;
-			p1++;
-		}
-		if (pa < ps)
-			pa = p1;
-		sz = (pa - ps) + 1;
-		pp = unflatten_dt_alloc(&mem, sizeof(struct property) + sz,
-					__alignof__(struct property));
-		if (!dryrun) {
-			pp->name = "name";
-			pp->length = sz;
-			pp->value = pp + 1;
-			*prev_pp = pp;
-			prev_pp = &pp->next;
-			memcpy(pp->value, ps, sz - 1);
-			((char *)pp->value)[sz - 1] = 0;
-			pr_debug("fixed up name for %s -> %s\n", pathp,
-				(char *)pp->value);
-		}
-	}
+	populate_properties(blob, offset, mem, np, pathp, dryrun);
 	if (!dryrun) {
-		*prev_pp = NULL;
 		np->name = of_get_property(np, "name", NULL);
 		np->type = of_get_property(np, "device_type", NULL);
 
@@ -338,6 +352,37 @@ static void * unflatten_dt_node(const void *blob,
 			np->type = "<NULL>";
 	}
 
+	*pnp = np;
+	return fpsize;
+}
+
+/**
+ * unflatten_dt_node - Alloc and populate a device_node from the flat tree
+ * @blob: The parent device tree blob
+ * @mem: Memory chunk to use for allocating device nodes and properties
+ * @poffset: pointer to node in flat tree
+ * @dad: Parent struct device_node
+ * @nodepp: The device_node tree created by the call
+ * @fpsize: Size of the node path up at the current depth.
+ * @dryrun: If true, do not allocate device nodes but still calculate needed
+ * memory size
+ */
+static void *unflatten_dt_node(const void *blob,
+			       void *mem,
+			       int *poffset,
+			       struct device_node *dad,
+			       struct device_node **nodepp,
+			       unsigned long fpsize,
+			       bool dryrun)
+{
+	struct device_node *np;
+	static int depth;
+	int old_depth;
+
+	fpsize = populate_node(blob, *poffset, &mem, dad, fpsize, &np, dryrun);
+	if (!fpsize)
+		return mem;
+
 	old_depth = depth;
 	*poffset = fdt_next_node(blob, *poffset, &depth);
 	if (depth < 0)
-- 
2.1.0
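
The "name" fix-up done in populate_properties() above can be sketched in
userspace C as follows. This is only an illustration under assumptions:
name_from_path() and its buffer-based interface are hypothetical and not
part of the patch, which instead allocates the property from the
unflatten memory chunk.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the node name out of a path, e.g. "/soc/serial@1000" -> "serial".
 * The name is the last path component with any "@unit-address" removed.
 * Returns the name length including the trailing NUL, or 0 if dst is
 * too small to hold it.
 */
static size_t name_from_path(const char *path, char *dst, size_t dstlen)
{
	const char *p = path, *ps = path, *pa = NULL;
	size_t len;

	while (*p) {
		if (*p == '@')
			pa = p;		/* last '@' seen so far */
		else if (*p == '/')
			ps = p + 1;	/* start of the last component */
		p++;
	}

	/* No '@' inside the last component: the name runs to the end */
	if (pa == NULL || pa < ps)
		pa = p;

	len = (size_t)(pa - ps) + 1;
	if (len > dstlen)
		return 0;

	memcpy(dst, ps, len - 1);
	dst[len - 1] = '\0';
	return len;
}
```

The explicit NULL check on "pa" is a choice made for this sketch so that
no pointer comparison happens when the path has no '@' at all.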

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 17/22] drivers/of: Avoid recursively calling unflatten_dt_node()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
@ 2016-05-03 13:22   ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: devicetree, alistair, aik, linux-pci, Gavin Shan, robherring2,
	bhelgaas, dja

In the current implementation, unflatten_dt_node() is called recursively
to unflatten the device nodes in an FDT blob. This puts pressure on the
limited stack, especially once the function is adopted to unflatten a
device sub-tree that may have multiple root nodes. In that case, we can
run out of stack and the system fails to boot.

In order to reuse the function to unflatten a device sub-tree, this
avoids the recursion so that all device nodes are unflattened in a single
call to unflatten_dt_node(). Two arrays are introduced to track the
parent path size and the device node at each level of depth; both are
used when the device node on the next level of depth is unflattened. All
device nodes nested more than 64 levels deep are dropped, in the hope
that the system can still boot with the partial device tree.

Also, the parameters "poffset" and "fpsize" are no longer needed and are
dropped, and "dryrun" is derived from "mem == NULL". Besides, the return
value of the function is changed to the size of memory consumed by the
unflattened device tree, or a negative error code.
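
The depth-indexed bookkeeping described above can be sketched in
userspace C roughly as follows. This is a minimal illustration, not the
kernel code: struct flat_entry, struct node and build_tree() are
hypothetical stand-ins for the FDT blob, struct device_node and
unflatten_dt_node(). Only the per-depth node array is modelled; the
per-depth path-size array is left out. The input is assumed to be in
pre-order with the first node at depth 1.

```c
#include <stddef.h>

#define MAX_DEPTH 64	/* plays the role of FDT_MAX_DEPTH */

/* Hypothetical stand-in for one node entry in the flat blob */
struct flat_entry {
	int depth;
	const char *name;
};

/* Hypothetical stand-in for struct device_node */
struct node {
	const char *name;
	struct node *parent, *child, *sibling;
};

/*
 * Build a tree from pre-order entries without recursion.
 * nps[d] remembers the most recent node seen at depth d, so an entry
 * at depth d finds its parent in nps[d - 1]. Entries at depth
 * MAX_DEPTH or deeper are skipped.
 * Returns the number of nodes taken from pool.
 */
static int build_tree(const struct flat_entry *in, int n,
		      struct node *pool, struct node *dad)
{
	struct node *nps[MAX_DEPTH] = { NULL };
	int i, used = 0;

	nps[0] = dad;
	for (i = 0; i < n; i++) {
		int d = in[i].depth;
		struct node *np;

		if (d < 1 || d >= MAX_DEPTH)
			continue;

		np = &pool[used++];
		np->name = in[i].name;
		np->parent = nps[d - 1];
		np->child = NULL;
		np->sibling = NULL;
		if (np->parent) {
			/* Prepend, so the child list is in reverse order */
			np->sibling = np->parent->child;
			np->parent->child = np;
		}
		nps[d] = np;
	}

	return used;
}
```

Because children are prepended, each child list comes out in reverse
order. That is the reason the patch reverses the lists once the walk has
finished.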

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
---
 drivers/of/fdt.c | 122 +++++++++++++++++++++++++++++++++----------------------
 1 file changed, 74 insertions(+), 48 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index d031c78..d1d5309 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -356,63 +356,90 @@ static unsigned long populate_node(const void *blob,
 	return fpsize;
 }
 
+static void reverse_nodes(struct device_node *parent)
+{
+	struct device_node *child, *next;
+
+	/* In-depth first */
+	child = parent->child;
+	while (child) {
+		reverse_nodes(child);
+
+		child = child->sibling;
+	}
+
+	/* Reverse the nodes in the child list */
+	child = parent->child;
+	parent->child = NULL;
+	while (child) {
+		next = child->sibling;
+
+		child->sibling = parent->child;
+		parent->child = child;
+		child = next;
+	}
+}
+
 /**
  * unflatten_dt_node - Alloc and populate a device_node from the flat tree
  * @blob: The parent device tree blob
  * @mem: Memory chunk to use for allocating device nodes and properties
- * @poffset: pointer to node in flat tree
  * @dad: Parent struct device_node
  * @nodepp: The device_node tree created by the call
- * @fpsize: Size of the node path up at the current depth.
- * @dryrun: If true, do not allocate device nodes but still calculate needed
- * memory size
+ *
+ * It returns the size of unflattened device tree or error code
  */
-static void *unflatten_dt_node(const void *blob,
-			       void *mem,
-			       int *poffset,
-			       struct device_node *dad,
-			       struct device_node **nodepp,
-			       unsigned long fpsize,
-			       bool dryrun)
+static int unflatten_dt_node(const void *blob,
+			     void *mem,
+			     struct device_node *dad,
+			     struct device_node **nodepp)
 {
-	struct device_node *np;
-	static int depth;
-	int old_depth;
+	struct device_node *root;
+	int offset = 0, depth = 0;
+#define FDT_MAX_DEPTH	64
+	unsigned long fpsizes[FDT_MAX_DEPTH];
+	struct device_node *nps[FDT_MAX_DEPTH];
+	void *base = mem;
+	bool dryrun = !base;
 
-	fpsize = populate_node(blob, *poffset, &mem, dad, fpsize, &np, dryrun);
-	if (!fpsize)
-		return mem;
+	if (nodepp)
+		*nodepp = NULL;
+
+	root = dad;
+	fpsizes[depth] = dad ? strlen(of_node_full_name(dad)) : 0;
+	nps[depth++] = dad;
+	for (offset = 0;
+	     offset >= 0;
+	     offset = fdt_next_node(blob, offset, &depth)) {
+		if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))
+			continue;
 
-	old_depth = depth;
-	*poffset = fdt_next_node(blob, *poffset, &depth);
-	if (depth < 0)
-		depth = 0;
-	while (*poffset > 0 && depth > old_depth)
-		mem = unflatten_dt_node(blob, mem, poffset, np, NULL,
-					fpsize, dryrun);
+		fpsizes[depth] = populate_node(blob, offset, &mem,
+					       nps[depth - 1],
+					       fpsizes[depth - 1],
+					       &nps[depth], dryrun);
+		if (!fpsizes[depth])
+			return mem - base;
+
+		if (!dryrun && nodepp && !*nodepp)
+			*nodepp = nps[depth];
+		if (!dryrun && !root)
+			root = nps[depth];
+	}
 
-	if (*poffset < 0 && *poffset != -FDT_ERR_NOTFOUND)
-		pr_err("unflatten: error %d processing FDT\n", *poffset);
+	if (offset < 0 && offset != -FDT_ERR_NOTFOUND) {
+		pr_err("%s: Error %d processing FDT\n", __func__, offset);
+		return -EINVAL;
+	}
 
 	/*
 	 * Reverse the child list. Some drivers assumes node order matches .dts
 	 * node order
 	 */
-	if (!dryrun && np->child) {
-		struct device_node *child = np->child;
-		np->child = NULL;
-		while (child) {
-			struct device_node *next = child->sibling;
-			child->sibling = np->child;
-			np->child = child;
-			child = next;
-		}
-	}
-
-	if (nodepp)
-		*nodepp = np;
+	if (!dryrun)
+		reverse_nodes(root);
 
-	return mem;
+	return mem - base;
 }
 
 /**
@@ -431,8 +458,7 @@ static void __unflatten_device_tree(const void *blob,
 			     struct device_node **mynodes,
 			     void * (*dt_alloc)(u64 size, u64 align))
 {
-	unsigned long size;
-	int start;
+	int size;
 	void *mem;
 
 	pr_debug(" -> unflatten_device_tree()\n");
@@ -453,11 +479,12 @@ static void __unflatten_device_tree(const void *blob,
 	}
 
 	/* First pass, scan for size */
-	start = 0;
-	size = (unsigned long)unflatten_dt_node(blob, NULL, &start, NULL, NULL, 0, true);
-	size = ALIGN(size, 4);
+	size = unflatten_dt_node(blob, NULL, NULL, NULL);
+	if (size < 0)
+		return;
 
-	pr_debug("  size is %lx, allocating...\n", size);
+	size = ALIGN(size, 4);
+	pr_debug("  size is %d, allocating...\n", size);
 
 	/* Allocate memory for the expanded device tree */
 	mem = dt_alloc(size + 4, __alignof__(struct device_node));
@@ -468,8 +495,7 @@ static void __unflatten_device_tree(const void *blob,
 	pr_debug("  unflattening %p...\n", mem);
 
 	/* Second pass, do actual unflattening */
-	start = 0;
-	unflatten_dt_node(blob, mem, &start, NULL, mynodes, 0, false);
+	unflatten_dt_node(blob, mem, NULL, mynodes);
 	if (be32_to_cpup(mem + size) != 0xdeadbeef)
 		pr_warning("End of tree marker overwritten: %08x\n",
 			   be32_to_cpup(mem + size));
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 18/22] drivers/of: Rename unflatten_dt_node()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
@ 2016-05-03 13:22   ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: devicetree, alistair, aik, linux-pci, Gavin Shan, robherring2,
	bhelgaas, dja

This renames unflatten_dt_node() to unflatten_dt_nodes() as it
populates multiple device nodes from the FDT blob. No functional
changes are introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
---
 drivers/of/fdt.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index d1d5309..7850150 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -381,7 +381,7 @@ static void reverse_nodes(struct device_node *parent)
 }
 
 /**
- * unflatten_dt_node - Alloc and populate a device_node from the flat tree
+ * unflatten_dt_nodes - Alloc and populate a device_node from the flat tree
  * @blob: The parent device tree blob
  * @mem: Memory chunk to use for allocating device nodes and properties
  * @dad: Parent struct device_node
@@ -389,10 +389,10 @@ static void reverse_nodes(struct device_node *parent)
  *
  * It returns the size of unflattened device tree or error code
  */
-static int unflatten_dt_node(const void *blob,
-			     void *mem,
-			     struct device_node *dad,
-			     struct device_node **nodepp)
+static int unflatten_dt_nodes(const void *blob,
+			      void *mem,
+			      struct device_node *dad,
+			      struct device_node **nodepp)
 {
 	struct device_node *root;
 	int offset = 0, depth = 0;
@@ -479,7 +479,7 @@ static void __unflatten_device_tree(const void *blob,
 	}
 
 	/* First pass, scan for size */
-	size = unflatten_dt_node(blob, NULL, NULL, NULL);
+	size = unflatten_dt_nodes(blob, NULL, NULL, NULL);
 	if (size < 0)
 		return;
 
@@ -495,7 +495,7 @@ static void __unflatten_device_tree(const void *blob,
 	pr_debug("  unflattening %p...\n", mem);
 
 	/* Second pass, do actual unflattening */
-	unflatten_dt_node(blob, mem, NULL, mynodes);
+	unflatten_dt_nodes(blob, mem, NULL, mynodes);
 	if (be32_to_cpup(mem + size) != 0xdeadbeef)
 		pr_warning("End of tree marker overwritten: %08x\n",
 			   be32_to_cpup(mem + size));
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 19/22] drivers/of: Specify parent node in of_fdt_unflatten_tree()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (17 preceding siblings ...)
  2016-05-03 13:22   ` Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
       [not found] ` <1462281773-26438-1-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan, Jyri Sarha

This adds one more argument to of_fdt_unflatten_tree() to specify
the parent node of the FDT blob that is going to be unflattened.
As a result, the function can be used by the PowerNV PCI hotplug
driver to unflatten an FDT blob that represents a device sub-tree.

Cc: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Jyri Sarha <jsarha@ti.com>
---
 drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c |  2 +-
 drivers/of/fdt.c                             | 14 ++++++++++----
 drivers/of/unittest.c                        |  2 +-
 include/linux/of_fdt.h                       |  1 +
 4 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c b/drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c
index 106679b..f9c79da 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c
@@ -157,7 +157,7 @@ struct device_node * __init tilcdc_get_overlay(struct kfree_table *kft)
 	if (!overlay_data || kfree_table_add(kft, overlay_data))
 		return NULL;
 
-	of_fdt_unflatten_tree(overlay_data, &overlay);
+	of_fdt_unflatten_tree(overlay_data, NULL, &overlay);
 	if (!overlay) {
 		pr_warn("%s: Unfattening overlay tree failed\n", __func__);
 		return NULL;
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 7850150..337fb1a 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -450,11 +450,13 @@ static int unflatten_dt_nodes(const void *blob,
  * pointers of the nodes so the normal device-tree walking functions
  * can be used.
  * @blob: The blob to expand
+ * @dad: Parent device node
  * @mynodes: The device_node tree created by the call
  * @dt_alloc: An allocator that provides a virtual address to memory
  * for the resulting tree
  */
 static void __unflatten_device_tree(const void *blob,
+			     struct device_node *dad,
 			     struct device_node **mynodes,
 			     void * (*dt_alloc)(u64 size, u64 align))
 {
@@ -479,7 +481,7 @@ static void __unflatten_device_tree(const void *blob,
 	}
 
 	/* First pass, scan for size */
-	size = unflatten_dt_nodes(blob, NULL, NULL, NULL);
+	size = unflatten_dt_nodes(blob, NULL, dad, NULL);
 	if (size < 0)
 		return;
 
@@ -495,7 +497,7 @@ static void __unflatten_device_tree(const void *blob,
 	pr_debug("  unflattening %p...\n", mem);
 
 	/* Second pass, do actual unflattening */
-	unflatten_dt_nodes(blob, mem, NULL, mynodes);
+	unflatten_dt_nodes(blob, mem, dad, mynodes);
 	if (be32_to_cpup(mem + size) != 0xdeadbeef)
 		pr_warning("End of tree marker overwritten: %08x\n",
 			   be32_to_cpup(mem + size));
@@ -512,6 +514,9 @@ static DEFINE_MUTEX(of_fdt_unflatten_mutex);
 
 /**
  * of_fdt_unflatten_tree - create tree of device_nodes from flat blob
+ * @blob: Flat device tree blob
+ * @dad: Parent device node
+ * @mynodes: The device tree created by the call
  *
  * unflattens the device-tree passed by the firmware, creating the
  * tree of struct device_node. It also fills the "name" and "type"
@@ -519,10 +524,11 @@ static DEFINE_MUTEX(of_fdt_unflatten_mutex);
  * can be used.
  */
 void of_fdt_unflatten_tree(const unsigned long *blob,
+			struct device_node *dad,
 			struct device_node **mynodes)
 {
 	mutex_lock(&of_fdt_unflatten_mutex);
-	__unflatten_device_tree(blob, mynodes, &kernel_tree_alloc);
+	__unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
 	mutex_unlock(&of_fdt_unflatten_mutex);
 }
 EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
@@ -1189,7 +1195,7 @@ bool __init early_init_dt_scan(void *params)
  */
 void __init unflatten_device_tree(void)
 {
-	__unflatten_device_tree(initial_boot_params, &of_root,
+	__unflatten_device_tree(initial_boot_params, NULL, &of_root,
 				early_init_dt_alloc_memory_arch);
 
 	/* Get pointer to "/chosen" and "/aliases" nodes for use everywhere */
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index e986e6e..8c0f11c 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -921,7 +921,7 @@ static int __init unittest_data_add(void)
 			"not running tests\n", __func__);
 		return -ENOMEM;
 	}
-	of_fdt_unflatten_tree(unittest_data, &unittest_data_node);
+	of_fdt_unflatten_tree(unittest_data, NULL, &unittest_data_node);
 	if (!unittest_data_node) {
 		pr_warn("%s: No tree to attach; not running tests\n", __func__);
 		return -ENODATA;
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index 2fbe868..1bffcbd 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -38,6 +38,7 @@ extern bool of_fdt_is_big_endian(const void *blob,
 extern int of_fdt_match(const void *blob, unsigned long node,
 			const char *const *compat);
 extern void of_fdt_unflatten_tree(const unsigned long *blob,
+			       struct device_node *dad,
 			       struct device_node **mynodes);
 
 /* TBD: Temporary export of fdt globals - remove when code fully merged */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 20/22] drivers/of: Return allocated memory from of_fdt_unflatten_tree()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
@ 2016-05-03 13:22     ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
                       ` (20 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

This returns the allocated memory chunk, which stores the unflattened
device tree, from of_fdt_unflatten_tree() so that the chunk can be
released on demand by the PowerNV PCI hotplug driver.
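
The two-pass scheme behind this (measure with a dry run, allocate one
chunk, fill it, then hand the chunk back to the caller) can be sketched
in userspace C as follows. This is an illustration under assumptions,
not the kernel code: struct arena, arena_alloc(), pack_strings() and
pack_two_pass() are hypothetical names, and plain strings stand in for
device nodes and properties.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define END_MARKER 0xdeadbeefu

struct arena {
	unsigned char *base;	/* NULL means dry run: measure only */
	size_t used;
};

/* Bump allocator; align must be a power of two */
static void *arena_alloc(struct arena *a, size_t size, size_t align)
{
	size_t off = (a->used + align - 1) & ~(align - 1);

	a->used = off + size;
	return a->base ? a->base + off : NULL;
}

/* Copy n strings into the arena; a dry run only accumulates sizes */
static size_t pack_strings(struct arena *a, const char **strs, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		size_t len = strlen(strs[i]) + 1;
		char *dst = arena_alloc(a, len, 1);

		if (dst)
			memcpy(dst, strs[i], len);
	}

	return a->used;
}

/* Returns the chunk so that the caller can free() it, NULL on failure */
static void *pack_two_pass(const char **strs, int n, size_t *sizep)
{
	struct arena a = { NULL, 0 };
	uint32_t marker = END_MARKER;
	size_t size;

	/* First pass, scan for size */
	size = pack_strings(&a, strs, n);
	size = (size + 3) & ~(size_t)3;

	a.base = malloc(size + sizeof(marker));
	if (!a.base)
		return NULL;

	/* End marker placed right after the payload to catch overruns */
	memcpy(a.base + size, &marker, sizeof(marker));

	/* Second pass, do the actual copy */
	a.used = 0;
	pack_strings(&a, strs, n);

	*sizep = size;
	return a.base;
}
```

Returning the chunk rather than void is what lets a caller release the
whole unflattened result with a single free() once it is no longer
needed.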

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
---
 drivers/of/fdt.c       | 33 ++++++++++++++++++++++-----------
 include/linux/of_fdt.h |  6 +++---
 2 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 337fb1a..c95054c 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -454,11 +454,14 @@ static int unflatten_dt_nodes(const void *blob,
  * @mynodes: The device_node tree created by the call
  * @dt_alloc: An allocator that provides a virtual address to memory
  * for the resulting tree
+ *
+ * Returns NULL on failure or the memory chunk containing the unflattened
+ * device tree on success.
  */
-static void __unflatten_device_tree(const void *blob,
-			     struct device_node *dad,
-			     struct device_node **mynodes,
-			     void * (*dt_alloc)(u64 size, u64 align))
+static void *__unflatten_device_tree(const void *blob,
+				     struct device_node *dad,
+				     struct device_node **mynodes,
+				     void *(*dt_alloc)(u64 size, u64 align))
 {
 	int size;
 	void *mem;
@@ -467,7 +470,7 @@ static void __unflatten_device_tree(const void *blob,
 
 	if (!blob) {
 		pr_debug("No device tree pointer\n");
-		return;
+		return NULL;
 	}
 
 	pr_debug("Unflattening device tree:\n");
@@ -477,13 +480,13 @@ static void __unflatten_device_tree(const void *blob,
 
 	if (fdt_check_header(blob)) {
 		pr_err("Invalid device tree blob header\n");
-		return;
+		return NULL;
 	}
 
 	/* First pass, scan for size */
 	size = unflatten_dt_nodes(blob, NULL, dad, NULL);
 	if (size < 0)
-		return;
+		return NULL;
 
 	size = ALIGN(size, 4);
 	pr_debug("  size is %d, allocating...\n", size);
@@ -503,6 +506,7 @@ static void __unflatten_device_tree(const void *blob,
 			   be32_to_cpup(mem + size));
 
 	pr_debug(" <- unflatten_device_tree()\n");
+	return mem;
 }
 
 static void *kernel_tree_alloc(u64 size, u64 align)
@@ -522,14 +526,21 @@ static DEFINE_MUTEX(of_fdt_unflatten_mutex);
  * tree of struct device_node. It also fills the "name" and "type"
  * pointers of the nodes so the normal device-tree walking functions
  * can be used.
+ *
+ * Returns NULL on failure or the memory chunk containing the unflattened
+ * device tree on success.
  */
-void of_fdt_unflatten_tree(const unsigned long *blob,
-			struct device_node *dad,
-			struct device_node **mynodes)
+void *of_fdt_unflatten_tree(const unsigned long *blob,
+			    struct device_node *dad,
+			    struct device_node **mynodes)
 {
+	void *mem;
+
 	mutex_lock(&of_fdt_unflatten_mutex);
-	__unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
+	mem = __unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
 	mutex_unlock(&of_fdt_unflatten_mutex);
+
+	return mem;
 }
 EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
 
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index 1bffcbd..901ec01 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -37,9 +37,9 @@ extern bool of_fdt_is_big_endian(const void *blob,
 				 unsigned long node);
 extern int of_fdt_match(const void *blob, unsigned long node,
 			const char *const *compat);
-extern void of_fdt_unflatten_tree(const unsigned long *blob,
-			       struct device_node *dad,
-			       struct device_node **mynodes);
+extern void *of_fdt_unflatten_tree(const unsigned long *blob,
+				   struct device_node *dad,
+				   struct device_node **mynodes);
 
 /* TBD: Temporary export of fdt globals - remove when code fully merged */
 extern int __initdata dt_root_addr_cells;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 20/22] drivers/of: Return allocated memory from of_fdt_unflatten_tree()
@ 2016-05-03 13:22     ` Gavin Shan
  0 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

Return the allocated memory chunk that stores the unflattened device
tree from of_fdt_unflatten_tree(), so that the chunk can be released
on demand by the PowerNV PCI hotplug driver.
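
For illustration only (not part of this patch): a minimal userspace
sketch of the ownership pattern that the new return value enables. The
names demo_unflatten(), demo_state, demo_power_on() and demo_power_off()
are hypothetical, and malloc()/free() stand in for the kernel allocator.
The caller keeps the returned chunk and releases it when the sub-tree
goes away.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for of_fdt_unflatten_tree(): returns the memory
 * chunk that backs the unflattened tree, or NULL on failure.
 */
void *demo_unflatten(const void *blob, size_t size)
{
	void *mem;

	if (!blob || !size)
		return NULL;

	mem = malloc(size);
	if (!mem)
		return NULL;

	/* The real function builds device nodes here; we only copy. */
	memcpy(mem, blob, size);
	return mem;
}

/* Hypothetical per-slot state that remembers the chunk. */
struct demo_state {
	void *dt;
};

/* Power-on path: unflatten and remember the chunk for later release. */
int demo_power_on(struct demo_state *s, const void *blob, size_t size)
{
	assert(s);
	s->dt = demo_unflatten(blob, size);
	return s->dt ? 0 : -1;
}

/* Power-off path: release the chunk on demand. */
void demo_power_off(struct demo_state *s)
{
	assert(s);
	free(s->dt);
	s->dt = NULL;
}
```

With a void return type the caller had no handle to free, which is why
the chunk is handed back here.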

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Rob Herring <robh@kernel.org>
---
 drivers/of/fdt.c       | 33 ++++++++++++++++++++++-----------
 include/linux/of_fdt.h |  6 +++---
 2 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 337fb1a..c95054c 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -454,11 +454,14 @@ static int unflatten_dt_nodes(const void *blob,
  * @mynodes: The device_node tree created by the call
  * @dt_alloc: An allocator that provides a virtual address to memory
  * for the resulting tree
+ *
+ * Returns NULL on failure or the memory chunk containing the unflattened
+ * device tree on success.
  */
-static void __unflatten_device_tree(const void *blob,
-			     struct device_node *dad,
-			     struct device_node **mynodes,
-			     void * (*dt_alloc)(u64 size, u64 align))
+static void *__unflatten_device_tree(const void *blob,
+				     struct device_node *dad,
+				     struct device_node **mynodes,
+				     void *(*dt_alloc)(u64 size, u64 align))
 {
 	int size;
 	void *mem;
@@ -467,7 +470,7 @@ static void __unflatten_device_tree(const void *blob,
 
 	if (!blob) {
 		pr_debug("No device tree pointer\n");
-		return;
+		return NULL;
 	}
 
 	pr_debug("Unflattening device tree:\n");
@@ -477,13 +480,13 @@ static void __unflatten_device_tree(const void *blob,
 
 	if (fdt_check_header(blob)) {
 		pr_err("Invalid device tree blob header\n");
-		return;
+		return NULL;
 	}
 
 	/* First pass, scan for size */
 	size = unflatten_dt_nodes(blob, NULL, dad, NULL);
 	if (size < 0)
-		return;
+		return NULL;
 
 	size = ALIGN(size, 4);
 	pr_debug("  size is %d, allocating...\n", size);
@@ -503,6 +506,7 @@ static void __unflatten_device_tree(const void *blob,
 			   be32_to_cpup(mem + size));
 
 	pr_debug(" <- unflatten_device_tree()\n");
+	return mem;
 }
 
 static void *kernel_tree_alloc(u64 size, u64 align)
@@ -522,14 +526,21 @@ static DEFINE_MUTEX(of_fdt_unflatten_mutex);
  * tree of struct device_node. It also fills the "name" and "type"
  * pointers of the nodes so the normal device-tree walking functions
  * can be used.
+ *
+ * Returns NULL on failure or the memory chunk containing the unflattened
+ * device tree on success.
  */
-void of_fdt_unflatten_tree(const unsigned long *blob,
-			struct device_node *dad,
-			struct device_node **mynodes)
+void *of_fdt_unflatten_tree(const unsigned long *blob,
+			    struct device_node *dad,
+			    struct device_node **mynodes)
 {
+	void *mem;
+
 	mutex_lock(&of_fdt_unflatten_mutex);
-	__unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
+	mem = __unflatten_device_tree(blob, dad, mynodes, &kernel_tree_alloc);
 	mutex_unlock(&of_fdt_unflatten_mutex);
+
+	return mem;
 }
 EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
 
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index 1bffcbd..901ec01 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -37,9 +37,9 @@ extern bool of_fdt_is_big_endian(const void *blob,
 				 unsigned long node);
 extern int of_fdt_match(const void *blob, unsigned long node,
 			const char *const *compat);
-extern void of_fdt_unflatten_tree(const unsigned long *blob,
-			       struct device_node *dad,
-			       struct device_node **mynodes);
+extern void *of_fdt_unflatten_tree(const unsigned long *blob,
+				   struct device_node *dad,
+				   struct device_node **mynodes);
 
 /* TBD: Temporary export of fdt globals - remove when code fully merged */
 extern int __initdata dt_root_addr_cells;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 21/22] drivers/of: Export of_detach_node()
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
@ 2016-05-03 13:22   ` Gavin Shan
  2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: devicetree, alistair, aik, linux-pci, Gavin Shan, robherring2,
	bhelgaas, dja

Export of_detach_node() so that the PowerPC PowerNV PCI hotplug
driver can use it. No functional changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 drivers/of/dynamic.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
index c647bd1..75ce30d 100644
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -311,6 +311,7 @@ int of_detach_node(struct device_node *np)
 
 	return rc;
 }
+EXPORT_SYMBOL_GPL(of_detach_node);
 
 /**
  * of_node_release() - release a dynamically allocated node
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v9 22/22] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
                   ` (20 preceding siblings ...)
  2016-05-03 13:22   ` Gavin Shan
@ 2016-05-03 13:22 ` Gavin Shan
       [not found]   ` <1462281773-26438-23-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  21 siblings, 1 reply; 41+ messages in thread
From: Gavin Shan @ 2016-05-03 13:22 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, aik, bhelgaas, robherring2,
	dja, alistair, Gavin Shan

This adds a standalone driver to support PCI hotplug on the PowerPC
PowerNV platform, which runs on top of skiboot firmware. The firmware
identifies hotpluggable slots and marks their device tree nodes with
the "ibm,slot-pluggable" and "ibm,reset-by-firmware" properties. The
driver scans the device tree nodes to create and register PCI hotplug
slots accordingly.

The PCI slots are organized as a tree: a PCI slot may have a parent
PCI slot, and a parent PCI slot may contain multiple child PCI slots.
At plug time, the parent PCI slot is populated before its children.
At unplug time, the child PCI slots are removed before their parent
PCI slot can be removed from the system.
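
For illustration only (not driver code): the ordering above amounts to a
pre-order walk when plugging and a post-order walk when unplugging. A
minimal userspace sketch, assuming a hypothetical demo_slot structure
and a text log in place of real slot registration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a hotplug slot; not the driver's struct. */
struct demo_slot {
	const char *name;
	struct demo_slot *children[4];
	size_t nr_children;
};

/* Append a slot name and a trailing space to the caller's log buffer. */
void demo_log(char *log, size_t len, const char *name)
{
	size_t used = strlen(log);

	assert(used < len);
	snprintf(log + used, len - used, "%s ", name);
}

/* Plug: handle the parent first, then each child (pre-order). */
void demo_register(const struct demo_slot *slot, char *log, size_t len)
{
	size_t i;

	demo_log(log, len, slot->name);
	for (i = 0; i < slot->nr_children; i++)
		demo_register(slot->children[i], log, len);
}

/* Unplug: handle every child first, then the parent (post-order). */
void demo_unregister(const struct demo_slot *slot, char *log, size_t len)
{
	size_t i;

	for (i = 0; i < slot->nr_children; i++)
		demo_unregister(slot->children[i], log, len);
	demo_log(log, len, slot->name);
}
```

For a parent "P" with children "C1" and "C2", demo_register() visits P
before C1 and C2, while demo_unregister() visits C1 and C2 before P.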

If the skiboot firmware doesn't support slot status retrieval, the PCI
slot device node won't have the "ibm,reset-by-firmware" property. In
that case, no valid PCI slots will be detected from the device tree.
The skiboot firmware doesn't yet export the capability to access
attention LEDs, so that support is left for future work.
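
For illustration only (not driver code): the detection rule can be
sketched as a predicate that requires both properties to be present and
non-zero. The demo_prop array is a hypothetical simplification of a
device node's property list; the driver itself reads the properties
from the device tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened view of one device node property. */
struct demo_prop {
	const char *name;
	unsigned int value;
};

/* Return the property's value, or 0 when the property is absent. */
unsigned int demo_read_prop(const struct demo_prop *props, size_t n,
			    const char *name)
{
	size_t i;

	assert(props || n == 0);
	for (i = 0; i < n; i++)
		if (strcmp(props[i].name, name) == 0)
			return props[i].value;

	return 0;
}

/* A node counts as a hotplug slot only if both properties are non-zero. */
bool demo_is_hotplug_slot(const struct demo_prop *props, size_t n)
{
	return demo_read_prop(props, n, "ibm,slot-pluggable") &&
	       demo_read_prop(props, n, "ibm,reset-by-firmware");
}
```

A node from older firmware that carries only "ibm,slot-pluggable" is
rejected by this predicate, which matches the behaviour described above.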

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/hotplug/Kconfig   |  13 +
 drivers/pci/hotplug/Makefile  |   3 +
 drivers/pci/hotplug/pnv_php.c | 869 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 885 insertions(+)
 create mode 100644 drivers/pci/hotplug/pnv_php.c

diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
index df8caec..aadce45 100644
--- a/drivers/pci/hotplug/Kconfig
+++ b/drivers/pci/hotplug/Kconfig
@@ -113,6 +113,19 @@ config HOTPLUG_PCI_SHPC
 
 	  When in doubt, say N.
 
+config HOTPLUG_PCI_POWERNV
+	tristate "PowerPC PowerNV PCI Hotplug driver"
+	depends on PPC_POWERNV && EEH
+	select OF_DYNAMIC
+	help
+	  Say Y here if you run a PowerPC PowerNV platform that supports
+	  PCI hotplug.
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called pnv-php.
+
+	  When in doubt, say N.
+
 config HOTPLUG_PCI_RPA
 	tristate "RPA PCI Hotplug driver"
 	depends on PPC_PSERIES && EEH
diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
index b616e75..e33cdda 100644
--- a/drivers/pci/hotplug/Makefile
+++ b/drivers/pci/hotplug/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_HOTPLUG_PCI_PCIE)		+= pciehp.o
 obj-$(CONFIG_HOTPLUG_PCI_CPCI_ZT5550)	+= cpcihp_zt5550.o
 obj-$(CONFIG_HOTPLUG_PCI_CPCI_GENERIC)	+= cpcihp_generic.o
 obj-$(CONFIG_HOTPLUG_PCI_SHPC)		+= shpchp.o
+obj-$(CONFIG_HOTPLUG_PCI_POWERNV)	+= pnv-php.o
 obj-$(CONFIG_HOTPLUG_PCI_RPA)		+= rpaphp.o
 obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR)	+= rpadlpar_io.o
 obj-$(CONFIG_HOTPLUG_PCI_SGI)		+= sgi_hotplug.o
@@ -50,6 +51,8 @@ ibmphp-objs		:=	ibmphp_core.o	\
 acpiphp-objs		:=	acpiphp_core.o	\
 				acpiphp_glue.o
 
+pnv-php-objs		:=	pnv_php.o
+
 rpaphp-objs		:=	rpaphp_core.o	\
 				rpaphp_pci.o	\
 				rpaphp_slot.o
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
new file mode 100644
index 0000000..8bb2159
--- /dev/null
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -0,0 +1,869 @@
+/*
+ * PCI Hotplug Driver for PowerPC PowerNV platform.
+ *
+ * Copyright Gavin Shan, IBM Corporation 2016.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/libfdt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/pci_hotplug.h>
+
+#include <asm/opal.h>
+#include <asm/pnv-pci.h>
+#include <asm/ppc-pci.h>
+
+#define DRIVER_VERSION	"0.1"
+#define DRIVER_AUTHOR	"Gavin Shan, IBM Corporation"
+#define DRIVER_DESC	"PowerPC PowerNV PCI Hotplug Driver"
+
+struct pnv_php_slot {
+	struct hotplug_slot		slot;
+	struct hotplug_slot_info	slot_info;
+	uint64_t			id;
+	char				*name;
+	int				slot_no;
+	struct kref			kref;
+#define PNV_PHP_STATE_INITIALIZED	0
+#define PNV_PHP_STATE_REGISTERED	1
+#define PNV_PHP_STATE_POPULATED		2
+#define PNV_PHP_STATE_OFFLINE		3
+	int				state;
+	struct device_node		*dn;
+	struct pci_dev			*pdev;
+	struct pci_bus			*bus;
+	bool				power_state_check;
+	int				power_state_confirmed;
+#define PNV_PHP_POWER_CONFIRMED_INVALID	0
+#define PNV_PHP_POWER_CONFIRMED_SUCCESS	1
+#define PNV_PHP_POWER_CONFIRMED_FAIL	2
+	struct opal_msg			*msg;
+	void				*fdt;
+	void				*dt;
+	struct of_changeset		ocs;
+	struct work_struct		work;
+	wait_queue_head_t		queue;
+	struct pnv_php_slot		*parent;
+	struct list_head		children;
+	struct list_head		link;
+};
+
+static LIST_HEAD(pnv_php_slot_list);
+static DEFINE_SPINLOCK(pnv_php_lock);
+
+static void pnv_php_register(struct device_node *dn);
+static void pnv_php_unregister_one(struct device_node *dn);
+static void pnv_php_unregister(struct device_node *dn);
+
+static void pnv_php_free_slot(struct kref *kref)
+{
+	struct pnv_php_slot *php_slot = container_of(kref,
+						     struct pnv_php_slot,
+						     kref);
+
+	WARN_ON(!list_empty(&php_slot->children));
+	kfree(php_slot->name);
+	kfree(php_slot);
+}
+
+static inline void pnv_php_put_slot(struct pnv_php_slot *php_slot)
+{
+
+	if (WARN_ON(!php_slot))
+		return;
+
+	kref_put(&php_slot->kref, pnv_php_free_slot);
+}
+
+static struct pnv_php_slot *pnv_php_match(struct device_node *dn,
+					  struct pnv_php_slot *php_slot)
+{
+	struct pnv_php_slot *target, *tmp;
+
+	if (php_slot->dn == dn) {
+		kref_get(&php_slot->kref);
+		return php_slot;
+	}
+
+	list_for_each_entry(tmp, &php_slot->children, link) {
+		target = pnv_php_match(dn, tmp);
+		if (target)
+			return target;
+	}
+
+	return NULL;
+}
+
+static struct pnv_php_slot *pnv_php_find_slot(struct device_node *dn)
+{
+	struct pnv_php_slot *php_slot, *tmp;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pnv_php_lock, flags);
+	list_for_each_entry(tmp, &pnv_php_slot_list, link) {
+		php_slot = pnv_php_match(dn, tmp);
+		if (php_slot) {
+			spin_unlock_irqrestore(&pnv_php_lock, flags);
+			return php_slot;
+		}
+	}
+	spin_unlock_irqrestore(&pnv_php_lock, flags);
+
+	return NULL;
+}
+
+/*
+ * Remove pdn for all children of the indicated device node.
+ * The function should remove pdn in a depth-first manner.
+ */
+static void pnv_php_rmv_pdns(struct device_node *dn)
+{
+	struct device_node *child;
+
+	for_each_child_of_node(dn, child) {
+		pnv_php_rmv_pdns(child);
+
+		pci_remove_device_node_info(child);
+	}
+}
+
+/*
+ * Remove all child nodes of the indicated device nodes. The
+ * function should remove device nodes in depth-first manner.
+ */
+static int pnv_php_rmv_device_nodes(struct device_node *parent)
+{
+	struct device_node *dn, *child;
+	int ret = 0;
+
+	for_each_child_of_node(parent, dn) {
+		ret = pnv_php_rmv_device_nodes(dn);
+		if (ret)
+			return ret;
+
+		child = of_get_next_child(dn, NULL);
+		if (child) {
+			of_node_put(child);
+			of_node_put(dn);
+			pr_err("%s: Alive children of node <%s>\n",
+			       __func__, of_node_full_name(dn));
+			return -EBUSY;
+		}
+
+		of_detach_node(dn);
+		of_node_put(dn);
+	}
+
+	return 0;
+}
+
+/*
+ * The function processes the message sent by firmware
+ * to remove all device tree nodes beneath the slot's
+ * nodes and the associated auxiliary data.
+ */
+static void pnv_php_handle_poweroff(struct pnv_php_slot *php_slot)
+{
+	int ret;
+
+	pnv_php_rmv_pdns(php_slot->dn);
+
+	/*
+	 * If the device sub-tree was created from an OF changeset, simply
+	 * revert it. Otherwise, the device nodes in the sub-tree need to
+	 * be iterated and detached.
+	 */
+	if (php_slot->fdt) {
+		of_changeset_destroy(&php_slot->ocs);
+		kfree(php_slot->dt);
+		kfree(php_slot->fdt);
+		php_slot->dt        = NULL;
+		php_slot->dn->child = NULL;
+		php_slot->fdt       = NULL;
+		php_slot->power_state_confirmed =
+			PNV_PHP_POWER_CONFIRMED_SUCCESS;
+	} else {
+		ret = pnv_php_rmv_device_nodes(php_slot->dn);
+		if (!ret) {
+			php_slot->power_state_confirmed =
+				PNV_PHP_POWER_CONFIRMED_SUCCESS;
+		} else {
+			php_slot->power_state_confirmed =
+				PNV_PHP_POWER_CONFIRMED_FAIL;
+			dev_warn(&php_slot->pdev->dev, "Error %d freeing nodes\n",
+				 ret);
+		}
+	}
+
+	wake_up(&php_slot->queue);
+}
+
+static int pnv_php_populate_changeset(struct of_changeset *ocs,
+				      struct device_node *dn)
+{
+	struct device_node *child;
+	int ret = 0;
+
+	for_each_child_of_node(dn, child) {
+		ret = of_changeset_attach_node(ocs, child);
+		if (ret)
+			break;
+
+		ret = pnv_php_populate_changeset(ocs, child);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+static void *pnv_php_add_one_pdn(struct device_node *dn, void *data)
+{
+	struct pci_controller *hose = (struct pci_controller *)data;
+	struct pci_dn *pdn;
+
+	pdn = pci_add_device_node_info(hose, dn);
+	if (!pdn)
+		return ERR_PTR(-ENOMEM);
+
+	return NULL;
+}
+
+static void pnv_php_add_pdns(struct pnv_php_slot *slot)
+{
+	struct pci_controller *hose = pci_bus_to_host(slot->bus);
+
+	pci_traverse_device_nodes(slot->dn, pnv_php_add_one_pdn, hose);
+}
+
+static void pnv_php_handle_poweron(struct pnv_php_slot *php_slot)
+{
+	void *fdt, *fdt1, *dt;
+	int confirm = PNV_PHP_POWER_CONFIRMED_SUCCESS;
+	int ret;
+
+	/* We don't know the FDT blob size in advance, so fetch the blob
+	 * into a maximally sized (64KB) buffer first and then copy it to
+	 * a chunk that fits its real size.
+	 */
+	fdt1 = kzalloc(0x10000, GFP_KERNEL);
+	if (!fdt1)
+		goto error;
+
+	ret = pnv_pci_get_device_tree(php_slot->dn->phandle, fdt1, 0x10000);
+	if (ret)
+		goto free_fdt1;
+
+	fdt = kzalloc(fdt_totalsize(fdt1), GFP_KERNEL);
+	if (!fdt)
+		goto free_fdt1;
+
+	/* Unflatten device tree blob */
+	memcpy(fdt, fdt1, fdt_totalsize(fdt1));
+	dt = of_fdt_unflatten_tree(fdt, php_slot->dn, NULL);
+	if (!dt) {
+		dev_warn(&php_slot->pdev->dev, "Cannot unflatten FDT\n");
+		goto free_fdt;
+	}
+
+	/* Initialize and apply the changeset */
+	of_changeset_init(&php_slot->ocs);
+	ret = pnv_php_populate_changeset(&php_slot->ocs, php_slot->dn);
+	if (ret) {
+		dev_warn(&php_slot->pdev->dev, "Error %d populating changeset\n",
+			 ret);
+		goto free_dt;
+	}
+
+	php_slot->dn->child = NULL;
+	ret = of_changeset_apply(&php_slot->ocs);
+	if (ret) {
+		dev_warn(&php_slot->pdev->dev, "Error %d applying changeset\n",
+			 ret);
+		goto destroy_changeset;
+	}
+
+	/* Add device node firmware data */
+	pnv_php_add_pdns(php_slot);
+	php_slot->fdt = fdt;
+	php_slot->dt  = dt;
+	kfree(fdt1);
+	goto out;
+
+destroy_changeset:
+	of_changeset_destroy(&php_slot->ocs);
+free_dt:
+	kfree(dt);
+	php_slot->dn->child = NULL;
+free_fdt:
+	kfree(fdt);
+free_fdt1:
+	kfree(fdt1);
+error:
+	confirm = PNV_PHP_POWER_CONFIRMED_FAIL;
+out:
+	/* Confirm status change */
+	php_slot->power_state_confirmed = confirm;
+	wake_up(&php_slot->queue);
+}
+
+static void pnv_php_work(struct work_struct *data)
+{
+	struct pnv_php_slot *php_slot = container_of(data,
+						     struct pnv_php_slot,
+						     work);
+	uint64_t event = be64_to_cpu(php_slot->msg->params[0]);
+
+	if (event == OPAL_PCI_SLOT_POWER_OFF)
+		pnv_php_handle_poweroff(php_slot);
+	else
+		pnv_php_handle_poweron(php_slot);
+
+	pnv_php_put_slot(php_slot);
+}
+
+static int pnv_php_handle_msg(struct notifier_block *nb,
+			      unsigned long type,
+			      void *message)
+{
+	phandle h;
+	struct device_node *dn;
+	struct pnv_php_slot *php_slot;
+	struct opal_msg *msg = message;
+
+	if (type != OPAL_MSG_PCI_HOTPLUG) {
+		pr_warn("%s: Invalid message %ld received!\n",
+			__func__, type);
+		return NOTIFY_DONE;
+	}
+
+	h = (phandle)be64_to_cpu(msg->params[1]);
+	dn = of_find_node_by_phandle(h);
+	if (!dn) {
+		pr_warn("%s: No device node for phandle 0x%x\n",
+			__func__, h);
+		return NOTIFY_DONE;
+	}
+
+	php_slot = pnv_php_find_slot(dn);
+	if (!php_slot || php_slot->state == PNV_PHP_STATE_OFFLINE) {
+		pr_warn("%s: No slot or offlined slot for node <%s>\n",
+			__func__, of_node_full_name(dn));
+		of_node_put(dn);
+		return NOTIFY_DONE;
+	}
+
+	of_node_put(dn);
+	php_slot->msg = msg;
+	schedule_work(&php_slot->work);
+	return NOTIFY_OK;
+}
+
+static int pnv_php_set_power_state(struct hotplug_slot *slot, u8 state)
+{
+	struct pnv_php_slot *php_slot = slot->private;
+	int ret;
+
+	php_slot->power_state_confirmed = PNV_PHP_POWER_CONFIRMED_INVALID;
+	ret = pnv_pci_set_power_state(php_slot->id, state);
+	if (ret) {
+		dev_warn(&php_slot->pdev->dev, "Error %d powering %s slot\n",
+			 ret, state ? "on" : "off");
+		return ret;
+	}
+
+	/* Continue to PCI probing once the device tree is finalized.
+	 * The device tree may already have been fully updated at this
+	 * point, in which case there is no need to wait.
+	 */
+	if (php_slot->power_state_confirmed == PNV_PHP_POWER_CONFIRMED_SUCCESS)
+		return 0;
+
+	if (php_slot->power_state_confirmed == PNV_PHP_POWER_CONFIRMED_FAIL)
+		return -EBUSY;
+
+	/* Wait for firmware to add or remove device sub-tree. When it's done,
+	 * one signal is received from firmware.
+	 */
+	ret = wait_event_timeout(php_slot->queue,
+				 php_slot->power_state_confirmed, 10 * HZ);
+	if (!ret) {
+		dev_warn(&php_slot->pdev->dev, "Error %d waiting for power-%s\n",
+			 ret, state ? "on" : "off");
+		return -EBUSY;
+	}
+
+	if (php_slot->power_state_confirmed == PNV_PHP_POWER_CONFIRMED_SUCCESS)
+		return 0;
+
+	dev_warn(&php_slot->pdev->dev, "Error status %d for power-%s\n",
+		 php_slot->power_state_confirmed, state ? "on" : "off");
+	return -EBUSY;
+}
+
+static int pnv_php_get_power_state(struct hotplug_slot *slot, u8 *state)
+{
+	struct pnv_php_slot *php_slot = slot->private;
+	uint8_t power_state = OPAL_PCI_SLOT_POWER_ON;
+	int ret;
+
+	/*
+	 * Retrieve power status from firmware. If that
+	 * fails, the power status falls back to being
+	 * on.
+	 */
+	ret = pnv_pci_get_power_state(php_slot->id, &power_state);
+	if (ret) {
+		dev_warn(&php_slot->pdev->dev, "Error %d getting power status\n",
+			 ret);
+	} else {
+		*state = power_state;
+		slot->info->power_status = power_state;
+	}
+
+	return 0;
+}
+
+static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
+{
+	struct pnv_php_slot *php_slot = slot->private;
+	uint8_t presence = OPAL_PCI_SLOT_EMPTY;
+	int ret;
+
+	/*
+	 * Retrieve presence status from firmware. If we can't
+	 * get that, it falls back to empty.
+	 */
+	ret = pnv_pci_get_presence_state(php_slot->id, &presence);
+	if (ret >= 0) {
+		*state = presence;
+		slot->info->adapter_status = presence;
+		ret = 0;
+	} else {
+		dev_warn(&php_slot->pdev->dev, "Error %d getting presence\n",
+			 ret);
+	}
+
+	return ret;
+}
+
+static int pnv_php_set_attention_state(struct hotplug_slot *slot, u8 state)
+{
+	/* FIXME: Make it real once firmware supports it */
+	slot->info->attention_status = state;
+
+	return 0;
+}
+
+static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
+{
+	struct hotplug_slot *slot = &php_slot->slot;
+	uint8_t presence = OPAL_PCI_SLOT_EMPTY;
+	uint8_t power_status = OPAL_PCI_SLOT_POWER_ON;
+	int ret;
+
+	/* Check if the slot has been configured */
+	if (php_slot->state != PNV_PHP_STATE_REGISTERED)
+		return 0;
+
+	/* Retrieve slot presence status */
+	ret = pnv_php_get_adapter_state(slot, &presence);
+	if (ret)
+		return ret;
+
+	/* Proceed if there is nothing behind the slot */
+	if (presence == OPAL_PCI_SLOT_EMPTY)
+		goto scan;
+
+	/*
+	 * If the power supply to the slot is off, we can't detect
+	 * adapter presence state. That means we have to turn the
+	 * slot on before going to probe slot's presence state.
+	 *
+	 * On the first time, we don't change the power status to
+	 * boost system boot with assumption that the firmware
+	 * supplies consistent slot power status: empty slot always
+	 * has its power off and non-empty slot has its power on.
+	 */
+	if (!php_slot->power_state_check) {
+		php_slot->power_state_check = true;
+
+		ret = pnv_php_get_power_state(slot, &power_status);
+		if (ret)
+			return ret;
+
+		if (power_status != OPAL_PCI_SLOT_POWER_ON)
+			return 0;
+	}
+
+	/* Check the power status. Scan the slot if it is already on */
+	ret = pnv_php_get_power_state(slot, &power_status);
+	if (ret)
+		return ret;
+
+	if (power_status == OPAL_PCI_SLOT_POWER_ON)
+		goto scan;
+
+	/* Power is off, turn it on and then scan the slot */
+	ret = pnv_php_set_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+	if (ret)
+		return ret;
+
+scan:
+	if (presence == OPAL_PCI_SLOT_PRESENT) {
+		if (rescan) {
+			pci_lock_rescan_remove();
+			pci_hp_add_devices(php_slot->bus);
+			pci_unlock_rescan_remove();
+		}
+
+		/* Rescan for child hotpluggable slots */
+		php_slot->state = PNV_PHP_STATE_POPULATED;
+		if (rescan)
+			pnv_php_register(php_slot->dn);
+	} else {
+		php_slot->state = PNV_PHP_STATE_POPULATED;
+	}
+
+	return 0;
+}
+
+static int pnv_php_enable_slot(struct hotplug_slot *slot)
+{
+	struct pnv_php_slot *php_slot = container_of(slot,
+						     struct pnv_php_slot, slot);
+
+	return pnv_php_enable(php_slot, true);
+}
+
+static int pnv_php_disable_slot(struct hotplug_slot *slot)
+{
+	struct pnv_php_slot *php_slot = slot->private;
+	uint8_t power_state;
+	int ret;
+
+	if (php_slot->state != PNV_PHP_STATE_POPULATED)
+		return 0;
+
+	/* Remove all devices behind the slot */
+	pci_lock_rescan_remove();
+	pci_hp_remove_devices(php_slot->bus);
+	pci_unlock_rescan_remove();
+
+	/* Detach the child hotpluggable slots */
+	pnv_php_unregister(php_slot->dn);
+
+	/*
+	 * Check the power status and turn it off if necessary. If we
+	 * fail to get the power status, the power will be forced to
+	 * be off.
+	 */
+	ret = pnv_php_get_power_state(slot, &power_state);
+	if (ret || power_state == OPAL_PCI_SLOT_POWER_ON) {
+		ret = pnv_php_set_power_state(slot, OPAL_PCI_SLOT_POWER_OFF);
+		if (ret)
+			dev_warn(&php_slot->pdev->dev, "Error %d powering off\n",
+				 ret);
+	}
+
+	/* Update slot state */
+	php_slot->state = PNV_PHP_STATE_REGISTERED;
+	return 0;
+}
+
+static struct hotplug_slot_ops php_slot_ops = {
+	.get_power_status	= pnv_php_get_power_state,
+	.get_adapter_status	= pnv_php_get_adapter_state,
+	.set_attention_status	= pnv_php_set_attention_state,
+	.enable_slot		= pnv_php_enable_slot,
+	.disable_slot		= pnv_php_disable_slot,
+};
+
+static void pnv_php_release(struct hotplug_slot *slot)
+{
+	struct pnv_php_slot *php_slot = slot->private;
+	unsigned long flags;
+
+	/* Remove from global or child list */
+	spin_lock_irqsave(&pnv_php_lock, flags);
+	list_del(&php_slot->link);
+	spin_unlock_irqrestore(&pnv_php_lock, flags);
+
+	/* Detach from parent before the slot itself may be freed */
+	pnv_php_put_slot(php_slot->parent);
+	pnv_php_put_slot(php_slot);
+}
+
+static int pnv_php_get_slot_id(struct device_node *dn, uint64_t *id)
+{
+	struct device_node *parent = dn;
+	const __be64 *prop64;
+	const __be32 *prop32;
+	uint64_t phb_id;
+	int bdfn;
+
+	/* Bus/Slot/Function number */
+	prop32 = of_get_property(dn, "reg", NULL);
+	if (!prop32)
+		return -ENXIO;
+	bdfn = (of_read_number(prop32, 1) & 0x00ffff00) >> 8;
+
+	/* PHB Id */
+	while ((parent = of_get_parent(parent))) {
+		if (!PCI_DN(parent)) {
+			of_node_put(parent);
+			break;
+		}
+
+		if (!of_device_is_compatible(parent, "ibm,ioda2-phb")) {
+			of_node_put(parent);
+			continue;
+		}
+
+		prop64 = of_get_property(parent, "ibm,opal-phbid", NULL);
+		if (!prop64) {
+			of_node_put(parent);
+			return -ENXIO;
+		}
+
+		phb_id = be64_to_cpup(prop64);
+		of_node_put(parent);
+
+		*id = PCI_SLOT_ID(phb_id, bdfn);
+		return 0;
+	}
+
+	return -ENODEV;
+}
+
+static struct pnv_php_slot *pnv_php_alloc_slot(struct device_node *dn)
+{
+	struct pnv_php_slot *php_slot;
+	struct pci_bus *bus;
+	const char *label;
+	uint64_t id;
+
+	label = of_get_property(dn, "ibm,slot-label", NULL);
+	if (!label)
+		return NULL;
+
+	if (pnv_php_get_slot_id(dn, &id))
+		return NULL;
+
+	bus = pci_find_bus_by_node(dn);
+	if (!bus)
+		return NULL;
+
+	php_slot = kzalloc(sizeof(*php_slot), GFP_KERNEL);
+	if (!php_slot)
+		return NULL;
+
+	php_slot->name = kstrdup(label, GFP_KERNEL);
+	if (!php_slot->name) {
+		kfree(php_slot);
+		return NULL;
+	}
+
+	if (dn->child && PCI_DN(dn->child))
+		php_slot->slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
+	else
+		php_slot->slot_no = -1;   /* Placeholder slot */
+
+	kref_init(&php_slot->kref);
+	php_slot->state	                = PNV_PHP_STATE_INITIALIZED;
+	php_slot->dn	                = dn;
+	php_slot->pdev	                = bus->self;
+	php_slot->bus	                = bus;
+	php_slot->id	                = id;
+	php_slot->power_state_check     = false;
+	php_slot->power_state_confirmed = PNV_PHP_POWER_CONFIRMED_INVALID;
+	php_slot->slot.ops              = &php_slot_ops;
+	php_slot->slot.info             = &php_slot->slot_info;
+	php_slot->slot.release          = pnv_php_release;
+	php_slot->slot.private          = php_slot;
+
+	INIT_WORK(&php_slot->work, pnv_php_work);
+	init_waitqueue_head(&php_slot->queue);
+	INIT_LIST_HEAD(&php_slot->children);
+	INIT_LIST_HEAD(&php_slot->link);
+
+	return php_slot;
+}
+
+static int pnv_php_register_slot(struct pnv_php_slot *php_slot)
+{
+	struct pnv_php_slot *parent;
+	struct device_node *dn = php_slot->dn;
+	unsigned long flags;
+	int ret;
+
+	/* Check if the slot is registered or not */
+	parent = pnv_php_find_slot(php_slot->dn);
+	if (parent) {
+		pnv_php_put_slot(parent);
+		return -EEXIST;
+	}
+
+	/* Register PCI slot */
+	ret = pci_hp_register(&php_slot->slot, php_slot->bus,
+			      php_slot->slot_no, php_slot->name);
+	if (ret) {
+		dev_warn(&php_slot->pdev->dev, "Error %d registering slot\n",
+			 ret);
+		return ret;
+	}
+
+	/* Attach to the parent's child list or global list */
+	while ((dn = of_get_parent(dn))) {
+		if (!PCI_DN(dn)) {
+			of_node_put(dn);
+			break;
+		}
+
+		parent = pnv_php_find_slot(dn);
+		if (parent) {
+			of_node_put(dn);
+			break;
+		}
+
+		of_node_put(dn);
+	}
+
+	spin_lock_irqsave(&pnv_php_lock, flags);
+	php_slot->parent = parent;
+	if (parent)
+		list_add_tail(&php_slot->link, &parent->children);
+	else
+		list_add_tail(&php_slot->link, &pnv_php_slot_list);
+	spin_unlock_irqrestore(&pnv_php_lock, flags);
+
+	php_slot->state = PNV_PHP_STATE_REGISTERED;
+	return 0;
+}
+
+static int pnv_php_register_one(struct device_node *dn)
+{
+	struct pnv_php_slot *php_slot;
+	const __be32 *prop32;
+	int ret;
+
+	/* Check if it's hotpluggable slot */
+	prop32 = of_get_property(dn, "ibm,slot-pluggable", NULL);
+	if (!prop32 || !of_read_number(prop32, 1))
+		return -ENXIO;
+
+	prop32 = of_get_property(dn, "ibm,reset-by-firmware", NULL);
+	if (!prop32 || !of_read_number(prop32, 1))
+		return -ENXIO;
+
+	php_slot = pnv_php_alloc_slot(dn);
+	if (!php_slot)
+		return -ENODEV;
+
+	ret = pnv_php_register_slot(php_slot);
+	if (ret)
+		goto free_slot;
+
+	ret = pnv_php_enable(php_slot, false);
+	if (ret)
+		goto unregister_slot;
+
+	return 0;
+
+unregister_slot:
+	pnv_php_unregister_one(php_slot->dn);
+free_slot:
+	pnv_php_put_slot(php_slot);
+	return ret;
+}
+
+static void pnv_php_register(struct device_node *dn)
+{
+	struct device_node *child;
+
+	/*
+	 * The parent slots should be registered before their
+	 * child slots.
+	 */
+	for_each_child_of_node(dn, child) {
+		pnv_php_register_one(child);
+		pnv_php_register(child);
+	}
+}
+
+static void pnv_php_unregister_one(struct device_node *dn)
+{
+	struct pnv_php_slot *php_slot;
+
+	php_slot = pnv_php_find_slot(dn);
+	if (!php_slot)
+		return;
+
+	php_slot->state = PNV_PHP_STATE_OFFLINE;
+	flush_work(&php_slot->work);
+
+	pnv_php_put_slot(php_slot);
+	pci_hp_deregister(&php_slot->slot);
+}
+
+static void pnv_php_unregister(struct device_node *dn)
+{
+	struct device_node *child;
+
+	/* The child slots should go before their parent slots */
+	for_each_child_of_node(dn, child) {
+		pnv_php_unregister(child);
+		pnv_php_unregister_one(child);
+	}
+}
+
+static struct notifier_block php_msg_nb = {
+	.notifier_call	= pnv_php_handle_msg,
+	.next		= NULL,
+	.priority	= 0,
+};
+
+static int __init pnv_php_init(void)
+{
+	struct device_node *dn;
+	int ret;
+
+	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
+
+	/* Register hotplug message handler */
+	ret = pnv_pci_hotplug_notifier_register(&php_msg_nb);
+	if (ret) {
+		pr_warn("%s: Error %d registering hotplug notifier\n",
+			__func__, ret);
+		return ret;
+	}
+
+	/* Scan PHB nodes and their children */
+	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
+		pnv_php_register(dn);
+
+	return 0;
+}
+
+static void __exit pnv_php_exit(void)
+{
+	struct device_node *dn;
+
+	pnv_pci_hotplug_notifier_unregister(&php_msg_nb);
+
+	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
+		pnv_php_unregister(dn);
+}
+
+module_init(pnv_php_init);
+module_exit(pnv_php_exit);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 21/22] drivers/of: Export of_detach_node()
  2016-05-03 13:22   ` Gavin Shan
  (?)
@ 2016-05-04 13:36   ` Gavin Shan
  2016-05-05 19:42     ` Rob Herring
  -1 siblings, 1 reply; 41+ messages in thread
From: Gavin Shan @ 2016-05-04 13:36 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linuxppc-dev, linux-pci, devicetree, benh, mpe, aik, bhelgaas,
	robherring2, dja, alistair

On Tue, May 03, 2016 at 11:22:52PM +1000, Gavin Shan wrote:
>This exports of_detach_node() for PowerPC PowerNV PCI hotplug
>driver. No functional changes introduced.
>
>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Rob, I'm not sure whether it's too late to catch the 4.7 merge window.
Also, I was told this series is needed by a new CAPI driver which
is being developed. The developers hope to push the new CAPI driver
in the 4.8 merge window (perhaps), or to linux-next before the 4.8
merge window opens. That means the new CAPI driver depends on this
series, which has to be pushed to linux-next before the 4.8 merge window.

If it's fine, could you please merge patches 16 to 21? With them merged
in 4.7, the remaining patches won't depend on the FDT changes. Thanks in
advance for your help.

Thanks,
Gavin

>---
> drivers/of/dynamic.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
>index c647bd1..75ce30d 100644
>--- a/drivers/of/dynamic.c
>+++ b/drivers/of/dynamic.c
>@@ -311,6 +311,7 @@ int of_detach_node(struct device_node *np)
>
> 	return rc;
> }
>+EXPORT_SYMBOL_GPL(of_detach_node);
>
> /**
>  * of_node_release() - release a dynamically allocated node
>-- 
>2.1.0
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 22/22] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2016-05-03 13:22 ` [PATCH v9 22/22] PCI/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
@ 2016-05-05 17:04       ` Rob Herring
  0 siblings, 0 replies; 41+ messages in thread
From: Rob Herring @ 2016-05-05 17:04 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linuxppc-dev, linux-pci, devicetree, Benjamin Herrenschmidt,
	Michael Ellerman, Alexey Kardashevskiy, Bjorn Helgaas, dja,
	Alistair Popple

On Tue, May 3, 2016 at 8:22 AM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> This adds standalone driver to support PCI hotplug for PowerPC PowerNV
> platform that runs on top of skiboot firmware. The firmware identifies
> hotpluggable slots and marked their device tree node with proper
> "ibm,slot-pluggable" and "ibm,reset-by-firmware". The driver scans
> device tree nodes to create/register PCI hotplug slot accordingly.
>
> The PCI slots are organized in fashion of tree, which means one
> PCI slot might have parent PCI slot and parent PCI slot possibly
> contains multiple child PCI slots. At the plugging time, the parent
> PCI slot is populated before its children. The child PCI slots are
> removed before their parent PCI slot can be removed from the system.
>
> If the skiboot firmware doesn't support slot status retrieval, the PCI
> slot device node shouldn't have property "ibm,reset-by-firmware". In
> that case, none of valid PCI slots will be detected from device tree.
> The skiboot firmware doesn't export the capability to access attention
> LEDs yet and it's something for TBD.
>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>

[...]

> +static void pnv_php_handle_poweron(struct pnv_php_slot *php_slot)
> +{
> +       void *fdt, *fdt1, *dt;
> +       int confirm = PNV_PHP_POWER_CONFIRMED_SUCCESS;
> +       int ret;
> +
> +       /* We don't know the FDT blob size. We try to get it through
> +        * maximal memory chunk and then copy it to another chunk that
> +        * fits the real size.
> +        */
> +       fdt1 = kzalloc(0x10000, GFP_KERNEL);
> +       if (!fdt1)
> +               goto error;
> +
> +       ret = pnv_pci_get_device_tree(php_slot->dn->phandle, fdt1, 0x10000);
> +       if (ret)
> +               goto free_fdt1;
> +
> +       fdt = kzalloc(fdt_totalsize(fdt1), GFP_KERNEL);
> +       if (!fdt)
> +               goto free_fdt1;
> +
> +       /* Unflatten device tree blob */
> +       memcpy(fdt, fdt1, fdt_totalsize(fdt1));

This is wrong. If the size is greater than 64K, then you will be
overrunning the fdt1 buffer. You need to fetch the FDT again if it is
bigger than 64KB.


> +       dt = of_fdt_unflatten_tree(fdt, php_slot->dn, NULL);
> +       if (!dt) {
> +               dev_warn(&php_slot->pdev->dev, "Cannot unflatten FDT\n");
> +               goto free_fdt;
> +       }
> +

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 21/22] drivers/of: Export of_detach_node()
  2016-05-04 13:36   ` Gavin Shan
@ 2016-05-05 19:42     ` Rob Herring
  2016-05-06  0:40       ` Gavin Shan
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Herring @ 2016-05-05 19:42 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linuxppc-dev, linux-pci, devicetree, benh, mpe, aik, bhelgaas,
	robherring2, dja, alistair

On Wed, May 04, 2016 at 11:36:03PM +1000, Gavin Shan wrote:
> On Tue, May 03, 2016 at 11:22:52PM +1000, Gavin Shan wrote:
> >This exports of_detach_node() for PowerPC PowerNV PCI hotplug
> >driver. No functional changes introduced.
> >
> >Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> 
> Rob, I'm not sure whether it's too late to catch the 4.7 merge window.
> Also, I was told this series is needed by a new CAPI driver which
> is being developed. The developers hope to push the new CAPI driver
> in the 4.8 merge window (perhaps), or to linux-next before the 4.8
> merge window opens. That means the new CAPI driver depends on this
> series, which has to be pushed to linux-next before the 4.8 merge window.
> 
> If it's fine, could you please merge patches 16 to 21? With them merged
> in 4.7, the remaining patches won't depend on the FDT changes. Thanks in
> advance for your help.

I've applied the 6 patches.

Rob

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 22/22] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2016-05-05 17:04       ` Rob Herring
  (?)
@ 2016-05-06  0:28       ` Gavin Shan
  2016-05-06 13:12         ` Rob Herring
  -1 siblings, 1 reply; 41+ messages in thread
From: Gavin Shan @ 2016-05-06  0:28 UTC (permalink / raw)
  To: Rob Herring
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree,
	Benjamin Herrenschmidt, Michael Ellerman, Alexey Kardashevskiy,
	Bjorn Helgaas, dja, Alistair Popple

On Thu, May 05, 2016 at 12:04:49PM -0500, Rob Herring wrote:
>On Tue, May 3, 2016 at 8:22 AM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
>> This adds standalone driver to support PCI hotplug for PowerPC PowerNV
>> platform that runs on top of skiboot firmware. The firmware identifies
>> hotpluggable slots and marked their device tree node with proper
>> "ibm,slot-pluggable" and "ibm,reset-by-firmware". The driver scans
>> device tree nodes to create/register PCI hotplug slot accordingly.
>>
>> The PCI slots are organized in fashion of tree, which means one
>> PCI slot might have parent PCI slot and parent PCI slot possibly
>> contains multiple child PCI slots. At the plugging time, the parent
>> PCI slot is populated before its children. The child PCI slots are
>> removed before their parent PCI slot can be removed from the system.
>>
>> If the skiboot firmware doesn't support slot status retrieval, the PCI
>> slot device node shouldn't have property "ibm,reset-by-firmware". In
>> that case, none of valid PCI slots will be detected from device tree.
>> The skiboot firmware doesn't export the capability to access attention
>> LEDs yet and it's something for TBD.
>>
>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>
>[...]
>
>> +static void pnv_php_handle_poweron(struct pnv_php_slot *php_slot)
>> +{
>> +       void *fdt, *fdt1, *dt;
>> +       int confirm = PNV_PHP_POWER_CONFIRMED_SUCCESS;
>> +       int ret;
>> +
>> +       /* We don't know the FDT blob size. We try to get it through
>> +        * maximal memory chunk and then copy it to another chunk that
>> +        * fits the real size.
>> +        */
>> +       fdt1 = kzalloc(0x10000, GFP_KERNEL);
>> +       if (!fdt1)
>> +               goto error;
>> +
>> +       ret = pnv_pci_get_device_tree(php_slot->dn->phandle, fdt1, 0x10000);
>> +       if (ret)
>> +               goto free_fdt1;
>> +
>> +       fdt = kzalloc(fdt_totalsize(fdt1), GFP_KERNEL);
>> +       if (!fdt)
>> +               goto free_fdt1;
>> +
>> +       /* Unflatten device tree blob */
>> +       memcpy(fdt, fdt1, fdt_totalsize(fdt1));
>
>This is wrong. If the size is greater than 64K, then you will be
>overrunning the fdt1 buffer. You need to fetch the FDT again if it is
>bigger than 64KB.
>

Thanks for the review, Rob. Sorry, but I don't see how it's a problem. An
error code is returned from pnv_pci_get_device_tree() if the FDT blob
size is greater than 64K. In this case, memcpy() won't be triggered.
pnv_pci_get_device_tree() relies on the firmware implementation, which
avoids overrunning the buffer.

On the other hand, it would be reasonable to retry retrieving the
FDT blob if the 64K buffer isn't enough. Also, kzalloc() can be replaced
with alloc_pages() as 64K is the default page size on PPC64. I will
have something like below unless someone has more concerns. As the
size of the allocated buffer will be greater than the real FDT blob
size, some memory (not too much) is wasted. I guess it should be ok.

	struct page *page;
	void *fdt = NULL;
	unsigned int order;
	int ret;

	for (order = 0; order < MAX_ORDER; order++) {
		page = alloc_pages(GFP_KERNEL, order);
		if (!page)
			continue;

		fdt = page_address(page);
		ret = pnv_pci_get_device_tree(php_slot->dn->phandle,
					      fdt, (1 << order) * PAGE_SIZE);
		if (!ret)
			break;	/* The blob fits in this buffer */

		dev_dbg(&php_slot->pdev->dev, "Error %d getting device tree (%u)\n",
			ret, order);
		free_pages((unsigned long)fdt, order);
	}

	if (order >= MAX_ORDER) {
		dev_warn(&php_slot->pdev->dev, "Cannot get device tree\n");
		return;
	}

	/* Unflatten the blob without copying it to another allocated buffer */

Thanks,
Gavin

>
>> +       dt = of_fdt_unflatten_tree(fdt, php_slot->dn, NULL);
>> +       if (!dt) {
>> +               dev_warn(&php_slot->pdev->dev, "Cannot unflatten FDT\n");
>> +               goto free_fdt;
>> +       }
>> +
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 21/22] drivers/of: Export of_detach_node()
  2016-05-05 19:42     ` Rob Herring
@ 2016-05-06  0:40       ` Gavin Shan
  0 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-06  0:40 UTC (permalink / raw)
  To: Rob Herring
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, benh, mpe, aik,
	bhelgaas, robherring2, dja, alistair

On Thu, May 05, 2016 at 02:42:35PM -0500, Rob Herring wrote:
>On Wed, May 04, 2016 at 11:36:03PM +1000, Gavin Shan wrote:
>> On Tue, May 03, 2016 at 11:22:52PM +1000, Gavin Shan wrote:
>> >This exports of_detach_node() for PowerPC PowerNV PCI hotplug
>> >driver. No functional changes introduced.
>> >
>> >Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>> 
>> Rob, I'm not sure whether it's too late to catch the 4.7 merge window.
>> Also, I was told this series is needed by a new CAPI driver which
>> is being developed. The developers hope to push the new CAPI driver
>> in the 4.8 merge window (perhaps), or to linux-next before the 4.8
>> merge window opens. That means the new CAPI driver depends on this
>> series, which has to be pushed to linux-next before the 4.8 merge window.
>> 
>> If it's fine, could you please merge patches 16 to 21? With them merged
>> in 4.7, the remaining patches won't depend on the FDT changes. Thanks in
>> advance for your help.
>
>I've applied the 6 patches.
>

Rob, thank you very much for your help!

Thanks,
Gavin

>Rob
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 03/22] powerpc/powernv: Move pnv_pci_ioda_setup_opal_tce_kill() around
  2016-05-03 13:22 ` [PATCH v9 03/22] powerpc/powernv: Move pnv_pci_ioda_setup_opal_tce_kill() around Gavin Shan
@ 2016-05-06  6:36   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 41+ messages in thread
From: Alexey Kardashevskiy @ 2016-05-06  6:36 UTC (permalink / raw)
  To: Gavin Shan, linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, bhelgaas, robherring2, dja, alistair

On 05/03/2016 11:22 PM, Gavin Shan wrote:
> pnv_pci_ioda_setup_opal_tce_kill() is called by pnv_ioda_setup_dma()
> to remap the TCE kill register. What's done in pnv_ioda_setup_dma()
> will be covered in pcibios_setup_bridge() which is invoked on each
> PCI bridge. It means we will possibly remap the TCE kill register
> multiple times, which is unnecessary.
>
> This moves pnv_pci_ioda_setup_opal_tce_kill() to where the PHB is
> initialized (pnv_pci_init_ioda_phb()) to avoid above issue.
>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 5ee8a57..cbd4c0b 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2599,8 +2599,6 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
>  	pr_info("PCI: Domain %04x has %d available 32-bit DMA segments\n",
>  		hose->global_number, phb->ioda.dma32_count);
>
> -	pnv_pci_ioda_setup_opal_tce_kill(phb);
> -
>  	/* Walk our PE list and configure their DMA segments */
>  	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
>  		weight = pnv_pci_ioda_pe_dma_weight(pe);
> @@ -3396,6 +3394,9 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>  	if (phb->regs == NULL)
>  		pr_err("  Failed to map registers !\n");
>
> +	/* Initialize TCE kill register */
> +	pnv_pci_ioda_setup_opal_tce_kill(phb);
> +
>  	/* Initialize more IODA stuff */
>  	phb->ioda.total_pe_num = 1;
>  	prop32 = of_get_property(np, "ibm,opal-num-pes", NULL);
>


-- 
Alexey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 04/22] powerpc/powernv: Increase PE# capacity
  2016-05-03 13:22 ` [PATCH v9 04/22] powerpc/powernv: Increase PE# capacity Gavin Shan
@ 2016-05-06  7:17   ` Alexey Kardashevskiy
  2016-05-06 11:05     ` Gavin Shan
  0 siblings, 1 reply; 41+ messages in thread
From: Alexey Kardashevskiy @ 2016-05-06  7:17 UTC (permalink / raw)
  To: Gavin Shan, linuxppc-dev
  Cc: linux-pci, devicetree, benh, mpe, bhelgaas, robherring2, dja, alistair

On 05/03/2016 11:22 PM, Gavin Shan wrote:
> Each PHB maintains an array helping to translate 2-bytes Request
> ID (RID) to PE# with the assumption that PE# takes one byte, meaning
> that we can't have more than 256 PEs. However, pci_dn->pe_number
> already had 4-bytes for the PE#.

Can you possibly have more than 256 PEs? Or exactly 256? What patch in this 
series makes use of it?

I probably asked but do not remember the answer :)

Looks like a waste of memory - you only used a small fraction of
pe_rmap[0x10000] and now the waste is quadrupled.


>
> This extends the PE# capacity for every PHB. After that, the PE number
> is represented by 4-bytes value. Then we can reuse IODA_INVALID_PE to
> check the PE# in phb->pe_rmap[] is valid or not.

Looks like using IODA_INVALID_PE is the only reason for this patch.


>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Reviewed-by: Daniel Axtens <dja@axtens.net>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 6 +++++-
>  arch/powerpc/platforms/powernv/pci.h      | 7 ++-----
>  2 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index cbd4c0b..cf96cb5 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -768,7 +768,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
>
>  	/* Clear the reverse map */
>  	for (rid = pe->rid; rid < rid_end; rid++)
> -		phb->ioda.pe_rmap[rid] = 0;
> +		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
>
>  	/* Release from all parents PELT-V */
>  	while (parent) {
> @@ -3406,6 +3406,10 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>  	if (prop32)
>  		phb->ioda.reserved_pe_idx = be32_to_cpup(prop32);
>
> +	/* Invalidate RID to PE# mapping */
> +	for (segno = 0; segno < ARRAY_SIZE(phb->ioda.pe_rmap); segno++)
> +		phb->ioda.pe_rmap[segno] = IODA_INVALID_PE;
> +
>  	/* Parse 64-bit MMIO range */
>  	pnv_ioda_parse_m64_window(phb);
>
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 904f60b..80f5326 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -156,11 +156,8 @@ struct pnv_phb {
>  		struct list_head	pe_list;
>  		struct mutex            pe_list_mutex;
>
> -		/* Reverse map of PEs, will have to extend if
> -		 * we are to support more than 256 PEs, indexed
> -		 * bus { bus, devfn }
> -		 */
> -		unsigned char		pe_rmap[0x10000];
> +		/* Reverse map of PEs, indexed by {bus, devfn} */
> +		unsigned int		pe_rmap[0x10000];
>
>  		/* TCE cache invalidate registers (physical and
>  		 * remapped)
>


-- 
Alexey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 04/22] powerpc/powernv: Increase PE# capacity
  2016-05-06  7:17   ` Alexey Kardashevskiy
@ 2016-05-06 11:05     ` Gavin Shan
  0 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-06 11:05 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree, benh, mpe,
	bhelgaas, robherring2, dja, alistair

On Fri, May 06, 2016 at 05:17:25PM +1000, Alexey Kardashevskiy wrote:
>On 05/03/2016 11:22 PM, Gavin Shan wrote:
>>Each PHB maintains an array helping to translate 2-bytes Request
>>ID (RID) to PE# with the assumption that PE# takes one byte, meaning
>>that we can't have more than 256 PEs. However, pci_dn->pe_number
>>already had 4-bytes for the PE#.
>
>Can you possibly have more than 256 PEs? Or exactly 256? What patch in this
>series makes use of it?
>
>I probably asked but do not remember the answer :)
>
>Looks like waste of memory - you only used a small fraction of
>pe_rmap[0x10000] and now the waste is quadrupled.
>

The PE capacity differs between hardware, as listed below, so we're
going to support 16-bit PE numbers in the near future. That means each
element in the array needs to be at least "unsigned short", and 2 pages
(2 * 64KB) will be reserved for it.

P7IOC: 127        PHB3: 256
PHB4:  65536      NPU1: 4        NPU2: 16

I agree some memory is wasted, and the wasted amount depends on the PCI
topology. No memory is wasted if 256 buses show up on one particular
PHB. The fewer buses a PHB has, the more memory is wasted. As I explained
before, the total memory used is 4 pages (4 * 64KB). Considering the memory
capacity on PPC64 (especially PowerNV), I guess it's fine. Note that the
memory is allocated from memblock together with the PHB instance.
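
The footprint figures above can be checked with a few lines of standalone
C. This is only arithmetic, not kernel code: rmap_bytes() and rmap_pages()
are hypothetical helpers, and the 64KB page size is an assumption taken from
the PPC64 default mentioned elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define RID_COUNT	0x10000u	/* one entry per 16-bit {bus, devfn} RID */
#define PAGE_64K	0x10000u	/* assumed 64KB page size */

/* Bytes needed for a reverse map whose entries are elem_size bytes wide */
static size_t rmap_bytes(size_t elem_size)
{
	assert(elem_size > 0);
	return (size_t)RID_COUNT * elem_size;
}

/* The same footprint expressed in 64KB pages */
static size_t rmap_pages(size_t elem_size)
{
	return rmap_bytes(elem_size) / PAGE_64K;
}
```

With 1-, 2- and 4-byte entries this works out to 1, 2 and 4 pages (64KB,
128KB and 256KB) per PHB.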

The alternative solution (to avoid wasting memory) would be to search for
the PE number matching the input BDFN through the PE list maintained
in each PHB. Obviously, that needs more logic and uses more CPU
cycles, so it's a trade-off. If you really want to see this, I
absolutely can do it in the next revision. Another option would be to keep
the code as it is and improve it later. Please share your thoughts.
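
For illustration, the list-based lookup described above might look roughly
like the standalone sketch below. It is not the kernel's data structure:
struct pe_entry, pe_lookup() and INVALID_PE are hypothetical stand-ins, and
each PE is assumed to own one contiguous RID range.

```c
#include <assert.h>
#include <stddef.h>

#define INVALID_PE	0xffffffffu	/* hypothetical stand-in for IODA_INVALID_PE */

/* Hypothetical, simplified PE: owns the RIDs in [rid_start, rid_end) */
struct pe_entry {
	unsigned int pe_number;
	unsigned int rid_start;
	unsigned int rid_end;
	const struct pe_entry *next;
};

/* Walk the list and return the PE number owning rid, or INVALID_PE */
static unsigned int pe_lookup(const struct pe_entry *head, unsigned int rid)
{
	const struct pe_entry *pe;

	assert(rid <= 0xffff);	/* a RID is 16 bits: {bus, devfn} */

	for (pe = head; pe; pe = pe->next)
		if (rid >= pe->rid_start && rid < pe->rid_end)
			return pe->pe_number;

	return INVALID_PE;
}
```

The lookup cost grows with the number of PEs on the PHB, whereas the array
indexed by RID is a single load; that is the trade-off in question.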

>
>>
>>This extends the PE# capacity for every PHB. After that, the PE number
>>is represented by 4-bytes value. Then we can reuse IODA_INVALID_PE to
>>check the PE# in phb->pe_rmap[] is valid or not.
>
>Looks like using IODA_INVALID_PE is the only reason for this patch.
>

For now, yes. In the near future, it needs to be extended to represent
16-bit PE numbers for PHB4, as I explained above.

>
>>
>>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>Reviewed-by: Daniel Axtens <dja@axtens.net>
>>---
>> arch/powerpc/platforms/powernv/pci-ioda.c | 6 +++++-
>> arch/powerpc/platforms/powernv/pci.h      | 7 ++-----
>> 2 files changed, 7 insertions(+), 6 deletions(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>index cbd4c0b..cf96cb5 100644
>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>@@ -768,7 +768,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe)
>>
>> 	/* Clear the reverse map */
>> 	for (rid = pe->rid; rid < rid_end; rid++)
>>-		phb->ioda.pe_rmap[rid] = 0;
>>+		phb->ioda.pe_rmap[rid] = IODA_INVALID_PE;
>>
>> 	/* Release from all parents PELT-V */
>> 	while (parent) {
>>@@ -3406,6 +3406,10 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>> 	if (prop32)
>> 		phb->ioda.reserved_pe_idx = be32_to_cpup(prop32);
>>
>>+	/* Invalidate RID to PE# mapping */
>>+	for (segno = 0; segno < ARRAY_SIZE(phb->ioda.pe_rmap); segno++)
>>+		phb->ioda.pe_rmap[segno] = IODA_INVALID_PE;
>>+
>> 	/* Parse 64-bit MMIO range */
>> 	pnv_ioda_parse_m64_window(phb);
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
>>index 904f60b..80f5326 100644
>>--- a/arch/powerpc/platforms/powernv/pci.h
>>+++ b/arch/powerpc/platforms/powernv/pci.h
>>@@ -156,11 +156,8 @@ struct pnv_phb {
>> 		struct list_head	pe_list;
>> 		struct mutex            pe_list_mutex;
>>
>>-		/* Reverse map of PEs, will have to extend if
>>-		 * we are to support more than 256 PEs, indexed
>>-		 * bus { bus, devfn }
>>-		 */
>>-		unsigned char		pe_rmap[0x10000];
>>+		/* Reverse map of PEs, indexed by {bus, devfn} */
>>+		unsigned int		pe_rmap[0x10000];
>>
>> 		/* TCE cache invalidate registers (physical and
>> 		 * remapped)
>>
>
>
>-- 
>Alexey
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 22/22] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2016-05-06  0:28       ` Gavin Shan
@ 2016-05-06 13:12         ` Rob Herring
  2016-05-08 23:51           ` Gavin Shan
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Herring @ 2016-05-06 13:12 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linuxppc-dev, linux-pci, devicetree, Benjamin Herrenschmidt,
	Michael Ellerman, Alexey Kardashevskiy, Bjorn Helgaas, dja,
	Alistair Popple

On Thu, May 5, 2016 at 7:28 PM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> On Thu, May 05, 2016 at 12:04:49PM -0500, Rob Herring wrote:
>>On Tue, May 3, 2016 at 8:22 AM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
>>> This adds standalone driver to support PCI hotplug for PowerPC PowerNV
>>> platform that runs on top of skiboot firmware. The firmware identifies
>>> hotpluggable slots and marked their device tree node with proper
>>> "ibm,slot-pluggable" and "ibm,reset-by-firmware". The driver scans
>>> device tree nodes to create/register PCI hotplug slot accordingly.
>>>
>>> The PCI slots are organized in fashion of tree, which means one
>>> PCI slot might have parent PCI slot and parent PCI slot possibly
>>> contains multiple child PCI slots. At the plugging time, the parent
>>> PCI slot is populated before its children. The child PCI slots are
>>> removed before their parent PCI slot can be removed from the system.
>>>
>>> If the skiboot firmware doesn't support slot status retrieval, the PCI
>>> slot device node shouldn't have property "ibm,reset-by-firmware". In
>>> that case, none of valid PCI slots will be detected from device tree.
>>> The skiboot firmware doesn't export the capability to access attention
>>> LEDs yet and it's something for TBD.
>>>
>>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>>
>>[...]
>>
>>> +static void pnv_php_handle_poweron(struct pnv_php_slot *php_slot)
>>> +{
>>> +       void *fdt, *fdt1, *dt;
>>> +       int confirm = PNV_PHP_POWER_CONFIRMED_SUCCESS;
>>> +       int ret;
>>> +
>>> +       /* We don't know the FDT blob size. We try to get it through
>>> +        * maximal memory chunk and then copy it to another chunk that
>>> +        * fits the real size.
>>> +        */
>>> +       fdt1 = kzalloc(0x10000, GFP_KERNEL);
>>> +       if (!fdt1)
>>> +               goto error;
>>> +
>>> +       ret = pnv_pci_get_device_tree(php_slot->dn->phandle, fdt1, 0x10000);
>>> +       if (ret)
>>> +               goto free_fdt1;
>>> +
>>> +       fdt = kzalloc(fdt_totalsize(fdt1), GFP_KERNEL);
>>> +       if (!fdt)
>>> +               goto free_fdt1;
>>> +
>>> +       /* Unflatten device tree blob */
>>> +       memcpy(fdt, fdt1, fdt_totalsize(fdt1));
>>
>>This is wrong. If the size is greater than 64K, then you will be
>>overrunning the fdt1 buffer. You need to fetch the FDT again if it is
>>bigger than 64KB.
>>
>
> Thanks for review, Rob. Sorry that I don't see how it's a problem. An
> errcode is returned from pnv_pci_get_device_tree() if the FDT blob
> size is greater than 64K. In this case, memcpy() won't be triggered.
> pnv_pci_get_device_tree() relies on firmware implementation which
> avoids overrunning the buffer.

Okay, I missed that pnv_pci_get_device_tree would error out.

> On the other hand, it would be reasonable to retry retrieving the
> FDT blob if the 64K buffer isn't enough. Also, kzalloc() can be replaced
> with alloc_pages() as 64K is the default page size on PPC64. I will
> have something like below unless someone has more concerns. As the
> size of the allocated buffer will be greater than the real FDT blob
> size, some memory (not too much) is wasted. I guess it should be ok.
>
>         struct page *page;
>         void *fdt;
>         unsigned int order;
>         int ret;
>
>         for (order = 0; order < MAX_ORDER; order++) {
>                 page = alloc_pages(GFP_KERNEL, order);
>                 if (page) {
>                         fdt = page_address(page);
>                         ret = pnv_pci_get_device_tree(php_slot->dn->phandle,
>                                                       fdt, (1 << order) * PAGE_SIZE);
>                         if (ret) {
>                                 dev_dbg(&php_slot->pdev.dev, "Error %d getting device tree (%d)\n",
>                                         ret, order);
>                                 free_pages(fdt, order);
>                                 continue;
>                         }
>                 }
>         }

I would allocate a minimal buffer to read the header, get the actual
size, then allocate a new buffer. There's no point in looping. If you
know 64KB is the biggest size you should ever see, then how you had it
is reasonable, too.
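
The two-step approach could be sketched roughly as below, in plain userspace
C so that it compiles on its own. It assumes the fetch routine hands back
just the first bytes of the blob when given a small buffer, which the
firmware call discussed in this thread may not do (it is described as
returning an error when the buffer is too small). fetch_fn, fake_fetch() and
fetch_whole_fdt() are hypothetical names, not part of the patch; the only
facts relied on are the FDT header fields (big-endian magic 0xd00dfeed at
offset 0, big-endian totalsize at offset 4).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FDT_MAGIC	0xd00dfeedu
#define FDT_HDR_MIN	8	/* magic + totalsize */

/* Hypothetical fetch callback: fills buf with up to len bytes, 0 on success */
typedef int (*fetch_fn)(void *buf, size_t len);

/* Read a big-endian 32-bit value from a byte buffer */
static uint32_t be32_at(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Step 1: fetch only the header and read totalsize.
 * Step 2: allocate exactly totalsize bytes and fetch again.
 * Returns a heap-allocated blob (caller frees) or NULL on failure.
 */
static void *fetch_whole_fdt(fetch_fn fetch, size_t *sizep)
{
	unsigned char hdr[FDT_HDR_MIN];
	uint32_t total;
	void *fdt;

	assert(fetch && sizep);

	if (fetch(hdr, sizeof(hdr)) || be32_at(hdr) != FDT_MAGIC)
		return NULL;

	total = be32_at(hdr + 4);
	if (total < FDT_HDR_MIN)
		return NULL;

	fdt = calloc(1, total);
	if (!fdt)
		return NULL;

	if (fetch(fdt, total)) {
		free(fdt);
		return NULL;
	}

	*sizep = total;
	return fdt;
}

/* Stand-in for the firmware: a 16-byte blob with a valid header */
static const unsigned char fake_blob[16] = {
	0xd0, 0x0d, 0xfe, 0xed,		/* magic */
	0x00, 0x00, 0x00, 0x10,		/* totalsize = 16 */
	1, 2, 3, 4, 5, 6, 7, 8,
};

/* Copies at most len bytes of fake_blob, mimicking a truncating fetch */
static int fake_fetch(void *buf, size_t len)
{
	if (len > sizeof(fake_blob))
		len = sizeof(fake_blob);
	memcpy(buf, fake_blob, len);
	return 0;
}
```

If the real fetch cannot return a partial blob, the single fixed-size buffer
remains the simpler choice, as noted above.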

Rob

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v9 22/22] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2016-05-06 13:12         ` Rob Herring
@ 2016-05-08 23:51           ` Gavin Shan
  0 siblings, 0 replies; 41+ messages in thread
From: Gavin Shan @ 2016-05-08 23:51 UTC (permalink / raw)
  To: Rob Herring
  Cc: Gavin Shan, linuxppc-dev, linux-pci, devicetree,
	Benjamin Herrenschmidt, Michael Ellerman, Alexey Kardashevskiy,
	Bjorn Helgaas, dja, Alistair Popple

On Fri, May 06, 2016 at 08:12:42AM -0500, Rob Herring wrote:
>On Thu, May 5, 2016 at 7:28 PM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
>> On Thu, May 05, 2016 at 12:04:49PM -0500, Rob Herring wrote:
>>>On Tue, May 3, 2016 at 8:22 AM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
>>>> This adds a standalone driver to support PCI hotplug for the PowerPC
>>>> PowerNV platform that runs on top of skiboot firmware. The firmware
>>>> identifies hotpluggable slots and marks their device tree nodes with the
>>>> proper "ibm,slot-pluggable" and "ibm,reset-by-firmware" properties. The
>>>> driver scans the device tree nodes to create and register PCI hotplug
>>>> slots accordingly.
>>>>
>>>> The PCI slots are organized as a tree: a PCI slot might have a parent
>>>> PCI slot, and a parent PCI slot can contain multiple child PCI slots.
>>>> At plug time, the parent PCI slot is populated before its children.
>>>> The child PCI slots are removed before their parent PCI slot can be
>>>> removed from the system.
>>>>
>>>> If the skiboot firmware doesn't support slot status retrieval, the PCI
>>>> slot device node shouldn't have the property "ibm,reset-by-firmware". In
>>>> that case, no valid PCI slots will be detected from the device tree.
>>>> The skiboot firmware doesn't export the capability to access attention
>>>> LEDs yet; that is still to be done.
>>>>
>>>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>>> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>>>
>>>[...]
>>>
>>>> +static void pnv_php_handle_poweron(struct pnv_php_slot *php_slot)
>>>> +{
>>>> +       void *fdt, *fdt1, *dt;
>>>> +       int confirm = PNV_PHP_POWER_CONFIRMED_SUCCESS;
>>>> +       int ret;
>>>> +
>>>> +       /* We don't know the FDT blob size. We try to get it through
>>>> +        * maximal memory chunk and then copy it to another chunk that
>>>> +        * fits the real size.
>>>> +        */
>>>> +       fdt1 = kzalloc(0x10000, GFP_KERNEL);
>>>> +       if (!fdt1)
>>>> +               goto error;
>>>> +
>>>> +       ret = pnv_pci_get_device_tree(php_slot->dn->phandle, fdt1, 0x10000);
>>>> +       if (ret)
>>>> +               goto free_fdt1;
>>>> +
>>>> +       fdt = kzalloc(fdt_totalsize(fdt1), GFP_KERNEL);
>>>> +       if (!fdt)
>>>> +               goto free_fdt1;
>>>> +
>>>> +       /* Unflatten device tree blob */
>>>> +       memcpy(fdt, fdt1, fdt_totalsize(fdt1));
>>>
>>>This is wrong. If the size is greater than 64K, then you will be
>>>overrunning the fdt1 buffer. You need to fetch the FDT again if it is
>>>bigger than 64KB.
>>>
>>
>> Thanks for review, Rob. Sorry that I don't see how it's a problem. An
>> errcode is returned from pnv_pci_get_device_tree() if the FDT blob
>> size is greater than 64K. In this case, memcpy() won't be triggered.
>> pnv_pci_get_device_tree() relies on firmware implementation which
>> avoids overrunning the buffer.
>
>Okay, I missed that pnv_pci_get_device_tree would error out.
>
>> On the other hand, it would be reasonable to retry retrieving the
>> FDT blob if the 64K buffer isn't enough. Also, kzalloc() can be replaced
>> with alloc_pages() as 64K is the default page size on PPC64. I will
>> use something like the code below unless someone has more concerns. As
>> the allocated buffer will be larger than the real FDT blob, some
>> memory (not too much) is wasted. I guess it should be ok.
>>
>>         struct page *page;
>>         void *fdt;
>>         unsigned int order;
>>         int ret;
>>
>>         for (order = 0; order < MAX_ORDER; order++) {
>>                 page = alloc_pages(GFP_KERNEL, order);
>>                 if (page) {
>>                         fdt = page_address(page);
>>                         ret = pnv_pci_get_device_tree(php_slot->dn->phandle,
>>                                                       fdt, (1 << order) * PAGE_SIZE);
>>                         if (ret) {
>>                                 dev_dbg(&php_slot->pdev.dev, "Error %d getting device tree (%d)\n",
>>                                         ret, order);
>>                                 free_pages(fdt, order);
>>                                 continue;
>>                         }
>>                 }
>>         }
>
>I would allocate a minimal buffer to read the header, get the actual
>size, then allocate a new buffer. There's no point in looping. If you
>know 64KB is the biggest size you should ever see, then how you had it
>is reasonable, too.
>

The interface pnv_pci_get_device_tree() returns the header and metadata
in one shot, as it was designed to. Also, 64KB is good enough here as far
as I can see. I will keep the code as-is. Thanks a lot for looking into
the details, Rob.
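
As a rough user-space illustration of what keeping the 64KB approach amounts to, the sketch below fetches the whole blob in one call into a 64KB scratch buffer and then copies it into a buffer of the real size. mock_get_device_tree() is a hypothetical stand-in that, like the behaviour described earlier in this thread for pnv_pci_get_device_tree(), fails when the blob does not fit. struct mock_firmware, be32_at() and fetch_and_trim() are likewise made-up names. The explicit totalsize check is an extra guard added in this sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE    0x10000u	/* 64KB, the size used in the patch */
#define FDT_MAGIC       0xd00dfeedu
#define FDT_HEADER_SIZE 40u

/* Hypothetical firmware stand-in: hands over the whole blob in one call,
 * or fails without writing anything if it does not fit in len bytes. */
struct mock_firmware {
	const uint8_t *blob;
	size_t size;
};

static int mock_get_device_tree(struct mock_firmware *fw, void *buf, size_t len)
{
	if (fw->size > len)
		return -1;
	memcpy(buf, fw->blob, fw->size);
	return 0;
}

/* Read a big-endian 32-bit value from the blob. */
static uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* One-shot fetch into a zeroed scratch buffer, then copy into a buffer
 * of exactly totalsize bytes. Returns NULL on any failure. */
static void *fetch_and_trim(struct mock_firmware *fw, size_t *size_out)
{
	uint8_t *scratch = calloc(1, SCRATCH_SIZE);
	void *blob = NULL;
	uint32_t total;

	if (!scratch)
		return NULL;
	if (mock_get_device_tree(fw, scratch, SCRATCH_SIZE))
		goto out;
	if (be32_at(scratch) != FDT_MAGIC)
		goto out;

	total = be32_at(scratch + 4);	/* totalsize field of the header */
	/* Bound the copy explicitly instead of relying on the fetch failing. */
	if (total < FDT_HEADER_SIZE || total > SCRATCH_SIZE)
		goto out;

	blob = malloc(total);
	if (blob) {
		memcpy(blob, scratch, total);
		*size_out = total;
	}
out:
	free(scratch);
	return blob;
}
```

The cost is one temporary 64KB allocation per power-on event; the buffer that outlives the call is only as large as the blob.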

Thanks,
Gavin

* Re: [PATCH v9 15/22] powerpc/powernv: Functions to get/set PCI slot state
  2016-05-03 13:22 ` [PATCH v9 15/22] powerpc/powernv: Functions to get/set PCI slot state Gavin Shan
@ 2016-05-11  3:28   ` Alistair Popple
  0 siblings, 0 replies; 41+ messages in thread
From: Alistair Popple @ 2016-05-11  3:28 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Gavin Shan, devicetree, aik, linux-pci, robherring2, bhelgaas,
	dja, Stewart Smith

Gavin,

On Tue, 3 May 2016 23:22:46 Gavin Shan wrote:
> This exports 4 functions, which are based on the corresponding OPAL
> APIs, to get/set PCI slot status. Those functions are going to
> be used by the PowerNV PCI hotplug driver:
> 
>    pnv_pci_get_device_tree()    opal_get_device_tree()
>    pnv_pci_get_presence_state() opal_pci_get_presence_state()
>    pnv_pci_get_power_state()    opal_pci_get_power_state()
>    pnv_pci_set_power_state()    opal_pci_set_power_state()
> 
> In addition, the patch exports pnv_pci_hotplug_notifier_{register,
> unregister}() to allow registration and unregistration of a PCI hotplug
> notifier, which will be used to receive PCI hotplug messages from
> skiboot firmware in the PowerNV PCI hotplug driver.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  arch/powerpc/include/asm/opal-api.h            |  18 ++++-
>  arch/powerpc/include/asm/opal.h                |   5 ++
>  arch/powerpc/include/asm/pnv-pci.h             |   7 ++
>  arch/powerpc/platforms/powernv/opal-wrappers.S |   5 ++
>  arch/powerpc/platforms/powernv/pci.c           | 102 +++++++++++++++++++++++++
>  5 files changed, 136 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 9bb8ddf..728e04e 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -158,7 +158,12 @@
>  #define OPAL_LEDS_SET_INDICATOR			115
>  #define OPAL_CEC_REBOOT2			116
>  #define OPAL_CONSOLE_FLUSH			117
> -#define OPAL_LAST				117
> +#define OPAL_GET_DEVICE_TREE			118
> +#define OPAL_PCI_GET_PRESENCE_STATE		119
> +#define OPAL_PCI_GET_POWER_STATE		120
> +#define OPAL_PCI_SET_POWER_STATE		121
> +#define OPAL_PCI_POLL2				122
> +#define OPAL_LAST				122
>  
>  /* Device tree flags */
>  
> @@ -344,6 +349,16 @@ enum OpalPciResetState {
>  	OPAL_ASSERT_RESET   = 1
>  };
>  
> +enum OpalPciSlotPresentenceState {
> +	OPAL_PCI_SLOT_EMPTY	= 0,
> +	OPAL_PCI_SLOT_PRESENT	= 1
> +};
> +
> +enum OpalPciSlotPowerState {
> +	OPAL_PCI_SLOT_POWER_OFF	= 0,
> +	OPAL_PCI_SLOT_POWER_ON	= 1
> +};
> +
>  enum OpalSlotLedType {
>  	OPAL_SLOT_LED_TYPE_ID = 0,	/* IDENTIFY LED */
>  	OPAL_SLOT_LED_TYPE_FAULT = 1,	/* FAULT LED */
> @@ -378,6 +393,7 @@ enum opal_msg_type {
>  	OPAL_MSG_DPO		= 5,
>  	OPAL_MSG_PRD		= 6,
>  	OPAL_MSG_OCC		= 7,
> +	OPAL_MSG_PCI_HOTPLUG	= 8,
>  	OPAL_MSG_TYPE_MAX,
>  };
>  
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index 348132c..1a83c80 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -209,6 +209,11 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf,
>  		uint64_t size, uint64_t token);
>  int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size,
>  		uint64_t token);
> +int64_t opal_get_device_tree(uint32_t phandle, uint64_t buf, uint64_t len);
> +int64_t opal_pci_get_presence_state(uint64_t id, uint64_t data);
> +int64_t opal_pci_get_power_state(uint64_t id, uint64_t data);
> +int64_t opal_pci_set_power_state(uint64_t id, uint64_t data);
> +int64_t opal_pci_poll2(uint64_t id, uint64_t data);
>  
>  /* Internal functions */
>  extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
> diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
> index c607902..8db7439 100644
> --- a/arch/powerpc/include/asm/pnv-pci.h
> +++ b/arch/powerpc/include/asm/pnv-pci.h
> @@ -17,6 +17,13 @@
>  #define PCI_SLOT_ID(phb_id, bdfn)	\
>  	(PCI_SLOT_ID_PREFIX | ((uint64_t)(bdfn) << 16) | (phb_id))
>  
> +extern int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t len);
> +extern int pnv_pci_get_presence_state(uint64_t id, uint8_t *state);
> +extern int pnv_pci_get_power_state(uint64_t id, uint8_t *state);
> +extern int pnv_pci_set_power_state(uint64_t id, uint8_t state);
> +extern int pnv_pci_hotplug_notifier_register(struct notifier_block *nb);
> +extern int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb);
> +
>  int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode);
>  int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
>  			   unsigned int virq);
> diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
> index e45b88a..60397d2 100644
> --- a/arch/powerpc/platforms/powernv/opal-wrappers.S
> +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
> @@ -302,3 +302,8 @@ OPAL_CALL(opal_prd_msg,				OPAL_PRD_MSG);
>  OPAL_CALL(opal_leds_get_ind,			OPAL_LEDS_GET_INDICATOR);
>  OPAL_CALL(opal_leds_set_ind,			OPAL_LEDS_SET_INDICATOR);
>  OPAL_CALL(opal_console_flush,			OPAL_CONSOLE_FLUSH);
> +OPAL_CALL(opal_get_device_tree,			OPAL_GET_DEVICE_TREE);
> +OPAL_CALL(opal_pci_get_presence_state,		OPAL_PCI_GET_PRESENCE_STATE);
> +OPAL_CALL(opal_pci_get_power_state,		OPAL_PCI_GET_POWER_STATE);
> +OPAL_CALL(opal_pci_set_power_state,		OPAL_PCI_SET_POWER_STATE);
> +OPAL_CALL(opal_pci_poll2,			OPAL_PCI_POLL2);
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 67a33e9..6e10ac4 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -42,6 +42,108 @@
>  #define cfg_dbg(fmt...)	do { } while(0)
>  //#define cfg_dbg(fmt...)	printk(fmt)
>  
> +int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t len)
> +{
> +	int64_t rc;
> +
> +	if (!opal_check_token(OPAL_GET_DEVICE_TREE))
> +		return -ENXIO;
> +
> +	rc = opal_get_device_tree(phandle, (uint64_t)buf, len);
> +	if (rc != OPAL_SUCCESS)
> +		return -EIO;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(pnv_pci_get_device_tree);
> +
> +static int pnv_pci_poll2(uint64_t id, uint8_t *state)
> +{
> +	int64_t rc;
> +
> +	if (!opal_check_token(OPAL_PCI_POLL2))
> +		return -ENXIO;
> +
> +	while (1) {
> +		rc = opal_pci_poll2(id, (uint64_t)state);
> +		if (rc <= OPAL_SUCCESS)
> +			break;
> +
> +		if (system_state < SYSTEM_RUNNING)
> +			udelay(1000 * rc);
> +		else
> +			msleep(rc);

As we discussed offline, I think this interface needs to change. Skiboot now
has good support for timers, which can be scheduled to crank state
machines and so on, rather than relying on the kernel.

If the kernel needs to wait for completion, it can poll on a relevant status,
for example by using opal_pci_get_power_state() to return the current state,
or, perhaps better, it could wait to receive a message or interrupt asynchronously.
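
One way to picture the polling alternative is the bounded loop below, written as a user-space sketch rather than kernel code. The status query and the delay are passed in as callbacks; get_state_fn, delay_fn, wait_for_power_state() and the mock slot are hypothetical names, and the poll count and interval are arbitrary. In the kernel the query would presumably wrap opal_pci_get_power_state() and the delay would be something like msleep().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

enum slot_power {
	SLOT_POWER_OFF = 0,
	SLOT_POWER_ON  = 1
};

/* Hypothetical status query: stores the current state, returns 0 on success. */
typedef int (*get_state_fn)(void *ctx, uint8_t *state);
/* Hypothetical delay hook so the caller decides how to wait. */
typedef void (*delay_fn)(unsigned int ms);

/* Poll the slot status until it matches 'want' or the poll budget runs out.
 * The firmware is assumed to advance its own state machine in between. */
static int wait_for_power_state(get_state_fn get_state, void *ctx,
				uint8_t want, unsigned int max_polls,
				unsigned int interval_ms, delay_fn delay)
{
	unsigned int i;
	uint8_t state;

	for (i = 0; i < max_polls; i++) {
		if (get_state(ctx, &state))
			return -EIO;		/* the query itself failed */
		if (state == want)
			return 0;
		delay(interval_ms);
	}
	return -ETIMEDOUT;
}

/* Mock slot that reports "on" only after a number of queries. */
struct mock_slot {
	unsigned int queries_until_on;
	unsigned int queries;
};

static int mock_get_power_state(void *ctx, uint8_t *state)
{
	struct mock_slot *slot = ctx;

	slot->queries++;
	*state = slot->queries > slot->queries_until_on ?
		 SLOT_POWER_ON : SLOT_POWER_OFF;
	return 0;
}

static void no_delay(unsigned int ms)
{
	(void)ms;	/* a real caller would sleep here */
}
```

Unlike the opal_pci_poll2() loop, the kernel here never drives the firmware state machine; it only observes the result and gives up after a fixed budget.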

Regards,

Alistair

> +	}
> +
> +	if (rc != OPAL_SUCCESS)
> +		return -EIO;
> +
> +	return 0;
> +}
> +
> +int pnv_pci_get_presence_state(uint64_t id, uint8_t *state)
> +{
> +	int64_t rc;
> +
> +	if (!opal_check_token(OPAL_PCI_GET_PRESENCE_STATE))
> +		return -ENXIO;
> +
> +	rc = opal_pci_get_presence_state(id, (uint64_t)state);
> +	if (rc == OPAL_SUCCESS)
> +		return 0;
> +	else if (rc < OPAL_SUCCESS)
> +		return -EIO;
> +
> +	return pnv_pci_poll2(id, state);
> +}
> +EXPORT_SYMBOL_GPL(pnv_pci_get_presence_state);
> +
> +int pnv_pci_get_power_state(uint64_t id, uint8_t *state)
> +{
> +	int64_t rc;
> +
> +	if (!opal_check_token(OPAL_PCI_GET_POWER_STATE))
> +		return -ENXIO;
> +
> +	rc = opal_pci_get_power_state(id, (uint64_t)state);
> +	if (rc == OPAL_SUCCESS)
> +		return 0;
> +	else if (rc < OPAL_SUCCESS)
> +		return -EIO;
> +
> +	return pnv_pci_poll2(id, state);
> +}
> +EXPORT_SYMBOL_GPL(pnv_pci_get_power_state);
> +
> +int pnv_pci_set_power_state(uint64_t id, uint8_t state)
> +{
> +	int64_t rc;
> +
> +	if (!opal_check_token(OPAL_PCI_SET_POWER_STATE))
> +		return -ENXIO;
> +
> +	rc = opal_pci_set_power_state(id, (uint64_t)&state);
> +	if (rc == OPAL_SUCCESS)
> +		return 0;
> +	else if (rc < OPAL_SUCCESS)
> +		return -EIO;
> +
> +	return pnv_pci_poll2(id, &state);
> +}
> +EXPORT_SYMBOL_GPL(pnv_pci_set_power_state);
> +
> +int pnv_pci_hotplug_notifier_register(struct notifier_block *nb)
> +{
> +	return opal_message_notifier_register(OPAL_MSG_PCI_HOTPLUG, nb);
> +}
> +EXPORT_SYMBOL_GPL(pnv_pci_hotplug_notifier_register);
> +
> +int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb)
> +{
> +	return opal_message_notifier_unregister(OPAL_MSG_PCI_HOTPLUG, nb);
> +}
> +EXPORT_SYMBOL_GPL(pnv_pci_hotplug_notifier_unregister);
> +
>  #ifdef CONFIG_PCI_MSI
>  int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
>  {
> 

Thread overview: 41+ messages
2016-05-03 13:22 [PATCH v9 00/22] powerpc/powernv: PCI hotplug support Gavin Shan
2016-05-03 13:22 ` [PATCH v9 01/22] PCI: Add pcibios_setup_bridge() Gavin Shan
2016-05-03 13:22 ` [PATCH v9 02/22] powerpc/pci: Override pcibios_setup_bridge() Gavin Shan
2016-05-03 13:22 ` [PATCH v9 03/22] powerpc/powernv: Move pnv_pci_ioda_setup_opal_tce_kill() around Gavin Shan
2016-05-06  6:36   ` Alexey Kardashevskiy
2016-05-03 13:22 ` [PATCH v9 04/22] powerpc/powernv: Increase PE# capacity Gavin Shan
2016-05-06  7:17   ` Alexey Kardashevskiy
2016-05-06 11:05     ` Gavin Shan
2016-05-03 13:22 ` [PATCH v9 05/22] powerpc/powernv: Allocate PE# in reverse order Gavin Shan
2016-05-03 13:22   ` Gavin Shan
2016-05-03 13:22 ` [PATCH v9 06/22] powerpc/powernv: Create PEs in pcibios_setup_bridge() Gavin Shan
2016-05-03 13:22 ` [PATCH v9 07/22] powerpc/powernv: Setup PE for root bus Gavin Shan
2016-05-03 13:22 ` [PATCH v9 08/22] powerpc/powernv: Extend PCI bridge resources Gavin Shan
2016-05-03 13:22 ` [PATCH v9 09/22] powerpc/powernv: Make pnv_ioda_deconfigure_pe() visible Gavin Shan
2016-05-03 13:22 ` [PATCH v9 10/22] powerpc/powernv: Dynamically release PE Gavin Shan
2016-05-03 13:22   ` Gavin Shan
2016-05-03 13:22 ` [PATCH v9 11/22] powerpc/pci: Update bridge windows on PCI plug Gavin Shan
2016-05-03 13:22 ` [PATCH v9 12/22] powerpc/pci: Delay populating pdn Gavin Shan
2016-05-03 13:22 ` [PATCH v9 13/22] powerpc/powernv: Support PCI slot ID Gavin Shan
2016-05-03 13:22 ` [PATCH v9 14/22] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
2016-05-03 13:22 ` [PATCH v9 15/22] powerpc/powernv: Functions to get/set PCI slot state Gavin Shan
2016-05-11  3:28   ` Alistair Popple
2016-05-03 13:22 ` [PATCH v9 16/22] drivers/of: Split unflatten_dt_node() Gavin Shan
2016-05-03 13:22 ` [PATCH v9 17/22] drivers/of: Avoid recursively calling unflatten_dt_node() Gavin Shan
2016-05-03 13:22   ` Gavin Shan
2016-05-03 13:22 ` [PATCH v9 18/22] drivers/of: Rename unflatten_dt_node() Gavin Shan
2016-05-03 13:22   ` Gavin Shan
2016-05-03 13:22 ` [PATCH v9 19/22] drivers/of: Specify parent node in of_fdt_unflatten_tree() Gavin Shan
     [not found] ` <1462281773-26438-1-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2016-05-03 13:22   ` [PATCH v9 20/22] drivers/of: Return allocated memory from of_fdt_unflatten_tree() Gavin Shan
2016-05-03 13:22     ` Gavin Shan
2016-05-03 13:22 ` [PATCH v9 21/22] drivers/of: Export of_detach_node() Gavin Shan
2016-05-03 13:22   ` Gavin Shan
2016-05-04 13:36   ` Gavin Shan
2016-05-05 19:42     ` Rob Herring
2016-05-06  0:40       ` Gavin Shan
2016-05-03 13:22 ` [PATCH v9 22/22] PCI/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
     [not found]   ` <1462281773-26438-23-git-send-email-gwshan-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2016-05-05 17:04     ` Rob Herring
2016-05-05 17:04       ` Rob Herring
2016-05-06  0:28       ` Gavin Shan
2016-05-06 13:12         ` Rob Herring
2016-05-08 23:51           ` Gavin Shan