linux-pci.vger.kernel.org archive mirror
* [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
@ 2012-07-26 19:55 Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init Thierry Reding
                   ` (11 more replies)
  0 siblings, 12 replies; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

This patch series adds support for device tree based probing of the PCIe
controller found on Tegra SoCs.

Patches 1 and 2 keep the pci_fixup_irqs() and ARM-specific
pci_common_init() functions around after init. This is required to
support driver probe deferral, which may cause built-in drivers to be
probed after __init data has already been freed.

Patch 3 allows a driver's probe function to pass per-controller private
data when calling the pci_common_init() function.
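
As a rough illustration of what patch 3 enables, the following userspace sketch models the per-controller hand-off. The two structures are heavily simplified stand-ins for the ARM ones, and init_sys_data() is a hypothetical helper name; only the private_data assignment mirrors what the patch adds to pcibios_init_hw().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the ARM structures; only the fields needed
 * for this illustration are kept. */
struct pci_sys_data {
	int busnr;
	void *private_data;
};

struct hw_pci {
	int nr_controllers;
	void **private_data;	/* optional: one pointer per controller */
};

/* Hypothetical helper: set up the pci_sys_data of controller nr and, if
 * the driver supplied an array, copy that controller's private pointer. */
static void init_sys_data(const struct hw_pci *hw, int nr,
			  struct pci_sys_data *sys)
{
	assert(nr >= 0 && nr < hw->nr_controllers);

	sys->busnr = 0;
	sys->private_data = NULL;

	if (hw->private_data)
		sys->private_data = hw->private_data[nr];
}
```

A driver that leaves private_data NULL sees no change in behaviour, which is why existing callers of pci_common_init() should not need to be touched.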

Patch 4 is trivial and has already been acked by Stephen Warren.

Patch 5 adds a flag to mark a struct resource as defining a PCI
configuration space. The flag will be used subsequently to differentiate
between memory-mapped I/O regions and PCI configuration space.
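
A minimal userspace sketch of how such a flag could be tested is shown below. The struct resource here is a cut-down stand-in, and resource_is_pci_config() is a hypothetical helper that is not part of this series; the flag values follow include/linux/ioport.h, with IORESOURCE_PCI_CS being the bit proposed in patch 5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Resource type bits as defined in include/linux/ioport.h. */
#define IORESOURCE_IO		0x00000100
#define IORESOURCE_MEM		0x00000200

/* PCI control bits; IORESOURCE_PCI_CS is the flag proposed by patch 5. */
#define IORESOURCE_PCI_FIXED	(1 << 4)
#define IORESOURCE_PCI_CS	(1 << 5)

/* Cut-down stand-in for the kernel's struct resource. */
struct resource {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/* Hypothetical helper: true for a memory resource that is marked as
 * PCI configuration space, false for plain MMIO or I/O port ranges. */
static bool resource_is_pci_config(const struct resource *res)
{
	assert(res != NULL);

	return (res->flags & IORESOURCE_MEM) &&
	       (res->flags & IORESOURCE_PCI_CS);
}
```

Because the new bit lives in the bus-specific part of the flags word, it is only meaningful for resources that are known to describe PCI regions.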

Patch 6 rewrites PCIe support as a driver and switches the Harmony and
TrimSlice boards to add the proper platform device instead of calling
the tegra_pcie_init() function. Patch 7 adds MSI support as an IRQ
domain.

Patches 8 and 9 add code to support the new PCIe controller binding used
by patch 10 to instantiate the controller from DT.


Thierry Reding (10):
  PCI: Keep pci_fixup_irqs() around after init
  ARM: pci: Keep pci_common_init() around after init
  ARM: pci: Allow passing per-controller private data
  ARM: tegra: Move tegra_pcie_xclk_clamp() to PMC
  resource: add PCI configuration space support
  ARM: tegra: Rewrite PCIe support as a driver
  ARM: tegra: pcie: Add MSI support
  of/address: Handle #address-cells > 2 specially
  of: Add of_pci_parse_ranges()
  ARM: tegra: pcie: Add device tree support

 .../bindings/pci/nvidia,tegra20-pcie.txt           |   94 ++
 arch/arm/boot/dts/tegra20.dtsi                     |   62 +
 arch/arm/include/asm/mach/pci.h                    |    1 +
 arch/arm/kernel/bios32.c                           |    7 +-
 arch/arm/mach-tegra/Kconfig                        |    1 +
 arch/arm/mach-tegra/board-dt-tegra20.c             |    7 +-
 arch/arm/mach-tegra/board-harmony-pcie.c           |   30 +-
 arch/arm/mach-tegra/board-harmony.c                |    1 +
 arch/arm/mach-tegra/board-harmony.h                |    1 +
 arch/arm/mach-tegra/board-trimslice.c              |   11 +-
 arch/arm/mach-tegra/board.h                        |    2 +-
 arch/arm/mach-tegra/devices.c                      |  142 ++
 arch/arm/mach-tegra/devices.h                      |    3 +
 arch/arm/mach-tegra/include/mach/iomap.h           |    3 -
 arch/arm/mach-tegra/include/mach/irqs.h            |    5 +-
 arch/arm/mach-tegra/include/mach/pci-tegra.h       |   38 +
 arch/arm/mach-tegra/pcie.c                         | 1406 ++++++++++++++------
 arch/arm/mach-tegra/pmc.c                          |   16 +
 arch/arm/mach-tegra/pmc.h                          |    1 +
 drivers/of/address.c                               |    8 +
 drivers/of/of_pci.c                                |   84 +-
 drivers/pci/setup-irq.c                            |    4 +-
 include/linux/ioport.h                             |    2 +-
 include/linux/of_pci.h                             |    2 +
 24 files changed, 1485 insertions(+), 446 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt
 create mode 100644 arch/arm/mach-tegra/include/mach/pci-tegra.h

-- 
1.7.11.2



* [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-08-14  5:06   ` Bjorn Helgaas
  2012-08-15 17:06   ` Bjorn Helgaas
  2012-07-26 19:55 ` [PATCH v3 02/10] ARM: pci: Keep pci_common_init() " Thierry Reding
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

When using deferred driver probing, PCI host controller drivers may
actually require this function after the init stage.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- none

Changes in v2:
- use __devinit annotations

 drivers/pci/setup-irq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c
index eb219a1..f0bcd56 100644
--- a/drivers/pci/setup-irq.c
+++ b/drivers/pci/setup-irq.c
@@ -18,7 +18,7 @@
 #include <linux/cache.h>
 
 
-static void __init
+static void __devinit
 pdev_fixup_irq(struct pci_dev *dev,
 	       u8 (*swizzle)(struct pci_dev *, u8 *),
 	       int (*map_irq)(const struct pci_dev *, u8, u8))
@@ -54,7 +54,7 @@ pdev_fixup_irq(struct pci_dev *dev,
 	pcibios_update_irq(dev, irq);
 }
 
-void __init
+void __devinit
 pci_fixup_irqs(u8 (*swizzle)(struct pci_dev *, u8 *),
 	       int (*map_irq)(const struct pci_dev *, u8, u8))
 {
-- 
1.7.11.2



* [PATCH v3 02/10] ARM: pci: Keep pci_common_init() around after init
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 03/10] ARM: pci: Allow passing per-controller private data Thierry Reding
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

When using deferred driver probing, PCI host controller drivers may
actually require this function after the init stage.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- none

Changes in v2:
- use __devinit annotations

 arch/arm/kernel/bios32.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index 70dabb9..5c3dd59 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -456,7 +456,7 @@ static int __init pcibios_init_resources(int busnr, struct pci_sys_data *sys)
 	return 0;
 }
 
-static void __init pcibios_init_hw(struct hw_pci *hw, struct list_head *head)
+static void __devinit pcibios_init_hw(struct hw_pci *hw, struct list_head *head)
 {
 	struct pci_sys_data *sys = NULL;
 	int ret;
@@ -504,7 +504,7 @@ static void __init pcibios_init_hw(struct hw_pci *hw, struct list_head *head)
 	}
 }
 
-void __init pci_common_init(struct hw_pci *hw)
+void __devinit pci_common_init(struct hw_pci *hw)
 {
 	struct pci_sys_data *sys;
 	LIST_HEAD(head);
-- 
1.7.11.2



* [PATCH v3 03/10] ARM: pci: Allow passing per-controller private data
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 02/10] ARM: pci: Keep pci_common_init() " Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 04/10] ARM: tegra: Move tegra_pcie_xclk_clamp() to PMC Thierry Reding
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

In order to allow drivers to specify private data for each controller,
this commit adds a private_data field to the struct hw_pci. This field
is an array of nr_controllers pointers that will be used to initialize
the private_data field of the corresponding controller's pci_sys_data
structure.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- none

Changes in v2:
- new patch

 arch/arm/include/asm/mach/pci.h | 1 +
 arch/arm/kernel/bios32.c        | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/arm/include/asm/mach/pci.h b/arch/arm/include/asm/mach/pci.h
index 188fd58..31fab93 100644
--- a/arch/arm/include/asm/mach/pci.h
+++ b/arch/arm/include/asm/mach/pci.h
@@ -24,6 +24,7 @@ struct hw_pci {
 #endif
 	struct pci_ops	*ops;
 	int		nr_controllers;
+	void		**private_data;
 	int		(*setup)(int nr, struct pci_sys_data *);
 	struct pci_bus *(*scan)(int nr, struct pci_sys_data *);
 	void		(*preinit)(void);
diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index 5c3dd59..897b21a 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -475,6 +475,9 @@ static void __devinit pcibios_init_hw(struct hw_pci *hw, struct list_head *head)
 		sys->map_irq = hw->map_irq;
 		INIT_LIST_HEAD(&sys->resources);
 
+		if (hw->private_data)
+			sys->private_data = hw->private_data[nr];
+
 		ret = hw->setup(nr, sys);
 
 		if (ret > 0) {
-- 
1.7.11.2



* [PATCH v3 04/10] ARM: tegra: Move tegra_pcie_xclk_clamp() to PMC
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (2 preceding siblings ...)
  2012-07-26 19:55 ` [PATCH v3 03/10] ARM: pci: Allow passing per-controller private data Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 05/10] resource: add PCI configuration space support Thierry Reding
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

The PMC code already accesses the PMC registers, so it makes sense to
move this function there as well. While at it, rename the function to
tegra_pmc_pcie_xclk_clamp() for consistency.
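
The function being moved is a plain read-modify-write of one bit. The userspace sketch below models it with a variable in place of the memory-mapped register; fake_scratch42, fake_pmc_readl(), fake_pmc_writel() and xclk_clamp() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMC_SCRATCH42_PCX_CLAMP	(1u << 0)

/* Hypothetical stand-in for the memory-mapped PMC_SCRATCH42 register. */
static uint32_t fake_scratch42;

static uint32_t fake_pmc_readl(void)
{
	return fake_scratch42;
}

static void fake_pmc_writel(uint32_t value)
{
	fake_scratch42 = value;
}

/* Set or clear the clamp bit while leaving all other bits untouched. */
static void xclk_clamp(bool clamp)
{
	uint32_t old = fake_pmc_readl();
	uint32_t reg = old & ~PMC_SCRATCH42_PCX_CLAMP;

	if (clamp)
		reg |= PMC_SCRATCH42_PCX_CLAMP;

	fake_pmc_writel(reg);

	/* Only bit 0 may differ from the previous register value. */
	assert(((fake_pmc_readl() ^ old) & ~PMC_SCRATCH42_PCX_CLAMP) == 0);
}
```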

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
---
Changes in v3:
- none

Changes in v2:
- none

 arch/arm/mach-tegra/pcie.c | 30 ++++--------------------------
 arch/arm/mach-tegra/pmc.c  | 16 ++++++++++++++++
 arch/arm/mach-tegra/pmc.h  |  1 +
 3 files changed, 21 insertions(+), 26 deletions(-)

diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c
index 576347a..efe71dd 100644
--- a/arch/arm/mach-tegra/pcie.c
+++ b/arch/arm/mach-tegra/pcie.c
@@ -42,6 +42,7 @@
 #include <mach/powergate.h>
 
 #include "board.h"
+#include "pmc.h"
 
 /* register definitions */
 #define AFI_OFFSET	0x3800
@@ -145,17 +146,6 @@
 #define  PADS_PLL_CTL_TXCLKREF_DIV10		(0 << 20)
 #define  PADS_PLL_CTL_TXCLKREF_DIV5		(1 << 20)
 
-/* PMC access is required for PCIE xclk (un)clamping */
-#define PMC_SCRATCH42		0x144
-#define PMC_SCRATCH42_PCX_CLAMP	(1 << 0)
-
-static void __iomem *reg_pmc_base = IO_ADDRESS(TEGRA_PMC_BASE);
-
-#define pmc_writel(value, reg) \
-	__raw_writel(value, reg_pmc_base + (reg))
-#define pmc_readl(reg) \
-	__raw_readl(reg_pmc_base + (reg))
-
 /*
  * Tegra2 defines 1GB in the AXI address map for PCIe.
  *
@@ -647,18 +637,6 @@ static int tegra_pcie_enable_controller(void)
 	return 0;
 }
 
-static void tegra_pcie_xclk_clamp(bool clamp)
-{
-	u32 reg;
-
-	reg = pmc_readl(PMC_SCRATCH42) & ~PMC_SCRATCH42_PCX_CLAMP;
-
-	if (clamp)
-		reg |= PMC_SCRATCH42_PCX_CLAMP;
-
-	pmc_writel(reg, PMC_SCRATCH42);
-}
-
 static void tegra_pcie_power_off(void)
 {
 	tegra_periph_reset_assert(tegra_pcie.pcie_xclk);
@@ -666,7 +644,7 @@ static void tegra_pcie_power_off(void)
 	tegra_periph_reset_assert(tegra_pcie.pex_clk);
 
 	tegra_powergate_power_off(TEGRA_POWERGATE_PCIE);
-	tegra_pcie_xclk_clamp(true);
+	tegra_pmc_pcie_xclk_clamp(true);
 }
 
 static int tegra_pcie_power_regate(void)
@@ -675,7 +653,7 @@ static int tegra_pcie_power_regate(void)
 
 	tegra_pcie_power_off();
 
-	tegra_pcie_xclk_clamp(true);
+	tegra_pmc_pcie_xclk_clamp(true);
 
 	tegra_periph_reset_assert(tegra_pcie.pcie_xclk);
 	tegra_periph_reset_assert(tegra_pcie.afi_clk);
@@ -689,7 +667,7 @@ static int tegra_pcie_power_regate(void)
 
 	tegra_periph_reset_deassert(tegra_pcie.afi_clk);
 
-	tegra_pcie_xclk_clamp(false);
+	tegra_pmc_pcie_xclk_clamp(false);
 
 	clk_prepare_enable(tegra_pcie.afi_clk);
 	clk_prepare_enable(tegra_pcie.pex_clk);
diff --git a/arch/arm/mach-tegra/pmc.c b/arch/arm/mach-tegra/pmc.c
index 7af6a54..399dc3a 100644
--- a/arch/arm/mach-tegra/pmc.c
+++ b/arch/arm/mach-tegra/pmc.c
@@ -24,6 +24,10 @@
 #define PMC_CTRL		0x0
 #define PMC_CTRL_INTR_LOW	(1 << 17)
 
+/* PMC access is required for PCIE xclk (un)clamping */
+#define PMC_SCRATCH42		0x144
+#define PMC_SCRATCH42_PCX_CLAMP	(1 << 0)
+
 static inline u32 tegra_pmc_readl(u32 reg)
 {
 	return readl(IO_ADDRESS(TEGRA_PMC_BASE + reg));
@@ -74,3 +78,15 @@ void __init tegra_pmc_init(void)
 		val &= ~PMC_CTRL_INTR_LOW;
 	tegra_pmc_writel(val, PMC_CTRL);
 }
+
+void tegra_pmc_pcie_xclk_clamp(bool clamp)
+{
+	u32 reg;
+
+	reg = tegra_pmc_readl(PMC_SCRATCH42) & ~PMC_SCRATCH42_PCX_CLAMP;
+
+	if (clamp)
+		reg |= PMC_SCRATCH42_PCX_CLAMP;
+
+	tegra_pmc_writel(reg, PMC_SCRATCH42);
+}
diff --git a/arch/arm/mach-tegra/pmc.h b/arch/arm/mach-tegra/pmc.h
index 8995ee4..2631c9a 100644
--- a/arch/arm/mach-tegra/pmc.h
+++ b/arch/arm/mach-tegra/pmc.h
@@ -19,5 +19,6 @@
 #define __MACH_TEGRA_PMC_H
 
 void tegra_pmc_init(void);
+void tegra_pmc_pcie_xclk_clamp(bool clamp);
 
 #endif
-- 
1.7.11.2



* [PATCH v3 05/10] resource: add PCI configuration space support
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (3 preceding siblings ...)
  2012-07-26 19:55 ` [PATCH v3 04/10] ARM: tegra: Move tegra_pcie_xclk_clamp() to PMC Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-08-14  5:00   ` Bjorn Helgaas
  2012-07-26 19:55 ` [PATCH v3 06/10] ARM: tegra: Rewrite PCIe support as a driver Thierry Reding
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

This commit adds a new flag that allows marking resources as PCI
configuration space.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- new patch

 include/linux/ioport.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 589e0e7..3314843 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -102,7 +102,7 @@ struct resource {
 
 /* PCI control bits.  Shares IORESOURCE_BITS with above PCI ROM.  */
 #define IORESOURCE_PCI_FIXED		(1<<4)	/* Do not move resource */
-
+#define IORESOURCE_PCI_CS		(1<<5)	/* PCI configuration space */
 
 /* helpers to define resources */
 #define DEFINE_RES_NAMED(_start, _size, _name, _flags)			\
-- 
1.7.11.2



* [PATCH v3 06/10] ARM: tegra: Rewrite PCIe support as a driver
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (4 preceding siblings ...)
  2012-07-26 19:55 ` [PATCH v3 05/10] resource: add PCI configuration space support Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 07/10] ARM: tegra: pcie: Add MSI support Thierry Reding
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

This commit adds a platform device driver for the PCIe controller on
Tegra SoCs. Current users of the old code (TrimSlice and Harmony) are
converted and now initialize and register a corresponding platform
device.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- use devm_request_and_ioremap() and devm_clk_get()
- make root ports separate devices
- fix extended configuration space access

Changes in v2:
- use struct hw_pci's new private_data field
- fix DT initialization for TrimSlice
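
The configuration space addressing used by this patch can be sketched in userspace as follows. The PCIE_CONF_* macros are copied from the patch, with explicit unsigned casts added for this illustration; conf_offset() and conf_extract() are hypothetical helper names that mirror the arithmetic in tegra_pcie_read_conf() without touching any hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as in the patch, with unsigned casts added for this sketch. */
#define PCIE_CONF_BUS(b)	((uint32_t)(b) << 16)
#define PCIE_CONF_DEV(d)	((uint32_t)(d) << 11)
#define PCIE_CONF_FUNC(f)	((uint32_t)(f) << 8)
#define PCIE_CONF_REG(r)	((((uint32_t)(r) & 0xf00) << 16) | \
				 ((uint32_t)(r) & ~3u))

/* Offset of the 32-bit word containing register "where" of the given
 * bus/slot/function, relative to the start of the config window. */
static uint32_t conf_offset(unsigned int bus, unsigned int slot,
			    unsigned int func, unsigned int where)
{
	assert(where < 0x1000);	/* PCIe config space is 4 KiB per function */

	return PCIE_CONF_BUS(bus) + PCIE_CONF_DEV(slot) +
	       PCIE_CONF_FUNC(func) + PCIE_CONF_REG(where);
}

/* Pick the requested 1-, 2- or 4-byte value out of the aligned word. */
static uint32_t conf_extract(uint32_t word, unsigned int where, int size)
{
	if (size == 1)
		return (word >> (8 * (where & 3))) & 0xff;
	if (size == 2)
		return (word >> (8 * (where & 3))) & 0xffff;

	return word;
}
```

Accesses are always performed on the aligned 32-bit word, and sub-word reads are extracted afterwards by shifting and masking. In the patch, registers at 0x100 and above are additionally routed to the extended configuration space mapping rather than the standard one.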

 arch/arm/mach-tegra/board-harmony-pcie.c     |  30 +-
 arch/arm/mach-tegra/board-harmony.c          |   1 +
 arch/arm/mach-tegra/board-harmony.h          |   1 +
 arch/arm/mach-tegra/board-trimslice.c        |  11 +-
 arch/arm/mach-tegra/board.h                  |   2 +-
 arch/arm/mach-tegra/devices.c                | 135 ++++
 arch/arm/mach-tegra/devices.h                |   3 +
 arch/arm/mach-tegra/include/mach/iomap.h     |   3 -
 arch/arm/mach-tegra/include/mach/pci-tegra.h |  38 ++
 arch/arm/mach-tegra/pcie.c                   | 926 ++++++++++++++++-----------
 10 files changed, 732 insertions(+), 418 deletions(-)
 create mode 100644 arch/arm/mach-tegra/include/mach/pci-tegra.h

diff --git a/arch/arm/mach-tegra/board-harmony-pcie.c b/arch/arm/mach-tegra/board-harmony-pcie.c
index e8c3fda..712f3bd 100644
--- a/arch/arm/mach-tegra/board-harmony-pcie.c
+++ b/arch/arm/mach-tegra/board-harmony-pcie.c
@@ -22,12 +22,14 @@
 
 #include <asm/mach-types.h>
 
+#include <mach/pci-tegra.h>
+
 #include "board.h"
+#include "devices.h"
 #include "board-harmony.h"
 
 #ifdef CONFIG_TEGRA_PCI
-
-int __init harmony_pcie_init(void)
+static int harmony_pcie_board_init(struct platform_device *pdev)
 {
 	struct regulator *regulator = NULL;
 	int err;
@@ -44,30 +46,24 @@ int __init harmony_pcie_init(void)
 
 	regulator_enable(regulator);
 
-	err = tegra_pcie_init(true, true);
-	if (err)
-		goto err_pcie;
-
 	return 0;
 
-err_pcie:
-	regulator_disable(regulator);
-	regulator_put(regulator);
 err_reg:
 	gpio_free(TEGRA_GPIO_EN_VDD_1V05_GPIO);
 
 	return err;
 }
 
-static int __init harmony_pcie_initcall(void)
+int __init harmony_pcie_init(void)
 {
-	if (!machine_is_harmony())
-		return 0;
+	tegra_pcie_pdata.init = harmony_pcie_board_init;
+	platform_device_register(&tegra_pcie_device);
 
-	return harmony_pcie_init();
+	return 0;
+}
+#else
+int __init harmony_pcie_init(void)
+{
+	return 0;
 }
-
-/* PCI should be initialized after I2C, mfd and regulators */
-subsys_initcall_sync(harmony_pcie_initcall);
-
 #endif
diff --git a/arch/arm/mach-tegra/board-harmony.c b/arch/arm/mach-tegra/board-harmony.c
index e5f3352..063c7d5 100644
--- a/arch/arm/mach-tegra/board-harmony.c
+++ b/arch/arm/mach-tegra/board-harmony.c
@@ -204,6 +204,7 @@ static void __init tegra_harmony_init(void)
 	pwm_add_table(harmony_pwm_lookup, ARRAY_SIZE(harmony_pwm_lookup));
 	harmony_i2c_init();
 	harmony_regulator_init();
+	harmony_pcie_init();
 }
 
 MACHINE_START(HARMONY, "harmony")
diff --git a/arch/arm/mach-tegra/board-harmony.h b/arch/arm/mach-tegra/board-harmony.h
index 139d96c..afa68e2 100644
--- a/arch/arm/mach-tegra/board-harmony.h
+++ b/arch/arm/mach-tegra/board-harmony.h
@@ -37,5 +37,6 @@
 
 void harmony_pinmux_init(void);
 int harmony_regulator_init(void);
+int harmony_pcie_init(void);
 
 #endif
diff --git a/arch/arm/mach-tegra/board-trimslice.c b/arch/arm/mach-tegra/board-trimslice.c
index 776aa95..2667fe9 100644
--- a/arch/arm/mach-tegra/board-trimslice.c
+++ b/arch/arm/mach-tegra/board-trimslice.c
@@ -34,6 +34,7 @@
 #include <asm/setup.h>
 
 #include <mach/iomap.h>
+#include <mach/pci-tegra.h>
 #include <mach/sdhci.h>
 
 #include "board.h"
@@ -145,14 +146,11 @@ static __initdata struct tegra_clk_init_table trimslice_clk_init_table[] = {
 	{ NULL,		NULL,		0,		0},
 };
 
-static int __init tegra_trimslice_pci_init(void)
+static int __init trimslice_pci_init(void)
 {
-	if (!machine_is_trimslice())
-		return 0;
-
-	return tegra_pcie_init(true, true);
+	platform_device_register(&tegra_pcie_device);
+	return 0;
 }
-subsys_initcall(tegra_trimslice_pci_init);
 
 static void __init tegra_trimslice_init(void)
 {
@@ -167,6 +165,7 @@ static void __init tegra_trimslice_init(void)
 
 	trimslice_i2c_init();
 	trimslice_usb_init();
+	trimslice_pci_init();
 }
 
 MACHINE_START(TRIMSLICE, "trimslice")
diff --git a/arch/arm/mach-tegra/board.h b/arch/arm/mach-tegra/board.h
index f88e514..3a2a7e9 100644
--- a/arch/arm/mach-tegra/board.h
+++ b/arch/arm/mach-tegra/board.h
@@ -30,7 +30,6 @@ void __init tegra30_init_early(void);
 void __init tegra_map_common_io(void);
 void __init tegra_init_irq(void);
 void __init tegra_dt_init_irq(void);
-int __init tegra_pcie_init(bool init_port0, bool init_port1);
 
 void tegra_init_late(void);
 
@@ -56,4 +55,5 @@ static inline int harmony_pcie_init(void) { return 0; }
 void __init tegra_paz00_wifikill_init(void);
 
 extern struct sys_timer tegra_timer;
+
 #endif
diff --git a/arch/arm/mach-tegra/devices.c b/arch/arm/mach-tegra/devices.c
index 4529561..203af2e 100644
--- a/arch/arm/mach-tegra/devices.c
+++ b/arch/arm/mach-tegra/devices.c
@@ -28,6 +28,7 @@
 #include <mach/iomap.h>
 #include <mach/dma.h>
 #include <mach/usb_phy.h>
+#include <mach/pci-tegra.h>
 
 #include "gpio-names.h"
 #include "devices.h"
@@ -735,3 +736,137 @@ struct platform_device tegra_nand_device = {
 	.num_resources = ARRAY_SIZE(tegra_nand_resources),
 	.resource = tegra_nand_resources,
 };
+
+static struct resource tegra_pcie_resources[] = {
+	/* PADS registers */
+	[0] = {
+		.start = 0x80003000,
+		.end = 0x800037ff,
+		.flags = IORESOURCE_MEM,
+	},
+	/* AFI registers */
+	[1] = {
+		.start = 0x80003800,
+		.end = 0x800039ff,
+		.flags = IORESOURCE_MEM,
+	},
+	/* PCI configuration space */
+	[2] = {
+		.start = 0x81000000,
+		.end = 0x81000000 + SZ_16M - 1,
+		.flags = IORESOURCE_MEM | IORESOURCE_PCI_CS,
+	},
+	/* PCI extended configuration space */
+	[3] = {
+		.start = 0x90000000,
+		.end = 0x90000000 + SZ_256M - 1,
+		.flags = IORESOURCE_MEM | IORESOURCE_PCI_CS,
+	},
+	[4] = {
+		.start = INT_PCIE_INTR,
+		.end = INT_PCIE_INTR,
+		.flags = IORESOURCE_IRQ,
+	},
+};
+
+static struct resource tegra_pcie_rp0_resources[] = {
+	[0] = {
+		.start = 0x80000000,
+		.end = 0x80000000 + SZ_4K - 1,
+		.flags = IORESOURCE_MEM,
+	},
+};
+
+static struct resource tegra_pcie_rp0_ranges[] = {
+	[0] = {
+		.start = 0x81000000,
+		.end = 0x81000000 + SZ_8M - 1,
+		.flags = IORESOURCE_MEM | IORESOURCE_PCI_CS,
+	},
+	[1] = {
+		.start = 0x90000000,
+		.end = 0x90000000 + SZ_128M - 1,
+		.flags = IORESOURCE_MEM | IORESOURCE_PCI_CS,
+	},
+	[2] = {
+		.start = 0x82000000,
+		.end = 0x82000000 + SZ_64K - 1,
+		.flags = IORESOURCE_IO,
+	},
+	[3] = {
+		.start = 0xa0000000,
+		.end = 0xa0000000 + SZ_128M - 1,
+		.flags = IORESOURCE_MEM,
+	},
+	[4] = {
+		.start = 0xb0000000,
+		.end = 0xb0000000 + SZ_128M - 1,
+		.flags = IORESOURCE_MEM | IORESOURCE_PREFETCH,
+	},
+};
+
+static struct resource tegra_pcie_rp1_resources[] = {
+	[0] = {
+		.start = 0x80001000,
+		.end = 0x80001000 + SZ_4K - 1,
+		.flags = IORESOURCE_MEM,
+	},
+};
+
+static struct resource tegra_pcie_rp1_ranges[] = {
+	[0] = {
+		.start = 0x81800000,
+		.end = 0x81800000 + SZ_8M - 1,
+		.flags = IORESOURCE_MEM | IORESOURCE_PCI_CS,
+	},
+	[1] = {
+		.start = 0x98000000,
+		.end = 0x98000000 + SZ_128M - 1,
+		.flags = IORESOURCE_MEM | IORESOURCE_PCI_CS,
+	},
+	[2] = {
+		.start = 0x82010000,
+		.end = 0x82010000 + SZ_64K - 1,
+		.flags = IORESOURCE_IO,
+	},
+	[3] = {
+		.start = 0xa8000000,
+		.end = 0xa8000000 + SZ_128M - 1,
+		.flags = IORESOURCE_MEM,
+	},
+	[4] = {
+		.start = 0xb8000000,
+		.end = 0xb8000000 + SZ_128M - 1,
+		.flags = IORESOURCE_MEM | IORESOURCE_PREFETCH,
+	},
+};
+
+static struct tegra_pcie_rp tegra_pcie_ports[] = {
+	[0] = {
+		.resources = tegra_pcie_rp0_resources,
+		.num_resources = ARRAY_SIZE(tegra_pcie_rp0_resources),
+		.ranges = tegra_pcie_rp0_ranges,
+		.num_ranges = ARRAY_SIZE(tegra_pcie_rp0_ranges),
+		.num_lanes = 2,
+	},
+	[1] = {
+		.resources = tegra_pcie_rp1_resources,
+		.num_resources = ARRAY_SIZE(tegra_pcie_rp1_resources),
+		.ranges = tegra_pcie_rp1_ranges,
+		.num_ranges = ARRAY_SIZE(tegra_pcie_rp1_ranges),
+		.num_lanes = 2,
+	},
+};
+
+struct tegra_pcie_pdata tegra_pcie_pdata = {
+	.ports = tegra_pcie_ports,
+	.num_ports = ARRAY_SIZE(tegra_pcie_ports),
+};
+
+struct platform_device tegra_pcie_device = {
+	.name = "tegra-pcie",
+	.id = -1,
+	.resource = tegra_pcie_resources,
+	.num_resources = ARRAY_SIZE(tegra_pcie_resources),
+	.dev.platform_data = &tegra_pcie_pdata,
+};
diff --git a/arch/arm/mach-tegra/devices.h b/arch/arm/mach-tegra/devices.h
index f054d10..eb28671 100644
--- a/arch/arm/mach-tegra/devices.h
+++ b/arch/arm/mach-tegra/devices.h
@@ -58,4 +58,7 @@ extern struct platform_device tegra_i2s_device2;
 extern struct platform_device tegra_das_device;
 extern struct platform_device tegra_pwm_device;
 
+extern struct tegra_pcie_pdata tegra_pcie_pdata;
+extern struct platform_device tegra_pcie_device;
+
 #endif
diff --git a/arch/arm/mach-tegra/include/mach/iomap.h b/arch/arm/mach-tegra/include/mach/iomap.h
index fee3a94..7e76da7 100644
--- a/arch/arm/mach-tegra/include/mach/iomap.h
+++ b/arch/arm/mach-tegra/include/mach/iomap.h
@@ -303,9 +303,6 @@
 #define IO_APB_VIRT	IOMEM(0xFE300000)
 #define IO_APB_SIZE	SZ_1M
 
-#define TEGRA_PCIE_BASE		0x80000000
-#define TEGRA_PCIE_IO_BASE	(TEGRA_PCIE_BASE + SZ_4M)
-
 #define IO_TO_VIRT_BETWEEN(p, st, sz)	((p) >= (st) && (p) < ((st) + (sz)))
 #define IO_TO_VIRT_XLATE(p, pst, vst)	(((p) - (pst) + (vst)))
 
diff --git a/arch/arm/mach-tegra/include/mach/pci-tegra.h b/arch/arm/mach-tegra/include/mach/pci-tegra.h
new file mode 100644
index 0000000..e6d9fc3
--- /dev/null
+++ b/arch/arm/mach-tegra/include/mach/pci-tegra.h
@@ -0,0 +1,38 @@
+/*
+ * arch/arm/mach-tegra/include/mach/pci-tegra.h
+ *
+ * Copyright (C) 2012 Avionic Design GmbH
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __MACH_TEGRA_PCIE_H
+#define __MACH_TEGRA_PCIE_H
+
+#include <linux/platform_device.h>
+
+struct tegra_pcie_rp {
+	unsigned int index;
+	struct resource *resources;
+	unsigned int num_resources;
+	struct resource *ranges;
+	unsigned int num_ranges;
+	unsigned int num_lanes;
+};
+
+struct tegra_pcie_pdata {
+	int (*init)(struct platform_device *pdev);
+	int (*exit)(struct platform_device *pdev);
+
+	struct tegra_pcie_rp *ports;
+	unsigned int num_ports;
+};
+
+#endif
diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c
index efe71dd..3e5fb66 100644
--- a/arch/arm/mach-tegra/pcie.c
+++ b/arch/arm/mach-tegra/pcie.c
@@ -27,7 +27,9 @@
  */
 
 #include <linux/kernel.h>
+#include <linux/slab.h>
 #include <linux/pci.h>
+#include <linux/platform_device.h>
 #include <linux/interrupt.h>
 #include <linux/irq.h>
 #include <linux/clk.h>
@@ -35,20 +37,17 @@
 #include <linux/export.h>
 
 #include <asm/sizes.h>
+#include <asm/mach/irq.h>
 #include <asm/mach/pci.h>
 
 #include <mach/iomap.h>
 #include <mach/clk.h>
 #include <mach/powergate.h>
+#include <mach/pci-tegra.h>
 
-#include "board.h"
 #include "pmc.h"
 
 /* register definitions */
-#define AFI_OFFSET	0x3800
-#define PADS_OFFSET	0x3000
-#define RP0_OFFSET	0x0000
-#define RP1_OFFSET	0x1000
 
 #define AFI_AXI_BAR0_SZ	0x00
 #define AFI_AXI_BAR1_SZ	0x04
@@ -161,141 +160,146 @@
  * 0x90000000 - 0x9fffffff - non-prefetchable memory
  * 0xa0000000 - 0xbfffffff - prefetchable memory
  */
-#define PCIE_REGS_SZ		SZ_16K
-#define PCIE_CFG_OFF		PCIE_REGS_SZ
-#define PCIE_CFG_SZ		SZ_1M
-#define PCIE_EXT_CFG_OFF	(PCIE_CFG_SZ + PCIE_CFG_OFF)
-#define PCIE_EXT_CFG_SZ		SZ_1M
-#define PCIE_IOMAP_SZ		(PCIE_REGS_SZ + PCIE_CFG_SZ + PCIE_EXT_CFG_SZ)
-
-#define MEM_BASE_0		(TEGRA_PCIE_BASE + SZ_256M)
-#define MEM_SIZE_0		SZ_128M
-#define MEM_BASE_1		(MEM_BASE_0 + MEM_SIZE_0)
-#define MEM_SIZE_1		SZ_128M
-#define PREFETCH_MEM_BASE_0	(MEM_BASE_1 + MEM_SIZE_1)
-#define PREFETCH_MEM_SIZE_0	SZ_128M
-#define PREFETCH_MEM_BASE_1	(PREFETCH_MEM_BASE_0 + PREFETCH_MEM_SIZE_0)
-#define PREFETCH_MEM_SIZE_1	SZ_128M
-
-#define  PCIE_CONF_BUS(b)	((b) << 16)
-#define  PCIE_CONF_DEV(d)	((d) << 11)
-#define  PCIE_CONF_FUNC(f)	((f) << 8)
-#define  PCIE_CONF_REG(r)	\
-	(((r) & ~0x3) | (((r) < 256) ? PCIE_CFG_OFF : PCIE_EXT_CFG_OFF))
 
-struct tegra_pcie_port {
-	int			index;
-	u8			root_bus_nr;
-	void __iomem		*base;
+#define PCIE_CONF_BUS(b)	((b) << 16)
+#define PCIE_CONF_DEV(d)	((d) << 11)
+#define PCIE_CONF_FUNC(f)	((f) << 8)
+#define PCIE_CONF_REG(r)	((((r) & 0xf00) << 16) | ((r) & ~3))
 
-	bool			link_up;
+struct tegra_pcie {
+	struct device *dev;
 
-	char			mem_space_name[16];
-	char			prefetch_space_name[20];
-	struct resource		res[2];
-};
+	void __iomem *pads;
+	void __iomem *afi;
+	int irq;
+
+	struct resource *cfg;
+	struct resource *extcfg;
 
-struct tegra_pcie_info {
-	struct tegra_pcie_port	port[2];
-	int			num_ports;
+	void __iomem *cs;
+	void __iomem *extcs;
 
-	void __iomem		*regs;
-	struct resource		res_mmio;
+	struct resource io;
+	struct resource mem;
+	struct resource prefetch;
 
-	struct clk		*pex_clk;
-	struct clk		*afi_clk;
-	struct clk		*pcie_xclk;
-	struct clk		*pll_e;
+	struct clk *pex_clk;
+	struct clk *afi_clk;
+	struct clk *pcie_xclk;
+	struct clk *pll_e;
+
+	struct list_head ports;
+	unsigned int num_ports;
 };
 
-static struct tegra_pcie_info tegra_pcie;
+struct tegra_pcie_port {
+	struct tegra_pcie *pcie;
+
+	void __iomem *base;
+	unsigned int index;
+
+	struct resource io;
+	struct resource mem;
+	struct resource prefetch;
+
+	struct list_head list;
+};
 
-static inline void afi_writel(u32 value, unsigned long offset)
+static inline struct tegra_pcie_port *sys_to_pcie(struct pci_sys_data *sys)
 {
-	writel(value, offset + AFI_OFFSET + tegra_pcie.regs);
+	return sys->private_data;
 }
 
-static inline u32 afi_readl(unsigned long offset)
+static inline void afi_writel(struct tegra_pcie *pcie, u32 value,
+			      unsigned long offset)
 {
-	return readl(offset + AFI_OFFSET + tegra_pcie.regs);
+	writel(value, pcie->afi + offset);
 }
 
-static inline void pads_writel(u32 value, unsigned long offset)
+static inline u32 afi_readl(struct tegra_pcie *pcie, unsigned long offset)
 {
-	writel(value, offset + PADS_OFFSET + tegra_pcie.regs);
+	return readl(pcie->afi + offset);
 }
 
-static inline u32 pads_readl(unsigned long offset)
+static inline void pads_writel(struct tegra_pcie *pcie, u32 value,
+			       unsigned long offset)
 {
-	return readl(offset + PADS_OFFSET + tegra_pcie.regs);
+	writel(value, pcie->pads + offset);
 }
 
-static struct tegra_pcie_port *bus_to_port(int bus)
+static inline u32 pads_readl(struct tegra_pcie *pcie, unsigned long offset)
 {
-	int i;
-
-	for (i = tegra_pcie.num_ports - 1; i >= 0; i--) {
-		int rbus = tegra_pcie.port[i].root_bus_nr;
-		if (rbus != -1 && rbus == bus)
-			break;
-	}
-
-	return i >= 0 ? tegra_pcie.port + i : NULL;
+	return readl(pcie->pads + offset);
 }
 
 static int tegra_pcie_read_conf(struct pci_bus *bus, unsigned int devfn,
-				int where, int size, u32 *val)
+				int where, int size, u32 *value)
 {
-	struct tegra_pcie_port *pp = bus_to_port(bus->number);
-	void __iomem *addr;
+	struct tegra_pcie_port *port = sys_to_pcie(bus->sysdata);
+	struct tegra_pcie *pcie = port->pcie;
+	unsigned long offset = -1;
+	void __iomem *addr = NULL;
 
-	if (pp) {
+	if (!bus->parent) {
 		if (devfn != 0) {
-			*val = 0xffffffff;
+			*value = 0xffffffff;
 			return PCIBIOS_DEVICE_NOT_FOUND;
 		}
 
-		addr = pp->base + (where & ~0x3);
+		addr = port->base + (where & ~0x3);
 	} else {
-		addr = tegra_pcie.regs + (PCIE_CONF_BUS(bus->number) +
-					  PCIE_CONF_DEV(PCI_SLOT(devfn)) +
-					  PCIE_CONF_FUNC(PCI_FUNC(devfn)) +
-					  PCIE_CONF_REG(where));
+		if (where >= 0x100)
+			addr = pcie->extcs;
+		else
+			addr = pcie->cs;
+
+		offset = PCIE_CONF_BUS(bus->number) +
+			 PCIE_CONF_DEV(PCI_SLOT(devfn)) +
+			 PCIE_CONF_FUNC(PCI_FUNC(devfn)) +
+			 PCIE_CONF_REG(where);
+		addr += offset;
 	}
 
-	*val = readl(addr);
+	*value = readl(addr);
 
 	if (size == 1)
-		*val = (*val >> (8 * (where & 3))) & 0xff;
+		*value = (*value >> (8 * (where & 3))) & 0xff;
 	else if (size == 2)
-		*val = (*val >> (8 * (where & 3))) & 0xffff;
+		*value = (*value >> (8 * (where & 3))) & 0xffff;
 
 	return PCIBIOS_SUCCESSFUL;
 }
 
 static int tegra_pcie_write_conf(struct pci_bus *bus, unsigned int devfn,
-				 int where, int size, u32 val)
+				 int where, int size, u32 value)
 {
-	struct tegra_pcie_port *pp = bus_to_port(bus->number);
+	struct tegra_pcie_port *port = sys_to_pcie(bus->sysdata);
+	struct tegra_pcie *pcie = port->pcie;
+	unsigned long offset = -1;
 	void __iomem *addr;
-
 	u32 mask;
 	u32 tmp;
 
-	if (pp) {
+	if (!bus->parent) {
 		if (devfn != 0)
 			return PCIBIOS_DEVICE_NOT_FOUND;
 
-		addr = pp->base + (where & ~0x3);
+		addr = port->base + (where & ~0x3);
 	} else {
-		addr = tegra_pcie.regs + (PCIE_CONF_BUS(bus->number) +
-					  PCIE_CONF_DEV(PCI_SLOT(devfn)) +
-					  PCIE_CONF_FUNC(PCI_FUNC(devfn)) +
-					  PCIE_CONF_REG(where));
+		if (where >= 0x100)
+			addr = pcie->extcs;
+		else
+			addr = pcie->cs;
+
+		offset = PCIE_CONF_BUS(bus->number) +
+			 PCIE_CONF_DEV(PCI_SLOT(devfn)) +
+			 PCIE_CONF_FUNC(PCI_FUNC(devfn)) +
+			 PCIE_CONF_REG(where);
+		addr += offset;
 	}
 
 	if (size == 4) {
-		writel(val, addr);
+		writel(value, addr);
 		return PCIBIOS_SUCCESSFUL;
 	}
 
@@ -307,7 +311,7 @@ static int tegra_pcie_write_conf(struct pci_bus *bus, unsigned int devfn,
 		return PCIBIOS_BAD_REGISTER_NUMBER;
 
 	tmp = readl(addr) & mask;
-	tmp |= val << ((where & 0x3) * 8);
+	tmp |= value << ((where & 0x3) * 8);
 	writel(tmp, addr);
 
 	return PCIBIOS_SUCCESSFUL;
@@ -358,85 +362,36 @@ DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, tegra_pcie_relax_enable);
 
 static int tegra_pcie_setup(int nr, struct pci_sys_data *sys)
 {
-	struct tegra_pcie_port *pp;
-
-	if (nr >= tegra_pcie.num_ports)
-		return 0;
+	struct tegra_pcie_port *port = sys_to_pcie(sys);
 
-	pp = tegra_pcie.port + nr;
-	pp->root_bus_nr = sys->busnr;
+	pci_add_resource_offset(&sys->resources, &port->io,
+				sys->io_offset);
+	pci_add_resource_offset(&sys->resources, &port->mem,
+				sys->mem_offset);
+	pci_add_resource_offset(&sys->resources, &port->prefetch,
+				sys->mem_offset);
 
-	pci_ioremap_io(nr * SZ_64K, TEGRA_PCIE_IO_BASE);
-
-	/*
-	 * IORESOURCE_MEM
-	 */
-	snprintf(pp->mem_space_name, sizeof(pp->mem_space_name),
-		 "PCIe %d MEM", pp->index);
-	pp->mem_space_name[sizeof(pp->mem_space_name) - 1] = 0;
-	pp->res[0].name = pp->mem_space_name;
-	if (pp->index == 0) {
-		pp->res[0].start = MEM_BASE_0;
-		pp->res[0].end = pp->res[0].start + MEM_SIZE_0 - 1;
-	} else {
-		pp->res[0].start = MEM_BASE_1;
-		pp->res[0].end = pp->res[0].start + MEM_SIZE_1 - 1;
-	}
-	pp->res[0].flags = IORESOURCE_MEM;
-	if (request_resource(&iomem_resource, &pp->res[0]))
-		panic("Request PCIe Memory resource failed\n");
-	pci_add_resource_offset(&sys->resources, &pp->res[0], sys->mem_offset);
-
-	/*
-	 * IORESOURCE_MEM | IORESOURCE_PREFETCH
-	 */
-	snprintf(pp->prefetch_space_name, sizeof(pp->prefetch_space_name),
-		 "PCIe %d PREFETCH MEM", pp->index);
-	pp->prefetch_space_name[sizeof(pp->prefetch_space_name) - 1] = 0;
-	pp->res[1].name = pp->prefetch_space_name;
-	if (pp->index == 0) {
-		pp->res[1].start = PREFETCH_MEM_BASE_0;
-		pp->res[1].end = pp->res[2].start + PREFETCH_MEM_SIZE_0 - 1;
-	} else {
-		pp->res[1].start = PREFETCH_MEM_BASE_1;
-		pp->res[1].end = pp->res[1].start + PREFETCH_MEM_SIZE_1 - 1;
-	}
-	pp->res[1].flags = IORESOURCE_MEM | IORESOURCE_PREFETCH;
-	if (request_resource(&iomem_resource, &pp->res[1]))
-		panic("Request PCIe Prefetch Memory resource failed\n");
-	pci_add_resource_offset(&sys->resources, &pp->res[1], sys->mem_offset);
+	pci_ioremap_io(nr * SZ_64K, port->io.start);
 
 	return 1;
 }
 
-static int tegra_pcie_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
+static int tegra_pcie_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
 {
-	return INT_PCIE_INTR;
+	struct tegra_pcie_port *port = sys_to_pcie(pdev->bus->sysdata);
+
+	return port->pcie->irq;
 }
 
-static struct pci_bus __init *tegra_pcie_scan_bus(int nr,
-						  struct pci_sys_data *sys)
+static struct pci_bus __devinit *tegra_pcie_scan_bus(int nr,
+						     struct pci_sys_data *sys)
 {
-	struct tegra_pcie_port *pp;
+	struct tegra_pcie_port *port = sys_to_pcie(sys);
 
-	if (nr >= tegra_pcie.num_ports)
-		return NULL;
-
-	pp = tegra_pcie.port + nr;
-	pp->root_bus_nr = sys->busnr;
-
-	return pci_scan_root_bus(NULL, sys->busnr, &tegra_pcie_ops, sys,
-				 &sys->resources);
+	return pci_scan_root_bus(port->pcie->dev, sys->busnr, &tegra_pcie_ops,
+				 sys, &sys->resources);
 }
 
-static struct hw_pci tegra_pcie_hw __initdata = {
-	.nr_controllers	= 2,
-	.setup		= tegra_pcie_setup,
-	.scan		= tegra_pcie_scan_bus,
-	.map_irq	= tegra_pcie_map_irq,
-};
-
-
 static irqreturn_t tegra_pcie_isr(int irq, void *arg)
 {
 	const char *err_msg[] = {
@@ -448,14 +403,14 @@ static irqreturn_t tegra_pcie_isr(int irq, void *arg)
 		"Invalid write",
 		"Response decoding error",
 		"AXI response decoding error",
-		"Transcation timeout",
+		"Transaction timeout",
 	};
-
+	struct tegra_pcie *pcie = arg;
 	u32 code, signature;
 
-	code = afi_readl(AFI_INTR_CODE) & AFI_INTR_CODE_MASK;
-	signature = afi_readl(AFI_INTR_SIGNATURE);
-	afi_writel(0, AFI_INTR_CODE);
+	code = afi_readl(pcie, AFI_INTR_CODE) & AFI_INTR_CODE_MASK;
+	signature = afi_readl(pcie, AFI_INTR_SIGNATURE);
+	afi_writel(pcie, 0, AFI_INTR_CODE);
 
 	if (code == AFI_INTR_LEGACY)
 		return IRQ_NONE;
@@ -468,405 +423,594 @@ static irqreturn_t tegra_pcie_isr(int irq, void *arg)
 	 * happen a lot during enumeration
 	 */
 	if (code == AFI_INTR_MASTER_ABORT)
-		pr_debug("PCIE: %s, signature: %08x\n", err_msg[code], signature);
+		dev_dbg(pcie->dev, "%s, signature: %08x\n", err_msg[code],
+			signature);
 	else
-		pr_err("PCIE: %s, signature: %08x\n", err_msg[code], signature);
+		dev_err(pcie->dev, "%s, signature: %08x\n", err_msg[code],
+			signature);
 
 	return IRQ_HANDLED;
 }
 
-static void tegra_pcie_setup_translations(void)
+static void tegra_pcie_setup_translations(struct tegra_pcie *pcie)
 {
 	u32 fpci_bar;
 	u32 size;
 	u32 axi_address;
 
 	/* Bar 0: config Bar */
-	fpci_bar = ((u32)0xfdff << 16);
-	size = PCIE_CFG_SZ;
-	axi_address = TEGRA_PCIE_BASE + PCIE_CFG_OFF;
-	afi_writel(axi_address, AFI_AXI_BAR0_START);
-	afi_writel(size >> 12, AFI_AXI_BAR0_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR0);
+	fpci_bar = 0xfdff0000;
+	size = resource_size(pcie->cfg);
+	axi_address = pcie->cfg->start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR0_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR0_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR0);
 
 	/* Bar 1: extended config Bar */
-	fpci_bar = ((u32)0xfe1 << 20);
-	size = PCIE_EXT_CFG_SZ;
-	axi_address = TEGRA_PCIE_BASE + PCIE_EXT_CFG_OFF;
-	afi_writel(axi_address, AFI_AXI_BAR1_START);
-	afi_writel(size >> 12, AFI_AXI_BAR1_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR1);
+	fpci_bar = 0xfe100000;
+	size = resource_size(pcie->extcfg);
+	axi_address = pcie->extcfg->start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR1_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR1_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR1);
 
 	/* Bar 2: downstream IO bar */
-	fpci_bar = ((__u32)0xfdfc << 16);
-	size = SZ_128K;
-	axi_address = TEGRA_PCIE_IO_BASE;
-	afi_writel(axi_address, AFI_AXI_BAR2_START);
-	afi_writel(size >> 12, AFI_AXI_BAR2_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR2);
+	fpci_bar = 0xfdfc0000;
+	size = resource_size(&pcie->io);
+	axi_address = pcie->io.start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR2_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR2_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR2);
 
 	/* Bar 3: prefetchable memory BAR */
-	fpci_bar = (((PREFETCH_MEM_BASE_0 >> 12) & 0x0fffffff) << 4) | 0x1;
-	size =  PREFETCH_MEM_SIZE_0 +  PREFETCH_MEM_SIZE_1;
-	axi_address = PREFETCH_MEM_BASE_0;
-	afi_writel(axi_address, AFI_AXI_BAR3_START);
-	afi_writel(size >> 12, AFI_AXI_BAR3_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR3);
+	fpci_bar = (((pcie->prefetch.start >> 12) & 0x0fffffff) << 4) | 0x1;
+	size = resource_size(&pcie->prefetch);
+	axi_address = pcie->prefetch.start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR3_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR3_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR3);
 
 	/* Bar 4: non prefetchable memory BAR */
-	fpci_bar = (((MEM_BASE_0 >> 12)	& 0x0FFFFFFF) << 4) | 0x1;
-	size = MEM_SIZE_0 + MEM_SIZE_1;
-	axi_address = MEM_BASE_0;
-	afi_writel(axi_address, AFI_AXI_BAR4_START);
-	afi_writel(size >> 12, AFI_AXI_BAR4_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR4);
+	fpci_bar = (((pcie->mem.start >> 12) & 0x0fffffff) << 4) | 0x1;
+	size = resource_size(&pcie->mem);
+	axi_address = pcie->mem.start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR4_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR4_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR4);
 
 	/* Bar 5: NULL out the remaining BAR as it is not used */
 	fpci_bar = 0;
 	size = 0;
 	axi_address = 0;
-	afi_writel(axi_address, AFI_AXI_BAR5_START);
-	afi_writel(size >> 12, AFI_AXI_BAR5_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR5);
+	afi_writel(pcie, axi_address, AFI_AXI_BAR5_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR5_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR5);
 
 	/* map all upstream transactions as uncached */
-	afi_writel(PHYS_OFFSET, AFI_CACHE_BAR0_ST);
-	afi_writel(0, AFI_CACHE_BAR0_SZ);
-	afi_writel(0, AFI_CACHE_BAR1_ST);
-	afi_writel(0, AFI_CACHE_BAR1_SZ);
-
-	/* No MSI */
-	afi_writel(0, AFI_MSI_FPCI_BAR_ST);
-	afi_writel(0, AFI_MSI_BAR_SZ);
-	afi_writel(0, AFI_MSI_AXI_BAR_ST);
-	afi_writel(0, AFI_MSI_BAR_SZ);
+	afi_writel(pcie, PHYS_OFFSET, AFI_CACHE_BAR0_ST);
+	afi_writel(pcie, 0, AFI_CACHE_BAR0_SZ);
+	afi_writel(pcie, 0, AFI_CACHE_BAR1_ST);
+	afi_writel(pcie, 0, AFI_CACHE_BAR1_SZ);
+
+	/* MSI translations are set up only when needed */
+	afi_writel(pcie, 0, AFI_MSI_FPCI_BAR_ST);
+	afi_writel(pcie, 0, AFI_MSI_BAR_SZ);
+	afi_writel(pcie, 0, AFI_MSI_AXI_BAR_ST);
+	afi_writel(pcie, 0, AFI_MSI_BAR_SZ);
 }
 
-static int tegra_pcie_enable_controller(void)
+static int tegra_pcie_enable_controller(struct tegra_pcie *pcie)
 {
-	u32 val, reg;
-	int i, timeout;
-
-	/* Enable slot clock and pulse the reset signals */
-	for (i = 0, reg = AFI_PEX0_CTRL; i < 2; i++, reg += 0x8) {
-		val = afi_readl(reg) |  AFI_PEX_CTRL_REFCLK_EN;
-		afi_writel(val, reg);
-		val &= ~AFI_PEX_CTRL_RST;
-		afi_writel(val, reg);
-
-		val = afi_readl(reg) | AFI_PEX_CTRL_RST;
-		afi_writel(val, reg);
-	}
+	unsigned int timeout;
+	unsigned long value;
 
-	/* Enable dual controller and both ports */
-	val = afi_readl(AFI_PCIE_CONFIG);
-	val &= ~(AFI_PCIE_CONFIG_PCIEC0_DISABLE_DEVICE |
-		 AFI_PCIE_CONFIG_PCIEC1_DISABLE_DEVICE |
-		 AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_MASK);
-	val |= AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_DUAL;
-	afi_writel(val, AFI_PCIE_CONFIG);
+	/* enable dual controller and both ports */
+	value = afi_readl(pcie, AFI_PCIE_CONFIG);
+	value &= ~(AFI_PCIE_CONFIG_PCIEC0_DISABLE_DEVICE |
+		   AFI_PCIE_CONFIG_PCIEC1_DISABLE_DEVICE |
+		   AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_MASK);
+	value |= AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_DUAL;
+	afi_writel(pcie, value, AFI_PCIE_CONFIG);
 
-	val = afi_readl(AFI_FUSE) & ~AFI_FUSE_PCIE_T0_GEN2_DIS;
-	afi_writel(val, AFI_FUSE);
+	value = afi_readl(pcie, AFI_FUSE);
+	value &= ~AFI_FUSE_PCIE_T0_GEN2_DIS;
+	afi_writel(pcie, value, AFI_FUSE);
 
-	/* Initialze internal PHY, enable up to 16 PCIE lanes */
-	pads_writel(0x0, PADS_CTL_SEL);
+	/* initialize internal PHY, enable up to 16 PCIE lanes */
+	pads_writel(pcie, 0x0, PADS_CTL_SEL);
 
 	/* override IDDQ to 1 on all 4 lanes */
-	val = pads_readl(PADS_CTL) | PADS_CTL_IDDQ_1L;
-	pads_writel(val, PADS_CTL);
+	value = pads_readl(pcie, PADS_CTL);
+	value |= PADS_CTL_IDDQ_1L;
+	pads_writel(pcie, value, PADS_CTL);
 
 	/*
-	 * set up PHY PLL inputs select PLLE output as refclock,
-	 * set TX ref sel to div10 (not div5)
+	 * Set up PHY PLL inputs select PLLE output as refclock,
+	 * set TX ref sel to div10 (not div5).
 	 */
-	val = pads_readl(PADS_PLL_CTL);
-	val &= ~(PADS_PLL_CTL_REFCLK_MASK | PADS_PLL_CTL_TXCLKREF_MASK);
-	val |= (PADS_PLL_CTL_REFCLK_INTERNAL_CML | PADS_PLL_CTL_TXCLKREF_DIV10);
-	pads_writel(val, PADS_PLL_CTL);
+	value = pads_readl(pcie, PADS_PLL_CTL);
+	value &= ~(PADS_PLL_CTL_REFCLK_MASK | PADS_PLL_CTL_TXCLKREF_MASK);
+	value |= (PADS_PLL_CTL_REFCLK_INTERNAL_CML | PADS_PLL_CTL_TXCLKREF_DIV10);
+	pads_writel(pcie, value, PADS_PLL_CTL);
 
 	/* take PLL out of reset  */
-	val = pads_readl(PADS_PLL_CTL) | PADS_PLL_CTL_RST_B4SM;
-	pads_writel(val, PADS_PLL_CTL);
+	value = pads_readl(pcie, PADS_PLL_CTL);
+	value |= PADS_PLL_CTL_RST_B4SM;
+	pads_writel(pcie, value, PADS_PLL_CTL);
 
 	/*
 	 * Hack, set the clock voltage to the DEFAULT provided by hw folks.
-	 * This doesn't exist in the documentation
+	 * This doesn't exist in the documentation.
 	 */
-	pads_writel(0xfa5cfa5c, 0xc8);
+	pads_writel(pcie, 0xfa5cfa5c, 0xc8);
 
-	/* Wait for the PLL to lock */
+	/* wait for the PLL to lock */
 	timeout = 300;
 	do {
-		val = pads_readl(PADS_PLL_CTL);
+		value = pads_readl(pcie, PADS_PLL_CTL);
 		usleep_range(1000, 1000);
 		if (--timeout == 0) {
 			pr_err("Tegra PCIe error: timeout waiting for PLL\n");
 			return -EBUSY;
 		}
-	} while (!(val & PADS_PLL_CTL_LOCKDET));
+	} while (!(value & PADS_PLL_CTL_LOCKDET));
 
 	/* turn off IDDQ override */
-	val = pads_readl(PADS_CTL) & ~PADS_CTL_IDDQ_1L;
-	pads_writel(val, PADS_CTL);
+	value = pads_readl(pcie, PADS_CTL);
+	value &= ~PADS_CTL_IDDQ_1L;
+	pads_writel(pcie, value, PADS_CTL);
 
 	/* enable TX/RX data */
-	val = pads_readl(PADS_CTL);
-	val |= (PADS_CTL_TX_DATA_EN_1L | PADS_CTL_RX_DATA_EN_1L);
-	pads_writel(val, PADS_CTL);
+	value = pads_readl(pcie, PADS_CTL);
+	value |= PADS_CTL_TX_DATA_EN_1L | PADS_CTL_RX_DATA_EN_1L;
+	pads_writel(pcie, value, PADS_CTL);
 
-	/* Take the PCIe interface module out of reset */
-	tegra_periph_reset_deassert(tegra_pcie.pcie_xclk);
+	/* take the PCIe interface module out of reset */
+	tegra_periph_reset_deassert(pcie->pcie_xclk);
 
-	/* Finally enable PCIe */
-	val = afi_readl(AFI_CONFIGURATION) | AFI_CONFIGURATION_EN_FPCI;
-	afi_writel(val, AFI_CONFIGURATION);
+	/* finally enable PCIe */
+	value = afi_readl(pcie, AFI_CONFIGURATION);
+	value |= AFI_CONFIGURATION_EN_FPCI;
+	afi_writel(pcie, value, AFI_CONFIGURATION);
 
-	val = (AFI_INTR_EN_INI_SLVERR | AFI_INTR_EN_INI_DECERR |
-	       AFI_INTR_EN_TGT_SLVERR | AFI_INTR_EN_TGT_DECERR |
-	       AFI_INTR_EN_TGT_WRERR | AFI_INTR_EN_DFPCI_DECERR);
-	afi_writel(val, AFI_AFI_INTR_ENABLE);
-	afi_writel(0xffffffff, AFI_SM_INTR_ENABLE);
+	value = AFI_INTR_EN_INI_SLVERR | AFI_INTR_EN_INI_DECERR |
+		AFI_INTR_EN_TGT_SLVERR | AFI_INTR_EN_TGT_DECERR |
+		AFI_INTR_EN_TGT_WRERR | AFI_INTR_EN_DFPCI_DECERR;
+	afi_writel(pcie, value, AFI_AFI_INTR_ENABLE);
+	afi_writel(pcie, 0xffffffff, AFI_SM_INTR_ENABLE);
 
-	/* FIXME: No MSI for now, only INT */
-	afi_writel(AFI_INTR_MASK_INT_MASK, AFI_INTR_MASK);
+	/* don't enable MSI for now, only when needed */
+	afi_writel(pcie, AFI_INTR_MASK_INT_MASK, AFI_INTR_MASK);
 
-	/* Disable all execptions */
-	afi_writel(0, AFI_FPCI_ERROR_MASKS);
+	/* disable all exceptions */
+	afi_writel(pcie, 0, AFI_FPCI_ERROR_MASKS);
 
 	return 0;
 }
 
-static void tegra_pcie_power_off(void)
+static void tegra_pcie_power_off(struct tegra_pcie *pcie)
 {
-	tegra_periph_reset_assert(tegra_pcie.pcie_xclk);
-	tegra_periph_reset_assert(tegra_pcie.afi_clk);
-	tegra_periph_reset_assert(tegra_pcie.pex_clk);
+	tegra_periph_reset_assert(pcie->pcie_xclk);
+	tegra_periph_reset_assert(pcie->afi_clk);
+	tegra_periph_reset_assert(pcie->pex_clk);
 
 	tegra_powergate_power_off(TEGRA_POWERGATE_PCIE);
 	tegra_pmc_pcie_xclk_clamp(true);
 }
 
-static int tegra_pcie_power_regate(void)
+static int tegra_pcie_power_regate(struct tegra_pcie *pcie)
 {
 	int err;
 
-	tegra_pcie_power_off();
+	tegra_pcie_power_off(pcie);
 
 	tegra_pmc_pcie_xclk_clamp(true);
 
-	tegra_periph_reset_assert(tegra_pcie.pcie_xclk);
-	tegra_periph_reset_assert(tegra_pcie.afi_clk);
+	tegra_periph_reset_assert(pcie->pcie_xclk);
+	tegra_periph_reset_assert(pcie->afi_clk);
 
 	err = tegra_powergate_sequence_power_up(TEGRA_POWERGATE_PCIE,
-						tegra_pcie.pex_clk);
+						pcie->pex_clk);
 	if (err) {
-		pr_err("PCIE: powerup sequence failed: %d\n", err);
+		dev_err(pcie->dev, "powerup sequence failed: %d\n", err);
 		return err;
 	}
 
-	tegra_periph_reset_deassert(tegra_pcie.afi_clk);
+	tegra_periph_reset_deassert(pcie->afi_clk);
 
 	tegra_pmc_pcie_xclk_clamp(false);
 
-	clk_prepare_enable(tegra_pcie.afi_clk);
-	clk_prepare_enable(tegra_pcie.pex_clk);
-	return clk_prepare_enable(tegra_pcie.pll_e);
+	clk_prepare_enable(pcie->afi_clk);
+	clk_prepare_enable(pcie->pex_clk);
+	return clk_prepare_enable(pcie->pll_e);
 }
 
-static int tegra_pcie_clocks_get(void)
+static int tegra_pcie_clocks_get(struct tegra_pcie *pcie)
 {
-	int err;
+	pcie->pex_clk = devm_clk_get(pcie->dev, "pex");
+	if (IS_ERR(pcie->pex_clk))
+		return PTR_ERR(pcie->pex_clk);
 
-	tegra_pcie.pex_clk = clk_get(NULL, "pex");
-	if (IS_ERR(tegra_pcie.pex_clk))
-		return PTR_ERR(tegra_pcie.pex_clk);
+	pcie->afi_clk = devm_clk_get(pcie->dev, "afi");
+	if (IS_ERR(pcie->afi_clk))
+		return PTR_ERR(pcie->afi_clk);
 
-	tegra_pcie.afi_clk = clk_get(NULL, "afi");
-	if (IS_ERR(tegra_pcie.afi_clk)) {
-		err = PTR_ERR(tegra_pcie.afi_clk);
-		goto err_afi_clk;
-	}
+	pcie->pcie_xclk = devm_clk_get(pcie->dev, "pcie_xclk");
+	if (IS_ERR(pcie->pcie_xclk))
+		return PTR_ERR(pcie->pcie_xclk);
 
-	tegra_pcie.pcie_xclk = clk_get(NULL, "pcie_xclk");
-	if (IS_ERR(tegra_pcie.pcie_xclk)) {
-		err =  PTR_ERR(tegra_pcie.pcie_xclk);
-		goto err_pcie_xclk;
-	}
-
-	tegra_pcie.pll_e = clk_get_sys(NULL, "pll_e");
-	if (IS_ERR(tegra_pcie.pll_e)) {
-		err = PTR_ERR(tegra_pcie.pll_e);
-		goto err_pll_e;
-	}
+	pcie->pll_e = devm_clk_get(pcie->dev, "pll_e");
+	if (IS_ERR(pcie->pll_e))
+		return PTR_ERR(pcie->pll_e);
 
 	return 0;
-
-err_pll_e:
-	clk_put(tegra_pcie.pcie_xclk);
-err_pcie_xclk:
-	clk_put(tegra_pcie.afi_clk);
-err_afi_clk:
-	clk_put(tegra_pcie.pex_clk);
-
-	return err;
-}
-
-static void tegra_pcie_clocks_put(void)
-{
-	clk_put(tegra_pcie.pll_e);
-	clk_put(tegra_pcie.pcie_xclk);
-	clk_put(tegra_pcie.afi_clk);
-	clk_put(tegra_pcie.pex_clk);
 }
 
-static int __init tegra_pcie_get_resources(void)
+static int __devinit tegra_pcie_get_resources(struct tegra_pcie *pcie)
 {
+	struct platform_device *pdev = to_platform_device(pcie->dev);
+	struct resource *pads, *afi;
 	int err;
 
-	err = tegra_pcie_clocks_get();
+	err = tegra_pcie_clocks_get(pcie);
 	if (err) {
-		pr_err("PCIE: failed to get clocks: %d\n", err);
+		dev_err(&pdev->dev, "failed to get clocks: %d\n", err);
 		return err;
 	}
 
-	err = tegra_pcie_power_regate();
+	err = tegra_pcie_power_regate(pcie);
 	if (err) {
-		pr_err("PCIE: failed to power up: %d\n", err);
-		goto err_pwr_on;
+		dev_err(&pdev->dev, "failed to power up: %d\n", err);
+		return err;
 	}
 
-	tegra_pcie.regs = ioremap_nocache(TEGRA_PCIE_BASE, PCIE_IOMAP_SZ);
-	if (tegra_pcie.regs == NULL) {
-		pr_err("PCIE: Failed to map PCI/AFI registers\n");
-		err = -ENOMEM;
-		goto err_map_reg;
+	/* request and remap controller registers */
+	pads = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!pads) {
+		err = -EADDRNOTAVAIL;
+		goto power_off;
 	}
 
-	err = request_irq(INT_PCIE_INTR, tegra_pcie_isr,
-			  IRQF_SHARED, "PCIE", &tegra_pcie);
+	afi = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+	if (!afi) {
+		err = -EADDRNOTAVAIL;
+		goto power_off;
+	}
+
+	pcie->pads = devm_request_and_ioremap(&pdev->dev, pads);
+	if (!pcie->pads) {
+		err = -EADDRNOTAVAIL;
+		goto power_off;
+	}
+
+	pcie->afi = devm_request_and_ioremap(&pdev->dev, afi);
+	if (!pcie->afi) {
+		err = -EADDRNOTAVAIL;
+		goto power_off;
+	}
+
+	/* request and remap configuration space */
+	pcie->cfg = platform_get_resource(pdev, IORESOURCE_MEM, 2);
+	if (!pcie->cfg) {
+		err = -EADDRNOTAVAIL;
+		goto power_off;
+	}
+
+	pcie->extcfg = platform_get_resource(pdev, IORESOURCE_MEM, 3);
+	if (!pcie->extcfg) {
+		err = -EADDRNOTAVAIL;
+		goto power_off;
+	}
+
+	pcie->cs = devm_request_and_ioremap(&pdev->dev, pcie->cfg);
+	if (!pcie->cs) {
+		err = -EADDRNOTAVAIL;
+		goto power_off;
+	}
+
+	pcie->extcs = devm_request_and_ioremap(&pdev->dev, pcie->extcfg);
+	if (!pcie->extcs) {
+		err = -EADDRNOTAVAIL;
+		goto power_off;
+	}
+
+	/* request interrupt */
+	err = platform_get_irq(pdev, 0);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to get IRQ: %d\n", err);
+		goto power_off;
+	}
+
+	pcie->irq = err;
+
+	err = devm_request_irq(&pdev->dev, pcie->irq, tegra_pcie_isr,
+			       IRQF_SHARED, "PCIE", pcie);
 	if (err) {
-		pr_err("PCIE: Failed to register IRQ: %d\n", err);
-		goto err_req_io;
+		dev_err(&pdev->dev, "failed to register IRQ: %d\n", err);
+		goto power_off;
 	}
-	set_irq_flags(INT_PCIE_INTR, IRQF_VALID);
 
 	return 0;
 
-err_req_io:
-	iounmap(tegra_pcie.regs);
-err_map_reg:
-	tegra_pcie_power_off();
-err_pwr_on:
-	tegra_pcie_clocks_put();
-
+power_off:
+	tegra_pcie_power_off(pcie);
 	return err;
 }
 
+static int tegra_pcie_put_resources(struct tegra_pcie *pcie)
+{
+	tegra_pcie_power_off(pcie);
+	return 0;
+}
+
+static inline void init_range(struct resource *range, unsigned long flags)
+{
+	range->start = ~0;
+	range->end = 0;
+	range->flags = flags;
+}
+
+static inline void merge_range(struct resource *range, struct resource *new)
+{
+	if (new->start < range->start)
+		range->start = new->start;
+
+	if (new->end > range->end)
+		range->end = new->end;
+}
+
+static unsigned long tegra_pcie_port_get_pex_ctrl(struct tegra_pcie_port *port)
+{
+	unsigned long ret = 0;
+
+	switch (port->index) {
+	case 0:
+		ret = AFI_PEX0_CTRL;
+		break;
+
+	case 1:
+		ret = AFI_PEX1_CTRL;
+		break;
+	}
+
+	return ret;
+}
+
 /*
  * FIXME: If there are no PCIe cards attached, then calling this function
  * can result in the increase of the bootup time as there are big timeout
  * loops.
  */
 #define TEGRA_PCIE_LINKUP_TIMEOUT	200	/* up to 1.2 seconds */
-static bool tegra_pcie_check_link(struct tegra_pcie_port *pp, int idx,
-				  u32 reset_reg)
+static bool tegra_pcie_port_check_link(struct tegra_pcie_port *port)
 {
-	u32 reg;
-	int retries = 3;
-	int timeout;
+	unsigned long value, ctrl = tegra_pcie_port_get_pex_ctrl(port);
+	unsigned int retries = 3;
+
+	/* enable reference clock */
+	value = afi_readl(port->pcie, ctrl);
+	value |= AFI_PEX_CTRL_REFCLK_EN;
+	afi_writel(port->pcie, value, ctrl);
 
 	do {
-		timeout = TEGRA_PCIE_LINKUP_TIMEOUT;
-		while (timeout) {
-			reg = readl(pp->base + RP_VEND_XP);
+		unsigned int timeout = TEGRA_PCIE_LINKUP_TIMEOUT;
 
-			if (reg & RP_VEND_XP_DL_UP)
+		do {
+			value = readl(port->base + RP_VEND_XP);
+
+			if (value & RP_VEND_XP_DL_UP)
 				break;
 
-			mdelay(1);
-			timeout--;
-		}
+			usleep_range(1000, 1000);
+		} while (--timeout);
 
-		if (!timeout)  {
-			pr_err("PCIE: port %d: link down, retrying\n", idx);
+		if (!timeout) {
+			dev_err(port->pcie->dev, "link %u down, retrying\n",
+				port->index);
 			goto retry;
 		}
 
 		timeout = TEGRA_PCIE_LINKUP_TIMEOUT;
-		while (timeout) {
-			reg = readl(pp->base + RP_LINK_CONTROL_STATUS);
 
-			if (reg & 0x20000000)
+		do {
+			value = readl(port->base + RP_LINK_CONTROL_STATUS);
+
+			if (value & 0x20000000)
 				return true;
 
-			mdelay(1);
-			timeout--;
-		}
+			usleep_range(1000, 1000);
+		} while (--timeout);
 
 retry:
 		/* Pulse the PEX reset */
-		reg = afi_readl(reset_reg) | AFI_PEX_CTRL_RST;
-		afi_writel(reg, reset_reg);
-		mdelay(1);
-		reg = afi_readl(reset_reg) & ~AFI_PEX_CTRL_RST;
-		afi_writel(reg, reset_reg);
+		value = afi_readl(port->pcie, ctrl);
+		value |= AFI_PEX_CTRL_RST;
+		afi_writel(port->pcie, value, ctrl);
 
-		retries--;
-	} while (retries);
+		usleep_range(1000, 1000);
+
+		value = afi_readl(port->pcie, ctrl);
+		value &= ~AFI_PEX_CTRL_RST;
+		afi_writel(port->pcie, value, ctrl);
+	} while (--retries);
 
 	return false;
 }
 
-static void __init tegra_pcie_add_port(int index, u32 offset, u32 reset_reg)
+static int tegra_pcie_enable(struct tegra_pcie *pcie)
 {
-	struct tegra_pcie_port *pp;
+	struct tegra_pcie_port *port;
+	struct hw_pci hw;
 
-	pp = tegra_pcie.port + tegra_pcie.num_ports;
+	list_for_each_entry(port, &pcie->ports, list) {
+		memset(&hw, 0, sizeof(hw));
 
-	pp->index = -1;
-	pp->base = tegra_pcie.regs + offset;
-	pp->link_up = tegra_pcie_check_link(pp, index, reset_reg);
+		hw.nr_controllers = 1;
+		hw.private_data = (void **)&port;
+		hw.setup = tegra_pcie_setup;
+		hw.scan = tegra_pcie_scan_bus;
+		hw.map_irq = tegra_pcie_map_irq;
 
-	if (!pp->link_up) {
-		pp->base = NULL;
-		printk(KERN_INFO "PCIE: port %d: link down, ignoring\n", index);
-		return;
+		pci_common_init(&hw);
+	}
+
+	return 0;
+}
+
+static int tegra_pcie_add_port(struct tegra_pcie *pcie,
+			       struct tegra_pcie_rp *rp)
+{
+	struct tegra_pcie_port *port;
+	unsigned int i;
+
+	if (!rp->num_resources)
+		return -ENODEV;
+
+	port = devm_kzalloc(pcie->dev, sizeof(*port), GFP_KERNEL);
+	if (!port)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&port->list);
+	port->index = rp->index;
+	port->pcie = pcie;
+
+	port->base = devm_request_and_ioremap(pcie->dev, &rp->resources[0]);
+	if (!port->base)
+		return -EADDRNOTAVAIL;
+
+	if (!tegra_pcie_port_check_link(port)) {
+		dev_info(pcie->dev, "link %u down, ignoring\n", port->index);
+		return -ENODEV;
+	}
+
+	for (i = 0; i < rp->num_ranges; i++) {
+		struct resource *src = &rp->ranges[i], *dest;
+
+		switch (src->flags & IORESOURCE_TYPE_BITS) {
+		case IORESOURCE_IO:
+			if (resource_size(src) > SZ_64K) {
+				dev_warn(pcie->dev, "I/O region for port %u exceeds 64 KiB limit, truncating!\n",
+					 port->index);
+				src->end = src->start + SZ_64K - 1;
+			}
+
+			merge_range(&pcie->io, src);
+			dest = &port->io;
+			break;
+
+		case IORESOURCE_MEM:
+			if (src->flags & IORESOURCE_PREFETCH) {
+				merge_range(&pcie->prefetch, src);
+				dest = &port->prefetch;
+			} else {
+				merge_range(&pcie->mem, src);
+				dest = &port->mem;
+			}
+			break;
+
+		default:
+			dev_dbg(pcie->dev, "unknown resource type: %#lx\n",
+				src->flags & IORESOURCE_TYPE_BITS);
+			continue;
+		}
+
+		memcpy(dest, src, sizeof(*src));
 	}
 
-	tegra_pcie.num_ports++;
-	pp->index = index;
-	pp->root_bus_nr = -1;
-	memset(pp->res, 0, sizeof(pp->res));
+	list_add_tail(&port->list, &pcie->ports);
+	pcie->num_ports++;
+
+	return 0;
 }
 
-int __init tegra_pcie_init(bool init_port0, bool init_port1)
+static int __devinit tegra_pcie_probe(struct platform_device *pdev)
 {
+	struct tegra_pcie_pdata *pdata = pdev->dev.platform_data;
+	struct tegra_pcie *pcie;
+	unsigned int i;
 	int err;
 
-	if (!(init_port0 || init_port1))
+	pcie = devm_kzalloc(&pdev->dev, sizeof(*pcie), GFP_KERNEL);
+	if (!pcie)
+		return -ENOMEM;
+
+	pcie->dev = &pdev->dev;
+
+	if (!pdata)
 		return -ENODEV;
 
 	pcibios_min_mem = 0;
 
-	err = tegra_pcie_get_resources();
-	if (err)
+	err = tegra_pcie_get_resources(pcie);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to request resources: %d\n", err);
 		return err;
+	}
+
+	platform_set_drvdata(pdev, pcie);
+
+	if (pdata->init) {
+		err = pdata->init(pdev);
+		if (err < 0)
+			goto put_resources;
+	}
 
-	err = tegra_pcie_enable_controller();
+	err = tegra_pcie_enable_controller(pcie);
 	if (err)
-		return err;
+		goto put_resources;
+
+	/* probe root ports */
+	INIT_LIST_HEAD(&pcie->ports);
+	pcie->num_ports = 0;
+
+	for (i = 0; i < pdata->num_ports; i++) {
+		err = tegra_pcie_add_port(pcie, &pdata->ports[i]);
+		if (err < 0)
+			dev_dbg(&pdev->dev, "failed to add port %u: %d\n",
+				pdata->ports[i].index, err);
+	}
 
 	/* setup the AFI address translations */
-	tegra_pcie_setup_translations();
+	tegra_pcie_setup_translations(pcie);
 
-	if (init_port0)
-		tegra_pcie_add_port(0, RP0_OFFSET, AFI_PEX0_CTRL);
+	err = tegra_pcie_enable(pcie);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to enable PCIe ports: %d\n", err);
+		goto put_resources;
+	}
 
-	if (init_port1)
-		tegra_pcie_add_port(1, RP1_OFFSET, AFI_PEX1_CTRL);
+	return 0;
 
-	pci_common_init(&tegra_pcie_hw);
+put_resources:
+	tegra_pcie_put_resources(pcie);
+	return err;
+}
+
+static int __devexit tegra_pcie_remove(struct platform_device *pdev)
+{
+	struct tegra_pcie_pdata *pdata = pdev->dev.platform_data;
+	struct tegra_pcie *pcie = platform_get_drvdata(pdev);
+	int err;
+
+	err = tegra_pcie_put_resources(pcie);
+	if (err < 0)
+		return err;
+
+	if (pdata->exit) {
+		err = pdata->exit(pdev);
+		if (err < 0)
+			return err;
+	}
 
 	return 0;
 }
+
+static struct platform_driver tegra_pcie_driver = {
+	.driver = {
+		.name = "tegra-pcie",
+		.owner = THIS_MODULE,
+	},
+	.probe = tegra_pcie_probe,
+	.remove = __devexit_p(tegra_pcie_remove),
+};
+module_platform_driver(tegra_pcie_driver);
-- 
1.7.11.2


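Note on the configuration accessors: tegra_pcie_read_conf() and
tegra_pcie_write_conf() always access the aligned 32-bit word and then
shift and mask to get at 1- and 2-byte fields. A minimal sketch of just
that arithmetic, without any MMIO, might look like the code below. The
names conf_extract() and conf_merge() are made up for illustration, and
conf_merge() masks the new value to the field width, which is an
assumption of this sketch rather than something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Pull a 1-, 2- or 4-byte field out of the aligned word containing it. */
static uint32_t conf_extract(uint32_t word, int where, int size)
{
	unsigned int shift = 8 * (where & 3);

	assert(size == 1 || size == 2 || size == 4);

	if (size == 1)
		return (word >> shift) & 0xff;

	if (size == 2)
		return (word >> shift) & 0xffff;

	return word;
}

/* Read-modify-write: replace one field of word and keep the other bytes. */
static uint32_t conf_merge(uint32_t word, int where, int size, uint32_t value)
{
	unsigned int shift = 8 * (where & 3);
	uint32_t field;

	assert(size == 1 || size == 2 || size == 4);

	if (size == 4)
		return value;

	field = (size == 1) ? 0xffu : 0xffffu;

	/* clear the target bytes, then drop the new value into place */
	word &= ~(field << shift);
	word |= (value & field) << shift;

	return word;
}
```

In the driver the word comes from readl(addr) and the merged result
goes back out through writel(), with addr computed from where & ~0x3
(root port) or from the bus/device/function offset (downstream
devices).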

* [PATCH v3 07/10] ARM: tegra: pcie: Add MSI support
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (5 preceding siblings ...)
  2012-07-26 19:55 ` [PATCH v3 06/10] ARM: tegra: Rewrite PCIe support as a driver Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-07-26 19:55 ` [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially Thierry Reding
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC (permalink / raw)
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

This commit adds support for message signaled interrupts to the Tegra
PCIe controller. Based on code by Krishna Kishore <kthota@nvidia.com>.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- clear interrupts before handling them
- free pages used as MSI region

Changes in v2:
- improve compile coverage by using the IS_ENABLED() macro
- move MSI-related fields to a separate structure
- free pages used for the AFI/FPCI region
- properly remove IRQ domain on module removal
- disable MSI interrupt on module removal
- use linear IRQ domain
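
Illustration for the "clear interrupts before handling them" item: with
INT_PCI_MSI_NR defined as 32 * 8, the pending MSIs can be pictured as
eight 32-bit vector registers. The sketch below simulates such a
dispatch loop in user space. It is not the code from this patch: the
msi_vec[] array stands in for the AFI_MSI_VECn registers, record() is a
demo handler, and clearing a bit with a plain store is an assumption
about how the acknowledge works.

```c
#include <stdint.h>

#define MSI_VEC_REGS	8	/* number of simulated vector registers */
#define MSI_PER_REG	32	/* one pending bit per MSI */

/* Simulated pending registers (stand-in for AFI_MSI_VEC0 and up). */
static uint32_t msi_vec[MSI_VEC_REGS];

/* Demo handler state: remembers which MSI numbers were delivered. */
static unsigned int seen[MSI_VEC_REGS * MSI_PER_REG];
static unsigned int nseen;

static void record(unsigned int hwirq)
{
	seen[nseen++] = hwirq;
}

/* Walk all pending bits; acknowledge each one before calling handle(). */
static unsigned int msi_dispatch(void (*handle)(unsigned int hwirq))
{
	unsigned int i, handled = 0;

	for (i = 0; i < MSI_VEC_REGS; i++) {
		uint32_t pending = msi_vec[i];

		while (pending) {
			unsigned int bit = 0;

			/* find the lowest pending bit */
			while (!(pending & (1u << bit)))
				bit++;

			/* clear first, so a re-raised MSI is not lost */
			msi_vec[i] &= ~(1u << bit);
			pending &= ~(1u << bit);

			handle(i * MSI_PER_REG + bit);
			handled++;
		}
	}

	return handled;
}
```

Clearing the bit before running the handler means that an MSI which
fires again while its handler runs sets the bit anew and is picked up
on the next interrupt, instead of being wiped out by a late
acknowledge.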

 arch/arm/mach-tegra/Kconfig             |   1 +
 arch/arm/mach-tegra/devices.c           |   7 +
 arch/arm/mach-tegra/include/mach/irqs.h |   5 +-
 arch/arm/mach-tegra/pcie.c              | 277 +++++++++++++++++++++++++++++++-
 4 files changed, 288 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mach-tegra/Kconfig b/arch/arm/mach-tegra/Kconfig
index 3f6ea1e..dba0702 100644
--- a/arch/arm/mach-tegra/Kconfig
+++ b/arch/arm/mach-tegra/Kconfig
@@ -48,6 +48,7 @@ config ARCH_TEGRA_3x_SOC
 config TEGRA_PCI
 	bool "PCI Express support"
 	depends on ARCH_TEGRA_2x_SOC
+	select ARCH_SUPPORTS_MSI
 	select PCI
 
 config TEGRA_AHB
diff --git a/arch/arm/mach-tegra/devices.c b/arch/arm/mach-tegra/devices.c
index 203af2e..308515a 100644
--- a/arch/arm/mach-tegra/devices.c
+++ b/arch/arm/mach-tegra/devices.c
@@ -767,6 +767,13 @@ static struct resource tegra_pcie_resources[] = {
 		.end = INT_PCIE_INTR,
 		.flags = IORESOURCE_IRQ,
 	},
+#ifdef CONFIG_PCI_MSI
+	[5] = {
+		.start = INT_PCIE_MSI,
+		.end = INT_PCIE_MSI,
+		.flags = IORESOURCE_IRQ,
+	},
+#endif
 };
 
 static struct resource tegra_pcie_rp0_resources[] = {
diff --git a/arch/arm/mach-tegra/include/mach/irqs.h b/arch/arm/mach-tegra/include/mach/irqs.h
index 0a0dcac..a282524 100644
--- a/arch/arm/mach-tegra/include/mach/irqs.h
+++ b/arch/arm/mach-tegra/include/mach/irqs.h
@@ -172,7 +172,10 @@
 /* Tegra30 has 8 banks of 32 GPIOs */
 #define INT_GPIO_NR			(32 * 8)
 
-#define TEGRA_NR_IRQS			(INT_GPIO_BASE + INT_GPIO_NR)
+#define INT_PCI_MSI_BASE		(INT_GPIO_BASE + INT_GPIO_NR)
+#define INT_PCI_MSI_NR			(32 * 8)
+
+#define TEGRA_NR_IRQS			(INT_PCI_MSI_BASE + INT_PCI_MSI_NR)
 
 #define INT_BOARD_BASE			TEGRA_NR_IRQS
 #define NR_BOARD_IRQS			128
diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c
index 3e5fb66..dab3479 100644
--- a/arch/arm/mach-tegra/pcie.c
+++ b/arch/arm/mach-tegra/pcie.c
@@ -32,9 +32,11 @@
 #include <linux/platform_device.h>
 #include <linux/interrupt.h>
 #include <linux/irq.h>
+#include <linux/irqdomain.h>
 #include <linux/clk.h>
 #include <linux/delay.h>
 #include <linux/export.h>
+#include <linux/msi.h>
 
 #include <asm/sizes.h>
 #include <asm/mach/irq.h>
@@ -79,6 +81,24 @@
 #define AFI_MSI_FPCI_BAR_ST	0x64
 #define AFI_MSI_AXI_BAR_ST	0x68
 
+#define AFI_MSI_VEC0		0x6c
+#define AFI_MSI_VEC1		0x70
+#define AFI_MSI_VEC2		0x74
+#define AFI_MSI_VEC3		0x78
+#define AFI_MSI_VEC4		0x7c
+#define AFI_MSI_VEC5		0x80
+#define AFI_MSI_VEC6		0x84
+#define AFI_MSI_VEC7		0x88
+
+#define AFI_MSI_EN_VEC0		0x8c
+#define AFI_MSI_EN_VEC1		0x90
+#define AFI_MSI_EN_VEC2		0x94
+#define AFI_MSI_EN_VEC3		0x98
+#define AFI_MSI_EN_VEC4		0x9c
+#define AFI_MSI_EN_VEC5		0xa0
+#define AFI_MSI_EN_VEC6		0xa4
+#define AFI_MSI_EN_VEC7		0xa8
+
 #define AFI_CONFIGURATION		0xac
 #define  AFI_CONFIGURATION_EN_FPCI	(1 << 0)
 
@@ -166,6 +186,14 @@
 #define PCIE_CONF_FUNC(f)	((f) << 8)
 #define PCIE_CONF_REG(r)	((((r) & 0xf00) << 16) | ((r) & ~3))
 
+struct tegra_pcie_msi {
+	DECLARE_BITMAP(used, INT_PCI_MSI_NR);
+	struct irq_domain *domain;
+	unsigned long pages;
+	struct mutex lock;
+	int irq;
+};
+
 struct tegra_pcie {
 	struct device *dev;
 
@@ -190,6 +218,8 @@ struct tegra_pcie {
 
 	struct list_head ports;
 	unsigned int num_ports;
+
+	struct tegra_pcie_msi *msi;
 };
 
 struct tegra_pcie_port {
@@ -759,6 +789,233 @@ static inline void merge_range(struct resource *range, struct resource *new)
 		range->end = new->end;
 }
 
+static int tegra_pcie_msi_alloc(struct tegra_pcie *pcie)
+{
+	int msi;
+
+	mutex_lock(&pcie->msi->lock);
+
+	msi = find_first_zero_bit(pcie->msi->used, INT_PCI_MSI_NR);
+	if (msi < INT_PCI_MSI_NR)
+		set_bit(msi, pcie->msi->used);
+	else
+		msi = -ENOSPC;
+
+	mutex_unlock(&pcie->msi->lock);
+
+	return msi;
+}
+
+static void tegra_pcie_msi_free(struct tegra_pcie *pcie, unsigned long irq)
+{
+	mutex_lock(&pcie->msi->lock);
+
+	if (!test_bit(irq, pcie->msi->used))
+		dev_err(pcie->dev, "trying to free unused MSI#%lu\n", irq);
+	else
+		clear_bit(irq, pcie->msi->used);
+
+	mutex_unlock(&pcie->msi->lock);
+}
+
+static irqreturn_t tegra_pcie_msi_irq(int irq, void *data)
+{
+	struct tegra_pcie *pcie = data;
+	unsigned int i;
+
+	for (i = 0; i < 8; i++) {
+		unsigned long reg = afi_readl(pcie, AFI_MSI_VEC0 + i * 4);
+
+		while (reg) {
+			unsigned int offset = find_first_bit(&reg, 32);
+			unsigned int index = i * 32 + offset;
+			unsigned int irq;
+
+			/* clear the interrupt */
+			afi_writel(pcie, 1 << offset, AFI_MSI_VEC0 + i * 4);
+
+			irq = irq_find_mapping(pcie->msi->domain, index);
+			if (irq) {
+				if (test_bit(index, pcie->msi->used))
+					generic_handle_irq(irq);
+				else
+					dev_info(pcie->dev, "unhandled MSI\n");
+			} else {
+				/*
+				 * that's weird, who triggered this?
+				 * just clear it
+				 */
+				dev_info(pcie->dev, "unexpected MSI\n");
+			}
+
+			/* see if there's any more pending in this vector */
+			reg = afi_readl(pcie, AFI_MSI_VEC0 + i * 4);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
+/* called by arch_setup_msi_irqs in drivers/pci/msi.c */
+int arch_setup_msi_irq(struct pci_dev *pdev, struct msi_desc *desc)
+{
+	struct tegra_pcie_port *port = sys_to_pcie(pdev->bus->sysdata);
+	struct tegra_pcie *pcie = port->pcie;
+	struct msi_msg msg;
+	unsigned int irq;
+	int hwirq;
+
+	hwirq = tegra_pcie_msi_alloc(pcie);
+	if (hwirq < 0)
+		return hwirq;
+
+	irq = irq_create_mapping(pcie->msi->domain, hwirq);
+	if (!irq)
+		return -EINVAL;
+
+	irq_set_msi_desc(irq, desc);
+
+	msg.address_lo = afi_readl(pcie, AFI_MSI_AXI_BAR_ST);
+	/* 32 bit address only */
+	msg.address_hi = 0;
+	msg.data = hwirq;
+
+	write_msi_msg(irq, &msg);
+
+	return 0;
+}
+
+void arch_teardown_msi_irq(unsigned int irq)
+{
+	struct tegra_pcie *pcie = irq_get_chip_data(irq);
+	struct irq_data *d = irq_get_irq_data(irq);
+
+	tegra_pcie_msi_free(pcie, d->hwirq);
+}
+
+static struct irq_chip tegra_pcie_msi_irq_chip = {
+	.name = "Tegra PCIe MSI",
+	.irq_enable = unmask_msi_irq,
+	.irq_disable = mask_msi_irq,
+	.irq_mask = mask_msi_irq,
+	.irq_unmask = unmask_msi_irq,
+};
+
+static int tegra_pcie_msi_map(struct irq_domain *domain, unsigned int irq,
+			      irq_hw_number_t hwirq)
+{
+	irq_set_chip_and_handler(irq, &tegra_pcie_msi_irq_chip,
+				 handle_simple_irq);
+	irq_set_chip_data(irq, domain->host_data);
+	set_irq_flags(irq, IRQF_VALID);
+
+	return 0;
+}
+
+static const struct irq_domain_ops msi_domain_ops = {
+	.map = tegra_pcie_msi_map,
+};
+
+static int tegra_pcie_enable_msi(struct tegra_pcie *pcie)
+{
+	struct platform_device *pdev = to_platform_device(pcie->dev);
+	unsigned long base;
+	int err;
+	u32 reg;
+
+	pcie->msi = devm_kzalloc(&pdev->dev, sizeof(*pcie->msi), GFP_KERNEL);
+	if (!pcie->msi)
+		return -ENOMEM;
+
+	mutex_init(&pcie->msi->lock);
+
+	pcie->msi->domain = irq_domain_add_linear(pcie->dev->of_node,
+						  INT_PCI_MSI_NR,
+						  &msi_domain_ops, pcie);
+	if (!pcie->msi->domain) {
+		dev_err(&pdev->dev, "failed to create IRQ domain\n");
+		return -ENOMEM;
+	}
+
+	err = platform_get_irq(pdev, 1);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to get IRQ: %d\n", err);
+		goto err;
+	}
+
+	pcie->msi->irq = err;
+
+	err = devm_request_irq(&pdev->dev, pcie->msi->irq, tegra_pcie_msi_irq,
+			       0, tegra_pcie_msi_irq_chip.name, pcie);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to request IRQ: %d\n", err);
+		goto err;
+	}
+
+	/* setup AFI/FPCI range */
+	pcie->msi->pages = __get_free_pages(GFP_KERNEL, 3);
+	base = virt_to_phys((void *)pcie->msi->pages);
+
+	afi_writel(pcie, base, AFI_MSI_FPCI_BAR_ST);
+	afi_writel(pcie, base, AFI_MSI_AXI_BAR_ST);
+	/* this register is in 4K increments */
+	afi_writel(pcie, 1, AFI_MSI_BAR_SZ);
+
+	/* enable all MSI vectors */
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC0);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC1);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC2);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC3);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC4);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC5);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC6);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC7);
+
+	/* and unmask the MSI interrupt */
+	reg = afi_readl(pcie, AFI_INTR_MASK);
+	reg |= AFI_INTR_MASK_MSI_MASK;
+	afi_writel(pcie, reg, AFI_INTR_MASK);
+
+	return 0;
+
+err:
+	irq_domain_remove(pcie->msi->domain);
+	return err;
+}
+
+static int tegra_pcie_disable_msi(struct tegra_pcie *pcie)
+{
+	unsigned int i, irq;
+	u32 value;
+
+	/* mask the MSI interrupt */
+	value = afi_readl(pcie, AFI_INTR_MASK);
+	value &= ~AFI_INTR_MASK_MSI_MASK;
+	afi_writel(pcie, value, AFI_INTR_MASK);
+
+	/* disable all MSI vectors */
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC0);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC1);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC2);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC3);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC4);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC5);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC6);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC7);
+
+	free_pages(pcie->msi->pages, 3);
+
+	for (i = 0; i < INT_PCI_MSI_NR; i++) {
+		irq = irq_find_mapping(pcie->msi->domain, i);
+		if (irq > 0)
+			irq_dispose_mapping(irq);
+	}
+
+	irq_domain_remove(pcie->msi->domain);
+
+	return 0;
+}
+
 static unsigned long tegra_pcie_port_get_pex_ctrl(struct tegra_pcie_port *port)
 {
 	unsigned long ret = 0;
@@ -973,14 +1230,26 @@ static int __devinit tegra_pcie_probe(struct platform_device *pdev)
 	/* setup the AFI address translations */
 	tegra_pcie_setup_translations(pcie);
 
+	if (IS_ENABLED(CONFIG_PCI_MSI)) {
+		err = tegra_pcie_enable_msi(pcie);
+		if (err < 0) {
+			dev_err(&pdev->dev, "failed to enable MSI support: %d\n",
+				err);
+			goto put_resources;
+		}
+	}
+
 	err = tegra_pcie_enable(pcie);
 	if (err < 0) {
 		dev_err(&pdev->dev, "failed to enable PCIe ports: %d\n", err);
-		goto put_resources;
+		goto disable_msi;
 	}
 
 	return 0;
 
+disable_msi:
+	if (IS_ENABLED(CONFIG_PCI_MSI))
+		tegra_pcie_disable_msi(pcie);
 put_resources:
 	tegra_pcie_put_resources(pcie);
 	return err;
@@ -992,6 +1261,12 @@ static int __devexit tegra_pcie_remove(struct platform_device *pdev)
 	struct tegra_pcie *pcie = platform_get_drvdata(pdev);
 	int err;
 
+	if (IS_ENABLED(CONFIG_PCI_MSI)) {
+		err = tegra_pcie_disable_msi(pcie);
+		if (err < 0)
+			return err;
+	}
+
 	err = tegra_pcie_put_resources(pcie);
 	if (err < 0)
 		return err;
-- 
1.7.11.2



* [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (6 preceding siblings ...)
  2012-07-26 19:55 ` [PATCH v3 07/10] ARM: tegra: pcie: Add MSI support Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-07-31 20:18   ` Rob Herring
  2012-07-26 19:55 ` [PATCH v3 09/10] of: Add of_pci_parse_ranges() Thierry Reding
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC (permalink / raw)
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

When a bus specifies #address-cells > 2, of_bus_default_map() now
assumes that the mapping isn't for a physical address but rather an
identifier that needs to match exactly.

This is required by bindings that use multiple cells to translate a
resource to the parent bus (device index, type, ...).

See here for the discussion:

	https://lists.ozlabs.org/pipermail/devicetree-discuss/2012-June/016577.html

Originally-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- new patch
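
A simplified user-space model of the rule may help show why the exact
match is needed. This is only a sketch under assumptions: default_map()
and read_number() are illustrative stand-ins for of_bus_default_map() and
of_read_number(), the cells are taken to be in CPU byte order already,
and the size is passed separately instead of being read from the range.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BAD_ADDR UINT64_MAX /* stand-in for the kernel's OF_BAD_ADDR */

/* Combine "cells" 32-bit cells into one number, most significant first. */
static uint64_t read_number(const uint32_t *cell, int cells)
{
	uint64_t value = 0;

	while (cells--)
		value = (value << 32) | *cell++;

	return value;
}

/*
 * Model of the default mapping with the extra check: "range" holds the
 * child address (na cells) and "addr" the address to translate.
 */
static uint64_t default_map(const uint32_t *addr, const uint32_t *range,
			    int na, uint64_t size)
{
	uint64_t cp = read_number(range, na);
	uint64_t da = read_number(addr, na);

	/* more than two cells: treat the address as an identifier */
	if (na > 2 && memcmp(range, addr, na * sizeof(*addr)) != 0)
		return BAD_ADDR;

	if (da < cp || da >= cp + size)
		return BAD_ADDR;

	return da - cp;
}
```

With three cells the first cell is shifted out of the 64-bit value, so in
this model the identifiers <0 1 0> and <1 1 0> both read as the same
number. Only the memcmp() keeps a request for port 1 from matching the
range of port 0.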

 drivers/of/address.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 7e262a6..2776119 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -69,6 +69,14 @@ static u64 of_bus_default_map(u32 *addr, const __be32 *range,
 		 (unsigned long long)cp, (unsigned long long)s,
 		 (unsigned long long)da);
 
+	/*
+	 * If the number of address cells is larger than 2 we assume the
+	 * mapping doesn't specify a physical address. Rather, the address
+	 * specifies an identifier that must match exactly.
+	 */
+	if (na > 2 && memcmp(range, addr, na * 4) != 0)
+		return OF_BAD_ADDR;
+
 	if (da < cp || da >= (cp + s))
 		return OF_BAD_ADDR;
 	return da - cp;
-- 
1.7.11.2



* [PATCH v3 09/10] of: Add of_pci_parse_ranges()
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (7 preceding siblings ...)
  2012-07-26 19:55 ` [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-07-31 20:07   ` Rob Herring
  2012-07-26 19:55 ` [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support Thierry Reding
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC (permalink / raw)
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

This new function parses the ranges property of PCI device nodes into
an array of struct resource elements. It is useful in multiple-port PCI
host controller drivers to collect information about the ranges that they
need to forward to the respective ports.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- new patch
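
As a rough illustration of what the parser looks at, the sketch below
decodes the first cell of a PCI ranges entry the same way the switch
statement in the patch does: bits 25:24 select the address space and bit
30 marks prefetchable memory. It is a user-space model, not kernel code;
the enum and the pci_range_*() helper names are assumptions made for this
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Space codes found in bits 25:24 of the first cell. */
enum pci_space {
	PCI_SPACE_CONFIG = 0,
	PCI_SPACE_IO = 1,
	PCI_SPACE_MEM32 = 2,
	PCI_SPACE_MEM64 = 3,
};

static enum pci_space pci_range_space(uint32_t phys_hi)
{
	return (enum pci_space)((phys_hi >> 24) & 0x3);
}

/* Bit 30 marks prefetchable memory; only meaningful for memory spaces. */
static bool pci_range_prefetchable(uint32_t phys_hi)
{
	enum pci_space space = pci_range_space(phys_hi);

	if (space != PCI_SPACE_MEM32 && space != PCI_SPACE_MEM64)
		return false;

	return (phys_hi & 0x40000000) != 0;
}

/* Number of complete entries in a "ranges" property of len bytes. */
static unsigned int pci_range_count(unsigned int len, unsigned int pna)
{
	/* 3 child address cells + pna parent address cells + 2 size cells */
	unsigned int np = 3 + pna + 2;

	return len / (np * (unsigned int)sizeof(uint32_t));
}
```

For the root port nodes in patch 10 the parent uses three address cells,
so every entry is eight cells long and a five-entry ranges property of
160 bytes yields a count of 5.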

 drivers/of/of_pci.c    | 84 +++++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/of_pci.h |  2 ++
 2 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 13e37e2..bcff301 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -1,7 +1,8 @@
 #include <linux/kernel.h>
 #include <linux/export.h>
-#include <linux/of.h>
+#include <linux/of_address.h>
 #include <linux/of_pci.h>
+#include <linux/slab.h>
 #include <asm/prom.h>
 
 static inline int __of_pci_pci_compare(struct device_node *node,
@@ -40,3 +41,84 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
 	return NULL;
 }
 EXPORT_SYMBOL_GPL(of_pci_find_child_device);
+
+struct resource *of_pci_parse_ranges(struct device_node *node,
+				     unsigned int *countp)
+{
+	unsigned int count, i = 0;
+	struct resource *ranges;
+	const __be32 *values;
+	int len, pna, np;
+
+	values = of_get_property(node, "ranges", &len);
+	if (!values)
+		return NULL;
+
+	pna = of_n_addr_cells(node);
+	np = pna + 5;
+
+	count = len / (np * sizeof(*values));
+
+	ranges = kzalloc(sizeof(*ranges) * count, GFP_KERNEL);
+	if (!ranges)
+		return NULL;
+
+	pr_debug("PCI ranges:\n");
+
+	while ((len -= np * sizeof(*values)) >= 0) {
+		u64 addr = of_translate_address(node, values + 3);
+		u64 size = of_read_number(values + 3 + pna, 2);
+		u32 type = be32_to_cpup(values);
+		const char *suffix = "";
+		struct resource range;
+
+		memset(&range, 0, sizeof(range));
+		range.start = addr;
+		range.end = addr + size - 1;
+
+		switch ((type >> 24) & 0x3) {
+		case 0:
+			range.flags = IORESOURCE_MEM | IORESOURCE_PCI_CS;
+			pr_debug("  CS  %#x-%#x\n", range.start, range.end);
+			break;
+
+		case 1:
+			range.flags = IORESOURCE_IO;
+			pr_debug("  IO  %#x-%#x\n", range.start, range.end);
+			break;
+
+		case 2:
+			range.flags = IORESOURCE_MEM;
+
+			if (type & 0x40000000) {
+				range.flags |= IORESOURCE_PREFETCH;
+				suffix = "prefetch";
+			}
+
+			pr_debug("  MEM %#x-%#x %s\n", range.start, range.end,
+				 suffix);
+			break;
+
+		case 3:
+			range.flags = IORESOURCE_MEM | IORESOURCE_MEM_64;
+
+			if (type & 0x40000000) {
+				range.flags |= IORESOURCE_PREFETCH;
+				suffix = "prefetch";
+			}
+
+			pr_debug("  MEM %#x-%#x 64-bit %s\n", range.start,
+				 range.end, suffix);
+			break;
+		}
+
+		memcpy(&ranges[i++], &range, sizeof(range));
+		values += np;
+	}
+
+	if (countp)
+		*countp = count;
+
+	return ranges;
+}
+EXPORT_SYMBOL_GPL(of_pci_parse_ranges);
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index bb115de..c0db6ea 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -10,5 +10,7 @@ int of_irq_map_pci(const struct pci_dev *pdev, struct of_irq *out_irq);
 struct device_node;
 struct device_node *of_pci_find_child_device(struct device_node *parent,
 					     unsigned int devfn);
+struct resource *of_pci_parse_ranges(struct device_node *node,
+				     unsigned int *countp);
 
 #endif
-- 
1.7.11.2



* [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (8 preceding siblings ...)
  2012-07-26 19:55 ` [PATCH v3 09/10] of: Add of_pci_parse_ranges() Thierry Reding
@ 2012-07-26 19:55 ` Thierry Reding
  2012-08-14 20:12   ` Thierry Reding
  2012-07-31 16:18 ` [PATCH v3 00/10] ARM: tegra: Add PCIe " Stephen Warren
  2012-08-06 19:42 ` Stephen Warren
  11 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-07-26 19:55 UTC (permalink / raw)
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

This commit adds support for instantiating the Tegra PCIe controller
from a device tree.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
Changes in v3:
- rewrite the DT binding and adapt driver correspondingly

Changes in v2:
- increase compile coverage by using the IS_ENABLED() macro
- disable node by default
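
To illustrate the controller-level ranges used by the new binding: each
entry is keyed by (port index, address type, 0) and maps to a CPU address
and a size. The sketch below is a user-space model filled with the values
from tegra20.dtsi in this patch; struct ctrl_range, enum range_type and
find_range() are hypothetical names for this example and do not exist in
the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address types carried in cell 1, as listed in the binding document. */
enum range_type {
	RANGE_RP_REGS = 0,
	RANGE_CONFIG = 1,
	RANGE_EXT_CONFIG = 2,
	RANGE_IO = 3,
	RANGE_MEM = 4,
	RANGE_PREFETCH = 5,
};

/* One controller-level "ranges" entry; cell 2 is always 0 and omitted. */
struct ctrl_range {
	uint32_t port;
	uint32_t type;
	uint32_t cpu_addr;
	uint32_t size;
};

static const struct ctrl_range tegra20_ranges[] = {
	{ 0, RANGE_RP_REGS,    0x80000000, 0x00001000 },
	{ 0, RANGE_CONFIG,     0x81000000, 0x00800000 },
	{ 0, RANGE_EXT_CONFIG, 0x90000000, 0x08000000 },
	{ 0, RANGE_IO,         0x82000000, 0x00010000 },
	{ 0, RANGE_MEM,        0xa0000000, 0x08000000 },
	{ 0, RANGE_PREFETCH,   0xb0000000, 0x08000000 },
	{ 1, RANGE_RP_REGS,    0x80001000, 0x00001000 },
	{ 1, RANGE_CONFIG,     0x81800000, 0x00800000 },
	{ 1, RANGE_EXT_CONFIG, 0x98000000, 0x08000000 },
	{ 1, RANGE_IO,         0x82010000, 0x00010000 },
	{ 1, RANGE_MEM,        0xa8000000, 0x08000000 },
	{ 1, RANGE_PREFETCH,   0xb8000000, 0x08000000 },
};

/* Look up the entry whose identifier cells match exactly, or NULL. */
static const struct ctrl_range *find_range(uint32_t port, uint32_t type)
{
	size_t n = sizeof(tegra20_ranges) / sizeof(tegra20_ranges[0]);

	for (size_t i = 0; i < n; i++) {
		const struct ctrl_range *r = &tegra20_ranges[i];

		if (r->port == port && r->type == type)
			return r;
	}

	return NULL;
}
```

Looking up (1, RANGE_IO) in this table returns the 64 KiB window at
0x82010000, which is the region the pci@1 node refers to through the
<1 3 0> parent address in its own ranges property.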

 .../bindings/pci/nvidia,tegra20-pcie.txt           |  94 ++++++++++
 arch/arm/boot/dts/tegra20.dtsi                     |  62 +++++++
 arch/arm/mach-tegra/board-dt-tegra20.c             |   7 +-
 arch/arm/mach-tegra/pcie.c                         | 195 +++++++++++++++++++++
 4 files changed, 353 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt

diff --git a/Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt b/Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt
new file mode 100644
index 0000000..b181d4c
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt
@@ -0,0 +1,94 @@
+NVIDIA Tegra PCIe controller
+
+Required properties:
+- compatible: "nvidia,tegra20-pcie"
+- reg: physical base address and length of the controller's registers
+- interrupts: the interrupt outputs of the controller
+- pex-clk-supply: supply voltage for internal reference clock
+- vdd-supply: power supply for controller (1.05V)
+- ranges: describes the translation of addresses for root ports
+- #address-cells: address representation for root ports (must be 3)
+  - cell 0 specifies the port index
+  - cell 1 denotes the address type
+      0: root port register space
+      1: PCI configuration space
+      2: PCI extended configuration space
+      3: downstream I/O
+      4: non-prefetchable memory
+      5: prefetchable memory
+  - cell 2 provides a number space that can include the size (should be 0)
+- #size-cells: size representation for root ports (must be 1)
+
+Root ports are defined as subnodes of the PCIe controller node.
+
+Required properties:
+- device_type: must be "pciex"
+- reg: address and size of the port configuration registers
+- #address-cells: must be 3
+- #size-cells: must be 2
+- ranges: sub-ranges distributed from the PCIe controller node
+- nvidia,num-lanes: number of lanes to use for this port
+
+Example:
+
+	pcie-controller {
+		compatible = "nvidia,tegra20-pcie";
+		reg = <0x80003000 0x00000800   /* PADS registers */
+		       0x80003800 0x00000200   /* AFI registers */
+		       0x81000000 0x01000000   /* configuration space */
+		       0x90000000 0x10000000>; /* extended configuration space */
+		interrupts = <0 98 0x04   /* controller interrupt */
+		              0 99 0x04>; /* MSI interrupt */
+		status = "disabled";
+
+		ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
+			  0 1 0  0x81000000 0x00800000   /* port 0 config space */
+			  0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
+			  0 3 0  0x82000000 0x00008000   /* port 0 downstream I/O */
+			  0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
+			  0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
+
+			  1 0 0  0x80001000 0x00001000   /* root port 1 */
+			  1 1 0  0x81800000 0x00800000   /* port 1 config space */
+			  1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
+			  1 3 0  0x82008000 0x00008000   /* port 1 downstream I/O */
+			  1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
+			  1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
+
+		#address-cells = <3>;
+		#size-cells = <1>;
+
+		pci@0 {
+			device_type = "pciex";
+			reg = <0 0 0 0x1000>;
+			status = "disabled";
+
+			#address-cells = <3>;
+			#size-cells = <2>;
+
+			ranges = <0x80000000 0 0  0 1 0  0 0x00800000   /* config space */
+				  0x90000000 0 0  0 2 0  0 0x08000000   /* ext config space */
+				  0x81000000 0 0  0 3 0  0 0x00008000   /* I/O */
+				  0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
+				  0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
+
+			nvidia,num-lanes = <2>;
+		};
+
+		pci@1 {
+			device_type = "pciex";
+			reg = <1 0 0 0x1000>;
+			status = "disabled";
+
+			#address-cells = <3>;
+			#size-cells = <2>;
+
+			ranges = <0x80000000 0 0  1 1 0  0 0x00800000   /* config space */
+				  0x90000000 0 0  1 2 0  0 0x08000000   /* ext config space */
+				  0x81000000 0 0  1 3 0  0 0x00008000   /* I/O */
+				  0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
+				  0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
+
+			nvidia,num-lanes = <2>;
+		};
+	};
diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index a094c97..c886dff 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -199,6 +199,68 @@
 		#size-cells = <0>;
 	};
 
+	pcie-controller {
+		compatible = "nvidia,tegra20-pcie";
+		reg = <0x80003000 0x00000800   /* PADS registers */
+		       0x80003800 0x00000200   /* AFI registers */
+		       0x81000000 0x01000000   /* configuration space */
+		       0x90000000 0x10000000>; /* extended configuration space */
+		interrupts = <0 98 0x04   /* controller interrupt */
+		              0 99 0x04>; /* MSI interrupt */
+		status = "disabled";
+
+		ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
+			  0 1 0  0x81000000 0x00800000   /* port 0 config space */
+			  0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
+			  0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
+			  0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
+			  0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
+
+			  1 0 0  0x80001000 0x00001000   /* root port 1 */
+			  1 1 0  0x81800000 0x00800000   /* port 1 config space */
+			  1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
+			  1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
+			  1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
+			  1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
+
+		#address-cells = <3>;
+		#size-cells = <1>;
+
+		pci@0 {
+			device_type = "pciex";
+			reg = <0 0 0 0x1000>;
+			status = "disabled";
+
+			#address-cells = <3>;
+			#size-cells = <2>;
+
+			ranges = <0x80000000 0 0  0 1 0  0 0x00800000   /* config space */
+				  0x90000000 0 0  0 2 0  0 0x08000000   /* ext config space */
+				  0x81000000 0 0  0 3 0  0 0x00010000   /* I/O */
+				  0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
+				  0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
+
+			nvidia,num-lanes = <2>;
+		};
+
+		pci@1 {
+			device_type = "pciex";
+			reg = <1 0 0 0x1000>;
+			status = "disabled";
+
+			#address-cells = <3>;
+			#size-cells = <2>;
+
+			ranges = <0x80000000 0 0  1 1 0  0 0x00800000   /* config space */
+				  0x90000000 0 0  1 2 0  0 0x08000000   /* ext config space */
+				  0x81000000 0 0  1 3 0  0 0x00010000   /* I/O */
+				  0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
+				  0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
+
+			nvidia,num-lanes = <2>;
+		};
+	};
+
 	usb@c5000000 {
 		compatible = "nvidia,tegra20-ehci", "usb-ehci";
 		reg = <0xc5000000 0x4000>;
diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c b/arch/arm/mach-tegra/board-dt-tegra20.c
index a8a05c1..caa377a 100644
--- a/arch/arm/mach-tegra/board-dt-tegra20.c
+++ b/arch/arm/mach-tegra/board-dt-tegra20.c
@@ -40,6 +40,7 @@
 
 #include <mach/iomap.h>
 #include <mach/irqs.h>
+#include <mach/pci-tegra.h>
 
 #include "board.h"
 #include "board-harmony.h"
@@ -114,11 +115,7 @@ static void __init tegra_dt_init(void)
 #ifdef CONFIG_MACH_TRIMSLICE
 static void __init trimslice_init(void)
 {
-	int ret;
-
-	ret = tegra_pcie_init(true, true);
-	if (ret)
-		pr_err("tegra_pci_init() failed: %d\n", ret);
+	platform_device_register(&tegra_pcie_device);
 }
 #endif
 
diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c
index dab3479..2d00b1c 100644
--- a/arch/arm/mach-tegra/pcie.c
+++ b/arch/arm/mach-tegra/pcie.c
@@ -37,6 +37,10 @@
 #include <linux/delay.h>
 #include <linux/export.h>
 #include <linux/msi.h>
+#include <linux/of_address.h>
+#include <linux/of_pci.h>
+#include <linux/of_platform.h>
+#include <linux/regulator/consumer.h>
 
 #include <asm/sizes.h>
 #include <asm/mach/irq.h>
@@ -220,6 +224,9 @@ struct tegra_pcie {
 	unsigned int num_ports;
 
 	struct tegra_pcie_msi *msi;
+
+	struct regulator *pex_clk_supply;
+	struct regulator *vdd_supply;
 };
 
 struct tegra_pcie_port {
@@ -1016,6 +1023,178 @@ static int tegra_pcie_disable_msi(struct tegra_pcie *pcie)
 	return 0;
 }
 
+static int tegra_pcie_dt_init(struct platform_device *pdev)
+{
+	struct tegra_pcie *pcie = platform_get_drvdata(pdev);
+	int err;
+
+	if (!IS_ERR_OR_NULL(pcie->vdd_supply)) {
+		err = regulator_enable(pcie->vdd_supply);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to enable VDD regulator: %d\n", err);
+			return err;
+		}
+	}
+
+	if (!IS_ERR_OR_NULL(pcie->pex_clk_supply)) {
+		err = regulator_enable(pcie->pex_clk_supply);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to enable pex-clk regulator: %d\n",
+				err);
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+static int tegra_pcie_dt_exit(struct platform_device *pdev)
+{
+	struct tegra_pcie *pcie = platform_get_drvdata(pdev);
+	int err;
+
+	if (!IS_ERR_OR_NULL(pcie->pex_clk_supply)) {
+		err = regulator_disable(pcie->pex_clk_supply);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to disable pex-clk regulator: %d\n",
+				err);
+			return err;
+		}
+	}
+
+	if (!IS_ERR_OR_NULL(pcie->vdd_supply)) {
+		err = regulator_disable(pcie->vdd_supply);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to disable VDD regulator: %d\n", err);
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+struct resource *of_parse_reg(struct device_node *np, unsigned int *countp)
+{
+	unsigned int count = 0, i;
+	struct resource *reg, res;
+	int err;
+
+	while (of_address_to_resource(np, count, &res) == 0)
+		count++;
+
+	reg = kzalloc(sizeof(*reg) * count, GFP_KERNEL);
+	if (!reg)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < count; i++) {
+		err = of_address_to_resource(np, i, &reg[i]);
+		if (err < 0) {
+			kfree(reg);
+			return ERR_PTR(err);
+		}
+	}
+
+	if (countp)
+		*countp = count;
+
+	return reg;
+}
+
+static int tegra_pcie_port_parse_dt(struct tegra_pcie *pcie,
+				    struct device_node *node,
+				    struct tegra_pcie_rp *port)
+{
+	const __be32 *values;
+	u32 value;
+	int err;
+
+	values = of_get_property(node, "reg", NULL);
+	if (!values)
+		return -ENODEV;
+
+	port->index = be32_to_cpup(values);
+
+	port->resources = of_parse_reg(node, &port->num_resources);
+	if (IS_ERR(port->resources))
+		return PTR_ERR(port->resources);
+
+	port->ranges = of_pci_parse_ranges(node, &port->num_ranges);
+	if (!port->ranges) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	err = of_property_read_u32(node, "nvidia,num-lanes", &value);
+	if (err < 0)
+		goto free;
+
+	port->num_lanes = value;
+
+	return 0;
+
+free:
+	kfree(port->ranges);
+	kfree(port->resources);
+	return err;
+}
+
+static struct tegra_pcie_pdata *tegra_pcie_parse_dt(struct tegra_pcie *pcie)
+{
+	struct tegra_pcie_pdata *pdata;
+	struct device_node *child;
+	unsigned int i = 0;
+	size_t size;
+	int err;
+
+	pdata = devm_kzalloc(pcie->dev, sizeof(*pdata), GFP_KERNEL);
+	if (!pdata)
+		return NULL;
+
+	pdata->init = tegra_pcie_dt_init;
+	pdata->exit = tegra_pcie_dt_exit;
+
+	pcie->vdd_supply = devm_regulator_get(pcie->dev, "vdd");
+	if (IS_ERR_OR_NULL(pcie->vdd_supply))
+		return ERR_CAST(pcie->vdd_supply);
+
+	pcie->pex_clk_supply = devm_regulator_get(pcie->dev, "pex-clk");
+	if (IS_ERR_OR_NULL(pcie->pex_clk_supply))
+		return ERR_CAST(pcie->pex_clk_supply);
+
+	/* parse root port nodes */
+	for_each_child_of_node(pcie->dev->of_node, child) {
+		if (of_device_is_available(child))
+			pdata->num_ports++;
+	}
+
+	size = pdata->num_ports * sizeof(*pdata->ports);
+
+	pdata->ports = devm_kzalloc(pcie->dev, size, GFP_KERNEL);
+	if (!pdata->ports)
+		return ERR_PTR(-ENOMEM);
+
+	for_each_child_of_node(pcie->dev->of_node, child) {
+		struct tegra_pcie_rp *port = &pdata->ports[i];
+
+		if (!of_device_is_available(child))
+			continue;
+
+		err = tegra_pcie_port_parse_dt(pcie, child, port);
+		if (err < 0)
+			return ERR_PTR(err);
+
+		i++;
+	}
+
+	pdata->num_ports = i;
+
+	return pdata;
+}
+
 static unsigned long tegra_pcie_port_get_pex_ctrl(struct tegra_pcie_port *port)
 {
 	unsigned long ret = 0;
@@ -1193,6 +1372,14 @@ static int __devinit tegra_pcie_probe(struct platform_device *pdev)
 
 	pcie->dev = &pdev->dev;
 
+	if (IS_ENABLED(CONFIG_OF)) {
+		if (!pdata && pdev->dev.of_node) {
+			pdata = tegra_pcie_parse_dt(pcie);
+			if (IS_ERR(pdata))
+				return PTR_ERR(pdata);
+		}
+	}
+
 	if (!pdata)
 		return -ENODEV;
 
@@ -1280,10 +1467,18 @@ static int __devexit tegra_pcie_remove(struct platform_device *pdev)
 	return 0;
 }
 
+#ifdef CONFIG_OF
+static const struct of_device_id tegra_pcie_of_match[] = {
+	{ .compatible = "nvidia,tegra20-pcie", },
+	{ },
+};
+#endif
+
 static struct platform_driver tegra_pcie_driver = {
 	.driver = {
 		.name = "tegra-pcie",
 		.owner = THIS_MODULE,
+		.of_match_table = of_match_ptr(tegra_pcie_of_match),
 	},
 	.probe = tegra_pcie_probe,
 	.remove = __devexit_p(tegra_pcie_remove),
-- 
1.7.11.2



* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (9 preceding siblings ...)
  2012-07-26 19:55 ` [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support Thierry Reding
@ 2012-07-31 16:18 ` Stephen Warren
  2012-08-01  6:35   ` Thierry Reding
  2012-08-06 19:42 ` Stephen Warren
  11 siblings, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-07-31 16:18 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann

On 07/26/2012 01:55 PM, Thierry Reding wrote:
> This patch series adds support for device tree based probing of the PCIe
> controller found on Tegra SoCs.

Thierry,

I think one thing that would help here would be to split up this series
into one per subsystem, and to get all the dependencies merged by the
respective maintainers. Preferably, each subsystem would export a stable
branch (perhaps consisting of just these patches) that I can then merge
into the Tegra tree and use as a basis for the PCIe driver itself.
Hopefully this approach will get more traction on all the non-Tegra
changes. Does that sound like a good plan?

Or, I can just take it all through the Tegra tree, but I'll definitely
need acks on all/most of the non-Tegra patches, since although the
patches themselves are quite small, I'd like to be sure that maintainers
are OK with the conceptual changes, like initializing PCI controllers
later than at present etc.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 09/10] of: Add of_pci_parse_ranges()
  2012-07-26 19:55 ` [PATCH v3 09/10] of: Add of_pci_parse_ranges() Thierry Reding
@ 2012-07-31 20:07   ` Rob Herring
  2012-08-01  6:54     ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Rob Herring @ 2012-07-31 20:07 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, Russell King, linux-pci, devicetree-discuss,
	Rob Herring, Colin Cross, Bjorn Helgaas, linux-arm-kernel,
	Arnd Bergmann

On 07/26/2012 02:55 PM, Thierry Reding wrote:
> This new function parses the ranges property of PCI device nodes into
> an array of struct resource elements. It is useful in multiple-port PCI
> host controller drivers to collect information about the ranges that it
> needs to forward to the respective ports.

It seems to me that some of the DT PCI code in arch/powerpc/kernel/pci*
like pci_process_bridge_OF_ranges() should apply for ARM as well.

Each arch defining their own pci controller structs complicates this,
but I would think at least the DT parsing can be common.

> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> ---
> Changes in v3:
> - new patch
> 
>  drivers/of/of_pci.c    | 84 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  include/linux/of_pci.h |  2 ++
>  2 files changed, 85 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> index 13e37e2..bcff301 100644
> --- a/drivers/of/of_pci.c
> +++ b/drivers/of/of_pci.c
> @@ -1,7 +1,8 @@
>  #include <linux/kernel.h>
>  #include <linux/export.h>
> -#include <linux/of.h>
> +#include <linux/of_address.h>
>  #include <linux/of_pci.h>
> +#include <linux/slab.h>
>  #include <asm/prom.h>
>  
>  static inline int __of_pci_pci_compare(struct device_node *node,
> @@ -40,3 +41,84 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
>  	return NULL;
>  }
>  EXPORT_SYMBOL_GPL(of_pci_find_child_device);
> +
> +struct resource *of_pci_parse_ranges(struct device_node *node,
> +				     unsigned int *countp)
> +{
> +	unsigned int count, i = 0;
> +	struct resource *ranges;

I think res or pci_res would make it clearer than ranges does that this
is a struct resource.

> +	const __be32 *values;
> +	int len, pna, np;
> +
> +	values = of_get_property(node, "ranges", &len);
> +	if (!values)
> +		return NULL;
> +
> +	pna = of_n_addr_cells(node);
> +	np = pna + 5;
> +
> +	count = len / (np * sizeof(*values));
> +
> +	ranges = kzalloc(sizeof(*ranges) * count, GFP_KERNEL);
> +	if (!ranges)
> +		return NULL;
> +
> +	pr_debug("PCI ranges:\n");
> +
> +	while ((len -= np * sizeof(*values)) >= 0) {
> +		u64 addr = of_translate_address(node, values + 3);
> +		u64 size = of_read_number(values + 3 + pna, 2);
> +		u32 type = be32_to_cpup(values);
> +		const char *suffix = "";
> +		struct resource range;

Same here.

Rob

> +
> +		memset(&range, 0, sizeof(range));
> +		range.start = addr;
> +		range.end = addr + size - 1;
> +
> +		switch ((type >> 24) & 0x3) {
> +		case 0:
> +			range.flags = IORESOURCE_MEM | IORESOURCE_PCI_CS;
> +			pr_debug("  CS  %#x-%#x\n", range.start, range.end);
> +			break;
> +
> +		case 1:
> +			range.flags = IORESOURCE_IO;
> +			pr_debug("  IO  %#x-%#x\n", range.start, range.end);
> +			break;
> +
> +		case 2:
> +			range.flags = IORESOURCE_MEM;
> +
> +			if (type & 0x40000000) {
> +				range.flags |= IORESOURCE_PREFETCH;
> +				suffix = "prefetch";
> +			}
> +
> +			pr_debug("  MEM %#x-%#x %s\n", range.start, range.end,
> +				 suffix);
> +			break;
> +
> +		case 3:
> +			range.flags = IORESOURCE_MEM | IORESOURCE_MEM_64;
> +
> +			if (type & 0x40000000) {
> +				range.flags |= IORESOURCE_PREFETCH;
> +				suffix = "prefetch";
> +			}
> +
> +			pr_debug("  MEM %#x-%#x 64-bit %s\n", range.start,
> +				 range.end, suffix);
> +			break;
> +		}
> +
> +		memcpy(&ranges[i++], &range, sizeof(range));
> +		values += np;
> +	}
> +
> +	if (countp)
> +		*countp = count;
> +
> +	return ranges;
> +}
> +EXPORT_SYMBOL_GPL(of_pci_parse_ranges);
> diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
> index bb115de..c0db6ea 100644
> --- a/include/linux/of_pci.h
> +++ b/include/linux/of_pci.h
> @@ -10,5 +10,7 @@ int of_irq_map_pci(const struct pci_dev *pdev, struct of_irq *out_irq);
>  struct device_node;
>  struct device_node *of_pci_find_child_device(struct device_node *parent,
>  					     unsigned int devfn);
> +struct resource *of_pci_parse_ranges(struct device_node *node,
> +				     unsigned int *countp);
>  
>  #endif
> 



^ permalink raw reply	[flat|nested] 79+ messages in thread
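
For readers following the review, the switch statement in the quoted patch
decodes the first cell of each "ranges" entry. The following is a minimal
userspace sketch of that decoding, not the kernel code: the names
decode_range_type(), count_ranges(), enum pci_space and struct
pci_range_type are hypothetical, and only the bit positions (space code in
bits 24-25, prefetchable in bit 30) and the entry size (parent address
cells plus 5) are taken from the patch.

```c
#include <stdint.h>

/* Space codes as used by the switch statement in the quoted patch. */
enum pci_space {
	PCI_SPACE_CONFIG = 0,
	PCI_SPACE_IO     = 1,
	PCI_SPACE_MEM32  = 2,
	PCI_SPACE_MEM64  = 3,
};

/* Hypothetical result type, used only for this illustration. */
struct pci_range_type {
	enum pci_space space;
	int prefetchable;
};

/* Decode the first cell of a ranges entry (already in CPU byte order). */
static struct pci_range_type decode_range_type(uint32_t type)
{
	struct pci_range_type t;

	t.space = (enum pci_space)((type >> 24) & 0x3);
	/* As in the patch, bit 30 is only honoured for memory ranges. */
	t.prefetchable = (t.space >= PCI_SPACE_MEM32) &&
			 (type & 0x40000000u) != 0;
	return t;
}

/*
 * Number of complete entries in a ranges property of len_bytes bytes:
 * 3 child address cells + parent address cells + 2 size cells per entry.
 */
static unsigned int count_ranges(unsigned int len_bytes,
				 unsigned int parent_addr_cells)
{
	unsigned int cells = parent_addr_cells + 5;

	return len_bytes / (cells * (unsigned int)sizeof(uint32_t));
}
```

With one parent address cell, each entry is 6 cells (24 bytes), so a
48-byte property yields two entries. A first cell of 0x42000000 decodes as
32-bit memory space with the prefetchable bit set.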

* Re: [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially
  2012-07-26 19:55 ` [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially Thierry Reding
@ 2012-07-31 20:18   ` Rob Herring
  2012-08-15 20:06     ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Rob Herring @ 2012-07-31 20:18 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, Russell King, linux-pci, devicetree-discuss,
	Rob Herring, Colin Cross, Bjorn Helgaas, linux-arm-kernel

On 07/26/2012 02:55 PM, Thierry Reding wrote:
> When a bus specifies #address-cells > 2, of_bus_default_map() now
> assumes that the mapping isn't for a physical address but rather an
> identifier that needs to match exactly.
> 
> This is required by bindings that use multiple cells to translate a
> resource to the parent bus (device index, type, ...).
> 
> See here for the discussion:
> 
> 	https://lists.ozlabs.org/pipermail/devicetree-discuss/2012-June/016577.html
> 
> Originally-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>

Acked-by: Rob Herring <rob.herring@calxeda.com>

> ---
> Changes in v3:
> - new patch
> 
>  drivers/of/address.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 7e262a6..2776119 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -69,6 +69,14 @@ static u64 of_bus_default_map(u32 *addr, const __be32 *range,
>  		 (unsigned long long)cp, (unsigned long long)s,
>  		 (unsigned long long)da);
>  
> +	/*
> +	 * If the number of address cells is larger than 2 we assume the
> +	 * mapping doesn't specify a physical address. Rather, the address
> +	 * specifies an identifier that must match exactly.
> +	 */
> +	if (na > 2 && memcmp(range, addr, na * 4) != 0)
> +		return OF_BAD_ADDR;
> +
>  	if (da < cp || da >= (cp + s))
>  		return OF_BAD_ADDR;
>  	return da - cp;
> 



^ permalink raw reply	[flat|nested] 79+ messages in thread
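
The effect of the added check can be tried outside the kernel. The
following is a rough userspace model of of_bus_default_map() with the
patch applied, written under stated assumptions: cells are plain
host-endian uint32_t values rather than __be32, BAD_ADDR stands in for
OF_BAD_ADDR, and read_cells() and default_map() are hypothetical names
for this illustration only.

```c
#include <stdint.h>
#include <string.h>

#define BAD_ADDR UINT64_MAX	/* stands in for OF_BAD_ADDR */

/* Fold n 32-bit cells into one number; cells beyond two overflow out. */
static uint64_t read_cells(const uint32_t *cells, int n)
{
	uint64_t v = 0;

	while (n-- > 0)
		v = (v << 32) | *cells++;
	return v;
}

/*
 * Model of the patched mapping function. A range entry is laid out as
 * na child address cells, pna parent address cells, ns size cells.
 */
static uint64_t default_map(const uint32_t *addr, const uint32_t *range,
			    int na, int ns, int pna)
{
	uint64_t cp = read_cells(range, na);
	uint64_t s = read_cells(range + na + pna, ns);
	uint64_t da = read_cells(addr, na);

	/* New in the patch: with more than 2 cells, require an exact match. */
	if (na > 2 && memcmp(range, addr, (size_t)na * 4) != 0)
		return BAD_ADDR;

	if (da < cp || da >= cp + s)
		return BAD_ADDR;
	return da - cp;
}
```

With na = 3, an address that differs from the range's child address in
any cell is rejected even when its low cells fall inside the window,
whereas with na = 2 the usual window comparison still applies.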

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-07-31 16:18 ` [PATCH v3 00/10] ARM: tegra: Add PCIe " Stephen Warren
@ 2012-08-01  6:35   ` Thierry Reding
  2012-08-01 17:02     ` Stephen Warren
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-01  6:35 UTC (permalink / raw)
  To: Stephen Warren
  Cc: linux-tegra, Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 2205 bytes --]

On Tue, Jul 31, 2012 at 10:18:15AM -0600, Stephen Warren wrote:
> On 07/26/2012 01:55 PM, Thierry Reding wrote:
> > This patch series adds support for device tree based probing of the PCIe
> > controller found on Tegra SoCs.
> 
> Thierry,
> 
> I think one thing that would help here would be to split up this series
> into one per subsystem, and to get all the dependencies merged by the
> respective maintainers. Preferably, each subsystem would export a stable
> branch (perhaps consisting of just these patches) that I can then merge
> into the Tegra tree and use as a basis for the PCIe driver itself.
> Hopefully this approach will get more traction on all the non-Tegra
> changes. Does that sound like a good plan?

I don't understand. The series is already split up into per-subsystem
patches. I just didn't want to post them separately so everybody on Cc
would be able to see the big picture and the reason why the patch was
required.

> Or, I can just take it all through the Tegra tree, but I'll definitely
> need acks on all/most of the non-Tegra patches, since although the
> patches themselves are quite small, I'd like to be sure that maintainers
> are OK with the conceptual changes, like initializing PCI controllers
> later than at present etc.

Taking the patches through the individual trees should be fine. So far,
though, the other subsystem maintainers haven't commented on this.

As to when the PCI controller is initialized, there really aren't many
options. On Harmony (and TEC, which in this respect, and others, is very
similar) the VDD is supplied by a GPIO of the PMU, so it is currently
necessary to probe it at least after the PMU. And unless we want to
work around these issues using different initcall ordering we need to
rely on deferred probing, which turns out to be done after the init
phase often (always?).

Also I had been toying with making the Tegra PCIe driver work as a
module, but that doesn't work currently with MSI support. However it
might be nice to allow MSI controllers to register at runtime generally,
which wouldn't be all that difficult to do I think. Of course somebody
more knowledgeable may disagree.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 09/10] of: Add of_pci_parse_ranges()
  2012-07-31 20:07   ` Rob Herring
@ 2012-08-01  6:54     ` Thierry Reding
  2012-08-01 16:07       ` Stephen Warren
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-01  6:54 UTC (permalink / raw)
  To: Rob Herring
  Cc: linux-tegra, Russell King, linux-pci, devicetree-discuss,
	Rob Herring, Colin Cross, Bjorn Helgaas, linux-arm-kernel,
	Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 2461 bytes --]

On Tue, Jul 31, 2012 at 03:07:31PM -0500, Rob Herring wrote:
> On 07/26/2012 02:55 PM, Thierry Reding wrote:
> > This new function parses the ranges property of PCI device nodes into
> > an array of struct resource elements. It is useful in multiple-port PCI
> > host controller drivers to collect information about the ranges that it
> > needs to forward to the respective ports.
> 
> It seems to me that some of the DT PCI code in arch/powerpc/kernel/pci*
> like pci_process_bridge_OF_ranges() should apply for ARM as well.
> 
> Each arch defining their own pci controller structs complicates this,
> but I would think at least the DT parsing can be common.

Yes, there's quite a lot of room for refactoring. When I first started
work on this there had been some discussion about whether it would make
sense to move PCI controller drivers into a common location to make it
easier to refactor but the consensus at the time was that this should
not be done.

I still think this might be worthwhile, but I have other things that I
need to finish first.

> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> > ---
> > Changes in v3:
> > - new patch
> > 
> >  drivers/of/of_pci.c    | 84 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  include/linux/of_pci.h |  2 ++
> >  2 files changed, 85 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> > index 13e37e2..bcff301 100644
> > --- a/drivers/of/of_pci.c
> > +++ b/drivers/of/of_pci.c
> > @@ -1,7 +1,8 @@
> >  #include <linux/kernel.h>
> >  #include <linux/export.h>
> > -#include <linux/of.h>
> > +#include <linux/of_address.h>
> >  #include <linux/of_pci.h>
> > +#include <linux/slab.h>
> >  #include <asm/prom.h>
> >  
> >  static inline int __of_pci_pci_compare(struct device_node *node,
> > @@ -40,3 +41,84 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
> >  	return NULL;
> >  }
> >  EXPORT_SYMBOL_GPL(of_pci_find_child_device);
> > +
> > +struct resource *of_pci_parse_ranges(struct device_node *node,
> > +				     unsigned int *countp)
> > +{
> > +	unsigned int count, i = 0;
> > +	struct resource *ranges;
> 
> > I think res or pci_res would make it clearer than ranges does that this
> > is a struct resource.

"range" is the term used by the PCI specifications to denote these
regions. I don't see what's wrong with using it as a variable name.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 09/10] of: Add of_pci_parse_ranges()
  2012-08-01  6:54     ` Thierry Reding
@ 2012-08-01 16:07       ` Stephen Warren
  0 siblings, 0 replies; 79+ messages in thread
From: Stephen Warren @ 2012-08-01 16:07 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, linux-tegra, Russell King, linux-pci,
	devicetree-discuss, Rob Herring, Colin Cross, Bjorn Helgaas,
	linux-arm-kernel, Arnd Bergmann

On 08/01/2012 12:54 AM, Thierry Reding wrote:
> On Tue, Jul 31, 2012 at 03:07:31PM -0500, Rob Herring wrote:
>> On 07/26/2012 02:55 PM, Thierry Reding wrote:
>>> This new function parses the ranges property of PCI device
>>> nodes into an array of struct resource elements. It is useful
>>> in multiple-port PCI host controller drivers to collect
>>> information about the ranges that it needs to forward to the
>>> respective ports.
>> 
>> It seems to me that some of the DT PCI code in
>> arch/powerpc/kernel/pci* like pci_process_bridge_OF_ranges()
>> should apply for ARM as well.
>> 
>> Each arch defining their own pci controller structs complicates
>> this, but I would think at least the DT parsing can be common.
> 
> Yes, there's quite a lot of room for refactoring. When I first
> started work on this there had been some discussion about whether
> it would make sense to move PCI controller drivers into a common
> location to make it easier to refactor but the consensus at the
> time was that this should not be done.

In the long term, if we end up with a separate arch/aarch64/, I think
we'll have to move the PCIe (and any other) drivers to a common
location. Still, I imagine there is plenty of time until that's mandatory.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-01  6:35   ` Thierry Reding
@ 2012-08-01 17:02     ` Stephen Warren
  2012-08-02  6:15       ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-08-01 17:02 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/01/2012 12:35 AM, Thierry Reding wrote:
> On Tue, Jul 31, 2012 at 10:18:15AM -0600, Stephen Warren wrote:
>> On 07/26/2012 01:55 PM, Thierry Reding wrote:
>>> This patch series adds support for device tree based probing of
>>> the PCIe controller found on Tegra SoCs.
>> 
>> Thierry,
>> 
>> I think one thing that would help here would be to split up this
>> series into one per subsystem, and to get all the dependencies
>> merged by the respective maintainers. Preferably, each subsystem
>> would export a stable branch (perhaps consisting of just these
>> patches) that I can then merge into the Tegra tree and use as a
>> basis for the PCIe driver itself. Hopefully this approach will
>> get more traction on all the non-Tegra changes. Does that sound
>> like a good plan?
> 
> I don't understand. The series is already split up into
> per-subsystem patches. I just didn't want to post them separately
> so everybody on Cc would be able to see the big picture and the
> reason why the patch was required.

There are separate patches that touch the different subsystems, but
they're all scattered throughout the one series, rather than having e.g.:

* a series for the PCI tree
* a series for the ARM core tree
* a series for the OF/DT tree
* a series for the Tegra tree, which indicates it depends on all of those.

I'm wondering if the reason you haven't seen much discussion/Acks from
the maintainers of the non-Tegra trees is because this is a big scary
series that touches a lot of stuff, and it's not necessarily clear for
people unfamiliar with it why they're being CCd on it.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-01 17:02     ` Stephen Warren
@ 2012-08-02  6:15       ` Thierry Reding
  0 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-02  6:15 UTC (permalink / raw)
  To: Stephen Warren
  Cc: linux-tegra, Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 2096 bytes --]

On Wed, Aug 01, 2012 at 11:02:18AM -0600, Stephen Warren wrote:
> On 08/01/2012 12:35 AM, Thierry Reding wrote:
> > On Tue, Jul 31, 2012 at 10:18:15AM -0600, Stephen Warren wrote:
> >> On 07/26/2012 01:55 PM, Thierry Reding wrote:
> >>> This patch series adds support for device tree based probing of
> >>> the PCIe controller found on Tegra SoCs.
> >> 
> >> Thierry,
> >> 
> >> I think one thing that would help here would be to split up this
> >> series into one per subsystem, and to get all the dependencies
> >> merged by the respective maintainers. Preferably, each subsystem
> >> would export a stable branch (perhaps consisting of just these
> >> patches) that I can then merge into the Tegra tree and use as a
> >> basis for the PCIe driver itself. Hopefully this approach will
> >> get more traction on all the non-Tegra changes. Does that sound
> >> like a good plan?
> > 
> > I don't understand. The series is already split up into
> > per-subsystem patches. I just didn't want to post them separately
> > so everybody on Cc would be able to see the big picture and the
> > reason why the patch was required.
> 
> There are separate patches that touch the different subsystems, but
> they're all scattered throughout the one series, rather than having e.g.:
> 
> * a series for the PCI tree
> * a series for the ARM core tree
> * a series for the OF/DT tree
> * a series for the Tegra tree, which indicates it depends on all of those.
> 
> I'm wondering if the reason you haven't seen much discussion/Acks from
> the maintainers of the non-Tegra trees is because this is a big scary
> series that touches a lot of stuff, and it's not necessarily clear for
> people unfamiliar with it why they're being CCd on it.

I see. If you think it'll help I can certainly split them up as you
suggested. I just thought it might be useful for everybody to know the
context. At least from my own experience I get annoyed when I'm Cc'ed
on a single patch in a larger series because it means I need to look
through other archives for the context.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
                   ` (10 preceding siblings ...)
  2012-07-31 16:18 ` [PATCH v3 00/10] ARM: tegra: Add PCIe " Stephen Warren
@ 2012-08-06 19:42 ` Stephen Warren
  2012-08-07 18:20   ` Thierry Reding
  2012-08-13 17:40   ` Thierry Reding
  11 siblings, 2 replies; 79+ messages in thread
From: Stephen Warren @ 2012-08-06 19:42 UTC (permalink / raw)
  To: Thierry Reding, Russell King
  Cc: linux-tegra, Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann

On 07/26/2012 01:55 PM, Thierry Reding wrote:
> This patch series adds support for device tree based probing of the PCIe
> controller found on Tegra SoCs.

Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed that
PCIe doesn't work on TrimSlice when booting using device tree. I think I
found the cause, and I can't see why the same problem doesn't affect
this series. Perhaps you can enlighten me?

When booting TrimSlice (or Harmony) using board files, Tegra's PCIe is
initialized using a subsys_initcall to tegra_pcie_init() directly (or
for Harmony to harmony_pcie_init() which then calls tegra_pcie_init()).

The final thing tegra_pcie_init() does is call pci_common_init(). This
calls pcibios_init_hw() which calls hw->scan() which calls
pci_scan_root_bus() which adds a device object for each device on the
PCIe bus. However, since this happens very early in the boot sequence, I
believe the enumerated PCIe devices don't immediately get probed.
Instead, control gets returned to pci_common_init() which I believe then
calls pci_bus_assign_resources() which actually sets up the resources
for those devices. Later, the PCIe devices actually get probed, and
everything works.

However, when booting using device tree, with the code currently in
v3.6-rc1, tegra_pcie_init() is called late in the boot sequence, and so
in the sequence described above, as soon as pci_scan_root_bus() adds a
device, it gets probed, before the device object's resources have been
set up, which results in the following failure:

PCI: Device 0000:01:00.0 not available because of resource collisions

... because of the following code in pcibios_enable_device():

> 	for (idx = 0; idx < 6; idx++) {
> 		/* Only set up the requested stuff */
> 		if (!(mask & (1 << idx)))
> 			continue;
> 
> 		r = dev->resource + idx;
> 		if (!r->start && r->end) {
> 			printk(KERN_ERR "PCI: Device %s not available because"
> 			       " of resource collisions\n", pci_name(dev));

Doesn't this same problem exist when instantiating the PCIe device
itself from device tree as in your patch series? If not, can you explain
why?

Now, the obvious solution in v3.6 would be to simply have
tegra_pcie_init() be called at the same early stage in the boot process
when booting using device tree as it is when booting using board files.
This works for TrimSlice.

However, on Harmony, it doesn't work, because PCIe on Harmony depends on
regulators, and the regulators are accessed using an I2C bus that is
instantiated from DT, and the instantiation of the I2C bus happens
fairly late in the boot process so can't be found early during the boot
sequence. See harmony_regulator_init() for the failing code.

Does anyone have any good ideas (small, self-contained patches) for
solving this in v3.6 in such a way that PCIe works on both TrimSlice and
Harmony?

Thanks.

^ permalink raw reply	[flat|nested] 79+ messages in thread
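
The test that produces the "resource collisions" message can be
reproduced in isolation. The following is a minimal userspace sketch
under one assumption drawn from the description above: a BAR that has
been sized but not yet assigned has a zero start and a non-zero end.
struct res, looks_unassigned() and first_unassigned_bar() are
hypothetical names, not kernel APIs.

```c
#include <stdint.h>

/* Cut-down stand-in for struct resource; only start and end matter here. */
struct res {
	uint64_t start;
	uint64_t end;
};

/* The condition quoted from pcibios_enable_device(). */
static int looks_unassigned(const struct res *r)
{
	return !r->start && r->end;
}

/*
 * Walk the six standard BARs as the quoted loop does and return the
 * index of the first requested BAR that fails the check, or -1 if all
 * requested BARs pass.
 */
static int first_unassigned_bar(const struct res *bars, unsigned int mask)
{
	int idx;

	for (idx = 0; idx < 6; idx++) {
		/* Only look at the requested BARs */
		if (!(mask & (1u << idx)))
			continue;
		if (looks_unassigned(&bars[idx]))
			return idx;
	}
	return -1;
}
```

A BAR described as { 0, 0xfff } fails the check, while the same BAR after
assignment, for example { 0xa0000000, 0xa0000fff }, passes. An unused BAR
with both fields zero also passes, which is why the message only appears
when a driver probes before resources have been assigned.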

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-06 19:42 ` Stephen Warren
@ 2012-08-07 18:20   ` Thierry Reding
  2012-08-13 17:40   ` Thierry Reding
  1 sibling, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-07 18:20 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Russell King, linux-tegra, Bjorn Helgaas, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 3739 bytes --]

On Mon, Aug 06, 2012 at 01:42:21PM -0600, Stephen Warren wrote:
> On 07/26/2012 01:55 PM, Thierry Reding wrote:
> > This patch series adds support for device tree based probing of the PCIe
> > controller found on Tegra SoCs.
> 
> Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed that
> PCIe doesn't work on TrimSlice when booting using device tree. I think I
> found the cause, and I can't see why the same problem doesn't affect
> this series. Perhaps you can enlighten me?
> 
> When booting TrimSlice (or Harmony) using board files, Tegra's PCIe is
> initialized using a subsys_initcall to tegra_pcie_init() directly (or
> for Harmony to harmony_pcie_init() which then calls tegra_pcie_init()).
> 
> The final thing tegra_pcie_init() does is call pci_common_init(). This
> calls pcibios_init_hw() which calls hw->scan() which calls
> pci_scan_root_bus() which adds a device object for each device on the
> PCIe bus. However, since this happens very early in the boot sequence, I
> believe the enumerated PCIe devices don't immediately get probed.
> Instead, control gets returned to pci_common_init() which I believe then
> calls pci_bus_assign_resources() which actually sets up the resources
> for those devices. Later, the PCIe devices actually get probed, and
> everything works.
> 
> However, when booting using device tree, with the code currently in
> v3.6-rc1, tegra_pcie_init() is called late in the boot sequence, and so
> in the sequence described above, as soon as pci_scan_root_bus() adds a
> device, it gets probed, before the device object's resources have been
> set up, which results in the following failure:
> 
> PCI: Device 0000:01:00.0 not available because of resource collisions
> 
> ... because of the following code in pcibios_enable_device():
> 
> > 	for (idx = 0; idx < 6; idx++) {
> > 		/* Only set up the requested stuff */
> > 		if (!(mask & (1 << idx)))
> > 			continue;
> > 
> > 		r = dev->resource + idx;
> > 		if (!r->start && r->end) {
> > 			printk(KERN_ERR "PCI: Device %s not available because"
> > 			       " of resource collisions\n", pci_name(dev));
> 
> Doesn't this same problem exist when instantiating the PCIe device
> itself from device tree as in your patch series? If not, can you explain
> why?

I think I've seen this before as well but hadn't had the time to
investigate further. The devices that failed to initialize were serial
ports I think. Perhaps the ordering here is the key. All the drivers
that I use for the PCI devices are loaded as modules, so they'll probe
the devices much later anyway. The serial driver is an exception here as
it is builtin.

> Now, the obvious solution in v3.6 would be to simply have
> tegra_pcie_init() be called at the same early stage in the boot process
> when booting using device tree as it is when booting using board files.
> This works for TrimSlice.
> 
> However, on Harmony, it doesn't work, because PCIe on Harmony depends on
> regulators, and the regulators are accessed using an I2C bus that is
> instantiated from DT, and the instantiation of the I2C bus happens
> fairly late in the boot process so can't be found early during the boot
> sequence. See harmony_regulator_init() for the failing code.
> 
> Does anyone have any good ideas (small, self-contained patches) for
> solving this in v3.6 in such a way that PCIe works on both TrimSlice and
> Harmony?

To me this looks like a problem in the probing code. Drivers' .probe()
shouldn't be called on devices that haven't had their resources assigned
yet. Perhaps nobody has ever seen this before because the ordering
always ensured that the PCI controller driver was loaded first.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-06 19:42 ` Stephen Warren
  2012-08-07 18:20   ` Thierry Reding
@ 2012-08-13 17:40   ` Thierry Reding
  2012-08-13 18:47     ` Stephen Warren
  1 sibling, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-13 17:40 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Russell King, linux-tegra, Bjorn Helgaas, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann


[-- Attachment #1.1: Type: text/plain, Size: 3515 bytes --]

On Mon, Aug 06, 2012 at 01:42:21PM -0600, Stephen Warren wrote:
> On 07/26/2012 01:55 PM, Thierry Reding wrote:
> > This patch series adds support for device tree based probing of the PCIe
> > controller found on Tegra SoCs.
> 
> Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed that
> PCIe doesn't work on TrimSlice when booting using device tree. I think I
> found the cause, and I can't see why the same problem doesn't affect
> this series. Perhaps you can enlighten me?
> 
> When booting TrimSlice (or Harmony) using board files, Tegra's PCIe is
> initialized using a subsys_initcall to tegra_pcie_init() directly (or
> for Harmony to harmony_pcie_init() which then calls tegra_pcie_init()).
> 
> The final thing tegra_pcie_init() does is call pci_common_init(). This
> calls pcibios_init_hw() which calls hw->scan() which calls
> pci_scan_root_bus() which adds a device object for each device on the
> PCIe bus. However, since this happens very early in the boot sequence, I
> believe the enumerated PCIe devices don't immediately get probed.
> Instead, control gets returned to pci_common_init() which I believe then
> calls pci_bus_assign_resources() which actually sets up the resources
> for those devices. Later, the PCIe devices actually get probed, and
> everything works.
> 
> However, when booting using device tree, with the code currently in
> v3.6-rc1, tegra_pcie_init() is called late in the boot sequence, and so
> in the sequence described above, as soon as pci_scan_root_bus() adds a
> device, it gets probed, before the device object's resources have been
> set up, which results in the following failure:
> 
> PCI: Device 0000:01:00.0 not available because of resource collisions
> 
> ... because of the following code in pcibios_enable_device():
> 
> > 	for (idx = 0; idx < 6; idx++) {
> > 		/* Only set up the requested stuff */
> > 		if (!(mask & (1 << idx)))
> > 			continue;
> > 
> > 		r = dev->resource + idx;
> > 		if (!r->start && r->end) {
> > 			printk(KERN_ERR "PCI: Device %s not available because"
> > 			       " of resource collisions\n", pci_name(dev));
> 
> Doesn't this same problem exist when instantiating the PCIe device
> itself from device tree as in your patch series? If not, can you explain
> why?
> 
> Now, the obvious solution in v3.6 would be to simply have
> tegra_pcie_init() be called at the same early stage in the boot process
> when booting using device tree as it is when booting using board files.
> This works for TrimSlice.
> 
> However, on Harmony, it doesn't work, because PCIe on Harmony depends on
> regulators, and the regulators are accessed using an I2C bus that is
> instantiated from DT, and the instantiation of the I2C bus happens
> fairly late in the boot process so can't be found early during the boot
> sequence. See harmony_regulator_init() for the failing code.
> 
> Does anyone have any good ideas (small, self-contained patches) for
> solving this in v3.6 in such a way that PCIe works on both TrimSlice and
> Harmony?
> 
> Thanks.

I've looked into this a bit, and it seems like ARM is using an open-
coded version of the pci_enable_resources() function here, with the only
difference being the unconditional enabling of both I/O and memory-
mapped access for bridges. On Tegra there is already a PCI fixup to do
this, so pci_enable_resources() can be used as-is. I came up with the
attached patch but haven't been able to test it yet.

Thierry

[-- Attachment #1.2: 0001-ARM-PCI-refactor-pcibios_enable_device.patch --]
[-- Type: text/plain, Size: 2276 bytes --]

From ebd69ae0a3d076e574da74d963cb3834b71dc6ad Mon Sep 17 00:00:00 2001
From: Thierry Reding <thierry.reding@avionic-design.de>
Date: Mon, 13 Aug 2012 18:49:28 +0200
Subject: [PATCH] ARM: PCI: refactor pcibios_enable_device()

The implementation is an open-coded version of pci_enable_resources()
with a special case to enable I/O and memory-mapped functionality on
bridges. This commit reuses the existing PCI core implementation of the
pci_enable_resources() function. This also means that bridges no longer
enable I/O and memory-mapped functionality unconditionally. Platforms
where this is really required can add a corresponding fixup.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
---
 arch/arm/kernel/bios32.c | 36 +-----------------------------------
 1 file changed, 1 insertion(+), 35 deletions(-)

diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index 13fd97b..dfe25f7 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -601,41 +601,7 @@ resource_size_t pcibios_align_resource(void *data, const struct resource *res,
  */
 int pcibios_enable_device(struct pci_dev *dev, int mask)
 {
-	u16 cmd, old_cmd;
-	int idx;
-	struct resource *r;
-
-	pci_read_config_word(dev, PCI_COMMAND, &cmd);
-	old_cmd = cmd;
-	for (idx = 0; idx < 6; idx++) {
-		/* Only set up the requested stuff */
-		if (!(mask & (1 << idx)))
-			continue;
-
-		r = dev->resource + idx;
-		if (!r->start && r->end) {
-			printk(KERN_ERR "PCI: Device %s not available because"
-			       " of resource collisions\n", pci_name(dev));
-			return -EINVAL;
-		}
-		if (r->flags & IORESOURCE_IO)
-			cmd |= PCI_COMMAND_IO;
-		if (r->flags & IORESOURCE_MEM)
-			cmd |= PCI_COMMAND_MEMORY;
-	}
-
-	/*
-	 * Bridges (eg, cardbus bridges) need to be fully enabled
-	 */
-	if ((dev->class >> 16) == PCI_BASE_CLASS_BRIDGE)
-		cmd |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
-
-	if (cmd != old_cmd) {
-		printk("PCI: enabling device %s (%04x -> %04x)\n",
-		       pci_name(dev), old_cmd, cmd);
-		pci_write_config_word(dev, PCI_COMMAND, cmd);
-	}
-	return 0;
+	return pci_enable_resources(dev, mask);
 }
 
 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
-- 
1.7.11.4


[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-13 17:40   ` Thierry Reding
@ 2012-08-13 18:47     ` Stephen Warren
  2012-08-13 20:33       ` Thierry Reding
  2012-08-13 23:18       ` Bjorn Helgaas
  0 siblings, 2 replies; 79+ messages in thread
From: Stephen Warren @ 2012-08-13 18:47 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Russell King, linux-tegra, Bjorn Helgaas, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/13/2012 11:40 AM, Thierry Reding wrote:
> On Mon, Aug 06, 2012 at 01:42:21PM -0600, Stephen Warren wrote:
>> On 07/26/2012 01:55 PM, Thierry Reding wrote:
>>> This patch series adds support for device tree based probing of
>>> the PCIe controller found on Tegra SoCs.
>> 
>> Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed
>> that PCIe doesn't work on TrimSlice when booting using device tree.
>> I think I found the cause, and I can't see why the same problem
>> doesn't affect this series. Perhaps you can enlighten me?
...
>> PCI: Device 0000:01:00.0 not available because of resource
>> collisions
...
> I've looked into this a bit, and it seems like ARM is using an
> open-coded version of the pci_enable_resources() function here,
> with the only difference being the unconditional enabling of both
> I/O and memory-mapped access for bridges. On Tegra there is
> already a PCI fixup to do this, so pci_enable_resources() can be
> used as-is. I came up with the attached patch but haven't been able
> to test it yet.

Thanks very much for looking into this.

The patch did alter the behavior a little for TrimSlice, but didn't
solve the problem. The old error messages:

> [    2.173971] PCI: Device 0000:01:00.0 not available because of resource collisions
> [    2.181453] r8169 0000:01:00.0: (unregistered net_device): enable failure
> [    2.188254] r8169: probe of 0000:01:00.0 failed with error -22

were replaced with the following after applying your patch:

> [    2.174010] r8169 0000:01:00.0: device not available (can't reserve [io  0x0000-0x00ff])
> [    2.182098] r8169 0000:01:00.0: (unregistered net_device): enable failure
> [    2.188900] r8169: probe of 0000:01:00.0 failed with error -22

This message comes from pci_enable_resources() in
drivers/pci/setup-res.c, due to this check:

> 		if (!r->parent) {
> 			dev_err(&dev->dev, "device not available "
> 				"(can't reserve %pR)\n", r);
> 			return -EINVAL;
> 		}

That check doesn't appear in ARM's custom pcibios_enable_device().
Disabling that check yields:

> [    2.174192] r8169 0000:01:00.0: enabling device (0140 -> 0143)
> [    2.180041] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> [    2.188386] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> [    2.196140] r8169: probe of 0000:01:00.0 failed with error -16

I think that's because the pci_dev's resources are initially assigned
PCI-aperture-relative addresses, and then these are later patched up to
take account of where the aperture is mapped into the CPU's address space.
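
As a rough user-space illustration of why the patch only changed the
message, here is a sketch of the two checks side by side. This is not
kernel code: struct res_model is a made-up stand-in for struct resource
that models only the fields the two checks look at. Under that
assumption, both checks reject a BAR that still reads
[io 0x0000-0x00ff] and has not been inserted into a resource tree:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct res_model {
	unsigned long start;
	unsigned long end;
	const struct res_model *parent;	/* NULL until placed in a resource tree */
};

/*
 * The test from ARM's old pcibios_enable_device(): a zero start with a
 * non-zero end is reported as a "resource collision".
 */
static bool arm_check_fails(const struct res_model *r)
{
	return !r->start && r->end;
}

/*
 * The test from pci_enable_resources(): a resource without a parent has
 * not been claimed or assigned, so the device is "not available".
 */
static bool core_check_fails(const struct res_model *r)
{
	return !r->parent;
}
```

With an unassigned BAR (start 0, end 0xff, no parent) both functions
return true; once the BAR has been assigned, e.g. [io 0x1000-0x10ff]
inside some parent window, both return false.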

Boot log using board files:

> [    1.146145] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> [    1.151745] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> [    1.159007] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> [    1.166270] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
...
> [    1.217829] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> [    1.225264] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> [    1.233236] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> [    1.241206] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
... (I added some extra printks:)
> [    1.488007] r8169 0000:01:00.0: BAR 0: requesting [io  0x1000-0x10ff]
> [    1.501483] r8169 0000:01:00.0: BAR 2: requesting [mem 0xa0024000-0xa0024fff 64bit pref]
> [    1.516611] r8169 0000:01:00.0: BAR 4: requesting [mem 0xa0020000-0xa0023fff 64bit pref]

whereas for a device tree boot:

(same):
> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
... (request region happens early)
> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
... (same, just happens too late)
> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]

I suspect this is still related to when the PCI devices themselves are
probed. When the PCI controller is probed late in the boot sequence, the
PCI devices get probed much earlier in the overall PCI initialization
sequence; when the controller is probed very early, PCI device probe is
deferred until the overall PCI initialization sequence is complete.

Does anyone know where/what that "probe now" vs. "probe later" decision
point is? I'll try and track it down if nobody beats me to it.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-13 18:47     ` Stephen Warren
@ 2012-08-13 20:33       ` Thierry Reding
  2012-08-13 21:38         ` Rob Herring
  2012-08-13 23:18       ` Bjorn Helgaas
  1 sibling, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-13 20:33 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Russell King, linux-tegra, Bjorn Helgaas, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann


[-- Attachment #1.1: Type: text/plain, Size: 5896 bytes --]

On Mon, Aug 13, 2012 at 12:47:38PM -0600, Stephen Warren wrote:
> On 08/13/2012 11:40 AM, Thierry Reding wrote:
> > On Mon, Aug 06, 2012 at 01:42:21PM -0600, Stephen Warren wrote:
> >> On 07/26/2012 01:55 PM, Thierry Reding wrote:
> >>> This patch series adds support for device tree based probing of
> >>> the PCIe controller found on Tegra SoCs.
> >> 
> >> Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed
> >> that PCIe doesn't work on TrimSlice when booting using device tree.
> >> I think I found the cause, and I can't see why the same problem
> >> doesn't affect this series. Perhaps you can enlighten me?
> ...
> >> PCI: Device 0000:01:00.0 not available because of resource
> >> collisions
> ...
> > I've looked into this a bit, and it seems like ARM is using an
> > open-coded version of the pci_enable_resources() function here,
> > with the only difference being the unconditional enabling of both
> > I/O and memory-mapped access for bridges. On Tegra there is
> > already a PCI fixup to do this, so pci_enable_resources() can be
> > used as-is. I came up with the attached patch but haven't been able
> > to test it yet.
> 
> Thanks very much for looking into this.
> 
> The patch did alter the behavior a little for TrimSlice, but didn't
> solve the problem. The old error messages:
> 
> > [    2.173971] PCI: Device 0000:01:00.0 not available because of resource collisions
> > [    2.181453] r8169 0000:01:00.0: (unregistered net_device): enable failure
> > [    2.188254] r8169: probe of 0000:01:00.0 failed with error -22
> 
> Were replaced with the following with your patch:
> 
> > [    2.174010] r8169 0000:01:00.0: device not available (can't reserve [io  0x0000-0x00ff])
> > [    2.182098] r8169 0000:01:00.0: (unregistered net_device): enable failure
> > [    2.188900] r8169: probe of 0000:01:00.0 failed with error -22
> 
> This message appears from drivers/pci/setup-res.c pci_enable_resources()
> due to:
> 
> > 		if (!r->parent) {
> > 			dev_err(&dev->dev, "device not available "
> > 				"(can't reserve %pR)\n", r);
> > 			return -EINVAL;
> > 		}

Looking at the code some more, this may be caused by the pci_remap_io()
patch series, so you might want to revert that patch and see if it fixes
the I/O resources.

> That check doesn't appear in ARM's custom pcibios_enable_device().
> Disabling that check yields:
> 
> > [    2.174192] r8169 0000:01:00.0: enabling device (0140 -> 0143)
> > [    2.180041] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> > [    2.188386] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> > [    2.196140] r8169: probe of 0000:01:00.0 failed with error -16
> 
> I think that's because the pci_dev's resources are initially assigned
> PCI-aperture-relative addresses, and then these are later patched up to
> take account of where the aperture is mapped into the CPU's address space.
> 
> Boot log using board files:
> 
> > [    1.146145] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> > [    1.151745] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> > [    1.159007] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> > [    1.166270] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> ...
> > [    1.217829] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> > [    1.225264] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> > [    1.233236] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> > [    1.241206] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> ... (I added some extra printks:)
> > [    1.488007] r8169 0000:01:00.0: BAR 0: requesting [io  0x1000-0x10ff]
> > [    1.501483] r8169 0000:01:00.0: BAR 2: requesting [mem 0xa0024000-0xa0024fff 64bit pref]
> > [    1.516611] r8169 0000:01:00.0: BAR 4: requesting [mem 0xa0020000-0xa0023fff 64bit pref]
> 
> whereas for a device tree boot:
> 
> (same):
> > [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> > [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> > [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> > [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> ... (request region happens early)
> > [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
> > [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
> > [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> > [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> ... (same, just happens too late)
> > [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> > [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> > [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> > [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> 
> I suspect this is all still related to the PCI devices themselves being
> probed much earlier in the overall PCI initialization sequence when the
> PCI controller is probed later in the boot sequence, whereas PCI device
> probe is deferred until the overall PCI initialization sequence is
> complete if the PCI controller is probed very early in the boot sequence.
> 
> Does anyone know where/what that "probe now" vs. "probe later" decision
> point is? I'll try and track it down if nobody beats me to it.

There's the io_offset and mem_offset fields that I've completely ignored
up to now. Can you try the patch below to see if it changes anything?
I'm sorry but I can't test any of this myself right now.
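
For reference, this is how I understand the offset argument of
pci_add_resource_offset(): it is the difference between a CPU address
and the corresponding bus address within a window. The helpers below are
hypothetical and only spell out that arithmetic in user space; they are
not kernel API.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not kernel functions: they mirror the arithmetic
 * implied by a host bridge window offset, where
 * bus address = CPU address - offset.
 */
static uint64_t cpu_to_bus(uint64_t cpu_addr, uint64_t offset)
{
	return cpu_addr - offset;
}

static uint64_t bus_to_cpu(uint64_t bus_addr, uint64_t offset)
{
	return bus_addr + offset;
}
```

If the offset equals the window's CPU start address, the bus addresses
inside the window start at 0. With an offset of 0, bus and CPU addresses
are identical.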

Thierry

[-- Attachment #1.2: pcie.patch --]
[-- Type: text/plain, Size: 1030 bytes --]

diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c
index 3463fb5..9b9b3e0 100644
--- a/arch/arm/mach-tegra/pcie.c
+++ b/arch/arm/mach-tegra/pcie.c
@@ -395,7 +395,7 @@ static int tegra_pcie_setup(int nr, struct pci_sys_data *sys)
 	pp->res[0].flags = IORESOURCE_MEM;
 	if (request_resource(&iomem_resource, &pp->res[0]))
 		panic("Request PCIe Memory resource failed\n");
-	pci_add_resource_offset(&sys->resources, &pp->res[0], sys->mem_offset);
+	pci_add_resource_offset(&sys->resources, &pp->res[0], pp->res[0].start);
 
 	/*
 	 * IORESOURCE_MEM | IORESOURCE_PREFETCH
@@ -414,7 +414,7 @@ static int tegra_pcie_setup(int nr, struct pci_sys_data *sys)
 	pp->res[1].flags = IORESOURCE_MEM | IORESOURCE_PREFETCH;
 	if (request_resource(&iomem_resource, &pp->res[1]))
 		panic("Request PCIe Prefetch Memory resource failed\n");
-	pci_add_resource_offset(&sys->resources, &pp->res[1], sys->mem_offset);
+	pci_add_resource_offset(&sys->resources, &pp->res[1], pp->res[1].start);
 
 	return 1;
 }

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-13 20:33       ` Thierry Reding
@ 2012-08-13 21:38         ` Rob Herring
  2012-08-14  6:14           ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Rob Herring @ 2012-08-13 21:38 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Stephen Warren, Russell King, linux-pci, devicetree-discuss,
	Rob Herring, Bjorn Helgaas, Colin Cross, linux-tegra,
	linux-arm-kernel

On 08/13/2012 03:33 PM, Thierry Reding wrote:
> On Mon, Aug 13, 2012 at 12:47:38PM -0600, Stephen Warren wrote:
>> On 08/13/2012 11:40 AM, Thierry Reding wrote:
>>> On Mon, Aug 06, 2012 at 01:42:21PM -0600, Stephen Warren wrote:
>>>> On 07/26/2012 01:55 PM, Thierry Reding wrote:
>>>>> This patch series adds support for device tree based probing of
>>>>> the PCIe controller found on Tegra SoCs.
>>>>
>>>> Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed
>>>> that PCIe doesn't work on TrimSlice when booting using device tree.
>>>> I think I found the cause, and I can't see why the same problem
>>>> doesn't affect this series. Perhaps you can enlighten me?
>> ...
>>>> PCI: Device 0000:01:00.0 not available because of resource
>>>> collisions
>> ...
>>> I've looked into this a bit, and it seems like ARM is using an
>>> open-coded version of the pci_enable_resources() function here,
>>> with the only difference being the unconditional enabling of both
>>> I/O and memory-mapped access for bridges. On Tegra there is
>>> already a PCI fixup to do this, so pci_enable_resources() can be
>>> used as-is. I came up with the attached patch but haven't been able
>>> to test it yet.
>>
>> Thanks very much for looking into this.
>>
>> The patch did alter the behavior a little for TrimSlice, but didn't
>> solve the problem. The old error messages:
>>
>>> [    2.173971] PCI: Device 0000:01:00.0 not available because of resource collisions
>>> [    2.181453] r8169 0000:01:00.0: (unregistered net_device): enable failure
>>> [    2.188254] r8169: probe of 0000:01:00.0 failed with error -22
>>
>> Were replaced with the following with your patch:
>>
>>> [    2.174010] r8169 0000:01:00.0: device not available (can't reserve [io  0x0000-0x00ff])
>>> [    2.182098] r8169 0000:01:00.0: (unregistered net_device): enable failure
>>> [    2.188900] r8169: probe of 0000:01:00.0 failed with error -22
>>
>> This message appears from drivers/pci/setup-res.c pci_enable_resources()
>> due to:
>>
>>> 		if (!r->parent) {
>>> 			dev_err(&dev->dev, "device not available "
>>> 				"(can't reserve %pR)\n", r);
>>> 			return -EINVAL;
>>> 		}
> 
> Looking at the code some more, this may be caused by the pci_remap_io()
> patch series, so you might want to revert that patch and see if it fixes
> the I/O resources.
> 

Hmm... but that patch deals with the I/O space, and the failure below is
in the memory space.

>> That check doesn't appear in ARM's custom pcibios_enable_device().
>> Disabling that check yields:
>>
>>> [    2.174192] r8169 0000:01:00.0: enabling device (0140 -> 0143)
>>> [    2.180041] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>>> [    2.188386] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>>> [    2.196140] r8169: probe of 0000:01:00.0 failed with error -16
>>
>> I think that's because the pci_dev's resources are initially assigned
>> PCI-aperture-relative addresses, and then these are later patched up to
>> take account of where the aperture is mapped into the CPU's address space.
>>
>> Boot log using board files:
>>
>>> [    1.146145] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>>> [    1.151745] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>>> [    1.159007] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>>> [    1.166270] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
>> ...
>>> [    1.217829] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>>> [    1.225264] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>>> [    1.233236] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>>> [    1.241206] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>> ... (I added some extra printks:)
>>> [    1.488007] r8169 0000:01:00.0: BAR 0: requesting [io  0x1000-0x10ff]
>>> [    1.501483] r8169 0000:01:00.0: BAR 2: requesting [mem 0xa0024000-0xa0024fff 64bit pref]
>>> [    1.516611] r8169 0000:01:00.0: BAR 4: requesting [mem 0xa0020000-0xa0023fff 64bit pref]
>>
>> whereas for a device tree boot:
>>
>> (same):
>>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
>> ... (request region happens early)
>>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
>>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
>>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>> ... (same, just happens too late)
>>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>>
>> I suspect this is all still related to the PCI devices themselves being
>> probed much earlier in the overall PCI initialization sequence when the
>> PCI controller is probed later in the boot sequence, whereas PCI device
>> probe is deferred until the overall PCI initialization sequence is
>> complete if the PCI controller is probed very early in the boot sequence.
>>
>> Does anyone know where/what that "probe now" vs. "probe later" decision
>> point is? I'll try and track it down if nobody beats me to it.
> 
> There's the io_offset and mem_offset fields that I've completely ignored
> up to now. Can you try the patch below to see if it changes anything?
> I'm sorry but I can't test any of this myself right now.

Arnd and I discussed io_offset a bit. Neither of us could figure out
when it should be anything but 0, at least if PCI I/O bus addresses
start at 0.

I don't think mem_offset is the issue. Perhaps you need to set
pcibios_min_mem to the memory window base (0xa0000000), but that's just
a guess.

Rob

> 
> Thierry


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-13 18:47     ` Stephen Warren
  2012-08-13 20:33       ` Thierry Reding
@ 2012-08-13 23:18       ` Bjorn Helgaas
  2012-08-14  6:29         ` Thierry Reding
  2012-08-14 19:39         ` Stephen Warren
  1 sibling, 2 replies; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-13 23:18 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Thierry Reding, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 08/13/2012 11:40 AM, Thierry Reding wrote:
>> On Mon, Aug 06, 2012 at 01:42:21PM -0600, Stephen Warren wrote:
>>> On 07/26/2012 01:55 PM, Thierry Reding wrote:
>>>> This patch series adds support for device tree based probing of
>>>> the PCIe controller found on Tegra SoCs.
>>>
>>> Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed
>>> that PCIe doesn't work on TrimSlice when booting using device tree.
>>> I think I found the cause, and I can't see why the same problem
>>> doesn't affect this series. Perhaps you can enlighten me?
> ...
>>> PCI: Device 0000:01:00.0 not available because of resource
>>> collisions
> ...
>> I've looked into this a bit, and it seems like ARM is using an
>> open-coded version of the pci_enable_resources() function here,
>> with the only difference being the unconditional enabling of both
>> I/O and memory-mapped access for bridges. On Tegra there is
>> already a PCI fixup to do this, so pci_enable_resources() can be
>> used as-is.

I'd prefer that bridge I/O & memory access enabling be done in a
mainline path, not in a fixup.  Fixups are intended for working around
defects in specific devices, not for the normal path.  I know various
architectures have fixups that are used in the normal path, but I've
been working on eliminating them.

> The patch did alter the behavior a little for TrimSlice, but didn't
> solve the problem. The old error messages:
>
>> [    2.173971] PCI: Device 0000:01:00.0 not available because of resource collisions
>> [    2.181453] r8169 0000:01:00.0: (unregistered net_device): enable failure
>> [    2.188254] r8169: probe of 0000:01:00.0 failed with error -22
>
> Were replaced with the following with your patch:
>
>> [    2.174010] r8169 0000:01:00.0: device not available (can't reserve [io  0x0000-0x00ff])
>> [    2.182098] r8169 0000:01:00.0: (unregistered net_device): enable failure
>> [    2.188900] r8169: probe of 0000:01:00.0 failed with error -22
>
> This message appears from drivers/pci/setup-res.c pci_enable_resources()
> due to:
>
>>               if (!r->parent) {
>>                       dev_err(&dev->dev, "device not available "
>>                               "(can't reserve %pR)\n", r);
>>                       return -EINVAL;
>>               }
>
> That check doesn't appear in ARM's custom pcibios_enable_device().
> Disabling that check yields:
>
>> [    2.174192] r8169 0000:01:00.0: enabling device (0140 -> 0143)
>> [    2.180041] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>> [    2.188386] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>> [    2.196140] r8169: probe of 0000:01:00.0 failed with error -16
>
> I think that's because the pci_dev's resources are initially assigned
> PCI-aperture-relative addresses, and then these are later patched up to
> take account of where the aperture is mapped into the CPU's address space.

We definitely shouldn't be calling the driver probe routine before the
device BARs are assigned.

> Boot log using board files:
>
>> [    1.146145] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>> [    1.151745] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>> [    1.159007] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>> [    1.166270] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> ...
>> [    1.217829] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>> [    1.225264] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>> [    1.233236] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>> [    1.241206] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> ... (I added some extra printks:)
>> [    1.488007] r8169 0000:01:00.0: BAR 0: requesting [io  0x1000-0x10ff]
>> [    1.501483] r8169 0000:01:00.0: BAR 2: requesting [mem 0xa0024000-0xa0024fff 64bit pref]
>> [    1.516611] r8169 0000:01:00.0: BAR 4: requesting [mem 0xa0020000-0xa0023fff 64bit pref]
>
> whereas for a device tree boot:
>
> (same):
>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> ... (request region happens early)
>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> ... (same, just happens too late)
>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>
> I suspect this is all still related to the PCI devices themselves being
> probed much earlier in the overall PCI initialization sequence when the
> PCI controller is probed later in the boot sequence, whereas PCI device
> probe is deferred until the overall PCI initialization sequence is
> complete if the PCI controller is probed very early in the boot sequence.

I don't know what to apply your patches to (they don't apply cleanly
to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
like you might be calling pci_bus_add_devices() before
pci_bus_assign_resources(), which isn't going to work.

I don't know what it means to probe PCI devices before probing the PCI
controller (the host bridge) -- that shouldn't happen either.  In
order to probe PCI devices, we have to first know about the host
bridge so we know how to do config accesses and what the bridge
apertures are.
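
To spell out the ordering I mean, here is a toy user-space model. All
names in it are invented for illustration; the real kernel functions are
only mentioned in the comments, and the model assumes a single device
with a single BAR.

```c
#include <stdbool.h>

/* Toy model, not kernel code: one device with one BAR. */
struct toy_dev {
	unsigned long bar_start;
	unsigned long bar_end;
	bool assigned;	/* set once the BAR has been given a real address */
	bool bound;	/* set once a driver probe has succeeded */
};

/* Stands in for the resource assignment step (pci_bus_assign_resources()). */
static void toy_assign_resources(struct toy_dev *d, unsigned long base)
{
	unsigned long size = d->bar_end - d->bar_start + 1;

	d->bar_start = base;
	d->bar_end = base + size - 1;
	d->assigned = true;
}

/*
 * Stands in for pci_bus_add_devices(), which is the point where drivers
 * get to bind. Returns 0 on success, -1 if the driver cannot claim its
 * BAR because it has not been assigned yet.
 */
static int toy_add_devices(struct toy_dev *d)
{
	if (!d->assigned)
		return -1;	/* probe would see an unassigned BAR and fail */
	d->bound = true;
	return 0;
}
```

Calling toy_add_devices() first fails, which is the shape of the device
tree boot log above; calling toy_assign_resources() and then
toy_add_devices() succeeds, which is the shape of the board file log.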

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 05/10] resource: add PCI configuration space support
  2012-07-26 19:55 ` [PATCH v3 05/10] resource: add PCI configuration space support Thierry Reding
@ 2012-08-14  5:00   ` Bjorn Helgaas
  2012-08-14  5:55     ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-14  5:00 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> This commit adds a new flag that allows marking resources as PCI
> configuration space.
>
> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> ---
> Changes in v3:
> - new patch
>
>  include/linux/ioport.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> index 589e0e7..3314843 100644
> --- a/include/linux/ioport.h
> +++ b/include/linux/ioport.h
> @@ -102,7 +102,7 @@ struct resource {
>
>  /* PCI control bits.  Shares IORESOURCE_BITS with above PCI ROM.  */
>  #define IORESOURCE_PCI_FIXED           (1<<4)  /* Do not move resource */
> -
> +#define IORESOURCE_PCI_CS              (1<<5)  /* PCI configuration space */

What is the purpose of this?  It seems that you are marking regions
that we call MMCONFIG on x86, or ECAM-type regions in the language of
the PCIe spec.  I see that you set it in several places, but I don't
see anything that ever looks for it.  Do you have plans to use it in
the future?  If it really does correspond to MMCONFIG/ECAM, we should
handle those regions consistently across all architectures.

>  /* helpers to define resources */
>  #define DEFINE_RES_NAMED(_start, _size, _name, _flags)                 \
> --
> 1.7.11.2
>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-07-26 19:55 ` [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init Thierry Reding
@ 2012-08-14  5:06   ` Bjorn Helgaas
  2012-08-14  5:37     ` Thierry Reding
  2012-08-15 17:06   ` Bjorn Helgaas
  1 sibling, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-14  5:06 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> When using deferred driver probing, PCI host controller drivers may
> actually require this function after the init stage.

Yes, this is a bug.  Actually, there's still another bug here: if we
hot-add a device, we won't do pdev_fixup_irq() for it.  But your
change is worthwhile even without fixing that other bug.

Normally I would include this in my PCI tree since it's under
drivers/pci.  That would make it easier for somebody to fix the
hotplug problem in this cycle.  Would having it in my PCI tree make
things harder for you?

> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
> Changes in v3:
> - none
>
> Changes in v2:
> - use __devinit annotations
>
>  drivers/pci/setup-irq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c
> index eb219a1..f0bcd56 100644
> --- a/drivers/pci/setup-irq.c
> +++ b/drivers/pci/setup-irq.c
> @@ -18,7 +18,7 @@
>  #include <linux/cache.h>
>
>
> -static void __init
> +static void __devinit
>  pdev_fixup_irq(struct pci_dev *dev,
>                u8 (*swizzle)(struct pci_dev *, u8 *),
>                int (*map_irq)(const struct pci_dev *, u8, u8))
> @@ -54,7 +54,7 @@ pdev_fixup_irq(struct pci_dev *dev,
>         pcibios_update_irq(dev, irq);
>  }
>
> -void __init
> +void __devinit
>  pci_fixup_irqs(u8 (*swizzle)(struct pci_dev *, u8 *),
>                int (*map_irq)(const struct pci_dev *, u8, u8))
>  {
> --
> 1.7.11.2
>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-08-14  5:06   ` Bjorn Helgaas
@ 2012-08-14  5:37     ` Thierry Reding
  0 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-14  5:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 884 bytes --]

On Mon, Aug 13, 2012 at 10:06:38PM -0700, Bjorn Helgaas wrote:
> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > When using deferred driver probing, PCI host controller drivers may
> > actually require this function after the init stage.
> 
> Yes, this is a bug.  Actually, there's still another bug here: if we
> hot-add a device, we won't do pdev_fixup_irq() for it.  But your
> change is worthwhile even without fixing that other bug.
> 
> Normally I would include this in my PCI tree since it's under
> drivers/pci.  That would make it easier for somebody to fix the
> hotplug problem in this cycle.  Would having it in my PCI tree make
> things harder for you?

No, not at all.

> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 05/10] resource: add PCI configuration space support
  2012-08-14  5:00   ` Bjorn Helgaas
@ 2012-08-14  5:55     ` Thierry Reding
  2012-08-14 17:38       ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-14  5:55 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 2452 bytes --]

On Mon, Aug 13, 2012 at 10:00:45PM -0700, Bjorn Helgaas wrote:
> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > This commit adds a new flag that allows marking resources as PCI
> > configuration space.
> >
> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> > ---
> > Changes in v3:
> > - new patch
> >
> >  include/linux/ioport.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> > index 589e0e7..3314843 100644
> > --- a/include/linux/ioport.h
> > +++ b/include/linux/ioport.h
> > @@ -102,7 +102,7 @@ struct resource {
> >
> >  /* PCI control bits.  Shares IORESOURCE_BITS with above PCI ROM.  */
> >  #define IORESOURCE_PCI_FIXED           (1<<4)  /* Do not move resource */
> > -
> > +#define IORESOURCE_PCI_CS              (1<<5)  /* PCI configuration space */
> 
> What is the purpose of this?  It seems that you are marking regions
> that we call MMCONFIG on x86, or ECAM-type regions in the language of
> the PCIe spec.  I see that you set it in several places, but I don't
> see anything that ever looks for it.  Do you have plans to use it in
> the future?  If it really does correspond to MMCONFIG/ECAM, we should
> handle those regions consistently across all architectures.

The purpose is ultimately to obtain the MMCONFIG/ECAM resources assigned
to a PCI host controller. I've used this in of_pci_parse_ranges() and
in the static board setup code to mark ranges as such. Perhaps
IORESOURCE_ECAM or IORESOURCE_MMCONFIG might have been better names. I
also just noticed that I'm not using this anywhere, but the plan was to
eventually use it with platform_get_resource(). However that doesn't
seem to work either because the lower bits of the flags aren't used for
comparison in that function.
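
To illustrate the point about the flag comparison, here is a minimal
userspace sketch. The constants are meant to mirror include/linux/ioport.h
as I understand it, IORESOURCE_PCI_CS is the bit proposed in this patch, and
struct res, res_type() and get_res() are hypothetical stand-ins for
struct resource, resource_type() and platform_get_resource(), not the
kernel code itself:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror include/linux/ioport.h at the time of this series. */
#define IORESOURCE_BITS      0x000000ffUL /* bus-specific bits */
#define IORESOURCE_TYPE_BITS 0x00001f00UL /* resource type */
#define IORESOURCE_MEM       0x00000200UL
#define IORESOURCE_PCI_CS    (1UL << 5)   /* proposed in patch 5 */

/* Hypothetical, cut-down stand-in for struct resource. */
struct res {
	unsigned long start, end, flags;
};

/* Same idea as resource_type(): only the type bits survive the mask. */
static unsigned long res_type(const struct res *r)
{
	return r->flags & IORESOURCE_TYPE_BITS;
}

/*
 * Hypothetical stand-in for platform_get_resource(): return the num-th
 * entry whose masked type equals the requested type.
 */
static const struct res *get_res(const struct res *tbl, size_t n,
				 unsigned long type, unsigned int num)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (res_type(&tbl[i]) == type && num-- == 0)
			return &tbl[i];

	return NULL;
}
```

With a table holding one plain IORESOURCE_MEM entry and one entry flagged
IORESOURCE_MEM | IORESOURCE_PCI_CS, asking for IORESOURCE_MEM returns both
by index, while asking for IORESOURCE_MEM | IORESOURCE_PCI_CS returns
nothing, because the low bits never survive the mask on the stored side.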

Any other ideas how that could be handled? Basically what I need is a
way to mark a resource as an MMCONFIG/ECAM range so that it can be used
to program the PCI host controller accordingly. I don't know how these
are assigned on x86. I was under the impression that the MMCONFIG/ECAM
space was accessed through a single address/data register pair.

On Tegra 20 this region can be anywhere in the upper 1 GiB of the 4 GiB
address space, while on Tegra 30 it can be anywhere in the lower 1 GiB,
so they can be freely assigned.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-13 21:38         ` Rob Herring
@ 2012-08-14  6:14           ` Thierry Reding
  0 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-14  6:14 UTC (permalink / raw)
  To: Rob Herring
  Cc: Stephen Warren, Russell King, linux-pci, devicetree-discuss,
	Rob Herring, Bjorn Helgaas, Colin Cross, linux-tegra,
	linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 7515 bytes --]

On Mon, Aug 13, 2012 at 04:38:45PM -0500, Rob Herring wrote:
> On 08/13/2012 03:33 PM, Thierry Reding wrote:
> > On Mon, Aug 13, 2012 at 12:47:38PM -0600, Stephen Warren wrote:
> >> On 08/13/2012 11:40 AM, Thierry Reding wrote:
> >>> On Mon, Aug 06, 2012 at 01:42:21PM -0600, Stephen Warren wrote:
> >>>> On 07/26/2012 01:55 PM, Thierry Reding wrote:
> >>>>> This patch series adds support for device tree based probing of
> >>>>> the PCIe controller found on Tegra SoCs.
> >>>>
> >>>> Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed
> >>>> that PCIe doesn't work on TrimSlice when booting using device tree.
> >>>> I think I found the cause, and I can't see why the same problem
> >>>> doesn't affect this series. Perhaps you can enlighten me?
> >> ...
> >>>> PCI: Device 0000:01:00.0 not available because of resource
> >>>> collisions
> >> ...
> >>> I've looked into this a bit, and it seems like ARM is using an
> >>> open-coded version of the pci_enable_resources() function here,
> >>> with the only difference being the unconditional enabling of both
> >>> I/O and memory-mapped access for bridges. On Tegra there is
> >>> already a PCI fixup to do this, so pci_enable_resources() can be
> >>> used as-is. I came up with the attached patch but haven't been able
> >>> to test it yet.
> >>
> >> Thanks very much for looking into this.
> >>
> >> The patch did alter the behavior a little for TrimSlice, but didn't
> >> solve the problem. The old error messages:
> >>
> >>> [    2.173971] PCI: Device 0000:01:00.0 not available because of resource collisions
> >>> [    2.181453] r8169 0000:01:00.0: (unregistered net_device): enable failure
> >>> [    2.188254] r8169: probe of 0000:01:00.0 failed with error -22
> >>
> >> Were replaced with the following with your patch:
> >>
> >>> [    2.174010] r8169 0000:01:00.0: device not available (can't reserve [io  0x0000-0x00ff])
> >>> [    2.182098] r8169 0000:01:00.0: (unregistered net_device): enable failure
> >>> [    2.188900] r8169: probe of 0000:01:00.0 failed with error -22
> >>
> >> This message appears from drivers/pci/setup-res.c pci_enable_resources()
> >> due to:
> >>
> >>> 		if (!r->parent) {
> >>> 			dev_err(&dev->dev, "device not available "
> >>> 				"(can't reserve %pR)\n", r);
> >>> 			return -EINVAL;
> >>> 		}
> > 
> > Looking at the code some more, this may be caused by the pci_remap_io()
> > patch series, so you might want to revert that patch and see if it fixes
> > the I/O resources.
> > 
> 
> Humm... But this patch deals with the i/o space and it is failing below
> on the memory space.

But above it also fails for I/O. Looking at this some more, it seems
like your patch isn't at fault. Rather there seems to be a general
resource assignment problem.

> 
> >> That check doesn't appear in ARM's custom pcibios_enable_device().
> >> Disabling that check yields:
> >>
> >>> [    2.174192] r8169 0000:01:00.0: enabling device (0140 -> 0143)
> >>> [    2.180041] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> >>> [    2.188386] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> >>> [    2.196140] r8169: probe of 0000:01:00.0 failed with error -16
> >>
> >> I think that's because the pci_dev's resources are initially assigned
> >> PCI-aperture-relative addresses, and then these are later patched up to
> >> take account of where the aperture is mapped into the CPU's address space.
> >>
> >> Boot log using board files:
> >>
> >>> [    1.146145] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> >>> [    1.151745] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> >>> [    1.159007] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> >>> [    1.166270] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> >> ...
> >>> [    1.217829] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> >>> [    1.225264] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> >>> [    1.233236] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> >>> [    1.241206] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> >> ... (I added some extra printks:)
> >>> [    1.488007] r8169 0000:01:00.0: BAR 0: requesting [io  0x1000-0x10ff]
> >>> [    1.501483] r8169 0000:01:00.0: BAR 2: requesting [mem 0xa0024000-0xa0024fff 64bit pref]
> >>> [    1.516611] r8169 0000:01:00.0: BAR 4: requesting [mem 0xa0020000-0xa0023fff 64bit pref]
> >>
> >> whereas for a device tree boot:
> >>
> >> (same):
> >>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> >>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> >>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> >>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> >> ... (request region happens early)
> >>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
> >>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
> >>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> >>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> >> ... (same, just happens too late)
> >>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> >>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> >>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> >>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> >>
> >> I suspect this is all still related to the PCI devices themselves being
> >> probed much earlier in the overall PCI initialization sequence when the
> >> PCI controller is probed later in the boot sequence, whereas PCI device
> >> probe is deferred until the overall PCI initialization sequence is
> >> complete if the PCI controller is probed very early in the boot sequence.
> >>
> >> Does anyone know where/what that "probe now" vs. "probe later" decision
> >> point is? I'll try and track it down if nobody beats me to it.
> > 
> > There's the io_offset and mem_offset fields that I've completely ignored
> > up to now. Can you try the patch below to see if it changes anything?
> > I'm sorry but I can't test any of this myself right now.
> 
> Arnd and I discussed io_offset some. I don't think either of us can
> figure out when it should be anything but 0 at least if pci i/o bus
> addresses start at 0.
> 
> I don't think mem_offset is the issue. I think perhaps you need to set
> pcibios_min_mem to the memory window base (0xa0000000), but that's just
> a guess.

I'm having trouble understanding how that's supposed to work for regions
of prefetchable memory. At least on Tegra these can be arbitrarily assigned
and I thought I had seen other platforms where this was also the case.

However the pcibios_min_mem (or the equivalent macro PCIBIOS_MIN_MEM) is
used while assigning the resources in __pci_assign_resources() in
drivers/pci/setup-res.c, so it may influence things.

But I think the more fundamental issue here is that BARs are assigned
properly, only they are assigned too late in the DT case as opposed to
the board files case. I don't understand why that happens.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-13 23:18       ` Bjorn Helgaas
@ 2012-08-14  6:29         ` Thierry Reding
  2012-08-14 19:39         ` Stephen Warren
  1 sibling, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-14  6:29 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Stephen Warren, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 7331 bytes --]

On Mon, Aug 13, 2012 at 04:18:16PM -0700, Bjorn Helgaas wrote:
> On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> > On 08/13/2012 11:40 AM, Thierry Reding wrote:
> >> On Mon, Aug 06, 2012 at 01:42:21PM -0600, Stephen Warren wrote:
> >>> On 07/26/2012 01:55 PM, Thierry Reding wrote:
> >>>> This patch series adds support for device tree based probing of
> >>>> the PCIe controller found on Tegra SoCs.
> >>>
> >>> Thierry, I just tested all Tegra boards in v3.6-rc1, and noticed
> >>> that PCIe doesn't work on TrimSlice when booting using device tree.
> >>> I think I found the cause, and I can't see why the same problem
> >>> doesn't affect this series. Perhaps you can enlighten me?
> > ...
> >>> PCI: Device 0000:01:00.0 not available because of resource
> >>> collisions
> > ...
> >> I've looked into this a bit, and it seems like ARM is using an
> >> open-coded version of the pci_enable_resources() function here,
> >> with the only difference being the unconditional enabling of both
> >> I/O and memory-mapped access for bridges. On Tegra there is
> >> already a PCI fixup to do this, so pci_enable_resources() can be
> >> used as-is.
> 
> I'd prefer that bridge I/O & memory access enabling be done in a
> mainline path, not in a fixup.  Fixups are intended for working around
> defects in specific devices, not for the normal path.  I know various
> architectures have fixups that are used in the normal path, but I've
> been working on eliminating them.

I understand. Perhaps it should be added to the pci_enable_resources()
function?

> > The patch did alter the behavior a little for TrimSlice, but didn't
> > solve the problem. The old error messages:
> >
> >> [    2.173971] PCI: Device 0000:01:00.0 not available because of resource collisions
> >> [    2.181453] r8169 0000:01:00.0: (unregistered net_device): enable failure
> >> [    2.188254] r8169: probe of 0000:01:00.0 failed with error -22
> >
> > Were replaced with the following with your patch:
> >
> >> [    2.174010] r8169 0000:01:00.0: device not available (can't reserve [io  0x0000-0x00ff])
> >> [    2.182098] r8169 0000:01:00.0: (unregistered net_device): enable failure
> >> [    2.188900] r8169: probe of 0000:01:00.0 failed with error -22
> >
> > This message appears from drivers/pci/setup-res.c pci_enable_resources()
> > due to:
> >
> >>               if (!r->parent) {
> >>                       dev_err(&dev->dev, "device not available "
> >>                               "(can't reserve %pR)\n", r);
> >>                       return -EINVAL;
> >>               }
> >
> > That check doesn't appear in ARM's custom pcibios_enable_device().
> > Disabling that check yields:
> >
> >> [    2.174192] r8169 0000:01:00.0: enabling device (0140 -> 0143)
> >> [    2.180041] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> >> [    2.188386] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> >> [    2.196140] r8169: probe of 0000:01:00.0 failed with error -16
> >
> > I think that's because the pci_dev's resources are initially assigned
> > PCI-aperture-relative addresses, and then these are later patched up to
> > take account of where the aperture is mapped into the CPU's address space.
> 
> We definitely shouldn't be calling the driver probe routine before the
> device BARs are assigned.
> 
> > Boot log using board files:
> >
> >> [    1.146145] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> >> [    1.151745] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> >> [    1.159007] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> >> [    1.166270] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> > ...
> >> [    1.217829] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> >> [    1.225264] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> >> [    1.233236] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> >> [    1.241206] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> > ... (I added some extra printks:)
> >> [    1.488007] r8169 0000:01:00.0: BAR 0: requesting [io  0x1000-0x10ff]
> >> [    1.501483] r8169 0000:01:00.0: BAR 2: requesting [mem 0xa0024000-0xa0024fff 64bit pref]
> >> [    1.516611] r8169 0000:01:00.0: BAR 4: requesting [mem 0xa0020000-0xa0023fff 64bit pref]
> >
> > whereas for a device tree boot:
> >
> > (same):
> >> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> >> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> >> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> >> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> > ... (request region happens early)
> >> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
> >> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
> >> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> >> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> > ... (same, just happens too late)
> >> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> >> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> >> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> >> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> >
> > I suspect this is all still related to the PCI devices themselves being
> > probed much earlier in the overall PCI initialization sequence when the
> > PCI controller is probed later in the boot sequence, whereas PCI device
> > probe is deferred until the overall PCI initialization sequence is
> > complete if the PCI controller is probed very early in the boot sequence.
> 
> I don't know what to apply your patches to (they don't apply cleanly
> to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
> like you might be calling pci_bus_add_devices() before
> pci_bus_assign_resources(), which isn't going to work.

The patch was on top of this series, which in turn is based on -next.
However the patch doesn't actually change anything in the ordering of
the initialization sequence, it just replaces the ARM implementation
of pcibios_enable_device() to reuse pci_enable_resources().

ARM's main PCI entry point, arch/arm/kernel/bios32.c:pci_common_init(),
does call the correct sequence: pci_bus_assign_resources() followed by
pci_bus_add_devices(), so the sequence looks correct. However, judging
by the logs Stephen posted, the BAR assignment is too late, after the
driver has been probed.

> I don't know what it means to probe PCI devices before probing the PCI
> controller (the host bridge) -- that shouldn't happen either.  In
> order to probe PCI devices, we have to first know about the host
> bridge so we know how to do config accesses and what the bridge
> apertures are.

Right. I don't think this can happen. If the bridge hasn't been probed,
then the devices shouldn't be there anyway.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 05/10] resource: add PCI configuration space support
  2012-08-14  5:55     ` Thierry Reding
@ 2012-08-14 17:38       ` Bjorn Helgaas
  2012-08-14 18:01         ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-14 17:38 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Mon, Aug 13, 2012 at 10:55 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Mon, Aug 13, 2012 at 10:00:45PM -0700, Bjorn Helgaas wrote:
>> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>> > This commit adds a new flag that allows marking resources as PCI
>> > configuration space.
>> >
>> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
>> > ---
>> > Changes in v3:
>> > - new patch
>> >
>> >  include/linux/ioport.h | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
>> > index 589e0e7..3314843 100644
>> > --- a/include/linux/ioport.h
>> > +++ b/include/linux/ioport.h
>> > @@ -102,7 +102,7 @@ struct resource {
>> >
>> >  /* PCI control bits.  Shares IORESOURCE_BITS with above PCI ROM.  */
>> >  #define IORESOURCE_PCI_FIXED           (1<<4)  /* Do not move resource */
>> > -
>> > +#define IORESOURCE_PCI_CS              (1<<5)  /* PCI configuration space */
>>
>> What is the purpose of this?  It seems that you are marking regions
>> that we call MMCONFIG on x86, or ECAM-type regions in the language of
>> the PCIe spec.  I see that you set it in several places, but I don't
>> see anything that ever looks for it.  Do you have plans to use it in
>> the future?  If it really does correspond to MMCONFIG/ECAM, we should
>> handle those regions consistently across all architectures.
>
> The purpose is ultimately to obtain the MMCONFIG/ECAM resources assigned
> to a PCI host controller. I've used this in the of_pci_parse_ranges()
> and in the static board setup code to mark ranges as such. Perhaps
> IORESOURCE_ECAM or IORESOURCE_MMCONFIG might have been better names. I
> also just noticed that I'm not using this anywhere, but the plan was to
> eventually use it with platform_get_resource(). However that doesn't
> seem to work either because the lower bits of the flags aren't used for
> comparison in that function.
>
> Any other ideas how that could be handled? Basically what I need is a
> way to mark a resource as an MMCONFIG/ECAM range so that it can be used
> to program the PCI host controller accordingly. I don't know how these
> are assigned on x86. I was under the impression that the MMCONFIG/ECAM
> space was accessed through a single address/data register pair.

The legacy config access mechanism (CF8h/CFCh registers described in
PCI 3.0 spec sec 3.2.2.3.2) is a single address/data pair, but this is
mostly x86-specific.  The ECAM mechanism (described in the PCIe 3.0
spec sec 7.2.2) is not a single address/data pair; instead, each byte
of config space is directly mapped into the host's MMIO space.
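
To make the difference concrete, here is a rough sketch of how a
bus/device/function/register tuple is encoded by each mechanism. The
function names are made up for illustration; the bit layouts are the
conventional ones from the specs, and a given host controller may well
use a different layout for its own config window:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Legacy mechanism: the value written to the CF8h address register.
 * The data is then transferred through CFCh. Only 256 bytes of config
 * space per function are reachable, and the register is dword-aligned.
 */
static uint32_t cf8_address(unsigned bus, unsigned dev, unsigned fn,
			    unsigned reg)
{
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 256);

	return 0x80000000u | (bus << 16) | (dev << 11) | (fn << 8) |
	       (reg & 0xfcu);
}

/*
 * ECAM: byte offset of a register inside the memory-mapped window.
 * Each function gets 4 KiB, so the full extended config space is
 * reachable with ordinary loads and stores.
 */
static uint32_t ecam_offset(unsigned bus, unsigned dev, unsigned fn,
			    unsigned reg)
{
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);

	return (bus << 20) | (dev << 15) | (fn << 12) | reg;
}
```

So for device 01:00.0, register 0x10, the legacy mechanism writes
0x80010010 to CF8h, while ECAM simply accesses offset 0x00100010 from
the start of the window.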

Here's what we do on x86 (omitting some historical grunge that
complicates things):

  - Discover the host bridge via a PNP0A08 device in the ACPI namespace.
  - Discover the bus number range behind the bridge using a _CRS
method in the PNP0A08 device.
  - Discover the ECAM space for those buses via a _CBA method in the
PNP0A08 device.
  - Tell the config accessors (struct pci_ops) that the ECAM space for
buses A-B is at address X.
  - Enumerate the devices behind the host bridge by calling
pci_scan_root_bus(), passing the config accessors.

It sounds like you want a way to parse the resources at one point,
saving them and marking the ECAM region, then at a later point, look
up the ECAM from a saved list.  We don't do that on x86 because the
config accessors keep an internal list of the ECAM area for each bus.
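
As a sketch of what such an internal list amounts to (this is not the
x86 code; struct ecam_region, ecam_lookup() and ecam_address() are
hypothetical names, and I'm assuming here that base is the address at
which the config space of bus_start begins):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one ECAM window covering bus_start..bus_end. */
struct ecam_region {
	uint8_t bus_start, bus_end;
	uint64_t base; /* assumed: CPU address of bus_start's config space */
};

/* Find the window that decodes config accesses for the given bus. */
static const struct ecam_region *ecam_lookup(const struct ecam_region *tbl,
					     size_t n, unsigned bus)
{
	for (size_t i = 0; i < n; i++)
		if (bus >= tbl[i].bus_start && bus <= tbl[i].bus_end)
			return &tbl[i];

	return NULL;
}

/* CPU address of one config register inside a window. */
static uint64_t ecam_address(const struct ecam_region *r, unsigned bus,
			     unsigned dev, unsigned fn, unsigned reg)
{
	assert(r != NULL && bus >= r->bus_start && bus <= r->bus_end);
	assert(dev < 32 && fn < 8 && reg < 4096);

	return r->base + ((uint64_t)(bus - r->bus_start) << 20) +
	       (dev << 15) + (fn << 12) + reg;
}
```

The accessors only need the bus number to find the right window, which
is why nothing in the resource tree has to say "this is config space".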

We do of course want to put this ECAM space in the IORESOURCE_MEM tree
because it consumes address space and we have to make sure we don't
put anything else on top of it.  But we don't have any reason to
describe the MMIO -> config space function in that tree.  From the
point of view of the rest of the system, it's just MMIO space that's
consumed by the PCI host bridge, just like any other device-specific
MMIO area.

Bjorn

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 05/10] resource: add PCI configuration space support
  2012-08-14 17:38       ` Bjorn Helgaas
@ 2012-08-14 18:01         ` Thierry Reding
  2012-08-14 21:44           ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-14 18:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 5050 bytes --]

On Tue, Aug 14, 2012 at 10:38:08AM -0700, Bjorn Helgaas wrote:
> On Mon, Aug 13, 2012 at 10:55 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Mon, Aug 13, 2012 at 10:00:45PM -0700, Bjorn Helgaas wrote:
> >> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
> >> <thierry.reding@avionic-design.de> wrote:
> >> > This commit adds a new flag that allows marking resources as PCI
> >> > configuration space.
> >> >
> >> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> >> > ---
> >> > Changes in v3:
> >> > - new patch
> >> >
> >> >  include/linux/ioport.h | 2 +-
> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >
> >> > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> >> > index 589e0e7..3314843 100644
> >> > --- a/include/linux/ioport.h
> >> > +++ b/include/linux/ioport.h
> >> > @@ -102,7 +102,7 @@ struct resource {
> >> >
> >> >  /* PCI control bits.  Shares IORESOURCE_BITS with above PCI ROM.  */
> >> >  #define IORESOURCE_PCI_FIXED           (1<<4)  /* Do not move resource */
> >> > -
> >> > +#define IORESOURCE_PCI_CS              (1<<5)  /* PCI configuration space */
> >>
> >> What is the purpose of this?  It seems that you are marking regions
> >> that we call MMCONFIG on x86, or ECAM-type regions in the language of
> >> the PCIe spec.  I see that you set it in several places, but I don't
> >> see anything that ever looks for it.  Do you have plans to use it in
> >> the future?  If it really does correspond to MMCONFIG/ECAM, we should
> >> handle those regions consistently across all architectures.
> >
> > The purpose is ultimately to obtain the MMCONFIG/ECAM resources assigned
> > to a PCI host controller. I've used this in the of_pci_parse_ranges()
> > and in the static board setup code to mark ranges as such. Perhaps
> > IORESOURCE_ECAM or IORESOURCE_MMCONFIG might have been better names. I
> > also just noticed that I'm not using this anywhere, but the plan was to
> > eventually use it with platform_get_resource(). However that doesn't
> > seem to work either because the lower bits of the flags aren't used for
> > comparison in that function.
> >
> > Any other ideas how that could be handled? Basically what I need is a
> > way to mark a resource as an MMCONFIG/ECAM range so that it can be used
> > to program the PCI host controller accordingly. I don't know how these
> > are assigned on x86. I was under the impression that the MMCONFIG/ECAM
> > space was accessed through a single address/data register pair.
> 
> The legacy config access mechanism (CF8h/CFCh registers described in
> PCI 3.0 spec sec 3.2.2.3.2) is a single address/data pair, but this is
> mostly x86-specific.  The ECAM mechanism (described in the PCIe 3.0
> spec sec 7.2.2) is not a single address/data pair; instead, each byte
> of config space is directly mapped into the host's MMIO space.
> 
> Here's what we do on x86 (omitting some historical grunge that
> complicates things):
> 
>   - Discover the host bridge via a PNP0A08 device in the ACPI namespace.
>   - Discover the bus number range behind the bridge using a _CRS
> method in the PNP0A08 device.
>   - Discover the ECAM space for those buses via a _CBA method in the
> PNP0A08 device.
>   - Tell the config accessors (struct pci_ops) that the ECAM space for
> buses A-B is at address X.
>   - Enumerate the devices behind the host bridge by calling
> pci_scan_root_bus(), passing the config accessors.
> 
> It sounds like you want a way to parse the resources at one point,
> saving them and marking the ECAM region, then at a later point, look
> up the ECAM from a saved list.  We don't do that on x86 because the
> config accessors keep an internal list of the ECAM area for each bus.
> 
> We do of course want to put this ECAM space in the IORESOURCE_MEM tree
> because it consumes address space and we have to make sure we don't
> put anything else on top of it.  But we don't have any reason to
> describe the MMIO -> config space function in that tree.  From the
> point of view of the rest of the system, it's just MMIO space that's
> consumed by the PCI host bridge, just like any other device-specific
> MMIO area.

What I currently do is pass the ECAM space as a resource to the PCI host
bridge platform device. Regular and extended configuration spaces are
given by the third and fourth resources, respectively. If I understand
correctly, you're saying that nothing beyond that needs to be encoded.
In other words it is enough for the PCI host bridge driver to know where
to take the data from.

I'll have to see what this means for the DT binding. There are other
issues that I need to think about, like for example how to pass the ECAM
space from the PCI host controller to each of the two bridges via the
ranges property. This no longer makes sense in the current form, as the
ECAM covers the configuration spaces for devices of both bridges and
cannot really be split among them.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-13 23:18       ` Bjorn Helgaas
  2012-08-14  6:29         ` Thierry Reding
@ 2012-08-14 19:39         ` Stephen Warren
  2012-08-14 19:58           ` Thierry Reding
  1 sibling, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-08-14 19:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Thierry Reding, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/13/2012 05:18 PM, Bjorn Helgaas wrote:
> On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
...
>> whereas for a device tree boot:
>>
>> (same):
>>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
>> ... (request region happens early)
>>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
>>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
>>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>> ... (same, just happens too late)
>>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>>
>> I suspect this is all still related to the PCI devices themselves being
>> probed much earlier in the overall PCI initialization sequence when the
>> PCI controller is probed later in the boot sequence, whereas PCI device
>> probe is deferred until the overall PCI initialization sequence is
>> complete if the PCI controller is probed very early in the boot sequence.
> 
> I don't know what to apply your patches to (they don't apply cleanly
> to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
> like you might be calling pci_bus_add_devices() before
> pci_bus_assign_resources(), which isn't going to work.

Yes, that's exactly what is happening.

PCIe initialization starts in arch/arm/mach-tegra/pcie.c
tegra_pcie_init() which calls arch/arm/kernel/bios32.c
pci_common_init(). That function first calls pcibios_init_hw() (in the
same file, more about this later) and then loops over PCI buses, calling
amongst other things pci_bus_assign_resources() then pci_bus_add_devices().

The problem is that ARM's pcibios_init_hw() calls pci_scan_root_bus()
(or a host-driver-specific function that also calls
pci_scan_root_bus() in Tegra's case) which in turn calls
pci_bus_add_devices() right at the end, before control has returned to
pci_common_init() and hence before pci_bus_assign_resources() has been
called.

If I modify pci_scan_root_bus() and remove the call to
pci_bus_add_devices(), everything works as expected.
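
To make the ordering problem concrete, here is a toy userspace model (not
kernel code; every toy_* name is made up for illustration). It assumes that
a driver's probe can only succeed once the BAR has been assigned, which is
what the "can't reserve" message in the log above suggests:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for one memory BAR; this is not the kernel's struct resource. */
struct toy_bar {
	uint32_t start;
	uint32_t size;
	bool assigned;
};

/* Models resource assignment: place the BAR inside the bridge window. */
static void toy_assign(struct toy_bar *bar, uint32_t window_base)
{
	bar->start = window_base;
	bar->assigned = true;
}

/* Models a driver probe: requesting an unassigned BAR fails. */
static bool toy_probe(const struct toy_bar *bar)
{
	return bar->assigned;
}

/* Failing order: the device is added (and probed) before assignment. */
static bool toy_add_then_assign(void)
{
	struct toy_bar bar = { 0, 0x1000, false };
	bool probed = toy_probe(&bar);

	toy_assign(&bar, 0xa0024000);
	return probed;
}

/* Working order: assign first, then add the device. */
static bool toy_assign_then_add(void)
{
	struct toy_bar bar = { 0, 0x1000, false };

	toy_assign(&bar, 0xa0024000);
	return toy_probe(&bar);
}
```

In this model toy_add_then_assign() reports a failed probe and
toy_assign_then_add() a successful one, mirroring the two boot logs.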

So, I guess the question is: Should ARM's pcibios_init_hw() not be
calling pci_scan_root_bus(), or at least presumably the ARM PCI code
needs to do things in a slightly different order?

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-14 19:39         ` Stephen Warren
@ 2012-08-14 19:58           ` Thierry Reding
  2012-08-14 21:55             ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-14 19:58 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Bjorn Helgaas, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Tue, Aug 14, 2012 at 01:39:23PM -0600, Stephen Warren wrote:
> On 08/13/2012 05:18 PM, Bjorn Helgaas wrote:
> > On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> ...
> >> whereas for a device tree boot:
> >>
> >> (same):
> >>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
> >>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
> >>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
> >>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> >> ... (request region happens early)
> >>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
> >>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
> >>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
> >>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
> >> ... (same, just happens too late)
> >>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
> >>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
> >>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
> >>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> >>
> >> I suspect this is all still related to the PCI devices themselves being
> >> probed much earlier in the overall PCI initialization sequence when the
> >> PCI controller is probed later in the boot sequence, whereas PCI device
> >> probe is deferred until the overall PCI initialization sequence is
> >> complete if the PCI controller is probed very early in the boot sequence.
> > 
> > I don't know what to apply your patches to (they don't apply cleanly
> > to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
> > like you might be calling pci_bus_add_devices() before
> > pci_bus_assign_resources(), which isn't going to work.
> 
> Yes, that's exactly what is happening.
> 
> PCIe initialization starts in arch/arm/mach-tegra/pcie.c
> tegra_pcie_init() which calls arch/arm/kernel/bios32.c
> pci_common_init(). That function first calls pcibios_init_hw() (in the
> same file, more about this later) and then loops over PCI buses, calling
> amongst other things pci_bus_assign_resources() then pci_bus_add_devices().
> 
> The problem is that ARM's pcibios_init_hw() calls pci_scan_root_bus()
> (or a host-driver-specific function that also calls
> pci_scan_root_bus() in Tegra's case) which in turn calls
> pci_bus_add_devices() right at the end, before control has returned to
> pci_common_init() and hence before pci_bus_assign_resources() has been
> called.
> 
> If I modify pci_scan_root_bus() and remove the call to
> pci_bus_add_devices(), everything works as expected.
> 
> So, I guess the question is: Should ARM's pcibios_init_hw() not be
> calling pci_scan_root_bus(), or at least presumably the ARM PCI code
> needs to do things in a slightly different order?

Maybe pci_scan_root_bus() should be calling pci_bus_assign_resources()?
Or a new function could be added which also assigns the resources.

Thierry

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-07-26 19:55 ` [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support Thierry Reding
@ 2012-08-14 20:12   ` Thierry Reding
  2012-08-14 23:50     ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-14 20:12 UTC (permalink / raw)
  To: linux-tegra
  Cc: Bjorn Helgaas, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Thu, Jul 26, 2012 at 09:55:12PM +0200, Thierry Reding wrote:
> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> index a094c97..c886dff 100644
> --- a/arch/arm/boot/dts/tegra20.dtsi
> +++ b/arch/arm/boot/dts/tegra20.dtsi
> @@ -199,6 +199,68 @@
>  		#size-cells = <0>;
>  	};
>  
> +	pcie-controller {
> +		compatible = "nvidia,tegra20-pcie";
> +		reg = <0x80003000 0x00000800   /* PADS registers */
> +		       0x80003800 0x00000200   /* AFI registers */
> +		       0x81000000 0x01000000   /* configuration space */
> +		       0x90000000 0x10000000>; /* extended configuration space */
> +		interrupts = <0 98 0x04   /* controller interrupt */
> +		              0 99 0x04>; /* MSI interrupt */
> +		status = "disabled";
> +
> +		ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
> +			  0 1 0  0x81000000 0x00800000   /* port 0 config space */
> +			  0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
> +			  0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
> +			  0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
> +			  0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
> +
> +			  1 0 0  0x80001000 0x00001000   /* root port 1 */
> +			  1 1 0  0x81800000 0x00800000   /* port 1 config space */
> +			  1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
> +			  1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
> +			  1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
> +			  1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */

I've been thinking about this some more. The translations for both the
regular and extended configuration spaces are configured in the top-
level PCIe controller. It is therefore wrong how they are passed to the
PCI host bridges via the ranges property.

I remember Mitch saying that it should be passed down to the children
because it is partitioned among them, but since the layout is compatible
with ECAM, the partitioning isn't as simple as what's in the tree. In
fact the partitions will be dependent on the number of devices attached
to the host bridges.
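
As a rough illustration of why the split isn't a simple per-port halving,
here is how an offset is formed under the standard ECAM layout from the PCIe
spec. This is only a sketch: ecam_offset() is a made-up helper, and whether
the Tegra windows follow exactly this map is an assumption.

```c
#include <stdint.h>

/*
 * Standard ECAM offset: 4 KiB per function, 8 functions per device,
 * 32 devices per bus, so each bus occupies 1 MiB of the window.
 */
static uint32_t ecam_offset(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
	return ((uint32_t)bus << 20) |
	       ((uint32_t)(dev & 0x1f) << 15) |
	       ((uint32_t)(fn & 0x7) << 12) |
	       (uint32_t)(reg & 0xfff);
}
```

With this map bus 1 starts 0x100000 bytes into the window, so which slice of
the window belongs to which port follows from the bus numbers below it, not
from a fixed split.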

> +
> +		#address-cells = <3>;
> +		#size-cells = <1>;
> +
> +		pci@0 {
> +			device_type = "pciex";
> +			reg = <0 0 0 0x1000>;
> +			status = "disabled";
> +
> +			#address-cells = <3>;
> +			#size-cells = <2>;
> +
> +			ranges = <0x80000000 0 0  0 1 0  0 0x00800000   /* config space */
> +				  0x90000000 0 0  0 2 0  0 0x08000000   /* ext config space */
> +				  0x81000000 0 0  0 3 0  0 0x00010000   /* I/O */
> +				  0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
> +				  0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
> +
> +			nvidia,num-lanes = <2>;
> +		};
> +
> +		pci@1 {
> +			device_type = "pciex";
> +			reg = <1 0 0 0x1000>;
> +			status = "disabled";
> +
> +			#address-cells = <3>;
> +			#size-cells = <2>;
> +
> +			ranges = <0x80000000 0 0  1 1 0  0 0x00800000   /* config space */
> +				  0x90000000 0 0  1 2 0  0 0x08000000   /* ext config space */
> +				  0x81000000 0 0  1 3 0  0 0x00010000   /* I/O */
> +				  0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
> +				  0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
> +
> +			nvidia,num-lanes = <2>;
> +		};
> +	};

The same is true for the ranges properties of the PCI host bridge nodes.
Which part of the configuration spaces maps to the children of the
respective host bridge depends on the actual device hierarchy.

Would it be possible to alternatively pass the complete range to the
children without further partitioning?

The driver doesn't actually care about the ranges property and only uses
the values specified in the reg property of the pcie-controller node.

Thierry

* Re: [PATCH v3 05/10] resource: add PCI configuration space support
  2012-08-14 18:01         ` Thierry Reding
@ 2012-08-14 21:44           ` Bjorn Helgaas
  2012-08-15  6:49             ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-14 21:44 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Tue, Aug 14, 2012 at 11:01 AM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Tue, Aug 14, 2012 at 10:38:08AM -0700, Bjorn Helgaas wrote:
>> On Mon, Aug 13, 2012 at 10:55 PM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>> > On Mon, Aug 13, 2012 at 10:00:45PM -0700, Bjorn Helgaas wrote:
>> >> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
>> >> <thierry.reding@avionic-design.de> wrote:
>> >> > This commit adds a new flag that allows marking resources as PCI
>> >> > configuration space.
>> >> >
>> >> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
>> >> > ---
>> >> > Changes in v3:
>> >> > - new patch
>> >> >
>> >> >  include/linux/ioport.h | 2 +-
>> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
>> >> > index 589e0e7..3314843 100644
>> >> > --- a/include/linux/ioport.h
>> >> > +++ b/include/linux/ioport.h
>> >> > @@ -102,7 +102,7 @@ struct resource {
>> >> >
>> >> >  /* PCI control bits.  Shares IORESOURCE_BITS with above PCI ROM.  */
>> >> >  #define IORESOURCE_PCI_FIXED           (1<<4)  /* Do not move resource */
>> >> > -
>> >> > +#define IORESOURCE_PCI_CS              (1<<5)  /* PCI configuration space */
>> >>
>> >> What is the purpose of this?  It seems that you are marking regions
>> >> that we call MMCONFIG on x86, or ECAM-type regions in the language of
>> >> the PCIe spec.  I see that you set it in several places, but I don't
>> >> see anything that ever looks for it.  Do you have plans to use it in
>> >> the future?  If it really does correspond to MMCONFIG/ECAM, we should
>> >> handle those regions consistently across all architectures.
>> >
>> > The purpose is ultimately to obtain the MMCONFIG/ECAM resources assigned
>> > to a PCI host controller. I've used this in the of_pci_parse_ranges()
>> > and in the static board setup code to mark ranges as such. Perhaps
>> > IORESOURCE_ECAM or IORESOURCE_MMCONFIG might have been better names. I
>> > also just noticed that I'm not using this anywhere, but the plan was to
>> > eventually use it with platform_get_resource(). However that doesn't
>> > seem to work either because the lower bits of the flags aren't used for
>> > comparison in that function.
>> >
>> > Any other ideas how that could be handled? Basically what I need is a
>> > way to mark a resource as an MMCONFIG/ECAM range so that it can be used
>> > to program the PCI host controller accordingly. I don't know how these
>> > are assigned on x86. I was under the impression that the MMCONFIG/ECAM
>> > space was accessed through a single address/data register pair.
>>
>> The legacy config access mechanism (CF8h/CFCh registers described in
>> PCI 3.0 spec sec 3.2.2.3.2) is a single address/data pair, but this is
>> mostly x86-specific.  The ECAM mechanism (described in the PCIe 3.0
>> spec sec 7.2.2) is not a single address/data pair; instead, each byte
>> of config space is directly mapped into the host's MMIO space.
>>
>> Here's what we do on x86 (omitting some historical grunge that
>> complicates things):
>>
>>   - Discover the host bridge via a PNP0A08 device in the ACPI namespace.
>>   - Discover the bus number range behind the bridge using a _CRS
>> method in the PNP0A08 device.
>>   - Discover the ECAM space for those buses via a _CBA method in the
>> PNP0A08 device.
>>   - Tell the config accessors (struct pci_ops) that the ECAM space for
>> buses A-B is at address X.
>>   - Enumerate the devices behind the host bridge by calling
>> pci_scan_root_bus(), passing the config accessors.
>>
>> It sounds like you want a way to parse the resources at one point,
>> saving them and marking the ECAM region, then at a later point, look
>> up the ECAM from a saved list.  We don't do that on x86 because the
>> config accessors keep an internal list of the ECAM area for each bus.
>>
>> We do of course want to put this ECAM space in the IORESOURCE_MEM tree
>> because it consumes address space and we have to make sure we don't
>> put anything else on top of it.  But we don't have any reason to
>> describe the MMIO -> config space function in that tree.  From the
>> point of view of the rest of the system, it's just MMIO space that's
>> consumed by the PCI host bridge, just like any other device-specific
>> MMIO area.
>
> What I currently do is pass the ECAM space as a resource to the PCI host
> bridge platform device. Regular and extended configuration spaces are
> given by the third and fourth resources, respectively. If I understand
> correctly, you're saying that nothing beyond that needs to be encoded.
> In other words it is enough for the PCI host bridge driver to know where
> to take the data from.

Yes, I think so.

I'd like to someday make ECAM support generic, since it's specified by
the PCIe spec.  The spec doesn't differentiate between PCI
configuration space (offsets 0-0xff) and PCI Express *extended*
configuration space (offsets 0x100-0xfff).  But it sounds like Tegra
might have a memory-mapped configuration mechanism that is similar to
ECAM but with a different address map, since you mention two resources
(third and fourth).  That's not a problem, since you're doing a
Tegra-specific solution right now anyway, but something to keep in the
back of your mind if we ever do make a generic ECAM solution.
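
For reference, a small sketch of how a standard ECAM offset decodes, and
where the 0x100 boundary between the two register ranges sits inside each
function's 4 KiB. The struct and function names are made up for this
example; this is not existing kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for a decoded ECAM offset. */
struct ecam_addr {
	uint8_t bus;
	uint8_t dev;
	uint8_t fn;
	uint16_t reg;
};

/* Split a standard ECAM offset into bus/device/function/register. */
static struct ecam_addr ecam_decode(uint32_t offset)
{
	struct ecam_addr a = {
		.bus = (offset >> 20) & 0xff,
		.dev = (offset >> 15) & 0x1f,
		.fn  = (offset >> 12) & 0x7,
		.reg = offset & 0xfff,
	};

	return a;
}

/* Offsets 0-0xff are PCI config space; 0x100-0xfff are PCIe extended. */
static bool reg_is_extended(uint16_t reg)
{
	return reg >= 0x100;
}
```

In plain ECAM both ranges live in the same 4 KiB function window, so there is
only one region to describe.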

> I'll have to see what this means for the DT binding. There are other
> issues that I need to think about, like for example how to pass the ECAM
> space from the PCI host controller to each of the two bridges via the
> ranges property. This no longer makes sense in the current form, as the
> ECAM covers the configuration spaces for devices of both bridges and
> cannot really be split among them.

We have the same situation on x86.  Part of the historical grunge that
I omitted is that there's a static ACPI table (MCFG) that can tell you
the ECAM range for a bus number range.  Often that bus number range
includes several host bridges.  On x86, we have a single set of ECAM
config accessors (see pci_mmcfg), and they maintain a list of "bus
number -> ECAM addr" mappings internally, independent of which host
bridge leads to the buses.
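
Very roughly, and only as a userspace sketch (the toy_* names and the
addresses are made up, and this is not the actual x86 implementation), such
a list and its lookup could look like this:

```c
#include <stddef.h>
#include <stdint.h>

/* One "bus number range -> ECAM address" entry. */
struct toy_ecam_region {
	uint8_t bus_start;
	uint8_t bus_end;
	uint64_t base;	/* address of the first byte for bus_start */
};

/* Made-up sample table: two regions covering buses 00-7f and 80-ff. */
static const struct toy_ecam_region toy_regions[] = {
	{ 0x00, 0x7f, 0xe0000000ULL },
	{ 0x80, 0xff, 0xf0000000ULL },
};

/* Find the region covering a bus, independent of any host bridge. */
static const struct toy_ecam_region *
toy_ecam_lookup(const struct toy_ecam_region *list, size_t n, uint8_t bus)
{
	for (size_t i = 0; i < n; i++)
		if (bus >= list[i].bus_start && bus <= list[i].bus_end)
			return &list[i];

	return NULL;
}

/* Address of a config register, assuming the standard ECAM layout. */
static uint64_t toy_ecam_addr(const struct toy_ecam_region *r, uint8_t bus,
			      uint8_t devfn, uint16_t reg)
{
	return r->base + ((uint64_t)(bus - r->bus_start) << 20) +
	       ((uint64_t)devfn << 12) + (reg & 0xfff);
}
```

The accessors only need the bus number to find the right region, which is
why the region doesn't have to be split per host bridge.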

Bjorn

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-14 19:58           ` Thierry Reding
@ 2012-08-14 21:55             ` Bjorn Helgaas
  2012-08-14 22:58               ` Stephen Warren
  0 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-14 21:55 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Stephen Warren, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Tue, Aug 14, 2012 at 12:58 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Tue, Aug 14, 2012 at 01:39:23PM -0600, Stephen Warren wrote:
>> On 08/13/2012 05:18 PM, Bjorn Helgaas wrote:
>> > On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> ...
>> >> whereas for a device tree boot:
>> >>
>> >> (same):
>> >>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>> >>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>> >>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>> >>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
>> >> ... (request region happens early)
>> >>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
>> >>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
>> >>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>> >>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>> >> ... (same, just happens too late)
>> >>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>> >>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>> >>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>> >>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>> >>
>> >> I suspect this is all still related to the PCI devices themselves being
>> >> probed much earlier in the overall PCI initialization sequence when the
>> >> PCI controller is probed later in the boot sequence, whereas PCI device
>> >> probe is deferred until the overall PCI initialization sequence is
>> >> complete if the PCI controller is probed very early in the boot sequence.
>> >
>> > I don't know what to apply your patches to (they don't apply cleanly
>> > to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
>> > like you might be calling pci_bus_add_devices() before
>> > pci_bus_assign_resources(), which isn't going to work.
>>
>> Yes, that's exactly what is happening.
>>
>> PCIe initialization starts in arch/arm/mach-tegra/pcie.c
>> tegra_pcie_init() which calls arch/arm/kernel/bios32.c
>> pci_common_init(). That function first calls pcibios_init_hw() (in the
>> same file, more about this later) and then loops over PCI buses, calling
>> amongst other things pci_bus_assign_resources() then pci_bus_add_devices().
>>
>> The problem is that ARM's pcibios_init_hw() calls pci_scan_root_bus()
>> (or a host-driver-specific function that also calls
>> pci_scan_root_bus() in Tegra's case) which in turn calls
>> pci_bus_add_devices() right at the end, before control has returned to
>> pci_common_init() and hence before pci_bus_assign_resources() has been
>> called.
>>
>> If I modify pci_scan_root_bus() and remove the call to
>> pci_bus_add_devices(), everything works as expected.
>>
>> So, I guess the question is: Should ARM's pcibios_init_hw() not be
>> calling pci_scan_root_bus(), or at least presumably the ARM PCI code
>> needs to do things in a slightly different order?

I think you need to do something like this instead of using pci_scan_root_bus():

    pci_create_root_bus()
    pci_scan_child_bus()
    pci_bus_assign_resources()
    pci_bus_add_devices()

This is the effective order used by most of the pci_create_root_bus() callers.

> Maybe pci_scan_root_bus() should be calling pci_bus_assign_resources()?
> Or a new function could be added which also assigns the resources.

Yes, it probably should.  I'm nervous about just throwing it in there
without quite a bit more analysis, but that's definitely the direction
I think we should be heading.

Bjorn

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-14 21:55             ` Bjorn Helgaas
@ 2012-08-14 22:58               ` Stephen Warren
  2012-08-14 23:51                 ` Stephen Warren
  2012-08-15  0:08                 ` Bjorn Helgaas
  0 siblings, 2 replies; 79+ messages in thread
From: Stephen Warren @ 2012-08-14 22:58 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Thierry Reding, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/14/2012 03:55 PM, Bjorn Helgaas wrote:
> On Tue, Aug 14, 2012 at 12:58 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
>> On Tue, Aug 14, 2012 at 01:39:23PM -0600, Stephen Warren wrote:
>>> On 08/13/2012 05:18 PM, Bjorn Helgaas wrote:
>>>> On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> ...
>>>>> whereas for a device tree boot:
>>>>>
>>>>> (same):
>>>>>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>>>>>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>>>>>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>>>>>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
>>>>> ... (request region happens early)
>>>>>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
>>>>>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
>>>>>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>>>>>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>>>>> ... (same, just happens too late)
>>>>>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>>>>>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>>>>>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>>>>>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>>>>>
>>>>> I suspect this is all still related to the PCI devices themselves being
>>>>> probed much earlier in the overall PCI initialization sequence when the
>>>>> PCI controller is probed later in the boot sequence, whereas PCI device
>>>>> probe is deferred until the overall PCI initialization sequence is
>>>>> complete if the PCI controller is probed very early in the boot sequence.
>>>>
>>>> I don't know what to apply your patches to (they don't apply cleanly
>>>> to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
>>>> like you might be calling pci_bus_add_devices() before
>>>> pci_bus_assign_resources(), which isn't going to work.
>>>
>>> Yes, that's exactly what is happening.
>>>
>>> PCIe initialization starts in arch/arm/mach-tegra/pcie.c
>>> tegra_pcie_init() which calls arch/arm/kernel/bios32.c
>>> pci_common_init(). That function first calls pcibios_init_hw() (in the
>>> same file, more about this later) and then loops over PCI buses, calling
>>> amongst other things pci_bus_assign_resources() then pci_bus_add_devices().
>>>
>>> The problem is that ARM's pcibios_init_hw() calls pci_scan_root_bus()
>>> (or a host-driver-specific function that also calls
>>> pci_scan_root_bus() in Tegra's case) which in turn calls
>>> pci_bus_add_devices() right at the end, before control has returned to
>>> pci_common_init() and hence before pci_bus_assign_resources() has been
>>> called.
>>>
>>> If I modify pci_scan_root_bus() and remove the call to
>>> pci_bus_add_devices(), everything works as expected.
>>>
>>> So, I guess the question is: Should ARM's pcibios_init_hw() not be
>>> calling pci_scan_root_bus(), or at least presumably the ARM PCI code
>>> needs to do things in a slightly different order?
> 
> I think you need to do something like this instead of using pci_scan_root_bus():
> 
>     pci_create_root_bus()
>     pci_scan_child_bus()
>     pci_bus_assign_resources()
>     pci_bus_add_devices()
> 
> This is the effective order used by most of the pci_create_root_bus() callers.

That would pretty much duplicate everything in pci_scan_root_bus(). That
might cause divergence down the road.

Can't we make the call to pci_bus_add_devices() optional in
pci_scan_root_bus() somehow; one of:

* Add a parameter to pci_scan_root_bus() controlling this.

(rather a large patch)

* Split pci_scan_root_bus() into pci_scan_root_bus() and
pci_scan_root_bus_no_add(), such that pci_scan_root_bus() is just a
wrapper that calls pci_scan_root_bus_no_add() then pci_bus_add_devices().

(very simple patch, and the new function can easily be used as/when it's
needed, e.g. enabled just for Tegra in 3.6 to reduce risk of regressions)

* Add a flag to struct pci_bus that requests pci_scan_root_bus() skip
the call to pci_bus_add_devices().

(a flag in the bus struct just for one function seems a little
circuitous, but perhaps OK)

* ifdef out the call to pci_bus_add_devices(), if building for ARM.

(very simple, and probably correct)

Actually, I'm not totally convinced some other archs shouldn't skip this
too; while I couldn't find any other arch that explicitly calls
pci_bus_assign_resources() and pci_bus_add_devices() after
pci_scan_root_bus(), I did see some that call
pci_assign_unassigned_resources() which seems like it might be due to a
similar situation?

* Add another pcibios_*() callback that pci_scan_root_bus() calls to
determine whether to call pci_bus_add_devices(), with default
implementation.
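
To show the shape of the wrapper split (the second option above), here is a
userspace mock-up with stub steps that only record the order in which they
run. All toy_* names are hypothetical; the real functions take arguments
that are left out here.

```c
#include <stddef.h>
#include <string.h>

/* Records the order in which the stand-in steps run. */
static char toy_trace[8];
static size_t toy_trace_len;

static void toy_reset(void)
{
	memset(toy_trace, 0, sizeof(toy_trace));
	toy_trace_len = 0;
}

static void toy_step(char c)
{
	/* Keep the last byte as the terminating NUL. */
	if (toy_trace_len < sizeof(toy_trace) - 1)
		toy_trace[toy_trace_len++] = c;
}

/* Stubs standing in for the scan, assign and add steps. */
static void toy_create_and_scan(void)  { toy_step('S'); }
static void toy_assign_resources(void) { toy_step('A'); }
static void toy_add_devices(void)      { toy_step('D'); }

/* New entry point: scan only, leave adding devices to the caller. */
static void toy_scan_root_bus_no_add(void)
{
	toy_create_and_scan();
}

/* Existing entry point keeps its behaviour as a thin wrapper. */
static void toy_scan_root_bus(void)
{
	toy_scan_root_bus_no_add();
	toy_add_devices();
}

/* What an ARM-style caller could then do between scan and add. */
static void toy_arm_init(void)
{
	toy_scan_root_bus_no_add();
	toy_assign_resources();
	toy_add_devices();
}
```

Existing callers of the wrapper would see no change, while a caller like
toy_arm_init() gets to assign resources before devices are added.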

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-14 20:12   ` Thierry Reding
@ 2012-08-14 23:50     ` Bjorn Helgaas
  2012-08-15  6:37       ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-14 23:50 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Tue, Aug 14, 2012 at 1:12 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Thu, Jul 26, 2012 at 09:55:12PM +0200, Thierry Reding wrote:
>> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
>> index a094c97..c886dff 100644
>> --- a/arch/arm/boot/dts/tegra20.dtsi
>> +++ b/arch/arm/boot/dts/tegra20.dtsi
>> @@ -199,6 +199,68 @@
>>               #size-cells = <0>;
>>       };
>>
>> +     pcie-controller {
>> +             compatible = "nvidia,tegra20-pcie";
>> +             reg = <0x80003000 0x00000800   /* PADS registers */
>> +                    0x80003800 0x00000200   /* AFI registers */
>> +                    0x81000000 0x01000000   /* configuration space */
>> +                    0x90000000 0x10000000>; /* extended configuration space */
>> +             interrupts = <0 98 0x04   /* controller interrupt */
>> +                           0 99 0x04>; /* MSI interrupt */
>> +             status = "disabled";
>> +
>> +             ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
>> +                       0 1 0  0x81000000 0x00800000   /* port 0 config space */
>> +                       0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
>> +                       0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
>> +                       0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
>> +                       0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
>> +
>> +                       1 0 0  0x80001000 0x00001000   /* root port 1 */
>> +                       1 1 0  0x81800000 0x00800000   /* port 1 config space */
>> +                       1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
>> +                       1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
>> +                       1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
>> +                       1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
>
> I've been thinking about this some more. The translations for both the
> regular and extended configuration spaces are configured in the top-
> level PCIe controller. It is therefore wrong how they are passed to the
> PCI host bridges via the ranges property.
>
> I remember Mitch saying that it should be passed down to the children
> because it is partitioned among them, but since the layout is compatible
> with ECAM, the partitioning isn't as simple as what's in the tree. In
> fact the partitions will be dependent on the number of devices attached
> to the host bridges.

I don't understand this last bit about the number of devices attached
to the host bridges.  Logically, the host bridge has a bus number
aperture that you can know up front, even before you know anything
about what devices are below it.  On x86, for example, the ACPI _CRS
method has something like "[bus 00-7f]" in it, which means that any
buses in that range are below this bridge.  That doesn't tell us
anything about which buses actually have devices on them, of course;
it's just analogous to the secondary and subordinate bus number
registers in a P2P bridge.

>> +
>> +             #address-cells = <3>;
>> +             #size-cells = <1>;
>> +
>> +             pci@0 {
>> +                     device_type = "pciex";
>> +                     reg = <0 0 0 0x1000>;
>> +                     status = "disabled";
>> +
>> +                     #address-cells = <3>;
>> +                     #size-cells = <2>;
>> +
>> +                     ranges = <0x80000000 0 0  0 1 0  0 0x00800000   /* config space */
>> +                               0x90000000 0 0  0 2 0  0 0x08000000   /* ext config space */
>> +                               0x81000000 0 0  0 3 0  0 0x00010000   /* I/O */
>> +                               0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
>> +                               0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
>> +
>> +                     nvidia,num-lanes = <2>;
>> +             };
>> +
>> +             pci@1 {
>> +                     device_type = "pciex";
>> +                     reg = <1 0 0 0x1000>;
>> +                     status = "disabled";
>> +
>> +                     #address-cells = <3>;
>> +                     #size-cells = <2>;
>> +
>> +                     ranges = <0x80000000 0 0  1 1 0  0 0x00800000   /* config space */
>> +                               0x90000000 0 0  1 2 0  0 0x08000000   /* ext config space */
>> +                               0x81000000 0 0  1 3 0  0 0x00010000   /* I/O */
>> +                               0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
>> +                               0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
>> +
>> +                     nvidia,num-lanes = <2>;
>> +             };
>> +     };
>
> The same is true for the ranges properties of the PCI host bridge nodes.
> Which part of the configuration spaces maps to the children of the
> respective host bridge depends on the actual device hierarchy.
>
> Would it be possible to alternatively pass the complete range to the
> children without further partitioning?
>
> The driver doesn't actually care about the ranges property and only uses
> the values specified in the reg property of the pcie-controller node.
>
> Thierry

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-14 22:58               ` Stephen Warren
@ 2012-08-14 23:51                 ` Stephen Warren
  2012-08-15 19:04                   ` Stephen Warren
  2012-09-07 23:34                   ` Stephen Warren
  2012-08-15  0:08                 ` Bjorn Helgaas
  1 sibling, 2 replies; 79+ messages in thread
From: Stephen Warren @ 2012-08-14 23:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Thierry Reding, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/14/2012 04:58 PM, Stephen Warren wrote:
> On 08/14/2012 03:55 PM, Bjorn Helgaas wrote:
>> On Tue, Aug 14, 2012 at 12:58 PM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>>> On Tue, Aug 14, 2012 at 01:39:23PM -0600, Stephen Warren wrote:
>>>> On 08/13/2012 05:18 PM, Bjorn Helgaas wrote:
>>>>> On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>> ...
>>>>>> whereas for a device tree boot:
>>>>>>
>>>>>> (same):
>>>>>>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>>>>>>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>>>>>>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
>>>>>> ... (request region happens early)
>>>>>>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
>>>>>>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>>>>>> ... (same, just happens too late)
>>>>>>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>>>>>>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>>>>>>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>>>>>>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>>>>>>
>>>>>> I suspect this is all still related to the PCI devices themselves being
>>>>>> probed much earlier in the overall PCI initialization sequence when the
>>>>>> PCI controller is probed later in the boot sequence, whereas PCI device
>>>>>> probe is deferred until the overall PCI initialization sequence is
>>>>>> complete if the PCI controller is probed very early in the boot sequence.
>>>>>
>>>>> I don't know what to apply your patches to (they don't apply cleanly
>>>>> to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
>>>>> like you might be calling pci_bus_add_devices() before
>>>>> pci_bus_assign_resources(), which isn't going to work.
>>>>
>>>> Yes, that's exactly what is happening.
>>>>
>>>> PCIe initialization starts in arch/arm/mach-tegra/pcie.c
>>>> tegra_pcie_init() which calls arch/arm/kernel/bios32.c
>>>> pci_common_init(). That function first calls pcibios_init_hw() (in the
>>>> same file, more about this later) and then loops over PCI buses, calling
>>>> amongst other things pci_bus_assign_resources() then pci_bus_add_devices().
>>>>
>>>> The problem is that ARM's pcibios_init_hw() calls pci_scan_root_bus()
>>>> (or a host-driver-specific function which also calls
>>>> pci_scan_root_bus() in Tegra's case) which in turn calls
>>>> pci_bus_add_devices() right at the end, before control has returned to
>>>> pci_common_init() and hence before pci_bus_assign_resources() has been
>>>> called.
>>>>
>>>> If I modify pci_scan_root_bus() and remove the call to
>>>> pci_bus_add_devices(), everything works as expected.
>>>>
>>>> So, I guess the question is: Should ARM's pcibios_init_hw() not be
>>>> calling pci_scan_root_bus(), or at least presumably the ARM PCI code
>>>> needs to do things in a slightly different order?
>>
>> I think you need to do something like this instead of using pci_scan_root_bus():
>>
>>     pci_create_root_bus()
>>     pci_scan_child_bus()
>>     pci_bus_assign_resources()
>>     pci_bus_add_devices()
>>
>> This is the effective order used by most of the pci_create_root_bus() callers.
> 
> That would pretty much duplicate everything in pci_scan_root_bus(). That
> might cause divergence down the road.
> 
> Can't we make the call to pci_bus_add_devices() optional in
> pci_scan_root_bus() somehow; one of:

Sigh, that turns out not to work correctly; it solves at least this part
of the problem when booting using device tree, but when booting using a
board file, it causes the IRQ number passed to the PCIe device to be
bogus:-(

I give up for now.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-14 22:58               ` Stephen Warren
  2012-08-14 23:51                 ` Stephen Warren
@ 2012-08-15  0:08                 ` Bjorn Helgaas
  1 sibling, 0 replies; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-15  0:08 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Thierry Reding, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Tue, Aug 14, 2012 at 3:58 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 08/14/2012 03:55 PM, Bjorn Helgaas wrote:
>> On Tue, Aug 14, 2012 at 12:58 PM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>>> On Tue, Aug 14, 2012 at 01:39:23PM -0600, Stephen Warren wrote:
>>>> On 08/13/2012 05:18 PM, Bjorn Helgaas wrote:
>>>>> On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>> ...
>>>>>> whereas for a device tree boot:
>>>>>>
>>>>>> (same):
>>>>>>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>>>>>>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>>>>>>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
>>>>>> ... (request region happens early)
>>>>>>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
>>>>>>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>>>>>> ... (same, just happens too late)
>>>>>>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>>>>>>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>>>>>>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>>>>>>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>>>>>>
>>>>>> I suspect this is all still related to the PCI devices themselves being
>>>>>> probed much earlier in the overall PCI initialization sequence when the
>>>>>> PCI controller is probed later in the boot sequence, whereas PCI device
>>>>>> probe is deferred until the overall PCI initialization sequence is
>>>>>> complete if the PCI controller is probed very early in the boot sequence.
>>>>>
>>>>> I don't know what to apply your patches to (they don't apply cleanly
>>>>> to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
>>>>> like you might be calling pci_bus_add_devices() before
>>>>> pci_bus_assign_resources(), which isn't going to work.
>>>>
>>>> Yes, that's exactly what is happening.
>>>>
>>>> PCIe initialization starts in arch/arm/mach-tegra/pcie.c
>>>> tegra_pcie_init() which calls arch/arm/kernel/bios32.c
>>>> pci_common_init(). That function first calls pcibios_init_hw() (in the
>>>> same file, more about this later) and then loops over PCI buses, calling
>>>> amongst other things pci_bus_assign_resources() then pci_bus_add_devices().
>>>>
>>>> The problem is that ARM's pcibios_init_hw() calls pci_scan_root_bus()
>>>> (or a host-driver-specific function which also calls
>>>> pci_scan_root_bus() in Tegra's case) which in turn calls
>>>> pci_bus_add_devices() right at the end, before control has returned to
>>>> pci_common_init() and hence before pci_bus_assign_resources() has been
>>>> called.
>>>>
>>>> If I modify pci_scan_root_bus() and remove the call to
>>>> pci_bus_add_devices(), everything works as expected.
>>>>
>>>> So, I guess the question is: Should ARM's pcibios_init_hw() not be
>>>> calling pci_scan_root_bus(), or at least presumably the ARM PCI code
>>>> needs to do things in a slightly different order?
>>
>> I think you need to do something like this instead of using pci_scan_root_bus():
>>
>>     pci_create_root_bus()
>>     pci_scan_child_bus()
>>     pci_bus_assign_resources()
>>     pci_bus_add_devices()
>>
>> This is the effective order used by most of the pci_create_root_bus() callers.
>
> That would pretty much duplicate everything in pci_scan_root_bus(). That
> might cause divergence down the road.

That's true, but it is what most other architectures do, and if you do
it the same way, we'll be able to converge things more easily later.
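
Purely to illustrate why that order matters, here is a toy model in
plain C. It is not kernel code: the toy_* names and structures are
made up for this example, and the only assumption is that adding a
device lets its driver probe right away and that the probe needs an
assigned BAR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a BAR and a device; not kernel structures. */
struct toy_bar { uint32_t start, size; bool assigned; };
struct toy_dev { struct toy_bar bar; bool driver_bound; };

/* Stand-in for resource assignment: place the BAR at the next free address. */
static void toy_assign_resources(struct toy_dev *dev, uint32_t *next_free)
{
	assert(dev->bar.size != 0);
	dev->bar.start = *next_free;
	dev->bar.assigned = true;
	*next_free += dev->bar.size;
}

/* Stand-in for adding the device: the driver probes at once and
 * can only bind if its BAR has already been assigned. */
static bool toy_add_device(struct toy_dev *dev)
{
	dev->driver_bound = dev->bar.assigned;
	return dev->driver_bound;
}
```

Calling toy_add_device() before toy_assign_resources() fails to bind,
which is the same shape as the "can't reserve" message in the log
above; the other order binds.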

> Can't we make the call to pci_bus_add_devices() optional in
> pci_scan_root_bus() somehow; one of:
>
> * Add a parameter to pci_scan_root_bus() controlling this.
>
> (rather a large patch)
>
> * Split pci_scan_root_bus() into pci_scan_root_bus() and
> pci_scan_root_bus_no_add(), such that pci_scan_root_bus() is just a
> wrapper that calls pci_scan_root_bus_no_add() then pci_bus_add_devices().
>
> (very simple patch, and the new function can easily be used as/when it's
> needed, e.g. enabled just for Tegra in 3.6 to reduce risk of regressions)
>
> * Add a flag to struct pci_bus that requests pci_scan_root_bus() skip
> the call to pci_bus_add_devices().
>
> (a flag in the bus struct just for one function seems a little
> circuitous, but perhaps OK)
>
> * ifdef out the call to pci_bus_add_devices(), if building for ARM.
>
> (very simple, and probably correct)

I'd rather not add more variants of pci_scan_root_bus().  We already
have several very similar things in drivers/pci/probe.c:

  pci_scan_bus()
  pci_scan_bus_parented()
  pci_scan_root_bus()

And in addition, we have many callers of pci_create_root_bus() that
look very much like one of these.  I'm trying to consolidate all this,
but there's a fair amount of work, and I think the simplest thing is
to just use pci_create_root_bus() for now.

> Actually, I'm not totally convinced some other archs shouldn't skip this
> too; while I couldn't find any other arch that explicitly calls
> pci_bus_assign_resources() and pci_bus_add_devices() after
> pci_scan_root_bus(), I did see some that call
> pci_assign_unassigned_resources() which seems like it might be due to a
> similar situation?

Yes, that does sound like a similar problem.  In the past, resource
assignment has been pretty much separate from enumeration, with the
arch having the responsibility to enumerate, assign, then add devices.
But that is error-prone and needlessly arch-specific, and I'd like to
pull it into generic code.  It's just not done yet :)

> * Add another pcibios_*() callback that pci_scan_root_bus() calls to
> determine whether to call pci_bus_add_devices(), with default
> implementation.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-14 23:50     ` Bjorn Helgaas
@ 2012-08-15  6:37       ` Thierry Reding
  2012-08-15 12:18         ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-15  6:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 4149 bytes --]

On Tue, Aug 14, 2012 at 04:50:26PM -0700, Bjorn Helgaas wrote:
> On Tue, Aug 14, 2012 at 1:12 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Thu, Jul 26, 2012 at 09:55:12PM +0200, Thierry Reding wrote:
> >> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> >> index a094c97..c886dff 100644
> >> --- a/arch/arm/boot/dts/tegra20.dtsi
> >> +++ b/arch/arm/boot/dts/tegra20.dtsi
> >> @@ -199,6 +199,68 @@
> >>               #size-cells = <0>;
> >>       };
> >>
> >> +     pcie-controller {
> >> +             compatible = "nvidia,tegra20-pcie";
> >> +             reg = <0x80003000 0x00000800   /* PADS registers */
> >> +                    0x80003800 0x00000200   /* AFI registers */
> >> +                    0x81000000 0x01000000   /* configuration space */
> >> +                    0x90000000 0x10000000>; /* extended configuration space */
> >> +             interrupts = <0 98 0x04   /* controller interrupt */
> >> +                           0 99 0x04>; /* MSI interrupt */
> >> +             status = "disabled";
> >> +
> >> +             ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
> >> +                       0 1 0  0x81000000 0x00800000   /* port 0 config space */
> >> +                       0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
> >> +                       0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
> >> +                       0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
> >> +                       0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
> >> +
> >> +                       1 0 0  0x80001000 0x00001000   /* root port 1 */
> >> +                       1 1 0  0x81800000 0x00800000   /* port 1 config space */
> >> +                       1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
> >> +                       1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
> >> +                       1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
> >> +                       1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
> >
> > I've been thinking about this some more. The translations for both the
> > regular and extended configuration spaces are configured in the top-
> > level PCIe controller. It is therefore wrong how they are passed to the
> > PCI host bridges via the ranges property.
> >
> > I remember Mitch saying that it should be passed down to the children
> > because it is partitioned among them, but since the layout is compatible
> > with ECAM, the partitioning isn't as simple as what's in the tree. In
> > fact the partitions will be dependent on the number of devices attached
> > to the host bridges.
> 
> I don't understand this last bit about the number of devices attached
> to the host bridges.  Logically, the host bridge has a bus number
> aperture that you can know up front, even before you know anything
> about what devices are below it.  On x86, for example, the ACPI _CRS
> method has something like "[bus 00-7f]" in it, which means that any
> buses in that range are below this bridge.  That doesn't tell us
> anything about which buses actually have devices on them, of course;
> it's just analogous to the secondary and subordinate bus number
> registers in a P2P bridge.

That's one of the issues I still need to take care of. Currently no bus
resource is attached to the individual bridges (nor the PCI controller
for that matter), so the PCI core will assign them dynamically. If this
range is known at boot time we could assign ECAM ranges based on the bus
numbers. Standard ECAM ranges, that is. On Tegra this won't work because
as Stephen mentioned in a previous mail, the bus field is not the top
field in the ECAM addresses. Basically what you have is this:

	[27:24] upper 4 bits of the register address for extended
	        configuration space
	[23:16] bus number
	[15:11] device number
	[10: 8] device function
	[ 7: 0] register

So the ECAM space cannot be partitioned by bus number.
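
To make the difference concrete, here is a minimal standalone sketch
comparing the standard ECAM offset with the layout listed above. The
helper names are made up for illustration and are not what the driver
uses, and I'm assuming the low register field is bits [7:0] so that it
doesn't overlap the function field.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCIe ECAM offset:
 * bus [27:20], device [19:15], function [14:12], register [11:0]. */
static uint32_t ecam_offset(unsigned bus, unsigned dev, unsigned fn,
			    unsigned reg)
{
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
	return (bus << 20) | (dev << 15) | (fn << 12) | reg;
}

/* Layout described above (hypothetical helper name):
 * reg[11:8] at [27:24], bus [23:16], device [15:11],
 * function [10:8], reg[7:0] at [7:0]. */
static uint32_t tegra_cfg_offset(unsigned bus, unsigned dev, unsigned fn,
				 unsigned reg)
{
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
	return ((reg & 0xf00u) << 16) | (bus << 16) | (dev << 11) |
	       (fn << 8) | (reg & 0xffu);
}
```

With standard ECAM every offset for bus 0 is below the first offset for
bus 1. With the layout above, register 0x100 of bus 0 already lands at
0x1000000, above everything in the low 256 bytes of bus 255, so one
bus's registers are not contiguous.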

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 05/10] resource: add PCI configuration space support
  2012-08-14 21:44           ` Bjorn Helgaas
@ 2012-08-15  6:49             ` Thierry Reding
  2012-08-16 15:18               ` Stephen Warren
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-15  6:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 8045 bytes --]

On Tue, Aug 14, 2012 at 02:44:24PM -0700, Bjorn Helgaas wrote:
> On Tue, Aug 14, 2012 at 11:01 AM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Tue, Aug 14, 2012 at 10:38:08AM -0700, Bjorn Helgaas wrote:
> >> On Mon, Aug 13, 2012 at 10:55 PM, Thierry Reding
> >> <thierry.reding@avionic-design.de> wrote:
> >> > On Mon, Aug 13, 2012 at 10:00:45PM -0700, Bjorn Helgaas wrote:
> >> >> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
> >> >> <thierry.reding@avionic-design.de> wrote:
> >> >> > This commit adds a new flag that allows marking resources as PCI
> >> >> > configuration space.
> >> >> >
> >> >> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> >> >> > ---
> >> >> > Changes in v3:
> >> >> > - new patch
> >> >> >
> >> >> >  include/linux/ioport.h | 2 +-
> >> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >> >
> >> >> > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> >> >> > index 589e0e7..3314843 100644
> >> >> > --- a/include/linux/ioport.h
> >> >> > +++ b/include/linux/ioport.h
> >> >> > @@ -102,7 +102,7 @@ struct resource {
> >> >> >
> >> >> >  /* PCI control bits.  Shares IORESOURCE_BITS with above PCI ROM.  */
> >> >> >  #define IORESOURCE_PCI_FIXED           (1<<4)  /* Do not move resource */
> >> >> > -
> >> >> > +#define IORESOURCE_PCI_CS              (1<<5)  /* PCI configuration space */
> >> >>
> >> >> What is the purpose of this?  It seems that you are marking regions
> >> >> that we call MMCONFIG on x86, or ECAM-type regions in the language of
> >> >> the PCIe spec.  I see that you set it in several places, but I don't
> >> >> see anything that ever looks for it.  Do you have plans to use it in
> >> >> the future?  If it really does correspond to MMCONFIG/ECAM, we should
> >> >> handle those regions consistently across all architectures.
> >> >
> >> > The purpose is ultimately to obtain the MMCONFIG/ECAM resources assigned
> >> > to a PCI host controller. I've used this in the of_pci_parse_ranges()
> >> > and in the static board setup code to mark ranges as such. Perhaps
> >> > IORESOURCE_ECAM or IORESOURCE_MMCONFIG might have been better names. I
> >> > also just noticed that I'm not using this anywhere, but the plan was to
> >> > eventually use it with platform_get_resource(). However that doesn't
> >> > seem to work either because the lower bits of the flags aren't used for
> >> > comparison in that function.
> >> >
> >> > Any other ideas how that could be handled? Basically what I need is a
> >> > way to mark a resource as an MMCONFIG/ECAM range so that it can be used
> >> > to program the PCI host controller accordingly. I don't know how these
> >> > are assigned on x86. I was under the impression that the MMCONFIG/ECAM
> >> > space was accessed through a single address/data register pair.
> >>
> >> The legacy config access mechanism (CF8h/CFCh registers described in
> >> PCI 3.0 spec sec 3.2.2.3.2) is a single address/data pair, but this is
> >> mostly x86-specific.  The ECAM mechanism (described in the PCIe 3.0
> >> spec sec 7.2.2) is not a single address/data pair; instead, each byte
> >> of config space is directly mapped into the host's MMIO space.
> >>
> >> Here's what we do on x86 (omitting some historical grunge that
> >> complicates things):
> >>
> >>   - Discover the host bridge via a PNP0A08 device in the ACPI namespace.
> >>   - Discover the bus number range behind the bridge using a _CRS
> >> method in the PNP0A08 device.
> >>   - Discover the ECAM space for those buses via a _CBA method in the
> >> PNP0A08 device.
> >>   - Tell the config accessors (struct pci_ops) that the ECAM space for
> >> buses A-B is at address X.
> >>   - Enumerate the devices behind the host bridge by calling
> >> pci_scan_root_bus(), passing the config accessors.
> >>
> >> It sounds like you want a way to parse the resources at one point,
> >> saving them and marking the ECAM region, then at a later point, look
> >> up the ECAM from a saved list.  We don't do that on x86 because the
> >> config accessors keep an internal list of the ECAM area for each bus.
> >>
> >> We do of course want to put this ECAM space in the IORESOURCE_MEM tree
> >> because it consumes address space and we have to make sure we don't
> >> put anything else on top of it.  But we don't have any reason to
> >> describe the MMIO -> config space function in that tree.  From the
> >> point of view of the rest of the system, it's just MMIO space that's
> >> consumed by the PCI host bridge, just like any other device-specific
> >> MMIO area.
> >
> > What I currently do is pass the ECAM space as a resource to the PCI host
> > bridge platform device. Regular and extended configuration spaces are
> > given by the third and fourth resources, respectively. If I understand
> > correctly, you're saying that nothing beyond that needs to be encoded.
> > In other words it is enough for the PCI host bridge driver to know where
> > to take the data from.
> 
> Yes, I think so.
> 
> I'd like to someday make ECAM support generic, since it's specified by
> the PCIe spec.  The spec doesn't differentiate between PCI
> configuration space (offsets 0-0xff) and PCI Express *extended*
> configuration space (offsets 0x100-0xfff).  But it sounds like Tegra
> might have a memory-mapped configuration mechanism that is similar to
> ECAM but with a different address map, since you mention two resources
> (third and fourth).  That's not a problem, since you're doing a
> Tegra-specific solution right now anyway, but something to keep in the
> back of your mind if we ever do make a generic ECAM solution.

Something that's kept me wondering is why there actually need to be two
separate regions. Since the extended configuration space is a superset
of the regular configuration space it should be possible to access the
regular registers via the extended configuration space as well.

Stephen: Could you try to find out whether the regular configuration
space translation can just be omitted if we already set up the one for
the extended configuration space? In tegra_pcie_setup_translations(),
BAR 0 is set up for regular configuration space (which requires a 16 MiB
region), while BAR 1 is set up for the extended configuration space
(requiring a full 256 MiB region). However, if I understand correctly,
each of the registers that can be accessed via the BAR 0 translation can
also be accessed via the BAR 1 translation. That seems like we're
wasting the 16 MiB set aside for the BAR 0 mapping.
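
For reference, the two window sizes follow directly from the PCI
topology limits (256 buses, 32 devices per bus, 8 functions per device)
multiplied by the per-function configuration space size. A small
sketch, with a helper name invented for this example:

```c
#include <assert.h>
#include <stdint.h>

enum { PCI_BUSES = 256, PCI_DEVS_PER_BUS = 32, PCI_FNS_PER_DEV = 8 };

/* Bytes needed to map every possible function's configuration space
 * when each function gets bytes_per_function bytes. */
static uint64_t cfg_window_bytes(uint64_t bytes_per_function)
{
	assert(bytes_per_function > 0);
	return (uint64_t)PCI_BUSES * PCI_DEVS_PER_BUS * PCI_FNS_PER_DEV *
	       bytes_per_function;
}
```

With 256 bytes per function this gives 16 MiB, and with 4096 bytes per
function it gives 256 MiB, matching the two regions.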

> > I'll have to see what this means for the DT binding. There are other
> > issues that I need to think about, like for example how to pass the ECAM
> > space from the PCI host controller to each of the two bridges via the
> > ranges property. This no longer makes sense in the current form, as the
> > ECAM covers the configuration spaces for devices of both bridges and
> > cannot really be split among them.
> 
> We have the same situation on x86.  Part of the historical grunge that
> I omitted is that there's a static ACPI table (MCFG) that can tell you
> the ECAM range for a bus number range.  Often that bus number range
> includes several host bridges.  On x86, we have a single set of ECAM
> config accessors (see pci_mmcfg), and they maintain a list of "bus
> number -> ECAM addr" mappings internally, independent of which host
> bridge leads to the buses.

That's very similar to how things work on Tegra. Configuration space
accesses to the host bridges are done via their respective memory-mapped
register spaces. All other accesses use either the BAR 0 or BAR 1
translations as I described above. These translations however are unique
and only a single set of accessors is provided.

Given that and what I said in the other mail about the implementation of
ECAM on Tegra, any CS range we pass to the host bridges via the DT ranges
property is actually wrong.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-15  6:37       ` Thierry Reding
@ 2012-08-15 12:18         ` Bjorn Helgaas
  2012-08-15 12:30           ` Thierry Reding
  2012-08-16 12:15           ` Thierry Reding
  0 siblings, 2 replies; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-15 12:18 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Tue, Aug 14, 2012 at 11:37 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Tue, Aug 14, 2012 at 04:50:26PM -0700, Bjorn Helgaas wrote:
>> On Tue, Aug 14, 2012 at 1:12 PM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>> > On Thu, Jul 26, 2012 at 09:55:12PM +0200, Thierry Reding wrote:
>> >> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
>> >> index a094c97..c886dff 100644
>> >> --- a/arch/arm/boot/dts/tegra20.dtsi
>> >> +++ b/arch/arm/boot/dts/tegra20.dtsi
>> >> @@ -199,6 +199,68 @@
>> >>               #size-cells = <0>;
>> >>       };
>> >>
>> >> +     pcie-controller {
>> >> +             compatible = "nvidia,tegra20-pcie";
>> >> +             reg = <0x80003000 0x00000800   /* PADS registers */
>> >> +                    0x80003800 0x00000200   /* AFI registers */
>> >> +                    0x81000000 0x01000000   /* configuration space */
>> >> +                    0x90000000 0x10000000>; /* extended configuration space */
>> >> +             interrupts = <0 98 0x04   /* controller interrupt */
>> >> +                           0 99 0x04>; /* MSI interrupt */
>> >> +             status = "disabled";
>> >> +
>> >> +             ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
>> >> +                       0 1 0  0x81000000 0x00800000   /* port 0 config space */
>> >> +                       0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
>> >> +                       0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
>> >> +                       0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
>> >> +                       0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
>> >> +
>> >> +                       1 0 0  0x80001000 0x00001000   /* root port 1 */
>> >> +                       1 1 0  0x81800000 0x00800000   /* port 1 config space */
>> >> +                       1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
>> >> +                       1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
>> >> +                       1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
>> >> +                       1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
>> >
>> > I've been thinking about this some more. The translations for both the
>> > regular and extended configuration spaces are configured in the top-
>> > level PCIe controller. It is therefore wrong how they are passed to the
>> > PCI host bridges via the ranges property.
>> >
>> > I remember Mitch saying that it should be passed down to the children
>> > because it is partitioned among them, but since the layout is compatible
>> > with ECAM, the partitioning isn't as simple as what's in the tree. In
>> > fact the partitions will be dependent on the number of devices attached
>> > to the host bridges.
>>
>> I don't understand this last bit about the number of devices attached
>> to the host bridges.  Logically, the host bridge has a bus number
>> aperture that you can know up front, even before you know anything
>> about what devices are below it.  On x86, for example, the ACPI _CRS
>> method has something like "[bus 00-7f]" in it, which means that any
>> buses in that range are below this bridge.  That doesn't tell us
>> anything about which buses actually have devices on them, of course;
>> it's just analogous to the secondary and subordinate bus number
>> registers in a P2P bridge.
>
> That's one of the issues I still need to take care of. Currently no bus
> resource is attached to the individual bridges (nor the PCI controller
> for that matter), so the PCI core will assign them dynamically.

So your PCI controller driver knows how to program the controller bus
number aperture?  Sometimes people start by assuming that two host
bridges both have [bus 00-ff] apertures, then they enumerate below the
first and adjust the bus number apertures based on what they found.
For example, if they found buses 00-12 behind the first bridge, they
make the apertures [bus 00-12] for the first bridge and [bus 13-ff]
for the second.  That might be the case, depending on what firmware
set up, but it seems like a dubious way to do it, and of course it
precludes a lot of hot-plug scenarios.

> If this
> range is known at boot time we could assign ECAM ranges based on the bus
> numbers. Standard ECAM ranges, that is. On Tegra this won't work because
> as Stephen mentioned in a previous mail, the bus field is not the top
> field in the ECAM addresses. Basically what you have is this:
>
>         [27:24] upper 4 bits of the register address for extended
>                 configuration space
>         [23:16] bus number
>         [15:11] device number
>         [10: 8] device function
>         [ 7: 0] register
>
> So the ECAM space cannot be partitioned by bus number.

Ah, OK, so definitely not standard PCIe ECAM.

Bjorn
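
For illustration, the difference between the two layouts can be sketched
in plain C. This is only a userspace sketch, not driver code: the function
names are made up for this example, the standard ECAM layout is the usual
bus[27:20]/device[19:15]/function[14:12]/register[11:0] split, and the
Tegra variant assumes that the low byte of the register number sits in
bits [7:0] with the upper four bits in [27:24].

```c
#include <stdint.h>

/* Standard PCIe ECAM offset: the bus number is the topmost field. */
static uint32_t ecam_offset(unsigned int bus, unsigned int dev,
			    unsigned int fn, unsigned int reg)
{
	return ((uint32_t)(bus & 0xff) << 20) |
	       ((uint32_t)(dev & 0x1f) << 15) |
	       ((uint32_t)(fn & 0x7) << 12) |
	       (uint32_t)(reg & 0xfff);
}

/*
 * Hypothetical helper for the layout quoted above: the upper four bits
 * of the 12-bit register number are placed above the bus field.
 */
static uint32_t tegra_ext_offset(unsigned int bus, unsigned int dev,
				 unsigned int fn, unsigned int reg)
{
	return ((uint32_t)((reg >> 8) & 0xf) << 24) |
	       ((uint32_t)(bus & 0xff) << 16) |
	       ((uint32_t)(dev & 0x1f) << 11) |
	       ((uint32_t)(fn & 0x7) << 8) |
	       (uint32_t)(reg & 0xff);
}
```

With the standard layout every bus owns one contiguous 1 MiB window, so a
bus range maps to a contiguous address range. With the second layout the
registers of a single bus at offset 0x100 and above land 16 MiB away from
its first 256 bytes, which is why the space cannot be carved up by bus
number.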

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-15 12:18         ` Bjorn Helgaas
@ 2012-08-15 12:30           ` Thierry Reding
  2012-08-15 14:36             ` Bjorn Helgaas
  2012-08-16 12:15           ` Thierry Reding
  1 sibling, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-15 12:30 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 5150 bytes --]

On Wed, Aug 15, 2012 at 05:18:04AM -0700, Bjorn Helgaas wrote:
> On Tue, Aug 14, 2012 at 11:37 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Tue, Aug 14, 2012 at 04:50:26PM -0700, Bjorn Helgaas wrote:
> >> On Tue, Aug 14, 2012 at 1:12 PM, Thierry Reding
> >> <thierry.reding@avionic-design.de> wrote:
> >> > On Thu, Jul 26, 2012 at 09:55:12PM +0200, Thierry Reding wrote:
> >> >> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> >> >> index a094c97..c886dff 100644
> >> >> --- a/arch/arm/boot/dts/tegra20.dtsi
> >> >> +++ b/arch/arm/boot/dts/tegra20.dtsi
> >> >> @@ -199,6 +199,68 @@
> >> >>               #size-cells = <0>;
> >> >>       };
> >> >>
> >> >> +     pcie-controller {
> >> >> +             compatible = "nvidia,tegra20-pcie";
> >> >> +             reg = <0x80003000 0x00000800   /* PADS registers */
> >> >> +                    0x80003800 0x00000200   /* AFI registers */
> >> >> +                    0x81000000 0x01000000   /* configuration space */
> >> >> +                    0x90000000 0x10000000>; /* extended configuration space */
> >> >> +             interrupts = <0 98 0x04   /* controller interrupt */
> >> >> +                           0 99 0x04>; /* MSI interrupt */
> >> >> +             status = "disabled";
> >> >> +
> >> >> +             ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
> >> >> +                       0 1 0  0x81000000 0x00800000   /* port 0 config space */
> >> >> +                       0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
> >> >> +                       0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
> >> >> +                       0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
> >> >> +                       0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
> >> >> +
> >> >> +                       1 0 0  0x80001000 0x00001000   /* root port 1 */
> >> >> +                       1 1 0  0x81800000 0x00800000   /* port 1 config space */
> >> >> +                       1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
> >> >> +                       1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
> >> >> +                       1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
> >> >> +                       1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
> >> >
> >> > I've been thinking about this some more. The translations for both the
> >> > regular and extended configuration spaces are configured in the top-
> >> > level PCIe controller. It is therefore wrong how they are passed to the
> >> > PCI host bridges via the ranges property.
> >> >
> >> > I remember Mitch saying that it should be passed down to the children
> >> > because it is partitioned among them, but since the layout is compatible
> >> > with ECAM, the partitioning isn't as simple as what's in the tree. In
> >> > fact the partitions will be dependent on the number of devices attached
> >> > to the host bridges.
> >>
> >> I don't understand this last bit about the number of devices attached
> >> to the host bridges.  Logically, the host bridge has a bus number
> >> aperture that you can know up front, even before you know anything
> >> about what devices are below it.  On x86, for example, the ACPI _CRS
> >> method has something like "[bus 00-7f]" in it, which means that any
> >> buses in that range are below this bridge.  That doesn't tell us
> >> anything about which buses actually have devices on them, of course;
> >> it's just analogous to the secondary and subordinate bus number
> >> registers in a P2P bridge.
> >
> > That's one of the issues I still need to take care of. Currently no bus
> > resource is attached to the individual bridges (nor the PCI controller
> > for that matter), so the PCI core will assign them dynamically.
> 
> So your PCI controller driver knows how to program the controller bus
> number aperture?  Sometimes people start by assuming that two host
> bridges both have [bus 00-ff] apertures, then they enumerate below the
> first and adjust the bus number apertures based on what they found.
> For example, if they found buses 00-12 behind the first bridge, they
> make the apertures [bus 00-12] for the first bridge and [bus 13-ff]
> for the second.  That might be the case, depending on what firmware
> set up, but it seems like a dubious way to do it, and of course it
> precludes a lot of hot-plug scenarios.

No, that's not what I meant. What happens is that no pre-assigned bus
range is specified for either of the host bridges, so that the range
0x00-0xff will be assigned by default in pci_scan_root_bus(). If I
understand correctly, what needs to be done is partition the bus range
between the two bridges (equally?). That would allow hot-plug scenarios
and be more in line with how other architectures do things.

I don't know if the Tegra PCIe controller supports hot-plug, though, so
maybe that wouldn't even be an issue and dynamic assignment would be
okay.

Thierry
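
As a rough sketch of what an equal split could look like, assuming
nothing about what the hardware actually requires: the structure and the
function below are hypothetical names for this example and are not part
of the PCI core.

```c
#include <stdint.h>

/* Inclusive range of bus numbers decoded by one host bridge. */
struct bus_aperture {
	uint8_t start;
	uint8_t end;
};

/*
 * Split [bus 00-ff] into 'count' equally sized apertures and return the
 * one with the given index. The last aperture absorbs any remainder.
 * Callers must pass 1 <= count <= 256 and index < count.
 */
static struct bus_aperture split_bus_range(unsigned int index,
					   unsigned int count)
{
	unsigned int span = 256 / count;
	struct bus_aperture ap;

	ap.start = (uint8_t)(index * span);
	ap.end = (index == count - 1) ? 0xff
				      : (uint8_t)((index + 1) * span - 1);
	return ap;
}
```

For two bridges this yields [bus 00-7f] and [bus 80-ff]. Whether an equal
split is the right policy is exactly the open question above.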

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-15 12:30           ` Thierry Reding
@ 2012-08-15 14:36             ` Bjorn Helgaas
  2012-08-15 14:57               ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-15 14:36 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Wed, Aug 15, 2012 at 5:30 AM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Wed, Aug 15, 2012 at 05:18:04AM -0700, Bjorn Helgaas wrote:
>> On Tue, Aug 14, 2012 at 11:37 PM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>> > On Tue, Aug 14, 2012 at 04:50:26PM -0700, Bjorn Helgaas wrote:
>> >> On Tue, Aug 14, 2012 at 1:12 PM, Thierry Reding
>> >> <thierry.reding@avionic-design.de> wrote:
>> >> > On Thu, Jul 26, 2012 at 09:55:12PM +0200, Thierry Reding wrote:
>> >> >> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
>> >> >> index a094c97..c886dff 100644
>> >> >> --- a/arch/arm/boot/dts/tegra20.dtsi
>> >> >> +++ b/arch/arm/boot/dts/tegra20.dtsi
>> >> >> @@ -199,6 +199,68 @@
>> >> >>               #size-cells = <0>;
>> >> >>       };
>> >> >>
>> >> >> +     pcie-controller {
>> >> >> +             compatible = "nvidia,tegra20-pcie";
>> >> >> +             reg = <0x80003000 0x00000800   /* PADS registers */
>> >> >> +                    0x80003800 0x00000200   /* AFI registers */
>> >> >> +                    0x81000000 0x01000000   /* configuration space */
>> >> >> +                    0x90000000 0x10000000>; /* extended configuration space */
>> >> >> +             interrupts = <0 98 0x04   /* controller interrupt */
>> >> >> +                           0 99 0x04>; /* MSI interrupt */
>> >> >> +             status = "disabled";
>> >> >> +
>> >> >> +             ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
>> >> >> +                       0 1 0  0x81000000 0x00800000   /* port 0 config space */
>> >> >> +                       0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
>> >> >> +                       0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
>> >> >> +                       0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
>> >> >> +                       0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
>> >> >> +
>> >> >> +                       1 0 0  0x80001000 0x00001000   /* root port 1 */
>> >> >> +                       1 1 0  0x81800000 0x00800000   /* port 1 config space */
>> >> >> +                       1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
>> >> >> +                       1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
>> >> >> +                       1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
>> >> >> +                       1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
>> >> >
>> >> > I've been thinking about this some more. The translations for both the
>> >> > regular and extended configuration spaces are configured in the top-
>> >> > level PCIe controller. It is therefore wrong how they are passed to the
>> >> > PCI host bridges via the ranges property.
>> >> >
>> >> > I remember Mitch saying that it should be passed down to the children
>> >> > because it is partitioned among them, but since the layout is compatible
>> >> > with ECAM, the partitioning isn't as simple as what's in the tree. In
>> >> > fact the partitions will be dependent on the number of devices attached
>> >> > to the host bridges.
>> >>
>> >> I don't understand this last bit about the number of devices attached
>> >> to the host bridges.  Logically, the host bridge has a bus number
>> >> aperture that you can know up front, even before you know anything
>> >> about what devices are below it.  On x86, for example, the ACPI _CRS
>> >> method has something like "[bus 00-7f]" in it, which means that any
>> >> buses in that range are below this bridge.  That doesn't tell us
>> >> anything about which buses actually have devices on them, of course;
>> >> it's just analogous to the secondary and subordinate bus number
>> >> registers in a P2P bridge.
>> >
>> > That's one of the issues I still need to take care of. Currently no bus
>> > resource is attached to the individual bridges (nor the PCI controller
>> > for that matter), so the PCI core will assign them dynamically.
>>
>> So your PCI controller driver knows how to program the controller bus
>> number aperture?  Sometimes people start by assuming that two host
>> bridges both have [bus 00-ff] apertures, then they enumerate below the
>> first and adjust the bus number apertures based on what they found.
>> For example, if they found buses 00-12 behind the first bridge, they
>> make the apertures [bus 00-12] for the first bridge and [bus 13-ff]
>> for the second.  That might be the case, depending on what firmware
>> set up, but it seems like a dubious way to do it, and of course it
>> precludes a lot of hot-plug scenarios.
>
> No, that's not what I meant. What happens is that no pre-assigned bus
> range is specified for either of the host bridges, so that the range
> 0x00-0xff will be assigned by default in pci_scan_root_bus().

My concern is about making the kernel's idea of the host bridge bus
number aperture match what the hardware is doing.  I'm pretty sure
that the default [bus 00-ff] range assigned by pci_scan_root_bus()
doesn't actually match the hardware in most cases, at least when we
have multiple host bridges in the same PCI domain.

For example, if you don't supply a bus number range,
pci_scan_root_bus() will assume [bus 00-ff] for both host bridges.
But if you could put an analyzer on each of the root buses and then
read bus 0 config space, will you see that config transaction on
*both* buses?  I doubt it.

You have to know at least the bus number of the root bus up front
before you can even start enumerating it.  The only way to learn that
is by reading registers in the host bridge or by some external
mechanism like ACPI or device tree.  That's the beginning of the bus
number aperture.  The end of the aperture is similar: we can't
reliably determine it by enumerating devices below the host bridge, so
we have to know it up front.  You can enumerate starting with the root
bus number and assigning new subordinate bus numbers as necessary, but
unless you know the host bridge aperture to begin with, you could
inadvertently assign a new bus number that actually belongs to a
different host bridge.

Bjorn
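
The ambiguity described here can be shown with a small userspace sketch.
The type and helper names are invented for this example; the point is
only that two bridges which both claim the default [bus 00-ff] overlap,
so a bus number cannot be attributed to one of them.

```c
#include <stdint.h>

/* Inclusive range of bus numbers claimed by one host bridge. */
struct bus_aperture {
	uint8_t start;
	uint8_t end;
};

/* Non-zero if 'bus' lies inside the aperture. */
static int aperture_contains(struct bus_aperture ap, uint8_t bus)
{
	return bus >= ap.start && bus <= ap.end;
}

/* Non-zero if the two apertures share at least one bus number. */
static int apertures_overlap(struct bus_aperture a, struct bus_aperture b)
{
	return a.start <= b.end && b.start <= a.end;
}
```

With both bridges left at [bus 00-ff], apertures_overlap() is true and
bus 0 is "contained" in both. With [bus 00-7f] and [bus 80-ff] there is
no overlap, and a newly assigned subordinate bus number can be checked
against the aperture of the bridge it is being assigned under.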

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-15 14:36             ` Bjorn Helgaas
@ 2012-08-15 14:57               ` Thierry Reding
  2012-08-15 20:25                 ` Arnd Bergmann
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-15 14:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 6870 bytes --]

On Wed, Aug 15, 2012 at 07:36:24AM -0700, Bjorn Helgaas wrote:
> On Wed, Aug 15, 2012 at 5:30 AM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Wed, Aug 15, 2012 at 05:18:04AM -0700, Bjorn Helgaas wrote:
> >> On Tue, Aug 14, 2012 at 11:37 PM, Thierry Reding
> >> <thierry.reding@avionic-design.de> wrote:
> >> > On Tue, Aug 14, 2012 at 04:50:26PM -0700, Bjorn Helgaas wrote:
> >> >> On Tue, Aug 14, 2012 at 1:12 PM, Thierry Reding
> >> >> <thierry.reding@avionic-design.de> wrote:
> >> >> > On Thu, Jul 26, 2012 at 09:55:12PM +0200, Thierry Reding wrote:
> >> >> >> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> >> >> >> index a094c97..c886dff 100644
> >> >> >> --- a/arch/arm/boot/dts/tegra20.dtsi
> >> >> >> +++ b/arch/arm/boot/dts/tegra20.dtsi
> >> >> >> @@ -199,6 +199,68 @@
> >> >> >>               #size-cells = <0>;
> >> >> >>       };
> >> >> >>
> >> >> >> +     pcie-controller {
> >> >> >> +             compatible = "nvidia,tegra20-pcie";
> >> >> >> +             reg = <0x80003000 0x00000800   /* PADS registers */
> >> >> >> +                    0x80003800 0x00000200   /* AFI registers */
> >> >> >> +                    0x81000000 0x01000000   /* configuration space */
> >> >> >> +                    0x90000000 0x10000000>; /* extended configuration space */
> >> >> >> +             interrupts = <0 98 0x04   /* controller interrupt */
> >> >> >> +                           0 99 0x04>; /* MSI interrupt */
> >> >> >> +             status = "disabled";
> >> >> >> +
> >> >> >> +             ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
> >> >> >> +                       0 1 0  0x81000000 0x00800000   /* port 0 config space */
> >> >> >> +                       0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
> >> >> >> +                       0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
> >> >> >> +                       0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
> >> >> >> +                       0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
> >> >> >> +
> >> >> >> +                       1 0 0  0x80001000 0x00001000   /* root port 1 */
> >> >> >> +                       1 1 0  0x81800000 0x00800000   /* port 1 config space */
> >> >> >> +                       1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
> >> >> >> +                       1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
> >> >> >> +                       1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
> >> >> >> +                       1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
> >> >> >
> >> >> > I've been thinking about this some more. The translations for both the
> >> >> > regular and extended configuration spaces are configured in the top-
> >> >> > level PCIe controller. It is therefore wrong how they are passed to the
> >> >> > PCI host bridges via the ranges property.
> >> >> >
> >> >> > I remember Mitch saying that it should be passed down to the children
> >> >> > because it is partitioned among them, but since the layout is compatible
> >> >> > with ECAM, the partitioning isn't as simple as what's in the tree. In
> >> >> > fact the partitions will be dependent on the number of devices attached
> >> >> > to the host bridges.
> >> >>
> >> >> I don't understand this last bit about the number of devices attached
> >> >> to the host bridges.  Logically, the host bridge has a bus number
> >> >> aperture that you can know up front, even before you know anything
> >> >> about what devices are below it.  On x86, for example, the ACPI _CRS
> >> >> method has something like "[bus 00-7f]" in it, which means that any
> >> >> buses in that range are below this bridge.  That doesn't tell us
> >> >> anything about which buses actually have devices on them, of course;
> >> >> it's just analogous to the secondary and subordinate bus number
> >> >> registers in a P2P bridge.
> >> >
> >> > That's one of the issues I still need to take care of. Currently no bus
> >> > resource is attached to the individual bridges (nor the PCI controller
> >> > for that matter), so the PCI core will assign them dynamically.
> >>
> >> So your PCI controller driver knows how to program the controller bus
> >> number aperture?  Sometimes people start by assuming that two host
> >> bridges both have [bus 00-ff] apertures, then they enumerate below the
> >> first and adjust the bus number apertures based on what they found.
> >> For example, if they found buses 00-12 behind the first bridge, they
> >> make the apertures [bus 00-12] for the first bridge and [bus 13-ff]
> >> for the second.  That might be the case, depending on what firmware
> >> set up, but it seems like a dubious way to do it, and of course it
> >> precludes a lot of hot-plug scenarios.
> >
> > No, that's not what I meant. What happens is that no pre-assigned bus
> > range is specified for either of the host bridges, so that the range
> > 0x00-0xff will be assigned by default in pci_scan_root_bus().
> 
> My concern is about making the kernel's idea of the host bridge bus
> number aperture match what the hardware is doing.  I'm pretty sure
> that the default [bus 00-ff] range assigned by pci_scan_root_bus()
> doesn't actually match the hardware in most cases, at least when we
> have multiple host bridges in the same PCI domain.
> 
> For example, if you don't supply a bus number range,
> pci_scan_root_bus() will assume [bus 00-ff] for both host bridges.
> But if you could put an analyzer on each of the root buses and then
> read bus 0 config space, will you see that config transaction on
> *both* buses?  I doubt it.
> 
> You have to know at least the bus number of the root bus up front
> before you can even start enumerating it.  The only way to learn that
> is by reading registers in the host bridge or by some external
> mechanism like ACPI or device tree.  That's the beginning of the bus
> number aperture.  The end of the aperture is similar: we can't
> reliably determine it by enumerating devices below the host bridge, so
> we have to know it up front.  You can enumerate starting with the root
> bus number and assigning new subordinate bus numbers as necessary, but
> unless you know the host bridge aperture to begin with, you could
> inadvertently assign a new bus number that actually belongs to a
> different host bridge.

Yes, that was my understanding as well. So currently I haven't seen any
problems with this because I only use one of the two host bridges. But I
suppose I should add code to initialize the bus number aperture properly
either via platform device resources (for the non-DT case) or via the
device tree otherwise.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-07-26 19:55 ` [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init Thierry Reding
  2012-08-14  5:06   ` Bjorn Helgaas
@ 2012-08-15 17:06   ` Bjorn Helgaas
  2012-08-15 19:28     ` Thierry Reding
  1 sibling, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-15 17:06 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> When using deferred driver probing, PCI host controller drivers may
> actually require this function after the init stage.
>
> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> ---
> Changes in v3:
> - none
>
> Changes in v2:
> - use __devinit annotations

Your original patch removed __init completely.  Here you change it to
__devinit.  That means we decide whether to discard the function based
on whether CONFIG_HOTPLUG is supported.  But I think your point is not
about hotplug; it's merely that we should be able to scan a PCI bus
after init-time.  We ought to be able to do a late PCI scan even if
hotplug is not supported.

Therefore, I'd be inclined to remove __init completely unless you have
another reason for preferring __devinit.

>  drivers/pci/setup-irq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c
> index eb219a1..f0bcd56 100644
> --- a/drivers/pci/setup-irq.c
> +++ b/drivers/pci/setup-irq.c
> @@ -18,7 +18,7 @@
>  #include <linux/cache.h>
>
>
> -static void __init
> +static void __devinit
>  pdev_fixup_irq(struct pci_dev *dev,
>                u8 (*swizzle)(struct pci_dev *, u8 *),
>                int (*map_irq)(const struct pci_dev *, u8, u8))
> @@ -54,7 +54,7 @@ pdev_fixup_irq(struct pci_dev *dev,
>         pcibios_update_irq(dev, irq);
>  }
>
> -void __init
> +void __devinit
>  pci_fixup_irqs(u8 (*swizzle)(struct pci_dev *, u8 *),
>                int (*map_irq)(const struct pci_dev *, u8, u8))
>  {
> --
> 1.7.11.2
>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-14 23:51                 ` Stephen Warren
@ 2012-08-15 19:04                   ` Stephen Warren
  2012-08-15 20:09                     ` Thierry Reding
  2012-09-07 23:34                   ` Stephen Warren
  1 sibling, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-08-15 19:04 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Thierry Reding, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/14/2012 05:51 PM, Stephen Warren wrote:
> On 08/14/2012 04:58 PM, Stephen Warren wrote:
...
>> Can't we make the call to pci_bus_add_devices() optional in
>> pci_scan_root_bus() somehow; one of:
> 
> Sigh, that turns out not to work correctly; it solves at least this part
> of the problem when booting using device tree, but when booting using a
> board file, it causes the IRQ number passed to the PCIe device to be
> bogus:-(
> 
> I give up for now.

I think the appropriate workaround for Tegra in 3.6 is to simply make
any drivers for PCIe-based devices be modules instead of built-in, as
Thierry hinted at much earlier in the thread. I've validated that the
Ethernet works just fine on TrimSlice with that change, booting v3.6-rc1
using either board files or device tree.

For 3.7, we should continue the discussion about a real fix; I'll look
into the change Bjorn requested and see if it works, although given that
hacking pci_scan_root_bus as described immediately previously in this
thread caused a regression when booting using a board file, and the fact
that board files are no longer supported on Tegra, I'm not too confident
in the outcome...

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-08-15 17:06   ` Bjorn Helgaas
@ 2012-08-15 19:28     ` Thierry Reding
  2012-08-15 19:42       ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-15 19:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 1360 bytes --]

On Wed, Aug 15, 2012 at 10:06:27AM -0700, Bjorn Helgaas wrote:
> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > When using deferred driver probing, PCI host controller drivers may
> > actually require this function after the init stage.
> >
> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> > ---
> > Changes in v3:
> > - none
> >
> > Changes in v2:
> > - use __devinit annotations
> 
> Your original patch removed __init completely.  Here you change it to
> __devinit.  That means we decide whether to discard the function based
> on whether CONFIG_HOTPLUG is supported.  But I think your point is not
> about hotplug; it's merely that we should be able to scan a PCI bus
> after init-time.  We ought to be able to do a late PCI scan even if
> hotplug is not supported.
> 
> Therefore, I'd be inclined to remove __init completely unless you have
> another reason for preferring __devinit.

I thought __devinit would resolve to nothing if HOTPLUG is defined and
__init otherwise. That seemed more appropriate. However you are right
that it is useful to always have it available, so I'm fine with removing
the annotations altogether. Do you want me to follow up with a patch? Or
can you just take the first version? I'm not sure if it still applies.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-08-15 19:28     ` Thierry Reding
@ 2012-08-15 19:42       ` Bjorn Helgaas
  2012-08-15 20:01         ` Thierry Reding
  2012-09-07 16:19         ` Stephen Warren
  0 siblings, 2 replies; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-15 19:42 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Wed, Aug 15, 2012 at 1:28 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Wed, Aug 15, 2012 at 10:06:27AM -0700, Bjorn Helgaas wrote:
>> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>> > When using deferred driver probing, PCI host controller drivers may
>> > actually require this function after the init stage.
>> >
>> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
>> > ---
>> > Changes in v3:
>> > - none
>> >
>> > Changes in v2:
>> > - use __devinit annotations
>>
>> Your original patch removed __init completely.  Here you change it to
>> __devinit.  That means we decide whether to discard the function based
>> on whether CONFIG_HOTPLUG is supported.  But I think your point is not
>> about hotplug; it's merely that we should be able to scan a PCI bus
>> after init-time.  We ought to be able to do a late PCI scan even if
>> hotplug is not supported.
>>
>> Therefore, I'd be inclined to remove __init completely unless you have
>> another reason for preferring __devinit.
>
> I thought __devinit would resolve to nothing if HOTPLUG is defined and
> __init otherwise. That seemed more appropriate. However you are right
> that it is useful to always have it available, so I'm fine with removing
> the annotations altogether. Do you want me to follow up with a patch? Or
> can you just take the first version? I'm not sure if it still applies.

You're right about how __devinit works.  It's just that I don't think
hotplug is actually relevant here.  We're trying to make
pci_fixup_irqs() work after init, whether it's because of hotplug or
simply because the arch scans host bridges after init.

I applied this to my "next" branch.  Thanks!

Bjorn

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-08-15 19:42       ` Bjorn Helgaas
@ 2012-08-15 20:01         ` Thierry Reding
  2012-09-07 16:19         ` Stephen Warren
  1 sibling, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-15 20:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 1993 bytes --]

On Wed, Aug 15, 2012 at 01:42:39PM -0600, Bjorn Helgaas wrote:
> On Wed, Aug 15, 2012 at 1:28 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Wed, Aug 15, 2012 at 10:06:27AM -0700, Bjorn Helgaas wrote:
> >> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
> >> <thierry.reding@avionic-design.de> wrote:
> >> > When using deferred driver probing, PCI host controller drivers may
> >> > actually require this function after the init stage.
> >> >
> >> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> >> > ---
> >> > Changes in v3:
> >> > - none
> >> >
> >> > Changes in v2:
> >> > - use __devinit annotations
> >>
> >> Your original patch removed __init completely.  Here you change it to
> >> __devinit.  That means we decide whether to discard the function based
> >> on whether CONFIG_HOTPLUG is supported.  But I think your point is not
> >> about hotplug; it's merely that we should be able to scan a PCI bus
> >> after init-time.  We ought to be able to do a late PCI scan even if
> >> hotplug is not supported.
> >>
> >> Therefore, I'd be inclined to remove __init completely unless you have
> >> another reason for preferring __devinit.
> >
> > I thought __devinit would resolve to nothing if HOTPLUG is defined and
> > __init otherwise. That seemed more appropriate. However you are right
> > that it is useful to always have it available, so I'm fine with removing
> > the annotations altogether. Do you want me to follow up with a patch? Or
> > can you just take the first version? I'm not sure if it still applies.
> 
> You're right about how __devinit works.  It's just that I don't think
> hotplug is actually relevant here.  We're trying to make
> pci_fixup_irqs() work after init, whether it's because of hotplug or
> simply because the arch scans host bridges after init.
> 
> I applied this to my "next" branch.  Thanks!

Great! Thanks. It's nice to see the patch series shrink. =)

Thierry
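
For readers following the __devinit discussion above, here is a
simplified userspace model of the behaviour both sides agree on: the
annotation keeps the function when CONFIG_HOTPLUG is set and otherwise
treats it like __init. The section-name strings, DEVINIT_SECTION and
callable_after_init() are made up for illustration and are not the
kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Comment this out to model a kernel built without hotplug support. */
#define CONFIG_HOTPLUG 1

/* Illustrative section names only; the real linker script differs. */
#define INIT_SECTION ".init.text"  /* freed once boot-time init is done */
#define KEEP_SECTION ".text"       /* stays resident for the whole uptime */

/* Model of __devinit: resident with hotplug, init-only without it. */
#ifdef CONFIG_HOTPLUG
#define DEVINIT_SECTION KEEP_SECTION
#else
#define DEVINIT_SECTION INIT_SECTION
#endif

/* A function can be called after init only if its section is not freed. */
static int callable_after_init(const char *section)
{
	assert(section != NULL);
	return strcmp(section, INIT_SECTION) != 0;
}
```

In this model a function carrying no annotation at all always lands in
KEEP_SECTION, which is why dropping the annotation makes a late bus scan
independent of the hotplug setting.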

* Re: [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially
  2012-07-31 20:18   ` Rob Herring
@ 2012-08-15 20:06     ` Thierry Reding
  2012-09-07 16:24       ` Stephen Warren
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-15 20:06 UTC (permalink / raw)
  To: Rob Herring
  Cc: linux-tegra, Russell King, linux-pci, devicetree-discuss,
	Rob Herring, Colin Cross, Bjorn Helgaas, linux-arm-kernel

On Tue, Jul 31, 2012 at 03:18:43PM -0500, Rob Herring wrote:
> On 07/26/2012 02:55 PM, Thierry Reding wrote:
> > When a bus specifies #address-cells > 2, of_bus_default_map() now
> > assumes that the mapping isn't for a physical address but rather an
> > identifier that needs to match exactly.
> > 
> > This is required by bindings that use multiple cells to translate a
> > resource to the parent bus (device index, type, ...).
> > 
> > See here for the discussion:
> > 
> > 	https://lists.ozlabs.org/pipermail/devicetree-discuss/2012-June/016577.html
> > 
> > Originally-by: Arnd Bergmann <arnd@arndb.de>
> > Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> 
> Acked-by: Rob Herring <rob.herring@calxeda.com>

Hi Rob,

Were you going to take this through your DT tree? I'm trying to reduce
the number of patches in this series to make it more manageable and
split it into smaller chunks. There are also a couple of issues that
need to be resolved so I don't know if I can get the whole series into
shape for 3.7.

However, if you don't think this patch is worth applying on its own,
I can carry it until the complete series is ready.

Thierry
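
As an illustration of the rule described in the quoted commit message,
the following userspace sketch treats an address of more than two cells
as an identifier that must match exactly, and falls back to an ordinary
window check otherwise. It is not the patch itself: map_through_range(),
read_number() and BAD_ADDR are hypothetical names, and cells are kept in
host byte order for simplicity, whereas real device tree cells are
big-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the kernel's OF_BAD_ADDR marker. */
#define BAD_ADDR UINT64_MAX

/* Fold `cells` 32-bit cells, most significant first, into one number. */
static uint64_t read_number(const uint32_t *cell, int cells)
{
	uint64_t value = 0;

	while (cells-- > 0)
		value = (value << 32) | *cell++;
	return value;
}

/*
 * Translate a child address through one "ranges" entry laid out as
 * <child address (na cells)> <parent address (pna cells)> <size (ns cells)>.
 * Returns the offset into the entry, or BAD_ADDR if it does not apply.
 */
static uint64_t map_through_range(const uint32_t *addr, const uint32_t *range,
				  int na, int ns, int pna)
{
	uint64_t base, size, child;

	assert(na > 0 && ns >= 0 && pna >= 0);

	/* More than two cells: treat the address as an opaque identifier. */
	if (na > 2)
		return memcmp(range, addr, (size_t)na * sizeof(*addr)) == 0 ?
		       0 : BAD_ADDR;

	/* Otherwise it is a real address: check it lies inside the window. */
	base = read_number(range, na);
	size = read_number(range + na + pna, ns);
	child = read_number(addr, na);
	if (child < base || child >= base + size)
		return BAD_ADDR;
	return child - base;
}
```

With an entry such as <0 1 0 0x81000000 0x00800000> from the proposed
binding, only the child address <0 1 0> selects the entry; <1 1 0> is
rejected instead of being compared numerically.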

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-15 19:04                   ` Stephen Warren
@ 2012-08-15 20:09                     ` Thierry Reding
  2012-08-15 20:11                       ` Stephen Warren
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-15 20:09 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Bjorn Helgaas, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Wed, Aug 15, 2012 at 01:04:20PM -0600, Stephen Warren wrote:
> On 08/14/2012 05:51 PM, Stephen Warren wrote:
> > On 08/14/2012 04:58 PM, Stephen Warren wrote:
> ...
> >> Can't we make the call to pci_bus_add_devices() optional in
> >> pci_scan_root_bus() somehow; one of:
> > 
> > Sigh, that turns out not to work correctly; it solves at least this part
> > of the problem when booting using device tree, but when booting using a
> > board file, it causes the IRQ number passed to the PCIe device to be
> > bogus:-(
> > 
> > I give up for now.
> 
> I think the appropriate workaround for Tegra in 3.6 is to simply make
> any drivers for PCIe-based devices be modules instead of built-in, as
> Thierry hinted at much earlier in the thread. I've validated that the
> Ethernet works just fine on TrimSlice with that change, booting v3.6-rc1
> using either board files or device tree.

That's certainly the easiest and least error-prone solution.

> For 3.7, we should continue the discussion about a real fix; I'll look
> into the change Bjorn requested and see if it works, although given that
> hacking pci_scan_root_bus as described immediately previously in this
> thread caused a regression when booting using a board file, and the fact
> that board files are no longer supported on Tegra, I'm not too confident
> in the outcome...

I don't understand this last part. If the problem is there when booting
from board files and the board files are removed, doesn't that remove
the problem as well? =)

Thierry

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-15 20:09                     ` Thierry Reding
@ 2012-08-15 20:11                       ` Stephen Warren
  2012-08-15 20:19                         ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-08-15 20:11 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Bjorn Helgaas, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/15/2012 02:09 PM, Thierry Reding wrote:
> On Wed, Aug 15, 2012 at 01:04:20PM -0600, Stephen Warren wrote:
>> On 08/14/2012 05:51 PM, Stephen Warren wrote:
>>> On 08/14/2012 04:58 PM, Stephen Warren wrote:
>> ...
>>>> Can't we make the call to pci_bus_add_devices() optional in 
>>>> pci_scan_root_bus() somehow; one of:
>>> 
>>> Sigh, that turns out not to work correctly; it solves at least
>>> this part of the problem when booting using device tree, but
>>> when booting using a board file, it causes the IRQ number
>>> passed to the PCIe device to be bogus:-(
>>> 
>>> I give up for now.
>> 
>> I think the appropriate workaround for Tegra in 3.6 is to simply
>> make any drivers for PCIe-based devices be modules instead of
>> built-in, as Thierry hinted at much earlier in the thread. I've
>> validated that the Ethernet works just fine on TrimSlice with
>> that change, booting v3.6-rc1 using either board files or device
>> tree.
> 
> That's certainly the easiest and least error-prone solution.
> 
>> For 3.7, we should continue the discussion about a real fix; I'll
>> look into the change Bjorn requested and see if it works,
>> although given that hacking pci_scan_root_bus as described
>> immediately previously in this thread caused a regression when
>> booting using a board file, and the fact that board files are no
>> longer supported on Tegra, I'm not too confident in the
>> outcome...
> 
> I don't understand this last part. If the problem is there when
> booting from board files and the board files are removed, doesn't
> that remove the problem as well? =)

It does for ARM SoCs/CPUs exclusively using device tree, but not all
ARM systems are converting to device tree, so the fact that Tegra has
converted makes it harder for me not to break anything else.

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-15 20:11                       ` Stephen Warren
@ 2012-08-15 20:19                         ` Thierry Reding
  0 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-15 20:19 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Bjorn Helgaas, Russell King, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Wed, Aug 15, 2012 at 02:11:10PM -0600, Stephen Warren wrote:
> On 08/15/2012 02:09 PM, Thierry Reding wrote:
> > On Wed, Aug 15, 2012 at 01:04:20PM -0600, Stephen Warren wrote:
> >> On 08/14/2012 05:51 PM, Stephen Warren wrote:
> >>> On 08/14/2012 04:58 PM, Stephen Warren wrote:
> >> ...
> >>>> Can't we make the call to pci_bus_add_devices() optional in 
> >>>> pci_scan_root_bus() somehow; one of:
> >>> 
> >>> Sigh, that turns out not to work correctly; it solves at least
> >>> this part of the problem when booting using device tree, but
> >>> when booting using a board file, it causes the IRQ number
> >>> passed to the PCIe device to be bogus:-(
> >>> 
> >>> I give up for now.
> >> 
> >> I think the appropriate workaround for Tegra in 3.6 is to simply
> >> make any drivers for PCIe-based devices be modules instead of
> >> built-in, as Thierry hinted at much earlier in the thread. I've
> >> validated that the Ethernet works just fine on TrimSlice with
> >> that change, booting v3.6-rc1 using either board files or device
> >> tree.
> > 
> > That's certainly the easiest and least error-prone solution.
> > 
> >> For 3.7, we should continue the discussion about a real fix; I'll
> >> look into the change Bjorn requested and see if it works,
> >> although given that hacking pci_scan_root_bus as described
> >> immediately previously in this thread caused a regression when
> >> booting using a board file, and the fact that board files are no
> >> longer supported on Tegra, I'm not too confident in the
> >> outcome...
> > 
> > I don't understand this last part. If the problem is there when
> > booting from board files and the board files are removed, doesn't
> > that remove the problem as well? =)
> 
> It does for ARM SoCs/CPUs exclusively using device tree, but not all
> ARM systems are converting to device tree, so the fact that Tegra has
> converted makes it harder for me not to break anything else.

I think the best road to a real fix would be to implement a custom scan
function for Tegra along the lines of what Bjorn suggested. Ideally,
this implementation should eventually converge to what's done on other
architectures. At that point maybe ARM can completely be converted to
this new generic implementation and the Tegra-specific hook removed.

Thierry

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-15 14:57               ` Thierry Reding
@ 2012-08-15 20:25                 ` Arnd Bergmann
  2012-08-15 20:48                   ` Bjorn Helgaas
  2012-08-16  4:55                   ` Thierry Reding
  0 siblings, 2 replies; 79+ messages in thread
From: Arnd Bergmann @ 2012-08-15 20:25 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Bjorn Helgaas, linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley

On Wednesday 15 August 2012, Thierry Reding wrote:
> Yes, that was my understanding as well. So currently I haven't seen any
> problems with this because I only use one of the two host bridges. But I
> suppose I should add code to initialize the bus number aperture properly
> either via platform device resources (for the non-DT case) or via the
> device tree otherwise.

I think when we last discussed this, the assumption was that each
root port has its own config space range and its own pci domain,
so you don't have to worry about bus apertures because each root port
can then have all 255 bus numbers. Has that turned out to be incorrect
now?

	Arnd

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-15 20:25                 ` Arnd Bergmann
@ 2012-08-15 20:48                   ` Bjorn Helgaas
  2012-08-16  4:55                   ` Thierry Reding
  1 sibling, 0 replies; 79+ messages in thread
From: Bjorn Helgaas @ 2012-08-15 20:48 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Thierry Reding, linux-tegra, linux-pci, Grant Likely,
	Rob Herring, devicetree-discuss, Russell King, linux-arm-kernel,
	Colin Cross, Olof Johansson, Stephen Warren, Mitch Bradley

On Wed, Aug 15, 2012 at 2:25 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 15 August 2012, Thierry Reding wrote:
>> Yes, that was my understanding as well. So currently I haven't seen any
>> problems with this because I only use one of the two host bridges. But I
>> suppose I should add code to initialize the bus number aperture properly
>> either via platform device resources (for the non-DT case) or via the
>> device tree otherwise.
>
> I think when we last discussed this, the assumption was that each
> root port has its own config space range and its own pci domain,
> so you don't have to worry about bus apertures because each root port
> can then have all 255 bus numbers. Has that turned out to be incorrect
> now?

If that's the case, there's no problem.  I just want to be explicit
about the host bridge bus number aperture because I'd like to make
pci_scan_root_bus() fail if no aperture is supplied or if the aperture
overlaps with something we've already seen.  I don't know if that
means you want to add a domain and [bus 00-ff] range in device tree,
or if you want to make some device tree rule about every host bridge
being in its own domain, or what.

Bjorn
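
A rough sketch of the check described above, under the assumption that
each host bridge supplies a domain number plus an inclusive bus range.
struct bus_aperture, aperture_overlaps() and register_aperture() are
hypothetical names used only for illustration; this is not what
pci_scan_root_bus() does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a host bridge's bus number aperture. */
struct bus_aperture {
	int domain;          /* PCI domain (segment) number */
	unsigned int start;  /* first bus number, inclusive */
	unsigned int end;    /* last bus number, inclusive */
};

/* Apertures conflict only if they share a domain and their ranges meet. */
static bool aperture_overlaps(const struct bus_aperture *a,
			      const struct bus_aperture *b)
{
	return a->domain == b->domain &&
	       a->start <= b->end && b->start <= a->end;
}

/*
 * Record a new aperture in seen[] (capacity entries, *count in use).
 * Returns false, leaving seen[] untouched, if no aperture was supplied,
 * the range is malformed, it overlaps an earlier one, or seen[] is full.
 */
static bool register_aperture(struct bus_aperture *seen, size_t *count,
			      size_t capacity,
			      const struct bus_aperture *candidate)
{
	size_t i;

	assert(seen != NULL && count != NULL);

	if (candidate == NULL)
		return false;
	if (candidate->start > candidate->end || candidate->end > 0xff)
		return false;
	if (*count >= capacity)
		return false;

	for (i = 0; i < *count; i++)
		if (aperture_overlaps(&seen[i], candidate))
			return false;

	seen[(*count)++] = *candidate;
	return true;
}
```

Under this model two bridges may both claim [bus 00-ff] only if they sit
in different domains; within one domain the second claim would fail.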

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-15 20:25                 ` Arnd Bergmann
  2012-08-15 20:48                   ` Bjorn Helgaas
@ 2012-08-16  4:55                   ` Thierry Reding
  2012-08-16  7:03                     ` Arnd Bergmann
  1 sibling, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-08-16  4:55 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Bjorn Helgaas, linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley

On Wed, Aug 15, 2012 at 08:25:25PM +0000, Arnd Bergmann wrote:
> On Wednesday 15 August 2012, Thierry Reding wrote:
> > Yes, that was my understanding as well. So currently I haven't seen any
> > problems with this because I only use one of the two host bridges. But I
> > suppose I should add code to initialize the bus number aperture properly
> > either via platform device resources (for the non-DT case) or via the
> > device tree otherwise.
> 
> I think when we last discussed this, the assumption was that each
> root port has its own config space range and its own pci domain,
> so you don't have to worry about bus apertures because each root port
> can then have all 255 bus numbers. Has that turned out to be incorrect
> now?

At least for the config space this is incorrect. There's a single region
to access the configuration space for all devices below the PCIe
controller. So it is shared by both (Tegra20) or all three (Tegra30)
root ports.

I'm not sure about PCI domains. Do you have any good pointers as to
where I could read up on them? If they need special hardware support,
then I think Tegra doesn't support them either. At least I haven't come
across any mention of domains while going through the Tegra documentation,
which is admittedly somewhat sparse on PCIe.

Thierry

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-16  4:55                   ` Thierry Reding
@ 2012-08-16  7:03                     ` Arnd Bergmann
  2012-08-16  7:47                       ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Arnd Bergmann @ 2012-08-16  7:03 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Bjorn Helgaas, linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley

On Thursday 16 August 2012, Thierry Reding wrote:
> At least for the config space this is incorrect. There's a single region
> to access the configuration space for all devices below the PCIe
> controller. So it is shared by both (Tegra20) or all three (Tegra30)
> root ports.
> 
> I'm not sure about PCI domains. Do you have any good pointers as to
> where I could read up on them? If they need special hardware support,
> then I think Tegra doesn't support them either. At least I haven't come
> across any mention of domains while going through the Tegra documentation,
> which is admittedly somewhat sparse on PCIe.

I was referring to the same thing here: if you had a separate config
space for each root port, they would by definition be separate PCI
domain. So you don't have them.

	Arnd

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-16  7:03                     ` Arnd Bergmann
@ 2012-08-16  7:47                       ` Thierry Reding
  0 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-16  7:47 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Bjorn Helgaas, linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley

On Thu, Aug 16, 2012 at 07:03:50AM +0000, Arnd Bergmann wrote:
> On Thursday 16 August 2012, Thierry Reding wrote:
> > At least for the config space this is incorrect. There's a single region
> > to access the configuration space for all devices below the PCIe
> > controller. So it is shared by both (Tegra20) or all three (Tegra30)
> > root ports.
> > 
> > I'm not sure about PCI domains. Do you have any good pointers as to
> > where I could read up on them? If they need special hardware support,
> > then I think Tegra doesn't support them either. At least I haven't come
> > across any mention of domains while going through the Tegra documentation,
> > which is admittedly somewhat sparse on PCIe.
> 
> I was referring to the same thing here: if you had a separate config
> space for each root port, they would by definition be separate PCI
> domain. So you don't have them.

I see, that makes sense. Thanks for explaining.

Thierry

* Re: [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support
  2012-08-15 12:18         ` Bjorn Helgaas
  2012-08-15 12:30           ` Thierry Reding
@ 2012-08-16 12:15           ` Thierry Reding
  1 sibling, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-16 12:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Stephen Warren, Mitch Bradley, Arnd Bergmann

On Wed, Aug 15, 2012 at 05:18:04AM -0700, Bjorn Helgaas wrote:
> On Tue, Aug 14, 2012 at 11:37 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Tue, Aug 14, 2012 at 04:50:26PM -0700, Bjorn Helgaas wrote:
> >> On Tue, Aug 14, 2012 at 1:12 PM, Thierry Reding
> >> <thierry.reding@avionic-design.de> wrote:
> >> > On Thu, Jul 26, 2012 at 09:55:12PM +0200, Thierry Reding wrote:
> >> >> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> >> >> index a094c97..c886dff 100644
> >> >> --- a/arch/arm/boot/dts/tegra20.dtsi
> >> >> +++ b/arch/arm/boot/dts/tegra20.dtsi
> >> >> @@ -199,6 +199,68 @@
> >> >>               #size-cells = <0>;
> >> >>       };
> >> >>
> >> >> +     pcie-controller {
> >> >> +             compatible = "nvidia,tegra20-pcie";
> >> >> +             reg = <0x80003000 0x00000800   /* PADS registers */
> >> >> +                    0x80003800 0x00000200   /* AFI registers */
> >> >> +                    0x81000000 0x01000000   /* configuration space */
> >> >> +                    0x90000000 0x10000000>; /* extended configuration space */
> >> >> +             interrupts = <0 98 0x04   /* controller interrupt */
> >> >> +                           0 99 0x04>; /* MSI interrupt */
> >> >> +             status = "disabled";
> >> >> +
> >> >> +             ranges = <0 0 0  0x80000000 0x00001000   /* root port 0 */
> >> >> +                       0 1 0  0x81000000 0x00800000   /* port 0 config space */
> >> >> +                       0 2 0  0x90000000 0x08000000   /* port 0 ext config space */
> >> >> +                       0 3 0  0x82000000 0x00010000   /* port 0 downstream I/O */
> >> >> +                       0 4 0  0xa0000000 0x08000000   /* port 0 non-prefetchable memory */
> >> >> +                       0 5 0  0xb0000000 0x08000000   /* port 0 prefetchable memory */
> >> >> +
> >> >> +                       1 0 0  0x80001000 0x00001000   /* root port 1 */
> >> >> +                       1 1 0  0x81800000 0x00800000   /* port 1 config space */
> >> >> +                       1 2 0  0x98000000 0x08000000   /* port 1 ext config space */
> >> >> +                       1 3 0  0x82010000 0x00010000   /* port 1 downstream I/O */
> >> >> +                       1 4 0  0xa8000000 0x08000000   /* port 1 non-prefetchable memory */
> >> >> +                       1 5 0  0xb8000000 0x08000000>; /* port 1 prefetchable memory */
> >> >
> >> > I've been thinking about this some more. The translations for both the
> >> > regular and extended configuration spaces are configured in the top-
> >> > level PCIe controller. It is therefore wrong how they are passed to the
> >> > PCI host bridges via the ranges property.
> >> >
> >> > I remember Mitch saying that it should be passed down to the children
> >> > because it is partitioned among them, but since the layout is compatible
> >> > with ECAM, the partitioning isn't as simple as what's in the tree. In
> >> > fact the partitions will be dependent on the number of devices attached
> >> > to the host bridges.
> >>
> >> I don't understand this last bit about the number of devices attached
> >> to the host bridges.  Logically, the host bridge has a bus number
> >> aperture that you can know up front, even before you know anything
> >> about what devices are below it.  On x86, for example, the ACPI _CRS
> >> method has something like "[bus 00-7f]" in it, which means that any
> >> buses in that range are below this bridge.  That doesn't tell us
> >> anything about which buses actually have devices on them, of course;
> >> it's just analogous to the secondary and subordinate bus number
> >> registers in a P2P bridge.
> >
> > That's one of the issues I still need to take care of. Currently no bus
> > resource is attached to the individual bridges (nor the PCI controller
> > for that matter), so the PCI core will assign them dynamically.
> 
> So your PCI controller driver knows how to program the controller bus
> number aperture?  Sometimes people start by assuming that two host
> bridges both have [bus 00-ff] apertures, then they enumerate below the
> first and adjust the bus number apertures based on what they found.
> For example, if they found buses 00-12 behind the first bridge, they
> make the apertures [bus 00-12] for the first bridge and [bus 13-ff]
> for the second.  That might be the case, depending on what firmware
> set up, but it seems like a dubious way to do it, and of course it
> precludes a lot of hot-plug scenarios.
> 
> > If this
> > range is known at boot time we could assign ECAM ranges based on the bus
> > numbers. Standard ECAM ranges, that is. On Tegra this won't work because
> > as Stephen mentioned in a previous mail, the bus field is not the top
> > field in the ECAM addresses. Basically what you have is this:
> >
> >         [27:24] upper 4 bits of the register address for extended
> >                 configuration space
> >         [23:16] bus number
> >         [15:11] device number
> >         [10: 8] device function
> >         [ 7: 0] register
> >
> > So the ECAM space cannot be partitioned by bus number.
> 
> Ah, OK, so definitely not standard PCIe ECAM.

I had an idea about how we could maybe implement a generic ECAM
mechanism that provides for this kind of mapping as well. Basically,
every mechanism that can be covered by such an implementation would in
one way or another specify the above five fields, so how about designing
it in a way that the driver could define the fields that encode the
information, so for Tegra, something along these lines:

	struct pci_ecam_map pci_ecam_tegra = {
		.reg      = {  0,  7 },
		.extended = { 24, 27 },
		.function = {  8, 10 },
		.device   = { 11, 15 },
		.bus      = { 16, 23 },
	};

For standard ECAM, something like this:

	struct pci_ecam_map pci_ecam_standard = {
		.reg      = {  0,  7 },
		.extended = {  8, 11 },
		.function = { 12, 14 },
		.device   = { 15, 19 },
		.bus      = { 20, 27 },
	};

Thierry
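
To make the proposal above more concrete, here is a minimal userspace
sketch of how such a field map might be turned into a config-space
offset. Everything in it is an assumption for illustration: struct
bit_range, struct ecam_map, place() and ecam_offset() are hypothetical
names, each pair is read as an inclusive { low bit, high bit } range,
and the register field is spelled `reg` because `register` is a
reserved word in C.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive bit range [lo, hi] inside a config-space offset. */
struct bit_range {
	unsigned int lo;
	unsigned int hi;
};

/* Hypothetical descriptor: where each address component is encoded. */
struct ecam_map {
	struct bit_range reg;       /* low 8 bits of the register number */
	struct bit_range extended;  /* upper 4 bits of the register number */
	struct bit_range function;
	struct bit_range device;
	struct bit_range bus;
};

static const struct ecam_map ecam_tegra = {
	.reg      = {  0,  7 },
	.extended = { 24, 27 },
	.function = {  8, 10 },
	.device   = { 11, 15 },
	.bus      = { 16, 23 },
};

static const struct ecam_map ecam_standard = {
	.reg      = {  0,  7 },
	.extended = {  8, 11 },
	.function = { 12, 14 },
	.device   = { 15, 19 },
	.bus      = { 20, 27 },
};

/* Shift `value` into the given bit range, checking that it fits. */
static uint32_t place(struct bit_range r, uint32_t value)
{
	uint32_t width, mask;

	assert(r.hi >= r.lo && r.hi < 32);
	width = r.hi - r.lo + 1;
	mask = (width >= 32) ? 0xffffffffu : ((1u << width) - 1);
	assert((value & ~mask) == 0);
	return value << r.lo;
}

/* Build the offset of register `where` (0..0xfff) of bus/dev/fn. */
static uint32_t ecam_offset(const struct ecam_map *map, unsigned int bus,
			    unsigned int dev, unsigned int fn,
			    unsigned int where)
{
	assert(where <= 0xfff);
	return place(map->bus, bus) |
	       place(map->device, dev) |
	       place(map->function, fn) |
	       place(map->extended, (where >> 8) & 0xf) |
	       place(map->reg, where & 0xff);
}
```

For bus 1, device 2, function 3, register 0x104 this yields 0x113104
with the standard map and 0x1011304 with the Tegra map, which shows why
the Tegra window cannot simply be split by bus number: the extended
register bits sit above the bus field.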

* Re: [PATCH v3 05/10] resource: add PCI configuration space support
  2012-08-15  6:49             ` Thierry Reding
@ 2012-08-16 15:18               ` Stephen Warren
  2012-08-16 18:27                 ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-08-16 15:18 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Bjorn Helgaas, linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/15/2012 12:49 AM, Thierry Reding wrote:
...
> Stephen: Could you try to find out whether the regular
> configuration space translation can just be omitted if we already
> set up the one for the extended configuration space? In
> tegra_pcie_setup_translations(), BAR 0 is set up for regular
> configuration space (which requires a 16 MiB region), while BAR 1
> is set up for the extended configuration space (requiring a full 256
> MiB region). However, if I understand correctly, each of the
> registers that can be accessed via the BAR 0 translation can also
> be accessed via the BAR 1 translation. That seems like we're 
> wasting the 16 MiB set aside for the BAR 0 mapping.

I have confirmed that in theory, the EXTCFG space can indeed be used
to access any register, making the regular config space redundant.
However, using the HW this way has apparently received less validation.

* Re: [PATCH v3 05/10] resource: add PCI configuration space support
  2012-08-16 15:18               ` Stephen Warren
@ 2012-08-16 18:27                 ` Thierry Reding
  0 siblings, 0 replies; 79+ messages in thread
From: Thierry Reding @ 2012-08-16 18:27 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Bjorn Helgaas, linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann

On Thu, Aug 16, 2012 at 09:18:20AM -0600, Stephen Warren wrote:
> On 08/15/2012 12:49 AM, Thierry Reding wrote:
> ...
> > Stephen: Could you try to find out whether the regular
> > configuration space translation can just be omitted if we already
> > set up the one for the extended configuration space? In
> > tegra_pcie_setup_translations(), BAR 0 is set up for regular
> > configuration space (which requires a 16 MiB region), while BAR 1
> > is set up for the extended configuration space (requiring a full 256
> > MiB region). However, if I understand correctly, each of the
> > registers that can be accessed via the BAR 0 translation can also
> > be accessed via the BAR 1 translation. That seems like we're 
> > wasting the 16 MiB set aside for the BAR 0 mapping.
> 
> I have confirmed that in theory, the EXTCFG space can indeed be used
> to access any register, making the regular config space redundant.
> However, using the HW this way has apparently received less validation.

Okay. I wasn't expecting anything else really. I'll give this a try next
time I get my hands on some Tegra hardware. If that works maybe we
should make up for the missing validation by testing this more
thoroughly.

Thierry

* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-08-15 19:42       ` Bjorn Helgaas
  2012-08-15 20:01         ` Thierry Reding
@ 2012-09-07 16:19         ` Stephen Warren
  2012-09-07 17:00           ` Thierry Reding
  1 sibling, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-09-07 16:19 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Thierry Reding, linux-tegra, linux-pci, Grant Likely,
	Rob Herring, devicetree-discuss, Russell King, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/15/2012 01:42 PM, Bjorn Helgaas wrote:
> On Wed, Aug 15, 2012 at 1:28 PM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
>> On Wed, Aug 15, 2012 at 10:06:27AM -0700, Bjorn Helgaas wrote:
>>> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
>>> <thierry.reding@avionic-design.de> wrote:
>>>> When using deferred driver probing, PCI host controller drivers may
>>>> actually require this function after the init stage.
>>>>
>>>> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
>>>> ---
>>>> Changes in v3:
>>>> - none
>>>>
>>>> Changes in v2:
>>>> - use __devinit annotations
>>>
>>> Your original patch removed __init completely.  Here you change it to
>>> __devinit.  That means we decide whether to discard the function based
>>> on whether CONFIG_HOTPLUG is supported.  But I think your point is not
>>> about hotplug; it's merely that we should be able to scan a PCI bus
>>> after init-time.  We ought to be able to do a late PCI scan even if
>>> hotplug is not supported.
>>>
>>> Therefore, I'd be inclined to remove __init completely unless you have
>>> another reason for preferring __devinit.
>>
>> I thought __devinit would resolve to nothing if HOTPLUG is defined and
>> __init otherwise. That seemed more appropriate. However you are right
>> that it is useful to always have it available, so I'm fine with removing
>> the annotations altogether. Do you want me to follow up with a patch? Or
>> can you just take the first version? I'm not sure if it still applies.
> 
> You're right about how __devinit works.  It's just that I don't think
> hotplug is actually relevant here.  We're trying to make
> pci_fixup_irqs() work after init, whether it's because of hotplug or
> simply because the arch scans host bridges after init.
> 
> I applied this to my "next" branch.  Thanks!

Bjorn, I don't see this patch in next-20120907. Did it get dropped for
some reason?

For the full history, see: http://patchwork.ozlabs.org/patch/173495/.

Thanks.

* Re: [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially
  2012-08-15 20:06     ` Thierry Reding
@ 2012-09-07 16:24       ` Stephen Warren
  2012-09-07 16:32         ` Rob Herring
  0 siblings, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-09-07 16:24 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring
  Cc: linux-tegra, Russell King, linux-pci, devicetree-discuss,
	Rob Herring, Colin Cross, Bjorn Helgaas, linux-arm-kernel

On 08/15/2012 02:06 PM, Thierry Reding wrote:
> On Tue, Jul 31, 2012 at 03:18:43PM -0500, Rob Herring wrote:
>> On 07/26/2012 02:55 PM, Thierry Reding wrote:
>>> When a bus specifies #address-cells > 2, of_bus_default_map()
>>> now assumes that the mapping isn't for a physical address but
>>> rather an identifier that needs to match exactly.
>>> 
>>> This is required by bindings that use multiple cells to
>>> translate a resource to the parent bus (device index, type,
>>> ...).
>>> 
>>> See here for the discussion:
>>> 
>>> https://lists.ozlabs.org/pipermail/devicetree-discuss/2012-June/016577.html
>>> 
>>> Originally-by: Arnd Bergmann <arnd@arndb.de>
>>> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
>> 
>> Acked-by: Rob Herring <rob.herring@calxeda.com>
> 
> Hi Rob,
> 
> Were you going to take this through your DT tree? I'm trying to
> reduce the number of patches in this series to make it more
> manageable and split it into smaller chunks. There are also a
> couple of issues that need to be resolved so I don't know if I can
> get the whole series into shape for 3.7.
> 
> However if you don't think this patch is useful to be applied by
> itself I can also carry it until the complete series is ready.

Rob,

Are you able to take this patch now for 3.7, or should it be held off
until the Tegra PCIe driver is re-written and requires this
functionality (in which case I'd expect to take this through the Tegra
tree at that time)?

For reference, it's at http://patchwork.ozlabs.org/patch/173497/.

Thanks.
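
To illustrate the exact-match rule from the quoted commit message, here
is a minimal user-space sketch. It is not the kernel's code: the names
sketch_default_map(), sketch_read_number() and SKETCH_BAD_ADDR are
hypothetical, cells are treated as host-endian for simplicity, and the
real of_bus_default_map() may differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel for "no translation through this range". */
#define SKETCH_BAD_ADDR UINT64_MAX

/*
 * Fold `count` 32-bit cells into one number. Cells beyond the last two
 * are shifted out, which is why a plain range check cannot see them.
 */
static uint64_t sketch_read_number(const uint32_t *cell, int count)
{
	uint64_t value = 0;

	while (count-- > 0)
		value = (value << 32) | *cell++;
	return value;
}

/*
 * Translate a child address through one "ranges" entry laid out as
 * <child address (na cells)> <parent address (pna cells)> <size (ns cells)>.
 * Returns the offset into the range, or SKETCH_BAD_ADDR if it does not apply.
 */
uint64_t sketch_default_map(const uint32_t *addr, const uint32_t *range,
			    int na, int ns, int pna)
{
	uint64_t base, size, child;

	assert(na > 0 && ns > 0 && pna >= 0);

	base = sketch_read_number(range, na);
	size = sketch_read_number(range + na + pna, ns);
	child = sketch_read_number(addr, na);

	/* More than two cells: the address is an identifier, match it exactly. */
	if (na > 2 && memcmp(range, addr, (size_t)na * sizeof(*addr)) != 0)
		return SKETCH_BAD_ADDR;

	if (child < base || child >= base + size)
		return SKETCH_BAD_ADDR;
	return child - base;
}
```

With three address cells, two addresses that differ only in the first
cell fold to the same 64-bit value, so the range check alone would
accept both. The memcmp() is what rejects the one that does not match.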


* Re: [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially
  2012-09-07 16:24       ` Stephen Warren
@ 2012-09-07 16:32         ` Rob Herring
  0 siblings, 0 replies; 79+ messages in thread
From: Rob Herring @ 2012-09-07 16:32 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Thierry Reding, linux-tegra, Russell King, linux-pci,
	devicetree-discuss, Rob Herring, Colin Cross, Bjorn Helgaas,
	linux-arm-kernel

On 09/07/2012 11:24 AM, Stephen Warren wrote:
> On 08/15/2012 02:06 PM, Thierry Reding wrote:
>> On Tue, Jul 31, 2012 at 03:18:43PM -0500, Rob Herring wrote:
>>> On 07/26/2012 02:55 PM, Thierry Reding wrote:
>>>> When a bus specifies #address-cells > 2, of_bus_default_map()
>>>> now assumes that the mapping isn't for a physical address but
>>>> rather an identifier that needs to match exactly.
>>>>
>>>> This is required by bindings that use multiple cells to
>>>> translate a resource to the parent bus (device index, type,
>>>> ...).
>>>>
>>>> See here for the discussion:
>>>>
>>>> https://lists.ozlabs.org/pipermail/devicetree-discuss/2012-June/016577.html
>>>>
>>>>
>>>>
>>>> Originally-by: Arnd Bergmann <arnd@arndb.de>
>>>> Signed-off-by: Thierry Reding
>>>> <thierry.reding@avionic-design.de>
>>>
>>> Acked-by: Rob Herring <rob.herring@calxeda.com>
>>
>> Hi Rob,
>>
>> Were you going to take this through your DT tree? I'm trying to
>> reduce the number of patches in this series to make it more
>> manageable and split it into smaller chunks. There are also a
>> couple of issues that need to be resolved so I don't know if I can
>> get the whole series into shape for 3.7.
>>
>> However if you don't think this patch is useful to be applied by
>> itself I can also carry it until the complete series is ready.
> 
> Rob,
> 
> Are you able to take this patch now for 3.7, or should it be held off
> until the Tegra PCIe driver is re-written and requires this
> functionality (in which case I'd expect to take this through the Tegra
> tree at that time)?
> 
> For reference, it's at http://patchwork.ozlabs.org/patch/173497/.

Applied for 3.7.

Rob



* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-09-07 16:19         ` Stephen Warren
@ 2012-09-07 17:00           ` Thierry Reding
  2012-09-07 17:22             ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-09-07 17:00 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Bjorn Helgaas, linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, Russell King, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann


On Fri, Sep 07, 2012 at 10:19:46AM -0600, Stephen Warren wrote:
> On 08/15/2012 01:42 PM, Bjorn Helgaas wrote:
> > On Wed, Aug 15, 2012 at 1:28 PM, Thierry Reding
> > <thierry.reding@avionic-design.de> wrote:
> >> On Wed, Aug 15, 2012 at 10:06:27AM -0700, Bjorn Helgaas wrote:
> >>> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
> >>> <thierry.reding@avionic-design.de> wrote:
> >>>> When using deferred driver probing, PCI host controller drivers may
> >>>> actually require this function after the init stage.
> >>>>
> >>>> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> >>>> ---
> >>>> Changes in v3:
> >>>> - none
> >>>>
> >>>> Changes in v2:
> >>>> - use __devinit annotations
> >>>
> >>> Your original patch removed __init completely.  Here you change it to
> >>> __devinit.  That means we decide whether to discard the function based
> >>> on whether CONFIG_HOTPLUG is supported.  But I think your point is not
> >>> about hotplug; it's merely that we should be able to scan a PCI bus
> >>> after init-time.  We ought to be able to do a late PCI scan even if
> >>> hotplug is not supported.
> >>>
> >>> Therefore, I'd be inclined to remove __init completely unless you have
> >>> another reason for preferring __devinit.
> >>
> >> I thought __devinit would resolve to nothing if HOTPLUG is defined and
> >> __init otherwise. That seemed more appropriate. However you are right
> >> that it is useful to always have it available, so I'm fine with removing
> >> the annotations altogether. Do you want me to follow up with a patch? Or
> >> can you just take the first version? I'm not sure if it still applies.
> > 
> > You're right about how __devinit works.  It's just that I don't think
> > hotplug is actually relevant here.  We're trying to make
> > pci_fixup_irqs() work after init, whether it's because of hotplug or
> > simply because the arch scans host bridges after init.
> > 
> > I applied this to my "next" branch.  Thanks!
> 
> Bjorn, I don't see this patch in next-20120907. Did it get dropped for
> some reason?

Yes, it turns out that dropping the annotations causes lots of section
mismatches on other architectures. See here[0] for the details. I think
the solution to the issue would be to either remove HOTPLUG altogether
and drop __devinit and __devexit annotations or update all architectures
to fix these warnings. I think Bjorn and I settled on the latter because
it's obviously less intrusive. I've been busy building toolchains for
all the PCI architectures and I think I have all of them. I'll just need
some more time to build, find and fix any remaining section mismatches.

Thierry

[0]: http://www.linux-mips.org/archives/linux-mips/2012-08/msg00250.html



* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-09-07 17:00           ` Thierry Reding
@ 2012-09-07 17:22             ` Bjorn Helgaas
  2012-09-14 18:55               ` Thierry Reding
  0 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-09-07 17:22 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Stephen Warren, linux-tegra, linux-pci, Grant Likely,
	Rob Herring, devicetree-discuss, Russell King, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Fri, Sep 7, 2012 at 10:00 AM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Fri, Sep 07, 2012 at 10:19:46AM -0600, Stephen Warren wrote:
>> On 08/15/2012 01:42 PM, Bjorn Helgaas wrote:
>> > On Wed, Aug 15, 2012 at 1:28 PM, Thierry Reding
>> > <thierry.reding@avionic-design.de> wrote:
>> >> On Wed, Aug 15, 2012 at 10:06:27AM -0700, Bjorn Helgaas wrote:
>> >>> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
>> >>> <thierry.reding@avionic-design.de> wrote:
>> >>>> When using deferred driver probing, PCI host controller drivers may
>> >>>> actually require this function after the init stage.
>> >>>>
>> >>>> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
>> >>>> ---
>> >>>> Changes in v3:
>> >>>> - none
>> >>>>
>> >>>> Changes in v2:
>> >>>> - use __devinit annotations
>> >>>
>> >>> Your original patch removed __init completely.  Here you change it to
>> >>> __devinit.  That means we decide whether to discard the function based
>> >>> on whether CONFIG_HOTPLUG is supported.  But I think your point is not
>> >>> about hotplug; it's merely that we should be able to scan a PCI bus
>> >>> after init-time.  We ought to be able to do a late PCI scan even if
>> >>> hotplug is not supported.
>> >>>
>> >>> Therefore, I'd be inclined to remove __init completely unless you have
>> >>> another reason for preferring __devinit.
>> >>
>> >> I thought __devinit would resolve to nothing if HOTPLUG is defined and
>> >> __init otherwise. That seemed more appropriate. However you are right
>> >> that it is useful to always have it available, so I'm fine with removing
>> >> the annotations altogether. Do you want me to follow up with a patch? Or
>> >> can you just take the first version? I'm not sure if it still applies.
>> >
>> > You're right about how __devinit works.  It's just that I don't think
>> > hotplug is actually relevant here.  We're trying to make
>> > pci_fixup_irqs() work after init, whether it's because of hotplug or
>> > simply because the arch scans host bridges after init.
>> >
>> > I applied this to my "next" branch.  Thanks!
>>
>> Bjorn, I don't see this patch in next-20120907. Did it get dropped for
>> some reason?
>
> Yes, it turns out that dropping the annotations causes lots of section
> mismatches on other architectures. See here[0] for the details. I think
> the solution to the issue would be to either remove HOTPLUG altogether
> and drop __devinit and __devexit annotations or update all architectures
> to fix these warnings. I think Bjorn and I settled on the latter because
> it's obviously less intrusive. I've been busy building toolchains for
> all the PCI architectures and I think I have all of them. I'll just need
> some more time to build, find and fix any remaining section mismatches.

Greg KH is actively removing CONFIG_HOTPLUG altogether -- see
https://lkml.org/lkml/2012/9/4/489

That will make __devinit resolve to nothing in all cases, and we'll
eventually remove __devinit completely.  So we don't want to convert
__init to __devinit; we have to remove the __init altogether.  I think
that means we have to update all arches first to avoid the section
mismatches.

So I think Thierry is on the right track:
  1) Change all arch pcibios_update_irq() implementations (and
probably a few other things) to be non-__init
  2) Change pci_fixup_irqs() to be non-__init

Bjorn
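
The reasoning above can be modelled in a few lines of user-space C. This
is only a sketch of the effective behaviour as described in this thread,
not the kernel's actual macro or linker-script definitions, and every
sketch_* name is hypothetical. It treats __devinit as "kept if HOTPLUG,
discarded otherwise" and flags a section mismatch when code that stays
in memory references code that is freed after boot.

```c
#include <assert.h>
#include <stdbool.h>

/* Where a function's code effectively ends up once boot has finished. */
enum sketch_lifetime {
	SKETCH_KEPT,		/* stays in memory */
	SKETCH_DISCARDED	/* freed together with other init code */
};

/* Annotations discussed in the thread (modelled, not the real macros). */
enum sketch_annotation { SKETCH_NONE, SKETCH_INIT, SKETCH_DEVINIT };

enum sketch_lifetime sketch_resolve(enum sketch_annotation a, bool hotplug)
{
	switch (a) {
	case SKETCH_INIT:
		return SKETCH_DISCARDED;
	case SKETCH_DEVINIT:
		return hotplug ? SKETCH_KEPT : SKETCH_DISCARDED;
	default:
		return SKETCH_KEPT;
	}
}

/* A kept caller that references a discarded callee is a section mismatch. */
bool sketch_mismatch(enum sketch_annotation caller,
		     enum sketch_annotation callee, bool hotplug)
{
	return sketch_resolve(caller, hotplug) == SKETCH_KEPT &&
	       sketch_resolve(callee, hotplug) == SKETCH_DISCARDED;
}

/* Caller plays pci_fixup_irqs(), callee plays an arch pcibios_update_irq(). */
void sketch_demo(void)
{
	/* Both __init: consistent, but neither is usable after boot. */
	assert(!sketch_mismatch(SKETCH_INIT, SKETCH_INIT, true));
	/* Step 2 done before step 1: the caller is kept, the callee is not. */
	assert(sketch_mismatch(SKETCH_NONE, SKETCH_INIT, true));
	/* A __devinit callee only helps while HOTPLUG is enabled. */
	assert(!sketch_mismatch(SKETCH_NONE, SKETCH_DEVINIT, true));
	assert(sketch_mismatch(SKETCH_NONE, SKETCH_DEVINIT, false));
	/* Step 1 then step 2: no annotation on either side, no mismatch. */
	assert(!sketch_mismatch(SKETCH_NONE, SKETCH_NONE, false));
}
```

The model shows why the order of the two steps matters: the callees have
to lose their init-only annotation before the caller does.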


* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-08-14 23:51                 ` Stephen Warren
  2012-08-15 19:04                   ` Stephen Warren
@ 2012-09-07 23:34                   ` Stephen Warren
  2012-09-08  0:04                     ` Russell King - ARM Linux
  1 sibling, 1 reply; 79+ messages in thread
From: Stephen Warren @ 2012-09-07 23:34 UTC (permalink / raw)
  To: Bjorn Helgaas, Thierry Reding, Russell King
  Cc: linux-tegra, linux-pci, Grant Likely, Rob Herring,
	devicetree-discuss, linux-arm-kernel, Colin Cross,
	Olof Johansson, Mitch Bradley, Arnd Bergmann

On 08/14/2012 05:51 PM, Stephen Warren wrote:
> On 08/14/2012 04:58 PM, Stephen Warren wrote:
>> On 08/14/2012 03:55 PM, Bjorn Helgaas wrote:
>>> On Tue, Aug 14, 2012 at 12:58 PM, Thierry Reding
>>> <thierry.reding@avionic-design.de> wrote:
>>>> On Tue, Aug 14, 2012 at 01:39:23PM -0600, Stephen Warren wrote:
>>>>> On 08/13/2012 05:18 PM, Bjorn Helgaas wrote:
>>>>>> On Mon, Aug 13, 2012 at 11:47 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>> ...
>>>>>>> whereas for a device tree boot:
>>>>>>>
>>>>>>> (same):
>>>>>>>> [    2.112217] pci 0000:01:00.0: reg 10: [io  0x0000-0x00ff]
>>>>>>>> [    2.117635] pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>>> [    2.124690] pci 0000:01:00.0: reg 20: [mem 0x00000000-0x00003fff 64bit pref]
>>>>>>>> [    2.131731] pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
>>>>>>> ... (request region happens early)
>>>>>>>> [    2.179838] r8169 0000:01:00.0: BAR 0: requesting [io  0x0000-0x00ff]
>>>>>>>> [    2.193312] r8169 0000:01:00.0: BAR 2: requesting [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>>> [    2.201397] r8169 0000:01:00.0: BAR 2: can't reserve [mem 0x00000000-0x00000fff 64bit pref]
>>>>>>>> [    2.209742] r8169 0000:01:00.0: (unregistered net_device): could not request regions
>>>>>>> ... (same, just happens too late)
>>>>>>>> [    2.236818] pci 0000:01:00.0: BAR 6: assigned [mem 0xa0000000-0xa001ffff pref]
>>>>>>>> [    2.244027] pci 0000:01:00.0: BAR 4: assigned [mem 0xa0020000-0xa0023fff 64bit pref]
>>>>>>>> [    2.251794] pci 0000:01:00.0: BAR 2: assigned [mem 0xa0024000-0xa0024fff 64bit pref]
>>>>>>>> [    2.259542] pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
>>>>>>>
>>>>>>> I suspect this is all still related to the PCI devices themselves being
>>>>>>> probed much earlier in the overall PCI initialization sequence when the
>>>>>>> PCI controller is probed later in the boot sequence, whereas PCI device
>>>>>>> probe is deferred until the overall PCI initialization sequence is
>>>>>>> complete if the PCI controller is probed very early in the boot sequence.
>>>>>>
>>>>>> I don't know what to apply your patches to (they don't apply cleanly
>>>>>> to v3.6-rc2), so I can't see exactly what you're doing.  But it looks
>>>>>> like you might be calling pci_bus_add_devices() before
>>>>>> pci_bus_assign_resource(), which isn't going to work.
>>>>>
>>>>> Yes, that's exactly what is happening.
>>>>>
>>>>> PCIe initialization starts in arch/arm/mach-tegra/pcie.c
>>>>> tegra_pcie_init() which calls arch/arm/kernel/bios32.c
>>>>> pci_common_init(). That function first calls pcibios_init_hw() (in the
>>>>> same file, more about this later) and then loops over PCI buses, calling
>>>>> amongst other things pci_bus_assign_resources() then pci_bus_add_devices().
>>>>>
>>>>> The problem is that ARM's pcibios_init_hw() calls pci_scan_root_bus()
>>>>> (or a host-driver-specific function that also calls
>>>>> pci_scan_root_bus(), in Tegra's case) which in turn calls
>>>>> pci_bus_add_devices() right at the end, before control has returned to
>>>>> pci_common_init() and hence before pci_bus_assign_resources() has been
>>>>> called.
>>>>>
>>>>> If I modify pci_scan_root_bus() and remove the call to
>>>>> pci_bus_add_devices(), everything works as expected.
>>>>>
>>>>> So, I guess the question is: Should ARM's pcibios_init_hw() not be
>>>>> calling pci_scan_root_bus(), or at least presumably the ARM PCI code
>>>>> needs to do things in a slightly different order?
>>>
>>> I think you need to do something like this instead of using pci_scan_root_bus():
>>>
>>>     pci_create_root_bus()
>>>     pci_scan_child_bus()
>>>     pci_bus_assign_resources()
>>>     pci_bus_add_devices()
>>>
>>> This is the effective order used by most of the pci_create_root_bus() callers.
>>
>> That would pretty much duplicate everything in pci_scan_root_bus(). That
>> might cause divergence down the road.
>>
>> Can't we make the call to pci_bus_add_devices() optional in
>> pci_scan_root_bus() somehow; one of:
> 
> Sigh, that turns out not to work correctly; it solves at least this part
> of the problem when booting using device tree, but when booting using a
> board file, it causes the IRQ number passed to the PCIe device to be
> bogus:-(

This fails because the following happens after
pci_scan_root_bus():

	pci_fixup_irqs(pcibios_swizzle, pcibios_map_irq);

(in arch/arm/kernel/bios32.c:pci_common_init())

So, if I replace the call to pci_scan_root_bus() with the code you wrote
above, then the IRQ numbers aren't assigned into the pci_dev structures
until after they're device_add()ed, and hence probed().

Ideally, I could just add a call to pci_fixup_irqs() right before the
call to pci_bus_add_devices() in the code you wrote above. However, that
won't work, because pci_fixup_irqs() finds all the PCI devices to fix up
by searching the list of devices that pci_bus_add_devices() adds to:-(

I guess it's a pretty basic premise of the current PCI code that all the
PCI scanning happens well before any device drivers are registered,
which in turn means that device_add() doesn't trigger the device's
probe() until much later, after all the fixups and resource assignments
are done?
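
The dependency described above can be shown with a toy model. This is a
sketch under stated assumptions, not kernel code: all sketch_* names are
hypothetical, the "published" array stands in for the device list that
the fixup walks, and the driver is assumed to be registered already so
that adding a device probes it immediately.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DEVS 4

struct sketch_dev {
	int irq;		/* 0 means "not fixed up yet" */
	int irq_seen_by_probe;	/* what the driver saw when it was probed */
};

/* Global list that only sketch_add_device() appends to. */
static struct sketch_dev *sketch_published[SKETCH_MAX_DEVS];
static size_t sketch_published_count;

/* Publish a device; the driver's probe runs right away in this model. */
void sketch_add_device(struct sketch_dev *dev)
{
	assert(dev != NULL && sketch_published_count < SKETCH_MAX_DEVS);
	sketch_published[sketch_published_count++] = dev;
	dev->irq_seen_by_probe = dev->irq;
}

/* Fix up every published device; returns how many devices were found. */
size_t sketch_fixup_irqs(int irq)
{
	size_t i;

	for (i = 0; i < sketch_published_count; i++)
		sketch_published[i]->irq = irq;
	return sketch_published_count;
}
```

Calling sketch_fixup_irqs() before sketch_add_device() finds nothing to
fix, while calling it afterwards updates the device only once the driver
has already probed with the stale value. That is the bind described
above.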


* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-09-07 23:34                   ` Stephen Warren
@ 2012-09-08  0:04                     ` Russell King - ARM Linux
  2012-09-08  5:53                       ` Stephen Warren
  2012-09-08 17:51                       ` Bjorn Helgaas
  0 siblings, 2 replies; 79+ messages in thread
From: Russell King - ARM Linux @ 2012-09-08  0:04 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Bjorn Helgaas, Thierry Reding, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Fri, Sep 07, 2012 at 05:34:35PM -0600, Stephen Warren wrote:
> I guess it's a pretty basic premise of the current PCI code that all the
> PCI scanning happens well before any device drivers are registered,
> which in turn means that device_add() doesn't trigger the device's
> probe() until much later, after all the fixups and resource assignments
> are done?

Are you saying that the PCI layer is again screwed up after all my
hard work several years ago to ensure that PCI devices are properly
setup _before_ they're made available to the PCI drivers then?  That
was around the time I was looking at Cardbus stuff, ensuring that that
worked with the same guarantees.

Not amused.

What is wrong with the "probe devices, apply fixups, setup resources,
apply more fixups, publish" process that it's had to be yet again
broken?
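
The five-step process quoted above can be written down as a small state
machine in which a stage refuses to run until every earlier stage has
completed. This is an illustrative sketch only; the stage names follow
the wording above, and none of the sketch_* identifiers exist in the
kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stages in the order quoted above. */
enum sketch_stage {
	SKETCH_PROBE_DEVICES,
	SKETCH_APPLY_FIXUPS,
	SKETCH_SETUP_RESOURCES,
	SKETCH_APPLY_MORE_FIXUPS,
	SKETCH_PUBLISH,
	SKETCH_STAGE_COUNT
};

struct sketch_dev {
	unsigned int done;	/* bitmask of completed stages */
};

/* Run one stage; refuse if any earlier stage has not completed yet. */
bool sketch_run_stage(struct sketch_dev *dev, enum sketch_stage stage)
{
	unsigned int earlier;

	assert(dev != NULL && stage < SKETCH_STAGE_COUNT);

	earlier = (1u << stage) - 1u;
	if ((dev->done & earlier) != earlier)
		return false;
	dev->done |= 1u << stage;
	return true;
}
```

In this model, a scan helper that publishes devices straight after
probing them corresponds to calling SKETCH_PUBLISH second, which the
check rejects.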


* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-09-08  0:04                     ` Russell King - ARM Linux
@ 2012-09-08  5:53                       ` Stephen Warren
  2012-09-08 17:51                       ` Bjorn Helgaas
  1 sibling, 0 replies; 79+ messages in thread
From: Stephen Warren @ 2012-09-08  5:53 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Bjorn Helgaas, Thierry Reding, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On 09/07/2012 06:04 PM, Russell King - ARM Linux wrote:
> On Fri, Sep 07, 2012 at 05:34:35PM -0600, Stephen Warren wrote:
>> I guess it's a pretty basic premise of the current PCI code that all the
>> PCI scanning happens well before any device drivers are registered,
>> which in turn means that device_add() doesn't trigger the device's
>> probe() until much later, after all the fixups and resource assignments
>> are done?
> 
> Are you saying that the PCI layer is again screwed up after all my
> hard work several years ago to ensure that PCI devices are properly
> setup _before_ they're made available to the PCI drivers then?  That
> was around the time I was looking at Cardbus stuff, ensuring that that
> worked with the same guarantees.
> 
> Not amused.
> 
> What is wrong with the "probe devices, apply fixups, setup resources,
> apply more fixups, publish" process that it's had to be yet again
> broken?

I must admit, I'm having a hard time finding when the code worked like
that; ARM's bios32.c:pcibios_init_hw() seems to have always called
pci_scan_root_bus(), or ops function hw->scan(), which at least in the
one PCIe controller driver I looked at, in turn always called either
pci_scan_root_bus() or another similar function that I believe always
called pci_bus_add_devices(), which is before pci_common_init() could
assign the resources. Maybe I'm just looking at the wrong PCIe
controller driver, or not looking back far enough in git history (or
pre-git)? If you could point out when it was working as you describe
(which sounds reasonable), I'd be interested in tracing the history from
there to see when/why it changed.


* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-09-08  0:04                     ` Russell King - ARM Linux
  2012-09-08  5:53                       ` Stephen Warren
@ 2012-09-08 17:51                       ` Bjorn Helgaas
  2012-09-18  6:33                         ` Thierry Reding
  1 sibling, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2012-09-08 17:51 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Stephen Warren, Thierry Reding, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Fri, Sep 7, 2012 at 6:04 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Sep 07, 2012 at 05:34:35PM -0600, Stephen Warren wrote:
>> I guess it's a pretty basic premise of the current PCI code that all the
>> PCI scanning happens well before any device drivers are registered,
>> which in turn means that device_add() doesn't trigger the device's
>> probe() until much later, after all the fixups and resource assignments
>> are done?
>
> Are you saying that the PCI layer is again screwed up after all my
> hard work several years ago to ensure that PCI devices are properly
> setup _before_ they're made available to the PCI drivers then?  That
> was around the time I was looking at Cardbus stuff, ensuring that that
> worked with the same guarantees.
>
> Not amused.
>
> What is wrong with the "probe devices, apply fixups, setup resources,
> apply more fixups, publish" process that it's had to be yet again
> broken?

It seems that there are some bugs in the PCI layer, no doubt
introduced after all your hard work.  We'll do our best to fix them.

The particular issue of pci_fixup_irqs() has been on my list for a
while, and we talked about it at the recent PCI mini-summit.  It's
clearly broken that we do this with for_each_pci_dev() once at
boot-time because that does nothing for hot-added devices.  It's also
broken that it is called after device_add() because the core shouldn't
touch the device after it's available to drivers.

This should get fixed reasonably soon, probably not in v3.6, but
hopefully in v3.7.  It's not completely trivial because many arches
have the same problem, and we need to fix all of them.

Bjorn
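
The hot-add problem mentioned above can be illustrated as follows. This
is a simplified model, not the PCI core: sketch_boot_sweep() stands in
for a one-time pass over all devices at boot, sketch_add_with_fixup() is
a hypothetical alternative that fixes up each device before handing it
to drivers, and both names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_dev {
	int irq;	/* 0 means "never fixed up" */
	bool present;	/* device currently exists on the bus */
};

/* One-time sweep over whatever devices exist when it runs. */
size_t sketch_boot_sweep(struct sketch_dev *devs, size_t count, int irq)
{
	size_t i, fixed = 0;

	assert(devs != NULL);
	for (i = 0; i < count; i++) {
		if (!devs[i].present)
			continue;
		devs[i].irq = irq;
		fixed++;
	}
	return fixed;
}

/* Alternative: fix up each device as part of adding it, at boot or later. */
void sketch_add_with_fixup(struct sketch_dev *dev, int irq)
{
	assert(dev != NULL);
	dev->irq = irq;		/* fixup happens before drivers see the device */
	dev->present = true;
}
```

A device that appears after the sweep keeps irq == 0 unless the fixup is
tied to the add path, which also keeps the core from touching a device
after drivers can see it.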


* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-09-07 17:22             ` Bjorn Helgaas
@ 2012-09-14 18:55               ` Thierry Reding
  2012-09-14 19:45                 ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-09-14 18:55 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Stephen Warren, linux-tegra, linux-pci, Grant Likely,
	Rob Herring, devicetree-discuss, Russell King, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann


On Fri, Sep 07, 2012 at 10:22:28AM -0700, Bjorn Helgaas wrote:
> On Fri, Sep 7, 2012 at 10:00 AM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Fri, Sep 07, 2012 at 10:19:46AM -0600, Stephen Warren wrote:
> >> On 08/15/2012 01:42 PM, Bjorn Helgaas wrote:
> >> > On Wed, Aug 15, 2012 at 1:28 PM, Thierry Reding
> >> > <thierry.reding@avionic-design.de> wrote:
> >> >> On Wed, Aug 15, 2012 at 10:06:27AM -0700, Bjorn Helgaas wrote:
> >> >>> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
> >> >>> <thierry.reding@avionic-design.de> wrote:
> >> >>>> When using deferred driver probing, PCI host controller drivers may
> >> >>>> actually require this function after the init stage.
> >> >>>>
> >> >>>> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
> >> >>>> ---
> >> >>>> Changes in v3:
> >> >>>> - none
> >> >>>>
> >> >>>> Changes in v2:
> >> >>>> - use __devinit annotations
> >> >>>
> >> >>> Your original patch removed __init completely.  Here you change it to
> >> >>> __devinit.  That means we decide whether to discard the function based
> >> >>> on whether CONFIG_HOTPLUG is supported.  But I think your point is not
> >> >>> about hotplug; it's merely that we should be able to scan a PCI bus
> >> >>> after init-time.  We ought to be able to do a late PCI scan even if
> >> >>> hotplug is not supported.
> >> >>>
> >> >>> Therefore, I'd be inclined to remove __init completely unless you have
> >> >>> another reason for preferring __devinit.
> >> >>
> >> >> I thought __devinit would resolve to nothing if HOTPLUG is defined and
> >> >> __init otherwise. That seemed more appropriate. However you are right
> >> >> that it is useful to always have it available, so I'm fine with removing
> >> >> the annotations altogether. Do you want me to follow up with a patch? Or
> >> >> can you just take the first version? I'm not sure if it still applies.
> >> >
> >> > You're right about how __devinit works.  It's just that I don't think
> >> > hotplug is actually relevant here.  We're trying to make
> >> > pci_fixup_irqs() work after init, whether it's because of hotplug or
> >> > simply because the arch scans host bridges after init.
> >> >
> >> > I applied this to my "next" branch.  Thanks!
> >>
> >> Bjorn, I don't see this patch in next-20120907. Did it get dropped for
> >> some reason?
> >
> > Yes, it turns out that dropping the annotations causes lots of section
> > mismatches on other architectures. See here[0] for the details. I think
> > the solution to the issue would be to either remove HOTPLUG altogether
> > and drop __devinit and __devexit annotations or update all architectures
> > to fix these warnings. I think Bjorn and I settled on the latter because
> > it's obviously less intrusive. I've been busy building toolchains for
> > all the PCI architectures and I think I have all of them. I'll just need
> > some more time to build, find and fix any remaining section mismatches.
> 
> Greg KH is actively removing CONFIG_HOTPLUG altogether -- see
> https://lkml.org/lkml/2012/9/4/489
> 
> That will make __devinit resolve to nothing in all cases, and we'll
> eventually remove __devinit completely.  So we don't want to convert
> __init to __devinit; we have to remove the __init altogether.  I think
> that means we have to update all arches first to avoid the section
> mismatches.
> 
> So I think Thierry is on the right track:
>   1) Change all arch pcibios_update_irq() implementations (and
> probably a few other things) to be non-__init
>   2) Change pci_fixup_irqs() to be non-__init

Okay, this wasn't as much work as I had thought. It turns out that there
are very few dependencies other than pcibios_update_irq(). Actually none
at all. In addition some of the architectures already had these annotated
with __devinit. Luckily those were the ones I wasn't able to build a
cross-compiler for. Note that I resorted to converting all annotations
from __init to __devinit first. I'll try to find out what exactly Greg
is planning to do. If it turns out that the plan is to just remove the
__devinit incrementally I could just drop all of these. Otherwise maybe
a better option would be to leave them in and remove them all at once
when the HOTPLUG symbol is removed.

Furthermore it seems like almost all implementations are the same, so I
was going to include a patch that moves the common implementation to the
PCI core and make it a weak symbol so architectures would still have the
possibility to override it.

The only exception to this is 64-bit SPARC, which apparently doesn't do
anything in pcibios_update_irq(). I wonder if it would hurt to just use
the generic implementation anyway, which just sets a byte in the
configuration space. That should work regardless of architecture, right?

Unicore32 and ARM also output a debugging message and maybe it would be
okay to include that with the generic implementation as well. Do you
have any objections?

Thierry
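
A generic, overridable default of the kind proposed above might look
roughly like this user-space sketch. It is an assumption-laden
illustration, not the patch itself: struct sketch_pci_dev and
sketch_pcibios_update_irq() are hypothetical, the 256-byte array stands
in for real configuration space accesses, and the weak attribute is the
GCC/Clang spelling. Offset 0x3c is the standard Interrupt Line register.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Offset of the Interrupt Line register in PCI configuration space. */
#define SKETCH_PCI_INTERRUPT_LINE 0x3c

struct sketch_pci_dev {
	uint8_t config[256];	/* stand-in for the device's config space */
};

/*
 * Generic default. Because the symbol is weak, an architecture could
 * provide a normal definition with the same name to override it.
 */
__attribute__((weak))
void sketch_pcibios_update_irq(struct sketch_pci_dev *dev, int irq)
{
	assert(dev != NULL && irq >= 0 && irq <= 0xff);

	/* Debugging message, as some of the arch versions print. */
	fprintf(stderr, "assigning IRQ %02d\n", irq);

	/* The generic behaviour: set one byte in configuration space. */
	dev->config[SKETCH_PCI_INTERRUPT_LINE] = (uint8_t)irq;
}
```

An architecture that needs different behaviour would simply define its
own non-weak version, and the linker would pick that one instead.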



* Re: [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init
  2012-09-14 18:55               ` Thierry Reding
@ 2012-09-14 19:45                 ` Bjorn Helgaas
  0 siblings, 0 replies; 79+ messages in thread
From: Bjorn Helgaas @ 2012-09-14 19:45 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Stephen Warren, linux-tegra, linux-pci, Grant Likely,
	Rob Herring, devicetree-discuss, Russell King, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Fri, Sep 14, 2012 at 12:55 PM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Fri, Sep 07, 2012 at 10:22:28AM -0700, Bjorn Helgaas wrote:
>> On Fri, Sep 7, 2012 at 10:00 AM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>> > On Fri, Sep 07, 2012 at 10:19:46AM -0600, Stephen Warren wrote:
>> >> On 08/15/2012 01:42 PM, Bjorn Helgaas wrote:
>> >> > On Wed, Aug 15, 2012 at 1:28 PM, Thierry Reding
>> >> > <thierry.reding@avionic-design.de> wrote:
>> >> >> On Wed, Aug 15, 2012 at 10:06:27AM -0700, Bjorn Helgaas wrote:
>> >> >>> On Thu, Jul 26, 2012 at 12:55 PM, Thierry Reding
>> >> >>> <thierry.reding@avionic-design.de> wrote:
>> >> >>>> When using deferred driver probing, PCI host controller drivers may
>> >> >>>> actually require this function after the init stage.
>> >> >>>>
>> >> >>>> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
>> >> >>>> ---
>> >> >>>> Changes in v3:
>> >> >>>> - none
>> >> >>>>
>> >> >>>> Changes in v2:
>> >> >>>> - use __devinit annotations
>> >> >>>
>> >> >>> Your original patch removed __init completely.  Here you change it to
>> >> >>> __devinit.  That means we decide whether to discard the function based
>> >> >>> on whether CONFIG_HOTPLUG is supported.  But I think your point is not
>> >> >>> about hotplug; it's merely that we should be able to scan a PCI bus
>> >> >>> after init-time.  We ought to be able to do a late PCI scan even if
>> >> >>> hotplug is not supported.
>> >> >>>
>> >> >>> Therefore, I'd be inclined to remove __init completely unless you have
>> >> >>> another reason for preferring __devinit.
>> >> >>
>> >> >> I thought __devinit would resolve to nothing if HOTPLUG is defined and
>> >> >> __init otherwise. That seemed more appropriate. However you are right
>> >> >> that it is useful to always have it available, so I'm fine with removing
>> >> >> the annotations altogether. Do you want me to follow up with a patch? Or
>> >> >> can you just take the first version? I'm not sure if it still applies.
>> >> >
>> >> > You're right about how __devinit works.  It's just that I don't think
>> >> > hotplug is actually relevant here.  We're trying to make
>> >> > pci_fixup_irqs() work after init, whether it's because of hotplug or
>> >> > simply because the arch scans host bridges after init.
>> >> >
>> >> > I applied this to my "next" branch.  Thanks!
>> >>
>> >> Bjorn, I don't see this patch in next-20120907. Did it get dropped for
>> >> some reason?
>> >
>> > Yes, it turns out that dropping the annotations causes lots of section
>> > mismatches on other architectures. See here[0] for the details. I think
>> > the solution to the issue would be to either remove HOTPLUG altogether
>> > and drop __devinit and __devexit annotations or update all architectures
>> > to fix these warnings. I think Bjorn and I settled on the latter because
>> > it's obviously less intrusive. I've been busy building toolchains for
>> > all the PCI architectures and I think I have all of them. I'll just need
>> > some more time to build, find and fix any remaining section mismatches.
>>
>> Greg KH is actively removing CONFIG_HOTPLUG altogether -- see
>> https://lkml.org/lkml/2012/9/4/489
>>
>> That will make __devinit resolve to nothing in all cases, and we'll
>> eventually remove __devinit completely.  So we don't want to convert
>> __init to __devinit; we have to remove the __init altogether.  I think
>> that means we have to update all arches first to avoid the section
>> mismatches.
>>
>> So I think Thierry is on the right track:
>>   1) Change all arch pcibios_update_irq() implementations (and
>> probably a few other things) to be non-__init
>>   2) Change pci_fixup_irqs() to be non-__init
>
> Okay this wasn't as much work as I had thought. It turns out that there
> are very few dependencies other than pcibios_update_irq(). Actually none at
> all. In addition some of the architectures already had these annotated
> with __devinit. Luckily those were the ones I wasn't able to build a
> cross-compiler for. Note that I decided to convert all annotations
> from __init to __devinit first. I'll try to find out what exactly Greg
> is planning to do. If it turns out that the plan is to just remove the
> __devinit incrementally I could just drop all of these. Otherwise maybe
> a better option would be to leave them in and remove them all at once
> when the HOTPLUG symbol is removed.
>
> Furthermore it seems like almost all implementations are the same, so I
> was going to include a patch that moves the common implementation to the
> PCI core and makes it a weak symbol so architectures would still have the
> possibility to override it.
>
> The only exception to this is 64-bit SPARC, which apparently doesn't do
> anything in pcibios_update_irq(). I wonder if it would hurt to just use
> the generic implementation anyway, which just sets a byte in the
> configuration space. That should work regardless of architecture, right?
>
> Unicore32 and ARM also output a debugging message and maybe it would be
> okay to include that with the generic implementation as well. Do you
> have any objections?

Sounds good to me.  DaveM is really responsive -- if you propose using
the generic implementation on SPARC, I bet he'd either ack it or tell
you why SPARC needs to be special.
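
To make the proposal above concrete, here is a minimal user-space sketch of
a generic, overridable pcibios_update_irq(). It is not the kernel code: the
struct, the config-space helper and the sketch_ prefix are assumptions made
up for illustration, and the kernel would use its own __weak annotation and
pci_write_config_byte() instead. Only the PCI_INTERRUPT_LINE offset (0x3c)
is taken from the PCI configuration header layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Offset of the Interrupt Line register in the PCI configuration header. */
#define PCI_INTERRUPT_LINE 0x3c

/* Hypothetical stand-in for struct pci_dev: 256 bytes of config space. */
struct fake_pci_dev {
	uint8_t config[256];
};

/* Hypothetical stand-in for pci_write_config_byte(). */
void fake_write_config_byte(struct fake_pci_dev *dev, int where, uint8_t val)
{
	assert(where >= 0 && where < 256);
	dev->config[where] = val;
}

/*
 * Generic default implementation. It is declared weak, so an architecture
 * that needs special handling can provide a strong definition with the
 * same name and the linker will pick that one instead.
 */
__attribute__((weak))
void sketch_pcibios_update_irq(struct fake_pci_dev *dev, int irq)
{
	/* Debug message similar in spirit to what ARM and Unicore32 print. */
	printf("PCI: assigning IRQ %02d\n", irq);

	/* The generic behaviour: store the IRQ number in config space. */
	fake_write_config_byte(dev, PCI_INTERRUPT_LINE, (uint8_t)irq);
}
```

Calling sketch_pcibios_update_irq(&dev, 11) on a zeroed device stores 11 at
offset 0x3c. An architecture that wants to do nothing here would supply its
own non-weak definition with an empty body.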

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-09-08 17:51                       ` Bjorn Helgaas
@ 2012-09-18  6:33                         ` Thierry Reding
  2012-09-18 15:56                           ` Bjorn Helgaas
  0 siblings, 1 reply; 79+ messages in thread
From: Thierry Reding @ 2012-09-18  6:33 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Russell King - ARM Linux, Stephen Warren, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Sat, Sep 08, 2012 at 11:51:00AM -0600, Bjorn Helgaas wrote:
> On Fri, Sep 7, 2012 at 6:04 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Sep 07, 2012 at 05:34:35PM -0600, Stephen Warren wrote:
> >> I guess it's a pretty basic premise of the current PCI code that all the
> >> PCI scanning happens well before any device drivers are registered,
> >> which in turn means that device_add() doesn't trigger the device's
> >> probe() until much later, after all the fixups and resource assignments
> >> are done?
> >
> > Are you saying that the PCI layer is again screwed up after all my
> > hard work several years ago to ensure that PCI devices are properly
> > setup _before_ they're made available to the PCI drivers then?  That
> > was around the time I was looking at Cardbus stuff, ensuring that that
> > worked with the same guarantees.
> >
> > Not amused.
> >
> > What is wrong with the "probe devices, apply fixups, setup resources,
> > apply more fixups, publish" process that it's had to be yet again
> > broken?
> 
> It seems that there are some bugs in the PCI layer, no doubt
> introduced after all your hard work.  We'll do our best to fix them.
> 
> The particular issue of pci_fixup_irqs() has been on my list for a
> while, and we talked about it at the recent PCI mini-summit.  It's
> clearly broken that we do this with for_each_pci_dev() once at
> boot-time because that does nothing for hot-added devices.  It's also
> broken that it is called after device_add() because the core shouldn't
> touch the device after it's available to drivers.
> 
> This should get fixed reasonably soon, probably not in v3.6, but
> hopefully in v3.7.  It's not completely trivial because many arches
> have the same problem, and we need to fix all of them.

Has there been any progress on this? I've read the PCI mini summit notes
that you posted a while ago but they don't mention the pci_fixup_irqs()
issue. Was there some decision as to how this should be solved? If I
understand correctly this should solve the issue that Stephen has been
seeing with bogus interrupt assignments, so we'll need this to fix PCIe
on Tegra. Is there anything I can do to help move this forward?

Thierry

* Re: [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support
  2012-09-18  6:33                         ` Thierry Reding
@ 2012-09-18 15:56                           ` Bjorn Helgaas
  0 siblings, 0 replies; 79+ messages in thread
From: Bjorn Helgaas @ 2012-09-18 15:56 UTC (permalink / raw)
  To: Thierry Reding, Myron Stowe
  Cc: Russell King - ARM Linux, Stephen Warren, linux-tegra, linux-pci,
	Grant Likely, Rob Herring, devicetree-discuss, linux-arm-kernel,
	Colin Cross, Olof Johansson, Mitch Bradley, Arnd Bergmann

On Tue, Sep 18, 2012 at 12:33 AM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Sat, Sep 08, 2012 at 11:51:00AM -0600, Bjorn Helgaas wrote:
>> On Fri, Sep 7, 2012 at 6:04 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Fri, Sep 07, 2012 at 05:34:35PM -0600, Stephen Warren wrote:
>> >> I guess it's a pretty basic premise of the current PCI code that all the
>> >> PCI scanning happens well before any device drivers are registered,
>> >> which in turn means that device_add() doesn't trigger the device's
>> >> probe() until much later, after all the fixups and resource assignments
>> >> are done?
>> >
>> > Are you saying that the PCI layer is again screwed up after all my
>> > hard work several years ago to ensure that PCI devices are properly
>> > setup _before_ they're made available to the PCI drivers then?  That
>> > was around the time I was looking at Cardbus stuff, ensuring that that
>> > worked with the same guarantees.
>> >
>> > Not amused.
>> >
>> > What is wrong with the "probe devices, apply fixups, setup resources,
>> > apply more fixups, publish" process that it's had to be yet again
>> > broken?
>>
>> It seems that there are some bugs in the PCI layer, no doubt
>> introduced after all your hard work.  We'll do our best to fix them.
>>
>> The particular issue of pci_fixup_irqs() has been on my list for a
>> while, and we talked about it at the recent PCI mini-summit.  It's
>> clearly broken that we do this with for_each_pci_dev() once at
>> boot-time because that does nothing for hot-added devices.  It's also
>> broken that it is called after device_add() because the core shouldn't
>> touch the device after it's available to drivers.
>>
>> This should get fixed reasonably soon, probably not in v3.6, but
>> hopefully in v3.7.  It's not completely trivial because many arches
>> have the same problem, and we need to fix all of them.
>
> Has there been any progress on this? I've read the PCI mini summit notes
> that you posted a while ago but they don't mention the pci_fixup_irqs()
> issue. Was there some decision as to how this should be solved? If I
> understand correctly this should solve the issue that Stephen has been
> seeing with bogus interrupt assignments, so we'll need this to fix PCIe
> on Tegra. Is there anything I can do to help move this forward?

I didn't mention pci_fixup_irqs() by name in the notes, but that's
part of what I meant by "device setup being done by initcalls" as a
hot-plug issue.  I know Myron Stowe expressed interest in working on
that, but I don't think he's had a chance to do anything yet (at
least, I haven't seen anything yet).

I don't have time to fix it myself, so moving it forward is just a
matter of somebody doing the work and posting the patches.  I think
that code just needs to be moved to some pcibios hook (possibly a new
one) that's called in the device_add path.
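
The ordering described above can be sketched in user space as follows. This
is only an illustration under stated assumptions: every name here (struct
sketch_dev, sketch_pcibios_fixup_irq(), sketch_device_add()) is hypothetical,
and the fixed IRQ value stands in for whatever the arch's swizzle/map_irq
logic would compute. The point is just that the per-device IRQ fixup runs in
the add path, before the device is published to drivers, so it also covers
hot-added devices.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified device state. */
struct sketch_dev {
	int irq;                /* 0 means "not assigned yet" */
	bool published;         /* true once drivers may bind */
	int irq_seen_by_driver; /* what a driver's probe() observed */
};

/* Hypothetical per-device hook replacing the boot-time for_each loop. */
void sketch_pcibios_fixup_irq(struct sketch_dev *dev)
{
	/* The core must not touch a device that drivers can already see. */
	assert(!dev->published);
	dev->irq = 42; /* placeholder for the arch-specific IRQ mapping */
}

/* Stand-in for a driver's probe(): it just records the IRQ it sees. */
void sketch_driver_probe(struct sketch_dev *dev)
{
	dev->irq_seen_by_driver = dev->irq;
}

/* Add path: fix up first, publish second, let drivers probe last. */
void sketch_device_add(struct sketch_dev *dev)
{
	sketch_pcibios_fixup_irq(dev);
	dev->published = true;
	sketch_driver_probe(dev);
}
```

Because the hook is invoked per device from the add path, a device added
long after boot goes through the same fixup as one found during the initial
scan, and the driver never observes an unassigned IRQ.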

Bjorn

Thread overview: 79+ messages
2012-07-26 19:55 [PATCH v3 00/10] ARM: tegra: Add PCIe device tree support Thierry Reding
2012-07-26 19:55 ` [PATCH v3 01/10] PCI: Keep pci_fixup_irqs() around after init Thierry Reding
2012-08-14  5:06   ` Bjorn Helgaas
2012-08-14  5:37     ` Thierry Reding
2012-08-15 17:06   ` Bjorn Helgaas
2012-08-15 19:28     ` Thierry Reding
2012-08-15 19:42       ` Bjorn Helgaas
2012-08-15 20:01         ` Thierry Reding
2012-09-07 16:19         ` Stephen Warren
2012-09-07 17:00           ` Thierry Reding
2012-09-07 17:22             ` Bjorn Helgaas
2012-09-14 18:55               ` Thierry Reding
2012-09-14 19:45                 ` Bjorn Helgaas
2012-07-26 19:55 ` [PATCH v3 02/10] ARM: pci: Keep pci_common_init() " Thierry Reding
2012-07-26 19:55 ` [PATCH v3 03/10] ARM: pci: Allow passing per-controller private data Thierry Reding
2012-07-26 19:55 ` [PATCH v3 04/10] ARM: tegra: Move tegra_pcie_xclk_clamp() to PMC Thierry Reding
2012-07-26 19:55 ` [PATCH v3 05/10] resource: add PCI configuration space support Thierry Reding
2012-08-14  5:00   ` Bjorn Helgaas
2012-08-14  5:55     ` Thierry Reding
2012-08-14 17:38       ` Bjorn Helgaas
2012-08-14 18:01         ` Thierry Reding
2012-08-14 21:44           ` Bjorn Helgaas
2012-08-15  6:49             ` Thierry Reding
2012-08-16 15:18               ` Stephen Warren
2012-08-16 18:27                 ` Thierry Reding
2012-07-26 19:55 ` [PATCH v3 06/10] ARM: tegra: Rewrite PCIe support as a driver Thierry Reding
2012-07-26 19:55 ` [PATCH v3 07/10] ARM: tegra: pcie: Add MSI support Thierry Reding
2012-07-26 19:55 ` [PATCH v3 08/10] of/address: Handle #address-cells > 2 specially Thierry Reding
2012-07-31 20:18   ` Rob Herring
2012-08-15 20:06     ` Thierry Reding
2012-09-07 16:24       ` Stephen Warren
2012-09-07 16:32         ` Rob Herring
2012-07-26 19:55 ` [PATCH v3 09/10] of: Add of_pci_parse_ranges() Thierry Reding
2012-07-31 20:07   ` Rob Herring
2012-08-01  6:54     ` Thierry Reding
2012-08-01 16:07       ` Stephen Warren
2012-07-26 19:55 ` [PATCH v3 10/10] ARM: tegra: pcie: Add device tree support Thierry Reding
2012-08-14 20:12   ` Thierry Reding
2012-08-14 23:50     ` Bjorn Helgaas
2012-08-15  6:37       ` Thierry Reding
2012-08-15 12:18         ` Bjorn Helgaas
2012-08-15 12:30           ` Thierry Reding
2012-08-15 14:36             ` Bjorn Helgaas
2012-08-15 14:57               ` Thierry Reding
2012-08-15 20:25                 ` Arnd Bergmann
2012-08-15 20:48                   ` Bjorn Helgaas
2012-08-16  4:55                   ` Thierry Reding
2012-08-16  7:03                     ` Arnd Bergmann
2012-08-16  7:47                       ` Thierry Reding
2012-08-16 12:15           ` Thierry Reding
2012-07-31 16:18 ` [PATCH v3 00/10] ARM: tegra: Add PCIe " Stephen Warren
2012-08-01  6:35   ` Thierry Reding
2012-08-01 17:02     ` Stephen Warren
2012-08-02  6:15       ` Thierry Reding
2012-08-06 19:42 ` Stephen Warren
2012-08-07 18:20   ` Thierry Reding
2012-08-13 17:40   ` Thierry Reding
2012-08-13 18:47     ` Stephen Warren
2012-08-13 20:33       ` Thierry Reding
2012-08-13 21:38         ` Rob Herring
2012-08-14  6:14           ` Thierry Reding
2012-08-13 23:18       ` Bjorn Helgaas
2012-08-14  6:29         ` Thierry Reding
2012-08-14 19:39         ` Stephen Warren
2012-08-14 19:58           ` Thierry Reding
2012-08-14 21:55             ` Bjorn Helgaas
2012-08-14 22:58               ` Stephen Warren
2012-08-14 23:51                 ` Stephen Warren
2012-08-15 19:04                   ` Stephen Warren
2012-08-15 20:09                     ` Thierry Reding
2012-08-15 20:11                       ` Stephen Warren
2012-08-15 20:19                         ` Thierry Reding
2012-09-07 23:34                   ` Stephen Warren
2012-09-08  0:04                     ` Russell King - ARM Linux
2012-09-08  5:53                       ` Stephen Warren
2012-09-08 17:51                       ` Bjorn Helgaas
2012-09-18  6:33                         ` Thierry Reding
2012-09-18 15:56                           ` Bjorn Helgaas
2012-08-15  0:08                 ` Bjorn Helgaas
