* [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip
@ 2017-08-23  7:01 Shawn Lin
  2017-08-23  7:02 ` [PATCH v5 01/10] PCI: rockchip: spilt out rockchip_pcie_setup_irq Shawn Lin
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin


Hi Bjorn,

Patches 1-4 implement what you suggested on v4 of my
bug fix[1].

Patches 5-7 address what I mentioned for my earlier
PHY cleanup patch[2]. It seems you didn't see v2 of
that patch[3], and my v2 also has some minor issues
that were fixed by Jeffy's patch[4]. So there are several
patches in flight that clean up the pcie-rockchip error
handling path and would conflict with each other. I merged
the similar PHY cleanup from patch[4].

Also, I think we need to split [4] up into smaller pieces.
Patches 8-10 do that, so Jeffy doesn't have to rebase
this work again.

Could you kindly drop patch[2] from your host-rockchip branch
and apply this patchset if it looks good to you? :)

[1]: https://patchwork.kernel.org/patch/9895141/
[2]: https://patchwork.kernel.org/patch/9890367/
[3]: https://patchwork.kernel.org/patch/9892461/
[4]: http://patchwork.ozlabs.org/patch/804239/


Changes in v5:
- rebase on the earlier restructuring patches suggested by Bjorn
- fix all the missing error handling cases that need to clean up
  the PHY

Changes in v4:
- split out rockchip_pcie_enable_clocks, and reuse
  rockchip_pcie_enable_clocks and rockchip_pcie_disable_clocks
  elsewhere, as suggested by Jeffy

Changes in v3:
- check the return value of devm_add_action_or_reset and split out
  rockchip_pcie_setup_irq in order to move the IRQ request after
  enabling the clocks.

Changes in v2:
- use devm_add_action_or_reset to fix this ordering, as suggested by
  Heiko and Jeffy. Thanks!

Jeffy Chen (3):
  PCI: rockchip: disable vpcie0v9 for resume_noirq error handling path
  PCI: rockchip: remove irq domain if failing to probe
  PCI: rockchip: umap io space if failing to probe

Shawn Lin (7):
  PCI: rockchip: spilt out rockchip_pcie_setup_irq
  PCI: rockchip: spilt out rockchip_pcie_enable_clocks
  PCI: rockchip: spilt out rockchip_pcie_disable_clocks
  PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
  PCI: rockchip: spilt out rockchip_pcie_deinit_phys
  PCI: rockchip: fix missing phy manipulation for legacy phy
  PCI: rockchip: Clean up PHY if driver probe or resume fails

 drivers/pci/host/pcie-rockchip.c | 297 ++++++++++++++++++++++-----------------
 1 file changed, 166 insertions(+), 131 deletions(-)

-- 
1.9.1


* [PATCH v5 01/10] PCI: rockchip: spilt out rockchip_pcie_setup_irq
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
@ 2017-08-23  7:02 ` Shawn Lin
  2017-08-23  7:02 ` [PATCH v5 02/10] PCI: rockchip: spilt out rockchip_pcie_enable_clocks Shawn Lin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

Split out rockchip_pcie_setup_irq to prepare for the
following bug fix. No functional change intended.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 82 +++++++++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 35 deletions(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 2eccd53..6dc3d83 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -929,6 +929,51 @@ static int rockchip_pcie_get_phys(struct rockchip_pcie *rockchip)
 	return 0;
 }
 
+static int rockchip_pcie_setup_irq(struct rockchip_pcie *rockchip)
+{
+	int irq, err;
+	struct device *dev = rockchip->dev;
+	struct platform_device *pdev = to_platform_device(dev);
+
+	irq = platform_get_irq_byname(pdev, "sys");
+	if (irq < 0) {
+		dev_err(dev, "missing sys IRQ resource\n");
+		return -EINVAL;
+	}
+
+	err = devm_request_irq(dev, irq, rockchip_pcie_subsys_irq_handler,
+			       IRQF_SHARED, "pcie-sys", rockchip);
+	if (err) {
+		dev_err(dev, "failed to request PCIe subsystem IRQ\n");
+		return err;
+	}
+
+	irq = platform_get_irq_byname(pdev, "legacy");
+	if (irq < 0) {
+		dev_err(dev, "missing legacy IRQ resource\n");
+		return -EINVAL;
+	}
+
+	irq_set_chained_handler_and_data(irq,
+					 rockchip_pcie_legacy_int_handler,
+					 rockchip);
+
+	irq = platform_get_irq_byname(pdev, "client");
+	if (irq < 0) {
+		dev_err(dev, "missing client IRQ resource\n");
+		return -EINVAL;
+	}
+
+	err = devm_request_irq(dev, irq, rockchip_pcie_client_irq_handler,
+			       IRQF_SHARED, "pcie-client", rockchip);
+	if (err) {
+		dev_err(dev, "failed to request PCIe client IRQ\n");
+		return err;
+	}
+
+	return 0;
+}
+
 /**
  * rockchip_pcie_parse_dt - Parse Device Tree
  * @rockchip: PCIe port information
@@ -941,7 +986,6 @@ static int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
 	struct platform_device *pdev = to_platform_device(dev);
 	struct device_node *node = dev->of_node;
 	struct resource *regs;
-	int irq;
 	int err;
 
 	regs = platform_get_resource_byname(pdev,
@@ -1055,41 +1099,9 @@ static int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
 		return PTR_ERR(rockchip->clk_pcie_pm);
 	}
 
-	irq = platform_get_irq_byname(pdev, "sys");
-	if (irq < 0) {
-		dev_err(dev, "missing sys IRQ resource\n");
-		return -EINVAL;
-	}
-
-	err = devm_request_irq(dev, irq, rockchip_pcie_subsys_irq_handler,
-			       IRQF_SHARED, "pcie-sys", rockchip);
-	if (err) {
-		dev_err(dev, "failed to request PCIe subsystem IRQ\n");
-		return err;
-	}
-
-	irq = platform_get_irq_byname(pdev, "legacy");
-	if (irq < 0) {
-		dev_err(dev, "missing legacy IRQ resource\n");
-		return -EINVAL;
-	}
-
-	irq_set_chained_handler_and_data(irq,
-					 rockchip_pcie_legacy_int_handler,
-					 rockchip);
-
-	irq = platform_get_irq_byname(pdev, "client");
-	if (irq < 0) {
-		dev_err(dev, "missing client IRQ resource\n");
-		return -EINVAL;
-	}
-
-	err = devm_request_irq(dev, irq, rockchip_pcie_client_irq_handler,
-			       IRQF_SHARED, "pcie-client", rockchip);
-	if (err) {
-		dev_err(dev, "failed to request PCIe client IRQ\n");
+	err = rockchip_pcie_setup_irq(rockchip);
+	if (err)
 		return err;
-	}
 
 	rockchip->vpcie12v = devm_regulator_get_optional(dev, "vpcie12v");
 	if (IS_ERR(rockchip->vpcie12v)) {
-- 
1.9.1


* [PATCH v5 02/10] PCI: rockchip: spilt out rockchip_pcie_enable_clocks
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
  2017-08-23  7:02 ` [PATCH v5 01/10] PCI: rockchip: spilt out rockchip_pcie_setup_irq Shawn Lin
@ 2017-08-23  7:02 ` Shawn Lin
  2017-08-23  7:02 ` [PATCH v5 03/10] PCI: rockchip: spilt out rockchip_pcie_disable_clocks Shawn Lin
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

Split out rockchip_pcie_enable_clocks so that it can be
reused by rockchip_pcie_resume_noirq and rockchip_pcie_probe.
No functional change intended.
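
As a rough userspace illustration (not kernel code), the
enable-in-order, unwind-in-reverse pattern of the new helper can be
sketched as below. struct mock_clk, mock_clk_enable, mock_clk_disable
and struct mock_pcie are hypothetical stand-ins for struct clk,
clk_prepare_enable, clk_disable_unprepare and struct rockchip_pcie;
they are assumptions made only so the sketch builds on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct clk; not the kernel type. */
struct mock_clk {
	bool fail;	/* force the enable call to fail */
	int enabled;	/* current enable count */
};

/* Hypothetical stand-in for clk_prepare_enable(). */
static int mock_clk_enable(struct mock_clk *clk)
{
	if (clk->fail)
		return -1;
	clk->enabled++;
	return 0;
}

/* Hypothetical stand-in for clk_disable_unprepare(). */
static void mock_clk_disable(struct mock_clk *clk)
{
	assert(clk->enabled > 0);
	clk->enabled--;
}

/* Hypothetical stand-in for the clock members of struct rockchip_pcie. */
struct mock_pcie {
	struct mock_clk aclk, aclk_perf, hclk, clk_pm;
};

/*
 * Enable the four clocks in a fixed order. On failure, jump to the
 * label that disables only the clocks already enabled, in reverse.
 */
static int mock_enable_clocks(struct mock_pcie *p)
{
	int err;

	err = mock_clk_enable(&p->aclk);
	if (err)
		return err;

	err = mock_clk_enable(&p->aclk_perf);
	if (err)
		goto err_aclk_perf;

	err = mock_clk_enable(&p->hclk);
	if (err)
		goto err_hclk;

	err = mock_clk_enable(&p->clk_pm);
	if (err)
		goto err_clk_pm;

	return 0;

err_clk_pm:
	mock_clk_disable(&p->hclk);
err_hclk:
	mock_clk_disable(&p->aclk_perf);
err_aclk_perf:
	mock_clk_disable(&p->aclk);
	return err;
}
```

Because the helper either enables everything or leaves nothing
enabled, a caller can simply return on error without its own
per-clock labels.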

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 91 ++++++++++++++++++++--------------------
 1 file changed, 46 insertions(+), 45 deletions(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 6dc3d83..52974cf 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -1373,6 +1373,47 @@ static int rockchip_pcie_wait_l2(struct rockchip_pcie *rockchip)
 	return 0;
 }
 
+static int rockchip_pcie_enable_clocks(
+				struct rockchip_pcie *rockchip)
+{
+	struct device *dev = rockchip->dev;
+	int err;
+
+	err = clk_prepare_enable(rockchip->aclk_pcie);
+	if (err) {
+		dev_err(dev, "unable to enable aclk_pcie clock\n");
+		return err;
+	}
+
+	err = clk_prepare_enable(rockchip->aclk_perf_pcie);
+	if (err) {
+		dev_err(dev, "unable to enable aclk_perf_pcie clock\n");
+		goto err_aclk_perf_pcie;
+	}
+
+	err = clk_prepare_enable(rockchip->hclk_pcie);
+	if (err) {
+		dev_err(dev, "unable to enable hclk_pcie clock\n");
+		goto err_hclk_pcie;
+	}
+
+	err = clk_prepare_enable(rockchip->clk_pcie_pm);
+	if (err) {
+		dev_err(dev, "unable to enable clk_pcie_pm clock\n");
+		goto err_clk_pcie_pm;
+	}
+
+	return 0;
+
+err_clk_pcie_pm:
+	clk_disable_unprepare(rockchip->hclk_pcie);
+err_hclk_pcie:
+	clk_disable_unprepare(rockchip->aclk_perf_pcie);
+err_aclk_perf_pcie:
+	clk_disable_unprepare(rockchip->aclk_pcie);
+	return err;
+}
+
 static int __maybe_unused rockchip_pcie_suspend_noirq(struct device *dev)
 {
 	struct rockchip_pcie *rockchip = dev_get_drvdata(dev);
@@ -1420,21 +1461,9 @@ static int __maybe_unused rockchip_pcie_resume_noirq(struct device *dev)
 		}
 	}
 
-	err = clk_prepare_enable(rockchip->clk_pcie_pm);
-	if (err)
-		goto err_pcie_pm;
-
-	err = clk_prepare_enable(rockchip->hclk_pcie);
-	if (err)
-		goto err_hclk_pcie;
-
-	err = clk_prepare_enable(rockchip->aclk_perf_pcie);
-	if (err)
-		goto err_aclk_perf_pcie;
-
-	err = clk_prepare_enable(rockchip->aclk_pcie);
+	err = rockchip_pcie_enable_clocks(rockchip);
 	if (err)
-		goto err_aclk_pcie;
+		return err;
 
 	err = rockchip_pcie_init_port(rockchip);
 	if (err)
@@ -1452,13 +1481,9 @@ static int __maybe_unused rockchip_pcie_resume_noirq(struct device *dev)
 
 err_pcie_resume:
 	clk_disable_unprepare(rockchip->aclk_pcie);
-err_aclk_pcie:
 	clk_disable_unprepare(rockchip->aclk_perf_pcie);
-err_aclk_perf_pcie:
 	clk_disable_unprepare(rockchip->hclk_pcie);
-err_hclk_pcie:
 	clk_disable_unprepare(rockchip->clk_pcie_pm);
-err_pcie_pm:
 	return err;
 }
 
@@ -1492,29 +1517,9 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 	if (err)
 		return err;
 
-	err = clk_prepare_enable(rockchip->aclk_pcie);
-	if (err) {
-		dev_err(dev, "unable to enable aclk_pcie clock\n");
-		goto err_aclk_pcie;
-	}
-
-	err = clk_prepare_enable(rockchip->aclk_perf_pcie);
-	if (err) {
-		dev_err(dev, "unable to enable aclk_perf_pcie clock\n");
-		goto err_aclk_perf_pcie;
-	}
-
-	err = clk_prepare_enable(rockchip->hclk_pcie);
-	if (err) {
-		dev_err(dev, "unable to enable hclk_pcie clock\n");
-		goto err_hclk_pcie;
-	}
-
-	err = clk_prepare_enable(rockchip->clk_pcie_pm);
-	if (err) {
-		dev_err(dev, "unable to enable hclk_pcie clock\n");
-		goto err_pcie_pm;
-	}
+	err = rockchip_pcie_enable_clocks(rockchip);
+	if (err)
+		return err;
 
 	err = rockchip_pcie_set_vpcie(rockchip);
 	if (err) {
@@ -1618,13 +1623,9 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 		regulator_disable(rockchip->vpcie0v9);
 err_set_vpcie:
 	clk_disable_unprepare(rockchip->clk_pcie_pm);
-err_pcie_pm:
 	clk_disable_unprepare(rockchip->hclk_pcie);
-err_hclk_pcie:
 	clk_disable_unprepare(rockchip->aclk_perf_pcie);
-err_aclk_perf_pcie:
 	clk_disable_unprepare(rockchip->aclk_pcie);
-err_aclk_pcie:
 	return err;
 }
 
-- 
1.9.1


* [PATCH v5 03/10] PCI: rockchip: spilt out rockchip_pcie_disable_clocks
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
  2017-08-23  7:02 ` [PATCH v5 01/10] PCI: rockchip: spilt out rockchip_pcie_setup_irq Shawn Lin
  2017-08-23  7:02 ` [PATCH v5 02/10] PCI: rockchip: spilt out rockchip_pcie_enable_clocks Shawn Lin
@ 2017-08-23  7:02 ` Shawn Lin
  2017-08-23  7:02 ` [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ Shawn Lin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

Split out rockchip_pcie_disable_clocks so that it can be
reused by other functions. No functional change intended.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 52974cf..971d22b 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -1414,6 +1414,16 @@ static int rockchip_pcie_enable_clocks(
 	return err;
 }
 
+static void rockchip_pcie_disable_clocks(void *data)
+{
+	struct rockchip_pcie *rockchip = data;
+
+	clk_disable_unprepare(rockchip->clk_pcie_pm);
+	clk_disable_unprepare(rockchip->hclk_pcie);
+	clk_disable_unprepare(rockchip->aclk_perf_pcie);
+	clk_disable_unprepare(rockchip->aclk_pcie);
+}
+
 static int __maybe_unused rockchip_pcie_suspend_noirq(struct device *dev)
 {
 	struct rockchip_pcie *rockchip = dev_get_drvdata(dev);
@@ -1437,10 +1447,7 @@ static int __maybe_unused rockchip_pcie_suspend_noirq(struct device *dev)
 		phy_exit(rockchip->phys[i]);
 	}
 
-	clk_disable_unprepare(rockchip->clk_pcie_pm);
-	clk_disable_unprepare(rockchip->hclk_pcie);
-	clk_disable_unprepare(rockchip->aclk_perf_pcie);
-	clk_disable_unprepare(rockchip->aclk_pcie);
+	rockchip_pcie_disable_clocks(rockchip);
 
 	if (!IS_ERR(rockchip->vpcie0v9))
 		regulator_disable(rockchip->vpcie0v9);
@@ -1480,10 +1487,7 @@ static int __maybe_unused rockchip_pcie_resume_noirq(struct device *dev)
 	return 0;
 
 err_pcie_resume:
-	clk_disable_unprepare(rockchip->aclk_pcie);
-	clk_disable_unprepare(rockchip->aclk_perf_pcie);
-	clk_disable_unprepare(rockchip->hclk_pcie);
-	clk_disable_unprepare(rockchip->clk_pcie_pm);
+	rockchip_pcie_disable_clocks(rockchip);
 	return err;
 }
 
@@ -1622,10 +1626,7 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 	if (!IS_ERR(rockchip->vpcie0v9))
 		regulator_disable(rockchip->vpcie0v9);
 err_set_vpcie:
-	clk_disable_unprepare(rockchip->clk_pcie_pm);
-	clk_disable_unprepare(rockchip->hclk_pcie);
-	clk_disable_unprepare(rockchip->aclk_perf_pcie);
-	clk_disable_unprepare(rockchip->aclk_pcie);
+	rockchip_pcie_disable_clocks(rockchip);
 	return err;
 }
 
@@ -1647,10 +1648,7 @@ static int rockchip_pcie_remove(struct platform_device *pdev)
 		phy_exit(rockchip->phys[i]);
 	}
 
-	clk_disable_unprepare(rockchip->clk_pcie_pm);
-	clk_disable_unprepare(rockchip->hclk_pcie);
-	clk_disable_unprepare(rockchip->aclk_perf_pcie);
-	clk_disable_unprepare(rockchip->aclk_pcie);
+	rockchip_pcie_disable_clocks(rockchip);
 
 	if (!IS_ERR(rockchip->vpcie12v))
 		regulator_disable(rockchip->vpcie12v);
-- 
1.9.1


* [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
                   ` (2 preceding siblings ...)
  2017-08-23  7:02 ` [PATCH v5 03/10] PCI: rockchip: spilt out rockchip_pcie_disable_clocks Shawn Lin
@ 2017-08-23  7:02 ` Shawn Lin
  2017-08-24 20:21   ` Bjorn Helgaas
  2017-08-23  7:02 ` [PATCH v5 05/10] PCI: rockchip: spilt out rockchip_pcie_deinit_phys Shawn Lin
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

With CONFIG_DEBUG_SHIRQ enabled, the IRQ teardown routine still
calls the IRQ handler registered for a shared IRQ. The comment in
__free_irq says "It's a shared IRQ -- the driver ought to be
prepared for an IRQ event to happen even now it's being freed".
However, when the driver fails to probe, it may disable the clocks
needed for register access, and the subsequent shared-IRQ check
then calls the IRQ handler, which accesses registers without the
clocks enabled. That hangs the system forever.

Adding a dump_stack() call shows how that happens:

calling  rockchip_pcie_driver_init+0x0/0x28 @ 1
rockchip-pcie f8000000.pcie: no vpcie3v3 regulator found
rockchip-pcie f8000000.pcie: no vpcie1v8 regulator found
rockchip-pcie f8000000.pcie: no vpcie0v9 regulator found
rockchip-pcie f8000000.pcie: PCIe link training gen1 timeout!
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc3-next-20170807-ARCH+ #189
Hardware name: Firefly-RK3399 Board (DT)
Call trace:
[<ffff000008089bf0>] dump_backtrace+0x0/0x250
[<ffff000008089eb0>] show_stack+0x20/0x28
[<ffff000008c3313c>] dump_stack+0x90/0xb0
[<ffff000008632ad4>] rockchip_pcie_read.isra.11+0x54/0x58
[<ffff0000086334fc>] rockchip_pcie_client_irq_handler+0x30/0x1a0
[<ffff00000813ce98>] __free_irq+0x1c8/0x2dc
[<ffff00000813d044>] free_irq+0x44/0x74
[<ffff0000081415fc>] devm_irq_release+0x24/0x2c
[<ffff00000877429c>] release_nodes+0x1d8/0x30c
[<ffff000008774838>] devres_release_all+0x3c/0x5c
[<ffff00000876f19c>] driver_probe_device+0x244/0x494
[<ffff00000876f50c>] __driver_attach+0x120/0x124
[<ffff00000876cb80>] bus_for_each_dev+0x6c/0xac
[<ffff00000876e984>] driver_attach+0x2c/0x34
[<ffff00000876e3a4>] bus_add_driver+0x244/0x2b0
[<ffff000008770264>] driver_register+0x70/0x110
[<ffff0000087718b4>] platform_driver_register+0x60/0x6c
[<ffff0000091eb108>] rockchip_pcie_driver_init+0x20/0x28
[<ffff000008083a2c>] do_one_initcall+0xc8/0x130
[<ffff0000091a0ea8>] kernel_init_freeable+0x1a0/0x238
[<ffff000008c461cc>] kernel_init+0x18/0x108
[<ffff0000080836c0>] ret_from_fork+0x10/0x50

To fix this, remove all the clock disabling from the error handling
path and the driver's remove function, and rely on
devm_add_action_or_reset to disable the clocks at the appropriate
time. Also split out rockchip_pcie_setup_irq and move the IRQ request
after enabling the clocks to avoid this kind of problem.
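
The ordering argument can be illustrated with a small userspace model
(a sketch only, not the kernel's devres implementation). It assumes
managed actions are released in reverse order of registration, so an
action registered before the IRQ request runs after the IRQ is freed.
All mock_* names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*action_fn)(void *data);

/* Hypothetical, simplified model of a device's managed-resource list. */
struct mock_devres {
	action_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t count;
};

/* Register a cleanup action; later registrations are released first. */
static void mock_add_action(struct mock_devres *d, action_fn fn, void *data)
{
	assert(d->count < MAX_ACTIONS);
	d->fn[d->count] = fn;
	d->data[d->count] = data;
	d->count++;
}

/* Run every registered action in reverse order of registration. */
static void mock_release_all(struct mock_devres *d)
{
	while (d->count > 0) {
		d->count--;
		d->fn[d->count](d->data[d->count]);
	}
}

struct mock_state {
	int clocks_on;
	int handler_calls_without_clock;
};

/* Models rockchip_pcie_disable_clocks() used as a managed action. */
static void mock_disable_clocks(void *data)
{
	struct mock_state *s = data;

	s->clocks_on = 0;
}

/*
 * Models freeing a shared IRQ with CONFIG_DEBUG_SHIRQ: the handler is
 * invoked once more, and it needs the clocks to read registers.
 */
static void mock_free_irq(void *data)
{
	struct mock_state *s = data;

	if (!s->clocks_on)
		s->handler_calls_without_clock++;
}

/* Models a probe that fails after both registrations. */
static int mock_probe_fail(struct mock_state *s)
{
	struct mock_devres d = { .count = 0 };

	s->clocks_on = 1;
	mock_add_action(&d, mock_disable_clocks, s);	/* registered first */
	mock_add_action(&d, mock_free_irq, s);		/* registered second */

	mock_release_all(&d);	/* IRQ freed first, clocks disabled last */
	return s->handler_calls_without_clock;
}
```

In this model the handler never runs with the clocks off, which is
the property the patch relies on. Swapping the two mock_add_action
calls reproduces the bad ordering.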

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>

---

Changes in v5:
- rebase on the earlier restructuring patches suggested by Bjorn

Changes in v4:
- split out rockchip_pcie_enable_clocks, and reuse
  rockchip_pcie_enable_clocks and rockchip_pcie_disable_clocks
  elsewhere, as suggested by Jeffy

Changes in v3:
- check the return value of devm_add_action_or_reset and split out
  rockchip_pcie_setup_irq in order to move the IRQ request after
  enabling the clocks.

Changes in v2:
- use devm_add_action_or_reset to fix this ordering, as suggested by
  Heiko and Jeffy. Thanks!

 drivers/pci/host/pcie-rockchip.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 971d22b..891b60a 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -1099,10 +1099,6 @@ static int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
 		return PTR_ERR(rockchip->clk_pcie_pm);
 	}
 
-	err = rockchip_pcie_setup_irq(rockchip);
-	if (err)
-		return err;
-
 	rockchip->vpcie12v = devm_regulator_get_optional(dev, "vpcie12v");
 	if (IS_ERR(rockchip->vpcie12v)) {
 		if (PTR_ERR(rockchip->vpcie12v) == -EPROBE_DEFER)
@@ -1525,10 +1521,22 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 	if (err)
 		return err;
 
+	err = devm_add_action_or_reset(dev,
+				       rockchip_pcie_disable_clocks,
+				       rockchip);
+	if (err) {
+		dev_err(dev, "unable to add action or reset\n");
+		return err;
+	}
+
+	err = rockchip_pcie_setup_irq(rockchip);
+	if (err)
+		return err;
+
 	err = rockchip_pcie_set_vpcie(rockchip);
 	if (err) {
 		dev_err(dev, "failed to set vpcie regulator\n");
-		goto err_set_vpcie;
+		return err;
 	}
 
 	err = rockchip_pcie_init_port(rockchip);
@@ -1625,8 +1633,6 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 		regulator_disable(rockchip->vpcie1v8);
 	if (!IS_ERR(rockchip->vpcie0v9))
 		regulator_disable(rockchip->vpcie0v9);
-err_set_vpcie:
-	rockchip_pcie_disable_clocks(rockchip);
 	return err;
 }
 
@@ -1648,8 +1654,6 @@ static int rockchip_pcie_remove(struct platform_device *pdev)
 		phy_exit(rockchip->phys[i]);
 	}
 
-	rockchip_pcie_disable_clocks(rockchip);
-
 	if (!IS_ERR(rockchip->vpcie12v))
 		regulator_disable(rockchip->vpcie12v);
 	if (!IS_ERR(rockchip->vpcie3v3))
-- 
1.9.1


* [PATCH v5 05/10] PCI: rockchip: spilt out rockchip_pcie_deinit_phys
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
                   ` (3 preceding siblings ...)
  2017-08-23  7:02 ` [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ Shawn Lin
@ 2017-08-23  7:02 ` Shawn Lin
  2017-08-23  7:02 ` [PATCH v5 06/10] PCI: rockchip: fix missing phy manipulation for legacy phy Shawn Lin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

Split out rockchip_pcie_deinit_phys so that it can be
reused by rockchip_pcie_suspend_noirq and rockchip_pcie_remove.
No functional change intended.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 891b60a..9cd51e0 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -753,6 +753,18 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
 	return 0;
 }
 
+static void rockchip_pcie_deinit_phys(struct rockchip_pcie *rockchip)
+{
+	int i;
+
+	for (i = 0; i < MAX_LANE_NUM; i++) {
+		/* inactive lane is already powered off */
+		if (rockchip->lanes_map & BIT(i))
+			phy_power_off(rockchip->phys[i]);
+		phy_exit(rockchip->phys[i]);
+	}
+}
+
 static irqreturn_t rockchip_pcie_subsys_irq_handler(int irq, void *arg)
 {
 	struct rockchip_pcie *rockchip = arg;
@@ -1423,7 +1435,7 @@ static void rockchip_pcie_disable_clocks(void *data)
 static int __maybe_unused rockchip_pcie_suspend_noirq(struct device *dev)
 {
 	struct rockchip_pcie *rockchip = dev_get_drvdata(dev);
-	int ret, i;
+	int ret;
 
 	/* disable core and cli int since we don't need to ack PME_ACK */
 	rockchip_pcie_write(rockchip, (PCIE_CLIENT_INT_CLI << 16) |
@@ -1436,12 +1448,7 @@ static int __maybe_unused rockchip_pcie_suspend_noirq(struct device *dev)
 		return ret;
 	}
 
-	for (i = 0; i < MAX_LANE_NUM; i++) {
-		/* inactive lane is already powered off */
-		if (rockchip->lanes_map & BIT(i))
-			phy_power_off(rockchip->phys[i]);
-		phy_exit(rockchip->phys[i]);
-	}
+	rockchip_pcie_deinit_phys(rockchip);
 
 	rockchip_pcie_disable_clocks(rockchip);
 
@@ -1640,19 +1647,13 @@ static int rockchip_pcie_remove(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct rockchip_pcie *rockchip = dev_get_drvdata(dev);
-	int i;
 
 	pci_stop_root_bus(rockchip->root_bus);
 	pci_remove_root_bus(rockchip->root_bus);
 	pci_unmap_iospace(rockchip->io);
 	irq_domain_remove(rockchip->irq_domain);
 
-	for (i = 0; i < MAX_LANE_NUM; i++) {
-		/* inactive lane is already powered off */
-		if (rockchip->lanes_map & BIT(i))
-			phy_power_off(rockchip->phys[i]);
-		phy_exit(rockchip->phys[i]);
-	}
+	rockchip_pcie_deinit_phys(rockchip);
 
 	if (!IS_ERR(rockchip->vpcie12v))
 		regulator_disable(rockchip->vpcie12v);
-- 
1.9.1


* [PATCH v5 06/10] PCI: rockchip: fix missing phy manipulation for legacy phy
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
                   ` (4 preceding siblings ...)
  2017-08-23  7:02 ` [PATCH v5 05/10] PCI: rockchip: spilt out rockchip_pcie_deinit_phys Shawn Lin
@ 2017-08-23  7:02 ` Shawn Lin
  2017-08-25 21:18   ` Bjorn Helgaas
  2017-08-23  7:03 ` [PATCH v5 07/10] PCI: rockchip: Clean up PHY if driver probe or resume fails Shawn Lin
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

For instance, if an EP connects to lane3 and works in legacy
phy mode, struct phy phys[0..2] are all NULL. In this case,
rockchip->lanes_map & BIT(i) tells the driver that lane0 is
already inactive, but what we actually want is to power off
phys[0] in legacy phy mode. Fix this by adding a check of
rockchip->legacy_phy to rockchip_pcie_deinit_phys.
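
A minimal sketch of the new condition, written as a standalone
predicate for illustration. mock_should_power_off is a hypothetical
helper that does not exist in the driver, and MAX_LANE_NUM is assumed
to be 4 here.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LANE_NUM 4		/* assumed value for this sketch */
#define BIT(n) (1u << (n))

/*
 * Hypothetical helper mirroring the condition in
 * rockchip_pcie_deinit_phys(): in legacy phy mode every index is
 * powered off, otherwise only lanes marked active in lanes_map.
 */
static bool mock_should_power_off(bool legacy_phy, unsigned int lanes_map,
				  int lane)
{
	assert(lane >= 0 && lane < MAX_LANE_NUM);
	return legacy_phy || (lanes_map & BIT(lane));
}
```

With lanes_map set to BIT(3), index 0 is skipped by the old check but
selected once legacy_phy is taken into account.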

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 9cd51e0..933e3e9 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -759,7 +759,7 @@ static void rockchip_pcie_deinit_phys(struct rockchip_pcie *rockchip)
 
 	for (i = 0; i < MAX_LANE_NUM; i++) {
 		/* inactive lane is already powered off */
-		if (rockchip->lanes_map & BIT(i))
+		if (rockchip->legacy_phy || rockchip->lanes_map & BIT(i))
 			phy_power_off(rockchip->phys[i]);
 		phy_exit(rockchip->phys[i]);
 	}
-- 
1.9.1


* [PATCH v5 07/10] PCI: rockchip: Clean up PHY if driver probe or resume fails
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
                   ` (5 preceding siblings ...)
  2017-08-23  7:02 ` [PATCH v5 06/10] PCI: rockchip: fix missing phy manipulation for legacy phy Shawn Lin
@ 2017-08-23  7:03 ` Shawn Lin
  2017-08-23  7:03 ` [PATCH v5 08/10] PCI: rockchip: disable vpcie0v9 for resume_noirq error handling path Shawn Lin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

We observed that the clk_pciephy_ref is still enabled when we actually
fail to probe the driver.

root@linaro-alip:~# cat /sys/kernel/debug/clk/clk_summary | grep pcie
clk_pciephy_ref                    1     1        24000000       0 0
clk_pcie_pm                        0     0        24000000       0 0
        clk_pcie_core_cru          0     0       125000000       0 0
        clk_pciephy_ref100m        0     0       100000000       0 0
                aclk_pcie          0     0       148500000       0 0
                aclk_perf_pcie     0     0       148500000       0 0
                        pclk_pcie  0     0        37125000       0 0
clk_pcie_core                      0     0               0       0 0

clk_pciephy_ref is used by the PHY driver, and we need to disable
it properly in this case. So this patch adds error handling to
rockchip_pcie_init_port and rockchip_pcie_resume_noirq to fix this issue.
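
The two-label unwind added to rockchip_pcie_init_port can be sketched
in userspace as follows. This is an illustration only: struct
mock_phy and mock_init_port are hypothetical, the fail_*_at
parameters exist just to inject a failure (pass -1 for none), and
MAX_LANE_NUM is assumed to be 4.

```c
#include <assert.h>

#define MAX_LANE_NUM 4	/* assumed value for this sketch */

/* Hypothetical stand-in for struct phy state. */
struct mock_phy {
	int inited;
	int powered;
};

/*
 * Init all PHYs, then power them all on. On failure, power off only
 * the PHYs already powered on, then exit only the PHYs already
 * initialized, reusing the loop index as the unwind counter.
 */
static int mock_init_port(struct mock_phy phys[MAX_LANE_NUM],
			  int fail_init_at, int fail_power_at)
{
	int i, err = 0;

	for (i = 0; i < MAX_LANE_NUM; i++) {
		if (i == fail_init_at) {
			err = -1;
			goto err_exit_phy;
		}
		phys[i].inited = 1;
	}

	for (i = 0; i < MAX_LANE_NUM; i++) {
		if (i == fail_power_at) {
			err = -1;
			goto err_power_off_phy;
		}
		phys[i].powered = 1;
	}

	return 0;

err_power_off_phy:
	while (i--) {
		assert(phys[i].powered);
		phys[i].powered = 0;
	}
	/* Every PHY was initialized by the time power-on started. */
	i = MAX_LANE_NUM;
err_exit_phy:
	while (i--) {
		assert(phys[i].inited);
		phys[i].inited = 0;
	}
	return err;
}
```

A failure after both loops have finished leaves i at MAX_LANE_NUM, so
jumping to the first label then powers off and exits every PHY.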

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>

---

Changes in v5:
- fix all the missing error handling cases that need to clean up
  the PHY

Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 46 +++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 933e3e9..42dcb3d 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -561,32 +561,32 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
 		err = phy_init(rockchip->phys[i]);
 		if (err) {
 			dev_err(dev, "init phy%d err %d\n", i, err);
-			return err;
+			goto err_exit_phy;
 		}
 	}
 
 	err = reset_control_assert(rockchip->core_rst);
 	if (err) {
 		dev_err(dev, "assert core_rst err %d\n", err);
-		return err;
+		goto err_exit_phy;
 	}
 
 	err = reset_control_assert(rockchip->mgmt_rst);
 	if (err) {
 		dev_err(dev, "assert mgmt_rst err %d\n", err);
-		return err;
+		goto err_exit_phy;
 	}
 
 	err = reset_control_assert(rockchip->mgmt_sticky_rst);
 	if (err) {
 		dev_err(dev, "assert mgmt_sticky_rst err %d\n", err);
-		return err;
+		goto err_exit_phy;
 	}
 
 	err = reset_control_assert(rockchip->pipe_rst);
 	if (err) {
 		dev_err(dev, "assert pipe_rst err %d\n", err);
-		return err;
+		goto err_exit_phy;
 	}
 
 	udelay(10);
@@ -594,19 +594,19 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
 	err = reset_control_deassert(rockchip->pm_rst);
 	if (err) {
 		dev_err(dev, "deassert pm_rst err %d\n", err);
-		return err;
+		goto err_exit_phy;
 	}
 
 	err = reset_control_deassert(rockchip->aclk_rst);
 	if (err) {
 		dev_err(dev, "deassert aclk_rst err %d\n", err);
-		return err;
+		goto err_exit_phy;
 	}
 
 	err = reset_control_deassert(rockchip->pclk_rst);
 	if (err) {
 		dev_err(dev, "deassert pclk_rst err %d\n", err);
-		return err;
+		goto err_exit_phy;
 	}
 
 	if (rockchip->link_gen == 2)
@@ -628,7 +628,7 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
 		err = phy_power_on(rockchip->phys[i]);
 		if (err) {
 			dev_err(dev, "power on phy%d err %d\n", i, err);
-			return err;
+			goto err_power_off_phy;
 		}
 	}
 
@@ -639,25 +639,25 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
 	err = reset_control_deassert(rockchip->mgmt_sticky_rst);
 	if (err) {
 		dev_err(dev, "deassert mgmt_sticky_rst err %d\n", err);
-		return err;
+		goto err_power_off_phy;
 	}
 
 	err = reset_control_deassert(rockchip->core_rst);
 	if (err) {
 		dev_err(dev, "deassert core_rst err %d\n", err);
-		return err;
+		goto err_power_off_phy;
 	}
 
 	err = reset_control_deassert(rockchip->mgmt_rst);
 	if (err) {
 		dev_err(dev, "deassert mgmt_rst err %d\n", err);
-		return err;
+		goto err_power_off_phy;
 	}
 
 	err = reset_control_deassert(rockchip->pipe_rst);
 	if (err) {
 		dev_err(dev, "deassert pipe_rst err %d\n", err);
-		return err;
+		goto err_power_off_phy;
 	}
 
 	/* Fix the transmitted FTS count desired to exit from L0s. */
@@ -690,7 +690,7 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
 				 500 * USEC_PER_MSEC);
 	if (err) {
 		dev_err(dev, "PCIe link training gen1 timeout!\n");
-		return -ETIMEDOUT;
+		goto err_power_off_phy;
 	}
 
 	if (rockchip->link_gen == 2) {
@@ -751,6 +751,14 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
 	rockchip_pcie_write(rockchip, status, PCIE_RC_CONFIG_DCSR);
 
 	return 0;
+err_power_off_phy:
+	while (i--)
+		phy_power_off(rockchip->phys[i]);
+	i = MAX_LANE_NUM;
+err_exit_phy:
+	while (i--)
+		phy_exit(rockchip->phys[i]);
+	return err;
 }
 
 static void rockchip_pcie_deinit_phys(struct rockchip_pcie *rockchip)
@@ -1481,7 +1489,7 @@ static int __maybe_unused rockchip_pcie_resume_noirq(struct device *dev)
 
 	err = rockchip_pcie_cfg_atu(rockchip);
 	if (err)
-		goto err_pcie_resume;
+		goto err_err_deinit_port;
 
 	/* Need this to enter L1 again */
 	rockchip_pcie_update_txcredit_mui(rockchip);
@@ -1489,6 +1497,8 @@ static int __maybe_unused rockchip_pcie_resume_noirq(struct device *dev)
 
 	return 0;
 
+err_err_deinit_port:
+	rockchip_pcie_deinit_phys(rockchip);
 err_pcie_resume:
 	rockchip_pcie_disable_clocks(rockchip);
 	return err;
@@ -1554,12 +1564,12 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 
 	err = rockchip_pcie_init_irq_domain(rockchip);
 	if (err < 0)
-		goto err_vpcie;
+		goto err_deinit_port;
 
 	err = of_pci_get_host_bridge_resources(dev->of_node, 0, 0xff,
 					       &res, &io_base);
 	if (err)
-		goto err_vpcie;
+		goto err_deinit_port;
 
 	err = devm_request_pci_bus_resources(dev, &res);
 	if (err)
@@ -1631,6 +1641,8 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 
 err_free_res:
 	pci_free_resource_list(&res);
+err_deinit_port:
+	rockchip_pcie_deinit_phys(rockchip);
 err_vpcie:
 	if (!IS_ERR(rockchip->vpcie12v))
 		regulator_disable(rockchip->vpcie12v);
-- 
1.9.1


* [PATCH v5 08/10] PCI: rockchip: disable vpcie0v9 for resume_noirq error handling path
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
                   ` (6 preceding siblings ...)
  2017-08-23  7:03 ` [PATCH v5 07/10] PCI: rockchip: Clean up PHY if driver probe or resume fails Shawn Lin
@ 2017-08-23  7:03 ` Shawn Lin
  2017-08-23  7:03 ` [PATCH v5 09/10] PCI: rockchip: remove irq domain if failing to probe Shawn Lin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

From: Jeffy Chen <jeffy.chen@rock-chips.com>

Disable the vpcie0v9 regulator if the resume_noirq callback
fails to complete.

Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 42dcb3d..6dbd3fd 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -1481,7 +1481,7 @@ static int __maybe_unused rockchip_pcie_resume_noirq(struct device *dev)
 
 	err = rockchip_pcie_enable_clocks(rockchip);
 	if (err)
-		return err;
+		goto err_disable_0v9;
 
 	err = rockchip_pcie_init_port(rockchip);
 	if (err)
@@ -1501,6 +1501,9 @@ static int __maybe_unused rockchip_pcie_resume_noirq(struct device *dev)
 	rockchip_pcie_deinit_phys(rockchip);
 err_pcie_resume:
 	rockchip_pcie_disable_clocks(rockchip);
+err_disable_0v9:
+	if (!IS_ERR(rockchip->vpcie0v9))
+		regulator_disable(rockchip->vpcie0v9);
 	return err;
 }
 
-- 
1.9.1


* [PATCH v5 09/10] PCI: rockchip: remove irq domain if failing to probe
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
                   ` (7 preceding siblings ...)
  2017-08-23  7:03 ` [PATCH v5 08/10] PCI: rockchip: disable vpcie0v9 for resume_noirq error handling path Shawn Lin
@ 2017-08-23  7:03 ` Shawn Lin
  2017-08-23  7:03 ` [PATCH v5 10/10] PCI: rockchip: umap io space " Shawn Lin
  2017-08-25 21:38 ` [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Bjorn Helgaas
  10 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

From: Jeffy Chen <jeffy.chen@rock-chips.com>

Call irq_domain_remove() on the probe error path; this cleanup
was missing when probe failed.

Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 6dbd3fd..e752d3e 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -1572,7 +1572,7 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 	err = of_pci_get_host_bridge_resources(dev->of_node, 0, 0xff,
 					       &res, &io_base);
 	if (err)
-		goto err_deinit_port;
+		goto err_remove_irq_domain;
 
 	err = devm_request_pci_bus_resources(dev, &res);
 	if (err)
@@ -1644,6 +1644,8 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 
 err_free_res:
 	pci_free_resource_list(&res);
+err_remove_irq_domain:
+	irq_domain_remove(rockchip->irq_domain);
 err_deinit_port:
 	rockchip_pcie_deinit_phys(rockchip);
 err_vpcie:
-- 
1.9.1


* [PATCH v5 10/10] PCI: rockchip: umap io space if failing to probe
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
                   ` (8 preceding siblings ...)
  2017-08-23  7:03 ` [PATCH v5 09/10] PCI: rockchip: remove irq domain if failing to probe Shawn Lin
@ 2017-08-23  7:03 ` Shawn Lin
  2017-08-25 21:38 ` [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Bjorn Helgaas
  10 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-23  7:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

From: Jeffy Chen <jeffy.chen@rock-chips.com>

Call pci_unmap_iospace() on the probe error path; this cleanup
was missing when probe failed.

Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>

---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/host/pcie-rockchip.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index e752d3e..70d878b 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -1610,12 +1610,12 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 
 	err = rockchip_pcie_cfg_atu(rockchip);
 	if (err)
-		goto err_free_res;
+		goto err_unmap_iospace;
 
 	rockchip->msg_region = devm_ioremap(dev, rockchip->msg_bus_addr, SZ_1M);
 	if (!rockchip->msg_region) {
 		err = -ENOMEM;
-		goto err_free_res;
+		goto err_unmap_iospace;
 	}
 
 	list_splice_init(&res, &bridge->windows);
@@ -1628,7 +1628,7 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 
 	err = pci_scan_root_bus_bridge(bridge);
 	if (err < 0)
-		goto err_free_res;
+		goto err_unmap_iospace;
 
 	bus = bridge->bus;
 
@@ -1642,6 +1642,8 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
 	pci_bus_add_devices(bus);
 	return 0;
 
+err_unmap_iospace:
+	pci_unmap_iospace(rockchip->io);
 err_free_res:
 	pci_free_resource_list(&res);
 err_remove_irq_domain:
-- 
1.9.1


* Re: [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
  2017-08-23  7:02 ` [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ Shawn Lin
@ 2017-08-24 20:21   ` Bjorn Helgaas
  2017-08-24 21:10     ` Dmitry Torokhov
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Bjorn Helgaas @ 2017-08-24 20:21 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Brian Norris,
	Jeffy Chen, Tejun Heo, Dmitry Torokhov, Michael Turquette,
	Stephen Boyd, linux-clk

[+cc Tejun, Dmitry, Michael, Stephen, linux-clk for devm/clk questions]

On Wed, Aug 23, 2017 at 03:02:38PM +0800, Shawn Lin wrote:
> With CONFIG_DEBUG_SHIRQ enabled, the irq teardown routine
> still calls the irq handler registered for a shared irq.
> The comment in __free_irq says "It's a shared IRQ -- the
> driver ought to be prepared for an IRQ event to happen even
> now it's being freed". However, when the driver fails to
> probe, it may disable the clock needed for register access,
> and the subsequent shared-irq check then calls the irq
> handler, which accesses the registers w/o the clk enabled.
> That hangs the system forever.
> 
> Adding a dump_stack() call shows how that happened.
> 
> calling  rockchip_pcie_driver_init+0x0/0x28 @ 1
> rockchip-pcie f8000000.pcie: no vpcie3v3 regulator found
> rockchip-pcie f8000000.pcie: no vpcie1v8 regulator found
> rockchip-pcie f8000000.pcie: no vpcie0v9 regulator found
> rockchip-pcie f8000000.pcie: PCIe link training gen1 timeout!
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc3-next-20170807-ARCH+ #189
> Hardware name: Firefly-RK3399 Board (DT)
> Call trace:
> [<ffff000008089bf0>] dump_backtrace+0x0/0x250
> [<ffff000008089eb0>] show_stack+0x20/0x28
> [<ffff000008c3313c>] dump_stack+0x90/0xb0
> [<ffff000008632ad4>] rockchip_pcie_read.isra.11+0x54/0x58
> [<ffff0000086334fc>] rockchip_pcie_client_irq_handler+0x30/0x1a0
> [<ffff00000813ce98>] __free_irq+0x1c8/0x2dc
> [<ffff00000813d044>] free_irq+0x44/0x74
> [<ffff0000081415fc>] devm_irq_release+0x24/0x2c
> [<ffff00000877429c>] release_nodes+0x1d8/0x30c
> [<ffff000008774838>] devres_release_all+0x3c/0x5c
> [<ffff00000876f19c>] driver_probe_device+0x244/0x494
> [<ffff00000876f50c>] __driver_attach+0x120/0x124
> [<ffff00000876cb80>] bus_for_each_dev+0x6c/0xac
> [<ffff00000876e984>] driver_attach+0x2c/0x34
> [<ffff00000876e3a4>] bus_add_driver+0x244/0x2b0
> [<ffff000008770264>] driver_register+0x70/0x110
> [<ffff0000087718b4>] platform_driver_register+0x60/0x6c
> [<ffff0000091eb108>] rockchip_pcie_driver_init+0x20/0x28
> [<ffff000008083a2c>] do_one_initcall+0xc8/0x130
> [<ffff0000091a0ea8>] kernel_init_freeable+0x1a0/0x238
> [<ffff000008c461cc>] kernel_init+0x18/0x108
> [<ffff0000080836c0>] ret_from_fork+0x10/0x50
> 
> To fix this, remove all the clock disabling from the error
> handling path and the driver's remove function, and rely on
> devm_add_action_or_reset to disable the clocks at the
> appropriate time. Also split out rockchip_pcie_setup_irq
> and move requesting irq after enabling clks to avoid this kind

Thanks for splitting out the refactoring stuff.  That really makes
this patch much simpler.

IIUC, this really has nothing to do with CONFIG_DEBUG_SHIRQ.  It may
be true that you've only *seen* the problem with CONFIG_DEBUG_SHIRQ
enabled, but all that config option does is take a situation that
could happen at any time (another device sharing the IRQ generating an
interrupt), and force it to happen.  So it's just a way to expose an
existing driver problem.

The real problem is apparently that rockchip_pcie_subsys_irq_handler()
relies on some clock being enabled, but we're leaving it registered at
a time when the clock has already been disabled.

You fixed that by using devm_add_action_or_reset() to tell devm to
disable the clocks *after* releasing the IRQ.
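
To illustrate the ordering: devres runs release actions in the reverse
order of registration, so an action added before the IRQ is requested
runs after the IRQ has been freed. The following is a minimal userspace
sketch of that behaviour, not kernel code; struct fake_devres,
fake_devm_add_action() and the other fake_* names are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only: none of these names are kernel APIs. */
#define MAX_ACTIONS 8

struct fake_devres {
	void (*action[MAX_ACTIONS])(void *);
	void *data[MAX_ACTIONS];
	size_t count;
};

/* Records the order in which teardown steps ran. */
static char release_log[MAX_ACTIONS + 1];

static void log_step(char c)
{
	size_t n = strlen(release_log);

	assert(n < MAX_ACTIONS);
	release_log[n] = c;
	release_log[n + 1] = '\0';
}

/* 'C' marks the clocks being disabled, 'I' marks the IRQ being freed. */
static void fake_disable_clocks(void *data)
{
	(void)data;
	log_step('C');
}

static void fake_free_irq(void *data)
{
	(void)data;
	log_step('I');
}

/* Push a release action; later registrations are released first. */
static int fake_devm_add_action(struct fake_devres *dr,
				void (*fn)(void *), void *data)
{
	if (dr->count == MAX_ACTIONS)
		return -1;
	dr->action[dr->count] = fn;
	dr->data[dr->count] = data;
	dr->count++;
	return 0;
}

/* Run every registered action in reverse order of registration. */
static void fake_devres_release_all(struct fake_devres *dr)
{
	while (dr->count > 0) {
		dr->count--;
		dr->action[dr->count](dr->data[dr->count]);
	}
}
```

Registering fake_disable_clocks() first and fake_free_irq() second, then
calling fake_devres_release_all(), frees the IRQ before the clocks are
disabled, which is the ordering the patch relies on.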

That sort of makes sense, but devm_add_action_or_reset() is a little
obscure, and this feels like a hole in the devm framework.  Seems like
it would be nice if there were some sort of devm wrapper for
clk_prepare_enable() so this would happen automatically.

This pattern:

  clk = devm_clk_get(...);
  if (IS_ERR(clk)) {
    dev_warn("no clock for ...");
    return PTR_ERR(clk);
  }

  ret = clk_prepare_enable(clk);
  if (ret) {
    dev_warn("failed to enable ...");
    return ret;
  }

is quite common ("git grep -A10 devm_clk_get | grep clk_prepare_enable
 | wc -l" finds over 400 occurrences).  Should there be something to
simplify this a little?
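
One possible shape for such a wrapper, sketched as a userspace mock so
that it can be compiled and run outside the kernel. All mock_* names are
hypothetical and the error values are arbitrary stand-ins; this is only
an illustration of pairing the enable with an automatically run disable,
not an existing or proposed kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace mock: these types and functions only imitate the kernel's. */
#define MOCK_MAX_UNDO 4

struct mock_clk {
	int enable_count;
	bool fail_enable;	/* force mock_clk_prepare_enable() to fail */
};

struct mock_device {
	void (*undo[MOCK_MAX_UNDO])(void *);
	void *undo_data[MOCK_MAX_UNDO];
	size_t nr_undo;
};

static int mock_clk_prepare_enable(struct mock_clk *clk)
{
	if (clk->fail_enable)
		return -5;	/* stand-in for a negative errno */
	clk->enable_count++;
	return 0;
}

static void mock_clk_disable_unprepare(void *data)
{
	struct mock_clk *clk = data;

	assert(clk->enable_count > 0);
	clk->enable_count--;
}

/* Like devm_add_action_or_reset(): on failure, run the action right away. */
static int mock_add_action_or_reset(struct mock_device *dev,
				    void (*fn)(void *), void *data)
{
	if (dev->nr_undo == MOCK_MAX_UNDO) {
		fn(data);
		return -12;	/* stand-in for a negative errno */
	}
	dev->undo[dev->nr_undo] = fn;
	dev->undo_data[dev->nr_undo] = data;
	dev->nr_undo++;
	return 0;
}

/* Hypothetical helper: enable now, disable automatically at release. */
static int mock_devm_clk_prepare_enable(struct mock_device *dev,
					struct mock_clk *clk)
{
	int ret = mock_clk_prepare_enable(clk);

	if (ret)
		return ret;
	return mock_add_action_or_reset(dev, mock_clk_disable_unprepare, clk);
}

/* Release in reverse registration order, as devres does. */
static void mock_release_all(struct mock_device *dev)
{
	while (dev->nr_undo > 0) {
		dev->nr_undo--;
		dev->undo[dev->nr_undo](dev->undo_data[dev->nr_undo]);
	}
}
```

With a helper like this, a caller would need only one error check per
clock and no explicit disable in its error or remove paths.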

I also wonder about other PCI host drivers that use both
clk_prepare_enable() and devm_request_irq().  Maybe Rockchip is
"special" in that it seems the driver must turn on a clock before it
can even talk to the host controller, whereas maybe other drivers can
always talk to the host controller, but need to turn on clocks
downstream from the controller.  I didn't audit them, but I'm
concerned that some of them might have this same problem.

> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> 
> ---
> 
> Changes in v5:
> - rebase on former reconstrtion patches suggested by Bjorn
> 
> Changes in v4:
> - split out rockchip_pcie_enable_clocks and reuse
>   rockchip_pcie_enable_clocks and rockchip_pcie_disable_clocks
>   for elsewhere suggested by Jeffy
> 
> Changes in v3:
> - check the return value of devm_add_action_or_reset and spilt out
>   rockchip_pcie_setup_irq in order to move requesting irq after
>   enabling clks.
> 
> Changes in v2:
> - use devm_add_action_or_reset to fix this ordering suggested by
>   Heiko and Jeffy. Thanks!
> 
>  drivers/pci/host/pcie-rockchip.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index 971d22b..891b60a 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -1099,10 +1099,6 @@ static int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
>  		return PTR_ERR(rockchip->clk_pcie_pm);
>  	}
>  
> -	err = rockchip_pcie_setup_irq(rockchip);
> -	if (err)
> -		return err;
> -
>  	rockchip->vpcie12v = devm_regulator_get_optional(dev, "vpcie12v");
>  	if (IS_ERR(rockchip->vpcie12v)) {
>  		if (PTR_ERR(rockchip->vpcie12v) == -EPROBE_DEFER)
> @@ -1525,10 +1521,22 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
>  	if (err)
>  		return err;
>  
> +	err = devm_add_action_or_reset(dev,
> +				       rockchip_pcie_disable_clocks,
> +				       rockchip);
> +	if (err) {
> +		dev_err(dev, "unable to add action or reset\n");
> +		return err;
> +	}
> +
> +	err = rockchip_pcie_setup_irq(rockchip);
> +	if (err)
> +		return err;
> +
>  	err = rockchip_pcie_set_vpcie(rockchip);
>  	if (err) {
>  		dev_err(dev, "failed to set vpcie regulator\n");
> -		goto err_set_vpcie;
> +		return err;
>  	}
>  
>  	err = rockchip_pcie_init_port(rockchip);
> @@ -1625,8 +1633,6 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
>  		regulator_disable(rockchip->vpcie1v8);
>  	if (!IS_ERR(rockchip->vpcie0v9))
>  		regulator_disable(rockchip->vpcie0v9);
> -err_set_vpcie:
> -	rockchip_pcie_disable_clocks(rockchip);
>  	return err;
>  }
>  
> @@ -1648,8 +1654,6 @@ static int rockchip_pcie_remove(struct platform_device *pdev)
>  		phy_exit(rockchip->phys[i]);
>  	}
>  
> -	rockchip_pcie_disable_clocks(rockchip);
> -
>  	if (!IS_ERR(rockchip->vpcie12v))
>  		regulator_disable(rockchip->vpcie12v);
>  	if (!IS_ERR(rockchip->vpcie3v3))
> -- 
> 1.9.1
> 
> 


* Re: [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
  2017-08-24 20:21   ` Bjorn Helgaas
@ 2017-08-24 21:10     ` Dmitry Torokhov
  2017-08-25  1:44       ` Brian Norris
  2017-08-25  1:05     ` jeffy
  2017-08-25  1:38     ` Shawn Lin
  2 siblings, 1 reply; 22+ messages in thread
From: Dmitry Torokhov @ 2017-08-24 21:10 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Shawn Lin, Bjorn Helgaas, Linux PCI,
	open list:ARM/Rockchip SoC...,
	Brian Norris, Jeffy Chen, Tejun Heo, Michael Turquette,
	Stephen Boyd, linux-clk

On Thu, Aug 24, 2017 at 1:21 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> [+cc Tejun, Dmitry, Michael, Stephen, linux-clk for devm/clk questions]
>
> On Wed, Aug 23, 2017 at 03:02:38PM +0800, Shawn Lin wrote:
>> With CONFIG_DEBUG_SHIRQ enabled, the irq teardown routine
>> still calls the irq handler registered for a shared irq.
>> The comment in __free_irq says "It's a shared IRQ -- the
>> driver ought to be prepared for an IRQ event to happen even
>> now it's being freed". However, when the driver fails to
>> probe, it may disable the clock needed for register access,
>> and the subsequent shared-irq check then calls the irq
>> handler, which accesses the registers w/o the clk enabled.
>> That hangs the system forever.
>>
>> Adding a dump_stack() call shows how that happened.
>>
>> calling  rockchip_pcie_driver_init+0x0/0x28 @ 1
>> rockchip-pcie f8000000.pcie: no vpcie3v3 regulator found
>> rockchip-pcie f8000000.pcie: no vpcie1v8 regulator found
>> rockchip-pcie f8000000.pcie: no vpcie0v9 regulator found
>> rockchip-pcie f8000000.pcie: PCIe link training gen1 timeout!
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc3-next-20170807-ARCH+ #189
>> Hardware name: Firefly-RK3399 Board (DT)
>> Call trace:
>> [<ffff000008089bf0>] dump_backtrace+0x0/0x250
>> [<ffff000008089eb0>] show_stack+0x20/0x28
>> [<ffff000008c3313c>] dump_stack+0x90/0xb0
>> [<ffff000008632ad4>] rockchip_pcie_read.isra.11+0x54/0x58
>> [<ffff0000086334fc>] rockchip_pcie_client_irq_handler+0x30/0x1a0
>> [<ffff00000813ce98>] __free_irq+0x1c8/0x2dc
>> [<ffff00000813d044>] free_irq+0x44/0x74
>> [<ffff0000081415fc>] devm_irq_release+0x24/0x2c
>> [<ffff00000877429c>] release_nodes+0x1d8/0x30c
>> [<ffff000008774838>] devres_release_all+0x3c/0x5c
>> [<ffff00000876f19c>] driver_probe_device+0x244/0x494
>> [<ffff00000876f50c>] __driver_attach+0x120/0x124
>> [<ffff00000876cb80>] bus_for_each_dev+0x6c/0xac
>> [<ffff00000876e984>] driver_attach+0x2c/0x34
>> [<ffff00000876e3a4>] bus_add_driver+0x244/0x2b0
>> [<ffff000008770264>] driver_register+0x70/0x110
>> [<ffff0000087718b4>] platform_driver_register+0x60/0x6c
>> [<ffff0000091eb108>] rockchip_pcie_driver_init+0x20/0x28
>> [<ffff000008083a2c>] do_one_initcall+0xc8/0x130
>> [<ffff0000091a0ea8>] kernel_init_freeable+0x1a0/0x238
>> [<ffff000008c461cc>] kernel_init+0x18/0x108
>> [<ffff0000080836c0>] ret_from_fork+0x10/0x50
>>
>> To fix this, remove all the clock disabling from the error
>> handling path and the driver's remove function, and rely on
>> devm_add_action_or_reset to disable the clocks at the
>> appropriate time. Also split out rockchip_pcie_setup_irq
>> and move requesting irq after enabling clks to avoid this kind
>
> Thanks for splitting out the refactoring stuff.  That really makes
> this patch much simpler.
>
> IIUC, this really has nothing to do with CONFIG_DEBUG_SHIRQ.  It may
> be true that you've only *seen* the problem with CONFIG_DEBUG_SHIRQ
> enabled, but all that config option does is take a situation that
> could happen at any time (another device sharing the IRQ generating an
> interrupt), and force it to happen.  So it's just a way to expose an
> existing driver problem.
>
> The real problem is apparently that rockchip_pcie_subsys_irq_handler()
> relies on some clock being enabled, but we're leaving it registered at
> a time when the clock has already been disabled.
>
> You fixed that by using devm_add_action_or_reset() to tell devm to
> disable the clocks *after* releasing the IRQ.
>
> That sort of makes sense, but devm_add_action_or_reset() is a little
> obscure, and this feels like a hole in the devm framework.  Seems like
> it would be nice if there were some sort of devm wrapper for
> clk_prepare_enable() so this would happen automatically.
>
> This pattern:
>
>   clk = devm_clk_get(...);
>   if (IS_ERR(clk)) {
>     dev_warn("no clock for ...");
>     return PTR_ERR(clk);
>   }
>
>   ret = clk_prepare_enable(clk);
>   if (ret) {
>     dev_warn("failed to enable ...");
>     return ret;
>   }
>
> is quite common ("git grep -A10 devm_clk_get | grep clk_prepare_enable
>  | wc -l" finds over 400 occurrences).  Should there be something to
> simplify this a little?
>
> I also wonder about other PCI host drivers that use both
> clk_prepare_enable() and devm_request_irq().  Maybe Rockchip is
> "special" in that it seems the driver must turn on a clock before it
> can even talk to the host controller, whereas maybe other drivers can
> always talk to the host controller, but need to turn on clocks
> downstream from the controller.  I didn't audit them, but I'm
> concerned that some of them might have this same problem.

I proposed devm_clk_prepare_enable() and friends (see
https://lkml.org/lkml/2017/2/14/544), but Stephen did not like it and
mentioned that he and Mike were working on a different solution where
clk_put() would drop all enables. I have not seen any updates on that
though. Maybe we should revisit the devm approach?

Thanks.

-- 
Dmitry


* Re: [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
  2017-08-24 20:21   ` Bjorn Helgaas
  2017-08-24 21:10     ` Dmitry Torokhov
@ 2017-08-25  1:05     ` jeffy
  2017-08-25  1:38     ` Shawn Lin
  2 siblings, 0 replies; 22+ messages in thread
From: jeffy @ 2017-08-25  1:05 UTC (permalink / raw)
  To: Bjorn Helgaas, Shawn Lin
  Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Brian Norris,
	Tejun Heo, Dmitry Torokhov, Michael Turquette, Stephen Boyd,
	linux-clk

Hi Bjorn,

On 08/25/2017 04:21 AM, Bjorn Helgaas wrote:
>> > To fix this, remove all the clock disabling from the error
>> > handling path and the driver's remove function, and rely on
>> > devm_add_action_or_reset to disable the clocks at the
>> > appropriate time. Also split out rockchip_pcie_setup_irq
>> > and move requesting irq after enabling clks to avoid this kind
> Thanks for splitting out the refactoring stuff.  That really makes
> this patch much simpler.
>
> IIUC, this really has nothing to do with CONFIG_DEBUG_SHIRQ.  It may
> be true that you've only *seen* the problem with CONFIG_DEBUG_SHIRQ
> enabled, but all that config option does is take a situation that
> could happen at any time (another device sharing the IRQ generating an
> interrupt), and force it to happen.  So it's just a way to expose an
> existing driver problem.
Yes, and I wonder whether it would make more sense to ignore those
irqs (they are triggered by other devices, and we no longer need to
care about them once we have unregistered) rather than trying to hold
all the needed resources (clks, power domains, and maybe some others)
for that.

Maybe we can just make sure the irq handler is unregistered as soon as
we stop caring about the irqs? Or maybe add a flag to tell the irq
handler to stop processing them?
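
A rough userspace model of the flag idea, for illustration only. The
fake_* names and the fake_irqreturn enum are invented here, and a real
driver would presumably also have to wait for handlers that are already
running (for example with synchronize_irq()) before gating the clock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model; the enum and struct only imitate kernel concepts. */
enum fake_irqreturn { FAKE_IRQ_NONE, FAKE_IRQ_HANDLED };

struct fake_pcie {
	atomic_bool irq_active;	/* cleared before clocks are turned off */
	bool clocks_on;
	int reg_reads;		/* counts simulated register accesses */
};

/* Stands in for a register read that would hang with the clock gated. */
static int fake_read_status(struct fake_pcie *pcie)
{
	assert(pcie->clocks_on);
	pcie->reg_reads++;
	return 1;	/* pretend our device raised the interrupt */
}

static enum fake_irqreturn fake_irq_handler(struct fake_pcie *pcie)
{
	/* Bail out before any register access once teardown has begun. */
	if (!atomic_load(&pcie->irq_active))
		return FAKE_IRQ_NONE;

	return fake_read_status(pcie) ? FAKE_IRQ_HANDLED : FAKE_IRQ_NONE;
}

/* Clear the flag first, then gate the clocks. */
static void fake_teardown(struct fake_pcie *pcie)
{
	atomic_store(&pcie->irq_active, false);
	pcie->clocks_on = false;
}
```

After fake_teardown(), a late call to fake_irq_handler() returns
FAKE_IRQ_NONE without touching the simulated registers.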

>
> The real problem is apparently that rockchip_pcie_subsys_irq_handler()
> relies on some clock being enabled, but we're leaving it registered at
> a time when the clock has already been disabled.
>
> You fixed that by using devm_add_action_or_reset() to tell devm to
> disable the clocks *after* releasing the IRQ.
>
> That sort of makes sense, but devm_add_action_or_reset() is a little
> obscure, and this feels like a hole in the devm framework.  Seems like
> it would be nice if there were some sort of devm wrapper for
> clk_prepare_enable() so this would happen automatically.
>
> This pattern:
>
>    clk = devm_clk_get(...);
>    if (IS_ERR(clk)) {
>      dev_warn("no clock for ...");
>      return PTR_ERR(clk);
>    }
>
>    ret = clk_prepare_enable(clk);
>    if (ret) {
>      dev_warn("failed to enable ...");
>      return ret;
>    }
>
> is quite common ("git grep -A10 devm_clk_get | grep clk_prepare_enable
>   | wc -l" finds over 400 occurrences).  Should there be something to
> simplify this a little?
>
> I also wonder about other PCI host drivers that use both
> clk_prepare_enable() and devm_request_irq().  Maybe Rockchip is
> "special" in that it seems the driver must turn on a clock before it
> can even talk to the host controller, whereas maybe other drivers can
> always talk to the host controller, but need to turn on clocks
> downstream from the controller.  I didn't audit them, but I'm
> concerned that some of them might have this same problem.
>


* Re: [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
  2017-08-24 20:21   ` Bjorn Helgaas
  2017-08-24 21:10     ` Dmitry Torokhov
  2017-08-25  1:05     ` jeffy
@ 2017-08-25  1:38     ` Shawn Lin
  2 siblings, 0 replies; 22+ messages in thread
From: Shawn Lin @ 2017-08-25  1:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: shawn.lin, Bjorn Helgaas, linux-pci, linux-rockchip,
	Brian Norris, Jeffy Chen, Tejun Heo, Dmitry Torokhov,
	Michael Turquette, Stephen Boyd, linux-clk

Hi Bjorn,

On 2017/8/25 4:21, Bjorn Helgaas wrote:
> [+cc Tejun, Dmitry, Michael, Stephen, linux-clk for devm/clk questions]
> 
> On Wed, Aug 23, 2017 at 03:02:38PM +0800, Shawn Lin wrote:
>> With CONFIG_DEBUG_SHIRQ enabled, the irq teardown routine
>> still calls the irq handler registered for a shared irq.
>> The comment in __free_irq says "It's a shared IRQ -- the
>> driver ought to be prepared for an IRQ event to happen even
>> now it's being freed". However, when the driver fails to
>> probe, it may disable the clock needed for register access,
>> and the subsequent shared-irq check then calls the irq
>> handler, which accesses the registers w/o the clk enabled.
>> That hangs the system forever.
>>
>> Adding a dump_stack() call shows how that happened.
>>
>> calling  rockchip_pcie_driver_init+0x0/0x28 @ 1
>> rockchip-pcie f8000000.pcie: no vpcie3v3 regulator found
>> rockchip-pcie f8000000.pcie: no vpcie1v8 regulator found
>> rockchip-pcie f8000000.pcie: no vpcie0v9 regulator found
>> rockchip-pcie f8000000.pcie: PCIe link training gen1 timeout!
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc3-next-20170807-ARCH+ #189
>> Hardware name: Firefly-RK3399 Board (DT)
>> Call trace:
>> [<ffff000008089bf0>] dump_backtrace+0x0/0x250
>> [<ffff000008089eb0>] show_stack+0x20/0x28
>> [<ffff000008c3313c>] dump_stack+0x90/0xb0
>> [<ffff000008632ad4>] rockchip_pcie_read.isra.11+0x54/0x58
>> [<ffff0000086334fc>] rockchip_pcie_client_irq_handler+0x30/0x1a0
>> [<ffff00000813ce98>] __free_irq+0x1c8/0x2dc
>> [<ffff00000813d044>] free_irq+0x44/0x74
>> [<ffff0000081415fc>] devm_irq_release+0x24/0x2c
>> [<ffff00000877429c>] release_nodes+0x1d8/0x30c
>> [<ffff000008774838>] devres_release_all+0x3c/0x5c
>> [<ffff00000876f19c>] driver_probe_device+0x244/0x494
>> [<ffff00000876f50c>] __driver_attach+0x120/0x124
>> [<ffff00000876cb80>] bus_for_each_dev+0x6c/0xac
>> [<ffff00000876e984>] driver_attach+0x2c/0x34
>> [<ffff00000876e3a4>] bus_add_driver+0x244/0x2b0
>> [<ffff000008770264>] driver_register+0x70/0x110
>> [<ffff0000087718b4>] platform_driver_register+0x60/0x6c
>> [<ffff0000091eb108>] rockchip_pcie_driver_init+0x20/0x28
>> [<ffff000008083a2c>] do_one_initcall+0xc8/0x130
>> [<ffff0000091a0ea8>] kernel_init_freeable+0x1a0/0x238
>> [<ffff000008c461cc>] kernel_init+0x18/0x108
>> [<ffff0000080836c0>] ret_from_fork+0x10/0x50
>>
>> To fix this, remove all the clock disabling from the error
>> handling path and the driver's remove function, and rely on
>> devm_add_action_or_reset to disable the clocks at the
>> appropriate time. Also split out rockchip_pcie_setup_irq
>> and move requesting irq after enabling clks to avoid this kind
> 
> Thanks for splitting out the refactoring stuff.  That really makes
> this patch much simpler.
> 
> IIUC, this really has nothing to do with CONFIG_DEBUG_SHIRQ.  It may
> be true that you've only *seen* the problem with CONFIG_DEBUG_SHIRQ
> enabled, but all that config option does is take a situation that
> could happen at any time (another device sharing the IRQ generating an
> interrupt), and force it to happen.  So it's just a way to expose an
> existing driver problem.

Right.

> 
> The real problem is apparently that rockchip_pcie_subsys_irq_handler()
> relies on some clock being enabled, but we're leaving it registered at
> a time when the clock has already been disabled.
> 
> You fixed that by using devm_add_action_or_reset() to tell devm to
> disable the clocks *after* releasing the IRQ.
> 
> That sort of makes sense, but devm_add_action_or_reset() is a little
> obscure, and this feels like a hole in the devm framework.  Seems like
> it would be nice if there were some sort of devm wrapper for
> clk_prepare_enable() so this would happen automatically.

Yes, I would appreciate a devm wrapper for clk_prepare_enable so
that we don't have to resort to devm_add_action_or_reset.

> 
> This pattern:
> 
>    clk = devm_clk_get(...);
>    if (IS_ERR(clk)) {
>      dev_warn("no clock for ...");
>      return PTR_ERR(clk);
>    }
> 
>    ret = clk_prepare_enable(clk);
>    if (ret) {
>      dev_warn("failed to enable ...");
>      return ret;
>    }
> 
> is quite common ("git grep -A10 devm_clk_get | grep clk_prepare_enable
>   | wc -l" finds over 400 occurrences).  Should there be something to
> simplify this a little?
> 
> I also wonder about other PCI host drivers that use both
> clk_prepare_enable() and devm_request_irq().  Maybe Rockchip is
> "special" in that it seems the driver must turn on a clock before it
> can even talk to the host controller, whereas maybe other drivers can

IIRC, some of the other ARM SoCs have the same problem.

> always talk to the host controller, but need to turn on clocks
> downstream from the controller.  I didn't audit them, but I'm
> concerned that some of them might have this same problem.

That is my concern as well. But we may face an even worse situation,
judging from a random search through the DT:

arch/arm64/boot/dts/renesas/r8a7795.dtsi includes a power-domains
property for pcie-rcar, and pcie-rcar also registers a shared irq. So
the power domain would be powered off immediately once probe fails or
->remove() is called, even *before* the devm cleanup runs. In other
words, I am not confident that the Renesas CPU can access the PCIe IP
without the power domain in the 'on' state.

I posted a patch to fix this in the driver core but haven't received
any feedback yet (https://lkml.org/lkml/2017/8/15/146). It doesn't
affect pcie-rockchip *now*, as we don't have a power domain for it, but
it's highly relevant to the problem we are discussing.

Finally, as a last resort, if we can't agree on either adding a devm
clk_prepare_enable wrapper or adjusting the order in which the power
domain is powered off, we will have to stop using devm_request_irq and
use request_irq/free_irq instead in all the potentially problematic
drivers...
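
The last-resort option could look roughly like the following userspace
model, where the IRQ is requested without devm and freed explicitly
before the clocks are turned off. struct fake_host, fake_probe() and
fake_free_irq() are invented for this sketch; the check inside
fake_free_irq() imitates the extra handler call that the shared-IRQ
debug check makes when the IRQ is freed.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model; every name below is a stand-in, not a kernel API. */
struct fake_host {
	bool clocks_on;
	bool irq_requested;
	bool fail_port_init;	/* simulate e.g. a link training timeout */
	bool handler_ran_unclocked;
};

/* Mimics the extra handler call a shared-IRQ debug check makes on free. */
static void fake_free_irq(struct fake_host *host)
{
	if (!host->clocks_on)
		host->handler_ran_unclocked = true;
	host->irq_requested = false;
}

static int fake_probe(struct fake_host *host)
{
	int err;

	host->clocks_on = true;
	host->irq_requested = true;	/* plain request, not devm-managed */

	if (host->fail_port_init) {
		err = -1;	/* stand-in for a negative errno */
		goto err_free_irq;
	}
	return 0;

err_free_irq:
	/* Free the IRQ first, while its handler can still reach registers. */
	fake_free_irq(host);
	host->clocks_on = false;
	return err;
}
```

Because the driver controls the unwind order itself, the handler is
never invoked with the clocks off, at the cost of a longer error path.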


> 
>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
>>
>> ---
>>
>> Changes in v5:
>> - rebase on former reconstrtion patches suggested by Bjorn
>>
>> Changes in v4:
>> - split out rockchip_pcie_enable_clocks and reuse
>>    rockchip_pcie_enable_clocks and rockchip_pcie_disable_clocks
>>    for elsewhere suggested by Jeffy
>>
>> Changes in v3:
>> - check the return value of devm_add_action_or_reset and spilt out
>>    rockchip_pcie_setup_irq in order to move requesting irq after
>>    enabling clks.
>>
>> Changes in v2:
>> - use devm_add_action_or_reset to fix this ordering suggested by
>>    Heiko and Jeffy. Thanks!
>>
>>   drivers/pci/host/pcie-rockchip.c | 22 +++++++++++++---------
>>   1 file changed, 13 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
>> index 971d22b..891b60a 100644
>> --- a/drivers/pci/host/pcie-rockchip.c
>> +++ b/drivers/pci/host/pcie-rockchip.c
>> @@ -1099,10 +1099,6 @@ static int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
>>   		return PTR_ERR(rockchip->clk_pcie_pm);
>>   	}
>>   
>> -	err = rockchip_pcie_setup_irq(rockchip);
>> -	if (err)
>> -		return err;
>> -
>>   	rockchip->vpcie12v = devm_regulator_get_optional(dev, "vpcie12v");
>>   	if (IS_ERR(rockchip->vpcie12v)) {
>>   		if (PTR_ERR(rockchip->vpcie12v) == -EPROBE_DEFER)
>> @@ -1525,10 +1521,22 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
>>   	if (err)
>>   		return err;
>>   
>> +	err = devm_add_action_or_reset(dev,
>> +				       rockchip_pcie_disable_clocks,
>> +				       rockchip);
>> +	if (err) {
>> +		dev_err(dev, "unable to add action or reset\n");
>> +		return err;
>> +	}
>> +
>> +	err = rockchip_pcie_setup_irq(rockchip);
>> +	if (err)
>> +		return err;
>> +
>>   	err = rockchip_pcie_set_vpcie(rockchip);
>>   	if (err) {
>>   		dev_err(dev, "failed to set vpcie regulator\n");
>> -		goto err_set_vpcie;
>> +		return err;
>>   	}
>>   
>>   	err = rockchip_pcie_init_port(rockchip);
>> @@ -1625,8 +1633,6 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
>>   		regulator_disable(rockchip->vpcie1v8);
>>   	if (!IS_ERR(rockchip->vpcie0v9))
>>   		regulator_disable(rockchip->vpcie0v9);
>> -err_set_vpcie:
>> -	rockchip_pcie_disable_clocks(rockchip);
>>   	return err;
>>   }
>>   
>> @@ -1648,8 +1654,6 @@ static int rockchip_pcie_remove(struct platform_device *pdev)
>>   		phy_exit(rockchip->phys[i]);
>>   	}
>>   
>> -	rockchip_pcie_disable_clocks(rockchip);
>> -
>>   	if (!IS_ERR(rockchip->vpcie12v))
>>   		regulator_disable(rockchip->vpcie12v);
>>   	if (!IS_ERR(rockchip->vpcie3v3))
>> -- 
>> 1.9.1
>>
>>
> 
> 
> 

* Re: [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
  2017-08-24 21:10     ` Dmitry Torokhov
@ 2017-08-25  1:44       ` Brian Norris
  0 siblings, 0 replies; 22+ messages in thread
From: Brian Norris @ 2017-08-25  1:44 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Bjorn Helgaas, Shawn Lin, Bjorn Helgaas, Linux PCI,
	open list:ARM/Rockchip SoC...,
	Jeffy Chen, Tejun Heo, Michael Turquette, Stephen Boyd,
	linux-clk

On Thu, Aug 24, 2017 at 02:10:52PM -0700, Dmitry Torokhov wrote:
> On Thu, Aug 24, 2017 at 1:21 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > [+cc Tejun, Dmitry, Michael, Stephen, linux-clk for devm/clk questions]
> >
> > On Wed, Aug 23, 2017 at 03:02:38PM +0800, Shawn Lin wrote:
> >> With CONFIG_DEBUG_SHIRQ enabled, the irq tear down routine
> >> would still access the irq handler registered as a shared irq.
> >> Per the comment within the function of __free_irq, it says
> >> "It's a shared IRQ -- the driver ought to be prepared for
> >> an IRQ event to happen even now it's being freed". However
> >> when failing to probe the driver, it may disable the clock
> >> for accessing the register and the following check for shared
> >> irq state would call the irq handler which accesses the register
> >> w/o the clk enabled. That will hang the system forever.

Side note: why is this driver even requesting a shared IRQ? This is for
rk3399, and the IRQ is a dedicated GIC interrupt for the PCIe
controller. It shouldn't need to be 'shared'.

The problem still might not be *only* theoretical though, since it's
still possible for this non-shared interrupt to
(a) trigger
(b) concurrently, we remove/tear down (including disable clocks)
(c) we service the IRQ      <-- dead, because clock is disabled
(d) if we ever got here... free_irq()
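
For reference, the teardown ordering that the patch relies on can be
modeled in userspace. This is only a sketch, assuming devres-style
cleanup runs in reverse registration order and that the IRQ is requested
through a devm-managed call; cleanup_stack, cleanup_push(),
cleanup_unwind() and model_probe_teardown() are hypothetical names, not
kernel API:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8

/* The two cleanup steps we care about in this model. */
enum action { DISABLE_CLOCKS, FREE_IRQ };

/* Userspace stand-in for a devres-style cleanup list (hypothetical). */
struct cleanup_stack {
	enum action actions[MAX_ACTIONS];
	size_t count;
};

/* Register a cleanup action; later registrations unwind first. */
static void cleanup_push(struct cleanup_stack *s, enum action a)
{
	assert(s->count < MAX_ACTIONS);
	s->actions[s->count++] = a;
}

/* Unwind in reverse registration order, recording the order in out[]. */
static size_t cleanup_unwind(struct cleanup_stack *s, enum action *out)
{
	size_t n = 0;

	while (s->count > 0)
		out[n++] = s->actions[--s->count];
	return n;
}

/* Model the probe order from the patch: clock action first, IRQ second. */
static size_t model_probe_teardown(enum action *out)
{
	struct cleanup_stack s = { .count = 0 };

	cleanup_push(&s, DISABLE_CLOCKS);	/* models devm_add_action_or_reset() */
	cleanup_push(&s, FREE_IRQ);		/* models the later IRQ request */
	return cleanup_unwind(&s, out);
}
```

In this model the IRQ is released before the clocks are disabled, which
is the property the patch is after. It does not cover the race in
(a)-(c) during an explicit remove path.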

Brian

* Re: [PATCH v5 06/10] PCI: rockchip: fix missing phy manipulation for legacy phy
  2017-08-23  7:02 ` [PATCH v5 06/10] PCI: rockchip: fix missing phy manipulation for legacy phy Shawn Lin
@ 2017-08-25 21:18   ` Bjorn Helgaas
  0 siblings, 0 replies; 22+ messages in thread
From: Bjorn Helgaas @ 2017-08-25 21:18 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Brian Norris, Jeffy Chen

On Wed, Aug 23, 2017 at 03:02:57PM +0800, Shawn Lin wrote:
> For instance, if an EP connects to lane3 and works in legacy
> phy mode, struct phy phys[0..2] are all NULL. In this case,
> rockchip->lanes_map & BIT(i) will tell the driver that lane0 is
> already inactive, but what we actually want is to power off
> phys[0] for legacy phy mode. Fix this by adding a check of
> rockchip->legacy_phy to rockchip_pcie_deinit_phys.

This changelog is not quite correct.  If "rockchip->legacy_phy", then
rockchip->phys[0] is a valid PHY, but phys[1..3] are NULL (not 0..2).

> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
> Changes in v5: None
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
> 
>  drivers/pci/host/pcie-rockchip.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index 9cd51e0..933e3e9 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -759,7 +759,7 @@ static void rockchip_pcie_deinit_phys(struct rockchip_pcie *rockchip)
>  
>  	for (i = 0; i < MAX_LANE_NUM; i++) {
>  		/* inactive lane is already powered off */
> -		if (rockchip->lanes_map & BIT(i))
> +		if (rockchip->legacy_phy || rockchip->lanes_map & BIT(i))
>  			phy_power_off(rockchip->phys[i]);
>  		phy_exit(rockchip->phys[i]);
>  	}

I think this is harder to understand than necessary.  If we're using
legacy_phy, the pointer is in phys[0].  If we always set
rockchip->lanes_map, even in the legacy_phy case (where it would show
that only phys[0] is valid), this patch won't even be necessary.

I'd propose the following patches, which could be squashed into the
existing series on pci/host-rockchip.  The first is purely cosmetic,
as is some of the second.

The important part is this:

   static u8 rockchip_pcie_lane_map(struct rockchip_pcie *rockchip)
   {
  -       u32 val = rockchip_pcie_read(rockchip, PCIE_CORE_LANE_MAP);
  -       u8 map = val & PCIE_CORE_LANE_MAP_MASK;
  +       u32 val;
  +       u8 map;
  +
  +       if (rockchip->legacy_phy)
  +               return BIT(0);



commit d3d39c577edf63b9441d1a7614808e02721dd2b6
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Fri Aug 25 16:00:25 2017 -0500

    Possibly squash into "PCI: rockchip: Add per-lane PHY support"

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index f8b88004e20f..5ccbdbfa97d0 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -867,24 +867,25 @@ static int rockchip_pcie_get_phys(struct rockchip_pcie *rockchip)
 	char *name;
 	u32 i;
 
-	rockchip->phys[0] = devm_phy_get(dev, "pcie-phy");
-	if (IS_ERR(rockchip->phys[0])) {
-		if (PTR_ERR(rockchip->phys[0]) == -EPROBE_DEFER)
-			return PTR_ERR(rockchip->phys[0]);
-		dev_dbg(dev, "missing legacy phy; search for per-lane PHY\n");
-	} else {
+	phy = devm_phy_get(dev, "pcie-phy");
+	if (!IS_ERR(phy)) {
 		rockchip->legacy_phy = true;
+		rockchip->phys[0] = phy;
 		dev_warn(dev, "legacy phy model is deprecated!\n");
 		return 0;
 	}
 
+	if (PTR_ERR(phy) == -EPROBE_DEFER)
+		return PTR_ERR(phy);
+
+	dev_dbg(dev, "missing legacy phy; search for per-lane PHY\n");
+
 	for (i = 0; i < MAX_LANE_NUM; i++) {
 		name = kasprintf(GFP_KERNEL, "pcie-phy-%u", i);
 		if (!name)
 			return -ENOMEM;
 
-		phy = devm_of_phy_get(rockchip->dev,
-				      rockchip->dev->of_node, name);
+		phy = devm_of_phy_get(dev, dev->of_node, name);
 		kfree(name);
 
 		if (IS_ERR(phy)) {
commit 6f8bcdfe4568809437e93e2d54e68b2cba3b4ac4
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Fri Aug 25 15:39:10 2017 -0500

    Possibly squash into "PCI: rockchip: Idle inactive PHY(s)"

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 60069acd9f86..29ebfc971896 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -309,8 +309,14 @@ static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
 
 static u8 rockchip_pcie_lane_map(struct rockchip_pcie *rockchip)
 {
-	u32 val = rockchip_pcie_read(rockchip, PCIE_CORE_LANE_MAP);
-	u8 map = val & PCIE_CORE_LANE_MAP_MASK;
+	u32 val;
+	u8 map;
+
+	if (rockchip->legacy_phy)
+		return BIT(0);
+
+	val = rockchip_pcie_read(rockchip, PCIE_CORE_LANE_MAP);
+	map = val & PCIE_CORE_LANE_MAP_MASK;
 
 	/* The link may be using a reverse-indexed mapping. */
 	if (val & PCIE_CORE_LANE_MAP_REVERSE)
@@ -715,13 +721,10 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
 			  PCIE_CORE_PL_CONF_LANE_SHIFT);
 	dev_dbg(dev, "current link width is x%d\n", status);
 
-	if (!rockchip->legacy_phy) {
-		/*  power off unused lane(s) */
-		rockchip->lanes_map = rockchip_pcie_lane_map(rockchip);
-		for (i = 0; i < MAX_LANE_NUM; i++) {
-			if (rockchip->lanes_map & BIT(i))
-				continue;
-
+	/* Power off unused lane(s) */
+	rockchip->lanes_map = rockchip_pcie_lane_map(rockchip);
+	for (i = 0; i < MAX_LANE_NUM; i++) {
+		if (!(rockchip->lanes_map & BIT(i))) {
 			dev_dbg(dev, "idling lane %d\n", i);
 			phy_power_off(rockchip->phys[i]);
 		}
@@ -1378,7 +1381,7 @@ static int __maybe_unused rockchip_pcie_suspend_noirq(struct device *dev)
 	}
 
 	for (i = 0; i < MAX_LANE_NUM; i++) {
-		/* inactive lane is already powered off */
+		/* inactive lanes are already powered off */
 		if (rockchip->lanes_map & BIT(i))
 			phy_power_off(rockchip->phys[i]);
 		phy_exit(rockchip->phys[i]);
@@ -1628,7 +1631,7 @@ static int rockchip_pcie_remove(struct platform_device *pdev)
 	irq_domain_remove(rockchip->irq_domain);
 
 	for (i = 0; i < MAX_LANE_NUM; i++) {
-		/* inactive lane is already powered off */
+		/* inactive lanes are already powered off */
 		if (rockchip->lanes_map & BIT(i))
 			phy_power_off(rockchip->phys[i]);
 		phy_exit(rockchip->phys[i]);

* Re: [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip
  2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
                   ` (9 preceding siblings ...)
  2017-08-23  7:03 ` [PATCH v5 10/10] PCI: rockchip: umap io space " Shawn Lin
@ 2017-08-25 21:38 ` Bjorn Helgaas
  2017-08-28  2:22   ` Shawn Lin
  10 siblings, 1 reply; 22+ messages in thread
From: Bjorn Helgaas @ 2017-08-25 21:38 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Brian Norris, Jeffy Chen

On Wed, Aug 23, 2017 at 03:01:13PM +0800, Shawn Lin wrote:
> 
> Hi Bjorn,
> 
> Patch 1 -> 4 are for what you suggested in my V4
> of bug fixing[1].
> 
> Patch 5 -> 7 address what I mentioned about my former
> patch of PHY cleanup[2]. It seems you didn't see my
> V2 of that[3], and my v2 also has some minor issues
> that were fixed by Jeffy's patch[4]. So these patches in
> flight for cleaning up the pcie-rockchip error handling
> path would conflict with each other. I merged the
> similar PHY cleanup from patch[4].
> 
> Also I think we need to split it[4] up into smaller pieces.
> So patch 8 -> 10 are for this purpose and spare Jeffy from
> rebasing this work again.
> 
> Could you kindly drop patch[2] from your host-rockchip branch
> and apply this patchset if it looks good to you? :)
> 
> [1]: https://patchwork.kernel.org/patch/9895141/
> [2]: https://patchwork.kernel.org/patch/9890367/
> [3]: https://patchwork.kernel.org/patch/9892461/
> [4]: http://patchwork.ozlabs.org/patch/804239/
> 
> 
> Changes in v5:
> - rebase on former reconstruction patches suggested by Bjorn
> - fix all the missing error handling cases that need to cleanup
>   PHY
> 
> Changes in v4:
> - split out rockchip_pcie_enable_clocks and reuse
>   rockchip_pcie_enable_clocks and rockchip_pcie_disable_clocks
>   for elsewhere suggested by Jeffy
> 
> Changes in v3:
> - check the return value of devm_add_action_or_reset and split out
>   rockchip_pcie_setup_irq in order to move requesting irq after
>   enabling clks.
> 
> Changes in v2:
> - use devm_add_action_or_reset to fix this ordering suggested by
>   Heiko and Jeffy. Thanks!
> 
> Jeffy Chen (3):
>   PCI: rockchip: disable vpcie0v9 for resume_noirq error handling path
>   PCI: rockchip: remove irq domain if failing to probe
>   PCI: rockchip: umap io space if failing to probe
> 
> Shawn Lin (7):
>   PCI: rockchip: spilt out rockchip_pcie_setup_irq
>   PCI: rockchip: spilt out rockchip_pcie_enable_clocks
>   PCI: rockchip: spilt out rockchip_pcie_disable_clocks
>   PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
>   PCI: rockchip: spilt out rockchip_pcie_deinit_phys
>   PCI: rockchip: fix missing phy manipulation for legacy phy
>   PCI: rockchip: Clean up PHY if driver probe or resume fails
> 
>  drivers/pci/host/pcie-rockchip.c | 297 ++++++++++++++++++++++-----------------
>  1 file changed, 166 insertions(+), 131 deletions(-)

I applied these to pci/host-rockchip except for these:

  PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
  PCI: rockchip: fix missing phy manipulation for legacy phy

I'm not really happy with the devm/clk_prepare_enable situation, so
I'm waiting to see if a better solution emerges.

I went ahead and applied the tweaks I proposed to the earlier "PCI:
rockchip: Add per-lane PHY support" and "PCI: rockchip: Idle inactive
PHY(s)" patches.  I *think* those make the "fix missing phy
manipulation for legacy phy" patch unnecessary.

But please take a look and make sure.  If I went the wrong direction,
I'll gladly back it out.

Bjorn

* Re: [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip
  2017-08-25 21:38 ` [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Bjorn Helgaas
@ 2017-08-28  2:22   ` Shawn Lin
  2017-08-28 18:33     ` Bjorn Helgaas
  0 siblings, 1 reply; 22+ messages in thread
From: Shawn Lin @ 2017-08-28  2:22 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: shawn.lin, Bjorn Helgaas, linux-pci, linux-rockchip,
	Brian Norris, Jeffy Chen

Hi Bjorn,

On 2017/8/26 5:38, Bjorn Helgaas wrote:
> On Wed, Aug 23, 2017 at 03:01:13PM +0800, Shawn Lin wrote:
>>

...

> 
> I applied these to pci/host-rockchip except for these:
> 
>    PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
>    PCI: rockchip: fix missing phy manipulation for legacy phy
> 
> I'm not really happy with the devm/clk_prepare_enable situation, so
> I'm waiting to see if a better solution emerges.
> 
> I went ahead and applied the tweaks I proposed to the earlier "PCI:
> rockchip: Add per-lane PHY support" and "PCI: rockchip: Idle inactive
> PHY(s)" patches.  I *think* those make the "fix missing phy
> manipulation for legacy phy" patch unnecessary.
> 
> But please take a look and make sure.  If I went the wrong direction,
> I'll gladly back it out.

I tested both legacy and per-lane PHY modes, and both work fine.

But just a nit:

With legacy PHY, we could avoid printing the following messages,
which are confusing because in legacy PHY mode we can't idle
any inactive lanes.

[    0.655192] rockchip-pcie f8000000.pcie: idling lane 1
[    0.655696] rockchip-pcie f8000000.pcie: idling lane 2
[    0.656194] rockchip-pcie f8000000.pcie: idling lane 3

So I think we could return 0xf for legacy PHY mode like this:

static u8 rockchip_pcie_lane_map(struct rockchip_pcie *rockchip)
         u8 map;

         if (rockchip->legacy_phy)
-               return BIT(0);
+               return GENMASK(MAX_LANE_NUM - 1, 0);


> 
> Bjorn
> 
> 
> 

* Re: [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip
  2017-08-28  2:22   ` Shawn Lin
@ 2017-08-28 18:33     ` Bjorn Helgaas
  2017-08-29  0:47       ` Shawn Lin
  0 siblings, 1 reply; 22+ messages in thread
From: Bjorn Helgaas @ 2017-08-28 18:33 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Brian Norris, Jeffy Chen

On Mon, Aug 28, 2017 at 10:22:24AM +0800, Shawn Lin wrote:
> Hi Bjorn,
> 
> On 2017/8/26 5:38, Bjorn Helgaas wrote:
> >On Wed, Aug 23, 2017 at 03:01:13PM +0800, Shawn Lin wrote:
> >>
> 
> ...
> 
> >
> >I applied these to pci/host-rockchip except for these:
> >
> >   PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
> >   PCI: rockchip: fix missing phy manipulation for legacy phy
> >
> >I'm not really happy with the devm/clk_prepare_enable situation, so
> >I'm waiting to see if a better solution emerges.
> >
> >I went ahead and applied the tweaks I proposed to the earlier "PCI:
> >rockchip: Add per-lane PHY support" and "PCI: rockchip: Idle inactive
> >PHY(s)" patches.  I *think* those make the "fix missing phy
> >manipulation for legacy phy" patch unnecessary.
> >
> >But please take a look and make sure.  If I went the wrong direction,
> >I'll gladly back it out.
> 
> I tested both legacy and per-lane PHY modes, and both work fine.
> 
> But just a nit:
> 
> With legacy PHY, we could avoid printing the following messages,
> which are confusing because in legacy PHY mode we can't idle
> any inactive lanes.
> 
> [    0.655192] rockchip-pcie f8000000.pcie: idling lane 1
> [    0.655696] rockchip-pcie f8000000.pcie: idling lane 2
> [    0.656194] rockchip-pcie f8000000.pcie: idling lane 3
> 
> So I think we could return 0xf for legacy PHY mode like this:
> 
> static u8 rockchip_pcie_lane_map(struct rockchip_pcie *rockchip)
>         u8 map;
> 
>         if (rockchip->legacy_phy)
> -               return BIT(0);
> +               return GENMASK(MAX_LANE_NUM - 1, 0);

When we're using legacy PHY mode, this changes lanes_map from 0x1 to
0xf.  That means we won't print the "idling lane 1" message.  It
*also* means we won't call phy_power_off() for lanes 1-3.  Is that the
correct behavior?  Does the legacy PHY mode actually use all 4 PHYs?
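
To make the difference concrete, here is a small userspace sketch of the
"power off unused lane(s)" loop. LANE_BIT() and LANE_MASK() are local
stand-ins for the kernel's BIT() and GENMASK(), and lanes_idled() is a
hypothetical helper that only counts the lanes the loop would idle for a
given lanes_map:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LANE_NUM 4

/* Local stand-ins for the kernel's BIT() and GENMASK() helpers. */
#define LANE_BIT(n)	(1u << (n))
#define LANE_MASK(h, l)	(((1u << ((h) - (l) + 1)) - 1u) << (l))

/* Count the lanes that the idle loop would power off for this map. */
static unsigned int lanes_idled(uint8_t lanes_map)
{
	unsigned int i, idled = 0;

	/* Only the low MAX_LANE_NUM bits describe lanes. */
	assert(lanes_map <= LANE_MASK(MAX_LANE_NUM - 1, 0));

	for (i = 0; i < MAX_LANE_NUM; i++) {
		if (!(lanes_map & LANE_BIT(i)))
			idled++;	/* the driver logs "idling lane i" here */
	}
	return idled;
}
```

With lanes_map == LANE_BIT(0) the loop visits lanes 1-3; with
LANE_MASK(MAX_LANE_NUM - 1, 0) it visits none.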

* Re: [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip
  2017-08-28 18:33     ` Bjorn Helgaas
@ 2017-08-29  0:47       ` Shawn Lin
  2017-08-29 18:25         ` Bjorn Helgaas
  0 siblings, 1 reply; 22+ messages in thread
From: Shawn Lin @ 2017-08-29  0:47 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: shawn.lin, Bjorn Helgaas, linux-pci, linux-rockchip,
	Brian Norris, Jeffy Chen

Hi Bjorn,

On 2017/8/29 2:33, Bjorn Helgaas wrote:
> On Mon, Aug 28, 2017 at 10:22:24AM +0800, Shawn Lin wrote:
>> Hi Bjorn,
>>
>> On 2017/8/26 5:38, Bjorn Helgaas wrote:
>>> On Wed, Aug 23, 2017 at 03:01:13PM +0800, Shawn Lin wrote:
>>>>
>>
>> ...
>>
>>>
>>> I applied these to pci/host-rockchip except for these:
>>>
>>>    PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
>>>    PCI: rockchip: fix missing phy manipulation for legacy phy
>>>
>>> I'm not really happy with the devm/clk_prepare_enable situation, so
>>> I'm waiting to see if a better solution emerges.
>>>
>>> I went ahead and applied the tweaks I proposed to the earlier "PCI:
>>> rockchip: Add per-lane PHY support" and "PCI: rockchip: Idle inactive
>>> PHY(s)" patches.  I *think* those make the "fix missing phy
>>> manipulation for legacy phy" patch unnecessary.
>>>
>>> But please take a look and make sure.  If I went the wrong direction,
>>> I'll gladly back it out.
>>
>> I tested both legacy and per-lane PHY modes, and both work fine.
>>
>> But just a nit:
>>
>> With legacy PHY, we could avoid printing the following messages,
>> which are confusing because in legacy PHY mode we can't idle
>> any inactive lanes.
>>
>> [    0.655192] rockchip-pcie f8000000.pcie: idling lane 1
>> [    0.655696] rockchip-pcie f8000000.pcie: idling lane 2
>> [    0.656194] rockchip-pcie f8000000.pcie: idling lane 3
>>
>> So I think we could return 0xf for legacy PHY mode like this:
>>
>> static u8 rockchip_pcie_lane_map(struct rockchip_pcie *rockchip)
>>          u8 map;
>>
>>          if (rockchip->legacy_phy)
>> -               return BIT(0);
>> +               return GENMASK(MAX_LANE_NUM - 1, 0);
> 
> When we're using legacy PHY mode, this changes lanes_map from 0x1 to
> 0xf.  That means we won't print the "idling lane 1" message.  It
> *also* means we won't call phy_power_off() for lanes 1-3.  Is that the
> correct behavior?  Does the legacy PHY mode actually use all 4 PHYs?

Yes, it's the correct behaviour. The 4 PHYs are an abstraction of the 4
lanes in per-lane PHY mode, but in legacy PHY mode the PHY represents a
single component managing all 4 lanes together. So no matter how many
lanes are used, we can't call phy_power_off in legacy PHY mode until we
really want to power off the PHY. That is why we need to reconstruct the
pcie and phy drivers to migrate to per-lane PHY mode: we can treat each
lane as a phy object and let the phy driver do idling work in the
phy_power_off callback, instead of power-off work, if the ref count
hasn't dropped to zero.

Also, I think the key point here is that in legacy PHY mode only
phys[0] is valid, so it makes no difference if we call phy_power_off for
phys[1..3] as you did, because the PHY API checks the phy object and
does nothing for an invalid/NULL phy object. So changing lanes_map from
0x1 to 0xf only silences the "idling lane x" log.
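
A rough userspace model of that argument, assuming the PHY power-off
call returns early for a NULL phy as described above. struct model_phy,
model_phy_power_off() and idle_unused() are hypothetical names used only
for this sketch, not the driver's code:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LANE_NUM 4

/* Hypothetical stand-in for struct phy; tracks only a power count. */
struct model_phy {
	int power_count;
};

/* Models the NULL tolerance: powering off a NULL phy does nothing. */
static int model_phy_power_off(struct model_phy *phy)
{
	if (!phy)
		return 0;
	assert(phy->power_count > 0);
	phy->power_count--;
	return 0;
}

/*
 * Run the idle loop for a given lanes_map and return how many
 * real (non-NULL) PHY objects were actually powered off.
 */
static int idle_unused(struct model_phy *phys[MAX_LANE_NUM],
		       unsigned int lanes_map)
{
	int i, touched = 0;

	for (i = 0; i < MAX_LANE_NUM; i++) {
		if (lanes_map & (1u << i))
			continue;	/* lane marked active: leave it alone */
		if (phys[i])
			touched++;
		model_phy_power_off(phys[i]);
	}
	return touched;
}
```

With only phys[0] populated, both 0x1 and 0xf leave the single legacy
PHY powered, so in this model the two masks differ only in whether the
loop reaches the debug message for lanes 1-3.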


> 
> 
> 

* Re: [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip
  2017-08-29  0:47       ` Shawn Lin
@ 2017-08-29 18:25         ` Bjorn Helgaas
  0 siblings, 0 replies; 22+ messages in thread
From: Bjorn Helgaas @ 2017-08-29 18:25 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Brian Norris, Jeffy Chen

On Tue, Aug 29, 2017 at 08:47:28AM +0800, Shawn Lin wrote:
> Hi Bjorn,
> 
> On 2017/8/29 2:33, Bjorn Helgaas wrote:
> >On Mon, Aug 28, 2017 at 10:22:24AM +0800, Shawn Lin wrote:
> >>Hi Bjorn,
> >>
> >>On 2017/8/26 5:38, Bjorn Helgaas wrote:
> >>>On Wed, Aug 23, 2017 at 03:01:13PM +0800, Shawn Lin wrote:
> >>>>
> >>
> >>...
> >>
> >>>
> >>>I applied these to pci/host-rockchip except for these:
> >>>
> >>>   PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ
> >>>   PCI: rockchip: fix missing phy manipulation for legacy phy
> >>>
> >>>I'm not really happy with the devm/clk_prepare_enable situation, so
> >>>I'm waiting to see if a better solution emerges.
> >>>
> >>>I went ahead and applied the tweaks I proposed to the earlier "PCI:
> >>>rockchip: Add per-lane PHY support" and "PCI: rockchip: Idle inactive
> >>>PHY(s)" patches.  I *think* those make the "fix missing phy
> >>>manipulation for legacy phy" patch unnecessary.
> >>>
> >>>But please take a look and make sure.  If I went the wrong direction,
> >>>I'll gladly back it out.
> >>
> >>I tested both legacy and per-lane PHY modes, and both work fine.
> >>
> >>But just a nit:
> >>
> >>With legacy PHY, we could avoid printing the following messages,
> >>which are confusing because in legacy PHY mode we can't idle
> >>any inactive lanes.
> >>
> >>[    0.655192] rockchip-pcie f8000000.pcie: idling lane 1
> >>[    0.655696] rockchip-pcie f8000000.pcie: idling lane 2
> >>[    0.656194] rockchip-pcie f8000000.pcie: idling lane 3
> >>
> >>So I think we could return 0xf for legacy PHY mode like this:
> >>
> >>static u8 rockchip_pcie_lane_map(struct rockchip_pcie *rockchip)
> >>         u8 map;
> >>
> >>         if (rockchip->legacy_phy)
> >>-               return BIT(0);
> >>+               return GENMASK(MAX_LANE_NUM - 1, 0);
> >
> >When we're using legacy PHY mode, this changes lanes_map from 0x1 to
> >0xf.  That means we won't print the "idling lane 1" message.  It
> >*also* means we won't call phy_power_off() for lanes 1-3.  Is that the
> >correct behavior?  Does the legacy PHY mode actually use all 4 PHYs?
> 
> Yes, it's the correct behaviour. The 4 PHYs are an abstraction of the 4
> lanes in per-lane PHY mode, but in legacy PHY mode the PHY represents a
> single component managing all 4 lanes together. So no matter how many
> lanes are used, we can't call phy_power_off in legacy PHY mode until we
> really want to power off the PHY. That is why we need to reconstruct the
> pcie and phy drivers to migrate to per-lane PHY mode: we can treat each
> lane as a phy object and let the phy driver do idling work in the
> phy_power_off callback, instead of power-off work, if the ref count
> hasn't dropped to zero.
> 
> Also, I think the key point here is that in legacy PHY mode only
> phys[0] is valid, so it makes no difference if we call phy_power_off for
> phys[1..3] as you did, because the PHY API checks the phy object and
> does nothing for an invalid/NULL phy object. So changing lanes_map from
> 0x1 to 0xf only silences the "idling lane x" log.

OK, I folded in the GENMASK update above.

Thread overview: 22+ messages
2017-08-23  7:01 [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Shawn Lin
2017-08-23  7:02 ` [PATCH v5 01/10] PCI: rockchip: spilt out rockchip_pcie_setup_irq Shawn Lin
2017-08-23  7:02 ` [PATCH v5 02/10] PCI: rockchip: spilt out rockchip_pcie_enable_clocks Shawn Lin
2017-08-23  7:02 ` [PATCH v5 03/10] PCI: rockchip: spilt out rockchip_pcie_disable_clocks Shawn Lin
2017-08-23  7:02 ` [PATCH v5 04/10] PCI: rockchip: fix system hang up if activating CONFIG_DEBUG_SHIRQ Shawn Lin
2017-08-24 20:21   ` Bjorn Helgaas
2017-08-24 21:10     ` Dmitry Torokhov
2017-08-25  1:44       ` Brian Norris
2017-08-25  1:05     ` jeffy
2017-08-25  1:38     ` Shawn Lin
2017-08-23  7:02 ` [PATCH v5 05/10] PCI: rockchip: spilt out rockchip_pcie_deinit_phys Shawn Lin
2017-08-23  7:02 ` [PATCH v5 06/10] PCI: rockchip: fix missing phy manipulation for legacy phy Shawn Lin
2017-08-25 21:18   ` Bjorn Helgaas
2017-08-23  7:03 ` [PATCH v5 07/10] PCI: rockchip: Clean up PHY if driver probe or resume fails Shawn Lin
2017-08-23  7:03 ` [PATCH v5 08/10] PCI: rockchip: disable vpcie0v9 for resume_noirq error handling path Shawn Lin
2017-08-23  7:03 ` [PATCH v5 09/10] PCI: rockchip: remove irq domain if failing to probe Shawn Lin
2017-08-23  7:03 ` [PATCH v5 10/10] PCI: rockchip: umap io space " Shawn Lin
2017-08-25 21:38 ` [PATCH v5 0/10] Some cleanup and bug fix for pcie-rockchip Bjorn Helgaas
2017-08-28  2:22   ` Shawn Lin
2017-08-28 18:33     ` Bjorn Helgaas
2017-08-29  0:47       ` Shawn Lin
2017-08-29 18:25         ` Bjorn Helgaas
