All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-10 23:02 ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, Daniel Vetter, David Airlie, linux-kernel

I found that if I ever had a little mistake in my kernel config,
or device tree, or graphics driver that my system would sit in a loop
at bootup trying again and again and again.  An example log was:

  msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
  msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
  msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
  [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
  ...

I finally tracked it down where this was happening:
  - msm_pdev_probe() is called.
  - msm_pdev_probe() registers drivers.  Registering drivers kicks
    off processing of probe deferrals.
  - component_master_add_with_match() could return -EPROBE_DEFER.
    making msm_pdev_probe() return -EPROBE_DEFER.
  - When msm_pdev_probe() returned the processing of probe deferrals
    happens.
  - Loop back to the start.

It looks like we can fix this by marking "mdss" as a "simple-bus".
I have no idea if people consider this the right thing to do or a
hack.  Hopefully it's the right thing to do.  :-)

Once I do this I notice that my boot gets marginally faster (you
don't need to probe the sub devices over and over) and also if I
have a problem it doesn't loop forever (on my system it still
gets upset about some stuck clocks in that case, but at least I
can boot up).

Unless someone hates this, I'd expect:
- Get Rob H to say that the bindings are OK (if they are) and yell
  that these really need to be converted to yaml (they do).
- Get Sean or Rob C to land the bindings and driver patch.
- Get Andy or Bjorn to land the dts bits.

NOTES:
- The first patch could land either way.  It's just a cleanup.
- I tried to split the dts files into separate patches to ease
  backporting if desired.  Also because I can't actually test most
  of this hardware myself.


Douglas Anderson (9):
  drm/msm: Use the devm variant of of_platform_populate()
  dt-bindings: msm/dpu: Add simple-bus to dpu bindings
  dt-bindings: msm/mdp5: Add simple-bus to dpu bindings
  drm/msm: Avoid manually populating our children if "simple-bus" is
    there
  arm64: dts: qcom: sc7180: Add "simple-bus" to our mdss node
  arm64: dts: qcom: sdm845: Add "simple-bus" to our mdss node
  arm64: dts: qcom: msm8916: Add "simple-bus" to our mdss node
  arm64: dts: qcom: msm8996: Add "simple-bus" to our mdss node
  ARM: dts: qcom: msm8974: Add "simple-bus" to our mdss node

 .../devicetree/bindings/display/msm/dpu.txt   |  4 ++-
 .../devicetree/bindings/display/msm/mdp5.txt  |  2 +-
 arch/arm/boot/dts/qcom-msm8974.dtsi           |  2 +-
 arch/arm64/boot/dts/qcom/msm8916.dtsi         |  2 +-
 arch/arm64/boot/dts/qcom/msm8996.dtsi         |  2 +-
 arch/arm64/boot/dts/qcom/sc7180.dtsi          |  2 +-
 arch/arm64/boot/dts/qcom/sdm845.dtsi          |  2 +-
 drivers/gpu/drm/msm/msm_drv.c                 | 34 ++++++++-----------
 8 files changed, 24 insertions(+), 26 deletions(-)

-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-10 23:02 ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, David Airlie, linux-arm-msm, Douglas Anderson,
	dri-devel, linux-kernel, freedreno

I found that if I ever had a little mistake in my kernel config,
or device tree, or graphics driver that my system would sit in a loop
at bootup trying again and again and again.  An example log was:

  msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
  msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
  msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
  [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
  ...

I finally tracked it down where this was happening:
  - msm_pdev_probe() is called.
  - msm_pdev_probe() registers drivers.  Registering drivers kicks
    off processing of probe deferrals.
  - component_master_add_with_match() could return -EPROBE_DEFER.
    making msm_pdev_probe() return -EPROBE_DEFER.
  - When msm_pdev_probe() returned the processing of probe deferrals
    happens.
  - Loop back to the start.

It looks like we can fix this by marking "mdss" as a "simple-bus".
I have no idea if people consider this the right thing to do or a
hack.  Hopefully it's the right thing to do.  :-)

Once I do this I notice that my boot gets marginally faster (you
don't need to probe the sub devices over and over) and also if I
have a problem it doesn't loop forever (on my system it still
gets upset about some stuck clocks in that case, but at least I
can boot up).

Unless someone hates this, I'd expect:
- Get Rob H to say that the bindings are OK (if they are) and yell
  that these really need to be converted to yaml (they do).
- Get Sean or Rob C to land the bindings and driver patch.
- Get Andy or Bjorn to land the dts bits.

NOTES:
- The first patch could land either way.  It's just a cleanup.
- I tried to split the dts files into separate patches to ease
  backporting if desired.  Also because I can't actually test most
  of this hardware myself.


Douglas Anderson (9):
  drm/msm: Use the devm variant of of_platform_populate()
  dt-bindings: msm/dpu: Add simple-bus to dpu bindings
  dt-bindings: msm/mdp5: Add simple-bus to dpu bindings
  drm/msm: Avoid manually populating our children if "simple-bus" is
    there
  arm64: dts: qcom: sc7180: Add "simple-bus" to our mdss node
  arm64: dts: qcom: sdm845: Add "simple-bus" to our mdss node
  arm64: dts: qcom: msm8916: Add "simple-bus" to our mdss node
  arm64: dts: qcom: msm8996: Add "simple-bus" to our mdss node
  ARM: dts: qcom: msm8974: Add "simple-bus" to our mdss node

 .../devicetree/bindings/display/msm/dpu.txt   |  4 ++-
 .../devicetree/bindings/display/msm/mdp5.txt  |  2 +-
 arch/arm/boot/dts/qcom-msm8974.dtsi           |  2 +-
 arch/arm64/boot/dts/qcom/msm8916.dtsi         |  2 +-
 arch/arm64/boot/dts/qcom/msm8996.dtsi         |  2 +-
 arch/arm64/boot/dts/qcom/sc7180.dtsi          |  2 +-
 arch/arm64/boot/dts/qcom/sdm845.dtsi          |  2 +-
 drivers/gpu/drm/msm/msm_drv.c                 | 34 ++++++++-----------
 8 files changed, 24 insertions(+), 26 deletions(-)

-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 1/9] drm/msm: Use the devm variant of of_platform_populate()
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, Daniel Vetter, David Airlie, linux-kernel

Let's save ourselves some hassle and use the devm variant of
of_platform_populate() do we don't need to worry about manually
depopulating.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/gpu/drm/msm/msm_drv.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index f6ce40bf3699..78b09fde9e29 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1208,7 +1208,7 @@ static int add_display_components(struct device *dev,
 	if (of_device_is_compatible(dev->of_node, "qcom,mdss") ||
 	    of_device_is_compatible(dev->of_node, "qcom,sdm845-mdss") ||
 	    of_device_is_compatible(dev->of_node, "qcom,sc7180-mdss")) {
-		ret = of_platform_populate(dev->of_node, NULL, NULL, dev);
+		ret = devm_of_platform_populate(dev);
 		if (ret) {
 			DRM_DEV_ERROR(dev, "failed to populate children devices\n");
 			return ret;
@@ -1217,7 +1217,6 @@ static int add_display_components(struct device *dev,
 		mdp_dev = device_find_child(dev, NULL, compare_name_mdp);
 		if (!mdp_dev) {
 			DRM_DEV_ERROR(dev, "failed to find MDSS MDP node\n");
-			of_platform_depopulate(dev);
 			return -ENODEV;
 		}
 
@@ -1232,8 +1231,6 @@ static int add_display_components(struct device *dev,
 	}
 
 	ret = add_components_mdp(mdp_dev, matchptr);
-	if (ret)
-		of_platform_depopulate(dev);
 
 	return ret;
 }
@@ -1300,30 +1297,21 @@ static int msm_pdev_probe(struct platform_device *pdev)
 
 	ret = add_gpu_components(&pdev->dev, &match);
 	if (ret)
-		goto fail;
+		return ret;
 
 	/* on all devices that I am aware of, iommu's which can map
 	 * any address the cpu can see are used:
 	 */
 	ret = dma_set_mask_and_coherent(&pdev->dev, ~0);
 	if (ret)
-		goto fail;
-
-	ret = component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
-	if (ret)
-		goto fail;
-
-	return 0;
+		return ret;
 
-fail:
-	of_platform_depopulate(&pdev->dev);
-	return ret;
+	return component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
 }
 
 static int msm_pdev_remove(struct platform_device *pdev)
 {
 	component_master_del(&pdev->dev, &msm_drm_ops);
-	of_platform_depopulate(&pdev->dev);
 
 	return 0;
 }
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 1/9] drm/msm: Use the devm variant of of_platform_populate()
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, David Airlie, linux-arm-msm, Douglas Anderson,
	dri-devel, linux-kernel, freedreno

Let's save ourselves some hassle and use the devm variant of
of_platform_populate() do we don't need to worry about manually
depopulating.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/gpu/drm/msm/msm_drv.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index f6ce40bf3699..78b09fde9e29 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1208,7 +1208,7 @@ static int add_display_components(struct device *dev,
 	if (of_device_is_compatible(dev->of_node, "qcom,mdss") ||
 	    of_device_is_compatible(dev->of_node, "qcom,sdm845-mdss") ||
 	    of_device_is_compatible(dev->of_node, "qcom,sc7180-mdss")) {
-		ret = of_platform_populate(dev->of_node, NULL, NULL, dev);
+		ret = devm_of_platform_populate(dev);
 		if (ret) {
 			DRM_DEV_ERROR(dev, "failed to populate children devices\n");
 			return ret;
@@ -1217,7 +1217,6 @@ static int add_display_components(struct device *dev,
 		mdp_dev = device_find_child(dev, NULL, compare_name_mdp);
 		if (!mdp_dev) {
 			DRM_DEV_ERROR(dev, "failed to find MDSS MDP node\n");
-			of_platform_depopulate(dev);
 			return -ENODEV;
 		}
 
@@ -1232,8 +1231,6 @@ static int add_display_components(struct device *dev,
 	}
 
 	ret = add_components_mdp(mdp_dev, matchptr);
-	if (ret)
-		of_platform_depopulate(dev);
 
 	return ret;
 }
@@ -1300,30 +1297,21 @@ static int msm_pdev_probe(struct platform_device *pdev)
 
 	ret = add_gpu_components(&pdev->dev, &match);
 	if (ret)
-		goto fail;
+		return ret;
 
 	/* on all devices that I am aware of, iommu's which can map
 	 * any address the cpu can see are used:
 	 */
 	ret = dma_set_mask_and_coherent(&pdev->dev, ~0);
 	if (ret)
-		goto fail;
-
-	ret = component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
-	if (ret)
-		goto fail;
-
-	return 0;
+		return ret;
 
-fail:
-	of_platform_depopulate(&pdev->dev);
-	return ret;
+	return component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
 }
 
 static int msm_pdev_remove(struct platform_device *pdev)
 {
 	component_master_del(&pdev->dev, &msm_drm_ops);
-	of_platform_depopulate(&pdev->dev);
 
 	return 0;
 }
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 2/9] dt-bindings: msm/dpu: Add simple-bus to dpu bindings
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, Daniel Vetter, David Airlie, linux-kernel

We have a whole bunch of child devices that we really just want
treated as other devices to instantiate.  Add the "simple-bus"
property for this.

Commit-notes
This bindings file really needs to move over to yaml.  Hopefully
someone at Qualcomm who's working on display code can help with that.

Right now on Linux we have a manual call to of_platform_populate() to
populate our children.  That's pretty non-ideal as described in
another patch in this series and I'm trying to remove it.

I'm not sure how much of a hack to consider "simple-bus".  I've seen
it used like this before, but if folks tell me that it's terrible then
I guess we'll have to figure some other way out of our crazy
-EPROBE_DEFER loop in Linux.
END

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 Documentation/devicetree/bindings/display/msm/dpu.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dpu.txt b/Documentation/devicetree/bindings/display/msm/dpu.txt
index 551ae26f60da..b88269524365 100644
--- a/Documentation/devicetree/bindings/display/msm/dpu.txt
+++ b/Documentation/devicetree/bindings/display/msm/dpu.txt
@@ -8,7 +8,9 @@ The DPU display controller is found in SDM845 SoC.
 
 MDSS:
 Required properties:
-- compatible:  "qcom,sdm845-mdss", "qcom,sc7180-mdss"
+- compatible:  One of:
+  - "qcom,sdm845-mdss", "simple-bus"
+  - "qcom,sc7180-mdss", "simple-bus"
 - reg: physical base address and length of contoller's registers.
 - reg-names: register region names. The following region is required:
   * "mdss"
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 2/9] dt-bindings: msm/dpu: Add simple-bus to dpu bindings
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, David Airlie, linux-arm-msm, Douglas Anderson,
	dri-devel, linux-kernel, freedreno

We have a whole bunch of child devices that we really just want
treated as other devices to instantiate.  Add the "simple-bus"
property for this.

Commit-notes
This bindings file really needs to move over to yaml.  Hopefully
someone at Qualcomm who's working on display code can help with that.

Right now on Linux we have a manual call to of_platform_populate() to
populate our children.  That's pretty non-ideal as described in
another patch in this series and I'm trying to remove it.

I'm not sure how much of a hack to consider "simple-bus".  I've seen
it used like this before, but if folks tell me that it's terrible then
I guess we'll have to figure some other way out of our crazy
-EPROBE_DEFER loop in Linux.
END

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 Documentation/devicetree/bindings/display/msm/dpu.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dpu.txt b/Documentation/devicetree/bindings/display/msm/dpu.txt
index 551ae26f60da..b88269524365 100644
--- a/Documentation/devicetree/bindings/display/msm/dpu.txt
+++ b/Documentation/devicetree/bindings/display/msm/dpu.txt
@@ -8,7 +8,9 @@ The DPU display controller is found in SDM845 SoC.
 
 MDSS:
 Required properties:
-- compatible:  "qcom,sdm845-mdss", "qcom,sc7180-mdss"
+- compatible:  One of:
+  - "qcom,sdm845-mdss", "simple-bus"
+  - "qcom,sc7180-mdss", "simple-bus"
 - reg: physical base address and length of contoller's registers.
 - reg-names: register region names. The following region is required:
   * "mdss"
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 3/9] dt-bindings: msm/mdp5: Add simple-bus to dpu bindings
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, Daniel Vetter, David Airlie, linux-kernel

This is just like the patch ("dt-bindings: msm/disp: Add simple-bus to
dpu bindings") but for the mdp5 bindings.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 Documentation/devicetree/bindings/display/msm/mdp5.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/msm/mdp5.txt b/Documentation/devicetree/bindings/display/msm/mdp5.txt
index 43d11279c925..bb57b07b7546 100644
--- a/Documentation/devicetree/bindings/display/msm/mdp5.txt
+++ b/Documentation/devicetree/bindings/display/msm/mdp5.txt
@@ -9,7 +9,7 @@ controller found in SoCs like MSM8974, APQ8084, MSM8916, MSM8994 and MSM8996.
 MDSS:
 Required properties:
 - compatible:
-  * "qcom,mdss" - MDSS
+  * "qcom,mdss", "simple-bus" - MDSS
 - reg: Physical base address and length of the controller's registers.
 - reg-names: The names of register regions. The following regions are required:
   * "mdss_phys"
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 3/9] dt-bindings: msm/mdp5: Add simple-bus to dpu bindings
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, David Airlie, linux-arm-msm, Douglas Anderson,
	dri-devel, linux-kernel, freedreno

This is just like the patch ("dt-bindings: msm/disp: Add simple-bus to
dpu bindings") but for the mdp5 bindings.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 Documentation/devicetree/bindings/display/msm/mdp5.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/msm/mdp5.txt b/Documentation/devicetree/bindings/display/msm/mdp5.txt
index 43d11279c925..bb57b07b7546 100644
--- a/Documentation/devicetree/bindings/display/msm/mdp5.txt
+++ b/Documentation/devicetree/bindings/display/msm/mdp5.txt
@@ -9,7 +9,7 @@ controller found in SoCs like MSM8974, APQ8084, MSM8916, MSM8994 and MSM8996.
 MDSS:
 Required properties:
 - compatible:
-  * "qcom,mdss" - MDSS
+  * "qcom,mdss", "simple-bus" - MDSS
 - reg: Physical base address and length of the controller's registers.
 - reg-names: The names of register regions. The following regions are required:
   * "mdss_phys"
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 4/9] drm/msm: Avoid manually populating our children if "simple-bus" is there
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, Daniel Vetter, David Airlie, linux-kernel

If our compatible string has "simple-bus" at the end of it then the
system will handle probing our children for us.  Doing this has a few
advantages:

1. If we defer we don't have to keep probing / removing our children
   which should speed up boot.  The system just probes them once.

2. It fixes a problem where we could get into a big loop where we'd
   have:
   - msm_pdev_probe() is called.
   - msm_pdev_probe() registers drivers.  Registering drivers kicks
     off processing of probe deferrals.
   - component_master_add_with_match() could return -EPROBE_DEFER.
     making msm_pdev_probe() return -EPROBE_DEFER.
   - When msm_pdev_probe() returned the processing of probe deferrals
     happens.
   - Loop back to the start.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/gpu/drm/msm/msm_drv.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 78b09fde9e29..f7c6ef147095 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1208,10 +1208,18 @@ static int add_display_components(struct device *dev,
 	if (of_device_is_compatible(dev->of_node, "qcom,mdss") ||
 	    of_device_is_compatible(dev->of_node, "qcom,sdm845-mdss") ||
 	    of_device_is_compatible(dev->of_node, "qcom,sc7180-mdss")) {
-		ret = devm_of_platform_populate(dev);
-		if (ret) {
-			DRM_DEV_ERROR(dev, "failed to populate children devices\n");
-			return ret;
+		/*
+		 * Old device tree files didn't include "simple-bus"
+		 * in their compatible string so we had to manually pouplate
+		 * our children.  When existing device trees are fixed this
+		 * can be removed.
+		 */
+		if (!of_device_is_compatible(dev->of_node, "simple-bus")) {
+			ret = devm_of_platform_populate(dev);
+			if (ret) {
+				DRM_DEV_ERROR(dev, "failed to populate children devices\n");
+				return ret;
+			}
 		}
 
 		mdp_dev = device_find_child(dev, NULL, compare_name_mdp);
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 4/9] drm/msm: Avoid manually populating our children if "simple-bus" is there
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, David Airlie, linux-arm-msm, Douglas Anderson,
	dri-devel, linux-kernel, freedreno

If our compatible string has "simple-bus" at the end of it then the
system will handle probing our children for us.  Doing this has a few
advantages:

1. If we defer we don't have to keep probing / removing our children
   which should speed up boot.  The system just probes them once.

2. It fixes a problem where we could get into a big loop where we'd
   have:
   - msm_pdev_probe() is called.
   - msm_pdev_probe() registers drivers.  Registering drivers kicks
     off processing of probe deferrals.
   - component_master_add_with_match() could return -EPROBE_DEFER.
     making msm_pdev_probe() return -EPROBE_DEFER.
   - When msm_pdev_probe() returned the processing of probe deferrals
     happens.
   - Loop back to the start.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/gpu/drm/msm/msm_drv.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 78b09fde9e29..f7c6ef147095 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1208,10 +1208,18 @@ static int add_display_components(struct device *dev,
 	if (of_device_is_compatible(dev->of_node, "qcom,mdss") ||
 	    of_device_is_compatible(dev->of_node, "qcom,sdm845-mdss") ||
 	    of_device_is_compatible(dev->of_node, "qcom,sc7180-mdss")) {
-		ret = devm_of_platform_populate(dev);
-		if (ret) {
-			DRM_DEV_ERROR(dev, "failed to populate children devices\n");
-			return ret;
+		/*
+		 * Old device tree files didn't include "simple-bus"
+		 * in their compatible string so we had to manually pouplate
+		 * our children.  When existing device trees are fixed this
+		 * can be removed.
+		 */
+		if (!of_device_is_compatible(dev->of_node, "simple-bus")) {
+			ret = devm_of_platform_populate(dev);
+			if (ret) {
+				DRM_DEV_ERROR(dev, "failed to populate children devices\n");
+				return ret;
+			}
 		}
 
 		mdp_dev = device_find_child(dev, NULL, compare_name_mdp);
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 5/9] arm64: dts: qcom: sc7180: Add "simple-bus" to our mdss node
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, linux-kernel

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

 arch/arm64/boot/dts/qcom/sc7180.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi
index 2be81a2a1512..cae6c69fd85a 100644
--- a/arch/arm64/boot/dts/qcom/sc7180.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi
@@ -2594,7 +2594,7 @@ camnoc_virt: interconnect@ac00000 {
 		};
 
 		mdss: mdss@ae00000 {
-			compatible = "qcom,sc7180-mdss";
+			compatible = "qcom,sc7180-mdss", "simple-bus";
 			reg = <0 0x0ae00000 0 0x1000>;
 			reg-names = "mdss";
 
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 5/9] arm64: dts: qcom: sc7180: Add "simple-bus" to our mdss node
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, linux-arm-msm, linux-kernel, dri-devel,
	Douglas Anderson, freedreno

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

 arch/arm64/boot/dts/qcom/sc7180.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi
index 2be81a2a1512..cae6c69fd85a 100644
--- a/arch/arm64/boot/dts/qcom/sc7180.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi
@@ -2594,7 +2594,7 @@ camnoc_virt: interconnect@ac00000 {
 		};
 
 		mdss: mdss@ae00000 {
-			compatible = "qcom,sc7180-mdss";
+			compatible = "qcom,sc7180-mdss", "simple-bus";
 			reg = <0 0x0ae00000 0 0x1000>;
 			reg-names = "mdss";
 
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 6/9] arm64: dts: qcom: sdm845: Add "simple-bus" to our mdss node
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, linux-kernel

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

 arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index b0d8308a3e95..e52a5e95168a 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -3582,7 +3582,7 @@ clock_camcc: clock-controller@ad00000 {
 		};
 
 		mdss: mdss@ae00000 {
-			compatible = "qcom,sdm845-mdss";
+			compatible = "qcom,sdm845-mdss", "simple-bus";
 			reg = <0 0x0ae00000 0 0x1000>;
 			reg-names = "mdss";
 
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 6/9] arm64: dts: qcom: sdm845: Add "simple-bus" to our mdss node
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, linux-arm-msm, linux-kernel, dri-devel,
	Douglas Anderson, freedreno

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

 arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index b0d8308a3e95..e52a5e95168a 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -3582,7 +3582,7 @@ clock_camcc: clock-controller@ad00000 {
 		};
 
 		mdss: mdss@ae00000 {
-			compatible = "qcom,sdm845-mdss";
+			compatible = "qcom,sdm845-mdss", "simple-bus";
 			reg = <0 0x0ae00000 0 0x1000>;
 			reg-names = "mdss";
 
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 7/9] arm64: dts: qcom: msm8916: Add "simple-bus" to our mdss node
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, linux-kernel

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

NOTE: I have no way to test this patch, but I'm assuming it works OK?

 arch/arm64/boot/dts/qcom/msm8916.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index 67cae5f9e47e..491362fe02ac 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -1023,7 +1023,7 @@ opp-19200000 {
 		};
 
 		mdss: mdss@1a00000 {
-			compatible = "qcom,mdss";
+			compatible = "qcom,mdss", "simple-bus";
 			reg = <0x1a00000 0x1000>,
 			      <0x1ac8000 0x3000>;
 			reg-names = "mdss_phys", "vbif_phys";
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 7/9] arm64: dts: qcom: msm8916: Add "simple-bus" to our mdss node
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, linux-arm-msm, linux-kernel, dri-devel,
	Douglas Anderson, freedreno

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

NOTE: I have no way to test this patch, but I'm assuming it works OK?

 arch/arm64/boot/dts/qcom/msm8916.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index 67cae5f9e47e..491362fe02ac 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -1023,7 +1023,7 @@ opp-19200000 {
 		};
 
 		mdss: mdss@1a00000 {
-			compatible = "qcom,mdss";
+			compatible = "qcom,mdss", "simple-bus";
 			reg = <0x1a00000 0x1000>,
 			      <0x1ac8000 0x3000>;
 			reg-names = "mdss_phys", "vbif_phys";
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 8/9] arm64: dts: qcom: msm8996: Add "simple-bus" to our mdss node
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, linux-kernel

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

NOTE: I have no way to test this patch, but I'm assuming it works OK?

 arch/arm64/boot/dts/qcom/msm8996.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 9951286db775..e303b0e644ac 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -499,7 +499,7 @@ mmcc: clock-controller@8c0000 {
 		};
 
 		mdss: mdss@900000 {
-			compatible = "qcom,mdss";
+			compatible = "qcom,mdss", "simple-bus";
 
 			reg = <0x00900000 0x1000>,
 			      <0x009b0000 0x1040>,
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 8/9] arm64: dts: qcom: msm8996: Add "simple-bus" to our mdss node
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, linux-arm-msm, linux-kernel, dri-devel,
	Douglas Anderson, freedreno

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

NOTE: I have no way to test this patch, but I'm assuming it works OK?

 arch/arm64/boot/dts/qcom/msm8996.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 9951286db775..e303b0e644ac 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -499,7 +499,7 @@ mmcc: clock-controller@8c0000 {
 		};
 
 		mdss: mdss@900000 {
-			compatible = "qcom,mdss";
+			compatible = "qcom,mdss", "simple-bus";
 
 			reg = <0x00900000 0x1000>,
 			      <0x009b0000 0x1040>,
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 9/9] ARM: dts: qcom: msm8974: Add "simple-bus" to our mdss node
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-10 23:02   ` Douglas Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: dri-devel, linux-arm-msm, freedreno, devicetree,
	Douglas Anderson, linux-kernel

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

NOTE: I have no way to test this patch, but I'm assuming it works OK?

 arch/arm/boot/dts/qcom-msm8974.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
index 51f5f904f9eb..9f84c9fd716c 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -1402,7 +1402,7 @@ cnoc: interconnect@fc480000 {
 		mdss: mdss@fd900000 {
 			status = "disabled";
 
-			compatible = "qcom,mdss";
+			compatible = "qcom,mdss", "simple-bus";
 			reg = <0xfd900000 0x100>,
 			      <0xfd924000 0x1000>;
 			reg-names = "mdss_phys",
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 9/9] ARM: dts: qcom: msm8974: Add "simple-bus" to our mdss node
@ 2020-07-10 23:02   ` Douglas Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Douglas Anderson @ 2020-07-10 23:02 UTC (permalink / raw)
  To: Rob Herring, Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson
  Cc: devicetree, linux-arm-msm, linux-kernel, dri-devel,
	Douglas Anderson, freedreno

As described in the bindings patch, this means that our child nodes
are devices in their own right.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If this patch lands before the patch ("drm/msm: Avoid manually
populating our children if "simple-bus" is there") it doesn't seem to
be the end of the world.  The first time through
add_display_components() it'll notice that the children are already
populated and it will be a no-op.  When it gets a defer it will then
depouplate them (even though it didn't populate them) and future calls
will just populate them again.

NOTE: I have no way to test this patch, but I'm assuming it works OK?

 arch/arm/boot/dts/qcom-msm8974.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
index 51f5f904f9eb..9f84c9fd716c 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -1402,7 +1402,7 @@ cnoc: interconnect@fc480000 {
 		mdss: mdss@fd900000 {
 			status = "disabled";
 
-			compatible = "qcom,mdss";
+			compatible = "qcom,mdss", "simple-bus";
 			reg = <0xfd900000 0x100>,
 			      <0xfd924000 0x1000>;
 			reg-names = "mdss_phys",
-- 
2.27.0.383.g050319c2ae-goog

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-10 23:02 ` Douglas Anderson
@ 2020-07-13 14:11   ` Rob Herring
  -1 siblings, 0 replies; 40+ messages in thread
From: Rob Herring @ 2020-07-13 14:11 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson, dri-devel,
	linux-arm-msm, freedreno, devicetree, Daniel Vetter,
	David Airlie, linux-kernel

On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> I found that if I ever had a little mistake in my kernel config,
> or device tree, or graphics driver that my system would sit in a loop
> at bootup trying again and again and again.  An example log was:

Why do we care about optimizing the error case?

>   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
>   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
>   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
>   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
>   ...
>
> I finally tracked it down where this was happening:
>   - msm_pdev_probe() is called.
>   - msm_pdev_probe() registers drivers.  Registering drivers kicks
>     off processing of probe deferrals.
>   - component_master_add_with_match() could return -EPROBE_DEFER.
>     making msm_pdev_probe() return -EPROBE_DEFER.
>   - When msm_pdev_probe() returned the processing of probe deferrals
>     happens.
>   - Loop back to the start.
>
> It looks like we can fix this by marking "mdss" as a "simple-bus".
> I have no idea if people consider this the right thing to do or a
> hack.  Hopefully it's the right thing to do.  :-)

It's a simple test. Do the child devices have any dependency on the
parent to probe and/or function? If so, not a simple-bus.

> Once I do this I notice that my boot gets marginally faster (you
> don't need to probe the sub devices over and over) and also if I

Can you quantify that?

Have you run with devlinks enabled. You need a command line option to
enable. That too should reduce deferred probes.

> have a problem it doesn't loop forever (on my system it still
> gets upset about some stuck clocks in that case, but at least I
> can boot up).

Deferred probe only runs when a device is added, so it's not like it
is continually running.

Rob

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-13 14:11   ` Rob Herring
  0 siblings, 0 replies; 40+ messages in thread
From: Rob Herring @ 2020-07-13 14:11 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: freedreno, devicetree, David Airlie, linux-arm-msm, linux-kernel,
	dri-devel, Bjorn Andersson, Andy Gross, Sean Paul

On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> I found that if I ever had a little mistake in my kernel config,
> or device tree, or graphics driver that my system would sit in a loop
> at bootup trying again and again and again.  An example log was:

Why do we care about optimizing the error case?

>   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
>   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
>   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
>   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
>   ...
>
> I finally tracked it down where this was happening:
>   - msm_pdev_probe() is called.
>   - msm_pdev_probe() registers drivers.  Registering drivers kicks
>     off processing of probe deferrals.
>   - component_master_add_with_match() could return -EPROBE_DEFER.
>     making msm_pdev_probe() return -EPROBE_DEFER.
>   - When msm_pdev_probe() returned the processing of probe deferrals
>     happens.
>   - Loop back to the start.
>
> It looks like we can fix this by marking "mdss" as a "simple-bus".
> I have no idea if people consider this the right thing to do or a
> hack.  Hopefully it's the right thing to do.  :-)

It's a simple test. Do the child devices have any dependency on the
parent to probe and/or function? If so, not a simple-bus.

> Once I do this I notice that my boot gets marginally faster (you
> don't need to probe the sub devices over and over) and also if I

Can you quantify that?

Have you run with devlinks enabled. You need a command line option to
enable. That too should reduce deferred probes.

> have a problem it doesn't loop forever (on my system it still
> gets upset about some stuck clocks in that case, but at least I
> can boot up).

Deferred probe only runs when a device is added, so it's not like it
is continually running.

Rob
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-13 14:11   ` Rob Herring
@ 2020-07-13 14:58     ` Jeffrey Hugo
  -1 siblings, 0 replies; 40+ messages in thread
From: Jeffrey Hugo @ 2020-07-13 14:58 UTC (permalink / raw)
  To: Rob Herring
  Cc: Douglas Anderson, freedreno, DTML, David Airlie, linux-arm-msm,
	linux-kernel, dri-devel, Bjorn Andersson, Rob Clark, Andy Gross,
	Daniel Vetter, Sean Paul

On Mon, Jul 13, 2020 at 8:11 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> >
> > I found that if I ever had a little mistake in my kernel config,
> > or device tree, or graphics driver that my system would sit in a loop
> > at bootup trying again and again and again.  An example log was:
>
> Why do we care about optimizing the error case?
>
> >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> >   ...
> >
> > I finally tracked it down where this was happening:
> >   - msm_pdev_probe() is called.
> >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> >     off processing of probe deferrals.
> >   - component_master_add_with_match() could return -EPROBE_DEFER.
> >     making msm_pdev_probe() return -EPROBE_DEFER.
> >   - When msm_pdev_probe() returned the processing of probe deferrals
> >     happens.
> >   - Loop back to the start.
> >
> > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > I have no idea if people consider this the right thing to do or a
> > hack.  Hopefully it's the right thing to do.  :-)
>
> It's a simple test. Do the child devices have any dependency on the
> parent to probe and/or function? If so, not a simple-bus.
>
> > Once I do this I notice that my boot gets marginally faster (you
> > don't need to probe the sub devices over and over) and also if I
>
> Can you quantify that?
>
> Have you run with devlinks enabled. You need a command line option to
> enable. That too should reduce deferred probes.
>
> > have a problem it doesn't loop forever (on my system it still
> > gets upset about some stuck clocks in that case, but at least I
> > can boot up).
>
> Deferred probe only runs when a device is added, so it's not like it
> is continually running.

But it is.  I've hit this as well, but haven't attempted a fix.

So we have a parent device, with several sub devices.  The parent
device probes which causes the sub devices to probe.  One of the sub
devices successfully probes, and another fails with EPROBE_DEFER.
This both caused the probe defer framework to immediately schedule
processing the probe defer queue, and also cause all of the chile
devices and the parent device to be removed to probe defer later.
Since the system state doesn't change (one of the sub devices actually
requires an independent other device to have probed), the system ends
up an an infinite probe defer loop.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-13 14:58     ` Jeffrey Hugo
  0 siblings, 0 replies; 40+ messages in thread
From: Jeffrey Hugo @ 2020-07-13 14:58 UTC (permalink / raw)
  To: Rob Herring
  Cc: Sean Paul, DTML, David Airlie, linux-arm-msm, Douglas Anderson,
	dri-devel, linux-kernel, Andy Gross, Bjorn Andersson, freedreno

On Mon, Jul 13, 2020 at 8:11 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> >
> > I found that if I ever had a little mistake in my kernel config,
> > or device tree, or graphics driver that my system would sit in a loop
> > at bootup trying again and again and again.  An example log was:
>
> Why do we care about optimizing the error case?
>
> >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> >   ...
> >
> > I finally tracked it down where this was happening:
> >   - msm_pdev_probe() is called.
> >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> >     off processing of probe deferrals.
> >   - component_master_add_with_match() could return -EPROBE_DEFER.
> >     making msm_pdev_probe() return -EPROBE_DEFER.
> >   - When msm_pdev_probe() returned the processing of probe deferrals
> >     happens.
> >   - Loop back to the start.
> >
> > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > I have no idea if people consider this the right thing to do or a
> > hack.  Hopefully it's the right thing to do.  :-)
>
> It's a simple test. Do the child devices have any dependency on the
> parent to probe and/or function? If so, not a simple-bus.
>
> > Once I do this I notice that my boot gets marginally faster (you
> > don't need to probe the sub devices over and over) and also if I
>
> Can you quantify that?
>
> Have you run with devlinks enabled. You need a command line option to
> enable. That too should reduce deferred probes.
>
> > have a problem it doesn't loop forever (on my system it still
> > gets upset about some stuck clocks in that case, but at least I
> > can boot up).
>
> Deferred probe only runs when a device is added, so it's not like it
> is continually running.

But it is.  I've hit this as well, but haven't attempted a fix.

So we have a parent device, with several sub devices.  The parent
device probes which causes the sub devices to probe.  One of the sub
devices successfully probes, and another fails with EPROBE_DEFER.
This both caused the probe defer framework to immediately schedule
processing the probe defer queue, and also cause all of the chile
devices and the parent device to be removed to probe defer later.
Since the system state doesn't change (one of the sub devices actually
requires an independent other device to have probed), the system ends
up an an infinite probe defer loop.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-13 14:11   ` Rob Herring
@ 2020-07-13 15:08     ` Doug Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Doug Anderson @ 2020-07-13 15:08 UTC (permalink / raw)
  To: Rob Herring
  Cc: Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson, dri-devel,
	linux-arm-msm, freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Daniel Vetter, David Airlie, linux-kernel

Hi,

On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> >
> > I found that if I ever had a little mistake in my kernel config,
> > or device tree, or graphics driver that my system would sit in a loop
> > at bootup trying again and again and again.  An example log was:
>
> Why do we care about optimizing the error case?

It actually results in a _fully_ infinite loop.  That is: if anything
small causes a component of DRM to fail to probe then the whole system
doesn't boot because it just loops trying to probe over and over
again.  The messages I put in the commit message are printed over and
over and over again.


> >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> >   ...
> >
> > I finally tracked it down where this was happening:
> >   - msm_pdev_probe() is called.
> >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> >     off processing of probe deferrals.
> >   - component_master_add_with_match() could return -EPROBE_DEFER.
> >     making msm_pdev_probe() return -EPROBE_DEFER.
> >   - When msm_pdev_probe() returned the processing of probe deferrals
> >     happens.
> >   - Loop back to the start.
> >
> > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > I have no idea if people consider this the right thing to do or a
> > hack.  Hopefully it's the right thing to do.  :-)
>
> It's a simple test. Do the child devices have any dependency on the
> parent to probe and/or function? If so, not a simple-bus.

Great!  You can see in the earlier patch in the series that the very
first thing that happens when the parent device probes is that it
calls devm_of_platform_populate().  That means no dependencies, right?
 So that means it's fine/correct to add "simple-bus" here?


> > Once I do this I notice that my boot gets marginally faster (you
> > don't need to probe the sub devices over and over) and also if I
>
> Can you quantify that?

I'd say < 100 us.  I can try to quantify more if needed, but it wasn't
the point of this patch.


> Have you run with devlinks enabled. You need a command line option to
> enable. That too should reduce deferred probes.

Ah, good idea!  I will try it.  However, even with devlinks, if there
is any chance of deferred probes then we need a fix like this.  The
point of the patch isn't about speeding things up but about avoiding
an infinite loop at bootup due to a small bug.


> > have a problem it doesn't loop forever (on my system it still
> > gets upset about some stuck clocks in that case, but at least I
> > can boot up).
>
> Deferred probe only runs when a device is added, so it's not like it
> is continually running.

If you don't mind looking at the code patch, see:

https://lore.kernel.org/r/20200710160131.4.I358ea82de218ea5f4406572ade23f5e121297555@changeid/

Specifically you can see that each time we try to probe we were
calling of_platform_populate().  That appeared to be enough to trigger
things.


-Doug

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-13 15:08     ` Doug Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Doug Anderson @ 2020-07-13 15:08 UTC (permalink / raw)
  To: Rob Herring
  Cc: freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, linux-kernel, dri-devel,
	Bjorn Andersson, Andy Gross, Sean Paul

Hi,

On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> >
> > I found that if I ever had a little mistake in my kernel config,
> > or device tree, or graphics driver that my system would sit in a loop
> > at bootup trying again and again and again.  An example log was:
>
> Why do we care about optimizing the error case?

It actually results in a _fully_ infinite loop.  That is: if anything
small causes a component of DRM to fail to probe then the whole system
doesn't boot because it just loops trying to probe over and over
again.  The messages I put in the commit message are printed over and
over and over again.


> >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> >   ...
> >
> > I finally tracked it down where this was happening:
> >   - msm_pdev_probe() is called.
> >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> >     off processing of probe deferrals.
> >   - component_master_add_with_match() could return -EPROBE_DEFER.
> >     making msm_pdev_probe() return -EPROBE_DEFER.
> >   - When msm_pdev_probe() returned the processing of probe deferrals
> >     happens.
> >   - Loop back to the start.
> >
> > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > I have no idea if people consider this the right thing to do or a
> > hack.  Hopefully it's the right thing to do.  :-)
>
> It's a simple test. Do the child devices have any dependency on the
> parent to probe and/or function? If so, not a simple-bus.

Great!  You can see in the earlier patch in the series that the very
first thing that happens when the parent device probes is that it
calls devm_of_platform_populate().  That means no dependencies, right?
 So that means it's fine/correct to add "simple-bus" here?


> > Once I do this I notice that my boot gets marginally faster (you
> > don't need to probe the sub devices over and over) and also if I
>
> Can you quantify that?

I'd say < 100 us.  I can try to quantify more if needed, but it wasn't
the point of this patch.


> Have you run with devlinks enabled. You need a command line option to
> enable. That too should reduce deferred probes.

Ah, good idea!  I will try it.  However, even with devlinks, if there
is any chance of deferred probes then we need a fix like this.  The
point of the patch isn't about speeding things up but about avoiding
an infinite loop at bootup due to a small bug.


> > have a problem it doesn't loop forever (on my system it still
> > gets upset about some stuck clocks in that case, but at least I
> > can boot up).
>
> Deferred probe only runs when a device is added, so it's not like it
> is continually running.

If you don't mind looking at the code patch, see:

https://lore.kernel.org/r/20200710160131.4.I358ea82de218ea5f4406572ade23f5e121297555@changeid/

Specifically you can see that each time we try to probe we were
calling of_platform_populate().  That appeared to be enough to trigger
things.


-Doug
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-13 15:08     ` Doug Anderson
@ 2020-07-13 20:25       ` Rob Herring
  -1 siblings, 0 replies; 40+ messages in thread
From: Rob Herring @ 2020-07-13 20:25 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson, dri-devel,
	linux-arm-msm, freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Daniel Vetter, David Airlie, linux-kernel

On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> >
> > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > >
> > > I found that if I ever had a little mistake in my kernel config,
> > > or device tree, or graphics driver that my system would sit in a loop
> > > at bootup trying again and again and again.  An example log was:
> >
> > Why do we care about optimizing the error case?
>
> It actually results in a _fully_ infinite loop.  That is: if anything
> small causes a component of DRM to fail to probe then the whole system
> doesn't boot because it just loops trying to probe over and over
> again.  The messages I put in the commit message are printed over and
> over and over again.

Sounds like a bug as that's not what should happen.

If you defer during boot (initcalls), then you'll be on the deferred
list until late_initcall and everything is retried. After
late_initcall, only devices getting added should trigger probing. But
maybe the adding and then removing a device is causing a re-trigger.

> > >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> > >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> > >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> > >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> > >   ...
> > >
> > > I finally tracked it down where this was happening:
> > >   - msm_pdev_probe() is called.
> > >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> > >     off processing of probe deferrals.
> > >   - component_master_add_with_match() could return -EPROBE_DEFER.
> > >     making msm_pdev_probe() return -EPROBE_DEFER.
> > >   - When msm_pdev_probe() returned the processing of probe deferrals
> > >     happens.
> > >   - Loop back to the start.
> > >
> > > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > > I have no idea if people consider this the right thing to do or a
> > > hack.  Hopefully it's the right thing to do.  :-)
> >
> > It's a simple test. Do the child devices have any dependency on the
> > parent to probe and/or function? If so, not a simple-bus.
>
> Great!  You can see in the earlier patch in the series that the very
> first thing that happens when the parent device probes is that it
> calls devm_of_platform_populate().  That means no dependencies, right?

It should. But then I reviewed the MDSS binding today and it looks
like the MDSS is the interrupt parent for at least some child devices?

>  So that means it's fine/correct to add "simple-bus" here?
>
>
> > > Once I do this I notice that my boot gets marginally faster (you
> > > don't need to probe the sub devices over and over) and also if I
> >
> > Can you quantify that?
>
> I'd say < 100 us.  I can try to quantify more if needed, but it wasn't
> the point of this patch.
>
>
> > Have you run with devlinks enabled. You need a command line option to
> > enable. That too should reduce deferred probes.
>
> Ah, good idea!  I will try it.  However, even with devlinks, if there
> is any chance of deferred probes then we need a fix like this.  The
> point of the patch isn't about speeding things up but about avoiding
> an infinite loop at bootup due to a small bug.

I think a deferred probe would only happen if there's a dependency we
don't track (but we're tracking about everything that's common). But
if there's some error, I'm not sure what would happen. Seems like a
good test case. :)

> > > have a problem it doesn't loop forever (on my system it still
> > > gets upset about some stuck clocks in that case, but at least I
> > > can boot up).
> >
> > Deferred probe only runs when a device is added, so it's not like it
> > is continually running.
>
> If you don't mind looking at the code patch, see:
>
> https://lore.kernel.org/r/20200710160131.4.I358ea82de218ea5f4406572ade23f5e121297555@changeid/
>
> Specifically you can see that each time we try to probe we were
> calling of_platform_populate().  That appeared to be enough to trigger
> things.

Like I said, sounds like a bug. Even if 'simple-bus' is the
appropriate thing to do here, it should be fixed or at least
understood.

Rob

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-13 20:25       ` Rob Herring
  0 siblings, 0 replies; 40+ messages in thread
From: Rob Herring @ 2020-07-13 20:25 UTC (permalink / raw)
  To: Doug Anderson
  Cc: freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, linux-kernel, dri-devel,
	Bjorn Andersson, Andy Gross, Sean Paul

On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> >
> > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > >
> > > I found that if I ever had a little mistake in my kernel config,
> > > or device tree, or graphics driver that my system would sit in a loop
> > > at bootup trying again and again and again.  An example log was:
> >
> > Why do we care about optimizing the error case?
>
> It actually results in a _fully_ infinite loop.  That is: if anything
> small causes a component of DRM to fail to probe then the whole system
> doesn't boot because it just loops trying to probe over and over
> again.  The messages I put in the commit message are printed over and
> over and over again.

Sounds like a bug as that's not what should happen.

If you defer during boot (initcalls), then you'll be on the deferred
list until late_initcall and everything is retried. After
late_initcall, only devices getting added should trigger probing. But
maybe the adding and then removing a device is causing a re-trigger.

> > >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> > >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> > >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> > >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> > >   ...
> > >
> > > I finally tracked it down where this was happening:
> > >   - msm_pdev_probe() is called.
> > >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> > >     off processing of probe deferrals.
> > >   - component_master_add_with_match() could return -EPROBE_DEFER.
> > >     making msm_pdev_probe() return -EPROBE_DEFER.
> > >   - When msm_pdev_probe() returned the processing of probe deferrals
> > >     happens.
> > >   - Loop back to the start.
> > >
> > > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > > I have no idea if people consider this the right thing to do or a
> > > hack.  Hopefully it's the right thing to do.  :-)
> >
> > It's a simple test. Do the child devices have any dependency on the
> > parent to probe and/or function? If so, not a simple-bus.
>
> Great!  You can see in the earlier patch in the series that the very
> first thing that happens when the parent device probes is that it
> calls devm_of_platform_populate().  That means no dependencies, right?

It should. But then I reviewed the MDSS binding today and it looks
like the MDSS is the interrupt parent for at least some child devices?

>  So that means it's fine/correct to add "simple-bus" here?
>
>
> > > Once I do this I notice that my boot gets marginally faster (you
> > > don't need to probe the sub devices over and over) and also if I
> >
> > Can you quantify that?
>
> I'd say < 100 us.  I can try to quantify more if needed, but it wasn't
> the point of this patch.
>
>
> > Have you run with devlinks enabled. You need a command line option to
> > enable. That too should reduce deferred probes.
>
> Ah, good idea!  I will try it.  However, even with devlinks, if there
> is any chance of deferred probes then we need a fix like this.  The
> point of the patch isn't about speeding things up but about avoiding
> an infinite loop at bootup due to a small bug.

I think a deferred probe would only happen if there's a dependency we
don't track (but we're tracking about everything that's common). But
if there's some error, I'm not sure what would happen. Seems like a
good test case. :)

> > > have a problem it doesn't loop forever (on my system it still
> > > gets upset about some stuck clocks in that case, but at least I
> > > can boot up).
> >
> > Deferred probe only runs when a device is added, so it's not like it
> > is continually running.
>
> If you don't mind looking at the code patch, see:
>
> https://lore.kernel.org/r/20200710160131.4.I358ea82de218ea5f4406572ade23f5e121297555@changeid/
>
> Specifically you can see that each time we try to probe we were
> calling of_platform_populate().  That appeared to be enough to trigger
> things.

Like I said, sounds like a bug. Even if 'simple-bus' is the
appropriate thing to do here, it should be fixed or at least
understood.

Rob
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-13 20:25       ` Rob Herring
@ 2020-07-13 21:32         ` Rob Clark
  -1 siblings, 0 replies; 40+ messages in thread
From: Rob Clark @ 2020-07-13 21:32 UTC (permalink / raw)
  To: Rob Herring
  Cc: Doug Anderson, Sean Paul, Andy Gross, Bjorn Andersson, dri-devel,
	linux-arm-msm, freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Daniel Vetter, David Airlie, linux-kernel

On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > >
> > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > >
> > > > I found that if I ever had a little mistake in my kernel config,
> > > > or device tree, or graphics driver that my system would sit in a loop
> > > > at bootup trying again and again and again.  An example log was:
> > >
> > > Why do we care about optimizing the error case?
> >
> > It actually results in a _fully_ infinite loop.  That is: if anything
> > small causes a component of DRM to fail to probe then the whole system
> > doesn't boot because it just loops trying to probe over and over
> > again.  The messages I put in the commit message are printed over and
> > over and over again.
>
> Sounds like a bug as that's not what should happen.
>
> If you defer during boot (initcalls), then you'll be on the deferred
> list until late_initcall and everything is retried. After
> late_initcall, only devices getting added should trigger probing. But
> maybe the adding and then removing a device is causing a re-trigger.
>
> > > >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> > > >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> > > >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> > > >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> > > >   ...
> > > >
> > > > I finally tracked it down where this was happening:
> > > >   - msm_pdev_probe() is called.
> > > >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> > > >     off processing of probe deferrals.
> > > >   - component_master_add_with_match() could return -EPROBE_DEFER.
> > > >     making msm_pdev_probe() return -EPROBE_DEFER.
> > > >   - When msm_pdev_probe() returned the processing of probe deferrals
> > > >     happens.
> > > >   - Loop back to the start.
> > > >
> > > > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > > > I have no idea if people consider this the right thing to do or a
> > > > hack.  Hopefully it's the right thing to do.  :-)
> > >
> > > It's a simple test. Do the child devices have any dependency on the
> > > parent to probe and/or function? If so, not a simple-bus.
> >
> > Great!  You can see in the earlier patch in the series that the very
> > first thing that happens when the parent device probes is that it
> > calls devm_of_platform_populate().  That means no dependencies, right?
>
> It should. But then I reviewed the MDSS binding today and it looks
> like the MDSS is the interrupt parent for at least some child devices?
>

yes, that is correct

BR,
-R

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-13 21:32         ` Rob Clark
  0 siblings, 0 replies; 40+ messages in thread
From: Rob Clark @ 2020-07-13 21:32 UTC (permalink / raw)
  To: Rob Herring
  Cc: freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, Doug Anderson, dri-devel,
	Bjorn Andersson, Andy Gross, Sean Paul, linux-kernel

On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > >
> > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > >
> > > > I found that if I ever had a little mistake in my kernel config,
> > > > or device tree, or graphics driver that my system would sit in a loop
> > > > at bootup trying again and again and again.  An example log was:
> > >
> > > Why do we care about optimizing the error case?
> >
> > It actually results in a _fully_ infinite loop.  That is: if anything
> > small causes a component of DRM to fail to probe then the whole system
> > doesn't boot because it just loops trying to probe over and over
> > again.  The messages I put in the commit message are printed over and
> > over and over again.
>
> Sounds like a bug as that's not what should happen.
>
> If you defer during boot (initcalls), then you'll be on the deferred
> list until late_initcall and everything is retried. After
> late_initcall, only devices getting added should trigger probing. But
> maybe the adding and then removing a device is causing a re-trigger.
>
> > > >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> > > >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> > > >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> > > >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> > > >   ...
> > > >
> > > > I finally tracked it down where this was happening:
> > > >   - msm_pdev_probe() is called.
> > > >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> > > >     off processing of probe deferrals.
> > > >   - component_master_add_with_match() could return -EPROBE_DEFER.
> > > >     making msm_pdev_probe() return -EPROBE_DEFER.
> > > >   - When msm_pdev_probe() returned the processing of probe deferrals
> > > >     happens.
> > > >   - Loop back to the start.
> > > >
> > > > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > > > I have no idea if people consider this the right thing to do or a
> > > > hack.  Hopefully it's the right thing to do.  :-)
> > >
> > > It's a simple test. Do the child devices have any dependency on the
> > > parent to probe and/or function? If so, not a simple-bus.
> >
> > Great!  You can see in the earlier patch in the series that the very
> > first thing that happens when the parent device probes is that it
> > calls devm_of_platform_populate().  That means no dependencies, right?
>
> It should. But then I reviewed the MDSS binding today and it looks
> like the MDSS is the interrupt parent for at least some child devices?
>

yes, that is correct

BR,
-R
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-13 20:25       ` Rob Herring
@ 2020-07-13 23:50         ` Doug Anderson
  -1 siblings, 0 replies; 40+ messages in thread
From: Doug Anderson @ 2020-07-13 23:50 UTC (permalink / raw)
  To: Rob Herring
  Cc: Rob Clark, Sean Paul, Andy Gross, Bjorn Andersson, dri-devel,
	linux-arm-msm, freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Daniel Vetter, David Airlie, linux-kernel

Hi,

On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > >
> > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > >
> > > > I found that if I ever had a little mistake in my kernel config,
> > > > or device tree, or graphics driver that my system would sit in a loop
> > > > at bootup trying again and again and again.  An example log was:
> > >
> > > Why do we care about optimizing the error case?
> >
> > It actually results in a _fully_ infinite loop.  That is: if anything
> > small causes a component of DRM to fail to probe then the whole system
> > doesn't boot because it just loops trying to probe over and over
> > again.  The messages I put in the commit message are printed over and
> > over and over again.
>
> Sounds like a bug as that's not what should happen.
>
> If you defer during boot (initcalls), then you'll be on the deferred
> list until late_initcall and everything is retried. After
> late_initcall, only devices getting added should trigger probing. But
> maybe the adding and then removing a device is causing a re-trigger.

Right, I'm nearly certain that the adding and then removing is causing
a re-trigger.  I believe the loop would happen for any case where we
have a probe function that:

1. Adds devices.
2. After adding devices it decides that it needs to defer.
3. Removes the devices it added.
4. Return -EPROBE_DEFER from its probe function.

Specifically from what I know about how -EPROBE_DEFER works I'm not
sure how it wouldn't cause an infinite loop in that case.

Perhaps the missing part of my explanation, though, is why it never
gets out of this infinite loop.  In my case I purposely made the
bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
every time.  Obviously I wasn't going to get a display up like this,
but I just wanted to not loop forever at bootup.  I tracked down
exactly why we get an - EPROBE_DEFER over and over in this case.

You can see it in msm_dsi_host_register().  If some components haven't
shown up when that function runs it will _always_ return
-EPROBE_DEFER.

In my case, since I caused the bridge to fail to probe, those
components will _never_ show up.  That means that
msm_dsi_host_register() will _always_ return -EPROBE_DEFER.

I haven't dug through all the DRM code enough, but it doesn't
necessarily seem like the wrong behavior.  If the bridge driver or a
panel was a module then (presumably) they could show up later and so
it should be OK for it to defer, right?

So with all that, it doesn't really feel like this is a bug so much as
it's an unsupported use case.  The current deferral logic simply can't
handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
if your probe function adds devices each time through the probe
function.

Assuming all the above makes sense, that means we're stuck with:

a) This patch series, which makes us not add devices.

b) Some other patch series which rearchitects the MSM graphics stack
to not return -EPROBE_DEFER in this case.

c) Smarten up the deferral system to somehow detect this loop.  I'm
really not sure how to do this.  You'd have to somehow know that you
keep adding the same devices over and over again and they didn't get
us out of the deferral loop last time and so you should eventually
give up.


> > > >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> > > >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> > > >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> > > >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> > > >   ...
> > > >
> > > > I finally tracked it down where this was happening:
> > > >   - msm_pdev_probe() is called.
> > > >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> > > >     off processing of probe deferrals.
> > > >   - component_master_add_with_match() could return -EPROBE_DEFER.
> > > >     making msm_pdev_probe() return -EPROBE_DEFER.
> > > >   - When msm_pdev_probe() returned the processing of probe deferrals
> > > >     happens.
> > > >   - Loop back to the start.
> > > >
> > > > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > > > I have no idea if people consider this the right thing to do or a
> > > > hack.  Hopefully it's the right thing to do.  :-)
> > >
> > > It's a simple test. Do the child devices have any dependency on the
> > > parent to probe and/or function? If so, not a simple-bus.
> >
> > Great!  You can see in the earlier patch in the series that the very
> > first thing that happens when the parent device probes is that it
> > calls devm_of_platform_populate().  That means no dependencies, right?
>
> It should. But then I reviewed the MDSS binding today and it looks
> like the MDSS is the interrupt parent for at least some child devices?

Hrm.  How does that work?  Let's see...

...ah, I believe it works because we don't try to grab interrupts in
the probe path of our sub-components.  That means we probe them just
fine without the parent.  I guess it has to be like this because
otherwise we end up with circular dependencies.

So there is a dependency of the child on the parent and of the parent
on the child (the parent won't really probe until the children do).
No idea if this means that the whole thing was architected in a
non-optimal way or if it's just really hard to fit the DRM Component
model into the Linux Driver model (or both).  Where does that leave us
about whether "simple-bus" is OK, though?



> >  So that means it's fine/correct to add "simple-bus" here?
> >
> >
> > > > Once I do this I notice that my boot gets marginally faster (you
> > > > don't need to probe the sub devices over and over) and also if I
> > >
> > > Can you quantify that?
> >
> > I'd say < 100 us.  I can try to quantify more if needed, but it wasn't
> > the point of this patch.
> >
> >
> > > Have you run with devlinks enabled. You need a command line option to
> > > enable. That too should reduce deferred probes.
> >
> > Ah, good idea!  I will try it.  However, even with devlinks, if there
> > is any chance of deferred probes then we need a fix like this.  The
> > point of the patch isn't about speeding things up but about avoiding
> > an infinite loop at bootup due to a small bug.
>
> I think a deferred probe would only happen if there's a dependency we
> don't track (but we're tracking about everything that's common). But
> if there's some error, I'm not sure what would happen. Seems like a
> good test case. :)

Maybe now that I've pointed at msm_dsi_host_register() it will help
clarify.  I don't know a ton about the MSM DRM world (mostly I just
jumped in here because I was sick of getting stuck in this infinite
loop), but I'm not sure how we can get around the problems.

I guess in my specific case we could maybe determine that the bridge
chip returned -EINVAL and thus would never probe, but what about if I
put the bridge chip driver in a loadable kernel module?

My device links knowledge is super weak (and I'm currently mostly
focused on the slightly older 5.4 kernel if that matters) but are you
saying that the system should just know which device would eventually
provide the bridge/panel and would know not to probe the main DRM
driver until after it probes?


> > > > have a problem it doesn't loop forever (on my system it still
> > > > gets upset about some stuck clocks in that case, but at least I
> > > > can boot up).
> > >
> > > Deferred probe only runs when a device is added, so it's not like it
> > > is continually running.
> >
> > If you don't mind looking at the code patch, see:
> >
> > https://lore.kernel.org/r/20200710160131.4.I358ea82de218ea5f4406572ade23f5e121297555@changeid/
> >
> > Specifically you can see that each time we try to probe we were
> > calling of_platform_populate().  That appeared to be enough to trigger
> > things.
>
> Like I said, sounds like a bug. Even if 'simple-bus' is the
> appropriate thing to do here, it should be fixed or at least
> understood.
>
> Rob

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-13 23:50         ` Doug Anderson
  0 siblings, 0 replies; 40+ messages in thread
From: Doug Anderson @ 2020-07-13 23:50 UTC (permalink / raw)
  To: Rob Herring
  Cc: freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, linux-kernel, dri-devel,
	Bjorn Andersson, Andy Gross, Sean Paul

Hi,

On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > >
> > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > >
> > > > I found that if I ever had a little mistake in my kernel config,
> > > > or device tree, or graphics driver that my system would sit in a loop
> > > > at bootup trying again and again and again.  An example log was:
> > >
> > > Why do we care about optimizing the error case?
> >
> > It actually results in a _fully_ infinite loop.  That is: if anything
> > small causes a component of DRM to fail to probe then the whole system
> > doesn't boot because it just loops trying to probe over and over
> > again.  The messages I put in the commit message are printed over and
> > over and over again.
>
> Sounds like a bug as that's not what should happen.
>
> If you defer during boot (initcalls), then you'll be on the deferred
> list until late_initcall and everything is retried. After
> late_initcall, only devices getting added should trigger probing. But
> maybe the adding and then removing a device is causing a re-trigger.

Right, I'm nearly certain that the adding and then removing is causing
a re-trigger.  I believe the loop would happen for any case where we
have a probe function that:

1. Adds devices.
2. After adding devices it decides that it needs to defer.
3. Removes the devices it added.
4. Return -EPROBE_DEFER from its probe function.

Specifically from what I know about how -EPROBE_DEFER works I'm not
sure how it wouldn't cause an infinite loop in that case.

Perhaps the missing part of my explanation, though, is why it never
gets out of this infinite loop.  In my case I purposely made the
bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
every time.  Obviously I wasn't going to get a display up like this,
but I just wanted to not loop forever at bootup.  I tracked down
exactly why we get an - EPROBE_DEFER over and over in this case.

You can see it in msm_dsi_host_register().  If some components haven't
shown up when that function runs it will _always_ return
-EPROBE_DEFER.

In my case, since I caused the bridge to fail to probe, those
components will _never_ show up.  That means that
msm_dsi_host_register() will _always_ return -EPROBE_DEFER.

I haven't dug through all the DRM code enough, but it doesn't
necessarily seem like the wrong behavior.  If the bridge driver or a
panel was a module then (presumably) they could show up later and so
it should be OK for it to defer, right?

So with all that, it doesn't really feel like this is a bug so much as
it's an unsupported use case.  The current deferral logic simply can't
handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
if your probe function adds devices each time through the probe
function.

Assuming all the above makes sense, that means we're stuck with:

a) This patch series, which makes us not add devices.

b) Some other patch series which rearchitects the MSM graphics stack
to not return -EPROBE_DEFER in this case.

c) Smarten up the deferral system to somehow detect this loop.  I'm
really not sure how to do this.  You'd have to somehow know that you
keep adding the same devices over and over again and they didn't get
us out of the deferral loop last time and so you should eventually
give up.


> > > >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> > > >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> > > >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> > > >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> > > >   ...
> > > >
> > > > I finally tracked it down where this was happening:
> > > >   - msm_pdev_probe() is called.
> > > >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> > > >     off processing of probe deferrals.
> > > >   - component_master_add_with_match() could return -EPROBE_DEFER.
> > > >     making msm_pdev_probe() return -EPROBE_DEFER.
> > > >   - When msm_pdev_probe() returned the processing of probe deferrals
> > > >     happens.
> > > >   - Loop back to the start.
> > > >
> > > > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > > > I have no idea if people consider this the right thing to do or a
> > > > hack.  Hopefully it's the right thing to do.  :-)
> > >
> > > It's a simple test. Do the child devices have any dependency on the
> > > parent to probe and/or function? If so, not a simple-bus.
> >
> > Great!  You can see in the earlier patch in the series that the very
> > first thing that happens when the parent device probes is that it
> > calls devm_of_platform_populate().  That means no dependencies, right?
>
> It should. But then I reviewed the MDSS binding today and it looks
> like the MDSS is the interrupt parent for at least some child devices?

Hrm.  How does that work?  Let's see...

...ah, I believe it works because we don't try to grab interrupts in
the probe path of our sub-components.  That means we probe them just
fine without the parent.  I guess it has to be like this because
otherwise we end up with circular dependencies.

So there is a dependency of the child on the parent and of the parent
on the child (the parent won't really probe until the children do).
No idea if this means that the whole thing was architected in a
non-optimal way or if it's just really hard to fit the DRM Component
model into the Linux Driver model (or both).  Where does that leave us
about whether "simple-bus" is OK, though?



> >  So that means it's fine/correct to add "simple-bus" here?
> >
> >
> > > > Once I do this I notice that my boot gets marginally faster (you
> > > > don't need to probe the sub devices over and over) and also if I
> > >
> > > Can you quantify that?
> >
> > I'd say < 100 us.  I can try to quantify more if needed, but it wasn't
> > the point of this patch.
> >
> >
> > > Have you run with devlinks enabled. You need a command line option to
> > > enable. That too should reduce deferred probes.
> >
> > Ah, good idea!  I will try it.  However, even with devlinks, if there
> > is any chance of deferred probes then we need a fix like this.  The
> > point of the patch isn't about speeding things up but about avoiding
> > an infinite loop at bootup due to a small bug.
>
> I think a deferred probe would only happen if there's a dependency we
> don't track (but we're tracking about everything that's common). But
> if there's some error, I'm not sure what would happen. Seems like a
> good test case. :)

Maybe now that I've pointed at msm_dsi_host_register() it will help
clarify.  I don't know a ton about the MSM DRM world (mostly I just
jumped in here because I was sick of getting stuck in this infinite
loop), but I'm not sure how we can get around the problems.

I guess in my specific case we could maybe determine that the bridge
chip returned -EINVAL and thus would never probe, but what about if I
put the bridge chip driver in a loadable kernel module?

My device links knowledge is super weak (and I'm currently mostly
focused on the slightly older 5.4 kernel if that matters) but are you
saying that the system should just know which device would eventually
provide the bridge/panel and would know not to probe the main DRM
driver until after it probes?


> > > > have a problem it doesn't loop forever (on my system it still
> > > > gets upset about some stuck clocks in that case, but at least I
> > > > can boot up).
> > >
> > > Deferred probe only runs when a device is added, so it's not like it
> > > is continually running.
> >
> > If you don't mind looking at the code patch, see:
> >
> > https://lore.kernel.org/r/20200710160131.4.I358ea82de218ea5f4406572ade23f5e121297555@changeid/
> >
> > Specifically you can see that each time we try to probe we were
> > calling of_platform_populate().  That appeared to be enough to trigger
> > things.
>
> Like I said, sounds like a bug. Even if 'simple-bus' is the
> appropriate thing to do here, it should be fixed or at least
> understood.
>
> Rob
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-13 23:50         ` Doug Anderson
@ 2020-07-14 16:32           ` Jeffrey Hugo
  -1 siblings, 0 replies; 40+ messages in thread
From: Jeffrey Hugo @ 2020-07-14 16:32 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Rob Herring, freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, linux-kernel, dri-devel,
	Bjorn Andersson, Rob Clark, Andy Gross, Daniel Vetter, Sean Paul

On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> >
> > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > >
> > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > >
> > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > at bootup trying again and again and again.  An example log was:
> > > >
> > > > Why do we care about optimizing the error case?
> > >
> > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > small causes a component of DRM to fail to probe then the whole system
> > > doesn't boot because it just loops trying to probe over and over
> > > again.  The messages I put in the commit message are printed over and
> > > over and over again.
> >
> > Sounds like a bug as that's not what should happen.
> >
> > If you defer during boot (initcalls), then you'll be on the deferred
> > list until late_initcall and everything is retried. After
> > late_initcall, only devices getting added should trigger probing. But
> > maybe the adding and then removing a device is causing a re-trigger.
>
> Right, I'm nearly certain that the adding and then removing is causing
> a re-trigger.  I believe the loop would happen for any case where we
> have a probe function that:
>
> 1. Adds devices.
> 2. After adding devices it decides that it needs to defer.
> 3. Removes the devices it added.
> 4. Return -EPROBE_DEFER from its probe function.
>
> Specifically from what I know about how -EPROBE_DEFER works I'm not
> sure how it wouldn't cause an infinite loop in that case.
>
> Perhaps the missing part of my explanation, though, is why it never
> gets out of this infinite loop.  In my case I purposely made the
> bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> every time.  Obviously I wasn't going to get a display up like this,
> but I just wanted to not loop forever at bootup.  I tracked down
> exactly why we get an - EPROBE_DEFER over and over in this case.
>
> You can see it in msm_dsi_host_register().  If some components haven't
> shown up when that function runs it will _always_ return
> -EPROBE_DEFER.
>
> In my case, since I caused the bridge to fail to probe, those
> components will _never_ show up.  That means that
> msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
>
> I haven't dug through all the DRM code enough, but it doesn't
> necessarily seem like the wrong behavior.  If the bridge driver or a
> panel was a module then (presumably) they could show up later and so
> it should be OK for it to defer, right?
>
> So with all that, it doesn't really feel like this is a bug so much as
> it's an unsupported use case.  The current deferral logic simply can't
> handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> if your probe function adds devices each time through the probe
> function.
>
> Assuming all the above makes sense, that means we're stuck with:
>
> a) This patch series, which makes us not add devices.
>
> b) Some other patch series which rearchitects the MSM graphics stack
> to not return -EPROBE_DEFER in this case.

This isn't a MSM specific issue.  This is an issue with how the DSI
interface works, and how software is structured in Linux.  I would
expect that pretty much any DSI host in the kernel would have some
version of this issue.

The problem is that DSI is not "hot pluggable", so to give the DRM
stack the info it needs, we need both the DSI controller (aka the MSM
graphics stack in your case), and the thing it connects to (in your
case, the TI bridge, normally the actual panel) because the DRM stack
expects that if init completes, it has certain information
(resolution, etc), and some of that information is in the DSI
controller, and some of it is on the DSI device.

>
> c) Smarten up the deferral system to somehow detect this loop.  I'm
> really not sure how to do this.  You'd have to somehow know that you
> keep adding the same devices over and over again and they didn't get
> us out of the deferral loop last time and so you should eventually
> give up.
>
>
> > > > >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> > > > >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> > > > >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> > > > >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> > > > >   ...
> > > > >
> > > > > I finally tracked it down where this was happening:
> > > > >   - msm_pdev_probe() is called.
> > > > >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> > > > >     off processing of probe deferrals.
> > > > >   - component_master_add_with_match() could return -EPROBE_DEFER.
> > > > >     making msm_pdev_probe() return -EPROBE_DEFER.
> > > > >   - When msm_pdev_probe() returned the processing of probe deferrals
> > > > >     happens.
> > > > >   - Loop back to the start.
> > > > >
> > > > > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > > > > I have no idea if people consider this the right thing to do or a
> > > > > hack.  Hopefully it's the right thing to do.  :-)
> > > >
> > > > It's a simple test. Do the child devices have any dependency on the
> > > > parent to probe and/or function? If so, not a simple-bus.
> > >
> > > Great!  You can see in the earlier patch in the series that the very
> > > first thing that happens when the parent device probes is that it
> > > calls devm_of_platform_populate().  That means no dependencies, right?
> >
> > It should. But then I reviewed the MDSS binding today and it looks
> > like the MDSS is the interrupt parent for at least some child devices?
>
> Hrm.  How does that work?  Let's see...
>
> ...ah, I believe it works because we don't try to grab interrupts in
> the probe path of our sub-components.  That means we probe them just
> fine without the parent.  I guess it has to be like this because
> otherwise we end up with circular dependencies.
>
> So there is a dependency of the child on the parent and of the parent
> on the child (the parent won't really probe until the children do).
> No idea if this means that the whole thing was architected in a
> non-optimal way or if it's just really hard to fit the DRM Component
> model into the Linux Driver model (or both).  Where does that leave us
> about whether "simple-bus" is OK, though?
>
>
>
> > >  So that means it's fine/correct to add "simple-bus" here?
> > >
> > >
> > > > > Once I do this I notice that my boot gets marginally faster (you
> > > > > don't need to probe the sub devices over and over) and also if I
> > > >
> > > > Can you quantify that?
> > >
> > > I'd say < 100 us.  I can try to quantify more if needed, but it wasn't
> > > the point of this patch.
> > >
> > >
> > > > Have you run with devlinks enabled. You need a command line option to
> > > > enable. That too should reduce deferred probes.
> > >
> > > Ah, good idea!  I will try it.  However, even with devlinks, if there
> > > is any chance of deferred probes then we need a fix like this.  The
> > > point of the patch isn't about speeding things up but about avoiding
> > > an infinite loop at bootup due to a small bug.
> >
> > I think a deferred probe would only happen if there's a dependency we
> > don't track (but we're tracking about everything that's common). But
> > if there's some error, I'm not sure what would happen. Seems like a
> > good test case. :)
>
> Maybe now that I've pointed at msm_dsi_host_register() it will help
> clarify.  I don't know a ton about the MSM DRM world (mostly I just
> jumped in here because I was sick of getting stuck in this infinite
> loop), but I'm not sure how we can get around the problems.
>
> I guess in my specific case we could maybe determine that the bridge
> chip returned -EINVAL and thus would never probe, but what about if I
> put the bridge chip driver in a loadable kernel module?
>
> My device links knowledge is super weak (and I'm currently mostly
> focused on the slightly older 5.4 kernel if that matters) but are you
> saying that the system should just know which device would eventually
> provide the bridge/panel and would know not to probe the main DRM
> driver until after it probes?
>
>
> > > > > have a problem it doesn't loop forever (on my system it still
> > > > > gets upset about some stuck clocks in that case, but at least I
> > > > > can boot up).
> > > >
> > > > Deferred probe only runs when a device is added, so it's not like it
> > > > is continually running.
> > >
> > > If you don't mind looking at the code patch, see:
> > >
> > > https://lore.kernel.org/r/20200710160131.4.I358ea82de218ea5f4406572ade23f5e121297555@changeid/
> > >
> > > Specifically you can see that each time we try to probe we were
> > > calling of_platform_populate().  That appeared to be enough to trigger
> > > things.
> >
> > Like I said, sounds like a bug. Even if 'simple-bus' is the
> > appropriate thing to do here, it should be fixed or at least
> > understood.
> >
> > Rob
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-14 16:32           ` Jeffrey Hugo
  0 siblings, 0 replies; 40+ messages in thread
From: Jeffrey Hugo @ 2020-07-14 16:32 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Sean Paul,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, linux-kernel, dri-devel,
	Bjorn Andersson, Rob Herring, Andy Gross, freedreno

On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> >
> > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > >
> > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > >
> > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > at bootup trying again and again and again.  An example log was:
> > > >
> > > > Why do we care about optimizing the error case?
> > >
> > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > small causes a component of DRM to fail to probe then the whole system
> > > doesn't boot because it just loops trying to probe over and over
> > > again.  The messages I put in the commit message are printed over and
> > > over and over again.
> >
> > Sounds like a bug as that's not what should happen.
> >
> > If you defer during boot (initcalls), then you'll be on the deferred
> > list until late_initcall and everything is retried. After
> > late_initcall, only devices getting added should trigger probing. But
> > maybe the adding and then removing a device is causing a re-trigger.
>
> Right, I'm nearly certain that the adding and then removing is causing
> a re-trigger.  I believe the loop would happen for any case where we
> have a probe function that:
>
> 1. Adds devices.
> 2. After adding devices it decides that it needs to defer.
> 3. Removes the devices it added.
> 4. Return -EPROBE_DEFER from its probe function.
>
> Specifically from what I know about how -EPROBE_DEFER works I'm not
> sure how it wouldn't cause an infinite loop in that case.
>
> Perhaps the missing part of my explanation, though, is why it never
> gets out of this infinite loop.  In my case I purposely made the
> bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> every time.  Obviously I wasn't going to get a display up like this,
> but I just wanted to not loop forever at bootup.  I tracked down
> exactly why we get an - EPROBE_DEFER over and over in this case.
>
> You can see it in msm_dsi_host_register().  If some components haven't
> shown up when that function runs it will _always_ return
> -EPROBE_DEFER.
>
> In my case, since I caused the bridge to fail to probe, those
> components will _never_ show up.  That means that
> msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
>
> I haven't dug through all the DRM code enough, but it doesn't
> necessarily seem like the wrong behavior.  If the bridge driver or a
> panel was a module then (presumably) they could show up later and so
> it should be OK for it to defer, right?
>
> So with all that, it doesn't really feel like this is a bug so much as
> it's an unsupported use case.  The current deferral logic simply can't
> handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> if your probe function adds devices each time through the probe
> function.
>
> Assuming all the above makes sense, that means we're stuck with:
>
> a) This patch series, which makes us not add devices.
>
> b) Some other patch series which rearchitects the MSM graphics stack
> to not return -EPROBE_DEFER in this case.

This isn't a MSM specific issue.  This is an issue with how the DSI
interface works, and how software is structured in Linux.  I would
expect that pretty much any DSI host in the kernel would have some
version of this issue.

The problem is that DSI is not "hot pluggable", so to give the DRM
stack the info it needs, we need both the DSI controller (aka the MSM
graphics stack in your case), and the thing it connects to (in your
case, the TI bridge, normally the actual panel) because the DRM stack
expects that if init completes, it has certain information
(resolution, etc), and some of that information is in the DSI
controller, and some of it is on the DSI device.

>
> c) Smarten up the deferral system to somehow detect this loop.  I'm
> really not sure how to do this.  You'd have to somehow know that you
> keep adding the same devices over and over again and they didn't get
> us out of the deferral loop last time and so you should eventually
> give up.
>
>
> > > > >   msm ae00000.mdss: bound ae01000.mdp (ops 0xffffffe596e951f8)
> > > > >   msm_dsi ae94000.dsi: ae94000.dsi supply gdsc not found, using dummy regulator
> > > > >   msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
> > > > >   [drm:ti_sn_bridge_probe] *ERROR* could not find any panel node
> > > > >   ...
> > > > >
> > > > > I finally tracked it down where this was happening:
> > > > >   - msm_pdev_probe() is called.
> > > > >   - msm_pdev_probe() registers drivers.  Registering drivers kicks
> > > > >     off processing of probe deferrals.
> > > > >   - component_master_add_with_match() could return -EPROBE_DEFER.
> > > > >     making msm_pdev_probe() return -EPROBE_DEFER.
> > > > >   - When msm_pdev_probe() returned the processing of probe deferrals
> > > > >     happens.
> > > > >   - Loop back to the start.
> > > > >
> > > > > It looks like we can fix this by marking "mdss" as a "simple-bus".
> > > > > I have no idea if people consider this the right thing to do or a
> > > > > hack.  Hopefully it's the right thing to do.  :-)
> > > >
> > > > It's a simple test. Do the child devices have any dependency on the
> > > > parent to probe and/or function? If so, not a simple-bus.
> > >
> > > Great!  You can see in the earlier patch in the series that the very
> > > first thing that happens when the parent device probes is that it
> > > calls devm_of_platform_populate().  That means no dependencies, right?
> >
> > It should. But then I reviewed the MDSS binding today and it looks
> > like the MDSS is the interrupt parent for at least some child devices?
>
> Hrm.  How does that work?  Let's see...
>
> ...ah, I believe it works because we don't try to grab interrupts in
> the probe path of our sub-components.  That means we probe them just
> fine without the parent.  I guess it has to be like this because
> otherwise we end up with circular dependencies.
>
> So there is a dependency of the child on the parent and of the parent
> on the child (the parent won't really probe until the children do).
> No idea if this means that the whole thing was architected in a
> non-optimal way or if it's just really hard to fit the DRM Component
> model into the Linux Driver model (or both).  Where does that leave us
> about whether "simple-bus" is OK, though?
>
>
>
> > >  So that means it's fine/correct to add "simple-bus" here?
> > >
> > >
> > > > > Once I do this I notice that my boot gets marginally faster (you
> > > > > don't need to probe the sub devices over and over) and also if I
> > > >
> > > > Can you quantify that?
> > >
> > > I'd say < 100 us.  I can try to quantify more if needed, but it wasn't
> > > the point of this patch.
> > >
> > >
> > > > Have you run with devlinks enabled. You need a command line option to
> > > > enable. That too should reduce deferred probes.
> > >
> > > Ah, good idea!  I will try it.  However, even with devlinks, if there
> > > is any chance of deferred probes then we need a fix like this.  The
> > > point of the patch isn't about speeding things up but about avoiding
> > > an infinite loop at bootup due to a small bug.
> >
> > I think a deferred probe would only happen if there's a dependency we
> > don't track (but we're tracking about everything that's common). But
> > if there's some error, I'm not sure what would happen. Seems like a
> > good test case. :)
>
> Maybe now that I've pointed at msm_dsi_host_register() it will help
> clarify.  I don't know a ton about the MSM DRM world (mostly I just
> jumped in here because I was sick of getting stuck in this infinite
> loop), but I'm not sure how we can get around the problems.
>
> I guess in my specific case we could maybe determine that the bridge
> chip returned -EINVAL and thus would never probe, but what about if I
> put the bridge chip driver in a loadable kernel module?
>
> My device links knowledge is super weak (and I'm currently mostly
> focused on the slightly older 5.4 kernel if that matters) but are you
> saying that the system should just know which device would eventually
> provide the bridge/panel and would know not to probe the main DRM
> driver until after it probes?
>
>
> > > > > have a problem it doesn't loop forever (on my system it still
> > > > > gets upset about some stuck clocks in that case, but at least I
> > > > > can boot up).
> > > >
> > > > Deferred probe only runs when a device is added, so it's not like it
> > > > is continually running.
> > >
> > > If you don't mind looking at the code patch, see:
> > >
> > > https://lore.kernel.org/r/20200710160131.4.I358ea82de218ea5f4406572ade23f5e121297555@changeid/
> > >
> > > Specifically you can see that each time we try to probe we were
> > > calling of_platform_populate().  That appeared to be enough to trigger
> > > things.
> >
> > Like I said, sounds like a bug. Even if 'simple-bus' is the
> > appropriate thing to do here, it should be fixed or at least
> > understood.
> >
> > Rob
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-14 16:32           ` Jeffrey Hugo
@ 2020-07-14 22:13             ` Rob Herring
  -1 siblings, 0 replies; 40+ messages in thread
From: Rob Herring @ 2020-07-14 22:13 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Doug Anderson, freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, linux-kernel, dri-devel,
	Bjorn Andersson, Rob Clark, Andy Gross, Daniel Vetter, Sean Paul

On Tue, Jul 14, 2020 at 10:33 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
>
> On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> > >
> > > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > > >
> > > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > > >
> > > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > > at bootup trying again and again and again.  An example log was:
> > > > >
> > > > > Why do we care about optimizing the error case?
> > > >
> > > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > > small causes a component of DRM to fail to probe then the whole system
> > > > doesn't boot because it just loops trying to probe over and over
> > > > again.  The messages I put in the commit message are printed over and
> > > > over and over again.
> > >
> > > Sounds like a bug as that's not what should happen.
> > >
> > > If you defer during boot (initcalls), then you'll be on the deferred
> > > list until late_initcall and everything is retried. After
> > > late_initcall, only devices getting added should trigger probing. But
> > > maybe the adding and then removing a device is causing a re-trigger.
> >
> > Right, I'm nearly certain that the adding and then removing is causing
> > a re-trigger.  I believe the loop would happen for any case where we
> > have a probe function that:
> >
> > 1. Adds devices.
> > 2. After adding devices it decides that it needs to defer.
> > 3. Removes the devices it added.
> > 4. Return -EPROBE_DEFER from its probe function.
> >
> > Specifically from what I know about how -EPROBE_DEFER works I'm not
> > sure how it wouldn't cause an infinite loop in that case.
> >
> > Perhaps the missing part of my explanation, though, is why it never
> > gets out of this infinite loop.  In my case I purposely made the
> > bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> > every time.  Obviously I wasn't going to get a display up like this,
> > but I just wanted to not loop forever at bootup.  I tracked down
> > exactly why we get an - EPROBE_DEFER over and over in this case.
> >
> > You can see it in msm_dsi_host_register().  If some components haven't
> > shown up when that function runs it will _always_ return
> > -EPROBE_DEFER.
> >
> > In my case, since I caused the bridge to fail to probe, those
> > components will _never_ show up.  That means that
> > msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
> >
> > I haven't dug through all the DRM code enough, but it doesn't
> > necessarily seem like the wrong behavior.  If the bridge driver or a
> > panel was a module then (presumably) they could show up later and so
> > it should be OK for it to defer, right?
> >
> > So with all that, it doesn't really feel like this is a bug so much as
> > it's an unsupported use case.  The current deferral logic simply can't
> > handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> > if your probe function adds devices each time through the probe
> > function.
> >
> > Assuming all the above makes sense, that means we're stuck with:
> >
> > a) This patch series, which makes us not add devices.
> >
> > b) Some other patch series which rearchitects the MSM graphics stack
> > to not return -EPROBE_DEFER in this case.
>
> This isn't a MSM specific issue.  This is an issue with how the DSI
> interface works, and how software is structured in Linux.  I would
> expect that pretty much any DSI host in the kernel would have some
> version of this issue.
>
> The problem is that DSI is not "hot pluggable", so to give the DRM
> stack the info it needs, we need both the DSI controller (aka the MSM
> graphics stack in your case), and the thing it connects to (in your
> case, the TI bridge, normally the actual panel) because the DRM stack
> expects that if init completes, it has certain information
> (resolution, etc), and some of that information is in the DSI
> controller, and some of it is on the DSI device.

Ah yes, DRM's lack of hot-plug and discrete component support... Is
that not improved with some of the bridge rework?

Anyways, given there is a child dependency on the parent, I don't
think we should work-around DRM deficiencies in DT.

BTW, There's also a deferred probe timeout you can use which stops
deferring probe some number of seconds after late_initcall.

Rob

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-14 22:13             ` Rob Herring
  0 siblings, 0 replies; 40+ messages in thread
From: Rob Herring @ 2020-07-14 22:13 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Sean Paul,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, Doug Anderson, dri-devel,
	linux-kernel, Andy Gross, Bjorn Andersson, freedreno

On Tue, Jul 14, 2020 at 10:33 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
>
> On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> > >
> > > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > > >
> > > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > > >
> > > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > > at bootup trying again and again and again.  An example log was:
> > > > >
> > > > > Why do we care about optimizing the error case?
> > > >
> > > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > > small causes a component of DRM to fail to probe then the whole system
> > > > doesn't boot because it just loops trying to probe over and over
> > > > again.  The messages I put in the commit message are printed over and
> > > > over and over again.
> > >
> > > Sounds like a bug as that's not what should happen.
> > >
> > > If you defer during boot (initcalls), then you'll be on the deferred
> > > list until late_initcall and everything is retried. After
> > > late_initcall, only devices getting added should trigger probing. But
> > > maybe the adding and then removing a device is causing a re-trigger.
> >
> > Right, I'm nearly certain that the adding and then removing is causing
> > a re-trigger.  I believe the loop would happen for any case where we
> > have a probe function that:
> >
> > 1. Adds devices.
> > 2. After adding devices it decides that it needs to defer.
> > 3. Removes the devices it added.
> > 4. Return -EPROBE_DEFER from its probe function.
> >
> > Specifically from what I know about how -EPROBE_DEFER works I'm not
> > sure how it wouldn't cause an infinite loop in that case.
> >
> > Perhaps the missing part of my explanation, though, is why it never
> > gets out of this infinite loop.  In my case I purposely made the
> > bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> > every time.  Obviously I wasn't going to get a display up like this,
> > but I just wanted to not loop forever at bootup.  I tracked down
> > exactly why we get an - EPROBE_DEFER over and over in this case.
> >
> > You can see it in msm_dsi_host_register().  If some components haven't
> > shown up when that function runs it will _always_ return
> > -EPROBE_DEFER.
> >
> > In my case, since I caused the bridge to fail to probe, those
> > components will _never_ show up.  That means that
> > msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
> >
> > I haven't dug through all the DRM code enough, but it doesn't
> > necessarily seem like the wrong behavior.  If the bridge driver or a
> > panel was a module then (presumably) they could show up later and so
> > it should be OK for it to defer, right?
> >
> > So with all that, it doesn't really feel like this is a bug so much as
> > it's an unsupported use case.  The current deferral logic simply can't
> > handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> > if your probe function adds devices each time through the probe
> > function.
> >
> > Assuming all the above makes sense, that means we're stuck with:
> >
> > a) This patch series, which makes us not add devices.
> >
> > b) Some other patch series which rearchitects the MSM graphics stack
> > to not return -EPROBE_DEFER in this case.
>
> This isn't a MSM specific issue.  This is an issue with how the DSI
> interface works, and how software is structured in Linux.  I would
> expect that pretty much any DSI host in the kernel would have some
> version of this issue.
>
> The problem is that DSI is not "hot pluggable", so to give the DRM
> stack the info it needs, we need both the DSI controller (aka the MSM
> graphics stack in your case), and the thing it connects to (in your
> case, the TI bridge, normally the actual panel) because the DRM stack
> expects that if init completes, it has certain information
> (resolution, etc), and some of that information is in the DSI
> controller, and some of it is on the DSI device.

Ah yes, DRM's lack of hot-plug and discrete component support... Is
that not improved with some of the bridge rework?

Anyways, given there is a child dependency on the parent, I don't
think we should work-around DRM deficiencies in DT.

BTW, There's also a deferred probe timeout you can use which stops
deferring probe some number of seconds after late_initcall.

Rob
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-14 22:13             ` Rob Herring
@ 2020-07-14 22:52               ` Bjorn Andersson
  -1 siblings, 0 replies; 40+ messages in thread
From: Bjorn Andersson @ 2020-07-14 22:52 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jeffrey Hugo, Doug Anderson, freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, linux-kernel, dri-devel, Rob Clark,
	Andy Gross, Daniel Vetter, Sean Paul

On Tue 14 Jul 15:13 PDT 2020, Rob Herring wrote:

> On Tue, Jul 14, 2020 at 10:33 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> >
> > On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> > > >
> > > > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > > > >
> > > > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > > > >
> > > > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > > > at bootup trying again and again and again.  An example log was:
> > > > > >
> > > > > > Why do we care about optimizing the error case?
> > > > >
> > > > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > > > small causes a component of DRM to fail to probe then the whole system
> > > > > doesn't boot because it just loops trying to probe over and over
> > > > > again.  The messages I put in the commit message are printed over and
> > > > > over and over again.
> > > >
> > > > Sounds like a bug as that's not what should happen.
> > > >
> > > > If you defer during boot (initcalls), then you'll be on the deferred
> > > > list until late_initcall and everything is retried. After
> > > > late_initcall, only devices getting added should trigger probing. But
> > > > maybe the adding and then removing a device is causing a re-trigger.
> > >
> > > Right, I'm nearly certain that the adding and then removing is causing
> > > a re-trigger.  I believe the loop would happen for any case where we
> > > have a probe function that:
> > >
> > > 1. Adds devices.
> > > 2. After adding devices it decides that it needs to defer.
> > > 3. Removes the devices it added.
> > > 4. Return -EPROBE_DEFER from its probe function.
> > >
> > > Specifically from what I know about how -EPROBE_DEFER works I'm not
> > > sure how it wouldn't cause an infinite loop in that case.
> > >
> > > Perhaps the missing part of my explanation, though, is why it never
> > > gets out of this infinite loop.  In my case I purposely made the
> > > bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> > > every time.  Obviously I wasn't going to get a display up like this,
> > > but I just wanted to not loop forever at bootup.  I tracked down
> > > exactly why we get an - EPROBE_DEFER over and over in this case.
> > >
> > > You can see it in msm_dsi_host_register().  If some components haven't
> > > shown up when that function runs it will _always_ return
> > > -EPROBE_DEFER.
> > >
> > > In my case, since I caused the bridge to fail to probe, those
> > > components will _never_ show up.  That means that
> > > msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
> > >
> > > I haven't dug through all the DRM code enough, but it doesn't
> > > necessarily seem like the wrong behavior.  If the bridge driver or a
> > > panel was a module then (presumably) they could show up later and so
> > > it should be OK for it to defer, right?
> > >
> > > So with all that, it doesn't really feel like this is a bug so much as
> > > it's an unsupported use case.  The current deferral logic simply can't
> > > handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> > > if your probe function adds devices each time through the probe
> > > function.
> > >
> > > Assuming all the above makes sense, that means we're stuck with:
> > >
> > > a) This patch series, which makes us not add devices.
> > >
> > > b) Some other patch series which rearchitects the MSM graphics stack
> > > to not return -EPROBE_DEFER in this case.
> >
> > This isn't a MSM specific issue.  This is an issue with how the DSI
> > interface works, and how software is structured in Linux.  I would
> > expect that pretty much any DSI host in the kernel would have some
> > version of this issue.
> >
> > The problem is that DSI is not "hot pluggable", so to give the DRM
> > stack the info it needs, we need both the DSI controller (aka the MSM
> > graphics stack in your case), and the thing it connects to (in your
> > case, the TI bridge, normally the actual panel) because the DRM stack
> > expects that if init completes, it has certain information
> > (resolution, etc), and some of that information is in the DSI
> > controller, and some of it is on the DSI device.
> 
> Ah yes, DRM's lack of hot-plug and discrete component support... Is
> that not improved with some of the bridge rework?
> 
> Anyways, given there is a child dependency on the parent, I don't
> think we should work-around DRM deficiencies in DT.
> 
> BTW, There's also a deferred probe timeout you can use which stops
> deferring probe some number of seconds after late_initcall.
> 

I don't think we can rely on the deferred probe timeout, given that it
was reverted back to 0 seconds past late_initcall - which given that
most of the involved components are modules, means that without the
opt-in command line option we would likely fail to bring up the display.

Regards,
Bjorn

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-14 22:52               ` Bjorn Andersson
  0 siblings, 0 replies; 40+ messages in thread
From: Bjorn Andersson @ 2020-07-14 22:52 UTC (permalink / raw)
  To: Rob Herring
  Cc: Sean Paul,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Jeffrey Hugo, David Airlie, linux-arm-msm, Doug Anderson,
	dri-devel, linux-kernel, Andy Gross, freedreno

On Tue 14 Jul 15:13 PDT 2020, Rob Herring wrote:

> On Tue, Jul 14, 2020 at 10:33 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> >
> > On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> > > >
> > > > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > > > >
> > > > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > > > >
> > > > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > > > at bootup trying again and again and again.  An example log was:
> > > > > >
> > > > > > Why do we care about optimizing the error case?
> > > > >
> > > > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > > > small causes a component of DRM to fail to probe then the whole system
> > > > > doesn't boot because it just loops trying to probe over and over
> > > > > again.  The messages I put in the commit message are printed over and
> > > > > over and over again.
> > > >
> > > > Sounds like a bug as that's not what should happen.
> > > >
> > > > If you defer during boot (initcalls), then you'll be on the deferred
> > > > list until late_initcall and everything is retried. After
> > > > late_initcall, only devices getting added should trigger probing. But
> > > > maybe the adding and then removing a device is causing a re-trigger.
> > >
> > > Right, I'm nearly certain that the adding and then removing is causing
> > > a re-trigger.  I believe the loop would happen for any case where we
> > > have a probe function that:
> > >
> > > 1. Adds devices.
> > > 2. After adding devices it decides that it needs to defer.
> > > 3. Removes the devices it added.
> > > 4. Return -EPROBE_DEFER from its probe function.
> > >
> > > Specifically from what I know about how -EPROBE_DEFER works I'm not
> > > sure how it wouldn't cause an infinite loop in that case.
> > >
> > > Perhaps the missing part of my explanation, though, is why it never
> > > gets out of this infinite loop.  In my case I purposely made the
> > > bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> > > every time.  Obviously I wasn't going to get a display up like this,
> > > but I just wanted to not loop forever at bootup.  I tracked down
> > > exactly why we get an - EPROBE_DEFER over and over in this case.
> > >
> > > You can see it in msm_dsi_host_register().  If some components haven't
> > > shown up when that function runs it will _always_ return
> > > -EPROBE_DEFER.
> > >
> > > In my case, since I caused the bridge to fail to probe, those
> > > components will _never_ show up.  That means that
> > > msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
> > >
> > > I haven't dug through all the DRM code enough, but it doesn't
> > > necessarily seem like the wrong behavior.  If the bridge driver or a
> > > panel was a module then (presumably) they could show up later and so
> > > it should be OK for it to defer, right?
> > >
> > > So with all that, it doesn't really feel like this is a bug so much as
> > > it's an unsupported use case.  The current deferral logic simply can't
> > > handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> > > if your probe function adds devices each time through the probe
> > > function.
> > >
> > > Assuming all the above makes sense, that means we're stuck with:
> > >
> > > a) This patch series, which makes us not add devices.
> > >
> > > b) Some other patch series which rearchitects the MSM graphics stack
> > > to not return -EPROBE_DEFER in this case.
> >
> > This isn't a MSM specific issue.  This is an issue with how the DSI
> > interface works, and how software is structured in Linux.  I would
> > expect that pretty much any DSI host in the kernel would have some
> > version of this issue.
> >
> > The problem is that DSI is not "hot pluggable", so to give the DRM
> > stack the info it needs, we need both the DSI controller (aka the MSM
> > graphics stack in your case), and the thing it connects to (in your
> > case, the TI bridge, normally the actual panel) because the DRM stack
> > expects that if init completes, it has certain information
> > (resolution, etc), and some of that information is in the DSI
> > controller, and some of it is on the DSI device.
> 
> Ah yes, DRM's lack of hot-plug and discrete component support... Is
> that not improved with some of the bridge rework?
> 
> Anyways, given there is a child dependency on the parent, I don't
> think we should work-around DRM deficiencies in DT.
> 
> BTW, There's also a deferred probe timeout you can use which stops
> deferring probe some number of seconds after late_initcall.
> 

I don't think we can rely on the deferred probe timeout, given that it
was reverted back to 0 seconds past late_initcall - which given that
most of the involved components are modules, means that without the
opt-in command line option we would likely fail to bring up the display.

Regards,
Bjorn
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
  2020-07-14 22:52               ` Bjorn Andersson
@ 2020-07-15 17:30                 ` Rob Herring
  -1 siblings, 0 replies; 40+ messages in thread
From: Rob Herring @ 2020-07-15 17:30 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Jeffrey Hugo, Doug Anderson, freedreno,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	David Airlie, linux-arm-msm, linux-kernel, dri-devel, Rob Clark,
	Andy Gross, Daniel Vetter, Sean Paul

On Tue, Jul 14, 2020 at 4:54 PM Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
>
> On Tue 14 Jul 15:13 PDT 2020, Rob Herring wrote:
>
> > On Tue, Jul 14, 2020 at 10:33 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > >
> > > On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> > > > >
> > > > > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > > > > >
> > > > > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > > > > at bootup trying again and again and again.  An example log was:
> > > > > > >
> > > > > > > Why do we care about optimizing the error case?
> > > > > >
> > > > > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > > > > small causes a component of DRM to fail to probe then the whole system
> > > > > > doesn't boot because it just loops trying to probe over and over
> > > > > > again.  The messages I put in the commit message are printed over and
> > > > > > over and over again.
> > > > >
> > > > > Sounds like a bug as that's not what should happen.
> > > > >
> > > > > If you defer during boot (initcalls), then you'll be on the deferred
> > > > > list until late_initcall and everything is retried. After
> > > > > late_initcall, only devices getting added should trigger probing. But
> > > > > maybe the adding and then removing a device is causing a re-trigger.
> > > >
> > > > Right, I'm nearly certain that the adding and then removing is causing
> > > > a re-trigger.  I believe the loop would happen for any case where we
> > > > have a probe function that:
> > > >
> > > > 1. Adds devices.
> > > > 2. After adding devices it decides that it needs to defer.
> > > > 3. Removes the devices it added.
> > > > 4. Return -EPROBE_DEFER from its probe function.
> > > >
> > > > Specifically from what I know about how -EPROBE_DEFER works I'm not
> > > > sure how it wouldn't cause an infinite loop in that case.
> > > >
> > > > Perhaps the missing part of my explanation, though, is why it never
> > > > gets out of this infinite loop.  In my case I purposely made the
> > > > bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> > > > every time.  Obviously I wasn't going to get a display up like this,
> > > > but I just wanted to not loop forever at bootup.  I tracked down
> > > > exactly why we get an - EPROBE_DEFER over and over in this case.
> > > >
> > > > You can see it in msm_dsi_host_register().  If some components haven't
> > > > shown up when that function runs it will _always_ return
> > > > -EPROBE_DEFER.
> > > >
> > > > In my case, since I caused the bridge to fail to probe, those
> > > > components will _never_ show up.  That means that
> > > > msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
> > > >
> > > > I haven't dug through all the DRM code enough, but it doesn't
> > > > necessarily seem like the wrong behavior.  If the bridge driver or a
> > > > panel was a module then (presumably) they could show up later and so
> > > > it should be OK for it to defer, right?
> > > >
> > > > So with all that, it doesn't really feel like this is a bug so much as
> > > > it's an unsupported use case.  The current deferral logic simply can't
> > > > handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> > > > if your probe function adds devices each time through the probe
> > > > function.
> > > >
> > > > Assuming all the above makes sense, that means we're stuck with:
> > > >
> > > > a) This patch series, which makes us not add devices.
> > > >
> > > > b) Some other patch series which rearchitects the MSM graphics stack
> > > > to not return -EPROBE_DEFER in this case.
> > >
> > > This isn't a MSM specific issue.  This is an issue with how the DSI
> > > interface works, and how software is structured in Linux.  I would
> > > expect that pretty much any DSI host in the kernel would have some
> > > version of this issue.
> > >
> > > The problem is that DSI is not "hot pluggable", so to give the DRM
> > > stack the info it needs, we need both the DSI controller (aka the MSM
> > > graphics stack in your case), and the thing it connects to (in your
> > > case, the TI bridge, normally the actual panel) because the DRM stack
> > > expects that if init completes, it has certain information
> > > (resolution, etc), and some of that information is in the DSI
> > > controller, and some of it is on the DSI device.
> >
> > Ah yes, DRM's lack of hot-plug and discrete component support... Is
> > that not improved with some of the bridge rework?
> >
> > Anyways, given there is a child dependency on the parent, I don't
> > think we should work-around DRM deficiencies in DT.
> >
> > BTW, There's also a deferred probe timeout you can use which stops
> > deferring probe some number of seconds after late_initcall.
> >
>
> I don't think we can rely on the deferred probe timeout, given that it
> was reverted back to 0 seconds past late_initcall - which given that
> most of the involved components are modules, means that without the
> opt-in command line option we would likely fail to bring up the display.

I meant just as a way to make progress booting in this case where the
display is never coming up. We're talking only about a better
experience for an error case.

Maybe a simple solution is just having some delay inserted between
delayed probe triggers so progress is made.

Rob

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
@ 2020-07-15 17:30                 ` Rob Herring
  0 siblings, 0 replies; 40+ messages in thread
From: Rob Herring @ 2020-07-15 17:30 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Sean Paul,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Jeffrey Hugo, David Airlie, linux-arm-msm, Doug Anderson,
	dri-devel, linux-kernel, Andy Gross, freedreno

On Tue, Jul 14, 2020 at 4:54 PM Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
>
> On Tue 14 Jul 15:13 PDT 2020, Rob Herring wrote:
>
> > On Tue, Jul 14, 2020 at 10:33 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > >
> > > On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> > > > >
> > > > > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > > > > >
> > > > > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > > > > at bootup trying again and again and again.  An example log was:
> > > > > > >
> > > > > > > Why do we care about optimizing the error case?
> > > > > >
> > > > > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > > > > small causes a component of DRM to fail to probe then the whole system
> > > > > > doesn't boot because it just loops trying to probe over and over
> > > > > > again.  The messages I put in the commit message are printed over and
> > > > > > over and over again.
> > > > >
> > > > > Sounds like a bug as that's not what should happen.
> > > > >
> > > > > If you defer during boot (initcalls), then you'll be on the deferred
> > > > > list until late_initcall and everything is retried. After
> > > > > late_initcall, only devices getting added should trigger probing. But
> > > > > maybe the adding and then removing a device is causing a re-trigger.
> > > >
> > > > Right, I'm nearly certain that the adding and then removing is causing
> > > > a re-trigger.  I believe the loop would happen for any case where we
> > > > have a probe function that:
> > > >
> > > > 1. Adds devices.
> > > > 2. After adding devices it decides that it needs to defer.
> > > > 3. Removes the devices it added.
> > > > 4. Return -EPROBE_DEFER from its probe function.
> > > >
> > > > Specifically from what I know about how -EPROBE_DEFER works I'm not
> > > > sure how it wouldn't cause an infinite loop in that case.
> > > >
> > > > Perhaps the missing part of my explanation, though, is why it never
> > > > gets out of this infinite loop.  In my case I purposely made the
> > > > bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> > > > every time.  Obviously I wasn't going to get a display up like this,
> > > > but I just wanted to not loop forever at bootup.  I tracked down
> > > > exactly why we get an - EPROBE_DEFER over and over in this case.
> > > >
> > > > You can see it in msm_dsi_host_register().  If some components haven't
> > > > shown up when that function runs it will _always_ return
> > > > -EPROBE_DEFER.
> > > >
> > > > In my case, since I caused the bridge to fail to probe, those
> > > > components will _never_ show up.  That means that
> > > > msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
> > > >
> > > > I haven't dug through all the DRM code enough, but it doesn't
> > > > necessarily seem like the wrong behavior.  If the bridge driver or a
> > > > panel was a module then (presumably) they could show up later and so
> > > > it should be OK for it to defer, right?
> > > >
> > > > So with all that, it doesn't really feel like this is a bug so much as
> > > > it's an unsupported use case.  The current deferral logic simply can't
> > > > handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> > > > if your probe function adds devices each time through the probe
> > > > function.
> > > >
> > > > Assuming all the above makes sense, that means we're stuck with:
> > > >
> > > > a) This patch series, which makes us not add devices.
> > > >
> > > > b) Some other patch series which rearchitects the MSM graphics stack
> > > > to not return -EPROBE_DEFER in this case.
> > >
> > > This isn't a MSM specific issue.  This is an issue with how the DSI
> > > interface works, and how software is structured in Linux.  I would
> > > expect that pretty much any DSI host in the kernel would have some
> > > version of this issue.
> > >
> > > The problem is that DSI is not "hot pluggable", so to give the DRM
> > > stack the info it needs, we need both the DSI controller (aka the MSM
> > > graphics stack in your case), and the thing it connects to (in your
> > > case, the TI bridge, normally the actual panel) because the DRM stack
> > > expects that if init completes, it has certain information
> > > (resolution, etc), and some of that information is in the DSI
> > > controller, and some of it is on the DSI device.
> >
> > Ah yes, DRM's lack of hot-plug and discrete component support... Is
> > that not improved with some of the bridge rework?
> >
> > Anyways, given there is a child dependency on the parent, I don't
> > think we should work-around DRM deficiencies in DT.
> >
> > BTW, There's also a deferred probe timeout you can use which stops
> > deferring probe some number of seconds after late_initcall.
> >
>
> I don't think we can rely on the deferred probe timeout, given that it
> was reverted back to 0 seconds past late_initcall - which given that
> most of the involved components are modules, means that without the
> opt-in command line option we would likely fail to bring up the display.

I meant just as a way to make progress booting in this case where the
display is never coming up. We're talking only about a better
experience for an error case.

Maybe a simple solution is just having some delay inserted between
delayed probe triggers so progress is made.

Rob
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2020-07-15 17:30 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-10 23:02 [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting Douglas Anderson
2020-07-10 23:02 ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 1/9] drm/msm: Use the devm variant of of_platform_populate() Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 2/9] dt-bindings: msm/dpu: Add simple-bus to dpu bindings Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 3/9] dt-bindings: msm/mdp5: " Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 4/9] drm/msm: Avoid manually populating our children if "simple-bus" is there Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 5/9] arm64: dts: qcom: sc7180: Add "simple-bus" to our mdss node Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 6/9] arm64: dts: qcom: sdm845: " Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 7/9] arm64: dts: qcom: msm8916: " Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 8/9] arm64: dts: qcom: msm8996: " Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-10 23:02 ` [PATCH 9/9] ARM: dts: qcom: msm8974: " Douglas Anderson
2020-07-10 23:02   ` Douglas Anderson
2020-07-13 14:11 ` [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting Rob Herring
2020-07-13 14:11   ` Rob Herring
2020-07-13 14:58   ` [Freedreno] " Jeffrey Hugo
2020-07-13 14:58     ` Jeffrey Hugo
2020-07-13 15:08   ` Doug Anderson
2020-07-13 15:08     ` Doug Anderson
2020-07-13 20:25     ` Rob Herring
2020-07-13 20:25       ` Rob Herring
2020-07-13 21:32       ` Rob Clark
2020-07-13 21:32         ` Rob Clark
2020-07-13 23:50       ` Doug Anderson
2020-07-13 23:50         ` Doug Anderson
2020-07-14 16:32         ` [Freedreno] " Jeffrey Hugo
2020-07-14 16:32           ` Jeffrey Hugo
2020-07-14 22:13           ` Rob Herring
2020-07-14 22:13             ` Rob Herring
2020-07-14 22:52             ` Bjorn Andersson
2020-07-14 22:52               ` Bjorn Andersson
2020-07-15 17:30               ` Rob Herring
2020-07-15 17:30                 ` Rob Herring

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.